Data storage for a computer vision project

MikeErsam

I have a computer vision project that needs a lot of images for training. It is also a long-term project, and we are constantly receiving new images, so I need a proper way to store all of this data. As far as I know, data lakes are one option, but I have also heard about data warehouses. Which should we use in our project?

Answers

LubNot04

In my opinion, you should use a data lake. It is better suited to unstructured data, and images are a typical example of unstructured data.

MikeErsam

OK, got it. Are there any other important differences between data lakes and data warehouses, besides the fact that they are designed for different types of data? And if my project is extended and I need to use some structured data, will I have to create a data warehouse for it?

LubNot04

No, you won't have to. Data lakes can hold all data types, including structured data. Data warehouses are mainly for structured data that has already been processed. The benefit of a data lake is that you don't need to process the data at upload time. You don't even need to know in advance how you will use it; you can just dump ALL the data into the lake in its raw form and decide on structure later, when you read it. So it is a good fit for your case. Data lake storage is also usually cheaper than data warehouse storage, because the two are built on different underlying storage technologies (data lakes typically sit on low-cost object storage). You can read more about the differences between data lakes and data warehouses online, but my strong belief is that a data lake is what you need in your situation.
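To make the "dump now, structure later" idea a bit more concrete, here is a minimal sketch, not a definitive implementation. It uses a local folder as a stand-in for the lake's object storage, and everything in it is my own assumption for illustration: the function names `ingest` and `select`, the `raw/images/dt=YYYY-MM-DD/` layout, and the `manifest.jsonl` file. Images are written untouched, one small metadata line is appended per file, and structure is applied only when you read the manifest back.

```python
import hashlib
import json
from datetime import date
from pathlib import Path


def ingest(lake_root: Path, payload: bytes, source: str, day: date, ext: str = "jpg") -> Path:
    """Store the raw bytes unchanged and append one metadata line to the manifest.

    Hypothetical layout: <lake_root>/raw/images/dt=YYYY-MM-DD/<sha256>.<ext>
    """
    digest = hashlib.sha256(payload).hexdigest()
    target_dir = lake_root / "raw" / "images" / f"dt={day.isoformat()}"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{digest}.{ext}"

    # The file name is derived from the content, so re-uploading the same
    # image on the same day neither duplicates the file nor the manifest line.
    if not target.exists():
        target.write_bytes(payload)
        record = {
            "path": target.relative_to(lake_root).as_posix(),
            "sha256": digest,
            "source": source,
            "ingested": day.isoformat(),
            "bytes": len(payload),
        }
        with (lake_root / "manifest.jsonl").open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
    return target


def select(lake_root: Path, source: str) -> list[dict]:
    """Schema-on-read: parse the manifest only when a training job needs it."""
    manifest = lake_root / "manifest.jsonl"
    if not manifest.exists():
        return []
    with manifest.open(encoding="utf-8") as fh:
        rows = [json.loads(line) for line in fh if line.strip()]
    return [row for row in rows if row["source"] == source]
```

For example, `ingest(Path("lake"), image_bytes, "camera_a", date.today())` stores one image, and `select(Path("lake"), "camera_a")` later returns the metadata rows for that source. On a real lake you would swap the local file calls for your storage provider's client, but the idea stays the same: nothing is transformed at upload time.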

MikeErsam

Thanks, that clarified things a lot for me.
