We knew we needed to collect a data set that has far more images than we have ever had before, perhaps thousands of times more,...
We knew we needed to collect a data set that has far more images than we have ever had before, perhaps thousands of times more, and together with Professor Kai Li at Princeton University, we launched the ImageNet project in 2007. Luckily, we didn’t have to mount a camera on our head and wait for many years. We went to the Internet, the biggest treasure trove of pictures that humans have ever created. We downloaded nearly a billion images and used crowdsourcing technology like the Amazon Mechanical Turk platform to help us to label these images. At its peak, ImageNet was one of the biggest employers of the Amazon Mechanical Turk workers: together, almost 50,000 workers from 167 countries around the world helped us to clean, sort and label nearly a billion candidate images. That was how much effort it took to capture even a fraction of the imagery a child’s mind takes in in the early developmental years.Li Fei Fei, How we’re teaching computers to understand pictures