We’re really excited about our upcoming EGG Paris and EGG NYC conferences next week. There will be data science experts running demos, talks and use cases from industry leaders, and lots of swag. We wanted to celebrate with a cool new visual deep learning model in Dataiku 5.0.
Following the example of everyone’s favorite Silicon Valley episode, we decided to tackle image classification based on whether something is an Egg, or, well, not an Egg.
Just like with Hotdog-Not-Hotdog, Dataiku’s Deep Learning leverages Keras and TensorFlow, but it does all the hard parts for you.
The Process
We started off by gathering lots of images and dividing them into folders based on whether they were Eggs or Not:
Can you guess which picture is in which category?
Once you have your image sets, we recommend shuffling them so that the training set doesn’t accidentally contain only Egg images or only Not Egg images.
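If you'd like to see what that shuffling step looks like outside of Dataiku, here is a minimal sketch in plain Python. The folder arguments, the `egg` / `not_egg` label names, the 80/20 split, and both function names are our own assumptions for illustration; they are not part of Dataiku's API.

```python
import random
from pathlib import Path


def label_images(egg_dir, not_egg_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Return (path, label) pairs for every image file in the two folders.

    The folder layout and label names are hypothetical.
    """
    labeled = []
    for folder, label in ((egg_dir, "egg"), (not_egg_dir, "not_egg")):
        for path in sorted(Path(folder).iterdir()):
            if path.suffix.lower() in extensions:
                labeled.append((str(path), label))
    return labeled


def shuffle_and_split(labeled, train_fraction=0.8, seed=42):
    """Shuffle the labeled images, then cut them into train and test sets.

    Shuffling before the cut keeps one class from filling the whole
    training set just because its folder was listed first.
    """
    shuffled = list(labeled)               # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Calling `shuffle_and_split(label_images("eggs", "not_eggs"))` would return a training list and a test list, each mixing both categories in random order.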
Then you can leverage a deep learning model just like any other model in Dataiku. While the training may take a few minutes, once it finishes, you can see how the model performed on each image and adjust accordingly. After our first test, we added more egg images to the training set to improve the model. With more time and images, you can further improve your model's accuracy, but if your use case involves human eyes double-checking the classification anyway, this may not be the best use of your time.
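Dataiku shows the per-image results in its interface, but the idea behind "see how the model performed and adjust" is easy to sketch by hand. The snippet below is a rough illustration, not Dataiku's implementation: the `review_predictions` function, the `(image_name, true_label, egg_probability)` tuple format, and the 0.5 threshold are all assumptions we made for the example.

```python
def review_predictions(results, threshold=0.5):
    """Summarize per-class accuracy and list the misclassified images.

    `results` is a list of (image_name, true_label, egg_probability)
    tuples, where true_label is "egg" or "not_egg". This format is
    hypothetical and only used for illustration.
    """
    totals = {"egg": 0, "not_egg": 0}
    correct = {"egg": 0, "not_egg": 0}
    misclassified = []

    for name, true_label, egg_probability in results:
        # Turn the model's probability into a hard prediction.
        predicted = "egg" if egg_probability >= threshold else "not_egg"
        totals[true_label] += 1
        if predicted == true_label:
            correct[true_label] += 1
        else:
            misclassified.append((name, true_label, predicted, egg_probability))

    # Skip classes with no examples to avoid dividing by zero.
    accuracy = {
        label: correct[label] / totals[label]
        for label in totals
        if totals[label] > 0
    }
    return accuracy, misclassified
```

If the `egg` accuracy comes back noticeably lower than the `not_egg` accuracy, that is a hint that the training set needs more egg images, which is the kind of adjustment we made after our first test.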
If you want to check out more details about the model, you can click here.
Next Steps
While our use case is not the most practical, there are lots of ways to improve the functionality of this model (and others). If we wanted to make the web app more interactive, we could enable users to post their own images and see how the model evaluates them. We could also share our results with other developers on our team so that they could help improve the model. The more hands on a project, the faster it will improve, so even for low-stakes models, this is a good idea.
If you want to try building your own image classification system, feel free to try Dataiku 5.0 today. You can also learn more about Deep Learning models at our EGG conferences, including a use case on how Giphy handles search functionality. If you don’t have tickets to EGG already (we’re sold out!), consider attending our upcoming webinar on Deep Learning training and real-world use cases.
Many thanks to Alex Reutter for building the model and demo.