In my project on the Imaginghub, I put together a guide that helps you set up the environment on your PC and Raspberry Pi, train a model for fruit classification and localization, and deploy it as a simple real-time program.
In this blog article I want to give you some background information and a few insights into the starting point of your project.
Overview
As you know, a computer treats an image as an array of numbers.
We need to create a smart model that extracts features from this array of numbers and assigns those features to a specific category.
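To make the "array of numbers" idea concrete, here is a minimal sketch using NumPy. The tiny 2x3 image and its pixel values are made up for illustration; a real photo would simply be a much larger array of the same kind.

```python
import numpy as np

# A tiny made-up RGB image: 2 rows x 3 columns x 3 channels, 8-bit values (0..255).
image = np.array(
    [[[255, 0, 0], [0, 255, 0], [0, 0, 255]],
     [[0, 0, 0], [128, 128, 128], [255, 255, 255]]],
    dtype=np.uint8,
)

height, width, channels = image.shape

# Models usually work with floats, so scale the pixel values to [0, 1].
normalized = image.astype(np.float32) / 255.0

# Flattened, the image really is just a bunch of numbers.
flat = normalized.reshape(-1)

print(height, width, channels, flat.shape)
```

Everything a model learns about apples and pears has to be extracted from arrays like this one.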
Before “deep learning”, engineers did this with hand-crafted features such as Histogram of Oriented Gradients (HOG), Haar, Local Binary Patterns (LBP) and SIFT. Classification with those features is pretty fast and robust. But for complex objects and tasks there is no easy way to describe the features by hand, and doing so requires you to become an expert in the target field. That’s why artificial neural networks (ANNs) are so popular nowadays. With an ANN it has become possible to classify thousands of classes with close to human accuracy. ANNs generalize well and can solve tasks without requiring you to be an expert, but they do need a huge number of examples. So, in principle, anyone with enough data can create an intelligent model.
As you may know, there are three main steps in building any neural network model:
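To give a feeling for what a hand-crafted feature looks like, here is a heavily simplified sketch of the idea behind HOG: a single histogram of gradient orientations for a whole grayscale image. This is not the full HOG descriptor (there are no cells and no block normalization), and the function name `orientation_histogram` is my own choice for this illustration.

```python
import numpy as np

def orientation_histogram(gray, bins=9):
    """Histogram of unsigned gradient orientations, weighted by gradient magnitude."""
    gray = np.asarray(gray, dtype=np.float64)
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    gy, gx = np.gradient(gray)
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in degrees, folded into [0, 180).
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0
    hist, _ = np.histogram(angle, bins=bins, range=(0.0, 180.0), weights=magnitude)
    total = hist.sum()
    return hist / total if total > 0 else hist

# A made-up 4x4 image with one vertical edge: dark on the left, bright on the right.
edge_image = np.tile([0.0, 0.0, 1.0, 1.0], (4, 1))
hist = orientation_histogram(edge_image)
print(hist)
```

Someone had to decide that gradient orientations are the right thing to measure. A neural network learns such descriptors from the data instead.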
- Collect data
- Train model
- Deploy model
Collect Data
As you may remember, our goal was to detect two fruits: apples and pears. For this we need to collect as much data as possible. This process can be painful, especially if we do it from scratch, because we first need to download hundreds of images and then manually annotate every target object in them.
Thanks to the creators of ImageNet and the Kaggle challenges, we can search for any object and directly download images of interest together with their annotations. Take a look at those websites and get any images you want.
Train model
Now you have all the required data. But to train any model we need to split the data into training and test sets. Check out this video from Udacity’s Deep Learning course to learn more about why we need to split the dataset.
Training is the most challenging step, as it takes a lot of time to train the model and make it accurate enough to use in production.
I hope you are familiar with simple neural network architectures, because here we will use an advanced one. If not, have a look at this simple explanation.
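A split like this can be sketched in a few lines of plain Python. The file names, the 80/20 ratio and the helper name `split_dataset` are assumptions for this example, not something the project prescribes.

```python
import random

def split_dataset(items, test_fraction=0.2, seed=42):
    """Shuffle the items reproducibly and return (train, test) lists."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical file names standing in for your annotated images.
image_names = [f"fruit_{i:03d}.jpg" for i in range(100)]
train, test = split_dataset(image_names)
print(len(train), len(test))
```

The fixed seed makes the split reproducible, so the test images stay unseen even when you restart training.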
Are you ready to know more about the algorithm?
For this task YOLO (version 2) is a very good choice. To get more information about this approach please refer to this paper or website. Implementing this architecture from scratch could be challenging, so we will use the darkflow code.
With existing tools it is much easier to train a model: you just specify the folder with the data and define the network architecture. The training process can be time-consuming, but it is much faster on a GPU.
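Because the model predicts bounding boxes as well as classes, a common way to check a predicted box against an annotated one on the test data is intersection over union (IoU). Below is a minimal sketch. The box format `(x_min, y_min, x_max, y_max)` is an assumption for this example and is not taken from the darkflow code.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero if the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

# Made-up boxes: an annotated apple and a slightly shifted prediction.
annotated = (0, 0, 2, 2)
predicted = (1, 1, 3, 3)
print(iou(annotated, predicted))
```

An IoU of 1 means the boxes match exactly, and 0 means they do not overlap at all.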
Deploy model
When your model is well trained and tested on your data, you can easily deploy it to the Raspberry Pi. I got something like this:
As you can see, the model is pretty slow, but here are some ideas that can help make it faster (I’m going to go through them in upcoming articles):
- apply quantization (reduces model size and operates with bytes instead of floats)
- build the model with C++ instead of Python (C++ is better optimized for this kind of task and makes it possible to use advanced processor capabilities)
- use image queues as input so that the recognition process begins to look like an image pipeline.
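To illustrate the first idea, here is a minimal sketch of linear 8-bit quantization in NumPy. It only shows the principle of storing bytes instead of floats. It is not how TensorFlow implements quantization, and the helper names `quantize` and `dequantize` are my own.

```python
import numpy as np

def quantize(weights):
    """Map float32 weights linearly onto the 256 levels of a uint8."""
    w = np.asarray(weights, dtype=np.float32)
    w_min, w_max = float(w.min()), float(w.max())
    # Guard against a zero range when all weights are equal.
    scale = (w_max - w_min) / 255.0 or 1.0
    levels = np.clip(np.round((w - w_min) / scale), 0, 255)
    return levels.astype(np.uint8), scale, w_min

def dequantize(q, scale, w_min):
    """Recover approximate float32 weights from the uint8 levels."""
    return q.astype(np.float32) * scale + w_min

# Made-up weights standing in for one layer of a trained model.
weights = np.linspace(-1.0, 1.0, 1000).astype(np.float32)
q, scale, w_min = quantize(weights)
restored = dequantize(q, scale, w_min)

print(weights.nbytes, q.nbytes)                  # storage before and after
print(float(np.abs(weights - restored).max()))   # worst-case rounding error
```

The quantized array takes a quarter of the memory, at the cost of a small rounding error in every weight.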
Here is a great performance guide on how to make TensorFlow faster.
Congratulations
You did a great job ;-). It wasn’t easy to implement one of the newest object detection algorithms and deploy it to a Raspberry Pi. We went through all the steps that are essential for a real deep learning researcher. Now you have a lot to do to make the model more efficient, robust and accurate. Every day people publish new results and findings, and there is still so much we don't know.
I wish you good luck and I am looking forward to hearing about your achievements in the field of deep learning. Just log in on the Imaginghub and post your comments on the blog article.