This guide will help you set up the environment on your PC and Raspberry Pi, train a model for fruit classification and localization, and deploy it as a simple real-time program.
1 laptop, 1 Raspberry Pi 2, 1 webcam, 1 apple, 1 pear
Getting started
All steps are coded in Fruit.ipynb, so you can use it to walk through them on your own.
Step 1. Collect data
Download the prepared data here or follow the next steps to collect it yourself:
Step 2. Data preprocessing
The Pascal VOC annotation format simply describes all the information about an image: its size, the objects and their poses, bounding boxes, and names. Change the following information in the files:
Add the extension to filenames.
Change object names from the class wnid to real names: n07739125 -> apple.
Split the data into train/test. Common practice is to use 80% of the data for training and 20% for testing.
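The first two changes above can be sketched in a few lines of Python. This is only an illustration, not the code from Fruit.ipynb: the function name fix_annotation, the .JPEG extension and the mapping dictionary are my assumptions, and only the apple wnid comes from the text, so add the pear wnid from your own data.

```python
import os
import xml.etree.ElementTree as ET

# Hypothetical mapping from class wnid to a readable name.
# Only the apple entry is taken from the guide; extend it for your classes.
WNID_TO_NAME = {"n07739125": "apple"}


def fix_annotation(xml_text, extension=".JPEG"):
    """Return the annotation XML with a file extension and readable object names."""
    root = ET.fromstring(xml_text)

    # Add the extension only when the filename has none yet.
    filename = root.find("filename")
    if filename is not None and filename.text:
        if not os.path.splitext(filename.text)[1]:
            filename.text += extension

    # Replace the wnid in every <object><name> with the real class name.
    for obj in root.iter("object"):
        name = obj.find("name")
        if name is not None:
            name.text = WNID_TO_NAME.get(name.text, name.text)

    return ET.tostring(root, encoding="unicode")
```

To process a whole folder, read each .xml file, pass its contents through fix_annotation and write the result back.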
After all these manipulations we have a folder with train/test data for each class, with images and annotations in it.
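The 80/20 split itself is easy to reproduce. Here is a minimal sketch, assuming you have a list of image names without extensions; the function name split_train_test and the fixed seed are my own choices, not part of the notebook.

```python
import random


def split_train_test(names, train_fraction=0.8, seed=0):
    """Shuffle the names reproducibly and cut them into train and test lists."""
    shuffled = sorted(names)               # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical image names, just for illustration.
train, test = split_train_test([f"apple_{i}" for i in range(10)])
print(len(train), len(test))
```

Splitting by name keeps every image together with its annotation: move both files of a name to the same side.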
1. Clone the darkflow repository
git clone https://github.com/thtrieu/darkflow.git
2. Go to project folder
3. Install darkflow
pip install .
Now let’s create the model configuration.
Edit the labels.txt file. Replace the existing labels with the new ones: apple, pear
Note: the label names in labels.txt must match the object names in the annotation files, because darkflow looks for exactly the specified object names.
2. Design net
Go to ../darkflow/cfg/
Make a new copy of tiny-yolo.cfg and rename it to yolo_fruits.cfg
Open yolo_fruits.cfg and make simple changes:
line 110
[convolutional]
filters=35
# (2 + 4 + 1) * 5, where
# 2 - num_classes (pear/apple)
# 4 - x, y, w, h (parameters of the bounding box)
# 1 - confidence of the bounding box
# 5 - num_anchors (look at the region section)
line 120
[region]
classes=2
num=5
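If you later change the number of classes, the filters value has to be recomputed. A tiny helper following the formula above (the function name region_filters is just an illustration):

```python
def region_filters(num_classes, num_anchors=5):
    """Number of filters in the last convolutional layer before [region]."""
    box_params = 4   # x, y, w, h
    confidence = 1   # confidence of the bounding box
    return (num_classes + box_params + confidence) * num_anchors


print(region_filters(2))  # two classes (pear/apple) with 5 anchors
```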
3. Calculate anchors
line 118 [region] anchors=7.19,7.25, 5.25,2.59, 6.06,7.34, 9.09,5.47, 9.87,4.5625
What do anchors represent?
Anchors are the averaged widths and heights of the objects' bounding boxes. They are calculated with the k-means algorithm. In our case, k-means grouped the bounding boxes into the 5 closest groups and calculated a centroid for each group. That is why we have 10 numbers (5 (num_anchors) * 2 (w, h)).
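To make the idea concrete, here is a rough sketch of k-means on (width, height) pairs in plain Python. It is not the script I used for the numbers in this guide, and it uses simple Euclidean distance, which is an assumption: implementations made for YOLO often use an IoU-based distance instead. The function name kmeans_anchors is hypothetical.

```python
import random


def kmeans_anchors(boxes, k, iterations=100, seed=0):
    """Cluster (width, height) pairs and return the k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)  # start from k randomly chosen boxes
    for _ in range(iterations):
        # Assign every box to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for w, h in boxes:
            nearest = min(
                range(k),
                key=lambda i: (w - centroids[i][0]) ** 2 + (h - centroids[i][1]) ** 2,
            )
            clusters[nearest].append((w, h))

        # Move each centroid to the mean of its cluster.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean_w = sum(w for w, _ in members) / len(members)
                mean_h = sum(h for _, h in members) / len(members)
                new_centroids.append((mean_w, mean_h))
            else:
                new_centroids.append(centroids[i])  # keep the centroid of an empty cluster

        if new_centroids == centroids:  # nothing moved, so we are done
            break
        centroids = new_centroids
    return centroids
```

Pass it the list of (width, height) tuples read from your annotations and k=5 to get five anchor candidates.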
The original anchors are the following values: [230 232], [168 83], [194 235], [291 175], [316 146].
So why are the anchors represented by small floating-point numbers instead of the original values?
The YOLO algorithm needs the anchors scaled with respect to the last feature map of the network. For example, in YOLO-tiny the input image has size 416x416 and the last feature map has size 13x13, so the scale ratio is 13/416. We multiply each of the original values by the scale ratio: [230, 232] * (13/416) = [7.19, 7.25]. That’s right, we got the same numbers.
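The same arithmetic for all five anchors, as a short sketch (the function name scale_anchors is mine; 416 and 13 are the YOLO-tiny sizes mentioned above):

```python
def scale_anchors(anchors_px, input_size=416, grid_size=13):
    """Scale anchors from input-image units to feature-map units."""
    ratio = grid_size / input_size  # 13/416 for YOLO-tiny
    return [(w * ratio, h * ratio) for w, h in anchors_px]


original = [(230, 232), (168, 83), (194, 235), (291, 175), (316, 146)]
scaled = scale_anchors(original)

# Print in the flat "w,h, w,h, ..." form used by the anchors= line of the cfg.
print(", ".join(f"{w:.2f},{h:.2f}" for w, h in scaled))
```

The printed values should agree with the anchors line in the cfg up to rounding in the last digit.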
4. Download the weights (here) and place them in darkflow/bin
tiny-yolo-voc.weights corresponds to tiny-yolo.cfg
yolo.weights corresponds to yolo.cfg
5. Follow the instructions on how to run a model
flow --model cfg/yolo_fruits.cfg --load bin/yolo-tiny.weights --train --gpu 1.0 --annotation ../train/annotations --dataset ../train/images
--gpu 1.0 - sets how much of the GPU is going to be used. If there is no GPU in your PC, remove this flag (but be ready to wait a while when training on a CPU). If you paused or stopped the training process, you can easily continue with the same command but with this flag:
--load -1 - takes the latest checkpoint from ../darkflow/ckpt/
I suggest watching TensorBoard while the model is training. Darkflow automatically saves summaries to the ../darkflow/summary/ folder. Simply run:
tensorboard --logdir=../darkflow/summary --port=6001
Go to your browser, type http://localhost:6001 and check out the loss function. The loss graph updates in real time, so you can easily spot when something goes wrong. If the loss value does not decrease, the network is not learning anything.
'NoneType' object has no attribute 'shape'
This means that no image was found for a filename specified in the annotations. Check that the image filenames in the annotations correspond to the filenames in the images folder.
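You can check this before starting a long training run. Below is a small sketch that reads the filename field of every annotation and reports the ones without a matching image; the function name find_missing_images is hypothetical, and the script assumes Pascal VOC .xml files with a top-level filename element.

```python
import os
import xml.etree.ElementTree as ET


def find_missing_images(annotation_dir, image_dir):
    """Return (annotation file, image filename) pairs whose image does not exist."""
    missing = []
    for entry in sorted(os.listdir(annotation_dir)):
        if not entry.endswith(".xml"):
            continue
        root = ET.parse(os.path.join(annotation_dir, entry)).getroot()
        filename = root.findtext("filename", default="")
        # An empty filename resolves to the folder itself, which is not a file.
        if not os.path.isfile(os.path.join(image_dir, filename)):
            missing.append((entry, filename))
    return missing
```

For the folders used in this guide you would call it as find_missing_images("../train/annotations", "../train/images"); an empty list means every annotation points to an existing image.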
OUT_OF_MEMORY (when using GPU)
When using a GPU, all the data has to be loaded into the GPU memory buffer. Sometimes there is not enough space for all of it. As a possible solution, you can decrease the batch size (default=16) using the flag:
The simplest way to evaluate the model is to use the darkflow command-line tool:
flow --imgdir ../test/images --model cfg/yolo_fruits.cfg --load -1 --threshold 0.1 --gpu 1.0
Notice that we set the threshold to 0.1 (default value - 0.6), because we want to see all detections from our model. Play with the threshold to find the best value.
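To get a feeling for what the threshold does, you can filter a list of detections yourself. The sketch below assumes each detection is a dictionary with label and confidence fields, modeled on the JSON output format described in the darkflow README; the sample values are made up for illustration.

```python
def filter_detections(detections, threshold):
    """Keep only the detections whose confidence reaches the threshold."""
    return [d for d in detections if d["confidence"] >= threshold]


# Made-up detections, not real model output.
detections = [
    {"label": "apple", "confidence": 0.82},
    {"label": "pear", "confidence": 0.35},
    {"label": "apple", "confidence": 0.12},
]

# See how many detections survive at different thresholds.
for threshold in (0.1, 0.3, 0.6):
    kept = filter_detections(detections, threshold)
    print(threshold, [d["label"] for d in kept])
```

A low threshold shows everything the model proposes, including weak guesses; raising it trades missed objects for fewer false detections.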
Once you are satisfied with the accuracy of your model, you can export it and use it in a real application. Just type the next command:
flow --model cfg/yolo_fruits.cfg --load -1 --savepb
You can find your model files yolo_fruits.pb and yolo_fruits.meta in the ../darkflow/built_graph/ folder. For a detailed explanation, look at the section ‘Save the built graph to a protobuf file (.pb)’.
Let’s prepare the Raspberry Pi.
Install the OS. Check out this guide.
Note: I’ve tested the model deployment on Raspbian OS.
Install TensorFlow. Go here.
Note: Look at the section “Installing from Pip” and follow exactly the same steps. Install for Python 3.
Install OpenCV. Go here.
Install darkflow as in the previous steps.
Note: To install Cython use:
sudo pip3 install Cython
Now you have everything ready to use and test your model. (Code)
Copy yolo_fruits.pb and yolo_fruits.meta to Raspberry Pi.
Run this command in the darkflow folder:
flow --pbLoad yolo_fruits.pb --metaLoad yolo_fruits.meta --demo camera
A forward pass of a neural network on a CPU is time consuming, so expect less than 1 FPS. I’ve tested on a Raspberry Pi 2 and got 0.3 FPS, but it could be slightly faster on a Raspberry Pi 3.
Running the model on a smartphone would be much faster, as it has a GPU.
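If you want to measure the speed on your own device, a simple timer is enough. This is a generic sketch, not part of darkflow: measure_fps and process_frame are hypothetical names, and process_frame stands for whatever function runs your model on one frame.

```python
import time


def measure_fps(process_frame, frames):
    """Return the frames per second achieved by calling process_frame on each frame."""
    count = 0
    start = time.perf_counter()
    for frame in frames:
        process_frame(frame)
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed if elapsed > 0 else float("inf")


# Stand-in workload: pretend that every frame takes 10 ms to process.
print(measure_fps(lambda frame: time.sleep(0.01), range(5)))
```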
Change the color of the bounding boxes:
Go to the .meta file and find the ‘colors’ field. Change the colors in RGB format.
Change the label names:
Go to the .meta file and find the ‘labels’ field. Change the label names.
Change the threshold to identify objects with more confidence.
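If you prefer not to edit the .meta file by hand, the first two tweaks can be scripted. This sketch assumes the .meta file is plain JSON with ‘colors’ and ‘labels’ fields, as described above; the function name update_meta and the example colors are my own, so back up the file before trying it.

```python
import json


def update_meta(meta_path, colors=None, labels=None):
    """Rewrite the 'colors' and/or 'labels' fields of a JSON .meta file."""
    with open(meta_path) as f:
        meta = json.load(f)

    if colors is not None:
        meta["colors"] = [list(color) for color in colors]  # one [r, g, b] triple per label
    if labels is not None:
        meta["labels"] = list(labels)

    with open(meta_path, "w") as f:
        json.dump(meta, f)
    return meta
```

For example, update_meta("yolo_fruits.meta", colors=[(255, 0, 0), (0, 255, 0)]) would draw the first class in red and the second in green, assuming the colors are stored in the same order as the labels.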