mirror of https://github.com/opencv/opencv.git synced 2025-07-24 14:06:27 +08:00

Alexander Alekhin 314246d396 Merge pull request #11459 from dkurt:dnn_mobilenet_v2		2018-05-11 09:48:05 +00:00
face_detector	Merge pull request #11236 from dkurt:dnn_fuse_l2_norm	2018-04-11 15:09:55 +00:00
classification.cpp	Semantic segmentation sample.	2018-03-08 11:02:26 +03:00
classification.py	Update tutorials. A new cv::dnn::readNet function	2018-03-04 20:30:22 +03:00
CMakeLists.txt	Update links to OpenCV's face detection network	2018-04-02 13:02:56 +03:00
colorization.cpp	Minor refactoring in several C++ samples:	2018-03-06 14:23:20 +03:00
colorization.py	Merge pull request #10777 from berak:dnn_colorize_cpp	2018-02-05 15:07:40 +03:00
edge_detection.py	Custom deep learning layers in Python	2018-04-26 09:25:18 +03:00
fast_neural_style.py	Layers for fast-neural-style models: https://github.com/jcjohnson/fast-neural-style	2017-10-27 14:26:45 +03:00
js_face_recognition.html	Update links to OpenCV's face detection network	2018-04-02 13:02:56 +03:00
mobilenet_ssd_accuracy.py	Specific version of MobileNet-SSD from TensorFlow	2017-11-24 13:40:35 +03:00
object_detection.cpp	select the device (video capture)	2018-05-09 17:20:02 +03:00
object_detection.py	Support YOLOv3 model from Darknet	2018-04-16 18:44:12 +03:00
openpose.cpp	select the device (video capture)	2018-05-09 17:20:02 +03:00
openpose.py	fixed samples/dnn/openpose.py	2018-03-15 05:17:57 +09:00
README.md	Update links to OpenCV's face detection network	2018-04-02 13:02:56 +03:00
segmentation.cpp	select the device (video capture)	2018-05-09 17:20:02 +03:00
segmentation.py	Semantic segmentation sample.	2018-03-08 11:02:26 +03:00
shrink_tf_graph_weights.py	Text TensorFlow graphs parsing. MobileNet-SSD for 90 classes.	2017-10-08 22:25:29 +03:00
tf_text_graph_ssd.py	Update script to generate MobileNet-SSD V2 text graph	2018-05-04 07:55:18 +03:00

README.md

OpenCV deep learning module samples

Model Zoo

Object detection

Model	Scale	Size WxH	Mean subtraction	Channels order
MobileNet-SSD, Caffe	`0.00784 (2/255)`	`300x300`	`127.5 127.5 127.5`	BGR
OpenCV face detector	`1.0`	`300x300`	`104 177 123`	BGR
SSDs from TensorFlow	`0.00784 (2/255)`	`300x300`	`127.5 127.5 127.5`	RGB
YOLO	`0.00392 (1/255)`	`416x416`	`0 0 0`	RGB
VGG16-SSD	`1.0`	`300x300`	`104 117 123`	BGR
Faster-RCNN	`1.0`	`800x600`	`102.9801 115.9465 122.7717`	BGR
R-FCN	`1.0`	`800x600`	`102.9801 115.9465 122.7717`	BGR
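
The columns of this table line up with the preprocessing arguments that the samples in this directory pass to `cv2.dnn.blobFromImage`. The following is a minimal sketch of that mapping, not code taken from the samples: the `Preprocessing` tuple, the `OBJECT_DETECTION` dictionary and the `blob_kwargs` helper are hypothetical names introduced here for illustration, and `crop=False` is an assumption. Only the table values come from this README.

```python
from typing import Dict, NamedTuple, Tuple


class Preprocessing(NamedTuple):
    """One row of the table (hypothetical container, not an OpenCV type)."""
    scale: float                      # "Scale" column
    size: Tuple[int, int]             # "Size WxH" column as (width, height)
    mean: Tuple[float, float, float]  # "Mean subtraction" column
    channels: str                     # "Channels order" column: "BGR" or "RGB"


# Values copied from the object detection table above.
OBJECT_DETECTION: Dict[str, Preprocessing] = {
    "MobileNet-SSD, Caffe": Preprocessing(2 / 255, (300, 300), (127.5, 127.5, 127.5), "BGR"),
    "OpenCV face detector": Preprocessing(1.0, (300, 300), (104, 177, 123), "BGR"),
    "SSDs from TensorFlow": Preprocessing(2 / 255, (300, 300), (127.5, 127.5, 127.5), "RGB"),
    "YOLO": Preprocessing(1 / 255, (416, 416), (0, 0, 0), "RGB"),
    "VGG16-SSD": Preprocessing(1.0, (300, 300), (104, 117, 123), "BGR"),
    "Faster-RCNN": Preprocessing(1.0, (800, 600), (102.9801, 115.9465, 122.7717), "BGR"),
    "R-FCN": Preprocessing(1.0, (800, 600), (102.9801, 115.9465, 122.7717), "BGR"),
}


def blob_kwargs(p: Preprocessing) -> dict:
    """Translate a table row into keyword arguments for cv2.dnn.blobFromImage.

    OpenCV decodes images as BGR, so a model that expects RGB input needs
    the red and blue channels swapped.
    """
    return {
        "scalefactor": p.scale,
        "size": p.size,
        "mean": p.mean,
        "swapRB": p.channels == "RGB",
        "crop": False,  # assumption: resize without center cropping
    }
```

With the `cv2` package installed and a BGR frame loaded, a call such as `cv2.dnn.blobFromImage(frame, **blob_kwargs(OBJECT_DETECTION["YOLO"]))` should produce the network input. Every RGB model in the table uses the same mean for all three channels, so the order in which the mean is interpreted does not matter for these rows; for a model with per-channel means and RGB input, check the `blobFromImage` documentation.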

Face detection

The original model, with single-precision floating-point weights, has been quantized using the TensorFlow framework. To achieve the best accuracy, run the model on BGR images resized to 300x300, subtracting the mean values (104, 177, 123) from the blue, green and red channels respectively.
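
To make the mean subtraction concrete, here is a dependency-free sketch of the arithmetic on a tiny BGR image. It is an illustration only: the helper `bgr_image_to_blob` is a hypothetical name, the resize to 300x300 is assumed to have happened already, and in practice `cv2.dnn.blobFromImage` performs these steps (including the resize) in one call.

```python
def bgr_image_to_blob(image, mean_bgr=(104.0, 177.0, 123.0), scale=1.0):
    """Convert an H x W image of (B, G, R) pixels into a 1 x 3 x H x W blob.

    `image` is a list of rows, each row a list of (B, G, R) tuples.
    Every channel value becomes (value - mean) * scale, and the layout
    changes from interleaved pixels to one plane per channel.
    """
    planes = []
    for c in range(3):  # 0 = blue, 1 = green, 2 = red
        plane = [[(pixel[c] - mean_bgr[c]) * scale for pixel in row]
                 for row in image]
        planes.append(plane)
    return [planes]  # leading batch dimension of size 1


# A 2 x 2 stand-in for an already resized frame.
image = [
    [(104, 177, 123), (114, 187, 133)],
    [(0, 0, 0), (255, 255, 255)],
]
blob = bgr_image_to_blob(image)
```

A pixel equal to the mean maps to zero in every channel, so `blob[0][0][0][0]`, `blob[0][1][0][0]` and `blob[0][2][0][0]` are all `0.0` for the first pixel of this example.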

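The following accuracy metrics were obtained using the COCO object detection evaluation tool on the FDDB dataset (see script), both with images resized to 300x300 and with the original image sizes kept. Values in parentheses are the differences between the UINT8 and the FP32/FP16 results.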

AP - Average Precision                            | FP32/FP16 | UINT8          | FP32/FP16 | UINT8          |
AR - Average Recall                               | 300x300   | 300x300        | any size  | any size       |
--------------------------------------------------|-----------|----------------|-----------|----------------|
AP @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] | 0.408     | 0.408          | 0.378     | 0.328 (-0.050) |
AP @[ IoU=0.50      | area=   all | maxDets=100 ] | 0.849     | 0.849          | 0.797     | 0.790 (-0.007) |
AP @[ IoU=0.75      | area=   all | maxDets=100 ] | 0.251     | 0.251          | 0.208     | 0.140 (-0.068) |
AP @[ IoU=0.50:0.95 | area= small | maxDets=100 ] | 0.050     | 0.051 (+0.001) | 0.107     | 0.070 (-0.037) |
AP @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] | 0.381     | 0.379 (-0.002) | 0.380     | 0.368 (-0.012) |
AP @[ IoU=0.50:0.95 | area= large | maxDets=100 ] | 0.455     | 0.455          | 0.412     | 0.337 (-0.075) |
AR @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] | 0.299     | 0.299          | 0.279     | 0.246 (-0.033) |
AR @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] | 0.482     | 0.482          | 0.476     | 0.436 (-0.040) |
AR @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] | 0.496     | 0.496          | 0.491     | 0.451 (-0.040) |
AR @[ IoU=0.50:0.95 | area= small | maxDets=100 ] | 0.189     | 0.193 (+0.004) | 0.284     | 0.232 (-0.052) |
AR @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] | 0.481     | 0.480 (-0.001) | 0.470     | 0.458 (-0.012) |
AR @[ IoU=0.50:0.95 | area= large | maxDets=100 ] | 0.528     | 0.528          | 0.520     | 0.462 (-0.058) |

Classification

Model	Scale	Size WxH	Mean subtraction	Channels order
GoogLeNet	`1.0`	`224x224`	`104 117 123`	BGR
SqueezeNet	`1.0`	`227x227`	`0 0 0`	BGR

Semantic segmentation

Model	Scale	Size WxH	Mean subtraction	Channels order
ENet	`0.00392 (1/255)`	`1024x512`	`0 0 0`	RGB
FCN8s	`1.0`	`500x500`	`0 0 0`	BGR