mirror of https://github.com/opencv/opencv.git synced 2024-11-29 05:29:54 +08:00

Alexander Alekhin 3870891d2d update samples/dnn/face_detector/.gitignore		2018-04-04 19:12:50 +03:00
face_detector	update samples/dnn/face_detector/.gitignore	2018-04-04 19:12:50 +03:00
classification.cpp	Semantic segmentation sample.	2018-03-08 11:02:26 +03:00
classification.py	Update tutorials. A new cv::dnn::readNet function	2018-03-04 20:30:22 +03:00
CMakeLists.txt	Update links to OpenCV's face detection network	2018-04-02 13:02:56 +03:00
colorization.cpp	Minor refactoring in several C++ samples:	2018-03-06 14:23:20 +03:00
colorization.py	Merge pull request #10777 from berak:dnn_colorize_cpp	2018-02-05 15:07:40 +03:00
fast_neural_style.py	Layers for fast-neural-style models: https://github.com/jcjohnson/fast-neural-style	2017-10-27 14:26:45 +03:00
js_face_recognition.html	Update links to OpenCV's face detection network	2018-04-02 13:02:56 +03:00
mobilenet_ssd_accuracy.py	Specific version of MobileNet-SSD from TensorFlow	2017-11-24 13:40:35 +03:00
object_detection.cpp	Semantic segmentation sample.	2018-03-08 11:02:26 +03:00
object_detection.py	Update tutorials. A new cv::dnn::readNet function	2018-03-04 20:30:22 +03:00
openpose.cpp	dnn: add an openpose.cpp sample	2018-03-16 19:36:45 +01:00
openpose.py	fixed samples/dnn/openpose.py	2018-03-15 05:17:57 +09:00
README.md	Update links to OpenCV's face detection network	2018-04-02 13:02:56 +03:00
segmentation.cpp	Semantic segmentation sample.	2018-03-08 11:02:26 +03:00
segmentation.py	Semantic segmentation sample.	2018-03-08 11:02:26 +03:00
shrink_tf_graph_weights.py	Text TensorFlow graphs parsing. MobileNet-SSD for 90 classes.	2017-10-08 22:25:29 +03:00
tf_text_graph_ssd.py	Fix minimal aspect ratio scale for SSDs from TensorFlow	2018-03-28 12:57:06 +03:00

README.md

OpenCV deep learning module samples

Model Zoo

Object detection

| Model | Scale | Size WxH | Mean subtraction | Channels order |
|---|---|---|---|---|
| MobileNet-SSD, Caffe | `0.00784 (2/255)` | `300x300` | `127.5 127.5 127.5` | BGR |
| OpenCV face detector | `1.0` | `300x300` | `104 177 123` | BGR |
| SSDs from TensorFlow | `0.00784 (2/255)` | `300x300` | `127.5 127.5 127.5` | RGB |
| YOLO | `0.00392 (1/255)` | `416x416` | `0 0 0` | RGB |
| VGG16-SSD | `1.0` | `300x300` | `104 117 123` | BGR |
| Faster-RCNN | `1.0` | `800x600` | `102.9801 115.9465 122.7717` | BGR |
| R-FCN | `1.0` | `800x600` | `102.9801 115.9465 122.7717` | BGR |
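
The columns above correspond to the preprocessing arguments of `cv.dnn.blobFromImage` (`scalefactor`, `size`, `mean`, `swapRB`). The following is a minimal sketch of that mapping, not code from the samples: the `PRESETS` dictionary, its keys, and the `blob_args` helper are hypothetical names introduced for illustration, and only three rows are copied in. It assumes the input image is in OpenCV's default BGR layout, so models listed as RGB need the channel swap.

```python
# Preprocessing presets copied from three rows of the object detection table.
# The dictionary keys are informal labels chosen for this sketch.
PRESETS = {
    "mobilenet_ssd_caffe": {
        "scale": 2 / 255, "size": (300, 300),
        "mean": (127.5, 127.5, 127.5), "order": "BGR",
    },
    "opencv_face_detector": {
        "scale": 1.0, "size": (300, 300),
        "mean": (104, 177, 123), "order": "BGR",
    },
    "yolo": {
        "scale": 1 / 255, "size": (416, 416),
        "mean": (0, 0, 0), "order": "RGB",
    },
}


def blob_args(name):
    """Translate one table row into keyword arguments for cv.dnn.blobFromImage."""
    preset = PRESETS[name]
    return {
        "scalefactor": preset["scale"],
        "size": preset["size"],  # (width, height), as in the "Size WxH" column
        "mean": preset["mean"],
        # Images read by OpenCV are BGR, so RGB models need R and B swapped.
        "swapRB": preset["order"] == "RGB",
        "crop": False,
    }


args = blob_args("yolo")
# With OpenCV installed, the arguments would be passed like this:
#   import cv2 as cv
#   blob = cv.dnn.blobFromImage(image, **args)
```

Keeping the presets in one place makes it harder to pair a model with another model's scale or mean, which tends to degrade detections silently rather than raise an error.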

Face detection

The original model, which uses single-precision floating-point weights, has been quantized using the TensorFlow framework. To achieve the best accuracy, run the model on BGR images resized to 300x300, subtracting the mean values (104, 177, 123) from the blue, green and red channels respectively.
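
As a worked illustration of the mean subtraction described above, the sketch below applies it to a single BGR pixel in plain Python. The `preprocess_pixel` helper is a hypothetical name; in practice the same arithmetic (subtract the mean, then multiply by the scale) is applied to the whole image when the input blob is built.

```python
def preprocess_pixel(bgr, mean=(104.0, 177.0, 123.0), scale=1.0):
    """Subtract the per-channel mean from one BGR pixel, then multiply by scale."""
    return tuple((value - m) * scale for value, m in zip(bgr, mean))


# A pixel equal to the mean maps to zero in every channel.
centered = preprocess_pixel((104, 177, 123))

# A pure white pixel ends up with a different offset in each channel.
white = preprocess_pixel((255, 255, 255))
```

Because the three mean values differ, supplying them in RGB order to a BGR image would shift the blue and red channels by the wrong amounts.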

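The following accuracy metrics were obtained using the COCO object detection evaluation tool on the FDDB dataset (see script), once with images resized to 300x300 and once with the original image sizes kept. Values in parentheses are the difference between the UINT8 result and the FP32/FP16 result.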

AP - Average Precision                            | FP32/FP16 | UINT8          | FP32/FP16 | UINT8          |
AR - Average Recall                               | 300x300   | 300x300        | any size  | any size       |
--------------------------------------------------|-----------|----------------|-----------|----------------|
AP @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] | 0.408     | 0.408          | 0.378     | 0.328 (-0.050) |
AP @[ IoU=0.50      | area=   all | maxDets=100 ] | 0.849     | 0.849          | 0.797     | 0.790 (-0.007) |
AP @[ IoU=0.75      | area=   all | maxDets=100 ] | 0.251     | 0.251          | 0.208     | 0.140 (-0.068) |
AP @[ IoU=0.50:0.95 | area= small | maxDets=100 ] | 0.050     | 0.051 (+0.001) | 0.107     | 0.070 (-0.037) |
AP @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] | 0.381     | 0.379 (-0.002) | 0.380     | 0.368 (-0.012) |
AP @[ IoU=0.50:0.95 | area= large | maxDets=100 ] | 0.455     | 0.455          | 0.412     | 0.337 (-0.075) |
AR @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] | 0.299     | 0.299          | 0.279     | 0.246 (-0.033) |
AR @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] | 0.482     | 0.482          | 0.476     | 0.436 (-0.040) |
AR @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] | 0.496     | 0.496          | 0.491     | 0.451 (-0.040) |
AR @[ IoU=0.50:0.95 | area= small | maxDets=100 ] | 0.189     | 0.193 (+0.004) | 0.284     | 0.232 (-0.052) |
AR @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] | 0.481     | 0.480 (-0.001) | 0.470     | 0.458 (-0.012) |
AR @[ IoU=0.50:0.95 | area= large | maxDets=100 ] | 0.528     | 0.528          | 0.520     | 0.462 (-0.058) |
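
The `IoU` thresholds in the table refer to intersection over union between a detected box and a ground-truth box. In COCO-style evaluation, a detection counts as a match only when its IoU reaches the threshold, and `IoU=0.50:0.95` averages over thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows the IoU computation itself. It assumes boxes given as `(x1, y1, x2, y2)` corner coordinates, which is a choice made for this example and not necessarily the format used by the evaluation script.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle, if any.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


ground_truth = (0, 0, 100, 100)
detection = (50, 0, 150, 100)  # same size, shifted right by half a box width

# Intersection is 50 * 100 = 5000 and union is 15000, so the IoU is 1/3.
overlap = iou(ground_truth, detection)
```

With an IoU of 1/3, this detection would not count as a match at the `IoU=0.50` threshold.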

Classification

| Model | Scale | Size WxH | Mean subtraction | Channels order |
|---|---|---|---|---|
| GoogLeNet | `1.0` | `224x224` | `104 117 123` | BGR |
| SqueezeNet | `1.0` | `227x227` | `0 0 0` | BGR |

Semantic segmentation

| Model | Scale | Size WxH | Mean subtraction | Channels order |
|---|---|---|---|---|
| ENet | `0.00392 (1/255)` | `1024x512` | `0 0 0` | RGB |
| FCN8s | `1.0` | `500x500` | `0 0 0` | BGR |