Shallow Deep Learning with OpenCV 3.4.2 (Part 3: Object Detection)

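example_dnn_object_detection is an object detection program that finds the bounding rectangles of objects in an image and returns them with labels. It can use the models listed in the Model Zoo at opencv/samples/dnn at master · opencv/opencv · GitHub.
The Model Zoo gives the model names and the parameters to use, but not the actual model file names. I am slightly unsure whether I have paired them correctly, but I will try each one in turn.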
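Since every Model Zoo row comes down to the same handful of options, one way to keep the pairings straight is to generate each command from a small table. The following is only a minimal Python sketch and not part of the OpenCV samples: the `build_command` helper and the dictionary layout are hypothetical names of my own, and the values are simply the MobileNet-SSD settings used in this post.

```python
def build_command(params, exe="example_dnn_object_detection.exe"):
    """Turn a dict of option values into an argv-style list."""
    argv = [exe]
    for key, value in params.items():
        if isinstance(value, (list, tuple)):
            # Multi-value options such as mean are joined with spaces
            value = " ".join(str(v) for v in value)
        argv.append(f"--{key}={value}")
    return argv


# Hypothetical table entry holding the MobileNet-SSD (Caffe) settings
mobilenet_ssd = {
    "model": "MobileNetSSD_deploy.caffemodel",
    "config": "MobileNetSSD_deploy.prototxt",
    "scale": 0.00784,
    "width": 300,
    "height": 300,
    "mean": (127.5, 127.5, 127.5),
    "classes": "object_detection_classes_pascal_voc.txt",
    "input": "dog416.png",
}

command = build_command(mobilenet_ssd)
print(command)
```

Handing the resulting list to `subprocess.run(command)` delivers each element as a single argument, so no shell quoting is involved.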

1. example_dnn_object_detection

MobileNet-SSD, Caffe

example_dnn_object_detection.exe --model=MobileNetSSD_deploy.caffemodel --config=MobileNetSSD_deploy.prototxt --scale=0.00784 --width=300 --height=300 --mean="127.5 127.5 127.5" --classes=object_detection_classes_pascal_voc.txt --input=dog416.png
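One detail worth checking in commands like this is how the space-separated mean values reach the program. A shell splits arguments on spaces, so a multi-value option such as --mean needs quotes around its values to arrive as one argument. The sketch below uses Python's `shlex` module to show the splitting. Note that `shlex` follows POSIX shell rules, and I am assuming that is close enough to cmd.exe for this simple case.

```python
import shlex

unquoted = '--scale=0.00784 --mean=127.5 127.5 127.5'
quoted = '--scale=0.00784 --mean="127.5 127.5 127.5"'

# Without quotes the second and third values become separate arguments
unquoted_args = shlex.split(unquoted)
# With quotes all three values stay attached to --mean
quoted_args = shlex.split(quoted)

print(unquoted_args)
print(quoted_args)
```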

OpenCV face detector

example_dnn_object_detection.exe --model=opencv_face_detector.caffemodel --config=opencv_face_detector.prototxt --scale=1.0 --width=300 --height=300 --mean="104 177 123" --classes=object_detection_classes_coco.txt --input=googlenet_1.png


SSDs from TensorFlow

example_dnn_object_detection.exe --model=ssd_inception_v2_coco_2017_11_17.pb --config=ssd_inception_v2_coco_2017_11_17.pbtxt --scale=0.00784 --width=300 --height=300 --mean="127.5 127.5 127.5" --rgb=1 --classes=object_detection_classes_coco.txt --input=dog416.png --backend=3

example_dnn_object_detection.exe --model=ssd_mobilenet_v1_coco_2017_11_17.pb --config=ssd_mobilenet_v1_coco_2017_11_17.pbtxt --scale=0.00784 --width=300 --height=300 --mean="127.5 127.5 127.5" --rgb=1 --classes=object_detection_classes_coco.txt --input=dog416.png --backend=3

YOLO

example_dnn_object_detection.exe --model=yolov3.weights --config=yolov3.cfg --scale=0.00392 --width=416 --height=416 --mean="0 0 0" --rgb --classes=object_detection_classes_yolov3.txt --input=dog416.png
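The --scale and --mean values are easier to read as a normalization. As I understand the preprocessing, the mean is subtracted from each pixel value and the result is then multiplied by the scale. Here is a minimal sketch of that arithmetic in plain Python (the `normalize` helper is hypothetical, not an OpenCV function). With scale 0.00784 and mean 127.5 the 0..255 range lands at roughly -1..1, while scale 0.00392 with mean 0 gives roughly 0..1.

```python
def normalize(pixel, scale, mean):
    """Subtract the mean, then multiply by the scale factor."""
    return (pixel - mean) * scale


# MobileNet-SSD style settings: scale 0.00784 (about 1/127.5), mean 127.5
ssd_range = (normalize(0, 0.00784, 127.5), normalize(255, 0.00784, 127.5))

# YOLO style settings: scale 0.00392 (about 1/255), mean 0
yolo_range = (normalize(0, 0.00392, 0.0), normalize(255, 0.00392, 0.0))

print(ssd_range)
print(yolo_range)
```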


VGG16-SSD

example_dnn_object_detection.exe --model=VGG_ILSVRC2016_SSD_300x300_iter_440000.caffemodel --config=ssd_vgg16.prototxt --scale=1.0 --width=300 --height=300 --mean="104 117 123" --input=dog416.png

Faster-RCNN

example_dnn_object_detection.exe --model=VGG16_faster_rcnn_final.caffemodel --config=faster_rcnn_vgg16.prototxt --scale=1.0 --width=800 --height=600 --mean="102.9801 115.9465 122.7717" --classes=object_detection_classes_pascal_voc.txt --input=dog416.png


example_dnn_object_detection.exe --model=ZF_faster_rcnn_final.caffemodel --config=faster_rcnn_zf.prototxt --scale=1.0 --width=800 --height=600 --mean="102.9801 115.9465 122.7717" --classes=object_detection_classes_pascal_voc.txt --input=dog416.png

I cannot really see a difference between the two.

R-FCN

example_dnn_object_detection.exe --model=resnet50_rfcn_final.caffemodel --config=rfcn_pascal_voc_resnet50.prototxt --scale=1.0 --width=800 --height=600 --mean="102.9801 115.9465 122.7717" --classes=object_detection_classes_pascal_voc.txt --input=dog416.png

???? I am not sure which model files correspond to this entry or to the one below.

Faster-RCNN, ResNet backbone

example_dnn_object_detection.exe --model=resnet50_rfcn_final.caffemodel --config=rfcn_pascal_voc_resnet50.prototxt --scale=1.0 --width=300 --height=300 --mean="103.939 116.779 123.68" --rgb=1 --classes=object_detection_classes_pascal_voc.txt --input=dog416.png

????

Faster-RCNN, InceptionV2 backbone

example_dnn_object_detection.exe --model=faster_rcnn_inception_v2_coco_2018_01_28.pb --config=faster_rcnn_inception_v2_coco_2018_01_28.pbtxt --scale=0.00784 --width=300 --height=300 --mean="127.5 127.5 127.5" --rgb=1 --classes=object_detection_classes_coco.txt --input=dog416.png --backend=3

Notes

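TensorFlow-based models do not run properly with the OpenVINO-based backend.
For reference, here are the model and config file extensions for each framework.
Caffe: .caffemodel, .prototxt
TensorFlow: .pb, .pbtxt
Torch: .t7 or .net (the model file contains the configuration, so only the model is specified)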
Darknet: .weights, .cfg
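The extension list above can also be written down as a small lookup, which is handy for checking that a model file and a config file belong to the same framework. This is just a sketch in plain Python; the `FRAMEWORKS` table and the `guess_framework` helper are hypothetical names of my own, not an OpenCV API.

```python
import os

# Extensions from the list above: (model extensions, config extension)
FRAMEWORKS = {
    "Caffe": ((".caffemodel",), ".prototxt"),
    "TensorFlow": ((".pb",), ".pbtxt"),
    "Torch": ((".t7", ".net"), None),  # no separate config file
    "Darknet": ((".weights",), ".cfg"),
}


def guess_framework(model_path):
    """Return (framework, expected config extension), or None if unknown."""
    ext = os.path.splitext(model_path)[1].lower()
    for name, (model_exts, config_ext) in FRAMEWORKS.items():
        if ext in model_exts:
            return name, config_ext
    return None


print(guess_framework("yolov3.weights"))
print(guess_framework("MobileNetSSD_deploy.caffemodel"))
```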