Hi guys!
I have an idea for an autofocus module:
1. Port the basics of a deep neural network (rewrite in C/C++?)
2. There is Tiny YOLOv3 for constrained environments:
https://pjreddie.com/darknet/yolo/
The bad part is that the weights file is still too big (33.7 MB). Can we place it in fragmented shoot_malloc memory?
3. Dump a small JPG (or any other fast format) frame from the video stream
4. Analyze it using YOLO, detect e.g. faces, optionally draw bounding boxes around them on LiveView
5. Save the size of the bounding box and its position
6. If the bounding box gets bigger, move focus to the front (closer); if it gets smaller, move focus to the back (farther)
7. We also have info about focal length and resolution, so we can calculate the physical size of the subject in the bounding box, can't we? I think it can help sometimes.
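Steps 5 to 7 could be sketched in C roughly like this. It is only an illustration: `bbox_t`, `focus_dir_t`, `focus_direction` and `subject_height_mm` are names made up for this post, not anything from the camera firmware or from darknet. The size estimate uses the simple pinhole approximation and assumes we also know the sensor height (24 mm on a full-frame body like the 5D Mark II) and the subject distance, because focal length and resolution alone are not enough.

```c
/* Hypothetical detection result: a box in pixels of the analyzed frame. */
typedef struct {
    int x, y;   /* top-left corner */
    int w, h;   /* width and height */
} bbox_t;

/* Hypothetical focus command derived from two consecutive detections. */
typedef enum {
    FOCUS_FARTHER = -1,  /* box shrank: subject moved away */
    FOCUS_HOLD    = 0,   /* change is within tolerance */
    FOCUS_NEARER  = 1    /* box grew: subject moved closer */
} focus_dir_t;

/* Compare the box areas of two frames. Changes smaller than
 * tolerance_pct percent of the previous area are ignored, so detector
 * jitter does not make the lens hunt back and forth. */
focus_dir_t focus_direction(bbox_t prev, bbox_t cur, int tolerance_pct)
{
    long prev_area = (long)prev.w * prev.h;
    long cur_area  = (long)cur.w * cur.h;

    if (prev_area <= 0 || cur_area <= 0)
        return FOCUS_HOLD;              /* no valid detection to compare */

    long diff      = cur_area - prev_area;
    long threshold = prev_area * tolerance_pct / 100;

    if (diff > threshold)
        return FOCUS_NEARER;
    if (-diff > threshold)
        return FOCUS_FARTHER;
    return FOCUS_HOLD;
}

/* Pinhole approximation (valid when distance is much larger than the
 * focal length): real height = height on sensor * distance / focal length.
 * All lengths are in millimetres. Returns 0.0 on invalid input. */
double subject_height_mm(int box_h_px, int frame_h_px, double sensor_h_mm,
                         double focal_mm, double distance_mm)
{
    if (frame_h_px <= 0 || focal_mm <= 0.0)
        return 0.0;

    double on_sensor_mm = (double)box_h_px / frame_h_px * sensor_h_mm;
    return on_sensor_mm * distance_mm / focal_mm;
}
```

For example, a box that is 240 px tall in a 480 px tall frame, on a 24 mm tall sensor with a 50 mm lens and the subject 2000 mm away, covers 12 mm on the sensor, which gives about 480 mm of real height. The same relation can be turned around: if the subject type has a typical real size, the box height gives a rough distance.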
The most difficult part is running all of this in so little memory, and I'm not sure about the CPU's capabilities.
Another workaround is to train our own DNN: not as powerful, but good enough for testing and with a small weights file.
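On the fragmented memory question, one possible approach is to treat several separate fragments as a single linear weights file and copy out only the piece that the current layer needs. The sketch below is plain C under that assumption. `chunk_t`, `chunked_buf_t`, `chunked_total` and `chunked_read` are hypothetical names for illustration, not the actual shoot_malloc interface, and how the fragments are obtained is left out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one contiguous memory fragment. */
typedef struct {
    unsigned char *base;  /* start of the fragment */
    size_t size;          /* usable bytes in the fragment */
} chunk_t;

/* Hypothetical view of several fragments as one linear buffer. */
typedef struct {
    chunk_t *chunks;
    size_t count;
} chunked_buf_t;

/* Total capacity across all fragments, e.g. to check that a
 * weights file of a given size fits at all. */
size_t chunked_total(const chunked_buf_t *b)
{
    size_t total = 0;
    for (size_t i = 0; i < b->count; i++)
        total += b->chunks[i].size;
    return total;
}

/* Copy up to len bytes starting at a linear offset into dst,
 * crossing fragment boundaries as needed.
 * Returns the number of bytes actually copied. */
size_t chunked_read(const chunked_buf_t *b, size_t offset,
                    void *dst, size_t len)
{
    unsigned char *out = dst;
    size_t copied = 0;

    for (size_t i = 0; i < b->count && copied < len; i++) {
        size_t csize = b->chunks[i].size;

        if (offset >= csize) {          /* requested data starts later */
            offset -= csize;
            continue;
        }

        size_t avail = csize - offset;  /* bytes left in this fragment */
        size_t want  = len - copied;
        size_t n     = (want < avail) ? want : avail;

        memcpy(out + copied, b->chunks[i].base + offset, n);
        copied += n;
        offset = 0;                     /* next fragment is read from its start */
    }
    return copied;
}
```

With something like this, each layer could pull its own weights into a small contiguous scratch buffer right before it runs, so only the scratch buffer has to be contiguous. The cost is an extra copy per layer, which may or may not be acceptable on this CPU.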
What do you think?
P.S. The awesome part about YOLO is that it can detect not only faces but other subjects too. E.g. you can give the camera a command: "keep this banana in focus please" or "keep this one horse in focus please".
P.P.S. We will need an interface to switch between the targets (joystick?) and a way to describe the targets (typing on a keyboard? a predefined choice menu: animals, people's faces, cars etc.).
I have a 5D Mark II, so I'm asking here.
I also have some old knowledge of C/C++ and some ASM, but I have never developed anything low-level, only video games.