ML on ML (machine learning on Magic Lantern)

Started by names_are_hard, April 01, 2023, 01:02:35 AM

names_are_hard

This post contains some lies.  It was a joke, a jape, a ruse.  It's mostly true!  Explanation below.

We've known this for a while, but unusually, the 200D has an FPGA.  These are "Field Programmable": they're designed to be configured at runtime, so you can make them do different things.

Turns out, this one is quite powerful, and after a lot of work I managed to modify the bitstream to run machine-learning algorithms.

I thought real-time object detection would make a nice demo.  Here I'm running YOLOv3, a one-shot, deep neural network algorithm:
https://pjreddie.com/darknet/yolo/

What do you think?


https://i.imgur.com/guA8Tm8.mp4



Apologies for the shaky cam on this one, I've got a second cam on a gorilla pod tucked under my arm.  Quite hard to film yourself filming:

https://i.imgur.com/D4MYBfE.mp4


This creates so many possibilities now we can run modern AI on Canon cams!  Could be quite useful for Exif tagging images.


RumiG

It is the first of April, could it be a coincidence....

iaburn

From another user and on April 1st, I would say it's 100% fake... but in this case I'm just speechless  :o :o
I had no idea about the FPGA on that camera, and you talk about programming this chip with no official tools like something everyone can do  :o

names_are_hard

I've cheated a little in the description ;)  But there is no video trickery going on, and it really is doing YOLOv3 object detection.  Anyone with a 200D can run this.

yourboylloyd

No freaking way... I'll have to check back tomorrow after April 1
Join the ML discord! https://discord.gg/H7h6rfq

ilia3101

Wait... FPGA? Could it do any image processing? Like log encoding raw video?

Do other cameras have one?

names_are_hard

Log encoding raw video output doesn't make sense.  The sensor captures 14 bit, log is a curve that reduces this to typically 10 bits in a different way to non-log.  But if you simply keep all 14 bits, you can do that in post while retaining other options.
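To illustrate the point about log being a lossy re-mapping, here is a minimal sketch of squeezing 14-bit linear values into 10-bit codes.  The curve is a plain log2 mapping chosen for illustration, not any real camera's log profile, and the function names are made up for this example.

```python
import math

SENSOR_BITS = 14  # linear sensor depth mentioned above
LOG_BITS = 10     # typical log output depth mentioned above


def log_encode(linear, in_bits=SENSOR_BITS, out_bits=LOG_BITS):
    """Map a linear sensor value onto a smaller, log-spaced code range.

    Illustrative curve only: code = out_max * log2(1 + x) / log2(1 + in_max).
    """
    in_max = (1 << in_bits) - 1
    out_max = (1 << out_bits) - 1
    return round(out_max * math.log2(1 + linear) / math.log2(1 + in_max))


def log_decode(code, in_bits=SENSOR_BITS, out_bits=LOG_BITS):
    """Approximate inverse of log_encode; the rounding loss is not recoverable."""
    in_max = (1 << in_bits) - 1
    out_max = (1 << out_bits) - 1
    return round(2 ** (code / out_max * math.log2(1 + in_max)) - 1)


# Count how many distinct codes the full 14-bit range collapses into.
distinct_codes = len({log_encode(v) for v in range(1 << SENSOR_BITS)})
```

The 16384 possible input values land on at most 1024 codes, so many neighbouring highlight values become indistinguishable.  Keeping all 14 bits and applying a curve like this in post loses nothing up front.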

Given that we can't yet record raw video on 200D the question also doesn't yet matter :)

Yes, many other cams around this era have an FPGA.

Walter Schulz

Quote from: ilia3101 on April 02, 2023, 01:01:10 AM
Do other cameras have one?

Yes, big fat Spartan-6 on board of 5DS/5DSR and 7DII.

names_are_hard

This post is true

24 hours have passed.  So, how does it really work?

I didn't even lie very much!  There is an FPGA.  I'm not using it.  You probably can do cool things with it, but I don't know how it works.

There is no video trickery, there are no wires.

The 200D has wifi.  I've got Magic Lantern to use that, beam the LV data to my PC in real-time, and run object detection there.  Detected object coordinates are sent back, and ML draws the boxes.  I am really doing YOLO object detection, which really is a DNN algorithm.

We can do real-time machine learning processing, in a useful sense!  You can do it in the field too, tethered through your phone, I've tested this and it works.  The cam is not plugged into anything, it's not some HDMI trick.  You could replace object detection with anything you wanted.  You could upload your images as soon as you take them, in the background (but I warn you, the wifi on the cam seems quite slow, it would be about 15s per JPG).

Technical details below.

I cleaned up earlier network code from 6D and Turtius:
https://magiclantern.fandom.com/wiki/6D/Networking
https://github.com/turtiustrek/magiclantern_simplified_minecraft/commit/2bf635a77a17c19d5d6ef4800f80f4caf5ea9989

Turned that into an okay, not great, quality socket / wifi header:
https://github.com/reticulatedpines/magiclantern_simplified/blob/b8727b0d5b423ee7735f425a70bfe6fddd36b2b1/src/ml_socket.h

Wrote a module for the cam to read LV, strip the chroma bytes to save network bandwidth (YOLO doesn't use colour), and send to the server.  This is 200D only for now, but easy to port to cams that have Wifi.  Network config is stored on the card.
https://github.com/reticulatedpines/magiclantern_simplified/blob/b8727b0d5b423ee7735f425a70bfe6fddd36b2b1/modules/yolo/yolo.c
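The chroma stripping step can be sketched in a few lines.  This is written in Python for readability (the real module is C), and it assumes the LV buffer is packed 4:2:2 in UYVY byte order; check the linked yolo.c for the actual layout.  The function name is made up for this example.

```python
def strip_chroma_uyvy(frame: bytes) -> bytes:
    """Keep only the luma (Y) bytes of a packed UYVY 4:2:2 buffer.

    Assumed layout: U0 Y0 V0 Y1 U2 Y2 V2 Y3 ...
    Luma sits at every odd byte offset, so a stride-2 slice starting
    at offset 1 drops all chroma and halves the payload.
    """
    return bytes(frame[1::2])


# Tiny 4-pixel example: chroma bytes are 128, luma bytes are 10, 20, 30, 40.
example = bytes([128, 10, 128, 20, 128, 30, 128, 40])
luma = strip_chroma_uyvy(example)
```

Under that layout the greyscale payload is exactly half the size of the original buffer, which is where the bandwidth saving comes from.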

Wrote a fairly simple server that handles the cam data and does the image processing.
https://github.com/reticulatedpines/magiclantern_simplified/blob/b8727b0d5b423ee7735f425a70bfe6fddd36b2b1/modules/yolo/yolo_server.py
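For anyone wanting to write their own server, the general shape is: read one full frame from the socket, run whatever processing you like, send the results back.  The sketch below is not the code in yolo_server.py.  The wire format (a little-endian box count followed by x, y, w, h as 16-bit values) and all function names are assumptions made for illustration.

```python
import socket
import struct

COUNT_FORMAT = "<H"  # hypothetical: number of boxes, unsigned 16-bit
BOX_FORMAT = "<4H"   # hypothetical: x, y, w, h as unsigned 16-bit
BOX_SIZE = struct.calcsize(BOX_FORMAT)
COUNT_SIZE = struct.calcsize(COUNT_FORMAT)


def recv_exact(sock, n):
    """Read exactly n bytes; TCP recv() may return fewer than requested."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed before a full frame arrived")
        buf.extend(chunk)
    return bytes(buf)


def pack_boxes(boxes):
    """Serialise a list of (x, y, w, h) tuples into the hypothetical format."""
    payload = struct.pack(COUNT_FORMAT, len(boxes))
    for x, y, w, h in boxes:
        payload += struct.pack(BOX_FORMAT, x, y, w, h)
    return payload


def unpack_boxes(payload):
    """Inverse of pack_boxes, i.e. what the receiving side would do."""
    (count,) = struct.unpack_from(COUNT_FORMAT, payload, 0)
    return [struct.unpack_from(BOX_FORMAT, payload, COUNT_SIZE + BOX_SIZE * i)
            for i in range(count)]


def handle_frame(sock, frame_size, detect):
    """Receive one frame, run detect(frame) -> list of boxes, reply."""
    frame = recv_exact(sock, frame_size)
    sock.sendall(pack_boxes(detect(frame)))
```

The `detect` callback is where object detection, or anything else, would plug in.  It only needs to take the raw frame bytes and return a list of boxes.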

That branch will get removed when I integrate the code into dev.  There are a few things that need improving first.

The server sends back detection box coords, which the module displays.  The lag is almost all network related.  Object detection takes 20ms (GPU accelerated).  But it's taking 200-500ms to send 300kB of LV data.  Sometimes it lags and takes > 3s.  This is much slower than it should be, it's 802.11g so the theoretical max is around 5MB/s, and we're getting...  0.6MB/s.  The 6D networking notes suggest this is limited by the priority of the Canon networking tasks, so perhaps this can be tweaked.  I've tried a few things in that area but no luck so far.
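The arithmetic behind those numbers, as a quick sketch.  It takes 1 kB as 1000 bytes and ignores the time to send the boxes back, both of which are simplifications.

```python
LINK_RATE_MBIT = 54  # nominal 802.11g link rate, before protocol overhead


def throughput_mb_per_s(payload_bytes, seconds):
    """Effective throughput in MB/s for one transfer."""
    return payload_bytes / seconds / 1_000_000


def frames_per_second(transfer_s, detect_s=0.020):
    """Upper bound on update rate when each frame is sent, then processed."""
    return 1.0 / (transfer_s + detect_s)


frame_bytes = 300_000                                # ~300kB of LV data
slow = throughput_mb_per_s(frame_bytes, 0.5)         # 500ms transfer
fast = throughput_mb_per_s(frame_bytes, 0.2)         # 200ms transfer
raw_link_mb_per_s = LINK_RATE_MBIT / 8               # nominal rate in MB/s
update_rate = frames_per_second(0.5)                 # box updates per second
```

A 500ms transfer works out to 0.6MB/s and a 200ms transfer to 1.5MB/s, so the 20ms of detection time is a small fraction of the round trip either way.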

OpenCV and CUDA are doing the heavy lifting.  They can do a lot of things, and 500ms response time is fine for some tasks.  Got an idea for something cooler, or more useful?  There really are so many things we can do with off-cam CPU / GPU processing!

Danne

Hm, real time. That is impressive. And you managed to put it into the cam. Even better.

yourboylloyd

That's really cool. Haven't seen anything this cool in a while tbh. I can't imagine all of the practical uses that can come from it.

names_are_hard

Thanks.  It does make a cool demo and it's fun to play with, but ID'ing stuff on LV isn't that useful really (maybe if you were using it as a security cam?).

If you can think of some more practical network connected ideas, I might be able to implement them.

iaburn

It's very cool to see, but hard to think of practical use cases where this setup will be better than a normal web cam. If you need a computer nearby I guess most of the time it will be better to see what's going on on the computer's screen  ::)

names_are_hard

You don't need a computer nearby, and it isn't limited to object detection.

You can connect via your phone, and ML can send any data to any server in the world, which can process it, and send results back.

What data is sent, and what processing happens, is up to us.

domasa

Using an external device for computing/data is a great idea!

A phone could continuously save the GPS position to a Wifi SD card (for cameras that don't have wifi), and ML could write that value into the image after a photo is taken... Would it be possible?

names_are_hard

Wifi SD cards are different, and I don't have one.  Perhaps they work the same, perhaps not.  200D has bluetooth and can tag images with GPS via phone already in this manner.