DIGIC 7 development (200D/SL2, 800D/T7i, 77D, 6D2)

Started by feedrail, June 12, 2017, 07:05:50 AM


fgervais


names_are_hard

Good progress over the weekend.  The horrible task bugs were defeated.  Stable ML code up to GUI.  Button handling works.  Menus don't, but drawing does.  Getting there.

[gifv]https://i.imgur.com/ANitaWm.mp4[/gifv]

names_are_hard

Don't get excited, it's just the menus, nothing works...  but:

[gifv]https://i.imgur.com/UP8gqR0.mp4[/gifv]


Teamwork with Kitor, thanks!

Raging Ocelot

This makes me happy. I upgraded from a T3i to a T7i and I can't wait for a Magic Lantern build for my T7i. Thank you so much for your work.

Ondre

Maybe I can make a donation to support the developer?

Walter Schulz

Bitcoin only, ATM. And you can only donate to the project but not to a particular port or dev.

Straight_Shooter

It would be great if one could donate to this project via PayPal. I know I would give a couple bucks to it for sure.

names_are_hard

Have an HDMI capture progress report from 200D.  Minor edits to skip boring pauses:

[gifv]https://i.imgur.com/hBmJDKN.mp4[/gifv]

names_are_hard

Quite a hard bug diagnosed and provisionally fixed in module relocation.  Took me a few weeks to understand.  This is very significant: it removes a large number of very hard to understand crashes from modules, and exposes other, easier to debug problems.

Module loading in ML works by building ELF objects, which are copied to card, and then ML uses libtcc to load these into mem.  The loading process has a relocation step.  There were (probably) two problems here.  The more important one: libtcc has, I think, a bug in its check for whether a relocation target is too far away.  Since modules are built as ARM, not Thumb, the default approach means the target of calls / jumps can only be +-32MB from the call site; an ARM branch encodes its offset as a signed 24-bit word count.

For at least Digic 7 cams, offsets can be outside this range, they're typically in 0xe000.0000 or 0xdf00.0000 areas.  Older cams are in 0xff80.0000 or thereabouts, so you can underflow and be within 32MB.  Libtcc tries to check if the target is outside this range but the error condition doesn't trigger.  This was causing tccelf.c, relocate_sections() to fixup the modules with inappropriate relocations.  And that meant calls from within module code would go to *completely unrelated* offsets, with unpredictable and very hard to diagnose crashes.
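A minimal sketch of the range check described above, assuming ARM-state B/BL semantics (PC reads as call site + 8, offset stored as a signed 24-bit word count, address arithmetic wrapping at 32 bits). The call site address is hypothetical and only stands in for "somewhere in a low heap region"; the two targets are the areas named in the post. This is not libtcc's code.

```python
MASK32 = 0xFFFFFFFF
BRANCH_MIN = -(1 << 25)       # -32 MiB
BRANCH_MAX = (1 << 25) - 4    # +32 MiB minus one word


def branch_offset(call_site, target):
    """Signed 32-bit distance an ARM-state B/BL would have to encode."""
    # In ARM state the PC reads as the instruction address + 8.
    raw = (target - (call_site + 8)) & MASK32
    # Reinterpret the wrapped 32-bit value as a signed number.
    return raw - (1 << 32) if raw & 0x80000000 else raw


def reachable(call_site, target):
    """True if a single relative branch can reach target from call_site."""
    off = branch_offset(call_site, target)
    return off % 4 == 0 and BRANCH_MIN <= off <= BRANCH_MAX


HEAP_CALL_SITE = 0x00100000   # hypothetical module address, for illustration only

print(reachable(HEAP_CALL_SITE, 0xFF800000))  # older cams: wraps to about -9 MiB
print(reachable(HEAP_CALL_SITE, 0xE0000000))  # Digic 7 area: about -513 MiB
```

With these assumed addresses the first target is reachable only because the subtraction wraps around, and the second is not reachable at all, which matches the "it was always luck" reading.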

As far as I can see, the fact that older cams happen to allow modules to successfully relocate was always luck.  If heap allocs had occurred from a different region, or if stub addresses hadn't happened to be "near" to heap addresses via wraparound, it would never have worked.

My provisional fix is to build modules with -mlong-calls.  This changes the asm output so the form is "blx r3" style, which allows full 32-bit offsets.  This incurs a minor space and perf cost, and is currently untested on old cams.

kitor

Quote: and is currently untested on old cams.

Not true anymore, verified on 50D to work as intended ;)
Too many Canon cameras.
If you have a dead R, RP, 250D mainboard (e.g. after camera repair) and want to donate for experiments, I'll cover shipping costs.

heder

"+/-32 MB near jump issue": This is (I guess) the problem that all new cameras will face. It's also present on the 1300D. On 1300D you cannot hijack any calls, as the addresses (Canon firmware) are further than +/-32MB away from the Magic Lantern code. I wrote a patch function that, instead of using a single near jump, uses a near jump into a long jump. Maybe this problem is also present on DIGIC7? If so, a *temporary* solution is available in the 1300D thread.
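A rough sketch of the "near jump into a long jump" idea, assuming ARM state. It is not the patch function from the 1300D thread: the function names, the addresses in the usage lines, and the choice of where the trampoline lives are all hypothetical. The patched site gets one relative B to a nearby trampoline, and the trampoline holds `ldr pc, [pc, #-4]` followed by the full 32-bit destination.

```python
MASK32 = 0xFFFFFFFF
LDR_PC_PC_MINUS_4 = 0xE51FF004   # ARM encoding of: ldr pc, [pc, #-4]


def encode_b(site, dest):
    """Encode an ARM-state unconditional B placed at site, jumping to dest."""
    off = (dest - (site + 8)) & MASK32      # PC reads as site + 8
    if off & 0x80000000:
        off -= 1 << 32                      # reinterpret as signed
    if off % 4 or not -(1 << 25) <= off < (1 << 25):
        raise ValueError("destination out of near-branch range")
    return 0xEA000000 | ((off >> 2) & 0x00FFFFFF)


def build_hook(site, trampoline, target):
    """Return {address: 32-bit word} writes that redirect site to target."""
    return {
        site: encode_b(site, trampoline),   # single patched instruction
        trampoline: LDR_PC_PC_MINUS_4,      # loads the word that follows it
        trampoline + 4: target & MASK32,    # absolute destination, any distance
    }


# Hypothetical addresses: patch site and trampoline close together, target far away.
for addr, word in sorted(build_hook(0xFE100000, 0xFE200000, 0x00400000).items()):
    print(f"{addr:#010x}: {word:#010x}")
```

Only the first write touches the original code, so the "patch a single instruction" property is kept; the cost is needing 8 writable bytes within +/-32MB of the patch site.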

names_are_hard

For the module context, since we control the source and are compiling it, I think using -mlong-calls is an appropriate fix.  For patching at runtime, that won't work and you might need to do what you're describing.  I guess that's the context you need to work in, so you can't use -mlong-calls?  Do modules work on 1300D?  If DryOS code is far from heap alloc region, I think they won't, and I also think that my fix will work for you.

I'd like someone with better experience with ARM to confirm -mlong-calls makes sense for where I'm using it - any volunteers?

I'm not doing any runtime patches yet on 200D, they haven't been needed.  I'm assuming I'll use MMU remapping when it gets to that (or, if I'm lucky, overwriting a thunked function, the 200D has *some* code in RWX pages).

heder

Quote from: names_are_hard on May 23, 2021, 03:03:51 PM
For the module context, since we control the source and are compiling it, I think using -mlong-calls is an appropriate fix.  For patching at runtime, that won't work and you might need to do what you're describing.  I guess that's the context you need to work in, so you can't use -mlong-calls?  Do modules work on 1300D?  If DryOS code is far from heap alloc region, I think they won't, and I also think that my fix will work for you.


I think the -mlong-calls trick is the way to go, the optimization gained by using relative jumps is useless.

I don't know if modules are running on the 1300D, I just worked out a solution for hijacking firmware functions, where you patch a single instruction, and you're right that the heap alloc region needs to be close, aka within +/-32MB of the Canon firmware (99%).

I'll ask citrix to read this thread

names_are_hard

I'm not sure.  Relative calls are 1 instruction, 4 bytes and no memory access. Long calls are 2 instructions, 8 + 4 bytes including an additional d-cache read.  It's easy to imagine situations where it would be significant, but I don't know if we hit them.  Not enough runs on 200D for me to try profiling it, and I can't use relative calls for modules so I can't do a comparison easily.
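For a feel of the space side only, here is a back-of-envelope estimate using the per-call figures from the post (4 bytes for a relative call, 8 + 4 bytes for a long call). The call-site count in the usage line is made up, and the result is a rough figure: a compiler may share one address literal between several calls, and this says nothing about the extra memory read.

```python
SHORT_CALL_BYTES = 4        # one relative call instruction
LONG_CALL_BYTES = 8 + 4     # two instructions plus a 4-byte address literal


def long_call_overhead(call_sites):
    """Rough extra bytes if every call site becomes a long call."""
    return call_sites * (LONG_CALL_BYTES - SHORT_CALL_BYTES)


# Hypothetical module with 1000 call sites.
print(long_call_overhead(1000), "extra bytes")
```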

kadushkin90

Hello, I don't understand anything about programming, but I do have a 6D mark II camera available. Is there anything I can do to help you?

Walter Schulz

I don't think so. ATM programmers need physical access to a camera.
One thing you are able to do now but it won't accelerate porting ML to 6D2: Looking for UART connectors.

EDIT: Not needed anymore. Watched a disassembly video. I'm sure the UART is located to the right of the rear dial, and there is a rectangular hole to access it.

names_are_hard

Beep boop.  200D has a game.

[gifv]https://i.imgur.com/FrzpFvk.mp4[/gifv]

One day, font addr will be good.

names_are_hard

Made some improvements to memory management routines, and fixed some ML assumptions that were not true for newer Digic:
https://github.com/reticulatedpines/magiclantern_simplified/tree/feature_show_free_mem

This means there's now a useful display of memory info for 200D.  Should be not hard to extend to other semi-supported Digic 6, 7, 8 cams:

[gifv]https://i.imgur.com/PAcDmMF.mp4[/gifv]

names_are_hard

Progress continues.  Free Mem work uncovered some complicated problems with SRM allocator that took a while to get tamed (Kitor helped a lot here, thanks!).  For now, we have disabled this allocator on D678 (couldn't get free() to work correctly, which causes crashes due to SRM being a LIFO allocator).
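To illustrate why a misbehaving free() is a problem for a LIFO allocator in general, here is a toy stack-style allocator. It is not a model of Canon's SRM; the class and its behaviour are assumptions chosen only to show the ordering constraint: blocks must be released in the reverse order they were handed out, so one failed or out-of-order free leaves everything allocated before it stuck.

```python
class LifoAllocator:
    """Toy stack-style allocator: only the newest live block can be freed."""

    def __init__(self, size):
        self.size = size
        self.top = 0        # next free offset
        self.live = []      # (offset, length) in allocation order

    def alloc(self, length):
        if self.top + length > self.size:
            return None     # out of memory
        offset = self.top
        self.live.append((offset, length))
        self.top += length
        return offset

    def free(self, offset):
        # A real allocator might corrupt state here instead of raising.
        if not self.live or self.live[-1][0] != offset:
            raise ValueError("out-of-order free on a LIFO allocator")
        _, length = self.live.pop()
        self.top -= length


heap = LifoAllocator(64)
first = heap.alloc(16)
second = heap.alloc(32)
try:
    heap.free(first)        # wrong order: 'second' is still live on top
except ValueError as err:
    print("rejected:", err)
heap.free(second)
heap.free(first)            # correct order works
```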

Due to bumping up against the ML reserved memory limit, I worked out how to steal more memory from DryOS in early init.  ASCII art summary here: https://github.com/reticulatedpines/magiclantern_simplified/commit/a0bc8f60af48ff48e448632e55507c47445d2246
This works well on the cameras we've tested on (200D, R, RP, M50 I think, maybe not all of these).  We have enough memory now that we can have multiple features enabled, plus I made it simpler to adjust how much is being stolen.  But a downside is that it requires changes to stubs.S for other D678 cams in order to build.  That means I have to do some reversing work on every other D678 cam, or split the cams so some use the old system.  I'd like to avoid the latter option as the solution should be general, and splitting would make the code quite a lot more ugly.

Separately, I got task info working.  This required tracking down some changes to task related structs (task_attr_str).  And fixing a bunch of null pointer derefs in ML code!  Digic 4 and 5 cams don't crash on null pointer derefs, but D678 ones do, due to MMU settings...  I'm sure there's lots more of that fun to discover.

Next up is finding something acceptable to do with new memory stealing on the other D678 cams, then getting Qemu regression testing working locally, so I can have some confidence that changes to support new cams haven't broken old cams.  I think it's likely this will expose some problems that will then need fixing, since my repo is a poorly tested merge of several important upstream branches.

OverrRyde

Thank you very much for the updates! Created an account to keep track.

Has there been any progress since the last update? All I really want is clean HDMI! LOL

thank you again!

Brakeless

Thanks guys. I'm not a developer, but I now have a 6D2, and in 2 weeks I will return my EOS R. I can run your work if that helps you...? I'm just trying to help. Hello from Russia ❤️

names_are_hard

Thanks for the offer.  We have an active dev with EOS R, but there is nothing that needs testing.  6D2 doesn't have a dev working on it, so, again, nothing to test.  Later on you could maybe help porting by doing tests on 6D2, but I think all devs are busy enough for now that they won't have time for that.  I'll let you know if I'm wrong!

names_are_hard

All D678 cams can now use the same improved mem stealing logic.  This means one less file to maintain, plus more memory for ML.  This was fairly complicated work and I had to update every cam after confirming stuff in roms.  Kitor had a nice idea on how to simplify one part of this process that reduces what you need to find on new ports, as well as letting me make the code cleaner.

Implemented a mechanism for allowing limited PROP_REQUEST_CHANGE so we can safely experiment without enabling all props.

A bunch of small improvements to the build system to remove some annoyances, and a major fix that removes a significant race condition with parallel builds.  The "minimal" subdir builds were clobbering files during build, causing "make zip" builds to fail randomly.  Was quite annoying to track down.

Found initial stubs for 6D2 builds, should be easy now to get what we have running on that cam.

Generally, a lot of boring but useful work that is applicable to all D678 cams, not just Digic 7.  There's some small bonuses as well, you can check the repo if you want to find them.

Special thanks to kitor and coon who have been quite active lately doing a lot of good work, checking my stuff, talking ideas over, as well as doing their own work and merging in.  I'll let them talk about their own parts!  Between us we're doing well keeping all the things we add working across D678.

We even have a few possible new devs in Discord, who will hopefully stick around!

Thelgord

"Found initial stubs for 6D2 builds, should be easy now to get what we have running on that cam."

Thank you! Just found this thread and I am excited that my camera is being worked on :) If you need a tester for the 6D2 just hit me up :)

shankar101

Hi, how far has development got? I believe the 200D is going to be way better than the EOS M for raw.