ML on EOS-M2

Started by Palpatine, September 22, 2015, 02:48:23 PM

nikfreak

a1ex always told me to use the blink code (led blinking) whenever I got stuck while porting to 70D/100D.
That way it was rather easy to determine at which part of the code we got stuck. Mostly the symptom for becoming stuck was a continuous LED light and not a blinking one.
In your case I would start here in boot-hack.c:


/* This runs ML initialization routines and starts user tasks.
* Unlike init_task, from here we can do file I/O and others.
*/
static void my_big_init_task()
{
    _find_ml_card();
    _load_fonts();


and place some strings/printfs/DebugMsg calls (OK1/2/3...) before or in between to see whether you reach that part of the code in QEMU. Then keep moving the markers forward or back until you find the last point that is reached before it gets stuck. It has been years since I last used QEMU with ML, so I can't tell you exactly what code to use as an alternative to the blinking LED. You need something you will see in the QEMU console...
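A minimal sketch of such markers, assuming plain C stdio is reachable from the build. format_checkpoint and CHECKPOINT are hypothetical names, not ML functions, and stderr only stands in for whatever output actually shows up in the QEMU console:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of ML): formats a marker such as
 * "[OK2] my_big_init_task:21" into buf. Returns the snprintf length. */
static int format_checkpoint(char *buf, size_t size, int n,
                             const char *func, int line)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "[OK%d] %s:%d", n, func, line);
}

/* Emits one marker on stderr; swap the fputs/fputc calls for the
 * port's own console print if stderr is not visible. */
#define CHECKPOINT(n) do {                                          \
        char cp_buf[64];                                            \
        format_checkpoint(cp_buf, sizeof(cp_buf), (n),              \
                          __func__, __LINE__);                      \
        fputs(cp_buf, stderr);                                      \
        fputc('\n', stderr);                                        \
    } while (0)
```

Putting CHECKPOINT(1) before _find_ml_card() and CHECKPOINT(2) right after it would show which of the two calls never returns.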
70D.112 & 100D.101

dfort

Hi @nikfreak

That gave me an idea:

static void my_big_init_task()
{
//    _find_ml_card();
    _load_fonts();

#ifdef CONFIG_HELLO_WORLD
    hello_world();


Skip the card for now and just print "Hello, World!" and the firmware signature.



Well at least it doesn't crash out of QEMU and the LED is blinking. The font addresses check out so it is something else. Oh, and I navigated out of the Canon menu just to get a clean screenshot. The Canon menus continue to work but it looks even worse with them on.

nikfreak

Sounds like you need to triple-check the FIO stubs, because _find_ml_card() is part of fio-ml.c and most of that code is FIO related.
70D.112 & 100D.101

dfort

Yep, you nailed it. A couple of my FIO stubs were off.


/** File I/O **/
NSTUB(0xFF357704,  FIO_CloseFile)
-NSTUB(0xFF3576FC,  FIO_FindClose)
+NSTUB(0xFF3586FC,  FIO_FindClose)
NSTUB(0xFF35861C,  FIO_FindNextEx)
NSTUB(0xFF3574B4, _FIO_ReadFile)
-NSTUB(0xFF357704,  FIO_SeekSkipFile)
+NSTUB(0xff357564,  FIO_SeekSkipFile)
NSTUB(0xFF357654, _FIO_WriteFile)
NSTUB(0xFF357F60, _FIO_CreateDirectory)
NSTUB(0xFF357360, _FIO_CreateFile)
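
The removed FIO_SeekSkipFile line carried the same address as FIO_CloseFile (0xFF357704), which is the kind of copy-paste slip a small check can flag. A rough sketch, assuming one NSTUB(...) entry per line. parse_nstub and count_duplicate_addresses are hypothetical helper names, and a shared address is not automatically a bug, since two names may alias one function:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct stub {
    unsigned long addr;
    char name[64];
};

/* Parses one "NSTUB(0xADDR, name)" line.
 * Returns 1 on success, 0 if the line does not match (e.g. diff lines
 * that still start with '+' or '-'). */
static int parse_nstub(const char *line, struct stub *out)
{
    return sscanf(line, " NSTUB( %lx , %63[^ \t)]",
                  &out->addr, out->name) == 2;
}

/* Counts pairs of stubs that share an address; if report is not NULL,
 * each pair is printed there for manual review. */
static int count_duplicate_addresses(const struct stub *stubs, int n,
                                     FILE *report)
{
    int dups = 0;
    assert(stubs != NULL || n == 0);
    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            if (stubs[i].addr == stubs[j].addr) {
                if (report)
                    fprintf(report, "0x%08lX: %s and %s\n",
                            stubs[i].addr, stubs[i].name, stubs[j].name);
                dups++;
            }
        }
    }
    return dups;
}
```

Feeding it the pre-fix lines from above reports FIO_CloseFile and FIO_SeekSkipFile as one suspicious pair.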


Finally!



Moving beyond "Hello, World!" it saves ROM0.BIN and ROM1.BIN and prints error messages and saves ASSERT00.LOG and... Looks like there's more work to do.

So here is where we're at now:

Error loading 'ML/MODULES/EOSM2_103.sym': File does not exist
...
ML ASSERT:
streq(stateobj->type, "StateObject")
at ../../src/state-object.c:251 (stateobj_start_spy), task ml_init
lv:0 mode:3


But wait, it is there! Right at the top of the list:

ls -la /Volumes/EOS_DIGITAL/ML/modules/
total 1824
drwxrwxrwx  1 rosiefort  staff   16384 Jul 27 16:07 .
drwxrwxrwx  1 rosiefort  staff   16384 Jul 27 16:06 ..
-rwxrwxrwx  1 rosiefort  staff   36278 Jul 27 16:06 EOSM2_103.sym
-rwxrwxrwx  1 rosiefort  staff      96 Dec 31  1999 LOADING.LCK
-rwxrwxrwx  1 rosiefort  staff   21876 Jul 27 16:07 adv_int.mo
-rwxrwxrwx  1 rosiefort  staff   13200 Jul 27 16:07 arkanoid.mo
-rwxrwxrwx  1 rosiefort  staff   18804 Jul 27 16:07 autoexpo.mo
-rwxrwxrwx  1 rosiefort  staff   18128 Jul 27 16:07 bench.mo
-rwxrwxrwx  1 rosiefort  staff    8536 Jul 27 16:07 deflick.mo
-rwxrwxrwx  1 rosiefort  staff   15916 Jul 27 16:07 dual_iso.mo
-rwxrwxrwx  1 rosiefort  staff   32372 Jul 27 16:07 ettr.mo
-rwxrwxrwx  1 rosiefort  staff   15668 Jul 27 16:07 file_man.mo
-rwxrwxrwx  1 rosiefort  staff  305600 Jul 27 16:07 lua.mo
-rwxrwxrwx  1 rosiefort  staff   41680 Jul 27 16:07 mlv_lite.mo
-rwxrwxrwx  1 rosiefort  staff   45272 Jul 27 16:07 mlv_play.mo
-rwxrwxrwx  1 rosiefort  staff   64384 Jul 27 16:07 mlv_rec.mo
-rwxrwxrwx  1 rosiefort  staff   11248 Jul 27 16:07 mlv_snd.mo
-rwxrwxrwx  1 rosiefort  staff    6916 Jul 27 16:07 pic_view.mo
-rwxrwxrwx  1 rosiefort  staff   83852 Jul 27 16:07 selftest.mo
-rwxrwxrwx  1 rosiefort  staff   21952 Jul 27 16:07 silent.mo


I believe that LOADING.LCK file is because I don't know how to exit from QEMU cleanly. Either that or my system is on Throwback Thursday.

a1ex

Quote from: dfort on July 28, 2017, 01:24:10 AM
Error loading 'ML/MODULES/EOSM2_103.sym': File does not exist

But wait, it is there!

Long Live Microsoft :)

This feature is only available in older models (5D2 generation).

Quote
I believe that LOADING.LCK file is because I don't know how to exit from QEMU cleanly.

Right. There is an attempt to emulate a shutdown (press B to simulate opening the battery door), but it's incomplete.

dfort

EOSM2103.FIR - Looks like Canon is taking all they can get out of an 8.3 filename.

Following what was done on the 1100D:

# Definitions for version 103
ML_MODULES_SYM_NAME=m2_$(FW_VERSION).sym
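
Reading the "older models only" remark as a long-file-name limitation (my assumption, based on the 8.3 comment above), the old name fails simply because its base is nine characters long. A small sketch of the length check; fits_8_3 is a hypothetical helper, and it only looks at lengths, not at which characters are allowed:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if name fits the classic 8.3 pattern (1 to 8 characters,
 * then optionally a dot and up to 3 characters), 0 otherwise.
 * Character-set restrictions are ignored in this sketch. */
static int fits_8_3(const char *name)
{
    assert(name != NULL);

    const char *dot = strrchr(name, '.');
    size_t base_len = dot ? (size_t)(dot - name) : strlen(name);
    size_t ext_len  = dot ? strlen(dot + 1) : 0;

    if (base_len == 0 || base_len > 8 || ext_len > 3)
        return 0;

    /* A second dot inside the base part is not valid 8.3 either. */
    return memchr(name, '.', base_len) == NULL;
}
```

With this check, EOSM2_103.sym is rejected, while m2_103.sym (what the new definition gives, assuming FW_VERSION expands to 103) and EOSM2103.FIR both pass.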


Got past that. Now on to the next issue:

ASSERT00.LOG
ML ASSERT:
streq(stateobj->type, "StateObject")
at ../../src/state-object.c:251 (stateobj_start_spy), task ml_init
lv:0 mode:3

ml_init stack: 1f3978 [1f39c8-1ef9c8]
0xUNKNOWN  @ 44c8dc:1f39b0
0x0044CA04 @ 477e28:1f39a8
0x0044C468 @ 44ca64:1f3978

Magic Lantern version : Nightly.2017Jul28.EOSM2103
Mercurial changeset   : 69d91c7c4317 (qemu-wip) tip
Built on 2017-07-28 15:40:15 UTC by [email protected].
Free Memory  : 344K + 1158K


Is this typical of a new port or is the EOSM2 an especially stubborn little bastard?

BTW--I keep the EOSM2 branch updated with the latest changes that work in QEMU: https://bitbucket.org/daniel_fort/magic-lantern/branch/EOSM2.103_wip

dfort

Found another stub that was off, _engio_write, though this time it didn't resolve the current issue.

/**
* State object hooks are pieces of code that run in Canon tasks (state objects). See state-object.c .
* They might slow down Canon code, so here you can disable all of them (useful for debugging or early ports)
*/
#define CONFIG_STATE_OBJECT_HOOKS


Tried disabling CONFIG_STATE_OBJECT_HOOKS but it won't compile and it was a bit too complicated for me trying to find all the pieces needed to get it to build.

I did track down this QEMU message:

[   menu_task:0044db24 ] (32:03) read_entire_file:736: failed


to this:

fio-ml.c
uint8_t* read_entire_file(const char * filename, int* buf_size)
...
getfilesize_fail:
    DEBUG("failed");
    return NULL;


All of this is happening in task ml_init so I take it that we still have a long journey into the UNKNOWN ahead.

ml_init stack: 1f3978 [1f39c8-1ef9c8]
0xUNKNOWN  @ 44c8dc:1f39b0
0x0044CA04 @ 477e28:1f39a8
0x0044C468 @ 44ca64:1f3978


Any hints on how to track this down? Maybe where to put in a break point?

a1ex

http://www.magiclantern.fm/forum/index.php?topic=19933 -> commands to translate the numbers from the stack trace into source code lines. These addresses are from ML code; if I compile my own binary from the same source, I will probably get different results unless I have the same compiler version and the same source code modifications, if any. Still, let me try it:


hg clone https://bitbucket.org/daniel_fort/magic-lantern/
cd magic-lantern
hg up 69d91c7c4317 -C  # changeset from the stack trace log

# all options and source code modifications are visible in autoexec.bin, but as I don't have it, I'm just guessing
echo "CONFIG_QEMU=y" > Makefile.user
sed -i 's!#define CONFIG_HELLO_WORLD!//#define CONFIG_HELLO_WORLD!' src/config-defines.h

cd platform/EOSM2.103
make clean; make

eu-addr2line -s -S --pretty-print -e magiclantern 0x44c8dc 0x477e28 0x44ca64
my_big_init_task+0x58 at boot-hack.c:307
stateobj_start_spy.constprop.0+0x2c at state-object.c:251
ml_assert_handler+0x60 at boot-hack.c:539

eu-addr2line -s -S --pretty-print -e magiclantern 0x0044CA04 0x0044C468
ml_assert_handler at boot-hack.c:530
backtrace_getstr at backtrace.c:877


Looks OK!

The stack trace with -d callstack is more complete (it finds the 0xUNKNOWNs and also contains function arguments); you can get it with a breakpoint on ml_assert_handler and calling print_current_location_with_callstack from there.

Even better - we are debugging code compiled by us, which - on the qemu branch - also has debug info for gdb, in the same way as a regular PC program has debug info. This info is not copied to the card - it's kept in the "magiclantern" file, which is actually an ELF. With this info, gdb can print a backtrace as well, and it's probably better than ours, as long as the error is in our code. There's no debug info on Canon firmware, other than their (very helpful) debug messages.


. ./export_ml_syms.sh EOSM2.103
./run_canon_fw.sh EOSM2,firmware="boot=1" -d debugmsg,callstack -s -S & arm-none-eabi-gdb -x EOSM2/debugmsg.gdb
...
CTRL-C before the error
(gdb) symbol-file ../magic-lantern/platform/EOSM2.103/magiclantern
(gdb) b ml_assert_handler
(gdb) continue
...
Breakpoint 4, ml_assert_handler (...)

(gdb) bt
#0  ml_assert_handler (msg=msg@entry=0x4a5ad8 "streq(stateobj->type, \"StateObject\")", file=file@entry=0x4a5afd "../../src/state-object.c", line=line@entry=0xfb, func=func@entry=0x49ca48 <__func__.7031> "stateobj_start_spy") at ../../src/boot-hack.c:530
#1  0x00477e2c in stateobj_start_spy (stateobj=0xe51f3e14, spy=0x477cd0 <stateobj_lv_spy>) at ../../src/state-object.c:251
#2  0x0044c8e0 in call_init_funcs () at ../../src/boot-hack.c:307
#3  my_big_init_task () at ../../src/boot-hack.c:448
#4  0x0000ca18 in ?? ()

(gdb) print_current_location_with_callstack
Current stack: [1f39c8-1ef9c8] sp=1f39a8                                         at [ml_init:44ca04:477e2c] (ml_assert_handler)
0x44C884 my_big_init_task(0, 44c884 my_big_init_task, 19980218, 19980218)        at [ml_init:ca14:1f39c0] (pc:sp)
0x477E80 state_init(32, 3, 49d60d "Calling init_func %s (%x)", 477e80 state_init)
                                                                                 at [ml_init:44c8dc:1f39b0] (my_big_init_task) (pc:sp)
  0x44CA04 ml_assert_handler(4a5ad8 "streq(stateobj->type, "StateObject")", 4a5afd "../../src/state-object.c", fb, 49ca48 "stateobj_start_spy")
                                                                                 at [ml_init:477e28:1f39a8] (stateobj_start_spy.constprop.0) (pc:sp)


With debug info available, gdb gives a pretty good backtrace. We don't have this luxury when debugging Canon code though, and that's the reason I wrote the callstack analysis. The callstack trace does not require any debug info, but requires the code to be instrumented (therefore it's slower than normal execution). The backtrace from the assert log does not require instrumentation, but also gives a lot less info.

With my WIP version of QEMU (option to identify tail function calls), the stack trace would be:

[...] -d debugmsg,callstack,tail [...]
[...]
(gdb) print_current_location_with_callstack
Current stack: [1f39c8-1ef9c8] sp=1f39a8                                         at [ml_init:44ca04:477e2c] (ml_assert_handler)
0x44C884 my_big_init_task(0, 44c884 my_big_init_task, 19980218, 19980218)        at [ml_init:ca14:1f39c0] (pc:sp)
0x477E80 state_init(32, 3, 49d60d "Calling init_func %s (%x)", 477e80 state_init)
                                                                                 at [ml_init:44c8dc:1f39b0] (my_big_init_task) (pc:sp)
  0x477DFC stateobj_start_spy.constprop.0(e51f3e14, 4bd9a4, 0, 477e80 state_init)
                                                                                 at [ml_init:477e9c:1f39b0] (state_init) (pc:sp)
   0x44CA04 ml_assert_handler(4a5ad8 "streq(stateobj->type, "StateObject")", 4a5afd "../../src/state-object.c", fb, 49ca48 "stateobj_start_spy")
                                                                                 at [ml_init:477e28:1f39a8] (stateobj_start_spy.constprop.0) (pc:sp)


Note: call_init_funcs() is not listed in my stack trace because the compiler inlined it. Still, gdb did a very good job identifying it. In my trace, it only says state_init was called from 44c8dc which maps to boot-hack.c:307.

Why did my stack trace list 0x4bd9a4 as the second argument of stateobj_start_spy?! In gdb's backtrace, it's 0x477cd0 (which is correct).

Answer: the compiler hardcoded this one, so the compiled function (in assembly) only receives one argument:

(gdb) disas state_init
   0x00477e80 <+0>: push {r3, lr}
   0x00477e84 <+4>: mov r3, #589824 ; 0x90000
   0x00477e88 <+8>: ldr r0, [r3, #1456] ; 0x5b0
   0x00477e8c <+12>: bl 0x477dfc <stateobj_start_spy>
   0x00477e90 <+16>: mov r3, #262144 ; 0x40000
   0x00477e94 <+20>: ldr r0, [r3, #1252] ; 0x4e4
   0x00477e98 <+24>: pop {r3, lr}
   0x00477e9c <+28>: b 0x477dfc <stateobj_start_spy>

(gdb) disas stateobj_start_spy
...
   0x00477e58 <+92>: ldr r3, [pc, #28] ; 0x477e7c <stateobj_start_spy+128>
   0x00477e5c <+96>: str r3, [r4, #12]
   0x00477e60 <+100>: mov r0, #0
   0x00477e64 <+104>: pop {r4, pc}

(gdb) x 0x477e7c
0x477e7c <stateobj_start_spy+128>: 0x00477cd0
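
The addresses in that disassembly can be checked by hand. In ARM state, reading pc gives the address of the current instruction plus 8, so the ldr at 0x477e58 with offset 28 loads from 0x477e7c, which matches the comment gdb printed. A small sketch of the arithmetic (the function names are made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* In ARM (not Thumb) state, pc reads as the current instruction
 * address + 8, so "ldr rX, [pc, #imm]" at insn_addr loads from
 * insn_addr + 8 + imm. */
static uint32_t arm_pc_literal_address(uint32_t insn_addr, uint32_t imm)
{
    assert((insn_addr & 3u) == 0);  /* ARM instructions are word aligned */
    return insn_addr + 8u + imm;
}

/* "mov rB, #base" followed by "ldr rX, [rB, #offset]" reads from
 * base + offset. */
static uint32_t base_plus_offset(uint32_t base, uint32_t offset)
{
    return base + offset;
}
```

By the same arithmetic, the two ldr r0 instructions in state_init read their state object pointers from 0x90000 + 0x5b0 = 0x905b0 and 0x40000 + 0x4e4 = 0x404e4.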


Anyway. The error from state objects appears to be a check that's working very well :D (which means, the state object definitions on working ports can be trusted to be correct).

The error from read_entire_file is unrelated. Exercise: find out where it comes from (using the same technique).

DeafEyeJedi

This is all very exciting progress so far and keep them rolling along!
5D3.113 | 5D3.123 | EOSM.203 | 7D.203 | 70D.112 | 100D.101 | EOSM2.* | 50D.109

JohanJ

Quote from: DeafEyeJedi on July 29, 2017, 11:40:03 PM
This is all very exciting progress so far and keep them rolling along!
+1

60D.111 / 100D.101 / M2.103

dfort

And the crowd goes wild!

@DeafEyeJedi - Thanks again for letting me borrow your 100D, I'm getting a lot of mileage out of the firmware dump.

Quote from: a1ex on July 29, 2017, 09:30:22 AM
The error from read_entire_file is unrelated. Exercise: find out where it comes from (using the same technique).

That was a bit of a wild goose chase. I'll spare you all the nitty-gritty details but basically:

#0  read_entire_file (filename=filename@entry=0x49f806 "ML/SETTINGS/CURRENT.SET", buf_size=buf_size@entry=0x1f38f4) at ../../src/fio-ml.c:707
#1  0x0045e704 in config_choose_startup_preset () at ../../src/config.c:771
#2  config_load () at ../../src/config.c:903
#3  0x0044c910 in my_big_init_task () at ../../src/boot-hack.c:458
#4  0x0000ca18 in ?? ()
...
#0  read_entire_file (filename=filename@entry=0x1f38f8 "ML/SETTINGS/magic.cfg", buf_size=buf_size@entry=0x4ba78c <config_file_size>) at ../../src/fio-ml.c:707
#1  0x0045e2cc in config_parse_file (filename=0x1f38f8 "ML/SETTINGS/magic.cfg", filename@entry=0x1f38f0 "\023") at ../../src/config.c:316
#2  0x0045ea1c in config_load () at ../../src/config.c:913
#3  0x0044c910 in my_big_init_task () at ../../src/boot-hack.c:458
#4  0x0000ca18 in ?? ()
...
#0  read_entire_file (filename=0x18e2a0 "ML/SETTINGS/MENU.CFG", filename@entry=0x18e298 "\030\002\230\031", buf_size=0x18e29c, buf_size@entry=0x18e294) at ../../src/fio-ml.c:707
#1  0x00454a50 in menu_load_flags (filename=0x18e298 "\030\002\230\031") at ../../src/menu.c:5600
#2  config_menu_load_flags () at ../../src/menu.c:5633
#3  menu_task (unused=<optimized out>) at ../../src/menu.c:4982
#4  0x0000ca18 in ?? ()


So the failed read_entire_file message is because it is loading the configuration files but since this is a first run there are no saved settings yet. What would a camera that is known to work do in this case? How about the 700D?

[     ml_init:0044db24 ] (32:03) read_entire_file:736: failed
[     ml_init:0045e344 ] (32:03) config_parse: Read 0 config values


So like the state objects message, read_entire_file seems to be working fine. However, the 700D running in QEMU displays an endless loop, apparently waiting for user input:

[****] task_hook(141b34) 13fb60() -> 13fb60(), from 141b34
[****] task_hook(13fb60) 0(????????) -> 141b34(T ??), from 13fb60
[****] task_hook(141b34) 13fb60() -> 13fb60(), from 141b34
[****] task_hook(141990) 0(????????) -> 141b34(T ??), from 141990
[****] task_hook(13fb60) 0(????????) -> 141990(), from 13fb60


While the EOSM2 stops displaying messages. So the problem seems to be getting to the next task.

I tried that eu-addr2line trick but ran into a problem: it isn't available on the Mac and I was having trouble compiling elfutils from source. However, I did have gaddr2line from Homebrew and it supports ARM code. It is somewhat different from eu-addr2line but seems to work:

gaddr2line -sf --pretty-print -b elf32-littlearm -e magiclantern 0x44c8dc 0x477e28 0x44ca64
call_init_funcs at boot-hack.c:307
stateobj_start_spy at state-object.c:251
ml_assert_handler at boot-hack.c:539

gaddr2line -sf --pretty-print -b elf32-littlearm -e magiclantern 0x0044CA04 0x0044C468
ml_assert_handler at boot-hack.c:530
backtrace_getstr at backtrace.c:877


One question -- if the state objects message is not a problem, why is it printing an error message and saving a crash log?


a1ex

Quote from: dfort on July 30, 2017, 07:53:44 PM
One question -- if the state objects message is not a problem, why is it printing an error message and saving a crash log?

It shows where the problem is. This is what I meant by "a check that's working very well" - if there's something wrong with the state object definitions, it shows the error, rather than allowing it to go undetected.
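
A simplified sketch of that kind of check, going only by the assert message. The struct below is hypothetical and reduced to a single field; the real definition in ML has more members:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified layout: the only assumption is that a valid
 * object starts with a pointer to the string "StateObject", as the
 * assert message suggests. */
struct state_object_sketch {
    const char *type;
};

/* Returns 1 if obj looks like a state object, 0 otherwise.
 * Note: on real hardware a wrong address may not even be safe to
 * dereference; this sketch only guards against NULL. */
static int looks_like_state_object(const struct state_object_sketch *obj)
{
    if (obj == NULL || obj->type == NULL)
        return 0;
    return strcmp(obj->type, "StateObject") == 0;
}
```

In the gdb backtrace above, stateobj was 0xe51f3e14, which looks more like an ARM instruction word than a pointer into RAM, so it is plausible that the address being read for that state object is simply the wrong one for this firmware.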

dfort

Got it, the error check is working properly, which means I've still got some errors--probably in stubs.S or consts.h? I keep going over them and keep finding issues, partly because the first time through I picked whichever camera seemed closest to the section of code I was pattern matching, and some of the stubs I copied are off on those other platforms.

Here's one that should be checked out.

700D.114/stubs.S
-NSTUB(0xFF702A3C,  PlayMain_handler)
+NSTUB(0xFF3B9E94,  PlayMain_handler)


I'm pretty sure that this is the correct "fix" but how to confirm it?

Haven't found the silver bullet for whatever is haunting the EOSM2.

a1ex

There are a few more camera-specific files under the platform directory.

Regarding 700D PlayMain_handler:

cd platform/700D.114
hg blame -c stubs.S | grep PlayMain_handler
3c718689749d: NSTUB(0xFF702A3C,  PlayMain_handler)

hg log -r 3c718689749d
changeset:   14427:3c718689749d
branch:      700D
parent:      14406:96ca71e19bf1
user:        alex@thinkpad
date:        Sat Aug 13 09:23:49 2016
summary:     700D: fix PlayMain_handler stub (fixes SET+MainDial and others)

hg export 3c718689749d
...
-NSTUB(0xFF3B9E94,  PlayMain_handler)
+NSTUB(0xFF702A3C,  PlayMain_handler)


;)


dfort

Oh yeah and I commented on that 700D fix. Well at least I found the address where it should have worked if the 700D was just a little bit more like the EOSM.

Quote from: a1ex on July 31, 2017, 11:47:26 AM
There are a few more camera-specific files under the platform directory.

Yes, but some of that stuff looks really scary and maybe needs the dm-spy-experiments branch running on the camera to ferret them out?

Then there's that directory with yet another platform subdirectory with a couple of files that can't possibly be the one that will solve the state objects issue.  ???

magic-lantern/platform/EOSM.103/include/platform/state-object.h
#ifndef __platform_state_object_h
#define __platform_state_object_h

#define DISPLAY_STATE DISPLAY_STATEOBJ
#define INPUT_SET_IMAGE_VRAM_PARAMETER_MUTE_FLIP_CBR 26 // need to verify
#define INPUT_ENABLE_IMAGE_PHYSICAL_SCREEN_PARAMETER 27 // need to verify
#define EVF_STATE    (*(struct state_object **)0x91CF0)
#define MOVREC_STATE (*(struct state_object **)0x93AF8)
#define SSS_STATE    (*(struct state_object **)0x9169C)

#endif // __platform_state_object_h


Ok--that fixed it. So what's next? It seems to be running but I can't get to the Magic Lantern menu to check it out. Guess I should try to figure out how you did it.

a1ex

That's probably because the M2 does not show the "idle" Canon screen (the one with shooting settings); as soon as you close the date/time dialog, it will go to LiveView (which doesn't work in QEMU).

You should be able to work around it by allowing the menu to come up in any GUI state, not just when "idle".

dfort

Quote from: a1ex on August 01, 2017, 10:19:43 AM
You should be able to work around it by allowing the menu to come up in any GUI state, not just when "idle".

I tried commenting out "idle" statements like this one:

menu.c
        if (gui_state == GUISTATE_IDLE || (gui_menu_shown() && !beta_should_warn()))
        {
            give_semaphore( gui_sem );
            return 0;
        }


but still no ML menus. I did discover that I missed a setting in menu.c for long press down key (a.k.a. trash button) but that didn't do it either.

What I'm seeing is this screen when it boots up:



I could change the date but it doesn't save it, so I gave up doing that and keep reliving the same day over and over again until I get it right--like in the movie Groundhog Day. As soon as I close the date/time dialog (by pressing the "m" key) it doesn't go to LiveView; it goes to this screen:



Once again I can navigate all over the Canon menus and change settings but they aren't saved. Pressing "m" once more gets into what I take it is LiveView--just a black screen. No need to show a screenshot of that one! The frustrating part is once in LiveView, always in LiveView--there doesn't seem to be any way to get back to the Canon menus while on other cameras pressing the "m" key will get you back.

ML does seem to be running in the background because if I do a second QEMU session without first removing the "LOADING.LCK" file the "Skipping module loading." message shows up even when in LiveView.



One more screenshot, here is what happens when using "CONFIG_QEMU=y" in Makefile.user and not building in the qemu branch.



Obviously not useful for the EOSM2 but it is interesting seeing some of the other camera button options.

Ok--I've got to ask. At what point can we run at least the minimal "Hello, World!" on the camera? I've been told that in order to set the boot flag on the camera I need to ask for a "ML-SETUP.FIR" from a developer because that is one thing that a mere minion like me is not allowed to have unless he has proven himself worthy. Am I there yet?

BTW--if I brick the camera it would probably disappoint the two EOSM2 owners who are following this topic more than it does me. At least I could move on to other projects!

DeafEyeJedi

Quote
BTW--if I brick the camera it would probably disappoint the two EOSM2 owners who are following this topic more than it does me. At least I could move on to other projects!

Well if that's the case then you are more than welcome to borrow my soon to be arriving M2 body after ordering a decent used one through Amazon recently.  :P
5D3.113 | 5D3.123 | EOSM.203 | 7D.203 | 70D.112 | 100D.101 | EOSM2.* | 50D.109

dfort

So there's no way to get out of this, come hell or high water!

glassescreditsroll

Quote from: dfort on August 01, 2017, 07:55:50 PM


BTW--if I brick the camera it would probably disappoint the two EOSM2 owners who are following this topic more than it does me. At least I could move on to other projects!

I check in every other day. I don't know what's going on, but I'm eggerly waiting for this to finally work on the EOS M2 lol

dfort

Quote from: glassescreditsroll on August 02, 2017, 09:49:19 PM
I'm eggerly waiting

This egg looks like it is about to hatch but not quite yet--that is if it doesn't crack first!

What is going on is that as someone who is a non-developer with very limited coding abilities but with some success tinkering and contributing to the Magic Lantern project, I decided to take on the challenge of porting a camera that should be relatively easy. I keep running into problems and a1ex keeps pointing me in the right direction--though sometimes figuring out his hints is a challenge in itself.

I'm eager to see something working on the camera but a1ex knows better than I that this is a process that shouldn't be rushed and it is better to make sure it runs well on QEMU before trying it on the camera. Basically the only ML program I ran on the EOSM2 so far has been the firmware dumper, everything else has been done on a "virtual" EOSM2 via QEMU running on a MacBook Pro.

I picked the EOSM2 because it is cheap and should be one of the easier cameras to port. This camera also had a limited distribution so there aren't a bunch of users putting pressure on me to hurry up and get it working, though I think we're all eager to see at least "Hello, World!" running on the "real" camera.

The latest problem I solved was that the selftest module wasn't compiling because the AbortEDmac stub was missing. In the process of trying to figure out what was going on I also found several places where I needed to give the EOSM2 the same treatment as the EOSM because these cameras are very closely related.

I'm still trying to figure out how to get the ML menu to show up in QEMU.

Quote from: a1ex on August 01, 2017, 10:19:43 AM
That's probably because the M2 does not show the "idle" Canon screen (the one with shooting settings); as soon as you close the date/time dialog, it will go to LiveView (which doesn't work in QEMU).

You should be able to work around it by allowing the menu to come up in any GUI state, not just when "idle".

Can I phone a friend or ask the audience for the answer?

dfort

Tried my luck bringing up the ML menus in QEMU on some other cameras with pretty much the same results as the EOSM2--nada, nuthin, zip. Maybe it is a Mac keyboard thing with the delete key not mapping to the camera trash button? It shouldn't matter on the EOSM2 because the trash button is also the down arrow key.

Also found some errors that look like they might be pointing to a problem:

    13:    34.304 [SEQ ERROR] NotifyComplete (Cur = 1, 0x2, Flag = 0x20000000)
    14:    38.400 [PROPAD] ERROR GetPropertyData ID (0) = 0x00030048
    15:    38.400 ERROR [RTC] PROPAD_GetPropertyData : PROP_RTC
    16:    42.752 [RTC] ChangePropertyCBR 0x0, 0x0
    17:    43.520 [RTC] RTC_Permit 0x0
...
    83:   253.952 [PROPAD] ERROR GetPropertyData ID (1) = 0x01050038
    84:   254.464 [PROPAD] ERROR GetPropertyData ID (1) = 0x01050041
    85:   254.720 [PROPAD] ERROR GetPropertyData ID (1) = 0x01050042
    86:   254.720 [PROPAD] ERROR GetPropertyData ID (1) = 0x01050043
    87:   254.976 [PROPAD] ERROR GetPropertyData ID (1) = 0x01050044
    88:   255.232 [PROPAD] ERROR GetPropertyData ID (1) = 0x0105004F
    89:   255.232 [PROPAD] ERROR GetPropertyData ID (1) = 0x01050050
    90:   255.488 [PROPAD] ERROR GetPropertyData ID (1) = 0x01050051
    91:   190.208 [PROPAD] ERROR GetPropertyData ID (1) = 0x01050052
    92:   190.720 [PROPAD] ERROR GetPropertyData ID (1) = 0x0105010E
    93:   190.976 [PROPAD] ERROR GetPropertyData ID (1) = 0x0105010F
    94:   190.976 [PROPAD] ERROR GetPropertyData ID (1) = 0x01050110
    95:   190.976 [PROPAD] ERROR GetPropertyData ID (1) = 0x01050111
    96:   191.232 [PROPAD] ERROR GetPropertyData ID (1) = 0x0104000B


The PROPAD_GetPropertyData stub was removed a while back in part to resolve one of my bug reports. Had no idea what was going on back then and still don't understand it. Seems to be a problem with the Properties stubs but after quadruple checking them they look fine.

a1ex

Those errors are likely just imperfect emulation (we are using properties from 100D, not from a real camera).

However, we should be close to getting these things from real hardware. If the dm-spy-experiments branch saves a valid log in QEMU with CONFIG_DEBUG_INTERCEPT_STARTUP=y and CONFIG_QEMU=n, that means we are already there and I'll enable the boot flag.

We'll also need the sf_dump module - that should re-create the SFDATA.BIN file, although I've never tested it that way (todo: include this in the test suite).

dfort

Getting closer--but not quite yet.

End of QEMU run with dm-spy-experiments branch merged with EOSM2.103_wip:
[  debug_task:ff0d8b9c ] (85:03) GUI_Control:-7 0x0
[ GuiMainTask:ff0d8f50 ] (84:01) GUI_CONTROL:-7
[ GuiMainTask:ff1c1638 ] (84:06) ***** GUI_Control_Post(-7)
[ GuiMainTask:ff1c187c ] (84:01) gui control end
[ GuiMainTask:ff1c189c ] (84:01) 0msec = 5550 - 5550
[ GuiMainTask:ff1c18b8 ] (84:01) 256msec = 563456 - 563712
   187:  5609.984 [GUI_M] ERROR ***** GUI_Control_Post(-7)


And we know how to work with this Assert LOG:
ML ASSERT:
mem_sem
at ../../src/mem.c:854 (__mem_malloc), task ml_init
lv:0 mode:3

ml_init stack: 1f38d0 [1f39c8-1ef9c8]
0xUNKNOWN  @ 44c7f0:1f39b0
0x00453F44 @ 49468c:1f39a8
0x004501E0 @ 45423c:1f3968
0x0044F124 @ 450260:1f3950
0x0044C914 @ 44f15c:1f3900
0x0044C468 @ 44c974:1f38d0

Magic Lantern version : Nightly.2017Aug05.EOSM2103
Mercurial changeset   : d538e46a12aa+2214781fcb95+ (dm-spy-experiments-EOSM2.103) tip
Built on 2017-08-05 22:41:51 UTC by rosiefort@RosieFoComputer.
Free Memory  : 344K + 1158K


Before getting back to debugging, looks like I found a couple of missing stubs for the EOSM:

NSTUB(0xff0d7830,  GUI_ChangeMode)
...
NSTUB(0xff1c4670,  gui_massive_event_loop)


or were these removed on purpose?

dfort

How does a "bad merge" happen? Just wondering because I might have been dealing with one. I'm sure I resolved the conflicts properly.

Those problems in my last post also showed up on the 700D:

ML ASSERT:
mem_sem
at ../../src/mem.c:854 (__mem_malloc), task ml_init
...
Magic Lantern version : Nightly.2017Aug06.700D114
Mercurial changeset   : 2c1d19295d9f (dm-spy-experiments-EOSM2.103)


which I have running with CONFIG_ALLOCATE_MEMORY_POOL and also on the 550D which I didn't touch:

ML ASSERT:
mem_sem
at ../../src/mem.c:854 (__mem_malloc), task ml_init
...
Magic Lantern version : Nightly.2017Aug06.550D109
Mercurial changeset   : 2c1d19295d9f (dm-spy-experiments-EOSM2.103)


What I did was merge my EOSM2.103_wip branch with dm-spy-experiments. Obviously something went wrong. On the regular dm-spy-experiments branch these cameras run fine in QEMU. In fact I was able to get into and navigate around the ML menus.

I've been doing lots of changes and merging so maybe I should start from a clean working dm-spy-experiments branch and start over again.