Intercepting DebugMsg with cache hacks

Started by a1ex, August 28, 2012, 10:56:48 AM

Previous topic - Next topic

0 Members and 1 Guest are viewing this topic.

a1ex

Problem: Canon logs get trimmed (because they are too big), or they don't log all messages.

Is there a way to log all DebugMsg calls?

YES :)

https://bitbucket.org/hudson/magic-lantern/src/tip/src/dm-spy.c

Credits goto g3gg0 for the ultimate patching tool :)


        cache_fake((uint32_t)&DryosDebugMsg, B_INSTR((uint32_t)&DryosDebugMsg, my_DebugMsg), TYPE_ICACHE);

nanomad

Remember to add dm-spy.o to your Makefile too ;)
EOS 1100D | EOS 650 (No, I didn't forget the D) | Ye Olde Canon EF Lenses ('87): 50 f/1.8 - 28 f/2.8 - 70-210 f/4 | EF-S 18-55 f/3.5-5.6 | Metz 36 AF-5

Indy


1%

Yes, indeed!

Hopefully I can use this to see all JPCORE messages

Nope, still can't see stuff like this:

[JPCORE] SetEncodeH264Parameter I %d,%d,%d'

nanomad

EOS 1100D | EOS 650 (No, I didn't forget the D) | Ye Olde Canon EF Lenses ('87): 50 f/1.8 - 28 f/2.8 - 70-210 f/4 | EF-S 18-55 f/3.5-5.6 | Metz 36 AF-5

a1ex

1%: you need to enable ENGIO messages.


[JPCORE] SetEncodeH264Parameter I 1920,1088,1
[JPCORE] SetEncodeH264Parameter P 1920,1088,2


This should be what happens when encoding one frame (5D2): http://pastebin.com/J21HuvLX

1%

When I was setting the store level to 15, 2 (what I got from Tramell's post) I was getting engio writes but barely anything from jpcore in regular dumpf. I'll try again but what you posted is pretty good to look through. Set store level before or after starting the hijacked logger?


LV_GetPtpZoomWindowArea IX=2327(2496), IY=1582(1630) - Full frame size?
RequestH264Encode is set to 1920x1080 from only 2 calls + the 2 parameters I/P. But I'm getting off topic...

nanomad

Well 0xFF2493B0 Looks like an encoder initialization routine to me
EOS 1100D | EOS 650 (No, I didn't forget the D) | Ye Olde Canon EF Lenses ('87): 50 f/1.8 - 28 f/2.8 - 70-210 f/4 | EF-S 18-55 f/3.5-5.6 | Metz 36 AF-5

1%

Yep, initializes all the modes... Enabling engio logging worked to see some messages. 

jplxpto

Quote from: a1ex on August 28, 2012, 10:56:48 AM
Problem: Canon logs get trimmed (because they are too big), or they don't log all messages.

Is there a way to log all DebugMsg calls?

YES :)

https://bitbucket.org/hudson/magic-lantern/src/tip/src/dm-spy.c

Credits goto g3gg0 for the ultimate patching tool :)


        cache_fake((uint32_t)&DryosDebugMsg, B_INSTR((uint32_t)&DryosDebugMsg, my_DebugMsg), TYPE_ICACHE);


--------------------------------

Can I use the methods in the file cache_hacks.h, I want to replace the call of some functions of the 40D.
I found a function in 40D program that those records p15. I believe that they start to cache I and D.
Can you explain me better how this mechanism works?

I want to write all messages directly to a file without having to invoke dumpf.
For that I need to replace the function that writes messages to the log 0xFFD4C1EC

Thank you

a1ex

I wanted to take a closer look at the boot process, and I've also seen Maqs on IRC doing some low-level research, so today I sat down and pimped this function a little.

Code: https://bitbucket.org/hudson/magic-lantern/commits/branch/dm-spy-experiments
First proof of concept: https://bitbucket.org/hudson/magic-lantern/commits/9e7945c

There are two interesting points here: the plain DebugMsg logging, and a new generic code logger.

Plain DebugMsg log (dm-spy.c):
- updated for the latest codebase (you can enable it from Makefile.user with CONFIG_DEBUG_INTERCEPT, it uses the new patch manager, stuff like that)
- prints task name for each debug message
- can intercept the debug messages very early in the boot process (CONFIG_DEBUG_INTERCEPT_STARTUP)

Startup log example from 5D2: http://pastebin.com/zYc1jwaq

New generic logger (dm-spy-extra.c, built on top of g3gg0's GDB hooks mixed with dm-spy):

- With a simple description like this, you can trace virtually any Canon function calls, combined with Canon's own debug messages:

    { 0xff9b9198, "StateTransition", 4 , state_transition_log },
    { 0xff9b989c, "TryPostEvent", 4 },


- by default, this uses a generic logger that should cover many usual cases, but you can always use a custom logging function.

- the code does a bit of reverse engineering on the fly (see also asm.c): for functions, it guesses the type for its arguments (string, function name from referenced string, or plain hex) and it also prints the caller address.

          PropMgr:33:01: SendPipeEvent [2][0][5]
          PropMgr:00:00: *** TryPostStageEvent(&"StageClass", &"PtpDps", 0x5, 0x92a96c), from ff83a9f8
          PropMgr:00:00: *** TryPostEvent(&"TaskClass", &"PropMgr", 0x5, 0x735b64), from ff85f220
     ShootCapture:00:00: *** RegisterEDmacCompleteCBR(0x2, 0xffa37488 "CCDWriteEDmacCompleteCBR", 0x0), from ffa37644


- this replaces the old tp-spy and state object logger, with fairly generic code:

         EventMgr:00:00: *** TryPostEvent(&"TaskClass", &"LiveViewMgr", 0xe, 0x4), from ff835648
     ShootCapture:00:00: *** SCSState: (8) --10--> (8)                              x=640190 z=0 t=ffcd0741
     ShootCapture:93:03: scsReleaseEnd
      LiveViewMgr:00:00: *** LVState: (0) --14--> (0)                              x=72c44c z=4 t=72ee54
           RscMgr:00:00: *** SRMState: (0) --13--> (0)                              x=63cb0c z=515f1a t=ffcd0741
           RscMgr:00:00: *** SRMState: (0) --14--> (0)                              x=63cb0c z=0 t=ffcd0741


Picture taking log from 5D2: http://pastebin.com/8TPBUAbT

Known issues:
- startup log cannot use the extra loggers (probably GDB hooks get destroyed somehow, didn't look too much into it)
- be careful on low-memory cameras: the startup log allocates 1MB in the main binary. I've only tested it only on 5D2; for other cameras you may need to either limit the log size, remove some ML features, or find some place to allocate it without malloc calls - since the malloc library is not yet initialized, neither in Canon code, nor in ML.

Happy hacking!

nikfreak

I get some error if i compile with "CONFIG_DEBUG_INTERCET_STARTUP = y". GDB = y is ofc set.
The error doesn't happen with "CONFIG_DEBUG_INTERCEPT = y".


make[2]: Leaving directory `/home/magiclantern/nik6d/magic-lantern/tcc'
[ CC       ]   module.o
[ CC       ]   afma.o
[ CC       ]   gps.o
[ AR       ]   strrchr.o
[ AR       ]   dietlibc.a
[ AR       ]   lib_a-setjmp.o
[ AR       ]   newlib-libc.a
[ CP       ]   newlib-libm.a
[ CP       ]   gcc-libgcc.a
[ LD       ]   magiclantern
dm-spy-extra.o: In function `generic_log':
dm-spy-extra.c:(.text+0xc4): undefined reference to `looks_like_string'
dm-spy-extra.c:(.text+0xe0): undefined reference to `is_sane_ptr'
dm-spy-extra.c:(.text+0xf0): undefined reference to `looks_like_string'
dm-spy-extra.c:(.text+0x11c): undefined reference to `asm_guess_func_name_from_string'
make[1]: *** [magiclantern] Error 1
make[1]: Leaving directory `/home/magiclantern/nik6d/magic-lantern/platform/6D.113'
make: *** [6D] Error 2




Anyone else encountering this compile error with "CONFIG_DEBUG_INTERCET_STARTUP = y" ???
[size=8pt]70D.112 & 100D.101[/size]

a1ex


nikfreak

[size=8pt]70D.112 & 100D.101[/size]

dfort

A group of us are trying to resolve the EOS-M shutter-bug issue and someone asked for a dm_spy_experiments build to look into what is going on in the boot process. Since I am able to compile ML on my MacBook Pro I tackled this project. Here are some issues and workarounds that I made to get it to compile--hope I didn't mess anything up!

Starting with a fresh clone of dm_spy_experiments:

First of all I copied Makefile.user.default to Makefile.user and turned on CONFIG_DEBUG_INTERCEPT and CONFIG_DEBUG_INTERCEPT_STARTUP. I left CONFIG_DEBUG_INTERCEPT_STARTUP_BLINK off because I think it might not apply to the EOS-M.
# Intercept all DebugMsg's
CONFIG_DEBUG_INTERCEPT = y
CONFIG_DEBUG_INTERCEPT_STARTUP = y
CONFIG_DEBUG_INTERCEPT_STARTUP_BLINK = n


My first attempts to compile I got errors relating to the menu and documentation. My workaround was to copy src/menuindexentries.h from the unified branch--it is missing in dm_spy_experiments.

Since I'm on a Mac using the macports version of truncate I got the "truncate: illegal option -- s" so a quick search revealed the following changes in src/Makefile.src:
- @truncate -s 284 $@
+ @truncate $@ 284
and
- truncate -s 0 $@ ; \
+ truncate $@ 0 ; \


So I got it to compile and it seems to be working. The DM.LOG looks very interesting. Hopefully this will help figure out what's going on with the EOS-M shutter-bug.

A big thanks to a1ex!

DeafEyeJedi

@dfort:

That's just awesome news for M users. Great read and thanks for pointing us to this thread.

Thumbs up to @a1ex as always!
5D3.113 | 5D3.123 | EOSM.203 | 7D.203 | 70D.112 | 100D.101 | EOSM2.* | 50D.109

dfort

I just wanted to mention that nikfreak made an updated branch of dm-spy-experiments that is working great. A few of us are using this on the EOS-M to run tests on the shutter-bug--lots of interesting logs.

https://bitbucket.org/nikfreak/magic-lantern/branch/dmspy-70D-new

nikfreak

It's just the latest unified branch merged into the dmspy-experiments-branch with the additions of 70D related committs. Goal:

Get 70D merged into unified and hope a1ex finds some free time and update some relevant branches which will make life easier and enable us to add committs there: dmspy-experiments, qemu
[size=8pt]70D.112 & 100D.101[/size]

escho

QuoteKnown issues:
- startup log cannot use the extra loggers (probably GDB hooks get destroyed somehow, didn't look too much into it)
- be careful on low-memory cameras: the startup log allocates 1MB in the main binary. I've only tested it only on 5D2; for other cameras you may need to either limit the log size, remove some ML features, or find some place to allocate it without malloc calls - since the malloc library is not yet initialized, neither in Canon code, nor in ML.

Some feedback using 600D:

The cam does pretty things:
M-Mode: Freezeing, if I switch the cam on
Video-Mode: I can start the cam. But I,m told, there is no card inside
Sometimes bootlog is created, sometimes not.

After removing a bunch of features and compiling new, no problems more . I receive a nice bootlog

Edgar
https://sternenkarten.com/
600D, 6D, openSUSE Tumbleweed

rbrune

I'm trying to get the current dm-spy-experiments running on EOSM, is there an easy way to find the current_task address in RAM?
It is missing from the stubs and I'm not sure what to look for. Can I retrieve/reduce it by using get_current_task?

a1ex

Yes, it's very easy to find from get_current_task. It reads current_task->taskId, and and in tasks.h, taskId is at offset 0x40 in struct task.

5D3:

FF9E1474             get_current_task
ROM:FF9E1480 1C 00 9F 05                 LDREQ   R0, =0x23E14
ROM:FF9E1484 00 00 90 05                 LDREQ   R0, [R0]
ROM:FF9E1488 40 00 90 05                 LDREQ   R0, [R0,#0x40]


For CURRENT_INTERRUPT_ADDR (optional, but very useful for low-level research), you can find instructions in qemu/scripts/debug-logging.gdb (qemu branch).

rbrune

Quote from: a1ex on November 15, 2016, 11:05:57 PM
Yes, it's very easy to find from get_current_task. It reads current_task->taskId, and and in tasks.h, taskId is at offset 0x40 in struct task.

For CURRENT_INTERRUPT_ADDR (optional, but very useful for low-level research), you can find instructions in qemu/scripts/debug-logging.gdb (qemu branch).

Thanks!
Much easier than expected. I made a pull request for the current_task stub.
The log is way more detailed than I expected.

I will take a look at CURRENT_INTERRUPT_ADDR tonight.

josepvm

So, for 50D:


loc_ff8166b4: ; 14 refs     <  -------------------- (get_current_task)

ff8166b4: e59f0020 ldr r0, [pc, #32] ; ff8166dc: (00001a74)
ff8166b8: e5900000 ldr r0, [r0]
ff8166bc: e3500000 cmp r0, #0
ff8166c0: 059f001c ldreq r0, [pc, #28] ; ff8166e4: (00001a70)        <--------
ff8166c4: 05900000 ldreq r0, [r0]
ff8166c8: 05900040 ldreq r0, [r0, #64] ; 0x40


Can I assume that current_task = 0x1A70 ?

a1ex

That's right.

You can check it either by:
- looking in the log files from dm-spy-experiments (whether they contain valid task names)
- running the firmware under QEMU and GDB like this (CURRENT_TASK in debugmsg.gdb)
- running the firmware under QEMU with "-d io" and checking whether the I/O register accesses have valid task names (define current_task_addr in qemu/eos/model_list.c)

a1ex

Some additions and tips:

You can now log the return value of any function (a temporary logging point is added on the LR address on the fly, similar to gdb tbreak). Example:


     { 0xFF304A74, "IsAddressEEP", 1 | RET },
     { 0xFF309F34, "IsAddressSerialFlash", 1 | RET },
     { 0xFF128228, "Compare_FROMAddress", 1 | RET },



75500>  Terminate:000aee60:00:00: *** Compare_FROMAddress(0x2000000), from ff0ce434
75600>  Terminate:000aee60:00:00: *** IsAddressEEP(0xf8f51000), from ff128260
75600>  Terminate:000aec18:00:00: ==> 0x0, from ff128260
75600>  Terminate:000aee60:00:00: *** IsAddressSerialFlash(0xf8f51000), from ff128274
76100>  Terminate:000aec18:00:00: ==> 0x0, from ff128274
76200>  Terminate:ff1282a4:02:03: Compare FROMAddress (0) 0x406cf800 0xfff40000 Size 4088
76300>  Terminate:000aec18:00:00: ==> 0x0, from ff0ce434


(notice the nested calls)



- RAM pointers that appear to be data structures are logged as well (no need to do anything special):

     { 0xFF127844, "prop_erase", 1 },
     { 0xFF1272FC, "propad_erase", 3 },


76300>  Terminate:000aee60:00:00: *** prop_erase(0x2000000), from ff0ce444
76400>  Terminate:000aee60:00:00: *** propad_erase(0x6cf708, 0x1, 0x147c70), from ff127870
76400>  Terminate:000aef5c:00:00:     *0x6cf708 = { f8f40000 fff40000 20000 2  f8f51000 f8f52000 406cf800 406cf790  1000 1 64f55c 6d0904 ... }
76400>  Terminate:000aef5c:00:00:     *0x147c70 = { 6cf708 0 19980218 19980218  19980218 23f2c 19980218 ff0ce448  0 becd3c 19980218 ff0f8ee0 ... }




- Functions already defined in stubs can be taken from there (no need to type their address again):

    STUB_ENTRY(task_create, 5),
    STUB_ENTRY(msleep, 1),
    STUB_ENTRY(_malloc, 1 | RET),
    STUB_ENTRY(_free, 1),




- The logging points do not have to be functions - they can be anywhere in the middle of the code (in theory - bug reports welcome) and you can log arbitrary register values (without a custom logging function):


    { 0xFF1C8C00, "resinfo_wait_smth", R(0) },
    { 0xFF1C8AC0, "resinfo_check", R(7) },




- You can log interrupts as well. Define LOG_INTERRUPTS in dm-spy-extra.c:

ShootCaptu:000afe68:00:00: *** register_interrupt("Jp57IntrHandler", 0xa3, 0xff28dad0, 0x0), from ff28f71c
ShootPreDe:ff14f698:95:03: spsInit
**INT-A3h*:000afc54:00:00: >>> INT-A3h Jp57IntrHandler ff28dad0(0)
**INT-A3h*:00000588:00:00: <<< INT-A3h done






- You can run the startup logging under QEMU, in real-time, with no cache conflicts:


CONFIG_DEBUG_INTERCEPT_STARTUP = y
CONFIG_QEMU = y


It's best to install QEMU from the qemu branch for now - which should not cause problems with branching, as QEMU runs outside the ML tree.

There is similar functionality in the GDB scripts from the QEMU branch - they output somewhat similar logs, without modifying the guest code (so they can be used very early in the boot process). The method from here requires running autoexec.bin alongside the main firmware.

Q: Then why not just use GDB? (when running in QEMU)

A: it's very very slow. I bet you don't like to wait for half an hour for the firmware to boot the GUI under GDB, when it can do the same thing with DebugMsg hooks in about 1 minute.

Q: Then why do we use GDB?

A: It works, it's well-tested and does not require modifying the guest code. The method from here must have ML ported on your camera before you can use it (not to mention bugs in this code possibly affecting the results), while with GDB, just add your logging points and you are good to go.




- Anything helpful for understanding ENGIO, ADTG, EDMAC and similar monsters?

These things are a bit verbose, so a bunch of messages are disabled by default. Comment out this in dm-spy.c:

    if (class == 21) // engio, lots of messages
        return;


There is also a script in the iso-research branch, that can annotate logs with info about ADTG/CMOS/ENGIO registers:

https://bitbucket.org/hudson/magic-lantern/src/iso-research/modules/adtg_gui/annotate_log.py

Some registers are set directly (without the EngDrvOut/engio_write wrappers) - simply by writing to memory. Currently, the only way to catch these is to run the analyzed routine in QEMU (assuming it can emulate at least some of it) and log the MMIO activity:


./run_canon_fw.sh 5D3,firmware="boot=1" -d io





- What about timestamps?

We have those, too. The number on the left of > is the hardware counter aka digic timer (microseconds, wraps at 0x100000):

04A55>       init:000af020:00:00: *** msleep(0x64), from 69674
...
1BACC>       init:000af020:00:00: *** msleep(0x64), from 69674


All that's left is to unwrap the counter in post.




- Common errors and how to solve them?


Will NOT load on:
    100D (debug_intercept)


Recompile ML core, double-checking Makefile.user (CONFIG_DEBUG_INTERCEPT = y) and running "make clean" before.


Birthday paradox: ff4fa224 RegisterHead3InterruptHandler
  conflicts with: ff216a20 RegisterHead1InterruptHandler


That's a limitation of cache hacks - you can't patch two ROM addresses whose index bits (mask 0x7E0) are the same, but the more significant bits are different.

What you can do?
- Find a different logging address (sometimes, moving a few bytes around will change the index bits, solving the conflict)
- Log the two conflicting functions in two sessions
- Try under QEMU (which does not emulate cache)
- Update cache_fake to lock down two caching lines for ROM patches.




The examples were from 5D3 1.2.3 (most) and 60D 1.1.1 (resinfo).

Happy reversing.