How to run Magic Lantern into QEMU?!...

Started by jplxpto, September 23, 2012, 08:29:02 PM


a1ex

Quote from: nkls on May 30, 2016, 10:26:16 PM
Maybe count %-signs and branch to different cases?

Yes, that's exactly what I ended up with. I made a generic logging library for various functions around the firmware (tasks, semaphores, timers, interrupts, MPU communication), all in pure gdb:

https://bitbucket.org/hudson/magic-lantern/src/qemu-nkls/contrib/qemu/scripts/debug-logging.gdb

This file can be included in the model-specific GDB script, like this:
https://bitbucket.org/hudson/magic-lantern/src/qemu-nkls/contrib/qemu/scripts/5D3/debugmsg.gdb
https://bitbucket.org/hudson/magic-lantern/src/qemu-nkls/contrib/qemu/scripts/70D/debugmsg.gdb

and then get a log from both QEMU and GDB on the same terminal with a command like this:

./run_canon_fw.sh 5D3 -s -S & arm-none-eabi-gdb -x 5D3/debugmsg.gdb


The functionality is similar to the one in the dm-spy-experiments branch, but - as you pointed out - it works without changing the guest code or having to load autoexec.bin (for example, I could use it on 7D2).
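For readers who don't want to open the GDB script, the "count %-signs and branch" idea from the quote above can be sketched outside gdb. This is only an illustration in Python, not the code from debug-logging.gdb; the function name `count_format_args` is made up here, and it assumes every conversion takes exactly one argument (no `*` widths).

```python
def count_format_args(fmt: str) -> int:
    """Count printf-style conversions in fmt, treating "%%" as a literal percent."""
    count = 0
    i = 0
    while i < len(fmt):
        if fmt[i] == "%":
            if i + 1 < len(fmt) and fmt[i + 1] == "%":
                i += 2  # "%%" is an escaped percent sign, not an argument
                continue
            count += 1  # any other "%" starts one conversion (assumption)
        i += 1
    return count


if __name__ == "__main__":
    for s in ("GUI_Control:%d 0x%x", "100%% done", "%s: %d%%"):
        print(repr(s), "->", count_format_args(s))
```

A logger can then branch on the returned count to decide how many argument registers or stack slots to read before formatting the message.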

Also, during the past two weeks (when I was without internet access), I made huge progress regarding emulation of 70D, 5D3, 7D2 and EOS M3. Canon GUI doesn't start yet, but it shouldn't be very far away.

I've also extended model_list to include other model-specific parameters, and made it easy to add new ones:

https://bitbucket.org/hudson/magic-lantern/src/qemu-nkls/contrib/qemu/eos/model_list.c

BTW, I suspect the serial flash contents are not fully correct. In your dumper, I changed the buffer to uncacheable (fio_malloc) and got slightly better results (property blocks were recognized), and I also had to do this change when reading the serial flash contents via DMA. The data appears offset by half-byte, and on regular (non-DMA) reads, Canon code fixes it in the same way (in ReadBlockSerialFlash, for blocks smaller than 0x200). I guess the DMA engine is expected to apply the same "fix".

However, after these two changes, I still suspect data corruption (couldn't parse the property data structures completely - the first few properties are fine, and after a while it's pure gibberish, at least on the 70D SF dump from nikfreak). Guess we'll have to dump the RAM address where the serial flash contents is loaded (or maybe attempt to read the entire serial flash with a single call in the dumper).

Walter Schulz

Quote from: a1ex on June 12, 2016, 12:47:44 PM
Also, during the past two weeks (when I was without internet access), I made huge progress regarding emulation of 70D, 5D3, 7D2 and EOS M3.

EOS M3? Didn't see that coming ...

a1ex

Well, it's the only other DIGIC 6 camera for which I have a firmware (or did any other camera get a firmware update meanwhile?)

Sure, it's a PowerShot, but the DryOS core is the same, so I actually used it in order to understand the 7D2 code better. I only tried to emulate Canon firmware, didn't try to load any sort of custom code on it yet.

I'll post more details about the M3 on the relevant thread, since this emulation can also be useful for those who are porting CHDK on this camera.

nkls

Wow, I'm impressed! Nice job! :) I won't be able to try it out in the upcoming two weeks, but I'll give it a go after that.

You are right about the sf data being manipulated in some way, I wondered what strange format would produce data like:

0000fff0: ffff ffff ffff ffff ffff ffff ffff ffff  ................
00010000: 0fff f000 0dcf 70f0 0000 0000 1d0f 70f0  ......p.......p.
00010010: 0000 0000 0040 3000 0000 0000 10c0 0000  .....@0.........
00010020: 0fff ffff f010 0000 10c0 0000 0fea fdcb  ................
00010030: a080 0000 1180 0000 0000 0000 0000 0000  ................
00010040: 0000 0000 0000 0000 0020 0000 1180 0000  ......... ......

but now that you mention it, it does look half-byte shifted at 0x10000.

Interestingly, the first block contains the version string "4.2.1" which is not half-byte shifted.

00000000: 4603 0080 342e 322e 3100 0000 0000 0000  F...4.2.1.......
00000010: 0000 0000 ffff ffff ffff ffff ffff ffff  ................
00000020: ffff ffff ffff ffff ffff ffff ffff ffff  ................


Is it just me or is the write address not incremented in your fix?

So wait, is the data still unshifted when you use uncacheable memory? What happens with the data block exactly -- is the first half-byte always set to zero, and the last half-byte discarded? Any indication of the higher-level calls requesting more data than asked for, or is the last byte of a read always assumed to be bogus? (note to self)

You could also try to force it to only use non-DMA/only-DMA with gdb and see if that works better, I recall getting different results when doing that.

Quote from: a1ex on June 12, 2016, 12:47:44 PM
Guess we'll have to dump the RAM address where the serial flash contents is loaded (or maybe attempt to read the entire serial flash with a single call in the dumper).

It'd be tricky to allocate 16MB(?) for the full flash wouldn't it? It would also be interesting to see how, e.g., one 2 KB read compares to two 1 KB reads; maybe we just need to overlap the block reads by a few bytes and unshift them to make it work.
100D.100A

a1ex

Quote from: nkls on June 13, 2016, 10:57:16 PM
Is it just me or is the write address not incremented in your fix?

Wow, great catch; it fixes a bunch of asserts in 70D log, regarding those properties located in serial flash.

Quote
So wait, is the data still unshifted when you use uncacheable memory?
Yes.

Quote
What happens with the data block exactly -- is the first half-byte always set to zero, and the last half-byte discarded?

Canon code does this:

if ( size < 0x200 )
{
    readSerialFlash(src, buf, 512);
    while ( offset < size )
    {
        *(_BYTE *)(out + offset) = 16 * *(_BYTE *)(buf + offset) | (*(_BYTE *)(buf + offset + 1) >> 4);
        ++offset;
    }
}
else
{
    v6 = readSerialFlashWithQuad(src, out, size);
}
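
To try the same un-shifting on a dump file, the loop above can be transcribed roughly as follows. This is a sketch in Python, not Canon's or ML's code; the name `unshift_half_byte` is invented for illustration, and like the original loop it needs one extra input byte beyond the requested size.

```python
def unshift_half_byte(buf: bytes, size: int) -> bytes:
    """Rebuild `size` bytes from data that is offset by half a byte.

    Each output byte takes the low nibble of buf[i] as its high nibble
    and the high nibble of buf[i + 1] as its low nibble.
    """
    if len(buf) < size + 1:
        raise ValueError("need size + 1 input bytes to un-shift size bytes")
    out = bytearray(size)
    for offset in range(size):
        out[offset] = ((buf[offset] << 4) & 0xFF) | (buf[offset + 1] >> 4)
    return bytes(out)


if __name__ == "__main__":
    # First 8 bytes at 0x10000 in the hex dump nkls posted earlier in the thread
    shifted = bytes.fromhex("0ffff0000dcf70f0")
    print(unshift_half_byte(shifted, 7).hex())
```

Fed with the first bytes at 0x10000 from the dump quoted earlier, this turns `0f ff f0 00 ...` into `ff ff 00 00 ...`, which at least looks more like byte-aligned data.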


Quote
It'd be tricky to allocate 16MB(?) for the full flash wouldn't it?

Nope, just try it.

(some background info here and here)

nkls

I had another look at the flash memory yesterday, and I figured out a few things. Your code appears to load the serial flash correctly for the 100D without changes, and it might be that it's just the spells causing the property errors. (Still no Canon GUI.)

The chip in my camera is a Winbond W25Q128FV (datasheet), and not a Macronix as I thought before. The manufacturer code and device ID are {0xEF,0x40,0x18}, which is what I got from SPI instruction 9Fh. This doesn't matter much since it works anyway, but it's good to have the actual datasheet to base the flash emulation on.
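
As a quick sanity check of that ID, here is a small Python sketch that decodes the three bytes returned by instruction 9Fh. It assumes the common convention that the third byte is log2 of the capacity in bytes (true for many SPI NOR parts, but not guaranteed for all), and the manufacturer table is a tiny illustrative subset; `decode_jedec_id` is a made-up helper name.

```python
# Illustrative subset of JEDEC manufacturer codes
MANUFACTURERS = {0xEF: "Winbond", 0xC2: "Macronix"}


def decode_jedec_id(raw: bytes) -> dict:
    """Split a 3-byte JEDEC ID into manufacturer, memory type and size."""
    if len(raw) != 3:
        raise ValueError("expected exactly 3 ID bytes")
    manufacturer, memory_type, capacity = raw
    return {
        "manufacturer": MANUFACTURERS.get(manufacturer, "unknown"),
        "memory_type": memory_type,
        # Assumption: capacity byte is log2(size in bytes)
        "size_bytes": 1 << capacity,
    }


if __name__ == "__main__":
    info = decode_jedec_id(bytes([0xEF, 0x40, 0x18]))
    print(info["manufacturer"], info["size_bytes"] // (1024 * 1024), "MiB")
```

Under that assumption, 0x18 corresponds to 2^24 bytes, i.e. 16 MiB, which matches the 128 Mbit in the part name.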

There are two high-level interfaces (with the same debug names) for the serial flash. One uses SPI-only transfer with "serial layout", while the other is "QUAD" read which reads data through DMA, or SPI if the read block is small enough. The DMA read has this shifted data layout, maybe due to physical wiring limitations or obfuscation or whatnot. The code you posted is used by the property manager, and is from the QUAD interface, which has to un-shift the SPI data to ensure proper data layout.

This explains why the version string in the first block is readable -- it is read and written by the SPI-only interface, which doesn't expect the DMA layout. I've used the address of the SPI-only interface in my dumper, so the data in the dump should be the same as in the actual flash.

I've also tried to change the block size, and there is no difference between the data received from a read-all-at-once dump file and a 1024-byte fio_malloc'ed dumper.
100D.100A

a1ex

Quote from: nkls on June 19, 2016, 12:26:47 PM
it might be that it's just the spells causing the property errors.

These should be easy to fix. Did you take the spells from another camera, or did you log the 100D startup with dm-spy-experiments?

Spells from another camera will cause issues.




BTW, I've included a small SD/CF image here, hopefully this makes it easier to install and get started. The card image is bootable and includes a small autoexec.bin that does the display test, as this runs on most (if not all) DIGIC 4/5 models out of the box.

That means: run the install script, compile QEMU, copy your ROM from the ML card and it should be ready to go.

Still, it won't display the GUI on cameras other than 60D...

a1ex

Okay, I know you won't believe this one.

After about 2 weeks of intensive work on QEMU, I couldn't manage to get 70D and 5D3 working.

Today, after about half an hour of tinkering... 1200D Canon GUI boots.... with the MPU SPELLS from 60D!



Emulation log: 1200D-qemu.log

These were the changes I did to QEMU for 1200D: [1] [2] [3]. And a GDB script to help me see what's going on: [4].

I tried the 60D spells just to see how it goes, but I didn't hope it would go that far.

(side note: currently we can see the MPU spells only after ML is ported, as they come from a secondary CPU whose firmware is pretty much impossible to understand, at least for me)

At this point, I think it's worth trying most other DIGIC 4 cameras with SD card. They probably require a MPU spell log (easy to get, as described here), and probably a few other minor tweaks. Happy hacking!

P.S. looks like it took me more time to write this message, than to actually get the 1200D GUI booting :)

a1ex

Some more:

1100D also ran out of the box with 60D spells:


550D is stubborn; I ended up with this after patching lvInit (not yet committed):


600D needs its own MPU spell set (most likely because of the crop video mode settings, which are stored in the MPU), but I expect it to work without much trouble.

mathias

Alex, I am getting a lot of errors like:

/home/matias/qemu/qemu-2.5.0/hw/arm/../eos/eos_handle_serial_flash.c:38:5: error: 'for' loop initial declarations are only allowed in C99 mode
/home/matias/qemu/qemu-2.5.0/hw/arm/../eos/eos_handle_serial_flash.c:38:5: note: use option -std=c99 or -std=gnu99 to compile your code

I've fixed some, but it seems that some configuration is missing, or am I doing something wrong?

a1ex

Interesting; I used to have C99 errors before, but they were no longer present after upgrading to QEMU 2.5.0 and Ubuntu 15.10. I thought QEMU devs enabled C99 in newer versions, so I started using C99 constructs in my code.

Looks like the answer is here:

https://gcc.gnu.org/onlinedocs/gcc-4.8.0/gcc/Standards.html
Quote
The default, if no C language dialect options are given, is -std=gnu90

https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Standards.html
Quote
The default, if no C language dialect options are given, is -std=gnu11

Try adding -std=gnu99 to CFLAGS in your QEMU Makefile, or when configuring it. For example:


/path/to/ml/qemu/qemu-2.5.0$ CFLAGS=-std=gnu99 ../configure_eos.sh


(this should be included in the installer, actually)

mathias

Great,

I was able to run 1200D with Hello world in qemu 2.5.0

a1ex

Interesting, after trying a couple of times (and getting different execution logs every time) I finally got your hello world running as well.

Looks like there is a race condition somewhere in the emulation.

mathias

I don't know why, but it throws an error if I remove the hello world define. The error is not clear (at least to me):

Error loading 'ML/MODULES/1200D_100.sym': File does not exist
while in the emulator I see an error like "symbols not found" (it disappears quickly); if I hit the DEL key, the screen goes black.

But I tried it in version 1.6: no error, and the DEL key gives me the ML menu.

I don't know how you managed to display the GUI in QEMU
(before testing a port on my camera, I always try to see it working in QEMU)

a1ex

If the file does exist, ML cannot load it because its name is longer than 8 characters.
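
To check module or symbol file names before copying them to the card image, a rough 8.3 length test could look like this. It is a Python sketch only: `fits_8_3` is a hypothetical helper, it checks lengths and nothing else (no character-set rules, no long-filename entries), and it assumes the 8-character limit mentioned above applies to the part before the extension.

```python
def fits_8_3(filename: str) -> bool:
    """Return True if filename has a 1-8 char base and an extension of at most 3 chars."""
    base, dot, ext = filename.rpartition(".")
    if not dot:
        # No dot at all: the whole name is the base, with an empty extension
        base, ext = filename, ""
    return 1 <= len(base) <= 8 and len(ext) <= 3 and "." not in base


if __name__ == "__main__":
    for name in ("1200D_100.sym", "x70_100.sym", "autoexec.bin"):
        print(name, "->", "ok" if fits_8_3(name) else "too long for 8.3")
```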

Were you able to get the GUI in QEMU 2.5.0 like this?



For me, it doesn't work every time; I have to start QEMU a couple of times to get this screen. This only happens when loading autoexec.bin; if I try to run plain Canon firmware (with bootflag disabled), it works every time.

nikfreak

Quote from: mathias on June 29, 2016, 05:50:04 AM
Error loading 'ML/MODULES/1200D_100.sym': File does not exist
while in the emulator I see an error like "symbols not found" (it disappears quickly); if I hit the DEL key, the screen goes black.

If 8.3 filenaming applies to qemu, too, then you've got your answer: shorten it to "x70_100.sym" or something like that.
70D.112 & 100D.101

a1ex

Quote from: nikfreak on June 29, 2016, 07:47:20 AM
If 8.3 filenaming applies to qemu

QEMU actually emulates a SD card device, with all the low-level communication (including DMA transfers), so filesystem behavior should match Canon's. For example, if you start the emulation on a formatted SD image, you will see Canon firmware creating the DCIM and MISC folders.

I expect formatting from Canon menu should work as well, just didn't try it. This needs the MPU spells (button codes and whatever other GUI events there might be) for navigating Canon menu.

For CF cards, emulation currently works properly in the bootloader (e.g. loading autoexec.bin from card), but not in the main firmware.

mathias

Well, I tried reinstalling QEMU (in case it was my mistake), but the same errors appear.
Using the 1200D branch, if I run it with HELLO_WORLD I get this (no Canon GUI):


If I remove it, I get the "file not found" error I mentioned before. (I double-checked and the file exists; the 1200D only has an SD card, so I am only mounting the SD. Notice the file in the screenshot.)


a1ex

Copy all ML files (make install), not just autoexec ;)

a1ex

600D is just as stubborn as 550D, but easier to debug (more recent codebase).

With proper MPU spells, it gives the date/time screen, just like 550D.

With a small patch on the RTC init routine, it gives the sensor cleaning animation (real-time!)



Greg

500D :
0xFF18A884 - mpu_send
0xFF05C1F0 - mpu_recv

a1ex

Some progress understanding MPU messages:

Button codes

They are encoded like this:

0x06, 0x05, 0x06, btn_code, btn_code_arg, 0x00


The button codes can be found from bindReceiveSwitch - this translates the MPU button codes (btn_code, btn_code_arg) into GUI button codes as used by GuiMainTask (the BGMT constants from gui.h). When the GUI button codes are sent to GuiMainTask, they appear in the debug log as "GUI_Control:%d 0x%x". When they are actually processed, they appear as "GUI_CONTROL:%d".

Since finding these button codes for each camera would be incredibly boring, I wrote a Python script to get them automatically from the ROM, by directly emulating bindReceiveSwitch in unicorn, trying all usual input values and checking debug messages.

btn_code_arg meaning can be: press/unpress (1,0), scrollwheel direction (1, -1) and number of steps for very fast turns (2, -3 etc), or some buttons can be grouped under a single btn_code (for example, the direction pad).

Many button codes are common across all cameras (DIGIC 4 and 5): MENU (0,1), INFO (1,1), PLAY (3,1), DELETE (4,1), SET (12,1/0), scrollwheels (13 and 14), others are not.
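
The common codes listed above can be collected into a small lookup, which is handy when reading MPU logs by eye. This is a Python sketch built only from the values in this post; `describe_button` is a hypothetical helper, the argument is taken as a signed value, and which physical wheel is 13 and which is 14 is deliberately left open.

```python
# (btn_code, btn_code_arg) pairs reported as common across DIGIC 4 and 5
COMMON_BUTTONS = {
    (0, 1): "MENU",
    (1, 1): "INFO",
    (3, 1): "PLAY",
    (4, 1): "DELETE",
    (12, 1): "SET (press)",
    (12, 0): "SET (unpress)",
}
WHEEL_CODES = {13, 14}   # scrollwheels; arg is direction and step count
SERVICE_MENU_CODE = 30   # ServiceMenu on all cameras, per the post


def describe_button(btn_code: int, btn_code_arg: int) -> str:
    """Give a human-readable name for an MPU button code pair."""
    if btn_code in WHEEL_CODES:
        direction = "+" if btn_code_arg > 0 else "-"
        return f"scrollwheel {btn_code}: {abs(btn_code_arg)} step(s), direction {direction}"
    if btn_code == SERVICE_MENU_CODE:
        return "ServiceMenu"
    return COMMON_BUTTONS.get((btn_code, btn_code_arg), "unknown (model-specific?)")


if __name__ == "__main__":
    for pair in ((0, 1), (12, 0), (13, -3), (26, 1)):
        print(pair, "->", describe_button(*pair))
```

Anything not in the table (the direction pad, for example) falls through to "unknown", since those codes differ between models.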

btn_code = 30 is ServiceMenu on all cameras. Interesting string on 1200D: "Enter Secret mode Electric Shutter!!!".

Some cameras also use a generic event, GUICMD_PRESS_BUTTON_SOMETHING, which I don't know how to interpret (other than some button was pressed).

Side note: this research uncovered a few subtle bugs regarding button codes on 600D, 100D and EOS M, and not-so-subtle on 1100D (see this PR).

GUI modes

On 600D, menu navigation looks somewhat like this ("spell" being data sent from ICU to MPU, and "reply" being the response from MPU):

    { 0x06, 0x05, 0x03, 0x19, 0x00, 0x00 }, {                   /* spell #44 */
        { 0x06, 0x05, 0x03, 0x17, 0x9a, 0x00 },                 /* reply #44.1 */
        { 0x06, 0x05, 0x06, 0x26, 0x01, 0x00 },                 /* reply #44.2, GUI_Control:76, bindReceiveSwitch(38, 1) */
        { 0x06, 0x05, 0x06, 0x00, 0x01, 0x00 },                 /* reply #44.3, BGMT_MENU, GUI_Control:6, bindReceiveSwitch(0, 1) */
        { 0x06, 0x05, 0x04, 0x0d, 0x00, 0x00 },                 /* reply #44.4 */
        { 0 } } }, {
    { 0x06, 0x05, 0x03, 0x19, 0x00, 0x00 }, {                   /* spell #45 */
        { 0x06, 0x05, 0x06, 0x26, 0x01, 0x00 },                 /* reply #45.1, GUI_Control:76, bindReceiveSwitch(38, 1) */
        { 0x06, 0x05, 0x06, 0x00, 0x01, 0x00 },                 /* reply #45.2, BGMT_MENU, GUI_Control:6, bindReceiveSwitch(0, 1) */
        { 0 } } }, {
    { 0x06, 0x05, 0x04, 0x00, 0x01, 0x00 }, {                   /* spell #46, NotifyGUIEvent(1) */
        { 0x06, 0x05, 0x06, 0x0a, 0x00, 0x00 },                 /* reply #46.1, BGMT_UNPRESS_ZOOMOUT_MAYBE, GUI_Control:17, bindReceiveSwitch(10, 0) */
        { 0x06, 0x05, 0x06, 0x09, 0x00, 0x00 },                 /* reply #46.2, BGMT_UNPRESS_ZOOMIN_MAYBE, GUI_Control:15, bindReceiveSwitch(9, 0) */
        { 0x06, 0x05, 0x04, 0x00, 0x01, 0x01 },                 /* reply #46.3 */
        { 0x0e, 0x0c, 0x0a, 0x08, 0x11, 0x00, 0x15, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00 },/* reply #46.4 */
        { 0 } } }, {
    { 0x08, 0x06, 0x00, 0x00, 0x04, 0x00, 0x00 }, {             /* spell #47, Complete WaitID = 0x80020000 */
        { 0 } } }, {
    { 0x06, 0x05, 0x03, 0x34, 0x00, 0x00 }, {                   /* spell #48 */
        { 0 } } }, {
    { 0x06, 0x05, 0x03, 0x19, 0x00, 0x00 }, {                   /* spell #49 */
        { 0x06, 0x05, 0x06, 0x26, 0x01, 0x00 },                 /* reply #49.1, GUI_Control:76, bindReceiveSwitch(38, 1) */
        { 0x06, 0x05, 0x06, 0x1a, 0x01, 0x00 },                 /* reply #49.2, BGMT_PRESS_RIGHT, GUI_Control:35, bindReceiveSwitch(26, 1) */
        { 0x06, 0x05, 0x06, 0x1a, 0x00, 0x00 },                 /* reply #49.3, BGMT_UNPRESS_RIGHT, GUI_Control:36, bindReceiveSwitch(26, 0) */
        { 0x06, 0x05, 0x06, 0x26, 0x01, 0x00 },                 /* reply #49.4, GUI_Control:76, bindReceiveSwitch(38, 1) */
        { 0x06, 0x05, 0x06, 0x1a, 0x01, 0x00 },                 /* reply #49.5, BGMT_PRESS_RIGHT, GUI_Control:35, bindReceiveSwitch(26, 1) */
        { 0x06, 0x05, 0x06, 0x1a, 0x00, 0x00 },                 /* reply #49.6, BGMT_UNPRESS_RIGHT, GUI_Control:36, bindReceiveSwitch(26, 0) */


NotifyGUIEvent (called by SetGUIRequestMode) sends a message like this:

0x06, 0x05, 0x04, 0x00, event_code, 0x00


The MPU is supposed to reply something, probably this:

0x06, 0x05, 0x04, 0x00, event_code, 0x01
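
For sorting through such logs, a small classifier for the 6-byte messages can help. This is a Python sketch under assumptions read off the 600D log above: button messages carry the two bindReceiveSwitch arguments in bytes 3 and 4 (e.g. `06 05 06 1a 01 00` is bindReceiveSwitch(26, 1)), and GUI event messages carry event_code in byte 4 and a flag in byte 5. `decode_mpu_message` is a made-up name, and every other message type is simply reported as "other".

```python
def decode_mpu_message(msg):
    """Classify a 6-byte MPU message as a button event, a GUI event, or other."""
    if len(msg) == 6 and msg[0] == 0x06 and msg[1] == 0x05:
        if msg[2] == 0x06:
            # Button: (btn_code, btn_code_arg) as passed to bindReceiveSwitch
            return ("button", msg[3], msg[4])
        if msg[2] == 0x04 and msg[3] == 0x00:
            # GUI event: (event_code, last byte); 0x00 in the spell, 0x01 in the reply
            return ("gui_event", msg[4], msg[5])
    return ("other",)


if __name__ == "__main__":
    samples = [
        [0x06, 0x05, 0x06, 0x1A, 0x01, 0x00],  # reply #49.2 in the log
        [0x06, 0x05, 0x04, 0x00, 0x01, 0x00],  # spell #46, NotifyGUIEvent(1)
        [0x06, 0x05, 0x04, 0x00, 0x01, 0x01],  # reply #46.3
        [0x06, 0x05, 0x03, 0x17, 0x9A, 0x00],  # reply #44.1, not decoded here
    ]
    for sample in samples:
        print(" ".join(f"{b:02x}" for b in sample), "->", decode_mpu_message(sample))
```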


Now the interesting part: if I enable the NotifyGUIEvent reply on 60D, and the other cameras that accept the same MPU spell set, they no longer boot the GUI: instead, they go to the date/time dialog. You can adjust the date/time in QEMU using the arrow keys, scrollwheels, and the spacebar (SET), but when pressing OK, the GUI freezes.

Obviously, with NotifyGUIEvent disabled, the GUI mode can't be changed (so you can't enter Canon menu, or playback mode, or whatever).

The question is: how to make the GUI mode switches work, so one can navigate the menu?




Note: with current implementation, you can already navigate ML menu without CONFIG_QEMU=y... if you define GUIMODE_ML_MENU = 0 (so it won't try to change the GUI mode, because that part doesn't work). One big step closer towards running unmodified ML in QEMU :)

Greg

500D :

# ./run_canon_fw.sh 500D -s -S & arm-none-eabi-gdb -x 500D/debugmsg.gdb

source -v debug-logging.gdb

macro define CURRENT_TASK 0x1A74
macro define CURRENT_ISR  (*(int*)0x664 ? (*(int*)0x668) >> 2 : 0)

b *0xFF066A98
DebugMsg_log

b *0xFF069E2C
task_create_log

b *0xFF064520
load_default_date_time_log
macro define RTC_VALID_FLAG (*(int*)0x2BC4)

cont

budafilms

Hi everybody,
an optimistic question from someone without the skills to use this: what is the real utility/advantage of this?

(I mean, for example, more memory, more resolution, new language for coding, girls...)

Thanks!

Walter Schulz

Testing code without a camera:
- The camera cannot be bricked
- You don't have to wear out gear (card, slot, card reader, etc.) each time you have to replace binaries.