How to run Magic Lantern into QEMU?!...

Started by jplxpto, September 23, 2012, 08:29:02 PM

Greg

500D with spells from 550D has a new menu item:


a1ex

Very cool. I had this setting enabled on 550D, so it must be saved on the MPU side (probably EEPROM).

Old notes about it: https://groups.google.com/d/topic/ml-devel/ti8GyVqEZmo/discussion

Greg

Yes it is stored in the MPU.
Studio mode on  { 0x06, 0x05, 0x01, 0x42, 0x01, 0x00 },
Studio mode off { 0x06, 0x05, 0x01, 0x42, 0x00, 0x00 },
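
To see which byte carries the setting, the two messages can be compared directly. This is only a minimal Python sketch: the names `STUDIO_ON`, `STUDIO_OFF` and `diff_positions` are made up for illustration, and reading byte 0 as the total message length is an assumption that merely happens to fit these two messages.

```python
# The two MPU messages quoted above, as raw bytes.
STUDIO_ON = bytes([0x06, 0x05, 0x01, 0x42, 0x01, 0x00])
STUDIO_OFF = bytes([0x06, 0x05, 0x01, 0x42, 0x00, 0x00])


def diff_positions(a: bytes, b: bytes) -> list[int]:
    """Return the indices at which two equal-length messages differ."""
    if len(a) != len(b):
        raise ValueError("messages must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    # Index of the byte that toggles between "on" and "off".
    print(diff_positions(STUDIO_ON, STUDIO_OFF))
    # Assumption: byte 0 is the total length (it matches for these messages).
    print(STUDIO_ON[0] == len(STUDIO_ON))
```

The same comparison should work for any other pair of on/off messages captured from an MPU log.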

g3gg0

i am still stunned how good the emulation works.. :)
Help us with datasheets - Help us with register dumps
magic lantern: 1Magic9991E1eWbGvrsx186GovYCXFbppY, server expenses: [email protected]
ONLY donate for things we have done, not for things you expect!

Greg

Live View VRAM patch:



[EDMAC#18] Starting transfer to 0x1B07800 from conn #4, 1440x424, flags=0x20000080
Loading photo raw data from ./500D/VRAM/PH-LV/LV-000.422...
[EDMAC#18] 610560 bytes written to 1B07800-1B9C900.
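
The numbers in this log are self-consistent, which is easy to check: 1440 x 424 gives the byte count, and adding it to the start address gives the end address. A small Python sketch follows; `edmac_range` is a hypothetical helper, and treating the first dimension as bytes per line (which would fit a 720-pixel YUV422 line at 2 bytes per pixel) is an assumption based only on the arithmetic.

```python
import re

LINE = ("[EDMAC#18] Starting transfer to 0x1B07800 from conn #4, "
        "1440x424, flags=0x20000080")


def edmac_range(line: str) -> tuple[int, int, int]:
    """Return (start, end, size) for an EDMAC 'Starting transfer' line.

    Assumption: the WxH pair means W bytes per line and H lines.
    """
    m = re.search(r"transfer to 0x([0-9A-Fa-f]+) .*?(\d+)x(\d+)", line)
    if m is None:
        raise ValueError("not an EDMAC transfer line")
    start = int(m.group(1), 16)
    size = int(m.group(2)) * int(m.group(3))
    return start, start + size, size


if __name__ == "__main__":
    start, end, size = edmac_range(LINE)
    print(f"{size} bytes written to {start:X}-{end:X}")
```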

a1ex

500D menu navigation works here too, thanks Greg :D

Currently, this old camera is the only one that lets you navigate Canon menu. All other models show either a static GUI or the date/time screen.

Also committed:
- initial support for EOS M10 and M5 (for CHDK)
- an option to export function calls to IDA
- an experiment to group related MPU messages from timestamps (in the dm-spy-experiments branch)
- some auto comments regarding MPU messages
- minor fixes here and there.

a1ex

Formatting the virtual card works too, both from Canon and ML (of course, on 500D) :)

This is the first test in the suite that runs an unmodified ML binary. It actually downloads the current nightly build (at the time of writing) and checks both the GUI (expected screens) and the card contents (whether ML still boots after being restored).

Test log

This log should also contain useful info (what commands to run), should you want to reproduce these experiments on your PC. I should probably write a guide, other than the tips from the install script.

Menu screens currently covered by the test suite:







There are still some nondeterministic bugs (that's why some tests are retried a few times, until success); those will need fixing before using QEMU as a test platform for ML builds. Still, it is already starting to be useful (for example, for getting menu screenshots).
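
The "retry until success" idea can be sketched as below. This is not the test suite's actual code; `run_with_retries` and its parameters are hypothetical, and a real test would launch the emulator instead of calling a Python function.

```python
from typing import Callable


def run_with_retries(test: Callable[[], bool], max_attempts: int = 3) -> int:
    """Run `test` until it returns True.

    Returns the number of attempts used, or raises RuntimeError if every
    attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        if test():
            return attempt
    raise RuntimeError(f"test failed after {max_attempts} attempts")


if __name__ == "__main__":
    # Stand-in for a flaky emulator run: fails twice, then passes.
    outcomes = iter([False, False, True])
    print(run_with_retries(lambda: next(outcomes), max_attempts=5))
```

Reporting the attempt count, rather than just pass/fail, keeps the flakiness visible in the test log.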

At this stage, I think the old implementation is no longer useful, so we may start thinking about merging the QEMU branch into unified. This will remove most of those CONFIG_QEMU hacks from the source code.

BTW, if you have experience with some testing framework, and you know a nicer way to implement the current tests, I'd be interested in hearing from you.

Greg

Firmware update
ROM modified with a hex editor, "DisableMainFirm" - http://magiclantern.wikia.com/wiki/Bootflags
500D 1.1.1 -> 1.1.2



200:  5337.856 [UPD] Welcome to Update program
201:  5337.856 [UPD]   Program Ver.Slave 0.2.0
208:  5338.112 [UPD] ------------ Initialized
277:  5343.232 [UPD] CurrentVersion=1.1.1
278:  5343.232 [UPD] DS_MODELID =0x80000252
295:  5350.144 [MS] LOCK (1)
535:  8316.672 [UPD] StartFirmupProgress
569:  8319.232 [UPD] ERROR Do not read
571:  8358.912 [UPD] 0=UPD_VerifyFirmware
572:  8830.464 [UPD] 0=UPD_DecryptoFirmware
574:  8850.944 [UPD] CheckSum file=0xd960afec buffer=0xd960afec
575:  8947.968 [UPD] SAFEMODE
620: 35370.240 [UPD] ERR 1=updSpecificPartner
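
Log lines in this format (index, timestamp, tag, message) are easy to pick apart for filtering or comparison. A minimal Python sketch, assuming every line follows the layout seen above; `LogEntry` and `parse_line` are illustrative names, and the unit of the timestamp is not assumed.

```python
import re
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    index: int        # message counter at the start of the line
    timestamp: float  # timestamp as printed (unit not assumed here)
    tag: str          # e.g. "UPD" or "MS"
    message: str


_LINE = re.compile(r"^\s*(\d+):\s+(\d+\.\d+)\s+\[(\w+)\]\s*(.*)$")


def parse_line(line: str) -> Optional[LogEntry]:
    """Parse one debug log line; return None if it does not match."""
    m = _LINE.match(line)
    if m is None:
        return None
    return LogEntry(int(m.group(1)), float(m.group(2)),
                    m.group(3), m.group(4))


if __name__ == "__main__":
    entry = parse_line("574:  8850.944 [UPD] CheckSum file=0xd960afec "
                       "buffer=0xd960afec")
    print(entry)
    # Compare the two checksums printed by the updater.
    file_sum, buf_sum = re.findall(r"=0x([0-9a-f]+)", entry.message)
    print("checksums match:", file_sum == buf_sum)
```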



a1ex

First ML change I've tested in QEMU on all unified models:

https://bitbucket.org/hudson/magic-lantern/pull-requests/796/new-method-for-getting-current-task-names/diff

Latest update adds partial 7D support (slave CPU only, without IPC).

Test log.

a1ex

A small change that unlocked Canon menu navigation on many models:

https://bitbucket.org/hudson/magic-lantern/commits/c881ba2

After some refactoring and porting the 500D MPU messages required for GUI, Canon menu navigation is now also working on...

60D, 550D, 600D, 700D, 100D, 1100D and 1200D!

Screenshots (guess the cam):






Test log.

All screenshots here (click on Expand all).

This is a big breakthrough, as it effectively lets me review ML ports or code changes on cameras I don't own :)

DeafEyeJedi

5D3.113 | 5D3.123 | EOSM.203 | 7D.203 | 70D.112 | 100D.101 | EOSM2.* | 50D.109

g3gg0

please don't let jenkins test builds format the card. i am scared it will wipe our build server  :o

kidding :D
cool work!

nkls

Nice work! Finally some running cameras to toy around with.  :D

I've managed to merge the latest changes into QEMU v2.9.0-rc1. It's probably not a perfect merge: the camera GUI hangs more often than on v2.5.0 for me.

The patches to QEMU code merged quite well, but someone with more knowledge of them should probably review whether they are OK or redundant in 2.9.0.

A major difference is that I've replaced the interrupt thread with a QEMUTimer. This solved some iothread lock bug, and I guess it should work better than the thread, since it's now synced to the guest clock and not the host system.
100D.100A

a1ex

Quote from: nkls on March 27, 2017, 10:12:32 PM
A major difference is that I've replaced the interrupt thread with a QEMUTimer. This solved some iothread lock bug, and I guess it should work better than the thread, since it's now synced to the guest clock and not the host system.

Yay! That's what I wanted to do next, hoping it would solve the GUI lock-ups. I/O lock was another issue that I didn't know how to solve.

Quote
I've managed to merge the latest changes into QEMU v2.9.0-rc1. It's probably not a perfect merge: the camera GUI hangs more often than on v2.5.0 for me.

My attempt to merge with 2.8.0 was not very successful (got stuck at making the serial port work, so ended up disabling it), but otherwise it seemed to run fairly well. Will try to integrate your changes and see how it goes.

a1ex

Quote from: nkls on March 27, 2017, 10:12:32 PM
A major difference is that I've replaced the interrupt thread with a QEMUTimer. This solved some iothread lock bug, and I guess it should work better than the thread, since it's now synced to the guest clock and not the host system.

This solved the intermittent I/O lock-ups that I was unable to track down for a long time!!!

5D3 SD card test successfully ran 10 times in a row. Previously, I had to run it about 5 times to get one successful run...

All the menu navigation tests from test suite passed with flying colors, without any retries required!

Thanks nkls!!!

a1ex

Added a couple of jobs on the build server:

- QEMU-dm-spy: compiles the dm-spy-experiments branch and runs the binary in the emulator. These logs contain all Canon's debug messages, and optionally all MMIO activity. Should be useful for anyone who wants to understand the startup process.
- QEMU-boot-check: compiles ML for every camera model (from nightly) with CONFIG_QEMU=y and runs it for a few seconds in the emulator; this compilation flag enables additional debug info at startup, useful for checking the boot process (where autoexec.bin is loaded, how much memory it takes, what it does to reserve it and so on).
- QEMU-FA_CaptureTestImage: compiles a minimal autoexec.bin that calls FA_CaptureTestImage (therefore taking a full-res silent picture). All the debug messages from Canon and all the MMIO activity are logged. Might be useful for understanding the still photo capture process.
- QEMU-tests: that's the test suite for QEMU (presented earlier in this thread)

All these tests have HTML logs (actually just plain text with colors) and screenshots (where applicable).

I'm also thinking of running some basic tests on the nightly, on those models with functional GUI (tests such as menu screenshots, load each module, check memory usage, run some simple Lua scripts). The emulation is not there yet for more complex tests (for example, we cannot take a CR2 picture or go to LiveView).

a1ex

Some updates:
- upgraded to QEMU 2.9.0, thanks nkls (still experimental, as I had quite a bit of trouble with it, so it's in a different branch for now)
- fixed another (or maybe the same?) nondeterministic lock-up (see a few posts above)
- initial support for 1300D (WIP)
- options to log memory accesses (aka memory tracing); run with "-d help" to get the list

The lock-up bug was showing up very rarely on 2.5.0 after the timer refactoring from nkls (let's say about 1 out of 100 runs was bad), but after upgrading to 2.9.0 it showed up in more than half of the test runs (or about 1/5 of the test runs if the log was redirected to file). I narrowed it down to the interrupt controller (from a change made many months ago to support 1000D and other VxWorks models).

I'm also experimenting with logging all memory accesses made by the guest firmware, on 2.5.0. Examples for 1300D:


./run_canon_fw.sh 1300D -d romw
...
Firm Jump RAM to ROM 0xFE0C0000
K404 READY
[rom1]     at 0x0001D54C:0001D54C [0xF8000000] <- 0x6       : 8-bit
[rom1]     at 0x0001D54C:0001D54C [0xF8000000] <- 0x6       : 8-bit
[rom1]     at 0x0001D54C:0001D54C [0xF8000000] <- 0xE9      : 8-bit
[DMA1] Copy [0xF8E60000] -> [0x402D4000], length [0x0026BBF8], flags [0x00030001]
[DMA1] OK
     0:    20.480 [STARTUP]



./run_canon_fw.sh 1300D -d ramw,romr
...
[rom1]     at 0xFE0C000C:001000EC [0xFEA7A270] -> 0xE92D4010
[ram]      at 0xFE0C000C:001000EC [0x00001900] <- 0xE92D4010
[rom1]     at 0xFE0C009C:001000EC [0xFEA7A274] -> 0xEB000BAB
[ram]      at 0xFE0C009C:001000EC [0x00001904] <- 0xEB000BAB
...


I know I'm almost certainly reinventing the wheel, but I had only limited success with these modified versions:
- mtrace uses a very very old QEMU
- panda 1.0 uses QEMU 1.0.1, examples work, lots of nice tools, but appears deprecated (shouldn't be hard to roll back our patches to the older version)
- panda 2.0 uses a very recent QEMU, but could not run any ARM examples (segmentation fault). Also, most of the cool tools from panda 1.0 are not ported yet.
- QEMU-DBI is "being upstreamed into QEMU", and a large part of it is already in 2.9.0 (the main reason I've upgraded). TODO: figure out how to use it...
- QEMU-CHERI is a mod for MIPS that also traces memory and instructions (nice to see how it works)
- the last one, QEMU-trace, is a very simple patch that showed me where to place the hooks in the QEMU codebase (also with this message and this thread from mailing lists).

So, yeah, I still want to use the state-of-the-art method for logging memory accesses; I just need to figure out how. Until then, my monkey-patched method appears to work pretty well (can rebuild the memory contents from the trace) and has very little overhead as long as I'm not printing each access to the console.
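
Rebuilding memory contents from such a trace could look roughly like the Python sketch below. It is not the actual tooling; `replay_writes` is a hypothetical helper, and it assumes that `<-` marks a write, that lines without a width suffix are 32-bit accesses, and that the guest is little-endian.

```python
import re

# Matches lines such as:
#   [ram]      at 0xFE0C000C:001000EC [0x00001900] <- 0xE92D4010
#   [rom1]     at 0x0001D54C:0001D54C [0xF8000000] <- 0x6       : 8-bit
_ACCESS = re.compile(
    r"\[(\w+)\]\s+at 0x[0-9A-Fa-f]+:[0-9A-Fa-f]+\s+"
    r"\[0x([0-9A-Fa-f]+)\]\s+(<-|->)\s+0x([0-9A-Fa-f]+)"
    r"(?:\s*:\s*(\d+)-bit)?"
)


def replay_writes(lines, region="ram"):
    """Replay the writes to `region`; return a dict {byte address: value}."""
    memory = {}
    for line in lines:
        m = _ACCESS.search(line)
        if m is None or m.group(1) != region or m.group(3) != "<-":
            continue  # not a trace line, other region, or a read
        addr = int(m.group(2), 16)
        value = int(m.group(4), 16)
        # Assumption: no width suffix means a 32-bit access.
        nbytes = int(m.group(5)) // 8 if m.group(5) else 4
        # Assumption: little-endian byte order.
        for i, byte in enumerate(value.to_bytes(nbytes, "little")):
            memory[addr + i] = byte
    return memory


if __name__ == "__main__":
    trace = [
        "[rom1]     at 0xFE0C000C:001000EC [0xFEA7A270] -> 0xE92D4010",
        "[ram]      at 0xFE0C000C:001000EC [0x00001900] <- 0xE92D4010",
        "[rom1]     at 0xFE0C009C:001000EC [0xFEA7A274] -> 0xEB000BAB",
        "[ram]      at 0xFE0C009C:001000EC [0x00001904] <- 0xEB000BAB",
    ]
    mem = replay_writes(trace)
    print(bytes(mem[a] for a in range(0x1900, 0x1908)).hex())
```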

a1ex

Currently experimenting with a binary instrumentation tool similar to valgrind's memcheck (though a lot more primitive, as I'm reinventing the wheel again). It's written on top of the memory tracing (which is already committed) and a similar hook calling every time a new code block (TranslationBlock) is executed.

Quick example (don't click me):

    uint32_t * p = malloc(1234);
    qprintf("p=%x\n", p);
    p[100] = p[200] + 1;            /* use of uninitialized value (read) */
    free(p);
    qprintf("p freed\n");
    p[20] = p[30] + 1;              /* use after free (both read and write) */
    qprintf("test complete\n");


From emulation log:

p=fb440
[run_test:589c8:589c8] fb760 uninitialized
p freed
[run_test:589e4:589e4] fb4b8 read after free (0)
[run_test:589f0:589e4] fb490 written after free (1)
test complete


The current state is just a very rough proof of concept, but it already found a bunch of null pointer, uninitialized memory and thread safety bugs :D
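
The general idea behind this kind of check (a per-byte "shadow" state updated on malloc, free, read and write) can be illustrated with a toy Python model. This is only a sketch of the technique, not the QEMU implementation; `ShadowHeap` and its methods are invented names, and the numbers in parentheses from the real log are not reproduced.

```python
UNINIT, INIT, FREED = "uninit", "init", "freed"


class ShadowHeap:
    """Toy shadow memory: one state per heap byte."""

    def __init__(self):
        self.state = {}     # byte address -> UNINIT / INIT / FREED
        self.blocks = {}    # block base address -> size
        self.warnings = []

    def malloc(self, base, size):
        self.blocks[base] = size
        for addr in range(base, base + size):
            self.state[addr] = UNINIT

    def free(self, base):
        size = self.blocks.pop(base)
        for addr in range(base, base + size):
            self.state[addr] = FREED

    def _states(self, addr, size):
        return {self.state.get(a) for a in range(addr, addr + size)}

    def read(self, addr, size=4):
        states = self._states(addr, size)
        if FREED in states:
            self.warnings.append(f"{addr:x} read after free")
        elif UNINIT in states:
            self.warnings.append(f"{addr:x} uninitialized")

    def write(self, addr, size=4):
        if FREED in self._states(addr, size):
            self.warnings.append(f"{addr:x} written after free")
            return
        for a in range(addr, addr + size):
            if a in self.state:
                self.state[a] = INIT


if __name__ == "__main__":
    heap, p = ShadowHeap(), 0xFB440   # same base address as in the log
    heap.malloc(p, 1234)
    heap.read(p + 200 * 4)            # p[200]: never written
    heap.write(p + 100 * 4)           # p[100] = ...
    heap.free(p)
    heap.read(p + 30 * 4)             # p[30] after free
    heap.write(p + 20 * 4)            # p[20] after free
    print("\n".join(heap.warnings))
```

With the base address from the log, the toy model flags the same three addresses (fb760, fb4b8, fb490).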

eduperez

Quote from: a1ex on May 01, 2017, 01:49:29 AM
The current state is just a very rough proof of concept, but it already found a bunch of null pointer, uninitialized memory and thread safety bugs :D

For a moment I thought you were talking about bugs in Canon's code...  :o

a1ex

For your viewing pleasure:

http://builds.magiclantern.fm/jenkins/job/QEMU-memcheck/QEMU_memcheck_logs/500D.111-memchk.log.html

The analysis only has 500D stubs for now (though it's easy to add for other models).

The first bunch of TCM warnings can be ignored (they come from the initialization sequence). The remaining TCM accesses from Canon tasks are probably bugs in Canon firmware, or in my emulation.

Here's an obvious one, if you look it up in the disassembly:

[FileMgr:ff3b5d38:ff3b5d38] address 0 written to TCM (12)
[FileMgr:ff3b5d48:ff3b5d48] address 0 written to TCM (2000)
[FileMgr:ff3b5d58:ff3b5d58] address 0 written to TCM (100)
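
When a log contains many such warnings, grouping them by task name helps to see where they cluster. A short Python sketch, assuming each warning starts with a `[task:hex:hex]` prefix as in the lines above; `warnings_per_task` is a hypothetical helper, and task names containing characters other than letters, digits and underscores would need a looser pattern.

```python
import re
from collections import Counter

_WARNING = re.compile(r"^\[(\w+):[0-9a-f]+:[0-9a-f]+\]\s+(.*)$")


def warnings_per_task(lines):
    """Count memcheck-style warnings per task name."""
    counts = Counter()
    for line in lines:
        m = _WARNING.match(line.strip())
        if m:
            counts[m.group(1)] += 1
    return counts


if __name__ == "__main__":
    log = [
        "[FileMgr:ff3b5d38:ff3b5d38] address 0 written to TCM (12)",
        "[FileMgr:ff3b5d48:ff3b5d48] address 0 written to TCM (2000)",
        "[FileMgr:ff3b5d58:ff3b5d58] address 0 written to TCM (100)",
    ]
    for task, count in warnings_per_task(log).most_common():
        print(task, count)
```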

a1ex

More updates:

- 50D boots the GUI! (figured it out from this log)
- 5D2 is very close
- faster emulation (test suite about twice as fast)
- code coverage report



Self-testing log