Canon 80D

a1ex · January 31, 2019, 10:01:41 PM

Awesome! It saved all of that without crashing?

CONFIG_80D: it should be defined by the Makefiles. If I compile the same code on 5D4, for example, it will refuse to run. If I write some gibberish into the CONFIG_80D block from log-d6.c, I get a compile error (without having to define CONFIG_80D manually anywhere).
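A minimal sketch of that kind of per-model guard, not the actual contents of log-d6.c. The helper name `trace_target_model` is hypothetical, and the macro is defined inside the file only so the example compiles on its own; in a real build it would come from the Makefile.

```c
#include <assert.h>
#include <string.h>

/* Assumption: a real build gets this from the Makefile (e.g. -DCONFIG_80D).
 * It is defined here only so the sketch compiles stand-alone. */
#ifndef CONFIG_80D
#define CONFIG_80D
#endif

/* Hypothetical helper; the name is not taken from log-d6.c. */
static const char *trace_target_model(void)
{
#if defined(CONFIG_80D)
    /* Only this branch is compiled; gibberish here would break the build. */
    return "80D";
#else
    /* Any other model: refuse at compile time. */
    #error "This sketch only supports CONFIG_80D"
#endif
}
```

Writing garbage inside the `CONFIG_80D` branch is a quick way to confirm the macro is really defined: the build fails only if that branch is being compiled.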

Back to cleaning up io_trace.

sombree · January 31, 2019, 10:25:25 PM

Yep, no crash, just a slightly longer camera startup.

As for CONFIG_80D - you're right, it's defined by the Makefile.

a1ex · February 01, 2019, 10:34:28 PM

OK, io_trace code committed. Whew, this was hard!

Compile with: make clean && make CONFIG_MMIO_TRACE=y

If it locks up:
- try previous changesets
- reduce the logged range (for example, try logging from 0xC0220000, size 0x1000)
- give me as many details about the crash as you can (it's very hard to debug)

Fingers crossed.

Chellyandruu · February 02, 2019, 01:25:17 AM

https://www.dropbox.com/s/g8u4gmchldcngxr/DEBUGMSG.LOG?dl=0

a1ex · February 02, 2019, 07:09:21 AM

Make sure you compile with CONFIG_MMIO_TRACE=y on the command line. There should be 2 log files.

If in doubt, run "make clean" first (updated the command), as the build system doesn't know it needs to recompile if you change only the options. It only recompiles if you edit source files.

sombree · February 02, 2019, 09:53:50 AM

It worked out of the box.

mmio log without lens
mmio log with lens attached
another one - with a little exercise - took a photo, fiddled with a touchscreen, changed iso etc.

a1ex · February 02, 2019, 07:48:36 PM

Really nice! These files contain the ultimate reference data for DIGIC 6!

Previously, I had to guess many of these values, from what the code seemed to expect. That took me probably months of trial and error (well, years with the fragmentation, as I didn't work only on this). You may ask - why didn't I start with this from the beginning? Err... back then, I simply did not understand how computers work, well enough to write the MMIO tracing code. I've learned it from... all of that trial and error (not only from the 80D firmware, but from investigating about 40 EOS models from different generations). I've tried to summarize what I've learned here, but it's probably a bit too advanced for beginners.

Anyway, let's try logging some more address ranges:

- from 0xBFE00000, size 0x200000 (expecting some activity from GuiMainTask)
- from 0xEE000000, size ~~0x1000000~~ 0x2000000 (no activity in QEMU; unused?)
- ~~from 0xA0000000, size 0x40000000 (expecting all MMIO activity, i.e. C, D and BFE ranges)~~ (nevermind, invalid range)
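The post does not say why the 0xA0000000 range is invalid. One plausible reason, assuming the memory protection unit follows the usual ARMv7-R (PMSAv7) rules, is that a region must have a power-of-two size and a base address aligned to that size. The sketch below checks a candidate range against those assumed rules; the function names are hypothetical and not part of io_trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: PMSAv7-style MPU rules (size is a power of two, at least
 * 32 bytes, and the base address is a multiple of the size). */
static bool mpu_range_is_valid(uint32_t base, uint32_t size)
{
    bool power_of_two = size != 0 && (size & (size - 1)) == 0;
    if (!power_of_two || size < 32)
        return false;
    return (base & (size - 1)) == 0;
}

/* First address past the range; 64-bit so it cannot wrap around. */
static uint64_t range_end(uint32_t base, uint32_t size)
{
    return (uint64_t)base + size;
}
```

Under these assumptions, 0xC0220000/0x1000, 0xBFE00000/0x200000 and 0xEE000000/0x2000000 all pass, while 0xA0000000/0x40000000 fails the alignment check.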

I'll be back with a guide on how to interpret these huge logs.

sombree · February 02, 2019, 09:58:41 PM

from 0xBFE00000, size 0x200000 - camera locks up with Err70, no log is saved. Edit: sometimes the camera locks up with Err70 but the LED still blinks as it's supposed to.
from 0xEE000000, size 0x1000000 - MMIO log looks empty - log.
from 0xA0000000, size 0x40000000 - camera locks up without any error, no log is saved.

Ant123 · February 03, 2019, 09:03:11 AM

Does MMIO trace work in LiveView mode?

On EOS M3 io_trace causes crash in Rec mode.

sombree · February 03, 2019, 11:28:18 AM

A few times it didn't work in regular (photo) LiveView: the on-screen image was garbled and there was no log. I've just tried again and had no problems at all, both in photo and video LV (log).

Ant123 · February 03, 2019, 12:24:15 PM

Quote from: sombree on February 02, 2019, 09:58:41 PM
from 0xBFE00000, size 0x200000 - camera locks up with Err70, no log is saved.

The 0xBFF00000 memory region contains message buffers shared with the MZRM core, so the io_trace code is probably not fast enough for transfers done by memcpy.

sombree · February 03, 2019, 01:41:15 PM

After building with -O2 (instead of the default -Os), I was able to get some partial logs:
from 0xBFE00000, size 0x200000 - log
from 0xA0000000, size 0x40000000 - camera locks up with Err60, no log is saved.

Edit:
I checked from 0xA0000000, size 0xFFFFFFF - camera doesn't lock up and MMIO log is empty (unused range?).

a1ex · February 04, 2019, 09:01:57 AM

Indeed, access in the BFE region is done with highly optimized code: memcpy and memcpy-like routines. This is where io_trace struggles.

Possible reasons for crash:
- Too slow (noticeable with a huge number of events). In this case, using a lower frame rate for LiveView usually helps.
  - compiler optimization: the io_trace code is handwritten assembly (i.e. not affected at all)
  - maybe the other logging code (written in C) was improved by -O2, but I don't expect a big difference
  - I'd be surprised if the optimization level really helped in a repeatable way (i.e. consistently crashing with -Os and consistently working with -O2)
  - either way, these partial logs were really helpful, so whatever you did to capture them, it was good

- Re-execution context (register contents) slightly different. This could affect the outcome of the original code (the path it takes). My tracing code works by:
  - disallowing memory access in the logged area
    - this camera has a memory protection unit with 8 memory regions (from 0 to 7)
    - 80D firmware uses just the first 7 (likely true on other D6 models)
    - so, I could configure the last memory region (which has the highest priority) to cover the logged area and to disable the read/write permissions
    - this will cause a data abort exception for the instruction trying to read or write in the logged range
  - this instruction will have to be re-executed:
    - I need to log the result of that instruction (i.e. return value from a MMIO register), so I can't just jump back into the original code
    - therefore, I'm copying the original instruction in the middle of my data abort handler
    - before re-executing it, I disable the memory protection region, sync the caches and restore context (original registers)
    - then I re-enable the memory protection, to keep catching further MMIO reads/writes
    - this lets me log stuff both before and after the trapped instruction
  - registers in data abort mode are not exactly the same as in user mode (SP and LR differ)
    - this is not usually a problem in regular MMIO code, but highly optimized code like memcpy does use LR as a general-purpose register; maybe also SP
    - this edge case (or corner case, if you prefer) is not handled well by io_trace
- other bugs in the low-level io_trace code, that I'm not aware of
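The trap-and-re-execute flow described above can be modeled on a PC. This is only a host-side sketch, not the io_trace assembly: a boolean stands in for memory protection region 7, a plain variable stands in for one MMIO register, and a function call stands in for the data abort exception. Every name is hypothetical, and the cache sync and register restore steps are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Host-side model only: every name here is hypothetical. */
#define LOG_MAX 16

struct mmio_event { uint32_t addr; uint32_t value; bool is_write; };

static struct mmio_event trace_log[LOG_MAX];
static int trace_count = 0;

static uint32_t fake_register = 0x1234;   /* stands in for one MMIO register */
static bool region7_blocks_access = true; /* stands in for MPU region 7 */

/* The real access; on hardware this is the copied, re-executed instruction. */
static uint32_t raw_access(uint32_t value, bool is_write)
{
    if (is_write)
        fake_register = value;
    return fake_register;
}

/* Models the data abort handler: unprotect, re-execute, log, re-protect. */
static uint32_t data_abort_handler(uint32_t addr, uint32_t value, bool is_write)
{
    region7_blocks_access = false;              /* let the access through */
    uint32_t result = raw_access(value, is_write);
    if (trace_count < LOG_MAX)
        trace_log[trace_count++] = (struct mmio_event){ addr, result, is_write };
    region7_blocks_access = true;               /* keep catching accesses */
    return result;
}

/* Models a load/store from firmware code hitting the logged range. */
static uint32_t traced_access(uint32_t addr, uint32_t value, bool is_write)
{
    if (region7_blocks_access)
        return data_abort_handler(addr, value, is_write);  /* "exception" */
    return raw_access(value, is_write);
}
```

The model cannot show the SP/LR problem, because a C function call has no banked registers. It only illustrates why the handler must re-execute the access itself in order to log the value that was read.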

The partial log shows a very important hint: at address 0xBFE00000, the firmware writes 0 or 1, but always reads 0. That means this address (possibly the entire range) behaves somewhat like MMIO, rather than like simple memory. It's probably shared memory, where the secondary cores (like Omar and Zico) are writing, too. Still, as long as we are not emulating these secondary cores, we need to model their behavior. If no code is executed by the main core from this memory range, modeling it as MMIO (rather than regular memory) is probably the way to go.
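To make the difference concrete, here is a minimal sketch of an MMIO-style model for that one address, assuming the only behavior to reproduce is "writes of 0 or 1 are accepted, reads always return 0". The struct and function names are hypothetical; this is not QEMU's device API, just the read/write callback idea in plain C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the word at 0xBFE00000; not QEMU's actual API. */
struct shared_flag_model {
    uint32_t last_written;   /* kept for logging/debugging only */
    unsigned write_count;
};

/* Writes are accepted and remembered, but have no effect on reads. */
static void shared_flag_write(struct shared_flag_model *m, uint32_t value)
{
    m->last_written = value;
    m->write_count++;
}

/* Reads always return 0, matching what the partial log shows. */
static uint32_t shared_flag_read(const struct shared_flag_model *m)
{
    (void)m;
    return 0;
}
```

With plain memory, a read after writing 1 would return 1; with this model it returns 0, which is what the firmware appears to see on real hardware.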

What to do about logging the activity in this range?
- fix the io_trace code to re-execute the trapped instruction with the original SP and LR
- replace memcpy with a slower function (easier to log with current code)
- try to log smaller regions

I'll look into the first two. You could attempt to log from 0xBFF00000, size 0x100000 (i.e. the upper half, used by MZRM). The lower half (0xBFE00000, size 0x100000) appears to be used by Omar.
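For the second option (replacing memcpy with a slower function), a minimal sketch could be a plain word-by-word loop. The name `slow_copy32` is hypothetical, and whether the generated code really avoids SP and LR depends on the compiler and flags, so the disassembly would need checking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical replacement; copies one 32-bit word per loop iteration.
 * The volatile qualifiers ask the compiler to perform each access
 * separately instead of merging them into multi-register transfers. */
static void slow_copy32(volatile uint32_t *dst, const volatile uint32_t *src,
                        size_t words)
{
    for (size_t i = 0; i < words; i++)
        dst[i] = src[i];
}
```

The idea is that each trapped access then comes from a simple load or store, which the current handler is more likely to re-execute correctly, at the cost of speed.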

In QEMU, logging the BFF range gives over 12000 MMIO events, starting with:
