Indeed, access in the BFE region is done with highly optimized code, memcpy and memcpy-like This is where io_trace struggles.
Possible reasons for crash:
- Too slow (noticeable with a huge number of events). In this case, using a lower frame rate for LiveView usually helps.
- compiler optimization: the io_trace code is handwritten assembly (i.e. not affected at all)
- maybe the other logging code (written in C) was improved by -O2, but I don't expect a big difference
- I'd be surprised if the optimization level really helped in a repeatable way (i.e. consistently crashing with -Os and consistently working with -O2)
- either way, these partial logs were really helpful, so whatever you did to capture them, it was good
- Re-execution context (register contents) slightly different. This could affect the outcome of the original code (the path it takes). My tracing code works by:
- disallowing memory access in the logged area
- this camera has a memory protection unit with 8 memory regions (from 0 to 7)
- 80D firmware uses just the first 7 (likely true on other D6 models)
- so, I could configure the last memory region (which has the highest priority) to cover the logged area and to disable the read/write permissions
- this will cause a data abort exception for the instruction trying to read or write in the logged range
- this instruction will have to be re-executed:
- I need to log the result of that instruction (i.e. return value from a MMIO register), so I can't just jump back into the original code
- therefore, I'm copying the original instruction in the middle of my data abort handler
- before re-executing it, I disable the memory protection region, sync the caches and restore context (original registers)
- then I re-enable the memory protection, to keep catching further MMIO reads/writes
- this lets me to log stuff both before and after the trapped instruction
- registers in data abort mode are not exactly the same as in user mode (SP and LR differ)
- this is not usually a problem in regular MMIO code, but highly optimized code like memcpy
does use LR as a general-purpose register; maybe also SP
- this edge case (or corner case, if you prefer) is not handled well by io_trace
- other bugs in the low-level io_trace code, that I'm not aware of
The partial log shows a very important hint: at address 0xBFE00000, the firmware writes 0 or 1, but always reads 0. That means this address (possibly the entire range) behaves somewhat like MMIO, rather than like simple memory. It's probably shared memory, where the secondary cores (like Omar and Zico) are writing, too. Still, as long as we are not emulating these secondary cores, we need to model their behavior. If no code is executed by the main core from this memory range, modeling it as MMIO (rather than regular memory) is probably the way to go.
What to do about logging the activity in this range?
- fix the io_trace code to re-execute the trapped instruction with the original SP and LR
- replace memcpy with a slower function (easier to log with current code)
- try to log smaller regions
I'll look into the first two. You could attempt to log from 0xBFF00000, size 0x100000 (i.e. the upper half, used by MZRM). The lower half (0xBFF00000, size 0x100000) appears to be used by Omar.
In QEMU, logging the BFF range gives over 12000 MMIO events, starting with:
4.494781 GuiMainTas:fe64ee06:MMIO : [0xBFF20000] <- 0x00000106
4.494782 GuiMainTas:fe64ee0a:MMIO : [0xBFF20004] <- 0x8201FC60
4.495039 GuiMainTas:fe64ee0e:MMIO : [0xBFF20008] <- 0xA0FFFF01
It still contains highly optimized memcpy-like code, so don't worry if it won't work out of the box.
Size 0xFFFFFFF means, according to Cortex R4 TRM page 123:
- enabled
- region size: 4 GB
- all sub-regions disabled
In other words, it's effectively disabled (it won't log anything).
Size must be power of 2 (p. 124). Start address must be multiple of size (p. 178).
Noticed a typo - from 0xEE000000, the size should have been 0x2000000. Either way, there's no activity in QEMU on this address range. This comes from the startup configuration:
FE0258E4: MCR p15,0,Rd,cr6,cr2,0: RGNR <- 0x5
FE0258EC: MCR p15,0,Rd,cr6,cr1,0: DRBAR <- 0xEE000000
FE0258F4: MCR p15,0,Rd,cr6,cr1,4: DRACR <- 0x329 (P:RW U:RW; Inner Write-back, write-allocate; Outer Write-back, write-allocate; Non-shared)
FE0258FC: MCR p15,0,Rd,cr6,cr1,2: DRSR <- 0x31 (0x2000000)
@Ant123: if you managed to run this on the M3, I'd like to see some logs

Found a slightly better way to merge the two logs:
cat DEBUGMSG.LOG MMIO.LOG | ansi2txt | sort -k 1n -s > STARTUP.LOG
Still, the first few lines (those timestamped with 0.000000, i.e. before the microsecond timer was started by Canon firmware) won't be merged in the right order. This is addressed in the full io_trace code (io-trace-full branch). I might be tempted to run that, too; just need to find the full set of stubs to compile the regular ML codebase. Easy coding task, if anyone wants to help.
Currently trying to find a way to annotate these huge logs (i.e. what each register does).