I think I have found something significant for DIGIC 7 (and probably 8, 10, maybe 6). The task struct now has a field that determines which CPU the task is scheduled for.
The old struct looks like this for DIGIC < 7 (only the end is shown):
uint8_t yieldRequest; // 0x47, 1
uint8_t unknown_0c; // 0x48, 1
uint8_t sleepReason; // 0x49, 1
uint8_t unknown_0d; // 0x4a, 1
uint8_t unknown_0e; // 0x4b, 1
struct context *context; // 0x4c, 4
For 200D (probably all D7), I think it looks like this:
uint8_t yieldRequest; // 0x4b, 1
uint8_t unknown_0c; // 0x4c, 1
uint8_t sleepReason; // 0x4d, 1
uint8_t unknown_0d; // 0x4e, 1
uint8_t unknown_0e; // 0x4f, 1
uint8_t cpu; // 0x50, 1
uint8_t unknown_10; // 0x51, 1
uint8_t unknown_11; // 0x52, 1
uint8_t unknown_12; // 0x53, 1
struct context *context; // 0x54, 4
uint32_t unknown_13; // 0x58, 4
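To sanity-check those offsets, here is a minimal sketch that lets the compiler verify them. The struct name and the `head` padding array are made up for illustration, since only the tail of the struct is listed above. I've also stored `context` as a `uint32_t` rather than a real pointer, so the check gives the same answer on a 64-bit host as on the 32-bit camera; that is my assumption, not how the ML headers declare it.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical name: a view of the 200D task struct covering only the
 * fields listed in this post. */
struct task_d7_tail_view {
    uint8_t  head[0x4b];    /* 0x00: earlier fields, not covered here */
    uint8_t  yieldRequest;  /* 0x4b */
    uint8_t  unknown_0c;    /* 0x4c */
    uint8_t  sleepReason;   /* 0x4d */
    uint8_t  unknown_0d;    /* 0x4e */
    uint8_t  unknown_0e;    /* 0x4f */
    uint8_t  cpu;           /* 0x50: which CPU the task is scheduled for */
    uint8_t  unknown_10;    /* 0x51 */
    uint8_t  unknown_11;    /* 0x52 */
    uint8_t  unknown_12;    /* 0x53 */
    uint32_t context;       /* 0x54: 32-bit pointer on the camera (assumption: kept as uint32_t) */
    uint32_t unknown_13;    /* 0x58 */
};

/* Fail the build if the layout drifts from the offsets listed above. */
static_assert(offsetof(struct task_d7_tail_view, yieldRequest) == 0x4b, "yieldRequest offset");
static_assert(offsetof(struct task_d7_tail_view, cpu)          == 0x50, "cpu offset");
static_assert(offsetof(struct task_d7_tail_view, context)      == 0x54, "context offset");
static_assert(offsetof(struct task_d7_tail_view, unknown_13)   == 0x58, "unknown_13 offset");
```

Compared with the DIGIC < 7 tail, the listed fields start 4 bytes later (0x4b instead of 0x47), and `context` moves from 0x4c to 0x54 because of the four new bytes starting at `cpu`.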
While trying to update Qemu for 200D, I found it was failing to reach the code that initialises some needed structures in memory. A task was being scheduled to do the init, but it never ran: it was scheduled for cpu1, which we disable early on in Qemu.
With cpu = 0 forced for all tasks, I see the "init1" task running and the structures getting at least partially populated. Unfortunately it then triggers an exception, but previously it never ran at all. It looks to me like there are two init routines, designed to run in parallel (presumably with no dependencies between them?). See 0xe0040224 and 0xe0040220. The former is "init", the latter "init1".
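The "force cpu = 0" hack amounts to something like the sketch below. It is only an illustration: the helper name and the idea of patching a raw byte image of the task struct are mine, not the actual Qemu change. It assumes the cpu byte sits at 0x50 as in the 200D layout listed earlier, and that a value of 0 means cpu0.

```c
#include <stdint.h>

/* Offset of the cpu byte in the 200D task struct, per the layout above.
 * Unverified on other models. */
#define TASK_CPU_OFFSET_D7 0x50u

/* Hypothetical helper: given a pointer to the raw bytes of one task struct,
 * force the task onto cpu0 and return the value that was there before,
 * so the caller can log which tasks were originally meant for cpu1. */
uint8_t task_force_cpu0(uint8_t *task_bytes)
{
    uint8_t old_cpu = task_bytes[TASK_CPU_OFFSET_D7];
    task_bytes[TASK_CPU_OFFSET_D7] = 0;
    return old_cpu;
}
```

Returning the old value is handy for finding out which tasks the firmware wanted on cpu1 in the first place, which may matter if some of them really do depend on running in parallel with cpu0.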
Beyond helping with Qemu, this may let ML choose which cpu a task runs on. That could mean running some things up to twice as fast, if we're cpu bound. Are we cpu bound when recording video, for example?