Canon asserts are non-fatal; our logging is bad

Started by names_are_hard, January 26, 2025, 12:58:16 PM

names_are_hard

This is quite a significant finding for debugging.  DryOS asserts are non-fatal.  This means that if a problem occurs, multiple asserts can be triggered before the cam "notices" and hangs.  This interacts badly with ML's assert handling.

ML code overrides the assert handler with my_assert_handler() in init.c.  But this code doesn't expect that multiple asserts are possible.  The text of the assert is written into a buffer, assert_msg, but always at the start of the buffer.  We then call request_crash_log(), but the resulting log write is delayed.  If multiple asserts occur, only the last gets captured; earlier ones are overwritten.  And the last assert is typically the least useful: since an earlier assert failed, the state is unexpected by definition, and it's probable that later asserts happen because of this.  I spent a long time staring at Ghidra thinking "but it's not possible for this combination of conditions to occur!".
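To make the failure mode concrete, here is a minimal standalone model of the overwrite problem.  It is not the real ML code: the handler signature, the buffer size, and the names overwriting_assert_handler() and pending_crash_log() are all assumptions for illustration.  Only assert_msg is a name taken from the description above.

```c
#include <stdio.h>

/* Stand-in for the assert_msg buffer described above.
 * The size is an assumption, not the real value. */
static char assert_msg[512];

/* Hypothetical, simplified handler: every call formats its message
 * at offset 0, so each new assert replaces the previous one. */
static void overwriting_assert_handler(const char *msg, const char *file, int line)
{
    snprintf(assert_msg, sizeof(assert_msg),
             "ASSERT: %s at %s:%d\n", msg, file, line);
    /* The real handler would request a crash log at this point;
     * in this model the "log write" happens whenever
     * pending_crash_log() is eventually read. */
}

/* Hypothetical model of the delayed writer: returns whatever is in
 * the buffer at the time the log is finally saved. */
static const char *pending_crash_log(void)
{
    return assert_msg;
}
```

If the handler is called twice before pending_crash_log() is read, only the second message is still in the buffer, which matches the behaviour described above.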

If your cam crashes and manages to save both CRASHXX.LOG and log00X.log, then log00X.log will hold the first assert; DryOS does the right thing.  CRASHXX.LOG will hold only the last assert that happened before the cam went into Err mode (e.g. Err 70, big red cross); ML does the wrong thing.

I made a quick hack modification to my_assert_handler() to save a chain of asserts.  This is much more useful!  Now the earliest assert is reliably logged, as well as later ones (which have some utility since they're often related).  This needs more work as the buffer isn't large enough.
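A rough sketch of the chaining idea, as a standalone model rather than the actual patch: each assert is appended after the previous one, and once the buffer is full later asserts are dropped so the earliest is never lost.  The names assert_chain, assert_chain_len, chained_assert_handler() and the 1024-byte size are hypothetical, and the sketch ignores the possibility of two tasks asserting at the same time.

```c
#include <stdio.h>

/* Assumed buffer size; the post above notes the real buffer is too small. */
#define ASSERT_CHAIN_SIZE 1024

static char assert_chain[ASSERT_CHAIN_SIZE];
static size_t assert_chain_len = 0;   /* bytes used, excluding the NUL */

/* Hypothetical handler that appends instead of overwriting.
 * Not thread-safe: a real version would need to guard the buffer. */
static void chained_assert_handler(const char *msg, const char *file, int line)
{
    size_t remaining = sizeof(assert_chain) - assert_chain_len;
    if (remaining <= 1)
        return;   /* buffer full: drop later asserts, keep the earliest */

    int written = snprintf(assert_chain + assert_chain_len, remaining,
                           "ASSERT: %s at %s:%d\n", msg, file, line);
    if (written < 0)
        return;   /* formatting error: leave the chain untouched */

    if ((size_t)written >= remaining)
        assert_chain_len = sizeof(assert_chain) - 1;   /* output was truncated */
    else
        assert_chain_len += (size_t)written;
}
```

Dropping the newest messages when space runs out is a deliberate choice here, since the first assert is usually the one worth having.  A larger buffer, or a per-message length cap, would let more of the chain survive.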

Danne


theBilalFakhouri