Some updates:
1) Got clang's thread safety analysis working in plain C. To get these warnings, compile with:
make clean; make PREPRO=y
How it works:
With PREPRO=y:
- it uses the GCC preprocessor to create .i files (preprocessed C)
- it calls clang to check the .i files for thread safety (this is a fake target, you can trigger it manually with e.g. "make menu.t")
- it then calls gcc to build the object file from the preprocessed file
- this build process is a bit slower and a bit more verbose, but should generate the same binary output as a regular build (tested manually - the only difference is build date)
Without PREPRO=y (regular build):
- regular make will build as usual (.c -> .o)
- make *.t will build .c -> .i -> .t (so in this case, the preprocessed C is not used during compilation)
With PREPRO=y PYCPARSER=y:
- this uses some additional workarounds to enable analysis with pycparser (which gets stuck on the gcc extensions we use), so the resulting binary may not be suitable for executing (though the differences are probably minor).
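For anyone who has not seen these annotations before, here is a minimal sketch of what annotated C can look like. This is not taken from the annotated sources; the TSA macro and all demo_* names are made up for illustration, and pthread mutexes stand in for whatever locking primitive the real code uses. The attributes themselves (capability, guarded_by, acquire_capability, release_capability, requires_capability, no_thread_safety_analysis) are Clang's, and the warnings are enabled with -Wthread-safety.

```c
#include <pthread.h>

/* Hypothetical helper macro: only Clang understands these attributes. */
#if defined(__clang__)
#define TSA(x) __attribute__((x))
#else
#define TSA(x) /* other compilers: the annotations expand to nothing */
#endif

/* A lock type the analysis can track (a "capability" in Clang terms). */
struct TSA(capability("mutex")) demo_lock {
    pthread_mutex_t m;
};

struct demo_lock demo_counter_lock = { PTHREAD_MUTEX_INITIALIZER };

/* Any access to demo_counter without holding the lock gets a warning. */
int demo_counter TSA(guarded_by(demo_counter_lock));

/* The prototypes carry the annotations; the wrapper bodies are not analyzed. */
void demo_acquire(struct demo_lock *l)
    TSA(acquire_capability(l)) TSA(no_thread_safety_analysis);
void demo_release(struct demo_lock *l)
    TSA(release_capability(l)) TSA(no_thread_safety_analysis);

/* The caller must already hold the lock. */
void demo_add_locked(int n) TSA(requires_capability(demo_counter_lock));
int demo_add(int n);

void demo_acquire(struct demo_lock *l) { pthread_mutex_lock(&l->m); }
void demo_release(struct demo_lock *l) { pthread_mutex_unlock(&l->m); }

void demo_add_locked(int n)
{
    demo_counter += n;              /* fine: the lock is required by contract */
}

int demo_add(int n)
{
    int result;
    demo_acquire(&demo_counter_lock);
    demo_add_locked(n);
    result = demo_counter;          /* fine: the lock is held here */
    demo_release(&demo_counter_lock);
    return result;
}
```

Reading or writing demo_counter outside the acquire/release pair, or calling demo_add_locked() without the lock, is the kind of thing the analysis should flag. With a compiler other than Clang, the macro expands to nothing and the code builds as usual.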
You can already see a few annotated sources. For example, lvinfo (which is pretty recent code) was already handling threads very well, so it only got a single (false) warning. Other files are not so clean:
mlv_lite gives some warnings; most of them are related either to some vsync variables initialized in raw_rec_task (which should be fine, since the vsync hook is not running during that initialization), or to updating the resolution parameters (which is done from 3 different tasks, but there is some additional logic to make sure only one of them runs at a given time).
Still, only a tiny part of the code base was annotated and reviewed. It's a huge task, and the reward is not obvious at first sight - getting rid of some insidious bugs that happen with low probability and can be impossible to narrow down. But it's badly needed, as the project got pretty complex and is used by many for serious work.
2) Also added a test (in the selftest module) that checks various functions for thread safety (including a known unsafe function, to make sure the test actually works).
This test works by creating 2 tasks that call the tested function in a loop, with some tricks to force a lot of context switches.
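The idea can be sketched on a desktop machine as follows. This is not the selftest code: it uses POSIX threads instead of DryOS tasks, sched_yield() as a stand-in for the context-switch tricks, and all demo_* names are made up. The unsafe function does a separate load and store (so updates can be lost), while the safe one uses an atomic read-modify-write.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_ITERATIONS 20000

atomic_int demo_unsafe_total;
atomic_int demo_safe_total;

/* Known-unsafe: load and store are separate steps, with a yield in between
   to invite a switch to the other thread (lost updates become likely). */
void demo_unsafe_inc(void)
{
    int v = atomic_load(&demo_unsafe_total);
    sched_yield();
    atomic_store(&demo_unsafe_total, v + 1);
}

/* Safe: one indivisible read-modify-write. */
void demo_safe_inc(void)
{
    atomic_fetch_add(&demo_safe_total, 1);
}

struct demo_job {
    void (*fn)(void);               /* function under test */
};

static void *demo_worker(void *arg)
{
    const struct demo_job *job = arg;
    for (int i = 0; i < DEMO_ITERATIONS; i++)
        job->fn();
    return NULL;
}

/* Call fn from two threads at once; returns 0 if both threads ran. */
int demo_run_pair(void (*fn)(void))
{
    struct demo_job job = { fn };
    pthread_t a, b;

    if (pthread_create(&a, NULL, demo_worker, &job) != 0)
        return -1;
    if (pthread_create(&b, NULL, demo_worker, &job) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return 0;
}
```

After demo_run_pair(demo_safe_inc), demo_safe_total should be exactly 2 * DEMO_ITERATIONS. After demo_run_pair(demo_unsafe_inc), demo_unsafe_total can be anything up to that value; on a typical host it usually ends up lower, which is what shows that the harness can catch a race. That outcome is likely but not guaranteed on any single run.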
While writing this test, I noticed something interesting that should help get rid of those race conditions in certain cases:
If we have two DryOS tasks with equal priorities, they will never interrupt each other, unless one task does some sort of yielding (msleep, waiting on a semaphore / message queue / event flags etc). That means two DryOS tasks with equal priorities effectively use cooperative multitasking.
This property will probably no longer hold true on DIGIC 7 (dual core Cortex A9). Not tested on DIGIC 6.
The test I ran to confirm this hypothesis was to run a long loop of NOPs (a few seconds) in 2 tasks and count the context switches. You can enable this test by defining TEST_EQUAL_PRIO in selftest.c and optionally logging context switches from our task_dispatch_hook (e.g. with CONFIG_TSKMON_TRACE, printf in qemu or anything else you consider useful).
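To illustrate what the context switch count is expected to look like, here is a toy model of the scheduling rule described above. It is only a simulation under the assumption that all tasks have equal priority and are always ready to run; it is not DryOS code and it does not reflect how the real scheduler is implemented. All toy_* names are made up.

```c
#include <stddef.h>

/* Hypothetical task model: all tasks are assumed to have equal priority. */
struct toy_task {
    int steps_left;   /* units of work still to do */
    int yield_every;  /* voluntarily yield after this many steps; 0 = never */
};

/* Index of the next unfinished task after `from` (round-robin), or -1. */
static int toy_next(const struct toy_task *t, size_t n, int from)
{
    for (size_t k = 1; k <= n; k++) {
        int i = (int)(((size_t)from + k) % n);
        if (t[i].steps_left > 0)
            return i;
    }
    return -1;
}

/* Run all tasks to completion; return how often the running task changed. */
int toy_count_switches(struct toy_task *t, size_t n)
{
    int switches = 0;
    int cur = toy_next(t, n, (int)n - 1);   /* start the search at index 0 */

    while (cur >= 0) {
        int ran = 0;

        /* No preemption between equal priorities: the task keeps the CPU
           until it finishes or yields on its own. */
        while (t[cur].steps_left > 0) {
            t[cur].steps_left--;
            ran++;
            if (t[cur].yield_every != 0 && ran == t[cur].yield_every)
                break;
        }

        int next = toy_next(t, n, cur);
        if (next >= 0 && next != cur)
            switches++;
        cur = next;
    }
    return switches;
}
```

In this model, two tasks that never yield give exactly one switch (the second task only starts after the first one finishes), no matter how long the loops are. Two tasks with 4 steps each that yield after every step give 7 switches. A count that grows with the loop length in the non-yielding case would mean the tasks are being preempted.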