Danne's crop_rec_4k, 5DIII

Started by Danne, November 09, 2018, 05:11:37 PM

Previous topic - Next topic

0 Members and 2 Guests are viewing this topic.

ilia3101

Ok nevermind about all that for now. First I need to understand how write_frames works...

Could anyone who knows explain: why it is only called once for multiple slots? it only gets the pointer from the first slot as the ptr argument, yet somehow writes all of the slots to the card?

And as far as I can tell a "slot" is an MLV block waiting to be written, which can be a frame or any other type - correct?

names_are_hard

Quote from: ilia3101 on January 19, 2020, 11:00:19 PM
Also why does the code sometimes use '0' instead of NULL for pointers?

NULL and 0 have the same value.  But, NULL implies pointer.  So many people think you should use NULL if it's a pointer, because it lets other people know they're dealing with a pointer.  I agree with this but it's basically a style choice.

ilia3101

Quote from: ilia3101 on January 20, 2020, 12:00:53 AM
Could anyone who knows explain: why it is only called once for multiple slots? it only gets the pointer from the first slot as the ptr argument, yet somehow writes all of the slots to the card?

Ok firgured this one out, all the blocks are actually in a single continuous block of memory behind the scenes, somehow. Therefore the pointer to the first block is actually a pointer to all the blocks needing to be written.

I guess this must roll over back to the beginning of the buffer when it reaches the end, but where is the code for detecting the roll over?

I think can make spanning work now I kinda understand what's going on.

These are just notes for myself, but I'm writing here so anyone who knows better can correct me.

a1ex

Indeed, your understanding is correct, so far. Most of these things are not related to Danne's changes; mlv_lite has the same quirks in the main repository (so, the discussion here is probably off-topic).

Anyway - mlv_lite is pretty much the codebase from the original raw_rec (i.e. most of that is my code, updated by dmilligan to support the MLV format at the same recording speeds as the original raw_rec implementation, if not a bit faster). Long before that, mlv_rec was written by g3gg0 in an attempt to completely replace raw_rec; unfortunately, it was a little slower, and its menus and internals were a bit too complex for my taste, so... we ended up maintaining both. Not ideal, I know. It would make sense to merge both modules somehow, keeping both the raw recording speed and the features, if possible.

Regarding 'loads of "NO_THREAD_SAFETY_ANALYSIS"' - there are only two, that I could find, and that's because the static analysis does not handle cli/sei correctly. I don't see this as a reason to be concerned - on the contrary. But I cannot guarantee 100% correctness either - maybe rewriting some parts of ML in Rust could help with that. No, I don't know Rust, besides trying to run Hello World some years ago, but it seems to be something worth learning.

About what things are executing - there are our own DryOS tasks, more or less straighforward to understand, and there are hooks placed at various locations in Canon firmware (usually in their DryOS tasks). The most interesting one here is the VSYNC hook, aka "EvfReadOutDone" on recent models (i.e. triggered once after capturing every single LiveView frame, in Canon's Evf/LiveView task, which runs with very high priority). That's where each frame is processed, and the next slot is chosen (ideally continuing the previous memory chunk, as it's a lot faster to write several megabytes at once).

However, what exactly happens within each LiveView frame was only recently understood, and only to a very limited extent.

ilia3101

Thanks for the response a1ex

True it is a bit off topic, but the goal is adding card spanning which is on topic I think.

Is the reason for mlv_rec being slower that it writes one frame at a time? It would be nice to merge them. Would unified be the best starting point for this kind of change? Also about mlv_rec not loading in crop_rec_4k, is that a difficult fix?

Quote from: a1ex on January 20, 2020, 04:48:54 PM
Regarding 'loads of "NO_THREAD_SAFETY_ANALYSIS"' - there are only two, that I could find, and that's because the static analysis does not handle cli/sei correctly. I don't see this as a reason to be concerned - on the contrary. But I cannot guarantee 100% correctness either - maybe rewriting some parts of ML in Rust could help with that. No, I don't know Rust, besides trying to run Hello World some years ago, but it seems to be something worth learning.

Hm rust, can it even be compiled to run on these cameras? Might it use any dynamic memory allocation behind the scenes?

Quote from: a1ex on February 05, 2017, 02:12:43 AM
Therefore, if a function is called from more than one task, it must be thread-safe.

This is a good way to think about it, will keep in mind.

Quote from: a1ex on January 20, 2020, 04:48:54 PM
About what things are executing - there are our own DryOS tasks, more or less straighforward to understand, and there are hooks placed at various locations in Canon firmware (usually in their DryOS tasks). The most interesting one here is the VSYNC hook, aka "EvfReadOutDone" on recent models (i.e. triggered once after capturing every single LiveView frame, in Canon's Evf/LiveView task, which runs with very high priority). That's where each frame is processed, and the next slot is chosen (ideally continuing the previous memory chunk, as it's a lot faster to write several megabytes at once).

However, what exactly happens within each LiveView frame was only recently understood, and only to a very limited extent.

Interesting to know. So the VSYNC hook is how ML redirects the image data to it's own memory?

And back to implementing spanning in mlv_lite:

My simplest (most threadsafe) idea was: the main thread will split each write in to an SD and a CF card part, then tell the SD card thread what to write (memory pointer and number of bytes). The splitting should be a 1:5 ratio, or learnt as the recording goes on. But this is not very good as it will make smaller chunks be written each time, and if the SD card has a hiccup, the main thread will be waiting for it to finish, and the other way round, the SD card would not be fully utilised. And if there is not enough frames in the queue, it can't be split very well.

I want to figure out a better way to share the queue between two threads. Do you think that is possible?

Haven't really made sense of the whole writing loop yet, so I can't tell.

Danne

Mlv_rec works here. Only tested with eosm and with that particular branch. I thnk mlv_rec works with 5d3 too. At least with a1ex crop_rec_4k branch.
With that said only thing personally missed in mlv_lite is card spanning :)

ilia3101

oh it does? not with crop_rec.mo enabled though?

Danne

Crop_rec works with eosm.
A bit off topic but it needs certain other modules enabled to work I guess. I'm sure it can be solved with a little trial and error. But let's fix card spanning on mlv_lite first ;)

ilia3101


ilia3101

I think this line:

        writing_queue_head = after_last_grouped;


should be moved to before the frame(s) start being written, so that when the other thread starts looking for frames, it will get unwritten ones.

Or will moving this line affect whatever process is actually putting the frames in to the buffer? In that case, I can add an extra version of this variable for the write threads, updated at the start, and leave the original one for the process actually inserting frames, updated at the end as before (will need some care when both threads are updating it)

I also think the SD card thread should only write one frame at a time, otherwise it will be wasting time and memory, especially when the buffer starts getting full.

This is my current best idea for spanning.


@Danne Do you know how much extra write speed mlv_rec spanning adds? Does it perfectly sum CF and SD speed or is there overhead?

Danne

I used it briefly when it was implemented. Wild guess it gives around 20mb extra. But. 5d3 can do sd_uhs hack so we might get like maybe 40mb extra rough estimation?

ilia3101

Woahh! Didn't know sd_uhs worked on 5D3 as well. So we may get 140MB/s which could give continuous UHD 😱

Danne

Well, a1ex would know this for sure but 125-130 might be more realistic maybe.

Quentin


ilia3101

I HAVE MADE WORKING SPANNING!!!!!! This is what the commit looks like: https://github.com/ilia3101/mlv_lite_spanning/commit/0684bab9e5937a563af8409f5c226af73b8e7c74

I have used github as I don't understand how to use bitbucket or mercurial. On that github repo you can see the before and after.

unfortunately: there is one case where the threads collide and recording stops with "Slot check error" - it happens rarely, and I know exactly why, I just haven't found a way to do mutual exclusion yet.

I am pretty sure the semaphore is what I should use as a "mutex", but I could not get it to work, could you help me a little bit please @a1ex? I just want to know how to use the semaphore for mutex purposes.


Danne

Wow. Impressive. No cam today but will test soon! Hope a1ex can take a look.
Bravo Ilia3101!!

ilia3101

I will upload test builds, I just don't recommend it for anything other than testing!

Recording WILL STOP due to an error sometimes!

Edit: I won't upload builds actually, until I have got it in to a more usable state. Testing right now won't mean anything.

ilia3101

When I add a semaphore eveything just freezes forever, and says something like "25 frames" or "31 frames" until I turn the whole camera off.

All I did was add a pair of take_semaphore and give_semaphore, timeout 0 which I guess means no timeout. I might just try something dumb like a spinlock.

a1ex

Nice progress!

To debug these semaphores, you may use the wrappers at the beginning of menu.c. These will tell you who acquired the semaphore (what line of code), and you can also add the task name (get_current_task_name). What you print on the display, should stay there even if the camera locks up.

However, a hard lock up (when everything stops responding) generally means some kind of invalid hardware (MMIO) access, or DryOS memory overwritten, i.e. messed up badly enough that vanilla firmware can no longer continue. If some task ends up "simply" waiting at semaphore, that shouldn't bring down the entire machine.

For 32-bit integer assignments, on a single-core ARM CPU, semaphores should not be needed - afaik, this operation is atomic. That's probably not true on multi-core systems.

For very short critical sections, you may use cli/sei, but no other tasks will be able to run between those two. You should not call any DryOS API either (such as msleep), as these might enable the interrupts on their own.

Also take a look at the selftest module - that also serves as "documentation" of expected behavior and return value of various DryOS functions. The VxWorks documentation should be helpful, too, as the high-level behavior is pretty much identical (but the internals are not).

ilia3101

Quote from: a1ex on January 22, 2020, 10:16:46 PM
However, a hard lock up (when everything stops responding) generally means some kind of invalid hardware (MMIO) access, or DryOS memory overwritten, i.e. messed up badly enough that vanilla firmware can no longer continue. If some task ends up "simply" waiting at semaphore, that shouldn't bring down the entire machine.

None of my lockups were hard, just raw recording thread(s) locking up - I could still enter ML menu and it would just say "stopping"

Quote from: a1ex on January 22, 2020, 10:16:46 PM
For 32-bit integer assignments, on a single-core ARM CPU, semaphores should not be needed - afaik, this operation is atomic. That's probably not true on multi-core systems.

Doesn't apply to the current situation (I think), but good to know.

Quote from: a1ex on January 22, 2020, 10:16:46 PM
For very short critical sections, you may use cli/sei, but no other tasks will be able to run between those two. You should not call any DryOS API either (such as msleep), as these might enable the interrupts on their own.

This may be the perfect solution! I will try it out. Didn't realise what cli/sei were until I just googled it. Wow!

Is "ASSERT" and beep/bmp_printf allowed between cli/sei?


Will have a look at menu.c and selftest module as well.

ilia3101

I added sei/cli and it seems to fix the problem, but I think the code runs between them for too long, so I will still try and sort out the mutex.

Just one last question for today: There is loads of "Frame order error" messages at the end of recording - it only seems to happen when the video's bitrate exceeds or almost exceeds total write speed. Is this a major issue?

With UHD total write speed tops out around 111MB/s right now, not as amazing as I expected. However I haven't tried sd_uhs yet. Update: I compiled sd_uhs in your branch @Danne and it says not supported on the 5D3 - where do I get the latest sd_uhs?

Danne

Check the code in sd_uhs.c(my branch):
/* Below cams not tested/supported atm. Try it by enabling sd_overclock_task(); */
    if (is_camera("5D3", "1.1.3"))
    {
        /* sd_setup_mode:
         * sdSendCommand: CMD%d  Retry=... ->
         * sd_configure_device(1) (called after a function without args) ->
         * sd_setup_mode(dev) if dev is 1 or 2 ->
         * logging hook is placed in the middle of sd_setup_mode_in (not at start)
         */
        sd_setup_mode       = 0xFF47B4C0;   /* start of the function; not strictly needed on 5D3 */
        sd_setup_mode_in    = 0xFF47B4EC;   /* after loading sd_mode in R0, before the switch */
        sd_setup_mode_reg   = 0;            /* switch variable is in R0 */
        sd_set_function     = 0xFF6ADE34;   /* sdSetFunction */
        sd_enable_18V       = 0xFF47B4B8;   /* 5D3 only (Set 1.8V Signaling) */
        SD_ReConfiguration  = (void *) 0xFF6AFF1C;
     /* sd_overclock_task(); */    }

    if (is_camera("5D3", "1.2.3"))
    {
        sd_setup_mode       = 0xFF484474;
        sd_setup_mode_in    = 0xFF4844A0;
        sd_setup_mode_reg   = 0;
        sd_set_function     = 0xFF6B8FD0;
        sd_enable_18V       = 0xFF48446C;   /* 5D3 only (Set 1.8V Signaling) */
        SD_ReConfiguration  = (void *) 0xFF6BB0B8;
     /* sd_overclock_task(); */    }

return 0;
}

Enable the function with this:
/* sd_overclock_task(); */
to this:
sd_overclock_task();

Don´t forget, the module will autoenable when starting camera.
in modules.c
char *core_modules[] = {"mlv_lite", "crop_rec", "mlv_play", "mlv_snd", "lua", "file_man", "dual_iso", "silent", "ettr"};

Just tested over here and seems to work but havn´t checked agains any benchmark tests.

ilia3101

Didn't work, SD card just benchmarks at 0.0MB/s and there is some error when raw recording. (I'm firmware 1.1.3)

Danne

Well, I can´t really help you on that as I have no issues here.
You could test sd_uhs.mo from the sd_uhs branch. Here is a compiled version:
https://bitbucket.org/Dannephoto/magic-lantern/downloads/sd_uhs.mo

Try and replace it with what´s in your modules folder now. This version needs to be enabled from with debug tab. Not as convenient but check if it works better.