Ultrafast framed preview (5D3)

Started by a.sintes, August 23, 2023, 05:10:58 PM


a.sintes

UPDATED on 2023-09-28 to summarize the whole discussion thread :-)


After some time spent tweaking Danne's crop_rec codebase for the 5D3, I finally got some interesting results around the framed preview, with visible performance increases, leading to an ultrafast framed preview feature.


Latest feature overview video:


warning: the "Framed preview" menus seen in this video are not the latest ones, please read below for the updated version.


Technical details:

The purpose of this ultrafast feature is to cut the computation time of the framed RAW preview rendering in LiveView as much as possible, freeing CPU time for other tasks (e.g. recording RAW data).

This is achieved by precomputing everything possible, notably the RAW & LV buffer offsets and the RGB gamma transformations, so the drawing itself boils down to the lightest possible loop, doing linear accesses to data through simple pointer dereferencing.
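
To illustrate the precomputation idea, here is a minimal C sketch. It is not the code from raw.c: the names (preview_cache_t, preview_cache_build, preview_draw), the 10-bit sample depth, the nearest-neighbour mapping and the square-root curve standing in for the real gamma are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define RAW_LEVELS 1024u   /* assumption: samples reduced to 10 bits */

typedef struct {
    uint8_t   gamma[RAW_LEVELS]; /* RAW level -> 8-bit display value */
    uint32_t *src_offset;        /* one precomputed RAW index per LV pixel */
    size_t    lv_pixels;         /* lv_w * lv_h */
} preview_cache_t;

/* Integer square root, only used while building the table. */
static unsigned isqrt_u32(uint32_t v)
{
    unsigned r = 0;
    while ((uint32_t)(r + 1) * (r + 1) <= v) r++;
    return r;
}

/* Slow part, done once: all divisions and curve evaluations happen here. */
static int preview_cache_build(preview_cache_t *c, int raw_w, int raw_h,
                               int lv_w, int lv_h)
{
    assert(raw_w > 0 && raw_h > 0 && lv_w > 0 && lv_h > 0);
    c->lv_pixels = (size_t)lv_w * (size_t)lv_h;
    c->src_offset = malloc(c->lv_pixels * sizeof *c->src_offset);
    if (!c->src_offset) return -1;

    /* Stand-in transfer curve (gamma 2.0): out = 255 * sqrt(in / max). */
    for (uint32_t i = 0; i < RAW_LEVELS; i++)
        c->gamma[i] = (uint8_t)isqrt_u32(i * 65025u / (RAW_LEVELS - 1u));

    /* Nearest-neighbour mapping from each LV pixel to a RAW sample. */
    size_t n = 0;
    for (int y = 0; y < lv_h; y++)
        for (int x = 0; x < lv_w; x++)
            c->src_offset[n++] =
                (uint32_t)((y * raw_h / lv_h) * raw_w + x * raw_w / lv_w);
    return 0;
}

/* Hot loop: linear walk over both tables, two lookups per LV pixel. */
static void preview_draw(const preview_cache_t *c, const uint16_t *raw,
                         uint8_t *lv)
{
    const uint32_t *off = c->src_offset;
    const uint32_t *end = off + c->lv_pixels;
    while (off < end)
        *lv++ = c->gamma[raw[*off++] & (RAW_LEVELS - 1u)];
}
```

The point of the split is that preview_draw contains no multiplication, division or branching on geometry: everything that depends on the resolution was resolved once in preview_cache_build.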

The latest version of this feature only requires a 675KB memory allocation to work, even at half resolution, which should be sustainable on most cameras.

By doing this for both colored and grayscale previews, we instantly get a smoother preview (both while idle and during recording), which potentially allows us to reduce the sleep times that were defined to leave the CPU enough headroom to record the RAW data.

The cache precomputation is managed by simply computing a determinant value that changes when a new RAW video resolution is selected, so it's quite transparent code-wise.
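
A possible shape for this kind of cache invalidation, as a hedged C sketch: the struct fields, the function names and the FNV-style mixing are assumptions for illustration, not what the actual code computes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical set of parameters the precomputed tables depend on. */
typedef struct {
    int raw_w, raw_h;   /* selected RAW resolution */
    int lv_w, lv_h;     /* LiveView target area */
    int half_res;       /* 1 = half, 0 = quarter horizontal resolution */
} preview_geometry_t;

static uint32_t cached_key;
static int      cache_valid;    /* 0 until the tables were built once */
static int      rebuild_count;  /* only here to make the sketch observable */

/* Fold every parameter into a single value (FNV-style mix over ints). */
static uint32_t geometry_key(const preview_geometry_t *g)
{
    const int fields[] = { g->raw_w, g->raw_h, g->lv_w, g->lv_h, g->half_res };
    uint32_t h = 2166136261u;
    for (unsigned i = 0; i < sizeof fields / sizeof fields[0]; i++) {
        h ^= (uint32_t)fields[i];
        h *= 16777619u;
    }
    return h;
}

/* Returns 1 when the tables had to be rebuilt, 0 when the cache was valid. */
static int preview_cache_refresh(const preview_geometry_t *g)
{
    assert(g);
    uint32_t key = geometry_key(g);
    if (cache_valid && key == cached_key) return 0;
    /* ... rebuild the offset and gamma tables here ... */
    cached_key = key;
    cache_valid = 1;
    rebuild_count++;
    return 1;
}
```

Calling preview_cache_refresh at the top of every draw costs a handful of integer operations, and the expensive rebuild only runs when one of the inputs actually changed.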


Ultrafast feature menu current organisation (following @Grognard's recommendations):

Framed preview
    Engine: legacy | ultrafast
    Comportment
        Idle
            Style: colored | grayscaled
            Resolution: half | quarter
        Recording
            Style: colored | grayscaled
            Resolution: half | quarter
    Timing: legacy | tempered | agressive
    Statistics: off | on

about Comportment:
Both style & resolution can now be chosen based on the raw_recording_state of the camera (idle or recording), which is more natural for the user (adaptive raw preview).
The raw preview routine can still be called by forcing RAW_PREVIEW_COLOR_HALFRES or RAW_PREVIEW_GRAY_ULTRA_FAST (legacy behavior), meaning we can again switch between half-resolution colored and quarter-resolution grayscale during mlv_play replay (this was broken before), with either the legacy or the ultrafast framed preview engine.
Adaptive mode also helps to avoid the colored/grayscale switch during recording when using "LV freeze" framing, which is more comfortable.
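
The selection logic described above could look roughly like this in C. This is a sketch only: every type and function name below is hypothetical, rec_state_t stands in for the real raw_recording_state, and the two REQ_FORCE_* values merely mirror the RAW_PREVIEW_COLOR_HALFRES / RAW_PREVIEW_GRAY_ULTRA_FAST requests.

```c
#include <assert.h>

typedef enum { STYLE_COLORED, STYLE_GRAYSCALED } preview_style_t;
typedef enum { RES_HALF, RES_QUARTER } preview_res_t;
typedef enum { STATE_IDLE, STATE_RECORDING } rec_state_t;

typedef struct {
    preview_style_t style;
    preview_res_t   res;
} preview_mode_t;

/* One mode per camera state, as in the Idle / Recording sub-menus. */
typedef struct {
    preview_mode_t idle;
    preview_mode_t recording;
} preview_config_t;

typedef enum {
    REQ_ADAPTIVE,              /* follow the menu configuration */
    REQ_FORCE_COLOR_HALFRES,   /* legacy: half-resolution colored */
    REQ_FORCE_GRAY_ULTRA_FAST  /* legacy: quarter-resolution grayscale */
} preview_request_t;

static preview_mode_t preview_pick_mode(const preview_config_t *cfg,
                                        rec_state_t state,
                                        preview_request_t req)
{
    assert(cfg);
    switch (req) {
    case REQ_FORCE_COLOR_HALFRES:
        return (preview_mode_t){ STYLE_COLORED, RES_HALF };
    case REQ_FORCE_GRAY_ULTRA_FAST:
        return (preview_mode_t){ STYLE_GRAYSCALED, RES_QUARTER };
    case REQ_ADAPTIVE:
    default:
        return state == STATE_RECORDING ? cfg->recording : cfg->idle;
    }
}
```

A forced request always wins over the menu, which is what lets a caller such as mlv_play keep switching styles regardless of the adaptive settings.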

about Timing:
- legacy relies on the current sleep statements, as in Danne's repository
- tempered only tries to speed things up when idling (or replaying via mlv_play), which is a good compromise (faster before / safe during recording)
- agressive also tries to reduce the sleep values when recording, to speed up the display a bit (needs more testing: it may lead to unexpected recording stops depending on the write buffer saturation)
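
The three timing modes can be summarized by a small decision function. The sketch below is illustrative only: the function name is hypothetical and the two millisecond values are placeholders, not the actual sleep durations used in mlv_lite or mlv_play.

```c
#include <assert.h>

typedef enum { TIMING_LEGACY, TIMING_TEMPERED, TIMING_AGGRESSIVE } preview_timing_t;

/* Placeholder values, chosen for the example only. */
#define SLEEP_LEGACY_MS  50
#define SLEEP_REDUCED_MS 10

/* Sleep to apply after one preview frame, given the mode and camera state. */
static int preview_sleep_ms(preview_timing_t timing, int recording)
{
    assert(recording == 0 || recording == 1);
    switch (timing) {
    case TIMING_TEMPERED:
        /* Faster only when idle or replaying; untouched while recording. */
        return recording ? SLEEP_LEGACY_MS : SLEEP_REDUCED_MS;
    case TIMING_AGGRESSIVE:
        /* Faster everywhere, at the risk of starving the RAW writer. */
        return SLEEP_REDUCED_MS;
    case TIMING_LEGACY:
    default:
        return SLEEP_LEGACY_MS;
    }
}
```

Only the aggressive mode changes anything while recording, which is why it is the one that can affect write buffer saturation.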


Performance increases:

Below are the current benchmark statistics dumped on my 5D3, with values averaged over 1000 display loops:

style | engine    | hz. res. | timing    | draw(ms) | gain      | fps    | gain
color | legacy    | half     | legacy    | 180.211  | ref.      |  4.451 | ref.
color | ultrafast | half     | legacy    |  78.703  | x2.29     |  8.200 | x1.84
color | ultrafast | half     | ultrafast |  78.230  | x2.30     |  9.360 | x2.10
color | ultrafast | quarter  | legacy    |  51.162  | x3.52(*1) | 10.429 | x2.34(*1)
color | ultrafast | quarter  | ultrafast |  53.157  | x3.39(*1) | 12.602 | x2.83(*1)

style     | engine    | hz. res. | timing    | draw(ms) | gain      | fps    | gain
grayscale | legacy    | quarter  | legacy    | 46.452   | ref.      | 10.840 | ref.
grayscale | ultrafast | half     | legacy    | 33.810   | x1.37(*2) | 12.639 | x1.17(*2)
grayscale | ultrafast | half     | ultrafast | 31.939   | x1.45(*2) | 17.840 | x1.65(*2)
grayscale | ultrafast | quarter  | legacy    | 22.598   | x2.06     | 15.184 | x1.40
grayscale | ultrafast | quarter  | ultrafast | 18.137   | x2.56     | 23.475 | x2.17

(*1): no legacy reference
(*2): no legacy reference, but we got performance gain even when compared to legacy quarter resolution!

As we can see, the look-up table technique generally makes the preview drawing routine between 2 and 2.5 times faster than before, which roughly doubles the overall display frame rate. The good news is that we now get a very usable quarter-resolution colored preview (~13fps) and, above all, an almost "realtime" (24fps) grayscale preview, even when recording.
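
For reference, the "gain" columns are plain ratios against the legacy row, and the fps column also tells how much of each frame is spent outside the drawing routine. A small C sketch of that arithmetic (the function names are mine, the numbers come from the benchmark rows):

```c
#include <assert.h>

/* Draw-time gain: how many times shorter the new draw is than the reference. */
static double draw_gain(double ref_ms, double new_ms)
{
    assert(new_ms > 0.0);
    return ref_ms / new_ms;
}

/* Frame-rate gain: new fps over reference fps. */
static double fps_gain(double ref_fps, double new_fps)
{
    assert(ref_fps > 0.0);
    return new_fps / ref_fps;
}

/* Time per displayed frame that is NOT spent drawing (sleeps, other tasks). */
static double non_draw_ms(double fps, double draw_ms)
{
    assert(fps > 0.0);
    return 1000.0 / fps - draw_ms;
}
```

With the grayscale quarter / ultrafast timing row, draw_gain(46.452, 18.137) is about 2.56 and fps_gain(10.840, 23.475) about 2.17. non_draw_ms(23.475, 18.137) is about 24.5 ms, so more than half of each frame period is spent outside the drawing routine, which is the part the timing option acts on.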


Source code and Pull Request:

The (5D3) source code is currently available on my GitHub repository, which is basically a fork of Danne's BitBucket one that I keep working in (currently on lens.c modifications and a reliable focus sequencing module).

Developers, please help the ML community by reviewing the ultrafast framed preview Pull Request opened on the "magiclantern_simplified" repository, so that non-5D3 users can benefit from it too.


Download links:
5D3 (1.1.3 & 1.2.3 firmwares) download links are now available directly here.

These packages are a complete replacement for Danne's, being up to date with his latest code changes (February) and including all the ML modules plus the Cinematographer-mode one, so a fresh install is recommended to use them properly.


Final words:

Thanks a lot to names_are_hard, Walter Schulz, Danne and all the testers out there!
It's too bad she won't live, but then again, who does?

Walter Schulz

Huh? Helping with programming steps? Me?
LOL!

a.sintes

or maybe it was just names_are_hard then? :)

names_are_hard

Walter regularly helps me with programming, otherwise how would I know how features should work, or what cams have them?

Fast preview looks good!  Haven't looked at the code but it doesn't sound like it should be too hard to integrate into any other repo, or make it work on more than the 5d3.  You might be able to get around the large single alloc problem by segmenting it.  Try say, 6 x 256kB allocs.  You'd need to segment the LUT too, and do perhaps some bitmasking ops to quickly determine which segment to use.  Only benchmarks can tell if this is worthwhile :)
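
A rough C sketch of what such a segmented LUT could look like, assuming 32-bit entries and power-of-two segments so that the segment is found with a shift and the slot with a mask. The names and sizes are made up for the example, and plain malloc stands in for whatever allocator the camera code would really use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SEG_SHIFT   16                 /* 65536 entries x 4 bytes = 256 kB */
#define SEG_ENTRIES (1u << SEG_SHIFT)
#define SEG_MASK    (SEG_ENTRIES - 1u)
#define MAX_SEGS    8

typedef struct {
    uint32_t *seg[MAX_SEGS];
    unsigned  nsegs;
} seg_lut_t;

static void seg_lut_free(seg_lut_t *l)
{
    for (unsigned i = 0; i < l->nsegs; i++) free(l->seg[i]);
    l->nsegs = 0;
}

/* Allocate enough 256 kB segments for `entries` slots; -1 on failure. */
static int seg_lut_alloc(seg_lut_t *l, size_t entries)
{
    size_t need = (entries + SEG_MASK) >> SEG_SHIFT;
    l->nsegs = 0;
    if (need > MAX_SEGS) return -1;
    for (size_t i = 0; i < need; i++) {
        l->seg[i] = malloc(SEG_ENTRIES * sizeof(uint32_t));
        if (!l->seg[i]) { seg_lut_free(l); return -1; }
        l->nsegs++;
    }
    return 0;
}

/* Logical index -> slot: high bits pick the segment, low bits the entry. */
static inline uint32_t *seg_lut_at(seg_lut_t *l, size_t i)
{
    assert((i >> SEG_SHIFT) < l->nsegs);
    return &l->seg[i >> SEG_SHIFT][i & SEG_MASK];
}
```

In a linear drawing loop the shift and mask would only be needed when crossing a segment boundary, so the extra cost per pixel can be kept small, but as said, only a benchmark can tell.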

Re my opinion on "where", see my recent post: https://www.magiclantern.fm/forum/index.php?topic=26814.msg244762;topicseen#msg244762

My repo now has crop_rec_4k_mlv_snd code (and, unlike other crop_rec_4k_mlv_snd forks, has all the *other* code, too).  I'd like to start bringing in code from other popular repos, so we can have a single place to work from again.  I've fixed a significant number of bugs (especially in the module system) and improved build time performance greatly.  It builds easily on modern systems, with modern compilers etc (this makes the ML binary smaller as an additional bonus).

I can't make people use it, but I think most people would like to have a single repo that's useful for everyone, and I'm willing to put in the work to merge code.

a.sintes

Yep I've seen your posts about the ML unified Git codebase you're working on and I can only be happy with this initiative: just cloned the repository this morning to check and yes, I confirm it will be very easy to integrate this preview code in it as the framed preview mechanism is basically the exact same as the one I'm currently working on.

What I propose for now is to continue to upgrade and test this ultrafast code using Danne's codebase for a while and once stabilized and validated by the community I'll open a merge request on both magiclantern_simplified and magic-lantern_dannephoto_git repositories (may be helpful for 5D3 users until your codebase properly covers 100% of Danne's features - if not already the case).

Regarding the allocation segmentation, I'll first try a slightly different drawing technique that might allow a full-resolution colored preview with the current buffer size: building a look-up table only around the RAW data pointers and doing a simple linear pointer increment in the LV buffer. This should be straightforward in most cases (horizontal layout), and potentially slightly faster too, but for some vertical aspect ratios I need to find a way to "skip" some pixels to feed the left & right black bars without relying on stride and column-skipping techniques, which are less efficient.
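
To make the idea more concrete, here is one possible way to sketch it in C. This is an assumption about how it could be done, not the code I ended up with: the LV pointer only ever moves forward, the table holds one RAW index per visible pixel, and the side bars are filled in bulk per row. The names are hypothetical and the RAW data is simplified to already-converted 8-bit samples.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    int lv_w, lv_h;   /* LV area in pixels */
    int bar_w;        /* width of each black side bar (0 for horizontal layouts) */
} lv_layout_t;

/*
 * src[] holds one RAW index per visible pixel, in drawing order:
 * (lv_w - 2 * bar_w) * lv_h entries. raw8[] is the (simplified) RAW data.
 */
static void draw_linear(const lv_layout_t *l, const uint32_t *src,
                        const uint8_t *raw8, uint8_t *lv)
{
    const int visible = l->lv_w - 2 * l->bar_w;
    assert(visible >= 0);
    for (int y = 0; y < l->lv_h; y++) {
        memset(lv, 0, (size_t)l->bar_w);   /* left black bar */
        lv += l->bar_w;
        for (int x = 0; x < visible; x++)  /* visible pixels, table-driven */
            *lv++ = raw8[*src++];
        memset(lv, 0, (size_t)l->bar_w);   /* right black bar */
        lv += l->bar_w;
    }
}
```

With bar_w set to 0 the two memset calls do nothing and the loop degenerates to the plain horizontal case, so the same routine would cover both layouts.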

a.sintes

Hi,

Finally got time to update the previous code and halve the amount of memory required for the look-up table, meaning I've finally managed to make it work for both colored & grayscaled preview in half and/or quarter horizontal resolution.

I've recorded an extensive presentation of this experimental feature, so please refer to the following video first:



New menus are added in the mlv_lite module ("RAW video" menu), providing the following options:


"Framed preview": gives access to the framed preview configuration and ultrafast modes:

"Engine": switches between the legacy ("no cache") and the experimental "ultrafast" framed preview engine
"Style": either sticks to the current style behavior (colored preview at rest/when possible in LV freeze, grayscaled when recording/in need of speed), or forces the display to be always colored or always grayscaled
"Timing": switches between legacy and experimental "ultrafast" aggressive time tweaking, which reduces (or even removes) sleep statements to increase display performance, thanks to the faster drawing routine
"Statistics": (de)activates the statistics dump in the ML console, which displays something like "[Framed preview] 18.137ms 23.475fps" every 100 displayed frames, showing the average computation time consumed by the framed preview drawing routine and the global refresh framerate as perceived by the user

"Resolution": [works only with the "ultrafast" engine] individually selects the horizontal resolution of both the colored and the grayscaled preview, with a "half" (better quality but slower) or "quarter" (faster) resolution (note: with legacy, the colored preview always uses half resolution and the grayscaled preview always uses quarter resolution, by design)

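The "Statistics" line can be produced by a very small accumulator. The following C sketch only shows the averaging and formatting: the names are hypothetical, timestamps are assumed to be in microseconds, and the real code prints to the ML console rather than into a caller-provided buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define STATS_WINDOW 100   /* frames per report, as in the menu description */

typedef struct {
    unsigned frames;
    double   draw_us_sum;    /* time spent inside the drawing routine */
    double   frame_us_sum;   /* wall-clock time between displayed frames */
} preview_stats_t;

/* Accumulate one frame; returns 1 and fills `out` when a report is due. */
static int preview_stats_add(preview_stats_t *s, double draw_us,
                             double frame_us, char *out, size_t out_len)
{
    assert(s && out && out_len > 0);
    s->draw_us_sum  += draw_us;
    s->frame_us_sum += frame_us;
    if (++s->frames < STATS_WINDOW) return 0;

    double draw_ms = s->draw_us_sum / s->frames / 1000.0;  /* mean draw time */
    double fps     = 1e6 * s->frames / s->frame_us_sum;    /* perceived rate */
    snprintf(out, out_len, "[Framed preview] %.3fms %.3ffps", draw_ms, fps);

    memset(s, 0, sizeof *s);   /* start a new window */
    return 1;
}
```

Keeping the draw time and the frame period in two separate sums is what allows the report to show both the cost of the routine itself and the frame rate the user actually sees.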


Below are the current benchmark statistics dumped on my 5D3, with values averaged over 1000 display loops:

style | engine    | hz. res. | timing    | draw(ms) | gain      | fps    | gain
color | legacy    | half     | legacy    | 180.211  | ref.      |  4.451 | ref.
color | ultrafast | half     | legacy    |  78.703  | x2.29     |  8.200 | x1.84
color | ultrafast | half     | ultrafast |  78.230  | x2.30     |  9.360 | x2.10
color | ultrafast | quarter  | legacy    |  51.162  | x3.52(*1) | 10.429 | x2.34(*1)
color | ultrafast | quarter  | ultrafast |  53.157  | x3.39(*1) | 12.602 | x2.83(*1)

style     | engine    | hz. res. | timing    | draw(ms) | gain      | fps    | gain
grayscale | legacy    | quarter  | legacy    | 46.452   | ref.      | 10.840 | ref.
grayscale | ultrafast | half     | legacy    | 33.810   | x1.37(*2) | 12.639 | x1.17(*2)
grayscale | ultrafast | half     | ultrafast | 31.939   | x1.45(*2) | 17.840 | x1.65(*2)
grayscale | ultrafast | quarter  | legacy    | 22.598   | x2.06     | 15.184 | x1.40
grayscale | ultrafast | quarter  | ultrafast | 18.137   | x2.56     | 23.475 | x2.17

(*1): no legacy reference
(*2): no legacy reference, but we got performance gain even when compared to legacy quarter resolution!

As we can see, the look-up table technique generally makes the preview drawing routine between 2 and 2.5 times faster than before, which roughly doubles the overall display frame rate. The good news is that we now get a very usable quarter-resolution colored preview (~13fps) and, above all, an almost "realtime" (24fps) grayscale preview, even when recording.

This new version only requires a 675KB memory allocation to work, even at half resolution, which should be sustainable on most cameras.

As far as I have tested, it works for every resolution and aspect ratio - even in anamorphic modes - with roughly the same performance increase (the LV feeding process is always much the same), and I've encountered no specific issue (I've fixed the previous one around the black bars).

This archive provides the latest version of the code (to be applied on Danne's repository), including the following changes:
- src/raw.h [117-148] framed preview configuration management
- src/raw.c [2209-2590] additional ultrafast drawing routines ("templated" C code)
- src/raw.c [2591-2702] alteration of the existing raw_preview_fast_ex function for ultrafast
- modules/mlv_lite/mlv_lite.c [184-189], [1317-1345], [4076-4141], [4716-4722] & [4761-4766] framed preview menu & configuration management
- modules/mlv_lite/mlv_lite.c [4600-4615] alteration of the existing raw_rec_update_preview function for ultrafast time tweaking
- modules/mlv_play/mlv_play.c [1511-1512] alteration of the existing mlv_play_render_task function for ultrafast time tweaking


If you want to test it without a local repository build, you can use the two following links:
precompiled 5D3.113 build
precompiled 5D3.123 build

Enjoy and have a nice weekend!

andy kh

I'm not able to download. The link is not working.
Edit: downloaded, thanks.
5D Mark III - 70D

Mattia

Hi Sintes! Thanks for your great work! I've installed your latest build on my 5d3 but it seems some parts of the menu are missing, specifically all the presets are empty, so I can't select one. Am I doing something wrong?

andy kh

Quote from: Walter Schulz on August 27, 2023, 02:06:29 PM
Link is valid and working for me.
Try different browser and/or VPN.
Tried a different browser and it works, thanks.
5D Mark III - 70D

Mattia

Did you install it on top of another ML installation (overwriting the files) or did you format your sd card before?

a.sintes

Yep, currently you need to install it on top of Danne's latest build:
1. install Danne's build
2. download the ultrafast archive & extract it
3. overwrite Danne's ML with the archive content (all files)

No better option for now, as it's not integrated in any repository.

a.sintes

As for the archive download link, I suppose it's because the files are hosted on my non-HTTPS website.

Danne

Quote from: a.sintes on August 27, 2023, 02:34:41 PM
Yep, currently you need to install it on top of Danne's latest build:
1. install Danne's build
2. download the ultrafast archive & extract it
3. overwrite Danne's ML with the archive content (all files)

No better option for now, as it's not integrated in any repository.

Feel free to post full builds. Great work 👍

a.sintes

Thanks!

Links to full builds (based on Danne's repository 2023-02-03 source code, including all modules & Lua scripts + ultrafast preview modifications + additional cinematographer module):
- crop_rec_4k_mlv_snd_isogain_1x3_presets_ultrafast_2023Aug27.5D3113
- crop_rec_4k_mlv_snd_isogain_1x3_presets_ultrafast_2023Aug27.5D3123

Some warnings may occur during the download due to the non-HTTPS web hosting (you can safely bypass them).

Skinny

Very nice progress! Really, the mlv_play speed increase especially could be very useful! Will it ever be ported to other cameras, like the 5D2?

a.sintes

As discussed before with names_are_hard, the modifications are mainly in the raw.c source code, which seems to be largely common to all the camera models in ML, so yes, I have good hope it will be available for all compatible cameras! (At least the ultrafast framed preview itself; I need to check the timing tweak, which is more tied to the mlv_lite & mlv_play modules.)

I'm waiting for some user feedback on the 5D3 to ensure it works as expected, then I'll open a merge request on both the magiclantern_simplified and magic-lantern_dannephoto_git repositories.
I don't know if the 5D2 is currently covered by the simplified repo?

Bruno Moly

I would love to record 2.35:1 in 5.7K with the 5D3.
Would the fast framed preview work? I would prefer filming in full frame, I mean with "RATIO" turned off.
Canon EOS 5D  iii

a.sintes

I don't exactly get what you're trying to achieve, but anyway, if the preview was OK before, yes, it will work: I've not changed the way Danne's ML works on the 5D3 (resolution, aspect ratio, presets etc.), I've just increased the software performance of the "framed preview" :)

Mattia

Quote from: a.sintes on August 27, 2023, 06:34:23 PM
I'm waiting for some user feedback on the 5D3 to ensure it works as expected, then I'll open a merge request on both the magiclantern_simplified and magic-lantern_dannephoto_git repositories.
I don't know if the 5D2 is currently covered by the simplified repo?

I tried it yesterday and it works as expected, but only when I set things up for continuous recording. Then it's all perfect. If I exaggerate the resolution or bit depth and the camera stops recording after a few frames, I don't get a fast preview, but I guess this is normal behavior.

a.sintes

Yes, of course it doesn't change the limitations of the writing process on the camera (in UHD I personally keep using "LV freeze", which also benefits from the ultrafast mode in a way). Anyway, maybe you can avoid some recording drops by activating the ultrafast engine without the timing tweak: this leaves some extra CPU resources that may help avoid buffer saturation and allow longer recordings than before.

We need to keep in mind that, most of the time, when the display freezes it's because the write buffer is starting to saturate, so everything else on the camera except the RAW writing process needs to stop to have a chance of keeping the recording alive...

Mattia

Anyway, yes, it works very well and the preview is much more fluid and responsive than before! Thanks for this great hack! :)

Bender@arsch

I've tested it and it works very well, good job.
It's nice to see that things keep progressing.

But with some side effects/ problems:

- If I use grayscale preview, I can't switch back from crop preview when I press half shutter or the star button for focusing, for example

- (Raw) zebras don't hide while recording now (I set "hide while record", but it has no effect) -> I need to disable them manually every time

- Write speed with ultrafast grayscale is lower now, I think -> is it possible to set the colored preview only, without changing to grayscale while recording?

I haven't tested the Play preview or checked the raw data for errors....

a.sintes

Thanks a lot for the testing & feedback!

- I can't reproduce your first issue: I can press half shutter to auto-focus, then it goes back properly to the preview when released.
Please note, though, that I do see a bad side effect of this back-and-forth, with the black bars not being properly cleared when going back (I need to find where to add the piece of code that clears the screen so it always works; I thought it was solved...), but it doesn't affect the preview rectangle itself

- Just checked, and the zebras are properly hidden here with the "hide while record" option set. That said, the zebra display gives some results that look strange to me... I need to check whether it looked the same before with Danne's original repository, and I'll keep you posted: as I've rebuilt the whole thing with a more recent compiler, maybe it adds some unexpected side effects

- This is a potential side effect of the "timing tweak" option: can you please try disabling it (keeping the "ultrafast" engine on) and check whether the writing speed is the same as before?
As explained, this option is quite experimental, as I don't know exactly whether the initial magic numbers set in the recording code were wisely chosen or not, so maybe I've reduced them too much :-)
For the second part: yes, you can now get the colored preview during recording, just switch to "all colored" in the "style" sub-menu. You can also downscale its horizontal resolution to "quarter" to be faster (actually, the quarter colored preview is now faster than the previous quarter grayscaled one, so it's worth it).

Bender@arsch

Thanks for the reply. I tested it again and all problems are gone 🤔... Only twice (out of maybe 20 tries) did I get stuck in crop preview, and I can't reproduce it either. But after refreshing live view (double press menu), everything works again.

I think i need more testing 😅

a.sintes

I've encountered this too from time to time (specifically with the UHD preset, less with 3.5K, and particularly when playing too much with preset/resolution changes), but it has been there for as long as I've been using Danne's regular builds on my 5D3 :-)
As you guessed, going back and forth in the ML menus generally solves it; sometimes we just need to restart the camera.