MLV App 1.14 - All in one MLV Video Post Processing App [Windows, Mac and Linux]

Started by ilia3101, July 08, 2017, 10:19:19 PM

masc

What do you mean by "optimized"?! What should be different for 1x3? MLVApp is optimized for all the options we have - as well as we can ;)
5D3.113 | EOSM.202

mlrocks

Quote from: masc on May 20, 2022, 09:29:59 PM
What do you mean by "optimized"?! What should be different for 1x3? MLVApp is optimized for all the options we have - as well as we can ;)

The sharpening right now is good for 1x1, not for 1x3?

masc

I recommend sharpening 1x3 footage (if necessary at all) in the NLE after MLVApp export. For an improvement inside MLVApp we don't need a 1x3 optimization option - we would need a complete rewrite with a different concept.
5D3.113 | EOSM.202

mlrocks

Quote from: masc on May 20, 2022, 10:58:23 PM
I recommend sharpening 1x3 footage (if necessary at all) in the NLE after MLVApp export. For an improvement inside MLVApp we don't need a 1x3 optimization option - we would need a complete rewrite with a different concept.

OK. It is not as simple as I thought. Thanks for the explanation.

mlrocks

Apple ARM M1 is an 8 core cpu with a typical tdp of 15w, passmark 15000, single thread 4000.
Intel Xeon E3-1270 V2 is a 4 core 8 thread cpu with a tdp of 70w, passmark 6000, single thread 2000.
The hardware alone does not explain the 4-fold difference in Full HD export speed; the benchmark scores differ only by about 2 to 2.5 times.
Maybe the mlv app is more optimized for the Apple hardware?


Intel i3-3240 is a 2 core 4 thread cpu with a tdp of 55w, passmark 2300, single thread 1800.
AMD Athlon 5370 is a 4 core 4 thread cpu with a tdp of 25w, passmark 2000, single thread 750.
Both have a similar Passmark score and 4 threads, yet the i3-3240 is about twice as fast as the Athlon 5370. Its single-thread score is about 2.4 times higher (1800 vs. 750); is single-thread performance the key here? If so, it cannot explain the two-fold difference between the E3-1270 V2 and the i3-3240.
Perhaps the low-level C libraries are tuned more for Intel's architecture than for AMD's?

Perhaps Passmark is not a good criterion here?


masc

Rendering on macOS has always been faster than on Windows. When I compare some of my systems: a dual-core i5 2.4GHz with macOS performs similarly to a quad-core i7 3.4GHz with Win10.
There is no optimization for certain systems in MLVApp. The code is identical (apart from AVFoundation export, which is Apple only, and post export scripts) for all OS. The only differences are the compilers and the operating system itself.
On Linux I saw very similar performance to macOS, but I don't really use it.
The code is open source and online. If you find optimized compilers for some CPU architecture: please try whether you get more performance. Compiling is very easy.
5D3.113 | EOSM.202

mlrocks

Quote from: masc on May 21, 2022, 12:52:29 PM
Rendering on macOS has always been faster than on Windows. When I compare some of my systems: a dual-core i5 2.4GHz with macOS performs similarly to a quad-core i7 3.4GHz with Win10.
There is no optimization for certain systems in MLVApp. The code is identical (apart from AVFoundation export, which is Apple only, and post export scripts) for all OS. The only differences are the compilers and the operating system.
On Linux I saw very similar performance to macOS, but I don't really use it.

Maybe the MacOS and Linux kernels are more efficient. I remember that MacOS later versions use Linux kernel, not sure if it is still so on Arm M1 MBA.

masc

Quote from: mlrocks on May 21, 2022, 12:55:57 PM
Maybe the MacOS and Linux kernels are more efficient. I remember that MacOS later versions use Linux kernel, not sure if it is still so on Arm M1 MBA.
This is what I think. macOS is based on the Unix OS family.
5D3.113 | EOSM.202

mlrocks


names_are_hard

Quote
I remember that MacOS later versions use Linux kernel
MacOS has never had a Linux kernel.

Quote
Better avoid AMD cpus and video cards. Something I was not aware before.
Please don't give completely unfounded advice.  This is just untrue.  It's probably the *opposite* of true for MLVApp, which is heavily threaded; current Zen based AMD CPUs normally beat Intel (and Apple M1) based systems in multi-threaded workloads.

Don't guess.  Don't assume benchmarks *on a different piece of software* will be representative.  Benchmarking is notoriously complicated.  Benchmark using MLVApp on different systems; it's free and available on all OSes!  If somebody gives me an MLV file and instructions, I'll bench on my modern 8c/16t AMD Linux system.

mlrocks

Quote from: names_are_hard on May 21, 2022, 05:16:01 PM
MacOS has never had a Linux kernel.
Please don't give completely unfounded advice.  This is just untrue.  It's probably the *opposite* of true for MLVApp, which is heavily threaded; current Zen based AMD CPUs normally beat Intel (and Apple M1) based systems in multi-threaded workloads.

Don't guess.  Don't assume benchmarks *on a different piece of software* will be representative.  Benchmarking is notoriously complicated.  Benchmark using MLVApp on different systems; it's free and available on all OSes!  If somebody gives me an MLV file and instructions, I'll bench on my modern 8c/16t AMD Linux system.

I agree with you that Passmark may not be very suitable for MLV App. Using MLV App itself is probably a better way to benchmark. I am converting almost all of my home systems for MLV App now, so it makes sense to use it as the benchmarking criterion.

I built my first video/gaming AMD desktop about 8 years ago, with a Sabertooth 990FX motherboard, an overclocked AMD FX-8350, an AMD R9 295X2 8GB video card, 32GB of DDR3 RAM and a SATA3 SSD. At the time it was extremely powerful. I enjoyed 2K gaming long before it became mainstream. RawTherapee raw photo editing was fully real time on this system. The setup would have cost twice as much with the Intel ecosystem, and five times as much as a Mac Pro. The system died after several years of intensive use. I think that it was my fault for not optimizing the air flow well enough, even though there was a liquid cooling system for the CPU and a separate one for the video card.

Anyway, I like AMD's lower prices, and I am very happy that AMD is leading Intel now. But I recently realized that the AMD upgrade path has less headroom than Intel-based motherboards. Also, I just realized that AMD CPUs and GPUs are not well supported by low level compilers. As an end user, I really hope AMD can address these two main issues, so that I can build my new system on Zen 4 or Zen 5. Cheers,


names_are_hard

Quote
Also, I just realized that AMD CPUs and GPUs are not well supported by low level compilers

This is not true.  The article you linked to was about a specific *Intel* library that *Intel* deliberately made to be poorly optimised for AMD CPUs.  It has no relevance to "low level compilers".  Only one specific Intel library.  MLV App isn't even using that library!  Compilers optimise just fine on AMD.

Benchmarking is always quite tricky.  Since we can run MLVApp anywhere, when you're interested in MLVApp performance, it's the obvious best choice to use.  Ideally, script running MLVApp, always with the same input file, processing options etc.  That way you can get reproducible results, and you can share the script with other people so they can test on their systems.  It would be cool to see a set of results across a lot of different systems.
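
A minimal sketch of such a script, in Python. It only times an arbitrary command several times and reports the best and median wall-clock time. The actual command line for a scripted MLVApp export is not given in this thread, so the command below is a placeholder that must be replaced; `time_command` and `summarize` are hypothetical helper names.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, runs=3):
    """Run `cmd` `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken run is never counted.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Report best and median time; the median is less sensitive to one noisy run."""
    return {
        "runs": len(durations),
        "best_s": min(durations),
        "median_s": statistics.median(durations),
    }


if __name__ == "__main__":
    # Placeholder (assumption): replace with the real export command,
    # always using the same input clip and the same processing options.
    cmd = [sys.executable, "-c", "pass"]
    print(summarize(time_command(cmd)))
```

Sharing the script together with the input clip keeps the command, the clip and the number of runs identical across systems, which is what makes the numbers comparable.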

mlrocks

Quote from: names_are_hard on May 21, 2022, 08:40:09 PM
This is not true.  The article you linked to was about a specific *Intel* library that *Intel* deliberately made to be poorly optimised for AMD CPUs.  It has no relevance to "low level compilers".  Only one specific Intel library.  MLV App isn't even using that library!  Compilers optimise just fine on AMD.

Benchmarking is always quite tricky.  Since we can run MLVApp anywhere, when you're interested in MLVApp performance, it's the obvious best choice to use.  Ideally, script running MLVApp, always with the same input file, processing options etc.  That way you can get reproducible results, and you can share the script with other people so they can test on their systems.  It would be cool to see a set of results across a lot of different systems.

For benchmark testing, I leave all MLV App parameters at their defaults and export to H.264 MP4 at high quality, which Blender VSE can import.

Skinny

Quote from: names_are_hard on May 21, 2022, 08:40:09 PM
It would be cool to see a set of results across a lot of different systems.
I like this idea. We need to find a nice short MLV so anyone can benchmark and share the result. Later it can also be used to measure how certain processing algorithms behave on different systems.

Walter Schulz

Quote from: Skinny on May 22, 2022, 07:27:53 AM
I like this idea. We need to find a nice short MLV so anyone can benchmark and share the result. Later it can also be used to measure how certain processing algorithms behave on different systems.

Blast from the past: https://www.magiclantern.fm/forum/index.php?topic=18999.0

dream951

Quote from: masc on May 20, 2022, 05:52:20 PM
1min RAW 1x3 5.7K (1920x2340) 24fps on M1 MacBookAir:
- playback 11fps
- export with ffmpeg to 5760x2340 H.264 high with MLVApp default: 14:20min
- export with AVFoundation to 5760x2340 H.264 with MLVApp default: 11:00min

1min RAW 1856x1044 25fps on M1 MacBookAir:
- playback 50fps
- export with AVFoundation to 1856x1044 H.264 with MLVApp default (AMaZE): 1:20min (edited: 1:00min was wrong here, measured again, so no realtime)
- export with AVFoundation to 1856x1044 H.264 with MLVApp default (bilinear): 0:30min (more than realtime, good enough as proxy)

1min RAW 1x3 1440x1836 24fps on M1 MacBookAir:
- playback 21fps (with alexa log-c preset 24fps)
- export with AVFoundation to 4320x1836 H.264 with MLVApp default: 5:20min
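
To compare the timings quoted above more directly, they can be converted into effective export frame rates. This is only a small arithmetic sketch: it assumes each clip is exactly 60 seconds long at the stated frame rate, and `mmss_to_seconds` and `export_fps` are hypothetical helper names.

```python
def mmss_to_seconds(text):
    """Convert a 'minutes:seconds' string such as '11:00' to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def export_fps(clip_seconds, clip_fps, export_time):
    """Frames in the clip divided by the wall-clock export time."""
    return clip_seconds * clip_fps / mmss_to_seconds(export_time)


# (label, clip frame rate, export time) taken from the quoted measurements.
results = [
    ("5.7K 1x3, ffmpeg", 24, "14:20"),
    ("5.7K 1x3, AVFoundation", 24, "11:00"),
    ("1856x1044, AMaZE", 25, "1:20"),
    ("1856x1044, bilinear", 25, "0:30"),
    ("1440x1836 1x3", 24, "5:20"),
]
for label, fps, export_time in results:
    rate = export_fps(60, fps, export_time)
    # A realtime factor above 1.0 means the export is faster than playback speed.
    print(f"{label}: {rate:.2f} fps, {rate / fps:.2f}x realtime")
```

For example, the bilinear export of the 25fps clip in 0:30 works out to 50 frames per second, twice realtime, which matches the "more than realtime" remark.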

Hello masc!
My laptop has an RTX2060, which supports export via ffmpeg to h265_nvenc, and that gives a severalfold increase in performance. I tested it in Fast CinemaDNG Processor - the result is very good. If MLVApp also offered h265_nvenc as an export setting, would it give a performance boost, or is the speed due to Fast CinemaDNG Processor using a different MLV/CDNG export algorithm?
Thanks!
5DIII crop_rec_4k_mlv_snd_isogain_1x3_presets_2022May15.5D3123
Sigma Art 24/1.4 50/1.4 135/1.8 + Canon 24-105/4 70-200/4 + Samyang 8/3.5 + Helios 58-2

masc

Quote from: dream951 on May 23, 2022, 08:16:13 AM
My laptop has an RTX2060, which supports export via ffmpeg to h265_nvenc, and that gives a severalfold increase in performance. I tested it in Fast CinemaDNG Processor - the result is very good. If MLVApp also offered h265_nvenc as an export setting, would it give a performance boost, or is the speed due to Fast CinemaDNG Processor using a different MLV/CDNG export algorithm?
Unfortunately I can't add this on any of my computers, as none of them has an NVIDIA graphics card. So this won't work here - same as "Fast CinemaDNG Processor", which doesn't work on any of my computers. But the code is open source and it isn't difficult to add this to an MLVApp experimental branch, if you like.
5D3.113 | EOSM.202

bouncyball

Quote from: dream951 on May 23, 2022, 08:16:13 AM
If MLVApp also offered h265_nvenc as an export setting, would it give a performance boost, or is the speed due to Fast CinemaDNG Processor using a different MLV/CDNG export algorithm?
Thanks!
Maybe, maybe not, with an nvenc-enabled ffmpeg build (encoding only). CDNG export will not be accelerated at all.
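
Whether a given ffmpeg build includes an NVENC encoder at all can be checked from the output of `ffmpeg -encoders`. The sketch below is one way to do that in Python. Note that ffmpeg names its H.265 NVENC encoder `hevc_nvenc`; the helper names `has_encoder` and `ffmpeg_supports` are hypothetical, and the exact column layout of the listing may vary between ffmpeg versions.

```python
import shutil
import subprocess


def has_encoder(listing, name):
    """Return True if `name` appears as an encoder in `ffmpeg -encoders` output."""
    for line in listing.splitlines():
        fields = line.split()
        # Encoder rows look roughly like: " V....D hevc_nvenc  NVIDIA NVENC hevc encoder"
        # so the encoder name is the second whitespace-separated field.
        if len(fields) >= 2 and fields[1] == name:
            return True
    return False


def ffmpeg_supports(name, ffmpeg="ffmpeg"):
    """Ask the ffmpeg found on PATH whether it was built with encoder `name`."""
    path = shutil.which(ffmpeg)
    if path is None:
        return False  # no ffmpeg binary available on this system
    result = subprocess.run([path, "-hide_banner", "-encoders"],
                            capture_output=True, text=True)
    return has_encoder(result.stdout, name)


if __name__ == "__main__":
    print("hevc_nvenc listed:", ffmpeg_supports("hevc_nvenc"))
```

A listed encoder only means the build was compiled with it; it still needs a supported NVIDIA GPU and driver at run time, and it speeds up only the encoding stage, not the raw processing before it.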

bouncyball

Quote from: names_are_hard on May 21, 2022, 08:40:09 PM
It would be cool to see a set of results across a lot of different systems.
Absolutely agree.

Once in 2018 I benched MLVApp playback (no export) on a 160-core (4 CPU) Xeon system and ... well, have a look yourself:
Link

dream951

Quote from: bouncyball on May 23, 2022, 12:28:01 PM
Maybe, maybe not, with an nvenc-enabled ffmpeg build (encoding only). CDNG export will not be accelerated at all.
I also thought that encoding from CDNG would be the bottleneck of this process.
Thanks!
5DIII crop_rec_4k_mlv_snd_isogain_1x3_presets_2022May15.5D3123
Sigma Art 24/1.4 50/1.4 135/1.8 + Canon 24-105/4 70-200/4 + Samyang 8/3.5 + Helios 58-2

dream951

Quote from: masc on May 23, 2022, 09:01:10 AM
Unfortunately I can't add this on any of my computers, as none of them has an NVIDIA graphics card. So this won't work here - same as "Fast CinemaDNG Processor", which doesn't work on any of my computers. But the code is open source and it isn't difficult to add this to an MLVApp experimental branch, if you like.
It's a pity that I'm not good at coding and I can't help you in any way:(
5DIII crop_rec_4k_mlv_snd_isogain_1x3_presets_2022May15.5D3123
Sigma Art 24/1.4 50/1.4 135/1.8 + Canon 24-105/4 70-200/4 + Samyang 8/3.5 + Helios 58-2

mlrocks

Quote from: bouncyball on May 23, 2022, 12:31:57 PM
Absolutely agree.

Once in 2018 I benched MLVApp playback (no export) on a 160-core (4 CPU) Xeon system and ... well, have a look yourself:
Link

160 threads xeon quad cpu server/workstation only gave 8 fps playback with default settings? What was the video card on this system?
It seems to me that playback depends more on the video card than on the CPU. Exporting is certainly CPU dependent.
Otherwise, how can we explain that a 160-thread Xeon machine only gave 8 fps, whereas the M1 MBA plays back faster? I don't think the M1 improved that vastly on CPU speed, but rather on power saving.

theBilalFakhouri

Quote from: names_are_hard on May 21, 2022, 08:40:09 PM
It would be cool to see a set of results across a lot of different systems.
Quote from: bouncyball on May 23, 2022, 12:31:57 PM
Absolutely agree.

I suggest that someone record a few MLV clips in different resolutions/modes and upload them somewhere. Users could then benchmark by exporting the clips to some compressed codecs like H.264 and ProRes, *writing down the processing time for each clip, and posting the results in a dedicated forum thread (or maybe in a section on the mlv.app domain where results can be posted in an organized way).

*Could we implement a function in MLVApp that saves the MLV info, the codec used, system info and processing time for each MLV clip to a log file? That could be useful too.
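
Until something like this exists inside MLVApp, a log of this kind could be kept by hand or by a wrapper script. The following is only a sketch of what one log entry might contain, written in Python rather than in MLVApp's own code; the field names, the file name `mlv_bench_log.jsonl` and the helpers `make_record` and `append_record` are all assumptions for illustration.

```python
import json
import os
import platform
import tempfile


def make_record(clip, codec, seconds):
    """Build one benchmark record with basic system information."""
    return {
        "clip": clip,
        "codec": codec,
        "seconds": round(seconds, 2),
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "logical_cpus": os.cpu_count(),
    }


def append_record(path, record):
    """Append the record as one JSON line so logs from many users are easy to merge."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


if __name__ == "__main__":
    # Hypothetical example values: clip name, codec and export time in seconds.
    log = os.path.join(tempfile.gettempdir(), "mlv_bench_log.jsonl")
    append_record(log, make_record("example.mlv", "H.264", 80.0))
```

One JSON object per line keeps each result self-describing, so results posted by different users can simply be concatenated and compared.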

theBilalFakhouri

Quote from: mlrocks on May 23, 2022, 10:51:35 PM
160 threads xeon quad cpu server/workstation only gave 8 fps playback with default settings? ..

Single thread performance seems much more important than multi threaded for playback. e.g. the lowest M1 Macbook destroys my Ryzen 3900x in terms of single thread performance (~30% faster) --> M1 has faster playback speed.
I can do some playback/export tests if there are some shared MLV clips, so other users can run the same tests with the same clips on their systems.

@masc You have shot some amazing clips before using 5D3/EOS M, it would be cool if you shorten some of them and upload them somewhere (if you still have them and don't mind :) ).