Magic Lantern (RAW) Video format v2.0 (mlv_rec.mo)

Started by g3gg0, July 15, 2013, 10:58:23 PM

g3gg0

Hey.

after alex spent a lot of time to find out how we can squeeze out the last bit of performance while
writing raw video to SD and CF cards, i used the last days to think about how to structure the
raw videos to make the post processing easier and the format more extensible.

the result is our next Magic Lantern Video format (.mlv) i want you to look at.
use it at your own risk.

for users:
mlv_rec: nightly download page.
mlv_dump: most recent nightly download page. (binary for WINDOWS only)

mlv_dump: or here (binaries for WINDOWS, LINUX and OSX)

for developers:
mlv file structures in C: here (LGPL)

preferred: you can export .dng frames from the recorded video using "mlv_dump --dng <in>.mlv -o <prefix>"
legacy mode: post processing is still possible with 'raw2dng' after converting the .mlv into the legacy .raw format using mlv_dump.


for details see the description below.
see the short video i made: http://www.youtube.com/watch?v=A6pug1g-kNs
it shows a bunch of the new (user visible) features of that file format.

mlv_dump
- used for debugging and converting .mlv files
- can dump .mlv to legacy .raw + .wav files
- can dump .mlv to .dng  + .wav
- can compress and decompress frames using LZMA
- convert bit depth (any depth in range from 1 to 16 bits)

compression:
you can get a data reduction of ~60% with 12 bit files.
downconverting to 8 bits gives you about 90% data reduction.
this feature is for archiving your footage.
converting back to e.g. legacy raw doesnt need any additional parameters - mlv_dump will decompress and convert transparently.

parameters:

-o output_file      set the filename to write into
-v                  verbose output

-- DNG output --
--dng               output frames into separate .dng files. set prefix with -o
--no-cs             no chroma smoothing
--cs2x2             2x2 chroma smoothing
--cs3x3             3x3 chroma smoothing
--cs5x5             5x5 chroma smoothing

-- RAW output --
-r                  output into a legacy raw file for e.g. raw2dng

-- MLV output --
-b bits             convert image data to given bit depth per channel (1-16)
-z bits             zero the lowest bits, so we have only specified number of bits containing data (1-16) (improves compression rate)
-f frames           stop after that number of frames
-x                  build xref file (indexing)
-m                  write only metadata, no audio or video frames
-n                  write no metadata, only audio and video frames
-a                  average all frames in <inputfile> and output a single-frame MLV from it
-s mlv_file         subtract the reference frame in given file from every single frame during processing
-e                  delta-encode frames to improve compression, but lose random access capabilities
-c                  (re-)compress video and audio frames using LZMA (set bpp to 16 to improve compression rate)
-d                  decompress compressed video and audio frames using LZMA
-l level            set compression level from 0=fastest to 9=best compression
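to illustrate what zeroing the lowest bits (-z) means for a single sample, here is a minimal python sketch. it is not the mlv_dump code: zero_low_bits and its arguments are hypothetical names, and it assumes samples are unsigned integers where the most significant bits are the ones that get kept.

```python
def zero_low_bits(sample: int, depth: int, keep: int) -> int:
    """Keep the `keep` most significant bits of a `depth`-bit sample.

    Assumption for illustration: the bits that remain are the upper ones,
    and the dropped lower bits are simply set to zero.
    """
    if not 1 <= keep <= depth:
        raise ValueError("keep must be between 1 and depth")
    drop = depth - keep
    # shift the low bits out, then shift zeros back in
    return (sample >> drop) << drop


# a 14 bit sample reduced to 12 bits of real data
print(hex(zero_low_bits(0x3FFF, 14, 12)))
```

after this step the lowest bits are identical in every sample, which is presumably why the compressor gets a better rate.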




examples:

# show mlv content (verbose)
./mlv_dump -v in.mlv

# will dump frames 0 through 123 into a new file
# note that ./mlv_dump --dng -f 0 in.mlv (or ./mlv_dump --dng -f 0-0 in.mlv) will now extract just frame 0 instead of all of the frames.
./mlv_dump -f 123 -o out.mlv in.mlv

# prepare an .idx (XREF) file
./mlv_dump -x in.mlv

# compress input file
./mlv_dump -c -o out.mlv in.mlv

# compress input file with maximum compression level 9
./mlv_dump -c -l 9 -o out.mlv in.mlv

# compress input file with maximum compression level 9 and improved delta encoding
./mlv_dump -c -e -l 9 -o out.mlv in.mlv

# compress input file with maximum compression level 9, improved delta encoding, 16 bit alignment which improves compression and 12 bpp
./mlv_dump -c -e -l 9 -z12 -b16 -o out.mlv in.mlv

# decompress input file
./mlv_dump -d -o out.mlv in.mlv

# convert to 10 bit per pixel
./mlv_dump -b 10 -o out.mlv in.mlv

# convert to 8 bit per pixel and compress
./mlv_dump -c -b 8 -o out.mlv in.mlv

# create legacy raw, decompress and convert to 14 bits if needed
./mlv_dump -r -o out.raw in.mlv



Play MLV Files

MLRawViewer

baldand implemented an amazing video player that is using OpenGL and is able to convert your .raw/.mlv into ProRes directly.
even i use it as my playback tool, so consider it as the official player. ;)

see: http://www.magiclantern.fm/forum/index.php?topic=9560.0

MLV_Viewer

see here for a MLV player on windows



in-camera mlv_play:
the module mlv_play.mo is shipped with the pre-built binaries.
it is a plugin for file_man.mo to play .raw and .mlv files in camera.
the discussion thread for this module is there

Drastic Preview:
the guys over at drastic.tv are currently implementing the MLV format and already have a working non-open beta version. (i tried it already and i love it :) )
i am sure within the next weeks they will release a new version.
http://www.drastic.tv/index.php?option=com_content&view=category&id=42&Itemid=79





some technical facts:
- structured format
- extensible layout
- as a consequence, we can start with the minimal subset (file header, raw info and then video frames)
- multi-file support (4 GiB splitting is enforced)
- spanning support (write to CF and SD in parallel to gain 20MiB/s)
- out-of-order data support (frames are written in some random order, depending on which memory slot is free)
- audio support
- exact clock/frametime support (every frame has the hardware counter value)
- RTC information (time of day etc)
- align fields in every frame (can differ from frame to frame)

the benefit for post processing will be:
- files can be easily grouped by processing SW due to UIDs and file header information (autodetect file count and which files belong to each other)
- file contains a lot of shooting information like camera model, S/N and lens info
- lens/focus movement can be tracked (if lens reports)
- exact* frame timing can be determined from hw counter values (*=its accuracy is the limiting thing)
- also frame drops are easy to detect
- hopefully exact audio/video sync, even with frame drops
- unsupported frames can be easily skipped (no need to handle e.g. RTC or LENS frames if the tool doesnt need them)
- specified XREF index format to make seeking easier, even with out of order data and spanning writes
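as a rough illustration of the frame drop detection mentioned above, here is a minimal python sketch. find_dropped_frames is a hypothetical helper, not part of any ML tool. it assumes the frame timestamps are already sorted and that the nominal frame interval is given in the same unit as the timestamps.

```python
def find_dropped_frames(timestamps, frame_interval):
    """Return (index, missing) pairs for gaps larger than one frame interval.

    timestamps:     sorted per-frame counter values
    frame_interval: nominal distance between two frames, same unit
    """
    drops = []
    for i in range(1, len(timestamps)):
        gap = timestamps[i] - timestamps[i - 1]
        # how many frame intervals fit into this gap, minus the expected one
        missing = round(gap / frame_interval) - 1
        if missing > 0:
            drops.append((i, missing))
    return drops


# five frames at a nominal spacing of 40000 units, one frame missing before index 3
print(find_dropped_frames([0, 40000, 80000, 160000, 200000], 40000))
```

rounding the gap makes the check tolerant against small jitter in the counter values.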

why a custom format and not reuse e.g. .mov?
- other formats are good, but none fits our needs
- hard to make frames align to sector or EDMAC sizes
- they dont support 14 bit raw bayer patterns out of the box
- even when using a flexible container, nearly all sub blocks would need custom additions
- this means a lot of effort to make the standard libs for those formats compatible
- its hard to implement our stuff in a clean way without breaking the whole format

thats the reason why i decided to throw out another format.
it is minimalistic when desired (especially the first implementation will only use a subset of the frames)
and can be extended step by step - while even the most minimalistic parser/post processing tool
can process the latest video files where all stuff is implemented.

if you are a developer (ML or even 3rd party tools) - look over it and make yourself comfortable with that format.
in case there is a bug or something doesnt make sense, please report it.
i would love to get feedback.

here is the link of the spreadsheet that is some kind of reference when designing the format:
https://docs.google.com/spreadsheet/ccc?key=0AgQ2MOkAZTFHdHJraTVTOEpmNEIwTVlKd0dHVi1ULUE#gid=0

implementer's notes
green = fully implemented
blue = implemented, but not 100%
red = not implemented yet, just defined

[MLVI] (once)
- MLVI block is the first block in every .mlv file
- the MLVI block has no timestamp, it is assumed to have timestamp value 0 if necessary
- the MLVI block contains a GUID field which is a random value generated per video shoot
- using the GUID a tool can detect which partial or spanning files belong together, no matter how they are named
- it is the only block that has a fixed position, all other blocks may follow in random order
- fileCount field in the header may get set to the number of total chunks in this recording (the current implementation on camera isn't doing this right)

[RAWI] (once, event triggered)
- this block is known from the old raw_rec versions
- whenever the video format is set to RAW, this block has to appear
- this block exactly specifies how to parse the raw data
- bit depth may be any value from 1 to 16
- settings apply to all VIDF blocks that come after RAWI's timestamp (this implies that RAWI must come before VIDF - at least the timestamp must be lower)
- settings may change during recording, even resolution may change (this is not planned yet, but be aware of this fact)

[VIDF] (periodic)
- the VIDF block contains encoded video data in any format (H.264, raw, YUV422, ...)
- the format of the data in VIDF blocks has to be determined using MLVI.videoClass
- if the video format requires more information, additional format specific "content information" blocks have to be defined (e.g. RAWI)
- VIDF blocks have a variable sized frameSpace which is meant for optimizing in-memory copy operations for address alignment. it may be set to zero or any other value
- the data right after the header is of the size specified in frameSpace and considered random, unusable data. just ignore it.
- the data right after frameSpace is the video data which fills up the rest until blockSize is reached
- the blockSize of a VIDF is therefore sizeof(mlv_vidf_hdr_t) + frameSpace + video_data which means that a VIDF block is a composition of those three data fields
- if frames were skipped, either a VIDF block with zero sized payload may get written or it may be completely omitted
- the format of the data in VIDF frames may change during recording (e.g. resolution, bit depth etc)
- whenever in time line a new content information block (e.g. RAWI) appears, the format has to get parsed and applies to all following blocks
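the VIDF layout described above (header, then frameSpace, then video data up to blockSize) can be sketched like this. a minimal python example, not taken from mlv_dump: vidf_payload_range is a hypothetical name, and header_size stands in for sizeof(mlv_vidf_hdr_t), whose value is not given in this post.

```python
def vidf_payload_range(block_offset, block_size, header_size, frame_space):
    """Return (start, length) of the video data of one VIDF block.

    block_offset: file offset where the VIDF block starts
    block_size:   blockSize field of the block
    header_size:  sizeof(mlv_vidf_hdr_t), passed in as an assumption
    frame_space:  frameSpace field, padding to be ignored
    """
    length = block_size - header_size - frame_space
    if length < 0:
        raise ValueError("blockSize is smaller than header + frameSpace")
    start = block_offset + header_size + frame_space
    return start, length


# example numbers only: 32 byte header, 64 bytes padding, 4096 bytes of video data
print(vidf_payload_range(1000, 32 + 64 + 4096, 32, 64))
```

a returned length of zero corresponds to the "skipped frame with zero sized payload" case.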

[WAVI] (once, event triggered)
- when the audio format is set to WAV, this block specifies the exact wave audio format

[AUDF] (periodic)
- see [VIDF] block. same applies to audio

[RTCI] (periodic, event triggered)
- contains the current time of day information that can be gathered from the camera
- may appear with any period, maybe every second or more often
- should get written before any VIDF block appears, else post processing tools cannot reliably extract frame time

[LENS] / [EXPO] / ... (periodic, event triggered)
- whenever a change in exposure settings or lens status (ISO, aperture, focal length, focus dist, ...) is detected a new block is inserted
- all video/audio blocks after these blocks should use those parameters

[IDNT] (once)
- contains camera identification data, like serial number and model identifier
- the camera serial number is written as HEX STRING, so you have to convert it to a 64 bit INTEGER before displaying it
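the serial number conversion can be sketched in python like this. serial_to_int is a hypothetical helper, and stripping NUL padding and whitespace is an assumption about how the string is stored.

```python
def serial_to_int(hex_serial: str) -> int:
    """Convert the hex string serial from IDNT into an integer."""
    cleaned = hex_serial.strip("\x00 \t\r\n")  # assumed padding, see above
    value = int(cleaned, 16)
    if value >= 1 << 64:
        raise ValueError("serial does not fit into 64 bits")
    return value


print(serial_to_int("1A2B"))
```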

[INFO] (once, event triggered)
- right after this header the info string with the length blockSize - sizeof(mlv_info_hdr_t) follows
- the info string may contain any string entered by the user in format "tag1: value1; tag2: value2"
- tag can for example be strings like take, shot, customer, day etc and value also any string
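a tool could split such an info string as sketched below. a minimal python example under the assumption that tags and values themselves contain no ";" and that only the first ":" separates tag and value. parse_info_string is a hypothetical name.

```python
def parse_info_string(info: str) -> dict:
    """Split "tag1: value1; tag2: value2" into a dict of stripped strings."""
    tags = {}
    for part in info.split(";"):
        if ":" not in part:
            continue  # ignore empty or malformed entries
        tag, value = part.split(":", 1)
        tags[tag.strip()] = value.strip()
    return tags


print(parse_info_string("take: 4; scene: 1; location: roof"))
```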

[NULL] (random)
- ignore this block - its just to fill some writing buffers and thus may contain valid or invalid data
- timestamp is bogus

[ELVL] (periodic)
- roll and pitch values read from the acceleration sensor are provided with this block

[WBAL] (periodic, event triggered)
- all known information about the current white balance status is provided with this block

[XREF] (once)
- this is the only block written after recording by processing software, but not the camera
- it contains a list of all blocks that appear, sorted by time
- the XREF block is saved to an additional chunk
- files that only contain a XREF block should get named .idx to clarify their use
- .idx files must contain the same MLVI header as all chunks, but only have the XREF block in it

[MARK]
- on keypresses, like halfshutter or any other button, this block gets written for e.g. supplying video cutting positions
- the data embedded into this block is the keypress ID you can get from module.h

[VERS] (any number, usually at the beginning)
- a string follows that may get used to identify ML and module versions
- should follow the format "<module> <textual version info>"
- possible content: "mlv_play built 2017-07-02 15:10:43 UTC; commit c8dba97 on 2016-12-18 12:45:34 UTC by g3gg0: mlv_play: add variable bit depth support. mlv_play requires experi..."


possible future blocks:

[BIAS]
[DARK]
[FLAT]
- in-camera black and noise reference pictures can be attached here (dark frame, bias frame, flat frame)
- to be checked if this is useful and doable




[MLV Format]
- the Magic Lantern Video format is a block-based file format
- all information, no matter if audio or video data or metadata, is written as a data block with the same basic structure
- this basic structure includes block type information, block size and timestamp (exception to this is the file header, which has no timestamp, but a version string instead)
- the timestamp field in every block is a) to determine the logical order of data blocks in the file and b) to calculate the wall time distance between any of the blocks in the files
- the file format allows multiple files (=chunks) which basically are in the same format with file header and blocks
- chunks are either sequentially written (due to e.g. 4 GiB file size limitation) or parallel (spanning over multiple media)
- the first chunk has the extension .mlv, subsequent chunks are numbered .m00, .m01, .m02, ...
- there is no restriction what may be in which chunk and what not
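because every block starts with its type and size, a parser can walk a chunk without knowing every block type. the following python sketch shows the idea. it is not the reference implementation and rests on assumptions: every block (including the file header) starts with a 4 byte type followed by a uint32 size, as in the mlv_info_hdr_t layout quoted later in this thread, the size counts the whole block including its header, and values are little endian. iter_blocks is a hypothetical name.

```python
import io
import struct


def iter_blocks(stream):
    """Yield (block_type, block_size, offset) for every block in one chunk."""
    while True:
        offset = stream.tell()
        head = stream.read(8)
        if len(head) < 8:
            return  # end of chunk (or truncated header)
        block_type, block_size = struct.unpack("<4sI", head)  # assumed little endian
        if block_size < 8:
            raise ValueError("invalid block size at offset %d" % offset)
        yield block_type.decode("ascii", "replace"), block_size, offset
        # unknown block types are skipped simply by jumping to the next header
        stream.seek(offset + block_size)


# synthetic data for demonstration only, not a valid recording
demo = b"MLVI" + struct.pack("<I", 16) + bytes(8) + b"NULL" + struct.pack("<I", 12) + bytes(4)
print(list(iter_blocks(io.BytesIO(demo))))
```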

[processing]
- to accurately process MLV files, first all blocks and their timestamps and offset in source files should get sorted in memory
- when sorting, the sorted data can be written into a XREF block and saved to an additional chunk
- do not rely on any order at all, no matter in which order they were written into a file
- the only reliable indicator is the timestamp in all headers
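the sorting step can be sketched in python as follows. BlockRef and build_index are hypothetical names and do not reflect the real XREF entry layout. NULL blocks are left out here because their timestamp is bogus, which is a choice made for this sketch.

```python
from typing import NamedTuple


class BlockRef(NamedTuple):
    timestamp: int   # timestamp field of the block header
    chunk: int       # which chunk (file) the block was found in
    offset: int      # byte offset of the block inside that chunk
    block_type: str  # e.g. "VIDF", "RTCI"


def build_index(blocks):
    """Return all usable blocks sorted by timestamp, regardless of write order."""
    usable = [b for b in blocks if b.block_type != "NULL"]
    return sorted(usable, key=lambda b: b.timestamp)


# blocks as they might be found on disk: out of order and spread over two chunks
found = [
    BlockRef(300, 1, 16, "VIDF"),
    BlockRef(100, 0, 16, "RAWI"),
    BlockRef(999, 0, 80, "NULL"),
    BlockRef(200, 0, 200, "VIDF"),
]
for ref in build_index(found):
    print(ref)
```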
Help us with datasheets - Help us with register dumps
magic lantern: 1Magic9991E1eWbGvrsx186GovYCXFbppY, server expenses: [email protected]
ONLY donate for things we have done, not for things you expect!

ilguercio

Canon EOS 6D, 60D, 50D.
Sigma 70-200 EX OS HSM, Sigma 70-200 Apo EX HSM, Samyang 14 2.8, Samyang 35 1.4, Samyang 85 1.4.
Proud supporter of Magic Lantern.

HHL

You guys are seriously an inspiration.  Time to give more donations...geez...you guys ROCK! 

xNiNELiVES

Wow, new big news about ML Raw... Nice job gentlemen.

Andy600

Looking good! Magic Lantern getting its own video format  8)

Have a great vacation g3gg0
Colorist working with Davinci Resolve, Baselight, Nuke, After Effects & Premier Pro. Occasional Sunday afternoon DOP. Developer of Cinelog-C Colorspace Management and LUTs - www.cinelogdcp.com

Danne


mageye

There are some great specs in there. I am keen to see that the audio situation is addressed with this format.

IF/WHEN this format is delivered it will indeed be very useful. It's certainly very promising and I will be very happy to adopt such a format that will address many of the problems that the current RAW files confront us with. The fact that it will be tailored to the Canon RAW footage is great.

One thing that I certainly hope is that software developers follow suit and support this format too. I hope that there will be enough people (developers) out there with the vision that this format deserves.

Without proper tools, 'industry' support (Adobe, Black Magic etc.) and independent developers (you know who you are ;) and thanks!) we will still be in a difficult corner. Support here is ESSENTIAL and maybe that needs a little subtle persuasion from fellow Magic Lantern users and supporters ;).

The bottom line is that I really want and hope that this works out because the benefits will be huge. This will be groundbreaking; not just for the Canon DSLR fans, but for the whole independent digital cinema movement.
5DMKII | 500D | KOMPUTERBAY 32GB Professional 1000x |Canon EF 50mm f/1.8 II | Samyang 35mm f/1.4 ED AS UMC | Canon EF 75-300mm f/4-5.6 III | Zoom H2 (4CH. audio recorder) | Mac OS X 10.9.2 | Photoshop CC | After Effects CC | Final Cut Pro 7

Joachim Buambeki

Quote
specifies from which sensor row/col the video frame was copied (8x2 blocks)
Does that mean this data can be used later to apply *correct* lens correction in any RAW converter that supports it (ACR, Dxo, P1, etc)?
It would be tremendously useful to have that info stored in the .mlv file to use lens profiles for any sensor crop, even though it might need to be padded with white/black image data to use regular lens profiles (which won't cost any extra bits if lossless compression is used with a DNG converter, right?).
Looking forward to that. :-)

marekk

I think we need a white balance value read directly from the camera during a raw recording.

Andy600

Could full EXIF data be captured to a separate file just before recording a raw file then merged with the raw file after recording ends either in-camera or in an app. This data could also then be parsed as metadata for Resolve etc

larrycafe

what about putting the bad pixel information into the video file?

any bad pixel handling can be done in post processing according to the information

mvejerslev

Full Exif would be wonderful, particularly I miss the ISO tag for Raw denoising.
5D Mark II, PC

Toffifee


bumkicho

Audio part is what I am excited about. It is neat to have a little beep sound, but when you are in a place with music/loud noise/etc., that little beep sound is so easily buried under.

Shizuka

Please express frame rate as a rational number: numerator/denominator instead of framerate/1000. This mitigates increasing precision errors for lower framerates.

gerk.raisen

Please add also support for markers while recording (for a lot easier life in post-prod)

(as requested here http://www.magiclantern.fm/forum/index.php?topic=6713.0)

g3gg0

I updated the format a bit. Added some of the suggestions and some other stuff.
Thats still just paperwork. That has to be implemented step by step.

I would concentrate on the existing frames, if they are right for what we need before adding new ones.

mauerfuchs

It would be nice to save RAW files with LOG like the technicolor cinestyle for Canon DSLRs
Here is a nice article:
http://www.cinema5d.com/news/?p=6165

g3gg0

Why would this make any sense?
We are saving raw. Do in post whatever you like to do.

gnarr

It would be awesome if you could reserve a few bytes for bookkeeping information such as:
time : Auto generate with internal clock
Camera # : set manually
Scene # : set manually
Slate # : set manually
Take# : manually set & edit, plus auto count and reset
Ext Sound record # : manually set & edit, plus auto count and reset

and maybe some other info that could be useful. As was already spoken about here: http://www.magiclantern.fm/forum/index.php?topic=2497.msg60707#msg60707

An option to change the filename according to this info would also be cool, if possible.
E.g.
Scene 1 - Camera 2 - Take 4 - Slate 1 - 2013-07-21 01:05:13.RAW

g3gg0

well, i think you didnt look into the document i posted ;)
its already in there.

mlv_info_hdr_t   5
      4   uint8_t     blockType[4]   INFO   user definable info string. take number, location, etc.
      4   uint32_t    blockSize      20
      8   uint64_t    timestamp
      4   uint32_t    length
      x   uint8_t[]   string                can be structured: "tag1: value1; tag2: value2"

gnarr

ohh, I definitely didn't see that. You clearly know what you are doing 8)

mucher

Awesome. It looks like a container to me. Though, I am curious how the file names are generated, because there might be multiple mlv files generated in the camera simultaneously, and written to SD/CF card/cards simultaneously and randomly, as far as I can understand, so the file might end up overwriting each other -- maybe I worried too much.

Audionut

Quote from: mucher on July 22, 2013, 01:03:10 AM
Awesome. It looks like a container to me.

It is.  MP4, AVI, MKV, MOV, etc, etc, are all containers.

Quote from: mucher on July 22, 2013, 01:03:10 AM
-- maybe I worried too much.

Yes :)

tin2tin

A reel name tag could be useful (with a default setting such as the date, max. 8 characters, for EDL files). Time code would be the same as time stamp, right?

For inspiration, the various metadata MediaInfo can extract, can be found in CSV files here:
http://mediaarea.net/download/binary/libmediainfo0/0.7.64/MediaInfo_DLL_0.7.64_Windows_i386_WithoutInstaller.7z
In MediaInfo_DLL_0.7.64_Windows_i386_WithoutInstaller.7z\Developers\List_Of_Parameters\