thinking about the definition, it would be good to write down some restrictions and/or principles. (a lot of that sounds logical, but it's worth saying!)
yep, good point. let me paraphrase them.
[MLV Format]
- the Magic Lantern Video format is a block-based file format
- every piece of information, whether audio data, video data or metadata, is written as a data block with the same basic structure
- this basic structure includes block type information, block size and timestamp (the exception is the file header, which has a version string instead of a timestamp)
- the timestamp field in every block serves a) to determine the logical order of data blocks in the file and b) to calculate the wall time distance between any two blocks in the files
- the file format allows multiple chunks, each of which has the same format: a file header followed by blocks
- chunks are either written sequentially (due to e.g. the 4 GiB file size limitation) or in parallel (spanning over multiple media)
- the first chunk has the extension .mlv, subsequent chunks are numbered .m00, .m01, .m02, ...
- there is no restriction on which blocks may go into which chunk
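a minimal sketch of what the common block structure buys a reader: a tool can walk a chunk block by block without understanding any block type. the byte layout used here (4-byte type, 32-bit size, 64-bit timestamp, little-endian) is an assumption for illustration, it is not spelled out in the list above. `iter_blocks` and `Block` are made-up names.

```python
import struct
from typing import Iterator, NamedTuple, Optional

# ASSUMED layout, not taken from the list above:
# every block starts with a 4-byte type and a uint32 block size (little-endian),
# and every block except the file header continues with a uint64 timestamp.
PREFIX = struct.Struct("<4sI")
TIMESTAMP = struct.Struct("<Q")


class Block(NamedTuple):
    block_type: bytes
    offset: int                 # offset of the block within the chunk
    size: int                   # total block size, header included
    timestamp: Optional[int]    # None for the file header (MLVI)


def iter_blocks(data: bytes) -> Iterator[Block]:
    """Walk all blocks of one chunk that is already loaded into memory."""
    pos = 0
    while pos + PREFIX.size <= len(data):
        block_type, size = PREFIX.unpack_from(data, pos)
        if size < PREFIX.size or pos + size > len(data):
            raise ValueError(f"bad block size {size} at offset {pos}")
        if block_type == b"MLVI":
            timestamp = None    # file header carries a version string instead
        else:
            if size < PREFIX.size + TIMESTAMP.size:
                raise ValueError(f"block at offset {pos} too small for a timestamp")
            (timestamp,) = TIMESTAMP.unpack_from(data, pos + PREFIX.size)
        yield Block(block_type, pos, size, timestamp)
        pos += size             # the block size covers header and payload
```

unknown block types are simply skipped over by their size, which is the point of giving every block the same basic structure.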
[processing]
- to accurately process MLV files, first collect all blocks with their timestamps and offsets in the source files and sort them in memory
- after sorting, the sorted index can be written into an XREF block and saved to an additional chunk
- files that only contain an XREF block should get named .idx to clarify their use
- do not rely on the order of blocks in a file at all, no matter in which order they were written
- the only reliable indicator is the timestamp in all headers
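the indexing pass described above could look roughly like this. it assumes the blocks of each chunk were already scanned into (type, offset, timestamp) tuples; `build_index`, `IndexEntry` and the chunk numbering are made up for this sketch, and the actual XREF entry layout is not defined here.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple

# (block type, offset in chunk, timestamp or None for the file header)
RawBlock = Tuple[bytes, int, Optional[int]]


class IndexEntry(NamedTuple):
    timestamp: int
    chunk: int          # which chunk the block lives in (numbering is hypothetical)
    offset: int         # byte offset of the block inside that chunk
    block_type: bytes


def build_index(chunks: Dict[int, List[RawBlock]]) -> List[IndexEntry]:
    """Merge the blocks of all chunks and order them by timestamp only."""
    entries = []
    for chunk, blocks in chunks.items():
        for block_type, offset, timestamp in blocks:
            # a file header has no timestamp, treat it as 0
            ts = 0 if timestamp is None else timestamp
            entries.append(IndexEntry(ts, chunk, offset, block_type))
    # chunk and offset only break ties so that the result is deterministic
    # (that tie-break rule is a choice of this sketch)
    return sorted(entries, key=lambda e: (e.timestamp, e.chunk, e.offset))
```

the file order never enters the sort key, so out-of-order writes and blocks spread over parallel chunks end up in one consistent timeline.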
[MLVI]
- the MLVI block is the first block in every .mlv file
- the MLVI block has no timestamp; it is assumed to have timestamp value 0 if necessary
- the MLVI block contains a GUID field which is a random value generated per video shoot
- using the GUID, a tool can detect which partial or spanning files belong together, no matter how they are named
- it is the only block that has a fixed position, all other blocks may follow in random order
- the fileCount field in the header may get set to the number of total chunks in this recording
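a sketch of how a tool might use the GUID to collect files that belong together, regardless of their names. the field order and widths of the MLVI block used here are an assumption for illustration (only the GUID and fileCount fields are mentioned above, the file number field is assumed too), and `read_mlvi` / `group_by_guid` are made-up names.

```python
import struct
from collections import defaultdict
from typing import Dict, List, Tuple

# ASSUMED start of the MLVI block (little-endian):
# magic[4], blockSize, versionString[8], GUID, file number, fileCount
MLVI_PREFIX = struct.Struct("<4sI8sQHH")


def read_mlvi(header: bytes) -> Tuple[int, int, int]:
    """Return (guid, file_num, file_count) from the first bytes of a chunk."""
    magic, _size, _version, guid, file_num, file_count = MLVI_PREFIX.unpack_from(header)
    if magic != b"MLVI":
        raise ValueError("chunk does not start with an MLVI block")
    return guid, file_num, file_count


def group_by_guid(headers: Dict[str, bytes]) -> Dict[int, List[str]]:
    """Map each GUID to the sorted names of all files that carry it."""
    groups = defaultdict(list)
    for name, header in headers.items():
        guid, _file_num, _file_count = read_mlvi(header)
        groups[guid].append(name)
    return {guid: sorted(names) for guid, names in groups.items()}
```

if fileCount is set, a tool could additionally compare it against the number of files found per GUID to notice missing chunks.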
[VIDF] (periodic)
- the VIDF block contains encoded video data in any format (H.264, raw, YUV422, ...)
- the format of the data in VIDF blocks has to be determined using MLVI.videoClass
- if the video format requires more information, additional format specific "content information" blocks have to be defined (e.g. RAWI)
- VIDF blocks have a variable-sized frameSpace which is meant for address alignment, to optimize in-memory copy operations. it may be set to zero or any other value
- the data right after the header is of the size specified in frameSpace and considered random, unusable data. just ignore it.
- the data right after frameSpace is the video data which fills up the rest until blockSize is reached
- the blockSize of a VIDF is therefore sizeof(mlv_vidf_hdr_t) + frameSpace + video_data, which means that a VIDF block is a composition of those three data fields
- if frames were skipped, either a VIDF block with zero sized payload may get written or it may be completely omitted
- the format of the data in VIDF frames may change during recording (e.g. resolution, bit depth etc)
- whenever a new content information block (e.g. RAWI) appears in the timeline, it has to get parsed and its format applies to all following blocks
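the size rule for VIDF blocks, written out as arithmetic. `header_size` stands in for sizeof(mlv_vidf_hdr_t); the function name and the value 32 in the usage note are only examples, not taken from the definition.

```python
from typing import Tuple


def vidf_payload_span(block_size: int, header_size: int, frame_space: int) -> Tuple[int, int]:
    """Return (offset, length) of the video data, relative to the block start.

    block_size  = header_size + frame_space + length of the video data
    """
    payload_offset = header_size + frame_space      # skip header and padding
    payload_length = block_size - payload_offset    # the rest is video data
    if payload_length < 0:
        raise ValueError("header and frameSpace do not fit into blockSize")
    return payload_offset, payload_length
```

for example, `vidf_payload_span(1096, 32, 64)` gives `(96, 1000)`, and a returned length of 0 is the "skipped frame with zero sized payload" case.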
[AUDF] (periodic)
- see [VIDF] block. same applies to audio
[RTCI] (periodic, event triggered)
- contains the current time of day information that can be gathered from the camera
- may appear with any period, maybe every second or more often
- should get written before any VIDF block appears, else post processing tools cannot reliably extract frame time
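a sketch of how a post processing tool could derive the time of day for a frame from one RTCI block. it assumes that block timestamps count microseconds, which is not stated above, and that the RTCI content was already converted to a `datetime`; `frame_wall_time` is a made-up name.

```python
from datetime import datetime, timedelta


def frame_wall_time(rtci_timestamp: int, rtci_time: datetime, frame_timestamp: int) -> datetime:
    """Time of day of a frame, anchored on one RTCI block.

    ASSUMPTION: block timestamps are in microseconds.
    """
    elapsed = timedelta(microseconds=frame_timestamp - rtci_timestamp)
    return rtci_time + elapsed
```

with an RTCI block at timestamp 500000 that says 12:00:00 (arbitrary example values), a frame with timestamp 2000000 ends up at 12:00:01.5. if the RTCI block only shows up after the first frames, the difference goes negative and the accuracy depends on how stale the anchor is.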
[LENS] / [EXPO] / ... (periodic, event triggered)
- whenever a change in exposure settings or lens status (ISO, aperture, focal length, focus distance, ...) is detected, a new block is inserted
- all video/audio blocks after these blocks should use those parameters
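the "applies to all following blocks" rule can be sketched as a small state machine over the timestamp-sorted blocks. blocks are represented here as (timestamp, type, parsed content) tuples; that representation, the name `attach_metadata` and the example field names in the usage note are made up for illustration.

```python
from typing import Any, Dict, Iterable, List, Tuple

Block = Tuple[int, str, Any]      # (timestamp, block type, parsed content)
DATA_TYPES = {"VIDF", "AUDF"}     # blocks that carry video/audio data


def attach_metadata(blocks: Iterable[Block]) -> List[Tuple[int, str, Dict[str, Any]]]:
    """Give every video/audio block the settings that were valid at its timestamp."""
    current: Dict[str, Any] = {}  # latest content seen per metadata block type
    result = []
    # order by timestamp only; the order in the file is not reliable
    for timestamp, block_type, content in sorted(blocks, key=lambda b: b[0]):
        if block_type in DATA_TYPES:
            # snapshot, so later changes do not leak into earlier frames
            result.append((timestamp, block_type, dict(current)))
        else:
            current[block_type] = content
    return result
```

for instance, with an EXPO block `{"iso": 100}` at timestamp 50 and another `{"iso": 800}` at timestamp 200, a VIDF at 100 gets ISO 100 and a VIDF at 300 gets ISO 800, no matter in which order the blocks were stored.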
...to be continued