Reverse Engineering Picture Styles

Started by dfort, December 07, 2015, 05:50:39 AM

jkaura

Wow! I don't know where you got that list, but it really covers a whole lot of things. Awesome!! :)

agentirons

Hey, @chmee, I'm trying to translate your PHP script into Python, but I seem to have a bug somewhere, as the file output doesn't match that of chris_overseas's Java program. Would you be able to take a look and tell me if I'm doing the byteArray wrong or something?

If I can get it to work I can probably whip up a better OSX app for decoding/encoding, then we'll have a good start later on for quick and easy LUT->Picture Style conversion.

http://pastebin.com/ft2LkvYB

chris_overseas

Quote from: agentirons on September 30, 2016, 12:26:18 AM
Hey, @chmee, I'm trying to translate your PHP script into Python, but I seem to have a bug somewhere, as the file output doesn't match that of chris_overseas's Java program. Would you be able to take a look and tell me if I'm doing the byteArray wrong or something?

If I can get it to work I can probably whip up a better OSX app for decoding/encoding, then we'll have a good start later on for quick and easy LUT->Picture Style conversion.

http://pastebin.com/ft2LkvYB

Take a look at my Java source (rename Pf3Decoder.jar to Pf3Decoder.zip and have a look at Pf3Decoder.java within), it should be very easy to port to any other language of your choice. Note that it uses two key arrays (one 512 bytes, one 513 bytes) so you don't need the full 260KB array.
EOS R5 1.1.0 | Canon 16-35mm f4.0L | Tamron SP 24-70mm f/2.8 Di VC USD G2 | Canon 70-200mm f2.8L IS II | Canon 100-400mm f4.5-5.6L II | Canon 800mm f5.6L | Canon 100mm f2.8L macro | Sigma 14mm f/1.8 DG HSM Art | Yongnuo YN600EX-RT II

agentirons

Thanks, @chris_overseas, I'm not as familiar with Java or C so it took me a little while to translate but I finally got it down.

Here's the working Python version of @chris_overseas's Java app for anybody who finds it useful:
http://pastebin.com/2xhBcgkT
(Make sure to copy the raw script into your own text file. If you hit the download button and try to run the file directly on OSX, you'll get errors because pastebin added '\r' carriage-return characters to the script for some reason.)
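
If a downloaded copy does pick up carriage returns, a cleanup along these lines should make it runnable again. This is only a minimal sketch and the function name is illustrative, not part of the pastebin script:

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert CRLF and stray CR line endings to plain LF."""
    # Replace CRLF first so a lone-CR pass does not double the newlines.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

Read the downloaded file in binary mode, pass the bytes through `normalize_newlines`, and write the result back out before running it.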

dfort

Wow, fantastic progress on the pf2 and pf3 files.

I'm thinking that these picture style files are processed by the EOS Utility and written to the camera's memory. According to a1ex:

Quote: FYI, properties PROP_PC_FLAVOR[123]_PARAM 0x401000[135] might contain user picture style data. In ML, they are only used to get the picture style names.

Maybe the rest of the data is the ICC profile?

agentirons

Slowly going through Canon's built-in styles and comparing setting changes to break down the binary data.

It seems that the 1 or 2 bytes that sit 9 bytes in from the end of the pf2 files (just before FF FF FF FF 00 00 00 00) are some kind of integrity check. If you save the same settings twice you get identical data, but if you change anything you'll see a flag change near the top and those integrity bytes change too, though not in a 1:1 relationship. It's as if the aggregate of your changes is somehow summed into those two bytes: "Change two settings by 1 = A4", "Change three settings by 1 = A2", but also "Change three completely different settings by 1 = A2". That includes incrementing a number in your caption field.

agentirons

Spent a lot of time poring through the data last night, just for pf2 files. I think I can confirm that those bytes near the end are some kind of checksum, with the property ID being 0xfffe - Finalize (type 0x0004 ~ Long) according to @chris_overseas's list. There's also a similar 32-bit checksum just for copyright field data that appears at the beginning of the file after the 0x50535000 tag.
EDIT: The 4 bytes after the 0x50535000 tag are actually the file length in bytes, not including the 12-byte header.
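
A quick way to sanity-check that length field on a decoded file might look like the sketch below. It assumes the 0x50535000 tag sits at offset 0 and that the length is a big-endian 32-bit integer right after it; neither detail is confirmed here, and the function names are made up for illustration.

```python
import struct

MAGIC = bytes.fromhex("50535000")  # the 0x50535000 tag
HEADER_SIZE = 12                   # the length field excludes a 12-byte header


def read_declared_length(blob: bytes) -> int:
    """Return the payload length stored after the tag (assumed big-endian)."""
    if blob[:4] != MAGIC:
        raise ValueError("missing 0x50535000 tag")
    (declared,) = struct.unpack_from(">I", blob, 4)
    return declared


def length_is_consistent(blob: bytes) -> bool:
    """Compare the declared length with the actual file size minus the header."""
    return read_declared_length(blob) == len(blob) - HEADER_SIZE
```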

@jkaura noticed that 0x1009 is specifically Base Picture Style. Technicolor Cinestyle lists 0x0084 as its value, because it's based on Neutral. I've tried putting in values higher than 86, since my latest OSX PSE software doesn't even list Monochrome as an option, but of course that checksum prevents me from re-encoding the file successfully without knowing how those values are generated.

EDIT: Oh, it's called a Longitudinal redundancy check. Working on the formula.

In charting out the files, it seems that the tags go:
0x1009 0003 = Tag name followed by data type
0x0000 0004 = The actual length of the following data block (i.e. 04 = 4 bytes. The 3x20 color matrices are F0 = 240 bytes)
0x0000 0001 = Actual data, like 32bit Saturation Value
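
That three-part layout can be walked with a short loop. The sketch below assumes every tag is a 2-byte property ID, a 2-byte data type, a 4-byte length and then the data, all big-endian and with no padding between tags; those details are inferred from the hex listing above rather than confirmed, and `iter_tags` is just an illustrative name.

```python
import struct
from typing import Iterator, Tuple


def iter_tags(payload: bytes, offset: int = 0) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (prop_id, data_type, data) for each tag in a decoded pf2 payload."""
    while offset + 8 <= len(payload):
        # 2-byte ID, 2-byte type, 4-byte length (assumed big-endian)
        prop_id, data_type, length = struct.unpack_from(">HHI", payload, offset)
        start = offset + 8
        end = start + length
        if end > len(payload):
            raise ValueError(f"tag 0x{prop_id:04x} runs past the end of the data")
        yield prop_id, data_type, payload[start:end]
        offset = end
```

Feeding it the bytes `10 09 00 03 00 00 00 04 00 00 00 84` gives one tag: ID 0x1009, type 3, with four data bytes.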

The caption field under propID 0x1002 is always a fixed 32 byte length, consisting of 31 character bytes plus a 00 ending byte.

The copyright field under propID 0x100A is a variable byte length, up to a max of 64 bytes, consisting of n character bytes plus a 00 ending byte.
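
For re-encoding, those two text fields could be built roughly like this. It's a sketch: ASCII encoding and zero-filling a short caption out to 32 bytes are assumptions, and the helper names are made up.

```python
def encode_caption(text: str) -> bytes:
    """Caption (0x1002): fixed 32 bytes, at most 31 characters plus a 00 byte."""
    raw = text.encode("ascii")
    if len(raw) > 31:
        raise ValueError("caption is limited to 31 characters")
    # Assumption: unused bytes are filled with 00.
    return raw.ljust(32, b"\x00")


def encode_copyright(text: str) -> bytes:
    """Copyright (0x100A): n characters plus a 00 byte, 64 bytes at most."""
    raw = text.encode("ascii") + b"\x00"
    if len(raw) > 64:
        raise ValueError("copyright is limited to 64 bytes including the 00 byte")
    return raw


def decode_text(data: bytes) -> str:
    """Read either field back, stopping at the first 00 byte."""
    return data.split(b"\x00", 1)[0].decode("ascii")
```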

Cinestyle actually matches the Canon structure closer than I thought, it's just that some of the tables and matrices are in a different order than Canon's.

agentirons

Alright everyone, here's the pf2 format guide in Google Sheets format, including the annotated Technicolor Cinestyle:

https://docs.google.com/spreadsheets/d/1ePeeBpraX7AlM_pbBOTILNGQ9q0G_YQZGSEJYYFA1w8/edit?usp=sharing

I've set it to view only while I investigate on my own but feel free to duplicate and make changes. There are two tabs at the bottom, first is a breakdown of the basic Canon styles, second tab is the full Cinestyle hex data.

Probably most interesting is the Cinestyle "User RGB Gamma" block (0x1054 0004), which is a 4096 byte(!) block that I can only assume is the actual log curve in all its glory.

Technicolor rearranged their blocks compared to Canon and nixed a few like "Partial Info (sRGB)" and "WK Data", whatever those are. I'd love more info from @chris_overseas about what some of these tags mean, like "Site" or "Remain".

DeafEyeJedi

Quote from: agentirons on September 30, 2016, 09:33:30 PM
Alright everyone, here's the pf2 format guide in Google Sheets format, including the annotated Technicolor Cinestyle:

Holy Moly, John! This is all really happening and it is actually unfortunate that @dfort is still out on vacation. Meanwhile hopefully @chris_overseas could fill us in some more on these tags. So unreal!  8)

Seriously, it seems more like @Danne's & @dfort's "CineStyle Conspiracy Theory" is starting to become realistic, or at least shedding some light for this project to move forward. Keep up the great work everyone!
5D3.113 | 5D3.123 | EOSM.203 | 7D.203 | 70D.112 | 100D.101 | EOSM2.* | 50D.109

nikfreak

May I ask the reason for reddercity's posts being removed / deleted (or whatever is happening to 'em)?
70D.112 & 100D.101

a1ex

He posted code from EDSDK, which is covered by an NDA.

(FYI, his posts were reported by a few users, so it's not just me)

Andy600

Quote from: agentirons on September 30, 2016, 09:33:30 PM
Probably most interesting is the Cinestyle "User RGB Gamma" block (0x1054 0004), which is a 4096 byte(!) block that I can only assume is the actual log curve in all its glory.

Converting that block to decimal reveals a 10-bit log curve and it's pretty much Cineon math (0.6 gamma). The black level is offset to legal level (cv 64/1023).
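
For reference, the stock Cineon conversion (0.002 density per code value, 0.6 negative gamma, reference white 685, reference black 95) can be sketched as below. These are the standard Cineon constants, not values read out of the Cinestyle block, which by the description above puts black at cv 64 instead; the function names are illustrative.

```python
import math

REF_WHITE = 685      # Cineon reference white code value
REF_BLACK = 95       # Cineon reference black code value
STEP = 0.002 / 0.6   # density per code value divided by negative gamma

# Linear offset so that REF_BLACK maps to exactly 0.0 in linear light
BLACK_OFFSET = 10 ** ((REF_BLACK - REF_WHITE) * STEP)


def cineon_to_linear(code: float) -> float:
    """Convert a 10-bit Cineon code value to linear light (white = 1.0)."""
    gain = 1.0 / (1.0 - BLACK_OFFSET)
    return (10 ** ((code - REF_WHITE) * STEP) - BLACK_OFFSET) * gain


def linear_to_cineon(lin: float) -> float:
    """Convert linear light (>= 0) to a Cineon code value."""
    scaled = lin * (1.0 - BLACK_OFFSET) + BLACK_OFFSET
    return REF_WHITE + math.log10(scaled) / STEP
```

Comparing the decimal values from the block against `linear_to_cineon` over a ramp is one way to see how closely the curve follows Cineon.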

spreadsheet (excel format) https://s3-us-west-2.amazonaws.com/ofxpublic/cinestyle_hex2dec.xlsx




Update:

The RGB gamma blocks seem to provide an underlying s-curve which is likely to be the inverse of the s-curve used by the Neutral profile i.e. the curve you are always fighting in PSE.

Applying the inverse = Neutral linear (don't confuse linear in this context with scene-referred linear).
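
Inverting a point curve like that is simple as long as it is strictly increasing: swap inputs and outputs. The sketch below uses linear interpolation between points, which is an assumption, since the interpolation the camera actually uses is not known; the names are illustrative.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def invert_curve(points: Sequence[Point]) -> List[Point]:
    """Invert a strictly increasing curve by swapping inputs and outputs."""
    ordered = sorted(points)
    ys = [y for _, y in ordered]
    if any(b <= a for a, b in zip(ys, ys[1:])):
        raise ValueError("curve must be strictly increasing to be invertible")
    return [(y, x) for x, y in ordered]


def apply_curve(points: Sequence[Point], value: float) -> float:
    """Evaluate a curve (sorted by input) with linear interpolation."""
    xs = [x for x, _ in points]
    if value <= xs[0]:
        return points[0][1]
    if value >= xs[-1]:
        return points[-1][1]
    i = bisect_right(xs, value)
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (value - x0) / (x1 - x0)
```

Running a value through a curve and then through its inverse should return the original value, which is the "Neutral linear" idea in miniature.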



https://s3-us-west-2.amazonaws.com/ofxpublic/rgb_gamma_curve.xlsx

I've also put the RGB gamma curve into a .pf3

https://s3-us-west-2.amazonaws.com/ofxpublic/RGB_Gamma_curve.pf3


I'm not sure yet what interpolation is used (if any) but it should be possible to create alternative log profiles by first offsetting the RGB gamma curve - however, 8 bits isn't going to be nearly enough for most types.

Protune might be a good alternative but that is very close to Cinestyle anyway. Original Canon log (done properly) could be viable!
Colorist working with Davinci Resolve, Baselight, Nuke, After Effects & Premier Pro. Occasional Sunday afternoon DOP. Developer of Cinelog-C Colorspace Management and LUTs - www.cinelogdcp.com

Danne


Lars Steenhoff

Can we make Fuji's F-Log for Canon with this information?

What's interesting to me is that they use ITU-R BT.2020 for the log encoding.

http://www.fujifilm.com/support/digital_cameras/software/lut/pdf/F-Log_DataSheet_E_Ver.1.0.pdf

http://www.fujifilm.com/support/digital_cameras/software/lut/

markanini

Quote from: Andy600 on October 01, 2016, 11:58:54 AM
I'm not sure yet what interpolation is used (if any) but it should be possible to create alternative log profiles by first offsetting the RGB gamma curve - however, 8 bits isn't going to be nearly enough for most types.
Perhaps Graeme Gill can provide further insight on effective curves for 8-bit capture.
Gear: Canon 600D & Magic Lantern Nightly.

a1ex

Here are some 8-bit curves that minimize the quantization error, relative to the noise levels.

https://files.apertus.org/AXIOM-Beta/optimal_curve.html

TLDR: they depend on the noise profile, and work best at higher ISOs (because there are more noise bits that can be discarded without impacting the image quality). I also believe these curves are optimal for reducing the bit depth from 14 bits to 10 bits (or even 8 bits at higher ISOs) with minimal loss, if we ever *) find a way to apply curves to the raw data.

*) there is an easy coding task in the 12-bit research thread (reminder)

The big question: is the input data for these curves limited to 10 bits or not?

Andy600

@a1ex so you think there may be some downsampling before the data gets anywhere near a Picture Style? I thought this too, but wouldn't that be overcomplicating things in terms of in-camera processing, because there is then another conversion to 8-bit when it gets encoded for output?

Your 8-bit curve research is very useful, as is your comment about better results at higher ISOs. I have a hunch that for a 'static' log profile like Cinestyle we should be shooting at higher ISOs (probably 640) and using NDs to maintain a consistent noise profile. I don't think there is any real difference between a DSLR and a dedicated cinema camera in this respect. It's the same math, but we tend to shoot at the lowest ISO possible because it's cleaner. If this carried through to a cinema camera everyone would be shooting at the lowest ISOs all the time, but that's not the case and the base is typically fixed between 400-800 ISO.

re: curves in raw data. @cpc's Slimraw can log encode raw data at 10bit. It's very good - but I suspect you mean a way to do it in-camera!?

agentirons

Quote from: Andy600 on October 01, 2016, 11:58:54 AM

The RGB gamma blocks seem to provide an underlying s-curve which is likely to be the inverse of the s-curve used by the Neutral profile i.e. the curve you are always fighting in PSE.

Applying the inverse = Neutral linear (don't confuse linear in this context with scene-referred linear).

So if I'm understanding that right, then the DIGIC processor as well as DPP first apply the "RGB Gamma" (0x1018) curve, which inverts the curve of the Base Picture Style, "Neutral", and then the "User RGB Gamma" curve is applied afterwards to turn the incoming data into more or less Cineon Log?

The User RGB Gamma block is curious to me because it doesn't exist in any other pf2 files I've looked at from Canon or those made in PSE, and it clearly offers a much finer level of detail than the twelve point curve you get in the RGB or LAB gamma blocks.

Assuming that order of operations is right, and also that we could create our own User Gamma block, then it would appear the key to adding any truly custom picture style is to use some of the data blocks to cancel out the base Canon style and then use the other blocks to apply the desired curves.

Of course at the moment nothing is testable until someone can figure out how the final checksum is generated. I think I was slightly off when I said it was an LRC. It seems more likely that it's CRC-32, since it's a 32-bit hex string and I've seen at least 24 bits of it used. I'm not very familiar with the topic, but hopefully @chmee or @chris_overseas could shed some light on it. It's gotta just be a matter of figuring out how much of the file is treated as the message to encode, by feeding different sections into a CRC-32 algorithm until the output matches the checksum in the file.
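
That search can be brute-forced with Python's `zlib.crc32`. The sketch below tries every contiguous byte range and reports the ones whose CRC matches. Whether Canon uses standard CRC-32 at all is unconfirmed; a different polynomial, seed or byte order would make this find nothing, and `find_crc32_ranges` is just an illustrative name.

```python
import zlib
from typing import Iterator, Tuple


def find_crc32_ranges(blob: bytes, target: int, min_len: int = 1) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) slices of blob whose standard CRC-32 equals target."""
    n = len(blob)
    for start in range(n):
        crc = 0
        for end in range(start + 1, n + 1):
            # Extend the running CRC by one byte instead of rehashing the slice.
            crc = zlib.crc32(blob[end - 1:end], crc)
            if end - start >= min_len and crc == target:
                yield start, end
```

The loop is quadratic in the file size, so on a file with the 4096-byte gamma block it is worth raising `min_len` or restricting the start offsets to tag boundaries.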

Once that's done I feel like we've got pf2 successfully reverse engineered and can easily create software to read/write. Then onto pf3, I guess!

chris_overseas

I've updated the Pf3Decoder app so it now automatically corrects the checksum when processing a pf2/pf3 file with the ".decoded" extension. This means you should be able to decode a pf2/pf3 file, make any changes you wish with a hex editor, then re-encode it without worrying about the checksum being invalidated.

https://www.dropbox.com/s/4x8x9epalybm6aw/Pf3Decoder.jar?dl=0

This hasn't had very thorough testing so please let me know if you hit any problems.

Andy600

I could be wrong but the 'User RGB Gamma' block (the log curve) doesn't seem to have any effect at all  ???

I just filled it with linear data and the linear profile looks exactly like Cinestyle in the cam and in DPP - zero change! If I alter the other R,G & B gamma blocks i.e. by setting the first curve point output from 64 to 0, it does have the expected effect with the black level dropping.

If I'm right then it's the RGB gamma blocks (9x curve points) that are the Cinestyle curve, and that's not great news*. The inverse s-curve may or may not linearize the neutral profile AND convert to log, but something doesn't quite add up when looking at the actual curve. I've always assumed (until this experiment) that the base Picture Styles were linear, because the ICC versions are and there are dynamic controls for altering contrast in the camera and DPP. I'm back to thinking that.

* having independent control over R,G&B gamma curves might be useful for making rough print emulation looks similar to what you could do with ASC CDL i.e. not as good as a 3D lut but better than a single 1D RGB tone curve.


Edit: one thing did come to mind regarding the log curve: it may be applied twice (forward and reverse), which would explain why I don't see a change, and there may be some color processing (either the base PS or a matrix, possibly saturation) taking place in log gamma. This can be very useful when it comes to highlight saturation rolloff similar to the Alexa Film Matrix. Just a thought! Forget that: if that were the case then I would still see some change in color.

Danne

So much nice development coming through lately. Not being able to check into it all atm, I just want to pass along an exiftool command which might be of interest. Or not.
Running exiftool -U my.CR2 will reveal "unknown tags", and it seems there are color and picture scale tags inside the CR2 file which might be useful. Here's an excerpt which seems to be the tone curve numbers (not really sure yet).
Canon PS Info 2 0x01d0          : 188
Canon PS Info 2 0x01d1          : 86
Canon PS Info 2 0x01d2          : 192
Canon PS Info 2 0x01d3          : 87
Canon PS Info 2 0x01d4          : 219
Canon PS Info 2 0x01d5          : 0
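
To compare these values between shots, the text output can be collected into a dictionary. This is a minimal sketch that assumes the lines look exactly like the excerpt above (group name, hex offset, colon, integer value); the function name is illustrative.

```python
import re
from typing import Dict

# Matches lines such as "Canon PS Info 2 0x01d0          : 188"
LINE_RE = re.compile(
    r"^(?P<group>.+?)\s+0x(?P<tag>[0-9a-fA-F]+)\s*:\s*(?P<value>-?\d+)\s*$"
)


def parse_unknown_tags(text: str, group: str = "Canon PS Info 2") -> Dict[int, int]:
    """Collect {tag_offset: value} for numeric unknown tags in one group."""
    found: Dict[int, int] = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match and match.group("group") == group:
            found[int(match.group("tag"), 16)] = int(match.group("value"))
    return found
```

Diffing the dictionaries from two CR2 files shot with different picture style settings should show which offsets follow the setting.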

agentirons

Quote from: chris_overseas on October 02, 2016, 04:35:51 PM
I've updated the Pf3Decoder app so it now automatically corrects the checksum when processing a pf2/pf3 file with the ".decoded" extension. This means you should be able to decode a pf2/pf3 file, make any changes you wish with a hex editor, then re-encode it without worrying about the checksum being invalidated.

https://www.dropbox.com/s/4x8x9epalybm6aw/Pf3Decoder.jar?dl=0

This hasn't had very thorough testing so please let me know if you hit any problems.

Thank you! I'm on my phone so I haven't checked it out yet, but does it also modify the file length value at the beginning of the file?

Andy600

Quote from: Danne on October 02, 2016, 08:13:52 PM
Specifying exiftool -U my.CR2 file will reveal "unknown tags" ...

Tip:  add >unknown.txt to save to a text file i.e.

exiftool -U my.CR2>unknown.txt


Just ran that on a CR2 that was set to Cinestyle. Some parts of the Canon Color Data 4 entries look very much like the RGB and equivalent Lab values for a color chart (not a Macbeth, might be ITU8? dunno). It's a LUT, and this 'could be' the look of Neutral. Need to compare with the ICC.

chris_overseas

Quote from: agentirons on October 02, 2016, 08:21:18 PM
Thank you! I'm on my phone so I haven't checked it out yet, but does it also modify the file length value at the beginning of the file?

It doesn't yet, but I'll add that later today.

Edit: I've now updated Pf3Decoder.jar so it updates the header with the correct data length when re-encoding files.
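
For anyone porting this to another language, the header fix-up on a decoded file could be as small as the sketch below. It is not taken from Pf3Decoder; it assumes the length is a big-endian 32-bit value at offset 4 and that it counts everything after the 12-byte header, and the function name is made up.

```python
import struct

HEADER_SIZE = 12  # header size quoted earlier in the thread


def patch_length(blob: bytes) -> bytes:
    """Return a copy of a decoded file with its length field recomputed."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("file is shorter than the 12-byte header")
    patched = bytearray(blob)
    # Overwrite the 4 bytes after the tag with the new payload length.
    struct.pack_into(">I", patched, 4, len(blob) - HEADER_SIZE)
    return bytes(patched)
```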

Andy600

This is just a proof of concept but I've written a different RGB curve into a pf2 with a linear tone curve. It's based on (but not completely accurate to) Cineon + 2EV so probably best not to use it for anything. It works in camera and DPP. It also opens in PSE but there is no user accessible data there - re-saving it in PSE either as pf2 or pf3 nulls the RGB curve.

https://s3-us-west-2.amazonaws.com/ofxpublic/Cineon_Log_plus2EV.pf2