Reverse Engineering Picture Styles

Started by dfort, December 07, 2015, 05:50:39 AM

chris_overseas

@Danne, @DeafEyeJedi: The XOR operator (usually represented by the ^ symbol) is its own inverse. That is, if X ^ Y = Z, then it is also true that X = Y ^ Z and Y = X ^ Z. If you know any two of the values you can always figure out the third, but if you only have one value you can't determine either of the other two. This property means it can be used as a sort of "poor man's encryption", where e.g. X = unencrypted data, Y = encryption key, Z = encrypted data.

Taking @jkaura's example, we know all of Z (the encrypted file) and some of X (the original data), so we can compute some of Y (the key) using Z ^ X = Y:

580F 2FC2 552F 2E76 D23D ^ 4141 4141 4141 4141 4141 = 194E 6E83 146E 6F37 937C
5B0C 2CC1 562C 2D75 D13E ^ 4242 4242 4242 4242 4242 = 194E 6E83 146E 6F37 937C

Because the results (key fragments) are the same in both cases, it implies the key is a constant. So now the puzzle is to figure out the rest of the key. It might be hard-coded in the encoding/decoding software, or it might be generated using a formula. If it's hard-coded it should be relatively easy to find by scanning the binaries, if it's a formula then some reverse-engineering of the encoding/decoding software (or extracting more fragments of the key using @jkaura's approach) is probably required.
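For anyone who would rather script this than type it into a calculator, here is a minimal Python sketch of the same calculation. The hex strings are the ones quoted above; the helper name xor_bytes is just made up for illustration.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings position by position (stops at the shorter one)."""
    return bytes(x ^ y for x, y in zip(a, b))

# Encoded bytes (Z) from the two sample files quoted above.
encoded_a = bytes.fromhex("580F2FC2552F2E76D23D")
encoded_b = bytes.fromhex("5B0C2CC1562C2D75D13E")

# Known plaintext (X): ten 'A' characters (0x41) and ten 'B' characters (0x42).
key_a = xor_bytes(encoded_a, b"A" * 10)
key_b = xor_bytes(encoded_b, b"B" * 10)

print(key_a.hex().upper())  # 194E6E83146E6F37937C
print(key_a == key_b)       # True: both files give the same key fragment
```

Because XOR is its own inverse, the same helper also turns a key fragment plus encoded bytes back into plaintext.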
EOS R5 1.1.0 | Canon 16-35mm f4.0L | Tamron SP 24-70mm f/2.8 Di VC USD G2 | Canon 70-200mm f2.8L IS II | Canon 100-400mm f4.5-5.6L II | Canon 800mm f5.6L | Canon 100mm f2.8L macro | Sigma 14mm f/1.8 DG HSM Art | Yongnuo YN600EX-RT II

jkaura

@chris_overseas: Thanks for the clear explanation! You have clever ideas on further steps to resolve the key behind picture style files!

@Danne, @DeafEyeJedi: Btw, on a Windows 10 machine, these example XOR calculations can be repeated quite easily with the basic Calculator application in Programmer mode with HEX selected. There you can type hexadecimal digits (0-9 and A-F) and perform an XOR operation between two hexadecimal numbers.

chris_overseas

I've only just installed Canon's Picture Style Editor a few minutes ago but may have figured out how to disable the protection on pf3 files. On the handful of files I've tried at least, changing byte 0x0006A122 from a 0x31 to a 0x30 seems to allow it to be edited, without any obvious side effects. Based on that and some of the other behaviour I've seen too, I don't think there's a checksum on the file.

Can someone try changing that byte on a pf3 file of theirs and let me know if it enables editing? Also, are pf3 files always 434,487 bytes in size?
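If you want to try this without a hex editor, here is a rough Python sketch of the patch. The helper name patch_byte is made up; the offset 0x0006A122 and the 0x31/0x30 values are simply the ones from my handful of test files, so treat them as assumptions. The helper refuses to write if the byte isn't what it expects.

```python
def patch_byte(data: bytes, offset: int, expected: int, new: int) -> bytes:
    """Return a copy of data with data[offset] changed from expected to new."""
    if offset >= len(data):
        raise ValueError(f"offset {offset:#x} is beyond the end of the data")
    if data[offset] != expected:
        raise ValueError(
            f"byte at {offset:#x} is {data[offset]:#04x}, expected {expected:#04x}"
        )
    patched = bytearray(data)
    patched[offset] = new
    return bytes(patched)

# Self-check on made-up data (not a real pf3 file).
demo = bytes([0x00, 0x31, 0x00])
print(patch_byte(demo, 1, 0x31, 0x30).hex())  # 003000

# On a real file (assumption: the offset seen in my test files applies to yours):
#   from pathlib import Path
#   data = Path("style.pf3").read_bytes()
#   Path("style-unlocked.pf3").write_bytes(patch_byte(data, 0x0006A122, 0x31, 0x30))
```

Writing to a new file rather than patching in place keeps the original around in case the offset is different for your style.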

agentirons

Wow, cool find. The only pf3 files I've seen online are the ones in post 1 here, which are all either 434,490 or 434,466. I just tested saving out a default (zero adjustments) pf3 with PSE and it seems the file size fluctuates slightly with the number of characters in the Comment/Copyright fields when saving. The comment field can have 31 characters max, and the copyright field can have 63. Filling both fields with all 'z' characters gets a filesize of 434,529 bytes, and putting only a single '0' in the comment field gives me 434,466.

From looking at pf2 files by Canon and this collection of user styles: http://cinescopophilia.com/wp-content/uploads/2012/03/158%20VW%20Collection%20CPS.zip - I can see a wide variety of file sizes from 1243 bytes up to 20623 bytes.

dfort

Quote from: chris_overseas on September 28, 2016, 10:57:20 PM
...changing byte 0x0006A122 from a 0x31 to a 0x30 seems to allow it to be edited, without any obvious side effects...

In ASCII, 0x30 is '0' and 0x31 is '1', so this looks like a boolean flag. Have you found the address for pf2 files?

So far all of the pf3 picture files I have looked at appear to have been created with the Canon Picture Style Editor. However, the Canon and Technicolor pf2 files were apparently created by some other method. Try downloading and opening Canon's Emerald picture style and you'll see what I mean. The file is very small and it is possible to open it up in the PSE but all of the controls are at their default positions.

The Canon EOS Utility probably converts pf2 and pf3 files into icc profiles and loads them into the camera's non-volatile memory. The Canon and Technicolor pf2 files might already be in icc format only somehow encrypted? Wonder if the icc profile is stored in the camera encrypted? Maybe I'm just spitting into the wind?

agentirons

(EDIT: Had to go read through the thread again to realize that the newest version of PSE lets you pick pf2 or pf3)

Found PSE 1.3 for Windows at Canon Europe's site: http://www.canon-europe.com/support/consumer_products/products/cameras/digital_slr/eos_1d.aspx?page=3&type=download&language=en&os=all

Still uses pf2 files, in case that's useful for anybody. Saving out a default file with 0 in the comment field gave a file size of only 1431 bytes, so this may be a much simpler format to decode.

chmee

jepp. the technicolor.pf2 is only about 5kb. if we can decode pf2, it would be a big milestone for understanding the plain format - after that pf3 should not be that difficult.

@agentirons
a "nearly empty file" gives us very little information to learn from, even once it is decoded. i assume it's still binary and not human-readable (like xml).

  • the copyright/title fields can be max 32/20 chars long. have to check if it s maybe as long as the encoding key. (dont believe it :) )
  • if the known fields are long enough and xor-decoded, the key should start repeating. win.win.win.
  • furthermore, if this decoded binary file is comparable to tiff, there should be unique tag-numbers (fixed length) and maybe a hex00 for data-end.
  • maybe there are no tagnumbers but only hex00 (or other static bytevalues) to describe beginning and end of datafields.
  • statistically, this separator value should be the most frequent byte value, and it should only appear one or two times in a row.

    maybe [tag][datavalue][separator][tag][datavalue][separator][tag][datavalue][separator][tag][datavalue][separator]..
    or
    maybe [datavalue][separator][datavalue][separator][datavalue][separator]..
    or
    maybe [separator][datavalue][separator][separator][datavalue][separator][separator][datavalue][separator][separator][datavalue][separator]...

    regexlike: /[separator]{1,2}[^separator]+/

    ah, and datetime could be a value we can work with.
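    the statistics idea above could be checked with something like this python sketch. it only counts byte values and their longest runs in an already xor-decoded buffer; the function name separator_candidates and the sample bytes are made up, and a frequent byte is only a candidate for a separator, not proof.

```python
from collections import Counter

def separator_candidates(data: bytes, top: int = 3):
    """Return (byte value, count, longest run) for the most frequent byte values."""
    counts = Counter(data)
    longest = {}
    run_value, run_len = None, 0
    for b in data:
        # Extend the current run or start a new one.
        run_len = run_len + 1 if b == run_value else 1
        run_value = b
        if run_len > longest.get(b, 0):
            longest[b] = run_len
    return [(value, count, longest[value]) for value, count in counts.most_common(top)]

# Made-up sample: three short fields separated by one or two 0x00 bytes.
sample = b"\x10\x02AB\x00\x10\x03CD\x00\x00\x10\x04EF\x00"
for value, count, run in separator_candidates(sample, top=2):
    print(f"{value:#04x} occurs {count} times, longest run {run}")
```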

    regards chmee
[size=2]phreekz * blog * twitter[/size]

DeafEyeJedi

This is all more than just beautiful. Simply stunning @chmee and thanks for the heads up @jkaura and much thanks to @chris_overseas for your clear pointers. Keep up the great work @agentirons!
5D3.113 | 5D3.123 | EOSM.203 | 7D.203 | 70D.112 | 100D.101 | EOSM2.* | 50D.109

chris_overseas

I've made a bit of progress on solving the XOR encoding. Here's a zip containing the three log picture styles, plus another random one I made. For each, I've also included a copy that (hopefully!) has the XOR encoding removed but should otherwise be identical. Currently it's a fairly manual process to do this but I'll try to automate it when I have time (probably won't be for a few days at least). I'll also try to figure out the layout of the files in some detail. In the meantime, maybe the rest of you can figure out some of the file contents from what I have here?

https://www.dropbox.com/s/1xt92z62fosrsri/Decoded-Styles.zip?dl=0

Notes:

  • The entire pf3 file is XORed apart from the first 12 bytes which are left as-is.
  • I think the XOR operation has a period of 0x200 (512 bytes), though haven't confirmed that 100% yet.
    Edit: turns out this is not true, we'll either need to apply a formula based on two separate keys, or have one huge key (512*513 = 260k)
  • The decoded files won't open in Picture Style Editor, they're just for examining with a hex editor.
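A quick way to test a period guess like the 0x200 one above is to take a recovered key stream (encoded XOR decoded) and check whether it repeats. This is only a sketch: has_period is a made-up helper and the demo streams are synthetic, not real key data.

```python
def has_period(stream: bytes, period: int) -> bool:
    """True if stream[i] == stream[i + period] for every index where both exist."""
    return all(stream[i] == stream[i + period] for i in range(len(stream) - period))

# Size of the "one huge key" alternative mentioned in the edit above.
print(512 * 513)  # 262656 bytes, about 256.5 KiB

# Synthetic demo streams.
print(has_period(bytes(range(4)) * 3, 4))  # True: 00 01 02 03 repeated three times
print(has_period(bytes(range(12)), 4))     # False: the bytes never repeat
```

Note that a stream shorter than the period trivially passes, so the check only means something on a key stream at least a couple of periods long.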

chmee

@chris_overseas !big! kudos!
Do you have an encoding key extracted? Pieces? maybe it's worth looking inside a canon-firmware for this key. a binary search. the longer the better.

Danne

Thanks a lot for digging into this. Will check your files chris_overseas.

chris_overseas

Quote from: chmee on September 29, 2016, 10:03:23 AM
@chris_overseas !big! kudos!
Do you have an encoding key extracted? Pieces? maybe it's worth looking inside a canon-firmware for this key. a binary search. the longer the better.

I don't have the key but it should be trivial to get by XORing the encoded vs decoded files (ignoring the first 0x0C bytes). No need to search the binaries or firmware - it's not stored in its raw value in there anyway.
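Roughly, in Python (a sketch only: extract_key is a made-up name, and the demo uses a dummy 12-byte header plus the first four bytes of @jkaura's example rather than a real file):

```python
HEADER_LEN = 0x0C  # the first 12 bytes are stored as-is, so they carry no key data

def extract_key(encoded: bytes, decoded: bytes, header_len: int = HEADER_LEN) -> bytes:
    """XOR an encoded file against its decoded twin, skipping the plain header."""
    if len(encoded) != len(decoded):
        raise ValueError("encoded and decoded data must be the same length")
    if encoded[:header_len] != decoded[:header_len]:
        raise ValueError("headers differ; these may not be the same picture style")
    return bytes(e ^ d for e, d in zip(encoded[header_len:], decoded[header_len:]))

# Demo with a dummy header and four body bytes from the earlier XOR example.
header = b"HEADER_12BYT"                   # placeholder, exactly 12 bytes
encoded = header + bytes.fromhex("580F2FC2")
decoded = header + b"AAAA"
print(extract_key(encoded, decoded).hex())  # 194e6e83
```

The key you get back is only as long as the file body, so a longer style gives a longer stretch of key.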

a1ex

FYI, properties PROP_PC_FLAVOR[123]_PARAM 0x401000[135] might contain user picture style data. In ML, they are only used to get the picture style names.

Sizes: 16704 on 5D3, 600D, 16680 on 5D2, 550D.

Example:

0x4010001  16704   0x824140 'Flaat_2'             0xffff00fe        0x0        0x0        0x0        0x1   0x219c60 0xfffffffc        0x2 0xfffffffe        0x0 0xdeadbeef 0xdeadbeef   0xfc0040 0xffff2fff        0x0 0xf64ad173 0x7e60894d 0xf452b0c0 ...


HTH

chmee

@chris_overseas
the key (if it's static for all bodies, and it should be) will then be important for encoding newly created files. nonetheless, i'm looking at your decoded files now.

edit: do i see that right? the encoding key is 256kb long. (for pf3) holy sh*t.

chris_overseas

Here's a little Java app that'll convert a pf3 file between its encoded and unencoded forms:

https://www.dropbox.com/s/4x8x9epalybm6aw/Pf3Decoder.jar?dl=0

To use, just run:

java -jar Pf3Decoder.jar input-file [output-file]

Note that a jar file is basically a .zip file. The source code is included inside the jar.

If the input file is encoded, Pf3Decoder will create an unencoded copy. If the input file is decoded, Pf3Decoder will create an encoded copy.
If you don't specify an output file, Pf3Decoder will create an output file with the same name but with a ".decoded" extension. If the input file already has a ".decoded" extension however, it'll create a file without the ".decoded" extension (ie it'll write out a file with the original encoded filename). Be aware that any existing output file with the same name will be overwritten without warning.
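To make the default naming rule concrete, here it is restated as a small Python sketch. This is a paraphrase of the behaviour described above, not the Java source from the jar, and default_output_name is a made-up name.

```python
from pathlib import Path

def default_output_name(input_name: str) -> str:
    """Pick the output name used when no output file is given on the command line."""
    p = Path(input_name)
    if p.suffix == ".decoded":
        # Already a decoded copy: strip the extension to get the encoded name back.
        return str(p.with_suffix(""))
    # Otherwise append ".decoded" to the full original name.
    return str(p.with_name(p.name + ".decoded"))

print(default_output_name("MyStyle.pf3"))          # MyStyle.pf3.decoded
print(default_output_name("MyStyle.pf3.decoded"))  # MyStyle.pf3
```

Since running the tool twice round-trips the name, it is easy to clobber an edited file by accident, so keep backups.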

Danne

Amazing progress and what a helpful tool. Works great over here. Now what to make of the hex info left in there, hehe  :P

Successfully toggled the encoding on...

chris_overseas

Quote from: Danne on September 29, 2016, 03:13:23 PM
Amazing progress and what a helpful tool. Works great over here. Now what to make of the hex info left in there, hehe  :P

Great to hear it's working for you. I'd suggest starting out by taking a pf3, load it in the Style Editor, make a single change, save it with a new name, then compare decoded versions of each to see what changed. Rinse and repeat, documenting any interesting changes you find. Once you start getting an understanding of the file format, try editing the decoded version appropriately then re-encode it and see if the change takes effect as expected in the Style Editor. I wrote the decoder with this workflow in mind so you can decode a file, edit it, then re-encode it quickly/easily. If you have any suggestions for improvements though, let me know.

One thing to try - toggle the byte that is about 20 bytes from the end of the file (just before the FF FE pair) from a 00 to a 01. I suspect that one enables/disables the edit protection.

agentirons

Thanks for the Java tool, chris_overseas! Ran the test like you suggested, by opening PSE and making two pf3 files where the only difference was checking the 'disable subsequent editing' box on save. Compared in a hex editor, turns out that one option changes two bytes - the one you noticed, but also the byte that's 9 bytes in from the end.


(Left side is the locked file, right side is the unlocked file.)

I tried changing only the byte at 0x6a113 from 00 to 01, and when trying to open in PSE it no longer shows the caption/copyright info and gives an error when loading. If you change both bytes to match then the file is successfully unlocked and editable.
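If you don't have a hex editor with a compare mode, a few lines of Python will list the differing bytes between two saves. This is just a sketch; differing_offsets is a made-up name and the demo bytes are invented, not taken from a real pf3.

```python
def differing_offsets(a: bytes, b: bytes):
    """Return (offset, byte in a, byte in b) for every position where a and b differ."""
    if len(a) != len(b):
        raise ValueError("files differ in length; a byte-for-byte diff would be misleading")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Invented example: two buffers that differ in exactly two places.
locked = bytes([0x10, 0x80, 0x01, 0xFF, 0xFE, 0x01, 0x00])
unlocked = bytes([0x10, 0x80, 0x00, 0xFF, 0xFE, 0x00, 0x00])
for offset, left, right in differing_offsets(locked, unlocked):
    print(f"{offset:#x}: {left:02x} -> {right:02x}")
```

Run it on the decoded copies of the two files, otherwise the XOR layer hides what the plain values are.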

agentirons

And here's the same test in pf2 format:


(Locked on the left, unlocked on the right)

Technicolor Cinestyle on the other hand has a completely different end of file, so not sure what to make of that. It also has two sections that according to my hex editor simply run through every keyboard character.

agentirons

I believe I've found some kind of start tag in .pf2 files for the LUT data - "00 10 21 00 04 00 00 00 F0 00", which is always followed by "00" or "01" and then arbitrary data. This holds true for Canon's styles available through their website, Cinestyle, and styles created through PSE.

In PSE styles, that tag starts immediately after the copyright field no matter how long that field is, and the last byte is 01. In Canon's styles it appears after a middle block of mostly empty bytes, and ends with 00. In Cinestyle it's near the copyright field and ends in 01 like PSE, but the data doesn't start immediately like PSE and Canon's.
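Here's a rough Python sketch for locating that byte sequence and reading the byte that follows it. find_start_tags is a made-up name, the demo buffer is invented, and calling the sequence a "start tag" is still only my guess.

```python
START_TAG = bytes.fromhex("00 10 21 00 04 00 00 00 F0 00")

def find_start_tags(data: bytes):
    """Return (offset, following byte) for each occurrence of START_TAG in data."""
    hits = []
    pos = data.find(START_TAG)
    while pos != -1:
        after = pos + len(START_TAG)
        # The byte right after the tag was 00 or 01 in the files I looked at.
        flag = data[after] if after < len(data) else None
        hits.append((pos, flag))
        pos = data.find(START_TAG, pos + 1)
    return hits

# Invented buffer with one occurrence followed by 01.
demo = b"\xAA\xBB" + START_TAG + b"\x01\x99"
print(find_start_tags(demo))  # [(2, 1)]
```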

chmee

looking on the decoded file i have the "feeling" there are kind of tags as in tiff. double bytes, starting with 0x10. btw. for the players and coders, this is the key-file. as @chris_overseas stated, this key is static and equal for all bodies/picturestyles (513*512bytes)

edit: here's a php example of the usage - how to decode

regards chmee

agentirons

Another interesting quirk in PSE - If you open a Canon style and then save out as a new file, the subsequent pf2 contains a copy of both the original style (with tiny modifications) and your adjustments. This doesn't happen if you open your own file and save a new one.

If you decode a double-saved Canon style and compare the two blocks of data inside, the only difference is that the second byte in that start tag becomes 90 in the original block, and stays as 10 in the new block. You can also see that this flag is repeated four times throughout each data block, because the only difference is 90 vs 10. The second block of data will also begin with your new comment/copyright fields.

I made a quick and dirty OSX Automator app that works with Chris's java file, so if you put pfcoder.app and Pf3Decoder.jar in ~/Desktop/PSE, you can add the app to your dock and drag/drop pf2/3 files onto the icon and it will auto run the shell script java -jar Pf3Decoder.jar input-file (no output specified, so it will overwrite stuff). Lot faster than having to revise the terminal command every time you want to decode or encode.

https://drive.google.com/file/d/0B2q7De8nh2q7SzJ4OXp5ZXlKejA/view?usp=sharing

chmee

Here (as php-example) the split by tag 0x10. starting piece as decimal text

16 48-51 seems to be the contrast/saturation and so on values. and yes the dataformat seems to be tiff-like

16 9 0 3 0 0 0 2 0 129
tagstart and type 16 09 [2byte - int]
valuetype 00 03 [2byte - int] (3 means SHORT 2byte)
length 0 0 0 2 [4byte - long]
value 0 129 [value SHORT]

tiff specification 6.0 page 15-16 for valuetype
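the same breakdown as a small python sketch (not the php mentioned above). it assumes the layout guessed here: 2-byte tag, 2-byte type, 4-byte length, all big-endian, followed by the value bytes. parse_record is a made-up name.

```python
import struct

def parse_record(buf: bytes, offset: int = 0):
    """Split one record into (tag, value type, length, raw value bytes)."""
    # ">HHI" = big-endian: 2-byte tag, 2-byte value type, 4-byte length.
    tag, value_type, length = struct.unpack_from(">HHI", buf, offset)
    start = offset + 8
    return tag, value_type, length, buf[start:start + length]

# The example record from above, written as decimal bytes.
record = bytes([16, 9, 0, 3, 0, 0, 0, 2, 0, 129])
tag, value_type, length, value = parse_record(record)
print(hex(tag), value_type, length, int.from_bytes(value, "big"))  # 0x1009 3 2 129
```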

regards chmee

jkaura

Yep, makes sense. By testing with PSE, the tags affected by making adjustments on PSE's Basic tab and saving a file are the following (Tone Curve not included):

- 0x1002: Caption (type 0x0002 ~ ASCII)
- 0x1009: Base Picture Style (type 0x0003 ~ Short)
- 0x100a: Copyright (type 0x0002 ~ ASCII)
- 0x1030: Strength (type 0x0004 ~ Long)
- 0x1031: Contrast (type 0x0004 ~ Long)
- 0x1032: Saturation (type 0x0004 ~ Long)
- 0x1033: Color tone (type 0x0004 ~ Long)

The following are unverified guesses at the semantics of two more tags that also appear to be of Long type (0x0004):
- 0x1034: Fineness
- 0x1035: Threshold

Here are some actual tested hexadecimal responses for changing basic adjustments in PSE.

Base Picture Style:
  1009 0003 0000 0002 0081  # Standard
  1009 0003 0000 0002 0082  # Portrait
  1009 0003 0000 0002 0083  # Landscape
  1009 0003 0000 0002 0084  # Neutral
  1009 0003 0000 0002 0085  # Faithful

Strength:
  1030 0004 0000 0004 0000 0000  # Strength: 0
  1030 0004 0000 0004 0000 000a  # Strength: 10

Contrast:
  1031 0004 0000 0004 ffff fffc  # Contrast: -4
  1031 0004 0000 0004 0000 0000  # Contrast: 0
  1031 0004 0000 0004 0000 0004  # Contrast: 4
 
Saturation:
  1032 0004 0000 0004 ffff fffc  # Saturation: -4
  1032 0004 0000 0004 0000 0000  # Saturation: 0
  1032 0004 0000 0004 0000 0004  # Saturation: 4

Color tone:
  1033 0004 0000 0004 ffff fffd  # Color tone: -3
  1033 0004 0000 0004 0000 0000  # Color tone: 0
  1033 0004 0000 0004 0000 0003  # Color tone: 3

Here is the caption field from Example-Decoded.pf3 (provided by @chris_overseas in a previous message) and guesses for the structure:

  1002 # tag?
  0002 # type: ASCII?
  0000 0020  # length?
  4341 5054 494f 4e5f 4341 5054 494f 4e5f  # "CAPTION_CAPTION_"
  4341 5054 494f 4e5f 4341 5054 494f 4e00  # "CAPTION_CAPTION<NUL>"
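Putting these observations together, here is a Python sketch that walks a run of records and interprets the values. It rests on assumptions: records sit back to back with no padding, Long values are read as signed because -4 shows up as ffff fffc above, and ASCII values end at the first NUL. The name iter_records is made up, and the sample is stitched together from the hex dumps in this post rather than cut from one contiguous file.

```python
import struct

def iter_records(buf: bytes):
    """Yield (tag, value) pairs from consecutive tag/type/length/value records."""
    offset = 0
    while offset + 8 <= len(buf):
        tag, value_type, length = struct.unpack_from(">HHI", buf, offset)
        raw = buf[offset + 8:offset + 8 + length]
        if value_type == 2:                       # ASCII, cut at the first NUL
            value = raw.split(b"\x00", 1)[0].decode("ascii", errors="replace")
        elif value_type == 3 and len(raw) == 2:   # Short
            value = struct.unpack(">H", raw)[0]
        elif value_type == 4 and len(raw) == 4:   # Long, read as signed (assumption)
            value = struct.unpack(">i", raw)[0]
        else:                                     # anything else stays as raw bytes
            value = raw
        yield tag, value
        offset += 8 + length

sample = bytes.fromhex(
    "1009 0003 0000 0002 0084"        # Base Picture Style: Neutral
    "1031 0004 0000 0004 ffff fffc"   # Contrast: -4
    "1033 0004 0000 0004 0000 0003"   # Color tone: 3
)
for tag, value in iter_records(sample):
    print(f"{tag:#06x}", value)
```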

chris_overseas

Data IDs:

0x0000 - KEY
0x1001 - Camera Model
0x1002 - Caption
0x1003 - Remain
0x1004 - Gnz Gamma
0x1005 - Gnz LUT
0x1006 - Lucky Data
0x1007 - Angel Regs
0x1008 - Wk Data
0x1009 - PC set Selected ID
0x100a - Copyright
0x100b - Lucky Enable
0x100c - Lucky Valid
0x100d - Site
0x1010 - Partial Info
0x1011 - Made In Canon
0x1012 - Partial Info Base
0x1018 - RGB Gamma
0x1019 - LAB Gamma
0x1020 - ICC (sRGB)
0x1021 - 3x20 Matrix (sRGB)
0x1022 - Lucky Table (sRGB)
0x1023 - Partial Info (sRGB)
0x1028 - ICC (AdobeRGB)
0x1029 - 3x20 Matrix (AdobeRGB)
0x102a - Lucky Table (AdobeRGB)
0x102b - Partial Info (AdobeRGB)
0x1030 - Sharpness
0x1031 - Contrast
0x1032 - Saturation
0x1033 - Color Tone
0x1050 - LUT Long (sRGB)
0x1051 - LUT Long (AdobeRGB)
0x1052 - LUT Double (sRGB)
0x1053 - LUT Double (AdobeRGB)
0x1054 - User RGB Gamma
0x1055 - RGB Gamma From Camera
0x1060 - Base PS Gamma ID
0x1061 - Defined Color Partial Info (sRGB)
0x1062 - Defined Color Partial Info (AdobeRGB)
0x1063 - Canon Internal Partial Info (sRGB)
0x1064 - Canon Internal Partial Info (AdobeRGB)
0x1065 - User RGB Gamma After Canon
0x1070 - LUT 3D (sRGB)
0x1071 - LUT 3D (AdobeRGB)
0x1080 - Disable Edit
0x1f00 - Camera Lucky Table (sRGB)
0x1f01 - Camera Lucky Table (AdobeRGB)
0x2001 - Sub Copyright
0x210 - Picture Style LUT Param
0x2fff - Sub Terminator
0x8000 - Backup Flag
0x8001 - PF Version
0xa001 - Work Data
0xa002 - Copyright
0xb001 - Select Object
0xfffe - Finalize
0xffff - Terminator


Data Type:
1 - Byte
2 - ASCII
3 - Short
4 - Long
0xffff - Unknown


Type:
0x2e4d524b - MRK
0x2e504632 - PF2
0x2e504644 - PFD
0x2e505345 - PSE
0x2e544344 - TCD
0x2e574244 - WBD
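Each of these type values is just four ASCII characters (a dot plus three letters) packed into a big-endian 32-bit number, which a couple of lines of Python will confirm. The helper name type_code_to_ascii is made up for this sketch.

```python
TYPE_CODES = [0x2e4d524b, 0x2e504632, 0x2e504644, 0x2e505345, 0x2e544344, 0x2e574244]

def type_code_to_ascii(code: int) -> str:
    """Interpret a 32-bit type code as four big-endian ASCII characters."""
    return code.to_bytes(4, "big").decode("ascii")

for code in TYPE_CODES:
    print(f"{code:#010x} -> {type_code_to_ascii(code)}")
```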


Picture Styles:
0x21 - User1
0x22 - User2
0x23 - User3
0x41 - PC1
0x42 - PC2
0x43 - PC3
0x81 - Standard
0x82 - Portrait
0x83 - Landscape
0x84 - Neutral
0x85 - Faithful
0x86 - Monochrome
0x87 - Auto
0x88 - FineDetail