Canon's CR2 Raw File Format Specification

Started by jplxpto, July 30, 2013, 12:59:31 AM


jplxpto



http://wildtramper.com/sw/cr2/cr2.html


Although Canon does not publicly document its CR2 RAW file format, details of the format are scattered in bits and pieces across the web. Compiled here is what is believed to be an accurate specification for the original CR2 compression. Because I own the original Canon 5D, this specification reflects its CR2 formatting. Newer 5D versions have other compression varieties such as sRAW (sRAW1, sRAW2, etc.), and these are not supported by this software.
Included here is the CR2 coder/decoder and a zip file of its C-code implementation. The companion Canon CRW Specification may also be of interest. Another good source of CR2 information can be found in a document titled Understanding What is stored in a Canon RAW .CR2 file, How and Why at http://lclevy.free.fr/cr2/.
The specification is listed below and is included in the cr2.zip (revised 11/10/08) file as comments in the primary code file Cr2Codec.cpp. The zip file contains the following:
    Cr2Codec.cpp - Primary C-code for CR2 coder/decoder
    Cr2Codec.exe - its sample executable
    Cr2Codec.h   - its header file
    MakerNote.h  - A few definitions for Canon's TIFF maker-note
    Prop.h       - Properties type definitions
    RawUtl.cpp   - Various raw utilities
    RawUtl.h     - and its header file
    TiffTags.h   - TIFF tag definitions
    TiffUtl.cpp  - Various TIFF utilities
    TiffUtl.h    - and its header file
    getput.cpp   - Low level input/output
    getput.h     - and its header file
    jpeg.h       - Various JPEG definitions
The executable Cr2Codec.exe must be run in a Windows command shell. Executing the command without parameters yields its syntax as follows:
    Usage: Cr2Codec.exe [options] file.cr2 outfile
    Decompress raw image data of file.cr2
      -b: decode then save binary file (default)
      -c: decode then recode file
      -h: decode then create (text) histogram
      -j: extract JPG image 
      -t: decode then save as 8-bit TIFF file
      -T: decode then save as 16-bit TIFF file
    Version 1.0
    Wildtramper.com, Copyright (c), All rights reserved
Although the -t or -T options create a TIFF file from RAW, expect crude colorization. The -j option is useful to extract the embedded JPG file, and it has Canon's as-shot colorization. This JPG is believed to be optimized for the camera's on-board LCD display -- what the heck, it's a quick view.
/***************************************************************************
* The Canon CR2 file format is an encapsulated TIFF shell having 4 IFD sets.
* These IFDs are different versions of the same image.
*
*   +=====================================+ Start of TIFF/CR2 file
*   | TIFF Header                         |
*   | Size = 8                            |
*   +=====================================+
*   | Various TIFF Tags describing File   | IFD #1 Segment
*   |   EXIF (TIFF subdirectory)          | Canon 5D image size 2496x1664
*   |- - - - - - - - - - - - - - - - - - -|
*   | JPEG data (baseline compression)    |
*   +=====================================+
*   | JpegInterchangeFormat               | IFD #2 Segment
*   |                                     | unknown image size
*   |- - - - - - - - - - - - - - - - - - -|
*   | JPEG Compressed data                |
*   +=====================================+
*   | Few TIFF Tags describing segment    | IFD #3 Segment
*   |                                     | Canon 5D image size 384x256
*   |- - - - - - - - - - - - - - - - - - -|
*   | JPEG data (unknown compression)     |
*   +=====================================+
*   | Few TIFF Tags describing segment    | IFD #4 Segment - RAW image
*   |                                     | Canon 5D image size 4476x2954
*   |- - - - - - - - - - - - - - - - - - -|
*   | JPEG data (lossless compression)    |
*   +=====================================+
*
* A sample parsing of these IFDs is:
* sample.cr2: FileID=II, Ver=2A, IFD #1 w/ 14 Tags
*  TagName(TagID,Len,DataOfst) =Value
*  ImageWidth(0100,1,001A)     =2496
*  ImageLength(0101,1,0026)    =1664
*  BitsPerSample(0102,3,00BE)  =8 8 8
*  Compression(0103,1,003E)    =6  {JpegCompression}
*  Make(010F,6,00C4)        ="Canon"
*  Model(0110,D,00CA)        ="Canon EOS 5D"
*  StripOffsets(0111,1,0062)   =82349
*  Orientation(0112,1,006E)    =1  {TopLeft, Normal}
*  StripByteCounts(0117,1,007A)=1030590
*  XResolution(011A,1,00EA)    =72/1 (=72)
*  YResolution(011B,1,00F2)    =72/1 (=72)
*  ResolutionUnit(0128,1,009E) =2  {InchUnits}
*  DateTime(0132,14,00FA)      ="2006:09:24 07:14:52"
*  Exif(8769,1,00B6)        =270
*   TagName(TagID,Len,DataOfst) =Value
*   ExposureTime(829A,1,0264) =1/80 (=0.0125)
*   FNumber(829D,1,026C) =4/1 (=4)
*   ExposureProgram(8822,1,0130) =3
*   ISOSpeedRatings(8827,1,013C) =100
*   ExifVersion(9000,4,0148) =48 50 50 49
*   DateTimeOriginal(9003,14,0274) ="2006:09:24 07:14:52"
*   DateTimeDigitized(9004,14,0288) ="2006:09:24 07:14:52"
*   ComponentsConfiguration(9101,4,016C) =1 2 3 0
*   ShutterSpeedValue(9201,1,029C) =417792/65536 (=6.375)
*   ApertureValue(9202,1,02A4) =262144/65536 (=4)
*   ExposureBiasValue(9204,1,02AC) =0/1 (=0)
*   MeteringMode(9207,1,019C) =5
*   Flash(9209,1,01A8) =16
*   FocalLength(920A,1,02B4) =24/1 (=24)
*   MakerNote(927C,12548,02BC) =29 0 1 0 3 0 46 0 0 0 30 4 0 0 2 ...
*   UserComment(9286,108,12804) =0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
*   FlashpixVersion(A000,4,01D8) =48 49 48 48
*   ColorSpace(A001,1,01E4) =1
*   PixelXDimension(A002,1,01F0) =4368
*   PixelYDimension(A003,1,01FC) =2912
*   Interoperability(A005,1,0208) =76044
*   FocalPlaneXResolution(A20E,1,1292A)  =4368000/1415 (=3086.93)
*   FocalPlaneYResolution(A20F,1,12932)  =2912000/942 (=3091.3)
*   FocalPlaneResolutionUnit(A210,1,022C)=2  {InchUnits}
*   CustomRendered(A401,1,0238) =0
*   ExposureMode(A402,1,0244) =0
*   WhiteBalance(A403,1,0250) =0
*   SceneCaptureType(A406,1,025C) =0
* sample.cr2: FileID=II, Ver=2A, IFD #2 w/ 2 Tags
*  TagName(TagID,Len,DataOfst)       =Value
*  JPEGInterchangeFmt(0201,1,12944)   =76348
*  JPEGInterchangeFmtLen(0202,1,12950)=6001
* sample.cr2: FileID=II, Ver=2A, IFD #3 w/ 11 Tags
*  TagName(TagID,Len,DataOfst) =Value
*  ImageWidth(0100,1,12962) =384
*  ImageLength(0101,1,1296E) =256
*  BitsPerSample(0102,3,129E2) =8 8 8
*  Compression(0103,1,12986) =6  {JpegCompression}
*  Photometric(0106,1,12992) =2  {RGB}
*  StripOffsets(0111,1,1299E) =1112939
*  SamplesPerPixel(0115,1,129AA)=3
*  RowsPerStrip(0116,1,129B6) =256
*  StripByteCounts(0117,1,129C2)=294912
*  PlanarConfig(011C,1,129CE) =1  {ChunkyFormat}
*  UndefinedTag(C5D9,1,129DA) =2
* sample.cr2: FileID=II, Ver=2A, IFD #4 w/ 6 Tags
*  TagName(TagID,Len,DataOfst) =Value
*  Compression(0103,1,129F2) =6  {JpegCompression}
*  StripOffsets(0111,1,129FE) =1407851
*  StripByteCounts(0117,1,12A0A)=9920742
*  UndefinedTag(C5D8,1,12A16) =1
*  UndefinedTag(C5E0,1,12A22) =1
*  UndefinedTag(C640,3,12A36) =1 2238 2238
*
* The first IFD is a low pixel count JPEG image, and for the Canon 5D its image
* size is 2496x1664.  Also included as part of the first IFD are an assortment
* of image and shot data, such as Make, Model, and abundant EXIF information.
* The EXIF information is itself an IFD subdirectory, and within the EXIF
* directory is the Tag 'MakerNote' which contains more shot information such as
* lens type and serial numbers.  MakerNote has a TIFF-like IFD structure,
* but the "TagType" field follows different rules, see below.
*
* The second IFD is another JPEG rendered image, and it is described by two
* TIFF Tags.  It is not clear what the purpose of this image is, but noting
* that the section size is very small (6001 bytes in the sample), the image
* dimensions must also be small.  It is likely a thumbnail.  Peeking into this
* section one finds four JPEG markers: SOI (start of image), DHT (define
* Huffman tables), DQT (define quantization tables), and SOS (start of scan).
* There is no SOFn marker to identify the image dimensions.
*
* The third IFD is another JPEG rendered image.  In this case there are several
* TIFF tags, and these give the size of the image as 384x256.  Certainly a
* small image. Although the compression is tagged as JPEG, it is believed this
* is not the case since (1) the image data contains no JPEG markers and (2) the
* size of the image data is exactly three times the rows times columns (294912
* = 3x384x256).  Thus, it is believed the image data is uncompressed RGB.
*
* The fourth IFD is the sweet spot of the CR2 file.  It is a lossless JPEG
* compressed image with the dimensions of the camera's photo sensor.  Canon
* adds three proprietary TIFF tags which are of unknown meaning, except that
* Tag 0xC640, with values {1, 2238, 2238} in the sample file, is described by
* Dave Coffin as being 'slice' information. Peeking into the JPEG code of this
* section one finds the following markers:  SOI (start of image), DHT (define 2
* Huffman tables), SOF3 (start of frame for lossless, sequential,
* non-differential, Huffman coding), and SOS (start of scan).  There are no
* RSTm (restart modulo-8) markers.  The section is located at the end of the
* file, and it is this location which makes it easy to replace a 'value added'
* version of the image with the original.
*
* Decoding lossless JPEG is similar to (and simpler than) decoding CRW images.
* Like CRW images, the compressed data is organized as a concatenation of a
* Huffman code (i.e. HufCode or codeword) followed by a variable length
* difference code (i.e. Diff) bit stream as shown below.  Unlike CRW images the
* HufCode only conveys the number of bits required by Diff.
*
*       +----------------+-------------+
*   ... | HufCode[nBits] | Diff[nBits] | ...
*       +----------------+-------------+
*
* Like CRW images, the compressed data is row organized with interleaved BAYER
* grid array data for that row.  So with a BAYER grid of RG/GB, the even rows
* have interleaved HufCode/Diff data for ...RGRGRG..., while for the odd rows
* it is ...GBGBGB...
*
* Unlike CRW images, there are no 64 pixel blocks; rather, the block is the
* width of a row.  The initial values at the beginning of each row are the
* RG/GB values of its nearest previous row beginning.  For the first row, the
* initial row values are 1/2 the bit range defined by the precision.  Thus for
* 12-bit precision:
*     Pix[Row, Col] = Val
*     Pix[0,0] = (1 << (Precision - 1)) + Diff
*     Pix[0,1] = (1 << (Precision - 1)) + Diff
* and for n >= 1
*     Pix[n,0] = Pix[n-2,0] + Diff
*     Pix[n,1] = Pix[n-2,1] + Diff
* while for any other Row/Column
*     Pix[R,C] = Pix[R,C-2] + Diff
*
* Also unlike CRW images the mapping of decoded rows into the image buffer is
* divided into four quadrants. It is unclear why Canon chose this method, but
* nonetheless it's there.  The RAW image is divided into 4 equal sized
* quadrants:
*     +------------------------+------------------------+ RAW Image
*     |                        |                        |
*     |        Quad 0          |        Quad 1          |
*     |                        |                        |
*     +------------------------+------------------------+
*     |                        |                        |
*     |        Quad 2          |        Quad 3          |
*     |                        |                        |
*     +------------------------+------------------------+
* The finished image interleaves the row segments of quadrants 0/1 and
* interleaves the row segments of quadrants 2/3, with the combined quadrants
* 0/1 located to the left and the combined quadrants 2/3 concatenated to the
* right:
*     +------------------------+------------------------+ Finished Image
*     |      Quad0.Row0        |      Quad2.Row0        |
*     |      Quad1.Row0        |      Quad3.Row0        |
*     |      Quad0.Row1        |      Quad2.Row1        |
*     |      Quad1.Row1        |      Quad3.Row1        |
*     |      etc               |      etc               |
*     |      Quad0.RowN        |      Quad2.RowN        |
*     |      Quad1.RowN        |      Quad3.RowN        |
*     +------------------------+------------------------+
* So it can be said that Quads 0 and 2 contain the same two BAYER grid colors
* and Quads 1 and 3 contain the other two BAYER grid colors.
*
* Like CRW images, the actual difference value contained in the Diff[nBits] is
* organized loosely as a signed magnitude number, but has its own specific
* rules.  First, it should be noted that the value of nBits determines the
* range of the number. As an example if nBits = 3, then the magnitude (of a
* DiffVal that can be +/- this magnitude) is 4 <= magnitude <= 7, or more
* specifically:
*
*      (1 << (nBits - 1)) <= DiffVal <=  ((1 << nBits) - 1)    // + DiffVal
*     -(1 << (nBits - 1)) >= DiffVal >= -((1 << nBits) - 1)    // - DiffVal
*
* If the most significant bit (MSB) of Diff[nBits] is one, then DiffVal is
* positive and has a magnitude directly represented by binary value of these
* bits.  If however, the MSB is zero, then DiffVal is negative and has a
* magnitude that is the complement of these bits.  As an example if
* Diff[nBits=3] = 101, then DiffVal is 5, whereas if Diff[nBits=3] = 001, then
* DiffVal is -6.
*
* The size of the decompressed image is slightly larger than the final image
* size.   A few extra 'black' rows are added on top and bottom and a few extra
* 'black' columns may be added to the left and right (but none have been
* observed).  These 'black' rows/columns along with a few additional RG/GB
* rows/columns make up a set of Trim rows/columns which are ultimately deleted
* to get the final 'as advertised' RAW image size.  Other than noting that some
* of the Trim is 'black,' it is beyond the scope of this description to
* determine the method to find the optimum Trim rows/columns.  As an
* observation, customized Trim may (or it may not) be encoded in the Firmware
* of the camera so that the final calibration of the image sensor reflects
* this Trim in a manner that places the center of the image sensor at the lens
* center.
*
* TIFF MakerNote:
*
* The TIFF Tag called MakerNote may be used by a manufacturer to embed any type
* of information.  Canon uses this segment.  Parsing of this proprietary
* segment follows the following data structures:
*
* typedef struct
* {
*     tU16  TagID;
*     tU16  TagType;
*     tU32  TagCnt;
*     tU32  DataOfst;
* } tREC;
*
* typedef struct
* {
*     tU16  numRecords;
*     tREC  RecordDir[numRecords];
*     tU8   Heap[];
* } tMAKERNOTE;
*
* The ordering of bytes is 'little endian' (Intel), but this is probably a
* function of the overall TIFF byte ordering.  The first two bytes identify
* the number of records (numRecords), and this is followed by the record
* directory (RecordDir), and this is followed by the Heap of various data.
* Each record in the directory consists of 12 bytes with 4 fields.  The first
* field is the tag identification (TagID), followed by the tag type (TagType),
* followed by the tag count (TagCnt), and finally followed by data offset
* (DataOfst) or if the TagType is 4 then this field is immediate data.  This
* format closely, but not exactly, follows standard TIFF.
*
* The TagType fields may have the following meanings:
* Type = 2:  ASCII data
* Type = 3:  tU16 data
* Type = 4:  immediate data
* Type = 7:  unknown
*
* The TagID fields may have the following meanings:
* TagID = 0x01: Camera settings 1
* TagID = 0x04: Camera settings 2
* TagID = 0x06: Camera model (string)
* TagID = 0x07: Camera F/W version (string)
* TagID = 0x08: Image number ???
* TagID = 0x09: Owner name (string)
* TagID = 0x0C: Camera serial number (immediate data number)
* TagID = 0x0F: Custom functions
* TagID = 0x95: Lens model (string)
*
* The following is a sample extraction:
*
* 02BC: 1D 00  =NumRecord
* -ID-- -Type  ---Cnt--- --Ofst---    #
* 02BE: 01 00 03 00 2E 00 00 00 1E 04 00 00   1 - CameraSettings1
* 02CA: 02 00 03 00 04 00 00 00 7A 04 00 00   2
* 02D6: 03 00 03 00 04 00 00 00 82 04 00 00   3
* 02E2: 04 00 03 00 22 00 00 00 8A 04 00 00   4 - CameraSettings2
* 02EE: 06 00 02 00 0D 00 00 00 CE 04 00 00   5 - "Canon EOS 5D"
* 02FA: 07 00 02 00 18 00 00 00 EE 04 00 00   6 - "Firmware Version 1.1.0"
* 0306: 09 00 02 00 20 00 00 00 06 05 00 00   7 - OwnerName
* 0312: 0C 00 04 00 01 00 00 00 1F 15 CE 42   8 - CameraSerialNum = 0x42CE151F
* 031E: 0D 00 07 00 00 04 00 00 26 05 00 00   9
* 032A: 0F 00 03 00 17 00 00 00 76 09 00 00   a - CustomFunctions
* 0336: 10 00 04 00 01 00 00 00 13 02 00 80   b
* 0342: 12 00 03 00 28 00 00 00 26 09 00 00   c
* 034E: 13 00 03 00 04 00 00 00 A4 09 00 00   d
* 035A: 15 00 04 00 01 00 00 00 00 00 00 A0   e
* 0366: 19 00 03 00 01 00 00 00 01 00 00 00   f
* 0372: 83 00 04 00 01 00 00 00 00 00 00 00  10
* 037E: 93 00 03 00 10 00 00 00 AC 09 00 00  11
* 038A: 95 00 02 00 40 00 00 00 CC 09 00 00  12 = "EF24-105mm f/4L IS USM"
* 0396: 96 00 02 00 10 00 00 00 0C 0A 00 00  13
* 03A2: A0 00 03 00 0E 00 00 00 1C 0A 00 00  14
* 03AE: AA 00 03 00 05 00 00 00 38 0A 00 00  15
* 03BA: B4 00 03 00 01 00 00 00 01 00 00 00  16
* 03C6: E0 00 03 00 11 00 00 00 42 0A 00 00  17
* 03D2: D0 00 04 00 01 00 00 00 00 00 00 00  18
* 03DE: 01 40 03 00 1C 03 00 00 64 0A 00 00  19
* 03EA: 02 40 03 00 66 2B 00 00 9C 10 00 00  1a
* 03F6: 05 40 07 00 88 C0 00 00 68 67 00 00  1b
* 0402: 08 40 03 00 03 00 00 00 F0 27 01 00  1c
* 040E: 09 40 03 00 03 00 00 00 F6 27 01 00  1d
*
* Heap
* 0410:    00 00 00 00 00 5C 00    ....\.
* 0420: 02 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00  ................
* 0430: 07 00 00 00 01 00 00 00 00 00 00 00 FF 7F FF 7F  ................
* 0440: 03 00 02 00 00 00 03 00 FF FF 00 00 69 00 18 00  ............i...
* 0450: 01 00 80 00 20 01 00 00 00 00 00 00 00 00 FF FF  .... ...........
* 0460: FF FF FF FF 00 00 00 00 00 00 00 00 FF FF FF FF  ................
* 0470: 00 00 00 00 FF 7F FF FF FF FF 02 00 69 00 C9 05  ............i...
* 0480: D2 03 00 00 00 00 00 00 00 00 44 00 00 00 A0 00  ..........D.....
* 0490: E4 00 B4 00 D4 00 00 00 00 00 03 00 00 00 08 00  ................
* 04A0: 08 00 95 00 00 00 00 00 00 00 00 00 00 00 01 00  ................
* 04B0: 00 00 00 00 B4 00 D0 00 91 00 00 00 00 00 F8 00  ................
* 04C0: FF FF FF FF FF FF FF FF 00 00 00 00 00 00 43 61  ..............Ca
* 04D0: 6E 6F 6E 20 45 4F 53 20 35 44 00 00 00 00 00 00  non EOS 5D......
* 04E0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 46 69  ..............Fi
* 04F0: 72 6D 77 61 72 65 20 56 65 72 73 69 6F 6E 20 31  rmware Version 1
* 0500: 2E 31 2E 30 00 00 00 00 00 00 00 00 00 00 00 00  .1.0............
* 0510: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
***************************************************************************/
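
As an illustration of the Diff[nBits] rules in the specification above, here is a minimal C sketch that turns an already-extracted Diff field into a signed DiffVal. The function name diff_value is hypothetical and is not taken from Cr2Codec.cpp. Treating nBits = 0 as a difference of zero follows standard JPEG practice and is an assumption here, since the text above does not cover that case.

```c
#include <assert.h>

// Convert an nBits-wide Diff field (already read from the bit stream,
// most significant bit first) into a signed difference value.
// Hypothetical helper; this sketch handles nBits in the range 0..15 only.
static int diff_value(unsigned bits, int nBits)
{
    assert(nBits >= 0 && nBits <= 15);
    if (nBits == 0)
        return 0;                      // assumption: empty field means no change
    assert(bits < (1u << nBits));      // caller must pass exactly nBits bits
    if (bits & (1u << (nBits - 1)))
        return (int)bits;              // MSB is one: positive, value used as-is
    // MSB is zero: negative, magnitude is the nBits-wide complement
    return -(int)(~bits & ((1u << nBits) - 1u));
}
```

With the examples from the text, diff_value(0x5, 3) gives 5 and diff_value(0x1, 3) gives -6.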
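
The first-row prediction rule in the specification above (Pix[0,0] and Pix[0,1] start from half the bit range, and every other column is predicted from the pixel two columns to its left) might be sketched in C as follows. This is only a sketch: the name rebuild_first_row is hypothetical, the differences are assumed to be decoded already, and no range clamping is attempted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Rebuild the first decoded row from its per-pixel differences.
// Columns 0 and 1 are predicted from 1 << (precision - 1); every later
// column is predicted from the pixel two columns to its left, which has
// the same Bayer color.  Hypothetical helper, not from Cr2Codec.cpp.
static void rebuild_first_row(const int *diff, uint16_t *pix,
                              size_t width, int precision)
{
    assert(diff != NULL && pix != NULL);
    assert(precision >= 2 && precision <= 16);
    const int half = 1 << (precision - 1);
    for (size_t c = 0; c < width; c++) {
        int pred = (c < 2) ? half : pix[c - 2];
        pix[c] = (uint16_t)(pred + diff[c]);
    }
}
```

For 12-bit precision the starting prediction is 2048, so differences of {10, -5, 3, 2} yield the pixels {2058, 2043, 2061, 2045}.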
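
The MakerNote layout described in the specification above (a two-byte record count followed by 12-byte records) can be read with a few lines of C. The sketch below assumes little-endian data, as in the sample; the names MnRecord, mn_num_records and mn_record are hypothetical stand-ins for the tREC and tMAKERNOTE structures and are not the functions used in Cr2Codec.cpp.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t tag_id;
    uint16_t tag_type;
    uint32_t tag_cnt;
    uint32_t data_ofst;  // offset to the data, or the value itself if immediate
} MnRecord;

// Little-endian readers for 16-bit and 32-bit fields.
static uint16_t rd16le(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32le(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

// Number of records, stored in the first two bytes of the MakerNote.
static uint16_t mn_num_records(const uint8_t *mn)
{
    assert(mn != NULL);
    return rd16le(mn);
}

// Decode record idx (0-based) from the directory that follows the count.
static MnRecord mn_record(const uint8_t *mn, uint16_t idx)
{
    assert(mn != NULL && idx < rd16le(mn));
    const uint8_t *p = mn + 2 + (size_t)idx * 12;
    MnRecord r = { rd16le(p), rd16le(p + 2), rd32le(p + 4), rd32le(p + 8) };
    return r;
}
```

Applied to the sample record at 0312 (0C 00 04 00 01 00 00 00 1F 15 CE 42), this yields TagID 0x0C, TagType 4, TagCnt 1 and the immediate value 0x42CE151F, matching the camera serial number shown in the listing.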

epearson

Hello,

I tried to click on the link above for the cr2codec.exe and source, and apparently that page doesn't exist anymore.  Can it be found elsewhere?  Or is there another utility that will decode CR2 files to raw binary?

Thanks.

gerk.raisen


bdwallis

Does anyone know of an update to the Cr2Codec code?
While it will go in and grab the JPEG from my 600D, it will not decode the image
or do any of the TIFF file creations.
Thanks    BWallis