JPEG – JPG Segment Length Code

I am trying to write some code to extract Exif information from JPG.

Exif is stored in the APP1 segment of the JPG file. According to the Exif spec, the APP1 segment should start like this:

FF E1 // APP1 segment marker
nn nn // Length of segment
45 // 'E'
78 // 'x'
69 // 'i'
66 // 'f'

and it continues until there is an FF followed by either FF or something other than 00.

Viewing the JPG in a hex editor, I can see the FF E1 marker and the Exif string, but I have trouble decoding the length bytes. An example: for one JPG, my hex editor tells me that the APP1 segment is 686 bytes long, but the length bytes are F7 C8.

How should I use these bytes to come up with 686 decimal?

Edit: This is the first part of the sample file:

FF D8 FF E1 F7 C8 45 78 69 66 00 00 4D 4D 00 2A 00 00 00 08

Edit: Actually, I think I might know what's going on here. Does the APP1 segment actually "include" other marker segments? For example, if the thumbnail data is considered to be inside APP1, the length seems more reasonable. Can anyone confirm/deny this?

It turns out that the APP1 segment includes the thumbnail (see the linked Exif document and scroll down to logical page 12), so 686 is a red herring (probably the number of bytes before the thumbnail). F7 C8, read as a big-endian 16-bit value (63432 decimal), is the actual number of bytes up to the DQT segment; it is so large because the segment contains a thumbnail.
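For anyone who wants to check the length bytes in code, here is a minimal Python sketch. The function name `find_app1` is my own, and it assumes that every marker before the scan data carries a two-byte length, which is the usual case but not guaranteed. The length is big-endian and counts its own two bytes, but not the FF E1 marker in front of it.

```python
import struct

def find_app1(data: bytes):
    """Return (offset of the length field, segment length) for the
    first APP1 segment, or None if no APP1 segment is found.

    Simplification: assumes each marker before SOS is followed by a
    two-byte length (no standalone markers or FF fill bytes).
    """
    if data[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker (FF D8)")
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = data[pos + 1]
        # Two big-endian bytes; the value includes these two bytes
        # themselves but not the FF xx marker before them.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xE1:   # APP1
            return pos + 2, length
        if marker == 0xDA:   # SOS: compressed image data follows
            break
        pos += 2 + length    # skip the marker plus the whole segment
    return None

# First bytes of the sample file from the question
sample = bytes.fromhex(
    "FF D8 FF E1 F7 C8 45 78 69 66 00 00 4D 4D 00 2A 00 00 00 08"
)
length_offset, length = find_app1(sample)
print(length_offset, length)    # 4 63432
print(length_offset + length)   # 63436, offset of the next marker
```

With the sample header, the next marker (the DQT segment in this file) should therefore sit at byte offset 63436, well past the 686 bytes the hex editor reported.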

