JPEG Orientation

Working with the JPEG/EXIF orientation tag, by

When shooting a photograph in portrait orientation, cameras generally recognize the situation and are able to store the resulting image such that the top of the photo corresponds with the top of the scene. Yet somehow, portrait photos sometimes show up in the wrong orientation; the top of the image on screen suddenly corresponds to the top of the camera as it was when the photo was taken. The culprit in this situation is lack of support for the EXIF orientation tag.

The EXIF orientation tag

JPEG is a standard for compressing and encoding images. However, a .jpg file generally contains information in addition to the image. The file format of such a .jpg file is usually the Exchangeable image file format, EXIF, which specifies how to include such details: the geographic coordinates where the photo was taken, a timestamp, details about the camera that was used and its settings and, the subject of our study, the orientation in which the camera was held as it took the photo.

A camera will always store the image data such that the top of the image corresponds to the top of the camera. It will use its orientation sensor to find a value for the orientation tag to allow compliant software to show the photo the right way up. This is a very simple strategy to implement in a camera compared to rotating the full image data to match, at the expense of requiring all JPEG decoders to be able to compensate for this. As you no doubt have realized, this is a complication many decoders simply ignore.
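To make the tag a little more concrete, here is a minimal sketch of how software might locate it, using only the Python standard library. It assumes a well-formed JPEG whose EXIF data sits in an APP1 segment and whose orientation entry lives in the first image file directory. The tag number 0x0112 comes from the EXIF specification; the function names `read_orientation` and `_orientation_from_tiff` are made up for this illustration, and a real decoder would need far more validation.

```python
import struct

ORIENTATION_TAG = 0x0112  # EXIF tag number for orientation


def _orientation_from_tiff(tiff: bytes) -> int:
    """Scan the first IFD of a TIFF-structured EXIF block for the orientation."""
    endian = {b"II": "<", b"MM": ">"}.get(tiff[:2])
    if endian is None:
        return 1
    magic, ifd_offset = struct.unpack(endian + "HI", tiff[2:8])
    if magic != 42:
        return 1
    (count,) = struct.unpack(endian + "H", tiff[ifd_offset:ifd_offset + 2])
    for i in range(count):
        start = ifd_offset + 2 + 12 * i          # each IFD entry is 12 bytes
        tag, kind, n = struct.unpack(endian + "HHI", tiff[start:start + 8])
        if tag == ORIENTATION_TAG and kind == 3 and n == 1:   # one SHORT value
            (value,) = struct.unpack(endian + "H", tiff[start + 8:start + 10])
            return value if 1 <= value <= 8 else 1
    return 1


def read_orientation(data: bytes) -> int:
    """Return the orientation tag (1..8) found in JPEG bytes, or 1 if absent."""
    if data[:2] != b"\xff\xd8":                  # every JPEG starts with SOI
        raise ValueError("not a JPEG stream")
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])  # counts itself
        segment = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and segment[:6] == b"Exif\x00\x00":
            return _orientation_from_tiff(segment[6:])
        if marker == 0xDA:                       # compressed image data follows
            break
        pos += 2 + length
    return 1
```

In this sketch, files without EXIF data or without an orientation entry are treated as tag value 1, meaning no correction is needed.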

The orientation tag can assume one of eight values: one for each of the four rotations by a multiple of 90°, and one for a mirrored version of each of those rotations. We will develop an algorithm to correct the orientation of a .jpg file where the presence of the mirrored variants will make sense even if they do not make sense from a camera user's perspective.

The eight possible orientations are presented in the following table. The correct rendering in all cases looks like a normal F. Software that does not take the orientation tag into account will render a result that looks like the second row. The images in this row do not have the EXIF orientation tag set. The images in the last row have the orientation tag set, and compliant software should render the last row to look like the first row.

EXIF Orientation Tag: 1 2 3 4 5 6 7 8
Correct rendering:
Typical incorrect rendering:
Your browser's attempt:

You can download these images to test your favorite photo software. As a baseline, you can use ImageMagick, which explicitly handles the orientation tag when you specify the -auto-orient command line option:

# Display without orientation correction:
display f6t.jpg

# Explicitly correct orientation:
display -auto-orient f6t.jpg

Correcting the orientation

We could make a table of transformations to apply to images with given orientation tags. After all, eight different transformations and eight different code paths is not that much:

Tag  Operation
1    do nothing
2    flip horizontally
3    rotate 180°
4    flip vertically

But we will not do that. Instead we will exploit a structure in the values allowing us to consider only three operations and combinations of those.

For eight different values, it is tempting to use values in the range [0..7] rather than [1..8] which is used in EXIF. If we subtract one from the tag value, maybe we can discern some binary structure?

EXIF Orientation Tag: 1 2 3 4 5 6 7 8
Subtract one: 0 1 2 3 4 5 6 7
In binary: 000 001 010 011 100 101 110 111
Base image:

Let's consider the different cases. Which transformation should we apply to clear bit 0? In other words: which transformation converts image 001 to image 000? A horizontal flip. For bit 1, image 010 to image 000, we need to rotate by 180°. Finally, for bit 2, image 100 to image 000, we need to flip along the diagonal running from top left to bottom right, that is, swap rows and columns.

It turns out that we can resolve all the remaining cases by combining these fundamental operations according to the bit patterns of the orientation tags:

Binary tag: 000 001 010 011 100 101 110 111
Base image:
Flip diagonally (1__):
Rotate 180° (_1_):
Flip horizontally (__1):
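The decomposition above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the image is a plain list of rows, the function names are invented here, and the three operations are applied as separate passes in the order of the table rows (diagonal flip first, horizontal flip last).

```python
def flip_diagonal(pixels):
    """Mirror along the top-left to bottom-right diagonal (swap rows and columns)."""
    return [list(column) for column in zip(*pixels)]


def rotate_180(pixels):
    """Reverse the row order and reverse every row."""
    return [row[::-1] for row in pixels[::-1]]


def flip_horizontal(pixels):
    """Reverse every row, leaving the row order alone."""
    return [row[::-1] for row in pixels]


def correct_orientation(pixels, tag):
    """Turn stored pixels into upright pixels for an EXIF orientation tag 1..8."""
    bits = tag - 1
    if bits & 0b100:
        pixels = flip_diagonal(pixels)
    if bits & 0b010:
        pixels = rotate_180(pixels)
    if bits & 0b001:
        pixels = flip_horizontal(pixels)
    return pixels


stored = [[1, 2, 3],
          [4, 5, 6]]
print(correct_orientation(stored, 6))  # [[4, 1], [5, 2], [6, 3]]
```

For tag 6 the result is the stored grid rotated a quarter turn clockwise, which is what a table-driven implementation would have done for that value.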

As an implementation concern, it is worth noting that the transformations can be expressed as transformation matrices, which can be combined into one composite transformation before it is applied to the image data. There should be no need to transform the image data in more than one pass. For completeness, the different transformation matrices are:

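$$
T_{\text{flip-diagonal}} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
T_{\text{rotate-180}} = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad
T_{\text{flip-horizontal}} = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}
$$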

The algorithm can thus be described by this pseudocode:

x = value of EXIF orientation tag
y = x - 1

t = identity matrix

if y & 100b != 0 then t = t * Tflip-diagonal
if y & 010b != 0 then t = t * Trotate-180
if y & 001b != 0 then t = t * Tflip-horizontal

apply t to image
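A possible Python rendering of this pseudocode follows. It rests on assumptions the text does not spell out: pixel positions are treated as row vectors (x, y) measured from the image centre and multiplied as `position * t`, and the image is a plain list of rows. The helper names are invented for this sketch.

```python
T_FLIP_DIAGONAL = [[0, 1], [1, 0]]
T_ROTATE_180 = [[-1, 0], [0, -1]]
T_FLIP_HORIZONTAL = [[-1, 0], [0, 1]]


def matmul(a, b):
    """Multiply two 2x2 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def composite_matrix(tag):
    """Combine the fundamental matrices selected by the bits of tag - 1."""
    y = tag - 1
    t = [[1, 0], [0, 1]]
    if y & 0b100:
        t = matmul(t, T_FLIP_DIAGONAL)
    if y & 0b010:
        t = matmul(t, T_ROTATE_180)
    if y & 0b001:
        t = matmul(t, T_FLIP_HORIZONTAL)
    return t


def apply_matrix(pixels, t):
    """Move every pixel to its new position in a single pass."""
    h, w = len(pixels), len(pixels[0])
    swaps_axes = t[0][0] == 0            # true whenever the diagonal flip is involved
    out_w, out_h = (h, w) if swaps_axes else (w, h)
    out = [[None] * out_w for _ in range(out_h)]
    for y in range(h):
        for x in range(w):
            # Doubled, centred coordinates keep everything in integers.
            cx, cy = 2 * x - (w - 1), 2 * y - (h - 1)
            nx = cx * t[0][0] + cy * t[1][0]
            ny = cx * t[0][1] + cy * t[1][1]
            out[(ny + out_h - 1) // 2][(nx + out_w - 1) // 2] = pixels[y][x]
    return out


stored = [[1, 2, 3],
          [4, 5, 6]]
print(composite_matrix(6))                       # [[0, 1], [-1, 0]]
print(apply_matrix(stored, composite_matrix(6)))  # [[4, 1], [5, 2], [6, 3]]
```

With the opposite (column vector) convention the matrices would have to be multiplied in the reverse order, so the convention is worth pinning down before reusing the composite.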

Other failure modes

Gwenview, the standard image viewer in KDE, has for a long time been a good citizen, handling the orientation tag properly. Recently, however, the library underlying Gwenview, Qt, has also acquired this feature. This results in the good citizen suddenly double compensating for the stored orientation. This is an extra fun failure mode, because it looks a lot like the standard failure mode where no effort is spent on correcting. It ends up working this way because all the fundamental transformations are their own inverse, and the only pair that does not commute is the diagonal flip and the horizontal flip. Those two are combined only in tag values 6 and 8, where the correction is a quarter turn and applying it twice gives a half turn. Double compensating is therefore distinguishable from not compensating at all only for orientation tag values 6 and 8.

EXIF Orientation Tag: 1 2 3 4 5 6 7 8
Uncorrected images:
Singly corrected images:
Doubly corrected images:
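The claim about tag values 6 and 8 is easy to check on a toy image. The sketch below assumes the bitwise correction developed earlier on this page, applied to a small grid of numbers standing in for pixels; the function name `correct` is chosen for this example only.

```python
def correct(pixels, tag):
    """Apply the bitwise correction for an EXIF orientation tag to a grid."""
    bits = tag - 1
    if bits & 0b100:                      # flip along the main diagonal
        pixels = [list(column) for column in zip(*pixels)]
    if bits & 0b010:                      # rotate by 180 degrees
        pixels = [row[::-1] for row in pixels[::-1]]
    if bits & 0b001:                      # flip horizontally
        pixels = [row[::-1] for row in pixels]
    return pixels


stored = [[1, 2, 3],
          [4, 5, 6]]

# Tags for which correcting twice does NOT give back the stored image.
distinguishable = [tag for tag in range(1, 9)
                   if correct(correct(stored, tag), tag) != stored]
print(distinguishable)  # [6, 8]
```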

Acknowledgements

Thanks to the JPEG Club for explaining the tag and, I guess, for maintaining jpegtran, which I used to generate the example images on this page.

A great thanks to pyexiv2, which I used to set the EXIF orientation tag of .jpg files without preexisting EXIF data. I'd prefer to use Python 3 and GExiv2, but GExiv2 does not seem to be packaged for Ubuntu 15.04, so there you go.

, 2015