JPEG

The mysterious world of the jpeg file format explained

Benjamin Kenwright

Those of you who are about to set out on the journey of discovering the jpeg file format had better be prepared. I believe that the principles behind it are easier to grasp than the actual coding of it.

The jpeg file format is all around you. It's probably one of the most widely used image formats, it's platform independent, and one of its biggest uses is on the internet.

Why use the jpeg file format? Its biggest selling point is the fact that you can take a 3 MB bitmap file (coolimage.bmp) and convert it to a jpg of only a few hundred kilobytes instead (e.g. coolimage.jpg).

Of course it does have its downside: it's a lossy image format, so the image loses some of its original detail... but usually the loss is not noticeable to the human eye.

Inside

Give me a jpeg, and let's rip it to pieces!

How it's laid out, and how the information is stored, isn't too hard to grasp. Simply put, the jpeg file format is arranged as a series of sections. Each section starts with a 2-byte marker holding a value which we can look up (0xffda, for example), and the marker is followed by a further 2 bytes which tell us how many bytes make up that section.

One exception!!... There are a few uni-markers, as they say! A uni-marker is just a 2-byte ID which has some meaning all on its own, such as SOI (Start Of Image) or EOI (End Of Image).

SO A UNI-MARKER DOES NOT HAVE 2 BYTES FOLLOWING IT TO GIVE ITS SIZE.

I've emphasised the one exception because it can make decoding corrupted jpgs very challenging: if the end marker is missed, the decoder will run past the end of the data block. But more on that later.
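The uni-marker rule can be sketched as a small helper. This is only an illustration under stated assumptions: the name marker_has_size is hypothetical, and the list of uni-markers (SOI, EOI, RST0 to RST7 and TEM) is taken from the JPEG standard rather than from the code in this article.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of the article's programs.
   marker[0] must be 0xff; marker[1] is the marker code.
   Returns 1 if a 2-byte size follows the marker,
   or 0 if it is a uni-marker with nothing after it. */
int marker_has_size(const uint8_t marker[2])
{
    uint8_t code;

    assert(marker[0] == 0xff);          /* every marker starts with 0xff */
    code = marker[1];

    if (code == 0xd8 || code == 0xd9)   /* SOI, EOI */
        return 0;
    if (code >= 0xd0 && code <= 0xd7)   /* RST0..RST7 (restart markers) */
        return 0;
    if (code == 0x01)                   /* TEM */
        return 0;
    return 1;                           /* everything else carries a size */
}
```

A marker-walking loop could call this before trying to read the 2 size bytes, so that it never treats image data as a size by mistake.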

Hmmm...markers...hmm....yup it all seems simple and if you take it one step at a time it will eventually make sense! The secret to understanding the jpeg file format is to break it up into pieces and understand fully how each section of the marker works, and its use in the overall picture.

Now, to start with... all jpegs should start with an SOI marker, which is a uni-marker and has the value 0xffd8. The jpeg file format stores its multi-byte values in big-endian order. What does that mean? Well, the bytes are arranged so that the most significant byte comes first, so if we read in the first two bytes one at a time we get 0xff first, followed by 0xd8. It only matters if we read more than one byte at a time. Most PCs are little endian, so if we read the two bytes straight into a 16-bit word we get 0xd8ff instead, and we'll have to swap the bytes around.
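The byte-order point is easy to check in code. The sketch below is illustrative only, and the function names read_be16 and read_le16 are made up for this example. The first function combines two bytes the way the JPEG file orders them (most significant first); the second shows the swapped value that a little-endian interpretation of the same two bytes would give.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Combine two bytes in file order, most significant byte first.
   Because it uses shifts, it gives the same answer on any CPU. */
uint16_t read_be16(const uint8_t *p)
{
    assert(p != NULL);
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* The same two bytes interpreted least significant byte first.
   This is the value a little-endian machine sees if the bytes
   are copied straight into a 16-bit variable. */
uint16_t read_le16(const uint8_t *p)
{
    assert(p != NULL);
    return (uint16_t)(p[0] | (p[1] << 8));
}
```

For the two bytes 0xff, 0xd8 at the start of a jpeg, read_be16 gives 0xffd8 (the SOI value) and read_le16 gives 0xd8ff.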

One thing I found when writing a simple jpeg decoder/reader was that you have to be prepared to look at your data right down to the binary level! It's not enough to look at your data as bytes... you'll have to actually decode 1's and 0's, compare them, and shift them to get your image back.

Let's write a little code to see how the jpeg file format works, to give you a taster.

/****************************************************************/
/*                                                              */
/* Title: How JPEG format works                                 */
/*                                                              */
/****************************************************************/
 
/************************** Start *******************************/
#include <windows.h>
#include <stdio.h>
 
/****************************************************************/
/*                                                              */
/* FeedBack Data                                                */
/*                                                              */
/****************************************************************/
void abc(const char* s)
{
     FILE *fp;
     fp = fopen("t.txt", "a+");
     if( fp == NULL )
          return; // nowhere to write the message
     fprintf(fp, "%s\n",s);
     fclose(fp);
}
 
void readjpeg();
 
/****************************************************************/
/*                                                              */
/* Entry Point                                                  */
/*                                                              */
/****************************************************************/
int __stdcall WinMain (HINSTANCE hInst, HINSTANCE hPrev, LPSTR lpCmd, int nShow)
{
  readjpeg();
  return 0;
}
 
/****************************************************************/
/*                                                              */
/* jpeg functions                                               */
/*                                                              */
/****************************************************************/
 
 
/* Okay before we start getting overwhelmed by bits and bytes and
bit shifting and all sorts of special tricks.. we should first
read in the header...which is the first part of the file, and
can tell us a lot about the jpeg file. */
 
// First let's define some things
#define             SOI                  /*Start of Image*/  0xffd8
#define             EOI                  /*End of Image  */  0xffd9
 
void readjpeg()
{
     byte chunk[2];

     FILE *f;
     f = fopen("balloon.jpg", "rb");
     if( f == NULL )
     {
          abc("Could not open balloon.jpg");
          return;
     }

     // Let's read in the first word (e.g. a word is 2 bytes).
     if( fread(chunk, 1, 2, f) != 2 )
     {
          abc("File is too short to be a jpeg");
          fclose(f);
          return;
     }

     // really big buffer for text
     char buff[200];
     // Output what we have read in.. see what it is?
     sprintf(buff, "First 2 bytes are: 0x%x, 0x%x", chunk[0], chunk[1] );
     abc(buff);

     fclose(f);
}
 
// okay if you run the program you'll get as an output:
/*
            First 2 bytes are: 0xff, 0xd8
*/
// This tells us that the file we are dealing with is a jpeg,
// as it starts with 0xffd8 which means SOI (Start Of Image).

All markers (i.e. 2-byte identification chunks) start with 0xff. It may not be so easy to follow if you read the two bytes into a 16-bit word on a little-endian machine, as what you then get for SOI is 0xd8ff rather than 0xffd8... but I think you get the point. There is one exception: 0xff followed by 0x00 is not a marker at all. It stands for a plain 0xff data byte, and the 0x00 is simply ignored. These pairs only appear in our encoded data (i.e. after the SOS, Start of Scan, marker)... but we'll get to that in time.
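To make the 0xff, 0x00 rule concrete, here is a minimal sketch of how the encoded data could be cleaned up before decoding. The function name unstuff and its interface are assumptions made for this example; a real decoder would also have to deal with restart markers inside the scan, which this sketch simply treats as the end of the data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies encoded bytes from 'in' to 'out',
   dropping the 0x00 that follows every 0xff data byte.
   Stops at the first real marker (0xff followed by anything other
   than 0x00) or at the end of the input.
   'out' must have room for at least 'n' bytes.
   Returns the number of bytes written to 'out'. */
size_t unstuff(const uint8_t *in, size_t n, uint8_t *out)
{
    size_t i = 0;   /* read position  */
    size_t w = 0;   /* write position */

    assert(in != NULL && out != NULL);

    while (i < n) {
        if (in[i] != 0xff) {
            out[w++] = in[i++];       /* ordinary data byte */
        } else if (i + 1 < n && in[i + 1] == 0x00) {
            out[w++] = 0xff;          /* keep the 0xff, skip the 0x00 */
            i += 2;
        } else {
            break;                    /* real marker, or input cut short */
        }
    }
    return w;
}
```

Given the input bytes 0x12, 0xff, 0x00, 0x34, 0xff, 0xd9, the function writes 0x12, 0xff, 0x34 and stops at the 0xffd9 (EOI) marker.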

Note: Sometimes I use the word chunk to represent the section ID of a part of the file. For example, I say I'm reading in the chunk ID, which could be APP0 (i.e. 0xffe0) and which will be followed by a chunk size of 2 bytes; then we go through the chunk. Alternatively you can refer to them as sections or markers... just make sure you know what I'm going on about, and how it's organized.

Our file would start like this:

[0xff] [0xd8] [0xff] [0xe0] [0x00] [0x10] ..

So what would this tell us??? Any ideas?

Well, it goes like this:

[0xff] [0xd8] - the first 2 bytes are a uni-marker representing SOI (Start Of Image).
[0xff] [0xe0] - in the next 2 bytes we have the marker APP0 (Application Marker).
[0x00] [0x10] - these are the 2 bytes following our APP0 marker, telling us how long this section is: 0x0010, which is 16. Remember the most significant byte comes first. So the APP0 section is 16 bytes long, counting the 2 bytes for the length itself but not the 2 bytes of the marker.
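The walk-through above can be sketched in code. This is a minimal illustration, not the article's decoder: the segment_header type and the parse_segment_header function are hypothetical names, and the sketch assumes the marker it is given is one that carries a size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: what we learn from the first 4 bytes of a section. */
typedef struct {
    uint8_t  code;    /* second byte of the marker, e.g. 0xe0 for APP0 */
    uint16_t length;  /* section length, including its own 2 bytes     */
} segment_header;

/* Reads a marker and its size from 'buf' (at least 4 bytes needed).
   Returns 1 on success, 0 if the bytes cannot be a sized section. */
int parse_segment_header(const uint8_t *buf, size_t n, segment_header *out)
{
    assert(buf != NULL && out != NULL);

    if (n < 4 || buf[0] != 0xff)
        return 0;                      /* too short, or not a marker */

    out->code   = buf[1];
    out->length = (uint16_t)((buf[2] << 8) | buf[3]);  /* high byte first */

    return out->length >= 2;           /* the size always counts itself */
}
```

With the six example bytes in an array, skipping the 2-byte SOI and calling the function on the remaining four bytes yields code 0xe0 and length 16, so the APP0 payload is the 14 bytes that follow the size.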

Now a good idea for a newcomer to the jpeg file format is to write a small program which just goes through the file and sees what markers are in there! We can read in the marker's type, and we know how long that section is, so we can just jump to the next marker, read in what it is, get its size, and skip to the next one.

Note: All the markers that carry a size tell us the total amount of information in their section, EXCEPT the SOS marker. Its size covers only a short header; the compressed image information follows straight after it and is not counted. The SOS section is usually located at the end of the file... well, almost: the last marker is always EOI (End Of Image), which has the value 0xffd9.

And here we are: a nice little program which lists the markers in the jpg file:

#include <windows.h>
#include <stdio.h>
 
/****************************************************************/
/*                                                              */
/* FeedBack Data                                                */
/*                                                              */
/****************************************************************/
void abc(const char* s)
{
     FILE *fp;
     fp = fopen("t.txt", "a+");
     if( fp == NULL )
          return; // nowhere to write the message
     fprintf(fp, "%s\n",s);
     fclose(fp);
}
 
void readjpeg();
 
/****************************************************************/
/*                                                              */
/* Entry Point                                                  */
/*                                                              */
/****************************************************************/
int __stdcall WinMain (HINSTANCE hInst, HINSTANCE hPrev, LPSTR lpCmd, int nShow)
{
  readjpeg();
  return 0;
}
 
/****************************************************************/
/*                                                              */
/* jpeg functions                                               */
/*                                                              */
/****************************************************************/
 
 
/* Okay before we start getting overwhelmed by bits and bytes and
bit shifting and all sorts of special tricks.. we should first
read in the header...which is the first part of the file, and
can tell us a lot about the jpeg file. */
 
// First let's define some things
#define             SOI                  /*Start of Image*/  0xffd8
#define             EOI                  /*End of Image  */  0xffd9
 
#define             APP0   /**/          0xffe0 /*to 0xffef APP15*/
 
// Really big buffer for text output
char buff[200];
 
void readjpeg()
{
     byte chunk[2];
     byte sizeofchunk[2];
 
     FILE *f;
     f = fopen("balloon.jpg", "rb");
     if( f == NULL )
     {
          abc("Could not open balloon.jpg");
          return;
     }

     // Let's read in the first 2 bytes (the SOI marker)
     fread(chunk, 1, 2, f);

     // Output what we have read in.. see what it is?
     sprintf(buff, "First 2 bytes are: 0x%x, 0x%x", chunk[0], chunk[1] );
     abc(buff);

     // Next comes a marker (2 bytes) and its size (2 bytes, high byte first)
     fread(chunk, 1, 2, f);
     fread(sizeofchunk, 1, 2, f);
     short unsigned int size = ((sizeofchunk[0] << 8) | sizeofchunk[1]);

     sprintf(buff, "Second 2 bytes are: 0x%x, 0x%x", chunk[0], chunk[1] );
     abc(buff);
     sprintf(buff, "Size of our piece of data:%u", size);
     // Remember the size includes the 2 bytes for the size.
     abc(buff);

     // Now we know how big the next chunk is, we can read it in.
     // I know it's an app0 chunk because the chunk was 0xffe0

     char temp[100];
     unsigned int payload = (size >= 2) ? size - 2 : 0;
     unsigned int tocopy  = payload;
     if( tocopy > sizeof(temp) - 1 )
          tocopy = sizeof(temp) - 1; // don't overflow our buffer
     fread(temp, 1, tocopy, f);
     temp[tocopy] = '\0'; // Null terminate the string :)
     fseek(f, (long)(payload - tocopy), SEEK_CUR); // skip what didn't fit

     sprintf(buff, "The APP0 value: %s", temp);
     abc(buff);

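     // A stage further, opening up the various sections.

     //  Let's try and read in all the data...see what we get...
     //  Remember now, it's a 2 byte value which tells us what it is,
     //  then a 2 byte value of how big it is :)

     while(true)
     {
          // Stop if the file runs out before we find the EOI marker.
          if( fread(chunk, 1, 2, f) != 2 )
          {
               abc("Error: unexpected end of file");
               break;
          }

          // If the chunk we read in doesn't begin with 0xff
          // then it's not a valid chunk and so exit.
          if( chunk[0] != 0xff )
          {
               sprintf(buff, "Error chunk[0] was: 0x%x, chunk[1]: 0x%x", chunk[0], chunk[1]);
               abc(buff);
               break;
          }

          // If we get 0xffd9 then it's the EOI (End Of Image)
          if( chunk[1] == 0xd9 )
          {
               abc("End Of Image");
               break;
          }

          if( fread(sizeofchunk, 1, 2, f) != 2 )
          {
               abc("Error: unexpected end of file");
               break;
          }
          short unsigned int size = ((sizeofchunk[0] << 8) | sizeofchunk[1]);

          if( chunk[1] == 0xda )
          {
               // Okay this means we have started to scan the encoded data.
               // The first byte after the size is the component count.
               byte count = 0;
               fread(&count, 1, 1, f);
               sprintf(buff, "Start of scan count: %u", count);
               abc(buff);

               // Walk through the encoded data until we hit a real marker.
               // 0xff followed by 0x00 is just data, so we keep going.
               int found = 0;
               while( !found && fread(&count, 1, 1, f) == 1 )
               {
                    if( count == 0xff && fread(&count, 1, 1, f) == 1 && count != 0x00 )
                         found = 1;
               }

               if( found )
                    sprintf(buff, "\nEnd of scan value: 0xff%02x", count);
               else
                    sprintf(buff, "\nError: file ended inside the scan data");
               abc(buff);

               break;
          }

          sprintf(buff, "Chunk ID: 0x%02x%02x,   Size:%u",
                        chunk[0], chunk[1], size);
          abc(buff);

          // The size includes its own 2 bytes, so it can never be below 2.
          if( size < 2 )
          {
               abc("Error: bad chunk size");
               break;
          }

          // Jump over the rest of this chunk to the next marker.
          fseek(f, size-2, SEEK_CUR);
     }

     fclose(f);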
}

If you run the above code you'll get the following:

Now believe it or not, we've got a whole lot of juicy information there... it tells us the size of each chunk, the order that they are in, and which chunks are in our image.
