“JPEG”
The mysterious world
of the jpeg file format explained
By Benjamin Kenwright
Well, those of you who are about to set out on the journey of
discovering the JPEG file format had better be prepared: I believe that the
principles behind it are easier to grasp than the actual coding of it.
The JPEG file format is all around you… it's probably one of the
most widely used image formats… it's platform independent… and one of its biggest uses is on the
internet.
Why use the JPEG file format? Its biggest selling point is the fact that you can take a 3 MB
bitmap file (coolimage.bmp) and convert it to a jpg of only a few hundred
kilobytes instead (e.g. coolimage.jpg).
Of course it does have its downside: it's a lossy image
format, so the image loses some of its original detail… but usually
the losses are not noticeable to the human eye.
Give me a jpeg, and let's rip it to pieces!
How it's laid out, and how the information is stored, isn't
too hard to grasp… Simply put, the JPEG file format is arranged in sections. Each section starts with a description
(a marker) of size 2 bytes, which holds a value we can look up, e.g. 0xffda.
Then it has a further 2 bytes following that description
which tell us how many bytes make up that section.
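As a rough sketch of that layout, the snippet below pulls the ID and the size out of the first four bytes of a section held in memory. The `Segment` type and the `parse_segment` function are names made up for this illustration, and it assumes the section is one that actually carries a size.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the 4 bytes that open a section. */
typedef struct {
    uint16_t marker;  /* 2-byte ID, e.g. 0xffda */
    uint16_t length;  /* 2-byte size; counts itself but not the ID */
} Segment;

/* Reads the ID and the size from the first 4 bytes of p.
   Both values are stored high byte first in the file.
   Returns 1 on success, 0 if there are fewer than 4 bytes
   or the ID does not start with 0xff. */
int parse_segment(const uint8_t *p, size_t n, Segment *out)
{
    if (n < 4 || p[0] != 0xff)
        return 0;
    out->marker = (uint16_t)((p[0] << 8) | p[1]);
    out->length = (uint16_t)((p[2] << 8) | p[3]);
    return 1;
}
```

Feeding it the bytes 0xff, 0xe0, 0x00, 0x10 should give the ID 0xffe0 and a size of 16.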
One exception!!
There are a few uni-markers, as they say! A uni-marker is just a 2-byte ID which has some meaning on its own… SO A
UNI-MARKER DOES NOT HAVE 2 BYTES FOLLOWING IT THAT GIVE ITS SIZE.
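A decoder needs some way to tell the two kinds of marker apart. Here is a minimal sketch; `is_uni_marker` is a name invented for this example, and the list of codes comes from the JPEG standard rather than from this article, so treat it as an assumption to check against the spec.

```c
#include <stdint.h>

/* Returns 1 if the marker 0xff<code> stands alone (no size after it),
   0 if a 2-byte size follows it. */
int is_uni_marker(uint8_t code)
{
    if (code == 0xd8 || code == 0xd9)   /* SOI and EOI */
        return 1;
    if (code >= 0xd0 && code <= 0xd7)   /* RST0..RST7 restart markers */
        return 1;
    if (code == 0x01)                   /* TEM */
        return 1;
    return 0;
}
```

Only the second byte of the marker is passed in, since the first is always 0xff.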
Hmmm… markers… yup, it all seems simple… and if you take it
one step at a time it will eventually make sense! The secret to understanding the JPEG file format is to break it
up into pieces, understand fully how each marker section works, and see
its use in the overall picture.
Now, to start with… all jpegs should start with an SOI (Start Of Image) marker, which is a uni-marker and has the
value 0xffd8. The JPEG file format
stores its multi-byte values in big-endian order… What does that mean? Well, the bytes are arranged so that the
most significant byte comes first… so if we read in the first two bytes one at a time we
would get 0xff first, followed by 0xd8. The order only matters if we read
more than one byte at a time. If we
read two bytes straight into a 16-bit variable on a little-endian machine (such as a Windows PC), the value comes out as 0xd8ff, so we'll have to swap the bytes around.
One thing I found when writing a simple JPEG decoder/reader
was that you have to be prepared to look at your data right down to the binary
level! It's not enough to look at your
data as bytes… you'll have to actually pick out individual 1's and 0's, compare them
and shift them to get your image back.
Let's write a little code to see how the JPEG file format
works… to give you a taster:
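To give a feel for what working at the binary level looks like, here is a minimal bit reader sketch. The `BitReader` type and both functions are hypothetical names, and it assumes bits are taken from each byte most significant bit first, which is how the compressed data in a jpeg is packed. It deliberately ignores markers and any special bytes in the data.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit reader: walks a byte buffer one bit at a time,
   most significant bit of each byte first. */
typedef struct {
    const uint8_t *data;
    size_t size;     /* number of bytes in data */
    size_t bitpos;   /* index of the next bit to read */
} BitReader;

/* Returns the next bit (0 or 1), or -1 when the buffer is used up. */
int next_bit(BitReader *br)
{
    size_t byte = br->bitpos / 8;
    if (byte >= br->size)
        return -1;
    int shift = 7 - (int)(br->bitpos % 8);
    br->bitpos++;
    return (br->data[byte] >> shift) & 1;
}

/* Reads count bits (count <= 16) into the low bits of the result.
   Returns -1 if the buffer runs out first. */
long read_bits(BitReader *br, int count)
{
    long value = 0;
    for (int i = 0; i < count; i++) {
        int bit = next_bit(br);
        if (bit < 0)
            return -1;
        value = (value << 1) | bit;
    }
    return value;
}
```

For the byte 0xa5 (binary 10100101), asking for 3 bits gives 101 and the following 5 bits give 00101.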
/****************************************************************/
/*                                                              */
/* Title: How JPEG format works                                 */
/*                                                              */
/****************************************************************/
/************************** Start *******************************/
#include <windows.h>
#include <stdio.h>
/****************************************************************/
/*                                                              */
/* FeedBack Data                                                */
/*                                                              */
/****************************************************************/
// Appends one line of text to t.txt so we can inspect it afterwards.
void abc(const char* s)
{
    FILE *fp = fopen("t.txt", "a+");
    if (fp == NULL)
        return;
    fprintf(fp, "%s\n", s);
    fclose(fp);
}
void readjpeg();
/****************************************************************/
/*                                                              */
/* Entry Point                                                  */
/*                                                              */
/****************************************************************/
int __stdcall WinMain(HINSTANCE hInst, HINSTANCE hPrev, LPSTR lpCmd, int nShow)
{
    readjpeg();
    return 0;
}
/****************************************************************/
/*                                                              */
/* jpeg functions                                               */
/*                                                              */
/****************************************************************/
/* Okay, before we start getting overwhelmed by bits and bytes and
   bit shifting and all sorts of special tricks, we should first
   read in the header, which is the first part of the file and
   can tell us a lot about the jpeg file. */
// First let's define some things
#define SOI 0xffd8 /* Start of Image */
#define EOI 0xffd9 /* End of Image */
void readjpeg()
{
    byte chunk[2];
    FILE *f;
    f = fopen("balloon.jpg", "rb");
    if (f == NULL)
    {
        abc("Could not open balloon.jpg");
        return;
    }
    // Let's read in the first word (a word is 2 bytes), one byte at a time.
    fread(chunk, 1, 2, f);
    // Really big buffer for text
    char buff[200];
    // Output what we have read in.. see what it is?
    sprintf(buff, "First 2 bytes are: 0x%x, 0x%x", chunk[0], chunk[1]);
    abc(buff);
    fclose(f);
}
// Okay, if you run the program you'll get as an output:
/*
First 2 bytes are: 0xff, 0xd8
*/
// This tells us that the file we are dealing with is a jpeg,
// as it starts with 0xffd8 which means SOI (Start Of Image).
Well, it's not the most exciting piece of code yet… but I'm a
believer in starting simple… Anyhow, I
don't want to lose you yet!… we haven't even started on the Huffman coding… lol.
So what have we learned above? Well, we have read in a file
called “balloon.jpg”, something I just found lying around on my hard
drive. Then I read in the first 2 bytes
from the very start of the file and printed them out to a text file called t.txt…
it's not the most creative name, but you can change it if you want.
I like to write to a text file so that we can examine what
we have obtained as we go along.
All markers (i.e. the 2-byte identification chunks) always
start with 0xff, and because we read the file one byte at a time, the 0xff is the first byte we
see, as in the 0xff, 0xd8 printed above. There is one special case: a 0xff followed by
0x00 is not a marker at all. It is a 0xff data byte with a padding 0x00 after it that we must ignore. These only appear in our encoded data (e.g. in the SOS, Start
of Scan, part)… but we'll get to that in time.
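Here is a minimal sketch of how that 0xff, 0x00 pair might be handled once we are inside the encoded data. The function name `unstuff` is made up for this example, and as a simplification it stops at any real marker, including ones that can legally appear in the middle of the data.

```c
#include <stddef.h>
#include <stdint.h>

/* Copies encoded data from src to dst, dropping the 0x00 that follows
   every 0xff. Stops at the first real marker (0xff followed by a
   non-zero byte) or at the end of src. dst must hold at least n bytes.
   Returns the number of bytes written to dst. */
size_t unstuff(const uint8_t *src, size_t n, uint8_t *dst)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (src[i] == 0xff) {
            if (i + 1 >= n || src[i + 1] != 0x00)
                break;            /* real marker or truncated data */
            dst[out++] = 0xff;    /* keep the data byte */
            i++;                  /* skip the padding 0x00 */
        } else {
            dst[out++] = src[i];
        }
    }
    return out;
}
```

Given 0x12, 0xff, 0x00, 0x34, 0xff, 0xd9 it should keep 0x12, 0xff, 0x34 and stop at the 0xffd9.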
Note: Sometimes I use the word chunk to represent the actual
section ID of the part of the file… for example, I say I'm reading in the chunk
ID, which could be APP0 (i.e. 0xffe0), which will be followed by a
chunk size of 2 bytes, then we go through the chunk. Alternatively you can refer to them as sections or markers… just
make sure you know what I'm going on about, and how it's organized.
Our file would start like this:
[0xff] [0xd8] [0xff] [0xe0] [0x00] [0x10] …
So what would this tell us??? Any ideas?… It goes like this:
[0xff] [0xd8] – the first 2 bytes are a uni-marker representing
SOI (Start Of Image).
[0xff] [0xe0] – in the
next 2 bytes we have the marker APP0 (Application Marker).
[0x00] [0x10] – these are the 2 bytes following our
APP0 marker, telling us how long this section is! Which is 0x0010… remember the high byte comes first. So the next 16 bytes (0x10), including the 2
bytes for the length itself, make up our APP0 section.
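The arithmetic for hopping from one marker to the next can be sketched like this. `next_marker_offset` is a name invented for the example, and it assumes the marker at the given offset is one that has a size.

```c
#include <stddef.h>
#include <stdint.h>

/* Given the offset of a marker that has a size, returns the offset of
   the marker that follows it: 2 bytes of ID plus the size (the size
   already counts its own 2 bytes). The caller must make sure that
   marker_off + 3 is still inside the buffer. */
size_t next_marker_offset(const uint8_t *file, size_t marker_off)
{
    size_t size = ((size_t)file[marker_off + 2] << 8) | file[marker_off + 3];
    return marker_off + 2 + size;
}
```

For the six bytes shown above, APP0 sits at offset 2 and its size is 16, so the next marker should start at offset 2 + 2 + 16 = 20.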
Now a good idea for a newbie to the JPEG file format is to
write a small program which just goes through the file and sees what markers
are in there! We can read in the
marker's type, and we know how long that section is, so we can just jump to the
next marker and read in what it is… get its size and skip to the next one.
Note: All the
markers with a size tell us the total amount of information in them, “EXCEPT” the
SOS marker. Its size only covers a short header, and the compressed image information that follows has no size at all…
The SOS section is usually located at the end of the file… well, almost: the last marker is
always EOI (End Of Image), which has the value 0xffd9.
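Because the compressed data carries no size, the only way to find where it stops is to look for the next real marker. Below is a minimal sketch, assuming the whole scan is already in memory. `find_scan_end` is a made-up name, and the restart marker range 0xffd0 to 0xffd7 comes from the JPEG standard rather than from this article.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the offset of the 0xff that begins the first real marker in
   the compressed data, or n if no marker is found. A 0xff followed by
   0x00 is data, and 0xffd0..0xffd7 are restart markers that belong to
   the scan, so both are skipped. */
size_t find_scan_end(const uint8_t *p, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++) {
        if (p[i] != 0xff)
            continue;
        uint8_t next = p[i + 1];
        if (next == 0x00 || (next >= 0xd0 && next <= 0xd7)) {
            i++;          /* skip the second byte of the pair too */
            continue;
        }
        if (next == 0xff)
            continue;     /* repeated 0xff: look again from the next byte */
        return i;
    }
    return n;
}
```

On a normal single-scan file the marker it finds should be the EOI at the very end.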
And here we are… a nice little program which lists the
markers in the jpg file:
#include <windows.h>
#include <stdio.h>
/****************************************************************/
/*                                                              */
/* FeedBack Data                                                */
/*                                                              */
/****************************************************************/
// Appends one line of text to t.txt so we can inspect it afterwards.
void abc(const char* s)
{
    FILE *fp = fopen("t.txt", "a+");
    if (fp == NULL)
        return;
    fprintf(fp, "%s\n", s);
    fclose(fp);
}
void readjpeg();
/****************************************************************/
/*                                                              */
/* Entry Point                                                  */
/*                                                              */
/****************************************************************/
int __stdcall WinMain(HINSTANCE hInst, HINSTANCE hPrev, LPSTR lpCmd, int nShow)
{
    readjpeg();
    return 0;
}
/****************************************************************/
/*                                                              */
/* jpeg functions                                               */
/*                                                              */
/****************************************************************/
/* Okay, before we start getting overwhelmed by bits and bytes and
   bit shifting and all sorts of special tricks, we should first
   read in the header, which is the first part of the file and
   can tell us a lot about the jpeg file. */
// First let's define some things
#define SOI  0xffd8 /* Start of Image */
#define EOI  0xffd9 /* End of Image */
#define APP0 0xffe0 /* Application marker; the range runs to 0xffef (APP15) */
// Really big buffer for text output
char buff[200];
void readjpeg()
{
    byte chunk[2];
    byte sizeofchunk[2];
    FILE *f;
    f = fopen("balloon.jpg", "rb");
    if (f == NULL)
    {
        abc("Could not open balloon.jpg");
        return;
    }
    // Let's read in the first 2 bytes
    fread(chunk, 1, 2, f);
    // Output what we have read in.. see what it is?
    sprintf(buff, "First 2 bytes are: 0x%x, 0x%x", chunk[0], chunk[1]);
    abc(buff);
    // Next comes a 2 byte marker followed by its 2 byte size
    fread(chunk, 1, 2, f);
    fread(sizeofchunk, 1, 2, f);
    // The size is stored high byte first, so shift the first byte up
    short unsigned int size = ((sizeofchunk[0] << 8) | sizeofchunk[1]);
    sprintf(buff, "Second 2 bytes are: 0x%x, 0x%x", chunk[0], chunk[1]);
    abc(buff);
    sprintf(buff, "Size of our piece of data:%u", size);
    // Remember the size includes the 2 bytes for the size.
    abc(buff);
    // Now we know how big the next chunk is, we can read it in.
    // I know it's an APP0 chunk because the chunk was 0xffe0.
    char temp[100];
    size_t len = (size >= 2) ? size - 2 : 0;
    // Never read more than temp can hold (leave room for the '\0')
    size_t keep = (len < sizeof(temp) - 1) ? len : sizeof(temp) - 1;
    fread(temp, 1, keep, f);
    temp[keep] = '\0'; // Null terminate the string :)
    fseek(f, (long)(len - keep), SEEK_CUR); // Skip anything that didn't fit
    sprintf(buff, "The APP0 value: %s", temp);
    abc(buff);
    // A stage further, opening up the various sections.
    // Let's try and read in all the data...see what we get...
    // Remember now, it's a 2 byte value which tells us what it is,
    // then a 2 byte value of how big it is :)
    while(true)
    {
        // Stop if the file runs out before we find the end marker.
        if( fread(chunk, 1, 2, f) != 2 )
        {
            abc("Unexpected end of file");
            break;
        }
        // If the chunk we read in doesn't begin with 0xff
        // then it's not a valid chunk and so exit.
        if( chunk[0] != 0xff )
        {
            sprintf(buff, "Error chunk[0] was: 0x%x, chunk[1]: 0x%x", chunk[0], chunk[1]);
            abc(buff);
            break;
        }
        // If we get 0xffd9 then it's the EOI (End Of Image)
        if( chunk[1] == 0xd9 )
        {
            abc("End Of File");
            break;
        }
        fread(sizeofchunk, 2, 1, f);
        short unsigned int size = ((sizeofchunk[0] << 8) | sizeofchunk[1]);
        if( chunk[1] == 0xda )
        {
            // Okay, this means we have reached the encoded data.
            // The first byte after the size is a count, which we print.
            byte count;
            fread(&count, 1, 1, f);
            sprintf(buff, "Start of scan count: %u", count);
            abc(buff);
            // Walk through the data until we hit 0xff followed by
            // something other than 0x00, i.e. a real marker.
            // Note: a restart marker (0xffd0 to 0xffd7) would also stop us here.
            while(true)
            {
                if( fread(&count, 1, 1, f) != 1 )
                    break; // ran out of file
                if( count == 0xff )
                {
                    if( fread(&count, 1, 1, f) != 1 )
                        break;
                    if( count != 0x00 )
                        break;
                }
            }
            sprintf(buff, "\nEnd of scan value: 0xff%x", count);
            abc(buff);
            break;
        }
        sprintf(buff, "Chunk ID: 0x%x%x, Size:%u", chunk[0], chunk[1], size);
        abc(buff);
        // Skip the rest of the chunk; the 2 size bytes are already read.
        fseek(f, size-2, SEEK_CUR);
    }
    fclose(f);
}
If you run the above code you’ll get the following:
First 2 bytes are: 0xff, 0xd8
Second 2 bytes are: 0xff, 0xe0
Size of our piece of data:16
The APP0 value: JFIF
Chunk ID: 0xffdb, Size:67
Chunk ID: 0xffdb, Size:67
Chunk ID: 0xffc0, Size:17
Chunk ID: 0xffc4, Size:31
Chunk ID: 0xffc4, Size:181
Chunk ID: 0xffc4, Size:31
Chunk ID: 0xffc4, Size:181
Start of scan count: 3
End of scan value: 0xffd9
Now, believe it or not, we've got a whole lot of juicy
information there… it tells us the size of each chunk, the order that they are
in… and which chunks are in our image.
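To put names to the chunk IDs in that listing, a small lookup like the one below can help. `marker_name` is a hypothetical helper, and the names for 0xdb, 0xc0 and 0xc4 are taken from the JPEG standard rather than from this article. Only the codes that show up in the output above are covered.

```c
#include <stdint.h>

/* Maps the second byte of a marker (the part after 0xff) to a name.
   Covers only the markers seen in the listing above. */
const char *marker_name(uint8_t code)
{
    if (code >= 0xe0 && code <= 0xef)
        return "APPn (application data)";
    switch (code) {
    case 0xd8: return "SOI (Start Of Image)";
    case 0xd9: return "EOI (End Of Image)";
    case 0xdb: return "DQT (quantization table)";
    case 0xc0: return "SOF0 (baseline frame header)";
    case 0xc4: return "DHT (Huffman table)";
    case 0xda: return "SOS (Start Of Scan)";
    default:   return "unknown";
    }
}
```

Read that way, the file above holds two quantization tables, one frame header and four Huffman tables before the scan starts.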