Chapter 1 - PE File Format From a Distance.
Now, there's one thing you have to remember when you start to learn the
secrets of any file format, especially one as complex and intimidating as
this one: you're the boss... you can delete it, change the bits or whatever
you want! It's only a big array of 1's and 0's.
In we go... a 350g jar of coffee and a kettle full of hot water and we're
ready :)
PE = Portable Executable, yup that's what it stands for. It's the native
executable format for all Win32 operating systems, which means that whichever
version of Windows you're running, its exe's and dll's follow the PE file
format! (Linux and other Unix systems use their own formats, such as ELF.)
So learning this is going to give you a lot of power :)
In we go, let's see what it looks like underneath the bonnet:
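+---------------+
| DOS MZ header |
+---------------+
| DOS stub      |
+---------------+
| PE header     |
+---------------+
| Section table |
+---------------+
| Section 1     |
+---------------+
| Section 2     |
+---------------+
| Section ..    |
+---------------+
| Section n     |
+---------------+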
Well there you go, I bet you're getting scared already... well don't worry,
once I tell you what's in those parts you'll feel better. All PE files,
even DLLs, use this structure. The first part, the DOS MZ header, is our
entry point... the start of our journey.
Okay, the first two parts are for DOS. They go back to the days of DOS:
when Windows came along they were kept so that if you run the executable in
DOS you'll get the message "This program cannot be run in DOS mode". The DOS
MZ header contains the offset to the DOS stub, and the DOS stub is a mini
DOS program... in case you're wondering where it comes from... the
compiler/linker inserts it automatically when you build an exe/dll.
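If you'd like to poke at this yourself, here's a rough sketch (in Python, just because it's quick to try) of how the DOS MZ header points at the stub. It assumes the usual DOS header layout from winnt.h, where the e_cparhdr field at offset 0x08 gives the header size in 16-byte paragraphs and the stub program starts right after it. The function name and the sample bytes are made up for illustration.

```python
import struct

PARAGRAPH = 16  # DOS counts the header size in 16-byte paragraphs


def dos_stub_offset(data):
    """Return the file offset where the DOS stub program starts."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_cparhdr: size of the DOS header in paragraphs, stored at offset 0x08.
    (header_paragraphs,) = struct.unpack_from("<H", data, 0x08)
    return header_paragraphs * PARAGRAPH


# Made-up sample: a 64-byte DOS header followed by a placeholder stub.
header = bytearray(0x40)
header[0:2] = b"MZ"
struct.pack_into("<H", header, 0x08, 4)  # 4 paragraphs = 64 bytes
sample = bytes(header) + b"<stub code would go here>"

stub_offset = dos_stub_offset(sample)
print(hex(stub_offset))  # 0x40
```

On a real exe you would read the file's bytes instead of building the sample by hand.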
Next, as you can see from above, is the PE header. If you open up the
winnt.h file you'll find a struct called IMAGE_NT_HEADERS, which is the
structure that represents our PE header. This structure tells us a lot of
things about our PE: its type, the number of sections (e.g. code pieces),
how big it is etc.
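To make that a bit more concrete, here's a minimal sketch of reading the start of the PE header. It assumes the layout winnt.h gives for IMAGE_NT_HEADERS: a 4-byte "PE\0\0" signature followed by the 20-byte IMAGE_FILE_HEADER. The function name, the dictionary keys and the sample values are my own inventions for the example.

```python
import struct

IMAGE_FILE_DLL = 0x2000  # Characteristics bit set for DLLs


def parse_pe_header(data, pe_offset):
    """Read the signature and file header found at pe_offset."""
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # IMAGE_FILE_HEADER: Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader,
    # Characteristics (20 bytes in total).
    (machine, section_count, _stamp, _symtab, _nsyms,
     optional_size, flags) = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)
    return {
        "machine": machine,
        "number_of_sections": section_count,
        "size_of_optional_header": optional_size,
        "is_dll": bool(flags & IMAGE_FILE_DLL),
    }


# Made-up sample: an i386 executable (not a DLL) with 3 sections.
sample = b"PE\0\0" + struct.pack("<HHIIIHH", 0x014C, 3, 0, 0, 0, 0xE0, 0x0102)
info = parse_pe_header(sample, 0)
print(info["number_of_sections"], info["is_dll"])  # 3 False
```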
Then come the real parts of our PE, the parts which contain our code! The
code that you're going to compile soon, in later chapters... well, it's all
in the blocks called sections. It's very, very important that you remember
this: a section is nothing more than a block of data.... repeat after me...
"block of data" ;) Each section has its own attributes, e.g. code/data,
read/write etc.
So that you don't get lost, think of the header as a boot sector, and the
sections as folders... as you would for a disk. And just as you can set
attributes on folders, such as hidden or read only, the same applies
here.
Important news ->The grouping of data into sections is done on a common
attribute basis, not on a logical basis<- It doesn't matter how the
code/data are used: if they have the same attributes they can be lumped
together into a section. Remember, a section, like a folder, can contain a
variety of stuff... so when you think of a section, it can contain code/data
etc... it is grouped based on the attributes.
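Here's a small sketch of what those attributes look like in practice. Each section carries a 32-bit Characteristics value, and the bit values below are the IMAGE_SCN_* constants from winnt.h. The function name and the friendly labels are just mine.

```python
# (bit value, label) pairs for a few IMAGE_SCN_* flags from winnt.h
SECTION_FLAGS = [
    (0x00000020, "code"),                # IMAGE_SCN_CNT_CODE
    (0x00000040, "initialized data"),    # IMAGE_SCN_CNT_INITIALIZED_DATA
    (0x00000080, "uninitialized data"),  # IMAGE_SCN_CNT_UNINITIALIZED_DATA
    (0x20000000, "execute"),             # IMAGE_SCN_MEM_EXECUTE
    (0x40000000, "read"),                # IMAGE_SCN_MEM_READ
    (0x80000000, "write"),               # IMAGE_SCN_MEM_WRITE
]


def describe_section(characteristics):
    """List the labels of every known flag set in characteristics."""
    return [label for bit, label in SECTION_FLAGS if characteristics & bit]


# Values you will typically see on a code section and a data section.
print(describe_section(0x60000020))  # ['code', 'execute', 'read']
print(describe_section(0xC0000040))  # ['initialized data', 'read', 'write']
```

Anything with the same flag combination can share a section, whatever it is used for.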
Well, up to this point we have the PE Header, but it doesn't tell us where
our data/code is... or what the sections are! For that we must look at the
next part, which is the "Section table"... we're nearly there so hang on a
little more.
Now, immediately following the PE Header is an 'array' of structures.
When I say an array of structures, I mean that the Section table is made up
of loads of little identical structures. Just think of a box full of
apples. The section table is the box, and it's full of apples (I'll
expand on this theory later).
Each entry (apple) in the section table contains the information on one of
our sections: its offset, its attributes etc. So if we have 3 sections, we
will have an array of 3 structures in our section table... does
that make sense? Have I lost you?
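If the apples are still a bit fuzzy, here's a sketch of walking the box. It assumes each entry is a 40-byte IMAGE_SECTION_HEADER as laid out in winnt.h (an 8-byte name, sizes and offsets, then the Characteristics). The function name, the dictionary keys and the two sample entries are made up for the example.

```python
import struct

# IMAGE_SECTION_HEADER: Name[8], VirtualSize, VirtualAddress, SizeOfRawData,
# PointerToRawData, PointerToRelocations, PointerToLinenumbers,
# NumberOfRelocations, NumberOfLinenumbers, Characteristics
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")  # 40 bytes per entry


def parse_section_table(data, table_offset, count):
    """Read count section headers starting at table_offset."""
    sections = []
    for index in range(count):
        fields = SECTION_HEADER.unpack_from(
            data, table_offset + index * SECTION_HEADER.size)
        sections.append({
            "name": fields[0].rstrip(b"\0").decode("ascii"),
            "raw_size": fields[3],
            "raw_offset": fields[4],
            "characteristics": fields[9],
        })
    return sections


# Made-up table with two entries (two apples in the box).
table = (
    SECTION_HEADER.pack(b".text", 0x100, 0x1000, 0x200, 0x400,
                        0, 0, 0, 0, 0x60000020)
    + SECTION_HEADER.pack(b".data", 0x80, 0x2000, 0x200, 0x600,
                          0, 0, 0, 0, 0xC0000040)
)
sections = parse_section_table(table, 0, 2)
print([s["name"] for s in sections])  # ['.text', '.data']
```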
In a nutshell, that's the PE layout! At this point there are usually
questions of "why" and "what's that for" and "what if"... well, stay with me
on this journey of discovery and we'll answer your questions.
Let's review what actually happens when you run a PE (an exe/dll).
- The PE file is run, then the PE Loader looks at our PE file and examines
  the DOS MZ header for the offset to the PE header. If it finds it, it
  jumps to our PE Header.
- The PE Loader then checks that the PE header is valid.
- The PE Loader goes to the end of the PE Header and starts to read in the
  information from the section table. Now the PE Loader knows how many
  sections we have and what their attributes are. It loads our sections into
  memory (using file mapping) with their designated attributes, e.g. read
  only etc. When I use the words file mapping, I mean that the data from the
  section is copied into memory exactly as it is! So if you could access the
  memory you'd see that it is an exact copy of your PE Section. :)
- The PE Loader then moves on a notch and checks for imports from other
  files etc.... which is part of the import table, which we'll get to in a
  bit.
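The first three steps above can be sketched in a few lines. This is only a toy walk-through and not how the real Windows loader is written: it assumes the winnt.h layouts (PE header offset stored at 0x3C in the DOS header, a 4-byte signature, a 20-byte file header, then the optional header and the 40-byte section entries), it skips the import step, and it "loads" each section by simply copying its bytes. The function names and the hand-built sample file are made up, and the sample ignores the alignment rules a real PE follows.

```python
import struct

SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")  # 40 bytes per entry


def load_sections(data):
    """Follow the loader's first steps; return {section name: raw bytes}."""
    # Step 1: the DOS MZ header stores the PE header offset at 0x3C.
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)

    # Step 2: check that the PE header is valid.
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    (_machine, section_count, _stamp, _symtab, _nsyms,
     optional_size, _flags) = struct.unpack_from("<HHIIIHH", data, pe_offset + 4)

    # Step 3: the section table sits after the signature (4 bytes), the
    # file header (20 bytes) and the optional header.
    table_offset = pe_offset + 4 + 20 + optional_size
    sections = {}
    for index in range(section_count):
        fields = SECTION_HEADER.unpack_from(
            data, table_offset + index * SECTION_HEADER.size)
        name = fields[0].rstrip(b"\0").decode("ascii")
        raw_size, raw_offset = fields[3], fields[4]
        # "File mapping" in this sketch: copy the bytes exactly as they are.
        sections[name] = data[raw_offset:raw_offset + raw_size]
    return sections


def build_sample():
    """Hand-build a tiny, unaligned, made-up PE-like file with 2 sections."""
    dos = bytearray(0x40)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE header right after DOS header
    # No optional header here, purely to keep the sample short.
    nt = b"PE\0\0" + struct.pack("<HHIIIHH", 0x014C, 2, 0, 0, 0, 0, 0x0102)
    text, data_bytes = b"\x90\x90\xC3", b"hello"
    text_offset = len(dos) + len(nt) + 2 * SECTION_HEADER.size
    data_offset = text_offset + len(text)
    table = (
        SECTION_HEADER.pack(b".text", len(text), 0x1000, len(text),
                            text_offset, 0, 0, 0, 0, 0x60000020)
        + SECTION_HEADER.pack(b".data", len(data_bytes), 0x2000,
                              len(data_bytes), data_offset,
                              0, 0, 0, 0, 0xC0000040)
    )
    return bytes(dos) + nt + table + text + data_bytes


loaded = load_sections(build_sample())
print(loaded)
```

The bytes you get back for each section are exactly the bytes stored in the file, which is the "exact copy" idea from the third step.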
That's it... as easy as "one-two-three"... or "a-b-c" ... :) Yup,
I may have oversimplified it... but those are the facts, and as you move on
you'll soon see that it's as easy as that.