Chapter 1 - PE File Format From a Distance

Now there's one thing you have to remember when you start to learn the secrets of any file format, especially one as complex and intimidating as this one: you're the boss... you can delete it, change the bits or whatever you want! It's only a big array of 1's and 0's.

In we go... a 350g jar of coffee and a kettle full of hot water and we're ready :)

PE = Portable Executable, yup that's what it stands for. It's the native executable format for every Win32 OS, which means that whichever version of Windows you're running, its exe's and dll's follow the PE file format! (Linux and Unix use their own formats, such as ELF.) So learning this is going to give you a lot of power :)

In we go, let's see what it looks like underneath the bonnet:

DOS MZ header

DOS stub

PE header

Section table

Section 1

Section 2

Section ..

Section n

Well there you go, I bet you're getting scared already... well don't worry.. once I tell you what's in those parts you'll feel better. All PE's, even DLL's, use this structure. The first part, the DOS MZ header, is where we start reading.. the start of our journey.

Okay, the first two parts are for DOS. They go back to the days of DOS and were kept when Windows came along, so that if you run the executable in DOS you'll get the message "This program cannot be run in DOS mode". The DOS MZ header contains the offset to the PE header, and the DOS stub is a mini DOS program that prints that message... in case you're wondering where it comes from... the compiler/linker inserts it automatically when you build an exe/dll.
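To make that a bit more concrete, here's a rough Python sketch of reading the DOS MZ header. It assumes the layout from winnt.h, where the header is 64 bytes long, starts with the letters "MZ", and keeps the offset to the PE header in a field called e_lfanew at offset 0x3C. The function name and the hand-made header are made up for this example; it's not a real program's header.

```python
import struct

def read_dos_header(data: bytes) -> int:
    """Return e_lfanew, the file offset of the PE header."""
    if len(data) < 0x40:
        raise ValueError("too small to hold a DOS MZ header")
    if data[:2] != b"MZ":
        raise ValueError("missing 'MZ' signature")
    # e_lfanew is the last field of the 64-byte DOS header, at offset 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    return e_lfanew

# Hypothetical hand-made header: "MZ", zero padding, then e_lfanew = 0x80.
fake_header = b"MZ" + bytes(0x3A) + struct.pack("<I", 0x80)
pe_offset = read_dos_header(fake_header)
print(hex(pe_offset))  # 0x80
```

On a real file you'd pass in the bytes read from disk, e.g. open(path, "rb").read().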

Next, as you can see from above, is the PE header. If you open up the winnt.h file you'll find a struct called IMAGE_NT_HEADERS, which is the structure that represents our PE header. This structure tells us a lot of things about our PE: its type, the number of sections (i.e. the pieces that hold our code and data), how big it is etc.
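Here's a small Python sketch of the first part of the PE header, just to show the idea. It assumes the winnt.h layout: a 4-byte "PE\0\0" signature followed by the 20-byte IMAGE_FILE_HEADER (the optional header that comes after it is skipped here). The helper name, the namedtuple and the fake bytes are all invented for the example.

```python
import struct
from collections import namedtuple

# IMAGE_FILE_HEADER: 20 bytes, little-endian.
FILE_HEADER = struct.Struct("<HHIIIHH")
FileHeader = namedtuple(
    "FileHeader",
    "machine number_of_sections time_date_stamp pointer_to_symbol_table "
    "number_of_symbols size_of_optional_header characteristics",
)

IMAGE_FILE_DLL = 0x2000  # set in Characteristics when the PE is a DLL

def read_file_header(data: bytes, pe_offset: int) -> FileHeader:
    """Check the PE signature and unpack the file header that follows it."""
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing 'PE\\0\\0' signature")
    return FileHeader(*FILE_HEADER.unpack_from(data, pe_offset + 4))

# Hypothetical header: i386 machine (0x014C), 3 sections, a 0xE0-byte
# optional header, and flags for "executable image" + "32-bit machine".
fake = b"PE\0\0" + FILE_HEADER.pack(0x014C, 3, 0, 0, 0, 0xE0, 0x0102)
header = read_file_header(fake, 0)
print(header.number_of_sections)                     # 3
print(bool(header.characteristics & IMAGE_FILE_DLL)) # False
```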

Then come the real parts of our PE, the parts which contain our code! The code that you're going to compile soon.. in later chapters.. well, it's all in the blocks called sections. It's very, very important that you remember this: a section is nothing more than a block of data.... repeat after me... "block of data" ;) Each section has its own attributes, e.g. code/data, read/write etc.

So that you don't get lost, think of the header as a boot sector, and the sections as folders... as you would for a disk. And just as you can set file attributes for folders, such as hidden, read only etc... the same applies here.

Important news ->The grouping of data into sections is done on a common attribute basis, not on a logical basis<- It doesn't matter how the code/data are used: if they have the same attributes they can be lumped together into a section. Remember, a section, like a folder, can contain a variety of stuff... so when you think of a section, it can contain code/data etc... it is grouped based on the attributes.
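To give you a feel for those attributes, here's a little Python sketch that turns a section's attribute bits into words. The bit values are the IMAGE_SCN_* constants from winnt.h; the table, the function name and the wording of the descriptions are just my own choices for this example, and only a handful of the flags are covered.

```python
# A few of the IMAGE_SCN_* flag bits from winnt.h.
SECTION_FLAGS = {
    0x00000020: "contains code",
    0x00000040: "contains initialized data",
    0x00000080: "contains uninitialized data",
    0x20000000: "executable",
    0x40000000: "readable",
    0x80000000: "writable",
}

def describe(characteristics: int) -> list:
    """List the attributes whose bit is set in a Characteristics value."""
    return [name for bit, name in SECTION_FLAGS.items() if characteristics & bit]

print(describe(0x60000020))  # ['contains code', 'executable', 'readable']
print(describe(0xC0000040))  # ['contains initialized data', 'readable', 'writable']
```

The first value is what you'd typically see on a code section and the second on a writable data section, which is the "grouped by attributes" idea in action.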

Well, up to this point we have the PE header, but this doesn't tell us where our data/code is... or what the sections are! For that we must look at the next part, which is the "Section table"... we're nearly there so hang on a little more. Immediately following the PE header is an 'array' of structures. When I say an array of structures, I mean the section table is made up of loads of little identical structures. Just think of a box full of apples. The section table is the box, and it's full of apples (I'll expand on this theory later).

Each entry (apple) in the section table contains the information on one of our sections: its offset, its attributes etc. So if we have 3 sections, we will have an array of 3 structures in our section table... does that make sense? Have I lost you?
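Here's the box of apples as a rough Python sketch. It assumes each apple is a 40-byte IMAGE_SECTION_HEADER as declared in winnt.h, and only keeps the fields we care about for now. The function name, the namedtuple and the two fake entries are made up for the example.

```python
import struct
from collections import namedtuple

# IMAGE_SECTION_HEADER: 8-byte name followed by nine numeric fields.
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")
Section = namedtuple(
    "Section",
    "name virtual_size virtual_address size_of_raw_data "
    "pointer_to_raw_data characteristics",
)

def read_section_table(data: bytes, table_offset: int, count: int) -> list:
    """Unpack `count` identical structures starting at `table_offset`."""
    sections = []
    for i in range(count):
        (name, vsize, vaddr, raw_size, raw_ptr,
         _relocs, _lines, _nrelocs, _nlines, chars) = SECTION_HEADER.unpack_from(
            data, table_offset + i * SECTION_HEADER.size)
        # Names shorter than 8 bytes are padded with zero bytes.
        text = name.rstrip(b"\0").decode("ascii", "replace")
        sections.append(Section(text, vsize, vaddr, raw_size, raw_ptr, chars))
    return sections

# Hypothetical table with two apples in the box.
fake_table = (
    SECTION_HEADER.pack(b".text", 0x1000, 0x1000, 0x200, 0x400, 0, 0, 0, 0, 0x60000020)
    + SECTION_HEADER.pack(b".data", 0x200, 0x2000, 0x200, 0x600, 0, 0, 0, 0, 0xC0000040)
)
for section in read_section_table(fake_table, 0, 2):
    print(section.name, hex(section.pointer_to_raw_data), hex(section.characteristics))
```

In a real file the count comes from the PE header, and the table starts right after the PE header ends.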

In a nutshell that's the PE layout! At this point there are usually questions of "why" and "what's that for" and "what if" .. well, stay with me on this journey of discovery and we'll answer your questions.

Let's review what actually happens when you run a PE (an exe/dll).

The PE file is run, and the PE Loader examines the DOS MZ header for the offset to the PE header. If it finds it, it jumps to our PE header.
The PE Loader then checks that the PE header is valid.
The PE Loader goes to the end of the PE header and starts to read in the information from the section table. Between the PE header and the section table, the PE Loader now knows how many sections we have and what their attributes are. It loads our sections into memory (using file mapping) with their designated attributes, e.g. read only etc. When I say file mapping, I mean the data from the section is copied into memory exactly as it is! So if you could look at that memory you'd see an exact copy of your PE section. :)
The PE Loader then moves on a notch and checks for imports from other files etc.... which is handled through the import table, which we'll get to in a bit.
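
If it helps, here's a toy Python sketch of the first three steps. It is nothing like the real Windows loader: it only follows the offsets, "maps" each section by copying its bytes into a dictionary keyed by virtual address, and ignores imports completely. The function names and the fake file are invented for the example, and the fake file has no optional header at all, which a real exe always has.

```python
import struct

FILE_HEADER = struct.Struct("<HHIIIHH")         # IMAGE_FILE_HEADER, 20 bytes
SECTION_HEADER = struct.Struct("<8sIIIIIIHHI")  # IMAGE_SECTION_HEADER, 40 bytes

def load_image(data: bytes) -> dict:
    """Return {virtual_address: (section_name, section_bytes)}."""
    # Step 1: the DOS MZ header gives us the offset to the PE header.
    if data[:2] != b"MZ":
        raise ValueError("not an MZ file")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    # Step 2: check that the PE header looks valid.
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    file_header = FILE_HEADER.unpack_from(data, pe_offset + 4)
    number_of_sections, size_of_optional_header = file_header[1], file_header[5]
    # Step 3: the section table sits right after the end of the PE header.
    table = pe_offset + 4 + FILE_HEADER.size + size_of_optional_header
    mapped = {}
    for i in range(number_of_sections):
        fields = SECTION_HEADER.unpack_from(data, table + i * SECTION_HEADER.size)
        name = fields[0].rstrip(b"\0").decode("ascii", "replace")
        virtual_address, raw_size, raw_ptr = fields[2], fields[3], fields[4]
        # "Mapping": copy the section's bytes exactly as they are in the file.
        mapped[virtual_address] = (name, data[raw_ptr:raw_ptr + raw_size])
    return mapped

def build_fake_pe() -> bytes:
    """Hypothetical one-section file, just big enough for load_image()."""
    dos = b"MZ" + bytes(0x3A) + struct.pack("<I", 0x40)
    nt = b"PE\0\0" + FILE_HEADER.pack(0x014C, 1, 0, 0, 0, 0, 0x0102)
    body = b"\x90\x90\xC3"  # three arbitrary bytes standing in for code
    raw_ptr = len(dos) + len(nt) + SECTION_HEADER.size
    section = SECTION_HEADER.pack(
        b".text", len(body), 0x1000, len(body), raw_ptr, 0, 0, 0, 0, 0x60000020)
    return dos + nt + section + body

image = load_image(build_fake_pe())
print(image[0x1000])  # ('.text', b'\x90\x90\xc3')
```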

That's it... as easy as "one-two-three"... or "a-b-c" ... :) Yup, I may have oversimplified it... but those are the facts, and as you move on you'll soon see that it's as easy as that.
