📄 the pe file format.txt
字号:
'NumberOfNamedEntries'+'NumberOfIdEntries' structs which are of the
format 'IMAGE_RESOURCE_DIRECTORY_ENTRY', those with the names coming first.
They may point to further 'IMAGE_RESOURCE_DIRECTORY's or they point to
the actual resource data.
A IMAGE_RESOURCE_DIRECTORY_ENTRY consists of:
32 bits giving you the id of the resource or the directory
it describes;
32 bits offset to the data or offset to the next sub-directory.
The meaning of the id depends on the level in the tree; the id may be a
number (if the hi-bit is clear) or a name (if the hi-bit is set). If it
is a name, the lower 31 bits are the offset from the beginning of the
resource section's raw data to the name (the name consists of 16 bits
length and trailing wide characters, in unicode, not 0-terminated).
If you are in the root-directory, the id, if it is a number, is the
resource-type:
1: cursor
2: bitmap
3: icon
4: menu
5: dialog
6: string table
7: font directory
8: font
9: accelerators
10: unformatted resource data
11: message table
12: group cursor
14: group icon
16: version information
Any other number is user-defined. Any resource-type with a type-name is
always user-defined.
If you are one level deeper, the id is the resource-id (or resource-
name).
If you are another level deeper, the id must be a number, and it is the
language-id of the specific instance of the resource; for example, you
can have the same dialog in english, french and german localized forms,
and they all share the same resource-id. The system will choose the
dialog to load based on the process's locale, which in turn will usually
reflect the user's "regional setting".
(If the resource cannot be found for the process locale, the system will
first try to find a resource for the locale using a neutral sublanguage;
if it still can't be found, the instance with the smallest language id will
be used.)
To decipher the language id, split it into the primary language id and
the sublanguage id using the macros PRIMARYLANGID() and SUBLANGID(),
giving you the bits 0 to 9 or 10 to 15, respectivly. The values are
defined in the file "winresrc.h".
Language-resources are only supported for accelerators, dialogs, menus,
rcdata or stringtables; other resource-types should be
LANG_NEUTRAL/SUBLANG_NEUTRAL.
To find out whether the next level below a resource directory is another
directory, you inspect the hi-bit of the offset. If it is set, the
remaining 31 bits are the offset from the beginning of the resource
section's raw data to the next directory, again in the format
IMAGE_RESOURCE_DIRECTORY with trailing IMAGE_RESOURCE_DIRECTORY_ENTRYs.
If the bit is clear, the offset is the distance from the beginning of
the resource section's raw data to the resource's raw data description,
a IMAGE_RESOURCE_DATA_ENTRY. It consists of 32 bits 'OffsetToData' (the
offset to the raw data, counting from the beginning of the resource
section's raw data), 32 bits of 'Size' of the data, 32 bits 'CodePage'
and 32 unused bits.
(The use of codepages is discouraged, you should use the 'language'-
feature to support multiple locales.)
The raw data format depends on the resource type; descriptions can be
found in the MS SDK documentation. Note that any string in resources is
always in UNICODE.
relocations
-----------
The last data directory I will describe is the base relocation
directory. It is pointed to by the IMAGE_DIRECTORY_ENTRY_BASERELOC entry
in the data directories of the optional header. It is typically
contained in a section if its own, with a name like ".reloc" and the
bits IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_DISCARDABLE and
IMAGE_SCN_MEM_READ set.
The relocation data is needed by the loader if the image cannot be
loaded to the preferred load address 'ImageBase' mentioned in the
optional header. In this case, the fixed addresses supplied by the
linker are no longer valid, and the loader has to apply fixups for
absolute addresses used for locations of static variables, string
literals and so on.
The relocation directory is a sequence of chunks. Each chunk contains
the relocation information for 4 KB of the image. A chunk starts with a
'IMAGE_BASE_RELOCATION' struct. It consists of 32 bits 'VirtualAddress'
and 32 bits 'SizeOfBlock'. It is followed by the chunk's actual
relocation data, being 16 bits each.
The 'VirtualAddress' is the base RVA that the relocations of this chunk
need to be applied to; the 'SizeOfBlock' is the size of the entire chunk
in bytes.
The number of trailing relocations is
('SizeOfBlock'-sizeof(IMAGE_BASE_RELOCATION))/2
The relocation information ends when you encounter a
IMAGE_BASE_RELOCATION struct with a 'VirtualAddress' of 0.
Each 16-bit-relocation information consists of the actual relocation
position in the lower 12 bits and a relocation type in the high 4 bits.
To get the relocation RVA, you need to add the IMAGE_BASE_RELOCATION's
'VirtualAddress' to the 12-bit-position. The type is one of:
IMAGE_REL_BASED_ABSOLUTE (0)
This is a no-op; it is used to align the chunk to a 32-bits-
border.
IMAGE_REL_BASED_HIGH (1)
The relocation must be applied to the high 16 bits of the DWORD
in question.
IMAGE_REL_BASED_LOW (2)
The relocation must be applied to the low 16 bits of the DWORD
in question.
IMAGE_REL_BASED_HIGHLOW (3)
The relocation must be applied to the entire 32 bits in question.
IMAGE_REL_BASED_HIGHADJ (4)
Unknown
IMAGE_REL_BASED_MIPS_JMPADDR (5)
Unknown
IMAGE_REL_BASED_SECTION (6)
Unknown
IMAGE_REL_BASED_REL32 (7)
Unknown
As an example, if you find the relocation information to be
0x00004000 (32 bits, starting RVA)
0x00000010 (32 bits, size of chunk)
0x3012 (16 bits reloc data)
0x3080 (16 bits reloc data)
0x30f6 (16 bits reloc data)
0x0000 (16 bits reloc data)
0x00000000 (next chunk's RVA)
0xff341234
you know the first chunk describes relocations starting at RVA 0x4000 and
is 16 bytes long. Because the header uses 8 bytes and one relocation
uses 2 bytes, there are (16-8)/2=4 relocations in the chunk.
The first relocation is to be applied to the DWORD at 0x4012, the next
to the DWORD at 0x4080, and the third to the DWORD at 0x40f6. The last
relocation is a no-op.
The next chunk has a RVA of 0 and finishes the list.
Now, how do you do a relocation?
You know that the image *is* relocated to the preferred load address
'ImageBase' in the optional header; you also know the address you did
load the image to. If they match, you don't need to do anything.
If they don't match, you calculate the difference
actual_base-preferred_base
and add that value to the relocation positions, which you will find with
the method described above.
Acknowledgments
---------------
Thanks go to David Binette for his debugging and proof-reading.
(The remaining errors are entirely mine.)
Copyright
---------
This text is copyright 1998 by B. Luevelsmeyer. It is freeware, and you
may use it for any purpose but on you own risk. It contains errors and
it is incomplete. You have been warned.
Bug reports
-----------
Send any bug reports (or other comments) to
bernd.luevelsmeyer@iplan.heitec.net
Literature
----------
[1]
"Peering Inside the PE: A Tour of the Win32 Portable Executable File
Format" (M. Pietrek), in: Microsoft Systems Journal 3/1994
[2]
"Why to Use _declspec(dllimport) & _declspec(dllexport) In Code", MS
Knowledge Base Q132044
[3]
"Windows Q&A" (M. Pietrek), in: Microsoft Systems Journal 8/1995
[4]
"Writing Multiple-Language Resources", MS Knowledge Base Q89866
[5]
"The Portable Executable File Format from Top to Bottom" (Randy Kath),
in: Microsoft Developer Network
Appendix: hello world
---------------------
In this appendix I will show how to make programs by hand. The example
will use Intel-assembly, because I don't speak DEC Alpha.
The program will be the equivalent of
#include <stdio.h>
int main(void)
{
puts(hello,world);
return 0;
}
First, I translate it to use Win32 functions instead of the C runtime:
#define STD_OUTPUT_HANDLE -11UL
#define hello "hello, world\n"
__declspec(dllimport) unsigned long __stdcall
GetStdHandle(unsigned long hdl);
__declspec(dllimport) unsigned long __stdcall
WriteConsoleA(unsigned long hConsoleOutput,
const void *buffer,
unsigned long chrs,
unsigned long *written,
unsigned long unused
);
static unsigned long written;
void startup(void)
{
WriteConsoleA(GetStdHandle(STD_OUTPUT_HANDLE),hello,sizeof(hello)-1,&written,0);
return;
}
Now I will fumble out the assembly:
startup:
; parameters for WriteConsole(), backwards
6A 00 push 0x00000000
68 ?? ?? ?? ?? push offset _written
6A 0D push 0x0000000d
68 ?? ?? ?? ?? push offset hello
; parameter for GetStdHandle()
6A F5 push 0xfffffff5
2E FF 15 ?? ?? ?? ?? call dword ptr cs:__imp__GetStdHandle@4
; result is last parameter for WriteConsole()
50 push eax
2E FF 15 ?? ?? ?? ?? call dword ptr cs:__imp__WriteConsoleA@20
C3 ret
hello:
68 65 6C 6C 6F 2C 20 77 6F 72 6C 64 0A "hello, world\n"
_written:
00 00 00 00
I need to find the functions WriteConsoleA() and GetStdHandle(). They
happen to be in "kernel32.dll".
Now I can start to make the executable. Question marks will take the
place of yet-to-find-out values; they will be patched afterwards.
First the DOS-header, starting at 0x0 and being 0x40 bytes long:
00 | 4d 5a 00 00 00 00 00 00 00 00 00 00 00 00 00 00
10 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
20 | 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
30 | 00 00 00 00 00 00 00 00 00 00 00 00 40 00 00 00
As you can see, this isn't really a MS-DOS program. It's just the header
with the signature "MZ" at the beginning and the e_lfanew pointing
immediatly after the header, without any code. That's because it isn't
intended to run on MS-DOS.
Then the signature, starting at 0x40 and being 0x4 bytes long:
50 45 00 00
Now the file-header, which will start at byte 0x44 and is 0x14 bytes long:
Machine 4c 01 ; i386
NumberOfSections 02 00 ; code and data
TimeDateStamp 00 00 00 00 ; who cares?
PointerToSymbolTable 00 00 00 00 ; unused
NumberOfSymbols 00 00 00 00 ; unused
SizeOfOptionalHeader e0 00 ; constant
Characteristics 02 01 ; executable on 32-bit-machine
And the optional header, which will start at byte 0x58 and is 0x60 bytes long:
Magic 0b 01 ; constant
MajorLinkerVersion 00 ; I'm version 0.0 :-)
MinorLinkerVersion 00 ;
SizeOfCode 20 00 00 00 ; 32 bytes of code
SizeOfInitializedData ?? ?? ?? ?? ; yet to find out
SizeOfUninitializedData 00 00 00 00 ; we don't have a BSS
AddressOfEntryPoint ?? ??
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -