</P>



<P>Following the header structure header is an area called the 

index. The index contains one or more index entries. 

Each index entry contains information about, and a pointer 

to, a specific data item.

</P>



<P>After the index comes the store. It is in the store that the data items are kept. The data in 

the store is packed together as closely as possible. The order in which the data is stored is 

immaterial&#151;a far cry from the C structure used in the lead.

</P>



<B>A.2.1.1.1.1.3. The Header Structure in Depth</B>





<P>Let's take a more in-depth look at the actual format of a header structure, starting with 

the header structure header.

</P>



<B>A.2.1.1.1.1.4. The Header Structure Header</B>



<P>The header structure header always starts with a 3-byte magic number: 

8e ad e8. Following this is a 1-byte version number. Next are 4 bytes that are reserved for future expansion. 

After the reserved bytes is a 4-byte number that indicates how many index entries exist in this 

header structure, followed by another 4-byte number indicating how many bytes of data are part 

of the header structure.

</P>
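<P>The layout just described can be sketched in a few lines of Python. This is only an illustration, not code from the RPM sources: the function name and the returned dictionary are assumptions made for this example. The sample bytes are the signature's header structure header from the package examined later in this appendix.
</P>

```python
import struct

HEADER_MAGIC = bytes.fromhex("8eade8")  # 3-byte magic number from the text

def parse_header_structure_header(raw):
    """Split the 16-byte header structure header into its fields."""
    # "!" selects network (big-endian) byte order with no padding:
    # 3s = magic, B = version, 4s = reserved, I = entry count, I = store size
    magic, version, reserved, entry_count, store_size = struct.unpack(
        "!3sB4sII", raw[:16])
    if magic != HEADER_MAGIC:
        raise ValueError("not a header structure: bad magic number")
    return {"version": version,
            "entry_count": entry_count,
            "store_size": store_size}

# Sample: magic + version, reserved, entry count, store size
sample = bytes.fromhex("8eade801" "00000000" "00000003" "000000ac")
info = parse_header_structure_header(sample)
print(info)
```

<P>For the sample bytes this reports version 1, three index entries, and a 172-byte store.
</P>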



<B>A.2.1.1.1.1.5. The Index Entry</B>



<P>The header structure's index is made up of zero or more index entries. Each entry is 16 

bytes long. The first 4 bytes contain a tag&#151;a numeric value that identifies what type of data is 

pointed to by the entry. The tag values change according to the header structure's position in the 

RPM file. A list of the actual tag values, and what they represent, is included in section A.2.1.3.2.

</P>



<A NAME="PAGENUM-343"><P>Page 343</P></A>



<P>Following the tag is a 4-byte type, which is a numeric value that describes the format of 

the data pointed to by the entry. The types and their values do not change from header 

structure to header structure. Here is the current list:

</P>



<UL>

<LI> NULL = 0

<LI> CHAR = 1

<LI> INT8 = 2

<LI> INT16 = 3

<LI> INT32 = 4

<LI> INT64 = 5

<LI> STRING = 6

<LI> BIN = 7

<LI> STRING_ARRAY = 8

</UL>





<P>A few of the data types might need some clarification. The 

STRING data type is simply a null-terminated string, while the 

STRING_ARRAY is a collection of strings. Finally, the 

BIN data type is a collection of binary data. This is normally used to identify data that is longer than an 

INT but is not a printable STRING.

</P>



<P>Next is a 4-byte offset that contains the position of the data, relative to the beginning of 

the store. We'll talk about the store in just a moment.

</P>



<P>Finally, there is a 4-byte count that contains the number of data items pointed to by the 

index entry. There are a few wrinkles to the meaning of the count, and they center around the 

STRING and STRING_ARRAY data types. STRING data always has a count of 1, while 

STRING_ARRAY data has a count equal to the number of strings contained in the store.

</P>
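<P>As a rough sketch of the index entry layout (tag, type, offset, count), the following Python decodes one 16-byte entry. The function name and dictionary keys are hypothetical; the type values are those listed above, and the sample entry is the first index entry of the signature examined later in this appendix.
</P>

```python
import struct

# Type values as listed in the text
TYPE_NAMES = {0: "NULL", 1: "CHAR", 2: "INT8", 3: "INT16", 4: "INT32",
              5: "INT64", 6: "STRING", 7: "BIN", 8: "STRING_ARRAY"}

def parse_index_entry(raw):
    """Decode one 16-byte index entry: four 32-bit network-order integers."""
    tag, type_code, offset, count = struct.unpack("!IIII", raw[:16])
    return {"tag": tag,
            "type": TYPE_NAMES.get(type_code, "UNKNOWN"),
            "offset": offset,
            "count": count}

entry = parse_index_entry(
    bytes.fromhex("000003e8" "00000004" "00000000" "00000001"))
print(entry)
```

<P>The sample decodes to tag 1000, type INT32, offset 0, and count 1.
</P>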



<B>A.2.1.1.1.1.6. The Store</B>





<P>The store is where the data contained in the header structure is stored. Depending on the 

data type being stored, there are some details that should be kept in mind:

</P>





<UL>

<LI> For STRING data, each string is terminated with a null byte.

<LI> For INT data, each integer is stored at the natural boundary for its type. A 64-bit 

     INT is stored on an 8-byte boundary, a 16-bit INT is stored on a 2-byte boundary, and so on.

<LI> All data is in network byte order.

</UL>
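<P>The three rules above might be applied as in the following sketch. It is not taken from the RPM sources: the helper names and the demonstration store are made up, and it assumes that the strings of a STRING_ARRAY are stored back to back, which the text does not state explicitly.
</P>

```python
import struct

def align(offset, size):
    """Round offset up to the next multiple of size (the natural boundary)."""
    return (offset + size - 1) // size * size

def read_int32s(store, offset, count):
    """Read `count` 32-bit integers in network byte order."""
    return list(struct.unpack_from("!%dI" % count, store, offset))

def read_strings(store, offset, count):
    """Read `count` consecutive null-terminated strings (assumed layout)."""
    strings = []
    for _ in range(count):
        end = store.index(b"\x00", offset)   # locate the terminating null
        strings.append(store[offset:end].decode("ascii"))
        offset = end + 1                     # next string follows the null
    return strings

# Made-up store: two strings (7 bytes), 1 pad byte, then one INT32 at offset 8
demo_store = b"rpm\x00" + b"ab\x00" + b"\x00" + struct.pack("!I", 281679)

names = read_strings(demo_store, 0, 2)
int_offset = align(7, 4)
values = read_int32s(demo_store, int_offset, 1)
print(names, int_offset, values)
```

<P>Note that the integer cannot start at offset 7, right after the strings; it is pushed to offset 8, the next 4-byte boundary.
</P>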



<P>With all these details out of the way, let's take a look at the signature.

</P>



<H4>

A.2.1.2. The Signature

</H4>



<P>The signature section follows the lead in the RPM package file. It contains information 

that can be used to verify the integrity and, optionally, the authenticity of the majority of the 

package file. The signature is implemented as a header structure.

</P>



<A NAME="PAGENUM-344"><P>Page 344</P></A>



<P>You probably noticed our use of the word 

majority. The information in the signature header structure is based on the contents of the package file's header and archive only. The data in 

the lead and the signature header structure is not included when the signature information is 

created, nor is it part of any subsequent checks based on that information.

</P>



<P>While that omission might seem to be a weakness in RPM's design, it really isn't. In the case 

of the lead, since it is used only for easy identification of package files, any changes made to 

that part of the file would, at worst, leave the file in such a state that RPM wouldn't recognize it 

as a valid package file. Likewise, any changes to the signature header structure would make 

it impossible to verify the file's integrity, since the signature information would have been 

changed from its original value.

</P>



<B>A.2.1.2.1. Analyzing the Signature Area</B>



<P>Using our newfound knowledge of header structures, let's take a look at the signatures in rpm-2.2.1-1.i386.rpm:

</P>



<!-- CODE SNIP //-->

<PRE>



00000060: 8ead e801 0000 0000 0000 0003 0000 00ac ................

</PRE>

<!-- END CODE SNIP //-->



<P>The first 3 bytes (8ead e8) contain the magic number for the start of the header structure. 

The next byte (01) is the header structure's version.

</P>



<P>As we discussed earlier, the next 4 bytes (0000 

0000) are reserved. The 4 bytes after that (0000 

0003) represent the number of index entries in the signature section, namely, three. 

Following that are 4 bytes (0000 00ac) that indicate how many bytes of data are stored in the 

signature. The hex value 00ac, when converted to decimal, means the store is 172 bytes long.

</P>



<P>Following the first 16 bytes is the index. Each of the three index entries in this header 

structure consists of four 32-bit integers, in the following order:

</P>



<UL>

<LI> Tag

<LI> Type

<LI> Offset

<LI> Count

</UL>





<P>Let's take a look at the first index entry:

</P>



<!-- CODE SNIP //-->

<PRE>

00000070: 0000 03e8 0000 0004 0000 0000 0000 0001 ................

</PRE>

<!-- END CODE SNIP //-->



<P>The tag consists of the first 4 bytes (0000 

03e8), which is 1,000 when translated from hex. 

Looking in the RPM source directory, at the file 

lib/signature.h, we find the following tag definitions:

</P>



<!-- CODE SNIP //-->

<PRE>

#define SIGTAG_SIZE        1000

#define SIGTAG_MD5         1001

#define SIGTAG_PGP         1002

</PRE>

<!-- END CODE SNIP //-->





<P>So the tag we are studying is for a size signature. Let's continue.

</P>



<A NAME="PAGENUM-345"><P>Page 345</P></A>





<P>The next 4 bytes (0000 0004) contain the data type. As we saw earlier, data type 4 means 

that the data stored for this index entry is a 32-bit integer. Skipping the next 4 bytes for a 

moment, the last 4 bytes (0000 0001) are the number of 32-bit integers pointed to by this index entry.

</P>



<P>Now let's go back to the 4 bytes prior to the count 

(0000 0000). This number is the offset, in bytes, at which the size signature is located. It has a value of zero, but the question is, 0 

bytes from what? The answer, although it doesn't do us much good, is that the offset is 

calculated from the start of the store. So first we must find where the store begins, and we can do that 

by performing a simple calculation.

</P>



<P>First, go back to the start of the signature section. We've made a copy here so you won't 

need to flip from page to page:

</P>



<!-- CODE SNIP //-->

<PRE>

00000060: 8ead e801 0000 0000 0000 0003 0000 00ac ................

</PRE>

<!-- END CODE SNIP //-->



<P>After the magic, the version, and the 4 reserved bytes comes the number of index 

entries (0000 0003). Since we know that each index entry is 16 bytes long (4 for the tag, 4 for the 

type, 4 for the offset, and 4 for the count), we can multiply the number of entries (3) by the 

number of bytes in each entry (16) and obtain the total size of the index, which is 48 in decimal, or 

30 in hex. Since the first index entry starts at hex offset 70, we can simply add hex 30 to hex 

70, and get, in hex, offset a0. So let's skip down to offset a0 and see what's there:

</P>



<!-- CODE SNIP //-->

<PRE>

000000a0: 0004 4c4f b025 b097 1597 0132 df35 d169 ..LO.%.....2.5.i

</PRE>

<!-- END CODE SNIP //-->



<P>If we've done our math correctly, the first 4 bytes 

(0004 4c4f) should represent the size of this file. Converting to decimal, this is 281,679. Let's take a look at the size of the actual file:

</P>



<!-- CODE SNIP //-->

<PRE>

# ls -al rpm-2.2.1-1.i386.rpm

-rw-rw-r-- 1 ed ed 282015 Jul 21 16:05 rpm-2.2.1-1.i386.rpm

#

</PRE>

<!-- END CODE SNIP //-->



<P>Hmmm, something's not right. Or is it? It looks like we're short by 336 bytes, or in hex, 

150. Interesting how that's a nice round hex number, isn't it? For now, let's continue through 

the remainder of the index entries, and see if hex 150 pops up elsewhere.

</P>
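<P>The arithmetic so far can be double-checked with a short sketch. The constant and function names are illustrative only; the numbers are the ones read from the dump and the ls listing above.
</P>

```python
HEADER_SIZE = 16   # magic, version, reserved bytes, entry count, store size
ENTRY_SIZE = 16    # tag, type, offset, count (4 bytes each)

def store_offset(structure_start, entry_count):
    """File offset of the store for a header structure at structure_start."""
    index_start = structure_start + HEADER_SIZE
    return index_start + entry_count * ENTRY_SIZE

signature_store = store_offset(0x60, 3)   # signature starts at hex 60
size_signature = int("00044c4f", 16)      # first 4 bytes of the store
shortfall = 282015 - size_signature       # file size from the ls listing

print(hex(signature_store), size_signature, hex(shortfall))
```

<P>This confirms the store offset of hex a0, the stored size of 281,679, and the shortfall of hex 150.
</P>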



<P>Here's the next index entry. It has a tag of decimal 1001, which is an MD5 checksum. It 

is type 7, the BIN data type. Its count shows that it is 16 bytes long, and its data starts 4 bytes after the 

beginning of the store:

</P>



<!-- CODE SNIP //-->

<PRE>

00000080: 0000 03e9 0000 0007 0000 0004 0000 0010 ................

</PRE>

<!-- END CODE SNIP //-->



<P>And here's the data. It starts with b025 (Remember that offset of four!) and ends on the 

second line with 5375. This is a 128-bit MD5 checksum of the package file's header and archive 

sections:

</P>



<!-- CODE SNIP //-->

<PRE>

000000a0: 0004 4c4f b025 b097 1597 0132 df35 d169 ..LO.%.....2.5.i

000000b0: 329c 5375 8900 9503 0500 31ed 6390 a520 2.Su......1.c..

</PRE>

<!-- END CODE SNIP //-->
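<P>Pulling a data item out of the store is just a slice by offset and count. The sketch below uses the 32 store bytes shown in the dump above; the helper name read_bin is made up for this example.
</P>

```python
# Store bytes copied from the dump lines at file offsets a0 and b0
store_start = bytes.fromhex(
    "00044c4f b025b097 15970132 df35d169"
    "329c5375 89009503 050031ed 6390a520"
)

def read_bin(store, offset, count):
    """Return `count` raw bytes starting `offset` bytes into the store."""
    return store[offset:offset + count]

# Offset 4 and count 16 come from the MD5 index entry
md5 = read_bin(store_start, 4, 16)
# Offset 0, one 32-bit integer: the size signature, in network byte order
size = int.from_bytes(read_bin(store_start, 0, 4), "big")

print(md5.hex())
print(size)
```

<P>The slice begins with b025 and ends with 5375, matching the bytes picked out by eye.
</P>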



<P>Okay, let's jump back to the last index entry:

</P>



<!-- CODE SNIP //-->

<PRE>

00000090: 0000 03ea 0000 0007 0000 0014 0000 0098 ................

</PRE>

<!-- END CODE SNIP //-->



<A NAME="PAGENUM-346"><P>Page 346</P></A>



<P>It has a tag value of 03ea (1002 in decimal&#151;a PGP signature block) and is also a 

BIN data type. The data starts 20 decimal bytes from the start of the store, which would put it at file 

offset b4 (in hex). It's a biggie&#151;152 bytes long! Here's the data, starting with 

8900:

</P>



<!-- CODE //-->

<PRE>

000000b0: 329c 5375 8900 9503 0500 31ed 6390 a520 2.Su......1.c..

000000c0: e8f1 cba2 9bf9 0101 437b 0400 9c8e 0ad4 ........C{......

000000d0: 3790 364e dfb0 9a8a 22b5 b0b3 dc30 4c6f 7.6N....&quot;....0Lo

000000e0: 91b8 c150 704e 2c64 d88a 8fca 18ab 5b6f ...PpN,d......[o

000000f0: f041 ebc8 d18a 01c9 3601 66f0 9ddd e956 .A......6.f....V

00000100: 3142 61b3 b1da 8494 6bef 9c19 4574 c49f 1Ba.....k...Et..

00000110: ee17 35e1 d105 fb68 0ce6 715a 60f1 c660 ..5....h..qZ`..`

00000120: 279f 0306 28ed 0ba0 0855 9e82 2b1c 2ede '...(....U..+...

00000130: e8e3 5090 6260 0b3c ba04 69a9 2573 1bbb ..P.b`.&lt;..i.%s..

00000140: 5b65 4de1 b1d2 c07f 8afa 4a9b 0000 0000 [eM.......J.....

</PRE>

<!-- END CODE //-->



<P>It ends with the bytes 4a9b. This is a 1,216-bit PGP signature block. It is also the end of 

the signature section. There are 4 null bytes following the last data item in order to round the 

size out so that it ends on an 8-byte boundary. This means that the next section 

starts at offset 150, in hex. Say, wasn't the size in the size signature off by 150 hex? Yes, the size in 

the signature is the size of the file&#151;minus the size of the lead and the signature sections.

</P>
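<P>The padding can be verified with a small calculation. This is a sketch with a hypothetical helper name; the start offset and store size are the values read from the signature's header structure.
</P>

```python
def pad_to_boundary(offset, boundary=8):
    """Round offset up to the next multiple of boundary."""
    return (offset + boundary - 1) // boundary * boundary

store_start = 0xa0                      # start of the signature's store
store_size = 0xac                       # from the header structure header
store_end = store_start + store_size    # first byte after the PGP block
next_section = pad_to_boundary(store_end)

print(hex(store_end), next_section - store_end, hex(next_section))
```

<P>The store ends at hex 14c, 4 bytes of padding follow, and the next section begins at hex 150.
</P>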



<B>

A.2.1.3. The Header

</B>





<P>The header section contains all available information about the package. Entries such as 

the package's name, version, and file list are contained in the header. Like the signature 

section, the header is in header structure format. Unlike the signature, which has only three 

possible tag types, the header has more than 60 different tags. (The list of currently defined tags 

appears in section A.2.1.3.2.) Be aware that the list of tags changes frequently; the definitive list 

appears in the RPM sources in lib/rpmlib.h.

</P>



<B>A.2.1.3.1. Analyzing the Header</B>





<P>The easiest way to find the start of the header is to look for the second header structure 

by scanning for its magic number (8ead e8). The 16 bytes, starting with the magic, are the 

header structure's header. They follow the same format as the header in the signature's header 

structure:

</P>



<!-- CODE SNIP //-->

<PRE>



00000150: 8ead e801 0000 0000 0000 0021 0000 09d3 ...........!....

</PRE>

<!-- END CODE SNIP //-->



<P>As before, the byte following the magic identifies this header structure as being in version 

1 format. Following the 4 reserved bytes, we find the count of entries stored in the header 

(0000 0021). Converting to decimal, we find that there are 33 entries in the header. The next 4 

bytes (0000 09d3), converted to decimal, tell us that there are 2,515 bytes of data in the store.

</P>
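<P>Scanning for the magic number, as suggested above, could look something like the following. The function name and the synthetic test data are assumptions for illustration. A naive scan like this may also match the same 3 bytes if they happen to occur inside signature data, so computing the offset from the signature's own header is the more robust approach.
</P>

```python
HEADER_MAGIC = bytes.fromhex("8eade8")

def find_header_structures(data, limit=2):
    """Return file offsets of the first `limit` occurrences of the magic."""
    offsets = []
    pos = data.find(HEADER_MAGIC)
    while pos != -1 and len(offsets) != limit:
        offsets.append(pos)
        pos = data.find(HEADER_MAGIC, pos + 1)
    return offsets

# Synthetic stand-in for a package file: zero filler with the magic and
# version byte placed at hex 60 (signature) and hex 150 (header)
data = (b"\x00" * 0x60 + HEADER_MAGIC + b"\x01"
        + b"\x00" * (0x150 - 0x64) + HEADER_MAGIC + b"\x01")

found = find_header_structures(data)
print([hex(offset) for offset in found])
```

<P>The second offset returned is the start of the header, hex 150 in this layout.
</P>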





<P>Since the header is a header structure just like the signature, we know that the next 16 bytes 

are the first index entry:

</P>



<!-- CODE SNIP //-->

<PRE>

00000160: 0000 03e8 0000 0006 0000 0000 0000 0001 ................

</PRE>

<!-- END CODE SNIP //-->



<A NAME="PAGENUM-347"><P>Page 347</P></A>



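<P>The first 4 bytes (0000 03e8) are the tag, which here identifies the package name. In the signature the same value, 1,000, meant a size signature, because tag values depend on which header structure they appear in. The next 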

4 bytes indicate that the data is type 6, or a null-terminated string. There's an offset of 0 in 

the next 4 bytes, meaning that the data for this tag is first in the store. Finally, the last 4 bytes 

(0000 0001) show that the data count is 1, which is the only legal value for data of type 

STRING.

</P>



<P>To find the data, we need to take the offset of the first index entry in the 

header (160) and add the count of index entries (21) multiplied by the size of an index entry 

(10). Doing the math (all the values shown are in hex, remember!), we arrive at the offset to 

the store, hex 370. Since the offset for this particular index entry is 0, the data should start at 

offset 370:

</P>