Windows Help File Format / Annotation File Format / SHG and MRB File Format

This documentation describes the file format parsed by HELPDECO, because
Microsoft did not publish the file formats used by WinHelp and MultiMedia
Viewers, and created by HC30, HC31, HCP, HCRTF, HCW, MVC, MMVC and WMVC.
It is therefore not an official reference, but the result of many weekends
of work dumping 500+ help files and trying to understand what all the bytes
may mean.
I would like to thank Pete Davis, who first tried to describe 'The Windows
Help File Format' in Dr. Dobb's Journal, Sep/Oct 1993; Holger Haase, who
did a lot of work on picture file formats; and Bent Lynggaard, for the
information on free lists in help files and unused bytes in B+ trees.

Revision 1: Fixed hash value calculation and |FONT, minor additions
Revision 2: Transparent bitmaps, {button}, and {mci} commands
Revision 3: Unknown in Paragraphinfo changed, minor additions
Revision 4: CTXOMAP corrected, bitmap dimensions dpi - not PelsPerMeter
Revision 5: MacroData in HotspotInfo added, Annotation file format added
Revision 6: [MACROS] section / internal file |Rose added, MVB font structure
Revision 7: [GROUPS] section *.GRP and [CHARTAB] section *.tbl file format
Revision 8: free list, clarified TOPICPOS/TOPICOFFSET
Revision 9: B+ tree unused bytes and what I found out about GID files

A help file starts with a header, the only structure at a fixed place

long Magic		   0x00035F3F
long DirectoryStart	   offset of FILEHEADER of internal directory
long FirstFreeBlock	   offset of FREEHEADER or -1L if no free list
long EntireFileSize	   size of entire help file in bytes
----
char HelpFileContent[EntireFileSize-16]   the remainder of the help file
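
As an illustration, the header listed above could be read like this. This is
only a minimal sketch: it assumes the fields are stored little-endian (Intel
byte order) as 32-bit values, and the names rd32, HelpHeader and
parse_help_header are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a little-endian 32-bit value. */
static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Field names follow the listing above; the struct name is made up. */
struct HelpHeader {
    uint32_t Magic;
    int32_t  DirectoryStart;
    int32_t  FirstFreeBlock;   /* -1 if there is no free list */
    int32_t  EntireFileSize;
};

/* Returns 0 on success, -1 if the buffer is too short or Magic is wrong. */
static int parse_help_header(const unsigned char *buf, size_t len,
                             struct HelpHeader *out)
{
    if (len < 16) return -1;
    out->Magic          = rd32(buf);
    out->DirectoryStart = (int32_t)rd32(buf + 4);
    out->FirstFreeBlock = (int32_t)rd32(buf + 8);
    out->EntireFileSize = (int32_t)rd32(buf + 12);
    return out->Magic == 0x00035F3Fu ? 0 : -1;
}
```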

At offset DirectoryStart the FILEHEADER of the internal directory is located

long ReservedSpace	     size reserved including FILEHEADER
long UsedSpace		     size of internal file in bytes
unsigned char FileFlags      normally 4
----
char FileContent[UsedSpace]  the bytes contained in the internal file
char FreeSpace[ReservedSpace-UsedSpace-9]
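
The following sketch shows how a FILEHEADER at a given offset might be decoded
and where FileContent then begins. It assumes little-endian storage and
treats a header whose ReservedSpace is smaller than UsedSpace+9 as damaged,
which is my own choice. ContentOffset, FreeSpace (as a struct member), rd32
and parse_file_header are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a little-endian 32-bit value. */
static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

struct FileHeader {
    int32_t ReservedSpace;
    int32_t UsedSpace;
    unsigned char FileFlags;
    size_t  ContentOffset;  /* hypothetical: file offset of FileContent */
    int32_t FreeSpace;      /* hypothetical: ReservedSpace-UsedSpace-9 */
};

/* Returns 0 on success, -1 if the header does not fit or looks damaged. */
static int parse_file_header(const unsigned char *file, size_t file_len,
                             size_t offset, struct FileHeader *out)
{
    if (offset > file_len || file_len - offset < 9) return -1;
    out->ReservedSpace = (int32_t)rd32(file + offset);
    out->UsedSpace     = (int32_t)rd32(file + offset + 4);
    out->FileFlags     = file[offset + 8];
    out->ContentOffset = offset + 9;   /* the header itself is 9 bytes */
    if (out->UsedSpace < 0 || out->ReservedSpace < 9 ||
        out->ReservedSpace - 9 < out->UsedSpace) return -1;
    if ((size_t)out->UsedSpace > file_len - out->ContentOffset) return -1;
    out->FreeSpace = out->ReservedSpace - out->UsedSpace - 9;
    return 0;
}
```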

The FILEHEADER of the internal directory is followed by UsedSpace bytes
containing the internal directory which is used to associate FileNames and
FileOffsets. The directory is structured as a B+ tree.
A B+ tree is made from leaf-pages and index-pages of fixed size, one of which
is the root-page. All entries are contained in leaf-pages. If more entries
are required than fit into a single leaf-page, index-pages are used to locate
the leaf-page which contains the required entry.
A B+ tree starts with a BTREEHEADER telling you the size of the B+ tree pages,
the root-page, the number of levels, and the number of all entries in this
B+ tree. You must follow (NLevels-1) index-pages before you reach a leaf-page.

unsigned short Magic		0x293B
unsigned short Flags		bit 0x0002 always 1, bit 0x0400 1 if directory
unsigned short PageSize 	0x0400=1k if directory, 0x0800=2k else, or 4k
char Structure[16]		string describing format of data
				'L' = long (indexed)
				'F' = NUL-terminated string (indexed)
				'i' = NUL-terminated string (indexed)
				'2' = short
				'4' = long
				'z' = NUL-terminated string
				'!' = long count value, count/8 * record
					long filenumber
					long TopicOffset
short MustBeZero		0
short PageSplits		number of page splits B+ tree has suffered
short RootPage			page number of B+ tree root page
short MustBeNegOne		0xFFFF
short TotalPages		number of B+ tree pages
short NLevels			number of levels of B+ tree
long TotalBtreeEntries		number of entries in B+ tree
----
char Page[TotalPages][PageSize] the pages the B+ tree is made of
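
Adding up the field sizes above gives a BTREEHEADER of 38 bytes, so page N
should start 38+N*PageSize bytes behind the start of the B+ tree. A minimal
sketch of reading the header and locating a page, assuming little-endian
storage; BtreeHeader, parse_btree_header, btree_page_offset and the rd16/rd32
helpers are names invented for this example:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: read little-endian 16-bit and 32-bit values. */
static unsigned rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}
static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)rd16(p) | ((uint32_t)rd16(p + 2) << 16);
}

enum { BTREEHEADER_SIZE = 38 };  /* 2+2+2+16+2+2+2+2+2+2+4 */

struct BtreeHeader {
    unsigned PageSize;
    char     Structure[17];      /* 16 bytes plus a terminator added here */
    int      RootPage, TotalPages, NLevels;
    int32_t  TotalBtreeEntries;
};

/* Returns 0 on success, -1 if too short or Magic is not 0x293B. */
static int parse_btree_header(const unsigned char *b, size_t len,
                              struct BtreeHeader *out)
{
    if (len < BTREEHEADER_SIZE || rd16(b) != 0x293B) return -1;
    out->PageSize = rd16(b + 4);
    memcpy(out->Structure, b + 6, 16);
    out->Structure[16] = '\0';
    out->RootPage   = (int16_t)rd16(b + 26);
    out->TotalPages = (int16_t)rd16(b + 30);
    out->NLevels    = (int16_t)rd16(b + 32);
    out->TotalBtreeEntries = (int32_t)rd32(b + 34);
    return 0;
}

/* Offset of a page relative to the start of the BTREEHEADER. */
static size_t btree_page_offset(const struct BtreeHeader *h, int page)
{
    return BTREEHEADER_SIZE + (size_t)page * h->PageSize;
}
```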

If NLevels is greater than 1, RootPage is the page number of an index-page.
Index-pages start with a BTREEINDEXHEADER and are followed by an array of
BTREEINDEX structures, in case of the internal directory containing pairs
of FileNames and PageNumbers.
(STRINGZ is a NUL-terminated string, sizeof(STRINGZ) is strlen(string)+1).
PageNumber gets you to the next page containing entries lexically starting
at FileName, but less than the next FileName. PreviousPage gets you to the
next page if the desired FileName is lexically before the first FileName.

unsigned short Unused	 number of free bytes at end of this page
short NEntries		 number of entries in this index-page
short PreviousPage	 page number of previous page
----
struct			 and this is the structure of directory index-pages
{
    STRINGZ FileName	 varying length NUL-terminated string
    short PageNumber	 page number of page dealing with FileName and above
}
DIRECTORYINDEXENTRY[NEntries]
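
To illustrate the rule for PageNumber and PreviousPage, here is a sketch of
choosing the page to descend to for a given FileName in one directory
index-page. It assumes that "lexically" means a plain byte-wise strcmp
comparison and that values are little-endian; index_page_child and rd16 are
hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a little-endian 16-bit value. */
static unsigned rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/* Stores the page to visit next in *child.
   Returns 0 on success, -1 if an entry runs past the end of the page. */
static int index_page_child(const unsigned char *page, size_t page_size,
                            const char *name, int *child)
{
    if (page_size < 6) return -1;
    int n = (int16_t)rd16(page + 2);            /* NEntries */
    size_t pos = 6;                             /* first entry */
    *child = (int16_t)rd16(page + 4);           /* PreviousPage */
    for (int i = 0; i < n; i++) {
        const unsigned char *nul = memchr(page + pos, 0, page_size - pos);
        if (nul == NULL || (size_t)(page + page_size - nul) < 3) return -1;
        /* name sorts before this entry: keep the page chosen so far */
        if (strcmp(name, (const char *)(page + pos)) < 0) break;
        *child = (int16_t)rd16(nul + 1);        /* PageNumber */
        pos = (size_t)(nul - page) + 3;         /* skip NUL and PageNumber */
    }
    return 0;
}
```

Calling this once per level, NLevels-1 times starting at RootPage, should end
at the leaf-page that may contain the name.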

After NLevels-1 index-pages you will reach a leaf-page starting with a
BTREENODEHEADER followed by an array of BTREELEAF structures, in case of the
internal directory containing pairs of FileNames and FileOffsets.
You may follow the PreviousPage entry in all NLevels-1 index-pages to reach
the first leaf-page, then iterate through all entries and use NextPage to
follow the doubly linked list of leaf-pages until NextPage is -1 to retrieve
a sorted list of all TotalBtreeEntries entries contained in the B+ tree.

unsigned short Unused	 number of free bytes at end of this page
short NEntries		 number of entries in this leaf-page
short PreviousPage	 page number of previous leaf-page or -1 if first
short NextPage		 page number of next leaf-page or -1 if last
----
struct			 and this is the structure of directory leaf-pages
{
    STRINGZ FileName	 varying length NUL-terminated string
    long FileOffset	 offset of FILEHEADER of internal file FileName
			 relative to beginning of help file
}
DIRECTORYLEAFENTRY[NEntries]
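
A sketch of searching one directory leaf-page for a FileName, again assuming
little-endian values and an exact byte-wise match. The names leaf_page_find,
rd16 and rd32 are invented for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: read little-endian 16-bit and 32-bit values. */
static unsigned rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}
static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)rd16(p) | ((uint32_t)rd16(p + 2) << 16);
}

/* Returns 1 and stores FileOffset if found, 0 if not found,
   -1 if an entry runs past the end of the page. */
static int leaf_page_find(const unsigned char *page, size_t page_size,
                          const char *name, int32_t *file_offset)
{
    if (page_size < 8) return -1;
    int n = (int16_t)rd16(page + 2);            /* NEntries */
    size_t pos = 8;                             /* behind the 8-byte header */
    for (int i = 0; i < n; i++) {
        const unsigned char *nul = memchr(page + pos, 0, page_size - pos);
        if (nul == NULL || (size_t)(page + page_size - nul) < 5) return -1;
        if (strcmp(name, (const char *)(page + pos)) == 0) {
            *file_offset = (int32_t)rd32(nul + 1);
            return 1;
        }
        pos = (size_t)(nul - page) + 5;         /* skip NUL and FileOffset */
    }
    return 0;
}
```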

At offset FirstFreeBlock the first FREEHEADER is located. It contains

long FreeSpace		 number of bytes unused, including this header
long NextFreeBlock	 offset of next FREEHEADER or -1L if end of list
----
char Unused[FreeSpace-8] unused bytes

All unused portions of the help file are linked together using FREEHEADERs.
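
Walking the free list could look like the following sketch, which sums up the
FreeSpace of all blocks. It assumes little-endian storage; the limit on the
number of visited blocks is my own safeguard against damaged files with a
circular list and is not part of the format. free_list_total and rd32 are
hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a little-endian 32-bit value. */
static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns the total number of free bytes, 0 if there is no free list,
   or -1 if the list is damaged. */
static long free_list_total(const unsigned char *file, size_t len,
                            int32_t first_free_block)
{
    long total = 0;
    size_t visited = 0;
    int32_t off = first_free_block;
    while (off != -1) {
        if (off < 0 || (size_t)off > len || len - (size_t)off < 8) return -1;
        int32_t free_space = (int32_t)rd32(file + off);      /* FreeSpace */
        int32_t next       = (int32_t)rd32(file + off + 4);  /* NextFreeBlock */
        /* each block is at least 8 bytes, so more than len/8 blocks
           means the list loops back on itself */
        if (free_space < 8 || ++visited > len / 8) return -1;
        total += free_space;
        off = next;
    }
    return total;
}
```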

Now that you are able to locate the position of an internal file in the
help file, let's describe what they contain. Remember that each FileOffset
first takes you to the FILEHEADER of the internal file. The structures
described next are located just behind this FILEHEADER.

|SYSTEM

The first one to start with is the |SYSTEM file. This is the SYSTEMHEADER,
the structure of the first bytes of this internal file:

short Magic		 0x036C
short Minor		 help file format version number
			 15 = HC30 Windows 3.0 help file
			 21 = HC31 Windows 3.1 help file
			 27 = WMVC/MMVC media view file
			 33 = MVC or HCW 4.00 Windows 95
short Major		 1
time_t GenDate		 help file created seconds after 1.1.1980, or 0
unsigned short Flags	 see below

Use Minor and Flags to find out how the help file was compressed:
Minor <= 16		 not compressed, TopicBlockSize 2k
Minor > 16		 Flags=0: not compressed,  TopicBlockSize 4k
			 Flags=4: LZ77 compressed, TopicBlockSize 4k
			 Flags=8: LZ77 compressed, TopicBlockSize 2k
Additionally the help file may use phrase compression (oldstyle or Hall).
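
The table above translates into a small decision function. This is a sketch
only: TopicLayout and topic_layout are made-up names, and treating any Flags
value other than 4 or 8 like Flags=0 is my assumption, since the table does
not list other values.

```c
/* Hypothetical result type for this example. */
struct TopicLayout {
    int lz77;               /* nonzero if LZ77 compressed */
    int topic_block_size;   /* 2048 or 4096 bytes */
};

static struct TopicLayout topic_layout(int minor, unsigned flags)
{
    struct TopicLayout t = { 0, 4096 };
    if (minor <= 16) {
        t.topic_block_size = 2048;       /* not compressed, 2k */
    } else if (flags == 4) {
        t.lz77 = 1;                      /* LZ77 compressed, 4k */
    } else if (flags == 8) {
        t.lz77 = 1;                      /* LZ77 compressed, 2k */
        t.topic_block_size = 2048;
    }
    /* assumption: any other Flags value is handled like Flags=0 */
    return t;
}
```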

If Minor is 16 or less, the help file title follows the SYSTEMHEADER:

STRINGZ HelpFileTitle

If Minor is above 16, one or more SYSTEMREC records follow instead, up to the
end of the internal |SYSTEM file:

struct
{
    unsigned short RecordType	       type of data in record
    unsigned short DataSize	       size of data
    ----
    char Data[DataSize] 	       dependent on RecordType
}
SYSTEMREC[]
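
A sketch of stepping through the SYSTEMREC records to find one of a given
RecordType. It is only meaningful if Minor is above 16, assumes little-endian
values, and assumes time_t occupies 4 bytes so that the SYSTEMHEADER is 12
bytes long. find_system_record and rd16 are hypothetical names; sys points at
the first byte behind the FILEHEADER and used is UsedSpace.

```c
#include <stddef.h>

/* Hypothetical helper: read a little-endian 16-bit value. */
static unsigned rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

enum { SYSTEMHEADER_SIZE = 12 };  /* 2+2+2+4+2, assuming a 4-byte time_t */

/* Returns a pointer to Data of the first record of the given type and
   stores DataSize, or returns NULL if there is none or a record is cut off. */
static const unsigned char *find_system_record(const unsigned char *sys,
                                               size_t used,
                                               unsigned record_type,
                                               size_t *data_size)
{
    size_t pos = SYSTEMHEADER_SIZE;
    while (pos <= used && used - pos >= 4) {
        unsigned type = rd16(sys + pos);        /* RecordType */
        size_t size = rd16(sys + pos + 2);      /* DataSize */
        if (size > used - pos - 4) return NULL; /* record exceeds the file */
        if (type == record_type) {
            *data_size = size;
            return sys + pos + 4;
        }
        pos += 4 + size;
    }
    return NULL;
}
```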

There are different RecordTypes defined, each storing different Data.
They mainly contain what was specified in the help project file.

RecordType  Data
1 TITLE     STRINGZ Title	       help file title
2 COPYRIGHT STRINGZ Copyright	       copyright notice shown in AboutBox
3 CONTENTS  TOPICOFFSET Contents       topic offset of starting topic
4 CONFIG    STRINGZ Macro	       all macros executed on opening
5 ICON	    Windows *.ICO file	       See WIN31WH on icon file format
6 WINDOW    struct		       Windows defined in the HPJ-file
	    {
		struct
		{
		    unsigned short TypeIsValid:1
		    unsigned short NameIsValid:1
		    unsigned short CaptionIsValid:1
		    unsigned short XIsValid:1
		    unsigned short YIsValid:1
		    unsigned short WidthIsValid:1
		    unsigned short HeightIsValid:1
		    unsigned short MaximizeWindow:1
		    unsigned short RGBIsValid:1
		    unsigned short RGBNSRIsValid:1
		    unsigned short WindowsAlwaysOnTop:1
		    unsigned short AutoSizeHeight:1
		}
		Flags
		char Type[10]	       type of window
		char Name[9]	       window name
		char Caption[51]       caption of window
		short X 	       x coordinate of window (0..1000)
		short Y 	       y coordinate of window (0..1000)
		short Width	       width of window (0..1000)
		short Height	       height of window (0..1000)
		short Maximize	       maximize flag and window styles
		COLORREF Rgb	       color of scrollable region
		COLORREF RgbNsr        color of non scrollable region
	    }
	    Window
6 WINDOW    typedef struct	       Viewer 2.0 Windows defined in MVP-file
	    {
		unsigned short Flags
		char Type[10]		 /* type of window */
		char Name[9]		 /* window name */
		char Caption[51]	 /* caption for window */
		unsigned char MoreFlags
		short X 		 /* x coordinate of window (0..1000) */
		short Y 		 /* y coordinate of window (0..1000) */
		short Width		 /* width of window (0..1000) */
		short Height		 /* height of window (0..1000) */
		short Maximize		 /* maximize flag and window styles */
		COLORREF Rgb1
		char Unknown
		COLORREF Rgb2
		COLORREF Rgb3
		short X2
		short Y2
		short Width2
		short Height2
		short X3
		short Y3
	    }
	    Window;
8 CITATION  STRINGZ Citation	       the Citation printed
9 LCID	    short LCID[4]	       language ID, Windows 95 (HCW 4.00)
10 CNT	    STRINGZ ContentFileName    CNT file name, Windows 95 (HCW 4.00)
11 CHARSET  unsigned short Charset     charset, Windows 95 (HCW 4.00)
12 DEFFONT  struct		       default dialog font, Windows 95 (HCW 4.00)
	    {
		unsigned char HeightInPoints
		unsigned char Charset
		STRINGZ FontName
	    }
	    DefFont
12 FTINDEX  STRINGZ dtype	       Multimedia Help Files dtypes
13 GROUPS   STRINGZ Group	       defined GROUPs, Multimedia Help File
14 INDEX_S. STRINGZ IndexSeparators    separators, Windows 95 (HCW 4.00)
14 KEYINDEX struct		       Multimedia Help Files
	    {
		char btreename[10];    btreename[1] is footnote character
		char mapname[10];
		char dataname[10];
		char title[80];
	    }
	    KeyIndex
18 LANGUAGE STRINGZ language	       defined language, Multimedia Help Files
19 DLLMAPS  struct		       defined DLLMAPS, Multimedia Help Files
	    {
		STRINGZ Win16RetailDLL
		STRINGZ Win16DebugDLL
		STRINGZ Win32RetailDLL
		STRINGZ Win32DebugDLL
	    }
	    DLLNames

|Phrases

If the help file is phrase compressed, it contains an internal file named
|Phrases. Windows 3.0 help files generated with HC30 use the following
uncompressed structure to store phrases. A phrase is not NUL-terminated,
instead use the next PhraseOffset to locate the end of the phrase string
(there is one more phrase offset stored than phrases are defined to allow
for this).

unsigned short NumPhrases	 number of phrases in table
unsigned short OneHundred	 0x0100
unsigned short PhraseOffset[NumPhrases+1] PhraseOffset[0]==2*(NumPhrases+1)
char Phrase[NumPhrases][PhraseOffset[PhraseNum+1]-PhraseOffset[PhraseNum]]
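
A sketch of looking up one phrase in the uncompressed HC30 layout. Since
PhraseOffset[0] equals the size of the offset table, I take the offsets to be
relative to the start of the PhraseOffset array; that is an inference from
the listing above. Values are assumed little-endian, and get_phrase and rd16
are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical helper: read a little-endian 16-bit value. */
static unsigned rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/* phr points at the |Phrases contents, used is UsedSpace.
   Stores start and length of phrase number index (not NUL-terminated).
   Returns 0 on success, -1 on a bad index or inconsistent table. */
static int get_phrase(const unsigned char *phr, size_t used, unsigned index,
                      const unsigned char **start, size_t *length)
{
    const size_t table = 4;   /* PhraseOffset follows the two header shorts */
    if (used < table) return -1;
    unsigned n = rd16(phr);                              /* NumPhrases */
    if (index >= n || used - table < 2u * (n + 1)) return -1;
    unsigned a = rd16(phr + table + 2 * index);          /* this phrase */
    unsigned b = rd16(phr + table + 2 * (index + 1));    /* next phrase */
    if (b < a || b > used - table) return -1;
    *start = phr + table + a;
    *length = b - a;
    return 0;
}
```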

Windows 3.1 help files generated using HC31 and later always LZ77 compress
the Phrase character array. Read NumPhrases, OneHundred, DecompressedSize,
and NumPhrases+1 PhraseOffset values. Allocate DecompressedSize bytes for
the Phrase character array and decompress the UsedSpace-2*NumPhrases-10
remaining bytes into the allocated space to retrieve the phrase strings.

unsigned short NumPhrases	 number of phrases in table
unsigned short OneHundred	 0x0100
long DecompressedSize
unsigned short PhraseOffset[NumPhrases+1] PhraseOffset[0]==2*(NumPhrases+1)
----				 the remaining part is LZ77 compressed
char Phrase[NumPhrases][PhraseOffset[PhraseNum+1]-PhraseOffset[PhraseNum]]

The LZ77 decompression algorithm can best be described like this:
  Take the next byte
    Start at the least significant bit
    If the bit is cleared
      Copy 1 byte from source to destination
    Else
      Get the next WORD into the struct { unsigned pos:12; unsigned len:4; }
      Copy len+3 bytes from destination-pos-1 to destination
    Loop until all bits are done
  Loop until all bytes are consumed
See end of this file for a detailed algorithm.
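
The outline above might be written in C roughly as follows. This sketch is
not the detailed algorithm referred to; it assumes the WORD is little-endian
and that the bit field places pos in the low 12 bits and len in the high 4
bits, which is the usual layout for Intel compilers but is not guaranteed by
the C language. lz77_decompress is a hypothetical name.

```c
#include <stddef.h>

/* Decompresses src into dst. Returns the number of bytes written,
   or -1 if the input is cut off or dst is too small. */
static long lz77_decompress(const unsigned char *src, size_t src_len,
                            unsigned char *dst, size_t dst_cap)
{
    size_t s = 0, d = 0;
    while (s < src_len) {
        unsigned flags = src[s++];             /* take the next byte */
        /* start at the least significant bit */
        for (int bit = 0; bit < 8 && s < src_len; bit++, flags >>= 1) {
            if (!(flags & 1)) {                /* bit cleared: literal byte */
                if (d >= dst_cap) return -1;
                dst[d++] = src[s++];
            } else {                           /* bit set: back reference */
                if (src_len - s < 2) return -1;
                unsigned w = (unsigned)src[s] | ((unsigned)src[s + 1] << 8);
                s += 2;
                unsigned pos = w & 0x0FFF;     /* assumed: low 12 bits */
                unsigned len = (w >> 12) + 3;  /* assumed: high 4 bits */
                if (pos + 1 > d || len > dst_cap - d) return -1;
                /* byte-wise copy, source and destination may overlap */
                for (unsigned k = 0; k < len; k++, d++)
                    dst[d] = dst[d - pos - 1];
            }
        }
    }
    return (long)d;
}
```

For example, the six bytes 08 'a' 'b' 'c' 02 30 hold three literals followed
by one reference with pos=2 and len=3, which copies 6 bytes starting 3 bytes
back and so expands to "abcabcabc".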

Some MVBs use a slightly different layout of internal |Phrases file:

unsigned short EightHundred	 0x0800
unsigned short NumPhrases	 number of phrases in table
unsigned short OneHundred	 0x0100
long DecompressedSize
char unused[30]
unsigned short PhraseOffset[NumPhrases+1] PhraseOffset[0]==2*(NumPhrases+1)
----				 the remaining part is LZ77 compressed
char Phrase[NumPhrases][PhraseOffset[PhraseNum+1]-PhraseOffset[PhraseNum]]

|PhrIndex

Windows 95 (HCW 4.00) may use Hall compression and the internal files
|PhrIndex and |PhrImage to store phrases. Both must be used to build a
table of phrases and PhraseOffsets. |PhrIndex starts with this header:

long Magic			 1L
long NEntries
long CompressedSize
long PhrImageSize
long PhrImageCompressedSize
long Always0			 0L
unsigned short BitCount:4
unsigned short UnknownBits:12
unsigned short Always4A00	 not really always

The remaining data is bitcompressed. Use this algorithm to build a table
of PhraseOffsets:

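/* assumes a 32-bit long; ptr starts just behind the |PhrIndex header */
short n,i; long mask=0,*ptr=(long *)(&Always4A00+1);
int GetBit(void)   /* returns next bit, least significant bit first */
{
    ptr+=(mask<0);          /* top bit was used last: go to next long */
    mask=mask*2+(mask<=0);  /* next bit, restart at bit 0 if exhausted */
    return (*ptr&mask)!=0;
}
PhraseOffset[0]=0;
for(i=0;i<NEntries;i++)
{
    for(n=1;GetBit();n+=1<<BitCount) ;  /* unary coded upper part */
    if(GetBit()) n+=1;                  /* lower BitCount bits follow */
    if(BitCount>1) if(GetBit()) n+=2;
    if(BitCount>2) if(GetBit()) n+=4;
    if(BitCount>3) if(GetBit()) n+=8;
    if(BitCount>4) if(GetBit()) n+=16;
    PhraseOffset[i+1]=PhraseOffset[i]+n;
}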
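
The same decoding can be sketched without depending on the size of long.
Reading bits least significant first from little-endian longs is equivalent
to reading them least significant first from consecutive bytes, so this
version indexes bytes directly. It only accepts BitCount from 1 to 5, the
range the algorithm above handles; decode_phrase_offsets and next_bit are
hypothetical names, and bits points at the data behind the header.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the next bit (0 or 1), or -1 if the data is exhausted. */
static int next_bit(const unsigned char *bits, size_t nbytes, size_t *bitpos)
{
    if (*bitpos >= nbytes * 8) return -1;
    int b = (bits[*bitpos >> 3] >> (*bitpos & 7)) & 1;
    (*bitpos)++;
    return b;
}

/* Fills offsets[0..n_entries]. Returns 0 on success, -1 on bad input. */
static int decode_phrase_offsets(const unsigned char *bits, size_t nbytes,
                                 unsigned bit_count, size_t n_entries,
                                 uint32_t *offsets)
{
    size_t bitpos = 0;
    if (bit_count < 1 || bit_count > 5) return -1;
    offsets[0] = 0;
    for (size_t i = 0; i < n_entries; i++) {
        uint32_t n = 1;
        int b;
        /* unary part: each 1 bit adds 1<<BitCount, a 0 bit ends it */
        while ((b = next_bit(bits, nbytes, &bitpos)) == 1)
            n += (uint32_t)1 << bit_count;
        if (b < 0) return -1;
        /* binary part: BitCount bits, lowest value first */
        for (unsigned k = 0; k < bit_count; k++) {
            b = next_bit(bits, nbytes, &bitpos);
            if (b < 0) return -1;
            if (b) n += (uint32_t)1 << k;
        }
        offsets[i + 1] = offsets[i] + n;
    }
    return 0;
}
```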

Just behind the bitcompressed phrase length information (aligned to a 32-bit
boundary, which is why GetBit consumes longs) follow NumPhrases bits, one bit
for each phrase. I assume this information is used by the full text search
capability to exclude certain phrases.

|PhrImage

The |PhrImage file stores the phrases. A phrase is not NUL-terminated. Use
PhraseOffset[PhraseNum] and PhraseOffset[PhraseNum+1] to locate the beginning
and end of the phrase string; one more PhraseOffset than there are phrases
was generated to allow for this. |PhrImage is LZ77 compressed if
PhrImageCompressedSize is not equal to PhrImageSize. Otherwise you may take
it as stored.

|FONT

The next internal file described is the |FONT file, which uses this header:

unsigned short NumFacenames	      number of face names
unsigned short NumDescriptors	      number of font descriptors
unsigned short FacenamesOffset	      start of array of face names
				      relative to &NumFacenames
unsigned short DescriptorsOffset      start of array of font descriptors
				      relative to &NumFacenames
---				      only if FacenamesOffset >= 12
unsigned short NumStyles	      number of style descriptors
unsigned short StyleOffset	      start of array of style descriptors
				      relative to &NumFacenames
---				      only if FacenamesOffset >= 16
unsigned short NumCharMapTables       number of character mapping tables
unsigned short CharMapTableOffset     start of array of character mapping
				      table names relative to &NumFacenames

The face name array is located at FacenamesOffset and contains strings that
are Windows font names or, in the case of multimedia files, a Windows font
name concatenated with ',' and the character mapping table number. Each name
occupies a slot of fixed size: shorter strings are NUL-terminated, but a
string may use all bytes of its slot for characters, leaving no terminator.

char FaceName[NumFacenames][(DescriptorsOffset-FacenamesOffset)/NumFacenames]
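
Because a face name may fill its slot completely, it has to be copied with an
explicit length. A minimal sketch, assuming little-endian values;
get_face_name and rd16 are hypothetical names, font points at &NumFacenames
and used is UsedSpace:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: read a little-endian 16-bit value. */
static unsigned rd16(const unsigned char *p)
{
    return (unsigned)p[0] | ((unsigned)p[1] << 8);
}

/* Copies face name number index into out and always NUL-terminates it.
   Returns 0 on success, -1 on a bad index, bad offsets or small buffer. */
static int get_face_name(const unsigned char *font, size_t used,
                         unsigned index, char *out, size_t out_cap)
{
    if (used < 8) return -1;
    unsigned num   = rd16(font);        /* NumFacenames */
    unsigned faces = rd16(font + 4);    /* FacenamesOffset */
    unsigned descs = rd16(font + 6);    /* DescriptorsOffset */
    if (num == 0 || index >= num || descs < faces || descs > used) return -1;
    size_t slot = (descs - faces) / num;   /* bytes per face name */
    if (slot + 1 > out_cap) return -1;
    memcpy(out, font + faces + (size_t)index * slot, slot);
    out[slot] = '\0';                      /* name may lack a terminator */
    return 0;
}
```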

At DescriptorsOffset an array is located that describes all fonts used in the
help file. If this kind of descriptor appears in a help file, any metric
value is given in HalfPoints.

struct oldfont
{
    struct
    {
	unsigned char Bold:1
	unsigned char Italic:1
	unsigned char Underline:1
	unsigned char StrikeOut:1
	unsigned char DoubleUnderline:1
	unsigned char SmallCaps:1
    }
    Attributes
    unsigned char HalfPoints		      PointSize * 2
    unsigned char FontFamily		      font family. See values below
    unsigned short FacenameIndex	      index into FaceName array
    unsigned char FGRGB[3]		      RGB values of foreground
    unsigned char BGRGB[3]		      unused background RGB Values
}
FontDescriptor[NumDescriptors]

#define FAM_MODERN 0x01 		      This is a different order than
