📄 excel.txt

📁 VB 不需要安装 EXECEL 直接操作 XLS 文档的类
💻 TXT
📖 第 1 页 / 共 5 页
字号:
上一页 1 2 3 45
  We can narrow down the area that we have to search by using BIFF's  INDEX and ROW records.  If a non-protected BIFF file has an INDEX record, it will be the  second record in the file (immediately after the BOF record).  If the second record is not an INDEX record, then we must resort  to the exhaustive record search described above.  Or, alternatively,  we could simply fail the search and return some sort of error code.  Having located the INDEX record, we fetch the rwMic and rwMac fields,  which tell us the range of defined rows on the document.  If the row  we are searching for is outside of that range, then we know right  away that the desired cell doesn't exist, so we can return zero.  The next step is to locate the ROW record for the row of the  desired cell.  To do this, we need to understand how Excel saves  ROW records and cell records.  When Excel saves a document in BIFF format, it divides the document  into blocks of 32 rows, starting at the first defined row on the  document.  Since rwMic is by definition the first defined row, the  first block consists of rows rwMic through rwMic+31; the second, from  rwMic+32 through rwMic+63; and in general, the i-th block, assuming  that i is zero-based, consists of rows (rwMic+i*32) through  (rwMic+i*32+31).  Excel writes a block of ROW records to the file, then follows this  with all the cell records for cells in those rows.  This process is  repeated until all ROW and cell records have been written.  The INDEX record contains an array of file pointers to the blocks  of ROW records.  Working backwards from our rule above for ROW  blocks, we see that to locate the block for row 'rw', we fetch  array element (rw-rwMic)/32.  Here, the '/' operator is integer  division that truncates.  Having found the proper array element, we position the BIFF file at  that location. The file pointer that we fetched from the array is an  absolute byte offset from the beginning of the file, which is byte 0.  For example, if the file pointer were 17,540, then we would position  the file at byte 17,540.  The file is now positioned at the correct block of ROW records. The  next step is to search for the correct ROW record.  Since Excel  documents have sparse cell tables, blocks of ROW records contain only  the defined rows within the block range.  This means that if the row  we are searching for doesn't exist, then it won't have a ROW record  in the BIFF file.  We must read at most 32 records at this point.  If we do not find a  ROW record for the desired row, then we know that the row doesn't  exist, so we can return zero.  We know that the row doesn't exist  as soon as we find a non-ROW record, or a ROW record for a row beyond  the one we are searching for.  Having found the correct ROW record, we fetch the colMic and colMac  fields, which tell us the range of defined columns in the row. If the  column we are searching for lies outside of the defined range, then  we know that the desired cell doesn't exist, and we can return zero.  From the ROW record, we can now determine the position within the  file of the cell records for the desired row.  The next step is to  position the file at that point and search for the cell record for  the desired cell.  The dbRtcell field contains the offset to cells for the ROW record.  This field is limited to 16 bits to save space in BIFF files; thus  the largest offset that will fit is 65,535. In a large Excel  document, however, it is possible for cells to be located farther  than 65,535 bytes from their ROW record. Therefore we encode the  offset to get more value from it.        The first ROW record in a block contains an offset to cells        relative to the second ROW record.  This is because after reading        the first ROW record, you are positioned at the second ROW record,        so finding the cells is just a matter of skipping some number of        bytes relative to the current file position.  The second and all subsequent ROW records in a block contain offsets  to cells relative to the previous ROW record's cells.  This iterative  approach works like this: after reading the first ROW record, you get  its offset and add it to the current file position to get the  absolute file position of the first ROW's cell records.  When you  read the second ROW record, you add the offset contained therein to  your computed position of the first ROW's cells, and you get the  position of the second ROW's cells.  Continuing in this manner, you  find that by the time you find the proper ROW record, you have  already computed the absolute file position of its cells, so you  position there and continue your search.  Having computed the file position of our row's cell records, we set  the file there and start sequentially searching for the desired  cell.  If we find the cell, we fetch its value and return it.  Our  search fails as soon as we encounter a cell record for a cell beyond  ours, or we encounter a record which is not a cell record.  If a ROW has no defined cells, we will set its dbRtcell offset to  zero. If a ROW's cells are more than 64K from the previous ROW's  cells (which is rare but possible), we will write out a zero offset  for that ROW and ALL subsequent ROW records in the same block.  All  this means is that we have to search a little harder for the correct  cell record: instead of being able to start our search at cells in  the desired row, we will have to start searching at cells in some  previous row.Excel Formulas--------------  This section describes how Excel stores formulas within BIFF  files.  Formulas appear in FORMULA, ARRAY, and NAME records.  In this section, the term "formula" is a synonym for "parsed  expression"; it is the internal tokenized representation of  an Excel formula.  Formulas are stored in a reverse Polish scheme. A formula consists of  a sequence of parse tokens, each of which is either an operand,  operator, or a control token.  Operand tokens provide values;  operator tokens perform arithmetic operations upon the operands; and  control tokens assist in formula evaluation by describing properties  of the formula.  A token consists of two parts: a token type and a token value. Token  types are called "ptg's" in Excel; they are one byte long, ranging in  value from 1 to 0x7F.  Ptg's above 0x7F are reserved for internal  Excel use.  The ptg specifies only what kind of information is contained  in a token.  The information itself is stored in the token value,  immediately following the ptg in the parsed expression. Some  tokens consist only of a ptg, without an accompanying token  value; for example, to specify an addition operation, only the token  type, ptgAdd, is required.  But to specify an integer operand, both  the ptg, ptgInt, and the token value, an integer, must be  specified.  As an illustration, consider the parsed expression for =5+6. This  parsed expression consists of three tokens: two integer operands and  an operator.   ptgInt    0x0005    ptgInt    0x0006   ptgAdd   <   token 1  >      <  token 2  >  <token 3>  Notice that each ptgInt is immediately followed by the  integer token value.  In many cases, the token value consists of a structure of two or more  fields.  In describing structures for these cases, offset zero is  assumed to be the first byte of the token value, i.e. the first byte  immediately following the token type.  Unless otherwise noted, all tokens can occur in FORMULA, ARRAY, and  NAME records.  Some tokens do not appear in one or more of these  record types; they are explained as encountered.Expression Evaluation---------------------  The evaluation of Excel formulas is a straightforward process. One  LIFO stack, the operand stack, is maintained during evaluation. When  an operand is encountered, it is pushed onto the stack. When an  operator is encountered, it operates on the topmost operand or  operands.  Operator precedence is irrelevant at evaluation time;  operators are handled as soon as they are encountered.  There are three kinds of operators: unary, binary, and function.  Unary operators, like the minus sign which negates a number, operate  only on the topmost operand.  Binary operators, like the addition  operator, operate on the top two operands.  Function operators, which  implement Excel functions, operate on a variable number of operands,  depending on how many arguments the function accepts.  All operators work by popping the required operands from the stack,  performing calculations, and pushing the result back onto the operand  stack.Unary Operators---------------  Here are the unary operator tokens.  All of these operators pop the  top argument from the operand stack, perform a calculation, and push  the result back onto the operand stack.ptgUplus - unary plus (ptg = 0x12)  This operator has no effect.ptgUminus - unary minus (ptg = 0x13)  Negates the top operand.ptgPercent - percent sign (ptg = 0x14)  Divides the top operand by 100Binary Operators----------------  Here are the binary operator ptg's.  All of these operators pop the  top two arguments from the operand stack, perform a calculation, and  push the result back onto the operand stack.ptgAdd - addition (ptg = 0x03)  Adds the top two operands together.ptgSub - subtraction (ptg = 0x04)  Subtracts the top operand from the second-to-top.ptgMul - multiplication (ptg = 0x05)  Multiplies the top two operands.ptgDiv - division (ptg = 0x06)  Divides the top operand by the second-to-top.ptgPower - exponentiation (ptg = 0x07)  Raises the second-to-top operand to the power of the top operand.ptgConcat - concatenation (ptg = 0x08)  Appends the top operand to the second-to-top operand.ptgLT - less than (ptg = 0x09)  Evaluates to TRUE if the second-to-top operand is less than the top  operand; FALSE otherwise.ptgLE - less than or equal (ptg = 0x0A)  Evaluates to TRUE if the second-to-top operand is less than or equal  to the top operand; FALSE otherwise.ptgEQ - equal (ptg = 0x0B)  Evaluates to TRUE if the top two operands are equal; FALSE otherwise.ptgGE - greater than or equal (ptg = 0x0C)  Evaluates to TRUE if the second-to-top operand is greater than or  equal to the top operand; FALSE otherwise.ptgGT - greater than (ptg = 0x0D)  Evaluates to TRUE if the second-to-top operand is greater than the  top operand; FALSE otherwise.ptgNE - not equal (ptg = 0x0E)  Evaluates to TRUE if the top two operands are not equal; FALSE  otherwise.ptgIsect - intersection (ptg = 0x0F)  This is the Excel space operator.  It computes the intersection of  the top two operands.ptgUnion - union (ptg = 0x10)  This is the Excel comma operator.  It computes the union of the top  two operands.ptgRange - range (ptg = 0x11)  This is the Excel colon operator.  It computes the minimal bounding  rectangle of the top two operands.Operand Tokens - Constant-------------------------  The following operand tokens push a single constant operand onto the  operand stack.ptgMissArg - missing argument (operand, ptg = 0x16)  Missing argument to an Excel function. For example, the second  argument to DCOUNT(Database,,Criteria) would be stored as a  ptgMissArg.ptgStr - string constant (operand, ptg = 0x17)  String constant.  Followed by the string.  Offset  Name    Size Contents  ------  ----    ---- --------  0   cch     1   length of the string  1   rgch    var  the string  ptgStr requires special handling when parsed expressions are  scanned.  See the section "Scanning a Parsed Expression" for an  explanation.ptgErr - error value (operand, ptg = 0x1C)  Error constant.  Followed by the error value.  See the BOOLERR record  for a list of error values.  Offset  Name    Size Contents  ------  ----    ---- --------  0   err     1   Excel error valueptgBool - boolean (operand, ptg = 0x1D)  Boolean constant.  Followed by a byte value.  Offset  Name    Size Contents  ------  ----    ---- --------  0   f    1  =1 for TRUE          =0 for FALSEptgInt - integer (operand, ptg = 0x1E)  Integer constant.  Followed by a word value.  Offset  Name    Size Contents  ------  ----    ---- --------  0   w    2  unsigned integer valueptgNum - number (operand, ptg = 0x1F)  Numeric constant.  Followed by an 8-byte IEEE floating point number.  Offset  Name    Size Contents  ------  ----    ---- --------  0   num     8   IEEE floating point numberOperand Tokens - Classes------------------------  As described above, operand tokens push operand values onto the  operand stack.  These values are divided into three different  classes, depending on what type of value the formula expects from the  operand.  The type of value is determined at parse time by the  context of the operand.  REFERENCE CLASS.  Some operands are required by context to be  references.  In this case, the term "reference" is a general term  meaning the specification of one or more areas on an Excel document,  without regard for the underlying cell values in those areas. When  the Excel expression evaluator encounters a reference type operand,  it pushes only the reference itself onto the operand stack; it does  not dereference it to find the underlying cell values.  For example, consider the formula CELL("width",B5), which returns the  column width of cell B5. Clearly, only the reference to cell B5 is  important here; the value stored at cell B5 is irrelevant to the cell'  s width.  VALUE CLASS.  This is the most common type of operand; it pushes a  single dereferenced value onto the operand stack.  For example, consider the formula A1+1.  Here, we are interested in  the value stored in cell A1, so we dereference the A1 reference.  ARRAY CLASS.  This operand pushes an array of values onto the operand  stack.  The values may be specified either in an array constant or in  a reference to cells.  For example, consider the formula SUM({1,2,3;4,5,6}).  Here, to  evaluate the SUM function, the expression evaluator must push an  entire array of values onto the operand stack.  The three classes of operand tokens are numerically divided as  follows:   Operand Class  Ptg's   -------------  -----   Reference  0x20 - 0x3F   Value   0x40 - 0x5F   Array   0x60 - 0x7F  Notice that the numerical difference between ptg classes is 0x20.  This is the basis for forming the class variants of ptg's. Class  variants of ptg's are formed from the reference class ptg, also known  as the "base" ptg.  To form the value class ptg from the base ptg,  you add 0x20 to the ptg and append "V" (for "value") to the ptg  name.  To form the array class ptg from the base ptg, you add 0x40 to  the ptg and append "A" (for "array") to the ptg name. These rules are  summarized below for a hypothetical ptg called ptgFoo:  Class    Na
上一页 1 2 3 45
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -