Manually edited CHANGES file - now out of date.
See ChangeLog for latest update information.

Some notes from Peter on checking the code into RCS and some fixes to 4.1 are
appended to the bottom of this file.

4.12.5 --> 4.12.6

- Fixes to configure script, thanks to Michael Heironimus.

- Fixes to index/partition.c, index/io.c and index/build_in.c should resolve
  the problem with missing hits on the first one or two keywords in the
  index. Thanks to Morey Hubin.

- Fix to sgrep.c solves the problem of double-hit count with record
  delimiters. (M. Hubin)

4.11 --> 4.12.5

- Fix for using filters with structured indexes.

- Added FILE_END_MARK constant so it is possible to configure for filenames
  with spaces.

- Test-fix for core dump on large indexes (may not have solved the problem).

4.1 --> 4.11

- Fix for core dump on merge; cleaned up makefiles.

4.0 --> 4.1

- Minor bug fixes and cleanup preparatory to final glimpse release.

3.6 --> 4.0

- Added support to extract titles from HTML pages in glimpseindex with the
  -X option. These files must have names that end in: html, htm, shtml, shtm.
  (It is easy to extend these -- just see glimpse.h/EXTRACT_INFO_SUFFIX.
  The routine to extract titles is index/filetype.c/extract_info(). This can
  be modified in various ways to extract info from many filetypes.)
  The titles are appended to the corresponding filenames after a ' ', before
  storing the filenames in .glimpse_filenames. In this case, glimpseindex
  assumes that filenames don't have spaces in them.

- Added support to glimpseindex to store not just the names of files that
  are indexed, but also some extra information (like a URL) after each file,
  when -F is used to provide the names of the files to be indexed to
  glimpseindex. This will be stored in .glimpse_filenames and
  .glimpse_filehash. The information (URL) must be separated from the actual
  file name by one blank ' '.
  In this case too, glimpse assumes that filenames don't have spaces in them.

- Added a -U option to glimpse to be able to interpret indices created with
  a -X or a -U option in glimpseindex. This is necessary since glimpse must
  know that the first ' ' (see above) signifies the end of the filename in
  .glimpse_filenames. When glimpse outputs matches, it will display the
  filename, the URL, and the title automatically. The user must be able to
  parse this info properly though!

- Added an option -X to glimpse to just output the names of files that do
  contain a match, in case glimpse is not able to open the file for reading.
  Without the -X option, glimpse will simply ignore the file and continue.

- Added "wgconvert", a program to compress and decompress neighborhoods in
  webglimpse. It can also be used to convert a file of filenames (that's
  used as a parameter for the -f option in glimpse) to a smaller binary
  representation, and vice versa. See file "index/convert.c". (9-10/96).
  The compression can change a filenames file to a file containing a bit
  mask representation of the set of files, or to a file containing a sparse
  set representation of these files. We recommend sparse-sets only.

- Added support in glimpse to read not just a set of filenames (with a -f
  option), but also a compressed set of filenames (with the -p option).
  The -p option allows you to utilize compressed `neighborhoods' (sets of
  filenames) to limit your search, without uncompressing them.
  The usage is: "-p filename:X:Y:Z" where
    "filename" is the file with compressed neighborhoods,
    X is an offset into that file (usually 0, must be a multiple of
      sizeof(int)),
    Y is the length glimpse must access from that file (if 0, then whole
      file; must be a multiple of sizeof(int)), and
    Z must be 2 (it indicates that "filename" has the sparse-set
      representation of compressed neighborhoods: the other values are for
      internal use only).
  Note that any colon ":" in filename must be escaped using a backslash \.

- Added limited support for NOT in glimpse. This works with index search
  (-N) or whole file scope for booleans (-W) only. (11/96).
  "Not" can be specified using "~". "Not" is most useful in expressions like
  "bad;~boy" or "woman,~girl"; or in "global not" expressions like
  "~{bad;boy}" or "~{woman,girl}".
  The semantics of ~ are as follows: the ~ works exactly as you would expect
  for index search (-N). For actual output, you will get all records with at
  least one of the specified patterns (bad, boy, woman, girl), that satisfy
  the boolean expression. That is, for example, "bad;~boy" will give you all
  records that contain "bad" but not "boy", in all files that contain "bad"
  but not "boy". However, if you search for "~{bad;boy}", glimpse/agrep will
  NOT output records that don't contain either bad or boy. They will only
  give you records that contain at least one of "bad" or "boy" but not both.
  This is logical since otherwise, a pattern like "~ZZZZIYIUYIUYIUYRR", for
  example, would force glimpse to output all records in all files...
  For index-search and actual file-search to be consistent, a ~ should be
  used only with -W. Glimpse exits with an error otherwise.
  Agrep can now also search for nots, and the semantics are the same as
  above, except that the boolean expressions are evaluated on a per-record
  basis, rather than a per-file basis like glimpse.

- Added support to search for patterns with repeating strings (11/96):
  "{computer;science},{computer;chronicles}"
  This now works in agrep as well as glimpse. However, it's for simple
  patterns only (i.e., no regexp or spelling-errors). Previously, you were
  forced to say "{computer;{science,chronicles}}".
  This also fixes the "bug" where queries like "url=pat1;content=pat1" in
  Harvest did not work (the same pattern pat1 appears twice).

- Fixed some nagging memory leaks and segfaults on Solaris (10/96).

- Fixed multiple matches / missed matches problems with -W (11/96).

3.5 --> 3.6

- Many bug fixes and performance improvements to support webglimpse.

- A -R option to glimpseindex to recompute .glimpse_filenames_index from a
  changed .glimpse_filenames. This allows users to move the index from one
  file system to another (where the absolute pathnames of the same files can
  be different), or convert all absolute pathnames in .glimpse_filenames to
  relative pathnames, and still use the existing index of that data.

3.0 --> 3.5

- added "-f filename" option to glimpse: it allows you to restrict the
  search to only those files whose names appear in "filename".

- fixed the agrep bug where -n was not working with ISO characters.

- Added -t to glimpseindex that sorts .glimpse_filenames by decreasing order
  of modify time (st_mtime in stat structure).

- Added -j option to glimpse to print time of file along with its name.

- Added "-Y days" option to print files that were modified "days" before the
  index was created.

- Added support for arbitrary characters in filenames (e.g. >, <, space,
  &...)

2.1 --> 3.0

- added a data structure (in .glimpse_turbo) that speeds up queries using -w
  and -i considerably for large indexes. It is meant mostly for servers
  using glimpse (e.g., Harvest and glimpseHTTP servers), but it benefits
  everyone. With this "turbo" option, typical queries take less than a
  second even for very large indexes. This was so successful that we made it
  the default rather than an option (it used to be -T in some earlier
  versions). If the .glimpse_turbo file is deleted, glimpse will still work
  properly (but glimpseindex -f and -a require it).

- incremental indexing is now fully supported (even for -b).
  Deletion from the index is supported.
  glimpseindex -d filename(s) completely deletes the files from the index;
  glimpseindex -D filename(s) deletes the files only from the file list.

- the index has been improved in several ways (transparently except for
  speed and space). As a result, indices built with earlier versions of
  glimpseindex will not work with 3.0 -- you must reindex.

- several options were added to glimpseindex and glimpse:
  glimpseindex -E indexes all files without attempting to run the filetype
  filtering (but excluded files or suffixes still apply).
  glimpse -Q extends -N in a nice way giving much more information about the
  matches in the index.
  glimpse -L has more options: -L x | x:y | x:y:z
  If one number is given, it is a limit on the total number of matches.
  Glimpse outputs only the first x matches. If two numbers are given (x:y),
  then y is an added limit on the total number of files. If three numbers
  are given (x:y:z), then z is an added limit on the number of matches per
  file. If any of the x, y, or z is set to 0, it means to ignore it (in
  other words 0 = infinity in this case); for example, -L 0:10 will output
  all matches to the first 10 files that