      FINDSTR.ARC - Find Multiple Strings in Multiple Files


Version 1.01:   August 5, 1986
Author:         Don A. Williams
Language:       Datalight C Developer's Kit, Version 2.03


FINDSTR  is  a  program that can be used to search multiple files 
for multiple text strings.  It was actually developed as  a  test 
program  for  a Boyer-Moore string search subroutine.  The Boyer-
Moore algorithm is many times faster than the more common  string 
search  algorithms and FINDSTR is several times faster than other 
similar  programs.  It  does  not  provide  the  complex  pattern 
matching  and  action statements of BAWK but is from 3 to 5 times 
faster than BAWK at finding simple strings.  


USAGE:

    A>FINDSTR [-c] file_spec [file_spec .....]

FINDSTR defaults to a case insensitive compare on the  assumption 
that  it is better to find too many occurrences than to miss some.  
The '-C' command option  will  change  this  default  to  a  case 
sensitive  compare  if that is what is required.  The '-C' option 
can actually appear anywhere on the  command  line;  however,  no 
file name specification may begin with a '-' as a result.  
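
The option handling described above can be sketched roughly as 
follows.  This is only an illustration and is not taken from the 
FINDSTR source; the name parse_args and its parameters are 
assumptions.  

```c
#include <string.h>

/* Hypothetical helper: looks for "-c" or "-C" anywhere in argv.
 * Sets *case_sensitive accordingly and moves the remaining
 * arguments (the file specifications) to the front of argv.
 * Returns the new argument count, including argv[0]. */
int parse_args(int argc, char *argv[], int *case_sensitive)
{
    int i, n = 1;                 /* argv[0] is the program name */

    *case_sensitive = 0;          /* default: case insensitive   */
    for (i = 1; i < argc; i++) {
        if (argv[i][0] == '-') {
            /* Anything starting with '-' is taken as an option,
             * which is why no file specification may begin with
             * '-'.  Unknown options are silently ignored here. */
            if (strcmp(argv[i], "-c") == 0 ||
                strcmp(argv[i], "-C") == 0)
                *case_sensitive = 1;
        } else {
            argv[n++] = argv[i];  /* keep the file specification */
        }
    }
    return n;
}
```

With the arguments "a.c -C b.c" the sketch returns 3, sets the case 
sensitive flag and leaves "a.c" and "b.c" in argv[1] and argv[2].  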

FINDSTR can accept multiple file specifications  on  the  command 
line  and  each file specification may be a full MS-DOS path name 
which,  compliments of Datalight C,  may  contain  "wild  cards".  
FINDSTR  will read STDIN for the strings to be searched for.  The 
search strings are entered one per line and may  contain  blanks, 
quotes,  and  other  "special"  characters  but  may  not contain 
carriage returns or line feeds (in this  version).  FINDSTR  will 
quit  reading  search  strings  when  it encounters either a null 
line (i.e. a line containing only a carriage return) or an End-of-
File (Control-Z).  Since STDIN may  be  redirected,  FINDSTR  can 
read the search strings from either the console or a file.  
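
Reading the search strings could look something like the sketch 
below.  It is an illustration under stated assumptions, not the 
FINDSTR source: the name read_pattern and the MAX_PATTERN limit 
are invented for the example.  

```c
#include <stdio.h>
#include <string.h>

#define MAX_PATTERN 256   /* assumed limit, not from the source */

/* Hypothetical helper: reads one search string from `in` into
 * `buf`.  Returns the length of the string, or 0 on a null line
 * or End-of-File, either of which ends the list of strings.
 * A line longer than size-1 characters would be read in pieces. */
int read_pattern(FILE *in, char *buf, int size)
{
    size_t len;

    if (fgets(buf, size, in) == NULL)
        return 0;                     /* End-of-File            */
    len = strcspn(buf, "\r\n");       /* drop the CR and/or LF  */
    buf[len] = '\0';
    return (int)len;                  /* 0 means a null line    */
}
```

A caller would loop on read_pattern(stdin, ...) until it returns 0.  
Because the helper only sees a FILE pointer, it behaves the same 
whether STDIN is the console or a redirected file.  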


OUTPUT:

For  each  file  that  it  processes,  FINDSTR will output a line 
containing the path name of the file followed by a line for  each 
occurrence of one of the search strings in the file.  These lines 
will contain the search string found followed by a blank followed 
by  the  4-digit  decimal  line  number  of  the line in the file 
followed by a ':' followed  by  a  blank  followed  by  the  line 
itself.  
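
The layout of a match line can be expressed as a single format 
string.  The helper below is a sketch only; the name format_match 
is an assumption, and snprintf is used for safety even though it 
is newer than the compiler this program was written for.  

```c
#include <stdio.h>

/* Hypothetical helper: builds one match line in the form
 *   <string> <4-digit line number>: <line>
 * Returns the number of characters the full line requires. */
int format_match(char *out, size_t size, const char *pattern,
                 int lineno, const char *line)
{
    /* %4d right-justifies the line number in four columns. */
    return snprintf(out, size, "%s %4d: %s", pattern, lineno, line);
}
```

For the string "the" found on line 12, the result begins with 
"the   12: " followed by the text of the line.  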


EXAMPLES:

Since  the  output  lines  of  FINDSTR will frequently exceed the 
length of the lines in this document,  they will be truncated  in 
these examples; a '>' character in the right margin will indicate 
that the line has been truncated.  

Example 1 - Search this document (to this point) for "the"


C>FINDSTR findstr.doc
FINDSTR Version 1.01: August 5, 1986

Enter search patterns, 1 per line
NULL line terminates
the

--findstr.doc--
the   12: Moore algorithm is many times faster than the more com>
the   13: search  algorithms and FINDSTR is several times faster>
the   14: similar  programs.  It  does  not  provide  the  compl>
the   23: FINDSTR can accept multiple file specifications  on  t>
the   26: FINDSTR  will read STDIN for the strings to be searche>
the   28: quotes,  and  other  "special"  characters  but  may  >
the   30: quit  reading  search  strings  when  it encounters ei>
the   33: read the search strings from either the console or a f>
the   39: containing the path name of the file followed by  line>
the   40: occurrence of one of the search strings in the file.  >
the   41: will contain the search string found followed by a bla>
the   42: by  the  4-digit  decimal  line  number  of  the line >
the   43: followed by a ':' followed  by  a  blank  followed  by>
the   49: Since  the  output  lines  of  FINDSTR will frequently>
the   50: length of the lines in this document,  they will be tr>
the   51: these examples; a '>' character in the right margin wi>
the   52: that the line has been truncated.  
the   54: Example 1 - Search this document (to this point) for ">
                                                                 

Example 2 - Search the C source files for this program for "if" 
            and "for".

C>FINDSTR *.c
FINDSTR Version 1.01: August 5, 1986

Enter search patterns, 1 per line
NULL line terminates
if
for

--BM.C--
for    7:     for (i=0; i<256; i++) d[i] = pl;
for    8:     for (i=0; i<pl-2; i++) d[Pattern[i]] = pl - i - 1;
if   28:     if (j < 0) return(io - pl);
--BMSEAR.C--
for    7:       for (i=0; i<256; i++) d[i] = pl;
for    8:       for (i=0; i<pl-1; i++) {
if   30:        if (j < 0) return(io - pl);
--FINDSTR.C--
if   35:                if (pl == 0) break;
if   37:                if (t == NULL) {
for   38:                   fprintf(stderr, "Insufficient memory>
if   42:                if (t->Pattern == 0) {
for   43:                   fprintf(stderr, "Insufficient memory>
if   47:                if (t->d == 0) {
for   48:                       fprintf(stderr, "Insufficient me>
if   54:                if (PatQueue.Head == NULL) PatQueue.Head>
for   58:       for (fp=1; fp<argc; fp++) {
if   59:                if ((F1 = fopen(argv[fp], "r")) == 0) {
for   69:                       for (t=PatQueue.Head; t != NULL;>
if   70:                                if ((p = BMSearch(Line, >

This example could also have been run by  creating  a  file,  say 
INPUT,  containing  the  search strings,  one string to a line as 
follows: 

    if
    for

The command line would then be:

C>FINDSTR *.c <input

The output would be the same as shown  above.  The  output  could 
also have been redirected to a file by the command line: 

C>FINDSTR *.C <input >output

In this case, all of the output from the first file name on would 
have been put in the file,  OUTPUT.  Input and output redirection 
are independent, i.e. either may be used without the other.  


TECHNICAL CONSIDERATIONS:

The "heart" of this program  is  the  Boyer-Moore  string  search 
algorithm.  This  algorithm  is the fastest known on the average.  
The description of how  it  works  is  somewhat  complex  and  is 
presented  in  "Data Structures and Algorithms" by Niklaus Wirth, 
pages 66-69 and by the inventors R. S. Boyer and J.  S.  Moore in 
"A Fast String Searching Algorithm", Communications of  the  ACM, 
20,  10,  (Oct.  1977),  pp 762-772.  The algorithm does  require 
"compilation" of  each  search  string  and  is  best  suited  to 
conditions where the data to be searched is  considerably  larger 
than the search string.  The routine,  BMCompile,  in the module, 
BMSEAR,  performs the compilation and the routine,  BMSearch,  in 
the same module performs  the  actual  search.  For  each  search 
string,  the  algorithm  requires  an  integer  array to hold the 
"compiled" string as well as space to retain the  string  itself.  
I  have  chosen  to  make  these arrays 256 entries long to allow 
strings to contain characters above the standard ASCII 128.  
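
To give a feel for the "compile" and "search" steps, here is a 
minimal sketch of the simplified, bad-character-only form of the 
algorithm (often called Boyer-Moore-Horspool).  It is not the 
BMSEAR source: the names bm_compile and bm_search and their 
parameters are assumptions, and the compare is case sensitive.  

```c
#include <string.h>

#define ALPHABET 256   /* one table entry per possible byte value */

/* "Compile" a pattern of length pl: d[c] is how far the window
 * may shift when c is the text byte under the pattern's end. */
void bm_compile(const char *pat, int pl, int d[ALPHABET])
{
    int i;

    for (i = 0; i < ALPHABET; i++)
        d[i] = pl;                    /* byte not in the pattern */
    for (i = 0; i < pl - 1; i++)
        d[(unsigned char)pat[i]] = pl - i - 1;
}

/* Return the offset of the first match of pat in text, or -1. */
int bm_search(const char *text, int tl, const char *pat, int pl,
              const int d[ALPHABET])
{
    int pos = 0, j;

    if (pl == 0)
        return -1;                    /* nothing to search for   */
    while (pos + pl <= tl) {
        /* compare right to left within the current window */
        for (j = pl - 1; j >= 0 && text[pos + j] == pat[j]; j--)
            ;
        if (j < 0)
            return pos;               /* whole pattern matched   */
        pos += d[(unsigned char)text[pos + pl - 1]];
    }
    return -1;
}
```

The speed comes from the shift: when the byte under the end of the 
pattern does not occur in the pattern at all, the window moves 
forward by the full pattern length without examining the bytes in 
between.  The cast to unsigned char is what lets the 256-entry 
table be indexed by characters above 127.  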

FINDSTR forms the input search strings into  a  simple  First-In-
First-Out  (FIFO)  linked  list,  acquiring  memory  for both the 
string itself and for its "compiled" array dynamically.  
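
A FIFO list of this kind might be built as in the sketch below.  
The structure and function names are assumptions chosen for the 
example, not the declarations used in FINDSTR.  

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node: one search string and its 256-entry table. */
struct pat_node {
    char *pattern;
    int  *d;                    /* "compiled" array, filled later */
    struct pat_node *next;
};

struct pat_queue {
    struct pat_node *head, *tail;
};

/* Append a copy of s to the tail of the queue.
 * Returns 0 on success, -1 if memory is insufficient. */
int queue_append(struct pat_queue *q, const char *s)
{
    struct pat_node *t = malloc(sizeof *t);

    if (t == NULL)
        return -1;
    t->pattern = malloc(strlen(s) + 1);
    t->d = malloc(256 * sizeof *t->d);
    if (t->pattern == NULL || t->d == NULL) {
        free(t->pattern);       /* free(NULL) is harmless */
        free(t->d);
        free(t);
        return -1;
    }
    strcpy(t->pattern, s);
    t->next = NULL;
    if (q->head == NULL)
        q->head = t;            /* first entry in the list */
    else
        q->tail->next = t;
    q->tail = t;
    return 0;
}
```

Appending at the tail keeps the strings in the order they were 
entered, so each line of a file is tested against them in that 
same order.  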

The expansion of "wild card" file specifications on  the  command 
line  is  performed  by  a  proprietary  module supplied with the 
Datalight C compiler;  however,  other good C  compilers  provide 
similar  facilities.   Outside  of  the  "wild  card"  expansion, 
FINDSTR and BMSEAR are very "standard" C and should be compilable 
by any other C compiler.  

A similar Boyer-Moore search algorithm is also available in Turbo 
Pascal.  


COPYRIGHT CONSIDERATIONS:

As far as I can determine,  everything  in  FINDSTR  and  BMSEAR, 
except  the expansion of "wild card" command line parameters,  is 
in the public domain and no copyright restrictions  apply.  Since 
the   total  development  time  for  this  program  from  initial 
conception to the production of this document was under 9  hours, 
the  usual  request  for  "donations" is absurd.  The author,  of 
course,   disclaims  liability  for  any  damages  of  any   kind 
whatsoever  arising  from  the  use of this program or any of its 
component parts.  

