.\" Copyright 1997-2007 Glyph & Cog, LLC
.TH pdftotext 1 "27 February 2007"
.SH NAME
pdftotext \- Portable Document Format (PDF) to text converter
(version 3.02)
.SH SYNOPSIS
.B pdftotext
[options]
.RI [ PDF-file
.RI [ text-file ]]
.SH DESCRIPTION
.B Pdftotext
converts Portable Document Format (PDF) files to plain text.
.PP
Pdftotext reads the PDF file,
.IR PDF-file ,
and writes a text file,
.IR text-file .
If
.I text-file
is not specified, pdftotext converts
.I file.pdf
to
.IR file.txt .
If
.I text-file
is \'-', the text is sent to stdout.
.SH CONFIGURATION FILE
Pdftotext reads a configuration file at startup. It first tries to
find the user's private config file, ~/.xpdfrc. If that doesn't
exist, it looks for a system-wide config file, typically
/usr/local/etc/xpdfrc (but this location can be changed when pdftotext
is built). See the
.BR xpdfrc (5)
man page for details.
.SH OPTIONS
Many of the following options can be set with configuration file
commands. These are listed in square brackets with the description of
the corresponding command line option.
.TP
.BI \-f " number"
Specifies the first page to convert.
.TP
.BI \-l " number"
Specifies the last page to convert.
.TP
.B \-layout
Maintain (as best as possible) the original physical layout of the
text. The default is to \'undo' physical layout (columns,
hyphenation, etc.) and output the text in reading order.
.TP
.B \-raw
Keep the text in content stream order. This is a hack which often
"undoes" column formatting, etc. Use of raw mode is no longer
recommended.
.TP
.B \-htmlmeta
Generate a simple HTML file, including the meta information. This
simply wraps the text in <pre> and </pre> and prepends the meta
headers.
.TP
.BI \-enc " encoding-name"
Sets the encoding to use for text output. The
.I encoding\-name
must be defined with the unicodeMap command (see
.BR xpdfrc (5)).
The encoding name is case-sensitive.
This defaults to "Latin1" (which
is a built-in encoding).
.RB "[config file: " textEncoding ]
.TP
.BI \-eol " unix | dos | mac"
Sets the end-of-line convention to use for text output.
.RB "[config file: " textEOL ]
.TP
.B \-nopgbrk
Don't insert page breaks (form feed characters) between pages.
.RB "[config file: " textPageBreaks ]
.TP
.BI \-opw " password"
Specify the owner password for the PDF file. Providing this will
bypass all security restrictions.
.TP
.BI \-upw " password"
Specify the user password for the PDF file.
.TP
.B \-q
Don't print any messages or errors.
.RB "[config file: " errQuiet ]
.TP
.BI \-cfg " config-file"
Read
.I config-file
in place of ~/.xpdfrc or the system-wide config file.
.TP
.B \-v
Print copyright and version information.
.TP
.B \-h
Print usage information.
.RB ( \-help
and
.B \-\-help
are equivalent.)
.SH BUGS
Some PDF files contain fonts whose encodings have been mangled beyond
recognition. There is no way (short of OCR) to extract text from
these files.
.SH EXIT CODES
The Xpdf tools use the following exit codes:
.TP
0
No error.
.TP
1
Error opening a PDF file.
.TP
2
Error opening an output file.
.TP
3
Error related to PDF permissions.
.TP
99
Other error.
.SH AUTHOR
The pdftotext software and documentation are copyright 1996-2007 Glyph
& Cog, LLC.
.SH "SEE ALSO"
.BR xpdf (1),
.BR pdftops (1),
.BR pdfinfo (1),
.BR pdffonts (1),
.BR pdftoppm (1),
.BR pdfimages (1),
.BR xpdfrc (5)
.br
.B http://www.foolabs.com/xpdf/