Beginning with Wget 1.7, if you use B<-c> on a non-empty file, and
it turns out that the server does not support continued downloading,
Wget will refuse to start the download from scratch, which would
effectively ruin existing contents.  If you really want the download to
start from scratch, remove the file.

Also beginning with Wget 1.7, if you use B<-c> on a file which is of
equal size as the one on the server, Wget will refuse to download the
file and print an explanatory message.  The same happens when the file
is smaller on the server than locally (presumably because it was changed
on the server since your last download attempt)---because "continuing"
is not meaningful, no download occurs.

On the other side of the coin, while using B<-c>, any file that's
bigger on the server than locally will be considered an incomplete
download and only C<(length(remote) - length(local))> bytes will be
downloaded and tacked onto the end of the local file.  This behavior can
be desirable in certain cases---for instance, you can use B<wget -c>
to download just the new portion that's been appended to a data
collection or log file.

However, if the file is bigger on the server because it's been
I<changed>, as opposed to just I<appended> to, you'll end up
with a garbled file.  Wget has no way of verifying that the local file
is really a valid prefix of the remote file.  You need to be especially
careful of this when using B<-c> in conjunction with B<-r>,
since every file will be considered as an "incomplete download" candidate.

Another instance where you'll get a garbled file if you try to use
B<-c> is if you have a lame HTTP proxy that inserts a
"transfer interrupted" string into the local file.  In the future a
"rollback" option may be added to deal with this case.

Note that B<-c> only works with FTP servers and with HTTP
servers that support the C<Range> header.

=item B<--progress=>I<type>

Select the type of the progress indicator you wish to use.
Legal indicators are "dot" and "bar".

The "bar" indicator is used by default.  It draws an ASCII progress
bar graphic (a.k.a. "thermometer" display) indicating the status of
retrieval.  If the output is not a TTY, the "dot" indicator will be
used by default.

Use B<--progress=dot> to switch to the "dot" display.  It traces
the retrieval by printing dots on the screen, each dot representing a
fixed amount of downloaded data.

When using the dotted retrieval, you may also set the I<style> by
specifying the type as B<dot:>I<style>.  Different styles assign
different meaning to one dot.  With the C<default> style each dot
represents 1K, there are ten dots in a cluster and 50 dots in a line.
The C<binary> style has a more "computer"-like orientation---8K
dots, 16-dot clusters and 48 dots per line (which makes for 384K
lines).  The C<mega> style is suitable for downloading very large
files---each dot represents 64K retrieved, there are eight dots in a
cluster, and 48 dots on each line (so each line contains 3M).

Note that you can set the default style using the C<progress>
command in F<.wgetrc>.  That setting may be overridden from the
command line.  The exception is that, when the output is not a TTY, the
"dot" progress will be favored over "bar".  To force the bar output,
use B<--progress=bar:force>.

=item B<-N>

=item B<--timestamping>

Turn on time-stamping.

=item B<-S>

=item B<--server-response>

Print the headers sent by HTTP servers and responses sent by
FTP servers.

=item B<--spider>

When invoked with this option, Wget will behave as a Web I<spider>,
which means that it will not download the pages, just check that they
are there.  For example, you can use Wget to check your bookmarks:

        wget --spider --force-html -i bookmarks.html

This feature needs much more work for Wget to get close to the
functionality of real web spiders.

=item B<-T> I<seconds>

=item B<--timeout=>I<seconds>

Set the network timeout to I<seconds> seconds.
This is equivalent to specifying B<--dns-timeout>,
B<--connect-timeout>, and B<--read-timeout>, all at the same time.

When interacting with the network, Wget can check for timeout and
abort the operation if it takes too long.  This prevents anomalies
like hanging reads and infinite connects.  The only timeout enabled by
default is a 900-second read timeout.  Setting a timeout to 0 disables
it altogether.  Unless you know what you are doing, it is best not to
change the default timeout settings.

All timeout-related options accept decimal values, as well as
subsecond values.  For example, B<0.1> seconds is a legal (though
unwise) choice of timeout.  Subsecond timeouts are useful for checking
server response times or for testing network latency.

=item B<--dns-timeout=>I<seconds>

Set the DNS lookup timeout to I<seconds> seconds.  DNS lookups that
don't complete within the specified time will fail.  By default, there
is no timeout on DNS lookups, other than that implemented by system
libraries.

=item B<--connect-timeout=>I<seconds>

Set the connect timeout to I<seconds> seconds.  TCP connections that
take longer to establish will be aborted.  By default, there is no
connect timeout, other than that implemented by system libraries.

=item B<--read-timeout=>I<seconds>

Set the read (and write) timeout to I<seconds> seconds.  The
"time" of this timeout refers to I<idle time>: if, at any point in
the download, no data is received for more than the specified number
of seconds, reading fails and the download is restarted.  This option
does not directly affect the duration of the entire download.

Of course, the remote server may choose to terminate the connection
sooner than this option requires.  The default read timeout is 900
seconds.

=item B<--limit-rate=>I<amount>

Limit the download speed to I<amount> bytes per second.  Amount may
be expressed in bytes, kilobytes with the B<k> suffix, or megabytes
with the B<m> suffix.  For example, B<--limit-rate=20k> will
limit the retrieval rate to 20KB/s.
This is useful when, for whatever reason, you don't want Wget to
consume the entire available bandwidth.

This option allows the use of decimal numbers, usually in conjunction
with power suffixes; for example, B<--limit-rate=2.5k> is a legal
value.

Note that Wget implements the limiting by sleeping the appropriate
amount of time after a network read that took less time than specified
by the rate.  Eventually this strategy causes the TCP transfer to slow
down to approximately the specified rate.  However, it may take some
time for this balance to be achieved, so don't be surprised if limiting
the rate doesn't work well with very small files.

=item B<-w> I<seconds>

=item B<--wait=>I<seconds>

Wait the specified number of seconds between the retrievals.  Use of
this option is recommended, as it lightens the server load by making the
requests less frequent.  Instead of in seconds, the time can be
specified in minutes using the C<m> suffix, in hours using the C<h>
suffix, or in days using the C<d> suffix.

Specifying a large value for this option is useful if the network or the
destination host is down, so that Wget can wait long enough to
reasonably expect the network error to be fixed before the retry.  The
waiting interval specified by this option is influenced by
C<--random-wait>, which see.

=item B<--waitretry=>I<seconds>

If you don't want Wget to wait between I<every> retrieval, but only
between retries of failed downloads, you can use this option.  Wget will
use I<linear backoff>, waiting 1 second after the first failure on a
given file, then waiting 2 seconds after the second failure on that
file, up to the maximum number of I<seconds> you specify.  Therefore,
a value of 10 will actually make Wget wait up to (1 + 2 + ...
+ 10) = 55 seconds per file.

Note that this option is turned on by default in the global
F<wgetrc> file.

=item B<--random-wait>

Some web sites may perform log analysis to identify retrieval programs
such as Wget by looking for statistically significant similarities in
the time between requests.  This option causes the time between requests
to vary between 0.5 and 1.5 * I<wait> seconds, where I<wait> was
specified using the B<--wait> option, in order to mask Wget's
presence from such analysis.

A 2001 article in a publication devoted to development on a popular
consumer platform provided code to perform this analysis on the fly.
Its author suggested blocking at the class C address level to ensure
automated retrieval programs were blocked despite changing DHCP-supplied
addresses.

The B<--random-wait> option was inspired by this ill-advised
recommendation to block many unrelated users from a web site due to the
actions of one.

=item B<--no-proxy>

Don't use proxies, even if the appropriate C<*_proxy> environment
variable is defined.

=item B<-Q> I<quota>

=item B<--quota=>I<quota>

Specify download quota for automatic retrievals.  The value can be
specified in bytes (default), kilobytes (with B<k> suffix), or
megabytes (with B<m> suffix).

Note that quota will never affect downloading a single file.  So if you
specify B<wget -Q10k ftp://wuarchive.wustl.edu/ls-lR.gz>, all of the
F<ls-lR.gz> will be downloaded.  The same goes even when several
URLs are specified on the command-line.  However, quota is
respected when retrieving either recursively, or from an input file.
Thus you may safely type B<wget -Q2m -i sites>---download will be
aborted when the quota is exceeded.

Setting quota to 0 or to B<inf> unlimits the download quota.

=item B<--no-dns-cache>

Turn off caching of DNS lookups.  Normally, Wget remembers the IP
addresses it looked up from DNS so it doesn't have to repeatedly
contact the DNS server for the same (typically small) set of hosts it
retrieves from.
This cache exists in memory only; a new Wget run will contact DNS
again.

However, it has been reported that in some situations it is not
desirable to cache host names, even for the duration of a
short-running application like Wget.  With this option Wget issues a
new DNS lookup (more precisely, a new call to C<gethostbyname> or
C<getaddrinfo>) each time it makes a new connection.  Please note
that this option will I<not> affect caching that might be
performed by the resolving library or by an external caching layer,
such as NSCD.

If you don't understand exactly what this option does, you probably
won't need it.

=item B<--restrict-file-names=>I<mode>

Change which characters found in remote URLs may show up in local file
names generated from those URLs.  Characters that are I<restricted>
by this option are escaped, i.e. replaced with B<%HH>, where
B<HH> is the hexadecimal number that corresponds to the restricted
character.

By default, Wget escapes the characters that are not valid as part of
file names on your operating system, as well as control characters that
are typically unprintable.  This option is useful for changing these
defaults, either because you are downloading to a non-native partition,
or because you want to disable escaping of the control characters.

When mode is set to "unix", Wget escapes the character B</> and
the control characters in the ranges 0--31 and 128--159.  This is the
default on Unix-like OSes.

When mode is set to "windows", Wget escapes the characters B<\>,
B<|>, B</>, B<:>, B<?>, B<">, B<*>, B<E<lt>>,
B<E<gt>>, and the control characters in the ranges 0--31 and 128--159.
In addition to this, Wget in Windows mode uses B<+> instead of
B<:> to separate host and port in local file names, and uses
B<@> instead of B<?> to separate the query portion of the file
name from the rest.  Therefore, a URL that would be saved as
B<www.xemacs.org:4300/search.pl?input=blah> in Unix mode would be
saved as B<www.xemacs.org+4300/search.pl@input=blah> in Windows
mode.
This mode is the default on Windows.

If you append B<,nocontrol> to the mode, as in
B<unix,nocontrol>, escaping of the control characters is also
switched off.  You can use B<--restrict-file-names=nocontrol> to
turn off escaping of control characters without affecting the choice of
the OS to use as file name restriction mode.

=item B<-4>

=item B<--inet4-only>

=item B<-6>

=item B<--inet6-only>

Force connecting to IPv4 or IPv6 addresses.  With B<--inet4-only>
or B<-4>, Wget will only connect to IPv4 hosts, ignoring AAAA
records in DNS, and refusing to connect to IPv6 addresses specified in
URLs.  Conversely, with B<--inet6-only> or B<-6>, Wget will
only connect to IPv6 hosts and ignore A records and IPv4 addresses.

Neither option should be needed normally.  By default, an IPv6-aware
Wget will use the address family specified by the host's DNS record.
If the DNS responds with both IPv4 and IPv6 addresses, Wget will try
them in sequence until it finds one it can connect to.  (Also see the
C<--prefer-family> option described below.)

These options can be used to deliberately force the use of IPv4 or
IPv6 address families on dual family systems, usually to aid debugging
or to deal with broken network configuration.  Only one of
B<--inet6-only> and B<--inet4-only> may be specified at the
same time.  Neither option is available in Wget compiled without IPv6
support.

=item B<--prefer-family=IPv4/IPv6/none>

When given a choice of several addresses, connect to the addresses
with the specified address family first.  IPv4 addresses are preferred
by default.

This avoids spurious errors and connect attempts when accessing hosts
that resolve to both IPv6 and IPv4 addresses from IPv4 networks.  For
example, B<www.kame.net> resolves to
B<2001:200:0:8002:203:47ff:fea5:3085> and to
B<203.178.141.194>.
When the preferred family is C<IPv4>, the IPv4 address is used first;
when the preferred family is C<IPv6>, the IPv6 address is used first;
if the specified value is C<none>, the address order returned by DNS
is used without change.

Unlike B<-4> and B<-6>, this option doesn't inhibit access to
any address family; it only changes the I<order> in which the
addresses are accessed.  Also note that the reordering performed by
this option is I<stable>---it doesn't affect the order of addresses of
the same family.  That is, the relative order of all IPv4 addresses
and of all IPv6 addresses remains intact in all cases.

=item B<--retry-connrefused>

Consider "connection refused" a transient error and try again.
Normally Wget gives up on a URL when it is unable to connect to the
site because failure to connect is taken as a sign that the server is
not running at all and that retries would not help.  This option is
for mirroring unreliable sites whose servers tend to disappear for
short periods of time.

=item B<--user=>I<user>

=item B<--password=>I<password>

Specify the username I<user> and password I<password> for both
FTP and HTTP file retrieval.  These parameters can be overridden
using the B<--ftp-user> and B<--ftp-password> options for
FTP connections and the B<--http-user> and B<--http-password>
options for HTTP connections.

=back

=head2 Directory Options