📄 draft-ietf-dnsop-bad-dns-res-05.txt
This section should not be understood to claim that all queries to a zone's parent are bad. In some cases, such queries are not only reasonable but required. Consider the situation when required information, such as the address of a name server (i.e., the address record corresponding to the RDATA of an NS record), has timed out of an iterative resolver's cache before the corresponding NS record. If the name of the name server is below the apex of the zone, then the name server's address record is only available as glue in the parent zone. For example, consider this NS record:

      example.com.        IN   NS  ns.example.com.

If a cache has this NS record but not the address record for "ns.example.com", it is unable to contact the "example.com" zone directly and must query the "com" zone to obtain the address record. Note, however, that such a query would not have QTYPE=NS according to the standard resolution algorithm.

2.1.1.  Recommendation

An iterative resolver MUST NOT send a query for the NS RRset of a non-responsive zone to any of the name servers for that zone's parent zone. For the purposes of this injunction, a non-responsive zone is defined as a zone for which every name server listed in the zone's NS RRset:

1.  is not authoritative for the zone (i.e., lame), or,

2.  returns a server failure response (RCODE=2), or,

3.  is dead or unreachable according to section 7.2 of RFC 2308 [4].

Larson & Barber          Expires August 14, 2006          [Page 6]

Internet-Draft   Observed DNS Resolution Misbehavior   February 2006

2.2.  Repeated queries to lame servers

Section 2.1 describes a catastrophic failure: when every name server for a zone is unable to provide an answer for one reason or another. A more common occurrence is when a subset of a zone's name servers are unavailable or misconfigured. Different failure modes have different expected durations.
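
To make the distinction between a whole-zone failure and a partial one concrete, the Section 2.1.1 definition of a non-responsive zone can be sketched in Python. This is a minimal illustration, not part of any resolver: the names `ServerStatus`, `zone_is_non_responsive`, and `may_query_parent_for_ns` are hypothetical, and the sketch assumes the resolver has already classified each server in the zone's NS RRset. It also assumes that a zone with no recorded statuses is not treated as non-responsive.

```python
from enum import Enum


class ServerStatus(Enum):
    """Hypothetical per-server classification kept by a resolver."""
    OK = "ok"
    LAME = "lame"                 # not authoritative for the zone
    SERVFAIL = "servfail"         # returned RCODE=2
    UNREACHABLE = "unreachable"   # dead or unreachable


# The three failure conditions listed in the recommendation.
FAILED = {ServerStatus.LAME, ServerStatus.SERVFAIL, ServerStatus.UNREACHABLE}


def zone_is_non_responsive(statuses):
    """Return True only if EVERY server in the zone's NS RRset has failed.

    `statuses` maps a name server name to its ServerStatus. An empty
    mapping (nothing known yet) is assumed to mean "not non-responsive".
    """
    return bool(statuses) and all(s in FAILED for s in statuses.values())


def may_query_parent_for_ns(statuses):
    """A QTYPE=NS query to the parent is suppressed for non-responsive zones."""
    return not zone_is_non_responsive(statuses)
```

For example, a zone with one lame server and one healthy server is not non-responsive under this definition; it falls under the partial-failure case discussed in this section.
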
Some symptoms indicate problems that are potentially transient; for example, various types of ICMP unreachable messages because a name server process is not running or a host or network is unreachable, or a complete lack of a response to a query. Such responses could be the result of a host rebooting or temporary outages; these events don't necessarily require any human intervention and can be reasonably expected to be temporary. Other symptoms clearly indicate a condition requiring human intervention, such as a lame server: if a name server is misconfigured and not authoritative for a zone delegated to it, it is reasonable to assume that this condition has the potential to last longer than unreachability or unresponsiveness. Consequently, repeated queries to known lame servers are not useful. For a condition with the potential to persist for a long time, a better practice would be to maintain a list of known lame servers and avoid querying them repeatedly in a short interval.

It should also be noted, however, that some authoritative name server implementations appear to be lame only for queries of certain types as described in RFC 4074 [5]. In this case, it makes sense to retry the "lame" servers for other types of queries, particularly when all known authoritative name servers appear to be "lame".

2.2.1.  Recommendation

Iterative resolvers SHOULD cache name servers that they discover are not authoritative for zones delegated to them (i.e., lame servers). If this caching is performed, lame servers MUST be cached against the specific query tuple <zone name, class, server IP address>.
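
A cache keyed on that tuple might look like the following Python sketch. It is an illustration under stated assumptions rather than a reference implementation: the class name `LameServerCache`, its method names, and the injectable `clock` parameter are all hypothetical, and the default avoidance interval of thirty minutes is simply the minimum this document recommends. Zone names are normalized to lowercase with a trailing dot; IP addresses are compared as plain strings.

```python
import time


class LameServerCache:
    """Hypothetical lame-server cache keyed on (zone, class, server IP)."""

    def __init__(self, interval_seconds=30 * 60, clock=time.monotonic):
        self._interval = interval_seconds  # how long to avoid a lame server
        self._clock = clock                # injectable for testing
        self._entries = {}                 # key -> time the server was marked lame

    @staticmethod
    def _key(zone, rrclass, server_ip):
        # Normalize the zone name so "Example.COM" and "example.com." match.
        return (zone.lower().rstrip(".") + ".", rrclass.upper(), server_ip)

    def mark_lame(self, zone, rrclass, server_ip):
        self._entries[self._key(zone, rrclass, server_ip)] = self._clock()

    def is_lame(self, zone, rrclass, server_ip):
        key = self._key(zone, rrclass, server_ip)
        marked = self._entries.get(key)
        if marked is None:
            return False
        if self._clock() - marked >= self._interval:
            # The avoidance interval has elapsed; forget the entry.
            del self._entries[key]
            return False
        return True
```

Because the key contains the exact zone name, an entry for "example.com" says nothing about the same server's status for "sub.example.com" or for a different server address.
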
The zone name can be derived from the owner name of the NS record that was referenced to query the name server that was discovered to be lame.

Implementations that perform lame server caching MUST refrain from sending queries to known lame servers for a time interval after the server is discovered to be lame. A minimum interval of thirty minutes is RECOMMENDED.

An exception to this recommendation occurs if all name servers for a zone are marked lame. In that case, the iterative resolver SHOULD temporarily ignore the servers' lameness status and query one or more servers. This behavior is a workaround for the type-specific lameness issue described in the previous section.

Implementers should take care not to make lame server avoidance logic overly broad: note that a name server could be lame for a parent zone but not a child zone, e.g., lame for "example.com" but properly authoritative for "sub.example.com". Therefore a name server should not be automatically considered lame for subzones. In the case above, even if a name server is known to be lame for "example.com", it should be queried for QNAMEs at or below "sub.example.com" if an NS record indicates it should be authoritative for that zone.

2.3.  Inability to follow multiple levels of indirection

Some iterative resolver implementations are unable to follow sufficient levels of indirection. For example, consider the following delegations:

      foo.example.        IN   NS  ns1.example.com.
      foo.example.        IN   NS  ns2.example.com.

      example.com.        IN   NS  ns1.test.example.net.
      example.com.        IN   NS  ns2.test.example.net.

      test.example.net.   IN   NS  ns1.test.example.net.
      test.example.net.   IN   NS  ns2.test.example.net.
An iterative resolver resolving the name "www.foo.example" must follow two levels of indirection, first obtaining address records for "ns1.test.example.net" or "ns2.test.example.net" in order to obtain address records for "ns1.example.com" or "ns2.example.com" in order to query those name servers for the address records of "www.foo.example". While this situation may appear contrived, we have seen multiple similar occurrences and expect more as new generic top-level domains (gTLDs) become active. We anticipate many zones in new gTLDs will use name servers in existing gTLDs, increasing the number of delegations using out-of-zone name servers.

2.3.1.  Recommendation

Clearly, constructing a delegation that relies on multiple levels of indirection is not a good administrative practice. However, the practice is widespread enough to require that iterative resolvers be able to cope with it. Iterative resolvers SHOULD be able to handle arbitrary levels of indirection resulting from out-of-zone name servers. Iterative resolvers SHOULD implement a level-of-effort counter to avoid loops or otherwise performing too much work in resolving pathological cases.

A best practice that avoids this entire issue of indirection is to name one or more of a zone's name servers in the zone itself. For example, if the zone is named "example.com", consider naming some of the name servers "ns{1,2,...}.example.com" (or similar).

2.4.  Aggressive retransmission when fetching glue

When an authoritative name server responds with a referral, it includes NS records in the authority section of the response. According to the algorithm in section 4.3.2 of RFC 1034 [2], the name server should also "put whatever addresses are available into the additional section, using glue RRs if the addresses are not available from authoritative data or the cache."
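
The baseline behavior quoted above, in which a server includes only the addresses it already has on hand, can be sketched as follows. This is a simplified illustration, not an excerpt from any name server: the function name `build_referral` and the plain-tuple record representation are hypothetical, and `known_addresses` is assumed to be a mapping from fully qualified server names to addresses the server already holds (authoritative data, glue, or cache).

```python
def build_referral(ns_names, known_addresses):
    """Build the authority and additional sections of a referral.

    Only addresses already present in `known_addresses` are included.
    No queries are issued for name servers whose addresses are missing.
    """
    # One NS record per delegated name server.
    authority = [("NS", name) for name in ns_names]

    # Address records only for the servers we already know about.
    additional = [
        (name, addr)
        for name in ns_names
        for addr in known_addresses.get(name, [])
    ]
    return authority, additional
```

In this sketch a name server with no known address simply has no entry in the additional section; nothing is looked up on its behalf.
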
Some name server implementations take this address inclusion a step further with a feature called "glue fetching". A name server that implements glue fetching attempts to include address records for every NS record in the authority section. If necessary, the name server issues multiple queries of its own to obtain any missing address records.

Problems with glue fetching can arise in the context of "authoritative-only" name servers, which only serve authoritative data and ignore requests for recursion. Such an entity will not normally generate any queries of its own. Instead it answers non-recursive queries from iterative resolvers looking for information in zones it serves. With glue fetching enabled, however, an authoritative server invokes an iterative resolver to look up an unknown address record to complete the additional section of a response.

We have observed situations where the iterative resolver of a glue-fetching name server can send queries that reach other name servers, but is apparently prevented from receiving the responses. For example, perhaps the name server is authoritative-only and therefore its administrators expect it to receive only queries and not responses. Perhaps unaware of glue fetching and presuming that the name server's iterative resolver will generate no queries, its administrators place the name server behind a network device that prevents it from receiving responses. If this is the case, all glue-fetching queries will go unanswered.

We have observed name server implementations whose iterative resolvers retry excessively when glue-fetching queries are unanswered. A single com/net name server has received hundreds of queries per second from a single such source. Judging from the specific queries received and based on additional analysis, we believe these queries result from overly aggressive glue fetching.

2.4.1.  Recommendation

Implementers whose name servers support glue fetching SHOULD take care to avoid sending queries at excessive rates. Implementations SHOULD support throttling logic to detect when queries are sent but no responses are received.

2.5.  Aggressive retransmission behind firewalls

A common occurrence and one of the largest sources of repeated queries at the com/net and root name servers appears to result from resolvers behind misconfigured firewalls. In this situation, an iterative resolver is apparently allowed to send queries through a firewall to other name servers, but not receive the responses. The result is more queries than necessary because of retransmission, all of which are useless because the responses are never received. Just as with the glue-fetching scenario described in Section 2.4, the queries are sometimes sent at excessive rates. To make matters worse, sometimes the responses, sent in reply to legitimate queries, trigger an alarm on the originator's intrusion detection system. We are frequently contacted by administrators responding to such alarms who believe our name servers are attacking their systems.

Not only do some resolvers in this situation retransmit queries at an excessive rate, but they continue to do so for days or even weeks. This scenario could result from an organization with multiple recursive name servers, only a subset of whose iterative resolvers' traffic is improperly filtered in this manner. Stub resolvers in the organization could be configured to query multiple recursive name servers. Consider the case where a stub resolver queries a filtered recursive name server first. The iterative resolver of this recursive name server sends one or more queries whose replies are filtered, so it can't respond to the stub resolver, which times out. Then the stub resolver retransmits to a recursive name server that is able to provide an answer.
Since resolution ultimately succeeds, the underlying problem might not be recognized or corrected. A popular stub resolver implementation has a very aggressive retransmission schedule, including simultaneous queries to multiple recursive name servers, which could explain how such a situation could persist without being detected.

2.5.1.  Recommendation

The most obvious recommendation is that administrators SHOULD take care not to place iterative resolvers behind a firewall that allows queries to pass through but not the resulting replies.

Iterative resolvers SHOULD take care to avoid sending queries at excessive rates. Implementations SHOULD support throttling logic to detect when queries are sent but no responses are received.

2.6.  Misconfigured NS records

Sometimes a zone administrator forgets to add the trailing dot on the domain names in the RDATA of a zone's NS records. Consider this fragment of the zone file for "example.com":

      $ORIGIN example.com.
      example.com.    3600    IN   NS   ns1.example.com  ; Note missing
      example.com.    3600    IN   NS   ns2.example.com  ; trailing dots

The zone's authoritative servers will parse the NS RDATA as "ns1.example.com.example.com" and "ns2.example.com.example.com" and return NS records with this incorrect RDATA in responses, including typically the authority section of every response containing records from the "example.com" zone.

Now consider a typical sequence of queries. An iterative resolver attempting to resolve address records for "www.example.com" with no cached information for this zone will query a "com" authoritative server. The "com" server responds with a referral to the "example.com" zone, consisting of NS records with valid RDATA and associated glue records. (This example assumes that the "example.com" zone delegation information is correct in the "com" zone.)
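
As an aside, the name expansion that produces the incorrect NS RDATA in this example can be sketched in a few lines of Python. This is a simplified illustration of how a relative name in a zone file is completed with the current origin, not a full zone file parser: the function name `qualify` is hypothetical, the origin is assumed to be an absolute name ending in a dot, and special forms such as "@" are ignored.

```python
def qualify(name, origin):
    """Return the fully qualified form of a zone-file domain name.

    A name ending in "." is already absolute and is returned unchanged.
    Any other name is relative, so the current origin is appended.
    """
    if name.endswith("."):
        return name
    return name + "." + origin


# The misconfigured RDATA from the fragment, with and without the dot.
wrong = qualify("ns1.example.com", "example.com.")
right = qualify("ns1.example.com.", "example.com.")
```

Here `wrong` is "ns1.example.com.example.com." while `right` is "ns1.example.com.", which shows why a single missing dot changes the name the authoritative servers hand out.
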
The iterative resolver caches the NS RRset from the "com" server and follows the referral by querying one of the "example.com" authoritative servers. This server responds with the "www.example.com" address record in the answer section and, typically, the "example.com" NS records in the authority section and, if space in the message remains, glue address records in the additional section. According to Section 5.4 of RFC 2181 [3], NS records in the authority section of an authoritative answer are more trustworthy than NS records from the authority section of a non-authoritative answer. Thus the "example.com" NS RRset just received from the "example.com" authoritative server overrides the "example.com" NS RRset received moments ago from the "com"