  The table changes are not written back until the `iptc_commit()'
  function is called.  This means it is possible for two library users
  operating on the same chain to race each other; locking would be
  required to prevent this, and it is not currently done.


  There is no race with counters, however; counters are added back in to
  the kernel in such a way that counter increments between the reading
  and writing of the table still show up in the new table.


  There are various helper functions:


     iptc_first_chain()
        This function returns the first chain name in the table.


     iptc_next_chain()
        This function returns the next chain name in the table: NULL
        means no more chains.


     iptc_builtin()
        Returns true if the given chain name is the name of a builtin
        chain.


     iptc_first_rule()
        This returns a pointer to the first rule in the given chain
        name: NULL for an empty chain.


     iptc_next_rule()
        This returns a pointer to the next rule in the chain: NULL means
        the end of the chain.


     iptc_get_target()
        This gets the target of the given rule.  If it's an extended
        target, the name of that target is returned.  If it's a jump to
        another chain, the name of that chain is returned.  If it's a
        verdict (eg. DROP), that name is returned.  If it has no target
        (an accounting-style rule), then the empty string is returned.


        Note that this function should be used instead of using the
        value of the `verdict' field of the ipt_entry structure
        directly, as it offers the above further interpretations of the
        standard verdict.


     iptc_get_policy()
        This gets the policy of a builtin chain, and fills in the
        `counters' argument with the hit statistics on that policy.


     iptc_strerror()
        This function returns a more meaningful explanation of a failure
        code in the iptc library.  If a function fails, it will always
        set errno: this value can be passed to iptc_strerror() to yield
        an error message.
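
  As a rough illustration of why iptc_get_target() is preferable to
  reading the `verdict' field directly, the sketch below decodes a raw
  standard verdict by hand.  It assumes the usual encoding (a negative
  value stores -verdict - 1, and RETURN is stored as -NF_REPEAT - 1) and
  uses local copies of the verdict numbers; the `vd_' names are
  hypothetical and are not part of libiptc.

```c
/* Local copies of the netfilter verdict numbers.  These are assumed to
 * mirror <linux/netfilter.h>; check your own kernel headers. */
enum { VD_NF_DROP = 0, VD_NF_ACCEPT = 1, VD_NF_QUEUE = 3, VD_NF_REPEAT = 4 };

/* Assumed encoding of RETURN in a standard target. */
#define VD_RETURN (-VD_NF_REPEAT - 1)

/* Hypothetical helper: give a name to a raw standard-target verdict.
 * Non-negative values are offsets into the table (a jump, or a
 * fall-through for a rule with no target); only the library can turn
 * those back into a chain name or the empty string. */
static const char *vd_verdict_name(int verdict)
{
	if (verdict >= 0)
		return "(offset)";
	if (verdict == VD_RETURN)
		return "RETURN";

	switch (-verdict - 1) {
	case VD_NF_ACCEPT:
		return "ACCEPT";
	case VD_NF_DROP:
		return "DROP";
	case VD_NF_QUEUE:
		return "QUEUE";
	default:
		return "(unknown)";
	}
}
```

  The offset case is exactly the part this sketch cannot resolve, which
  is the reason to let iptc_get_target() do the interpretation.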



  4.3.  Understanding NAT

  Welcome to Network Address Translation in the kernel.  Note that the
  infrastructure offered is designed more for completeness than raw
  efficiency, and that future tweaks may increase the efficiency
  markedly.  For the moment I'm happy that it works at all.


  NAT is separated into connection tracking (which doesn't manipulate
  packets at all), and the NAT code itself.  Connection tracking is also
  designed to be used by an iptables module, so it makes subtle
  distinctions in states which NAT doesn't care about.


  4.3.1.  Connection Tracking

  Connection tracking hooks into high-priority NF_IP_LOCAL_OUT and
  NF_IP_PRE_ROUTING hooks, in order to see packets before they enter the
  system.


  The nfct field in the skb is a pointer into the struct ip_conntrack,
  at one element of the infos[] array.  Hence we can tell the state of
  the skb by which element of this array it points to: this pointer
  encodes both the state structure and the relationship of this skb to
  that state.
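
  The pointer trick can be sketched with toy structures.  This is only a
  model: the `ci_' types and functions are hypothetical, and the
  back-pointer in each array slot is an assumption about how the
  connection is found again from the slot.

```c
#include <stddef.h>

#define CI_NUMBER 5	/* one slot per ctinfo value (assumed count) */

struct ci_conn;

/* One slot per ctinfo value; each slot points back at its connection. */
struct ci_info {
	struct ci_conn *master;
};

struct ci_conn {
	int id;				/* stand-in for the real state */
	struct ci_info infos[CI_NUMBER];
};

struct ci_skb {
	struct ci_info *nfct;		/* NULL when the packet is untracked */
};

static void ci_conn_init(struct ci_conn *ct, int id)
{
	ct->id = id;
	for (int i = 0; i < CI_NUMBER; i++)
		ct->infos[i].master = ct;
}

/* Tag a packet: the chosen slot records both connection and ctinfo. */
static void ci_attach(struct ci_skb *skb, struct ci_conn *ct, int ctinfo)
{
	skb->nfct = &ct->infos[ctinfo];
}

/* Recover the connection and the ctinfo from the single pointer. */
static struct ci_conn *ci_get(const struct ci_skb *skb, int *ctinfo)
{
	struct ci_conn *ct;

	if (skb->nfct == NULL)
		return NULL;
	ct = skb->nfct->master;
	*ctinfo = (int)(skb->nfct - ct->infos);	/* index within infos[] */
	return ct;
}
```

  The array index falls out of plain pointer subtraction, so no extra
  field is needed in the packet to carry the relationship.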


  The best way to extract the `nfct' field is to call
  `ip_conntrack_get()', which returns NULL if it's not set, or the
  connection pointer, and fills in ctinfo which describes the
  relationship of the packet to that connection.  This enumerated type
  has several values:



     IP_CT_ESTABLISHED
        The packet is part of an established connection, in the original
        direction.


     IP_CT_RELATED
        The packet is related to the connection, and is passing in the
        original direction.


     IP_CT_NEW
        The packet is trying to create a new connection (obviously, it
        is in the original direction).


     IP_CT_ESTABLISHED + IP_CT_IS_REPLY
        The packet is part of an established connection, in the reply
        direction.


     IP_CT_RELATED + IP_CT_IS_REPLY
        The packet is related to the connection, and is passing in the
        reply direction.

  Hence a reply packet can be identified by testing for >=
  IP_CT_IS_REPLY.
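
  A minimal sketch of that test follows.  The enumeration is a local
  copy with `cs_' names, and its numeric values are an assumption
  (chosen so that the reply values are the original ones plus the
  IS_REPLY offset); consult the real header before relying on them.

```c
/* Local stand-in for the ctinfo enumeration; values are assumed. */
enum cs_ctinfo {
	CS_ESTABLISHED = 0,
	CS_RELATED     = 1,
	CS_NEW         = 2,
	CS_IS_REPLY    = 3	/* added to the values above for replies */
};

/* True if the packet travels in the reply direction. */
static int cs_is_reply(int ctinfo)
{
	return ctinfo >= CS_IS_REPLY;
}

/* Strip the direction, leaving ESTABLISHED, RELATED or NEW. */
static int cs_base_state(int ctinfo)
{
	return cs_is_reply(ctinfo) ? ctinfo - CS_IS_REPLY : ctinfo;
}
```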



  4.4.  Extending Connection Tracking/NAT

  These frameworks are designed to accommodate any number of protocols
  and different mapping types.  Some of these mapping types might be
  quite specific, such as a load-balancing/fail-over mapping type.


  Internally, connection tracking converts a packet to a "tuple",
  representing the interesting parts of the packet, before searching for
  bindings or rules which match it.  This tuple has a manipulatable
  part, and a non-manipulatable part; called "src" and "dst", as this is
  the view for the first packet in the Source NAT world (it'd be a reply
  packet in the Destination NAT world).  The tuple for every packet in
  the same packet stream in that direction is the same.


  For example, a TCP packet's tuple contains the manipulatable part
  (source IP and source port) and the non-manipulatable part
  (destination IP and destination port).  The manipulatable and
  non-manipulatable parts do not need to be the same type though; for
  example, an ICMP packet's tuple contains the manipulatable part
  (source IP and the ICMP id) and the non-manipulatable part
  (destination IP and the ICMP type and code).


  Every tuple has an inverse, which is the tuple of the reply packets in
  the stream.  For example, the inverse of an ICMP ping packet, icmp id
  12345, from 192.168.1.1 to 1.2.3.4, is a ping-reply packet, icmp id
  12345, from 1.2.3.4 to 192.168.1.1.
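
  The ping example can be written out as a small sketch.  The `it_'
  structure is a toy, not the kernel's tuple, and only the echo and
  echo-reply types are handled; ICMP type 8 (echo) and type 0 (echo
  reply) are the standard values.

```c
#include <stdint.h>

#define IT_ICMP_ECHOREPLY 0
#define IT_ICMP_ECHO      8

/* Build a host-order IPv4 address from its four octets. */
#define IT_IP(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

struct it_tuple {
	uint32_t src_ip;	/* manipulatable part */
	uint16_t id;
	uint32_t dst_ip;	/* non-manipulatable part */
	uint8_t  type, code;
};

/* Fill *inv with the tuple a reply would carry.  Returns 1 on success,
 * or 0 for a type this sketch does not know how to invert. */
static int it_invert(struct it_tuple *inv, const struct it_tuple *orig)
{
	uint8_t reply_type;

	if (orig->type == IT_ICMP_ECHO)
		reply_type = IT_ICMP_ECHOREPLY;
	else if (orig->type == IT_ICMP_ECHOREPLY)
		reply_type = IT_ICMP_ECHO;
	else
		return 0;

	inv->src_ip = orig->dst_ip;	/* addresses swap places */
	inv->dst_ip = orig->src_ip;
	inv->id     = orig->id;		/* the id is shared by both directions */
	inv->type   = reply_type;
	inv->code   = orig->code;
	return 1;
}
```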


  These tuples, represented by the `struct ip_conntrack_tuple', are used
  widely.  In fact, together with the hook the packet came in on (which
  has an effect on the type of manipulation expected), and the device
  involved, this is the complete information on the packet.


  Most tuples are contained within a `struct ip_conntrack_tuple_hash',
  which adds a doubly linked list entry, and a pointer to the connection
  that the tuple belongs to.


  A connection is represented by the `struct ip_conntrack': it has two
  `struct ip_conntrack_tuple_hash' fields: one referring to the
  direction of the original packet (tuplehash[IP_CT_DIR_ORIGINAL]), and
  one referring to packets in the reply direction
  (tuplehash[IP_CT_DIR_REPLY]).
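
  The relationship between a connection and its two tuple hashes might
  be modelled as below.  All `th_' names are hypothetical and the tuple
  is reduced to a TCP/UDP-style five-tuple; the point is only that a
  hash entry leads back to its connection, and that its position in
  tuplehash[] gives the direction.

```c
#include <stddef.h>
#include <stdint.h>

enum { TH_DIR_ORIGINAL = 0, TH_DIR_REPLY = 1, TH_DIR_MAX = 2 };

struct th_tuple {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
	uint8_t  protonum;
};

struct th_conn;

struct th_tuple_hash {
	struct th_tuple_hash *next, *prev;	/* doubly linked list entry */
	struct th_tuple tuple;
	struct th_conn *ctrack;			/* connection this tuple belongs to */
};

struct th_conn {
	struct th_tuple_hash tuplehash[TH_DIR_MAX];
};

static int th_tuple_equal(const struct th_tuple *a, const struct th_tuple *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
	       a->src_port == b->src_port && a->dst_port == b->dst_port &&
	       a->protonum == b->protonum;
}

/* Set up both directions; the reply tuple is the inverse of the original. */
static void th_conn_init(struct th_conn *ct, const struct th_tuple *orig)
{
	struct th_tuple reply = { orig->dst_ip, orig->src_ip,
				  orig->dst_port, orig->src_port,
				  orig->protonum };

	ct->tuplehash[TH_DIR_ORIGINAL].tuple = *orig;
	ct->tuplehash[TH_DIR_REPLY].tuple = reply;
	for (int d = 0; d < TH_DIR_MAX; d++) {
		ct->tuplehash[d].next = ct->tuplehash[d].prev = NULL;
		ct->tuplehash[d].ctrack = ct;
	}
}

/* Return the hash entry matching the tuple, or NULL if neither does. */
static const struct th_tuple_hash *th_find(const struct th_conn *ct,
					   const struct th_tuple *t)
{
	for (int d = 0; d < TH_DIR_MAX; d++)
		if (th_tuple_equal(&ct->tuplehash[d].tuple, t))
			return &ct->tuplehash[d];
	return NULL;
}

/* The direction is the entry's index within its connection's array. */
static int th_direction(const struct th_tuple_hash *h)
{
	return (int)(h - h->ctrack->tuplehash);
}
```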


  Anyway, the first thing the NAT code does is to see if the connection
  tracking code managed to extract a tuple and find an existing
  connection, by looking at the skbuff's nfct field; this tells us if
  it's an attempt on a new connection, or if not, which direction it is
  in; in the latter case, then the manipulations determined previously
  for that connection are done.


  If it was the start of a new connection, we look for a rule for that
  tuple, using the standard iptables traversal mechanism, on the `nat'
  table.  If a rule matches, it is used to initialize the manipulations
  for both that direction and the reply; the connection-tracking code is
  told that the reply it should expect has changed.  Then the packet is
  manipulated as above.



  If there is no rule, a `null' binding is created: this usually does
  not map the packet, but exists to ensure we don't map another stream
  over an existing one.  Sometimes, the null binding cannot be created,
  because we have already mapped an existing stream over it, in which
  case the per-protocol manipulation may try to remap it, even though
  it's nominally a `null' binding.


  4.4.1.  Standard NAT Targets

  NAT targets are like any other iptables target extensions, except they
  insist on being used only in the `nat' table.  Both the SNAT and DNAT
  targets take a `struct ip_nat_multi_range' as their extra data; this
  is used to specify the range of addresses a mapping is allowed to bind
  into.  A range element, `struct ip_nat_range', consists of an
  inclusive minimum and maximum IP address, and an inclusive minimum and
  maximum protocol-specific value (eg. TCP ports).  There is also room for
  flags, which say whether the IP address can be mapped (sometimes we
  only want to map the protocol-specific part of a tuple, not the IP),
  and another to say that the protocol-specific part of the range is
  valid.


  A multi-range is an array of these `struct ip_nat_range' elements;
  this means that a range could be "1.1.1.1-1.1.1.2 ports 50-55 AND
  1.1.1.3 port 80".  Each range element adds to the range (a union, for
  those who like set theory).
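
  The union semantics can be sketched as a membership test over an
  array of ranges.  The `mr_' structure and flag names are hypothetical
  stand-ins, with addresses and ports kept in host byte order for
  simplicity; a candidate is acceptable if any one element accepts it.

```c
#include <stddef.h>
#include <stdint.h>

#define MR_MAP_IPS         1u	/* the IP part of the range is valid */
#define MR_PROTO_SPECIFIED 2u	/* the port part of the range is valid */

/* Build a host-order IPv4 address from its four octets. */
#define MR_IP(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

struct mr_range {
	unsigned int flags;
	uint32_t min_ip, max_ip;	/* inclusive */
	uint16_t min_port, max_port;	/* inclusive */
};

/* Does a single range element accept this address and port? */
static int mr_in_range(const struct mr_range *r, uint32_t ip, uint16_t port)
{
	if ((r->flags & MR_MAP_IPS) && (ip < r->min_ip || ip > r->max_ip))
		return 0;
	if ((r->flags & MR_PROTO_SPECIFIED) &&
	    (port < r->min_port || port > r->max_port))
		return 0;
	return 1;
}

/* A multi-range is the union of its elements. */
static int mr_in_multi_range(const struct mr_range *ranges, size_t n,
			     uint32_t ip, uint16_t port)
{
	for (size_t i = 0; i < n; i++)
		if (mr_in_range(&ranges[i], ip, port))
			return 1;
	return 0;
}
```

  With the two elements from the example above, 1.1.1.2 port 55 and
  1.1.1.3 port 80 are both inside the multi-range, while 1.1.1.2 port 80
  is not, since no single element covers that combination.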


  4.4.2.  New Protocols

  4.4.2.1.  Inside The Kernel

  Implementing a new protocol first means deciding what the
  manipulatable and non-manipulatable parts of the tuple should be.
  Everything in the tuple has the property that it identifies the stream
  uniquely.  The manipulatable part of the tuple is the part you can do
  NAT with: for TCP this is the source port, for ICMP it's the icmp ID;
  something to use as a "stream identifier".  The non-manipulatable part
  is the rest of the packet that uniquely identifies the stream, but we
  can't play with (eg. TCP destination port, ICMP type).


  Once you've decided this, you can write an extension to the
  connection-tracking code in the directory, and go about populating the
  `ip_conntrack_protocol' structure which you need to pass to
  `ip_conntrack_register_protocol()'.


  The fields of `struct ip_conntrack_protocol' are:


     list
        Set it to '{ NULL, NULL }'; used to sew you into the list.


     proto
        Your protocol number; see `/etc/protocols'.


     name
        The name of your protocol.  This is the name the user will see;
        it's usually best if it's the canonical name in
        `/etc/protocols'.


     pkt_to_tuple
        The function which fills out the protocol-specific parts of the
        tuple, given the packet.  The `datah' pointer points to the
        start of your header (just past the IP header), and the datalen
        is the length of the packet.  If the packet isn't long enough to
        contain the header information, return 0; datalen will always be
        at least 8 bytes though (enforced by framework).


     invert_tuple
        This function is simply used to change the protocol-specific
        part of the tuple into the way a reply to that packet would
        look.


     print_tuple
        This function is used to print out the protocol-specific part of
        a tuple; usually it's sprintf()'d into the buffer provided.  The
        number of buffer characters used is returned.  This is used to
        print the states for the /proc entry.


     print_conntrack
        This function is used to print the private part of the conntrack
        structure, if any, also used for printing the states in /proc.


     packet
        This function is called when a packet is seen which is part of
        an established connection.  You get a pointer to the conntrack
        structure, the IP header, the length, and the ctinfo.  You
        return a verdict for the packet (usually NF_ACCEPT), or -1 if
        the packet is not a valid part of the connection.  You can
        delete the connection inside this function if you wish, but you
        must use the following idiom to avoid races (see
        ip_conntrack_proto_icmp.c):



          if (del_timer(&ct->timeout))
                  ct->timeout.function((unsigned long)ct);



     new
        This function is called when a packet creates a connection for
        the first time; there is no ctinfo arg, since the first packet
        is of ctinfo IP_CT_NEW by definition.  It returns 0 to fail to
        create the connection, or a connection timeout in jiffies.
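
  To make the shape of pkt_to_tuple, invert_tuple and print_tuple more
  concrete, here is a userspace sketch for an invented protocol whose
  header carries two 16-bit identifiers at byte offsets 8 and 10.  The
  protocol, the `pt_' names and the simplified signatures are all
  assumptions for illustration; the real callbacks take the kernel's
  tuple structure.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define PT_HDR_LEN 12	/* invented header: 8 fixed bytes plus two ids */

struct pt_tuple {
	uint16_t src_id;	/* manipulatable part */
	uint16_t dst_id;	/* non-manipulatable part */
};

/* Fill the protocol-specific part of the tuple from the header.
 * Returns 0 if the packet is too short to hold the ids, 1 otherwise. */
static int pt_pkt_to_tuple(const void *datah, size_t datalen,
			   struct pt_tuple *t)
{
	const uint8_t *p = datah;

	if (datalen < PT_HDR_LEN)
		return 0;
	t->src_id = (uint16_t)((p[8] << 8) | p[9]);	/* network byte order */
	t->dst_id = (uint16_t)((p[10] << 8) | p[11]);
	return 1;
}

/* A reply to this packet has the two ids exchanged. */
static int pt_invert_tuple(struct pt_tuple *inv, const struct pt_tuple *orig)
{
	inv->src_id = orig->dst_id;
	inv->dst_id = orig->src_id;
	return 1;
}

/* Print into the caller's buffer and return the characters used. */
static unsigned int pt_print_tuple(char *buffer, const struct pt_tuple *t)
{
	return (unsigned int)sprintf(buffer, "sid=%u did=%u",
				     (unsigned int)t->src_id,
				     (unsigned int)t->dst_id);
}
```

  The length check matters here precisely because the invented header
  is longer than the 8 bytes the framework guarantees.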

  Once you've written and tested that you can track your new protocol,
  it's time to teach NAT how to translate it.  This means writing a new
  module: an extension to the NAT code which populates the
  `ip_nat_protocol' structure that you need to pass to
  `ip_nat_protocol_register()'.  The fields of that structure are:


     list
        Set it to '{ NULL, NULL }'; used to sew you into the list.


     name
        The name of your protocol.  This is the name the user will see;
        it's best if it's the canonical name in `/etc/protocols' for
        userspace auto-loading, as we'll see later.


     protonum
        Your protocol number; see `/etc/protocols'.


     manip_pkt
        This is the other half of connection tracking's pkt_to_tuple
        function: you can think of it as "tuple_to_pkt".  There are some
        differences though: you get a pointer to the start of the IP
        header, and the total packet length.  This is because some
        protocols (UDP, TCP) need to know the IP header.  You're given
        the ip_nat_tuple_manip field from the tuple (i.e., the "src"
        field), rather than the entire tuple, and the type of
        manipulation you are to perform.


     in_range
        This function is used to tell if the manipulatable part of the given
        tuple is in the given range.  This function is a bit tricky:
        we're given the manipulation type which has been applied to the
        tuple, which tells us how to interpret the range (is it a source
        range or a destination range we're aiming for?).
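
  The way the manipulation type selects which side of the tuple to test
  can be sketched for a port-based protocol.  The `ir_' names and the
  flattened signature are hypothetical, and ports are held in host byte
  order to keep the comparison simple.

```c
#include <stdint.h>

enum ir_manip_type { IR_MANIP_SRC, IR_MANIP_DST };

struct ir_tuple {
	uint16_t src_port;
	uint16_t dst_port;
};

/* Is the port that this manipulation type affects inside [min, max]? */
static int ir_in_range(const struct ir_tuple *t, enum ir_manip_type type,
		       uint16_t min, uint16_t max)
{
	uint16_t port;

	if (type == IR_MANIP_SRC)
		port = t->src_port;	/* source NAT: judge the source port */
	else
		port = t->dst_port;	/* destination NAT: judge the dest port */

	return port >= min && port <= max;
}
```

  For a tuple with source port 1024 and destination port 80, the range
  1000-2000 matches under a source manipulation but not under a
  destination manipulation.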