The packet then passes a final netfilter hook, the NF_IP_POST_ROUTING
[4] hook, before being put on the wire again.
The NF_IP_LOCAL_OUT [5] hook is called for packets that are created
locally. Here you can see that routing occurs after this hook is
called: in fact, the routing code is called first (to figure out the
source IP address and some IP options): if you want to alter the
routing, you must alter the `skb->dst' field yourself, as is done in
the NAT code.
3.1. Netfilter Base
Now that we have an example of netfilter for IPv4, you can see when
each hook is activated. This is the essence of netfilter.
Kernel modules can register to listen at any of these hooks. A module
that registers a function must specify the priority of the function
within the hook; then when that netfilter hook is called from the core
networking code, each module registered at that point is called in the
order of priorities, and is free to manipulate the packet. The module
can then tell netfilter to do one of five things:
1. NF_ACCEPT: continue traversal as normal.
2. NF_DROP: drop the packet; don't continue traversal.
3. NF_STOLEN: I've taken over the packet; don't continue traversal.
4. NF_QUEUE: queue the packet (usually for userspace handling).
5. NF_REPEAT: call this hook again.
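
The registration and verdict handling described above can be sketched in plain userspace C. This is only a toy model, not the kernel API: every name starting with `sk_' is hypothetical, the verdict values are illustrative, and the real framework hands out real packets rather than a struct with a single integer in it.

```c
#include <assert.h>
#include <stddef.h>

/* Verdict names follow the list above; the numeric values are illustrative. */
enum sk_verdict { SK_DROP, SK_ACCEPT, SK_STOLEN, SK_QUEUE, SK_REPEAT };

struct sk_packet { int mark; };

typedef enum sk_verdict (*sk_hookfn)(struct sk_packet *pkt);

struct sk_hook {
    sk_hookfn fn;
    int priority;               /* lower values run first */
};

#define SK_MAX_HOOKS 8

struct sk_hook_list {
    struct sk_hook hooks[SK_MAX_HOOKS];
    size_t count;
};

/* Insert fn so the list stays sorted by ascending priority.
 * Returns 0 on success, -1 if the list is full. */
int sk_register_hook(struct sk_hook_list *l, sk_hookfn fn, int priority)
{
    size_t i;

    assert(fn != NULL);
    if (l->count == SK_MAX_HOOKS)
        return -1;
    for (i = l->count; i > 0 && l->hooks[i - 1].priority > priority; i--)
        l->hooks[i] = l->hooks[i - 1];
    l->hooks[i].fn = fn;
    l->hooks[i].priority = priority;
    l->count++;
    return 0;
}

/* Call each registered function in priority order and act on its verdict. */
enum sk_verdict sk_run_hooks(const struct sk_hook_list *l, struct sk_packet *pkt)
{
    size_t i = 0;

    while (i < l->count) {
        enum sk_verdict v = l->hooks[i].fn(pkt);

        if (v == SK_REPEAT)
            continue;           /* call the same function again */
        if (v != SK_ACCEPT)
            return v;           /* DROP, STOLEN and QUEUE end traversal */
        i++;
    }
    return SK_ACCEPT;
}

/* Example hook functions: each appends a digit so the call order is visible. */
enum sk_verdict sk_append_one(struct sk_packet *pkt)
{
    pkt->mark = pkt->mark * 10 + 1;
    return SK_ACCEPT;
}

enum sk_verdict sk_append_two(struct sk_packet *pkt)
{
    pkt->mark = pkt->mark * 10 + 2;
    return SK_ACCEPT;
}

enum sk_verdict sk_drop_all(struct sk_packet *pkt)
{
    (void)pkt;
    return SK_DROP;
}
```

Registering sk_append_two at priority 100 and then sk_append_one at priority -100 still runs sk_append_one first, because the list is ordered by priority and not by registration time. A function that returns SK_REPEAT unconditionally would loop forever in this sketch.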
The other parts of netfilter (handling queued packets, cool comments)
will be covered in the kernel section later.
Upon this foundation, we can build fairly complex packet
manipulations, as shown in the next two sections.
3.2. Packet Selection: IP Tables
A packet selection system called IP Tables has been built over the
netfilter framework. It is a direct descendant of ipchains (that came
from ipfwadm, that came from BSD's ipfw IIRC), with extensibility.
Kernel modules can register a new table, and ask for a packet to
traverse a given table. This packet selection method is used for
packet filtering (the `filter' table), Network Address Translation
(the `nat' table) and general pre-route packet mangling (the `mangle'
table).
The hooks that are registered with netfilter are as follows (with the
functions in each hook in the order that they are actually called):
   --->PRE------>[ROUTE]--->FWD---------->POST------>
       Conntrack    |       Mangle   ^    Mangle
       Mangle       |       Filter   |    NAT (Src)
       NAT (Dst)    |                |    Conntrack
       (QDisc)      |             [ROUTE]
                    v                |
                    IN Filter       OUT Conntrack
                    |  Conntrack     ^  Mangle
                    |  Mangle        |  NAT (Dst)
                    v                |  Filter
3.2.1. Packet Filtering
This table, `filter', should never alter packets: only filter them.
One of the advantages of iptables filter over ipchains is that it is
small and fast, and it hooks into netfilter at the NF_IP_LOCAL_IN,
NF_IP_FORWARD and NF_IP_LOCAL_OUT points. This means that for any
given packet, there is one (and only one) possible place to filter it.
This makes things much simpler for users than ipchains was. Also, the
fact that the netfilter framework provides both the input and output
interfaces for the NF_IP_FORWARD hook means that many kinds of
filtering are far simpler.
Note: I have ported the kernel portions of both ipchains and ipfwadm
as modules on top of netfilter, enabling the use of the old ipfwadm
and ipchains userspace tools without requiring an upgrade.
3.2.2. NAT
This is the realm of the `nat' table, which is fed packets from two
netfilter hooks: for non-local packets, the NF_IP_PRE_ROUTING and
NF_IP_POST_ROUTING hooks are perfect for destination and source
alterations respectively. If CONFIG_IP_NF_NAT_LOCAL is defined, the
hooks NF_IP_LOCAL_OUT and NF_IP_LOCAL_IN are used for altering the
destination of local packets.
This table is slightly different from the `filter' table, in that only
the first packet of a new connection will traverse the table: the
result of this traversal is then applied to all future packets in the
same connection.
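
The "first packet only" behaviour can be illustrated with a small userspace model. It is a sketch under loud assumptions: the `sk_' names, the fixed-size connection array and the single port-80 rule are all invented here, and only the original direction of a connection is handled. The real NAT code relies on connection tracking for this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct sk_tuple { uint32_t src, dst; uint16_t sport, dport; };

struct sk_conn {
    struct sk_tuple orig;   /* tuple of the first packet */
    uint32_t new_dst;       /* destination chosen when the table was walked */
    int in_use;
};

#define SK_MAX_CONNS 16

struct sk_nat {
    struct sk_conn conns[SK_MAX_CONNS];
    unsigned table_walks;   /* times the rule table was consulted */
};

int sk_tuple_eq(const struct sk_tuple *a, const struct sk_tuple *b)
{
    return a->src == b->src && a->dst == b->dst &&
           a->sport == b->sport && a->dport == b->dport;
}

/* Stand-in for walking the `nat' table: a hypothetical rule that sends
 * port 80 to 10.0.0.2 and leaves everything else alone. */
uint32_t sk_walk_nat_table(struct sk_nat *nat, const struct sk_tuple *t)
{
    nat->table_walks++;
    return t->dport == 80 ? UINT32_C(0x0A000002) : t->dst;
}

/* Return the destination to use for a packet with tuple t. */
uint32_t sk_nat_dst(struct sk_nat *nat, const struct sk_tuple *t)
{
    struct sk_conn *slot = NULL;
    uint32_t new_dst;
    size_t i;

    assert(nat != NULL && t != NULL);
    for (i = 0; i < SK_MAX_CONNS; i++) {
        struct sk_conn *c = &nat->conns[i];

        if (c->in_use && sk_tuple_eq(&c->orig, t))
            return c->new_dst;          /* known connection: reuse the result */
        if (!c->in_use && slot == NULL)
            slot = c;
    }
    new_dst = sk_walk_nat_table(nat, t);    /* first packet of a connection */
    if (slot != NULL) {                     /* remember it if there is room */
        slot->orig = *t;
        slot->new_dst = new_dst;
        slot->in_use = 1;
    }
    return new_dst;
}
```

Calling sk_nat_dst() repeatedly with the same tuple bumps table_walks only once; a tuple that has not been seen before triggers one more walk.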
3.2.2.1. Masquerading, Port Forwarding, Transparent Proxying
I divide NAT into Source NAT (where the first packet has its source
altered), and Destination NAT (the first packet has its destination
altered).
Masquerading is a special form of Source NAT: port forwarding and
transparent proxying are special forms of Destination NAT. These are
now all done using the NAT framework, rather than being independent
entities.
3.2.3. Packet Mangling
The packet mangling table (the `mangle' table) is used for actual
changing of packet information. Example applications are the TOS and
TCPMSS targets. The mangle table hooks into all five netfilter hooks.
(Note that this changed with kernel 2.4.18: previous kernels did not
have the mangle table attached to all hooks.)
3.3. Connection Tracking
Connection tracking is fundamental to NAT, but it is implemented as a
separate module; this allows an extension to the packet filtering code
to simply and cleanly use connection tracking (the `state' module).
3.4. Other Additions
The new flexibility provides both the opportunity to do really funky
things, and the opportunity for people to write enhancements or
complete replacements that can be mixed and matched.
4. Information for Programmers
I'll let you in on a secret: my pet hamster did all the coding. I was
just a channel, a `front' if you will, in my pet's grand plan. So,
don't blame me if there are bugs. Blame the cute, furry one.
4.1. Understanding ip_tables
iptables simply provides a named array of rules in memory (hence the
name `iptables'), and such information as where packets from each hook
should begin traversal. After a table is registered, userspace can
read and replace its contents using getsockopt() and setsockopt().
iptables does not register with any netfilter hooks: it relies on
other modules to do that and feed it the packets as appropriate; a
module must register the netfilter hooks and ip_tables separately, and
provide the mechanism to call ip_tables when the hook is reached.
4.1.1. ip_tables Data Structures
For convenience, the same data structure is used to represent a rule
by userspace and within the kernel, although a few fields are only
used inside the kernel.
Each rule consists of the following parts:
1. A `struct ipt_entry'.
2. Zero or more `struct ipt_entry_match' structures, each with a
variable amount (0 or more bytes) of data appended to it.
3. A `struct ipt_entry_target' structure, with a variable amount (0 or
more bytes) of data appended to it.
The variable nature of the rule gives a huge amount of flexibility for
extensions, as we'll see, especially as each match or target can carry
an arbitrary amount of data. This does create a few traps, however:
we have to watch out for alignment. We do this by ensuring that the
`ipt_entry', `ipt_entry_match' and `ipt_entry_target' structures are
conveniently sized, and that all data is rounded up to the maximal
alignment of the machine using the IPT_ALIGN() macro.
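
To make the rounding concrete, here is a minimal sketch of a round-up-to-maximal-alignment macro. SK_ALIGN and SK_MAX_ALIGN are hypothetical stand-ins written for this example; the real IPT_ALIGN() is defined in the ip_tables header and may derive the alignment differently. The sketch assumes the alignment is a power of two, which the mask trick requires.

```c
#include <assert.h>
#include <stddef.h>

/* A union of the usual "widest" types; its alignment is taken as the
 * maximal alignment of the machine (an assumption of this sketch). */
union sk_widest { long long ll; long double ld; double d; void *p; void (*fp)(void); };

/* The offset of w after a single char equals the alignment of the union. */
struct sk_probe { char c; union sk_widest w; };

#define SK_MAX_ALIGN offsetof(struct sk_probe, w)

/* Round s up to the next multiple of SK_MAX_ALIGN. */
#define SK_ALIGN(s) (((s) + (SK_MAX_ALIGN - 1)) & ~(SK_MAX_ALIGN - 1))

/* Size of a header followed by its data, each rounded up separately. */
size_t sk_padded_size(size_t header, size_t data)
{
    /* The mask trick only works for power-of-two alignments. */
    assert((SK_MAX_ALIGN & (SK_MAX_ALIGN - 1)) == 0);
    return SK_ALIGN(header) + SK_ALIGN(data);
}
```

A one-byte payload therefore still occupies a full SK_MAX_ALIGN bytes, which is what keeps whatever follows it in the rule correctly aligned.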
The `struct ipt_entry' has the following fields:
1. A `struct ipt_ip' part, containing the specifications for the IP
header that it is to match.
2. An `nf_cache' bitfield showing what parts of the packet this rule
examined.
3. A `target_offset' field indicating the offset from the beginning of
this rule where the ipt_entry_target structure begins. This should
always be aligned correctly (with the IPT_ALIGN macro).
4. A `next_offset' field indicating the total size of this rule,
including the matches and target. This should also be aligned
correctly using the IPT_ALIGN macro.
5. A `comefrom' field used by the kernel to track packet traversal.
6. A `struct ipt_counters' field containing the packet and byte
counters for packets which matched this rule.
The `struct ipt_entry_match' and `struct ipt_entry_target' are very
similar, in that they contain a total (IPT_ALIGN'ed) length field
(`match_size' and `target_size' respectively) and a union holding the
name of the match or target (for userspace), and a pointer (for the
kernel).
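
The way `target_offset' and `next_offset' fall out of the aligned sizes can be sketched as follows. The structures here are deliberately cut down and hypothetical (`sk_entry', `sk_match', `sk_target'): the real ones carry the IP specification, counters and a name/pointer union. The sketch handles exactly one match and one target, and assumes each header and its data are rounded up separately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

union sk_widest { long long ll; long double ld; void *p; };
struct sk_probe { char c; union sk_widest w; };
#define SK_MAX_ALIGN offsetof(struct sk_probe, w)
#define SK_ALIGN(s) (((s) + (SK_MAX_ALIGN - 1)) & ~(SK_MAX_ALIGN - 1))

/* Simplified stand-ins: the real structures carry more fields. */
struct sk_entry  { uint16_t target_offset, next_offset; };
struct sk_match  { uint16_t match_size;  char name[30]; };
struct sk_target { uint16_t target_size; char name[30]; };

/* Lay out entry + one match + one target in buf.
 * Returns the total rule size, or 0 if it does not fit. */
size_t sk_build_rule(unsigned char *buf, size_t buflen,
                     const char *match_name, size_t match_data,
                     const char *target_name, size_t target_data)
{
    struct sk_entry e;
    struct sk_match m;
    struct sk_target t;
    size_t match_off = SK_ALIGN(sizeof e);
    size_t match_sz = SK_ALIGN(sizeof m) + SK_ALIGN(match_data);
    size_t target_off = match_off + match_sz;
    size_t target_sz = SK_ALIGN(sizeof t) + SK_ALIGN(target_data);
    size_t total = target_off + target_sz;

    assert(buf != NULL);
    if (total > buflen || total > UINT16_MAX)
        return 0;
    memset(buf, 0, total);
    memset(&m, 0, sizeof m);
    memset(&t, 0, sizeof t);
    e.target_offset = (uint16_t)target_off;  /* where the target begins */
    e.next_offset = (uint16_t)total;         /* where the next rule begins */
    m.match_size = (uint16_t)match_sz;
    strncpy(m.name, match_name, sizeof m.name - 1);
    t.target_size = (uint16_t)target_sz;
    strncpy(t.name, target_name, sizeof t.name - 1);
    /* memcpy avoids assuming anything about the alignment of buf itself. */
    memcpy(buf, &e, sizeof e);
    memcpy(buf + match_off, &m, sizeof m);
    memcpy(buf + target_off, &t, sizeof t);
    return total;
}

/* Copy out the target header, found via the entry's target_offset. */
void sk_get_target(const unsigned char *rule, struct sk_target *out)
{
    struct sk_entry e;

    memcpy(&e, rule, sizeof e);
    memcpy(out, rule + e.target_offset, sizeof *out);
}
```

Because every piece is rounded up, both offsets come out as multiples of SK_MAX_ALIGN no matter how many bytes of data the match and target carry.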
Because of the tricky nature of the rule data structure, some helper
routines are provided:
ipt_get_target()
This inline function returns a pointer to the target of a rule.
IPT_MATCH_ITERATE()
This macro calls the given function for every match in the given
rule. The function's first argument is the `struct
ipt_entry_match', and other arguments (if any) are those
supplied to the IPT_MATCH_ITERATE() macro. The function must
return either zero for the iteration to continue, or a non-zero
value to stop.
IPT_ENTRY_ITERATE()
This macro takes a pointer to an entry, the total size of the
table of entries, and a function to call. The function's first
argument is the `struct ipt_entry', and other arguments (if any)
are those supplied to the IPT_ENTRY_ITERATE() macro. The
function must return either zero for the iteration to continue,
or a non-zero value to stop.
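
The iteration contract (walk by `next_offset', stop at the first non-zero return) can be sketched as an ordinary function. This is not the real macro: sk_entry_iterate() and the two-field `sk_entry' are hypothetical, extra arguments are passed through a single context pointer instead of a variable argument list, and the callback sees a copy of each entry. The -1 for a malformed table is an addition of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Cut-down entry: just the size of this rule and an id to tell rules apart. */
struct sk_entry { uint16_t next_offset; uint16_t id; };

typedef int (*sk_entry_fn)(const struct sk_entry *e, void *ctx);

/* Call fn for each entry in table; return the first non-zero result,
 * 0 if every call returned 0, or -1 if an entry's size is implausible. */
int sk_entry_iterate(const unsigned char *table, size_t size,
                     sk_entry_fn fn, void *ctx)
{
    size_t off = 0;

    assert(fn != NULL);
    while (off + sizeof(struct sk_entry) <= size) {
        struct sk_entry e;
        int ret;

        memcpy(&e, table + off, sizeof e);
        if (e.next_offset < sizeof e || e.next_offset > size - off)
            return -1;          /* refuse to walk past the end */
        ret = fn(&e, ctx);
        if (ret != 0)
            return ret;         /* non-zero stops the iteration */
        off += e.next_offset;
    }
    return 0;
}

/* Helper for building a test table: writes one entry followed by
 * `payload' bytes of padding and returns the offset of the next entry. */
size_t sk_append_entry(unsigned char *table, size_t off,
                       uint16_t id, uint16_t payload)
{
    struct sk_entry e;

    e.next_offset = (uint16_t)(sizeof e + payload);
    e.id = id;
    memcpy(table + off, &e, sizeof e);
    return off + e.next_offset;
}

/* Callback that counts entries and never stops the walk. */
int sk_count(const struct sk_entry *e, void *ctx)
{
    (void)e;
    ++*(int *)ctx;
    return 0;
}

struct sk_search { uint16_t wanted; int visited; };

/* Callback that stops the walk as soon as the wanted id is seen. */
int sk_find(const struct sk_entry *e, void *ctx)
{
    struct sk_search *s = ctx;

    s->visited++;
    return e->id == s->wanted;
}
```

With three entries in the table, sk_count visits all of them, while sk_find looking for the second entry's id stops after two calls and its return value is handed back to the caller.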
4.1.2. ip_tables From Userspace
Userspace has four operations: it can read the current table, read the
info (hook positions and size of table), replace the table (and grab
the old counters), and add in new counters.
This allows any atomic operation to be simulated by userspace: this is
done by the libiptc library, which provides convenience
"add/delete/replace" semantics for programs.
Because these tables are transferred into kernel space, alignment
becomes an issue for machines which have different userspace and
kernelspace type rules (eg. Sparc64 with 32-bit userland). These
cases are handled by overriding the definition of IPT_ALIGN for these
platforms in `libiptc.h'.
4.1.3. ip_tables Use And Traversal
The kernel starts traversing at the location indicated by the
particular hook. That rule is examined: if the `struct ipt_ip'
elements match, each `struct ipt_entry_match' is checked in turn (the
match function associated with that match is called). If the match
function returns 0, iteration stops on that rule. If it sets the
`hotdrop' parameter to 1, the packet will also be immediately dropped
(this is used for some suspicious packets, such as in the tcp match
function).
If the iteration continues to the end, the counters are incremented,
the `struct ipt_entry_target' is examined: if it's a standard target,
the `verdict' field is read (negative means a packet verdict, positive
means an offset to jump to). If the answer is positive and the offset
is not that of the next rule, the `back' variable is set, and the
previous `back' value is placed in that rule's `comefrom' field.
For non-standard targets, the target function is called: it returns a
verdict (non-standard targets can't jump, as this would break the
static loop-detection code). The verdict can be IPT_CONTINUE, to
continue on to the next rule.
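
The jump bookkeeping is easier to follow in a toy model. The sketch below is not the kernel's traversal code: it uses array indices where the real table uses byte offsets, reduces matching to a single protocol number, covers standard targets only, and does no bounds checking, so it assumes a well-formed table in which every chain ends in an unconditional rule. The `sk_' names and the value of SK_RETURN are invented, and the sketch assumes that a "return" verdict resumes at the saved `back' position.

```c
#include <assert.h>
#include <stddef.h>

enum { SK_DROP = 0, SK_ACCEPT = 1 };

/* Final verdicts are stored as negative numbers; non-negative values
 * are the index of the rule to jump to. */
#define SK_VERDICT(v) (-(v) - 1)
#define SK_RETURN     (-100)        /* illustrative value only */

struct sk_rule {
    int proto;          /* 0 matches anything, else must equal the packet's */
    int verdict;        /* SK_VERDICT(x), SK_RETURN, or a rule index */
    int comefrom;       /* saved `back' value, written during traversal */
    unsigned hits;      /* packet counter */
};

/* Walk rules from `start'; `underflow' is the index of the chain's final
 * (policy) rule, used as the initial `back' position. */
int sk_traverse(struct sk_rule *rules, int start, int underflow, int pkt_proto)
{
    int e = start;
    int back = underflow;

    assert(rules != NULL);
    for (;;) {
        struct sk_rule *r = &rules[e];

        if (r->proto != 0 && r->proto != pkt_proto) {
            e++;                            /* no match: try the next rule */
            continue;
        }
        r->hits++;
        if (r->verdict < 0) {
            if (r->verdict != SK_RETURN)
                return -r->verdict - 1;     /* final packet verdict */
            e = back;                       /* resume after the jump */
            back = rules[back].comefrom;    /* restore the older position */
            continue;
        }
        if (r->verdict != e + 1) {          /* a real jump, not fall-through */
            rules[e + 1].comefrom = back;   /* save the previous `back' */
            back = e + 1;
        }
        e = r->verdict;
    }
}
```

In a five-rule table where rule 0 sends TCP to a two-rule user chain at index 3 that ends in SK_RETURN, a TCP packet visits rules 0, 3 and 4, comes back to rule 1, and finally hits the policy at rule 2.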
4.2. Extending iptables
Because I'm lazy, iptables is fairly extensible. This is basically a
scam to palm off work onto other people, which is what Open Source is
all about (cf. Free Software, which as RMS would say, is about
freedom, and I was sitting in one of his talks when I wrote this).
Extending iptables potentially involves two parts: extending the
kernel, by writing a new module, and possibly extending the userspace
program iptables, by writing a new shared library.
4.2.1. The Kernel
Writing a kernel module itself is fairly simple, as you can see from
the examples. One thing to be aware of is that your code must be
re-entrant: there can be one packet coming in from userspace, while
another arrives on an interrupt. In fact in SMP there can be one
packet on an interrupt per CPU in 2.3.4 and above.
The functions you need to know about are:
init_module()
This is the entry-point of the module. It returns a negative
error number, or 0 if it successfully registers itself with
netfilter.
cleanup_module()
This is the exit point of the module; it should unregister
itself with netfilter.
ipt_register_match()
This is used to register a new match type. You hand it a
`struct ipt_match', which is usually declared as a static (file-
scope) variable.
ipt_register_target()
This is used to register a new target type. You hand it a `struct
ipt_target', which is usually declared as a static (file-scope)
variable.
ipt_unregister_target()
Used to unregister your target.
ipt_unregister_match()
Used to unregister your match.
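
The register-on-load, unregister-on-unload pattern can be mocked up in userspace. Nothing below is the kernel interface: the `sk_' functions, the two-field `struct sk_match', the fixed-size registry and the choice of error codes are all assumptions made for this sketch, and a real `struct ipt_match' has a different set of fields. The shape is the point: a file-scope structure is handed to the register call, init returns 0 or a negative error number, and cleanup undoes the registration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Cut-down stand-in for a match extension. */
struct sk_match {
    const char *name;
    int (*match)(int pkt_len, int limit);   /* non-zero means "matches" */
};

#define SK_MAX_MATCHES 8
static struct sk_match *sk_matches[SK_MAX_MATCHES];

/* Look a match up by name; NULL if it is not registered. */
struct sk_match *sk_find_match(const char *name)
{
    size_t i;

    for (i = 0; i < SK_MAX_MATCHES; i++)
        if (sk_matches[i] != NULL && strcmp(sk_matches[i]->name, name) == 0)
            return sk_matches[i];
    return NULL;
}

/* Returns 0, or a negative errno value (the codes are this sketch's choice). */
int sk_register_match(struct sk_match *m)
{
    size_t i;

    assert(m != NULL && m->name != NULL);
    if (sk_find_match(m->name) != NULL)
        return -EEXIST;
    for (i = 0; i < SK_MAX_MATCHES; i++) {
        if (sk_matches[i] == NULL) {
            sk_matches[i] = m;
            return 0;
        }
    }
    return -ENOSPC;
}

void sk_unregister_match(struct sk_match *m)
{
    size_t i;

    for (i = 0; i < SK_MAX_MATCHES; i++)
        if (sk_matches[i] == m)
            sk_matches[i] = NULL;
}

/* A hypothetical extension: matches packets longer than `limit' bytes. */
static int sk_longer_than(int pkt_len, int limit)
{
    return pkt_len > limit;
}

/* Declared at file scope, as the text above suggests for the real thing. */
static struct sk_match sk_length_match = { "length_sketch", sk_longer_than };

/* Mock module entry point: 0 on success, negative error number otherwise. */
int sk_init_module(void)
{
    return sk_register_match(&sk_length_match);
}

/* Mock module exit point: undo whatever sk_init_module() registered. */
void sk_cleanup_module(void)
{
    sk_unregister_match(&sk_length_match);
}
```

Calling sk_init_module() twice without a cleanup in between fails the second time, which is one reason the exit point must always unregister what the entry point registered.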