<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from url=(0055)http://gnumonks.org/ftp/pub/doc/packet-journey-2.4.html -->
<HTML><HEAD><TITLE>The journey of a packet through the linux 2.4 network stack</TITLE>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.2800.1400" name=GENERATOR></HEAD>
<BODY>
<H1>The journey of a packet through the linux 2.4 network stack</H1>
<H2>Harald Welte <CODE>laforge@gnumonks.org</CODE></H2>1.4, 2000/10/14 20:27:43 
<P>
<HR>
<EM>This document describes the journey of a network packet inside the Linux 
kernel 2.4.x. This journey has changed drastically since 2.2 because the globally 
serialized bottom half was abandoned in favor of the new softirq system.</EM> 
<HR>

<H2><A name=s1>1. Preface</A></H2>
<P>I apologize for my ignorance, but this document has a strong focus on 
the "default case": the x86 architecture and IP packets which get forwarded. 
<P>
<P>I am definitely no kernel guru, and the information provided by this document 
may be wrong. So don't expect too much; I'll always appreciate your comments and 
bugfixes. 
<P>
<H2><A name=s2>2. Receiving the packet</A></H2>
<H2>2.1 The receive interrupt</H2>
<P>If the network card receives an Ethernet frame which matches the local MAC 
address or is a link-layer broadcast, it issues an interrupt. The network driver 
for this particular card handles the interrupt and fetches the packet data into 
RAM via DMA, PIO or whatever mechanism the card uses. It then allocates an skb 
and calls a function of the protocol-independent device support routines: 
<CODE>net/core/dev.c:netif_rx(skb)</CODE>. 
<P>If the driver didn't already timestamp the skb, it is timestamped now. 
Afterwards the skb is enqueued in the appropriate queue for the processor 
handling this packet. If the queue backlog is full, the packet is dropped at this 
point. After enqueuing the skb, the receive softirq is marked for execution 
via <CODE>include/linux/interrupt.h:__cpu_raise_softirq()</CODE>. 
<P>The interrupt handler exits and all interrupts are re-enabled. 
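<P>The enqueue-or-drop behaviour described above can be sketched in plain 
userspace C. This is only a toy model under my own assumptions, not the kernel 
code: <CODE>toy_netif_rx()</CODE>, <CODE>struct cpu_queue</CODE>, 
<CODE>BACKLOG_MAX</CODE> and <CODE>NET_RX_SOFTIRQ_BIT</CODE> are hypothetical 
names, and the backlog limit is kept tiny for illustration. 

```c
#include <stddef.h>

#define BACKLOG_MAX 4          /* hypothetical backlog limit, tiny on purpose */
#define NET_RX_SOFTIRQ_BIT 1u  /* hypothetical "RX softirq pending" flag */

/* Stand-in for struct sk_buff: just an id and a queue link. */
struct pkt {
    int id;
    struct pkt *next;
};

/* Stand-in for one CPU's receive queue and its softirq state. */
struct cpu_queue {
    struct pkt *head;
    struct pkt *tail;
    unsigned int len;
    unsigned int dropped;
    unsigned int softirq_pending;  /* bitmask of softirqs marked for execution */
};

/* Enqueue a packet on this CPU's queue.
 * Returns 0 if the packet was queued, -1 if the backlog was full. */
int toy_netif_rx(struct cpu_queue *q, struct pkt *p)
{
    if (q->len >= BACKLOG_MAX) {
        q->dropped++;              /* backlog full: drop right here */
        return -1;
    }

    p->next = NULL;
    if (q->tail)
        q->tail->next = p;
    else
        q->head = p;
    q->tail = p;
    q->len++;

    /* Mark the receive softirq; the real work happens later. */
    q->softirq_pending |= NET_RX_SOFTIRQ_BIT;
    return 0;
}
```

<P>The point of the split is that the interrupt handler only queues the packet 
and sets a flag; all protocol processing is deferred to the softirq. 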
<P>
<H2>2.2 The network RX softirq</H2>
<P>Now we encounter one of the big changes between 2.2 and 2.4: The whole 
network stack is no longer a bottom half, but a softirq. Softirqs have the major 
advantage that they may run on more than one CPU simultaneously. Bottom halves 
(bh's) were guaranteed to run on only one CPU at a time. 
<P>Our network receive softirq is registered in 
<CODE>net/core/dev.c:net_init()</CODE> using the function 
<CODE>kernel/softirq.c:open_softirq()</CODE> provided by the softirq subsystem. 
<P>Further handling of our packet is done in the network receive softirq 
(NET_RX_SOFTIRQ) which is called from 
<CODE>kernel/softirq.c:do_softirq()</CODE>. do_softirq() itself is called from 
three places within the kernel: 
<OL>
  <LI>from <CODE>arch/i386/kernel/irq.c:do_IRQ()</CODE>, which is the generic 
  IRQ handler 
  <LI>from <CODE>arch/i386/kernel/entry.S</CODE> in case the kernel just 
  returned from a syscall 
  <LI>inside the main process scheduler in 
  <CODE>kernel/sched.c:schedule()</CODE> </LI></OL>
<P>So if execution passes one of these points, do_softirq() is called. It 
detects that NET_RX_SOFTIRQ is marked and calls 
<CODE>net/core/dev.c:net_rx_action()</CODE>. Here the skb is dequeued from this 
CPU's receive queue and afterwards handed to the appropriate packet handler. In 
the case of IPv4 this is the IPv4 packet handler. 
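<P>The register / mark / dispatch cycle described in this section can be 
sketched as follows. This is a minimal single-threaded model and not the code 
from <CODE>kernel/softirq.c</CODE>: every <CODE>toy_</CODE> name is hypothetical, 
the handlers take no arguments, and the per-CPU and locking aspects are left out 
entirely. 

```c
#include <stddef.h>

/* Hypothetical softirq numbers for this sketch. */
enum { TOY_NET_TX_SOFTIRQ, TOY_NET_RX_SOFTIRQ, TOY_NR_SOFTIRQS };

typedef void (*toy_softirq_fn)(void);

toy_softirq_fn toy_vec[TOY_NR_SOFTIRQS];  /* registered handlers */
unsigned int toy_pending;                 /* one bit per marked softirq */
int toy_rx_runs;                          /* counts runs of the RX handler */

/* Register a handler, in the spirit of open_softirq(). */
void toy_open_softirq(int nr, toy_softirq_fn fn)
{
    toy_vec[nr] = fn;
}

/* Mark a softirq for later execution. */
void toy_raise_softirq(int nr)
{
    toy_pending |= 1u << nr;
}

/* Example handler standing in for net_rx_action(). */
void toy_net_rx_action(void)
{
    toy_rx_runs++;
}

/* Run every marked softirq once; returns how many handlers ran. */
int toy_do_softirq(void)
{
    unsigned int pending = toy_pending;
    int ran = 0;

    toy_pending = 0;  /* handlers may mark softirqs again while running */
    for (int nr = 0; nr < TOY_NR_SOFTIRQS; nr++) {
        if ((pending & (1u << nr)) && toy_vec[nr]) {
            toy_vec[nr]();
            ran++;
        }
    }
    return ran;
}
```

<P>Calling <CODE>toy_do_softirq()</CODE> when nothing is marked does nothing, 
which is why it is cheap to call it from several places. 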
<P>
<H2>2.3 The IPv4 packet handler </H2>
<P>The IP packet handler is registered via 
<CODE>net/core/dev.c:dev_add_pack()</CODE> called from 
<CODE>net/ipv4/ip_output.c:ip_init()</CODE>. 
<P>The IPv4 packet handling function is 
<CODE>net/ipv4/ip_input.c:ip_rcv()</CODE>. After some initial checks (whether the 
packet is for this host, ...) the IP header checksum is verified. Additional 
checks are done on the length and on the IP protocol version, which must be 4. 
<P>Every packet failing one of the sanity checks is dropped at this point. 
<P>If the packet passes the tests, we determine the size of the IP packet and 
trim the skb in case the transport medium has appended some padding. 
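<P>The sanity checks and the trimming step can be illustrated with a small 
userspace sketch. It is not the code from <CODE>ip_rcv()</CODE>: 
<CODE>toy_ip_checksum()</CODE> and <CODE>toy_ip_rcv_check()</CODE> are 
hypothetical helpers working on a raw byte buffer, and the order of the checks 
is my own choice. The checksum is the standard Internet checksum as described 
in RFC 1071. 

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071) over len bytes, in network byte order.
 * Over a header whose checksum field is already filled in, the result
 * is 0 if the header is intact. */
uint16_t toy_ip_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;

    while (sum >> 16)                       /* fold the carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Check an IPv4 header at the start of buf.
 * Returns the IP total length (the size to trim the buffer to),
 * or -1 if the packet should be dropped. */
int toy_ip_rcv_check(const uint8_t *buf, size_t buf_len)
{
    if (buf_len < 20)                       /* shorter than a minimal header */
        return -1;

    unsigned int version = buf[0] >> 4;
    size_t ihl = (size_t)(buf[0] & 0x0f) * 4;   /* header length in bytes */

    if (version != 4 || ihl < 20 || ihl > buf_len)
        return -1;
    if (toy_ip_checksum(buf, ihl) != 0)     /* corrupted header */
        return -1;

    size_t tot_len = ((size_t)buf[2] << 8) | buf[3];
    if (tot_len < ihl || tot_len > buf_len) /* truncated or bogus length */
        return -1;

    /* Anything beyond tot_len is link-layer padding and gets trimmed. */
    return (int)tot_len;
}
```

<P>A received buffer longer than the returned total length simply carries 
padding, which is why the skb is trimmed rather than rejected in that case. 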
<P>Now, for the first time, one of the netfilter hooks is called 
(NF_IP_PRE_ROUTING). 
<P>Netfilter provides a generic and abstract interface to the standard routing 
code. This is currently used for packet filtering, mangling, NAT and queuing 
packets to userspace. For further reference see my conference paper 'The 
netfilter subsystem in Linux 2.4' or one of Rusty's unreliable guides, e.g. the 
netfilter-hacking-guide. 
<P>After successful traversal of the netfilter hook, 
<CODE>net/ipv4/ip_input.c:ip_rcv_finish()</CODE> is called. 
<P>Inside ip_rcv_finish(), the packet's destination is determined by calling the 
routing function <CODE>net/ipv4/route.c:ip_route_input()</CODE>. Furthermore, if 
our IP packet has IP options, they are processed now. Depending on the routing 
decision made by <CODE>net/ipv4/route.c:ip_route_input_slow()</CODE>, the 
journey of our packet continues in one of the following functions: 
<P>
<DL>
  <DT><B>net/ipv4/ip_input.c:ip_local_deliver()</B>
  <DD>
  <P>The packet's destination is local; we have to process the layer 4 protocol 
  and pass it to a userspace process. 
  <P></P>
  <DT><B>net/ipv4/ip_forward.c:ip_forward()</B>
  <DD>
  <P>The packet's destination is not local; we have to forward it to another 
  network. 
  <P></P>
  <DT><B>net/ipv4/route.c:ip_error()</B>
  <DD>
  <P>An error occurred; we are unable to find an appropriate routing table entry 
  for this packet. 
  <P></P>
  <DT><B>net/ipv4/ipmr.c:ip_mr_input()</B>
  <DD>
  <P>It is a multicast packet and we have to do some multicast routing. 
</P></DD></DL>
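<P>One way to picture this branching: the routing lookup attaches a destination 
entry to the packet, and that entry carries a pointer to the function which 
handles the packet next. As far as I know this is the idea behind the kernel's 
dispatch as well, but the sketch below is only a model. All <CODE>toy_</CODE> 
names are hypothetical, and the routing decision is passed in as a parameter 
instead of being looked up in a routing table. 

```c
#include <stddef.h>

struct toy_skb;
typedef int (*toy_input_fn)(struct toy_skb *);

/* Stand-in for a routing cache entry: it knows who handles the packet next. */
struct toy_dst {
    toy_input_fn input;
};

/* Stand-in for struct sk_buff. */
struct toy_skb {
    struct toy_dst *dst;
};

/* The four possible outcomes of the routing decision. */
enum toy_route { TOY_LOCAL, TOY_FORWARD, TOY_ERROR, TOY_MULTICAST };

/* Each handler just reports which path was taken. */
int toy_local_deliver(struct toy_skb *skb) { (void)skb; return TOY_LOCAL; }
int toy_forward(struct toy_skb *skb)       { (void)skb; return TOY_FORWARD; }
int toy_error(struct toy_skb *skb)         { (void)skb; return TOY_ERROR; }
int toy_mr_input(struct toy_skb *skb)      { (void)skb; return TOY_MULTICAST; }

/* One destination entry per outcome, indexed by enum toy_route. */
struct toy_dst toy_dsts[] = {
    { toy_local_deliver },
    { toy_forward },
    { toy_error },
    { toy_mr_input },
};

/* Stand-in for the routing lookup: attach the matching entry to the skb. */
void toy_route_input(struct toy_skb *skb, enum toy_route decision)
{
    skb->dst = &toy_dsts[decision];
}

/* Route the packet, then continue in whatever function the entry names. */
int toy_rcv_finish(struct toy_skb *skb, enum toy_route decision)
{
    toy_route_input(skb, decision);
    return skb->dst->input(skb);
}
```

<P>The caller never needs a switch statement over the possible outcomes; adding 
a new kind of destination only means providing another entry with its own 
handler. 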
<P>
<H2><A name=s3>3. Packet forwarding to another device </A></H2>
<P>If the routing decided that this packet has to be forwarded to another 
device, the function <CODE>net/ipv4/ip_forward.c:ip_forward()</CODE> is called. 
<P>
<P>The first task of this function is to check the IP header's TTL. If it is 
&lt;= 1 we drop the packet and return an ICMP time exceeded message to the 
sender. 
<P>We check whether the skb has enough headroom for the destination 
device's link layer header and expand the skb if necessary. 
<P>Next the TTL is decremented by one. 
<P>If our new packet is bigger than the MTU of the destination device and the 
don't fragment bit in the IP header is set, we drop the packet and send an ICMP 
fragmentation needed message to the sender. 
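<P>The decisions made so far in the forwarding path can be condensed into a 
short sketch. It follows the order of the steps as described in this section 
and is not the code from <CODE>ip_forward()</CODE>: 
<CODE>struct toy_iphdr</CODE>, <CODE>enum toy_fwd</CODE> and 
<CODE>toy_ip_forward()</CODE> are hypothetical, the headroom check is left out, 
and no ICMP message is actually generated. 

```c
#include <stdint.h>

/* Possible outcomes of the forwarding checks. */
enum toy_fwd {
    TOY_FWD_OK,             /* continue to the netfilter forward hook */
    TOY_FWD_TIME_EXCEEDED,  /* drop, report ICMP time exceeded */
    TOY_FWD_FRAG_NEEDED     /* drop, report ICMP fragmentation needed */
};

/* Only the header fields this sketch cares about. */
struct toy_iphdr {
    uint8_t  ttl;
    uint16_t tot_len;        /* total packet length in bytes */
    int      dont_fragment;  /* non-zero if the DF bit is set */
};

enum toy_fwd toy_ip_forward(struct toy_iphdr *iph, unsigned int mtu)
{
    /* A packet arriving with TTL <= 1 must not be forwarded. */
    if (iph->ttl <= 1)
        return TOY_FWD_TIME_EXCEEDED;

    /* Real code must also adjust the header checksum after this. */
    iph->ttl--;

    /* Too big for the outgoing device and not allowed to be fragmented. */
    if (iph->tot_len > mtu && iph->dont_fragment)
        return TOY_FWD_FRAG_NEEDED;

    return TOY_FWD_OK;
}
```

<P>A packet that is too big but has the don't fragment bit cleared still 
passes here; it is fragmented later on the output path. 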
<P>
<P>Finally it is time to call another one of the netfilter hooks - this time it 
is the NF_IP_FORWARD hook. 
<P>
<P>Assuming that the netfilter hook returns an NF_ACCEPT verdict, the 
function <CODE>net/ipv4/ip_forward.c:ip_forward_finish()</CODE> is the next step 
in our packet's journey. 
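<P>The pattern "traverse the hook, then continue in a finish function" recurs 
at every netfilter hook in this document. A minimal model of it might look like 
this. It is a sketch under my own assumptions and not the netfilter API: all 
<CODE>toy_</CODE> names are hypothetical, and only two verdicts are modelled, 
whereas real netfilter knows more than accept and drop. 

```c
#include <stddef.h>

enum toy_verdict { TOY_NF_DROP, TOY_NF_ACCEPT };

/* Stand-in for a packet; "mark" is just something hooks can look at. */
struct toy_pkt {
    int mark;
};

typedef enum toy_verdict (*toy_hookfn)(struct toy_pkt *);
typedef int (*toy_okfn)(struct toy_pkt *);

#define TOY_MAX_HOOKS 8

/* The functions registered at one hook, called in order. */
struct toy_hook_chain {
    toy_hookfn fns[TOY_MAX_HOOKS];
    size_t n;
};

/* Returns 0 on success, -1 if the chain is full. */
int toy_register_hook(struct toy_hook_chain *c, toy_hookfn fn)
{
    if (c->n >= TOY_MAX_HOOKS)
        return -1;
    c->fns[c->n++] = fn;
    return 0;
}

/* Traverse the chain. If every function accepts, continue in okfn and
 * return its result; otherwise the packet is dropped and -1 is returned. */
int toy_nf_hook(struct toy_hook_chain *c, struct toy_pkt *p, toy_okfn okfn)
{
    for (size_t i = 0; i < c->n; i++)
        if (c->fns[i](p) != TOY_NF_ACCEPT)
            return -1;
    return okfn(p);
}

/* Example hook functions and an example continuation. */
enum toy_verdict toy_accept_all(struct toy_pkt *p)
{
    (void)p;
    return TOY_NF_ACCEPT;
}

enum toy_verdict toy_drop_marked(struct toy_pkt *p)
{
    return p->mark ? TOY_NF_DROP : TOY_NF_ACCEPT;
}

int toy_forward_finish(struct toy_pkt *p)
{
    (void)p;
    return 0;
}
```

<P>An empty chain accepts everything, so the packet goes straight on to the 
finish function when nothing is registered at a hook. 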
<P>
<P>ip_forward_finish() itself checks if we need to set any additional options in 
the IP header, and has ip_optFIXME doing this. Afterwards it calls 
<CODE>include/net/ip.h:ip_send()</CODE>. 
<P>
<P>If fragmentation is needed, FIXME:ip_fragment gets called; otherwise we 
continue in <CODE>net/ipv4/ip_output.c:ip_finish_output()</CODE>. 
<P>
<P>ip_finish_output() in turn does nothing more than call the netfilter 
postrouting hook NF_IP_POST_ROUTING and, on successful traversal of this hook, 
call ip_finish_output2(). 
<P>
<P>ip_finish_output2() prepends the hardware (link layer) header to our 
skb and calls <CODE>net/ipv4/ip_output.c:ip_output()</CODE>. 
<P></P></BODY></HTML>
