Thread (3 messages) 3 messages, 2 authors, 2005-06-28

Re: [RFC/PATCH] "safer ipv4 reassembly" (fwd)

From: Arthur Kepner <hidden>
Date: 2005-06-28 22:11:39

On Sun, 26 Jun 2005, Herbert Xu wrote:
.....
Thanks for writing this patch Arthur.
Likewise, thanks for reviewing it.
quoted
+struct ipc {
+	struct hlist_node	node;
+	u32			saddr;
+	u32			daddr;
+	u8			protocol;
+	atomic_t		refcnt;	/* how many ipqs hold refs to us */
+	atomic_t		seq;	/* how many ip datagrams for this 
+					 * (saddr,daddr,protocol) since we 
+					 * were created */
+	struct timer_list	timer;
+	struct rcu_head		rcu;
Is RCU worth it here? The only time we'd be taking the locks on this
is when the first fragment of a packet comes in.  At that point we'll
be taking write_lock(&ipfrag_lock) anyway.

The only other use of RCU in your patch is ip_count.  That should be
changed to be done in ip_defrag instead.  At that point you can simply
find the ipc by deferencing ipq, so no need for __ipc_find and hence
RCU.

The reason you need to change it in this way is because you can't make
assumptions about ip_rcv_finish being the first place where a packet
is defragmented.  With connection tracking enabled conntrack is the first
place where defragmentation occurs.
  
Right, I see that now. (I'm not well acquainted with the conntrack 
code...) 

One reason I used RCU for the "ipc" structures is that I wanted 
to be able to find find them (in ip_rcv_finish()) without locking. 
Since ip_rcv_finish() is the wrong place to do that, that reason 
is invalid. 

There is a (big) advantage to doing this in ip_defrag() - this 
becomes a no-op for non-fragmented datagrams. The disadvantage 
is that there could be a situation where you receive:

  1) first fragment of datagram X [for a particular (src,dst,proto)]
  2) a zillion non-fragmented datagrams [for the same (src,dst,proto)]
  3) last fragment of datagram X [for (src,dst,proto)]

and no "disorder" would be detected for the datagrams associated 
with (src,dst,proto), even though the ip id could have wrapped in the 
meantime. This seems like a very uncommon case, however.

quoted
+#define IPC_HASHSZ	IPQ_HASHSZ
+static struct {
+	struct hlist_head head;
+	spinlock_t lock;
+} ipc_hash[IPC_HASHSZ];
I'd store ipc entries in the main ipq hash table since they can use
the same keys for lookup as ipq entries.  You just need to set protocol
to zero and map the user to values specific to ipc for ipc entries.
One mapping would be to set the top bit of user for ipc entries, e.g.

#define IP_DEFRAG_IPC 0x80000000
	ipc->user = ipq->user | IP_DEFRAG_IPC;

Of course you also need to make sure that the two structures share
the leading elements.  You can then use the user field to distinguish
between ipc/ipq entries.
Hmmm, let me think about combining the ipc/ipq structures, and 
also the related question of whether to use RCU for the ipc 
structures. I'll try to spin another version of the patch before 
the end of the week.

--
Arthur
Keyboard shortcuts
hback out one level
jnext message in thread
kprevious message in thread
ldrill in
Escclose help / fold thread tree
?toggle this help