Thread: 5 messages, 2 authors, 2017-02-01

RE: sock_create_kern() and network namespace reference counts

From: David Laight <hidden>
Date: 2017-02-01 17:37:59

From: Cong Wang
Sent: 01 February 2017 17:20
> On Tue, Jan 31, 2017 at 9:57 AM, David Laight [off-list ref] wrote:
>> From: Cong Wang
>> Sent: 31 January 2017 17:38
>>> On Tue, Jan 31, 2017 at 7:41 AM, David Laight [off-list ref] wrote:
>>>> Commit 26abe1437 changed sock_create_kern() so that it stopped
>>>> holding a reference to the network namespace.
>>>> The rationale seemed to be 'to allow to stop it' (presumably 'be deleted').
>>>> Prior to this change some kernel paths used sk_change_net() (etc) to
>>>> change the namespace after the socket was created.
>>>>
>>>> If the socket doesn't hold a reference to the namespace, what actually
>>>> happens when the namespace is deleted?
>>> A kernel socket should have the same lifetime as the net namespace,
>>> that is, created in net_init and released in net_exit. Think about it, if it
>>> really held a refcnt to this netns, how could this netns be torn down?
>> That rather depends on what they are being used for.
>> Consider something like an in-kernel FTP client: it doesn't really care
>> about namespaces except inasmuch as the connections it creates must
>> be inside the correct namespace.
>> The namespace shouldn't be torn down while that connection exists any more
>> than it should be torn down while a user process has an open connection.
>> (Listening sockets are likely to be more of a problem.)
> If you don't care about netns, why not just use init_net which is never
> torn down and make your kernel socket global so that each netns
> can access it too?
If I create the kernel socket in init_net the connections don't work.
In particular a connection to 127.0.0.1 to a process started in
a different namespace (which contains an external ethernet port).

So I care enough about them to have to create sockets in the right one.
I don't care about namespaces being created or deleted.

They do work if I save the net_ns from a 'random' open of the driver
(from a process that happens to be running in the right namespace).

However that just proves the kernel socket needs to be in the right
namespace. It isn't a real solution and I can't hold a reference count
on the namespace at all (well I could call sock_create() and hold it
via a user socket).

As a matter of interest, a process can change namespace by doing:
	setns(open("/var/run/netns/namespace",...),...)
How can it select init_net?

	David
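
For reference, the user-space namespace switch mentioned above can be sketched
in C roughly as follows. This is only an illustration under stated assumptions,
not code from this thread: the helper names netns_path() and enter_netns() are
hypothetical, named namespaces are assumed to be bind-mounted under
/var/run/netns (the "ip netns" convention), and the caller is assumed to have
CAP_SYS_ADMIN. One common way to get back to the initial namespace is to pass
"/proc/1/ns/net" to the same helper, which works only if PID 1 is still in the
initial network namespace (not the case inside many containers).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: build the conventional path of a named network
 * namespace. Returns 0 on success, -1 if the buffer is too small.
 */
int netns_path(const char *name, char *buf, size_t len)
{
	assert(name != NULL && buf != NULL);

	int n = snprintf(buf, len, "/var/run/netns/%s", name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: move the calling thread into the network namespace
 * referred to by 'path'. Returns 0 on success or a negative errno value.
 * Needs CAP_SYS_ADMIN; without it setns() fails with EPERM.
 */
int enter_netns(const char *path)
{
	assert(path != NULL);

	int fd = open(path, O_RDONLY | O_CLOEXEC);

	if (fd < 0)
		return -errno;

	/* CLONE_NEWNET makes setns() reject fds that are not net namespaces. */
	int rc = (setns(fd, CLONE_NEWNET) < 0) ? -errno : 0;

	close(fd);
	return rc;
}
```

With these helpers, entering a namespace named "blue" would be
netns_path("blue", buf, sizeof(buf)) followed by enter_netns(buf), and
enter_netns("/proc/1/ns/net") would attempt the switch back, subject to the
assumptions above.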
