Thread (37 messages), 5 authors, 2010-12-17

Re: [PATCH v3 21/22] netoops: Add user-programmable boot_id

From: Matt Mackall <hidden>
Date: 2010-12-14 22:47:55
Also in: lkml, netdev

On Tue, 2010-12-14 at 14:33 -0800, Mike Waychison wrote:
On Tue, Dec 14, 2010 at 2:06 PM, Matt Mackall [off-list ref] wrote:
quoted
On Tue, 2010-12-14 at 13:59 -0800, Mike Waychison wrote:
quoted
On Tue, Dec 14, 2010 at 1:42 PM, Matt Mackall [off-list ref] wrote:
quoted
On Tue, 2010-12-14 at 13:30 -0800, Mike Waychison wrote:
quoted
Add support for letting userland define a 32-bit boot id.  This is useful
for users to be able to correlate netoops reports to specific boot
instances offline.
This sounds a lot like the pre-existing /proc/sys/kernel/random/boot_id
that's used by kerneloops.org.
Could be.  I'm looking at it now... There is no documentation for this
boot_id field?
Probably not. It's just a random number generated at boot.
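
For reference, a minimal sketch of reading the existing boot_id from userland. The file holds one UUID string that stays the same until the next reboot. The helper name read_boot_id and its path parameter are illustrative only, and the sketch assumes a Linux system where the procfs file is present.

```python
import uuid
from pathlib import Path

# Location named in the thread; only present on Linux with procfs mounted.
BOOT_ID_PATH = "/proc/sys/kernel/random/boot_id"


def read_boot_id(path=BOOT_ID_PATH):
    """Read the boot_id file at `path` and return it as a uuid.UUID.

    The path is a parameter so the helper can be pointed at a saved copy.
    """
    text = Path(path).read_text().strip()
    # uuid.UUID raises ValueError if the content is not a well-formed UUID.
    return uuid.UUID(text)


if __name__ == "__main__":
    print(read_boot_id())
```

Two reads within the same boot should return the same value, which is what makes it usable as a per-boot correlation key.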
quoted
Reusing this guy would work, except that it doesn't appear to allow
arbitrary values to be set.  We need to inject our boot sequence
number (which is figured out in userland) in the packet somehow as we
need to correlate it to our other monitoring systems.
What happens if you oops before userspace is available?
One of two general cases:
  - The crash is a one-off and the machine comes back.  The boot
number sequence will see a hole in it, which is a clue that something
bad happened.
  - The machine is in a crash loop.  This has the same failure mode
for us as if the machine never made it onto the network for
whatever reason: bad cables, bad firmware, bad RAM, ...

In both cases, we can detect that something is wrong and handle it.
Note that our firmware is responsible for incrementing the boot
sequence at bootup, which is why the above works.  In general though,
our machines do make it up to userland -- staying alive once booted is
the hard part ;)
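
The "hole in the sequence" check described above could look roughly like the sketch below on the monitoring side. It assumes the collector keeps the boot sequence numbers it has seen reported for one machine. The function name find_missing_boots is hypothetical and not part of netoops.

```python
def find_missing_boots(seen):
    """Return boot sequence numbers that were never reported.

    `seen` is an iterable of integers observed for a single machine.
    Any number between the smallest and largest observed value that is
    absent marks a boot that never reached userland (or the network).
    """
    seen = set(seen)
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


if __name__ == "__main__":
    # Boot 103 never reported in, so it shows up as a hole.
    print(find_missing_boots([101, 102, 104, 105]))
```

This only covers the one-off case. A crash loop produces no new reports at all, so it has to be caught the same way as any other machine that has gone silent.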
Interesting. Is this Google-specific firmware magic? I'd probably accept
a hook in random.c to fold a number into the UUID, which would unify
things.
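
To make the "fold a number into the UUID" idea concrete, here is one possible reading as a userland sketch. The thread does not specify a folding scheme, so XORing the number into the first four bytes is purely an assumption for illustration. It is not what random.c does, and fold_into_uuid is a made-up name.

```python
import uuid


def fold_into_uuid(base, boot_seq):
    """XOR a 32-bit boot sequence number into the first four bytes of a UUID.

    Assumed scheme for illustration only. The first four bytes hold no
    version or variant bits, so those fields are left untouched.
    """
    if not 0 <= boot_seq < 2**32:
        raise ValueError("boot_seq must fit in 32 bits")
    raw = bytearray(base.bytes)
    for i, b in enumerate(boot_seq.to_bytes(4, "big")):
        raw[i] ^= b
    return uuid.UUID(bytes=bytes(raw))


if __name__ == "__main__":
    random_id = uuid.uuid4()
    print(random_id, fold_into_uuid(random_id, 1234))
```

Because XOR is its own inverse, anyone who knows the original random UUID can recover the folded number by XORing the two values. Without the original the number is not recoverable, so a real hook would need to settle what the correlating systems are expected to see.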

-- 
Mathematics is the supreme nostalgia of our time.