Debugging a Stall or a Freeze
From: Sankar P <hidden>
Date: 2013-08-17 02:14:15
On Fri, Aug 16, 2013 at 10:08 PM, Salam Farhat [off-list ref] wrote:
I posted a question earlier, and I have since confirmed that the code is running in an infinite loop. However, I discovered that the infinite loop is happening inside kernel code, specifically inside the kmalloc function. I know this is highly improbable, but I believe that this is the case.

The line of code that causes the infinite loop is in bold below and starts with "buf =". If I comment this line out, it does not hang; if I uncomment it, it does. Furthermore, no print statements after that line are printed, even though I have it surrounded by print statements. KMALLOC is a macro defined as

#define KMALLOC(a,b) kmalloc((a),(b))

The last line printed is

b0b0b0 4096

with 4096 being the size of the buffer. The get_buffer method is called quite a few times before the final call, where it goes into an infinite loop. I am thinking there could be a memory leak, or can this happen if memory is low? Any advice on how to tackle this issue would be greatly appreciated.
The GFP_KERNEL flag allows the kmalloc call to sleep in low-memory situations. If you pass GFP_ATOMIC instead, kmalloc will not sleep: if the memory cannot be allocated, it fails immediately and returns NULL. You can try that.

Google showed me that a show_page_info call can tell you about the memory usage. Alternatively, you can call that just before the kmalloc call to see the memory status.

If you are interested in debugging memory leaks, you can try kmemleak:
http://psankar.blogspot.in/2010/11/detecting-memory-leaks-in-kernel.html
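On the memory side, the size of the request is also worth a quick check. The following user-space sketch reproduces the arithmetic for the struct buffer quoted in this thread. PAGE_SIZE is hard-coded to 4096 here as an assumption (in the kernel it comes from the architecture headers), and buffer_size() and fits_in_one_page() are hypothetical helper names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. In kernel code PAGE_SIZE comes from the arch headers. */
#ifndef PAGE_SIZE
#define PAGE_SIZE 4096
#endif
#define DATA_SIZE (PAGE_SIZE - sizeof(int))

/* Same layout as the struct buffer posted in the thread. */
struct buffer {
    char *ptr;
    char data[DATA_SIZE];
};

/* The data array alone is always smaller than one page. */
static_assert(DATA_SIZE < PAGE_SIZE, "data[] must be smaller than a page");

/* Hypothetical helper: the number of bytes the KMALLOC call asks for. */
size_t buffer_size(void)
{
    return sizeof(struct buffer);
}

/* Hypothetical helper: true if the whole struct fits in a single page. */
bool fits_in_one_page(void)
{
    return buffer_size() <= PAGE_SIZE;
}
```

With a 4-byte pointer the struct is exactly one page, which matches the 4096 printed in the report. With an 8-byte pointer it grows past one page, so fits_in_one_page() returns false, which is in line with the XXX comment in get_buffer().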
Thanks.

static inline struct buffer *get_buffer(void)
{
    /* XXX: __get_free_page should be used. KMALLOC is for small stuff < PAGE_SIZE */
    struct buffer *buf;

    /* %zu is the matching format specifier for the size_t that sizeof yields */
    printk(KERN_EMERG " b0b0b0 %zu\n", sizeof(struct buffer));
    buf = KMALLOC(sizeof(struct buffer), GFP_KERNEL);
    print_entry_location();
    printk(KERN_EMERG " b1b1b1\n");
    //if (buf)  //i commented these out
    //buf->ptr = buf->data + INIT_LOC;  //i commented these out
    printk(KERN_EMERG " b1b1b1\n");
    print_exit_location();
    return NULL;  //i changed it to return null so the next function just exits
}

Additional info

struct buffer {
    char *ptr;
    char data[DATA_SIZE];
};

#define DATA_SIZE (PAGE_SIZE - sizeof(int))

On Thu, Jul 25, 2013 at 2:23 PM, [off-list ref] wrote:
On Thu, 25 Jul 2013 13:56:47 -0400, Salam Farhat said:
When the guest OS freezes I get the messages shown below. I would like to know what is a good approach for debugging this issue. I am not sure what a process stall is. Is that a deadlock?

[ 780.357876] BUG: soft lockup - CPU#0 stuck for 22s! [nautilus:1382]
[ 780.361658] Process nautilus (pid: 1382, ti=dca12000 task=dc837230 task.ti=d)
[ 780.361658] Stack:
[ 780.361658] Call Trace:
[ 780.361658] Code: 90 b8 43 64 03 c1 b9 40 64 03 c1 e9 49 ff ff ff 90 55 ba 0
[ 808.356372] BUG: soft lockup - CPU#0 stuck for 22s!

That's probably not a deadlock. That's code stuck in an infinite loop, probably while running in a non-interruptible state. Too bad we didn't get a stack dump out of it, that would tell us what code is hung in a loop.

For debugging deadlocks, turning on CONFIG_PROVE_LOCKING=y in the .config is the best bet - that will fire an alert not only when the kernel *does* lock up, but also if there's even a *possible* deadlock (for instance, if one section takes 2 locks in the order A B, it will trigger if it ever spots another chunk of code taking B and then A - even if that doesn't actually trigger a deadlock because neither lock is held at the time).

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies at kernelnewbies.org
http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies
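The ordering rule described above (A then B in one place, B then A in another) can be illustrated with a small user-space toy. This is only a sketch of the idea, not how the kernel's lock validator is implemented. The names toy_reset, toy_acquire, toy_release and MAX_LOCKS are made up for this example, and locks are represented by small integers rather than real lock objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8  /* hypothetical limit for this toy */

/* order_seen[a][b] is true once lock b has been taken while a was held. */
static bool order_seen[MAX_LOCKS][MAX_LOCKS];
static int held[MAX_LOCKS];  /* locks currently held, in no particular order */
static int held_count;

/* Forget all recorded orderings and held locks. */
void toy_reset(void)
{
    memset(order_seen, 0, sizeof(order_seen));
    held_count = 0;
}

/*
 * Record that `lock` is being taken. Returns true if this contradicts an
 * ordering recorded earlier, i.e. a possible deadlock, even though nothing
 * is actually blocked right now.
 */
bool toy_acquire(int lock)
{
    bool inversion = false;

    assert(lock >= 0 && lock < MAX_LOCKS && held_count < MAX_LOCKS);

    for (int i = 0; i < held_count; i++) {
        int h = held[i];

        /* Earlier we saw `lock` then `h`; now we see `h` then `lock`. */
        if (order_seen[lock][h])
            inversion = true;
        order_seen[h][lock] = true;
    }
    held[held_count++] = lock;
    return inversion;
}

/* Drop `lock` from the held set; recorded orderings are kept. */
void toy_release(int lock)
{
    for (int i = 0; i < held_count; i++) {
        if (held[i] == lock) {
            held[i] = held[--held_count];
            return;
        }
    }
}
```

Taking lock 0 and then lock 1, releasing both, and later taking lock 1 and then lock 0 makes the last toy_acquire call report an inversion, even though no lock was contended at any point. The toy only compares direct pairs of locks; longer chains (A before B, B before C, C before A) are outside its scope.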
-- Sankar P http://psankar.blogspot.com