Top Banner
printk() considered harmful KS/LPC tech topic November 2016 Sergey Senozhatsky Samsung Electronics
24

printk() considered harmful

Jan 26, 2017

Download

Software

Welcome message from author
This document is posted to help you gain knowledge. Please leave a comment to let me know what you think about it! Share it to your friends and learn new things together.
Transcript
Page 1: printk() considered harmful

printk() considered harmful

KS/LPC tech topicNovember 2016

Sergey SenozhatskySamsung Electronics

Page 2: printk() considered harmful

Table of contents 1. Deadlocks and recursion in printk()2. Async printk()3. pr_cont() *4. Console semaphore *

* bonus topics

Page 3: printk() considered harmful

1. Deadlocks in printk()

● There are several scenarios that can cause deadlocks in printk()● For instance:

○ Re-entrant printk() [resolved]■ Per-cpu buffers for safe printk() from NMI context

○ Recursions in printk() [unresolved]■ A proposal [RFC]

Page 4: printk() considered harmful

1. Deadlocks in printk()

● An extension of NMI-printk() idea [1]○ Use additional per-cpu buffers to handle printk() recursion○ Mark unsafe regions with printk_safe_enter()/exit() calls, that

set special per-cpu flags○ printk()->vprintk_funct() then checks those per-cpu flags○ And “redirects” unsafe printk() call to one of the safe

callbacks:■ vprintk_safe_nmi()■ vprintk_safe()

Page 5: printk() considered harmful

1. Deadlocks in printk()

● This handles printk() recursion only:○ printk() -> /* anything that calls printk() again */ -> printk()

● But what about printk() deadlocks...

Page 6: printk() considered harmful

1. Deadlocks in printk()SyS_ioctl tty_ioctl tty_mode_ioctl uart_set_termios uart_change_speed FOO_serial_set_termios spin_lock_irqsave(&port->lock) // lock the output port /* WARN_ON() / BUG_ON() / printk() */ vprintk_emit() console_unlock() call_console_drivers() FOO_write() spin_lock_irqsave(&port->lock) // deadlock

Page 7: printk() considered harmful

1. Deadlocks in printk()

● The fundamental issue is that printk() dependson two different lock groups - internal and external locks

● We can handle internal locks with printk_safe● External locks are out of control: scheduler locks, clock locks, etc.

● printk_deferred(), basically, tells printk() to acquire ONLYinternal locks (logbuf spin lock), avoiding any external locks

Page 8: printk() considered harmful

1. Deadlocks in printk()

● A proposal:○ Make printk() behave like printk_deferred()○ IOW, avoid any external locks in printk() if we are in atomic,

queueing irq_work for printing○ So we can eventually remove printk_deferred(), which is

always soooo hard

Page 9: printk() considered harmful

1. Deadlocks in printk()

● Thoughts/questions?

Page 10: printk() considered harmful

2. Async printk()

● Printing CPU does an “endless” loop in console_unlock()● console_unlock() is not always preemptible● Things get worse once you have a slow serial console

attached

Page 11: printk() considered harmful

2. Async printk()

● You never know how long will it take to printk() a message● printk() can spin lockup, soft/hard lockup the system, etc.● It’s not always safe to call printk() under spin_lock, RCU lock, etc.

○ It, thus, can provoke memory pressure, etc.● It’s not always safe to call printk() from IRQ● It’s not always safe to call printk_deferred()

Page 12: printk() considered harmful

2. Async printk()

● A proposal (started by Jan Kara) [2]:○ Offload printing to a special printk kthread○ printk() stores the message and queue irq_work

/* or wake_up() the printk kthread if the proposal from #1 won’t fly */

○ wake_up_klogd_work_func() wakes up the printkkthread instead of doing console_unlock() from IRQ

Page 13: printk() considered harmful

2. Async printk()

● A proposal (continue):○ So there is no time-unbound console_unlock() possibly

standing behind printk() and deferred_printk()○ The printk kthread is always preemptible○ We switch back to sync printk mode when in panic.

● printk() already depends on scheduler (wake_up() in semaphore)● With printk_safe we also minimize the impact of wake_up()

○ Recursive printk() from wake_up()

Page 14: printk() considered harmful

2. Async printk()

● Thoughts/questions?

Page 15: printk() considered harmful

3. pr_cont()

● pr_cont() is NOT SMP safe

● “That said, we should strive to avoid the need for KERN_CONT. It does result inreal problems for logging, and should generally not be seen as a good feature.If we someday can get rid of the feature entirely, because nobody does anyfragmented printk calls, that would be lovely.”Linus Torvalds

Page 16: printk() considered harmful

3. pr_cont()

● pr_cont() is not going to die

version git grep “pr_cont(“ git grep “KERN_CONT”

tags/v3.8 761 547

tags/v4.5 1123 515

tags/v4.9-rc2 1194 501

Page 17: printk() considered harmful

3. pr_cont()

● Proposals:○ Make pr_cont buffer per-thread [hardly possible]○ Use fake pr_cont buffers [Petr Mladek posted a PoC patch set [3]]

○ Add a new cont API that will use a temp kmalloc()-d or per-CPULOG_LINE_MAX buffer [an extended version of [4]]

■ kmalloc() might not always work. OOM pr_cont(), soper-CPU buffers look a bit better

■ Stop caring about pr_cont() in the meantime■ Probably add pr_cont() usage WARN to checkpatch.pl

Page 18: printk() considered harmful

3. pr_cont()

● Thoughts/questions?

Page 19: printk() considered harmful

4. Console semaphore

● Console driver list is protected by a single console_sem semaphore● Not every console_lock() caller:

○ Modifies the consoles list (read-only access)○ Is a kernel thread (user-space can access that list in syscalls)○ Want to print the messages from console_unlock()○ Want to wait in TASK_UNINTERRUPTIBLE for current console_sem owner

■ Which possibly can be in console_unlock() printing loop

● Example: ○ cat /proc/consoles does console_lock()/console_unlock()

Page 20: printk() considered harmful

4. Console semaphore

● Do we want to replace console sem with a READ/WRITE lock?○ So processes that don’t modify the list can acquire console_sem

in read mode

● Direct console_unlock() callers will still do that “endless”printing loop

● [flashback]○ In async printk() we wake_up() printk kthread from wake_up_klogd_work_func()

Page 21: printk() considered harmful

4. Console semaphore

● Do we want to split console_unlock() function into○ async_flush_console_unlock() function

■ Basically, just unlock console_sem and wake_up()printk kthread

○ sync_flush_console_unlock() function■ Flush the logbuf and then unlock console_sem■ There are paths that probably want to flush logbuf

from console_unlock()

Page 22: printk() considered harmful

4. Console semaphore

● Thoughts/questions?

Page 23: printk() considered harmful

Thank you.