The logs below show example Linux kernel crashes (RCU stalls in the edac-poller workqueue) caused by the EDAC driver in several scenarios.
1) When the CPU is in sleep mode:
PMUFW: PmProcTrSleepToActive: SLEEP->ACTIVE NODE_APU_0
PMUFW: PmPowerRequestParent: NODE_APU_0->NODE_APU
[ 149.889952] INFO: rcu_sched self-detected stall on CPU
[ 149.895015] 3-...: (5249 ticks this GP) idle=d59/140000000000001/0 softirq=649/649 fqs=2625
[ 149.903523] (t=5250 jiffies g=182 c=181 q=46)
[ 149.908029] Task dump for CPU 3:
[ 149.911239] kworker/u8:1 R running task 0 28 2 0x00000002
[ 149.918278] Workqueue: edac-poller edac_device_workq_function
[ 149.923995] Call trace:
[ 149.926432] [<ffffff80080881a8>] dump_backtrace+0x0/0x1a8
[ 149.931812] [<ffffff8008088364>] show_stack+0x14/0x20
[ 149.936845] [<ffffff80080c0d44>] sched_show_task+0x94/0xf0
[ 149.942312] [<ffffff80080c2ee0>] dump_cpu_task+0x40/0x50
[ 149.947608] [<ffffff800812f458>] rcu_dump_cpu_stacks+0xb4/0xe8
[ 149.953422] [<ffffff80080e9000>] rcu_check_callbacks+0x668/0x838
[ 149.959411] [<ffffff80080ec6c4>] update_process_times+0x34/0x60
[ 149.965311] [<ffffff80080fbaf4>] tick_sched_handle.isra.4+0x3c/0x50
[ 149.971559] [<ffffff80080fbb4c>] tick_sched_timer+0x44/0x90
[ 149.977115] [<ffffff80080ed1c8>] __hrtimer_run_queues+0xf0/0x178
[ 149.983103] [<ffffff80080ed558>] hrtimer_interrupt+0x98/0x1c8
[ 149.988832] [<ffffff80086aac38>] arch_timer_handler_phys+0x30/0x40
[ 149.994994] [<ffffff80080dfa00>] handle_percpu_devid_irq+0x78/0x128
[ 150.001242] [<ffffff80080da74c>] generic_handle_irq+0x24/0x38
[ 150.006968] [<ffffff80080dadd4>] __handle_domain_irq+0x5c/0xb8
[ 150.012782] [<ffffff80080814cc>] gic_handle_irq+0x64/0xc0
[ 150.018163] Exception stack(0xffffffc87ba37b90 to 0xffffffc87ba37cc0)
[ 150.024586] 7b80: 0000000000000002 ffffff8008681ce8
[ 150.032404] 7ba0: 0000000000000000 0000000000000001 ffffffc87ffb8598 0000000000000000
[ 150.040215] 7bc0: ffffffc87ffb8580 0000000000000000 ffffffc87ba33b60 ffffffc87ba34000
[ 150.048026] 7be0: 0000000000000780 0000000000000000 0000000000000bbc ffffffc87ae98d00
[ 150.055836] 7c00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 150.063647] 7c20: 0000000000000000 0000000000000040 0000000000000002 ffffff8008681ce8
[ 150.071458] 7c40: 000000000000000f ffffff8009412590 ffffff8009387000 ffffff8009387000
[ 150.079269] 7c60: ffffffc87ba37d48 ffffffc87b87ec78 ffffffc87b87eea8 ffffffc87ba37cc0
[ 150.087080] 7c80: ffffff8008681f20 ffffffc87ba37cc0 ffffff8008100408 0000000020000145
[ 150.094890] 7ca0: ffffffc87ba37cc0 ffffff8008100430 ffffffffffffffff 0000000000000001
[ 150.102701] [<ffffff80080827b0>] el1_irq+0xb0/0x140
[ 150.107555] [<ffffff8008100408>] smp_call_function_single+0x88/0x128
[ 150.113891] [<ffffff8008681f20>] cortex_arm64_edac_check+0x80/0xd8
[ 150.120053] [<ffffff800867e0f8>] edac_device_workq_function+0x78/0xc0
[ 150.126475] [<ffffff80080b0da0>] process_one_work+0x120/0x380
[ 150.132201] [<ffffff80080b1048>] worker_thread+0x48/0x4b0
[ 150.137582] [<ffffff80080b6b18>] kthread+0xd0/0xe8
[ 150.142355] [<ffffff8008082e80>] ret_from_fork+0x10/0x50
[ 150.147648] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 150.153117] 3-...: (5251 ticks this GP) idle=d59/140000000000000/0 softirq=649/649 fqs=2626
[ 150.161627] (detected by 2, t=5318 jiffies, g=182, c=181, q=48)
[ 150.167607] Task dump for CPU 3:
[ 150.170818] kworker/u8:1 R running task 0 28 2 0x00000002
[ 150.177852] Workqueue: edac-poller edac_device_workq_function
[ 150.183573] Call trace:
[ 150.186009] [<ffffff8008085360>] __switch_to+0x90/0xa8
[ 150.191129] [<ffffffc87b87ec00>] 0xffffffc87b87ec00
2) During kernel boot (the system locks up after a few seconds or minutes, in about 50% of boots):
PetaLinux 2017.1 plnx_aarch64 /dev/ttyPS0
plnx_aarch64 login: root
Password:
root@plnx_aarch64:~# [ 80.027055] INFO: rcu_sched self-detected stall on CPU
[ 80.032108] 0-...: (5249 ticks this GP) idle=6a1/140000000000001/0 softirq=1297/1297 fqs=2625
[ 80.035054] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 80.035060] 0-...: (5249 ticks this GP) idle=6a1/140000000000001/0 softirq=1297/1297 fqs=2625
[ 80.035064] (detected by 2, t=5252 jiffies, g=95, c=94, q=2)
[ 80.035065] Task dump for CPU 0:
[ 80.035070] kworker/u8:1 R running task 0 28 2 0x00000002
[ 80.035082] Workqueue: edac-poller edac_device_workq_function
[ 80.035084] Call trace:
[ 80.035090] [<ffffff80080852c4>] __switch_to+0x8c/0xa0
[ 80.035094] [<ffffffc07586ec00>] 0xffffffc07586ec00
[ 80.100425] (t=5268 jiffies g=95 c=94 q=2)
[ 80.106561] Task dump for CPU 0:
[ 80.111645] kworker/u8:1 R running task 0 28 2 0x00000002
[ 80.120575] Workqueue: edac-poller edac_device_workq_function
[ 80.128221] Call trace:
[ 80.132629] [<ffffff80080880f0>] dump_backtrace+0x0/0x198
[ 80.140047] [<ffffff800808829c>] show_stack+0x14/0x20
[ 80.147140] [<ffffff80080bf064>] sched_show_task+0x94/0xf0
[ 80.154685] [<ffffff80080c0f18>] dump_cpu_task+0x40/0x50
[ 80.162077] [<ffffff800812a7bc>] rcu_dump_cpu_stacks+0xb4/0xe8
[ 80.169978] [<ffffff80080e4c3c>] rcu_check_callbacks+0x67c/0x860
[ 80.178015] [<ffffff80080e7d0c>] update_process_times+0x34/0x60
[ 80.185978] [<ffffff80080f67f0>] tick_sched_handle.isra.4+0x38/0x48
[ 80.194281] [<ffffff80080f6844>] tick_sched_timer+0x44/0x90
[ 80.201865] [<ffffff80080e8670>] __hrtimer_run_queues+0xf0/0x178
[ 80.209870] [<ffffff80080e8a00>] hrtimer_interrupt+0x98/0x1c8
[ 80.217594] [<ffffff80086ae038>] arch_timer_handler_phys+0x30/0x40
[ 80.225733] [<ffffff80080dbe00>] handle_percpu_devid_irq+0x78/0x128
[ 80.233939] [<ffffff80080d6b24>] generic_handle_irq+0x24/0x38
[ 80.241612] [<ffffff80080d7184>] __handle_domain_irq+0x5c/0xb8
[ 80.249352] [<ffffff80080814cc>] gic_handle_irq+0x64/0xc0
[ 80.256633] Exception stack(0xffffffc075a47b90 to 0xffffffc075a47cc0)
[ 80.264967] 7b80: 0000000000000000 ffffff8008686400
[ 80.274692] 7ba0: 0000000000000000 0000000000000001 ffffffc077f72698 ffffffc077f72680
[ 80.284408] 7bc0: 0000000000000000 0000000000000000 ffffffc075a43b60 ffffffc075a44000
[ 80.294109] 7be0: 0000000000000780 0000000000000000 ffffffc074f6c800 0000000000000000
[ 80.303812] 7c00: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 80.313480] 7c20: 0000000000000000 0000000000000000 0000000000000000 ffffff8008686400
[ 80.323104] 7c40: 0000000000000000 ffffff8008cf1488 ffffff8008c67000 ffffff8008c67000
[ 80.332744] 7c60: ffffffc075a47d48 ffffffc07586ec78 ffffffc07586eea8 ffffffc075a47cc0
[ 80.342384] 7c80: ffffff8008686630 ffffffc075a47cc0 ffffff80080fb110 0000000060000145
[ 80.352017] 7ca0: ffffffc075a47d40 ffffff800890dac4 ffffffffffffffff ffffff8008c39000
[ 80.361647] [<ffffff80080827b0>] el1_irq+0xb0/0x140
[ 80.368315] [<ffffff80080fb110>] smp_call_function_single+0x88/0x128
[ 80.376458] [<ffffff8008686630>] cortex_arm64_edac_check+0x78/0xd0
[ 80.384428] [<ffffff8008682848>] edac_device_workq_function+0x78/0xc0
[ 80.392663] [<ffffff80080af654>] process_one_work+0x1bc/0x380
[ 80.400182] [<ffffff80080af860>] worker_thread+0x48/0x4a8
[ 80.407304] [<ffffff80080b5284>] kthread+0xd4/0xe8
[ 80.413780] [<ffffff8008082e80>] ret_from_fork+0x10/0x50
3) While configuring an IP address on the target:
4) During shutdown in Linux:
To work around this issue, disable the CONFIG_EDAC_CORTEX_ARM64 driver in the kernel configuration and rebuild the kernel.
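One way to confirm that the workaround took effect is to inspect the kernel configuration of the rebuilt image. The following is a minimal sketch, not part of the original answer record: the helper names `read_config` and `option_state` are hypothetical, and it assumes the configuration is available either as a plain-text `.config` file or as a gzip-compressed copy such as `/proc/config.gz` (which exists only when the kernel was built with CONFIG_IKCONFIG_PROC).

```python
import gzip
import re

# Option named in the workaround above.
OPTION = "CONFIG_EDAC_CORTEX_ARM64"


def read_config(path):
    """Return the text of a kernel config file (plain text or .gz)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return handle.read()


def option_state(config_text, option=OPTION):
    """Return the option's value ('y' or 'm'), or 'n' if it is disabled.

    A disabled option is either absent from the file or listed as
    '# CONFIG_... is not set'; both cases are reported as 'n'.
    """
    pattern = re.compile(r"^%s=(\S+)$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else "n"


# Example use on the target (path is an assumption, see above):
#   state = option_state(read_config("/proc/config.gz"))
#   print(OPTION, "is", "disabled" if state == "n" else "still enabled")
```

If `option_state` returns anything other than `n`, the driver is still built into the kernel (or built as a module) and the edac-poller workqueue seen in the traces above can still run.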
AR# 69433 | |
---|---|
Date | 07/10/2017 |
Status | Active |
Type | General Article |
Devices | |
Tools | |
Boards & Kits | |