scsi: ipr: Fix softlockup when rescanning devices in petitboot

When trying to rescan disks in the petitboot shell, we hit the following
softlockup stack trace:

Kernel panic - not syncing: System is deadlocked on memory
[ 241.223394] CPU: 32 PID: 693 Comm: sh Not tainted 5.4.16-openpower1 #1
[ 241.223406] Call Trace:
[ 241.223415] [c0000003f07c3180] [c000000000493fc4] dump_stack+0xa4/0xd8 (unreliable)
[ 241.223432] [c0000003f07c31c0] [c00000000007d4ac] panic+0x148/0x3cc
[ 241.223446] [c0000003f07c3260] [c000000000114b10] out_of_memory+0x468/0x4c4
[ 241.223461] [c0000003f07c3300] [c0000000001472b0] __alloc_pages_slowpath+0x594/0x6d8
[ 241.223476] [c0000003f07c3420] [c00000000014757c] __alloc_pages_nodemask+0x188/0x1a4
[ 241.223492] [c0000003f07c34a0] [c000000000153e10] alloc_pages_current+0xcc/0xd8
[ 241.223508] [c0000003f07c34e0] [c0000000001577ac] alloc_slab_page+0x30/0x98
[ 241.223524] [c0000003f07c3520] [c0000000001597fc] new_slab+0x138/0x40c
[ 241.223538] [c0000003f07c35f0] [c00000000015b204] ___slab_alloc+0x1e4/0x404
[ 241.223552] [c0000003f07c36c0] [c00000000015b450] __slab_alloc+0x2c/0x48
[ 241.223566] [c0000003f07c36f0] [c00000000015b754] kmem_cache_alloc_node+0x9c/0x1b4
[ 241.223582] [c0000003f07c3760] [c000000000218c48] blk_alloc_queue_node+0x34/0x270
[ 241.223599] [c0000003f07c37b0] [c000000000226574] blk_mq_init_queue+0x2c/0x78
[ 241.223615] [c0000003f07c37e0] [c0000000002ff710] scsi_mq_alloc_queue+0x28/0x70
[ 241.223631] [c0000003f07c3810] [c0000000003005b8] scsi_alloc_sdev+0x184/0x264
[ 241.223647] [c0000003f07c38a0] [c000000000300ba0] scsi_probe_and_add_lun+0x288/0xa3c
[ 241.223663] [c0000003f07c3a00] [c000000000301768] __scsi_scan_target+0xcc/0x478
[ 241.223679] [c0000003f07c3b20] [c000000000301c64] scsi_scan_channel.part.9+0x74/0x7c
[ 241.223696] [c0000003f07c3b70] [c000000000301df4] scsi_scan_host_selected+0xe0/0x158
[ 241.223712] [c0000003f07c3bd0] [c000000000303f04] store_scan+0x104/0x114
[ 241.223727] [c0000003f07c3cb0] [c0000000002d5ac4] dev_attr_store+0x30/0x4c
[ 241.223741] [c0000003f07c3cd0] [c0000000001dbc34] sysfs_kf_write+0x64/0x78
[ 241.223756] [c0000003f07c3cf0] [c0000000001da858] kernfs_fop_write+0x170/0x1b8
[ 241.223773] [c0000003f07c3d40] [c0000000001621fc] __vfs_write+0x34/0x60
[ 241.223787] [c0000003f07c3d60] [c000000000163c2c] vfs_write+0xa8/0xcc
[ 241.223802] [c0000003f07c3db0] [c000000000163df4] ksys_write+0x70/0xbc
[ 241.223816] [c0000003f07c3e20] [c00000000000b40c] system_call+0x5c/0x68

As part of the scan process, Linux allocates and configures a
scsi_device for each target to be scanned. If the device is not present,
the scsi_device is torn down again. Each scsi_device teardown schedules a
workqueue item, and the lockups we see occur because there are 250k
workqueue items to be processed. According to the SIS-64 SAS controller
specification, max_channel should be decreased to 4 on SIS-64 adapters.

This patch fixes the softlockup issue.
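
For illustration only, the effect of lowering max_channel on the amount of
scan work can be estimated with a small helper. This is a sketch, not driver
code: scan_probe_count() is a hypothetical name, and it assumes a wildcard
host scan visits channels 0..max_channel and target ids 0..max_id - 1, with
one scsi_device allocated (and torn down if absent) per visited pair.

```c
/*
 * Hypothetical helper, not part of the ipr driver.
 *
 * Assumption: a wildcard scan walks channel = 0..max_channel (inclusive)
 * and id = 0..max_id - 1, probing one scsi_device per (channel, id) pair.
 * Every probe of an absent device ends in a teardown, which queues one
 * workqueue item.
 */
unsigned long scan_probe_count(unsigned int max_channel, unsigned int max_id)
{
	/* max_channel is inclusive, so there are max_channel + 1 channels. */
	return ((unsigned long)max_channel + 1) * max_id;
}
```

With assumed limits of max_channel = 255 and max_id = 1024 (values picked
only because they land near the 250k figure, not taken from the driver),
scan_probe_count(255, 1024) is 262144. With max_channel = 4 and the same
assumed max_id, scan_probe_count(4, 1024) is 5120.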

Thanks to Oliver Halloran for his help with debugging and explanation!

Link: https://lore.kernel.org/r/1583510248-23672-1-git-send-email-wenxiong@linux.vnet.ibm.com
Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>