A number of hangs have been reported against the target driver; they are
due to the fact that multiple threads may try to destroy the iscsi session
at the same time. This may be reproduced for example when a "targetcli
iscsi/iqn.../tpg1 disable" command is executed while a logout operation is
underway.
When this happens, two or more threads may end up sleeping and waiting for
iscsit_close_connection() to execute "complete(session_wait_comp)". Only
one of the threads will wake up and proceed to destroy the session
structure, the remaining threads will hang forever.
Note that if the blocked threads are somehow forced to wake up with
complete_all(), they will try to free the same iscsi session structure
destroyed by the first thread, causing double frees, memory corruptions
etc...
With this patch, the threads that want to destroy the iscsi session will
increase the session refcount and will set the "session_close" flag to 1;
then they wait for the driver to close the remaining active connections.
When the last connection is closed, iscsit_close_connection() will wake up
all the threads and will wait for the session's refcount to reach zero;
when this happens, iscsit_close_connection() will destroy the session
structure because no one is referencing it anymore.
INFO: task targetcli:5971 blocked for more than 120 seconds.
Tainted: P OE 4.15.0-72-generic #81~16.04.1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
targetcli D 0 5971 1 0x00000080
Call Trace:
__schedule+0x3d6/0x8b0
? vprintk_func+0x44/0xe0
schedule+0x36/0x80
schedule_timeout+0x1db/0x370
? __dynamic_pr_debug+0x8a/0xb0
wait_for_completion+0xb4/0x140
? wake_up_q+0x70/0x70
iscsit_free_session+0x13d/0x1a0 [iscsi_target_mod]
iscsit_release_sessions_for_tpg+0x16b/0x1e0 [iscsi_target_mod]
iscsit_tpg_disable_portal_group+0xca/0x1c0 [iscsi_target_mod]
lio_target_tpg_enable_store+0x66/0xe0 [iscsi_target_mod]
configfs_write_file+0xb9/0x120
__vfs_write+0x1b/0x40
vfs_write+0xb8/0x1b0
SyS_write+0x5c/0xe0
do_syscall_64+0x73/0x130
entry_SYSCALL_64_after_hwframe+0x3d/0xa2
The default resource group ("rdtgroup_default") is associated with the
root of the resctrl filesystem and should never be removed. New resource
groups can be created as subdirectories of the resctrl filesystem and
they can be removed from user space.
There exists a safeguard in the directory removal code
(rdtgroup_rmdir()) that ensures that only subdirectories can be removed
by testing that the directory to be removed has to be a child of the
root directory.
A possible deadlock was recently fixed with
334b0f4e9b1b ("x86/resctrl: Fix a deadlock due to inaccurate reference").
This fix involved associating the private data of the "mon_groups"
and "mon_data" directories to the resource group to which they belong
instead of NULL as before. A consequence of this change was that
the original safeguard code preventing removal of "mon_groups" and
"mon_data" found in the root directory failed resulting in attempts to
remove the default resource group that ends in a BUG:
kernel BUG at mm/slub.c:3969!
invalid opcode: 0000 [#1] SMP PTI
Fix this by improving the directory removal safeguard to ensure that
subdirectories of the resctrl root directory can only be removed if they
are a child of the resctrl filesystem's root _and_ not associated with
the default resource group.
Fixes: 334b0f4e9b1b ("x86/resctrl: Fix a deadlock due to inaccurate reference")
Reported-by: Sai Praneeth Prakhya <email address hidden>
Signed-off-by: Reinette Chatre <email address hidden>
Signed-off-by: Borislav Petkov <email address hidden>
Tested-by: Sai Praneeth Prakhya <email address hidden>
Cc: <email address hidden>
Link: https://lkml.kernel.org/r/884cbe1773496b5dbec1b6bd11bb50cffa83603d<email address hidden>
Signed-off-by: Greg Kroah-Hartman <email address hidden>
The ti_sci_inta_irq_handler() does not take into account INTA IRQs state
(masked/unmasked) as it uses INTA_STATUS_CLEAR_j register to get INTA IRQs
status, which provides raw status value.
This causes hard IRQ handlers to be called or threaded handlers to be
scheduled many times even if corresponding INTA IRQ is masked.
Above, first of all, affects the LEVEL interrupts processing and causes
unexpected behavior up the system stack or crash.
Fix it by using the Interrupt Masked Status INTA_STATUSM_j register which
provides masked INTA IRQs status.
We do not want to create initialized extents beyond end of file because
for e2fsck it is impossible to distinguish them from a case of corrupted
file size / extent tree and so it complains like:
Inode 12, i_size is 147456, should be 163840. Fix? no
Code in ext4_ext_convert_to_initialized() and
ext4_split_convert_extents() try to make sure it does not create
initialized extents beyond inode size however they check against
inode->i_size which is wrong. They should instead check against
EXT4_I(inode)->i_disksize which is the current inode size on disk.
That's what e2fsck is going to see in case of crash before all dirty
data is written. This bug manifests as generic/456 test failure (with
recent enough fstests where fsx got fixed to properly pass
FALLOC_KEEP_SIZE_FL flags to the kernel) when run with dioread_lock
mount option.
CC: <email address hidden>
Fixes: 21ca087a3891 ("ext4: Do not zero out uninitialized extents beyond i_size")
Reviewed-by: Lukas Czerner <email address hidden>
Signed-off-by: Jan Kara <email address hidden>
Signed-off-by: Theodore Ts'o <email address hidden>
Link: https://<email address hidden>
Signed-off-by: Theodore Ts'o <email address hidden>
Signed-off-by: Greg Kroah-Hartman <email address hidden>
a6a517a...
by
Ashutosh Dixit <email address hidden>
drm/i915/perf: Do not clear pollin for small user read buffers
It is wrong to block the user thread in the next poll when OA data is
already available which could not fit in the user buffer provided in
the previous read. In several cases the exact user buffer size is not
known. Blocking user space in poll can lead to data loss when the
buffer size used is smaller than the available data.
This change fixes this issue and allows user space to read all OA data
even when using a buffer size smaller than the available data using
multiple non-blocking reads rather than staying blocked in poll till
the next timer interrupt.
v2: Fix ret value for blocking reads (Umesh)
v3: Mistake during patch send (Ashutosh)
v4: Remove -EAGAIN from comment (Umesh)
v5: Improve condition for clearing pollin and return (Lionel)
v6: Improve blocking read loop and other cleanups (Lionel)
v7: Added Cc stable
We already set DPM_FLAG_SMART_PREPARE, so we completely skip all
callbacks (other then prepare) where possible, quoting from
dw_i2c_plat_prepare():
/*
* If the ACPI companion device object is present for this device, it
* may be accessed during suspend and resume of other devices via I2C
* operation regions, so tell the PM core and middle layers to avoid
* skipping system suspend/resume callbacks for it in that case.
*/
return !has_acpi_companion(dev);
Also setting the DPM_FLAG_SMART_SUSPEND will cause acpi_subsys_suspend()
to leave the controller runtime-suspended even if dw_i2c_plat_prepare()
returned 0.
Leaving the controller runtime-suspended normally, when the I2C controller
is suspended during the suspend_late phase, is not an issue because
the pm_runtime_get_sync() done by i2c_dw_xfer() will (runtime-)resume it.
But for dw I2C controllers on Bay- and Cherry-Trail devices acpi_lpss.c
leaves the controller alive until the suspend_noirq phase, because it may
be used by the _PS3 ACPI methods of PCI devices and PCI devices are left
powered on until the suspend_noirq phase.
Between the suspend_late and resume_early phases runtime-pm is disabled.
So for any ACPI I2C OPRegion accesses done after the suspend_late phase,
the pm_runtime_get_sync() done by i2c_dw_xfer() is a no-op and the
controller is left runtime-suspended.
i2c_dw_xfer() has a check to catch this condition (rather then waiting
for the I2C transfer to timeout because the controller is suspended).
acpi_subsys_suspend() leaving the controller runtime-suspended in
combination with an ACPI I2C OPRegion access done after the suspend_late
phase triggers this check, leading to the following error being logged
on a Bay Trail based Lenovo Thinkpad 8 tablet:
So since on BYT and CHT platforms we want ACPI I2c OPRegion accesses
to work until the suspend_noirq phase, we need the controller to be
runtime-resumed during the suspend phase if it is runtime-suspended
suspended at that time. This means that we must not set the
DPM_FLAG_SMART_SUSPEND on these platforms.
On BYT and CHT we already have a special ACCESS_NO_IRQ_SUSPEND flag
to make sure the controller stays functional until the suspend_noirq
phase. This commit makes the driver not set the DPM_FLAG_SMART_SUSPEND
flag when that flag is set.
Cc: <email address hidden>
Fixes: b30f2f65568f ("i2c: designware: Set IRQF_NO_SUSPEND flag for all BYT and CHT controllers")
Signed-off-by: Hans de Goede <email address hidden>
Reviewed-by: Andy Shevchenko <email address hidden>
Acked-by: Rafael J. Wysocki <email address hidden>
Acked-by: Jarkko Nikula <email address hidden>
Signed-off-by: Wolfram Sang <email address hidden>
Signed-off-by: Greg Kroah-Hartman <email address hidden>