Commit 768adee
scsi: sg: Fix occasional bogus elapsed time that exceeds timeout
[ Upstream commit 0e1677654259a2f3ccf728de1edde922a3c4ba57 ]
A race condition was found in sg_proc_debug_helper(). It was observed on
a system using an IBM LTO-9 SAS Tape Drive (ULTRIUM-TD9) and monitoring
/proc/scsi/sg/debug every second. A very large elapsed time would
sometimes appear. This is caused by two race conditions.
We reproduced the issue with an IBM ULTRIUM-HH9 tape drive on an x86_64
architecture. A patched kernel was built, and the race condition could
not be observed anymore after the application of this patch. A
reproducer C program utilising the scsi_debug module was also built by
Changhui Zhong and can be viewed here:
https://github.com/MichaelRabek/linux-tests/blob/master/drivers/scsi/sg/sg_race_trigger.c
The first race happens between the reading of hp->duration in
sg_proc_debug_helper() and request completion in sg_rq_end_io(). The
hp->duration member variable may hold either of two types of
information:
#1 - The start time of the request. This value is present while
the request is not yet finished.
#2 - The total execution time of the request (end_time - start_time).
If sg_proc_debug_helper() executes *after* the value of hp->duration was
changed from #1 to #2, but *before* srp->done is set to 1 in
sg_rq_end_io(), a fresh timestamp is taken in the else branch, and the
elapsed time (value type #2) is subtracted from a timestamp, which
cannot yield a valid elapsed time (which is a type #2 value as well).
To fix this issue, the value of hp->duration must change under the
protection of the sfp->rq_list_lock in sg_rq_end_io(). Since
sg_proc_debug_helper() takes this read lock, the change to srp->done and
srp->header.duration will happen atomically from the perspective of
sg_proc_debug_helper() and the race condition is thus eliminated.
The second race condition happens between sg_proc_debug_helper() and
sg_new_write(). Even though hp->duration is set to the current time
stamp in sg_add_request() under the write lock's protection, it gets
overwritten by a call to get_sg_io_hdr(), which calls copy_from_user()
to copy struct sg_io_hdr from userspace into kernel space. hp->duration
is set to the start time again in sg_common_write(). If
sg_proc_debug_helper() is called between these two calls, an arbitrary
value set by userspace (usually zero) is used to compute the elapsed
time.
To fix this issue, hp->duration must be set to the current timestamp
again after get_sg_io_hdr() returns successfully. A small race window
still exists between get_sg_io_hdr() and setting hp->duration, but this
window is only a few instructions wide and does not result in observable
issues in practice, as confirmed by testing.
Additionally, we fix the format specifier from %d to %u for printing
unsigned int values in sg_proc_debug_helper().
Signed-off-by: Michal Rábek <mrabek@redhat.com>
Suggested-by: Tomas Henzl <thenzl@redhat.com>
Tested-by: Changhui Zhong <czhong@redhat.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Reviewed-by: John Meneghini <jmeneghi@redhat.com>
Reviewed-by: Tomas Henzl <thenzl@redhat.com>
Link: https://patch.msgid.link/20251212160900.64924-1-mrabek@redhat.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Ulrich Hecht <uli@kernel.org>1 parent a476164 commit 768adee
1 file changed
Lines changed: 13 additions & 7 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
741 | 741 | | |
742 | 742 | | |
743 | 743 | | |
| 744 | + | |
| 745 | + | |
744 | 746 | | |
745 | 747 | | |
746 | 748 | | |
| |||
830 | 832 | | |
831 | 833 | | |
832 | 834 | | |
833 | | - | |
834 | 835 | | |
835 | 836 | | |
836 | 837 | | |
| |||
1345 | 1346 | | |
1346 | 1347 | | |
1347 | 1348 | | |
1348 | | - | |
1349 | | - | |
1350 | | - | |
1351 | 1349 | | |
1352 | 1350 | | |
1353 | 1351 | | |
| |||
1397 | 1395 | | |
1398 | 1396 | | |
1399 | 1397 | | |
| 1398 | + | |
| 1399 | + | |
| 1400 | + | |
1400 | 1401 | | |
1401 | 1402 | | |
1402 | 1403 | | |
| |||
2546 | 2547 | | |
2547 | 2548 | | |
2548 | 2549 | | |
| 2550 | + | |
2549 | 2551 | | |
2550 | 2552 | | |
2551 | 2553 | | |
| |||
2584 | 2586 | | |
2585 | 2587 | | |
2586 | 2588 | | |
2587 | | - | |
| 2589 | + | |
2588 | 2590 | | |
2589 | 2591 | | |
2590 | | - | |
| 2592 | + | |
| 2593 | + | |
| 2594 | + | |
| 2595 | + | |
| 2596 | + | |
2591 | 2597 | | |
2592 | 2598 | | |
2593 | | - | |
| 2599 | + | |
2594 | 2600 | | |
2595 | 2601 | | |
2596 | 2602 | | |
| |||
0 commit comments