~ubuntu-support-team/binutils/+git/binutils-gdb:users/palves/step-over-thread-exit-v3

Last commit made on 2022-12-12
Get this branch:
git clone -b users/palves/step-over-thread-exit-v3 https://git.launchpad.net/~ubuntu-support-team/binutils/+git/binutils-gdb

Branch merges

Branch information

Name:
users/palves/step-over-thread-exit-v3
Repository:
lp:~ubuntu-support-team/binutils/+git/binutils-gdb

Recent commits

37dff3c... by Pedro Alves <email address hidden>

Cancel execution command on thread exit, when stepping, nexting, etc.

If your target has no support for TARGET_WAITKIND_NO_RESUMED events
(and no way to support them, such as the yet-unsubmitted AMDGPU
target), and you step over thread exit with scheduler-locking on, this
is what you get:

 (gdb) n
 [Thread ... exited]
 *hang*

Getting back the prompt by typing Ctrl-C may not even work, since no
inferior thread is running to receive the SIGINT. Even if it works,
it seems unnecessarily harsh. If you started an execution command for
which there's a clear thread of interest (step, next, until, etc.),
and that thread disappears, then I think it's more user friendly if
GDB just detects the situation and aborts the command, giving back the
prompt.

That is what this commit implements. It does this by explicitly
requesting the target to report thread exit events whenever the main
resumed thread has a thread_fsm. Note that unlike stepping over a
breakpoint, we don't need to enable clone events in this case.

With this patch, we get:

 (gdb) n
 [Thread 0x7ffff7d89700 (LWP 3961883) exited]
 Command aborted, thread exited.
 (gdb)

Change-Id: I901ab64c91d10830590b2dac217b5264635a2b95

daf510b... by Pedro Alves <email address hidden>

Centralize "[Thread ...exited]" notifications

Currently, each target backend is responsible for printing "[Thread
...exited]" before deleting a thread. This leads to unnecessary
differences between targets, like e.g. with the remote target, we
never print such messages, even though we do print "[New Thread ...]".

E.g., debugging the gdb.threads/attach-many-short-lived-threads.exp
with gdbserver, letting it run for a bit, and then pressing Ctrl-C, we
currently see:

 (gdb) c
 Continuing.
 ^C[New Thread 3850398.3887449]
 [New Thread 3850398.3887500]
 [New Thread 3850398.3887551]
 [New Thread 3850398.3887602]
 [New Thread 3850398.3887653]
 ...

 Thread 1 "attach-many-sho" received signal SIGINT, Interrupt.
 0x00007ffff7e6a23f in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7fffffffda80, rem=rem@entry=0x7fffffffda80)
     at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
 78 in ../sysdeps/unix/sysv/linux/clock_nanosleep.c
 (gdb)

Above, we only see "New Thread" notifications, even though threads
were deleted.

After this patch, we'll see:

 (gdb) c
 Continuing.
 ^C[Thread 3558643.3577053 exited]
 [Thread 3558643.3577104 exited]
 [Thread 3558643.3577155 exited]
 [Thread 3558643.3579603 exited]
 ...
 [New Thread 3558643.3597415]
 [New Thread 3558643.3600015]
 [New Thread 3558643.3599965]
 ...

 Thread 1 "attach-many-sho" received signal SIGINT, Interrupt.
 0x00007ffff7e6a23f in __GI___clock_nanosleep (clock_id=clock_id@entry=0, flags=flags@entry=0, req=req@entry=0x7fffffffda80, rem=rem@entry=0x7fffffffda80)
     at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:78
 78 in ../sysdeps/unix/sysv/linux/clock_nanosleep.c
 (gdb) q

This commit fixes this by moving the thread exit printing to common
code instead, triggered from within delete_thread (or rather,
set_thread_exited).

There's one wrinkle, though. While most targest want to print:

 [Thread ... exited]

the Windows target wants to print:

 [Thread ... exited with code <exit_code>]

... and sometimes wants to suppress the notification for the main
thread. To address that, this commits adds a delete_thread_with_code
function, only used by that target (so far).

Change-Id: I06ec07b7c51527872a9713dd11cf7867b50fc5ff

657a94e... by Pedro Alves <email address hidden>

inferior::clear_thread_list always silent

Now that the MI mi_thread_exit observer ignores "silent", there's no
difference between inferior::clean_thread_list with silent true or
false. And for the CLI, clean_thread_list() never printed anything
either. So we can eliminate the 'silent' parameter.

Change-Id: I90ff9f07537d22ea70986fbcfe8819cac7e06613

1bbadb3... by Pedro Alves <email address hidden>

Document remote clone events, and QThreadOptions packet

This commit documents in both manual and NEWS:

 - the new remote clone event stop reply,
 - the new QThreadOptions packet and its current defined options,
 - the associated "set/show remote thread-events-packet" command,
 - and the associated QThreadOptions qSupported feature.

Approved-By: Eli Zaretskii <email address hidden>
Change-Id: Ic1c8de1fefba95729bbd242969284265de42427e

15f070d... by Simon Marchi <email address hidden>

Testcases for stepping over thread exit syscall (PR gdb/27338)

- New in this series version:

 - gdb.threads/step-over-thread-exit.exp has a couple new testing axes:

   - non-stop vs all-stop.

   - in non-stop have the testcase explicitly stop all threads in
     non-stop mode, vs not doing that.

 - bail out of gdb.threads/step-over-thread-exit.exp early on FAIL, to
   avoid cascading timeouts.

Add new gdb.threads/step-over-thread-exit.exp and
gdb.threads/step-over-thread-exit-while-stop-all-threads.exp
testcases, exercising stepping over thread exit syscall. These make
use of lib/my-syscalls.S to define the exit syscall.

Co-authored-by: Pedro Alves <email address hidden>
Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=27338
Change-Id: Ie8b2c5747db99b7023463a897a8390d9e814a9c9

cd20416... by Pedro Alves <email address hidden>

gdb/testsuite/lib/my-syscalls.S: Refactor new SYSCALL macro

Refactor the syscall assembly code in gdb/testsuite/lib/my-syscalls.S
behind a SYSCALL macro so that it's easy to add new syscalls without
duplicating code.

Note that the way the macro is implemented, it only works correctly
for syscalls with up to 3 arguments, and, if the syscall doesn't
return (the macro doesn't bother to save/restore callee-saved
registers).

The following patch will want to use the macro to define a wrapper for
the "exit" syscall, so the limitations continue to be sufficient.

Change-Id: I8acf1463b11a084d6b4579aaffb49b5d0dea3bba

9d40da2... by Pedro Alves <email address hidden>

Ignore failure to read PC when resuming

If GDB sets a GDB_THREAD_OPTION_EXIT option on a thread, and the
thread exits, the server reports the corresponding thread exit event,
and forgets about the thread, i.e., removes the exited thread from its
thread list.

On the GDB side, GDB set the GDB_THREAD_OPTION_EXIT option on a
thread, GDB delays deleting the thread from its thread list until it
sees the corresponding thread exit event, as that event needs special
handling in infrun.

When a thread disappears from the target, but it still exists on GDB's
thread list, in all-stop RSP mode, it can happen that GDB ends up
trying to resume such an already-exited-thread that GDB doesn't yet
know is gone. When that happens, against GDBserver, typically the
ongoing execution command fails with this error:

 ...
 PC register is not available
 (gdb)

At the remote protocol level, we may see e.g., this:

      [remote] Packet received: w0;p97479.978d2
    [remote] wait: exit
    [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
    [infrun] print_target_wait_results: 619641.620754.0 [Thread 619641.620754],
    [infrun] print_target_wait_results: status->kind = THREAD_EXITED, exit_status = 0
    [infrun] handle_inferior_event: status->kind = THREAD_EXITED, exit_status = 0
    [infrun] context_switch: Switching context from 0.0.0 to 619641.620754.0
    [infrun] clear_proceed_status_thread: 619641.620754.0

GDB saw an exit event for thread 619641.620754. After processing it,
infrun decides to re-resume the target again. To do that, infrun
picks some other thread that isn't exited yet from GDB's perspective,
switches to it, and calls keep_going. Below, infrun happens to pick
thread p97479.97479, the leader, which also exited, but GDB doesn't
know yet:

...
    [remote] Sending packet: $Hgp97479.97479#75
    [remote] Packet received: OK
    [remote] Sending packet: $g#67
    [remote] Packet received: xxxxxxxxxxxxxxxxx (...snip...) [1120 bytes omitted]
    [infrun] reset: reason=handling event
    [infrun] maybe_set_commit_resumed_all_targets: not requesting commit-resumed for target remote, no resumed threads
  [infrun] fetch_inferior_event: exit
  PC register is not available
  (gdb)

The Linux backends, both in GDB and in GDBserver, already silently
ignore failures to resume, with the understanding that we'll see an
exit event soon. Core of GDB doesn't do that yet, though.

This patch is a small step in that direction. It swallows the error
when thrown from within resume_1. There are likely are spots where we
will need similar treatment, but we can tackle them as we find them.

After this patch, we'll see something like this instead:

    [infrun] resume_1: step=0, signal=GDB_SIGNAL_0, trap_expected=0, current thread [640478.640478.0] at 0x0
    [infrun] do_target_resume: resume_ptid=640478.0.0, step=0, sig=GDB_SIGNAL_0
    [remote] Sending packet: $vCont;c:p9c5de.-1#78
    [infrun] prepare_to_wait: prepare_to_wait
    [infrun] reset: reason=handling event
    [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target remote
    [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target remote
  [infrun] fetch_inferior_event: exit
  [infrun] fetch_inferior_event: enter
    [infrun] scoped_disable_commit_resumed: reason=handling event
    [infrun] random_pending_event_thread: None found.
    [remote] wait: enter
      [remote] Packet received: W0;process:9c5de
    [remote] wait: exit
    [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
    [infrun] print_target_wait_results: 640478.0.0 [process 640478],
    [infrun] print_target_wait_results: status->kind = EXITED, exit_status = 0
    [infrun] handle_inferior_event: status->kind = EXITED, exit_status = 0
  [Inferior 1 (process 640478) exited normally]
    [infrun] stop_waiting: stop_waiting
    [infrun] reset: reason=handling event
  (gdb) [infrun] fetch_inferior_event: exit

Change-Id: I7f1c7610923435c4e98e70acc5ebe5ebbac581e2

e2e59de... by Pedro Alves <email address hidden>

Report thread exit event for leader if reporting thread exit events

If GDB sets the GDB_THREAD_OPTION_EXIT option on a thread, then if the
thread disappears from the thread list, GDB expects to shortly see a
thread exit event for it. See e.g., here, in
remote_target::update_thread_list():

    /* Do not remove the thread if we've requested to be
       notified of its exit. For example, the thread may be
       displaced stepping, infrun will need to handle the
       exit event, and displaced stepping info is recorded
       in the thread object. If we deleted the thread now,
       we'd lose that info. */
    if ((tp->thread_options () & GDB_THREAD_OPTION_EXIT) != 0)
      continue;

There's one scenario that is deleting a thread from the
remote/gdbserver thread list without ever reporting a corresponding
thread exit event though -- check_zombie_leaders. This can lead to
GDB getting confused. For example, with a following patch that
enables GDB_THREAD_OPTION_EXIT whenever schedlock is enabled, we'd see
this regression:

 $ make check RUNTESTFLAGS="--target_board=native-extended-gdbserver" TESTS="gdb.threads/no-unwaited-for-left.exp"
 ...
 Running src/gdb/testsuite/gdb.threads/no-unwaited-for-left.exp ...
 FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout)
 ... some more cascading FAILs ...

gdb.log shows:

 (gdb) continue
 Continuing.
 FAIL: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits (timeout)

A passing run would have resulted in:

 (gdb) continue
 Continuing.
 No unwaited-for children left.
 (gdb) PASS: gdb.threads/no-unwaited-for-left.exp: continue stops when the main thread exits

note how the leader thread is not listed in the remote-reported XML
thread list below:

 (gdb) set debug remote 1
 (gdb) set debug infrun 1
 (gdb) info threads
   Id Target Id Frame
 * 1 Thread 1163850.1163850 "no-unwaited-for" main () at /home/pedro/rocm/gdb/build/gdb/testsuite/../../../src/gdb/testsuite/gdb.threads/no-unwaited-for-left.c:65
   3 Thread 1163850.1164130 "no-unwaited-for" [remote] Sending packet: $Hgp11c24a.11c362#39
 (gdb) c
 Continuing.
 [infrun] clear_proceed_status_thread: 1163850.1163850.0
 ...
     [infrun] resume_1: step=1, signal=GDB_SIGNAL_0, trap_expected=1, current thread [1163850.1163850.0] at 0x55555555534f
     [remote] Sending packet: $QPassSignals:#f3
     [remote] Packet received: OK
     [remote] Sending packet: $QThreadOptions;3:p11c24a.11c24a#f3
     [remote] Packet received: OK
 ...
     [infrun] target_set_thread_options: [options for Thread 1163850.1163850 are now 0x3]
 ...
   [infrun] do_target_resume: resume_ptid=1163850.1163850.0, step=0, sig=GDB_SIGNAL_0
   [remote] Sending packet: $vCont;c:p11c24a.11c24a#98
   [infrun] prepare_to_wait: prepare_to_wait
   [infrun] reset: reason=handling event
   [infrun] maybe_set_commit_resumed_all_targets: enabling commit-resumed for target extended-remote
   [infrun] maybe_call_commit_resumed_all_targets: calling commit_resumed for target extended-remote
 [infrun] fetch_inferior_event: exit
 [infrun] fetch_inferior_event: enter
   [infrun] scoped_disable_commit_resumed: reason=handling event
   [infrun] random_pending_event_thread: None found.
   [remote] wait: enter
     [remote] Packet received: N
   [remote] wait: exit
   [infrun] print_target_wait_results: target_wait (-1.0.0 [process -1], status) =
   [infrun] print_target_wait_results: -1.0.0 [process -1],
   [infrun] print_target_wait_results: status->kind = NO_RESUMED
   [infrun] handle_inferior_event: status->kind = NO_RESUMED
   [remote] Sending packet: $Hgp0.0#ad
   [remote] Packet received: OK
   [remote] Sending packet: $qXfer:threads:read::0,1000#92
   [remote] Packet received: l<threads>\n<thread id="p11c24a.11c362" core="0" name="no-unwaited-for" handle="0097d8f7ff7f0000"/>\n</threads>\n
   [infrun] handle_no_resumed: TARGET_WAITKIND_NO_RESUMED (ignoring: found resumed)
 ...

... however, infrun decided there was a resumed thread still, so
ignored the TARGET_WAITKIND_NO_RESUMED event. Debugging GDB, we see
that the "found resumed" thread that GDB finds, is the leader thread.
Even though that thread is not on the remote-reported thread list, it
is still on the GDB thread list, due to the special case in remote.c
mentioned above.

This commit addresses the issue by fixing GDBserver to report a thread
exit event for the zombie leader too, i.e., making GDBserver respect
the "if thread has GDB_THREAD_OPTION_EXIT set, report a thread exit"
invariant. To do that, it takes a bit more code than one would
imagine off hand, due to the fact that we currently always report LWP
exit pending events as TARGET_WAITKIND_EXITED, and then decide whether
to convert it to TARGET_WAITKIND_THREAD_EXITED just before reporting
the event to GDBserver core. For the zombie leader scenario
described, we need to record early on that we want to report a
THREAD_EXITED event, and then make sure that decision isn't lost along
the way to reporting the event to GDBserver core.

Change-Id: I1e68fccdbc9534434dee07163d3fd19744c8403b

55144e8... by Pedro Alves <email address hidden>

Don't resume new threads if scheduler-locking is in effect

If scheduler-locking is in effect, like e.g., with "set
scheduler-locking on", and you step over a function that spawns a new
thread, the new thread is allowed to run free, at least until some
event is hit, at which point, whether the new thread is re-resumed
depends on a number of seemingly random factors. E.g., if the target
is all-stop, and the parent thread hits a breakpoint, and gdb decides
the breakpoint isn't interesting to report to the user, then the
parent thread is resumed, but the new thread is left stopped.

I think that letting the new threads run with scheduler-locking
enabled is a defect. This commit fixes that, making use of the new
clone events on Linux, and of target_thread_events() on targets where
new threads have no connection to the thread that spawned them.

Testcase and documentation changes included.

Approved-By: Eli Zaretskii <email address hidden>
Change-Id: Ie12140138b37534b7fc1d904da34f0f174aa11ce

190583a... by Pedro Alves <email address hidden>

gdbserver: Queue no-resumed event after thread exit

Normally, if the last thread resumed thread on the target exits, the
server sends a no-resumed event to GDB. If however, GDB enables the
GDB_THREAD_OPTION_EXIT option on a thread, and, that thread exits, the
server sends a thread exit event for that thread instead.

In all-stop RSP mode, since events can only be forwarded to GDB one at
a time, and the whole target stops whenever an event is reported, GDB
resumes the target again after getting a THREAD_EXITED event, and then
the server finally reports back a no-resumed event if/when
appropriate.

For non-stop RSP though, events are asynchronous, and if the server
sends a thread-exit event for the last resumed thread, the no-resumed
event is never sent. This patch makes sure that in non-stop mode, the
server queues a no-resumed event after the thread-exit event if it was
the last resumed thread that exited.

Without this, we'd see failures in step-over-thread-exit testcases
added later in the series, like so:

   continue
   Continuing.
 - No unwaited-for children left.
 - (gdb) PASS: gdb.threads/step-over-thread-exit.exp: displaced-stepping=off: non-stop=on: target-non-stop=on: schedlock=off: ns_stop_all=1: continue stops when thread exits
 + FAIL: gdb.threads/step-over-thread-exit.exp: displaced-stepping=off: non-stop=on: target-non-stop=on: schedlock=off: ns_stop_all=1: continue stops when thread exits (timeout)

(and other similar ones)

Change-Id: I927d78b30f88236dbd5634b051a716f72420e7c7