On 16 March 2017 at 21:43, Kamil Rytarowski <n54@gmx.com> wrote:
> On 16.03.2017 11:55, Pavel Labath wrote:
>> What kind of per-process events
>> are we talking about here?
>
> I'm mostly thinking about ResumeActions - to resume the whole process,
> while being able single-stepping desired thread(s).
>
> (We also offer PT_SYSCALL feature, but it's not needed right now in LLDB).
>
>> Is there anything more here than a signal
>> directed at the whole process?
>
> single-stepping
> resume thread
> suspend thread
>
> I'm evaluating the FreeBSD-like API PT_SETSTEP/PT_CLEARSTEP for NetBSD. It
> marks a thread for single-stepping. This code is needed to allow us to
> combine PT_SYSCALL & PT_STEP, and PT_STEP with emitting a signal.
>
> I was thinking about ResumeActions marking which thread to
> resume/suspend/single-step, whether to emit a signal (one per global
> PT_CONTINUE[/PT_SYSCALL]), and whether to resume the whole process.
>
> Up to a certain point this might be kludged with a single-thread model
> for basic debugging.
>
>
> I imagined a possible flow of ResumeAction calls like:
> [The generic/native framework knows upfront the set of threads within
> the debuggee]
> - Resume Thread 2 (PT_RESUME)
> - Suspend Thread 3 (PT_SUSPEND)
> - Set single-step Thread 2 (PT_SETSTEP)
> - Set single-step Thread 4 (PT_SETSTEP)
> - Clear single-step Thread 5 (PT_CLEARSTEP)
> - Resume & emit signal SIGIO (PT_CONTINUE)
>
> In other words: setting properties on threads and pushing the
> PT_CONTINUE button at the end.
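>
> In ptrace(2) terms, the flow above would be roughly the following (a
> sketch only; if I read ptrace(2) right, the target LWP goes in the data
> argument):
>
>   ptrace(PT_RESUME,    pid, NULL, 2);          // resume LWP 2
>   ptrace(PT_SUSPEND,   pid, NULL, 3);          // suspend LWP 3
>   ptrace(PT_SETSTEP,   pid, NULL, 2);          // single-step LWP 2
>   ptrace(PT_SETSTEP,   pid, NULL, 4);          // single-step LWP 4
>   ptrace(PT_CLEARSTEP, pid, NULL, 5);          // no single-step on LWP 5
>   ptrace(PT_CONTINUE,  pid, (void *)1, SIGIO); // push the button, emit SIGIO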
None of this is really NetBSD-specific, except the whole-process signal at the end (which I am going to ignore for now). I mean, the implementation of it is different, but there is no reason why someone would not want to perform the same set of actions on Linux, for instance.

I think most of the work here should be done on the client. Then, when the user issues the final "continue", the client sends something like $vCont;s:2;s:4;c:5. It's then up to the server to figure out how to execute these actions. On NetBSD it would execute the operations you mention above, while on Linux it would do something like ptrace(PTRACE_SINGLESTEP, 2); ptrace(PTRACE_SINGLESTEP, 4); ptrace(PTRACE_CONT, 5); (simplified signatures; the Linux lldb-server already supports this, actually, although you may have a hard time convincing the client to send a packet like that).
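For illustration, the translation loop on the Linux side could look roughly like this (a sketch only, not the actual lldb-server code; the real logic lives in NativeProcessLinux):

  #include <cstddef>
  #include <sys/ptrace.h>
  #include <sys/types.h>

  struct Action { pid_t tid; char kind; };  // 's' = step, 'c' = continue

  // Apply one parsed vCont action per thread; returns -1 with errno set.
  static int ResumeThreads(const Action *acts, size_t n) {
    for (size_t i = 0; i < n; ++i) {
      __ptrace_request req =
          (acts[i].kind == 's') ? PTRACE_SINGLESTEP : PTRACE_CONT;
      // addr is ignored for these requests; data = 0 means "no signal".
      if (ptrace(req, acts[i].tid, nullptr, 0) == -1)
        return -1;
    }
    return 0;
  }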
So I don't believe there will be any sweeping changes necessary to support this in the future. If I understand it correctly, you are working on the server now. All you need to do there is to make sure you translate the set of actions in the packet to the proper sequence of ptrace calls. You can even write lldb-server-style tests for that. Then, we can discuss what would be the best user-level interface to specify complex actions like this, and teach the client to send these packets.
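Such a test could assert an exchange roughly like this (checksums abbreviated to "xx"; thread ids are in hex):

  send:  $vCont;s:2;s:4;c:5#xx
  recv:  $T05thread:2;#xx       <- stop reply once LWP 2 finishes its step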
>
>> AFAICT, most of the stop reasons
>> (breakpoint, watchpoint, single step, ...) are still linked to a
>> specific thread even in your process model. I think you could get to a
>> point where lldb is very useful even without getting these events
>> "correct".
>>
>
> I was thinking, for example, about a change like this (not the real
> function name or prototype):
>
> GetStoppedReason(Thread) -> GetStoppedReason(Process,Thread)
>
> The Linux code would easily route it to the desired thread, and (Net)BSD
> would immediately return the requested data. The only reason I keep these
> functions in NativeThread is that the framework enforces it, while on
> NetBSD the stopped reason is global (per-process).
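>
> A sketch of the shape I have in mind (invented names, not the real LLDB
> prototypes):
>
>   // NetBSD keeps a single stop reason per process, cached at stop time.
>   struct StopReason { int kind; int signo; };
>
>   StopReason GetStoppedReason(NativeProcess &proc, lwpid_t lwp) {
>     (void)lwp;                  // ignored here: one reason per process
>     return proc.stopped_reason;
>   }
>
> Linux would route the same call to the per-thread data instead.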
Ok, I think we can talk about tweaks like that once you have something upstream. Right now it does not seem to me like that should pose a big development obstacle.
> In my local code, I'm populating all threads within the tracee
> (NativeThread) with exactly the same stop reason - for the "whole
> process" case. I can see on the client side that it prints out the
> same message for each thread within the process, as all of them captured
> the stop action.
Indeed, that can be a nuisance. Whole-process events are probably the first thing we should look at after the port is operational. I think this can be handled independently of the fancy resume actions discussed above, which, as Jim pointed out, would be very hard for users to comprehend anyway.
> I'm evaluating it from the point of view of a tracee with 10,000 threads
> and getting an efficient debugging experience. This is why I would
> ideally reduce NativeThread to a sorted, hashable container of integers
> (lwpid_t) and skip the stopped-reason query that is otherwise called for
> each stopped thread in the debuggee.
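>
> Roughly (a shape sketch, not real code):
>
>   std::set<lwpid_t> lwps;           // sorted set of thread ids
>   StopReason process_stop_reason;   // one copy, shared by all LWPs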
I wouldn't worry too much about the performance of this part of the code. If you get to the point where you debug a process with ten thousand threads, I think you'll find that there are other things which are causing performance problems.