On 24.05.2019 17:09, Michael van Elst wrote:
> On Fri, May 24, 2019 at 10:17:54AM +0200, Kamil Rytarowski wrote:
>
>> Shouldn't that be optimized with libc functions? It calls read(2) for
>> each character.
>
> The input might be read by shell and programs launched by the shell.
> For files you can read-ahead and seek back, but for pipes you can
> only read single bytes.
>

As far as I'm aware, we can use read(2) and write(2) on pipes with
transfers longer than 1 byte. But the real question here is what is
heavy in the build infrastructure. Transferring 1 byte 5k times was
just a potential starting point.

>
>
>>>> 2. Firefox and Thunderbird and certainly other similar software calls
>>>> excessively gettimeofday() and clock_gettime(). At least around 100k
>>>> times per 1 minute, and the program spends around 30sec (cumulative time
>>>> from all LWPs in a process) in the kernel space prompting for the
>>>> current time.
>>>
>>> That's only a symptom. The real question is why it doesn't sleep.
>>
>> This is a symptom, but this is not specific to a single application. In
>> my checks other programs like top(1) are relatively hungry for checking
>> for the current time. More than 70% syscalls from top(1) are for
>> __gettimeofday50() (but of course top(1) doesn't emit so many syscalls
>> in so short periods).
>
> Top caches data from several databases (e.g. passwd) and checks time for
> each lookup to find out whether the cache needs a refresh. Compared to
> everything else done by top it is negligible.
>

My general observation was that this syscall is called frequently by
many programs. Optimizing it could potentially change the
responsiveness of the whole system.

>
>
>> In NetBSD truss(1) we prompt for the current time for each event like a
>> syscall entry/exit of a traced process.
>>
>> Jason Thorpe mentioned how to optimize it. As far as I understand, we
>> can create a page shared between userland and kernel, pass it through
>> AUXV vector and effectively replace all syscalls with memory reads.
>
> Yes, that is an option, also for other calls like getuid() or getpid().
>
> On the other hand, your measurement is probably a bit misleading,
> a modern system does 100k gettimeofday calls in about a millisecond.
>

My computers are slower than that. Also, as long as ptrace(2) is racy, I
cannot guarantee accurate call counts with this tool (unless profiling
a single-threaded application).

I'm looking forward to some analysis with the right tool (DTrace is
more appropriate here). ptrace(2)-based syscall tracers can give only
a rough idea here.

There are some websites (no need to market them here) that present
profiling with strace and show when it is efficient. At some point
Joyent optimized bulk builds of pkgsrc from 2 days to 3 h. There is
certainly low-hanging fruit in build.sh as well.

> I'm not sure whether the additional complexity would be justified.
> Another argument against this optimization is that tracing these
> non-syscalls is even more complex.
>

I'm not sure that losing gettimeofday calls in strace-like programs
would be a real concern here. On the other hand, it would be helpful to
filter out moderately interesting syscalls. Tracing libc calls with
ptrace(2) shouldn't be that difficult, but it would need a tool with MD
code.

>
>
> Greetings,
>

Anyway, I have provided a tool; if someone is interested in
experimenting and sending back patches, feel free to do so. I will keep
using them for catching kernel stability problems of the ptrace(2) APIs.