Subject: kern/10113: procfs cmdline is (politely) not quite right
To: None <gnats-bugs@gnats.netbsd.org>
From: None <kre@munnari.OZ.AU>
List: netbsd-bugs
Date: 05/13/2000 14:25:12
>Number: 10113
>Category: kern
>Synopsis: cat /proc/curproc/cmdline returns EINVAL (on alpha) (and more)
>Confidential: no
>Severity: serious
>Priority: medium
>Responsible: kern-bug-people
>State: open
>Class: sw-bug
>Submitter-Id: net
>Arrival-Date: Sat May 13 14:26:01 PDT 2000
>Closed-Date:
>Last-Modified:
>Originator: Robert Elz
>Release: NetBSD-current 2000-05-13
>Organization:
University of Melbourne
>Environment:
System: NetBSD lavender.cs.mu.OZ.AU 1.4U NetBSD 1.4U (LAVENDER) #28: Sun May 14 06:24:07 EST 2000 kre@lavender.cs.mu.OZ.AU:/usr/src/sys/arch/alpha/compile/LAVENDER alpha
>Description:
The cmdline implementation in procfs is bogus. It's possible that
part of the fix is a workaround of a UVM problem - that is, when
(internally) accessing the top of the process VM (the end of the
args) a request for I/0 of a PAGE_SIZE'd block starting at less
than a PAGE_SIZE from the end of the mem space returns EINVAL
rather than the data that is available. Whether this is a bug
in UVM or not depends upon how it is defined to work, and I was
unable to determine that. (Simon Burge found that problem, and
provided the basis of the workaround/fix).
Then, the cmdline function is unable to read more than one
page of args, and a good thing too, as the way it is written
attempting to get more than that would reference into lala land.
And, on an attempt to read a lot of data when the above is
fixed, most of the data won't be returned, only the final block
of any read.
>How-To-Repeat:
On an alpha, "cat /proc/curproc/cmdline" and note the EINVAL
(the environment may need to be limited to rather smaller than
a page of data to experience the problem.)
Then, also try
cd /usr/include
cat /proc/curproc/cmdline * */* * */* * */* |
tr \\0 \\012 |
sed 10q
and examine the output. Create some other tool that will
read the cmdline, but not all the rest of the args, and try
the above using it, and examine the end of the output (using cat
you get all those files, over and over again - generating a lot of
output, so be wary of doing that...).
Notice that nothing is quite the way it it should be.
>Fix:
Apply the patch below. This contains a workaround for the
UVM problem (whether it be a problem in UVM, or a problem in the
way that the cmdline code was using the UVM interfaces).
That was developed by Simon Burge.
It also allows the entire arg space to be read, or any part of
it, starting at any offset, and will return as much data as is
available in a single read (to the limit of the minimum of the
amount requested, and the number of bytes of args available)
Issues: This code could be simplified, and made much more
effecient if the ps_strings struct contained the byte
count of the args (and the environ) as well as the string
counts. As it is now, the cmdline code has to read the
entire arg space, every time it is called, just to make
sure that accesses aren't attempted beyond the end of the
arg string space - knowing the byte count would allow just
the data requested to be accessed.
It would be trivial to create /proc/N/environ based upon
the cmdline code (could use same code with an arg/env flag).
This is probably worth doing.
While working on this, I noticed that while cmdline has
special case code to deal with SZOMB processes, no
zombie processes ever show up in /proc. The code in cmdline
is still needed, as the cmdline could have been opened
before the process turned into a zombie, so that's OK.
But, this would make implementing a (kind of) ps out of
the contents of /proc impossible, as none of the zombie
processes would be visible, and that's no good.
There are round_page() and trunc_page() macros, but
no page_offset() or anything quite like it I could see.
That is, where page_trunc(n) is ((n) & ~PAGE_MASK)
page_offset(n) would be ((n) & PAGE_MASK). That expression
is used more than 15 times in the MI kernel code, a macro
for it (even though it is pretty simple) might be nice.
Nothing is provided here to deal with any of those.
This patch is from NetBSD-current as it is now. I think it will
almost apply to 1.4.1 and 1.4.2 and probably several other variants.
The only change that's been made since 1.4.1 is substituting use of
the P_ZOMBIE() macro for an explicit test of p_stat == SZOMB
(which is not changed by this patch) - but does appear in the
context, so it isn't going to apply cleanly. I actually worked
on this under 1.4U but nothing has changed since then. The diff is
against -current as I fetched it from the AU sup mirror about
24 hours ago.
Once this has been tested in -current, it would be nice if it was
pulled up to 1.4.3
--- procfs_cmdline.c.WAS Fri Jul 23 22:01:07 1999
+++ procfs_cmdline.c Sun May 14 07:04:35 2000
@@ -85,7 +85,14 @@
*/
if (P_ZOMBIE(p) || (p->p_flag & P_SYSTEM) != 0) {
len = snprintf(arg, PAGE_SIZE, "(%s)", p->p_comm);
- goto doio;
+ xlen = len - uio->uio_offset;
+ if (xlen <= 0)
+ error = 0;
+ else
+ error = uiomove(arg, xlen, uio);
+
+ free(arg, M_TEMP);
+ return (error);
}
/*
@@ -142,14 +149,15 @@
*/
len = 0;
count = pss.ps_nargvstr;
- upper_bound = round_page(uio->uio_offset + 1);
- for (; count && len < upper_bound; len += PAGE_SIZE) {
+ upper_bound = round_page(uio->uio_offset + uio->uio_resid);
+ for (; count && len < upper_bound; len += xlen) {
aiov.iov_base = arg;
aiov.iov_len = PAGE_SIZE;
auio.uio_iov = &aiov;
auio.uio_iovcnt = 1;
auio.uio_offset = argv + len;
- auio.uio_resid = PAGE_SIZE;
+ xlen = PAGE_SIZE - ((argv + len) & PAGE_MASK);
+ auio.uio_resid = xlen;
auio.uio_segflg = UIO_SYSSPACE;
auio.uio_rw = UIO_READ;
auio.uio_procp = NULL;
@@ -157,39 +165,30 @@
if (error)
goto bad;
- for (i = len; i < (len + PAGE_SIZE) && count != 0; i++) {
+ for (i = 0; i < xlen && count != 0; i++) {
if (arg[i] == '\0')
count--; /* one full string */
}
- if (count == 0) {
- /* No more argv strings, set up len and break. */
- len = i;
- break;
+ if (count == 0)
+ i--; /* exclude the final NUL */
+
+ if (len + i > uio->uio_offset) {
+ /* Have data in this page, copy it out */
+ error = uiomove(arg + uio->uio_offset - len,
+ i + len - uio->uio_offset, uio);
+ if (error || uio->uio_resid <= 0)
+ break;
}
}
- if (len > 0)
- len--; /* exclude last NUL */
+ bad:;
/*
* Release the process.
*/
PRELE(p);
uvmspace_free(p->p_vmspace);
- doio:
- xlen = len - uio->uio_offset;
- if (xlen <= 0)
- error = 0;
- else
- error = uiomove(arg + trunc_page(len), xlen, uio);
-
- free(arg, M_TEMP);
- return (error);
-
- bad:
- PRELE(p);
- uvmspace_free(p->p_vmspace);
free(arg, M_TEMP);
return (error);
}
>Release-Note:
>Audit-Trail:
>Unformatted: