NetBSD-Bugs archive


kern/56606: Kernel crash with nfs client



>Number:         56606
>Category:       kern
>Synopsis:       Kernel crash with nfs client
>Confidential:   no
>Severity:       serious
>Priority:       high
>Responsible:    kern-bug-people
>State:          open
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Wed Jan 05 16:25:00 +0000 2022
>Originator:     Uwe
>Release:        NetBSD 9.2_STABLE
>Organization:
University of Leipzig
>Environment:
NetBSD 6bone.informatik.uni-leipzig.de 9.2_STABLE NetBSD 9.2_STABLE (MYCONF9.gdb) #0: Fri Dec  3 22:52:15 CET 2021  root%6bone.informatik.uni-leipzig.de@localhost:/usr/obj/sys/arch/amd64/compile/MYCONF9.gdb amd64

>Description:
The machine runs stably without NFS. When I mount large NFS volumes (10 TB, several hundred thousand files), the kernel crashes after some time. The machine acts only as an NFS client, not as an NFS server.

Output from the crash dump:

dmesg -M netbsd.12.core -N netbsd.12
...
[ 2440123.627306] uvm_fault(0xffffde22d52c85e0, 0xdeadb000, 1) -> e
[ 2440123.627306] fatal page fault in supervisor mode
[ 2440123.627306] trap type 6 code 0 rip 0xffffffff80978353 cs 0x8 rflags 0x10202 cr2 0xdeadbeef ilevel 0 rsp 0xffffb60382766b90
[ 2440123.627306] curlwp 0xffffde1be4749180 pid 28222.1 lowest kstack 0xffffb603827632c0
[ 2440123.627306] panic: trap
[ 2440123.627306] cpu3: Begin traceback...
[ 2440123.627306] vpanic() at netbsd:vpanic+0x160
[ 2440123.627306] snprintf() at netbsd:snprintf
[ 2440123.627306] startlwp() at netbsd:startlwp
[ 2440123.627306] alltraps() at netbsd:alltraps+0xc3
[ 2440123.627306] pmap_remove_pte() at netbsd:pmap_remove_pte+0x1d8
[ 2440123.637379] pmap_remove() at netbsd:pmap_remove+0x237
[ 2440123.637379] ubc_alloc() at netbsd:ubc_alloc+0x758
[ 2440123.637379] ubc_uiomove() at netbsd:ubc_uiomove+0xec
[ 2440123.637379] nfs_bioread() at netbsd:nfs_bioread+0x5de
[ 2440123.637379] VOP_READ() at netbsd:VOP_READ+0x88
[ 2440123.637379] vn_read() at netbsd:vn_read+0x88
[ 2440123.637379] dofileread() at netbsd:dofileread+0x8f
[ 2440123.637379] sys_read() at netbsd:sys_read+0x49
[ 2440123.637379] syscall() at netbsd:syscall+0x196
[ 2440123.637379] --- syscall (number 3) ---
[ 2440123.637379] 798554242c6a:
[ 2440123.637379] cpu3: End traceback...
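
Note on the two addresses in the trap output above: uvm_fault() reports 0xdeadb000 while cr2 holds 0xdeadbeef. These appear to be the same fault, since the first value is the second rounded down to a page boundary. The sketch below only checks that arithmetic, assuming the 4096-byte page size reported by vmstat further down; the helper name page_trunc is made up for illustration. 0xdeadbeef is a value often used to mark freed or uninitialised memory, but whether that is what happened here is not established by this report.

```python
# Illustrative arithmetic only; page_trunc is a hypothetical helper name.
PAGE_SIZE = 4096  # "4096 bytes per page" in the vmstat output


def page_trunc(addr: int, page_size: int = PAGE_SIZE) -> int:
    """Round an address down to the start of the page that contains it."""
    return addr & ~(page_size - 1)


cr2 = 0xDEADBEEF          # faulting address from the trap line
uvm_fault_va = 0xDEADB000  # address printed by uvm_fault()

print(hex(page_trunc(cr2)))
print(page_trunc(cr2) == uvm_fault_va)
```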

vmstat -M netbsd.12.core -N netbsd.12 -s
     4096 bytes per page
       64 page colors
 12216441 pages managed
     2876 pages free
        0 pages paging
     3932 pages wired
     2860 zero pages
        1 reserve pagedaemon pages
       20 reserve kernel pages
   357925 boot kernel pages
  1175001 kernel pool pages
    46329 anonymous pages
 10466724 cached file pages
     9189 cached executable pages
     1024 minimum free pages
     1365 target free pages
  4072147 maximum wired pages
        1 swap devices
 15359999 swap pages
        0 swap pages in use
        0 swap allocations
 21205572 total faults taken
 21065659 traps
1714279768 device interrupts
413180200 CPU context switches
 26095770 software interrupts
3407669224 system calls
        0 pagein requests
        0 pageout requests
        0 pages swapped in
        0 pages swapped out
   324914 forks total
    17752 forks blocked parent
    17752 forks shared address space with parent
 33698832 pagealloc zero wanted and avail
  5618371 pagealloc zero wanted and not avail
    68081 aborts of idle page zeroing
149681999 pagealloc desired color avail
 13724720 pagealloc desired color not avail
115608179 pagealloc local cpu avail
 47798540 pagealloc local cpu not avail
        1 faults with no memory
        0 faults with no anons
        0 faults had to wait on pages
        0 faults found released page
     1215 faults relock (1214 ok)
 38961239 anon page faults
        0 anon retry faults
 16352138 amap copy faults
        0 neighbour anon page faults
190968037 neighbour object page faults
 61071725 locked pager get faults
     1215 unlocked pager get faults
 19092355 anon faults
 19869056 anon copy on write faults
 51524088 object faults
  9548445 promote copy faults
 24916941 promote zero fill faults
     2979 times daemon wokeup
     2979 revolutions of the clock hand
 15140716 pages freed by daemon
 15164196 pages scanned by daemon
        0 anonymous pages scanned by daemon
 15140716 object pages scanned by daemon
     4338 pages reactivated
        0 pages found busy by daemon
        0 total pending pageouts
 18649068 pages deactivated
406482288 total name lookups
315477227 good hits
 12647690 negative hits
    16282 bad hits
    61870 false hits
  6592808 miss
 71686411 too long
  1565858 pass2 hits
  1612698 2passes
          cache hits (77% pos + 3% neg) system 0% per-process
          deletions 0%, falsehits 0%, toolong 17%

gdb ./netbsd
GNU gdb (GDB) 8.3
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64--netbsd".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./netbsd...
(gdb) target kvm netbsd.12.core
0xffffffff80222ba5 in cpu_reboot (howto=howto@entry=260, bootstr=bootstr@entry=0x0) at /usr/src/sys/arch/amd64/amd64/machdep.c:728
728                     dumpsys();
(gdb) bt
#0  0xffffffff80222ba5 in cpu_reboot (howto=howto@entry=260, bootstr=bootstr@entry=0x0) at /usr/src/sys/arch/amd64/amd64/machdep.c:728
#1  0xffffffff809ffc29 in vpanic (fmt=fmt@entry=0xffffffff81312267 "trap", ap=ap@entry=0xffffb60382766958) at /usr/src/sys/kern/subr_prf.c:336
#2  0xffffffff809ffcda in panic (fmt=fmt@entry=0xffffffff81312267 "trap") at /usr/src/sys/kern/subr_prf.c:255
#3  0xffffffff8022508e in trap (frame=0xffffb60382766aa0) at /usr/src/sys/arch/amd64/amd64/trap.c:334
#4  0xffffffff8021d533 in alltraps ()
#5  0xffffffff80978353 in uvm_page_locked_p (pg=pg@entry=0xffffb60053bd96a0) at /usr/src/sys/uvm/uvm_page.c:1767
#6  0xffffffff8024a362 in pmap_remove_pte (pmap=<optimized out>, ptp=0x0, pte=<optimized out>, va=18446662724896296960, pv_tofree=0xffffb60382766c48) at /usr/src/sys/arch/x86/x86/pmap.c:3513
#7  0xffffffff8024df19 in pmap_remove_ptes (pv_tofree=<optimized out>, endva=<optimized out>, startva=18446662724896296960, ptpva=<optimized out>, ptp=<optimized out>, pmap=<optimized out>) at /usr/src/sys/arch/x86/x86/pmap.c:3418
#8  pmap_remove_locked (pdes=<optimized out>, ptes=<optimized out>, eva=18446662724896305152, sva=18446744071586079632, pmap=0xffffffff816e78c0 <kernel_pmap_store>) at /usr/src/sys/arch/x86/x86/pmap.c:3605
#9  pmap_remove (pmap=0xffffffff816e78c0 <kernel_pmap_store>, sva=sva@entry=18446662724896296960, eva=18446662724896305152) at /usr/src/sys/arch/x86/x86/pmap.c:3633
#10 0xffffffff80961f28 in ubc_alloc (uobj=uobj@entry=0xffffde25be3b02e0, offset=offset@entry=3088384, lenp=lenp@entry=0xffffb60382766d58, advice=advice@entry=0, flags=flags@entry=257) at /usr/src/sys/uvm/uvm_bio.c:528
#11 0xffffffff809632ad in ubc_uiomove (uobj=uobj@entry=0xffffde25be3b02e0, uio=uio@entry=0xffffb60382766f00, todo=24576, advice=advice@entry=0, flags=flags@entry=257) at /usr/src/sys/uvm/uvm_bio.c:749
#12 0xffffffff808d6e6c in nfs_bioread (vp=<optimized out>, uio=<optimized out>, ioflag=0, cred=0xffffde1a27cc0400, cflag=0) at /usr/src/sys/nfs/nfs_bio.c:160
#13 0xffffffff80a72671 in VOP_READ (vp=vp@entry=0xffffde25be3b02e0, uio=uio@entry=0xffffb60382766f00, ioflag=ioflag@entry=0, cred=cred@entry=0xffffde1a27cc0400) at /usr/src/sys/kern/vnode_if.c:470
#14 0xffffffff80a694bc in vn_read (fp=<optimized out>, offset=0xffffde1b43a27dc0, uio=0xffffb60382766f00, cred=0xffffde1a27cc0400, flags=1) at /usr/src/sys/kern/vfs_vnops.c:587
#15 0xffffffff80a0ebdf in dofileread (fd=fd@entry=10, fp=<optimized out>, buf=0x798555648d00, nbyte=32768, offset=<optimized out>, flags=flags@entry=1, retval=retval@entry=0xffffb60382766fb0) at /usr/src/sys/kern/sys_generic.c:156
#16 0xffffffff80a0eca7 in sys_read (l=<optimized out>, uap=0xffffb60382767000, retval=0xffffb60382766fb0) at /usr/src/sys/kern/sys_generic.c:121
#17 0xffffffff802526bb in sy_call (rval=0xffffb60382766fb0, uap=0xffffb60382767000, l=0xffffde1be4749180, sy=0xffffffff81657ea8 <sysent+72>) at /usr/src/sys/sys/syscallvar.h:65
#18 sy_invoke (code=3, rval=0xffffb60382766fb0, uap=0xffffb60382767000, l=0xffffde1be4749180, sy=0xffffffff81657ea8 <sysent+72>) at /usr/src/sys/sys/syscallvar.h:94
#19 syscall (frame=0xffffb60382767000) at /usr/src/sys/arch/x86/x86/syscall.c:138
#20 0xffffffff802096dd in handle_syscall ()
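
Note on the decimal addresses in the backtrace: gdb prints sva/eva of pmap_remove() (frame #9) in decimal, which makes the range hard to read. A minimal sketch for converting them and computing the size of the range being unmapped follows. It assumes the 4096-byte page size from the vmstat output and uses only the values shown in frame #9; the variable names are chosen for illustration.

```python
# Values copied from frame #9 (pmap_remove) of the gdb backtrace.
PAGE_SIZE = 4096  # "4096 bytes per page" in the vmstat output

sva = 18446662724896296960  # start of the range passed to pmap_remove()
eva = 18446662724896305152  # end of the range passed to pmap_remove()

length = eva - sva
pages = length // PAGE_SIZE

# Print the addresses in hex, as they would appear in dmesg or ddb output.
print(f"sva = {sva:#018x}")
print(f"eva = {eva:#018x}")
print(f"range = {length} bytes = {pages} pages")
```

By this arithmetic the call covers 8192 bytes, that is two pages, starting at a page-aligned kernel virtual address.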

>How-To-Repeat:
Not known.
>Fix:


