Subject: re: crash dump failing on machine with 4GB
To: Chris Ross <cross+netbsd@distal.com>
From: matthew green <mrg@eterna.com.au>
List: port-sparc64
Date: 09/27/2007 03:51:54
On Sep 26, 2007, at 12:07, Chris Ross wrote:
> Is this a known issue? I have a sparc64 machine with 4GB of memory.
Not unexpectedly, this appears to be an int overflow issue.
Making the following change:
--- sys/arch/sparc64/sparc64/machdep.c 11 Sep 2007 16:00:06 -0000 1.202
+++ sys/arch/sparc64/sparc64/machdep.c 26 Sep 2007 17:24:50 -0000
@@ -759,7 +759,7 @@
for (mp = &phys_installed[0], j = 0; j < phys_installed_size;
j++, mp = &phys_installed[j]) {
- unsigned i = 0, n;
+ unsigned long i = 0, n;
paddr_t maddr = mp->start;
#if 0
@@ -781,8 +781,7 @@
printf("%ld ", todo / (1024*1024));
pmap_kenter_pa(dumpspace, maddr, VM_PROT_READ);
pmap_update(pmap_kernel());
- error = (*dump)(dumpdev, blkno,
- (void *)dumpspace, (int)n);
+ error = (*dump)(dumpdev, blkno, (void *)dumpspace, n);
pmap_kremove(dumpspace, n);
pmap_update(pmap_kernel());
if (error)
i guess i was expecting something like this. you may be the first
person to truly try crashdumps on a 4GB machine :-)
causes it to produce a new error. n is capped at 8192 by other
code, so the latter segment above is probably not even an issue. I
don't know enough about the lower-level device code to know what I'm
hitting, so I thought I'd ask. This wasn't getting hit before
because n was 0, due to the overflow.
8192 is almost certainly due to that being the sparc64 page size.
I'm seeing now:
db> reboot 0x104
Frame pointer is at 0xe0016651
Call traceback:
13ea690(1, d, 0, e00171e0, ffffffffffffffff, 0, e0016731) fp = e0016731
10be120(104, 0, e00170a8, 1860800, 1860b88, 188c7a8, e00167f1) fp = e00167f1
10bd658(1, 0, 4, e0017170, e0017298, 188c7a8, e00168c1) fp = e00168c1
10bdc88(180f2c8, 4, 0, 0, e0017388, 0, e0016a11) fp = e0016a11
10c163c(13f3f08, 0, 2, 1898819, 0, 0, e0016b01) fp = e0016b01
13f5264(0, 0, 0, 0, 4, 1000000, e0016bd1) fp = e0016bd1
13f2dd8(101, e0017b60, 98b31e1fa, 957d95e00000000, 1d00000000, 18a4800, e0017131) fp = e0017131
1008c1c(e0017b60, 101, 13f3f00, 1d0006, 400, 187a998, e00172b1) fp = e00172b1
13c234c(189b950, 187f3e0, ffffffff, 0, 1818c00, 1d, e0017491) fp = e0017491
13c29a8(61c4800, e0017e0c, a847c1a, 7477, ffff, 40, e0017551) fp = e0017551
100911c(0, 0, e0017ed0, 1877998, 13c2960, 1000000, e0017621) fp = e0017621
1288640(0, 0, 4, 6, 187a800, 1000000, ffbd561) fp = ffbd561
dumping to dev 7,1 offset 4310231
dump 4096 esiop0: unable to load cmd DMA map: -1i/o error
sd0(esiop0:0:0:0): polling command not done
panic: scsipi_execute_xs
cpu0: kdb breakpoint at 13f3f00
Stopped in pid 0.2 (system) at netbsd:cpu_Debugger+0x4: nop
db>
can you get a stack trace with symbols? or use gdb to
find them out from these values?
.mrg.