Subject: Re: --db_more-- in recent sparc64 kernel
To: None <eeh@netbsd.org, petrov@netbsd.org>
From: None <eeh@netbsd.org>
List: port-sparc64
Date: 07/14/2001 01:33:18
| On Sat, Jul 14, 2001 at 12:13:31AM -0000, eeh@netbsd.org wrote:
| >
| > | One which fails to start:
| > |
| > | netbsd: file format elf64-sparc
| > | netbsd
| > | architecture: sparc:v9, flags 0x00000012:
| > | EXEC_P, HAS_SYMS
| > | start address 0x0000000001000000
| > |
| > | Program Header:
| > | LOAD off 0x0000000000000080 vaddr 0x0000000001000000 paddr 0x0000000001000000 align 2**7
| > | filesz 0x000000000041e7d8 memsz 0x0000000000499768 flags rwx
| >
| > You haven't been looking at the linker errors, have you?
| >
|
| No, no linker errors.
Hm, that's interesting.
| > What has happened in this case is that your kernel and data segments have
| > collided. This is a bad thing.
| >
|
| Here is the same with sections information:
|
| netbsd: file format elf64-sparc
| netbsd
| architecture: sparc:v9, flags 0x00000012:
| EXEC_P, HAS_SYMS
| start address 0x0000000001000000
|
| Program Header:
| LOAD off 0x0000000000000080 vaddr 0x0000000001000000 paddr 0x0000000001000000 align 2**7
| filesz 0x000000000041e7d8 memsz 0x0000000000499768 flags rwx
|
| Sections:
| Idx Name Size VMA LMA File off Algn
| 0 .text 0029fe38 0000000001000000 0000000001000000 00000080 2**7
| CONTENTS, ALLOC, LOAD, READONLY, CODE
| 1 .data 0001e7d8 0000000001400000 0000000001400000 00400080 2**6
| CONTENTS, ALLOC, LOAD, DATA
| 2 .rodata 000998d1 000000000129fe38 000000000129fe38 0029feb8 2**3
| CONTENTS, ALLOC, LOAD, READONLY, DATA
| 3 .bss 0007af88 000000000141e7e0 000000000141e7e0 0041e860 2**4
| ALLOC
| 4 .comment 000060a2 0000000000000000 0000000000000000 0041e860 2**0
| CONTENTS, READONLY
| 5 .note 00000020 00000000000060a4 00000000000060a4 00424904 2**2
| CONTENTS, READONLY
| 6 .ident 0000007b 00000000000060c4 00000000000060c4 00424924 2**0
| CONTENTS, READONLY
| SYMBOL TABLE:
| 0000000001404008 l .data 0000000000000000 estack0
|
| text+rodata is smaller than 4MB so it shouldn't overlap with data.
| I got confused first time when I looked at sizes which ofwboot
| prints out.
| Linker put everything in one segment (program header), so I expect
| that gap is filled somehow.
Here are the start and end of each segment:
(gdb) p 1000000+29fe38
$1 = 0x129fe38
(gdb) p 129fe38+998d1
$2 = 0x1339709
(gdb) p 1400000+1e7d8
$3 = 0x141e7d8
(gdb) p 141e7e0+7af88
$4 = 0x1499768
There does not seem to be any overlap. There are 813303 bytes between the
end of the text segment and the beginning of the data segment. There
is surprisingly little in the data segment: under 1MB.
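The same arithmetic can be repeated outside gdb. What follows is only a
minimal sketch: the addresses and sizes are copied from the objdump section
listing quoted above, and the helper names (ranges, overlaps, gap) are made
up for illustration.

```python
# Section start addresses and sizes, copied from the objdump listing above.
SECTIONS = {
    ".text":   (0x1000000, 0x29fe38),
    ".rodata": (0x129fe38, 0x998d1),
    ".data":   (0x1400000, 0x1e7d8),
    ".bss":    (0x141e7e0, 0x7af88),
}


def ranges(sections):
    """Return (name, start, end) tuples sorted by start; end is exclusive."""
    return sorted(
        ((name, start, start + size) for name, (start, size) in sections.items()),
        key=lambda r: r[1],
    )


def overlaps(sections):
    """Return name pairs of address-adjacent sections whose ranges overlap."""
    rs = ranges(sections)
    return [(a[0], b[0]) for a, b in zip(rs, rs[1:]) if a[2] > b[1]]


def gap(sections, first, second):
    """Bytes between the end of `first` and the start of `second`."""
    start, size = sections[first]
    return sections[second][0] - (start + size)


if __name__ == "__main__":
    for name, start, end in ranges(SECTIONS):
        print(f"{name:8s} {start:#x} .. {end:#x}")
    print("overlaps:", overlaps(SECTIONS))
    print("gap .rodata -> .data:", gap(SECTIONS, ".rodata", ".data"))
```

The end addresses it prints match $1 through $4 above, and the gap it reports
is the 813303 bytes between the end of .rodata and the start of .data.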
I have seen this sort of thing myself recently. It would appear that the
linker is fouling up. If I grab one of these corrupt kernel images, stick
it under gdb, and dump out statically initialized data, say cn_tab, the data
is corrupt.
|
| The kernel crashes in main() in the very beginning
| p = &proc0;
| curproc = p;
| p->p_cpu = curcpu(); <-----
|
| with 6c (fast DMMU protection) trap. All trap levels are populated
| this.
Fire up GDB on the image and see what you have in, say, proc0.
Also, if there is only one segment, or confusion between segments,
it is quite possible that writeable data is read-only because
it has text protection.
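Whether the image really has a single loadable segment can also be checked
straight from the program headers. This is a rough sketch, not a replacement
for objdump: it assumes the standard ELF64 header layout, and the names
load_segments and describe are hypothetical helpers, not part of any tool.

```python
import struct

PT_LOAD = 1                      # loadable segment type
PF_X, PF_W, PF_R = 1, 2, 4       # segment permission bits


def load_segments(image: bytes):
    """Return (vaddr, memsz, flags) for each PT_LOAD entry of an ELF64 image."""
    if image[:4] != b"\x7fELF" or image[4] != 2:
        raise ValueError("not an ELF64 image")
    # e_ident[5] is the data encoding: 2 = big-endian, 1 = little-endian.
    endian = ">" if image[5] == 2 else "<"
    ehdr = struct.unpack_from(endian + "16sHHIQQQIHHHHHH", image, 0)
    e_phoff, e_phentsize, e_phnum = ehdr[5], ehdr[9], ehdr[10]

    segments = []
    for i in range(e_phnum):
        fields = struct.unpack_from(
            endian + "IIQQQQQQ", image, e_phoff + i * e_phentsize
        )
        p_type, p_flags, p_vaddr, p_memsz = fields[0], fields[1], fields[3], fields[6]
        if p_type == PT_LOAD:
            segments.append((p_vaddr, p_memsz, p_flags))
    return segments


def describe(flags: int) -> str:
    """Render permission bits the way objdump does, e.g. 'rwx' or 'r-x'."""
    return "".join(
        c if flags & bit else "-" for c, bit in (("r", PF_R), ("w", PF_W), ("x", PF_X))
    )
```

Calling load_segments() on the bytes of the kernel file and getting back one
entry with flags "rwx" would correspond to the single LOAD line in the
objdump output quoted above. Separate text and data segments would show up
as two entries, typically "r-x" and "rw-".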
Eduardo