tech-net archive
Re: npf panic - need some clues
> Date: Thu, 12 Oct 2023 00:09:29 +0000 (UTC)
> From: John Klos <john%klos.com@localhost>
>
> Does anyone have any clue about what's happening here, and what to check /
> try in the future?
>
> [ 2828128.148194] panic: Trap: Data Abort (EL1): Translation Fault L0 with
> write access for 0000000000000000: pc ffffc00000595da4: stp x27, x19, [x0]
This is a null pointer dereference.
> [ 2828128.331272] fp ffffc00255d37760 stage_mem_gc() at ffffc00000595da4
> netbsd:stage_mem_gc+0x54
It happened at stage_mem_gc+0x54, which I bet is subr_thmap.c line 933:
932 gc = kmem_intr_alloc(sizeof(thmap_gc_t), KM_NOSLEEP);
933 gc->addr = addr;
934 gc->len = len;
https://nxr.netbsd.org/xref/src/sys/kern/subr_thmap.c?r=1.13#933
This is wrong on its face: code that allocates with KM_NOSLEEP must
tolerate allocation failure, and here the result is dereferenced
without a NULL check.
Unfortunately, it can't be changed to KM_SLEEP instead as it is
currently used; either the algorithm must be changed or the caller
must be reorganized.
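To make the KM_NOSLEEP point concrete, here is a minimal userland
sketch of an allocation site that tolerates failure. It is not the
thmap code and not a proposed patch: gc_entry_t, fake_nosleep_alloc(),
fake_alloc_fail, and stage_mem_gc_checked() are hypothetical names
made up for illustration, and malloc() stands in for the kernel
allocator. It only shows the NULL check plus error return; it does
not solve the harder problem of what the callers do with that error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for thmap_gc_t; addr/len follow the quoted lines. */
typedef struct gc_entry {
	uintptr_t addr;
	size_t len;
	struct gc_entry *next;
} gc_entry_t;

/* Test knob: nonzero makes the fake allocator fail, as a
 * non-sleeping allocation may under memory pressure. */
static int fake_alloc_fail;

/* Hypothetical stand-in for a non-sleeping allocator: may return NULL. */
static void *
fake_nosleep_alloc(size_t size)
{
	return fake_alloc_fail ? NULL : malloc(size);
}

/*
 * Stage (addr, len) on the list at *head.
 * Returns 0 on success, ENOMEM if no entry could be allocated.
 * The list is left untouched on failure.
 */
static int
stage_mem_gc_checked(gc_entry_t **head, uintptr_t addr, size_t len)
{
	gc_entry_t *gc;

	assert(head != NULL);

	gc = fake_nosleep_alloc(sizeof(*gc));
	if (gc == NULL)
		return ENOMEM;	/* report failure instead of dereferencing NULL */

	gc->addr = addr;
	gc->len = len;
	gc->next = *head;
	*head = gc;
	return 0;
}
```

A caller would have to check the return value and have somewhere to
propagate it, which is exactly what the call chain in this trace
lacks.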
> [ 2828128.339342] fp ffffc00255d377d0 thmap_del() at ffffc000005976b0
> netbsd:thmap_del+0x530
> [ 2828128.339342] fp ffffc00255d378d0 npf_conndb_remove() at
> ffffc00000344f34 netbsd:npf_conndb_remove+0x40
> [ 2828128.355048] fp ffffc00255d37900 npf_conn_establish() at
> ffffc00000342a8c netbsd:npf_conn_establish+0x28c
> [ 2828128.364679] fp ffffc00255d37990 npfk_packet_handler() at
> ffffc0000033a5c4 netbsd:npfk_packet_handler+0x4d4
> [ 2828128.374970] fp ffffc00255d37aa0 pfil_run_hooks() at ffffc0000066c4e0
> netbsd:pfil_run_hooks+0x110
> [ 2828128.384681] fp ffffc00255d37b50 ipintr() at ffffc000002cd87c
> netbsd:ipintr+0x318
> [ 2828128.394683] fp ffffc00255d37d00 softint_dispatch() at
> ffffc000005589a8 netbsd:softint_dispatch+0xf4
Problems:
- thmap_del can't tolerate allocation failure unless the API is
changed to report back failure itself, but...
- npf_conndb_remove can't handle failure of thmap_del anyway in this
error branch, so it really needs to block until enough memory is
freed that the allocation can succeed, but...
- All this logic runs in soft interrupt context where blocking is
forbidden.
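One generic way to reorganize code with this shape of problem is to
allocate records ahead of time in a context that is allowed to block,
and only consume them in the path that is not. The following userland
sketch illustrates that pattern under stated assumptions; it is not
what npf or thmap does, nor a fix proposed in the linked reports. All
names (gc_rec_t, gc_pool_t, gc_pool_refill, gc_pool_stage) are
hypothetical, and malloc() stands in for a sleeping kernel
allocation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical GC record, similar in shape to the one quoted above. */
typedef struct gc_rec {
	uintptr_t addr;
	size_t len;
	struct gc_rec *next;
} gc_rec_t;

/* Hypothetical pool: 'reserve' holds preallocated, unused records;
 * 'staged' holds records describing memory awaiting collection. */
typedef struct {
	gc_rec_t *reserve;
	gc_rec_t *staged;
} gc_pool_t;

/*
 * Add up to n records to the reserve. In a kernel this would run
 * where blocking is allowed. Returns the number actually added.
 */
static size_t
gc_pool_refill(gc_pool_t *p, size_t n)
{
	size_t added = 0;

	while (added < n) {
		gc_rec_t *r = malloc(sizeof(*r));
		if (r == NULL)
			break;
		r->next = p->reserve;
		p->reserve = r;
		added++;
	}
	return added;
}

/*
 * Stage (addr, len) using a preallocated record. Never allocates,
 * so it cannot block. Returns false if the reserve is empty.
 */
static bool
gc_pool_stage(gc_pool_t *p, uintptr_t addr, size_t len)
{
	gc_rec_t *r = p->reserve;

	if (r == NULL)
		return false;
	p->reserve = r->next;	/* take one record from the reserve */

	r->addr = addr;
	r->len = len;
	r->next = p->staged;	/* push it on the staged list */
	p->staged = r;
	return true;
}
```

Note that this only moves the problem: the reserve can still run dry,
so the non-blocking path still needs a failure case unless the
reserve can be sized to cover the worst case. A real kernel version
would also need locking or per-CPU reserves, which the sketch omits.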
The issue is reported and analyzed here:
https://github.com/rmind/npf/issues/129
https://gnats.netbsd.org/57208
Unfortunately nobody has gotten a round tuit.
(Nothing Arm-specific about this -- it's an npf/thmap bug.)