Subject: Servicing Multiple (nested) TLB Misses
To: None <port-mips@netbsd.org>
From: Toru Nishimura <locore32@gaea.ocn.ne.jp>
List: port-mips
Date: 12/08/2002 13:24:07
A while ago, I briefly mentioned nested TLB miss handling, which
matters for a linear PTE arrangement. Here is a more detailed
explanation. The following is an excerpt from the TOSHIBA TX39/H2
documents. Combined with what See MIPS Run says, the design
intent of CONTEXT becomes clearer.
/// Servicing Multiple (nested) TLB Misses ///
(From TX39/H2 architecture manual)
Within a UTLB Miss handler, the virtual address that specifies the
PTE contains physical address and access control information that
might not be mapped in the TLB. Then, a TLB Miss exception occurs.
You can recognize this case by noting that the EPC register points
within the UTLB Miss handler. The operating system might interpret
the event as an address error (when the virtual address falls
outside the valid region for the process) or as a TLB Miss on the
page mapping table. This second TLB miss obscures the contents of
the BadVAddr, Context, and EntryHi registers as they were within
the UTLB Miss handler. As a result, the exact virtual address whose
translation caused the first fault is not known unless the UTLB
Miss handler specifically saved this address. You can only observe
the failing PTE virtual address. The BadVAddr register now contains
the original contents of the Context register within the UTLB Miss
handler, which is the PTE address for the original faulting address.
If the operating system interprets the exception as a TLB Miss on
the page mapping table, it constructs a TLB entry to map the page
table and writes the entry into the TLB. Then, the operating system
can determine the original faulting virtual page number, but not
the complete address. The operating system uses this information
to fetch the PTE that contains the physical address and access
control information. It also writes this information into the TLB.
The UTLB Miss handler must save the EPC in a way that allows the
second miss to find it. The EPC register information that the UTLB
Miss handler saved gives the correct address at which to resume
execution. The old KUo and IEo bits of the Status register contain
the correct mode after the TX39/H2 Processor Core services a double
miss. NOTE: You neither need nor want to return to the UTLB Miss
handler at this point.
###
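
To make the excerpt's point about BadVAddr concrete, here is a minimal C sketch of how an R3000-style Context value is formed and how the faulting page number (but not the full address) can be recovered from it after a nested miss. It assumes the usual R3000 layout, with PTEBase in bits 31..21 and BadVPN (address bits 30..12) in bits 20..2. The helper names and the sample base address are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CTX_PTEBASE_MASK 0xffe00000u	/* bits 31..21, written by the OS */
#define CTX_BADVPN_MASK  0x001ffffcu	/* bits 20..2, set by hardware */

/* Value Context would hold after a miss on vaddr (hypothetical helper). */
static uint32_t
r3k_context(uint32_t pte_base, uint32_t vaddr)
{
	assert((pte_base & ~CTX_PTEBASE_MASK) == 0);	/* 2MB aligned */
	/* (vaddr >> 12) << 2, i.e. one 4-byte PTE per 4KB page */
	return pte_base | ((vaddr >> 10) & CTX_BADVPN_MASK);
}

/*
 * After the nested miss, BadVAddr holds the PTE address above.
 * Only the page number survives; the low 12 bits are lost.
 */
static uint32_t
r3k_faulting_page(uint32_t pte_vaddr)
{
	return (pte_vaddr & CTX_BADVPN_MASK) << 10;
}
```

With an assumed base of 0xc0000000, a miss on 0x00401234 gives a PTE address of 0xc0001004, from which only the page address 0x00401000 can be rebuilt.
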
Now I can show the real code. I have been crafting a new MIPS
pmap featuring a linear PTE map for years. The code works to some
extent. The kernel boots, /sbin/init runs, and single-user /bin/sh emits a #
prompt, but it dies mysteriously soon after. I suspect an ASID or cache issue
is still lurking beyond my understanding. Anyway, my R3000 code
looks like this:
LEAF_NOPROFILE(mips1_UTLBMiss)
        mfc0    k1, MIPS_COP_0_TLB_CONTEXT
        mfc0    k0, MIPS_COP_0_EXC_PC
        lw      k1, 0(k1)               # possible KTLBmiss here
        nop                             # N.B. k0 saved the original EPC
        mtc0    k1, MIPS_COP_0_TLB_LOW
        nop
        tlbwr
        jr      k0
        rfe
        .globl _C_LABEL(mips1_UTLBMissEnd)
_C_LABEL(mips1_UTLBMissEnd):
END(mips1_UTLBMiss)
The double fault may happen at the 3rd instruction above.
LEAF_NOPROFILE(mips1_exception)
        mfc0    k1, MIPS_COP_0_CAUSE
        nop
        and     k1, MIPS1_CR_EXC_CODE
        sub     k1, T_TLB_LD_MISS << 2  # anticipating UTLBmiss
        bnez    k1, 1f
        la      k1, _C_LABEL(mips1_GXCPT)
        jr      k1                      # preserve k0 for load miss trap
        nop
1:      mfc0    k1, MIPS_COP_0_CAUSE
        la      k0, mips1_xcptsw
        and     k1, MIPS1_CR_EXC_CODE
        add     k1, k1, k0
        lw      k1, 0(k1)               # dispatch
        nop
        jr      k1
        nop
        .globl _C_LABEL(mips1_exceptionEnd)
_C_LABEL(mips1_exceptionEnd):
END(mips1_exception)
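
The test and dispatch performed by mips1_exception can be restated in C roughly as below. This is only a sketch: the constant values (a 4-bit ExcCode field at Cause bits 5..2, and 2 for a TLB load miss) are assumed from the usual NetBSD/mips definitions, and the SK_ names and helper functions are invented here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; the SK_ names are hypothetical stand-ins. */
#define SK_EXC_CODE	0x0000003cu	/* stands in for MIPS1_CR_EXC_CODE */
#define SK_TLB_LD_MISS	2u		/* stands in for T_TLB_LD_MISS */

/* Mirrors the and/sub/bnez sequence: zero result means TLB load miss. */
static int
is_tlb_load_miss(uint32_t cause)
{
	return ((cause & SK_EXC_CODE) - (SK_TLB_LD_MISS << 2)) == 0;
}

/*
 * The masked ExcCode is already scaled by 4, so the assembly adds it
 * directly to the table base as a byte offset. In C terms that is an
 * index into an array of 32-bit handler addresses.
 */
static unsigned
xcptsw_index(uint32_t cause)
{
	return (unsigned)((cause & SK_EXC_CODE) / sizeof(uint32_t));
}
```

A Cause value of 0x00000008 takes the load-miss path with k0 untouched, while 0x8000000c (BD set, code 3) is dispatched through table slot 3.
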
The double faulting special case is handled through common
trap() code.
void
trap(status, cause, opc, frame)
        unsigned status;
        unsigned cause;
        vaddr_t opc;
        struct frame *frame;
{
        ...
        case T_TLB_LD_MISS:
                /* layout linear PTE in VPT and AVPT for TLB refill */
                if (pdei(vaddr) == 1018 || pdei(vaddr) == 1019) {
                        pt_entry_t **pdp, *ptp;
                        pdp = curpcb->pcb_pmap->pm_pdir;
                        /* loopback to myself or refer to the other one */
                        pdp = (pt_entry_t **)pdp[pdei(vaddr)];
                        /* take KSEG0 address of PT page */
                        ptp = pdp[ptei(vaddr)];
                        if (ptp == NULL) {
                                /* hit 4MB desert hole, masquerade PG_NV */
                                MIPS_TLBWR(vaddr, desertpte);
                        }
                        else {
                                /* map the PT page in VPT/AVPT space */
                                pte = PG_V | PG_D;
                                if (ptei(vaddr) >= 768)
                                        pte |= PG_G;
                                pte |= MIPS_KSEG0_TO_PTE(ptp);
                                MIPS_TLBWR(vaddr, pte);
                        }
                        /* detour lw fault during UTLBmiss; MIPS1 only */
                        if (cpu_arch == CPU_ARCH_MIPS1
                            && opc == (0x80000000+sizeof(int)*2))
                                frame->f_pc = 0x80000000+sizeof(int)*7;
                        return;
                }
                /* FALLTHRU */
        case T_TLB_ST_MISS:
                /* TLB refill for kernel space; MIPS1 only */
                ...
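
The "detour" at the end of that case is just address arithmetic on the UTLB miss vector at 0x80000000. A small C sketch of it follows, with the instruction slots counted from zero as in mips1_UTLBMiss (slot 2 is the lw, slot 7 is the jr k0). The function and macro names are hypothetical, and INSN stands in for sizeof(int) on a 32-bit MIPS.

```c
#include <assert.h>
#include <stdint.h>

#define UTLB_VECTOR	0x80000000u	/* R3000 UTLB miss vector */
#define INSN		4u		/* bytes per instruction */

/*
 * If the nested miss was raised by the lw inside the UTLB miss
 * handler, resume at the jr k0 / rfe pair instead of at the lw,
 * skipping the rest of the refill. Otherwise resume where we faulted.
 */
static uint32_t
resume_pc(uint32_t opc)
{
	if (opc == UTLB_VECTOR + INSN * 2)	/* slot 2: lw k1, 0(k1) */
		return UTLB_VECTOR + INSN * 7;	/* slot 7: jr k0 */
	return opc;
}
```

This matches the manual's note that there is no need to return into the UTLB miss handler body: since k0 was loaded with the original EPC before the lw, the jr k0 / rfe pair presumably leads back to the instruction that missed in the first place.
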
A clear picture would be needed to explain how the linear PTE
is arranged in those address ranges (1018 * 4MB or 1019 * 4MB).
In short, it is done just as NetBSD/i386 or NetBSD/pc532
"fools" its MMU.
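
In place of a picture, the arithmetic of such a self-referencing layout can be sketched in C. The definitions of pdei() and ptei() below are assumptions (4KB pages, 4-byte PTEs, 1024 entries per table page, as on i386), and VPT_SLOT, VPT_BASE and vpt_pte_addr() are names invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed definitions: top 10 bits and middle 10 bits of the address. */
#define pdei(va)	((uint32_t)(va) >> 22)
#define ptei(va)	(((uint32_t)(va) >> 12) & 0x3ffu)

#define VPT_SLOT	1018u
#define VPT_BASE	(VPT_SLOT << 22)	/* 1018 * 4MB = 0xfe800000 */

/* Virtual address of the PTE that maps va in the linear table. */
static uint32_t
vpt_pte_addr(uint32_t va)
{
	return VPT_BASE + (va >> 12) * 4u;
}
```

For va = 0x00401234 the PTE lives at 0xfe801004. That address has pdei() equal to 1018, and its ptei() equals pdei(va), which is why trap() can index the page directory with ptei(vaddr) to find the PT page to map. Likewise pdei(0xc0000000) is 768, the threshold at which the code sets PG_G.
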
For the R4000 case, I found a useful trick with the R4000 CONTEXT register.
The definition of R4000 CONTEXT differs from that of the R3000 for the sake of
64bit PTR. The TLBrefill exception handler can be written as follows:
LEAF_NOPROFILE(mips3_TLBrefill)
ALEAF(mips3_XTLBrefill)
        mfc0    k1, MIPS_COP_0_TLB_CONTEXT
        nop
        nop
        sra     k1, 1                   # simulate MIPS1 CONTEXT register
        lw      k0, 0(k1)               # possible KTLBmiss,
        lw      k1, 4(k1)               # but never returns here
        sll     k0, 2
        srl     k0, 2                   # mask off software PTE bits
        sll     k1, 2
        srl     k1, 2                   # mask off software PTE bits
        mtc0    k0, MIPS_COP_0_TLB_LO0
        mtc0    k1, MIPS_COP_0_TLB_LO1
        nop
        tlbwr
        nop
        eret
        .globl _C_LABEL(mips3_TLBrefillEnd)
_C_LABEL(mips3_TLBrefillEnd):
        .globl _C_LABEL(mips3_XTLBrefillEnd)
_C_LABEL(mips3_XTLBrefillEnd):
END(mips3_TLBrefill)
Here, note that the PTE is kept 32bit. The trick is the 4th instruction.
The CONTEXT register is arranged to hold the <<=1 value of PTEbase,
and the whole value is adjusted at runtime to simulate the R3000
CONTEXT. The EXL bit eliminates the cumbersome double-faulting care
found in the R3000. Logic goes straight into trap() in this case.
Toru Nishimura/ALKYL Technology