Port-sparc64 archive
Re: Pausing/resuming CPU's in DDB
Hi,
> Increasing the number of retries in sparc64_send_ipi() is unlikely to help
> the situation, since you should only be able to exit that routine in one of
> two ways: if it thinks sending the IPI was successful, or through this
> code:
>
> if (panicstr == NULL)
> panic("cpu%d: ipi_send: couldn't send ipi to UPAID %u"
> " (tried %d times)", cpu_number(), upaid, i);
>
> Are you getting a panic? If not, then increasing the loop count won't
> help.
No panic, but I did see the "RED State Exception". I have increased the
number of retries in sparc64_send_ipi to 10000, and the E3500 survived a
10h30 build.sh -j 16 (base, X, 2 kernels) with all filesystems on NFS.
However, even with retries set to 10000, it still sometimes fails to pause or
to resume all the CPUs, so the retry loops in mp_pause_cpus() and
mp_resume_cpus() still seem to be necessary.
> Also, instead of always sending the IPI to all the cpus I would recommend
> updating the cpuset by removing the processors that have halted for the
> next iteration of your patch.
Next iteration of the patch attached.
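For readers who want the retry-and-shrink idea in isolation, here is a rough
user-space sketch. It is not the kernel code: cpuset_t, fake_send_and_wait()
and pause_with_retry() are hypothetical names invented for illustration, the
CPU set is modelled as a plain 64-bit mask, and IPI delivery is simulated by a
per-attempt "who acknowledges" mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for sparc64_cpuset_t: one bit per CPU. */
typedef uint64_t cpuset_t;

/*
 * Hypothetical delivery model: of the targeted CPUs, only those whose
 * bit is also set in will_ack acknowledge this attempt.
 */
static cpuset_t
fake_send_and_wait(cpuset_t targets, cpuset_t will_ack)
{
	return targets & will_ack;
}

/*
 * Try up to max_tries times.  After each attempt, drop the CPUs that
 * acknowledged so the next attempt only targets the stragglers.
 * Returns the set of CPUs that never acknowledged (0 on full success).
 */
static cpuset_t
pause_with_retry(cpuset_t targets, const cpuset_t *ack_per_try, int max_tries)
{
	assert(max_tries > 0);

	for (int i = 0; i < max_tries && targets != 0; i++) {
		cpuset_t acked = fake_send_and_wait(targets, ack_per_try[i]);

		targets &= ~acked;	/* shrink the set for the next round */
	}
	return targets;
}
```

A caller would report an error only if the returned set is non-empty after the
last attempt, which mirrors where the patch calls sparc64_ipi_error().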
Thanks,
J
PS. It was suggested that the code could be moved to use kcpuset(9), but I
haven't looked at that yet.
--
My other computer also runs NetBSD / Sailing at Newbiggin
http://www.netbsd.org/ / http://www.newbigginsailingclub.org/
Index: ipifuncs.c
===================================================================
RCS file: /cvsroot/src/sys/arch/sparc64/sparc64/ipifuncs.c,v
retrieving revision 1.44
diff -u -p -r1.44 ipifuncs.c
--- ipifuncs.c 12 Feb 2012 16:34:10 -0000 1.44
+++ ipifuncs.c 10 Mar 2012 23:43:11 -0000
@@ -222,7 +222,7 @@ sparc64_send_ipi(int upaid, ipifunc_t fu
intr_func = (uint64_t)(u_long)func;
/* Schedule an interrupt. */
- for (i = 0; i < 1000; i++) {
+ for (i = 0; i < 10000; i++) {
int s = intr_disable();
stxa(IDDR_0H, ASI_INTERRUPT_DISPATCH, intr_func);
@@ -325,17 +325,21 @@ mp_halt_cpus(void)
void
mp_pause_cpus(void)
{
+ int i = 3;
sparc64_cpuset_t cpuset;
CPUSET_ASSIGN(cpuset, cpus_active);
CPUSET_DEL(cpuset, cpu_number());
+ while (i-- > 0) {
+ if (CPUSET_EMPTY(cpuset))
+ return;
- if (CPUSET_EMPTY(cpuset))
- return;
-
- sparc64_multicast_ipi(cpuset, sparc64_ipi_pause, 0, 0);
- if (sparc64_ipi_wait(&cpus_paused, cpuset))
- sparc64_ipi_error("pause", cpus_paused, cpuset);
+ sparc64_multicast_ipi(cpuset, sparc64_ipi_pause, 0, 0);
+ if (!sparc64_ipi_wait(&cpus_paused, cpuset))
+ return;
+ CPUSET_SUB(cpuset, cpus_paused);
+ }
+ sparc64_ipi_error("pause", cpus_paused, cpuset);
}
/*
@@ -354,16 +358,20 @@ mp_resume_cpu(int cno)
void
mp_resume_cpus(void)
{
+ int i = 3;
sparc64_cpuset_t cpuset;
- CPUSET_CLEAR(cpus_resumed);
- CPUSET_ASSIGN(cpuset, cpus_paused);
- membar_Sync();
- CPUSET_CLEAR(cpus_paused);
+ while (i-- > 0) {
+ CPUSET_CLEAR(cpus_resumed);
+ CPUSET_ASSIGN(cpuset, cpus_paused);
+ membar_Sync();
+ CPUSET_CLEAR(cpus_paused);
- /* CPUs awake on cpus_paused clear */
- if (sparc64_ipi_wait(&cpus_resumed, cpuset))
- sparc64_ipi_error("resume", cpus_resumed, cpuset);
+ /* CPUs awake on cpus_paused clear */
+ if (!sparc64_ipi_wait(&cpus_resumed, cpuset))
+ return;
+ }
+ sparc64_ipi_error("resume", cpus_resumed, cpuset);
}
int