NetBSD-Bugs archive
Re: port-macppc/59014: Shutdown -r now often freezes on macppc b&w G3
The following reply was made to PR port-macppc/59014; it has been noted by GNATS.
From: Chris Tucker <capa150%gmail.com@localhost>
To: gnats-bugs%netbsd.org@localhost, port-macppc-maintainer%netbsd.org@localhost,
gnats-admin%netbsd.org@localhost
Cc:
Subject: Re: port-macppc/59014: Shutdown -r now often freezes on macppc b&w G3
Date: Fri, 24 Jan 2025 13:05:37 -0800
On Thu, Jan 23, 2025 at 3:25 AM Martin Husemann via gnats <
gnats-admin%netbsd.org@localhost> wrote:
> The following reply was made to PR port-macppc/59014; it has been noted by
> GNATS.
>
> From: Martin Husemann <martin%duskware.de@localhost>
> To: gnats-bugs%netbsd.org@localhost
> Cc:
> Subject: Re: port-macppc/59014: Shutdown -r now often freezes on macppc
> b&w G3
> Date: Thu, 23 Jan 2025 12:22:25 +0100
>
> > netbsd# rndctl -S seed2 (works)
> > netbsd# rndctl -S seed2 (works)
> > netbsd# rndctl -S seed2 (works)
> > netbsd# rndctl -S seed2 (freezes, and I press ctrl-t)
> > [ 139.8510292] load: 0.42 cmd: rndctl 633 [biowait] 0.01u 0.00s 0% 1424k
>
> Just for completeness: could you mount a tmpfs on /tmp (if you haven't
> that already as part of your standard install) and then try again with a
> sequence of
>
> # rndctl -S /tmp/seed
>
> One other thing to check if the drive knows it is failing:
>
> # atactl wd0 smart status
>
> This could be the disk not giving up on the sector (and remapping it yet),
> but running into issues whenever trying to write it. I would expect
> dmesg spam from that though.
>
> If you have anything but the fresh install on the disk, I would do a
> backup
> ASAP.
>
> Martin
>
A terrific idea!
I mounted a tmpfs on /tmp:
# mount -t tmpfs tmpfs /tmp
and then ran
# rndctl -S /tmp/seed
I repeated the rndctl -S command above 32 times for good measure with no
errors. I also did a rndctl -L off the tmpfs another 32 times without error,
so keeping the seed file off the disk seems to have eliminated the problem.
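The repetition above can be scripted. The following is a minimal sketch, assuming a POSIX sh; repeat_cmd is a hypothetical helper name, not something shipped with NetBSD. It only catches a non-zero exit status: a hang in biowait like the one in this PR will simply stall the loop, and ctrl-t is still needed to see where.

```sh
#!/bin/sh
# repeat_cmd COUNT CMD [ARGS...]
# Run CMD COUNT times; stop and return non-zero at the first failure.
# (Hypothetical helper name, for illustration only.)
repeat_cmd() {
    count=$1
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        "$@" || { echo "run $i failed: $*" >&2; return 1; }
        i=$((i + 1))
    done
    echo "$count runs of '$*' completed"
}

# Example, as root on the test machine:
#   repeat_cmd 32 rndctl -S /tmp/seed
```

Running the same helper once against a path on wd0 and once against the tmpfs keeps the two cases directly comparable.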
SMART status of my IBM DJSA-220 test-install drive, running NetBSD 10.1:
netbsd# atactl wd0 smart status
SMART supported, SMART enabled
 id  value  thresh  crit  collect  reliability  description                raw
  1    100      62  yes   online   positive     Raw read error rate        0
  2    100      40  yes   offline  positive     Throughput performance     0
  3    102      33  yes   online   positive     Spin-up time               107374182402
  4    100       0  no    online   positive     Start/stop count           1506
  5    100       5  yes   online   positive     Reallocated sector count   0
  7    100      67  yes   online   positive     Seek error rate            0
  8    100      40  yes   offline  positive     Seek time performance      0
  9     57       0  no    online   positive     Power-on hours count       19067
 10    100      60  yes   online   positive     Spin retry count           0
 12    100       0  no    online   positive     Device power cycle count   1391
191    100       0  no    online   positive     G-sense error rate         0
192    100       0  no    online   positive     Power-off retract count    86
193      1      50  no    online   negative     Load cycle count           1059282
196    100       0  no    online   positive     Reallocated event count    7
197    100       0  no    online   positive     Current pending sector     2
198    100       0  no    offline  positive     Offline uncorrectable      0
199    200       0  no    online   positive     Ultra DMA CRC error count  127
netbsd#
SMART status of my Maxtor DiamondMax Plus 8 (new old stock) in the NetBSD 9.1 system:
kawaii# atactl wd0 smart status
SMART supported, SMART enabled
 id  value  thresh  crit  collect  reliability  description                raw
  3    221      63  yes   online   positive     Spin-up time               11385
  4    253       0  no    online   positive     Start/stop count           185
  5    253      63  yes   online   positive     Reallocated sector count   0
  6    253     100  yes   offline  positive     Read channel margin        0
  7    253       0  no    online   positive     Seek error rate            0
  8    252     187  yes   online   positive     Seek time performance      52886
  9    253       0  no    online   positive     Power-on hours count       6774
 10    253     157  yes   online   positive     Spin retry count           0
 11    253     223  yes   online   positive     Calibration retry count    0
 12    253       0  no    online   positive     Device power cycle count   177
 99    253       0  no    offline  positive     Unknown                    0
100    253       0  no    offline  positive     Erase/Program Cycles       0
101    253       0  no    offline  positive     Unknown                    0
192    253       0  no    online   positive     Power-off retract count    120
193    253       0  no    online   positive     Load cycle count           191
194    253       0  no    online   positive     Temperature                42
195    253       0  no    online   positive     Hardware ECC Recovered     2226
196    253       0  no    offline  positive     Reallocated event count    0
197    253       0  no    offline  positive     Current pending sector     0
198    253       0  no    offline  positive     Offline uncorrectable      0
199    199       0  no    offline  positive     Ultra DMA CRC error count  1
200    253       0  no    online   positive     Write error rate           0
201    253       0  no    online   positive     Soft read error rate       3
202    253       0  no    online   positive     Data address mark errors   0
203    253     180  yes   online   positive     Run out cancel             0
204    253       0  no    online   positive     Soft ECC correction        0
205    253       0  no    online   positive     Thermal asperity check     0
207    253       0  no    online   positive     Spin high current          0
208    253       0  no    online   positive     Spin buzz                  0
209    185       0  no    offline  positive     Offline seek performance   0
kawaii#
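To pick the interesting rows out of listings like the two above, a small filter can help. This is only a sketch: smart_suspects is a hypothetical name, the choice of attribute ids (5, 196, 197, 198, 199) is an assumption about which counters usually relate to media or interface trouble, and it assumes each attribute sits on one line with the id first and the raw value last.

```sh
#!/bin/sh
# smart_suspects: read `atactl wd0 smart status` output on stdin and print
# the attributes with ids 5, 196, 197, 198 and 199 whose raw value is
# non-zero. (Hypothetical helper; the id list is an assumption.)
smart_suspects() {
    awk '
        $1 ~ /^(5|196|197|198|199)$/ && $NF ~ /^[0-9]+$/ && $NF + 0 > 0 {
            # Fields 7 .. NF-1 hold the description, the last field the raw value.
            desc = $7
            for (i = 8; i < NF; i++) desc = desc " " $i
            printf "%s %s: %s\n", $1, desc, $NF
        }
    '
}

# Example:
#   atactl wd0 smart status | smart_suspects
```

Fed the two listings above, it would report ids 196, 197 and 199 for the IBM drive and only 199 for the Maxtor.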
Both drives exhibited the rndctl freeze.
I suspect the Mac's onboard IDE controller is flaky, as it is known to be
problematic. See the following link:
https://en.wikipedia.org/wiki/Power_Macintosh_G3#Blue_and_White_2
I have the "good" revision 2 motherboard, which some people think is also
quirky.
My next plan is to modify the kernel config to turn off UDMA mode on the hard
disk using the wd flags, and see whether that resolves the problem.
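As far as I recall from wd(4), the flags value is made of three nibbles (PIO lowest, then DMA, then UltraDMA); in each nibble the low three bits give the mode and the top bit enables the setting, with 0xf meaning "disable" for DMA and UltraDMA. The sketch below composes such a value under that assumption. wd_nibble and wd_flags are hypothetical helper names, and the encoding should be checked against the man page for the release in use.

```sh
#!/bin/sh
# wd_nibble MODE: encode one nibble of a wd(4) flags value.
#   default -> 0       (leave the choice to the driver)
#   off     -> 0xf     (disable; meaningful for DMA and UltraDMA only)
#   0..7    -> 8+MODE  (top bit set means "use this mode")
# Hypothetical helper; encoding assumed from wd(4).
wd_nibble() {
    case $1 in
        default) echo 0 ;;
        off)     echo 15 ;;
        [0-7])   echo $((8 + $1)) ;;
        *)       echo "bad mode: $1" >&2; return 1 ;;
    esac
}

# wd_flags PIO DMA UDMA: print the combined flags value in hex.
wd_flags() {
    pio=$(wd_nibble "$1") || return 1
    dma=$(wd_nibble "$2") || return 1
    udma=$(wd_nibble "$3") || return 1
    printf '0x%04x\n' $(( (udma << 8) | (dma << 4) | pio ))
}

# Example: keep PIO and DMA at their defaults, disable UltraDMA.
#   wd_flags default default off
```

Under these assumptions, wd_flags default default off prints 0x0f00, which would go in the flags field of the wd line in the kernel config.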
Thanks to all for helping to narrow down the cause of the problem.