NetBSD-Bugs archive


Re: port-macppc/59014: Shutdown -r now often freezes on macppc b&w G3



The following reply was made to PR port-macppc/59014; it has been noted by GNATS.

From: Chris Tucker <capa150%gmail.com@localhost>
To: gnats-bugs%netbsd.org@localhost, port-macppc-maintainer%netbsd.org@localhost, 
	gnats-admin%netbsd.org@localhost
Cc: 
Subject: Re: port-macppc/59014: Shutdown -r now often freezes on macppc b&w G3
Date: Fri, 24 Jan 2025 13:05:37 -0800

 --0000000000004fcfe6062c7a17fe
 Content-Type: text/plain; charset="UTF-8"
 Content-Transfer-Encoding: quoted-printable
 
 On Thu, Jan 23, 2025 at 3:25 AM Martin Husemann via gnats <
 gnats-admin%netbsd.org@localhost> wrote:
 
 > The following reply was made to PR port-macppc/59014; it has been noted by
 > GNATS.
 >
 > From: Martin Husemann <martin%duskware.de@localhost>
 > To: gnats-bugs%netbsd.org@localhost
 > Cc:
 > Subject: Re: port-macppc/59014: Shutdown -r now often freezes on macppc
 > b&w G3
 > Date: Thu, 23 Jan 2025 12:22:25 +0100
 >
 >  >  netbsd# rndctl -S seed2   (works)
 >  >  netbsd# rndctl -S seed2   (works)
 >  >  netbsd# rndctl -S seed2   (works)
 >  >  netbsd# rndctl -S seed2   (freezes, and I press ctrl-t)
 >  >  [ 139.8510292] load: 0.42  cmd: rndctl 633 [biowait] 0.01u 0.00s 0% 1424k
 >
 >  Just for completeness: could you mount a tmpfs on /tmp (if you haven't
 >  that already as part of your standard install) and then try again with a
 >  sequence of
 >
 >         # rndctl -S /tmp/seed
 >
 >  One other thing to check if the drive knows it is failing:
 >
 >         # atactl wd0 smart status
 >
 >  This could be the disk not giving up on the sector (and remapping it yet),
 >  but running into issues whenever trying to write it. I would expect
 >  dmesg spam from that though.
 >
 >  If you have anything but the fresh install on the disk, I would do a backup
 >  ASAP.
 >
 >  Martin
 >
 A terrific idea!
 
 I mounted a tmpfs on /tmp:
 # mount -t tmpfs tmpfs /tmp
 and then ran
 # rndctl -S /tmp/seed
 32 times in a row for good measure, with no errors. I also ran rndctl -L
 against the seed file on the tmpfs another 32 times without error, so the
 freeze does not seem to occur when the seed file is not on the hard disk.
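The repeated runs described above can be scripted. The following is only a
minimal sketch: repeat_cmd is a hypothetical helper name, not something
from the report or from NetBSD. It prints a counter before each run, so if
the command hangs (as rndctl did in biowait), the last counter line shows
which iteration stalled. The loop itself cannot detect or recover from a hang.

```shell
# Hypothetical helper (an assumption, not part of the original report):
# run a command N times and stop at the first non-zero exit status.
# The counter goes to stderr before each run, so a hang is easy to locate.
repeat_cmd() {
    n=$1
    shift
    i=1
    while [ "$i" -le "$n" ]; do
        echo "run $i of $n" >&2
        "$@" || { echo "run $i failed" >&2; return 1; }
        i=$((i + 1))
    done
    echo "$n runs completed"
}

# Intended use, as root on the test machine:
#   repeat_cmd 32 rndctl -S /tmp/seed
#   repeat_cmd 32 rndctl -L /tmp/seed
```

Running the same helper against a seed file on the hard disk and on the
tmpfs gives a like-for-like comparison of the two cases.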
 
 SMART status of my IBM DJSA-220 test-install drive, running NetBSD 10.1:
 netbsd# atactl wd0 smart status
 SMART supported, SMART enabled
 id value thresh crit collect reliability description                 raw
   1 100   62     yes online  positive    Raw read error rate         0
   2 100   40     yes offline positive    Throughput performance      0
   3 102   33     yes online  positive    Spin-up time                107374182402
   4 100    0     no  online  positive    Start/stop count            1506
   5 100    5     yes online  positive    Reallocated sector count    0
   7 100   67     yes online  positive    Seek error rate             0
   8 100   40     yes offline positive    Seek time performance       0
   9  57    0     no  online  positive    Power-on hours count        19067
  10 100   60     yes online  positive    Spin retry count            0
  12 100    0     no  online  positive    Device power cycle count    1391
 191 100    0     no  online  positive    G-sense error rate          0
 192 100    0     no  online  positive    Power-off retract count     86
 193   1   50     no  online  negative    Load cycle count            1059282
 196 100    0     no  online  positive    Reallocated event count     7
 197 100    0     no  online  positive    Current pending sector      2
 198 100    0     no  offline positive    Offline uncorrectable       0
 199 200    0     no  online  positive    Ultra DMA CRC error count   127
 netbsd#
 
 SMART status of my Maxtor DiamondMax Plus 8 (new old stock), running NetBSD 9.1:
 kawaii# atactl wd0 smart status
 SMART supported, SMART enabled
 id value thresh crit collect reliability description                 raw
   3 221   63     yes online  positive    Spin-up time                11385
   4 253    0     no  online  positive    Start/stop count            185
   5 253   63     yes online  positive    Reallocated sector count    0
   6 253  100     yes offline positive    Read channel margin         0
   7 253    0     no  online  positive    Seek error rate             0
   8 252  187     yes online  positive    Seek time performance       52886
   9 253    0     no  online  positive    Power-on hours count        6774
  10 253  157     yes online  positive    Spin retry count            0
  11 253  223     yes online  positive    Calibration retry count     0
  12 253    0     no  online  positive    Device power cycle count    177
  99 253    0     no  offline positive    Unknown                     0
 100 253    0     no  offline positive    Erase/Program Cycles        0
 101 253    0     no  offline positive    Unknown                     0
 192 253    0     no  online  positive    Power-off retract count     120
 193 253    0     no  online  positive    Load cycle count            191
 194 253    0     no  online  positive    Temperature                 42
 195 253    0     no  online  positive    Hardware ECC Recovered      2226
 196 253    0     no  offline positive    Reallocated event count     0
 197 253    0     no  offline positive    Current pending sector      0
 198 253    0     no  offline positive    Offline uncorrectable       0
 199 199    0     no  offline positive    Ultra DMA CRC error count   1
 200 253    0     no  online  positive    Write error rate            0
 201 253    0     no  online  positive    Soft read error rate        3
 202 253    0     no  online  positive    Data address mark errors    0
 203 253  180     yes online  positive    Run out cancel              0
 204 253    0     no  online  positive    Soft ECC correction         0
 205 253    0     no  online  positive    Thermal asperity check      0
 207 253    0     no  online  positive    Spin high current           0
 208 253    0     no  online  positive    Spin buzz                   0
 209 185    0     no  offline positive    Offline seek performance    0
 kawaii#
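Tables like the two above are easier to compare if the suspicious rows are
picked out mechanically. This is a rough sketch, assuming the column layout
shown here (id, value, thresh, crit, collect, reliability, description, raw).
smart_flags is a hypothetical name, and the choice of "interesting" ids
(5, 196, 197, 198, 199) is my own assumption, not a rule from atactl. Raw
values are vendor-specific, so the output is a hint about where to look
rather than a diagnosis.

```shell
# Hypothetical filter (an assumption, not an atactl feature): read
# "atactl wd0 smart status" output on stdin and print the rows where
#   - the normalized value is at or below a non-zero threshold, or
#   - the raw count is non-zero for ids 5, 196, 197, 198 or 199.
# Output is tab-separated: id, description, reason.
smart_flags() {
    awk '
    $1 ~ /^[0-9]+$/ && NF >= 8 {
        id = $1; value = $2 + 0; thresh = $3 + 0; raw = $NF
        reason = ""
        if (thresh > 0 && value <= thresh)
            reason = "value at or below threshold"
        else if ((id == 5 || id == 196 || id == 197 || id == 198 || id == 199) && raw != "0")
            reason = "nonzero raw count"
        if (reason != "") {
            # The description is every field between "reliability" and raw.
            desc = $7
            for (i = 8; i < NF; i++) desc = desc " " $i
            printf "%s\t%s\t%s\n", id, desc, reason
        }
    }'
}

# Intended use:
#   atactl wd0 smart status | smart_flags
```

Applied to the IBM table this lists attributes 193, 196, 197 and 199, and
applied to the Maxtor table only 199. Both drives therefore report a
non-zero Ultra DMA CRC error count, which concerns the link between drive
and controller rather than the platters.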
 
 Both drives exhibited the rndctl freeze.
 
 I suspect the Mac's onboard IDE controller is flaky, as it is known to have
 problems. See the following link:
 https://en.wikipedia.org/wiki/Power_Macintosh_G3#Blue_and_White_2
 
 I have the "good" revision 2 motherboard, which some people think is also
 quirky.
 
 My next step is to build a kernel with UDMA mode turned off for the hard
 disk via the wd flags, and see whether that resolves the problem.
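As a reminder of how the wd flags value is put together, here is a small
sketch based on my reading of wd(4): the lowest four bits select the PIO
mode, the next four the DMA mode and the next four the UltraDMA mode. In
each group the top bit must be set for the setting to take effect, and 0xf
means "disable" for DMA and UDMA. The helper names are hypothetical, and
the layout should be checked against wd(4) for the release in use.

```shell
# Hypothetical helpers (assumptions, not NetBSD tools) that compose a wd(4)
# flags value from three settings: PIO, DMA, UDMA.
# Each setting is "auto" (leave to the driver), "off" (DMA/UDMA only),
# or a mode number 0-6.
wd_nibble() {
    case "$1" in
        auto)  echo 0 ;;
        off)   echo 15 ;;                # 0xf: disable (DMA and UDMA only)
        [0-6]) echo $(( 8 | $1 )) ;;     # top bit set = use this mode
        *)     echo "bad mode: $1" >&2; return 1 ;;
    esac
}

wd_flags() {
    if [ "$1" = off ]; then
        echo "PIO cannot be disabled" >&2
        return 1
    fi
    pio=$(wd_nibble "$1") || return 1
    dma=$(wd_nibble "$2") || return 1
    udma=$(wd_nibble "$3") || return 1
    printf '0x%04x\n' $(( (udma << 8) | (dma << 4) | pio ))
}

# Example: leave PIO and DMA to the driver, disable UltraDMA only.
#   wd_flags auto auto off
```

With "auto auto off" this yields 0x0f00, and "4 2 off" yields 0x0fac (PIO
mode 4, DMA mode 2, UltraDMA disabled). The value would then go into the
flags field of the wd line in the kernel configuration file.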
 
 Thanks to all for helping to narrow down the cause of the problem.
 
 --0000000000004fcfe6062c7a17fe--
 

