NetBSD-Bugs archive
Re: kern/58775 (apei(4) spamming console)
The following reply was made to PR kern/58775; it has been noted by GNATS.
From: Hauke Fath <hf%spg.tu-darmstadt.de@localhost>
To: Taylor R Campbell <riastradh%NetBSD.org@localhost>
Cc: gnats-bugs%netbsd.org@localhost, gnats-admin%netbsd.org@localhost
Subject: Re: kern/58775 (apei(4) spamming console)
Date: Sun, 27 Oct 2024 00:13:15 +0200
On Sat, 26 Oct 2024 15:49:32 +0000, Taylor R Campbell wrote:
> So, the new apei(4) code and pcictl(8) both confirm that your
> PCI device is unhappy with lots of hardware errors -- corrected
> errors, but still alarming. This is almost certainly an actual
> hardware problem that you might want to address (once we're done
> doing science!).
Disturbing - this is basically a new machine, from a batch we bought
last year (always quicker bought than deployed). We did have another
machine from the same vendor, though, whose ECC RAM faults vanished
once the offending module was found and - reseated. We've worked with
this vendor for almost twenty years, but they got so big they
apparently don't have to sweat the details any more.
I guess I'll hook up the machine's ipmi console on Monday, and see what
that has to say.
> Can you revert the previous patch and try the attached patch instead,
> which applies a rate limit to the console output?
Done, resulted in a much more reasonable message rate. Thanks!
In the general case, how would I map the "error source" on hardware?
Cheerio,
Hauke
-- 
The ASCII Ribbon Campaign Hauke Fath
() No HTML/RTF in email Institut für Nachrichtentechnik
/\ No Word docs in email TU Darmstadt
Respect for open standards Ruf +49-6151-16-21344