tech-net archive
Looking for help diagnosing dhcpcd problem
Hi, tech-net!
I've had an issue with dhcpcd on an amd64 machine running NetBSD 10 for
months and I'd like to ask for help diagnosing it.
Some history:
This has happened with two different machines at one location and with
two different ISPs. The setups share only the bulk of dhcpcd.conf;
otherwise:
One machine ran with this setup for years, and only in the last year or so
did I notice dhcpcd dying. I then added a crontab entry for
"/etc/rc.d/dhcpcd start" to run every half hour or so.
This was an older AMD machine with Realtek re ethernet interfaces for both
public Internet and for local, NATed network.
I then switched to a newer AMD system and used a dual rge card for both
Internet and LAN. However, there are issues [1] with rge, so I bought a
dual Broadcom (bge) card.
We then switched ISPs from Frontier to Spectrum, but the problem still
occurs. Frontier doesn't offer IPv6, so IPv6 was only turned on recently,
but that didn't affect the dying of dhcpcd.
Interestingly, I have another machine, identical to the first machine used
here, running the exact same dhcpcd.conf on Optimum, which also doesn't
provide IPv6. I've left the IPv6 lines in place there because that machine
is a few thousand miles away and I don't want to take any chances of it
becoming problematic.
Here's the dhcpcd.conf in current use (comment lines removed):
hostname sage.zia.io
duid
persistent
require dhcp_server_identifier
option rapid_commit
option domain_name_servers, domain_name, domain_search
option classless_static_routes
option interface_mtu
slaac hwaddr
nohook resolv.conf
noipv6rs # disable router solicitation
allowinterfaces bge0
interface bge0
timeout 360 # Wait up to six minutes (time for cable modem to boot)
waitip 4
ipv6rs # enable router solicitation
ia_na 1 # request an IPv6 address
ia_pd 2 bge1/0 # request a PD and assign it to bge1
There's /etc/dhcpcd.exit-hook which is needed for Internet to work when
the address changes:
#!/bin/sh
case "$interface" in
lo[0-9]* | tun[0-9]*) exit;;
esac
/etc/rc.d/npf reload
There's nothing meaningful in the logs - when dhcpcd dies, it does so
silently, so I ran it like so:
ktrace /sbin/dhcpcd -B -d -M -f /etc/dhcpcd.conf
After waiting several days, it died, and I now have a 2,176,126-line
ktrace.out ;) The last 2000 lines are here:
https://www.klos.com/~john/ktracedhcpcd.log
I see plenty of "Too many open files" messages near the end, even though
a typical running dhcpcd process has fewer than 70 open file handles,
even after days (on the system where it exits) or months (on the system
that's far away). The system where it exits has kern.maxfiles = 16384, so
that's not the issue.
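For what it's worth, "Too many open files" is the message for EMFILE
(errno 24), the per-process descriptor limit, whereas the system-wide
limit governed by kern.maxfiles reports ENFILE, "Too many open files in
system". A rough sketch for tallying which system calls hit that limit in
the trace, assuming kdump prints return lines of the form
"RET <syscall> -1 errno 24 ..." (tally_emfile is a made-up helper name):

```shell
# Tally which system calls returned errno 24 (EMFILE) in kdump text output.
# Reads kdump output on stdin. The "RET" column is located by scanning the
# fields, so the exact number of leading columns does not matter.
# Assumption: return lines look like "... RET open -1 errno 24 Too many ...".
tally_emfile() {
    awk '/errno 24 / {
        for (i = 1; i < NF; i++)
            if ($i == "RET") { count[$(i + 1)]++; break }
    }
    END { for (s in count) print count[s], s }' | sort -rn
}
```

It would be used as "kdump -f ktrace.out | tally_emfile". Comparing the
result with "ulimit -n" in the environment that starts dhcpcd might show
whether the per-process limit is the one being reached.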
Also, this appears to happen around the time the ISP would be issuing
either a new lease or a new IP address.
Also, running dhcpcd with -d showed lots and lots of lines like these:
bge0: Router Advertisement from fe80::201:5cff:fe6b:4846
bge0: executing: /libexec/dhcpcd-run-hooks ROUTERADVERT
Reloading NPF ruleset /etc/npf.conf
This happens every five seconds or so, which seems... excessive.
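The exit-hook shown earlier reloads NPF for every event on bge0, which
would include each ROUTERADVERT. A minimal sketch of a variant that only
reloads when an address or delegated prefix may have changed, using the
$reason variable that dhcpcd-run-hooks exports. The list of reasons that
should trigger a reload is an assumption and may need adjusting (SLAAC
address changes arrive via ROUTERADVERT, for example), and should_reload
is a made-up helper name:

```shell
#!/bin/sh
# Sketch of /etc/dhcpcd.exit-hook: reload NPF only for selected reasons
# instead of on every event (such as each Router Advertisement).

should_reload() {
    # $1 = reason, $2 = interface (both exported by dhcpcd-run-hooks)
    case "$2" in
    lo[0-9]* | tun[0-9]*) return 1;;    # same exclusions as the original hook
    esac
    case "$1" in
    # Assumed set of reasons that can change an address or prefix.
    BOUND | RENEW | REBIND | REBOOT) return 0;;
    BOUND6 | RENEW6 | REBIND6 | REBOOT6 | DELEGATED6) return 0;;
    esac
    return 1                            # ROUTERADVERT and the rest: skip
}

if should_reload "$reason" "$interface"; then
    /etc/rc.d/npf reload
fi
```

This would not explain dhcpcd exiting by itself, but it should cut the
number of hook-triggered NPF reloads while the real cause is being
tracked down.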
Does anyone have any thoughts, suggestions or observations about what I
might be doing wrong, or what I could try differently?
Thanks!
John
[1] https://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=58047
https://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=57694
https://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=57972