tech-net archive
re: COMPAT_50 vs NET_RT_IFLIST
On Tue, 30 Apr 2019, Paul Goyette wrote:
And another data point:
The failures (``ifconfig -l'' and ``route monitor'') do NOT occur when
running on a 7.0 base system.
So it would seem that the problem is specifically with the compat_50
code, and was introduced between 7.0 and 8.0.
OK, so armed with these two data points (7.0 ==> GOOD, 8.0 ==> BAD) I
was able to run a bisect to identify the culprit.
Sources from 2016-09-21 at 10:00:00 UTC ==> GOOD
Sources from 2016-09-21 at 19:18:10 UTC ==> BAD
There are several commits during this time window, but the build was
broken for various reasons for several hours (as shown by the babylon5
test logs). The only commits that seem relevant are those which start
with the following:
Module Name: src
Committed By: roy
Date: Wed Sep 21 10:50:23 UTC 2016
Modified Files:
src/share/man/man4: route.4
src/sys/compat/common: Makefile
src/sys/compat/net: if.h route.h
src/sys/net: if.h route.h rtsock.c
src/sys/rump/net/lib/libnet: Makefile
src/sys/sys: socket.h
Added Files:
src/sys/compat/common: rtsock_70.c
Log Message:
Add ifam_pid and ifam_addrflags to ifa_msghdr.
Re-version RTM_NEWADDR, RTM_DELADDR, RTM_CHGADDR and
NET_RT_IFLIST. Add compat code for old version.
Roy, can you please look into this further? Thanks!
Note that the breakage for the 5.2 version of ``ifconfig -l'' began
with this commit, yet the 5.2 version of ``route monitor'' continues
to produce "reasonable" looking results.
# chroot /chroot52 route monitor &
# ifconfig lo0 alias 1.2.3.4
RTM_ONEWADDR
got message of size 152 on Wed May 1 04:43:57 2019
RTM_ADD: Add Route: len 152, pid 463, seq 0, errno 0, flags: <UP,HOST> locks: inits:
sockaddrs: <DST,GATEWAY>
1.2.3.4 lo0
The ``route monitor'' starts failing to function correctly at some time
after 2016-09-21 19:18:10 UTC (it definitely fails as of 2017-05-27
00:00 UTC).
Hopefully this narrows things enough for someone familiar with the
rtsock stuff to help us make some forward progress.
On Tue, 30 Apr 2019, Paul Goyette wrote:
Some additional testing (on a -current base system) shows that the
problem is almost certainly related to compat_50 code. Using the
ifconfig from 6.0 or newer does not display the problem.
Also, the issue is probably wider than just the sysctl stuff, since
running a 5.2 version of ``route monitor'' produces no output when
adding or changing an address on lo0; the 6.0 version of route
monitor produces correct output.
Furthermore, previous testing shows that the problem also occurs on
an 8.0 base system with 5.2 userland. (I have not tested a 7.0 base
system.)
On Mon, 29 Apr 2019, Paul Goyette wrote:
Alas, making the suggested changes does not help. Same results as
before:
Userland and Kernel both -current with suggested changes (the diffs
are attached to this Email):
# ifconfig -l
wm0 lo0
# ifconfig lo0
lo0: flags=0x8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33624
inet 127.0.0.1/8 flags 0x0
inet6 ::1/128 flags 0x20<NODAD>
inet6 fe80::1%lo0/64 flags 0x0 scopeid 0x2
#
And with a 5.2 base system loaded in /chroot52 directory:
# chroot /chroot52 ifconfig -l
# chroot /chroot52 ifconfig lo0
#
On Mon, 29 Apr 2019, matthew green wrote:
I still cannot explain how things got broken between 5.2 and 8.0. I
will defer to those who are more expert in this area than am I. My
suspicion is that the breakage is related to sys/socket.h rev 1.99
which versioned AF_{,O}ROUTE for some 64-bit cleanliness.
i think i have a guess about the problem.
sys/net/if.h, sys/net/route.h, and sys/compat/net/if.h all
have this code:
/*
* Message format for use in obtaining information about interfaces from
* sysctl and the routing socket. We need to force 64-bit alignment if we
* aren't using compatiblity definitons.
*/
#if !defined(_KERNEL) || !defined(COMPAT_RTSOCK)
#define __align64 __aligned(sizeof(uint64_t))
#else
#define __align64
#endif
struct if_msghdr {
u_short ifm_msglen __align64;
but i think this comment is wrong.
the compat structures are defined in the compat headers and
the above structure should never change, however when the
compat handling code wants to talk to the *real* structure it
will get this adjusted one (without the align), and thus
it will copy the wrong portions out from it.
the fix may be as simple as removing this from these headers
(leaving it always defined for the current defs), and making
sure that the compat headers have the right alignment (my
quick look seems ok.)
this will, obviously, need a recompile of the newer kernel.
.mrg.
+--------------------+--------------------------+-----------------------+
| Paul Goyette | PGP Key fingerprint: | E-mail addresses: |
| (Retired) | FA29 0E3B 35AF E8AE 6651 | paul%whooppee.com@localhost |
| Software Developer | 0786 F758 55DE 53BA 7731 | pgoyette%netbsd.org@localhost |
+--------------------+--------------------------+-----------------------+