NetBSD-Bugs archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: kern/53562: bridge(4) breaks segmentation / TX checksum offloading
The following reply was made to PR kern/53562; it has been noted by GNATS.
From: Masanobu SAITOH <msaitoh%execsw.org@localhost>
To: Rin Okuyama <rokuyama%rk.phys.keio.ac.jp@localhost>, gnats-bugs%NetBSD.org@localhost,
kern-bug-people%netbsd.org@localhost, gnats-admin%netbsd.org@localhost, netbsd-bugs%netbsd.org@localhost
Cc: msaitoh%execsw.org@localhost
Subject: Re: kern/53562: bridge(4) breaks segmentation / TX checksum
offloading
Date: Fri, 14 Sep 2018 20:26:20 +0900
On 2018/09/14 20:17, Masanobu SAITOH wrote:
> On 2018/09/12 21:33, Rin Okuyama wrote:
>> Hi, sorry for the delay in the response.
>>
>> On 2018/09/10 16:34, Masanobu SAITOH wrote:
>> ...
>>> On 2018/09/08 8:50, Rin Okuyama wrote:
>> ...
>>>> Â The following would be a simplest
>>>> Â example where your patch does not work:
>>>> Â - NetBSD/amd64-currnet with wm0 connected to LAN
>>>> Â - qemu-3.0.0nb2 installed from pkgsrc
>>>> Â - setup tap and bridge as follows:
>>>> Â Â Â Â Â Â # ifconfig tap0 create up
>>>> Â Â Â Â Â Â # ifconfig bridge0 create
>>>> Â Â Â Â Â Â # brconfig bridge0 add wm0 add tap0 up
>>>> Â - run NetBSD/amd64 8.0 on QEMU with tap enabled:
>>>> Â Â Â Â Â Â # qemu-system-x86_64 -m 128m -cdrom NetBSD-8.0-amd64.iso -boot d \
>>>> Â Â Â Â Â Â Â Â -display curses -net tap,fd=3 -net nic 3<>/dev/tap0
>>>> Â Then, the virtual host can send and receive small packets, e.g.,
>>>> Â ping works well regardless of whether TSO is enabled or not for wm0.
>> [quotation added by RO from the previous message of mine]
>>>> Â However, it cannot receive larger packets when TSO is enabled, e.g.,
>>>> Â file cannot be retrieved through ftp. If TSO is turned off, everything
>>>> Â works fine.
>> [quotation done]
>>>
>>> Â 0)
>>> Â Is this "wm0" on the guest or the host?
>>> Â If my understanding is correct, writing packet via tap on host is not related
>>> Â to TSO...
>>
>
> I realized that this problem is not only for TSO but for all outgoing offload flags.
>
>
>> Sorry for bothering you many times. This "wm0" is for the host. Also,
>> in this example, the guest fails to receive data from the ftp server
>> working on the host. I forgot to test servers other than the host. It
>> worked fine; only connecting from the guest to host fails.
>
> I see. I could reproduce the same problem with Xen dom0 & domU by adding only ip4csum
> flag to wm0.
>
>>> Â 1)
>>> Â If you're using wm(4) on the guest, please try vioif(4) and see if the problem
>>> occurs or not.
>>
>> I tried vioif(4) on the guest but the situation does not change.
>>
>>> Â 2)
>>> Â For checksum oofloading, the following kernel options will help for debugging:
>>> Â Â Â Â INET_CSUM_COUNTERS
>>> Â Â Â Â TCP_CSUM_COUNTERS
>>> Â Â Â Â UDP_CSUM_COUNTERS
>>>
>>> Â These options add some event counters:
>> ...
>>>
>>> Â 3)
>>> Â wm(4) has a lot of event counters, some of them are useful for debugging.
>>> Â To use them add "options WM_EVENT_COUNTERS" to your kernel config file.
>>
>> Thanks! I enabled these options.
>>
>>> Â 4)
>>> Â For TSO, current wm(4) is not perfect because it doesn't use m_defrag().
>>> Â I have not-yet-committed patch:
>>>
>>> Â Â Â Â http://www.netbsd.org/~msaitoh/wm-defrag-20180906-0.dif
>>>
>>>> % vmstat -ev | grep toomany
>>>> wm0 txq00txtoomanyseg                                     0   0 misc
>>>
>>> Â If "wmX txqYYtxtoomanyseg" is increasing, the above diff would fix the
>>> problem.
>>
>> I tried if_wm.c r1.587, which already contains this patch. Counter
>> "wmX txqYYtxtoomanyseg" was not increased.
>>
>> I examined how packets flow in the example above, from the ftp server
>> on host (wm0 on host) to the ftp client on guest (tap0 on host ->
>> vioif0 on guest):
>>
>> tcp_output()
>> -> ip_output()
>> -> ip_if_output()
>> -> if_output_lock()
>> -> ifp->if_output = ether_output()
>> -> bridge_output()
>> -> bridge_enqueue()
>>
>> When TSO4 is enabled for wm0 on the host, tcp_output() attempts to send
>> packets larger than MTU. If the destination interface is wm0, i.e.,
>> dst_ifp == ifp in bridge_output(), there's no problem with your patch
>> for if_bridge.c; packets are segmented automatically by the HW. However,
>> in this case, the destination is tap0, and bridge_enqueue() send packets
>> larger than MTU for tap0. This is, of course, illegal, since there is no
>> HW by which packets are segmented. The guest on QEMU, therefore, cannot
>> receive packets. This is the scenario how the failure occurs in my
>> example of QEMU above.
>>
>> The similar problems occur when one of other TX offload options, TSO6,
>> IP4CSUM-TX, TCPnCSUM-TX, or UDPnCSUM-TX, are enabled. For example,
>> consider the case of IP4CSUM-TX. When this option is enabled for wm0
>> on the host, checksumming is omitted in ip_output(). Then, when the
>> destination is tap0, packets with wrong checksum are sent.
>>
>> This patch disable TX offload when the interface is added to bridge:
>>
>> http://www.netbsd.org/~rin/disable_tx_offload_when_bridged_20180912.patch
>
> I don't know if it's good to doing in L3...
>
> It seems that both FreeBSD and DragonFly clear all bridge member's outgoing
> offload flags when adding a member to a bridge. I also don't know if
> this is good solution or not...
>
> Â My diff have been committed as if_bridge.c rev. 1.156 now.
>
>> Module Name:Â Â Â src
>> Committed By:Â Â Â msaitoh
>> Date:Â Â Â Â Â Â Â Fri Sep 14 11:05:09 UTC 2018
>>
>> Modified Files:
>> Â Â Â Â src/sys/net: if_bridge.c
>>
>> Log Message:
>> Â Fix a bug that bridge_enqueue() incorrectly cleared outgoing packet's offload
>> flags. bridge_enqueue() is called from bridge_output() when a packet is
>> spontaneous. Clear csum_flags before calling brige_enqueue() in
>> bridge_forward() or bridge_broadcast() instead of in the beginning of
>> bridge_enqueue().
>>
>> Note that this change doesn't fix a problem on the following configuration:
>>
>> Â Â Â Â A bridge has two or more interfaces.
>>
>> Â Â Â Â An address is assigned to an bridge member interface and
>> Â Â Â Â some offload flags are set.
>>
>> Â Â Â Â Another interface has no address and has no any offload flag.
>>
>> XXX pullup-[78]
>>
>>
>> To generate a diff of this commit:
>> cvs rdiff -u -r1.156 -r1.157 src/sys/net/if_bridge.c
>
>
>
>> With this patch, the guest on QEMU can retrieve files from the host
>> without problem, even if TX offload options are enabled, which supports
>> the discussion above. No need to modify if_bridge.c from the original
>> (r1.156) in this case.
>>
>> I don't know this patch is preferable; there should be better ways to
>> handle the problem. Packets need to be segmented or checksummed only
>> if the destination in bridge_output() is different from the original
>> interface. I will examine further if needed.
>>
>> Thanks,
>> rin
>
>
FYI:
http://www.netbsd.org/~msaitoh/packetflow017.pdf
--
-----------------------------------------------
SAITOH Masanobu (msaitoh%execsw.org@localhost
msaitoh%netbsd.org@localhost)
Home |
Main Index |
Thread Index |
Old Index