tech-net archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: A new network interface send queue API
> On Nov 12, 2023, at 7:51 AM, Greg Troxel <gdt%lexort.com@localhost> wrote:
>
> Jason Thorpe <thorpej%me.com@localhost> writes:
>
> I have only quibbles and the odds are good it's more an explication than
> real, or that the fine points I bring up are not worth it.
>
>> ==> void ifq_fini(struct ifqueue *);
>> Finalize and tear down an interface queue. For ifp->if_snd, this is done
>> for you in if_detach().
>
> Presumably this can sleep and involves deleting all packets and waiting
> for locks.
Yes, it can block to acquire locks and purges the queue.
>> ==> int ifq_put(struct ifqueue *, struct mbuf *);
>> Put a packet into the interface queue. This is the equivalent of the
>> old IFQ_ENQUEUE(). Returns 0 on success or an error code indicating
>> the mode of failure (ENOBUFS is the queue is full). This function always
>> consumes the packet (either places it in the queue or frees it if an
>> error occurs).
>
> I do not see a 'is queue full', and that feels like a top-level
> architectural decision that we don't want queueing or flow control ahead
> fo the queue. But we could add a function later (put that didn't free
> on error?) later if necessary and probably it isn't.
The old interface doesn’t have one explicitly, and I don’t think such a thing exists for ALTQ (I’d have to dive a little deeper into that code), so I didn’t add an explicit function to check. There’s nothing that prevents one from being added in the future, although of course such a function has a TOCTOU problem, so it is of questionable utility. Certainly a “try” flavor of put could be possible if deemed useful / necessary later.
>> ==> struct mbuf *ifq_stage(struct ifqueue *);
>> Stage a packet for output. This is the rough equivalent of IFQ_POLL(),
>> but it makes some additional guarantees. Namely, ifq_stage() is guaranteed
>> to return the same mbuf each time it is called, no matter what queueing
>> discipline the ifqueue uses, until one of 3 things happens: the packet
>> is committed, aborted, or re-staged (see below). ifq_get() (see below)
>> will also return the currently-staged packet before dipping into the
>> queue, if one exists. As noted above, if the ifqueue state is
>> IFQ_STATE_BUSY and there are no more packets (staged or otherwise)
>> in the queue, ifq_stage() will atomically set the state to IFQ_STATE_READY
>> and return NULL.
>
> I don't like 'stage' as a word, but I realize that's a messy
> discussion. The point here is two-fold
>
> The queue must select the next packet, if it hasn't. For more than
> FIFO this could be complicated (priority, rate limits by class), but
> it's work that needs to be done. Once selected that choice is frozen.
>
> Return a pointer to the packet which can be read, but consider that
> the packet is still in the queue, both in terms of the queue owning
> the storage, and in terms of the packet counting towards logical queue
> occupancy.
Yes, I could certainly change the description to avoid the “dip into the queue” wording that I used. But yes, once you stage a packet, it is THE packet that will be returned until consumed.
> To me the point is "ifq_peek" with a rule that once a packet has been
> passed to peek, it's fixed.
I don’t like the term “peek” because “peek” implies lack of side-effects. “Stage” specifically takes an action (it’s not really a side-effect at all).
> I would then want to see
>
> ifq_peek_reset (bad name)
This is essentially IF_PREPEND(), except there’s no way to do this in ALTQ, so I avoided it, for now.
> to tell the queue: you are relieved of the requirement to return the
> same packet. But maybe there is no need.
>
>> ==> void ifq_restage(struct ifqueue *, struct mbuf *);
>> "Re-stage" the currently-staged packet with a new one. This is the
>> replacement for prior uses of IF_PREPEND(). A packet must already
>> be staged in the ifqueue. Once the new packet has taken the place
>> of the old one, the old packet will be freed. It is OK to call
>> ifq_restage() with the pointer to the currently-staged packet; this
>> case is detected and treated as a no-op.
>
> This allows the driver to change the contents. I don't understand why
> you'd want to do this, vs just say "sorry, I peeked and I can't cope
> now".
There are a couple of reasons:
1- The driver needs to de-frag the packet to handle alignment / DMA segment count constraints.
2- The driver needs to prepend some encapsulating header or something like that.
…but after that processing was done, there was some other temporary resource constraint, and so the driver would like to try again later, but without having to re-do the work it’s already invested into the packet.
Existing drivers do this using IF_PREPEND(). Sure, some driver-specific mechanism could be devised, but why not just let the send queue deal with it? It’s simple enough.
>> ==> struct mbuf *ifq_get(struct ifqueue *);
>> Get a packet from the queue. This is the equivalent of the old IFQ_DEQUEUE(),
>> with the caveat that if a packet has been staged, it will be returned before
>> ifq_get() dips into the queue. ifq_get(), like ifq_stage(), will perform the
>> state transition from IFQ_STATE_BUSY to IFQ_STATE_READY, under the same
>> conditions.
>
> I don't like this language about "into the queue". I guess there is a
> fundamental issue not clear about whether the staged packet is in the
> queue or it isn't. Presumably ifq_fini would free it, so I say it's in.
Yes, I can change that wording.
> This staging concept is a requirement on queueing disciplines to have
> the ability to pick a next packet within the queue and maintain that
> choice. Clealry one can do this with a queue and an on-deck slot, but
> if you're tols the queue is 10 packets, that's a 9 packet queue and the
> on-deck slot.
The current implementation has an on-deck slot, and that slot is not counted against the maximum queue length. I could change that to make the on-deck slot have cost if people feel strongly about it.
>> ==> void ifq_commit(struct ifqueue *);
>> Commits the currently-staged packet, freeing up the staging area for another
>> packet. The ifq_stage() / ifq_commit() combination is the rough equivalent
>> of the existing IFQ_POLL() / IFQ_DEQUEUE() pattern. The caller is responsible
>> for freeing the packet once the transmission has completed or the mbuf
>> is otherwise no longer needed. Typically, for a network interface that is
>> doing DMA directly from the mbuf, the packet will be freed in the interrupt
>> handler.
>
> How is this different from just calling ifq_get? Is it an error to
> ifq_stage
> ifq_get
> or is that ok? It could be ifq_peek_consume() to be clearly the 2nd
> half of peek where you take it.
You could call ifq_get() after ifq_stage(), but I wanted to have what I see as 2 distinct and clear patterns for usage:
1- stage + commit/abort
2- get
>
> Presumably ifq_commit time is the timestamp that should be used for
> updating rate-based queueing data, as the sent time. Even though which
> packets were eligible was assessed at _stage time.
>
> This says "frees up the staging area for another packet". Clearly a
> queueing discipline can make decisions when it wants. Perhaps instead
>
> Removes the currently-staged packet from the queue. An
> ifq_stage/ifq_commit pair is equivalent to ifq_get. After ifq_commit
> the staging area is empty. (A queue must choose a packet for the
> staging area when ifq_stage is called again, and is of course free to
> make decisions at other times.)
Sure. That seems fine. ALTQ doesn’t currently have those decision points, though, and I don’t plan to add them. Left as an exercise for the reader :-)
>> ==> void ifq_abort(struct ifqueue *);
>> This function is intended to be used when transmission of the packet has
>> encountered a fatal error. The packet is removed from the staging area
>> and freed.
>
> Do we expect this? Is this "something is wrong with this packet?" vs
> resources? How is this different from ifq_commit?
“Commit” transfers ownership of the packet to the caller, which is now responsible for freeing the packet when they’re done using it. “Abort” clears the staging area and frees the packet. It would be used if there’s something fundamentally wrong with the packet that makes it non-transmittable, yes.
-- thorpej
Home |
Main Index |
Thread Index |
Old Index