tech-kern archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index][Old Index]
Re: DRM/KMS: vmwgfx driver is now available
> Date: Sun, 16 Jul 2023 04:50:35 +0900
> From: PHO <pho%cielonegro.org@localhost>
>
> On 7/16/23 04:19, Taylor R Campbell wrote:
>
> > 1. Did you reimplement an allocator for the dma_pool_* API? Should it
> > use vmem(9) instead or can that not handle the requests that it
> > needs?
>
> Yes I reimplemented it but it's probably far from efficient. And I
> didn't know the existence of vmem! Lol, I should have used it.
Could I talk you into converting it to use vmem(9), so we have fewer
allocators to debug? Should be pretty easy, there's a couple other
examples of vmem(9) in drm already: drm_vma_manager.c,
drm_gem_cma_helper.c.
For that matter, do we even need a new allocator here? Can we just do
a new bus_dmamem_alloc/load for each one?
Or, how important is this API? It looks like there are two calls to
dma_pool_zalloc in vmwgfx. Maybe it would be simpler to patch them
away so we don't have to spend a lot of time matching all the
semantics -- and tracking any changes in updates later -- for exactly
two users.
> > 2. What's up with the restore_mode property? Does anything set it?
> > What's the protocol? Why isn't this needed by all the other
> > drivers, which can already gracefully handle exiting X?
>
> vmwgfxfb sets it and vmwgfx uses it. It's not needed as long as X
> gracefully exits, because it calls WSDISPLAYIO_SMODE ioctl to revert the
> mode back to WSDISPLAYIO_MODE_EMUL. But if X crashes or is killed by
> SIGKILL, the only way to perform a cleanup is to do something in the
> drm_driver::lastclose callback. That's why.
But why isn't this needed by all the other drivers?
> > 3. Why all the logic in vmwgfxfb.c duplicating genfb? Why not just
> > use genfb via drmfb, like the other major drm drivers use (i915,
> > radeon, nouveau, amdgpu)?
>
> Because vmwgfx does not, and can not use drm_fb_helper. The helper
> assumes that you can just allocate a framebuffer in VRAM and map it to a
> virtual address. The VMware GPU appears to support this operation mode
> too, but the vmwgfx driver isn't capable of doing it. So I had to
> allocate a framebuffer in the system memory and ask the driver to upload
> dirty areas to the GPU every time something changes in the buffer.
What a nightmare!
Is there a way we could just use or add a simple hook in genfb or
rasops for when there's dirty areas to update, so we don't have to
copy & paste all of that cruft from genfb and drmfb?
It looks like a painful maintenance burden; maintaining genfb on its
own is bad enough, and there's a lot of API cleanup warranted in
genfb, rasops, and wscons.
Some review comments:
linux/dmapool.h:
- Prefer out-of-line dma_pool_zalloc -- not a substantial optimization
opportunity, less to rebuild if the header file changes. (Early on
I put a lot of static inlines in header files, but unless they're
just trivially wrapping a NetBSD API, I think that is a mistake.)
- `linux_' namespace for dma_pool_sync, not because the API came from
Linux but to avoid defining non-`linux_' symbols in linux.kmod.
linux_dmapool.c:
- Here and everywhere else, write `struct dma_page *foo' and not
`struct dma_page* foo'.
(Syntactically, `struct dma_page' is the type specifier of a
declaration, and `*foo' is the declarator -- the grouping `struct
dma_page* foo' is misleading, e.g. `struct dma_page* foo, bar' does
_not_ declare both foo and bar with the same `struct dma_page*'
type.)
- Four-space continuation lines, not alignment with `(':
err = bus_dmamap_create(dmat, page_size, 1, page_size, 0,
BUS_DMA_WAITOK, &page->dmam);
(in NetBSD code, that is; don't change upstream code to conform, of
course).
- Block comments like this:
/*
* foo bar baz quux zot
* lorem ipsum
*/
vmwgfx_cmdbuf.c:
- No need to ifdef around bus_addr_t vs dma_addr_t -- dma_addr_t is a
typedef alias for bus_addr_t.
- Write
if (...) {
...
} else {
...
}
with no extra line break before else.
- In:
/* Pretend we just completed a DMA. The device sometimes updates
* the status field synchronously without an IRQ, and we need our
* CPU to realize that. */
Seems suspicious. Do you mean that header->cb_header->status gets
updated but the device doesn't subsequently deliver an interrupt?
- In:
#if !defined(__NetBSD__)
/* This function is called with man->lock already held. I have no
* idea why the original code tries to lock it twice. It will never
* work. */
spin_lock(&man->lock);
#endif
That's my bug, sorry, from incompletely converting vmwgfx_cmdbuf.c
to drm_waitqueue_t. Just delete the spin_lock call here, don't
#ifdef it out.
- In:
#if defined(__NetBSD__)
/* XXX: This is the only mode we support at the moment. The other
* modes will cause the driver to panic() because we don't have
* ttm_pool_populate(). */
dev_priv->map_mode = vmw_dma_alloc_coherent;
Can you make vmw_ttm_populate use ttm_bus_dma_populate more or less
unconditionally like all the other drivers do?
vmwgfx_drv.h:
- Why is vmw_ttm_fault_reserve_notify here? Why isn't it static to
vmwgfx_ttm_buffer.c?
vmwgfx_irq.c:
- In:
#if defined(__NetBSD__)
static irqreturn_t vmw_irq_handler(void *arg)
#else
static irqreturn_t vmw_irq_handler(int irq, void *arg)
#endif
use DRM_IRQ_ARGS.
- In:
dev->driver->irq_handler = vmw_irq_handler;
Why is this not statically initialized? dev->driver should really
be const so the compiler rejects this.
vmwgfx_page_dirty.c:
- `const struct ttm_buffer_object *bo',
not `struct ttm_buffer_object const* bo'
- In:
return ttm_bo_uvm_fault(ufi, vaddr, pps, npages, centeridx,
access_type, flags);
Do we eve have to define vmw_bo_vm_fault? Let's just set
vmw_uvm_ops.pgo_fault = ttm_bo_uvm_fault, to keep it simpler?
vmwgfx_resource.c:
- In:
/* This looks wrong, doesn't it? But it's actually what the
* original Linux code does. The tree is expected to be able to
* hold multiple nodes with the same index. */
if (a->backup_offset < b->backup_offset)
return -1;
else
return +1;
I suggest making this a total order by picking some other
distinguishing characteristic; otherwise rbtree(3) might get
confused. If nothing else, you can use (uintptr_t)a and
(uintptr_t)b (but preferably something else so it doesn't depend on
kernel virtual address ordering).
- In:
#if !defined(__NetBSD__)
struct rb_node **new = &backup->res_tree.rb_node, *parent = NULL;
#endif
Instead of ifdefing this out, put __USE(new); __USE(parent) in the
existing ifdef, to keep the diff smaller.
- In:
#if defined(__NetBSD__)
struct rb_node *node = rb_first(&vbo->res_tree);
#else
struct rb_node *node = vbo->res_tree.rb_node;
#endif
No ifdef here: rb_first is from Linux rbtree.h API. Same with other
uses of rb_first.
- In:
static dma_addr_t __vmw_piter_dma_addr(struct vmw_piter *viter)
{
#if defined(__NetBSD__)
/* XXX: Is this... really? Really what we need to do? */
struct vm_page* const page = &viter->pages[viter->i]->p_vmp;
paddr_t const paddr = VM_PAGE_TO_PHYS(page);
bus_addr_t const baddr = PHYS_TO_BUS_MEM(viter->dmat, paddr);
return baddr;
Seems wrong -- surely there should be a bus dma map here to use?
vmwgfx_validation.c:
- In:
#if defined(__NetBSD__)
/* vmw_validation_res_node::private is defined as "unsigned long
* private[0]". The struct has essentially a variable length. I believe
* it's a wrong way to do it, and our <libkern.h> dislikes type-mismatched
* use of container_of() against it. */
# define __LGTM_BOT__ 1
#endif
Maybe just patch the member to be `void *private[]' instead of
`unsigned long private[0]'?
- Instead of:
- struct ttm_operation_ctx ctx = {
+ struct ttm_operation_ctx tctx = {
Just add -Wno-shadow to the flags for vmwgfx_validation.c.
- Instead of:
#if defined(__NetBSD__)
# include "vmwgfx_page.h"
/* Rewrite "struct page" to "struct vmw_page". Can't do it with a typedef. */
# define page vmw_page
static inline struct vmw_page*
alloc_page(gfp_t gfp) {
return vmw_alloc_page(gfp);
}
I suggest just patching the one call to alloc_page/__free_page.
Trying to mimic the semantics of Linux's page allocator is likely to
be a worse maintenance burden than just patching the one call here,
especially if Linux expects any guaranteed correspondence between
pages and DMA addresses. I see only 2-3 references to `struct page'
in the file, and both in local variables, so it's probably not that
big a deal to just patch it to use uvm_km here.
linux/rbtree.h:
- I suspect that it may actually be more harm than good to provide a
definition of RB_CLEAR_NODE, because use of it usually indicates
that Linux expects to be able to discern whether a node is in the
tree or not by examining the node itself, and that requires patching
to guarantee, like in nouveau_nvkm_core_object.c.
linux_dma_fence.c:
- In:
- KASSERT(dma_fence_referenced_p(fence));
+ dma_fence_assert_alive(fence);
These are not the same -- why the change?
linux_sgt.c:
- In:
bus_addr_t const baddr = PHYS_TO_BUS_MEM(piter->sg->sg_dmat, paddr);
I don't think this can be right, although it might work by acident
with vmwgfx. What's this trying to accomplish? Generally nothing
should ever use PHYS_TO_BUS_MEM outside the implementation of
bus_dma(9) to populate a bus dma map. Page addresses are not
guaranteed to have any corresponding bus dma map.
In general, I'm not sure we can usefully implement the sg page
iteration business without a bus dma map having been loaded in a
patch that obviates the need for the sg page iteration business
anyway.
Home |
Main Index |
Thread Index |
Old Index