Hi Mathew,

On 23.05.23 15:11, Mathew, Cherry G. wrote:
MP> I came across Qemu/NVMM more or less out of necessity, as I had been struggling for some time to set up a proper Xen configuration on newer NUCs (UEFI only). The issue I encountered was with the graphics output on the virtual host, meaning that the screen remained black after switching from Xen to NetBSD DOM0. Since the device I had at my disposal lacked a serial console or a management engine with Serial over LAN capabilities, I had to look for alternatives and therefore got somewhat involved in this topic.

MP> I'm using the combination of NetBSD 9.3_STABLE + Qemu/NVMM on small low-end servers (Intel NUC7CJYHN), primarily for classic virtualization, which involves running multiple independent virtual servers on a physical server. The setup I have come up with works stably and with acceptable performance.

I have a follow-on question about this - Xen has some config tooling related to startup - so you can say something like 'xendomains = dom1, dom2' in /etc/rc.conf, and these domains will be started during bootup. If you did want that for nvmm, what do you use?
Unfortunately, I didn't find anything suitable and was in a big hurry to get the issue under control, so I wrote a quick-and-dirty shell script. It encapsulates the aspects of starting VMs from the command line and from an rc script, and creates the appropriate Unix domain sockets to serve the guest's serial terminal and the Qemu monitor console. If you want to have a look at it, I have uploaded it here (unfortunately without documentation, and with a big warning that it was all put together in a hurry):
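Roughly speaking, the core of what the script wraps could look like the sketch below. The VM name, pool layout and socket paths are only illustrative (they are not taken from vmctl itself), and it assumes a tap interface that has already been created and bridged on the host:

    #!/bin/sh
    # Illustrative sketch only - start one guest with NVMM acceleration
    # and expose its serial console and the Qemu monitor as Unix sockets.
    VM=net
    RUNDIR=/var/run/vmctl
    DISK=/dev/zvol/rdsk/tank/vm/${VM}     # assumed ZVOL path

    mkdir -p "${RUNDIR}"
    qemu-system-x86_64 \
        -accel nvmm \
        -smp 1 -m 1024 \
        -drive file=${DISK},format=raw,if=virtio \
        -netdev tap,id=net0,ifname=tap0,script=no,downscript=no \
        -device virtio-net-pci,netdev=net0 \
        -display none -daemonize \
        -chardev socket,id=con0,path=${RUNDIR}/${VM}.console,server=on,wait=off \
        -serial chardev:con0 \
        -chardev socket,id=mon0,path=${RUNDIR}/${VM}.monitor,server=on,wait=off \
        -monitor chardev:mon0

The serial console can then be attached to with something like "nc -U /var/run/vmctl/net.console" (or socat from pkgsrc), and an rc.d script only needs to loop over the configured VM names at boot and start each one the same way.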
https://forge.petermann-it.de/mpeterma/vmctl
MP> Scenario: I have a small root filesystem with FFS on the built-in SSD, and the backing store for the VMs is provided through ZFS ZVOLs. The ZVOLs are replicated alternately every night (full and incremental) to an external USB hard drive.

Are these 'zfs send' style backups? Or is the state on the backup USB hard drive ready for swapping in, should the primary fail, for example?
Yes, I use zfs send; for my regular backups I save the stream to files on the USB drive, so they are not directly usable. The idea is interesting, though - I chose this approach back then because I do it quite similarly on my FFS systems with dump, and the incremental aspect was important to me. On the other hand, I've also tested pulling a zfs send of all ZVOLs from the mini-server to my laptop and then playing around locally with Qemu/nvmm on a "production copy".
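In other words, something along these lines, with made-up pool and snapshot names:

    # Full stream of a ZVOL to a file on the USB disk
    zfs snapshot tank/vm/net@2023-05-22
    zfs send tank/vm/net@2023-05-22 > /mnt/backup/net-2023-05-22.full.zfs

    # Incremental stream relative to the previous snapshot
    zfs snapshot tank/vm/net@2023-05-23
    zfs send -i @2023-05-22 tank/vm/net@2023-05-23 \
        > /mnt/backup/net-2023-05-23.incr.zfs

    # A restore replays the streams in order into a (new) dataset
    zfs receive tank/restore/net < /mnt/backup/net-2023-05-22.full.zfs
    zfs receive tank/restore/net < /mnt/backup/net-2023-05-23.incr.zfs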
MP> There are a total of 5 VMs:
MP>   net  (DHCP server, NFS and SMB server, DNS server)
MP>   app  (Apache/PHP-FPM/PostgreSQL hosting some low-traffic web apps)
MP>   comm (ZNC)
MP>   iot  (Grafana, InfluxDB for data collection from two smart meters every 10 seconds)
MP>   mail (Postfix/Cyrus IMAP for a handful of mailboxes)

MP> Most of the time, the host's CPU usage with this "load" is around 20%. The provided services consistently respond quickly.

Ok - and these are accounted as the container qemu processes' quota scheduling time, I assume? What about RAM? Have you had a situation where the host OS has to swap out? Does this cause trouble? Or does qemu/nvmm only use pinned memory?
I configured the VMs' RAM so that a few hundred MB of buffer are left on the host. Memory has run out in the past, especially when zfs send makes heavy use of the buffer cache. Then swapping also occurred and, together with the I/O load already increased by zfs send, the system was slowed down so badly that the response times were no longer acceptable. In that state, only a restart of the host brought a complete recovery. I got this under control with a tip someone gave me in #netbsd: I now pipe the output of zfs send into dd, which has the oflag "direct" set and takes over writing the file. Obviously this bypasses some of the caching and avoids this situation.
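The relevant part of the backup pipeline now looks roughly like this (block size and file names are only illustrative):

    # Write the stream with O_DIRECT so it does not go through the
    # host's buffer cache and starve the VMs of memory
    zfs send -i @2023-05-22 tank/vm/net@2023-05-23 | \
        dd of=/mnt/backup/net-2023-05-23.incr.zfs oflag=direct bs=1m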
Regarding pinned memory I can't say anything - the memory consumption of the VMs is stable from the host's point of view, and I haven't really tried ballooning yet.
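If I wanted to experiment with it, my understanding is that it would roughly amount to something like this (untested sketch; it assumes the guest has a virtio balloon driver and that the monitor socket from the earlier example is in place):

    # Extra Qemu argument when starting the guest:
    #   -device virtio-balloon-pci
    #
    # Ask the guest to shrink to 512 MB, then check, via the monitor socket:
    printf 'balloon 512\n' | nc -U /var/run/vmctl/net.monitor
    printf 'info balloon\n' | nc -U /var/run/vmctl/net.monitor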
MP> However, I have noticed that depending on the load, the clocks of the VMs can deviate significantly. This can be compensated for by using a higher HZ in the host kernel (HZ=1000) and a tolerant ntpd configuration in the guests. I have also tried various settings with schedctl, especially with the FIFO scheduler, which helped in certain scenarios with high I/O load. However, this came at the expense of stability.

I assume this is only *within* your VMs, right? Do you see this across guest operating systems, or just specific ones?
The time deviation is caused by interrupts missed by the guests. As I said, there are a number of workarounds for this, and a number of very good explanations in this thread:
https://mail-index.netbsd.org/netbsd-users/2022/08/31/msg028894.html

I do not use operating systems other than NetBSD as guests in this setup. As a test, I also had various Linux distributions running under nvmm. I didn't do the tests in depth, but I had a test VM with Alpine Linux running for a while and had the impression that it ran as well as NetBSD.
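For illustration, the two workarounds mentioned above (higher HZ on the host, tolerant ntpd in the guests) would look roughly like this; the exact values are discussed in the linked thread and may well differ:

    # Host: custom kernel config (e.g. derived from GENERIC)
    options         HZ=1000

    # Guest: /etc/ntp.conf - keep ntpd from giving up on large offsets
    # ("tinker panic 0" is one common variant; treat it as an example)
    tinker panic 0
    server pool.ntp.org iburst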
MP> Furthermore, in my system configuration, granting a guest more than one CPU core does not seem to provide any advantage. Particularly in the VMs where I am concerned about performance (net with Samba/NFS), my impression is that allocating more CPU cores actually decreases performance even further. I should measure this more precisely someday...

I see - this is interesting - are you able to run some tests to nail this down more precisely?
I should definitely do that, and if you have a specific idea of what I should try, feel free to let me know. I think the observations from back then should also be seen in the context of my concrete system. Since I have only two CPU cores available, with effectively one Qemu process and one Qemu I/O thread running alongside it, both cores are already fully occupied under the full I/O load of a single VM. Therefore it does not seem so improbable to me that, in this setup, resources become scarce when another Qemu process (for the VM's second CPU) is added.
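A simple first measurement might be to run the same guest once with -smp 1 and once with -smp 2, compare a sequential write inside the guest and a large transfer over the Samba/NFS export, and watch the host in parallel; something like:

    # Inside the guest: sequential write to the virtio disk
    dd if=/dev/zero of=/var/tmp/ddtest bs=1m count=4096
    sync

    # On the host, in parallel: watch where the two cores saturate
    top
    vmstat 1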
MP> If you have specific questions or need assistance, feel free to reach out. I have documented everything quite well, as I intended to contribute it to the wiki someday. By the way, I am currently working on a second identical system where I plan to test the combination of NetBSD 10.0_BETA and Xen 4.15.

There are quite a few goodies wrt Xen in 10.0 - mainly you can now run accelerated as a Xen guest (HVM with the PV drivers active).
For now I only use "conventional" PV for my guest systems. But I also have a pure NetBSD setup (without Xen) here at the moment, and I'm curious about the comparison myself. Currently I have measured about 5 times the bandwidth with Xen on identical hardware when transferring a large file via Samba from a VM, minus all caching effects. This is my focus at the moment, because on the Xen system I use VNDs on FFS, while on the Qemu/nvmm system ZVOLs are in use. There are too many variables in the equation at the moment ;-)
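To make the comparison concrete, the two backing-store variants look roughly like this (names are again only illustrative):

    # Qemu/nvmm host: a ZVOL used as raw block backend
    zfs create -V 20G tank/vm/net
    #   ...passed to Qemu as:
    #   -drive file=/dev/zvol/rdsk/tank/vm/net,format=raw,if=virtio

    # Xen host: a file-backed disk image on FFS (served via vnd) in the
    # PV guest's xl config:
    #   disk = [ 'file:/data/vm/net.img,xvda,rw' ]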
Kind regards
Matthias