So I spent a few more hours on it, and here's the current patch I've come up with. With this patch applied, NetBSD-current works fine.
---
diff --git a/sys/arch/vax/vsa/lcg.c b/sys/arch/vax/vsa/lcg.c
index 0178c069cb08..0808b24f4faa 100644
--- a/sys/arch/vax/vsa/lcg.c
+++ b/sys/arch/vax/vsa/lcg.c
@@ -438,6 +438,9 @@ lcg_match(struct device *parent, struct cfdata *match, void *aux)
if ((vax_boardtype != VAX_BTYP_46) && (vax_boardtype != VAX_BTYP_48))
return 0;
+ if (vax_siedata & 0x1)
+ return 0; /* is microvax */
+
*ch = 1;
if ((*ch & 1) == 0)
return 0;
@@ -457,6 +460,9 @@ lcg_attach(struct device *parent, struct device *self, void *aux)
struct vsbus_attach_args *va = aux;
struct wsemuldisplaydev_attach_args aa;
+ if (vax_siedata & 0x1)
+ return; /* is microvax */
+
printf("\n");
aa.console = lcgaddr != NULL;
@@ -956,6 +962,9 @@ lcgcnprobe(struct consdev *cndev)
if (vax_confdata & 0x100)
return; /* Diagnostic console */
+ if (vax_siedata & 0x1)
+ return; /* is microvax */
+
lcg_init_common(NULL, NULL);
/* Set up default LUT */
---
This patch makes lcg consistent with how we detect MicroVAX vs. VAXstation in locore.c. OpenBSD has a different approach to this problem, but that's a longer discussion:
Matthew Green brought up OpenBSD's lcg.c on VAX.
> i noticed that the final version of the openbsd lcg.c has quite
I actually spent a few hours going through and getting OpenBSD 5.8 set up and working. I can confirm that its lcg.c driver doesn't misbehave, but there are some differences in how OpenBSD and NetBSD do platform detection on VAX.
On NetBSD, this MicroVAX is treated as a KA48, even though the firmware says it's a KA45, and thus BTYP_48 gets defined. OpenBSD, conversely, actually sees this system as a KA45. It seems to make a distinction between board type and system type that NetBSD doesn't.
There's an explicit check for STYP_48 in OpenBSD's lcg_match that has no direct analogue in the current VAX port. We can see in OpenBSD's ka48.c that the STYP_45/48 determination is made in ka48_conf. This is done via the cpustype variable, which we don't have. I could write a patch to add it, but do we actually care beyond this case (and getting dmesg to say a KA45 is a KA45)?
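To make the board type versus system type distinction concrete, here is a minimal user-space sketch of the kind of check described above. Everything in it is hypothetical: the SKETCH_* names, struct platform_id, and sketch_lcg_match() are made up for illustration and come from neither tree, and the enum values are placeholders rather than real SID/SIE encodings.

```c
#include <stdbool.h>

/* Hypothetical identifiers for illustration only; values are placeholders. */
enum sketch_boardtype { SKETCH_BTYP_46, SKETCH_BTYP_48, SKETCH_BTYP_OTHER };
enum sketch_systype { SKETCH_STYP_45, SKETCH_STYP_48, SKETCH_STYP_UNKNOWN };

struct platform_id {
	enum sketch_boardtype board;	/* CPU board family */
	enum sketch_systype sys;	/* system type within that family */
};

/*
 * Match only when the board family is right and, for the 48 family,
 * the system type also says "48".  A machine that shares the board
 * type but reports system type 45 is rejected.
 */
static bool
sketch_lcg_match(const struct platform_id *id)
{
	if (id->board != SKETCH_BTYP_46 && id->board != SKETCH_BTYP_48)
		return false;
	if (id->board == SKETCH_BTYP_48 && id->sys != SKETCH_STYP_48)
		return false;
	return true;
}
```

With a second identifier like this, a KA45 could share the board-level code paths with the KA48 while still being kept away from lcg.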
Greg Stark:
> Of course it's also quite possible to have a VS4000 that happens to not have any framebuffer installed too....
My understanding was that if it was a VAXstation, it would have a framebuffer. As I understand it, most VAXstations have DIP switches to configure whether said framebuffer is used, but the hardware should always be present. There's vax_confdata, which some parts of the code read to see if we're on the diagnostic port, but vax_confdata & 0x100 == 1 on my system.
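One caveat if that expression is ever typed into C as written: == binds tighter than &, so vax_confdata & 0x100 == 1 parses as vax_confdata & (0x100 == 1), which is always zero. Even with parentheses, a value masked with 0x100 can never compare equal to 1. A small sketch of the difference, using made-up helper names and test values:

```c
#include <stdbool.h>

#define SKETCH_CONF_DIAG 0x100	/* bit lcgcnprobe tests; the name is made up */

/* The expression taken literally: confdata & (0x100 == 1), i.e. confdata & 0. */
static bool
diag_bit_literal(unsigned int confdata)
{
	return (confdata & (0x100 == 1)) != 0;
}

/* The usual spelling of "is the bit set?". */
static bool
diag_bit_set(unsigned int confdata)
{
	return (confdata & SKETCH_CONF_DIAG) != 0;
}
```

The driver itself already uses the second form, if (vax_confdata & 0x100), so this only matters when copying the shorthand into new code.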
I looked through ka*.h and vax_confdata looks very system-specific. Some processors define a VIDOPT bit, but the code for the KA48 specifically says it's incomplete, and I get the impression it was written by looking at what VMS does instead of an actual processor manual. Most of the other VAX framebuffer drivers have some sorta config variable they can probe, but lcg.c just attaches blindly. It should be noted that the INSTALL kernel has never enabled lcg.c.
In the case of a VAXstation without a framebuffer, I think OpenBSD 5.8's lcg.c would crash as it stands right now, since the system would be a STYP_48 with no actual hardware present.
Currently, NetBSD uses the first two bytes of the SIE register to determine if it's a VAXstation 4000 vs. a MicroVAX 3100 M30/40. This is also used for several other VAXstation vs. MicroVAX determinations, but I haven't found documentation for what this flag *actually is*. The KA780 CPU manual says the top bit of the SIE indicates whether the system is Qbus-based, but I have no idea what the second bit is actually for. It might be a "framebuffer present" flag for all I know.
N
I was recently given remote access to a MicroVAX 3100 M40, and I've been spending the last week or so getting NetBSD 10/vax going. The good news is that I've thus been rewarded with the following:
soap$ dmesg | head -n20
[ 1.000000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003,
[ 1.000000] 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013,
[ 1.000000] 2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022, 2023,
[ 1.000000] 2024
[ 1.000000] The NetBSD Foundation, Inc. All rights reserved.
[ 1.000000] Copyright (c) 1982, 1986, 1989, 1991, 1993
[ 1.000000] The Regents of the University of California. All rights reserved.
[ 1.000000] NetBSD 10.99.10 (GENERIC) #18: Tue Apr 30 01:57:43 UTC 2024
[ 1.000000] ncmdr%soapmaker.lan@localhost:/home/ncmdr/netbsd-src/sys/arch/vax/compile/obj/GENERIC
[ 1.000000] MicroVAX 3100/m{30,40}
[ 1.000000] total memory = 32508 KB
[ 1.000000] avail memory = 26920 KB
[ 1.000000] timecounter: Timecounters tick every 10.000 msec
[ 1.000000] Kernelized RAIDframe activated
[ 1.000000] mainbus0 (root)
[ 1.000000] cpu0 at mainbus0: KA48, SOC, 6KB L1 cache
[ 1.000000] vsbus0 at mainbus0
[ 1.000000] vsbus0: 32K entry DMA SGMAP at PA 0x580000 (VA 0x80580000)
[ 1.000000] vsbus0: interrupt mask 0
The bad news is that my initial attempts at getting this machine going were greeted with a dead console, gibberish, or bootlooping. This, combined with the fact that I'm MOP booting off Ultrix (and Linux originally) on a machine that's physically in a different country, has largely led to what can be described as a deeply magical debugging experience.
Some of the problems are likely related to the Linux mopd port, but the largest hurdle, found through a lot of trial and error, was that the GENERIC kernel on NetBSD 10 is busted because the lcg driver is broken on the MicroVAX 3100. This code is almost unchanged since 2014, so this has been broken for a while.
NetBSD 7 has lcg.c, but it's disabled in GENERIC, so that's why it worked.
On NetBSD 10, this manifested itself as the INSTALL kernel loading properly from MOP, and being able to run the full install but the system falling over with garbage on the console when loading GENERIC. I could reproduce these results both off HDD boot, and MOP boot. I had found through trial and error that NetBSD 7-GENERIC worked. I ended up tracking the problem to the lcg driver, and what appears to be faulty platform detection code.
Looking through CVS, on -current, revision 1.219, the lcg0 driver is enabled for GENERIC:
lcg0 at vsbus0 csr 0x21801000 # VS4000/60 (or VLC) graphics
It's disabled on the INSTALL kernel. lcg0 support was added on GENERIC in Revision 1.195.
As a MicroVAX, this machine doesn't have a framebuffer, so this driver shouldn't be trying to attach, but when it does, it appears to cause a bad poke somewhere and the entire system goes down in flames. I sometimes get kicked back to the >>> prompt, but I can't INITIALIZE, and it's essentially dead until I physically power-cycle it to bring it back.
Looking at the driver itself, and what locore.c does for platform detection, I think lcgcnprobe is broken. This also accounts for it keeling over before even printing the copyright string.
void
lcgcnprobe(struct consdev *cndev)
{
	extern const struct cdevsw wsdisplay_cdevsw;

	if ((vax_boardtype != VAX_BTYP_46) && (vax_boardtype != VAX_BTYP_48))
		return; /* Only for VS 4000/60 and VLC */

	if (vax_confdata & 0x100)
		return; /* Diagnostic console */

	lcg_init_common(NULL, NULL);

	/* Set up default LUT */

	cndev->cn_pri = CN_INTERNAL;
	cndev->cn_dev = makedev(cdevsw_lookup_major(&wsdisplay_cdevsw), 0);
}
https://github.com/NetBSD/src/blob/trunk/sys/arch/vax/vsa/lcg.c#L948
In locore.c, platform detection between a MicroVAX and a VAXstation is done by looking at vax_siedata. This check is used for quite a few different MicroVAX vs. VAXstation determinations.
Looking at smg.c and gpx.c, they have more complex checks which correctly detect that the framebuffer is not present and thus don't cause an issue. Conversely, lcg.c thinks we're a VS 4000/60 (or VLC), and tries to load.
Unfortunately, just adjusting the if statements in either lcg_match or lcgcnprobe didn't seem to be enough - I just get new and interesting failure modes if I try that in and of itself.
I haven't managed to figure out how to fix the driver, but I did manage to basically patch out enough of the driver so it at least loads in on the MicroVAX while doing nothing at all. This patch is https://gist.github.com/NCommander/3659937fa10444805637b08b1e871ef0.
For actually fixing lcg.c, I couldn't find anything documenting what would be in vax_confdata. My initial thought is there probably needs to be an additional check that checks SIE to determine VAXstation vs. MicroVAX and bail out.
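As a rough user-space sketch of what such a gate could look like, assuming the low bit of SIE is what distinguishes a MicroVAX (the same bit the patch at the top of this thread tests with vax_siedata & 0x1). The SKETCH_* constants and sketch_lcg_should_probe() are made-up names with placeholder values, not kernel definitions:

```c
#include <stdbool.h>

/* Placeholder values; the real VAX_BTYP_* constants live in kernel headers. */
#define SKETCH_BTYP_46		1
#define SKETCH_BTYP_48		2
#define SKETCH_CONF_DIAG	0x100	/* bit lcgcnprobe already tests */
#define SKETCH_SIE_MICROVAX	0x1	/* assumed meaning of the low SIE bit */

/* Decide whether lcg should touch the hardware at all. */
static bool
sketch_lcg_should_probe(int boardtype, unsigned int confdata,
    unsigned int siedata)
{
	if (boardtype != SKETCH_BTYP_46 && boardtype != SKETCH_BTYP_48)
		return false;	/* only for VS 4000/60 and VLC */
	if (confdata & SKETCH_CONF_DIAG)
		return false;	/* diagnostic console */
	if (siedata & SKETCH_SIE_MICROVAX)
		return false;	/* MicroVAX: no framebuffer to poke */
	return true;
}
```

Keeping the decision in one predicate would let lcg_match, lcg_attach, and lcgcnprobe share it instead of each repeating the bit test.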
Any assistance in creating an acceptable patch is welcome.
~ N