Mach 2.5 Independence day

With apologies to a very 90’s movie

Ever since I got my hands on the Mt Xinu disk images, I’ve been working to see if the old Mach kernels on the CSRG CD-ROM set are actually buildable and runnable. And the TL;DR is that yes, they are.

The CD has 3 Mach kernels, the MK35 kernel, a kernel that appears to be something called X147, and a release of Mach 3.0. While X147 has hardware support for the SUN-3 and most of the files for the VAX, only MK35 has hardware support for the i386. The MK35 kernel has incomplete Makefiles and other dependencies, while X147 lacks i386 support. The good news is that it’s possible to use portions of the missing config & Makefiles from X147 to fill in for MK35, as it’s possible to copy the platform code from MK35 along with the i386 specific config into X147, yielding 2 working kernels.

Now this leads to the next few issues. The hardware support appears to be code ‘donated’ from various OEMs from Intel, Olivetti, Toshiba, OSF, and the CSRG. Dates vary from 1987 to 1991.

I started with the MK35 kernel as it was smaller, and since it was tagged as an ‘Intel only release’ of Mach, I figured that this one had the best chance of actually working.

boot:
442336+46792+115216[+38940+39072]

And this is as far as it got on it’s first attempted boot. The Qemu VM would immediately reboot. Since I had installed Mt Xinu on VMware I went ahead and tried it there, and it said that there was a critical CPU exception and that it was shutting down. Bochs did the same thing, as did PCem. Since nothing was being printed to the screen it must be failing in the locore.s which is split into several assembly modules. I put in a hlt at various points and kept rebuilding and rebooting to see if it would halt or if it’d reboot. Thankfully VM’s are cheap and plentiful, I can’t imagine how tedious this would be on actual hardware. Eventually I found out that right after the paging bit in CR0 was flipped the VM would reboot. Now I had something.

        / turn PG on
        mov     %cr0, %eax
        or      $PAGEBIT, %eax
        mov     %eax, %cr0

        mov   %edx, %cr3 

I had tried not flipping the page bit, not flipping cr3, no matter what I tried it would triple fault and reboot.

I had to break down and beg for help, and as luck would have it, someone who knows a heck of lot more about the i386 than I could ever hope to know took a glace at the above code and immediately noted:

I looked at start.s. And it immediately jumped out at me as being very fishy. What they do is enable protected mode *and* paging, but only then load CR3. That’s something which may well work on some CPUs, but it’s against the rules. You could try just swapping the instructions around, first load CR3, then CR0. The next question is then if that code executes out of an identity-mapped page; if yes then just swapping the instructions should do the trick, if not then there is a bigger issue.


 Background: Old CPUs, especially 386/486, will decode and pipeline several instructions past the protected mode switch (mov cr0, eax). The jmp instruction is there to flush that pipeline and make sure all further instructions are executed with the new addressing mode in effect. But old CPUs did not enforce that and it was possible to execute the jmp from a non-identity-mapped page, and I guess it was also possible to execute instructions between the move to cr0 and the jmp, at least most of the time. That tends to break on modern CPUs (probably P6 and later) and definitely in emulation/virtualization. The move to cr0 effectively flushes the pipeline and if the next instruction is not in the page tables, poof, there goes the OS.

Could it really be that simple?

        mov     %edx, %cr3

        / turn PG on
        mov     %cr0, %eax
        or      $PAGEBIT, %eax
        mov     %eax, %cr0

        / mov   %edx, %cr3 

I commented out the cr3 line and just pasted above the cr0 pagebit flip.

The first boot of Mach 2.5 MK35

Amazingly the kernel booted. Behold the first boot of Mach/4.3 which very well could be the first boot independent of the CMU and I’d venture the first boot from the source on the CSRG CD-ROM set. I tried to tell Mach to use the disk as prepared by Mt Xinu, but naturally it’s incompatible.

The next thing to do was create a root diskette, which thankfully the CMU folks left the needed files in the standi386at directory. I was able to build the disk, and using VMware I could boot into single user mode. I went through the ‘unpublished’ documentation I was able to mirror, and was able to get lucky enough to have Mach prepare the hard disk, format the partitions, and I used tar to transfer the root diskette onto the hard disk. I thought it ought to be possible to boot from the boot disk, have it mount the hard disk, and re-mount the boot disk, and copy the kernel. Sounds reasonable right?

This is where the incredibly stale platform code showed it’s head once more again as the floppy driver in MK35 is amazingly useless. It seems that the emulated hardware is too fast? But all reads from the floppy using the hard disk as root failed. Instead I removed a bunch of files from the disk, and copied over gzip & a compressed copy of the kernel to disk, along with the boot.hd program, and was able to copy them to hard disk using that modified root diskette. Luckily Mach has support for a.out binaries, and this stuff being so old it’s all statically linked. My Mt Xinu build of gzip runs fine on the Mach kernel, so I could decompress the kernel and install the bootblocks.

This is where the next weird issue would happen, which is that Mach was quite insistent on mounting everything under this /RFS directory. It appears that RFS was CMU’s answer to NFS… Which needless to say didn’t ignite the world on fire. I was later able to find that I could disable the RFS code, re-configure, rebuild and re-transfer a kernel and with a bit of fighting with mount I was able to mount hd0d/hd0e. Sadly during the install process there was no visible option to specify slice sizes so I’m stuck with a 10MB root.

With this much luck in hand I thought it may be interesting to see if Mt Xinu could mount the Mach disk. Turns out that it can without any issues. So I went ahead and wiped the Mach disk, and transfered Mt Xinu over to the Mach disk, and rebooted with that. And it “works”! Although of course there is some caveats.

The first being the aforementioned floppy support is broken. The next one being that the serial support also suffers from basically losing interrupts and leaving the system waiting. The kernel debugger still works, and you can see it in the idle loop, along with the other threads waiting. This means my favorite method of using uuencode and pasting to the terminal won’t work, MK35 locks up after 35kb, and X147 made it as far as 150kb. Keep in mind that they are using the same i386/i386at platform directories.

So I’m quite sure that there is other issues hiding in the code, maybe obvious ones like the cr3/cr0 thing. On the other front I’ve been starting at looking at doing some porting of the Tahoe/Quasijarus userland with varying success. I have already started to rebuild some binaries with a substitute crt0.o as there is no source for anything included in the Mt Xinu distribution outside of the Mach 3.0 kernel.

For those who want to play along I have uploaded VMDK’s and the source tarballs.

For people using Qemu I find that a serial terminal is FAR nicer to use than the console. Also I’m unsure of how hard the 16MB ISA DMA window is being hit, but X147 seems okay with 64MB of ram, while M35 really needs to be 50MB or less..

qemu.exe -L pc-bios -hda Mach25-MK35.vmdk -serial telnet:127.0.0.1:42323,server,nowait -m 16

This puts the serial port into TCP server mode, so you can simply telnet into the serial port. As always change the memory are your own discretion.

2 thoughts on “Mach 2.5 Independence day

Leave a Reply