Mach 2.5 Independence day

With apologies to a very 90’s movie

Ever since I got my hands on the Mt Xinu disk images, I’ve been working to see if the old Mach kernels on the CSRG CD-ROM set are actually buildable and runnable. And the TL;DR is that yes, they are.

The CD has 3 Mach kernels, the MK35 kernel, a kernel that appears to be something called X147, and a release of Mach 3.0. While X147 has hardware support for the SUN-3 and most of the files for the VAX, only MK35 has hardware support for the i386. The MK35 kernel has incomplete Makefiles and other dependencies, while X147 lacks i386 support. The good news is that it’s possible to use portions of the missing config & Makefiles from X147 to fill in for MK35, as it’s possible to copy the platform code from MK35 along with the i386 specific config into X147, yielding 2 working kernels.

Now this leads to the next few issues. The hardware support appears to be code ‘donated’ from various OEMs from Intel, Olivetti, Toshiba, OSF, and the CSRG. Dates vary from 1987 to 1991.

I started with the MK35 kernel as it was smaller, and since it was tagged as an ‘Intel only release’ of Mach, I figured that this one had the best chance of actually working.

boot:
442336+46792+115216[+38940+39072]

And this is as far as it got on it’s first attempted boot. The Qemu VM would immediately reboot. Since I had installed Mt Xinu on VMware I went ahead and tried it there, and it said that there was a critical CPU exception and that it was shutting down. Bochs did the same thing, as did PCem. Since nothing was being printed to the screen it must be failing in the locore.s which is split into several assembly modules. I put in a hlt at various points and kept rebuilding and rebooting to see if it would halt or if it’d reboot. Thankfully VM’s are cheap and plentiful, I can’t imagine how tedious this would be on actual hardware. Eventually I found out that right after the paging bit in CR0 was flipped the VM would reboot. Now I had something.

        / turn PG on
        mov     %cr0, %eax
        or      $PAGEBIT, %eax
        mov     %eax, %cr0

        mov   %edx, %cr3 

I had tried not flipping the page bit, not flipping cr3, no matter what I tried it would triple fault and reboot.

I had to break down and beg for help, and as luck would have it, someone who knows a heck of lot more about the i386 than I could ever hope to know took a glace at the above code and immediately noted:

I looked at start.s. And it immediately jumped out at me as being very fishy. What they do is enable protected mode *and* paging, but only then load CR3. That’s something which may well work on some CPUs, but it’s against the rules. You could try just swapping the instructions around, first load CR3, then CR0. The next question is then if that code executes out of an identity-mapped page; if yes then just swapping the instructions should do the trick, if not then there is a bigger issue.


 Background: Old CPUs, especially 386/486, will decode and pipeline several instructions past the protected mode switch (mov cr0, eax). The jmp instruction is there to flush that pipeline and make sure all further instructions are executed with the new addressing mode in effect. But old CPUs did not enforce that and it was possible to execute the jmp from a non-identity-mapped page, and I guess it was also possible to execute instructions between the move to cr0 and the jmp, at least most of the time. That tends to break on modern CPUs (probably P6 and later) and definitely in emulation/virtualization. The move to cr0 effectively flushes the pipeline and if the next instruction is not in the page tables, poof, there goes the OS.

Could it really be that simple?

        mov     %edx, %cr3

        / turn PG on
        mov     %cr0, %eax
        or      $PAGEBIT, %eax
        mov     %eax, %cr0

        / mov   %edx, %cr3 

I commented out the cr3 line and just pasted above the cr0 pagebit flip.

The first boot of Mach 2.5 MK35

Amazingly the kernel booted. Behold the first boot of Mach/4.3 which very well could be the first boot independent of the CMU and I’d venture the first boot from the source on the CSRG CD-ROM set. I tried to tell Mach to use the disk as prepared by Mt Xinu, but naturally it’s incompatible.

The next thing to do was create a root diskette, which thankfully the CMU folks left the needed files in the standi386at directory. I was able to build the disk, and using VMware I could boot into single user mode. I went through the ‘unpublished’ documentation I was able to mirror, and was able to get lucky enough to have Mach prepare the hard disk, format the partitions, and I used tar to transfer the root diskette onto the hard disk. I thought it ought to be possible to boot from the boot disk, have it mount the hard disk, and re-mount the boot disk, and copy the kernel. Sounds reasonable right?

This is where the incredibly stale platform code showed it’s head once more again as the floppy driver in MK35 is amazingly useless. It seems that the emulated hardware is too fast? But all reads from the floppy using the hard disk as root failed. Instead I removed a bunch of files from the disk, and copied over gzip & a compressed copy of the kernel to disk, along with the boot.hd program, and was able to copy them to hard disk using that modified root diskette. Luckily Mach has support for a.out binaries, and this stuff being so old it’s all statically linked. My Mt Xinu build of gzip runs fine on the Mach kernel, so I could decompress the kernel and install the bootblocks.

This is where the next weird issue would happen, which is that Mach was quite insistent on mounting everything under this /RFS directory. It appears that RFS was CMU’s answer to NFS… Which needless to say didn’t ignite the world on fire. I was later able to find that I could disable the RFS code, re-configure, rebuild and re-transfer a kernel and with a bit of fighting with mount I was able to mount hd0d/hd0e. Sadly during the install process there was no visible option to specify slice sizes so I’m stuck with a 10MB root.

With this much luck in hand I thought it may be interesting to see if Mt Xinu could mount the Mach disk. Turns out that it can without any issues. So I went ahead and wiped the Mach disk, and transfered Mt Xinu over to the Mach disk, and rebooted with that. And it “works”! Although of course there is some caveats.

The first being the aforementioned floppy support is broken. The next one being that the serial support also suffers from basically losing interrupts and leaving the system waiting. The kernel debugger still works, and you can see it in the idle loop, along with the other threads waiting. This means my favorite method of using uuencode and pasting to the terminal won’t work, MK35 locks up after 35kb, and X147 made it as far as 150kb. Keep in mind that they are using the same i386/i386at platform directories.

So I’m quite sure that there is other issues hiding in the code, maybe obvious ones like the cr3/cr0 thing. On the other front I’ve been starting at looking at doing some porting of the Tahoe/Quasijarus userland with varying success. I have already started to rebuild some binaries with a substitute crt0.o as there is no source for anything included in the Mt Xinu distribution outside of the Mach 3.0 kernel.

For those who want to play along I have uploaded VMDK’s and the source tarballs.

For people using Qemu I find that a serial terminal is FAR nicer to use than the console. Also I’m unsure of how hard the 16MB ISA DMA window is being hit, but X147 seems okay with 64MB of ram, while M35 really needs to be 50MB or less..

qemu.exe -L pc-bios -hda Mach25-MK35.vmdk -serial telnet:127.0.0.1:42323,server,nowait -m 16

This puts the serial port into TCP server mode, so you can simply telnet into the serial port. As always change the memory are your own discretion.

I’m not dead

It’s just been really busy with this move, unpacking and the usual losing things, finding things and breaking things.

In the middle of it all, I found something online, that I want to at least do some proper article thing about… As it’s been really exciting, and goes back to the first month I started this blog.

Outside of NeXTSTEP, the other i386 commercial version of Mach 2.5 surfaced, the Mt Xinu version!

Even better, a few years ago, I had stumbled onto the source code for 2.5 buried on the 4th disc of the CSRG set, and with a LOT of luck and persistence I can confirm that the sources are complete enough to build.

loading vmunix.sys
rearranging symbols
text	data	bss	dec	hex
389088	45564	101364	536016	82dd0
ln vmunix.sys vmunix; ln vmunix vmunix.I386x.STD+WS-afs-nfs

However, as luck always has it, start.s in the i386 code does something weird at the 3GB mark causing a triple fault on any kind of modern emulation/virtualization setup.

    / Fix up the 1st, 3 giga and last entries in the page directory
    mov     $EXT(kpde), %ebx
    and     $MASK, %ebx

    mov     $EXT(kpte), %eax
    and     $0xffff000, %eax
    or      $0x1, %eax

    mov     %eax, (%ebx)
    mov     %eax, 3072(%ebx)        / 3 giga -- C0000000

    mov     $EXT(kpde), %edx
    and     $MASK, %edx

Not all that sure why, but at least on Bochs, I can see the triple fault.

00036527018d[CPU0  ] page walk for address 0x0000000000101122
00036527018d[CPU0  ] page walk for address 0x00000000e0000011
00036527018d[CPU0  ] PDE: entry not present
00036527018d[CPU0  ] page fault for address 00000000e0000011 @ 0000000000101124
00036527018d[CPU0  ] exception(0x0e): error_code=0002
00036527018d[CPU0  ] interrupt(): vector = 0e, TYPE = 3, EXT = 1
00036527018d[CPU0  ] page walk for address 0x00000000c0161370
00036527018d[CPU0  ] PDE: entry not present
00036527018d[CPU0  ] page fault for address 00000000c0161370 @ 0000000000101122
00036527018d[CPU0  ] exception(0x0e): error_code=0000
00036527018d[CPU0  ] exception(0x08): error_code=0000
00036527018d[CPU0  ] interrupt(): vector = 08, TYPE = 3, EXT = 1
00036527018d[CPU0  ] page walk for address 0x00000000c0161340
00036527018d[CPU0  ] PDE: entry not present
00036527018d[CPU0  ] page fault for address 00000000c0161340 @ 0000000000101122
00036527018d[CPU0  ] exception(0x0e): error_code=0000
00036527018i[CPU0  ] CPU is in protected mode (active)
00036527018i[CPU0  ] CS.mode = 32 bit
00036527018i[CPU0  ] SS.mode = 32 bit
00036527018i[CPU0  ] EFER   = 0x00000000
00036527018i[CPU0  ] | EAX=e0000011  EBX=0015f000  ECX=00161dc1  EDX=0015f000
00036527018i[CPU0  ] | ESP=0000efbc  EBP=0000efbc  ESI=00193fb8  EDI=00009d84
00036527018i[CPU0  ] | IOPL=0 id vip vif ac vm RF nt of df if tf SF zf af PF cf
00036527018i[CPU0  ] | SEG sltr(index|ti|rpl)     base    limit G D
00036527018i[CPU0  ] |  CS:0028( 0005| 0|  0) 00000000 ffffffff 1 1
00036527018i[CPU0  ] |  DS:0020( 0004| 0|  0) 00000000 ffffffff 1 1
00036527018i[CPU0  ] |  SS:0010( 0002| 0|  0) 00001000 0000ffff 0 1
00036527018i[CPU0  ] |  ES:0020( 0004| 0|  0) 00000000 ffffffff 1 1
00036527018i[CPU0  ] |  FS:0000( 0005| 0|  0) 00000000 0000ffff 0 0
00036527018i[CPU0  ] |  GS:0000( 0005| 0|  0) 00000000 0000ffff 0 0
00036527018i[CPU0  ] | EIP=00101122 (00101122)
00036527018i[CPU0  ] | CR0=0xe0000011 CR2=0xc0161340
00036527018i[CPU0  ] | CR3=0x00000000 CR4=0x00000000
00036527018i[CPU0  ] 0x0000000000101122>> add byte ptr ds:[eax], al : 0000
00036527018d[SIM   ] searching for component 'cpu' in list 'bochs'
00036527018d[SIM   ] searching for component 'reset_on_triple_fault' in list 'cpu'
00036527018e[CPU0  ] exception(): 3rd (14) exception with no resolution, shutdown status is 00h, resetting

Mach 3.0 doesn’t do this, so I’ll have to dig far deeper into start.s which is kind of really beyond me.

Building a boot disk … is involved. 😐

rm -rf /usr/src/mach25-i386/obj
mkdir /usr/src/mach25-i386/obj
cd /usr/src/mach25-i386/standi386at/boot
make fdboot
/home/user/mkfs /dev/rfloppy 2880 18 2 4096 512 32 1
dd if=/usr/src/mach25-i386/obj/standi386at/boot/boot.fd of=/dev/rfd0d
/home/user/fsck -y /dev/rfloppy
cd /usr/src/mach25-i386/
make
mount /dev/floppy /mnt
cp /usr/src/mach25-i386/obj/STD+WS-afs-nfs/vmunix /mnt
sync
umount /mnt
/home/user/fsck -y /dev/rfloppy

So, I’m not all that dead. For anyone super impatient, you can download my VMDK here, which runs on Qemu & VMware, it includes a serial terminal on COM1 so you can use a real terminal, and if you are like me, uuencode/uudecode files in & out of the system. As always read the 404 page for the current username/password.

A tale of two kernels.

Back in the early 1990’s Microkernel’s were all the rage. Everywhere you would go you’d hear all about Pink, Taligent, Windows NT, and the grand daddy of them all Mach.

Probably the most well known debate about microkernel vs monolithic kernels was the Tanenbaum vs Torvalds debate that raged on comp.os.minix back in 1992. You can read the entire thing here : ( http://www.oreilly.com/catalog/opensources/book/appa.html ). It was interesting in the sense that even Ken Thompson of UNIX fame even chipped in. Tanenbaums’s major points were that a microkernel is more inherently portable than a monolithic kernel, and that microkernels could be more reliable, and easier to maintain. Of course more than 10 years later we can see that Linux still flourishes, and that outside of Windows NT & OS X no mainstream OS relies on a microkernel. Even OS X treats mach more as a call library than a traditional microkernel, since all of the exe’s in Darwin / OS X are Mach-O format, not COFF/ELF,A.OUT, etc etc.

Mach started out as a project from CMU derived from the UNIX source code, to try to re-invent the lower levels of UNIX into something that would scale easier to multiprocessors, support for threads, and the holy grail of them all, expand it’s portability. Sadly the first few versions of Mach are barred from distribution due to their inclusion of encumbered UNIX source code. However when the university of Utah picked up Mach, and released Mach4 (UK22, info here http://www.cs.utah.edu/flux/mach4/html/Mach4-proj.html ), including full source. Also they provided Lites, a BSD server that can run atop Mach, giving the user a ‘UNIX’ system, as it were.

Lites

Lites

So digging through some network groups, and testing stuff, I finally slapped together the pieces, and built a Mach/Lites system on NetBSD 1.1 . And how does it perform? It’s significantly slower than NetBSD is. You can tell that the amount of context switching involved in Mach as a program makes a call the microkernel, which in turn validates & passes it to the server, which further validates, runs the process, then sends the results back to the kernel, which then passes it back to the program. I’ve heard in a worse case scenario a 500% reduction in speed. You can always read more info on the fine wiki article here ( http://en.wikipedia.org/wiki/Mach_kernel ).

Lots of people will argue that microkernel’s have simply failed, and that it’s simply an example of what seemed like a good idea being pushed too hard once it was found to fail. It a lot of ways it reminds me of ADA.

So for now I’m going to provide a Qemu image of Lites running on Mach 4. unzipping the file will provide you with a lites.cmd file which for windows users you can just run directly. Things to note are:

-The version of Qemu that I’ve bound requires libpcap to be installed.
-Mach4 can only address 16mb of ram, due to DMA issues across the 16mb line.
-I’ve enabled user mode networking so that
-the cmd file sets up local port 23 to be redirected to the VM. This will allow you to telnet in simply by ‘telnet localhost’. You may want to use putty for better terminal handeling
-The included Gcc 2.4.5 is ancient. Outside of building a simple irc client, I wouldn’t expect much.
-The boot process is broken, and it’ll parse through the rc scripts twice. Just let it do it’s thing, and it’ll drop to a login prompt.

Logging into Lites/NetBSD

Logging into Lites/NetBSD

Other than that it behaves just like a NetBSD 1.1 machine.

Notice that grub boots the kernel /Mach.UK22 . When Mach boot’s it’ll load up the files emulator & startup. The ‘emulator’ is the Lites microkernel. Once it’s loaded it’ll start mach_init which just symlinks to /sbin/init and the normal NetBSD bootup will commence.

You can download my image directly here as MachUK22-lites-nat.zip.

Next time we’ll play with SLS Linux.