xMach

Back when I started blogging, I started with a few quick things on the top of my head, that being Netware, SIMH, CP/M Zork, Xenix and of course Mach. Back in the shiny new world of 1999 we may have survived the Y2K apocalypses but the valley was firmly set in the mind of start-up fever that would lead to the looming .com crash but the menatlity of IPO & dump would remain to this very day.

The Utah projects around Mach were wound down with the OSKit being the next wave for pistachio, but that set the scene of more than average users now having ‘fast’ machines at their disposal and the ability to rifle though published source and roll their own. It was this environment of trying to make it in the full force of the post ‘year of the desktop’ Linux where xMach appeared, had some interest and quickly died. For the longest time I had thought all evidence of it had been eradicated as the domain had been excluded from wayback, and searching went nowhere. But one bored night I tried exploring sourceforge, a veteran of the pre .com bubble crash to see if it had anything, and yes it did!

https://sourceforge.net/projects/xmach

Oh sure in retrospect it’s pretty obvious. All the cool kids had public repositories on the internet.. And then after the .com implosion so many took their CSV’s and went home. And so many of those just died in selfhosted, self imposed silence.

From what I remember one of the first goals was to time the source to make it easier to cross compile from either NetBSD or Linux, and utilize what was the double edged sword of Lites, the ability to run either 386BSD/NetBSD binaries or Linux binaries. And the Linux side was already incorporating shared libraries something that was desperately needed in 1993 for X11. But this was 2000! And of course the downfall of running Linux under xMach/Lites is that you will of course compare the performance, and Linux won hands down. Not to mention interest in improving and adding to Linux was in full mainstream support by 2000 so it made xMach feel all the more ancient.

But we didn’t come here for practicality!

Getting the source & tools

First off I’m going to use the cross tool chain for building OS Kit. i586-linux2.tar.gz (password and 404 apply) as I not only don’t feel like fighting why MIG doesn’t work in 64bit, and it’s just easier to enable 32bit usermode. Speaking of for Ubuntu 20.04.2:

dpkg --add-architecture i386
apt-get update
apt-get install libc6:i386 libncurses5:i386 libstdc++6:i386 libc6-dev:i386 libfl-dev:i386
apt-get install sharutils gcc-multilib build-essential

And for people like me running on Windows 10, You absolutely MUST enable WSL v2. Without an actual kernel the multiarch simply doesn’t work. No doubt it has something to with actually running Linux vs emulating it. Probably mucking about with the LDT/GDT which you certainly cannot do as a process on Windows.

Next just add the old GCC 2.7.2.3 chain into your path:

PATH=/usr/local/i586-linux2/bin:/usr/local/i586-linux2/lib/gcc-lib/i586-linux/2.7.2.3:$PATH
export PATH

Next up you’ll need the source, and being all new and trendy I put it over on GitHub. Clone the repo, and you are ready to start compiling!

Building the xMach kernel

Make an object directory and run configure like this:

../kernel/configure --host=i586-linux --target=i586-linux --build=i586-linux --enable-elf --enable-libmach --enable-linuxdev --prefix=/usr/local/xmach;cp ../updated-conf/kernel/Makeconf .

For some reason configure never populates the compiler tools right, and I can either mess with autoconf (no thanks) or I can just include a known good file. It’s not hard to figure out, you can diff it if you want to, but it’s just set for a native 32bit build of MIG, and then using the cross tools to build the kernel.

type make, and wait 13 seconds, give or take…

i586-linux-ld -r  -L/home/jsteve/src/xMach/kernel-build/lib -nostdlib \
         -o bsdboot.o crt0.o about_to_die.o anno.o boot_info_dump.o boot_start.o cpu.o cpu_init.o cpu_tables_init.o cpu_tables_load.o crtn.o die.o do_boot.o exit.o gdt.o idt.o idt_inittab.o idt_irq_init.o main.o malloc.o panic.o phys_mem_add.o pic.o putchar.o puts.o real_tss.o real_tss_def.o rv86_real_int.o rv86_real_int_asm.o rv86_reflect_irq.o rv86_trap_handler.o serial.o trap_dump.o trap_dump_die.o trap_handler.o trap_return.o tss.o tss_dump.o -lmach_exec -lmach_c
 make[1]: Leaving directory '/home/jsteve/src/xMach/kernel-build/boot/bsd'
 MACHBOOTDIR=cd /home/jsteve/src/xMach/kernel-build/boot/bsd; pwd \
 /home/jsteve/src/xMach/kernel-build/boot/bsd/mkbsdimage -o /home/jsteve/src/xMach/kernel-build/Mach /home/jsteve/src/xMach/kernel-build/kernel/kernel /home/jsteve/src/xMach/kernel-build/bootstrap/bootstrap
 real    0m13.135s
 user    0m12.060s
 sys     0m1.383s

I cannot stress just how incredibly fast this is. I’m pretty sure it literally took hours for this to run, assuming nothing went wrong, which it almost always did. As part of the fun, next do a ‘make install’

Building Lites

This is somewhat similar to the kernel build, however it is far more touchy. You absolutely MUST copy / fix the Makeconf file otherwise the build will be corrupted and the only way to fix is to remove the build directory and try again.

../lites/configure --host=i586-linux --target=i586-linux --build=i586-linux --enable-mach4 --prefix=/usr/local/xmach --with-mach4=../kernel;cp ../updated-conf/lites/conf/Makeconf conf

And once more again make the project and just wait a few seconds. BSD without the hardware support is pretty tiny

i586-linux-ld -o emulator.Lites.1.1.u3.out ./emulator_base  -L/home/jsteve/src/xMach/lites-build/liblites -L/usr/local/xmach/lib -e __start ecrt0.o  e_mig_support.o emul_generic.o error_codes.o emul_init.o emul_mapped.o emul_misc_asm.o e_bsd.o e_43ux.o bsd_1_user.o emul_mach_interface_user.o e_mach_msg_server.o e_stat.o e_bnr.o emul_exec.o signal_server.o e_uname.o e_mapped_time.o e_sysvipc.o e_linux.o e_sysv.o e_exception.o e_signal.o e_machinedep.o e_linux_trampoline.o e_linux_sysent.o e_isc4_sysent.o e_cmu_43ux_sysent.o e_lite_sysent.o emul_vector.o e_trampoline.o e_linux_getcwd.o    -llites -lthreads -lmach_sa  /usr/local/i586-linux2/lib/gcc-lib/i586-linux/2.7.2.3/libgcc.a && \
 i586-linux-size emulator.Lites.1.1.u3.out && \
 mv emulator.Lites.1.1.u3.out emulator.Lites.1.1.u3
    text    data     bss     dec     hex filename
  130195   16904     160  147259   23f3b emulator.Lites.1.1.u3.out
 make[1]: Leaving directory '/home/jsteve/src/xMach/lites-build/emulator'
 make[1]: Entering directory '/home/jsteve/src/xMach/lites-build/bin'
 make[1]: Nothing to be done for 'all'.
 make[1]: Leaving directory '/home/jsteve/src/xMach/lites-build/bin'
 real    0m6.423s
 user    0m6.006s
 sys     0m0.551s

Yes that’s six seconds. do a ‘make install’ and now your /usr/local/xmach directory is fully populated and ready to transport somewhere to do the ‘install’ and boot.

Installation

One thing to take notice of xMach over the standard last Mach4 is that the Linux boot options have been removed. There is all kinds of changes in 16bit assembly handling that need to be rewritten or fixed. I’m sure someone at some point has, but apparently it’s easier to go with the bsdboot and use either the machboot or GRUB (v1!) to get it going. Although Linux knows how to mount loop ‘diskfiles’ I don’t remember how to deal with partitions and BSD slices. So I did my typical trick of booting up an existing image, and using telnet/uucp to transfer xMach into an existing disk.

Booting however was a bit more fun, as the kernel is now compiled as a BSD ELF image and Grub 1.x won’t boot it. I used a ‘Super Grub2 disk hybrid‘ ISO image, and was able to just hit escape and type in:

kfreebsd (hd0,msdos2,bsd1)/Mach
boot
Booting xMach kernel

Because of the massive drift in multiboot vs ‘BSD ELF’ the drive isn’t passed correctly so I have to manually tell it ‘/dev/hd0a/mach_servers’ and it’ll pick up Lites, and boot up!

And there we go, with my old Mach4/Lites image from 2009 with a ‘fresh’ cross compiled build from Windows 10 / Ubuntu 20.04

Where to go from here?

Realistically, nowhere. Seven years ago there was OpenMach, which seems to have done a few things, but even moving to a GAS from 2015 didn’t help me at all trying to build the 16bit stuff so the answer must lie in the past. Otherwise the fundamental problem as always is at best you will have a 4.4BSD system which again in 1992 would be awesome, but 2021? yeah it’s a curiosity at best.

Mach 2.5 Independence day

With apologies to a very 90’s movie

Ever since I got my hands on the Mt Xinu disk images, I’ve been working to see if the old Mach kernels on the CSRG CD-ROM set are actually buildable and runnable. And the TL;DR is that yes, they are.

The CD has 3 Mach kernels, the MK35 kernel, a kernel that appears to be something called X147, and a release of Mach 3.0. While X147 has hardware support for the SUN-3 and most of the files for the VAX, only MK35 has hardware support for the i386. The MK35 kernel has incomplete Makefiles and other dependencies, while X147 lacks i386 support. The good news is that it’s possible to use portions of the missing config & Makefiles from X147 to fill in for MK35, as it’s possible to copy the platform code from MK35 along with the i386 specific config into X147, yielding 2 working kernels.

Now this leads to the next few issues. The hardware support appears to be code ‘donated’ from various OEMs from Intel, Olivetti, Toshiba, OSF, and the CSRG. Dates vary from 1987 to 1991.

I started with the MK35 kernel as it was smaller, and since it was tagged as an ‘Intel only release’ of Mach, I figured that this one had the best chance of actually working.

boot:
442336+46792+115216[+38940+39072]

And this is as far as it got on it’s first attempted boot. The Qemu VM would immediately reboot. Since I had installed Mt Xinu on VMware I went ahead and tried it there, and it said that there was a critical CPU exception and that it was shutting down. Bochs did the same thing, as did PCem. Since nothing was being printed to the screen it must be failing in the locore.s which is split into several assembly modules. I put in a hlt at various points and kept rebuilding and rebooting to see if it would halt or if it’d reboot. Thankfully VM’s are cheap and plentiful, I can’t imagine how tedious this would be on actual hardware. Eventually I found out that right after the paging bit in CR0 was flipped the VM would reboot. Now I had something.

        / turn PG on
        mov     %cr0, %eax
        or      $PAGEBIT, %eax
        mov     %eax, %cr0

        mov   %edx, %cr3 

I had tried not flipping the page bit, not flipping cr3, no matter what I tried it would triple fault and reboot.

I had to break down and beg for help, and as luck would have it, someone who knows a heck of lot more about the i386 than I could ever hope to know took a glace at the above code and immediately noted:

I looked at start.s. And it immediately jumped out at me as being very fishy. What they do is enable protected mode *and* paging, but only then load CR3. That’s something which may well work on some CPUs, but it’s against the rules. You could try just swapping the instructions around, first load CR3, then CR0. The next question is then if that code executes out of an identity-mapped page; if yes then just swapping the instructions should do the trick, if not then there is a bigger issue.


 Background: Old CPUs, especially 386/486, will decode and pipeline several instructions past the protected mode switch (mov cr0, eax). The jmp instruction is there to flush that pipeline and make sure all further instructions are executed with the new addressing mode in effect. But old CPUs did not enforce that and it was possible to execute the jmp from a non-identity-mapped page, and I guess it was also possible to execute instructions between the move to cr0 and the jmp, at least most of the time. That tends to break on modern CPUs (probably P6 and later) and definitely in emulation/virtualization. The move to cr0 effectively flushes the pipeline and if the next instruction is not in the page tables, poof, there goes the OS.

Could it really be that simple?

        mov     %edx, %cr3

        / turn PG on
        mov     %cr0, %eax
        or      $PAGEBIT, %eax
        mov     %eax, %cr0

        / mov   %edx, %cr3 

I commented out the cr3 line and just pasted above the cr0 pagebit flip.

The first boot of Mach 2.5 MK35

Amazingly the kernel booted. Behold the first boot of Mach/4.3 which very well could be the first boot independent of the CMU and I’d venture the first boot from the source on the CSRG CD-ROM set. I tried to tell Mach to use the disk as prepared by Mt Xinu, but naturally it’s incompatible.

The next thing to do was create a root diskette, which thankfully the CMU folks left the needed files in the standi386at directory. I was able to build the disk, and using VMware I could boot into single user mode. I went through the ‘unpublished’ documentation I was able to mirror, and was able to get lucky enough to have Mach prepare the hard disk, format the partitions, and I used tar to transfer the root diskette onto the hard disk. I thought it ought to be possible to boot from the boot disk, have it mount the hard disk, and re-mount the boot disk, and copy the kernel. Sounds reasonable right?

This is where the incredibly stale platform code showed it’s head once more again as the floppy driver in MK35 is amazingly useless. It seems that the emulated hardware is too fast? But all reads from the floppy using the hard disk as root failed. Instead I removed a bunch of files from the disk, and copied over gzip & a compressed copy of the kernel to disk, along with the boot.hd program, and was able to copy them to hard disk using that modified root diskette. Luckily Mach has support for a.out binaries, and this stuff being so old it’s all statically linked. My Mt Xinu build of gzip runs fine on the Mach kernel, so I could decompress the kernel and install the bootblocks.

This is where the next weird issue would happen, which is that Mach was quite insistent on mounting everything under this /RFS directory. It appears that RFS was CMU’s answer to NFS… Which needless to say didn’t ignite the world on fire. I was later able to find that I could disable the RFS code, re-configure, rebuild and re-transfer a kernel and with a bit of fighting with mount I was able to mount hd0d/hd0e. Sadly during the install process there was no visible option to specify slice sizes so I’m stuck with a 10MB root.

With this much luck in hand I thought it may be interesting to see if Mt Xinu could mount the Mach disk. Turns out that it can without any issues. So I went ahead and wiped the Mach disk, and transfered Mt Xinu over to the Mach disk, and rebooted with that. And it “works”! Although of course there is some caveats.

The first being the aforementioned floppy support is broken. The next one being that the serial support also suffers from basically losing interrupts and leaving the system waiting. The kernel debugger still works, and you can see it in the idle loop, along with the other threads waiting. This means my favorite method of using uuencode and pasting to the terminal won’t work, MK35 locks up after 35kb, and X147 made it as far as 150kb. Keep in mind that they are using the same i386/i386at platform directories.

So I’m quite sure that there is other issues hiding in the code, maybe obvious ones like the cr3/cr0 thing. On the other front I’ve been starting at looking at doing some porting of the Tahoe/Quasijarus userland with varying success. I have already started to rebuild some binaries with a substitute crt0.o as there is no source for anything included in the Mt Xinu distribution outside of the Mach 3.0 kernel.

For those who want to play along I have uploaded VMDK’s and the source tarballs.

For people using Qemu I find that a serial terminal is FAR nicer to use than the console. Also I’m unsure of how hard the 16MB ISA DMA window is being hit, but X147 seems okay with 64MB of ram, while M35 really needs to be 50MB or less..

qemu.exe -L pc-bios -hda Mach25-MK35.vmdk -serial telnet:127.0.0.1:42323,server,nowait -m 16

This puts the serial port into TCP server mode, so you can simply telnet into the serial port. As always change the memory are your own discretion.

I’m not dead

It’s just been really busy with this move, unpacking and the usual losing things, finding things and breaking things.

In the middle of it all, I found something online, that I want to at least do some proper article thing about… As it’s been really exciting, and goes back to the first month I started this blog.

Outside of NeXTSTEP, the other i386 commercial version of Mach 2.5 surfaced, the Mt Xinu version!

Even better, a few years ago, I had stumbled onto the source code for 2.5 buried on the 4th disc of the CSRG set, and with a LOT of luck and persistence I can confirm that the sources are complete enough to build.

loading vmunix.sys
rearranging symbols
text	data	bss	dec	hex
389088	45564	101364	536016	82dd0
ln vmunix.sys vmunix; ln vmunix vmunix.I386x.STD+WS-afs-nfs

However, as luck always has it, start.s in the i386 code does something weird at the 3GB mark causing a triple fault on any kind of modern emulation/virtualization setup.

    / Fix up the 1st, 3 giga and last entries in the page directory
    mov     $EXT(kpde), %ebx
    and     $MASK, %ebx

    mov     $EXT(kpte), %eax
    and     $0xffff000, %eax
    or      $0x1, %eax

    mov     %eax, (%ebx)
    mov     %eax, 3072(%ebx)        / 3 giga -- C0000000

    mov     $EXT(kpde), %edx
    and     $MASK, %edx

Not all that sure why, but at least on Bochs, I can see the triple fault.

00036527018d[CPU0  ] page walk for address 0x0000000000101122
00036527018d[CPU0  ] page walk for address 0x00000000e0000011
00036527018d[CPU0  ] PDE: entry not present
00036527018d[CPU0  ] page fault for address 00000000e0000011 @ 0000000000101124
00036527018d[CPU0  ] exception(0x0e): error_code=0002
00036527018d[CPU0  ] interrupt(): vector = 0e, TYPE = 3, EXT = 1
00036527018d[CPU0  ] page walk for address 0x00000000c0161370
00036527018d[CPU0  ] PDE: entry not present
00036527018d[CPU0  ] page fault for address 00000000c0161370 @ 0000000000101122
00036527018d[CPU0  ] exception(0x0e): error_code=0000
00036527018d[CPU0  ] exception(0x08): error_code=0000
00036527018d[CPU0  ] interrupt(): vector = 08, TYPE = 3, EXT = 1
00036527018d[CPU0  ] page walk for address 0x00000000c0161340
00036527018d[CPU0  ] PDE: entry not present
00036527018d[CPU0  ] page fault for address 00000000c0161340 @ 0000000000101122
00036527018d[CPU0  ] exception(0x0e): error_code=0000
00036527018i[CPU0  ] CPU is in protected mode (active)
00036527018i[CPU0  ] CS.mode = 32 bit
00036527018i[CPU0  ] SS.mode = 32 bit
00036527018i[CPU0  ] EFER   = 0x00000000
00036527018i[CPU0  ] | EAX=e0000011  EBX=0015f000  ECX=00161dc1  EDX=0015f000
00036527018i[CPU0  ] | ESP=0000efbc  EBP=0000efbc  ESI=00193fb8  EDI=00009d84
00036527018i[CPU0  ] | IOPL=0 id vip vif ac vm RF nt of df if tf SF zf af PF cf
00036527018i[CPU0  ] | SEG sltr(index|ti|rpl)     base    limit G D
00036527018i[CPU0  ] |  CS:0028( 0005| 0|  0) 00000000 ffffffff 1 1
00036527018i[CPU0  ] |  DS:0020( 0004| 0|  0) 00000000 ffffffff 1 1
00036527018i[CPU0  ] |  SS:0010( 0002| 0|  0) 00001000 0000ffff 0 1
00036527018i[CPU0  ] |  ES:0020( 0004| 0|  0) 00000000 ffffffff 1 1
00036527018i[CPU0  ] |  FS:0000( 0005| 0|  0) 00000000 0000ffff 0 0
00036527018i[CPU0  ] |  GS:0000( 0005| 0|  0) 00000000 0000ffff 0 0
00036527018i[CPU0  ] | EIP=00101122 (00101122)
00036527018i[CPU0  ] | CR0=0xe0000011 CR2=0xc0161340
00036527018i[CPU0  ] | CR3=0x00000000 CR4=0x00000000
00036527018i[CPU0  ] 0x0000000000101122>> add byte ptr ds:[eax], al : 0000
00036527018d[SIM   ] searching for component 'cpu' in list 'bochs'
00036527018d[SIM   ] searching for component 'reset_on_triple_fault' in list 'cpu'
00036527018e[CPU0  ] exception(): 3rd (14) exception with no resolution, shutdown status is 00h, resetting

Mach 3.0 doesn’t do this, so I’ll have to dig far deeper into start.s which is kind of really beyond me.

Building a boot disk … is involved. 😐

rm -rf /usr/src/mach25-i386/obj
mkdir /usr/src/mach25-i386/obj
cd /usr/src/mach25-i386/standi386at/boot
make fdboot
/home/user/mkfs /dev/rfloppy 2880 18 2 4096 512 32 1
dd if=/usr/src/mach25-i386/obj/standi386at/boot/boot.fd of=/dev/rfd0d
/home/user/fsck -y /dev/rfloppy
cd /usr/src/mach25-i386/
make
mount /dev/floppy /mnt
cp /usr/src/mach25-i386/obj/STD+WS-afs-nfs/vmunix /mnt
sync
umount /mnt
/home/user/fsck -y /dev/rfloppy

So, I’m not all that dead. For anyone super impatient, you can download my VMDK here, which runs on Qemu & VMware, it includes a serial terminal on COM1 so you can use a real terminal, and if you are like me, uuencode/uudecode files in & out of the system. As always read the 404 page for the current username/password.

A tale of two kernels.

Back in the early 1990’s Microkernel’s were all the rage. Everywhere you would go you’d hear all about Pink, Taligent, Windows NT, and the grand daddy of them all Mach.

Probably the most well known debate about microkernel vs monolithic kernels was the Tanenbaum vs Torvalds debate that raged on comp.os.minix back in 1992. You can read the entire thing here : ( http://www.oreilly.com/catalog/opensources/book/appa.html ). It was interesting in the sense that even Ken Thompson of UNIX fame even chipped in. Tanenbaums’s major points were that a microkernel is more inherently portable than a monolithic kernel, and that microkernels could be more reliable, and easier to maintain. Of course more than 10 years later we can see that Linux still flourishes, and that outside of Windows NT & OS X no mainstream OS relies on a microkernel. Even OS X treats mach more as a call library than a traditional microkernel, since all of the exe’s in Darwin / OS X are Mach-O format, not COFF/ELF,A.OUT, etc etc.

Mach started out as a project from CMU derived from the UNIX source code, to try to re-invent the lower levels of UNIX into something that would scale easier to multiprocessors, support for threads, and the holy grail of them all, expand it’s portability. Sadly the first few versions of Mach are barred from distribution due to their inclusion of encumbered UNIX source code. However when the university of Utah picked up Mach, and released Mach4 (UK22, info here http://www.cs.utah.edu/flux/mach4/html/Mach4-proj.html ), including full source. Also they provided Lites, a BSD server that can run atop Mach, giving the user a ‘UNIX’ system, as it were.

Lites

Lites

So digging through some network groups, and testing stuff, I finally slapped together the pieces, and built a Mach/Lites system on NetBSD 1.1 . And how does it perform? It’s significantly slower than NetBSD is. You can tell that the amount of context switching involved in Mach as a program makes a call the microkernel, which in turn validates & passes it to the server, which further validates, runs the process, then sends the results back to the kernel, which then passes it back to the program. I’ve heard in a worse case scenario a 500% reduction in speed. You can always read more info on the fine wiki article here ( http://en.wikipedia.org/wiki/Mach_kernel ).

Lots of people will argue that microkernel’s have simply failed, and that it’s simply an example of what seemed like a good idea being pushed too hard once it was found to fail. It a lot of ways it reminds me of ADA.

So for now I’m going to provide a Qemu image of Lites running on Mach 4. unzipping the file will provide you with a lites.cmd file which for windows users you can just run directly. Things to note are:

-The version of Qemu that I’ve bound requires libpcap to be installed.
-Mach4 can only address 16mb of ram, due to DMA issues across the 16mb line.
-I’ve enabled user mode networking so that
-the cmd file sets up local port 23 to be redirected to the VM. This will allow you to telnet in simply by ‘telnet localhost’. You may want to use putty for better terminal handeling
-The included Gcc 2.4.5 is ancient. Outside of building a simple irc client, I wouldn’t expect much.
-The boot process is broken, and it’ll parse through the rc scripts twice. Just let it do it’s thing, and it’ll drop to a login prompt.

Logging into Lites/NetBSD

Logging into Lites/NetBSD

Other than that it behaves just like a NetBSD 1.1 machine.

Notice that grub boots the kernel /Mach.UK22 . When Mach boot’s it’ll load up the files emulator & startup. The ‘emulator’ is the Lites microkernel. Once it’s loaded it’ll start mach_init which just symlinks to /sbin/init and the normal NetBSD bootup will commence.

You can download my image directly here as MachUK22-lites-nat.zip.

Next time we’ll play with SLS Linux.