Back in 1995 I bought this rather expensive, and ambitious book simply called: Developing Your Own 32-Bit Operating
And while it is a LONG read, it really is the embodiment of Apple pie from scratch. Rather than rely on open and available tools, the author Richard Burgess instead goes on to write his own assembler, compiler, and then onward to a simple message passing RTOS.
No doubt the price he paid for eschewing popular GNU tools, and having a non BSD/GPL license for the time is that it was quickly relegated to history as the inevitable rise of Linux took place.
For those wishing to look, not only is the source code and a few patches available on the site ipdatacorp.com, but so is a PDF of the 1st edition of the book.
While MMURTL may not have caught on in the marketplace of ideas, it’s still astounding to look at the volume of work produced, that even though open source tools and starting points were available (The book easily could have been using CMU Mach 3.0) instead it’s all written from scratch by a single person.
I didn’t realize that I never uploaded this over there. After a discussion on the passing anniversary on the TUHS mailing list I had to dig out my installed copy.
I had forgotten just how rough around the edges this was, as it’s missing quite a few utilities from the Net/2 tape, and isn’t complete enough to come up in multiuser mode, but it is capable of booting up.
Although 386BSD itself was really short lived with its effective short death in the subsequent release it paved the way for an internet only release of a BSD Unix by just 2 people. And it closed up the glaring hole of the lack of a free i386 port of Net/2.
The natural competition was Mach386, which was based around the older 4.3BSD Tahoe, and the up and coming BSDI, which had many former CSRG people which were also racing to deliver their own i386 binary / source release for sale.
One thing about this era is that you had SUN apparently forced out of the BSD business instead to work with the USL on making SYSV usable, leaving NeXT as the next big seller of BSD. The commercial world was going SYSV in a big way, and the only place that was to have a market was on the micros. And for those of us who wanted something open and free 386BSD paved the way realizing the dream of the Net/2 release. A free Unix for the common person, the true democratization of computing by letting common people use, develop and distribute it independently of any larger organization.
It’s almost a shame that GNU had stuck with the unrealized dream of a hierarchy of daemons, instead of adopting the BSD kernel with a GNU userland, on top of that tendy micro kernel Mach.
The landscape radically changed with the infamous ad proudly proclaiming “It’s UNIX”.
While USL was happy to fight both BSDI and the CSRG they never persued Bill Jolitz. And after the internet flame and lawsuit dragged on, neither of the splinter groups NetBSD or FreeBSD caught up, although both did reset upon the release of the 4.4BSD Lite 2 code.
Ever since I got my hands on the Mt Xinu disk images, I’ve been working to see if the old Mach kernels on the CSRG CD-ROM set are actually buildable and runnable. And the TL;DR is that yes, they are.
The CD has 3 Mach kernels, the MK35 kernel, a kernel that appears to be something called X147, and a release of Mach 3.0. While X147 has hardware support for the SUN-3 and most of the files for the VAX, only MK35 has hardware support for the i386. The MK35 kernel has incomplete Makefiles and other dependencies, while X147 lacks i386 support. The good news is that it’s possible to use portions of the missing config & Makefiles from X147 to fill in for MK35, as it’s possible to copy the platform code from MK35 along with the i386 specific config into X147, yielding 2 working kernels.
Now this leads to the next few issues. The hardware support appears to be code ‘donated’ from various OEMs from Intel, Olivetti, Toshiba, OSF, and the CSRG. Dates vary from 1987 to 1991.
I started with the MK35 kernel as it was smaller, and since it was tagged as an ‘Intel only release’ of Mach, I figured that this one had the best chance of actually working.
And this is as far as it got on it’s first attempted boot. The Qemu VM would immediately reboot. Since I had installed Mt Xinu on VMware I went ahead and tried it there, and it said that there was a critical CPU exception and that it was shutting down. Bochs did the same thing, as did PCem. Since nothing was being printed to the screen it must be failing in the locore.s which is split into several assembly modules. I put in a hlt at various points and kept rebuilding and rebooting to see if it would halt or if it’d reboot. Thankfully VM’s are cheap and plentiful, I can’t imagine how tedious this would be on actual hardware. Eventually I found out that right after the paging bit in CR0 was flipped the VM would reboot. Now I had something.
/ turn PG on
mov %cr0, %eax
or $PAGEBIT, %eax
mov %eax, %cr0
mov %edx, %cr3
I had tried not flipping the page bit, not flipping cr3, no matter what I tried it would triple fault and reboot.
I had to break down and beg for help, and as luck would have it, someone who knows a heck of lot more about the i386 than I could ever hope to know took a glace at the above code and immediately noted:
I looked at start.s. And it immediately jumped out at me as being very fishy. What they do is enable protected mode *and* paging, but only then load CR3. That’s something which may well work on some CPUs, but it’s against the rules. You could try just swapping the instructions around, first load CR3, then CR0. The next question is then if that code executes out of an identity-mapped page; if yes then just swapping the instructions should do the trick, if not then there is a bigger issue.
Background: Old CPUs, especially 386/486, will decode and pipeline several instructions past the protected mode switch (mov cr0, eax). The jmp instruction is there to flush that pipeline and make sure all further instructions are executed with the new addressing mode in effect. But old CPUs did not enforce that and it was possible to execute the jmp from a non-identity-mapped page, and I guess it was also possible to execute instructions between the move to cr0 and the jmp, at least most of the time. That tends to break on modern CPUs (probably P6 and later) and definitely in emulation/virtualization. The move to cr0 effectively flushes the pipeline and if the next instruction is not in the page tables, poof, there goes the OS.
Could it really be that simple?
mov %edx, %cr3
/ turn PG on
mov %cr0, %eax
or $PAGEBIT, %eax
mov %eax, %cr0
/ mov %edx, %cr3
I commented out the cr3 line and just pasted above the cr0 pagebit flip.
Amazingly the kernel booted. Behold the first boot of Mach/4.3 which very well could be the first boot independent of the CMU and I’d venture the first boot from the source on the CSRG CD-ROM set. I tried to tell Mach to use the disk as prepared by Mt Xinu, but naturally it’s incompatible.
The next thing to do was create a root diskette, which thankfully the CMU folks left the needed files in the standi386at directory. I was able to build the disk, and using VMware I could boot into single user mode. I went through the ‘unpublished’ documentation I was able to mirror, and was able to get lucky enough to have Mach prepare the hard disk, format the partitions, and I used tar to transfer the root diskette onto the hard disk. I thought it ought to be possible to boot from the boot disk, have it mount the hard disk, and re-mount the boot disk, and copy the kernel. Sounds reasonable right?
This is where the incredibly stale platform code showed it’s head once more again as the floppy driver in MK35 is amazingly useless. It seems that the emulated hardware is too fast? But all reads from the floppy using the hard disk as root failed. Instead I removed a bunch of files from the disk, and copied over gzip & a compressed copy of the kernel to disk, along with the boot.hd program, and was able to copy them to hard disk using that modified root diskette. Luckily Mach has support for a.out binaries, and this stuff being so old it’s all statically linked. My Mt Xinu build of gzip runs fine on the Mach kernel, so I could decompress the kernel and install the bootblocks.
This is where the next weird issue would happen, which is that Mach was quite insistent on mounting everything under this /RFS directory. It appears that RFS was CMU’s answer to NFS… Which needless to say didn’t ignite the world on fire. I was later able to find that I could disable the RFS code, re-configure, rebuild and re-transfer a kernel and with a bit of fighting with mount I was able to mount hd0d/hd0e. Sadly during the install process there was no visible option to specify slice sizes so I’m stuck with a 10MB root.
With this much luck in hand I thought it may be interesting to see if Mt Xinu could mount the Mach disk. Turns out that it can without any issues. So I went ahead and wiped the Mach disk, and transfered Mt Xinu over to the Mach disk, and rebooted with that. And it “works”! Although of course there is some caveats.
The first being the aforementioned floppy support is broken. The next one being that the serial support also suffers from basically losing interrupts and leaving the system waiting. The kernel debugger still works, and you can see it in the idle loop, along with the other threads waiting. This means my favorite method of using uuencode and pasting to the terminal won’t work, MK35 locks up after 35kb, and X147 made it as far as 150kb. Keep in mind that they are using the same i386/i386at platform directories.
So I’m quite sure that there is other issues hiding in the code, maybe obvious ones like the cr3/cr0 thing. On the other front I’ve been starting at looking at doing some porting of the Tahoe/Quasijarus userland with varying success. I have already started to rebuild some binaries with a substitute crt0.o as there is no source for anything included in the Mt Xinu distribution outside of the Mach 3.0 kernel.
For those who want to play along I have uploaded VMDK’s and the source tarballs.
For people using Qemu I find that a serial terminal is FAR nicer to use than the console. Also I’m unsure of how hard the 16MB ISA DMA window is being hit, but X147 seems okay with 64MB of ram, while M35 really needs to be 50MB or less..
text data bss dec hex
389088 45564 101364 536016 82dd0
ln vmunix.sys vmunix; ln vmunix vmunix.I386x.STD+WS-afs-nfs
However, as luck always has it, start.s in the i386 code does something weird at the 3GB mark causing a triple fault on any kind of modern emulation/virtualization setup.
/ Fix up the 1st, 3 giga and last entries in the page directory
mov $EXT(kpde), %ebx
and $MASK, %ebx
mov $EXT(kpte), %eax
and $0xffff000, %eax
or $0x1, %eax
mov %eax, (%ebx)
mov %eax, 3072(%ebx) / 3 giga -- C0000000
mov $EXT(kpde), %edx
and $MASK, %edx
Not all that sure why, but at least on Bochs, I can see the triple fault.
00036527018d[CPU0 ] page walk for address 0x0000000000101122
00036527018d[CPU0 ] page walk for address 0x00000000e0000011
00036527018d[CPU0 ] PDE: entry not present
00036527018d[CPU0 ] page fault for address 00000000e0000011 @ 0000000000101124
00036527018d[CPU0 ] exception(0x0e): error_code=0002
00036527018d[CPU0 ] interrupt(): vector = 0e, TYPE = 3, EXT = 1
00036527018d[CPU0 ] page walk for address 0x00000000c0161370
00036527018d[CPU0 ] PDE: entry not present
00036527018d[CPU0 ] page fault for address 00000000c0161370 @ 0000000000101122
00036527018d[CPU0 ] exception(0x0e): error_code=0000
00036527018d[CPU0 ] exception(0x08): error_code=0000
00036527018d[CPU0 ] interrupt(): vector = 08, TYPE = 3, EXT = 1
00036527018d[CPU0 ] page walk for address 0x00000000c0161340
00036527018d[CPU0 ] PDE: entry not present
00036527018d[CPU0 ] page fault for address 00000000c0161340 @ 0000000000101122
00036527018d[CPU0 ] exception(0x0e): error_code=0000
00036527018i[CPU0 ] CPU is in protected mode (active)
00036527018i[CPU0 ] CS.mode = 32 bit
00036527018i[CPU0 ] SS.mode = 32 bit
00036527018i[CPU0 ] EFER = 0x00000000
00036527018i[CPU0 ] | EAX=e0000011 EBX=0015f000 ECX=00161dc1 EDX=0015f000
00036527018i[CPU0 ] | ESP=0000efbc EBP=0000efbc ESI=00193fb8 EDI=00009d84
00036527018i[CPU0 ] | IOPL=0 id vip vif ac vm RF nt of df if tf SF zf af PF cf
00036527018i[CPU0 ] | SEG sltr(index|ti|rpl) base limit G D
00036527018i[CPU0 ] | CS:0028( 0005| 0| 0) 00000000 ffffffff 1 1
00036527018i[CPU0 ] | DS:0020( 0004| 0| 0) 00000000 ffffffff 1 1
00036527018i[CPU0 ] | SS:0010( 0002| 0| 0) 00001000 0000ffff 0 1
00036527018i[CPU0 ] | ES:0020( 0004| 0| 0) 00000000 ffffffff 1 1
00036527018i[CPU0 ] | FS:0000( 0005| 0| 0) 00000000 0000ffff 0 0
00036527018i[CPU0 ] | GS:0000( 0005| 0| 0) 00000000 0000ffff 0 0
00036527018i[CPU0 ] | EIP=00101122 (00101122)
00036527018i[CPU0 ] | CR0=0xe0000011 CR2=0xc0161340
00036527018i[CPU0 ] | CR3=0x00000000 CR4=0x00000000
00036527018i[CPU0 ] 0x0000000000101122>> add byte ptr ds:[eax], al : 0000
00036527018d[SIM ] searching for component 'cpu' in list 'bochs'
00036527018d[SIM ] searching for component 'reset_on_triple_fault' in list 'cpu'
00036527018e[CPU0 ] exception(): 3rd (14) exception with no resolution, shutdown status is 00h, resetting
Mach 3.0 doesn’t do this, so I’ll have to dig far deeper into start.s which is kind of really beyond me.
Building a boot disk … is involved. 😐
rm -rf /usr/src/mach25-i386/obj
/home/user/mkfs /dev/rfloppy 2880 18 2 4096 512 32 1
dd if=/usr/src/mach25-i386/obj/standi386at/boot/boot.fd of=/dev/rfd0d
/home/user/fsck -y /dev/rfloppy
mount /dev/floppy /mnt
cp /usr/src/mach25-i386/obj/STD+WS-afs-nfs/vmunix /mnt
/home/user/fsck -y /dev/rfloppy
So, I’m not all that dead. For anyone super impatient, you can download my VMDK here, which runs on Qemu & VMware, it includes a serial terminal on COM1 so you can use a real terminal, and if you are like me, uuencode/uudecode files in & out of the system. As always read the 404 page for the current username/password.
The new dynamic recompiler appears to be much more faster, although if you want maximum performance, make sure to set your video card to the fastest possible performance.
I was doing my typical DooM thing, and the performance was abysmal. But I did have an 8bit VGA card selected, so what would I really expect? Interestingly enough in ‘low resolution’ mode it performed quite well, but setting it to the artificial ‘fastest PCI/VLB’ speed it was performing just great.
This was a lot harder than it should have been. And not because of gcc or surprisingly ancient binutils.
I didn’t have much to go on, as ancient threads like this, or this end up unanswered or without any good conclusion. I guess it’s not surprising that all the attention is to MiNT & MINIX rather than the native platform. But I was not deterred.
The reason why this was so freaking hard was how so much of key parts of gcc for the ST have been purged and what remains being scattered to the winds. Amazingly the hardest thing to source is the include files. There is a GCC 1.30 file on all the usual GNU mirrors but to save a few kb it has no headers, instead it wants you to reuse the ones from the 1.25 binary distribution. Which is gone. There survives a pl95 binary and source package, but again no includes. Instead I got lucky with all three for pl98. Which has a lot of GCC2 hooks so I cheated on getting the 1.30 hello world by using the 2.5.8 pre-processor.
It’s kind of annoying how all these seemingly tiny files get purged to save a few kb. Just as I can’t for the life of me find the old original GNU libc.
Speaking of files, ZOO has to be the worst compressor ever. Not only is it just overall worse than ZIP, but there are 2 incompatible compression methods, like the introduction of LZD, which any of the good versions of UNZOO can’t deal with. And sure there is zoo210.tar.Z but despite being able to build it on multiple platforms it never does anything useful. All these ancient fileformats sure don’t help anything. And sure there is a MS-DOS version that the MS-DOS Player can run, but get ready for 8.3 filename renaming.
The one good thing that came out of this experience is that since I am building form i386 to 68000 I found that this setup uses the G++ linker which has endian swapping. So maybe I can complete the chain for Mint and MachTen.
I even got the 1987 Infocom interpreter running. Although I don’t know what the deal is, it seems the larger the GCC based program is the higher chance it’ll just crash on exit or force the next program to crash. Building anything native under emulation was an impossibility.
In the same effort, I’ve had the same luck with sozobon. It took way too long to find a working dlibs. I don’t know why people couldn’t either package them together or at least in the same directory. It took far longer just to find the libs… But it was still fun to get that one running as well.
It’s a far more manual process to compile as I have to invoke each stage manually, but at least I’m finally able to get things going.
One of the bigger issues is that I would always find libraries in this olb file format, that the linker from Sozobon wouldn’t recognize. And almost every attempt of trying to build the G++ linker would also fail on. It wasn’t until I was able to get the pl98 include files that I could finally get a linker to actually recognize this … seemingly different for no apparent reason format to actually link. After then I managed to finally find a build of this dlibs that would actually link with Sozobon, which naturally didn’t use olb at all.
So yeah that was an adventure.
I haven’t cleaned it up at all, and really wouldn’t expect anyone else to care, but all my mashed together work (source & binaries!) is here: MinGW-AtariTOS.7z
I started browsing more cd.textfiles.com and amazingly found a ‘home made CD-ROM set’ of Atari software, and buried in the gigabytes of stuff was 4 of the 5 disks of the original GCC-1.23! Namely the source & includes to the first GCC library. I didn’t think this article was going to get any traction, let alone downloads. So many people downloaded the above download.
The default download set is for GCC-1.30, with the headers & lib, along with source. It’s crazy small which just goes for how this old stuff is, and how impact full for losing a few kb.
Also the shell that you use apparently makes a BIG difference. The shell that I was using EmuCON doesn’t show any output from the GCC 1.x libs. However other shells most certainly do. I’ll have to do another update regarding shells/emulation.
While looking around for simple compilers to see how easy it is to modify their assembly output syntax, I ran across this tiny file, cc68iii3.zip which bills itself as:
This compiler consists of various modules that build up a front end — these modules are common to all versions of this compiler — consisting of parser, analyzer and optimizer, of modules that are specific for the target processor, namely *68k.c (for the 68000) and *386.c (for the i386), and of assembly language output modules that are further dependent on the (syntax of the) target assembler.
Well isn’t that interesting! So instead of doing something 68000 based, I setup the i386-gas compiler, and tried it with MinGW. And amazingly a hello world program worked!
Linux supports ELF binaries for ~25 years now. a.out coredumping has bitrotten quite significantly and would need some fixing to get it into shape again but considering how even the toolchains cannot create a.out executables in its default configuration, let’s deprecate a.out support and remove it a couple of releases later, instead.
I can’t say I’m all that surprised, maintaining backwards compatibility has not really been a Linux thing, as most people are incapable of doing any troubleshooting in the myriad of hundreds to thousands of independent packages, and instead find it far easier to switch to a different distro entirely.
At the same time, the vast majority of Linux packages are available in source code, so re-building things as ELF most likely has happened in the last 25 years.
During the great ELF migration, it was a gigantic PITA having basically 2 copies of all the libraries as things were converted over, and a.out stuff quickly evaporated. For me, the beauty of a.out was for later kernels to be able to mount and run older stuff. But as we are in the era of both ‘cheap’ user mode kernels along with virtualization will the old executable format truly be missed?
Linux has survived the removal of native support for the 80386, and even the detection logic for the NexGen processors (yes they were real, and yes they did ship), so no doubt this further amputation won’t matter to the vast majority.
I have to wonder how long until the i386 32bit target is removed? Distros like Debian have long since removed support for 80486/80586 classed processors to bring the minimum up to requiring SSE-2 based instructions, and I can’t imagine anyone who is running a 32bit OS for their main OS in this day and era.
And Alan Cox points out that the a.out binary loader _could_ be done in user space if somebody wants to, but we might keep just the loader in the kernel if somebody really wants it, since the loader isn’t that big and has no really odd special cases like the core dumping does.
So retrohun is doing their blog thing on github of all things, and the latest entry, is of course Xenix tales. As mentioned in comments on this blog & other places they found another driver for Xenix TCP/IP!
Going back years ago, the tiny NIC driver support for the elderly Microsoft/SCO Xenix 386 v2 included 3COMA/B/C and SLIP. However it’s been recently unearthed that D-Link had drivers for their DE-100 & DE-200 models, and as it happens the DE-200 is a NE-2000 compatible card!
That means that Qemu can install/run Xenix, and it can get onto the internet* (there is a catch, there is always a catch).
You can download the driver either from github or my password protected mirror. Simply untar the floppy under Xenix (tar -xvf /dev/fd0) and do the install via ‘mkdev dlnk’
Setting up the driver is… tedious. Much like the system itself. I found Qemu 0.90 works great, and is crazy fast (in part to GCC 3) even though Qemu 0.9’s floppy emulation isn’t good enough to install or read disks. With all the updates to Qemu 3.1 use that, it’ll read the disks, and allow for networking.
To give some idea of speed I ran the age old Dhrystone test, compiled by GCC 1.37.1 and scored the following:
Dhrystone(1.1) time for 5000000 passes = 8 This machine benchmarks at 625000 dhrystones/second
When compared to the SGI Indy’s 133Mhz R4600SC score of 194,000 @ 50000 loops that makes my Xeon W3565 322 times faster, under Qemu 0.90! And that’s under Windows!
Setting up the commandline/launching is pretty much this:
qemu.exe -L pc-bios -m 16 -net nic,model=ne2k_isa -net user -redir tcp:42323::23 -hda ..\xenix.vmdk added SLIRP adding a [GenuineIntelC] family 5 model 4 stepping 3 CPU added 16 megabytes of RAM trying to load video rom pc-bios/vgabios-cirrus.bin added parallel port 0x378 7 added NE2000(isa) 0x320 10 pci_piix3_ide_init PIIX3 IDE ide_init2  s->cylinders 203 s->heads 16 s->sectors 63 ide_init2  s->cylinders 0 s->heads 0 s->sectors 0 ide_init2  s->cylinders 2 s->heads 16 s->sectors 63 ide_init2  s->cylinders 0 s->heads 0 s->sectors 0 added PS/2 keyboard added PS/2 mouse added Floppy Controller 0x3f0 irq 6 dma 2 Bus 0, device 0, function 0: Host bridge: PCI device 8086:1237 Bus 0, device 1, function 0: ISA bridge: PCI device 8086:7000 Bus 0, device 1, function 1: IDE controller: PCI device 8086:7010 BAR4: I/O at 0xffffffff [0x000e]. Bus 0, device 1, function 3: Class 0680: PCI device 8086:7113 IRQ 0. Bus 0, device 2, function 0: VGA controller: PCI device 1013:00b8 BAR0: 32 bit memory at 0xffffffff [0x01fffffe]. BAR1: 32 bit memory at 0xffffffff [0x00000ffe].
In the file /etc/tcp the default installation does a terrible job of setting up the NIC. I changed the ifconfig line to this:
So there you go, all 20 Xenix fans out there! Not only a way to get back online, but to do it in SPEED!
Thanks to Mark for pointing out that there has been tremendous progress with version 3.1 of Qemu, and it’s TCG user speed is up to the 0.90 levels of speed (at least with dhrystone/Xenix), and it just takes a little (lot) of massaging to get up and running with Xenix with the right flags:
While I’m writing this, I’m listening to Neuromancer via WinAmp & the ancient Speex plugin I had updated about 8 years ago.
I took my Surface, and downgraded it to the North American 8.0 version without updates, added my MS ID, and from there ran the jailbreak and win86emu (sometimes called x86node) and from there I was able to run some simple Win32 exe’s.
Even though I had done a simple cut down QuakeWorld port with the GDI only display, using win86emu it’ll run the 80386 build as well. while I haven’t thrown much at it, I’m just amazed that so far things are working.
When you think that between the jail-break to unlock the ability to run programs combined with a CPU translator, and Win32 to Win32 thunk / translation program, why on earth did this thing ship without it? It’s amazing that between trying to launch a platform with no inertial for applications after Android & Apple were selling millions of hardware units, and billions of software units, and cutting the past applications. It’s just crazy.
And then Microsoft did their normal thing when something goes wrong, which is basically end it, and destroy all evidence it existed. There is no Windows 10 upgrade for the Surface, even though Windows 10 IOT has been hacked to run from a USB stick on the Surface, but it’s insanely slow.