Building MIT PC/IP, or making apple pie

“If you want to make a pie from scratch, you must first create the universe”

–Carl Sagan

A little while ago I had touched briefly on the availability of of a PCC port to the 8086 done back in the early 1980’s that was hosted on VAX running 4.1BSD. I’d ruled it basically useless as you are restricted to 64kb .COM files, and I couldn’t get much of anything interesting running on it.

After all the fun of setting up NetManage Chameleon on Windows 3.0, over on IRC lys had mentioned I should try the MIT PC/IP stack. I never knew anything about it’s history but it became the first PC TCP/IP stack. You can read more about it from Internaut?

Dave Clark had gone to England for sabbatical and while he was there, had implemented TCP/IP in BCPL for the TRIPOS operating system, a predecessor of the Commodore AMIGA operating system. He brought the TCP/IP code back with him, and the Lab for Computer Science had a bunch of Xerox Alto workstations, and someone at LCS ported Dave’s TCP/IP to the Alto.

Then someone ported it to Version 6 UNIX and rewrote it in C. And that was what we took, and ported to PC DOS. At that point there were no C compilers that ran on the PC, and we were using a compiler that ran on a PDP 11/45 on Version 6 UNIX, and then on a VAX 750 running BSD v4.1. That was the AT&T; portable C compiler, and a group of people on the fourth floor of the LCS had written an 8088 code generator for it. This was before Microsoft C, and before 4.2 BSD.

Our first tasks were to bring up TFTP, TCP, and a telnet client under DOS. Several people were involved. Lou Konopelski did the initial TCP and telnet work, and Dave Brigham did similar work to what I did.

John Romkey – InternautHow PC-IP Came to Be

What is even more notable about PCIP is that it’s the reason the whole ‘MIT License’ even exists, as word seems to have spread quickly about a TCP/IP stack for the IBM PC compatible market. Jerome Saltzer has documented it all, if you want to read about it (WARNING PDF!)

Romkey would even then go on to found FTP software in that wonderful pre public internet era, before the eternal September.

Over on bitsavers there are 3 files:

[   ]8086_C_19850820.tar2019-03-12 14:36750K 
[   ]PC-IP_19850124.tar2019-03-12 11:534.6M 
[   ]PC-IP_19860403.tar2019-03-12 11:536.9M 
bitsavers.trailing-edge.com/bits/MIT/pc-ip/

Of course, the one thing that stands out is that these files look tiny. They aren’t even compressed! PCC, or the Portable C Compiler was released from AT&T, itself written in C, to make porting the C compiler easier to further allow Unix to be further easily ported. Now that I kind of had a mission, I decided to take the 8086 PCC leap, again.

Get the time machine ready!

A man, his best friend and a time machine! – Microsoft Sydney

Thankfully I had that 4.1c BSD image still up on sourceforge, aptly named: 4.1c BSD.7z, so I could follow my old instructions to create the tap file and start working with 8086 C, going back from 1985. And in no time, I had re-built the compiler, and assembler up and running. But I wanted more, as much fun as 4.1BSD is, I wanted to run everything natively on Windows.

At this point I’d remembered that this setup is a bit odd in that the object files that the assembler produces are OMAGIC (type 0407) a.out files. Thinking back to my old project of building Ancient Linux on Windows using vintage tools, it also uses OMAGIC a.out files! It’s not that unexpected as the original GNU ld linker from 1986 has hooks for both old magic & new magic (OMAGIC/NMAGIC) files, as this would be consistent from the era. Thinking this was my out, I might have a way of migrating the build process off of the VAX.

The first pass was to try to pull in all the VAX includes into my native Visual C++ 1.0, and get it to build with Microsoft C/C++ 8.0. The next thing to do of course, is look for where it’s doing anything with binary files and make sure it’s all set to O_BINARY/”rb”/”wb” where appropriate as MS-DOS/Win32/OS2 all handle text files differently from binary data. There is also fights with mktemp along with procedure name creep, like how ’round’ wasn’t a thing in 1980 but it sure is by 1993! Before the era of the 486DX/68040/Pentium not everyone had a math co-processor (FPU) so it’d make sense that a lot of things wouldn’t have this by default.

As a quick refresher the following diagram may be specific to the GNU GCC compiler, but the older PCC compiler uses the same methodology of first pre-processing files, then compiling them, assembling the resulting compiler output, then finally linking to an executable program. ( See – “So it turns out GCC could have been available on Windows NT the entire time“)

a rough explanation of how old C compilers work in stages

The steps for PCC 8086 are quite similar but to spell them out:

  • Pre-process with GNU cpp
  • Compile with PCC’s c86
  • Assemble with PCC’s a86
  • Link with GNU’s ld
  • Extract the MS-DOS .COM file with cvt86

There isn’t much to talk about the pre-processor, so I’ll skip it, suffice to say from my cl386 research, and porting GCC to OS2/NT, it just worked.

Compiling the compiler

Surprisingly getting the compiler running wasn’t too difficult. I do remember getting this running before, so seeing it run again wasn’t too much of a surprise. The simple C program:

main(){
printf("hi from 8086 pcc\n");
}

Gives us the following generated assembly:

        .data
        .text
        .globl  _main
_main:
        push    bp
        mov     bp,sp
        push    si
        push    di
        sub     sp,#LF1
        mov     ax,#L14
        push    ax
        call    _printf
        pop     cx
L12:
        lea     sp,*-4(bp)
        pop     di
        pop     si
        pop     bp
        ret
        LF1 = 0
        .data
L14:
        .byte   104,105,32,102,114,111,109,32
        .byte   56,48,56,54,32,112,99,99
        .byte   10,0

So far, so good, right! Even for fun, I was able to build it using Microsoft C 6.0! I figured I may as well try to get as much out of that purchase as possible.

Strage binary formats in a strange world

One thing that was a constant problem for me is that the assembler generated garbage, it never gave me the a.out OMAGIC file. Opening it up in a hex editor, Hex Editor Neo, and it showed problem, right away.

A simple do nothing program, assembled on the VAX

The OMAGIC sequence in hex should be written down as 07 01, but when I looked at the output from my PC port the file was not only too big but it didn’t have the headder:

The same program assembled on the PC

As you can see it’s just a bunch of zeros up front. Later on, I’d realize this was a ‘pad’ so it could be filled in later by the assembler when doing relocations. The file in question was rel.c which also should have been the hint. I don’t know if anyone saw it, but let me highlight it for you:

The OMAGIC header is being appended!

As you can see, where the file ends on the VAX, on the PC the OMAGIC header is being appended. I did a simple cut & paste in the editor, and the object file validated just fine. So why was this happening? In my rush to just add ‘binary’ flags to any file operations I had seen this in rel.c:

		(dout = fopen(Rel_name, "a")) == NULL)

I had taken this be an ‘append’ for whatever reason it would need to do that kind of thing. Well maybe that’s what it means in 1993, but in 1981 it doesn’t append, instead it just opens it normally. Is this a bug in the assembler, or a feature of 4.1BSD? Without debugging it, I just modified it to not append, as this was the only occurrence of an explicit append in the source code I could see.

		(dout = fopen(Rel_name, "wb")) == NULL)

And that did the trick, I now had a working assembler running on my PC!

But we are not out of the woods yet!

Naturally trying to build a much ‘larger’ Fibonacci program crashed the assembler. Luckily debugging it was a snap to find out that it was trying to free a static pointer. Or so I think. Today, in the future RAM is cheap, so I just commented out the offending free and now it was off to the linker.

When is advanced optimization a bad idea?

When the machine you wrote this for no longer exists. In the middle of the ld86 linker is this line:

		asm("movc3 r8,(r11),(r7)");

I have no idea why it’s there.

I don’t know what it should be doing.

This makes the linker un-portable.

However, as I’d mentioned before I did have the GNU linker that I’d successfully used to build Linux kernels, so there was hope!

C:\msvc32s\proj\8086pcc>\aoutgcc\bin\ld.exe -X -r -o hi.out crt0.b hi.b ./libc.a
C:\msvc32s\proj\8086pcc>cvt86 hi.out hi.com
C:\msvc32s\proj\8086pcc>msdos hi.com
hello from pcc for 8086!

I had now successfully run my first program without using the VAX. Although I had not mentioned a step yet, cvt86, this utility is described as creating a MS-DOS .COM file from an a.out file. I didn’t look into how it accomplishes this, but basically, it’s another linker. And this issue would come to complicate things as I had thought that everything was working.

libc & the heart of C

Libc, is simply put the central C library for getting everything done. While crt0 will setup the C environment everything else core from opening files, writing to the screen, and reading keyboard input is done through libc. So I thought re-building libc would be easy enough. To build the library you ‘archive’ them with the ‘ar’ archiver, then index them with ‘ranlib’. And again, from my a.out adventures building Linux I had both tools, however no matter what I was doing they did not work with cvt86. I wen’t back and rebuilt some kernels to verify, and yes it does generate archives but cvt86 was not happy.

I still had the SIMH VAX running in case I needed it, so I just broke down and figured I’d just port the VAX ar/ranlib to the PC. I don’t know off hand what the problem was, and I didn’t feel like spending an eternity to try to correct it, and both of the programs are somewhat portable. But of course it wasn’t that simple as ar opens files in strange ways that work on 4.1BSD but yeild chaos on the PC.

#define ARMAG   "!<arch>\n"

#define SARMAG  8

#define ARFMAG  "`\n"

‘ar’ has it’s own magic, a simple !<arch> and a ` followed by a new line. On any UNIX the \n is 10 in decimal 0xa in hex. But on the PC it’s carriage return and linefeed! And yes despite setting all the opens to binary, it was constantly injecting carriage returns & linefeeds all over the place! Some-how the file was being opened in text mode. Thankfully debugging even in old Visual C is great and inspecting the temporary files lead me to find this part:

			f = creat(file, larbuf.lar_mode & 0777);

In a few places it uses the creat (or create call. In an interview Dennis Ritchie had mentioned that this was one of his regrets in life, not naming creat create) call, which of course never has to set a mode, as everything is binary in Unix, unlike the PC. Great.

Luckily the fix was very simple after every creat, simply set the file mode to binary.

			setmode(f,O_BINARY);

Great!

Apple pie!

Fibonacci sequence

Now I could re-build libc from source and link it to the Fibonacci program!

Yes it’d take this long to get to the point where I can now easily edit file on a modern machine and have a Win32 native toolchain! VAX no longer required! We’ve successfully extracted everything we needed from the 1980’s!

First contact!

Now it’s time to look at what brought us on this adventure, MIT PC/IP.

The MIT PC/IP (PCIP) does require changes to the libc to work correctly. Unfortunately, they didn’t provide the full copy of the libc, but rather some ed scripts. So, the first question is do I even have the version these are based off of to start? I don’t know, so I had guessed, and I had guessed incorrectly.

3com 3c501

Configuring PCIP is somewhat involved, first you need MS-DOS 2.00 or greater which thankfully in the future is FREE! The next thing you need is a 3com 3c501 card. This is going to be a challenge but it’s just as any good time to mention 86box, and the latest version that I’ve been using that of course supports this kind of setup!

New version 4.1.1

I can’t recommend it enough, 86box is like all the inconveniences of old software & hardware without having to spend a fortune for weird combinations, fighting to have space for it. I naturally setup a 386sx with CGA, 20Mb hard disk and a 3c501 card. It’s nice being able to also be very task specific, this doesn’t have to be a DooM/Quake machine!

the first thing you need to do is add the netdev.sys device driver that is created as part of building PCIP from source. a simple:

DEVICE=\NETDEV.SYS

in your config.sys is more than enough. However, how do you configure it? Well it’s the ‘custom’ program that binary edit’s the device driver.

YES, IT EDITS THE DEVICE DRIVER.

Setting stuff up is somewhat straight forward, however it displays TCP/IP information in decimal not in hex. I haven’t even looked into the how or why.

The first page

The first page options are kind of banal, it’s back in the day when people would use finger to find people on the internet and call them up or send emails. How cute. 1985 was a different world!

hardware options

In the hardware options the only thing to check is the I/O base, IRQ & DMA for the Ethernet card. I just configured the card around 0x300/5/1 as it’s great being purpose built!

telnet options

There is a separate window for telnet options. Naturally high speed connections frame far too fast for something built from 1985. I found lowering the TCP windows really helped with pacing.

Site config

As I had mentioned the site configuration displays all the information in decimal. Also, I’m wasn’t sure what is going on with the netmask, but looking at the old Windows calculator revealed it was being stored in OCTAL. It’s probably why the addresses have commas instead of periods. Although I had configured DNS it didn’t work, as it uses UDP port 42. It’s clearly doing something very early 1980’s.

The status CR/LF is broken!

So close and yet so far away. The only thing to do was try to figure out which of the libc stuff was ‘newest’ in the pure state and try to figure out where to go from there.

Redo!

While I had not configured the libc correctly, I had partial success! I could actually establish a telnet session! Libc wasn’t working correctly as every line feed didn’t generate a carriage return, as you’d need for MS-DOS leaving it with broken status.

But at the same time, despite all the weird ‘challenges’ for the most part ‘it just worked’. Which is pretty cool!

It turns out the answer was the 8086_C_19850820 file. As far as I can tell there was only one thing that didn’t patch correctly but I was able to build a libc in no time.. that didn’t work. In the patch it removes ulrem/auldiv from arith.a86 Not sure why, I haven’t messed with it. But that means I had to restructure to build with the non floating point n86c compiler as that’s the way PCIP is expected to be built. After rebuilding with this compiler and this seemingly properly patched library I finally had it working!

Ping my local gateway!

Instead of a garbled mess, I had something I could read!

telnetting to my test BBS

Now instead of a garbled mess, I can see it was trying to display the connected IP, and a clock.

Sadly it doesn’t work with SLiRP. I’m sure it’s either classful routing or it really doesn’t like how SLiRP handles ARP. I suspect it’s also trying to do old style classful routing as well, which means you can’t just load arbitrary subnet masks wherever you want, to try to squeeze the 4 billion IP’s out of the internet.

The updated telnet client connecting to a test BBS

Final thoughts

I suspect that although there were binaries in the above tar files, going through the effort to rebuild PCIP really wasn’t all that expected for most people to carry out. Sadly, there was no shared source ‘sites’ online, and we’re lucky enough someone kept a few tarballs lying around. I really can’t blame them for sticking with then current development tools, especially for what you’d need to build a C compiler back in the early 80’s. It’s a shame the QL or the Macintosh didn’t have the RAM or the DASD capacity to become that home cross compiler of the 80’s.

Most project just require you to work on that actual project, while this has been a substantially larger undertaking from anything normal, but I guess I’ve learned a bit along the way with all those “pointless” GCC port things I’d done, well it turns out they are incredibly useful! It’s been a fun archeological expedition for me, thankfully C is still a thing, I wonder what happened to all the ADA/Perl/Pascal/”Wave of the future” stuff that is always disappearing. At least more and more people work on full system emulation so there is always that!

For anyone that curious you can find all the code over on github:

https://github.com/neozeed/8086pcc

Against my better judgement, I’ve added a binary package on github.

6 thoughts on “Building MIT PC/IP, or making apple pie

  1. That’s a decent effort.

    What’s the “msdos” command executing in your setup?

    In Linux I was able to run your pcc win32 binaries under WINE (from the command prompt – “wine cmd”), and the resulting DOS binary under DOSEMU.

    The C runtime might need a few tweaks since printf(“Hello!\n”) from the DOS binary only sends a LF (0x0A) to stdout instead of the CRLF pair that DOS (and DOSEMU) needs to move the cursor to the start of the next line.

    • msdos is the most excellent msdos player

      I’m sure it will work with the ‘corrected’ libc that mit used, although good pont I never tried anything that much more involved. In one of the dumps there is a note about turning on cr/lf

  2. “however it displays TCP/IP information in decimal not in hex”

    I think you mean octal rather than decimal? For example 377 octal is 255 decimal.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.