Teaching an Almost 40-year Old UNIX about Backspace

(This is a guest post by xorhash.)

Introduction

I have been messing with the UNIX®† operating system, Seventh Edition (commonly known as UNIX V7 or just V7) for a while now. V7 dates from 1979, so it’s about 40 years old at this point. The last post was on V7/x86, but since I’ve run into various issues with it, I moved on to a proper installation of V7 on SIMH. The Internet has some really good resources on installing V7 in SIMH. Thus, I set out on my own journey on installing and using V7 a while ago, but that was remarkably uneventful.

One convenience that I have been dearly missing since the switch from V7/x86 is a functioning backspace key. There seem to be multiple different definitions of backspace:

  1. BS, as in ASCII character 8 (010, 0x08, also represented as ^H), and
  2. DEL, as in ASCII character 127 (0177, 0x7F, also represented as ^?).

V7 does not accept either for input by default. Instead, # is used as the erase character and @ is used as the kill character. These defaults have been there since UNIX V1. In fact, they have been “there” since Multics, where they got chosen seemingly arbitrarily. The erase character erases the character before it. The kill character kills (deletes) the whole line. For example, “ba##gooo#d” would be interpreted as “good” and “bad line@good line” would be interpreted as “good line”.

There is some debate on whether BS or DEL is the correct character for terminals to send when the user presses the backspace key. However, most programs have settled on DEL today. tmux forces DEL, even if the terminal emulator sends BS, so simply changing my terminal to send BS was not an option. The change from the defaults outlined here to today’s modern-day defaults occurred between 4.1BSD and 4.2BSD. enf on Hacker News has written a nice overview of the various conventions.

Changing the Defaults

These defaults can be overridden, however. Any character can be set as erase or kill character using stty(1). It accepts the caret notation, so that ^U stands for ctrl-u. Today’s defaults (and my goal) are:

Function Character
erase DEL (^?)
kill ^U

I wanted to change the defaults. Fortunately, stty(1) allows changing them. The caret notation represents ctrl as ^. The effect of holding ctrl and typing a character is bitwise-ANDing it with 037 (0x1F) as implemented in /usr/src/cmd/stty.c and mentioned in the stty(1) man page, so the notation as understood by stty(1) for ^? is broken for DEL: ASCII ? bitwise-AND 037 is US (unit separator), so ^? requires special handling. stty(1) in V7 does not know about this special case. Because of this, a separate program – or a change to stty(1) – is required to call stty(2) and change the erase character to DEL. Changing the kill character was easy enough, however:

$ stty kill '^U'

So I wrote that program and found out that DEL still didn’t work as expected, though ^U did. # stopped working as erase character, so something certainly did change. This is because in V7, DEL is the interrupt character. Today, the interrupt character is ^C.

Clearly, I needed to change the interrupt character. But how? Given that stty(1) nor the underlying syscall stty(2) seemed to let me change it, I looked at the code for tty(4) in /usr/sys/dev/tty.c and /usr/sys/h/tty.h. And in the header file, lo and behold:

#define	CERASE	'#'		/* default special characters */
#define	CEOT	004
#define	CKILL	'@'
#define	CQUIT	034		/* FS, cntl shift L */
#define	CINTR	0177		/* DEL */
#define	CSTOP	023		/* Stop output: ctl-s */
#define	CSTART	021		/* Start output: ctl-q */
#define	CBRK	0377

I assumed just changing these defaults would fix the defaults system-wide, which I found preferable to a solution in .profile anyway. Changed the header, one cycle of make allmake unixcp unix /unix and a reboot later, the system exhibited the same behavior. No change to the default erase, kill or interrupt characters. I double-checked /usr/sys/dev/tty.c, and it indeed copied the characters from the header. Something, somewhere must be overwriting my new defaults.

Studying the man pages in vol. 1 of the manual, I found that on multi-user boot init calls /etc/rc, then calls getty(8), which then calls login(1), which ultimately spawns the shell. /etc/rc didn’t do anything interesting related to the console or ttys, so the culprit must be either getty(8) or login(1). As it turns out, both of them are the culprits!

getty(8) changes the erase character to #, the kill character to @ and the struct tchars to { '\177', '\034', '\021', '\023', '\004', '\377' }. At this point, I realized that:

  1. there’s a struct tchars,
  2. it can be changed from userland.

The first member of struct tchars is char t_intrc, the interrupt character. So I could’ve had a much easier solution by writing some code to change the struct tchars, if only I’d actually read the manual. I’m too far in to just settle with a .profile solution and a custom executable, though. Besides, I still couldn’t actually fix my typos at the login prompt unless I make a broader change. I’d have noticed the first point if only I’d actually read the man page for tty(4). Oops.

login(1) changes the erase character to # and the kill character to @. At least the shell leaves them alone. Seriously, three places to set these defaults is crazy.

Fixing the Characters

The plan was simple, namely, perform the following substitution:

Function Old Character Old Character
ASCII (Octal)
New Character New Character
ASCII (Octal)
erase # 043 DEL 0177
kill @ 0100 ^U 025
interrupt DEL 0177 ^C 003

So, I changed the characters in tty(4), getty(8) and login(1). It worked! Almost. Now DEL did indeed erase. However, there was no feedback for it. When I typed DEL, the cursor would stay where it is.

Pondering the code for tty(4) again, I found that there is a variable called partab, which determines delays and what kind of special handling to apply if any. In particular, BS has an entry whose handler looks like this:

	/* backspace */
	case 2:
		if (*colp)
			(*colp)--;
		break;

Naïve as I was, I just changed the entry for DEL from “non-printing” to “backspace”, hoping that would help. Another recompilation cycle and reboot later, nothing changed. DEL still only silently erased. So I changed the handler for another character, recompiled, rebooted. Nothing changed. Again. At that point, I noticed something else must have been up.

I found out that the tty is in so-called echo mode. That means that all characters typed get echoed back to the tty. It just so happens that the representation of DEL is actually none at all. Thus it only looked like nothing changed, while the character was actually properly echoed back. However, when temporarily changing the erase character to BS (^H) and typing ^H manually, I would get the erase effect and the cursor moved back by one character on screen. When changing the erase character to something else like # and typing ^H manually, I would get no erasure, but the cursor moved back by one character on screen anyway. I now properly got the separation of character effect and representation on screen. Because of this unprintable-ness of DEL, I needed to add a special case for it in ttyoutput():

	if (c==0177) {
		ttyoutput(010, tp);
		ttyoutput(' ', tp);
		ttyoutput(010, tp);
		return;
	}

What this does is first send a BS to move the cursor back by one, then send a space to rub out the previous character on screen and then send another BS to get to the previous cursor position. Fortunately, my terminal lives in a world where doing this is instantaneous.

Getting the Diff

For future generations as well as myself when I inevitably majorly break this installation of V7, I wanted to make a diff. However, my V7 is installed in SIMH. I am not a very intelligent man, I didn’t keep backup copies of the files I’d changed. Getting data out of this emulated machine is an exercise in frustration.

Transmission over ethernet is out by virtue of there being no ethernet in V7. I could simulate a tape drive and write a tar file to it, but neither did I find any tools to convert from simulated tape drive to raw data, nor did I feel like writing my own. I could simulate a line printer, but neither did V7 ship with the LP11 driver (apparently by mistake), nor did I feel like copy/pasting a long lpr program in – a simple cat(1) to /dev/lp would just generate fairly garbled output. I could simulate another hard drive, but even if I format it, nothing could read the ancient file system anyway, except maybe mount_v7fs(8) on NetBSD. Though setting up NetBSD for the sole purpose of mounting another virtual machine’s hard drive sounds silly enough that I might do it in the future.

While V7 does ship with uucp(1), it requires a device to communicate through. It seems that communication over a tty is possible V7-side, but in my case, quite difficult. I use the version of SIMH as packaged on Debian because I’m a lazy person. For some reason, the DZ11 terminal emulator was removed from that package. The DUP11 bit synchronous interface, which I hope is the same as the DU-11 mentioned /usr/sys/du.c, was not part of SIMH at the time of packaging. V7 only speaks the g protocol (see Ptbl in /usr/src/cmd/uucp/cntrl.c), which requires the connection to be 8-bit clean. Even if the simulator for a DZ11 were packaged, it would most likely be unsuitable because telnet isn’t 8-bit clean by default and I’m not sure if the DZ11 driver can negotiate 8-bit clean Telnet. That aside, I’m not sure if Taylor UUCP for Linux would be able to handle “impure” TCP communications over the simulated interface, rather than a direct connection to another instance of Taylor UUCP. Then there is the issue of general compatibility between the two systems. As reader DOS pointed out, there seem to be quite some difficulties. Those difficulties were experienced on V7/x86, however. I’m not ruling out that the issues encountered are specific to V7/x86. In any case, UUCP is an adventure for another time. I expect it’ll be such a mess that it’ll deserve its own post.

In the end, I printed everything on screen using cat(1) and copied that out. Then I performed a manual diff against the original source code tree because tabs got converted to spaces in the process. Then I applied the changes to clean copies that did have the tabs. And finally, I actually invoked diff(1).

Closing Thoughts

Figuring all this out took me a few days. Penetrating how the system is put together was surprisingly fairly hard at first, but then the difficulty curve eased up. It was an interesting exercise in some kind of “reverse engineering” and I definitely learned something about tty handling. I was, however, not pleased with using ed(1), even if I do know the basics. vi(1) is a blessing that I did not appreciate enough until recently. Had I also been unable to access recursive grep(1) on my host and scroll through the code, I would’ve probably given up. Writing UNIX under those kinds of editing conditions is an amazing feat. I have nothing but the greatest respect for software developers of those days.

Here’s the diff, but V7 predates patch(1), so you’ll be stuck manually applying it: backspace.diff

† UNIX is a trademark of The Open Group.

Life in UNIX® V7: an attempt at a simple task

(This is a guest post by xorhash.)

1. Introduction

ChuckMcM on Hacker News (https://news.ycombinator.com/item?id=15990351) reacted to my previous entry here about trying to typeset old troff sources with groff. It was said that ‘‘you really can’t appreciate troff (and runoff and scribe) unless you do all of your document preparation on a fixed width font 24 line by 80 column terminal’’.

‘‘Challenge accepted’’ I said to myself. However, it would be quite boring to just do my document preparation in this kind of situation. Thus, I raised the ante: I will do my document preparation on a fixed width font 24 line by 80 column terminal on an ancient UNIX . That document is the one you are reading here.

2. Getting an old version of UNIX

While it would have been interesting to run my experiment on SIMH with a genuine UNIX , I was feeling far too lazy for that. Another constraint I made for myself is that I wanted to use the Internet as little as possible. Past the installation phase, only resources that are on the filesystem or part of the Seventh Edition Manual should be consulted. However, if I have to work with SIMH, chances are I’d be possibly fighting the emulator and the old emulated hardware much more than the software.

2.1. FreeBSD 1.0

My first thought was that I could just go for FreeBSD 1.0 or something. FreeBSD 1.0 dates from around 1993. That was surprisingly recent, but I needed a way to get the data off this thing again, so I did want networking. As luck would have it, FreeBSD 1.0 refused to install, giving me a hard read error when trying to read the floppy. FreeBSD 2.0 was from 1995 and already had colorful menus to install itself (!). That’s no use for an exercise in masochism.

2.2. V7/x86

I turned to browsing http://gunkies.org/wiki/Main_Page for a bit, hoping to find something to work with. Lo and behold, it pointed me to http://www.nordier.com/v7x86/! V7/x86 is a port of UNIX version 7 to the x86. It made some changes to V7, among those are:

1.

including the more pager,

2.

including the vi editor,

3.

providing a console for the screen, rather than expecting a teletype, and

4.

including an installation script.

The version of vi that ships with it is surprisingly usable, even by today’s standards. I believe I would’ve gone mad if I’d had to use ed to write this text with.

3. Installing V7/x86

The V7/x86 installer requires that a partition exists with the correct partition type. It ships with a tool called ptdisk to do that, but because /boot/mbr does not exist on the installation environment, it cannot initialize a disk that does not already have a partition table (http://www.nordier.com/v7x86/files/ISSUES). Thus I used a (recent) release of FreeBSD to create it. At first, FreeBSD couldn’t find its own CD-ROM, which left me quite confused. As it turns out, it being unable to find the CD-ROM was a side effect of assigning only 64 megabytes of RAM to the virtual machine. Once I’d bumped the RAM to 1GB, the FreeBSD booting procedure worked and I could create the partition for V7/x86. V7/x86 itself comes packaged on a standard ISO file and with a simple installation script. It seems it requires an IDE drive, but I did not investigate support for other types of hard drives, in particular SATA drives, much further. There seem to be no USB drivers, so USB keyboards may not work, either.

During the installation, my hard drive started making a lot of scary noises for a few minutes, so I aborted the installation procedure. After moving the disk image to a RAM disk (thank you, Linux, for giving me the power of tmpfs), I restarted the installation and it went in a flash. The scary noises were probably related to copying data with a block size of 20, which I assume was 20 bytes per block: The virtual hard disk was opened with O_DIRECT, i.e., all writes got flushed to it immediately. Rewriting the hard drive sector 20 bytes at a time must’ve been rather stressful for the drive.

4. Using V7/x86

I thought I knew my UNIX , but the 70s apparently had a few things to teach me. Fortunately, getting the system into a usable state was fairly simple because http://www.nordier.com/v7x86/doc/v7x86intro.pdf got me started. The most important notes are:

4. Using V7/x86

I thought I knew my UNIX , but the 70s apparently had a few things to teach me. Fortunately, getting the system into a usable state was fairly simple because http://www.nordier.com/v7x86/doc/v7x86intro.pdf got me started. The most important notes are:

1.

V7 boots in single-user mode by default. Only when you exit single-user mode, /etc/rc is actually run and the system comes up in multi-user mode.

2.

Using su is recommended because root has an insane environment by default. To erase, # is used, rather than backspace (^H). The TERM variable is not set, breaking vi. /usr/ucb is not on the path, making more unavailable.

3.

The character to interrupt a running command is DEL, not ^C. It does not seem possible to remap this.

more is a necessity on a console. I do not have a teletype, meaning I cannot just ‘‘scroll’’ by reading the text on the sheet so far. Therefore, man is fairly useless without also piping its output to more.

Creating a user account was simple enough, though: Edit /etc/passwd, run passwd for the new user, make the home directory, done. However, my first attempt failed hard because I was not aware of the stty erase situation. I now have a directory in /usr that reads ‘‘xorhash’’, but is definitely not the ASCII string ‘‘xorhash’’. It’s ‘‘o^Hxorhash’’. The same problem applies to hitting the arrow keys out of habit to access the command history, only to butcher the partial command you were writing that way.

Another mild inconvenience is the lack of alternative keyboard layouts. There is only the standard US English keyboard layout. I’m not used to it and it took me a while to figure out where some relevant keys ($, ^, &, / and – in particular) are. Though I suppose if I really wanted to, I could mess around with the kernel and the console driver, which is probably the intended way to change the keyboard layout in the first place.

5. Writing a document

Equipped with a new user, I turned to writing this text down before my memory fails me on the installation details.

5.1. The vi Editor

I am infinitely thankful for having vi in the V7/x86 distribution. Truly, I cannot express enough gratitude after just seeing a glimpse of ed in the V7/x86 introduction document. It has some quirks compared to my daily vim setup, though. Backspacing across lines is not possible. c only shows you until where you’re deleting by marking the end with $. You only get one undo, and undoing the undo is its own undo entry. And of course, there’s no syntax highlighting in that day and age.

5.2. troff/nroff

And now for the guests of this show for which the whole exercise was undertaken. The information in volume 2A of the 7th Edition manuals was surprisingly useful to get me started with the ms macros. I didn’t bother reading the troff/nroff User’s Manual as I only wanted to use the program, not write a macro package myself. The ms macro set seemed to be the way to go for that. In this case, nroff did much more heavy lifting than troff. After all, troff is designed the Graphic Systems C/A/T phototypesetter. I don’t have one of those. M. E. Lesk’s Typing Documents on the UNIX System: Using the −ms Macros with Troff and Nroff and Brian W. Kernighan’s A TROFF Tutorial proved invaluable trying to get this text formatted in nroff.

The ‘‘testing’’ cycle is fairly painful, too. When reading the nroff output, some formatting information (italics, bold) is lost. more can only advance pagewise, which makes it difficult to observe paragraphs in their entirety. It also cannot jump or scroll very fast so that finding issues in the later pages becomes infuriating, which I solved by splitting the file up into multiple files, one for each section heading.

5.3. The refer program and V7/x86

Since I was writing this in roff anyway, I figured I might as well take advantage of its capabilities – I wanted to use refer. It is meant to keep a list of references (think BibTeX). Trying to run it, I got this:

$ /bin/refer
/bin/refer: syntax error at line 1: ‘)’
unexpected

The system was trying to run the file as shell script. This also happens for tbl. It was actually an executable for which support got removed during the port (see https://pastebin.com/cxRhR7u9). I contacted Robert Nordier about this; he suggested I remove the -i and -n flags and recompile refer. Now it runs, exhibiting strange behavior instead: https://pastebin.com/0dQtnxSV For all intents and purposes, refer is quite unusable like this. Fixing this is beyond my capacity, unfortunately, and (understandably) Robert Nordier does not feel up to diving into it, either. Thus, we’ll have to live without the luxury of a list of references.

6. Getting the Data off the Disk

I’m writing this text on V7/x86 in a virtual machine. There are multiple ways I could try to get it off the disk image, such as via a floppy image or something. However, that sounds like effort. I’ll try to search for it in the raw disk image instead and just copy it out from there. Update: I’ve had to go through the shared floppy route. The data in this file is split up on the underlying file system. Fortunately, /dev/ entries are just really fancy files. Therefore, I could just write with tar to the floppy directly without having to first create an actual file system. The host could then use that “floppy” as a tar file directly.

Even when I have these roff sources, I still need to get them in a readable format. I’ll have to cheat and use groff -Thtml to generate an HTML version to put on the blog. However, to preserve some semblance of authenticity, I’ll also put the raw roff source up, along with the result of running nroff over it on the version running on V7/x86. That version of nroff attributes the trademark to Bell Laboratories. This is wrong. UNIX is a registered trademark of The Open Group.

7. Impressions

ChuckMcM was right. When you’re grateful for vi, staring at a blob of text with no syntax highlighting and with limited space, you start appreciating troff/nroff much more. In particular, LaTeX tends to have fairly verbose commands. Scanning through those without syntax highlighting becomes more difficult. However, ‘‘parsing’’ troff/nroff syntax is much easier on one’s mind. Additionally, the terse commands help because

\section{Impressions}

stands out much less than

.NH
Impressions
.PP

That can be helped by adding whitespace, but then you remove some precious context on your tiny 80×24 screen. troff/nroff are very much children of their time, but they’re not as bad as I may have made them look last time. Having said that, there’s no way you’ll ever convince me to actually touch troff/nroff macros.

As for the system as a whole, I was positively surprised how usable it was by today’s standards. The biggest challenge is getting the system up and shutting it down again, as well as moving data to and from it. I did miss having a search function whenever I was looking for information on roff in volume 2A of the manual.

I have the greatest of respect for the V7/x86 project. Porting an ancient operating system that hardcoded various aspects of the PDP-11 in scattered places must have been extremely frustrating. The drivers were written in the ancient version of C that is used on V7 (see /usr/sys/dev).

learn: Ancient troff sources vs. modern-day groff

(This is a guest post by xorhash.)

Introduction

I’ve been on a trip on the memory lane lately, digging around old manuals of UNIX® operating system before BSD.† In doing so, I’ve come across the sources for the 7th Edition manuals. I wanted to show one part of volume 2A to other people, but didn’t want to make them download the entire 336 pages of volume 2A for the part in question. The part I wanted to extract was “LEARN — Computer-Aided Instruction on UNIX”, starting at p. 107 in the volume 2A PDF file).

A normal person would, I presume, try to split the PDF file. That is straightforward and produces the expected results. I believe I needn’t state that you wouldn’t be reading this if I solved this problem like any sane person would. Instead, I opted to rebuild the PDF from the troff sources provided at the link above.

I am not a very clever man, and thus I completely disregarded the generation procedure that was already spelled out. However, it wasn’t exactly specific anyway, so I didn’t miss out on much.

Getting the sources

So I knew what I needed to do: Get the troff sources. I asked that the Heavens have mercy on my poor soul if this requires a lot of adjustment for 2017 text processing tools. However, a man must do what a man must do. The file in question was called “vol2/learn.bun”. I had no idea what a bun file is, hoped it wasn’t related to steamed buns and clicked it. As it turns out, it’s just what we would call a self-extracting archive today. The shell commands are not very weird, so the extraction process actually worked out just fine. Now I had files “p0” through “p7”. Except what happened to “p1”, the world will never know.

First Steps

I’ve dabbled in man pages before, but that was mostly mandoc, not actual troff.
Accordingly, the first attempt at getting something going was as naive as it could get:
$ groff -Tpdf p* | zathura -
It led to, shall we say, varying results.

really butchered rendering attempt

Clearly, I was doing something very fundamentally wrong. Conveniently, volume 2A also had a lot of troff documentation. Apparently I was supposed to pass -ms and first run tbl(1) over the troff source before actually giving it to groff. That sounded like a good idea, but the results were still somewhat off:

not very butchered rendering attempt

Allow me to express my doubts that this text was written in 2017. If you compare the output with the known-good PDF, you’ll also notice that, somehow, “Bell Laboratories, Murray Hill, New Jersey 07974” turned into “CAI”. Unfortunate.

Back to Square One and Pick Up the Breadcrumbs

Continuing to read the page I got the learn.bun from, I also spied a section called “Macros and References”. That sounds relevant to my interests. tmac.s, which after studying groff(1) seems to be what would get used with -ms references some files in /usr/lib/tmac. I was not in the mood to let this flood over into my system, so I had to make minor adjustments and turn it into relative paths. I also renamed tmac.s to tmac.os to avoid colliding with the one provided by groff, making the new invocation:

$ tbl p* | groff -M./macros -mos -Tpdf | zathura -

Now we’re getting somewhere:

almost not butchered rendering attempt

It’s better than the previous attempts. But there are also some warnings and problems that need cleaning up:

  1. There’s a note that Bell Laboratories holds the UNIX®
    trademark, which is no longer true.†
  2. Now, this most certainly was not written in December 21,
    19117, either.
  3. tmac.os:806: warning: numeric expression expected (got `\')
  4. Every time the .UX macro was requested, I got:
    warning: macro `ev1' not defined (possibly missing space after `ev')
    environment stack underflow

Point 1 was easy to address, it’s a simple text change. Point 2 was caused by spurious dots in front of a call to .ND. However, the actual volume 2A PDF said a different date than in the file, so I adjusted that to match (June 18, 1976 to January 30, 1979).

And Down the Slippery Slope

As for points 3 and 4… Let’s just say groff/troff macros are definitely not meant to be written or read by humans and it’s a feat comparable to magic that someone wrote this set of troff macros. Line 806 is .ch FO \\n(YYu. Supposedly, that changes the location of a page trap when the given macro is invoked. The second argument is meant to be a distance, which explains why groff is complaining. I tried to checked what groff does and left none the wiser. FO seems related to the page footer, I seemed to get away with just deleting that line, though.

Finally, point 4. Apparently, .ev1 was used multiple times in the tmac.os. This looked like it should’ve been .ev 1 instead. Changing those, lo and behold, .UX stopped behaving funky for the most part. Yet for some reason, I’d still get multiple footnotes about the trademark ownership of the UNIX® trademark.† tmac.os sets a troff register (GA) when the .UX macro is first encountered so that the footnote is only made once. The footnote is being made twice. Something does not add up here..AI (author’s institution) resets GA, but the first .UX comes after .AI, so that’s not the problem. Removing the .AB/.AE macros from page 1 caused only one footnote to be made. Thus, I infer it’s actually intended behavior that the footnote is made once for the abstract and once for the main body. Checking with the volume 2A PDF again, I realized that point 4 was, in fact, fixed just by the ev1 changes and I was just chasing a bug that does not exist. I really should’ve checked the PDF twice.

The abstract finally looks okay.

good rendering attempt

Done! Wait, No, Almost

Okay, we’re done, we can go home, right? Almost, one last thing to do: On the last page, there’s something really important missing: the bibliography. Instead, there’s just “$LIST$” there. We can’t just turn Brian W. Kernighan and Michael E. Lesk into plagiarists!

Back to the troff documentation in volume 2A, there’s a match for “$LIST$” on p. 183. Apparently I need a reference file and preprocess the file with refer(1). That sounds simple enough. Fortunately, I got the reference file along with the macros above, so I didn’t have to look for that separately.

$ refer -pRv7man -e p* | tbl | groff -M./macros -mos -Tpdf | zathura -

half of the references are blank

Of course. Why would it work? That’d have been too much to ask for.
At least I get some nice hints:

refer:p2:148: no matches for `skinner teaching 1961'
refer:p3:114: no matches for `kernighan editor tutorial 1974'

The troff documentation conveniently explains the format for the reference file, so I could just add these two entries to Rv7man and be done with it. Thankfully, the pre-compiled PDF of the volume 2A manual had the information necessary to compile the bibliography entries with.

%T Why We Need Teaching Machines
%A B. F. Skinner
%J Harvard Educational Review
%V 31
%P 377-398
%D 1961

%T A Tutorial Introduction to the Unix Editor ed
%A B. W. Kernighan
%D 1974

now that’s what I call a bibliography

And of course, here is the product of this whole ordeal.

Closing Remarks

The Heavens were feeling somewhat merciful, but only just enough that I could waste no more than a day on this project. They really wanted me to spend that day on it, though.

On a side note, “the missing learn references” aren’t available from the link that was
provided. http://cm.bell-labs.com/cm/cs/who/bwk/learn.tar.gz is now down, though the web archive still has it. Needless to say, I didn’t read that.

I will never, ever touch troff/groff again. mandoc is good at what it does and I’ll stick to mandoc for writing man pages. But if I ever need to get something typeset nicely from plain text?

LaTeX is the answer.
Not troff.
Never troff.
Not even once.

†UNIX® is a registered trademark of The Open Group.

Coherent 3.0

I don’t know why I was so dumb as a kid, but I remember thumbing through various magazines, and always seeing this ad:

Coherent Ad

And isn’t that sounding great?  Lex, Yacc, UUCP, and UNIX functionality on a AT compatible machine for $99!  And then you see reviews like this one from PC Mag:

Now even if you want to you can’t wind the clock back to the late 1970s, but Unix lovers can do the next best thing-pick up a copy of Mark William’s Coherent for $99.95.

Included in this time capsule are all of the utilities that you would have received in an AT&T Unix, Version 7, distribution circa 1978. The package includes a protected-mode multi-user multitasking operating system. over 150 utility programs, a C compiler, an assembler, software development tools, text formatting tools, system management tools, telecommunications utilities, and complete documentation in a very hefty, 1,000-page, perfect bound book. Most of the Unix classics-grep, ed, sed, awk, lex, sh, emacs-are there as well. The only favorites that are missing are vi (which is a text editor) and Dave Korn’s new shell.

Whether Coherent’s views on the Unix system match your own is a matter of taste. In the halcyon late 1970s, the Unix system was a relatively simple affair-lean and clean, and understandable to mere mortals. Since then, in an effort to make Unix the universal solution, countless features and versions have been grafted onto it by innumerable programmers, managers, committees, and boards of directors.

The result stands in stark contrast to the stated goals of Unix’s inventors. Coherent remains true to Unix’s roots and eschews local area networking, graphical user interfaces, menus, mice, and many of the other amenities that present-day DOS users and Unix users have come to expect of modem software.

Coherent’s installation is painless, but only after the agony of freeing up a 7MB or larger partition on an ordinary MFM or RLL disk, on a classic AT architecture machine. Since they are products of the modem era, ESDI and SCSI disks, as well as IBM’s Micro Channel architecture are not supported. Graphic display adapters are tolerated (used in text mode); mice are not supported. Coherent worked flawlessly, though, on my geriatric AT clone.

Coherent has a dual boot facility, so that you can choose to boot either DOS or Coherent during your startup procedures, but unfortunately you can’t run
DOS software from within the Coherent environment.

Mark Williams’president Robert Schwartz explained that the intended audience for Coherent are people who want to learn about or try the Unix system, without the hefty price tag and steep learning curve of the latest Unix versions. Part of Coherent’s advantage in both simplicity and price stems from its origins as a privately developed “clone” product-therefore no AT&T requirements need be met and no per-copy royalty is paid to AT&T. This gives Mark Williams the freedom to set prices as well as compatibility targets.

But learning Unix from Coherent would be a bumpy road. You could certainly master traditional system administration, learn the utilities. And experiment with Unix software development. But you couldn’t learn about networking or the increasingly important X Windows system. Nor could you realistically use Coherent to automate a small business.

Schwartz promises that future versions of Coherent will support 32-bit operation on the 386, and will likely support tighter integration with DOS, some form of window manager. And local area networking. When that occurs, Coherent will be much more like modem Unix systems and, like modem Unix systems, it will have strayed far from its roots.

List Price: Coherent Version 3.0, $99.95.
Requires: A free 7MB or larger hard disk partition, 640K RAM, highly disk drive, MFM or RLL controller.

Mark Williams Co., 601 N.
Skokie Hwy., Lake Bluff. IL
(Kaare Christian)

And then it seemed to my teenage eyes something pretty underwhelming.  So I dove into OS/2, and ignored the idea of having a UNIX like system.  I was still happy to finally move onto a 16-bit machine, and the thought of running stuff from the 1970’s wasn’t that appealing. Such missed opportunities.  But in the last few years, Coherent has been placed under a 3-clause BSD license.

Over at unix4fun, they did unearth some version 3.0 disks!  And yes, it’ll install on PCem/86Box using a 286/386/486 machine.  One issue I had was I first tried to install onto a massive 40Mb disk, and it never would reboot after the install correctly.  However it works great with a 32Mb or smaller disk.  As you can see from Kaare’s review it’ll fit into 7MB of disk space!  At least having to either re-partition or worry about dual booting is a thing of the past.  The disk images are 5.25 disk images, so re-configure your VM appropriately.

Coherent on PCem

As the advertisement says, the installation is a mere four diskettes!  And yes, it really does have a C compiler.  You will need a serial number for Coherent 3.0, which took a while to find, but Peter had one, and has been poking me for the last week+ to finally write this up.  Oh the number is 130500000.  305/Miami connection? Unlikely, but who knows.  Don’t forget to download the hefty manual, Coherent_Revision_8_1992, which is for a later version, but still suitable.

And yes, it feels just like Unix v7.  The kernel is tiny, 77kb!  It’s really cool for 16-bit era stuff, and really interesting to knock around.  I know there is a few more people out there that want fun things for their 286, and Coherent will certainly scratch that itch.

Additionally on the site are the 3.1 and 3.2 updates to give you thinks like Elvis so it doesn’t feel anywhere near as primitive.  Installing updates and 3rd party packages is covered on page 736 of the manual, or in short you need to know the magical ‘disk set name’ for everything you want to install.  I suppose back then it had stuff like this printed on them.

coh300-ddk.img Drv_110
coh300-rdb.img rdb
coh31update-1.img CohUpd310
coh320update-1.img
coh320update-2.img CohUpd320
coh320-ddk.img

While a ‘dump’ of the source code has been out there, I haven’t really gone through it, so I thought now would be as good as any to take a look at the kernel.  The layout is very similar to v6, so I based this on the file ‘sys1.c’ which appears quite a few times in the trees.  Using a MD5 checksum against the files there appears to be no less than 17 duplicated tress or 7 unique kernels, spread over three years.

cc3b2bef09be7d60d52a01ca908972f7 Jul,24,1991 gtz/relic/d/kernel/USRSRC/coh/sys1.c
cc3b2bef09be7d60d52a01ca908972f7 Jul,24,1991 romana/relic/d/kernel/USRSRC/coh/sys1.c

41797301f9d5771dd10161954df82a5d Jan,13,1992 gtz/relic/d/286_KERNEL/USRSRC/coh/sys1.c
41797301f9d5771dd10161954df82a5d Jan,13,1992 romana/relic/d/286_KERNEL/USRSRC/coh/sys1.c
6b888a45afb3476c5e482fa54fb4dd86 Jul,17,1992 gtz/relic/b/kernel/coh.286/sys1.c
6b888a45afb3476c5e482fa54fb4dd86 Jul,17,1992 romana/relic/b/kernel/coh.286/sys1.c
6b888a45afb3476c5e482fa54fb4dd86 Aug,11,1992 gtz/relic/d/PS2_KERNEL/coh.286/sys1.c
6b888a45afb3476c5e482fa54fb4dd86 Aug,11,1992 romana/relic/d/PS2_KERNEL/coh.286/sys1.c
b90f2659fdecfbfa576d39bc8e54ffa0 Aug,11,1992 romana/relic/d/PS2_KERNEL/coh.386/sys1.c
b90f2659fdecfbfa576d39bc8e54ffa0 Aug,11,1992 gtz/relic/d/PS2_KERNEL/coh.386/sys1.c

6503663ebb9a852007a46d66cd43ac1a Jun,14,1993 gtz/relic/b/kernel/coh.386/sys1.c
6503663ebb9a852007a46d66cd43ac1a Jun,14,1993 romana/relic/b/kernel/coh.386/sys1.c
35bc7569ab99ab340d2ca8bf66a47c46 Aug,9,1993 romana/relic/b/STREAMS/coh.386/sys1.c
35bc7569ab99ab340d2ca8bf66a47c46 Aug,9,1993 gtz/relic/b/STREAMS/coh.386/sys1.c
436f245293c88ceaecd99840c37dcbb4 Nov,15,1993 gtz/hal/r10/coh.386/sys1.c
436f245293c88ceaecd99840c37dcbb4 Nov,15,1993 gtz/src/sys.r12/coh.386/sys1.c
436f245293c88ceaecd99840c37dcbb4 Nov,15,1993 gtz/src/sys.r12/coh.386/r12/sys1.c

Phew!  Naturally the tree structure drifted, but I went ahead and just did a blind import into my CVS server to take a look. And there really does appear in the 1991 versions to be the remnants of either 2.3.37, 3.2.1.  It’s hard to say.

AT&T 3B2 400 emulated

This is super awesome!

AT&T 3B2 SYSTEM CONFIGURATION:

Memory size: 4 Megabytes
System Peripherals:

        Device Name        Subdevices           Extended Subdevices

        SBD
                        Floppy Disk
                        72 Megabyte Disk

        Welcome!
This machine has to be set up by you.  When you see the "login" message type
                                setup
followed by the RETURN key.  This will start a procedure that leads you through
those things that should be done the "first time" the machine is used.

The system is ready.

Console Login:

Back in the 1980’s AT&T shifted UNIX from being an internal research project that got somewhat popular in college spaces (and larger companies, General Motors was an early UNIX adapter, along with companies like Industrial Light and Magic).  Quickly after the divestiture of 1984, AT&T entered the commercial space with it’s own custom machines & their home made UNIX operating system.  Below is one of the ads they ran in 1984, touting their so called ‘super microcomputers’, featuring the 3B2, the 3B5, and the AT&T Personal Computer.

Thew new computers from AT&T

And indeed for many a government institute bewildered by the dozens of UNIX vendors, standards, and chaos of different platforms and processors many were all to happy to buy AT&T UNIX on AT&T machines.

And indeed this was my first experience with genuine SYSV Unix.

And I hated it.

Initially I had been thrown at an English computer lab because I knew how logon and do my work in style & diction, they decided I could help.  The system was aging and had major problems, as some prior students had figured out enough of the link kit that they would put their own attempts at re-writing portions of the kernel into the system, and it’d break.  Naturally the original installation diskettes were lost, and the best that could be done was basically shut it down throughout the day and run the disk repair utilities.  It was not a fun job.

Later on the 3B2’s were thrown into the ‘common garbage’ aka free kit for other departments, and the 3B2’s re-appeared at the next place I was volunteering at on campus.  However in addition to the two machines, there was a few other boxes of manuals, and oddly enough the installation diskettes.  And also about a dozen of these AT&T ISA Starlan adapters.  These weren’t the ones that were basically Ethernet (Starlan10) but rather the original ones.

Through some incredible luck we did find an NDIS 3 MS-DOS driver for the Starlan car, and we were able to cobble together a Starlan1 LAN consisting of a 3B2 that we cannibalized the RAM and disks from one of them to make a ‘super’ 3B2, with added TCP/IP software and a massively cut down port I did of samba to turn it into a tiny file & print server (72MB MFM disks were it’s biggest if I recall), along with Windows 95 clients.  And of course with a TCP/IP lan we could easily load a proxy server (WinGate?) on one machine with the 56kb modem, and now we all had internet access.  I know it’s sad today, but trust me back then it was “a big deal” that we had a fully functional LAN.

Over on loomcom.com there is an incredible amount of information about the reverse engineered WE32100, along with the 3B2 hardware, and of course information about the newest SIMH simulator the 3B2/400!

Instructions and disk images on the site made it incredibly easy to grab the latest SIMH Windows Development binaries, and get my own virtual 3B2 up and running in minutes! So naturally I pasted in dhrystone.c to see if it’d work.  And that was the first weird issue is that the backspace is the pound # key.  So all the C macro definitions lost their # sign.  I added them in vi without full terminal support because I’m crazy and:

# uname -a
unix unix 3.2 2 3B2
# ./dhrystone
Dhrystone(1.0) time for 500000 passes = 40
This machine benchmarks at 12500 dhrystones/second

Obviously this is 100% bogus, as the real machine should get around 735, and I didn’t even bother with the -O flag.

The current emulator doesn’t do any additional serial ports, nor any Ethernet adapters.  So you only get a console.  But with the pre-installed C compiler image, I was able to build a trivial file just fine.  Although pasting on the console really leaves a lot to be desired.

SDF AT&T 3B2/500 UNIX System

I know for some of us old people the 3B2 hid in the corners of our call centres, running our AT&T Definity switches, our voicemail, and even some of our early ISPs.  After funneling money into SUN to get them to work on SYSVr4 which was the grand unification of BSD + SYSV AT&T’s interest if UNIX quickly waned, and they divested themselves of UNIX, and eventually all PC hardware, although they did re-enter the PC space a few times before exiting yet again.

As time would tell, proprietary hardware + a previously ‘open’ operating system were not the winning combination.  And so far the only UNIX vendor to weather the Linux storm so far is IBM with it’s A/IX.

Research UNIX v9

v9 on TME

This just in, I have just booted Research UNIX v9 on TME’s SUN-3 emulator!

And there we are booted up and logged in.. pardon the disk error..

funinthe

I’m slightly hesitant about uploading it, as it clearly isn’t right… And this is only the binary component, I have integrated the source tree onto the disk image.  But I haven’t actually tried to compile anything except a simple hello-world program.  You can download it here from sourceforge: SUN3-research_v9.7z  If anyone want’s to browse the source, it’s on my CVS browser thing.

Research UNIX v8

    v8 on SIMH

So what the heck is Research UNIX v8?  Or even what is Research UNIX?  Well a query against utzoo gave me this answer:

>I've seen people that use System V and the like refer to their Unix as
>"tenth edition" or "ninth edition", or whatever. I've always seen things as
>"System V release n", or whatever. Anyone know the difference between these
>different naming schemes ?

There are actually three designations: Versions, Editions, and
System/Releases. The proper names of the first six Unixen were
"The #th Edition". Colloquially, people called them "Version #".
The Version Sixth Edition split off several variations, one of which
became Version Seven (the Seventh Edition) and sired BSD. From
several others, System III was born, and later named System V.
Tacked onto this name were Release numbers and yes, Versions.
So you will see things line SVr3v2.

The Eighth, Ninth, and Tenth Editions seldom left Bell Labs
and are also referred to as "Research UNIX". Another system
(not UNIX) they are playing with is called "Plan 9". Every so
often, a feature, such as STREAMS, finds its way into System V.

In some ways, Research UNIX is closer to BSD than to System V.

In short, UNIX began it’s life as a research project.  Until recently versions 1-6 & 32v were available to the public.  However the later versions, 8,9,10 were not.  However thanks to the work over at TUHS it’s available for non commercial use:

Alcatel-Lucent USA Inc has permitted usage saying "will not assert its
copyright rights with respect to any non-commercial copying, distribution,
performance, display or creation of derivative works of 
Research Unix®1 Editions 8,9, and 10."

So awesome!

The version of Research v8 is split onto 2 tape images, one for the graphical terminals, and the other for the OS install onto the VAX.  The distribution is not suitable for any standalone operation, and requires a previously installed 4.1BSD machine, with a second disk to install v8 onto.  Part of the installation requires you to compile your own kernel.  I ran into a bit of problems as it’s not a 100% process, but after referencing this guide from David du Colombier, I had the system up and running.  Naturally reading the installation manual helped a great deal too.

As always there is strange artifacts left in the backup, such as this scoreboard from rogue:

Top Ten Rogueists:
Rank Score Name
1 5545 Rog-O-Matic XIII for mjs: quit on level 17.
2 5043 ken: killed on level 23 by a dragon.
3 3858 zip: killed on level 16 by an invisible stalker.
4 3249 Rog-O-Matic VII: killed on level 16 by an invisible stalker.
5 2226 Rog-O-Matic VII: killed on level 13 by a troll.
6 2172 St. Jude: killed on level 13 by a troll.
7 1660 Rog-O-Matic VII: quit on level 11.
8 1632 Chipmunk the Jello: killed on level 10 by a centaur.
9 844 Rog-O-Matic VII: quit on level 5.
10 401 Rog-O-Matic VII: killed on level 4 by a snake.

Does this mean Ken Thompson was an avid rogue fan?  Perhaps.  Naturally I quickly compiled the v100 version of aclock, and had it running.

aclock on v8

I’ll have to edit this and more and more as I find out, but I’ve been busy in real life, and of course I know that in addition to v8, there is also v9 & v10 to tackle.

As always, if you want you can download my pre-installed from my site : researchv8.7z

You will have to bring your own copy of the SIMH VAX-11/780 simulator.  As of 31/3/2017 ther is issues with the github version of SIMH, and you will have issues with the disks on the VAX.  You need to disable the async with a simple set command in your ini file:

set noasync

And you should now be good to go!  As always you’ll have to battle the 404 page for the correct link and the username & password.

sorry.

Style & Diction

While looking at some old picture of a 3B2, I remembered in college we used to use this ‘fine’ system for it’s Writer’s Workbench which revolved around the programs style & diction.

I thought it’d be interesting to see if I could track down the source, however the sources seem to have been part of the AT&T DWB package, and were not included in any of the seemingly numerous available Unix sources available on TUHS.  But thanks to this post on the TUHS mailing list, I saw this:

I know about style and diction which was shipped with BSD4.1
which (again wooly memory) was an early subset of the
whole wwb package.

Going with this, I pulled out the recently unearthed images on bitsavers of 4.1_BSD_19810710, and in the tape images sure was the source!  The only date in there is from 1979!

Deroff Version 2.0    29 December 1979

Which for a 1981 tape sure would be in the same light.  So with some fun playing with the makefiles, I had it running on Debian 8 x64!  So with a little bit of kicking I have it running on Windows via MinGW.

So for a fun example, I though I’d take Bill Gate’s forward on Inside OS/2:

 

      OS/2 is destined to be a very important piece of software. During the
 next 10 years, millions of programmers and users will utilize this system.
 From time to time they will come across a feature or a limitation and
 wonder why it's there. The best way for them to understand the overall
 philosophy of the system will be to read this book. Gordon Letwin is
 Microsoft's architect for OS/2. In his very clear and sometimes humorous
 way, Gordon has laid out in this book why he included what he did and why
 he didn't include other things.
      The very first generation of microcomputers were 8-bit machines, such
 as the Commodore Pet, the TRS-80, the Apple II, and the CPM 80 based
 machines. Built into almost all of them was Microsoft's BASIC Interpreter.
 I met Gordon Letwin when I went to visit Heath's personal computer group
 (now part of Zenith). Gordon had written his own BASIC as well as an
 operating system for the Heath system, and he wasn't too happy that his
 management was considering buying someone else's. In a group of about 15
 people, he bluntly pointed out the limitations of my BASIC versus his.
 After Heath licensed my BASIC, I convinced Gordon that Microsoft was the
 place to be if you wanted your great software to be popular, and so he
 became one of Microsoft's first 10 programmers. His first project was to
 single-handedly write a compiler for Microsoft BASIC. He put a sign on his
 door that read

         Do not disturb, feed, poke, tease...the animal

 and in 5 months wrote a superb compiler that is still the basis for all our
 BASIC compilers. Unlike the code that a lot of superstar programmers write,
 Gordon's source code is a model of readability and includes precise
 explanations of algorithms and why they were chosen.
      When the Intel 80286 came along, with its protected mode completely
 separate from its compatible real mode, we had no idea how we were going to
 get at its new capabilities. In fact, we had given up until Gordon came up
 with the patented idea described in this book that has been referred to as
 "turning the car off and on at 60 MPH." When we first explained the idea to
 Intel and many of its customers, they were sure it wouldn't work. Even
 Gordon wasn't positive it would work until he wrote some test programs that
 proved it did.
      Gordon's role as an operating systems architect is to overview our
 designs and approaches and make sure they are as simple and as elegant as
 possible. Part of this job includes reviewing people's code. Most
 programmers enjoy having Gordon look over their code and point out how it
 could be improved and simplified. A lot of programs end up about half as
 big after Gordon has explained a better way to write them. Gordon doesn't
 mince words, however, so in at least one case a particularly sensitive
 programmer burst into tears after reading his commentary. Gordon isn't
 content to just look over other people's code. When a particular project
 looks very difficult, he dives in. Currently, Gordon has decided to
 personally write most of our new file system, which will be dramatically
 faster than our present one. On a recent "vacation" he wrote more than 50
 pages of source code.
      This is Gordon's debut as a book author, and like any good designer he
 has already imagined what bad reviews might say. I think this book is both
 fun and important. I hope you enjoy it as much as I have.

First we run it through style which will give the overall report on the text.

D:\diction\bin>style.cmd forward.txt
readability grades:
        (Kincaid)  9.1  (auto)  9.2  (Coleman-Liau)  8.8  (Flesch)  8.5 (64.8)
sentence info:
        no. sent 31 no. wds 607
        av sent leng 19.6 av word leng 4.43
        no. questions 0 no. imperatives 0
        no. nonfunc wds 338  55.7%   av leng 5.58
        short sent (<15) 35% (11) long sent (>30)  13% (4)
        longest sent 35 wds at sent 14; shortest sent 7 wds at sent 5
sentence types:
        simple  39% (12) complex  32% (10)
        compound   3% (1) compound-complex  26% (8)
word usage:
        verb types as % of total verbs
        tobe  33% (26) aux  22% (17) inf  13% (10)
        passives as % of non-inf verbs   6% (4)
        types as % of total
        prep 8.6% (52) conj 4.1% (25) adv 7.1% (43)
        noun 23.9% (145) adj 14.7% (89) pron 8.4% (51)
        nominalizations   0 % (3)
sentence beginnings:
        subject opener: noun (6) pron (5) pos (2) adj (3) art (3) tot  61%
        prep  19% (6) adv   6% (2)
        verb   0% (0)  sub_conj  13% (4) conj   0% (0)
        expletives   0% (0)

So that places it on the grade 9 level, average readability.

Now let’s see about usage errors with diction!

D:\diction\bin>diction forward.txt
      os 2 is destined to be a[ very ]important piece of software.

 during the  next 10 years  millions of programmers and users will[ utilize]
 this system.

 the best way for them to understand the[ overall ] philosophy of the system
 will be to read this book.

 in his[ very ]clear and sometimes humorous  way  gordon has laid out in this
 book why he included what he did and why  he didn t include other things.

       the[ very ]first generation of microcomputers were 8 bit machines
 such  as the commodore pet  the trs 80  the apple ii  and the cpm 80 based
  machines.

 built into almost[ all of ]them was microsoft s basic interpreter.

 unlike the code that[ a lot of ]superstar programmers write   gordon s source
 code is a model of readability and includes precise  explanations of algorithms
 and why they were chosen.

[ in fact ] we had given up until gordon came up  with the patented idea described
 in this book that has been referred to as   turning the car off and on at
 60 mph.

[ a lot of ]programs[ end up ]about half as  big after gordon has explained
 a better way to write them.

 when a particular project  looks[ very ]difficult  he dives in.

number of sentences 34 number of hits 11

As you can see, Bill likes very, very much.

Explain can give you examples of what to use instead, so how about ‘a lot of’?

D:\diction\bin>bash explain
phrase?
a lot of
use "many" for "a lot of"
phrase?

Explain is a sed script, so in this case I’m using MinGW’s MSYS environment to run the script.

I don’t think much of anyone will care about text processing utilities from the 1970’s in 2016 (and beyond) but for anyone else who is bored, or found out about this by mistake, here you go!

diction.7z

You’ll get a 404 page, just read the error page for the password.