learn: Ancient troff sources vs. modern-day groff

(This is a guest post by xorhash.)

Introduction

I’ve been on a trip down memory lane lately, digging around old manuals of the UNIX® operating system from before BSD.† In doing so, I’ve come across the sources for the 7th Edition manuals. I wanted to show one part of volume 2A to other people, but didn’t want to make them download the entire 336 pages of volume 2A for the part in question. The part I wanted to extract was “LEARN — Computer-Aided Instruction on UNIX”, starting at p. 107 in the volume 2A PDF file.

A normal person would, I presume, try to split the PDF file. That is straightforward and produces the expected results. I believe I needn’t state that you wouldn’t be reading this if I solved this problem like any sane person would. Instead, I opted to rebuild the PDF from the troff sources provided at the link above.

I am not a very clever man, and thus I completely disregarded the generation procedure that was already spelled out. However, it wasn’t exactly specific anyway, so I didn’t miss out on much.

Getting the sources

So I knew what I needed to do: Get the troff sources. I asked that the Heavens have mercy on my poor soul if this requires a lot of adjustment for 2017 text processing tools. However, a man must do what a man must do. The file in question was called “vol2/learn.bun”. I had no idea what a bun file is, hoped it wasn’t related to steamed buns and clicked it. As it turns out, it’s just what we would call a self-extracting archive today. The shell commands are not very weird, so the extraction process actually worked out just fine. Now I had files “p0” through “p7”. Except what happened to “p1”, the world will never know.
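For readers who have not met the format: a bundle of this kind is simply a shell script that recreates each file from a here-document. The following is a minimal sketch of the idea, not the script that actually produced learn.bun; the `bundle` function name and the demo file contents are made up for illustration.

```shell
#!/bin/sh
# bundle: print a shell script that recreates the named files when run.
# Illustrative only; the function name and layout are assumptions.
bundle() {
    echo '# To unbundle, run this file with sh'
    for f in "$@"; do
        # The quoted delimiter keeps the shell from expanding $ or ` in the file body.
        echo "cat >$f <<'End of $f'"
        cat "$f"
        echo "End of $f"
    done
}

# Round trip in a scratch directory with two made-up troff fragments.
work=$(mktemp -d)
mkdir "$work/src" "$work/dst"
printf '.TL\nLEARN\n' >"$work/src/p0"
printf '.NH\nIntroduction\n' >"$work/src/p2"
(cd "$work/src" && bundle p0 p2) >"$work/demo.bun"
(cd "$work/dst" && sh ../demo.bun)
ls "$work/dst"
```

Running the generated file with sh in an empty directory writes the files back out. The sketch assumes every bundled file ends with a newline; otherwise the terminator line would fuse with the last line of the file.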

First Steps

I’ve dabbled in man pages before, but that was mostly mandoc, not actual troff.
Accordingly, the first attempt at getting something going was as naive as it could get:
$ groff -Tpdf p* | zathura -
It led to, shall we say, varying results.

really butchered rendering attempt

Clearly, I was doing something very fundamentally wrong. Conveniently, volume 2A also had a lot of troff documentation. Apparently I was supposed to pass -ms and first run tbl(1) over the troff source before actually giving it to groff. That sounded like a good idea, but the results were still somewhat off:

not very butchered rendering attempt

Allow me to express my doubts that this text was written in 2017. If you compare the output with the known-good PDF, you’ll also notice that, somehow, “Bell Laboratories, Murray Hill, New Jersey 07974” turned into “CAI”. Unfortunate.

Back to Square One and Pick Up the Breadcrumbs

Continuing to read the page I got the learn.bun from, I also spied a section called “Macros and References”. That sounds relevant to my interests. tmac.s, which after studying groff(1) seems to be what would get used with -ms, references some files in /usr/lib/tmac. I was not in the mood to let this flood over into my system, so I had to make minor adjustments and turn these into relative paths. I also renamed tmac.s to tmac.os to avoid colliding with the one provided by groff, making the new invocation:

$ tbl p* | groff -M./macros -mos -Tpdf | zathura -

Now we’re getting somewhere:

almost not butchered rendering attempt

It’s better than the previous attempts. But there are also some warnings and problems that need cleaning up:

  1. There’s a note that Bell Laboratories holds the UNIX®
    trademark, which is no longer true.†
  2. Now, this most certainly was not written on December 21,
    19117, either.
  3. tmac.os:806: warning: numeric expression expected (got `\')
  4. Every time the .UX macro was requested, I got:
    warning: macro `ev1' not defined (possibly missing space after `ev')
    environment stack underflow

Point 1 was easy to address; it’s a simple text change. Point 2 was caused by spurious dots in front of a call to .ND. However, the actual volume 2A PDF gave a different date than the file, so I changed it from June 18, 1976 to January 30, 1979 to match.

And Down the Slippery Slope

As for points 3 and 4… Let’s just say groff/troff macros are definitely not meant to be written or read by humans and it’s a feat comparable to magic that someone wrote this set of troff macros. Line 806 is .ch FO \\n(YYu. Supposedly, that changes the location of the page trap that invokes the given macro. The second argument is meant to be a distance, which explains why groff is complaining. I tried to check what groff does and was left none the wiser. FO seems related to the page footer; I seemed to get away with just deleting that line, though.

Finally, point 4. Apparently, .ev1 was used multiple times in tmac.os. This looked like it should’ve been .ev 1 instead. Changing those, lo and behold, .UX stopped behaving funky for the most part. Yet for some reason, I’d still get multiple footnotes about the ownership of the UNIX® trademark.† tmac.os sets a troff register (GA) when the .UX macro is first encountered so that the footnote is only made once. The footnote is being made twice. Something does not add up here. .AI (author’s institution) resets GA, but the first .UX comes after .AI, so that’s not the problem. Removing the .AB/.AE macros from page 1 caused only one footnote to be made. Thus, I infer it’s actually intended behavior that the footnote is made once for the abstract and once for the main body. Checking with the volume 2A PDF again, I realized that point 4 was, in fact, fixed just by the ev1 changes and I was just chasing a bug that does not exist. I really should’ve checked the PDF twice.
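Some background on why .ev1 ever worked: classic troff request names are at most two characters long, so .ev1 was read as .ev with the argument 1, while groff allows longer names and therefore goes looking for a macro called ev1. groff also has a compatibility mode (-C) intended for input like this, which may be worth a try before editing anything. For the edit itself, here is a small sketch of doing it mechanically with sed; the `fix_ev` function name is made up, and the pattern assumes the request starts at the beginning of a line with a plain dot.

```shell
#!/bin/sh
# fix_ev: insert the space groff expects between .ev and a numeric argument.
# Reads macro source on stdin and writes the patched text to stdout.
fix_ev() {
    sed 's/^\.ev\([0-9]\)/.ev \1/'
}

# Made-up three-line sample: only the first line should change.
printf '%s\n' '.ev1' '.nr GA 1' '.ev' | fix_ev
```

On the real file this would be something like `fix_ev <macros/tmac.os >macros/tmac.os.new`, followed by a diff to check that nothing else was touched.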

The abstract finally looks okay.

good rendering attempt

Done! Wait, No, Almost

Okay, we’re done, we can go home, right? Almost, one last thing to do: On the last page, there’s something really important missing: the bibliography. Instead, there’s just “$LIST$” there. We can’t just turn Brian W. Kernighan and Michael E. Lesk into plagiarists!

Back to the troff documentation in volume 2A, there’s a match for “$LIST$” on p. 183. Apparently I need a reference file and have to preprocess the file with refer(1). That sounds simple enough. Fortunately, I got the reference file along with the macros above, so I didn’t have to look for that separately.

$ refer -pRv7man -e p* | tbl | groff -M./macros -mos -Tpdf | zathura -

half of the references are blank

Of course. Why would it work? That’d have been too much to ask for.
At least I get some nice hints:

refer:p2:148: no matches for `skinner teaching 1961'
refer:p3:114: no matches for `kernighan editor tutorial 1974'

The troff documentation conveniently explains the format for the reference file, so I could just add these two entries to Rv7man and be done with it. Thankfully, the pre-compiled PDF of the volume 2A manual had the information necessary to compile the bibliography entries with.

%T Why We Need Teaching Machines
%A B. F. Skinner
%J Harvard Educational Review
%V 31
%P 377-398
%D 1961

%T A Tutorial Introduction to the Unix Editor ed
%A B. W. Kernighan
%D 1974

now that’s what I call a bibliography

And of course, here is the product of this whole ordeal.

Closing Remarks

The Heavens were feeling somewhat merciful, but only just enough that I could waste no more than a day on this project. They really wanted me to spend that day on it, though.

On a side note, “the missing learn references” aren’t available from the link that was provided. http://cm.bell-labs.com/cm/cs/who/bwk/learn.tar.gz is now down, though the web archive still has it. Needless to say, I didn’t read that.

I will never, ever touch troff/groff again. mandoc is good at what it does and I’ll stick to mandoc for writing man pages. But if I ever need to get something typeset nicely from plain text?

LaTeX is the answer.
Not troff.
Never troff.
Not even once.

†UNIX® is a registered trademark of The Open Group.

Coherent 3.0

I don’t know why I was so dumb as a kid, but I remember thumbing through various magazines, and always seeing this ad:

Coherent Ad

And doesn’t that sound great?  Lex, Yacc, UUCP, and UNIX functionality on an AT-compatible machine for $99!  And then you see reviews like this one from PC Mag:

Now even if you want to, you can’t wind the clock back to the late 1970s, but Unix lovers can do the next best thing: pick up a copy of Mark Williams’ Coherent for $99.95.

Included in this time capsule are all of the utilities that you would have received in an AT&T Unix, Version 7, distribution circa 1978. The package includes a protected-mode multi-user multitasking operating system, over 150 utility programs, a C compiler, an assembler, software development tools, text formatting tools, system management tools, telecommunications utilities, and complete documentation in a very hefty, 1,000-page, perfect bound book. Most of the Unix classics (grep, ed, sed, awk, lex, sh, emacs) are there as well. The only favorites that are missing are vi (which is a text editor) and Dave Korn’s new shell.

Whether Coherent’s views on the Unix system match your own is a matter of taste. In the halcyon late 1970s, the Unix system was a relatively simple affair: lean and clean, and understandable to mere mortals. Since then, in an effort to make Unix the universal solution, countless features and versions have been grafted onto it by innumerable programmers, managers, committees, and boards of directors.

The result stands in stark contrast to the stated goals of Unix’s inventors. Coherent remains true to Unix’s roots and eschews local area networking, graphical user interfaces, menus, mice, and many of the other amenities that present-day DOS users and Unix users have come to expect of modern software.

Coherent’s installation is painless, but only after the agony of freeing up a 7MB or larger partition on an ordinary MFM or RLL disk, on a classic AT architecture machine. Since they are products of the modern era, ESDI and SCSI disks, as well as IBM’s Micro Channel architecture, are not supported. Graphic display adapters are tolerated (used in text mode); mice are not supported. Coherent worked flawlessly, though, on my geriatric AT clone.

Coherent has a dual boot facility, so that you can choose to boot either DOS or Coherent during your startup procedures, but unfortunately you can’t run DOS software from within the Coherent environment.

Mark Williams’ president Robert Schwartz explained that the intended audience for Coherent is people who want to learn about or try the Unix system, without the hefty price tag and steep learning curve of the latest Unix versions. Part of Coherent’s advantage in both simplicity and price stems from its origins as a privately developed “clone” product; therefore no AT&T requirements need be met and no per-copy royalty is paid to AT&T. This gives Mark Williams the freedom to set prices as well as compatibility targets.

But learning Unix from Coherent would be a bumpy road. You could certainly master traditional system administration, learn the utilities, and experiment with Unix software development. But you couldn’t learn about networking or the increasingly important X Windows system. Nor could you realistically use Coherent to automate a small business.

Schwartz promises that future versions of Coherent will support 32-bit operation on the 386, and will likely support tighter integration with DOS, some form of window manager, and local area networking. When that occurs, Coherent will be much more like modern Unix systems and, like modern Unix systems, it will have strayed far from its roots.

List Price: Coherent Version 3.0, $99.95.
Requires: A free 7MB or larger hard disk partition, 640K RAM, highly disk drive, MFM or RLL controller.

Mark Williams Co., 601 N. Skokie Hwy., Lake Bluff, IL
(Kaare Christian)

And then, to my teenage eyes, it seemed like something pretty underwhelming.  So I dove into OS/2, and ignored the idea of having a UNIX-like system.  I was still happy to finally move onto a 16-bit machine, and the thought of running stuff from the 1970’s wasn’t that appealing. Such missed opportunities.  But in the last few years, Coherent has been placed under a 3-clause BSD license.

Over at unix4fun, they did unearth some version 3.0 disks!  And yes, it’ll install on PCem/86Box using a 286/386/486 machine.  One issue I had was that I first tried to install onto a massive 40MB disk, and it would never reboot correctly after the install.  However, it works great with a 32MB or smaller disk.  As you can see from Kaare’s review, it’ll fit into 7MB of disk space!  At least having to either re-partition or worry about dual booting is a thing of the past.  The disk images are 5.25-inch disk images, so re-configure your VM appropriately.

Coherent on PCem

As the advertisement says, the installation is a mere four diskettes!  And yes, it really does have a C compiler.  You will need a serial number for Coherent 3.0, which took a while to find, but Peter had one, and has been poking me for the last week+ to finally write this up.  Oh the number is 130500000.  305/Miami connection? Unlikely, but who knows.  Don’t forget to download the hefty manual, Coherent_Revision_8_1992, which is for a later version, but still suitable.

And yes, it feels just like Unix v7.  The kernel is tiny, 77KB!  It’s really cool for 16-bit era stuff, and really interesting to knock around.  I know there are a few more people out there that want fun things for their 286, and Coherent will certainly scratch that itch.

Additionally, on the site are the 3.1 and 3.2 updates to give you things like Elvis so it doesn’t feel anywhere near as primitive.  Installing updates and 3rd party packages is covered on page 736 of the manual, or in short you need to know the magical ‘disk set name’ for everything you want to install.  I suppose back then it had stuff like this printed on them.

coh300-ddk.img Drv_110
coh300-rdb.img rdb
coh31update-1.img CohUpd310
coh320update-1.img
coh320update-2.img CohUpd320
coh320-ddk.img

While a ‘dump’ of the source code has been out there, I haven’t really gone through it, so I thought now would be as good a time as any to take a look at the kernel.  The layout is very similar to v6, so I based this on the file ‘sys1.c’, which appears quite a few times in the trees.  Using an MD5 checksum against the files, there appear to be no fewer than 17 duplicated trees, or 7 unique kernels, spread over three years.

cc3b2bef09be7d60d52a01ca908972f7 Jul,24,1991 gtz/relic/d/kernel/USRSRC/coh/sys1.c
cc3b2bef09be7d60d52a01ca908972f7 Jul,24,1991 romana/relic/d/kernel/USRSRC/coh/sys1.c

41797301f9d5771dd10161954df82a5d Jan,13,1992 gtz/relic/d/286_KERNEL/USRSRC/coh/sys1.c
41797301f9d5771dd10161954df82a5d Jan,13,1992 romana/relic/d/286_KERNEL/USRSRC/coh/sys1.c
6b888a45afb3476c5e482fa54fb4dd86 Jul,17,1992 gtz/relic/b/kernel/coh.286/sys1.c
6b888a45afb3476c5e482fa54fb4dd86 Jul,17,1992 romana/relic/b/kernel/coh.286/sys1.c
6b888a45afb3476c5e482fa54fb4dd86 Aug,11,1992 gtz/relic/d/PS2_KERNEL/coh.286/sys1.c
6b888a45afb3476c5e482fa54fb4dd86 Aug,11,1992 romana/relic/d/PS2_KERNEL/coh.286/sys1.c
b90f2659fdecfbfa576d39bc8e54ffa0 Aug,11,1992 romana/relic/d/PS2_KERNEL/coh.386/sys1.c
b90f2659fdecfbfa576d39bc8e54ffa0 Aug,11,1992 gtz/relic/d/PS2_KERNEL/coh.386/sys1.c

6503663ebb9a852007a46d66cd43ac1a Jun,14,1993 gtz/relic/b/kernel/coh.386/sys1.c
6503663ebb9a852007a46d66cd43ac1a Jun,14,1993 romana/relic/b/kernel/coh.386/sys1.c
35bc7569ab99ab340d2ca8bf66a47c46 Aug,9,1993 romana/relic/b/STREAMS/coh.386/sys1.c
35bc7569ab99ab340d2ca8bf66a47c46 Aug,9,1993 gtz/relic/b/STREAMS/coh.386/sys1.c
436f245293c88ceaecd99840c37dcbb4 Nov,15,1993 gtz/hal/r10/coh.386/sys1.c
436f245293c88ceaecd99840c37dcbb4 Nov,15,1993 gtz/src/sys.r12/coh.386/sys1.c
436f245293c88ceaecd99840c37dcbb4 Nov,15,1993 gtz/src/sys.r12/coh.386/r12/sys1.c
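A grouping like the one above can be produced with a few standard tools. This is only a sketch of one way to do it, assuming GNU `md5sum` is available (on other systems the command may be named differently); the function names and the tiny demo tree are made up, and the dates in the listing above would have to come from the file timestamps separately.

```shell
#!/bin/sh
# sum_copies ROOT NAME: checksum every file called NAME under ROOT,
# sorted by checksum so identical copies end up on adjacent lines.
sum_copies() {
    find "$1" -type f -name "$2" -exec md5sum {} + | sort
}

# count_unique: read "checksum  path" lines and print the number of distinct checksums.
count_unique() {
    awk '{ print $1 }' | sort -u | wc -l | tr -d ' '
}

# Demo on a made-up tree: two identical copies and one that differs.
tree=$(mktemp -d)
mkdir -p "$tree/gtz/coh" "$tree/romana/coh" "$tree/gtz/coh.386"
echo 'int a;' >"$tree/gtz/coh/sys1.c"
echo 'int a;' >"$tree/romana/coh/sys1.c"
echo 'int b;' >"$tree/gtz/coh.386/sys1.c"
sum_copies "$tree" sys1.c
sum_copies "$tree" sys1.c | count_unique
```

Pointed at the real dump, the same two calls would list every copy of sys1.c and report how many distinct versions there are.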

Phew!  Naturally the tree structure drifted, but I went ahead and just did a blind import into my CVS server to take a look. And there really do appear to be remnants of either 2.3.37 or 3.2.1 in the 1991 versions.  It’s hard to say.

AT&T 3B2 400 emulated

This is super awesome!

AT&T 3B2 SYSTEM CONFIGURATION:

Memory size: 4 Megabytes
System Peripherals:

        Device Name        Subdevices           Extended Subdevices

        SBD
                        Floppy Disk
                        72 Megabyte Disk

        Welcome!
This machine has to be set up by you.  When you see the "login" message type
                                setup
followed by the RETURN key.  This will start a procedure that leads you through
those things that should be done the "first time" the machine is used.

The system is ready.

Console Login:

Back in the 1980’s AT&T shifted UNIX from being an internal research project that had become somewhat popular in college spaces (and at larger companies: General Motors was an early UNIX adopter, along with companies like Industrial Light and Magic) into a commercial product.  Quickly after the divestiture of 1984, AT&T entered the commercial space with its own custom machines & their homemade UNIX operating system.  Below is one of the ads they ran in 1984, touting their so-called ‘super microcomputers’, featuring the 3B2, the 3B5, and the AT&T Personal Computer.

The new computers from AT&T

And indeed many a government institute, bewildered by the dozens of UNIX vendors, standards, and the chaos of different platforms and processors, was all too happy to buy AT&T UNIX on AT&T machines.

And indeed this was my first experience with genuine SYSV Unix.

And I hated it.

Initially I had been thrown at an English computer lab: because I knew how to log on and do my work in style & diction, they decided I could help.  The system was aging and had major problems, as some prior students had figured out enough of the link kit that they would put their own attempts at re-writing portions of the kernel into the system, and it’d break.  Naturally the original installation diskettes were lost, and the best that could be done was basically to shut it down throughout the day and run the disk repair utilities.  It was not a fun job.

Later on the 3B2’s were thrown into the ‘common garbage’, aka free kit for other departments, and the 3B2’s re-appeared at the next place I was volunteering at on campus.  However, in addition to the two machines, there were a few other boxes of manuals, and oddly enough the installation diskettes.  And also about a dozen of these AT&T ISA Starlan adapters.  These weren’t the ones that were basically Ethernet (Starlan10) but rather the original ones.

Through some incredible luck we did find an NDIS 3 MS-DOS driver for the Starlan card, and we were able to cobble together a Starlan1 LAN.  At its center was a ‘super’ 3B2, built by cannibalizing the RAM and disks from one of the two machines, with added TCP/IP software and a massively cut-down port I did of Samba to turn it into a tiny file & print server (72MB MFM disks were its biggest, if I recall), along with Windows 95 clients.  And of course with a TCP/IP LAN we could easily load a proxy server (WinGate?) on one machine with the 56kb modem, and now we all had internet access.  I know it’s sad today, but trust me, back then it was “a big deal” that we had a fully functional LAN.

Over on loomcom.com there is an incredible amount of information about the reverse-engineered WE32100, along with the 3B2 hardware, and of course information about the newest SIMH simulator, the 3B2/400!

Instructions and disk images on the site made it incredibly easy to grab the latest SIMH Windows Development binaries, and get my own virtual 3B2 up and running in minutes! So naturally I pasted in dhrystone.c to see if it’d work.  And that was the first weird issue: the backspace is the pound (#) key, so all the C macro definitions lost their # sign.  I added them in vi without full terminal support because I’m crazy and:

# uname -a
unix unix 3.2 2 3B2
# ./dhrystone
Dhrystone(1.0) time for 500000 passes = 40
This machine benchmarks at 12500 dhrystones/second

Obviously this is 100% bogus, as the real machine should get around 735, and I didn’t even bother with the -O flag.
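For the record, the arithmetic behind that number is just passes divided by elapsed seconds. A quick sketch with awk, using the figures from the run above and the roughly 735 quoted for real hardware; the function names are made up:

```shell
#!/bin/sh
# dhry_rate PASSES SECONDS: Dhrystones per second as a whole number.
dhry_rate() {
    awk -v p="$1" -v s="$2" 'BEGIN { printf "%d\n", p / s }'
}

# ratio A B: A divided by B, to one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

rate=$(dhry_rate 500000 40)
echo "$rate"          # the figure the benchmark printed
ratio "$rate" 735     # how many times the figure quoted for real hardware
```

So the simulated machine reports about 17 times the speed of the real one.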

The current emulator doesn’t do any additional serial ports, nor any Ethernet adapters.  So you only get a console.  But with the pre-installed C compiler image, I was able to build a trivial file just fine.  Although pasting on the console really leaves a lot to be desired.

SDF AT&T 3B2/500 UNIX System

I know for some of us old people the 3B2 hid in the corners of our call centres, running our AT&T Definity switches, our voicemail, and even some of our early ISPs.  After funneling money into SUN to get them to work on SYSVr4, which was the grand unification of BSD + SYSV, AT&T’s interest in UNIX quickly waned, and they divested themselves of UNIX, and eventually all PC hardware, although they did re-enter the PC space a few times before exiting yet again.

As time would tell, proprietary hardware + a previously ‘open’ operating system were not the winning combination.  And so far the only UNIX vendor to weather the Linux storm is IBM with its AIX.

Research UNIX v9

v9 on TME

This just in, I have just booted Research UNIX v9 on TME’s SUN-3 emulator!

And there we are booted up and logged in.. pardon the disk error..

funinthe

I’m slightly hesitant about uploading it, as it clearly isn’t right… And this is only the binary component; I have integrated the source tree onto the disk image.  But I haven’t actually tried to compile anything except a simple hello-world program.  You can download it here from sourceforge: SUN3-research_v9.7z  If anyone wants to browse the source, it’s on my CVS browser thing.

Research UNIX v8

    v8 on SIMH

So what the heck is Research UNIX v8?  Or even what is Research UNIX?  Well a query against utzoo gave me this answer:

>I've seen people that use System V and the like refer to their Unix as
>"tenth edition" or "ninth edition", or whatever. I've always seen things as
>"System V release n", or whatever. Anyone know the difference between these
>different naming schemes ?

There are actually three designations: Versions, Editions, and
System/Releases. The proper names of the first six Unixen were
"The #th Edition". Colloquially, people called them "Version #".
The Version Sixth Edition split off several variations, one of which
became Version Seven (the Seventh Edition) and sired BSD. From
several others, System III was born, and later named System V.
Tacked onto this name were Release numbers and yes, Versions.
So you will see things line SVr3v2.

The Eighth, Ninth, and Tenth Editions seldom left Bell Labs
and are also referred to as "Research UNIX". Another system
(not UNIX) they are playing with is called "Plan 9". Every so
often, a feature, such as STREAMS, finds its way into System V.

In some ways, Research UNIX is closer to BSD than to System V.

In short, UNIX began its life as a research project.  Until recently, versions 1-6 & 32v were available to the public, but the later versions, 8, 9, and 10, were not.  However, thanks to the work over at TUHS, they are now available for non-commercial use:

Alcatel-Lucent USA Inc has permitted usage saying "will not assert its
copyright rights with respect to any non-commercial copying, distribution,
performance, display or creation of derivative works of 
Research Unix® Editions 8, 9, and 10."

So awesome!

The version of Research v8 is split onto 2 tape images, one for the graphical terminals, and the other for the OS install onto the VAX.  The distribution is not suitable for any standalone operation, and requires a previously installed 4.1BSD machine, with a second disk to install v8 onto.  Part of the installation requires you to compile your own kernel.  I ran into a bit of problems as it’s not a 100% process, but after referencing this guide from David du Colombier, I had the system up and running.  Naturally reading the installation manual helped a great deal too.

As always, there are strange artifacts left in the backup, such as this scoreboard from rogue:

Top Ten Rogueists:
Rank Score Name
1 5545 Rog-O-Matic XIII for mjs: quit on level 17.
2 5043 ken: killed on level 23 by a dragon.
3 3858 zip: killed on level 16 by an invisible stalker.
4 3249 Rog-O-Matic VII: killed on level 16 by an invisible stalker.
5 2226 Rog-O-Matic VII: killed on level 13 by a troll.
6 2172 St. Jude: killed on level 13 by a troll.
7 1660 Rog-O-Matic VII: quit on level 11.
8 1632 Chipmunk the Jello: killed on level 10 by a centaur.
9 844 Rog-O-Matic VII: quit on level 5.
10 401 Rog-O-Matic VII: killed on level 4 by a snake.

Does this mean Ken Thompson was an avid rogue fan?  Perhaps.  Naturally I quickly compiled the v100 version of aclock, and had it running.

aclock on v8

I’ll have to edit this more and more as I find things out, but I’ve been busy in real life, and of course I know that in addition to v8, there is also v9 & v10 to tackle.

As always, if you want, you can download my pre-installed image from my site: researchv8.7z

You will have to bring your own copy of the SIMH VAX-11/780 simulator.  As of 31/3/2017 there are issues with the github version of SIMH, and you will have issues with the disks on the VAX.  You need to disable the async with a simple set command in your ini file:

set noasync

And you should now be good to go!  As always you’ll have to battle the 404 page for the correct link and the username & password.

sorry.

Style & Diction

While looking at some old picture of a 3B2, I remembered that in college we used to use this ‘fine’ system for its Writer’s Workbench, which revolved around the programs style & diction.

I thought it’d be interesting to see if I could track down the source; however, the sources seem to have been part of the AT&T DWB package, and were not included in any of the seemingly numerous Unix sources available on TUHS.  But thanks to this post on the TUHS mailing list, I saw this:

I know about style and diction which was shipped with BSD4.1
which (again wooly memory) was an early subset of the
whole wwb package.

Going with this, I pulled out the recently unearthed images on bitsavers of 4.1_BSD_19810710, and in the tape images, sure enough, was the source!  The only date in there is from 1979!

Deroff Version 2.0    29 December 1979

Which for a 1981 tape seems about right.  So with some fun playing with the makefiles, I had it running on Debian 8 x64!  And with a little bit of kicking I have it running on Windows via MinGW.

So for a fun example, I thought I’d take Bill Gates’s foreword to Inside OS/2:

 

      OS/2 is destined to be a very important piece of software. During the
 next 10 years, millions of programmers and users will utilize this system.
 From time to time they will come across a feature or a limitation and
 wonder why it's there. The best way for them to understand the overall
 philosophy of the system will be to read this book. Gordon Letwin is
 Microsoft's architect for OS/2. In his very clear and sometimes humorous
 way, Gordon has laid out in this book why he included what he did and why
 he didn't include other things.
      The very first generation of microcomputers were 8-bit machines, such
 as the Commodore Pet, the TRS-80, the Apple II, and the CPM 80 based
 machines. Built into almost all of them was Microsoft's BASIC Interpreter.
 I met Gordon Letwin when I went to visit Heath's personal computer group
 (now part of Zenith). Gordon had written his own BASIC as well as an
 operating system for the Heath system, and he wasn't too happy that his
 management was considering buying someone else's. In a group of about 15
 people, he bluntly pointed out the limitations of my BASIC versus his.
 After Heath licensed my BASIC, I convinced Gordon that Microsoft was the
 place to be if you wanted your great software to be popular, and so he
 became one of Microsoft's first 10 programmers. His first project was to
 single-handedly write a compiler for Microsoft BASIC. He put a sign on his
 door that read

         Do not disturb, feed, poke, tease...the animal

 and in 5 months wrote a superb compiler that is still the basis for all our
 BASIC compilers. Unlike the code that a lot of superstar programmers write,
 Gordon's source code is a model of readability and includes precise
 explanations of algorithms and why they were chosen.
      When the Intel 80286 came along, with its protected mode completely
 separate from its compatible real mode, we had no idea how we were going to
 get at its new capabilities. In fact, we had given up until Gordon came up
 with the patented idea described in this book that has been referred to as
 "turning the car off and on at 60 MPH." When we first explained the idea to
 Intel and many of its customers, they were sure it wouldn't work. Even
 Gordon wasn't positive it would work until he wrote some test programs that
 proved it did.
      Gordon's role as an operating systems architect is to overview our
 designs and approaches and make sure they are as simple and as elegant as
 possible. Part of this job includes reviewing people's code. Most
 programmers enjoy having Gordon look over their code and point out how it
 could be improved and simplified. A lot of programs end up about half as
 big after Gordon has explained a better way to write them. Gordon doesn't
 mince words, however, so in at least one case a particularly sensitive
 programmer burst into tears after reading his commentary. Gordon isn't
 content to just look over other people's code. When a particular project
 looks very difficult, he dives in. Currently, Gordon has decided to
 personally write most of our new file system, which will be dramatically
 faster than our present one. On a recent "vacation" he wrote more than 50
 pages of source code.
      This is Gordon's debut as a book author, and like any good designer he
 has already imagined what bad reviews might say. I think this book is both
 fun and important. I hope you enjoy it as much as I have.

First we run it through style which will give the overall report on the text.

D:\diction\bin>style.cmd forward.txt
readability grades:
        (Kincaid)  9.1  (auto)  9.2  (Coleman-Liau)  8.8  (Flesch)  8.5 (64.8)
sentence info:
        no. sent 31 no. wds 607
        av sent leng 19.6 av word leng 4.43
        no. questions 0 no. imperatives 0
        no. nonfunc wds 338  55.7%   av leng 5.58
        short sent (<15) 35% (11) long sent (>30)  13% (4)
        longest sent 35 wds at sent 14; shortest sent 7 wds at sent 5
sentence types:
        simple  39% (12) complex  32% (10)
        compound   3% (1) compound-complex  26% (8)
word usage:
        verb types as % of total verbs
        tobe  33% (26) aux  22% (17) inf  13% (10)
        passives as % of non-inf verbs   6% (4)
        types as % of total
        prep 8.6% (52) conj 4.1% (25) adv 7.1% (43)
        noun 23.9% (145) adj 14.7% (89) pron 8.4% (51)
        nominalizations   0 % (3)
sentence beginnings:
        subject opener: noun (6) pron (5) pos (2) adj (3) art (3) tot  61%
        prep  19% (6) adv   6% (2)
        verb   0% (0)  sub_conj  13% (4) conj   0% (0)
        expletives   0% (0)

So that places it on the grade 9 level, average readability.
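Two of those grades can be cross-checked from the counts in the same report. The sketch below assumes that “auto” is the Automated Readability Index and uses the coefficients commonly given for style’s ARI and Coleman-Liau scores; I have not checked them against the 4.1BSD source, so treat the formulas as an assumption. The inputs are the average word length (4.43), word count (607) and sentence count (31) printed above.

```shell
#!/bin/sh
# ari LETTERS_PER_WORD WORDS SENTENCES: Automated Readability Index.
ari() {
    awk -v l="$1" -v w="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", 4.71 * l + 0.5 * (w / s) - 21.43 }'
}

# coleman_liau LETTERS_PER_WORD WORDS SENTENCES: Coleman-Liau grade level.
# Coefficients 5.89 and 30 are assumed; other references round them differently.
coleman_liau() {
    awk -v l="$1" -v w="$2" -v s="$3" \
        'BEGIN { printf "%.1f\n", 5.89 * l - 30 * (s / w) - 15.8 }'
}

ari 4.43 607 31
coleman_liau 4.43 607 31
```

With these inputs the two functions print 9.2 and 8.8, which agrees with the report to one decimal place. The Kincaid and Flesch scores also need a syllable count, which the report does not print, so they are left out here.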

Now let’s see about usage errors with diction!

D:\diction\bin>diction forward.txt
      os 2 is destined to be a[ very ]important piece of software.

 during the  next 10 years  millions of programmers and users will[ utilize]
 this system.

 the best way for them to understand the[ overall ] philosophy of the system
 will be to read this book.

 in his[ very ]clear and sometimes humorous  way  gordon has laid out in this
 book why he included what he did and why  he didn t include other things.

       the[ very ]first generation of microcomputers were 8 bit machines
 such  as the commodore pet  the trs 80  the apple ii  and the cpm 80 based
  machines.

 built into almost[ all of ]them was microsoft s basic interpreter.

 unlike the code that[ a lot of ]superstar programmers write   gordon s source
 code is a model of readability and includes precise  explanations of algorithms
 and why they were chosen.

[ in fact ] we had given up until gordon came up  with the patented idea described
 in this book that has been referred to as   turning the car off and on at
 60 mph.

[ a lot of ]programs[ end up ]about half as  big after gordon has explained
 a better way to write them.

 when a particular project  looks[ very ]difficult  he dives in.

number of sentences 34 number of hits 11

As you can see, Bill likes ‘very’ very much.

Explain can give you examples of what to use instead, so how about ‘a lot of’?

D:\diction\bin>bash explain
phrase?
a lot of
use "many" for "a lot of"
phrase?

Explain is a sed script, so in this case I’m using MinGW’s MSYS environment to run the script.

I don’t think many people will care about text processing utilities from the 1970s in 2016 (and beyond), but for anyone who is bored, or found out about this by mistake, here you go!

diction.7z

You’ll get a 404 page; just read the error page for the password.

Virtualization Challenge Part II – WYSE Unix

(this is a guest post from Tenox)

The second virtualization contest is now on! Similar to the previous one, the winner receives $100 via PayPal and the submission is posted on this blog! Hopefully this one will be a little bit more challenging. 🙂

The subject is the rarest of the rare: WYSE Unix!

The progress so far: A few years ago I came into possession of a set of floppy disks, pictured here:

Wyse Unix

Thanks to Al Kossow from bitsavers.org, the floppy disk content has been recovered. Michal Necasek of the OS/2 Museum successfully converted the images into a usable format and made some modifications to get them to boot on VirtualBox:

Wyse Unix in VirtualBox

A couple of years later, thanks to Andrew Gong, a WYSE Unix tape was found on eBay:

wyseunix

More recently, Al Kossow was able to read the tape into an image, which I have now uploaded to my web server: wyseunix321a.zip

The next step is yours! Install the whole system on to a hypervisor of your choice, document the process and supply a vanilla boot image or VM.

The winner shall be the first person to post a comment declaring success, including a screenshot, and who can further prove it by emailing me the submission shortly after. If the comment gets blocked by the spam filter, don’t worry: the original submission time will of course count. Oh, and I almost forgot: I also need an aclock binary for it. However, if there is no compiler and the standard SysV binary works fine, you are exempt from the requirement.

The catch? Looks like floppy disk trouble. The boot disk is fine after it has been fixed up by Michal. The Base floppy looks like it has the same content as boot. Copy Tools is very small and looks like it may be truncated. Hopefully not, but if so, I count on your creativity. Remember that Dell Unix is exactly the same release of SystemV/386 and did not have or need copy tools to install.

Good Luck!

Update: Looks like the contest has been won by Mihai! Congratulations!

I saw this git/Unix archive mentioned on TUHS

And I thought that I should broadcast it to the world. Diomidis Spinellis has done the hard work of going through all the old legacy Unix source code and making it easily available here. Even more fun is to just find somewhere with a couple of GB free and clone it!

git clone https://github.com/dspinellis/unix-history-repo

With that done, you can then ‘check out’ any of the major releases from the repo and get the source! For example, to see 4.4 BSD, you would type in:

cd unix-history-repo
git checkout BSD-4_4

Pretty cool!

And it goes up to FreeBSD 10.1.0. Release tags are:

  • Epoch
  • Research-V1
  • Research-V3
  • Research-V4
  • Research-V5
  • Research-V6
  • BSD-1
  • BSD-2
  • Research-V7
  • Bell-32V
  • BSD-3
  • BSD-4
  • BSD-4_1_snap
  • BSD-4_1c_2
  • BSD-4_2
  • BSD-4_3
  • BSD-4_3_Reno
  • BSD-4_3_Net_1
  • BSD-4_3_Tahoe
  • BSD-4_3_Net_2
  • BSD-4_4
  • BSD-4_4_Lite1
  • BSD-4_4_Lite2
  • BSD-SCCS-END
  • 386BSD-0.0
  • 386BSD-0.1
  • FreeBSD-release/1.0, 1.1, 1.1.5
  • FreeBSD-release/2.0, 2.0.5, 2.1.0, 2.1.5, 2.1.6, 2.1.6.1, 2.1.7, 2.2.0, 2.2.1, 2.2.2, 2.2.5, 2.2.6, 2.2.7, 2.2.8
  • FreeBSD-release/3.0.0, 3.1.0, 3.2.0, 3.3.0, 3.4.0, 3.5.0
  • FreeBSD-release/4.0.0, 4.1.0, 4.1.1, 4.2.0, 4.3.0, 4.4.0, 4.5.0, 4.6.0, 4.6.1, 4.6.2, 4.7.0, 4.8.0, 4.9.0, 4.10.0, 4.11.0
  • FreeBSD-release/5.0.0, 5.1.0, 5.2.0, 5.2.1, 5.3.0, 5.4.0, 5.5.0
  • FreeBSD-release/6.0.0, 6.1.0, 6.2.0, 6.3.0, 6.4.0
  • FreeBSD-release/7.0.0, 7.1.0, 7.2.0, 7.3.0, 7.4.0
  • FreeBSD-release/8.0.0, 8.1.0, 8.2.0, 8.3.0, 8.4.0
  • FreeBSD-release/9.0.0, 9.1.0, 9.2.0
  • FreeBSD-release/10.0.0, 10.1.0
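
If you would rather not type tag names from memory, something like the following rough sketch may help. It assumes the release names above exist as git tags in your clone, and the helper names list_release_tags and peek_release are made up for illustration; they are not part of the repo. It only uses stock git commands (tag --list and ls-tree).

```shell
#!/bin/sh
# Sketch only: the function names below are hypothetical helpers,
# not something shipped with unix-history-repo.

# List the tags in a clone that match a glob pattern, e.g. 'BSD-*'.
list_release_tags() {
    repo=$1
    pattern=$2
    git -C "$repo" tag --list "$pattern"
}

# Show the top-level entries of a tagged release without
# switching the working copy away from whatever is checked out.
peek_release() {
    repo=$1
    tag=$2
    git -C "$repo" ls-tree --name-only "$tag"
}

# Example usage (assumes the clone lives in ./unix-history-repo):
#   list_release_tags unix-history-repo 'Research-*'
#   peek_release unix-history-repo Research-V7
```

The pattern should be quoted so the shell hands it to git unexpanded. Peeking with ls-tree is handy when you only want to see what a release contains before committing to a full checkout.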