Elijah Miller’s NEC v30 on a Pi hat

v30 on a board

While talking about home brew 8080 and 8086 systems on Discord an ebay search brought me to Elijah’s store page where this small little curiosity was up for sale. It’s literally just a NEC v30 on a Raspberry Pi hat, for a mere $15 USD! Interestingly enough the v30 can operate at 3.3v meaning no special hardware is required to interface to the GPIO bus on a Pi. This reminds me so much of the CP/M cartridge for the Commodore 64, and the price being so right I quickly ordered one and eagerly awaited to 2 weeks shipping to Asia.

While I have Pi 4’s that I run Windows 10 on to drive some displays & power point, I wanted to use the slightly faster Pi400 for this. The Pi400 has a compatible GPIO expansion port so just like a cartridge it’s a simple matter of slotting the card, powering up and building the software. While there is an included binary, it’s a 32bit one, and I’m running Manjaro on the Pi400 for a similar look/feel as the PineBook Pro. Anyways the dependences are SDL2, and an odly named ‘wiringPi’ library that allows C programs to interface to the GPIO.

You can download the emulator over on homebrew8088, specifically the Raspberry Pi Second Project. The last ‘ver 2’ download has the project configured for a v30 which is an 8086 analogue, unlike the v20 which is an 8088. When physically interfacing to the processor things like this really matter!

With the emulator built it was pretty simple to fire it up, and boot into MS-DOS:

first boot!

I have to admit I was a little startled at first as I really had no idea if this was going to work at all. I’d spoken to an engineer friend and he was saying plugging a CPU directly into the GPIO bus, and toggling connections to actually emulate the board was both crazy and that without any electrical buffers it’d most likely either fry the processor and maybe the Pi as well. I suspect this being low voltage may be sparing both, although I have no EE so I’m not going to pretend to know.

Loading up Norton SI confirms what Elijah had posted on Ebay is that it runs very slowly about 1/3rd the speed of an XT. Now I may not know anything about hardware but this seemed at least something a profiler could at least tell me what is going on, and if someone like me helicoptering in on the shoulder of giants could see something.

gcc -I/usr/include/SDL2 -pg -O2 *.cpp -o pi -lSDL2 -lwiringPi -lpthread -lstdc++

This will build a profiled version of the emulator that’ll let us know which functions are being called both the number of times, and how much time to do so. Not knowing anything but having profiled other emulators, the usual pattern is that you spend most time fetching and possibly translating memory; Both in feeding instructions and pushing/popping data from stack and pointers. Waiting is usually for initialisation and for IO.

Once you’ve run your profiled executable, it’ll dump a binary file gmon.out which you can then use gprof to format to a text file like this:

gprof pi gmon.out > report.txt

And then looking at the report you can see where the top time, along with top calls are. Some things just take a while to complete and other well they get called far too often.

Each sample counts as 0.01 seconds. % cumulative self self total
time seconds seconds calls s/call s/call name
39.91 0.71 0.71 286883 0.00 0.00 Print_Char_9x16(SDL_Render er*, int, int, unsigned char)
16.30 1.00 0.29 1 0.29 1.02 Start_System_Bus(int)
12.37 1.22 0.22 1100374 0.00 0.00 Data_Bus_Direction_8086_OUT()
7.87 1.36 0.14 5954106 0.00 0.00 CLK()

As expected Start_System_Bus takes 1 second, followed by 1,100,374 calls to set the Data_Bus_Direction_8086_OUT (no doubt the Pi needs to alternate between reading and writing to the CPU), followed by 5,954,106 ticks of the CLK function. Of course the real culprit is Print_Char_9x16 which was called 286,883 times, and is responsible for nearly 40% of the tuntime!

Obviously for a simple MS-DOS boot the screen should not be calling any print char anywhere near this many times. Clearly something is amiss. Not knowing anything I added a simple counter to block at the top of the Print_Char_9x16 function to let it only execute 1:1000 times, and I got this:

Obviously it’s not right, which means that the culprit really isn’t Print_Char_9x16 but rather what is calling it. It was a simple change to each of the Mode functions to only render a fraction of the time, and I changed it to a define to let me fire it more often. This is a simple diff, assuming WordPress doesn’t screw it up. It’s not pretty but it gets the job done.

$ diff -ruN ver2/vga.cpp ver2-j/vga.cpp 
--- ver2/vga.cpp	2020-07-29 10:36:51.000000000 +0800
+++ ver2-j/vga.cpp	2021-06-04 01:51:33.546124473 +0800
@@ -1,5 +1,9 @@
 #include "vga.h"
 
+static int do9x16 = 0;
+#define VIDU 5000
+
+
 void Print_Char_18x16(SDL_Renderer *Renderer, int x, int y, unsigned char Ascii_value)
 {
 	for (int i = 0; i < 9; i++)
@@ -23,6 +27,12 @@
 
 void Mode_0_40x25(SDL_Renderer *Renderer, char* Video_Memory, char* Cursor_Position)
 {
+do9x16++;
+if(do9x16>VIDU)
+        {do9x16=0;}
+else
+        {return;}
+
 	int index = 0; 
 	for (int j = 0; j < 25; j++)
 	{
@@ -36,6 +46,7 @@
 	Print_Char_18x16(Renderer, (Cursor_Position[0] * 18), (Cursor_Position[1] * 16), 0xDB);
 	SDL_RenderPresent(Renderer);	
 }
+
 void Print_Char_9x16(SDL_Renderer *Renderer, int x, int y, unsigned char Ascii_value)
 {
 	for (int i = 0; i < 9; i++)
@@ -57,6 +68,12 @@
 }
 void Mode_2_80x25(SDL_Renderer *Renderer, char* Video_Memory, char* Cursor_Position)
 {
+do9x16++;
+if(do9x16>VIDU)
+        {do9x16=0;}
+else
+        {return;}
+
 	int index = 0; 
 	for (int j = 0; j < 25; j++)
 	{
@@ -102,6 +119,12 @@
 
 void Graphics_Mode_320_200_Palette_0(SDL_Renderer *Renderer, char* Video_Memory)
 {
+do9x16++;
+if(do9x16>VIDU)
+        {do9x16=0;}
+else
+        {return;}
+
 	SDL_RenderClear(Renderer);
 			int index = 0; 				
 			for (int j = 0; j < 100; j++)
@@ -156,6 +179,12 @@
 }
 void Graphics_Mode_320_200_Palette_1(SDL_Renderer *Renderer, char* Video_Memory)
 {
+do9x16++;
+if(do9x16>VIDU)
+        {do9x16=0;}
+else
+        {return;}
+
 	SDL_RenderClear(Renderer);
 			int index = 0; 
 			for (int j = 0; j < 100; j++)

While it feels more responsive on the console, it’s still incredibly slow. SI was returning the same speed which means that although we aren’t hitting the screen anywhere near as often it’s still doing far too much. Is it really a GPIO bus limitation? Again I have no idea. But the next function of course is the clock.

First I tried dividing the usleep in half thinking that maybe it’s not getting called enough. And running SI revealed that I’d gone from a 0.3 to a 0.1! Obviously this is not the desired effect! So instead of a divide I multiplied it by four:

diff -ruN ver2/timer.cpp ver2-j/timer.cpp 
--- ver2/timer.cpp	2020-08-12 00:32:13.000000000 +0800
+++ ver2-j/timer.cpp	2021-06-04 02:06:25.505904407 +0800
@@ -7,7 +7,7 @@
 {
    while(Stop_Flag != true)
    {
-      usleep(54926); 
+      usleep(54926*4); 
       IRQ0();
    }
 }

Now re-running SI I get this:

Norton SI with clock multiplied by four

Now it’s scoring a 1.5! Obviously these are all ‘magic numbers’ and tied to the Pi400 and more importantly I haven’t studied the code at all, I’m not trying to disparage or anything, if anything it’s just a quick example why profiling your code can be so important! At the same time trying to run games is so incredibly slow I don’t even know if my changes had any actual impact to speed as emulation of benchmarks can be such a finickie thing.

My goto game, Battletech 3025 Crescent Hawks Inception loads to the first splash but then seems to hang. I could be impatient or there could be further issues but I’m just some impatient tourist with a C compiler…

With my changes and re-running the profiler I now see this:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  us/call  us/call  name    
 95.41    129.23   129.23 22696621     5.69     5.69  Read_Memory_Array(unsigned long long, char*, int)
  2.90    133.15     3.92                             Start_System_Bus(int)
  0.88    134.34     1.19 64369074     0.02     0.02  CLK()
  0.30    134.74     0.40                             keyboard()
  0.16    134.96     0.22   412873     0.53     0.53  Print_Char_9x16(SDL_Render
er*, int, int, unsigned char)
  0.08    135.07     0.11 11273939     0.01     0.01  Data_Bus_Direction_8086_OUT()

Which is now what I expect with the bulk of the emulation now calling Read_Memory, with the Clock following that and of course our tamed screen renderer (although its still called far too much!) with the Data_Bus_Direction being further down the list. No doubt some double buffering and checking what changed in between calls would go a LONG way to optimise it, just as would actually studying the source code.

The one cool thing about this is that if I wanted to write a PC emulator this way gives me the confidence that the CPU is not only 100% cycle accurate, but it’s 100% bug for bug accurate since we are using a physical processor.

And again for $15 USD + Shipping I cannot recommend this enough!

Virtualization Challenge IV – QNX 1.2

(This is a guest post by Antoni Sawicki aka Tenox)

This is a Virtualization Challenge. A competition to virtualize an OS inside emulator/hypervisor. (Previously 1 / 2 / 3)

This time the object of the competition is QNX version 1.2. A demo disk is covered here. This is the set of floppy disks:

As you can see the boot disk is copy-protected. As such I have imaged these disks using both KryoFlux and SuperCard Pro. The magnetic flux stream images are available here. For verification I have converted the raw stream of the demo disk in to a sector image using HFE tool. The converted disk boots and works correctly in an emulator. The demo disk can also help with analyzing the boot process since it’s known to work.

The contest is to virtualize the OS, install it and provide a fully working hard disk image with the OS installed. Any emulator of your choice or method is acceptable as long as anyone can download and run it. The prize is $100 via PayPal and of course the fame! 🙂 The winner will be whoever comments the article first with a verifiable working solution.

A bonus $50 prize will be awarded if you can patch the boot floppy disk so that it can be installed as if the copy protection was never there.

Good luck!!!

UPDATE: The competition has ben won: QNX 1.2 Virtualized

UPDATE 2 : QNX 1.2 challenge Act II – HDD Boot

UPDATE 3: Reverse-engineering QNX 1.2 to boot from HDD

QNX 1.1 Demo Disk

(This is a guest post by Antoni Sawicki aka Tenox)

Fresh from the oven, or rather Kryoflux dump – a QNX version 1.1 Demo Disk:

QNX 1.1 Demo Disk

I managed to boot it on 86Box:

QNX 1.1 booted on 86Box Emulator

For the readers with more curiosity and time at their hands please could you try it on different emulators and comment what works and what doesn’t.

For the less curious this how the demo actually looks like once you log in as demo user:

QNX 1.1 Demo Menu

As the authors demand to make as many copies of this disk as possible here it is. Please download and spread!

I also managed to dump the rest of QNX 1.2 including boot disk, utils and even c compiler. Unfortunately the boot disk is copy protected:

I have raw stream dump made with Kryoflux as well as regular disk images. If you are interested in circumventing checking the copy protection so the system could be run in an emulator let me know in a comment. Perhaps time for another Virtualization Challenge?

Previously:

Virtualizing QNX 2

QNX Windows – First Look

QNX 2.21 Arrived Today

PCem v15 released!

The new dynamic recompiler appears to be much more faster, although if you want maximum performance, make sure to set your video card to the fastest possible performance.

I was doing my typical DooM thing, and the performance was abysmal. But I did have an 8bit VGA card selected, so what would I really expect? Interestingly enough in ‘low resolution’ mode it performed quite well, but setting it to the artificial ‘fastest PCI/VLB’ speed it was performing just great.

PCem v15 released. Changes from v14 :

  • New machines added – Zenith Data SupersPort, Bull Micral 45, Tulip AT Compact, Amstrad PPC512/640, Packard Bell PB410A, ASUS P/I-P55TVP4, ASUS P/I-P55T2P4, Epox P55-VA, FIC VA-503+
  • New graphics cards added – Image Manager 1024, Sigma Designs Color 400, Trigem Korean VGA
  • Added emulation of AMD K6 family and IDT Winchip 2
  • New CPU recompiler. This provides several optimisations, and the new design allows for greater portability and more scope for optimisation in the future
  • Experimental ARM and ARM64 host support
  • Read-only cassette emulation for IBM PC and PCjr
  • Numerous bug fixes

Thanks to dns2kv2, Greatpsycho, Greg V, John Elliott, Koutakun, leilei, Martin_Riarte, rene, Tale and Tux for contributions towards this release.

As always PCem can be downloaded here:

8086tiny BIOS patch update

This fun patch allows bigger hard disks, allowing you to run larger OS’s like QNX!

You don’t have to update the emulator, it’s just for the BIOS.  Source is here: Over on github.

8086 Tiny on Windows with ansicon to render the textmode correctly.

Seeing the QNX logo sure has some flashbacks to the Burroughs/Unisys Icon from days of old.  Although it has no relationship to the Waterloo Icon, some housing complex for students.

PCem v13 released

Lots of new features added into this release!

New systems like:

  • IBM PS/2 Model 50
  • IBM PS/2 Model 55SX
  • IBM PS/2 Model 80

New disk controllers!

  • AT Fixed Disk Adapter
  • DTC 5150X
  • Fixed Disk Adapter (Xebec)
  • IBM ESDI Fixed Disk Controller
  • Western Digital WD1007V-SE1
  • Adaptec AHA-1542C
  • BusLogic BT-545S
  • Longshine LCS-6821N
  • Rancho RT1000B
  • Trantor T130B

And plenty of new fixes!  I just installed Citrix 2.0 with the older OS/2 1.2 based drivers, and it works great!

Citrix 2.0 on PCem v13

The full announcement is here, along with downloads for Windows & Linux.

Epyx Rogue 1.48

Rouge 1.48 title screen

Rouge 1.48 title screen

A while back while looking for old Rogue source, and resources I came across this page, which includes a lot of old versions, and source code, and the file rog11src.zip. But looking at the source in this directory the file rogue.h reveals that it is actually 1.48!

#define REV 1
#define VER 48

And the source is all timestamped from late 1984, and throughout 1985.  Well isn’t that exciting!  Also on the same site is rogue-1.48.zip, a binary distribution of Rogue 1.48.  So I thought I’d give it a shot to build it.  The source mentions needing the MANX C compiler, which of course a quick google search yields an ad:

Manx Aztec C86

Manx Aztec C86

Which has all kinds of fascinating information, such as the ability to cross compile from VAX BSD, or PDP-11 BSD, the Amiga, CP/M etc but they don’t actually give any information about versions.

There is, however an Aztec C museum, that hosts several versions.   And they do have the versions, along with the years to show that the C86 compiler that they had for 1985 would be 3.4b

Version 3.4b
Compiler Aztec C 8086 3.40a 7-3-86
(C) 1982,83,84,85 by Manx Software Systems, Inc.

And conveniently, they do have a download link for the comiler here: az8634b.zip

Now, since I’m on Windows 10 x64 I can’t easily run MS-DOS based compilers from 1985 at my native CLI, without a tool, and I chose Takeda Toshiya’s MSDOS.  I was able to ‘bind’ the azmake utility which then could call the needed compiler, assembler, and linker to build an executable without too much work.  I just created a command file, ‘build.cmd’ in the src directory, to setup the paths and needed variables to quickly compile Rogue from the command line.  And a quick attempt at playing it showed that although it does compile, it is unplayable!

Killed

Killed by the Copy Protection Mafia

Well isn’t that great.  There is a copy protection scheme.  But wait, we have source so can’t we just by pass it?  Yes we can!  In the file dos.asm there is some checks for the variables hit_mul & goodchk.  So I did the logical thing, which is before it checks them I just set them to good values.

; fake copy protection
mov hit_mul_, 1
mov goodchk_, 0D0DH

And the good news is that I would no longer get killed by the Mafia, but I couldn’t progress down any levels.  So in the file oprotec.asm, I saw there is some disk check routine called protect, that I went ahead and bypassed by having it immediately jump down to the ‘good’ label. Everything compiles but it still locks up going down a level.  So finally I check rogue.h and commend the #define PROTECT statement, and now it’ll run!

I don’t know if anyone would even care, but I added the PDF manual and all the zip files that I used to source this version.  You can download it here:

rogue-148_binary+source.7z

If you don’t want to run it under MS-DOS, or something like DOSBox, you can use msdos to run it.  The title screen is garbled as it doesn’t emulate CGA, but as the rest is just text mode, it’ll run just fine.

MS-DOS player can now embed executables

So what this means is that now you can make fully standalone Win32/Win64 executables out of CLI based MS-DOS applications.

D:\tcc>msdos\binary\i486_x64\msdos.exe tcc -Iinclude -Llib hi.c
Turbo C++ Version 3.00 Copyright (c) 1992 Borland International
hi.c:
Turbo Link Version 5.0 Copyright (c) 1992 Borland International

Available memory 4215648

D:\tcc>c:msdos\binary\i486_x64\msdos.exe hi
hi!

D:\tcc>c:msdos\binary\i486_x64\msdos.exe -c hi.exe
‘new_exec_file.exe’ is successfully created

D:\tcc>new_exec_file.exe
hi!

Isn’t that great?

I’ve had one issue with Turbo C++ 3.00 and that is the embedded executable will run out of memory while linking, but invoking it by calling msdos.exe let’s it run fine. If you compile and link separately it’ll run just fine.

As always you can find the project page here:

http://takeda-toshiya.my.coocan.jp/msdos/index.html

 

 

 

As requested, PCem v11 with networking

via SLiRP

via SLiRP

injecting networking was no more difficult than it was in version 10.  It’s only a few changes to pc.c, if you look at the USENETWORKING define you’ll see them.  The best notes are on the forum.

I haven’t changed or improved anything it still requires manual configuration.

Downloads are available on my site as pcem_v11_networking.7z.  You’ll have to defeat the password protection, as always.  I included the source, it ought to be trivial to rebuild.

*For anyone using an old version the ‘nvr’ directory is missing, so PC-em is unable to create new non volatile ram save files, meaning you always loose your BIOS settings.  Sorry I missed that one.

PCem v11 released

I haven’t had time to follow it, but great news!

PCem v11 released. Changes from v10.1 :

  • New machines added – Tandy 1000HX, Tandy 1000SL/2, Award 286 clone, IBM PS/1 model 2121
  • New graphics card – Hercules InColor
  • 3DFX recompiler – 2-4x speedup over previous emulation
  • Added Cyrix 6×86 emulation
  • Some optimisations to dynamic recompiler – typically around 10-15% improvement over v10, more when MMX used
  • Fixed broken 8088/8086 timing
  • Fixes to Mach64 and ViRGE 2D blitters
  • XT machines can now have less than 640kb RAM
  • Added IBM PS/1 audio card emulation
  • Added Adlib Gold surround module emulation
  • Fixes to PCjr/Tandy PSG emulation
  • GUS now in stereo
  • Numerous FDC changes – more drive types, FIFO emulation, better support of XDF images, better FDI support
  • CD-ROM changes – CD-ROM IDE channel now configurable, improved disc change handling, better volume control support
  • Now directly supports .ISO format for CD-ROM emulation
  • Fixed crash when using Direct3D output on Intel HD graphics
  • Various other fixes

Thanks to Battler, SA1988, leilei, Greatpsycho, John Elliott, RichardG867, ecksemmess and cooprocks123e for contributions towards this release.

Downloads are available for Windows & Linux.