Porting Sarien to OS/2 Presentation Manager

Posted on February 23, 2024 by neozeed

Originally with all the buildup of compilers & GCC ports to OS/2, I had a small goal of getting Sarien running on OS/2. I did have it running on both a 286 & 386 DOS Extender, so the code should work fine, right?

To recap, years ago I had done a QuakeWorld port to OS/2 using the full screen VIO mode, a legacy hangover from 16bit OS/2. It works GREAT on the released 2.00 GA version. I went through the motion of getting the thunking from 32bit mode to 16bit mode, to find out that it doesn’t exist in the betas!

No VIO for you! — No VIO access from 32bit

So that meant I was going to have to break down and do something with Presentation Manager.

So the first thing I needed was a program I could basically uplift into what I needed, and I found it through FastGPI.

While it was originally built with GCC, I had rebuilt it using Visual C++ 2003 for the math, and the Windows NT 1991 compiler for the front-end. As you can see it works just fine. While I’m not a graphical programmer by any stretch, the source did have some promise in that it creates a bitmap in memory, alters it a runtime, and blits (fast binary copy) it to the Display window. Just what I need!

  for (y = 0; y < NUM_MASSES_Y; y++)
  {
    for (x = 0; x < NUM_MASSES_X; x++)
    {
      disp_val = ((int) current[x][y] + 16);
      if (disp_val > 32) disp_val = 32;
      else if (disp_val < 0) disp_val = 0;
      Bitmap[y*NUM_MASSES_X+x] = RGBmap[disp_val];
    }
  }

It goes through the X/Y coordinate plane of the calculated values, and stores them as an RGB mapping into the bitmap. Seems simple enough right?

  DosRequestMutexSem(hmtxLock, SEM_INDEFINITE_WAIT);

  /* This is the key to the speed. Instead of doing a GPI call to set the
     color and a GPI call to set the pixel for EACH pixel, we get by
     with only two GPI calls. */
  GpiSetBitmapBits(hpsMemory, 0L, (LONG) (NUM_MASSES_Y-2), &Bitmap[0], pbmi);
  GpiBitBlt(hps, hpsMemory, 3L, aptl, ROP_SRCCOPY, BBO_AND);

  DosReleaseMutexSem(hmtxLock);

It then locks the screen memory, and then sets up the copy & uses the magical GpiBitBlt to copy it to the video memory, then releases the lock. This all looks like something I can totally use!

I then have it call the old ‘main’ procedure form Sarien as a thread, and have it source the image from the Sarien temporary screen buffer

disp_val = ((int) screen_buffer[y*NUM_MASSES_X+x] );

Which all looks simple enough!

And WOW it did something! I of course, have no keyboard, so can’t hit enter, and I screwed up the coordinates. I turned off the keyboard read, flipped the X/Y and was greeted with this!

Welcome to OS/2 where the memory is the total opposite of what you expect.

And it’s backwards. And upside down. But it clearly is rendering into FastGPI’s gray palette! I have to admit I was really shocked it was running! At this point there is no timer, so it runs at full speed (I’m using Qemu 0.80 which is very fast) and even if there was keyboard input it’d be totally unplayable in this reversed/reversed state.

The first thing to do is to flip the display. I tried messing with how the bitmap was stored, but it had no effect. Instead, I had to think about how to draw it backwards in RAM.

  {
    for (x = 0; x < NUM_MASSES_X; x++)
    {
      disp_val = ((int) screen_buffer[y*NUM_MASSES_X+x] );	//+ 16);
      if (disp_val > 32) disp_val = 32;
      else if (disp_val < 0) disp_val = 0;
      Bitmap[((NUM_MASSES_Y-y)*(NUM_MASSES_X))-(NUM_MASSES_X-x)] = RGBmap[disp_val];
    }
  }

Now comes the next fun part, colour.

I had made the decision that since I want to target as many of the OS/2 2.0 betas as possible they will be running at best in 16 colour mode, so I’ll stick to the CGA 4 colour modes. So the first thing I need is to find out what the RGB values CGA can display.

This handy image is from the The 8-Bit Guy’s video “CGA Graphics – Not as bad as you thought!” but here are the four possible sets:

And of course I got super lucky with finding this image:

So now I could just manually populate the OS/2 palette with the appropriate CGA mapping, just like how it worked in MS-DOS:

First define the colours:

#define CGA_00 0x000000
#define CGA_01 0x0000AA
#define CGA_02 0x00AA00
#define CGA_03 0x00AAAA
#define CGA_04 0xAA0000
#define CGA_05 0xAA00AA
#define CGA_06 0xAA5500
#define CGA_07 0xAAAAAA
#define CGA_08 0x555555
#define CGA_09 0x5555FF
#define CGA_10 0x55FF55
#define CGA_11 0x55FFFF
#define CGA_12 0xFF5555
#define CGA_13 0xFF55FF
#define CGA_14 0xFFFF55
#define CGA_15 0xFFFFFF

Then map the 16 colours onto the CGA 4 colours:

OS2palette[0]=CGA_00;
OS2palette[1]=CGA_11;
OS2palette[2]=CGA_11;
OS2palette[3]=CGA_11;
OS2palette[4]=CGA_13;
OS2palette[5]=CGA_13;
OS2palette[6]=CGA_13;
OS2palette[7]=CGA_15;
OS2palette[8]=CGA_00;
OS2palette[9]=CGA_11;
OS2palette[10]=CGA_11;
OS2palette[11]=CGA_11;
OS2palette[12]=CGA_13;
OS2palette[13]=CGA_13;
OS2palette[14]=CGA_13;
OS2palette[15]=CGA_15;

So now it’s looking right but there is no timer so on modern machines via emulation it runs at warp speed. And that’s where OS/2 shows its origins is that it’s timer ticks about every 32ms, so having a high resolution timer is basically out of the question. There may have been options later one, but those most definitively will not be an option for early betas. I thought I could do a simple thread that counts and sleeps, as hooking events and alarms again suffer from the 32ms tick resolution problem so maybe a sleeping thread is good enough.

static void Timer(){
for(;;){
	DosSleep(20);
	clock_ticks++;
	}
}

And it crashed. Turns out that I wasn’t doing the threads correctly and was blowing their stack. And somehow the linker definition file from FastGPI kept sneaking back in, lowering the stack as well.

Eventually I got it sorted out.

The next big challenge came of course from the keyboard. And I really struggled here as finding solid documentation on how to do this is not easy to come by. Both Bing/google want to suggest articles about OS/2 and why it failed (hint it’s the PS/2 model 60), but nothing much on actually being useful about it.

Eventually through a lot of trial and error, well a lot of errors I had worked uppon this:

    case WM_CHAR:
      if (SHORT1FROMMP(parm1) & KC_KEYUP)
        break;
pm_keypress=1;
      switch (SHORT1FROMMP(parm1))
      {
      	case VK_LEFT:
	key_from=KEY_LEFT;
	break;
	case VK_RIGHT:
	key_from=KEY_RIGHT;
	break;
	case VK_UP:
	key_from=KEY_UP;
	break;
	case VK_DOWN:
	key_from=KEY_DOWN;
	break;

	case KC_VIRTUALKEY:
	default:
	key_from=SHORT1FROMMP(parm2);
	break;
      }

I had cheated and just introduced 2 new variables, key_from, pm_keypress to signal a key had been pressed and which key it was. I had issues mapping certain keys so it was easier to just manually map the VK_ mapping from OS/2 into the KEY_ for Sarien. So it triggers only on single key down events, and handles only one at a time. So for fast typers this sucks, but I didn’t want to introduce more mutexes, more locking and queues or DIY circular buffers. I’m at the KISS stage still.

I’m not sure why it was dropping letters, I would hit ‘d’ all I wanted and it never showed up. I then recompiled the entire thing and with the arrow keys now mapped I could actually move!

And just like that, Roger Wilco now walks.

From there I added the savegame fixes I did for the 286/386 versions, along with trying to not paint every frame with a simple frame skip and…

Sarien for OS/2 running at 16Mhz

And it’s basically unplayable on my PS/2 model 80. Even with the 32bit XGA-2 video card.

I had to give it a shot under 86Box, to try the CGA/EGA versions:

It’s weird how the image distorts! Although the black and white mapping seems to work fine.

I should also point out that the CGA/EGA versions are running on OS/2 2.0 Beta 6.123, which currently is the oldest beta I can get a-hold of. So at the least I did reach my goal of having a 32bit version for early OS/2.

I would imagine it running okay on any type of Pentium system, however. So, what would the advantage of this, vs just running the original game in a dos box? Well, it is a native 32bit app. This is the future that was being sold to us back in 1990. I’m sure the native assembly that Sierra used was far more efficient and would have made more sense to just be a full screened 16bit VIO application.

So how long did it take to get from there to here? Shockingly not that much time, — 02/20/2024 6:02 PM for running FastGPI, to — 02/20/2024 10:56 PM For the first image being displayed in Presentation Manager, and finally — 02/21/2024 10:39 PM to when I was first able to walk. As you can see, that is NOT a lot of time. Granted I have a substantially faster machine today than what I’d have in 1990 (I didn’t get a 286 until late 91? early 92?), compiling Sarien on the PS/2 takes 30-40 minutes and that’s with the ultra-fast BlueSCSI, compared to even using MS-DOS Player I can get a build in about a minute without even compiling in parallel.

I’ve put the source over on github: neozeed/sarienPM: Sarien for OS/2 (github.com)

I think the best way to distribute this is in object form, so I’ve created both a zip & disk image containing the source & objects, so you can link natively on your machine, just copy the contents of the floppy somewhere and just run ‘build.cmd’ which will invoke the system linker, LINK386 to do it’s job. I have put both the libc & os2386 libraries on the disk so it should just work about everywhere. Or it did for me!

So that’s my quick story over the last few days working on & off on this simple port of Sarien to OS/2 Presentation Manager. As always, I want to give thanks to my Patrons!

Thunking for fun & a lack of profit

Posted on February 19, 2024 by neozeed

So, with a renewed interest in OS/2 betas, I’d been getting stuff into the direction of doing some full screen video. I’d copied and pasted stuff before and gotten QuakeWorld running, and I was looking forward to this challenge. The whole thing hinges on the VIO calls in OS/2 like VioScrLock, VioGetPhysBuf, VioScrUnLock etc etc. I found a nifty sample program Q59837 which shows how to map into the MDA card’s text RAM and clear it.

It’s a 16bit program, but first I got it to run on EMX with just a few minor changes, like removing far pointers. Great. But I wanted to build it with my cl386 experiments and that went off the edge. First there are some very slick macros, and Microsoft C just can’t deal with them. Fine I’ll use GCC. Then I had to get emximpl working so I could build an import library for VIO calls. I exported the assembly from GCC, and mangled it enough to where I could link it with the old Microsoft linker, and things were looking good! I could clear the video buffer on OS/2 2.00 GA.

Now why was it working? What is a THUNK? Well it turns out in the early OS/2 2.0 development, they were going to cut loose all the funky text mode video, keyboard & mouse support and go all in on the graphical Presentation Manager.

Instead, they were going to leave that old stuff in the past, and 16bit only for keeping some backwards compatibility. And the only way a 32bit program can use those old 16bit API’s for video/keyboard/mouse (etc) is to call from 32bit mode into 16bit mode, then copy that data out of 16bit mode into 32bit mode. This round trip is called thunking, and well this sets up where it all goes wrong.

Then I tried one of the earlier PM looking betas 6.605, and quickly it crashed!

Well this was weird. Obviously, I wanted to display help

This ended up being a long winded way of saying that there is missing calls from DOSCALL1.DLL. Looking through all the EMX thunking code, I came to the low level assembly, that actually implemented the thunking.

EXTRN   DosFlatToSel:PROC
EXTRN   DosSelToFlat:PROC

After looking at the doscalls import library, sure enough they just don’t exist. I did the most unspeakable thing and looked at the online help for guidance:

So it turns out that in the early beta phase, there was no support for any of the 16bit IO from 32bit mode. There was no thunking at all. You were actually expected to use Presentation Manager.

YUCK

For anyone crazy enough to care, I uploaded this onto github Q59837-mono

It did work on the GA however so I guess I’m still on track there.

New version of the MS-DOS Player

Posted on February 16, 2024 by neozeed

And it’s a big update on takeda-toshiya.my.coocan.jp!

From cracyc and roytam’s fork, I have incorporated a correction.
These include file access using FCB and fixing exceptions around the FPU of the MAME version of the i386 core.
In addition, the DAA/DAS/AAA/AAS/AAM/AAD instructions of the MAME version of the i386 core have been modified based on the DOSBox implementation.
With the Pentium 4 version, the testi386.exe is the same as the real thing.

The I386 core of NP21/W has been updated to equivalent to ver0.86 rev92 beta2.
Also, fixed the build time warning so that it does not appear.

Improved checking when accessing environment variables, referencing incorrect environment tables.
Recent builds have resolved an issue that prevented testi386.exe from working.
Improved the efficiency of memory access handling.
Basic memory, extended memory, and reserved areas (such as VRAM) can be accessed in that order with a small number of conditional branches.
The processing speed may be slightly increased.
MS-DOS Player for Win32-x64 Mystery WIP Page (coocan.jp)

Takeda has been very busy indeed!

I don’t want to complain or anything, I’m very thankful for the tool. It’s just so amazing.

but on my Windows 10 install I have so many issues relating to the font/screen changes, that I just made an incredibly lame fork, and commented out those changes, msdos-player_. I stumbled onto the issue by accident by redirecting stdout/stderr, and compiling stuff ran fine, but as soon as it started to mess with the console it’d just crash.

OK so you can run some basic stuff like compilers, but what about ORACLE?!

I did have to subst a drive, as I didn’t feel like dealing with paths and stuff, I had extracted it from oracle-51c-qemu, and modified the autoexec & config.ora and yeah, using the 386 or better emulation it just worked! Sadly there is no network part of the install, although there is a SDK so I guess there ought to be a way to proxy queries.

OK, but how about something even more complicated?! NETWARE!

Obviously there is no ISA MFM/IDE disks in MS-DOS Player, but the server loaded!

Needless to say this update is just GREAT!

I’d say try the one hosted on Takeda’s site! It’ll almost certainly work fine for you. Otherwise I guess try mine. Or not.

Totally unfair comparison of Microsoft C

Posted on February 5, 2024 by neozeed

Because I hate myself, I tried to get the Microsoft OS/2 Beta 2 SDK’s C compiler building simple stuff for text mode NT. Because, why not?!

Since the object files won’t link, we have to go in with assembly. And that of course doesn’t directly assemble, but it just needs a little hand holding:

Microsoft (R) Program Maintenance Utility   Version 1.40
Copyright (c) Microsoft Corp 1988-93. All rights reserved.

        cl386 /Ih /Ox /Zi /c /Fadhyrst.a dhyrst.c
Microsoft (R) Microsoft 386 C Compiler. Version 1.00.075
Copyright (c) Microsoft Corp 1984-1989. All rights reserved.

dhyrst.c
        wsl sed -e 's/FLAT://g' dhyrst.a > dhyrst.a1
        wsl sed -e "s/DQ\t[0-9a-f]*r/&XMMMMMMX/g" dhyrst.a1  | wsl sed -e "s/rXMMMMMMX/H/g" > dhyrst.asm
        ml /c dhyrst.asm
Microsoft (R) Macro Assembler Version 6.11
Copyright (C) Microsoft Corp 1981-1993.  All rights reserved.

 Assembling: dhyrst.asm
        del dhyrst.a dhyrst.a1 dhyrst.asm
        link -debug:full -out:dhyrst.exe dhyrst.obj libc.lib
Microsoft (R) 32-Bit Executable Linker Version 1.00
Copyright (C) Microsoft Corp 1992-93. All rights reserved.

I use sed to remove the FLAT: directives which makes everything upset. Also there is some weird confusion on how to pad float constants and encode them.

CONST   SEGMENT  DWORD USE32 PUBLIC 'CONST'
$T20001         DQ      0040f51800r    ;        86400.00000000000
CONST      ENDS

MASM 6.11 is very update with this. I just padded it with more zeros, but it just hung. I suspect DQ isn’t the right size? I’m not 386 MASM junkie. I’m at least getting the assembler to shut-up but it doesn’t work right. I’ll have to look more into it.

Xenix 386 also includes an earlier version of Microsoft C / 386, and it formats the float like this:

CONST   SEGMENT  DWORD USE32 PUBLIC 'CONST'
$T20000         DQ      0040f51800H    ;        86400.00000000000
CONST      ENDS

So I had thought maybe if I replace the ‘r’ with a ‘H’ that might be enough? The only annoying thing about the Xenix compiler is that it was K&R so I spent a few minutes porting phoon to K&R, dumped the assembly and came up with this sed string to find the pattern, mark it, and replace it (Im not that good at this stuff)

wsl sed -e "s/DQ\t[0-9a-f]r/&XMMMMMMX/g" $.a1 \
| wsl sed -e "s/rXMMMMMMX/H/g" > $*.asm

While it compiles with no issues, and runs, it just hangs. I tried the transplanted Xenix assembly and it just hangs as well. Clearly there is something to do with how to use floats.

I then looked at whetstone, and after building it noticed this is the output compiling with Visual C++ 8.0

      0       0       0  1.0000e+000 -1.0000e+000 -1.0000e+000 -1.0000e+000
  12000   14000   12000 -1.3190e-001 -1.8218e-001 -4.3145e-001 -4.8173e-001
  14000   12000   12000  2.2103e-002 -2.7271e-002 -3.7914e-002 -8.7290e-002
 345000       1       1  1.0000e+000 -1.0000e+000 -1.0000e+000 -1.0000e+000
 210000       1       2  6.0000e+000  6.0000e+000 -3.7914e-002 -8.7290e-002
  32000       1       2  5.0000e-001  5.0000e-001  5.0000e-001  5.0000e-001
 899000       1       2  1.0000e+000  1.0000e+000  9.9994e-001  9.9994e-001
 616000       1       2  3.0000e+000  2.0000e+000  3.0000e+000 -8.7290e-002
      0       2       3  1.0000e+000 -1.0000e+000 -1.0000e+000 -1.0000e+000
  93000       2       3  7.5000e-001  7.5000e-001  7.5000e-001  7.5000e-001

However this is the output from C/386:

      0       0       0  5.2998e-315  1.5910e-314  1.5910e-314  1.5910e-314
  12000   14000   12000  0.0000e+000  0.0000e+000  0.0000e+000  0.0000e+000
  14000   12000   12000  0.0000e+000  0.0000e+000  0.0000e+000  0.0000e+000
 345000       1       1  5.2998e-315  1.5910e-314  1.5910e-314  1.5910e-314
 210000       1       2  6.0000e+000  6.0000e+000  0.0000e+000  0.0000e+000
  32000       1       2  5.2946e-315  5.2946e-315  5.2946e-315  5.2946e-315
 899000       1       2  5.2998e-315  5.2998e-315  0.0000e+000  0.0000e+000
 616000       1       2  5.3076e-315  5.3050e-315  5.3076e-315  0.0000e+000
      0       2       3  5.2998e-315  1.5910e-314  1.5910e-314  1.5910e-314
  93000       2       3  5.2972e-315  5.2972e-315  5.2972e-315  5.2972e-315

Great they look nothing alike. So something it totally broken. I guess the real question is, does it even work on OS/2?

Since I should post the NMAKE Makefile so I can remember how it can do custom steps so I can edit the intermediary files. Isn’t C fun?!

INC = /Ih
OPT = /Ox
DEBUG = /Zi
CC = cl386

OBJ = dhyrst.obj

.c.obj:
	$(CC) $(INC) $(OPT) $(DEBUG) /c /Fa$*.a $*.c
	wsl sed -e 's/FLAT://g' $*.a > $*.a1
	wsl sed -e "s/DQ\t[0-9a-f]*r/&XMMMMMMX/g" $*.a1 \
	| wsl sed -e "s/rXMMMMMMX/H/g" > $*.asm
	ml /c $*.asm
	del $*.a $*.a1 $*.asm

dhyrst.exe: $(OBJ)
        link -debug:full -out:dhyrst.exe $(OBJ) libc.lib

clean:
        del $(OBJ)
        del dhyrst.exe
        del *.asm *.a *.a1

As you can see, I’m using /Ox or maximum speed! So how does it compare?

Dhrystone(1.1) time for 180000000 passes = 20
This machine benchmarks at 9000000 dhrystones/second

And for the heck of it, how does Visual C++ 1.0’s performance compare?

Dhrystone(1.1) time for 180000000 passes = 7
This machine benchmarks at 25714285 dhrystones/second

That’s right the 1989 compiler is 35% the speed of the 1993 compiler. wow. Also it turns out that MASM 6.11 actually can (mostly) assemble the output of this ancient compiler. It’s nice when something kind of work. I can also add that the Infocom ’87 interpreter works as well.

YAY!

Apparently talking about DOS Extenders is too hot for Twitter: AKA Phar Lap 386

Posted on February 1, 2024 by neozeed

I had a small twitter account, and I tried not to get dragged into anything that would just be basically wasting my time. Just stay focused and on topic. FINE. I just wanted to see if anyone ever saw it, if it was even worth the effort of doing WIP’s as I didn’t want to make it super annoying.

I logged on to post a fun update that I’d finally gotten a Phar Lap 386 version 4.1 app to do something halfway useful, the sairen AGI interpreter up and running in the most basic sense.

Talking about DOS Extenders is spammy and manipulation!

I don’t get what triggered it, but oh well there was a ‘have a review’ and yeah that was fine. Great. So I’m unlocked so I go ahead and post with the forbidden topic, as I’m clearly dumb, and forgetting that Twitter is for hate mobs & posting pictures of food, and cat pictures.

The Sairen AGI interpreter built with Watcom 386/7.0 & Phar Lap 386 4.1

So yes, that was a line too far, and now that’s it.

Now some of you may think, if you buy ‘the plan’ you’ll no doubt be exempt from the heavy hands of Twitter

But I already was and had been for a while.

So that’s the end of that. I guess it’s all too confusing for a boomer like me.

So needless to say I cancelled Twitter as well. Kind of sneaky they didn’t auto-cancel taking money.

So yeah, with that out of the way, let’s continue into DOS Extender land. I added just enough 386 magic, onto github: neozeed/sarien286. Yes I see now it really was a poorly named repo. Such is life.

There is 3 main things for porting old programs where they take care of all the logic, it’s going to be File I/O, Screen I/O, and timers. Luckily this time it was easier than I recalled.

Over on usenet (google groups link) Chris Giese shared this great summary on direct memory access from various methods:

/* 32-bit Watcom C with CauseWay DOS extender */
int main(void) {
char *screen = (char *)0xA0000;
initMode13();
*screen = 1;
return 0;
}

/* 32-bit Watcom C with DOS/4GW extender
(*** This code is untested ***) */
int main(void) {
char *screen = (char *)0xA0000;
initMode13();
*screen = 1;
return 0;
}

/* 32-bit Watcom C with PharLap DOS extender
(*** This code is untested ***) */
#include <dos.h> /* MK_FP() */
#define PHARLAP_CONVMEM_SEL 0x34
int main(void) {
char far *screen = (char far *)MK_FP(PHARLAP_CONVMEM_SEL, 0xA0000);
initMode13();
*screen = 1;
return 0;
}

/* 16-bit Watcom C (real mode) */
#include <dos.h> /* MK_FP() */
int main(void) {
char far *screen = (char far *)MK_FP(0xA000, 0);
initMode13();
*screen = 1;
return 0;
}

It is missing the Phar Lap 286 method:

/* Get PM pointer to text screen */
  DosMapRealSeg(0xb800,4000,&rseg);
  textptr=MAKEP(rseg,0);

But it’s very useful to have around as documentation is scarce.

Which brings me to this (again?)

Years ago, I had managed to score a documentation set, and a CD-ROM with a burnt installed copy of the extender. I didn’t know if it was complete, but of course these things are so incredibly rare I jumped on the chance to get it!

Unfortunately, I didn’t feel right breaking the books apart, and scanning them, then add in some bad life choices on my part, and I ended up losing the books. Fast forward *years* later and Foone uploaded a document set on archive.org. GREAT! As far as I can tell the only difference in what I had is that I’ve got a different serial number. Thankfully I was smart enough to at lest email myself a copy of the CD-ROM contents! And this whole thing did inspire me to gut and upload the Phar Lap TNT 6.0 that I had also managed to acquire.

Although unlocking the video RAM wasn’t too bad, once I knew what to do, the other thing is to hook the clock for a timer. ISR’s are always hell, but at least this is a very simple one:

void (__interrupt __far *prev_int_irq0)();
void __interrupt __far timer_rtn();
int clock_ticks;
#define IRQ0 0x08
void main()
  {
   clock_ticks=0;
   //get prior IRQ routine
   prev_int_irq0 = _dos_getvect( IRQ0 );
   //hook in new protected mode ISR
   _dos_setvect( IRQ0, timer_rtn );

/* do something interesting */
   //restore prior ISR
   _dos_setvect( IRQ0, prev_int_irq0 );
  }

void __interrupt __far timer_rtn()
  {
    ++clock_ticks;
    //call prior ISR
    _chain_intr( prev_int_irq0 );
  }

The methodology is almost always the same, as always, it’s the particular incantation.

So yeah, it’s super simple, but the 8086/80286 calling down to DOS/BIOS from protected mode via the int86 just had to be changed to int386, and some of the register structs being redefined. I’m not sure why but the video/isr code compiled with version 7 of Watcom, but crashes. I think its more drift in the headers, as the findfirst/findnext/assert calls are lacking from Watcom 7, so I just cheated and linked with Watcom 10. This led to another strange thing where the stdio _iob structure was undefined. In Watcom 10 it became __iob, so I just updated the 7 headers, and that actually worked. I had to include some of the findfirst/next structures into the fileglob.c file but it now builds and links fine.

Another thing to do differently when using Watcom 7, is that it doesn’t include a linker, rather you need to use 386LINK. Generating the response file, as there is so many objects didn’t turn out too hard once I realized that by default everything is treated as an object.

Another fun thing is that you can tell the linker to use the program ‘stub386.exe’ so that it will run ‘run386’ on it’s own, making your program feel more standalone. From the documentation:

386 | LINK has the ability to bind the stub loader program, STUB386.EXE, to 
the front of an application .EXP file. The resulting .EXE file can be run by 
typing the file name, just like a real mode DOS program. The stub loader 
program searches the execution PATH for RUN386.EXE (the 

386 | DOS-Extender executable) and loads it; 386 | DOS-Extender then loads 
the application .EXP file following the stub loader in the bound .EXE file. 


To autobind STUB386.EXE to an application .EXP file and create a bound 
executable, specify STUB386.EXE as one of the input object files on the 
command line.

So that means I can just use the following as my linker response file.

agi.obj,agi_v2.obj,agi_v3.obj,checks.obj,cli.obj,console.obj,cycle.obj
daudio.obj,fileglob.obj,font.obj,getopt.obj,getopt1.obj,global.obj
graphics.obj,id.obj,inv.obj,keyboard.obj,logic.obj,lzw.obj,main.obj
menu.obj,motion.obj,pharcga3.obj,objects.obj,op_cmd.obj,op_dbg.obj
op_test.obj,patches.obj,path.obj,picture.obj,rand.obj,savegame.obj
silent.obj,sound.obj,sprite.obj,text.obj,view.obj
words.obj,picview.obj stub386.exe
-exe 386.exe
-lib \wat10\lib386\dos\clib3s.lib \wat10\lib386\math387s.lib
-lib \wat10\lib386\dos\emu387.lib

It really was that simple. I have to say it’s almost shocking how well this went.

So, this brings me back, full circle to where it started, me getting banned for posting this:

I thought it was exciting!

For anyone who feels like trying it, I prepped a 5 1/4″ floppy disk image.

One interesting observation is that the 386 extender is actually smaller than the 286 one. And being able to compile with full optimisations it is significantly faster.

16bit on the left, 32bit on the right.

I ran both the prior 16bit protected mode version (on the left), and 32bit version (on the right), on the same IBM PS/2 80386DX 16Mhz machine. You can see how the 32bit version is significantly faster!.

I really should profile the code, and have it load all the resources into RAM, it does seem to be loading and unloading stuff, which considering were in protected mode, we should use all ram, or push the VMM386 subsystem to page, and not do direct file swapping, like it’s the 1970s.

The Rise of Unix. The Seeds of its Fall. / A Chronicle of the Unix Wars

Posted on January 5, 2024 by neozeed

It’s not mine, rather it’s Asianometry‘s. It’s a nice overview of the rise of Unix. I’d recommend checking it out, it’s pretty good. And of course, as I’m referenced!

The Rise of Unix. The Seeds of its Fall.

And part 2: A Chronicle of the Unix Wars

A Chronicle of the Unix Wars (youtube.com)

Years ago I had tried to make these old OS’s accessible to the masses with a simple windows installer where you could click & run these ancient artifacts. Say 4.2BSD.

Download BSD4.2-install-0.3.exe (Ancient UNIX/BSD emulation on Windows) (sourceforge.net)

Installing should be pretty straight forward, I just put the license as a click through and accept defaults.

Starting BSD via ‘RUN BSD42’ and the emulator will fire up, and being up a console program (Tera Term) giving you the console access. Windows will probably warn you that it requested network access. This will allow you to access the VAX over the network, including being able to telnet into the VAX via ‘Attach a PTY’ which will spawn another Tera Term, prompting you to login.

You can login as root, there is no password, and now you are up and running your virtual VAX with 4.2BSD!

I converted many of the old documents into PDF’s so you may want to start with the Beginners guide to Unix. I thought this was a great way to bring a complex system to the masses, but I’m not sure if I succeded.

As it sits now, since 2007 it’s had 776 downloads. I’d never really gotten any feedback so I’d hoped it got at least a few people launched into the bewildering world of ancient Unix. Of course I tried to make many more packages but I’d been unsure if any of them went anywhere. It’s why I found these videos so interesting as at least the image artifacts got used for something!

But in the off hand, maybe this can encourage some Unix curious into a larger world.

Other downloads in the same scope are:

Enjoy!

Win32Emu / DIY WOW

Posted on January 4, 2024 by neozeed

This is a guest post by CaptainWillStarblazer

When the AXP64 build tools for Windows 2000 were discovered back in May 2023, there was a crucial problem. Not only was it difficult to test the compiled applications since you needed an exotic and rare DEC Alpha machine running a leaked version of Windows, it was also difficult to even compile the programs, since you needed the same DEC Alpha machine to run the compiler; there was no cross-compiler.

As a result, I began writing a program conceptually similar to WOW64 on Itanium (or WX86, or FX-32), only in reverse, to allow RISC Win32 programs to run on x86.

The PE/COFF file format is surprisingly simple once you get the hang of it, so loading a basic Win32 EXE that I assembled with NASM was pretty simple – just map the appropriate sections to the appropriate areas, fix up import tables, and start executing.

To start, I wrote a basic 386 emulator core. To complement it, I wrote my own set of Windows NT system DLLs (USER32, KERNEL32, GDI32) that execute inside of the emulator and then use an interrupt to signal a system call which is trapped by the emulator and thunked up to execute the API call on the host.

For example, up above, you can see that the emulated app calls MessageBoxA inside of the emulated USER32, which puts 0 in EAX (the API call number for MessageBoxA) and then does the syscall interrupt (int 0x80 in my case), which causes the emulator to grab the arguments off of the stack and call MessageBoxA.

To ease communication between the host’s Win32 environment and the emulated Win32 environment, I ran the emulated CPU inside of the host’s memory space, which means that to run applications written for a 32-bit version of Windows NT, you need a 32-bit version of win32emu (or a 64-bit version with /LARGEADDRESSAWARE:NO passed to the linker) to avoid pointer truncation issues, to prevent Windows from mapping memory addresses inaccessible by the emulated CPU.

To get “real” apps working, a lot of single-stepping through the CRT was required, but eventually I did get Reversi – one of the basic Win32 SDK samples – to work, albeit with some bugs at first. Calling a window procedure essentially requires a thunk in reverse, so I inserted a thunk window procedure on the host side that calls the emulated window procedure and returns the result.

After this, I got to work on getting more complicated applications to work. Several failed due to lack of floating-point support, some failed due to unsupported DLLs, but I was able to get FreeCell and WinMine to work (with some bugs) after adding SHELL32. I was able to run the real SHELL32.DLL from Windows NT 3.51 under this environment.

One might wonder why I put all this work into running x86 programs on x86, but the reason is that there’s the most information about, and I’m most proficient with, Windows on the 386. Not only does Windows on other CPUs use other CPUs, but also there’s different calling conventions and a lot of other stuff I didn’t want to mess with at first. But this was at least a proof-of-concept to build a framework where I could swap the CPU core for an emulator for MIPS or PPC or Alpha or whatever I wanted and get stuff running.

Astute readers might be wondering why I didn’t take the approach taken by WOW64. For those who don’t know, most system DLLs on WOW64 are the same as those in 32-bit Windows, the only ones that are different are ones with system call stubs that call down to the kernel (NTDLL, GDI32, and USER32, the first of which calls to NTOSKRNL and the latter two calling to WIN32K.SYS). WOW64 instead calls a function with a system call dispatch number, which does essentially the same thing. The reason for this is that the system call numbers are undocumented and change between versions of Windows. WOW64, being an integrated component of Windows, can stay up to date. If I took this approach, I’d either have to stay locked to one emulated set of DLLs (i.e. from NT 4.0) and use their system call numbers on the emulated side, or write my own emulated DLLs and stick to a fixed set of numbers, but either way I’d somehow have to map them to whatever syscall numbers are being used on the host.

As I went on, I should probably also mention that what I said earlier about loading Win32 apps being easy was wrong. Loading a PE image is pretty straightforward, but once you get into populating the TEB and PEB (many of whose fields are undocumented), it quickly gets gnarly, and my PEB emulation is incomplete.

Adding MIPS support wasn’t too much of a hassle, since the MIPS ISA (ignoring delay slots, which gave me no shortage of trouble) is pretty clean and writing an emulator wasn’t difficult. The VirtuallyFun Discord pointed me to Embedded Visual C++ 4.0, which was invaluable to me during development, since it included a MIPS assembler and disassembler, which I haven’t seen elsewhere. After writing a set of MIPS thunk DLLs and doing some more debugging, I finally got Reversi working.

There’s still some DLL relocation/rebasing issues, but Reversi is finally working in this homebrewed WOW!

I’d encourage someone to write a CPU module for the DEC Alpha AXP (or even PowerPC if anyone for some reason wants that). The API isn’t too complicated, and the i386 emulator is available for reference to see how the CPU emulator interfaces with the Win32 thunking side. An Alpha backend for the thunk compiler can definitely be written without too much trouble. Obviously, the AXP presents the challenge that fewer people are familiar with its instruction set than MIPS or 386, but this approach does free one from having to emulate all of the intricate hardware connections in actual Alpha applications while still running applications designed for it, and I’ve heard the Alpha is actually quite nice and clean. MAME’s Digital Alpha core could be a good place to start, but it’ll need some adaptation to work in this codebase. Remember that while being a 64-bit CPU with 64-bit registers and operations, the Alpha still runs Windows with 32-bit pointers, so it should run in a 32-bit address space (i.e. pass /LARGEADDRESSAWARE:NO to the linker).

Theoretically, recompiling the application to support the full address space should enable emulation of AXP64 applications, since the Alpha’s 64-bit pointers will allow it to address the host’s 64-bit address space, but I’m not sure if my emulator is totally 64-bit clean, or if the AXP64’s calling convention is materially different from that on the AXP32 in such a way that would require substantial changes. In either case, most of the code should still be transferable.

I also want to get more “useful” applications running, like development tools (i.e. the MSVC command line utilities – CL, MAKE, LINK, etc.) and CMD. Most of that probably involves implementing more thunks and potentially fixing CPU bugs.

This project is obviously still in a quite early stage, but I’m hoping to see it grow and become something useful for those in the hobby.

For those who want to play along at home, you can download the binary snapshot here: w32emu.zip

A more complete version of the writeup is available here: https://bhty.github.io/og/win32emu_VirtuallyFun_Post.htm and you can find the project here https://github.com/BHTY/Win32Emu/.

SINIX-Z 5.42 on 86Box

Posted on January 1, 2024 by tenox

(this is a guest post by Antoni Sawicki aka Tenox)

As part of nerdy new year celebrations I got Fujitsu Siemens Nixdorf Informationssysteme SINIX-Z running on 86Box. This was possible thanks to Plamen and Vlad!.

Update: got X11 going:

Also C compiler:

Also networking… apparently you have to manually connect it.

Here is a pre-installed image and 86Box configuration file. Now includes bash, gcc and gzip. These are also downloadable here.

BSD on Windows: Things I wish I knew existed

Posted on December 8, 2023 by neozeed

It’s 1995 and I’ve been nearly two years in the professional workspace. OS/2 is the dominant workstation product, Netware servers rule the world, and the year of the Linux desktop is going to happen any moment now. If you weren’t running OS/2, you were probably running Windows 3.1, only very few people were using that Linux thing. What would have been the prefect OS at the time would have been NT with a competent POSIX subsystem, but since we were denied that, enter Hiroshi Oota with BSD on Windows.

It was a late night browsing yahoo auctions Japan as one does, laughing at the absurd Famicom/Super Famicom games, and I went ahead and looked for BSD CD-ROMS, where I first came across BSD on Windows. And then I’d forgotten about it and went to work on some Darwin projects.

Fast forward 3 weeks, and vic485 had bought it, had it shipped, and uploaded on archive.org. So a big super thanks to vic485 for making this all possible!

So what is it? It’s not quite BSD, its a bunch of 16bit DLL’s that broke the kernel down into subsystems, that each rely on winmem32.dll to give access to flat/32bit address space. BSD on Windows (BOW) being a hybrid 16/32bit app is originally for Windows 3.1, with the later 1.5 update for Windows 95, which includes support for long filenames. I’m not sure if it’ll run on Windows NT or OS/2, as I don’t think

So what do you get?

The key media contents are the install floppy and the CD-ROM. Yes the setup program IS only on the floppy. Hope you get that disk image. I’m unsure what the manual is like, other than of course it is in Japanese.

It’s very much a single user mode BSD like environment complete with vi/gcc/csh/perl just to name a few. I’ve been able to test job control, and building some simple programs like Hack 1.03. I found a few issues however.

I haven’t tested enough with FreeBSD 1/2 but I can verify that from my ‘Ancient Linux on Windows‘ packages, the object format is the same, which is that early era when everything was a.out, although all different the reliance on GNU GAS & LD did make the object format the same. And it was nice to compile a hello world from my Linux cross compiler, link it on BOW, and get a running executable.

The memory is weird, in that you can add hundreds of megabytes to Windows and BOW will always run exhausted. In the bow.ini file you can set the heap for each program, and I found out from some silly trial and error that the maximum heap you can effectively give is 13 megabytes. It seems that winmem32 has a single chunk of memory where all processes run out of, hence the sub 16mb ram zone. Maybe there is a way to allocate it, but I’m unsure, maybe it’s in the book. CC1 was frequently having issues, so setting it’s heap to 13M sure helped, the linker ‘ld’ of course was running out of memory as well so setting it to 8M got me linking.

Filenames, especially on Windows 3.1 are a huge problem. All the LFN TSR’s I tried to load just resulted in a full crash. I had to point the linker to the CD-ROM live filesystem, which maybe would be tedious on a real machine, but under emulation it’s fine.

BOW does NOT like Qemu. At all. It won’t under otvdm either. I suspect NT is a no go but I haven’t tried. Oddly enough it’s not a timing issue, as it does run under VMware. There is an advantage to running it under Windows 95, is that it supports long filenames. 86Box works as well, I even was using the Pentium II Xeon at 400Mhz and that ran fine.

Probably the most annoying and silly thing is that the GCC C compiler doesn’t have C++ style comments turned on. Not being able to use ‘//’ is quite annoying.

Hack ran fine on my 386, which was a pleasant surprise!. It was really cool to have Word+Excel and Hack running at the same time.

Had I known about this, it would have been an incredible bridge product. Not to mention cross compiling to even Win32, or Linux. Not to mention at the time being able to run BSD with no real pain, just install and go

There is generic TCP/IP Winsock support in BOW 1.5 as it simply calls winsock. This also includes the ability to run daemons, however limitations in BOW are quickly exposed, such as missing setuid/setgid sno there is no ability to impersonate lower privileged users. MMAP stuff also doesn’t seem to work, although I was able to build a super simple port of Apache 1.3.1 to BSD on Windows (BOW).

While BOW may appear to be very BSD like, there is a lack of a the mmap Apache needs, along with user mapping & impersonation. I ended up using the EMX – OS/2 system code, since it’s very POSIX like without relying on the Unix like OS actually working.

I’ve been able to serve pages to myself, however BOW crashing out many emulators and hypervisors kind of stops me from putting it on the internet. BOW enthusiasts can download it from archive.org

Today, there is really no point to BOW, it’s an interesting oddity, but back in the day, for a jr network administrator being able to run the Unix version of the snmp tools, even if it’s only client side would have been great. If tftpd could be built to run this would have been beyond amazing, as you not only get BSD, but full Windows apps at the same time, much like MachTen.

It’s a shame I never knew this was a thing, I certainly would have been evangelizing BOW! Who knows what other treasures are in the parallel societies of Japan/Asia/Europe?

**UPDATE

Ive been able to cross compile from Windows to BOW using an old 386BSD 0.1 cross tool chain. You can read about it here: Cross compiling to BSD on Windows (BOW) from Win32

Don’t waste money on a math coprocessor they said;

Posted on November 12, 2023 by neozeed

You don’t need it they said!

Well it’s been no secret, but OS/2 6.123 on my PS/2 model 80, is insanely unstable running simple MS-DOS based games (large EXE’s)

And almost always I’d get this fun error:

SYS0037: The system cannot write to the write-protected c: drive.

Followed by a crash trying to execute code at the top of the memory MAP (ABIOS?)

Then ending the program will just crash OS/2. Very annoying!

My goto test of v86 mode environments is an old game that I enjoyed as a kid, 1988’s BattleTech the Crescent Hawks inception.

It’s a great game, that runs on many 8-bit/16-bit systems of the era, and is surprisingly a very well behaved MS-DOS game. I mean if Windows/386 VGA machines can run it in a window using the CGA version, surely a super early OS/2 2.0 beta (6.123) can run it, right? However I found 6.123 to be incredibly unstable, and sadly not up to the task.

I tried to launch BattleTech over and over and had zero success. I couldn’t figure out why it was struggling on my model 80 board, where it runs just great on 86Box. What is going on?

One thing I had stumbled upon was that if I launched an ancient Infocom game in a DOS box, and then launched BattleTech it had a much higher chance of running. But this did not always equate to it working. How is launching an old COM file from the early 80’s excise the ‘devil’ of some 1988 EXE from running?

I wasn’t sure but I had this weird suspicion that it was that my system was lacking a math coprocessor. When I had the model 60 286 board in the PS/2 case I did spring for an 80287, and one thing I found is that OS/2 1.0 & 1.21 ran great. As a matter of fact I think it ran better than when I used to have a 386sx-16 and then later a 486SX-20. Now it’s been closer to 30 years, so I could have an absolutely false memory of all this, but I wasn’t sure I was onto something. So while shopping around a subscriber offered me a math coprocessor as they seem to be insanely expensive in the UK. I have no idea why the 80287 was so cheap, and no idea how to make any kind of adapter, but pJok was able to score one for super cheap in his homeland and send it to the barren wastelands of Scotland. As I was wrapping up the SSD G5 fun, the coprocessor arrived, and it was time to install it!

Note the purple 80386! It’s what we might call foreshadowing

The PS/2 8580 motherboard is really oddly designed with chip orientation going in every which other direction, and the 80387 socket isn’t keyed by pin, so it’s vital to see the notches on the silkscreen. Otherwise I just used compressed air to blow out the socket, and run the reference disk to add the processor.

The processor was instantly picked up, although I had the crashing issue with the BocaRAM/2 memory card again, which meant I had to remove the RAM card, re-configure with the math coprocessor, then add the RAM card, reconfigure, then run the util to patch the CMOS so it’d boot up. I really dislike this RAM card, but 32bit cards cost far more than this entire endeavor cost so I’m pretty much stuck with it.

Now let’s compare the Landmark scores between the 286/287 and the 386/ITT387

Landmark System Speed Test with the PS/2 model 60 80286/80287

And now the 386:

Landmark System Speed Test with the PS/2 model 80 80386/ITT 80387

The ITT processor is significantly faster than the old 80287, which is pretty amazing. The system bus is running at 16Mhz, although being 32bit vs 16bit yielding a nearly 2x in performance, although the ITT co-processor is so much more efficient.

Booting back into OS/2 6.123, and yeah now it just works! No fussing around, everything is just great.

I’m kind of lost too, as none of this should require the maths coprocessor, but the results speak for themselves. I used to wonder once I got some disk images for this ancient version of OS/2, why didn’t they ship it? Sure that insane fight with Microsoft on refusing something like Windows on OS/2, or even WLO like Windows IN OS/2 from being part of the product killed any hope of running apps, but this version of OS/2 is already caught in the trap that it can run MS-DOS so well, despite DPMI not being a thing right now.

As I’d mentioned it does run just fine in 86Box, so what is the deal? Well that lead me to look back at when it did crash I noticed an odd string 038600b1

So what does this mean? Well looking back at the CPU let’s try to decode some of it

First, it’s an A80386-16, which really isn’t that hard to figure out it’s a 16Mhz rated 80386. Next is the revision level, S40344. Searching around we can find this table:

S40276 - A1 (but probably 12 MHz as S40277 is 12 MHz)
S40334 - A2
S40336 - B0
S40337 is B0 stepping
S40343 - B1
S40344 is B1 stepping
S40362 - B1 (20 MHz)

So this places it at at the tail end of the introductory line of 386 processors. Checking over at pcjs, we find that there were quite a few more revisions to the 386.

And further that the B1 Errata is actually quite substantial. Maybe this is why the 386 had such a poor reputation for Unix ports in the day, and why it was shunned by CSRG?

As mentioned in the infamous 32bit multiply bug, this processor had been tested and was given the ΣΣ mark of approval. There are numerous issues listed with the presence of a math coprocessor, I have to wonder if beyond issues for using the full 32bit datapath, if there were some electrical issues with utilizing the full datapath as well? Much like an improperly terminated SCSI bus, did the simple presence of the ITT 387 help with signaling and improve system stability? Or am I hitting some weird bug in 32bit math that is simulated due to the lack of a coprocessor, that once one is in the system, the operation is performed on hardware, sidestepping the entire issue? I’m neither an EE or any good at reversing code, so I really don’t know.

The date code 751 does mean that this processor was manufactured in the 51st week of 1987.

Looking at how ancient this CPU is, I have opted to order one that was made in 1990, an SX218 or D1 stepping.

Although it hasn’t arrived yet, I have to wonder if it would make a really big difference in 32bit system stability? I have to wonder if there was such a massive delay in OS/2 2.0 because of the early 386 processors having so many defects that it just added an undue burden to the development, along with the fighting between IBM & Microsoft. While it would be interesting to see the difference between any of the Microsoft versions of OS/2 2.0, none have surfaced as of yet. Which is a shame.

Although it is nice to have this ‘mid’ IBM beta of OS/2, it does suffer from the ever so common issue of not being able to run any shipping 32bit executable, so unless you have source/object files to link, you are pretty much out of luck. The Microsoft Beta 2 tools are 16bit, so thankfully they run on pretty much any version of OS/2, and they ought to be able to run under Phar Lap 286 as well.

One thing that did recently surface on eBay, is a Microsoft tee shirt from their OS/2 2.0 group. With a minor bit of sleuthing, the Enterprise is from the 1989 ‘hit’ Star Trek V. Maybe I’m too much of a nerd to have recognized the GIF.

Back some 20+ years ago when I lived in Miami, I did have a loaded out PS/2 model 80 back then, and I ran AIX on it, as I thought it was really cool. But it was also incredibly unstable. I have to wonder now if it was a fault of the processor, or the system? Then again back then I had 6 registered IP’s and of course my PS/2 was on the internet! Although it was also the right height to double as a standing mouse pad.

So I guess this potentially leaves us with some painful lesson that you ought to get the math coprocessor for older systems if you plan on running anything other than DOS/Windows with a DOS extender. While I do have a PS/2 version of Xenix, I haven’t been able to dump them yet as my Power Mac doesn’t like NON FAT disks. One thing is for sure, it made a massive difference in OS/2. I don’t think 16Mhz/6MB of RAM is anywhere near enough to run OS/2 2.00 at any decent speed so I’ll stick with the much lighter 6.123.

Virtually Fun

Fun with Virtualization

Category Archives: i386