Thunking for fun & a lack of profit

Posted on February 19, 2024 by neozeed

So, with a renewed interest in OS/2 betas, I’d been getting stuff into the direction of doing some full screen video. I’d copied and pasted stuff before and gotten QuakeWorld running, and I was looking forward to this challenge. The whole thing hinges on the VIO calls in OS/2 like VioScrLock, VioGetPhysBuf, VioScrUnLock etc etc. I found a nifty sample program Q59837 which shows how to map into the MDA card’s text RAM and clear it.

It’s a 16bit program, but first I got it to run on EMX with just a few minor changes, like removing far pointers. Great. But I wanted to build it with my cl386 experiments and that went off the edge. First there are some very slick macros, and Microsoft C just can’t deal with them. Fine I’ll use GCC. Then I had to get emximpl working so I could build an import library for VIO calls. I exported the assembly from GCC, and mangled it enough to where I could link it with the old Microsoft linker, and things were looking good! I could clear the video buffer on OS/2 2.00 GA.

Now why was it working? What is a THUNK? Well it turns out in the early OS/2 2.0 development, they were going to cut loose all the funky text mode video, keyboard & mouse support and go all in on the graphical Presentation Manager.

Instead, they were going to leave that old stuff in the past, and 16bit only for keeping some backwards compatibility. And the only way a 32bit program can use those old 16bit API’s for video/keyboard/mouse (etc) is to call from 32bit mode into 16bit mode, then copy that data out of 16bit mode into 32bit mode. This round trip is called thunking, and well this sets up where it all goes wrong.

Then I tried one of the earlier PM looking betas 6.605, and quickly it crashed!

Well this was weird. Obviously, I wanted to display help

This ended up being a long winded way of saying that there is missing calls from DOSCALL1.DLL. Looking through all the EMX thunking code, I came to the low level assembly, that actually implemented the thunking.

EXTRN   DosFlatToSel:PROC
EXTRN   DosSelToFlat:PROC

After looking at the doscalls import library, sure enough they just don’t exist. I did the most unspeakable thing and looked at the online help for guidance:

So it turns out that in the early beta phase, there was no support for any of the 16bit IO from 32bit mode. There was no thunking at all. You were actually expected to use Presentation Manager.

YUCK

For anyone crazy enough to care, I uploaded this onto github Q59837-mono

It did work on the GA however so I guess I’m still on track there.

Targeting OS/2 with Visual Studio 2003

Posted on February 11, 2024 by neozeed

Sydney’s idea of what Visual C++ for OS/2 should look like

No, it’s not a typo.

This is a long-winded post, but the short version is that I found a working combination to get the C compiler from Visual Studio 2003 targeting OS/2.

Once I’d learned how C compilers are a collection of programs working in concert, I’d always wanted to force Microsoft C to work in that fashion, however it is born to be a compiler that integrates everything but linking. There has been a “/Fa” or output assembly option, but I’ve never gotten it to do anything useful. I’m not that much into assembly but it seemed insurmountable.

But for some reason this time things were different.

This time I used:

Microsoft (R) Macro Assembler Version 6.11

After the great divorce and the rise of Windows NT, Microsoft had shifted from the OMF format to COFF. However somewhere buried in their old tools it still supports it, namely MASM. For example, if I try to run LINK386 (the OS/2 Linker) against output from Visual C++ 2003 I get his

abandon.obj :  fatal error L1101: invalid object module
 Object file offset: 1  Record type: 4c

However if I output to assembly and then have MASM assemble that, and try the linker, I’m bombarded with errors like this:

warp.obj(warp.asm) :  error L2025: __real@4059000000000000 : symbol defined more than once
warp.obj(warp.asm) :  error L2029: '__ftol2' : unresolved external

If I was smart I’d have given up, there is pages and pages of this stuff. But I’m not smart, so instead I decided to something different, and use SED, the stream editor, and try to patch out the errors.

The ftol2 call is for newer CPU’s and any OS/2 library won’t have it. But instead of binary editing symbols we can replace the ftol2 with ftol with this simple line:

 sed -e 's/_ftol2/_ftol/g'

For some reason Visual C++ likes to make all it’s reals “public” meaning there can only be one, but yet there is so many. Why not comment them all out?

sed -e 's/PUBLIC\t__real@/;PUBLIC\t__real@/g'

And there are various other annoying things, but again they can be all patched out. Just as the older Windows 1991 Pre-release compilers also have weird syntax that MASM doesn’t understand.

astro.asm(59): error A2138: invalid data initializer

which goes into how Microsoft C used to initialize floating point constants:

CONST   SEGMENT  DWORD USE32 PUBLIC 'CONST'
$T20000         DQ      0a0ce51293ee845c8r    ; 1.157407407407407E-05
$T20001         DQ      0bffffd3441429ec5r    ; 2440587.499999667
CONST      ENDS

while I found that the C compiler in Xenix 386 initialises them like this:

$T20000         DQ      0a0ce51293ee845c8h    ; 1.157407407407407E-05
$T20001         DQ      0bffffd3441429ec5h    ; 2440587.499999667

This one was a little hard for me as I’m not a sed expert, but I did figure out how to mark the section, and then to replace it

sed -e "s/DQ\t[0-9a-f]r/&XMMMMMMX/g" $.a1 |  sed -e "s/rXMMMMMMX/H/g"

And so on. At the moment my ‘mangle’ script is now this:

.c.obj:
        $(CC) $(INC) $(OPT) $(DEBUG) /c /Fa$*.a $*.c
        wsl sed -e 's/FLAT://g' $*.a > $*.a1
        wsl sed -e "s/DQ\t[0-9a-f]*r/&XMMMMMMX/g" $*.a1 \
        | wsl sed -e "s/rXMMMMMMX/H/g" \
        | wsl sed -e 's/call \t\[/call DWORD PTR\[/g' \
        | wsl sed -e 's/PUBLIC\t__real@/;PUBLIC\t__real@/g' \
        | wsl sed -e 's/_ftol2/_ftol/g' > $*.asm
        ml /c $*.asm
        del $*.a $*.a1 $*.asm

This allows me to plug it into a Makefile, so I only have to edit it in one place.

Not surprisingly, this allows the LINK from Visual C++ 1.0 to link the MASM generated object files and get a native Win32 executable. Even from the oldest compiler I have from the Microsoft OS/2 2.00 Beta 2 SDK from 1989!

But now that we have the C compilers being able to output to something we can edit and force into a Win32, there is a few more things and suddenly:

        C:\cl386-research\bin.10.6030\cl386 /u /w /G3 /O  /c /Faphoon.a phoon.c
C:\cl386-research\bin.10.6030\CL386.EXE: warning: invoking C:\cl386-research\bin.10.6030\CL.EXE
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.10.6030 for 80x86
Copyright (C) Microsoft Corporation 1984-2002. All rights reserved.

phoon.c
        wsl sed -e 's/FLAT://g' phoon.a > phoon.a1
        wsl sed -e "s/DQ\t[0-9a-f]*r/&XMMMMMMX/g" phoon.a1  | wsl sed -e "s/rXMMMMMMX/H/g"  | wsl sed -e 's/call \t\[/call DWORD PTR\[/g'  | wsl sed -e 's/PUBLIC\t__real@/;PUBLIC\t__real@/g'  | wsl sed -e 's/_ftol2/_ftol/g' > phoon.asm
        ml /c phoon.asm
Microsoft (R) Macro Assembler Version 6.11
Copyright (C) Microsoft Corp 1981-1993.  All rights reserved.

 Assembling: phoon.asm
        del phoon.a phoon.a1 phoon.asm
        msdos286 run286 C:\cl386-research\bin\ddk12\LINK386.EXE @phoon.lnk

Operating System/2 Linear Executable Linker
Version 2.01.012 Nov 02 1993
Copyright (C) IBM Corporation 1988-1993.
Copyright (C) Microsoft Corp. 1988-1993.
 All rights reserved.

Object Modules [.obj]: astro.obj date_p.obj phoon.obj
Run File [astro.exe]: phoon2 /NOE /NOI /NOD:OLDNAMES
List File [nul.map]: nul.map
Libraries [.lib]: ..\..\lib2\libc.lib +
Libraries [.lib]: ..\..\lib2\os2386.lib
Definitions File [nul.def]: nul.def;
LINK386 :  warning L4071: application type not specified; assuming WINDOWCOMPAT

I know it’s a bit of a word salad, but the key thing here is that using Visual C++ 2003’s compiler (version 13.10.6030), and outputting to assembly that we can edit, we can then use MASM to build objects that surprisingly LINK386 version 2.01.012 will link with. I suspect this has to do with device drivers, and probably the majority of the OS/2 operating system.

Anways, we’ve done the incredible, using the same object files, we made both a Win32 application, and an OS/2 application!

phoon-13.10.6030.exe: PE32 executable (console) Intel 80386, for MS Windows
phoon2.exe: MS-DOS executable, LX for OS/2 (console) i80386

Phoon compiled by Visual C++ 2003 on OS/2 2.00

Incidentally Happy CNY!

Obviously, this is VERY cool stuff.

I know the next question is do we have to rely on a 16bit linker? How about Watcom?

C:\cl386-research\proj\trek>wlink @trek.wlk
WATCOM Linker Version 10.0
Copyright by WATCOM International Corp. 1985, 1994. All rights reserved.
WATCOM is a trademark of WATCOM International Corp.
loading object files
searching libraries
Warning(1008): cannot open LIBC.lib : No such file or directory
Warning(1008): cannot open OLDNAMES.lib : No such file or directory
creating an OS/2 32-bit executable

Ignore the warnings and YES we can Link from something much newer & 32bit! In this example I linked the old TREK game, also built with Visual C++ 2003. The response file looks lke:

SYS os2v2
NAME trek2w
FILE abandon.obj,attack.obj,autover.obj,capture.obj,cgetc.obj,checkcond.obj
FILE check_out.obj,compkl.obj,computer.obj,damage.obj,damaged.obj,dcrept.obj
FILE destruct.obj,dock.obj,dumpgame.obj,dumpme.obj,dumpssradio.obj,events.obj
FILE externs.obj,getcodi.obj,getpar.obj,help.obj,impulse.obj,initquad.obj
FILE kill.obj,klmove.obj,lose.obj,lrscan.obj,main.OBJ,move.obj
FILE nova.obj,nullsleep.obj,out.obj,phaser.obj,play.obj,ram.obj
FILE ranf.obj,rest.obj,schedule.obj,score.obj,setup.obj,setwarp.obj
FILE shield.obj,snova.obj,srscan.obj,systemname.obj,torped.obj,utility.obj
FILE visual.obj,warp.obj,win.obj
LIBR ..\..\lib2\LIBC.LIB
LIBR ..\..\lib2\OS2386.LIB

It’s probably needing additional stack space, maybe some other stuff, or resources, maybe how to flag it’s windowing compatible.

TREK built by Visual C++ 2003, and Linked using Watcom C/C++ 10.0

How do I get started, if I dare?! First download and unpack cl386-research-v2. Ideally on the root of your C: drive, because why not?

run the ‘env’ command to set your environment up. Its pretty complicated but in the proj directly there is currently:

*NOTE that I do use SED scripts, I have it set to use Linux in the WSL package.
I tried some Win32 sed but it didn’t work. So you need WSL or a working sed!

bench
doom
info
phoon
trek
(more stuff being added or updated on github)

you can then go into each directory and run

make os2

and it’ll compile populate a floppy and launch the emulator

Its all good fun.

Read the Makefiles to configure a compiler, how to run it, and if you need to mangle the assembly. The 32bit new stuff needs to be mangled, the older stuff almost always works with just compile.

# Version 6.00.054      1989
# https://archive.org/details/os-2-cd-rom_202401
PLATFORM = ddksrc

In this case it’ll select the platform from the ‘ddksdk’ release. The next is if the compiler is OS/2 based or native win32. Basically 73g / windows 95 & below are native Win32.

In the above example we comment out the dos extended cross

# dos exteded cross
CC =  $(EMU) $(DOSX) $(CL386ROOT)$(PLATFORM)\cl386
# native CC
# CC =  $(CL386ROOT)$(PLATFORM)\cl386

Next is the mangle strategy. In this case it’s an ancient OS/2 (like) compile so
try un commenting the ‘just compile’ line

# must include ONLY ONE strategey..
# for OS/2 it must have been assembled my MASM 6.11

include ..\-justcompile.mak
#include ..\-mangleassembly.mak
#include ..\-plainassembly.mak

save the makefile, and run

nmake os2

You can just close the emulator as after each run it’ll unpack a hard disk image, so nothing will be lost. or saved. It’s just for testing. You may need to periodically clean the floppy drive, as that is the only way to transfer stuff in and out of the VM.

What versions of CL386 have I found? Well, it’s quite a few, although I know I’m missing quite a few.

== c386 ============================
Microsoft C 5 386 Compiler

Microsoft C 5.2 286/386 Compiler -- Driver

@(#)C Compiler Apr 19 1990 11:48:30

Copyright (c) Microsoft Corp
1984-1989. All rights reserved.
  (press <return> to continue)
Microsoft 386 C Compiler. Version 1.00.075
Quick C Compiler Version 2.00.000
1.00.075

== ddk12 ============================

C 6.00 (Alpha) Aug 24 1990 19:12:31

Copyright (c) Microsoft Corp
1984-1989. All rights reserved.
  (press <return> to continue)
Microsoft 386 C Compiler. Version 6.00.054
Quick C Compiler Version 2.00.000
6.00.054

== ddk20 ============================

C 6.00 (Alpha) Aug 16 1990 23:04:06

Copyright (c) Microsoft Corp
1984-1989. All rights reserved.
  (press <return> to continue)
Microsoft 386 C Compiler. Version 6.00.054
Quick C Compiler Version 2.00.000
6.00.054

== ddksrc ============================

C 6.00 (Alpha) Aug 24 1990 19:21:49

Copyright (c) Microsoft Corp
1984-1989. All rights reserved.
  (press <return> to continue)
Microsoft 386 C Compiler. Version 6.00.054
Quick C Compiler Version 2.00.000
6.00.054

== nt-sep ============================

@(#)C Compiler 6.00 Feb 06 1991 17:15:19
@(#)C Compiler 6.00 May 13 1991 23:54:12
@(#)C Compiler 6.00 Jun 03 1991 15:16:22


Copyright (c) Microsoft Corp
1984-1991. All rights reserved.
  (press <return> to continue)
Microsoft 386 C Compiler. Version 6.00.077
Quick C Compiler Version 2.00.000
6.00.077

== nt-oct ============================

@(#)C Compiler 6.00 Jun 03 1991 15:16:22
@(#)C Compiler 6.00 Jun 13 1991 22:07:23
@(#)C Compiler 6.00 Oct 10 1991 00:42:24

Copyright (c) Microsoft Corp
1984-1991. All rights reserved.
  (press <return> to continue)
Microsoft 386 C Compiler. Version 6.00.080
Quick C Compiler Version 2.00.000
6.00.080

== nt-dec ============================
@(#)C Compiler 6.00 Jun 03 1991 15:16:22
@(#)C Compiler 6.00 Jun 13 1991 22:07:23
@(#)C Compiler 6.00 Oct 10 1991 00:42:24

Copyright (c) Microsoft Corp
1984-1991. All rights reserved.
  (press <return> to continue)
Microsoft 386 C Compiler. Version 6.00.081
Quick C Compiler Version 2.00.000
6.00.081

== 73g  ============================
1984-1993. All rights reserved.
Copyright (c) Microsoft Corp
8.00.3200
32-bit C/C++ Optimizing Compiler Version
Microsoft (R)


== msvc32s ============================

Microsoft 8.00.0000 - Copyright (C) 1986-1993 Microsoft Corp.
Microsoft 8.00.0000 - Copyright (C) 1986-1993 Microsoft Corp.
@(#) Microsoft C/C++ 32 bits x86 Compiler Version 8.00.XXXX

8.00.000

== 13.10.6030 ============================

Microsoft (R) C/C++ Compiler Version 13.10.6030
From my install of Visual Studio 2003 Enterprise

As you can see many of these earlier OS/2 compilers report the same versions but are in fact different builds on the inside. I suspect Microsoft had to support one version, and an Alpha version of version 6 is as good as it got. I would have imagined there were internal 32bit versions of 6 or 7, but I haven’t seen them.

Hopefully this gives some idea of how I tried to made a probably too modular build system to try all kinds of different compilers. I might have to see if it’s possible to run the tools from the 1992 versions of Windows NT in this setup, perhaps they are interesting as well.

One thing in my porting GCC to OS/2 experience is that the usability of the C compilers from 1991 were dramatically better than what Microsoft had given IBM at the time of the divorce. No doubt the upcoming NTOS/2 project was placing a bigger demand on the tools team.

If anyone has any access to other ‘cl386’ compilers, or early OS/2 2.00 stuff, please let me know! I’d love to do build/tests and see if my idea of distributing objects ‘just works’!

Totally unfair comparison of Microsoft C

Posted on February 5, 2024 by neozeed

Because I hate myself, I tried to get the Microsoft OS/2 Beta 2 SDK’s C compiler building simple stuff for text mode NT. Because, why not?!

Since the object files won’t link, we have to go in with assembly. And that of course doesn’t directly assemble, but it just needs a little hand holding:

Microsoft (R) Program Maintenance Utility   Version 1.40
Copyright (c) Microsoft Corp 1988-93. All rights reserved.

        cl386 /Ih /Ox /Zi /c /Fadhyrst.a dhyrst.c
Microsoft (R) Microsoft 386 C Compiler. Version 1.00.075
Copyright (c) Microsoft Corp 1984-1989. All rights reserved.

dhyrst.c
        wsl sed -e 's/FLAT://g' dhyrst.a > dhyrst.a1
        wsl sed -e "s/DQ\t[0-9a-f]*r/&XMMMMMMX/g" dhyrst.a1  | wsl sed -e "s/rXMMMMMMX/H/g" > dhyrst.asm
        ml /c dhyrst.asm
Microsoft (R) Macro Assembler Version 6.11
Copyright (C) Microsoft Corp 1981-1993.  All rights reserved.

 Assembling: dhyrst.asm
        del dhyrst.a dhyrst.a1 dhyrst.asm
        link -debug:full -out:dhyrst.exe dhyrst.obj libc.lib
Microsoft (R) 32-Bit Executable Linker Version 1.00
Copyright (C) Microsoft Corp 1992-93. All rights reserved.

I use sed to remove the FLAT: directives which makes everything upset. Also there is some weird confusion on how to pad float constants and encode them.

CONST   SEGMENT  DWORD USE32 PUBLIC 'CONST'
$T20001         DQ      0040f51800r    ;        86400.00000000000
CONST      ENDS

MASM 6.11 is very update with this. I just padded it with more zeros, but it just hung. I suspect DQ isn’t the right size? I’m not 386 MASM junkie. I’m at least getting the assembler to shut-up but it doesn’t work right. I’ll have to look more into it.

Xenix 386 also includes an earlier version of Microsoft C / 386, and it formats the float like this:

CONST   SEGMENT  DWORD USE32 PUBLIC 'CONST'
$T20000         DQ      0040f51800H    ;        86400.00000000000
CONST      ENDS

So I had thought maybe if I replace the ‘r’ with a ‘H’ that might be enough? The only annoying thing about the Xenix compiler is that it was K&R so I spent a few minutes porting phoon to K&R, dumped the assembly and came up with this sed string to find the pattern, mark it, and replace it (Im not that good at this stuff)

wsl sed -e "s/DQ\t[0-9a-f]r/&XMMMMMMX/g" $.a1 \
| wsl sed -e "s/rXMMMMMMX/H/g" > $*.asm

While it compiles with no issues, and runs, it just hangs. I tried the transplanted Xenix assembly and it just hangs as well. Clearly there is something to do with how to use floats.

I then looked at whetstone, and after building it noticed this is the output compiling with Visual C++ 8.0

      0       0       0  1.0000e+000 -1.0000e+000 -1.0000e+000 -1.0000e+000
  12000   14000   12000 -1.3190e-001 -1.8218e-001 -4.3145e-001 -4.8173e-001
  14000   12000   12000  2.2103e-002 -2.7271e-002 -3.7914e-002 -8.7290e-002
 345000       1       1  1.0000e+000 -1.0000e+000 -1.0000e+000 -1.0000e+000
 210000       1       2  6.0000e+000  6.0000e+000 -3.7914e-002 -8.7290e-002
  32000       1       2  5.0000e-001  5.0000e-001  5.0000e-001  5.0000e-001
 899000       1       2  1.0000e+000  1.0000e+000  9.9994e-001  9.9994e-001
 616000       1       2  3.0000e+000  2.0000e+000  3.0000e+000 -8.7290e-002
      0       2       3  1.0000e+000 -1.0000e+000 -1.0000e+000 -1.0000e+000
  93000       2       3  7.5000e-001  7.5000e-001  7.5000e-001  7.5000e-001

However this is the output from C/386:

      0       0       0  5.2998e-315  1.5910e-314  1.5910e-314  1.5910e-314
  12000   14000   12000  0.0000e+000  0.0000e+000  0.0000e+000  0.0000e+000
  14000   12000   12000  0.0000e+000  0.0000e+000  0.0000e+000  0.0000e+000
 345000       1       1  5.2998e-315  1.5910e-314  1.5910e-314  1.5910e-314
 210000       1       2  6.0000e+000  6.0000e+000  0.0000e+000  0.0000e+000
  32000       1       2  5.2946e-315  5.2946e-315  5.2946e-315  5.2946e-315
 899000       1       2  5.2998e-315  5.2998e-315  0.0000e+000  0.0000e+000
 616000       1       2  5.3076e-315  5.3050e-315  5.3076e-315  0.0000e+000
      0       2       3  5.2998e-315  1.5910e-314  1.5910e-314  1.5910e-314
  93000       2       3  5.2972e-315  5.2972e-315  5.2972e-315  5.2972e-315

Great they look nothing alike. So something it totally broken. I guess the real question is, does it even work on OS/2?

Since I should post the NMAKE Makefile so I can remember how it can do custom steps so I can edit the intermediary files. Isn’t C fun?!

INC = /Ih
OPT = /Ox
DEBUG = /Zi
CC = cl386

OBJ = dhyrst.obj

.c.obj:
	$(CC) $(INC) $(OPT) $(DEBUG) /c /Fa$*.a $*.c
	wsl sed -e 's/FLAT://g' $*.a > $*.a1
	wsl sed -e "s/DQ\t[0-9a-f]*r/&XMMMMMMX/g" $*.a1 \
	| wsl sed -e "s/rXMMMMMMX/H/g" > $*.asm
	ml /c $*.asm
	del $*.a $*.a1 $*.asm

dhyrst.exe: $(OBJ)
        link -debug:full -out:dhyrst.exe $(OBJ) libc.lib

clean:
        del $(OBJ)
        del dhyrst.exe
        del *.asm *.a *.a1

As you can see, I’m using /Ox or maximum speed! So how does it compare?

Dhrystone(1.1) time for 180000000 passes = 20
This machine benchmarks at 9000000 dhrystones/second

And for the heck of it, how does Visual C++ 1.0’s performance compare?

Dhrystone(1.1) time for 180000000 passes = 7
This machine benchmarks at 25714285 dhrystones/second

That’s right the 1989 compiler is 35% the speed of the 1993 compiler. wow. Also it turns out that MASM 6.11 actually can (mostly) assemble the output of this ancient compiler. It’s nice when something kind of work. I can also add that the Infocom ’87 interpreter works as well.

YAY!

Patching Touhou 6 (Embodiment of Scarlet Devil) to run on a 3dfx Voodoo 2

Posted on October 3, 2023 by neozeed

This is a guest post from spaztron64

One thing that’s been bugging me for years at this point is the ability to run Touhou 6 on my PC-9821 V166. For a good few years I’ve been stuck with nothing more than a Matrox Mystique graphics card in that thing, which can’t create a D3D6 HAL context for rendering the game’s 3D elements. In 2021 I snatched a 12MB 3dfx Voodoo 2, in hopes of being able to play more 3D games on that machine. There were two major problems….

1) The USBHID.SYS driver for PC-98 Windows 9x conflicts haaaard with Voodoo drivers. Moving the cursor around corrupts memory and makes the system unstable or kills the driver in mere seconds of use

2) None of the Touhou games support secondary Direct3D devices

For those not in the know in regards to the second issue, DirectX allows you to use multiple DDraw and D3D capable GPUs on one system. By default it’ll set the video card outputting a signal on the primary monitor as the primary DirectX device, the secondary output as secondary, and so on. Most people only used one monitor on their Win9x PC back in the day, hooked up to their 2D capable card. The Voodoo 1 and 2 aren’t meant to act as 2D video cards, yet they had to support D3D initialization somehow, so they presented themselves as non-primary DirectX devices, usually secondary, in hopes that game developers would allow the end user to select their 3D accelerator of choice.

This was standard practice at the tail end of the 1990s, but it was falling out of use at the turn of the millenium with the demise of 3dfx and the general lack of need for multiple graphics cards in one system for 3D gaming. This presented a problem, as games that technically could be played on a Voodoo 2… didn’t, as they could never be told to use it through normal means. Hacky solutions existed, like 3dfx’s unfinished, buggy attempt at a Voodoo 2 driver for Windows 2000 that allowed it to behave like a primary display adapter for general 2D and 3D use, but it’s notoriously unstable and isn’t possible to use on 9x. I’ve used this method before to play Touhou 6, Max Payne 1, GZDoom, GTA 3 and Vice City on the Voodoo 2 through Windows XP with mixed results.

Once I got an NEC bus mouse for use on my PC-98, I could finally use the Voodoo 2 on it without constant crashing. This got me interested in trying to get Touhou 6 to work on it, which lead me to a path of pure pain.

For starters, Touhou 6 is one of those games that only use primary DirectX devices, like the unsupported Mystique, so I had to somehow coax it into initializing the secondary device instead. My first approach to handling this was through direct binary patching. I didn’t know where to look for the init routines, so I asked 32th System for some heads up, and he pointed me to a rough location in process memory where the appropriate CreateDevice calls reside:

I then searched for the appropriate opcodes in the game binary, and patched all 6A 00 (push 0, A.K.A D3DADAPTER_DEFAULT) opcodes to be 6A 01 (push 1), forcing the game to init the secondary D3D device.

While this initially did in fact work, the approach ultimately sucked for two reasons.

1) Static binary patching only works for that specific binary, and doesn’t carry across different versions.

2) This requires manually patching every CreateDevice call, of which there are many in Touhou 6

It is at this point that I started sharing my progress with friends. jbit was quick to hop in and say “Why the fuck are you doing it this way? Just make a d3d8.dll wrapper DLL”. This was absolutely the smarter approach, I just didn’t know how to do it since I don’t know jack about DirectX programming. Fortunately, he handed me a little VS project he worked on called d3dcutter that, among other things, wrapped the CreateDevice function, which I promptly modified to always push 1 instead of 0 to the stack for the device selection parameter.

This solved the two patching problems, and I had something to show for starters:

Now, I’m sure you can tell that the performance is absolutely atrocious. This came as no surprise to me, as while the Voodoo can absolutely render the game at a full 60FPS most of the time, the dinky little Pentium MMX 166 struggles hard at doing triangle setup for the backgrounds every 16 milliseconds. Remember, 3dfx cards had no Hardware T&L, so the game has to fall back to Software T&L. I think the following wireframe screenshot will help illustrate the amount of work the CPU has to do every frame:

“Well, I’m in luck!”, one might think as they remember that older Touhou games support framerate division by 1/2 (30FPS) and 1/3 (20FPS) as options in custom.exe… There’s just one problem.

The game just… fails to initialize the D3D HAL on the Voodoo 2 in the frame divided modes. Why? Beats me, I still haven’t figured it out and likely never will. Just more ZUNcode bullshit I suppose.

I then figured out that framerate division is handled by a variable (that can be set even lower than 1/3, by the way) which could be set at runtime, even when the normal 60FPS mode is used. I suspected that the game uses a different initialization path for the two modes, so I once again tracked down opcodes that expect the variable to be set to 0 in process memory, this time with Cheat Engine, and patched them in the binary. Well guess fucking what, the game fails to initialize even when the regular init routines are modified to expect 30FPS or 20FPS frame division to be set.

This approach simply wasn’t going to work. so I went with trying to set the variable at runtime. Unfortunately, I had to go back to version specific patching once more for this, since there’s no way to wrap this functionality through DLL means. Additionally, while this wasn’t hard to do with the game running in windowed mode with Cheat Engine on the side on a modern system, it was basically impossible on a Voodoo 2 equipped machine, as the game ran in fullscreen and it wasn’t possible to restore the window after an Alt+Tab.

My final solution was to generate a trainer in Cheat Engine for version 1.02d of the game, as it’s the last one with a working logic speed limiter, that would forcibly set the frame divider variable at runtime with a hotkey:

This finally allowed me to play the game in 1/3 framerate mode on the Voodoo 2.
This allows the game to run at full logic speed most of the time, as the CPU now has 40 miliseconds per frame for triangle setup, however there’s something wrong with how the card handles buffer swaps in this mode of operation, leading to a very back-and-forth stuttery image that’s very unpleasant to look at.

Can we do better? Well yes! The game uses so-called STD scripts for certain stage-specific data setup, but also handling camera movement and geometry generation. Using Touhou Toolkit, I was able to unpack the appropriate DAT file, decompile all the STD scripts, remove all geometry commands, and recompile them for in-game use. As there are no more backgrounds to draw, there’s a trail effect left behind every frame, but thankfully custom.exe has an option to forcibly clear the back buffer every frame.

The end result? A nearly tripled framerate in 60FPS mode, recovered just by not drawing any 3D backgrounds. The game still lags when lots of bullets are on screen, but this doesn’t really come as a surprise.

Building MS-DOS 2.11

Posted on August 6, 2023 by neozeed

I thought I’d slap together some github thing with MS-DOS 2.11 that’s been made buildable thanks to a whole host of other smart people. The default stuff out there expects you to build it under MS-DOS using the long obsoleted ‘append’ utility which can add directories to a search path. Instead I created a bunch of makefiles that take advantage of MS-DOS Player, and let you build from Windows.

dos211: just the MS-DOS 2.11 sources, I re-aranged stuff and made it (slightly) easier to rebuild on Windows. (github.com)

building should be somewhat straightforward, assuming you have the ms-dos player in your path. JUST MAKE SURE YOU UNZIP as TEXT mode. If you are getting a million errors you probably have them in github’s favourite unix mode.

D:\temp\dos211-main\bios>..\tools\make
msdos ..\tools\masm ibmbio.asm ibmbio.obj NUL NUL
The Microsoft MACRO Assembler , Version 1.25
 Copyright (C) Microsoft Corp 1981,82,83


Warning Severe
Errors  Errors
0       0
msdos ..\tools\masm sysimes.asm sysimes.obj NUL NUL
The Microsoft MACRO Assembler , Version 1.25
 Copyright (C) Microsoft Corp 1981,82,83


Warning Severe
Errors  Errors
0       0
msdos ..\tools\masm sysinit.asm sysinit.obj NUL NUL
The Microsoft MACRO Assembler , Version 1.25
 Copyright (C) Microsoft Corp 1981,82,83

DOSSYM in Pass 2

Warning Severe
Errors  Errors
0       0
msdos ..\tools\LINK IBMBIO+SYSINIT+SYSIMES;

   Microsoft Object Linker V2.00
(C) Copyright 1982 by Microsoft Inc.

Warning: No STACK segment

There was 1 error detected.
msdos ..\tools\exe2bin.exe IBMBIO IBMBIO.COM < 70.TXT
Fix-ups needed - base segment (hex): 70
del -f ibmbio.obj    sysimes.obj   sysinit.obj ibmbio.exe

D:\temp\dos211-main\bios>

As an example building the bios by running make. For the impatiend you can download dos211.zip, which includes a bootable 360kb disk image, and a 32Mb vmdk!

32016 stand alone planetfall!

Posted on October 1, 2021 by neozeed

InfoTaskForce’87 running on a simple NS32016 emulator

What is it?

It sure may not look like much but it was an adventure getting here.

First, what is it? Well it’s the very simple NS32016 from here, with a few minor changes. I expanded the RAM from 256kb to a whopping 8MB. Then I added simple character I/O allowing me to print messages to the screen. Next looking at the toolchain page, I used my old Linux to Windows GCC 4 cross compiler to build the appropriate Canadian cross compiler to the NS3216.

Building the tools

A while back, I had built a cross compiler from Linux to Windows using GCC 4.1 as the basis as it was the last version that didn’t have massive external dependencies. NS32016 support was dropped some time in the late 3.x or early 4.x GCC so it means we need to go old anyways. I arbitrary picked GCC 2.8.1 for this build, while using the recommended Binutils 2.27

I cheated and just downloaded my existing linux-minw32.7z cross compiler as I didn’t feel like rebuilding everything again, although it is all in the Building a MIPS Compiler for Windows on a Linux VM! article. I also used an old Linux to Linux i586 32bit compiler (back from the OSKit build!) although you can use your hosts as well.

configuring Binutils is pretty simple like this:

./configure --prefix=/cross --target=ns32k-pc532-netbsd --host=i686-mingw32 --build=i586-linux

You can try omitting the –build portion, Debian GNU Linux 10 seemed okay with Gcc 8 as the default system compiler.

configuring GCC 2.8.1 was pretty similar:

./configure --target=ns32k-pc532-netbsd --prefix=/cross --disable-libssp --build=i586-linux --host=i686-mingw32

GCC 2.8.1 doesn’t quite know what we are doing so there is some flags we need to run off in auto-config.h namely

#define HAVE_BCMP 1
#define HAVE_BCOPY 1
#define HAVE_BZERO 1
#define HAVE_INDEX 1
#define HAVE_KILL 1
#define HAVE_RINDEX 1
#define HAVE_SYS_RESOURCE_H 1
#define HAVE_SYS_TIMES_H 1

You can just comment them out, or remove those lines all together.

When it came to building GCC, I did run into issues with GCC 7/8 trying to build GCC 2.8.1. I found it much easier to either have that Linux 4.1 compiler, or if you have access to Wine or WSL you can just run the Win32 binaries for the gen phases.

./configure --prefix=/cross --target=ns32k-pc532-netbsd --host=i686-mingw32
make CC=i686-mingw32-gcc xgcc cccp cc1 cc1obj

If you can run your own Win32 exe’s on Linux it’ll run just fine using the Linux to Windows GCC 4 cross. Otherwise you will need to either patch GCC or make your own GCC 4 hosted Linux to Linux cross compiler like this:

make CC=i686-mingw32-gcc HOST_CC=i586-linux-gcc xgcc cccp cc1 cc1obj

Hopefully that worked enough, and now you have your cross compiler. Now it’s time to build libgcc1.a

cp cccp cpp.exe
cp cc1 cc1.exe
cp xgcc xgcc.exe
cp ../binutils-2.27/gas/.libs/as-new.exe as.exe
cp ../binutils-2.27/binutils/.libs/ar.exe ar
cp ../binutils-2.27/binutils/.libs/ranlib.exe ranlib
make libgcc1.a TARGET_TOOLPREFIX="./" OLDCC=./xgcc.exe

Again you really want to be able to run the resulting programs on Linux but I guess you could script it out. Naturally if you wanted to just use Linux, it’d be easier to make that cross compiler directly, although I’m not sure how much of GCC 2.8.1 I want to fight, or just get GCC 4 running on Linux and use that to port.

crt0, somewhere for C to start

As mentioned a crt0.s is missing but there was enough inspiration to come up with this:

#NO_APP
gcc_compiled.:
.text
        .align 1
.globl _start
_start:
        enter [],0
#APP
#       setting the stack 256k under 8MB
        lprd sp,0x7c0000
        jsr _main
#NO_APP
L1:
        exit []
#       setting the stack 256k under 8MB
        lprd sp,0x7c0000
        bpt
        .align 1

#does nothing
.globl ___main
___main:
        ret 0

.globl _exit
_exit:
        bpt
        ret 0

I used a bit of the C example, and added some hooks that GCC was expecting namely a __main call that is made from main before it does anything (a place to init memory perhaps?), a place to catch an explicit exit call, along with setting the stack of course.

Patching InfoTaskForce without malloc / disk access

It’s not going to win any awards, but it was really great to get it to run a simple program written in GCC. Looking for something more fun, I took the old InfoTaskForce interpreter from ’87, and dug up my modification to run on cisco routers, and cooked up this version, that adds enough of printf from Linux, a bogus malloc that just allocates from a fixed memory array (otherwise you have to actually know about your platform), and a fun trick with later binutils where you can import a binary file directly as an object!

Neat!

Since I don’t have any file I/O being able to have the game data in RAM is crucial. I tried to tweak it so you could build the same working thing on Windows (maybe others?).

So for anyone who wants to look at the standalone adventure Win32 hosted tools are here, although the emulator should be somewhat portable.

Quick and dirty i8080 emulator in BASH!

Posted on September 22, 2019 by neozeed

Based on the Toledo 8080 processor emulator, it’s Peter Naszvadi’s i8080.sh!

You will need bash version 4 for this to run. So for those of you in the 1990’s you are out of luck.

And to make this fun, the 2kb basic ‘ROM’ is bootstrapped into RAM for immediate execution!

$ ./8080.sh
Quick and dirty i8080 emulator by NASZVADI Peter, 2019.
Press Ctrl+C to quit anytime!
Initializing upper memory to NOPs
Initializing register values
Starting CPU emulation


OK
>10 PRINT "HIHI"
>RUN
HIHI

OK
>

So we’ve seen emulation in javascript of all things, but now you can run 8080 instructions from bash of all things!

And for anyone even more crazy there is always BDS C: which was opened up back in 2002, a K&R for the 8080 on CP/M!

Messing with the Monitor

Posted on April 9, 2019 by neozeed

The 68000 Microprocessor (5th Edition) Hardcover â€“ Nov 25 2003
by James L. Antonakos (Author)

So I was trapped in the Library for a bit, and spied this book. It’s not every often in 2019 you are going to find books about the 68000, as I’m sure any good library will have removed stuff like this, and have it pulped ages ago. But the amount of current technical books in English here is pretty damned slim to none, so I was all to happy to pickup this book for a week.

The poor thing has been checked out 4 times in the last 15 years. I guess the kids don’t know what they are missing.

Anyways what was interesting in this book is that it has a CD-ROM, and on there is some lesson code from the book, along with an assembler that outputs to S-records of all things, and a small emulator that is meant to be compiled under MS-DOS. It was trivial to isolate the passing of DOS interrupts from Unix/MinGW and get the simulator running on something modern.

In Chapter 11 there is a brief walkthrough on building a board, which sounds like fun although I’m sure in 2019 finding parts will be.. challenging, along with a simple monitor program.

The built in assembler can happily assemble the monitor, but it’s geared for talking to the obsolete hardware as specified in the book. I just made a few small changes to instead have it’s console IO hook to the simulator’s TRAPs and I had the monitor running!

I then took the echo test program, and modified it to run at a higher location in memory, along with exiting via the RTS instruction, so that it will exit when you press Q back to the monitor. Then for the heck of it I further extended the monitor so you can Quit it, and return to the simulator.

Is this useful? I’m pretty sure the answer is absolutely not.

The CD-ROM is tiny, I thought it would be packed with goodies, but it’s 250kb compressed.

The68000MicroprocessorFifthEdition.zip

If anything people using this book will probably have lost the CD-ROM and want the programs.

ISBN-10: 0130195618

And my horrible changes here.

Wasting some cycles on FrontVM

Posted on April 1, 2019 by neozeed

A while ago I had chased FrontVM to moretom.net and found 2 links. One from 2003 which is a dead link, and the 2004 version which was archived by the wayback machine!

It was an interesting build, as it still used 68000 emulation from Hatari/UAE this pre-dates the 68000 to C or i386 ASM. However since it ran (mostly) the original code, it was more ‘feature complete’, ~~although loading save games is broken for some reason (I think the decryption was not disassembled correctly)~~. It was actually a stupid file mode setting. I just updated the source & put out a new binary, testing save games between Linux &Windows.

Anyways, it originally built on Cygwin, so I filled in the missing bits, and have it building on both MinGW & Visual C++

Parked outside Willow in the Ross 154 system

So yeah, it’s Frontier, for the AtariST with the OS & Hardware calls abstracted, still running the 68000 code under emulation. I think it’s an interesting thing, but that’s me.

I put it and the other original versions I found over on sourceforge.net

Oddly enough it’s already been downloaded, so go figure.

Source to the Apple-II MIT Logo recovered

Posted on October 4, 2018 by neozeed

;  LOGO Language Interpreter for the Apple-II-Plus Personal Microcomputer
;  Written and developed by Stephen L. Hain, Patrick G. Sobalvarro,
;  and the M.I.T. LOGO Group, at the Massachusetts Institute of
;  Technology.

It’s over on github:Â https://github.com/PDP-10/its/blob/master/src/aplogo/logo.958

Virtually Fun

Fun with Virtualization

Category Archives: assembly