This is a guest post from spaztron64
One thing that’s been bugging me for years at this point is the ability to run Touhou 6 on my PC-9821 V166. For a good few years I’ve been stuck with nothing more than a Matrox Mystique graphics card in that thing, which can’t create a D3D6 HAL context for rendering the game’s 3D elements. In 2021 I snatched a 12MB 3dfx Voodoo 2, in hopes of being able to play more 3D games on that machine. There were two major problems….
1) The USBHID.SYS driver for PC-98 Windows 9x conflicts haaaard with Voodoo drivers. Moving the cursor around corrupts memory and makes the system unstable or kills the driver in mere seconds of use
2) None of the Touhou games support secondary Direct3D devices
For those not in the know in regards to the second issue, DirectX allows you to use multiple DDraw and D3D capable GPUs on one system. By default it’ll set the video card outputting a signal on the primary monitor as the primary DirectX device, the secondary output as secondary, and so on. Most people only used one monitor on their Win9x PC back in the day, hooked up to their 2D capable card. The Voodoo 1 and 2 aren’t meant to act as 2D video cards, yet they had to support D3D initialization somehow, so they presented themselves as non-primary DirectX devices, usually secondary, in hopes that game developers would allow the end user to select their 3D accelerator of choice.
This was standard practice at the tail end of the 1990s, but it was falling out of use at the turn of the millenium with the demise of 3dfx and the general lack of need for multiple graphics cards in one system for 3D gaming. This presented a problem, as games that technically could be played on a Voodoo 2… didn’t, as they could never be told to use it through normal means. Hacky solutions existed, like 3dfx’s unfinished, buggy attempt at a Voodoo 2 driver for Windows 2000 that allowed it to behave like a primary display adapter for general 2D and 3D use, but it’s notoriously unstable and isn’t possible to use on 9x. I’ve used this method before to play Touhou 6, Max Payne 1, GZDoom, GTA 3 and Vice City on the Voodoo 2 through Windows XP with mixed results.
Once I got an NEC bus mouse for use on my PC-98, I could finally use the Voodoo 2 on it without constant crashing. This got me interested in trying to get Touhou 6 to work on it, which lead me to a path of pure pain.
For starters, Touhou 6 is one of those games that only use primary DirectX devices, like the unsupported Mystique, so I had to somehow coax it into initializing the secondary device instead. My first approach to handling this was through direct binary patching. I didn’t know where to look for the init routines, so I asked 32th System for some heads up, and he pointed me to a rough location in process memory where the appropriate CreateDevice calls reside:
I then searched for the appropriate opcodes in the game binary, and patched all 6A 00 (push 0, A.K.A D3DADAPTER_DEFAULT) opcodes to be 6A 01 (push 1), forcing the game to init the secondary D3D device.
While this initially did in fact work, the approach ultimately sucked for two reasons.
1) Static binary patching only works for that specific binary, and doesn’t carry across different versions.
2) This requires manually patching every CreateDevice call, of which there are many in Touhou 6
It is at this point that I started sharing my progress with friends. jbit was quick to hop in and say “Why the fuck are you doing it this way? Just make a d3d8.dll wrapper DLL”. This was absolutely the smarter approach, I just didn’t know how to do it since I don’t know jack about DirectX programming. Fortunately, he handed me a little VS project he worked on called d3dcutter that, among other things, wrapped the CreateDevice function, which I promptly modified to always push 1 instead of 0 to the stack for the device selection parameter.
This solved the two patching problems, and I had something to show for starters:
Now, I’m sure you can tell that the performance is absolutely atrocious. This came as no surprise to me, as while the Voodoo can absolutely render the game at a full 60FPS most of the time, the dinky little Pentium MMX 166 struggles hard at doing triangle setup for the backgrounds every 16 milliseconds. Remember, 3dfx cards had no Hardware T&L, so the game has to fall back to Software T&L. I think the following wireframe screenshot will help illustrate the amount of work the CPU has to do every frame:
“Well, I’m in luck!”, one might think as they remember that older Touhou games support framerate division by 1/2 (30FPS) and 1/3 (20FPS) as options in custom.exe… There’s just one problem.
The game just… fails to initialize the D3D HAL on the Voodoo 2 in the frame divided modes. Why? Beats me, I still haven’t figured it out and likely never will. Just more ZUNcode bullshit I suppose.
I then figured out that framerate division is handled by a variable (that can be set even lower than 1/3, by the way) which could be set at runtime, even when the normal 60FPS mode is used. I suspected that the game uses a different initialization path for the two modes, so I once again tracked down opcodes that expect the variable to be set to 0 in process memory, this time with Cheat Engine, and patched them in the binary. Well guess fucking what, the game fails to initialize even when the regular init routines are modified to expect 30FPS or 20FPS frame division to be set.
This approach simply wasn’t going to work. so I went with trying to set the variable at runtime. Unfortunately, I had to go back to version specific patching once more for this, since there’s no way to wrap this functionality through DLL means. Additionally, while this wasn’t hard to do with the game running in windowed mode with Cheat Engine on the side on a modern system, it was basically impossible on a Voodoo 2 equipped machine, as the game ran in fullscreen and it wasn’t possible to restore the window after an Alt+Tab.
My final solution was to generate a trainer in Cheat Engine for version 1.02d of the game, as it’s the last one with a working logic speed limiter, that would forcibly set the frame divider variable at runtime with a hotkey:
This finally allowed me to play the game in 1/3 framerate mode on the Voodoo 2.
This allows the game to run at full logic speed most of the time, as the CPU now has 40 miliseconds per frame for triangle setup, however there’s something wrong with how the card handles buffer swaps in this mode of operation, leading to a very back-and-forth stuttery image that’s very unpleasant to look at.
Can we do better? Well yes! The game uses so-called STD scripts for certain stage-specific data setup, but also handling camera movement and geometry generation. Using Touhou Toolkit, I was able to unpack the appropriate DAT file, decompile all the STD scripts, remove all geometry commands, and recompile them for in-game use. As there are no more backgrounds to draw, there’s a trail effect left behind every frame, but thankfully custom.exe has an option to forcibly clear the back buffer every frame.
The end result? A nearly tripled framerate in 60FPS mode, recovered just by not drawing any 3D backgrounds. The game still lags when lots of bullets are on screen, but this doesn’t really come as a surprise.