Degrading qemu performance in DooM Part II

ticks for demo (fewer is better)

Version     ASM FixedDiv/FixedMul   C
0.9.0       259                     268
0.10.5      299                     300
0.11.0      316                     317
0.12.0      289                     281
0.12.5      290                     282
0.14.0      282                     274
0.14.1      280                     269
20180430    192                     195
20190218    187                     187

So after the last round, I went ahead and dug out my crap version, where I had just recently found a nice abs() fix for a FixedDiv issue that the old iD code suffers from, and rebuilt DooM twice: once using the assembly fixed-point division, and once using the C version. To compile I used my old GCC 2.7.2.3 with the flags:

-m486 -msoft-float -ffast-math -O2 -fforce-addr -fomit-frame-pointer
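For reference, the routines being swapped between assembly and C are DooM's 16.16 fixed-point multiply and divide. Below is a minimal C sketch of the pair, patterned on the overflow guard in the released DooM source rather than on the exact abs() fix mentioned above. The 64-bit intermediates and the INT32 limits are assumptions made so it builds cleanly on a modern compiler.

```c
#include <stdint.h>

typedef int32_t fixed_t;

#define FRACBITS 16
#define FRACUNIT (1 << FRACBITS)

/* 16.16 multiply: widen to 64 bits, then drop the extra 16 fraction bits.
   Right-shifting a negative value is implementation-defined in C; the
   usual compilers do an arithmetic shift here. */
fixed_t FixedMul(fixed_t a, fixed_t b)
{
    return (fixed_t)(((int64_t)a * (int64_t)b) >> FRACBITS);
}

/* 16.16 divide with a guard: if the quotient would not fit in 32 bits
   (which includes b == 0), saturate instead of overflowing. The absolute
   values are taken in 64 bits so that INT32_MIN is handled too. */
fixed_t FixedDiv(fixed_t a, fixed_t b)
{
    int64_t abs_a = (a < 0) ? -(int64_t)a : (int64_t)a;
    int64_t abs_b = (b < 0) ? -(int64_t)b : (int64_t)b;

    if ((abs_a >> 14) >= abs_b)
        return ((a ^ b) < 0) ? INT32_MIN : INT32_MAX;

    /* Multiply rather than shift, since shifting a negative left is undefined. */
    return (fixed_t)(((int64_t)a * FRACUNIT) / b);
}
```

With this version FixedDiv(FRACUNIT, 0) saturates to INT32_MAX instead of trapping, which is the sort of edge case the guard exists for.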

So here we go using the versions of Qemu that I can build quickly with GCC 3.4.5 MinGW, along with the last two pre-built Win64 builds.

It’s kind of interesting just how close the performance is between the two versions.

Naturally the real test is to run it on actual hardware, and to try a few versions of Watcom C.

Maybe the real takeaway is that Qemu runs GCC built code better…?

Adding DOSBox’s MPU401 to Qemu 0.90

I thought this might be something cool, if kind of pointless. Anyways the MPU401 UART can be run like a traditional serial port with an IRQ, in intelligent mode, or just as a ‘dumb’ device you can bit bang to talk to MIDI devices. So while playing with DOSBox I thought it’d be fun to take its emulation and plug it into Qemu.

And this is the end result.

It’s far from perfect. When it works it tends to work well: it fails with things like Return to Zork, but it does work with DMX’s sound code in DooM and with the MPU401 driver for Windows 3.1.

While doing this I originally struggled with mapping the IO ports. Qemu has functions that register a callback to trap reads and writes over a range of IO ports. In this case base is 0x330, the base address of the MPU401 device.

register_ioport_write(base, 8, 1, mpu401_ioport_write, s);
register_ioport_read(base, 8, 1, mpu401_ioport_read, s);

I was thinking that the port 0x331 needed to be mapped in the same way, but it turns out after looking through more of the source, it’s actually a word aligned access. So in that case you can use a switch to see which port is actually being accessed.

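static uint32_t mpu401_ioport_read(void *opaque, uint32_t addr)
{
    /* One handler covers the whole range; the low bits pick the port. */
    switch (addr & 0xf) {
    case 0:     /* base + 0: data port */
        return MPU401_ReadData(addr, 0);
    case 1:     /* base + 1: status port */
        return MPU401_ReadStatus(addr, 0);
    default:    /* nothing else is decoded */
        return 0xff;
    }
}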

Pretty simple, right?
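The write side follows the same pattern. Here is a self-contained sketch of both directions, assuming the write callback mirrors the read one with an extra value argument. The demo_mpu_t struct and the demo_ function names are hypothetical stand-ins so the example compiles on its own; the real code calls into the DOSBox MPU401 routines instead.

```c
#include <stdint.h>

/* Hypothetical stand-in for the device state. */
typedef struct {
    uint8_t data;     /* last byte written to the data port    */
    uint8_t command;  /* last byte written to the command port */
    uint8_t status;   /* value handed back on status reads     */
} demo_mpu_t;

/* One function for the whole registered range, with the low address
   bits selecting which port was hit. */
void demo_mpu_ioport_write(void *opaque, uint32_t addr, uint32_t val)
{
    demo_mpu_t *s = (demo_mpu_t *)opaque;

    switch (addr & 0xf) {
    case 0:                       /* 0x330: data port */
        s->data = (uint8_t)val;
        break;
    case 1:                       /* 0x331: command port */
        s->command = (uint8_t)val;
        break;
    default:                      /* 0x332-0x337: ignored */
        break;
    }
}

uint32_t demo_mpu_ioport_read(void *opaque, uint32_t addr)
{
    demo_mpu_t *s = (demo_mpu_t *)opaque;

    switch (addr & 0xf) {
    case 0:  return s->data;      /* 0x330: data port   */
    case 1:  return s->status;    /* 0x331: status port */
    default: return 0xff;         /* nothing decoded here */
    }
}
```

Port 0x331 is the interesting one: the same address is the command port on writes and the status port on reads, which is why a single registration over the range plus a switch is enough.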

And from there it’s a matter of mapping the DOSBox MPU code, along with the Windows interface code. Since I’m not using intelligent or IRQ mode, I just amputated the code where applicable.

If anyone wants to look at what I did to merge into anything else (and probably do a better job!) it’s on sourceforge as mpu401.c.

Otherwise the binary is available on sourceforge:

Download Qemu090b

Degrading qemu performance in DooM

Quick table… it’s late.

I’m using MS-DOS 5 & this benchmark suite loaded into a VMDK, and I ran a few tests to check performance numbers.

version     3d bench    chris bench    doom ticks    quake demo
0.9.0       182         257.4          263           46.4
0.10.5      204         247.6          491           41.7
0.12.5      305.1       296.7          475           46.8
0.13.0      305.3       284.3          511           45.7
0.14.1      289.6       285.3          509           45.2
0.15.0      213.4       176.9          520           15.3
1           197.9       152.1          546           15.5
2.4.0.1     392.1       340.6          743           24.2
3.1.50      405         462.3          1497          33.3

I snagged 3.1.50 from https://qemu.weilnetz.de/w64/

Better performance than v2, sure, but for interactive stuff.. not so much.

So what is really going on here? Why is 0.90 so much faster when it comes to doom, when it is also the slowest in raw CPU performance and the fastest at IO? It appears that the crux of the issue is simply how it handles its IO, heavily favoring device performance over CPU.

I’ll have to follow up with more builds and reading release notes to see what changed between releases. And what was it exactly that broke between gcc 3 and 4, and why the rip had to be.

I still like 0.90, if anything for its ability to run NeXTSTEP and NetWare.

SharpShooter3D

When the best friend becomes an enemy, and the main villain becomes you yourself and blood flows like a river, someone will definitely have to answer the main question – Who is behind the awakening of ancient evil and is there still a chance for humanity?

PUBLISHER:Dagestan Technology
DEVELOPER:HeadHuntersGames

I saw this ‘gem’ pop up on steam for $3.50 HKD (so $0.50 USD?) and thought what the hell let’s try it out. SharpShooter3D is well, a DooM mod of sorts, but it also feels a lot like Duke3D with the inclusion of vehicles and ‘moving room/vehicles’ like train cars. At its heart is GZDoom 3.3.2 which states in the license:

Parts of the voxel code in the software renderer use code from the
BUILD engine by Ken Silverman and are used under the terms of the
GPL v3 with permission.

Well isn’t that cool! The best of Duke3D and Doom! All in one.

It captures the imagined feel of the Eastern Bloc: old factories, nuclear power plants, lots of guys in trainers & Adidas all over the place, along with copious alcohol and milk (yes milk is the health thing here!).

Not to mention the punk sound track is pretty good for such a seemingly ‘low end offer’. Had this come out 20 years ago, it really would have set the world on fire, and probably set off quite a bit of controversy, but today it’s a discount mod that no doubt the devs did put a bit of work into.

I have no idea if the game is 90% off for the rest of the world, but I’d say it’s worth a look at the price.

Updated my DooM port


x68000 still won’t build on this, lots more to either separate out or just fork out.  No biggie.

The big plus for GCC 2.x and higher builds that use the DJGPP v2 runtime is that the Allegro music and sound work.  It tests faster than Watcom 10/11 or Open Watcom, which I suspect comes down to DMX vs Allegro.

Things to do..

  • Upload my GCC 2.7.2.1 cross toolchain
  • Upload the Freedoom build chain (again)
  • Test on real machines, the 486 and the P3

So yeah.  Limited progress.

Also it’s nice being able to cross build Allegro 3.12 in 10-15 seconds vs the hour+++ in emulation.

Project is in the djgppv1 cross thing

https://sourceforge.net/p/crossdjgppv1/DooM/ci/master/tree/

git access to clone

git clone git://git.code.sf.net/p/crossdjgppv1/DooM crossdjgppv1-DooM

make with:

make -f makefile.djgpp_v2

and you should be good to go

dosdoom 0.2 recovered

While cruising around at doomworld.com looking for something else, I saw this thread: ‘Recovered’ DOSDoom 0.2.

So I quickly built it with my MinGW32-DJGPP using GCC 3.4.5.  This version needs the Allegro library, as it has sound effects!  Building Allegro needed GCC 2.7.2.1 and Binutils 2.8.1; using other versions just led to nothing but trouble.  I ended up just installing DJGPP on DOSBox to build Allegro, which took … a while.  Being able to cross compile dosdoom from Windows was far far far quicker.

So yeah, it runs.  With sound.  It’s great.  The Allegro integration is nowhere near complete at this point; it only handles the sound effects.  I took a much later version of dosdoom’s MIDI code, which required the Allegro timer, which interfered with my older timer IRQ hook.  Converting the whole thing to use the Allegro timer and keyboard wasn’t too difficult, and that gives my DooM source fork a really full feeling when using DJGPP v2.

Although I’m having issues uploading from China at the moment.

Imagemagick really is.. Magic!

Ok, so since I’ve been playing around with the Freedoom assets, I wanted to process all the assets and then make them into an iwad.  And for the graphics this meant generating a simple color cube palette and then transforming all of the images to match that palette.  And the results, while recognizable as DooM, were drab.

Gray world
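To show what a ‘simple color cube’ palette looks like in practice, here is a minimal sketch. The six levels per channel (216 colors), the entry ordering and the function names are all assumptions for illustration, not necessarily the cube I generated.

```c
#include <stdint.h>

#define CUBE_LEVELS 6                                          /* steps per channel (assumed) */
#define CUBE_COLORS (CUBE_LEVELS * CUBE_LEVELS * CUBE_LEVELS)  /* 216 entries */
#define CUBE_STEP   (255 / (CUBE_LEVELS - 1))                  /* 51 between levels */

/* Fill out[] with CUBE_COLORS RGB triples, red varying slowest. */
void build_color_cube(uint8_t out[CUBE_COLORS * 3])
{
    int r, g, b, i = 0;

    for (r = 0; r < CUBE_LEVELS; r++)
        for (g = 0; g < CUBE_LEVELS; g++)
            for (b = 0; b < CUBE_LEVELS; b++) {
                out[i++] = (uint8_t)(r * CUBE_STEP);
                out[i++] = (uint8_t)(g * CUBE_STEP);
                out[i++] = (uint8_t)(b * CUBE_STEP);
            }
}

/* Nearest cube entry for an arbitrary color: round each channel to the
   closest level, then combine the three level numbers into one index. */
int cube_index(uint8_t r, uint8_t g, uint8_t b)
{
    int qr = (r + CUBE_STEP / 2) / CUBE_STEP;
    int qg = (g + CUBE_STEP / 2) / CUBE_STEP;
    int qb = (b + CUBE_STEP / 2) / CUBE_STEP;

    return (qr * CUBE_LEVELS + qg) * CUBE_LEVELS + qb;
}
```

A cube like this spreads its entries evenly over the whole RGB space, so only a handful of them land on the browns, greens and skin tones the artwork actually uses, which would go some way to explaining the washed-out look.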

And yes, it’s gray, and drab.  So UK.  And there is another problem, many of the ‘graphics’ assets were a mix of PNG and GIF, and it turns out that in the GIF format you usually have a single color set as your transparent color, and it’ll get cut out automatically.  In this case the transparent color is cyan.  However there is some cyan still in the image!

Cyan artifact

So the best way I figured to ‘fix’ this was to do a straight conversion using Imagemagick.  So I loaded up paintbrush of all things and noticed that for some reason the colors had bled on the GIFs I had from the Freedoom pack I’d downloaded, and that there were actually 3 stray cyan colors that needed to be purged.  In hex they are 0x00fefe, 0x00f2f2 and 0x02f6f5, plus the one that they should have been, 0x00ffff.

convert ..\graphics_freedoom\old\wia20200.gif -transparent #02f6f5 -transparent #00fefe -transparent #00f2f2 -transparent #00ffff oldpng\wia20200.png
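What those four -transparent options amount to, per pixel, can be sketched as below. This is only an illustration of the idea on a raw RGBA buffer, not how Imagemagick implements it; the function names and the buffer layout are assumptions.

```c
#include <stddef.h>
#include <stdint.h>

/* The four key colors from the convert command above, as 0xRRGGBB. */
static const uint32_t cyan_keys[] = { 0x02f6f5, 0x00fefe, 0x00f2f2, 0x00ffff };

/* Returns 1 if the color exactly matches one of the keys, 0 otherwise. */
int is_cyan_key(uint8_t r, uint8_t g, uint8_t b)
{
    uint32_t rgb = ((uint32_t)r << 16) | ((uint32_t)g << 8) | (uint32_t)b;
    size_t i;

    for (i = 0; i < sizeof cyan_keys / sizeof cyan_keys[0]; i++)
        if (cyan_keys[i] == rgb)
            return 1;
    return 0;
}

/* Zero the alpha of every keyed pixel in an RGBA buffer (4 bytes per
   pixel) and return how many pixels were knocked out. */
size_t knock_out_cyan(uint8_t *rgba, size_t npixels)
{
    size_t i, cleared = 0;

    for (i = 0; i < npixels; i++) {
        uint8_t *p = rgba + i * 4;
        if (is_cyan_key(p[0], p[1], p[2])) {
            p[3] = 0;
            cleared++;
        }
    }
    return cleared;
}
```

The match is exact on purpose: a near-cyan that isn’t in the list is left alone, which is why each bled shade had to be listed separately.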

So I had to run convert like this against all the GIFs that I needed to fill in the graphics that I’m currently not processing in Python.  Now the images actually look right, no surprise cyan, but my palette still sucks.

Although the Freedoom team has told me that it’s far easier to just use the DooM palette as their assets use that palette and it’ll just work.  But I’m too stupid for that.  One great feature of Imagemagick is that you can hand it an image, and ask it to reduce it to any arbitrary number of colors, and it’ll do a great job of it.  And while that is great for a single image, that doesn’t help me when I’m talking about thousands of images.  Except Imagemagick also has another great ability which is to paste a new image to the right of an existing one.  So with a little creative use of the make command I can then build a single giant image that contains all of the artwork.  Isn’t that great?

Although great care and detail went into the original DooM palette selection, we can throw all of that away, let a program stitch everything together, and then have it analyze the entire mess and come up with the ‘ultimate palette’ that works best with everything.  One word of warning: it takes well over an hour on an i7 just to stitch the images together (I should have set up a RAM disk), but it only took about 10 minutes for Imagemagick to process the blob image and come up with 255 colors that work best across the entire image set.

With the reduction done, the next thing to do is to create a ‘palette image’, which is one pixel for each color.  This is the palette that we will use to ‘reduce’ all the images against.  It’s more so to let Imagemagick do the hard work of selecting a palette.

convert slug256.png -unique-colors -scale 100% slug256_table.png

Imagemagick optimal palette

And the next thing is to extract the RGB set, which DooM uses for its palette.  In this case each color is represented by three bytes.

convert -size 256x1 -depth 8 slug256_table.png -append rgb:playpal-base256.lmp
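The resulting file is just red, green, blue bytes back to back, so 256 colors come to 768 bytes. A minimal sketch of reading one entry back out, with the struct and function names being my own for illustration:

```c
#include <stddef.h>
#include <stdint.h>

#define PAL_COLORS 256
#define PAL_BYTES  (PAL_COLORS * 3)   /* 768 bytes for one palette */

typedef struct { uint8_t r, g, b; } rgb_t;

/* Copy palette entry idx out of a raw RGB lump.
   Returns 0 on success, -1 if the lump is too short or idx is out of range. */
int palette_entry(const uint8_t *lump, size_t len, unsigned idx, rgb_t *out)
{
    if (len < PAL_BYTES || idx >= PAL_COLORS)
        return -1;

    out->r = lump[idx * 3 + 0];
    out->g = lump[idx * 3 + 1];
    out->b = lump[idx * 3 + 2];
    return 0;
}
```

The length check is worth having: if the palette image ends up with fewer than 256 unique colors, the raw file may well come out shorter than 768 bytes.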

And then the next step is to process this palette with dmutils dcolors.  While it is primarily designed to use the LBM file format of Amiga fame, it wasn’t too hard to modify it to read a palette file directly, and let it add in the ‘red pain’ color shift, along with the green ‘bio hazard suit’ effect.  The color map it generates is totally corrupt at the moment, so I’m using the old perl program to generate one based off of the palettes.
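The ‘red pain’ and green suit palettes are the base palette with every color pulled part of the way toward a target color. A minimal sketch of that kind of shift follows; the function names are mine, and the fractions in the note after the code are assumptions rather than values read out of dcolors.

```c
#include <stdint.h>

/* Move one channel shift/steps of the way from c toward target. */
static uint8_t shift_channel(int c, int target, int shift, int steps)
{
    return (uint8_t)(c + (target - c) * shift / steps);
}

/* Tint a 256-entry RGB palette (768 bytes in, 768 bytes out) toward the
   color (tr, tg, tb). shift == 0 leaves it unchanged, shift == steps
   turns every entry into the target color. */
void tint_palette(const uint8_t *in, uint8_t *out,
                  int tr, int tg, int tb, int shift, int steps)
{
    int i;

    for (i = 0; i < 256; i++) {
        out[i * 3 + 0] = shift_channel(in[i * 3 + 0], tr, shift, steps);
        out[i * 3 + 1] = shift_channel(in[i * 3 + 1], tg, shift, steps);
        out[i * 3 + 2] = shift_channel(in[i * 3 + 2], tb, shift, steps);
    }
}
```

Calling it toward (255, 0, 0) with shift running from 1 to 8 and steps set to 9 would give eight progressively redder palettes, and a single small shift toward green would give the suit tint.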

Indeed it is something so crazy that I really don’t want to even do it a second time to make sure my process was reproducible.  However compared to before, I think the results speak for themselves:

new palette

So while it does take the better part of forever to go through all the images like this, it certainly gives zeeDoom a little more ‘building the world from scratch’ type feel.