DooM, GCC & AI

One of the things that always annoyed me about DooM is the fixed point math. It relies on a 64bit data type, which many 32bit platform just lack. Namely the FixedMul & FixedDiv.

fixed_t FixedMul
( fixed_t a,
fixed_t b )
{
return ((long long) a * (long long) b) >> FRACBITS;
}

They are generally re-written into assembly, at least for the i386 getting around the whole 64bit on a 32bit platform.

FixedMul:	
	pushl %ebp
	movl %esp,%ebp
	movl 8(%ebp),%eax
	imull 12(%ebp)
	shrdl $16,%edx,%eax
	popl %ebp
	ret

FixedDiv2:
	pushl %ebp
	movl %esp,%ebp
	movl 8(%ebp),%eax
	cdq
	shldl $16,%eax,%edx
	sall	$16,%eax
	idivl	12(%ebp)
	popl %ebp
	ret

So that’d always been the ‘catch’ when porting Doom is that you either need a ‘long long’ data type, or the custom assembly or you basically are out of luck.

I know I’m a weird person, but I do like going backwards in terms of software, while most people want latest GCC 14 targeting old machines or some other hack, I prefer the opposite, trying to get the oldest stuff running on something new(er). In this case, it’s GCC 1.27.

While much of the old GCC history is lost, I did my best to collect as many versions as I could find here, along with doing patches in reverse to ‘reconstruct’ many old versions. The results are on archive.org. And what is significant from this, is the first version of GCC with i386 support appears in version 1.27.

From the internals-1 file we can find out that we all have William Schelter to thank for doing the bulk of the 386 port of GCC.

William Schelter did most of the work on the Intel 80386 support.

Internals-1 – gcc 1.27

With the first commit being on May 29th, 1988:

Sun May 29 00:20:23 1988 Richard Stallman (rms at sugar-bombs.ai.mit.edu)
...
* tm-att386.h: New file.

GCC 1.25

The first GCC with i386 parts is 1.25, however it tends to emit the instruction movsbl, which GAS doesn’t like. Comparing the output to GCC 1.40 reveals this:

-	movsbl -12(%ebp),%ax
+	movsbw -12(%ebp),%ax

It’s trivial enough to change to a movsbw, but there are some other issues going on, and even the Infocom ’87 interpreter won’t fully build & run. The files input.c & print.c have to be built with a later version of GCC, in this case I used GCC 1.40.

8&% compiled with GCC 1.25!

However when using the old XenixNT build thing I did, I do get a runnable EXE!

GCC 1.25 targeting DJGPP v1

I added the BSD targeting files from 1.27 allowing it to generate the needed underscores in the right places, as 1.25 by default only supports AT&T syntax. I guess the best way to illustrate it is to compile the compiler twice, once as the AT&T compiler, and the next being the BSD compiler, and compare their output:

cc1-att.exe hi.cpp -quiet -dumpbase hi.c -version -o hi-att.s
GNU C version 1.25 (80386, ATT syntax) compiled by GNU C version 4.8.1.

.text
.LC0:
        .byte 0x31,0x2e,0x32,0x35,0x0
.LC1:
        .byte 0x68,0x65,0x6c,0x6c,0x6f,0x20,0x66,0x72,0x6f,0x6d
        .byte 0x20,0x25,0x73,0xa,0x0
        .align 2
.globl main
main:
        pushl %ebp
        movl %esp,%ebp
        pushl $.LC0
        pushl $.LC1
        call printf
.L1:
        leave
        ret

And then looking at the BSD syntax:

cc1-bsd.exe hi.cpp -quiet -dumpbase hi.c -version -o hi-bsd.s
GNU C version 1.25 (80386, BSD syntax) compiled by GNU C version 4.8.1.

        .file   "hi.c"
.text
LC0:
        .byte 0x31,0x2e,0x32,0x35,0x0
LC1:
        .byte 0x68,0x65,0x6c,0x6c,0x6f,0x20,0x66,0x72,0x6f,0x6d
        .byte 0x20,0x25,0x73,0xa,0x0
        .align 1
.globl _main
_main:
        pushl %ebp
        movl %esp,%ebp
        pushl $LC0
        pushl $LC1
        call _printf
L1:
        leave
        ret

As you can see main: becomes _main:, just as labels (LC0/LC1) have a prepended, while in BSD they do not. There are no doubt countless other nuanced differences, but for the assembler & operating system to match you want these to align. Thankfully calling conventions are mostly the same per processor so you can add the underscores to the AT&T target and get something that’ll run, not only on DJGPP but also Win32, as MinGW32 uses BSD syntax at its heart!

C:\xdjgpp.v1\src>gcc hi-bsd.s -o hi.exe

C:\xdjgpp.v1\src>wsl file hi.exe
hi.exe: PE32 executable (console) Intel 80386, for MS Windows

C:\xdjgpp.v1\src>hi
hello from 1.25

Not that we really need to go all the way and have GCC 1.25 running on anything much, although at the same time it’s kind of fun!

Reaching out for help

The oldest & most robust GCC is 1.27, and I’d been able to use that to build DooM before, but the caveat is of course the fixed point math. I’d asked smarter people than I years ago about this problem, and basically was told to figure it out for myself. After all its ‘trivial’. But alas, I’m not smart. What I would do is build the fixed point math with GCC and try to re-work that into other compilers, although again for platforms without GCC or the target CPU lacking the 64bit data type is well.. fatal.

But AI, sadly for it, is compelled to help. So I just went ahead and asked and got a surprising result!

Sydney strikes back!

It’s very C++ like but it’s trivial enough to make it into old C.

Sydney’s fixed-point guess

Well on the one hand it does actually load up and play. But the controls go wild and I’m pulled into the intersection of these boxes, and unable to move. And as I rotate the floor and walls clip in and out. It’s very weird.

Not knowing anything about anything, I saw this ‘guard’ on the fixed division and tried to add that to the multiply:

	if ( (abs(a)>>14) >= abs(b))
		return (a^b)<0 ? MININT : MAXINT;

And I got something even weirder!

Now with absolute guards!

Not only that but the engine crashes! Not good.

After thinking about it on and off, asking for more help and going nowhere, I just gave up. It’s something beyond my skill, and apparently the AI as well. Until I had one of those moments in a dream where I had somehow told myself I bet the integers are not unsigned, and obviously fixed-point math needs all the bits, and it’s such a trivial fix, that even I should have figured it out.

I woke up at 4am and fired up the computer to see if it did anything. I was surprised to see that yes, in fact the integers were signed. I added the one key word, and recompiled using GCC 2.2:

Now with unsigned integers

And it RAN! I tried GCC 1.39, and too ran! I then made sure there was no assembly modules being called accidentally, and then re-built with GCC 1.27, and yeah, it runs!

Armed with a simple port of DooM to Win32, I went ahead and put in the fixed-point solution, and used Visual C++ 1.10 aka Microsoft C/C++ 8.0 to build DooM, and yes it works there too!

In the end I guarded it around a long long type, as I’m sure it’s much more faster, but for those without the types or any assembly skill, here is the solution:

/* Fixme. __USE_C_FIXED__ or something. */
fixed_t
FixedMul
        ( fixed_t a,
        fixed_t b )
{
#ifdef HAVE_LONG_LONG
    return ((long long) a * (long long) b) >> FRACBITS;
#else
    unsigned int ah,al,bh,bl,result;

    ah = (a >> FRACBITS);
    al = (a & (FRACUNIT-1));
    bh = (b >> FRACBITS);
    bl = (b & (FRACUNIT-1));

    /* Multiply the parts separately    */
    result = (ah * bh) << FRACBITS;     /* High*High    */
    result += ah * bl;                  /* High*Low     */
    result += al * bh;                  /* Low*High     */
    /* Low*Low part doesn't need to be calculated because it doesn't contribute to the result after shifting

    // Shift right by FRACBITS to get the fixed-point result                                            */
    result += (al * bl) >> FRACBITS;

    return (fixed_t)result;
#endif
}

/* */
/* FixedDiv, C version. */
/* */
fixed_t
FixedDiv2
        ( fixed_t a,
        fixed_t b )
{
#ifdef HAVE_LONG_LONG
        long long c;
        c = ((long long)a<<16) / ((long long)b);
        return (fixed_t) c;
#else
        double c;

        c = ((double)a) / ((double)b) * FRACUNIT;

        if (c >= 2147483648.0 || c < -2147483648.0)
                I_Error("FixedDiv: divide by zero");
        return (fixed_t) c;
#endif
}

fixed_t
FixedDiv
        ( fixed_t a,
        fixed_t b )
{
        if ( (abs(a)>>14) >= abs(b))
                return (a^b)<0 ? MININT : MAXINT;
        return FixedDiv2 (a,b);
}

I haven’t tested it on Big Endian machines yet. I’ve updated my terrible DooM engine port, along with Doom-New for DOS. Additionally, I’ve been able to confirm it works with Watcom 9-OpenWatcom 2. It might even work with 7 & 8 but I haven’t tried.

In Defense of the Mac Pro 2023

guest post by neozeed‘s nephew

There are a few reasons to get an M2 Mac Pro and although many will say the Studio is a better buy for value: that’s only true if you’re not after these important considerations:

  • The ability to install your own *bootable* SSD: nearly every major Mac reviewer ignored this insanely important feature.
  • The ability to install internal storage (and go beyond 8 TB), period: do we really want a cocktail of external HDDs attached? I don’t!
  • The ability to install an internal USB A licensing dongle: unless you’re sharing your dongle over the network with 3rd party software from an RPi hiding in a closet (you should try VirtualHere if you want cross-platform dongle sharing it’s great), you don’t want to accidentally shear it off costing thousands of dollars of lost licensing.
  • The Magic Keyboard and (black) Magic Mouse are bundled (this is not the case with the Mac Studio or the MacBook Pro adding a substanstial cost. However, since the AppleCare+ is more expensive for the Mac Pro over the Mac Studio you could argue these costs cancel themselves out… unless you’re Icarus with a wax wallet instead of wax wings and never purchase AppleCare+).

Recently ‘GoFetch’ made the headlines, but it’s irrelevant for a variety of reasons in my opinion: 1) you won’t have WAN-exploitable instances of GoFetch in the real world, 2) it does indeed affect some Intel processors and probably others. The way all processors are designed now with speculative execution, CVE-after-CVE is unavoidable so the sensationalization has worn out its appeal. Even the once-ironclad AMD processors are afflicted with a bunch of nasty CVEs now too. \rant

Mac Pro vs PS/2 Model 95

After an eye-watering $8000: refurbished base model with AppleCare+ 🤮💸💸, we’re greeted with our new friend. The Cheese Grater (2023 Mac Pro) has befriended the Ardent Tool of Capitalism (PS/2 Model 95)! It’s odd how both share silly nicknames and a very similar height sans handles. Both systems symbolize the same sentiment that Louis Ohland shared many years ago: “Think of a business computer being used for purely personal reasons. Fist pump at the man! Isn’t using a corporate tool because you can an expression of free will?”*

*Louis Ohland is the guy who nicknamed the PS/2 Model 95, the Ardent Tool of Capitalism.

Q&A

Q: will I grate cheese on both of them?

A: only if you clean up the cheese residue for me. Are Personal System/2s even food-safe???

Storage

Sonnet M.2 4×4 NVMe PCI-e

The first thing we’ll need to do is install an NVMe PCI-e card. I’m going with the overly-priced “Sonnet M.2 4×4”, because the 2×4 card is nearly the same price making it a horribly valued product and we may as well expand this thing with four NVMes to get our money’s worth. It’s not really clear if the Sonnet M.2 4×4’s controller outperforms the Sonnet M.2 2×4 (they don’t use the same one), but both operate on gen 3 and the NVMes themselves are gen 3 so none of it really matters. There are much cheaper NVMe PCI-e cards but most are not compatible with Macs, you’re paying the tax for the fancy firmware… otherwise buy a much cheaper card if you’re on Windows or Linux. The card only came in a pink ‘static suppressant bag’ instead of a true antistatic bag which is laughable at how much sonnet is charging, and Amazon appears to have taken a bite out of the box.

For the primary boot NVMe we’re going with a 2TB 970 EVO Plus. I know Louis Rossmann decried them as being unreliable after he torched a bunch in some custom gaming rigs with sketchy PSUs, but they’re good drives if you don’t kill them with dirty PSU voltage rails. Always use quality PSUs folks. This is why many Maxtors failed due to the ST SMOOTH chips receiving power from PSUs outputting higher than 12v, and not the drives themselves… same thing applies today when you eclipse 12v on your power rails. I’ve also been running one in a ThinkPad for more than a year and it’s been fine.

For the remaining tertiary storage we’re going with some WD Green SN350s: solely because they’re compatible with macOS — the macOS compatibility with NVMes is very specific unfortunately. Otherwise I would have went with more TeamGroup 4TB drives as they’re one of the best value for money (particularly the TM8FP4004T0C101, it uses better NANDs than the more expensive and inferior 4TB offerings from Crucial and WD). Yeah… the cost of NVMe disks isn’t absolute, sometimes cheaper ones use better NANDs and you can be fleeced by brand recognition and false-positive specs on gen4 which I imagine is what Crucial capitalizes on.

[If you don’t know what I’m talking about: the Crucial and Western Digital NVMe drives always cheap out and use QLC NANDs instead of proper TLC NANDs as TeamGroup and Samsung do; and obviously they’re not going to advertise they’re cheating you and will price their products the same as the competition. Very similar to the whole SMR/CMR debacle, why would Western Digital tell you you’re buying something cheaper at a premium cost??? Caching is an entirely different thing separate to this and usually only the Samsung drives have ‘true’ dedicated cache logic, which is why I’m using the 970 EVO Plus as a boot/OS NVMe]

reinstalling MacOS

Fortunately we don’t need a second mac to perform OS reinstallation so the ‘Apple Configurator’ is not needed. The procedure is as simple as this: Press & hold power button until the recovery menu pops up, choose ‘continue’, choose reinstall OS, choose the new drive (in this case the Samsung drive I just formatted as “OS”). I know a lot of people raise an eyebrow requiring a second mac for when the system does actually need to be completely restored if it can’t boot into the internal recovery mode; just when you haven’t paid Apple enough you also need a second Mac to perform recovery and restoration. Even neozeed himself encountered this problem and with a heavy sigh (a very heavy sigh) and mild disbelief, set up a macOS VM for restoration since he only owns one. 😂

Once this is completed we’ll no longer be using the proprietary SSD that’s present on the Mac Pro. It DOES still need to be present in the system for the computer to POST (Apple marries it against the security IC so it’s intrinsic and serialized to the computer based on configured storage), but presumably as it won’t be written to anymore it’ll never become exhausted from write cycles… and even if it did fail over time, as a result of ordering the bottom-of-the-barrel 1TB model I could just buy another ‘cheap’ 1TB card which would allow the system to resume POSTing once again. If the soldered-in RAM or CPU fails then it’s game over; as much micro-soldering as I do, I refuse to purchase even more tools to swap out underfilled BGA ICs… and then of course you have to hope employees at Foxconn actually managed to sneak out unused genuine ones to be resold on AliExpress or eBay. *sigh*

With us now being able to use our own bootable SSD, the primary failure and annoyance of ARM-based Macs is now mitigated. For the Mac Studio you could buy backup replacement SSDs to constantly replace as they wear out (they would have to match what storage size the system was preconfigured with), but keep in mind I can add 8TB cheaply and have my own bootable SSD. And in the event you need to do data recovery or read the drive on another system, anything — even your grandma’s phonograph — can read NVMes so it’s much less of a hassle. As much as I hate to say it I think the Mac Studio makes less sense over the Mac Pro BECAUSE of the storage… you’re already buying an overpriced computer, may as well go the full distance for proper storage? Everyone’s living in the honeymoon phase right now while all of the NANDs are under warranty and still functioning… but once they start failing it’ll be a nasty money pit at best, or unfixable at worst. And do you know how many people make one computer their whole life and allow it to spontaneously fail with no backups?

ARE YOU NOT ENTERTAINED?

An ARM-based Mac using internal NVMes, is that not a nice thing? ARE YOU NOT ENTERTAINED? And no need to pay ~$2000 for 8TB. I did have to shell out $400 for the stupid SoNNeT card and $400 for the SSDs… buuut if I paid $2000 worth of SSDs I would far eclipse 8TB. In this screenshot you can also see the ‘OS’ Samsung SSD now the primary ‘Startup disk’. Fortunately, Apple’s utility automatically switched it over after I reinstalled the OS to this drive shockingly enough, so nothing more needed even on that.

Internal USB, perfect for Dongles:

Installing the iLok dongle

The iLok licensing dongle installs nicely inside the internal USB A port. Kind of reminds me of those internal VMware USB A ports meant for the ESX installation… and then you know they’d eventually go bad or corrupt themselves and the internal IT of that company never makes a backup so then you need to reconfigure ESX from scratch… good times. What? I’m not salty, not salty at all. The Sonnet NVMe card being installed on the first slot (bottom) does seem to bring more attention to the fact there’s so many unpopulated PCI-e slots.

What should be used as the display option?

1. The Dell UltraSharp U3224KB 6K actually has a few potential compatibility problems with macOS or the hardware (it’s not really well-known as Dell support gave up troubleshooting it), so you’ll get various screen distortions. It’s also possibly one of the most UGLY products I’ve ever seen in my life… the web camera looks like a malignancy, and I absolutely can’t stand silver-painted plastic. Complain about Apple’s prices all you want, at least they use nice materials.

2. The Pro Display XDR is just a little bit too much for my taste and sometimes temperamental as it’s such a complicated display (contrary to popular mythology it does not use OLED technology so it shouldn’t burn out over time). I honestly don’t think I would encounter any problems if I bought a Pro Display XDR but the cost is too much.

It’s Free Real Estate – Tim & Eric

3. That basically leaves us with the Studio Display. A lot of the 3rd party Samsung/ASUS/LG 4K or 5K offerings have dramatically inferior colour or a larger pixel size… and there’s still the potential aspect of compatibility since non-Apple hardware sometimes doesn’t play nicely. While the Studio Display is much-maligned with its high cost and strangely attached power plug, its DPI is the same as the Pro Display XDR you just get less screen real-estate and inferior contrast which I don’t care too much about. It will still look much better than your garden variety LG 27” 4K UHD Ultrafine because the colour is calibrated very well and it gets decently bright… again… I wish YouTube reviewers would point some of these things out instead assume that every display is equivalent to Apple’s offerings when they’re not. And in the event, you do find the 5K 27″ displays from other manufacturers they’re still at 60Hz. The refurbished Studio Display I had my eye on from Apple is no longer available, so I’ll be waiting for a bit until they stock another one… or maybe they’ll get a heavily marked down Pro Display XDR…. in the meantime, I’m stuck using one of my gaming monitors which has 240Hz and strobing to reduce ghosting, which does work on macOS!.. and makes macOS look so different since I’m used to how it looks with all of the ghosting all the time.

Another little something that’s rarely discussed: the nano-texture glass option causes a slight ‘frosting’ which is especially noticeable on text… it’s only meant as a compromise if you’re working in a literal sun room, sometimes more expensive does not mean better. This is exemplified with the M2 MacBook Air situation: if you opted for the superior GPU it ended up running more slowly because of the thermal throttling so the lower-end GPU option is more performant, lol. Of course Apple doesn’t always disclose these caveats or finer details, but their divisions responsible for publishing the products may not be privy to them.

Peripherals:

Onto the peripherals: I will indeed be using the Magic Mouse… before your jaw drops and you grab some tomatoes while calling me a heretic, let me explain. The Magic Mouse is one of the few peripherals with velocity sensitive 360º scrolling AND fully integrated in the UI of the operating system. This is extraordinarily similar and analogous to IBM’s ScrollPoint which also offered dynamic 360º scrolling and to a lesser extent the TrackPoint scrolling but which only offers vertical and horizontal. Needless to say 360º scrolling and horizontal scrolling is something I use all the time and cannot fathom why we still even have (notched!) mouse wheels. It’s bizarrely a mouse Apple seemingly designed specifically for me and nobody else, I imagine average or larger hands would be extremely uncomfortable with it and Apple really should offer a larger version to encompass a better demographic.

Men & Mice

The Magic Mouse and ScrollPoint Pro share very similar design philosophies in the way we scroll. I also made another strange discovery when I was looking for some more flat slick mousepads since the Magic Mice don’t work well on cloth ones at all, and that are these 3M ‘Precise’ mouse pads: AMAZON LINK.

Apparently the 3M mouse pads have a reflective material which allows the lasers to use less strength and thus supposedly saves 50% battery life, some Magic Mouse users affirmed this, so we’ll see how this goes down. It’s kind of surprising I’ve never heard of any tech reviewers mention these because saving 50% of battery life on a wireless mouse is huge.

Keyboards..

There’s a lot of good reasons to NOT use Bluetooth keyboards due to wireless keylogging, there’s not going to be anyone with that talent in rural Canada so I’m in the clear. You could buy a Matias keyboard but they’re actually worse in many aspects than the 1st party Apple keyboards: the legend printing is of dramatically worse quality, the surface of the keycaps don’t have that special velvety texture, and the snappiness of the scissor switches is probably worse. While I have many mechanical keyboards, I don’t care so much about it anymore. The Apple Magic Keyboard is just a little bit too flat for my tastes today so I ordered a “ESC Flip PRO Computer Keyboard Stand”, which can stick on the back and give you different height adjustments if needed.

onboard LEDs everywhere.

Both the iLok and Sonnet NVMe card have so many LEDs on them you can see the lightshow through the rear of the ‘grater’ now.

Now my plans are to use this thing for at least a decade to get my money’s worth: will 64GB of RAM be enough? To that I say: 64GB ought to be enough for anybody. The only major hindrance will be the forced software obsolescence when the Apple overlords declare it will not be receiving anymore updates… and then you know things like the Roland Cloud and other major vendor software will cease to get updates and functionally work. It’s appalling at how all software is heavily DRMed and requires a live account to work against. At the very least when WWIII breaks out I’ll have plenty of premium aluminum to donate to the state, forged by Tim Apple himself!

For the record I was never really an ‘Apple person’, but they’ve finally fixed all of the problems (mice have two buttons and the keyboard layout is restored to be more IBM-like) and made a product that fulfills everything I’ve ever wanted… AND forced developers to program for ARM: so now my Stallman-not-approved-absolutely-proprietary audio software runs incredibly well on a non-x86 platform. Astounding. Yeah there were some 3rd party mice that had two buttons for Macs ‘back in the day’ but a good portion of the *software* and games weren’t programmed for a real right click rendering it useless. I remember watching a ‘making-of’ video of the Myst developers pushing down Ctrl with the mouse to right click EVERY SINGLE TIME in their 3D modelling software and nearly fell off my chair… it’s quite jarring when you need to press a button on the keyboard at the same time with clicking the mouse so I’ve no idea how they tolerated that. Maybe they loved doing it? Who knows.

It’s crazy how much changes, and how much is the same

So everyone is going nuts for free VMware Workstation

I haven’t had a moment to really look into it, but Mathew Duggan‘s take on the whole experience: The Worst Website In The Entire World pretty much sums it up the whole thing.

On the plus side this makes things like adding VMnetworks and configuring them all possible for lowly end user people like me, and makes stuff like SNA networking a bit more easier.

But what is the catch?

There is always a catch!

I have a feeling this is just like when Java became ‘free’. And then followed the audits and lawsuits. Most of this legal strong arming is all under NDAs so don’t expect to find much but ask anyone who’d used Oracle, and balked at some terms, and found themselves under a JDK download audit.

The free version is available for non-commercial, personal and home use. We also encourage students and non-profit organizations to benefit from this offering.

Commercial organizations require commercial licenses to use Workstation Player.

And there is the trap.

Download at home, and away from work.

You’ve been warned.