MinGW-w64

Well after yesterdays x86_64 excitement, I figured I’d take a look in the windows area to see about the 32bit vs 64bit performance on Windows…

I know this isn’t the ‘best’ test as I’m using VMWare Fusion on OS X and running Windows XP x64 sp2 (the 2007 edition, not the 2003 one).

If anyone wants to know how to get a 64bit gcc on windows, this is the best formula I’ve come up with so far.

First download a build of mingw-w64-bin_x86_64 from sourceforge… Right now I’m using a ‘Personal Build’ from sezero_20101003, because… it’s recent, and I would imagine a personal build has some hope of actually working.

I downloaded that, and extracted to the root of the C drive. Then I renamed the mingw64 to mingw.

Next up, I downloaded and installed MSYS 1.0.11, and installed that. I selected the default options, and of course specified my mingw is installed in

C:/mingw

After that, I install the MSYS DTK 1.0, again with default options.

The final part is some kind of editor, I like VIM but it’s… involved to download as the default package ‘vim-7.2-1-msys-1.0.11-bin.tar.lzma’ is in a 7zip compatible archive that needs a bit of tweaking to get a tar file out of. I can provide it here in gzip format, that you can simply extract within the msys command prompt in the /usr directory.

Now with all that done, you should be in business!

$ gcc -v
Using built-in specs.
Target: x86_64-w64-mingw32
Configured with: ../gcc44-svn/configure –host=x86_64-w64-mingw32 –target=x86_6
4-w64-mingw32 –disable-multilib –enable-checking=release –prefix=/mingw64 –w
ith-sysroot=/mingw64 –enable-languages=c,c++,fortran,objc,obj-c++ –enable-libg
omp –with-gmp=/mingw64 –with-mpfr=/mingw64 –disable-nls –disable-win32-regis
try
Thread model: win32
gcc version 4.4.5 20101001 (release) [svn/rev.164871 – mingw-w64/oz] (GCC)

 

But will it run Dungeon?

What is also cool, is that this build of mingw includes gfortran, which is a Fortran 95 compiler with various 2003 & 2008 enhancements. So for the heck of it, I’ve rebuilt the makefile from dungeon-2.5.6 and tweaked the machdep.f to at least call the ITIME function to get the current time. The resulting archive runs pretty well!

Windows XP x64 - dungeon

Yes, it runs! And without a *32 meaning this is a 64bit binary!

 

Onward with SIMH

So going back to SIMH as my benchmark, here is the vax780 with -O0/-O0

Dhrystone(1.1) time for 500000 passes = 18
This machine benchmarks at 27777 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 20
This machine benchmarks at 25000 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 19
This machine benchmarks at 26315 dhrystones/second

Which comparing it to the native x86_64 build is pretty good considering I’m running this in a VM (VMware Fusion!). Now the same test with -O1/-O1

Dhrystone(1.1) time for 500000 passes = 12
This machine benchmarks at 41666 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 12
This machine benchmarks at 41666 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 12
This machine benchmarks at 41666 dhrystones/second

Which is pretty good! Now for the finally with -O2/-O1

Dhrystone(1.1) time for 500000 passes = 12
This machine benchmarks at 41666 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 12
This machine benchmarks at 41666 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 12
This machine benchmarks at 41666 dhrystones/second

Which is interesting in that there is no appreciable difference in the -O2/-O1 vs the -O1/-O1 build. Although I kind of expect different results on a native machine. If anyone else cares to test, I’m going to make available the whole project here. This includes the source and the pre-built binaries.

Unzip it on a win64/win32 machine and it should be somewhat straightforward to build / run. You can alter the makefile and change the primary CC flags from O0 to O1 or O2 if you so wish… just run make and it’ll generate a vax780.exe . Then in the test directory you can bench your exe like this:

$ make
gcc -O0 -c -o VAX/vax_cpu.o VAX/vax_cpu.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64 -DU
SE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o VAX/vax_cpu1.o VAX/vax_cpu1.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64 –
DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o VAX/vax_fpa.o VAX/vax_fpa.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64 -DU
SE_ADDR64 -I VAX -I PDP11
VAX/vax_fpa.c: In function ‘op_cmpfd’:
VAX/vax_fpa.c:210: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c:210: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c:211: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c:211: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c: In function ‘op_cmpg’:
VAX/vax_fpa.c:233: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c:233: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c:234: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c:234: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c: In function ‘vax_fadd’:
VAX/vax_fpa.c:371: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c:373: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c:386: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c: In function ‘unpackd’:
VAX/vax_fpa.c:525: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c:525: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c: In function ‘unpackg’:
VAX/vax_fpa.c:540: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c:540: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c: In function ‘norm’:
VAX/vax_fpa.c:548: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c:548: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c:548: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c:549: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c:549: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c:557: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c: In function ‘rpackfd’:
VAX/vax_fpa.c:574: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c:575: warning: integer constant is too large for ‘long’ type
VAX/vax_fpa.c: In function ‘rpackg’:
VAX/vax_fpa.c:597: warning: integer constant is too large for ‘long’ type
gcc -O0 -c -o VAX/vax_cis.o VAX/vax_cis.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64 -DU
SE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o VAX/vax_octa.o VAX/vax_octa.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64 –
DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o VAX/vax_cmode.o VAX/vax_cmode.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64
-DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o VAX/vax_mmu.o VAX/vax_mmu.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64 -DU
SE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o VAX/vax_sys.o VAX/vax_sys.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64 -DU
SE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o VAX/vax_syscm.o VAX/vax_syscm.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64
-DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o VAX/vax780_stddev.o VAX/vax780_stddev.c -I. -DVM_VAX -DVAX_780 -DU
SE_INT64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o VAX/vax780_sbi.o VAX/vax780_sbi.c -I. -DVM_VAX -DVAX_780 -DUSE_INT
64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o VAX/vax780_mem.o VAX/vax780_mem.c -I. -DVM_VAX -DVAX_780 -DUSE_INT
64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o VAX/vax780_uba.o VAX/vax780_uba.c -I. -DVM_VAX -DVAX_780 -DUSE_INT
64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o VAX/vax780_mba.o VAX/vax780_mba.c -I. -DVM_VAX -DVAX_780 -DUSE_INT
64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o VAX/vax780_fload.o VAX/vax780_fload.c -I. -DVM_VAX -DVAX_780 -DUSE
_INT64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o VAX/vax780_syslist.o VAX/vax780_syslist.c -I. -DVM_VAX -DVAX_780 –
DUSE_INT64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o PDP11/pdp11_rl.o PDP11/pdp11_rl.c -I. -DVM_VAX -DVAX_780 -DUSE_INT
64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o PDP11/pdp11_rq.o PDP11/pdp11_rq.c -I. -DVM_VAX -DVAX_780 -DUSE_INT
64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o PDP11/pdp11_ts.o PDP11/pdp11_ts.c -I. -DVM_VAX -DVAX_780 -DUSE_INT
64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o PDP11/pdp11_dz.o PDP11/pdp11_dz.c -I. -DVM_VAX -DVAX_780 -DUSE_INT
64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o PDP11/pdp11_lp.o PDP11/pdp11_lp.c -I. -DVM_VAX -DVAX_780 -DUSE_INT
64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o PDP11/pdp11_tq.o PDP11/pdp11_tq.c -I. -DVM_VAX -DVAX_780 -DUSE_INT
64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o PDP11/pdp11_xu.o PDP11/pdp11_xu.c -I. -DVM_VAX -DVAX_780 -DUSE_INT
64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o PDP11/pdp11_ry.o PDP11/pdp11_ry.c -I. -DVM_VAX -DVAX_780 -DUSE_INT
64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o PDP11/pdp11_cr.o PDP11/pdp11_cr.c -I. -DVM_VAX -DVAX_780 -DUSE_INT
64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o PDP11/pdp11_rp.o PDP11/pdp11_rp.c -I. -DVM_VAX -DVAX_780 -DUSE_INT
64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o PDP11/pdp11_tu.o PDP11/pdp11_tu.c -I. -DVM_VAX -DVAX_780 -DUSE_INT
64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o PDP11/pdp11_hk.o PDP11/pdp11_hk.c -I. -DVM_VAX -DVAX_780 -DUSE_INT
64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o PDP11/pdp11_io_lib.o PDP11/pdp11_io_lib.c -I. -DVM_VAX -DVAX_780 –
DUSE_INT64 -DUSE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o scp.o scp.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64 -DUSE_ADDR64 -I VAX
-I PDP11
scp.c:470: warning: integer constant is too large for ‘long’ type
scp.c:470: warning: integer constant is too large for ‘long’ type
scp.c:470: warning: integer constant is too large for ‘long’ type
scp.c:470: warning: integer constant is too large for ‘long’ type
scp.c:471: warning: integer constant is too large for ‘long’ type
scp.c:471: warning: integer constant is too large for ‘long’ type
scp.c:471: warning: integer constant is too large for ‘long’ type
scp.c:471: warning: integer constant is too large for ‘long’ type
scp.c:472: warning: integer constant is too large for ‘long’ type
scp.c:472: warning: integer constant is too large for ‘long’ type
scp.c:472: warning: integer constant is too large for ‘long’ type
scp.c:472: warning: integer constant is too large for ‘long’ type
scp.c:473: warning: integer constant is too large for ‘long’ type
scp.c:473: warning: integer constant is too large for ‘long’ type
scp.c:473: warning: integer constant is too large for ‘long’ type
scp.c:473: warning: integer constant is too large for ‘long’ type
scp.c:474: warning: integer constant is too large for ‘long’ type
scp.c:474: warning: integer constant is too large for ‘long’ type
scp.c:474: warning: integer constant is too large for ‘long’ type
scp.c:474: warning: integer constant is too large for ‘long’ type
scp.c:475: warning: integer constant is too large for ‘long’ type
scp.c:475: warning: integer constant is too large for ‘long’ type
scp.c:475: warning: integer constant is too large for ‘long’ type
scp.c:475: warning: integer constant is too large for ‘long’ type
scp.c:476: warning: integer constant is too large for ‘long’ type
scp.c:476: warning: integer constant is too large for ‘long’ type
scp.c:477: warning: integer constant is too large for ‘long’ type
scp.c:477: warning: integer constant is too large for ‘long’ type
scp.c:478: warning: integer constant is too large for ‘long’ type
scp.c:478: warning: integer constant is too large for ‘long’ type
scp.c:479: warning: integer constant is too large for ‘long’ type
scp.c:479: warning: integer constant is too large for ‘long’ type
gcc -O0 -c -o sim_console.o sim_console.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64 -DU
SE_ADDR64 -I VAX -I PDP11
gcc -O0 -c -o sim_fio.o sim_fio.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64 -DUSE_ADDR6
4 -I VAX -I PDP11
gcc -O0 -c -o sim_timer.o sim_timer.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64 -DUSE_A
DDR64 -I VAX -I PDP11
gcc -O0 -c -o sim_sock.o sim_sock.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64 -DUSE_ADD
R64 -I VAX -I PDP11
gcc -O0 -c -o sim_tmxr.o sim_tmxr.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64 -DUSE_ADD
R64 -I VAX -I PDP11
gcc -O0 -c -o sim_ether.o sim_ether.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64 -DUSE_A
DDR64 -I VAX -I PDP11
gcc -O0 -c -o sim_tape.o sim_tape.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64 -DUSE_ADD
R64 -I VAX -I PDP11
gcc -O0 -c -o VAX/vax_cpu2.o VAX/vax_cpu2.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64 –
DUSE_ADDR64 -I VAX -I PDP11
gcc -O1 -c -o VAX/vax_cpu2.o VAX/vax_cpu2.c -I. -DVM_VAX -DVAX_780 -DUSE_INT64
-DUSE_ADDR64 -I VAX -I PDP11
gcc -o vax780 VAX/vax_cpu.o VAX/vax_cpu1.o VAX/vax_fpa.o VAX/vax_cis.o VAX/vax_
octa.o VAX/vax_cmode.o VAX/vax_mmu.o VAX/vax_sys.o VAX/vax_syscm.o VAX/vax780_st
ddev.o VAX/vax780_sbi.o VAX/vax780_mem.o VAX/vax780_uba.o VAX/vax780_mba.o VAX/v
ax780_fload.o VAX/vax780_syslist.o PDP11/pdp11_rl.o PDP11/pdp11_rq.o PDP11/pdp11
_ts.o PDP11/pdp11_dz.o PDP11/pdp11_lp.o PDP11/pdp11_tq.o PDP11/pdp11_xu.o PDP11/
pdp11_ry.o PDP11/pdp11_cr.o PDP11/pdp11_rp.o PDP11/pdp11_tu.o PDP11/pdp11_hk.o P
DP11/pdp11_io_lib.o scp.o sim_console.o sim_fio.o sim_timer.o sim_sock.o sim_tmx
r.o sim_ether.o sim_tape.o VAX/vax_cpu2.o -I. -DVM_VAX -DVAX_780 -DUSE_INT64 -DU
SE_ADDR64 -I VAX -I PDP11 -lwinmm -lwsock32

[email protected] /usr/src/simh
$ cd test/

[email protected] /usr/src/simh/test
$ ../vax780.exe bsd42.ini

VAX780 simulator V3.8-1
loading ra(0,0)boot
Boot
: ra(0,0)vmunix
199488+56036+51360 start 0x11a0
4.2 BSD UNIX #9: Wed Nov 2 16:00:29 PST 1983
real mem = 8384512
avail mem = 7073792
using 102 buffers containing 835584 bytes of memory
mcr0 at tr1
mcr1 at tr2
uba0 at tr3
hk0 at uba0 csr 177440 vec 210, ipl 15
rk0 at hk0 slave 0
rk1 at hk0 slave 1
uda0 at uba0 csr 172150 vec 774, ipl 15
ra0 at uda0 slave 0
ra1 at uda0 slave 1
zs0 at uba0 csr 172520 vec 224, ipl 15
ts0 at zs0 slave 0
dz0 at uba0 csr 160100 vec 300, ipl 15
dz1 at uba0 csr 160110 vec 310, ipl 15
dz2 at uba0 csr 160120 vec 320, ipl 15
dz3 at uba0 csr 160130 vec 330, ipl 15
root on ra0
WARNING: should run interleaved swap with >= 2Mb
Automatic reboot in progress…
Tue Nov 8 03:44:30 PST 1983
Can’t open checklist file: /etc/fstab
Automatic reboot failed… help!
erase ^?, kill ^U, intr ^C
# ./d2;./d2;./d2
Dhrystone(1.1) time for 500000 passes = 19
This machine benchmarks at 26315 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 19
This machine benchmarks at 26315 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 18
This machine benchmarks at 27777 dhrystones/second
# sync
# sync
# sync
#
Simulation stopped, PC: 80001629 (BNEQ 80001630)
sim> q
Goodbye

[email protected] /usr/src/simh/test
$

The d0,d1,d2 are the dhyrstone benchmark compiled with -O0, -O1, and -O2 respectively. This gives you a chance to observe various optimizations in the GCC 2.7.2.2 for the VAX.

moving hosting companies…..

I’m moving some of my hosting stuff, so there will probably be some bumps in the move… But in the meantime, I think the ‘big’ thing I have, ‘vpsland/install‘ should now be running up again..

Even while I was writing this, I saw someone download f2c!!!

It’s a scary thing to think of my work being used.. Esp regarding Fortran.

Also I’ll have to post some stuff on the x86_64 gcc for Windows… Too bad it won’t run Qemu, but there is 64bit Gfortran!

gcc on the x86_64

Today I thought I’d isolate the portion of simh that crashes 4.X BSD, when SIMH is built with GCC. I managed to find that the “op_ldpctx” and “op_mtpr” procedures of vax_cpu1.c become unstable when built with -O2 flags.

So I extracted the two procedures so I could then built the remainder of SIMH with GCC’s O2 flags. Before going too far, I first went to verify that the results would be consistent with the i386/32bit binaries…

However the results were.. interesting. By default SIMH will build with -O0 flags for the VAX giving the following dhrystone benchmark in 4.3 UWisc BSD:

Dhrystone(1.1) time for 500000 passes = 24
This machine benchmarks at 20833 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 24
This machine benchmarks at 20833 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 24
This machine benchmarks at 20833 dhrystones/second

24 seconds, not so hot. Then the same thing with -O1 flags…

Dhrystone(1.1) time for 500000 passes = 18
This machine benchmarks at 27777 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 18
This machine benchmarks at 27777 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 18
This machine benchmarks at 27777 dhrystones/second

18 seconds! pretty good, thats a 33% increase! Now for my hybrid -O2/-O1

Dhrystone(1.1) time for 500000 passes = 17
This machine benchmarks at 29411 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 17
This machine benchmarks at 29411 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 17
This machine benchmarks at 29411 dhrystones/second

A 5% increase over the -O1 … Nothing too big, but still an INCREASE.

Now for the real fun, I re-ran these bulid factors with the x86_64 compiler.

The -O0/-O0 combination came in at 19 seconds!!

Dhrystone(1.1) time for 500000 passes = 19
This machine benchmarks at 26315 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 19
This machine benchmarks at 26315 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 19
This machine benchmarks at 26315 dhrystones/second

This is pretty amazing considering everything we do should make it faster… The next thing I tried was the -O1/-O1 combination and go the following:

Dhrystone(1.1) time for 500000 passes = 12
This machine benchmarks at 41666 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 13
This machine benchmarks at 38461 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 13
This machine benchmarks at 38461 dhrystones/second

A 12.5 second average! This blew the doors off of the -O2/-O1 on the i386! Naturally I was hoping for at least a 5% increase going to the -O2/-O1 flags on the x86_64, however I got this…

Dhrystone(1.1) time for 500000 passes = 14
This machine benchmarks at 35714 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 13
This machine benchmarks at 38461 dhrystones/second
Dhrystone(1.1) time for 500000 passes = 12
This machine benchmarks at 41666 dhrystones/second

And I ran it numerous times, but the bottom line is that the O2 flags actually produces a SLOWER SIMH on OS X 10.6.4 then -O1 flags… The whole exercise is a tad perplexing to me, to say the least, but unless you bench it, you’ll never know if the compiler is getting confused somewhere, and going to throw a loop. I should also point out that SIMH will not run with full -O2 flags as there is something in the code that breaks under optimization so this very well may not hold true 100% of the time, but at the same point things still have to be tested…