Compiling rsync for VMWare ESXi 6.5

So what had started this little ‘adventure’ is that years ago there was this great sale over at Joe’s Datacenter, where I had picked up a physial server for the mere price of $20 USD a month!  What a deal!  No more quotas, CPU sharing issues and generally having to share.  Awesome!

So I have them install Debian, and load up the KVM modules, and away I go and life is good.  So foolishly years later, I jumped onto the whole container thing, installed docker, and that is where everything went south.

Every few seconds while doing a tcpdump on the 100% virtual bridge I’d see a massive influx of arp traffic.  I tried static arp’s on the host & the guest and it was ‘better’ but now the network traffic would hang.  Things like TCP would break after a minute and stuff like UDP game state would break so bad it’d end up unplayable.  This basically broke maraakate.org and hosting stuff like Quake I/ Quake II/ Daikatana and other iD based games.

My existing virtual machines now had a major issue where they would stop responding to traffic.  I never could find a fix, and it ended up with me moving my blog out to sloppy.io to keep running as a container based service until it magically stopped working and I gave up and did a full dump/reload on a hosted WordPress over at ChicagoVPS.  What a nightmare!

But what about all those virtual machines?

Well even after apt-get purge of everything docker, upgrading and downgrading the kernel nothing helped.  The VMs still dropped off the network periodically.  So with some spare time I decided to just go ahead and backup the box, and wipe the machine.

Since the physical network was working fine I was able to rsync the 300GB+ worth of data over the span of a few days.  That was fine, and considering how crap the server had been, I figured some straight downtime wouldn’t hurt anything too much.  While looking for an OS to install onto the machine, I saw that Joe offers VMWare ESXi 6.5, so I thought I’d just go with that, as naturally VMWare runs both VMs and with Project Photon I could maybe mess with containers again at a later date.

Since rsync had worked so well for moving hundreds of gigabytes of data from the USA to Hong Kong, I figured it’d be trivial to just convert the existing RAW KVM/Qemu disk images back to the United States of America.  And that is where the fun begins.

Let me tell you!

While reading this great post on virten.net they drop mention of XSIBackup, which requires registration (yuck) and worse their stupid registration system is broken:

LOL WHUT?

Rest assured the email does show up!

Dear Neo Zeed, thank you for visiting 33hops.com
This is your download url http://a.33hops.com/downloads/?key=bq7l5ptPB70MJj9dkftxFegr3xWoBZwpdFPQOUC3Cm10KPSIl3v1532877224253

But of course it doesn’t work

The key is invalid or has expired, only two downloads are allowed per key, download a new one at 33hops.com

What

The

Fuck

I know this is an ongoing issue at large when you provide executable binaries on the internet, as you will no doubt get flagged with some false positive by some virus crusaiding idiot who just sets up a tool and never reads anything but sends out threatening emails.  I went through this with the need for the simple 404 redirect, all thanks to Gerhard W. Recher.

So since this wasn’t going to be an avenue to persue I dug a little deeper and ran across this post over at virtuallyGhetto.  So it turns out the userland for ESXi is a CentOS environment that uses busybox.  And if you just download and install CentOS 3.9 into a VM, and build whatever you want, and ideally add in the -static flag, and copy it over.  For those who want to look into more ‘internals’ of the userland, check out zemris.fer.hr.

Great!

Things like UID/GUID mappings are broken in the libc it seems among other things.  So for my simple rsync config I just put the numbers in myself.

uid = 0
gid = 0
use chroot = no
max connections = 4
syslog facility = local5
pid file = /var/run/rsyncd.pid
hosts allow = a.b.c.d

[datastore1]
path = /vmfs/volumes/datastore1
comment = WDC_WD5000AAJS2D00A8B0
read only = no

I have read that you really ought to keep your binaries/config on the datastore so they aren’t subject to upgrades overwriting them and other chaotic stuff.  So editing the file “/etc/rc.local.d/local.sh” I just added the following before the exit 0:

/vmfs/volumes/datastore1/tools/rsync –daemon –log-file=/tmp/rsync.log –config=/vmfs/volumes/datastore1/tools/rsyncd.conf

And then ran it manually to kick it off.

So now I don’t have to rely on someone’s elses broken downoad system, and now we can build other fun native stuff.

And the best part is that after all of this fighting Maraakate’s site is back online and I get this message from him:

holy crap that new server setup so much better
its like playing it locally honest to god
played a whole unit not a single fuck up
no rubber banding lag effect or any of that

So now things are actually performing better on VMWare than we were getting on KVM.  And yes I had flattened out the disk images, loaded up the paravirtual disk & network drives on KVM, but VMWare does such a surprisingly better job.

I honestly wasn’t expecting that.

And as a bonus, I messed with qemu-0.9.0 (I didn’t feel the need to go through glibc2 hell), and qemu-img works great with a simple raw to vmdk

[root@jdc-user:/vmfs/volumes/5b5806fd-339444da-f897-003048d70598] ./tools/qemu-i
mg.static info win30.raw
image: win30.raw
file format: raw
virtual size: 32M (33554432 bytes)
disk size: 32M
[root@jdc-user:/vmfs/volumes/5b5806fd-339444da-f897-003048d70598] ./tools/qemu-i
mg.static convert -f raw -O vmdk win30.raw win30.vmdk
[root@jdc-user:/vmfs/volumes/5b5806fd-339444da-f897-003048d70598] ./tools/qemu-i
mg.static info win30.vmdk
image: win30.vmdk
file format: vmdk
virtual size: 32M (33554432 bytes)
disk size: 27M

And it boots!

Transcopied Windows 3.0 VM

So yes, wrapping up you can in fact run stuff on ESXi, copy data, and even convert disk images.

Oh yeah, and so people can deal with my 404 based download system (the password is on the 404 page)

Let the games begin!

Qemu now supports the HPPA in softmmu mode!

I’ve worked on machines with HP-UX, but never owned one.  Well Qemu now has system emulation thanks to Richard Henderson! You can find information over at:

https://parisc.wiki.kernel.org/index.php/Qemu

Being the unfair person I am, I thought I’d try NeXTSTEP to see how far it gets.

Processor   Speed            State           Coprocessor State  Cache Size
---------  --------   ---------------------  -----------------  ----------
0      250 MHz    Active                 Functional            0 KB

Available memory: 512 MB
Good memory required: 16 MB

Primary boot path: FWSCSI.6.0
Alternate boot path: LAN.0.0.0.0.0.0
Console path: SERIAL_1.9600.8.none
Keyboard path: PS2

Available boot devices:
1. DVD/CD [lsi 00:00.0 2:0 Drive QEMU QEMU CD-ROM 2.5+]
2. lsi 00:00.0 0:0 Drive QEMU QEMU HARDDISK 2.5+

Booting from lsi 00:00.0 0:0 Drive QEMU QEMU HARDDISK 2.5+

Booting...
Boot IO Dependent Code (IODC) revision 153

HARD Booted.
Can't determine I/O subsystem type

NEXTSTEP boot v3.3.4.17
524288 memory

NEXTSTEP will start up in 10 seconds, or you can:
Type -v and press Return to start up NEXTSTEP with diagnostic messages
Type ? and press Return to learn about advanced startup options
Type any other character to stop NEXTSTEP from starting up automatically

boot:

And amazingly the bootloader works, although that is about it. Trying to boot up OpenBSD gets about this far:

PDC_CHASSIS: Initialize (3), CHASSIS  cec0
>> OpenBSD/hppa CDBOOT 0.2
booting dk0a:/bsd.rd: 2703360+851960+2675712+547840=0x8631f0


SeaBIOS: Unimplemented PDC_CACHE function 1 8ddad0 1 1 1

I found on Windows though that the Debian 8 CD’s work the best, as the earlier ones lock up after loading a kernel, and the later one doesn’t fully initialize.  I’ve been using this one: debian-8.0-hppa-NETINST-1.iso  Serial console interaction is the way to go, so I ran Qemu like this:

qemu-system-hppa.exe -L . -serial telnet:127.0.0.1:4444,server,wait -boot d -cdrom debian-8.0-hppa-NETINST-1.iso -hda Deiban8HPPA.vmdk

So this way I can get get the install kicked off.  Although I should probably have just downloaded debian-8.0-hppa-CD-1.iso

Linux on HPPA

Otherwise, yeah, it’s Linux, on HPPA

And for anyone who is interested, the only version of HP-UX I have hanging around, HP-UX 10.20 [HP9000 S700] gives me the following:

HP-UX 10.20 on Qemu

IRC necromancy

I’m xorhash, a guest poster, here to talk about my tale going down a trip on the memory lane with QuakeNet’s service bot Q. If you’re not interested in IRC, you can probably skip this one.

On the Trails of Q

As far as I know, QuakeNet’s service bot Q went through these three major codebases:

a. the old Perl Q,
b. the first version written in C, and
c. Q as part of newserv.

There’s a reason I didn’t have anything to link for (a). That’s because to the best of my knowledge and research, no version has survived these past decades.

As for (b), it seems only the linked version 3.99 from the year 2003 was saved.
The CVS repository and thus commit history has been lost.

If anyone has either actual code for the old Perl Q or the CVS repo for the old
Q written in C, please reach out to me via `xorhash ++at++ protonmail.com’.
I’m most interested in looking through it.

However, not all hope was lost with the old Perl Q. As it turns out, most likely, the old Perl Q was actually based on an off-the-shelf product called “CServe”. What makes me think so?

Let’s take a look at [the QuakeNet Q command listing from 1998.

I picked the command “WHOIS” and googled its use “Will calculate a nick!user@host mask for you from the whois information of this nick.” This lead me to a help file for StarLink IRC. At the top, it reads:

CStar3.x User Command Help File **** 09/10/99
Information extracted from CServe Channel Service
Version 2.0 and up Copyright (c)1997 by Michael Dabrowski
Help Text (c)1997 (c)1997 StarLink-IRC (with permission)

Wait a second, “CServe Channel Service”? I know that from somewhere.

[email protected]

So the commands between that help file and the QuakeNet Q command listing match up and so does Q’s host today. Most likely, I’m on the right track with this. What’s left is to track down a copy of CServe.

Note: I’ve been on the old Perl Q for a while and this strategy didn’t use to work. It seems Google newly indexed these pages. For once I can sincerely say: Thank you, Google.

I found that CServe was hosted on these websites:

a. Version 3.0 on http://www.cs.cuc.edu/~mdabrows/cserve/,
b. Version 3.1 on http://www.wam.umd.edu/~devy/cserve/,
c. Version 4.0 on http://www.othernet.org/devon/cserve/, and
d. Version 5.0 and above on http://www.ircore.com/.

The only surviving versions are 3.0 and 5.1. CServe got renamed to “CS” starting with 5.0 and was rewritten in C by someone other than the original CServe author, going by the comments in the file header of CS5.1 `src/show_access.c’. CS was actually sold as a commercial product. I wonder how many people bought it.

QuakeNet most likely took a version between 2.0 and 4.0, inclusive, as the basis for the old Perl Q. Which one in particular it was, we may never know. If you have any details, please reach out to me at the e-mail address above.

I can’t make any clever guesses anyway since the only versions that the web archive has are 3.0 and 5.1. The latter is written in C, so it quite obviously can’t be the old Perl Q.

Making It Run

So now that I have CServe 3.0, I wanted to actually see it running.

There are three ways to reasonably accomplish this:

a. port CServe to a modern IRCd’s server-to-server protocol,
b. port an old IRCd to a modern platform,
c. emulate an old platform and run both IRCd and CServe there.

I chose option (b).
Once upon a time, I did option (a) for the old UnderNet X bot. It was a very painful exercise to port a bot that predates the concept of UIDs (or numeric nicks/numnicks as ircu’s P10 server-to-server protocol calls them). There’s nothing too exciting about doing (c) by just emulating a 486 or so and FreeBSD, just sounds like a boring roundtrip of emulation and network bridging.

Fortunately, the author was a nice person and wrote on the CServe website that version 3.0 requires “ircu2.9.32 and above”.

It seems the ircu2.10 series followed right after ircu2.9.32. While I’m sure there’s some linking backwards compatibility, determining which ircu in the ircu2.10 series still spoke enough P09 to link with CServe sounded like an exercise in boring excruciating pain. Modern-day ircu most certainly no longer speaks P09. Besides, what’s the fun in just doing the manual equivalent of `git bisect’?

So after grabbing ircu2.9.32, I tried to just straightforward compile and run it.

There’s a `Config’ script that’s supposed to be kind of like autoconf `configure’, but I’ve found it extremely non-deterministic. It generates `include/setup.h’. I’ve made a diff for your convenience. It targets Debian stable, and should work with any reasonably modern Linux. There are special `#ifdef’ branches for  FreeBSD/NetBSD in the code. This patchset may break for BSDs in general.

Do not touch `Config’, meddle with `include/setup.h’ manually. Remember this is an ancient IRCd, there are actual tunables in `include/config.h’.

The included example configuration file is correct for the most part, but the documentation on U:lines is wrong. U:lines do what modern-day U:lines do, i.e., designate services servers with uber privileges.

U:cserve.mynetwork.example:*:*

Excuse Me, But What The Fuck?

Of course, I’m dealing with old code. It wouldn’t be old code if I didn’t have some things that just make me go “Excuse me, but what the fuck?”

Looping at the speed of light

aClient *find_match_server(mask)
char *mask;
{
  aClient *acptr;
  if (BadPtr(mask))
    return NULL;
  for (acptr = client, (void)collapse(mask); acptr; acptr = acptr->next) 
  {
  if (!IsServer(acptr) && !IsMe(acptr))
    continue;
    if (!match(mask, acptr->name))
      break;                                                                                                    continue;
  }
  return acptr;
}

See that `continue’ way on the left? What is it doing there? Telling the compiler to loop faster?

Carol of the Old Varargs

So apparently some of this code predates C89. Which means it uses old-style declarations, but that’s okay. It also uses old-style varargs, which is adorable.

The hacks around not even that being there are adorable, too:

#ifndefUSE_VARARGS
/*VARARGS*/
voidsendto_realops(pattern, p1, p2, p3, p4, p5, p6, p7)
char*pattern, *p1, *p2, *p3, *p4, *p5, *p6, *p7;
{
#else
voidsendto_realops(pattern, va_alist)
char*pattern;
va_dcl
{
  va_list vl;
#endif
  Reg1 aClient *cptr;
  Reg2 int i;
  char fmt[1024];
  Reg3 char *fmt_target;

#ifdef USE_VARARGS
  va_start(vl);
#endif

  (void)sprintf(fmt, ":%s NOTICE ", me.name);
  fmt_target = &fmt[strlen(fmt)];

  for (i = 0; i <= highest_fd; i++)
if ((cptr = local[i]) && IsOper(cptr))
  {
  strcpy(fmt_target, cptr->name);
  strcat(fmt_target, " :*** Notice -- ");
  strcat(fmt_target, pattern);
  #ifdef USE_VARARGS
  vsendto_one(cptr, fmt, vl);
  #else
  sendto_one(cptr, fmt, p1, p2, p3, p4, p5, p6, p7);
  #endif
  }
#ifdef USE_VARARGS
va_end(vl);
#endif
return;
}

These functions were declared like this (the example chosen above actually has
no declaration because why not):

/*VARARGS1*/
extern    void    sendto_ops();

Whatcmp

There are `mycmp’ and `myncmp’ for doing RFC1459 casemapping string comparisons. `strcasecmp’ got `#define’d to `mycmp’, but in one case `mycmp’ got `#define’d back to `strcasecmp’. It seemed easier to just remove `mycmp’, replacing it with `strcasecmp’ and forgo RFC1459 casemapping. This is doubly useful because CServe doesn’t actually honor RFC1459 casemapping.

Waiting for the Cookie

ircu uses PING cookies. I was rather confused when I didn’t get one immediately after sending `NICK’ and `USER’. In fact, it took so long that I thought the IRCd got stuck in a deadloop somewhere. That would’ve been a disaster since the last thing I wanted to do is get up close and personal with the networking stack.

As it turns out, it can’t send the cookie:

/*
 * Nasty.  Cant allow any other reads from client fd while we're
 * waiting on the authfd to return a full valid string.  Use the
 * client's input buffer to buffer the authd reply.
 * Oh. this is needed because an authd reply may come back in more
 * than 1 read! -avalon
 */

Nasty indeed.

I lowered `CONNECTTIMEOUT’ to 10 in the diff linked above. This makes the wait noticeably shorter when you aren’t running an identd.

CServe Isn’t Much Better

Not that CServe is much better. I have to hand it to Perl, I only needed to undo the triple-`undef’ on line 450 of `cserve.pl’ and it worked with no modifications. God bless the backwards compatibility of Perl 5.

That said, it has its own interesting ideas of code. This is the main command execution:

foreach $i (keys %commands)
{
    if($com eq $i)
    { $found = 1; break; }
}
if($found == 1)
{
    open(COMMAND, "<./include/$com");
    @evalstring = ; close(COMMAND);
    foreach $i (@evalstring) { $evals .= $i; }
    eval($evals);
}
else
{
    ¬ice("2No such command 1[4$com1]. /msg $unick SHOWCOMMANDS\n");
}

Yep, it opens, reads into an array, closes and then evals. For every command it recognizes. Of course, this means code hot swapping, but it also means terrible performance with any non-trivial amount of users.

Oh, and all passwords are hashed. But they’re hashed with `crypt()’. And a never-changing salt of ZZ.

End Result

up & running

Was it worth it?
No, not really.
Would I do it again?
Absolutely.

You probably do not want to expose this to the outside world.
The IRCd code is scary in all the wrong ways.

Further Links

Some other things if you’re into ancient IRC stuff:

Trying to debug an ancient Linux 0.11 kernel with GDB

It doesn’t hit on the breakpoints for some reason.  I’m most likely doing something wrong.

GNU gdb 4.17
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type “show copying” to see the conditions.
There is absolutely no warranty for GDB. Type “show warranty” for details.
This GDB was configured as “–host=i486-pc-netbsd –target=i386-linux-gnuaout”…
Setting up the environment for debugging gdb.
Breakpoint 1 at 0x94a4: file panic.c, line 18.
Breakpoint 2 at 0x667b: file init/main.c, line 110.
tcp_open ! localhost:1234
0xfff0 in sys_unlink () at memory.c:430
430 }
(top-gdb)c
Continuing.

Program received signal SIGINT, Interrupt.
panic (s=0x1dd6c “”) at panic.c:23
23 for(;;);
(top-gdb)bt
#0 panic (s=0x1dd6c “”) at panic.c:23
#1 0xd9b3 in mount_root () at memory.c:430
#2 0x12f39 in sys_setup (BIOS=0x1ab38) at hd.c:157
#3 0x7477 in system_call () at sched.c:412
#4 0x1000000 in ?? ()
(top-gdb)

But after a LOT of struggling it certainly looks about right.

Linux 0.11 kernel panic on mounting root

I then went ahead and built GDB 8.0.1, and it’s incredibly slow, and I guess not too compatible with Qemu 0.9 (I had issues with newer builds though) but it does break.

(top-gdb)target remote localhost:1234
Remote debugging using localhost:1234
0x0000fff0 in sys_unlink () at memory.c:83
83 }
(top-gdb)i b
Num Type Disp Enb Address What
1 breakpoint keep y 0x000094a4 in panic at panic.c:18
2 breakpoint keep y 0x000094a4 in panic at panic.c:18
silent
return
(top-gdb)c
Continuing.

Breakpoint 1, panic (s=0xd8cf <sys_mount+227>) at panic.c:18
18 printk(“Kernel panic: %s\n\r”,s);
Reply contains invalid hex digit 79
(top-gdb)i s
Reply contains invalid hex digit 79
(top-gdb)bt
#0 0x0000d9b3 in mount_root () at memory.c:83
#1 0x0001ab60 in hd_info ()
#2 0x00000000 in ?? ()
(top-gdb)i r
eax 0x0 0
ecx 0x51 81
edx 0x1fb4c 129868
ebx 0x0 0
esp 0xffff94 0xffff94
ebp 0xffffa8 0xffffa8
esi 0x1ab60 109408
edi 0x0 0
eip 0xd9b3 0xd9b3 <mount_root+139>
eflags 0x246 [ PF ZF IF ]
cs 0x8 8
ss 0x10 16
ds 0x10 16
es 0x10 16
fs 0x17 23
gs 0x17 23
(top-gdb)

And while it more or less runs there is some issues using the GDB stub from Qemu 0.9.0, although I had a world of pain with newer versions.  And I’ve never done the kernel debug thing before so this is all new to me.

And I guess it goes as no surprise that with GDB 8, they have a.out Linux tagged as obsolete and to be removed.  I guess I need to try a GDB that was current to Qemu 0.90 so it may not have so many packet overruns and unexpected results…

Upgrading Debian 8 to version 9

More so my own use, but this article on linuxconfig.org was a LOT of help.

Basically and with no explanation the steps are :

apt-get update
apt-get upgrade
apt-get dist-upgrade

dpkg -C

apt-mark showhold

cp /etc/apt/sources.list /etc/apt/sources.jessie

sed -i ‘s/jessie/stretch/g’ /etc/apt/sources.list

apt-get update

apt list –upgradable

apt-get upgrade
apt-get dist-upgrade

aptitude search ‘~o’

And that’s it!

I also took the BBR congestion update tip from tipsforchina.com

wget –no-check-certificate https://github.com/teddysun/across/raw/master/bbr.sh

And adding the following to /etc/sysctl.conf after the “net.ipv4.tcp_congestion_control = bbr” line.

fs.file-max = 51200
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
net.core.netdev_max_backlog = 250000
net.core.somaxconn = 4096
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.ip_local_port_range = 10000 65000
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_fastopen = 3
net.ipv4.tcp_mem = 25600 51200 102400
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864
net.ipv4.tcp_mtu_probing = 1

26th anniversary of Linux!

As the joke goes:

Happy 25th birthday, Linux! Here’s your f-ing cake, go ahead and compile it yourself.

So it’s always a fun time for me to push my old project Ancient Linux on Windows.  And what makes this so special?  Well it’s a cross compiler for the ancient Linux kernels, along with source to the kernels so you can easily edit, compile and run early Linux from Windows!

As always the kernels I have built and done super basic testing on are:

  • linux-0.10
  • linux-0.11
  • linux-0.12
  • linux-0.95c+
  • linux-0.96c
  • linux-0.97.6
  • linux-0.98.6

All of these are a.out kernels, like things were back in the old days.  You can edit stuff in notepad if you so wish, or any other editor.  A MSYS environment is included, so you can just type in ‘make’ and a kernel can be built, and it also can be tested in the included Qemu.  I’ve updated a few things, first with better environment variables, and only tested on Windows 10.  Although building a standalone linux EXE still requires a bit of work, it isn’t my goal here as this whole thing is instead geared around building kernels from source.  I included bison in this build, so more of GCC is generated on the host.  Not that I think it matters too much, although it ended up being an issue doing DooM on GCC 1.39.

So for people who want to relive the good old bad days of Linux, and want to do so from the comfort of Windows, this is your chance!


Download Ancient Linux on Windows
Download Ancient Linux on Windows

Manually adding ncurses & VDE support to the Linux Qemu build

For some reason I had issues for this to automatically pick up building Qemu 2.8.0 on Ubuntu 16.10 (Which is really Debian)…

Anyways, be sure to have the needed dev components installed.  If you have a FRESH system, natrually you’ll need a lot more.

apt-get install libvdeplug-dev
apt-get install libvde-dev
apt-get install ncurses-dev

editing the file config-host.mak, I found I needed to add the following to turn on ncurses & VDE:

CONFIG_CURSES=y
CONFIG_VDE=y

And lastly add in the following libs to the libs_softmmu, to ensure it’ll link

-lncurses -lvdeplug

And now I’m good!

From my notes on flags needed to run Qemu the old fashioned way:

-net none -device pcnet,mac=00:0a:21:df:df:01,netdev=qemu-lan -netdev vde,id=qemu-lan,sock=/tmp/local/

This will join it to the VDE listening in /tmp/local

Obviously I have something more interesting and more evil going on….

Installing VMware ESXi 5.5.0 Update 3 on KVM

Well I had no luck with the boot process hanging during initialization.  I searched a little, and came across this thread, stating :

The line that says “Running inside a VM; adjusting spinout timeout to 180 seconds” would suggest that KVM implements enough of our backdoor interface to make it look like we’re running under a VMware hypervisor.  When we’re running in this environment, we use the backdoor to get the host TSC frequency.  I suspect that KVM doesn’t implement the “GETMHZ” backdoor call, so we are confused about the TSC frequency.  The 30ms delay turns into … 30 hours?  30 years?

So they had a source code change for QEMU 1.7.0, however it obviously doesn’t work in 2.x.  It was rolled up stream, and then made into a switch to disable with a simple flag to add into the command line.

-machine vmport=off

So with that set I ran the following:

kvm -vnc 0.0.0.0:1 -cpu host \
-machine vmport=off \
-m 4096M \
-smp cpus=2 \
-drive file=esx-1.qcow2,if=ide,index=0,media=disk \
-serial telnet:127.0.0.1:5001,server,nowait \
-monitor tcp:127.0.0.1:6001,server,nowait \
-cdrom /root/VMware-VMvisor-Installer-5.5.0.update03-3116895.x86_64.iso -boot d \
-net none \
-device vmxnet3,mac=00:2e:3c:92:26:00,netdev=esx-0 \
-device vmxnet3,mac=00:2e:3c:92:26:01,netdev=esx-1 \
-device e1000,mac=00:2e:3c:92:26:02,netdev=esx-2 \
-device e1000,mac=00:2e:3c:92:26:03,netdev=esx-3 \
-netdev socket,id=esx-0,udp=127.0.0.1:10000,localaddr=127.0.0.1:20000 \
-netdev socket,id=esx-1,udp=127.0.0.1:10001,localaddr=127.0.0.1:20001 \
-netdev socket,id=esx-2,udp=127.0.0.1:10002,localaddr=127.0.0.1:20002 \
-netdev socket,id=esx-3,udp=127.0.0.1:10003,localaddr=127.0.0.1:20003

And now I can boot up, and install VMWare!

ESXi 5.5.0 on Qemu KVM

By default you will not be permitted to start any virtual machine.  To get around this you have to enable VMWare to run nested.
Add the following to /etc/vmware/config under ESX:

vmx.allowNested=TRUE

And then you are good to go!

VM running on nested ESXi 5.5.0

Running VMWare ESXi 6.5 under Linux/KVM!

So with VIRL in hand, the next thing I wanted to do was play with some LACP, and VMWare ESX.  Of course the best way to do this is under KVM as you can use UDP to bounce packets around between virtual machines, like the VIRL L2 switch.  I went ahead and fired up 5.5 and got this nice purple screen of death.

Purple screen of death!

So naturally I need to force the processor type.  Also after reading a few sites, I needed to turn on a nested & ignore_msrs settings:

root@ubuntu:/etc/modprobe.d# cat qemu-system-x86.conf

options kvm_amd nested=1
options kvm ignore_msrs=1

Naturally if you are using an Intel processor the statements need to reflect that.  All being well you will see something like this in your log file:

Mar 7 11:34:38 ubuntu kernel: [ 14.802132] kvm: Nested Virtualization enabled
Mar 7 11:34:38 ubuntu kernel: [ 14.802134] kvm: Nested Paging enabled

I got a little further trying to install VMWare ESXi 5.5 update 3, however it just hangs on Intitializing timing…

vMWare 5.5.0 update 3 hanging

(I did later solve the 5.5 problem in a follow up here!)

After going nowhere with that, I went ahead and downloaded VMWare ESXi 6.5 which as of today is the latest version, and that installed just fine!

ESXi 6.5.0 running under KVM

For anyone brave or crazy enough to think about reproducing this, here is my install command line (yes Im doing this old school way on purpose)

kvm -vnc 0.0.0.0:1 -cpu host \
-machine pc-i440fx-2.1 \
-m 4096M \
-smp cpus=2 \
-boot order=d \
-drive file=esx-1.qcow2,if=ide,index=0,media=disk \
-serial telnet:127.0.0.1:5001,server,nowait \
-monitor tcp:127.0.0.1:6001,server,nowait \
-net none \
-device vmxnet3,mac=00:2e:3c:92:26:00,netdev=esx-0 \
-device vmxnet3,mac=00:2e:3c:92:26:01,netdev=esx-1 \
-device vmxnet3,mac=00:2e:3c:92:26:02,netdev=esx-2 \
-device vmxnet3,mac=00:2e:3c:92:26:03,netdev=esx-3 \
-netdev socket,id=esx-0,udp=127.0.0.1:10000,localaddr=127.0.0.1:20000 \
-netdev socket,id=esx-1,udp=127.0.0.1:10001,localaddr=127.0.0.1:20001 \
-netdev socket,id=esx-2,udp=127.0.0.1:10002,localaddr=127.0.0.1:20002 \
-netdev socket,id=esx-3,udp=127.0.0.1:10003,localaddr=127.0.0.1:20003 \
-cdrom VMware-VMvisor-Installer-5.5.0.update03-3116895.x86_64.iso \
-boot d

As you can see it really isn’t that involved, well once you get the formatting to make some sense.  And to run it normally I run it something like this:

kvm -vnc 0.0.0.0:1 -cpu host \
-machine pc-i440fx-2.1 \
-m 4096M \
-smp cpus=2 \
-drive file=esx-1.qcow2,if=ide,index=0,media=disk \
-serial telnet:127.0.0.1:5001,server,nowait \
-monitor tcp:127.0.0.1:6001,server,nowait \
-net none \
-device vmxnet3,mac=00:2e:3c:92:26:00,netdev=esx-0 \
-device vmxnet3,mac=00:2e:3c:92:26:01,netdev=esx-1 \
-device vmxnet3,mac=00:2e:3c:92:26:02,netdev=esx-2 \
-device vmxnet3,mac=00:2e:3c:92:26:03,netdev=esx-3 \
-netdev socket,id=esx-0,udp=127.0.0.1:10000,localaddr=127.0.0.1:20000 \
-netdev socket,id=esx-1,udp=127.0.0.1:10001,localaddr=127.0.0.1:20001 \
-netdev socket,id=esx-2,udp=127.0.0.1:10002,localaddr=127.0.0.1:20002 \
-netdev socket,id=esx-3,udp=127.0.0.1:10003,localaddr=127.0.0.1:20003

So it’s basically the same, just no mounted CD-ROM image.  Now this is all fun, but what about networking?  As I had mentioned before, I bought a VIRL license, which includes a l2 Catalyst image, so why not use that, instad of a ‘traditional’ Linux bridge?  Sure!  In this example I’m going to connect the 4 ethernet ports from the ESXi into the first 4 ports on the cisco switch, with the last port connecting to a Linux bridge, that I then route to, as I wanted all my lab crap on a seperate network.  To start the switch I use this script:

kvm \
-m 768M \
-smp cpus=1 \
-boot order=c \
-drive file=vios_l2-adventerprisek9-m.vmdk.SSA.152-4.0.55.E.qcow2,if=ide,index=0,media=disk \
-serial telnet:127.0.0.1:5000,server,nowait \
-monitor tcp:127.0.0.1:51492,server,nowait \
-net none \
-device e1000,mac=00:2e:3c:92:26:00,netdev=gns3-0 \
-device e1000,mac=00:2e:3c:92:26:01,netdev=gns3-1 \
-device e1000,mac=00:2e:3c:92:26:02,netdev=gns3-2 \
-device e1000,mac=00:2e:3c:92:26:03,netdev=gns3-3 \
-device e1000,mac=00:2e:3c:92:26:04 \
-device e1000,mac=00:2e:3c:92:26:05 \
-device e1000,mac=00:2e:3c:92:26:06 \
-device e1000,mac=00:2e:3c:92:26:07 \
-device e1000,mac=00:2e:3c:92:26:08 \
-device e1000,mac=00:2e:3c:92:26:09 \
-device e1000,mac=00:2e:3c:92:26:0a \
-device e1000,mac=00:2e:3c:92:26:0b,netdev=gns3-tap \
-netdev socket,id=gns3-0,udp=127.0.0.1:20000,localaddr=127.0.0.1:10000 \
-netdev socket,id=gns3-1,udp=127.0.0.1:20001,localaddr=127.0.0.1:10001 \
-netdev socket,id=gns3-2,udp=127.0.0.1:20002,localaddr=127.0.0.1:10002 \
-netdev socket,id=gns3-3,udp=127.0.0.1:20003,localaddr=127.0.0.1:10003 \
-netdev tap,id=gns3-tap,ifname=tap0,script=/etc/qemu-ifup \
-nographic

Now as you can see the udp sockets are inverse of eachother, meaning that the ESX listens on 10000 and sends to 127.0.0.1 on port 20000, while the switch listesns on 20000, and sends packets to 10000 for the first ethernet interface pair.

By default VMware only assigns the first NIC into the first virtual switch, so after enabling CDP, we can see we have basic connecitivity:

AMD-kvm#sho run int gig0/1
Building configuration…

Current configuration : 99 bytes
!
interface GigabitEthernet0/1
media-type rj45
speed 1000
duplex full
no negotiation auto
end

AMD-kvm#show cdp neigh
Capability Codes: R – Router, T – Trans Bridge, B – Source Route Bridge
S – Switch, H – Host, I – IGMP, r – Repeater, P – Phone,
D – Remote, C – CVTA, M – Two-port Mac Relay

Device ID Local Intrfce Holdtme Capability Platform Port ID
KVMESX-1 Gig 0/0 155 S VMware ES vmnic0

Total cdp entries displayed : 1

And of course the networking actually does work… I created a quick VM, and yep, It’s online!

AMD-kvm#show mac address-table
Mac Address Table
——————————————-

Vlan Mac Address Type Ports
—- ———– ——– —–
1 000c.2962.09e5 DYNAMIC Gi0/0
1 002e.3c92.2600 DYNAMIC Gi0/0
1 76b0.3336.34b3 DYNAMIC Gi2/3
Total Mac Addresses for this criterion: 3

And of course some obliguttory pictures:

Nested ESXi running a simple NT 4.0 server

And:

Welcome to IIS 2.0

With ip forwarding turned on my Ubuntu server, and an ip address assigned to my bridge interface, I can then access the NT 4.0 VM from my laptop directly.

Nex’t time to make the L2 more complicated, and add in some L3 insanity…

Firefly-Host-6.0-CloudSDK fun in “modern” times

Getting started

Ugh. nothing like ancient crypto, major security vulnerabilities, and ancient crap.  So first I’m going to use Juniper’s SDK (get it while you can, if you care).  Note that the product is long since EOL’d, and all support is GONE.  I’m using Debian 7 to perform this query, although I probably should be using something like 4 or 5.  Anyways first off is that the python examples require “Ft.Xml.Domlette” which doesn’t seem to have a 4Suite-XML package.  SO let’s build it the old fashioned way:

 apt-get install build-essential python-dev
wget http://pypi.python.org/packages/source/4/4Suite-XML/4Suite-XML-1.0.2.tar.bz2
tar -xvvf 4Suite-XML-1.0.2.tar.bz2
cd 4Suite-XML-1.0.2
./setup.py install

Well (for now) and in my case I could reconfigure tomcat to be slightly more secure. Otherwise running the examples gives this fun filled error:

ssl.SSLError: [Errno 1] _ssl.c:504: error:14082174:SSL routines:SSL3_CHECK_CERT_AND_ALGORITHM:dh key too small

Naturally as time goes on this will not work anymore, and I’ll need a stale machine to query this stale service. Using ssl shopper’s Tomcat guide, I made changes to the server.xml file on the vGW SD VM. (Don’t forget to enable SSH in the settings WebUI, and then login as admin/<whatever password you gave> then do a ‘sudo bash’ to run as root, screw being nice!


# diff -ruN server.xml-old server.xml
--- server.xml-old 2017-01-14 18:20:07.000000000 +0800
+++ server.xml 2017-01-14 19:31:36.000000000 +0800
@@ -98,7 +98,7 @@
enableLookups="false" disableUploadTimeout="true"
acceptCount="100" scheme="https" secure="true"
clientAuth="false" sslProtocol="TLS" URIEncoding="UTF-8"
- ciphers="TLS_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_DSS_WITH_AES_128_CBC_SHA, SSL_RSA_WITH_3DES_EDE_CBC_SHA, SSL_DHE_RSA_WITH_3DES_EDE_CBC_SHA, SSL_DHE_DSS_WITH_3DES_EDE_CBC_SHA"
+ ciphers="TLS_RSA_WITH_AES_128_CBC_SHA, ECDH-RSA-AES128-SHA"
keystoreFile="/var/lib/altor/cert/public_keystore" keystorePass="altoraltor"/>

Naturally don’t forget to restart Tomcat, which does take forever:

bash-3.2# /etc/init.d/tomcat restart
Stopping tomcat: [ OK ]
Starting tomcat: [ OK ]

And now I’m FINALLY able to run  one of the sample scripts

# ./policyToXML.py –grp 1
<?xml version=”1.0″ encoding=”UTF-8″?>
<policy xmlns=”urn:altor:center:policy”>
<revision>340</revision>
<name>Global Policy</name>
<id>1</id>
<rev>1</rev>
<type>G</type>
<groupId>-1</groupId>
<machineId>-1</machineId>
<Inbound>

And you get the idea.  Certainly on the one hand it’s nice to get some data out of the vGW without using screen captures or anything else equally useless, and it sure beats trying to read stuff like this:

vGW VM effective policy for a VM

What on earth was Altor/Juniper thinking?  Who thought making the screen damned near impossible to read was a “good thing”â„¢

I just wish I’d known about the SDK download on the now defunct firefly page a few years ago as it’d have saved me a LOT of pain, but as they say, not time like the present.

Naturally someone here is going to say, upgrade to the last version it’ll fix these errors, and sure it may, but are you going to bet a production environment that is already running obsolete software on changing versions?  Or migrate to a new platform? Sure, the first step I’d want of course is a machine formatted rule export of the existing rules.  And here we are.