ZFS on Xen – why so hard?

Ok, so no posts for a couple of weeks. Why? Well, I've been banging my head against various ZFS appliances, trying to make them do what I want 🙂 What I want is to migrate my old HP Microserver over to new hardware, and in the process change out a bunch of the network services from custom Linux VMs to something more appliance-based, with web management interfaces. Of course it sounds easy, but in practice it's turning out to be a nightmare. I've spent a couple of weeks fiddling with FreeNAS, NAS4Free, Nexenta(stor), and OpenIndiana/Napp-IT, coupled with Citrix's XenServer and CentOS 6.4 (which now officially supports Xen 4). What I want to do is run a VM with the following (a rough config sketch follows the list):

  • 2-4 cpu cores
  • 4-8GB of memory
  • PCI passthrough of an LSI 9211-8i HBA disk controller (Dell PERC H310)
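
For reference, the sort of xl domU config I'm aiming at looks roughly like the sketch below. The PCI address, disk path and names are placeholders for illustration, not my actual values:

name    = "zfs-storage"
builder = "hvm"                 # HVM container, PV drivers inside the guest (PVHVM)
vcpus   = 4
memory  = 8192                  # 8GB
pci     = [ '0000:01:00.0' ]    # the LSI 9211-8i - example BDF, check lspci
disk    = [ 'phy:/dev/vg_ssd/zfs-storage,xvda,w' ]
vif     = [ 'bridge=xenbr0' ]
boot    = "c"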

For the VM storage on the dom0, I've got a couple of RAID1 mirror pairs, one of SSDs and one of 2.5″ SATA drives, so they're reasonably fault-tolerant and I can use fast/slow storage as desired. The intent with the ZFS appliance is to use it for network storage (NFS, Windows shares, etc.), to enjoy simplified backups, error-detection and management, and to provide common repositories for some of the other service VMs to share (e.g. my FLAC music collection). Initial testing has got me to successful installs of domUs for all of the above, although only Nexenta and OI have successfully managed to run as paravirt domUs.
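
Since the whole point is to share datasets out over the network, the appliance side ends up being only a couple of ZFS commands. A minimal sketch, assuming an OpenIndiana/Nexenta-style install, with pool, disk and dataset names made up for illustration:

zpool create tank mirror c0t0d0 c0t1d0    # mirrored pair on the passed-through HBA
zfs create tank/music                     # e.g. the shared FLAC collection
zfs set sharenfs=on tank/music            # export over NFS
zfs set sharesmb=on tank/music            # and as a Windows/SMB share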

  • FreeNAS and NAS4Free performance has been 3-legged-dog-like, largely because they run as HVM guests using emulated IO controllers (disk/network).
  • PCI passthrough has failed to work on the paravirt installs of OI and Nexenta (the OpenSolaris domU support doesn't include the PCI frontend drivers), but passthrough of individual disks and network performance are both near-native.

So right now I’m stuck choosing between:

  1. poor IO performance creating massive CPU load on HVM instances, but sole control of LSI hardware
  2. good IO performance, but unable to dedicate the LSI controller to the storage domU.

I do have one more trick up my sleeve, but I need to spend some time fiddling: running the FreeNAS or NAS4Free setups as PVHVM guests (HVM with paravirtualised drivers), so I get the benefits of paravirt IO performance and the convenience of PCI passthrough. This is certainly possible with FreeBSD – so I'm hoping I can do it with these FreeBSD-derived appliance setups, because if I'm forced to do it with plain FreeBSD, I may as well just go with ZFS-on-Linux. More on this as I get further with it, or hit roadblocks!

4 thoughts on “ZFS on Xen – why so hard?”

  1. HVM domains should be faster than PV domains on new hardware provided you use PV drivers. This mode is called “PVHVM” or “PVonHVM”. FreeBSD can run on that.

    The reason HVM is faster on new hardware is all the hardware extensions that aid virtualization (like EPT), which cannot be used in PV.

    There is another, very experimental mode that combines the good parts of both, called PVH. See: http://wiki.xen.org/wiki/Virtualization_Spectrum
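
    A quick way to confirm the PV drivers are actually in use inside a FreeBSD-based HVM guest is to look for the Xen frontend devices – a rough check, and the device names can vary by release:

    dmesg | grep -iE 'xenbus|xn0|xbd0'    # xn* (network) and xbd* (disk) frontends showing up means PVHVM is active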

    • Thanks for the note – interesting, I hadn’t read about PVH.

      Your comment about FreeBSD pretty much sums up where I'm at. I've got my LSI 9211-8i HBA card working in a NAS4Free domU using PCI passthrough, and running in PVHVM mode (I built a XENHVM kernel for NAS4Free).

      I’ve been taking notes as I’ve gone and will be posting the steps this weekend, if I can get past an annoying bug with PVHVM and a pfSense kernel 🙂

        • I was using Xen 4.2.3 (the standard CentOS release packages from CentOS 6.4) – however I ran into some issues with the pfSense firewall, so am currently trying out Fedora 20 with Xen 4.3.

          Getting the HBA reserved by xen-pciback was a little tricky – for some reason it wouldn't reserve on boot-up of the system (with entries in /etc/modprobe.d), even though the Intel NIC would reserve just fine. In the end I added some lines to the end of /etc/sysconfig/modules/xen.modules:

          rmmod mpt2sas        # free the HBA from dom0's SAS driver
          rmmod xen-pciback    # unload pciback so it re-reads its hide= options
          modprobe xen-pciback # reload it and let it grab the now-unclaimed HBA
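
          For completeness, the hide option itself is just a modprobe.d snippet along these lines – the BDF below is an example only (check lspci for the real one) and the filename is arbitrary:

          # /etc/modprobe.d/xen-pciback.conf
          options xen-pciback hide=(0000:01:00.0)

          The same BDF then goes into the domU config's pci = [ ... ] line.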

          Other than that – for NAS4Free I had to build a custom FreeBSD kernel using a FreeBSD 9 build environment; basically just build the stock FreeBSD 9 XENHVM kernel conf, and then copy the newly built kernel into an HVM instance of NAS4Free.
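
          Roughly, that boils down to something like the following on a FreeBSD 9.x box with /usr/src populated (a sketch of the standard kernel build, not my exact steps):

          cd /usr/src
          make buildkernel KERNCONF=XENHVM
          # result lands in /usr/obj/usr/src/sys/XENHVM/kernel, which then
          # replaces the kernel inside the NAS4Free HVM install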

          A new motherboard arrived today, some WD Reds are on their way to me, and Fedora 20 is released next week, so I'll be re-following my notes and writing it up on this blog as I go, now that I have an idea of how to do it 🙂
