Wednesday, August 1, 2012

Customize Instance Libvirt Environment

Eucalyptus supports a variety of hypervisors (KVM, VMware, Xen). Libvirt is used to control instances when Eucalyptus is configured to use KVM or Xen. Simply put, Eucalyptus generates a domain file (aptly called libvirt.xml) to start the instance; the domain file can be found in the working directory of a running instance.

Eucalyptus generates the domain file (libvirt.xml) in
response to the user action euca-run-instances. Libvirt
will then instruct the hypervisor to execute it.

Older versions of Eucalyptus (up to 2.0.3) used a helper Perl script (gen_libvirt_xml or gen_kvm_libvirt_xml) to generate the domain file. Changing the hypervisor behavior was a matter of modifying the helper script.

Eucalyptus 3 brings greater flexibility to customize the domain file. The Node Controller produces a stub XML file with all the instance-related information (the file, called instance.xml, can be found in the instance working directory). Then, by applying an XSL Transformation to instance.xml, Eucalyptus generates the domain file (libvirt.xml) used to start the instance. The XSL filter can be found on the Node Controller at /etc/eucalyptus/libvirt.xsl. At this point a couple of examples will clarify the process and how it can be customized. To simplify creating and debugging a new filter, we suggest using a command-line XSLT processor while developing the new libvirt.xsl (the examples below use xsltproc).

 <?xml version="1.0" encoding="UTF-8"?>  
  <hypervisor type="kvm" capability="hw" bitness="64"/>  
   <root type="image"/>  
  <key isKeyInjected="false" sshKey="ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDPAQQZD644Jep3HWbRfv2TZxRKYSfXI6omWZV/JKnyOxJAkYS9ZxTPCeWeg/J0mguaXHVQapUYWnZkRRfJ2CAv4Yss8ya2mG9Itc3l113C1Rjiyk1YFZcDzxikauJX/r25+M32r1CUbxOnK90z16HUdOFBe78ebe/uA9P+FWCdo/qItF8VfBnKsTqrTi4pe2DP5fJnrtrsJA9vPNh+jrWcUCjN5byknGR/wgiQ0CeySeec0k7TIKKi8aIMcvEezUX0laY1kCC7WblT6HIRH6K+5VmXFmsMpdgENnakwvVIwX9MlT6scAtVRyTOCY1qz5YyK1U7pcxWs8bGyKhTQYvp 345590850920@eucalyptus.admin"/>  
  <os platform="linux" virtioRoot="true" virtioDisk="true" virtioNetwork="false"/>  
   <diskPath targetDeviceType="disk" targetDeviceName="sda" targetDeviceNameVirtio="vda" targetDeviceBusVirtio="virtio" targetDeviceBus="scsi" sourceType="block">/dev/mapper/euca-4XVKCF4WDM4NAIXBORW99-i-37F04164-prt-15360none-5085ae2c</diskPath>  
   <nic bridgeDeviceName="eucabr558" mac="D0:0D:37:F0:41:64"/>  
This is an example of instance.xml extracted from one of
our QA runs. Notice the comprehensive instance information
available: only a small subset will make it into libvirt.xml, but
all of it can be used in the XSL Transformation.
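When developing a new filter it can help to first check which values the XSLT processor actually sees in instance.xml. The following is a minimal sketch of such a probe, not part of Eucalyptus: the file name probe.xsl is hypothetical, and the `//` lookups are used on purpose so the sketch does not assume where `hypervisor`, `os`, `diskPath` and `nic` are nested inside instance.xml.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- probe.xsl: hypothetical debugging helper, not shipped with Eucalyptus.
     Prints a few values from instance.xml as plain text. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- "//" matches the element at any depth, so the exact nesting
         of instance.xml does not matter for this sketch -->
    <xsl:text>hypervisor: </xsl:text>
    <xsl:value-of select="//hypervisor/@type"/>
    <xsl:text>&#10;platform: </xsl:text>
    <xsl:value-of select="//os/@platform"/>
    <xsl:text>&#10;virtio root: </xsl:text>
    <xsl:value-of select="//os/@virtioRoot"/>
    <xsl:text>&#10;</xsl:text>

    <!-- one line per disk: device name, bus, and backing path -->
    <xsl:for-each select="//diskPath">
      <xsl:text>disk: </xsl:text>
      <xsl:value-of select="@targetDeviceName"/>
      <xsl:text> on bus </xsl:text>
      <xsl:value-of select="@targetDeviceBus"/>
      <xsl:text>, backed by </xsl:text>
      <xsl:value-of select="normalize-space(.)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>

    <!-- one line per network interface -->
    <xsl:for-each select="//nic">
      <xsl:text>nic: </xsl:text>
      <xsl:value-of select="@mac"/>
      <xsl:text> on bridge </xsl:text>
      <xsl:value-of select="@bridgeDeviceName"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Assuming a copy of the instance file in /tmp, it can be run as `xsltproc probe.xsl /tmp/instance.xml`.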

Example: Using Huge Pages

Huge pages can boost KVM performance. Once the Node Controller has been modified to use huge pages, the domain file needs to be modified as well, to ensure that the hypervisor takes advantage of this feature. This is as simple as adding a static stanza to the domain file, and can be easily achieved with the following addition to libvirt.xsl:

          <xsl:value-of select="/instance/name"/>  
        <description>Eucalyptus instance <xsl:value-of select="/instance/name"/></description>  
 +       <memoryBacking>  
 +         <hugepages/>  
 +       </memoryBacking>  
            <xsl:when test="/instance/os/@platform = 'linux' and /instance/backing/root/@type = 'image'">  
This diff shows how we configured huge pages within the domain file.
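The stanza above is added unconditionally. If the same libvirt.xsl has to serve more than one hypervisor type, the addition could be guarded with a test on instance.xml. The following is only a sketch under that assumption: it is a reduced stand-in for the real libvirt.xsl (it emits just the first few elements of the domain), and the choice to limit huge pages to KVM is an illustration, not something Eucalyptus requires.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Reduced stand-in for libvirt.xsl: only the elements around the
     huge pages stanza are emitted. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <domain type="{/instance/hypervisor/@type}">
      <name><xsl:value-of select="/instance/name"/></name>
      <description>Eucalyptus instance <xsl:value-of select="/instance/name"/></description>
      <!-- assumption: huge pages are wanted for KVM instances only -->
      <xsl:if test="/instance/hypervisor/@type = 'kvm'">
        <memoryBacking>
          <hugepages/>
        </memoryBacking>
      </xsl:if>
    </domain>
  </xsl:template>
</xsl:stylesheet>
```

Running it with xsltproc against an instance.xml whose hypervisor type is kvm should produce a `<domain>` containing the `<memoryBacking>` element; for any other type the element is left out.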

Example: Legacy Images

RHEL 6 no longer supports the SCSI driver: this can be a problem if you are using old images with no VIRTIO driver. When VIRTIO is disabled (eucalyptus.conf variables USE_VIRTIO_ROOT and USE_VIRTIO_DISK), the generated domain file uses the SCSI bus, which makes the image incompatible with a RHEL 6 Node Controller (the instance will not be able to find its own root device).

 <  <os platform="linux" virtioRoot="true" virtioDisk="true" virtioNetwork="false"/>  
 >  <os platform="linux" virtioRoot="false" virtioDisk="false" virtioNetwork="false"/>  
The diff of instance.xml generated when VIRTIO is disabled

The output of

xsltproc /etc/eucalyptus/libvirt.xsl /tmp/instance.xml

confirms that the generated libvirt.xml uses the deprecated driver.

   <disk device="disk" type="block">  
    <source dev="/dev/mapper/euca-4XVKCF4WDM4NAIXBORW99-i-37F04164-prt-15360none-5085ae2c"/>  
    <target dev="sda" bus="scsi"/>  
When disabling VIRTIO, Eucalyptus defaults to SCSI.

With a simple modification of libvirt.xsl, we can add support for these legacy images:

 graziano@x220t:~/Prog/Eucalyptus/xsl$ diff libvirt.xsl /etc/eucalyptus/libvirt.xsl   
 <                     <cmdline>root=/dev/hda1 console=ttyS0</cmdline>  
 <                 <root>/dev/hda1</root>  
 >                     <cmdline>root=/dev/sda1 console=ttyS0</cmdline>  
 >                 <root>/dev/sda1</root>  
 <                          <xsl:call-template name="string-replace-all">  
 <                            <xsl:with-param name="text" select="@targetDeviceName"/>  
 <                            <xsl:with-param name="replace" select="'sd'"/>  
 <                           <xsl:with-param name="by" select="'hd'"/>  
 <                     </xsl:call-template>  
 >                          <xsl:value-of select="@targetDeviceName"/>  
 <                          <xsl:value-of select="'ide'"/>  
 >                          <xsl:value-of select="@targetDeviceBus"/>  
The diff against the stock libvirt.xsl shows how some hard-coded values
have been changed (sda1 becomes hda1), and how we hard-code the bus
to be IDE instead of using what comes in instance.xml. Finally, every
block device name starting with sd is rewritten to start with hd.

Possibly not the best XSLT to handle the task (if you have a better one, please let me know), but it does the job.
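One possible alternative to the string-replace-all call is to rewrite only a leading "sd" using the XPath 1.0 functions starts-with, concat and substring. This is a sketch, not a drop-in replacement for libvirt.xsl: the template name sd-to-hd is hypothetical, the `<devices>` wrapper stands in for the rest of the domain file, and `//diskPath` is used so that no nesting of instance.xml is assumed.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical helper: turns "sda" into "hda" and leaves any name
       that does not start with "sd" untouched. -->
  <xsl:template name="sd-to-hd">
    <xsl:param name="dev"/>
    <xsl:choose>
      <xsl:when test="starts-with($dev, 'sd')">
        <!-- substring($dev, 3) is everything after the first two characters -->
        <xsl:value-of select="concat('hd', substring($dev, 3))"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$dev"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="/">
    <devices>
      <xsl:for-each select="//diskPath">
        <disk device="disk" type="block">
          <source dev="{normalize-space(.)}"/>
          <!-- bus is hard-coded to IDE, as in the diff above -->
          <target bus="ide">
            <xsl:attribute name="dev">
              <xsl:call-template name="sd-to-hd">
                <xsl:with-param name="dev" select="@targetDeviceName"/>
              </xsl:call-template>
            </xsl:attribute>
          </target>
        </disk>
      </xsl:for-each>
    </devices>
  </xsl:template>
</xsl:stylesheet>
```

The difference from a general replace is that only the prefix is touched, so a device name that happens to contain "sd" elsewhere would pass through unchanged.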

Customization And Drawbacks

The information contained in instance.xml is fairly complete, and allows a wide range of customization. For example, there could be specific rules based on the user ID, rules to add another NIC, or rules to make some PCI device available to the instances. All of this customization may require modifications within the image itself: for example, an extra NIC would require the image to be aware of it and to configure it correctly.
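As an illustration of the extra-NIC idea, here is a minimal sketch that emits the interface described in instance.xml plus a second one. Several things in it are assumptions: the bridge name br-storage is hypothetical and would have to exist on the Node Controller, the rule "Linux instances only" is arbitrary, and the `<devices>` wrapper stands in for the rest of the domain file. The second interface carries no `<mac>` element, which leaves the MAC address for libvirt to generate.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <devices>
      <!-- the interface Eucalyptus asked for, taken from instance.xml -->
      <xsl:for-each select="//nic">
        <interface type="bridge">
          <source bridge="{@bridgeDeviceName}"/>
          <mac address="{@mac}"/>
        </interface>
      </xsl:for-each>

      <!-- assumption: a second NIC on a hypothetical bridge,
           added for Linux instances only -->
      <xsl:if test="//os/@platform = 'linux'">
        <interface type="bridge">
          <source bridge="br-storage"/>
        </interface>
      </xsl:if>
    </devices>
  </xsl:template>
</xsl:stylesheet>
```

As noted above, the image itself still has to bring up and configure the second interface.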

Eucalyptus 3 also added hooks for the Node Controller: in /etc/eucalyptus/nc-hooks you will find an example of how to add scripts to tailor your cloud to your specific needs. Hooks are invoked at specific times during instance staging (post-init, pre-boot, pre-adopt, and pre-clean). Hooks and XSL Transformation together give complete control over what is passed to the hypervisor and over the environment your instances will find.

One of the attractive promises of the cloud, and in particular of the hybrid cloud, is being able to run the same images everywhere: on your cloud, on your friend's cloud, on the public cloud. A heavily customized environment may bind your images to the specific cloud you have, thus nullifying the benefit of running everywhere. As usual, with great power comes great responsibility, and the budding Cloud Administrator should be fully aware of all possible consequences.

[Edited: Aug 2, 2012]

It looks like Blogspot is not the best place to discuss code: one cannot put XML in the comments. Lester had another great example. Disabling the host's page cache for EBS volumes can be done with the following in libvirt.xsl:

     <xsl:when test="/instance/hypervisor/@type='kvm' and ( /instance/os/@platform='windows' or /instance/os/@virtioRoot = 'true')">
                   <xsl:attribute name="bus">virtio</xsl:attribute>
                   <xsl:attribute name="cache">none</xsl:attribute>
                   <xsl:attribute name="dev">
                     <xsl:call-template name="string-replace-all">
                       <xsl:with-param name="text" select="@targetDeviceName"/>
                       <xsl:with-param name="replace" select="'sd'"/>
                       <xsl:with-param name="by" select="'vd'"/>

Which is then rendered as:

   <disk device="disk" type="block">
    <source dev="/dev/mapper/euca-3IRYEXXJ6OHXZXXYAFDOG-i-82CD4318-prt-04096none-11df8dda"/>

    <target bus="virtio" cache="none" dev="vdb"/>
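A note on placement: libvirt's domain format documents the cache mode as an attribute of the disk's `<driver>` element rather than of `<target>`, so it may be worth checking on your libvirt version that the setting above actually takes effect. The following sketch shows the `<driver>` variant. It is a reduced stand-in for libvirt.xsl, not the shipped filter; `type="raw"` is an assumption that fits a plain block device, and the virtio device name and bus are read from the targetDeviceNameVirtio and targetDeviceBusVirtio attributes of instance.xml.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <devices>
      <xsl:for-each select="//diskPath">
        <disk device="disk" type="block">
          <!-- cache mode on <driver>; emitted for KVM only.
               type="raw" is an assumption for block-backed disks. -->
          <xsl:if test="//hypervisor/@type = 'kvm'">
            <driver name="qemu" type="raw" cache="none"/>
          </xsl:if>
          <source dev="{normalize-space(.)}"/>
          <target dev="{@targetDeviceNameVirtio}" bus="{@targetDeviceBusVirtio}"/>
        </disk>
      </xsl:for-each>
    </devices>
  </xsl:template>
</xsl:stylesheet>
```

Running it with xsltproc against the instance.xml shown earlier should give a `<disk>` with a `<driver>` child carrying cache="none" and a `<target>` on the virtio bus.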