
libvirt support for Xen’s new libxenlight toolstack

Originally posted on my blog, here.

Xen has had a long history in libvirt.  In fact, it was the first hypervisor supported by libvirt.  I’ve witnessed an incredible evolution of libvirt over the years and now not only does it support managing many hypervisors such as Xen, KVM/QEMU, LXC, VirtualBox, Hyper-V, ESX, etc., but it also supports managing a wide range of host subsystems used in a virtualized environment such as storage pools and volumes, networks, network interfaces, etc.  It has really become the Swiss Army knife of virtualization management on Linux, and Xen has been along for the entire ride.

libvirt supports multiple hypervisors via a hypervisor driver interface, which is defined in $LIBVIRT_ROOT/src/driver.h – see struct _virDriver.  libvirt’s virDomain* APIs map to functions in the hypervisor driver interface, which are implemented by the various hypervisor drivers.  The drivers are located under $LIBVIRT_ROOT/src/<hypervisor-name>.  Typically, each driver has a $LIBVIRT_ROOT/src/<hypervisor-name>/<hypervisor-name>_driver.c file which defines a static instance of virDriver and fills in the functions it implements.  As an example, see the definition of libxlDriver in $LIBVIRT_ROOT/src/libxl/libxl_driver.c, the first few lines of which are

static virDriver libxlDriver = {
    .no = VIR_DRV_LIBXL,
    .name = "xenlight",
    .connectOpen = libxlConnectOpen, /* 0.9.0 */
    .connectClose = libxlConnectClose, /* 0.9.0 */
    .connectGetType = libxlConnectGetType, /* 0.9.0 */
    ...
}
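
To make the dispatch pattern concrete, here is a small self-contained sketch. The demo* names below are hypothetical stand-ins rather than libvirt's actual types, but the shape mirrors how libvirt's public virConnect*/virDomain* functions forward calls to whichever driver's callbacks are filled in:

/* Hypothetical sketch of the driver-table dispatch pattern; the demo*
 * names are illustrative stand-ins, not libvirt's real types. */
#include <stdio.h>

typedef struct demoDomain demoDomain;

/* A cut-down stand-in for libvirt's struct _virDriver. */
typedef struct {
    const char *name;
    const char *(*connectGetType)(void);
    int (*domainCreate)(demoDomain *dom);
} demoDriver;

/* What a libxl-like driver would plug into the table. */
static const char *demoLibxlGetType(void) { return "xenlight"; }

static const demoDriver demoLibxlDriver = {
    .name = "xenlight",
    .connectGetType = demoLibxlGetType,
    /* .domainCreate left NULL: this driver does not implement it */
};

/* The front-end API forwards to the active driver, or fails cleanly. */
static const char *demoConnectGetType(const demoDriver *drv)
{
    return drv->connectGetType ? drv->connectGetType() : NULL;
}

int main(void)
{
    printf("driver type: %s\n", demoConnectGetType(&demoLibxlDriver));
    return 0;
}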

Continued…

Posted in Community, Uncategorized, Xen Development.


First Xen Project 4.4 Test Day on Monday, January 20

Release time is approaching, so Xen Project Test Days have arrived!

On Monday, January 20, we are holding a Test Day for Xen 4.4 Release Candidate 2.

Xen Project Test Day is your opportunity to work with code which is targeted for the next release, ensure that new features work well, and verify that the new code can be integrated successfully into your environment.  This is the first of a few Test Days for the 4.4 release, scheduled to occur at roughly two-week intervals.

General Information about Test Days can be found here:
http://wiki.xenproject.org/wiki/Xen_Test_Days

and specific instructions for this Test Day are located here:
http://wiki.xenproject.org/wiki/Xen_4.4_RC2_test_instructions

XEN 4.4 FEATURE DEVELOPERS:

If you have a new feature which is cooked and ready for testing in RC2, we need to know about it and how to test it. Either edit the instructions page or send me a few lines describing the feature and how it should be tested.

Right now, RC2 is labelled a general test (e.g., “Does Xen compile, install, and do the things Xen normally does?”). We don’t have any specific tests of new functionality identified. If you have something new which needs testing in RC2, we need to know about it.

EVERYONE:

Please join us on Monday, January 20, and help make sure the next release of Xen is the best one yet!

Posted in Announcements, Community, Events.


Xen Related Talks @ FOSDEM 2014

Going to FOSDEM’14? Well, you want to check out the schedule of the Virtualization & IaaS devroom then, and make sure you do not miss the talks about Xen. There are 4 of them, and they will provide some details about new and interesting use cases for virtualization, like in embedded systems of various kinds (from phones and tablets to network middleboxes), and about new features in the upcoming Xen release, such as PVH, and how to put them to good use.

Here are the talks, in a bit more detail:
- Dual-Android on Nexus 10 using XEN, on Saturday morning
- High Performance Network Function Virtualization with ClickOS, on Saturday afternoon
- Virtualization in Android based and embedded systems, on Sunday morning
- How we ported FreeBSD to PVH, on Sunday afternoon

There actually is more: a talk called Porting FreeBSD on Xen on ARM, in the BSD devroom, and one about MirageOS in the miscellaneous Main track, but the schedule for them has not been announced yet.

Last but certainly not least, there will be a Xen Project booth, where you can meet the members of the Xen community as well as enjoy some other, soon to be revealed, activities. Some of my colleagues from Citrix and I will be in Brussels, and will definitely spend some time at the booth, so come and visit us. The booth will be in building K, on level 1.

Read more here: http://xenproject.org/about/events.html

Edit:

The schedule for the FreeBSD and MirageOS talks has been announced. Here it is:
- Porting FreeBSD on Xen on ARM, will be given on Saturday early afternoon (15:00), in the BSD devroom
- MirageOS: compiling functional library operating systems, will happen on Sunday late morning (13:00), in the misc main track

Also, there is another Xen related talk, in the Automotive development devroom: Xen on ARM: Virtualization for the Automotive industry, on Sunday morning (11:45).

Posted in Announcements, Events, Partner Announcements, Xen Hypervisor.



2013 : A Year to Remember

2013 has been a year of changes for the Xen Community. I wanted to share my five personal highlights of the year. But before I do this, I wanted to thank everyone who contributed to the Xen Project in 2013 and the years before. Open Source is about bringing together technology and people: without your contributions, the Xen Project would not be a thriving and growing open source project.

Xen Project joins Linux Foundation

The biggest community story of 2013 was the move of Xen to the Linux Foundation in April. For me, this journey started in December 2011, when I won in-principle agreement from Citrix to find a neutral, non-profit home for Xen. This took longer than I hoped: even when the decision was made to become a Linux Foundation Collaborative Project, it took many months of hard work to get everything off the ground. Was it worth it? The answer is a definite yes: besides all the buzz and media interest in April 2013, interest in and usage of Xen has increased in the remainder of 2013. The Xen Project became a first class citizen within the open source community, which it was not really before.

Wiki page visits: monthly visits by users to the Xen Project wiki doubled after moving Xen to the Linux Foundation.

Of course, the ripples of this change will be felt for many years to come. Some of them are covered in the other 4 highlights of 2013. I personally believe that the Xen Project Advisory Board (which is made up of 14 major companies that fund the project) will have a positive impact on the community going forward. This will become apparent next year, when initiatives that are funded by the Advisory Board – such as an independently hosted test infrastructure, more coordinated marketing and PR, growing the Xen talent pool and many others – will kick into gear.
Continued…

Posted in Community.


Where Would You Like to See the Next Xen Project User Summit Held?

In 2013, we held the first major Xen event aimed specifically at users: the Xen Project User Summit. In 2014, we want to do it again — but where and when?

The Xen Project wants to hold its second Xen Project User Summit.  We’d like to hold it somewhere which is accessible by a large percentage of our user community.  And we’d like to schedule it at a time which makes sense, possibly in coordination with some existing conference.

We need your help to pick the time and place.  Give us your preferences in a very quick 2 minute survey found here:

https://www.surveymonkey.com/s/YJQCHJ6

It’s very quick and easy to do.  And you may just find that the next User Summit is too convenient for you to pass up.

Posted in Announcements, Community, Events, Xen Summit.



What is the ARINC653 Scheduler?

The Xen ARINC 653 scheduler is a real-time scheduler that has been in Xen since 4.1.0.  It is a cyclic executive scheduler with a specific usage in mind, so unless one has aviation experience one is unlikely to have ever encountered it.

The scheduler was created and is currently maintained by DornerWorks.

Background

The primary goal of the ARINC 653 specification [1] is the isolation or partitioning of domains.  The specification goes out of its way to prevent one domain from adversely affecting any other domain, and this goal extends to any contended resource, including but not limited to I/O bandwidth, CPU caching, branch prediction buffers, and CPU execution time.

This isolation is important in aviation because it allows applications at different levels of certification (e.g. Autopilot – Level A Criticality, In-Flight Entertainment – Level E Criticality, etc.) to be run in different partitions (domains) on the same platform.  Historically, to maintain this isolation, each application had its own separate computer and operating system, in what was called a federated system.  Integrated Modular Avionics (IMA) systems were created to allow multiple applications to run on the same hardware.  In turn, the ARINC 653 specification was created to standardize an operating system for these platforms.  While it is called an operating system and could be implemented as such, it can also be implemented as a hypervisor running multiple virtual machines as partitions.  Since the transition from federated to IMA systems in avionics closely mirrors the transition to virtualized servers in the IT sector, the latter implementation seems more natural.

Beyond aviation, an ARINC 653 scheduler can be used where temporal isolation of domains is a top priority, or in security environments with indistinguishability requirements, since a malicious domain should be unable to extract information through a timing side-channel.  In other applications, the use of an ARINC 653 scheduler would not be recommended due to the reduced performance.

Scheduling Algorithm

The ARINC 653 scheduler in Xen provides the groundwork for the temporal isolation of domains from each other. The domain scheduling algorithm itself is fairly simple: a fixed, predetermined list of domains is repeatedly scheduled with a fixed periodicity, resulting in a complete and, most importantly, predictable schedule.  The overall period of the scheduler is known as a major frame, while the individual domain execution windows in the schedule are known as minor frames.

Diagram: a major frame and the minor frames within it.

As an example, suppose we have 3 domains with periods of 5 ms, 6 ms, and 10 ms and worst-case running times of 1 ms, 2 ms, and 3 ms respectively.  The major frame is set to the least common multiple of these periods (30 ms) and minor frames are selected so that the period, runtime, and deadline constraints are met.  One resulting schedule is shown below, though there are other possibilities.

Diagram: one possible schedule for the three example domains.
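
To illustrate the arithmetic behind such a schedule, here is a minimal, self-contained sketch (plain C, not Xen code) that computes the major frame length and the total CPU utilization for the three example domains above; a utilization above 1.0 could never be packed into minor frames.

/* Sketch only, not Xen code: compute the major frame (LCM of the domain
 * periods) and the total utilization for the example domains above. */
#include <stdio.h>

struct domain {
    unsigned period_ms;   /* how often the domain must run        */
    unsigned runtime_ms;  /* worst-case execution time per period */
};

static unsigned gcd(unsigned a, unsigned b)
{
    while (b) { unsigned t = a % b; a = b; b = t; }
    return a;
}

static unsigned lcm(unsigned a, unsigned b)
{
    return a / gcd(a, b) * b;
}

int main(void)
{
    struct domain doms[] = { { 5, 1 }, { 6, 2 }, { 10, 3 } };
    unsigned n = sizeof(doms) / sizeof(doms[0]);
    unsigned major = 1;
    double utilization = 0.0;

    for (unsigned i = 0; i < n; i++) {
        major = lcm(major, doms[i].period_ms);
        utilization += (double)doms[i].runtime_ms / doms[i].period_ms;
    }

    /* Prints: major frame = 30 ms, utilization = 0.83 */
    printf("major frame = %u ms, utilization = %.2f\n", major, utilization);
    return utilization <= 1.0 ? 0 : 1;
}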

The ARINC 653 scheduler is only concerned with the scheduling of domains. The scheduling of real-time processes within a domain is performed by that domain’s process scheduler.  In a compliant ARINC 653 system, these processes are scheduled using a fixed priority scheduling algorithm, but if ARINC 653 compliance is not a concern any other process scheduling method may be used.

Using the Scheduler

Directions for using the scheduler can be found on the Xen wiki at ARINC653 Scheduler. When using the scheduler, the most obvious effect will be that the CPU usage and execution windows for each domain will be fixed regardless of whether the domain is performing any work.

Currently, multicore operation of the scheduler is not supported.  Extending the scheduling algorithm to multiple cores is trivial, but the isolation of domains in a multicore system requires a number of mitigation techniques not required in single-core systems [2].

References

[1] ARINC Specification 653P1-3, “Avionics Application Software Standard Interface Part 1 – Required Services,” November 15, 2010

[2] EASA.2011/6 MULCORS – Use of Multicore Processors in airborne systems

Posted in Xen Development.



Announcing the 1.0 release of Mirage OS

We’re very pleased to announce the release of Mirage OS 1.0. This is the first major release of Mirage OS and represents several years of development, testing and community building. You can get started by following the install instructions and creating your own webserver to host a static website! Also check out the release notes and download page.

What is Mirage OS and why is it important?

Most applications that run in the cloud are not optimized to do so. They inherently carry assumptions about the underlying operating system with them, including vulnerabilities and bloat.

Compartmentalization of large servers into smaller ‘virtual machines’ has enabled many new businesses to get started and achieve scale. This has been great for new services, but many of those virtual machines are single-purpose and yet they contain largely complete operating systems which typically run single applications like web servers, load balancers, databases, mail servers and similar services. This means a large part of the footprint is unused and unnecessary, which is both costly due to resource usage (RAM, disk space etc.) and a security risk due to the increased complexity of the system and the larger attack surface.

Cloud OS Diagram

On the left, you see a typical application stack run in the cloud today. Cloud Operating systems such as MirageOS remove the Operating System and replace it with a Language Runtime that is designed to cooperate with the Hypervisor.

Mirage OS is a Cloud Operating System which represents an approach where only the necessary components of the operating system are included and compiled along with the application into a ‘unikernel’. This results in highly efficient and extremely lean ‘appliances’, with the same or better functionality but a much smaller footprint and attack surface. These appliances can be deployed directly to the cloud and embedded devices, with the benefits of reduced costs and increased security and scalability.

Some example use cases for Mirage OS include: (1) A lean webserver: for example, the openmirage.org website is about 1MB including all content, boots in about 1 second, and is hosted on Amazon EC2. (2) Middle-box applications such as small OpenFlow switches for tenants in a cloud provider. (3) Easy reuse of the same code and toolchain that create cloud appliances to target space- and memory-constrained ARM devices.

How does Mirage OS work?

Mirage OS works by treating the Xen hypervisor as a stable hardware platform and using libraries to provide the services and protocols we expect from a typical operating system, e.g. a networking stack. Application code is developed in the high-level functional programming language OCaml on a desktop OS such as Linux or Mac OS X, and compiled into a fully-standalone, specialized unikernel. These unikernels run directly on Xen hypervisor APIs. Since Xen powers most public clouds such as Amazon EC2, Rackspace Cloud, and many others, Mirage OS lets your servers run more cheaply, securely and faster on those services.

Mirage OS is implemented in the OCaml language, with 50+ libraries which map directly to operating system constructs when being compiled for production deployment. The goal is to make it as easy as possible to create Mirage OS appliances and ensure that all the things found in a typical operating system stack are still available to the developer. Mirage OS includes clean-slate functional implementations of protocols including TCP/IP, DNS, SSH, OpenFlow (switch/controller), HTTP, XMPP, and Xen Project inter-VM transports. Since everything is written in a single high-level language, it is easier to work with those libraries directly. This approach guarantees the best possible performance of Mirage OS on the Xen Hypervisor without needing to support the thousands of device drivers found in a traditional OS.

Bind 9 vs. Mirage OS throughput comparison

Performance comparison of Bind 9 vs. a DNS server written in Mirage OS.

An example of a Mirage OS appliance is a DNS server, and below is a comparison with one of the most widely deployed DNS servers on the internet, BIND 9. As you can see, the Mirage OS appliance outperforms BIND 9; in addition, the Mirage OS VM is less than 200kB in size compared to over 450MB for the BIND VM. Moreover, the traditional VM contains 4-5 times more lines of code than the Mirage implementation, and lines of code are often considered correlated with attack surface. More detail about this comparison and others can be found in the associated ASPLOS paper.

For the DNS appliance above, the application code was written using OCaml and compiled with the relevant Mirage OS libraries. To take full advantage of Mirage OS it is necessary to design and construct applications using OCaml, which provides a number of additional benefits such as type-safety. For those new to OCaml, there are some excellent resources to get started with the language, including a new book from O’Reilly and a range of tutorials on the revamped OCaml website.

We look forward to the exciting wave of innovation that Mirage OS will unleash including more resilient and lean software as well as increased developer productivity.

Posted in Announcements.



Xen on ARM and the Device Tree vs. ACPI debate

ACPI vs. Device Tree on ARM

Some of you may have seen the recent discussions on the linux-arm-kernel mailing list (and others) about the use of ACPI vs DT on the ARM platform. As always, LWN has a pretty good summary (currently subscribers only, becomes freely available on 5 December) of the situation with ACPI on ARM.

Device Tree (or DT) and Advanced Configuration & Power Interface (or ACPI) are both standards which are used for describing a hardware platform, e.g. to an operating system kernel. At their core, both technologies provide a tree-like data structure containing a hierarchy of devices, specifying what type each device is, along with a set of “bindings” for that device. A binding is essentially a schema for specifying I/O regions, interrupt mappings, GPIOs, clocks, etc.

For the last few years Linux on ARM has been moving away from hardcoded “board files” (a bunch of C code for each platform) towards using Device Tree instead. In the ARM space ACPI is the new kid on the block and has many unknowns. Given this, the approach to ACPI which the Linux kernel maintainers appear to have reached, which is essentially to wait and see how the market pans out, seems sensible.

On the Xen side we started the port to ARM around the time that Linux’s transition from board files to Device Tree was starting and made the decision early on to go directly to Device Tree (ACPI wasn’t even on the table at this point, at least not publicly). Xen uses DT to discover all of the hardware on the system, both that which it intends to use itself and that which it intends to pass to domain 0. As well as consuming DT itself, Xen also creates a filleted version of the host DT which it passes to the domain 0 kernel. DT is simple and yet powerful enough to allow us to do this relatively easily.

DT is also used by some of the BSD variants in their ARM ports.

My Position as Xen on ARM Maintainer

The platform configuration mechanism supported by Xen on ARM today is Device Tree. Device Tree is a good fit for our requirements and we will continue to support it as our primary hardware description mechanism.

Given that a number of operating system vendors and hardware vendors care about ACPI on ARM and are pushing hard for it, especially in the ARM server space, it is possible, perhaps even likely, that we will eventually find ourselves needing to support ACPI as well. On systems which support both ACPI and DT we will continue to prefer Device Tree. Once ARM hardware platforms that only support ACPI are available, we will obviously need to support ACPI.

The Xen Project works closely with the Linux kernel and other open source upstreams as well as organisations such as Linaro. Before Xen on ARM can support ACPI I would like to see it gain some actual traction on ARM. In particular I would like to see it get to the point where it has been accepted by the Linux kernel maintainers. It is clearly not wise for Xen to be pioneering the use of ACPI before it becomes clear whether or not it is going to gain any traction in the wider ecosystem.

So if you are an ARM silicon or platform vendor and you care about virtualization and Xen in particular, I encourage you to provide a complete device tree for your platform.

Note that this only applies to Xen on ARM. I cannot speak for Xen on x86 but I think it is pretty clear that it will continue to support ACPI so long as it remains the dominant hardware description on that platform.

It should also be noted that ACPI on ARM is primarily a server space thing at this stage. Of course Xen and Linux are not just about servers: both projects have sizable communities of embedded vendors (on the Xen side we had several interesting presentations at the recent Xen Developer Summit on embedded uses of Xen on ARM). Essentially no one is suggesting that the embedded use cases should move from DT to ACPI and so, irrespective of what happens with ACPI, DT has a strong future on ARM.

ACPI and Type I Hypervisors

Our experience on x86 has shown that the ACPI model is not a good fit for Type I hypervisors such as Xen, and the same is true on ARM. ACPI essentially enforces a model where the hypervisor, the kernel, the OSPM (the ACPI term for the bit of an OS which speaks ACPI) and the device drivers all must reside in the same privileged entity. In other words it effectively mandates a single monolithic entity which controls everything about the system. This obviously precludes such things as dividing hardware into that which is owned and controlled by the hypervisor and that which is owned and controlled by a virtual machine such as dom0. This impedance mismatch is probably not insurmountable but experience with ACPI on x86 Xen suggests that the resulting architecture is not going to be very agreeable.

UEFI

Due to their history on x86, ACPI and UEFI are often lumped together as a single thing when in reality they are mostly independent. There is no reason why UEFI cannot also be used with Device Tree. We would expect Xen to support UEFI sooner rather than later.

Posted in Thought Leadership, Uncategorized, Xen Hypervisor.



RT-Xen: Real-Time Virtualization in Xen

The researchers at Washington University in St. Louis and the University of Pennsylvania are pleased to announce, here on this blog, the release of a new and greatly improved version of the RT-Xen project. Recent years have seen increasing demand for supporting real-time systems in virtualized environments (for example, the Xen-ARM projects and several other real-time enhancements to Xen), as virtualization enables greater flexibility and reduces cost, weight and energy by breaking the correspondence between logical systems and physical systems. As an example of this, check out the video below from the 2013 Xen Project Developer Summit.

The video describes how Xen could be used in an in-vehicle infotainment system.

In order to combine real-time and virtualization, a formally defined real-time scheduler at the hypervisor level is needed to provide timing guarantees to the guest virtual machines. RT-Xen bridges the gap between real-time scheduling theory and the virtualization technology by providing a suite of multi-core real-time schedulers to deliver real-time performance to domains running on the Xen hypervisor.

Background: Scheduling in Xen

In Xen, each domain’s core is abstracted as a Virtual CPU (VCPU), and the hypervisor scheduler is responsible for scheduling VCPUs. For example, the default credit scheduler assigns a weight per domain, which decides the proportional share of CPU cycles that a domain would get. The credit scheduler works great for general-purpose computing, but is not suitable for real-time applications for the following reasons:

  1. There is no reservation with the credit scheduler. For example, when two VCPUs run on a 2 GHz physical core, each would get 1 GHz. However, if another VCPU also boots on the same PCPU, the resource share shrinks to 0.66 GHz. The system manager has to carefully configure the number of VMs/VCPUs to ensure that each domain gets an appropriate amount of CPU resources;
  2. There is little timing predictability or real-time performance provided to the VM. If a VM is running real-time workloads (video decoding, voice processing, and feedback control loops) which are periodically triggered and have a timing requirement — for example, the VM must be scheduled every 10 ms to process the data — there is no way the VM can express this information to the underlying VMM scheduler. The existing SEDF scheduler can help with this, but it has poor support for multi-core.

RT-Xen: Combining real-time and virtualization

RT-Xen aims to solve this problem by providing a suite of real-time schedulers. Users can specify (budget, period, CPU mask) for each VCPU individually. The budget represents the maximum CPU resource a VCPU will get during a period; the period represents the timing quantum of the CPU resources provided to the VCPU; and the CPU mask defines the subset of physical cores a VCPU is allowed to run on. For each VCPU, the budget is reset at each starting point of the period (all in milliseconds), consumed while the VCPU is executing, and deferred when the VCPU has budget but no work to do.
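
As a rough illustration of that budget accounting, here is a small self-contained sketch; the types and function below are hypothetical (this is not RT-Xen's actual code), but they follow the refill/consume/defer behaviour just described.

/* Hypothetical sketch of per-VCPU budget accounting, not RT-Xen source:
 * the budget is refilled at the start of each period, drained while the
 * VCPU runs, and simply kept (deferred) while the VCPU is idle. */
#include <stdio.h>

struct rt_vcpu {
    unsigned budget_ms;      /* configured budget per period     */
    unsigned period_ms;      /* configured period                */
    unsigned cur_budget_ms;  /* budget remaining this period     */
    unsigned next_refill_ms; /* absolute time of the next refill */
};

static void account(struct rt_vcpu *v, unsigned now_ms, unsigned ran_ms)
{
    if (now_ms >= v->next_refill_ms) {       /* a new period has started */
        v->cur_budget_ms = v->budget_ms;     /* refill the budget        */
        v->next_refill_ms += v->period_ms;
    }
    if (ran_ms >= v->cur_budget_ms)          /* consume what was run     */
        v->cur_budget_ms = 0;
    else
        v->cur_budget_ms -= ran_ms;
    /* if ran_ms == 0 the budget is left untouched, i.e. deferred */
}

int main(void)
{
    struct rt_vcpu v = { .budget_ms = 4, .period_ms = 10,
                         .cur_budget_ms = 4, .next_refill_ms = 10 };

    account(&v, 3, 2);   /* ran 2 ms at t=3: budget 4 -> 2              */
    account(&v, 7, 0);   /* idle at t=7: budget is deferred, stays at 2 */
    account(&v, 11, 1);  /* t=11 starts a new period: refill to 4, -> 3 */
    printf("remaining budget = %u ms\n", v.cur_budget_ms); /* 3 ms */
    return 0;
}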

Within each scheduler, users can switch between different priority schemes: earliest deadline first (EDF), where the VCPU with the earlier deadline has higher priority, or rate monotonic (RM), where the VCPU with the shorter period has higher priority. As a result, the VCPU gets not only a resource reservation (budget/period), but also explicit timing information about when that CPU resource is provided (the period). The real-time schedulers in RT-Xen deliver the desired real-time performance to the VMs based on these resource reservations.
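
The difference between the two priority schemes can be seen in a tiny self-contained sketch (again hypothetical code, not RT-Xen's implementation): under EDF the earlier absolute deadline wins, while under RM the shorter period always wins, so the two schemes can pick different VCPUs.

/* Hypothetical sketch, not RT-Xen source: how an EDF or RM scheduler
 * would choose the higher-priority of two runnable VCPUs. */
#include <stdbool.h>
#include <stdio.h>

struct vcpu {
    const char *name;
    unsigned budget_ms;    /* CPU time granted per period             */
    unsigned period_ms;    /* replenishment period                    */
    unsigned deadline_ms;  /* absolute deadline of the current period */
};

/* EDF: the VCPU whose deadline comes first wins. */
static bool edf_higher_prio(const struct vcpu *a, const struct vcpu *b)
{
    return a->deadline_ms < b->deadline_ms;
}

/* RM: the VCPU with the shorter period wins, regardless of deadline. */
static bool rm_higher_prio(const struct vcpu *a, const struct vcpu *b)
{
    return a->period_ms < b->period_ms;
}

int main(void)
{
    struct vcpu v1 = { "vcpu1", 2, 10, 18 };  /* short period, later deadline  */
    struct vcpu v2 = { "vcpu2", 5, 30, 12 };  /* long period, earlier deadline */

    printf("EDF picks %s\n", edf_higher_prio(&v1, &v2) ? v1.name : v2.name);
    printf("RM  picks %s\n", rm_higher_prio(&v1, &v2) ? v1.name : v2.name);
    return 0;
}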

To be more specific, the two multi-core schedulers in RT-Xen are:

  • RT-global: uses a global run queue to hold all VCPUs in a runnable state. It is CPU-mask aware and provides better resource utilization, as VCPUs can migrate freely between physical cores (within their CPU masks).
  • RT-partition: uses a run queue per physical CPU. In this way, each physical CPU only looks at its own run queue to make scheduling decisions, which incurs less overhead and can yield better cache performance. However, load balancing between physical cores is not provided in the current release.

Source Code and References

The developers of RT-Xen are looking closely at how to integrate both schedulers into mainstream Xen. In the meantime, please check out the publications at [Tech Report’13] [RTAS’12] [EMSOFT’11] (the latter two focus on the single-core case) and the source code.

Posted in Partner Announcements, Xen Development, Xen Hypervisor.



Xen Project Well Represented at SUSECon and openSUSE Summit

What do a chameleon, a panda, and a mouse have in common?  More than you might imagine, unless you were present at SUSECon and the openSUSE Summit at the Walt Disney Coronado Springs resort last week in Florida.  During the week, it was clear that the SUSE chameleon and the Xen panda could happily coexist on Mickey Mouse’s home turf.

Geeko, the openSUSE Chameleon

I had the opportunity to attend and speak at both conferences. The week started with 3 and a half days of SUSECon, an event dedicated to SUSE’s commercial Linux products, and finished with 2 days of openSUSE Summit, which celebrates the Open Source distribution. Key people from both worlds filled the halls, and the schedule boasted an excellent assortment of talks.

At least a half dozen sessions at SUSECon included some Xen-related content, not to mention additional sessions at the openSUSE Summit.  Both the Open Source openSUSE distribution and the commercial SUSE Linux Enterprise Server (SLES) boast support for multiple virtualization engines (Xen, KVM, Linux containers, etc.).  While this might not seem too significant, it is a refreshing departure from many companies in the industry which insist on hawking their preferred virtualization technology. On platforms where hypervisors have equal footing, Xen has the opportunity to shine — and usually does.

Continued…

Posted in Community, Events.
