<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>blog.xen.org</title>
	<atom:link href="http://blog.xen.org/index.php/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.xen.org</link>
	<description>Xen.org Community Blog</description>
	<lastBuildDate>Fri, 11 May 2012 20:31:46 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Benchmarking the new PV ticketlock implementation</title>
		<link>http://blog.xen.org/index.php/2012/05/11/benchmarking-the-new-pv-ticketlock-implementation/</link>
		<comments>http://blog.xen.org/index.php/2012/05/11/benchmarking-the-new-pv-ticketlock-implementation/#comments</comments>
		<pubDate>Fri, 11 May 2012 06:00:23 +0000</pubDate>
		<dc:creator>attilio</dc:creator>
				<category><![CDATA[Xen Development]]></category>
		<category><![CDATA[Benchmarking]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[performance]]></category>
		<category><![CDATA[spinlock]]></category>
		<category><![CDATA[ticketlock]]></category>

		<guid isPermaLink="false">http://blog.xen.org/?p=4521</guid>
		<description><![CDATA[This post written collaboratively by Attilio Rao and George Dunlap Operating systems are generally written assuming that they are in direct control of the hardware. So when we run operating systems in virtual machines, where they share the hardware with other operating systems, this can sometimes cause problems. One of the areas addressed by a [...]]]></description>
			<content:encoded><![CDATA[<p><i>This post was written collaboratively by Attilio Rao and George Dunlap</i></p>
<p>Operating systems are generally written assuming that they are in direct control of the hardware.  So when we run operating systems in virtual machines, where they share the hardware with other operating systems, this can sometimes cause problems.  One of the areas addressed by a recently proposed patch series is the problem of spinlocks on a virtualized system.  So what exactly is the problem here, and how does the patch solve it?  And what is the effect of the patch when the kernel is running natively?</p>
<p><strong>Spinlocks and virtualization</strong></p>
<p>Multiprocessor systems need to be able to coordinate access to important data, to make sure that two processors don&#8217;t attempt to modify things at the same time.  The most basic way to do this is with a <i>spinlock</i>.  Before accessing data, the code will attempt to grab the spinlock.  If code running on another processor is holding the spinlock, the code on this processor will &#8220;spin&#8221; waiting for the lock to be free, at which point it will continue.  Because those waiting for the spinlock are doing &#8220;busy-waiting&#8221;, code should try to hold the spinlock only for relatively short periods of time.</p>
<p><span id="more-4521"></span>Now let&#8217;s consider how this looks on a virtualized system. In this case, the VM has virtual cpus (vcpu) which share the physical cpus with virtual cpus from other VMs.  Typically the total number of virtual cpus across all VMs exceeds the number of physical CPUs; in some cases, such as cloud environments, there may be several times as many vcpus as physical cpus.  </p>
<p>To accomplish this, the hypervisor scheduler gives timeslices of physical processor time to the vcpus, similar to the way that an operating system schedules processes.  If the system has more virtual cpus wanting to run than physical processors to run them, some of them will be preempted to let others run.  This allows the VMs to share the physical cpu resources effectively, but it breaks a hidden assumption in the spinlock algorithm: that the kernel code is not preempted while holding a spinlock.</p>
<p>Now, suppose that vcpu A grabs a spinlock, and before it finishes, is preempted by the scheduler.  And then suppose vcpu B tries to grab the spinlock.  Now B, instead of spinning for the short period of time that A needed the spinlock for, will be spinning until vcpu A is scheduled again, which may be anywhere from several milliseconds to hundreds of milliseconds, depending on how busy the system is.  B is now using the cpu but accomplishing nothing.  It&#8217;s burning up its VM&#8217;s share of CPU time, and keeping other VMs with useful work to do from running.  Worse yet, it may even be that the reason A is not running is that the hypervisor&#8217;s scheduler is trying to give priority to B, so B is actively keeping A from finishing the work that it needed to do with the spinlock.</p>
<p>The situation gets even worse with a more advanced form of spinlock called a <i>ticketlock</i>.  Ticketlocks have a lot of advantages for large systems, including reduced cacheline bouncing and more predictable wait time.  (See <a title="Ticket spinlocks" href="http://lwn.net/Articles/267968/" target="_blank">this LWN article</a> for a complete description.)  The key attribute for this discussion is that ticketlocks essentially make a first-come first-served queue: if A has the lock, then B tries to grab it, and then C, B is guaranteed to get the lock before C.  So now, if C is spinning waiting for the lock, it must wait for <i>both</i> A and B to finish before it can get the lock.</p>
<p>The result is that on a moderately loaded system, the vast majority of the cpu cycles are actually spent waiting for ticketlocks rather than doing any useful work.  This is called a &#8220;ticketlock storm&#8221;.  (It was documented for the first time by Thomas Friebel in his <a title="Thomas Friebel analysis 2008" href="http://www.xen.org/files/xensummitboston08/LHP.pdf" target="_blank">presentation</a> at XenSummit 2008.)</p>
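<p>The first-come first-served behaviour described above can be illustrated with a toy, single-threaded model. This is only a sketch: the class and function names are made up for this example, nothing here is the kernel implementation, and the &#8220;preempted&#8221; set simply stands in for whatever the hypervisor scheduler decided.</p>

```python
class TicketLock:
    """Toy, single-threaded model of a ticket lock (not the kernel code)."""

    def __init__(self):
        self.next_ticket = 0   # ticket handed to the next arriving vcpu
        self.now_serving = 0   # ticket currently allowed to hold the lock

    def take_ticket(self):
        ticket = self.next_ticket
        self.next_ticket += 1
        return ticket

    def holds_lock(self, ticket):
        return ticket == self.now_serving

    def unlock(self):
        self.now_serving += 1


def spinning_waiters(lock, tickets, preempted):
    """Return the vcpus that are scheduled but can do nothing except spin.

    tickets maps a vcpu name to its ticket; preempted is the set of
    vcpus the (hypothetical) hypervisor scheduler has descheduled.
    """
    return sorted(name for name, ticket in tickets.items()
                  if name not in preempted and not lock.holds_lock(ticket))


lock = TicketLock()
tickets = {name: lock.take_ticket() for name in ("A", "B", "C")}

# A holds the lock but is preempted: B and C both burn cpu time.
stuck_on_a = spinning_waiters(lock, tickets, preempted={"A"})

# A finally runs, unlocks and leaves, but now B (next in line) is preempted:
# C is running and the lock is free, yet C still cannot take it.
lock.unlock()
del tickets["A"]
stuck_on_b = spinning_waiters(lock, tickets, preempted={"B"})
print(stuck_on_a, stuck_on_b)
```

<p>The second case is the one that makes ticketlocks worse than plain spinlocks under virtualization: the lock is not even held, but the strict queue order means a preempted waiter blocks everybody behind it.</p>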
<p><!--more--><strong>Fixes to the vCPU starvation</strong></p>
<p>In order to fix this starvation problem, in 2008, Jeremy Fitzhardinge developed a Linux kernel patch. His approach was to offer an intermediate layer for the spinlocks to allow paravirt backends to redefine the full series of spinlock operations. Then he wrote a Xen-specific implementation which:</p>
<ul>
<li>For the fast case (uncontended) uses the traditional spinlock single-byte lock approach, overriding the ticketlock logic</li>
<li>If a vCPU cannot get the lock after a specific number of iterations (probably because of lock contention), it adds itself to a per-cpu list and blocks on an event channel</li>
<li>On unlock the per-cpu list is walked and the first vCPU in line is woken up</li>
</ul>
<p>In other words, after spinning for a certain amount of time, the code will assume that the vcpu holding the lock is not running; and instead of continuing to spin, will yield the cpu so that other work can get done.</p>
<p>This case doesn&#8217;t penalize native Linux while still giving all the desired benefits. However, the paravirtualized spinlocks introduced a performance regression when running natively.  So a specific kernel option, CONFIG_PARAVIRT_SPINLOCK, was introduced to include them.  That way, distros could choose whether to take the additional spinlock indirection overhead when running a kernel natively (CONFIG_PARAVIRT_SPINLOCK=y), or to risk a ticketlock storm when running a kernel as a VM (CONFIG_PARAVIRT_SPINLOCK=n).</p>
<p>While this approach has proven to work well, it has also shown some problems. First of all the indirection layer adds some overhead, even if not excessive, on a critical path. Second, this model imposes some code duplication.</p>
<p>In order to address these issues, Jeremy worked on a new approach based on the existing ticketlock implementation. More specifically, the fastpath is left untouched and some PVOPs are added to the slow paths. These PVOPs are responsible for doing the following:</p>
<ul>
<li>The __ticket_lock_spinning() PVOP is used in the lock case and is invoked once a CPU has been spinning for longer than a specific threshold. Once invoked, __ticket_lock_spinning() marks the spinlock as being in the slowpath state and blocks the vCPU.</li>
<li>The __ticket_unlock_kick() PVOP is invoked in the release slow path and kicks the next vCPU in line awake. When the lock becomes uncontended, the slowpath bit gets cleared. The bit signaling the slow path is stored in the LSB of the lock&#8217;s tail ticket, which means the number of CPUs ticketlocks can handle is reduced by a factor of 2; this matters mostly when the tickets are small.</li>
</ul>
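<p>To see why stealing the LSB of the tail ticket halves the number of CPUs, here is a small arithmetic sketch. It is not the patch code: the constant and function names are invented for this example, and the 8-bit ticket width is just an assumption picked to make the numbers easy to read.</p>

```python
SLOWPATH_FLAG = 1      # bit 0 of the tail ticket marks "a waiter has blocked"
TICKET_STEP = 2        # tickets advance by 2 so they never touch bit 0


def max_cpus(ticket_bits, reserve_flag_bit):
    """Number of distinct tickets (and so CPUs) a ticket field can tell apart."""
    usable_bits = ticket_bits - 1 if reserve_flag_bit else ticket_bits
    return 2 ** usable_bits


def enter_slowpath(tail):
    """Mark the lock as having at least one blocked waiter."""
    return tail | SLOWPATH_FLAG


def in_slowpath(tail):
    return bool(tail & SLOWPATH_FLAG)


def clear_slowpath(tail):
    """Drop the flag again once the lock is uncontended."""
    return tail & ~SLOWPATH_FLAG


tail = 3 * TICKET_STEP           # three tickets handed out, flag clear
flagged = enter_slowpath(tail)   # a waiter gave up spinning and blocked
print(in_slowpath(tail), in_slowpath(flagged), clear_slowpath(flagged) == tail)
print(max_cpus(8, False), max_cpus(8, True))
```

<p>With the assumed 8-bit ticket, reserving one bit takes the limit from 256 down to 128 CPUs, which is why the cost is most visible on small tickets.</p>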
<p>This allows paravirtualized ticketlocks to share most of the code with native ticketlocks without the use of any additional layer.  It also removes the need for distros to choose between slower native performance and potentially catastrophic virtualized performance.</p>
<p>These patches have been heavily tested in the past, and recently they have been rebased to mainline Linux and further enhanced by IBM engineers Srivatsa Vaddagiri and Raghavendra K T.</p>
<p><strong>Benchmarks of the new approach</strong></p>
<p>One of the key things the Linux maintainers care about when considering this kind of functionality is the effect it will have when running the kernel natively.  In order to support patch inclusion in mainline Linux, I (Attilio) wanted to show evidence that the patches were not introducing a performance regression in native configuration with real-world workloads. In order to reproduce real-world situations I mainly used 3 tools:</p>
<ol>
<li>pgbench</li>
<li>pbzip2</li>
<li>kernbench</li>
</ol>
<p>pgbench is a tool for benchmarking PostgreSQL behaviour. It runs the same sequence of SQL commands repeatedly and calculates the average number of transactions per second. The sequence of commands can be customized, but by default pgbench uses five SELECT, UPDATE and INSERT commands for every transaction.</p>
<p>For my test I used PostgreSQL 9.2-devel (mainline) as a backend, which has a lot of important scalability and performance improvements over the stable version, with the aim of getting higher throughput. Also, I used a stock installation, with only a simple <a href="http://xenbits.xen.org/people/attilio/jeremy-spinlock/postsgresql.conf.patch" target="_blank">configuration change</a>.</p>
<p>In order to have a full characterization of the scalability, I ran the workload with different numbers of threads (ranging from 1 to 64), 10 times each, and used some warmup runs in order to load the whole database into memory and thus avoid actual I/O operations when the real measurements are taken. The script used for collecting the data is available <a href="http://xenbits.xen.org/people/attilio/jeremy-spinlock/pgbench_script" target="_blank">here</a>.</p>
<p>pbzip2 is a parallel implementation of the bzip2 file compressor utility using threads. I used this in order to emulate a CPU-intensive, multithreaded application. The file being compressed was a 1GB file created from /dev/urandom, and all the I/O was performed on tmpfs in order to reduce fluctuations in the results. Again, for a scalability characterization, the workload was run with several numbers of threads, and the script used is <a href="http://xenbits.xen.org/people/attilio/jeremy-spinlock/pbzip2bench_script" target="_blank">here</a>.</p>
<p>kernbench is a script comparing Linux kernel compile times, taking into account different numbers of make jobs. I went with the following invocation:</p>
<ul>
<li>kernbench -n10 -s -c16 -M -f</li>
</ul>
<p>which basically means running the test 10 times with 1, 8 and 16 threads. Again, in order to reduce the I/O effect I used a tmpfs volume to do all the I/O.</p>
<p>The tests were performed on top of a vanilla Linux-3.3-rc6 (commit 4704fe65e55fb088fbcb1dc0b15ff7cc8bff3685), with a <a href="http://xenbits.xen.org/people/attilio/jeremy-spinlock/kernel-configs/" target="_blank">monolithic configuration</a>. It was also important to estimate the impact of the CONFIG_PARAVIRT_SPINLOCK option on this benchmark in order to eventually consider its removal.<br />
The tests therefore involved 4 different kernels, covering every combination of Jeremy&#8217;s patch and CONFIG_PARAVIRT_SPINLOCK being on or off.</p>
<p>Below is information related to the system and machine used:</p>
<ul>
<li><a href="http://xenbits.xen.org/people/attilio/jeremy-spinlock/dmesg" target="_blank">Machine is a XEON x3450, 2.6GHz, 8-ways system</a></li>
<li><a href="http://xenbits.xen.org/people/attilio/jeremy-spinlock/debian-version" target="_blank">System version, a Debian Squeeze 6.0.4</a></li>
<li><a href="http://xenbits.xen.org/people/attilio/jeremy-spinlock/gcc-version" target="_blank">gcc version 4.4.5</a></li>
</ul>
<p>All the results have been analyzed with ministat, a tool calculating fundamental statistical properties of data sets provided in files. They are summarized, broken down by number of threads and kernel configuration, in the following links:</p>
<ul>
<li><a href="http://xenbits.xen.org/people/attilio/jeremy-spinlock/pgbench-9.2-total.bench" target="_blank">http://xenbits.xen.org/people/attilio/jeremy-spinlock/pgbench-9.2-total.bench</a></li>
<li><a href="http://xenbits.xen.org/people/attilio/jeremy-spinlock/pbzip2-1.1.1-total.bench" target="_blank">http://xenbits.xen.org/people/attilio/jeremy-spinlock/pbzip2-1.1.1-total.bench</a></li>
<li><a href="http://xenbits.xen.org/people/attilio/jeremy-spinlock/kernbench-0.50-total.bench" target="_blank">http://xenbits.xen.org/people/attilio/jeremy-spinlock/kernbench-0.50-total.bench</a></li>
</ul>
<p>As you can easily verify, ministat calculated the average of the 10 values collected for every case, then compared the results and calculated the difference and standard deviation for every comparison.</p>
<p>The final result is that the patch doesn&#8217;t seem to introduce any performance penalty for these 3 workloads. Furthermore, it seems that the option CONFIG_PARAVIRT_SPINLOCK can be removed at the present time (although it likely needs to be re-evaluated with CPUs older than the XEON x3450).</p>
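<p>For readers who want to redo this kind of comparison on their own numbers, the averaging and comparison steps can be sketched with the Python standard library. This is not ministat and it does not reproduce ministat&#8217;s own significance test; the function names are invented for this example and the sample values are made up, they are <em>not</em> measurements from this post.</p>

```python
from statistics import mean, stdev


def summarize(samples):
    """Average and sample standard deviation of one benchmark configuration."""
    return mean(samples), stdev(samples)


def percent_difference(baseline, candidate):
    """Difference of the averages, as a percentage of the baseline average."""
    return (mean(candidate) - mean(baseline)) * 100.0 / mean(baseline)


# Made-up transactions-per-second figures, NOT measurements from the post.
unpatched = [1000.0, 1010.0, 990.0, 1005.0, 995.0]
patched = [1002.0, 1008.0, 992.0, 1006.0, 997.0]

base_avg, base_sd = summarize(unpatched)
diff = percent_difference(unpatched, patched)

# A difference much smaller than the run-to-run spread is not evidence
# of a regression (or of an improvement).
print(round(base_avg, 1), round(base_sd, 2), round(diff, 2))
```

<p>The point of looking at the standard deviation alongside the difference is exactly the one made above: a change in the average only means something when it is large compared with the spread between runs.</p>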
]]></content:encoded>
			<wfw:commentRss>http://blog.xen.org/index.php/2012/05/11/benchmarking-the-new-pv-ticketlock-implementation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>XCP in Ubuntu Server 12.04 LTS: “apt-get install xcp-xapi“</title>
		<link>http://blog.xen.org/index.php/2012/05/06/xcp-in-ubuntu-server-12-04-lts-%e2%80%9capt-get-install-xcp-xapi%e2%80%9c/</link>
		<comments>http://blog.xen.org/index.php/2012/05/06/xcp-in-ubuntu-server-12-04-lts-%e2%80%9capt-get-install-xcp-xapi%e2%80%9c/#comments</comments>
		<pubDate>Sun, 06 May 2012 09:00:37 +0000</pubDate>
		<dc:creator>Lars</dc:creator>
				<category><![CDATA[XCP]]></category>
		<category><![CDATA[Cloud]]></category>
		<category><![CDATA[Ubuntu]]></category>
		<category><![CDATA[Ubuntu Server 12.04 LTS]]></category>

		<guid isPermaLink="false">http://blog.xen.org/?p=4525</guid>
		<description><![CDATA[Canonical’s release of Ubuntu Server 12.04 LTS now includes support for the Xen Hypervisor (version 4.1.2), Xen Cloud Platform (XCP) packages and XCP OpenStack plug-ins. The inclusion of the Xen Cloud Platform packages into Ubuntu Server 12.04 LTS makes Xen more easily accessible to Ubuntu users and adds a wealth of enterprise virtualization functionality. XCP [...]]]></description>
			<content:encoded><![CDATA[<p><img alt="" src="http://design.canonical.com/wp-content/uploads/2011/03/cof_orange_hex1.png" class="alignright" width="100" />Canonical’s release of <a href="http://releases.ubuntu.com/12.04/">Ubuntu Server 12.04 LTS</a> now includes support for the <a href="http://xen.org/products/xenhyp.html">Xen Hypervisor</a> (version 4.1.2), <a href="http://xen.org/products/cloudxen.html">Xen Cloud Platform</a> (XCP) packages and XCP OpenStack plug-ins.  The inclusion of the Xen Cloud Platform packages into Ubuntu Server 12.04 LTS makes Xen more easily accessible to Ubuntu users and adds a wealth of enterprise virtualization functionality.</p>
<p>XCP Benefits for Ubuntu Users:</p>
<ul>
<li>Ubuntu users can get started with Xen and stay up-to-date through the native Ubuntu package management system and update service</li>
<li>Ubuntu users are now able to make use of enterprise virtualization functionality such as Host Pools, Storage Repositories, Performance Monitoring, Host Plugins, Guest VM Templates and <a href="http://openvswitch.org">Open vSwitch</a> integration (for a detailed list see the <a href="http://wiki.xen.org/wiki/XCP/XenServer_Feature_Matrix">XCP feature matrix</a>).</li>
<p>	<span id="more-4525"></span>
<li> Ubuntu users can interact with Xen through the powerful Xen API (XAPI), with its host of integrations and tools.</li>
<li> Ubuntu users can now customize XCP to their needs, whereas traditionally XCP appliances only utilized pre-defined configurations. For example users can now choose the latest versions of the Xen Hypervisor, Open vSwitch, Python and other software in their Ubuntu Dom 0.</li>
</ul>
<p>For more information see the XCP-API package and documentation. This is just the beginning: XCP in Ubuntu will evolve with the help of our user and developer communities. You can influence the direction of XCP in Ubuntu by attending the <a href="http://summit.ubuntu.com/uds-q/meeting/20367/servercloud-q-xcp/">XCP design</a> and <a href="http://summit.ubuntu.com/uds-q/meeting/20627/servercloud-q-xen/">Xen design</a> sessions at <a href="http://summit.ubuntu.com/uds-q/">Ubuntu Developer Summit – Q</a>.</p>
<h3>Supporting Quotes:</h3>
<p>&#8220;<em>The inclusion of XCP into Ubuntu lets users keep up to date with Xen with a single command.</em>&#8221; said <strong>Paul Oh, Director of Business Development at Canonical</strong>. &#8220;<em>The combination of Xen and Ubuntu offers a compelling combination of performance and security for the Cloud that Ubuntu users will find very compelling.</em>&#8221;</p>
<p>&#8220;<em>The collaboration between the Xen and Ubuntu communities was a great example of the power of open source. XCP-XAPI packages in Ubuntu will enable Ubuntu users to build Xen based clouds more easily.</em>“ said <strong>Mark Heath, VP of Products for XenServer at Citrix</strong>. “<em>As we continue to integrate and optimize Xen and XCP with Ubuntu, we look forward to meeting our users needs and helping to evolve better solutions for cloud.</em>”</p>
<p>“<em>I am pleased that project Kronos, which was started five months ago, has delivered XCP into Ubuntu Server 12.04 LTS. Until now, it was only possible to use XCP in Linux appliances within a tightly controlled environment. In Ubuntu Server 12.04 LTS we changed how users interact with XCP, providing much more flexibility and enabling anybody to use Ubuntu as a XCP Dom0 kernel.</em>” said <strong>Lars Kurth, Community Manager, Xen.org</strong>. <em>“I am also very excited about the possibilities this opens to build <a href="http://cloudstack.org/">CloudStack</a>, <a href="https://projects.eucalyptus.com/">Eucalyptus</a>, <a href="http://opennebula.org/">OpenNebula</a> and <a href="http://openstack.org/">OpenStack</a> powered clouds on Ubuntu and XCP-XAPI.</em>“</p>
<h3>Further Information:</h3>
<ul>
<li><a href="https://help.ubuntu.com/community/XenProposed">Xen Documentation on Ubuntu</a></li>
<li><a href="http://packages.ubuntu.com/precise/xcp-xapi">XCP-API package on Ubuntu</a></li>
<li><a href="http://manpages.ubuntu.com/manpages/precise/man1/xapi.1.html">XCP-XAPI documentation on  Ubuntu</a></li>
<li><a href="http://wiki.xen.org/wiki/Category:OpenStack">XCP-OpenStack documentation</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.xen.org/index.php/2012/05/06/xcp-in-ubuntu-server-12-04-lts-%e2%80%9capt-get-install-xcp-xapi%e2%80%9c/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Dom0 Memory &#8212; Where It Has Not Gone</title>
		<link>http://blog.xen.org/index.php/2012/04/30/do%ef%bb%bfm0-memory-where-it-has-not-gone/</link>
		<comments>http://blog.xen.org/index.php/2012/04/30/do%ef%bb%bfm0-memory-where-it-has-not-gone/#comments</comments>
		<pubDate>Mon, 30 Apr 2012 15:22:49 +0000</pubDate>
		<dc:creator>dvrabel</dc:creator>
				<category><![CDATA[Xen Development]]></category>
		<category><![CDATA[Dom0]]></category>
		<category><![CDATA[Linux]]></category>

		<guid isPermaLink="false">http://blog.xen.org/?p=4480</guid>
		<description><![CDATA[If you are upgrading domain 0 Linux kernel from a non-pvops (classic, 2.6.18/2.6.32/etc.) kernel to a pvops one (3.0 or later), you may find that the amount of free memory inside dom0 has decreased significantly.  This is because of changes in the way kernel handles the memory given to it by Xen.  With some updates [...]]]></description>
			<content:encoded><![CDATA[<p>If you are upgrading domain 0 Linux kernel from a non-pvops (classic, 2.6.18/2.6.32/etc.) kernel to a pvops one (3.0 or later), you may find that the amount of free memory inside dom0 has decreased significantly.  This is because of changes in the way kernel handles the memory given to it by Xen.  With some updates and configuration changes, the &#8220;lost&#8221; memory can be recovered.</p>
<p style="padding-left: 30px;">tl;dr: If you previously had &#8216;dom0_mem=2G&#8217; as a command line option to Xen, change this to &#8216;dom0_mem=2G,max:2G&#8217;.  If that didn&#8217;t help, read on.</p>
<p><strong><span id="more-4480"></span>What Changed?<br />
</strong><br />
When a domain is started, it is provided with:</p>
<ol>
<li> A number of pages of memory. These are contiguous in the guest&#8217;s pseudo-physical address space and start at address 0.</li>
<li>The maximum number of pages the domain is allowed to have (the <em>maximum reservation</em>).  This may be more than the initial number of pages to allow the guest to balloon up (increase its number of pages).</li>
<li>The e820 memory map.  For dom0, the physical e820 map is used so dom0 can access all BIOS data areas and PCI device MMIO regions.</li>
</ol>
<p>The principal differences between the new and old kernels are:</p>
<ol>
<li>The new kernel allocates page tables and struct page&#8217;s for all the pages up to the maximum reservation, whereas the older kernels only allocate enough for the current number of pages.  On systems with lots of RAM relative to the number of pages used by dom0 this can result in lots of memory wasted.</li>
<li>The new kernel tries harder to release pages back to Xen that are unusable (because they are behind reserved regions or holes in the memory map).</li>
</ol>
<p><a href="http://blog.xen.org/wp-content/uploads/2012/04/e820_current1.png"><img class="aligncenter size-full wp-image-4517" title="e820_map" src="http://blog.xen.org/wp-content/uploads/2012/04/e820_current1.png" alt="Diagram showing how the kernel modifies the e820 map and how it releases pages that overlap with holes." width="500" height="460" /></a></p>
<p><strong>Give Me Back My Memory!</strong></p>
<p>To stop the kernel from wasting memory on page tables etc. for memory it will never use:</p>
<ol>
<li>Use the Xen dom0_mem=max:<em>LLL</em> command line option to set the maximum reservation to <em>LLL</em> (the M and G suffixes can be used for MiB and GiB).  E.g., if you previously had dom0_mem=<em>XXX</em>G, change this to dom0_mem=<em>XXX</em>G,max:<em>XXX</em>G.</li>
</ol>
<p>This requires Xen 4.1.2 or later (which sets the maximum reservation based on the command line) <em>and</em> Linux 3.0.5 or later (which makes use of it).</p>
<p>To recover the memory released during boot:</p>
<ol>
<li>Ensure the balloon driver is enabled in the kernel (CONFIG_XEN_BALLOON).</li>
<li>Set the balloon driver&#8217;s target to the desired amount of RAM. Either with the mechanism provided by your toolstack or directly with (<em>XXX</em> is the amount of RAM in KiB):
<p style="padding-left: 30px;">echo <em>XXX</em> &gt; /sys/devices/system/xen_memory/xen_memory0/target_kb</p>
</li>
</ol>
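<p>Since the Xen command line takes sizes with a G suffix while the balloon target wants KiB, it is easy to get the units wrong. The tiny helper below is only a sketch (the function names are made up for this example) that produces both values from a size in GiB:</p>

```python
KIB_PER_GIB = 1024 * 1024


def dom0_mem_option(gib):
    """Xen command line option capping the maximum reservation at the boot size."""
    return "dom0_mem={0}G,max:{0}G".format(gib)


def balloon_target_kb(gib):
    """Value to write to the balloon driver's target_kb file, in KiB."""
    return gib * KIB_PER_GIB


print(dom0_mem_option(2))     # dom0_mem=2G,max:2G
print(balloon_target_kb(2))   # 2097152
```

<p>For a 2 GiB dom0 this gives the same option as in the tl;dr above, and 2097152 as the number to echo into target_kb.</p>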
<p><strong>I&#8217;m Lazy, Make it So</strong></p>
<p>A series of <a href="http://lists.xen.org/archives/html/xen-devel/2012-04/msg01152.html">patches</a> by Konrad Rzeszutek Wilk will eliminate the need for the balloon driver.  With these patches, the otherwise unusable pages will be moved instead of released.  These patches are expected in 3.5.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.xen.org/index.php/2012/04/30/do%ef%bb%bfm0-memory-where-it-has-not-gone/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>NUMA and Xen: Part 1, Introduction</title>
		<link>http://blog.xen.org/index.php/2012/04/26/numa-and-xen-part-1-introduction/</link>
		<comments>http://blog.xen.org/index.php/2012/04/26/numa-and-xen-part-1-introduction/#comments</comments>
		<pubDate>Thu, 26 Apr 2012 13:00:18 +0000</pubDate>
		<dc:creator>dariof</dc:creator>
				<category><![CDATA[Xen Development]]></category>
		<category><![CDATA[Xen Hypervisor]]></category>
		<category><![CDATA[Benchmarking]]></category>
		<category><![CDATA[NUMA]]></category>
		<category><![CDATA[Request for Comment]]></category>

		<guid isPermaLink="false">http://blog.xen.org/?p=4234</guid>
		<description><![CDATA[Yes, no matter whether big or small NUMA piece of hardware, the Xen.org community is staring at you right in the eyes, and we will get the best out of you, no matter how hard it will be... "Lower your shields and surrender. Resistance is futile"!]]></description>
			<content:encoded><![CDATA[<h3>NUMA? What&#8217;s NUMA?</h3>
<p>Having to deal with a Non-Uniform Memory Access (<a title="NUMA" href="http://en.wikipedia.org/wiki/Non-Uniform_Memory_Access" target="_self">NUMA</a>) machine is becoming more and more common. This is true no matter whether you are part of an engineering research center with access to one of the first <a title="Intel SCC" href="http://news.bbc.co.uk/2/hi/technology/8392392.stm">Intel SCC</a>-based machines, a virtual machine hosting provider with a bunch of dual <a title="AMD Opteron" href="http://products.amd.com/pages/OpteronCPUDetail.aspx?id=489&amp;f1=&amp;f2=&amp;f3=Yes&amp;f4=&amp;f5=&amp;f6=&amp;f7=&amp;f8=&amp;f9=&amp;f10=&amp;f11=&amp;" target="_self">2376 AMD Opteron</a> pieces of iron in your server farm, or even just a Xen developer using a dual socket <a title="Intel Xeon" href="http://ark.intel.com/products/47925/Intel-Xeon-Processor-E5620-(12M-Cache-2_40-GHz-5_86-GTs-Intel-QPI)" target="_self">E5620 Xeon</a> based test-box (any reference to the recent experiences of the author of this post is purely accidental <img src='http://blog.xen.org/wp-includes/images/smilies/icon_biggrin.gif' alt=':-D' class='wp-smiley' /> ).  Just very quickly, NUMA means that the memory access time of a program running on a CPU depends on the relative <em>distance</em> between that CPU and that memory. In fact, most NUMA systems are built in such a way that each processor has its local memory, on which it can operate very fast. On the other hand, getting and storing data from and on remote memory (that is, memory local to some other processor) is considerably more complex and slower.  Therefore, while hardware engineers bump their heads against cache coherency protocols and routing strategies for the system buses to be put on such machines, the most urgent issue for us, OS and hypervisor developers, is the following: how can we couple scheduling and memory management so that most of the accesses for most of our tasks/VMs stay local?<br />
<span id="more-4234"></span><br />
<h3>NUMA and Xen</h3>
<p>The Xen hypervisor already deals with NUMA in a number of ways. For example, each domain has its own &#8220;node affinity&#8221;, which is the set of NUMA nodes of the host from which memory for that domain is allocated (in equal parts). That becomes very important as soon as many domains start running memory-intensive workloads on a shared host. In fact, as soon as the majority of the memory accesses become remote, the degradation in performance is likely to be noticeable.  An effective technique to deal with this architecture in a virtualization environment is virtual CPU (vCPU) pinning. I mean, if a domain can only run on a subset of the host&#8217;s physical CPUs (pCPUs), it is very easy to turn all its memory accesses into local ones, isn&#8217;t it? Actually, that is exactly what Xen does by default: at domain creation time, it constructs the domain&#8217;s node affinity based on which nodes these pCPUs belong to&#8230; <span style="text-decoration: underline;">Provided the domain specifies some pinning for its vCPUs in its config file</span>, with something like the below (assuming CPUs #0 to #3 belong to the same NUMA node):</p>
<p style="padding-left: 30px;"><span style="color: #808080;">&#8230;<br />
</span><span style="color: #808080;">vcpus = '4'<br />
memory = '1024'<br />
cpus = "0-3"<br />
&#8230;</span></p>
<p>This is quite effective, to the point that the (old) xend-based toolstack, if there is no vCPU pinning in a domain config file, tries to figure out on its own where to put the domain, and pins its vCPUs there! That could be seen as something reasonable to do, and surely brings performance benefits. However, vCPU pinning is quite inflexible, as the VM in question won&#8217;t <strong>for any reason</strong> be allowed to run outside that set of pCPUs. This means, no matter whether it is the most (artificially-)intelligent of the toolstacks or the most careful of the sysadmins setting it up, you are in constant danger of under-utilizing or badly utilizing the hardware resources you paid a lot of money for.</p>
<p>As of now, the new xl-based toolstack does not have anything like that.  Yes, there is the possibility of dealing with NUMA by partitioning the system using cpupools (available in the upcoming release of Xen, 4.2), as explained in <strong><a title="Xen.org - cpupools in xen 4.2" href="http://blog.xen.org/index.php/2012/04/23/xen-4-2-cpupools/" target="_self">Xen 4.2: cpupools</a></strong>. Again, this could be &#8220;The Right Answer&#8221; for many needs and occasions, but it has to be carefully considered and set up by hand. What would be nice to have is a self-configuring solution automatically jumping into the game and maximizing the overall system performance.</p>
<h3>Some numbers or, even better, some graphs!</h3>
<p>Let&#8217;s take a break from all this talking and look at what we are discussing more concretely. So the question is: can we run some memory-intensive benchmark within a bunch of competing VMs (on a NUMA host) and see if this local/remote memory accessing thing really matters? Well, actually, yes we can. Unfortunately, no matter how badly you are hoping the benchmarker to be the research lab guy with login credentials for an SCC platform&#8230; He actually is the poor Xen developer with the 2 socket Xeon. So plots showing what happened on a 2-node, 16-core system are what we can show. For now.</p>
<p>The chosen benchmark was SpecJBB2005, under the assumption that it will generate quite a bit of stress on the memory subsystem, which turned out to be the case. The host was the 16-CPU, 2-NUMA-node, Xeon based system with 12GB RAM (2GB of which reserved for Dom0). The Linux kernel for dom0 was 3.2, Xen was xen-unstable. Guests had 4 vCPUs and 1GB of RAM each. Numbers come from running the benchmark on an increasing (1 to 8) number of Xen PV-guests at the same time, and repeating each run 5 times for each of the VM configurations below:</p>
<ul>
<li><em>default</em> is the default Xen and xl behaviour without any vCPU pinning at all;</li>
<li><em>pinned</em> means VM#1 was pinned on NODE#0 after being created. This implies its memory was striped on both the nodes, but it can only run on the first one;</li>
<li><em>all memory local (best case)</em> means VM#1 was created on NODE#0 and kept there. That implies all its memory accesses are local, and we thus call it the best case;</li>
<li><em>all memory remote (worst case)</em> means VM#1 was created on NODE#0 and then moved (by explicitly pinning its vCPUs) on NODE#1. That implies all its memory accesses are remote, and we thus call it the worst case.</li>
</ul>
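<p>To make the four configurations a bit more tangible, here is a rough Python sketch of how local they are expected to be. It is only a toy model: the <code>local_fraction</code> helper, the node numbering and the assumptions that vCPUs spread evenly over the nodes they are allowed on and that accesses hit memory uniformly are all ours, not something Xen computes.</p>
<pre><code class="language-python"># Illustrative model only: the helper name, the node numbering and the
# uniform-scheduling assumption are simplifications, not measurements.

def local_fraction(memory_share, allowed_nodes):
    """Expected fraction of memory accesses that are node-local.

    memory_share: dict mapping node id to the fraction of the VM's memory on it.
    allowed_nodes: nodes the VM's vCPUs may run on (assumed equally likely).
    """
    nodes = list(allowed_nodes)
    # While running on node n, an access is local with probability memory_share[n].
    return sum(memory_share.get(n, 0.0) for n in nodes) / len(nodes)


# The four configurations from the list above, on a 2-node host.
configs = {
    "default": ({0: 0.5, 1: 0.5}, [0, 1]),
    "pinned": ({0: 0.5, 1: 0.5}, [0]),
    "all memory local": ({0: 1.0}, [0]),
    "all memory remote": ({0: 1.0}, [1]),
}

for name, (memory_share, allowed_nodes) in configs.items():
    print(name, local_fraction(memory_share, allowed_nodes))
</code></pre>
<p>In this simplified view, <em>default</em> and <em>pinned</em> both end up with about half of their accesses being local, while the best and worst cases sit at the two extremes.</p>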
<p>In all the experiments, it is only VM#1 that was pinned/moved. All the other VMs have their memory &#8220;striped&#8221; between the two nodes and are free to run anywhere. The final score achieved by SpecJBB on VM#1 is reported below. As SpecJBB output is in terms of &#8220;business operations per second (bops)&#8221;, higher values correspond to better results.</p>
<p style="text-align: center;"><a href="http://xenbits.xen.org/people/dariof/images/blog/NUMA_1/kernbench_avgstd2.png"><img class="aligncenter" title="SpecJBB2005 Throughput on VM#1" src="http://xenbits.xen.org/people/dariof/images/blog/NUMA_1/kernbench_avgstd2.png" alt="Absolute throughput of SpecJBB2005 on VM#1" width="498" height="374" /></a></p>
<p style="text-align: left;">First of all, notice how small the standard deviation is for all the runs: this just confirms SpecJBB is a good benchmark for our purposes. The most interesting lines to look at and compare are the red and the blue ones. The evidence is there that things can improve quite a bit, even on such a small box, especially in the presence of heavy load (6 to 8 VMs). Let&#8217;s also look at the percent increase in performance of each run with respect to the worst case (all memory remote):</p>
<p style="text-align: center;"><a href="http://xenbits.xen.org/people/dariof/kernbench_avg2.png"><img class="aligncenter" title="Percent Increase of SpecJBB Throughput in VM#1 wrt Worst Case" src="http://xenbits.xen.org/people/dariof/kernbench_avg2.png" alt="" width="491" height="369" /></a></p>
<p style="text-align: left;">This second graph makes it even clearer that NUMA placement and scheduling account for a ~10% to 20% (depending on the load) impact on performance. The default Xen behavior is certainly not as bad as it could be: <em>default</em> almost always manages to get ~10% better performance than the worst case. Also, although <em>pinning</em> can help in keeping performance consistent, it doesn&#8217;t always yield an improvement (and when it does, it is only by a few percentage points). There is a ~10% performance increase to gain (and even more, in heavily loaded cases), if we manage to get <em>default</em> close enough to <em>all memory local</em>, and that should be the way to go!<br />
The full set of results, with plots about all the statistical properties of the data, can be found <a title="SpecJBB2005 on xen-unstable benchmarks" href="http://xenbits.xen.org/people/dariof/benchmarks/specjbb2005-numa/" target="_self">here</a>.</p>
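<p>For reference, the &#8220;percent increase with respect to the worst case&#8221; used in the second graph can be computed as in the small Python sketch below. The scores in it are made-up round numbers chosen only to show the arithmetic; they are not the measured results, and the function name is ours.</p>
<pre><code class="language-python"># Hypothetical scores in bops, NOT the measured results from the benchmark.

def percent_increase(score, worst_case_score):
    """Relative improvement of score over worst_case_score, in percent."""
    return 100.0 * (score - worst_case_score) / worst_case_score


example_scores = {
    "all memory remote": 20000.0,
    "default": 22000.0,
    "all memory local": 24000.0,
}

worst = example_scores["all memory remote"]
for name, score in example_scores.items():
    print(name, round(percent_increase(score, worst), 1))
</code></pre>
<p>With these invented numbers, <em>default</em> would sit 10% above the worst case and <em>all memory local</em> 20% above it.</p>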
<h3>Are we on the case then?</h3>
<p>We sure are! <a title="[PATCH 00 of 10 [RFC]] Automatically place guest on host's NUMA nodes with xl" href="http://lists.xen.org/archives/html/xen-devel/2012-04/msg00732.html" target="_self">Preliminary patches</a> have been posted to the xen-devel mailing list, and the results of some (preliminary as well, of course) benchmarks are available on <a title="Benchmarks on NUMA RFC series" href="http://wiki.xen.org/wiki/Xen_NUMA_Benchmarks" target="_self">this Wiki article</a>. A new blog post expanding on their aims, features, and performance will follow. In the meantime, should you feel like helping with testing, benchmarking and fixing things, please, jump in!</p>
<h3>And the moral of the story is&#8230;</h3>
<p>Yes, my dear NUMA pieces of hardware out there, the Xen.org community is staring you right in the eyes, and we will get the best out of you, no matter how hard it will be&#8230; &#8220;<a title="Resistance is futile" href="http://en.wikipedia.org/wiki/Borg_(Star_Trek)#.22Resistance_is_futile.22" target="_self">Lower your shields and surrender. Resistance is futile</a>&#8221;!</p>
<p><a href="http://blog.xen.org/wp-content/uploads/2012/04/borg.png"><img class="aligncenter size-full wp-image-4403" title="Attribution to d.loop @ flickr under CC BY 2.0" src="http://blog.xen.org/wp-content/uploads/2012/04/borg.png" alt="" width="500" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.xen.org/index.php/2012/04/26/numa-and-xen-part-1-introduction/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Xen Event Update, May 2012</title>
		<link>http://blog.xen.org/index.php/2012/04/25/xen-event-update-may-2012/</link>
		<comments>http://blog.xen.org/index.php/2012/04/25/xen-event-update-may-2012/#comments</comments>
		<pubDate>Wed, 25 Apr 2012 11:35:56 +0000</pubDate>
		<dc:creator>Lars</dc:creator>
				<category><![CDATA[Events]]></category>
		<category><![CDATA[Build a Cloud Day]]></category>
		<category><![CDATA[Ubuntu Developer Summit]]></category>
		<category><![CDATA[Xen Day]]></category>
		<category><![CDATA[Xen Summit]]></category>
		<category><![CDATA[Xen Summit 2012]]></category>

		<guid isPermaLink="false">http://blog.xen.org/?p=4422</guid>
		<description><![CDATA[A quick round-up of Xen events in May and an update on Xen Summit in August: for more information see the Xen Events page. Xen @ Ubuntu Developer Summit, May 7-11, Oakland, CA The Xen and XCP teams will be participating in the Ubuntu Developer Summit in Oakland, CA. The agenda for technical discussions is [...]]]></description>
			<content:encoded><![CDATA[<p>A quick round-up of Xen events in May and an update on Xen Summit in August: for more information see the <a href="http://xen.org/community/xenevents.html">Xen Events</a> page.</p>
<h2>Xen @ Ubuntu Developer Summit, May 7-11, Oakland, CA</h2>
<p>The Xen and XCP teams will be participating in the <a href="http://uds.ubuntu.com/">Ubuntu Developer Summit</a> in Oakland, CA. The agenda for technical discussions is not yet quite tied down: when it is we will let you know. </p>
<h2>Xen @ Build a Cloud Day, May 10, San Francisco, CA</h2>
<p><a href="http://blog.xen.org/wp-content/uploads/2012/04/BACD.gif"><img src="http://blog.xen.org/wp-content/uploads/2012/04/BACD.gif" alt="" title="BACD" width="160" height="110" class="alignright size-full wp-image-4435" /></a>Learn how to build an open source cloud with CloudStack, RightScale, Ubuntu, Xen and Zenoss at the free <a href="http://cloudstack.org/about-cloudstack/cloudstack-events/viewevent/74-build-a-cloud-day-san-francisco.html">Build a Cloud Day</a>. Build a Cloud Days are about learning, best practices and industry insights into building elastic, scalable and profitable open source clouds. The event is held in conjunction with <a href="http://www.citrixsynergy.com/">Citrix Synergy 2012</a> in San Francisco and as a bonus you will get a free pass to the Synergy Cloud Keynotes and Solutions Expo. </p>
<p><span id="more-4422"></span><br />
<h2>Xen Day Fortaleza 2012, May 30, Brazil</h2>
<p><a href="http://blog.xen.org/wp-content/uploads/2012/04/XenDayFortaleza2012.png"><img src="http://blog.xen.org/wp-content/uploads/2012/04/XenDayFortaleza2012.png" alt="" title="Xen Day Fortaleza 2012" width="500" /></a></p>
<p><a href="http://xen.org/community/events/xendayfortaleza2012.html">Xen Day Fortaleza</a> is a community-organized evening of lectures focussing on virtualization with Xen and cloud management with CloudStack. The event will be held on May 30, 2012 in Fortaleza, CE, Brazil. It is a free event: register <a href="http://xen.org/polls/xendayfortaleza2012_reg.html">here</a>.</p>
<h2><a href="http://xen.org/community/xensummit.html">Xen Summit</a>, August 27-28, San Diego, CA</h2>
<p><img class="alignleft" src="http://xen.org/images/logos/XS12SD_Small.png" alt="" width="500" /></p>
<h3>Program Management Committee and CFP</h3>
<p>First I wanted to welcome the Xen Summit Program Committee, which this year will be: David Nalley (Cloudstack.org), Donald D Dugger (Intel), Ian Campbell (Citrix), Konrad Rzeszutek Wilk (Oracle) and Lars Kurth (Xen.org).</p>
<p>I also wanted to let you know that the community asked me to extend the deadline for the Call for Participation: the deadline now is <b>midnight June 15, 2012 PDT</b>. More information at the <a href="http://xen.org/community/xensummit.html">XenSummit event page</a>.</p>
<h3>Xen Dev Day</h3>
<p>You should also note that we are contemplating holding a Kernel Summit-like developer event called Xen Day alongside Xen Summit. The event would be invite-only. A <a href="http://lists.xen.org/archives/html/xen-devel/2012-04/msg01511.html">community vote is open</a> until Friday, Apr 27 regarding time and date preferences.</p>
<h3>Hotel Bookings</h3>
<p>It is now possible to <a href="http://xen.org/community/xensummit.html">book the hotel for Xen Summit</a>. Several events will be held in the same hotel in the same week: <a href="http://xen.org/community/xensummit.html">Xen Summit</a>, <a href="https://events.linuxfoundation.org/events/linuxcon">LinuxCon</a>, <a href="http://www.linuxplumbersconf.org/2012/">Linux Plumbers Conference</a>, <a href="https://events.linuxfoundation.org/events/linux-kernel-summit">Linux Kernel Summit</a> and <a href="https://events.linuxfoundation.org/events/cloudopen">CloudOpen</a>. If you plan to attend Xen Summit and any of the other events, it will be possible for you to book your hotel for the entire week for USD$170/night through any of the event sites. Note that rooms must be booked before 17:00 PST, July 27, 2012 and that only a limited number of rooms are available, which might fill up fast.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.xen.org/index.php/2012/04/25/xen-event-update-may-2012/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Xen 4.2: cpupools</title>
		<link>http://blog.xen.org/index.php/2012/04/23/xen-4-2-cpupools/</link>
		<comments>http://blog.xen.org/index.php/2012/04/23/xen-4-2-cpupools/#comments</comments>
		<pubDate>Mon, 23 Apr 2012 11:09:23 +0000</pubDate>
		<dc:creator>dunlapg</dc:creator>
				<category><![CDATA[Xen Development]]></category>
		<category><![CDATA[Xen Hypervisor]]></category>
		<category><![CDATA[cpupools]]></category>
		<category><![CDATA[Xen 4.2]]></category>

		<guid isPermaLink="false">http://blog.xen.org/?p=4292</guid>
		<description><![CDATA[Among the more unique features of Xen 4.2 is a feature called cpupools, designed and implemented by Jürgen Groß at Fujitsu. At its core it&#8217;s a simple idea, but one that allows it to be a flexible and powerful solution to a number of different problems. The core idea behind cpupools is to divide the [...]]]></description>
			<content:encoded><![CDATA[<p>Among the more unique features of Xen 4.2 is a feature called <em>cpupools</em>, designed and implemented by <a href="http://xen.org/community/spotlight/juergengross.html">Jürgen Groß</a> at <a href="http://www.fujitsu.com/fts/">Fujitsu</a>.  At its core it&#8217;s a simple idea, but one that allows it to be a flexible and powerful solution to a number of different problems.</p>
<p>The core idea behind cpupools is to divide the physical cores on the machine into different <em>pools</em>.  Each of these pools has an entirely separate cpu scheduler, and can be set with different scheduling parameters.  At any time, a given logical cpu can be assigned to only one of these pools (or none).  A VM is assigned to one pool at a time, but can be moved from pool to pool.</p>
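<p>The rules in the paragraph above can be written down as a tiny Python model. This is only a sketch to make the invariants concrete: the <code>CpuPools</code> class, its method names and the pool names are hypothetical, it leaves out the per-pool scheduler entirely, and it is not how Xen implements cpupools.</p>
<pre><code class="language-python">class CpuPools:
    """Toy model of the cpupool rules; not the Xen implementation."""

    def __init__(self, cpus):
        # Each logical cpu belongs to exactly one pool, or to none (None).
        self.cpu_pool = {cpu: None for cpu in cpus}
        # Each VM is in exactly one pool at a time.
        self.vm_pool = {}
        self.pools = set()

    def create_pool(self, name):
        self.pools.add(name)

    def assign_cpu(self, cpu, pool):
        if pool not in self.pools:
            raise ValueError("unknown pool: " + pool)
        if self.cpu_pool[cpu] is not None:
            raise ValueError("cpu already belongs to pool " + self.cpu_pool[cpu])
        self.cpu_pool[cpu] = pool

    def remove_cpu(self, cpu):
        self.cpu_pool[cpu] = None

    def place_vm(self, vm, pool):
        """Put a VM in a pool, or move it there if it is already in another one."""
        if pool not in self.pools:
            raise ValueError("unknown pool: " + pool)
        self.vm_pool[vm] = pool

    def cpus_of(self, pool):
        return sorted(cpu for cpu, p in self.cpu_pool.items() if p == pool)


pools = CpuPools(range(4))
pools.create_pool("customer-a")
pools.create_pool("customer-b")
pools.assign_cpu(0, "customer-a")
pools.assign_cpu(1, "customer-a")
pools.assign_cpu(2, "customer-b")
pools.place_vm("vm1", "customer-a")
pools.place_vm("vm1", "customer-b")  # moving a VM from pool to pool
print(pools.cpus_of("customer-a"), pools.vm_pool["vm1"])
</code></pre>
<p>In this model cpu 3 is left in no pool at all, and trying to assign cpu 0 to a second pool raises an error until it has been removed from the first one.</p>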
<p>There are a number of things one can do with this functionality. Suppose you are a hosting or cloud provider, and you have a number of customers who have multiple VMs with you. Instead of selling based on CPU metering, you want to <strong>sell access to a fixed number of cpus for all of their VMs</strong>: e.g. a customer with 6 single-vcpu VMs might buy 2 cores worth of computing space which all of the VMs share.</p>
<p>You could solve this problem by using cpu masks to pin all of the customer&#8217;s vcpus to a single set of cores. However, cpu masks do not work well with the scheduler&#8217;s weight algorithm: the customer won&#8217;t be able to specify that VM A should get twice as much cpu as VM B.  Solving the weight issue in a general way is very difficult, since VMs can have any combination of overlapping cpu masks.  Furthermore, this extra complication would be there for all users of the credit algorithm, regardless of whether they use this particular mode or not.<br />
<span id="more-4292"></span><br />
With cpu pools, you simply create a pool for each customer, assign it the number of cpus that customer is paying for, and then put all of that customer&#8217;s VMs in the pool.  That pool has its own complete cpu scheduler; and as far as that pool&#8217;s scheduler is concerned, the only cpus in existence are the ones inside the pool.  This means all of the algorithms regarding weight and so on work exactly the same, just on a restricted set of cpus.</p>
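<p>As a rough illustration of what &#8220;weights work the same, just on a restricted set of cpus&#8221; means, here is a small Python sketch that splits a pool&#8217;s cpu time in proportion to VM weights. The function name and the weight values are made up for the example, and the model assumes every VM is always busy and ignores the limit that a single-vcpu VM can never use more than one cpu; the real credit scheduler is considerably more involved.</p>
<pre><code class="language-python"># Simplified proportional-share model; not the credit scheduler itself.

def cpu_shares(pool_cpus, vm_weights):
    """Split pool_cpus worth of cpu time in proportion to each VM's weight."""
    total = sum(vm_weights.values())
    return {vm: pool_cpus * weight / total for vm, weight in vm_weights.items()}


# A customer who bought 2 cpus and wants VM A to get twice as much as B or C.
shares = cpu_shares(2, {"A": 512, "B": 256, "C": 256})
print(shares)
</code></pre>
<p>Here VM A ends up with one cpu&#8217;s worth of time and VMs B and C with half a cpu each, regardless of how many other cpus and customers exist on the host.</p>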
<p>Additionally, this means that each customer can <strong>request different scheduling parameters for their VMs</strong> (for example, the <code>timeslice</code> or <code>ratelimit</code> parameters <a href="http://blog.xen.org/index.php/2012/04/10/xen-4-2-new-scheduler-parameters-2/">we talked about last week</a>), or even completely different schedulers, including the experimental credit2 scheduler, and the real-time SEDF scheduler.</p>
<p>Cpupools have the potential to <strong>increase security</strong> as well: they limit the interaction between different customers to physically separate cpus.  Sometimes information about cryptographic keys can be pieced together just by knowing cache patterns or the amount of time spent on certain operations; having VMs from different customers run on physically separate cpus removes this vector of attack with very little effort.</p>
<p>Of course, all of the above can be useful <strong>even if you&#8217;re not a cloud provider</strong>: your realtime workloads can run in a pool with the SEDF scheduler, your latency-sensitive workloads can run in a pool with a short timeslice, and your number-crunching workloads can run in a pool with a really long timeslice.</p>
<p>One of the particularly convenient commands that Jürgen implemented is the <code>cpupool-numa-split</code> command.  This command will <strong>automatically detect the NUMA topology</strong> of the box you&#8217;re on, create a single pool for each NUMA node, and put all of the cpus in the corresponding pool.  Then when you create VMs, you specify the pool you wish them created in, and all accesses to the memory allocated for them will be NUMA-local.</p>
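<p>Conceptually, the split boils down to grouping cpus by the node they belong to. The Python sketch below shows that grouping step only; the <code>numa_split</code> function, the pool naming and the 16-cpu, 2-node topology are assumptions made for illustration, and it does not reproduce what the actual command does internally.</p>
<pre><code class="language-python"># Conceptual sketch: one pool per NUMA node, each holding that node's cpus.

def numa_split(cpu_to_node):
    """Group cpus into one (hypothetically named) pool per NUMA node."""
    pools = {}
    for cpu, node in sorted(cpu_to_node.items()):
        pools.setdefault("Pool-node" + str(node), []).append(cpu)
    return pools


# Assumed topology: cpus 0-7 on node 0, cpus 8-15 on node 1.
topology = {cpu: cpu // 8 for cpu in range(16)}
print(numa_split(topology))
</code></pre>
<p>With this assumed topology the result is two pools of eight cpus each, one per node.</p>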
<p>The details of the interface for cpupools are still undergoing some cleaning up in the last few weeks before the 4.2 release, so I don&#8217;t want to go into details.  There will be an introduction with examples on the Xen.org wiki page before the release, as well as documentation in the man pages and in the command-line help.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.xen.org/index.php/2012/04/23/xen-4-2-cpupools/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Xen Documentation Day : April 23rd</title>
		<link>http://blog.xen.org/index.php/2012/04/19/xen-documentation-day-april-23rd/</link>
		<comments>http://blog.xen.org/index.php/2012/04/19/xen-documentation-day-april-23rd/#comments</comments>
		<pubDate>Thu, 19 Apr 2012 21:33:50 +0000</pubDate>
		<dc:creator>Lars</dc:creator>
				<category><![CDATA[Community]]></category>
		<category><![CDATA[Xen Documentation Day]]></category>

		<guid isPermaLink="false">http://blog.xen.org/?p=4407</guid>
		<description><![CDATA[We have another Xen document day come up next Monday. Xen Document Days are for people who care about Xen Documentation and want to improve it. We introduced Documentation Days, because working on documentation in parallel with like minded-people, is just more fun than working alone! Everybody who can contribute is welcome to join! For [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright" title="We Need You!" src="http://www.xen.org/images/blog/UncleSamSmall.png" alt="" width="174" height="200" /><br />
We have another Xen document day coming up next Monday. <a href="http://wiki.xen.org/wiki/Xen_Document_Days">Xen Document Days</a> are for people who care about Xen Documentation and want to improve it. We introduced Documentation Days because working on documentation in parallel with like-minded people is just more fun than working alone! Everybody who can contribute is welcome to join!</p>
<p>For a list of items that need work, check out the community <a href="http://wiki.xen.org/wiki/Xen_Document_Days/TODO">maintained TODO and wishlist</a>. Of course, you can work on anything you like: the list just provides suggestions.<br />
<span id="more-4407"></span></p>
<h2>How do I participate?</h2>
<ul>
<li>Join us on IRC: freenode channel <strong>#xendocday</strong></li>
<li>Tell people what you intend to work on (to avoid doing something somebody else is already working on)</li>
<li>Fix some documentation</li>
<li>Help others</li>
<li>And above all: have fun!</li>
</ul>
<p>See you on IRC!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.xen.org/index.php/2012/04/19/xen-documentation-day-april-23rd/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Linux 3.3!</title>
		<link>http://blog.xen.org/index.php/2012/04/17/linux-3-3-2/</link>
		<comments>http://blog.xen.org/index.php/2012/04/17/linux-3-3-2/#comments</comments>
		<pubDate>Tue, 17 Apr 2012 13:21:15 +0000</pubDate>
		<dc:creator>Konrad Rzeszutek Wilk</dc:creator>
				<category><![CDATA[Xen Development]]></category>
		<category><![CDATA[Linux]]></category>
		<category><![CDATA[Linux 3.3]]></category>

		<guid isPermaLink="false">http://blog.xen.org/?p=4365</guid>
		<description><![CDATA[On March 18th, Linux 3.3 was released and it featured a number of interesting Xen related features. Re-engineered how tools can perform hypercalls &#8211; by using a standard interface (/dev/xen/privcmd instead of using /proc/xen/privcmd) Backends (netback, blkback) can now function in HVM mode. This means that a device driver domain can be in charge of [...]]]></description>
			<content:encoded><![CDATA[<p><img alt="" src="http://upload.wikimedia.org/wikipedia/commons/b/b0/NewTux.svg" title="Tux by Larry Ewing" class="alignright" width="200" />On March 18th, Linux 3.3 was released and it featured a number of interesting Xen related features.</p>
<ul>
<li>Re-engineered how tools can perform hypercalls &#8211; by using a standard interface (/dev/xen/privcmd instead of using /proc/xen/privcmd)</li>
<li>Backends (netback, blkback) can now function in HVM mode. This means that a device driver domain can be in charge of a device (say network) and a subset of the network (netback). What is exciting about this is that it allows for security by isolation &#8211; so if one domain is compromised it does not affect the other domains. Both Qubes and NSA Research Center have been focusing on this functionality and it is exciting to see components of this goal taking shape!</li>
<p>	<span id="more-4365"></span>
<li>My pet peeve: graphics not working. The Xen architecture makes an interesting sandbox for making drivers compatible with other platforms. This is due to the fact that, when device drivers DMA data, the “bus” address (the address used on PCI chipsets) is different from the “physical” address (the address used by the CPU). This kind of mapping is common on platforms other than x86, such as SPARC or PPC. The x86 architecture usually has a 1:1 mapping (so “phys” == “bus”) &#8211; and neither the nouveau nor the radeon driver was using the full gamut of the PCI API to take advantage of that. The TTM DMA pool driver does that and now PCI (example: ATI ES1000) and PCIe cards can function properly under Xen &#8211; the remaining piece is to hook this up to the AGP API, but it is not clear whether there is much demand for that.</li>
<li>Thanks to the <a href="http://wiki.xen.org/wiki/Xen_Document_Days">XenDoc day</a>, a lot of documentation was added to the kernel and the XenBus code was cleaned up.</li>
</ul>
<p>Also some pretty serious bug fixes were put in:</p>
<ul>
<li>A fix in the spinlock code: if you build the kernel for fewer than 256 CPUs, corruption of the spinlock occurs and a crash will ensue.</li>
<li>We fixed a long-uptime bug when using the radeon or nouveau driver. The user would see: “ WARNING: at arch/x86/xen/mmu.c:475 xen_make_pte” and things would start to mysteriously break.</li>
<li>Hardening of the XenBus code to deal with bad frontends.</li>
</ul>
<p>That is it for now. The Linux 3.4 merge window has opened. Expect to see another blog post from me in the near future, with details on what will go into Linux 3.4. The good news is that some of these items have been requested by many of you for some time!</p>
<h2>A thank you to the contributors of this release</h2>
<p><a href="http://blog.xen.org/wp-content/uploads/2012/04/sports-panda.png"><img src="http://blog.xen.org/wp-content/uploads/2012/04/sports-panda.png" alt="" title="Sports Panda" width="200" class="alignright size-full wp-image-4384" /></a>Annie Li (7):</p>
<ul>
<li>xen/granttable: Introducing grant table V2 stucture</li>
<li>xen/granttable: Refactor some code</li>
<li>xen/granttable: Grant tables V2 implementation</li>
<li>xen/granttable: Keep code format clean</li>
<li>xen/granttable: Improve comments for function pointers</li>
<li>xen/granttable: Support sub-page grants</li>
<li>xen/granttable: Support transitive grants</li>
</ul>
<p>Bastian Blank (5):</p>
<ul>
<li>xen: Add privcmd device driver</li>
<li>xen: Add xenbus device driver</li>
<li>xen: Add xenbus_backend device</li>
<li>xen/privcmd: Remove unused support for arch specific privcmp mmap</li>
<li>xen/xenbus-frontend: Make error message more clear</li>
</ul>
<p>Daniel De Graaf (10):</p>
<ul>
<li>xen/gntalloc: Change gref_lock to a mutex</li>
<li>xen/gnt{dev,alloc}: reserve event channels for notify</li>
<li>xen/event: Add reference counting to event channels</li>
<li>xen/events: prevent calling evtchn_get on invalid channels</li>
<li>xen/gntalloc: release grant references on page free</li>
<li>xen/gntalloc: fix reference counts on multi-page mappings</li>
<li>xenbus: Support HVM backends</li>
<li>xenbus: Use grant-table wrapper functions</li>
<li>xen/grant-table: Support mappings required by blkback</li>
<li>xen/netback: Enable netback on HVM guests</li>
</ul>
<p>David Vrabel (3):</p>
<ul>
<li>xen: document balloon driver sysfs files</li>
<li>xen: document backend sysfs files</li>
<li>x86: xen: size struct xen_spinlock to always fit in arch_spinlock_t</li>
</ul>
<p>Ian Campbell (3):</p>
<ul>
<li>xen/xenbus: Reject replies with payload &gt; XENSTORE_PAYLOAD_MAX.</li>
<li>xenbus: maximum buffer size is XENSTORE_PAYLOAD_MAX</li>
<li>xen/xenbus: don&#8217;t reimplement kvasprintf via a fixed size buffer</li>
</ul>
<p>Jan Beulich (2):</p>
<ul>
<li>Xen: consolidate and simplify struct xenbus_driver instantiation</li>
<li>xenbus_dev: add missing error check to watch handling</li>
</ul>
<p>Jeremy Fitzhardinge (1):</p>
<ul>
<li>Xen: update MAINTAINER info</li>
</ul>
<p>Jerome Glisse (8):</p>
<ul>
<li>drm/ttm: remove userspace backed ttm object support</li>
<li>drm/ttm: remove split btw highmen and lowmem page</li>
<li>drm/ttm: remove unused backend flags field</li>
<li>drm/ttm: use ttm put pages function to properly restore cache attribute</li>
<li>drm/ttm: test for dma_address array allocation failure</li>
<li>drm/ttm: merge ttm_backend and ttm_tt V5</li>
<li>drm/ttm: introduce callback for ttm_tt populate &amp; unpopulate V4</li>
<li>ttm: fix agp since ttm tt rework</li>
</ul>
<p>Julia Lawall (1):</p>
<ul>
<li>xen-gntalloc: introduce missing kfree</li>
</ul>
<p>Kay Sievers (1):</p>
<ul>
<li>xen-balloon: convert sysdev_class to a regular subsystem</li>
</ul>
<p>Konrad Rzeszutek Wilk (24):</p>
<ul>
<li>xen/blk[front|back]: Squash blkif_request_rw and blkif_request_discard together</li>
<li>xen/blk[front|back]: Enhance discard support with secure erasing support.</li>
<li>xen/blkback: Move processing of BLKIF_OP_DISCARD from dispatch_rw_block_io</li>
<li>swiotlb: Expose swiotlb_nr_tlb function to modules</li>
<li>drm/ttm: provide dma aware ttm page pool code V9</li>
<li>drm/radeon/kms: enable the ttm dma pool if swiotlb is on V4</li>
<li>drm/nouveau: enable the ttm dma pool when swiotlb is active V3</li>
<li>xen/xenbus-frontend: Fix compile error with randconfig</li>
<li>xen/xenbus: Fix compile error &#8211; missing header for xen_initial_domain()</li>
<li>drm/ttm/dma: Only call set_pages_array_wb when the page is not in WB pool.</li>
<li>drm/ttm/dma: Fix accounting error when calling ttm_mem_global_free_page and don&#8217;t try to free freed pages.</li>
<li>x86/PCI: Expand the x86_msi_ops to have a restore MSIs.</li>
<li>xen/pciback: Move the PCI_DEV_FLAGS_ASSIGNED ops to the &#8220;[un|]bind&#8221;</li>
<li>xen/pciback: Fix &#8220;device has been assigned to X domain!&#8221; warning</li>
<li>xen/pciback: Expand the warning message to include domain id.</li>
<li>xen/mmu: Fix compile errors introduced by x86/memblock mismerge.</li>
<li>xen/balloon: Move the registration from device to subsystem.</li>
<li>ttm/dma: Remove the WARN() which is not useful.</li>
<li>xen/granttable: Disable grant v2 for HVM domains.</li>
<li>xen/bootup: During bootup suppress XENBUS: Unable to read cpu state</li>
<li>xen/smp: Fix CPU online/offline bug triggering a BUG: scheduling while atomic.</li>
<li>xen/pci[front|back]: Use %d instead of %1x for displaying PCI devfn.</li>
<li>xen/setup: Remove redundant filtering of PTE masks.</li>
<li>xen/pat: Disable PAT support for now.</li>
</ul>
<p>Li Dongyang (1):</p>
<ul>
<li>xen-blkback: convert hole punching to discard request on loop devices</li>
</ul>
<p>Maxim Uvarov (1):</p>
<ul>
<li>xen: Make XEN_MAX_DOMAIN_MEMORY have more sensible defaults</li>
</ul>
<p>Stefano Stabellini (1):</p>
<ul>
<li> xen pvhvm: do not remap pirqs onto evtchns if !xen_have_vector_callback</li>
</ul>
<p>Tejun Heo (1):</p>
<ul>
<li>memblock: Fix alloc failure due to dumb underflow protection in memblock_find_in_range_node()</li>
</ul>
<p>Thomas Meyer (1):</p>
<ul>
<li>xen-blkfront: Use kcalloc instead of kzalloc to allocate array</li>
</ul>
<p>Tony Luck (1):</p>
<ul>
<li>xen/ia64: fix build breakage because of conflicting u64 guest handles</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.xen.org/index.php/2012/04/17/linux-3-3-2/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Xen at the OpenStack Design Summit</title>
		<link>http://blog.xen.org/index.php/2012/04/16/4306/</link>
		<comments>http://blog.xen.org/index.php/2012/04/16/4306/#comments</comments>
		<pubDate>Mon, 16 Apr 2012 10:05:23 +0000</pubDate>
		<dc:creator>Lars</dc:creator>
				<category><![CDATA[XCP]]></category>
		<category><![CDATA[OpenStack]]></category>

		<guid isPermaLink="false">http://blog.xen.org/?p=4306</guid>
		<description><![CDATA[The OpenStack community has recently released the Essex release, which supports XCP and XenServer. A number of vendors have worked at that support including Citrix, Internap and Rackspace Public Cloud. You can find some more information about Xen support in the OpenStack Essex release at this wiki page. If you are interested in what has [...]]]></description>
			<content:encoded><![CDATA[<p>The OpenStack community has recently released the <a href="http://www.openstack.org/projects/essex/">Essex release</a>, which supports XCP and XenServer. A number of vendors have worked at that support including <a href="http://community.citrix.com/display/cloud/OpenStack">Citrix</a>, <a href="http://www.internap.com/flexible-cloud-hosting-solutions/enterprise-public-cloud-solutions/">Internap</a> and <a href="http://www.rackspace.com/blog/rackspace-cloud-servers-powered-by-openstack-beta/">Rackspace Public Cloud</a>. You can find some more information about Xen support in the OpenStack Essex release at this <a href="http://wiki.openstack.org/XenServer">wiki page</a>. If you are interested in what has changed in Xen support fro Essex check out this <a href="http://blogs.citrix.com/2012/04/13/openstack-essex-and-folsom-design-summit/">blog post</a>.</p>
<p>Note that there is a roadmap session on XenAPI support in at the OpenStack Design Summit <strong>later today</strong>: <a href="http://folsomdesignsummit2012.sched.org/event/87b0f07db6f2d00e3696243205d8233c">XenAPI (XenServer/XCP) Folsom Roadmap</a> at <strong>2pm, PST</strong>. If you are at the Design Summit and care about Xen support, why not drop in and meet Ewan Mellor, John Garbutt or Renuka Apte.</p>
<p>Of course, <a href="http://blog.xen.org/index.php/2011/07/22/project-kronos/">project Kronos</a> is almost completed, which will help Xen support in OpenStack. You can find information about XCP-XAPI in Debian in the <a href="http://packages.debian.org/unstable/xcp-xapi">Debian package repository</a> (for docs see <a href="http://wiki.debian.org/XCP">Debian XCP wiki</a> and <a href="http://wiki.xen.org/wiki/XCP_toolstack_on_a_Debian-based_distribution">this tutorial</a>). XCP-XAPI support in Ubuntu is near complete: we are waiting for the XCP-XAPI packages to be synced to Ubuntu (see ticket <a href="https://bugs.launchpad.net/ubuntu/+source/xen-api/+bug/962184">#962184</a>). Documentation can be found in the <a href="http://manpages.ubuntu.com/manpages/precise/man1/xapi.1.html">manpages </a>and the <a href="http://packages.ubuntu.com/precise/xcp-xapi">XCP-XAPI package description</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.xen.org/index.php/2012/04/16/4306/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fedora 17 Virtualization Test Day!</title>
		<link>http://blog.xen.org/index.php/2012/04/11/fedora-17-virtualization-test-day/</link>
		<comments>http://blog.xen.org/index.php/2012/04/11/fedora-17-virtualization-test-day/#comments</comments>
		<pubDate>Wed, 11 Apr 2012 02:11:26 +0000</pubDate>
		<dc:creator>deshantm</dc:creator>
				<category><![CDATA[Community]]></category>

		<guid isPermaLink="false">http://blog.xen.org/?p=4239</guid>
		<description><![CDATA[Fedora is planning a number of test days as part of their release cycle, including a Virtualization Test Day on April 12th (this Thursday). Information about the virtualization test day can be found here. We will have some people hanging out on IRC at #fedora-test-day and you can also get in touch with us via the usual channels. We [...]]]></description>
			<content:encoded><![CDATA[<p>Fedora is planning a number of <a href="http://fedoraproject.org/wiki/QA/Fedora_17_test_days">test days</a> as part of their release cycle, including a <strong>Virtualization Test Day on April 12th (this Thursday)</strong>.<img class="alignright" title="Fedora Logo" src="http://fedoraproject.org/w/uploads/2/2d/Logo_fedoralogo.png" alt="Fedora Logo" width="150" height="46" /></p>
<p>Information about the virtualization test day can be found <a href="http://fedoraproject.org/wiki/Test_Day:2012-04-12_Virtualization_Test_Day">here</a>. We will have some people hanging out on IRC at <a href="http://webchat.freenode.net/?channels=#fedora-test-day">#fedora-test-day</a> and you can also get in touch with us via the usual channels.</p>
<p>We have some Xen-specific information <a href="http://openetherpad.org/xen-fedora-testing">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.xen.org/index.php/2012/04/11/fedora-17-virtualization-test-day/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

