Xen network: the future plan

As many of you might have (inevitably) noticed, the Xen frontend / backend network drivers in Linux suffered a regression several months ago, after the XSA-39 fix (various reports here, here and here). Fortunately that’s now fixed (see the most important patch of that series) and backporting to stable kernels is ongoing. Now that we’ve put everything back into a stable-ish state, it’s time to look ahead and prepare the Xen network drivers for the next stage. I mainly work on the Linux drivers, but some of the backend improvement ideas should benefit all frontends.

The goal is to improve network performance and scalability without giving up the advanced security features Xen offers. Just to name a few items:

Split event channels: In the old network drivers there’s only one event channel between frontend and backend. The frontend uses it to notify the backend of TX requests and of RX buffer allocation; the backend uses it to notify the frontend of TX completions and of RX packets. This is far from ideal, as TX and RX interfere with each other. With a small change to the protocol we can split TX and RX notifications into two event channels. This work is now in David Miller’s tree (patches for backend, frontend and documentation).
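As a rough illustration of why one shared channel makes TX and RX interfere, the toy model below counts handler wake-ups. It is not driver code: the function and event names are hypothetical, and the model simply assumes that an interrupt on a shared channel forces both the TX and the RX side to check their rings, while a split channel wakes only the side it belongs to.

```python
from collections import Counter

# Hypothetical event names for this sketch. Per the text, the frontend sends
# two kinds of notification to the backend: TX, and RX buffer allocation.
TX_EVENTS = {"tx_request"}
RX_EVENTS = {"rx_buffer_posted"}


def backend_wakeups(events, split):
    """Count how often each backend handler runs for a stream of events.

    Assumption of this model: with one shared channel the backend cannot
    tell which direction an interrupt is for, so both handlers run.
    With split channels only the handler bound to that channel runs.
    """
    wakeups = Counter()
    for ev in events:
        if split:
            wakeups["tx" if ev in TX_EVENTS else "rx"] += 1
        else:
            wakeups["tx"] += 1
            wakeups["rx"] += 1
    return wakeups


# A TX-heavy guest: 1000 transmit notifications, 10 RX buffer refills.
events = ["tx_request"] * 1000 + ["rx_buffer_posted"] * 10

shared = backend_wakeups(events, split=False)
separate = backend_wakeups(events, split=True)
print("shared channel:", dict(shared))
print("split channels:", dict(separate))
```

In this model the RX handler is woken for every TX notification on the shared channel, which is the kind of cross-talk the split is meant to remove.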

1:1 model netback: The current netback model is M:N: we create nr_vcpus kthreads in Dom0 and attach every DomU’s vif to a specific kthread. This model has worked well so far, but it has drawbacks. A significant one is that fairness among vifs is rather poor, because each vif is statically attached to one kthread. It’s easy to end up with several vifs on one kthread competing for CPU time while another worker thread sits idle. The idea behind the 1:1 model is to create one kthread per vif and trust the backend scheduler to do the right thing. Preliminary tests show that this model does improve fairness. What’s more, it is also a prerequisite for implementing multi-queue in Xen network. This work is undergoing testing (with many nice-looking graphs) and discussion (1, 2).
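The fairness problem can be sketched with a few lines of arithmetic. This is only a model under stated assumptions: the round-robin pinning in `static_assignment` and the greedy placement in `one_to_one` are stand-ins chosen for illustration, not the actual netback or scheduler policies, and loads are expressed as a percentage of one CPU.

```python
def static_assignment(vif_load, nr_kthreads):
    """M:N model: vif i is pinned to kthread i % nr_kthreads (assumed policy).

    Returns the total load that lands on each kthread.
    """
    per_thread = [0] * nr_kthreads
    for i, load in enumerate(vif_load):
        per_thread[i % nr_kthreads] += load
    return per_thread


def one_to_one(vif_load, nr_cpus):
    """1:1 model: one kthread per vif, placed by a stand-in scheduler.

    The stand-in is greedy: heaviest thread first, onto the currently
    least-loaded CPU. Returns the total load on each CPU.
    """
    per_cpu = [0] * nr_cpus
    for load in sorted(vif_load, reverse=True):
        idx = per_cpu.index(min(per_cpu))
        per_cpu[idx] += load
    return per_cpu


# Four vifs, two of them busy, on a Dom0 with two vCPUs.
vif_load = [90, 0, 80, 0]

pinned = static_assignment(vif_load, nr_kthreads=2)
scheduled = one_to_one(vif_load, nr_cpus=2)
print("M:N, load per kthread:", pinned)
print("1:1, load per CPU:   ", scheduled)
```

With static pinning both busy vifs happen to share a kthread and ask for more than one CPU can give, while the other kthread is idle. With one kthread per vif, the scheduler is free to spread them out.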

Multi-page ring: The TX / RX ring is only one page in size. According to Konrad’s calculation, we can only have ~898K of data in flight on the ring. Hardware keeps getting faster, which may turn the ring into a bottleneck, so extending the ring can be generally useful. This work can benefit bulk transfers such as NFS. All other Xen frontend / backend drivers can also benefit from the new multi-page ring Xenbus API.
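To get a feel for how the number of ring slots grows with the number of pages, here is a small back-of-the-envelope sketch. The 64-byte ring header and 12-byte entry size are assumptions made for illustration, as is rounding the slot count down to a power of two; the sketch does not attempt to reproduce the ~898K figure, which also depends on how much data each slot can carry.

```python
def ring_slots(nr_pages, page_size=4096, header_bytes=64, entry_bytes=12):
    """Slots that fit in a ring of nr_pages pages, rounded down to a power of two.

    header_bytes and entry_bytes are illustrative assumptions, not values
    taken from the text above.
    """
    usable = nr_pages * page_size - header_bytes
    raw = usable // entry_bytes
    # Round down to a power of two so that indices can be masked cheaply.
    return 1 << (raw.bit_length() - 1)


for pages in (1, 2, 4, 8):
    print(f"{pages} page(s): {ring_slots(pages)} slots")
```

Under these assumptions each doubling of the ring size doubles the number of requests that can be outstanding at once.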

Multi-queue vif: This should help vifs scale better with the number of vCPUs. The XenServer team at Citrix is working on this (see the discussion thread).

The ideas listed above are concrete; we also have many other, vaguer ideas :-) :

Zero-copy TX path: This idea is not likely to be upstreamed in the near future, as there are some prerequisite patches for the core network driver and we are now also considering whether copying is really such a bad idea: modern hardware copies data blazingly fast, and the TLB shootdown required for mapping is expensive (at least that’s our impression at the moment). The only way to verify whether zero-copy is worthwhile is to hack up a prototype. If TLB shootdown is less expensive than we expect and the gain outweighs the loss, we might consider adding zero-copy TX. I implemented a vhost-net-like netback to verify this. An unexpected side effect of this new prototype is that it also reveals a problem in the notification scheme: DomU TX sends more notifications than necessary. We need to solve the notification problem before moving on.

Separate ring indices: The producer and consumer indices are on the same cache line. On present hardware that means the reader and the writer compete for the same cache line, causing it to ping-pong between sockets. Fixing this involves altering the ring protocol.
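The layout issue can be made concrete with `ctypes`. The two structures below are hypothetical and are not the actual Xen shared ring definition; the sketch assumes 64-byte cache lines and a structure that starts on a cache-line boundary, and only shows which line each index would fall into.

```python
import ctypes

CACHE_LINE = 64  # assumed cache line size in bytes


class SharedIndices(ctypes.Structure):
    """Hypothetical layout with both indices packed next to each other."""
    _fields_ = [
        ("prod", ctypes.c_uint32),
        ("cons", ctypes.c_uint32),
    ]


class SeparatedIndices(ctypes.Structure):
    """Hypothetical layout with each index padded out to its own cache line."""
    _fields_ = [
        ("prod", ctypes.c_uint32),
        ("_pad0", ctypes.c_uint8 * (CACHE_LINE - 4)),
        ("cons", ctypes.c_uint32),
        ("_pad1", ctypes.c_uint8 * (CACHE_LINE - 4)),
    ]


def cache_line_of(struct_cls, field):
    """Index of the cache line holding `field`, for a line-aligned struct."""
    return getattr(struct_cls, field).offset // CACHE_LINE


for cls in (SharedIndices, SeparatedIndices):
    print(cls.__name__,
          "prod line:", cache_line_of(cls, "prod"),
          "cons line:", cache_line_of(cls, "cons"),
          "size:", ctypes.sizeof(cls))
```

The padded variant costs extra bytes in the shared page, which is one reason the change has to be negotiated as part of the ring protocol rather than made unilaterally.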

Cache alignment for ring request / response: Pretty self-explanatory. This also involves altering the ring protocol.
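For readers who want the point spelled out, the sketch below counts how many ring entries straddle a cache-line boundary. The 12-byte entry size, the 64-byte cache line and the line-aligned start of the entry array are all assumptions made for illustration.

```python
def straddling_entries(nr_entries, entry_bytes, line_bytes=64):
    """Count entries whose first and last byte fall in different cache lines.

    Assumes the entry array itself starts on a cache-line boundary.
    """
    count = 0
    for i in range(nr_entries):
        first = i * entry_bytes
        last = first + entry_bytes - 1
        if first // line_bytes != last // line_bytes:
            count += 1
    return count


unaligned = straddling_entries(256, entry_bytes=12)
padded = straddling_entries(256, entry_bytes=16)
print("12-byte entries straddling a line:", unaligned)
print("16-byte entries straddling a line:", padded)
```

Padding entries to a size that divides the cache line removes the straddling, at the price of fewer entries per page.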

Affinity of FE / BE on the same NUMA node: The Xen scheduler, with some help from the toolstack, could make sure that the Dom0 vCPU on which the backend runs is kept on the same NUMA node as the vCPUs of the DomU (where the frontend runs), for improved locality and hence performance. We discussed this during the Xen Hackathon in May and we also have an email thread on Xen-devel.

That’s pretty much it. If you’re interested in any of the items above, don’t hesitate to mail your thoughts to Xen-devel. You can also find our TODO list on the Xen wiki.


Posted in Xen Development.


3 Responses

  1. sewi says

    While I am glad this bug is finally fixed, I am disappointed and disheartened with how you handled this bug. I am the person reporting the bug in the second link you posted.

    This is an extremely critical bug, since it can bring down any production domU at any point. When something like that happens, I expected that someone – like you – would step up and let everyone know
    - that this bug exists
    - how you can avoid it
    - when it’ll be fixed.

    Instead, I got nothing. Just some obscure post in a newsgroup (which I linked), that kind of hints there might be a problem or something.
    This blog post of yours – as far as I am aware – is the first official acknowledgement on Xen that this bug even exists. In June! When the bug has been plaguing people for months. This acknowledgement should have come MUCH earlier.

    See, you are the “pro” here. It would have taken you maybe 10 minutes to judge the scope, tell people what they can do to avoid the bug, what the implications of doing that are, and what they eventually need to do to fix it, and how they can verify that it’s fixed.

    In contrast, you left all that up to everyone else. I spent well over two hours learning where the bug was introduced, that I actually just need to downgrade the dom0 kernel and don’t have to touch the domUs (that is not obvious to someone who isn’t reading kernel sources every day. Even today, I’m not *completely* sure whether patching the dom0 kernel is all I have to do), where your patches are posted, where in the review queue they are, what they are called, so I can check the changelog of the kernels to see when they’re finally in. And of course, the time reporting the bug, defending the bug against the “report is incomplete, I’ll close it” bot, figuring out a workaround, figuring out I don’t need the workaround if I downgrade to a version that still has the DoS vulnerability in it (which doesn’t affect me, since my domUs are trusted).

    Next time, please do your users a favour, save hundreds of human beings time of their life, and tell them if there’s a known issue that might affect them. For example:

    “Kernels having the patch ‘……….’ in it (check it by typing “apt-get changelog (kernel version) | grep -i -C 2 netback”) suffer from a bug where your domU’s network interface can shut down. We are rolling out a fix, once it is in (check for the patch ‘……………….’ when upgrading your distribution kernel), upgrade your dom0.”

    that would have told me everything, left me informed, and I’d know exactly what to do. And it would have taken you probably like 5 minutes.

  2. sewi says

    By the way – one thing I need to be thankful for, in particular: thanks to you, I have gotten much more wary updating. I don’t upgrade without reason anymore, even if a package promises me the sky. So with that, I’ll probably wait a good while before even peering at Xen 4.3.

  3. Lars Kurth says

    Sewi,

    thank you for your lengthy reply. I understand your frustration, and we are reviewing and improving the way we handle bugs (see “Tracking bugs” in http://blog.xen.org/index.php/2013/06/04/reporting-a-bug-against-the-xen-hypervisor/). This work has not yet been entirely completed and it is also not 100% clear whether it would have fixed the issue you raised.

    We do not have the bandwidth to closely monitor all bugs on distro bug trackers. There are just too many Linux distros out there. We are in the same boat as most components that get shipped as part of distros. We have to rely on distros to raise bugs on xen-devel and to link to the bug raised against Xen from their bug reports. For the bigger distros (including Ubuntu), we do have community members who work with distros and monitor distro bugs of high importance. However, the bug you raised (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1171135) was only marked with medium importance and the duplicate (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1162924) had no importance assigned to it at all. So this one just fell through the cracks. Looking at your bug reports, there was also no link to the bug raised on xen-devel.

    When bugs are raised on xen-devel, we fix them and deliver the fixes in the next maintenance release. The Linux distro decides when to take a fix, using its own criteria for how important the bug is. We first saw this bug on Xen-devel, where all Xen development happens, before you raised it on Ubuntu. And the bug was fixed in Xen relatively swiftly. No project makes announcements about bugs, but we do recognize that we need to provide an easier way to monitor bugs.

    We have put some effort into this (see http://lists.xen.org/archives/html/xen-users/2013-05/msg00556.html; we are doing the same for security vulnerabilities, see http://xenbits.xenproject.org/xsa/). This should make it easier for distros to trace bugs back to us and for users to follow through. It still relies on the distro to link back to the bug though.

    I hope this helped clarify a few things.
