Monday 9 January 2012

Towards Transparent CPU Scheduling

Of BFS related interest is an excellent thesis by Joseph T. Meehean
entitled "Towards Transparent CPU Scheduling". Of particular note is
the virtually deterministic nature of BFS, especially in fairness and
latency. While this of course interests me greatly because of
extensive testing of the BFS CPU scheduler, there are many aspects of
both the current CFS CPU scheduler and the older O(1) CPU scheduler
that are discussed that anyone working on issues to do with
predictability, scalability, fairness and latency should read.


In this thesis we propose using the scientific method to develop a deeper understanding of CPU schedulers; we use this approach to explain and understand the sometimes erratic behavior of CPU schedulers. This approach begins with introducing controlled workloads into commodity operating systems and observing the CPU scheduler's behavior. From these observations we are able to infer the underlying CPU scheduling policy and create models that predict scheduling behavior.
We have made two advances in the area of applying scientific analysis to CPU schedulers. The first, CPU Futures, is a combination of predictive scheduling models embedded into the CPU scheduler and user-space controller that steers applications using feedback from these models. We have developed these predictive models for two different Linux schedulers (CFS and O(1)), based on two different scheduling paradigms (timesharing and proportional-share). Using three different case studies, we demonstrate that applications can use our predictive models to reduce interference from low-importance applications by over 70%, reduce web server starvation by an order of magnitude, and enforce scheduling policies that contradict the CPU scheduler's.
Harmony, our second contribution, is a framework and set of experiments for extracting multiprocessor scheduling policy from commodity operating systems. We used this tool to extract and analyze the policies of three Linux schedulers: O(1), CFS, and BFS. These schedulers often implement strikingly different policies. At the high level, the O(1) scheduler carefully selects processes for migration and strongly values processor affinity. In contrast, CFS continuously searches for a better balance and, as a result, selects processes for migration at random. BFS strongly values fairness and often disregards processor affinity.


  1. The versions they chose are outdated. So much work into this thesis, but they're slacking on using a recent kernel? Seriously, that's messed up. It's a thesis, so they couldn't have started this more than a year ago, but they used 2.6.32 from late 2009? Wtf?

  2. Indeed they did use quite an old BFS by today's standards. However, unlike the mainline scheduler, BFS has not changed that drastically since then. It has had nanosecond acounting added throughout, and the soft throughput CPU affinity code has been simplified even further than it was back then - unlike mainline which has gotten more and more complex. So it is likely going to behave virtually the same as the older version.

  3. I think this guy correctly found the reason why BFS outperforms CFS: more aggressive process migrations permits faster balancing and better CPU loading.

    Is this not true with whichever version?

  4. That's pretty much correct from whatever angle you look at it. BFS barely has even the concept of process migration since no CPU has ownership of any process. Everything belongs to the global queue.

  5. Quoting from the docs:

    "trying that [last] CPU first when [did you mean then?] looking for an idle CPU to use the next time it's scheduled."

    What if the last CPU goes idle as a result? How do you protect from bouncing tasks? Can you add a few words about the bouncing to the docs please?

  6. The docs EXTENSIVELY describe how it chooses a CPU to schedule a task on. When a task wakes up, it always tries to schedule on the same CPU it last ran on. Bouncing occurs when the last CPU it ran on is not idle, and another CPU is idle.

  7. One thing I am glad of with CFS and BFS over the O(1) is the nice 19 not rescheduling too much. Even for just that the newer schedulers are worth it.

  8. the tendency of the predictive scheduling models to superimpose data overlay onto the existing user space is key. i agree with the methodology as such, but as with all cs projects i of course have a lot of input.

  9. I am currently in the midst of developing a very low level python implementation of an adaptive predictive scheduling model. I'm importing a bunch of existing PHP modules. The problem is, JavaScript is necessary for the Python shell. Is there any way you guys know of to switch the parameter to accept said PHP modularity?