Friday, 11 December 2015

BFS 466, linux-4.3-ck2

Announcing an updated BFS for linux-4.3 based kernels.

BFS by itself:
4.3-sched-bfs-466.patch

-ck branded linux-4.3-ck2 patches:
4.3-ck2

In addition to a build fix for the nohz compile issue with BFS 465, this is the first BFS in a very long time to have performance improvements. For some time now it's bugged me that tasks would have very poor affinity with CPUs on BFS, even if the performance was good. By this I mean that if you fired up a fully CPU bound task and watched a CPU monitor/graph on a multicore machine, you'd see the one task bounce from CPU to CPU very frequently instead of occasionally. While this was great for latency and interactivity, single threaded workloads would suffer as a result, and it would also cost a small amount of performance in multithreaded workloads, since the throughput benefit of warm CPU caches would be diminished. Every time I'd previously tackled this issue, I found myself making some other workload worse.

After approximately 100 rebuilds of the kernel and benchmarking, I finally found where the problem lay. It wasn't just a matter of maintaining a bias against moving tasks from CPU to CPU; it was also that the code responsible sits in the most frequently traversed code path in the schedule() call. Simplifying the code that biased against moving tasks in earliest_deadline_task, as well as applying the bias to all tasks, not just fully CPU bound tasks, gave a statistically significant performance improvement without detriment to latency or other workloads in my testing.
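To make the idea of a migration bias in an earliest-deadline pick more concrete, here is a minimal userspace sketch. It is not the BFS code: the struct task layout, the MIGRATION_BIAS constant, the additive penalty and the function names are all assumptions made up for illustration. It only shows the general shape of the technique, where every queued task, CPU bound or not, has its deadline compared after a small penalty is added if it last ran on a different CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified task record; NOT the kernel's task_struct. */
struct task {
    uint64_t deadline;  /* virtual deadline, lower means "run sooner" */
    int last_cpu;       /* CPU this task last ran on */
};

/* Assumed penalty for a task that would have to change CPU.
 * The value is arbitrary and chosen only for the example. */
#define MIGRATION_BIAS 8

/* Deadline as seen from `cpu`: unchanged if the task last ran there,
 * pushed later by MIGRATION_BIAS otherwise. Applied to every task. */
static uint64_t biased_deadline(const struct task *t, int cpu)
{
    return t->deadline + (t->last_cpu == cpu ? 0 : MIGRATION_BIAS);
}

/* Return the index of the queued task with the earliest biased deadline
 * for `cpu`. Ties go to the lowest index. The queue must not be empty. */
static size_t pick_earliest(const struct task *queue, size_t n, int cpu)
{
    assert(n > 0);

    size_t best = 0;
    uint64_t best_dl = biased_deadline(&queue[0], cpu);

    for (size_t i = 1; i < n; i++) {
        uint64_t dl = biased_deadline(&queue[i], cpu);
        if (dl < best_dl) {
            best = i;
            best_dl = dl;
        }
    }
    return best;
}
```

With a queue of three tasks holding deadlines 100, 104 and 120, where the first last ran on CPU 1 and the other two on CPU 0, CPU 0 picks the second task (104 beats 100 + 8), while CPU 1 picks the first. The per-task work is one comparison and one add, which matters for anything sitting in the hottest path of the scheduler.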

Apart from improving measurable throughput benchmarks, users may notice that some single threaded workloads (such as some video playback software, or even virtualisation with kvm etc.) actually improve, because they can stick to one CPU better and avoid incurring the wrath of being moved to a CPU whose speed has been throttled for power saving. Please give it a whirl and report back anything you find, positive or negative - though it should all be positive. If you have benchmarks you want to throw at it, even better.

EDIT: After the initial enthusiasm, it appears this DOES have a detrimental effect on interactivity so I will be looking for another change in the near future with yet another release.

Enjoy!
Please enjoy!

3 comments:

  1. Gah...figures I just built 4.3.2-ck last night for my netbook, and I'm out of town tonight! :-P Will try it out this weekend. Thanks ck!

    Fwiw, I'm still getting the upstream intel drm atomic warning referenced from Arch linux-ck AUR: https://bugs.freedesktop.org/show_bug.cgi?id=93104

  2. I backported this to 4.1 (bfs-464). Some tasks are noticeably more sticky & max. CPU utilisation is up across all CPUs (e.g. with ffmpeg or compiling), but overall responsiveness esp. under load really suffers, even when cores are available.

    1. That's a very interesting observation, since idle cores will still always pull tasks to them as much as they did previously. I'll take what you say at face value though and wait for more people to report back.