Showing posts with label skip lists. Show all posts
Showing posts with label skip lists. Show all posts

Tuesday, 13 September 2016

BFS 497, linux-4.7-ck4

For the first time in a very long time, I'm announcing yet another -ck release up to ck4 along with yet more substantial updates for BFS for linux-4.7 based kernels.

BFS by itself:
4.7-sched-bfs-497.patch

-ck branded linux-4.7-ck4 patches:
linux-4.7-ck4

Thanks(?) to the massive changes to the mainline kernel I'd been forced to rewrite significant components of BFS to work properly with them, specifically the cpu frequency governors. At the same time I've had quite a bit of energy and enthusiasm for working on BFS in a way I haven't had in a long time. As a result, this updated version not only addresses the remaining cgroup stub patch bug (mentioned on the previous announcement) but implements further improvements and clean ups to go with those improvements.

Alas I still have no explanation for the random lockups some people are seeing, but I have seen reports of it happening on mainline kernels as well now, so while I'm always suspicious of my own code, there is also the chance that BFS exacerbates an issue in mainline. Something that appears common is onboard Intel graphics with the Haswell chipset.

Additionally I had reports of people being unable to suspend with BFS from 4.7 but I haven't heard back from them on later versions.

The short summary of improvements in this version are less overhead, higher throughput and less latencies.

I've rewritten the skiplist implementation to not require a malloc/free on insertion/removal of a new node which seemed to noticeably improve throughput at high loads.
Now that CPU frequency governors know what the scheduler is doing, the approach of BFS of old of knowing what the governor was doing and working around it is no longer helpful and I've removed the whole sticky task and offset for throttled CPUs and throughput has actually improved instead.
I've also added some micro-optimisations and cleanups.
I've added a minor change for offlining CPUs to prevent tasks trying to schedule to them.

The set of patches in ck4 is the largest in the ck patchset since the early 2.6 patchset days. I've also included the patch from Alfred (thanks!) to fix the warning that happens with suspend which is mostly harmless.

Each patch included has a mini changelog at the top.

I'm also keen to get feedback from people on if they see any noticeable interactive/responsiveness regressions by disabling the interactive flag as follows:

echo 0 > /proc/sys/kernel/interactive

Enjoy!
お楽しみ下さい
-ck

Friday, 2 September 2016

BFS 480 with skip lists, linux-4.7-ck2

Announcing a major update for BFS for linux-4.7 based kernels.

BFS by itself:
4.7-sched-bfs-480.patch

-ck branded linux-4.7-ck2 patches:
linux-4.7-ck2

This is the largest BFS update in a long time. The various problems that had been accumulating forced me to spend a more extended period fixing BFS to work with the latest mainline changes and encouraged me to overhaul some areas that had long been needing it.

The changes are:
  • Fixed the crash when SMT NICE is configured in on a CPU without SMT.
  • Added my skiplist implementation.
  • Converted BFS from its long-standing O(n) lookup to use skiplists.
  • Fix crash when SMT NICE is enabled on some hardware
  • Fix try_preempt missing the locality diff effect in non-interactive mode
  • Ignore busy threads/caches when still on the same core
  • Reworked the testing of idle threads and cores for less overhead and to correctly identify idle siblings
  • Fix the CPU load that's passed to the cpu frequency governor, fixing a crash and non-working schedutil governor.
The short summary is I've fixed a number of showstopper bugs on the last version, and improved throughput .

Actually incorporating the skiplists that I had experimented with a long time ago was decided on by the fact that I was able to trim the skiplist overhead further and maintain identical semantics for process selection (maintaining interactivity) whereas on the previous experiment I had never completed the work. Throughput testing shows virtually identical performance on normal workloads and theoretically would be helpful in extreme overload cases.

The original post regarding skip lists was here:
bfs-and-skip-lists.html


This now means that BFS is no longer O(n) lookup after O(1) insertion. It is now O(log(n)) insertion, O(1) lookup and O(k) removal where k <= 16, thereby tackling a long-standing criticism of the overall design.

I did not find a specific cause for peoples' inability to suspend to ram so I doubt this has been fixed despite the large code update.


The list of patches making up bfs480 is as follows:

bfs472-fix_set_task_cpu.patch
skiplists.patch
bfs472-skiplist.patch
bfs-delay-smt-siblings.patch
bfs-fix-noninteractive-try-preempt.patch
bfs-ignore_local_busy.patch
bfs-rework-idles.patch
bfs-fix-schedutil.patch
bfs-v480.patch

As always I'm giving this to you not long after I've finished coding it so all the usual warnings apply, especially with an update of this size.

EDIT: Uniprocessor build fix: bfs480-fix-upbuild.patch
EDIT2: Here is a test patch to try and improve cpufreq behaviour: bfs480-rework_cpufreq.patch

Enjoy!
お楽しみ下さい
-ck