
Monday, 11 October 2010

Further updates on hierarchical tree-based penalty.

First of all, no new patch today, yay \o/. I know it's sometimes hard to keep up, but that's the nature of things. I (sort of, kind of, hope that I can) promise not to release much that's new any time soon with respect to the hierarchical tree-based penalty code.

I experimented with removing the separation of processes from threads and treating them all equally, and discovered that this led to real lag in some GUI applications, as many of them use threads. So it seems the default of penalising fork depth while not penalising threads works best at biasing CPU distribution towards interactivity and responsiveness (which is the default on the current patch). This is rather ironic: this code evolved out of an initial attempt to control the behaviour of threads in massively threaded applications, yet it turns out that being nice to threaded apps works better for the desktop.

I thought up another way of demonstrating the effect this patch has in a measurable way.

Using a dual core machine as an example, I ran the "browser benchmark" at http://service.futuremark.com/peacekeeper/index.action to show the effect of the gold standard load, make, versus the almost universal GUI app, a browser.

The benchmark runs a number of different browser-based workloads and gives a score in points, where higher is better.

Running the benchmark under various different loads with the feature enabled / disabled gave me the following results:

Disabled:
Baseline: 2437
make -j2: 1642
make -j24: 208
make -j42: Failed

Enabled:
Baseline: 2437
make -j2: 2293
make -j24: 2187
make -j42: 1626

As can be seen, on the dual core machine a load of 2 normally makes the benchmark run almost precisely 1/3 slower, as would be expected with BFS's fair CPU distribution of 3 running tasks between 2 CPUs. Enabling this feature leaves the benchmark almost unaffected at this load, and only once the load is more than 20 times higher does it hinder the benchmark to the same degree. At that load with the feature disabled, the browser just spat out 'a script on this page is causing the browser to run slowly' warnings and virtually gave up.


In the last few days, most of the reports have been very positive. However, as expected, not everything is rosy. There have been reports of applications such as mplayer stalling, and some people have had gnome applets fail to initialise!? How on earth a scheduling decision about who goes first can cause these is... well, it's not a mystery. In my experience it's because some assumption has been made in the userspace application that naively expects a certain behaviour: that one particular process will run first. These sorts of bugs, although likely due to the userspace applications themselves, make it impossible, or at least foolish, to ship changes of this nature as the scheduler's default. So as much as I'd like to see this change go into the next -ck release and be the default, I can't.

The patch will still be around to play with and I rather like it on my own desktop so I'm not throwing it out any time soon. Maybe something else will come of it in the future. But now I can relax and just sync up with mainline again when 2.6.36 final comes out.

Friday, 8 October 2010

Hierarchical tree-based penalty, interactivity at massive load, updated.

I've updated the patch for fork_depth based penalty with some minor tweaks. The default fork depth is now allowed to be zero for init, so the base system carries no fork depth penalty. Userspace reporting of "PRI" to top, ps etc. was modified to make more sense when the penalty is enabled. Thread group accounting was fixed to offset only by the absolute deadline, instead of by a deadline already penalised by fork_depth. Updated changelog follows:
Make it possible to have interactivity and responsiveness at very high load
levels by making deadlines offset by the fork depth from init. This has a
similar effect to 'nice'ing loads that are fork heavy. 'make' is a perfect
example of this and will, with fork_depth_penalty enabled, be felt as much
at 'make -j24' as it normally would be with just 'make'.

Note that this drastically affects CPU distribution, and also has the
indirect side effect of partitioning CPU entitlement to different users as
well. No assumption as to CPU distribution should be made based on past
behaviour.

This is achieved by separating out forks to new processes vs new threads.
When a new process is detected, its fork depth is inherited from its parent
across fork() and then is incremented by one. That fork_depth is then used
to cause a relative offset of its deadline.

This feature is enabled in this patch by default and can be optionally
disabled.

Threads are kept at the same fork_depth as their parent process, and can
optionally have their CPU entitlement all managed as one process together
by enabling the group_thread_accounting feature. This feature is disabled
by default in this patch, as many desktop applications such as firefox,
amarok, etc are multithreaded. By disabling this feature and enabling the
fork_depth_penalty feature (default) it favours CPU towards desktop
applications.

Extensive testing is required to ensure this does not cause regressions in
common workloads.

There are two sysctls to enable/disable these features.

They are in /proc/sys/kernel/

group_thread_accounting - groups CPU accounting by threads
fork_depth_penalty - penalises according to depth of forking from init
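The mechanism the changelog describes can be sketched in a few lines of C. This is a simplified illustration under my own naming (task, inherit_fork_depth, effective_deadline, offset_per_level), not the actual BFS code:

```c
/* Simplified sketch of the fork depth penalty described above.
 * All identifiers here are illustrative, not the real BFS ones. */
struct task {
    unsigned int fork_depth;        /* number of forks from init */
    unsigned long long deadline;    /* virtual deadline; lower runs sooner */
};

/* A new process inherits its parent's depth plus one; a new thread
 * keeps the parent's depth unchanged. */
static void inherit_fork_depth(struct task *child, const struct task *parent,
                               int is_thread)
{
    child->fork_depth = parent->fork_depth + (is_thread ? 0 : 1);
}

/* With fork_depth_penalty enabled, the deadline used for scheduling
 * decisions is offset in proportion to fork depth, so deeply forked
 * jobs (a make -j24 tree) queue behind shallow desktop tasks. */
static unsigned long long effective_deadline(const struct task *t,
                                             unsigned long long offset_per_level,
                                             int fork_depth_penalty)
{
    if (!fork_depth_penalty)
        return t->deadline;
    return t->deadline + (unsigned long long)t->fork_depth * offset_per_level;
}
```

With the knob off, every task competes on its plain deadline; with it on, each level of forking from init pushes the effective deadline further out.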

An updated patch for 2.6.36-rc7-ck1 follows, though it should apply to a BFS357 patched kernel with offsets:
bfs357-penalise_fork_depth_account_threads.patch

EDIT: Here is a patch that can be applied to vanilla 2.6.35.7 which gives you BFS + this change:
2.6.35.7-sched-bfs357+penalise_fork_depth_account_threads.patch

EDIT2: I notice some people are trying this patch and the earlier released "group as entities" patch and trying to compare them. Let me make it clear: this patch REPLACES the "group as entities" patch and does exactly the same thing, only updated.

I am still after feedback on this approach as for my workloads it's only advantageous so I'd love it if more people would report back their experiences, either in the comments here or via email to me at kernel@kolivas.org . Thanks!

Wednesday, 6 October 2010

Hierarchical tree-based penalty.

Further investigation at last reveals why the latest code affects program groups as well as threads, when threads were my original intention only. The number of running tasks was being inherited across fork, meaning it was really acting as a branch penalty. That is, the more times a task has forked from init (the further along the branching), the greater the penalty applied to its deadline. So it's a hierarchical tree penalty. While there may be a real case for such a feature, it is not program grouping at all! Now I have to think hard about where to go from here. The thread grouping idea is still valid, and will work with this code corrected. However, the tree penalty may be worth pursuing as a separate concept, since it had such dramatic effects...

Somehow I found myself doing a lot more hacking than I had intended, and that's also why there's so much blogging going on from someone who hates blogs.

EDIT: Enough talk. Here's some code. Patch to apply to a BFS 357 patched kernel, with two knobs in /proc/sys/kernel/
group_thread_accounting - groups CPU accounting by threads (default off)
fork_depth_penalty - penalises according to depth of forking from init (default on)
(Updated version):
bfs357-penalise_fork_depth_account_threads.patch