Monday 11 October 2010

Further updates on hierarchical tree-based penalty.

First of all, no new patch today, yay \o/ I know it's sometimes hard to keep up but that's the nature of things. I (sort of kind of hope that I can) promise not to release anything much new any time soon with respect to the hierarchical tree-based penalty code.

I experimented with removing the separation of processes from threads and treating them all equally, and discovered that this led to real lag in some GUI applications, as many of them use threads. So it seems the default of penalising fork depth and not penalising threads works best at biasing CPU distribution for interactivity and responsiveness (which is the default on the current patch). This is rather ironic, as this code evolved out of an initial attempt to control threads' behaviour on massively threaded applications, yet it turns out that being nice to threaded apps works better for the desktop.

I thought up another way of demonstrating the effect this patch has in a measurable way.

Using a dual core machine as an example, I ran the "browser benchmark" at http://service.futuremark.com/peacekeeper/index.action. This allowed me to show the effect of the gold standard load, make, on the almost universal GUI app, a browser.

The benchmark runs a number of different browser based workloads, and gives a score in points, where higher is better.

Running the benchmark under various different loads with the feature enabled / disabled gave me the following results:

Disabled:
Baseline: 2437
make -j2: 1642
make -j24: 208
make -j42: Failed

Enabled:
Baseline: 2437
make -j2: 2293
make -j24: 2187
make -j42: 1626

As can be seen, on the dual core machine, a load of 2 normally makes the benchmark run almost precisely 1/3 slower, as would be expected with BFS' fair CPU distribution of 3 processes between 2 CPUs. Enabling this feature makes the benchmark progress almost unaffected at this load, and only once the load is more than 20 times higher does it hinder the benchmark to the same degree. At that load (make -j42) with the feature disabled, the browser just spat out 'a script on this page is causing the browser to run slowly' etc. etc. and virtually gave up.
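As a rough check of that arithmetic, here is a small Python sketch using only the scores listed above. It rests on two assumptions that are not measured here: the benchmark behaves like a single CPU-bound process, and its score scales linearly with the CPU share it receives. The helper names `relative_score` and `fair_share` are made up for this illustration.

```python
# Scores copied from the results above (Peacekeeper points, higher is better).
BASELINE = 2437
disabled = {"make -j2": 1642, "make -j24": 208}
enabled = {"make -j2": 2293, "make -j24": 2187, "make -j42": 1626}


def relative_score(score, baseline=BASELINE):
    """Fraction of the unloaded baseline score that was retained."""
    return score / baseline


def fair_share(cpus, runnable):
    """CPU share per process if `runnable` CPU-bound processes split
    `cpus` CPUs evenly (assumption: purely fair distribution)."""
    return min(1.0, cpus / runnable)


if __name__ == "__main__":
    # 2 make jobs + 1 browser = 3 processes on 2 CPUs.
    expected = fair_share(cpus=2, runnable=3)
    print(f"expected share under fair distribution: {expected:.3f}")
    for label, table in (("disabled", disabled), ("enabled", enabled)):
        for load, score in table.items():
            print(f"{label:8s} {load:10s} {relative_score(score):.3f} of baseline")
```

Under those assumptions, the disabled make -j2 run and the enabled make -j42 run both land close to the 2/3 share that a fair split of 3 processes over 2 CPUs would predict.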


In the last few days, most of the reports have been very positive. However, as expected, not everything is rosy. There have been reports of applications such as mplayer stalling and some people have had gnome applets fail to initialise!? How on earth a scheduling decision about who goes first can cause these is... well it's not a mystery. In my experience it's because some assumption has been made in the userspace application that naively expects a certain behaviour; that being that one process will run first. These sorts of bugs, although likely due to the userspace application itself, make changes of this nature in the scheduler as the default impossible, or at least foolish. So as much as I'd like to see this change go into the next -ck release and be the default, I can't.

The patch will still be around to play with and I rather like it on my own desktop so I'm not throwing it out any time soon. Maybe something else will come of it in the future. But now I can relax and just sync up with mainline again when 2.6.36 final comes out.

8 comments:

  1. I understand and agree with your conservative approach. I suspect that with a little more time to investigate any issues that come up, alternatives can be found to make this work. I would love to know more about the problems others have had, so that I can try to reproduce them on my machine.

    Galen

  2. You can read about the mplayer issues in the comments on this blog. The other gnome issues are on the arch forums that someone linked me to. See https://bbs.archlinux.org/viewtopic.php?id=106086&p=2

  3. I suppose the reason I didn't notice mplayer stalling is because I use a multi-threaded version of it (ffmpeg-mt). I flipped the sysctl and configured mplayer to use only one thread, and indeed, it stalls when doing GUI effects (for example hovering the mouse over the calendar widget which fades-in the current date makes mplayer stall for the duration of the fade-in.)

  4. Thanks for that report. That's a pretty serious regression and confirms my concerns about enabling this sort of feature by default. People should just learn to use nice...

  5. On a related note, it would be good (or easier ;) ) for most users if there were some kind of configuration file that defines the standard nice/priority for certain processes. Then mplayer (or vlc etc.) could start with a higher priority and the stalling would be gone. (I have seen this kind of automagic elsewhere, but cannot remember where.)

    Remark for (the old) BFS: With CFS and my "multitouch" Synaptics touchpad, I had serious trouble configuring the second mouse button for a two-finger press. It works sometimes, but not reliably. With BFS it works like a charm, no problem at all, even under high load. (I had trouble testing the new features with 2.6.35.7; maybe I patched it the wrong way. I couldn't test the 2.6.36 RC, because even with the evil ;) NVidia beta driver the plasma desktop on KDE 4.5.2 crashes here on my OpenSuse.)

    Question: Could you always include the new tree-based penalty in the -ck BFS kernel patch, but have it off by default (for dummies) and switchable with a sysctl value?

    Thanks from a noob.
    CU sysitos

  6. Thanks for the feedback.

    Indeed I could include the patch for tree based penalty and just disable it, however it does add a very small amount of overhead to the kernel, even if it is off by default, so I am a little reluctant.

  7. Here on my laptop, your hierarchical-tree patch gave me some significant regressions, mainly in overall system performance.

    For example, while I was transferring a large amount of data between Linux and Windows partitions, my system occasionally stalled when I opened new programs (in Linux, of course), and took much longer to open them than when I wasn't using this patch...

    Currently, I'm using your "BFS 357" patch but without the hierarchical-tree feature, because it gave me the regressions mentioned above...

    Hope my feedback can help you.
    Kudos from Portugal...

    p.s.: Sorry for my English, I'm not a native speaker... :(

  8. Don't be sorry! I'm very thankful for your testing. It confirms my testing and I think we shouldn't enable this feature at all. I don't even think it should be included in the final 2.6.36-ck1 since I'm not sure what value it adds.
