Thursday 6 January 2011

2.6.37-ck1, BFS 0.363 for 2.6.37, Grouping by UID

It looks like 2.6.37 made it out in time before I left for my trip, so here are some goodies to keep you all busy, taken from the emails I just sent to lkml to announce them:

---

These are patches designed to improve system responsiveness and interactivity with specific emphasis on the desktop, but suitable to any workload.


Apply to 2.6.37:
http://www.kernel.org/pub/linux/kernel/people/ck/patches/2.6/2.6.37/2.6.37-ck1/patch-2.6.37-ck1.bz2

Broken out tarball:
http://www.kernel.org/pub/linux/kernel/people/ck/patches/2.6/2.6.37/2.6.37-ck1/2.6.37-ck1-broken-out.tar.bz2

Discrete patches:
http://www.kernel.org/pub/linux/kernel/people/ck/patches/2.6/2.6.37/2.6.37-ck1/patches/

All -ck patches:
http://www.kernel.org/pub/linux/kernel/people/ck/patches/


Web:
http://kernel.kolivas.org

Code blog when I feel like it:
http://ck-hack.blogspot.com/

Each discrete patch contains a brief description of what it does at the top of
the patch itself.


The most significant change is an updated BFS CPU scheduler, up to version
0.363. See the announcement of that patch for the changelog. The rest is a resync.


2.6.37-sched-bfs-363.patch
sched-add-above-background-load-function.patch
mm-make_swappiness_really_mean_it.patch
mm-zero_swappiness.patch
mm-enable_swaptoken_only_when_swap_full.patch
mm-drop_swap_cache_aggressively.patch
mm-kswapd_inherit_prio-1.patch
mm-background_scan.patch
mm-idleprio_prio-1.patch
mm-lru_cache_add_lru_tail.patch
mm-decrease_default_dirty_ratio.patch
kconfig-expose_vmsplit_option.patch
hz-default_1000.patch
hz-no_default_250.patch
hz-raise_max.patch
preempt-desktop-tune.patch
cpufreq-bfs_tweaks.patch
ck1-version.patch


---

The BFS (name shall not be said for PG requirements) CPU scheduler for 2.6.37
is now available.

Since the last release, a lot more work has gone into maintaining fine grained
accounting at all times (which should help on 32 bit machines, uniprocessor
and 100Hz configs). Minor changes have been made to make the CPU offline code
more robust for the purposes of suspend to ram/disk, and some small scalability
improvements were added for SMT CPUs (e.g. i7). These changes are unlikely to
have dramatically noticeable effects unless you were already experiencing a
problem or poor performance.

A direct link to the patch for 2.6.37 is here:
http://ck.kolivas.org/patches/bfs/2.6.37/2.6.37-sched-bfs-363.patch

All BFS patches here:
http://ck.kolivas.org/patches/bfs

Version 363 has also been ported to 2.6.35 and 2.6.32 and is available from
that directory, in view of the long term release nature of these kernels.


On a related note, a small multi-user server feature request was commissioned
for BFS that I was happy to work on, which I'd like to also make publicly
available.

Here is the changelog:

---

Make it possible to proportion CPU resource strictly according to user ID by
grouping all tasks from the one user as one task.

This is done through simply tracking how many tasks from the one UID are
running at any one time and using that data to determine what the virtual
deadline is, offset by proportionately more according to the number of running
tasks. Do this by creating an array of every UID for very quick lookup of the
running value and protect it by the grq lock. This should incur almost
immeasurably small overhead even when enabled. An upper limit of 65535 UIDs
is currently supported.
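
As a rough illustration of the idea described above (this is not the code from
the patch), the per-UID accounting could be sketched in C like this. All names
here (uid_running, uid_task_start, uid_task_stop, grouped_deadline, MAX_UIDS)
are made up for the example, the locking that the real patch does under the
grq lock is left out, and scaling the deadline offset linearly by the number
of running tasks is only an assumption about what "proportionately more" means.

```c
#include <stdint.h>

/* Assumption: one array slot per supported UID, mirroring the stated limit. */
#define MAX_UIDS 65535

/* Hypothetical per-UID count of currently running tasks. */
static unsigned int uid_running[MAX_UIDS];

/* Stand-ins for the two sysctls, set to their documented defaults. */
static unsigned int group_by_uid = 1;
static unsigned int group_uid_min = 1000;

/* Account for a task of this UID becoming runnable. */
void uid_task_start(unsigned int uid)
{
    if (uid < MAX_UIDS)
        uid_running[uid]++;
}

/* Account for a task of this UID no longer being runnable. */
void uid_task_stop(unsigned int uid)
{
    if (uid < MAX_UIDS && uid_running[uid] > 0)
        uid_running[uid]--;
}

/*
 * Compute a virtual deadline. With grouping active for this UID, the
 * offset is multiplied by the number of running tasks from that UID, so
 * a user with many tasks gets later deadlines for each of them.
 */
uint64_t grouped_deadline(uint64_t now, uint64_t base_offset, unsigned int uid)
{
    unsigned int n = 1;

    if (group_by_uid && uid >= group_uid_min && uid < MAX_UIDS &&
        uid_running[uid] > 1)
        n = uid_running[uid];

    return now + base_offset * n;
}
```

With three running tasks from UID 1000, each of them would get a deadline
offset three times larger than a lone task from another user, while UIDs below
group_uid_min are left alone.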

Make this feature configurable at build time via Kconfig, and at runtime
through the use of sysctls:

/proc/sys/kernel/group_by_uid
to enable or disable the feature (default 1 == on), and

/proc/sys/kernel/group_uid_min
to decide the minimum uid to group tasks from (default 1000)
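
For anyone wanting to flip these sysctls from a program rather than a shell,
here is a minimal sketch in C. The helper names write_uint_file and
read_uint_file are hypothetical and not part of the patch; they just write or
read a single unsigned number, which is all these two files are described as
holding.

```c
#include <stdio.h>

/* Write an unsigned value to a sysctl-style file. Returns 0 on success, -1 on failure. */
int write_uint_file(const char *path, unsigned int value)
{
    FILE *f = fopen(path, "w");
    if (!f)
        return -1;

    int ok = fprintf(f, "%u\n", value) > 0;
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}

/* Read an unsigned value from a sysctl-style file. Returns 0 on success, -1 on failure. */
int read_uint_file(const char *path, unsigned int *value)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;

    int ok = fscanf(f, "%u", value) == 1;
    fclose(f);

    return ok ? 0 : -1;
}
```

On a kernel built with this patch, and run as root,
write_uint_file("/proc/sys/kernel/group_by_uid", 0) would be expected to
disable the grouping, and write_uint_file("/proc/sys/kernel/group_uid_min", 500)
to lower the minimum grouped uid. The value 500 is only an example.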

Nice values are still respected, making it possible to allocate different
amounts of CPU to each user.

This feature is most suited to a multi-user shell type server environment and
is NOT recommended for an ordinary desktop.

---
The patch is available for the moment here:
http://ck.kolivas.org/patches/bfs/test/bfs363-group_uids.patch


A reminder that this is NOT a desktop, laptop or embedded device type feature.
The purpose of this feature is to make it impossible for any one user to get
more CPU than any other user on a multi-user login. This is suitable for
multi-user shared GUI/X session or shell type machines, and incurs almost
negligible overhead.


---
I'll be offline shortly and in Japan for a few weeks so I'll be unlikely to respond to any emails in that time.

Enjoy!

23 comments:

  1. Your backports still don't make it onto kernel.org, e.g.:
    http://www.kernel.org/pub/linux/kernel/people/ck/patches/2.6/2.6.32/

    I wonder if distributions like Gentoo (ck-sources) will follow your recent developments.

    Perhaps you could boost adoption of the most recent BFS development by using unified/harmonized sources on github or gitorious? As an extra benefit, we would be able to look at any micro changesets...

  2. Backports are in the BFS directory, not on kernel.org

    I don't do git for this work just because of the way I port my code, sorry.

  3. For point release 2.6.36.3 I just tried my patch:
    ---
    --- a/kernel/sched_bfs.c 2011-01-08 21:14:35.583000004 +0100
    +++ b/kernel/sched_bfs.c 2011-01-08 21:16:53.421000003 +0100
    @@ -1924,7 +1924,7 @@
    /*
    * calc_load - update the avenrun load estimates every LOAD_FREQ seconds.
    */
    -void calc_global_load(void)
    +void calc_global_load(unsigned long ticks)
    {
    long active; ---
    I also disabled the -ck2/mm- series...

  4. My patch works (place a new line before ---).
    No wonder, since I have CONFIG_NO_HZ disabled.

  5. @Ralph, can you please explain why you made the patch? As you already wrote, the load calculation had problems with CONFIG_NO_HZ, so without it the patch doesn't make much sense. Or am I mistaken?

    PS: ck is away on holiday, so no clarification from him is expected.

    CU sysitos

  6. Point release 2.6.36.3 comes with a patch backporting the 2.6.37 CONFIG_NO_HZ load calculations. This changes the prototype of a function that is also defined in sched_bfs.c:

    void calc_global_load(void)

    Patching it to
    void calc_global_load(unsigned long ticks)
    makes the build work again. But I am unsure whether this Kolivas function then works correctly. You can sidestep the question by unsetting CONFIG_NO_HZ. It is just a patch to allow compiling linux-2.6.36.3 with bfs-363!

  7. @Ralph,

    Thanks so far. I wouldn't unset CONFIG_NO_HZ because I use a laptop, and it seems to be a power saver.

    CU sysitos

  8. I thought your patches kill the cgroups feature. Am I wrong? o.O

  9. They kill the cgroups for the CPU scheduler only. For other subsystems, such as block I/O, cgroups are still available.

  10. Yes, the comments here are correct about building the kernel without hotplug and about cgroups. I made a minor screwup that prevents it building if hotplug is disabled. I'll post a patch to fix it in my bfs directory when I get a chance.

  11. i don't know about this one. this version seems much slower than bfs 0.360. well, at least at multitasking. any ideas? not even schedtool makes it as fast as the older version without schedtool. anyways, i am not using this patch directly, but through zen kernel, which uses the exact same version. i have a core 2 duo processor.

  12. Can't really comment then unless I know that it's only BFS that has changed. No matter what the zen devs say, it's impossible for it to accurately represent one change when they add other things to it, sorry.

  13. alright then, i will try and compare the new bfs against the old one on the vanilla kernel and see the results.

  14. That would be most helpful, thanks! Feel free to email me the results from here on instead: kernel @ kolivas.org

  15. false alarm. i compiled the two versions on vanilla kernel. turns out, they give the same interactivity under load. you were right ck, it was the zen kernel after all. i am sorry that i wasted your time.

  16. Not a problem, I'm just glad you sorted it out. Thanks for the feedback.

  17. Hi Anonymous, you can report your experience at #zen-sources @ irc.rizon.net. These are the kinds of things we'd like to know about.

  18. oh yeah and damentz, if i compile your latest patch with bfs, it doesn't work. it says something about sched.c has a similar reference or something like that. i will compile it again and give the exact output.

  19. patch? not from git? come on irc:)

  20. yeah, the liquorix patch. more up to date.

  21. How's the liquorix patch + BFS?

    Is it the best combo for desktop interactivity?
