
Thursday, 6 January 2011

2.6.37-ck1, BFS 0.363 for 2.6.37, Grouping by UID

It looks like 2.6.37 made it out in time before I left for my trip, so here's some goodies to keep you all busy, from the emails I just sent to announce them on lkml:

---

These are patches designed to improve system responsiveness and interactivity with specific emphasis on the desktop, but suitable for any workload.


Apply to 2.6.37:
http://www.kernel.org/pub/linux/kernel/people/ck/patches/2.6/2.6.37/2.6.37-ck1/patch-2.6.37-ck1.bz2

Broken out tarball:
http://www.kernel.org/pub/linux/kernel/people/ck/patches/2.6/2.6.37/2.6.37-ck1/2.6.37-ck1-broken-out.tar.bz2

Discrete patches:
http://www.kernel.org/pub/linux/kernel/people/ck/patches/2.6/2.6.37/2.6.37-ck1/patches/

All -ck patches:
http://www.kernel.org/pub/linux/kernel/people/ck/patches/


Web:
http://kernel.kolivas.org

Code blog when I feel like it:
http://ck-hack.blogspot.com/

Each discrete patch contains a brief description of what it does at the top of
the patch itself.


The most significant change is an updated BFS CPU scheduler, now at version
0.363. See the announcement of that patch for the changelog. The rest is a resync.


2.6.37-sched-bfs-363.patch
sched-add-above-background-load-function.patch
mm-make_swappiness_really_mean_it.patch
mm-zero_swappiness.patch
mm-enable_swaptoken_only_when_swap_full.patch
mm-drop_swap_cache_aggressively.patch
mm-kswapd_inherit_prio-1.patch
mm-background_scan.patch
mm-idleprio_prio-1.patch
mm-lru_cache_add_lru_tail.patch
mm-decrease_default_dirty_ratio.patch
kconfig-expose_vmsplit_option.patch
hz-default_1000.patch
hz-no_default_250.patch
hz-raise_max.patch
preempt-desktop-tune.patch
cpufreq-bfs_tweaks.patch
ck1-version.patch


---

The BFS (name shall not be said for PG requirements) CPU scheduler for 2.6.37
is now available.

Since the last release, a lot more work was put into maintaining fine-grained
accounting at all times (which should help on 32-bit machines, uniprocessor
systems, and 100Hz configs), minor changes have been made to make the CPU
offline code more robust for the purposes of suspend to ram/disk, and some
small scalability improvements were added for SMT CPUs (e.g. i7). These changes
are unlikely to have dramatically noticeable effects unless you were already
experiencing a problem or poor performance.

A direct link to the patch for 2.6.37 is here:
http://ck.kolivas.org/patches/bfs/2.6.37/2.6.37-sched-bfs-363.patch

All BFS patches here:
http://ck.kolivas.org/patches/bfs

Version 363 has been ported to 2.6.35 and 2.6.32 and is available from that
directory, in view of the long-term release nature of those kernels.


On a related note, a small multi-user server feature request was commissioned
for BFS that I was happy to work on, which I'd like to also make publicly
available.

Here is the changelog:

---

Make it possible to proportion CPU resource strictly according to user ID by
grouping all tasks from the one user as one task.

This is done by simply tracking how many tasks from each UID are running at
any one time and using that count to determine the virtual deadline, offset
proportionately according to the number of running tasks. An array indexed by
UID provides very quick lookup of the running count and is protected by the
grq lock. This should incur almost immeasurably small overhead even when
enabled. An upper limit of 65535 UIDs is currently supported.

Make this feature configurable at build time via Kconfig, and at runtime
through two sysctls:

/proc/sys/kernel/group_by_uid
to enable or disable the feature (default 1 == on), and

/proc/sys/kernel/group_uid_min
to set the minimum UID to group tasks from (default 1000).

Nice values are still respected, making it possible to allocate different
amounts of CPU to each user.

This feature is most suited to a multi-user shell type server environment and
is NOT recommended for an ordinary desktop.

---
The patch is available for the moment here:
http://ck.kolivas.org/patches/bfs/test/bfs363-group_uids.patch


A reminder that this is NOT a desktop, laptop or embedded device type feature.
The purpose of this feature is to make it impossible for any one user to get
more CPU than any other user on a multi-user login. This is suitable for
multiuser shared GUI/X session or shell type machines, and incurs almost
negligible overhead.


---
I'll be offline shortly and in Japan for a few weeks so I'll be unlikely to respond to any emails in that time.

Enjoy!

Sunday, 2 January 2011

BFS version 0.363

Welcome to 2011!

The testing on BFS ported to 2.6.37-rc8 has been reassuring, and no real show stopper bugs have shown up. The remaining changes required to make it release ready for 2.6.37 have now been committed, along with some other very minor changes, so I've bumped the version up to 0.363. The main change was implementing the fine grained interrupt accounting which will have very little, if any, impact on regular users. These changes are ONLY suitable for 2.6.37, so they have not been ported back to the BFS I'm maintaining for earlier kernels. The rest of the changes suitable for older kernels have gone into 363 for them.

Here is the changelog as it affects existing BFS 360 users:
Make CPU offlining more robust by simply removing all affinity for processes
that no longer have any CPUs they can run on. This allows the machine stop
thread to complete offlining CPUs and makes for a little less overhead in hot
paths.

Allow SCHED_IDLEPRIO to wake up idle CPUs in try_preempt. This would have
caused minor slowdowns for IDLEPRIO tasks only on relatively quiescent systems.

Remove inappropriate likely()s.

Update cpustat for irq - may have been under-reporting interrupt load.

Cosmetic changes.

Bump version to 0.363

Most of these changes should have no user-visible behavioural effect, apart from the following:

For those on BFS 360, if you were having warnings or even OOPSes on suspend to ram/disk or wakeup from them, or if you were having trouble suspending or resuming, this change might help.

The other change guarantees that CPUs will be busier on SMP machines when tasks are being run IDLEPRIO, so it will increase throughput, but ONLY if you run tasks IDLEPRIO.


Incremental: 2.6.36-bfs-360-363.patch

For 2.6.36ish:
2.6.36-sched-bfs-363.patch

2.6.35.10:
2.6.35.10-sched-bfs-363.patch

2.6.32.27:
2.6.32.27-sched-bfs-363.patch

Shortly I'll be going to Japan for a few weeks as I do almost every year now, so I'll be offline for a while.

Thursday, 30 December 2010

BFS for 2.6.37-rc8

So we approach yet another "stable" release with 2.6.37 just around the corner now that -pre8 is out (yes I'm old fashioned, it is still a pre-release and not a release candidate in my eyes). I usually use pre8 as the flag to port BFS to the new kernel so that I have it ready in time for the "stable" 3 point release.

I've ported the current BFS release v0.360 to the latest kernel. Of significance in the new kernel are some changes, yet again, to the way CPU offlining occurs (which happens just before suspend to ram and disk), a new way to improve accounting of CPU attribution during interrupt handling, and improved reporting of CPU load during idle periods with NOHZ enabled. There are also some other architectural changes that have no cosmetic effect. None of these will have any effect on "performance" for the end user, so don't expect any surprises there.

So I've ported BFS with only the changes necessary to get it working on the new kernel. I've not yet added the support for interrupt accounting, and I haven't changed the way total load is calculated on NOHZ configured kernels. The only change of any significance is the way I offline CPUs in the new BFS. I have been fighting with that for a while now since there really is no right way to do it, and it changes so frequently in mainline that I often have trouble keeping up.

For those already on version 0.360 of BFS, the only significant changes can be had now in this patch:
bfs360-test.patch

For the dirty details of the CPU offline code: a worker thread runs at ultra high priority on the CPU it's about to offline, and halfway through its work it turns off that CPU and then needs to run on another CPU to die itself. Until now, BFS has looked for tasks with affinity for CPUs that could no longer run on any alive CPU and would then be happy to run them anywhere. With this latest bfs360-test.patch it does more of what the mainline kernel does and formally breaks affinity for tasks that no longer have anywhere they're allowed to run. This yields only a negligible improvement in overhead, but the main reason for doing it is to make the offline code more robust.

For those looking for the first BFS release on 2.6.37-rc8, here's the test patch:
2.6.37-rc8-sched-bfs-362.patch.

I've bumped the version number up simply because it includes the test changes above, and has had other architectural changes to keep it in sync with the mainline kernel. The main new thing in mainline is a new "class" of realtime scheduling, called the stop class, used only internally by the CPU offlining code. Instead of adopting a new scheduling class, BFS simply adds one more realtime priority that can be used by kernel threads and is only ever applied to the CPU offline thread (kmigration). I did this to add zero overhead to the existing design, while still supporting the concept of a thread that nothing else can preempt.

Almost certainly I will be adding more code to the final version of BFS for the 2.6.37 release, to support the new interrupt accounting code and the global load reporting, but these will only have cosmetic effects. This patch has only been lightly tested and not yet compile tested for multiple configurations, but seems to be working nicely. For now you can grab it here:
2.6.37-rc8-sched-bfs-362.patch

Happy New Year everyone :)