These are patches designed to improve system responsiveness and interactivity with specific emphasis on the desktop, but suitable to any commodity hardware workload.
Apply to 2.6.39(.x):
http://www.kernel.org/pub/linux/kernel/people/ck/patches/2.6/2.6.39/2.6.39-ck2/patch-2.6.39-ck2.bz2
Ubuntu packages (2.6.39-ck1-3 is equivalent to 2.6.39-ck2):
http://ck.kolivas.org/patches/Ubuntu%20Packages/
Broken out tarball:
http://www.kernel.org/pub/linux/kernel/people/ck/patches/2.6/2.6.39/2.6.39-ck2/2.6.39-ck2-broken-out.tar.bz2
Discrete patches:
http://www.kernel.org/pub/linux/kernel/people/ck/patches/2.6/2.6.39/2.6.39-ck2/patches/
All -ck patches:
http://www.kernel.org/pub/linux/kernel/people/ck/patches/
BFS by itself:
http://ck.kolivas.org/patches/bfs/
Web:
http://kernel.kolivas.org
Code blog when I feel like it:
http://ck-hack.blogspot.com/
Each discrete patch contains a brief description of what it does at the top of the patch itself.
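For anyone who has not applied a kernel patch before, here is a minimal sketch of how the full patch above might be applied, and how the description at the top of a discrete patch can be read. The function names and the directory name linux-2.6.39 are made up for illustration, and the `--dry-run` flag assumes GNU patch.

```shell
#!/bin/sh
# Sketch only: helper names and paths are hypothetical, not part of -ck.

# Apply a bzip2-compressed patch to an unpacked kernel tree.
# $1 = top of the kernel source, $2 = path to the .bz2 patch.
apply_ck_patch() {
    tree=$1
    patchfile=$2
    # Dry run first so a mismatch leaves the tree untouched (GNU patch).
    bzcat "$patchfile" | patch -d "$tree" -p1 --dry-run >/dev/null || return 1
    bzcat "$patchfile" | patch -d "$tree" -p1
}

# Print the free-text description at the top of a discrete patch,
# i.e. everything before the first line that starts with "---".
show_patch_description() {
    sed -n '/^---/q;p' "$1"
}

# Example (not run here):
#   apply_ck_patch linux-2.6.39 patch-2.6.39-ck2.bz2
#   show_patch_description hz-default_1000.patch
```

The dry run costs a second decompression, but it avoids ending up with a half-patched tree if the patch is applied to the wrong kernel version.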
The only change from 2.6.39-ck1 is an upgrade to BFS CPU scheduler version 0.406. It fixes a bug that would cause hangs due to an incompatibility between the new block plug flushing code and BFS. For those who tried the "bfs404-test9" patch, this is only trivially different apart from the BFS version change.
Full patchlist:
2.6.39-sched-bfs-406.patch
sched-add-above-background-load-function.patch
mm-zero_swappiness.patch
mm-enable_swaptoken_only_when_swap_full.patch
mm-drop_swap_cache_aggressively.patch
mm-kswapd_inherit_prio-1.patch
mm-background_scan.patch
mm-idleprio_prio-1.patch
mm-lru_cache_add_lru_tail-1.patch
mm-decrease_default_dirty_ratio.patch
kconfig-expose_vmsplit_option.patch
hz-default_1000.patch
hz-no_default_250.patch
hz-raise_max.patch
preempt-desktop-tune.patch
ck2-version.patch
Please enjoy!
お楽しみください
--
-ck
A development blog of what Con Kolivas is doing with code at the moment with the emphasis on linux kernel, MuQSS, BFS and -ck.
Sunday, 5 June 2011
Friday, 3 June 2011
2.6.39 BFS test 9 - is this the one?
Hopefully this test patch fixes all the problems with BFS 404 on 2.6.39:
bfs404-test9.patch
Ubuntu Packages: grab the 2.6.39-ck1-3 package
Please report back if you haven't already! Thanks to everyone who has tested so far! Your feedback has been absolutely essential on this weird and wonderful bug.
Monday, 30 May 2011
2.6.39 BFS progress
TL;DR: 2.6.39 BFS fixed maybe?
After walking away from the code for a while, annoyed at the bug I couldn't track down, I had another good look at what might be happening. It appears that while the grq lock is dropped in schedule() to perform the block plug flush, a call to the task via try_to_wake_up may be missed entirely, leaving the task deactivated when it should actually keep running. Anyway, first tests from the people on these blog comments are reassuring.
Here is a cleaned up and slightly modified version of the "test8" patch that has so far been stable and appears to have fixed the problem for a handful of people:
Apply to 2.6.39-ck1 or 2.6.39 with BFS 404:
bfs404-recheck_unplugged.patch
In response to requests for packaged versions, I've uploaded a 2.6.39-ck1-2 Ubuntu package which includes this change:
Ubuntu Packages
Please test and report back! If this fixes the problem, I'll be releasing it as ck2.
Thursday, 26 May 2011
2.6.39-ck1 unstable
As much as I hate to say this, I have to give up on 2.6.39 for now. I just don't have the time or energy to fix this. I'm grateful for all your testing, but it's just going to have to go on hold and I'll have to support .38 kernels in the meantime until I have a revelation of some sort, or help from someone who also knows kernel internals.
Thursday, 19 May 2011
2.6.39-ck1
These are patches designed to improve system responsiveness and interactivity with specific emphasis on the desktop, but suitable to any commodity hardware workload.
Apply to 2.6.39:
patch-2.6.39-ck1.bz2
Broken out tarball:
2.6.39-ck1-broken-out.tar.bz2
Discrete patches:
patches
Ubuntu packages:
http://ck.kolivas.org/patches/Ubuntu%20Packages
All -ck patches:
http://www.kernel.org/pub/linux/kernel/people/ck/patches/
BFS by itself:
http://ck.kolivas.org/patches/bfs/
Web:
http://kernel.kolivas.org
Code blog when I feel like it:
http://ck-hack.blogspot.com/
Each discrete patch contains a brief description of what it does at the top of the patch itself.
The most substantial change since the last public release is a major version upgrade to the BFS CPU scheduler version 0.404.
Full details of the most substantial changes, which went into version 0.400, are in my blog here:
http://ck-hack.blogspot.com/2011/04/bfs-0400.html
This version exhibits better throughput, better latencies, better behaviour with scaling cpu frequency governors (e.g. ondemand) and better use of turbo modes in newer CPUs. It also addresses a long-standing bug that caused fluctuating performance and latencies; the bug affected all configurations, but was only demonstrable on lower Hz configurations (such as 100Hz). Thus mobile configurations (e.g. Android on 100Hz) also perform better. The default round robin interval on all hardware is now set to 6ms (i.e. tuned primarily for latency). This can be easily modified with the rr_interval sysctl in BFS for special configurations (e.g. increase to 300 for encoding / folding machines).
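As a rough sketch of what changing the round robin interval could look like: this assumes the sysctl is exposed at /proc/sys/kernel/rr_interval, which only exists on a BFS-patched kernel, and that writes are done as root. The helper names and the RR_FILE override are made up for illustration.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical. The path below is assumed
# to be where a BFS kernel exposes the rr_interval sysctl.
RR_FILE=${RR_FILE:-/proc/sys/kernel/rr_interval}

# Print the current round robin interval in milliseconds.
show_rr_interval() {
    if [ -r "$RR_FILE" ]; then
        echo "rr_interval: $(cat "$RR_FILE") ms"
    else
        echo "rr_interval not available (not a BFS kernel?)"
    fi
}

# Set a new interval, e.g. 300 for an encoding / folding machine.
# Needs root on a real system; the change does not survive a reboot.
set_rr_interval() {
    echo "$1" > "$RR_FILE"
}

# Example (not run here):
#   show_rr_interval
#   set_rr_interval 300
```

Checking the value before and after a change is an easy way to confirm that the running kernel is actually a BFS one.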
Performance of BFS has been tested on hardware ranging from lower power single core machines through SMP hardware in various configurations, both threaded and multicore, up to 24x AMD. On kbuild, the 24x machine exhibited better throughput than mainline at loads up to the optimal level (from make -j1 up to make -j24). Performance beyond this level of load did not match mainline. On folding benchmarks at 24x, BFS was consistently faster for the unbound (no cpu affinity in use) multi-threaded version. On 6x hardware, performance at all levels of load in kbuild and x264 encoding benchmarks was better than mainline in both throughput and latency in the presence of the workloads.
For 6 core results and graphs, see:
benchmarks 20110516
(desktop = 1000Hz + preempt, server = 100Hz + no preempt)
Here are some desktop config highlights:
[Graphs not reproduced here: throughput at make -j6; latency in the presence of x264 ultrafast; throughput with x264 ultrafast]
This is not by any means a comprehensive performance analysis, nor is it meant to claim that BFS is better under all workloads and hardware than mainline. These are simply easily demonstrable advantages on some very common workloads on commodity hardware, and the benchmarks constitute a regular part of my regression testing. Thanks to Serge Belyshev for 6x results, statistical analysis and graphs.
Other changes in this patch release include an updated version of lru_cache_add_lru_tail as the previous version did not work entirely as planned, dropping the dirty ratio to the extreme value of 1 by default in decrease_default_dirty_ratio, and dropping of the cpufreq ondemand tweaks since BFS detects scaling CPUs internally now and works with them.
Full patchlist:
2.6.39-sched-bfs-404.patch
sched-add-above-background-load-function.patch
mm-zero_swappiness.patch
mm-enable_swaptoken_only_when_swap_full.patch
mm-drop_swap_cache_aggressively.patch
mm-kswapd_inherit_prio-1.patch
mm-background_scan.patch
mm-idleprio_prio-1.patch
mm-lru_cache_add_lru_tail-1.patch
mm-decrease_default_dirty_ratio.patch
kconfig-expose_vmsplit_option.patch
hz-default_1000.patch
hz-no_default_250.patch
hz-raise_max.patch
preempt-desktop-tune.patch
ck1-version.patch
Please enjoy!
お楽しみください
--
-ck
EDIT4: For those having hangs, please try this patch on top of ck1:
bfs404-test6.patch
Monday, 16 May 2011
BFS 0.404 page that really exists
There was one regression going into BFS 0.403: expanding the sticky flag to cache warm as well. Not only did it fail to improve throughput on anything I could measure, it also caused latency regressions, so I've backed it out. The only other change going to 404 was fixing a couple of unused variable warnings that were reported by a commenter on this blog. So I consider this patch now stable and pretty much how it will go into 2.6.39 final when it comes out.
Get it here:
2.6.39-rc7-sched-bfs-404.patch.lrz
Wednesday, 11 May 2011
BFS 0.402 test2 for 2.6.39-rc7
Well it looks like another stable release is just around the corner, so it's time for me to sync up. Here's the first BFS test release patch for 2.6.39-rc7:
2.6.39-rc7-sched-bfs-402-test2.patch.lrz
Of course I've used my evil powers to compress it with lrzip as a ploy to make you all have to use it again.
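For those who have not used lrzip yet, here is a minimal sketch of unpacking an .lrz patch before applying it. The helper names are made up for illustration; the sketch relies only on lrzip's -d (decompress), -q (quiet) and -o (output file) options.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Strip the .lrz suffix to get the name of the decompressed patch.
lrz_output_name() {
    printf '%s\n' "${1%.lrz}"
}

# Decompress an .lrz file next to the original and print the result's name.
unpack_lrz_patch() {
    lrz=$1
    out=$(lrz_output_name "$lrz")
    if ! command -v lrzip >/dev/null 2>&1; then
        echo "lrzip is not installed" >&2
        return 127
    fi
    lrzip -d -q -o "$out" "$lrz" && printf '%s\n' "$out"
}

# Example (not run here), from the top of a 2.6.39-rc7 tree:
#   p=$(unpack_lrz_patch ../2.6.39-rc7-sched-bfs-402-test2.patch.lrz)
#   patch -p1 < "$p"
```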
I've been using it for a few hours and it seems to be stable enough, but all the usual warnings apply. I also tested it on the most common configurations, but that doesn't mean it will definitely build fine on all configurations.
The only changes in the impending final release of BFS version 0.402 are some inspired by the people posting changes here in the forums (Thanks guys!), though not exactly in the form offered, and a resync with the new changes required to support 2.6.39. Specifically there is more high resolution IRQ accounting, and a new "yield_to" call.
Funnily enough, it was a good 6 years or so ago I had a discussion with William Lee Irwin III who suggested such a yield call as a useful programming addition which of course was discounted by the mainline maintainers back then. Now they suddenly find it's a useful idea after all, since there may well be scenarios where a directed yield is helpful instead of strict locking semantics. Oh well, I guess there is the adage that you should only ever implement a feature at the time you need it rather than "for when you might need it in the future". The difference now from back then is that the people who wanted it back then couldn't push so hard since they weren't kernel hackers themselves. This time it's KVM that desires it, so it's required by another part of the kernel instead of userspace.
So anyway, please test and report back, and enjoy!


