Sunday, 5 June 2011

2.6.39-ck2, bfs-0.406

These are patches designed to improve system responsiveness and interactivity, with specific emphasis on the desktop, but suitable for any workload on commodity hardware.


Apply to 2.6.39(.x):
http://www.kernel.org/pub/linux/kernel/people/ck/patches/2.6/2.6.39/2.6.39-ck2/patch-2.6.39-ck2.bz2

Ubuntu packages (2.6.39-ck1-3 is equivalent to 2.6.39-ck2):
http://ck.kolivas.org/patches/Ubuntu%20Packages/

Broken out tarball:
http://www.kernel.org/pub/linux/kernel/people/ck/patches/2.6/2.6.39/2.6.39-ck2/2.6.39-ck2-broken-out.tar.bz2

Discrete patches:
http://www.kernel.org/pub/linux/kernel/people/ck/patches/2.6/2.6.39/2.6.39-ck2/patches/

All -ck patches:
http://www.kernel.org/pub/linux/kernel/people/ck/patches/

BFS by itself:
http://ck.kolivas.org/patches/bfs/

Web:
http://kernel.kolivas.org

Code blog when I feel like it:
http://ck-hack.blogspot.com/

Each discrete patch contains a brief description of what it does at the top of the patch itself.


The only change from 2.6.39-ck1 is an upgrade to BFS CPU scheduler version 0.406. It fixes a bug that could cause hangs due to an incompatibility between the new block plug flushing code and BFS. For those who tried the "bfs404-test9" patch, this release is only trivially different apart from the BFS version change.

Full patchlist:
2.6.39-sched-bfs-406.patch
sched-add-above-background-load-function.patch
mm-zero_swappiness.patch
mm-enable_swaptoken_only_when_swap_full.patch
mm-drop_swap_cache_aggressively.patch
mm-kswapd_inherit_prio-1.patch
mm-background_scan.patch
mm-idleprio_prio-1.patch
mm-lru_cache_add_lru_tail-1.patch
mm-decrease_default_dirty_ratio.patch
kconfig-expose_vmsplit_option.patch
hz-default_1000.patch
hz-no_default_250.patch
hz-raise_max.patch
preempt-desktop-tune.patch
ck2-version.patch

Please enjoy!
お楽しみください
--
-ck

35 comments:

  1. Let me be the first to say thank you for the diligence and perseverance with all you do for the Linux community! Your efforts are deeply appreciated.

  2. Thanks a lot for your work...It works magic...

  3. I just wanted to let you know that you just gained a follower. I've been running your kernel on Arch for a month now and haven't seen any issue.

    There is one problem with text rendering on nvidia that BFS just made more pronounced. But the problem exists with mainline kernels too, and I think people have attributed it to an nvidia driver bug.

  4. Great great work!
    Well done Con!
    I had to stick with .38.7 for a while, unfortunately I have limited time for testing at the moment, I need something that just works while I'm in front of the screen.
    Anyway, curious as ever, I took ck2, patched, compiled and... voilà! Perfectly running for more than a day :D
    I think I can now safely stick with .39.1 ;)
    Again, thank you for your great work!

  5. Thanks for your feedback :)

    @Duy Truong: It's not unusual that race related bugs show up much faster and more profoundly with BFS. Of course that doesn't mean BFS is at fault, just that it exposes bugs in other code.

  6. My test of the Ubuntu 2.6.39-ck1-3 build showed poor results, sorry to say.
    1) I run a server hosting Counter-Strike Source (CSS) game servers.
    CSS is very sensitive to system lag.
    Here is a graph of the responsiveness of the game server on port 27015.
    Grid spacing is approximately 2 sec. The vertical axis is responsiveness.

    http://www.rasslabyxa.ru/_ph/9/2/405639033.jpg

    And this is how it looks after changing to the 2.6.38.7-ck1-1 build:

    http://www.rasslabyxa.ru/_ph/9/2/983669091.jpg

    When the load increases, the difference is much greater.
    CPU utilization is the same in both cases, about 30%.

    2) On the 2.6.39-ck1-3 build I somehow could not connect over ssh via putty (windows),
    but I was able to connect with ssh from another ubuntu-server running 2.6.38.4.

    After replacing the build on the game server with 2.6.38ck1, ssh worked fine.

  7. @VamPiro: Thanks for your report. Have you actually compared mainline 2.6.38.x versus mainline 2.6.39.x? The difference may have nothing to do with BFS, since the BFS in both of those kernels is virtually identical.

  8. I have 3 computers:
    1) P4-3000 with mainline 2.6.38.x, without the patch
    2) AMD64-3000, now downgraded to the Ubuntu package linux-image-2.6.38.7-ck1-1_2.6.38.7-ck1-a-10.00.Custom_amd64.deb
    3) AMD Phenom 1090 overclocked via the BIOS to 6 * 3800, also downgraded to linux-image-2.6.38.7-ck1-1_2.6.38.7-ck1-a-10.00.Custom_amd64.deb

    Number 2 and number 3 had the problem on linux-image-2.6.39-ck1-3_2.6.39-ck1-3-10.00.Custom_amd64.deb

    Number 1 allowed me to reach them via ssh and roll back to 2.6.38.7-ck1-1

    I cannot run many tests, because this is a real production server

  9. Most likely the problem is not in the patch but in putting the 2.6.39 kernel on my system, which is based on 2.6.38

  10. Still going strong with ck1-3. Thanks.

  11. Test ck2 = make progress

  12. Say, I'm just wondering:
    with this nice progress in BFS, is there any benefit to running a program with schedtool?

  13. There are still advantages to using schedtool. SCHED_ISO gives applications such as audio or video extra CPU to guard against extreme CPU loads, and SCHED_IDLEPRIO makes things such as folding@home use background CPU only.

  14. Ralph Ulrich, 27 June 2011 03:24

    I am sad to not be able to try the new beast. Trying to patch linux-3.0.0-rc4 has two rejects:

    # quilt push
    Applying patch patches/ck2/2.6.39-sched-bfs-406.patch
    patching file arch/powerpc/platforms/cell/spufs/sched.c
    patching file Documentation/scheduler/sched-BFS.txt
    patching file Documentation/sysctl/kernel.txt
    Hunk #3 succeeded at 258 (offset 1 line).
    Hunk #4 succeeded at 455 (offset 1 line).
    patching file fs/proc/base.c
    Hunk #1 succeeded at 411 (offset 3 lines).
    patching file include/linux/init_task.h
    Hunk #1 succeeded at 130 (offset 2 lines).
    Hunk #2 succeeded at 255 (offset 1 line).
    patching file include/linux/ioprio.h
    patching file include/linux/sched.h
    Hunk #3 FAILED at 1204.
    Hunk #4 succeeded at 1340 (offset 21 lines).
    Hunk #5 succeeded at 1571 (offset 21 lines).
    Hunk #6 succeeded at 1644 (offset 21 lines).
    Hunk #7 succeeded at 2005 (offset 38 lines).
    1 out of 7 hunks FAILED -- rejects in file include/linux/sched.h
    patching file init/Kconfig
    Hunk #1 succeeded at 29 (offset -1 lines).
    patching file init/main.c
    Hunk #1 succeeded at 748 (offset 4 lines).
    patching file kernel/delayacct.c
    patching file kernel/exit.c
    patching file kernel/kthread.c
    Hunk #1 FAILED at 203.
    1 out of 1 hunk FAILED -- rejects in file kernel/kthread.c
    patching file kernel/posix-cpu-timers.c
    patching file kernel/sched_bfs.c
    patching file kernel/sched.c
    Hunk #2 succeeded at 9183 (offset -302 lines).
    patching file kernel/sysctl.c
    .....

  15. Hi!

    I have been running BFS on machines with > 1 core very successfully, but this may be a rather strange question: will there be a benefit to running BFS on an Athlon 2500XP which is used for web browsing / small flash games for kids? Will there be a real visible benefit?

    regards
    Me

  16. Not a strange question at all. The answer is it should help and certainly won't harm, so give it a try! I use it on my kids' machine, which is very similar and used for the same purpose :)

  17. Would you recommend the kernel for server hosting, or is it only for the desktop?

  18. It would probably be best to compile a custom one at 100Hz without preempt for a server.

  19. OK, thanks for the response!
    But I have read and heard that your patch (bfs) is very good at e.g. keeping all the cores of a quad core working, much better than cfs. Isn't that better for server-side use too?

  20. Yes, I meant a bfs kernel with 100hz and preempt. Sorry if I wasn't clear :)

  21. Gah I mean bfs with 100hz and no preempt.

  22. Don't worry ^^ thanks!

    I have loved your work since patch 2.6.21

    OK, another question ^^ Do you mean only the bfs patch or the ck one? Or are the other files included in the ck patch more or less unimportant for everyone?

    Another logical question: your patch gives the kernel lower latency, but if I use 100Hz and no preempt, doesn't that go directly the other way? Normally I would think 1000Hz and preempt + bfs would improve low-latency performance. Where did I go wrong?

    -many thanks!

  23. Oh man, lots of questions...

    Is it recommended to use bfs / your patch with "chrt -f -p 99" for realtime scheduling?

  24. The whole -ck patch would benefit a server as well, provided you manually set the correct config options. While 1000Hz gives better latencies than 100Hz, most servers need improved macro-latencies, while desktops need improved micro-latencies. 100Hz gives lower overhead and higher throughput, but the micro-latencies are better with 1000Hz.

    Realtime scheduling works the same on BFS as it does on mainline (but has better latencies), so use it the same way you would on a mainline kernel, and only in scenarios where you would normally use realtime scheduling.
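
The Hz trade-off described above comes down to simple arithmetic, which the following minimal sketch spells out. It is not kernel code: the function names are hypothetical, and it assumes a purely periodic timer tick (it ignores tickless idle and anything else that changes how often the tick really fires).

```c
#include <stdio.h>

/* Length of one timer tick in microseconds for a given HZ setting.
 * Hypothetical helper for illustration only. */
unsigned int tick_period_us(unsigned int hz)
{
    return 1000000u / hz;
}

/* Number of periodic ticks one CPU would service over the given number
 * of seconds, assuming the tick never stops. */
unsigned long ticks_in(unsigned int hz, unsigned int seconds)
{
    return (unsigned long)hz * seconds;
}

/* Print tick length and ticks per minute for a few common HZ values. */
void print_hz_tradeoff(void)
{
    static const unsigned int hz_values[] = { 100, 250, 1000 };
    size_t i;

    for (i = 0; i < sizeof(hz_values) / sizeof(hz_values[0]); i++) {
        unsigned int hz = hz_values[i];

        printf("HZ=%4u  tick=%5u us  ticks/min=%lu\n",
               hz, tick_period_us(hz), ticks_in(hz, 60));
    }
}
```

Calling print_hz_tradeoff() shows the two sides of the choice: a shorter tick gives finer timing granularity, while a longer tick means fewer timer interrupts to service.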

  25. Many thanks for your wonderful patch!

    One question: do you know the impact on a laptop's power consumption?

  26. What do you think of "Multi-core scheduler support"? Is it important to compile it in?

  27. It's pretty much mandatory for any recent 2 core or more CPU.

  28. thank you.

    another one...

    I compiled a 2.6.39 kernel without the HT scheduler on an i7 2600 CPU. Now top shows 8 CPUs, but why? I thought that if I don't activate HT in the kernel, HT is deactivated on the system. Or do I need to deactivate it in the BIOS? When I set the max CPUs in the kernel, does the kernel use all the real CPUs, or only 2 with HT? I am very confused :-(

    thanks a lot for help!

  29. --- linux-2.6.38.orig/kernel/sched_bfs.c 2011-04-22 06:19:33.727967207 +0200
    +++ linux-2.6.38/kernel/sched_bfs.c 2011-04-22 06:27:11.628822678 +0200
    @@ -215,14 +215,14 @@ struct rq {

    struct root_domain *rd;
    struct sched_domain *sd;
    - unsigned long *cpu_locality; /* CPU relative cache distance */
    + unsigned int *cpu_locality; /* CPU relative cache distance */
    #ifdef CONFIG_SCHED_SMT
    - int (*siblings_idle)(unsigned long cpu);
    + int (*siblings_idle)(unsigned int cpu);
    /* See if all smt siblings are idle */
    cpumask_t smt_siblings;
    #endif
    #ifdef CONFIG_SCHED_MC
    - int (*cache_idle)(unsigned long cpu);
    + int (*cache_idle)(unsigned int cpu);
    /* See if all cache siblings are idle */
    cpumask_t cache_siblings;
    #endif
    @@ -717,7 +717,7 @@ static inline int queued_notrunning(void
    * It's cheaper to maintain a binary yes/no if there are any idle CPUs on the
    * idle_cpus variable than to do a full bitmask check when we are busy.
    */
    -static inline void set_cpuidle_map(unsigned long cpu)
    +static inline void set_cpuidle_map(unsigned int cpu)
    {
    if (likely(cpu_online(cpu))) {
    cpu_set(cpu, grq.cpu_idle_map);
    @@ -725,7 +725,7 @@ static inline void set_cpuidle_map(unsig
    }
    }

    -static inline void clear_cpuidle_map(unsigned long cpu)
    +static inline void clear_cpuidle_map(unsigned int cpu)
    {
    cpu_clear(cpu, grq.cpu_idle_map);
    if (cpus_empty(grq.cpu_idle_map))
    @@ -765,14 +765,14 @@ static void resched_task(struct task_str
    * Other node, other CPU, busy threads.
    */
    static void
    -resched_best_mask(unsigned long best_cpu, struct rq *rq, cpumask_t *tmpmask)
    +resched_best_mask(unsigned int best_cpu, struct rq *rq, cpumask_t *tmpmask)
    {
    - unsigned long cpu_tmp, best_ranking;
    + unsigned int cpu_tmp, best_ranking;

    - best_ranking = ~0UL;
    + best_ranking = ~0U;

    for_each_cpu_mask(cpu_tmp, *tmpmask) {
    - unsigned long ranking;
    + unsigned int ranking;
    struct rq *tmp_rq;

    ranking = 0;
    @@ -854,11 +854,11 @@ static inline int queued_notrunning(void
    return grq.nr_running;
    }

    -static inline void set_cpuidle_map(unsigned long cpu)
    +static inline void set_cpuidle_map(unsigned int cpu)
    {
    }

    -static inline void clear_cpuidle_map(unsigned long cpu)
    +static inline void clear_cpuidle_map(unsigned int cpu)
    {
    }

    @@ -996,7 +996,7 @@ static inline int task_sticky(struct tas

    /* Reschedule the best idle CPU that is not this one. */
    static void
    -resched_closest_idle(struct rq *rq, unsigned long cpu, struct task_struct *p)
    +resched_closest_idle(struct rq *rq, unsigned int cpu, struct task_struct *p)
    {
    cpumask_t tmpmask;

    @@ -1015,7 +1015,7 @@ resched_closest_idle(struct rq *rq, unsi
    * latency at all times.
    */
    static inline void
    -swap_sticky(struct rq *rq, unsigned long cpu, struct task_struct *p)
    +swap_sticky(struct rq *rq, unsigned int cpu, struct task_struct *p)
    {
    if (rq->sticky_task) {
    if (rq->sticky_task == p) {
    @@ -1052,7 +1052,7 @@ static inline int task_sticky(struct tas
    }

    static inline void
    -swap_sticky(struct rq *rq, unsigned long cpu, struct task_struct *p)
    +swap_sticky(struct rq *rq, unsigned int cpu, struct task_struct *p)
    {
    }

    @@ -1341,7 +1341,7 @@ static void try_preempt(struct task_stru
    {
    struct rq *highest_prio_rq = this_rq;
    u64 latest_deadline;
    - unsigned long cpu;
    + unsigned int cpu;
    int highest_prio;
    cpumask_t tmp;

    @@ -6786,14 +6786,14 @@ static int update_runtime(struct notifie
    * Cheaper version of the below functions in case support for SMT and MC is
    * compiled in but CPUs have no siblings.
    */
    -static int sole_cpu_idle(unsigned long cpu)
    +static int sole_cpu_idle(unsigned int cpu)
    {
    return rq_idle(cpu_rq(cpu));
    }
    #endif
    #ifdef CONFIG_SCHED_SMT
    /* All this CPU's SMT siblings are idle */
    -static int siblings_cpu_idle(unsigned long cpu)
    +static int siblings_cpu_idle(unsigned int cpu)
    {
    return cpumask_subset(&(cpu_rq(cpu)->smt_siblings),
    &grq.cpu_idle_map);

  30. @@ -6801,7 +6801,7 @@ static int siblings_cpu_idle(unsigned lo
    #endif
    #ifdef CONFIG_SCHED_MC
    /* All this CPU's shared cache siblings are idle */
    -static int cache_cpu_idle(unsigned long cpu)
    +static int cache_cpu_idle(unsigned int cpu)
    {
    return cpumask_subset(&(cpu_rq(cpu)->cache_siblings),
    &grq.cpu_idle_map);
    @@ -6856,7 +6856,7 @@ void __init sched_init_smp(void)
    for_each_online_cpu(cpu) {
    struct rq *rq = cpu_rq(cpu);
    for_each_domain(cpu, sd) {
    - unsigned long locality;
    + unsigned int locality;
    int other_cpu;

    #ifdef CONFIG_SCHED_SMT
    @@ -6974,7 +6974,7 @@ void __init sched_init(void)
    rq->cache_idle = sole_cpu_idle;
    cpumask_set_cpu(i, &rq->cache_siblings);
    #endif
    - rq->cpu_locality = kmalloc(nr_cpu_ids * sizeof(unsigned long),
    + rq->cpu_locality = kmalloc(nr_cpu_ids * sizeof(unsigned int),
    GFP_NOWAIT);
    for_each_possible_cpu(j) {
    if (i == j)

  31. Why the change from long to int?

  32. Good question. Do you wonder whether your 32768 CPU desktop will work? It won't.
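
One plausible reading of that answer is that CPU indices and rankings are small numbers, so an unsigned int holds them comfortably, and on typical 64-bit Linux (where unsigned long is 8 bytes and unsigned int is 4) the per-CPU locality table shrinks as well. The sketch below illustrates the idea in plain userspace C. The helper names are hypothetical, and it only loosely mirrors the `best_ranking = ~0U` pattern from the diff; it is not the scheduler's actual code.

```c
#include <limits.h>
#include <stddef.h>

/* Bytes needed for a per-CPU table with elements of the given size.
 * Hypothetical helper for illustration only. */
size_t locality_table_bytes(unsigned int nr_cpus, size_t elem_size)
{
    return (size_t)nr_cpus * elem_size;
}

/* Return the index of the CPU with the lowest ranking.
 * The search starts from ~0U, the largest unsigned int, so any real
 * ranking below that value beats it. Returns nr_cpus if nothing is found. */
unsigned int pick_best_cpu(const unsigned int *ranking, unsigned int nr_cpus)
{
    unsigned int best_cpu = nr_cpus;
    unsigned int best_ranking = ~0U;
    unsigned int cpu;

    for (cpu = 0; cpu < nr_cpus; cpu++) {
        if (ranking[cpu] < best_ranking) {
            best_ranking = ranking[cpu];
            best_cpu = cpu;
        }
    }
    return best_cpu;
}
```

For example, with rankings {8, 2, 5, 2} the first CPU holding the lowest value, index 1, is chosen, and a table of unsigned int is never larger than the same table of unsigned long.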

  33. Linux kernel 3.0 is out!!!! :)
    When can I play with it with a BFS super patch? :)
    Thank you C.K. , Happy Birthday Linux!!!!

    axlmas

  34. hey! I've been hacking the linux kernel with your patch since ... lemme see ... around when you started delivering it... around 2003... more or less... man! it has been almost 10 yrs! I think I speak for many when I say: "Don't ever quit!" We need your work, it's just as plain as that. Check out the uname -a output of my ubuntu-based VM:
    Linux vcevil 2.6.39-ck2-vcevil+ #1 SMP Mon Aug 8 17:30:53 VET 2011 x86_64 x86_64 x86_64 GNU/Linux

    neat... won't you agree?

    I was trying freebsd 9-current and expected rudo would port BFS from releng8.2 but then things started to get ugly with some dependencies when building gnome2... so I had to go ubuntu... using ZFS as root, and the wonderful patch... THANKS THANKS THANKS BIIIIIG THANKS!!!
