Thursday, 16 August 2012

3.5-ck1, BFS 424 for linux-3.5

Thanks to those who have been providing interim patches porting BFS to linux 3.5 while I've been busy! Finally I found some downtime from my current coding contract work to port BFS and -ck to linux 3.5, and here is the announcement:
 
 
These are patches designed to improve system responsiveness and
interactivity with specific emphasis on the desktop, but suitable to
any commodity hardware workload.

Apply to 3.5.x:
http://ck.kolivas.org/patches/3.0/3.5/3.5-ck1/patch-3.5-ck1.bz2
or
http://ck.kolivas.org/patches/3.0/3.5/3.5-ck1/patch-3.5-ck1.lrz

Broken out tarball:
http://ck.kolivas.org/patches/3.0/3.5/3.5-ck1/3.5-ck1-broken-out.tar.bz2
or
http://ck.kolivas.org/patches/3.0/3.5/3.5-ck1/3.5-ck1-broken-out.tar.lrz

Discrete patches:
http://ck.kolivas.org/patches/3.0/3.5/3.5-ck1/patches/

Latest BFS by itself:
http://ck.kolivas.org/patches/bfs/3.5.0/3.5-sched-bfs-424.patch

Web:
http://kernel.kolivas.org

Code blog when I feel like it:
http://ck-hack.blogspot.com/


This is a resync from 3.4-ck3. However, the broken out tarballs above also
include the upgradeable rwlocks patch, and a modification of the global
runqueue in BFS to use the urwlocks. These are NOT applied in the -ck1 patch,
but can be applied manually at the end of the series as indicated by the
series file. This code currently shows no demonstrable performance advantage OR
detriment, but is there for future development.
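
As a rough sketch of what applying patches "manually at the end of the series" could look like, the shell function below walks a quilt-style series file and prints one patch command per entry, skipping blank lines and # comments. The function name, the one-name-per-line layout, and the directory argument are assumptions for illustration and are not taken from the -ck tarball itself.

```shell
# Print a patch command for every entry in a quilt-style series file.
# Assumption: one patch file name per line; '#' starts a comment.
# Nothing is applied; review the output, then pipe it to sh from
# inside the kernel source tree.
series_commands() {
    series_file=$1
    patch_dir=$2
    while IFS= read -r line || [ -n "$line" ]; do
        # Strip trailing comments and surrounding whitespace.
        entry=$(printf '%s\n' "$line" |
            sed -e 's/#.*$//' -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//')
        [ -n "$entry" ] || continue
        printf 'patch -p1 < %s/%s\n' "$patch_dir" "$entry"
    done < "$series_file"
}
```

Called as series_commands patches/series patches, it lists the commands in series order, so the optional urwlocks entries at the end can be inspected before anything touches the tree.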

Enjoy!
Please enjoy

-- 
-ck

117 comments:

  1. Thanks for taking the time to release the new patches!

  2. Judging from this post http://ck-hack.blogspot.com/2012/06/upgradeable-rwlocks-and-bfs.html it looks like I can avoid urw-locks.patch since I only have a 4-core cpu, right?

  3. It's just a playground for the moment so I doubt anyone will want to enable it unless they wanted to hack on it or do some testing with it.

  4. Hi Con,
    many thanks for keeping up with the new Linux release!

    A quick question: diffing the old 3.4 bfs-424 against this new one, I saw that you have changed a lot regarding NUMA. Is this about managing multiple CPUs at a virtual level?
    I have a system: x86_64 Intel(R) Core(TM)2 Duo
    Is the following NUMA .config acceptable? Does it make sense?

    CONFIG_NUMA=y
    # CONFIG_AMD_NUMA is not set
    CONFIG_X86_64_ACPI_NUMA=y
    # CONFIG_NUMA_EMU is not set
    CONFIG_USE_PERCPU_NUMA_NODE_ID=y
    CONFIG_ACPI_NUMA=y

    Greetings from Germany,
    Ralph Ulrich

    Replies
    1. It's all to keep in sync with mainline, nothing more. I'd suggest you just turn NUMA off.
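
A minimal sketch of how one might check whether a given .config actually has NUMA (or any other option) switched on before rebuilding. The function name is hypothetical; it relies only on the standard .config conventions visible in the comment above, where an enabled option reads CONFIG_X=value and a disabled one reads "# CONFIG_X is not set".

```shell
# Print the value of a kernel config option ("y", "m", a number, ...)
# or "off" when the option is unset or marked "is not set".
config_state() {
    option=$1
    config_file=$2
    # Match the exact option name followed by '=', so CONFIG_NUMA does
    # not also match CONFIG_NUMA_EMU.
    value=$(sed -n "s/^${option}=\(.*\)\$/\1/p" "$config_file" | tail -n 1)
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    else
        printf 'off\n'
    fi
}
```

For example, config_state CONFIG_NUMA .config run in the kernel source directory should print off once NUMA has been disabled.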

    2. linux-3.5.2-bfs-with-NUMA just booted and works! Thank you for the suggestion to turn NUMA off. What motivated me to enable NUMA at first was this help text:
      ---
      CONFIG_NUMA: Enable NUMA (Non Uniform Memory Access) support.
      ...
      For 64-bit this is recommended if the system is Intel Core i7
      (or later), AMD Opteron, or EM64T NUMA
      ---
      I think my processor has EM64T. Now I will try a compile session without NUMA config!
      Ralph Ulrich

    3. Yeah, linux-3.5.2-bfs-without-NUMA feels like a little performance boost, although my extra virtualbox modules increased in size. I haven't tried virtualbox with the new BFS-enabled kernel yet.
      Ralph Ulrich

  5. --- linux-3.5.orig/kernel/sched/bfs.c
    +++ linux-3.5/kernel/sched/bfs.c
    @@ -5977,8 +5977,6 @@ static int __init isolated_cpu_setup(cha

    __setup("isolcpus=", isolated_cpu_setup);

    -#define SD_NODES_PER_DOMAIN 16
    -
    static const struct cpumask *cpu_cpu_mask(int cpu)
    {
    return cpumask_of_node(cpu_to_node(cpu));

  6. This comment has been removed by the author.

  7. This comment has been removed by the author.

  8. Thanks CK! As usual, here are the results of my make benchmark on my dual quad machine. CK1 clearly differentiates itself from mainline with n=27 runs doing a `make -j16 bzImage` on v3.5.2.

    http://s19.postimage.org/hxdw0papd/big_anova.jpg

    Here is a link to my benchmark script:
    https://github.com/graysky2/bin/blob/master/bench

    Details:
    1) It is a non-latency based measure.
    2) Compilation benchmark using gcc to “make bzImage” for a preconfigured linux 3.4.4 build.
    3) Runs the benchmark 28 times in total to get a decent number of observations for a statistical comparison. In all cases, the first run is omitted, leaving n=27.
    4) Results are how many seconds it took to compile on a dual Intel E5620 (2x hyperthreaded quad-core CPUs on a single board) @ 2.40 GHz.
    5) Make is run with 16 threads (8 physical cores and 8 HT cores).
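
The thread count in point 5 is simply the number of logical CPUs (physical cores plus hyperthreads). A small sketch of deriving it instead of hard-coding 16, assuming either nproc (GNU coreutils) or getconf is available; the function name is made up for this example.

```shell
# Print the number of logical CPUs available to this process, for use
# as the make -j value.
make_jobs() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    else
        getconf _NPROCESSORS_ONLN
    fi
}
```

On the machine described above, make -j"$(make_jobs)" bzImage would then be expected to run with 16 jobs.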

    Just to correct a mistake I made before at this url:
    http://ck-hack.blogspot.com/2012/07/bfs-424-test.html?showComment=1341259994610#c1478189585724857286

    I had benchmarked mainline against mainline because I forgot to uncomment the lines in my script that patch the kernel source with your patchset. So, BFS still reigns supreme. Rock on.

    Replies
    1. Sorry, I have a question. What is the meaning of 'n=27'? And will the option '-j 16' make any difference to the kernel? It seems that it will not. Thanks in advance.

    2. @kelvin - n=27 means that I repeated the make benchmark 27 times to get a good number of observations on which to base the statistics. Without a larger set, we cannot say that kernel A is faster than kernel B, for example.

      I use a -j16 switch because the machine has 8 physical cores and 8 hyperthreaded cores. It was selected to maximize throughput.

      Hope that makes sense.
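
To make the "n=27 out of 28 runs" idea concrete, here is a small sketch that takes one timing per line, discards the first (warm-up) run, and prints the median of the rest. It is not taken from the linked bench script, the function name is an assumption, and a median alone is only a summary: it does not replace the statistical comparison shown in the plots.

```shell
# Read one timing (in seconds) per line on stdin, discard the first
# run, and print the median of the remaining observations.
median_drop_first() {
    tail -n +2 | sort -n | awk '
        { v[NR] = $1 }
        END {
            if (NR == 0) exit 1
            if (NR % 2) print v[(NR + 1) / 2]
            else        print (v[NR / 2] + v[NR / 2 + 1]) / 2
        }'
}
```

Feeding it the 28 recorded times of one kernel, for example median_drop_first < times.txt (a hypothetical file name), gives a single number per kernel to compare.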

    3. Repeated same experiment on my new 3770K (hyperthreaded quad @ 4.5 GHz).

      http://s19.postimage.org/fjwet2d83/3770k.jpg

    4. @graysky

      Thanks for your reply. The resolution of the benchmark for your new Ivy Bridge CPU (wow, a great CPU which is quite expensive!) is too low.
      It seems that the faster the CPU, the greater the difference between the mainline kernel and ck-patched kernel.

  9. Dunno, seems like radeon driver is broken with 3.5.2 (3.5.1 from openSUSE +inc patch +BFS +BFQ +BFQ-Addon). radeon makes mess instead of working and fills the logs!

    So, I cannot talk with you about 3.5.x :-(

    But, Con, also from my side: A great "Thank you!" for the new patches!

    Manuel Krause

    Replies
    1. Hi Manuel,
      just today I saw two big radeon patches coming in:
      git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git

      All you have to do, after git clone:
      ln -s /usr/src/git/stable-queue/queue-3.5 /usr/src/linux/patches
      cd /usr/src/linux && quilt push -a

      Greetings from Hamburg, Ralph Ulrich

    2. Excuse me:
      git://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git
      Ralph Ulrich

    3. So, the above-mentioned setup, just with 3.4.9 instead of 3.5.2, is running fine. (But only 3h of uptime so far.)

      Thank you, Ralph! At least for the news that something hopefully better may be coming for radeon. I have no experience with git so far and don't know if I need to learn it. (Me = being very lazy atm) ;-) Don't mind.

      Manuel

    4. BTW. ... I kept/altered some of your recently proposed kernel config settings:
      CONFIG_RCU_BOOST_PRIO=1 <--- giving a "-2" in top
      CONFIG_RCU_BOOST_DELAY=440
      and left
      CONFIG_HZ_1000=y <---------- following Con's advice for interactivity
      (Some of your options are not relevant for UP single core systems so: not mentioned, just to be complete.)

      Too bad that these parameters, including the scheduler choice, still aren't runtime-configurable.

      Seems like with either 3.4.9 or the BFQ-addon patch
      https://groups.google.com/forum/?fromgroups#!topic/bfq-iosched/hPe2jFW55Is[1-25]
      I do not need to further fiddle with schedtool to fight video/audio playback hiccups on my weak old system.

      Manuel

  10. Just wondering: is schedtool still needed with the current BFS version?

    Replies
    1. Schedtool was never "needed". It is purely an optional feature and works the same on BFS as it always has.

    2. Of course, never "needed" by design, as Con wrote.
      But it's still a nice and handy tool to adjust priorities of certain processes within the BFS schema when the kernel / application & combination shows glitches.

      E.g., I recently used these two on kernel 3.4.7:
      schedtool -R -p 1 -n -10 `pidofproc pulseaudio`
      schedtool -I -n -5 `pidofproc Xorg`
      These haven't solved the issue of sound stalls with pulseaudio while playing video, but eased them at least (In my subjective scale counting -5 to +5: changed from -5 to -1.)
      (And, please, also read the schedtool manpage for detailed understanding.)

      Manuel Krause

  11. Hi people

    Two errors remain after fixing up the patch for 3.6.0-rc2.
    The patch for the 3.5 kernel needs its sched.h lines fixed.

    And the errors with kernel 3.6-rc2:

    First error, in sched.h:

    init/init_task.c:16:32: error: expected expression before 'do'
    make[4]: *** [init/init_task.o] Error 1
    make[4]: *** Waiting for unfinished jobs....
    CC kernel/printk.o
    init/main.c:704:2: warning: data definition has no type or storage class [enabled by default]
    init/main.c:704:2: warning: type defaults to 'int' in declaration of 'print_scheduler_version' [-Wimplicit-int]
    init/main.c:704:2: warning: function declaration isn't a prototype [-Wstrict-prototypes]
    init/main.c:704:2: error: conflicting types for 'print_scheduler_version'
    In file included from include/linux/cgroup.h:11:0,
    from include/linux/perf_event.h:582,
    from include/linux/ftrace_event.h:8,
    from include/trace/syscall.h:6,
    from include/linux/syscalls.h:78,
    from init/main.c:16:
    include/linux/sched.h:1638:6: note: previous declaration of 'print_scheduler_version' was here
    make[4]: *** [init/main.o] Error 1
    make[3]: *** [init] Error 2
    make[3]: *** Waiting for unfinished jobs....

    After commenting out the line: void print_scheduler_version(void);

    Next error:

    init/init_task.c:16:32: error: expected expression before 'do'
    VDSOSYM arch/x86/vdso/vdso-syms.lds
    make[4]: *** [init/init_task.o] Error 1
    AS arch/x86/realmode/rmpiggy.o
    CC mm/filemap.o
    LD arch/x86/vdso/built-in.o
    LD arch/x86/realmode/built-in.o
    make[3]: *** [init] Error 2
    make[3]: *** Waiting for unfinished jobs....
    CC mm/page_alloc.o
    CC mm/page-writeback.o
    CC mm/readahead.o
    CC kernel/sched/bfs.o
    LD arch/x86/built-in.o
    CC mm/swap.o
    kernel/sched/bfs.c: In function 'cpuset_cpu_active':
    kernel/sched/bfs.c:6956:3: error: too few arguments to function 'cpuset_update_active_cpus'
    In file included from kernel/sched/bfs.c:56:0:
    include/linux/cpuset.h:23:13: note: declared here
    kernel/sched/bfs.c: In function 'cpuset_cpu_inactive':
    kernel/sched/bfs.c:6968:3: error: too few arguments to function 'cpuset_update_active_cpus'
    In file included from kernel/sched/bfs.c:56:0:
    include/linux/cpuset.h:23:13: note: declared here
    make[5]: *** [kernel/sched/bfs.o] Error 1
    make[4]: *** [kernel/sched] Error 2
    make[3]: *** [kernel] Error 2
    CC mm/truncate.o


    best regards
    m.

  12. After a few days experience:

    linux-3.5.2-bfs is the fastest kernel I ever had!
    Without any errors nor problems!

    With previous releases I had to set
    CONFIG_RCU_BOOST_PRIO higher than the default
    as a workaround to avoid process time overflows in the top tool. This error no longer occurs!

    And thank you CK for the advice to disable NUMA!
    Ralph Ulrich

  13. ck.kolivas.org is down, Saturday, 8/18/12. I was just about to try the new patch. :( ;)

    Galen Seaman

  14. I can confirm too that disabling NUMA is beneficial for desktop systems. Just look at the results of the make benchmark that I wrote about earlier in this blog. Here you see the mainline "3.5.2-1-ARCH" vs. two different BFS patched kernels. One has NUMA enabled per the Arch Linux defaults and the other has it disabled:

    http://s19.postimage.org/a8mk5gxgz/3770k.jpg

    There is a clear and statistically significant difference in compile times (n=28) with the median gain through disabling NUMA being 344 ms. From my research, unless your hardware has >1 PHYSICAL CPUs -- not cores but physical processors -- it is advantageous to disable NUMA as measured by this non-latency endpoint.

    Thoughts?

    Replies
    1. Should have mentioned that the above results are on an Intel 3770K @4.5 GHz running with 8 threads.

    2. NUMA is about memory in relation to the CPU.
      - Does it only concern the CPU cache?
      - Do only systems with several physical CPUs profit from NUMA?
      - Why is NUMA enabled in the .config of nearly all big distros?

    3. Excellent question, Anon.

      If all major distros do something because their peer group did it, this is a shameful reason to perpetuate it -- particularly considering the data showing that it is a performance regression for those users with only one physical CPU. In other words, if 99.99999 % of distro users have only one physical CPU (home users/laptop users) and thus are impacted detrimentally by this option, why in the world would we enable the option in the kernel package to benefit the 0.00001 % of users who have more? That our peer groups do it does not make it a data-driven and sound decision.

      I made this very point to the Arch Linux kernel devs in a feature request. Let's see if they agree... https://bugs.archlinux.org/task/31187

    4. Or in other words, "turn this flag off for a kernel that's 0.3% faster on desktops, but massively slower on large machines!"

  15. The server hosting ck.kolivas.org was migrated. All the files were lost. I've had to drag out what I could from backups so a lot's missing...

    Replies
    1. github, gitorious, launchpad, googlecode
      all of them provide professional hosting backup included ...

    2. I prefer bitbucket.org. It offers unlimited space for open source projects, such as a clone of the kernel git tree.

  16. Hi, ck,

    Last weekend I applied BFS for 3.5 on a machine where I happened to have some kernel hacking options enabled in the kernel config, and I got a WARNING and a suspicious RCU usage report in dmesg. I then traced it back to kernel 3.3 and also tested on another machine; the issue is still there, so I believe it is a long-standing one. I am posting the dmesg here, with the kernel config of one of my machines attached.

    ***
    [ 0.039032] ------------[ cut here ]------------
    [ 0.040008] WARNING: at kernel/sched/bfs.c:1063 set_task_cpu+0x7a/0xe0()
    [ 0.041003] Modules linked in:
    [ 0.042005] Pid: 0, comm: BFS/0 Not tainted 3.5.2+ #44
    [ 0.043003] Call Trace:
    [ 0.044007] [] warn_slowpath_common+0x6b/0xa0
    [ 0.045004] [] warn_slowpath_null+0x15/0x20
    [ 0.045906] [] set_task_cpu+0x7a/0xe0
    [ 0.046005] [] ? debug_check_no_locks_freed+0x96/0x160
    [ 0.047004] [] ? trace_hardirqs_on+0xd/0x10
    [ 0.048004] [] ? lockdep_init_map+0x65/0x150
    [ 0.049005] [] ? ktime_get_ts+0xa8/0xe0
    [ 0.050004] [] sched_fork+0x2d/0x1c0
    [ 0.051004] [] copy_process+0x69e/0x12c0
    [ 0.052004] [] do_fork+0x5c/0x320
    [ 0.053003] [] ? sched_clock_local+0x25/0x90
    [ 0.054004] [] ? trace_hardirqs_off+0xd/0x10
    [ 0.055003] [] ? local_clock+0x4f/0x60
    [ 0.056004] [] ? _raw_spin_unlock_irqrestore+0x5d/0x70
    [ 0.057004] [] kernel_thread+0x6c/0x70
    [ 0.057908] [] ? repair_env_string+0x5b/0x5b
    [ 0.058003] [] ? gs_change+0xb/0xb
    [ 0.059004] [] ? rcu_scheduler_starting+0x20/0x60
    [ 0.060003] [] rest_init+0x21/0x160
    [ 0.061003] [] start_kernel+0x2ca/0x2d7
    [ 0.062002] [] ? kernel_init+0x198/0x198
    [ 0.063002] [] x86_64_start_reservations+0xff/0x104
    [ 0.064002] [] x86_64_start_kernel+0xed/0xf4
    [ 0.065045] ---[ end trace 778bdcd3ab492196 ]---

    Replies
    1. [ 0.066142] DMAR: Host address width 36
      [ 0.067009] DMAR: DRHD base: 0x000000feb03000 flags: 0x0
      [ 0.068016] IOMMU 0: reg_base_addr feb03000 ver 1:0 cap c9008020e30260 ecap 1000
      [ 0.069001] DMAR: DRHD base: 0x000000feb01000 flags: 0x0
      [ 0.070020] IOMMU 1: reg_base_addr feb01000 ver 1:0 cap c0000020630260 ecap 1000
      [ 0.071001] DMAR: DRHD base: 0x000000feb02000 flags: 0x1
      [ 0.072014] IOMMU 2: reg_base_addr feb02000 ver 1:0 cap c90080206f0460 ecap 1000
      [ 0.073001] DMAR: RMRR base: 0x000000000e0000 end: 0x000000000effff
      [ 0.074001] DMAR: RMRR base: 0x000000be000000 end: 0x000000bfffffff
      [ 0.076146] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
      [ 0.087064] CPU0: Intel(R) Core(TM)2 Duo CPU L9400 @ 1.86GHz stepping 06
      [ 0.088992] Performance Events: PEBS fmt0+, 4-deep LBR, Core2 events, Intel PMU driver.
      [ 0.089003] ... version: 2
      [ 0.089995] ... bit width: 40
      [ 0.090994] ... generic registers: 2
      [ 0.091995] ... value mask: 000000ffffffffff
      [ 0.092994] ... max period: 000000007fffffff
      [ 0.093994] ... fixed-purpose events: 3
      [ 0.094994] ... event mask: 0000000700000003
      [ 0.104000] lockdep: fixing up alternatives.
      [ 0.104907] Booting Node 0, Processors #1 Ok.
      [ 0.100081] CPU1: Thermal monitoring handled by SMI
      [ 0.118028] Brought up 2 CPUs
      [ 0.118996] Total of 2 processors activated (7447.89 BogoMIPS).
      [ 0.120146]
      [ 0.120988] ===============================
      [ 0.120988] [ INFO: suspicious RCU usage. ]
      [ 0.120988] 3.5.2+ #44 Tainted: G W
      [ 0.120988] -------------------------------
      [ 0.120988] kernel/sched/bfs.c:7054 suspicious rcu_dereference_check() usage!
      [ 0.120988]
      [ 0.120988] other info that might help us debug this:
      [ 0.120988]
      [ 0.120988]
      [ 0.120988] rcu_scheduler_active = 1, debug_locks = 1
      [ 0.120988] 1 lock held by BFS/0/1:
      [ 0.120988] #0: (&grq.lock){-.....}, at: [] sched_init_smp+0xf6/0x286
      [ 0.120988]
      [ 0.120988] stack backtrace:
      [ 0.120988] Pid: 1, comm: BFS/0 Tainted: G W 3.5.2+ #44
      [ 0.120988] Call Trace:
      [ 0.120988] [] lockdep_rcu_suspicious+0xe5/0x130
      [ 0.120988] [] sched_init_smp+0x180/0x286
      [ 0.120988] [] kernel_init+0x86/0x198
      [ 0.120988] [] ? schedule_tail+0x8c/0x110
      [ 0.120988] [] kernel_thread_helper+0x4/0x10
      [ 0.120988] [] ? retint_restore_args+0xe/0xe
      [ 0.120988] [] ? repair_env_string+0x5b/0x5b
      [ 0.120988] [] ? gs_change+0xb/0xb
      [ 0.147393] PM: Registering ACPI NVS region [mem 0xb94c0000-0xb94cffff] (65536 bytes)
      [ 0.148991] PM: Registering ACPI NVS region [mem 0xbd4df000-0xbd6defff] (2097152 bytes)
      [ 0.151014] PM: Registering ACPI NVS region [mem 0xbd9cf000-0xbdacefff] (1048576 bytes)
      [ 0.152991] NET: Registered protocol family 16


      kernel config @ https://github.com/cchalpha/kernelconfig/blob/master/x200/x200-3.5.2-gc.config

    2. As far as I can see in the code, the grq lock does not seem to be held in do_fork -> copy_process -> sched_fork. I am not familiar with this code, so please have a look, ck.

    3. Thanks. That does indeed look wrong, but the way the data is locked in bfs it won't lead to a problem. I may correct it next version just so the warning won't happen.

    4. Thanks ck for the quick reply. I forgot to mention that this causes no trouble, at least from 3.3 to 3.5 on all machines I used, but it would be good to fix the warning.

  17. Thanks for the update. I've been running BFS with 3.5.2 for about 5 days without any problems.

  18. scriptkernel= kernel linux + 3.5-ck1 BFS + 3.5.0 BFQ + -Ofast CFLAG

    Replies
    1. http://sourceforge.net/projects/scriptkernel

  19. Just a notice about systemd:
    Although it is said that BFS doesn't work with cgroups, this just means you are unable to manipulate the scheduler using cgroups. Systemd still plays well with a .config having:
    CONFIG_CGROUPS=y
    # CONFIG_CGROUP_DEBUG is not set
    CONFIG_CGROUP_FREEZER=y
    CONFIG_CGROUP_DEVICE=y
    CONFIG_CGROUP_MEM_RES_CTLR=y
    CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y
    CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED=y
    # CONFIG_CGROUP_MEM_RES_CTLR_KMEM is not set
    # CONFIG_CGROUP_PERF is not set
    CONFIG_BLK_CGROUP=y
    # CONFIG_DEBUG_BLK_CGROUP is not set
    # CONFIG_CFQ_GROUP_IOSCHED is not set
    CONFIG_NETFILTER_XT_MATCH_DEVGROUP=m
    # CONFIG_NETPRIO_CGROUP is not set

    I am just trying systemd-188
    https://forums.gentoo.org/viewtopic-p-7120752.html#7120752
    Ralph Ulrich

    Replies
    1. Using OpenSuse 12.1 with systemd and zen Kernel (with BFS and BFQ). Was much simpler than your way on gentoo ;)

      No problem here, my .config differs:
      # CONFIG_CGROUP_MEM_RES_CTLR is not set
      # CONFIG_CGROUP_PERF is not set
      CONFIG_CGROUP_BFQIO=y

      I wouldn't want to miss systemd; it is nice to see how simply you can view the boot time graph for the different daemons, and how quickly you can tweak the boot process. Maybe at some point BFS won't work together with systemd (as I read, systemd is trying to expand into a kind of super user-space daemon, a big brother for all tasks, with resource management), but at the moment there is no problem.

      Btw. On OpenSuse my shutdown time was dramatically reduced with systemd too.

      Cu
      Mike

    2. @Mike, I am pretty sure
      - you run systemd-44 with consolekit
      - BFS enabled linux kernel does not affect systemd
      My attempts are with the newest systemd-189 with advanced features. openSUSE-12.2 will not update to a newer systemd ...
      Ralph Ulrich

    3. @Ralph,

      no, still on 37 with OpenSuse 12.1 ;)
      Hope, that OpenSuse Tumbleweed will update to newer systemd, or I must wait another year for OpenSuse 12.3 :D

      PS: Nice to see that systemd evolves so quickly, but that is another story; I hope that BFS will support it.

  20. @Ralph Ulrich, et al.,
    some weeks ago you suggested settings for the RCU subsystem. Some of them for non-UP systems (so: not for my machine). Some of them as a workaround for you in those days.
    That was:
    CONFIG_RCU_BOOST_PRIO=14 (workaround)
    CONFIG_RCU_BOOST_DELAY=440

    Do you have insights or experience with the last one, to share with us?
    I've already googled the world but did not find a clue. This setting does have an important influence on I/O vs. interactivity/responsiveness.
    Currently I'm running 3.5.3 with BFS + BFQ + BFQ-Addon @ 1000 Hz & CONFIG_RCU_BOOST_DELAY=333. ATM, I can only say that this favours disk I/O a bit, with unchanged responsiveness compared to the default (500). Worst cases to test with: writing files to NTFS partitions via ntfs-3g while playing video in parallel, and watching out for stuttering audio/video.

    Manuel Krause

    Replies
    1. Doesn't ntfs-3g do its I/O via FUSE? I am not sure RCU is involved there.

      By the way, my RCU_BOOST settings were a workaround for the linux-3.4 kernel. My current linux-3.5.3-bfs works well and without errors using the default settings!
      Ralph Ulrich

    2. Yes, ntfs-3g works via FUSE. But I don't know at all how this interacts with RCU. Perhaps someone else can explain?

      Thanks for your immediate reply! :-))) After reading that, I should try the default settings with next kernel compile again.

      Manuel Krause

  21. Please, excuse me for this off-topic posting. But I expect Linux specialists to come back on here that possibly know, on how to help me.

    Issue: I upgraded from openSUSE 12.1 to 12.2 three days ago. With that change I went from gcc 4.6.2 to 4.7.1, and now the same kernel no longer compiles to the same result: it does not run the PM resume sequence any more after suspend to disk and a later power-on.

    If I use the old 3.5.3-BFS kernel compiled with 12.1/4.6.2 everything works fine.

    Does any of you have a clue?

    Manuel Krause

    Replies
    1. Also, the new (4.7.1) kernel cannot keep the timezone any more. The old one does.

      Manuel Krause

    2. Weird thing!!! [SOLVED]
      Wrong way: I even installed the old gcc 4.6.3 + required libs and recompiled the kernel(s).
      Right way: Reinstall the needed package (named "suspend" on openSUSE + packages it depends on)
      Possible reason: Uninstalling too much of the plymouth related fancy shiny stuff and systemd.

      Shame on me for bothering you all with that!

      But also really many thanks for not answering, so I finally managed to work it out on my own!

      Manuel Krause

  22. In case anyone wants to apply BFS 4.24 to 3.6.0-rc5: I made the following patch (without in-depth knowledge of the code, so it may not be perfect). It compiles, boots, and has been stable since then, but is otherwise untested. ;)

    http://www.filefactory.com/file/4uo5xkecax3x/n/3_6-sched-bfs-424_patch

  23. Have the semantics of cpuset_update_active_cpus been changed?

    ---- linux-3.5-ck1/kernel/sched/bfs.c
    ++++ linux-3.6-rc5/kernel/sched/bfs.c
    + switch (action & ~CPU_TASKS_FROZEN) {
    + case CPU_ONLINE:
    + case CPU_DOWN_FAILED:
    -+ cpuset_update_active_cpus();
    ++ cpuset_update_active_cpus(true);
    @@ -6952,7 +6952,7 @@
    +{
    + switch (action & ~CPU_TASKS_FROZEN) {
    + case CPU_DOWN_PREPARE:
    -+ cpuset_update_active_cpus();
    ++ cpuset_update_active_cpus(false);

    Ralph Ulrich

    Replies
    1. it appears so: a boolean parameter has been added to the function prototype, the semantics of which I have gathered from the comment in the code...

      just for info, I've had an uptime of almost 5 days with 3.6-rc5 and BFS 4.24. :)
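
For anyone unsure which form a given tree expects, a rough sketch of a check against the header is shown below. It assumes the prototype in include/linux/cpuset.h sits on one line and reads "(void)" before the change and "(bool ...)" after it; the function name and the printed labels are made up for this example.

```shell
# Guess which call form a kernel tree expects by inspecting the
# prototype of cpuset_update_active_cpus in include/linux/cpuset.h.
# Assumption: "(void)" means the old form, "(bool" the new one.
cpuset_update_style() {
    header=$1/include/linux/cpuset.h
    if grep -q 'cpuset_update_active_cpus(void)' "$header"; then
        echo no-argument
    elif grep -q 'cpuset_update_active_cpus(bool' "$header"; then
        echo bool-argument
    else
        echo unknown
    fi
}
```

Running cpuset_update_style /usr/src/linux before patching indicates whether the two calls in bfs.c need the true/false arguments from the diff above.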

  24. A queued fix for the coming linux-3.5.5 will break bfs-424 in include/linux/sched.h:
    queue-3.5/sched-fix-race-in-task-group.patch

    Because the new code concerned is never used with BFS, it will be an easy fix to do (just reorder the patch). As Martin pointed out above, besides the change to cpuset_update_active_cpus, this is also the only thing that needs to change for linux-3.6-rc. So we wait for a new common ground:
    BFS-425

    Ralph Ulrich

  25. I tried bfs on a shaved lowjitter 3.5.4 kernel. Chromium makes ubuntu 12.04 unstable with it. Read also http://paradoxuncreated.com/Blog/wordpress/?p=3226

    Peace Be With You.

    Replies
    1. What do you mean by "unstable"?

    2. No experience running on Ubuntu, but running 3.5.4 patched w/ ck1 and running chromium on several different boxes with no stability issues.

    3. This comment has been removed by the author.

  26. CK - seems as though upstream changed something in include/linux/sched.h that causes ck1 [bfs] patch to fail:

    patching file arch/powerpc/platforms/cell/spufs/sched.c
    patching file Documentation/scheduler/sched-BFS.txt
    patching file Documentation/sysctl/kernel.txt
    patching file fs/proc/base.c
    patching file include/linux/init_task.h
    Hunk #1 succeeded at 141 (offset 9 lines).
    Hunk #2 succeeded at 269 (offset 10 lines).
    patching file include/linux/ioprio.h
    patching file include/linux/sched.h
    Hunk #3 FAILED at 1240.
    Hunk #4 succeeded at 1360 (offset 3 lines).
    Hunk #5 succeeded at 1596 (offset 3 lines).
    Hunk #6 succeeded at 1671 (offset 3 lines).
    Hunk #7 succeeded at 2055 (offset 3 lines).
    Hunk #8 succeeded at 2771 (offset 3 lines).
    1 out of 8 hunks FAILED -- saving rejects to file include/linux/sched.h.rej

    Any quick fixes?
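
When a patch run produces a long log like the one above, a small filter can pull out just the failures. This is a sketch that relies only on the "patching file ..." and "Hunk #N FAILED" lines visible in that output; the function name is hypothetical and file names containing spaces are not handled.

```shell
# Read the output of patch on stdin and print one "file: hunk N" line
# for every hunk that failed to apply.
failed_hunks() {
    awk '
        /^[[:space:]]*patching file / { file = $3 }
        /^[[:space:]]*Hunk #[0-9]+ FAILED/ {
            hunk = $2
            sub(/^#/, "", hunk)
            print file ": hunk " hunk
        }'
}
```

For the log above this would report only include/linux/sched.h: hunk 3, which is the one place that needs manual attention.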

    Replies
    1. Ack... meant to include that this is in going from 3.5.4 --> 3.5.5!

    2. Oops...

      https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=commit;h=4f83989550ace0aa91464051cbaddc10e1b85778

    3. > Any quick fixes?

      depends on your workflow. I'd apply the vanilla kernel diff to 3.5.4-ck1. This will result in 1 failed hunk in sched.h. I would then manually apply the following (note: garbled whitespace):

      --- include/linux/sched.h.3.5.5.broken 2012-10-03 15:06:42.858278393 +0200
      +++ include/linux/sched.h 2012-10-03 15:14:50.841213062 +0200
      @@ -1265,6 +1265,9 @@
      struct sched_entity se;
      struct sched_rt_entity rt;
      #endif
      +#ifdef CONFIG_CGROUP_SCHED
      + struct task_group *sched_task_group;
      +#endif

      #ifdef CONFIG_PREEMPT_NOTIFIERS
      /* list of struct preempt_notifier: */

      I verified it compiles, but I haven't installed the kernel yet.

    4. update: the patched 3.5.5-ck1 kernel runs and appears stable.

  27. Resync of sched-bfs-424 with linux-3.5.5

    First apply 3.5-sched-bfs-424.patch on top of linux-3.5.5 although it doesn't apply cleanly. Then apply this patch:

    http://www.file-upload.net/download-6651389/resync-sched-bfs-424-with-linux-3.5.5-_sid_.patch.html

    Replies
    1. Seems as though the git patch I posted above is the cause of this... thank you for your work, but I don't think in this case, that it is needed.

      Link to patch:

      https://bugs.archlinux.org/task/31778?getfile=9421

    2. @ _sid_
      no! Steps are:

      1. patch-3.5.5 on top of linux-3.5.0 source
      2. reverse-patch - you can find here:
      https://bugs.gentoo.org/show_bug.cgi?id=437088#c1
      3. apply 3.5-sched-bfs-424.patch

      Ralph Ulrich

    3. Big discussion about
      https://bugs.gentoo.org/show_bug.cgi?id=437088#c3

      because Gentoo maintainers are especially bound to their upstream. And perhaps the feature to disable BFS as the scheduler is kind of a flaw?
      Ralph Ulrich

    4. @graysky, Ralph:

      My instructions to create a bfs patched kernel on top of linux-3.5.5 work perfectly here (uptime 1 day). BTW, the kernel created that way is identical to Martin's one.

  29. Unfortunately, BFSv424 ported to 3.6 doesn't compile:

    ===
    kernel/sched/bfs.c: In function ‘update_rq_clock_task’:
    kernel/sched/bfs.c:2290:2: error: implicit declaration of function ‘static_branch’ [-Werror=implicit-function-declaration]
    cc1: some warnings being treated as errors
    make[2]: *** [kernel/sched/bfs.o] Error 1
    make[1]: *** [kernel/sched] Error 2
    make: *** [kernel] Error 2
    make: *** Waiting for unfinished jobs....
    ===

    Replies
    1. It seems that replacing static_branch with static_key_false fixes the issue.
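
A minimal sketch of that rename as a stream filter, for anyone who wants to try it. Whether static_key_false is a fully correct replacement is only what this comment reports, not something confirmed here; the function name is made up, and the pattern is written so that longer identifiers ending in static_branch are left alone.

```shell
# Rewrite calls to static_branch() as static_key_false(), reading C
# source on stdin and writing the result to stdout.
# The leading-character match keeps longer identifiers such as
# arch_static_branch() untouched.
rename_static_branch() {
    sed -e 's/^static_branch(/static_key_false(/' \
        -e 's/\([^A-Za-z0-9_]\)static_branch(/\1static_key_false(/g'
}
```

One way to use it is rename_static_branch < kernel/sched/bfs.c > bfs.c.new, followed by a diff of the two files before replacing the original.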

    2. Just wait for CK to post the port :)

    3. No need to wait if I can fix it myself ;).

  30. Come on Ck, bring up a new patch for 3.6. Even BFQ is faster for this kernel release!

    Manuel

    Replies
    1. Sorry, a family tragedy has kept me mostly offline. There will still be some delay before I can sync up with 3.6.

    2. Don't be sorry ck ! We are !
      Take care of your family, everybody here can wait !
      Those who can't are not human => They can't wait but they will and nobody cares.

      Wishes for courage and strength !

      Eric

    3. Sorry for this difficult time your family is going through. And don't worry, computer software is not THAT important, we can wait.

    4. Yes, take care of your family and best wishes from Hamburg,
      Ralph ULrich

    5. Dear Con,
      I'm really sorry. If I had known that before, I wouldn't have written this initial posting.
      I do not depend on your patchset, but originally and generally, wanted to let you know that we appreciate your ongoing work.

      Take your time. My best wishes are with you!

      Manuel

    6. kisses and lovies.

  31. Your raise-max patch seems to be going in the wrong direction for desktop. I have found 90 Hz to give optimal jitter in OpenGL, which pretty much translates to a well-running system. You can try my low-jitter kernel here: http://paradoxuncreated.com/Blog/wordpress/?p=2268

    PS: This is not -ck. CFS gave less jitter in OpenGL. I also like CFS granularity, and have set it to a suitable value.

    Peace Be With You.

    Replies
    1. That is also why servers only use 100 Hz: longer periods of uninterrupted computation need fewer context switches. Better performance all round!

      If you run a less powerful computer or a highly demanding task (gaming), you will notice this!
      Ralph Ulrich
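
The arithmetic behind this argument is just the reciprocal of the timer frequency. A small sketch, using the frequencies mentioned in this thread; it only computes the tick length and says nothing about how a given scheduler uses it.

```python
def tick_period_ms(hz):
    """Length of one timer tick in milliseconds for a given timer frequency."""
    return 1000.0 / hz

# Frequencies mentioned in the discussion, plus 1000 Hz for comparison.
for hz in (90, 100, 250, 1000):
    print(f"{hz:>4} Hz -> {tick_period_ms(hz):.2f} ms per tick")
```

A lower frequency means a longer stretch between timer interrupts, which is the "longer periods of uninterrupted tasks" referred to above.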

    2. Excellent post, Ralph Ulrich. Actually you are the first poster in twenty who understands this. Of course, on servers many understand this. On the desktop there seem to be few who do, and higher Hz and suboptimal configs are much more common there (for instance 250 Hz with no preempt, like a standard Ubuntu kernel: mindless). It seems to be similar people who argue for many services/drivers on Windows, tube amps/vinyl in audio, or similar things.

      Peace Be With You.

  32. I played with kernel 3.6 today and I found that the BFS 424 patch for 3.6-rc5 I posted previously still applies to 3.6. I have also made a ck1 patch available here:

    http://www.filefactory.com/file/6wddd3pfr2mf/n/patch-3_6-ck1_bz2

    The usual big disclaimer: I have simply adapted the existing code so that it compiles and runs on 3.6 (posting this from a running 3.6-ck1 kernel). I am not in a position to maintain the actual semantics of the code. That is Con's prerogative.

    Further good news: the BFQ patch for 3.5 applies to 3.6 as well.

    Replies
    1. scratch the BFQ comment, they have a new patch out anyway... :p

    2. @Martin
      Thank you for providing interim patches for us in the meantime!
      I want to report a bug: (at least) the latest BFS-only patch you made is not uniprocessor friendly. It only builds for SMP.
      What I've found so far is: it's related to the changes you've made in the patch regarding include/linux/sched.h (HUNK 3 @@ -1239,17 +1244,36 @@)
      ATM I don't really know how to fix this. Any advice?

      Thank you in advance,
      Manuel Krause

      Appendix: Compiling error output:
      CC kernel/sched/bfs.o
      kernel/sched/bfs.c: In function ‘task_running’:
      kernel/sched/bfs.c:467:10: error: ‘struct task_struct’ has no member named ‘on_cpu’
      kernel/sched/bfs.c: In function ‘sched_fork’:
      kernel/sched/bfs.c:1742:3: error: ‘struct task_struct’ has no member named ‘on_cpu’
      kernel/sched/bfs.c: In function ‘schedule’:
      kernel/sched/bfs.c:3314:7: error: ‘struct task_struct’ has no member named ‘on_cpu’
      kernel/sched/bfs.c:3315:7: error: ‘struct task_struct’ has no member named ‘on_cpu’
      kernel/sched/bfs.c: In function ‘init_idle’:
      kernel/sched/bfs.c:5010:6: error: ‘struct task_struct’ has no member named ‘on_cpu’
      kernel/sched/bfs.c: In function ‘task_running’:
      kernel/sched/bfs.c:468:1: warning: control reaches end of non-void function [-Wreturn-type]
      make[2]: *** [kernel/sched/bfs.o] Error 1
      make[1]: *** [kernel/sched] Error 2
      make: *** [kernel] Error 2

    3. hmm. thanks for the feedback. The on_cpu error is easy enough to fix (and, i have to admit, was unnecessarily introduced by myself). However, when testing UP configurations I hit on another problem in the context of urwlocks which I can't get my head around so easily. I'm afraid that one is left to the man himself...
      In other words: for SMP configurations my previous patch works fine, but for UP I have no real solution yet.

  33. Just for the heck of it I have uploaded the corrected versions (with respect to on_cpu), in case someone wants to have a go at the locks for UP.

    http://www.filefactory.com/file/4tudaofjq88f/n/3.6-sched-bfs-424.patch

    http://www.filefactory.com/file/3epahpsasjwz/n/patch-3.6-ck1.bz2

    Replies
    1. The patch compiled, but it won't boot.
      I'm using SMP.

    2. @Martin:
      Thank you for the corrections. And you're right, _my_brain_ wasn't capable of fixing it myself yesterday, even though it was such an easy fix.

      3h uptime now with my usual workloads on my UP-only machine
      @ kernel 3.6.0 + fixed 424 BFS + mm-drop_swap_cache_aggressively.patch + most recent BFQ-v5

      I don't want to add confusion, but for correctness: I only replaced the failing patch hunk (as reported) in Martin's patch from weeks ago with the corrected sequence from today's patch.

      Thank you very much!

      Manuel Krause

    3. Thanks for porting the ck patch to 3.6. I'm running just fine with it on my quad core machine. Arifhn, I compiled 3.6.1 with patch-3.6-ck1.bz2 using gcc-4.7.2. What does your kernel config look like?

    4. @Martin - Thanks for working on the unofficial ck1 and bfs patchsets. Filefactory and captcha are lame. I would be glad to mirror your patches on http://repo-ck.com which houses the unofficial Arch Linux-CK packages.

      http://repo-ck.com/PKG_source/testing/unofficial_patchset_from_martin/patch-3.6-ck1.bz2

      http://repo-ck.com/PKG_source/testing/unofficial_patchset_from_martin/3.6-sched-bfs-424.patch

    5. ...did you know that Oleksandr Natalenko merged your unofficial bfs into his linux-pf patchset? Quite an honor :)

      http://pf.natalenko.name/

    6. @Martin - Good job with these patches as measured by my standard make benchmark comparing identically configured kernels with and without the patchset.

      http://s19.postimage.org/u12xbrto3/unofficial_patch_comparison.jpg

      Benchmark details:
      As you see, the -ck patched kernel clearly differentiates itself from mainline with n=28 runs doing a `make -j8 bzImage` on v3.6.1.

      Here is a link to my benchmark script:
      https://github.com/graysky2/bin/blob/master/bench

      Details:
      1) It is a non-latency based measure.
      2) Compilation benchmark using gcc to “make bzImage” for a preconfigured linux 3.6.1 build.
      3) Runs the benchmark 28 times in total to get a decent number of observations for a statistical comparison.
      4) Results are how many seconds it took to compile on a 3370K @ 4.5 GHz.
      5) Make is run with 8 threads (4 physical cores and 4 HT cores).
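
For readers who want to run the same kind of statistical comparison on their own timings, here is a minimal sketch using Welch's t statistic. It is not taken from the benchmark script linked above, and the numbers below are placeholders for illustration, not measurements from this thread.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples of build times (seconds)."""
    va, vb = variance(a), variance(b)  # sample variances (n - 1 denominator)
    return (mean(a) - mean(b)) / sqrt(va / len(a) + vb / len(b))

# Placeholder timings in seconds, NOT measured data.
mainline = [70.2, 70.5, 70.1, 70.4]
ck_patched = [69.1, 69.3, 69.0, 69.2]

# A positive value means the first sample took longer on average.
print(f"mean mainline   : {mean(mainline):.2f} s")
print(f"mean ck patched : {mean(ck_patched):.2f} s")
print(f"Welch t         : {welch_t(mainline, ck_patched):.2f}")
```

With 28 runs per kernel, as in the benchmark described above, each list would simply hold 28 values.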

    7. hey graysky,

      thx for benchmarking the patch. That's a good indication that nothing got horribly broken. Btw, I am still looking for a benchmark measuring the "desktop fluidity" or -- to translate a word invented by German c't magazine -- the "swoopdicity" of a system. At least that's where I feel the benefits of BFS.

      Thanks also for pointing out O. Natalenko's site. cool stuff.

      And yes, from my side there is no problem hosting the patches elsewhere. I just used the first file hoster Google would find. It's all in the cloud anyway. ;)

    8. > ...did you know that Oleksandr Natalenko merged your unofficial bfs into his linux-pf patchset? Quite an honor :)

      Guys, stop that. I just merge patches made by other people and *I must* say "thanks" to all of them. Not they to me.

  34. there is a problem

    init/init_task.c:16:8: error: unknown field ‘deadline’ specified in initializer
    init/init_task.c:16:8: error: unknown field ‘run_list’ specified in initializer
    init/init_task.c:16:32: error: ‘struct task_struct’ has no member named ‘run_list’
    init/init_task.c:16:32: error: ‘struct task_struct’ has no member named ‘run_list’
    init/init_task.c:16:8: error: unknown field ‘time_slice’ specified in initializer
    make[2]: *** [init/init_task.o] Error 1
    make[1]: *** [init] Error 2

  35. This comment has been removed by the author.

  36. @Martin - Both of your patches compile just fine when I build using a subset of modules; if I build trying to use the official ARCH config files (linked below), my build errors out:

    http://pastebin.com/xkmcKysu

    Any advice?

    https://projects.archlinux.org/svntogit/packages.git/tree/trunk/config.x86_64?h=packages/linux

    Replies
    1. hmmm. unfortunately irq time accounting is a bit out of my league. I think we need Con's expertise here.

    2. ===
      kernel/sched/bfs.c: In function ‘update_rq_clock_task’:
      kernel/sched/bfs.c:2394:2: error: implicit declaration of function ‘static_branch’ [-Werror=implicit-function-declaration]
      cc1: some warnings being treated as errors
      CC [M] crypto/pcbc.o
      make[2]: *** [kernel/sched/bfs.o] Error 1
      make[1]: *** [kernel/sched] Error 2
      make: *** [kernel] Error 2
      make: *** Waiting for unfinished jobs....
      ===

      Please replace the static_branch function call with a static_key_false function call and have fun.

    3. @post-factum - Can you please upload the BFS in your patchset broken-out? Did you do this replacement that you suggested?

    4. This comment has been removed by the author.

  37. I independently ported 3.5-sched-bfs-424.patch to linux-3.6.1 a few days ago:

    http://www.file-upload.net/download-6680793/3.6.1-sched-bfs-424-_sid_.patch.bz2.html

    As always, no guarantees.

    Replies
    1. Your download is missing from the link provided. "File does not exist!

      This file was deleted by the user or following an abuse report.

      Tip: Loans and more!"

  38. > @Martin - Both of your patches compiles up just fine when I build
    > using a subset of modules; if I build trying to use the official ARCH
    > config files (linked below), my build errors out:
    > ...

    An update. It seems the build errors ONLY occur when I build with guest virtualization enabled. If I disable it, I can build the _full set_ of modules just fine using the ck1 patch Martin provided.

    [ ] Processor type and features --->Paravirtualized guest support --->

    wtf?

  39. One problem: after building kernel 3.6.1 with Martin's patch, the system does not boot. After removing the patch, the system boots normally.

    m.

    Replies
    1. Please supply more information!
      Do you receive errors during patch application and/or kernel compilation?

      And, when booting, do you receive messages you can provide to us?

      With your posting you're leaving us digging in the dust of Mars.

      Manuel

    2. Hi
      No error, only a black screen after the boot loader.

      m.

    3. @Micron:
      I use the opensource radeon for my graphics. When going to a new kernel I often have to reboot at least twice.
      First reboot may lead to a blank/black/striped screen. But keyboard control is active after 40s. So I usually can login blindly and do the reboot (avoiding disk loss compared to RESET-button).

      Please also try the new patches from CK,

      Manuel Krause

  40. So, I had been running openSUSE kernel 3.6.1 with my old setup (most recent BFS-only from Martin + mm-drop_swap_cache_aggressively.patch + most recent BFQ, UP-system) for over 2 days of uptime without any problems. {Please keep in mind, that the BFS-patch always needs minor adjustments for openSUSE kernel-sources.}

    Maybe something like that has hit Micron? (I always watch the patching output before make and compilation output before rebooting^^ !)

    Then, I tested the ck1 provided by Martin and it failed compiling after a few minutes. Error output at the end. I then reverted bfs424-grq_urwlocks.patch from Con's broken-out (http://ck.kolivas.org/patches/3.0/3.5/3.5-ck1/patches/) and it compiled fine and is up and running since this afternoon without issues.


    Best regards,
    Manuel Krause


    Error Output:
    CC kernel/sched/bfs.o
    In file included from kernel/sched/bfs.c:72:0:
    include/linux/urwlock.h: In function ‘__urw_write_lock’:
    include/linux/urwlock.h:42:2: error: implicit declaration of function ‘arch_write_lock’ [-Werror=implicit-function-declaration]
    include/linux/urwlock.h: In function ‘__urw_write_unlock’:
    include/linux/urwlock.h:48:2: error: implicit declaration of function ‘arch_write_unlock’ [-Werror=implicit-function-declaration]
    include/linux/urwlock.h: In function ‘__urw_read_lock’:
    include/linux/urwlock.h:54:2: error: implicit declaration of function ‘arch_read_lock’ [-Werror=implicit-function-declaration]
    include/linux/urwlock.h: In function ‘__urw_read_unlock’:
    include/linux/urwlock.h:60:2: error: implicit declaration of function ‘arch_read_unlock’ [-Werror=implicit-function-declaration]
    cc1: some warnings being treated as errors
    make[2]: *** [kernel/sched/bfs.o] Error 1
    make[1]: *** [kernel/sched] Error 2
    make: *** [kernel] Error 2

    Replies
    1. Thx Manuel, this is the locks problem I mentioned above. I wasn't aware you could simply revert (or not apply in the first place) bfs424-grq_urwlocks.patch. Thanks for confirming that.
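
Not applying the patch in the first place amounts to leaving it out of the series when applying the broken-out patches. A rough sketch of that filtering, assuming a quilt-style series file with one patch name per line and `#` for comments; the series contents in the usage example are hypothetical apart from the two urwlocks patch names mentioned in this post and thread.

```python
# Patch names taken from the post and comments; everything else is assumed.
URWLOCK_PATCHES = ("urw-locks.patch", "bfs424-grq_urwlocks.patch")

def series_without(series_text, skip=URWLOCK_PATCHES):
    """Return patch names from a series file, in order, minus the skipped ones."""
    keep = []
    for line in series_text.splitlines():
        name = line.strip()
        if not name or name.startswith("#"):
            continue  # ignore blank lines and comments
        if name in skip:
            continue  # leave the urwlocks patches out
        keep.append(name)
    return keep

# Hypothetical series contents for illustration only.
example = """# example series
first.patch
urw-locks.patch
bfs424-grq_urwlocks.patch
last.patch
"""
print(series_without(example))
```

The resulting list can then be applied in order with `patch -p1`, which should give the same tree as applying everything and reverting the urwlocks patches afterwards.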

  41. I have now tried both BFS and CFS, with as many low-jitter tweaks as possible. My impression is that on jitter-sensitive applications like Doom 3, they can perform very similarly. On additional compatibility layers like Wine, which is even more jitter sensitive, BFS jitter extremes seem higher. Meaning average jitter is lower, but Wine has some 1 second jitters with BFS. CFS has higher average jitter, but no 1 second jitters.

    Both tested with high_res_timers off, 90hz timer, and a fast granularity setting for a psychovisual jitter-profile of natural.

    Peace Be With You.

  42. After a few days experience:

    linux-3.5.2-bfs is the fastest kernel I ever had!
    Without any errors or problems!

    With previous releases I had to set
    CONFIG_RCU_BOOST_PRIO higher than the default
    as a workaround to avoid process time overflows in the top tool. That error no longer occurs!
