Monday 4 March 2013

BFS 0.428 for linux-3.8.x

Announcing a resync of the BFS and -ck patchsets for linux-3.8

Full ck patch:

BFS only patch:

The full set of incremental patches is here:

The only changes to BFS are a resync from BFS 0.427 and a micro-optimisation to the CPU accounting, courtesy of Olivier Langlois (thanks!). See the incremental patch for details.

As for the -ck patchset, I am dropping the patches that set sysctl values, since they no longer seem to work reliably now that distributions seem to change those values themselves. I am also removing patches of dubious utility.



  1. I am very pleased to be using your BrainFuckScheduler with Linux-3.8.2 now.
    I told the Gentoo maintainer about this new release at:

    Ciao from Hamburg in Germany, which is just leaving the icy winter cold, powered by the first warming sun of this year,
    Ralph Ulrich

  2. Thanks! It works well on Gentoo 3.8.2 :>

  3. Thx a lot for your patch.

    "dubious utility". I guess you are talking about Kconfig.hz. I used it for testing purposes. A pity you took it out. It wouldn't do any harm if you left it in.

    Obviously it's causing extra work.

    Thx for your efforts.

  4. I want to thank you, too, for the new patches @ck ! Maybe I'll bother with some questions about the left out patches later. ;-)

    For now I want to crosspost, for everyone also using the BFQ I/O scheduler patches, that there's a new fix out as of today. Arianna doesn't use the word "critical" or something scary like that, but it may be a good idea to have it. !topic/bfq-iosched/f6SGEUM38ns

    Best regards, Manuel Krause

  5. @ Ralph Ulrich

    Just curious about the RCU tests you posted on some other page of this blog.
    The one with :

    What value did you assign to CONFIG_NR_CPUS ?

  6. Thank you for the update. I've been running it for a while now without a hitch, playing Counter-Strike on Steam or watching movies while doing Gentoo heavy-duty compiling. No problems at all.

  7. Hi Con.

    I stumbled upon a very interesting patch, which introduces architecture-specific gcc parameters.

    It seems these have quite an impact on kernel performance.

    I'm wondering if you'd be interested in looking into the gcc optimization subject. It might be an interesting patch for the ck patch set.

    Below the reference:


  8. @ Anon: nice find! thanks for posting a link to the patch

  9. @Anon - The patch has been part of the Arch Linux linux-ck package since version 3.6.9-3 (08-Dec-2012) :p

    For those interested, scroll down in the readme on my github to the benchmarks that establish statistically significant increases on several different test machines.

    For precompiled, CPU-specific & optimized linux-ck packages for Arch Linux, have a look at my unofficial repo which hosts 14 different package sets for i686 and x86_64.

  10. Hi there (graysky in particular - thx for your work btw).

    Phoronix did some very interesting benchmarks on the -march parameters.

    What's IMO missing in your (graysky) patch is -march=native; that option is supposed to select the matching settings for the particular processor.

    Graysky: I also think that the BCK IO scheduler patch is a great addition to your patch set.

    BFS obviously covers the CPU only.

    What I use on top of that are different gcc options in the kernel Makefile:

    Kernel offers

    -O2 or -Os

    maybe you guys try

    -O3 or -Ofast

    Obviously newest gcc >= 4.7 should be used.

    1. ^ Anon, there was a suggestion on the LKML about using -Os, though from the replies it's not clear to me whether it's better than -O2 or not.

      Do you know if there is a benchmark somewhere about -O3/-Ofast, maybe even including those other options?

  11. @Anon - Updated with the 'native' option. Also verified to work with gcc 4.8.0.

    1. Thanks for your work!

      What should I add for my old-fashioned machine, do you know? -march=pentium3 -mtune=pentium3?
      Should I use that 'native' version? I want to have the -Ofast :-)

      Best regards,
      Manuel Krause

    2. BTW., "native" only compiles for the currently running machine!

    3. Phoronix just posted some Haswell benchmarks for AVX2 vs other options:

  12. (sorry Con for kind of hijacking your blog ;) )


    Applied your updated patch on 3.8.4-ck1 .
    Works also quite nice on 3.8.4-rt2 btw.

    I compiled with gcc 4.8 and -Os

    I was surprised to read the benchmarking results of gcc 4.8 vs. 4.7.2:

    When I read it I immediately upgraded from 4.7 to 4.8.


    I'd really like to see those extended -march and gcc -O options (2/3/s) become a selectable part of the standard kernel config process. (Perhaps you'll also write a little patch for those -O options one day ;) )


  13. @graysky:
    The 'native' selection compiled and behaves fine! :-)))
    2 questions:
    1.) The vermagic in 'modinfo _modulename_' tells me "CORE2" on a PIII Tualatin
    --- Why?
    2.) How and where can we add -mtune=native (or any other cpu-type) to your patch? According to 'gcc -march=native -E -v - </dev/null 2>&1 | grep cc1' I'd only get a "-mtune=generic"
    --- Or doesn't my question make any sense?

    I'm referring to the previously posted

    Thank you and best regards,

    Manuel Krause

  14. @Manuel -

    1) Not sure. On an older i686 box I have that requires the nv module:

    % modinfo -F vermagic nvidia
    3.8.5-1-ck SMP preempt mod_unload modversions 686

    % modinfo nvidia | grep vermagic
    vermagic: 3.8.5-1-ck SMP preempt mod_unload modversions 686

    It is running the linux-ck (generic) packages from

    2) My understanding of -march=native is that users may omit the -mtune parameter altogether since it is handled therein. Look at the URL you posted: "Specifying -march=cpu-type implies -mtune=cpu-type."

    Perhaps I am misinterpreting it?

    1. @graysky & all interested people:

      ad 1.) At least this doesn't seem to be harmful at runtime. I've found the culprit: in your provided kernel-38-gcc48-1.patch there's a copy&paste line mistake in the first hunk, where you inserted the MNATIVE piece into linux/arch/x86/include/asm/module.h (wrongly, in the middle of the MCORE2 definition).

      ad 2.) Some more (re)search leads me to the same understanding as yours: the -mtune=cpu-type may bring further optimizations "under the constraints of the selected -march's instruction set", as I read them in your patch for CORE2 and following. For my "pentium3" it automagically selects -mtune=generic -- perhaps because there is no further optimization possible, I assume.

      Anyone who wants to check what -march=native (or whatever) really enables/disables should have a look at linux/arch/x86/kernel/asm-offsets.s -- in the header you'll find useful information.

      But that leads me to another question:
      Why does my linux/arch/x86/Makefile contain this section:
      # prevent gcc from generating any FP code by mistake
      KBUILD_CFLAGS += $(call cc-option,-mno-sse -mno-mmx -mno-sse2 -mno-3dnow,)
      KBUILD_CFLAGS += $(call cc-option,-mno-avx,)
      That effectively disables the use of -msse & -mmmx on my machine as I see in the previously mentioned asm-offsets.s header. Is it openSUSE kernel-source specific or do you have it, too?

      Best regards, Manuel Krause
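
The -march/-mtune question in this thread can be checked mechanically once the cc1 line printed by 'gcc -march=native -E -v' has been captured. What follows is only a minimal Python sketch: the `arch_flags` helper and the sample cc1 line are hypothetical illustrations, not output from any machine mentioned here.

```python
import shlex


def arch_flags(cc1_line):
    """Return the -march/-mtune values found in a cc1 command line."""
    found = {}
    for token in shlex.split(cc1_line):
        for prefix in ("-march=", "-mtune="):
            if token.startswith(prefix):
                # "-march=" becomes the key "march", and so on
                found[prefix.strip("-=")] = token[len(prefix):]
    return found


# Hypothetical sample, shaped like the cc1 line that gcc -v prints.
sample = ("/usr/lib/gcc/i686-linux-gnu/4.7/cc1 -E -quiet -v - "
          "-march=pentium3 -mtune=generic")

if __name__ == "__main__":
    print(arch_flags(sample))
```

If the returned dictionary has no "mtune" key, gcc did not pass an explicit -mtune to cc1, and per the gcc documentation quoted above the -march value then implies the tuning.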

  15. 1.) Good catch, Manuel. Please review

    2.) No idea. My patch is based off jeroen's patch and just adds a few more CPUs to the mix:

    1. @graysky: The patch now looks like my adjusted one. Should be usable. (review ack: o.k.)

      BTW: forcing -mmmx -msse results in a non-bootable kernel 3.8.6 with openSUSE kernel-sources on here. So, that previously mentioned exclusion makes sense -- at least for me. (I am still using gcc 4.7.2.)


  16. @graysky & co,
    some more weeks of testing have gone by. I need to correct my previously posted findings:

    The settings leading to a non-bootable kernel on here were -Ofast and -O3 in /Makefile in connection with my specific optimizations (-mmmx -msse).

    With -O2 (standard) I can use the compiled kernel with -mmmx -msse. I can't show you that it's getting faster now, as it's only a subjective feeling so far.

    --- a/arch/x86/Makefile.orig 2013-03-15 09:16:33.000000000 +0100
    +++ b/arch/x86/Makefile 2013-03-18 18:58:08.000000000 +0100
    @@ -139,7 +139,7 @@
    KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
    # prevent gcc from generating any FP code by mistake
    -KBUILD_CFLAGS += $(call cc-option,-mno-sse -mno-mmx -mno-sse2 -mno-3dnow,)
    +KBUILD_CFLAGS += $(call cc-option,-mno-sse2 -mno-3dnow,)
    KBUILD_CFLAGS += $(call cc-option,-mno-avx,)

    KBUILD_CFLAGS += $(mflags-y)

    Best regards,
    kernel 3.10 is supposed to ship UVD for ATI radeon open source driver users,

    Manuel Krause

    1. I just fail to understand which sections of the kernel code you expect to optimize with the SSE and MMX instruction sets.

      BTW, preventing the compiler from generating FP code is definitely *not* done with the intention of ruining the performance of your kernel.

    2. @aCOSwt:

      Do you have more insight/ knowledge? Please, then, go ahead and explain.

      I don't know what's exactly benefiting from the added instruction sets. I just wanted to unlock them to see what happens. So far it's less dangerous than adding -O3 or -Ofast to the main Makefile.


    3. @aCOSwt:
      And, what I've forgotten to add: The kernel size is getting smaller: 1,9 MiB -> 1,7 MiB. Without any disadvantages.


    4. In short: (=> There are a couple of very rare exceptions and my wording might not be perfectly accurate)
      1/ Most of the SSE instructions work on FP data
      => they will use FP registers.
      2/ These registers are large and it takes time to save/restore them to/from the stack when calling a procedure.
      3/ The Linux kernel does not need FP numbers => its procedures make no provision for saving/restoring the FP registers.
      This is indeed the most efficient approach. Can you imagine what your performance would be if the FP registers were saved/restored on every system call?
      4/ Normally, the kernel code is written so that gcc should not consider it profitable to generate FP instructions. (Hence my "I fail to understand which sections of the kernel code you expect to optimize".)
      But if, by mistake (from the kernel programmer or because of a fancy gcc heuristic), gcc considers it profitable to generate FP instructions and is allowed to, then the result of the operations at run time will necessarily be, at best,... random!
      => -mno-sse -mno-mmx...

    5. Thank you very much for your 'short' ;-) explanation. This helps me to understand the too short description found in the linux/arch/x86/Makefile a bit better.

      So -- if I understand your words -- having -mmmx -msse can lead to slower operations or even wrong results? Did I understand you correctly? And, maybe: do you have a guess what things would/could get slower with these instructions?


    6. Yes! This is a "short" explanation. The exact reality would need twice as many words...

      No! Having -mmmx -msse will *not* "lead to slower operations or even wrong results"!

      Having -mmmx -msse will at best do nothing, because the Linux kernel is perfect everywhere and gcc will not use these instruction sets at all,

      or will *necessarily* lead to wrong results if some gcc heuristic is fancy enough to generate FP code for some fancy section of code.

      The probability of the latter is greater than the probability of the former.

      If you want to safely enable gcc to use these instruction sets, then you *must* rewrite the kernel procedures that could be affected by such an "optimization", making provision for saving/restoring the FP registers.
      But then you'll realize at run time that the time needed to save/restore the FP registers is *much* greater than what you gain with your SSE/MMX instructions.

      On a side note about instruction sets / gcc optimizations, the patch graysky mentions can lead to surprising results.
      As an example of a surprise, you can see that the kernel Makefile selects -march=core2 and -mtune=generic for an Intel Core 2... and the patch, probably believing that the kernel devs want to ruin the performance of the kernel, corrects that and selects -mtune=core2.
      I cannot speak for newer gcc versions, but it has been noticed that gcc-4.5 generated better (more efficient) code with -mtune=generic...

    7. @ aCOSwt:
      I didn't want you to exaggerate. :-(


    8. It's true that the kernel doesn't concern itself with floating point, and it's wrong to force the issue with compiler parameters, but keep in mind that Linux is pretty much the only kernel that bans FP operations. Other kernels (or operating systems) don't do that, and performance is just fine. Restoring the registers doesn't take much time.

    9. - AFAIK, and at least up to V9, FreeBSD is in the exact same situation as Linux.
      - As for OS-X, the "Kernel Programming Guide" writes : "you should avoid doing using floating-point math or AltiVec instructions in the kernel... It is not forbidden, but is strongly discouraged."

      So... what's left ?

    10. I haven't found any remarkable performance issues with -msse -mmmx nor advantages with kernel default. Maybe it's better to keep kernel default as some people may already have made their heads burn exhaustively about this topic. ^^

      Thank you for this discussion,
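
For anyone wanting to answer the "do you have it, too?" question for their own source tree, the guard flags can be looked up in arch/x86/Makefile. This is a minimal Python sketch only: `disabled_isa_flags` and `missing_guards` are hypothetical helper names, and the two sample strings simply repeat the stock lines and the modified line quoted in this thread.

```python
import re

# The guard flags from the stock lines quoted in this thread.
EXPECTED_GUARDS = {"-mno-sse", "-mno-mmx", "-mno-sse2", "-mno-3dnow", "-mno-avx"}


def disabled_isa_flags(makefile_text):
    """Collect every -mno-* flag that appears on a KBUILD_CFLAGS line."""
    flags = []
    for line in makefile_text.splitlines():
        if line.lstrip().startswith("#") or "KBUILD_CFLAGS" not in line:
            continue
        flags.extend(re.findall(r"-mno-[\w.]+", line))
    return flags


def missing_guards(makefile_text):
    """Return the expected guard flags that the Makefile no longer passes."""
    return sorted(EXPECTED_GUARDS - set(disabled_isa_flags(makefile_text)))


stock = """\
# prevent gcc from generating any FP code by mistake
KBUILD_CFLAGS += $(call cc-option,-mno-sse -mno-mmx -mno-sse2 -mno-3dnow,)
KBUILD_CFLAGS += $(call cc-option,-mno-avx,)
"""

patched = """\
# prevent gcc from generating any FP code by mistake
KBUILD_CFLAGS += $(call cc-option,-mno-sse2 -mno-3dnow,)
KBUILD_CFLAGS += $(call cc-option,-mno-avx,)
"""

if __name__ == "__main__":
    print(missing_guards(stock))    # no guard missing in the stock lines
    print(missing_guards(patched))  # the two flags removed by the diff
```

To run it against a real tree, pass the contents of arch/x86/Makefile to `missing_guards` instead of the sample strings.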

  17. Replies
    1. Working on it, but a major mainline cpu offline code update is wreaking havoc galore. That and my VPS provider is regularly finding ways of giving me endless downtime.

    2. OK, good luck. Tell me if you need a pre-release tester. I'd be glad to compile it and give it a try.

    3. Thanks. I'm even having trouble hosting it right now, but consider this an official release candidate patch that needs the usual compile/testing:

    4. Thanks for the patch. It works OK for me now. Uptime is 1 hour :).

    5. ittaku-subs? CK, you an anime fansubber or something? :-)

  18. Linux-3.9.0 (non-ck) seems to me a very broken release:
    no ondemand governor available for me (MacMini core2) anymore :(

    Also, htop and top both show days instead of minutes for task uptimes :(

    Greetings from Lutherian-fanatic Hamburg
    Ralph Ulrich

    1. make oldconfig
      is totally broken when doing a major release upgrade ...
      (If I use make defconfig, all is OK)
      Ralph Ulrich

  19. Very minor cosmetic only problem (no immediate functional impact) but potentially leading to confusion :

    Since 2.6.37 I think, the BFS patchset forces CONFIG_IRQ_TIME_ACCOUNTING=y.

    Since 3.7, Linux introduced the "cputime accounting" configuration menu, offering CONFIG_IRQ_TIME_ACCOUNTING in an exclusive-or with a new config option representing the default choice : CONFIG_TICK_CPU_ACCOUNTING.

    Once BFS-patched, the "cputime accounting" configuration menu only offers the un-deselectable CONFIG_TICK_CPU_ACCOUNTING option.

    a/ This may confuse a user who knows that BFS wants CONFIG_IRQ_TIME_ACCOUNTING and that the two options are in exclusive-or.

    b/ There is no immediate functional problem as CONFIG_IRQ_TIME_ACCOUNTING=y is actually set in the .config

    c/ Depending on how things evolve on the kernel side, this might lead to troubles in the future as both CONFIG_IRQ_TIME_ACCOUNTING and CONFIG_TICK_CPU_ACCOUNTING are set in the .config when the kernel devs meant them in exclusive-or.

    1. Hi, again, aCOSwt, :-))

      This topic was already discussed here some months ago. I've made some patches for this issue and uploaded them:

      Previous choice behaviour with TICK_CPU_ACCOUNTING available, too:

      Con's choice behaviour with IRQ_TIME_ACCOUNTING only:

      Don't mind the 3.7-ck1 lines in the patch; I've applied them to 3.8.11 after ck1 with -p1 and I currently use Con's choice.

      Best regards,
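
Point c/ above (both accounting options ending up set in the same .config) can be checked with a few lines of script. This is a minimal Python sketch, assuming the usual "NAME=y" / "# NAME is not set" layout of a kernel .config; the helper names and the sample text are hypothetical.

```python
ACCOUNTING_PAIR = {"CONFIG_IRQ_TIME_ACCOUNTING", "CONFIG_TICK_CPU_ACCOUNTING"}


def enabled_options(config_text):
    """Return the names of all options set to y or m in a .config."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # "# CONFIG_FOO is not set" lines and blanks are skipped here
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value in ("y", "m"):
            enabled.add(name)
    return enabled


def accounting_conflict(config_text):
    """True if both mutually exclusive accounting options are enabled."""
    return ACCOUNTING_PAIR <= enabled_options(config_text)


# Hypothetical .config excerpt showing the situation described in c/.
sample = """\
CONFIG_TICK_CPU_ACCOUNTING=y
CONFIG_IRQ_TIME_ACCOUNTING=y
# CONFIG_VIRT_CPU_ACCOUNTING is not set
"""

if __name__ == "__main__":
    print(accounting_conflict(sample))
```

On a real tree one would read the .config file and pass its contents to `accounting_conflict`.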

  21. Above rc-patch (patch-3.9-ck1.lrz)
    for Linux-3.9 solved all problems regarding
    time accounting

    It runs for hours without problems.
    I had a little weird problem shutting down:
    The fan kept running. But that might be due
    to some other bugginess of Linux-3.9
    Greg presented over a hundred patches in the
    stable-queue yesterday planned for linux-3.9.1.
    Most of them are ARM related. The funny time
    accounting of Linux-3.9 doesn't change but with

    I also had an error on the first try compiling.
    I quickly saw it was CONFIG_IRQ_TIME_ACCOUNTING
    related. I knew this is mandatory for Bfs. I
    quilt-popped the Bfs patch. I re-did
    make menuconfig
    disabled CONFIG_IRQ_TIME_ACCOUNTING - saved
    enabled CONFIG_IRQ_TIME_ACCOUNTING - saved
    repatched Bfs
    and on the second try Linux-3.9.1rc-ck1-bfs
    compiled through.
    And it runs perfectly.
    Many thanks from Hamburg

    Ralph Ulrich

    1. Feel free to test one of the patches I posted a few lines above.


    2. @Manuel, I meant that there is perhaps a problem in the auto-generation of linux/.config when the Bfs patch is applied. Looking at your link, you have:
      bool "Simple tick based cputime accounting"
      depends on !S390
      + depends on !SCHED_BFS
      Oops, does this mean Bfs is incompatible with
      TICK_CPU_ACCOUNTING ? I ask because I had before:
      grep TICK_CPU_ACCOUNTING config*
      Ralph Ulrich

    3. @Ralph, I added this line because Con said some time ago that he wants IRQ_TIME_ACCOUNTING. You can leave it out. I tested both versions at the time of the discussion without problems.