Wednesday, 10 July 2013

BFS 0.440, -ck1 for linux-3.10

I finally managed to set up some 3G wireless internet in this remote mountain village I'm staying in (probably the first to ever do so). After a few revisions I was able to bring BFS into line with mainline. There are no significant changes to the design itself, but hopefully a few minor fixes have come along as a result of the resync, as I also carved out bits of code not relevant to BFS and tinkered with the shutdown mechanism a bit more. As for the new tickless-on-busy-CPU feature from mainline, it is not being offered in BFS: it is quite orthogonal to a design that so easily moves tasks from one CPU to another, and it provides no advantage for the desktop/laptop/tablet/PDA/mobile device/phone/router etc. that BFS is targeted towards.

Some of the configuration code was also changed, because the last version allowed you to generate an invalid configuration. You might get some strange warnings about the IRQ TIME ACCOUNTING configuration option, but they should be harmless.

Get BFS by itself for 3.10.0 here:
3.10-sched-bfs-440.patch

After careful consideration, I've decided to remove the remaining -ck patches and just make the -ck patchset BFS with some extra default config options and the -ck tag. As I've said previously, those other patches were from long ago, the kernel has changed a lot since then, and I've been unable to confirm they do anything useful any more, whereas there have been reports of regressions with them.

Get the -ck tagged patchset here:
3.10-ck1

Enjoy!
Please enjoy

51 comments:

  1. Thanks, Con!

    A "remote mountain village", eh? How I envy you...

    Btw: FYI it looks like there are RFC patches for a new scheduler "that's power-aware and aims for offering power-efficient performance has been published ... The patch set introduces a cpu capacity managing 'power scheduler' which lives BY THE SIDE of the existing (process) scheduler."

    http://www.phoronix.com/scan.php?page=news_item&px=MTQwNzI

  2. Still not fixed:

    http://ck-hack.blogspot.com/2013/05/bfs-0430-ck1-for-linux-39x.html?showComment=1368474200657#c5292647727716098765

  3. Nice job, CK. Running just fine with 300 Hz tick rate and haven't been able to trip the shutdown freeze which was solved in 3.9 by jacking the tick rate up to 1k. Will post if I see it. Will also post the usual Pepsi Challenge comparing CFS to BFS in the 3.10 tree when I get some time.

  4. BFS is the bacon of kernel patches.

  5. I still can't suspend to RAM or disk. It hangs at the very end; sadly I get no stack trace, just a blinking cursor.
    The normal shutdown works just fine.

    Replies
    1. Same issues here. Blinking cursor at the very end, system freeze. At least one other person experienced this also:
      https://bbs.archlinux.org/viewtopic.php?pid=1309022#p1309022

  6. Unfortunately, bare 3.10 patched with BFS only (no extra patches applied) doesn't even boot for me. The machine just hangs right after grub loads the initramfs image. Early printk doesn't produce anything either.

    How can I manage that?

    Here is my kernel config: https://gist.github.com/748a872933982630e52a

    Replies
    1. This comment has been removed by the author.

    2. Got it to work after disabling CONFIG_RCU_USER_QS and CONFIG_RCU_NOCB_CPU. No idea why it happens, but it seems that BFS doesn't handle the latest changes in the idle/RCU code.

    3. Thanks for picking that up. These new RCU features are designed to help debug the new full dynticks option which BFS doesn't even support so I never even bothered to try enabling them with the latest BFS (since they're not supposed to do anything unless you enable full dynticks, yet they add significant overhead). I guess I should have masked the options for it to not even be possible to enable them.
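
As a rough illustration of the workaround above, a kernel .config can be checked for the two options before building. This is only a sketch and not part of BFS: the helper name `enabled_options`, the `SUSPECT` set and the sample config text are made up for the example; only the two CONFIG_ names come from the comments above.

```python
import re

# The two options reported above to stop a BFS-patched 3.10 from booting.
SUSPECT = {"CONFIG_RCU_USER_QS", "CONFIG_RCU_NOCB_CPU"}


def enabled_options(config_text, names):
    """Return the subset of `names` set to y or m in a kernel .config."""
    enabled = set()
    for line in config_text.splitlines():
        # Disabled options appear as "# CONFIG_FOO is not set" and never match.
        match = re.match(r"^(CONFIG_\w+)=(\S+)", line)
        if match and match.group(1) in names and match.group(2) in ("y", "m"):
            enabled.add(match.group(1))
    return enabled


# Hypothetical excerpt of a .config, for demonstration only.
sample = """\
CONFIG_RCU_USER_QS=y
# CONFIG_RCU_NOCB_CPU is not set
CONFIG_HZ=300
"""

print(sorted(enabled_options(sample, SUSPECT)))  # ['CONFIG_RCU_USER_QS']
```

To check a real tree, the text of its .config file would be passed in place of `sample`.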

  7. I am in the process of building up the ck kernels and packages for repo-ck. If you run Arch Linux, you can try out the x86_64 generic if you wish. Built on an up-to-date Arch Linux box:

    http://repo-ck.com/x86_64/linux-ck-3.10-1-x86_64.pkg.tar.xz
    http://repo-ck.com/x86_64/linux-ck-headers-3.10-1-x86_64.pkg.tar.xz

  8. I did the usual 'make' benchmark comparing BFS v0.440 to CFS in Linux version 3.10. Usual results found: BFS is statistically faster. See details in blogpost:

    http://grayskysblog.blogspot.com/2013/07/bfs-v0440-vs-cfs-in-linux-3100.html

  9. the difference in smoothness (no stuttering, fewer delays) compared to a kernel with CFS is strikingly noticeable

    thanks a ton for your hard work, Con!

  10. Hello,
    has anyone else had problems with suspend to disk/RAM?
    Since 3.9 it's been broken on my machines with the BFS patches.
    Shutdown was broken too with the first 3.9 patches, but it worked again with 3.9.4 or 3.9.5...
    What's the best way to debug this?
    The last thing I get is a blinking cursor, no stack trace...

    Replies
    1. works fine here - give the following patches a try:

      [PATCH] PM: avoid 'autosleep' in shutdown progress
      [PATCH] cpufreq: Fix cpufreq regression after suspend/resume
      [PATCH 1/8] cpufreq: Revert commit a66b2e to fix cpufreq regression during suspend/resume

    2. ok, not sure why the post wasn't displayed

      so I'm reposting

      [PATCH] PM: avoid 'autosleep' in shutdown progress
      [PATCH] cpufreq: Fix cpufreq regression after suspend/resume
      [PATCH 1/8] cpufreq: Revert commit a66b2e to fix cpufreq regression during suspend/resume

      anyway: suspend-to-ram and resume works fine for me with 3.10

    3. Blogspot has spam filtering, and some posts get wrongly tagged as spam and don't show up. Every so often I check and get them untagged, and they appear later on. The same happens with false negatives.

      Anyway, as I said in 3.9-ck, the whole suspend/resume cycle was completely revamped in 3.9.

  11. Same here. Suspend to disk/ram is broken. I only get a dead, frozen console with no stack trace, nor any way to get one. Need a hard reset to resume operation.

    I run Linux-3.9.10 on Xubuntu 12.04 LTS. Shutdown/restart works normally for me though.

    I tried to get an incremental diff between bfs-430 and bfs-440, but I end up with this:

    1 out of 1 hunk FAILED -- saving rejects to file /tmp/interdiff-1.sYDH6A.rej
    interdiff: Error applying patch1 to reconstructed file

    The contents of the reject file are:

    --- interdiff-1.sYDH6A
    +++ interdiff-1.sYDH6A
    @@ -26,8 +26,8 @@
    #include "cpufreq_governor.h"

    /* Conservative governor macros */
    -#define DEF_FREQUENCY_UP_THRESHOLD (80)
    -#define DEF_FREQUENCY_DOWN_THRESHOLD (20)
    +#define DEF_FREQUENCY_UP_THRESHOLD (63)
    +#define DEF_FREQUENCY_DOWN_THRESHOLD (26)
    #define DEF_SAMPLING_DOWN_FACTOR (1)
    #define MAX_SAMPLING_DOWN_FACTOR (10)

    Is there any chance to get bfs-440 backported to Linux-3.9? It is still receiving updates, the latest of which is the 3.9.10 release: http://lwn.net/Articles/558874/

    Replies
    1. according to that snippet - that's just the ondemand cpufreq governor more aggressively tuned

      switch to conservative governor and see whether that makes a change (which I doubt) - afaik conservative and ondemand governor are affected by BFS patch so your mileage may vary

      as a last test you could try initiating suspend-to-ram with performance governor and see whether that is working


      when trying out the 3.10 kernel, give the 3 cpufreq/PM related patches (1 PM and 2 cpufreq) a try
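
For the governor experiments suggested above, the active cpufreq governor can be read and changed through sysfs. The following is a minimal sketch, assuming the standard cpufreq layout under /sys/devices/system/cpu/cpuN/cpufreq/scaling_governor; the function names `read_governors` and `set_governor` are invented for this example, and writing to the real sysfs files requires root.

```python
from pathlib import Path

# Standard cpufreq sysfs location; overridable so the code can be tried
# against a scratch directory instead of the live system.
DEFAULT_ROOT = "/sys/devices/system/cpu"
GOVERNOR_GLOB = "cpu[0-9]*/cpufreq/scaling_governor"


def read_governors(sysfs_root=DEFAULT_ROOT):
    """Map each CPU directory name (e.g. 'cpu0') to its current governor."""
    governors = {}
    for path in sorted(Path(sysfs_root).glob(GOVERNOR_GLOB)):
        governors[path.parent.parent.name] = path.read_text().strip()
    return governors


def set_governor(name, sysfs_root=DEFAULT_ROOT):
    """Write the governor name (e.g. 'performance') for every CPU found."""
    for path in Path(sysfs_root).glob(GOVERNOR_GLOB):
        path.write_text(name + "\n")
```

A test run along the lines of the comment above would call `set_governor("performance")` as root, attempt suspend-to-RAM, and then restore the previous value reported by `read_governors()`.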

    2. So I recompiled the kernel again (3.9.10 on Xubuntu 12.04), but this time I changed to CONFIG_HZ=1000 (was CONFIG_HZ=300). And now I can suspend/resume, as well as restart/shutdown, normally.

      Of course I cannot be certain that this one change rectified the problem I was experiencing with suspend, as I have changed other options as well (using graysky's linux-ck-atom-3.9.10-1-i686 config as guide, customizing to my preference and selections appropriate to a Debian-based distro from there).

      Anyway, I'm a happy camper now, and thanks to graysky for his public repository and architecture-specific configurations.

  12. Same problems with suspend to disk/RAM here.
    Checking the logs, all the suspend hooks (pm-suspend) are called correctly; it hangs once the request is sent to the kernel.

    Nothing wrong appears in the kernel logs.

    Replies
    1. Ok I now tried with kernel 3.10.1 and suspension seems to work again.

    2. last update (I hope): suspend seems to work unreliably; sometimes it suspends, sometimes it hangs.

    3. for those having issues with suspend, try the following patches against 3.10 (or 3.10.1/3.10.2 if they aren't already included)

      [PATCH] PM: avoid 'autosleep' in shutdown progress
      [PATCH] cpufreq: Fix cpufreq regression after suspend_resume
      [PATCH 1/8] cpufreq: Revert commit a66b2e to fix cpufreq regression during suspend_resume

      hope that helps

    4. No that didn't help sadly...
      I tried enabling some debug stuff but all I see is the blinking cursor.
      Any help to pin down this issue would be appreciated.
      If I switch to CFS everything works but that's not a good solution.

    5. Since kernel 3.9.8 I had issues with suspend/resume, too, using suspend-to-disk usually. As I often try different kernel config options and there were many changes in 3.10 and CK's patch drops I wasn't able to track this down to a particular patch/config/setting with a minimum of rational reasoning -- at least I tried to. ;-)

      - disabled radeon UVD as it completely breaks suspend (with a patch from LKML that was not accepted:
      http://pastebin.com/0mRGb224 && passing radeon.no_uvd=1 on the kernel command line).
      - applied mm-drop_swap_cache_aggressively.patch from 3.9-ck1
      - CONFIG_HZ_300
      - CONFIG_HIGHPTE=n
      ( - for 3.10.3 I newly tried the Transparent Hugepage Support which enables memory compaction and page migration as well )
      - set /proc/sys/vm/dirty_background_ratio to 3 (openSUSE seems to default to 5)
      - set /proc/sys/vm/dirty_ratio to 8 (openSUSE seems to default to 10)
      - additionally, I set 'early writeout = n' in /etc/suspend.conf some time ago

      So, maybe one or more of these is helpful to you; here it survived 3 consecutive suspend/resume cycles within ~2 1/2 days of uptime.
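
The two writeback settings from the list above can be applied with a few lines of Python. This is a sketch under assumptions rather than a recommended tuning: the values 3 and 8 are simply the ones quoted above, the helper name `apply_vm_settings` is invented here, and writing under /proc/sys/vm needs root.

```python
from pathlib import Path

# Values quoted in the comment above; they are one user's choice,
# not a general recommendation.
VM_SETTINGS = {"dirty_background_ratio": 3, "dirty_ratio": 8}


def apply_vm_settings(settings, vm_root="/proc/sys/vm"):
    """Write each setting and return the values that were in place before."""
    previous = {}
    for name, value in settings.items():
        path = Path(vm_root) / name
        previous[name] = int(path.read_text().strip())
        path.write_text(f"{value}\n")
    return previous
```

Keeping the returned dictionary makes it easy to restore the distribution defaults afterwards by passing it back to the same function.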

      Best regards, Manuel Krause

    6. Thanks for the help, but none of these tips helped. I could pin it down further, though. I enabled no_console_suspend and let my machine try to suspend. Now I see that suspend gets to the last phase, and I see the "CPU x is now offline" messages. The strange thing, most likely related to the bug: sometimes it gets to CPU 4 before it hangs and sometimes to CPU 6, but never further.
      This code is in "kernel/cpu.c" in the "disable_nonboot_cpus" function, but I'm not a kernel programmer, so I'm stuck there.
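
When the console output from such a failed suspend is captured (typed up from a photo, for instance), a small script can report where CPU offlining stopped. This is a sketch only: it assumes the messages contain the text "CPU x is now offline" as described above, the sample log is invented, and `offlined_cpus` and `first_stuck_cpu` are hypothetical helper names.

```python
import re

OFFLINE_RE = re.compile(r"CPU (\d+) is now offline")


def offlined_cpus(log_text):
    """Return the CPU numbers reported offline, in the order they appear."""
    return [int(m.group(1)) for m in OFFLINE_RE.finditer(log_text)]


def first_stuck_cpu(log_text, cpu_count):
    """Return the lowest non-boot CPU with no offline message, or None."""
    done = set(offlined_cpus(log_text))
    # CPU 0 is the boot CPU and stays online during suspend.
    for cpu in range(1, cpu_count):
        if cpu not in done:
            return cpu
    return None


# Invented log excerpt for an 8-thread machine that hangs after CPU 4.
sample_log = "\n".join(f"smpboot: CPU {n} is now offline" for n in range(1, 5))

print(offlined_cpus(sample_log))       # [1, 2, 3, 4]
print(first_stuck_cpu(sample_log, 8))  # 5
```

Comparing the result across several failed attempts shows whether the hang always follows the same CPU or moves around, as reported above.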

    7. One more thing: I disabled hyperthreading to test what happens if I halve the core count, and now the issue hits immediately, before any CPUs are disabled.

    8. Post in pastebin or send me the full debug output of that debugging with no_console_suspend. Since you can't usually cut and paste, a digital camera shot usually suffices ;)

    9. http://imgur.com/YQ1eNej

    10. Here are two backtraces that also happened while suspending. They don't look related to the hang, but I guess everything helps.

      http://imgur.com/yU8GOpa
      http://imgur.com/t0BSbMT

    11. Here are two backtraces from the same situation; they don't trigger every time...
      http://imgur.com/yU8GOpa
      http://imgur.com/t0BSbMT

    12. @Manuel Krause:

      that patch with radeon.no_uvd=1 is golden!

      now resuming from suspend (with the new DPM (!)) also works

      gotta try out the new drm fixes from time to time - but until now it only would work occasionally on the first try

      thanks :)

    13. @kerneloftruth & all other readers:
      Thank you for the feedback.

      I see now that my proposed workarounds don't cure the suspend/resume problems, which must be somewhere else in the kernel. It works 3 times; the 4th or 5th attempt fails.
      This is still on my old single-core PIII Tualatin with ATI Radeon HD 4350 graphics and the open-source radeon driver (and the known setup).

      When resuming, if the BUG finds its way to the logs, I usually get something like this:
      http://pastebin.com/91fy4Rcr
      The unlink_anon_vmas can come earlier than __rb_erase_color with kernel versions < 3.10.4

      Don't know whom to contact about this and whether it is BFS related at all. I haven't tested against CFS so far, but I'll do now (don't want, don't want, don't want) ;-)

      Manuel Krause

    14. @Manuel Krause:

      there seem to be 2 issues:

      the 1st rb_erase_color related:

      ./lib/rbtree.c:____rb_erase_color(struct rb_node *parent, struct rb_root *root,
      ./lib/rbtree.c:void __rb_erase_color(struct rb_node *parent, struct rb_root *root,
      ./lib/rbtree.c: ____rb_erase_color(parent, root, augment_rotate);
      ./lib/rbtree.c:EXPORT_SYMBOL(__rb_erase_color);
      ./lib/rbtree.c: ____rb_erase_color(rebalance, root, dummy_rotate);
      ./include/linux/rbtree_augmented.h:extern void __rb_erase_color(struct rb_node *parent, struct rb_root *root,
      ./include/linux/rbtree_augmented.h: * so as to bypass __rb_erase_color() later on.
      ./include/linux/rbtree_augmented.h: __rb_erase_color(rebalance, root, augment->rotate);


      look for fixes for rbtree on lkml or following patches for 3.10 or file a bug report/ask on lkml - could be triggered by BFS or it's simply an issue which got introduced recently - didn't see it so far with 3.10


      the unlink_anon_vmas related issue:
      might be an inherent issue in preemption code that gets triggered by BFS - search for related messages on lkml


      hope that helps in some way

  13. AMD Phenom II / Radeon HD6950 here refuses to suspend to RAM or disk with kernel 3.10.1-ck. With the Arch 3.9.9 kernel it suspends, albeit with some errors ([Firmware Bug]: cpu x, try to use APIC500 (LVT offset 0), with x being every CPU core present except 0). I was told this was a BIOS bug for AMD K10 chips.

    HD turns off, screen shows a non-flashing cursor at the top left. Hard boot required to get things going again.

    Also compiled it for an AMD E-450 (or was it E-350..) and there it suspends correctly.

    Both use the proprietary AMD ATI driver.

    Replies
    1. yeah, same problem here; most likely it's related to the CPU offline code changes.

  14. Hmmm. On one of my machines I got a panic with a (repeated) "BUG: unable to handle kernel paging request" (and 10 pages of info). The IP points to anon_vma_clone+0x7f/0x160. Since the kernel (3.9.8) is patched (BFS, BFQ) and tainted (nvidia), and since the issue is most likely not easily reproducible, I will never be able to report this bug to the right people... sigh.

    Replies
    1. Hmmm. Another machine panicked on me, this time under xorg, and I got no screenshots or logs. Common to both machines: kernel 3.9.8, CK1 patch, BFQ v6r2 patch, nvidia blob, and the panic occurred during or minutes after resuming from STR. I've upgraded the kernels to 3.9.10 and I'll keep observing... Going 3.10 soon.

  15. Finally updated to kernel 3.10. No problems so far with BFS. 3.10 is probably going to be a longterm kernel, so hopefully I'll be sticking with it for the next year or so. I'm sick and tired of new kernels breaking half my software.

    Replies
    1. AFAIK suspend to ram is still broken :/

    2. Not here; using an i7, s2ram and s2disk work fine. I had a problem with s2disk not finding the swap disk, but that was another story.

      If I remember correctly, since 3.9 (until kernel 3.10.3?) I had a problem with CPU frequency scaling: the load was nearly 0, but the CPUs ran at max MHz. There was a new config switch for Intel governors (CONFIG_X86_INTEL_PSTATE); my .config still had the old ondemand and this was the cause. For additional info: https://plus.google.com/117091380454742934025/posts/2vEekAsG2QT

      So Con, thanks for your work on BFS. Using it with the ZEN Kernel.

      PS: I only have a performance problem with BFQ in conjunction with BFS: disk I/O sometimes drops to 10% of normal.

      CU Mike

  16. Almost certainly any suspend to disk/RAM issues that arise as a result of patching with BFS are the fault of BFS itself. The complete rewrite of the suspend mechanism after linux-3.8 led to drastic changes being required in BFS to suspend again (see the BFS announce for 3.9). As is increasingly common, I have to say a lack of time and enthusiasm prevents me from investigating much further right now. If suspending is crucial to your everyday activities, and you wish to use BFS, a 3.8 based kernel is your best option. Sorry, I wish I had infinite hours to work on every one of my projects as much as I'd like...

    Replies
    1. I hear you. Unfortunately 3.10 has many goodies for intel drm etc, so just have to stay off BFS for now..

    2. > Almost certainly any suspend to disk/ram issues that arise as a result of patching with BFS are the fault of BFS itself.

      I'd just like to add that there are some reports of unusual kernel panics out there by people without BFS. Maybe the new suspend & tickless mechanisms are not as foolproof as we'd like.

      The only problem for people like us is that we cannot report panics to anybody as long as they are sporadic and our kernels are patched and tainted. [evil laugh]

    3. I've dropped my retest against 3.10.x CFS, as I faced new and different and sometimes looping BUGs/OOPSes after suspend-to-disk that never showed up earlier. I have to say, CFS is not more reliable than BFS at all. And the experience with CFS is so bad in comparison to BFS that I trust the ext4 journalling not to lose too much data when expecting/having a lockup upon the 5th resume with BFS. CFS may happen to fail on the 2nd resume already, unpredictably. I even reverted some PCI register tunings that had worked until 3.8.y, with no advantage.

      I still, for years now, suspect that there is some severe problem with memory distribution within physical<->swap<->shm that no kernel developer feels responsible for. And I don't have enough knowledge to dig into that.

      Best regards, Manuel Krause

    4. Hi Manuel,

      have you, by chance, tried out the following patchset: [patch v2 0/3] mm: improve page aging fairness between zones_nodes?

      patches include:

      [patch v2 1/3] mm: vmscan: fix numa reclaim balance problem in kswapd
      [patch v2 2/3] mm: page_alloc: rearrange watermark checking in get_page_from_freelist
      [patch v2 3/3] mm: page_alloc: fair zone allocator policy

      that patchset (and some others) + BFS/ck-patchset make things way more bearable :)

      give it a try - it runs stable here with heavily patched up 3.10 kernel base

    5. Hi, kerneloftruth!

      Sorry for answering so late that kernel 3.11 is almost out now...

      The three patches are a real benefit! And I don't understand why they haven't been included in 3.10.y yet. I'm now @ 3.10.10. Please see & apply the two related fixes in: http://www.ozlabs.org/~akpm/mmots/broken-out/

      I have now adapted my old machine's setup with shm & swap to my new system. Although it is much faster and has SMP now, the old glitches remain: when using swap-backed shm, the system often keeps stalling for a second or so when processing a file from there.

      Best regards, Manuel

  17. I'm abandoning my old machine as the main one within the next week or so. Time to say "Thank You!" to all of you for supporting old hardware and for staying tolerant.

    Special thanks go to Con; without his patches my old machine would act like a dinosaur, and I see they do much good for my new one, too.

    See U soon with new issues ;-)
    Best regards, Manuel

    P.S.: There's an issue with Kernel 3.10.8/9 AND BFQ I/O scheduler: Please, read:
    https://groups.google.com/forum/?fromgroups=#!topic/bfq-iosched/f4Lg5INzQ-k

    Replies
    1. thanks, Manuel, for pointing out the BFQ issue
