Announcing a resync and update of BFS for linux-3.18
BFS by itself:
-ck branded linux-3.18-ck1 patches:
Uncharacteristically, I found time to resync quickly for this latest stable linux release. There are no new BFS features, but there have been a number of changes to stay in sync with mainline. Apart from keeping up with the usual churn in new releases, of which there was a modest amount this time, a number of other low level changes were committed. That makes this a much less trivial resync, so some caution is warranted before blindly updating.
Hilf Danton pointed out a bug in the yield_to code (thanks!) which is now fixed. Since almost nothing uses this code you probably won't notice anything. He also pointed out some other now outdated components in BFS which are also updated. The above_background_load function has also been removed since the VM tweaks in older -cks no longer exist to use it.
More substantially, I've reworked the plugged I/O code to match mainline now. I had been reluctant to touch it previously because the unlocking and relocking in the scheduler code path introduced deadlocks when the first plugged I/O code made its way into BFS, and it needed several iterations of fixes. Watch for any I/O misbehaviour or stalls. There are also some changes to how mainline responds to idle CPUs, so watch for any unusual behaviour there.
Having said that I've been using it for a while and not noticed anything out of the ordinary, but please report back if there are any issues.
Seems to work for me, many thanks.
Great job with bfs v460! Been ticking away on my WS 6 days now with no issues.
Thanks so far Con, using it with the ZEN Kernel on 3.18.1. No problem here.
Btw. Merry Christmas and a happy new year.
I must revise my earlier assessment. I had no problem on my laptop running the ZEN kernel (maybe I haven't identified some quirks as a problem ;) ). But on my server running the "same" kernel I run into big trouble with the new BFS. Copying data from my external eSATA/USB drive to my XFS RAID5 leads to a reproducible error. After approx. 10 secs the system stalls for seconds to minutes and becomes unusable. top shows high load values of >15.
Some dmesg output:
INFO: task kworker/1:0:18 blocked for more than 480 seconds.
[ 960.118057] Not tainted 3.18.2-zen-server+ #16
[ 960.118060] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 960.118065] kworker/1:0 D 0000000000000001 0 18 2 0x00000000
[ 960.118140] Workqueue: xfs-data/md0 xfs_end_io [xfs]
[ 960.118145] ffff88022714fc98 0000000000000046 00000000aad73000 000000000000bdf0
[ 960.118152] 000000000000bde8 ffff8802270c5fe0 ffff880225ef6740 ffff880227139620
[ 960.118158] ffff880225e46d28 ffff880227139620 ffff8801ee273ca8 ffff8801ee273c90
[ 960.118164] Call Trace:
[ 960.118179]  schedule+0x24/0x60
and so on.
A hard reset of the computer was necessary. (And then the resync of the RAID starts from scratch and needs 10 hours :( )
After some testing:
Working: Zen-Kernel 3.17.7 with BFS, Vanilla Kernel 3.18.2, Zen-Kernel 3.18.2 with CFS
Not Working: Zen-Kernel 3.18.2 with BFS
And surprise: Working Zen-Kernel 3.18.2 with your bfs460-locked-pluggedio.patch
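When comparing kernels like this, it can help to pull the hung-task reports out of a saved dmesg log automatically. The following is a minimal sketch, assuming the log contains lines in the same form as the "blocked for more than" message quoted above; the function name `find_hung_tasks` and the sample text are only illustrative and are not part of any kernel tooling.

```python
import re

# Matches hung-task reports such as:
#   INFO: task kworker/1:0:18 blocked for more than 480 seconds.
# The task name may itself contain colons, so the pid is taken as the
# last colon-separated number before " blocked".
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds\."
)

def find_hung_tasks(dmesg_text):
    """Return a list of (task name, pid, seconds) tuples found in dmesg text."""
    hits = []
    for line in dmesg_text.splitlines():
        m = HUNG_TASK_RE.search(line)
        if m:
            hits.append((m.group("comm"), int(m.group("pid")), int(m.group("secs"))))
    return hits

# Sample built from the lines quoted in the report above.
sample = """\
INFO: task kworker/1:0:18 blocked for more than 480 seconds.
[ 960.118057] Not tainted 3.18.2-zen-server+ #16
[ 960.118140] Workqueue: xfs-data/md0 xfs_end_io [xfs]
"""

print(find_hung_tasks(sample))  # [('kworker/1:0', 18, 480)]
```

Running it over the logs from a working and a non-working kernel gives a quick side-by-side of which tasks stalled and for how long.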
another issue with the BFS for 3.18.
Using BFS on an old CPU (Pentium M) with 32bit kernel 3.18 leads to a kernel panic during bootup (btw. debug preemptible kernel isn't set)
Working: CFS and 3.18
Working too: BFS with 3.17
It seems the -ck patch doesn't work on a single-core, single-thread, 32-bit CPU.
I have this problem on a VIA C3 CPU (post 22 Jan 2015).
I found that SMP kernel options have to be enabled to avoid the idle task panic. See 3.19 comments. -jwh
Great job. 3.18.1 with ck1 is rock solid.
I think I've encountered a problem with your rework of the plugged I/O. I use btrfs, and while I was doing a scrub, which is a pretty intensive I/O task, I got a kernel oops. I've put the kernel log here: http://pastebin.com/xbvaia9a
With kernel 3.17.7 everything is smooth. The same with kernel 3.18.1 and the CFS scheduler.
I just tested btrfs scrub on my system after reading this, and it froze my system with 3.18.1 and BFS after a few seconds. I didn't test with vanilla 3.18.1 kernel yet, but I'm sure I was able to run scrub on 3.17.* kernel with BFS without any issues.
Other than that I haven't had any issues with BFS on 3.18.1 kernel after 5 days of running it on my laptop.
Thanks again ck for your work.
Thanks for that. I'll try and get a patch that backs out the plugged I/O changes out soon for you to try.
Try this patch: bfs460-locked-pluggedio.patch
Thanks Con! I ran btrfs scrub twice after applying your patch and it finished successfully. No deadlocks.
the patch seems to solve the issue. at least I was able to run scrub on my btrfs partitions without freezes this time :)
The latest Ubuntu LTS now uses the 3.13 kernel.
Can you release BFS with the SMT nice patch for 3.13?
Thanks, I believe you should.
Just got 3.18 working for the first time on my netbook (using a bobcat-optimized kernel on Arch). I'd been stuck using earlier versions for a week due to the broadcom-wl thing until I found out my BCM43228 wireless chipset is now supported by the b43 driver in 3.17 and higher, whoops. While it's working fine overall, I tried adding elevator=bfq to enable bfq and it crashes during startup with a kernel panic in btrfs. It boots fine without that parameter. The weird part is that my desktop (haswell kernel) DOESN'T crash like this with BTRFS and BFQ on 3.18, but it's UEFI based and therefore boots from a vfat partition for /boot (root is still BTRFS).
I'd try to upload logs of the kernel panic, but it's not saved anywhere because it happens during early boot, while the first partition is being mounted. Should I just take a picture of the screen and upload it?
BFS is this CPU scheduler, on here.
BFQ is a disk I/O scheduler, that you can find there:
@ post-factum: Any further news on TuxOnIce, other than what is publicly available, perhaps?
@ all: I wish a Happy and Successful New Year to ALL of you,
Funny you. I asked for news, other than publicly available. Thanks. Manuel
That is all I know, sorry :(.
Mmmh. Now there appeared TuxOnIce patches for 3.19-rc6 and 3.18.5, but at least the 3.18.5 version is so unreliable that I can't recommend it for now. I've only had 3 successful resumes out of approx. 15 attempts (also trying different TuxOnIce settings), and the successful resumes only occurred with low memory load, not depending on a changed setting. :-(
There seems to be a bug that causes plasma-desktop to fail to start correctly with 3.18.1-ck1.
FYI, I'm getting the following panic message using 3.18.1-ck1 w/ BFS .460:
Kernel panic - not syncing: Attempted to kill the idle task!
I have been using ck1 and ck2 with respective BFS' since 3.15.x through 3.17.6, without issue.
This is now a known problem when "debug preemptible kernel" is enabled in combination with SMT nice. Disabling the former will fix it.
I have the same problem:
The system is a VIA C3 CPU, 32 bits, one core, one thread, EPIA motherboard.
3.18.3 vanilla works fine: here is my kernel config: http://perso.crans.org/~bebert/ck/config-epia-nock
3.18.3-ck1 crashes on boot (in the first seconds, just after Loading Linux, BIOS Data check): here is my config:
I have disabled "debug preemptible kernel" in both kernels.
I have taken a picture of the screen with crash:
3.14.28-ck1 used to work well.
I hope it can help...
Thanks for all your work.
For completeness w/ others who may be searching...
I found that SMP kernel options have to be enabled to avoid the idle task panic. See 3.19 comments. -jwh
That is in reference to these last two matches for 'PREEMP' in my kernel config (which I've not changed, for your reference, versus 3.17.6, e.g.)?
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
Or is 3.18 exercising something new in reference to this? While I'm here, much thanks for this kernel and BFS! I'm doing a test build of 3.17.7 right now, or I'd already be trying it out. :-)
Whoops, I missed the 'debug' part in my kernel config search. So it's this: http://cateee.net/lkddb/web-lkddb/DEBUG_PREEMPT.html
...however, CONFIG_DEBUG_PREEMPT isn't in my kernel config at all (or is that the problem?), and CONFIG_DEBUG_KERNEL isn't set; CONFIG_TRACE_IRQFLAGS_SUPPORT is 'y'.
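A quick way to settle this kind of question is to classify each option as set, explicitly "not set", or absent from the config file. In every case except the first, the option is off. The sketch below assumes the usual kernel config text format (`CONFIG_X=y` or `# CONFIG_X is not set`); the helper name `config_state` is hypothetical, and the sample text is assembled only from the options mentioned in this comment, not from a real config.

```python
import re

def config_state(config_text, option):
    """Classify a kernel config option.

    Returns the value string (for example 'y' or 'm') if the option is set,
    'not set' if it appears as '# CONFIG_X is not set', and 'absent' if it
    does not appear in the text at all.
    """
    set_re = re.compile(r"^%s=(.*)$" % re.escape(option))
    unset_line = "# %s is not set" % option
    for line in config_text.splitlines():
        line = line.strip()
        if line == unset_line:
            return "not set"
        m = set_re.match(line)
        if m:
            return m.group(1)
    return "absent"

# Illustrative sample built from the options discussed above.
sample_config = """\
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_DEBUG_KERNEL is not set
CONFIG_TRACE_IRQFLAGS_SUPPORT=y
"""

for opt in ("CONFIG_DEBUG_PREEMPT", "CONFIG_DEBUG_KERNEL",
            "CONFIG_TRACE_IRQFLAGS_SUPPORT"):
    print(opt, "->", config_state(sample_config, opt))
```

With this sample, CONFIG_DEBUG_PREEMPT is reported as absent, which means it is not enabled in that build.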
Thanks con I appreciate your work!
Just noticed a kernel panic. My eth went down for a few min then came back up by itself.
This is not a kernel panic. This is WARN_ON().
Guess I should have read more closely.
Porting to 3.19 at first appeared to be challenging, but in the end - I suppose the solution turned out to be quite elegant ;)
only a few minor changes were necessary:
Not sure if it's placebo, Alfred Chen's additional patches (https://bitbucket.org/alfredchen/linux-gc/commits/branch/linux-3.18.y-gc) or commenting out of most of the cgroups & cpuset stuff (overhead):
the desktop feels quite snappy :D
It doesn't seem to be that easy to port. Following your approach, I got a very unreliable KDE desktop (mouse pointer lagging, Xserver crashing) when having 'a bit more' disk I/O including swap & /dev/shm. (CFS is doing well in a similarly configured setup.)
I hope Con is aware of this to make a better release patch.
Hm, did you try the branch with Alfred Chen's patches added on top?
perhaps that fixes lots of rough edges?
I had rsync backups, portage compilations (e.g. firefox) and other compilations in the background and everything is buttery smooth
instabilities, lagginess, etc. related to swapping & /dev/shm
seem to be related to other things than solely the cpu scheduler (BFS)
it's strange though that it works fine with CFS
can't really explain that
also I don't have any additional time
hope you guys figure this out =)
Blah blah blah:
what I *actually* wanted to write:
it's at least working as well as on 3.17.8 with BFS and patches from Alfred Chen
Sorry for making that noise.
For the test I had additionally adopted one old patch from the 3.18 kernel related to my intel-gfx, which apparently doesn't behave well on 3.19 (not only together with _your_ patches).
Please, accept my apologies,
apologies accepted, no harm done
Thanks for letting me know =)
Currently I'm trying to figure out why the box locks up with Alfred Chen's ported BFS and not with my attempted port
might be due to the fact that I added some additional patches that don't play well with ZFSonLinux and screw it up
symptoms: it hardlocks as soon as X starts up and any serious work is attempted (e.g. git, chromium startup, etc. etc.)
if I'm able to post a kernel lockup message or anything related or figure out that it is, indeed related to the port I'll post here
other than that:
Con, Alfred and all others involved to make BFS the best cpu scheduler for latency & desktop usage
You guys rock !
Thanks a lot !
just found out that my port has one BUG that makes it unusable for me:
it hardlocks during attempt of a stage4 backup (tar & 7z)
so - since I'm meanwhile also working on improving Alfred Chen's BFS port/version
please defer to Alfred's BFS =)
@post-factum: I've seen that you've created 3.18-pf1. Thank you for your work!
But why have you first imported and then reverted the official TuxOnIce patch for 3.18.x, to then patch with a "remote-tracking" git version of TuxOnIce? Please, can you explain your reasons for that?
Addon: I would also be thankful if you could share your TuxOnIce related kernel and (maybe) userspace settings on here, as you've said it is working well for you. For me it does not. :-(
Thank you in advance,
> Please, can you explain your reasons for that?
> For me it does not. :-(
The sandboxed remote tracking version exactly matches the one derived by using the most recent 3.18.x patch from the TuxOnIce server (http://tuxonice.nigelcunningham.com.au/downloads/all/). I've reordered the 105 individual diffs by hand to prove this fact. So, ... "Merge conflict." cannot be the complete truth.
I had tested the 3.18.6 version of TuxOnIce and posted the results on here. When it was successful, I got logs of the success, of course. When it failed, it either failed in seeking to free memory, or saving atomic copy (after saving caches), or simply refused to load the saved image (swap). It simply hung in the middle of nowhere, and logs of these crashes were not available. Maybe I get time to retest with a freshly set-up 3.18.x soon.
Most probably, you've also seen already, that the TuxOnIce for 3.19 highly differs from the 3.18 ("backported") version. Currently, I'm testing 3.19.0, for now without BFS/CK, and with BFQ with a slightly modified most recent patch from Nigel's server (original didn't apply cleanly). I'd need to test more, but it seems to work well. Also with my userspace settings (from 3.17.x):
echo 1 > /sys/power/tuxonice/full_pageset2
echo 1 > /sys/power/tuxonice/no_flusher_thread
In kernel config I kept the checksumming pageset2 ON.
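For anyone wanting to reapply such userspace settings after every boot, a small script can do the same writes as the echo commands. This is only a sketch under stated assumptions: the directory and the two file names are taken from the commands above, while the helper name `apply_settings` and the dry run against a scratch directory are illustrative. On a real system the writes need root and a kernel with TuxOnIce built in.

```python
import os
import tempfile

# Directory and file names as used in the echo commands above.
TUXONICE_DIR = "/sys/power/tuxonice"
SETTINGS = {"full_pageset2": "1", "no_flusher_thread": "1"}

def apply_settings(settings, base=TUXONICE_DIR):
    """Write each value to base/<name>; return the names that could not be written."""
    failed = []
    for name, value in settings.items():
        try:
            with open(os.path.join(base, name), "w") as f:
                f.write(value + "\n")
        except OSError:
            # Missing directory, missing TuxOnIce support, or no permission.
            failed.append(name)
    return failed

# Dry run against a scratch directory so the sketch is safe to run anywhere.
with tempfile.TemporaryDirectory() as scratch:
    print(apply_settings(SETTINGS, base=scratch))
```

Calling `apply_settings(SETTINGS)` without a `base` argument targets the real sysfs path; any name in the returned list was not applied.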
> So, ... "Merge conflict." cannot be the complete truth.
Should I care?
> In kernel config I kept the checksumming pageset2 ON.
In 3.18 I did the same for my config.
My port of 0460 to 3.19 was done last weekend, and the -gc branch kernel was up for 2+ days. Last night, 2 fixes for issues found in the -vrq branch were ported back to -gc, and the new -gc kernel has been up for 14+ hours so far.
You can check and try my -gc branch with v3.19-gc tag at https://bitbucket.org/alfredchen/linux-gc/commits/tag/v3.19-gc
I will write up the detailed changes for 3.19-gc later. Have fun with 3.19.
thanks a lot !
Really good work, indeed! Thank you very much.
This port works very well together with BFQ-v7r7 and most recent TuxOnIce patch with kernel 3.19.0. Need to accumulate a bit more uptime, but so far, I don't face any issues (like reported here on 17 February 2015 at 10:14).
Also, that hibernation with TuxOnIce works like a charm makes me really happy!
BR Manuel Krause
Those who'd like to test 3.19-pf1 are welcome to the git tree:
BTW, had to fix some BFS issue:
Thanks for pointing out these 2 issues.
The first one is confirmed. I don't have a NUMA config, so I didn't notice it was missing.
For the second, em, as a Funtoo user I live happily without systemd; how does it go with CGROUP now? I think I can spend some time looking at it and make them less of a dummy. :)
And @pf, please consider dropping the following commits you merged; they are kind of hard-coded and work well specifically for CORE2 CPUs
1. Add XOR_PREFER_TEMPLATE to xor[v2].
2. Use prefered raid6 gen function.
Hehehe, but they work well for CORE2 CPUs!!!
Just pushed a fix to the -gc branch, please check the commit at https://bitbucket.org/alfredchen/linux-gc/commits/81196b0faa1ec127afd182cad2ac645ce9f3bad8?at=linux-3.19.y-gc
Fixed and merged, thanks.
BTW, getting small oops on each boot:
I need to add a "Me, too." Haven't noticed it. I should check the full dmesg more often.
Any news on this, Alfred? At least, this WARNING doesn't result in any failures or irregular system behaviour.
@post-factum: Thank you for your engagement to improve TuxOnIce!
Potential lockup fix:
looks like it also applies to BFS
patch for BFS:
please check against: https://lkml.org/lkml/2015/1/22/465
Sorry for the late reply, I was out of town for CNY last week. 2 threads here, one at a time.
The WARNING is introduced by the newly added WARN_ONCE in set_task_cpu(), as CFS doesn't allow setting a task's cpu while it is blocked. For BFS, calling set_task_cpu() does no harm, and set_cpus_allowed_ptr() calls it for non-running tasks.
This reminds me of 2 things:
1. the gap between CFS and BFS: the TASK_WAKING status seems not to be used in BFS, and on_rq has different meanings in CFS and BFS.
2. a task's cpu is a useful piece of scheduling info that tells us which cpu the task last ran on, and this info is used to choose the best idle cpu for the task.
IMO, the call to set_task_cpu() for non-running tasks in set_cpus_allowed_ptr() is unnecessary, as the allowed cpumask has been set and leaving the last-run cpu info there will cause no harm.
So, my simple fix for this issue is to remove the set_task_cpu() call in set_cpus_allowed_ptr() and leave the WARN_ONCE in set_task_cpu(), but keep in mind that the condition may need to be adjusted for BFS.
Good info. I'd like to try that patch. I've recently been fighting a very weird boot-up issue that is somehow related to preempt_schedule().
@kernelOfTruth works for me, thanks.
@Alfred Chen shouldn't we consider that to be a locking issue, as it happens only on SMP boot?
glad to be of help =)
the commit needed an additional change to let it compile:
so now there's no __cond_resched anymore in bfs.c
Can anyone tell me whether they have tried TuxOnIce with kernel 3.19+bfs-ck or 3.18+bfs-ck on openSUSE 13.2, and did it work?
I think the kernel panic is related to a bug in kernel 3.18 and 3.19 and not to bfs-ck.
I used the 3.19-pf kernel with TuxOnIce on openSUSE Tumbleweed. It starts fine, but hibernating destroyed my ext4 superblock on my sdd. Thank god (better to say Ted Tso ;-) ) it could be restored. I don't know if it was accident or luck, but I will not be trying it again.
That's no good news! :-(
I'm using a (self-)modified TuxOnIce with Alfred's 3.19.y-gc patches and BFQ v7r7 for several days now, without any issue. I don't use SSDs.
on openSUSE 13.1
During the last few days I've run a test series of 27 (so far, and still running) hibernates/resumes with this current kernel to possibly reveal a BUG in kscreenlocker_greet. There have been no issues with the kernel (except with kscreenlocker_greet or with the WARNING reported above).
Some years ago, I also tried the pf-kernel and then, after that failed, added the uksm patches manually to the same base setup. They led to memory errors for me in those days. I think you can phase them out in kernel config, also in the pf-kernel. If you leave out uksm, you'd most probably get the same kernel setup as I have.