Some degree of normality has returned to my life, so I bring to you a resync of the BFS cpu scheduler for 3.7, along with the -ck patches to date.
Apply to 3.7.x:
Broken out tarball:
Latest BFS by itself:
People often ask me why I don't maintain a git tree of my patches, or at least of BFS, to make things easier for myself and for those who download it. As it turns out, a git tree would be less work only for the people downloading; for me it would actually be more work to maintain.
While I'm sure most people are shaking their heads and thinking I'm just some kind of git-phobe, I'll try to explain. (Note that I do maintain git trees for lrzip https://github.com/ckolivas/lrzip and cgminer https://github.com/ckolivas/cgminer.)
I do NOT keep track of the linux kernel patches as they come in during the development phase prior to the latest stable release. Unfortunately I simply have neither the time nor the inclination to care on that level any more about the linux kernel. However I still do believe quite a lot in what BFS has to offer. If I watched each patch as it came into git, I could simply keep my fork with BFS and merge the linux kernel patches as they came in, resyncing and modifying as I went along with the changes. When new patches go into the kernel, there is a common pattern of many changes occurring shortly after they're merged, with a few fixes going in, some files being moved around a few times, and occasionally the patch being backed out when it's found to introduce some nasty regression that proves a showstopper to its release. Each one of these changes (fixes, moves, renames, removals) requires a resync if you are maintaining a fork.
The way I've coded up the actual BFS patch itself is to be as unobtrusive as possible: it does not replace large chunks of code en bloc, it just adds files and redirects the build to use those new files instead of the mainline files. This is done to minimise how much effort it is to resync when new changes come. The vast majority of the time, only trivial changes need to be made for the patch to even apply. Thus applying an old patch to a new kernel just needs fixes to make it apply (even if it doesn't build). This is usually the first step I do in syncing BFS, and I end up with something like this after fixing the rejects:
This patch is only the 3.6 patch fixing any chunks that don't apply.
After that, I go through the incremental changes from mainline 3.6 to 3.7 to see any scheduler related changes that should be applied to BFS to 1. make it build with API changes in mainline and 2. benefit from any new features going into mainline that are relevant to the scheduler in general. I manually add the changes and end up with an incremental patch like this:
This patch is only merging 3.6->3.7 changes into BFS itself
Finally I actually apply any new changes to BFS since the last major release, bugfixes or improvements as the case may be, as per this patch here:
Git is an excellent source control tool, but provides me with almost nothing for this sort of process where a patch is synced up after 3 months of development. If I were to have my fork and then start merging all patches between 3.6 and 3.7, it would fail to merge new changes probably dozens and potentially hundreds of times along the way, each requiring manual correction. While merge conflicts are just as easy to resolve with git as they are with patch, they aren't actually easier, and instead of there being conflicts precisely once in the development process, there are likely many with this approach.
However git also does not provide me with any way to port new changes from mainline to the BFS patch itself. They still need to be applied manually, and if changes occur along the way between 3.6 stable through 3.7-rc unstable to 3.7 stable, each time a change occurs to mainline, the change needs to be done to BFS. Thus I end up reproducing all the bugfixes, moves, renames and back-outs that mainline does along the way, instead of just doing it once.
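As a rough illustration of the difference, here is a toy model in Python. The commit names and actions are made up for the example and are not real kernel history; the point is only that a fork tracking every commit reacts to each one, while a sync at release time sees just what survived.

```python
# Toy model, not a measurement: each tuple is one hypothetical upstream
# commit touching scheduler code between two stable releases.
HYPOTHETICAL_HISTORY = [
    ("feature-a", "add"),
    ("feature-a", "fix"),
    ("feature-a", "move"),
    ("feature-b", "add"),
    ("feature-b", "fix"),
    ("feature-b", "revert"),  # backed out before the release
]


def resyncs_when_tracking(history):
    """A fork that follows every commit has to react to each one."""
    return len(history)


def features_at_release(history):
    """Syncing once at release only has to deal with what survived."""
    alive = set()
    for name, action in history:
        if action == "revert":
            alive.discard(name)
        else:
            alive.add(name)
    return sorted(alive)


print(resyncs_when_tracking(HYPOTHETICAL_HISTORY))  # 6
print(features_at_release(HYPOTHETICAL_HISTORY))    # ['feature-a']
```

In this made-up history the tracking fork handles six changes, including the add, fix and back-out of a feature that never ships, while the one-off sync only has to port a single feature in its final location.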
Hopefully this gives some insight into the process and why git is actually counter-productive to BFS syncing.
Enjoy 3.7 BFS.
Thank you Con.
Btw, broken links: the "3.71" in the URLs should be "3.7-ck1" instead. The full address is http://ck.kolivas.org/patches/3.0/3.7/3.7-ck1/
Thanks. I always seem to screw up links on blogspot. Links fixed.
Thanks as usual CK. For those interested readers, here are the tests comparing the performance of bfs v0.425 and bfs v0.426, as I usually provide. A reminder that these are my `make bzImage` tests on my workstation comparing 4 kernels: 3.6.9, 3.6.9-bfs, 3.7.0, and 3.7.0-bfs. The script is on my github linked below.
If you haven't seen this yet, a branch of BFS:
o tanoshimi i suppose? Should learn my kankis...
Thanks for your benchmark, graysky. For those of us (e.g. me) who can't read graphs and don't know what a quantile is, could you please sum it up in plain English? ;-) Bfs vs cfs, who wins this round?
That graphic is overly complicated for people who don't do statistics. But since CK has a background in stats, I included them. In plain English, both versions of the BFS gave statistically significant DECREASES in compile time compared to the CFS. Less time = faster compile.
Look at the median for each group. BFS v0.426 (the current one that patches into the linux 3.7 series) was around 350 ms faster than the corresponding mainline scheduler (CFS).
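For anyone who wants to make the same kind of comparison on their own timings, a minimal Python sketch follows. The function name and the numbers are placeholders invented to show the call; they are not the measurements from the benchmark above.

```python
from statistics import median


def median_speedup(baseline_times, patched_times):
    """Baseline median minus patched median; positive means patched is faster."""
    return median(baseline_times) - median(patched_times)


# Placeholder timings in milliseconds, invented purely for illustration.
baseline_runs = [1000, 1020, 1010]
patched_runs = [990, 1000, 980]

print(median_speedup(baseline_runs, patched_runs))  # 20
```

The median is used rather than the mean so that a single slow outlier run does not skew the comparison.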
Thanks for the explanation.
actually using git is *much easier*
all you need to do is git rebase v3.7 and resolve little conflicts.
I thought I just explained why, so I guess you didn't even read the article and just assumed I'm trolling against git... Feel free to grab a 3.6-bfs425 patched kernel and then git rebase v3.7 for yourself and see if you end up with anything like bfs426.
If git is much easier, please explain why a project such as bcache doesn't provide patches for recent kernels.
Con Kolivas please look at what this is doing.
I do not mean to be mean here. But this person is at long last addressing the faults raised against BFS that you did not want to hear a while back, when you stormed away from the Linux kernel main developers and made an ass out of yourself.
As the maintainer of CFS said at the time, BFS was not scaling properly, and your response was that those systems were more complex than desktops. We are now getting desktops with 8 cores and more.
It would be good to get you back in mainline development, Con Kolivas. Hopefully this is a big enough lesson not to be pig-headed even if the other person appears to be. If it does not work on large machines, sooner or later those large machines will be the general desktop/mobile phone machines. Yes, there is an 8 core mobile phone due out as well. The sub 4 core gains are over.
So scaling well as the number of cores increases has become critical.
The one thing CFS has always had over BFS is better handling of large systems, by avoiding as much as possible CPUs having to lock to access data.
"BFS takes a different approach than both the O(1) scheduler and CFS. BFS uses runqueues like O(1); however, unlike O(1), which has both an active and an expired runqueue per CPU, BFS has only one system-wide runqueue containing all non-running tasks."
Yes, this is the place where BFS broke from an O(1) scheduler. It comes back and hurts you as you get more and more cores needing to talk to the single system-wide runqueue.
Now a wise person would have investigated why. The worst locking for performance cost is memory controller to memory controller. So a run-queue per memory controller. Possibly, with a cgroup of processes assigned to the physical cores connected to that memory controller, the cost would be minimal. And you would avoid having to perform load balancing.
Now you have also stated your hate for cgroups around processes. This now forces you down the path of a runqueue per core, which leads back to costly load balancing.
Basically the problem you put off a long time ago, Con, is back for revenge.
Clearly you do not understand me at all then. If regaining my life makes me an ass then so be it - I shall remain an ass. As an amateur hacker it is absolutely impossible to maintain that degree of interaction with the linux kernel in my spare time without it affecting my personal life and health. So as it stands, I don't actually want to engage them again at that price. There is no "problem I put off long ago" as far as I can see. You make it sound like I'm obliged to do something. I never once pretended BFS was a fix for everything, nor was I trying to make it so. I was the one who pointed out all its faults long before that paper came out.
As for the actual code, I shall respond in time to the email as appropriate.
Furthermore... BFS didn't exist when I "stormed off" by the way. And it was nothing to do with the scalability of the Staircase Deadline scheduler; that was not what was considered the problem at the time. But time muddies the story and people mix up the new and old so it's understandable.
"...stormed away... and made an ass out of yourself". Smells like an uninformed troll. I for one thank CK for continuing his work DESPITE what the "Linux kernel main developers" said and wanted, it's thanks to him that my Linux-based laptop is responsive.
Anonymous - Feel free to improve upon BFS as Matthias is attempting to do... otherwise, shut up.
The limited scale argument to which you refer needs data to support it AFAIK. Look back a few posts to see that on a dual quad (hyperthreaded) machine, a `make -j16` endpoint clearly establishes that BFS outperforms CFS. It would be interesting to see the numbers on a larger machine. What is the point at which it breaks down, etc. I don't think that we will see 16 core desktop/laptops in the near future :p
++ graysky !
I find it amazing to consider how quick some individuals are in deducing strong moral lessons (to the intention of others) from some purely technical works.
At the end of the day, it appears that they just do not care about the (in)correctness of facts. They just want to say : Hurray ! The author is right ! or Booo ! The author is wrong !
I have posted a response to Matthias, as promised.
Interesting thoughts, Con. Hope people take note...
I read your post but I'm confused - why don't you just clone kernel.org's repository and create your own branch in your repo (call it "ck")?
Then when you want to merge with a new kernel from kernel.org, just do a pull on the master branch followed by a single "git merge" into your "ck" branch.
Commit the merge to "ck" and repeat as needed.
Thanks for the detailed explanation. I always wondered why you don't use git for this. I didn't consider how different it would be to do countless smaller changes throughout the process of a kernel release instead of one larger rebase of your code.
Time to put your new ck patch to work on my shiny new (to me) core 2 duo. It always helped dramatically on my old pentium 4 boxes
Just wanted to say thank you, my machine feels so much snappier with the CK patch I actually thought I left it on the performance governor instead of ondemand :)
Thank you for the updated BFS, Con. I've been running it for a week now on a heavily used Core i5 with no issues whatsoever. Looks solid.
I've compiled (x86_64) 3.7.1 based on Ubuntu's kernel git repo both with CK1 and BFQ: http://narod.ru/disk/64736436001.f1d7509f20fd047f0c3be8d64190cdc9/linux-image-3.7.1-ck1s1_3.7.1-ck1s1-10.00.Custom_amd64.deb.html
Thanks for bfs for kernel 3.7. Hope you'll resume your work on it soon, hacking and improving bfs' features for even better desktop responsiveness :)
ck patches I don't know - they are pointless or even harmful if you are using newer kernel features ...
3.7-sched-bfs-426.patch applies cleanly with linux-3.7.1
Yours isn't any different ...
Ciao from Hamburg, Ralph Ulrich
Unfortunately the new BFS accounts the process times wrongly again :(
- I had this issue with linux-3.4.x-bfs
- got around it with higher RCU boost with linux-3.5.x-bfs
- I had no such issues with linux-3.6.x-bfs
- now manipulating RCU .config doesn't help with linux-3.7.1-bfs
As a simple user I depend on watching htop, ordered by time, to check that my system(d) is functioning. I also feel very insecure seeing some 50 million hours on some tasks :(
Ciao from Hamburg, Ralph Ulrich
At this point I want to add some experiences I've made with the 3.6.x series + ck/bfs + bfq. At some time suspend-to-disk broke. I tried to investigate further and came to the conclusion that my switch from pure bfs+bfq to ck+bfq made the difference. So... long story short...
On my old machine with 1.4 GHz CPU, 2GB RAM, 3GB shm, 4 GB swap my normal operations AND suspend-to-disk+resume worked again, when
leaving swappiness = 60 (openSUSE kernel-source default; ck = 10 always distorted my desktop latency),
setting dirty_ratio = 6 (default = 20; ck = 1),
setting dirty_background_ratio = 2 (default = 10; ck = 1),
[the last two settings are the lowest known to work with suspend, +1 to be on the safe side].
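For readers who want to try the same values, here is a small Python sketch that renders them as sysctl.conf style lines. The helper name is made up for the example; the three keys are the standard vm sysctls and the values are the ones reported above, which may well not suit other machines.

```python
def render_sysctl(settings):
    """Render a dict of sysctl keys as 'key = value' lines, one per line."""
    return "\n".join(f"{key} = {value}" for key, value in settings.items())


# Values as reported in the comment above; treat them as one user's
# working setup, not a general recommendation.
REPORTED = {
    "vm.swappiness": 60,
    "vm.dirty_ratio": 6,
    "vm.dirty_background_ratio": 2,
}

print(render_sysctl(REPORTED))
```

The printed lines can be appended to /etc/sysctl.conf and loaded with `sysctl -p`, assuming the distribution reads that file.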
I haven't retested this on 3.7.1 as with 3.6.x so far, but it's working fine for 5 days of uptime with 3.7.1 and regular nightly suspends!
Is this difference due to my outdated computer?
Greets and my very best wishes for your 2013 to all of you,
Since I'm not using a swap file (I have enough RAM), I assume these settings would not affect me?
Then, swappiness would not affect you. (Sic swap<->swappiness)
And YOU're not able to suspend to disk.
You can read (...)/linux/Documentation/sysctl/vm.txt for yourself to read up on the topic I wrote about.
dirty_ratio & dirty_background_ratio
determine when old things in Memory get written to disk or are trashed. So, that can affect you, too.
Thanks, I'll try dirty_ratio = 6 and dirty_background_ratio = 2 then.
Looks like there's lots of contradicting opinions on the net on what these values should be set to. In 2007 Linus decreased the ratios to 10/5 but people complained about poor DB performance.
Hi ck, just want to report back that the following issue still exists in 3.7 bfs; I reported it in 3.5 & 3.6
[ 0.126666] kernel/sched/bfs.c:7171 suspicious rcu_dereference_check() usage!
It doesn't impact bfs functionality; you said you would fix it in the next release but may have forgotten to note it down.
Thanks again for the effort in the ck patch, it is running well in 3.7.1.
CK's reply at the time was: "That does indeed look wrong, but the way the data is locked in bfs it won't lead to a problem". I hope he can find some time to fix it anyway.
But it looks untrustworthy for someone like me, who uses htop to get some glimpse of his system performance, if there is constant overflow of times :(
I can provoke earlier or later occurrences of these overflows by different RCU .config options and priorities.
Line 7171 is for_each_domain(cpu, sd)
which is done under grq_lock_irq();
and this only happens once on startup.
It is definitely a false positive and the code is not hit again after startup. The accounting bug is a totally different issue.
PS: I have perfected the low-jitter config on linux, for mainline kernel. Please see the jitter links on my blog. www.paradoxuncreated.com
I also tested BFS during this. It seems the general experience of jitter (lost frames, poor frame timing) is worse with BFS. So if I were you, I'd drop it. Get an Intel E5 workstation on top of this, and even Windows won't stutter.
Unless of course you have some idea you want to realize. But then some measure of fairness seems quite good.
Peace Be With You.
I was looking for actual numbers on your blog. It seems the only thing I've found is your conclusion that with BFS native games run almost as low jitter as CFS, while games under wine perform as good as under a real Windows OS. Which is a good thing :)
Can you, please, elaborate what the poster above indirectly asked for: numbers! And perhaps post them on here as link in your answer?
Thank you, Manuel Krause
Is it an Islamic/Muslim/halal kernel, that you provide? ->After reading your BLOG while keeping up all my "western" tolerance, I'm severely in doubt that this could serve our needs in general.<-
Please, also provide a quick download link for your low-jitter .config (as your BLOG doesn't provide a senseful quick search for the download).
@Manuel you can download the deb from http://paradoxuncreated.com/Blog/wordpress/?p=2268 and then unpack it using "dpkg -x nameofpackage /tmp/" if you want to analyze his .config file
That's not the version of "quick" that I meant...
still need a quick link to the config?
@tvall: Thank you for the uploaded .config file.
I also managed to read to some of Paradox Knows' low-jitter related articles and links (and found some newer config for his local kernel as of 3.6.6).
There are ~ 440 different lines comparing his with my .config. Some things only for SMP & x64 that I don't use. And, of course, WITHOUT BFS (or CK) & WITHOUT BFQ.
Without an outline, about what magic setting makes that kernel "lower-jitter" than BFS/CK + BFQ I feel left in the dark. Sorry, for not checking each of the ~440 diff. lines for now.
BTW, 3.7.3 is working fine on here with BFS + some CK-patches + BFQ.
So again, many thanks and my very best wishes go to Con,
After reading some more & testing...
I can say that Paradox Uncreated's approach of adjusting priorities via schedtool (as in his ljtune script) can still help to achieve better==lower latencies effectively. This at least applies to my low performance system. (Many months ago I asked Con about adjustments of this kind and he said he didn't use it. O.k. ... ^^) And, remember, Paradox Uncreated doesn't use BFS or BFQ. I do and will use both.
My most prominent settings to work fine against "jitter" in audio&video are (leaving all other and kernel related processes @ -20 or 0 as they appear originally):
/usr/bin/schedtool -v -N -n -19 `pgrep "X"`
/usr/bin/schedtool -v -N -n -18 `pgrep "kwin"`
/usr/bin/schedtool -v -N -n -17 `pgrep "pulseaudio"`
I've now put them into a script to be executed during KDE-Startup.
What I additionally want is to limit any mount.ntfs-3g to be started @ nicelevel +10 systemwide.
Does someone of you know how and in which startup-script I need to add *what* ?
Thanks in advance,
bad performance on video encoding with bfs (i have never converted a video before on linux)
(data0001.ts = 487,2MB)
time HandBrakeCLI -i "data0001.ts" -o "data0001.ts.mp4" -E copy:ac3 -e x264 -m -4 -f mp4 -q 20.0 -r 25 --decomb --loose-anamorphic -x ref=4:bframes=4:b-adapt=2:rc-lookahead=60:analyse=all
the cpu usage is higher and the cpu gets ~2°C warmer without bfs
can anybody confirm this? (sorry, bad english....)
if someone is interested....
# CONFIG_NO_HZ is not set
I'm testing the multithreading behaviour of BFS, especially the Hyperthreading awareness.
Here is a sample with the Whetstone MP Benchmark
normal run, 4 threads:
with taskset -c 0,1,2,3:
As you can see the taskset variant is ~17% faster.
My question now is: is there any way to let BFS prioritize the physical cores until we have more than 4 active threads, and then for each new thread use virtual cores 4,5,6,7?
So basically handle my processor as a quad core until we have more than 4 active threads. A behaviour like this would be nice because most of the time we don't use more than 4 cores and the normal BFS behaviour diminishes performance.
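For reference, the restriction that `taskset -c 0,1,2,3` applies can also be set from inside a program. The following is a minimal Python sketch for Linux; the helper names are made up, and whether logical CPUs 0 to 3 really are four separate physical cores depends on the machine's topology, which is an assumption taken from the comment above.

```python
import os


def choose_cpus(wanted, available):
    """Keep only the wanted CPUs that exist; fall back to all available ones."""
    usable = set(wanted) & set(available)
    return usable or set(available)


def pin_current_process(wanted):
    """Restrict the calling process to the wanted CPUs (Linux only)."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # platform without the affinity calls
    cpus = choose_cpus(wanted, os.sched_getaffinity(0))
    os.sched_setaffinity(0, cpus)
    return cpus


# Example call, equivalent in intent to `taskset -c 0,1,2,3`:
# pin_current_process({0, 1, 2, 3})
```

This only limits where the scheduler may place the process; it does not change how BFS chooses among the CPUs that remain allowed.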
I also found a blog entry to this topic:
"BFS bounces the tasks around different thread contexts and often gets the scheduling wrong, leading to lower performance between 2 and 6 instances" ... ouch! :(
It's a very interesting topic. CK should have a look at it. As I remember, BFS now is kind of HT-aware; in task switching, it will stick to the cache-hot core/virtual core.
I have seen this benchmark a long time ago (note he calls CFS CFQ by mistake) and it is a one off test for one particular workload. "Wrong" is too strong a word to describe this behaviour, because it depends largely on what endpoint you're measuring. BFS prioritises latency over throughput and in the relatively unloaded CPU case, BFS shines at its ability to find the earliest available CPU to minimise latency. Doing this sacrifices throughput -slightly- of cache bound throughput intensive workloads. BFS is a scheduler designed to optimise interactivity and responsiveness primarily and to maintain good throughput secondarily. The fact that BFS does better at any throughput benchmark compared to the mainline scheduler is a bonus.
My whole point is to first use the "real" cores and then the siblings. Is there any way to get this behaviour to test latencies etc.
There is no such thing as "real" cores versus siblings. Siblings only become SMT siblings when something is bound to the other thread unit on a core. Yes, BFS already does bind to unused "cores" before trying siblings of busy units. However, if they're all in use, it will then find a sibling in the interests of latency rather than hold off and wait to get back on the same core.
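The preference order described here can be sketched as a toy model in Python. This is not BFS code; the function name, the dictionaries and the CPU numbering are invented to illustrate the idea of trying a fully idle core first and only then an idle sibling of a busy one.

```python
def pick_cpu(busy, siblings):
    """Toy CPU choice: prefer an idle CPU whose SMT sibling is idle too.

    busy:     dict mapping cpu number -> True if something is running there
    siblings: dict mapping cpu number -> the other thread unit on that core
    Returns a cpu number, or None when every CPU is busy.
    """
    idle = [cpu for cpu in sorted(busy) if not busy[cpu]]
    for cpu in idle:
        if not busy[siblings[cpu]]:
            return cpu  # whole core is unused
    # No fully idle core left: take any idle sibling rather than wait.
    return idle[0] if idle else None


# Hypothetical 2-core, 4-thread layout: CPUs 0/2 share a core, as do 1/3.
SIBLINGS = {0: 2, 2: 0, 1: 3, 3: 1}

print(pick_cpu({0: True, 1: False, 2: False, 3: False}, SIBLINGS))  # 1
print(pick_cpu({0: True, 1: True, 2: False, 3: False}, SIBLINGS))   # 2
```

In the second call both cores already have a busy thread, so the model hands out an idle sibling immediately, which mirrors the latency over throughput trade-off described in the reply.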
Thanks for clarifying!
Here is a lengthy article of what I actually meant with "real" cores.
As I already said in my previous response, it does move to unused cores before idle thread siblings on busy cores.
In the last versions, from 3.6.x upwards I think, I have been having problems making backups with rsnapshot (an rsync based backup solution) to my mdadm software raid 5 xfs filesystem. Here you can see a thread where I reported the problem to the xfs mailing list, but it seems to be related to the kernel; could this be a bfs problem?
That could well be. There is a bug striking in weird and wonderful places and this might be the same bug. I'm investigating, thanks.
Thank you for making my desktop usable :), I can test any patch you send me or any update you do in bfs if needed; it's very easy for me to trigger this bug.
Would be interesting to see if BFS 427 has any effect on this bug of yours.
i will test ASAP and post here the results, thanks!
I have tried two consecutive backups and the problem didn't show up, i will be testing the next few days but seems to be fixed or more difficult to trigger.
On the other side i have a new kernel problem but i don't know if maybe can be a VirtualBox driver problem, i pasted here http://pastebin.com/yAKVWk9F
it seems to be failing still, here you have the last error http://pastebin.com/Axb62q0Z
yes, confirmed, I had another one today during normal desktop use, not using rsnapshot. http://pastebin.com/dQMyHp3e
normally i can choose between TICK_CPU_ACCOUNTING or IRQ_TIME_ACCOUNTING
With bfs, both! will be enabled in .config
That could be a very relevant finding with respect to bugs, thanks for pointing it out!
I have to say thanks, for bfs and all the work and effort you invest in it.
All the best for you and your beloved.
This was introduced with BFS 426 for linux-3.7 (the hunk @ line 754) and is not present in the previous versions.
Is there a reason to prefer IRQ_TIME_ACCOUNTING over TICK_CPU_ACCOUNTING? ( @ck: do I read the patch correctly that this is what you intended? ) The config help says, "... so there can be a small performance impact" with it. And, ck, do you know what takes precedence when wrongly having both set to y or if this has side effects?
Thank you, Manuel Krause
The CPU accounting in BFS by default is already using high resolution just like the mainline "IRQ TIME ACCOUNTING" so it actually doesn't change anything. It's just the new visible kernel option in mainline. I've investigated and enabling both does nothing harmful to BFS.
Thank you for the quick reply! :-)
Just to be sure: Does _not_ setting IRQ_TIME_ACCOUNTING but TICK_CPU_ACCOUNTING = y manually do something harmful or anything better?
The accounting for interrupt time becomes less accurate. The code overhead is less but I've been unable to demonstrate a performance difference.
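For anyone wanting to check which of the two accounting options ended up enabled in their own build, here is a small Python sketch that scans .config text. The helper name and the sample text are made up for the example; the two symbol names are the ones discussed in this thread.

```python
def enabled_options(config_text, names):
    """Return the subset of `names` that are set to 'y' in kernel .config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Lines like '# CONFIG_FOO is not set' are comments, not settings.
        if line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key in names and value == "y":
            enabled.add(key)
    return enabled


ACCOUNTING = {"CONFIG_TICK_CPU_ACCOUNTING", "CONFIG_IRQ_TIME_ACCOUNTING"}

# Hypothetical .config fragment, written here only to exercise the function.
sample = (
    "CONFIG_TICK_CPU_ACCOUNTING=y\n"
    "CONFIG_IRQ_TIME_ACCOUNTING=y\n"
    "# CONFIG_NO_HZ is not set\n"
)

print(sorted(enabled_options(sample, ACCOUNTING)))
# ['CONFIG_IRQ_TIME_ACCOUNTING', 'CONFIG_TICK_CPU_ACCOUNTING']
```

To run it against a real build, read the kernel tree's .config file into a string and pass that in place of the sample.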
ck, thank you for your continued contributions.
I tend to find BFS doesn't play nice with nice. When I run a processor-intensive task such as a backup or compile in the background niced so I can keep working on a responsive machine, BFS seems to thrash, pushing the load up, grinding the desktop to a halt, and taking around 3 times as long to complete the task as the normal kernel scheduler.
For this reason I have always avoided BFS-enabled kernels, but the distro I use (PCLinuxOS) has now removed the non-BFS kernels from its repository, so it looks as if I will be faced with a choice between BFS kernels or compiling my own in future.
One of the kernel packagers has suggested I report the problem here.