Monday 11 June 2012

bfs 0.423, 3.4-ck2

A couple of issues showed up with BFS 0.422: one was the "0 load" bug, and the other was a build failure on non-hotplug releases. So here are BFS 0.423 and 3.4-ck2 (which is just ck1 with the BFS update), which should fix both:

3.4-sched-bfs-423.patch

3.4-ck2/

and the increment only:

3.4bfs422-423.patch

Enjoy!

48 comments:

  1. Good experience with linux-3.4.2-bfs-423:
    no slowdown after some hours, as there was with linux-3.3.
    I suspect the earlier sys-time bug had some side effects, but that is just my FUD ...

    Con Kolivas, thank you for your work!
    Ralph Ulrich

    ReplyDelete
  2. Thanks Con - you are really quick :)

    fanthom

    ReplyDelete
  3. @Ralph:
    What did you mean by '-12queuePatches' in "linux-3.4.2-12queuePatches-bfs423-full-ck2" in the other blog thread?! Something off my radar?

    Just gonna reboot now with 3.4.2+bfs-423+BFQ+mm-drop_swap_cache_aggressively.patch.

    Thx,
    Manuel

    ReplyDelete
    Replies
    1. Hi Manuel,
      I don't use BFQ, but I run my system with
      linux-3.4.2
      12 patches from stable-queue
      BFS-423
      all other ck2 patches

      Delete
    2. "12 patches from stable-queue"

      OK, your distro is obviously different from openSUSE. Also, openSUSE patches in things that I'm not aware of anyway.

      Let's give my new kernel some uptime.

      Greets,

      and many thanks to Con Kolivas for providing us with the results of his work

      Manuel

      Delete
  4. BTW, thinking about what Ralph Ulrich wrote in the other blog thread, about feeling the need to learn benchmarking...

    Do we have any tool to really benchmark "interactivity" within Linux desktop systems?
    Isn't it, up to today, just a matter of reports that things got better or worse?

    Manuel

    ReplyDelete
  5. @ Con Kolivas:
    A big THANK YOU for this BFS-423 patch. It really makes a difference. I've now had it running for almost 22h and there is no slowdown, as was noticed with previous kernel & BFS combos (also mentioned by Ralph).
    In addition, the time to recover presumably swapped-out desktop content, after making heavy use of possibly swapped-out shmfs, is greatly reduced here now.
    It also looks like the base CPU load of the usually running processes has dropped.

    I don't have any insight on how this is related to your patch improvement since 422, but: Very nice experience, indeed.

    Again, many thanks for your work!!!
    Manuel

    ReplyDelete
  6. @ Con Kolivas:
    NO, nothing is OK at all.

    I again got a complete system failure (like with 3.4.2 plus BFS 422), just some minutes after my last posting, while only watching something from disk via vlc.

    There's nothing in the logs; the machine hung completely, as last time.

    SLUB? SLAB?
    Please, inspect the differences in the transition in detail.

    For me in the meanwhile, I'll compile with a fresh install and will come back.

    Manuel Krause

    ReplyDelete
  7. Yesterday I again got a complete lockup without any obvious reason (nothing special done, clearly nothing in the logs). Around 24h of uptime again.

    Time to revert to 3.3.8 + H.D. patches and wait for 3.4.3 and BFS 425.

    Manuel

    ReplyDelete
  8. I have been running linux-3.4-bfs-423 for days without errors, and as of today with linux-3.4.3-rc1. This is still better than mainline Linux!

    At the LKML, Hillf Danton is publishing patches every now and then. These are bfs-420 based (named bfs 421), but Linux 3.3 is dead now: end of lifetime.
    Ralph Ulrich
    Hamburg, Germany

    ReplyDelete
  9. If I hadn't had so many random lockups with 3.4.x + BFS, I wouldn't have had to write about them.
    At least 3 lockups in four days. That has never occurred this often, not since before 2.6.39.

    I'm now using 3.3.8 with SLUB (instead of my previous SLAB) + my usual BFS setup + all of Hillf Danton's recent patches. Just to find the same lockups if they're not 3.4.* or BFS 422/423 related.

    If it's kernel 3.4 related, I shouldn't suffer them there.
    Manuel

    ReplyDelete
    Replies
    1. Just a reminder, I do not consider Hillf's patches part of BFS.

      Delete
    2. It's NOT only a matter of consideration.
      If H.D. wants his work to become known, he should pack his stuff together and push it to some website.

      Delete
    3. Why do you insist on using Danton's patches? BFS works great without them.

      Delete
    4. I definitely do not insist on Hillf Danton's patches per se.

      But, Con Kolivas' self-confident reply included, what should I do now if 3.4.2 with BFS aborts after 24h? Without any help from him or others?
      That's the only reason for me to go back to & propagate the last known good.

      @Chen / X: Don't stir up a flame war against H.D. At least his patches work. Your first ones were simple NO-GOs.

      I've now inspected my changed .config really carefully, 3.4.2 vs. 3.3.8, for possible wrong automatic choices. Dunno. There are many diffs, but nothing I'd suspect to be the culprit.

      Manuel

      Delete
    5. Alas, there's nothing to debug. A lockup after 24h without any logs is very non-specific and gives me nothing to work from. How does it compare to 3.4.2 withOUT BFS, with the rest of the config the same?

      Delete
    6. @Manuel
      Have I flamed H.D.? No.
      I am advising H.D. that he should pack up his work and push it to some kind of website (e.g. www.danton.org, Google Code, GitHub, ...). It would be much better for users to be able to review what he has done.
      Chen

      Delete
    7. Try 3.4.1 instead. I also have "lock-ups" with 3.4.2. The lockup comes from X11 (an "EQ overflowing" infinite loop). I have no idea whether this is BFS or NVidia binary driver related. It's so rare that I couldn't be bothered to find out :-P Going to 3.4.1 cured it.

      Delete
    8. @ Con Kolivas:
      Yes really, it's a pity that the system just simply stops working. I would have liked to provide you with some more useful BUG messages if it had been possible.

      Now I'm running 3.4.2 with a slightly different config WITH BFS, as I saw I may have messed up some settings on the way from 3.3.8 and 3.4.1 to 3.4.2. In the dumbest case someone may have cut me off from the web due to a malfunctioning firewall, in which case I would need to apologize for the noise I made. But it's only been up for 3h now.

      In the next step I'd compare with the same kernel withOUT BFS, as you suggested, if it locks up again. BTW, isn't there a kernel command line switch to choose the CPU scheduler (like there is for the I/O schedulers)?

      Thank you for responding,
      Manuel

      Delete
    9. Suggestion:
      Compile the kernel with CONFIG_LOCKDEP=y.
      When it locks up, use the Magic SysRq key
      and have it display all held locks and backtraces of
      all CPUs.
      See here:
      http://en.wikipedia.org/wiki/Magic_SysRq_key

      Delete
    10. My openSUSE kernels predefine DEBUG_KERNEL=y if I set EXPERT=y. And CONFIG_LOCKDEP_SUPPORT=y is then already set, too.
      Did you mean this one? I don't have a plain "CONFIG_LOCKDEP" in 3.4.2.

      But this wouldn't help any further anyway when the _machine_ locks up (it would make a difference only if the kernel alone failed). I've even rechecked that there hadn't been any bad temperature issues, and the hardware hasn't changed for months.

      Thanks, Manuel

      Delete
    11. Mmmh, there's 3.4.3 out now: Should I wait for the next lockup (after only ~9h uptime) or just try the new kernel?

      Manuel

      Delete
  10. But, let's give it some uptime...
    12 hours is nothing.

    Manuel

    P.S. It should read "if they're not 3.4.* or BFS 422/423 related".

    ReplyDelete
    Replies
    1. So, 3.3.8 with BFS & _SLUB_ is now at 26h of uptime.

      That is to say, the lockup after 24h is not caused by SLUB in 3.3.8.

      Manuel

      Delete
  11. Subject: BFS-O(1) is now a correct algorithm.
    Con, please take a look at this mail. ;-)

    ReplyDelete
  12. Recently, with linux-3.4.4rc-bfs, I had some top time overflows at
    rcuc/0
    rcuc/1
    Is this rpc related?

    I am just deleting the one patch,
    rpc_pipefs-allow-rpc_purge_list-to-take-a-null-waitq-pointer,
    which I suspect is related, and will try again ....

    Ralph Ulrich

    ReplyDelete
  13. @ Con Kolivas & all others on here as well:
    I've now spent some days checking and reverting some config changes I made between 3.3.8 and 3.4.2/3, and testing whether the resulting kernels run longer than 24h. One of them hardlocked after almost 32h.

    Then I set CONFIG_JUMP_LABEL back to n (like I had with 3.3.8). And this one, including BFS, ran longer than 49h.

    Would you consider it possible that this option harms BFS, or something else, in such a way that the machine hardlocks? (gcc version is 4.6.2)

    Does someone else have experience with this option?

    Now I'll still need to test the standard scheduler with this option set to y, although I don't like running kernels without BFS. ;-)

    Manuel

    ReplyDelete
  14. Does BFS have some timing dependencies with side effects?

    The last sentence of the help text for CONFIG_JUMP_LABEL
    ("Optimize very unlikely/likely branches"):

    "update of the condition is slower, but those are always very rare."

    Ralph Ulrich
    PS: I had also disabled this option

    ReplyDelete
    Replies
    1. This is a low-level change to how built-in expect functions are compiled by gcc into assembly, utilising an x86 feature. I can't see how this would affect BFS, directly or indirectly.

      Delete
    2. Yes, that is what I wanted to answer to Manuel. And I looked it up when I saw that I myself had this disabled. And I have these rcuc kernel threads whose systime goes crazy. I'll just compile my kernel with CONFIG_JUMP_LABEL enabled. Perhaps this gives BFS more time to behave normally. As the last sentence in the option's help says:
      "update of the condition is slower"
      Ralph Ulrich

      Delete
    3. Con, wasn't it you who looked into the Solaris code, only to recognize it as a saner framework? One could easily conclude that the Linux source is more vulnerable to side effects... Ralph

      Delete
  15. And the other way round: may _this option_ affect the way BFS works?

    IIRC, it made BFS snappier on my old hardware. But that's subjective. Perhaps Ralph would share his experience with us.

    Thank you for your replies!
    Manuel

    ReplyDelete
  16. Manuel, this option normally brings a performance boost. That is why I had disabled it at first, to have a more stable experience. But note the last sentence in the option's help: in rare cases there is a slowdown when updating conditions.

    This must have a side effect on BFS: I have had no issues with BFS since I enabled this JUMP_LABEL optimization.
    Ciao from happy soccer Germany, Ralph

    ReplyDelete
    Replies
    1. The "slow down" it mentions is when the branch is the opposite of what is predicted. The idea is that there is a branch point where we know that 99% of the time we do code A and 1% of the time we do code B. Normally it would cost a little overhead to do code A and a little more overhead to do code B. With this optimisation feature enabled, it costs NO overhead to do code A and MORE overhead than before to do code B.

      This has nothing to do with BFS.

      Delete
  17. Con, our issue here is not performance:
    Without JUMP_LABEL,
    1. Manuel's system halts after a day
    2. for me (shutting down every night),
    ps -e -o pcpu,bsdtime,stat,comm --sort -pcpu
    gives me a rcuc/0 thread with 175 million seconds of run time
    in rare cases (after hours).

    ReplyDelete
    Replies
    1. Just a short correction of the above:
      With BFS and CONFIG_JUMP_LABEL = y my system halts after 23-32h.
      With BFS and CONFIG_JUMP_LABEL = n my system keeps running after 49h.

      Now running the standard scheduler with CONFIG_JUMP_LABEL = y. 16h so far. I hope it breaks soon; I don't like the experience.

      And: I don't claim that it has anything to do with BFS, I only asked whether it possibly could.

      Manuel

      Delete
    2. So what about your last test with CONFIG_JUMP_LABEL, Manuel? Results?

      Delete
    3. Yes, yes. I'm still waiting for 3.4.3 + the standard scheduler with CONFIG_JUMP_LABEL = y to fail. But even with 65h of uptime it keeps running without issues (though, of course, with worse interactivity than BFS). Side note: the CFS kernel does slow down after a certain running time, too.

      Don't know what to do now. Meanwhile I've compiled a 3.4.4 + BFS + CONFIG_JUMP_LABEL, ready for the next reboot.

      Manuel

      Delete
  18. Hi Con, I have hard freezes with BFS.

    I use the zen kernel (with BFS and BFQ, linux 3.4.4). The command "mkfs.ext4 -L Diskname -c -v /dev/sdb1" (SATA drive on an esata connector) leads to a hard freeze within minutes during the bad block test. Not even the Magic SysRq keys work anymore.
    So I compiled zen with CFS: no problem, no freeze.
    Zen with BFS but without BFQ -> freeze.

    So I tested the vanilla kernel 3.4.4 with the CK2 patches -> freeze too.
    Vanilla kernel 3.4.4 with your patches but without BFS -> no freeze!

    PS: I even tested disabling the CONFIG_JUMP_LABEL mentioned in this discussion, with BFS, but it froze too.

    I use openSUSE 12.1, and with the original Tumbleweed kernel 3.4.3 there is no such problem.

    Regards Mike

    ReplyDelete
    Replies
    1. Short Q: in which of your combinations did you test with CONFIG_JUMP_LABEL disabled?
      Regards, Manuel

      Delete
    2. Hi Manuel,

      short answer: only the vanilla kernel with the CK2 patchset and CONFIG_JUMP_LABEL disabled.

      Regards Mike

      Delete
    3. Hi Con,

      some more tests:

      1. Hard freeze with BFS and a USB(2) stick badblock test (command: "mkfs.ext2 -L "Stickname" -m 0 -c -v /dev/sdb1"). Freeze after approx. 5 minutes and 30%. (Remark: with esata drives the freeze comes within 2 minutes.)

      2. Starting in runlevel 3 does not make a difference; it freezes too with the badblock test command (and no other tasks running).

      Btw. running the command with CFS or the new RIFS from Chen does not show this problem.

      Con, if you need additional info or tests for this bug, I could try to help you.

      Thanks and regards
      Mike

      Delete
  19. Just adding to my reports.
    I abandoned the 3.4.3 CFS + JUMP_LABEL test after 3d0h10m. Rock stable, but really, predictably unresponsive.
    The 3.4.4 with BFS (only) + JUMP_LABEL crashed after ~8h.
    Two days ago I had a complete lockup with that kernel+config but WITHOUT JUMP_LABEL after some hours. So it really has nothing to do with that config setting.
    I've tried RIFS, but that doesn't work on non-SMP systems at all.

    I feel a bit unsafe at the moment, when using BFS-patched kernels.

    Manuel

    ReplyDelete
  20. OK, looking at the pattern of lockups people are having, I'm reasonably sure it's the block plugging code, which I changed going into this BFS release. I will put together an update soon that backs those changes out, returning to the old proven mechanism. Thanks everyone for your bug reports.

    ReplyDelete
    Replies
    1. Hi Con,

      thanks for your work. You are right. I recently tested vanilla 3.3.8 with the old -ck1 patchset and the old BFS, and my aforementioned problems with the badblock test during mkfs are completely gone.

      So it's really the new BFS code.

      Thanks so far.
      Regards Mike

      Delete
  21. This comment has been removed by the author.

    ReplyDelete
    Replies
    1. This comment has been removed by the author.

      Delete