Announcing yet another substantial update for BFS for linux-4.7 based kernels.
BFS by itself:
4.7-sched-bfs-490.patch
-ck branded linux-4.7-ck3 patches:
linux-4.7-ck3
Following on from the large update to BFS in 480 to skip lists, numerous regressions became apparent, the bulk of which were related to doing a poor job of signalling cpu load to the various cpufreq governors. Some were affected badly, others not so, but there were plenty of helpful people giving feedback about those regressions which encouraged me to slowly but surely chip away at the problems. Additionally, there were some minor behavioural regressions which were oversights during the updates to BFS 480. Finally the rudimentary cgroup stub patch would crash the system.
As the number of patches required to address these issues got larger and larger, it became hard for people on this blog to keep up with the changes so I've released 490 which hopefully should address the bulk of these issues - there are patches in there that haven't been posted on this blog, but I've included all of them with a brief description in the incremental/ directory for your perusal.
Anyway it is much easier for people to grab the latest version which includes all of those changes, including the updated cgroups stub patch.
EDIT: Here's a patch to make cgroup stubs safer: cgroup-stubs-safe2.patch
Enjoy!
お楽しみ下さい (Please enjoy)
-ck
Thanks Con, you are hacking bfs faster than I can benchmark it!
So I skipped your last testing patches for bfs 480 and went straight to 490.
I've put my results in a google spreadsheet as they are becoming quite big. You can find it here:
https://docs.google.com/spreadsheets/d/1ZfXUfcP2fBpQA6LLb-DP6xyDgPdFYZMwJdE0SQ6y3Xg/edit?usp=sharing
bfs 490 has improved a lot over bfs 480 ! Next I'll test it with interactive=0.
In the meantime I've run linux 4.4+bfs with interactive=1.
I've also updated the results for linux 4.7+cfs. The results I posted previously were for the stock archlinux kernel (4.7.2-1), whereas the kernel running bfs has several config options disabled (NUMA disabled, CONFIG_MCORE2 enabled, DEBUG_KERNEL disabled, FRAME_POINTER disabled, and others...), but that should barely make any difference.
So now, the 'cfs 4.7' kernel has the exact same configuration as the 'bfs' kernels, and the comparison is all the more fair.
Pedro
Thanks very much for doing those Pedro, it looks much more respectable now. As always my mini-hack to get the massively changed cpufreq code working was hopeless and it's only working better now that I did a more comprehensive patch for it.
Thank you for your work.
I've finished testing bfs 490 with interactive=0. Even if it's not the goal of bfs, throughput is indeed better with interactive=0 for single-threaded workloads.
Now regarding the difference in responsiveness, I don't know how to test it. I'll wait for other users' input.
Pedro
I have news: I tested the kernel that was just put in the repo and had only one freeze. I am not sure yet whether it fixes my old problem with ck2 and bfs 480. On the other hand, I get better temperatures with this kernel than with the official kernel and an older ck kernel. Thank you very much for your work. I will test the new kernel more and see whether the problem is fixed or not, because when I went back to the older kernel the freezes went away.
===
kernel/sched/bfs.o: warning: objtool: __schedule()+0x5f1: duplicate frame pointer save
===
I believe that is OK, but just want to let you know.
Con, also, please take a look at this panic:
https://gist.github.com/8c65b2c01f7182eb578dbd9b2ef8ffd3
It occurs after doing poweroff in qemu, and I believe it is related to CPU cgroups support.
Thanks pf!
I'm not sure on the first and this is the second time it's been posted (presumably only shows up on gcc6+), but the second definitely is cgroup related. Can you get a backtrace for both of those?
gdb vmlinux
list *__schedule()+0x5f1
and
gdb vmlinux
list *sched_offline_group+0x2a
Thanks!
Sure, but I should recompile kernel locally instead of having it in OBS. Will re-check this in several hours.
Re-compiled the kernel with debug info. Trying to do what you've asked for, but I get this:
===
(gdb) list *__schedule()+0xa1d
You can't do that without a process to debug.
===
What am I doing wrong?
Also, the relevant panic from the debug kernel:
https://gist.github.com/b89d670535b160b7648d1cd5b16fadf0
And info obtained from addr2line:
https://gist.github.com/7e7d152dcdde40470257bf58bfdf37e1
Hope this helps.
Oh, managed that. Please, see "list" output:
===
(gdb) list *(__schedule+0xa1d)
0xba1d is in __schedule (kernel/sched/bfs.c:2237).
2232 * do an early lockdep release here:
2233 */
2234 spin_release(&grq.lock.dep_map, 1, _THIS_IP_);
2235
2236 /* Here we just switch the register state and the stack. */
2237 switch_to(prev, next, prev);
2238 barrier();
2239
2240 return finish_task_switch(prev);
2241 }
===
Also:
===
(gdb) list *(sched_offline_group+0x2a)
0x977a is in sched_offline_group (include/linux/list.h:89).
84 * This is only for internal list manipulation where we know
85 * the prev/next entries already!
86 */
87 static inline void __list_del(struct list_head * prev, struct list_head * next)
88 {
89 next->prev = prev;
90 WRITE_ONCE(prev->next, next);
91 }
92
93 /**
===
Let me know if you need additional info.
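A note on the gdb syntax in this exchange: the form "list *__schedule()+0xa1d" failed with "You can't do that without a process to debug", while the parenthesised form "list *(__schedule+0xa1d)" worked against the static vmlinux. As a minimal sketch, a location copied from an objtool warning or a call trace can be rewritten into the working form. The helper name gdb_list_command and its regular expression are illustrative assumptions, not part of gdb or any kernel tooling.

```python
import re

# Accepts locations as they appear in this thread, for example
# "__schedule()+0x5f1" (objtool warning) or "dump_stack+0x65/0x92" (call trace).
# The pattern is an assumption based on those examples only.
_LOCATION = re.compile(
    r"^\s*(?P<sym>[A-Za-z_][\w.]*)"   # symbol name
    r"(?:\(\))?"                      # optional "()" as printed by objtool
    r"\+(?P<off>0x[0-9a-fA-F]+)"      # offset into the function
    r"(?:/0x[0-9a-fA-F]+)?\s*$"       # optional "/size" from a call trace
)


def gdb_list_command(location: str) -> str:
    """Return a gdb 'list' command for a 'symbol+0xoffset' location.

    The parentheses keep gdb from treating the symbol as a function call,
    which would need a live process.
    """
    match = _LOCATION.match(location)
    if match is None:
        raise ValueError(f"unrecognised location: {location!r}")
    return f"list *({match.group('sym')}+{match.group('off')})"


if __name__ == "__main__":
    for loc in ("__schedule()+0x5f1", "sched_offline_group+0x2a"):
        print(gdb_list_command(loc))
```

The printed commands can be pasted into a "gdb vmlinux" session, provided vmlinux was built with debug info as described above.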
Hmm, parser ate my comment!
GCC6 seems to miscompile sometimes, for example firefox crashes with it too: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=836533
I have a very similar stack trace while compiling from graysky's AUR package at 9c78234 (4.7.3-3). I'll try to build the kernel without cgroups and see how that goes.
By the way, the panic happens with both GCC 6.2.1 and 5.4.0.
Thanks pf. The duplicate frame pointer save warning is okay. I've posted a safe cgroups patch in the top post to address the other crash (generically.)
Con, the second patch, the one that removes the code, seems to fix the panic; I cannot trigger it anymore.
I hope that is a reasonable solution.
Great thanks. Hopefully that fixes all the crashes people were experiencing so I can move on and do more fun development :)
Nothing helps, it's just weird and I can't do anything. That was the fourth freeze, this time without games or a browser, just watching videos. I have gone back to the official kernel again.
Did you enable the cgroup stub feature?
I installed Arch and left the default options, except for dirty bytes and the I/O scheduler for the SSD and HDD. The other thing is that I use the Haswell linux-ck kernel from graysky's repo. I am not sure whether it is enabled, sorry if I am not much help. You can see the options used on the linux-ck AUR page.
Another thing is that linux ck1 doesn't give me any freezes, while ck2 and ck3 do. The problem started with the change to bfs 480.
Thanks, probably still a bug somewhere in the new core code then. I'll keep looking but hopefully someone will capture a crash/backtrace for me to know where the problem is.
I am also experiencing these freezes, around 4-5 times now, while playing games, watching videos... I am using linux-ck-haswell 4.7.3
Same repo. Still enables the experimental cgroups feature which I know now is unstable. I've asked graysky to disable them but he's currently busy.
I asked him too, since others have fewer errors when cgroups is disabled. I will test the new kernel when he puts it in the repo and will post here whether the error is gone or not. For now I am using bfs 472 with the older kernel on the Intel PC. The laptops don't have these freezes, and I don't know why there is a difference with the same Arch install and the same packages. The difference is the type of CPU.
On the AUR page graysky has changed the PKGBUILD so that the cgroups patches are disabled; it is only a matter of time before the repo gets the new version. I will test it in a few hours or tomorrow.
Con, with the new BFS 490 after every suspend / resume this appears in the logs:
CPU: 1 PID: 16 Comm: migration/1 Tainted: G OE 4.7.3-bfs-skp #1
Hardware name: Dell Inc. XPS L521X/0880F2, BIOS A16 12/17/2013
0000000000000286 00000000a55cfa0b ffff88044caf7e48 ffffffff813e79f3
0000000000000001 ffffffff81cc5b28 ffff88044caf7e78 ffffffff81406895
ffff88044cae9800 ffff88044e801280 ffffffff81e59a20 0000000000000001
Call Trace:
[] dump_stack+0x65/0x92
[] check_preemption_disabled+0xe5/0xf0
[] debug_smp_processor_id+0x17/0x20
[] smpboot_thread_fn+0x173/0x230
[] ? sort_range+0x30/0x30
[] kthread+0xd8/0xf0
[] ret_from_fork+0x1f/0x40
[] ? kthread_worker_fn+0x180/0x180
smpboot: CPU 1 is now offline
I had checked the logs and even 4.7.2 + 480 gave me some sort of similar dumps in the logs.
CPU: 0 PID: 9 Comm: migration/0 Tainted: G OE 4.7.2-bfs-skp #1
Hardware name: Dell Inc. XPS L521X/0880F2, BIOS A16 12/17/2013
0000000000000086 000000003cd7157e ffff88044c997d88 ffffffff813d8bf3
0000000000000000 0000000000000000 ffff88044c997dc8 ffffffff8108178b
0000007d4c997e50 0000000000000001 ffff88045f217dd0 0000000000017d00
Call Trace:
[] dump_stack+0x63/0x90
[] __warn+0xcb/0xf0
[] warn_slowpath_null+0x1d/0x20
[] native_smp_send_reschedule+0x3e/0x40
[] wake_smt_siblings+0x70/0x80
[] __schedule+0xa01/0xcd0
[] schedule+0x35/0xc0
[] smpboot_thread_fn+0xc0/0x160
[] ? sort_range+0x30/0x30
[] kthread+0xd8/0xf0
[] ret_from_fork+0x1f/0x40
[] ? kthread_create_on_node+0x1a0/0x1a0
---[ end trace 639864e7b4173949 ]---
smpboot: CPU 1 is now offline
As the system was not affected in any visible way, I didn't even know the errors were there.
With BFS 472 there were no such errors.
br, Eduardo
Additionally, this gets printed to dmesg before the trace:
[ 3005.418397] Removed affinity for 631 processes to cpu 1
[ 3005.418400] BUG: using smp_processor_id() in preemptible [00000000] code: migration/1/16
[ 3005.418405] caller is debug_smp_processor_id+0x17/0x20
br, Eduardo
The "using smp_processor_id() in preemptible code" BUG during the suspend/resume cycle is a long-standing bug in mainline. I don't know why there is no fix for it in the releases; maybe it is not triggered with cfs. Here is my fix for it, which I wrote last month because I couldn't stand it any longer. It's in my new -test branch but not yet released.
commit 5894464e9238b794a03713046d50e5972d526fa2
Author: Alfred Chen
Date: Tue Aug 30 11:11:36 2016 +0800
bfs: Fix mainline smp_processor_id() called in preempt code issue
diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index 13bc43d..fc0d8270 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -122,12 +122,12 @@ static int smpboot_thread_fn(void *data)
 		if (kthread_should_park()) {
 			__set_current_state(TASK_RUNNING);
-			preempt_enable();
 			if (ht->park && td->status == HP_THREAD_ACTIVE) {
 				BUG_ON(td->cpu != smp_processor_id());
 				ht->park(td->cpu);
 				td->status = HP_THREAD_PARKED;
 			}
+			preempt_enable();
 			kthread_parkme();
 			/* We might have been woken for stop */
 			continue;
Alfred, I'll cherry-pick this one for -pf :). Thanks!
Thx Alfred, I'll incorporate this into my build as well.
You're spot on there Alfred. I think even if you don't trigger it on mainline the code so clearly violates the preempt disabled requirement in such a short space that you could just generically submit a patch for it to mainline anyway.
This would be a good example of how bfs can trigger issues that the mainline scheduler may not notice.
I will submit the patch, or someone please help to submit it. Right now, the skiplist in bfs is a lot more interesting than this :)
@Alfred:
If the issue the patch addresses is so longstanding and so obvious, as Con states, you should simply post it to LKML, maybe with links to the relevant supporting posts.
It's not satisfying to know of a BUG people may encounter, but need not.
BR, Manuel Krause
@Alfred. I can submit it if you like.
@ck
Thanks, I got your submission email.
CK, I ran some tests with and without the pstate driver. With both of them (pstate and cpufreq) the kernel keeps freezing. I don't know if it is a problem with memory (I have 8GB) or with swapping (I have 50 in vm for swap), but the problem is there; with the older kernel no freezes occur. I think I will go back to the older version to see whether it is a new problem that only affects me or a new config on my PC. I will report back. Sorry for not being more helpful.
Nothing: with the older kernel there isn't any kind of freeze. I have been writing and gaming here for 30 minutes with the same settings and the kernel build from graysky's repo.
Hey Alberto. I checked graysky's repos and they enable the cgroups feature which is still unstable so perhaps that's where your problem is coming from with this latest kernel (though other bugs may also be present that I don't know about.)
I was thinking about this and I have just realised that I have C-states enabled in the BIOS. It is the only thing that is different from my other 2 laptops, one with an older Intel Core 2 Duo and the other with AMD. Could that be the problem? With the change in bfs behaviour, maybe cores idling in the C7 or C6 state don't wake up; this is one of the things that might cause the freeze. I am not sure the problem is cgroups, because I have the same graysky repo on the other 2 laptops, one with the piledriver kernel and the other with the Core 2 Duo version, and they haven't given me any freezes. Thanks again. I'll post on the AUR page that the cgroups patch is still unstable, and I'll wait for another build to test and let you know the result.
If You are talking about CONFIG_CGROUP_SCHED, then I have it, but I don't experience any freezes or crashes on my Dell XPS 15 (i7 CPU).
My kernel is 500Hz, BFS 490 patch, Generic 64bit compilation, low latency desktop preemption.
Br, Eduardo
@ck You are saying that cgroups support might be the reason for all these bugs, but I am not sure that is really the reason, as it works fine with BFS 472.
I'm not saying it's the reason for ALL the bugs but it's a KNOWN reason. It can't have worked fine with bfs472 because there was no way to enable CONFIG_CGROUP_SCHED in bfs472.
Have you enabled C-states in the BIOS? I have not tried disabling them because I downgraded the kernel before I thought of that.
The c states won't be responsible. There's a real bug there somewhere in the BFS patch.
I did some tests of bfs vs cfs about throughput, but bfs is about minimizing latencies. I remember that a while back someone posted about this tool :
https://github.com/iovisor/bcc/blob/master/tools/runqlat_example.txt
So I gave it a try on two linux 4.7.2 kernels with the same config. One has cfs the other has bfs490. The bfs kernel is compiled with SMT_NICE and CGROUP_SCHED disabled.
I wrote a basic script that loads the cpu (i5-3210m) by building ffmpeg with make -j4, waits a few seconds, and then runs 'runqlat 10 2'. The build takes place in /dev/shm to prevent disk io from interfering with the results.
The raw results are:
for cfs: http://pastebin.com/gS8FfnmY
for bfs490+interactive=1: http://pastebin.com/PsFwvVFn
for bfs490+interactive=0: http://pastebin.com/zhqV0Kpe
One can do all kinds of maths and graphs with this (mean, std dev, median, ...), but before I do, I must first dust off my maths and then find out whether these results and the test are of any relevance (I think they are, but I know little about task scheduling and cpus).
Con and other users, what do you think of it ?
Thanks
Pedro
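As a rough sketch of the kind of maths mentioned above, the power-of-two histogram that runqlat prints ("low -> high : count" rows in usecs) can be reduced to an approximate mean and a median bucket. The function names are illustrative, the SAMPLE numbers are invented for the example and are not taken from the pastebins, and the mean is only as precise as the bucket widths because it uses bucket midpoints.

```python
import re
from typing import List, Tuple

Bucket = Tuple[int, int, int]  # (low_usecs, high_usecs, count)

# One histogram row, e.g. "     4 -> 7          : 60       |*****     |"
_ROW = re.compile(r"^\s*(\d+)\s*->\s*(\d+)\s*:\s*(\d+)")


def parse_histogram(text: str) -> List[Bucket]:
    """Collect (low, high, count) rows from one runqlat histogram.

    Caveat: a run with several intervals prints one histogram per interval,
    so split the output per histogram before calling this.
    """
    buckets = []
    for line in text.splitlines():
        match = _ROW.match(line)
        if match:
            low, high, count = (int(g) for g in match.groups())
            buckets.append((low, high, count))
    return buckets


def approx_mean(buckets: List[Bucket]) -> float:
    """Mean latency in usecs, estimated from bucket midpoints."""
    total = sum(count for _, _, count in buckets)
    if total == 0:
        raise ValueError("histogram is empty")
    return sum((low + high) / 2 * count for low, high, count in buckets) / total


def percentile_bucket(buckets: List[Bucket], fraction: float) -> Tuple[int, int]:
    """Return the (low, high) bucket containing the given fraction of samples."""
    total = sum(count for _, _, count in buckets)
    if total == 0:
        raise ValueError("histogram is empty")
    running = 0
    for low, high, count in buckets:
        running += count
        if running >= fraction * total:
            return (low, high)
    return buckets[-1][:2]


# Invented numbers, for illustration only.
SAMPLE = """
     usecs               : count     distribution
         0 -> 1          : 10       |**        |
         2 -> 3          : 30       |*****     |
         4 -> 7          : 60       |**********|
"""

if __name__ == "__main__":
    rows = parse_histogram(SAMPLE)
    print("approx mean (usecs):", approx_mean(rows))
    print("median bucket:", percentile_bucket(rows, 0.5))
```

Reporting the median as a bucket rather than a single number avoids implying more precision than the histogram actually carries.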
Ok I've put the data in the google spreadsheet:
https://docs.google.com/spreadsheets/d/1ZfXUfcP2fBpQA6LLb-DP6xyDgPdFYZMwJdE0SQ6y3Xg/edit?usp=sharing
It's a bit messy though.
Pedro
Thanks. Those qlat graphs are mildly interesting, but they're measuring microlatencies which would be unnoticeable by humans. If any of them went beyond the 6 millisecond range they'd start being noticeable. It's good to see the values bound for both schedulers. The results might be more interesting as load is increased progressively further with higher make -j values. Additionally recent BFS patches have not been optimal due to issues with cpu frequency signalling so either try an older BFS, say 469 on linux 4.5, or the latest BFS 490, or disable cpufreq scaling by setting a performance governor while doing tests. Thanks!
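The 6 millisecond rule of thumb above can be checked mechanically against a histogram. The sketch below assumes the buckets are already available as (low_usecs, high_usecs, count) tuples; the function name and the example numbers are made up for illustration. Because the buckets are powers of two, the 4096 -> 8191 bucket straddles 6000 usecs, so it is reported separately rather than guessed at.

```python
from typing import Iterable, Tuple

Bucket = Tuple[int, int, int]  # (low_usecs, high_usecs, count)


def count_over_threshold(buckets: Iterable[Bucket],
                         threshold_usecs: int = 6000) -> Tuple[int, int]:
    """Return (definitely_over, possibly_over) sample counts.

    definitely_over: samples in buckets that lie entirely above the threshold.
    possibly_over:   samples in the bucket that straddles the threshold, where
                     the histogram cannot say on which side they fall.
    """
    definitely_over = 0
    possibly_over = 0
    for low, high, count in buckets:
        if low > threshold_usecs:
            definitely_over += count
        elif high > threshold_usecs:
            possibly_over += count
    return definitely_over, possibly_over


if __name__ == "__main__":
    # Invented counts, for illustration only.
    example = [(2048, 4095, 5), (4096, 8191, 3), (8192, 16383, 2)]
    over, maybe = count_over_threshold(example)
    print(f"{over} samples over 6 ms, {maybe} more in the straddling bucket")
```

If both counts are zero for every -j value tested, the latencies stayed below the threshold at which they would start being noticeable.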
Bad news: I have just tested the new version in graysky's repo and the kernel is freezing too, so cgroups wasn't the problem. I looked at journalctl to see if there was a problem, and there isn't any error in the log before the freeze.
@Alberto, @Con,
I have freezes as well with 4.7.3 + BFS490 (CGROUPS enabled), but I can trigger them 100% (at least it seems so) by changing laptop brightness (press fn+brightness up/down and here we go), locks up w/o blinking caps lock.
BUT, I can trigger that in Ubuntu 16.10 and NOT in 16.04. Of course like a LOT has changed in 16.10 compared to 16.04, so it's difficult to say (for me at least) what is the reason, but in 16.04 I haven't had any lockups so far.
I compile the kernel in 16.04 with GCC 5.4 (the default which comes with 16.04), then install it on 16.04 and 16.10. Alfred has Arch, so he most likely has the newest stable packages for everything; 16.10 is in development, so it has rather new stuff as well. That's the only thing in common.
What can I do to help You to track down the problem?
Btw, Con, which distro/version do You use for development/testing?
br, Eduardo
Did you add the cgroups safe2 patch?
Deleteit is supposed to be disabled by gravysky in the pkgbuild.. and the freeze is totally, i can't do nothing, reisub is not function and the brightness can't be controled because is a desktop where I have the problems, I am testing now if the freeze only occurs with the game or can be with other things(because the other kernel with 480 and 490 bfs patch freeze many times but I have no time with this kernel for tried this.. but I think the freeze will occur with browsing and gaming too.. I will post the results as soon I will able to do, in the forum or arch there are many people with haswell desktop and this problem, but with laptop haswell appear not to be freezes
Thanks alberto. In that case it seems to be somehow haswell related. Was bfs472 stable for you? If so I'll have to post a couple of different patchsets to see where the culprit lies.
Yes, the bfs 472 patch was very stable for me, no freezes there. Another thing: I was just testing ck again, and the freezes still come with browsing too. Also, people have appeared with other architectures (Silvermont, for example) who get these freezes with rsync. I get the freezes with gaming, browsing and playing videos. I don't know why.
https://bbs.archlinux.org/viewtopic.php?id=111715&p=113
(see the latest pages)
Alberto it looks like others are affected withOUT -ck patches. It could just be that the latest BFS makes it happen more easily and that it's a bug from mainline. I've looked hard at all the code I put in and can't see anything wrong so far. Give -ck4 a try when it comes out (shortly.)
@Con,
safe2 seems ok. I didn't notice safe2 was available, sorry.
br, Eduardo
No problem. It's hard to keep up when I'm hacking this aggressively... ck4 is about to be released.
Only a minimal number of people are affected on mainline, and that is surely related to other factors. I have freezes ONLY with the ck kernel. I have been working for 3 or 4 hours without any freeze with bfs 472 and with the official kernel; since bfs 480 I cannot work for more than 1 hour before the system freezes totally. I am sure the problem is related to the patched kernel because it always occurs with that kernel. I will test the new patch and tell you if the problem continues. Another thing is that on my other 2 laptops, the Intel Core 2 Duo and the piledriver cpu, there aren't any freezes, only on my Haswell desktop.
The log stops being written before the freeze, or the freeze is not logged in any form.
Ok I've run the tests with cfs+acpi-cpufreq+performance and bfs490+acpi-cpufreq+performance, which lock the cpu frequency at maximum.
Here are the results for increasing -j values.
for bfs: http://pastebin.com/MSKHPYa0
for cfs: http://pastebin.com/Bdjx6Dnp
The latencies are higher and the distribution clearly becomes bimodal. Maybe it is because of kernel processes that run at higher priority.
The upper bound is higher with cfs than with bfs at small -j values.
I've also put those data in the spreadsheet.
With the runqlat utility it is possible to monitor the runqueue latencies for a specific PID. I was thinking of loading the system with a make, and then monitoring a specific process like a movie player or web browser. Would this be interesting ?
Pedro
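For anyone wanting to repeat the sweep over -j values, or to try the per-PID idea, here is a small sketch that only builds the command lines. It assumes runqlat is on the PATH under that name and accepts "-p PID" followed by an interval and a count, as in the bcc example file linked earlier; the helper names are made up, and runqlat normally needs root.

```python
import shlex
from typing import List, Optional


def make_command(jobs: int) -> List[str]:
    """Command line for a parallel build with the given -j value."""
    if jobs < 1:
        raise ValueError("jobs must be at least 1")
    return ["make", f"-j{jobs}"]


def runqlat_command(interval_s: int, count: int,
                    pid: Optional[int] = None) -> List[str]:
    """Command line for runqlat; pass pid to trace a single process only."""
    cmd = ["runqlat"]
    if pid is not None:
        cmd += ["-p", str(pid)]
    cmd += [str(interval_s), str(count)]
    return cmd


if __name__ == "__main__":
    # Example sweep; the -j values and the PID are placeholders.
    for jobs in (4, 6, 8, 16):
        print(shlex.join(make_command(jobs)))
    print(shlex.join(runqlat_command(10, 2)))
    print(shlex.join(runqlat_command(10, 2, pid=1234)))
```

Keeping the commands as lists means they can be passed straight to subprocess.Popen without going through a shell.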
Hey Pedro those latency figures are VERY interesting because this is now showing how BFS is able to keep the latencies bound to under human perception rates while those on CFS start to blow out when the load is only 50% higher than your number of CPUs. I don't think you need to test specific processes as you're already getting the results you need. Out of curiosity was the sched autogroups feature enabled on CFS?
Thanks for the explanation, and yes SCHED_AUTOGROUP was enabled on cfs.
Pedro
That is probably why CFS looks ok when the load gets ridiculously high. It's not a fair comparison with that enabled.