-ck hacking: BFS 490, linux-4.7-ck3

Wednesday, 7 September 2016

Announcing yet another substantial update for BFS for linux-4.7 based kernels.

BFS by itself:
4.7-sched-bfs-490.patch

-ck branded linux-4.7-ck3 patches:
linux-4.7-ck3

Following on from the large update to BFS in 480 to use skip lists, numerous regressions became apparent, the bulk of which were related to doing a poor job of signalling CPU load to the various cpufreq governors. Some were affected badly, others not so, but there were plenty of helpful people giving feedback about those regressions, which encouraged me to slowly but surely chip away at the problems. Additionally, there were some minor behavioural regressions which were oversights during the updates to BFS 480. Finally, the rudimentary cgroup stub patch would crash the system.

As the number of patches required to address these issues got larger and larger, it became hard for people on this blog to keep up with the changes, so I've released 490, which should hopefully address the bulk of these issues. There are patches in there that haven't been posted on this blog, but I've included all of them with a brief description in the incremental/ directory for your perusal.

Anyway it is much easier for people to grab the latest version which includes all of those changes, including the updated cgroups stub patch.

EDIT: Here's a patch to make cgroup stubs safer: cgroup-stubs-safe2.patch

Enjoy!
-ck

65 comments:

Anonymous, 7 September 2016 at 23:48
Thanks Con, you are hacking bfs faster than I can benchmark it!
So I skipped your latest test patches for bfs 480 and went straight to 490.

I've put my results in a google spreadsheet as they are becoming quite big. You can find it here:
https://docs.google.com/spreadsheets/d/1ZfXUfcP2fBpQA6LLb-DP6xyDgPdFYZMwJdE0SQ6y3Xg/edit?usp=sharing

bfs 490 has improved a lot over bfs 480! Next I'll test it with interactive=0.

In the meantime I've run linux 4.4+bfs with interactive=1.
I've also updated the results for linux 4.7+cfs. The results I posted previously were for the stock Arch Linux kernel (4.7.2-1), whereas the kernel running bfs has several config options changed (NUMA disabled, CONFIG_MCORE2 enabled, DEBUG_KERNEL disabled, FRAME_POINTER disabled, and others...), though that should barely make any difference.
So now the 'cfs 4.7' kernel has the exact same configuration as the 'bfs' kernels, and the comparison is all the more fair.

Pedro

ck, 8 September 2016 at 00:33
Thanks very much for doing those Pedro, it looks much more respectable now. As always my mini-hack to get the massively changed cpufreq code working was hopeless and it's only working better now that I did a more comprehensive patch for it.

alberto gomez marin, 8 September 2016 at 10:56
I have news: I tested the kernel that was just put in the repo and had only one freeze. I am not sure yet whether it fixes my old problem with ck2 and BFS 480. On the other hand, I get better temperatures with this kernel than with the official kernel or an older ck kernel. Thank you very much for your work. I will test the new kernel more to see whether the problem is fixed, because when I went back to the older kernel the freezes went away.

Oleksandr Natalenko, 8 September 2016 at 19:39
===
kernel/sched/bfs.o: warning: objtool: __schedule()+0x5f1: duplicate frame pointer save
===

I believe that is OK, but just want to let you know.

Oleksandr Natalenko, 8 September 2016 at 19:59
Con, also, please take a look at this panic:

https://gist.github.com/8c65b2c01f7182eb578dbd9b2ef8ffd3

It occurs after doing poweroff in qemu, and I believe it is related to CPU cgroups support.

alberto gomez marin, 9 September 2016 at 00:11
No luck. It's just weird and there is nothing I can do: that is now the fourth freeze, this time with no games or browser running, just watching videos. I have gone back to the official kernel again.

Anonymous, 9 September 2016 at 18:03
Con, with the new BFS 490 after every suspend / resume this appears in the logs:

CPU: 1 PID: 16 Comm: migration/1 Tainted: G OE 4.7.3-bfs-skp #1
Hardware name: Dell Inc. XPS L521X/0880F2, BIOS A16 12/17/2013
0000000000000286 00000000a55cfa0b ffff88044caf7e48 ffffffff813e79f3
0000000000000001 ffffffff81cc5b28 ffff88044caf7e78 ffffffff81406895
ffff88044cae9800 ffff88044e801280 ffffffff81e59a20 0000000000000001
Call Trace:
[] dump_stack+0x65/0x92
[] check_preemption_disabled+0xe5/0xf0
[] debug_smp_processor_id+0x17/0x20
[] smpboot_thread_fn+0x173/0x230
[] ? sort_range+0x30/0x30
[] kthread+0xd8/0xf0
[] ret_from_fork+0x1f/0x40
[] ? kthread_worker_fn+0x180/0x180
smpboot: CPU 1 is now offline

I checked the older logs, and even 4.7.2 + 480 gave me similar dumps:

CPU: 0 PID: 9 Comm: migration/0 Tainted: G OE 4.7.2-bfs-skp #1
Hardware name: Dell Inc. XPS L521X/0880F2, BIOS A16 12/17/2013
0000000000000086 000000003cd7157e ffff88044c997d88 ffffffff813d8bf3
0000000000000000 0000000000000000 ffff88044c997dc8 ffffffff8108178b
0000007d4c997e50 0000000000000001 ffff88045f217dd0 0000000000017d00
Call Trace:
[] dump_stack+0x63/0x90
[] __warn+0xcb/0xf0
[] warn_slowpath_null+0x1d/0x20
[] native_smp_send_reschedule+0x3e/0x40
[] wake_smt_siblings+0x70/0x80
[] __schedule+0xa01/0xcd0
[] schedule+0x35/0xc0
[] smpboot_thread_fn+0xc0/0x160
[] ? sort_range+0x30/0x30
[] kthread+0xd8/0xf0
[] ret_from_fork+0x1f/0x40
[] ? kthread_create_on_node+0x1a0/0x1a0
---[ end trace 639864e7b4173949 ]---
smpboot: CPU 1 is now offline

As the system was not affected in any visible way, I didn't even know the errors were there.

With BFS 472 there were no such errors.

br, Eduardo

alberto gomez marin, 9 September 2016 at 23:00
CK, I ran some tests with and without the pstate driver. With both of them (pstate and cpufreq) the kernel continues freezing. I don't know if it is a problem with memory (I have 8GB) or with swapping (I have 50 in vm for swap), but the problem is still here, whereas with the older kernel no freezes occur. I think I will go back to the older version to see whether this is a new problem that only affects me or something new in my PC's configuration. I will report back. Sorry for not being more helpful.

alberto gomez marin, 10 September 2016 at 04:54
Have you enabled C-states in the BIOS? I have not tried disabling them because I downgraded the kernel before I thought of that.

Anonymous, 10 September 2016 at 07:40
I did some tests of bfs vs cfs on throughput, but bfs is about minimizing latencies. I remember that a while back someone posted about this tool:
https://github.com/iovisor/bcc/blob/master/tools/runqlat_example.txt
So I gave it a try on two linux 4.7.2 kernels with the same config. One has cfs, the other has bfs490. The bfs kernel is compiled with SMT_NICE and CGROUP_SCHED disabled.
I wrote a basic script that loads the cpu (i5-3210m) by building ffmpeg with make -j4, waits a few seconds, and then runs 'runqlat 10 2'. The build takes place in /dev/shm to prevent disk io from interfering with the results.
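
For anyone wanting to reproduce this, here is a minimal sketch of what such a script could look like in Python. The original script was not posted, so everything except `make -j4` and `runqlat 10 2` is an assumption: the source directory `/dev/shm/ffmpeg`, the 5 second settle time and the helper names are all hypothetical.

```python
import subprocess
import time


def build_cmd(jobs, src_dir):
    """Command that loads the CPU: a parallel build inside src_dir."""
    return ["make", "-C", src_dir, "-j{}".format(jobs)]


def runqlat_cmd(interval=10, count=2):
    """Command that prints `count` histograms, one every `interval` seconds."""
    return ["runqlat", str(interval), str(count)]


def measure(jobs=4, src_dir="/dev/shm/ffmpeg", settle=5):
    """Start the build, wait for the load to ramp up, then capture runqlat output.

    src_dir and settle are assumed values, not taken from the original test.
    """
    build = subprocess.Popen(build_cmd(jobs, src_dir),
                             stdout=subprocess.DEVNULL,
                             stderr=subprocess.DEVNULL)
    try:
        time.sleep(settle)
        result = subprocess.run(runqlat_cmd(), stdout=subprocess.PIPE,
                                text=True, check=True)
        return result.stdout
    finally:
        # Stop make itself; compiler children may linger for a moment.
        build.terminate()
        build.wait()
```

Calling `print(measure())` would run one measurement. runqlat typically needs root, and the build tree should be cleaned between runs so that every run compiles the same amount of code.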

The raw results are:
for cfs: http://pastebin.com/gS8FfnmY
for bfs490+interactive=1: http://pastebin.com/PsFwvVFn
for bfs490+interactive=0: http://pastebin.com/zhqV0Kpe

One can do all kinds of maths and graphs with this (mean, std dev, median, ...), but before I do, I must first dust off my maths and then find out whether these results and the test are of any relevance (I think they are, but I know little about task scheduling and cpus).
Con and other users, what do you think of it?
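
As a starting point for the maths, here is a rough sketch that turns a runqlat-style histogram into a sample count, an approximate mean and std dev, and the bucket holding the median. It assumes the usual `low -> high : count` row layout and treats every sample as sitting at the midpoint of its bucket, so the figures are estimates only. The `SAMPLE` data is made up for illustration and is not taken from the pastebins.

```python
import math
import re

# Matches rows such as "   4 -> 7   : 120   |****   |"
ROW = re.compile(r"^\s*(\d+)\s*->\s*(\d+)\s*:\s*(\d+)")

# Made-up example data, not real measurements.
SAMPLE = """
     usecs               : count     distribution
         0 -> 1          : 10       |**********          |
         2 -> 3          : 20       |********************|
         4 -> 7          : 10       |**********          |
"""


def parse_histogram(text):
    """Return a list of (low, high, count) buckets found in the text."""
    buckets = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            buckets.append(tuple(int(g) for g in match.groups()))
    return buckets


def summarize(buckets):
    """Approximate statistics, using each bucket's midpoint as its value."""
    total = sum(count for _, _, count in buckets)
    if total == 0:
        raise ValueError("empty histogram")
    mids = [((low + high) / 2.0, count) for low, high, count in buckets]
    mean = sum(mid * count for mid, count in mids) / total
    variance = sum(count * (mid - mean) ** 2 for mid, count in mids) / total
    seen = 0
    median_bucket = None
    for low, high, count in buckets:
        seen += count
        if seen * 2 >= total:  # first bucket reaching half of all samples
            median_bucket = (low, high)
            break
    return {"samples": total, "mean": mean,
            "stddev": math.sqrt(variance), "median_bucket": median_bucket}


stats = summarize(parse_histogram(SAMPLE))
```

Because the buckets are powers of two wide, the mean and std dev get less precise as latencies grow. The median bucket and the highest non-empty bucket are probably the more trustworthy figures to compare between schedulers.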

Thanks
Pedro

alberto gomez marin, 12 September 2016 at 07:23
Bad news: I have just tested the new version in the gravysky repo and that kernel freezes too, so cgroups wasn't the problem. I looked through journalctl to see if there was a problem, and there isn't any error in the log before the freeze.

alberto gomez marin, 12 September 2016 at 07:25
Either the log stops being written before the freeze, or the freeze is not logged in any form.

Anonymous, 12 September 2016 at 07:31
Ok I've run the tests with cfs+acpi-cpufreq+performance and bfs490+acpi-cpufreq+performance, which lock the cpu frequency at maximum.
Here are the results for increasing -j values.
for bfs: http://pastebin.com/MSKHPYa0
for cfs: http://pastebin.com/Bdjx6Dnp

The latencies are higher and the distribution clearly becomes bimodal. Maybe it is because of kernel processes that run at higher priority.
The upper bound is higher with cfs than with bfs at small -j values.
I've also put those data in the spreadsheet.
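
A quick way to check the bimodal impression without plotting is to count the peaks in the bucket counts. The sketch below is a crude heuristic, not a statistical test: a bucket counts as a peak if it is higher than its left neighbour, at least as high as its right neighbour, and holds a minimum share of all samples. The function name and the 5% threshold are assumptions chosen for illustration.

```python
def count_modes(counts, min_share=0.05):
    """Count local maxima in a sequence of histogram bucket counts.

    min_share filters out small bumps in the tail. The 5% default is
    an arbitrary choice, not a value from the original results.
    """
    total = sum(counts)
    if total == 0:
        return 0
    peaks = 0
    for i, count in enumerate(counts):
        left = counts[i - 1] if i > 0 else 0
        right = counts[i + 1] if i + 1 < len(counts) else 0
        if count > left and count >= right and count >= min_share * total:
            peaks += 1
    return peaks


# Made-up bucket counts: one hump, then two humps.
unimodal = count_modes([1, 5, 9, 4, 1])
bimodal = count_modes([1, 10, 3, 1, 8, 2])
```

A result of 2 suggests a bimodal shape. Noisy histograms with many small buckets may need a higher `min_share` or some smoothing first.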

With the runqlat utility it is possible to monitor the runqueue latencies for a specific PID. I was thinking of loading the system with a make, and then monitoring a specific process like a movie player or web browser. Would this be interesting?

Pedro

ck, 13 September 2016 at 08:01
Hey Pedro those latency figures are VERY interesting because this is now showing how BFS is able to keep the latencies bound to under human perception rates while those on CFS start to blow out when the load is only 50% higher than your number of CPUs. I don't think you need to test specific processes as you're already getting the results you need. Out of curiosity was the sched autogroups feature enabled on CFS?