-ck hacking: linux-4.8-ck5, MuQSS version 0.120

Saturday, 29 October 2016

Announcing a new version of MuQSS and a -ck release to go with it, coinciding with the mainline 4.8.5 release.

4.8-ck5 patchset:
http://ck.kolivas.org/patches/4.0/4.8/4.8-ck5/

MuQSS by itself for 4.8:
4.8-sched-MuQSS_120.patch

MuQSS by itself for 4.7:
4.7-sched-MuQSS_120.patch

Git tree:
https://github.com/ckolivas/linux

This is a fairly substantial update to MuQSS which includes bugfixes for the previous version, performance enhancements, new features, and completed documentation. This will likely be the first publicly announced version on LKML.

EDIT: Announce here: LKML

New features:
- MuQSS is now a tickless scheduler. That means it can maintain its guaranteed low latency even in a build configured with a low Hz tick rate. To that end, it now defaults to 100Hz, which is the recommended choice because it also leads to more throughput and power savings.
- Improved performance for single threaded workloads with CPU frequency scaling.
- Full NoHZ is now supported. This disables ticks on busy CPUs instead of just idle ones. Unlike mainline, MuQSS can do this virtually all the time, regardless of how many tasks are currently running. However, this option is for very specific use cases (compute servers running specific workloads) and not for regular desktops or servers.
- Numerous other mainline configuration options that were previously disabled are now allowed again (though not recommended for regular users).
- Completed documentation can now be found in Documentation/scheduler/sched-MuQSS.txt

Bugfixes:
- Fix for the various stalls some people were still experiencing, along with the softirq pending warnings.
- Fix for some loss of CPU for heavily sched_yielding tasks.
- Fix for the BFQ warning (-ck only)
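
For anyone wanting to check how their own build lines up with the tick settings discussed above, here is a minimal sketch in Python. It assumes the usual Kconfig symbol names (CONFIG_HZ_100, CONFIG_NO_HZ_IDLE, CONFIG_NO_HZ_FULL, CONFIG_HZ_PERIODIC); the function name and the sample text are made up for illustration and are not part of the patchset.

```python
import re

def summarize_tick_config(config_text):
    """Report the tick rate and tick mode found in kernel .config text."""
    # Only lines of the form CONFIG_FOO=y count; "# CONFIG_FOO is not set" is skipped.
    enabled = set(re.findall(r"^(CONFIG_\w+)=y$", config_text, flags=re.MULTILINE))
    # CONFIG_HZ_<number>=y selects the tick rate (CONFIG_HZ_PERIODIC does not match).
    hz = next((int(name.rsplit("_", 1)[1]) for name in sorted(enabled)
               if re.fullmatch(r"CONFIG_HZ_\d+", name)), None)
    if "CONFIG_NO_HZ_FULL" in enabled:
        mode = "full"
    elif "CONFIG_NO_HZ_IDLE" in enabled:
        mode = "idle"
    elif "CONFIG_HZ_PERIODIC" in enabled:
        mode = "periodic"
    else:
        mode = "unknown"
    return {"hz": hz, "tick_mode": mode}

# Illustrative sample only, not taken from any particular machine.
SAMPLE = """\
CONFIG_NO_HZ_COMMON=y
# CONFIG_NO_HZ_IDLE is not set
CONFIG_NO_HZ_FULL=y
CONFIG_HZ_100=y
# CONFIG_HZ_1000 is not set
"""
print(summarize_tick_config(SAMPLE))
```

On a running system the same text can usually be read from /proc/config.gz (with Python's gzip module) if the kernel was built with that option; that is an assumption about your build, not something the patchset guarantees.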

Enjoy!
-ck

84 comments:

Unknown, 29 October 2016 at 15:52
Should we enable the "Full dynticks CPU accounting" option or leave it to "Simple tick based cputime accounting" for a standard desktop system? I was reading the documentation and I'm not sure if it should be dynticks cpu accounting or tick based. Does it matter for MuQSS?
ck, 29 October 2016 at 15:55
The accounting choice doesn't matter but adds slight overhead. Tickless idle should be used though.
Unknown, 29 October 2016 at 18:52
Hello,

Since ck2, which replaced BFS with MuQSS, I have had an issue with wine:
While I play a game (Paladins) it will eventually freeze, but the rest of the computer is totally fine. Killing wine and restarting it works until the next freeze.
I see nothing in dmesg or in the wine log, so it took me some trials to realize it came from the ck patchset. So far I've only had that issue on ck2 through ck5.

I've tried just MuQSS, 250hz, 1000hz and preempt from ck4 in case it was something else (like BFQ that was just added), but it was the same. It still is the same in ck5.

I have not noticed anything else going wrong since ck2, but I haven't tried playing native games since then.

I'm not sure how to give better information.
Anonymous, 30 October 2016 at 01:16
@ck and all interested testers:
Can anyone explain what it means when gkrellm's proc chart shows a continuously very high "fork" rate, along with relatively high sys CPU usage (~33% on each of 2 cores)?
I can reproduce this whenever firefox loads my set of 160 tabs, after making these changes to the kernel:
o MuQSS commits 5065068c and newer applied,
o .config set to CONFIG_HZ_100=y for the first time, instead of the 1000 I always used before.
During this period firefox doesn't respond at all, and the only relief is playing the "switch-to-other-open-windows-game". Or using CONFIG_HZ_1000 again.

This is the first time I've seen that, and I don't think the behaviour is intended. Side note: In my experience CONFIG_HZ_1000 still leads to a more interactive mouse pointer.
Maybe this info is of help for last debugging steps.

BR and thank you,
Manuel Krause
monotykamary, 30 October 2016 at 01:26
I still get stutters and system slowdowns when using wine osu! after a period of time.

I was able to view the dmesg output during the lag, and it showed output like this during the stutter:

10月 29 21:18:49 freeiBP kernel: NOHZ: local_softirq_pending 40
10月 29 21:18:49 freeiBP kernel: NOHZ: local_softirq_pending 40
10月 29 21:18:49 freeiBP kernel: NOHZ: local_softirq_pending 40
10月 29 21:18:49 freeiBP kernel: NOHZ: local_softirq_pending 40
10月 29 21:18:49 freeiBP kernel: NOHZ: local_softirq_pending 40
10月 29 21:18:49 freeiBP kernel: NOHZ: local_softirq_pending 40
10月 29 21:18:49 freeiBP kernel: NOHZ: local_softirq_pending 40
10月 29 21:18:49 freeiBP kernel: NOHZ: local_softirq_pending 40
10月 29 21:18:49 freeiBP kernel: NOHZ: local_softirq_pending 40
10月 29 21:18:49 freeiBP kernel: NOHZ: local_softirq_pending 40
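
A note on reading these lines: as far as I know, the number after local_softirq_pending is a hexadecimal bitmask with one bit per softirq type. The sketch below decodes it in Python, assuming the softirq numbering from include/linux/interrupt.h in 4.8-era kernels; the function name is made up for illustration, and the ordering should be checked against your own tree.

```python
# Softirq bit order, assumed from include/linux/interrupt.h in 4.8-era kernels.
SOFTIRQ_NAMES = ["HI", "TIMER", "NET_TX", "NET_RX", "BLOCK",
                 "IRQ_POLL", "TASKLET", "SCHED", "HRTIMER", "RCU"]

def decode_softirq_pending(hex_mask):
    """Return the names of the softirqs whose bits are set in the hex mask."""
    mask = int(hex_mask, 16)
    return [name for bit, name in enumerate(SOFTIRQ_NAMES) if mask & (1 << bit)]

print(decode_softirq_pending("40"))  # prints ['TASKLET']
```

Under that assumption, the value 40 in the log above would mean a tasklet softirq was still pending when the CPU tried to stop its tick.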
Anonymous, 30 October 2016 at 04:23
Thanks Con.
I've run the usual tests with MuQSS120 on my laptop.
MuQSS120 shows a little improvement over MuQSS116 indeed.
I don't have any error messages on my side.
The results are here:

https://docs.google.com/spreadsheets/d/1ZfXUfcP2fBpQA6LLb-DP6xyDgPdFYZMwJdE0SQ6y3Xg/edit?usp=sharing

Also, what's the status of interbench? I plan to set up a test environment next week on my desktop and give interbench and Phoronix Test Suite a go.

Pedro
Anonymous, 30 October 2016 at 05:03
Hello there. After the ck5 and MuQSS 0.120 update, GDM and GNOME desktop startup have slowed down significantly. This also affects user logout/login/hotchange.
Vanilla 4.8.4 and 4.8.4-ck4 work pretty well. Any ideas? i5-6600K, GTX960.
ck, 30 October 2016 at 07:09
If you're getting a slowdown at 100Hz it's worth checking to see if your hardware is not supporting high res timers properly. Look in dmesg and see if there's something like "Could not switch to high resolution". I was hoping they'd be supported on all modern hardware but I could be wrong...
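
The dmesg check suggested above can be scripted. Here is a minimal sketch in Python that filters log text for the quoted phrase; the function name is hypothetical, and the sample log lines are invented for illustration rather than copied from a real machine.

```python
def hires_timer_failures(dmesg_text, needle="Could not switch to high resolution"):
    """Return the log lines that contain the high-resolution timer failure phrase."""
    return [line for line in dmesg_text.splitlines() if needle in line]

# Invented sample log text, for illustration only.
sample_log = "\n".join([
    "[    0.100000] example: some unrelated boot message",
    "[    0.200000] Could not switch to high resolution mode on CPU 0",
])
for line in hires_timer_failures(sample_log):
    print(line)
```

To run it against a live system, the text could come from the output of the dmesg command (which may need root on some setups). An empty result only means the phrase was not found in the log that was read; it does not prove the timers are working.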
Anonymous, 30 October 2016 at 08:04
@Con,

I tried MuQSS 120 with full tickless enabled (CONFIG_NO_HZ_FULL=y, CONFIG_NO_HZ_FULL_ALL=y). Top shows roughly 50% of each core busy with "si", which means software interrupts, except on the first "ticking" core. That's odd; maybe I messed something up.
Also, Fedora has been doing full tickless for quite some time and it seems OK. What are the downsides of full tickless, given that you suggest it only for very specific cases? It must be something to do with interactivity, but can you please explain that in 2 sentences for "the stupid"?

Br, Eduardo
ck, 30 October 2016 at 08:18
Found some missing nohz related softirq code which could be the problem. I'll work on a patch for that today.
kernelOfTruth, 30 October 2016 at 09:03
Hi Con,

I'm using

zcat /proc/config.gz | grep -i NO_HZ
CONFIG_NO_HZ_COMMON=y
# CONFIG_NO_HZ_IDLE is not set
CONFIG_NO_HZ_FULL=y
CONFIG_NO_HZ_FULL_ALL=y
# CONFIG_NO_HZ_FULL_SYSIDLE is not set
CONFIG_NO_HZ=y
# CONFIG_RCU_FAST_NO_HZ is not set

with

CONFIG_HZ_100=y
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set

The mouse pointer and interactivity so far stays fluid and responsive during high load (no audio stalls or glitches AT ALL - up until now),

no high load or stability issues so far

I'll keep you updated

Thanks for the great work !

This is the first time I'm running full dynticks ><'

and I like it :)
ck, 30 October 2016 at 10:13
For those having softirq issues, either the local_softirq_pending warning with stalls, or high si% count on idle CPUs, can you try the following test patch please?
muqss120-try_idle_softirq.patch

This may not be the solution but it will hopefully help me track down what the issue might be.
Anonymous, 30 October 2016 at 20:13
Con, I see no problems in the kernel log, neither from MuQSS nor about switching to high resolution. The slowdown actually occurs only on DE/GNOME loading/logout/login; the rest of the system works faster than on the vanilla kernel, and even faster than on linux-ck with the BFS scheduler.
Florian, 31 October 2016 at 02:16
Hi,

I still get this BFQ warning once or twice when heavy load occurs shortly after system startup into the graphical target:

[ 53.960106] BFQ WARNING:last 4611686022722359627 budget 13835058059577135366 jiffies 4294971692
[ 53.960110] diff 4611686018427387942
[ 70.537813] BFQ WARNING:last 4611686022722361253 budget 13835058059577137039 jiffies 4294973349
[ 70.537817] diff 4611686018427387926

A general question: are these warnings meaningless, or is there really a scheduler problem?

Thanks,

Florian.
Florian, 31 October 2016 at 02:23
By the way: on my Intel Core2 I get better video playback performance (Arch Linux, mpv player with frame-doubling vapoursynth filters running), with around 10% less CPU consumption (~80% CPU instead of 90%), with

CONFIG_HZ_PERIODIC=y
# CONFIG_NO_HZ_IDLE is not set
# CONFIG_NO_HZ_FULL is not set
CONFIG_HZ_100=y

compared with NO_HZ_IDLE config.
ck, 31 October 2016 at 11:23
I've pushed an updated patch for dealing with softirqs to git. Use that instead of the patch I put into the test directory I linked earlier.
Anonymous, 31 October 2016 at 19:29
@Con,

I applied both patches and still have the same high "si" usage in top command. Apart from showing strange "si" usage values, normal system behavior is not really affected, as far as I can tell.
Also, it appears that I don't have to suspend/resume for more CPUs to start showing high "si" usage; they start doing it by themselves after some time. I don't know what triggers it. At boot it's 3 CPUs with high "si", later up to 7. Weird.
I'll keep using this kernel for some time and see what happens.

Br, Eduardo
Anonymous, 31 October 2016 at 21:01
@ck,
I've tried building linux-ck with the 2 new commit fixes (just to test), but GNOME login hasn't sped up.
Florian, 31 October 2016 at 21:14
I compared CONFIG_HZ_100 with CONFIG_HZ_1000 (without the test patches): the new default of 100 Hz boots faster and feels more "fluid" on my old Core2 Duo machine. The system is more responsive and there is less waiting until graphical userspace is ready. I have no data to prove this, but I know my system very well and had never configured it with 100 Hz before. Thumbs up!
Florian, 1 November 2016 at 08:58
I know what I write is very subjective, and I've only been using the linux-ck kernel for a few days and just upgraded to 4.8.6-1, but it seems to be EXCELLENT work, dear CK! :-)

100 Hz works absolutely fine on my 8-year-old Core2 Duo machine, and even the sound quality improved a little bit! I should explain that I'm an audio freak, meaning I mostly listen to my music through headphones. I resample my audio via my own ALSA config, doubling the sample rate to 96000 before it passes through the Eq10 equalizer. So I know how excellent my music sounds, and I just realized that with this kernel it's even better: absolutely precise, clear and fantastic! I don't know if it's the 100 Hz (I never configured my kernels with that frequency before), MuQSS or the kernel itself, but compared to the Liquorix or Zen kernels this is another giant step towards perfection! Thanks for this brilliant work! :-)

Florian
Anonymous, 2 November 2016 at 02:51
@ck:
Maybe it's a little too much wishful thinking of mine, that you still also maintain the 4.7-muqss git branch after 4.7 got tagged EOL. I just tried to compile with the newest 3 commits upon v0.120 on 4.7.10 and it failed with the following abort message:
...
CC kernel/time/timer.o
CC arch/x86/kernel/cpu/mtrr/generic.o
kernel/time/timer.c:1328:42: warning: ‘struct timer_base’ declared inside parameter list
static u64 cmp_next_hrtimer_event(struct timer_base *base, u64 basem, u64 expires)
^
kernel/time/timer.c:1328:42: warning: its scope is only this definition or declaration, which is probably not what you want
kernel/time/timer.c: In function ‘cmp_next_hrtimer_event’:
kernel/time/timer.c:1347:7: error: dereferencing pointer to incomplete type
base->is_idle = false;
^
kernel/time/timer.c: In function ‘get_next_timer_interrupt’:
kernel/time/timer.c:1393:32: warning: passing argument 1 of ‘cmp_next_hrtimer_event’ from incompatible pointer type
return cmp_next_hrtimer_event(base, basem, expires);
^
kernel/time/timer.c:1328:12: note: expected ‘struct timer_base *’ but argument is of type ‘struct tvec_base *’
static u64 cmp_next_hrtimer_event(struct timer_base *base, u64 basem, u64 expires)
^
make[2]: *** [kernel/time/timer.o] Error 1
make[1]: *** [kernel/time] Error 2
make: *** [kernel] Error 2
make: *** Waiting for unfinished jobs....
CC arch/x86/mm/init.o
...

I hope, it's readable on this blog... and you can give me a hint on how to fix it.

Thank you in advance,
BR, Manuel Krause
Unknown, 2 November 2016 at 05:46
Hello again, Con!
I've built linux-ck with all the latest commits (including 6a764ab) to test, and found that 6a764ab did speed up DE loading/login/relogin somewhat. Even so, the vanilla upstream kernel is faster at these actions than ck5 with MuQSS (on my Skylake machine).

Best regards.
Anonymous, 2 November 2016 at 10:13
First of all thank you very much.
I have been using old kernels like 3.12... because of speed and responsiveness (with bfs and bfq) and shunning the 4.x kernels, but this one (4.8 muqss git+bfq) is just as fast, if not slightly better in some areas.
I play the open source game Xonotic occasionally, and I noticed a freeze/hang of maybe 0.5-1 sec about 3 times in 2 hrs.
Apart from that pretty solid on my test machine.

Keep up the good work.

Unknown, 3 November 2016 at 05:52
Some "great" news to the GNOME users.

Con, you're completely right: GDM and part of gnome-session (or gnome-shell) have some Hz-dependent function implementations. Because:

vanilla linux 4.8.6 with 300 Hz timer - login and DE loading is fast.
vanilla linux 4.8.6 with 100 Hz timer - login and DE loading is VERY slow.

linux-ck 4.8.6 (MuQSS 0.120) with 100 Hz timer - login and DE loading is slow.
linux-ck 4.8.6 (MuQSS 0.120) with 1000 Hz timer - login and DE loading is fast.
linux-ck 4.8.4 (BFS 0.512) with 1000 Hz timer - login and DE loading is fast (a bit slower than with MuQSS).

So I think it's a good reason to open a new issue in the upstream GNOME bug tracker.
Anonymous, 5 November 2016 at 04:35
Linux 4.8 w MuQSS 4c59753

System stutters and freezes (only SysRq could be used to shut down) when Wine is running, or when JACK is running after TLP has switched to battery mode.

Dmesg messages:
[ 650.039583] NOHZ: local_softirq_pending 202
[ 670.112980] NOHZ: local_softirq_pending 02
[ 720.029885] NOHZ: local_softirq_pending 202
[ 870.242973] NOHZ: local_softirq_pending 202
[ 1025.751366] NOHZ: local_softirq_pending 202
[ 1025.761367] NOHZ: local_softirq_pending 202
[ 1030.641595] NOHZ: local_softirq_pending 202