Announcing a new -ck release, 4.20-ck1 with the latest version of the Multiple Queue Skiplist Scheduler, version 0.185. These are patches designed to improve system responsiveness and interactivity with specific emphasis on the desktop, but configurable for any workload.
In addition to a resync from 4.19-ck1, I've extended the runqueue sharing options to all CPUs as well, meaning it can be used on NUMA hardware as a single runqueue if desired.
Merry Christmas, and have a happy new year everyone. May your new year be filled with good health, stable kernels, and more bitcoin adoption and value.
Thanks so much.
Perfect way to start into the new year.
Happy New Year!
Damnit, I love you Con. Happy new year
ooo, good, thx for new ver Con <3
Happy 2019, and thank you for your perseverance :-)
What do you think about the GNU Hurd project?
Runs great <3.
Hi Con, I'm having a compilation issue. x64 builds fine. But when I try to build for x86, I'm getting the following compile error:
(the 4.19 patch worked fine for x86)
Thanks for all your work!
There are some fixes in git on both the muqss and -ck branch, courtesy of SSB. Try those :)
Worked like a charm, thanks again!
I'm reporting this here as it is the current blog item, but it affects all recent MuQSS and kernel releases. I've observed it since kernel version 4.17, when I introduced encryption for my main workstation. I'm using LUKS (Linux Unified Key Setup) and the system is configured to read the pass codes from the console relatively early during kernel boot.
With the vanilla kernel it works as designed. However, as soon as I compile in MuQSS, all key presses sort of "bounce". In other words each key press results in any number of characters being generated, making the blind input of key phrases impossible. I have not been able to use MuQSS since then.
Any ideas how I could fix it?
I used to have this issue too; it was probably solved by enabling periodic timer ticks (CONFIG_HZ_PERIODIC=y). A workaround was to use a USB keyboard. But that was really long ago and it's hard to recall the details; my current MuQSS config just does not exhibit this issue.
Config for 4.14 which does not exhibit the issue:
Hope that helps.
Thank you a lot. The timer tick option did the trick. I'm back on MuQSS. ^_^
It still means that there is some underlying issue preventing me from using MuQSS should the kernel ever go tickless.
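For anyone who wants to confirm which tick mode a given kernel was built with, here is a minimal sketch that scans the text of a kernel .config for the three tick-mode options mentioned in this thread. The sample config and the function name `tick_mode` are made up for illustration.

```python
import re

# Kconfig symbols that select the timer tick mode (only one is set to "y").
TICK_MODES = {
    "CONFIG_HZ_PERIODIC": "periodic ticks",
    "CONFIG_NO_HZ_IDLE": "idle dynticks (tickless when idle)",
    "CONFIG_NO_HZ_FULL": "full dynticks (fully tickless)",
}

def tick_mode(config_text):
    """Return a description of the tick mode enabled in a .config, or None."""
    for line in config_text.splitlines():
        match = re.fullmatch(r"(CONFIG_\w+)=y", line.strip())
        if match and match.group(1) in TICK_MODES:
            return TICK_MODES[match.group(1)]
    return None

# Hypothetical excerpt of a .config, not taken from any commenter's system.
sample = """\
# CONFIG_HZ_PERIODIC is not set
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
CONFIG_HZ=100
"""

print(tick_mode(sample))
```

On a real system the config text would typically come from /boot/config-$(uname -r), or from /proc/config.gz if the kernel was built with that option; which of these exists depends on the distribution.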
Were you using MuQSS alone or with the rest of the -ck patches?
The issue being solved with periodic ticks does not surprise me actually. Ran into a different issue with MuQSS and the (default) idle dynticks.
Some audio distortion in WINE that I eventually solved by recompiling the kernel with full periodic ticks.
This was with just MuQSS, sans -ck.
The kernel will probably eventually go full tickless by default, so whatever the underlying problem may be, it does need to be addressed.
As people (or is it person?) keep repeating ad nauseam. Perhaps the ck patches should become part of muqss since muqss is intrinsically a tickless scheduler and relies on highres timeouts to work properly, but unfortunately the highres timers in mainline are stupidly tick resolution limited...
No offense was implied, no need to infer any either.
To answer the earlier question, I have been applying the pure MuQSS patch, no other kernel patches. Also, I'd like to clarify that I have no deep insight into kernel development. I just picked the rumour up here...
Smooth as silk. Partially because of the low level VLA work on 4.20 itself and the rest is MuQSS' doing.
Great work as always. Amazing job on including NUMA nodes in the single runqueue option.
Great job and smooth AF 4.20.1.
Hi all, I was wondering if other people are seeing a 'psi: task underflow!' message when booting the 4.20-1 linux-ck kernel?
More info on psi (pressure stall information for CPU, memory, and IO): https://lwn.net/Articles/763629/
When adding 'psi=0' kernel parameter to effectively disable psi, this message goes away. Alas my experience with/knowledge of psi is lacking, so I cannot judge if this is a wise thing to do, or at all related to linux-ck or MuQSS...
$ pacman -Q linux-ck-core2
[ 0.509321] MuQSS locality CPU 0 to 1: 2
[ 0.509323] Sharing MC runqueue from CPU 1 to CPU 0
[ 0.509327] CPU 0 RQ order 0 RQ 1
[ 0.509328] CPU 1 RQ order 0 RQ 1
[ 0.509329] CPU 0 CPU order 0 RQ 0
[ 0.509331] CPU 0 CPU order 1 RQ 1
[ 0.509332] CPU 1 CPU order 0 RQ 1
[ 0.509333] CPU 1 CPU order 1 RQ 0
[ 0.509334] MuQSS runqueue share type MC total runqueues: 1
[ 0.509542] psi: task underflow! cpu=0 t=2 tasks=[0 0 0] clear=4 set=0
full dmesg: https://ptpb.pw/xwAE.log
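To compare reports of this warning across machines, the message can be pulled apart into its fields. This is only a sketch based on the two log lines quoted in this thread; treating `clear` and `set` as hexadecimal is an assumption (one report shows `clear=c`), and the function name is hypothetical.

```python
import re

# Pattern derived from the log lines quoted in this thread.
PSI_UNDERFLOW = re.compile(
    r"psi: task underflow! cpu=(?P<cpu>\d+) t=(?P<t>\d+) "
    r"tasks=\[(?P<tasks>[\d ]+)\] clear=(?P<clear>[0-9a-f]+) set=(?P<set>[0-9a-f]+)"
)

def parse_psi_underflow(line):
    """Return the fields of a 'psi: task underflow!' line as a dict, or None."""
    match = PSI_UNDERFLOW.search(line)
    if match is None:
        return None
    return {
        "cpu": int(match["cpu"]),
        "t": int(match["t"]),
        "tasks": [int(n) for n in match["tasks"].split()],
        # Assumption: clear/set are bitmasks printed in hex.
        "clear": int(match["clear"], 16),
        "set": int(match["set"], 16),
    }

line = "[ 0.509542] psi: task underflow! cpu=0 t=2 tasks=[0 0 0] clear=4 set=0"
print(parse_psi_underflow(line))
```

Lines that do not contain the warning simply yield None, so the function can be run over a whole dmesg dump.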
PSI support is new on MuQSS and completely untested at this stage and probably broken. That said, it's a debugging feature that you won't be using so there's not much point enabling it.
Thanks for clearing that up so quickly!
> it's a debugging feature
It isn't. Or, it is, but to the same extent as loadavg.
Whereas it may not be a literal debugging feature in the strictest sense of the word, it is a feature that is most commonly used by and most commonly useful for developers.
That does make its use case mostly of a debugging nature.
psi: task underflow! cpu=0 t=2 tasks=[0 0 0 1] clear=c set=0
on 5.7.4-ck1 on a ryzen 1600
First of all, thank you for your continuous work with the patchset.
I have a question about using the 'workqueue.power_efficient' kernel boot parameter, which can be used to disable per-cpu workqueues in order to improve power efficiency, and how it relates to the runqueue sharing in MuQSS.
I understand these are two different things, but I'm curious whether the per-cpu workqueues should work in any way differently with MuQSS compared to vanilla kernel, that should be taken into consideration with the kernel configuration.
Do you have any thoughts or recommendations about using the workqueue.power_efficient option with MuQSS enabled kernel?
Thank you again, and I hope you'll have a great year.
It should just work the same as in vanilla, though I have no informed opinion on its usage as such.
Thanks for the clarification.
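Whether a boot parameter such as workqueue.power_efficient actually reached the kernel can be checked by looking at the kernel command line. A minimal sketch follows; the sample command line (including the `=1` value form) is hypothetical, and `parse_cmdline` is just an illustrative helper name.

```python
import shlex

def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in shlex.split(cmdline):
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params

# Hypothetical command line for illustration only.
sample = "root=/dev/sda1 rw quiet workqueue.power_efficient=1"
params = parse_cmdline(sample)
print(params.get("workqueue.power_efficient"))
```

On a running system the input would be the contents of /proc/cmdline.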
It seems Docker has some trouble with MuQSS? Please have a look: https://bbs.archlinux.org/viewtopic.php?pid=1825773#p1825773
No, docker and containers that use CPU scheduler cgroups in general do not work at all with MuQSS. There is no 'containment' as such, and the cgroups are only there to allow systems to run that mandate their existence.
Which is a good thing imho.
I actually suspect that systemd was doing something right before v240 and that the change is what broke support in docker. One can switch back to CFS to use docker with modern systemd, but then why does modern systemd suddenly make MuQSS incompatible?
I'd say it's probably worth investigating restoring the behavior that let MuQSS work without the cpuacct cgroup controller. Docker doesn't _need_ it to work properly, and especially for my use case, I just use docker to build kernels, so I don't really care how docker wants to use cgroups to manage CPU usage.
And really, that leaves us with the question: how hard would it be to add the simplest possible shim to MuQSS, even if the cgroup itself is functionally useless? Obviously we were fine before without it, but now suddenly docker needs it since systemd updated. Very bizarre.
Also, I forgot to link to the docker issue for anyone that's unaware. There's already a confirmation that downgrading systemd lets you run docker with MuQSS.
I regret to inform you that after using MuQSS for a long time, I decided to try the CFS + cgroup + ulatencyd combination and this combination turned out to be more beneficial for use on the desktop.
Although the system began to use more RAM, with a large load it behaves more smoothly and more responsively. There is also no interruption of sound reproduction. In normal operation, the consumption of electricity has decreased.
Switched back to CFS as well here; although without ulatencyd (as it has been abandoned).
CFS, tickless, 100 Hz, BFQ ( and scsi_mod.use_blk_mq=1 ). Since there is even talk of outright dumping all legacy IO schedulers and there seems to be some sort of interaction between at least BFQ_MQ and cgroups; the latter of which MuQSS does not support well.
Not overly happy with recent developments in the kernel but, well... there are politics in play as well and the direction is set. So... what can we do but adapt?
I made an effort to create a "tickless" system with the complete -ck patchset, but I think I have something wrong with my .config for that.
Running "make defconfig" does not seem to set CONFIG_NO_HZ_FULL, so I am not sure what the correct way to implement this is, tbh.
What I did experience was that when compiling with -j12 (i7 8700K), the desktop was more or less useless, and I even had a gcc error spewing out something about "resource temporarily unavailable". So obviously I have done something wrong when setting the build parameters.
Could you point me to something that MUST be set for a fully tickless and amazingly performing system? :) E.g. CONFIG_NO_HZ_FULL and stuff like that.
Using only MuQSS and CONFIG_NO_HZ_IDLE++ seems to be OK, but I wanted to try the "full tickless" type of system.
I did it once, full tickless with MuQSS. What I did was use the base Ubuntu generic kernel, recompile that as full tickless, and use the config that resulted from that as a basis for a tickless MuQSS kernel.
It did work.
But, accounting is off with a tickless MuQSS, the consequences of which might be harmless but personally, I am not sure we can rule out problems with CPU states as well as governors functioning normally when accounting is off as is the case with tickless MuQSS.
I keep trying to tell people to not make completely tickless kernels. No idle ticks is ideal for MuQSS. There is no advantage to a completely tickless kernel even for mainline for a normal desktop or mobile device.
And yet, kernels across the board seem to be adopting it:
- Mainline Linux (for years now)
- FreeBSD (from 9 on)
- The Solaris kernel (from Solaris 8 on)
- The NT kernel (from Win 8 on)
- The Zircon kernel (Google's new microkernel)
So, here we have kernels aimed at desktops, servers as well as mobile and even embedded devices all adopting or at the very least allowing a tickless mode. Whether or not there is an actual advantage to it is becoming moot. It is simply becoming the industry standard.
No, it's not that there's no advantage. It's DISadvantageous, but that's fine, you can keep posting here to taunt me on this issue.
Oh, then I misunderstood tbh. You wrote:
As people (or is it person?) keep repeating ad nauseam. Perhaps the ck patches should become part of muqss since muqss is intrinsically a tickless scheduler and relies on highres timeouts to work properly, but unfortunately the highres timers in mainline are stupidly tick resolution limited...
And I thought you actually meant that the -ck patchset was MEANT TO be configured as "tickless" (CONFIG_NO_HZ_FULL). Guess I did not really grasp the meaning of that :)
It works fine with CONFIG_NO_HZ_IDLE=y tho.
My bad then. I mean intrinsically it doesn't depend on ticks as such, but the configuration of ticks should be nohz idle as you correctly figured out.
Apparently the 4.20.8 patch breaks the build for 4.20-ck1 for at least kvm-intel with the following error:
ERROR: "sched_smt_present" [arch/x86/kvm/kvm-intel.ko] undefined!
Reverting this commit allows the build to finish: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v4.20.8&id=f29a8be0e5d28f89c835cbae700e67a383280916
I'm assuming the proper fix would be adding "EXPORT_SYMBOL_GPL(sched_smt_present);" somewhere in MuQSS sources..
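A quick way to tell whether a given patched tree already carries such an export is to search the scheduler source for it. The sketch below is only illustrative: the two source excerpts are hypothetical stand-ins, not the actual contents of MuQSS.c, and `exports_symbol` is a made-up helper name.

```python
import re

def exports_symbol(source_text, symbol):
    """True if the C source contains EXPORT_SYMBOL or EXPORT_SYMBOL_GPL for symbol."""
    pattern = r"^\s*EXPORT_SYMBOL(?:_GPL)?\(\s*" + re.escape(symbol) + r"\s*\)\s*;"
    return re.search(pattern, source_text, re.MULTILINE) is not None

# Hypothetical excerpts standing in for the scheduler source before and after the fix.
unfixed = "DEFINE_STATIC_KEY_FALSE(sched_smt_present);\n"
fixed = unfixed + "EXPORT_SYMBOL_GPL(sched_smt_present);\n"

print(exports_symbol(unfixed, "sched_smt_present"))
print(exports_symbol(fixed, "sched_smt_present"))
```

Against a real tree, the text would be read from kernel/sched/MuQSS.c after applying the patchset.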
Thank you very much, sir.
Seems that is the change for the PDS scheduler too. Will see if I can do a test compile when I get home.
The change is in /kernel/sched/MuQSS.c
Adding the line like this, I guess:
Seemed to work for me:
diff --git a/0001-MultiQueue-Skiplist-Scheduler-version-v0.185.patch b/0001-MultiQueue-Skiplist-Scheduler-version-v0.185.patch
index 1b235e8..bf61ce0 100644
@@ -1598,7 +1598,7 @@ new file mode 100644
-@@ -0,0 +1,7437 @@
+@@ -0,0 +1,7438 @@
+// SPDX-License-Identifier: GPL-2.0
+ * kernel/sched/MuQSS.c, was kernel/sched.c
@@ -1828,6 +1828,7 @@ index 000000000000..e8610b659791
Indeed, the above change looks correct and seems to work.
Here's a patch against -ck patched kernel sources: https://pastebin.com/EPMEir9b
I get this with just the MUQSS patch:
(Stripping trailing CRs from patch; use --binary to disable.)
patching file kernel/sched/MuQSS.c
patch unexpectedly ends in middle of line
Hunk #1 succeeded at 227 with fuzz 1.
Is this a problem?
Updated patch: https://github.com/SveSop/kernel_cybmod/blob/MuQSS/0001-MultiQueue-Skiplist-Scheduler-version-v0.185.patch
I should probably have named it _v2.patch or something, but well.. Soon 5.0 kernel, and it will be a new version anyway :)
So I use just that one^ for MUQSS only (no ck)?
It is the MuQSS patch, and not the whole -ck patch set.
You can either replace this "fixed" patch in your -ck patchset for a full -ck kernel, or use it as a single patch if you only want MuQSS. (My github has the full -ck patchset if interested in that).
Kind of late to the party, but you can refer to Zen Kernel's MuQSS branch if you're having trouble building - we probably already ran into and fixed many build problems due to stable patches.
The pastebin is missing a newline at the end, which results in the error.
It still applies fine regardless, as stated at the last line of patch output (or you can add the newline yourself if the error bothers you)
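If the warnings are bothersome, a downloaded patch can be tidied before applying it. This is a minimal sketch that handles the two things patch complained about in the output quoted above (trailing CRs and a missing final newline); the function name and sample bytes are made up for illustration.

```python
def normalize_patch(data: bytes) -> bytes:
    """Convert CRLF line endings to LF and make sure the data ends with a newline."""
    data = data.replace(b"\r\n", b"\n")
    if data and not data.endswith(b"\n"):
        data += b"\n"
    return data

# Hypothetical two-line fragment with CRLF endings and no final newline.
raw = b"+first added line\r\n+last added line"
print(normalize_patch(raw))
```

Reading the file with open(path, "rb"), passing the bytes through the function, and writing the result back out is enough to silence both messages.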
And this was obviously meant to be a reply for the >=4.20.8 patch comments... :|
Any ETA for 5.0?
Might be a while, I'm guessing. With the complete removal of legacy IO schedulers (yes, it happened) and BFQ's interaction with cgroups, MuQSS might need some work to be fully compatible with 5.0.
MUQSS is still the best.
Well worth to wait.
How would any of the IO scheduler changes require changes to MuQSS? Nothing was changed within the kernel's default CPU scheduler either for that reason.
Also, there were barely any changes to BFQ, definitely nothing related to cgroups. Additionally MuQSS has never supported cgroups, so even if there were any such changes, I don't think they would require huge amounts of work.
Energy Aware Scheduler is probably the biggest scheduler related change, but I'm not sure whether that requires big changes for MuQSS.
MuQSS not supporting cgroups is exactly the problem. Some time ago I predicted, in response to some proposed patches, the removal of the legacy IO schedulers.
As of right now, I am also predicting that the hierarchical support (cgroup support; see KConfig for reference) of BFQ will become non-optional, it will become mandatory. An integral part of BFQ. At that point we can expect MuQSS to no longer fully support BFQ. And since BFQ is the only remaining viable option for an IO scheduler on non-SSD devices (MQ-DEADLINE is just a joke for heavy IO), well... I think the pattern should become obvious.
Why do you think the legacy IO schedulers were removed? The argument was to simplify the code, maintainability. Obviously they are going to play that card for blk_mq and the mq schedulers as well. At which point the aforementioned hierarchical support of BFQ will become non-optional.
Additionally, BFQ is being pushed HARD as the de-facto standard IO scheduler. And cgroups are likewise being pushed hard as the de-facto standard for priority handling.
Regarding CFS not being changed -- CFS has fully supported cgroups from the day those were implemented. Since it predates cgroups (although not by much).
Continuing to not support cgroups will probably be fine for 5.0, probably even for the remainder of 2019. But at some point, they will simply become unavoidable. Probably mid-2020, I'm guessing.
So much misinformation... MuQSS has nothing to do with BFQ, nor anything to do with any I/O schedulers. -ck also has nothing to do with BFQ nor any I/O schedulers.
It's actually already up on git. Lacking only separate patches and an announce.
Looking forward to the availability of the patches. Somehow I am too stupid to create them from git on my own ...
How to git2patch(es)?
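One common way to turn a git branch into patches is git format-patch (one file per commit) or git diff (a single combined patch). The sketch below only builds the command lines; the tag `v5.0` and the branch name `5.0-ck` are hypothetical placeholders, so substitute whatever the repository actually uses.

```python
import shlex

def format_patch_command(base, branch, outdir="patches"):
    """Command that writes one patch file per commit in base..branch into outdir."""
    return ["git", "format-patch", "-o", outdir, f"{base}..{branch}"]

def combined_diff_command(base, branch):
    """Command that prints a single combined diff between base and branch."""
    return ["git", "diff", f"{base}..{branch}"]

# Hypothetical tag and branch names, for illustration only.
print(shlex.join(format_patch_command("v5.0", "5.0-ck")))
print(shlex.join(combined_diff_command("v5.0", "5.0-ck")))
```

Inside a clone of the repository, either list can be handed to subprocess.run(cmd, check=True); for the combined diff, redirect the output to a .patch file.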
The patches are uploaded. Too busy to announce right now.
[ 1.535455] ------------[ cut here ]------------
[ 1.535460] Current state: 1
[ 1.535464] WARNING: CPU: 1 PID: 0 at 0xffffffff8108c865
[ 1.535466] Modules linked in:
[ 1.535469] CPU: 1 PID: 0 Comm: MuQSS/1 Not tainted 5.0.7-ck1 #3
[ 1.535471] Hardware name: System manufacturer System Product Name/Crosshair IV Formula, BIOS 3029 10/09/2012
[ 1.535474] RIP: 0010:0xffffffff8108c865
[ 1.535476] Code: 04 77 29 89 f1 ff 24 cd 38 76 a0 81 80 3d 53 1b bd 00 00 75 17 89 c6 48 c7 c7 90 c6 ad 81 c6 05 41 1b bd 00 01 e8 7b ae fa ff <0f> 0b 48 83 c4 08 5b c3 48 8b 47 60 48 85 c0 75 64 83 fe 03 89 73
[ 1.535480] RSP: 0018:ffff888437c43f50 EFLAGS: 00010082
[ 1.535482] RAX: 0000000000000010 RBX: ffff888437c504c0 RCX: ffffffff81c1fdb8
[ 1.535483] RDX: 0000000000000001 RSI: 0000000000000086 RDI: ffffffff81f8fcac
[ 1.535485] RBP: 7fffffffffffffff R08: 00000000000001f0 R09: 0000000000000000
[ 1.535487] R10: 0720072007200720 R11: 0720072007200720 R12: 7fffffffffffffff
[ 1.535489] R13: ffff888437c56900 R14: ffff888437c569f8 R15: ffff888437c56a38
[ 1.535491] FS: 0000000000000000(0000) GS:ffff888437c40000(0000) knlGS:0000000000000000
[ 1.535493] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.535494] CR2: 0000000000000000 CR3: 0000000001c0c000 CR4: 00000000000006e0
[ 1.535496] Call Trace:
[ 1.535500] 0xffffffff8108e7cb
[ 1.535502] 0xffffffff810817cb
[ 1.535503] 0xffffffff81601568
[ 1.535505] 0xffffffff8160117f
[ 1.535507] RIP: 0010:0xffffffff8100f592
[ 1.535509] Code: 0f ba e0 24 72 11 65 8b 05 bb eb ff 7e fb f4 65 8b 05 b2 eb ff 7e c3 bf 01 00 00 00 e8 17 e0 07 00 65 8b 05 a0 eb ff 7e fb f4 <65> 8b 05 97 eb ff 7e fa 31 ff e8 ff df 07 00 fb c3 66 66 2e 0f 1f
[ 1.535512] RSP: 0018:ffffc9000007bf00 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[ 1.535515] RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000001
[ 1.535516] RDX: 000000005b7f1466 RSI: 0000000000000001 RDI: 0000000000000380
[ 1.535518] RBP: ffffffff81c601a8 R08: 0000000000000000 R09: 0000000000019840
[ 1.535520] R10: 0000001e3c819be7 R11: 000000007260bc7a R12: 0000000000000000
[ 1.535522] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1.535524] 0xffffffff8105bd2f
[ 1.535526] 0xffffffff8105bf5b
[ 1.535527] 0xffffffff810000d4
[ 1.535529] ---[ end trace 71fe021b29fa5d1f ]---
I am having this problem on all my Phenom II systems. It looks like some kind of interrupt problem. I tried enabling the nothreadedirqs option and also the fix for broken boot IRQs, but neither had any effect. I enabled stack traces, but for some reason the kernel doesn't show them :x
Does anyone know what this is? I searched around and found this, which may be helpful: https://pastebin.com/y0aXvBNP
(this is not mine but it looks very much like mine)
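A trace made of raw addresses like the one above can often be decoded by hand against the System.map from the same kernel build. It is likely (though not confirmed here) that the kernel was built without symbol lookup support, which would explain the missing function names. The sketch below assumes the addresses are not shifted by address randomization, and the map entries are hypothetical placeholders, not the real symbols behind this warning.

```python
import bisect

def load_system_map(text):
    """Parse System.map text into a sorted list of (address, name) pairs."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols

def resolve(symbols, address):
    """Return 'name+0xoffset' for the closest symbol at or below address, or None."""
    addresses = [addr for addr, _ in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None
    start, name = symbols[index]
    return f"{name}+{address - start:#x}"

# Hypothetical map entries; a real System.map comes from the kernel build directory.
sample_map = """\
ffffffff8108c000 T hypothetical_func_a
ffffffff8108d000 T hypothetical_func_b
"""
symbols = load_system_map(sample_map)
print(resolve(symbols, 0xffffffff8108c865))
```

Feeding each address from the "Call Trace" section through resolve() with the real map gives an approximate symbolic backtrace that is much easier to search for or report.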