-ck hacking: linux-5.0-ck1, MuQSS version 0.190 for linux-5.0

Tuesday, 12 March 2019

Announcing a new -ck release, 5.0-ck1, with the latest version of the Multiple Queue Skiplist Scheduler (MuQSS), version 0.190. These patches are designed to improve system responsiveness and interactivity, with specific emphasis on the desktop, but they are configurable for any workload.

linux-5.0-ck1:
-ck1 patches: 5.0-ck1
Git tree: 5.0-ck

MuQSS only:
Download: 5.0-muqss-190.patch
Git tree: 5.0-muqss

Web: http://kernel.kolivas.org

This is mostly a resync from 4.20-ck1 with a minor tweak to CPU ordering for slightly better throughput. Note that BFQ and I/O schedulers have nothing to do with MuQSS or any of the -ck code so the changes to I/O schedulers in mainline are of no consequence.

Enjoy!

Please enjoy

-ck

65 comments:

Anonymous, 12 March 2019 at 12:04
... right.

Sveinar Søpler, 13 March 2019 at 00:01
Pulled those from Git before the weekend, and they compile and work fine for me so far :)

Thank you CK!

Anonymous, 13 March 2019 at 01:39
Thank you, feels a little more responsive. Good job!

Anonymous, 14 March 2019 at 03:30
Did you ever test your patches against AMD CPUs?

My problem is that on (pre-Ryzen) AMD CPUs, the clock speed always stays at maximum with the schedutil governor under the MuQSS patchset.
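One way to check a report like this is to compare the current frequency of each CPU with its maximum while the machine is idle. The following is a minimal Python sketch, assuming the standard cpufreq sysfs files (scaling_governor, scaling_cur_freq, cpuinfo_max_freq) are present; the function names and the 2% tolerance are made up for illustration.

```python
from pathlib import Path

SYSFS_CPU = Path("/sys/devices/system/cpu")


def read_cpufreq(root=SYSFS_CPU):
    """Return {cpu: (governor, current_khz, max_khz)} for CPUs exposing cpufreq."""
    stats = {}
    for cpufreq in Path(root).glob("cpu[0-9]*/cpufreq"):
        try:
            governor = (cpufreq / "scaling_governor").read_text().strip()
            current = int((cpufreq / "scaling_cur_freq").read_text())
            maximum = int((cpufreq / "cpuinfo_max_freq").read_text())
        except (OSError, ValueError):
            continue  # file missing or unreadable on this system
        stats[cpufreq.parent.name] = (governor, current, maximum)
    return stats


def stuck_at_max(stats, tolerance=0.02):
    """List CPUs whose current frequency is within `tolerance` of the maximum."""
    return sorted(cpu for cpu, (_, cur, top) in stats.items()
                  if cur >= top * (1 - tolerance))


if __name__ == "__main__":
    stats = read_cpufreq()
    for cpu, (governor, cur, top) in sorted(stats.items()):
        print(f"{cpu}: {governor} {cur} kHz (max {top} kHz)")
    print("at or near max:", stuck_at_max(stats))
```

A single sample on a busy machine proves little; the reading is only interesting on an idle system, sampled several times.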

Anonymous, 15 March 2019 at 05:52
OP here:
I have a broken schedutil on 4.19.

Anonymous, 16 March 2019 at 05:43
Thanks Con for maintaining this.

I've posted throughput benchmarks here:
https://docs.google.com/spreadsheets/d/163U3H-gnVeGopMrHiJLeEY1b7XlvND2yoceKbOvQRm4/edit?usp=sharing

I've tested ck1 and MuQSS alone, configured with NO_HZ_IDLE and HZ=100.
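To confirm which tick settings a kernel was actually built with, the options can be read back from its config. This is a minimal Python sketch, assuming the config is available at /proc/config.gz (which needs CONFIG_IKCONFIG_PROC) or at the common distro location /boot/config-<release>; the helper names and the list of options are illustrative choices.

```python
import gzip
import os
from pathlib import Path

WANTED = ("CONFIG_SCHED_MUQSS", "CONFIG_HZ", "CONFIG_NO_HZ_IDLE",
          "CONFIG_HZ_PERIODIC", "CONFIG_HIGH_RES_TIMERS")


def parse_kconfig(text, wanted=WANTED):
    """Map each wanted option to its value, or None if it is unset or absent."""
    values = dict.fromkeys(wanted)
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name in values:
            values[name] = value.strip()
    return values


def load_running_config():
    """Read the running kernel's config from the usual places, if available."""
    candidates = [Path("/proc/config.gz"),
                  Path(f"/boot/config-{os.uname().release}")]
    for path in candidates:
        try:
            data = path.read_bytes()
        except OSError:
            continue  # not present or not readable, try the next location
        if path.suffix == ".gz":
            data = gzip.decompress(data)
        return data.decode()
    return ""


if __name__ == "__main__":
    for name, value in parse_kconfig(load_running_config()).items():
        print(f"{name} = {value if value is not None else 'not set'}")
```

Lines of the form "# CONFIG_X is not set" contain no "=", so those options are reported as not set.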

Reading one of your comments on the 0.185 announcement, I understand that for low latency, 'MuQSS + high res timers' (aka the ck1 patchset) is better than MuQSS alone.
Is that right?

I also wanted to update the interbench results. Is this benchmark suited to comparing different schedulers?
From all my attempts to benchmark latency, I remember that it is tricky because of differing system call implementations.

Pedro

Anonymous, 24 March 2019 at 15:40
Hey there, I've noticed that the Multi-Core sibling runqueue sharing doesn't seem to detect the cache topology of older Core 2 Quad CPUs: it ends up running 1 runqueue for all 4 cores instead of 2 runqueues, even though the CPU is essentially 2 dual-core dies on one package (2x2).

This results in lower than expected performance.

I suspect this is because it is UMA, unlike most modern multi-socket/multi-die platforms.

Is there any way to manually configure CPU locality, so that I can test it against runqueue sharing being off?
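Before testing, it may help to see which CPUs the kernel itself reports as sharing a last-level cache. Below is a minimal Python sketch, assuming the usual sysfs cache directories (cpuN/cache/indexM with level and shared_cpu_list files); the function names are made up here. It only shows what the kernel reports. It does not change how MuQSS builds its runqueues, and whether MuQSS consults this information is not established in this thread.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as '0-1,4' into [0, 1, 4]."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        cpus.extend(range(int(first), int(last or first) + 1))
    return cpus


def last_level_cache_groups(root="/sys/devices/system/cpu"):
    """Group CPUs by the sharing set of their highest-level cache."""
    groups = set()
    for cache in Path(root).glob("cpu[0-9]*/cache"):
        best_level, best_list = -1, None
        for index in cache.glob("index[0-9]*"):
            try:
                level = int((index / "level").read_text())
                shared = (index / "shared_cpu_list").read_text()
            except (OSError, ValueError):
                continue
            if level > best_level:
                best_level, best_list = level, shared
        if best_list is not None:
            groups.add(tuple(parse_cpu_list(best_list)))
    return sorted(groups)


if __name__ == "__main__":
    for group in last_level_cache_groups():
        print("CPUs sharing a last-level cache:", list(group))
```

On a 2x2 part like the one described, one would expect two groups of two CPUs, but that is an expectation and not something verified here.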

Giuseppe Ghibò, 2 April 2019 at 05:02
Hi. Upstream kernels from around 4.19.8 onwards use the CONFIG_SCHED_SMT option in sched_cpu_activate()/sched_cpu_deactivate(); shouldn't MuQSS.c include the following patch too?

https://pastebin.com/ZB7X0WQe

Anonymous, 6 April 2019 at 09:25
The MuQSS patch fails to apply cleanly to 5.0.7:
patching file kernel/sysctl.c
Hunk #1 FAILED at 127.
Hunk #2 succeeded at 297 (offset 1 line).
Hunk #3 succeeded at 314 (offset 1 line).
Hunk #4 succeeded at 469 (offset 1 line).
Hunk #5 succeeded at 1042 (offset 1 line).
1 out of 5 hunks FAILED -- saving rejects to file kernel/sysctl.c.rej
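When a patch touches many files, it can be handy to pull just the failed hunks out of the output. Here is a minimal Python sketch that parses output in the format quoted above; the function name is made up, and the regular expression is based only on these lines.

```python
import re


def failed_hunks(patch_output):
    """Map each patched file to the list of hunk numbers that FAILED."""
    failures = {}
    current = None
    for line in patch_output.splitlines():
        line = line.strip()
        if line.startswith("patching file "):
            current = line[len("patching file "):]
            continue
        match = re.match(r"Hunk #(\d+) FAILED at (\d+)", line)
        if match and current is not None:
            failures.setdefault(current, []).append(int(match.group(1)))
    return failures


SAMPLE = """\
patching file kernel/sysctl.c
Hunk #1 FAILED at 127.
Hunk #2 succeeded at 297 (offset 1 line).
Hunk #3 succeeded at 314 (offset 1 line).
Hunk #4 succeeded at 469 (offset 1 line).
Hunk #5 succeeded at 1042 (offset 1 line).
1 out of 5 hunks FAILED -- saving rejects to file kernel/sysctl.c.rej
"""

if __name__ == "__main__":
    print(failed_hunks(SAMPLE))  # {'kernel/sysctl.c': [1]}
```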

Sveinar Søpler, 7 April 2019 at 02:18
The failed hunk looks like this for the 5.0.7 kernel:
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index ba4d9e8..37cf9d5 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -128,8 +128,14 @@ static int __maybe_unused two = 2;
 static int __maybe_unused four = 4;
 static unsigned long one_ul = 1;
 static unsigned long long_max = LONG_MAX;
-static int one_hundred = 100;
-static int one_thousand = 1000;
+static int __read_mostly one_hundred = 100;
+static int __read_mostly one_thousand = 1000;
+#ifdef CONFIG_SCHED_MUQSS
+extern int rr_interval;
+extern int sched_interactive;
+extern int sched_iso_cpu;
+extern int sched_yield_type;
+#endif
 #ifdef CONFIG_PRINTK
 static int ten_thousand = 10000;
 #endif

The offsets are just cosmetic, so you don't need to change them.
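The numbers in the @@ line can be checked against the hunk body: the old length counts context plus removed lines, and the new length counts context plus added lines. A minimal Python sketch of that arithmetic, using the hunk quoted above (the helper names are made up; any line starting with neither "+" nor "-" is treated as context):

```python
import re

HEADER_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line):
    """Return (old_start, old_len, new_start, new_len) from an '@@' line."""
    match = HEADER_RE.match(line)
    if match is None:
        raise ValueError(f"not a hunk header: {line!r}")
    old_start, old_len, new_start, new_len = match.groups()
    # An omitted length means a single line.
    return int(old_start), int(old_len or 1), int(new_start), int(new_len or 1)


def count_lines(body):
    """Count the lines a hunk body covers in the old and the new file."""
    old = new = 0
    for line in body:
        if line.startswith("-"):
            old += 1          # removed: only in the old file
        elif line.startswith("+"):
            new += 1          # added: only in the new file
        else:
            old += 1          # context: in both files
            new += 1
    return old, new


HEADER = "@@ -128,8 +128,14 @@ static int __maybe_unused two = 2;"
BODY = [
    " static int __maybe_unused four = 4;",
    " static unsigned long one_ul = 1;",
    " static unsigned long long_max = LONG_MAX;",
    "-static int one_hundred = 100;",
    "-static int one_thousand = 1000;",
    "+static int __read_mostly one_hundred = 100;",
    "+static int __read_mostly one_thousand = 1000;",
    "+#ifdef CONFIG_SCHED_MUQSS",
    "+extern int rr_interval;",
    "+extern int sched_interactive;",
    "+extern int sched_iso_cpu;",
    "+extern int sched_yield_type;",
    "+#endif",
    " #ifdef CONFIG_PRINTK",
    " static int ten_thousand = 10000;",
    " #endif",
]

if __name__ == "__main__":
    print(parse_hunk_header(HEADER))  # (128, 8, 128, 14)
    print(count_lines(BODY))          # (8, 14)
```

Six context lines plus two removed lines give the old length of 8; six context lines plus eight added lines give the new length of 14. The start line of 128, against the original failure at 127, is consistent with the one-line offset that patch reported for the other hunks.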

Klotz, 10 April 2019 at 20:10
I'd like to add that with 5.0.7 the patch might also fail because of the new $(LIBELF_FLAGS) in tools/objtool/Makefile.

This can be remedied with a short sed command before applying the ck patchset:

sed -i '/-CFLAGS/ s/$/ \$(LIBELF_FLAGS)/' patch-5.0-ck1

As for the rejects in kernel/sysctl.c, these can be worked around with a fuzz factor of 3 as pointed out above.

With these two tweaks I was able to build 5.0.7 with the -ck patches applied.
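For anyone who would rather see what the sed one-liner does before running it in place, here is a minimal Python sketch of the same transformation: every line of the patch that contains "-CFLAGS" gets " $(LIBELF_FLAGS)" appended. The function name and the sample lines are hypothetical and are not taken from the actual ck patch.

```python
MARKER = "-CFLAGS"
SUFFIX = " $(LIBELF_FLAGS)"


def append_libelf_flags(patch_text):
    """Append SUFFIX to every line containing MARKER, like the sed command."""
    lines = patch_text.split("\n")
    return "\n".join(line + SUFFIX if MARKER in line else line
                     for line in lines)


if __name__ == "__main__":
    # Hypothetical patch fragment, for illustration only.
    sample = "-CFLAGS := -O2 $(WARNINGS)\n+CFLAGS := -O3 $(WARNINGS)\n"
    print(append_libelf_flags(sample), end="")
```

Only lines containing -CFLAGS change; a +CFLAGS line is left alone. Like the sed command, this is not idempotent, so it should be run once on a fresh copy of the patch. The fuzz factor is set separately, through GNU patch's -F (--fuzz) option.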

Anonymous, 13 April 2019 at 14:02
[ 1.535455] ------------[ cut here ]------------
[ 1.535460] Current state: 1
[ 1.535464] WARNING: CPU: 1 PID: 0 at 0xffffffff8108c865
[ 1.535466] Modules linked in:
[ 1.535469] CPU: 1 PID: 0 Comm: MuQSS/1 Not tainted 5.0.7-ck1 #3
[ 1.535471] Hardware name: System manufacturer System Product Name/Crosshair IV Formula, BIOS 3029 10/09/2012
[ 1.535474] RIP: 0010:0xffffffff8108c865
[ 1.535476] Code: 04 77 29 89 f1 ff 24 cd 38 76 a0 81 80 3d 53 1b bd 00 00 75 17 89 c6 48 c7 c7 90 c6 ad 81 c6 05 41 1b bd 00 01 e8 7b ae fa ff <0f> 0b 48 83 c4 08 5b c3 48 8b 47 60 48 85 c0 75 64 83 fe 03 89 73
[ 1.535480] RSP: 0018:ffff888437c43f50 EFLAGS: 00010082
[ 1.535482] RAX: 0000000000000010 RBX: ffff888437c504c0 RCX: ffffffff81c1fdb8
[ 1.535483] RDX: 0000000000000001 RSI: 0000000000000086 RDI: ffffffff81f8fcac
[ 1.535485] RBP: 7fffffffffffffff R08: 00000000000001f0 R09: 0000000000000000
[ 1.535487] R10: 0720072007200720 R11: 0720072007200720 R12: 7fffffffffffffff
[ 1.535489] R13: ffff888437c56900 R14: ffff888437c569f8 R15: ffff888437c56a38
[ 1.535491] FS: 0000000000000000(0000) GS:ffff888437c40000(0000) knlGS:0000000000000000
[ 1.535493] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.535494] CR2: 0000000000000000 CR3: 0000000001c0c000 CR4: 00000000000006e0
[ 1.535496] Call Trace:
[ 1.535498] <IRQ>
[ 1.535500] 0xffffffff8108e7cb
[ 1.535502] 0xffffffff810817cb
[ 1.535503] 0xffffffff81601568
[ 1.535505] 0xffffffff8160117f
[ 1.535506] </IRQ>
[ 1.535507] RIP: 0010:0xffffffff8100f592
[ 1.535509] Code: 0f ba e0 24 72 11 65 8b 05 bb eb ff 7e fb f4 65 8b 05 b2 eb ff 7e c3 bf 01 00 00 00 e8 17 e0 07 00 65 8b 05 a0 eb ff 7e fb f4 <65> 8b 05 97 eb ff 7e fa 31 ff e8 ff df 07 00 fb c3 66 66 2e 0f 1f
[ 1.535512] RSP: 0018:ffffc9000007bf00 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
[ 1.535515] RAX: 0000000000000001 RBX: 0000000000000001 RCX: 0000000000000001
[ 1.535516] RDX: 000000005b7f1466 RSI: 0000000000000001 RDI: 0000000000000380
[ 1.535518] RBP: ffffffff81c601a8 R08: 0000000000000000 R09: 0000000000019840
[ 1.535520] R10: 0000001e3c819be7 R11: 000000007260bc7a R12: 0000000000000000
[ 1.535522] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 1.535524] 0xffffffff8105bd2f
[ 1.535526] 0xffffffff8105bf5b
[ 1.535527] 0xffffffff810000d4
[ 1.535529] ---[ end trace 71fe021b29fa5d1f ]---

I am having this problem on all my Phenom II systems. It looks like some kind of interrupt problem. I tried the nothreadedirqs option and also enabled the fix for broken boot IRQs, but neither had any effect. I enabled stack traces, but for some reason they are not shown :x

Does anyone know what this is? I searched around and found this, which may be helpful: https://pastebin.com/y0aXvBNP
(this is not mine but it looks very much like mine)

Thanks :)

Manoa, 13 April 2019 at 16:33
Here is more information: https://gist.github.com/ManoaNosea/69f698e40661be29df5016143bd81a86

Manoa, 14 April 2019 at 06:37
I used Kyber in all of this.

Manoa, 15 April 2019 at 02:42
Thanks :) That's a big help :) I will test all these functions :)

Manoa, 15 April 2019 at 02:57
But it looks very strange: these functions point to problems in things like HPET, x2APIC and IOMMU (I don't really know these functions; maybe this is not necessarily a kernel problem at all, maybe the hardware itself is bad), yet it shows that MuQSS is affected. This is what I don't understand... could MuQSS be calling these functions?

Manoa, 15 April 2019 at 03:23
It's strange because I am not running any virtualization; only python3 was running on the computer :x

Manoa, 15 April 2019 at 21:24
I disabled HPET, IOMMU and x2APIC, but that didn't fix it: https://gist.github.com/ManoaNosea/7ceec21d49d87dc679265a1371c0433e

Manoa, 15 April 2019 at 21:32
Thanks, now I know why it didn't give symbols and gave addresses instead.

Manoa, 15 April 2019 at 21:55
[ 1.864822] Current state: 1
[ 1.864829] WARNING: CPU: 2 PID: 0 at clockevents_switch_state+0x45/0xe0
[ 1.864830] Modules linked in:
[ 1.864833] CPU: 2 PID: 0 Comm: MuQSS/2 Not tainted 5.0.7-ck1 #10
[ 1.864835] Hardware name: System manufacturer System Product Name/Crosshair IV Formula, BIOS 3029 10/09/2012
[ 1.864839] RIP: 0010:clockevents_switch_state+0x45/0xe0
[ 1.864841] Code: 04 77 29 89 f1 ff 24 cd 38 75 a0 81 80 3d 53 24 bd 00 00 75 17 89 c6 48 c7 c7 c8 9e b7 81 c6 05 41 24 bd 00 01 e8 db af fa ff <0f> 0b 48 83 c4 08 5b c3 48 8b 47 60 48 85 c0 75 64 83 fe 03 89 73
[ 1.864844] RSP: 0018:ffff888437c83f50 EFLAGS: 00010082
[ 1.864846] RAX: 0000000000000010 RBX: ffff888437c904c0 RCX: ffffffff81c1f618
[ 1.864848] RDX: 0000000000000001 RSI: 0000000000000086 RDI: ffffffff81d41c04
[ 1.864850] RBP: 7fffffffffffffff R08: 00000000000001e6 R09: 0000000000000000
[ 1.864852] R10: 0720072007200720 R11: 0720072007200720 R12: 7fffffffffffffff
[ 1.864854] R13: ffff888437c96900 R14: ffff888437c969f8 R15: ffff888437c96a38
[ 1.864856] FS: 0000000000000000(0000) GS:ffff888437c80000(0000) knlGS:0000000000000000
[ 1.864858] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.864860] CR2: 0000000000000000 CR3: 0000000001c0c000 CR4: 00000000000006e0
[ 1.864861] Call Trace:
[ 1.864864] <IRQ>
[ 1.864866] ? tick_program_event+0x4b/0x80
[ 1.864869] ? hrtimer_interrupt+0x12b/0x220
[ 1.864872] ? smp_apic_timer_interrupt+0x48/0xa0
[ 1.864874] ? apic_timer_interrupt+0xf/0x20
[ 1.864875] </IRQ>
[ 1.864877] ? amd_e400_idle+0x32/0x60
[ 1.864880] ? do_idle+0x1cf/0x280
[ 1.864882] ? cpu_startup_entry+0x1b/0x20
[ 1.864884] ? secondary_startup_64+0xa4/0xb0

It looks like an hrtimer problem, but the kernel configuration doesn't allow disabling that option, so I enabled HPET and booted with clocksource=hpet, but the boot error is the same :x

Manoa, 15 April 2019 at 21:57
Could it be a dynticks idle problem?

Manoa, 15 April 2019 at 23:53
I disabled dynticks idle and the boot error is gone.
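When comparing boots like these, it helps to record which tick and clocksource parameters the running kernel was actually started with. This is a minimal Python sketch, assuming /proc/cmdline and the clocksource sysfs file are readable; the helper names are made up. nohz=off is the mainline boot parameter for turning dynamic ticks off at boot; the comments do not say whether that parameter or a rebuild with a periodic tick was used here.

```python
from pathlib import Path


def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to True.

    Simplification: quoted values containing spaces are not handled.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else True
    return params


def read_text(path):
    """Return the stripped file contents, or '' if the file cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return ""


if __name__ == "__main__":
    params = parse_cmdline(read_text("/proc/cmdline"))
    print("nohz:", params.get("nohz", "(not given)"))
    print("clocksource:", params.get("clocksource", "(not given)"))
    print("current clocksource:", read_text(
        "/sys/devices/system/clocksource/clocksource0/current_clocksource"))
```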

Anonymous, 16 April 2019 at 00:27
On an AMD Ryzen 2400G APU, booting MuQSS with MC runqueue sharing (the default) results in 25 runqueues, which is more than odd because this is a 4-core, 8-thread CPU:
kernel: MuQSS runqueue share type MC total runqueues: 25
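A quick sanity check for this kind of report is to compare the runqueue count from the boot log with the number of logical CPUs. Below is a minimal Python sketch; the regular expression is based only on the single log line quoted above, and the idea that the count should not exceed the number of logical CPUs is the commenter's expectation, treated here as an assumption.

```python
import os
import re

RUNQUEUE_RE = re.compile(
    r"MuQSS runqueue share type (\S+) total runqueues: (\d+)")


def parse_runqueue_line(log_text):
    """Return (share_type, runqueue_count) from the first match, or None."""
    match = RUNQUEUE_RE.search(log_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def more_runqueues_than_cpus(runqueues, cpus):
    """True when the log reports more runqueues than logical CPUs."""
    return runqueues > cpus


if __name__ == "__main__":
    line = "kernel: MuQSS runqueue share type MC total runqueues: 25"
    share_type, count = parse_runqueue_line(line)
    cpus = os.cpu_count() or 0  # CPUs of the machine running this script
    print(share_type, count, "runqueues;", cpus, "logical CPUs here")
    print("suspicious for 8 threads:", more_runqueues_than_cpus(count, 8))
```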

When using virtualization (KVM), a lot of errors like these appear:
apr 15 16:44:52 kernel: BUG: using smp_processor_id() in preemptible [00000000] code: CPU 4/KVM/14536
apr 15 16:44:52 kernel: caller is single_task_running+0xe/0x30
apr 15 16:44:52 kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./AB350M Pro4, BIOS P5.70 03/14/2019
apr 15 16:44:52 kernel: Call Trace:
apr 15 16:44:52 kernel: dump_stack+0x65/0x8a
apr 15 16:44:52 kernel: debug_smp_processor_id+0xe8/0xf0
apr 15 16:44:52 kernel: single_task_running+0xe/0x30
apr 15 16:44:52 kernel: kvm_vcpu_block+0x230/0x370 [kvm]
apr 15 16:44:52 kernel: kvm_arch_vcpu_ioctl_run+0x312/0x1cc0 [kvm]
apr 15 16:44:52 kernel: kvm_vcpu_ioctl+0x24b/0x630 [kvm]
apr 15 16:44:52 kernel: ? hrtimer_start_range_ns+0x1ce/0x360
apr 15 16:44:52 kernel: do_vfs_ioctl+0xa9/0x760
apr 15 16:44:52 kernel: ? __schedule+0xa5a/0xd90
apr 15 16:44:52 kernel: ? __fget+0x73/0xa0
apr 15 16:44:52 kernel: __x64_sys_ioctl+0x6a/0xa0
apr 15 16:44:52 kernel: do_syscall_64+0x5a/0x110
apr 15 16:44:52 kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
apr 15 16:44:52 kernel: RIP: 0033:0x7f25c93995d7

With the standard Ubuntu mainline kernel, there are no errors.

Manoa, 16 April 2019 at 00:54
But it gives kernel problems again when running: https://gist.github.com/ManoaNosea/61fff8a1504b869d352b938528a5a4ab

Manoa, 16 April 2019 at 04:46
Thanks. I tested the same kernel with the same .config but without the patch: no problems at all. This is a MuQSS problem; I think it doesn't work with AMD.

Anonymous, 28 April 2019 at 18:58
I admire your patchset. As a suggestion to boost single-threaded performance: will the scheduler allow a single thread to hog a core?

This would boost FPS in games and improve web app run times.
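Short of a scheduler change, user space can already restrict a process to one CPU through CPU affinity. This is a minimal Python sketch using os.sched_setaffinity (Linux only), assuming the kernel honours affinity the way mainline does; the function name is made up. Pinning only limits where the process may run. It does not keep other tasks off that core, so it is not the exclusive "hogging" asked about.

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict `pid` (0 = the calling process) to one CPU; return the new mask."""
    os.sched_setaffinity(pid, {cpu})
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    allowed = os.sched_getaffinity(0)
    target = min(allowed)  # pick any CPU we are already allowed to run on
    print("before:", sorted(allowed))
    print("after: ", sorted(pin_to_cpu(target)))
```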

Anonymous, 30 April 2019 at 01:42
Another tool for testing schedulers in a "gaming" workflow, used to test variable refresh rate implementations and for visual research: github.com/kleinerm/VRRTestPlots

Anonymous, 16 May 2019 at 04:28
Did anyone adapt this to 5.1 yet?