Friday 26 May 2017

linux-4.11-ck2, MuQSS version 0.156 for linux-4.11

Announcing a new -ck release, 4.11-ck2, with the latest version of the Multiple Queue Skiplist Scheduler, version 0.156. These are patches designed to improve system responsiveness and interactivity with specific emphasis on the desktop, but configurable for any workload.

linux-4.11-ck2

-ck2 patches:

Git tree:


MuQSS

Download:

Git tree:


MuQSS 0.156 updates

- Fixed failing uniprocessor (UP) builds.
- Removed the last traces of the global run queue data, moving nr_running, nr_uninterruptible and nr_switches to each runqueue. nr_running is now calculated accurately only once, at the end of each context switch, and the variable is reused in place of rq_load (this may improve reported load accuracy).

4.11-ck2 updates

- Make full preempt the default on all arches.
- Revert the inappropriately reverted part of the vmsplit patch.

Enjoy!
お楽しみ下さい (Please enjoy)
-ck

I seem to have unintentionally deleted the -ck1 post, sorry about that.

34 comments:

  1. 4.11.3-ck1 (and probably other versions too) seems to move the running process(es) between the CPUs frequently. This causes a problem with CPU load accounting and thus CPU frequency scaling (conservative governor).

    The symptom is that on a 2-CPU system (FUJITSU ESPRIMO Mobile V6555 laptop w/Intel Core2 Duo T6570) with a single CPU-intensive process, the frequency does not get raised at all. In this case both cores probably see around 50% load each, which is lower than the 80% default threshold of the conservative governor.

    If the process is pinned to one of the cores, the frequency of the core the process is pinned to rises to the maximum as expected.
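
    (For illustration, a minimal way to do that pinning from a shell, assuming the standard util-linux taskset tool is available; <pid> is a placeholder for the CPU-intensive process:)

    # pin an already-running process to CPU 0 only
    taskset -cp 0 <pid>
    # or start a test busy loop already pinned to CPU 0
    taskset -c 0 sh -c 'while true; do true; done'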

    Running two cpu-intensive processes on this 2 core system raises the frequency of both cores as expected.

    Any ideas how to fix this?

    By the way, powertop seems to mess up something in the kernel and frequencies stay low after starting powertop. Changing the governor to something else and then back again to conservative fixes this issue.
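
    (For reference, that governor round-trip can be done through the usual cpufreq sysfs paths, assuming they exist on this system and you have root; this is only a sketch of the workaround described above:)

    # switch every core away from conservative and back again
    for g in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do echo performance > "$g"; done
    for g in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do echo conservative > "$g"; done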

    thanks
    Gabor

    Replies
    1. It's intrinsic to the latency-minimising design that tasks will move around to get the lowest-latency scheduling for them. If you want it to do that less, disable interactive mode:
      echo 0 > /proc/sys/kernel/interactive
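
      (A small sketch for making that stick across reboots, assuming your distro applies /etc/sysctl.d at boot; /proc/sys/kernel/interactive corresponds to the sysctl key kernel.interactive, and the file name below is just an example:)

      sysctl -w kernel.interactive=0                                  # apply immediately
      echo 'kernel.interactive = 0' > /etc/sysctl.d/90-muqss.conf     # persist (example file name)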

    2. Thanks, setting interactive to 0 indeed improves the situation a lot.

    3. Thank you both for bringing this up and for clarifying! The combination of symptoms, design intentions and possible successful solution makes it easier to understand how MuQSS works under certain conditions. :-)
      BR, Manuel Krause

    4. Indeed disabling interactive mode may also prolong battery life.

  2. Ever since I started using 4.11 I keep getting these kernel panics:

    ------------[ cut here ]------------
    WARNING: CPU: 1 PID: 449 at net/ipv4/tcp_input.c:2819 tcp_fastretrans_alert+0x8e7/0xad0
    Modules linked in: ip6table_nat nf_nat_ipv6 ip6t_REJECT nf_reject_ipv6 ip6table_mangle ip6table_raw nf_conntrack_ipv6 nf_defrag_ipv6 xt_recent ipt_REJECT nf_reject_ipv4 xt_comment xt_multiport xt_conntrack xt_hashlimit xt_addrtype xt_mark xt_nat xt_tcpudp xt_CT iptable_raw nf_log_ipv6 xt_NFLOG nfnetlink_log xt_LOG nf_log_ipv4 nf_log_common nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda ts_kmp nf_conntrack_amanda nf_conntrack_sane nf_conntrack_tftp nf_conntrack_sip nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netlink nfnetlink nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp ip6table_filter ip6_tables iptable_filter iptable_mangle ipt_MASQUERADE
    nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c crc32c_generic btrfs xor adt7475 hwmon_vid iTCO_wdt gpio_ich iTCO_vendor_support evdev mac_hid raid6_pq nouveau led_class mxm_wmi wmi video psmouse ttm i2c_i801 drm_kms_helper lpc_ich skge sky2 drm syscopyarea sysfillrect asus_atk0110 sysimgblt fb_sys_fops i2c_algo_bit button shpchp intel_agp intel_gtt acpi_cpufreq tpm_tis tpm_tis_core tpm sch_fq_codel coretemp msr nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc ip_tables x_tables ext4 crc16 jbd2 fscrypto mbcache sd_mod ata_generic pata_acpi serio_raw atkbd libps2 uhci_hcd ehci_pci ehci_hcd ahci libahci usbcore usb_common pata_jmicron mpt3sas raid_class libata scsi_transport_sas scsi_mod i8042 serio
    CPU: 1 PID: 449 Comm: irq/19-enp5s4 Tainted: G W 4.11.3-1-ck-core2 #1
    Hardware name: System manufacturer System Product Name/P5B-Deluxe, BIOS 1238 09/30/2008
    Call Trace:

    dump_stack+0x63/0x83
    __warn+0xcb/0xf0
    warn_slowpath_null+0x1d/0x20
    tcp_fastretrans_alert+0x8e7/0xad0
    tcp_ack+0xe57/0x14f0
    tcp_rcv_established+0x11f/0x6f0
    ? sk_filter_trim_cap+0xb7/0x270
    tcp_v4_do_rcv+0x130/0x210
    tcp_v4_rcv+0xb39/0xcc0
    ip_local_deliver_finish+0xa1/0x200
    ip_local_deliver+0x5d/0x100
    ? inet_del_offload+0x40/0x40
    ip_rcv_finish+0x1eb/0x3f0
    ip_rcv+0x2b3/0x3c0
    ? ip_local_deliver_finish+0x200/0x200
    __netif_receive_skb_core+0x507/0xa70
    ? tcp4_gro_receive+0x11a/0x1c0
    ? try_preempt+0x160/0x190
    __netif_receive_skb+0x18/0x60
    netif_receive_skb_internal+0x81/0xd0
    napi_gro_receive+0xdb/0x150
    skge_poll+0x397/0x880 [skge]
    net_rx_action+0x242/0x3d0
    __do_softirq+0x104/0x2e1
    ? irq_finalize_oneshot.part.2+0xe0/0xe0
    do_softirq_own_stack+0x1c/0x30

    do_softirq.part.4+0x41/0x50
    __local_bh_enable_ip+0x88/0xa0
    irq_forced_thread_fn+0x59/0x70
    irq_thread+0x12f/0x1c0
    ? wake_threads_waitq+0x30/0x30
    kthread+0x108/0x140
    ? irq_thread_dtor+0xc0/0xc0
    ? kthread_create_on_node+0x70/0x70
    ret_from_fork+0x2c/0x40
    ---[ end trace a181bdf0ee69c250 ]---

    After a while the computer just hangs.

    Replies
    1. Try it on mainline and if it still happens report it upstream. MuQSS, like BFS, brings out races very easily so it may be hard to reproduce on mainline though.

    2. https://bugzilla.kernel.org/show_bug.cgi?id=195835

      It doesn't lead to hang for me, however.

  3. I get really high CPU load with 4.11. These processes have huge CPU spikes:

    rcu_preempt
    kworker/u8:
    irq/279-s-iwlwi
    ksoftirqd/1

    My fans are screaming. The stock Arch kernel is calm. Can anybody else confirm?

    Replies
    1. My CPU

      processor : 0
      vendor_id : GenuineIntel
      cpu family : 6
      model : 78
      model name : Intel(R) Core(TM) i5-6260U CPU @ 1.80GHz
      stepping : 3
      microcode : 0xba
      cpu MHz : 400.000
      cache size : 4096 KB
      physical id : 0
      siblings : 4
      core id : 0
      cpu cores : 2
      apicid : 0
      initial apicid : 0
      fpu : yes
      fpu_exception : yes
      cpuid level : 22
      wp : yes
      flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp
      bugs :
      bogomips : 3601.00
      clflush size : 64
      cache_alignment : 64
      address sizes : 39 bits physical, 48 bits virtual
      power management:

    2. Try without threaded IRQs enabled. It could be a driver issue brought out by threaded IRQs.
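
      (A quick way to check whether forced IRQ threading is in effect, assuming the kernel config is exposed via /proc/config.gz; otherwise look in /boot/config-$(uname -r):)

      zgrep -E 'FORCE_IRQ_THREADING|IRQ_FORCED_THREADING' /proc/config.gz   # is forced threading configured in?
      ps -eo pid,comm | grep 'irq/'                                         # are handlers running as irq/ kernel threads?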

    3. Disabling CONFIG_FORCE_IRQ_THREADING? I still have these processes at the top.

    4. Hmm, wasn't there an issue for iwlwifi users some time ago, discussed on here? I don't remember completely.
      Maybe switching from built-in to module or vice versa may help, or getting fresh firmware.

      BR, Manuel Krause

    5. Blacklisting iwlwifi does not help either.

  4. Hopefully I'm not the only one to report this, but load averages are fixed. While the workload is constant, the load average for the last minute now roughly represents the total amount of CPU being used.

    This also fixes high loads while nothing is happening - instead of a solid 1.00 or something close to it like 0.96, loads while idle basically look correct, either completely 0.00 or under 0.10.

    Thanks Con!

    Replies
    1. Thanks for reporting back. I was pretty sure I'd fixed them but I thought I'd wait for users to confirm :)

  5. Yeah, I've been logging/graphing (munin, because it works for what I need) ever since the -ck2 bump, and when I'm actually AFK and things are idle it does seem to show reasonable "very close to zero" loads. Good job it's working sanely :)

  6. After months I have again tried a kernel patched with MuQSS on a netbook with an Atom Z520.

    Again, a kernel panic at almost every boot.
    The laptop works without problems with an unpatched kernel and also with an old kernel with the BFS patch.

    Is there anything that I could try?

  7. Fwiw, I've been using MuQSS on my old netbook (Eee 701 with Celeron M ULV 353) for a long time, and with BFS before that. However, mine is UP, vs the Z520's SMT. Perhaps providing the panic info would help. Did you use the vanilla kernel's config?

  8. I noticed something weird about how 'htop' reports CPU% for processes. In comparison, 'top' reports things close to what you'd think it should report.

    I tested with a busy loop in bash like this:

    while true; do true; done

    This shows 100% in 'top' but shows 83% in 'htop' in the CPU% column.

    Then next, I experimented with spawning a bunch of sub-shells with those busy loops, like in this example:

    for x in {1..8}; do while true; do true; done& done; sleep 10; kill $(jobs -p)

    This example is for 8 processes. They run for ten seconds and then get killed.

    I repeated this starting with 1 process and up to 12 processes, and this is what 'htop' and 'top' report in their CPU% column for those processes:

    num, htop, top
    1, 83, 100
    2, 70, 100
    3, 60, 100
    4, 52, 100
    5, 42, 84
    6, 35, 70
    7, 30, 60
    8, 26, 53
    9, 23, 46
    10, 21, 42
    11, 19, 38
    12, 17, 35

    I have a quad-core CPU (and no SMT). The output of 'top' seems to be kind of right, but I have a hunch it's also off and always calculating a result that's double what's in 'htop'. It might just get clamped to 100%, and that makes the numbers up to 4 processes look good.
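
    (As a rough cross-check independent of top/htop, assuming a single busy-loop PID in $pid: fields 14 and 15 of /proc/<pid>/stat are utime and stime in clock ticks, so the delta over an interval gives the process's CPU%:)

    t1=$(awk '{print $14+$15}' "/proc/$pid/stat"); sleep 10
    t2=$(awk '{print $14+$15}' "/proc/$pid/stat")
    echo "CPU%: $(( 100 * (t2 - t1) / ( $(getconf CLK_TCK) * 10 ) ))"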

    Kernel is Linux 4.11.3 with 4.11-ck2 patches.

    Replies
    1. I did another experiment with this perl one-liner:

      perl -E 'use Time::HiRes qw/time sleep/; $t = 0.010; while (1) { $t0 = time; while ($t > time - $t0) {}; sleep $t; }'

      This is supposed to be in a busy loop for 10ms, then sleep for 10ms, and then this all repeats. It's supposed to show 50% in the CPU% column of 'top' and 'htop', and it behaves exactly like that with a 4.11.3 kernel using CFS.

      When changing that "$t" in the while loop to "$t/2", "$t/3", "$t*2", "$t*3", it's supposed to result in 33%, 25%, 66%, 75% CPU usage, and that's again what happens with CFS.

      Then going to the kernel using MuQSS, the displayed numbers are jumping around a lot, so I have to guess the average. With CFS, the percentage shown was quite stable. The numbers I see with MuQSS are like this:

      expected, htop, top
      25, 33, 35
      33, 43, 47
      50, 55, 69
      66, 66, 81
      75, 71, 91

      The numbers displayed in top/htop were changing by over 10% from second to second, so they are really just guesses. I wrote down min/max values that I saw and used the average between those two. This wasn't needed at all when testing with CFS where top and htop showed pretty stable numbers.

    2. The CPU accounting is done completely differently in MuQSS, but you're possibly also seeing sampling differences between 100Hz kernels and kernels built with different Hz settings.

  9. This is the first time I'm using MuQSS, and my workload is just focused on virtual machines and compilation tasks (QEMU/KVM).

    I've noticed the following:

    - Same performance as CFS, but probably snappier (just a feeling)
    - qemu/kvm processes are using much more CPU than before:

    These numbers (CPU utilization) were measured with htop while leaving the VMs idle.

    Before (CFS):

    Windows 7 Idle: 3-7% CPU
    Windows 10 Idle: 4-8% CPU
    Ubuntu 17.04 Idle: 1-2% CPU

    After (MuQSS-156):

    Windows 7 Idle: 18-38%
    Windows 10 Idle: 11-41%
    Ubuntu 17.04 Idle: 1-3%


    Is MuQSS affecting KVM performance somehow? I am not sure why the Windows CPU utilization is so high while Ubuntu's CPU utilization hasn't changed.

    @ck Do you have any fix for this?

    - Nick

    Replies
    1. The scheduler can't physically make the virtualised operating system use any more CPU, and what you are seeing is almost certainly just the sampling error differences between CFS and MuQSS. The CPU accounting is performed differently by the two schedulers.

    2. Ah, I see. Do you have any idea why this sampling error is affecting the Windows VM but not the Ubuntu VM?

      And most importantly, is it possible to fix this? Actually, I don't care about the high CPU reading, but the large difference between the Ubuntu and Windows VMs worries me somewhat.

    3. Windows' timers work differently to Linux's, and I'm guessing they happen to be landing at exactly the sampling points used in MuQSS. Fixing it is unlikely any time soon without knowing exactly what's causing it, and I'm afraid I don't have the spare time to dedicate to it.

  10. Hi, long-time BFS/ck patches user here.
    With this patchset my Gentoo box has 3 of my 4 cores always at 50% (atop, htop and top report the same).
    The 1-minute load average is below 1. Any idea why? Is this normal?

  11. https://docs.google.com/spreadsheets/d/14nHLMeJXOqxj-mlMk_vdb7yLhLOPRdNS4JkYMU0ArHI/edit?usp=sharing

    Just as I finish benchmarking, 4.11.9 comes out. I'm a very sad man.

    There seem to be some latency regressions compared with my last test, but I really haven't been keeping track of what the causes are. Colors aside, ck2 still relatively* meets more deadlines.

    Input latency with CFS on Optimus is still quite noticeable with primusrun, but no longer as much with PRIME (no sync); barely any difference with respect to MuQSS.

  12. Hi, I've been using the -ck patches for a long time. I was experiencing a freeze in Xorg (Firefox + GNOME) every 2-3 seconds with the default 100Hz tick even before MuQSS, so I had been manually setting it to 1000Hz. The first iterations were better, I think, but I'm still getting the freezes now with the default 100Hz. Even with 1000Hz it's still there, but much less than before, so I might not even notice it. Any idea where I can track down its source? CPU: Intel(R) Core(TM) i7-4510U CPU @ 2.00GHz
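
    (To confirm which tick rate a given kernel was actually built with, assuming the config is exposed in one of the usual places:)

    zgrep 'CONFIG_HZ' /proc/config.gz           # needs CONFIG_IKCONFIG_PROC
    grep 'CONFIG_HZ' /boot/config-$(uname -r)   # common distro location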

  13. Hi, any plans for a 4.12 release?

  14. MuQSS increases my CPU usage when idling; in other words, it's not stable and goes from 3 to 50 percent. Without MuQSS the idle is at 0-1%. Any suggestions?

    Replies
    1. My suggestion is to ignore it; it's a sampling difference and it doesn't actually use more CPU.
