linux-4.10-ck1
-ck1 patches: http://ck.kolivas.org/patches/4.0/4.10/4.10-ck1/
Git tree:
https://github.com/ckolivas/linux/tree/4.10-ck
Ubuntu 16.10 packages (sorry I'm no longer on 16.04):
http://ck.kolivas.org/patches/4.0/4.9/4.10-ck1/Ubuntu16.10/
MuQSS
Download: 4.10-sched-MuQSS_152.patch
Git tree:
4.10-muqss
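For reference, applying the standalone patch usually looks roughly like this (a sketch only; it assumes a clean 4.10 source tree, an existing .config, and the usual -p1 patch level used by these patches):
  cd linux-4.10
  patch -p1 < ../4.10-sched-MuQSS_152.patch   # add --dry-run first if unsure
  make oldconfig && make -j"$(nproc)"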
MuQSS 0.152 updates
- Removed the rapid ramp-up in schedutil cpufreq, which was overactive
- Bugfixes
4.10-ck1 updates
Apart from resyncing with the latest tree from linux-bfq:
- The wb-buf-throttling patches are now part of mainline and do not need to be added separately
- Minor swap setting tweaks
For those of you trying to build the evil nvidia driver for linux-4.10, this patch will help:
nvidia-375.39-linux-4.10.patch
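If you haven't applied such a patch to the installer before, one way that usually works is roughly the following (a sketch only; the exact .run file name and the -p level are assumptions, so check the patch header for the real paths):
  sh NVIDIA-Linux-x86_64-375.39.run --extract-only
  cd NVIDIA-Linux-x86_64-375.39
  patch -p1 < ../nvidia-375.39-linux-4.10.patch
  sudo ./nvidia-installer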
Enjoy!
Please enjoy!
-ck
Nice low latency you got there ;).
Also thanks for the nvidia patch, couldn't find one before.
@CK I have weird SCHED logs in dmesg
#3
[ 0.753442] SCHED: No cpumask for kworker/4:0/36
...
[ 0.131197] TSC deadline timer enabled
[ 0.131200] smpboot: CPU0: Intel(R) Core(TM) i7-3610QM CPU @ 2.30GHz (family: 0x6, model: 0x3a, stepping: 0x9)
[ 0.131243] Performance Events: PEBS fmt1+, IvyBridge events, 16-deep LBR, full-width counters, Intel PMU driver.
[ 0.131263] ... version: 3
[ 0.131264] ... bit width: 48
[ 0.131264] ... generic registers: 4
[ 0.131265] ... value mask: 0000ffffffffffff
[ 0.131266] ... max period: 00007fffffffffff
[ 0.131266] ... fixed-purpose events: 3
[ 0.131266] ... event mask: 000000070000000f
[ 0.201364] NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
[ 0.221236] smp: Bringing up secondary CPUs ...
[ 0.291257] SCHED: No cpumask for kworker/1:0/18
[ 0.301260] SCHED: No cpumask for kworker/1:0H/19
[ 0.301289] x86: Booting SMP configuration:
[ 0.301290] .... node #0, CPUs: #1
[ 0.453414] SCHED: No cpumask for kworker/2:0/24
[ 0.453427] SCHED: No cpumask for kworker/2:0H/25
[ 0.453441] #2
[ 0.603439] SCHED: No cpumask for kworker/3:0/30
[ 0.603452] SCHED: No cpumask for kworker/3:0H/31
[ 0.603463] #3
[ 0.753442] SCHED: No cpumask for kworker/4:0/36
[ 0.753457] SCHED: No cpumask for kworker/4:0H/37
[ 0.753466] #4
[ 0.903475] SCHED: No cpumask for kworker/5:0/42
[ 0.903488] SCHED: No cpumask for kworker/5:0H/43
[ 0.903499] #5
[ 1.053497] SCHED: No cpumask for kworker/6:0/48
[ 1.053509] SCHED: No cpumask for kworker/6:0H/49
[ 1.053521] #6
[ 1.203507] SCHED: No cpumask for kworker/7:0/54
[ 1.203521] SCHED: No cpumask for kworker/7:0H/55
[ 1.203530] #7
[ 1.353452] smp: Brought up 1 node, 8 CPUs
[ 1.353454] smpboot: Total of 8 processors activated (36719.67 BogoMIPS)
[ 1.360342] MuQSS locality CPU 0 to 1: 2
[ 1.360343] MuQSS locality CPU 0 to 2: 2
[ 1.360343] MuQSS locality CPU 0 to 3: 2
[ 1.360344] MuQSS locality CPU 0 to 4: 1
[ 1.360344] MuQSS locality CPU 0 to 5: 2
[ 1.360345] MuQSS locality CPU 0 to 6: 2
[ 1.360345] MuQSS locality CPU 0 to 7: 2
[ 1.360346] MuQSS locality CPU 1 to 2: 2
[ 1.360347] MuQSS locality CPU 1 to 3: 2
[ 1.360347] MuQSS locality CPU 1 to 4: 2
[ 1.360347] MuQSS locality CPU 1 to 5: 1
[ 1.360348] MuQSS locality CPU 1 to 6: 2
[ 1.360348] MuQSS locality CPU 1 to 7: 2
[ 1.360349] MuQSS locality CPU 2 to 3: 2
[ 1.360350] MuQSS locality CPU 2 to 4: 2
[ 1.360350] MuQSS locality CPU 2 to 5: 2
[ 1.360350] MuQSS locality CPU 2 to 6: 1
[ 1.360351] MuQSS locality CPU 2 to 7: 2
[ 1.360352] MuQSS locality CPU 3 to 4: 2
[ 1.360352] MuQSS locality CPU 3 to 5: 2
[ 1.360353] MuQSS locality CPU 3 to 6: 2
[ 1.360353] MuQSS locality CPU 3 to 7: 1
[ 1.360354] MuQSS locality CPU 4 to 5: 2
[ 1.360354] MuQSS locality CPU 4 to 6: 2
[ 1.360355] MuQSS locality CPU 4 to 7: 2
[ 1.360355] MuQSS locality CPU 5 to 6: 2
[ 1.360356] MuQSS locality CPU 5 to 7: 2
[ 1.360356] MuQSS locality CPU 6 to 7: 2
[ 1.360570] devtmpfs: initialized
...
FULL dmesg: https://pastebin.com/raw/9YbMTik5
CONFIG: https://github.com/FadeMind/linux410-custom.src/blob/master/linux410/config.x86_64
Regards
FadeMind
I put them there much like the MuQSS locality messages. They're harmless and for my information.
Thanks for the quick reply.
Thanks Con! I built and ran x64 and i686-UP on Arch last night; working fine.
Yes, thanks!
No problems so far.
Too bad there isn't an nvidia 340.x driver for 4.10 as of yet.
Have a look at the end of Con's -ck1 announcement for 4.10; he provided a link to an nvidia driver patch.
Peter
Yes, I saw it, but it is for the 375.x driver series. Thanks anyway.
First hit on a Google search:
Deletehttps://devtalk.nvidia.com/default/topic/982052/linux/latest-nvidia-driver-340-101-builds-compiles-properly-but-fails-to-load-has-errors-with-linux-kernel-4-9-resolved-with-patch-/
search terms: "nvidia 340 4.10 kernel"
As with the 4.9.0 -ck, I am getting huge spikes in several CPU monitors while the system is actually idle process-wise. `top` sees my CPUs at '100% si' almost constantly, `xosview` displays 100% "SYS" spikes in a second's interval or less, and XFCE4's xfce4-systemload-plugin shows the CPU at 100% constantly. I set CONFIG_HZ=300. Any hints how to get a usable CPU load monitoring again?
I have the same problem with the 4.10-muqss branch.
Con, any hints on how to track this down, and possibly fix this?
It's an accounting error (it's not actually using extra CPU.) Unless you can hack the code and fix it, there's nothing more you can do until I find time to investigate and fix it (which alas won't be any time soon.)
Thanks for the heads-up. I unfortunately cannot fix it myself, but as long as it is on the list, I'm a happy camper. :)
DeleteGreat work once more on updating MuQSS. Personally I think it's a great scheduler. I've been getting very impressive results from it when combined with the schedutil governor and using yield_type 2, interactive 1 and rr_interval 1.
Not only is the system incredibly responsive, but performance seems to be the best as well with those settings. Mileage may vary for other people, but I could not be happier.
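For anyone wanting to try the same thing, those settings can be applied at runtime along these lines (a rough sketch; the /proc entries only exist on MuQSS/-ck kernels, need root, and the values shown are this commenter's preferences rather than defaults):
  echo 2 > /proc/sys/kernel/yield_type
  echo 1 > /proc/sys/kernel/interactive
  echo 1 > /proc/sys/kernel/rr_interval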
+1 All of the above.
It is the best, no doubt.
echo 1 > /proc/sys/kernel/rr_interval gives nice low latency for "real-time" audio work.
Thanks a lot.
Astonishing. And this doesn't hurt throughput in any way? In my earlier testing, some years and kernels ago, setting it to 1 not only affected disk I/O negatively, but also left gfx and audio not "in time" asap.
Are you using the full-feature -ck1 or the MuQSS-only patch?
BR, Manuel Krause
Yes, it hurts throughput.
But when the CPU is fast it takes some "abuse" to reach that point.
On a slow CPU it might not be that fun, since it might be choking all the time when the value is too low.
ck1.
@27 February 2017 at 05:30:
Actually, I'm running rr_interval 1, interactive 1 and yield_type 2, and whereas one might expect that to hurt throughput, from some testing (both synthetic as well as real-world) I've actually found that throughput seems to be BETTER than with, for example, rr_interval 6, interactive 0 and yield_type 1 (or 0).
I suspect this has to do with more and more applications as well as OS subsystems becoming increasingly multithreaded and the overhead of the context switching (yield type 2 and rr_interval 1) being less than the overhead of threads simply waiting for other threads to complete their tasks.
Something along those lines anyhow.
Just to give you an idea -- running a demoscene demo (synthetic metric, obviously) in WINE sees a 12% difference for me between running the highly cooperative mode (yield 2, interval 1, interactive 1) and the highly selfish mode (yield 0, interval 6+, interactive 0). In favour of the cooperative approach.
@Anon,
Please specify whether (in your tests) you use the performance / ondemand / powersave governor, and whether you actually use cpufreq or p-state.
As mentioned in different threads here and there, switching from ondemand to performance is itself a big win, at least on non-p-state-capable hardware; if you get 12% out of performance, that's neat and worth a try :)
Schedutil. Been a fan of that one since it was first implemented. Tried it with ondemand as well and even that was a performance degradation. Performance might be on-par with schedutil but I'd hardly wager it being better.
Obviously I meant that the 'Performance' governor might be on-par with the 'schedutil' governor.
If you have an Intel CPU, I would be cautious about schedutil.
I've tested it with CFS on my Intel 4770k, with both acpi-cpufreq and intel_pstate (by adding intel_pstate=passive to the kernel boot line, a new option in 4.10).
It is broken: the CPU frequency is always locked at the maximum turbo frequency (3.9GHz in my case), and the performance is bad with acpi-cpufreq (I didn't benchmark intel_pstate).
I've not tried MuQSS with schedutil.
Pedro
I use "intel_pstate=disable intel_idle.max_cstate=0 idle=poll nohalt" on Intel CPUs for maximum performance.
pstate passive + schedutil scales correctly for me, but the performance is way lower than cpufreq + schedutil.
I don't know why.
pstate + schedutil + MuQSS = almost always max standard freq for me on Skylake here. Not usable at all.
pstate or cpufreq are both OK separately.
Br, Eduardo
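A quick way to check which driver and governor are actually in effect, and whether the frequency really is pinned at maximum, is the standard cpufreq sysfs interface (a sketch; cpu0 stands in for any core):
  cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
  cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
  grep 'cpu MHz' /proc/cpuinfo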
Thanks Con for this new release.
Here are the usual throughput benchmarks on 4.10:
https://docs.google.com/spreadsheets/d/163U3H-gnVeGopMrHiJLeEY1b7XlvND2yoceKbOvQRm4/edit?usp=sharing
Pedro
Is it possible for you to re-run the latency tests (interbench)? I am wondering whether it's even worth running MuQSS, because throughput is probably better with CFS, but I am not sure about the latencies. I've been using both schedulers and I can't find any difference latency-wise; my workload consists of compiling large projects like LLVM/Chromium while programming, and I haven't noticed anything slowing down even with CFS.
Latency-wise CFS is a turtle while MuQSS is a rabbit. HTH.
But then you have the catch: MuQSS aims for low latency vs. CFS.
BR, Manuel Krause
Added the interbench results.
Pedro
I've done some latency oriented tests with runqlat and cyclictest, on MuQSS152@100Hz and CFS@300Hz.
This time I hope I get it right.
Charts are at the bottom of the sheets.
The cpu is loaded with a linux kernel build at various -j.
During the build runqlat or cyclictest are run with the following command lines :
'runqlat -m 180 1'
'cyclictest -q -S -m -D3m -H=40000'
I also ran cyclictest at +1 nice level, as suggested by some doc I read.
Overall MuQSS shows much higher average latencies under high load, but lower max latencies under runqlat.
Maybe ck or someone else can comment on these.
Is it expected? Are the tests not measuring the right thing?
Pedro
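Roughly, the measurement described above fits together like this (a sketch only, reusing the quoted command lines; the build is just there to load the CPUs, and runqlat comes from the bcc tools and needs root):
  make -j8 > /dev/null 2>&1 &
  nice -n 1 cyclictest -q -S -m -D3m -H=40000
  runqlat -m 180 1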
Again it's not testing what you think it's testing. Try changing the yield proc tunable and you'll see the results will change.
Additionally, the functions it hooks into aren't exactly the same, so the results are never going to be directly comparable.
DeleteThanks for replying.
The thing is, I try to back up the positive comments I read on MuQSS latency with figures, as I don't feel any difference between MuQSS and CFS with my workload.
I guess it's not that easy.
I'll try changing the sched_yield setting.
I had looked at cyclictest source code and didn't see any call to sched_yield, so I thought it was the right tool to compare CFS and MuQSS.
Well, I just don't understand this scheduling stuff :(
Pedro
I've done some testing with yield_type setting.
It doesn't make any real difference with this workload (kernel build).
I won't draw conclusions from that.
Pedro
Maybe http://www.brendangregg.com/blog/2017-03-16/perf-sched.html could help with further tweaking of MuQSS. HTH.
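For example, something along these lines captures scheduler behaviour for later inspection (the perf sched subcommands described in that article; exact output and available subcommands depend on your perf version):
  perf sched record -- sleep 10
  perf sched latency    # per-task scheduling/wakeup latency summary
  perf sched timehist   # per-event wait and run times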
ReplyDeleteRunning any WINE application hardlockups my system (either on execution or given a period of time). Before, my workaround was to use SCHED_ISO for pulseaudio, jackdbus, and osu!.exe with SCHED_NORMAL for wine and winserver. However, simply tuning yield_type to 0 fixes this.
Running CS:GO with yield_type 0 or 1 both showed hard stuttering when loading player or bot threads with multicore rendering enabled. Tuning yield_type to 2 removes this stutter, and I am assuming it applies to Source games. Thank god you made it tunable.
After using the system for a considerable while, or playing osu! enough to reach this error, I still come across NOHZ: local_softirq_pending 202. Usually on CFS, the warning goes away with no apparent problems on the system. On MuQSS, when the warning appears, the entire system lags. Specifically, the display is not always updated, the mouse stutters and does not poll correctly, and keyboard input is delayed and occasionally does not poll correctly. This issue goes away when I am able to set or run any program with policy SCHED_ISO (and keep it running) or restart with nohz=off at runtime.
All of this was tested on ck-ivybridge from the repo-ck repository with an Intel i5-3317U. The minimal workarounds are really stable, with the only thing worrying me being idle power consumption from disabling idle dynticks. Apart from that, the kernel is awesome to work and play with.
And I spoke too soon. Workarounds and tunables above do still significantly delay it from hardlocking the system.
The only use case I found to guarantee hardlocking on my system is using yield_type=2 and rr_interval=1, running a wine program using GL/EGL/GLES and opening 1-20 terminals at once.
I'm pretty confident it's the realtime scheduling issue mentioned before, and that all programs that use or bridge to OpenGL on wine coincidentally demand realtime scheduling. htop shows this, but schedtool says otherwise. I'm beginning to think wine is coded to shit.
In the event that the wine program becomes a zombie, wineserver -k and schedtool -I the parent/child process relevant to the zombie kills the process (??). Using schedtool -R in the same process hardlocks the system OR puts the cpu to an unworkable idle state with softirq warnings.
I've also stopped rtkit-daemon to see if it helps, but to no avail. I really don't want to make a huge list of programs to SCHED_NOT_RR in schedtoold on this incredibly responsive kernel.
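If it helps anyone reproduce this, the policies being juggled here can be pinned with schedtool along these lines (a sketch only; the process names are the ones from these reports, and -B/-I/-D map to SCHED_BATCH/SCHED_ISO/SCHED_IDLEPRIO on -ck kernels):
  schedtool -B -n 19 $(pgrep wineserver)
  schedtool -I $(pgrep -x jackdbus)
  schedtool -D -e make -j8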
I've benchmarked CFS, MuQSS and CK1 with the Phoronix Test Suite. Last time I did this, MuQSS was still in development.
http://openbenchmarking.org/result/1704053-RI-410CFSVSM76
http://openbenchmarking.org/result/1704069-RI-GAMING41095
I've also updated my google spreadsheet with various yield_type settings and CK1. It's a bit messy though.
Throughput-wise:
from the PTS results, there is no clear winner. It depends on the workload.
from the spreadsheet, I would say the best MuQSS setting is "interactive=0" and "interactive=1 & yield_type=0 or 1". CK patchset is slower.
Pedro
Looks like SCHED_BATCH for wine, wineserver, and wine-preloader is the most stable setup for me. Realtime priority programs have no problem running for extended periods of time and wine mostly spawns children at SCHED_NORMAL. Not sure what to make of it.
Apart from my longstanding issues, latency-wise:
- primusrun and nvidia-xrun with intel_cpufreq schedutil makes all Valve games I've played open several seconds faster and leaves me with unbelievably low mouse latency on an Optimus system compared to mainline and Windows.
- I/O detection for my external keyboard and mouse is really fast and never fails to register compared to the few times that happened on mainline.
- Dota 2 on CFS caps at 30 FPS after reaching a specific load from multiple unit selection (even though it can run well above this on Pause). MuQSS does not have this issue.
- TTY switching is noticeably faster.
Thanks for the positive feedback too! Good to hear of some concrete examples of advantages.
A suggestion if you're having lockups with -ck: it might be worth building the kernel without threadirqs enabled. It could be a subtle driver priority inversion bug that only shows up with threaded IRQs, and since they're off by default in mainline they wouldn't be picked up.
Oh, I did not notice this comment, along with an old mailing list post concerning wine(server) priority inversion. I did notice fewer jack2 xruns with this off, but never used it long enough to reach a conclusion. I will test this out.
Without threadirqs, it's pretty stable. I haven't seen any xruns reported from jack2 despite leaving Cadence on for a few hours with moderate workloads, compared to the occasional pops with threadirqs.
DeleteSCHED_BATCH nice 19 wineserver is still the most stable policy. It also solves freezing issues I had with Ragnarok Online on wine-staging CSMT that I had with CFS. Only lockup I've reached so far is with hibernate from low battery which is a very rare use-case for me.
It's still a mystery why I haven't found your mailing list on wineserver priority inversion sooner, but at least I reached the same conclusion.
Hi Con,
Linux-ck experiences crashes with Docker. I've posted a description here https://bbs.archlinux.org/viewtopic.php?pid=1704251 - can you have a look?
I haven't experienced any crashes with docker on my system yet with BFQ enabled and I usually leave ziahamza's webui-aria2 docker container running for days.
What are your kernel and docker versions?
DeleteDocker version 17.04.0-ce, build 4845c567eb
4.10.10-1-ck-ivybridge #1 SMP PREEMPT
I have run docker a few times on ck-generic and linux-ck from AUR with no problems (from 4.9.11+).
Have you tried using mainline with BFQ to see if it's a bug specific to the I/O scheduler?
Hi Con
I am playing with the -ck1 patch together with an Nvidia video card (Optimus, if it matters) under 4.10.10.
Unfortunately, it freezes the GUI after some time.
All processes continue to work; only the display stops refreshing.
Is it a known issue?
BR
Sheesh, why is everyone so reserved with posting details?
What driver version, what card, what other hardware components?
System details!
Nvidia works fine here (but NOT optimus - so it could be that)
Blog comments are a bad place for bug reports, I guess.
Hardware - laptop with Intel(R) Core(TM) i7-3630QM, GeForce GT 640M LE + Intel Corporation 3rd Gen Core (Optimus). Kernel is 4.10.10 vanilla (Gentoo distribution).
I tried nvidia-drivers-381.09 and 375.39 with Con's patch + xf86-video-intel-2.99.917_p20170313. Both combinations eventually freeze the laptop.
@kernelOfTruth: I also experience GUI freezes on the 4.10.x series, but not on 4.9.x. I have an Asus laptop with Optimus (UX303UB) and those freezes occur with the 375.39, 378.13 and 381.09 drivers (Gentoo Linux here).
Delete@Денис , @mbar
> Blog comments are a bad place for bug reports, I guess.
Indeed :/
by "GUI freezes" you mean that the X server locks up screen content and it doesn't change anymore,
only a forced reboot (or Magic SYSRQ Key) works ?
(all comments suggest so)
I got bash / terminal content freezing and it only gets updated when switching between apps - back and forth; kwin_x11 without compositing
but that's obviously not the same as what you are experiencing.
Did you try using a different window manager or desktop environment to see if that prevents it from happening?
Does disabling frequency switching (ondemand governor, etc.) and switching to "performance", or using intel pstate, make a difference?
This would at least help to narrow it down to specifics and allow you to continue working somehow ...
> by "GUI freezes" you mean that the X server locks up screen content and it doesn't change anymore,
only a forced reboot (or the Magic SysRq key) works?
Yes. Exactly.
> did you try using a different window manager or desktop environment if that prevents it from happening ?
No, I haven't changed anything. The regression is related to the kernel.
> Does disabling frequency switching (ondemand governor, etc.) and switching to "performance" or using intel pstate make a change ?
Again, no. I've tried nvidia-drivers-375.20 and 381.09 and that's it. After freezes I went back to 4.8-ck1.
Try BFQ v8r11. There are patches on their website for vanilla 4.10; alternatively you can update the version with patches from sirlucjan (linux-rt-bfq).
Nvidia drivers 375.66 and 381.22 have fixes to prevent deadlock issues with PRIME Sync. It might be worth a try to use those drivers if it's specific to 4.10.
Thanks.
It's a valuable comment.
For those interested, see full announcement - https://devtalk.nvidia.com/default/topic/1007268/b/t/post/5141478/#5141478
I'm hitting a BUG when trying to create a QEMU/KVM VM, any ideas? I saw a similar BUG in previous user comments regarding BFS for 4.8, where a person hit the same BUG when trying to use VirtualBox.
ReplyDeleteApr 18 20:08:35 ROG audit[5962]: AVC apparmor="STATUS" operation="profile_replace" profile="unconfined" name="libvirt-f02a7ff8-d128-4db2-
Apr 18 20:08:35 ROG kernel: audit: type=1400 audit(1492564115.688:55): apparmor="STATUS" operation="profile_replace" profile="unconfined"
Apr 18 20:08:35 ROG kernel: usercopy: kernel memory overwrite attempt detected to ffff9b05d3ece708 (kmalloc-8) (128 bytes)
Apr 18 20:08:35 ROG kernel: ------------[ cut here ]------------
Apr 18 20:08:35 ROG kernel: kernel BUG at /usr/src/linux-4.10.0/mm/usercopy.c:75!
Apr 18 20:08:35 ROG kernel: invalid opcode: 0000 [#1] SMP
Apr 18 20:08:35 ROG kernel: Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 b
Apr 18 20:08:35 ROG kernel: cryptd snd_hwdep snd_pcm intel_cstate nvidia(POE) intel_rapl_perf snd_seq_midi saa7164 snd_seq_midi_event sn
Apr 18 20:08:35 ROG kernel: multipath linear uas usb_storage hid_generic usbhid hid raid0 i915 i2c_algo_bit drm_kms_helper syscopyarea s
Apr 18 20:08:35 ROG kernel: CPU: 0 PID: 4052 Comm: libvirtd Tainted: P OE 4.10.0-19+my-generic #21
Apr 18 20:08:35 ROG kernel: Hardware name: ASUS All Series/MAXIMUS VII GENE, BIOS 3003 10/28/2015
Apr 18 20:08:35 ROG kernel: task: ffff9b05aa5c5300 task.stack: ffffaaf342f30000
Apr 18 20:08:35 ROG kernel: RIP: 0010:__check_object_size+0x77/0x1d6
Apr 18 20:08:35 ROG kernel: RSP: 0018:ffffaaf342f33ee0 EFLAGS: 00010282
Apr 18 20:08:35 ROG kernel: RAX: 000000000000005e RBX: ffff9b05d3ece708 RCX: 0000000000000000
Apr 18 20:08:35 ROG kernel: RDX: 0000000000000000 RSI: ffff9b05efa0dbc8 RDI: ffff9b05efa0dbc8
Apr 18 20:08:35 ROG kernel: RBP: ffffaaf342f33f00 R08: 0000000000000005 R09: 0000000000000551
Apr 18 20:08:35 ROG kernel: R10: 0000000000000008 R11: ffffffffa84469cd R12: 0000000000000080
Apr 18 20:08:35 ROG kernel: R13: 0000000000000000 R14: ffff9b05d3ece788 R15: ffff9b05d3ece708
Apr 18 20:08:35 ROG kernel: FS: 00007f410c5d1700(0000) GS:ffff9b05efa00000(0000) knlGS:0000000000000000
Apr 18 20:08:35 ROG kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 18 20:08:35 ROG kernel: CR2: 00007f41196eaaa0 CR3: 00000003f0da5000 CR4: 00000000001406f0
Apr 18 20:08:35 ROG kernel: Call Trace:
Apr 18 20:08:35 ROG kernel: SyS_sched_setaffinity+0x6b/0xe0
Apr 18 20:08:35 ROG kernel: entry_SYSCALL_64_fastpath+0x1e/0xad
Apr 18 20:08:35 ROG kernel: RIP: 0033:0x7f41188425dc
Apr 18 20:08:35 ROG kernel: RSP: 002b:00007f410c5d0798 EFLAGS: 00000246 ORIG_RAX: 00000000000000cb
Apr 18 20:08:35 ROG kernel: RAX: ffffffffffffffda RBX: 00007f41193e271c RCX: 00007f41188425dc
Apr 18 20:08:35 ROG kernel: RDX: 00007f40e81211e0 RSI: 0000000000000080 RDI: 000000000000174c
Apr 18 20:08:35 ROG kernel: RBP: 00007f40e83155d0 R08: 00007f40e81de0e0 R09: 0000000000000000
Apr 18 20:08:35 ROG kernel: R10: 00007f40e81211e0 R11: 0000000000000246 R12: 00007f40e83155d0
Apr 18 20:08:35 ROG kernel: R13: 00007f41196eaa90 R14: 0000000000000001 R15: 00007f410c5d1698
Apr 18 20:08:35 ROG kernel: Code: c7 c2 13 4f ed a7 48 c7 c6 d1 da e9 a7 48 c7 c7 60 a5 e9 a7 48 0f 44 d1 48 c7 c1 8a 2e e9 a7 48 0f 44 f
Apr 18 20:08:35 ROG kernel: RIP: __check_object_size+0x77/0x1d6 RSP: ffffaaf342f33ee0
Apr 18 20:08:35 ROG kernel: ---[ end trace 7f5e3e96a69c8802 ]---
I have run into this kind of issue too; it causes some of my programs, like winecfg, to fail to execute.
Hope it will be fixed soon.
Hardened usercopy related (CONFIG_HARDENED_USERCOPY_PAGESPAN)
https://patchwork.kernel.org/patch/9181869/
https://lkml.org/lkml/2017/1/15/152
could be the scheduler (MuQSS) or something else entirely ...
I think that MuQSS is incompatible with CONFIG_CPUMASK_OFFSTACK, which is implied by CONFIG_MAXSMP ("Configure maximum number of SMP processors and NUMA Nodes"). sched/core.c's get_user_cpu_mask bounds the copy length to cpumask_size(), which is a runtime value when CONFIG_CPUMASK_OFFSTACK is set, but MuQSS's version bounds it to sizeof(cpumask_t), which will, in this case, probably be larger than the actual target buffer. Refer to Linux commit 96f874e26428a (from 2008). I think MuQSS needs to either handle this case, or require !CONFIG_CPUMASK_OFFSTACK.
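If that diagnosis is right, a quick way to see whether your own config is exposed, plus a possible stop-gap until MuQSS handles the case, is something like this (assumes /proc/config.gz support; otherwise grep the .config you built with):
  zgrep -E 'CONFIG_MAXSMP=|CONFIG_CPUMASK_OFFSTACK=' /proc/config.gz
  # stop-gap: rebuild with them off (MAXSMP is what selects CPUMASK_OFFSTACK)
  scripts/config --disable MAXSMP --disable CPUMASK_OFFSTACK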
That's very helpful information. Thank you very much.
DeleteHey,
I've had this issue (more like an annoyance) since I've been using linux 4.10 with MuQSS.
I am usually running 4 VMs (Windows and Linux) and every VM produces this kind of kernel warning:
[ +0.000029] WARNING: CPU: 2 PID: 2655 at arch/x86/kvm/lapic.c:1468 kvm_lapic_expired_hv_timer+0xee/0x110 [kvm]
[ +0.000002] Modules linked in: vhost_net vhost macvtap macvlan fuse ctr ccm xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat libcrc32c crc32c_generic nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic arc4 nls_iso8859_1 nls_cp437 vfat fat intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp iwlmvm kvm_intel nouveau kvm mac80211 iTCO_wdt iTCO_vendor_support eeepc_wmi asus_wmi irqbypass sparse_keymap snd_hda_intel crct10dif_pclmul crc32_pclmul evdev crc32c_intel snd_hda_codec joydev input_leds mousedev ghash_clmulni_intel led_class snd_hwdep mac_hid aesni_intel
[ +0.000062] snd_hda_core iwlwifi mxm_wmi aes_x86_64 crypto_simd snd_pcm ttm cryptd glue_helper snd_timer e1000e i2c_algo_bit snd cfg80211 intel_cstate psmouse intel_rapl_perf soundcore hci_uart ptp btbcm pcspkr i2c_i801 pps_core btqca btintel bluetooth mei_me mei battery shpchp rfkill acpi_als video tpm_tis kfifo_buf intel_lpss_acpi wmi tpm_tis_core i2c_hid tpm industrialio intel_lpss fjes acpi_pad button sch_fq_codel sg ip_tables x_tables ext4 crc16 jbd2 fscrypto mbcache hid_generic usbhid hid sd_mod serio_raw atkbd libps2 ahci libahci libata xhci_pci xhci_hcd scsi_mod usbcore usb_common i8042 serio nvidia_drm(PO) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm nvidia_uvm(PO) nvidia_modeset(PO) nvidia(PO)
[ +0.000079] CPU: 2 PID: 2655 Comm: CPU 0/KVM Tainted: P W O 4.10.10-1-ck #1
[ +0.000002] Hardware name: System manufacturer System Product Name/Z170-A, BIOS 3401 01/25/2017
[ +0.000001] Call Trace:
[ +0.000008] dump_stack+0x76/0xa0
[ +0.000005] __warn+0xda/0x100
[ +0.000005] warn_slowpath_null+0x30/0x40
[ +0.000019] kvm_lapic_expired_hv_timer+0xee/0x110 [kvm]
[ +0.000006] handle_preemption_timer+0x21/0x30 [kvm_intel]
[ +0.000006] vmx_handle_exit+0x169/0x1480 [kvm_intel]
[ +0.000005] ? clear_atomic_switch_msr+0x15a/0x180 [kvm_intel]
[ +0.000005] ? atomic_switch_perf_msrs+0x7e/0xb0 [kvm_intel]
[ +0.000022] kvm_arch_vcpu_ioctl_run+0x880/0x1690 [kvm]
[ +0.000005] ? _copy_to_user+0x67/0x80
[ +0.000013] kvm_vcpu_ioctl+0x348/0x640 [kvm]
[ +0.000004] do_vfs_ioctl+0xb2/0x600
[ +0.000005] ? __fget+0x8a/0xc0
[ +0.000002] SyS_ioctl+0x88/0xa0
[ +0.000006] entry_SYSCALL_64_fastpath+0x1a/0xa9
[ +0.000002] RIP: 0033:0x7f45c136e0d7
[ +0.000002] RSP: 002b:00007f45b2efb8a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ +0.000004] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f45c136e0d7
[ +0.000002] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000012
[ +0.000001] RBP: 00007f45b374f980 R08: 0000563b84228b90 R09: 00000000000000ff
[ +0.000002] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
[ +0.000002] R13: 00007f45c85d9000 R14: 0000000000000000 R15: 00007f45b374f980
[ +0.000003] ---[ end trace f36b1d45e660e691 ]---
I've also reported this to the linux kernel bugzilla but so far nothing.
Strangely I am not getting these errors with a vanilla kernel, so I assume that this is something with the MuQSS scheduler.
I remember you made some changes to the high-resolution timer code and changed some stuff all over the kernel; could this have affected this as well?
Peet
information from kdb
-----summary-----
sysname Linux
release 4.10.11-ck1-otakux-2
version #6 SMP PREEMPT Wed Apr 19 20:38:49 CST 2017
machine x86_64
nodename otakux-VirtualBox
domainname (none)
ccversion CCVERSION
date 2017-04-20 09:15:33 tz_minuteswest 0
uptime 00:04
load avg 1.14 0.78 0.32
MemTotal: 2045916 kB
MemFree: 968412 kB
Buffers: 28260 kB
-----panic-----
usercopy: kernel memory overwrite attempt detected to ffff8873f7093bb0 (kmalloc-8) (128 bytes)
Entering kdb (current=0xffff8873f87f0000, pid 1684) on processor 1 Oops: (null)
due to oops @ 0xffffffff9222fafe
CPU: 1 PID: 1684 Comm: wineserver Tainted: G W 4.10.11-ck1-otakux-2 #6
Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
task: ffff8873f87f0000 task.stack: ffffa712c149c000
RIP: 0010:__check_object_size+0x6e/0x1e3
RSP: 0018:ffffa712c149fee0 EFLAGS: 00010282
RAX: 000000000000005e RBX: ffff8873f7093bb0 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffff8873ffd0dbc8 RDI: ffff8873ffd0dbc8
RBP: ffffa712c149ff00 R08: 0000000000000001 R09: 00000000000001fa
R10: 0000000000000008 R11: ffffffff9323f98d R12: 0000000000000080
R13: 0000000000000000 R14: ffff8873f7093c30 R15: ffff8873f7093bb0
FS: 00007f424f661700(0000) GS:ffff8873ffd00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe923f0e640 CR3: 0000000077ad9000 CR4: 00000000000406e0
Call Trace:
SyS_sched_setaffinity+0x6b/0x100
entry_SYSCALL_64_fastpath+0x1e/0xad
RIP: 0033:0x7f424edd5d7c
RSP: 002b:00007fffb2ce8d78 EFLAGS: 00000206 ORIG_RAX: 00000000000000cb
RAX: ffffffffffffffda RBX: 0000000000001b4a RCX: 00007f424edd5d7c
RDX: 00007fffb2ce8d80 RSI: 0000000000000080 RDI: 0000000000000696
RBP: 00000000006c7980 R08: 0000000000000696 R09: 00007fffb2ce8f30
R10: 00000000000002f9 R11: 0000000000000206 R12: 00000000006c94e0
R13: 00007fffb2ce8e70 R14: 00000000006c4cd0 R15: 0000000000000148
Code: 48 0f 44 d1 48 c7 c6 26 f4 c9 92 48 c7 c1 07 9d ca 92 48 0f 45 f1 4d 89 e1 49 89 c0 48 89 d9 48 c7 c7 90 63 ca 92 e8 e8 72 f6 ff <0f> 0b 48
It seems that the MuQSS scheduler is causing animation lags with gnome 3.24.1. Can someone reproduce this? (By moving windows or by hovering over the left navigation bar in nautilus.)
Have you already tried comparing your same setup with Alfred Chen's VRQ patch applied instead of MuQSS? I don't want to advertise it, but it may be worth a try. For my system, Alfred's patch results in much more responsiveness overall, without negative effects. No gaming tested on my machine.
Deletehttp://cchalpha.blogspot.de/2017/04/vrq-095-release.html
BR, Manuel Krause
I might give this a try. I am wondering though, if you have tested any work intense stuff like compiling large projects like llvm/clang or chromium while having multiple virtual machines running?
DeleteNo, I don't have these kinds of workload on here, having no need for this. Though, kernel compilation, severe swapping and additional I/O are usual on here. BTW, I also use the most recent BFQ I/O scheduler.
DeleteLet us know after your VRQ trial.
BR, Manuel Krause
@Con:
The BFQ I/O scheduler has recently been updated to a stable release, v8r10. Maybe it's time to pick it up into your -ck patchset.
BR, Manuel Krause
Every boot using reiserfs gives the following WARN:
[ 7.106973] ------------[ cut here ]------------
[ 7.107654] WARNING: CPU: 0 PID: 30 at fs/quota/dquot.c:619 dquot_writeback_dquots+0x248/0x250
[ 7.108356] Modules linked in: nls_iso8859_1 nls_cp437 snd_hda_codec_hdmi iTCO_wdt iTCO_vendor_support acer_wmi sparse_keymap coretemp hwmon joydev intel_rapl x86_pkg_temp_thermal intel_powerclamp pcspkr snd_hda_codec_realtek psmouse snd_hda_codec_generic efi_pstore i915 ath9k ath9k_common ath9k_hw input_leds ath snd_hda_intel efivars mac80211 drm_kms_helper snd_hda_codec cfg80211 snd_hda_core atl1c led_class snd_hwdep nvidiafb snd_pcm drm vgastate fb_ddc i2c_i801 lpc_ich intel_gtt snd_timer syscopyarea sysfillrect sysimgblt mei_me fb_sys_fops mei i2c_algo_bit shpchp acpi_cpufreq tpm_tis tpm_tis_core tpm thermal wmi video button evdev mac_hid sch_fq_codel uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core vboxnetflt(O) videodev vboxnetadp(O) pci_stub media vboxpci(O) vboxdrv(O)
[ 7.112231] ath3k btusb btrtl btbcm btintel bluetooth rfkill loop usbip_host usbip_core sg ip_tables x_tables hid_generic usbhid hid sr_mod cdrom sd_mod serio_raw atkbd libps2 ehci_pci xhci_pci xhci_hcd ehci_hcd ahci libahci libata scsi_mod usbcore usb_common i8042 serio raid1 raid0 dm_mod md_mod
[ 7.114406] CPU: 0 PID: 30 Comm: kworker/0:1 Tainted: G O 4.10.8-1-ck1-ck #1
[ 7.115804] Hardware name: Acer Aspire V3-771/VA70_HC, BIOS V2.16 01/14/2013
[ 7.117213] Workqueue: events_long flush_old_commits
[ 7.118632] Call Trace:
[ 7.120042] ? dump_stack+0x5c/0x7a
[ 7.122146] ? __warn+0xb4/0xd0
[ 7.124214] ? dquot_writeback_dquots+0x248/0x250
[ 7.126422] ? reiserfs_sync_fs+0x12/0x70
[ 7.127951] ? finish_task_switch+0x7f/0x390
[ 7.129203] ? flush_old_commits+0x30/0x50
[ 7.130473] ? process_one_work+0x1b1/0x3a0
[ 7.131714] ? worker_thread+0x42/0x4c0
[ 7.132952] ? kthread+0xea/0x120
[ 7.134202] ? process_one_work+0x3a0/0x3a0
[ 7.135432] ? kthread_create_on_node+0x40/0x40
[ 7.136663] ? ret_from_fork+0x26/0x40
[ 7.137996] ---[ end trace 8c87d43bebda3f80 ]---
Could you give a hint on how to do that?
It looks like threadirqs is built unconditionally in 4.10 and I don't have threadirqs as a boot parameter.
It's actually a unique config option in -ck only:
Symbol: FORCE_IRQ_THREADING [=y]
Type : boolean
Prompt: Make IRQ threading compulsory
Location:
-> General setup
-> IRQ subsystem
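To check whether a running -ck kernel has it enabled, and to rebuild without it, something along these lines should do (a sketch; it assumes /proc/config.gz is available and you are in the kernel source tree):
  zgrep FORCE_IRQ_THREADING /proc/config.gz
  scripts/config --disable FORCE_IRQ_THREADING
  make olddefconfig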
Thanks. I've now tested with this option off.
Unfortunately, it freezes anyway. 4.8-ck1 is rock-solid though.
I think my issues with wine (which I have narrowed down to mostly wineserver) might be a priority inversion issue. Applications zombify when audio is out of sync and I'm assuming they deadlock when it doesn't render something in time.
osu! with SCHED_BATCH wineserver will lock up the system when running SCHED_IDLEPRIO make -j8, given some time. SCHED_BATCH nice 19 wineserver delays the lockup much longer under the same stress, but will still occasionally zombify it. I have also tried this with Zero Escape: The Nonary Games and reached similar results. Apart from this test case, wineserver is relatively stable with these policies under moderate stress.
What I discovered along the way was that when compiling DKMS modules, CFS would sometimes terminate it with SIGPIPE during context switch. This occurs more frequently with linux-zen and linux-rt-bfq when BFQ is enabled. I have not seen this happen once on the ck-patchset and my test kernels with MuQSS in the past 3 months.
Is the 4.11 resync in the works?
ReplyDeleteHopefully.
4.11 is a monster resync.
No doubt.
@ck Does this resync also include new features?
Wasn't planning any, no.
DeleteCon, could you try pushing MuQSS to mainline again https://lkml.org/? Maybe Linus and Scheduler maintainers changed their past views and might reconsider the pull.
I don't have the time, inclination, intestinal fortitude nor psychological disturbance required for attempting something so futile. Linus' position against multiple CPU schedulers in the kernel has been hard-line for over a decade. Additionally, a patch this size maintained in mainline requires a full-time job to respond to issues and maintain. I spend a few days every few months on this patch and it's fun; why would I want to make it torture?
Using MuQSS I get lags playing CPU-intensive wine games with CSMT (command stream), like WoW.
I get lags that are not present with CFS on the stock Arch kernel.
Using renice helps, however.
I'm using yield type 0.
The lags especially occur when the scene changes.