linux-4.9-ck1
-ck1 patches: http://ck.kolivas.org/patches/4.0/4.9/4.9-ck1/
Git tree:
https://github.com/ckolivas/linux/tree/4.9-ck
Ubuntu 16.04 LTS packages:
http://ck.kolivas.org/patches/4.0/4.9/4.9-ck1/Ubuntu16.04/
MuQSS
Download: 4.9-sched-MuQSS_150.patch
Git tree:
4.9-muqss
MuQSS 0.150 updates
Regarding MuQSS, apart from a resync to linux-4.9, which has numerous hotplug and cpufreq changes (again!), I've cleaned up the patch so that it no longer includes any Hz changes of its own, leaving the choice of Hz up to users unless they use the -ck patchset.
Additionally, I've modified sched_yield yet again. Since expected behaviour is different for the different (inappropriate) users of sched_yield out there, I've made it tunable in /proc/sys/kernel/yield_type and changed the default to what I believe should happen. From the documentation I added in Documentation/sysctl/kernel.txt:
yield_type: (MuQSS CPU scheduler only)
This determines what type of yield calls to sched_yield will perform.
0: No yield.
1: Yield only to better priority/deadline tasks. (default)
2: Expire timeslice and recalculate deadline.
Previous versions of MuQSS defaulted to type 2 above. If you find behavioural regressions with any of your workloads try switching it back to 2.
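A minimal Python sketch of reading the tunable described above and mapping it to its documented meaning. The path and the three meanings come from the documentation excerpt; the helper name `describe_yield_type` is hypothetical, and the file only exists on kernels built with MuQSS.

```python
from pathlib import Path

# Meanings copied from the yield_type documentation quoted above.
YIELD_TYPES = {
    0: "No yield.",
    1: "Yield only to better priority/deadline tasks. (default)",
    2: "Expire timeslice and recalculate deadline.",
}

def describe_yield_type(path="/proc/sys/kernel/yield_type"):
    """Return (value, meaning) for the current yield_type setting.

    Hypothetical helper: the sysctl file is only present on MuQSS kernels,
    so a missing file is reported as (None, ...) rather than raising.
    """
    p = Path(path)
    if not p.exists():
        return None, "yield_type not available (not a MuQSS kernel?)"
    value = int(p.read_text().strip())
    return value, YIELD_TYPES.get(value, "unknown value")

if __name__ == "__main__":
    print(describe_yield_type())
```

Switching back to the old behaviour remains a matter of writing 2 to the same file as root.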
4.9-ck1 updates
Apart from resyncing with the latest trees from linux-bfq and wb-buf-throttling:
- Added a new kernel configuration option to enable threaded IRQs and set it by default
- Changed Hz to default to the safe 100 value, removing 128 which caused spurious issues and had no real world advantage.
- Fixed a build for muqss disabled (why would you use -ck and do that I don't know)
- Made hrtimers not be used if we know we're in suspend, which may have caused suspend failures for drivers that did not use the correct freezable vs normal timeouts
- Enabled bfq and set it to default
- Enabled writeback throttling by default
Enjoy!
お楽しみ下さい
-ck
Wow! Thanks! You've really been on the ball lately. I'd gotten accustomed to waiting a month or more for resynced patches.
CC arch/x86/kernel/setup_percpu.o
kernel/time/timer.c: In function ‘msleep’:
kernel/time/timer.c:1914:62: error: ‘pm_freezing’ undeclared (first use in this function)
if (jiffs < 5 && hrtimer_resolution < NSEC_PER_SEC / HZ && !pm_freezing) {
^~~~~~~~~~~
kernel/time/timer.c:1914:62: note: each undeclared identifier is reported only once for each function it appears in
kernel/time/timer.c: In function ‘msleep_interruptible’:
kernel/time/timer.c:1936:62: error: ‘pm_freezing’ undeclared (first use in this function)
if (jiffs < 5 && hrtimer_resolution < NSEC_PER_SEC / HZ && !pm_freezing) {
^~~~~~~~~~~
make[2]: *** [scripts/Makefile.build:293: kernel/time/timer.o] Error 1
make[1]: *** [scripts/Makefile.build:544: kernel/time] Error 2
A fix for that is in the 4.9-ck git branch. Or you can enable suspend to ram in your config.
Applied the patch, now it's okay. Thanks.
I really want to try this kernel but at the moment the Nvidia 375.20 drivers are causing a lot of problems so I guess I will have to wait until the next release ><
Both 375.20 and 375.10 caused problems for me,
use 370.28 instead:
http://rglinuxtech.com/?p=1834 Kernel – 4.9-rc2 Breaks VMware and NVIDIA – Patch for NVIDIA..
(patch referenced)
375.26 is out.
Didn't test yet though.
Just be patient or use the open source driver!
Working fine so far in Arch on my x64 Athlon64 X2 PC and i686 UP Atom netbook, with a few upstream merges added.
Were there issues with posting yesterday? Couldn't log in with my laptop at home; working from work now though.
On a Core i5 with HyperThreading, top reports all CPU utilization as soft interrupts. On an AMD X6 system, CPU utilization is reported as user/system, as you would expect.
Did you enable nohz_full? If so that's a cosmetic issue I've not fixed (and nohz_idle is recommended instead for most users.)
I checked the config and it's set to CONFIG_NO_HZ_IDLE=y.
This is a 32-bit kernel, could that be the cause?
Almost certainly since I've never really checked its 32 bit sanity. Why on earth would anyone be making a 32 bit kernel in this day and age?
In general, because there's tons of old hardware (like my Eee 701 netbook) that is revitalized by Linux, and especially w/ -ck! :-P
But I presume you're speaking to 'why would one use 32bit on a 64bit-capable CPU?' And that is usually due to the Windows mindset to use 32bit if you have less than 4GB memory, which does not apply to Linux.
wiki.archlinux.org/index.php/Frequently_asked_questions#64-bit
And in the ARM world: www.cnx-software.com/2016/03/01/64-bit-arm-aarch64-instructions-boost-performance-by-15-to-30-compared-to-32-bit-arm-aarch32-instructions/
In my case it's because this system was an upgrade from an old P4 and I didn't want to rebuild the entire OS, so I just continued with 32-bit.
Thanks for the Ubuntu builds!!
You can easily try this out on Ubuntu using this script: https://github.com/Turbine1991/build_ubuntu_kernel_wastedcores
Obviously offtopic and wrong place to ask -- but maybe someone of you knows how to help:
What can I do against these warnings, like e.g:
WARNING: "phys_base" [sound/drivers/snd-dummy.ko] has no CRC!
Many of them occur at compilation time and I don't know if that leads to further problems. Kernel is vanilla 4.9.0 from opensuse src rpm +ck1.
Any hint or link appreciated! Thank you in advance,
BR Manuel Krause
Disable dummy sound module/driver in kernel config.
No, no. I wanted to keep the posting as short as possible and only gave ONE example. There are many many more of these warnings.
...
WARNING: "___preempt_schedule" [fs/fscache/fscache.ko] has no CRC!
...
BR Manuel Krause
https://lkml.org/lkml/2016/10/27/471
OMG... That last link took me quite some time to read up to the end of that LKML thread.
I've taken the third patch provided there and got rid of the compile warnings. As it's dated 21st November, I'm somewhat disappointed that it hasn't been taken into vanilla 4.9.0.
Thank you for your hint!
BR, Manuel Krause
Thx ck for the new yield_type configuration. I'm getting very good results in xonotic when it is set to 'No yield'. The game feels very responsive and input is very consistent. I had already set __GL_YIELD="NOTHING" previously, but it's still much better if I also set yield_type to 0. Not sure why this is the case.
duud
This means then that other code besides the GPU driver is also using sched_yield. It is arguably the most misused syscall on linux today and should not even exist any more. Setting it to zero basically makes it do nothing, which is why I added it as a feature :)
@ck:
I also want to thank you for making yield_type a tunable!
After trying my humble port of the old and unmaintained TuxOnIce to 4.9.0, which failed to resume from disk every time, and after investigating your code changes for 4.9.0 (-ck1), I tried yield_type=2 by chance -- and it works again, for many cycles now.
You've made =1 the new default for some rational reason, be it interactivity, performance or both. I've since read some of your code comments regarding yield(), so how can I debug and change possibly faulty code to make it work well with yield_type=1? The TOI code has one yield() call, for example, but there may be more sources of error in other drivers. I don't know what to search for or what to change it to, but I mainly want to do this for TOI.
If you find some time to explain at least a little bit, I would really appreciate it.
Thank you in advance and best regards,
Manuel Krause
Maybe someone wants to see the non-official TOI code for 4.9.0:
http://workupload.com/file/sVqjhDZ
* checksumming does not work, don't configure it
* with 4.9.0 + MuQSS/ck use "echo 2 > /proc/sys/kernel/yield_type"
* possible other bugs I haven't encountered, use at your own risk
BR, Manuel Krause
In kernel code if it wants a guarantee that it will schedule away it should be using schedule() instead of yield().
@ck:
Short note after a first successful test: I've taken your suggestion and replaced yield() with schedule(). It seems to work very well with schedule() and the default yield_type of 1.
Con, many many thanks to you!
BR, Manuel Krause
I did some benchmarks and it seems that yield=1 has slower overall performance, at least in gaming, on average. Yield 2 and Yield 0 performed about the same:
http://openbenchmarking.org/result/1701176-TA-CKYIELD0V56
Sadly, in the test I have above, only 1 test (OpenArena) has per frame analysis. According to that test though, yield 0 had the smallest lag spikes, with a max of 17ms per frame vs. 27ms and 28ms for yield 2 and yield 1 respectively.
Doing the test with some popular multicore CPU benchmarks, it seems that yield=1 is the same or slightly better in most cases. I wonder why games tend to perform better but CPU bound tests don't...
http://openbenchmarking.org/result/1701170-TA-CKYIELDMU34
Runs nice on core2 duo machine.
yield 0 is awesome.
I had serious mouse lag due to slow integrated intel graphics which is pretty much gone now. :)
Thank you very much.
@Anonymous:
Great thanks for sharing this information. On my comparable system (cpu&gpu) yield_type=0 solves mouse lag issues perfectly, too, without negatively affecting other subsystems. (Haven't cross checked with =2 again, the old default, though.)
Thanks again and Merry Christmas to all of you!
BR, Manuel Krause
@ck:
You've defined the yield_type as runtime configurable. Thank you for the choice!
My question: when does a new value take effect? Does it apply to all tasks already running, only to newly started tasks, or under some other condition, and if so, when?
I'm currently re-testing the yield_type=2 after one day of =0, uptime with ck1: ~8 days.
BR, Manuel Krause
It changes behaviour immediately for everything running without delay.
Thank you Con for your reply!
My main reason for asking was a weird behaviour of the sound system via headphones that confused me yesterday. After switching the yield_type back and forth several times, the stereo sound suddenly started shifting from left to right and back to normal, without any pattern but continuously over time, and I wondered whether the system may get confused by switching yield_type too often and whether you could imagine such a case.
Atm. I'm considering just a simple cable issue and am sorry for bothering you with this, but your info above is valuable anyways.
BR, Manuel Krause
@ck:
I don't believe in speeches for copper cable healing or such, but the issue went away all of a sudden for two days. And came back (same unchanged cable and system setup).
The only way to solve the stereo audio waving around (headphones) was to pin the pulseaudio process to the first of my two cpus via schedtool.
Sidenote: I've set the HZ value to 512 atm.
BR, Manuel Krause
BFQv8r6 for Linux 4.9 is out. After reverting patch 0017 from ck1 and applying the new BFQ manually, I noticed that wbt.h was deleted by the revert. I think wbt shouldn't be in patch 0017 together with BFQ, or wasn't meant to be there in the first place.
Merry Christmas and best regards,
Peter
I've noticed this too. Unfortunately v8r6 is broken without hierarchical scheduling enabled atm.. :-(
BR, Manuel Krause
Some hours ago Paolo from BFQ submitted a patch for the non-hierarchical compilation issue at
https://github.com/linusw/linux-bfq/commits/bfq-v8-v4.9
and named the version v8r7.
@ck: Please don't forget to move the wbt.h hunk from 0017 patch to 0016 when baking a new -ck patchset.
BR, Manuel Krause
Merry Christmas and thanks for all your work.
@ck In a comment probably above this one, you said that the kernel shouldn't even include sched_yield() anymore because it's mostly not used correctly. This makes me curious: if there were no sched_yield() in userspace, wouldn't it be quite inefficient (a waste of CPU cycles) for a hybrid mutex implemented in userspace to spin when the lock is only held for a short time or is uncontended? After spinning for a constant time while atomically checking for a state change, it switches its locking strategy to a futex-based mutex.
It's not that sched_yield is used incorrectly at all, it's that no such ill-defined function should exist in the first place. It's an ancient concept that was implemented in unix that "I want to yield a little", but never really defined beyond that. The problem with sched_yield is that there is absolutely no definition of the semantics of what it is meant to do. There is no definition of exactly what is to be yielded, to what else, how to yield, for how long to yield, and why. One does not need to invent a userspace equivalent for a function that has absolutely no defined semantics in the first place. Userspace should be sleeping for defined durations, reasons and defined wake conditions. There are plenty of syscalls that do exactly what is asked and expected of them that should be used instead.
Yeah, I agree, that sounds reasonable to me. Thanks for explaining.
My next question really has nothing to do with MuQSS, but since I am not that familiar with Linux's internals myself, I'd like to ask for your personal opinion on my use case. As you may have noticed, I am implementing a more efficient mutex/lock for a performance sensitive application and I am running MuQSS as my main kernel with yield_type set to 0, to improve some of my applications.
Now, my lock implementation uses a hybrid approach where it first uses a spinning lock and then a futex-based lock. The thing is that I have a loop with an iteration count of 30 which does atomic operations on the lock variable, which has some bits stored indicating whether the mutex is locked or not. If it's not, then it basically calls sched_yield and then repeats the loop.
So the question is now, do I simply remove the sched_yield since it's basically a non-operation or can you suggest me something else which might be more efficient?
The real question is: what are you waiting on happening during the sched_yield you were calling? If setting yield to zero improved the behaviour then you're not really doing anything at all or waiting on anything at all during that yield call, you're just spinning. If you have a defined wake condition, use a callback from that wake condition instead.
I haven't measured the actual difference in performance yet as I have noticed a possible performance increase in other applications so I decided to leave it set to zero (Some people here I guess have found the same behavior though).
When calling sched_yield I was expecting a context switch to other threads so that eventually the other thread which is holding the mutex (Assuming it's micro-contention, so a spinning lock will be sufficient otherwise it will take the futex-based path) will unlock the mutex so that when it context-switches back to our spinning thread it will eventually lock the mutex.
So I am more like waiting for "one context-switch" which is somehow contradicting your question because the time to wait is basically unknown.
But setting it to zero means there is no context switch... And on SMP machines one context switch on one CPU means precisely zero for a thread on another CPU releasing a lock.
Well, that's kinda my problem, sched_yield has no effect so I was trying to ask you what other approach I could try. Also I think I haven't explained the "one context-switch" correctly, since one context-switch doesn't make much sense, but actually this is looped 30 times. So I guess there is a chance that the other thread will eventually continue (or not, which would mean I don't understand SMP correctly).
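The point of this exchange, that spinning should be bounded and that the fallback should be a real blocking wait with a defined wake condition rather than a yield, can be sketched in Python. This is only an illustration built on `threading.Lock`, not the commenter's futex-based implementation; the function name `spin_then_block` is hypothetical and the default of 30 attempts simply reuses the loop count mentioned above.

```python
import threading

def spin_then_block(lock, spins=30):
    """Acquire `lock`, trying a bounded number of non-blocking attempts first.

    Returns "spin" if one of the non-blocking attempts succeeded and
    "block" if the caller had to sleep until the holder released the lock.
    No yield call is involved: the fallback is a blocking acquire, which
    sleeps until the lock is actually released.
    """
    for _ in range(spins):
        if lock.acquire(blocking=False):
            return "spin"
    lock.acquire()  # sleeps until the holder calls release()
    return "block"

if __name__ == "__main__":
    lock = threading.Lock()
    print(spin_then_block(lock))  # uncontended: taken on the first attempt
    lock.release()
```

The wake condition here is the release of the lock itself, so the waiter does not depend on how any particular scheduler treats a yield.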
New cyclictest record minimum 756ns (avg 1120 ns) on a 2.66 GHz quad core Xeon W3520 using yield_type 0.
Have been hanging around 980/1350 for quite a while and couldn't really improve on it.
Thank you.
As the kernel gets more and more "bloated" and slower almost every new release is there any chance to port this to older kernels like 3.12... ?
http://www.phoronix.com/scan.php?page=article&item=linux-39-49&num=1
you were saying ?
if you mean the code size, no one forces you to select everything and compile everything into the resulting kernel ...
http://www.phoronix.com/scan.php?page=article&item=linux-45-rc1&num=1
http://www.phoronix.com/scan.php?page=article&item=linux-44-19way&num=1
It is very noticeable on old machines.
Apart from that I use the same stripped config (disabled "everything", so the kernel will still run but not more) and only use video/network/audio drivers needed for the particular machine.
you were saying? :)
Not sure what you're referring here (meaning: I can't see the significant lower performance for more recent kernels) or I need more coffee ;) ,
thanks for those links,
the only kernels that were really faster in certain cornercase benchmarks were 3.5 and 3.6 and those aren't even supported anymore [longer bars do not equal better performance]
Comparing FPS of games doesn't really count since it's graphics subsystem regressions or other stuff
What "old" hardware are you using ?
Keep in mind that newer kernels include lots of stability and security improvements,
so that partially might slow things down in the short- to mid-term
Regards
FPS in games is a hint regarding responsiveness.
But you can try it yourself.
Just download a 3.12... kernel and compare with newer like 4.... releases ("same" config).
There is a huge difference in responsiveness (from my experience).
But thanks to Con the difference is not that huge anymore. :)
In fact I gave up on new kernels altogether, but since the last 2-3 muqss/ck releases new kernels are fun again.
It's a 2009-ish Xeon W3520 quadcore 2.66 GHz 6GB 1066 RAM (Lenovo Thinkstation S20).
A happy new year to everyone.
I got a freeze trying to use https://github.com/ggreer/the_silver_searcher.
I was in a call in Discord and playing an OpenGL game (minecraft) at the time.
Do any of those use real time scheduling (specifically the call in Discord might for audio)?
I am unfortunately not sure... Discord uses Pulseaudio, I can say that, and "realtime-scheduling" in the config is at its default. Whether that means it's enabled or not I don't know.
Hi, I forgot to post the updated benchmarks of MuQSS150 I ran some time ago. They are here as usual :
https://docs.google.com/spreadsheets/d/163U3H-gnVeGopMrHiJLeEY1b7XlvND2yoceKbOvQRm4/edit?usp=sharing
I've put some colors to make the results more readable (hopefully).
The reference kernel is the one on the first column. Following the value of the realtime difference between tested kernel and reference kernel, the colors are :
- blue if difference is within 'realtime of reference kernel +/- maximum standard deviation'
- green if difference is lower than 'realtime of reference kernel - maximum standard deviation'
- red if difference is higher than 'realtime of reference kernel + maximum standard deviation'
Overall best and worst are also shown ,if not in between +/- std dev.
I know a standard deviation computed of 3 runs is not very significant, but it's all I've got.
Pedro
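The colour rules described above can be written down as a small Python function. This is a sketch of the stated rule only, not the actual spreadsheet formula; the name `classify` and its parameters are hypothetical, and `max_std` stands for whatever "maximum standard deviation" the spreadsheet uses.

```python
def classify(realtime, ref_realtime, max_std):
    """Colour a result by comparing it with the reference band.

    Lower realtime is better, so "green" marks a result faster than the
    reference by more than max_std and "red" one slower by more than max_std.
    Anything inside the band ref_realtime +/- max_std is "blue".
    """
    if realtime < ref_realtime - max_std:
        return "green"
    if realtime > ref_realtime + max_std:
        return "red"
    return "blue"

if __name__ == "__main__":
    # Made-up numbers purely for illustration, not taken from the spreadsheet.
    for value in (98.0, 100.5, 102.0):
        print(value, classify(value, ref_realtime=100.0, max_std=1.5))
```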
Hey,
The benchmark seems quite underwhelming as it probably has the same throughput as CFS. Also, if I understood the interbench benchmark correctly, it doesn't seem like CFS (300Hz) is that bad compared to MuQSS (100Hz). It even looks like CFS has a better max latency than MuQSS, which makes me wonder: shouldn't MuQSS improve responsiveness (lower latency)?
In interbench results the order of importance is deadlines met followed by desired cpu followed by max latency.
In terms of throughput remember that muqss is still primarily designed with responsiveness and interactivity in mind. The difference between bfs and muqss is that muqss will scale to any number of CPUs without breaking down in its throughput performance which is currently not of great importance until 16 or more CPUs. That number is not common at the moment but will become more so since the only direction manufacturers have to scale now is outwards instead of up in speed - that phones may have up to 8 cores now is evidence of that.
As I see it, this is a throughput benchmark only.
You can't have high throughput and low latency both at the same time since they are mutually exclusive.
In my experience MUQSS is in a different league compared to CFS latency-wise.
And then keeping about the same throughput rather speaks for MUQSS I would say. :)
Besides, CFS gives variable results with interbench. I've seen this on linux 4.8 when running interbench 4 times for each kernel.
MuQSS results are more consistent under interbench.
However, I must add that under real world usage I don't feel the difference between CFS and MuQSS latency-wise. I guess it depends on your workload. Mine is not CPU intensive.
Pedro
@Pedro Is it just me or did you leave the last "% Deadlines Met" ("Benchmarking simulated cpu of Gaming") column empty ?
@ck I am wondering if you are still working on improving the latency-aspect of MuQSS. Can we expect further improvements in future MuQSS releases?
Interbench doesn't return deadlines met with the gaming load. It's by design. You can see it in the manual.
Pedro
There are no planned changes to MuQSS in the latency area now unless bugs show up.
Less latency is better, always. ;)
When I play Counter Strike: Go I experience random pauses. I tried yield_type 0 and 2, in addition I am using schedtool -I -e. This kind of a behavior is not reproducible with cfq. On the other hand with yield_type = 0 the game does not suffer from jitters like cfq.
The pauses are almost certainly hitting the threshold in CPU for realtime for isochronous scheduling - it is not designed for fully cpu bound applications like games, but for things like video and audio. Also, you mean cfs, not cfq, but everyone makes the same mistake since the names are so similar (just like bfs and bfq but no problem with muqss now.) In summary, don't run it with schedtool -I and I'm pretty sure your pauses will go away.
Works fine now, but there is a jitter now like in cfs, it's very small and rare, but still. With SCHED_ISO only pauses.
I took it from man of schedtool:
SCHED_ISO was designed to give users a SCHED_RR-similar class. To quote Con Kolivas: "This is a non-expiring scheduler policy designed to guarantee a timeslice within a reasonable latency while preventing starvation. Good for gaming, video at the limits of hardware, video capture etc."
Should I use SCHED_BATCH instead?
The kernel parameter skew_tick=1 offsets the timer interrupt on each cpu. Does MuQSS rely on timer interrupts having no offset?
duud
It shouldn't affect it, no.
Sorry for hijacking this thread, but maybe someone knows the answer here.
I'm using CONFIG_HZ_PERIODIC=y
cat /proc/interrupts yields:
LOC: 1147176 1143230 1136259 1135356
Why is there such a huge difference (about 11%) in timer interrupts between the cpus. Are the timers so inaccurate?
The use of hrtimers causes this; also, irqbalance doesn't work that well anymore.
It's somewhat strange; the name hrtimers seems to imply increased accuracy.
I'm not sure how irqbalance is related to this.
Does this mean that each cpu is effectively running on different timer interrupt frequencies?
Local timer interrupts have no fixed frequency.
Sorry for the mistake, the difference is about 1.1% but it seems to increase with time, it's about 2.2% now.
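For reference, the corrected figure can be checked against the LOC counts quoted earlier with a few lines of Python. The helper name `relative_spread` is hypothetical, and measuring the spread as (max - min) / min is an assumption about how the percentage was computed.

```python
def relative_spread(counts):
    """Return (max - min) / min for per-CPU interrupt counts, as a percentage."""
    lo, hi = min(counts), max(counts)
    return (hi - lo) / lo * 100.0

if __name__ == "__main__":
    loc = [1147176, 1143230, 1136259, 1135356]  # LOC line quoted above
    print(f"{relative_spread(loc):.2f}%")  # on the order of 1%, not 11%
```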
Can you please add some info about what impact this has on your system?
Thanks, Manuel Krause
I don't know if the difference in timer interrupt count is of any importance, but I have issues with input latency in games. The behavior varies, but most of the time input is very responsive after rebooting and gets very laggy after some time.
So after some time the situation with timer interrupt counts changed completely.
Now CPU0 has a much lower count compared to CPU1, it was the other way around and the relative difference is about 9%.
Does somebody have more information about this behavior? Am I missing some timer interrupts? Maybe because of regions with disabled interrupts? Do you see similar behavior on your PCs? I can't find any information about this.
This is not related to interrupts.
I see, I was thinking that
LOC: 510872 352627 357530 224704
counts only the periodic timer interrupt as defined by CONFIG_HZ_100=y, but it doesn't.
How long is the time to see the system 'laggy' after starting the assumed lagging program?
Btw. can't all these Anonymous provide a name to distinguish themselves?
BR, Manuel Krause
It's pretty random.
I don't think it's related to the scheduler. I was just wondering why LOC differs so much between CPUs because I completely misinterpreted the meaning of local timer interrupts.
duud
@duud:
Thx for the added info. I was asking because I had seen increasing input lag with earlier BFQ releases, e.g. even in Firefox, the longer it was up and running.
Luckily, in my case, this doesn't happen any more, or is no longer noticeable, since Con's MuQSS+ck rework. But honestly, I don't know whether that is down to timer, scheduler or mainline related progress.
BR, Manuel Krause
-BFQ / +BFS
Sorry, Manuel Krause
duud, were you able to track down the cause of lag by now?
Is it general system lag or just mouse lag?
I use those in my startup script for the mouse:
echo 1 > /sys/module/usbhid/parameters/mousepoll
echo N > /sys/module/drm_kms_helper/parameters/poll
Mmmmh... And those two lines effectively help against mouse pointer lags?
Unfortunately the first line only affects "usbhid", whereas I'd need something for a PS/2 trackman via adapter on the serial port. I remember fiddling around some years ago with serial polling in xorg.conf and with setserial. Back then the lag was due to immature scheduling in BFS, and my fiddling was without success -- only improved BFS releases made it fade away.
What effects does the second line have?
BR, Manuel Krause
First line sets USB polling to 1000Hz, whereas default is like 125Hz. Should help with pointer precision. On old slow machines a value of 2 (500Hz) might be more appropriate.
Line 2 disables polling of the Direct Rendering Manager Kernel Mode Setting driver which is a known source of mouse lag.
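The relation between the `mousepoll` value and the rates mentioned here is just an interval-to-frequency conversion, since the parameter is a polling interval in milliseconds. A small Python sketch, with the hypothetical helper name `usb_poll_rate_hz`:

```python
def usb_poll_rate_hz(interval_ms):
    """Convert a usbhid mousepoll interval (milliseconds) to a rate in Hz."""
    if interval_ms <= 0:
        raise ValueError("interval must be a positive number of milliseconds")
    return 1000 / interval_ms

if __name__ == "__main__":
    # 1 ms and 2 ms are the values suggested above; 8 ms matches the
    # roughly 125Hz default mentioned in the same comment.
    for interval in (1, 2, 8):
        print(interval, "ms ->", usb_poll_rate_hz(interval), "Hz")
```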
Oh, thx for the quick reply! I note them for possible future use.
Btw. I'm currently testing Alfred Chen's latest VRQ, and I don't have comparable lags with it.
BR, Manuel Krause
Less lag than MUQSS?
Yes, less. Not noticeable with normal KDE desktop work and video playback and editing.
I'm no gamer, so my judgement is limited. And it may be not perfect so far but worth a try to compare.
BR, Manuel Krause
Ok, thanks.
Will give it a try.
Thx for the suggestions...
I was already using 1000Hz mouse polling.
drm_kms_helper.poll doesn't have any effect for me.
duud
Sometimes gcc will generate "broken" binaries even when only using -O2.
I would just recompile the kernel.
4.9.1 does not build with ck1:
kernel/time/timer.c: In function ‘msleep’:
kernel/time/timer.c:1914:62: error: ‘pm_freezing’ undeclared (first use in this function)
if (jiffs < 5 && hrtimer_resolution < NSEC_PER_SEC / HZ && !pm_freezing) {
^
kernel/time/timer.c:1914:62: note: each undeclared identifier is reported only once for each function it appears in
kernel/time/timer.c: In function ‘msleep_interruptible’:
kernel/time/timer.c:1936:62: error: ‘pm_freezing’ undeclared (first use in this function)
if (jiffs < 5 && hrtimer_resolution < NSEC_PER_SEC / HZ && !pm_freezing) {
^
scripts/Makefile.build:293: recipe for target 'kernel/time/timer.o' failed
make[2]: *** [kernel/time/timer.o] Error 1
scripts/Makefile.build:544: recipe for target 'kernel/time' failed
make[1]: *** [kernel/time] Error 2
Makefile:992: recipe for target 'kernel' failed
make: *** [kernel] Error 2
^^ nothing said.
Found the fix.
Better read everything next time.
With 4.9.2-ck1 I get multiple KDE Plasma hangs and dmesg spits out this:
[11551.712334] INFO: task pool:19698 blocked for more than 120 seconds.
[11551.712336] Not tainted 4.9.2-ck1 #1
[11551.712336] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[11551.712337] pool D 0 19698 2281 0x00000000
[11551.712339] ffff8800b7b16200 ffff8800b7a16200 ffff880092a2c980 ffff8800b1e7c000
[11551.712342] ffff880092a2c980 ffff8800048e3af8 ffffffff817280b1 ffff8800b7b16220
[11551.712345] 0000000000000001 00000b3b56e88e75 ffff88000f874dd8 000000010102c478
[11551.712347] Call Trace:
[11551.712350] [] ? __schedule+0x601/0xb60
[11551.712353] [] ? schedule+0x34/0xc0
[11551.712355] [] ? schedule_preempt_disabled+0xc/0x20
[11551.712357] [] ? __mutex_lock_slowpath+0xba/0x130
[11551.712360] [] ? mutex_lock+0xe/0x20
[11551.712363] [] ? cifs_reconnect_tcon+0xf6/0x220
[11551.712365] [] ? __switch_to+0x307/0x470
[11551.712368] [] ? smb_init+0x34/0x90
[11551.712370] [] ? CIFSSMBQPathInfo+0x51/0x260
[11551.712372] [] ? cifs_query_path_info+0x77/0x1a0
[11551.712374] [] ? lookup_fast+0xe0/0x2f0
[11551.712377] [] ? cifs_get_inode_info+0x2ff/0x590
[11551.712380] [] ? filename_lookup+0xde/0x160
[11551.712382] [] ? __kmalloc+0x2c/0x110
[11551.712385] [] ? build_path_from_dentry+0x154/0x2e0
[11551.712387] [] ? cifs_revalidate_dentry_attr+0xc8/0xe0
[11551.712390] [] ? cifs_getattr+0x5b/0x120
[11551.712393] [] ? vfs_fstatat+0x52/0x90
[11551.712396] [] ? SYSC_newlstat+0x1d/0x40
[11551.712399] [] ? __getnstimeofday64+0x32/0xc0
[11551.712402] [] ? do_gettimeofday+0x10/0x60
[11551.712405] [] ? SyS_gettimeofday+0x31/0x70
[11551.712408] [] ? entry_SYSCALL_64_fastpath+0x13/0x94
If it's new since 4.9.2 then there's not much I can do about it since I've fallen back to only syncing up with major releases. I'm still on 4.9.0-ck1 myself.
Installed 4.9.3-1 on my Arch system with nvidia-340xx-dkms and got the following, both with AND without FORCE_IRQ_THREADING:
Jan 14 16:02:03 steinrose kernel: WARNING: CPU: 1 PID: 1927 at fs/proc/generic.c:345 proc_register+0x116/0x12f
Jan 14 16:02:03 steinrose kernel: proc_dir_entry 'driver/nvidia' already registered
Jan 14 16:02:03 steinrose kernel: Modules linked in: nvidia(PO+) ipt_REJECT nf_reject_ipv4 xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 f71882fg xt_recent xt_conntrack adt7475 nf_conntrack ipta
Jan 14 16:02:03 steinrose kernel: CPU: 1 PID: 1927 Comm: modprobe Tainted: P O 4.9.3-1-ck #1
Jan 14 16:02:03 steinrose kernel: Hardware name: MICRO-STAR INTERNATIONAL CO.,LTD MS-7512/MS-7512, BIOS V1.0 05/21/2008
Jan 14 16:02:03 steinrose kernel: ffffc90000843af8 ffffffff81380458 ffffc90000843b48 0000000000000000
Jan 14 16:02:03 steinrose kernel: ffffc90000843b38 ffffffff8106a1c2 0000015900843b60 00000000ffffffef
Jan 14 16:02:03 steinrose kernel: ffff880236830e40 ffff880234bf6985 ffff880234f15048 ffff880234bf6900
Jan 14 16:02:03 steinrose kernel: Call Trace:
Jan 14 16:02:03 steinrose kernel: [] dump_stack+0x62/0x78
Jan 14 16:02:03 steinrose kernel: [] __warn+0xda/0xf2
Jan 14 16:02:03 steinrose kernel: [] warn_slowpath_fmt+0x6e/0x85
Jan 14 16:02:03 steinrose kernel: [] ? preempt_count_add+0xbb/0xcc
Jan 14 16:02:03 steinrose kernel: [] proc_register+0x116/0x12f
Jan 14 16:02:03 steinrose kernel: [] proc_mkdir_data+0x76/0x9a
Jan 14 16:02:03 steinrose kernel: [] proc_mkdir_mode+0x26/0x28
Jan 14 16:02:03 steinrose kernel: [] nv_register_procfs+0x4c/0x1c9 [nvidia]
Jan 14 16:02:03 steinrose kernel: [] nvidia_init_module+0x29c/0x79f [nvidia]
Jan 14 16:02:03 steinrose kernel: [] ? nv_drm_init+0x15/0x15 [nvidia]
Jan 14 16:02:03 steinrose kernel: [] nvidia_frontend_init_module+0x50/0x84c [nvidia]
Jan 14 16:02:03 steinrose kernel: [] do_one_initcall+0x5b/0x15e
Jan 14 16:02:03 steinrose kernel: [] ? vfree+0x41/0x8e
Jan 14 16:02:03 steinrose kernel: [] do_init_module+0x72/0x202
Jan 14 16:02:03 steinrose kernel: [] load_module+0x2104/0x28b3
Jan 14 16:02:03 steinrose kernel: [] ? symbol_put_addr+0x69/0x69
Jan 14 16:02:03 steinrose kernel: [] ? vfs_read+0x105/0x125
Jan 14 16:02:03 steinrose kernel: [] SyS_finit_module+0xf3/0x121
Jan 14 16:02:03 steinrose kernel: [] entry_SYSCALL_64_fastpath+0x13/0x94
Jan 14 16:02:03 steinrose kernel: ---[ end trace c68407c4b37c7644 ]---
Jan 14 16:02:03 steinrose kernel: NVRM: failed to register procfs!
Jan 14 16:02:03 steinrose kernel: NVRM: request_mem_region failed for 16M @ 0xfd000000. This can
NVRM: occur when a driver such as rivatv is loaded and claims
NVRM: ownership of the device's registers.
Jan 14 16:02:03 steinrose kernel: nvidia: probe of 0000:01:00.0 failed with error -1
Jan 14 16:02:03 steinrose kernel: Error: Driver 'nvidia' is already registered, aborting...
Jan 14 16:02:03 steinrose kernel: NVRM: DRM init failed
How can the nvidia driver already be registered when I have to load it as a kernel module? Did anything significant change in Linux 4.9 (I upgraded from 4.8.17-1-ck) concerning video drivers? I never had this before and did not change anything about the nvidia kernel module.
Thanks, Florian.
Con,
Is vm.swappiness=10 (as on help.ubuntu.com) recommended with your kernel patch?
https://help.ubuntu.com/community/SwapFaq#What_is_swappiness_and_how_do_I_change_it.3F
I don't have any specific advice regarding swappiness. It is a two edged sword and a very blunt tool (pun) for dealing with hitting swap at the same time. Lower than the default 60 does seem like a good idea, but nothing beats disabling swap entirely...
I agree, swap is crap :)
Since I need to suspend to disk (to swap) and use a /dev/shm that is backed by more swap, which also means I don't have enough RAM, I can't turn swap off.
Swap, or the way it interacts with everything else, has been a severe bottleneck for Linux for many years now. I have always wished that someone would take it on to ease the heavy slowdowns that sometimes occur when reclaims(?) are needed. Recently, after heavy /dev/shm and swap activity, I had to wait ~15 minutes before Firefox responded to input again. A complete reboot with FF reloading all 170 tabs would have been faster.
In my case I have some vm tunables that I changed years ago and that worked well until now, with the 4.9.3 kernel. The original openSUSE defaults are given in the trailing comments:
echo 4 > /proc/sys/vm/dirty_background_ratio   # default 10
echo 9 > /proc/sys/vm/dirty_ratio              # default 20
echo 70 > /proc/sys/vm/swappiness              # default 60
echo 6 > /proc/sys/vm/page-cluster             # default 3
echo 200 > /proc/sys/vm/vfs_cache_pressure     # default 100
Some of those settings are based on suggestions from other ck community members, some on my own reliability testing done many months ago.
If any of you see room for improvement, other than turning off swap ^^, your hints are highly appreciated.
BR, Manuel Krause
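For anyone who wants to compare their own machine against the values in the comment above, here is a minimal sketch in Python. It only reads the files under /proc/sys/vm and changes nothing. The helper names (`read_tunable`, `report`) are made up for this illustration, and the "default" column simply repeats the openSUSE defaults quoted by the poster.

```python
from pathlib import Path

# Values from the comment above: (value the poster sets, openSUSE default they quote).
TUNABLES = {
    "dirty_background_ratio": (4, 10),
    "dirty_ratio": (9, 20),
    "swappiness": (70, 60),
    "page-cluster": (6, 3),
    "vfs_cache_pressure": (200, 100),
}


def read_tunable(name, root="/proc/sys/vm"):
    """Return the current integer value of a vm tunable, or None if unreadable."""
    try:
        return int(Path(root, name).read_text().strip())
    except (OSError, ValueError):
        return None


def report(root="/proc/sys/vm"):
    """Build one line per tunable: current value, poster's value, quoted default."""
    lines = []
    for name, (wanted, default) in TUNABLES.items():
        current = read_tunable(name, root)
        lines.append(f"{name}: current={current} poster={wanted} default={default}")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

On a system without these files the current value is simply reported as None.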
Try setting "vm.swappiness" to zero or one.
The default 60 and your 70 make it so the kernel starts swapping when just half of your RAM is in use. It does this by preferring to keep disk cache contents in RAM rather than programs.
When you use zero, it will only start to swap if your RAM is fully in use by programs.
The suggestion to use one instead of zero comes from an old report about a program that behaved badly with the zero value but behaved normally with a one.
To use zero (or one) is a suggestion for when you really have enough RAM for everything at all times. In that case it never makes sense to use swap because there's just no way to avoid choppiness on the desktop. It will always happen when you do something with a program that had its data swapped out.
If your programs actually use more memory than you have RAM, there will always be choppiness and perhaps something like 20 might be interesting to experiment with to make the kernel not reduce cache sizes to a minimum.
The default 60 is a value that's for things like web servers. Over there, you don't care about interactivity and you would be fine with just the programs involved in serving the web stuff in RAM while other rarely used programs are thrown out into swap to have more RAM for larger disk caches.
@Anonymous:
Thank you for your suggestions and explanations. Unfortunately, only setting swappiness may not be sufficient on my system. Side question: Can it be related to shared memory, integrated intel laptop graphics?
At least, setting swappiness to 1, 10 or 20 here led to severe knockouts of my system for many minutes, after which it recovered only for short periods of time. It seems I'd need to readjust the other settings as well, but I have no idea where to begin this odyssey. At the moment I'm working downwards from my former swappiness value, and 50 is currently the lowest known-good step that doesn't affect interactivity (work in progress).
Further hints are still appreciated 8-)
BR, Manuel Krause
Runs nice and fast (ck1) although I had to downgrade from 4.9.4 to 4.9.1 because of latency.
Is there any difference between 4.9.0 and 4.9.1?
Good question.
I was testing just before.
cyclictest --smp -p95 -m -N
Average across all cores (W3565 quad, 3.2GHz):
928,5 ns for 4.9.1-ck1
844,25 ns for 4.9-ck (git).
Seems it is like with the 4.8... releases.
It's all downhill from the initial one.
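To put the two cyclictest averages quoted above side by side, here is a small Python sketch. It just parses the decimal-comma figures from the comment and computes how much higher one is than the other, which works out to roughly 10%. The function names are invented for this example, and the calculation says nothing about measurement noise between runs.

```python
def parse_ns(text):
    """Parse a latency such as '928,5' (decimal comma) into a float."""
    return float(text.replace(",", "."))


def percent_higher(newer, older):
    """How much higher the newer average is, as a percentage of the older one."""
    return (newer - older) / older * 100.0


avg_491 = parse_ns("928,5")   # 4.9.1-ck1, from the comment above
avg_git = parse_ns("844,25")  # 4.9-ck (git), from the comment above

print(f"4.9.1-ck1 average is {percent_higher(avg_491, avg_git):.1f}% higher")
```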
My osu! problems are gone. You're a wizard ck.
First time feedback ever...
Thank you very much! Without your patchset, and later BFQ, Linux always felt broken. I began using them aeons ago on a P3@933MHz which I bought refurbished, and I am using them now on an old ThinkPad T60p with a Core Duo T2600 and 3GB RAM. Yes! 32 bits! Why? No money! Anyway, right now everything runs very smoothly on 4.9.4, which Greysky kindly supplies via his repo for Arch Linux. That couldn't be said for the whole of 4.8, which forced me to grudgingly fall back to default upstream and to experiment with ZEN, which was less buggy but not flawless. But the pain is gone now.
Very good job! :-)
get this: http://www.ebay.com/itm/Original-Intel-Core-2-Duo-T7500-2-2-GHz-LF80537GG0494M-Processor-CPU-/282060717222?hash=item41ac20fca6:g:xesAAOSwLnBXVksg
$2.20 and go 64 Bit :)
Makes no sense while being limited to 3GB by the chipset.
Hi, I have been reading your blog for many months and I have to say that you do a nice job! I have a question. I have been doing hacks for many years, so I have forgotten some basics. I remember how to make a phishing URL. I want to ask where I have to upload a phishing URL (with the purpose of stealing someone's password). I'm not a native English speaker. Please answer me.
^ You might want to ask there instead: https://www.justice.gov/criminal
I am noticing heavy stuttering with graphics (games: Tomb Raider in-game benchmarking option and Chromium + Imgur scrolling) on Ubuntu 16.04 + nVidia GFX 950m + 375.20 driver + 4.9-ck1 kernel. With stock Ubuntu kernels it behaves normally.
Don't use the bfq scheduler, which is enabled by default, for games. Try cfq or deadline.
After downgrading from 4.9.4-ck1 to 4.9.0-ck1 (git) because of latency I downgraded to 4.8-ck (git).
Feels much faster than the 4.9... bunch.
It seems the kernel gets more and more bloated and slower every release.
Like a previous poster, I also got a process hard freeze when using the silver searcher while compiling the Linux kernel in the background. It was at a point where 'killall -9 ag' would not kill the process.
Hi, I get high CPU usage and an unresponsive machine with any program on 4.9-ck, using yield_type=1 or 2. This is on a Haswell laptop. Had to downgrade to 4.8-ck.
Journal at the moment of the freeze: http://pastebin.com/3s6VvmHZ
Had to hard reset the laptop. Any ideas why?
Try downgrading to 4.9.0-ck1. People have reported all sorts of weird issues with newer 4.9.x kernels.
4.9.6 fixed the problems. Thanks for your time and your work on these patches.
Great. I'm glad to see the problem came from me doing nothing and went away by me also doing nothing.
Hi, ck.
Some time after the first MuQSS release I have tried your patch again, but I still have problems.
Wine is not usable since no application can be executed due to the error:
"kernel: usercopy: kernel memory overwrite attempt detected"
With an Atom Z520 I still get intermittent boot panics. When boot goes well, everything runs smoothly for many days.
Any suggestions?
Many thanks.
No idea, sorry.
I just enabled UBSAN in the kernel and it found an integer overflow in MuQSS, apparently in its iso ticks calculation.
================================================================================
UBSAN: Undefined behaviour in kernel/sched/MuQSS.c:3230:33
signed integer overflow:
4204941 * 522 cannot be represented in type 'int [40]'
CPU: 0 PID: 0 Comm: MuQSS/0 Tainted: P O 4.9.6-ck1 #1
Hardware name: System manufacturer System Product Name/M2N-SLI, BIOS ASUS M2N SLI ACPI BIOS Revision 0903 06/18/2008
0000000000000000 ffffffffa0a32ba1 000000000000002a dd38ca36cb270427
ffff9dcb77c03e18 000000000000020a ffffffffa0a9c5f9 ffffffffa14b9c00
ffffffffa0a9d0e9 0000002aa14c6120 0000000000000002 0031343934303234
Call Trace:
[] ? dump_stack+0x5a/0x99
[] ? ubsan_epilogue+0x9/0x40
[] ? handle_overflow+0xf9/0x120
[] ? sched_clock_local+0x1b/0xa0
[] ? scheduler_tick+0x857/0xa70
[] ? rcu_check_callbacks+0x17a/0x5a0
[] ? tick_sched_handle+0xa0/0xa0
[] ? update_process_times+0x46/0x60
[] ? tick_sched_timer+0x3d/0x90
[] ? __hrtimer_run_queues+0x10c/0x470
[] ? hrtimer_interrupt+0xd7/0x260
[] ? smp_apic_timer_interrupt+0x45/0x70
[] ? apic_timer_interrupt+0x7c/0x90
[] ? default_idle+0x15/0x1b0
[] ? amd_e400_idle+0x37/0x140
[] ? cpu_startup_entry+0x205/0x2d0
[] ? start_kernel+0x459/0x479
================================================================================
Thanks very much for that, good find! I will attend to it in git soon.
Another one, probably unrelated to the first one. Unlike the other one, which always occurs within five minutes of booting, this one took over 15 hours to occur.
I like the irony of the comment in line 4285.
================================================================================
UBSAN: Undefined behaviour in kernel/sched/MuQSS.c:4287:16
signed integer overflow:
-58002454 * 40 cannot be represented in type 'int [40]'
CPU: 0 PID: 9026 Comm: pidof Tainted: P O 4.9.6-ck1 #2
Hardware name: System manufacturer System Product Name/M2N-SLI, BIOS ASUS M2N SLI ACPI BIOS Revision 0903 06/18/2008
0000000000000000 ffffffffa1632ba1 000000000000002a 00000000a0295bf5
ffffb88e0188fc40 0000000000000028 ffffffffa169c5f9 ffffffffa20b8ac8
ffffffffa169d0e9 0000002a38f5b5c0 0000000000000202 353432303038352d
Call Trace:
[] ? dump_stack+0x5a/0x99
[] ? ubsan_epilogue+0x9/0x40
[] ? handle_overflow+0xf9/0x120
[] ? cputime_adjust+0x50/0x200
[] ? task_prio+0x1ac/0x270
[] ? do_task_stat+0x3ad/0xce0
[] ? proc_single_show+0x75/0x100
[] ? seq_read+0xbf/0x5f0
[] ? vfs_read+0xcb/0x220
[] ? SyS_read+0x5f/0xd0
[] ? do_syscall_64+0x62/0x140
[] ? entry_SYSCALL64_slow_path+0x25/0x25
================================================================================
Thanks again. Will investigate when time permits.
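The arithmetic in the two UBSAN reports above is easy to check by hand. The following Python sketch shows that both products fall outside the range of a 32-bit signed int. The `wrapped_int32` helper is only an illustration of two's-complement wraparound and is an assumption about what a typical build would end up storing; signed overflow is formally undefined in C, which is why UBSAN flags it.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def fits_int32(value):
    """True if value is representable as a 32-bit signed integer."""
    return INT_MIN <= value <= INT_MAX


def wrapped_int32(value):
    """Illustrative two's-complement wraparound of value to 32 bits."""
    return (value + 2**31) % 2**32 - 2**31


# Operands copied from the two UBSAN reports above.
for a, b in [(4204941, 522), (-58002454, 40)]:
    product = a * b
    status = "fits" if fits_int32(product) else "overflows"
    print(f"{a} * {b} = {product} ({status} int32)")
```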
Btw, my kernel tick rate is 100 Hz. That might be relevant for the first issue.
Nice catch.
I've looked into the first issue (line 3230) a bit more because it seems more serious.
The comment above the no_iso_tick() function says rq->iso_ticks should be decreased. If I read line 3232 correctly, that means that the 'ticks' argument must be positive. As simple printk debugging shows, ticks is often negative. In fact, it is negative about a third of the time. This causes rq->iso_ticks to grow until an overflow happens.
I logged all ticks < -5, along with counters for positive, negative and zero ticks. You can find the log here: sendspace.com/file/ccz4c8
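To illustrate why a negative 'ticks' would make the counter grow instead of decay, here is a rough Python simulation. The thread does not show the actual MuQSS code, so the update rule below, `iso_ticks * (period - ticks) / period`, is an assumption, and so is the period of 500. That period was picked only because 500 - (-22) = 522 would match the factor in the first UBSAN report at a 100 Hz tick rate. Treat it as a sketch of the mechanism, not as the real implementation.

```python
INT_MAX = 2**31 - 1
PERIOD = 500  # assumed decay period (hypothetical, see the note above)


def decay(iso_ticks, ticks, period=PERIOD):
    """Assumed update rule: scale iso_ticks by (period - ticks) / period.

    Only non-negative iso_ticks are used here, so Python's floor division
    matches C's truncating division.
    """
    return iso_ticks * (period - ticks) // period


def steps_until_overflow(start, ticks, limit=10_000):
    """Count updates until the intermediate product exceeds a 32-bit signed int.

    Returns None if no overflow happens within `limit` updates.
    """
    value = start
    for step in range(1, limit + 1):
        if value * (PERIOD - ticks) > INT_MAX:
            return step
        value = decay(value, ticks)
    return None


print("positive ticks:", steps_until_overflow(1000, 5))
print("negative ticks:", steps_until_overflow(1000, -22))
```

With a positive argument the value shrinks towards zero and never overflows; with a persistently negative one it grows by a few percent per update until the multiplication no longer fits.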
Very impressed. Maxing out all 4 cores with 2 different compiler jobs and still the machine is as responsive as if there's nothing going on.
[OFF-TOPIC]
Sorry for disturbing... I am currently upgrading my openSUSE and want to ask: what's the currently recommended (mature) gcc compiler version for (mainly) kernel compilation?
Thanks in advance and best regards,
Manuel Krause
I use 5.3.0 at the moment (no problems).
About to upgrade to 6.3.0.
6.3.1 on arch. Everything fine.
duud
Thank you people, I read that as there being no known issues like in the early gcc5 days.
Unfortunately the upgrade process in openSUSE takes a while before one is on the safe side, which is also my own fault for keeping an old 13.1 merely up to date for so long.
That means it'll take some more days before I'm able to test your suggestions.
BR, Manuel Krause
x.3.x should be safe.
O.k. I've only taken the first step from 4.9.? (last possible of my former distro openSUSE 13.1) to now 5.4.1 -- and the resulting kernel behaves as well as before.
Thank you for your insights and BR, Manuel Krause
Hey,
I've been running Linux 4.9.7 with MuQSS for quite some time now without any issue. But today I wanted to try golang, and on simply issuing one command the application segfaulted. I thought this must be a golang error, but before reporting it I tried the same thing with the stock Arch Linux vanilla kernel and it didn't segfault, which means that somehow it's MuQSS's fault. I also compared both kernel configs and they are equivalent, with some obvious exceptions like bfq.
To reproduce this:
mkdir go && cd go
export GOPATH=$(pwd)
go get -u -v github.com/nsf/gocode
Running 4.9.0-ck(1) (git) (MUQSS).
No problem using the above commands.
What version of go are you using? I can reproduce this on "go version go1.7.5 linux/amd64".
1.4.2
Perhaps it's something with go 1.7? Could you try version 1.7?
^Used go 1.7.5 now on Slackware 14.2 with MUQSS, no problems with your commands above.
^ update: Upon closer inspection I noticed: although I updated to 1.7.5 it still uses 1.4.3 somehow. Too lazy to debug right now.
Nvm, it seems it's go's fault since I could reproduce this issue with a vanilla linux kernel. So it's all good.
Hi,
With Linux 4.9.8-ck1 I get the following stack trace upon resuming from suspend. Happens with HZ=250/300, I haven't noticed it with 1000.
[21889.468401] ------------[ cut here ]------------
[21889.468414] WARNING: CPU: 0 PID: 16898 at kernel/sched/MuQSS.c:1950 valid_task_cpu+0xa7/0xb0
[21889.468415] Modules linked in: nvidia_uvm(PO) nvidia(PO) bbswitch(O) nf_log_ipv4 nf_log_common xt_LOG ipt_REJECT nf_reject_ipv4 xt_state iptable_mangle iptable_nat nf_nat_ipv4 nf_nat iptable_filter rndis_host cdc_ether usbnet mii vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) nfsd ipv6 crc_ccitt fuse algif_skcipher af_alg uvcvideo btusb videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 btrtl btbcm videobuf2_core btintel bluetooth videodev snd_hda_codec_realtek snd_hda_codec_generic intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp kvm_intel iwlmvm kvm snd_hda_intel snd_hda_codec snd_hwdep irqbypass i915 snd_hda_core crct10dif_pclmul snd_pcm ghash_clmulni_intel iwlwifi alx snd_timer psmouse mei_me i2c_dev snd serio_raw mei mdio asus_nb_wmi asus_wmi sparse_keymap mxm_wmi wmi
[21889.468488] CPU: 0 PID: 16898 Comm: kworker/3:0 Tainted: P W O 4.9.8-ck1-smp #12
[21889.468490] Hardware name: ASUSTeK COMPUTER INC. N56VM/N56VM, BIOS N56VM.214 08/28/2012
[21889.468501] ffffc9000b8c7d20 ffffffff8143b52d 0000000000000000 0000000000000000
[21889.468506] ffffc9000b8c7d60 ffffffff81098746 0000079e00000002 ffff8802223dbc00
[21889.468511] 0000000000017680 ffff880225c1d988 0000000000000282 ffffc9000b8c7e18
[21889.468515] Call Trace:
[21889.468524] [] dump_stack+0x4f/0x72
[21889.468530] [] __warn+0xc6/0xe0
[21889.468534] [] warn_slowpath_null+0x18/0x20
[21889.468537] [] valid_task_cpu+0xa7/0xb0
[21889.468541] [] do_set_cpus_allowed+0x37/0xa0
[21889.468545] [] __kthread_bind_mask+0x3b/0x70
[21889.468549] [] kthread_bind_mask+0xe/0x10
[21889.468552] [] create_worker+0xfb/0x1a0
[21889.468554] [] worker_thread+0x318/0x4e0
[21889.468557] [] ? process_one_work+0x4a0/0x4a0
[21889.468561] [] kthread+0xd4/0xf0
[21889.468564] [] ? kthread_park+0x60/0x60
[21889.468571] [] ret_from_fork+0x22/0x30
[21889.468574] ---[ end trace 2557c3739e4d37b5 ]---
[21889.497983] Task kworker/3:1 (pid=17134) is on cpu 3 (state=0, flags=4208040)
[21889.521652] Removed affinity for 617 processes to cpu 4
[21889.522233] smpboot: CPU 4 is now offline
[21889.568254] Removed affinity for 618 processes to cpu 5
[21889.568266] smpboot: CPU 5 is now offline
[21889.621578] Removed affinity for 617 processes to cpu 6
[21889.621588] smpboot: CPU 6 is now offline
[21889.671566] Removed affinity for 618 processes to cpu 7
[21889.671579] smpboot: CPU 7 is now offline
[21889.696021] ACPI: Low-level resume complete
Sad story. 4.4.14 vanilla kernel (~180k config) feels more responsive than a custom 4.9.9-ck1 kernel (~70k config).
The kernel is getting too bloated.
Seems like no one cares about speed/latency/efficiency anymore.
Or it is by intent to sell more new cpus.
/rant.
Have you tried 4.9.9-ck1 built from the vanilla kernel config? Where are you getting your -ck config?
I just tailored it to my hardware.
Disabled all the other drivers I don't need.
Tracked it down to some file system issue.
All good now.
Pardon?
Have I read you correctly: your file system issue led to responsiveness issues? It would be nice to read a few more details about how you fixed it, so other users don't need to face it.
Thanks in advance,
BR, Manuel Krause
I just forgot to add tmpfs /run tmpfs... to my /etc/fstab as I was setting up the new system.
I doubt anyone else will mess this up.
Oh, yes, that one. Thank you for clarifying. On my side, I'm also not completely done with (mis)configuration hassles after upgrading through 3 openSUSE major releases. What I hate most about it are the unpredictable automatic installer decisions that still happen (even though I always choose the manual adjustment route). That's the reason why I had put it off for so long.
BR, Manuel Krause
If you want complete control and Linux as vanilla as it gets I suggest trying Slackware.
Also it has no shitstemd.
I never looked back.
Quite a seductive proposal.
But I've been with openSUSE for almost 2 decades now and have always managed the good times and the bad times so far.
I'm not _as_ upset with "shitstemd" ;-), and luckily I've seen that it has kept improving over the years.
What really worries me is the co-existence of plasma5 and old kde4 and the related Qt libs needed for each of them, severely filling up the partition. No one wants to repartition disks because of incomplete software, except for Windows users.
BR, Manuel Krause
I also started my Linux journey on SUSE Linux 6.0 back in 1999.
But they "forced" me to switch.
Hi ck,
I am using Linux 4.9.9-ck1 and I have the following stack trace upon resuming from suspend. This happens when HZ=250/300 but doesn't seem to happen when HZ=1000.
[26639.048008] Removed affinity for 589 processes to cpu 2
[26639.048021] smpboot: CPU 2 is now offline
[26639.051410] ------------[ cut here ]------------
[26639.051423] WARNING: CPU: 0 PID: 13564 at kernel/sched/MuQSS.c:1950 valid_task_cpu+0xa7/0xb0
[26639.051424] Modules linked in: nf_log_ipv4 nf_log_common xt_LOG ipt_REJECT nf_reject_ipv4 xt_state iptable_mangle iptable_nat nf_nat_ipv4 nf_nat iptable_filter vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) nfsd ipv6 crc_ccitt fuse algif_skcipher af_alg uvcvideo btusb btrtl btbcm videobuf2_vmalloc btintel videobuf2_memops videobuf2_v4l2 bluetooth videobuf2_core videodev rndis_host cdc_ether usbnet mii snd_hda_codec_realtek snd_hda_codec_generic intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp kvm_intel iwlmvm snd_hda_intel kvm snd_hda_codec snd_hwdep i915 snd_hda_core irqbypass snd_pcm crct10dif_pclmul snd_timer ghash_clmulni_intel iwlwifi snd mei_me alx psmouse i2c_dev mei mdio asus_nb_wmi serio_raw asus_wmi sparse_keymap mxm_wmi wmi
[26639.051493] CPU: 0 PID: 13564 Comm: kworker/2:2 Tainted: G O 4.9.9-ck1-smp #13
[26639.051494] Hardware name: ASUSTeK COMPUTER INC. N56VM/N56VM, BIOS N56VM.214 08/28/2012
[26639.051505] ffffc9000ccd7d20 ffffffff8143a6ed 0000000000000000 0000000000000000
[26639.051510] ffffc9000ccd7d60 ffffffff81097786 0000079e00000002 ffff88004e82b000
[26639.051514] 0000000000017680 ffff880225c1d948 0000000000000282 ffffc9000ccd7e18
[26639.051518] Call Trace:
[26639.051528] [] dump_stack+0x4f/0x72
[26639.051533] [] __warn+0xc6/0xe0
[26639.051537] [] warn_slowpath_null+0x18/0x20
[26639.051541] [] valid_task_cpu+0xa7/0xb0
[26639.051544] [] do_set_cpus_allowed+0x37/0xa0
[26639.051549] [] __kthread_bind_mask+0x3b/0x70
[26639.051552] [] kthread_bind_mask+0xe/0x10
[26639.051555] [] create_worker+0xfb/0x1a0
[26639.051558] [] worker_thread+0x318/0x4e0
[26639.051561] [] ? process_one_work+0x4a0/0x4a0
[26639.051564] [] kthread+0xd4/0xf0
[26639.051568] [] ? kthread_park+0x60/0x60
[26639.051574] [] ret_from_fork+0x22/0x30
[26639.051577] ---[ end trace 6e2ff89d2389b048 ]---
[26639.077517] Task kworker/2:1 (pid=14035) is on cpu 2 (state=0, flags=4208040)
[26639.117982] Removed affinity for 590 processes to cpu 3
[26639.118006] smpboot: CPU 3 is now offline
[26639.167935] Removed affinity for 590 processes to cpu 4
[26639.167950] smpboot: CPU 4 is now offline
[26639.217857] Removed affinity for 589 processes to cpu 5
[26639.217868] smpboot: CPU 5 is now offline
[26639.267850] Removed affinity for 589 processes to cpu 6
[26639.267861] smpboot: CPU 6 is now offline
[26639.317831] Removed affinity for 589 processes to cpu 7
[26639.317842] smpboot: CPU 7 is now offline
[26639.342205] ACPI: Low-level resume complete
[26639.342259] ACPI : EC: EC started
[26639.342260] PM: Restoring platform NVS memory
[26639.342626] Suspended for 106421.939 seconds
[26639.342710] Enabling non-boot CPUs ...
Thank you.
Thanks. Luckily that's harmless. I'll look into silencing it in the future.
Hi Con,
I've made some scaling tests with CFS and MuQSS, to see why MuQSS is performing poorly under half load.
They are here :
https://docs.google.com/spreadsheets/d/163U3H-gnVeGopMrHiJLeEY1b7XlvND2yoceKbOvQRm4/edit?usp=sharing
in the '4.9.9 Scaling test' sheet.
I remember that you said that it might be related to load balancing and Intel turbo boost.
However, I've found that my motherboard sets the CPU to its max turbo boost frequency when the XMP memory profile is enabled (and XMP is always enabled on my computer).
So my 4770k CPU always runs at a maximum frequency of 3.9 GHz, whether 1, 2, 3 or more cores are loaded. I've checked that with turbostat.
So I believe it's not a turbo boost issue.
I've also done some tests with XMP disabled and turbo boost working as intended.
The only thing I've found, using 'turbostat make -jN', is that with CFS, load is distributed evenly across physical cores and logical cores, whereas MuQSS puts more load on the physical cores.
I don't know if it's intended or if it can cause this performance issue.
I'm just writing this to let you know.
Pedro
Using cores over threads should actually improve performance, not make it worse, so it's not that either.
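For readers who want to see which logical CPUs share a physical core on their own machine when repeating this kind of test, here is a minimal Python sketch. It reads the `thread_siblings_list` files that Linux exposes under /sys/devices/system/cpu. The function names are invented for this example, and it only reports topology; it says nothing about how either scheduler actually places tasks.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a sysfs CPU list such as '0,4' or '0-1' into sorted CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def sibling_groups(root="/sys/devices/system/cpu"):
    """Return the distinct groups of logical CPUs that share a physical core."""
    groups = set()
    for path in Path(root).glob("cpu[0-9]*/topology/thread_siblings_list"):
        try:
            groups.add(tuple(parse_cpu_list(path.read_text())))
        except (OSError, ValueError):
            continue
    return sorted(groups)


if __name__ == "__main__":
    for group in sibling_groups():
        print("core shared by logical CPUs:", group)
```

On a 4-core CPU with hyperthreading this would typically print four groups of two logical CPUs each; the exact numbering depends on the machine.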
Hi Con, my last post about performance under half load has been filtered.
Can you please bring it up?
Pedro
Low-latency 4.9.9-ck1 kernel config.
(Based on Slackware64 14.2 4.4.14 kernel config, should run on any hardware.)
Getting 1.1 µs average latency on all 4 cores on a Xeon W3565 3.2 GHz quad-core using cyclictest.
http://pastebin.com/vvwsT3mE
No initramfs, cgroups, namespaces, etc. support, adjust as needed.
Hi, when I use an external USB wifi adapter with the 4.9-ck kernel, the system hangs/freezes. It only happens with the ck kernel; with 4.9 vanilla or Zen it doesn't happen. Syslog doesn't show anything, and I had to press the power button to restart.