A number of minor fixes queued up since 3.15-ck1 made their way into this patchset, along with some changes inspired by the development work of Alfred Chen (thanks!).
The major feature upgrade in this one is the inclusion of SMT nice, as discussed at length on this blog. This version of BFS includes an updated version of SMT nice, beyond the version 6 posted here, with one change: 25% of the CPU time of any nice level of SCHED_NORMAL tasks can be shared with any other nice level, over and above the nice-based CPU distribution. This capitalises on the slightly increased throughput available from using the sibling CPU concurrently, without costing higher priority processes too much CPU. In addition, it dramatically reduces the massive latencies that heavily niced tasks can otherwise sometimes see with SMT nice enabled, because the CPU is metered out in dithered fashion instead of being given as a single burst only when the task is entitled to it.
Making SMT nice configurable means users can choose whether they still want the standard behaviour. The config option recommends that users who enable the SMT scheduler option also enable the SMT nice option. I believe this to be a good default choice for virtually all desktop users, and selectively for server users if they depend heavily on the use of 'nice' or scheduling policies for their workloads (but otherwise it should be disabled).
BFS by itself:
3.16-ck1 branded BFS patchset directory:
3.16-ck1
EDIT: A build fix for non SMT enabled kernels to prevent it being possible to enable SMT nice is here:
bfs450-nosmt-buildfix.patch
Just disabling SMT nice will achieve the same thing for those affected.
Enjoy!
Please enjoy!
Config please.
Thanks. For now you can disable "SMT (Hyperthreading) aware nice priority and policy support" in the processor settings to fix it without any detriment. I will provide a build fix shortly.
Awesome! Thanks so much for your help.
@ck
In 0450, I can see that the tsk_is_polling check has been totally removed from resched_task(). But debug code shows that the TIF_POLLING_NRFLAG bit is set when checked in resched_task(), about 10000 times in 2 minutes.
[ 116.728800] bfs: resched_task 9571
Could you give a further hint as to why this check was removed in BFS? Thanks.
Still trying to decide if it even means anything on BFS without wake lists.
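For readers following the question above, here is a minimal user-space sketch of what a polling check in a resched path buys you. It is a toy model, not BFS or mainline code: the struct, its fields and `model_resched()` are hypothetical names, and the only assumption is that a CPU with its polling flag set is already watching the need-resched flag and so does not need an IPI to notice it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a remote CPU; every name here is illustrative only. */
struct remote_cpu {
    bool need_resched;     /* stands in for TIF_NEED_RESCHED */
    bool polling;          /* stands in for TIF_POLLING_NRFLAG */
    unsigned ipis_sent;    /* how many reschedule IPIs the model sent */
    unsigned polling_hits; /* how often polling was seen set, like a debug counter */
};

/*
 * Ask the remote CPU to reschedule.
 * With check_polling set, a polling CPU is left to notice the flag itself.
 * Returns true if the model sent an IPI.
 */
static inline bool model_resched(struct remote_cpu *cpu, bool check_polling)
{
    assert(cpu);

    if (cpu->need_resched)
        return false;          /* a resched is already pending */

    cpu->need_resched = true;

    if (cpu->polling) {
        cpu->polling_hits++;
        if (check_polling)
            return false;      /* the polling loop will see the flag, no IPI */
    }

    cpu->ipis_sent++;
    return true;
}
```

In this model, calling `model_resched()` with `check_polling` enabled on a polling CPU sets the flag and sends nothing, while calling it with the check disabled sends an IPI every time. Whether that saving matters on BFS without wake lists is exactly the open question in this exchange.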
Don't know how, but it seems that BFS breaks the i8k module on my laptop. It simply doesn't work.
Reverting BFS fixes the issue.
The only change to i8k module between 3.15 and 3.16 is this commit:
http://permalink.gmane.org/gmane.linux.kernel.commits.head/465805
And it's related to CPU scheduler.
Con?
Odd. Try this patch:
bfs450-resched-scap.patch
Investigating the code involved also led to this:
i8k: Don't revert affinity in i8k_smm
Seems to be OK. Will test more. Thanks!
So the first patch resched-scap was ok?
Yep, i8k works OK after applying bfs450-resched-scap.patch.
Well done!
Thanks to CK
and Chen for looking into the code
Greetings,
Ralph from Hamburg
Also, BFSv450 still triggers the ath9k issue, but after dancing around it I guess I've found a solution:
https://github.com/pfactum/pf-kernel/commit/85b98dfc2279d478c6aea72d7fac082fab28f55a
At least, it doesn't hang for now. Will test more.
Well this got me thinking and I've come up with this patch instead. Can you try it instead please for your ath9k problem?
bfs450-sched-ipi.patch
I've reverted my patch and applied yours.
With your patch ksoftirqd jumps from 0 to 100% CPU usage periodically, but now it doesn't hang the whole system, just slows it down, and Wi-Fi works.
With my patch ksoftirqd doesn't jump to high CPU usage and everything works OK too without slowdown.
I guess that's the right direction. Would be happy to test more patches.
Just to note, I realize my patch is a dirty hack, and the real solution should be elsewhere.
@pf
Good find. scheduler_ipi() was removed by my patch "[BFS] Remove runqueue wake_list". I did miss the preempt_fold_need_resched() call in it. ck's patch fixed it.
Do you have a working older kernel version (maybe 3.13 or 3.14) to jump back to and check the ksoftirqd behaviour?
Everything just worked before 3.14.
I guess preempt_fold_need_resched() is the right fix, but it's only part of the problem.
BTW, Con, do you remember replacing cond_resched() with schedule() in KVM code to make it work properly with BFS? The symptom was similar to ath9k one — 100% of CPU used by ksoftirqd.
Also, Alfred, you may take a look at this thread:
http://lists.tuxonice.net/pipermail/tuxonice-devel/2014-August/007509.html
(many emails spanning several months).
There's a detailed description of the ath9k issue there.
Also, I'd like to share with you more traces captured during ath9k+BFS hang (without any patches) with "perf top".
1. all CPU-consuming kernel functions: http://habrastorage.org/files/ae0/70f/8c3/ae070f8c3cba46a6a041891de8cb10d9.JPG
2–13. disassembled functions (those at the top of the list):
* http://habrastorage.org/files/ab3/b7a/c6a/ab3b7ac6a81c44e88e06dfd9908ea745.JPG
* http://habrastorage.org/files/750/f75/212/750f7521230a46e2b29ce2ace3f850ed.JPG
* http://habrastorage.org/files/758/c07/aee/758c07aee9784cdda9c9df30d429a887.JPG
* http://habrastorage.org/files/1b6/0b2/7b9/1b60b27b9325494597c06792cba6b6a9.JPG
* http://habrastorage.org/files/f72/6ea/d98/f726ead980cc475e9b394fa7a6b7f40f.JPG
* http://habrastorage.org/files/9c6/d29/1f5/9c6d291f59ef429c9e444348bb5ef32e.JPG
* http://habrastorage.org/files/9d7/8a9/232/9d78a92326ab4df9b4661a2f1e297889.JPG
* http://habrastorage.org/files/3e1/07b/f15/3e107bf159154122a57b6f81b8b20080.JPG
* http://habrastorage.org/files/b23/a4f/f55/b23a4ff55126404ea1160f59ed3873c7.JPG
* http://habrastorage.org/files/89e/337/034/89e3370342be42aab95075a8fb198062.JPG
* http://habrastorage.org/files/56f/a44/1a9/56fa441a97644c439f127916050da99c.JPG
* http://habrastorage.org/files/494/5f9/c7e/4945f9c7e3f348f4951f86e65551f204.JPG
Thanks PF. I assume you're saying the hang still happens then even despite the last patch I posted?
@ck do you mean bfs450-sched-ipi.patch? No. I'll describe once again.
With your last patch ksoftirqd CPU usage jumps from 0 to 100% periodically, but now it doesn't hang the whole system, just slows it down, and Wi-Fi works.
I posted a bunch of JPEGs for the case *without* your patch just to help you debug the issue, as it still persists *with* your patch (though to a lesser extent and without hangs).
@pf
Would you apply this patch on top of bfs450-sched-ipi.patch and see if it helps with the ksoftirqd CPU usage issue?
It enables the mainline TIF_POLLING_NRFLAG checking routines, which should help with IPIs somehow, but I am not sure whether it helps with the ath9k module.
https://bitbucket.org/alfredchen/linux-gc/downloads/0450-enable-polling-check.patch
@Alfred, I've applied bfs450-sched-ipi.patch and 0450-enable-polling-check.patch, and that doesn't fix the ath9k issue. The system behaves the same as without your 0450-enable-polling-check.patch.
@pf, thanks for testing anyway.
PF thanks for your tireless testing.
Can you give this crazy patch a try on top of sched ipi please?
bfs450-tifcheck_in_cond_resched.patch
@ck it works :).
So, I guess, the working solution on top of bare -ck1 is:
1. https://github.com/pfactum/pf-kernel/commit/44b3e870e656a11aa7116c236b7e00591141a68a — brings back scheduler_ipi()
2. https://github.com/pfactum/pf-kernel/commit/6a180442f154c5a624ee377dacfcc0b8631eb1e0 — uses tif_need_resched() in cond_resched()
Also I've reverted KVM workarounds here:
3. https://github.com/pfactum/pf-kernel/commit/ad4d566baf9a825f41240ce1785096028fdacd45
KVM+QEMU works OK.
Also, we don't need a special i8k workaround. I've reverted it, and i8k seems to work as well.
I'm going to test it more, but now everything seems to be OK. Thanks!
Thanks PF. I'd spent the last couple of days auditing code to see what might be responsible and that was the only solution I could come up with. The behaviour with this patch is definitely correct, but it's a bit disappointing because it means there's something fundamentally different in BFS handling the resched flag compared to mainline and I didn't intend to start diverting from mainline in this way. I'll keep auditing the code to see if there's an obvious trigger to act on this flag in a different place that I've missed but it's fair to say this is a sane solution for the time being and if I can't come up with anything, I'll just run with it.
OK, if there's a need to test more patches, feel free to send them to me.
@PF: Here's a crazy thing. Try with only the sched-ipi patch and disable all preempt completely in your config and see what that does please.
@PF: And after that try this combination:
bfs450-resched-scap.patch
bfs450-sched-ipi.patch
+
bfs450-add-preempt-resched.patch
@ck: applying the sched-ipi patch only and disabling preemption completely does the trick: ath9k seems to work OK, but i8k doesn't work.
@ck: also the last combination of patches works OK (both ath9k and i8k) with preemption enabled.
Aha! Now we're talking! This last set of patches is the correct fix (unlike the tifcheck patch). Let's try it for a day or two and then I can formalise these changes as a new BFS if nothing shows up. Thanks for testing!
@ck: if that is the correct fix, we kindly ask you to explain what happened :).
@ck: also with the last set of patches we do not need the KVM workaround either. I've reverted it too.
I can confirm that the last set of patches fixes the ath9k issue.
I'll report back if I encounter any issues.
kudos to pf and ck for solving this issue, and big thanks!
@PF: From Linux 3.13 onwards, setting just the "tif needs resched" flag alone was not enough to trigger a descheduling from certain places in the code; it also needed the "preempt needs resched" flag set to trigger a different type of descheduling, to hand over to another process or to kick a task off a CPU where it should no longer be.
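The two-flag behaviour described here can be sketched as a small user-space toy. This is only an illustration under stated assumptions, not kernel or BFS code: `struct cpu_model` and the `model_*` helpers are hypothetical names, with one field standing in for the "tif needs resched" flag, one for the "preempt needs resched" flag, and one helper standing in for what preempt_fold_need_resched() is understood to do (copy the first flag into the second).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy per-CPU state; field names are illustrative, not kernel structures. */
struct cpu_model {
    bool tif_need_resched;     /* stands in for the "tif needs resched" flag */
    bool preempt_need_resched; /* stands in for the "preempt needs resched" flag */
    int  preempt_depth;        /* greater than 0 while preemption is disabled */
};

/* Stand-in for folding: copy the task flag into the preempt state. */
static inline void model_fold_need_resched(struct cpu_model *c)
{
    if (c->tif_need_resched)
        c->preempt_need_resched = true;
}

/* A reschedule check that only looks at the folded flag. */
static inline bool model_should_resched_folded(const struct cpu_model *c)
{
    assert(c->preempt_depth >= 0);
    return c->preempt_depth == 0 && c->preempt_need_resched;
}

/* A reschedule check that looks at the task flag directly. */
static inline bool model_should_resched_tif(const struct cpu_model *c)
{
    assert(c->preempt_depth >= 0);
    return c->preempt_depth == 0 && c->tif_need_resched;
}
```

In this model, a CPU with only the task flag set never reschedules through the folded check until something folds the flag, whereas the direct check fires immediately. That roughly mirrors the difference between the tifcheck workaround and restoring the missing fold, though the real code paths involve far more than two booleans.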
Got that, thanks.
ck > "Let's try it for a day or two and then I can formalise these changes as a new BFS if nothing shows up."
I hope you will release the fixes, either as new BFS releases or as a cumulative patch, for previous kernel versions such as 3.15 and for the current LTS, 3.14.
thanks,
bye, NicCo
The recurring theme is that _cond_resched no longer works properly in BFS. In the i8k module it presents as a different bug, with the task not unconditionally rescheduling when the affinity changes, but it is the same issue as the ath9k tasklet not properly rescheduling and ksoftirqd spinning without rescheduling. Now to go back to 3.13 and see what changed at that time and how it broke.
Hi Con,
I would only comment that my system is running fine with 3.16.1 and BFS+SMT on an i7. Hibernate and suspend are working trouble-free (now with an Intel Wireless 7260 card and no longer with the ath9k module). I had the impression there was a higher load value, but this was not the case. And the NFS server on my machine gives the same throughput as without BFS, at least enough to stream some HD videos over wireless. Make operations may need some more time now, but that was the goal ;)
Or in other words, no drawbacks.
So thanks for your work.
Regards sysitos
So, if I read correctly, the patches:
(1) http://ck.kolivas.org/patches/bfs/3.0/3.16/test/bfs450-sched-ipi.patch
(2) http://ck.kolivas.org/patches/bfs/3.0/3.16/test/bfs450-tifcheck_in_cond_resched.patch
are of benefit?
And this one is not needed?:
(3) http://ck.kolivas.org/patches/bfs/3.0/3.16/test/bfs450-resched-scap.patch
Do these patches "only" (but thankfully!) heal the issues post-factum and others reported, or are they considered as bug-fixes for BFS?
@ Alfred Chen: Would you recommend the patches 1-2 to be applied onto your 3.16.y-gc patched kernel, too?
Best regards, Manuel Krause
All 3 are bugfixes for everyone, but the patches have not been finalised. If you have no behavioural issues you are unlikely to see anything by applying them.
Hehe, Con, you're getting funny... My "behavioural issues"...? ;-) Besides tracking and applying your patches...? ^^ Keep up this sense of humor!
No, really funny, and self-ironical for me!
I still want to read Alfred Chen on this, too, as I currently run his "old" 3.16.y-gc patches, so far, and he hasn't published an updated one for your, Con's, 3.16 release yet.
Manuel Krause
Can the same 3 bugfix patches be applied to kernel 3.15.x?
thanks
NicCo
Yes they apply equally there too except for bfs450-resched-scap.patch
The NEW patch set also works well for a non-affected system with a 3.16.y-gc patched kernel,
applied on top here:
bfs450-resched-scap.patch
bfs450-sched-ipi.patch
bfs450-add-preempt-resched.patch (No.1 with fuzz o.k, No.8 failed, as already removed o.k)
I hope Alfred Chen does consider this safe... ^^
Thank you all, and best regards,
Manuel Krause
BTW, I knew the "behavioural issues" are meant regarding my system, that's why I found it so funny as it can have double meaning for real life..
I'm watching this thread. I will update my -gc branch by rebasing 0450 and syncing with 3.16.2 from mainline, hopefully next week. As ck said, debugging is not finished, so I will not include these 3 patches; you can apply the updated ones if you are affected by similar issues.
With "behavioural issues" he means the way your system behaves.