I tried my luck and applied the 5.12 patch onto 5.13, and it almost applied cleanly. There were only minor issues caused by formatting changes and two removed functions. I just added one of them back and replaced the call to nohz_run_idle_balance in kernel/sched/idle.c with what appears to be its new equivalent, nohz_balance_enter_idle.
I'm not an experienced kernel hacker and didn't look too deeply into possible issues, especially with that nohz_balance_enter_idle function, but I've been running this kernel for two days now without any problems.
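For reference, the only hand edit in kernel/sched/idle.c was swapping that one call (assuming both functions just take the cpu number, which is how it looked to me). I haven't verified that the two are really equivalent, so treat this as a guess that merely happens to boot:

    -	nohz_run_idle_balance(cpu);
    +	nohz_balance_enter_idle(cpu);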
If anyone is interested, here's the fixup patch that cleans up after applying the 5.12 patch: https://nopaste.chaoz-irc.net/view/601b3dc0
USAGE:
1. patch -p1 < patch-5.12-ck1
2. Hit Enter a few times
3. patch -p1 < fixup.patch
4. make oldconfig && make -j$n && make install
Tested with 5.13.2-gentoo, but it should apply and clean up the failures from patch-5.12 on any 5.13 source tree. Works for me, but use at your own risk.
Since that worked so well, I played around a bit more and added schedule_jiff_hrtimeout_* wrappers around the msec variants of those functions. That made replacing all the calls to the regular timeout functions easier: just a single sed run for each one.
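In case anyone wants to reproduce it, the wrappers are just thin jiffies<->msecs shims, roughly like this (a sketch from memory, assuming the schedule_msec_hrtimeout_* helpers from the hrtimeout patch take and return millisecond counts; check the signatures in your tree):

    /*
     * Hypothetical jiffies-based shims around the -ck msec hrtimeout helpers.
     * Assumes the msec variants take and return milliseconds.
     */
    static inline long schedule_jiff_hrtimeout_interruptible(long timeout)
    {
            return msecs_to_jiffies(
                    schedule_msec_hrtimeout_interruptible(jiffies_to_msecs(timeout)));
    }

    static inline long schedule_jiff_hrtimeout_uninterruptible(long timeout)
    {
            return msecs_to_jiffies(
                    schedule_msec_hrtimeout_uninterruptible(jiffies_to_msecs(timeout)));
    }

After that, one mechanical sed run per schedule_timeout_* variant is enough to switch the call sites over.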
It actually boots, and the effect on latency according to cyclictest, and even LatencyMon in my Windows VM, is *HUGE*. On my system it basically removes any and all spikes above 1ms, where before I had at least a couple in every 10-second test run. Average latency is *halved*. That's way more than I expected. IME -ck is better than -rt for achieving low latency for some reason, and this change seems to give it even more of an edge.
I can get 2ms latency out of PulseAudio with a stock, default-settings -ck on my i5-8300H, and 2ms is actually a hardware limit of my sound chip. Not even -rt could go that low; 5ms was the lowest latency that didn't cause constant underruns. Maybe (?) -rt has a lower absolute bound on random spikes, but on a multicore system average latency across all cores seems more important, since the load can just be scheduled on any other unblocked core. At least that's how I understand the results I got.
So I wonder why there are so many calls to the regular timeout functions left in -ck? Are there implications for stability that I have yet to see, or that depend on the hardware/drivers? Or did you just not have time to thoroughly review all the new places where non-hrtimer timeouts were added? Not criticizing you or anything; I can understand if you don't have the time or motivation to dig through the churn of constant changes everywhere and opt for safe defaults instead of risking breaking anyone's system. Maintaining anything out-of-tree is difficult by design, after all. I'm just wondering if I'm doing anything you already know will cause catastrophic failure down the line...
Anyway, thank you for your hard work, MuQSS and the whole -ck patchset are an amazing improvement over the stock kernel for the desktop use-case. Saving lives and showing Ingo and Linus how it's done as a hobby on the side. What a madman!!
@Con
I noticed that the main MuQSS patch and the hrtimeout patches both lower the limit for min_delta (1000 -> 100). But in the ck patch it's controlled by the "hrtimer_granularity_us" tunable, while MuQSS hard-codes the new value. What do you think about moving "hrtimer_granularity_us" into the main MuQSS patch?
Also, is this value really worth modifying? Even with the mainline kernel there are still reports about "hpet increased min_delta_ns".
Thanks.
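To illustrate the difference I mean, the ck patch takes the floor from a boot-time parameter while MuQSS writes the literal value. This is only a sketch of the idea, not the actual patch code:

    /*
     * Sketch only, not the real patch: a boot-time tunable for the
     * clockevents min_delta floor instead of a hard-coded 100us.
     */
    static unsigned int hrtimer_granularity_us __read_mostly = 100;

    static int __init set_hrtimer_granularity_us(char *str)
    {
            unsigned int us;

            if (!kstrtouint(str, 0, &us) && us)
                    hrtimer_granularity_us = us;
            return 1;
    }
    __setup("hrtimer_granularity_us=", set_hrtimer_granularity_us);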
Sorry guys, the amount of spamming today has been insane, so I've restricted comments for the time being.