Thursday 31 December 2020

linux-5.10-ck1, MuQSS version 0.205 for linux-5.10

Announcing a new -ck release, 5.10-ck1, with the latest version of the Multiple Queue Skiplist Scheduler, version 0.205. These are patches designed to improve system responsiveness and interactivity with specific emphasis on the desktop, but configurable for any workload.
 
Probably the most interesting thing to happen, as pointed out to me by Damentz, was that the Intel i915 scheduler is based on the scheduling algorithm from MuQSS:
 [Intel-gfx] [PATCH 36/56] drm/i915: Fair low-latency scheduling  
 
It seems they understand the incredible simplicity of the underlying scheduling algorithm that guarantees both latency and fairness intrinsically.
 
This was a very minor resync from 5.9-ck1.

linux-5.10-ck1:
-ck1 patch:
 
Git tree:
 

MuQSS only:
Download:
 
Git tree:
 
 
Enjoy!
お楽しみ下さい
-ck

24 comments:

  1. Thank you for your work Con! Your scheduler is great, it improves performance (yes, not only responsiveness, but performance too, at least in some apps) noticeably over CFS on my weak CPU. And I even benchmarked it a while ago to be sure it's not a placebo effect.

  2. Nice to see your ideas spreading between developers, Con.
    Happy new year, btw!

  3. Are you announcing 5.9-ck1 or 5.10-ck1? You've mentioned both.

    Replies
    1. Copied over from old post by mistake, fixed now thanks.

  4. Many thanks Con, your work is much appreciated, it makes my daily Linux experience much better.

    Love to see other developers appreciating your work and applying it to other areas of the system.

    Wishing you a happy new year and the best for you and your loved ones.

  5. And now have a very smooth ride into the next year, sir! :-) I've been waiting for that, and now you've given me a New Year's pledge to work on once midnight has passed here. Stay safe!

  6. Thank you very much and a Happy New Year!

  7. I'm wondering which rqshare option I should select for my laptop with an i5-8265U processor (4c/8t). MC seems to be the preferred option for processors with < 6 cores, although the help seems to hint that SMT is always beneficial in terms of overhead, latency and throughput.

    Replies
    1. Depends on what you want to prioritise. For power SMT is probably better, but for latency, MC is still the best.
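As a rough illustration of checking which sharing level a running kernel was booted with, the sketch below parses a kernel command line for an `rqshare=` token. It assumes MuQSS accepts the sharing level as an `rqshare=` boot parameter in addition to the build-time choice; the helper name `rqshare_from_cmdline` is hypothetical, and the build-time default is not visible this way.

```python
def rqshare_from_cmdline(cmdline):
    """Return the value of an rqshare= boot token, or None if absent.

    Assumption: the sharing level may be passed as rqshare=<level>
    (for example "smt" or "mc", the two levels discussed above).
    """
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "rqshare" and sep:
            return value
    return None


if __name__ == "__main__":
    # /proc/cmdline holds the command line of the running kernel.
    with open("/proc/cmdline") as f:
        print(rqshare_from_cmdline(f.read()))
```

A result of `None` only means no override was given at boot; the kernel then uses whatever level was selected in its configuration.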

  8. With kernel v5.10 I noticed that upstream now forbids use of the "ondemand" (and "conservative") CPUfreq governors in favour of "schedutil". And we all know that "schedutil" doesn't work that great. Now we need extra patches to have better defaults.
    I'm using at least 2:
    1) Allow "ondemand" selection by default

    --- a/drivers/cpufreq/Kconfig
    +++ b/drivers/cpufreq/Kconfig
    @@ -71,7 +71,6 @@

     config CPU_FREQ_DEFAULT_GOV_ONDEMAND
     	bool "ondemand"
    -	depends on !(X86_INTEL_PSTATE && SMP)
     	select CPU_FREQ_GOV_ONDEMAND
     	select CPU_FREQ_GOV_PERFORMANCE
     	help

    2) Deal with intel_pstate
    a. Disable schedutil so it won't be used with intel_pstate
    (https://github.com/sirlucjan/kernel-patches/blob/master/5.10/ll-patches/0005-Disable-CPU_FREQ_GOV_SCHEDUTIL.patch)
    b. Disable intel_pstate by default (patch "drivers/cpufreq/intel_pstate.c" to force "no_load" usage by default,
    based on an old Canonical patch: http://people.canonical.com/~apw/lp1188647-saucy/0001-UBUNTU-SAUCE-intel_pstate-toggle-default-to-disable.patch )
    I'm choosing option "b" for my kernel build.

    @CK
    I'm really not certain about the future now, as upstream wants to remove ondemand/conservative at some point. Schedutil, especially in combination with MuQSS, constantly forces the CPU to stay at higher frequencies. Also, according to Phoronix tests, there might still be performance issues.

    Replies
    1. By virtue of the way schedutil tries to ramp up and down each logical CPU quickly whilst MuQSS spreads tasks evenly across all logical CPUs for latency, schedutil won't really work at its best. Disabling runqueue sharing and interactive mode on MuQSS will make it work better, but of course comes at a latency cost. How does one have their cake and eat it, I wonder? P-states work fine on Intel CPUs in my experience.
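To see how a running MuQSS kernel is currently tuned before experimenting with the trade-off described above, a minimal sketch along these lines can read the scheduler's sysctl files. It assumes the tunables are exposed as `interactive`, `rr_interval` and `yield_type` under `/proc/sys/kernel`; the function name `read_muqss_tunables` is hypothetical, and missing files are simply skipped so the script also runs on non-MuQSS kernels.

```python
from pathlib import Path


def read_muqss_tunables(sysctl_dir="/proc/sys/kernel"):
    """Return the MuQSS tunables found under sysctl_dir as name -> int.

    Assumption: MuQSS exposes "interactive", "rr_interval" and
    "yield_type" here. Files that do not exist are left out.
    """
    names = ("interactive", "rr_interval", "yield_type")
    found = {}
    for name in names:
        path = Path(sysctl_dir) / name
        if path.is_file():
            found[name] = int(path.read_text().strip())
    return found


if __name__ == "__main__":
    for name, value in read_muqss_tunables().items():
        print(f"kernel.{name} = {value}")
```

Turning interactive mode off would then be a matter of writing `0` to the `interactive` file as root, with the latency cost mentioned above.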

    2. >>P-states work fine on Intel CPUs in my experience.
      With kernel 5.8 upstream changed default governors for Intel P-state
      https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2228540.html
      With kernel 5.9 they expanded "passive" mode to HWP
      https://lkml.org/lkml/2020/7/14/1788
      And finally with 5.10 they soft-blocked "ondemand"
      https://lkml.org/lkml/2020/10/22/713

    3. Oh well I guess I'm talking about earlier kernels then. Seems we go flat out with 5.10 and MuQSS for now. Sigh.

    4. So starting with kernel v5.8, Intel CPUs older than 6th gen (Skylake) use intel_cpufreq by default (the old "passive" mode with "schedutil"). Skylake+ still use intel_pstate ("active" mode with Intel's "powersave" & "performance" governors) but there's a movement towards "passive" mode.

    5. That's probably why I didn't notice. So I guess reenabling the active mode would be a workaround for MuQSSers.
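To check which mode a given machine actually ended up in, a small sketch like the following reads the relevant sysfs files. It assumes the standard cpufreq sysfs layout (`intel_pstate/status` plus `scaling_driver` and `scaling_governor` under `cpu0/cpufreq`); the function name `cpufreq_summary` is hypothetical, and any file that is absent (for example on AMD systems) is reported as `None`.

```python
from pathlib import Path


def cpufreq_summary(sys_cpu="/sys/devices/system/cpu"):
    """Report the intel_pstate mode and cpu0's driver and governor."""
    base = Path(sys_cpu)

    def read(rel):
        # Return the stripped file contents, or None if the file is absent.
        path = base / rel
        return path.read_text().strip() if path.is_file() else None

    return {
        "intel_pstate_status": read("intel_pstate/status"),
        "driver": read("cpu0/cpufreq/scaling_driver"),
        "governor": read("cpu0/cpufreq/scaling_governor"),
    }


if __name__ == "__main__":
    for key, value in cpufreq_summary().items():
        print(f"{key}: {value}")
```

If the status reads `passive`, writing `active` to `intel_pstate/status` as root, or booting with `intel_pstate=active`, should switch the mode back without patching, assuming the running kernel supports both.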

    6. Looks like AMD users still use cpufreq, so they would need at least the aforementioned revert patch just to build their kernel with ondemand by default. Prior to version 20.10, Ubuntu shipped a systemd/sysvinit service which forced the ondemand governor, but now it's removed (https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=65f46a7d14b335e5743350dbbc5b5ef1e72826f7).
      So ultimately we need a few patches:
      * Revert intel_pstate to "active" mode by default (drivers/cpufreq/intel_pstate.c)
      * remove schedutil hard dependency + unblock ondemand selection and make it default for cpufreq (drivers/cpufreq/Kconfig)

    7. Yes, figured as much from your post. Happy to take git pull requests...

    8. As a shameless plug, Liquorix already is configured as ideally as @Yevhen is suggesting. The only two governors available are performance/ondemand for cpufreq. Intel P-state is disabled by default (but can be turned on). So if you want MuQSS with proper CPU frequency scaling support, give Liquorix a shot.

  9. Thanks Con.
    I've done some throughput and interbench benchmarks on my Intel 4770K CPU with ck1 and SMT sharing.
    Looking at the results, it seems that schedutil is better than ondemand.
    And on a side note, it seems schedutil has slightly improved with CFS.

    https://docs.google.com/spreadsheets/d/163U3H-gnVeGopMrHiJLeEY1b7XlvND2yoceKbOvQRm4/edit?usp=sharing

    Replies
    1. Thanks, if schedutil never ramps down its speed then of course its performance will be better, but that's useless for power. You may as well measure it with the performance governor in that case.

  10. I just switched to kwin_wayland, and with MuQSS, somehow all of kwin_wayland's child processes also inherit the SCHED_RR policy.

    The relevant commit is at https://invent.kde.org/plasma/kwin/commit/91d78daac4

    Apparently it does something funky, such that the process itself is already SCHED_RR, and then after starting the libinput thread, the policy is set to SCHED_RR & SCHED_RESET_ON_FORK.

    I think this doesn't work with MuQSS, due to the "If not changing anything there's no need to proceed further" check, which doesn't take reset-on-fork into consideration. As a quick hack, I tried removing the if-block, and that seems to fix the problem.
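The suspected behaviour can be modelled outside the kernel. The sketch below is a toy model only, not MuQSS or mainline code: `Task`, `setscheduler` and the `store_reset_on_fork` switch are hypothetical names, and the early-exit logic is an assumption based on the description above. It shows how a "nothing changed" shortcut can drop a newly requested reset-on-fork flag unless the flag is stored before returning.

```python
from dataclasses import dataclass

# Linux values for the policy and the flag OR-ed into it.
SCHED_RR = 2
SCHED_RESET_ON_FORK = 0x40000000


@dataclass
class Task:
    policy: int
    rt_priority: int
    reset_on_fork: bool = False


def setscheduler(task, policy_with_flags, priority, store_reset_on_fork):
    """Toy model of a setscheduler call with an early-exit shortcut."""
    reset_on_fork = bool(policy_with_flags & SCHED_RESET_ON_FORK)
    policy = policy_with_flags & ~SCHED_RESET_ON_FORK

    # "If not changing anything there's no need to proceed further."
    if policy == task.policy and priority == task.rt_priority:
        if store_reset_on_fork:
            # Remember the flag even though policy/priority are unchanged.
            task.reset_on_fork = reset_on_fork
        return

    task.policy = policy
    task.rt_priority = priority
    task.reset_on_fork = reset_on_fork


if __name__ == "__main__":
    t = Task(policy=SCHED_RR, rt_priority=1)
    setscheduler(t, SCHED_RR | SCHED_RESET_ON_FORK, 1, store_reset_on_fork=False)
    print("flag kept without storing:", t.reset_on_fork)
    setscheduler(t, SCHED_RR | SCHED_RESET_ON_FORK, 1, store_reset_on_fork=True)
    print("flag kept when stored:", t.reset_on_fork)
```

In this model, a task that is already SCHED_RR and asks for SCHED_RR with reset-on-fork hits the shortcut, so children would keep inheriting SCHED_RR unless the flag is recorded first. Removing the if-block, as in the quick hack above, has the same effect in the model because the full path always stores the flag.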

    Replies
    1. Looks like a bug from mainline that got fixed several years ago: https://github.com/torvalds/linux/commit/d6b1e911

    2. Thanks. Will look into it for next release.

    3. 5.11-ck1 works great here, thanks!
