Tuesday, 1 May 2018

linux-4.16-ck1, MuQSS version 0.171 for linux-4.16

Announcing a new -ck release, 4.16-ck1, with the latest version of the Multiple Queue Skiplist Scheduler, version 0.171. These are patches designed to improve system responsiveness and interactivity, with specific emphasis on the desktop, but configurable for any workload.
linux-4.16-ck1:
-ck1 patches:
Git tree:
MuQSS only:
Download:
Git tree:


Web: http://kernel.kolivas.org


This is mostly just a resync with the 4.15 MuQSS and -ck patches. The only significant difference is that the default config for threaded IRQs is now disabled, as this seems to be associated with boot failures when used in concert with runqueue sharing. I still include the patch in -ck that stops build warnings from making the kernel build fail, and I've added a single patch to aid building an evil out-of-kernel driver that many of us use.
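If you hit the boot failures mentioned above on an earlier -ck build, the combination can be ruled out from the boot line. The rqshare values are the ones MuQSS accepts (rqshare=none and rqshare=mc both appear in the comments below), and threadirqs is mainline's switch for forced IRQ threading; how it interacts with the new -ck default is an assumption, so treat this as a sketch:

```
# Kernel command line knobs (e.g. appended to GRUB_CMDLINE_LINUX_DEFAULT):
#   rqshare=none    -> disable MuQSS runqueue sharing entirely
#   rqshare=mc      -> multicore-level sharing (the setting used in comment 11 below)
# And make sure 'threadirqs' is NOT passed, so IRQs stay unthreaded.
```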


Enjoy!
Please enjoy!
-ck

35 comments:

  1. Thanks for the resync.

  2. Thanks. Much appreciated.

  3. Great. Much appreciated man

  4. Running great. Thanks very much.

  5. Hello, great work. I've used -ck for a while but have missed it recently on Artix since the maintainer gave up on it.
    So I compiled my own and also tried to use repo-ck, which drops my connection at about 2% of the download; at least that's better than dropping it at 99%.

    Here is an article I put up for -ck http://sysdfree.wordpress.com/204

  6. Thanks Con.
    I did some throughput benchmarks.
    https://docs.google.com/spreadsheets/d/163U3H-gnVeGopMrHiJLeEY1b7XlvND2yoceKbOvQRm4/edit?usp=sharing

    Still the same consistent performance for MuQSS.

    I'm willing to try benchmarking latencies again.
    I've found this article:
    https://lwn.net/Articles/725238/

    Do you think any of those tools could be used?

    Pedro

  7. Might it be appropriate in this case to use https://github.com/ckolivas/interbench ?

    1. I already ran interbench with Linux 4.15. The results are rather difficult to understand.
      Judging by the numbers, PDS seems to be the best, but some users noted slowdowns in the UI while using it. So there is more to it than that.
      I'd like to find other tools to compare latencies.

      Pedro

    2. rt-tests, cyclictest.

    3. Well, I've tried that one too with Linux 4.10 and MuQSS 152, and also bcc runqlat.
      MuQSS latencies were higher than CFS's. Con commented that you can't directly compare CFS and MuQSS with this tool as it doesn't use the same functions.

      Pedro

    4. Any tools that hook into function calls in the kernel are simply not going to work, as the function names and purposes are different in MuQSS. As for interbench results, it is a fairness test as well as a latency test, so looking for just the lowest latency as some kind of perfect endpoint will give you the wrong conclusion.

  8. Thanks for answering.
    So could these tools be used for regression testing between MuQSS releases?
    I don't know if the internals change that much between MuQSS and mainline releases.

    Pedro

    1. Most of the time, yes, though there is variance in results too, so repeating the tests is especially important if something suddenly looks much better or much worse.

  9. Hi Con,

    Compile error on 32 bit Pentium 4:

    CC kernel/sched/MuQSS.o
    In file included from kernel/sched/MuQSS.c:73:0:
    kernel/sched/MuQSS.h:739:46: warning: ‘struct sched_domain’ declared inside parameter list will not be visible outside of this definition or declaration
    unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
    ^~~~~~~~~~~~
    kernel/sched/MuQSS.h: In function ‘arch_scale_cpu_capacity’:
    kernel/sched/MuQSS.h:741:15: error: dereferencing pointer to incomplete type ‘struct sched_domain’
    if (sd && (sd->flags & SD_SHARE_CPUCAPACITY) && (sd->span_weight > 1))
    ^~
    kernel/sched/MuQSS.h:741:25: error: ‘SD_SHARE_CPUCAPACITY’ undeclared (first use in this function)
    if (sd && (sd->flags & SD_SHARE_CPUCAPACITY) && (sd->span_weight > 1))
    ^~~~~~~~~~~~~~~~~~~~
    kernel/sched/MuQSS.h:741:25: note: each undeclared identifier is reported only once for each function it appears in

    Kernel is 4.16.8 with only your patch added, but the same happens with the zen kernel.

    Old 4.15 MuQSS was working fine.
    Thanks.

    Regards sysitos

  10. For hyper-threading CPUs, 'threaded IRQs' together with SMT runqueue sharing will cause random glitches and prevent the system from suspending or hibernating.

    1. Use maxcpus= kernel parameter, where x is # of real cores.
      HT sucks anyway.

    2. fail^ maxcpus=x

    3. why are you giving bad advice?

    4. bad advice?

    5. The real number of cores is only half the number of logical cores with HT, so the system will be almost twice as slow.

    6. well that's bullshit.

    7. https://en.wikipedia.org/wiki/Hyper-threading
      "For each processor core that is physically present, the operating system addresses two virtual (logical) cores and shares the workload between them when possible."

    8. yes.
      I was referring to "the system will be almost twice as slow", though.

  11. Ever since this version of MuQSS I have been having complete lockups; initially I figured it was WINE, since the lockups seemed to occur only when using it.

    But just now it also happened after I had completely removed WINE. Another theory was Chromium, but it also occurred at least once without a single instance of Chromium running.

    Having discounted all other variables over time by now, all I can say is that the sole constant is this new version of MuQSS.

    This is on an AMD A6-3650.

    Was running rqshare=mc, going to try rqshare=none now to see if that resolves the issue. If not, I am going to try to see what happens if I run the stock kernel (so, no MuQSS at all).

    I'd hate for it to truly be related to MuQSS though, it is such an amazing scheduler.

    1. rqshare=none did nothing to resolve the issue; still experiencing complete deadlocks. And no mention of any warning or error or panic in any of the logs in /var/log. Not even in kern.log, which should be the one that is always being updated.

      Implying the kernel is really deadlocking hard.

      Trying a completely stock kernel and configuration now.

    2. Good news: MuQSS is not the cause. Stock kernel and configuration also displayed this behaviour.

      Bad news: Something is really wrong, and I'm going to have to look elsewhere. I'm not inclined to point to a hardware issue though, as it mostly seems to occur when context switching between tasks, and putting memory pressure on the box with an array of heavy applications open is no guarantee of triggering it. So, probably not memory. And if it had been the CPU, I'd have expected it to fail to boot; I'd be expecting kernel panics if it had been hardware.

      Anyhow, nothing to see here, sorry to have cried wolf over... well, something that is unrelated to MuQSS.

    3. Simply disabling threaded IRQs would resolve your issue.

    4. If you have no swap and have disabled the OOM killer, maybe you ran out of memory?

  12. @Enih -- As per Con's own entry for this version of MuQSS, threaded IRQs were defaulted to off. And besides, this problem never occurred before; I'd have expected it to show up long, long ago if that had been the cause. The PC is years old and I've been running Linux on it for years as well. This problem never occurred until quite recently.

    @Anonymous -- Running out of memory would not cause the entire system to completely seize up. The Linux kernel is a bit more sophisticated than that; at one point or another it will start aggressively killing applications just to keep running. No joke, it will.

    Anyhow, traced the issue -- It's hardware. Specifically, it seemed to have been an overheating issue. It has been rather warm lately and apparently the PC just needed a good cleaning. Been running very stable for hours now after I gave it a quick once-over.

    1. Just in case people were curious, here's another update:

      It is not hardware at all... not even remotely. It's a bug in BFQ. I finally saw the kernel panic floating by (no log was ever recorded for it, but I happened to be on one of the virtual TTYs when it happened, so no desktop manager was in the way of me seeing the kernel panic).

      "kernel panic in blahblahblah/bfq_sq/iosched.c" or words to that effect.

      End of the line for Liquorix for me. End of the line for BFQ for me. Apparently it is still in too rough of a shape to be relied upon.

      Back to stock, so no more MuQSS either, sadly, because I really fell in love with it. It is light years ahead of CFS.

  13. CK, I have a static IP and a domain with 5 megabit upload. I'd like to offer to mirror your material for free (I'm not hosting anything else). Are you interested? If so, send an SMS to 0438 470 680

  14. 4.16.16-ck1 panics or hangs at "Sharing SMP runqueue from CPU 3 to CPU0" in a VirtualBox 5.2.12 VM.
    It might boot after 5 or more tries.
    config https://pastebin.com/raw/dqKm9MvW

  15. Hi Con,
    First of all thank you very much.
    Linux has been way more fun for me because of your work.
    Since 4.16 is EOL now, are there 4.17 patches in the making?

    Best regards,
    Anonymous

  16. 4.16.17 & 4.16.18 didn't boot here; early crash within 0.3s. 4.16.16 is working fine though.
