Sunday, 18 February 2018

linux-4.15-ck1, MuQSS version 0.170 for linux-4.15

Announcing a new -ck release, 4.15-ck1, with the latest version of the Multiple Queue Skiplist Scheduler, version 0.170. These are patches designed to improve system responsiveness and interactivity, with specific emphasis on the desktop but configurable for any workload.

linux-4.15-ck1:
-ck1 patches:
Git tree:
MuQSS only:
Download:
Git tree:


Web: http://kernel.kolivas.org


The major change in this release is the addition of a much more mature version of the experimental runqueue sharing code I posted on this blog earlier. After further experimenting and with lots of feedback from users, I decided to make multicore based sharing the default instead of multithread. The numbers show better throughput, and it should definitely provide more consistent low latency compared to previous versions of MuQSS. For those who found that interactivity on MuQSS never quite matched that of BFS before it, you should find this version now equals it.

In addition, the runqueue sharing code in this release also allows you to share runqueues for SMP, so you can share runqueues between all physical CPUs if latency is your primary concern, even though it will likely lead to worse throughput. I have not made it possible to share between NUMA nodes because the cost of shifting tasks across nodes is usually substantial; it may even have worse latency, and will definitely have worse throughput.

I've also made runqueue sharing configurable at boot time with the boot parameter rqshare. Set it to one of none, smt, mc, or smp by appending it to your kernel command line, for example:
 rqshare=mc
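
As a minimal sketch (not part of the MuQSS patch), the shell function below pulls the rqshare value out of a kernel command line string, so you can confirm what a machine was actually booted with. The function name rqshare_from_cmdline is made up for illustration, and it assumes that if the parameter appears more than once, the last occurrence is the one that counts.

```shell
#!/bin/sh
# Hypothetical helper: print the rqshare value from a kernel command line.
# Returns 1 if no rqshare= parameter is present, 2 if the value is not one
# of the four settings named in this post (none, smt, mc, smp).
rqshare_from_cmdline() {
    # Split on spaces, keep only rqshare=... tokens, and take the last one
    # (assumption: a later occurrence overrides an earlier one).
    rq_value=$(printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^rqshare=//p' | tail -n 1)
    case "$rq_value" in
        none|smt|mc|smp) printf '%s\n' "$rq_value" ;;
        "") return 1 ;;
        *) printf 'unrecognised rqshare value: %s\n' "$rq_value" >&2; return 2 ;;
    esac
}

# Report the setting for the running kernel, if /proc/cmdline is readable.
if [ -r /proc/cmdline ]; then
    rqshare_from_cmdline "$(cat /proc/cmdline)" ||
        echo "no valid rqshare= parameter found on the command line"
fi
```

If nothing is printed for the parameter, the kernel was booted without rqshare= and uses whatever sharing level it was built with.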

Documentation has been added for the runqueue sharing code above to the MuQSS patch.

A number of minor bugs were discovered and have been fixed, which has also made booting more robust.

The -ck tree is mostly just a resync of previous patches, but with the addition of a patch to disable a -Werror CFLAG setting in the build tools which has suddenly made it impossible to build the kernel with newer GCCs on some distros.


Enjoy!
お楽しみ下さい
-ck

79 comments:

  1. The patch for disabling the -Werror in kernel compilation works wonders. I now can patch my kernel and compile it successfully, as well as enjoy the better throughput and responsiveness. Thanks much!

  2. Compiles & boots fine. Compilations & games running great (default MC + 100HZ here).
    But I noticed that on my Ryzen system the CPU frequency is rather high: on a seemingly idle system it's around 3GHz almost all the time, while vanilla and PDS sit nicely at about 1.3-1.5GHz.
    I don't know if this is an issue per se, but it seems kinda unusual.

    BR, Eduardo

  3. Hi,

    On a Phenom II X6, 4.14 with the MuQSS scheduler (Gentoo ck-sources) is able to boot, but 4.15 hangs at “acpi: pci interrupt link [lub0] enabled at irq 23” with a blinking cursor.
    Booting with rqshare=smt, with rqshare=none, or with MuQSS disabled works.

    MuQSS locality from /var/log/debug, taken from a working boot with rqshare=none:

    0 to 1: 2
    0 to 2: 2
    0 to 3: 2
    0 to 4: 2
    0 to 5: 2
    1 to 2: 2
    1 to 3: 2
    1 to 4: 2
    1 to 5: 2
    2 to 3: 2
    2 to 4: 2
    2 to 5: 2
    3 to 4: 2
    3 to 5: 2
    4 to 5: 2



    Replies
    1. And what about rqshare=mc? I'd imagine the issue being SMT sharing in particular. At least, that's what I'm getting from it anyhow.

    2. Never mind, misread what you wrote. Apologies.

    3. rqshare=mc hangs

    4. What about rqshare=smp ?

    5. rqshare=smp hangs in the same way

    6. Darn. Well rqshare=none is the same as the previous version muqss behaviour. Booting is always such a fragile process and different every release :(

    7. @Hanging Anon -- Does this also occur with 4.14? Here's the git for Con's 4.14 that includes rq sharing:

      https://github.com/ckolivas/linux/tree/4.14-muqss-rqshare

      For me, 4.14 boots fine with rqshare=mc. Can't try 4.15 yet, not going to compile the thing myself and Liquorix is not yet synced with 4.15. But it's worth a shot.

      Not that I am trying to discount the possibility of it being related to rqshare=mc. Just to rule out the possibility of it also being linked to 4.15, one of the larger kernel releases in the past few years.

    8. It seems the Phenom family in particular is having issues booting with rqshare=mc going by the comments on “Runqueue sharing experiments with Muqss” post.

    9. You mean comment. It was one person only.

    10. One had an X6 and a reply had a X4 965.

    11. Liquorix finally synced with 4.15 so, just tested it, rqshare=mc on 4.15. Not a problem here. And, not a Phenom. Still an AMD though. So, it's not an AMD issue.

  4. Thank you very much!

  5. One had an X6 and a reply had a X4 965.

  6. Wow! E8400 runs like a dream. Many thanks.

  7. Thank You for the resync.
    I had no scheduler related problems with this kernel.

  8. @ck,

    Noticed the following on my Thinkpad X220 [sandybridge] and can reproduce it on my Thinkpad T440s [haswell], and am wondering if you or anyone else has seen this happening. The following table lists the booted kernel versions in the order they were tested, along with their approximate 1 min load average about 5 min after booting with only i3 running (all ck packages come from Repo-ck by graysky at the Arch Linux forums):

    linux-ck-sandybridge-4.15.5-1 = ~1.0
    linux-4.15.4-1 = ~0
    linux-ck-sandybridge-4.14.19-1 = ~0
    linux-ck-sandybridge-4.15.7-1 = ~1.0

    Everything else was kept identical except switching between ck and stock kernels or downgrading and upgrading the ck kernel between boots.

    The constant ~1.0 average load causes the machines to run a few degrees C warmer than usual, which is how this caught my attention. Nothing obvious or out of the ordinary in CPU activity or output from:

    journalctl
    ps -aux
    iostat
    dmesg

    Didn't come across any similar reports on the last few pages of Archlinux ck forum thread or here, so apologies in advance if I missed something.

    So is anyone else seeing increased load averages (with no obvious source) between ck 4.14 -> 4.15?

    Thanks in advance,

    Halocaridina

  9. negative, 4.15, Wolfdale:
    Tasks 66, 147 thr; 1 running
    Load average: 0.02 0.10 0.31
    Uptime: 03:51:01


  10. positive, 4.15, ck-atom
    load ~1,5
    maybe Xorg?

    Replies
    1. I think there's something wrong with how load is counted, but in reality everything is fine and the hardware is not really under load. If you compare what's shown in "turbostat" with what's happening in "top/htop", the CPU usage in top/htop seems to be wrong.

      The following filters turbostat's output to just CPU usage:

      sudo turbostat --show CPU,Busy% --quiet

      or like this for only the average:

      sudo turbostat --show Busy% --cpu '' --quiet

      If you look at that, you'll see what I mean.

    2. Hey,
      Thanks for your explanation. I did try turbostat and frankly it shows different results, however they are still not reassuringly low. What about increased heat? Thanks

    3. Winter is almost over.
      It's getting warmer already.

    4. I wasn't saying it's not bad that load is high. I was mentioning turbostat just to show something maybe interesting about the problem. I'm guessing it means there is no problem in Xorg and such.

      I'm guessing the problem is purely in how stuff gets counted somewhere in the kernel. This hopefully then means that for people that already use 'performance' governor, there's not really more heat.

      But for people using 'ondemand' governor, I'd assume CPU MHz is being set wrong and too high because of the wrong load numbers, so for those people there should be more heat, I guess?

      Something to try would be the 'schedutil' governor if there's a MHz problem. It might behave differently.

    5. Do you have CONFIG_NO_HZ_IDLE enabled? At least on 4.14-ck1:

      - CONFIG_NO_HZ_IDLE + conservative governor = sometimes the selected frequency is higher than expected
      - CONFIG_HZ_PERIODIC + conservative governor = good frequency selection

      With CONFIG_HZ_PERIODIC the system still runs a bit warm probably due to the periodic interrupts preventing it from entering the C* states.

  11. Thanks Con.
    I did some throughput & interbench tests with MuQSS 150 on my 4770k.
    https://docs.google.com/spreadsheets/d/163U3H-gnVeGopMrHiJLeEY1b7XlvND2yoceKbOvQRm4/edit?usp=sharing

    On this kind of cpu, what's the difference between rqshare=mc and rqshare=smp ?

    Pedro

    Replies
    1. I assume you meant muqss 170 where you wrote 150 everywhere. MC and SMP sharing on that kind of CPU are identical so any differences you are seeing are simply showing the wide variance in your results.

    2. You are right. It was a mistake.
      Thanks for clarifying the MC and SMP sharing on my CPU.
      And thanks for your continuous efforts in maintaining and improving MuQSS.

      Pedro

    3. Well so which process scheduler has won?
      With the PDS process scheduler, the graphical interface is constantly lagging for me.
      MuQSS is the best in this case.

    4. Well, by those numbers alone, PDS actually seems the best, the most well rounded scheduler. But, like you said, you're experiencing a laggy UI while using it.

      Which does actually account for a great deal as well. Performance only goes so far; the user experience is at least as important, if not more so.

      And my experience is exactly the same as yours: MuQSS simply provides the smoothest desktop experience.

    5. MuQSS with RQsharing (mc) wins.

  12. OK. CPU load, etc. is way off.
    But I don't care as long as the performance is there, and it is.

  13. Although I compiled BFQ into the kernel I can't use or choose it.

    cat /sys/block/sda/queue/scheduler
    [noop] deadline cfq


    Replies
    1. elevator=bfq in kernel parameters shows this in dmesg:

      [ 0.210536] I/O scheduler bfq not found

    2. Found something:
      https://github.com/NixOS/nixpkgs/issues/35025

    3. OK.
      Solved it by adding scsi_mod.use_blk_mq=1 to kernel parameters.

      cat /sys/block/sda/queue/scheduler
      [mq-deadline] kyber bfq none

    4. Please don't use BFQ with kernel 4.15, it's broken. Check https://groups.google.com/forum/#!forum/bfq-iosched for details. The fix is already done, but it seems it will not go into 4.15 anymore. For me the error shows up quickly when mounting an NTFS USB thumb drive: a simple "blkid" hangs in D state (together with the udev process). As a workaround, change your udev rule so it does not use BFQ for all (new) drives or, better, use zen-kernel with the integrated BFQ-MQ (which is BFQ next ;) ), where the patch is already integrated. It works great for me.

      cat /sys/block/sd?/queue/scheduler
      mq-deadline kyber bfq [bfq-mq] none

      Regards sysitos

    5. Thank you very much, sir.
      Regards

    6. One question, can I patch MUQSS into the zen kernel?

    7. Or use pf-kernel, where I have backported fixes needed for BFQ to work properly.

    8. Or is there a way to get the bfq-mq patch(es)?

    9. The Zen kernel comes with MuQSS, BFQ-MQ, exFAT and some other tweaks already included. Do a simple "git clone" of https://github.com/zen-kernel/zen-kernel, and then after each new kernel release a "git pull" to stay current.
      The pf-kernel from Oleksandr is similar, but without the exFAT driver, and I don't like the versioning: the kernel stays at 4.xx.0-pfxx, so I can't immediately see the actual patch level of the mainline Linux kernel. All IMHO. But anyway, thanks Oleksandr for your work.

      Regards sysitos

    10. OK.
      Thanks.

    11. @sysitos:
      IMO, complaining about Oleksandr's kernel naming scheme is not appropriate and no reason not to use or try it.
      In my experience he always did and does a great job to add valuable patches to his combo, some even picked from future releases, making his github repo a good source of information about what's going on in the cpu-/ disk-scheduler- world (etc.).
      I also don't like his naming scheme, for me it confuses GRUB2 boot menu, but he ships with the "apply to bare vanilla .0.0 kernel" and it's a clear announcement.
      We are all free to edit the top Makefile or new patches, w.r.t the release notes, to indicate the current kernel version.

      Regarding BFQ-MQ, I also hope that both the kernel configuration and telling MQ apart from the older SQ one at runtime become easier.

      Best regards,
      Manuel Krause

    12. @Manuel, sorry, but what are you trying to say to me? Maybe my German-English translator is as bad as yours, but nowhere did I write that anybody, including me, shouldn't use the pf-kernel. So don't blame me. I use both kernels; they are similar but with differences, as I wrote. There was a time when Oleksandr even had the PDS scheduler in his patchset and zen-kernel was some versions behind the mainline kernel.
      And I know what great work Oleksandr is doing: he is mostly the first to test and check the new CPU schedulers (from Con or Alfred), or he is bug hunting on BFQ or other things.
      But anyway, here again, thanks Oleksandr for all your work. And so as not to be completely off topic, thanks Con for your MuQSS too. It does a good job here with SMT on an i7.

      Btw: the Zen kernel has BFQ-SQ, BFQ (mainline) and BFQ-MQ (BFQ next). So if you don't use MQ, you can still use BFQ (some time ago there were huge problems with i386 and MQ, also discovered and described well by Oleksandr).

      Regards sysitos.

    13. I recently had problems with algodeb (git) version of BFQ-MQ on I386.
      But Oleksander's patch for mainline BFQ seems fine.

    14. Typo. I meant "Algodev-github/bfq-mq".

  14. About BFQ and PF: you don't need the whole patch (and you can't apply it to an updated kernel from your distro anyway). Just extract the BFQ part and apply it. Oleksander has a well organized patch so you won't miss anything.

  15. A simple patch to make BFQ the default for blk-mq SCSI devices.
    In block/elevator.c, change the default elevator for single-queue devices from mq-deadline to bfq:
     if (q->nr_hw_queues == 1)
    -	e = elevator_get(q, "mq-deadline", false);
    +	e = elevator_get(q, "bfq", false);

  16. His name is Oleksandr.

  17. I have probably found the reason for the default rqshare=mc problem, because my Skylake also refused to boot with it when I forgot to disable
    Processor type and features -> Enable Maximum number of SMP Processors and NUMA Nodes
    (In config it's CONFIG_MAXSMP)
    I also experienced some Wine problems (TeamViewer maybe), and luckily found a comment by Andrew Rodland on the 4.10-ck1 page where he pointed out this incompatibility.
    So Con, please leave a warning about this. It's a major issue, and people who rebuild the Ubuntu kernel with their config will face it.

    Replies
    1. Rebuilt, now it also fails to boot. I'm puzzled.

    2. Any message, hint?

    3. Been using MuQSS in conjunction with Wine for quite some time now and never seen any issues as a result of that specific combination.

      I think it might be specific to your exact hardware configuration more than anything. Wine is too high level to really suffer from any problems. It operates purely in userspace (which is at the very core of why it has problems running DRM, for example). Other than simply performing poorly I'd be hard pressed to imagine any reason why it would ever have actual problems with MuQSS.

      Regarding the performance -- Not seen any degradation as a result of MuQSS. In fact, CFS performs worse in almost all workloads for me.

      Anyhow, the NUMA Node / SMP thing most decidedly is worth investigating. Particularly in relation to the rq-sharing.


    4. About Wine: if I remember correctly, there was an issue with TeamViewer 11. The person on Windows wasn't able to control my PC. Somehow I found a log message and discovered this comment chain (http://ck-hack.blogspot.com/2017/02/linux-410-ck1-muqss-version-0152-for.html?showComment=1492567649041#c102199203888580767 + https://github.com/Hubbitus/kernel/issues/4). After that I disabled MAX_SMP and didn't have any new problems with Wine. Maybe it's already fixed though.

    5. As for booting with RQshare, I believe it has something to do with Zen/Liquorix tweaks (however liquorix kernel boots fine with rqshare=mc)
      My changes:
      ++int sched_interactive __read_mostly = 0;
      ++int sched_yield_type __read_mostly = 0;
      Always worked fine
      After Zen-tune:
      ++int rr_interval __read_mostly = 3;
      ++int sched_iso_cpu __read_mostly = 25;
      it fails to boot with rqshare=mc.
      All this with 1000HZ+Preempt.
      My PC takes a few hours to compile a kernel, so if someone could check this faster I would be grateful.

  18. This comment has been removed by the author.

  19. Hey Con,

    Do you know when you'll be able to rebase this on top of 4.16?

    Cheers,
    Kyle

    Replies
    1. I am also interested, since 4.15 is EOL now.

    2. 4.16 is a fairly big release; lots of changes all over the thing. Including some changes in areas that possibly intersect with MuQSS.

      I think we'll see the same thing with 4.17 too. We're in fairly choppy waters as far as kernel development is concerned. New CPU architectures, older ones being removed, Spectre/Meltdown kicking things up, low level idle loop rewrites, etc, etc.

      It's all very active. Which is good.

    3. I posted a patch some time ago. I've been using it for a while now without any problems.
      I also have a heavily modified 4.16 kernel which is mostly pf-kernel + ck patches from 4.15 and the clear linux intel patches.
      I could also upload this patch if someone is interested.

    4. Here is my complete patch; it applies to 4.16.0:
      https://jki.io/jk.patch.bz2

      list of what is included:
      complete ck patch from 4.15
      latest stable-queue for 4.16
      https://github.com/pfactum/pf-kernel
      https://github.com/clearlinux-pkgs/linux
      lz4 support for btrfs, but this is only maintained by me and breaks mounting with kernels without the patch.
      elsepoll patch for usbhid; this is for 1000 Hz support for every USB input device.

    5. Not entirely sure I trust anyone but Con to properly rebase MuQSS onto 4.16, probably because he understands his code and its interactions in the kernel the most.

      I've tried rebasing it myself, and I couldn't trust that I understood what was safe and what wasn't.

    6. I totally agree.
      I'm not a kernel developer, which is why I added the disclaimer.
      I can only say it's working fine for me.

  20. Hi,
    I modified the 4.15 patch.
    The patch is working on my machines but no guarantee for anyone else.
    https://jki.io/muqss-4.16.patch.bz2

    Replies
    1. Maybe Con can take a quick look? Something along the lines of "it's mostly ok" would be a relief (so I'm not setting my PC on fire).

    2. I never do that, sorry. It takes as long to audit code as it does to port the code myself. Alas I'm busy, so it will still be a while before I can resync (though I haven't stopped, contrary to what some may have claimed).

    3. Yes!
      Thanks.
