This is mainly a bugfix release for those who had boot failures, TOI patched failures, and warnings. Otherwise it only has minor changes.
http://ck.kolivas.org/patches/4.0/4.8/4.8-ck3/
MuQSS version 0.115 by itself:
4.8-sched-MuQSS_115.patch
Git tree includes branches for MuQSS and -ck:
https://github.com/ckolivas/linux
EDIT: There is a regression in this release as well and you need to either grab the latest 4.8-ck git tree or add the two patches here:
http://ck.kolivas.org/patches/muqss/4.0/4.8/Pending/
Sorry, when enough other problems get fixed I'll release another version pretty soon too.
Enjoy!
お楽しみ下さい
-ck
A development blog of what Con Kolivas is doing with code at the moment with the emphasis on linux kernel, MuQSS, BFS and -ck.
Saturday, 22 October 2016
Friday, 21 October 2016
lrzip version 0.631
Announcing an updated version of lrzip.
Tarballs:
http://ck.kolivas.org/apps/lrzip/
Git tree:
https://github.com/ckolivas/lrzip
This is a minor bugfix release.
- Encryption complexity has been altered to match CPU speed rate rises that have NOT paralleled Moore's law.
- Some of the command line parameters did not work properly in compatibility mode.
- Compressed files did not retain the same date as the original file.
- The -p parameter did not accept arguments and would not work.
Enjoy!
お楽しみ下さい
-ck
linux-4.8-ck2, MuQSS version 0.114
Announcing an updated version, and the first -ck release with MuQSS as the scheduler, officially retiring BFS from further development, in line with the diminished rate of bug reports with MuQSS. It is clear that the little attention BFS had received over the years, apart from rushed synchronisation with mainline, had caused a number of bugs to creep in, and MuQSS is basically a rewritten evolution of the same code so it makes no sense to maintain both.
http://ck.kolivas.org/patches/4.0/4.8/4.8-ck2/
MuQSS version 0.114 by itself:
4.8-sched-MuQSS_114.patch
Git tree includes branches for MuQSS and -ck:
https://github.com/ckolivas/linux
In addition to the most up to date version of MuQSS replacing BFS, this is the first release with BFQ included. It is configurable and is set by default in -ck though it is entirely optional.
The MuQSS changes since 112 are as follows:
- Added cacheline alignment to atomic variables courtesy of Holger Hoffstätte
- Fixed PPC build courtesy of Serge Belyshev.
- Implemented wake lists for separate CPU packages.
- Send hotplug threads to CPUs even if they're not alive yet since they'll be enabling them.
- Build fixes for uniprocessor.
- A substantial revamp of the sub-tick process accounting, decreasing the number of variables used, simplifying the code, and increasing the resolution to nanosecond accounting. Now even tasks that run for less than 100us will not escape visible accounting.
This release should bring slightly better performance, more so on multi-cpu machines, and fairer accounting/latency.
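As a rough illustration of why finer-grained accounting matters, here is a toy model in Python. It is not the MuQSS implementation: the 1 ms tick, the task names and the run intervals are all assumptions chosen for the example. It compares charging a whole tick to whichever task is running when the tick fires against charging each task the exact nanoseconds it ran.

```python
TICK_NS = 1_000_000  # assumed 1 ms tick, purely illustrative


def tick_sampled(runs, total_ns):
    """Charge a whole tick to whichever task is on the CPU when the tick fires."""
    usage = {}
    for tick in range(TICK_NS, total_ns + 1, TICK_NS):
        for name, start, end in runs:
            if start <= tick < end:
                usage[name] = usage.get(name, 0) + TICK_NS
    return usage


def timestamped(runs):
    """Charge each task the exact time between being scheduled in and out."""
    usage = {}
    for name, start, end in runs:
        usage[name] = usage.get(name, 0) + (end - start)
    return usage


# Hypothetical workload: in every 1 ms period "short" runs for 80us and
# always finishes before the tick, while "long" runs across the tick boundary.
runs = []
for period in range(10):
    base = period * TICK_NS
    runs.append(("short", base + 10_000, base + 90_000))
    runs.append(("long", base + 90_000, base + TICK_NS + 10_000))

print("tick sampled:", tick_sampled(runs, 10 * TICK_NS))
print("timestamped :", timestamped(runs))
```

In this model the tick-sampled view never sees the short task at all, while the timestamped view attributes its 80us slices to it.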
Enjoy!
お楽しみ下さい
-ck
Tuesday, 18 October 2016
First MuQSS Throughput Benchmarks
The short version graphical summary:
Red = MuQSS 112 interactive off
Purple = MuQSS 112 interactive on
Blue = CFS
The detail:
http://ck.kolivas.org/patches/muqss/Benchmarks/20161018/
I went on a journey looking for meaningful benchmarks to conduct to assess the scalability aspect as far as I could on my own 12x machine and was really quite depressed to see what the benchmark situation on linux is like. Only the old and completely invalid benchmarks seem to still be hanging around in public sites and promoted, like Reaim, aim7, dbench, volanomark, etc. and none of those are useful scalability benchmarks. Even more depressing was that the only ones with any reputation are commercial benchmarks costing hundreds of dollars.
This made me wonder out loud just how the heck mainline is even doing scalability improvements if there are precious few valid benchmarks for linux and no one's using them. The most promising ones, like mosbench, need multiple machines and quite a bit of set up to get them going.
I spent a day wading through the phoronix test suite - a site and its suite not normally known for meaningful high performance computing discussion and benchmarks - looking for benchmarks that could be used for meaningful results for multicore scalability assessment and were not too difficult to deploy and came up with the following collection:
John The Ripper - a CPU bound application that is threaded to the number of CPUs and intermittently drops to one thread making for slightly more interesting behaviour than just a fully CPU bound workload.
7-Zip Compression - a valid real world CPU bound application that is threaded but rarely able to spread out to all CPUs making it an interesting light load benchmark.
ebizzy - This emulates a heavy content delivery server load which scales beyond the number of CPUs and emulates what goes on between an HTTP server and database.
Timed Linux Kernel Compilation - A perennial favourite because it is a real world case and very easy to reproduce. Despite numerous complaints about its validity as a benchmark, it is surprisingly consistent in its results and tests many facets of scalability, though it does not scale to use all CPUs at all times either.
C-Ray - A ray tracing benchmark that uses massive threading per CPU and is completely CPU bound but overloads all CPUs.
Primesieve - A prime number generator that is threaded to the number of CPUs exactly, is fully CPU bound and is cache intensive.
PostgreSQL pgbench - A meaningful database benchmark that is done at 3 different levels - single threaded, normal loaded and heavily contended, each testing different aspects of scalability.
And here is a set of results comparing 4.8.2 mainline (labelled CFS), MuQSS 112 in interactive mode (MuQSS-int1) and MuQSS 112 in non-interactive mode (MuQSS-int0):
http://ck.kolivas.org/patches/muqss/Benchmarks/20161018/
It's worth noting that there is quite a bit of variance in these benchmarks and some are bordering on the difference being just noise. However there is a clear pattern here - when the load is light, in terms of throughput, CFS outperforms MuQSS. When load is heavy, the heavier it gets, MuQSS outperforms CFS, especially in non-interactive mode. As a friend noted, for the workloads where you wouldn't be running MuQSS in interactive mode, such as a web server, database etc, non-interactive mode is of clear performance benefit. So at least on the hardware I had available to me, on a 12x machine, MuQSS is scaling better than mainline on these workloads as load increases.
The obvious question people will ask is why MuQSS doesn't perform better at light loads, and in fact I have an explanation. The reason is that mainline tends to cling to processes much more, so that if it is hovering at low numbers of active processes, they'll all cluster on one CPU or a few CPUs rather than being spread out everywhere. This means the CPU benefits more from the turbo modes virtually all newer CPUs have, but it comes at a cost. The latency to tasks is greater because they're competing for CPU time on fewer busy CPUs rather than spreading out to idle cores or threads. It is a design decision in MuQSS, as taken from BFS, to always spread out to any idle CPUs if they're available, to minimise latency, and that's one of the reasons for the interactivity and responsiveness of MuQSS. Of course I am still investigating ways of closing that gap further.
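The trade-off between spreading and clustering can be sketched with a toy placement model in Python. This is only an illustration under stated assumptions, not the real logic of either scheduler: the function names and the `max_depth` knob are hypothetical, and real schedulers also weigh cache locality, priorities and topology.

```python
def place_spread(queues):
    """Pick the CPU with the fewest queued tasks, so idle CPUs are used first."""
    return min(range(len(queues)), key=lambda cpu: queues[cpu])


def place_cluster(queues, max_depth):
    """Pick the lowest-numbered CPU still below max_depth, packing tasks together."""
    for cpu, depth in enumerate(queues):
        if depth < max_depth:
            return cpu
    return place_spread(queues)  # every CPU is at the limit: fall back to least loaded


def simulate(place, n_tasks, n_cpus=12):
    """Place n_tasks one by one; return (busy CPUs, deepest queue)."""
    queues = [0] * n_cpus
    for _ in range(n_tasks):
        queues[place(queues)] += 1
    busy = sum(1 for depth in queues if depth)
    return busy, max(queues)


spread = simulate(place_spread, 4)
cluster = simulate(lambda q: place_cluster(q, 2), 4)
print("spread  (busy CPUs, deepest queue):", spread)
print("cluster (busy CPUs, deepest queue):", cluster)
```

With four tasks on twelve CPUs, the spreading policy gives every task its own CPU, while the clustering policy keeps fewer CPUs busy (which would favour turbo modes) at the price of tasks waiting behind one another.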
Hopefully I can get some more benchmarks from someone with even bigger hardware, and preferably with more than one physical package since that's when things really start getting interesting. All in all I'm very pleased with the performance of MuQSS in terms of scalability on these results, especially assuming I'm able to maintain the interactivity of BFS which were my dual goals.
There is MUCH more to benchmarking than pure throughput of CPU - which is almost the only thing these benchmarks are checking - but that's what I'm interested in here. I hope that providing my list of easy to use benchmarks and the reasoning behind them can generate interest in some kind of meaningful standard set of benchmarks. I did start out in kernel development originally after writing and being a benchmarker :P
To aid that, I'll give simple instructions here for how to ~imitate the benchmarks and get results like I've produced above.
Download the phoronix test suite from here:
http://www.phoronix-test-suite.com/
The generic tar.gz is perfectly fine. Then extract it and install the relevant benchmarks like so:
tar xf phoronix-test-suite-6.6.1.tar.gz
cd phoronix-test-suite
./phoronix-test-suite install build-linux-kernel c-ray compress-7zip ebizzy john-the-ripper pgbench primesieve
./phoronix-test-suite default-run build-linux-kernel c-ray compress-7zip ebizzy john-the-ripper pgbench primesieve
Now obviously this is not ideal since you shouldn't run benchmarks on a multiuser login with Xorg and all sorts of other crap running so I actually always run benchmarks at init level 1.
Enjoy!
お楽しみ下さい
-ck
Monday, 17 October 2016
MuQSS - The Multiple Queue Skiplist Scheduler v0.112
Here's an updated version of MuQSS.
For 4.8.*:
4.8-sched-MuQSS_112.patch
For 4.7.*:
4.7-sched-MuQSS_112.patch
Git tree here as 4.7-muqss or 4.8-muqss branches:
https://github.com/ckolivas/linux
It's getting close now to the point where it can replace BFS in -ck releases. Thanks to the many people testing and reporting back, some other misbehaviours were discovered and their associated fixes have been committed.
In particular,
- Balancing across CPUs was not looking at higher and lower scheduling policies correctly (SCHED_ISO, SCHED_IDLEPRIO and realtime policies)
- A serious stall/hang could happen with tasks using sched_yield (such as f@h client and numerous GPU drivers)
- Some minor accounting issues on new tasks with affinity set were fixed
- Overhead was further decreased on task selection
- Spurious preemption on CPUs where the preempted task had already gone is now avoided
- Spurious wakeups on CPUs that were assumed to be idle but no longer are, are now avoided
- A potential race in suspending to ram was fixed
- Old unused code from BFS was removed, along with unnecessary intermediate variables.
- Clean ups
- Some work towards actually documenting MuQSS in Documentation/scheduler/sched-MuQSS.txt was done, though incomplete.
Enjoy!
お楽しみ下さい
-ck
Tuesday, 11 October 2016
MuQSS - The Multiple Queue Skiplist Scheduler v0.111
Lots of bugfixes, lots of improvements, build fixes, you name it.
For 4.8:
4.8-sched-MuQSS_111.patch
For 4.7:
4.7-sched-MuQSS_111.patch
And in a complete departure from BFS, a git tree (which suits constant development like this, unlike BFS's stable release massive ports):
https://github.com/ckolivas/linux
Look in the pending/ directory to see all the patches that went into this or read the git changelog. In particular numerous warnings were fixed, throughput improved compared to 108, SCHED_ISO was rewritten for multiple queues, potential races/crashes were addressed, and build fixes for different configurations were committed.
I haven't been able to track the bizarre latency issues reported by runqlat and when I try to reproduce it myself I get nonsense values of latency greater than the history of the earth so I suspect an interface bug with BPF reporting values. It doesn't seem to affect actual latency in any way.
EDIT: Updated to version 0.111 which has a fix for suspend/resume.
Enjoy!
お楽しみ下さい
-ck
Friday, 7 October 2016
MuQSS - The Multiple Queue Skiplist Scheduler v0.108
A new version of the MuQSS CPU scheduler
Incrementals and full patches available for 4.8 and 4.7 respectively here:
http://ck.kolivas.org/patches/muqss/4.0/4.8/
http://ck.kolivas.org/patches/muqss/4.0/4.7/
Yet more minor bugfixes and some important performance enhancements.
This version brings to the table the same locking scheme for trying to wake tasks up as mainline which is advantageous on process busy workloads and many CPUs. This is important because the main reason for moving to multiple runqueues was to minimise lock contention for the global runqueue lock that is in BFS (as mentioned here numerous times before) and this wake up scheme helps make the most of the multiple discrete runqueue locks.
Note this change is much more significant than the last releases so new instability is a possibility. Please report any problems or stacktraces!
There was a workload when I started out that I used lockstat to debug to get an idea of how much lock contention was going on and how long it lasted. Originally with the first incarnations of MuQSS on a 14 second benchmark with thousands of tasks on a 12x CPU it obtained 3 million locks and had almost 300k contentions with the longest contention lasting 80us. Now the same workload grabs the lock just 5k times with only 18 contentions in total and the longest lasted 1us.
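To put those lockstat figures side by side, here is a small worked calculation in Python. The inputs are the rounded numbers quoted above ("3 million" acquisitions and "almost 300k" contentions are taken as 3,000,000 and 300,000), so the results are approximate.

```python
def contention_rate(acquisitions, contentions):
    """Fraction of lock acquisitions that had to wait for the lock."""
    return contentions / acquisitions


# Rounded figures from the text above; the early numbers are approximations.
early = contention_rate(3_000_000, 300_000)
now = contention_rate(5_000, 18)
fewer_acquisitions = 3_000_000 / 5_000

print(f"early MuQSS: {early:.2%} of acquisitions contended")
print(f"current:     {now:.2%} of acquisitions contended")
print(f"lock taken roughly {fewer_acquisitions:.0f}x less often")
```

So the change is not only that contention per acquisition dropped, but that the lock is taken far less often in the first place.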
This clearly demonstrates that the target endpoint for avoiding lock contention has been achieved. It does not translate into performance improvements on ordinary hardware today because you need ridiculous workloads on many CPUs to even begin deriving advantage from it. However as even our phones now have reached 8 logical CPUs, it will only be a matter of time before 16 threads appear on commodity hardware - a complaint that was directed at BFS when it came out 7 years ago but they still haven't appeared just yet. BFS was shown to be scalable for all workloads up to 16 CPUs, and beyond for certain workloads, but suffered dramatically for others. MuQSS now makes it possible for what was BFS to be useful much further into the future.
Again - MuQSS is aimed primarily at desktop/laptop/mobile device users for the best possible interactivity and responsiveness, and is still very simple in its approach to balancing workloads to CPUs so there are likely to be throughput workloads on mainline that outperform it, though there are almost certainly workloads where the opposite is true.
I've now addressed all planned changes to MuQSS and plan to hopefully only look at bug reports instead of further development from here on for a little while. In my eyes it is now stable enough to replace BFS in the next -ck release barring some unexpected showstopper bug appearing.
EDIT: If you blinked you missed the 107 announcement which was shortly superseded by 108.
EDIT2: Always watch the pending directory for updated pending patches to add.
http://ck.kolivas.org/patches/muqss/4.0/4.8/Pending/
Enjoy!
お楽しみ下さい
-ck
Wednesday, 5 October 2016
MuQSS - The Multiple Queue Skiplist Scheduler v0.106
Another day and time for yet another release.
There are 0.106 versions and incrementals available for linux-4.7:
http://ck.kolivas.org/patches/muqss/4.0/4.7/
and linux-4.8:
http://ck.kolivas.org/patches/muqss/4.0/4.8
Two large remaining races that could lead to warnings, stalls, or in the worst case, crashes, have been fixed in this version.
Additionally the multiple-runqueue locking has been significantly optimised to take only the runqueues needed for as long as they're needed only and dropped as soon as possible which should bring the lock contention levels down even further. This is a performance enhancement, more so in non-interactive mode, though it will only start being demonstrable if you're lucky enough to have many CPUs.
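The general idea of taking only the runqueue locks an operation needs, and holding them only for as long as needed, can be sketched in userspace Python. This is a generic illustration, not the MuQSS code: `RunQueue`, `lock_pair` and `migrate` are hypothetical names, and ordering the locks by CPU number is simply one common way to avoid deadlock when two locks must be held together.

```python
import threading
from contextlib import contextmanager


class RunQueue:
    """A per-CPU queue protected by its own lock (toy model)."""

    def __init__(self, cpu):
        self.cpu = cpu
        self.lock = threading.Lock()
        self.tasks = []


@contextmanager
def lock_pair(a, b):
    """Hold only the two runqueues involved, always locking the lower CPU first."""
    if a is b:
        with a.lock:
            yield
        return
    first, second = sorted((a, b), key=lambda rq: rq.cpu)
    with first.lock:
        with second.lock:
            yield
    # Both locks are released here, as soon as the caller's block finishes.


def migrate(task, src, dst):
    """Move a task between runqueues without touching any other runqueue's lock."""
    with lock_pair(src, dst):
        src.tasks.remove(task)
        dst.tasks.append(task)


rqs = [RunQueue(cpu) for cpu in range(4)]
rqs[0].tasks.append("task-a")
migrate("task-a", rqs[0], rqs[3])
print([rq.tasks for rq in rqs])
```

Runqueues 1 and 2 are never locked during the migration, so work on those CPUs would not contend with it.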
This version addresses all the known bugs and warnings I've received to date so hopefully I can have a little rest and let people out there actually give it a go. What should you expect if you use this instead of BFS? If I've done this correctly, you will notice absolutely no difference since the idea was to preserve the interactivity and responsiveness of BFS and make it scalable to more CPUs than most people can afford.
Keep the feedback coming, thanks.
Enjoy!
お楽しみ下さい
-ck
Tuesday, 4 October 2016
MuQSS - The Multiple Queue Skiplist Scheduler v0.105
I spent the last few days fighting with various lock debugging techniques and the numerous bug reports and am pleased to announce a new version of MuQSS, version 0.105
There are versions and incrementals available for linux-4.7:
http://ck.kolivas.org/patches/muqss/4.0/4.7/
and linux-4.8:
http://ck.kolivas.org/patches/muqss/4.0/4.8
If you've been waiting for me to say it's stable enough to try, then now's your chance for I've addressed all known bugs at this time and it's working well for me.
Most of the issues were to do with races and unstable handling of cross-cpu task movement. No effort went into improving performance from 104 though this version should address many of the crashes and hangs that have been reported with earlier versions.
Additionally there is a pending patch being uploaded for BFS512 which, as per usual, had some last minute issues that only just showed up. If enough users complain loudly enough or more issues show up I might just release another bfs and -ck since it should be stable, especially being one of the last BFS releases.
http://ck.kolivas.org/patches/bfs/4.0/4.8/Pending/
Keep the feedback and bug reports coming. Next I need to put more care into the non-interactive mode of muqss for your enjoyment.
Enjoy!
お楽しみ下さい
-ck
Monday, 3 October 2016
BFS version 0.512, linux-4.8-ck1, MuQSS for linux-4.8
This is to announce an updated version of BFS for the new stable linux kernel 4.8.
BFS by itself:
4.8-sched-bfs-512.patch
-ck patches with BFS:
4.8-ck1
EDIT: Here's a bugfix post release for the above kernels that I highly recommend you include:
http://ck.kolivas.org/patches/bfs/4.0/4.8/Pending/bfs512-fixes.patch
Following on from the aggressive development towards a new scheduler, this BFS incorporates a number of fixes and performance improvements discovered while working on the Multiple Queue Skiplist Scheduler, MuQSS (pronounced mux) and should be the best performing BFS yet.
Note that this may be the last BFS based -ck release as MuQSS is designed to replace it, being the logical evolution of the same scheduler into a more scalable discrete runqueue design.
For those willing to try it in its current version, an incremental patch can be applied to BFS 512:
bfs512-muqss104.patch
or there is a full patch against 4.8:
4.8-sched-MuQSS_104.patch
Again, MuQSS is still immature code and while I have been running it stably for a few days now, and have spent a lot of time debugging locking issues and stability, it is not intended for production use just yet. Having said that, all testing is most welcome, especially benchmarks and stacktraces if you get any crashes.
I've been asked numerous times why I decided to change the name. There are two major reasons. The first is that it signifies just what a dramatic overhaul to the codebase it is, where it is virtually a new scheduler, even though it uses the same scheduling decision policy as BFS. The second is that I've had many people approach me saying they would like to use BFS for their own production environment but alas the offensive name is a showstopper for them. Additionally, I had to choose a name that wasn't already being used by anything else, whereas both BFS and brainfuck had been used before.
Enjoy!
お楽しみ下さい
-ck
Saturday, 1 October 2016
MuQSS - The Multiple Queue Skiplist Scheduler v0.105
Announcing a multiple runqueue variant of BFS, with the more mundane name of MuQSS (pronounced mux) for linux 4.7:
Full patch for linux-4.7
4.7-sched-MuQSS_105.patch
Keep watching this blog for newer versions!
Incremental to patch bfs502 to MuQSS 0.1:
bfs502-MuQSS_103.patch
It was inevitable that one day I would find myself tackling the two major scalability limitations in BFS, and this is the result. These two issues were:
- The single runqueue which means all CPUs would fight for lock contention over the one runqueue, and
- The O(n) look up which means linear increase in overhead for task lookups as number of processes increases.
Until now I had neither the energy nor the time to find a solution to the first issue that preserved BFS' scheduling decision algorithm. The single runqueue is actually the reason latency remains bounded and deterministic on BFS, capitalising on more CPUs instead of fighting against them for scalability.
This scheduler variant is an evolution of BFS, which hopefully will be mature enough to replace BFS one day when stability is assured. It is able to still use the same scheduling algorithm as BFS meaning latency and responsiveness remains as good as always, but with the per-CPU runqueue and discrete locking, it also means it will scale to any number of CPUs, as the mainline scheduler does.
It does NOT guarantee the best possible throughput, as there is still virtually no complex balancing mechanism whatsoever. Tasks are selected primarily according to deadline, and only CPU cache distances are used to determine which idle CPU to go to or, in non-interactive mode, which overloaded CPU to pull from to fill an idle CPU.
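To make the cache-distance idea concrete, here is a minimal Python sketch, not the kernel code. The function names, the distance table and the definition of "overloaded" (more than one queued task) are assumptions made purely for illustration.

```python
def pick_idle_cpu(prev_cpu, idle_cpus, distance):
    """Return the idle CPU closest (by cache distance) to prev_cpu, or None."""
    if not idle_cpus:
        return None
    # Ties are broken by CPU number so the result is deterministic.
    return min(idle_cpus, key=lambda cpu: (distance[prev_cpu][cpu], cpu))


def pick_cpu_to_pull_from(idle_cpu, queue_lengths, distance):
    """Non-interactive mode sketch: pull from the nearest overloaded CPU.

    'Overloaded' is assumed here to mean more than one queued task.
    """
    overloaded = [cpu for cpu, n in queue_lengths.items() if n > 1]
    if not overloaded:
        return None
    return min(overloaded, key=lambda cpu: (distance[idle_cpu][cpu], cpu))


# Hypothetical 4-CPU machine: (0,1) and (2,3) are SMT sibling pairs.
# 0 = same CPU, 1 = SMT sibling, 2 = shared cache on another core.
DISTANCE = [
    [0, 1, 2, 2],
    [1, 0, 2, 2],
    [2, 2, 0, 1],
    [2, 2, 1, 0],
]
```

With this table, a task that last ran on CPU 0 prefers idle CPU 1 (its sibling) over CPU 3, and an idle CPU 3 pulls from CPU 2 before CPU 0 when both have a backlog.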
It would be possible, with a lot of effort, to wedge the entire balancing algorithm for scalability from mainline into this, though it will probably offset the deterministic latency that makes it special.
This is a massive rewrite and consequently there are bound to still be race conditions and hidden bugs though I have been running it for a while now with reasonable stability. I'm putting this out there for the braver people to test. There's a lot more to document about it but for now let's just say, give it a try.
Please don't use any lock debugging as it will light up every possible complaint for the time being!
Regarding 4.8, for the time being I will still be releasing BFS for it and incorporating it into -ck.
EDIT: Updated to version 0.105 with significant bugfixes.
Enjoy!
お楽しみ下さい
-ck
Friday, 23 September 2016
BFS 502, linux-4.7-ck5
With the fix for the last of the freezes with BFS497 becoming clearer and a number of other minor issues being attended to, such as build failures and minor improvements accumulating, I'm releasing a new BFS that combines all into yet another release, which should be the last of the releases for the 4.7 kernel.
BFS by itself:
4.7-sched-bfs-502.patch
-ck patches with BFS:
4.7-ck5
In addition to the update to BFS, this -ck release is the first in a very long time to include a patch from another developer - the Throttled background buffered writeback v7 patch by Jens Axboe. This makes a massive difference to a system's ability to read files, open new applications etc. under heavy write loads in my testing and is a change which I believe is essential and will eventually make its way into the mainline kernel.
The changes to BFS 502 are as follows:
bfs497-build_other_arches.patch bfs497-no_smtload_avg.patch bfs497-recognise_nodes2.patch bfs497-revert-othercpufreq.patch bfs497-fix_smt_nonice.patch
- A build fix for building on other architectures (notably ARM).
- Simplifying the load measurement on SMT machines reported to cpufreq - trying to account for load on the SMT sibling is unnecessary as each core will run at the speed of the most loaded sibling anyway on any existing hardware.
- A fix for detecting CPUs on other NUMA nodes and setting their locality correctly.
- Not trying to signal CPU load to cpufreq on other CPUs when tasks migrate - this was leading to the hangs and there is enough rescheduling for cpufreq to get the load later on.
- A build fix for when SMT_NICE is not configured.
Enjoy!
お楽しみ下さい
-ck
Tuesday, 13 September 2016
BFS 497, linux-4.7-ck4
For the first time in a very long time, I'm announcing yet another -ck release up to ck4 along with yet more substantial updates for BFS for linux-4.7 based kernels.
BFS by itself:
4.7-sched-bfs-497.patch
-ck branded linux-4.7-ck4 patches:
linux-4.7-ck4
Thanks(?) to the massive changes to the mainline kernel I'd been forced to rewrite significant components of BFS to work properly with them, specifically the cpu frequency governors. At the same time I've had quite a bit of energy and enthusiasm for working on BFS in a way I haven't had in a long time. As a result, this updated version not only addresses the remaining cgroup stub patch bug (mentioned on the previous announcement) but implements further improvements and clean ups to go with those improvements.
Alas I still have no explanation for the random lockups some people are seeing, but I have seen reports of it happening on mainline kernels as well now, so while I'm always suspicious of my own code, there is also the chance that BFS exacerbates an issue in mainline. Something that appears common is onboard Intel graphics with the Haswell chipset.
Additionally I had reports of people being unable to suspend with BFS from 4.7 but I haven't heard back from them on later versions.
The short summary of improvements in this version is lower overhead, higher throughput and lower latencies.
I've rewritten the skiplist implementation to not require a malloc/free on insertion/removal of a new node which seemed to noticeably improve throughput at high loads.
Now that CPU frequency governors know what the scheduler is doing, the old BFS approach of knowing what the governor was doing and working around it is no longer helpful. I've removed the whole sticky task and offset for throttled CPUs, and throughput has actually improved as a result.
I've also added some micro-optimisations and cleanups.
I've added a minor change for offlining CPUs to prevent tasks trying to schedule to them.
The set of patches in ck4 is the largest in the ck patchset since the early 2.6 patchset days. I've also included the patch from Alfred (thanks!) to fix the warning that happens with suspend which is mostly harmless.
Each patch included has a mini changelog at the top.
I'm also keen to get feedback from people on if they see any noticeable interactive/responsiveness regressions by disabling the interactive flag as follows:
echo 0 > /proc/sys/kernel/interactive
Enjoy!
お楽しみ下さい
-ck
Wednesday, 7 September 2016
BFS 490, linux-4.7-ck3
Announcing yet another substantial update for BFS for linux-4.7 based kernels.
BFS by itself:
4.7-sched-bfs-490.patch
-ck branded linux-4.7-ck3 patches:
linux-4.7-ck3
Following on from the large update to BFS in 480 to skip lists, numerous regressions became apparent, the bulk of which were related to doing a poor job of signalling CPU load to the various CPU frequency governors. Some were affected badly, others not so, but there were plenty of helpful people giving feedback about those regressions which encouraged me to slowly but surely chip away at the problems. Additionally, there were some minor behavioural regressions which were oversights during the updates to BFS 480. Finally the rudimentary cgroup stub patch would crash the system.
As the number of patches required to address these issues got larger and larger, it became hard for people on this blog to keep up with the changes so I've released 490 which hopefully should address the bulk of these issues - there are patches in there that haven't been posted on this blog, but I've included all of them with a brief description in the incremental/ directory for your perusal.
Anyway it is much easier for people to grab the latest version which includes all of those changes, including the updated cgroups stub patch.
EDIT: Here's a patch to make cgroup stubs safer cgroup-stubs-safe2.patch
Enjoy!
お楽しみ下さい
-ck
Sunday, 4 September 2016
Cgroup stubs for BFS
In addition to some minor pending changes for BFS 480 here:
Pending
I've implemented basic stubs for the CPU controller cgroups feature. This patch is experimental and does NOT implement the actual controller groups features, it simply creates a compatibility layer for the cgroup filesystem. The point of this is to allow environments and applications that refuse to work (such as docker), or work improperly without them, to work. While the actual control of CPU resources won't happen, there's a good chance that it won't make any difference whatsoever since their actual use on a desktop filesystem is serious overkill and worsens throughput. I don't have any plans to implement the actual CPU groups features. This is only lightly tested but already I've noticed that a laptop that would always take ages to shutdown with BFS is now much happier, so who knows it might help machines that refuse to suspend for the same reason:
bfs480-cgroup-stubs.patch
As an aside I did an experiment today with BFS480 on a machine with 12 logical cores and 64GB ram and ran a make -j allyesconfig at sched idleprio to see how it would scale and control the many idleprio tasks along with SMT nice. The machine was fine right up until it ran out of ram and then stalled while it evicted whatever it could and then continued. The load peaked and stayed around 7200 for 10 minutes. While the load was 7200, existing applications continued to work fine and browsing on firefox was virtually normal, but starting new applications took forever thanks to seriously delayed I/O.
EDIT: More testing shows this patch is UNSTABLE and can crash so just see it as proof-of-concept for now
Enjoy!
お楽しみ下さい
-ck
Friday, 2 September 2016
BFS 480 with skip lists, linux-4.7-ck2
Announcing a major update for BFS for linux-4.7 based kernels.
BFS by itself:
4.7-sched-bfs-480.patch
-ck branded linux-4.7-ck2 patches:
linux-4.7-ck2
This is the largest BFS update in a long time. The various problems that had been accumulating forced me to spend a more extended period fixing BFS to work with the latest mainline changes and encouraged me to overhaul some areas that had long been needing it.
The changes are:
- Fixed the crash when SMT NICE is configured in on a CPU without SMT.
- Added my skiplist implementation.
- Converted BFS from its long-standing O(n) lookup to use skiplists.
- Fix crash when SMT NICE is enabled on some hardware
- Fix try_preempt missing the locality diff effect in non-interactive mode
- Ignore busy threads/caches when still on the same core
- Reworked the testing of idle threads and cores for less overhead and to correctly identify idle siblings
- Fix the CPU load that's passed to the cpu frequency governor, fixing a crash and non-working schedutil governor.
I decided to actually incorporate the skiplists I had experimented with a long time ago because I was able to trim the skiplist overhead further and maintain identical semantics for process selection (maintaining interactivity), whereas I never completed the work in the previous experiment. Throughput testing shows virtually identical performance on normal workloads, and the change should theoretically help in extreme overload cases.
The original post regarding skip lists was here:
bfs-and-skip-lists.html
This now means that BFS is no longer O(n) lookup after O(1) insertion. It is now O(log(n)) insertion, O(1) lookup and O(k) removal where k <= 16, thereby tackling a long-standing criticism of the overall design.
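The complexity figures above can be illustrated with a small Python sketch of a skiplist whose levels are doubly linked. This is an illustration under stated assumptions, not the BFS implementation: the class and method names are hypothetical, the level count of 16 is taken from the k <= 16 bound quoted above, and the 50% promotion probability is assumed.

```python
import random

MAX_LEVELS = 16  # mirrors the "k <= 16" bound quoted above


class Node:
    def __init__(self, key, value, levels):
        self.key = key
        self.value = value
        self.next = [None] * levels
        self.prev = [None] * levels


class SkipList:
    def __init__(self, seed=None):
        self.head = Node(None, None, MAX_LEVELS)  # sentinel, never removed
        self.rng = random.Random(seed)

    def _random_levels(self):
        levels = 1
        while levels < MAX_LEVELS and self.rng.random() < 0.5:
            levels += 1
        return levels

    def insert(self, key, value):
        """Expected O(log n): descend from the top level to find the slot."""
        node = Node(key, value, self._random_levels())
        cur = self.head
        for lvl in range(MAX_LEVELS - 1, -1, -1):
            # '<=' keeps equal keys in insertion (FIFO) order.
            while cur.next[lvl] is not None and cur.next[lvl].key <= key:
                cur = cur.next[lvl]
            if lvl < len(node.next):
                nxt = cur.next[lvl]
                node.prev[lvl], node.next[lvl] = cur, nxt
                cur.next[lvl] = node
                if nxt is not None:
                    nxt.prev[lvl] = node
        return node  # the caller keeps this handle for later removal

    def first(self):
        """O(1): the lowest key always sits right after the sentinel."""
        return self.head.next[0]

    def remove(self, node):
        """O(k), k <= MAX_LEVELS: unlink the node on each level it occupies."""
        for lvl in range(len(node.next)):
            prev, nxt = node.prev[lvl], node.next[lvl]
            prev.next[lvl] = nxt
            if nxt is not None:
                nxt.prev[lvl] = prev
```

If keys are deadlines, `first()` returns the earliest-deadline entry without any searching, and `remove()` needs no search either because the caller holds the node returned by `insert()`.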
I did not find a specific cause for people's inability to suspend to RAM, so I doubt this has been fixed despite the large code update.
The list of patches making up bfs480 is as follows:
bfs472-fix_set_task_cpu.patch skiplists.patch bfs472-skiplist.patch bfs-delay-smt-siblings.patch bfs-fix-noninteractive-try-preempt.patch bfs-ignore_local_busy.patch bfs-rework-idles.patch bfs-fix-schedutil.patch bfs-v480.patch
As always I'm giving this to you not long after I've finished coding it so all the usual warnings apply, especially with an update of this size.
EDIT: Uniprocessor build fix: bfs480-fix-upbuild.patch
EDIT2: Here is a test patch to try and improve cpufreq behaviour: bfs480-rework_cpufreq.patch
Enjoy!
お楽しみ下さい
-ck
Friday, 29 July 2016
BFS 472, linux-4.7-ck1
Announcing an updated BFS for linux-4.7 based kernels.
BFS by itself:
4.7-sched-bfs-472.patch
-ck branded linux-4.7-ck1 patches:
linux-4.7-ck1
This was quite a substantial merge effort this time around with a fair amount of changes in mainline kernel that affected the patch. Nonetheless everything appears to be working as planned in my limited testing. I'm unsure if the changes will fix the problems people had with suspend during the 4.6-bfs patches but the new code does touch that area. I was never affected on any of my machines so was unable to reproduce the problem in the first place.
In addition to the resync, a few minor changes have made their way into this release with respect to the way tasks preempt other tasks. See bfs470-updates.patch for details.
One other fairly significant change was properly hooking into the new schedutil parameters that drive cpufreq scaling governors. What I committed into bfs470 would not have been working properly in choosing the correct CPU frequency to run at and may have led to slowdowns and/or more power usage. This should be fixed in 472.
I should also mention that if, like me, you use the evil proprietary nvidia driver, the latest will not build with the current kernel and you'll need a couple of patches to get it working.
Enjoy!
お楽しみ下さい
-ck
EDIT: This patch will fix crashes when configured without SMT_NICE enabled:
bfs472-fix_set_task_cpu.patch
And will be applied to the next BFS release.
Saturday, 11 June 2016
lrzip 0.630
It's been a long time since I've updated lrzip as version 0.621 was very stable. Having had a long time for many people to test it in lots of environments has allowed a few rare bugs to be shaken out, and a few issues showed up on different hardware/OS, so an update was finally required. In addition it gave me an opportunity to implement a feature many had requested - gzip command line support. This is now done when lrzip is called via the command line 'lrz' instead of its full name lrzip.
Get it here:
lrzip.kolivas.org
Git repository:
https://github.com/ckolivas/lrzip
The short short feature changelog:
- gzip command line support when lrzip is called with the name lrz instead of lrzip
- recursive directory compression support with -r (this was required to meet gzip compatibility)
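Switching behaviour based on the name a program was invoked under is a common pattern, and a minimal Python sketch of the idea looks like the following. This is not lrzip's code; the function name is hypothetical and the real argument handling is more involved.

```python
import os


def is_gzip_compat(argv0):
    """Return True when the program was invoked under the short name 'lrz'."""
    return os.path.basename(argv0) == "lrz"
```

A program would typically pass `sys.argv[0]` to such a check, so that a symbolic link named `lrz` pointing at the main binary selects the gzip-style option parsing.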
The short short bug changelog:
- Reports of being unable to malloc ram and failing should all be fixed now.
- Inability to decompress very large (multiple chunk) encrypted archives is now fixed. Fortunately the issue was on decompression, not compression, so if you have generated files that meet these criteria they are safe.
Full changelog:
* checksum.buf should only be changed after the semaphore wait
* Update README
* Add documentation for recursive mode
* Implement gzip compatible -r recursive option
* Add initial argument processing for recursive option
* Tidy
* Add one more verbose for compat mode
* Add support for various combinations in compat mode
* models is array of chars. char's signess is implementation specific. It's unsigned on ARMv7. Unsigned char cannot represent negative values. GCC 6 complains about it:
* Fix decompression of multiple chunk encrypted archives
* Tidy gotos
* Show correct lengths during testing on big endian and compressed archives
* Update copyright dates
* Allow less than maxram to be malloced for checksum to fix Failed to malloc ckbuf in hash_search2
* Base temporary output buffer on maximum mallocable, not maxram
* Enable subdir objects for future automake compatibility
* Add support for -m option in lrztar
* Big endian fix for Solaris Sparc courtesy of joelfredrikson.
* Fixed typographical error, changed accomodate to accommodate in README.
* A whitespace fix on lrztar.
* Add sanity check to prevent trying to malloc more ram than a system/environment is capable of
* Cosmetic help change for compat
* Add rudimentary manpage for lrz
* Fix lrz symbolic linkage
* Do not fail if we are unable to write temporary files, giving a warning only that it might fail if we don't have enough ram
* Try /tmp/ if none of the temporary environment directories or the current directory are writeable
* Set STDOUT correctly in compat mode
* Style police
* Fix false warning on decompressing from stdin without keep files
* Fix false warning on compressing from stdin without keep files
* Don't show extra message in compat mode decompress
* Show correct appname when called in compat mode
* Add support for progress, fast and best flags in compat mode
* Add compatibility mode with gzip when called as lrz
* Correct adding slash to control->tmpdir. off-by-one error.
* Update manpage for long options
Enjoy!
お楽しみ下さい
-ck
Wednesday, 8 June 2016
BFS 470, linux-4.6-ck1
Announcing an updated BFS for linux-4.6 based kernels.
BFS by itself:
4.6-sched-bfs-470.patch
-ck branded linux-4.6-ck1 patches:
linux-4.6-ck1
Resync to 4.6. You know the drill.
Enjoy!
お楽しみ下さい
-ck
Friday, 25 March 2016
BFS 469, linux-4.4-ck1, linux-4.5-ck1
Announcing an updated BFS for linux-4.4 and 4.5 based kernels.
BFS by itself:
4.5-sched-bfs-469.patch
-ck branded linux-4.5-ck1 patches:
linux-4.5-ck1
This is purely a resync of BFS 467 from 4.3-ck3 to the current kernels. The only change is extra documentation of the interactive tunable in the scheduler documentation, and a build warning fix for uniprocessor builds.
While linux-4.5 is the latest kernel, as I had been slow in syncing up and missed 4.4, and given that 4.4 is deemed a Long Term Stable release, I've provided resyncs with both. Version number differences of 467/469 are only due to syncing with different kernels and otherwise they are only trivially different.
The patches are fairly new without a great deal of testing, so the usual warnings apply, but given how long it took me to get around to catching up, I didn't want to delay releasing them.
Enjoy!
お楽しみ下さい
-ck