Sunday, 11 March 2012

lrzip-0.611

It usually doesn't take long to find bugs in a new significantly larger release and that was the case with lrzip 0.610. Since I'm loathe to leaving a buggy release lying around, I've released a new version hot on the heels of the old one.

lrzip 0.611 on freecode

http://lrzip.kolivas.org

lrzcat and lrzuntar were broken on the last release and they've now been fixed. lrzuntar also had the nasty habit of overwriting existing directories without warning so I've modified the code so it will not overwrite it without the -f option.

With the report of slowdowns on the last release, almost certainly due to incorporating the liblrzip library support, since nothing else changed, I figured it was time to do some hot spot profiling. So I pulled out oprofile and found where most of the time was spent in the code during the rzip stage. Then I went in and carefully rewrote small areas of the hottest functions as though they were critical code paths in the CPU scheduler and managed to find a few small speed improvements. Most of the improvements won't be noticeable unless you're running one of the faster compression modes like lzo, or none at all with -n, but it is faster. I also moved the checksum routines (crc32 and md5) into separate threads as they now use a significant amount of CPU time of their own during the rzip phase, and this should speed things up slightly too.

Then I went to the decompression side of things and did some profiling on the runzip stage and got a surprise. Most of the time during decompression is simply spent on the md5 update code. If I disabled the md5 update code, it was much faster during the runzip stage. After arguing with myself for a day, I figured it was still better to have integrity checking enabled and consider the addition of a fast mode for decompression since that will actually be almost lightning quick.

Interestingly, when profiling decompression, I was using the test option (-t) which does not write anything to disk, and things changed quite dramatically when I changed to actual decompression to disk. It took four times longer to decompress a 1GB archive to disk than it did to just test decompression in ram. Now this all seems obvious if you consider how long it takes to write something to disk, but that was not the case at all. In fact, virtually none of the data is written to disk by the time decompression is complete; it is just all in "dirty ram" as writeback. This did not change whether I used a -ck kernel with a low dirty ratio setting or the mainline default of 10. After some more prodding around I discovered that doing a simple write() of a 1GB buffer took over 7 seconds on modern hardware. This is only 140MB/s for a copy from memory to memory! It should be 20 times faster than that. I even tried taking the write function out of the equation and doing an mmap on the file and then memcpy from the buffer to the mmaped ram and it took the same amount of time, so the slowdown was not in the write() function call. After speaking to a few people and googling, it appears I'm not alone in this finding and it may be only happening in recent linux kernels. At this point I gave up trying to find what was causing this slow decompression on lrzip since it seemed unrelated to the lrzip code, and concentrated on getting this release out. I wonder if this is related to the writeback code that I was actually so looking forward to in 3.2ish. However others reported the problem continues as far back as 2.6.38 whereas it's not there in 2.6.35. I'll wait and see.

Anyway it may be possible to get lrzip integration into libarchive and therefore a number of package managers and decompression programs that use this library. The GPL license in lrzip may be a sticking point, though, and the authors of lzo and the original rzip have not responded about queries about the possibility of making the library LGPL which would make it easier to incorporate into the BSD licensed libarchive. So for now, it's stuck on GPL version 2.

Enjoy.

Thursday, 8 March 2012

lrzip-0.610

I haven't been idle all this time. In all honesty I've spent a lot of time playing with my bitcoin mining code in cgminer and after a couple of hardware failures with my mining rig I was finally able to turn my attention back to lrzip which has been really fairly quiet in the bug reporting department since the last stable release.

Here is version 0.610 (when freecode updates):
http://freecode.com/projects/long-range-zip

Or directly:
http://lrzip.kolivas.org

The major feature addition in this latest version is largely thanks to the work of Michael Blumenkrantz in the form of a liblrzip library! This means you can now link in lrzip compression and decompression into any other application. There is a range of APIs available to use this capability from the simplest gzip equivalent lrzip_compress and lrzip_decompress functions to ones that expose all the low level features knobs and dials in lrzip. There are also a couple of demo examples of using these functions included in the source code, but in their simplest form they should be very easy to use. I look forward to seeing package managers, archivers and maybe even git and so on having lrzip support in the future ;) Being a large chunk of new code it is not entirely impossible that bugs or issues with using the library may show up.

WHATS-NEW summary:
The new liblrzip library allows you to add lrzip compression and decompression
to other applications with either simple lrzip_compress and lrzip_decompress
functions or fine control over all the options with low level functions.
Faster rzip stage when files are large enough to require the sliding mmap
feature (usually >1/3 of ram) and in unlimited mode.
A bug where multiple files being compressed or decompressed from the one
command line could have gotten corrupted was fixed.
Modification date of the decompressed file is now set to that of the lrzip
archive (support for storing the original file's date would require modifying
the archive format again).
Compilation warning fixes.
Make lrztar work with directories with spaces in their names.


The full changelog follows:
* Implement complete set of liblrzip libraries, documentation and example uses
with support for simple lrzip_compress() and lrzip_decompress() or complete
fine-grained control over all compression and decompression options.
* Use as much of the low buffer as possible with a single memcopy before going
fine grained byte by byte.
* Preserve the compressed time on decompression where suitable.
* Store a copy of the control struct to be reused on subsequent files to prevent
variables being modified in the control struct on the first file that corrupt
compression/decompression of the 2nd file.
* Explicitly select C99 to avoid certain warnings.
* Generic modifications to silence -Wextra warnings.
* Fix typos.
* Use an array of parameters in lrztar to allow working with directories with
spaces in their names.

Enjoy.

Monday, 16 January 2012

3.2-ck1

These are patches designed to improve system responsiveness and interactivity with specific emphasis on the desktop, but suitable to any commodity hardware workload.

This is an upgrade to BFS 416, trivial bugfixes and a resync from 3.1.0-ck1 . I've changed the versioning to not specify a 3 point release since it usually applies to successive 3 point releases for some time.

Apply to 3.2.x:
patch-3.2-ck1.bz2


Broken out tarball:
3.2-ck1-broken-out.tar.bz2

Discrete patches:
patches


BFS by itself:
bfs


Web:
kernel.kolivas.org


Code blog when I feel like it:
ck-hack.blogspot.com


Each discrete patch usually contains a brief description of what it does at
the top of the patch itself.

Full patchlist:

3.2-sched-bfs-416.patch
mm-minimal_swappiness.patch
mm-enable_swaptoken_only_when_swap_full.patch
mm-drop_swap_cache_aggressively.patch
mm-kswapd_inherit_prio-1.patch
mm-background_scan.patch
mm-idleprio_prio-1.patch
mm-lru_cache_add_lru_tail-2.patch
mm-decrease_default_dirty_ratio-1.patch
kconfig-expose_vmsplit_option.patch
hz-default_1000.patch
hz-no_default_250.patch
hz-raise_max.patch
preempt-desktop-tune.patch
ck1-version.patch

Enjoy!
お楽しみください

--ck

P.S. What this really means is I finished playing zelda.

Monday, 9 January 2012

Towards Transparent CPU Scheduling

Of BFS related interest is an excellent thesis by Joseph T. Meehean
entitled "Towards Transparent CPU Scheduling". Of particular note is
the virtually deterministic nature of BFS, especially in fairness and
latency. While this of course interests me greatly because of
extensive testing of the BFS CPU scheduler, there are many aspects of
both the current CFS CPU scheduler and the older O(1) CPU scheduler
that are discussed that anyone working on issues to do with
predictability, scalability, fairness and latency should read.

http://research.cs.wisc.edu/wind/Publications/meehean-thesis11.html


Abstract:

In this thesis we propose using the scientific method to develop a deeper understanding of CPU schedulers; we use this approach to explain and understand the sometimes erratic behavior of CPU schedulers. This approach begins with introducing controlled workloads into commodity operating systems and observing the CPU scheduler's behavior. From these observations we are able to infer the underlying CPU scheduling policy and create models that predict scheduling behavior.
We have made two advances in the area of applying scientific analysis to CPU schedulers. The first, CPU Futures, is a combination of predictive scheduling models embedded into the CPU scheduler and user-space controller that steers applications using feedback from these models. We have developed these predictive models for two different Linux schedulers (CFS and O(1)), based on two different scheduling paradigms (timesharing and proportional-share). Using three different case studies, we demonstrate that applications can use our predictive models to reduce interference from low-importance applications by over 70%, reduce web server starvation by an order of magnitude, and enforce scheduling policies that contradict the CPU scheduler's.
Harmony, our second contribution, is a framework and set of experiments for extracting multiprocessor scheduling policy from commodity operating systems. We used this tool to extract and analyze the policies of three Linux schedulers: O(1), CFS, and BFS. These schedulers often implement strikingly different policies. At the high level, the O(1) scheduler carefully selects processes for migration and strongly values processor affinity. In contrast, CFS continuously searches for a better balance and, as a result, selects processes for migration at random. BFS strongly values fairness and often disregards processor affinity.