Saturday, 6 November 2010

lrzip-0.5.2 with unlimited size compression windows unconstrained by ram.

After my last blog, all I needed to do was sleep on the "sliding mmap" implementation I had so far to think about how to improve it to make it work in some reasonable time frame instead of 100 to 1000 times slower. Yes, I know lrzip isn't as exciting as BFS to everyone else, but I'm having fun hacking on it.

The hash search in the rzip phase of lrzip moves progressively along the file offset, and the match lengths are up to 65536 bytes long. So I changed my 2 sliding buffers to have the one large one move along with the progressive file offset, and the smaller one to be 65536 bytes long. Well the results were dramatic. Instead of being 100 times slower, my first attempt showed it to be 3 times slower. So I cleaned it up and started benchmarking. Not only was it faster than the earlier implementation, surprisingly, sometimes it's faster to compress with it enabled than not! So it was clear I was onto a winner. I tried a few different memory configuration machines and different file sizes, and I can see that as the amount of redundancy goes up in the original file, and as the file deviates further from the ram size, it slows down more and more, but it's still a usable speed.

Using the classic 10GB virtual image file on an 8GB machine example, here's how it fared with the various options:
Options Size       Compress Decompress
-l      1793312108  5m13s    3m12s
-lM     1413268368  4m18s    2m54s
-lU     1413268368  6m05s    2m54s

Not bad at all really, considering it manages to mmap about 5.5GB and then the other 4.5GB depends on the sliding feature to work.

So I fixed a swag of other things, updated the docs and made all new benchmarks, and decided I couldn't sit on it any longer. At some stage you have to release, so version 0.5.1 is out, with the goodness of unlimited size compression windows, unconstrained by ram.

EDIT: Updated link to 0.5.2

Here's some more fun benchmarks:

These are benchmarks performed on a 2.53Ghz dual core Intel Core2 with 4GB ram using lrzip v0.5.1. Note that it was running with a 32 bit userspace so only 2GB addressing was possible. However the benchmark was run with the -U option allowing the whole file to be treated as one large compression window.

Tarball of 6 consecutive kernel trees, linux-2.6.31 to linux-2.6.36

Compression Size        Percentage Compress Decompress
None        2373713920   100
7z          344088002    14.5       17m26s   1m22s
lrzip -l    223130711    9.4        05m21s   1m01s
lrzip -U    73356070     3.1        08m53s   43s
lrzip -Ul   158851141    6.7        04m31s   35s
lrzip -Uz   62614573     2.6        24m42s   25m30s

Kick arse.

Update:Again I managed to break the Darwin build in 0.5.1. This time it's a simple missing ; at the end of line 536 in main.c . I'll confirm that it works otherwise on Darwin and do a quick bump to a new version. (edit:done)

EDIT: Just for grins, I tried compressing a 50GB freenet datastore on the 8GB ram machine with just the one window as a test. It succeeded with flying colours in about 2 hours. Given the nature of the data that is of very low compressibility due to all being encrypted, and the window being more than 6* the size of the ram.

Compression Ratio: 1.022. Average Compression Speed: 7.266MB/s.
Total time: 01:59:3563.609

It also decompressed fine:
Total time: 01:02:3566.894

Note that it was on an external USB hard drive which isn't the fastest thing to read/write from and reading and writing 50GB to the same drive probably would take an hour itself, so that's why it takes so long. The compression time frame looks good in light of that :)


  1. 2.3GB compressed into 73MB? You gotta be kidding me :-P

  2. The Download Link still shows the old 0.5, not the 0.5.1.

    CU sysitos

  3. @RealNC, yes, this is some serious compression :D It's even better with zpaq :)

    @Anonymous, freshmeat will just link what it considers the latest and at the time you clicked on it, they probably hadn't updated their database since all entries need manual approval.