The hash search in the rzip phase of lrzip moves progressively along the file offset, and the match lengths are up to 65536 bytes long. So I changed my 2 sliding buffers to have the one large one move along with the progressive file offset, and the smaller one to be 65536 bytes long. Well the results were dramatic. Instead of being 100 times slower, my first attempt showed it to be 3 times slower. So I cleaned it up and started benchmarking. Not only was it faster than the earlier implementation, surprisingly, sometimes it's faster to compress with it enabled than not! So it was clear I was onto a winner. I tried a few different memory configuration machines and different file sizes, and I can see that as the amount of redundancy goes up in the original file, and as the file deviates further from the ram size, it slows down more and more, but it's still a usable speed.
Using the classic 10GB virtual image file on an 8GB machine example, here's how it fared with the various options:
Options Size Compress Decompress -l 1793312108 5m13s 3m12s -lM 1413268368 4m18s 2m54s -lU 1413268368 6m05s 2m54s
Not bad at all really, considering it manages to mmap about 5.5GB and then the other 4.5GB depends on the sliding feature to work.
So I fixed a swag of other things, updated the docs and made all new benchmarks, and decided I couldn't sit on it any longer. At some stage you have to release, so version 0.5.1 is out, with the goodness of unlimited size compression windows, unconstrained by ram.
EDIT: Updated link to 0.5.2
Here's some more fun benchmarks:
These are benchmarks performed on a 2.53Ghz dual core Intel Core2 with 4GB ram using lrzip v0.5.1. Note that it was running with a 32 bit userspace so only 2GB addressing was possible. However the benchmark was run with the -U option allowing the whole file to be treated as one large compression window.
Tarball of 6 consecutive kernel trees, linux-2.6.31 to linux-2.6.36
Compression Size Percentage Compress Decompress None 2373713920 100 7z 344088002 14.5 17m26s 1m22s lrzip -l 223130711 9.4 05m21s 1m01s lrzip -U 73356070 3.1 08m53s 43s lrzip -Ul 158851141 6.7 04m31s 35s lrzip -Uz 62614573 2.6 24m42s 25m30s
Update:Again I managed to break the Darwin build in 0.5.1. This time it's a simple missing ; at the end of line 536 in main.c . I'll confirm that it works otherwise on Darwin and do a quick bump to a new version. (edit:done)
EDIT: Just for grins, I tried compressing a 50GB freenet datastore on the 8GB ram machine with just the one window as a test. It succeeded with flying colours in about 2 hours. Given the nature of the data that is of very low compressibility due to all being encrypted, and the window being more than 6* the size of the ram.
Compression Ratio: 1.022. Average Compression Speed: 7.266MB/s.
Total time: 01:59:3563.609
It also decompressed fine:
Total time: 01:02:3566.894
Note that it was on an external USB hard drive which isn't the fastest thing to read/write from and reading and writing 50GB to the same drive probably would take an hour itself, so that's why it takes so long. The compression time frame looks good in light of that :)