lrzip 0.612 on freecode
This time the main update is a new zpaq library back end instead of using the ageing zpipe code. There are a number of advantages to using the libzpaq library over the old zpipe code.
First, the old code required a FILE type stream as it was written with STDIO in mind, so it was the only compression back end that required the use of some lesser known but handy, yet (virtually) linux only memory features like fmemopen, open_memstream and friends. These were not portable for osx and others so they were emulated on those platforms through the incredibly clunky use of temporary files on disk. Using the new library has killed off the need for these features making the code more portable.
Second, the code is significantly faster since it is the latest full c++ version of the zpaq code. Unfortunately it also means it takes a LOT longer to compile this part of lrzip now, but that's not a big deal since you usually only compile it once ;)
Third, it supports 3 different compression levels, one of which is higher than the previously supported one in lrzip. As lrzip uses 9 levels of compression, I've mapped the 3 levels to -L 1-3, 4-7 and 8-9 since -L 7 is the default and that provides the "mid level" compression from zpaq.
Finally, the beauty of the zpaq compression algorithm is the reference decoder can decompress any zpaq compressed data of any profile. This means you are able to use the latest version of lrzip with compression -L 9 (max profile), yet it is still backwardly compatible with older 0.6x versions of lrzip, not requiring an updated minor version and file format. The release archive I provide of lrzip-0.612.tar.lrz is self compressed with the new max profile. Even though there is significantly more code than ever in the lrzip release tarball, it has actually shrunk for the first time in a while.
So all that talk is boring and all, so let's throw around some benchmark results which are much more fun.
From the original readme benchmarks, I had compressed the linux 2.6.37 tarball, so I used that again for comparison. Tests were performed on an Intel quad core 3Ghz core 2 duo.
Compression Size Percentage Compress Decompress None 430612480 100 7z 63636839 14.8 2m28s 0m6.6s xz 63291156 14.7 4m02s 0m8.7 lrzip 64561485 14.9 1m12s 0m4.3s lrzip -z 51588423 12.0 2m02s 2m08s lrzip -l 137515997 31.9 0m14s 0m2.7s lrzip -g 86142459 20.0 0m17s 0m3.0s lrzip -b 72103197 16.7 0m21s 0m6.5s bzip2 74060625 17.2 0m48s 0m12.8s gzip 94512561 21.9 0m17s 0m4.0s
As you can see, the improvements in speed of the rzip stage have made all the compression back ends pretty snappy, and most fun of all is that lrzip -z on this workload is even faster on compression than the multithreaded 7z and is significantly smaller. Alas the major disadvantage of zpaq remains that it takes about as long to decompress as it takes to compress. However, with the trend towards more CPU cores as time goes on, one could argue that zpaq compression, as used within lrzip, is getting to a speed where it can be in regular use instead of just research/experimental use, especially when files are small like the lrzip distributed tarball I provide.
I also repeated my old classic 10GB virtual image benchmarks
Compression Size Percentage Compress Time Decompress Time None 10737418240 100.0 gzip 2772899756 25.8 05m47s 2m46s bzip2 2704781700 25.2 16m15s 6m19s xz 2272322208 21.2 50m58s 3m52s 7z 2242897134 20.9 26m36s 5m41s lrzip 1372218189 12.8 10m23s 2m53s lrzip -U 1095735108 10.2 08m44s 2m45s lrzip -l 1831894161 17.1 04m53s 2m37s lrzip -lU 1414959433 13.2 04m48s 2m38s lrzip -zU 1067169419 9.9 39m32s 39m46s
Using "U"nlimited "z"paq options, it is actually faster than xz now. Note that about 30% of this image is blank space but that's a not-uncommon type of virtual image. If it were full of data, the difference would be less. Anyway I think it's fair to say that it's worth watching zpaq in the future. Edit: I've sent Matt Mahoney (zpaq author) the latest benchmarks for lrzip and how it performs on the large file benchmark and he's updated his site: http://mattmahoney.net/dc/text.html I think it's performing pretty well for a general compression utility.