First it feeds data into the rzip buffer, which preprocesses data until it has enough to pass on to a compression thread. Then it hands the data to a compression thread and carries on reading, letting rzip work on the new data while the compression thread does the 2nd phase compression in the background. Once rzip has enough data for another thread, it spawns another one, and so on until there are as many threads as CPUs. After that it keeps reading until the first thread is free, reuses it, and so on.
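To make the flow above concrete, here is a rough sketch in Python. It is not lrzip's actual code (which is C): `preprocess`, `backend_compress` and `compress_stream` are made-up names, the rzip stage is a do-nothing placeholder, and zlib stands in for the real backends. It only shows the hand-off pattern: preprocess a block, pass it to a worker, keep reading, and once there are as many workers busy as CPUs, wait for the oldest one before handing out more.

```python
import os
import zlib
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def preprocess(block: bytes) -> bytes:
    """Placeholder for the rzip stage (hypothetical; the real stage does much more)."""
    return block


def backend_compress(block: bytes) -> bytes:
    """Placeholder for the 2nd phase backend; zlib is used purely for illustration."""
    return zlib.compress(block)


def compress_stream(read_block, write_block, max_threads=None):
    """Read blocks, preprocess them, and compress them on up to max_threads workers.

    read_block() returns the next chunk of bytes (empty bytes at end of input).
    write_block(data) receives compressed blocks in their original order.
    """
    max_threads = max_threads or os.cpu_count() or 1
    pending = deque()  # futures for blocks currently being compressed, oldest first

    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        while True:
            raw = read_block()
            if not raw:
                break
            # The preprocessing stage runs here while workers compress in the background.
            prepared = preprocess(raw)
            if len(pending) == max_threads:
                # Every worker is busy: wait for the first one to finish, then reuse it.
                write_block(pending.popleft().result())
            pending.append(pool.submit(backend_compress, prepared))

        # Input is exhausted: collect whatever is still in flight, in order.
        while pending:
            write_block(pending.popleft().result())
```

Calling it with something like `compress_stream(lambda: f.read(1 << 20), out.append)` on an open binary file `f` fills the list `out` with one compressed block per chunk read. Writing the blocks out in order keeps the output deterministic no matter which worker finishes first.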
Well, the results of this are about as good as I could have hoped for. The fast lzo backend only gains a small speedup, but the slower the backend, the bigger the speedup. It becomes impressive once zpaq is in use, where I was able to get a 4x speedup on a quad core. That makes lrzip with zpaq almost as fast as regular xz! However, since zpaq takes just as long to decompress as it does to compress, and I haven't threaded the decompression phase, it ends up taking roughly 4x longer to decompress than it did to compress (grin). So zpaq isn't -quite- at the usable stage just yet, but it may well be in the near future.
So, what everyone has been waiting for (all 3 of you): benchmarks! A 10GB virtual image file, compressed on a 3GHz quad core from an SSD.
Compression  Size         Compress  Decompress
None         10737418240
gzip         2772899756   05m47s    2m46s
bzip2        2704781700   16m15s    6m19s
xz           2272322208   50m58s    3m52s
7z           2242897134   26m36s    5m41s
lrzip        1299228155   16m12s    4m32s
lrzip -M     1079682231   12m03s    4m05s
lrzip -l     1754694010   05m30s    3m12s
lrzip -lM    1414958844   05m15s    2m57s
lrzip -zM    1066902006   71m20s    04h08m
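If you'd rather compare these as ratios than raw byte counts, a few lines of Python will do it. The sizes are copied straight from the table above; the `ratio` helper is just a made-up name for this example.

```python
ORIGINAL = 10737418240  # size of the uncompressed image in bytes, from the table

# Compressed sizes in bytes, copied from the benchmark table.
sizes = {
    "gzip": 2772899756,
    "bzip2": 2704781700,
    "xz": 2272322208,
    "7z": 2242897134,
    "lrzip": 1299228155,
    "lrzip -M": 1079682231,
    "lrzip -l": 1754694010,
    "lrzip -lM": 1414958844,
    "lrzip -zM": 1066902006,
}


def ratio(compressed: int, original: int = ORIGINAL) -> float:
    """Compressed size as a fraction of the original size (hypothetical helper)."""
    return compressed / original


# Print the table sorted from smallest to largest output.
for name, size in sorted(sizes.items(), key=lambda item: item[1]):
    print(f"{name:10s} {ratio(size):6.1%}")
```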
Get it here, and remember freshmeat may not have updated their download links yet:
Next stop: parallelising the decompression phase. I doubt anything but zpaq will really benefit from this, but it would be great to have a zpaq-based compression format that is usably fast.