It usually doesn't take long to find bugs in a significantly larger new release, and that was the case with lrzip 0.610. Since I'm loath to leave a buggy release lying around, I've released a new version hot on the heels of the old one.
lrzip 0.611 on freecode
lrzcat and lrzuntar were broken in the last release and have now been fixed. lrzuntar also had the nasty habit of overwriting existing directories without warning, so I've modified the code so that it will not overwrite an existing directory without the -f option.
With the report of slowdowns in the last release almost certainly due to incorporating the liblrzip library support, since nothing else had changed, I figured it was time to do some hot spot profiling. So I pulled out oprofile and found where most of the time was spent in the code during the rzip stage. Then I went in and carefully rewrote small areas of the hottest functions as though they were critical code paths in the CPU scheduler, and managed to find a few small speed improvements. Most of the improvements won't be noticeable unless you're running one of the faster compression modes like lzo, or none at all with -n, but it is faster. I also moved the checksum routines (crc32 and md5) into separate threads, as they now use a significant amount of CPU time of their own during the rzip phase, and this should speed things up slightly too.
Then I went to the decompression side of things, did some profiling on the runzip stage, and got a surprise. Most of the time during decompression is simply spent in the md5 update code; with it disabled, the runzip stage was much faster. After arguing with myself for a day, I decided it was still better to keep integrity checking enabled, and to consider adding a fast mode for decompression, since that would be almost lightning quick.
Interestingly, when profiling decompression I was using the test option (-t), which does not write anything to disk, and things changed quite dramatically when I switched to actual decompression to disk. It took four times longer to decompress a 1GB archive to disk than it did to just test decompression in ram. This would all seem obvious if you consider how long it takes to write something to disk, but that was not the explanation at all: virtually none of the data has been written to disk by the time decompression is complete; it is all just sitting in "dirty ram" as writeback. This did not change whether I used a -ck kernel with a low dirty ratio setting or the mainline default of 10.

After some more prodding around I discovered that a simple write() of a 1GB buffer took over 7 seconds on modern hardware. That is only 140MB/s for what is effectively a memory-to-memory copy; it should be 20 times faster than that. I even took the write() function out of the equation by mmap()ing the file and memcpy()ing from the buffer into the mapped ram, and it took the same amount of time, so the slowdown was not in the write() call itself. After speaking to a few people and googling, it appears I'm not alone in this finding, and it may only be happening in recent linux kernels. At this point I gave up trying to find what was causing the slow decompression in lrzip, since it seemed unrelated to the lrzip code, and concentrated on getting this release out. I wonder if this is related to the writeback code that I was actually so looking forward to in 3.2ish. However, others report the problem goes as far back as 2.6.38, whereas it's not there in 2.6.35. I'll wait and see.
Anyway, it may be possible to get lrzip integrated into libarchive, and therefore into a number of package managers and decompression programs that use that library. The GPL license on lrzip may be a sticking point, though, and the authors of lzo and of the original rzip have not responded to queries about the possibility of making the library LGPL, which would make it easier to incorporate into the BSD licensed libarchive. So for now, it's stuck on GPL version 2.