It usually doesn't take long to find bugs in a significantly larger new release, and that was the case with lrzip 0.610. Since I'm loath to leave a buggy release lying around, I've released a new version hot on the heels of the old one.
lrzip 0.611 on freecode
http://lrzip.kolivas.org
lrzcat and lrzuntar were broken in the last release and have now been fixed. lrzuntar also had the nasty habit of overwriting existing directories without warning, so I've modified the code so it will not overwrite them without the -f option.
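For the curious, the guard amounts to something like this minimal sketch (a hypothetical helper, not the actual lrzuntar code): check whether the output directory already exists and refuse unless force was requested.

#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical sketch, not the actual lrzuntar code: refuse to extract
 * over an existing directory unless the user passed -f. */
static int check_output_dir(const char *path, int force)
{
	struct stat st;

	if (stat(path, &st) == 0 && S_ISDIR(st.st_mode) && !force) {
		fprintf(stderr, "Output directory %s exists, not overwriting without -f\n",
			path);
		return -1;
	}
	return 0;
}

int main(int argc, char **argv)
{
	/* Crude stand-in for option parsing: any second argument means -f */
	int force = (argc > 2);

	if (argc < 2) {
		fprintf(stderr, "usage: %s <dir> [-f]\n", argv[0]);
		return 1;
	}
	return check_output_dir(argv[1], force) ? 1 : 0;
}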
With the reports of slowdowns in the last release almost certainly due to incorporating the liblrzip library support, since nothing else changed, I figured it was time to do some hot spot profiling. So I pulled out oprofile and found where most of the time was spent in the code during the rzip stage. Then I went in and carefully rewrote small areas of the hottest functions as though they were critical code paths in the CPU scheduler, and managed to find a few small speed improvements. Most of the improvements won't be noticeable unless you're running one of the faster compression modes like lzo, or none at all with -n, but it is faster. I also moved the checksum routines (crc32 and md5) into separate threads, as they now use a significant amount of CPU time of their own during the rzip phase, and this should speed things up slightly too.
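To illustrate the idea of off-loading checksumming, here is a minimal sketch, not the actual lrzip code, using pthreads and zlib's crc32(): the checksum of a chunk is computed in a worker thread while the main thread would carry on with the rzip stage. The 64MB buffer size is just an arbitrary choice for the example.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <zlib.h>

/* Hypothetical sketch of off-loading checksumming to its own thread,
 * in the spirit of what lrzip 0.611 does; not the actual lrzip code. */
struct cksum_job {
	const unsigned char *buf;
	size_t len;
	unsigned long crc;	/* result */
};

static void *cksum_thread(void *data)
{
	struct cksum_job *job = data;

	/* zlib's crc32: seed with crc32(0, Z_NULL, 0), then update */
	job->crc = crc32(0L, Z_NULL, 0);
	job->crc = crc32(job->crc, job->buf, job->len);
	return NULL;
}

int main(void)
{
	size_t len = 64 * 1024 * 1024;	/* arbitrary 64MB test buffer */
	unsigned char *buf = malloc(len);
	struct cksum_job job = { buf, len, 0 };
	pthread_t thr;

	if (!buf)
		return 1;
	memset(buf, 0xAB, len);

	pthread_create(&thr, NULL, cksum_thread, &job);
	/* ... the main thread would continue with the rzip stage here ... */
	pthread_join(&thr, NULL);

	printf("crc32: %08lx\n", job.crc);
	free(buf);
	return 0;
}

Build with -lz -lpthread.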
Then I went over to the decompression side of things, did some profiling on the runzip stage, and got a surprise. Most of the time during decompression is simply spent in the md5 update code; if I disabled it, the runzip stage was much faster. After arguing with myself for a day, I decided it was still better to keep integrity checking enabled and to consider adding a fast mode for decompression, since that would be almost lightning quick.
Interestingly, when profiling decompression I was using the test option (-t), which does not write anything to disk, and things changed quite dramatically when I switched to actual decompression to disk. It took four times longer to decompress a 1GB archive to disk than it did to just test decompression in ram. That would seem obvious if you assume the time goes into physically writing to disk, but that was not the case at all: virtually none of the data has been written to disk by the time decompression is complete; it is all just sitting in "dirty ram" as writeback. This did not change whether I used a -ck kernel with a low dirty ratio setting or the mainline default of 10.

After some more prodding around I discovered that doing a simple write() of a 1GB buffer took over 7 seconds on modern hardware. That is only 140MB/s for what is effectively a copy from memory to memory; it should be 20 times faster than that. I even took the write function out of the equation by doing an mmap on the file and then a memcpy from the buffer to the mmapped ram, and it took the same amount of time, so the slowdown was not in the write() function call itself (a rough reconstruction of this test is sketched below).

After speaking to a few people and googling, it appears I'm not alone in this finding, and it may only be happening in recent linux kernels. At this point I gave up trying to find what was causing the slow decompression in lrzip, since it seemed unrelated to the lrzip code, and concentrated on getting this release out. I wonder if this is related to the writeback work I was so looking forward to in 3.2-ish. However, others have reported the problem as far back as 2.6.38, whereas it's not there in 2.6.35. I'll wait and see.
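For anyone who wants to reproduce the measurement, here is a minimal sketch of the experiment, not lrzip code: it times one large write() and then an mmap()+memcpy() of the same buffer. The /tmp paths and the 1GB size are just assumptions for illustration; timings will obviously vary by kernel and hardware.

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

static double elapsed(struct timespec a, struct timespec b)
{
	return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) / 1e9;
}

int main(void)
{
	size_t len = 1UL << 30;	/* 1GB, as in the test above */
	char *buf = malloc(len);
	struct timespec t0, t1;
	char *map;
	int fd;

	if (!buf)
		return 1;
	memset(buf, 0x5A, len);

	/* One big write() to a fresh file */
	fd = open("/tmp/writetest", O_RDWR | O_CREAT | O_TRUNC, 0644);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	if (write(fd, buf, len) != (ssize_t)len)
		perror("write");
	clock_gettime(CLOCK_MONOTONIC, &t1);
	printf("write(): %.2fs (%.0fMB/s)\n", elapsed(t0, t1),
	       len / elapsed(t0, t1) / (1024 * 1024));
	close(fd);

	/* Take write() out of the equation: mmap the file, memcpy into it */
	fd = open("/tmp/mmaptest", O_RDWR | O_CREAT | O_TRUNC, 0644);
	if (ftruncate(fd, len))
		perror("ftruncate");
	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	clock_gettime(CLOCK_MONOTONIC, &t0);
	memcpy(map, buf, len);
	clock_gettime(CLOCK_MONOTONIC, &t1);
	printf("mmap+memcpy: %.2fs\n", elapsed(t0, t1));
	munmap(map, len);
	close(fd);
	free(buf);
	return 0;
}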
Anyway, it may be possible to get lrzip integrated into libarchive, and therefore into a number of package managers and decompression programs that use that library. The GPL license in lrzip may be a sticking point, though, and the authors of lzo and the original rzip have not responded to queries about the possibility of making the library LGPL, which would make it easier to incorporate into the BSD-licensed libarchive. So for now, it's stuck on GPL version 2.
Enjoy.
Regarding that writeback thing... doesn't Linux default to quite aggressive writeback, not caching things in ram only for more than a second or so, to prevent data loss on crashing? IIRC it's only with laptop mode or other special settings (file system options?) that the "file fits into ram, so isn't written to disk immediately" behaviour holds.
Regarding LZO: it's unlikely that the author will change the license of his product, since he depends on it for his business model. Such requests have already been made and refused on other open-source projects. And LZO is now well established as a GPL codec.
If you are interested in a libarchive-compatible fast mode for lrzip, have you considered any fast C codec alternative using a BSD license, such as LZ4? (http://code.google.com/p/lz4/)
The actual rzip code itself, which is integral to lrzip, is also GPL, so finding a differently licensed back end will not really help. As it stands, libarchive integration is possible by using an external program instead of the library anyway, so that is likely the approach I'll be taking. I've actually tried lz4 in my own code base with lrzip to see how much it offers: it is no faster on compression, slightly worse on compression ratio, and on decompression the back end is not the rate-limiting step, so lz4 does not speed anything up. With no demonstrable benefit, I was reluctant to add yet another back end compression algorithm; there are already too many in my opinion, and they're only there for legacy reasons.