Friday, 22 April 2011

lrzip-0.603

Trying to polish off version 0.6x of lrzip so that it's nice and stable and working as planned, I've made a few more updates, with some outside help, addressing a few issues that have come up. Here's the short changelog:

lrzip now detects when output is being redirected without a filename and will automatically output to stdout.
Apple builds, which had errors on compressing files larger than 2GB in size, were fixed.
lrztar now properly supports -o, -O, and -S.
The lrzip configuration file now supports encryption.
lrzip will now warn if it's inappropriately passed a directory as an argument directly.

Probably the most fun part of this is the first feature upgrade, the one to do with stdout, which I use regularly now. Since I store all my kernels and patches as .lrz, I can now do:

lrunzip patch-2.6.38.4.lrz | patch -p1

Also, graysky made some nice graphs and I feel obliged to put them up here:





Of course, with much larger files and more CPUs and RAM, the discrepancy becomes much greater with lrzip, but that doesn't change the fact that this is a real-world test.

So grab it here:
LRZIP ON FRESHMEAT

As an aside, Debian unstable now has 0.602+ in its repo, and the upcoming elite release of Slackware also has 0.602.

Thursday, 21 April 2011

BFS 0.401

I was meant to be on holidays this week, and indeed I've been away from home somewhere warm. While BFS was supposed to be the last thing I cared about, I was fortunate enough to have other people actually find and fix some bugs in BFS. First up was _sid_, who found some very small optimisations that I've committed to the new version of BFS. But even more impressively, Serge Belyshev found a long-standing bug that would cause bad latencies when Hz values were low, due to the "last_ran" variable not being set. This may well have been causing a significant latency disadvantage to BFS when Hz was 100.


As you can see in this graph, worst case latencies could be 100 times better with this bug fixed. While it will affect all Hz values, it is most significant at low Hz and probably unnoticeable by the time you're on 1000Hz. Those who are on low Hz configurations, especially those on, say, Android, will notice a dramatic speedup moving to BFS 0.401.

So get it here (available for 2.6.38.3, 2.6.35.12 and 2.6.32.38):
BFS PATCHES

Again, thanks VERY much to the testers and even more to those contributing bugfixes and code.

Wednesday, 13 April 2011

lrzip 0.602

So the latest version of lrzip seems to be working well in the field with very few bug reports, which is nice considering the magnitude of the changes that went into 0.600. I let it stew for a while at 0.601 while I shook out any obvious bugs, and am now releasing a new stable version that is mostly a bugfix release. Here's the what's new entry for this version:

Now builds on Cygwin.
Fixed wrong symlinks which broke some package generation.
Imposed limits for 32bit machines with way too much ram for their own good.
Disable md5 generation on Apple for now since it's faulty.
Displays full version with -V.
Checks for pod2man on ./configure
File permissions are better carried over instead of being only 0600.

The only new "feature" is building on Cygwin, which was contributed by Тулебаев Салават. Thanks!

Just a reminder for what sort of data lrzip works particularly well on:
linux-2.6.0-2.6.38.tar.lrz

This is a tarball of all 39 stable kernel releases from 2.6.0 to 2.6.38 and is only 160MB.
Decompressed file size: 10618664960
Compressed file size: 168125950
Compression ratio: 63.159
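
For anyone who wants to check the arithmetic, here is a minimal sketch of how the ratio follows from the two sizes above. It is not part of lrzip; the function and variable names are made up for illustration, and it assumes the "160MB" figure is counted in units of 2^20 bytes.

```python
# Sizes in bytes, copied from the figures quoted above.
DECOMPRESSED_BYTES = 10618664960
COMPRESSED_BYTES = 168125950


def compression_ratio(decompressed: int, compressed: int) -> float:
    """Return how many times smaller the compressed file is (hypothetical helper)."""
    if compressed <= 0:
        raise ValueError("compressed size must be positive")
    return decompressed / compressed


ratio = compression_ratio(DECOMPRESSED_BYTES, COMPRESSED_BYTES)

# Assumption: "MB" in the post means 2**20 bytes.
compressed_mib = COMPRESSED_BYTES / 2**20

print(f"Compression ratio: {ratio:.3f}")
print(f"Compressed size: {compressed_mib:.1f} MiB")
```

The ratio is simply the decompressed size divided by the compressed size, so a higher number means better compression.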

Enjoy!
lrzip 0.602 at freshmeat

Scalability of BFS?

So it occurred to me that for some time I've been saying that BFS may scale well only up to about 16 CPUs. That was a fairly generic guess based on the design of BFS, but it appears that multi-threaded and multi-core machines quite like BFS, going by the real-world benchmarks I'm getting back from various people. With the latest changes to BFS, which bumped the version up to 0.400, it should have improved further.

I've tried googling for links to do with BFS and scalability, and the biggest machine I've been able to find that benefits from it is a 24 core machine running F@H (folding at home). Given that this was with an older version of BFS, and that there were actually advantages even at 24 cores, I wonder where the point is at which it stops scaling?

Obviously scalability is more than just "running F@H" and will depend entirely on architecture and workload and definition of scalability, and so on, but... I wanted to ask the community: what's the biggest machine anyone has tried BFS on, and how well did it perform? If someone has access to 16+ cores to try it out, I'd be mighty grateful for your results.