Linux 3.3 is out, but I'm not releasing BFS for it at this stage. The reason is that a regression has been reported as showing up in BFS and it's proving hard to track down.
The issue is a slowdown under load when the load is niced. The slowdown does not occur all the time: I've only seen it in the wild once and have not been able to repeat it since. I've audited the code and have not yet found the culprit, but when it does happen it is very obvious, with the mouse stalling for seconds. The 'top' output is usually a giveaway that something has gone wrong, because the 'PR' column should normally report values of 1-41 for a nice 19 load. When the problem occurs, it shows much higher values, in the 42-81 range, which should not happen. Unfortunately, the best hint for me would be to find which version of BFS introduced the problem and then look for the change responsible, and since I can't reproduce it most of the time, I can't do this regression testing myself.
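For anyone who wants to watch for the symptom automatically rather than eyeballing 'top', here is a rough Python sketch. It assumes the usual batch-mode layout of 'top' (a header row containing PID, PR and NI columns); the function name find_suspicious_pr and its parameters are made up for illustration, and the 42-81 range is simply the one described above.

```python
def find_suspicious_pr(top_output, nice_level=19, pr_low=42, pr_high=81):
    """Return (pid, pr) pairs for niced tasks whose PR falls in the bad range.

    Hypothetical helper: the range defaults come from the description in
    this post, not from the BFS source.
    """
    pid_i = pr_i = ni_i = None
    hits = []
    for line in top_output.splitlines():
        fields = line.split()
        if pid_i is None:
            # Skip the summary area until the column header row is found.
            if "PID" in fields and "PR" in fields and "NI" in fields:
                pid_i = fields.index("PID")
                pr_i = fields.index("PR")
                ni_i = fields.index("NI")
            continue
        if len(fields) <= max(pid_i, pr_i, ni_i):
            continue
        try:
            pid = int(fields[pid_i])
            pr = int(fields[pr_i])
            ni = int(fields[ni_i])
        except ValueError:
            # Non-numeric PR (for example "rt" for realtime tasks): ignore.
            continue
        if ni == nice_level and pr_low <= pr <= pr_high:
            hits.append((pid, pr))
    return hits
```

Feeding it the text of a single 'top -b -n 1' run every few seconds while the niced load is running should be enough to catch the moment the priorities go wrong.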
So I'm appealing to the BFS users out there to see if anyone who hits this problem more regularly has the time to try older versions of BFS. By older versions of BFS, I don't mean the same version of BFS on older kernels, but the first version of BFS that was available for each of those kernels.
Reproducing it requires running something continuously as a niced load, where the load is equal to the number of CPUs: for example, 'nice -19 make -j4' run continuously in a kernel tree on a quad core machine.
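If it helps, the continuous load could be scripted along these lines. This is only a sketch under a few assumptions: the helper names are invented, 'nice -n 19' is used as the equivalent spelling of 'nice -19', and the 'make clean' between rounds is my own addition so that every round actually has work to do.

```python
import os
import subprocess


def niced_build_command(jobs=None):
    """Build the niced make command, one job per CPU unless told otherwise."""
    if jobs is None:
        jobs = os.cpu_count() or 1
    return ["nice", "-n", "19", "make", "-j%d" % jobs]


def run_continuously(tree, jobs=None, rounds=None):
    """Rebuild the tree over and over; rounds=None means loop until killed."""
    done = 0
    while rounds is None or done < rounds:
        # Assumption: clean first so the next build is not a no-op.
        subprocess.run(["make", "clean"], cwd=tree, check=True)
        # Build failures are ignored; only the sustained load matters here.
        subprocess.run(niced_build_command(jobs), cwd=tree, check=False)
        done += 1
```

Calling run_continuously with the path to a configured kernel tree keeps the niced load going while you use the desktop and watch for stalls.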
I'm hoping that someone out there is able to reproduce it and can do the regression testing. Thanks in advance. In the meantime, I'll keep auditing code and comparing new to old versions in the hope something stands out.
EDIT: An alternative approach was to try moving to 3.3 and make minor fixes along the way to see if the problem persists. Consider this patch a pre-release for now (CPU accounting appears all screwy still):
EDIT2: Fixed CPU accounting, bumped version to 418: