Those following the development of the patches for interactivity at massive load, I have COMPLETELY DROPPED them as they introduce regressions at normal workloads, and I cannot under any circumstances approve changes to improve behaviour at ridiculous workloads which affect regular ones. I still see precisely zero point at optimising for absurd workloads. Proving how many un-niced jobs you can throw at your kernel compiles is not a measure of one's prowess. It is just a mindless test.
Remember, I already had developed a hierarchical tree-based penalty patch for BFS and blogged about it here. I can do it in a 10 line patch for BFS, but it introduced regressions, which is why I dropped it (see earlier blog entry here: further-updates-on-hierarchical-tree).
Again, I can't for the life of me see why you'd optimise for make -j64 on a quad core machine. It is one workload, unique to people who compile all the time, but done in a way you wouldn't normally do it anyway. It is not going to magically make anything else better. If for some god-forsaken reason you wanted to do that, you could already do that with nice, or even better, by running it SCHED_IDLEPRIO.
nice -19 make -j 64 blahblah
or
schedtool -D -e make -j64 blahblah
It's not really that hard folks...
And if you really really really still want the feature for BFS, the patch that does the hierarchical tree based penalty is rolled into a bigger patch (so a lot more than just the 10 lines I mentioned) that can also group threads and enable/disable the features and it's still here:
bfs357-penalise_fork_depth_account_threads.patch
It is worth noting also that the mainline approach costs you in throughput, whereas this patch is virtually free.
EDIT: I forgot to mention that for YEARS now I've been using my "toolsched" wrapper scripts that do this automatically. See toolsched for the scripts. Make always starts as SCHED_IDLEPRIO for me at home.