-ck hacking: 3.0 BFS delays

Monday, 25 July 2011

3.0 BFS delays

Hi all

I haven't blogged much lately because I've been distracted from kernel hacking by bitcoin mining. For some crazy reason I took it upon myself to make mining software that did what I wanted, writing it the way I write kernel code. Anyway, since it's unrelated, I haven't posted about it here before, but if anyone's interested, the development thread is here:

http://forum.bitcoin.org/index.php?topic=28402.0

Now about the kernel. To be honest, I have barely followed the development of 3.0 at all, being totally preoccupied with other things. I've taken time out from work as a sabbatical while I reassess work-life balance and long-term career management (even to the point of considering changing line of work - anyone need a C programmer?), and spend time with family, friends and random other personal development things. No, I'm not quitting kernel development any time soon (again).

Anyway the thing is I'm going with Interplast next week to Nauru (of all places) as a volunteer anaesthetist for needy children for 10 days. I'm not sure if I'll find time to port BFS to 3.0 before then, or if I'll be able to do it while I'm actually there (doubtful). So just a heads up that it might be a while before we BF the 3.0 kernel.

24 comments:

Unknown, 26 July 2011 at 00:32
I upgraded to 3.0 just two days ago, and it isn't as bad as I expected without BFS.

Is it just me, or did upstream get much better in the last release with multimedia/desktop workloads?

Oh, and best wishes for Nauru :-)

Anonymous, 26 July 2011 at 02:55
How much did you expect and how much did you get?
Whatever it is you are talking about.

"Show me the money!" -- from Jerry Maguire

Anonymous, 26 July 2011 at 13:48
@RealNC

Perhaps it is due to improvement in DRM?

Anonymous, 26 July 2011 at 15:01
I haven't played with bitcoins yet but I'm going to give your new app a go. If I get any I will send them over :)

MikeyB, 26 July 2011 at 21:08
Does BFS take NUMA into consideration?

Anonymous, 26 July 2011 at 23:00
@MikeyB, quoting from http://ck.kolivas.org/patches/bfs/bfs-faq.txt

NUMA aware?

It is NOT NUMA aware in the sense that it does any fancy shit on NUMA, but
it will work on NUMA hardware just fine. Only the really big NUMA hardware
is likely to suffer in performance, and this is theoretically only, since
no one has that sort of hardware to prove it to me, but it seems almost
certain. v0.300 onwards have NUMA enhancements.

Try it for yourself

Unknown, 27 July 2011 at 07:01
@myself
Nah, I spoke too soon. Starting Jack + LMMS + various synths shows audio latency glitches that don't exist with BFS :-P

Ralph Ulrich, 27 July 2011 at 09:55
I am planning a new Debian unstable derivative with others. I had hoped we could launch it with a linux-3.0-bfs kernel. Too bad there are no other people taking care of BFS :(

Anonymous, 27 July 2011 at 18:34
@Ralph
It is too bad BFS can't get into mainline as an optional scheduler, just as we have three different disk schedulers (noop, deadline, cfq). This would take some pressure off CK for development of BFS and also allow others to contribute collaboratively.

@CK
Everyone has a deep appreciation for the work you do! Thank you for it!

Anonymous, 27 July 2011 at 18:49
I can wait a couple of weeks. Thanks for bfs and for your volunteering for needy children.

Ralph Ulrich, 27 July 2011 at 20:33
Apropos mainline:
I think Linus rejected its introduction into mainline because he thought having different scheduler models would complicate development. So I am waiting for the moment when a new Linux version can be patched with a previous BFS version without errors. That would indicate the end of Linux scheduler development: the optimal moment to request inclusion of BFS in mainline!

But if the current mainline scheduler never reaches an optimum, this moment will never occur ...

Anonymous, 28 July 2011 at 04:05
The BFS second anniversary is upon us. Here are some gift requests.

Lobby phoronix.com to rerun their benchmarks: http://www.phoronix.com/scan.php?page=article&item=bfs_scheduler_benchmarks

Then redo the only scientific comparison I know: http://www.cs.unm.edu/~eschulte/data/bfs-v-cfs_groves-knockel-schulte.pdf

Unfortunately, a one-off test and a plot do not do it for BFS. We need a whole box, nicely wrapped and presented, to make BFS a happy kid.

Any takers?

Ralph Ulrich, 28 July 2011 at 06:15
BFS is not about benchmarks: we want a responsive desktop even if benchmarks get worse. Because it is about the desktop, and many of us use battery-powered notebooks, the longer the CPUs sleep the better!

Are there any tests out there that take these preferences into account?

Anonymous, 28 July 2011 at 11:39
Try to clock your battery (not watch) while encoding 1000 or so MP3s. That's a benchmark for you.

krilli, 29 July 2011 at 20:23
Rerunning the benchmarks is a good idea; however, if Phoronix don't feel like spending time on it, we could do it ourselves ...

Phoronix might be willing to share the setup code etc., so at best it is just a question of setting up a machine and doing ./run_benchmark.sh :)

krilli, 29 July 2011 at 21:49
(I have emailed Phoronix, btw, giving them a little background story and asking if they're interested in rerunning the benchmark.)

Ralph Ulrich, 30 July 2011 at 05:28
>Try to clock your battery (not watch) while encoding 1000 or so mp3's

Yeah, phoronix had some experiments using a watt meter ...

Unknown, 31 July 2011 at 08:26
Gah, I reverted from 3.0 back to 2.6.39-bfs. Vanilla 3.0 still has big problems under load. It seems they're never going to fix their issues.

Anonymous, 1 August 2011 at 22:21
@MikeyB:

I have been running it on IBM x445 summit, not numaq. 2 smp nodes, both with 2 xeon HT enabled processors, works like a charm, unfortunately no time for performance statistics.

@con:

"anyone need a c programmer": yes, I do. What's your salary? PM me on IRC.

Ralph Ulrich, 2 August 2011 at 20:18
Uuuuh, I had a look at patching linux-3.0 with BFS: this is going to be a big task for Kolivas.

At first it seems easy, with one function to transfer. But there is a new feature that will cost Con a lot: work was done in mainline linux-3.0 to implement restrictions on Linux containers. A huge task, I guess, to make a BFS patch now ....

Anonymous, 3 August 2011 at 15:17
Let's hope not :-(

Anonymous, 8 July 2012 at 21:10
I don't understand all the talk about 4096-CPU Linux servers. All of them are HPC servers, essentially a large cluster. The largest SMP servers for sale today have 32 or 64 CPUs. There are no larger SMP servers for sale. The biggest IBM mainframe, the z196, has 24 CPUs.

The Linux trick is to connect a lot of nodes to a fast switch, and then use software to make it look like a single kernel. This is how Linux servers can have 4096 CPUs: it is a large cluster on a network emulating a single kernel.

For instance, the SGI Altix Linux server works this way. If you study the SGI Altix Linux customers, they all do HPC work (embarrassingly parallel work). None use such servers for SMP work.

Also ScaleMP, which has up to 8192 cores, works this way:
http://www.theregister.co.uk/2011/09/20/scalemp_supports_amd_opterons/
"Since its founding in 2003, ScaleMP has tried a different approach. Instead of using special ASICs and interconnection protocols to lash together multiple server modes together into a shared memory system, ScaleMP cooked up a special hypervisor layer, called vSMP, that rides atop the x64 processors, memory controllers, and I/O controllers in multiple server nodes. Rather than carve up a single system image into multiple virtual machines, vSMP takes multiple physical servers and – using InfiniBand as a backplane interconnect – makes them look like a giant virtual SMP server with a shared memory space."