Comments on -ck hacking: Upgradeable rwlocks and BFS

(btw about that diagram: tlsf is the older incarna...

2012-06-28T17:40:04.960+10:00

(btw about that diagram: tlsf is the older incarnation of xvmalloc, kmalloc is slub)

Hi all, BFS and BFQ are now the default in all my ...

2012-06-28T17:32:12.178+10:00

Hi all,
BFS and BFQ are now the default on all my Linux systems. However, I miss the old SLQB allocator, so I have been looking for a replacement for SLUB/SLAB.

I've found xvmalloc, from compcache. According to the diagram below it uses less memory than SLUB, but I've never used it myself:

http://compcache.googlecode.com/svn/wiki/files/allocators_ideal_tlsf_kmalloc_compare.gif

Have any of you tried it? What do you think?

Thanks

(note: crossposted at the bfq ml)

@Ralph I have make an RIFS-V3-Test. It is fully de...

2012-06-27T03:02:04.851+10:00

@Ralph
I have made an RIFS-V3-Test. It is designed entirely for the desktop and fits the 5511 option.

However, the 1551 option cannot be implemented by any scheduler. Put another way, it is impossible to optimise for both low latency and batch throughput. They are opposing goals and we have to prefer one of the two.

BFS is a 1515 scheduler. EEVDF (in BFS it is actually EDF) is the best real-time algorithm that people have proven.

Originally CFS (the one the 2.6.23 kernel used) was good, but Ingo wanted it to fit every type of workload and CFS became unresponsive. (On the 2.6.31 kernel, if we disabled sleeper fairness, CFS was as responsive to users as BFS is, but after 2.6.31 the improved sleeper fairness feature was hard-coded.)
From CFS without sleeper fairness I can see that the Unix System V scheduler is normally the best scheduler for the desktop. (CFS takes the same approach as the Unix scheduler.) Unfortunately Linus favours server users, and the pure Unix scheduler was not adopted.
Chen

Ralph, Latency and throughput are heavily connect...

2012-06-26T20:20:43.130+10:00

Ralph,

Latency and throughput are heavily connected. The value of one will affect the value of the other. They are on the same "value scale".

Performance and energy sound like implementation constraints: a scheduler either meets your requirements or it does not. There is no universal measurement.

You have to define the requirements of the task you need a scheduler for. CFS is general because it has to "fit all needs" as best it can; BFS is designed for desktop interactivity (with fairness), and so on.

Mike Brown

@Ralph Ulrich RIFS is designed for mobile/desktop ...

2012-06-25T19:44:11.492+10:00

@Ralph Ulrich
RIFS is designed for mobile/desktop users.

Chen, I know: NOHZ energy savings only ticks/gigaH...

2012-06-25T09:59:36.391+10:00

Chen, I know: NOHZ energy savings are only ticks/gigaHertzOfCpu

I wrote about four dimensions because I want the developer of a scheduler (YOU) to emphasize differentiation, not universality!

What are the differences with regard to these four dimensions
(BFS - RIFS - RIFS/ES - CFS)
Chen, you haven't answered this question yet, which I also asked at Phoronix. You just say your scheduler performs (universally) better than CFS. This answer is a bug :)
Ralph Ulrich

"CFS mainline wants to satify all needs unive...

2012-06-25T02:18:10.821+10:00

"CFS mainline wants to satisfy all needs universally. I think a scientific informatical research project could easily show that in principle there cannot be a scheduler to fit the needs of all cases." Yes, that is so. CFS wants to do that but has failed.

"1. energy consideration (think of no NOHZ of RIFS)"
Ah, a news item on Phoronix showed that NOHZ gives only a very slight improvement in energy consumption. On the other hand, NOHZ causes some problems with interactivity.

Performance is a term meaning to comprise three di...

2012-06-25T01:55:30.320+10:00

Performance is a term that comprises three dimensions:
1. energy consideration (think of the lack of NOHZ in RIFS)
2. latency of response to users
3. throughput of data

The last dimension is the most trivial to measure.
Regarding schedulers there is an additional dimension:
4. predictability (what realtime kernels want to provide)

CFS mainline wants to satisfy all needs universally. I think a computer science research project could easily show that, in principle, there cannot be a scheduler that fits the needs of all cases.

Even if you imagine a theoretical scheduler with adaptive strategies, that artificial intelligence would itself cost considerable resources (energy and throughput).

I would like users to have a choice: something the user indicates by giving each of the four dimensions 1 to 5 points:
5511 a mobile user would want
1155 an audiophile creator would select
1551 a developer compiling his latest source edits.
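
To make the four-digit notation concrete, here is a minimal sketch that decodes such a profile string into the four dimensions in the order listed above (energy, latency, throughput, predictability). The names `DIMENSIONS` and `parse_profile` are hypothetical and exist only for illustration; no scheduler actually exposes such an interface.

```python
# Hypothetical decoder for the "5511"-style profiles described above.
# The dimension order follows the numbered list: 1. energy, 2. latency,
# 3. throughput, 4. predictability. Nothing here is a real kernel interface.
DIMENSIONS = ("energy", "latency", "throughput", "predictability")


def parse_profile(code):
    """Turn a string such as "5511" into a {dimension: points} mapping."""
    if len(code) != len(DIMENSIONS) or not (code.isascii() and code.isdigit()):
        raise ValueError("profile must be exactly four digits: %r" % (code,))
    points = [int(ch) for ch in code]
    if any(p < 1 or p > 5 for p in points):
        raise ValueError("each dimension takes 1 to 5 points: %r" % (code,))
    return dict(zip(DIMENSIONS, points))


if __name__ == "__main__":
    for code in ("5511", "1155", "1551"):
        print(code, parse_profile(code))
```

Read this way, 5511 weights energy and latency, 1155 weights throughput and predictability, and 1551 weights latency and throughput.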

Ralph Ulrich

Hi Con I just want to clearify one thing. You said...

2012-06-24T20:00:22.492+10:00

Hi Con
I just want to clarify one thing. You said that RIFS/-ES optimises for big loads, but actually I don't do that. Also, there is *NO INTERACTIVITY ESTIMATOR AT ALL* in the design of RIFS/-ES. If you say that RIFS has a sleep-based estimator, I can claim that BFS also has a sleep-based estimator. (Actually, neither of them has one.)
Chen

> Thoughts? i find a linear curve reassuring b...

2012-06-23T06:40:29.493+10:00

> Thoughts?

I find a linear curve reassuring because it indicates a direct relationship between the quantities, meaning all the influencing factors are known and there are no hidden surprises.

Well... This RIFS code optimises for sleeping tas...

2012-06-22T19:22:45.502+10:00

Well...

This RIFS code optimises for sleeping tasks under extreme loads. In other words, it adds unfairness favouring interactive tasks at loads of 64 or more. This is precisely the opposite of the direction the BFS code was heading: the RIFS code contains a sleep-based interactivity estimator (BFS has no interactivity estimator, since they introduce unfairness and behave unpredictably), and it optimises for ridiculous loads that desktop users will never see.

So while I'm happy to see people hacking on BFS, this code is pretty much orthogonal to the whole BFS approach.

Ok. But using the interface provided by mainline i...

2012-06-13T15:33:10.213+10:00

Ok. But using the interface provided by mainline is better because it helps prevent bugs. Everyone hates bugs, especially bugs in a critical part of an OS.

Thank you very much. It's clearly not a win, b...

2012-06-13T09:38:09.778+10:00

Thank you very much. It's clearly not a win, but then it's not a massive loss either. If it opens up other possibilities in the scheduler it may be worth pursuing, but it's not going to be part of "official" BFS just yet.

OK... did some testing using the my "make...

2012-06-13T05:49:21.532+10:00

OK... did some testing using my "make" benchmark. I think this endpoint is a useful one for comparing the code: it is reproducible and relevant to real-world use, albeit not a measure of interactivity!

1) It is a non-latency based measure.
2) Compilation benchmark using gcc to “make bzImage” for a preconfigured linux 3.4 build.
3) The benchmark is run 28 times in total to get a decent number of observations for a statistical comparison. In all cases the first run is omitted, leaving n=27.
4) Results are how many seconds it took to compile on a dual Intel E5620 (2x hyperthreaded quad-core CPUs on a single board) @ 2.40 GHz.
5) Make is run with 16 threads (8 physical cores and 8 HT cores).
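
The procedure in the list above could be sketched roughly as follows. This is not the script actually used for these results, which is not shown here; `time_runs`, `summarize` and `DEMO_CMD` are hypothetical names. `DEMO_CMD` is a harmless stand-in so the sketch runs anywhere; a real run would substitute the kernel build command and would need a clean step between runs.

```python
import statistics
import subprocess
import sys
import time

# Stand-in command so the sketch runs anywhere (assumption). For the benchmark
# described above one would substitute something like
# ["make", "-j16", "bzImage"] and clean the tree between runs.
DEMO_CMD = [sys.executable, "-c", "pass"]


def time_runs(cmd, runs=28, discard=1):
    """Run cmd `runs` times; return wall-clock seconds with the first
    `discard` runs dropped (28 runs minus 1 leaves n=27)."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings[discard:]


def summarize(timings):
    """Basic descriptive statistics for one kernel's timings."""
    return {
        "n": len(timings),
        "median": statistics.median(timings),
        "mean": statistics.mean(timings),
        "stdev": statistics.stdev(timings),
    }


if __name__ == "__main__":
    print(summarize(time_runs(DEMO_CMD, runs=5, discard=1)))
```

Dropping the first run keeps cold-cache effects out of the sample, which is presumably why it is omitted above.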

1st test: 3.4.2-bfs342 vs. 3.4.2-bfs3.4.2+urwlocks:
Result: The URWLocks patched kernel was on the borderline of showing a statistically significant difference. If I had powered the analysis with a larger number of replicates (say 63 or 72), I'll bet that it would be statistically significantly SLOWER than the unpatched bfs kernel. You can see from the ANOVA that the median time is 808 ms longer for the kernel running with the experimental patchset:

http://s15.postimage.org/4uv2qx2bt/anova_1.jpg

2nd test: 3.4.2-vanilla vs. 3.4.2-bfs3.4.2:
Result: The vanilla 3.4.2 kernel with the stock queuing system is statistically significantly SLOWER than the corresponding bfs patched kernel; the median time difference is 152 ms.

http://s15.postimage.org/5rej32ux5/anova_2.jpg
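
For anyone wanting to run a similar comparison on their own timings, here is a minimal sketch using only the Python standard library. Note that it swaps in a two-sided permutation test on the difference of medians rather than the ANOVA used for the results above, and the function names are hypothetical. The numbers in the usage example are made up for illustration and are not the measurements from these tests.

```python
import random
import statistics


def median_diff(a, b):
    """Median of sample a minus median of sample b (same units as the input)."""
    return statistics.median(a) - statistics.median(b)


def permutation_p_value(a, b, trials=10000, seed=0):
    """Two-sided permutation test on the difference of medians.

    Repeatedly shuffles the pooled timings and counts how often a random
    split gives a difference at least as large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = abs(median_diff(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        if abs(median_diff(pooled[:len(a)], pooled[len(a):])) >= observed:
            hits += 1
    # The +1 terms keep the estimate from ever being exactly zero.
    return (hits + 1) / (trials + 1)


if __name__ == "__main__":
    # Made-up compile times in seconds, for illustration only.
    patched = [61.2, 61.9, 62.4, 61.7, 62.1, 61.8]
    unpatched = [60.9, 61.1, 61.5, 60.8, 61.3, 61.0]
    print("median difference (s):", median_diff(patched, unpatched))
    print("p-value:", permutation_p_value(patched, unpatched, trials=2000))
```

With more replicates per kernel, the test can resolve smaller differences, which is the point made above about powering the analysis with 63 or 72 runs.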

CK - Always glad to support your innovative ideas with regard to schedulers. Just ask and keep up the great work you do for the Linux community :)

Hi Chen, as a normal user I really appreciate you ...

2012-06-12T20:13:07.875+10:00

Hi Chen,
as a normal user I really appreciate you and H. Danton working on the issue. There is a need for differentiation in the area of schedulers. Wouldn't it be great if there were a plugin infrastructure for this in mainline? I think this would be quite simple: just a few #ifdef SCHEDNAME in mainline code.

Ralph Ulrich

I have a dual quad machine (8 physical cores + 8 H...

2012-06-12T06:30:26.169+10:00

I have a dual quad machine (8 physical cores + 8 HT cores) and will be glad to test this out... the most quantitative benchmark I have is my infamous "make" benchmark, which I will run and then post the results.

Thanks for hacking CK, don't ever stop!

Con, I think you should try to use the modular sch...

2012-06-12T03:05:57.398+10:00

Con, I think you should try to use the modular scheduler which mainline has.

Doing this could make the code more stable. It doesn't mean that BFS has to use per-CPU runqueues throughout, because the modular scheduler also gives you free space to do what you want. :-) In other words, the patch can be smaller this way. (We just need to change a few lines of code in core.c and sched.h, and rewrite the Makefile.)
Chen

I think it will be kinda difficult to find a BFS u...

2012-06-12T01:05:41.728+10:00

I think it will be kinda difficult to find a BFS user with so many cores to do proper tests. It's somewhat of an oxymoron :-)

Not particularly because of the delays in transact...

2012-06-12T01:00:18.625+10:00

Not particularly, because of the delays in transactions, the dependence on quiescent periods and so on.

What about RCU (Read-copy-update) — can it be used...

2012-06-12T00:57:51.545+10:00

What about RCU (Read-copy-update) — can it be used instead?