tag:blogger.com,1999:blog-6469704299235308349.post852753892410068467..comments2024-02-09T16:24:46.087+11:00Comments on -ck hacking: BFS 0.411 test with skiplistsckhttp://www.blogger.com/profile/02904761195451530213noreply@blogger.comBlogger27125tag:blogger.com,1999:blog-6469704299235308349.post-89670964200532149982011-10-09T00:12:43.780+11:002011-10-09T00:12:43.780+11:00I've tested it with nexuiz… :-) Alas, 406 is b...I've tested it with nexuiz… :-) Alas, 406 is better, meanwhile 411 was laggy all the time.poigehttps://www.blogger.com/profile/14290248818990453110noreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-47694978999658087582011-09-30T22:36:40.155+10:002011-09-30T22:36:40.155+10:00I use the make -jx benchmark because it's easy...I use the make -jx benchmark because it's easy to demonstrate and reproduce. Surprisingly it's also a very good make/break test. Every other more complicated benchmark I could think of throwing at the skiplists has also failed to find a single case where it's better. I really wanted it to be be, but it's not.ckhttps://www.blogger.com/profile/02904761195451530213noreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-21682416591571913722011-09-30T22:31:40.215+10:002011-09-30T22:31:40.215+10:00This means the new skiplist should be faster with ...This means the new skiplist should be faster with wokloads where there are many long running jobs - not many inertions or removals.<br /><br />Ralph Ulrich (I dont get my Name in any more without URL)Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-64517117342174726572011-09-30T10:29:37.943+10:002011-09-30T10:29:37.943+10:00The original O(n) search was heavily optimised and...The original O(n) search was heavily optimised and only includes tasks queued but not running (i.e. not those already on CPU). The thing about O(n) is it tells you the complexity of the algorithm, but not really how fast it is overall. A fast O(n) algorithm can be faster than an O(1) algorithm even for all workloads, if the absolute speed of the O(1) is very slow. My guess with these skiplists is the overhead of the extra data structures and the O(log n) insertion is costing much more than any benefit at the search time. Both insertion and removal is more expensive with skiplists, and only lookup is faster.ckhttps://www.blogger.com/profile/02904761195451530213noreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-62479148147069952552011-09-30T07:10:52.165+10:002011-09-30T07:10:52.165+10:00@CK - I put that data package together comparing a...@CK - I put that data package together comparing all your bfs patchsets. It's not good news as you mentioned.<br /><br /><a href="http://repo-ck.com/bench/bfs_builds.pdf" rel="nofollow">Link</a> to download a pdf.<br /><br />So, skiplists in bfs have no future in production it would seem... unless you have some other ideas? I am glad to benchmark future builds as always!Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-81230785873804441782011-09-30T02:51:29.088+10:002011-09-30T02:51:29.088+10:00What's the explanation? The original search i...What's the explanation? The original search is not really O(n), n is too small, or what?Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-19502365832780637292011-09-29T21:10:13.381+10:002011-09-29T21:10:13.381+10:00I got Xorg reverted back to 1.10 and everything se...I got Xorg reverted back to 1.10 and everything seems to be working just like it did with BFS 0.406. I did some reading and it looks like the issue I was having was with the nvidia binary blob. I read that the latest driver that is suppose to be compatible with Xorg 1.11 has some "drawing issues." Next time I'll try to confirm a bug on Debian stable before I send a bug report.tux9656noreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-41563756521035150982011-09-29T20:03:44.480+10:002011-09-29T20:03:44.480+10:00@tux9656: Thanks. I had a feeling something didn&#...@tux9656: Thanks. I had a feeling something didn't sound right about that. The interactivity should be virtually unchanged.ckhttps://www.blogger.com/profile/02904761195451530213noreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-58762176021964523802011-09-29T18:51:28.843+10:002011-09-29T18:51:28.843+10:00It seems the problem I reported with the hiccups m...It seems the problem I reported with the hiccups might be a false alarm. I upgraded to Xorg 1.11 at the same time that I upgraded to BFS 0.411. After running for a while on the official prepackaged Debian kernel, I have been experiencing these same hiccups. It is just not as severe on the Debian kernel. I'm going to try to see if I can still get the previous Xorg 1.10 packages that were in Debian testing and let you know how BFS 0.411 works with it.tux9656noreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-11954157695601042422011-09-29T18:00:02.525+10:002011-09-29T18:00:02.525+10:00Very nice. I'll test it on the three machines...Very nice. I'll test it on the three machines and include the results when I post again later today... maybe 12-13 h from now.Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-70734216384062288042011-09-29T17:21:17.766+10:002011-09-29T17:21:17.766+10:00Thanks graysky! I tried my best but even in my tes...Thanks graysky! I tried my best but even in my testing am not able to match the performance of bfs406 with every improvement I can think of, but you can try for yourself with this test patch on top:<br /><br /><a href="http://ck.kolivas.org/patches/bfs/test/bfs411-412test.patch" rel="nofollow">bfs411-412test.patch</a>ckhttps://www.blogger.com/profile/02904761195451530213noreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-5127703805244928172011-09-29T16:36:20.228+10:002011-09-29T16:36:20.228+10:00Glad to hear it, CK! I will post my full analysis...Glad to hear it, CK! I will post my full analysis later today (probably 15-16 h from now after I have time to work-up and review the data). I'd also like to see someone else to repeat my experiments to verify reproducibility. I have provided the needed file and script to do so, read on.<br /><br />What I did: <br />1) Configured up a linux-3.0.4 source with the default Arch linux .config (on x86_64) and used this as the test code set.<br /><br />2) Booted a system into run level 3 with a minimal set of daemons enabled and run a little bash script that times how long it takes to do a "make -jX bzImages modules" and then repeats the process 5 times totally writing the results to ~/data.txt<br /><br />3) Run this process on each kernel<br />--> stock<br />--> bfs v0.406<br />--> bfs v0.410<br />--> bfs v0.411<br /><br />4) Machines used:<br /><br />*E5200 (2 cores)<br />*X3360 (4 cores)<br />*Dual quad (8 cores + 8 HTs = 16 cores)<br /><br />5) Optionally study the effect of make flags on the dual quad system under the following conditions:<br /><br />1) Hyperthreading enabled<br />a) 16 threads<br />b) 8 threads<br />c) 4 threads<br />d) 2 threads<br /><br />1) Hyperthreading disabled<br />a) 8 threads<br />b) 4 threads<br />c) 2 threads<br /><br />LINKS<br /><a href="http://repo-ck.com/bench/linux-3.0.tar.xz" rel="nofollow">Preconfigured tarball</a> the bash script expects. Note that I have preconfigured for x86_64 only!<br><br /><a href="http://repo-ck.com/bench/benchmark.sh" rel="nofollow">Script</a> (bash) that does the work.<br /><br />Simply edit the bash script to reflect the paths you want to use for:<br />1) source tarball<br />2) temp space for compile<br /><br />Post your ~/data.txt and I can generate the boxplots for you or do them yourself :)Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-22116887614326541892011-09-29T13:34:28.627+10:002011-09-29T13:34:28.627+10:00Then again, it's possible that I simply haven&...Then again, it's possible that I simply haven't optimised the code paths enough in the skiplists implementation. I'll try and shave every bit off before giving them up, because they really SHOULD be better.ckhttps://www.blogger.com/profile/02904761195451530213noreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-24119099789394173232011-09-29T10:21:03.388+10:002011-09-29T10:21:03.388+10:00It's a foregone conclusion. The offset fix is ...It's a foregone conclusion. The offset fix is to fix a bug in the design. If the fixed version is shit, then there's no point sticking with it.ckhttps://www.blogger.com/profile/02904761195451530213noreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-400153541149019882011-09-29T10:19:03.413+10:002011-09-29T10:19:03.413+10:00So skiplists and the offset fix are mutually exclu...So skiplists and the offset fix are mutually exclusive? What are my other option? In other words, what configure switch would need to be tweaked to avoid that "choppy" behavior with skiplists? ...or is this a forgone conclusion and was your last post a serious statement? :/Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-11863439884389427382011-09-29T10:06:36.482+10:002011-09-29T10:06:36.482+10:00Yes. Give up on skiplists and just do the cleanup ...Yes. Give up on skiplists and just do the cleanup stuff :(ckhttps://www.blogger.com/profile/02904761195451530213noreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-46699635878579490752011-09-29T10:04:22.596+10:002011-09-29T10:04:22.596+10:00@ck - the problem is the deadline fix patch. I app...@ck - the problem is the deadline fix patch. I applied it to 0.410 and run the benchmark. The results are the same as the results for v0.411. Any thoughts for a path forward?Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-32861142328471227302011-09-29T08:31:28.453+10:002011-09-29T08:31:28.453+10:00Hrm interesting. Thanks. The deadline fix patch wo...Hrm interesting. Thanks. The deadline fix patch would lower throughput indeed. However it also is a correct fix for a bug in the design which would inappropriately favour throughput. Gotta keep watching...ckhttps://www.blogger.com/profile/02904761195451530213noreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-71351888770440047752011-09-29T08:28:44.014+10:002011-09-29T08:28:44.014+10:00CK - there are some serious regressions introduced...CK - there are some serious regressions introduced in v0.411 compared to v0.410. I show you data below for just my quad core machine. The benchmark is simply compiling linux v3.0.4 (bzImage modules) and timing the process. I have run 5 replicates for each version of BFS as you can see in the box plots. I have also repeated this experiment on a C2D and a dual quad machine (16 cores) and the treads are consistent: v0.410 > v0.406 > v0.411 > stock kernel.<br /><br />You have any thoughts as to which patch you introduced that causes this regression?<br /><br />Note: I will compiled v0.410 + the deadline patch and run it now. This way you can potentially eliminate the deadline patch as the cause of the regression.<br /><br />http://s2.postimage.org/7iufc5x7j/regression.pngAnonymousnoreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-6500091117624061692011-09-29T08:26:38.500+10:002011-09-29T08:26:38.500+10:00This comment has been removed by the author.Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-46191894382565885752011-09-29T04:27:26.860+10:002011-09-29T04:27:26.860+10:00I can attest smooth desktop operations with zero l...I can attest smooth desktop operations with zero latency, even when compiling a kernel in the background using all cores and listening to an mp3 stream at the same time... 411, the best BFS ever? ;)Martinhttp://www.frogge.de/noreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-11615478609623218882011-09-28T21:41:34.576+10:002011-09-28T21:41:34.576+10:00!tux9656 yes that's the correct patch. Odd, wh...!tux9656 yes that's the correct patch. Odd, what hardware configuration? smp/up etc...ckhttps://www.blogger.com/profile/02904761195451530213noreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-80898263876083753232011-09-28T21:36:52.518+10:002011-09-28T21:36:52.518+10:00I'm getting horrible hiccups with this version...I'm getting horrible hiccups with this version. It reminds me of this days of a computer with an ISA bus where more than one piece of hardware wants to use the same IRQ but not be very nice about sharing it.<br /><br />Some examples:<br />When I click on a input box in Firefox it can take a full second or more for the focus to go there and scrolling up and down on this page is not at all smooth.<br />When I start gnome-alsamixer the application is not usable for five seconds, but the menu bar is responsive right away.<br /><br />But maybe I didn't apply the correct patch. Is ck1-bfs406-411.patch meant to be applied to 3.0.4-ck1?tux9656noreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-71574906711636355882011-09-28T21:12:41.758+10:002011-09-28T21:12:41.758+10:00Many thanks for the correction.
I ask if it is pos...Many thanks for the correction.<br />I ask if it is possible to make Ubuntu Packages with this patch.VamPirohttp://www.rasslabyxa.runoreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-69438229357718220012011-09-28T18:12:47.532+10:002011-09-28T18:12:47.532+10:00Tested with clear sources....works very fine!
Bug ...Tested with clear sources....works very fine!<br />Bug definitly fixed! :)<br />Great ck, great BFS!!!!!!!Axlnoreply@blogger.com