tag:blogger.com,1999:blog-6469704299235308349.post6248845135880807165..comments2024-03-28T15:50:13.644+11:00Comments on -ck hacking: Visualising cksortckhttp://www.blogger.com/profile/02904761195451530213noreply@blogger.comBlogger3125tag:blogger.com,1999:blog-6469704299235308349.post-12863589261304528612011-10-09T00:43:31.158+11:002011-10-09T00:43:31.158+11:00What if you were to implement a way to quickly est...What if you were to implement a way to quickly estimate how badly unsorted the data is and if it's not too badly unsorted, don't bother sorting it. Would that help speed things up? If so, this is my thinking so far.<br /><br />==========<br />You would calculate the average of the values of all positions, which I'll call full_average. You would then calculate the average of positions 1-10, which I'll call average_set_a, the average of positions 11-20, which I'll call average_set_b, the average of positions 21-30, which I'll call average_set_c, the average of positions 31-40, which I'll call average_set_d, and the average of positions 41-50, which I'll call average_set_e. Now using a sorting algorithm to find the difference of lowest value and highest value of the average_set_'s will give a value for an estimate of the amount of deviation in the dataset, which I'll call deviation_amount. Yes, it's a sorting algorithm to decide whether something should be sorted or not, but my thinking is that sorting a set of five is faster than sorting a set of 50. Now, if average_set_a is close to 5% of deviation_amount, and if average_set_b is close to 15% of average_set_b and so forth for the other average_set_'s, then the dataset is not likely too badly unsorted.<br />==========<br /><br />I'm not sure if this would even end up being faster than just sorting the full dataset every time, and I can still think up some datasets where this still won't work at all. Anyway, I've got to take a break from all this thinking now.<br /><br />On second thought, I think this is turning into complete mess of total nonsense. Maybe you'll at least be able to take something out of this that would work.tux9656noreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-43405026388611963552011-10-05T08:05:43.063+11:002011-10-05T08:05:43.063+11:00LOL I love the way people reply with stuff like &q...LOL I love the way people reply with stuff like "What's your point" in that smug internet way. The introduction did say it was purely an intellectual exercise.Anonymousnoreply@blogger.comtag:blogger.com,1999:blog-6469704299235308349.post-55550012194207946782011-10-05T04:20:37.527+11:002011-10-05T04:20:37.527+11:00What's your point? Are you trying to beat qsor...What's your point? Are you trying to beat qsort in some corner case, or two, or three? Sure that's possible. The more you know about your distribution, the faster you can sort it too. That's no-brainer - that's the lack of entropy.<br /><br />Quoting from Wikipedia:<br />"quicksort performs only about 39% worse than the ideal number of comparisons"<br /><br />Beat that, my friend! <br /><br />Again the beauty of you BFS is that it does not optimize the corners. Why you decided to go in the opposite direction with cksort? No one cases about corner cases of qsort - is what I am saying.Anonymousnoreply@blogger.com