Sunday 1 July 2012

BFS 424 test

A couple of bug reports mostly related to disk I/O seem to have cropped up with BFS 423/3.4-ck2. The likely culprit seems to be the plugged I/O management within schedule() that I modified going from BFS 420 to BFS 423, when I adopted mainline's approach to managing the plugged I/O. It appears that the mechanism I had put in place for BFS was the correct one, and mainline's approach does not work (for BFS) so I've backed out that change and increased the version number. Here is the test patch:

bfs423-424.patch

Those with issues of any sort related to BFS 423 or ck2, please test this patch on top of the previous BFS patched kernel. Thanks!

19 comments:

  1. Please, upload again. That was no patch.
    Manuel

  2. Hi Con,

    my problems with the hard freeze on badblocktests are completely gone.
    It works on both USB and eSATA. Thanks. No other problems at the moment.
    Feels snappy as always.

    Applied your patch on top of the current ZEN kernel.

    Regards Mike

    PS: The download link in the post does not work, but the direct URL does: http://ck.kolivas.org/patches/bfs/test/bfs423-424.patch

  3. I also found this way to get that patch. It's now up and running for 1h with 3.4.4. Hope it's stable now.
    Manuel

  4. Link fixed (silly blogger url was autoadded). Thanks guys for testing. This is substantial enough a bugfix for a new official release.

  5. I'll do the standard comparison in my make benchmark... give me some time.

  6. Just woke up in the middle of the night (uptime ~5h) and my mouse (trackball) lost its middle mouse button capability like scrolling or pasting. The main two still work (left+right). Never had this before.
    Manuel

    Replies
    1. Forget what I wrote last night. Lack of coffee or something like that. Shame on me: loose contacts on the middle button had probably caused it.
      BFS 0.424 behaves very well here. Unfortunately I decided to reboot after 17h uptime, instead of immediately using brute force on the buttons, which is what helped later on.

      Best regards, Manuel

    2. 54h of stable and snappy uptime passed. Manuel

  7. CK - results from the make test show no difference between bfs v0.423 and bfs v0.424, which is good. The mainline scheduler was thrown in as a positive control. Interestingly, neither BFS version differentiated itself from mainline, which is unusual compared to historical data.

    http://s8.postimage.org/6ocyeg63n/boxplot.jpg

    Details:
    1) It is a non-latency based measure.
    2) Compilation benchmark using gcc to “make bzImage” for a preconfigured linux 3.4.4 build.
    3) The benchmark is run 28 times in total to get a decent number of observations for a statistical comparison. In all cases, the first run is omitted, leaving n=27.
    4) Results are how many seconds it took to compile on a dual Intel E5620 (2x hyperthreaded quad-core CPUs on a single board) @ 2.40 GHz.
    5) Make is run with 16 threads (8 physical cores and 8 HT cores).

  8. Thanks everyone for reporting back. I'm about to release 3.4-ck3 which is purely ck2 with bfs upgraded to this 424 patch unchanged.

    Replies
    1. CFS has now improved on the make test. However, I think there is a problem with the design of CFS.

      CFS is a variant of the multilevel feedback queue algorithm. It uses a red-black tree to store the tasks. It is not a deadline design (unlike BFS), and Ingo has missed this point. The task with the lowest vruntime is scheduled next. This algorithm has a problem: CFS can have an O(n) lag bound, and it is not really fair. A task that sleeps frequently can always take its turn to run on the CPU, which can cause starvation when I/O load is high. Let me describe how that happens.

      First, a task that is always sleeping (on disk I/O) can cause starvation, because it can always take its turn to run and can always preempt the other tasks. If more I/O tasks are running, users may find the system choppy.

      BFS is fairer than CFS, but it can still give a choppy user experience when compiling something with CPUx2 threads, because the interactive task cannot get its turn on the CPU as quickly as it needs to.
      Chen

    2. I don't think the shortcomings of CFS are particularly relevant here. Perhaps try LKML, they might be interested.

      The simple answer to your choppy user experience is to develop some common sense and NOT to compile with twice as many jobs as you have CPUs or if you really *must* use that much overload for some obscure reason, nice -n 19 or SCHED_IDLEPRIO your compile...

    3. chen, you probably misunderstand the term fairness. if your interactive task competes with 2*n tasks of the same priority, it only gets n/(2*n+1) cpu time. if you want to favour the interactive task you are actually asking for unfairness. in the case of BFS you need to introduce unfairness manually.

      i do it this way: the shells in my xterms are automatically executed in SCHED_IDLEPRIO class, so my compiler runs never interfere with my desktop.
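
The arithmetic in this comment can be checked with a minimal sketch. It assumes an idealised, strictly fair scheduler on n CPUs where every runnable task has the same priority and is always ready to run; the function name `fair_share` is hypothetical and is not part of BFS or CFS.

```python
from fractions import Fraction


def fair_share(cpus: int, runnable_tasks: int) -> Fraction:
    """Share of one CPU that each equal-priority task gets when the
    available CPU time is divided strictly evenly (idealised model)."""
    if cpus <= 0 or runnable_tasks <= 0:
        raise ValueError("cpus and runnable_tasks must be positive")
    # A single task can never use more than one whole CPU.
    return min(Fraction(1), Fraction(cpus, runnable_tasks))


if __name__ == "__main__":
    n = 4  # example CPU count, chosen only for illustration
    # 2*n compiler jobs plus one interactive task, all at the same priority.
    share = fair_share(n, 2 * n + 1)
    print(f"each task gets {share} of a CPU (about {float(share):.0%})")
```

Under these assumptions the interactive task is limited to n/(2*n+1) of a CPU, which is just under half, no matter how latency-sensitive it is.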

    4. @Martin
      Where did you find that I want to favor the interactive task?
      I am saying that CFS can cause unfairness.
      Maybe you misread the whole article.
      BFS is fair, but it can also give a bad desktop experience (with a small load it is good). I am now trying to improve the fairness of CFS.

    5. @X / Chen: The height of unfairness is to not program your scheduler to also work on uniprocessor machines. If you want to try to improve CFS, please take this into account; I'm not the only one with old hardware.

      @Martin: You don't necessarily need to run your compilers @ SCHED_IDLEPRIO. The current BFS behaves so well that you would notice stalls in audio/video very rarely (if at all). The only thing I run at this level is the BOINC client, which should never interfere with anything else.

      Manuel

  9. @Chen: you said "can cause choppy user experience when we are compiling something with CPUx2 threads. It is because the interactive task couldn't take turn to run on CPU as fast as it can." My point was, this is just fair behaviour.

    @Manuel: fair enough, but it doesn't hurt to run a compiler @ SCHED_IDLEPRIO. I don't mind if a compiler run takes a few seconds longer, but I expect my desktop to remain fully responsive. So I just give this hint to the scheduler.

    Btw, on UP machines I do not compile @ SCHED_IDLEPRIO since it starves the compiler too much. I simply compile @ SCHED_NORMAL at nice level 19.
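
The two options mentioned in this comment can be sketched as a small launcher. This is only an illustration under stated assumptions: the helper name `run_low_priority` is hypothetical, and it uses mainline's `SCHED_IDLE` policy (as exposed by Python's `os` module on Linux) as a stand-in for BFS's SCHED_IDLEPRIO, falling back to nice level 19 where that policy is not available.

```python
import os
import subprocess
import sys


def run_low_priority(cmd, use_idle_class=False):
    """Run cmd with reduced scheduling priority and return its exit code.

    use_idle_class=True requests the idle scheduling class (assumption:
    os.SCHED_IDLE stands in for BFS's SCHED_IDLEPRIO); otherwise the
    child runs under the normal policy at nice level 19.
    """
    def demote():
        # Runs in the child process just before the command is executed.
        if use_idle_class and hasattr(os, "SCHED_IDLE"):
            os.sched_setscheduler(0, os.SCHED_IDLE, os.sched_param(0))
        else:
            os.nice(19)  # raise niceness by 19; the kernel caps it at 19

    return subprocess.run(cmd, preexec_fn=demote, check=False).returncode


if __name__ == "__main__":
    # Example: a trivial child process stands in for a real compile job.
    code = run_low_priority([sys.executable, "-c", "pass"])
    print("exit code:", code)
```

On a uniprocessor machine the nice-19 path matches what the comment describes; with several CPUs, `use_idle_class=True` is closer to running the build @ SCHED_IDLEPRIO.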

    Replies
    1. Fair scheduling means all tasks get an appropriate amount of time to run. CFS implements this more correctly than BFS. But Ingo forgot that the algorithm CFS uses already does all the work correctly, and he added a lot of things similar to the O(1) scheduler. These run counter to the beautiful algorithm CFS uses and heavily hurt interactivity. BFS has not got the concept of fairness right, but BFS is simple, it can make sure that every task is scheduled within a constant time, and it can be used as a realtime scheduler.

    2. Fair scheduling means tasks of equal priority are given equal amounts of CPU time. When there is an overload in progress, the CPU is overcontended and *none* of the tasks will receive the CPU time that they require.

      As Martin said before:

      'chen, you probably misunderstand the term fairness. if your interactive task competes with 2*n tasks of the same priority, it only gets n/(2*n+1) cpu time. if you want to favour the interactive task you are actually asking for unfairness.'

      That is also my understanding of it.

      BFS is rigidly fair. I don't understand how you can say that a scheduler which *is* completely fair hasn't got the concept of fairness correctly. That statement completely defies logic. It's either fair or it isn't. BFS *is* fair, CFS not so.

  10. Mainline has __schedule() in various functions instead of schedule().
    (preempt_schedule, preempt_schedule_irq, ...)
    Maybe this is the reason for the freezes.
