
Monday, 24 October 2016

Interbench benchmarks for MuQSS 116

As mentioned in my previous post, I recently upgraded interbench, a benchmark application I wrote to assess perceptible latency under various loads. The updates make the results meaningful on today's larger-RAM, multicore machines, where the load scales accordingly.

The results for mainline 4.8.4 and 4.8.4-ck4 on a multithreaded hexcore (init 1) can be found here:
 http://ck.kolivas.org/patches/muqss/Benchmarks/20161024/
and are copied below. This machine has no swap, so the "memload" test was not performed. It is a 3.6GHz hexcore with 64GB RAM and fast Intel SSDs, so showing any difference on it is nice. To make the results easier to read, I've highlighted them in colours similar to those in the throughput benchmarks I posted previously: blue means within 1% of each other, red means significantly worse and green significantly better.
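The colour rule can be sketched in a few lines of Python. This is only an illustration, not the script used to produce the highlighting: the function name `classify`, the choice to measure the 1% band relative to the larger of the two values, and the choice to colour the MuQSS result relative to mainline are all assumptions.

```python
def classify(mainline, muqss, higher_is_better, tolerance=0.01):
    """Return a colour for the MuQSS result relative to mainline.

    Assumption: "within 1%" is taken as a relative difference measured
    against the larger of the two values. The post does not spell this out.
    """
    largest = max(abs(mainline), abs(muqss))
    if largest == 0:
        return "blue"  # both values are zero, so they are identical
    if abs(mainline - muqss) / largest <= tolerance:
        return "blue"  # inside the noise band
    if higher_is_better:
        muqss_wins = muqss > mainline
    else:
        muqss_wins = muqss < mainline
    return "green" if muqss_wins else "red"


# "% Deadlines Met" for Video under Burn load (higher is better).
print(classify(7.62, 81.6, higher_is_better=True))   # green
# "% Desired CPU" for X under Write load (higher is better).
print(classify(97.4, 97.7, higher_is_better=True))   # blue
```

Latency columns would be passed with `higher_is_better=False`, since a lower latency is the better result.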


Load set to 12 processors

Using 4008580 loops per ms, running every load for 30 seconds
Benchmarking kernel 4.8.4 at datestamp 201610242116
Comment: cfs

--- Benchmarking simulated cpu of Audio in the presence of simulated ---
Load Latency +/- SD (ms)  Max Latency   % Desired CPU  % Deadlines Met
None       0.1 +/- 0.1        0.1           100         100
Video      0.0 +/- 0.0        0.1           100         100
X          0.1 +/- 0.1        0.1           100         100
Burn       0.0 +/- 0.0        0.0           100         100
Write      0.1 +/- 0.1        0.1           100         100
Read       0.1 +/- 0.1        0.1           100         100
Ring       0.0 +/- 0.0        0.1           100         100
Compile    0.0 +/- 0.0        0.0           100         100

--- Benchmarking simulated cpu of Video in the presence of simulated ---
Load Latency +/- SD (ms)  Max Latency   % Desired CPU  % Deadlines Met
None       0.1 +/- 0.1        0.1           100         100
X          0.1 +/- 0.1        0.1           100         100
Burn      17.4 +/- 19.5      46.3            87        7.62
Write      0.1 +/- 0.1        0.1           100         100
Read       0.1 +/- 0.1        0.1           100         100
Ring       0.0 +/- 0.0        0.0           100         100
Compile   17.4 +/- 19.1      45.9          89.5        6.07

--- Benchmarking simulated cpu of X in the presence of simulated ---
Load Latency +/- SD (ms)  Max Latency   % Desired CPU  % Deadlines Met
None       0.0 +/- 0.1        1.0           100        99.3
Video     13.4 +/- 25.8      68.0          36.2        27.3
Burn      94.4 +/- 127.0    334.0          12.9        4.37
Write      0.1 +/- 0.4        4.0          97.4        96.4
Read       0.1 +/- 0.7        4.0          96.2        93.8
Ring       0.5 +/- 1.9        9.0          89.3        84.9
Compile   93.3 +/- 127.7    333.0          12.2         4.2

--- Benchmarking simulated cpu of Gaming in the presence of simulated ---
Load Latency +/- SD (ms)  Max Latency   % Desired CPU
None       0.0 +/- 0.2        2.2           100
Video      7.9 +/- 21.4      69.3          92.7
X          1.4 +/- 1.6        2.7          98.7
Burn     136.5 +/- 145.3    360.8          42.3
Write      1.8 +/- 2.0        4.4          98.2
Read      11.2 +/- 20.3      47.8          89.9
Ring       8.1 +/- 8.1        8.2          92.5
Compile  152.3 +/- 166.8    346.1          39.6

Load set to 12 processors

Using 4008580 loops per ms, running every load for 30 seconds
Benchmarking kernel 4.8.4-ck4+ at datestamp 201610242047
Comment: muqss116-int1

--- Benchmarking simulated cpu of Audio in the presence of simulated ---
Load Latency +/- SD (ms)  Max Latency   % Desired CPU  % Deadlines Met
None       0.0 +/- 0.0        0.0           100         100
Video      0.0 +/- 0.0        0.0           100         100
X          0.0 +/- 0.0        0.0           100         100
Burn       0.0 +/- 0.0        0.0           100         100
Write      0.0 +/- 0.0        0.1           100         100
Read       0.0 +/- 0.0        0.0           100         100
Ring       0.0 +/- 0.0        0.0           100         100
Compile    0.0 +/- 0.1        0.8           100         100

--- Benchmarking simulated cpu of Video in the presence of simulated ---
Load Latency +/- SD (ms)  Max Latency   % Desired CPU  % Deadlines Met
None       0.0 +/- 0.0        0.0           100         100
X          0.0 +/- 0.0        0.0           100         100
Burn       3.1 +/- 7.2       17.7           100        81.6
Write      0.0 +/- 0.0        0.5           100         100
Read       0.0 +/- 0.0        0.0           100         100
Ring       0.0 +/- 0.0        0.0           100         100
Compile   10.5 +/- 13.3      19.7           100        37.3

--- Benchmarking simulated cpu of X in the presence of simulated ---
Load Latency +/- SD (ms)  Max Latency   % Desired CPU  % Deadlines Met
None       0.0 +/- 0.1        1.0           100        99.3
Video      3.7 +/- 12.1      56.0            89        82.6
Burn      47.2 +/- 66.5     142.0          16.7        7.58
Write      0.1 +/- 0.5        5.0          97.7        95.7
Read       0.1 +/- 0.7        4.0          95.6        93.5
Ring       0.5 +/- 1.9       12.0          89.8          86
Compile   55.9 +/- 77.6     196.0          18.6        8.12

--- Benchmarking simulated cpu of Gaming in the presence of simulated ---
Load Latency +/- SD (ms)  Max Latency   % Desired CPU
None       0.0 +/- 0.1        0.5           100
Video      1.2 +/- 1.2        1.8          98.8
X          1.4 +/- 1.6        2.9          98.7
Burn     130.9 +/- 132.1    160.3          43.3
Write      2.4 +/- 2.5        7.0          97.7
Read       3.2 +/- 3.2        3.6          96.9
Ring       5.9 +/- 6.2       10.3          94.4
Compile  146.5 +/- 149.3    209.2          40.6

As you can see, in the only cases where mainline is better, the difference is less than 1%, which is within the margin of noise. MuQSS meets more deadlines, gives the benchmarked task more of its desired CPU and has substantially lower max latencies.
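For anyone who wants to compare the two runs programmatically rather than by eye, here is a minimal Python sketch that parses one result row from the output above. The regular expression and the name `parse_row` are my own illustration, written against the row layout shown in this post; they are not part of interbench.

```python
import re

# One result row, e.g. "Burn      17.4 +/- 19.5      46.3       87     7.62".
# The Gaming tables have no "% Deadlines Met" column, so it is optional.
ROW = re.compile(
    r"^(?P<load>\w+)\s+"
    r"(?P<latency>[\d.]+) \+/- (?P<sd>[\d.]+)\s+"
    r"(?P<max_latency>[\d.]+)\s+"
    r"(?P<desired_cpu>[\d.]+)"
    r"(?:\s+(?P<deadlines_met>[\d.]+))?\s*$"
)


def parse_row(line):
    """Return (load, {column: value}) or None if the line is not a result row."""
    match = ROW.match(line.strip())
    if match is None:
        return None  # header, banner or blank line
    fields = match.groupdict()
    load = fields.pop("load")
    return load, {k: float(v) for k, v in fields.items() if v is not None}


# Video under Burn load, copied from the two runs above.
_, cfs = parse_row("Burn      17.4 +/- 19.5      46.3            87        7.62")
_, muqss = parse_row("Burn       3.1 +/- 7.2       17.7           100        81.6")

for column in ("max_latency", "desired_cpu", "deadlines_met"):
    print(f"{column}: mainline {cfs[column]} vs MuQSS {muqss[column]}")
```

Header lines such as "Load Latency +/- SD (ms) ..." do not match the pattern and come back as `None`, so a whole results file can be fed through `parse_row` line by line.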

I'm reasonably confident that I've maintained the interactivity people have come to expect from BFS in the transition to MuQSS, and the data above supports that.

Enjoy!
Please enjoy!
-ck