Comments on -ck hacking: linux-5.2-ck1, MuQSS version 0.193 for linux-5.2
ck (http://www.blogger.com/profile/02904761195451530213), updated 2024-03-28T15:50:13.644+11:00

[2019-10-13T01:19:02.507+11:00]
Thank you very much, sir.
-- Anonymous

[2019-10-12T12:37:38.194+11:00]
Might as well post the full command in case the above link is updated:

sed -i -e '/^-CFLAGS/ s,+=,:=,' -i -e '/^+CFLAGS/ s,+=,:=,' patch-5.2-ck1
-- Anonymous

[2019-10-12T12:33:39.036+11:00]
linux-ck PKGBUILD on Arch User Repository has a oneliner fix you can run against the patch-5.2-ck1 file: https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=linux-ck#n138

(Replace the './"${_ckpatch}"' at the end with the location of the patch, and it should apply against >=5.2.18 afterwards)
-- Anonymous

[2019-10-08T11:13:22.994+11:00]
Any patch for the inexperienced?
-- Anonymous

[2019-10-03T08:49:51.527+10:00]
Thanks for reporting.
I am hoping for 5.3-ck1 to arrive soon.
-- Anonymous

[2019-10-02T22:49:54.611+10:00]
A commit (https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/tools/objtool?h=v5.2.18&id=47af17950b03b748eea68ad7613f8d8b4c688d45) in the 5.2.18 patch conflicts with 5.2-ck1 (due to https://github.com/ckolivas/linux/commit/40846db6244abc4696bcad4f889016e1952630f4).

It should be simple enough to fix by hand, but I thought I'd mention it here, if anyone's looking for an explanation.
-- Anonymous

[2019-09-30T16:58:26.826+10:00]
Sveinar, this is interesting, thanks for the info.
I'll go and check PDS, I thought it was dead :) I switched to MuQSS because BMQ had teething issues and PDS was not updated to newer kernels.
But MuQSS was acting weird with runqueues and llc, so I hopefully fixed it (at least I tried), it's working well now.

Nevertheless, for the sake of interest, I'll check my usual stuff on PDS similarly to how I tested my patches to MuQSS, let's see how it performs.
I need to throw a vanilla kernel into the mix as well.

BR, Eduardo
-- Anonymous

[2019-09-30T07:03:50.964+10:00]
Better.. hmm..
I used PDS mostly with the 4.x kernel branch, and most things worked very well. Then 5.x came and BMQ came, but there were quite a few "starting issues" with BMQ, so I ended up with MuQSS.

I feel PDS is working well for me, but to really compare I would need to do all 3 for 5.3 I guess... but I dunno if I have it in me to fiddle with it.

Perhaps it could be an idea for Phoronix to do some comparison tests with the Phoronix test suite? Would be a fun experiment to suggest. Possibly also comparing AMD/Intel and the different schedulers.

As with all things: "The absolutely best color is green!"

Meaning: It's all in the eye of the beholder :)
-- Sveinar Søpler (https://www.blogger.com/profile/18401720133659243541)

[2019-09-30T06:58:31.300+10:00]
Thank you very much, sir.
-- Anonymous

[2019-09-30T06:55:44.159+10:00]
https://github.com/SveSop/kernel_cybmod

0001 and 0002 are the PDS patches for 5.3 and are from the TK-Glitch git repo.
-- Sveinar Søpler (https://www.blogger.com/profile/18401720133659243541)

[2019-09-30T06:55:22.481+10:00]
As I mentioned, all of my patches are in my Google Drive, address as usual: https://drive.google.com/drive/folders/1MxUcptaOgPbPgJoUdeq0GkEuoeyaRHdG

Sveinar, does PDS work better than BMQ and MuQSS for you?

BR, Eduardo
-- Anonymous

[2019-09-30T06:41:30.682+10:00]
Interesting.
Care to share?
-- Anonymous

[2019-09-30T05:38:03.478+10:00]
Currently I am on 5.3 and a reworked PDS scheduler. This works quite well atm, but I will perhaps give it another go once -ck/MuQSS is put up for 5.3.

Somewhat limited timewise due to some IRL stuff I am working on atm.
-- Sveinar Søpler (https://www.blogger.com/profile/18401720133659243541)

[2019-09-29T04:56:14.258+10:00]
So, I had some free time and I have made nice progress regarding the Diablo3 stutter on Ryzen using smt. It seems that I have fixed it :)
There is a 0007 patch on Google Drive for someone to try out.
So, with this patch I'm not exactly sure how it behaves on Intel. I have changed the bits which select the best CPU to schedule a task to; it now selects the CPU a bit more accurately.
I would like Con to look at it, as I don't know why it was the way it was before: CPU cache busyness was not checked in all cases, only in sibling locality, whereas thread busyness is always checked (my guess is that this is because a sibling is not exactly a full core, and a task would not run as fast on a sibling as on a normal core, if one is free).
In addition, in this patch I have switched to using the llc CPU map to check whether CPU caches are busy in all rq sharing cases, which should not change anything on Intel.

I have performance comparisons as well: D3 behaves well; Valley and MHO show better results, especially with smt; compilations are down a fair bit since the previous patches and seem to be on par with the numbers from vanilla muqss; cs:go numbers are up with smt and a little down using anything else.
So after this patch smt seems to be best overall :)

If Sveinar and Anonymous are still around, please give this patch a bit of testing and report back how it behaves for you. Thanks. I have not booted this up on Intel, though :)

BR, Eduardo
-- Anonymous

[2019-09-25T14:41:36.850+10:00]
I had some time yesterday, installed and ran the cs:go "FPS benchmark" map, and the results surprised me.
This is one of the first times I have seen mc be the slowest (at least on Ryzen):
LLC: Average framerate: 238.53
MC: Average framerate: 229.29
SMT: Average framerate: 238.79
NONE: Average framerate: 236.86

All went smoothly, no stuttering and such, smt and llc were the same.
Settings were autodetected to max, I'll try lowering them next time I run the tests.
Strange results, but they are repeatable.

BR, Eduardo
-- Anonymous

[2019-09-24T18:08:19.654+10:00]
Thanks, now it's clear about LLC. Interestingly, in this case the LLC numbers are 0 and 2, not 0 and 1 as in the case of Ryzen...
RQ and CPU orders seem to be right, localities are slightly different, but everything sort of looks ok.
At least theoretically I can see how small improvements could be observed in this CPU topology.

BR, Eduardo
-- Anonymous

[2019-09-24T07:36:28.433+10:00]
I realize that I poorly phrased "2 dies with split llc + single memory controller". I meant that it's 2 dies, each with its own shared LLC (in this case a large shared L2 cache), with an off-die memory controller.

The Core 2 Quad and its Xeon counterparts are advertised as having a 12MB L2 cache, but in reality it is 2x6MB.
-- Anonymous

[2019-09-24T07:00:41.737+10:00]
as requested

[ 0.519762] MuQSS possible/present/online CPUs: 4/4/4
[ 0.519769] MuQSS locality CPU 0 to 0: 0
[ 0.519769] MuQSS locality CPU 0 to 1: 3
[ 0.519770] MuQSS locality CPU 0 to 2: 3
[ 0.519771] MuQSS locality CPU 0 to 3: 2
[ 0.519771] MuQSS locality CPU 1 to 0: 3
[ 0.519772] MuQSS locality CPU 1 to 1: 0
[ 0.519772] MuQSS locality CPU 1 to 2: 2
[ 0.519773] MuQSS locality CPU 1 to 3: 3
[ 0.519773] MuQSS locality CPU 2 to 0: 3
[ 0.519774] MuQSS locality CPU 2 to 1: 2
[ 0.519774] MuQSS locality CPU 2 to 2: 0
[ 0.519775] MuQSS locality CPU 2 to 3: 3
[ 0.519775] MuQSS locality CPU 3 to 0: 2
[ 0.519776] MuQSS locality CPU 3 to 1: 3
[ 0.519777] MuQSS locality CPU 3 to 2: 3
[ 0.519777] MuQSS locality CPU 3 to 3: 0
[ 0.519778] MuQSS sharing MC runqueue from CPU 1 to CPU 2
[ 0.519780] MuQSS sharing MC runqueue from CPU 0 to CPU 3
[ 0.519788] MuQSS CPU 0 llc 0 RQ order 0 RQ 0 llc 0
[ 0.519789] MuQSS CPU 0 llc 0 RQ order 1 RQ 1 llc 2
[ 0.519790] MuQSS CPU 1 llc 2 RQ order 0 RQ 1 llc 2
[ 0.519790] MuQSS CPU 1 llc 2 RQ order 1 RQ 0 llc 0
[ 0.519791] MuQSS CPU 2 llc 2 RQ order 0 RQ 1 llc 2
[ 0.519792] MuQSS CPU 2 llc 2 RQ order 1 RQ 0 llc 0
[ 0.519792] MuQSS CPU 3 llc 0 RQ order 0 RQ 0 llc 0
[ 0.519793] MuQSS CPU 3 llc 0 RQ order 1 RQ 1 llc 2
[ 0.519794] MuQSS CPU 0 llc 0 CPU order 0 RQ 0 llc 0
[ 0.519794] MuQSS CPU 0 llc 0 CPU order 1 RQ 3 llc 0
[ 0.519795] MuQSS CPU 0 llc 0 CPU order 2 RQ 1 llc 2
[ 0.519796] MuQSS CPU 0 llc 0 CPU order 3 RQ 2 llc 2
[ 0.519797] MuQSS CPU 1 llc 2 CPU order 0 RQ 1 llc 2
[ 0.519797] MuQSS CPU 1 llc 2 CPU order 1 RQ 2 llc 2
[ 0.519798] MuQSS CPU 1 llc 2 CPU order 2 RQ 3 llc 0
[ 0.519799] MuQSS CPU 1 llc 2 CPU order 3 RQ 0 llc 0
[ 0.519799] MuQSS CPU 2 llc 2 CPU order 0 RQ 2 llc 2
[ 0.519800] MuQSS CPU 2 llc 2 CPU order 1 RQ 1 llc 2
[ 0.519801] MuQSS CPU 2 llc 2 CPU order 2 RQ 0 llc 0
[ 0.519801] MuQSS CPU 2 llc 2 CPU order 3 RQ 3 llc 0
[ 0.519802] MuQSS CPU 3 llc 0 CPU order 0 RQ 3 llc 0
[ 0.519803] MuQSS CPU 3 llc 0 CPU order 1 RQ 0 llc 0
[ 0.519803] MuQSS CPU 3 llc 0 CPU order 2 RQ 2 llc 2
[ 0.519804] MuQSS CPU 3 llc 0 CPU order 3 RQ 1 llc 2
[ 0.519804] MuQSS runqueue share type LLC total runqueues: 2
[ 1.500417] MuQSS CPU scheduler v0.193 by Con Kolivas.
-- Anonymous

[2019-09-24T05:57:50.200+10:00]
Just the reference about "slapping two dual cores together and call it a quad": https://www.extremetech.com/computing/49528-core-2-quad-q6600-four-cores-for-the-masses/2

BR, Eduardo
-- Anonymous

[2019-09-24T03:34:37.445+10:00]
If I remember correctly, there were times when Intel created quad core CPUs by slapping two dual cores in the same package and calling it a quad core :)
I'm not exactly sure how they organized LLC in that case, but it may well be that there were two LLCs.

Can Anonymous please pastebin the results of "journalctl -b | grep -i muq"? Then we'll be more sure how the kernel sees that particular CPU.

On Ryzen, LLC sharing (two queues, last 0006 patch) did not give any measurable performance boost or degradation for compilation tasks on my Ryzen 1700, but maybe Threadripper CPUs would get a boost, because cores / CCX etc. are organized in a slightly different manner (and I don't know exactly how either) than on my Ryzen. I have no access to TR, so I can not verify.
TR even has two NUMA nodes.
I still need to test more stuff on that last LLC sharing patch.

If Anonymous shares the output, I could at least theoretically guess whether that may or may not help. We don't even know how cores are organized in that CPU.

BR, Eduardo
-- Anonymous

[2019-09-23T22:29:01.334+10:00]
Yes, I'm not sure how performance improvements from those changes on that CPU are possible either.
-- ck (https://www.blogger.com/profile/02904761195451530213)

[2019-09-23T22:21:52.263+10:00]
So is this with a "Core 2 Quad" processor? And this processor does not share any l2 cache?

Ref. https://ark.intel.com/content/www/us/en/ark/products/33924/intel-core-2-quad-processor-q9550-12m-cache-2-83-ghz-1333-mhz-fsb.html, this processor is described with: "CPU Cache is an area of fast memory located on the processor. Intel® Smart Cache refers to the architecture that allows all cores to dynamically share access to the last level cache."

This is the same wording used for an i7 8700K as well. I dunno what this implies, or if I am viewing the right processor tho. But does using the patches with "llc" show to any degree that it is actually differentiating this?

Should it show 2 runqueues when such a "separated" cache is used? (Cos I am fairly sure that it shows 1 for my 8700K when I tried).

I am just asking.
I found no/little difference between llc and mc for the benchies I did, but I am willing to revisit this if it SHOULD be a different behavior.
-- Sveinar Søpler (https://www.blogger.com/profile/18401720133659243541)

[2019-09-23T17:57:09.774+10:00]
I did some benchmarking using hl2 lost coast (because it is cpu bound), average of 3 runs:

"stock" ck patches rqshare=mc avg fps 269.85

ck+eduardo's patches rqshare=mc avg fps 273.23 +1.2%

ck+eduardo's patches rqshare=llc avg fps 277.45 +2.8%

As you can see, eduardo's patches definitely improve performance on my ancient setup.
I haven't experienced any regressions or bugs.
-- Anonymous

[2019-09-23T15:07:00.480+10:00]
specifically between the stock multicoresiblings vs the mc-llc mode added by the patch, which ends up creating 2 run queues instead of 1.

Increased frame-rate in multiple opengl applications. It's not a lot, maybe 2-4%. It makes sense, since the latency penalty between cores that don't share an l2 cache is quite high on this generation of cpu (as high as a 50% increase, according to this benchmark anyway: https://github.com/ajakubek/core-latency)
-- Anonymous

[2019-09-23T00:24:49.041+10:00]
I wonder how the 5.3 kernel with the new "utilization clamping support" turns out vs. needing to use things like MuQSS/-ck patches?

Sounds like it WILL give a performance boost to things like gaming and the like tho..
-- Sveinar Søpler (https://www.blogger.com/profile/18401720133659243541)
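
For anyone who wants to read the cache topology out of a boot log like the one posted in the 2019-09-24T07:00:41 comment above, here is a minimal Python sketch that groups CPUs by the llc id printed on the "MuQSS CPU n llc m ..." lines. It is not part of MuQSS or the -ck patches; the names `CPU_LLC`, `group_cpus_by_llc` and `SAMPLE` are made up for illustration, and it assumes the message format stays exactly as in that log.

```python
import re
from collections import defaultdict

# Assumed line shape, copied from the log in this thread:
#   "[ 0.519788] MuQSS CPU 0 llc 0 RQ order 0 RQ 0 llc 0"
# Captures the CPU number and the llc id that directly follows it.
CPU_LLC = re.compile(r"MuQSS CPU (\d+) llc (\d+) ")


def group_cpus_by_llc(log_text):
    """Return {llc_id: sorted list of CPU numbers} (hypothetical helper)."""
    groups = defaultdict(set)
    for line in log_text.splitlines():
        match = CPU_LLC.search(line)
        if match:
            cpu, llc = int(match.group(1)), int(match.group(2))
            groups[llc].add(cpu)
    return {llc: sorted(cpus) for llc, cpus in groups.items()}


# Four lines taken from the log above, plus one line that must be ignored.
SAMPLE = """\
[ 0.519788] MuQSS CPU 0 llc 0 RQ order 0 RQ 0 llc 0
[ 0.519790] MuQSS CPU 1 llc 2 RQ order 0 RQ 1 llc 2
[ 0.519791] MuQSS CPU 2 llc 2 RQ order 0 RQ 1 llc 2
[ 0.519792] MuQSS CPU 3 llc 0 RQ order 0 RQ 0 llc 0
[ 1.500417] MuQSS CPU scheduler v0.193 by Con Kolivas.
"""

if __name__ == "__main__":
    print(group_cpus_by_llc(SAMPLE))
```

On those sample lines it groups CPUs 0 and 3 under llc 0 and CPUs 1 and 2 under llc 2, which lines up with the two "sharing MC runqueue" messages in the same log. Feeding it the output of "journalctl -b | grep -i muq" should work the same way, as long as the kernel prints the lines in this form.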