Hardlimit Test Bench
-
@kynes There is a clear difference here, and I am seeing it in my case too. In the end I suppose I will have to apply some kind of truncated mean: discard the two highest and the two lowest values and average the remaining 6, because the outliers at the beginning clearly distort the measurement.
I am also going to review the synchronization mechanism, to see if the fault was there.
By the way, be careful with the 256 threads: if the system crashes and the processes lose communication with each other, they can be left waiting forever while consuming 100% of every core, and you will have to either close each process manually or restart the PC.
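As a rough illustration of the truncated mean mentioned in this post, here is a minimal Python sketch. The function name `truncated_mean`, the `trim` parameter and the ten sample scores are made up for the example; they are not taken from the program.

```python
def truncated_mean(samples, trim=2):
    """Average the samples after dropping the `trim` lowest and `trim` highest."""
    if len(samples) <= 2 * trim:
        raise ValueError("not enough samples to trim")
    kept = sorted(samples)[trim:len(samples) - trim]
    return sum(kept) / len(kept)


# Hypothetical scores for one test: ten samples, the first two inflated.
scores = [140, 131, 100, 101, 99, 100, 102, 98, 100, 97]

plain = sum(scores) / len(scores)   # pulled upwards by the initial peak
trimmed = truncated_mean(scores)    # average of the 6 central values

print(f"plain mean:     {plain:.1f}")
print(f"truncated mean: {trimmed:.1f}")
```

With these invented numbers the plain mean is dragged up by the two early outliers, while the truncated mean stays close to the typical sample.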
-
@kynes said in Hardlimit Test Bench:
With 128 threads I think I have the world record in multithreading:
Well, and if I don't have it, I'll test it on 256 threads to see what happens

The synchronization has failed there. Basically, batches of threads are running at different times and their scores are being added up as if they had all run at once.
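To make the failure mode concrete, here is a small Python sketch of one way to start workers together and to check afterwards that they really ran at the same time. It is only an assumption about how such a check could look: the real program uses processes and its actual mechanism is not described in this thread. The names `run_synchronized` and `all_overlap` are hypothetical. The barrier timeout also means workers give up instead of waiting forever if one of them never arrives.

```python
import threading
import time


def run_synchronized(n_workers, work, timeout=5.0):
    """Start n_workers threads together; return each one's (start, end) window."""
    barrier = threading.Barrier(n_workers)
    windows = [None] * n_workers

    def worker(i):
        try:
            # Everyone waits here so that all workers start at the same moment.
            barrier.wait(timeout=timeout)
        except threading.BrokenBarrierError:
            return  # give up instead of waiting forever
        start = time.perf_counter()
        work()
        windows[i] = (start, time.perf_counter())

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return windows


def all_overlap(windows):
    """True only if every worker ran and all time windows share a common instant."""
    if any(w is None for w in windows):
        return False
    latest_start = max(start for start, _ in windows)
    earliest_end = min(end for _, end in windows)
    return latest_start < earliest_end


windows = run_synchronized(4, lambda: time.sleep(0.01))
print("safe to add the scores:", all_overlap(windows))
```

Adding per-thread scores is only meaningful when `all_overlap` holds; if some workers ran in a later batch, the sum overstates what the machine can do at once.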
-
Some conclusions I draw from this:
-
When the test bench runs with 4 times as many threads as the processor has, the program fails and gives absurd results. This does not worry me, because the limit will be set at double the processor's threads.
-
When it runs with more threads than the processor has (but short of an absurd number), spurious peaks usually appear in the first sample. Here are results with double the threads on three different models:
Core i7-6820HQ

Pentium N3540

Core i5-7300HQ

The fact that this happens on three different models makes it clear that it is a general behavior. Curiously, these peaks are seen in tests 1, 2 and 4, but not in test 3. Since test 3 does not reproduce the behavior, a program bug seems unlikely to me, but it cannot be ruled out yet.
Below are the same models with a number of threads equal to the CPU's thread count:
-
Core i7-6820HQ

-
Pentium N3540

-
Core i5-7300HQ

If there are peaks, they are practically imperceptible.
- On the i5 and i7, a sustained improvement is seen throughout the test. There are only two explanations: either the program is getting a larger share of CPU time, or the pipeline is being better utilized. The Core i7-6820HQ is a PC with many programs running in the background along with an antivirus. The Pentium N3540 runs Windows 10 with nothing in the background and no antivirus. Perhaps that is why, on the Pentium, doubling the number of threads does not give a sustained improvement.
In general
What distorts the result shown in the program with double the threads is the initial peak. The problem is that I don't know why it occurs. If it were the program, I would expect a peak in all tests or a peak at the beginning and another at the end, but it doesn't happen that way. Could it be a trick of the cache? Honestly, I have no idea. But it is strange and the program will need to be reviewed.
-
-
And why don't we remove the option to specify the number of threads and force it to always run with the number of cores available on the CPU? I say this to avoid unnecessary differences between users.
-
@krampak Initially, the reason for having the freedom to specify the number of threads was in case HT/SMT detection failed. So far, that hasn't happened once. So there's no reason to keep it.
From the point of view of measuring processor performance, choosing the number of threads is useful for seeing how a processor performs at half load, which is actually how processors are used most of the time. But if it is already hard to get validations with the default configuration, it is much harder with slightly exotic ones.
One solution, as you say, is either to remove the option to choose the number of threads, or perhaps to cap the maximum at the processor's thread count so that the option stays available.
Another possibility (which is complementary) is to do a truncated mean, something that will eventually be applied because this would correct all the results that have been sent so far.
And another possibility is to directly ignore the first sample; a simple but not elegant solution.
How to calculate the final score is something I have been thinking about for a long time (hence the program and the central use different criteria). The truncated mean is winning because it avoids this type of problem of unknown origin and because it filters the result on PCs with moderate background load.
The test bench has several mechanisms to prevent tampering with results, and at the beginning of development that was the part that took the most hours. Fortunately, this failure can be fixed retroactively. One thing is clear: one of the objectives is to measure the real performance of the machine, without any particular program configuration giving a performance advantage that does not exist.
I say this to make it clear that this is not a trivial failure and that it will be solved. Until then, I remain open to suggestions.
-
Let me see if I understand correctly: the program run with its defaults shows reliable results, but when you change the number of threads it should use, it is prone to showing "unusual" results...
If that is the case, personally I would only allow results obtained with the defaults to be validated, at least until the issue is resolved... the last thing we need is an army of braggarts cheating at solitaire and, while they are at it, falsifying the ranking (for me this is the most serious part; after all, I think the whole point is for the table to be reliable).
The option to choose the number of threads can be maintained, warning that it may give erroneous results and that they are not "official", for those who like tinkering.
-
Most likely by tomorrow or the day after (from then on, it depends on how long Microsoft takes to certify it), version 1.4 will be ready, with the falsified scores issue fixed, among other changes.
If any of you have a current and powerful processor, I could use a screenshot of the CPU tab to put on the Store page, since I only have a few older PCs around here.
-
This is the most powerful thing I have in my house, I don't know if it's what you're looking for.

-
@whoololon Thanks for the image. In the end I didn't put it because it seems to be very compressed and doesn't look good in the Store.
Regarding the program, version 1.4 is already available. You can find the details in the first message of this thread. In essence, among other things, the falsified scores issue has been corrected and some changes have been made to the interface.
For now, you can still choose up to twice the processor's thread count, but only on models without HT/SMT.
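The thread limit described for version 1.4 can be summarized in a few lines of Python. This is only a sketch of the rule as stated in this post; the function names and the way SMT support is passed in are assumptions, not the program's actual code.

```python
def max_selectable_threads(logical_threads, has_smt):
    """Upper limit for the thread selector under the rule described for 1.4."""
    # Assumption: with HT/SMT the cap is the processor's own thread count;
    # without it, up to double that count is still allowed.
    return logical_threads if has_smt else 2 * logical_threads


def clamp_threads(requested, logical_threads, has_smt):
    """Force a requested thread count into the allowed range."""
    limit = max_selectable_threads(logical_threads, has_smt)
    return max(1, min(requested, limit))


# Hypothetical examples: a 4-thread CPU without SMT and an 8-thread CPU with SMT.
print(clamp_threads(16, logical_threads=4, has_smt=False))
print(clamp_threads(16, logical_threads=8, has_smt=True))
```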
-
The latest version is very cool!! It looks more professional and the reduced load time is noticeable.
-
And to celebrate, we inaugurate it with a radiant i3-3120M.
Edit: By the way, SmartScreen is still popping up, at least on W8.1.
-
@krampak said in Hardlimit Test Bench:
The latest version is very cool!! It looks more professional and the reduced load time is noticeable.
Thanks. I'm glad you can see the difference in the load time.
@whoololon said in Hardlimit Test Bench:
And to celebrate, we inaugurate it with a radiant i3-3120M.
Edit: By the way, SmartScreen is still popping up, at least on W8.1.
Perfect, it's been a while since anything new came in.
SmartScreen is still popping up on Windows 10 as well. To be honest, it's taking longer than what I've read elsewhere suggested. Maybe it hasn't been downloaded enough times yet, or maybe it really does matter that it is downloaded from Internet Explorer or Edge. I hope it disappears in the next few days.
-
-
@Xevipiu Is it an engineering sample? Perhaps a Core i9-9900K/S?
-
@cobito said in Hardlimit test bench:
@Xevipiu Is it an engineering sample? Perhaps a Core i9-9900K/S?
No, it's from the 10th Gen series, an i9-10980HK; it's capped, but it holds its own
-
@Xevipiu said in Hardlimit test bench:
@cobito said in Hardlimit test bench:
@Xevipiu Is it an engineering sample? Perhaps a Core i9-9900K/S?
No, it's from the 10th Gen series, an i9-10980HK; it's capped, but it holds its own
Ok, I was confused by the signature, because yours returns 906ED and every CPU with that signature is a Coffee Lake.
Actually it makes sense given the results, because it beats the Core i9-9900K by 15-20% in multi-thread.
It is also in a very good position in single-thread. Without the memory test (where Zen 2 usually does well) it would be first in the single-thread ranking.
-
@cobito said in Hardlimit Test Bench:
@Xevipiu said in Hardlimit Test Bench:
@cobito said in Hardlimit Test Bench:
@Xevipiu Is it an engineering sample? Perhaps a Core i9-9900K/S?
No, it's from the 10th Gen series, an i9-10980HK; it's capped, but it holds its own
Ok, I was confused by the signature, because yours returns 906ED and every CPU with that signature is a Coffee Lake.
Actually it makes sense given the results, because it beats the Core i9-9900K by 15-20% in multi-thread.
It is also in a very good position in single-thread. Without the memory test (where Zen 2 usually does well) it would be first in the single-thread ranking.
Bear in mind that the memory is capped at 2140 or 2400 MHz, which is a big handicap; the Zen chips or even the 9900K itself beat me
Here is the microcode family of the "ES" processors, for your database

-
Version 1.5.0 of the program has just been released. It mainly brings changes to the memory-related information:
· It now detects the "form factor" of memory integrated in the processor package (SoC).
· It now detects the memory types HBM, HBM2, DDR5 and LPDDR5.
· Fixed a bug where, under certain memory configurations, the memory type, form factor or frequency was not shown in the program.
· The brand and part number of the memory are now shown in the program.
· *The brand and part number of the RAM are also sent in the validation process.
· *In addition, the available instruction set is sent in the validation, along with hypervisor detection (execution on a virtual machine).
*The information related to the memory and the available instruction set will be shown in the central soon. This will also lead to the creation of a memory database alongside the memory test results. As for hypervisor detection, results sent from a virtual machine will soon be excluded from the statistics.
For the moment the executable is available and in the next few hours it will be possible to download/update it from the Microsoft Store.
As usual, if you see something strange, do not hesitate to comment on it.
-
It seems the test bench database has been locked for over a month, so no validations were received during that period. Since yesterday, the validation system is operational again.
-
Histograms have been added to the validated results page. Now, when you validate your result, if there are more than 5 validations in the mode in question, one histogram per test will appear, showing how often scores fall into each of a series of ranges. It also indicates which range your result falls in. Example. They take a little while to load and there are still a couple of things to fix, but in most cases they work correctly.
In addition, Zen 3, Tiger Lake and Rocket Lake are added to the list of architectures, so it is now possible to see the rankings of the CPUs that we have received from said architectures.
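For anyone curious how the per-test histograms described above could be built, here is a minimal Python sketch. It is an illustration under assumptions, not the site's code: the bin count, the equal-width ranges, the function names and the sample scores are all invented; only the "more than 5 validations" rule comes from the post.

```python
def histogram(scores, bins=5, min_validations=5):
    """Count scores per equal-width range; None if there are too few validations."""
    if len(scores) <= min_validations:
        return None  # a histogram only appears with more than 5 validations
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / bins or 1.0  # avoid zero width when all scores are equal
    counts = [0] * bins
    for s in scores:
        # The maximum score would land one past the end, so clamp it into the last bin.
        counts[min(int((s - lo) / width), bins - 1)] += 1
    return lo, width, counts


def bin_of(score, lo, width, bins):
    """Index of the range in which a given result falls."""
    return max(0, min(int((score - lo) / width), bins - 1))


# Hypothetical validated scores for one test and one new result to locate.
scores = [100, 104, 110, 111, 119, 120, 125, 140]
lo, width, counts = histogram(scores, bins=4)
mine = bin_of(118, lo, width, bins=4)

for i, count in enumerate(counts):
    marker = "  <- your result" if i == mine else ""
    print(f"{lo + i * width:6.1f} to {lo + (i + 1) * width:6.1f}: {'#' * count}{marker}")
```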