05-24-2012 02:00 PM
Another aspect of this benchmarking code is it does not max out the CPU while it runs. This indicates that there are periods of time during the benchmark where the code is not executing.
The math assumes that the operations are the only thing that is happening between the start and stop times.
So I suspect the threads are going idle waiting for something, probably from the OS, which would mean the OS plays a part in the difference I am seeing.
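For anyone who wants to see the idle-time effect outside LabVIEW, here is a minimal sketch in Python (not the benchmark VI itself). It compares wall-clock time against CPU time for the same piece of work; the names `measure`, `spin` and `wait` are made up for this illustration. If the CPU time is well below the wall-clock time, the code spent part of the measured interval waiting rather than executing, which is exactly what breaks the "operations are the only thing happening" assumption.

```python
import time


def measure(work):
    """Run work() once and return (wall_seconds, cpu_seconds, busy_fraction)."""
    wall_start = time.perf_counter()   # high-resolution wall clock
    cpu_start = time.process_time()    # CPU time charged to this process
    work()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    busy = cpu / wall if wall > 0 else 0.0
    return wall, cpu, busy


def spin():
    """CPU-bound stand-in for a benchmark loop that never waits."""
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


def wait():
    """Stand-in for code that blocks on something else (OS, GUI refresh)."""
    time.sleep(0.2)


if __name__ == "__main__":
    for name, work in (("spin", spin), ("wait", wait)):
        wall, cpu, busy = measure(work)
        print(f"{name}: wall={wall:.3f}s cpu={cpu:.3f}s busy={busy:.0%}")
```

A busy fraction well under 100% for a run means that the time per operation computed from start and stop times includes waiting, not just the operation.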
I may use a Timed Sequence to force the code onto CPU #2 to see if I can get it to use 100% of that processor.
Ben
05-25-2012 08:17 AM
Interesting. Your Set Value results are anomalously slow compared to mine. The rest seems to scale about the same. I wonder if there is an operating system issue. I am using Windows 7 64-bit, while you are using XP. There are all sorts of driver and OS differences that could be causing it.
I also found the CPU usage interesting. I would expect the Value runs to be below full CPU usage, since they are waiting on the next front panel refresh. But I expected the others to max at least one thread out. There is actually no reason to run multiple iterations, since the high resolution timers I used should give you good times for a single iteration. Most should come in under a thread slice, while a multiple iteration run will not (which may be why the processor is not maxed out).
As Darren pointed out yesterday, about the only thing we can say is that terminals and locals are faster than anything. You need to test everything else on your target platform, since the relative speeds can change.
05-25-2012 08:29 AM
@DFGray wrote:
Interesting. Your Set Value results are anomalously slow compared to mine. The rest seems to scale about the same. I wonder if there is an operating system issue. I am using Windows 7 64-bit, while you are using XP. There are all sorts of driver and OS differences that could be causing it.
I also found the CPU usage interesting. I would expect the Value runs to be below full CPU usage, since they are waiting on the next front panel refresh. But I expected the others to max at least one thread out. There is actually no reason to run multiple iterations, since the high resolution timers I used should give you good times for a single iteration. Most should come in under a thread slice, while a multiple iteration run will not (which may be why the processor is not maxed out).
As Darren pointed out yesterday, about the only thing we can say is that terminals and locals are faster than anything. You need to test everything else on your target platform, since the relative speeds can change.
After the garden and cleaning the guns and writing a Nugget...
My wife is leaving me at home alone with a case of beer this week-end. I'll try out your code on my W7 machine at home with LV 2011 to see if my numbers are closer to yours.
I am faced with this question on a regular basis as I advise on architectures and I want to make sure my rules of thumb are valid.
No promises but I'll keep this thread in mind.
Ben
05-25-2012 01:08 PM
Everyone is gone for the holiday week-end so I had a chance to do more testing.
I started with the array version and made the following changes.
Defer FP updates before the case, then undefer after. I did this to get the GUI update work out of the measurement.
Used a Timed Sequence to set the affinity to CPU 1 (the second CPU).
Running the terminal version I managed to get all of the load to swamp CPU 1. Comparing with what I wrote down the other day, it runs about 10X faster (remember, FP updates are deferred).
So I can say that the GUI updates are about 90% of what happens when FP updates are NOT deferred.
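The arithmetic behind that 90% figure can be sketched as follows, assuming the only difference between the two runs is the deferred panel update (the function name `overhead_share` is made up for this illustration). If removing the GUI work makes the run N times faster, the remaining work takes 1/N of the original time, so the GUI work accounted for 1 - 1/N of it.

```python
def overhead_share(speedup):
    """Fraction of the original run time spent on the work that was removed.

    speedup is (time with the overhead) / (time without it). This assumes
    nothing else changed between the two runs.
    """
    if speedup < 1:
        raise ValueError("speedup must be at least 1")
    return 1.0 - 1.0 / speedup


if __name__ == "__main__":
    # A 10X speedup with panel updates deferred implies roughly a 90% share.
    print(f"{overhead_share(10):.0%}")
```

The same formula shows how quickly this saturates: a 100X speedup would only move the share from 90% to 99%.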
I then bumped the iteration count to make sure it stayed that way.
A) more iterations
B) CPU 1 is swamped.
C) Very little or no kernel mode time
Then I switched to the "SetControlValue" version and reduced the iteration count.
Again I see the second CPU is sitting idle and both CPUs are working.
Also the kernel times are back.
Recalling what I learned from VMS Internals and Data Structures (was it chapter 14?)
I recalled that the kernel handles page faults.
Switching to the Processes tab and adding "Page Faults" and "PF Delta" I can see the "SetControlValue" method results in almost 1000X the page faults of the other methods.
So the SetControlValue method invokes massive Page Faults.
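The same kind of "PF Delta" reading can be taken programmatically. The sketch below is only an illustration in Python and makes a loud assumption: it uses the standard `resource` module, which exists on Unix-like systems and not on Windows, so it does not reproduce the Task Manager numbers above. The helper names `page_faults`, `fault_delta` and `touch_fresh_memory` are made up for this example.

```python
import resource  # Unix-only standard library module (assumption: not Windows)


def page_faults():
    """Total page faults (minor + major) charged to this process so far."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_minflt + usage.ru_majflt


def fault_delta(work):
    """Run work() once and return how many page faults it caused."""
    before = page_faults()
    work()
    return page_faults() - before


def touch_fresh_memory(size=32 * 1024 * 1024, page=4096):
    """Allocate a new buffer and write one byte per page (4 KiB assumed)."""
    buf = bytearray(size)
    for offset in range(0, size, page):
        buf[offset] = 1
    return len(buf)


if __name__ == "__main__":
    print("page faults while touching fresh memory:",
          fault_delta(touch_fresh_memory))
```

Comparing the delta for two ways of doing the same job is the programmatic equivalent of watching the PF Delta column while each benchmark runs.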
So why care?
Kernel mode is used to let the OS do the work of making a computer act like a computer. It does nothing useful for us. It only makes it look like the machine is working. No useful work is really being done.
I'll still give this thing a whirl this week-end if I remember.
Ben
05-25-2012 02:42 PM
No, I am not done being disturbed by these findings.
What really bugged me was why the Value (Signaling) seemed to be so quick.
A Value (Signaling) write should take extra time to actually trigger the event so...
I added some while loops that watch for a value change on the control in question and things are falling into place.
These are the most recent results and they are something that will let me go to sleep tonight.
The only question I have now is:
Why is the Local showing as faster than the terminal?
I have attached my most recent version (LV 2010) of the benchmark if anyone cares to try running this on yet another machine.
Take care,
Ben
05-29-2012 08:40 AM
I spent about 4-5 hours on Saturday night running benchmarks and after all of that I am sticking with my old rule of thumb:
Control
Local
Property Value
Property Value (Signaling)
Set Control Value
The benchmarks showed some interesting patterns that changed depending on the array size used.
Ben