08-15-2012 12:37 PM
@for(imstuck) wrote:
Doh, I just assumed you didn't need the output...
I was just assuming it was the 7:00am talking. It is a common benchmarking mistake. I have also noticed in the past that algorithms I try last (like "x/y 4" in this case) are slower for no good reason. I think it may have to do with memory fragmentation, but I haven't dug into it too much.
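One way to reduce that kind of order effect is to interleave the candidates instead of timing them back to back and to keep the best of several repetitions. A minimal sketch of the idea in Python (illustrative only; the candidate functions and sizes are made-up placeholders):

# Benchmarking-order sketch: run the candidates round-robin and keep the best
# time of several repetitions, so whichever algorithm happens to run last is
# not penalised by allocator/cache state left behind by the others.
import functools
import time

def bench(candidates, data, reps=10):
    best = {name: float("inf") for name in candidates}
    for _ in range(reps):
        for name, fn in candidates.items():   # round-robin, not back-to-back
            t0 = time.perf_counter()
            fn(data)
            best[name] = min(best[name], time.perf_counter() - t0)
    return best

if __name__ == "__main__":
    data = list(range(100_000))
    candidates = {
        "builtin_sum": sum,
        "reduce_sum": lambda xs: functools.reduce(lambda a, b: a + b, xs),
    }
    for name, t in bench(candidates, data).items():
        print(f"{name}: {t * 1e3:.3f} ms")

Taking the minimum rather than the mean also filters out one-off interruptions from the OS.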
09-13-2012 08:12 AM
Although it has been some time since the last post, I'd like to pick up the subject. I was especially interested in the execution times for different array sizes, i.e. how long the indexing takes as a function of the number of loop iterations. I noticed that conditional indexing (magenta curve) and build array (black curve) take exactly the same time (as noted before by crossrulz), both being slower than replacing elements in an initialized array (green curve; note the log-log axes). This holds for the entire array size range from 1 to 1 million elements. However, it is possible to run conditionally indexed for-loops in parallel instances, which is much faster (red curve) than both other options. The curve shown used 2 cores. Are there any limitations to conditional indexing, or does it actually replace the "initialize first, then replace elements" method, as long as several cores are used?
I attached the used code in case you're interested.
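Since a LabVIEW diagram doesn't paste well into text, here is a rough Python analogue of the three approaches being compared (not the attached VI; the filter condition, array size, and worker count are placeholders):

# Rough analogue of the benchmark cases: grow-as-you-go conditional indexing,
# pre-allocate-then-replace, and conditional indexing split across workers.
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def conditional_indexing(xs):
    """Append only the elements that pass the condition (array grows as needed)."""
    out = []
    for x in xs:
        if x > 0.5:            # placeholder condition
            out.append(x)
    return out

def init_then_replace(xs):
    """Pre-allocate a full-size buffer, replace elements in place, trim at the end."""
    out = np.empty(len(xs))
    n = 0
    for x in xs:
        if x > 0.5:
            out[n] = x
            n += 1
    return out[:n]

def parallel_conditional(xs, workers=2):
    """Filter independent chunks in parallel instances, then concatenate."""
    chunks = np.array_split(np.asarray(xs), workers)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(conditional_indexing, chunks))
    return [x for part in parts for x in part]

if __name__ == "__main__":
    data = np.random.rand(1_000_000)
    print(len(conditional_indexing(data)),
          len(init_then_replace(data)),
          len(parallel_conditional(data)))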
09-14-2012 04:15 PM - last edited on 01-10-2025 01:39 PM by Content Cleaner
Frank,
It is OK to use conditionally indexed loops in parallel instances as you have done in your example, and execution will be faster, as you have shown. There are certain limitations to parallel for-loops, though, such as not being able to use a conditional terminal or to have dependencies between loop iterations.
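To illustrate the dependency restriction with a small sketch (plain Python rather than G, function names made up for the example): iterations that only use their own input could run in any order on separate instances, while a value carried from one iteration to the next forces sequential execution.

# Why loop-carried dependencies block parallelisation (illustrative sketch).

def independent(xs):
    # Each iteration reads only its own element, so the iterations could be
    # distributed over parallel loop instances in any order.
    return [x * x for x in xs]

def carried_dependency(xs):
    # Each iteration needs the running total from the previous one (a shift-
    # register-style dependency), so the iterations must run in order.
    out, acc = [], 0
    for x in xs:
        acc += x
        out.append(acc)
    return out

print(independent([1, 2, 3]), carried_dependency([1, 2, 3]))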
10-10-2012 05:35 PM - edited 10-10-2012 05:36 PM
Just wanted to comment that a piece of code like this one:
does convert backward very badly:
Notice the 4 stacked "floating" case structures to the left and the floating "zeroes" where the conditional indexing objects were? That's what happened. Be warned!
05-24-2013 04:45 PM
Just wanted to point out that the Help page pointed to by tst above shows this interesting icon:
05-25-2013 12:42 PM
@X. wrote:
Just wanted to point out that the Help page pointed to by tst above shows this interesting icon:
The image was probably created before the design was finalized and no one noticed that it's no longer up to date.
05-26-2013 08:55 AM - edited 05-26-2013 08:56 AM
@Frank_Lehmann wrote:
However, it is possible to run conditionally indexed for-loops in parallel instances, which is much faster (red curve) than both other options.
I attached the used code in case you're interested.
I'm sorry, I haven't looked at the code, but the following question seems valid...
Are you sure your fastest version here isn't simply constant-folded? That's an astronomical speed difference between versions which is extremely hard to explain. Or do you think a 5000-times-faster execution can be explained by simply using two cores? That's some fantastic parallel scaling if correct... (which it can't really be).
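As a language-agnostic sketch of that pitfall (Python here, purely illustrative): if the inputs are compile-time constants and the result is never consumed, an optimising compiler can pre-compute or discard the work, so the timing measures next to nothing. Feeding runtime data and using the result guards against that:

import random
import time

def work(xs):
    return sum(x * x for x in xs)

def timed(fn, xs):
    t0 = time.perf_counter()
    result = fn(xs)                     # keep the result so the work cannot
    dt = time.perf_counter() - t0       # be dropped as dead code
    return dt, result

if __name__ == "__main__":
    xs = [random.random() for _ in range(1_000_000)]   # runtime data, not constants
    dt, result = timed(work, xs)
    print(f"{dt * 1e3:.1f} ms, checksum {result:.3f}")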
Shane
05-27-2013 12:44 AM
Thanks Intaris, there actually was an error in the code which accounts for a factor of 1000. The main concern, however, was the supposed equality between build array and conditional indexing, with the latter permitting parallelisation while the former does not. Therefore, although the conditional node converts back to a build array function, they cannot be identical behind the scenes.
05-27-2013 04:47 AM
Well, your code is still wrong. I fixed it for you and included a proper graphic for the actual run times.
You really need to disable debugging before benchmarking. The fact that debugging is turned off for a parallel loop by default is the reason why your code SEEMED to be executing faster. You were actually benchmarking the difference between debugging enabled and debugging disabled, which is kind of pointless.
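A minimal sketch of that effect in Python (not the attached LabVIEW code; sys.settrace stands in for the debugging hooks): the same function looks far slower with tracing enabled, so comparing a traced run against an untraced one benchmarks the instrumentation, not the algorithm.

# Instrumentation overhead sketch: the per-line trace hook below plays the
# role of "debugging enabled" and dominates the measured time.
import sys
import time

def work():
    total = 0
    for i in range(200_000):
        total += i * i
    return total

def timed(fn):
    t0 = time.perf_counter()
    fn()
    return time.perf_counter() - t0

def tracer(frame, event, arg):
    return tracer          # trace every line of every frame

if __name__ == "__main__":
    plain = timed(work)
    sys.settrace(tracer)
    traced = timed(work)
    sys.settrace(None)
    print(f"debugging off: {plain * 1e3:.1f} ms, on: {traced * 1e3:.1f} ms")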
I have attached the proper code.
Shane.
05-27-2013 04:57 AM - edited 05-27-2013 04:57 AM
Benchmarking code like this is non-trivial.
I think everyone has made mistakes like these when benchmarking and ended up drawing conclusions which were totally inaccurate. Even AQ is VERY cautious when talking about benchmarking. NI has a whole bunch of PCs with different configurations of hardware and software for testing because small changes can lead to big effects (one byte can be the difference between a routine running completely from a processor's Level 1 cache or not, which has HUGE speed implications).
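A rough way to see that cache effect from a text language (Python with NumPy to stay closer to memory-bound behaviour; exact numbers depend entirely on the machine): the per-element time of a simple reduction climbs once the working set no longer fits in a cache level.

# Working-set-size sketch: time a reduction over arrays of growing size and
# report the per-element cost; steps in the curve roughly track cache levels.
import time
import numpy as np

for n in [2**k for k in range(12, 25)]:      # ~4 K to ~16 M float64 elements
    a = np.ones(n)
    t0 = time.perf_counter()
    for _ in range(5):
        a.sum()                              # memory-bound reduction
    dt = (time.perf_counter() - t0) / 5
    print(f"{n:>10} elements ({a.nbytes / 1024:>8.0f} KiB): "
          f"{dt / n * 1e9:6.2f} ns/element")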
Look HERE for a great post on the topic.
Shane