

Is it faster to index an initialized matrix or to use an auto-indexed tunnel, both within a for loop -- what is going on?

Solved!

@EngrStudent wrote:

 

Here is the FP and block diagram (appended)

 


Please attach the actual VI.

 

(An image often does not tell the whole story and we cannot test for ourselves)

 

Thanks!

 

I don't understand the point of the "in place" code. The code executes in place anyway for one of the wires. The 60x60 could even be taken out of the loop.

Message 11 of 17
(2,061 Views)

@altenbach - I put "in place" there because I am trying to learn; I am trying to find a way to get to the things that you understand without spending a few man-years living in your office.

 

It is attached.  Note - only one version is attached.  The error can be rewired and the display for the arrays can be moved.

 

I know, now, that I am converting days to seconds.  In a year I might not know that.  If someone else reads it, they might not either.  I don't always have the time to write out all the details in documentation, so I try to build some hints in at the DNA level.  The compiler should convert it to a constant, so it shouldn't hit runtime or memory.
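
As a sketch of that idea (in Python as a textual stand-in, since a LabVIEW diagram can't be quoted inline): most compilers fold literal arithmetic like this at compile time, so the self-documenting form is free at run time.

```python
# Python stand-in for the 24*60*60 constant on the diagram: CPython folds the
# literal product into a single constant before the code ever runs.
import dis

def days_to_seconds(days):
    return days * (24 * 60 * 60)   # should compile down to LOAD_CONST 86400

dis.dis(days_to_seconds)           # the disassembly shows the folded constant
```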

 

Can you tell me what you do with  the VI?

Message 12 of 17
(2,045 Views)

@EngrStudent wrote:

Can you tell me what you do with  the VI?


Just playing around a little bit.

 

The 24x24x60 only gets folded if you change the order; see the picture (look for the fuzzy wires!). (Not sure if the compiler reorders things, though, but we seem to gain 5-10%.)

 

 

Once you disable debugging in the VI the times for the non-mathscript code are basically identical. With debugging enabled, you are giving the formula node an advantage, because the formula does not have debugging code inside it.

 

You definitely want all array indicators outside the main loop. Right now the UI might steal cycles from later-running code in the loop to update the indicators.

 

None of the sequence frames on the right serve any purpose.

 

To get a more honest value, use array min instead of mean. All external artifacts (e.g. OS jobs, scheduling) tend to make the times longer, so the min is asymptotically a better guess for the pure algorithm time.
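
As a sketch of that approach (Python stand-in; the helper name and run count are just placeholders), a min-of-many-runs timer would look something like this:

```python
import time

def best_time(func, n_runs=50):
    """Call func() n_runs times and return the fastest time in seconds.
    OS interference only ever makes a run slower, so the minimum is a
    stable lower bound on the pure algorithm time."""
    best = float("inf")
    for _ in range(n_runs):
        t0 = time.perf_counter()
        func()
        best = min(best, time.perf_counter() - t0)
    return best
```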

 

I would use high resolution relative seconds and format the time display with a format of "%.2ps", for example.
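
(For reference, "%p" is LabVIEW's SI-prefix format, so 0.00123 s displays as roughly "1.23ms". A hypothetical Python helper, purely to illustrate the formatting and not any standard API, might look like this:)

```python
def si_seconds(t, digits=2):
    """Format a time in seconds with an SI prefix, roughly what a
    "%.2ps" format string does (illustrative helper only)."""
    for scale, prefix in ((1.0, ""), (1e-3, "m"), (1e-6, "u"), (1e-9, "n")):
        if t >= scale:
            return f"{t / scale:.{digits}f} {prefix}s"
    return f"{t / 1e-9:.{digits}f} ns"

print(si_seconds(0.00123))   # -> "1.23 ms"
```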

 

I'll look at it some more... 😉

Message 13 of 17
(2,034 Views)

Once you take the random number generation out of the inner loops and remove the vestigial FOR loops, the various versions (incl. mathscript!) are basically identical in speed. The formula node is about 2x slower.
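
The same pitfall is easy to show in text form; a minimal sketch (Python stand-in with a placeholder kernel) of keeping the data generation outside the timed region:

```python
import time
import numpy as np

# Generate the test data once, outside the timed region, so every variant is
# measured on identical input and the RNG cost does not pollute the comparison.
rng = np.random.default_rng(seed=0)
data = rng.random((60, 60))

t0 = time.perf_counter()
result = data * 86400.0            # placeholder for the algorithm under test
elapsed = time.perf_counter() - t0
print(f"algorithm only: {elapsed * 1e6:.1f} us")
```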

 

Here's a quick benchmark rewrite (LV2013). See if it makes any sense. Let me know if you have any questions.

(Not fully tested; of course there could be bugs.)

 

 

Message 14 of 17
(2,023 Views)

My results looked a little different.  I just opened your VI and clicked run.

 

Also, there is this startup effect where things happen faster at first, then take a step increase. Is that consumption of RAM, or similar? Transition to swap? Windows background processes?

 

I am wondering about the relationship between the mean and the min in evaluating the algorithm.  I use the mean because I want to measure realistic performance.  If the algorithm is faster in the min, shouldn't it be faster in the mean too?  The mean is confounded by many Windows processes, so maybe there is a variation in the mean.  There is likely a bias too.  ...

 

Capture.PNG

Message 15 of 17
(1,992 Views)

If you do the "mean", a single significant outlier can skew one of the results (e.g. when some other program loads at just the wrong time, Windows checks for updates, etc.).

 

If you are worried about "realistic performance", you could do the min and the max, giving you a range from the best-case to the worst-case scenario. Absolute performance differs from machine to machine and is thus not very interesting. If I am really worried about the distribution, I typically do a histogram of many runs. Often, the "mean" is a bad measure of the actual distribution, because it is not Gaussian.
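
As a sketch of that kind of summary (Python stand-in; the skewed samples are synthetic and only stand in for real timing runs):

```python
import numpy as np

# Synthetic, right-skewed timing samples (seconds) standing in for real runs.
rng = np.random.default_rng(0)
times = 0.010 + rng.exponential(0.002, size=200)

print(f"best : {times.min() * 1e3:6.2f} ms")
print(f"worst: {times.max() * 1e3:6.2f} ms")
print(f"mean : {times.mean() * 1e3:6.2f} ms  (dragged upward by the slow outliers)")

# Crude text histogram of the distribution.
counts, edges = np.histogram(times, bins=15)
for count, left in zip(counts, edges):
    print(f"{left * 1e3:7.2f} ms | {'#' * int(count)}")
```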

 

Are you running on batteries or plugged in? What is your OS and power plan? Sometimes the results can differ because the system adjusts the clock frequency on the fly to boost performance (turbo boost), save power (SpeedStep), or prevent overheating.

 

There are also boundaries that depend on the CPU cache size of the processor and such. How are your results if you change the size to 1M or 100k, for example? What is the exact make and model of your CPU? AMD, for example, will behave differently compared to Intel.
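
One quick way to probe those boundaries (Python stand-in with a placeholder kernel) is to sweep the working-set size and watch the time per element:

```python
import time
import numpy as np

# Sweep the array size; the time per element typically steps up once the
# working set no longer fits in a given cache level.
for n in (10_000, 100_000, 1_000_000, 10_000_000):
    x = np.random.default_rng(0).random(n)
    t0 = time.perf_counter()
    y = x * 2.0 + 1.0                  # placeholder kernel
    dt = time.perf_counter() - t0
    print(f"n = {n:>10,}: {dt / n * 1e9:6.3f} ns/element")
```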

 

What version of LabVIEW are you running? Every version has new compiler improvements. I am running 2015SP1.

 

Message 16 of 17
(1,983 Views)

Here are my results for 50M; same picture.

 

 

If I go down to 10k size, things get a bit noisier and the mathscript slows down a little bit. It seems there is slightly more overhead launching the mathscript node. (That's why it is so slow if you place it inside a FOR loop instead of operating on arrays!!)

 

Here it is also more obvious that the "min" is a more accurate comparison. The distributions are highly skewed, always with slower outliers, but the min forms a relatively stable and reproducible lower boundary.

 

Message 17 of 17
(1,981 Views)