06-24-2013 08:17 AM
A minor correction on the timing charts -- I was a little careless in ignoring "overhead". I tried doing the tests in the reverse order, doing my version first, then Altenbach's. I still win, but the ratios are a little smaller (3:1 for the small array, 10:1 for the larger).
BS
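For anyone who wants to check the order effect outside LabVIEW, here is a minimal Python sketch of the idea: one untimed warm-up call per version, then timing in both orders. The two flatten functions are hypothetical stand-ins, not the VIs from this thread, and `best_time` and `compare` are names made up for the illustration.

```python
import time
from itertools import chain


def flatten_by_concatenation(rows):
    """Stand-in for a 'concatenate' approach: grow the result row by row."""
    out = []
    for row in rows:
        out = out + row  # builds a new list on every pass
    return out


def flatten_in_one_pass(rows):
    """Stand-in for a 'reshape' approach: build the flat result in one go."""
    return list(chain.from_iterable(rows))


def best_time(func, data, repeats=5):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(data)
        best = min(best, time.perf_counter() - start)
    return best


def compare(func_a, func_b, data, repeats=5):
    """Time A and B in both orders after one untimed warm-up call each."""
    func_a(data)
    func_b(data)
    # First pass: A before B.
    t_a1 = best_time(func_a, data, repeats)
    t_b1 = best_time(func_b, data, repeats)
    # Second pass: B before A.
    t_b2 = best_time(func_b, data, repeats)
    t_a2 = best_time(func_a, data, repeats)
    # Each entry holds (time of A, time of B) for that ordering.
    return {"a_first": (t_a1, t_b1), "b_first": (t_a2, t_b2)}


if __name__ == "__main__":
    rows = [[float(i)] * 300 for i in range(300)]
    results = compare(flatten_by_concatenation, flatten_in_one_pass, rows)
    for order, (t_a, t_b) in results.items():
        print(f"{order}: concatenation / one-pass time ratio = {t_a / t_b:.1f}")
```

If the ratio changes noticeably between the two orderings, some of the measured difference is probably overhead rather than the algorithm itself.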
06-24-2013 10:49 AM
Must be some architectural or age-dependent factor. The attached VI shows at best half speed of concatenation on an AMD 8350. (Loop parallelization on concatenation sped it up some.)
/Y
06-24-2013 11:05 AM
I also get about 2x the speed using reshape.
06-24-2013 01:58 PM
Very interesting (and instructive). Our code is almost identical, except I wasn't using the "parallelization" trick, and was using a slightly different "clock" for timing. I like your clock -- where does it come from? I'm using code I found several years ago on an NI site that uses the CPU's System Clock -- maybe that is showing its age?
Anyway, on my machine, with your code and your clock, but with parallelism turned off (which, on my dual-core machine, sped things up), the Reshape version runs about twice as fast as Concatenate. Time to get out my watchmaker's loupe and figure out why my clock is doing so poorly.
Thanks to both of you for your comments.
BS
06-24-2013 02:02 PM - edited 06-24-2013 02:06 PM
@Bob_Schor wrote:
I like your clock -- where does it come from?
...\Vi.lib\Utility\High Resolution Relative Seconds.vi (note that the times are in seconds, not us as indicated on the front panel above. I usually use a display format of "%.2ps"; try it. :D)
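To illustrate why an SI-prefix display format avoids this kind of unit mix-up, here is a rough Python sketch of the same idea. `format_seconds_si` is a hypothetical helper written for this example; it is not how LabVIEW implements "%.2ps", and the exact text LabVIEW displays may differ.

```python
def format_seconds_si(seconds, decimals=2):
    """Format a duration given in seconds with an SI prefix (s, ms, us, ns)."""
    prefixes = [(1.0, "s"), (1e-3, "ms"), (1e-6, "us"), (1e-9, "ns")]
    magnitude = abs(seconds)
    for scale, unit in prefixes:
        if magnitude >= scale:
            return f"{seconds / scale:.{decimals}f} {unit}"
    # Zero or anything below 1 ns falls through to the smallest unit.
    return f"{seconds / 1e-9:.{decimals}f} ns"


if __name__ == "__main__":
    for value in (2.5, 0.0015, 42e-6):
        print(format_seconds_si(value))
```

Because the value always stays in seconds and only the display picks the prefix, the label on the indicator cannot disagree with the number.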
@Bob_Schor wrote:
Anyway, on my machine, with your code and your clock, but with parallelism turned off (which, on my dual-core machine, sped things up), the Reshape version runs about twice as fast as Concatenate. Time to get out my watchmaker's loupe and figure out why my clock is doing so poorly.
I don't think the clock used really makes a difference. The reshape shows an additional buffer allocation dot, but things could change as you rearrange the code.
06-24-2013 02:03 PM
P.S. -- and maybe it really is my crummy PC -- I couldn't get the code, as submitted, to run without shortening the list of test arrays (I used 3, 30, 300, and 3000 as my first dimension, which ran without LabVIEW giving me an "out of memory" error). Hmm, now I don't remember if these errors came with Parallelism on or if I'd already turned it off ...
06-24-2013 02:11 PM - edited 06-24-2013 02:13 PM
I am running out of memory on the 6000x9000 too. You probably need 64-bit LabVIEW for that, or maybe test right after a reboot.
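A back-of-the-envelope estimate shows why this size gets tight. The sketch below assumes 8-byte (double-precision) elements, and the number of simultaneous copies is a guess; neither is confirmed for the attached VI.

```python
def array_bytes(rows, cols, bytes_per_element=8):
    """Bytes needed for one copy of a rows x cols array."""
    return rows * cols * bytes_per_element


def total_mib(rows, cols, copies, bytes_per_element=8):
    """Total MiB needed if several copies of the array exist at once."""
    return copies * array_bytes(rows, cols, bytes_per_element) / 2**20


if __name__ == "__main__":
    # Assumed: 8-byte elements; the copy counts are illustrative only.
    for copies in (1, 2, 3, 4):
        print(f"{copies} copies: {total_mib(6000, 9000, copies):.0f} MiB")
```

One copy of a 6000x9000 array of doubles is about 412 MiB, so a few extra buffer copies plus indicator copies would already put real pressure on the address space of a 32-bit process.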
06-24-2013 02:20 PM
The High Resolution clock is an undocumented feature that Old Creek above has used before; it's quite nifty in situations like this. I haven't used it much (hence the unit error).
I'm running LV2012x64 and mainly tried bigger arrays to get some good reference points. The last array should fit in memory, but I guess the buffer allocations and possibly the two indicators might mess things up a little.
/Y