10-20-2012 01:34 PM - edited 10-20-2012 01:37 PM
Yes, the "in place element version" is incorrect for left and right. My code agrees with the original.
For testing, it is also important to stay away from square arrays. Here's an improved version that makes the results more clear.
10-20-2012 01:37 PM - edited 10-20-2012 01:37 PM
10-20-2012 01:37 PM
@altenbach wrote:
Yes, the "in place element version" is incorrect for left and right.
For testing, it is also important to stay away from square arrays. Here's an improved version that makes the results more clear.
It kinda makes it hard to get realistic benchmark figures if the code isn't correct.
The whole exercise becomes moot?
Br,
/Roger
10-20-2012 01:41 PM
@User002 wrote:
It kinda makes it hard to get realistic benchmark figures if the code isn't correct. The whole exercise becomes moot?
Well, it is not my code! 😄
Since it is not a contender anyway it does not really matter. The code looked a bit fishy but I assumed that whoever wrote it actually tested it on the original LED system.
10-20-2012 01:43 PM - edited 10-20-2012 01:45 PM
Actually, somebody copied the code wrong. You need to set the split dimension of the IPE correctly.
The first original was correct.
Here is the corrected version.
10-20-2012 01:46 PM
Ahh, I see.
I got into my mental test-mode again. Well. *cough*
Anyway what a POS benchmark code, it didn't even work correctly.
Br,
/Roger
10-20-2012 01:50 PM
@altenbach wrote:
Actually, somebody copied the code wrong. You need to set the split dimension of the IPE correctly.
The first original was correct.
Here is the corrected version.
Nah you forgot the shift registers "to see the action scroll by".
Br,
/Roger
10-20-2012 02:44 PM
Today's lesson: Do computations/operations in contiguous memory for best performance.
Multicore machines can speed up operations on huge arrays.
Nothing new under the sun.
Br,
/Roger
10-20-2012 07:12 PM
Hi Roger,
It seems something went wrong when you copied the code, as the one I posted here for tests, later modified here by Altenbach, appears to work correctly for all modes and rotations 😉
Thanks for the faster Up and Down For Loop versions... but it isn't clear to me why they are faster!
Best regards,
HL
10-21-2012 12:59 AM - edited 10-21-2012 01:20 AM
@Herlag wrote:
Hi Roger,
It seems something went wrong when you copied the code, as the one I posted here for tests, later modified here by Altenbach, appears to work correctly for all modes and rotations 😉
Thanks for the faster Up and Down For Loop versions... but it isn't clear to me why they are faster!
Best regards,
HL
Hi,
Probably something odd happened while copying the VI from the Internet, or inside my head. I'm biased here, so you decide.
Yes, the second version operates in contiguous memory. It operates on rows instead of rotating the columns.
A 2D array is stored in RAM as a sequence of contiguous 1D arrays, one per row.
The double array datatype merely acts as a convenient "interface" for the programmer to this memory.
For each operation in the new version, only contiguous blocks of data (one entire row, i.e. one 1D array) need to be accessed/allocated/reallocated, while in the previous column-based version all 2000 "1D arrays" had to be accessed (one element from each row).
Naturally, a huge array like that will not fit into the cache, so the processor gets a lot of cache misses and performance suffers as it goes column-hopping all over memory. Try the same benchmark on a small array that fits in the cache. The performance difference will perhaps be close to zero.
http://zone.ni.com/reference/en-XX/help/371361G-01/lvconcepts/how_labview_stores_data_in_memory/
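To make the row versus column access pattern concrete, here is a minimal sketch in Python (LabVIEW diagrams can't be pasted as text). It assumes a row-major layout as described above and models the 2D array as one flat list. The names `flat_index`, `row_offsets`, `column_offsets`, `rotate_row`, and `rotate_column` are made up for this illustration and are not taken from the benchmark VI. It only shows which memory offsets each operation touches; it is not a timing benchmark.

```python
# Hypothetical illustration: a 2D array stored row-major in one flat buffer.
ROWS, COLS = 3, 4  # deliberately non-square, as recommended earlier in the thread


def flat_index(row, col, cols=COLS):
    """Offset of element (row, col) in a row-major flat buffer."""
    return row * cols + col


def row_offsets(row, cols=COLS):
    """Offsets touched when working on one row: a single contiguous run."""
    return [flat_index(row, c, cols) for c in range(cols)]


def column_offsets(col, rows=ROWS, cols=COLS):
    """Offsets touched when working on one column: one element per row, stride = cols."""
    return [flat_index(r, col, cols) for r in range(rows)]


def rotate_row(buf, row, cols=COLS):
    """Rotate one row right by one element, in place (one contiguous block)."""
    start = flat_index(row, 0, cols)
    block = buf[start:start + cols]
    buf[start:start + cols] = block[-1:] + block[:-1]


def rotate_column(buf, col, rows=ROWS, cols=COLS):
    """Rotate one column down by one element, in place (strided access)."""
    offsets = column_offsets(col, rows, cols)
    values = [buf[i] for i in offsets]
    values = values[-1:] + values[:-1]
    for i, v in zip(offsets, values):
        buf[i] = v


if __name__ == "__main__":
    print("row 1 touches offsets:   ", row_offsets(1))
    print("column 1 touches offsets:", column_offsets(1))
```

With 2000 rows, the column case jumps a full row length between consecutive elements, which is the "column-hopping" that causes the cache misses once the array no longer fits in cache.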
Not sure if I made it any clearer?
Br,
/Roger