02-13-2018 08:25 AM
Hi All,
I went through a number of white papers, but haven't really found what I need to deal with the problem.
To give you an idea of what we do:
1) We capture data from a detector (datapoints, pixels).
2) We sort this data into higher-dimensional arrays so that we can average it and do computations on it. Such a data array can have the dimensions (10, 3000, 128, 4). These arrays alone take up a large part of the memory - hints on making that more efficient are appreciated!
We first initialize the arrays once, then enter a measurement loop in which the arrays are carried from one iteration to the next in shift registers. Inside the loop we use no indicators or local variables, so as not to waste memory. To make the code easier to read, I created a typedef of a cluster containing all 8 arrays. This cluster is passed into the measurement loop and, inside it, from VI to VI. I have now found that passing such a cluster can be very inefficient with large arrays - any recommendation on what to do, apart from passing every single array to every VI? We really struggle with memory load here, as starting this routine easily takes up 3+ GB of RAM.
best,
Julian
02-13-2018 09:13 AM
Hi Julian,
defining an array of 10×3000×128×4 elements of DBL needs 15360000*8 bytes ~= 117.2 MiB.
This should be ok for 32bit-LabVIEW - as long as you don't use a lot of indicators to display that array…
How many of such arrays do you need?
- LabVIEW stores arrays as one block in memory. To reduce the need for large memory blocks you could create an array of clusters of arrays: each cluster can be stored in its own memory block…
- When you get into memory space problems you can always store your data in files. Then you need to load the data from files as needed: it's a trade-off between memory consumption and execution speed of your VI…
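The size estimate above can be checked with a few lines of arithmetic. This is only a sketch in Python (LabVIEW itself is graphical); the SGL and U16 rows are hypothetical alternatives added for comparison and are not something the thread has settled on.

```python
from math import prod

def array_bytes(shape, bytes_per_element):
    """Return the raw payload size in bytes of a dense array of this shape."""
    return prod(shape) * bytes_per_element

shape = (10, 3000, 128, 4)
elements = prod(shape)               # number of elements in one array
size_dbl = array_bytes(shape, 8)     # DBL uses 8 bytes per element
size_mib = size_dbl / 2**20          # bytes to MiB

print(f"{elements} elements, {size_dbl} bytes, {size_mib:.1f} MiB as DBL")

# Hypothetical narrower element types, for comparison only.
for name, width in (("SGL", 4), ("U16", 2)):
    print(f"{name}: {array_bytes(shape, width) / 2**20:.1f} MiB")
```

The figure covers the array data only; any extra copy made at an indicator or a subVI boundary adds the same amount again.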
02-13-2018 10:21 AM - edited 02-13-2018 10:21 AM
Hi Gerd,
thanks for this fast reply! We already switched to 64-bit LabVIEW (2016) to get rid of the 32-bit RAM limitation. In total we need 4-6 such arrays, which should not be a problem on an 8 GB machine.
- LabVIEW stores arrays as one block in memory. To reduce the need for large memory blocks you could create an array of clusters of arrays: each cluster can be stored in its own memory block…
This would be very nice to keep the code clean. However, the array to cluster conversion is only available for 1D arrays, or am I missing something here? This would then involve reshaping the arrays every time the functions sort and add data.
- When you get into memory space problems you can always store your data in files. Then you need to load the data from files as needed: it's a trade-off between memory consumption and execution speed of your VI…
At the moment the code runs only slightly faster than the stream of data comes in, so we cannot risk slowing the VI down.
best,
Julian
02-13-2018 10:26 AM
If you're having to pass large amounts of data between VIs, and it needs to be persistent for a while, two solutions come to mind:
DVRs. Gather the data once, keep it in one spot, and pass a reference to the data to the other VIs.
File IO. Dump the data to a binary file and you'll have random access to the pieces you need when you need them.
DVRs only allow one accessor at a time, so you wind up serializing processes that you thought were parallel.
In the GHz world of accessing RAM, File IO is terribly slow. You -will- take a noticeable performance hit moving data from memory to files.
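To illustrate the "random access" part of the File IO option, here is a minimal sketch in Python rather than LabVIEW. The file name, the unsigned 16-bit element type and the helper names write_block and read_slice are assumptions made up for this example.

```python
import os
import tempfile
from array import array

ITEM = "H"                          # unsigned 16-bit elements (assumed type)
ITEM_SIZE = array(ITEM).itemsize    # bytes per element

def write_block(path, values):
    """Dump a sequence of values to a raw binary file."""
    with open(path, "wb") as f:
        array(ITEM, values).tofile(f)

def read_slice(path, start, count):
    """Read `count` elements starting at element index `start`."""
    out = array(ITEM)
    with open(path, "rb") as f:
        f.seek(start * ITEM_SIZE)   # jump straight to the wanted element
        out.fromfile(f, count)
    return out

# Hypothetical file name in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "data_tot.bin")
write_block(path, range(1000))

chunk = read_slice(path, 500, 4)
print(list(chunk))
```

Only the requested slice is held in memory, but every read pays for a file access, which is the performance hit mentioned above.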
02-13-2018 10:52 AM
wrote:
1) We capture data from a detector (datapoints, pixels).
2) We sort this data into higher-dimensional arrays so that we can average it and do computations on it. Such a data array can have the dimensions (10, 3000, 128, 4). These arrays alone take up a large part of the memory - hints on making that more efficient are appreciated!
It would be useful to have more information on the computations. For example, if parts of the data get averaged and combined, maybe the averaging can be done in-place whenever new data arrives, potentially greatly reducing the memory requirements. What is the datatype? How many bits in a pixel? If it is e.g. just 8 bits/measurement, you can accumulate (sum) the data in 16 bits (or more, depending on the number of averages), avoiding divisions and orange wires. You can calculate the average at the very end using integer math, which is close enough.
When you cross subVI boundaries, sometimes memory copies are made. DVRs (as have been mentioned already) could offer a better solution if done right. You could also try to inline the subVIs to see if it helps.
Hard to give more targeted advice without much more details on the system and required processing.
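How wide the integer accumulator has to be follows from simple arithmetic. The sketch below (Python, with the hypothetical helper names max_safe_sums and integer_mean) assumes unsigned samples and an unsigned accumulator, and uses the worst case where every sample is at full scale.

```python
def max_safe_sums(sample_bits, accumulator_bits):
    """Number of full-scale unsigned samples that fit in the accumulator without overflow."""
    return (2**accumulator_bits - 1) // (2**sample_bits - 1)

def integer_mean(total, count):
    """Average with integer math only, rounded to the nearest integer."""
    return (total + count // 2) // count

print(max_safe_sums(8, 16))     # 8-bit samples summed in 16 bits
print(max_safe_sums(16, 32))    # 16-bit samples summed in 32 bits
print(integer_mean(1001, 4))    # one division, done once at the very end
```

With 8-bit samples a 16-bit sum is safe for 257 additions per element, and with 16-bit samples a 32-bit sum is safe for 65537, so the accumulator width should be picked from the expected number of events per bin.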
02-13-2018 11:19 AM
Hi Julian,
However, the array to cluster conversion is only available for 1D arrays or am I missing something here?
You are right with both parts of the sentence 😄
You don't convert an array to cluster, but you define your data structures differently right from the beginning!
You don't create a 4D array, but a 1D array of clusters. Each cluster can contain a 3D (sub) array. This way LabVIEW doesn't need to request the big 4D array memory block, but just 4 smaller 3D (sub) array blocks. (This also helps when the 3D (sub) arrays are of different size…)
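As a rough text-language analogy of this layout (Python, not LabVIEW): instead of one contiguous 4D buffer, keep a list of independently allocated 3D buffers. The small block shape and the helper names make_blocks and flat_index are assumptions for illustration only; how LabVIEW actually places the blocks is up to its memory manager.

```python
from array import array

def make_blocks(n_blocks, block_shape, typecode="d"):
    """Allocate n_blocks separate zero-filled buffers, one per 3D sub-array."""
    n = 1
    for dim in block_shape:
        n *= dim
    itemsize = array(typecode).itemsize
    # Each array() call makes its own allocation, so no single huge block is needed.
    return [array(typecode, bytes(n * itemsize)) for _ in range(n_blocks)]

def flat_index(i, j, k, shape):
    """Row-major position of element (i, j, k) inside one 3D block."""
    _, d1, d2 = shape
    return (i * d1 + j) * d2 + k

block_shape = (10, 30, 16)            # small demo shape, not the real (10, 3000, 128)
blocks = make_blocks(4, block_shape)  # the dimension of size 4 becomes the outer list

# Update one element of block 2 in place; the other blocks are untouched.
blocks[2][flat_index(1, 2, 3, block_shape)] += 5.0
print(len(blocks), len(blocks[0]))
```

Each block can also have its own size, which matches the remark about sub-arrays of different size.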
02-13-2018 02:02 PM
You don't create a 4D array, but a 1D array of clusters. Each cluster can contain a 3D (sub) array. This way LabVIEW doesn't need to request the big 4D array memory block, but just 4 smaller 3D (sub) array blocks. (This also helps when the 3D (sub) arrays are of different size…)
Ah ok! So if, for example, I want to pass a 4D array and one 2D array through my routine, I would then break down (to speed things up) the 4D array into two 3D arrays, which I would pass to a Build Cluster Array function. This gives me an array of clusters. I then have one 2D array left, which I also feed into a Build Cluster Array function.
Would it be ok to bundle such arrays of clusters together and pass them between the functions, or will LV produce copies of the whole thing when they are bundled together?
02-13-2018 02:20 PM - edited 02-13-2018 02:20 PM
Hi Julian,
would break down (to speed things up) the 4D array in two 3D arrays which I would pass to a Build Cluster Array Function
I don't think this goes the right way.
Instead it would help if you provided a "real-world" example of your requirements (which kind of arrays, how many, …)!
02-13-2018 02:22 PM
When you cross subVI boundaries, sometimes memory copies are made. DVRs (as have been mentioned already) could offer a better solution if done right. You could also try to inline the subVIs to see if it helps.
Hard to give more targeted advice without much more details on the system and required processing.
It would be useful to have more information on the computations. For example, if parts of the data get averaged and combined, maybe the averaging can be done in-place whenever new data arrives, potentially greatly reducing the memory requirements. What is the datatype? How many bits in a pixel? If it is e.g. just 8 bits/measurement, you can accumulate (sum) the data in 16 bits (or more, depending on the number of averages), avoiding divisions and orange wires. You can calculate the average at the very end using integer math, which is close enough.
Sorry for keeping my description so short. I wanted to spare you the details so as not to make my questions too complicated. Nevertheless, I am happy to share some specifics here:
We drive a mirror back and forth and capture shots of a laser. The UInt16 data from our 144-channel detector is collected using DAQmx functions. After capturing e.g. 10k events, we collect the data and the position of the mirror at the instant of each event. We start the motor and the acquisition again and after that loop through each of the 10k shots to sort them into the data matrix according to the position of the mirror. Here the data is only summed into a temp data matrix. We perform this 5-20 times until sufficient data quality is reached. To be able to average, a second matrix stores how many events were collected per mirror position; we increment it every time an event is added at a given position.
That is the main acquisition loop. After that we have ca. 1 second, while another motor is moved, which we use to add the collected temp data (data_temp and counter_temp) to the data_tot and counter_tot arrays. The same operations as described above are used here.
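For readers who prefer text code, the sum-and-count scheme described above might look roughly like this in Python. The flat array layout, the tiny demo sizes and the helper names accumulate and merge are assumptions; only data_temp, counter_temp, data_tot and counter_tot are names from the description.

```python
from array import array

def accumulate(shots, positions, data_sum, counter, n_channels):
    """Sort each shot into the bin of its mirror position by summing in place."""
    for shot, pos in zip(shots, positions):
        base = pos * n_channels
        for ch, value in enumerate(shot):
            data_sum[base + ch] += value
        counter[pos] += 1               # events collected at this position

def merge(total, temp):
    """Add the temp array to the running total in place, then clear temp."""
    for i, value in enumerate(temp):
        total[i] += value
        temp[i] = 0

n_positions, n_channels = 3, 4          # tiny demo sizes, not the real ones
data_temp = array("Q", [0]) * (n_positions * n_channels)   # 64-bit unsigned sums
counter_temp = array("Q", [0]) * n_positions
data_tot = array("Q", [0]) * (n_positions * n_channels)
counter_tot = array("Q", [0]) * n_positions

shots = [[1, 2, 3, 4], [10, 20, 30, 40], [5, 5, 5, 5]]
positions = [0, 2, 0]                   # mirror position index of each shot

accumulate(shots, positions, data_temp, counter_temp, n_channels)
merge(data_tot, data_temp)
merge(counter_tot, counter_temp)
print(list(data_tot), list(counter_tot))
```

Every buffer is allocated once and only modified in place, so memory use stays fixed no matter how many shots are processed; the average per bin is a sum divided by its count, computed only when it is needed.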
best,
Julian
02-13-2018 02:45 PM
I have two suggestions that may help you out.
1) Go read the "Clear as Mud Thread" found here. It was a post by one of the Chief Architects of LabVIEW and talks about how to pass large data sets to sub-VIs without a copy.
2) Review my Action Engine Nugget found here. In that nugget I wrote about Action Engines that make use of Shift Registers (that was back then; they can now also be realized using Feedback Nodes). By wrapping all of the functions that touch a large data set in an Action Engine, I have been able to keep the storage in the Shift Registers with no need to copy the data.
Then to help you further I suggest you learn how to use:
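The Action Engine idea has a loose analogy in text languages: one object owns the large data set and callers only send it an action to perform. The Python sketch below is an illustration with hypothetical names (ActionEngine, call, the action strings), not LabVIEW code. The list stands in for the shift register, and the lock is an assumption that mimics one caller at a time.

```python
import threading

class ActionEngine:
    """Hypothetical analogy: a single owner of the data that exposes actions on it."""

    def __init__(self, size):
        self._data = [0] * size         # plays the role of the shift register
        self._lock = threading.Lock()   # one caller at a time

    def call(self, action, index=0, value=0):
        with self._lock:
            if action == "add":
                self._data[index] += value   # modified in place, never copied out
                return None
            if action == "read":
                return self._data[index]
            if action == "sum":
                return sum(self._data)
            if action == "reset":
                for i in range(len(self._data)):
                    self._data[i] = 0
                return None
            raise ValueError(f"unknown action: {action}")

engine = ActionEngine(1000)
engine.call("add", index=10, value=7)
engine.call("add", index=10, value=5)
print(engine.call("read", index=10))
print(engine.call("sum"))
```

Callers only ever receive small results (one element, a sum), so the large buffer never has to cross a call boundary.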
Tools >>> Profile >>> Show buffer allocations
and
Tools >>> Profile >>> Profile Buffer Allocations...
If you still want to read more then check out these tags and the related links to find other discussions about memory that may be helpful.
Have fun!
Ben