07-14-2016 08:50 AM
I want to echo the concern expressed by RavensFan and others about the high number of dimensions.
I once created a 100-D array of booleans that quickly crashed the code. A 100-D array wire is "Yuge!".
But that was just a test, never intended for anything useful.
I did touch an age-old application that used a 4-D array (not written by me!) that should have used arrays of clusters rather than a 4-D array.
Aside from some anisotropic (is that the right word?) material properties, I have never imagined any measurement that needed more than 4-D.
In another long-lost thread from years ago, I think it was Jim Kring who asserted that there is never a need for anything above 4-D (I admit my memory may be faulty).
Ben
07-14-2016 11:55 AM
@ziedhosni wrote: Hi,
I am trying to calculate f(x1, x2, ..., xn) for the Ackley function, which can have any number of dimensions.
http://www.sfu.ca/~ssurjano/ackley.html
I am trying to calculate xi on a grid of 10 points (or a customisable number) for each dimension.
Thanks,
Zied
So this sounds like a curve fit you are trying to do. I'd argue that even though you might have N X-variables, you aren't really dealing with an N-dimensional array.
In any array, every slot must contain an element. So every row must have the same number of columns; they can't be ragged. Every page (3rd dimension) has to hold the same size 2-D array. Every "book" (4th dimension) has to have the same number of pages. You can't have any element be non-existent.
With typical curve fitting, you are trying to find the coefficients for each of the x_n. You will need at least N data points, where each data point consists of an F(x0, x1, x2, ...) value and all of those x's. You don't need 2^N or 10^N points. You really have a 2-D array of points (all the x-values that define each point) and a 1-D array of results.
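To make that concrete, here's a minimal sketch of that layout (in Python/NumPy rather than G, and assuming an ordinary linear least-squares model purely for illustration; the data and variable names are invented):

```python
import numpy as np

# Hypothetical data: 50 data points, each with N = 3 x-variables.
# X is a 2-D array (points x variables); y is the 1-D array of results.
rng = np.random.default_rng(0)
X = rng.uniform(-5.0, 5.0, size=(50, 3))
true_coeffs = np.array([2.0, -1.0, 0.5])
y = X @ true_coeffs + rng.normal(0.0, 0.1, size=50)

# Fit one coefficient per x-variable with ordinary least squares.
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coeffs)  # roughly [2.0, -1.0, 0.5]
```

Note there is no p^N grid anywhere: just one row per measured point.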
07-14-2016 12:00 PM
The page you link has a rather simple MATLAB script for computing the Ackley function. Here's a native G implementation:
Also attached the VI. Cheers!
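For anyone who can't open the attached VI, here is the same formula in rough text form; this is a Python/NumPy sketch using the a = 20, b = 0.2, c = 2π defaults from the linked page, not a transcription of the G code:

```python
import numpy as np

def ackley(x, a=20.0, b=0.2, c=2 * np.pi):
    """Ackley function for a point x of any dimension d."""
    x = np.asarray(x, dtype=float)
    d = x.size
    sum_sq = np.sum(x ** 2)
    sum_cos = np.sum(np.cos(c * x))
    return -a * np.exp(-b * np.sqrt(sum_sq / d)) - np.exp(sum_cos / d) + a + np.e

print(ackley([0.0, 0.0, 0.0]))  # global minimum, ~0 at the origin
```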
07-14-2016 08:43 PM - edited 07-14-2016 09:11 PM
So I decided to revisit this problem. It's a fun one! I came up with a data structure that will let you do pretty much whatever you want with this thing: an "ackley point" is really just a vector (1-D array) of coordinates in some N-dimensional vector space, to which the Ackley code I posted previously assigns a scalar value. Since you also want to know f(X) at that vector, that's computed as well. Both the X vector and the f(X) value are stored in a cluster to create the "ackley point". You can then run these points into a for loop and unbundle by name to do whatever you want. No need for a 10-D array at all; just the right data structure!
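In rough text form (a Python sketch with made-up names; the real cluster lives in the attached project), the "ackley point" idea looks something like this:

```python
from dataclasses import dataclass
import numpy as np

def ackley(x, a=20.0, b=0.2, c=2 * np.pi):
    # Same formula as the linked page; defaults a=20, b=0.2, c=2*pi.
    x = np.asarray(x, dtype=float)
    d = x.size
    return (-a * np.exp(-b * np.sqrt(np.sum(x ** 2) / d))
            - np.exp(np.sum(np.cos(c * x)) / d) + a + np.e)

@dataclass
class AckleyPoint:
    # Text analogue of the LabVIEW cluster: the coordinate vector X
    # plus the scalar f(X) evaluated at that vector.
    x: np.ndarray
    f: float

# Looping over points and reading fields mirrors the for loop + unbundle by name.
points = [AckleyPoint(x=np.array(v), f=ackley(v)) for v in ([0.0, 0.0], [1.0, -1.0])]
for p in points:
    print(p.x, p.f)
```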
The index meshgrid (think of it as a bunch of equally spaced points in the hypercube) I built to use with a ramping function can only handle 2,147,483,647 points (LabVIEW's maximum array size), so that's also the maximum number of points you can have. Additionally, the hypercube is created by taking the Cartesian product of a linspace'd interval with itself n times, so if you want a different number of points in one dimension than another, or if you want the Cartesian product of different intervals in your meshgrid, you're going to have to use the "cartesian product" VI in the project. The "List" element in the "List" cluster is currently an I32 to be used for the index meshgrid, but you could just as easily use doubles. I would personally stick with the 32-bit integer and just index a collection of ramping functions according to what you want to do. Let me know if you want me to expand on this.
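Here's a rough text sketch of that construction (Python; it generates coordinates directly rather than using the I32 index meshgrid plus ramp functions from the project, and it assumes the usual Ackley domain of ±32.768):

```python
import itertools
import numpy as np

def hypercube_grid(n_dims, points_per_dim=10, lo=-32.768, hi=32.768):
    """Cartesian product of one linspace'd interval with itself n_dims times.

    Yields one coordinate vector at a time instead of materializing the
    whole p**n grid in memory at once.
    """
    interval = np.linspace(lo, hi, points_per_dim)
    for coords in itertools.product(interval, repeat=n_dims):
        yield np.array(coords)

# 10 points per dimension in 3 dimensions -> 10**3 = 1,000 grid points.
print(sum(1 for _ in hypercube_grid(3)))  # 1000
```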
Hopefully the code is well documented enough to understand and apply elsewhere. I've linked all the places I took algorithms from, for future reference. Best of luck with your ML project 🙂
07-14-2016 09:21 PM
One last point.
As RavensFan mentioned before, the number of points grows VERY fast. If you have p points per interval and n dimensions, you'll have p^n total points, so with ~10 points per interval, ~9 dimensions is the absolute most that fits under the array-size limit. In practice, try to keep the array size under 10 million to prevent major hangups. The code seems pretty fast for fewer than 100,000 points, but past that it's anyone's guess how well it will perform. This is why most ML work is done on GPUs with frameworks better suited to this than LabVIEW, but that's a topic for another time. Also, I didn't do any memory optimization for this code, and LabVIEW's memory usage got past a gigabyte while debugging, so really try to keep the number of points minimal.
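As a back-of-the-envelope illustration (assuming 8 bytes per double, n coordinates plus one f(X) per point, and ignoring any cluster overhead; these figures are not measured from the posted project):

```python
# p points per interval, n dimensions -> p**n grid points.
for p, n in [(10, 6), (10, 7), (10, 9)]:
    total = p ** n
    approx_bytes = total * (n + 1) * 8  # (n coordinates + 1 result) * 8 bytes
    print(f"p={p}, n={n}: {total:,} points, ~{approx_bytes / 1e9:.2f} GB")
```

Even 7 dimensions at 10 points each is already several hundred megabytes, which is why staying under about 10 million points matters.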
If anyone wants to take a look and maybe suggest places to add In Place Element structures or Request Deallocation calls, I'm open to improvements.
07-19-2016 08:19 AM
Thank you guys for the effort. I managed to write this code. Unfortunately, I have to change the code each time I want to change the number of dimensions, but it is working for 7 dimensions and I can use it for any function optimisation and look for the extrema.
But I still have the memory issue. I realised that I cannot exceed 10,000,000 solutions per run; above that I get a memory error.
Is it possible to divide the data into portions and save them to files instead of keeping them in RAM? Also, I noticed that LabVIEW becomes slow after I run this code. Is it possible to empty the memory after each run?
Cheers,
Zied
07-19-2016 09:05 AM
Arrays of queues can be used to break up large memory blocks.
Ben
07-19-2016 09:08 AM - edited 07-19-2016 09:08 AM
zied, did you use my code or write your own? My code supports an arbitrary number of dimensions and points per interval, provided you have the memory for it. In practice, about 1 million points is the most you can realistically hold in memory. I did a bit of memory optimization and made a plotting utility (it only works in 2 dimensions), which gives the expected output:
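For comparison only (this is a rough Python/matplotlib sketch over the usual ±32.768 domain, not the attached LabVIEW plotting utility), a 2-D Ackley surface can be drawn like this:

```python
import numpy as np
import matplotlib.pyplot as plt

def ackley_2d(X, Y, a=20.0, b=0.2, c=2 * np.pi):
    # 2-D case of the Ackley formula from the linked page.
    s = (X ** 2 + Y ** 2) / 2.0
    cos_term = (np.cos(c * X) + np.cos(c * Y)) / 2.0
    return -a * np.exp(-b * np.sqrt(s)) - np.exp(cos_term) + a + np.e

xs = np.linspace(-32.768, 32.768, 200)
X, Y = np.meshgrid(xs, xs)
plt.contourf(X, Y, ackley_2d(X, Y), levels=50)
plt.colorbar(label="f(x1, x2)")
plt.title("Ackley function, 2-D")
plt.show()
```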
What kind of file do you need to write to? As you've seen, this data structure is going to take up gigabytes of memory, so you should think carefully about how you want to tackle this. Should it be a CSV? A bunch of CSVs? A TDMS file? Straight-up binary? What is your end goal? Are you going to use this data in LabVIEW or somewhere else?
I've attached the updated code; all the magic happens in ackley_main.vi, and ackley_plot.vi contains the plot shown above. In terms of memory usage:
1. Take note of the queue-based action engine I use to store the actual data points for later consumption. I use a queue instead of a normal functional global because a functional global you read from needs to hold the data in both the shift register and the indicator; not a big deal normally, but when the indicator holds hundreds of megabytes of data, that doubling needs to be addressed. I also clear indicators and controls that aren't needed, to preserve memory.
2. All usage of the data should be mediated through the functional global/action engine. This prevents unnecessary duplication; please don't add indicators to see the points, as that will add a lot of memory overhead.
3. Each iteration of the for loop in ackley_points that generates data includes a "Request Deallocation". This is important because it reduces any lingering data within that for loop to nothing; without it you get some lingering buffering for debugging purposes, which isn't good when your for loop is running millions or tens of millions of times.
4. Debugging is off in all data-generating VIs. This prevents LabVIEW from holding on to potentially millions of unneeded values.
Also, I seem to be having issues with the upload, so if anything I described doesn't seem to be there, please let me know.
07-19-2016 09:44 AM
Hi justlovemath,
Please review this thread about the impact of conditionally reading a control; it explains that a control does NOT have to store the data.
The "release Memory" function you are using is not doing what you think it is doing.
Just trying to help,
Ben
07-19-2016 09:57 AM - edited 07-19-2016 10:02 AM
Thanks for the tip, Ben! I'm going to keep the deallocation in the points VI and the Reinit to Default on the FG, as I've seen the memory usage decrease drastically with those. I've always wondered that about the functional global, though. Good to know.