LabVIEW


Using large datasets in python node 2

Solved!

I recently posted the following question: https://forums.ni.com/t5/LabVIEW/Using-large-datasets-in-python-node/m-p/4343916#M1273928. Since then, I modified my LabVIEW program to save the 2D data to a file, and modified the Python program to load this file and save a new file after UMAP processing. However, the same problem still occurs: when the UMAP code runs in the Python node in LabVIEW, error 1672 occurs. The error again depends on the data size: it does not occur when the data size is 100000 x 100, but it does occur when the data size is 200000 x 100. Is there a limit on the data size that the Python node can handle? Or does the Python node have a timeout? I would appreciate any advice. I have attached my LabVIEW code to this post, and the following is the Python code that connects LabVIEW and Python (saved as functions4.py).

 

import umap
import numpy as np

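def lv_umap(param0, param1, param2, input_tmp_file, output_tmp_file):
    # Load the 2D data that LabVIEW wrote to a temporary text file.
    dat1 = np.loadtxt(input_tmp_file)
    # param0 = n_neighbors, param1 = n_components, param2 = min_dist
    dat2 = umap.UMAP(n_neighbors=param0, n_components=param1, min_dist=param2).fit_transform(dat1)
    # Save the embedding as tab-separated text for LabVIEW to read back.
    np.savetxt(output_tmp_file, dat2, delimiter='\t')
    # Return only the shape of the result to LabVIEW.
    num_rows, num_columns = dat2.shape
    return num_rows, num_columns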
Message 1 of 7
(1,311 Views)

@ytan wrote:

... When the UMAP code runs in the Python node in LabVIEW, error 1672 occurs. The error again depends on the data size: it does not occur when the data size is 100000 x 100, but it does occur when the data size is 200000 x 100. Is there a limit on the data size that the Python node can handle?


I have seen this before:

# case 1, data exchange directly via python node

https://forums.ni.com/t5/LabVIEW/Tensorflow-Python-Node-breaks-after-weights-load/m-p/4216894#M1222...

 

- Error 1672 could be caused by a numpy function or definition that is unsupported or not fully supported

- It works up to a 2D array of doubles (8 bytes, 64-bit IEEE) with 1,000 x 2,048 elements and breaks at 10,000 x 2,048 (I didn't bother to pin it down further...)

- size = 1,000 * 2,048 * 8 bytes = 16.384 MByte (working)

- size = 10,000 * 2,048 * 8 bytes = 163.84 MByte (broken, error 1672)
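
The size arithmetic above can be reproduced with a few lines of Python. This is only a sketch: the helper name `array_megabytes` is made up for illustration, and it reports the raw in-memory size of the array (1 MByte taken as 10**6 bytes), not the size of a text file written from it.

```python
def array_megabytes(rows, cols, bytes_per_element):
    """Return the raw size of a rows x cols array in MByte (1 MByte = 10**6 bytes)."""
    return rows * cols * bytes_per_element / 1e6


# The two cases from this post, using 8-byte doubles (float64).
working = array_megabytes(1_000, 2_048, 8)
broken = array_megabytes(10_000, 2_048, 8)

print(f"working case: {working} MByte")
print(f"broken case:  {broken} MByte")
```

The same helper can be called with other shapes and element sizes (for example 4 bytes for float32) to compare a dataset against the sizes reported in this thread.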

 

# case 2, data exchange via file 

- (working) using pandas: https://forums.ni.com/t5/LabVIEW/Pandas/m-p/4167105#M1203513

- (broken, error 1672) using numpy.loadtxt https://forums.ni.com/t5/LabVIEW/Pandas/m-p/4167105#M1203513

- I made no attempts to look for a limit

 

 

 

Message 2 of 7
(1,236 Views)

Thanks a lot. I understand that numpy and other packages are not fully supported. I will try not to use these packages.

Message 3 of 7
(1,191 Views)

@ytan wrote:

Thanks a lot. I understand that numpy and other packages are not fully supported. I will try not to use these packages.



Here's the official NI statement regarding numpy: https://forums.ni.com/t5/LabVIEW/Tensorflow-Python-Node-breaks-after-weights-load/m-p/4217127#M12228...

 

 

In particular, numpy's loadtxt and (obviously) savetxt are affected. In your original post, are you using double (float64, 8 bytes per number)?

 

(working) 100,000 * 100 * 8 bytes = 80 MByte

(broken) 200,000 * 100 * 8 bytes = 160 MByte

 

Can you use pandas instead of numpy.savetxt? Or do you also run into a file size limit of 160 MByte?

 

 

 

 

Message 4 of 7
(1,113 Views)
Thanks for your suggestion. I tried pandas, but I got the same error. Again, there is no error with 100000 x 100 data (data format is float32), but an error occurs with 200000 x 100 data. UMAP itself uses numpy, so I guess that could be the problem, but it seems too hard to replace all the numpy functions in UMAP with others. The following is the code I wrote:
 
import umap
import pandas as pd

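def lv_umap(param0, param1, param2, input_tmp_file, output_tmp_file):
    # Read the tab-separated input file written by LabVIEW (no header row).
    df = pd.read_csv(input_tmp_file, delimiter='\t', header=None)
    dat1 = df.values
    # param0 = n_neighbors, param1 = n_components, param2 = min_dist
    dat2 = umap.UMAP(n_neighbors=param0, n_components=param1, min_dist=param2).fit_transform(dat1)
    # Write the embedding back as tab-separated text, without index or header.
    pd.DataFrame(dat2).to_csv(output_tmp_file, sep='\t', index=False, header=False)

    num_rows, num_columns = dat2.shape
    return num_rows, num_columns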
Message 5 of 7
(1,091 Views)
Solution
Accepted by ytan

@ytan wrote:
Thanks for your suggestion. I tried pandas, but I got the same error. Again, there is no error with 100000 x 100 data (data format is float32), but an error occurs with 200000 x 100 data.

 

(working) 100,000 * 100 * 4 bytes = 40 MByte

(broken) 200,000 * 100 * 4 bytes = 80 MByte

 

I don't know exactly how the Python node shares data between LabVIEW and Python, but reportedly some translation between the two data spaces takes place (marshalling).

 

You can still run your Python code from the command line: have the Python script write a file to disk, then read that file into LabVIEW.
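
A command-line version of the function from this thread might look like the sketch below. It is untested against LabVIEW and rests on several assumptions: the script name `run_umap.py`, the option names, and the printed "rows columns" output are all made up for illustration, and the default values are simply umap-learn's own defaults. The heavy imports sit inside `run()` so that argument errors are reported before numpy and umap are loaded.

```python
import argparse
import sys


def parse_args(argv):
    """Parse the command line; the option names here are illustrative."""
    parser = argparse.ArgumentParser(description="Run UMAP on a tab-separated text file.")
    parser.add_argument("input_file", help="text file written by LabVIEW")
    parser.add_argument("output_file", help="text file for LabVIEW to read back")
    parser.add_argument("--n-neighbors", type=int, default=15)
    parser.add_argument("--n-components", type=int, default=2)
    parser.add_argument("--min-dist", type=float, default=0.1)
    return parser.parse_args(argv)


def run(args):
    """Load the input file, run UMAP, save the embedding, and return its shape."""
    # Imported here so that a bad command line fails fast.
    import numpy as np
    import umap

    data = np.loadtxt(args.input_file, dtype=np.float32)
    embedding = umap.UMAP(
        n_neighbors=args.n_neighbors,
        n_components=args.n_components,
        min_dist=args.min_dist,
    ).fit_transform(data)
    np.savetxt(args.output_file, embedding, delimiter="\t")
    return embedding.shape


# Run only when started as a script with arguments, so the file can also be imported.
if __name__ == "__main__" and len(sys.argv) > 1:
    rows, cols = run(parse_args(sys.argv[1:]))
    print(rows, cols)
```

From LabVIEW, the command line would then be something like `python run_umap.py input.txt output.txt --n-neighbors 15`, after which LabVIEW reads `output.txt` with its normal file functions. Because Python runs as a separate process here, no array data passes through the Python node at all.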

 

 

Message 6 of 7
(1,073 Views)

Thanks again for your comments. I agree that running the Python code from the command line is a solution. I found System Exec.vi, which makes it possible to run command-line programs from LabVIEW, so I will try that.

Message 7 of 7
(1,068 Views)