03-02-2016 07:15 PM
Hi,
I am trying to speed up my application by moving some heavy matrix operations to the GPU. Since I have never created a block diagram that runs on the GPU before, I made a simple example to see how things work. Attached is a screenshot of the VI I wrote.
It simply initializes the device and the LVCUBLAS library; allocates memory for the matrices it is going to use and downloads the data to the device; and computes the multiplication. However, even though I do not get any error, the result I obtain, which should be [2, 2], is not correct. What I get back is basically the vector C, which is [0, 0].
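For reference, the cuBLAS *gemv routines compute y = alpha*A*x + beta*y, so getting the unchanged C vector back suggests the device call never actually wrote a result. A quick NumPy sketch of what the host-side answer should be (the matrix and vector values here are assumptions for illustration, since the VI's data is not shown):

```python
import numpy as np

# Assumed inputs for illustration (the original VI's data is not shown):
A = np.array([[1.0, 1.0],
              [1.0, 1.0]])   # 2x2 matrix
x = np.array([1.0, 1.0])     # input vector
C = np.array([0.0, 0.0])     # result vector, initialised to zeros

# cuBLAS *gemv computes y = alpha*A@x + beta*y; with alpha=1 and beta=0
# the expected device result is simply A @ x.
expected = 1.0 * (A @ x) + 0.0 * C
print(expected)  # -> [2. 2.]
```

If the device call had run at all (even with wrong data), C would normally be overwritten; coming back exactly as initialised is a hint the library call silently did nothing.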
Am I missing anything?
Thank you in advance for your help.
03-03-2016 04:36 AM
I took a CUDA online training course (using CUDA C) many years ago, but many things have since faded away, and I have never tested CUDA with LabVIEW. However, I have downloaded the toolkit (I have a laptop with an NVIDIA GPU) and also installed the CUDA Toolkit (I get my GPU properties with "Get Device Properties.VI", so it looks like the install went OK).
Could you attach your VI (I have LV2015), so I can test the results faster?
03-03-2016 08:59 AM
Hello Blokk,
thank you for the reply and for your help. Attached is the VI that I wrote for testing.
03-03-2016 09:58 AM
I get the same result as you. I was checking your VI for possible problems, but I still have no idea...
My main problem is that I just cannot find LabVIEW examples showing how to do this with this toolkit. There are only 4 examples in the Example Finder, and none of them involves matrix products...
Sorry, maybe someone who works with LabVIEW CUDA frequently will jump in...
03-04-2016 06:56 AM
Hi Blokk,
Finally I got it to work. I am posting my solution here because it could be useful to others.
The problem is that the latest version of the CUDA Toolkit, 7.5, ships only the 64-bit versions of the cuBLAS and cuFFT libraries. Hence, they cannot be called by LabVIEW 32-bit (I actually tried to force LabVIEW to do this, but I got an error; the explanation is here http://digital.ni.com/public.nsf/allkb/6C2CEE5925B8C1B08625721A00731B5E).
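A quick way to confirm a bitness mismatch like this is to read the DLL's PE header directly; a minimal Python sketch (not part of the original post, but the machine-field values come from the standard PE/COFF format):

```python
import struct

def dll_bitness(path):
    """Return '32-bit' or '64-bit' by reading the PE header's machine field."""
    with open(path, "rb") as f:
        data = f.read(4096)
    assert data[:2] == b"MZ", "not a DOS/PE file"
    # Offset 0x3C of the DOS header holds the offset of the PE signature.
    pe_offset = struct.unpack_from("<I", data, 0x3C)[0]
    assert data[pe_offset:pe_offset + 4] == b"PE\x00\x00", "missing PE signature"
    # The 2-byte machine field follows the 4-byte signature.
    machine = struct.unpack_from("<H", data, pe_offset + 4)[0]
    return {0x014C: "32-bit", 0x8664: "64-bit"}[machine]
```

Running this on cublas64_75.dll versus cublas32_65.dll would show the difference immediately, without waiting for LabVIEW's error (or, worse, a silent wrong result).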
I could install LabVIEW 64-bit, but I am also using the FPGA module, which requires LabVIEW 32-bit, as highlighted here http://www.ni.com/pdf/manuals/374737b.html
I cannot install the 32-bit CUDA Toolkit of a previous version (6.5) either, since my machine runs Windows 7 64-bit.
Then I overcame all this simply by copying the libraries cublas32_65.dll and cufft32_65.dll from the CUDA Toolkit 6.5 installer and pasting them into the CUDA Toolkit 7.5 folder installed on my machine.
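The copy step above can be scripted; a minimal Python sketch (the install paths below are assumptions, since the post names only the DLLs, not the exact folders):

```python
import shutil
from pathlib import Path

# Hypothetical default locations; adjust to your actual install paths.
CUDA65_BIN = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v6.5\bin")
CUDA75_BIN = Path(r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\bin")

def copy_32bit_dlls(src=CUDA65_BIN, dst=CUDA75_BIN):
    """Copy the 32-bit cuBLAS/cuFFT DLLs from the 6.5 install into 7.5."""
    for dll in ("cublas32_65.dll", "cufft32_65.dll"):
        shutil.copy2(src / dll, dst / dll)
```

Note this only works because the DLL file names are versioned (the "_65" suffix), so the copies do not overwrite anything in the 7.5 folder.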
Hope this can help.
03-04-2016 09:28 AM - edited 03-04-2016 09:33 AM
This is crazy!
What if someone has a built application using the NI toolkit and tries to run it on another PC? If the CUDA Toolkit version is not correct, they will not get any error message, just wrong results! This can lead to serious problems... 😞
Not to mention how much time you lost because the toolkit was implemented with such improper compatibility checking...
Edit: for your info, my system (which also produced the bug):
Windows 7 x64, LabVIEW 2015 Pro 32 bit, CUDA Version 7.5.18
05-30-2016 04:45 AM
That solved my problem.
I am also using LabVIEW 32-bit (also with the FPGA module) on 64-bit Windows.
I did the workaround and it worked wonderfully.
Thanks (and kudos).
Yoni.