04-14-2013 08:50 AM
I have a For Loop that calls a function inside a DLL which I wrote. I'm observing something strange when I parallelize the execution of the loop, and I wonder whether it rings a bell with anyone (a specific bug or caveat of LV parallelization, for instance) or whether I have to look elsewhere.
The platform is Win7, LV2010 SP1; the problem was observed on two different computers, a quad-core and an octa-core.
My DLL function (I have the C source code) interfaces with certain OpenCV 1.0 routines, allocates its own buffers, does its own things, returns its results and deallocates its storage. I don't think it is important to go into detail here; what is relevant is the following:
I had no a priori information about the thread safety of the underlying OpenCV 1.0 routines, but the fact that they DO run in parallel without problems for up to 3 threads, and that the execution time scales inversely with the number of threads for up to 6 threads when the job is split across instances, is a hint that they hopefully are thread-safe.
I vaguely remember that the number of threads which LV reserves for some of its subprocesses may be 4 or 8 (e.g. the GUI thread, configurable somehow IIRC), and I wonder if that has anything to do with what I observe.
Any hint?
TIA, Enrico
04-15-2013 09:22 PM
Hi Enrico,
The default number of threads in LabVIEW is 4.
1 UI thread and 3 others, that is why you can get an error when the number is more than 3. Have a look at this knowledge base which shows how you can increase the number of threads.
Regards
04-16-2013 01:23 AM - edited 04-16-2013 01:33 AM
@Arham-H wrote:
The default number of threads in LabVIEW is 4.
1 UI thread and 3 others, that is why you can get an error when the number is more than 3. Have a look at this knowledge base which shows how you can increase the number of threads.
This statement disagrees with "How Many Threads Does LabVIEW Allocate?", which says the number is much higher and depends on the number of CPU cores. Can you clarify your statement?
I've been running up to 32 parallel instances of a parallel for loop without the need of any special thread configuration setting on a system with 16 hyperthreaded cores (32 virtual cores).
04-16-2013 02:05 AM - edited 04-16-2013 03:47 PM
@Enrico_Segre wrote:
Any hint?
You said you made the DLL yourself. I am not familiar with OpenCV. Is the DLL compiled so that it includes everything, or does it call external DLLs itself? Does your DLL include any internal parallelization (e.g. parallel_for)? If you parallelize in LabVIEW, you should probably make sure that the DLL does not include parallel code (just guessing).
You did not say how many parallel instances of the For Loop you have configured. What are the exact models of your CPUs?
LabVIEW 2010 seems a bit old. Have you tried 2012? (You can download the free evaluation version to test.)
(I have a DLL written in Fortran that cannot be called concurrently. I ended up dynamically creating N unique copies of the DLL at startup, then calling it using the parallel instance ID, indexing into an array of DLL names. Configure the CLFN to specify the path on the diagram. I don't think LabVIEW 2010 has the parallel instance ID output, so you might be out of luck. I do get a 17x parallelization speedup on a system with 16 hyperthreaded cores, and there was no need to tweak the thread configuration at all.)
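For anyone who wants to try the "N unique copies" trick, here is a minimal sketch in C of the startup step only, under stated assumptions. The function names (`instance_dll_name`, `copy_file`, `make_instance_copies`) and the `<stem>_<index>.dll` naming scheme are made up for illustration. In LabVIEW the same thing would normally be done with file I/O on the diagram. The idea, presumably, is that each copy is loaded from a different path as a separate module with its own static data.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: builds "<stem>_<index>.dll" into out.
   Returns 0 on success, -1 if the name does not fit in out_size bytes. */
int instance_dll_name(char *out, size_t out_size, const char *stem, int index)
{
    int n = snprintf(out, out_size, "%s_%d.dll", stem, index);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* Copies src to dst byte for byte. Returns 0 on success, -1 on any I/O error. */
int copy_file(const char *src, const char *dst)
{
    FILE *in = fopen(src, "rb");
    if (in == NULL) return -1;
    FILE *out = fopen(dst, "wb");
    if (out == NULL) { fclose(in); return -1; }

    char buf[4096];
    size_t n;
    int rc = 0;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) { rc = -1; break; }
    }
    if (ferror(in)) rc = -1;
    fclose(in);
    if (fclose(out) != 0) rc = -1;
    return rc;
}

/* Creates stem_0.dll ... stem_(count-1).dll as copies of src.
   Returns how many copies were created; stops at the first failure. */
int make_instance_copies(const char *src, const char *stem, int count)
{
    assert(src != NULL && stem != NULL); /* precondition of this sketch */
    char name[260];
    int made = 0;
    for (int i = 0; i < count; ++i) {
        if (instance_dll_name(name, sizeof name, stem, i) != 0) break;
        if (copy_file(src, name) != 0) break;
        ++made;
    }
    return made;
}
```

Each parallel instance would then call the copy whose index matches its instance ID, so no two instances share the same loaded module.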
04-16-2013 02:11 PM
Thanks to both of you for the replies. I have to admit that I reported a little too early, before doing sufficient analysis. I apologize for that and plan to carry out further tests.
So far I have discovered that the issue of "if N>6 the routine halts irrecoverably" was due to a bug of mine, namely passing a misdimensioned matrix to the routine in that case, rather than to a LV fault. Still, I confirm that when errors 1097 pop up, they do so seemingly at random. It may well be that the innards of my routine (which certainly calls other stuff in the OpenCV DLLs; no idea about parallelization pragmas) violate allocations somewhere, possibly also depending on the input data; but I still wonder whether there is some issue when errors are generated in parallel threads, so that they are not always concurrently reported. I will see if I can shed more light.
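A misdimensioned matrix of this kind can be caught inside the DLL before any memory is touched. The following is only a sketch under assumptions: `process_image`, the `PROC_*` status codes and the extra `buf_len` argument are hypothetical and are not part of the actual routine, LabVIEW or OpenCV. It assumes the caller passes the real element count of the array (for example from Array Size) next to the nominal dimensions, so that the C side can refuse to read or write past the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical status codes returned to the caller. */
enum { PROC_OK = 0, PROC_ERR_NULL = -1, PROC_ERR_DIMS = -2 };

/* Hypothetical exported routine: inverts an 8-bit image in place.
   rows, cols : dimensions the caller claims the matrix has
   buf_len    : number of elements actually allocated by the caller
   Returns PROC_OK, or a negative code without touching data. */
int process_image(uint8_t *data, int32_t rows, int32_t cols, int32_t buf_len)
{
    if (data == NULL) return PROC_ERR_NULL;
    if (rows <= 0 || cols <= 0 || buf_len < 0) return PROC_ERR_DIMS;

    /* 64-bit product so that rows * cols cannot overflow int32. */
    int64_t needed = (int64_t)rows * (int64_t)cols;
    if (needed > (int64_t)buf_len) return PROC_ERR_DIMS;

    assert(needed > 0); /* internal invariant after the checks above */
    for (int64_t i = 0; i < needed; ++i) {
        data[i] = (uint8_t)(255 - data[i]);
    }
    return PROC_OK;
}
```

With a check like this, a dimension mismatch comes back as an ordinary return value that can be wired to an error case on the diagram, rather than as an out-of-bounds access inside the external code.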
As for looking into more recent LV versions, yes, I have access to them; LV2010 was only the one installed on the target systems where I became aware of the problem.
Enrico
04-16-2013 06:23 PM
I stand corrected; the number of threads does depend on the number of CPU cores. I stated 4 threads because that is the default number when a For Loop is configured for parallelism.
Regards