LabVIEW


Why is this set of case statements not executing in parallel?


I had originally tried to use an auto-indexed loop, but I think some of the aforementioned data flow issues might have kept it from running in parallel.

 

As for the GET ALL DATA function, it requests data over a TCP/IP connection that times out after 10 seconds if the device does not respond. The idea is to sample all devices in parallel, so that the worst-case total wait when devices are not responding is 10 seconds rather than 10 seconds per unresponsive device. So, if 9 out of 10 devices are disconnected, it would still take only approximately 10 seconds to execute rather than 90.
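For anyone who thinks in text code, the timing argument can be sketched in Python. This is only an analogy for the LabVIEW diagram, not the actual subVI: `get_all_data`, the device names, and the shortened 0.2 s timeout are all hypothetical stand-ins, and a thread pool plays the role of the parallel reentrant calls.

```python
import time
from concurrent.futures import ThreadPoolExecutor

TIMEOUT_S = 0.2  # stand-in for the 10 s TCP/IP timeout, shortened for the demo


def get_all_data(device):
    """Hypothetical stand-in for the GET ALL DATA subVI.

    A responsive device returns immediately; an unresponsive one
    blocks for the full timeout and then reports failure.
    """
    name, responsive = device
    if not responsive:
        time.sleep(TIMEOUT_S)  # simulate waiting out the timeout
        return name, False
    return name, True


def sample_sequential(devices):
    # Timeouts add up: one full wait per unresponsive device.
    return [get_all_data(d) for d in devices]


def sample_parallel(devices):
    # One worker per device so every timeout overlaps with the others.
    with ThreadPoolExecutor(max_workers=len(devices)) as pool:
        return list(pool.map(get_all_data, devices))


def timed(fn, devices):
    start = time.perf_counter()
    results = fn(devices)
    return results, time.perf_counter() - start


# 9 of 10 devices disconnected, as in the example above.
devices = [("dev0", True)] + [(f"dev{i}", False) for i in range(1, 10)]
seq_results, seq_time = timed(sample_sequential, devices)
par_results, par_time = timed(sample_parallel, devices)
```

In this sketch the sequential version waits roughly nine timeouts and the parallel version roughly one, which is the same 90 s versus 10 s reasoning scaled down.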

 

So, now that I have a working solution, I think I will try to roll it all back into a parallel loop. Thanks for the code, James. I will start with that approach. The GET ALL DATA is re-entrant, and in the past I have had it working in parallel by knowing how many devices are available ahead of time. The problem cropped up when I tried to make the code more generic to allow for an unknown number of devices to be sampled. I think I will do away with the Merge Errors function and return an array of errors to make the call lossless. I am concerned about how LabVIEW unrolls parallel loops. As mentioned, how does the number of cores and execution threads affect the performance of the parallel loop?
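The "array of errors" idea can be illustrated with a small Python sketch. Merge Errors reports only the first error it finds, so keeping one error slot per device preserves the rest. Everything here is an assumption for illustration: `read_device` is a made-up function in which odd-numbered devices fail, and exceptions stand in for LabVIEW error clusters.

```python
from concurrent.futures import ThreadPoolExecutor


def read_device(index):
    """Hypothetical device read: odd-numbered devices fail."""
    if index % 2:
        raise TimeoutError(f"device {index} did not respond")
    return f"data from device {index}"


def read_one(index):
    # Catch the failure inside the worker so one bad device
    # cannot hide or overwrite another device's error.
    try:
        return read_device(index), None
    except Exception as exc:
        return None, exc


def read_all(count):
    # Returns two lists of equal length, indexed by device number.
    with ThreadPoolExecutor(max_workers=max(count, 1)) as pool:
        pairs = list(pool.map(read_one, range(count)))
    data = [d for d, _ in pairs]
    errors = [e for _, e in pairs]
    return data, errors


def first_error(errors):
    """Roughly what a merged error keeps: only the first failure."""
    return next((e for e in errors if e is not None), None)


data, errors = read_all(4)
```

With the per-device list, the caller can tell that devices 1 and 3 both failed, whereas `first_error` only ever surfaces device 1.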

Message 11 of 29
(1,573 Views)

@James.M wrote:

 



I was simulating his subVI, which has a boolean output. He didn't share his code so I don't have the subVI itself.

Get All Data.png

 

 

I see now why OP did it without a For loop, if he has more devices than he has cores. The For loop could save him coding duplication for at least the number of cores.

 

 


Gotcha.  That makes sense now.  I missed the boolean wire in his image.

 

For Loop is still better.  It doesn't matter how many cores he has.  If the number of actual parallel instances is limited by the number of cores in the For Loop, it is limited in the multi-case structure as well.  But if the primary problem is long delays in the subVIs, the limited cores should not be a problem, as each core could handle multiple copies through thread switching while other threads sleep during their timeout.

Message 12 of 29
(1,570 Views)

@RavensFan wrote:
For Loop is still better.  It doesn't matter how many cores he has.  If the number of actual parallel instances is limited by the number of cores in the For Loop, it is limited in the multi-case structure as well.  But if the primary problem is long delays in the subVIs, the limited cores should not be a problem, as each core could handle multiple copies through thread switching while other threads sleep during their timeout.

I believe it does still matter. The For loop won't do more parallel instances than allowed by the number of cores. The way around this is nested For loops. Then each loop is split between cores in a weird way. I don't think this improves processing speed any more than a single parallelized For loop, but it will allow the code to all run in parallel up to #Cores^2. See below:

Get All Data Nested.png

Important: If your array size isn't evenly divisible by the number of cores, you will need to filter out the array elements that you don't want because there will be dummy elements in your 2D array acting as fillers.
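The padding caveat can be shown in a few lines of Python. This sketch covers only the reshape-and-filter bookkeeping, not the parallel execution of the nested loops, and it makes some assumptions of its own: `to_grid`, `process_grid`, and the `None` filler are hypothetical names, and a column count of 4 stands in for the number of cores.

```python
import math

FILLER = None  # marks dummy cells; assumes None is never a real device


def to_grid(items, columns):
    """Reshape a flat list into rows of `columns` cells, padding the last row."""
    if columns < 1:
        raise ValueError("columns must be at least 1")
    rows = math.ceil(len(items) / columns)
    padded = list(items) + [FILLER] * (rows * columns - len(items))
    return [padded[r * columns:(r + 1) * columns] for r in range(rows)]


def process_grid(grid, worker):
    """Run `worker` on every real cell, skipping fillers, and flatten the results."""
    return [worker(cell) for row in grid for cell in row if cell is not FILLER]


devices = [f"dev{i}" for i in range(10)]
grid = to_grid(devices, 4)  # 10 devices do not divide evenly into rows of 4
results = process_grid(grid, str.upper)
```

Ten devices in rows of four leave two filler cells in the last row, and those are the elements that have to be skipped before the results are used.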

Cheers


--------,       Unofficial Forum Rules and Guidelines                                           ,--------

          '---   >The shortest distance between two nodes is a straight wire>   ---'


Message 13 of 29
(1,554 Views)

I will definitely have more array elements than the number of cores^2. As long as the parallel iterations can do appropriate context switching and the threads support shared scheduling, the actual execution time is negligible compared to the timeout wait for an unresponsive device.

Message 14 of 29
(1,542 Views)

Jeez, how many devices do you have?

 

You might be able to repeat what I did above with another For loop, so it goes to Cores^3, but I haven't tested this.

Cheers




Message 15 of 29
(1,535 Views)

Nice implementation.

 

I hadn't used parallel For loops much.  I was thinking it could do more instances than cores because the dialog puts no limit on the number you can enter.  But now I see that if you enter too big a number, it limits it (to 4 in my case).

Message 16 of 29
(1,531 Views)

Here is what I boiled it down to:

 

LabView-Parallel-Cases-Solution-Loop.jpg

I just tested it and the entire loop takes approximately 12 seconds to execute. That seems about right with the data processing involved. Thanks to all for the input and suggestions. I have a much better handle on data flow and its impact on making things execute in parallel.

Message 17 of 29
(1,504 Views)

For something like this, I might consider not trying to use a single subVI to do all the work.  When I think of communication, I think of putting it in a separate parallel process to communicate with.  Send information requests to a queue.  Allow the separate loop to update the data array stored in a functional global variable as each response comes back.  I'm assuming that since it is all TCP/IP, there isn't a restriction on how many devices you can talk to at once, since each device would have its own IP address.
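As a rough text-code analogy for that architecture, here is a minimal Python sketch: a queue carries requests to a separate communication loop, which publishes each response into a shared store. The names are all hypothetical. `LatestData` is a loose stand-in for a functional global variable, and `query_device` is a placeholder for the real TCP/IP request.

```python
import queue
import threading

STOP = object()  # sentinel telling the communication loop to exit


class LatestData:
    """Lock-protected store, a rough stand-in for a functional global variable."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def update(self, device, value):
        with self._lock:
            self._values[device] = value

    def snapshot(self):
        with self._lock:
            return dict(self._values)


def query_device(device):
    """Hypothetical placeholder for the real TCP/IP request."""
    return f"reading from {device}"


def communication_loop(request_queue, store):
    # Runs in its own thread: take a request, talk to the device, publish the result.
    while True:
        device = request_queue.get()
        if device is STOP:
            break
        store.update(device, query_device(device))


request_queue = queue.Queue()
store = LatestData()
worker = threading.Thread(target=communication_loop, args=(request_queue, store))
worker.start()

for device in ("dev0", "dev1", "dev2"):
    request_queue.put(device)  # the main loop only enqueues and moves on
request_queue.put(STOP)
worker.join()
latest = store.snapshot()
```

The main loop never blocks on a device in this arrangement; it reads whatever the store holds at the moment. A single communication loop like this one still handles requests one at a time, so overlapping the timeouts would take several such loops or one worker per device.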

 

I haven't dived into this pool myself yet, but it sounds like the kind of app meant for object-oriented programming.

 

Another possibility is to have each information request spawn a separate instance of a reentrant subVI in a Call and Forget manner.

Message 18 of 29
(1,500 Views)

Yeah, I considered a state machine to keep track of the requests and responses, but it would require quite a bit of rewriting of existing infrastructure. The same goes for using events and callbacks. It can be challenging to keep growing an existing program as it increases in complexity. At some point a rewrite of higher-level logic will need to take place to make this more efficient, especially if the program needs to update fields or perform other logic in near real-time. For now, tens of seconds of delay in the main execution loop will not cause any major problems.

Message 19 of 29
(1,493 Views)
Solution
Accepted by Arcus111

@Arcus111 wrote:

Here is what I boiled it down to:

 

LabView-Parallel-Cases-Solution-Loop.jpg

I just tested it and the entire loop takes approximately 12 seconds to execute. That seems about right with the data processing involved. Thanks to all for the input and suggestions. I have a much better handle on data flow and its impact on making things execute in parallel.


This will still be limited by the number of cores.

 

I would look into Call and Collect as a way to do infinite launches like RavensFan suggests. I made this below if you want to do nested For loops. This is where I stop spending so much time on this though, haha.

Get All Data NestedNested.png
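For comparison, the start-then-collect pattern behind Call and Collect can be sketched in Python. This is an analogy only and not LabVIEW's asynchronous call API: `get_all_data` is a hypothetical stand-in for the reentrant subVI, futures play the role of the started calls, and the 50 devices and 0.05 s delay are made-up numbers.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def get_all_data(device_id):
    """Hypothetical stand-in for the reentrant GET ALL DATA subVI."""
    time.sleep(0.05)  # pretend to wait on the network
    return device_id * 10


def start_calls(pool, device_ids):
    # "Start" phase: launch every call without waiting for any of them.
    return {device_id: pool.submit(get_all_data, device_id) for device_id in device_ids}


def collect_calls(pending):
    # "Collect" phase: block on each call in turn and gather its result.
    return {device_id: future.result() for device_id, future in pending.items()}


device_ids = list(range(50))  # more devices than a typical core count
with ThreadPoolExecutor(max_workers=len(device_ids)) as pool:
    pending = start_calls(pool, device_ids)
    results = collect_calls(pending)
```

Because the number of launches is decided at run time from the device list, nothing in this pattern ties the amount of overlap to the core count.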

Cheers




Message 20 of 29
(1,491 Views)