01-21-2025 04:17 AM
I am trying to get more precise control over thread management in LabVIEW.
Basically, I would like to have a dedicated thread in which my loop executes its code.
The simple answer would be a Timed Structure, as it does what I need and also gives control over CPU assignment.
Normally I would end here but there are two problems with this answer.
1. The number of Timed Loops is limited (I need quite a lot of them).
2. The RT palette is not available in the non-RT Linux environment that I am planning to use (or is there a way to have it on non-RT Linux?).
The question is: is there a way to "reserve" a thread in LabVIEW for a particular piece of code in the way as the Timed Loop does it?
I assume that under the hood the Timed Loop just tells the compiler to reserve a thread for it and not share that thread with other parts of the application, and that there may be a way to do the same thing directly, even if it is well hidden under the Timed Loop implementation. In essence, the Timed Loop would then be just a wrapper for something that experienced LabVIEW programmers could leverage themselves (again, just an assumption).
I already read through a couple of white papers on that.
I also saw a thread very similar to mine, but without an actual answer being provided: https://forums.ni.com/t5/LabVIEW/How-to-manage-threads-in-LabVIEW-or-have-exclussive-access-to-a/m-p...
01-21-2025 06:23 AM - edited 01-21-2025 06:27 AM
@graPI wrote:
I would like to have a dedicated thread in which my loop executes the code.
Well, it is possible, but not easy. You can do it with a specially prepared wrapper DLL (in which you start a dedicated thread and switch context for each call). Before we continue, can you please provide information about your particular case: why do you need this? The only common reason is that you have a third-party DLL that is not thread safe, and you would like to call it outside the UI thread (for example, because you don't want to block the UI thread for a long time). Exactly which LabVIEW and OS versions are you using?
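To illustrate the "dedicated thread plus context switch per call" idea, here is a minimal sketch in Python rather than in a real DLL. The class `DedicatedThread` and its methods are hypothetical names made up for this illustration; they are not part of LabVIEW or any NI API. Every call is handed to one long-lived thread through a queue, and the caller blocks until the result comes back.

```python
import queue
import threading
from concurrent.futures import Future


class DedicatedThread:
    """Runs every submitted call on one long-lived thread (hypothetical helper)."""

    def __init__(self, name="dedicated-worker"):
        self._jobs = queue.Queue()
        self._thread = threading.Thread(target=self._run, name=name, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            job = self._jobs.get()
            if job is None:              # sentinel: shut the thread down
                break
            func, args, future = job
            try:
                future.set_result(func(*args))
            except Exception as exc:     # pass the error back to the caller
                future.set_exception(exc)

    def call(self, func, *args):
        """Hand func to the dedicated thread and block until it returns."""
        future = Future()
        self._jobs.put((func, args, future))
        return future.result()

    def close(self):
        self._jobs.put(None)
        self._thread.join()


if __name__ == "__main__":
    worker = DedicatedThread()
    # Every call reports the same thread id, whichever thread calls in.
    ids = {worker.call(threading.get_ident) for _ in range(5)}
    print(len(ids))
    worker.close()
```

The price of this design is one hand-over per call, so it suits calls that must stay on a single thread more than it suits very short, very frequent work.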
01-21-2025 07:15 AM
I need to run approx 100 async threads.
Each thread performs some operation/calculation that takes approx. 10-250 us.
Each of the threads has its own cycle that should be kept.
My observation is that when I let LabVIEW control the scheduling, the cycle of the threads is very unstable.
Example:
test1: Code of all threads called one by one in a single VI needs ~20 ms to complete (OK).
test2: Code of all threads split into 2 loops running in parallel needs ~10 ms to complete (perfectly OK so far: 2 threads, time halved).
test3: Code of each thread running in its own async VI, all in parallel, needs 20-100 ms to complete (sum from all threads).
So for some reason, when LabVIEW schedules the 100 threads, the result is far from optimal.
Currently the tests are run on LV2020, on a PXI with Linux RT.
Hence, I would like to have more control over thread scheduling. This can be done on Linux, but only if I know the actual thread to schedule, and since LabVIEW puts my code in different threads I cannot do that. Moreover, switching between threads seems to add overhead (this is observed in test2 when compared to the same code running in Timed Loops).
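For reference, this is roughly what the Linux side looks like once code stays on a known thread. It is a sketch in Python using the standard `os.sched_*` calls, not something LabVIEW exposes; `pin_current_thread` and `run_pinned` are hypothetical helper names. Requesting `SCHED_FIFO` normally needs root or `CAP_SYS_NICE`, so the sketch simply keeps the default policy when that is refused.

```python
import os
import threading


def pin_current_thread(cpu, fifo_priority=None):
    """Pin the calling thread to one CPU; optionally request SCHED_FIFO.

    Returns the affinity set now in effect, or None when the platform
    does not provide the Linux scheduling calls.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None                      # not Linux: nothing to do
    # On Linux, pid 0 means "the calling thread" for these calls.
    os.sched_setaffinity(0, {cpu})
    if fifo_priority is not None:
        try:
            os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(fifo_priority))
        except PermissionError:
            pass                         # not privileged: keep the default policy
    return os.sched_getaffinity(0)


def run_pinned(cpu, work):
    """Run work() on a new thread that first pins itself to the given CPU."""
    result = {}

    def body():
        result["affinity"] = pin_current_thread(cpu)
        result["value"] = work()

    thread = threading.Thread(target=body)
    thread.start()
    thread.join()
    return result
```

The point of the sketch is the precondition: the pinning call has to run on the thread that does the work, which is exactly what cannot be guaranteed when the runtime moves code between threads.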
01-21-2025 07:26 AM
test4: Make a parallel for-loop with as many threads as cores and try again.
Clearly, the overhead of handling 100 threads is large compared to the work being done.
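The structure suggested in test4 can be sketched like this (in Python, only to show the shape; `split_into_chunks` and `run_chunked` are hypothetical names). Instead of 100 independent workers, the tasks are split into as many chunks as there are cores, and each worker loops over its own chunk. Note that plain Python threads do not run pure-Python code in parallel, so this shows the partitioning, not the speed-up.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, n_chunks):
    """Split items into n_chunks contiguous slices of near-equal size."""
    base, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)   # first `extra` chunks get one more
        chunks.append(items[start:start + size])
        start += size
    return chunks


def run_chunked(tasks, n_workers=None):
    """Run all tasks with one worker per core, each looping over its chunk."""
    n_workers = n_workers or os.cpu_count() or 1
    chunks = split_into_chunks(tasks, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        per_chunk = pool.map(lambda chunk: [task() for task in chunk], chunks)
        # Flatten while keeping the original task order.
        return [value for chunk_result in per_chunk for value in chunk_result]


if __name__ == "__main__":
    tasks = [(lambda i=i: i * i) for i in range(100)]
    results = run_chunked(tasks)
    print(len(results))
```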
01-21-2025 09:49 AM
Not sure if it applies to your case, but there is a known parallel loop bug in LabVIEW, as discussed in this forum post.
01-22-2025 04:17 AM
I am aware of this. I tested the workaround and it made no difference in my code.
One additional interesting thing that I did not mention before.
In test2 I tried a couple of options, all executed on a quad-core PXI and on an 8-core PXI.
I got the best results when there were 2 parallel VIs (runners) executing, each of them doing 50% of the work.
Incrementing the number of runners increased the cycle of each runner (and total execution time as a result), even though for 3 runners each of them just got 33% of work to do.
Why can that be?
I would assume that as long as I have fewer parallel VIs than CPU cores, I should get better results when splitting the workload across more threads.
01-22-2025 06:19 AM
@graPI wrote:
Incrementing the number of runners increased the cycle of each runner (and total execution time as a result), even though for 3 runners each of them just got 33% of work to do.
Why can that be?
I guess it is because of this:
@graPI wrote:
...
(sum from all threads).
I have a strong feeling that you are using the sum of all "single" execution times, but this will include idle time.
Let me explain.
For example, my worker thread looks like the one shown below:
It just computes the SHA256 of a 4K array, and I measure the execution time.
Now I call it 100 times in a loop, and also do the same with two parallel threads:
As you can see, the first loop takes 9.3 ms on my PC, and the sum of all individual times is also 9.3 ms. With two threads, the total execution time is around half, 4.7 ms, but the sum of the individual times remains more or less the same.
The situation changes if I run it on 8 threads (I have a CPU with 8 physical cores):
Now the total computation time with 8 threads is only 1.8 ms, but the sum of the individual times is significantly larger: 13.2 ms. This is a normal and expected result.
For 16 threads the difference is larger:
If the iterations have significantly different execution times, then it makes sense to adjust the chunk sizes (though only in very rare cases) or, for workers running in parallel, the execution order, to get a "balanced" load and maximal CPU utilization.
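The difference between the two numbers can be reproduced with a small sketch (Python here, since a LabVIEW diagram cannot be pasted as text; `timed_hash` and `measure` are hypothetical names). Each call times itself, and the caller separately times the whole run. An individual time also counts any moment the call was waiting for a core, which is why the sum can grow while the wall-clock time shrinks. Absolute numbers will differ from the ones above, and in this sketch the wall-clock time also includes starting the thread pool.

```python
import hashlib
import time
from concurrent.futures import ThreadPoolExecutor

PAYLOAD = bytes(4096)   # one 4K block, as in the SHA256 example


def timed_hash(_):
    """Hash one block and return how long this single call took."""
    start = time.perf_counter()
    hashlib.sha256(PAYLOAD).digest()
    return time.perf_counter() - start


def measure(n_calls, n_workers):
    """Return (wall_clock_seconds, sum_of_individual_seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        individual = list(pool.map(timed_hash, range(n_calls)))
    wall = time.perf_counter() - start
    return wall, sum(individual)


if __name__ == "__main__":
    for workers in (1, 2, 8, 16):
        wall, total = measure(100, workers)
        print(f"{workers:2d} workers: wall {wall * 1e3:.2f} ms, "
              f"sum of individual times {total * 1e3:.2f} ms")
```

The only firm relation between the two values is that the sum can never exceed the wall-clock time multiplied by the number of workers.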
01-24-2025 01:44 AM
Thank you, Andrey, for your comment, but I think I am doing the calculations correctly.
My service looks as follows:
The #computations step for a single chunk needs approx. 8 ms, so it is expected that it will not fit in a single instance of the service; as a result, the `loop cycle` is 8 ms instead of the 4 ms passed to the wait.
For 2 instances it gets better, and the average loop cycle is approx. 4 ms.
I would expect the result for 3 or more instances to be even better on an 8-core system, but it is the opposite.
The cycle gets very unstable (1-2ms jitter).
It is not the case for the Timed Loop though.
When the while loop is replaced with a Timed Loop, the `loop cycle` gets extremely precise (jitter ~0.005 ms).
I want to achieve the same result with while loop.
I think it must just be something to do with the priority and scheduler used by the Timed Loop, but maybe there is something more.
Just modifying the scheduler and priority for the thread does not help much, so maybe LabVIEW simply executes the tasks in while loops differently than in Timed Loops 😕
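For anyone who wants to compare loop cycle and jitter outside LabVIEW, here is a minimal measuring sketch in Python (`run_periodic` and `cycle_stats` are hypothetical names). It waits for absolute deadlines, so a late iteration does not push all later ones back, and it reports jitter as the largest cycle minus the smallest. This is only one common way to build a fixed-period loop and is not a claim about how the Timed Loop is implemented.

```python
import time


def run_periodic(period_s, iterations, work=lambda: None):
    """Call work() once per period using absolute deadlines; return start times."""
    starts = []
    next_deadline = time.perf_counter()
    for _ in range(iterations):
        # Sleep until the absolute deadline so delays do not accumulate.
        remaining = next_deadline - time.perf_counter()
        if remaining > 0:
            time.sleep(remaining)
        starts.append(time.perf_counter())
        work()
        next_deadline += period_s
    return starts


def cycle_stats(starts):
    """Return (mean_cycle, jitter), with jitter = longest cycle - shortest cycle."""
    cycles = [later - earlier for earlier, later in zip(starts, starts[1:])]
    mean = sum(cycles) / len(cycles)
    return mean, max(cycles) - min(cycles)


if __name__ == "__main__":
    starts = run_periodic(0.004, 250)          # 4 ms cycle, as in the service
    mean, jitter = cycle_stats(starts)
    print(f"mean cycle {mean * 1e3:.3f} ms, jitter {jitter * 1e3:.3f} ms")
```

If `work()` takes longer than the period, the deadline is already in the past and the loop runs back to back, which matches the 8 ms cycle seen when a single instance cannot finish within 4 ms.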