09-14-2009 02:45 PM
Have been through the boards, but with no success regarding the specifics of my problem.
I am running a real-time application on a PXI-8108. The host software handles the user interface, etc. Information exchange is handled using Unbound Network Shared Variables (located on the host, though problems occur no matter where the variables are located). Everything deploys fine. Software runs swimmingly. I have backed off the DAQ/communication processes until the RT cores are at <10% busy. However, occasionally while running, I get a popup "Waiting for Real-Time Target xxx to respond..." If this message stays up for long enough, then the target's attempts to read the Shared Variables start erroring out with error -1950679035. As the variables have already been deployed and used prior to this instant, I find this curious. If I filter out the error, eventually the real-time software comes back up and everything progresses as though nothing happened. However, this is a potential deal-breaker for the users, who can be impatient with having to wait a few seconds for the software to respond.
Any idea what is happening? I'm not pumping huge amounts of data over the network, no threads should be starving. Kinda confused.
Dan M.
09-15-2009 11:47 AM
Hi Dan,
It does still sound like threads are starving. How have you architected your RT application? Are you using While Loops or Timed Loops? Network communication runs on lower-priority threads and can be starved, causing network connectivity issues, if the CPU does not have enough time to complete all tasks. Are both cores at <10% when you receive this error? Just in case, you may want to add a wait function in your While Loop or extend the loop period of your Timed Loop.
Error -1950679035 can stand for a couple of different things. What text description do you get with this error code?
Are you using the DSC module?
09-15-2009 01:05 PM
The shared variables are being used much like a queue, directing each loop to perform an action should the host require it. If no action is required, then the loops default to checking their respective bus or DAQ.
Currently, I am running 13 timed loops (whose periods I have extended significantly during debug, with no detrimental effect on my software but no improvement in the issue in question):
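For readers who think better in text than in block diagrams, the pattern described above can be sketched roughly as follows. This is only an illustration in Python, not LabVIEW code: `run_iteration`, `commands`, and `poll_device` are hypothetical names, and a standard `queue.Queue` stands in for the shared variable.

```python
import queue

def run_iteration(commands, poll_device):
    """One loop iteration: act on a pending host command, else do the default poll."""
    try:
        # Non-blocking read, standing in for checking the shared variable.
        command = commands.get_nowait()
    except queue.Empty:
        # No action requested by the host: fall back to checking the bus or DAQ.
        return ("polled", poll_device())
    return ("handled", command)

# Hypothetical stand-ins for one host-to-target variable and one device read.
commands = queue.Queue()
commands.put("SEND 0x01")
results = [run_iteration(commands, lambda: "no data") for _ in range(2)]
print(results)
```

The first iteration handles the queued command and the second falls through to the default poll, which is the behaviour each of the timed loops is meant to have.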
Both cores appear to share the load, usually the aggregate load is under 15%. When the popup appears telling me that the host is waiting for a reply from the target, the PXI controller monitor output still updates the CPU loads, which do not show anything unusual, such as a spike or a freeze in loading. The target loading does spike up to about 30% total after the communication is reestablished, but that appears to be the software attempting to "catch up" with the events that occurred during the "black out".
The error reads like this:
Shared Variable in Main Target Application.vi
This error or warning occurred while reading the following Shared Variable:
\\My Computer\Host-Target Interface\RS-485 Messages to Target [<--insert other variables here]
\\192.168.1.122\Host-Target Interface\RS-485 Messages to Target
Unable to locate variable in the Shared Variable Engine. Deployment of this variable may have failed.
Are there any other drivers that would be starving the Shared Variable Engine? For example, if a VISA or NI-DAQmx driver goes to sleep, could it hold up the SVE? What execution system are they running under?
Thanks for your help!
Dan M.
09-15-2009 01:06 PM
09-16-2009 03:19 PM
Another few possibly interesting things to note.
09-17-2009 07:12 PM
09-18-2009 07:49 AM
I'm not sure that I can post as much as would be needed as it would reveal a good deal of proprietary information.
However, I have not been just sitting on my hands. The problem appears to be on the host side, not on the target side as the host's message seems to indicate. The problem still arose after I had removed all target code except one loop running at a 50 ms period and relatively low priority. So I turned my attention to the host. It appears that a portion of my host code (which performs a peripheral, rarely used function) causes the problems. Disabling that portion of code removed the problem. I tried running the problem loops slower (from 50 ms to 200 ms) and that seemed to help. It appears that these modules were starving the network thread. I may have to play around with priorities and threads to completely mitigate the issue.
Does this make sense to you, that the host would be the culprit?
Dan
09-21-2009 11:56 AM
09-21-2009 02:37 PM
Also, if some VIs are writing to the shared variables faster than the subscribers (readers) are reading from them, you may be getting buffer overflows.
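To make the overflow scenario concrete, here is a minimal sketch in Python (not LabVIEW) of a fixed-size buffer with a writer that runs three times as often as the reader. The `BoundedBuffer` class, its capacity of 4, and the drop-oldest policy are all assumptions chosen for illustration; the actual Shared Variable Engine buffering may behave differently.

```python
from collections import deque

class BoundedBuffer:
    """Fixed-capacity FIFO that counts overflows (assumed policy: drop the oldest)."""

    def __init__(self, capacity):
        self.items = deque()
        self.capacity = capacity
        self.overflows = 0

    def write(self, value):
        if len(self.items) >= self.capacity:
            self.items.popleft()   # buffer full: the oldest unread value is lost
            self.overflows += 1
        self.items.append(value)

    def read(self):
        # Return the oldest unread value, or None if nothing is waiting.
        return self.items.popleft() if self.items else None

buf = BoundedBuffer(capacity=4)
received = []
for tick in range(12):
    buf.write(tick)                 # writer runs on every tick
    if tick % 3 == 2:               # reader runs on every third tick only
        received.append(buf.read())

print("received:", received, "overflows:", buf.overflows)
```

Once the buffer fills, the reader starts skipping values and the overflow count keeps climbing, so the values it sees are no longer consecutive.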
PS- I am totally stealing your strongbad icon.
09-21-2009 03:25 PM
I'm not in front of my code right now. The shared variables are basically queues between the host and target VIs. One variable per direction per code module. Most of them are single-writer variables, and I'm writing to them relatively slowly, about once every 50 ms. One question: the elements in this pseudo-queue are usually on the order of a handful of bytes of string data. I came across the Flush Data.vi. Is it generally considered a good idea to use this VI to post the data as soon as possible? I know it would take some time to reach 8 kB of data for some of these...maybe not 10 ms, but I don't know.
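The trade-off behind that question can be modelled with a small sketch. It is written in Python rather than LabVIEW, `BatchingSender` is a hypothetical class, and it simply assumes that pending writes go out when 8 kB has accumulated, when the oldest pending write is 10 ms old, or when a flush is requested explicitly; it is not a description of the real transport internals. Time is passed in by hand so the behaviour is repeatable.

```python
class BatchingSender:
    """Toy model: send when a size limit or an age limit is hit, or on explicit flush."""

    def __init__(self, max_bytes=8192, max_delay_ms=10):
        self.max_bytes = max_bytes        # assumed size threshold (8 kB)
        self.max_delay_ms = max_delay_ms  # assumed age threshold (10 ms)
        self.pending = []
        self.pending_bytes = 0
        self.oldest_ms = None             # time of the oldest unsent write
        self.sent = []                    # list of (send_time_ms, [messages])

    def write(self, now_ms, message):
        if self.oldest_ms is None:
            self.oldest_ms = now_ms
        self.pending.append(message)
        self.pending_bytes += len(message)
        if self.pending_bytes >= self.max_bytes:
            self.flush(now_ms)            # size limit reached

    def tick(self, now_ms):
        # Called periodically: send if the oldest pending write is too old.
        if self.oldest_ms is not None and now_ms - self.oldest_ms >= self.max_delay_ms:
            self.flush(now_ms)

    def flush(self, now_ms):
        if self.pending:
            self.sent.append((now_ms, self.pending))
            self.pending, self.pending_bytes, self.oldest_ms = [], 0, None

sender = BatchingSender()
sender.write(0, b"cmd-a")   # a handful of bytes, far below the size limit
sender.tick(5)              # too early, nothing goes out
sender.tick(10)             # age limit reached, message goes out at t = 10 ms
sender.write(50, b"cmd-b")
sender.flush(50)            # explicit flush, message goes out immediately
print(sender.sent)
```

Under these assumptions, small messages written every 50 ms never approach the size limit, so an explicit flush would only remove the wait for the age limit on each write.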
Dan
PS--Be my guest! Strongbad rulz!