03-18-2024 02:00 PM
Hello everyone,
We have a system, developed in NI TestStand, where we integrate a few VIs used to communicate with multiple pieces of test equipment via TCP/IP and serial communication. For all of them we use NI-VISA. Everything runs smoothly for a few hours, sometimes days, but all of a sudden all the VISA communication VIs start hanging, and only a complete restart of the app recovers them.
We have tried opening NI MAX to understand what might be the issue but it also freezes and then crashes with the following error message.
We can also conclude that both NI TestStand and the remaining LabVIEW VIs continue executing with no issue.
We have discarded some possible issues:
We believe the issue is in the low-level drivers of some of the equipment we use, such as the FT232. We cannot pinpoint the exact device that is causing the issue because there are multiple devices running simultaneously, all using NI-VISA and at quite a high rate. Since this is a system in production, we are also limited in the amount of logging and experiments we can do.
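One low-overhead way to narrow down which device is involved is to record every I/O call on entry and clear the record on exit, so that whatever is still recorded when the hang starts points at the culprit. The sketch below only illustrates the pattern in Python; the real system is LabVIEW, where the same bookkeeping would sit around the VISA calls. The names `tracked` and `stuck_calls` and the resource string are hypothetical.

```python
import threading
import time
from contextlib import contextmanager

_in_flight = {}            # thread id -> (resource, operation, start time)
_lock = threading.Lock()


@contextmanager
def tracked(resource, operation):
    """Mark an I/O call as in flight for the duration of the with-block."""
    key = threading.get_ident()
    with _lock:
        _in_flight[key] = (resource, operation, time.monotonic())
    try:
        yield
    finally:
        # Reached only if the call returns or raises; a hung call stays listed.
        with _lock:
            _in_flight.pop(key, None)


def stuck_calls(older_than_s):
    """Return (resource, operation, age in seconds) for calls still in flight."""
    now = time.monotonic()
    with _lock:
        return [(res, op, now - t0)
                for res, op, t0 in _in_flight.values()
                if now - t0 >= older_than_s]
```

A monitoring thread could call `stuck_calls(10.0)` every few seconds and write any result to a log file. Because only one dictionary entry is touched per call, the extra load on a production system should be small.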
We are trying to identify the root cause of the issue, but would like to better understand what might cause NI-VISA to freeze/hang/crash, how we can obtain information to pinpoint the culprit, and whether it is possible for us to restart NI-VISA without restarting the entire app, as a workaround.
Thank you very much.
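Regarding the workaround asked about above, one general pattern is to run each instrument transaction in a short-lived child process with a timeout, so that a hung call can be killed without taking the main application down. This is a minimal Python sketch of that pattern only, not an NI-VISA feature. The two command lines are stand-ins for a hypothetical helper script that would open the session, do one transaction and print the reply. Note the assumption that killing the process actually releases the hang; a call stuck inside a kernel-mode driver may not be recoverable this way.

```python
import subprocess
import sys


def run_isolated(argv, timeout_s):
    """Run one transaction in a child process.

    Returns (status, text) where status is "ok", "error" or "timeout".
    """
    try:
        done = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising TimeoutExpired.
        return "timeout", ""
    if done.returncode != 0:
        return "error", done.stderr.strip()
    return "ok", done.stdout.strip()


# Stand-ins for a hypothetical per-transaction helper script.
healthy = [sys.executable, "-c", "print('PONG')"]
hung = [sys.executable, "-c", "import time; time.sleep(60)"]

if __name__ == "__main__":
    print(run_isolated(healthy, timeout_s=10.0))
    print(run_isolated(hung, timeout_s=1.0))
```

The cost is a process start per transaction, which may be too slow at high I/O rates. A long-lived worker process per device that is restarted on timeout would be a variation on the same idea.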
05-14-2024 04:14 PM
Carlos,
I have an error occurring that seems basically identical to yours. Did you figure anything out? If not:
- What version of LabVIEW are you running?
- What version of TestStand are you running?
- What version of NI-VISA are you running?
- What version of NI-Serial are you running?
05-15-2024 04:01 AM
Hello Matty,
We still have no explanation for the issue. Right now, the test sequence is running on 6 production systems. All of them crash eventually, but the time between crashes ranges from 12 hours to 5 days.
We have already tried changing some of the package versions between systems to check whether one of them could be the root cause, but it made no difference.
We do not have NI-Serial installed.
TestStand 2017 SP1 (32-bit) - 17.1.0.130
LabVIEW 2018 SP1 f4 (32-bit) - 18.0.1
NI-VISA - 18.5 (visa32.dll)
TestStand 2017 SP1 (32-bit) - 17.1.0.130
LabVIEW 2018 SP1 f4 (32-bit) - 18.0.1
NI-VISA - 22.5 (visa32.dll)
05-15-2024 07:25 AM - edited 05-15-2024 07:58 AM
Carlos,
Interesting. We are running two identical systems and I have yet to see it occur on the second system. We are running:
LabVIEW 2022 Q3 (32-bit)
TestStand 2022 Q4 (32-bit)
NI-VISA 22.5
Our second/duplicate system is pretty new and only released in the past week. I wonder if it has something to do with Windows. What operating system are you running? Are the PCs themselves identical? What are your COM port numbers?
05-15-2024 08:09 AM
We are running Windows 10 on all systems. All of them have the same type of hardware and were set up from an image of the first one.
We suspect the issue is hardware related: a DUT is connected whenever some other device is performing an action, which causes a faulty ground of sorts and breaks a low-level driver. This is only a suspicion.
For reference, we use multiple Waveshare USB to RS232 converters, one per DUT. The DUT sequences run asynchronously.
We also use multiple Segger JLink programmers, also one per DUT.
There are some other devices, such as a power supply, a DMM and so on, but we don't believe the issue is related to those.
The COM port numbers are fairly ordinary, e.g. COM8, COM9, COM10.
We have not been able to replicate the issue in a reliable way.
05-15-2024 08:45 AM - edited 05-15-2024 08:45 AM
Our setup seems to be simpler, and the DUTs are in an environmental chamber so there really isn't a way that they might become unplugged. I guess a grounding issue is possible for us, but there is nothing obvious right now. We actually use Sealevel serial servers set up with virtual COM ports. We run the DUTs at COM200-COM231, so I was wondering if the higher-numbered COM ports were causing a problem. I think we are in an acceptable range, though.
We are running Windows 11.
05-15-2024 08:54 AM
I can confidently tell you that it should not be the higher number COM Ports because we had a different project where the COM Ports would go into the thousands (a new one per DUT) and we did not have an issue.
Our suspicion for the grounding issue came from an occasion where we saw the device manager "blink" when a DUT was connected and the app immediately started to hang and eventually crash. During normal operation, this "blinking" did not occur.
At the time we bypassed some of the USB mass interconnect interfaces by using a direct cable, and that seemed to help considerably. Alas, the issue still occurs from time to time.
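The Device Manager "blink" described above could be caught in a log by polling the list of COM ports and recording every change with a timestamp, which can then be lined up against the time of a hang. The following is a minimal Python sketch of that idea. `list_ports` is a hypothetical callable supplied by the caller; on a real system it might wrap something like pyserial's `serial.tools.list_ports.comports()`, which is an assumption and not something used in our setup.

```python
def diff_ports(before, after):
    """Return (added, removed) port names between two snapshots."""
    before, after = set(before), set(after)
    return sorted(after - before), sorted(before - after)


def watch(list_ports, polls, on_change, sleep=lambda: None):
    """Poll list_ports() `polls` times and report every change.

    list_ports: callable returning an iterable of port names (hypothetical).
    on_change:  callable taking (added, removed), e.g. a timestamped logger.
    sleep:      callable that waits between polls; a no-op by default.
    """
    previous = list_ports()
    for _ in range(polls):
        sleep()
        current = list_ports()
        added, removed = diff_ports(previous, current)
        if added or removed:
            on_change(added, removed)
        previous = current
```

A re-enumeration shorter than the polling interval would be missed, so this only catches "blinks" that last at least one poll period.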
05-28-2024 08:27 AM
I wanted to follow up with you. After seeing that only one system was experiencing this "driver" issue, I decided to get a new PC from our IT department and set it up fresh myself. After ~2 weeks of running with the new PC, we have not seen this issue. I wonder if you should try setting up a brand new PC and updating to one of the newer versions of LabVIEW and TestStand.
05-28-2024 09:51 AM
Hey Matty,
Thank you for the feedback.
Since we now have 7 PCs running the same software, I am not completely sure how I can do the same test. We have run this software on multiple systems in our own facilities with no problem, and the issue only appears when the system is under heavier load at our customers' facilities, with their IT tools installed. As such, I don't rule out that it is related to some configuration, but it is nearly impossible for us to test that.
We have a few ideas that we will try in the following weeks. If I get any relevant information, I'll post it here.