Multifunction DAQ

NI USB driver loses device over long period (1 week) - error -50405

Message 11 of 45

Hi Res,

 

Thanks for all the information, and I hope you're well.

 

I was wondering, are you a UK Support customer? It may help to contact your local office, and I will still help to support (and discuss) the issue with them. If it is the UK, then maybe we could have a brief chat about this some time. The office number is 01635 523545. If you prefer, give me your details and I can contact you.

 

Tomorrow (as it's the evening in the UK) I will get in touch with my colleagues in the States regarding R&D's views on your issue and the subject of long-term USB usage tests. I will also spend some time looking through your information in more detail to see if I have any more suggestions.

 

The USB Root Hub Properties is a promising option...

 

I will update you either by direct contact (if your local office is the UK one) or on the forums when I have news.

 

Thanks for your work! 

 

Kind Regards
James Hillman
Applications Engineer 2008 to 2009 National Instruments UK & Ireland
Loughborough University UK - 2006 to 2011
Remember Kudos those who help! 😉
Message 12 of 45

Hi Res,

 

Good afternoon, and I hope you're well.

 

To confirm, I have requested answers from and a discussion with PSE/R&D on this issue. I will keep you updated.

 

I have found out you're a UK customer. If you could please submit a support request, I would prefer supporting you via email/phone. You can do this by going to ni.com/ask and then posting the reference number here, so I can pick it up and get your contact information.

 

I look forward to hearing from you soon.

Kind Regards
James Hillman
Message 13 of 45

Hi Res.

 

I hope you're well today. I still haven't heard back from you regarding a service request - please respond.

 

I have had an interesting discussion with R&D and PSE in the States and have some information I would like to give to you.

 

Firstly, there is a known issue with USB that can cause -50405, but it occurs on device insertion and has its root cause in the current and power supplied by different USB chipsets. We are working to mitigate this problem by recovering from the error, so that you at least do not have to unplug and replug the device.

 

However, there is no known issue where a long-term acquisition simply fails at random with -50405. We didn't specifically do long-term testing with the 9211A, but we recently tested a 6211 (which uses the same communication method and chips) continuously for 2 months for a different customer issue and saw no -50405 errors. MAX alone will not cause this error, but the programming of its test panels is not optimized for long-term testing, so I would rely on something more straightforward.

 

A question I have never asked, Res: what environment will your final program be in? It has been suggested to me on several occasions that the program will be more robust if the code runs outside of MAX. Do you think you'll be able to try this - running a test not in MAX?

 

The team see two possible culprits for your situation:

1. The system you are using has an issue with the chipset or some other part of the system (possibly power on the USB host)

2. There is enough environmental noise to cause a problem.

 

1# If it is a system issue, then you would need to use a different computer/chipset or try putting an externally powered USB hub in line.

Have you tried another computer?

2# Environment can also cause a problem. Noise from machinery and other electronics can be enough to cause problems. The team told me that a developer there was able to cause a reboot on a USB 6211 by laying his iPhone on top of it and calling it.

It might be worth trying to cause a disturbance by running any electronics that could be causing a problem near the device, to see if that reproduces the error. What are your thoughts on this?

 

I look forward to hearing from you soon.

James. 

Kind Regards
James Hillman
Message 14 of 45

Hi James,

 

Thanks for the response.

 

My apologies for the delay in replying; service requests appear to need a support contract, which we apparently do not have.

 

> I was wondering, are you a UK Support customer?

 

No. We are in the UK, but do not have a support contract.

 

> I have found out you're a UK customer. If you could please submit a support request, I would prefer supporting you via email/phone. You can do this by going to ni.com/ask and then posting the reference number here, so I can pick it up and get your contact information.

 

I have submitted a support request by e-mail although we do not have a contract.

 

> I hope you're well today. I still haven't heard back from you regarding a service request - please respond.

 

I have raised a service request simply referencing this forum thread, as requested.

 

> Firstly, there is a known issue with USB that can cause -50405, but it occurs on device insertion and has its root cause in the current and power supplied by different USB chipsets. We are working to mitigate this problem by recovering from the error, so that you at least do not have to unplug and replug the device.

 

Device insertion is certainly not part of our long-term test.

 

> MAX alone will not cause this error, but the programming of its test panels is not optimized for long-term testing, so I would rely on something more straightforward.

 

What do you mean by "straightforward"?

 

> A question I have never asked, Res: what environment will your final program be in?

 

The final program is already written in Borland Builder C++ (v3). The code has been used long-term since 2003 using an EDR ISA thermocouple. The code has been converted to poll an NI USB-9211A instead.

 

> It has been suggested to me on several occasions that the program will be more robust if the code runs outside of MAX. Do you think you'll be able to try this - running a test not in MAX?

 

Our code, polling the NI USB-9211A once every 20s, also suffers from an eventual -50405 failure.

 

To eliminate our code as the source of the problem - a traditional blame route taken on the forum - we tested the NI software separately and it suffered from the same problem. To avoid unnecessary finger-pointing, the problem was presented in this thread using NI software/hardware alone.

 

> Have you tried another computer?

 

No. This test can be tried, if you wish, after the current test has completed.

 

> It might be worth trying to cause a disturbance by running any electronics that could be causing a problem near the device, to see if that reproduces the error. What are your thoughts on this?

 

I'm afraid the equipment in the immediate vicinity is immovable due to permanent installation and weight. Hence we can only consider moving the PC + USB-9211A to a different environment (location). This would have to be sequenced around the 'another computer' test, since we only have one NI thermocouple.

 

Thanks for the effort so far,

 

RES.

Message 15 of 45

Hi Res,

 

Thanks for the update. I can confirm I have received your submitted request. You don't need a support contract in this case, because it will make this investigation easier if I have a direct contact. You should receive an email from NI UK Support shortly.

 

NI Community - I will continue support directly with Res - but will update this post upon conclusion of the support.

 

Kind Regards,
James.

 

 

Kind Regards
James Hillman
Message 16 of 45

Hi All,

 

The outcome of some work in the UK and with R&D is that a test has been running successfully for more than 171 hours using LabVIEW and an NI USB-9211A. No error was seen.

 

The conclusion was that the environment would be a major issue. It has been seen, for example, that if a mobile phone is placed next to a USB device it can force an error.

 

One consideration could be using a USB cable with a ferrite ring - the CompactDAQ, for example, has this.

 

Res is planning to run his setup in a 'quieter' environment. 

 

Also, it has been seen that a double reset can allow the device to recover from such errors.
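
For readers who want to try the double reset mentioned above, the idea can be sketched roughly as follows. This is a minimal illustration and not an NI-confirmed procedure: it assumes "double reset" means two consecutive device resets with a short pause in between. The names `double_reset`, `reset_device` and `pause_s` are hypothetical; in a real program `reset_device` would wrap the driver's own reset call (for example DAQmxResetDevice in the C API) and raise an exception on error.

```python
import time
from typing import Callable


def double_reset(
    reset_device: Callable[[], None],
    pause_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call reset_device twice and return how many of the calls succeeded.

    reset_device is a caller-supplied (hypothetical) wrapper that raises
    an exception when the driver reports an error.
    """
    successes = 0
    for attempt in (1, 2):
        try:
            reset_device()
            successes += 1
        except Exception as exc:  # the wrapper decides what it raises
            print(f"reset attempt {attempt} failed: {exc}")
        if attempt == 1:
            sleep(pause_s)  # pause length is a guess, not an NI figure
    return successes
```

A return value of 0 would mean that both resets failed, so the double reset did not help on that occasion.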

Kind Regards
James Hillman
Message 17 of 45

I am experiencing a similar problem with my USB data acquisition system -- after running smoothly for some period (normally only several hours), I get an error in one of my tasks and from that point on I am not able to reset my device or restart any data acquisition tasks until I unplug/power off the DAQ device (USB-6229) and then plug it back in.

 

Typically when the failure occurs the read or write call returns -50405.  My program normally tries to respond to a DAQ failure (i.e. if I disconnect the USB cable) by stopping & clearing any active tasks, resetting the device via DAQmxResetDevice, recreating the I/O tasks, and then starting the I/O tasks.  After this failure, though, subsequent attempts to restore data acquisition fail with an error code of -200324.
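
The recovery sequence described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual code from this post: `RecoverySteps`, `recover`, `max_attempts` and `delay_s` are hypothetical names, and each step is a caller-supplied function that would wrap the corresponding driver call (for example DAQmxResetDevice for the reset step) and raise an exception on a driver error such as -50405 or -200324.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class RecoverySteps:
    """Caller-supplied driver wrappers; each one raises on a driver error."""
    stop_and_clear: Callable[[], None]  # stop and clear any active tasks
    reset_device: Callable[[], None]    # e.g. a wrapper around DAQmxResetDevice
    create_tasks: Callable[[], None]    # recreate the I/O tasks
    start_tasks: Callable[[], None]     # start the I/O tasks


def recover(
    steps: RecoverySteps,
    max_attempts: int = 3,
    delay_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run the four steps in order, retrying the whole sequence on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            try:
                steps.stop_and_clear()
            except Exception as exc:
                # Best effort: tasks may already be invalid after the failure.
                print(f"attempt {attempt}: cleanup failed: {exc}")
            steps.reset_device()
            steps.create_tasks()
            steps.start_tasks()
            return True
        except Exception as exc:
            print(f"attempt {attempt}: recovery failed: {exc}")
            if attempt < max_attempts:
                sleep(delay_s)
    return False
```

A `False` result corresponds to the state described here, where software recovery keeps failing and only a physical disconnect and reconnect brings the device back.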

 

I have several of these USB devices connected to various machines out on our production floor -- it is an industrial environment so I'm sure there is at least some electrical noise.  Each device has an external power supply powered off of a UPS, so I would expect that the power is fairly clean.  The machine on which each of these devices is installed does move back and forth on a set of rails, so there is probably some vibration as well.  However, we have done our best to secure all cabling so we expect that nothing should be coming loose -- certainly after a failure occurs when we recheck the cabling everything seems to be tight.

 

From my point of view I see 2 problems: first is obviously the initial failure in the data acquisition, and second the fact that the system requires a manual disconnect & reset of power in order to get it going again. 

 

In trying to track down the cause I have been able to reproduce the second of these two back in our lab.  If I pull the USB cable half out of the device and then wiggle it back and forth so that the device comes online/offline very quickly a number of times, eventually it will get into a state where MAX indicates that the device is installed but the serial number is 0x0 and my program is unable to connect to the device.  At that point even if the USB cable is fully reseated in the socket the device will not come back online.  From here the only way to get things going again is to disconnect & reconnect.

 

During the period when the device will not reinitialize, it looks to me like the NI Device Loader service (nipalsm.exe) is hung.  It is not CPU bound, but I am unable to stop the service or kill the process w/ task manager.  As soon as I disconnect the device then the service starts responding to service requests again -- that is, if I had tried to stop the service during the period when the device would not respond, then as soon as I disconnect the cable the service will finally stop.  So I'm guessing that something in this process is getting deadlocked or hung up.

 

Obviously I'm more interested in preventing the initial failure, but I'm hoping that the above information about the state that the system is in after the failure might be helpful in identifying the cause.

 

Thanks,
AJ

 

Message 18 of 45

Hi Aj,

 

Thanks for your post, and I hope you're well today.

 

Your research certainly makes for interesting reading.

 

The cause of this behaviour is certainly environmental - as the R&D results show. I don't think the USB connection is the most robust system in these types of environment - the connection can't be screwed down. To me, your work seems to suggest the root cause is these vibrations - the connector is clearly not designed to be connected/disconnected regularly. I dare say a PCI-type card with a terminal block would be more robust. That said, I have passed the information on the NI service which hangs to the R&D team I am dealing with, and I can update this post in due course if you wish.

 

A few suggestions I would have are:

1. Are you using a USB cable with a ferrite ring?

2. Have you tried performing a double DAQmx reset - this seems to be a regular solver?

3. Have you considered using Devcon? - a tool which acts a bit like a manual unplug/plug in.

 

How Do I Force Windows to Remove and Redetect a USB Device?

http://digital.ni.com/public.nsf/allkb/1D120A90884C25AF862573A700602459?OpenDocument
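
For reference, the remove-and-redetect approach can be scripted. The sketch below is an illustration built on assumptions rather than the procedure from the linked article: it uses DevCon's `remove` and `rescan` commands, and the function names, the `devcon.exe` path and the hardware ID pattern are hypothetical placeholders. The real hardware ID should be looked up first (for example with `devcon hwids`), and DevCon generally needs to be run from an administrator prompt.

```python
import subprocess
from typing import Callable, List


def devcon_redetect_commands(
    hardware_id_pattern: str, devcon_path: str = "devcon.exe"
) -> List[List[str]]:
    """Build the two DevCon invocations: remove the device, then rescan."""
    return [
        [devcon_path, "remove", hardware_id_pattern],
        [devcon_path, "rescan"],
    ]


def redetect_device(
    hardware_id_pattern: str,
    devcon_path: str = "devcon.exe",
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Run both commands; return True only if each exits with code 0."""
    for command in devcon_redetect_commands(hardware_id_pattern, devcon_path):
        result = run(command, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"{' '.join(command)} failed: {result.stdout}{result.stderr}")
            return False
    return True
```

A call might look like `redetect_device(r"USB\VID_1234&PID_5678")`, where the ID shown is a placeholder and not the ID of any NI device.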

 

There is a lot of advice on the web site about resolving nipalsm.exe errors, but not hangs. There is one service per active driver. As the DAQmx driver is still in use, even if it is erroring, this explains why you can't stop the service. I wonder if there might be an internal error which still appears in the Windows log; you can find it as follows:

Select Start»Run, enter drwtsn32, and press the <Enter> key. The Dr. Watson window appears. This window indicates the location of the error log on your computer.

 

Please let me know your thoughts,

and thanks for your information!

 

Have a nice day. 

Kind Regards
James Hillman
Message 19 of 45

Hello everyone who is still reading.

 

This is a clarification on the last two posts by James Hillman, from NI. The posts can be misleading, suggesting we have had 'success' in circumventing the problem, and attributing the problem to environmental noise. The former is not true, and the latter is not yet proven. RES does not wish to be party to publishing misleading information.

 

Note that each test we perform takes up to 1 week so James' posts have pre-empted our test results.

 

Specifically:

 

> Res is planning to run his setup in a 'quieter' environment. 

 

True. This is the current test. James has pre-empted the result.

 

 

> Also, it has been seen that a double reset can allow the device to recover from such errors.

 
Not by us. We will (eventually) try this test, but it will have to follow the current test.
 
> 2. Have you tried performing a double DAQmx reset - this seems to be a regular solver?
 
Again, we cannot confirm this as a solution for us, or for the subject of this forum post.
RES cannot justify the term "regular", either - that is James' assertion.
 
RES will also post results on this forum. NI has closed the support call on the assumption that it is caused by environmental noise. There is no proof - yet - of this.
 
Thank you! And James - thanks for all the (immense) effort you put in, but please watch your language.

 

Message 20 of 45