08-23-2024 05:35 PM - edited 08-23-2024 05:44 PM
I'm using a CP2102N USB-to-serial device to talk to an embedded device my company makes. I typically need to communicate at 921600 baud, at quite a high data rate, for several seconds at a time. This works well. Recently I needed to extend this read time to several minutes, and I've started getting VISA Overrun errors.
Since this is a binary device, I'm using Bytes at Port and reading the whole thing about 10 times a second. This generally works fine, but with longer sessions I'm getting the VISA Overrun errors.
My buffer is set at 400,000 bytes (about 8 seconds of data), and my tests show that I get the error immediately after reading only ~4000 bytes at port. So, it doesn't seem like the buffer is filling up, but it could be something filling up on the device itself.
Is there a way to view low-level error codes from the actual SiLabs DLLs that VISA is calling? I'd like to figure out exactly WHICH buffer is filling up: the on-chip FIFO, the VISA buffer, etc.
(Just to get it out of the way, this is a binary application, so no termchars, and no hardware handshaking either. Data is of variable length and comes in a giant stream, so I'm just grabbing bytes-at-port data until I time out the read. I'm doing no processing on the data; just concatenating it with an autoindexing tunnel on the output to be processed later. The data does have length headers, but it's coming in too fast to read one byte, then read X more bytes, etc. I have to read giant chunks at a time to try to keep buffers empty, since my data rate is around 50 kB/s. This is one of the few times I believe using Bytes at Port is the correct way to use it. I have ample experience writing serial interfaces to things using termchars and vastly prefer that method.)
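For what it's worth, the closest thing I've found to low-level visibility outside of VISA is the Win32 COMM API itself: ClearCommError distinguishes a hardware overrun from a driver-buffer overflow. A minimal C sketch of what I mean ("COM3" is just a placeholder for wherever the CP2102N enumerates):

```c
/* Minimal sketch (not LabVIEW): ask the Windows COMM layer which receive
   buffer overflowed. CE_OVERRUN = the UART FIFO itself overran before the
   driver serviced it; CE_RXOVER = the driver's input buffer overflowed. */
#include <windows.h>
#include <stdio.h>

int main(void)
{
    HANDLE h = CreateFileA("\\\\.\\COM3", GENERIC_READ | GENERIC_WRITE,
                           0, NULL, OPEN_EXISTING, 0, NULL);
    if (h == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "cannot open port (error %lu)\n", GetLastError());
        return 1;
    }

    DWORD errs;
    COMSTAT stat;
    if (ClearCommError(h, &errs, &stat)) {
        if (errs & CE_OVERRUN)
            puts("CE_OVERRUN: hardware FIFO overran before the driver emptied it");
        if (errs & CE_RXOVER)
            puts("CE_RXOVER: the driver's input buffer overflowed");
        printf("bytes waiting in driver input buffer: %lu\n", stat.cbInQue);
    }
    CloseHandle(h);
    return 0;
}
```

The catch is that serial ports are exclusive-open, so something like this would have to replace the VISA session rather than run alongside it.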
08-23-2024 08:31 PM
@BertMcMahan wrote: I'm doing no processing on the data; just concatenating it with an autoindexing tunnel on the output to be processed later.
That could be a problem as the growing array will take more and more of your time. Consider using a Producer-Consumer approach, using a queue to pass the data to the processing loop. The processing loop can do whatever it needs to do as the data is coming in and it will give more time to reading the port.
Another thing to consider is to just always attempt to read the same number of bytes each read. This may allow LabVIEW to reuse memory buffers more easily.
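If it helps to see the shape of the pattern outside LabVIEW, here is a rough C sketch (read_port() stands in for VISA Read; the chunk size and queue depth are arbitrary): the reader thread does nothing but move buffers into a bounded queue, and all processing happens on the consumer side.

```c
/* Rough producer/consumer sketch (a C stand-in for the LabVIEW queue
   pattern). read_port() is a placeholder for VISA Read; CHUNK and SLOTS
   are arbitrary. */
#include <windows.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK 4096
#define SLOTS 64

static struct { BYTE *buf; DWORD len; } q[SLOTS];
static int head, tail, count;
static CRITICAL_SECTION lock;
static CONDITION_VARIABLE not_empty, not_full;

static DWORD read_port(BYTE *dst, DWORD max)
{
    memset(dst, 0, max);        /* stub: pretend a full chunk arrived */
    Sleep(100);                 /* ~10 reads per second, as in the thread */
    return max;
}

static DWORD WINAPI producer(LPVOID arg)
{
    (void)arg;
    for (;;) {
        BYTE *buf = malloc(CHUNK);
        DWORD len = read_port(buf, CHUNK);   /* the ONLY work in this loop */
        EnterCriticalSection(&lock);
        while (count == SLOTS)               /* back-pressure if consumer lags */
            SleepConditionVariableCS(&not_full, &lock, INFINITE);
        q[tail].buf = buf; q[tail].len = len;
        tail = (tail + 1) % SLOTS; count++;
        LeaveCriticalSection(&lock);
        WakeConditionVariable(&not_empty);
    }
}

static DWORD WINAPI consumer(LPVOID arg)
{
    (void)arg;
    for (;;) {
        EnterCriticalSection(&lock);
        while (count == 0)
            SleepConditionVariableCS(&not_empty, &lock, INFINITE);
        BYTE *buf = q[head].buf;
        DWORD len = q[head].len;
        head = (head + 1) % SLOTS; count--;
        LeaveCriticalSection(&lock);
        WakeConditionVariable(&not_full);
        printf("processing %lu bytes\n", len); /* parse/save off the critical path */
        free(buf);
    }
}

int main(void)
{
    InitializeCriticalSection(&lock);
    InitializeConditionVariable(&not_empty);
    InitializeConditionVariable(&not_full);
    HANDLE t[2] = { CreateThread(NULL, 0, producer, NULL, 0, NULL),
                    CreateThread(NULL, 0, consumer, NULL, 0, NULL) };
    WaitForMultipleObjects(2, t, TRUE, INFINITE);
    return 0;
}
```

The point is the same as with a LabVIEW queue: the producer hands off a pointer, so no data is copied on the time-critical path.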
08-26-2024 07:48 AM - edited 08-26-2024 07:49 AM
@crossrulz wrote:
@BertMcMahan wrote: I'm doing no processing on the data; just concatenating it with an autoindexing tunnel on the output to be processed later.
Another thing to consider is to just always attempt to read the same number of bytes each read. This may allow LabVIEW to reuse memory buffers more easily.
Definitely echo that. If you need to read that much data, make as few driver calls as possible. That means reading less often but with larger buffers, and doing as little status checking as possible, including checking how much data might be available. If you read, let's say, 400,000 bytes per read (personally, I would keep each read to half the size of the buffer you set up) and you start to get a backlog, you will build that backlog even faster if you also do a Bytes at Port before every read.
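In raw Win32 terms (a sketch, not VISA; "COM3", the buffer sizes, and the 5-second timeout are placeholders), "few, large driver calls" looks like this:

```c
/* Sketch: ask the driver for a big input buffer once, then issue large
   fixed-size blocking reads and nothing else in the loop. Baud-rate /
   DCB setup is omitted for brevity. */
#include <windows.h>

int main(void)
{
    HANDLE h = CreateFileA("\\\\.\\COM3", GENERIC_READ, 0, NULL,
                           OPEN_EXISTING, 0, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return 1;

    SetupComm(h, 400000, 4096);         /* request a 400 kB driver input buffer */

    COMMTIMEOUTS to = {0};
    to.ReadTotalTimeoutConstant = 5000; /* return early only after 5 s without a full chunk */
    SetCommTimeouts(h, &to);

    static BYTE chunk[200000];          /* half the driver buffer, per the advice above */
    DWORD got;
    while (ReadFile(h, chunk, sizeof chunk, &got, NULL) && got > 0) {
        /* hand the chunk off for later processing; no Bytes-at-Port polling here */
    }
    CloseHandle(h);
    return 0;
}
```

VISA's serial buffer size property and VISA Read map onto SetupComm and ReadFile here, at least conceptually.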
08-26-2024 11:52 AM
@crossrulz wrote:
That could be a problem as the growing array will take more and more of your time. Consider using a Producer-Consumer approach, using a queue to pass the data to the processing loop. The processing loop can do whatever it needs to do as the data is coming in and it will give more time to reading the port.
I considered a queue, but assumed that LabVIEW's behavior for allocating memory to an array that keeps growing would be very similar to allocating data to a queue. I'm also only growing the array 10 times a second, which doesn't seem like a lot.
Also, I'm pretty sure there was an article on how LabVIEW increases memory allocation for a growing array, but I can't find it now.
All that said: I'm 95% certain I'm not actually lagging the reads. My buffer is enough for 8 seconds' worth of data (IIRC), and I get the "overflow" error immediately after reading 4000 bytes, which is all of the data at the port. Each cycle gives me 4000 bytes at port, give or take, which I read and stash into the concatenating tunnel. I find it hard to imagine that LabVIEW needs 8 seconds to resize the array.
Also, the error happens within the same second as the Bytes at Port read. In other words, I get "Bytes at port = 4000", read 4000 bytes, and get the Overflow error as part of that same Read function, all within one second (confirmed by NI IO Trace and error wire probes).
I have been around the block a couple times with serial, and while I'm certainly no Jedi with it, I really don't think this is your standard "just use a producer-consumer" problem. I think it's more likely that the data isn't making it from the USB-to-serial converter into RAM in time, which is why I'm trying to get more verbose error codes.
@rolfk wrote: Definitely echo that. If you need to read that much data, make as few driver calls as possible. That means reading less often but with larger buffers, and doing as little status checking as possible, including checking how much data might be available. If you read, let's say, 400,000 bytes per read (personally, I would keep each read to half the size of the buffer you set up) and you start to get a backlog, you will build that backlog even faster if you also do a Bytes at Port before every read.
I will try much larger reads. Hopefully that'll work, but it REALLY doesn't seem like I'm actually filling my VISA buffers. I will be VERY surprised if the code goes from "reads 4,000 bytes every 100 ms for 70-80 seconds" to "can't read 400,000 bytes into an array within 8 seconds".
*Side note: my numbers are rounded here. It's somewhere in the 4,000-5,000 bytes per read range, and I set the buffer to 400,000, but it might be more than that, according to the Help; I think it's about 8 seconds' worth of data. Might be more like 10, but the numbers are at least close.
08-26-2024 12:00 PM
Hi
Could you execute the first read immediately, without checking Bytes at Port? Maybe the system is filling up immediately after opening the port.
08-26-2024 12:13 PM
That was very wrong intuitive thinking. Queues are optimized: the buffer that is created in the VISA Read is simply passed to the queue, not copied. Append Array or Concatenate Strings has to allocate a new buffer each time and copy the contents of both buffers into it; after a while of concatenating buffer after buffer, your system is mostly busy copying between buffers over and over again instead of handling the serial port!
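For a sense of scale, assuming the worst case of a full reallocate-and-copy on every append (round numbers from this thread: roughly 5,000 bytes per read, 10 reads per second): append number $i$ copies everything received so far, so after $n$ reads the total bytes moved are

$$5000 \sum_{i=1}^{n} i \;=\; 5000\,\frac{n(n+1)}{2},$$

which at $n = 1000$ (100 seconds of capture) is already about 2.5 GB of copying for only 5 MB of actual data.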
08-26-2024 01:48 PM
@Albert.Geven wrote:
Hi
Could you execute the first read immediately, without checking Bytes at Port?
Maybe the system is filling up immediately after opening the port.
No, the error happens after a variable amount of time: sometimes less than 10 seconds, sometimes more than 100 seconds into the stream.
@rolfk wrote: ...after a while of concatenating buffer after buffer, your system is mostly busy copying between buffers over and over again instead of handling the serial port!
I can't find the article now, but I was under the impression that LabVIEW was at least decently smart about this: the first time you create an array, it gives you (maybe) 1000 elements' worth of preallocated buffer space. If you fill that up, it doesn't move to a 1001-element buffer, but to one of, say, 10,000 elements. After that fills, 100,000, and so on. I could be wrong in my remembering, as clearly I can't find the original article 🙄
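For what it's worth, that is the standard amortized argument: if capacity grows geometrically (say, doubling), then building up $n$ elements copies at most

$$n + \frac{n}{2} + \frac{n}{4} + \cdots \;<\; 2n$$

elements in total, i.e. amortized $O(1)$ per append. So if LabVIEW really does grow buffers that way, ten appends per second shouldn't come close to dominating.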
Doing some testing:
Thanks to everyone who has replied so far. I tried several of the ideas mentioned above. Unfortunately, none helped. So far, I've tried:
1- Switching to a constant 4096 bytes per read, no Bytes at Port: Still getting overrun errors
2- 4096 bytes per read, now each read gets sent to a queue of type "U8 array" (so an array of queues): Still getting overrun errors
3- 4096 bytes per read, now enqueuing to a queue of type U8 (so I send the result of Read to another For loop to enqueue each element individually, since AFAIK you can't batch enqueue a whole array at once): Still getting overrun errors
Errors are intermittent; sometimes they show up in less than 20 seconds, other times it takes nearly 200 seconds before the error appears.
All that said, here is why I don't think these buffers are a problem:
Check out this IO trace capture on my original method, reading Bytes at Port then calling VISA Read with that value:
This particular run was configured to Wait for Next ms Multiple every 200 ms, and is truncated to show only the last captures (the run was about 105 seconds long).
"VISA Get Attribute" is "Get Bytes at Port" and the first value after the hex code is the bytes at port. So, it's 9984, 9920, 9920, 9920, 9984, 9984, 9920, 9984, 9920. Fairly consistent, but not bang on exact, which is to be expected.
The part that concerns me is the last 3 lines: on line 451, BaP is 9920. On line 452, VISA Read, requesting 9920 bytes, returns code 0xBFFF006C (status description: "An overrun error occurred during transfer. A character was not read from the hardware before the next character arrived.").
Look at the timing: line 451 happens at 13:24:35.6805. At that point, there are 9920 bytes in the buffer. At that exact same timestamp, VISA Read returns an Overrun error. The buffer is 400,000 bytes; it's impossible to get from 9,920 to 400,000 that quickly.
Also, take a look at the beginning of the IO trace:
BaP returned 9920, 9984, 9984, 9856, 9984, 9984 for the first 6 calls. Over a minute later and nearly 5 MB later, the BaP reading was actually slightly lower. There's no indication that any of my threads are timing out.
Now, I certainly could be running out of time in a system interrupt or something somewhere... but I have no idea how to troubleshoot that. Surely the system can handle 10 API calls per second.
So... the short version is that I suspect this is happening before the VISA buffer. And, like I mentioned in the topic, I'd really like to see the actual error code that is getting returned. All I can see is code -1073807252. A more detailed error code would hopefully tell me: is this the FIFO on the chip going bad? Is it a USB delay? Is the 400,000-byte buffer filling up? Etc.
Thanks again to everyone taking a look at this. I'm fairly stumped right now. If this could still be related to my buffer handling I'd love to know, because my interpretation of the IO Trace data leads me to think otherwise.
08-26-2024 06:44 PM - edited 08-26-2024 06:45 PM
You're conflating things here. The VISA buffer of 400,000 bytes that you allocate has nothing to do with the hardware buffer overflow. The hardware buffer is in, yes, hardware, in the form of a FIFO, and cannot be enlarged by you. Not even VISA has any direct control over that. The serial port is handled by the Windows serenum driver, which is called by the Windows COMM subsystem, and VISA then accesses the Windows COMM API. The error you see is the serenum (or hardware-specific) device driver not being able to service the serial port interrupt before that FIFO is full.
In terms of data, you seem to receive about 9,950 bytes per 200 ms, so 49,700 bytes per second, or around 500,000 baud. FIFO buffers are relatively small, sometimes as little as 16 bytes and very seldom more than 256, unless you have special high-speed serial port controllers. This means the device driver needs to be able to react within 0.2 ms to at most 2 ms. That's tough under Windows! You may need a more capable serial port interface card, with its own device drivers, to be able to operate at this speed continuously.
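To put numbers on that: with roughly 10 bits per byte on the wire at 921600 baud, one byte takes

$$\frac{10}{921600} \approx 10.9\ \mu\text{s},$$

so a 16-byte FIFO fills in about $16 \times 10.9\ \mu\text{s} \approx 0.17$ ms and a 256-byte FIFO in about 2.8 ms. That window is the entire time the driver has to service the receive interrupt before a character is lost.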
08-26-2024 07:28 PM
Looks like the datasheet has more info. The manufacturer recommends using their DLL instead of the virtual COM port driver for special applications.
08-27-2024 10:43 AM - edited 08-27-2024 11:00 AM
@rolfk wrote:
You're conflating things here. The VISA buffer of 400,000 bytes that you allocate has nothing to do with the hardware buffer overflow. The hardware buffer is in, yes, hardware, in the form of a FIFO, and cannot be enlarged by you. Not even VISA has any direct control over that. The serial port is handled by the Windows serenum driver, which is called by the Windows COMM subsystem, and VISA then accesses the Windows COMM API. The error you see is the serenum (or hardware-specific) device driver not being able to service the serial port interrupt before that FIFO is full.
In terms of data, you seem to receive about 9,950 bytes per 200 ms, so 49,700 bytes per second, or around 500,000 baud. FIFO buffers are relatively small, sometimes as little as 16 bytes and very seldom more than 256, unless you have special high-speed serial port controllers. This means the device driver needs to be able to react within 0.2 ms to at most 2 ms. That's tough under Windows! You may need a more capable serial port interface card, with its own device drivers, to be able to operate at this speed continuously.
Edit: It appears I neglected to attach my actual error code to my first post, which was dumb of me. Sorry about that 😞
Perhaps I'm not explaining myself well. I tried to mention in my original post that I was trying to figure out WHICH buffer overflowed. I really appreciate your posts, and I know you know WAY more about this than I do, so I'm really trying to learn. The bolded part of your message is what I'd originally assumed the error was; however, the whole thread until now has been about ways to make sure the VISA buffer doesn't overflow. I'm not certain, but I would guess that keeping the VISA buffer emptied has very little, if anything, to do with the hardware buffer.
I fully understand that there are multiple buffers, interfaces, etc. in play when going from the actual serial layer to USB to DLL to Windows to VISA to RAM to LabVIEW etc.
Me posting the info above is my attempt to address the first few comments regarding array resizing, producer/consumer, etc.
Going back to my original question: is there any way to view the ACTUAL error? The LabVIEW error code -1073807252 just says "VISA Overrun" with no further details. I have no way of knowing (due to my limited understanding of the underlying implementation) where in the giant chain of buffers this error is getting thrown.
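As far as I can tell, VISA itself won't go any deeper. A quick C check (assuming the NI-VISA C headers) of what that status code expands to gives the same one-liner:

```c
/* Sketch: ask NI-VISA to describe -1073807252 (0xBFFF006C, which is
   VI_ERROR_ASRL_OVERRUN). It returns the same generic overrun text;
   the CE_* detail from the Windows COMM layer doesn't appear to be
   exposed through VISA at all. */
#include <visa.h>
#include <stdio.h>

int main(void)
{
    ViSession rm;
    ViChar desc[256];
    viOpenDefaultRM(&rm);
    viStatusDesc(rm, (ViStatus)0xBFFF006C, desc);
    printf("%s\n", desc);
    viClose(rm);
    return 0;
}
```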
LabVIEW Help (for example) again just says things like "reduce the data" or "switch to producer consumer", not "make sure your hardware is fast enough".
Anyway, I don't think this is a VISA problem but a hardware one, which is why I posted this in the first place. I have several candidate culprits: is it the USB port getting overloaded? Do I need to switch to a chip with a larger buffer? There seems to be a 16-byte FIFO somewhere in the chain; is that one getting overloaded? Is it the buffer on the chip itself?
Basically I'm just trying to find where the actual error is. It sounds like you're telling me it's the actual on-chip buffer overflowing, which is understandable, and isn't something I can fix directly. I'll definitely investigate things further.
ALL THAT SAID.... I may have found two partial problems. The first is that I had a pending Windows update, which on Windows 11 tends to make basically everything worse. Second, out of desperation I threw an oscilloscope on the Tx/Rx pins to see if the chip was simply going bad. Adding the oscilloscope ground seemed to help, as I was able to start getting 10-minute streams of data with no dropped packets or overflow errors. Perhaps I was getting extra bits in there somewhere from electrical issues or a damaged chip? Who knows.
I'd still like to be able to get more error info, as what you said ("The error you see [...] is the hardware specific device driver not being able to service the serial port") is INCREDIBLY valuable and something I wasn't able to find anywhere else, so I really appreciate it. If that's listed in the LabVIEW Help somewhere I'd really love a link, as I couldn't find it 😞