LabVIEW


Slow TCP duplex handling on remote connections

Another issue: 

 

When I run both the client and server on the sbRIO, I get about 2 Hz.

 

I guess the front panel updates are limited by Nagle's algorithm? 

 

Roger

 

Message 41 of 63

Hi Roger - I doubt that front panel updates are affected by Nagle's algorithm; at least, I've never seen that. I think you may be misunderstanding Nagle's algorithm. More likely it's the data between the two sbRIOs that is only running at 2 Hz. Why did you pick 1024 bytes as the packet size? The maximum segment size (MSS) for TCP on most platforms is larger than that, so you aren't guaranteed to fill a packet. Unfortunately, the maximum segment size can vary by platform (see this article, linked from the Wikipedia page on Nagle's algorithm, for an example), so it would be difficult to get consistent results with your approach.
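
If you want to see what MSS a given connection actually negotiated, a sketch along these lines should work on any Berkeley-style stack (TCP_MAXSEG is a standard socket option; whether the sbRIO's VxWorks stack exposes it is a guess on my part):

#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Query the negotiated maximum segment size on a connected TCP socket.
   Returns the MSS in bytes, or -1 on error. */
int query_mss(int sockfd)
{
    int mss = 0;
    socklen_t len = sizeof(mss);
    if (getsockopt(sockfd, IPPROTO_TCP, TCP_MAXSEG, (void *) &mss, &len) < 0)
        return -1;
    return mss;
}

If that reports 1460, for example, then a 1024-byte pad never fills a segment, which would explain the inconsistent results.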

 

Can you provide a bit more detail about what you're trying to do?  I really think the right solution for you is UDP.  As far as I can tell, out-of-order delivery isn't possible in your scheme since you always wait for a reply before sending the next piece of data.  You could implement some sort of retry/acknowledgement if necessary.
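
If you go the UDP route, the retry/acknowledgement loop can be quite small. Here's a rough C sketch of what I mean (the three attempts and the 100 ms timeout are arbitrary numbers I picked for illustration):

#include <sys/socket.h>
#include <sys/time.h>

/* Send a datagram and wait for the peer's reply, retrying on timeout.
   Returns the reply length, or -1 if every attempt timed out. */
int udp_request(int sockfd, const struct sockaddr *peer, socklen_t peerlen,
                const char *msg, int msglen, char *reply, int replymax)
{
    struct timeval tv = { 0, 100000 };   /* 100 ms receive timeout */
    int attempt, n;

    setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, (void *) &tv, sizeof(tv));
    for (attempt = 0; attempt < 3; attempt++)
    {
        sendto(sockfd, msg, msglen, 0, peer, peerlen);
        n = recvfrom(sockfd, reply, replymax, 0, NULL, NULL);
        if (n >= 0)
            return n;                    /* reply doubles as the acknowledgement */
        /* timed out: fall through and resend */
    }
    return -1;
}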

 

You're more likely to get an NI engineer to look at this if you unmark your post as the solution so that this thread will remain unsolved.  I believe there's an option to do that in the options menu under your post.

Message 42 of 63

Nathan, thanks for the answer!

 

I'll reply in order:

 

I have one client (Windows) and one sbRIO. When both the host AND the client run on the sbRIO, I get 2 Hz; when the client is on the Windows machine, I see around 30 Hz loop rate.

 

The 1024-byte pad was found by trial and error. If I go to 2048, the speed is the same but network traffic increases, and probably the CPU load on the sbRIO as well; I haven't verified this.

 

I agree that if I move between platforms I will have problems; that is why I want a permanent fix from NI for these TCP issues.

 

I will use UDP for asynchronous callbacks, but I don't have that in place yet.

 

The application is an RPC client/server where I can issue commands on the sbRIO from the Windows client, or any other client that supports TCP. A typical scenario looks like this:

 

The sbRIO waits for incoming connections and spawns a worker (a VI Server call to Loop.vi, which is hidden; the method is Work.vi) onto a new TCP connection pair, then goes back to listening for connections; see the image below:

 

Connection.jpg 

 

In Work.vi the client gets handled, depending on the client object type:

 

Work.jpg 

 

Finally, where the wheels hit the ground, we have a Codec encode/decode pair that handles the incoming stream - in this case a socket stream.

 

HandleClient.jpg 

 

All client processing takes place in a timed loop that looks like this:

 

Loop.jpg

 

 

The client code looks like this; it just echoes a string back 1000 times. The client also uses the encode/decode pair, but in reverse order.

 

Client.jpg 
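
In C terms the loop boils down to this request/reply pattern (a sketch, not the actual LabVIEW code), which is exactly the small-write, wait-for-reply pattern that Nagle's algorithm combined with delayed ACK punishes:

#include <sys/socket.h>

/* Sketch of the benchmark: send a short message and block on its echo, 1000 times.
   Every iteration pays a full round trip before the next send can happen. */
void echo_benchmark(int sockfd, const char *msg, int msglen)
{
    char reply[1024];
    int i;
    for (i = 0; i < 1000; i++)
    {
        send(sockfd, msg, msglen, 0);
        recv(sockfd, reply, sizeof(reply), 0);
    }
}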

 

It might not be the best "architecture" out there, but I think it is quite clean in structure and code?

 

With the TCP_NODELAY fix it works great on Windows, and with the 1024-byte pad on the sbRIO I get decent performance.
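
For reference, the pad workaround amounts to this, written out in C (a simplified sketch of what my LabVIEW code does; the fixed 1024-byte size came from the trial and error above):

#include <string.h>
#include <sys/socket.h>

#define PAD_SIZE 1024   /* found by trial and error on the sbRIO */

/* Copy the message into a fixed-size buffer and send the whole buffer,
   so the stack always has a big chunk to transmit instead of a tiny segment. */
int send_padded(int sockfd, const char *msg, int msglen)
{
    char buf[PAD_SIZE];
    memset(buf, 0, sizeof(buf));
    memcpy(buf, msg, msglen < PAD_SIZE ? msglen : PAD_SIZE);
    return send(sockfd, buf, sizeof(buf), 0);
}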

 

Roger

 

Message Edited by RogerI on 04-05-2010 03:31 AM
Message 43 of 63

I finally set up the test code you posted at the beginning of this thread, and I can't duplicate your results. It looks like I get about 5 Hz, not 2, between two Windows XP machines on the same network. I added some timing code and found that the loop timing alternates between 300 ms and almost instantaneous on successive iterations. When I change it to use a single bidirectional connection, it's about 300 ms on every iteration. I don't have time to experiment more with it right now, but I'll try adding the TCP_NODELAY option if I get a chance.

 

I don't think that test code accurately depicts what you're doing, though, because it assumes a symmetric situation, unlike your client-server environment. Also, looking at the screenshots you posted, I don't understand why you don't want to use a single connection (although I admit that my limited testing doesn't support the idea that it's faster). It would certainly be a simpler architecture because your client wouldn't need to open a port and wait for a connection. You're already taking a single wire (labeled TCPClient in) and getting both an input and an output stream from it; is there any reason you can't embed a single TCP connection refnum in that "TCPClient in" wire? For that matter, why not just have a single Stream class that can do both input and output?

 

As for why you're seeing different performance between two sbRIOs versus between an sbRIO and Windows, does that happen regardless of the pad size? The size of a packet isn't a property of a single system; it's a property of a connection. Take a look at this link from Microsoft about Path MTU discovery. You'll note that the two ends negotiate a maximum segment size (MSS). It's quite possible that when connecting to the Windows machine, Windows negotiates a smaller MSS such that 1024 bytes is enough to fill a packet, but that when the two sbRIOs talk to each other, they agree on a larger MSS. Also, MSS is generally not a power of 2 - for example, 1460 bytes is a common size.

 

In your Java/Python/whatever code that "screams", are you finding that you need to play similar games with TCP_NODELAY to get the performance you want? If not, can you share that code so we can look for differences between it and the LabVIEW implementation?

 

You might find the following link enlightening as well: Design issues - sending small data segments over TCP with Winsock

Message 44 of 63

Nathan, 

 

Java: Socket client = mServerSocket.accept(); client.setTcpNoDelay(true);

C: int on = 1; setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, (char *) &on, sizeof(on));

Python: s = socket.socket(); s.connect(addr); s.setsockopt(socket.SOL_TCP, socket.TCP_NODELAY, 1)

 

I could go on here, but I guess I have proved my point? LabVIEW IS the limitation here, since this problem does not exist in most other languages!

 

As you posted yourself, the TCP_NODELAY flag is set by default on the other LV RT platforms, but not on the VxWorks-based sbRIO. How come? Isn't NI following what they state in their own bulletins? And why is the flag disabled on this RT platform in the first place? Why disable it and then lecture the users about implementing things in a "better" way?

  

Good programming is all about structure and less about implementation details, and those details should _definitely_ not be a limiting factor, much less a reason to redesign, as you suggest.

 

Roger 

 

Message 45 of 63
Since I do not have an sbRIO available I cannot really debug this myself, but is there a chance you could tell me whether TCP Get Raw Net Object.vi actually executes on your VxWorks target? If it does, the fix to make your Windows VI work on VxWorks targets would most likely be fairly trivial. Otherwise I'm afraid there is no way to get it to work.
Message Edited by rolfk on 04-06-2010 08:41 PM
Rolf Kalbermatter
My Blog
Message 46 of 63

> Otherwise I'm afraid there is no way to get it to work.

 

I bet many people said that some 40 years ago before the USA put the first man on the moon? 😉 

 

Not to sound too arrogant here, but how about opening up the LV VxWorks TCP source code and adding something like this for every socket that is opened:

 

int on = 1;
if (setsockopt(fdFCC, IPPROTO_TCP, TCP_NODELAY, (void *) &on, sizeof(on)) < 0)
    perror("setsockopt");

 

Since we are keen on schooling each other here, why not have a look at the VxWorks Stanford online manual:

 

http://www.slac.stanford.edu/exp/glast/flight/sw/vxdocs/vxworks/netguide/c-newNetService.html#84741

 

Do a quick search for the magic phrase "TCP_NODELAY".

 

Here is a screenshot of the LV code running on the sbRIO:

 

VxWorks.png 

 

And here is the one from Windows:

 

Windows.jpg 

 

 

Negative socket references can't be a good thing. Probably the conversion between the BSD sockets and the Windows ones isn't as simple as this VI makes us believe?

 

Roger

 

Message 47 of 63

RogerI wrote:

> Otherwise I'm afraid there is no way to get it to work.

 

> I bet many people said that some 40 years ago before the USA put the first man on the moon? 😉

You completely misunderstood me here. I'm just a normal LabVIEW user like you. I have access to neither the LabVIEW source code nor the VxWorks source code, so my possibilities are about as limited as yours. However, I do know a bit about using external code in LabVIEW 😄

 

I posted a solution for the TCP_NODELAY issue on Windows to Info-LabVIEW (a text-based mailing list for LabVIEW) more than 12 years ago, and it was just about the same as what the NI post referenced in this thread shows - only long before NI posted their solution. 😉

> Negative socket references can't be a good thing. Probably the conversion between the BSD sockets and the Windows ones isn't as simple as this VI makes us believe?

 

Roger

 


I wouldn't worry too much about the sign here. I have no idea how VxWorks implements a socket, so internally it could be a large unsigned number that simply shows as negative when displayed as signed.

 

Now what you can try is to modify the VI you got from an earlier post. The VxWorks kernel does export a setsockopt() function that can be accessed and quite likely works just as in any other Berkeley-based socket implementation. However, VxWorks obviously does not use the same module names as Windows. So you may want to try changing the library name to "vxworks.*" or maybe simply "vxworks" in the Call Library Node for the setsockopt() call and see if that works. It's a bit of a guess here, but worth a try anyhow. Also, you will certainly want to change the calling convention to cdecl, since VxWorks most likely doesn't support stdcall at all.
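
For reference, the parameters in the Call Library Node have to line up with the standard Berkeley prototype, something like this (this is my reading of the VxWorks docs, not something I have tried on a target):

/* the Berkeley-style prototype VxWorks exports, according to the docs */
int setsockopt(int s, int level, int optname, char *optval, int optlen);

/* so the Call Library Node would effectively perform this call,
   where rawSocket is the value returned by TCP Get Raw Net Object.vi: */
int on = 1;
setsockopt(rawSocket, IPPROTO_TCP, TCP_NODELAY, (char *) &on, sizeof(on));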

 

If you use the Conditional Disable structure to put those two Call Library Nodes into the same VI, you only need to figure out a symbol (likely OS==VxWorks and OS==Windows if you run this from within a project) to distinguish the two cases. If you run a recent version of LabVIEW you can also change the Call Library Node to specify the library name from the diagram, which avoids the stupid search dialog for the missing shared library file in the disabled frames.

Message Edited by rolfk on 04-07-2010 12:40 AM
Rolf Kalbermatter
My Blog
Message 48 of 63

Rolf, thanks for the suggestions! Now we're talking! 🙂

 

I'll try out some of your ideas tonight and post a follow-up with more info!

 

Roger

 

Message 49 of 63

Rolf,

 

I tried your suggestion of replacing the wsock32.dll library name with vxWorks, but I just get deployment errors. There are, however, several *.out files in the system folder. Finding the right library file to interface with seems to be the hard part here? Trial and error could take some time, I suppose - not only do I have to get the call right, it has to be against the right library as well!

 

I wonder where to start with this? One obvious option would of course be to hack up some VxWorks networking code, build my own binary, and try interfacing with it?

Unfortunately it seems that the Wind River evaluation software only runs on an emulator?
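
The module itself ought to be tiny. Something like this is what I have in mind (completely untested; the header names are my guess from the Stanford VxWorks docs):

/* tcpNoDelay.c - hypothetical helper, compiled into a VxWorks .out module */
#include <vxWorks.h>
#include <sockLib.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Disable Nagle's algorithm on an already-open socket descriptor.
   Returns OK (0) on success or ERROR (-1), like other VxWorks calls. */
int TcpSetNoDelay(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, (char *) &on, sizeof(on));
}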

 

Roger

 

Message 50 of 63