04-04-2010 10:24 AM
Another issue:
When I run both the client and server on the sbrio, I get about 2Hz.
I guess the front panel updates are limited by Nagle's algorithm?
Roger
04-04-2010 09:02 PM
Hi Roger - I doubt that front panel updates are affected by Nagle's algorithm; at least, I've never seen that. I think you may be misunderstanding Nagle's algorithm. More likely it's the data between the 2 sbrios that is only running at 2Hz. Why did you pick 1024 bytes as the packet size? The maximum segment size (MSS) for TCP on most platforms is larger than that, so you aren't guaranteeing that you'll get a full packet. Unfortunately, the maximum segment size can vary by platform (see this article, linked from the Wikipedia page on Nagle's algorithm, for an example) so it would be difficult to get consistent results with your approach.
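For reference, "disabling Nagle's algorithm" is a one-line socket option in most environments. A minimal Python sketch of the knob this whole thread revolves around (host and port are illustrative, not from the post):

```python
import socket

def open_nodelay_connection(host, port):
    # Nagle's algorithm batches small writes until the previous segment is
    # acknowledged; TCP_NODELAY turns that batching off so small segments
    # are sent immediately.
    s = socket.create_connection((host, port))
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return s
```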
Can you provide a bit more detail about what you're trying to do? I really think the right solution for you is UDP. As far as I can tell, out-of-order delivery isn't possible in your scheme since you always wait for a reply before sending the next piece of data. You could implement some sort of retry/acknowledgement if necessary.
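To make the UDP suggestion concrete, a retry/acknowledgement layer is only a few lines. A minimal Python sketch, where the address, timeout, and retry count are illustrative values, not anything from this thread:

```python
import socket

def send_with_retry(payload, addr, retries=3, timeout=0.5):
    # Send a datagram and wait for a reply; resend if no reply arrives
    # within the timeout. Returns the reply, or None if all retries fail.
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.settimeout(timeout)
    try:
        for _ in range(retries):
            s.sendto(payload, addr)
            try:
                reply, _ = s.recvfrom(4096)  # acknowledgement from the peer
                return reply
            except socket.timeout:
                continue  # no ack in time; resend
        return None
    finally:
        s.close()
```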
You're more likely to get an NI engineer to look at this if you unmark your post as the solution so that this thread will remain unsolved. I believe there's an option to do that in the options menu under your post.
04-05-2010 03:27 AM - edited 04-05-2010 03:31 AM
Nathan, thanks for the answer!
I'll reply in order:
I have one client (Windows) and one sbRIO. When I have the host AND client running on the sbRIO, I get 2Hz; when the client is on a Windows machine, I see around 30Hz loop times.
The 1024 bytes was found by trial & error. If I go to 2048, the speed is the same, but network traffic increases, and probably the CPU load on the sbRIO as well; I haven't verified that.
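For illustration, the padding workaround amounts to something like this minimal Python sketch. The 1024-byte size is the value from the post; in a real protocol the receiver would also need a length prefix (or similar) to strip the padding back off:

```python
PAD_SIZE = 1024  # value from the post; the "right" size depends on the MSS

def pad_message(msg: bytes) -> bytes:
    # Pad short messages out to a fixed size so the write fills a TCP
    # segment and is sent immediately instead of waiting under Nagle's
    # algorithm. Messages already at or over the pad size pass through.
    if len(msg) >= PAD_SIZE:
        return msg
    return msg.ljust(PAD_SIZE, b"\x00")
```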
I agree that I will have problems if I move between platforms; that is why I want a permanent NI fix for these TCP issues.
I will use UDP for asynchronous callbacks, but I don't have that in place yet.
The application is a RPC client/server where I can issue commands on the sbrio from the windows client, or any other client that supports tcp. A typical scenario looks like this:
The sbRIO waits for incoming connections and spawns a worker thread (a VI Server call to Loop.vi, which is hidden; the method name is Work.vi) onto a new TCP connection pair, then goes back to listening for connections, see the image below:
In Work.vi, the client gets handled depending on the client Object type:
Finally, where the wheels hit the ground, we have a Codec encode/decode pair that handles the incoming stream, in this case a socket stream.
All client processing takes place in a timed loop, that looks like this:
The client code looks like this; it just echoes a string back 1000 times. The client also uses the encode/decode pair, but in reverse order.
It might not be the best "architecture" out there, but I think it is quite clean in structure and code?
With the TCP_NODELAY fix it works great on Windows, and with the 1024-byte pad on the sbRIO I get decent performance.
Roger
04-05-2010 03:46 PM
I finally set up the test code you posted at the beginning of this thread, and I can't duplicate your results. It looks like I get about 5Hz, not 2, between 2 Windows XP machines on the same network. I added some timing code and found that every loop iteration the timing alternates between 300ms and almost instantaneous. When I change it to using a single bi-directional connection it's about 300ms every loop iteration. I don't have time to experiment more with it right now, but I'll try adding the TCP_NODELAY option if I get a chance.
I don't think that test code accurately depicts what you're doing, though, because it assumes a symmetric situation, unlike your client-server environment. Also, looking at the screenshots you posted, I don't understand why you don't want to use a single connection (although I admit that my limited testing doesn't support the idea that it's faster). It would certainly be a simpler architecture because your client wouldn't need to open a port and wait for a connection. You're already taking a single wire (labeled TCPClient in) and getting both an input and output stream from it; any reason you can't embed a single TCP connection refnum in that "TCPClient in" wire? For that matter, why not just have a single Stream class that can do both input and output?
As for why you're seeing different performance between two sbRIOs versus between an sbRIO and Windows: does that happen regardless of the pad size? The size of a packet isn't a property of a single system; it's a property of a connection. Take a look at this link from Microsoft about Path MTU discovery. You'll note that the two ends negotiate a maximum segment size (MSS). It's quite possible that when connecting to the Windows machine, Windows negotiates a smaller MSS such that 1024 bytes is enough to fill a packet, but that when the two sbRIOs talk to each other, they agree on a larger MSS. Also, the MSS is generally not a power of 2; for example, 1460 bytes is a common size.
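On platforms that expose the Berkeley TCP_MAXSEG socket option (Linux does; standard Windows sockets do not), you can read back the MSS a connection actually negotiated rather than guessing at a pad size. A minimal Python sketch, with the host and port as placeholders:

```python
import socket

def negotiated_mss(host, port):
    # TCP_MAXSEG reports the maximum segment size in effect for this
    # connection. Availability and exact semantics vary by platform.
    s = socket.create_connection((host, port))
    try:
        return s.getsockopt(socket.IPPROTO_TCP, socket.TCP_MAXSEG)
    finally:
        s.close()
```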
In your Java/Python/whatever code that "screams", are you finding you need to play similar games with TCP_NODELAY to get the performance you want? If not, can you share that code so we can look for differences between it and the LabVIEW implementation?
You might find the following link enlightening as well: Design issues - sending small data segments over TCP with Winsock
04-06-2010 05:27 AM
Nathan,
Java:
Socket client = mServerSocket.accept();
client.setTcpNoDelay(true);
C: setsockopt(sockfd, IPPROTO_TCP, TCP_NODELAY, (char *) &tcpnodelay_flag, sizeof(int));
Python: Socket s -> s.connect -> s.setsockopt(socket.SOL_TCP, socket.TCP_NODELAY, 1)
I could go on, but I think I've proved my point: LabVIEW IS the limitation here, since this hurdle doesn't exist in most other languages!
As you posted yourself, the TCP_NODELAY flag is set by default on all other LV RT platforms, except for VxWorks on the sbRIO. How come? Isn't NI following what they state in their own bulletins? And why disable this flag in the first place, and then bitch at the users to implement things in a "better" way?
Good programming is all about structure and less about implementation details, and those details should _definitely_ not be a limiting factor, much less a reason to redesign, as you suggest.
Roger
04-06-2010 01:40 PM - edited 04-06-2010 01:41 PM
04-06-2010 03:09 PM
> Otherwise I'm afraid there is no way to get it to work.
I bet many people said that some 40 years ago before the USA put the first man on the moon? 😉
Not to sound too arrogant here, but how about opening up the LV VxWorks TCP source code and adding something like this for every socket that is opened:
int on = 1;
if (setsockopt(fdFCC, IPPROTO_TCP, TCP_NODELAY, (void *) &on, sizeof(on)) < 0)
    perror("setsockopt");
Since we are keen on schooling each other here, why not have a look at the VxWorks Stanford online manual:
http://www.slac.stanford.edu/exp/glast/flight/sw/vxdocs/vxworks/netguide/c-newNetService.html#84741
Do a quick search for the magic phrase "TCP_NODELAY".
Here are the screenshot of the LV code running on the SBRIO:
And here from the Windows:
Negative socket references can't be a good thing. Probably the conversion between BSD sockets and the Windows ones isn't as simple as this VI makes us believe?
Roger
04-06-2010 05:34 PM - edited 04-06-2010 05:40 PM
RogerI wrote:
> Otherwise I'm afraid there is no way to get it to work.
> I bet many people said that some 40 years ago before the USA put the first man on the moon? 😉
You completely misunderstood me here. I'm just a normal LabVIEW user like you. I do not have access to the LabVIEW source code nor the VxWorks source code, so my options are about as limited as yours. However, I do know a bit about using external code in LabVIEW.
More than 12 years ago I posted a solution for the TCP_NODELAY issue on Windows to Info-LabVIEW (a text-based mailing list for LabVIEW), and it was much the same as what the NI post referred to in this thread shows, only it was long before NI posted their solution.
> Negative Socket references, cant be a good thing. Probably the conversion between the BSD sockets and the Windows ones aren't as simple as this VI makes us believe?
> Roger
I wouldn't worry too much about the sign here. I have no idea how VxWorks implements a socket, so internally it could be a large unsigned number that simply shows as negative when displayed as signed.
Now what you can try is to modify the VI you got from an earlier post. The VxWorks kernel does export a setsockopt() function that can be accessed and quite likely works just like in any other Berkeley-based socket. However, VxWorks obviously does not use the same module names as Windows. So you may want to try changing the library name to "vxworks.*", or maybe simply "vxworks", in the Call Library Node for the setsockopt() call and see if that works. It's a bit of a guess, but worth a try. Also, you will certainly want to change the calling convention to C, since VxWorks most likely doesn't support stdcall at all.
If you use the Conditional Disable structure to put those two Call Library Nodes into the same VI, you only have to figure out the right symbols (likely OS==VxWorks and OS==Windows if you run this from within a project) to distinguish the two conditions. If you run a recent version of LabVIEW, you can also change the Call Library Node to specify the library name from the diagram, which avoids the stupid search dialog for the missing shared library file in the disabled frames.
04-07-2010 03:02 AM
Rolf, thanks for the suggestions! Now we are talking here!
I'll try out some of your ideas tonight. I'll post a followup with some more info!
Roger
04-07-2010 03:23 PM
Rolf,
I tried your suggestion, replacing the library wsock32.dll with vxWorks, but I just get deployment errors. There are, however, several *.out files in the system folder. Finding the right library file to interface with seems to be the hard part here; trial & error could take some time, I suppose. Not only do I have to get the call right, it has to be against the right library as well!
I wonder where to start with this. One option would of course be the obvious solution: hack together some VxWorks networking code, build my own binary, and try interfacing with it?
Unfortunately, it seems that the Wind River evaluation software only runs on an emulator?
Roger