Is it bad practice to use a network connection for inter-application communication? - networking

I have two applications that need to communicate with each other running on the same system.
I've been using the rather strange practice of opening a TCP communication channel between the two applications.
Is this practice frowned upon in any way? Is there an alternative (apart from stdio, which isn't possible for other reasons)?
Is there a limit on the maximum transfer rate, or any extra latency involved (compared to piped stdio)?
I'm using the local (127.0.0.1) address for both server and client. Is the connection guaranteed to stay within the local machine itself, or could it be relayed off the nearest router before coming back? Does the network card influence the properties of the connection at all?

I worked on a system a while ago in Java, and I was looking at the same question. I don't have much experience with it, but I ended up using TCP connections for communication, for the following advantages:
1) The ability to move the applications onto different servers in the future if needed.
2) The applications are totally independent; one application can crash without affecting the other. If the surviving application then tries to connect, it gets an error that you can handle.
I've seen this used in many other types of applications, so I went with it, and it is working fine. But you have to be careful to handle network and I/O errors, and to close all open sockets once you're finished with a connection. I was only closing the socket from the client end, so I ended up with many connections stuck in CLOSE_WAIT on the server.
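To illustrate, here's a minimal sketch (in Go; the port number and echo behaviour are just placeholders) of the pattern that avoids those CLOSE_WAIT leaks: the server closes its side of every accepted connection when it is done with it.

```go
// Minimal sketch: a loopback TCP server that closes its end of each
// connection, avoiding sockets stuck in CLOSE_WAIT on the server side.
package main

import (
	"io"
	"log"
	"net"
)

func main() {
	ln, err := net.Listen("tcp", "127.0.0.1:9000") // port 9000 is an arbitrary choice
	if err != nil {
		log.Fatal(err)
	}
	defer ln.Close()

	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Println("accept:", err)
			continue
		}
		go func(c net.Conn) {
			// Closing the server side as well is what prevents CLOSE_WAIT buildup.
			defer c.Close()
			if _, err := io.Copy(c, c); err != nil { // echo until the client disconnects
				log.Println("io:", err)
			}
		}(conn)
	}
}
```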
Regards,

It's pretty common to use TCP for inter-application communication.
Performance should not be an issue.
See: Sockets On Same Machine For Windows and Linux.
You should consider security: what happens if another user on the machine connects to the port, and how will the applications authenticate each other?
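One lightweight approach (a sketch only; the shared-secret scheme and the names here are invented, not a standard) is to bind to loopback and require a secret known only to the two applications as the first line of each connection:

```go
// Hypothetical sketch: bind to loopback only and require a shared secret
// before serving any requests on a new connection.
package main

import (
	"bufio"
	"log"
	"net"
	"strings"
)

// Placeholder secret; in practice, exchange it via a file only both apps can read.
const sharedSecret = "example-secret"

func main() {
	ln, err := net.Listen("tcp", "127.0.0.1:9000") // loopback only: no external visibility
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			continue
		}
		go func(c net.Conn) {
			defer c.Close()
			line, err := bufio.NewReader(c).ReadString('\n')
			if err != nil || strings.TrimSpace(line) != sharedSecret {
				return // unauthenticated local user: drop the connection
			}
			c.Write([]byte("authenticated\n"))
			// ... handle the session ...
		}(conn)
	}
}
```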

Related

Initial connecting of two computers through a server, but allowing them to connect directly afterwards

Two computers in separate NAT networks want to connect with each other.
No port forwarding can be set up.
We can connect them through an intermediate server; however, we don't want the entire communication to go through this server after the initial connection.
Is there any way that the intermediate server can perform the initial connection and then let them continue their communication directly somehow?
Everything I know about computer networks says 'no'; however, I thought I'd give this question a try here.
It can be done via NAT traversal techniques. It isn't simple, but many applications do it, e.g. Skype, where channeling all communication through a server would be inefficient.
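As a rough sketch of the idea in Go (the peer's endpoint is a placeholder that the rendezvous server would supply; robust implementations use STUN/ICE to handle the many NAT variants), UDP hole punching boils down to both peers sending datagrams to each other's public address so their NATs open matching mappings:

```go
// Rough sketch of UDP hole punching. The peer address is a placeholder
// that a rendezvous server would supply; real NATs vary widely.
package main

import (
	"fmt"
	"log"
	"net"
	"time"
)

func main() {
	// Bind one local UDP socket; the same socket talks to the server and the
	// peer, so the NAT mapping created for it gets reused.
	conn, err := net.ListenUDP("udp", &net.UDPAddr{Port: 0})
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Suppose the rendezvous server told us the peer's public endpoint:
	peer := &net.UDPAddr{IP: net.ParseIP("203.0.113.7"), Port: 40000} // placeholder

	// Both sides send; the first packets may be dropped, but they punch the holes.
	for i := 0; i < 5; i++ {
		conn.WriteToUDP([]byte("punch"), peer)
		time.Sleep(200 * time.Millisecond)
	}

	buf := make([]byte, 1500)
	conn.SetReadDeadline(time.Now().Add(5 * time.Second))
	n, from, err := conn.ReadFromUDP(buf)
	if err != nil {
		log.Fatal("no hole punched:", err)
	}
	fmt.Printf("got %q from %v; direct path established\n", buf[:n], from)
}
```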

HTTP through a domain socket

I'm writing a bit of desktop software which has two components. Component B queries component A. Creating a web service seems like an ideal way to do IPC in principle: the data model fits, there are ready-made client and server libraries, a well-known way to encode and decode parameters, etc.
But setting up an HTTP server on a network socket doesn't seem right for a local application. For example, what port do I choose? I don't really want people to be able to scan and talk to the app from outside, etc.
So I was thinking that I might be able to do HTTP over a domain socket. Does that make any sense? Is there any precedent for it? Is there an equivalent protocol that I could use for IPC which has the same properties as HTTP (requests for specified resources (URIs), encoded parameters, responses)?
Looking for C libraries (and possibly Go and ObjC for bonus points).
Binding to the loopback interface only (127.0.0.1) solves your "external visibility" problem: only processes on the local machine will be able to connect.
It does not solve your port allocation problem, though: the port number you choose might already be taken by the time your app starts. Then your server can't bind, and your client ends up connecting to whatever other process is bound to your port.
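HTTP over a domain socket does make sense and sidesteps both problems. A minimal sketch in Go (the socket path and handler are placeholders; Go's standard HTTP server accepts any net.Listener, including a Unix domain socket):

```go
// Minimal sketch: plain HTTP served over a Unix domain socket in Go.
// The socket path and the /status handler are placeholders.
package main

import (
	"fmt"
	"log"
	"net"
	"net/http"
	"os"
)

func main() {
	const sockPath = "/tmp/myapp.sock" // placeholder path
	os.Remove(sockPath)                // clean up a stale socket from a previous run

	ln, err := net.Listen("unix", sockPath)
	if err != nil {
		log.Fatal(err)
	}

	http.HandleFunc("/status", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok")
	})
	// http.Serve works with any net.Listener; no TCP port is involved at all.
	log.Fatal(http.Serve(ln, nil))
}
```

On the client side, an http.Transport with a custom DialContext that dials "unix" instead of "tcp" lets the ordinary HTTP client machinery work unchanged, and filesystem permissions on the socket path give you access control for free.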
Old and less hip, but CORBA implementations tend to have already figured out the problems you haven't thought of yet.

Creating a networking application that can work over internet connections

I have a somewhat basic understanding of network programming (and networking concepts in general) from taking a networking course in university a few years ago.
I remember being able to create a simple chat application, where the chat server is used as a central directory aware of which clients are currently online, but once a client knows another client it wants to chat with, the actual messages between them don't need to go through the server. I remember we could only test this over a bunch of LAN machines.
This C# chat program also has several comments mentioning that the program does not work over the internet: http://www.geekpedia.com/tutorial239_Csharp-Chat-Part-1---Building-the-Chat-Client.html
My question is: why do these applications not work over the internet when "commercial" chat applications can? Surely there is some way to make my computer accessible to the outside network even if its IP address is not valid outside the ISP's network.
I see no problem with the linked-to code. The server doesn't bind to a specific local address, which means it will listen for connections on all IP addresses on the computer. There is, however, a comment on the server article where a user changed the TcpListener construction to bind to a specific address, which means clients can only connect via that specific address.
In the original server design, using TcpListener with only a port number, there should be nothing preventing its use on an internet-connected computer, unless a firewall is blocking access.
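The same distinction, sketched in Go rather than C# (addresses and ports are placeholders): an empty host binds every interface, while naming an address restricts who can reach the listener.

```go
// Sketch of the binding distinction (addresses and ports are placeholders).
package main

import (
	"log"
	"net"
)

func main() {
	// Empty host: listens on all interfaces, reachable from other machines
	// (firewall and NAT permitting) -- roughly what TcpListener with only a
	// port number does in the linked C# example.
	lnAll, err := net.Listen("tcp", ":8080")
	if err != nil {
		log.Fatal(err)
	}

	// Loopback only: local processes can connect, nothing from the network can.
	lnLocal, err := net.Listen("tcp", "127.0.0.1:8081")
	if err != nil {
		log.Fatal(err)
	}

	go acceptLoop(lnAll, "all-interfaces")
	acceptLoop(lnLocal, "loopback-only")
}

func acceptLoop(ln net.Listener, label string) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		log.Println(label, "connection from", conn.RemoteAddr())
		conn.Close()
	}
}
```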
Were you aware of networkComms.net and in particular the short chat example demonstrating the functionality here (it's less than 15 lines of code)? It was written specifically for people writing client-server apps in C#, and given that most of the problems you might come across will already have been solved, it might save you some time. The library is completely plug & play and has no issues working over the internet (as long as you can set up the necessary port forwarding where required).
Generally, if both of your targets are behind NAT (so no true external IP addresses) and you are unable to configure port forwarding, you need to look at 'TCP/UDP hole punching', quite an advanced technique.

TCP vs Reliable UDP

I am writing an application where the client side will be uploading data to the server through a wireless link.
The connection should be very reliable. The link is expected to break many times, and there will be many clients connected to the server.
I am confused whether to use TCP or reliable UDP.
Please share your thoughts.
Thanks.
RUDP is not, of course, a formal standard, and there's no telling if you will find existing implementations you can use. Given a choice between rolling this from scratch and just re-making TCP connections, I'd choose TCP.
To be safe, I would go with TCP just because it's a reliable, standard protocol. RUDP has the disadvantage of not being an established standard (although it's been mentioned in several IETF discussions).
Good luck with your project!
It's likely that both your TCP and RUDP links would be broken by your environment, so the fact that you're using RUDP is unlikely to help there; there will likely be times when no datagrams can get through...
What you actually need to make sure of is that a) you can handle the number of connected clients, b) your application protocol can detect reasonably quickly when you've lost connectivity with a client (or server), and c) you can handle the required reconnection and maintenance of cross-connection session state for clients.
As long as you deal with b) and c), it doesn't really matter if the connection keeps being broken. Design your application protocol so that you can get things done in short batches: if you're uploading files, send small blocks, and make sure the protocol can resume a transfer that broke halfway through. You don't want to get 99% of the way through a 2 GB transfer, lose the connection, and have to start again.
For this to work, your server needs some kind of client session state cache where you can keep the logical state of a client's connection beyond the life of the connection itself. Design from the start to expect a given session to span multiple separate connections. The session state should probably have some kind of timeout so that if a client goes away for a long time it doesn't continue to consume resources on the server; though, to be honest, it may simply be a case of saving the state off to disk after a while.
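As an invented illustration of the resumable-batch idea (the "OFFSET"/"APPEND" wire format here is made up for the sketch): the client asks the server how much of a named transfer it already holds, then streams from that offset, so a broken connection only costs the current block.

```go
// Invented illustration of a resumable upload: the client asks the server
// for the current offset of a named transfer, then streams from there.
// The wire format ("OFFSET <name>\n" -> decimal offset) is hypothetical.
package main

import (
	"bufio"
	"fmt"
	"io"
	"log"
	"net"
	"os"
	"strconv"
	"strings"
)

func uploadResumable(addr, name, path string) error {
	conn, err := net.Dial("tcp", addr)
	if err != nil {
		return err
	}
	defer conn.Close()

	// Ask the server how much of this transfer it already holds.
	fmt.Fprintf(conn, "OFFSET %s\n", name)
	line, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return err
	}
	offset, err := strconv.ParseInt(strings.TrimSpace(line), 10, 64)
	if err != nil {
		return err
	}

	f, err := os.Open(path)
	if err != nil {
		return err
	}
	defer f.Close()
	if _, err := f.Seek(offset, io.SeekStart); err != nil {
		return err
	}

	// Stream the remainder; if the link breaks, the next call resumes
	// from whatever offset the server acknowledges then.
	fmt.Fprintf(conn, "APPEND %s\n", name)
	_, err = io.Copy(conn, f)
	return err
}

func main() {
	// Placeholder address and file; a real client would retry on error.
	if err := uploadResumable("server.example:7000", "backup-1", "/tmp/data.bin"); err != nil {
		log.Println("upload interrupted, will resume on next attempt:", err)
	}
}
```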
In summary, I don't think the choice of transport matters and I'd go with TCP at least to start with. What will really matter is being able to manage your client's session state on the server and deal with the fact that clients will connect and disconnect regularly.
If you aren't sure, odds are that you should use TCP. For one thing, it's certain to be part of the network stack for anything supporting IP. "Reliable UDP" is rarely supported out of the box, so you'll have some extra support work for your clients.

How to retain one million simultaneous TCP connections?

I am to design a server that needs to serve millions of clients that are simultaneously connected with the server via TCP.
The data traffic between the server and the clients will be sparse, so bandwidth issues can be ignored.
One important requirement is that whenever the server needs to send data to any client it should use the existing TCP connection instead of opening a new connection toward the client (because the client may be behind a firewall).
Does anybody know how to do this, and what hardware/software is needed (at the least cost)?
What operating systems are you considering for this?
If you're using a Windows OS later than Vista, you shouldn't have a problem with many thousands of connections on a single machine. I've run tests (here: http://www.lenholgate.com/blog/2005/11/windows-tcpip-server-performance.html) with a low-spec Windows Server 2003 machine and easily achieved more than 70,000 active TCP connections. Some of the resource limits that affect the number of connections possible have been lifted considerably on Vista (see here: http://www.lenholgate.com/blog/2005/11/windows-tcpip-server-performance.html), so you could probably achieve your goal with a small cluster of machines. I don't know what you'd need in front of those to route the connections.
Windows provides a facility called I/O Completion Ports (see: http://msdn.microsoft.com/en-us/magazine/cc302334.aspx) which allows you to service many thousands of concurrent connections with very few threads (I was running tests yesterday with 5,000 connections saturating a link to a server, with 2 threads to process the I/O...). Thus the basic architecture is very scalable.
If you want to run some tests, I have some freely available tools on my blog that allow you to thrash a simple echo server using many thousands of connections (1) and (2), and some free code which you could use to get you started (3).
The second part of your question, from your comments, is trickier. If the client's IP address keeps changing, and there's nothing between you and them providing NAT to give you a consistent IP address, then their connections will no doubt be terminated and need to be re-established. If the clients detect this connection teardown when their IP address changes, they can reconnect to the server; if they can't, then I would suggest that the clients need to poll the server every so often so that they can detect the connection loss and reconnect. There's nothing the server can do here, as it can't predict the new IP address, and it will only discover that the old connection has failed when it tries to send data.
And remember, your problems are only just beginning once you get your system to scale to this level...
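A sketch of the client side of that reconnection logic (in Go; the address, intervals, and "ping" protocol are placeholders): an application-level heartbeat reveals a dead connection even when no FIN or RST ever arrives, and a capped exponential backoff avoids hammering the server.

```go
// Hypothetical client-side reconnect loop with a heartbeat: detects a dead
// connection (after an IP change, say) and re-establishes it with backoff.
package main

import (
	"log"
	"net"
	"time"
)

func runClient(addr string) {
	backoff := time.Second
	for {
		conn, err := net.Dial("tcp", addr)
		if err != nil {
			time.Sleep(backoff)
			if backoff < time.Minute {
				backoff *= 2 // exponential backoff, capped at one minute
			}
			continue
		}
		backoff = time.Second
		if err := heartbeat(conn); err != nil {
			log.Println("connection lost, reconnecting:", err)
		}
		conn.Close()
	}
}

// heartbeat polls the server periodically; a failed write or a read timeout
// reveals that the old connection is dead even if no FIN/RST ever arrived.
func heartbeat(conn net.Conn) error {
	buf := make([]byte, 64)
	for {
		if _, err := conn.Write([]byte("ping\n")); err != nil {
			return err
		}
		conn.SetReadDeadline(time.Now().Add(10 * time.Second))
		if _, err := conn.Read(buf); err != nil {
			return err
		}
		time.Sleep(30 * time.Second)
	}
}

func main() {
	runClient("server.example:9000") // placeholder address
}
```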
This problem is related to the so-called C10K problem. The C10K page lists a large number of good resources for addressing the problems you will encounter when you try to allow thousands of clients to connect to the same server.
I came across the APE Project a while back. It seems like a dream come true: they claim to support up to 100k concurrent clients on a single node. Spread them across 10 or 20 nodes and you can serve millions. Perfect for RESTful applications. You might want to look deeper if you need any shared namespace. One drawback is that it is a standalone server, supplementary to a web server. The server is of course open source, so any cost is hardware/ISP-related.
You cannot use UDP. If the client sends a request and you don't reply immediately, a router is going to forget the reverse route in 30 seconds or less, so your server will never be able to reply to the client.
TCP is the only option, and it, too, will give you headaches. Most routers are going to forget the route and/or drop the connection after a few minutes, so your client/server code is going to have to send "keep alives" fairly often.
I recommend setting up a "sniffer", to see how the phone companies are staying in touch with your smartphone for their "push" technology. Copy whatever they're doing, because that stuff works!
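In Go terms, enabling TCP keep-alives looks like the sketch below (the 30-second period is a placeholder; pick something comfortably shorter than your NAT's idle timeout):

```go
// Sketch: enable TCP keep-alives on each accepted connection so that idle
// connections keep their NAT mappings alive and dead peers are detected.
// The 30-second period is a placeholder; tune it below the NAT idle timeout.
package main

import (
	"log"
	"net"
	"time"
)

func main() {
	ln, err := net.Listen("tcp", ":9000")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			continue
		}
		if tcp, ok := conn.(*net.TCPConn); ok {
			tcp.SetKeepAlive(true)
			tcp.SetKeepAlivePeriod(30 * time.Second)
		}
		go handle(conn)
	}
}

func handle(conn net.Conn) {
	defer conn.Close()
	// ... application protocol ...
}
```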
As Greg mentioned, the problem you are describing is C10K (or rather "C1M" in your case).
I recently made a simple TCP echo server on Linux that scales very well with the number of sessions (only tested up to 200,000, though) by using the epoll queue. On BSD, you have something similar called kqueue.
You can check out the code if you want to. Hope this helps and good luck!
EDIT: As noted in the comments below, my original assertion that there is a 64K limit based on the number of ports is incorrect, however there is a 32K limit on the number of socket handles, so my suggested design is valid.
With a typical TCP/IP server design, you're limited in the number of simultaneous open connections you can have. The server has one listening port, and when a client connects to it the server makes an accept call, which creates a new socket for the rest of the connection (the new socket shares the listening port; the connection is distinguished by the client's address and port, but each one still consumes a socket handle).
To handle more than 64K simultaneous connections I think you need to use UDP instead. You only need one port for the server to listen on, and you manage the connections using a 32-bit client ID in the packet data instead of having a separate socket for each client. The 32-bit client ID could be the client's IP address, and the client can listen on a known UDP port for messages coming back from the server. That port would be the only one that needs to be open on the firewall.
With this approach, your only limitation is how quickly you can handle and respond to UDP messages. With millions of clients, even sparse traffic could give you large spikes, and if you don't read the packets fast enough your input queue will fill up and you'll start dropping packets. The C10K page Greg points to will give you strategies for that.
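A sketch of that single-socket UDP design (the 4-byte client-ID framing is invented for illustration): every datagram carries a client ID that the server maps to session state, and replies go to the sender's latest address.

```go
// Sketch of the single-socket UDP design: every datagram starts with a
// 4-byte client ID (invented framing), which the server maps to session state.
package main

import (
	"encoding/binary"
	"log"
	"net"
)

type session struct {
	addr *net.UDPAddr // last known address to reply to
	// ... per-client state; a real server would also expire idle entries ...
}

func main() {
	conn, err := net.ListenUDP("udp", &net.UDPAddr{Port: 9000})
	if err != nil {
		log.Fatal(err)
	}
	sessions := make(map[uint32]*session)

	buf := make([]byte, 1500)
	for {
		n, from, err := conn.ReadFromUDP(buf)
		if err != nil || n < 4 {
			continue
		}
		id := binary.BigEndian.Uint32(buf[:4])
		s, ok := sessions[id]
		if !ok {
			s = &session{}
			sessions[id] = s
		}
		s.addr = from // clients can change address; trust the latest sender
		// Echo the payload back as a placeholder for real handling.
		conn.WriteToUDP(buf[4:n], s.addr)
	}
}
```

The caveat above still applies: if this loop can't keep up, the socket's receive buffer fills and the kernel silently drops datagrams, which is why real servers shard the work across several sockets or threads.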
