How does the client know which transport protocol to use?

Let's assume that I start a server at one of the computers in my private network (192.168.10.10:9900).
Now, when making a request from some other computer in the same network, how does the client computer (OS?) know which protocol to use / which protocol the server follows? [TCP or UDP]
EDIT: As mentioned in the answers, I was basically looking for a default protocol which will be used by the client in the absence of any transport protocol information.

TCP and UDP work at the transport layer of the TCP/IP model. Their main difference is that TCP has mechanisms to guarantee the arrival of messages, while UDP is lighter and therefore faster at delivering information. Which protocol is used is always decided by the application that uses it.
The reference to the private server at 192.168.10.10:9900 is quite vague. To be more precise, let's say we have an Apache web server running at 192.168.10.10:9900 (the default port is 80 when the server is installed, but it can be changed in the configuration).
Web servers (Apache, IIS, etc.) use TCP because when a client (computer, cell phone, etc.) requests a page through a browser (Chrome, Firefox, etc.), the goal is to deliver the whole website, not just some pieces of it. That is why these servers chose and use this protocol in the first place: they want the user to end up with the complete page, even if a few extra milliseconds are sacrificed on the checks involved in using TCP.
On the client side, a user visiting a web page from any browser (Chrome, Firefox, etc.) will use TCP, because the browser is already written to send its requests, and to receive the website's responses, over that protocol.
This behavior repeats for any client/server application. For an example on the UDP side, look at DHCP, the service used to obtain an IP address when a device connects to a Wi-Fi network. That service aims to be as fast as possible rather than as reliable as possible, since you want the device on the network as quickly as possible, so it uses UDP, and any device joining a Wi-Fi network sends those messages over UDP.
Finally, if you want to find out quickly whether a specific application uses TCP or UDP, you can use Wireshark, which lets you capture the messages leaving the device and shows the protocol used at each layer.

There is no reason any client would make a request to your server, so why would it care what protocol it follows? Clients don't just randomly connect to things to see if there's a server there. So it doesn't make any difference to any client.

Normally, the client will use TCP by default. If you start the server in UDP mode and then run curl -XGET 192.168.10.10:9900/test-page, you will get back a curl: (7) Failed to connect to 192.168.10.10 port 9900: Connection refused error. You can try it yourself: start a UDP listener with nc -lvp 9900 -u and you will see that result.
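If you want to reproduce the same mismatch without curl and nc, here is a minimal sketch in Python (run on one machine; the loopback address stands in for 192.168.10.10):

```python
import socket

# UDP "server": binds UDP port 9900 but installs no TCP listener.
udp_srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_srv.bind(("0.0.0.0", 9900))

# TCP client (what curl does under the hood): the connect fails because
# nothing is listening on TCP port 9900.
tcp_cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_cli.settimeout(2)
try:
    tcp_cli.connect(("127.0.0.1", 9900))
except OSError as exc:
    print("TCP connect failed:", exc)   # typically "Connection refused"
```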

The answers here are pointing to some default protocol. It's not that. Whenever you start an application, say an HTTP server, the server's code opens a socket (which can be TCP or UDP); since HTTP on port 80 is a TCP protocol, the code creates a TCP socket. Likewise, every other network application decides which transport layer protocol to use (TCP or UDP) based on its own requirements. A DNS client, for example, will create a UDP socket to contact a DNS server, since DNS on port 53 is mostly carried over UDP. TCP and UDP have different use cases, advantages and disadvantages, and the decision to implement a server on one or the other is taken based on those.
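A rough sketch of what that choice looks like in code (Python used here for brevity; the port numbers are illustrative):

```python
import socket

# An HTTP-style server opens a TCP (stream) socket ...
tcp_srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_srv.bind(("0.0.0.0", 8080))        # illustrative port
tcp_srv.listen()

# ... while a DNS-style server opens a UDP (datagram) socket.
udp_srv = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_srv.bind(("0.0.0.0", 5300))        # illustrative port, not the real 53

# A client has to make the matching choice; nothing negotiates this at
# run time, it is fixed by the protocol the application implements.
tcp_cli = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # to talk to the TCP server
udp_cli = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)    # to talk to the UDP server
```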

Related

Retrieve available clients via UDP broadcast

I'm currently developing a "node-based" system in which a server sends out a UDP broadcast on the private network (with a custom protocol), which is received by several different clients that support that protocol. After the request, the server will pick some of those clients for a more stable TCP connection.
Request for client sequence
Server broadcasts a request-for-ip message to every device/node on the network.
All available clients that support the protocol answer the server with their unique IP.
Server chooses among the clients via a request-for-connection message.
The client chosen by the server connects to it via TCP for a reliable connection.
My question
I've got pretty good knowledge of both TCP and UDP, but I've never designed a system like this before. Do you think this system is built the right way, or is there a more "standard" way of doing something similar? What are your thoughts?
Thanks!
--- Edit ---
Added a diagram of the program.
There is a standard protocol to advertise services on the network, which you may like to consider: Simple Service Discovery Protocol, based on periodic UDP multicast:
The Simple Service Discovery Protocol (SSDP) is a network protocol based on the Internet protocol suite for advertisement and discovery of network services and presence information. It accomplishes this without assistance of server-based configuration mechanisms, such as Dynamic Host Configuration Protocol (DHCP) or Domain Name System (DNS), and without special static configuration of a network host. SSDP is the basis of the discovery protocol of Universal Plug and Play (UPnP) and is intended for use in residential or small office environments.
In this protocol, clients join that UDP multicast group to discover local network services and initiate connections to them if they wish to. This is pretty much the intended use case for the protocol, which is somewhat different from yours.
One benefit of IP/UDP multicast is that multicast packets can be dropped in the network adapter if no process on the host has joined that multicast group. Another one is that IP/UDP multicast can be routed across networks.
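As a rough idea of what a client listening for such multicast announcements looks like, here is a Python sketch (assumes a typical Linux/macOS host and the standard SSDP group):

```python
import socket
import struct

GROUP, PORT = "239.255.255.250", 1900   # the SSDP multicast group and port

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.bind(("", PORT))

# Join the multicast group; packets for this group are only delivered to
# hosts that have joined it.
mreq = struct.pack("4s4s", socket.inet_aton(GROUP), socket.inet_aton("0.0.0.0"))
sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

while True:
    data, addr = sock.recvfrom(4096)
    print(addr, data[:80])              # SSDP NOTIFY / M-SEARCH traffic on the LAN
```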
From the diagram you posted:
The server is the mediator (design pattern) whose location must be known to every other process of the distributed system.
The clients need to connect/register with the server.
Your master client is a control application.
It makes sense for the server to advertise itself over UDP multi-cast.
Online clients would connect to the server over TCP at start-up or after a TCP connection loss. If a client terminates for any reason, that breaks the TCP connection and the server becomes aware of it immediately, unless the client was powered off or its OS crashed. You may want to enable frequent TCP keep-alives so that the server detects dead clients as soon as possible when no data is being transmitted from the server to the clients. The same applies on the client side.
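For the keep-alive part, a small sketch of what could be set on each accepted connection (Python; the TCP_KEEPIDLE/TCP_KEEPINTVL/TCP_KEEPCNT constants are Linux-specific, other platforms expose different knobs):

```python
import socket

def enable_keepalive(conn: socket.socket) -> None:
    """Ask the kernel to probe an idle connection so dead peers are noticed."""
    conn.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Linux-specific tuning: first probe after 10 s idle, then every 5 s,
    # declare the peer dead after 3 missed probes.
    conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 10)
    conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 5)
    conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, 3)
```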
All communications between the server and the clients happen over TCP. Otherwise you would need to implement reliable messaging over UDP or use PGM, which can be a lot of work. Multicast UDP should only be used for server discovery, not bi-directional communication that requires reliable delivery.
The master client also connects to the server, possibly on another port, for control. The master client can discover all available servers (if there is more than one) and allow the user to choose which one to connect to.

RPC & TCP Behavior

Can someone describe from a network point of view what RPC (SUN and/or DCE) is and why it deviates from standard TCP behavior?
The way that I understand it, a client reaches out to a server with a unique source port and then switches the source port after the TCP three-way handshake finishes. I work with ASA firewalls, so this behavior becomes very apparent when DCE RPC inspection is not enabled, since the firewall blocks it because it sees it as a threat. I have read a few MS TechNet articles and other definitions, and watched about five YouTube videos, which all seem to explain it from a programmer's perspective, but I have yet to fully understand this concept since I am not a programmer.
Note that there is nothing that deviates from standard TCP regarding the RPC protocols.
SunRPC and DCE RPC work on top of UDP (at least SunRPC can use UDP) or on top of TCP.
Typically, for an RPC client to contact/call an RPC server, it first contacts some sort of lookup server (called portmapper or rpcbind in the case of SunRPC), which replies with the location (IP address and port number) where the actual server is running.
So from a networking perspective:
RPC servers listen on a random port number, which may change each time that server program is (re)started.
At startup the RPC server connects to the portmapper, which runs on a well-known port, and registers the IP address and port number it is listening on.
Normally the portmapper service runs on the same machine as the RPC server programs.
When a client wants to connect to or call an RPC service it performs these actions:
Connects to the portmapper on a well-known/standard destination port and asks it where the particular service it wants to reach is.
The portmapper replies with the IP address and port number of the service the client asked for.
The client tears down the connection to the portmapper.
The client establishes a new connection to the service using the IP address and port number the portmapper gave it.
The client calls the RPC service over this new connection, which it may reuse for multiple RPC calls.
These RPC calls are just application messages exchanged on top of a TCP connection.
(When UDP is used instead of TCP, it works much the same, but naturally there is no connection setup/teardown performed over the network.)
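A conceptual sketch of that client sequence (Python; the text-based lookup messages here are invented for illustration, the real portmapper/rpcbind protocol exchanges XDR-encoded RPC messages):

```python
import socket

PORTMAPPER = ("192.0.2.10", 111)        # rpcbind/portmapper listens on well-known port 111

# Step 1: ask the portmapper where the service is registered
# (hypothetical plain-text protocol, for illustration only).
with socket.create_connection(PORTMAPPER) as pm:
    pm.sendall(b"WHERE IS my-rpc-service\n")
    host, port = pm.recv(256).decode().strip().split(":")   # e.g. "192.0.2.10:48231"

# Step 2: tear that connection down and open a new one to the service itself,
# on whatever randomly chosen port it registered. All RPC calls then flow
# over this second connection.
with socket.create_connection((host, int(port))) as svc:
    svc.sendall(b"CALL do_something(42)\n")
    print(svc.recv(4096))
```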
This presents a problem for firewalls: since the servers listen on randomly chosen port numbers, one cannot administratively allow access to a particular port. Instead, a firewall wanting to support this kind of setup needs to open up the portmapper port, catch the RPC messages going to that well-known port, and inspect the content exchanged with the portmapper (which is itself implemented as an RPC server) to extract the IP address and port number from the RPC messages, in order to dynamically open a port between the RPC server and the client.

Can TCP and HTTP connection listeners interact with each other?

Is there a way in which HTTP and TCP connection listeners can interact with each other?
I have two separate application modules: one works over HTTP and the other requires TCP.
I need these two modules to interact, so is there a way I can make my HTTP-based module talk to the TCP-based module?
First of all, you need to read up a little on networking concepts. HTTP is what's known as an application level protocol, whereas TCP is what's known as a transport layer protocol. Take a look at the OSI Network Model.
As an example, you can imagine that TCP is the telephone network. It gives you the basic means to connect to another person and speak with them. However, in order to actually communicate you need to speak the same language, such as English or French. That is the application level protocol, HTTP in your case.
So, to answer your question: for your two applications to communicate and exchange data, they need to make a connection / call using TCP, and both need to speak the same language / application-level protocol, namely HTTP.
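To make the analogy concrete, here is a minimal sketch of "speaking HTTP over a TCP call" by hand (Python; example.com is used as a stand-in host):

```python
import socket

# The "phone call": a plain TCP connection to port 80.
with socket.create_connection(("example.com", 80)) as sock:
    # The "language": an HTTP/1.1 request, written as raw bytes over that connection.
    sock.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")
    response = b""
    while chunk := sock.recv(4096):
        response += chunk

print(response.decode(errors="replace")[:200])   # status line and first headers
```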
Two distinct processes won't be able to use the same IP port on the same IP address. Thus, two processes won't be able to use the same incoming stream of data coming out of the TCP connection. If they use different ports, there's no problem.
If the two processes use the same IP port, as HTTP is a protocol that sits on top of TCP, it means that your TCP process can be used as a pipe by the HTTP process. The TCP process will connect to the IP port, do its stuff, and forward the data to the HTTP process that will handle it.

How are different TCP connections in HTTP requests identified?

From what I understand, each HTTP request uses its own TCP connection (please correct me if I'm wrong). So, let's say that there are two current connections to the same server. For example, client-side JavaScript code triggering a couple of AJAX POST requests using the XMLHttpRequest object, one right after the other, before getting the response to the first one. So we're talking about two connections to the same server, each waiting for a response in order to route it to its own callback function.
Now here's the thing that I don't understand: the TCP packet includes source and destination IP and port, but won't both of these connections have the same source and destination IP addresses, and port 80? How can the packets be differentiated and routed appropriately? Does it have anything to do with the packet sequence number, which is different for each connection?
When your browser creates a new connection to the HTTP server, it uses a different source port.
For example, say your browser creates two connections to a server and that your IP address is 60.12.34.56. The first connection might originate from source port 60123 and the second from 60127. This is embedded in the TCP header of each packet sent to the server. When the server replies to each connection, it uses the appropriate port (e.g. 60123 or 60127) so that the packet makes it back to the right spot.
One of the best ways to learn about this is to download Wireshark and just observe traffic on your own network. It will show you this and much more.
Additionally, this gives insight into how Network Address Translation (NAT) works on a router. You can have many computers share the same IP address and the router will rewrite the request to use a different port so that two computers can simultaneously connect to places like AOL Instant Messenger.
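You can also see the different source ports directly from code: open two connections to the same server and print the (source IP, source port) pair the OS picked for each (a small sketch; the host name is illustrative):

```python
import socket

a = socket.create_connection(("example.com", 80))
b = socket.create_connection(("example.com", 80))

print("connection A:", a.getsockname())   # e.g. ('192.168.1.5', 60123)
print("connection B:", b.getsockname())   # e.g. ('192.168.1.5', 60127)

a.close()
b.close()
```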
They're differentiated by the source port.
Incidentally, keep-alives are the main reason each HTTP request does not generate a separate TCP connection.
A socket, in packet network communications, is considered to be the combination of four elements: server IP, server port, client IP, client port. The server port is usually fixed by the protocol (e.g. HTTP usually listens on port 80), but the client port is a random number, usually in the range 1024-65535, because the ports below that are reserved for well-known server protocols (e.g. 21 for FTP, 22 for SSH, etc.). The same network device will typically not reuse the same client port to open two different connections, even to different servers, and if two different clients use the same port, the server can tell them apart by their IP addresses. While a port is in use on a system, either to listen for connections or to establish a connection, it cannot be used for anything else. That is how the operating system can dispatch packets received by the network card to the correct process.

How does the socket API accept() function work?

The socket API is the de-facto standard for TCP/IP and UDP/IP communications (that is, networking code as we know it). However, one of its core functions, accept(), is a bit magical.
To borrow a semi-formal definition:
accept() is used on the server side. It accepts a received incoming attempt to create a new TCP connection from the remote client, and creates a new socket associated with the socket address pair of this connection.
In other words, accept returns a new socket through which the server can communicate with the newly connected client. The old socket (on which accept was called) stays open, on the same port, listening for new connections.
How does accept work? How is it implemented? There's a lot of confusion on this topic. Many people claim accept opens a new port and you communicate with the client through it. But this obviously isn't true, as no new port is opened. You actually can communicate through the same port with different clients, but how? When several threads call recv on the same port, how does the data know where to go?
I guess it's something along the lines of the client's address being associated with a socket descriptor, and whenever data comes through recv it's routed to the correct socket, but I'm not sure.
It'd be great to get a thorough explanation of the inner-workings of this mechanism.
Your confusion lies in thinking that a socket is identified by Server IP : Server Port. In actuality, sockets are uniquely identified by a quartet of information:
Client IP : Client Port and Server IP : Server Port
So while the Server IP and Server Port are constant in all accepted connections, the client side information is what allows it to keep track of where everything is going.
Example to clarify things:
Say we have a server at 192.168.1.1:80 and two clients, 10.0.0.1 and 10.0.0.2.
10.0.0.1 opens a connection on local port 1234 and connects to the server. Now the server has one socket identified as follows:
10.0.0.1:1234 - 192.168.1.1:80
Now 10.0.0.2 opens a connection on local port 5678 and connects to the server. Now the server has two sockets identified as follows:
10.0.0.1:1234 - 192.168.1.1:80
10.0.0.2:5678 - 192.168.1.1:80
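For a concrete view of that, here is a small Python sketch: one listening socket, where each accept() returns a new connected socket on the same local port, distinguished by the client's address (port 8080 is used instead of 80 so it runs without privileges):

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
listener.bind(("0.0.0.0", 8080))   # illustrative port instead of 80
listener.listen()

while True:
    # accept() returns a NEW socket bound to the same local port, but tied
    # to one specific (client IP, client port) pair.
    conn, client_addr = listener.accept()
    print("new connection from", client_addr, "on local", conn.getsockname())
    conn.close()
```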
Just to add to the answer given by user "17 of 26"
The socket actually consists of a 5-tuple: (source IP, source port, destination IP, destination port, protocol). The protocol can be TCP, UDP or any transport-layer protocol, and it is identified by the 'protocol' field in the IP datagram.
Thus it is possible to have two different applications on the server communicating with the same client on exactly the same 4-tuple but with a different protocol field. For example:
Apache at server side talking on (server1.com:880-client1:1234 on TCP)
and
World of Warcraft talking on (server1.com:880-client1:1234 on UDP)
Both the client and the server can handle this because the protocol field in the IP packet differs between the two cases, even though all the other four fields are the same.
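A quick sketch that shows the protocol field at work: a TCP socket and a UDP socket can both bind the same port number on the same machine at the same time (Python; the port number is illustrative):

```python
import socket

tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

# Both binds succeed: TCP/9900 and UDP/9900 are distinct endpoints,
# demultiplexed by the protocol field of the IP header.
tcp.bind(("0.0.0.0", 9900))
udp.bind(("0.0.0.0", 9900))
tcp.listen()
print("TCP and UDP both bound to port 9900")
```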
What confused me when I was learning this, was that the terms socket and port suggest that they are something physical, when in fact they're just data structures the kernel uses to abstract the details of networking.
As such, the data structures are implemented to be able to distinguish connections from different clients. As to how they're implemented, the answer is either a.) it doesn't matter, the purpose of the sockets API is precisely that the implementation shouldn't matter or b.) just have a look. Apart from the highly recommended Stevens books providing a detailed description of one implementation, check out the source in Linux or Solaris or one of the BSD's.
As the other guy said, a socket is uniquely identified by a 4-tuple (Client IP, Client Port, Server IP, Server Port).
The server process running on the Server IP maintains a database (meaning I don't care what kind of table/list/tree/array/magic data structure it uses) of active sockets and listens on the Server Port. When it receives a message (via the server's TCP/IP stack), it checks the Client IP and Port against the database. If the Client IP and Client Port are found in a database entry, the message is handed off to an existing handler, else a new database entry is created and a new handler spawned to handle that socket.
In the early days of the ARPAnet, certain protocols (FTP for one) would listen to a specified port for connection requests, and reply with a handoff port. Further communications for that connection would go over the handoff port. This was done to improve per-packet performance: computers were several orders of magnitude slower in those days.
