Where is the source and destination address fields in TCP header? - tcp

From what I've read, TCP sits on the layer between the application and IP, and handles setting up the packets, checking for errors, ordering etc so the application itself doesn't have to do it.
However, when I looked at the TCP header I became confused. From the way I understand it, some data is handed to TCP from the application, and is given a destination address to which to send the data. The TCP layer packages it up, and sends it on to the IP layer, who in turn hands it off, all the way on down to the physical layer.
But looking at the TCP header on Wikipedia, there is no mention of a destination address! There is only a destination port number which I am pretty sure is not an address.
So my question is, how does TCP get the addresses? And/or, how does IP get the address if TCP isn't passing them to it?

It's the Application that's running on top of Transport Layer that chooses everything.
If the Application is designed with reliability in mind, it chooses the connection oriented protocol like TCP.
The same applications tells TCP what the Source and Destination port should be, TCP alone cannot decide this.
Example: If you're accessing a website, your Application would be the browser, since accessing websites normally happens over HTTP/HTTPS and HTTP/HTTPS is designed to be reliable, it chooses TCP. Port 80(HTTP) or 443(HTTPS) are the standard ports used for accessing websites, so either of these ports are used in the Destination Port field while the Source Port can be any random higher number port.
This combination is used to identify something called Transport Layer VC(Virtual Circuit).
Coming to IP, the same application tells what the Destination IP address is, while the Source IP is the machine from where you are running the browser.
IP in Network Layer and TCP in Transport Layer cannot choose anything, it's the Application that tells them what to choose, considering they are the chosen ones.

Related

What happens after host receives physical data from router

I know that when two machine communicate they may use the TCP/IP protocol.. But after the IP packet is routed to my router and it is converted to physical signal , how does my computer again decapsulate it and send it to proper application....I know that transport layer header is used for identifying port numbers to send it to proper process,but which device will do all these inside a host..am new to network and apologize if something was wrong or silly here
A packet comprises of information in the form of [header[body]] which will be looked up and processed across all the layers in the TCP/IP stack.
The information related to the all layers are encapsulated into a single packet.
Packet being a general term here, can be of many types based on the protocol with which two nodes are communicating (TCP Packet, UDP Packet, IP Packet etc). The information from a TCP/IP packet for example, are processed by different devices or services working at specific layers.
Switches or Bridges operate at the Ethernet layer. These devices switch packets inside LAN by looking up the MAC address information.
Routers operate at the Internet Layer and utilizes the IP protocol (i.e., IP address) to route traffic between networks.
Stateful firewalls, Proxies, Load Balancers etc. are at the transport layer. They work based on the TCP or UDP information to allow/deny/direct traffic.
Application layer facilities effective communication between application programs in a network. The application layer is not the application itself that is doing the communication. It has protocols such as DNS, FTP, SMTP, SNMP to help and serve the purpose.
References:
https://docstore.mik.ua/orelly/networking/firewall/ch06_03.htm
https://technet.microsoft.com/en-in/library/cc786128(v=ws.10).aspx

Client Server Architecture

I'm struggle with what technique to choose for a server client aspect of my application.
Defining design
Windows, C# on .net 2
On many machines there is a .net 2 service. I call that the Client.
Machines can be in different networks behind NAT's (or not) connected to Internet.
Server services are public.
Requirements
To communicate with the Clients on demand.
Client must listen for incoming connections.
The server can be or not online.
Port forwarding is not possible.
What are my choices to do something like that?
Now I'm looking in the UDP Hole punching technique. The difference between the UDP hole punching technique setup and my setup is that instead of having 2 clients behind a NAT and a mediation Server, I got only the client behind the NAT that must communicate with the server. That must be easier but I'm having hard time to understand and implement.
I'm on the right way with the this kind of NAT traversal or may be some other methods much easier to implement?
Other methods that I've taken in consideration:
When the service sees the server online, creates a connection to the server using TCP. The problem is that I have something around 200 clients, and the number is rising and I was afraid that this is a resource killer.
When the service sees the server online, checks a database table for commands then at every 30 seconds checks again. This is also a resource killer for my server.
Bottom of line is, if the UDP Hole Punching tehnique is the right way for this scenario, please provide some code ideas for de UDPServer that will run on the service behind NAT.
Thank you.
Hole punching and p2p
You might be interested in a high level discussion of UDP hole punching. Hole punching is needed if you want clients (who both might be behind a firewall) to communicate directly without an relaying server. This is how many peer 2 peer (p2p) communications work.
With p2p, typically NAT'ed clients must use some external server to determine each other's "server reflexive address." When NAT translation occurs, behind the firewall ports can be mapped to some arbitrary port to the public. A client can use a STUN server to determine its "server reflexive address." Clients then will, through an intermediary server, exchange server reflexive addresses and can initiate communication (with hole punching to initiate the session).
Often, a NAT will not behave in a manner to allow direct communication as described above. Sending packets to different destinations will cause a NAT to map ports to entirely different values depending on the destination. In this case, a TURN server is needed.
Links
Nat traversal and different types of NAT behaviors
STUN RFC
TURN RFC
Server-client communication
If your client only needs to communicate with a server, hole punching is not needed. As long as the client can communicate with the public Internet, then you can use any C# socket API (I'm not familiar with C#) to make connections to the server's public IP/port combination. Typically, clients making socket connections don't specify a source port and let the underlying socket API make that decision since it really doesn't matter.
Your server should be listening to a specific port (you make this determination), and when it receives a packet from the client, the source address of the packet will be some NAT'ed address. In other words, the source address will be the public IP of whatever firewall your client is behind. If the NAT changed the source port of the client's packet, the server will see this NAT'ed port as the source port. It really doesn't matter, since when the server sends back a response packet, the NAT on the client machine will translate the destination port (it stores translations internally) and correctly send the packet back to the correct private host (the client).

Are there different ports for output and input?

When sending data using UDP, a destination port is needed to be specified.
If sending by TCP, a source port should also be specified.
Are there different ports for input and output? E.g., if I specify port 1234, can I use it for both input and output or should I use different ports for output and input?
EDIT:
To clarify my question:
- I send data from port X.
- Someone sends data to me to port X.
Are those two different ports or is the same one used?
When sending data using UDP, a
destination port is needed to be
specified.
Correct.
If sending by TCP, a source port
should also be specified.
Incorrect. The system will allocate one for you automatically if not specified. This is the normal usage.
Are there different ports for input and output?
No. The local port you are bound to is used for both.
And all this applies to both UDP and TCP.
The source port is a port that exists only on the computer that is initiating the connection, whereas the destination port exists only on the computer that is receiving it (though both are visible to both endpoints). Both TCP and UDP have both source and destination ports. Usually the source port is selected automatically by the socket library from the unused ports on the computer. There are very few good reasons for selecting a specific source port, and it will often be changed by the Internet gateway (router) as a part of the Network Address Translation (NAT) process.
Edit: To clarify, both the source and destination ports are used for both input and output. Which port is on your computer depends on which end of the TCP connection you are on. If you are on the receiving end, then the destination port is on your computer. When you are looking at the connection from your perspective, it will be the source port, and will be used for both input and output. The same principle applies to UDP as well, except that there are no "connections" per se, merely an exchange of raw data between ports.
TCP needs both a source and a destination port because it forms a connection between the two clients, whereas UDP is connectionless; You simply send data to a destination port and it either arrives or not.
So with TCP, you open a "channel" between two computers. You send data through it and possibly receive some back.
With UDP, if you want to receive data, then yes you need a "separate" port that listens for incoming data.

How Network Layer finds route for a packet

An HTTP application request for www.stackoverflow.com.
This message is passed to Transport layer. Transport layer adds its header and sends the packet to Internet Layer.
The Internet Layer cannot see www.stackoverflow.com as it can only access the header which was appended by Transport Layer. Then how can Internet Layer decide route for this request packet.
How is the destination address field in IP header is filled, as only Application Layar and Transport Layer know about that field. (Application layer has no interaction with Internet Layer and Transport Layer mention port number in its Header.)
The application layer would have already retrieved the IP address of the host from the URL via DNS. The IP address as well as other data from the Application layer are sent down to the Transport layer which packetizes the data and then send it down to the Internet layer and then it goes.
The application, in this case the browser, did something that ended up calling the getaddrinfo library function or something equivalent, which made the system's resolver look up the name in the DNS and return a set of IP addresses.
The application somehow chose one of those (there's standard ways to do this, but the lovely thing is how many standard ways) and used the connect system call to make the connection, which started the transport layer in the kernel working on getting a connection to that IP address.
That ends up creating IP packets with that destination address and the local address as the source, next protocol set to TCP and the SYN bit on in the TCP header. Each router on the path consults its tables and forwards the packet.
TCP magic happens, a SYN+ACK comes back, then there's a connection, over which HTTP magic happens, and the page loads.
rfc791 IP - Addressing
A distinction is made between names, addresses, and routes [4]. A name indicates what we seek. An address indicates where it is. A route indicates how to get there. The internet protocol deals primarily with addresses. It is the task of higher level (i.e., host-to-host or application) protocols to make the mapping from names to addresses. The internet module maps internet addresses to local net addresses. It is the task of lower level (i.e., local net or gateways) procedures to make the mapping from local net addresses to routes. Addresses are fixed length of four octets (32 bits).
Read more: http://www.faqs.org/rfcs/rfc791.html#ixzz0buBJkVEI
It is the task of higher level (i.e., host-to-host or application) protocols to make the mapping from names to addresses ???
If you want to know how the actual IP header gets the address. It occurs in the Kernel, when a socket is created. In this case a TCP socket, Check out
man 7 ip
The data is not inherited from the TCP packet, though the data is included in the checksum of the TCP header.

How are different TCP connections in HTTP requests identified?

From what I understand, each HTTP request uses its own TCP connection (please correct me if i'm wrong). So, let's say that there are two current connections to the same server. For example, client side javascript code triggering a couple of AJAX POST requests using the XMLHttpRequest object, one right after the other, before getting the response to the first one. So we're talking about two connections to the same server, each waiting for a response in order to route it to each separate callback function.
Now here's the thing that I don't understand: The TCP packet includes source and destination ip and port, but won't both of these connections have the same src and dest ip addresses, and port 80? How can the packets be differentiated and routed to appropriately? Does it have anything to do with the packet sequence number which is different for each connection?
When your browser creates a new connection to the HTTP server, it uses a different source port.
For example, say your browser creates two connections to a server and that your IP address is 60.12.34.56. The first connection might originate from source port 60123 and the second from 60127. This is embedded in the TCP header of each packet sent to the server. When the server replies to each connection, it uses the appropriate port (e.g. 60123 or 60127) so that the packet makes it back to the right spot.
One of the best ways to learn about this is to download Wireshark and just observe traffic on your own network. It will show you this and much more.
Additionally, this gives insight into how Network Address Translation (NAT) works on a router. You can have many computers share the same IP address and the router will rewrite the request to use a different port so that two computers can simultaneously connect to places like AOL Instant Messenger.
They're differentiated by the source port.
The main reason for each HTTP request to not generate a separate TCP connection is called keepalives, incidentally.
A socket, in packet network communications, is considered to be the combination of 4 elements: server IP, server port, client IP, client port. The second one is usually fixed in a protocol, e.g. http usually listen in port 80, but the client port is a random number usually in the range 1024-65535. This is because the operating system could use those ports for known server protocols (e.g. 21 for FTP, 22 for SSH, etc.). The same network device can not use the same client port to open two different connections even to different servers and if two different clients use the same port, the server can tell them apart by their IP addresses. If a port is being used in a system either to listen for connection or to establish a connection, it can not be used for anything else. That's how the operating system can dispatch packets to the correct process once received by the network card.

Resources