I would like to ask a general newbie question. I understand that for a computer in location A to connect to a server in location B, packets of data have to be sent to multiple data centers through multiple gateways and through multiple verification channels to ensure the connection request finds the right destination.
However after the connection is established, when the computer and the server send/receive data, do these data still need to go through [multiple data centers through multiple gateways and through multiple verification channels]?
Every TCP / UDP packet can take a different network path between source and destination. Establishing a TCP connection is stateful, but that state lives in the two endpoints and covers things like sequence numbers and the maximum segment size; it does not fix the route the packets take.
At the network layer, delivery is stateless. Please read about the OSI model in detail; you can also refer to this https://www.ccnahub.com/wp-content/uploads/2013/09/watermarked-pc1-comm.jpg It has a good explanation of how OSI works.
A TCP packet being sent from computer A to computer B will be addressed to a particular IP address. If that IP address is not on the local LAN, the packet will go first through the local LAN to whatever is designated as the local gateway. That gateway then sends it on over the connection to an external network. At that point, it will be delivered to some router in your ISP. That router will look at the destination IP address and consult a routing table to find where it should next send the packet. That will typically be another router elsewhere in the network. This continues and (assuming good routing tables in each router) the packet will get closer to its end destination on each hop. Eventually, the packet will get to a router that has a routing table that knows about either the actual IP address or the home gateway for that IP address and the packet will be sent to that gateway. That home gateway can then deliver the packet to that actual IP address. In some cases, there may be a private network at either end where private IP address/port combinations are converted to public IP addresses and vice versa.
If computer A sends multiple packets to computer B, they do not have to all go the exact same path, though typically they will (assuming no problems or congestion in the network between the two endpoints).
In this scenario where A and B are on different private networks, there is no direct connection between computer A and computer B, so each packet has to follow the path from one router to the next until it arrives at the final gateway and then the destination address.
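To make the hop-by-hop lookup concrete, here is a minimal sketch of routers consulting their tables. The router names, prefixes and the "deliver-locally" marker are all made up for illustration; real routers learn their tables through routing protocols and do the lookup in hardware. The one real technique shown is longest-prefix match: the most specific matching entry wins.

```python
import ipaddress

# Hypothetical routing tables: destination prefix -> next hop.
ROUTERS = {
    "home-gateway": {
        "192.168.1.0/24": "deliver-locally",
        "0.0.0.0/0": "isp-router",          # default route
    },
    "isp-router": {
        "203.0.113.0/24": "remote-gateway",
        "0.0.0.0/0": "backbone-router",
    },
    "backbone-router": {
        "203.0.113.0/24": "remote-gateway",
    },
    "remote-gateway": {
        "203.0.113.0/24": "deliver-locally",
    },
}


def next_hop(router, destination):
    """Pick the next hop for destination using longest-prefix match."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, hop in ROUTERS[router].items():
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network.prefixlen, hop))
    if not matches:
        raise LookupError(f"{router} has no route to {destination}")
    # The entry with the longest prefix is the most specific one.
    return max(matches)[1]


def trace(first_router, destination):
    """Follow the packet from router to router until it is delivered."""
    path = [first_router]
    while True:
        hop = next_hop(path[-1], destination)
        if hop == "deliver-locally":
            return path
        path.append(hop)


if __name__ == "__main__":
    print(trace("home-gateway", "203.0.113.7"))
```

Each call to trace repeats the same per-hop lookup, which is why every packet of a flow normally ends up on the same path as long as the tables do not change.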
However after the connection is established, when the computer and the server send/receive data, do these data still need to go through [multiple data centers through multiple gateways and through multiple verification channels]?
If the routers are doing their job appropriately, the very first packet takes the most efficient path from A to B that the network knows. There is no "better" way to send subsequent packets. Subsequent packets will follow the same process (to a router, router looks up in routing table where to send for next hop and so on). If the two endpoints are a long ways apart (in terms of network topology), then the packet may go through many routers. Routers are highly optimized pieces of equipment capable of passing off millions of packets a second as this is how data moves on any TCP/IP network like the internet.
There is no difference in how the first packet that initiates the TCP connection flows versus subsequent packets. At the network level, they are just packets traveling from a source IP address to a destination IP address. Once the connection is established, a reliability layer will be started to track packets that might get lost, initiate retransmissions, etc... but this doesn't have anything to do with how a given packet gets from A to B.
Related
I know that when two machines communicate they may use the TCP/IP protocol. But after the IP packet is routed to my router and converted to a physical signal, how does my computer decapsulate it again and send it to the proper application? I know that the transport layer header is used to identify the port number so the data reaches the proper process, but which device does all of this inside a host? I am new to networking and apologize if something here is wrong or silly.
A packet consists of information in the form of [header[body]], which is looked up and processed across all the layers in the TCP/IP stack.
The information related to all the layers is encapsulated into a single packet.
Packet is a general term here; it can be of many types depending on the protocol with which two nodes are communicating (TCP packet, UDP packet, IP packet etc). The information in a TCP/IP packet, for example, is processed by different devices or services working at specific layers.
Switches or bridges operate at the Ethernet layer. These devices switch frames inside a LAN by looking up the MAC address information.
Routers operate at the Internet layer and use the IP protocol (i.e., the IP address) to route traffic between networks.
Stateful firewalls, Proxies, Load Balancers etc. are at the transport layer. They work based on the TCP or UDP information to allow/deny/direct traffic.
The application layer facilitates effective communication between application programs in a network. The application layer is not the application itself that is doing the communication. It has protocols such as DNS, FTP, SMTP, SNMP to help and serve the purpose.
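As a rough illustration of the [header[body]] nesting and of what the receiving host's network stack does when it decapsulates, here is a sketch that wraps some data in a simplified TCP header and IPv4 header and then unwraps it again. The function names and the addresses are my own choices, the checksums are left at zero, and options and the Ethernet layer are ignored, so treat it as a teaching aid rather than a real stack.

```python
import socket
import struct

IP_FORMAT = "!BBHHHBBH4s4s"   # 20-byte IPv4 header without options
TCP_FORMAT = "!HHIIHHHH"      # 20-byte TCP header without options


def encapsulate(src_ip, src_port, dst_ip, dst_port, data):
    """Wrap data in a TCP header, then wrap the result in an IPv4 header."""
    offset_flags = (5 << 12) | 0x018          # 5 words of header, PSH+ACK
    tcp_header = struct.pack(TCP_FORMAT, src_port, dst_port, 0, 0,
                             offset_flags, 65535, 0, 0)
    segment = tcp_header + data
    ip_header = struct.pack(IP_FORMAT, (4 << 4) | 5, 0, 20 + len(segment),
                            0, 0, 64, socket.IPPROTO_TCP, 0,
                            socket.inet_aton(src_ip), socket.inet_aton(dst_ip))
    return ip_header + segment


def decapsulate(packet):
    """Peel off the IPv4 header, then the TCP header, and return what is left."""
    (version_ihl, _tos, total_length, _ident, _frag, _ttl,
     protocol, _checksum, src, dst) = struct.unpack(IP_FORMAT, packet[:20])
    header_len = (version_ihl & 0x0F) * 4
    if protocol != socket.IPPROTO_TCP:
        raise ValueError("this sketch only understands TCP")
    segment = packet[header_len:total_length]
    (src_port, dst_port, _seq, _ack, offset_flags,
     _window, _tcp_checksum, _urgent) = struct.unpack(TCP_FORMAT, segment[:20])
    data = segment[(offset_flags >> 12) * 4:]
    return {
        "src": (socket.inet_ntoa(src), src_port),
        "dst": (socket.inet_ntoa(dst), dst_port),   # the port selects the process
        "data": data,
    }


if __name__ == "__main__":
    pkt = encapsulate("192.168.1.20", 60123, "203.0.113.10", 80, b"hello")
    print(decapsulate(pkt))
```

The IP header's protocol field tells the host which transport protocol should get the body, and the destination port in the TCP header tells it which socket (and so which process) should get the data.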
References:
https://docstore.mik.ua/orelly/networking/firewall/ch06_03.htm
https://technet.microsoft.com/en-in/library/cc786128(v=ws.10).aspx
From what I've read, TCP sits on the layer between the application and IP, and handles setting up the packets, checking for errors, ordering etc so the application itself doesn't have to do it.
However, when I looked at the TCP header I became confused. From the way I understand it, some data is handed to TCP from the application, and is given a destination address to which to send the data. The TCP layer packages it up, and sends it on to the IP layer, who in turn hands it off, all the way on down to the physical layer.
But looking at the TCP header on Wikipedia, there is no mention of a destination address! There is only a destination port number which I am pretty sure is not an address.
So my question is, how does TCP get the addresses? And/or, how does IP get the address if TCP isn't passing them to it?
It's the Application that's running on top of Transport Layer that chooses everything.
If the Application is designed with reliability in mind, it chooses the connection oriented protocol like TCP.
The same application tells TCP what the Source and Destination ports should be; TCP alone cannot decide this.
Example: If you're accessing a website, your Application would be the browser. Since accessing websites normally happens over HTTP/HTTPS, and HTTP/HTTPS is designed to be reliable, it chooses TCP. Port 80 (HTTP) or 443 (HTTPS) are the standard ports used for accessing websites, so one of these is used in the Destination Port field, while the Source Port can be any random higher-numbered port.
This combination of source and destination ports, together with the two IP addresses, identifies the connection, sometimes called a Transport Layer VC (Virtual Circuit).
Coming to IP, the same application tells what the Destination IP address is, while the Source IP is the machine from where you are running the browser.
IP in Network Layer and TCP in Transport Layer cannot choose anything, it's the Application that tells them what to choose, considering they are the chosen ones.
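Here is a small sketch of that hand-off through the sockets API. The application supplies the destination IP address and port as one argument to connect, and the operating system fills in the source side. I use a UDP socket here only because connect on UDP records the peer without sending any packet; a TCP socket takes the address in exactly the same way. The function name and the 127.0.0.1 / 8080 values are just illustrative.

```python
import socket


def describe_endpoints(dest_ip, dest_port):
    """Show which values the application supplies and which the OS fills in."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # The application hands over BOTH the destination IP and the port here.
        sock.connect((dest_ip, dest_port))
        # The OS picked the source IP (from its routing table) and a free
        # source port for us.
        source = sock.getsockname()
        dest = sock.getpeername()
    return {"source": source, "dest": dest}


if __name__ == "__main__":
    print(describe_endpoints("127.0.0.1", 8080))
```

So the address never travels inside the TCP header: it is passed alongside the data, and the IP layer uses it when it builds its own header.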
So right now I'm using only TCP for my clients: they connect to the server, open a socket and receive packets freely.
But what if I decide to also use UDP in my game? Will they have to open ports? For example, if they are on regular WiFi, can I send UDP to the client without running into the port-opening problem?
Thanks.
TCP and UDP are just two examples of transport layer implementations. Both use the term 'port' to determine which app should receive an incoming packet, but they can be routed/filtered differently by routers/switches/firewalls/etc.
So the answer is no. You will have similar problems with opening ports. It's just that instead of 'TCP port xxx should be opened' you have to demand 'UDP port xxx should be opened'.
In most home networks, firewall rules allow outgoing packets (requests) to any remote port (on your server, for example, where this port should be opened). And when such a packet goes through a router, the router creates a temporary rule that allows answers to come back to the local port from which the request packet was sent.
So, normal scenario is like that:
1. Packet originates from home computer with IP 5.5.5.5. Let's say it has source UDP port 55555, source IP address 5.5.5.5 and destination port 8888.
2. Packet reaches home router. As it is going from inside, the router allows it to pass through and creates a rule, say for 2 minutes, to allow packets targeted to 5.5.5.5 to UDP port 55555.
3. Packet reaches corporate router before your server. It has a rule to pass packets for port 8888, so the packet is allowed to go.
4. Your server receives the packet and processes it. In response it creates a packet for IP 5.5.5.5 and UDP port 55555.
5. Corporate router allows response to go.
6. Home router allows response to go according to temporary rule.
7. Your computer receives the response.
Corporate computers and routers are often more restrictive to ensure security, so the second step could block the packet if your user (IP 5.5.5.5) is in a corporate network.
It is very simplified as in reality there's almost always things like NAT and rules are more complex... But in general it gives the idea how it works internally.
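The temporary rule in the scenario above can be sketched as a tiny stateful filter. Everything here is hypothetical: the class and method names are mine, the server address 203.0.113.10 is a placeholder (the scenario does not give one), and I key the rule on the remote side as well, which is a bit stricter than the rule described above. Time is passed in explicitly so the example stays deterministic.

```python
class HomeRouter:
    """Toy stateful firewall: outbound traffic opens a short-lived return path."""

    def __init__(self, rule_lifetime=120.0):
        self.rule_lifetime = rule_lifetime      # "say for 2 minutes"
        # (local_ip, local_port, remote_ip, remote_port) -> expiry time
        self.rules = {}

    def outbound(self, local_ip, local_port, remote_ip, remote_port, now):
        """Outgoing packets are always allowed and create or refresh a rule."""
        key = (local_ip, local_port, remote_ip, remote_port)
        self.rules[key] = now + self.rule_lifetime
        return True

    def inbound(self, remote_ip, remote_port, local_ip, local_port, now):
        """Incoming packets pass only if a matching rule is still alive."""
        key = (local_ip, local_port, remote_ip, remote_port)
        expiry = self.rules.get(key)
        return expiry is not None and now <= expiry


if __name__ == "__main__":
    router = HomeRouter()
    server = ("203.0.113.10", 8888)             # placeholder server address
    print(router.inbound(*server, "5.5.5.5", 55555, now=0.0))    # unsolicited
    router.outbound("5.5.5.5", 55555, *server, now=0.0)
    print(router.inbound(*server, "5.5.5.5", 55555, now=10.0))   # reply in time
    print(router.inbound(*server, "5.5.5.5", 55555, now=500.0))  # rule expired
```

This is also why a game client usually sends the first UDP packet to the server, and keeps sending something periodically: each outgoing packet refreshes the rule that lets the server's packets back in.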
An HTTP application request for www.stackoverflow.com.
This message is passed to Transport layer. Transport layer adds its header and sends the packet to Internet Layer.
The Internet Layer cannot see www.stackoverflow.com, as it can only access the header which was appended by the Transport Layer. Then how can the Internet Layer decide the route for this request packet?
How is the destination address field in the IP header filled, as only the Application Layer and Transport Layer know about that field? (The Application Layer has no interaction with the Internet Layer, and the Transport Layer only mentions the port number in its header.)
The application layer would have already retrieved the IP address of the host from the URL via DNS. The IP address as well as other data from the Application layer are sent down to the Transport layer, which packetizes the data and then sends it down to the Internet layer, and off it goes.
The application, in this case the browser, did something that ended up calling the getaddrinfo library function or something equivalent, which made the system's resolver look up the name in the DNS and return a set of IP addresses.
The application somehow chose one of those (there's standard ways to do this, but the lovely thing is how many standard ways) and used the connect system call to make the connection, which started the transport layer in the kernel working on getting a connection to that IP address.
That ends up creating IP packets with that destination address and the local address as the source, next protocol set to TCP and the SYN bit on in the TCP header. Each router on the path consults its tables and forwards the packet.
TCP magic happens, a SYN+ACK comes back, then there's a connection, over which HTTP magic happens, and the page loads.
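The name-to-address step and the connect step described above look roughly like this from Python, which exposes the same getaddrinfo and connect calls. The helper names resolve and connect_first are mine, and "try each address in order" is just one simple way of choosing; browsers use more elaborate strategies.

```python
import socket


def resolve(host, port):
    """Ask the system resolver for the addresses behind a name."""
    results = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each result is (family, type, proto, canonname, sockaddr); the first two
    # items of sockaddr are the IP address and the port for IPv4 and IPv6.
    return [(family, sockaddr[0], sockaddr[1])
            for family, _type, _proto, _canon, sockaddr in results]


def connect_first(host, port, timeout=5.0):
    """Try each resolved address in turn and return the first connected socket."""
    last_error = None
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        sock = socket.socket(family, socktype, proto)
        sock.settimeout(timeout)
        try:
            # This is where the kernel starts the TCP handshake (SYN, SYN+ACK)
            # toward the chosen IP address.
            sock.connect(sockaddr)
            return sock
        except OSError as err:
            last_error = err
            sock.close()
    raise last_error or OSError("no addresses found")


if __name__ == "__main__":
    print(resolve("localhost", 80))
```

Note that the host name is gone by the time connect is called: only the numeric address goes down to the transport and Internet layers.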
rfc791 IP - Addressing
A distinction is made between names, addresses, and routes [4]. A name indicates what we seek. An address indicates where it is. A route indicates how to get there. The internet protocol deals primarily with addresses. It is the task of higher level (i.e., host-to-host or application) protocols to make the mapping from names to addresses. The internet module maps internet addresses to local net addresses. It is the task of lower level (i.e., local net or gateways) procedures to make the mapping from local net addresses to routes. Addresses are fixed length of four octets (32 bits).
Read more: http://www.faqs.org/rfcs/rfc791.html#ixzz0buBJkVEI
It is the task of higher level (i.e., host-to-host or application) protocols to make the mapping from names to addresses ???
If you want to know how the actual IP header gets the address: it occurs in the kernel, when a socket is created (in this case a TCP socket). Check out
man 7 ip
The address is not taken from the TCP header, although the source and destination IP addresses are included in the TCP checksum calculation (through a pseudo-header).
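To show how the IP addresses feed into the TCP checksum even though they are not fields of the TCP header, here is a sketch of the IPv4 pseudo-header calculation. The function names and the sample addresses are assumptions for illustration; the layout of the pseudo-header (source address, destination address, a zero byte, the protocol number, the TCP length) follows the TCP specification.

```python
import socket
import struct


def ones_complement_sum(data):
    """16-bit one's complement sum, as used by the Internet checksum."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of words
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:                       # fold the carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return total


def tcp_checksum(src_ip, dst_ip, segment):
    """Checksum over pseudo-header + TCP segment (checksum field set to 0)."""
    pseudo_header = struct.pack(
        "!4s4sBBH",
        socket.inet_aton(src_ip),
        socket.inet_aton(dst_ip),
        0,
        socket.IPPROTO_TCP,
        len(segment),
    )
    return ~ones_complement_sum(pseudo_header + segment) & 0xFFFF


if __name__ == "__main__":
    # Minimal 20-byte TCP header: ports 60123 -> 80, SYN, checksum field zero.
    header = struct.pack("!HHIIHHHH", 60123, 80, 0, 0, (5 << 12) | 0x002,
                         65535, 0, 0)
    print(hex(tcp_checksum("60.12.34.56", "203.0.113.10", header)))
```

Changing either IP address changes the checksum, which is how the receiver can notice a segment that was delivered to the wrong address.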
From what I understand, each HTTP request uses its own TCP connection (please correct me if I'm wrong). So, let's say that there are two current connections to the same server. For example, client side JavaScript code triggering a couple of AJAX POST requests using the XMLHttpRequest object, one right after the other, before getting the response to the first one. So we're talking about two connections to the same server, each waiting for a response in order to route it to each separate callback function.
Now here's the thing that I don't understand: The TCP packet includes source and destination ip and port, but won't both of these connections have the same src and dest ip addresses, and port 80? How can the packets be differentiated and routed to appropriately? Does it have anything to do with the packet sequence number which is different for each connection?
When your browser creates a new connection to the HTTP server, it uses a different source port.
For example, say your browser creates two connections to a server and that your IP address is 60.12.34.56. The first connection might originate from source port 60123 and the second from 60127. This is embedded in the TCP header of each packet sent to the server. When the server replies to each connection, it uses the appropriate port (e.g. 60123 or 60127) so that the packet makes it back to the right spot.
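If you want to see this without a packet capture, a quick experiment on the loopback interface works. This is only a sketch: the function name is mine, the "server" is just a listening socket on 127.0.0.1 with a port picked by the OS, and the actual source port numbers will differ on every run.

```python
import socket


def open_two_connections():
    """Open two TCP connections to one local server and compare source ports."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))        # port 0: let the OS pick one
        server.listen(2)
        server_addr = server.getsockname()
        with socket.create_connection(server_addr) as first:
            with socket.create_connection(server_addr) as second:
                # Source ports as the client sees them.
                client_ports = {first.getsockname()[1],
                                second.getsockname()[1]}
                # Source ports as the server sees them on accept().
                accepted = [server.accept() for _ in range(2)]
                seen_by_server = {addr[1] for _conn, addr in accepted}
                for conn, _addr in accepted:
                    conn.close()
    return server_addr, client_ports, seen_by_server


if __name__ == "__main__":
    addr, client_ports, seen = open_two_connections()
    print("server:", addr)
    print("client source ports:", sorted(client_ports))
    print("ports seen by server:", sorted(seen))
```

Both connections share the same destination IP and port, and the same source IP; only the source port differs, and the server sees exactly those two ports.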
One of the best ways to learn about this is to download Wireshark and just observe traffic on your own network. It will show you this and much more.
Additionally, this gives insight into how Network Address Translation (NAT) works on a router. You can have many computers share the same IP address and the router will rewrite the request to use a different port so that two computers can simultaneously connect to places like AOL Instant Messenger.
They're differentiated by the source port.
Incidentally, the main reason each HTTP request does not need a separate TCP connection is keep-alive (persistent connections), which lets one connection carry several requests.
A connection, in packet network communications, is identified by the combination of 4 elements: server IP, server port, client IP, client port. The server port is usually fixed by the protocol, e.g. HTTP usually listens on port 80, but the client port is a number chosen by the operating system, usually in the range 1024-65535. This is because the lower ports are reserved for well-known server protocols (e.g. 21 for FTP, 22 for SSH, etc.). Two connections from the same host to the same server IP and port must use different client ports, and if two different clients happen to use the same port, the server can tell them apart by their IP addresses. Because the full combination is unique for every connection, the operating system can dispatch packets to the correct process once they are received by the network card.
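A rough model of that dispatch step is a lookup table keyed on the 4 elements. The class names, the owner labels and the server address 203.0.113.10 are invented for the example; a real kernel keeps a similar table for its sockets, but with far more state per entry.

```python
from typing import NamedTuple


class ConnectionKey(NamedTuple):
    server_ip: str
    server_port: int
    client_ip: str
    client_port: int


class Demultiplexer:
    """Toy version of the table an OS uses to find the owner of a packet."""

    def __init__(self):
        self._owners = {}

    def register(self, key, owner):
        if key in self._owners:
            raise ValueError(f"{key} is already in use")
        self._owners[key] = owner

    def dispatch(self, key):
        """Return the owner of the connection, or None if nobody claims it."""
        return self._owners.get(key)


if __name__ == "__main__":
    table = Demultiplexer()
    # Two connections from the same client differ only in the client port.
    table.register(ConnectionKey("203.0.113.10", 80, "60.12.34.56", 60123),
                   "first request")
    table.register(ConnectionKey("203.0.113.10", 80, "60.12.34.56", 60127),
                   "second request")
    # A different client may reuse a port number; its IP keeps the key unique.
    table.register(ConnectionKey("203.0.113.10", 80, "60.12.34.99", 60123),
                   "another client")
    print(table.dispatch(
        ConnectionKey("203.0.113.10", 80, "60.12.34.56", 60127)))
```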