Packet loss while using UDP to fetch Data from Memcached - networking

I have heard that many companies, such as Facebook, use UDP to fetch data from memcached. How do they make sure there is no packet loss and that packets are received in the required order? As we know, TCP provides these guarantees but UDP does not.

OSI Model has 7 layers which are:
Application Layer
Presentation Layer
Session Layer
Transport Layer
Network Layer
Data Link Layer
Physical Layer
Splitting things into layers is a very good approach to solving problems, but it doesn't mean every network operation has to happen in one particular layer.
As you mentioned, TCP provides feedback to end systems while UDP doesn't, but UDP has its own advantages. First, a UDP datagram is simpler than a TCP segment. Also, many huge systems like Facebook use UDP because TCP would not be a very clever choice for this kind of system: every sender would have to keep track of the sending rate and retransmissions for a great many receivers. So if they used TCP, their network stack would be under very heavy pressure.
So they implement flow control in the application layer to reduce network traffic.
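One way an application can do its own bookkeeping on top of UDP is to put a small header in front of every datagram. As far as I know from memcached's protocol documentation, its UDP mode prefixes each datagram with an 8-byte frame header made of four 16-bit big-endian fields: request ID, sequence number, total number of datagrams in the message, and a reserved field. The sketch below shows how a client could use such a header to notice missing datagrams and restore the order. The function names are made up for this example, and it is not Facebook's or memcached's actual client code.

```python
import struct

# request id, sequence number, total datagrams, reserved (all 16-bit, network order)
HEADER = struct.Struct("!HHHH")


def pack_frame(request_id, seq, total, payload):
    """Prefix a payload with the 8-byte frame header."""
    return HEADER.pack(request_id, seq, total, 0) + payload


def reassemble(datagrams, request_id):
    """Return the full message, or None if any datagram is still missing.

    Datagrams may arrive in any order; ones for other requests are ignored.
    """
    parts = {}
    total = None
    for dgram in datagrams:
        rid, seq, count, _reserved = HEADER.unpack_from(dgram)
        if rid != request_id:
            continue  # stale or unrelated response
        total = count
        parts[seq] = dgram[HEADER.size:]
    if total is None or any(i not in parts for i in range(total)):
        return None  # something was lost: the caller can retry or treat it as a miss
    return b"".join(parts[i] for i in range(total))


frames = [pack_frame(7, i, 3, chunk)
          for i, chunk in enumerate([b"VALUE ", b"key 0 ", b"5\r\n"])]
shuffled = [frames[2], frames[0], frames[1]]
print(reassemble(shuffled, 7))      # complete message, order restored
print(reassemble(shuffled[:2], 7))  # one datagram lost, so None
```

For a cache, a lost response can often just be treated as a cache miss, so the application does not need the full retransmission machinery that TCP carries.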

Related

Which protocol is responsible for managing retransmission? the ethernet protocol or tcp or both?

I wonder which protocol is responsible for managing retransmission: Ethernet, TCP, or both?
I am attending a TCP/IP course and it is not clear to me which protocol is responsible for managing retransmission.
To answer your question: TCP handles retransmission of data lost to corrupt frames. Ethernet only detects the corruption.
Systems communicating over Ethernet divide a stream of data into shorter pieces called frames. Each frame contains source and destination addresses, and error-checking data so that damaged frames can be detected and discarded; most often, higher-layer protocols trigger retransmission of lost frames. Per the OSI model, Ethernet provides services up to and including the data link layer.
So the “responsibility” is on TCP. Note that speaking of responsibilities is misleading. You can have UDP (OSI layer 4) over Ethernet (OSI layer 2), in which case the “responsibility” for handling missing data is on the application, which may choose NOT to resend (e.g. video streaming, where resending stale data doesn’t make sense).
On other data link protocols, the resending of frames can be built into the protocol itself.
In that sense, handling retransmissions is more of a property or feature of TCP than a “responsibility”.
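To make the "detect but don't resend" behaviour concrete, here is a minimal sketch. Ethernet's frame check sequence is a CRC-32, and Python's zlib.crc32 uses the same polynomial, but the exact bit ordering on the wire differs, so treat this as an illustration rather than a real Ethernet implementation. The helper names are invented for the example.

```python
import zlib


def make_frame(payload: bytes) -> bytes:
    """Append a 4-byte CRC-32 trailer, loosely mimicking Ethernet's FCS."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def receive_frame(frame: bytes):
    """Return the payload if the CRC matches, else None (frame silently dropped)."""
    payload, trailer = frame[:-4], frame[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != trailer:
        return None  # detection only: nothing here asks the sender to resend
    return payload


good = make_frame(b"hello")
bad = bytes([good[0] ^ 0x01]) + good[1:]  # flip one bit "in transit"
print(receive_frame(good))  # b'hello'
print(receive_frame(bad))   # None
```

The dropped frame simply never reaches the layer above, and it is up to TCP (or the application, with UDP) to notice the gap.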

Different Applications of TCP and UDP

In one of my classes, we went through TCP and UDP. Largely, I understand the fundamental difference.
TCP uses a three-way handshake, congestion control, flow control and other mechanisms to ensure reliable transmission.
UDP is mostly used in cases where packet delay is more serious than packet loss.
For the question outlined below, I believe TCP makes the most sense for the voice chat, since the order of the data that translates to a conversation is essential, and UDP for the network handlers that send player data, because speed is most important for a competitive online game that relies on reflexes.
Does this make sense? Or am I generalizing the problems too much?
Question:
TCP and UDP. The online game is a first person shooter game where real players fight each other with guns in 5 versus 5 matches. You are in charge of two features:
an implementation of real time voice chat,
the network handlers that send player data from the end user’s clients to your dedicated, central servers
Which protocols do you use for each and why?
With TCP the devices at the end points need to establish a connection through a "handshake" before any data is sent. TCP also uses flow control, sequence numbers, acknowledgements and timers, to ensure reliable data transfer. Congestion control is also used by TCP to adjust the transmission rate.
The implementation of the above mechanisms comes at a time cost.
UDP, on the other hand, does almost nothing apart from multiplexing/demultiplexing and simple error checking.
Real-time applications often need a minimum bitrate and can tolerate some data loss. In your example of real-time voice chat, it is more important for the users to hear each other without delay, even if a few milliseconds are inaudible. The network handlers that send player data to the server should use TCP, because the reliability of that data is vital.
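As a rough illustration of "tolerating some data loss" in voice chat: a receiver can number the audio frames and, instead of waiting for a retransmission, play silence for any frame that never arrived. The frame size and names below are arbitrary assumptions for the sketch, not taken from any real voice codec.

```python
SILENCE = b"\x00" * 4  # stand-in for one missing audio frame (size is arbitrary)


def playout(received, first_seq, last_seq):
    """Build the audio to play for sequence numbers first_seq..last_seq.

    `received` maps sequence number -> audio frame for packets that arrived
    in time. Missing frames are replaced with silence rather than waited for.
    """
    return [received.get(seq, SILENCE) for seq in range(first_seq, last_seq + 1)]


arrived = {0: b"aaaa", 1: b"bbbb", 3: b"dddd"}  # packet 2 was lost
print(playout(arrived, 0, 3))
```

With TCP, frame 3 would be held back until frame 2 had been retransmitted, which is exactly the delay a live conversation cannot afford.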

OSI Layers explained

I'm trying to understand Network layers. There are five layers, application, transport, network, link and physical layer. I have studied from various sources but I couldn't understand clearly, especially transport, network and link layers. Can you explain those layers for a dummy person like me? Thanks.
So I think the hardest part people have with the OSI model is that they look at some of the layers and forget it's talking about communication. With the application layer specifically, people get confused about what it refers to.
Application references the protocols applications use to communicate.
Examples being ftp, http, etc.
As stated above, the original OSI model (which is really just theory, nothing actually implements it exactly as it would be too inefficient) has 2 more layers in between application and transport (which are most likely wrapped into the application layer). These are Presentation and Session.
Presentation is responsible for things like HTTP's Accept-Encoding: gzip, deflate. Encryption and character encoding are said to belong to the Presentation layer, so SSL/TLS falls under this category (the s in https).
HTTP sessions would be a literal example of the Session layer (anything that keeps your application state alive across connections). If a protocol is connection oriented, it might not have a session layer, which is why HTTP is probably the only example I can think of at the moment.
As you can see, HTTP was the answer to all 3 of these top layers, which is why they have all been combined into the Application layer in newer versions of the model. Cisco still uses the 7 layer model, but Microsoft uses the 5 layer one you're using.
Transport is TCP. It carries the information needed to reorder packets, can adjust the amount of data sent per window, and lets the computers know if anything was lost so it can be retransmitted.
UDP is also an example of the transport layer; however, it's a much simpler protocol, and UDP does no retransmission of packets.
Network is the IP protocol (also IPX/SPX from the old NetWare days, plus ICMP (pings) and IGMP (multicast)). This allows for addressing computers that lie outside the local broadcast domain (things separated by routers).
Link or DataLink layer is Ethernet (or ATM, or FDDI), which addresses computers physically connected to a hub or directly to each other with a network cable.
This layer adds the MAC addressing in Ethernet, and the frame part of a packet is the header it uses.
Physical layer (in original OSI model) is just your cables and network equipment.
Pretty much the only people who still talk about OSI are network techs.
They may still say Layer 2 or Layer 3 switches, etc., which is a reference to OSI. Layer 2 is an Ethernet switch; a Layer 3 switch adds routing.
The best way to see how this is used is to load up Wireshark and snoop your own network traffic. It will actually show you the parts of a packet that are responsible for most of the layers.
Knowing the OSI model is not really overly useful, but it can help you organize the stages of network communication in your mind and help you troubleshoot.
Knowing the protocols and how they interact is extremely useful. Learning how TCP negotiates connections, IP addressing and subnet masking, HTTP and Ethernet can help you whether you are a developer, a server admin, a network admin or even a DBA. There is nothing like a bad network card to ruin your weekend, or even a whole month if no one thinks to check it, and the only way to tell is to use tcpdump and Wireshark to see errors in the Ethernet frames.
The OSI reference model
The OSI model is used to connect open systems, that is, systems that are open to communication with other systems. By using this model, we no longer depend on a particular operating system, so we can communicate with any operating system on any computer. This model contains seven layers, where each layer has a specific function and defines how data is handled at that layer. The seven layers that are contained in this model are the Physical layer, Data Link layer, Network layer, Transport layer, Session layer, Presentation layer, and the Application layer.
THE PHYSICAL LAYER
This is the first layer in the OSI model and contains a definition of the network's physical specification, including the physical media (cables and connectors) and basic devices (repeaters and hubs). The layer is responsible for transmitting the raw bit stream, the zeros and ones, over the communication channel, placing the data onto the physical media. It is concerned with data transmission integrity and makes sure that the bits that are sent from one device are exactly the same as the bits that are received by the other device.
THE DATA LINK LAYER
The main role of the Data Link layer is to provide a link for raw data transmission. Before the data is transmitted, it is broken up into data frames, and the Data Link layer transmits them consecutively. The receiver will send back an acknowledge frame for each frame that has been sent if the service is reliable.
This layer consists of two sublayers: Logical Link Control (LLC) and Media Access Control (MAC). The LLC sublayer is responsible for transmission error checking and deals with frame transmission, while the MAC sublayer defines how to retrieve data from the physical media or store data in the physical media.
We can also find the MAC address, also called the physical address, in this layer. The MAC address is used to identify every device that connects to the network because it is intended to be unique for each device.
The MAC address contains twelve hexadecimal digits, usually written as six pairs. The first six digits represent the organizationally unique identifier (OUI) and the remaining six represent a serial number assigned by the manufacturer. If you are curious to know what this number means, you can go to www.macvendorlookup.com and fill the text box with your MAC address to know more about it.
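The split between the two halves of a MAC address can be sketched in a few lines. The address used here is made up, and the function name is just for illustration.

```python
def split_mac(mac: str):
    """Split a MAC address into (OUI, device-specific part) as hex strings."""
    digits = mac.replace(":", "").replace("-", "").lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return digits[:6], digits[6:]


oui, serial = split_mac("00:1A:2B:3C:4D:5E")  # made-up address
print(oui, serial)  # 001a2b 3c4d5e
```

The first value is what a vendor lookup site would search for.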
THE NETWORK LAYER
The Network layer is responsible for defining the best way to route the packets from a source to the destination device. It uses the Internet Protocol (IP) as the routed protocol, with routing tables determining the path, and the IP address is used to make sure that the data finds its route to the required destination. There are two versions of IP nowadays: IPv4 and IPv6. IPv4 uses 32-bit addresses and IPv6 uses 128-bit addresses. You are going to learn more about Internet Protocol, IPv4, and IPv6 in the next topic.
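The address sizes are easy to check with Python's standard ipaddress module. The two addresses below come from the ranges reserved for documentation, so they are placeholders rather than real hosts.

```python
import ipaddress

v4 = ipaddress.ip_address("192.0.2.10")    # documentation-range IPv4 address
v6 = ipaddress.ip_address("2001:db8::10")  # documentation-range IPv6 address

for addr in (v4, v6):
    # version, address length in bits, address length in bytes
    print(addr.version, addr.max_prefixlen, len(addr.packed))
```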
THE TRANSPORT LAYER
The Transport layer is responsible for transferring data from a source to destination. It will split up the data into smaller parts, or in this case segments, and then will join all the segments to restore the data to its initial form in the destination.
There are two main protocols that work in this layer: the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP).
TCP supplies the delivery of data by establishing a session. The data will not be transmitted until a session is established. TCP is also known as the connection-oriented protocol, which means that the session has to be established before transmitting the data.
UDP is a method of delivering data with the best efforts, but does not give a guaranteed delivery because it does not establish a session. Therefore, UDP is also known as the connection-less protocol. In-depth explanation about TCP and UDP can be found in the next topic.
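The "split into segments, then join them again" idea described for the Transport layer can be sketched as follows. This is a toy model with invented function names; real TCP numbers bytes rather than whole segments and does far more.

```python
import random


def segment(data: bytes, size: int):
    """Split data into (sequence_number, chunk) pairs of at most `size` bytes."""
    return [(i // size, data[i:i + size]) for i in range(0, len(data), size)]


def reassemble_segments(segments):
    """Sort segments by sequence number and join them back together."""
    return b"".join(chunk for _, chunk in sorted(segments))


message = b"The quick brown fox jumps over the lazy dog"
segments = segment(message, 8)
random.shuffle(segments)  # simulate out-of-order arrival
print(reassemble_segments(segments) == message)  # True
```

The sequence numbers are what let the receiver restore the original order no matter how the segments arrive.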
THE SESSION LAYER
The Session layer is responsible for the establishment, maintenance, and termination of the session. We can analogize the session like a connection between two devices on the network. For example, if we want to send a file from a computer to another, this layer will establish the connection first before the file can be sent. This layer will then make sure that the connection is still up until the file is sent completely. Finally, this layer will terminate the connection if it is no longer needed. The connection we talk about is the session.
This layer also makes sure that the data from a different application is not interchanged. For example, if we run the Internet browser, chat application, and download manager at the same time, this layer will be responsible for establishing the session for every single application and ensure that they remain separated from other applications.
There are three communication methods that are used by this layer: the simplex, half-duplex, or full-duplex method.
In the simplex method, data can only be transferred by one party, so the other cannot transfer any data. This method is no longer common in use, since we need applications that can interact with each other.
In the half-duplex method, data can be transferred by all the involved devices, but only one device can transfer data at a time. After it completes the sending process, the others can send and transfer data.
The full-duplex method can transfer data to all the devices at the same time. To send and receive data, this method uses different paths.
THE PRESENTATION LAYER
The Presentation layer role is used to determine the data that has been sent, to translate the data into the appropriate format, and then to present it. For example, we send an MP3 file over the network and the file is split up into several segments. Then, using the header information on the segment, this layer will construct the file by translating the segments.
Moreover, this layer is responsible for data compression and decompression, because data transmitted over the Internet is often compressed to save bandwidth. This layer is also responsible for data encryption and decryption in order to secure communication between two devices.
THE APPLICATION LAYER
The Application layer deals with the computer application that is used by a user. Only the application that connects to a network will connect to this layer. This layer contains several protocols that are needed by a user, which are as follows:
The Domain Name System (DNS): This protocol is the one that finds the IP address of a hostname. With this system, we do not need to memorize every IP address any longer, just the hostname. We can easily remember a word in the hostname instead of a bunch of numbers in the IP address.
The Hypertext Transfer Protocol (HTTP): This protocol is the one that transmits web pages over the Internet. We also have HTTPS, which is used to send encrypted data for security.
The File Transfer Protocol (FTP): This protocol is the one that is used to transfer files from or to an FTP server.
The Trivial FTP (TFTP): This protocol is a simpler relative of FTP and is typically used to send smaller files.
The Dynamic Host Configuration Protocol (DHCP): This protocol is used to assign the TCP/IP configuration dynamically.
The Post Office Protocol (POP3): This protocol is an electronic mail protocol used to retrieve e-mails from POP3 servers. The server is usually hosted by an Internet Service Provider (ISP).
The Simple Mail Transfer Protocol (SMTP): This protocol is the counterpart of POP3 and is used to send electronic mail.
The Internet Message Access Protocol (IMAP): This protocol is used to receive e-mail messages. With this protocol, users can keep their e-mail messages in folders on the server and access them from a local computer.
The Simple Network Management Protocol (SNMP): This protocol is used to manage network devices (routers and switches) and detect problems to report them before they become significant.
The Server Message Block (SMB): This protocol is a file-sharing protocol that is used on Microsoft networks primarily for file and printer sharing.
This layer also decides whether enough network resources are available for network access. For instance, if you want to surf the Internet using an Internet browser, the Application layer decides whether access to the Internet is available using HTTP.
We can divide all the seven layers into two section layers: the Upper Layer and Lower Layer. The upper layer is responsible for interacting with the user and is less concerned about the low-level details, whereas the lower layer is responsible for transferring data over the network, such as formatting and encoding.
There are FIVE LAYERS in TCP/IP Model and SEVEN LAYERS in OSI Reference Model.
The primary difference between our five-layer model and the seven-layer OSI model is that the OSI model abstracts the application layer into three layers total.
The physical layer represents the physical devices that interconnect computers. This includes the specifications for the networking cables and the connectors that join devices together along with specifications describing how signals are sent over these connections. The physical layer is all about cabling, connectors and sending signals.
The second layer in our model is known as the data link layer, also called the network interface or the network access layer. The data link layer is responsible for defining a common way of interpreting the signals, so network devices can communicate. Lots of protocols exist at the data link layer, but the most common is known as Ethernet.
The third layer, the network layer is also sometimes called the Internet layer. It's this layer that allows different networks to communicate with each other through devices known as routers.
The most common protocol used at this layer is known as IP or Internet Protocol. IP is the heart of the Internet and most small networks around the world.
While the data link layer is responsible for getting data across a
single link, the network layer is responsible for getting data
delivered across a collection of networks.
While the network layer delivers data between two individual nodes, the transport layer sorts out which client and server programs are supposed to get that data. The most common transport layer protocols are TCP, or Transmission Control Protocol, and UDP, or User Datagram Protocol.
The big difference between the two is that TCP provides mechanisms to
ensure that data is reliably delivered while UDP does not.
The network layer, in our case IP, is responsible for getting data
from one node to another. The transport layer, mostly TCP and UDP, is
responsible for ensuring that data gets to the right applications
running on those nodes.
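The "right application on the node" part is done with port numbers. A minimal sketch of that lookup, with hypothetical handlers standing in for real server programs:

```python
def demultiplex(handlers, dst_port, payload):
    """Hand a payload to whichever application registered for dst_port."""
    handler = handlers.get(dst_port)
    if handler is None:
        return None  # nothing is listening on that port
    return handler(payload)


# Hypothetical applications running on one node, keyed by port number.
handlers = {
    80: lambda data: f"web server got {data!r}",
    25: lambda data: f"mail server got {data!r}",
}

print(demultiplex(handlers, 80, b"GET /"))
print(demultiplex(handlers, 9999, b"?"))  # None
```

Both TCP and UDP carry a destination port in their headers for exactly this purpose.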
The fifth layer is known as the application layer. There are lots of different protocols at this layer, and as you might have guessed from the name, they are application-specific. Protocols used to allow you to browse the web or send and receive emails are some common ones. Application layer protocols include HTTP, SMTP, etc.
You can think of layers like different aspects of a package being
delivered. The physical layer is the delivery truck and the
roads. The data link layer is how the delivery trucks get from
one intersection to the next over and over. The network layer
identifies which roads need to be taken to get from address A to
address B. The transport layer ensures that the delivery
driver knows how to knock on your door to tell you your package has
arrived. And the application layer is the contents of the
package itself.

How do the transport, network, and data link layers achieve reliability?

We know that transport layer protocols like TCP control the flow and take care of reliability through sliding windows, acknowledgments, etc. The data link layer, with its LLC sublayer, offers the same functionality for reliable connections. First question: does this mean that both layers perform the same functions twice? Or, when we use TCP in the transport layer, is there no need for the LLC reliability functions? How does it work?
Second question: since the IP layer is unreliable when it sends and receives packets, does this mean that routers, which are layer 3 devices with no TCP above them, depend on the LLC sublayer to take care of reliability (I mean between two routers)?
I don't think you need any other reliability. In terms of corruption, both IP and link-layer protocols (such as Ethernet) carry a checksum or CRC in order to detect corrupted data. In terms of packet loss, TCP is what's called a stream protocol: it delivers a stream of bytes, and the sequence and acknowledgment numbers serve that purpose by keeping track of the bytes sent and read by each side. If there is some unexpected "jump" in those numbers, the receiving side won't acknowledge the missing data, and the sender will resend the lost packets. Hence I don't think you need additional reliability against packet loss, since TCP covers that for all the layers below it. I'm not sure I fully understood your question, though. That's the case when we use TCP.
About using pure IP and physical layer protocols: I honestly have no idea how to prevent packet loss, yet corruption is detected, as mentioned earlier, by checksums.
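The sequence and acknowledgment bookkeeping can be sketched roughly like this. It is a simplified model with invented names: the acknowledgment only advances over contiguous bytes, and the sender treats everything from the acknowledged offset onward as unacknowledged. Real TCP adds timers, windows and selective acknowledgments on top.

```python
def cumulative_ack(expected, received_segments):
    """Return the next byte offset the receiver expects.

    `received_segments` is a list of (start_offset, payload) pairs. The
    acknowledgment only advances across contiguous data, so a gap keeps it
    stuck at the first missing byte.
    """
    buffered = {start: data for start, data in received_segments if data}
    while expected in buffered:
        expected += len(buffered[expected])
    return expected


def segments_to_resend(sent_segments, ack):
    """Everything at or beyond the acknowledged offset is still unacknowledged."""
    return [(start, data) for start, data in sent_segments if start >= ack]


sent = [(0, b"abcd"), (4, b"efgh"), (8, b"ijkl")]
arrived = [sent[0], sent[2]]          # the middle segment was lost
ack = cumulative_ack(0, arrived)
print(ack)                            # 4
print(segments_to_resend(sent, ack))  # the segments starting at offsets 4 and 8
```

Because this recovery happens end to end, a frame dropped on any link along the path is covered without each link needing its own retransmission scheme.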

If UDP is unreliable why is it used at transport layer

Sorry for what is a stupid question.
The function of the transport layer is reliable delivery of messages. UDP is inherently unreliable, so why do we use it at the transport layer?
Thanks
EDIT: Just to clarify, I have read the Wiki and other sources. My question is:
UDP is unreliable (I know why, the advantages, where it is used, etc.), so why not use it (UDP) at some other layer, rather than the transport layer, which implies reliability?
Sometimes it is more important that the data be sent quickly and without pauses than that the stream be reliable. DNS uses UDP because a typical transaction between a DNS server and client consists of only one packet each way. If the packet is lost, the client simply sends the query again.
Similarly, streaming video often uses UDP as a transport protocol because the occasional loss of a packet is acceptable. It is preferable that the image quality suffer as a result of lost packets, rather than the video stream suffer jitter or pauses (lag) as a result of TCP synchronization.
Games also often use UDP, sacrificing engine accuracy for improved speed/user experience.
These and more examples can be found in the relevant portions of the wikipedia article.
EDIT
UDP is used at the transport layer because it is a transport layer protocol. It "provides end-to-end communication services for applications" (RFC 1122).
Reliability services are optional for transport layer protocols.
... rather than Transport layer which implies reliability
There's more than one dimension within "reliability." It's interesting to note that UDP is reliable in the sense that it provides a checksum to detect corruption.
Stream protocols like TCP create problems for latency-sensitive applications. For latency-sensitive apps, UDP's natural limitation (to shed traffic during congestion) is a huge boon.
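For the curious, the checksum mentioned above is the 16-bit ones' complement "Internet checksum" described in RFC 1071. Below is a minimal sketch of the arithmetic only; the real UDP checksum is also computed over a pseudo-header with the IP addresses, which is left out here, and the function names are my own.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


def verify(data: bytes, checksum: int) -> bool:
    """A receiver recomputes the checksum and compares."""
    return internet_checksum(data) == checksum


payload = b"hello world!"
csum = internet_checksum(payload)
print(verify(payload, csum))          # True
print(verify(b"hellp world!", csum))  # False: corruption detected
```

A datagram that fails this check is discarded, so the application either gets intact data or nothing at all.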
why not use it(UDP) at some other layer
IP datagrams are designed to be small enough to make the next hop transit. UDP datagrams can span IP datagrams, so there's some value added there. But if TCP were a layer above UDP, it would be limited by UDP's semantics (TCP ports are bound to a connection, UDP datagrams are not).
The reason UDP is used at the transport layer is the way these layers are set up. UDP is inherently a protocol for transferring data from point A to point B; it is not an application protocol and does not work at the hardware layer.
At the transport layer there is no assumption of reliability, only that UDP is a protocol for transferring data. Under the 7 layer style of networking, it sits between the network and session layers. The name Transport layer simply says what it does. See Wikipedia for more information on the OSI model.
TL;DR: UDP is in the transport layer because it is a protocol for data transport. All protocols that deal with data transport fall under this category.
Transport layer classes
Class 0 - Simple class
Class 1 - Basic error recovery class
Class 2 - Multiplexing class
Class 3 - Error Recovery and multiplexing class
Class 4 - Error detection and recovery class

Resources