Reassembly of segments - tcp

I am working on an application that intercepts various kinds of traffic. Recently I have been receiving out-of-order segments. This traffic is over TCP. The SIP header is among multiple segments. I am trying to understand a protocol to be followed to reassemble packets that arrive out of order to be able to display them in my application. To clarify the data is segmented by TCP. By receiving out of order, I mean:
SIP INVITE header first half received later, second-half earlier.
TCP seq and ack are such that the segment received later is expected to be received first.
I would greatly appreciate any leads towards established protocols to implement this.

I suspect you may need to look deeper into your architecture as TCP is designed to deliver packets in order.
One thing in particular to check is whether you are using multiple TCP connections in some way, maybe to boost bandwidth - this could allow out of order delivery if different packets could take different TCP connections, but within a TCP connection delivery should still be in order.

Related

how transport , network, data link layers functions achieve reliability?

we know that transport layer protocols like tcp control the flow and take care of the reliability by slide window and acknowledges ...etc. the data link layer with LLC sub layer has the same functionality for reliable connections also. the first question : is this means that both layers do the same functions twice? or when we use tcp in transport layer there is no need for LLC reliability functions ?how is it working ?
the second question: since IP layer is unreliable when it sends and receives packets, is this means that routers witch are layer 3 devices with no tcp protocol above it depends on LLC sub layer to takes care about the reliability "I mean between two routers" ?
I dont think you need other reliability, in words of disruption, both IP and Physical protocls (such as Ethernet) have CRC and Checksum in order to prevent disruption, but in terms of packet loss, TCP is what's called a stream protocol - it transfers you a stream of bytes, Sequence / Acknowledgment number server that purpose and help you keep track of bytes sent/read by client, if some weird "jump" in those numbers would be made, an ACK message wont be sent by receiving side, and protocol's sender would re-send lost packets hence I don't think that you need some reliability in terms of packet loss as TCP covers that for all downwards layers.... not sure I fully understood your question though, thats when we use TCP
about using pure IP and physical layer protocols, I honestly have no idea how to prevent packet-loss yet disruption is prevented by as mentioned earlier - checksums

TCP or UDP for simple service

For a service which just returns a small number when queried such as 30, or 10, but would have to handle up to 5 or so requests at any instance, would TCP or UDP be a better protocol? I am leaning towards UDP, but I wanted some expert opinions. I am looking for relatively quick reply times as well. Could you tell me what the advantages of each would be for a service like this? Thanks.
TCP is a reliable connection-based protocol. So, you are guaranteed that data is sent/received - the packets are automatically re-sent if they are not verified to be received on the other end. However, there is the overhead of the three-way handshake for establishing the connection.
TCP is used for protocols like HTTP where there is a one-time exchange of information (the HTTP Request and Reply).
UDP is an unreliable connection-less protocol. So you can simply send / receive a packet but you have no (automatic, OS stack-provided) way of verifying that the other end got your message. If you care, you have to implement some kind of ACK yourself.
UDP is used often for more continuous, "streaming" type protocols. For example, many online multiplayer games use UDP to exchange game information to/from the host. They do this on a continual, periodic basis. So if a packet is lost, it's not really a big deal, because another update is just around the corner. It would be far worse for the gameplay if you had to wait for that (now stale) update to be re-transmitted.
DNS is also implemented over UDP.
Ultimately the choice is yours. I would probably default to TCP for most cases, and only use UDP in a scenario like I described.

I'm confused on terminology about wifi

I am trying to simulate a wifi video transmission and for that I created a connection using a socket between 2 devices, however I then started to doubt whether this is required or if I was supposed to create a UDP connection.
I think I'm just confused on terms here and I've Googled and I found out that Wifi can has TCP or UDP my question would then be would a Wifi Transmission over TCP be as reliable for a simulation as one with UDP?
I'd suggest you to read Difference between TCP and UDP?.
For streaming like video transmission you would generally want to use UDP. If a packet cannot reach the server in time, it'd better be discarded than pausing the whole transmission in order to wait for one tiny missing packet that just contains the other person blinking.
But obviously it's up to you and how you implement your software.
You may need to read up a bit on the TCP/IP protocol. TCP and UDP are just types of packets/datagrams. The main difference is that TCP packets include extra protocol information, whereas UDP are simpler packets with just a destination, the data itself, and a checksum.
The upshot is that the sender of a UDP packet has no way of knowing whether or not the packet was received at the other end. Often this doesn't matter - because it may be handled in other ways by higher layers in the software, or can be simply lost and ignored without any negative consequences. So UDP can be a more efficient use of the bandwidth, in some scenarios - because there is less protocol information being exchanged, and therefore more actual data. Plus TCP is more complicated because you have to handle the protocol stuff.
So when you create your system, you have a choice - either TCP or UDP packets, depending on what you are trying to achieve and how you want to go about it. But both packet types are really all part of the "tcp/ip" protocol stack, and have similarities.

Can UDP retransmit lost data?

I know the protocol doesn't support this but is it common for clients that require some level of reliability to build into their application a method for requesting retransmission of a packet if found to be corrupt?
It is common for clients to implement reliability on top of UDP if they need reliability (or sometimes just some reliability) but not any of the other things that TCP offers, for example strict in-order delivery, and if they want, at the same time, low latency (or multicast, where it works).
In general, you will only want to use reliable UDP if there are urgent reasons (very low latency and high speed needed, e.g. for a twitch game). In every "normal" case, simply using TCP will serve you equally well or better.
Also note that it is easy to implement your own stack on top of UDP that performs worse than TCP.
See enet for an example of a library that implements reliability (and some other features) on top of UDP (Raknet or HawkNL would be other examples).
You might want to look at the answers to this question: What do you use when you need reliable UDP?
Of course. You can build a reliable protocol (like TCP) on top of UDP.
Example
Imagine you are building a fileserver:
* read the file using blocks of 1024 bytes
* construct an UDP packet with payload: 4 bytes for the "position" in the file, 4 bytes for the "length" of the contents of the packet.
The receiver now receives UDP packets. If he gets following packets:
* 0-1024: DATA
* 1024-2048: DATA
* 3072-4096: DATA
it realises a packet got missing, and asks the sending application to resend the part between 2048 and 3072.
This is a very basic example to explain your application code needs to deal with the packet construction and payload interpretation. Don't use it, it does not take edge cases (last packet, checksums for corrupted packets, ...) into account.
Short answer: No.
Long answer: UDP doesn't care for packet loss. If an UDP packet arrives and has a bad checksum, it is simply dropped. Neither the sender of the packet is informed about this, nor the recipient is informed. It is only dropped, that's all that happens. Also UDP packets are not numbered, so if you send four packets and only three arrive at the recipient, the recipient cannot know that there used to be four and one is missing. Last but not least, packets are not acknowledged, so the sender will never know if a packet it was sending out ever made it to the recipient or not.
Contrary to this, TCP breaks data into segments, each segment is numbered, so the recipient can know if a segment is missing. Also all segments must be acknowledged to the sender. If the sender receives no acknowledgment for a segment after a certain period of time, it will assume the segment was lost and send it again.
Of course, you can add an own frame header on top of every data fragment you sent over UDP, that way your application code can number the sent fragments and you can implement an acknowledgement-resent strategy in your code but the question is: Will this really be better than what TCP is already offering for free? Even if it would be equally good, save yourself the time and just use TCP. A lot of people already thought they can do better than TCP and usually they realize in the end, actually they cannot. TCP has its weaknesses but in general it is pretty good at what it does.

When is it appropriate to use UDP instead of TCP? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about a specific programming problem, a software algorithm, or software tools primarily used by programmers. If you believe the question would be on-topic on another Stack Exchange site, you can leave a comment to explain where the question may be able to be answered.
Closed 6 years ago.
Improve this question
Since TCP guarantees packet delivery and thus can be considered "reliable", whereas UDP doesn't guarantee anything and packets can be lost. What would be the advantage of transmitting data using UDP in an application rather than over a TCP stream? In what kind of situations would UDP be the better choice, and why?
I'm assuming that UDP is faster since it doesn't have the overhead of creating and maintaining a stream, but wouldn't that be irrelevant if some data never reaches its destination?
This is one of my favorite questions. UDP is so misunderstood.
In situations where you really want to get a simple answer to another server quickly, UDP works best. In general, you want the answer to be in one response packet, and you are prepared to implement your own protocol for reliability or to resend. DNS is the perfect description of this use case. The costs of connection setups are way too high (yet, DNS
does support a TCP mode as well).
Another case is when you are delivering data that can be lost because newer data coming in will replace that previous data/state. Weather data, video streaming, a stock quotation service (not used for actual trading), or gaming data comes to mind.
Another case is when you are managing a tremendous amount of state and you want to avoid using TCP because the OS cannot handle that many sessions. This is a rare case today. In fact, there are now user-land TCP stacks that can be used so that the application writer may have finer grained control over the resources needed for that TCP state. Prior to 2003, UDP was really the only game in town.
One other case is for multicast traffic. UDP can be multicasted to multiple hosts whereas TCP cannot do this at all.
If a TCP packet is lost, it will be resent. That is not handy for applications that rely on data being handled in a specific order in real time.
Examples include video streaming and especially VoIP (e.g. Skype). In those instances, however, a dropped packet is not such a big deal: our senses aren't perfect, so we may not even notice. That is why these types of applications use UDP instead of TCP.
The "unreliability" of UDP is a formalism. Transmission isn't absolutely guaranteed. As a practical matter, they almost always get through. They just aren't acknowledged and retried after a timeout.
The overhead in negotiating for a TCP socket and handshaking the TCP packets is huge. Really huge. There is no appreciable UDP overhead.
Most importantly, you can easily supplement UDP with some reliable delivery hand-shaking that's less overhead than TCP. Read this: http://en.wikipedia.org/wiki/Reliable_User_Datagram_Protocol
UDP is useful for broadcasting information in a publish-subscribe kind of application. IIRC, TIBCO makes heavy use of UDP for notification of state change.
Any other kind of one-way "significant event" or "logging" activity can be handled nicely with UDP packets. You want to send notification without constructing an entire socket. You don't expect any response from the various listeners.
System "heartbeat" or "I'm alive" messages are a good choice, also. Missing one isn't a crisis. Missing half a dozen (in a row) is.
I work on a product that supports both UDP (IP) and TCP/IP communication between client and server. It started out with IPX over 15 years ago with IP support added 13 years ago. We added TCP/IP support 3 or 4 years ago. Wild guess coming up: The UDP to TCP code ratio is probably about 80/20. The product is a database server, so reliability is critical. We have to handle all of the issues imposed by UDP (packet loss, packet doubling, packet order, etc.) already mentioned in other answers. There are rarely any problems, but they do sometimes occur and so must be handled. The benefit to supporting UDP is that we are able to customize it a bit to our own usage and tweak a bit more performance out of it.
Every network is going to be different, but the UDP communication protocol is generally a little bit faster for us. The skeptical reader will rightly question whether we implemented everything correctly. Plus, what can you expect from a guy with a 2 digit rep? Nonetheless, I just now ran a test out of curiosity. The test read 1 million records (select * from sometable). I set the number of records to return with each individual client request to be 1, 10, and then 100 (three test runs with each protocol). The server was only two hops away over a 100Mbit LAN. The numbers seemed to agree with what others have found in the past (UDP is about 5% faster in most situations). The total times in milliseconds were as follows for this particular test:
1 record
IP: 390,760 ms
TCP: 416,903 ms
10 records
IP: 91,707 ms
TCP: 95,662 ms
100 records
IP: 29,664 ms
TCP: 30,968 ms
The total data amount transmitted was about the same for both IP and TCP. We have extra overhead with the UDP communications because we have some of the same stuff that you get for "free" with TCP/IP (checksums, sequence numbers, etc.). For example, Wireshark showed that a request for the next set of records was 80 bytes with UDP and 84 bytes with TCP.
There are already many good answers here, but I would like to add one very important factor as well as a summary. UDP can achieve a much higher throughput with the correct tuning because it does not employ congestion control. Congestion control in TCP is very very important. It controls the rate and throughput of the connection in order to minimize network congestion by trying to estimate the current capacity of the connection. Even when packets are sent over very reliable links, such as in the core network, routers have limited size buffers. These buffers fill up to their capacity and packets are then dropped, and TCP notices this drop through the lack of a received acknowledgement, thereby throttling the speed of the connection to the estimation of the capacity. TCP also employs something called slow start, but the throughput (actually the congestion window) is slowly increased until packets are dropped, and is then lowered and slowly increased again until packets are dropped etc. This causes the TCP throughput to fluctuate. You can see this clearly when you download a large file.
Because UDP is not using congestion control it can be both faster and experience less delay because it will not seek to maximize the buffers up to the dropping point, i.e. UDP packets are spending less time in buffers and get there faster with less delay. Because UDP does not employ congestion control, but TCP does, it can take away capacity from TCP that yields to UDP flows.
UDP is still vulnerable to congestion and packet drops though, so your application has to be prepared to handle these complications somehow, likely using retransmission or error correcting codes.
The result is that UDP can:
Achieve higher throughput than TCP as long as the network drop rate is within limits that the application can handle.
Deliver packets faster than TCP with less delay.
Setup connections faster as there are no initial handshake to setup the connection
Transmit multicast packets, whereas TCP have to use multiple connections.
Transmit fixed size packets, whereas TCP transmit data in segments. If you transfer a UDP packet of 300 Bytes, you will receive 300 Bytes at the other end. With TCP, you may feed the sending socket 300 Bytes, but the receiver only reads 100 Bytes, and you have to figure out somehow that there are 200 more Bytes on the way. This is important if your application transmit fixed size messages, rather than a stream of bytes.
In summary, UDP can be used for every type of application that TCP can, as long as you also implement a proper retransmission mechanism. UDP can be very fast, has less delay, is not affected by congestion on a connection basis, transmits fixed sized datagrams, and can be used for multicasting.
UDP is a connection-less protocol and is used in protocols like SNMP and DNS in which data packets arriving out of order is acceptable and immediate transmission of the data packet matters.
It is used in SNMP since network management must often be done when the network is in stress i.e. when reliable, congestion-controlled data transfer is difficult to achieve.
It is used in DNS since it does not involve connection establishment, thereby avoiding connection establishment delays.
cheers
UDP does have less overhead and is good for doing things like streaming real time data like audio or video, or in any case where it is ok if data is lost.
One of the best answer I know of for this question comes from user zAy0LfpBZLC8mAC at Hacker News. This answer is so good I'm just going to quote it as-is.
TCP has head-of-queue blocking, as it guarantees complete and in-order
delivery, so when a packet gets lost in transit, it has to wait for a
retransmit of the missing packet, whereas UDP delivers packets to the
application as they arrive, including duplicates and without any
guarantee that a packet arrives at all or which order they arrive (it
really is essentially IP with port numbers and an (optional) payload
checksum added), but that is fine for telephony, for example, where it
usually simply doesn't matter when a few milliseconds of audio are
missing, but delay is very annoying, so you don't bother with
retransmits, you just drop any duplicates, sort reordered packets into
the right order for a few hundred milliseconds of jitter buffer, and
if packets don't show up in time or at all, they are simply skipped,
possible interpolated where supported by the codec.
Also, a major part of TCP is flow control, to make sure you get as
much througput as possible, but without overloading the network (which
is kinda redundant, as an overloaded network will drop your packets,
which means you'd have to do retransmits, which hurts throughput), UDP
doesn't have any of that - which makes sense for applications like
telephony, as telephony with a given codec needs a certain amount of
bandwidth, you can not "slow it down", and additional bandwidth also
doesn't make the call go faster.
In addition to realtime/low latency applications, UDP makes sense for
really small transactions, such as DNS lookups, simply because it
doesn't have the TCP connection establishment and teardown overhead,
both in terms of latency and in terms of bandwidth use. If your
request is smaller than a typical MTU and the repsonse probably is,
too, you can be done in one roundtrip, with no need to keep any state
at the server, and flow control als ordering and all that probably
isn't particularly useful for such uses either.
And then, you can use UDP to build your own TCP replacements, of
course, but it's probably not a good idea without some deep
understanding of network dynamics, modern TCP algorithms are pretty
sophisticated.
Also, I guess it should be mentioned that there is more than UDP and
TCP, such as SCTP and DCCP. The only problem currently is that the
(IPv4) internet is full of NAT gateways which make it impossible to
use protocols other than UDP and TCP in end-user applications.
Video streaming is a perfect example of using UDP.
UDP has lower overhead, as stated already is good for streaming things like video and audio where it is better to just lose a packet then try to resend and catch up.
There are no guarantees on TCP delivery, you are simply supposed to be told if the socket disconnected or basically if the data is not going to arrive. Otherwise it gets there when it gets there.
A big thing that people forget is that udp is packet based, and tcp is bytestream based, there is no guarantee that the "tcp packet" you sent is the packet that shows up on the other end, it can be dissected into as many packets as the routers and stacks desire. So your software has the additional overhead of parsing bytes back into usable chunks of data, that can take a fair amount of overhead. UDP can be out of order so you have to number your packets or use some other mechanism to re-order them if you care to do so. But if you get that udp packet it arrives with all the same bytes in the same order as it left, no changes. So the term udp packet makes sense but tcp packet doesnt necessarily. TCP has its own re-try and ordering mechanism that is hidden from your application, you can re-invent that with UDP to tailor it to your needs.
UDP is far easier to write code for on both ends, basically because you do not have to make and maintain the point to point connections. My question is typically where are the situations where you would want the TCP overhead? And if you take shortcuts like assuming a tcp "packet" received is the complete packet that was sent, are you better off? (you are likely to throw away two packets if you bother to check the length/content)
Network communication for video games is almost always done over UDP.
Speed is of utmost importance and it doesn't really matter if updates are missed since each update contains the complete current state of what the player can see.
The key question was related to "what kind of situations would UDP be the better choice [over tcp]"
There are many great answers above but what is lacking is any formal, objective assessment of the impact of transport uncertainty upon TCP performance.
With the massive growth of mobile applications, and the "occasionally connected" or "occasionally disconnected" paradigms that go with them, there are certainly situations where the overhead of TCP's attempts to maintain a connection when connections are hard to come by leads to a strong case for UDP and its "message oriented" nature.
Now I don't have the math/research/numbers on this, but I have produced apps that have worked more reliably using and ACK/NAK and message numbering over UDP than could be achieved with TCP when connectivity was generally poor and poor old TCP just spent it's time and my client's money just trying to connect. You get this in regional and rural areas of many western countries....
In some cases, which others have highlighted, guaranteed arrival of packets isn't important, and hence using UDP is fine. There are other cases where UDP is preferable to TCP.
One unique case where you would want to use UDP instead of TCP is where you are tunneling TCP over another protocol (e.g. tunnels, virtual networks, etc.). If you tunnel TCP over TCP, the congestion controls of each will interfere with each other. Hence one generally prefers to tunnel TCP over UDP (or some other stateless protocol). See TechRepublic article: Understanding TCP Over TCP: Effects of TCP Tunneling on End-to-End Throughput and Latency.
UDP can be used when an app cares more about "real-time" data instead of exact data replication. For example, VOIP can use UDP and the app will worry about re-ordering packets, but in the end VOIP doesn't need every single packet, but more importantly needs a continuous flow of many of them. Maybe you here a "glitch" in the voice quality, but the main purpose is that you get the message and not that it is recreated perfectly on the other side. UDP is also used in situations where the expense of creating a connection and syncing with TCP outweighs the payload. DNS queries are a perfect example. One packet out, one packet back, per query. If using TCP this would be much more intensive. If you dont' get the DNS response back, you just retry.
UDP when speed is necessary and the accuracy if the packets is not, and TCP when you need accuracy.
UDP is often harder in that you must write your program in such a way that it is not dependent on the accuracy of the packets.
It's not always clear cut. However, if you need guaranteed delivery of packets with no loss and in the right sequence then TCP is probably what you want.
On the other hand UDP is appropriate for transmitting short packets of information where the sequence of the information is less important or where the data can fit into a single
packet.
It's also appropriate when you want to broadcast the same information to many users.
Other times, it's appropriate when you are sending sequenced data but if some of it goes
missing you're not too concerned (e.g. a VOIP application).
Some protocols are more complex because what's needed are some (but not all) of the features of TCP, but more than what UDP provides. That's where the application layer has to
implement the additional functionality. In those cases, UDP is also appropriate (e.g. Internet radio, order is important but not every packet needs to get through).
Examples of where it is/could be used
1) A time server broadcasting the correct time to a bunch of machines on a LAN.
2) VOIP protocols
3) DNS lookups
4) Requesting LAN services e.g. where are you?
5) Internet radio
6) and many others...
On unix you can type grep udp /etc/services to get a list of UDP protocols implemented
today... there are hundreds.
Look at section 22.4 of Steven's Unix Network Programming, "When to Use UDP Instead of TCP".
Also, see this other SO answer about the misconception that UDP is always faster than TCP.
What Steven's says can be summed up as follows:
Use UDP for broadcast and multicast since that is your only option ( use multicast for any new apps )
You can use UDP for simple request / reply apps, but you'll need to build in your own acks, timeouts and retransmissions
Don't use UDP for bulk data transfer.
We know that the UDP is a connection-less protocol, so it is
suitable for process that require simple request-response communication.
suitable for process which has internal flow ,error control
suitable for broad casting and multicasting
Specific examples:
used in SNMP
used for some route updating protocols such as RIP
Comparing TCP with UDP, connection-less protocols like UDP assure speed, but not reliability of packet transmission.
For example in video games typically don't need a reliable network but the speed is the most important and using UDP for games has the advantage of reducing network delay.
You want to use UDP over TCP in the cases where losing some of the data along the way will not completely ruin the data being transmitted. A lot of its uses are in real-time applications, such as gaming (i.e., FPS, where you don't always have to know where every player is at any given time, and if you lose a few packets along the way, new data will correctly tell you where the players are anyway), and real-time video streaming (one corrupt frame isn't going to ruin the viewing experience).
We have web service that has thousands of winforms client in as many PCs. The PCs have no connection with DB backend, all access is via the web service. So we decided to develop a central logging server that listens on a UDP port and all the clients sends an xml error log packet (using log4net UDP appender) that gets dumped to a DB table upon received. Since we don't really care if a few error logs are missed and with thousands of client it is fast with a dedicated logging service not loading the main web service.
I'm a bit reluctant to suggest UDP when TCP could possibly work. The problem is that if TCP isn't working for some reason, because the connection is too laggy or congested, changing the application to use UDP is unlikely to help. A bad connection is bad for UDP too. TCP already does a very good job of minimizing congestion.
The only case I can think of where UDP is required is for broadcast protocols. In cases where an application involves two, known hosts, UDP will likely only offer marginal performance benefits for substantially increased costs of code complexity.
Only use UDP if you really know what you are doing. UDP is in extremely rare cases today, but the number of (even very experienced) experts who would try to stick it everywhere seems to be out of proportion. Perhaps they enjoy implementing error-handling and connection maintenance code themselves.
TCP should be expected to be much faster with modern network interface cards due to what's known as checksum imprint. Surprisingly, at fast connection speeds (such as 1Gbps) computing a checksum would be a big load for a CPU so it is offloaded to NIC hardware that recognizes TCP packets for imprint, and it won't offer you the same service.
UDP is perfect for VoIP addressed where data packet has to be sent regard less its reliability...
Video chatting is an example of UDP (you can check it by wireshark network capture during any video chatting)..
Also TCP doesn't work with DNS and SNMP protocols.
UDP does not have any overhead while TCP have lots of Overhead

Resources