How to bypass the TCP maximum file size? [closed] - networking

According to TCP, the sequence number refers to bytes rather than acting as a segment counter. The sequence number is a 32-bit integer (~4.2 GB of addressable bytes).
If I am sending a file directly over TCP, I can't exceed this number.
This was fine with old file systems, but we now have files exceeding this size.
I believe application-layer protocols have been modified to bypass this limit. Can anyone provide an example of this, or at least list the techniques used?
For reference, the question was based on the following problem
Textbook: Computer Networking: A Top-Down Approach by James F. Kurose and Keith W. Ross.
P26. Consider transferring an enormous file of L bytes from Host A to Host B.
Assume an MSS of 536 bytes.
a. What is the maximum value of L such that TCP sequence numbers are not
exhausted? Recall that the TCP sequence number field has 4 bytes.

If I am sending file directly using TCP, I can't exceed this number.
Yes you can. You are mistaken. It wraps around.
P26. Consider transferring an enormous file of L bytes from Host A to Host B. Assume an MSS of 536 bytes. a. What is the maximum value of L such that TCP sequence numbers are not exhausted? Recall that the TCP sequence number field has 4 bytes.
'Sequence numbers are not exhausted' is a constraint for the purposes of this question, but the authors aren't necessarily thereby claiming that such a limit applies to any TCP transmission. If they are, they're manifestly wrong. Consider that the initial sequence number is chosen randomly, and therefore can be 2^32-1. Does that imply a limit on that connection of one byte? Of course it doesn't.
I also note that the MSS of 536 bytes is entirely irrelevant to the question. Possibly this is just a substandard text.
EDIT: I've now located this source. You didn't misunderstand it. There is nothing in the book about TCP sequence number exhaustion except for this stupid question, and nothing about the sequence number wrapping around either, which is a curious omission. The MSS is used in the second part of the book problem, not quoted here.
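For what it's worth, part (a) of the quoted problem is just asking how many distinct bytes a 4-byte sequence number can label without reuse: 2^32 bytes, roughly 4.29 GB. The wrap-around described above is plain modulo-2^32 arithmetic; here is a minimal C sketch of it (illustrative only, not code from any actual TCP stack):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint32_t seq = 4294967290u;   /* a value close to 2^32 - 1 */
    uint32_t segment_len = 10;

    /* unsigned 32-bit addition wraps modulo 2^32 automatically */
    uint32_t next_seq = seq + segment_len;

    printf("next sequence number: %u\n", (unsigned)next_seq);  /* prints 4 */
    return 0;
}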

Related

What is overhead, payload, and header? [closed]

Can someone please explain to me what overhead, payload, header and packet are? As far as I know, a packet is the whole unit of data that is to be transmitted. This packet consists of the actual data, which I think is the payload, and the source/destination information of the packet is in the header. So a packet consists of a header and a payload. So what is this overhead? Is overhead a part of the header? I got this from the internet: "Packet overhead includes all the extra bytes of information that are stored in the packet header."
The header already contains source/destination info. What are the extra bytes of information that this packet overhead refers to? I'm confused.
The packet, like you said, has the "payload", which is the data it needs to transfer (usually the user's data). The "header" contains various things depending on the protocol you are using: UDP, for example, keeps its header very simple, essentially just the source and destination ports plus a length and a checksum, while TCP, on the other hand, carries more in its header, such as the sequence number to ensure ordered delivery, a set of flags to ensure the segment actually reaches its destination, and a checksum of the data to make sure it didn't get corrupted along the way.
Now, the "overhead" is the additional data that you need in order to send your payload. In the cases above it is the header, because you need to add it to every payload that you want to send over the network. TCP has a bigger overhead than UDP because it adds more data to your payload, but in return you are guaranteed that your data will arrive at its destination, in the order you sent it and not corrupted. UDP does not have these features, so it cannot make those guarantees.
Sometimes you will read or hear discussions about which protocol to use depending on the data you want to send. For example, say you have a game and you want to update the player's position every time they move; the payload itself will contain this:
int playerID;
float posX;
float posY;
The payload's size is 12 bytes. Let's say we send it using TCP; now the whole packet will look like this:
-------------
TCP_HEADER
-------------
int playerID;
float posX;
float posY;
Now the whole packet's size is payload + TCP_HEADER, which is 12 bytes + (20 to 60 bytes) = 32 to 72 bytes in total, so you now have 20 to 60 bytes of overhead for your 12 bytes of data. You can read about TCP's header here. Notice that the overhead can be bigger than the data itself!
Now you need to decide whether this is the protocol you want to use for your game. If you don't need the features TCP offers, you are better off using UDP, because it has a smaller overhead and therefore less data to send.
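To put numbers on that trade-off, here is a small C sketch that computes the on-wire size and overhead fraction for the 12-byte position update, assuming a minimal 20-byte TCP header versus the fixed 8-byte UDP header (IP and link-layer headers are ignored for simplicity):

#include <stdio.h>

int main(void) {
    const int payload = 12;      /* int playerID + float posX + float posY */
    const int tcp_header = 20;   /* minimum TCP header, no options */
    const int udp_header = 8;    /* UDP header is always 8 bytes */

    printf("TCP: %d bytes on the wire, %.1f%% overhead\n",
           payload + tcp_header, 100.0 * tcp_header / (payload + tcp_header));
    printf("UDP: %d bytes on the wire, %.1f%% overhead\n",
           payload + udp_header, 100.0 * udp_header / (payload + udp_header));
    return 0;
}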
You are correct that a packet generally consists of a header followed by the payload. The overhead of a packet type is the extra bandwidth required to transmit the payload. The packet header is extra information put in front of the payload to ensure it gets to its destination.
The overhead is variable because you can choose a different type of packet (or packet protocol) to transmit the data. Different packet protocols give you different features. The two key packet protocols in use today are TCP and UDP.
One can say UDP has a lower overhead than TCP because its packets have a smaller header and therefore take less bandwidth to send the same payload (the data).
The reasons for this are a deep subject, but suffice it to say that TCP provides many very useful features that UDP does not, such as guaranteed delivery of packets and corruption detection. Both are very useful protocols and are chosen based on what features an application needs (speed or reliability).

Speed vs Bandwidth, ISPs, misconception? [closed]

A lot of ISPs sell their products saying: 100 Mbit/s speed.
However, compare the internet to a parcel service, UPS for example.
The number of packages you can send every second (bandwidth) is something different from the time it takes for one to arrive (speed).
I know there are multiple meanings of the term 'bandwidth', so is it wrong to advertise with 'speed'?
Wikipedia( http://en.wikipedia.org/wiki/Bandwidth_(computing) )
In computer networking and computer science, bandwidth,[1] network bandwidth,[2] data bandwidth,[3] or digital bandwidth[4][5] is a measurement of bit-rate of available or consumed data communication resources expressed in bits per second or multiples of it (bit/s, kbit/s, Mbit/s, Gbit/s, etc.).
This part tells me that bandwidth is measured in Mbit/s, Gbit/s, and so on.
So does this mean the majority of ISPs are advertising incorrectly, when they should advertise 'bandwidth' instead of 'speed'?
Short answer: Yes.
Long answer: There are several aspects of data transfer that can be measured on an amount-per-time basis; the amount of data per second is one of them, but it can be misleading if not properly explained.
From the network performance point of view, these are the important factors (quoting Wikipedia here):
Bandwidth - maximum rate that information can be transferred
Throughput - the actual rate that information is transferred
Latency - the delay between the sender sending the information and the receiver decoding it
Jitter - variation in the time of arrival at the receiver of the information
Error rate - corrupted data expressed as a percentage or fraction of the total sent
So you may have a 10 Mbit/s connection, but if 50% of the sent packets are corrupted, your final throughput is actually just 5 Mbit/s (even less, if you consider that a substantial part of the data may be control structures instead of payload).
Latency may be affected by mechanisms such as Nagle's algorithm and ISP-side buffering:
In the spirit of RFC 1149 (IP over Avian Carriers), an ISP could sell you an IPoAC package at around 1 Gbit/s and still be true to its word by sending you 16 pigeons with 32 GB SD cards attached: 16 × 32 GB ≈ 4.1 Tbit delivered over an average air time of about 1 hour, i.e. roughly 1 Gbit/s of bandwidth with ~3,600,000 ms of latency.
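The distinction can be reduced to a few lines of arithmetic. Here is a rough C sketch using the figures from the two examples above (illustrative numbers only, not measurements):

#include <stdio.h>

int main(void) {
    double bandwidth_mbit = 10.0;   /* advertised link rate, Mbit/s */
    double error_rate = 0.5;        /* fraction of packets corrupted/lost */
    double latency_ms = 3600000.0;  /* the pigeon example: about one hour */

    /* throughput shrinks with the error rate; latency is unaffected by it */
    double throughput_mbit = bandwidth_mbit * (1.0 - error_rate);

    printf("effective throughput: %.1f Mbit/s\n", throughput_mbit);
    printf("latency stays at %.0f ms no matter how wide the pipe is\n", latency_ms);
    return 0;
}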

Why does RIP (Routing Information Protocol) use a maximum hop count of 15? [closed]

I'm reading about RIP, one of the distance-vector protocols, and learned that the maximum hop count it uses is 15. My question is: why is 15 used as the maximum hop count, and not some other number like 8, 10, or 12?
My guess is that 15 is 16 - 1, that is 2^4 - 1, or put another way: the biggest unsigned value that fits in 4 bits of information.
However, the metric field is 4 bytes long, and the value 16 denotes infinity.
I can only guess, but I would say that it allows fast checks with a simple bit-mask operation to determine whether the metric is infinity or not.
Now the real question might be: "Why is the metric field 4 bytes long when, apparently, only five bits are used?" and for that, I have no answer.
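To illustrate that guess (a sketch only; the bit-mask trick is speculation, not something the RIP specification requires): RFC 2453 defines a metric of 16 as "unreachable", so if metrics are always kept in the range 1..16, testing bit 4 is equivalent to comparing against 16.

#include <stdint.h>
#include <stdio.h>

#define RIP_INFINITY 16u  /* RFC 2453: a metric of 16 means unreachable */

/* Assuming the metric never exceeds 16, bit 4 is set only for infinity. */
static int is_unreachable(uint32_t metric) {
    return (metric & 0x10u) != 0;
}

int main(void) {
    for (uint32_t m = 1; m <= RIP_INFINITY; m++)
        printf("metric %2u -> %s\n", (unsigned)m,
               is_unreachable(m) ? "infinity" : "reachable");
    return 0;
}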
Protocols often make arbitrary decisions. RIP is a very basic (and rather old) protocol; you should keep that in mind when reading about it. As said above, the metric is carried in a 4-byte field, where 16 is equivalent to infinity. 10 is not related to a power of 2, and 8 was probably deemed too small to reach all the routers.
The rationale behind keeping the maximum hop count low is the count-to-infinity problem: higher maximum hop counts lead to longer convergence times (I'll leave you to Wikipedia for the count-to-infinity problem). Certain versions of RIP use split horizon, which addresses this issue.

Is there a good way to frame a protocol so data corruption can be detected in every case?

Background: I've spent a while working with a variety of device interfaces and have seen a lot of protocols, many serial or UDP-based, in which data integrity is handled at the application protocol level. I've been seeking to improve my receive-routine handling of protocols in general, and considering the "ideal" design of a protocol.
My question is: is there any protocol framing scheme out there that can definitively identify corrupt data in all cases? For example, consider the standard framing scheme of many protocols:
Field: Length in bytes
<SOH>: 1
<other framing information>: arbitrary, but fixed for a given protocol
<length>: 1 or 2
<data payload etc.>: based on length field (above)
<checksum/CRC>: 1 or 2
<ETX>: 1
For the vast majority of cases, this works fine. When you receive some data, you search for the SOH (or whatever your start byte sequence is), move forward a fixed number of bytes to your length field, and then move that number of bytes (plus or minus some fixed offset) to the end of the packet to your CRC, and if that checks out you know you have a valid packet. If you don't have enough bytes in your input buffer to find an SOH or to have a CRC based on the length field, then you wait until you receive enough to check the CRC. Disregarding CRC collisions (not much we can do about that), this guarantees that your packet is well formed and uncorrupted.
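In rough C, the receive logic just described might look like the sketch below (simplified: the "other framing information" is omitted, the length field is assumed to be one byte, and crc16() is an assumed helper, not a specific library function):

#include <stdint.h>
#include <stddef.h>

#define SOH 0x01
#define ETX 0x03

uint16_t crc16(const uint8_t *data, size_t len);  /* assumed to exist elsewhere */

/* Returns bytes consumed up to the end of a valid frame,
 * 0 if more data is needed, or -1 if the frame is invalid. */
int try_parse(const uint8_t *buf, size_t n) {
    size_t i = 0;
    while (i < n && buf[i] != SOH) i++;     /* hunt for the start byte */
    if (n - i < 2) return 0;                /* need at least SOH + length */

    uint8_t len = buf[i + 1];               /* 1-byte length field */
    size_t frame = 1 + 1 + len + 2 + 1;     /* SOH, len, payload, CRC16, ETX */
    if (n - i < frame) return 0;            /* wait for the rest to arrive */

    uint16_t crc = (uint16_t)((buf[i + 2 + len] << 8) | buf[i + 3 + len]);
    if (crc != crc16(buf + i + 2, len)) return -1;
    if (buf[i + frame - 1] != ETX) return -1;
    return (int)(i + frame);
}

The "corrupt length" problem described next lives in the "if (n - i < frame) return 0;" line: a bogus length value keeps the parser waiting for bytes that may never come.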
However, if the length field itself is corrupt and has a high value (which I'm running into), then you can't check the (corrupt) packet's CRC until you fill up your input buffer with enough bytes to meet the corrupt length field's requirement.
So is there a deterministic way to get around this, either in the receive handler or in the protocol design itself? I can set a maximum packet length or a timeout to flush my receive buffer in the receive handler, which should solve the problem on a practical level, but I'm still wondering if there's a "pure" theoretical solution that works for the general case and doesn't require setting implementation-specific maximum lengths or timeouts.
Thanks!
The reason why all protocols I know of, including those handling "streaming" data, chop up the data stream into smaller transmission units, each with its own checks on board, is exactly to avoid the problems you describe. Probably the fundamental flaw in your protocol design is that the blocks are too big.
The accepted answer of this SO question contains a good explanation and a link to a very interesting (but rather heavy on math) paper about this subject.
So in short, you should stick to smaller transmission units, not only because of practical programming-related arguments but also because of the message length's role in determining the protection offered by your CRC.
One way would be to encode the length field so that corruption of it can be detected easily, saving you from reading in a large buffer just to check the CRC.
For example, the XModem protocol embeds an 8-bit packet number followed by its one's complement.
It would mean doubling the size of your length field, but it's an option.
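A sketch of that idea in C (the function names are made up for illustration): transmit the length byte together with its one's complement, and reject the frame immediately if the two disagree, before waiting on a possibly bogus number of payload bytes.

#include <stdint.h>

/* Sender side: write the length followed by its one's complement. */
void write_length(uint8_t *out, uint8_t len) {
    out[0] = len;
    out[1] = (uint8_t)~len;
}

/* Receiver side: a corrupted length field is caught up front,
 * without filling a buffer to the corrupted length first. */
int length_is_valid(const uint8_t *in) {
    return in[1] == (uint8_t)~in[0];
}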

Reassembling TCP Segments [closed]

While observing network traffic in Wireshark, I see that Wireshark reassembles packets like:
[Reassembled TCP Segments (4233 bytes): #1279(2133), #1278(2100)]
Packet #1278: blahblah, Seq: 1538, Ack:3074, Len: 2133
Packet #1279: blahblah, Seq: 2998, Ack:3074, Len: 2100
(lengths are fictional values)
I'm looking to reassemble TCP packets that I receive through SharpPcap.
Does Wireshark use the Ack to know which segments belong to each other?
What does the Seq value refer to?
If not, how does it reassemble them?
SEQ values are counted in bytes, so if you receive a 100 byte segment with SEQ == 5, you know the next segment in the sequence will have a SEQ == 105.
The ACK indicates the next SEQ value that the sender expects to see from its peer. So the only reason you're seeing the same ACK value in multiple packets is that only one side is transmitting. By keeping the ACK the same with each transmission, the host is basically saying it hasn't received anything new.
The sequence number identifies the first byte in the segment. As part of connection establishment each peer picks a random sequence number for the first byte that it will send. Thereafter, the next sequence number is the previous sequence number plus the number of bytes in the previous segment.
I don't understand your question about whether Wireshark uses Ack to reassemble segments.
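For the practical side of the question, the reassembly itself is just placing each segment's payload at the offset given by its sequence number. A rough sketch in C (it assumes you have already extracted the SEQ values and payload bytes, e.g. via SharpPcap, and it ignores wrap-around and retransmissions for brevity):

#include <stdint.h>
#include <string.h>
#include <stddef.h>

struct segment {
    uint32_t seq;         /* sequence number of the first payload byte */
    uint32_t len;         /* number of payload bytes */
    const uint8_t *data;  /* the payload itself */
};

/* Copy each segment into the stream buffer at offset (seq - initial_seq).
 * Arrival order does not matter: the sequence number fixes the position.
 * The caller must ensure 'stream' is large enough for all segments. */
void reassemble(uint8_t *stream, uint32_t initial_seq,
                const struct segment *segs, size_t count) {
    for (size_t i = 0; i < count; i++)
        memcpy(stream + (segs[i].seq - initial_seq), segs[i].data, segs[i].len);
}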
I might be wrong, but it is not up to TCP to reassemble the PDU. TCP's job is to make sure the TCP segments arrive in order (seq, ack); it does not care about the upper-layer protocols.
For example, with a long HTTP response (suppose you are downloading some large file), TCP does not know (nor does it care) where the end of the response is, because that's HTTP's job.
