Cellular data billing -- does it include TCP/IP headers?

I'm currently building an application that is intended to run on an embedded system hooked up to a cellular data card. I've been made aware of some low-data plans from several carriers, and our application only generates about 5 bytes/second, lending itself to such plans.
However, I can't seem to figure out if the TCP/IP header overhead (about 40 bytes, give or take) is included in the calculation of data usage. Since I need real-time data, I've disabled Nagle's algorithm, which means every 5-byte burst I send goes out in its own packet with its own headers. If TCP/IP headers are factored into the data usage pricing, the header overhead will dwarf the amount of data I'm actually sending.
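For context, setting up the socket looks roughly like this (a minimal sketch with a hypothetical host and port; TCP_NODELAY is the standard sockets option for disabling Nagle's algorithm):

    import socket

    HOST, PORT = "modem.example.net", 5000   # hypothetical endpoint

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Disable Nagle's algorithm so each small write is sent immediately
    # instead of being coalesced into larger segments.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    sock.connect((HOST, PORT))
    sock.sendall(b"12345")   # a 5-byte burst; the TCP/IP headers are added by the stack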

I can't answer definitively, but I would assume they must. Otherwise this could be exploited by smuggling extra data into the headers. With TCP you pay roughly 40 bytes of headers on every packet you send, and you also receive a roughly 40-byte acknowledgement packet in return. You could try using UDP instead of TCP so that you don't spend data on the acknowledgement packets.
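As a rough back-of-the-envelope comparison, assuming minimal 20-byte IPv4 and TCP headers, an 8-byte UDP header, one ACK per data segment, and ignoring link-layer framing:

    PAYLOAD = 5    # bytes generated per burst
    IP_HDR = 20    # minimal IPv4 header
    TCP_HDR = 20   # minimal TCP header (no options)
    UDP_HDR = 8

    # TCP: one data segment out plus (roughly) one empty ACK back.
    tcp_bytes = (IP_HDR + TCP_HDR + PAYLOAD) + (IP_HDR + TCP_HDR)
    # UDP: a single datagram, no acknowledgements at this layer.
    udp_bytes = IP_HDR + UDP_HDR + PAYLOAD

    print(tcp_bytes, udp_bytes)   # 85 vs 33 -- either way the headers dwarf the 5-byte payload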

According to an email from Sprint network engineering, "Any data that goes through our network, including network Header [sic.] would be billed or count towards your plan."

Related

UDP vs TCP packet dropping

If I send two packets over the Internet, one UDP and one TCP, which packet is more likely to reach its destination? I have been told that the TCP protocol is safer, but this is because of its "fail-safe" (retransmission) mechanism. Does that also mean that UDP packets are more likely to be lost along the way?
I think it's related to the specific router implementation: on one hand, if a UDP packet disappears, both sides probably expect that this might happen and can afford to lose a packet or two; on the other hand, if a TCP packet disappears, the "fail-safe" mechanism sends another one and the problem is solved, but a TCP packet is also heavier.
I would like a more solid answer to this question because I find the subject quite interesting.
If you are making a decision on which protocol to use for your application, you really need to look into both in more detail. Below is just an overview.
TCP is a stream protocol that provides several mechanisms to guarantee delivery of data, in order. It controls the rate at which data is sent (it starts transmitting slowly, then ramps up the speed until it reaches a rate the peer can sustain), and it resends any data that was not received on the other side. For that you pay a price (for example the slow start, the need to acknowledge all data received, etc.).
UDP, on the other hand, is a "data chunk" (datagram) protocol and provides none of the integrity / rate / order checks. It "compensates" by being (potentially) faster: you pump out data as fast as you can and the other side receives whatever it manages to catch, at full network speed in the extreme case. There is no guarantee of delivery or of the order in which data arrives; the receiver gets each datagram whole or not at all.
The decision usually has less to do with whether data can be lost than with how critical losing any of it would be. Video streaming is often done over UDP, since the occasional missing datagram matters less than keeping the stream smooth and timely. File transfers cannot afford any data loss or reordering of data chunks, so TCP is the natural choice.
Apart from that question, remember that the network protocol is only half your problem. The other half is coming up with your application protocol to interpret the bytes you are receiving...
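As a minimal sketch of the difference at the API level (Python sockets, with a hypothetical peer address): TCP needs a connection and then carries an ordered byte stream, while UDP sends self-contained datagrams with no handshake and no delivery guarantee:

    import socket

    HOST, PORT = "192.0.2.10", 9000   # hypothetical peer (TEST-NET address)

    # TCP: connection-oriented byte stream, ordered, retransmitted as needed.
    tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    tcp.connect((HOST, PORT))
    tcp.sendall(b"part of a stream")
    tcp.close()

    # UDP: fire-and-forget datagrams; each one arrives whole or not at all.
    udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp.sendto(b"one datagram", (HOST, PORT))
    udp.close()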

What is the actual bit transfer rate over a network transferring 100kb/s?

When transferring, say, 1 GB worth of data over the internet, the data is split into packets, each packet containing a small piece of the data, and each of these packets is part of a frame.
E.g. Windows reports that you are transferring the file at 100 kb/s over a TCP connection, but this appears to be the amount of file data transferred per second, and does not seem to include the IP or TCP headers, or the Ethernet frame.
What is the actual amount of traffic on the network needed to transfer at this speed? Or is that data actually already included in the transfer speed, but just small enough that it makes no significant difference?
Also, IP supports up to 1500 bytes per packet (I think?), but what is the common size of data packets when loading, say, an HD image on Reddit?
Sorry for the rather basic questions I probably should have figured out myself by now...
It depends on where you look at the transfer rate:
Task Manager will report all the transferred bytes (i.e. the sum of all the packets including their headers).
A file transfer program will report the transmitted payload.
Task Manager
If you look at Task Manager / Network, you can see the transmitted bytes together with the number of transmitted packets (unicast or non-unicast).
That data comes from the network driver (or at least something close to it), so it makes sense to report the total amount of data here (otherwise each packet would need to be inspected to calculate the payload).
There is also a graph showing the transfer rate. Those numbers could easily be compared with the reported numbers in file transfer software.
File transfer program
A file transfer program on the other hand, does not know the details about the packets being created in the lower layers (those could be any size). So the only option here is to report the amount of transmitted payload data / part of the file, which also makes more sense to the user.
Network packets
On normal networks (there could also be jumbo frames), a TCP packet (full Ethernet frame) is around 1500 bytes when fully loaded (on my system (IPv4) the packets are 1514 bytes with a total header size of 54 bytes -- 14 for the Ethernet header, 20 for the IP header and 20 for the TCP header). Those could be split into smaller packets along the way in the network, but in most cases they won't be.
Transfer rate
When transferring a file (or other large data stream), on average 2 full packets (1514 bytes) will be sent each time, and 1 small packet (54 bytes) is received (the [ACK] packet). In this optimal case we have 2 x 1460 bytes of payload, and 2 x 54 bytes of overhead on the sending side + 54 bytes on the receiving side. When comparing against the maximum transfer rate of the internet connection, we also have to account for some latency.
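Spelling out the arithmetic for that optimal case (using the 1514/54-byte sizes measured above; this is just the ratio from the paragraph, not a general rule):

    MSS = 1460    # payload per full segment (1514 minus 54 bytes of headers)
    FULL = 1514   # full frame: 14 (Ethernet) + 20 (IP) + 20 (TCP) + 1460 (payload)
    ACK = 54      # empty ACK frame, headers only

    payload = 2 * MSS          # 2920 bytes of file data
    on_wire = 2 * FULL + ACK   # 3082 bytes crossing the network
    print(payload / on_wire)   # ~0.95, i.e. roughly 5% overhead in the best case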
Not all transmissions are optimal:
There could be packets that never arrived or where the checksum was wrong, so a retransmit would be needed.
In some cases data could be sent in smaller parts, causing a higher overhead/payload ratio (but with small chunks Nagle's algorithm could take care of that).
Certain software could be reading the file contents into small buffers (say 4096 bytes). Those could then be split in 2 x 1460 and 1 x 1176, introducing some extra overhead.
Conclusion
It's hard to tell or calculate the exact ratio transferred_bytes/payload. It depends on the quality of the internet connection (lost packets, retransmits), the software or API calls used to transfer the data, and even the underlying network (small frames vs jumbo frames for example).
A typical full-size TCP/IPv4 packet on the Internet is 1500 B, the maximum transmission unit (MTU), of which a minimum of 20 B is TCP header and a minimum of 20 B is IPv4 header. This MTU was chosen to be compatible with Ethernet. Furthermore, any application headers (e.g., HTTP for the web, SIP/RTP/RTCP for a voice call, etc.) are included in this packet. Every IPv4 host must accept datagrams of at least 576 B, and the minimum link MTU for IPv6 is 1280 B. You can see the MTU on Linux with the ifconfig command.
The best way to figure out these values is by using a pcap tool/network analyzer such as Wireshark. Also refer to wiki pages or a good networking book for the headers and fields of the protocols.
I'm pretty sure that the reported transfer rate doesn't include all the headers and overhead of the different layers in the protocol stack, since the reported throughput usually comes from some user-space application, which would only see the actual data from the network stream object. It would need to do additional work to find out about all the headers, frames, and other overhead that occurred in the different layers and affected the actual physical transmission.

Bluetooth 4.0 scan response

What exactly is a BLE scan response packet?
Since there is almost nothing to be found online, we would like to know this.
Is a scan response packet sent in response to a device scan, or is it sent periodically like the advertising packet, every x seconds?
A BLE scan response is the packet that is sent by the advertising device (peripheral) upon the reception of scanning requests (i.e. yes, it is a response to a device scan). The scan response usually has more data than the advertising packets. In other words, central devices send scan requests to the advertising device in order to get additional user data through the scan response. Please also note that scan responses are considered to have fixed 'static' data relative to the more dynamic advertising data.
Advertising packets and scan response share the same format, and are transmitted over the same three physical channels (they are both sent as advertising events), but are otherwise two different things.
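Both the advertising data and the scan response data payloads are built from the same "AD structure" format (a length byte, an AD type byte, then the data). Below is a rough parser sketch with a made-up example payload, assuming the controller already hands you the raw data bytes:

    # Each AD structure is [length][AD type][length - 1 data bytes].
    def parse_ad_structures(payload: bytes):
        fields = {}
        i = 0
        while i < len(payload):
            length = payload[i]
            if length == 0:
                break                          # a zero length ends the payload
            ad_type = payload[i + 1]
            fields[ad_type] = payload[i + 2:i + 1 + length]
            i += 1 + length
        return fields

    # Made-up scan response: Complete Local Name (0x09) "Tag1" and Tx Power Level (0x0A) -12 dBm.
    example = bytes([0x05, 0x09, 0x54, 0x61, 0x67, 0x31,
                     0x02, 0x0A, 0xF4])
    print(parse_ad_structures(example))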
For more information, I recommend reading about scan response packets in the Bluetooth SIG's Core Specification.
I hope this helps
An important addition to yousif saeed's answer:
According to the Bluetooth 4.x specification, Peripheral devices accepting Scan Requests,
Must advertise this by using a specific Advertising Type value in the protocol header.
Must use an advertising interval of at least 100 ms, so that the Central/Peripheral devices can exchange the Scan Request/Response packets in the time between two consecutive advertising packets (the advertising interval).
Keep in mind, also, that depending on your particular hardware platform and Bluetooth Low Energy software stack,
You may find that a peripheral device accepting Scan Requests is non-connectable, that is, it may be limited to behaving as a pure beacon (connection-less).
I was just looking for this information and it is difficult to find good technical resources beyond the basic description.
There are a few great pages on one manufacturer's site that go into the details of how their hardware handles these communications.
The scan response packet consists of:
Device name,
Transmission power,
Beacon ID,
Firmware version,
Battery level
https://support.kontakt.io/hc/en-gb/articles/201492492-iBeacon-advertising-packet-structure
https://support.kontakt.io/hc/en-gb/articles/201493072-Beacon-services
I am not promoting Kontakt.io, but they did a pretty good job of providing this answer in good detail.
Yes, it is sent in response to a device scan.
I recently had this experience.
I was working with a Nordic device and started sending advertising packets that included scan response data, but I was getting no scan response packets, or hardly any. The issue was that I was not scanning from my other Nordic device. Once I started scanning from the other device, the scan response packets started arriving quickly.

Can the data in a UDP packet be assumed to be correct at the application level?

I recall reading somewhere that if a UDP packet actually gets to the application layer, the data can be assumed to be intact. Disregarding the possibility of someone in the middle sending fake packets, will the data I receive at the application layer always be what was sent out?
UDP uses a 16-bit optional checksum. Packets which fail the checksum test are dropped.
Assuming a perfect checksum, then 1 out of 65536 corrupt packets will not be noticed. Lower layers may have checksums (or even stronger methods, like 802.11's forward error correction) as well. Assuming the lower layers pass a corrupt packet to IP every n packets (on average), and all the checksums are perfectly uncorrelated, then every 65536*n packets your application will see corruption.
Example: Assume the underlying layer also uses a 16-bit checksum, so one out of every 2^16 * 2^16 = 2^32 corrupt packets will pass through corrupted. If 1/100 packets are corrupted, then the app will see 1 corruption per 2^32*100 packets on average.
If we call that 1/(65536*n) number p, then you can calculate the chance of seeing no corruption at all as (1-p)^i where i is the number of packets sent. In the example, to get up to a 0.5% chance of seeing corruption, you need to send nearly 2.2 billion packets.
(Note: In the real world, the chance of corruption depends on both packet count and size. Also, none of these checksums are cryptographically secure, it is trivial for an attacker to corrupt a packet. The above is only for random corruptions.)
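For reference, the example's numbers check out as follows (a sketch using only the assumptions above: two uncorrelated 16-bit checksums and a 1-in-100 lower-layer corruption rate):

    # Chance that a given packet is corrupted AND slips past both 16-bit checksums.
    p = 1 / (2**16 * 2**16 * 100)

    packets = 2_200_000_000
    prob_any_corruption = 1 - (1 - p) ** packets
    print(prob_any_corruption)   # ~0.005, i.e. about a 0.5% chance over ~2.2 billion packets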
UDP uses a 16-bit checksum so you have a reasonable amount of assurance that the data has not been corrupted by the link layer. However, this is not an absolute guarantee. It is always good to validate any incoming data at the application layer, when possible.
Please note that the UDP checksum is technically optional over IPv4. This should further lower your "absolute confidence" level for packets sent over the internet.
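One common way to validate at the application layer is to prepend your own, stronger check to each datagram; a minimal sketch using CRC-32 (still not cryptographically secure, just better odds against random corruption):

    import struct
    import zlib

    def wrap(payload: bytes) -> bytes:
        # Prepend a CRC-32 of the payload (4 bytes, network byte order).
        return struct.pack("!I", zlib.crc32(payload)) + payload

    def unwrap(datagram: bytes):
        # Return the payload if the CRC matches, otherwise None (treat as corrupt).
        if len(datagram) < 4:
            return None
        crc = struct.unpack("!I", datagram[:4])[0]
        payload = datagram[4:]
        return payload if zlib.crc32(payload) == crc else None

    assert unwrap(wrap(b"hello")) == b"hello"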
See the UDP white paper
You are guaranteed only that the checksum is consistent with the header and data in the UDP packet. The odds of a checksum matching corrupted data or header are 1 in 2^16. Those are good odds for some applications, bad for others. If someone along the chain is dropping checksums, you're hosed, and have no way of even guessing whether any part of the packet is "correct". For that, you need TCP.
Theoretically a packet might arrive corrupted: the packet has a checksum, but a checksum isn't a very strong check. I'd guess that kind of corruption is unlikely, though, because if it's being sent via a noisy modem or something, the media layer is likely to have its own, stronger corruption detection.
Instead I'd guess that the most likely forms of corruption are lost packets (not arriving at all), packets being duplicated (two copies of the same packet arriving), and packets arriving out of sequence (a later one arriving before an earlier one).
Not really. And it depends on what you mean by "Correct".
UDP packets carry a checksum that is verified by the network stack (below the application layer), so if you get a UDP packet at the application layer, you can assume the checksum passed.
However, there is always the chance that the packet was damaged in a way that leaves the checksum looking correct. This would be extremely rare - with today's modern hardware it would be really hard for this to happen. Also, if an attacker had access to the packet, they could just update the checksum to match whatever data they changed.
See RFC 768 for more on UDP (quite small for a tech spec :).
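For the curious, the checksum RFC 768 specifies is the standard Internet checksum: a 16-bit ones'-complement sum over the data (plus a pseudo-header, omitted here). A rough sketch of the core computation:

    def internet_checksum(data: bytes) -> int:
        # Ones'-complement sum of 16-bit words, with the carry folded back in.
        if len(data) % 2:
            data += b"\x00"                            # pad odd-length input
        total = 0
        for i in range(0, len(data), 2):
            total += (data[i] << 8) | data[i + 1]
            total = (total & 0xFFFF) + (total >> 16)   # end-around carry
        return ~total & 0xFFFF

    print(hex(internet_checksum(b"\x45\x00\x00\x3c\x1c\x46")))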
It's worth noting that the same 16-bit checksum applies to TCP as well as UDP on a per-packet basis (and it is a ones'-complement sum, not actually a CRC). When characterizing the properties of UDP, consider that the majority of data transfers on the Internet today use TCP: when you download a file from a web site, the same checksum is used for the transmission.
The secret is that the physical and link layers (L1/L2) of most access technologies are significantly more robust than this checksum, and the combined chance of an error getting past L1 and L2 is very low.
For example, modems had error-correcting hardware, and the PPP layer also had its own checksum.
DSL is the same way, with error correction (Reed-Solomon codes) at the ATM layer and a CRC at the PPPoA layer.
DOCSIS cable modems use technology similar to DSL's for error detection and correction.
The end result is that errors in modern systems are extremely unlikely to ever get past L1.
I have seen clocking issues on old frame relay circuits, 14 years ago, routinely cause corruption at the TCP layer. I have also heard stories of bit-flip patterns on malfunctioning hardware that cancel out the checksum and corrupt TCP data.
So yes, corruption is possible, and yes, you should implement your own error detection if the data is very important. In practice, on the Internet and on private networks, it's a rare occurrence today.
All hardware - disk drives, buses, processors, even ECC memory - has its own error probabilities; for most applications they're low enough that we take them for granted.

What does LAN/traffic congestion mean?

While talking about UDP I saw/heard congestion come up a few times. What does that mean?
Congestion is when you are trying to send more data than a link's limited bandwidth can carry; the link cannot forward the data as fast as it comes in, so queues build up and additional packets are dropped.
When congestion occurs, you can see these effects:
Delay, due to the queue at one end of the connection growing too big, so it takes time for your packet to be transmitted.
Packet loss, when new packets are simply dropped, forcing retransmissions (and often causing more congestion).
Lower quality of service: protocols like TCP will cut back their transmission rate, so your throughput will be lowered.
Blocking: certain networks have protocol priorities, so your UDP packets may be dropped in favor of letting TCP traffic through.
It's like a traffic jam: imagine right after a sports game, when a parking lot full of cars is trying to empty out into a small side street.
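A toy way to see both the delay and the loss (entirely made-up numbers): packets arrive faster than the link can drain them, the buffer fills up, and the excess is simply dropped:

    QUEUE_LIMIT = 10        # packets the buffer can hold
    ARRIVALS_PER_TICK = 3   # offered load
    DRAIN_PER_TICK = 2      # what the link can actually forward

    queue = dropped = delivered = 0
    for tick in range(100):
        for _ in range(ARRIVALS_PER_TICK):
            if queue < QUEUE_LIMIT:
                queue += 1        # buffered: adds queueing delay
            else:
                dropped += 1      # buffer full: packet loss
        sent = min(queue, DRAIN_PER_TICK)
        queue -= sent
        delivered += sent

    print(delivered, dropped)     # once the queue is full, one of every three offered packets is dropped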
It means that network-connected devices are attempting to send more data across the network than it can handle, e.g. 20 Mbps of data across a 10 Mbps link.
In context of UDP, it's your main source of lost datagrams under ordinary circumstances.
Most LANs use some sort of collision detection/avoidance system. Congestion typically means that the amount of data being transmitted on the medium is causing enough collisions to degrade the quality of service defined for that medium.
You may want to read up on CSMA/CD on Wikipedia.
As UDP packets are often broadcast, congestion can occur more often.
For instance, classic Ethernet is a shared-medium (broadcast) protocol. Once a message is sent, every node receives it but ignores it if the packet is not addressed to it. What happens when two nodes send a packet at the same time? There is a collision and the data is lost.
So both nodes have to resend the message. To avoid more collisions, nodes are designed to wait a random amount of time before retrying; otherwise they would keep sending simultaneously and the packets would collide forever.
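A rough sketch of that random-backoff idea (simplified: real Ethernet uses truncated binary exponential backoff measured in 51.2 µs slot times and gives up after 16 attempts; send_frame here is a hypothetical callback that reports a collision by returning False):

    import random
    import time

    def send_with_backoff(send_frame, max_attempts=16):
        for attempt in range(max_attempts):
            if send_frame():                        # True = sent without a collision (assumption)
                return True
            # After the n-th collision, wait a random number of slot times
            # in [0, 2**n - 1] (capped), so two colliding senders are unlikely
            # to retry at exactly the same moment again.
            slots = random.randint(0, 2 ** min(attempt + 1, 10) - 1)
            time.sleep(slots * 51.2e-6)             # 51.2 µs slot time on 10 Mbit/s Ethernet
        return False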
