Just out of curiosity, I was wondering whether not having a checksum field in the application layer of a protocol is a major design issue. Or, since IP has a built-in checksum, shouldn't it be an issue at all? Or do you think this is a dumb question because there is never a checksum in the application layer?
Unless I am much mistaken FTP doesn't have a checksum, and neither does HTTP, and both are used to download enormous pieces of software by the million. Draw your own conclusion. Neither does RMI, or IIOP, or XDR, or ... In fact I can't think of an application protocol that does, other than one I wrote in 1994.
It depends on the integrity requirements of the application.
In IPv4 the checksum covers only the IP header, not the payload, and IP does nothing to protect the application against packets that are lost or misordered. Applications that seek reliability usually use TCP (which provides a checksum over the data as well as recovering from loss and misordering).
The question then becomes whether an application needs its own checksum when TCP already provides one. That depends on whether TCP's 16-bit checksum is sufficient for the integrity needs of the application. For example, financial or other applications that are very sensitive to data changes might need to use a CRC or message digest to double-check the information after TCP has checked it.
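As a rough illustration of such a double-check, here is a minimal sketch that appends a CRC-32 to each application message and verifies it on receipt. The function names `wrap` and `unwrap` and the 4-byte trailer layout are assumptions made for this example, not part of any existing protocol.

```python
import struct
import zlib


def wrap(payload: bytes) -> bytes:
    """Append a 4-byte big-endian CRC-32 of the payload (hypothetical framing)."""
    return payload + struct.pack("!I", zlib.crc32(payload) & 0xFFFFFFFF)


def unwrap(message: bytes) -> bytes:
    """Return the payload, or raise ValueError if the CRC does not match."""
    if len(message) < 4:
        raise ValueError("message too short to contain a CRC")
    payload, trailer = message[:-4], message[-4:]
    (expected,) = struct.unpack("!I", trailer)
    if zlib.crc32(payload) & 0xFFFFFFFF != expected:
        raise ValueError("CRC mismatch")
    return payload
```

A CRC like this only catches accidental corruption; it does nothing against deliberate tampering, which needs a keyed MAC or a signature.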
The goal of this question is that I am just trying to better understand the nature of P2P and networking and security / encryption. I am a front-end web developer and my knowledge of the networking stack is not great if we go lower than HTTP requests.
That being said, I am trying to understand how torrent traffic is "sniffed" by ISPs and the content identified. I feel like this question will expose my ignorance, but is it not possible to have some sort of HTTPS-like P2P protocol that would not be so readable?
I grasp that a given packet has to identify its destination to the network along the way, but couldn't torrent packets be configured to show ONLY their destination, so that nobody could identify its purpose along the way, until it arrived at its destination? Why is it apparently an unrectifiable situation that ISPs can just look at P2P traffic and know everything about it, yet SSH is extremely safe?
Every answer here seems to have a different interpretation of the question, or rather, a different assumed purpose of the encryption. Since you compare it to https, it seems like a reasonable assumption is that you're looking for authentication and confidentiality. I'll enumerate a few attempts in decreasing level of "security". This is a bittorrent centric answer, because you tagged the question with bittorrent.
SSL
Starting with the strongest system, it is possible to run bittorrent over SSL (it's not supported by many clients, but in a fully controlled deployment it can be done). This gives you:
Authentication of every peer participating
The ability to pick which peers are let into the swarm by signing their certificate with the swarm root.
SSL encryption of all peer connections + tracker connections
The tracker can authenticate every peer connecting to it, but even if the peer list (or one peer) is leaked or guessed, there's still the peer-to-peer authentication, blocking any unauthorized access.
Bittorrent over SSL has been implemented and deployed.
encrypted torrents
At BitTorrent (in the uTorrent client) we added support for symmetric encryption of torrents at the disk layer:
Everything in the bittorrent engine would operate on encrypted blocks. The data integrity checks (sha-1 hashes of pieces) would be done on encrypted blocks and the .torrent file would have hashes of the encrypted data. An encrypted torrent like this is backwards compatible with clients that don't support the feature, but they won't be able to access the data (just help out the swarm and seed it).
To download the torrent in an unencrypted form, you would add the &key= argument to the magnet link, and uTorrent decrypts and encrypts data at the disk boundary (leaving the data on disk in the clear). Anyone adding the magnet link without the key, would just get encrypted data.
There are some other details involved too, like encrypting some of the metadata in the .torrent file. Such as the list of files etc.
This does not let you pick which peers get to join. You can give access to the peers you want, but since it's a symmetric key, anyone with access can invite anyone else, or publish the key. It does not give you any stronger authentication than you had when you found the magnet link.
It gives you confidentiality among trusted peers and the ability to have untrusted peers help out with seeding.
bittorrent protocol encryption
The bittorrent protocol encryption is probably better described as obfuscation. Its primary intention is not to authenticate or control access to a swarm (it derives the encryption key from the info-hash, so if you can keep that a secret you do get that property). The main purpose is to avoid trivial passive snooping and shaping of traffic. My understanding is that it's less effective to avoid being identified as bittorrent traffic these days. It also provides weak protection against sophisticated and active attacks. For instance, if the DHT is enabled, or tracker connections are not encrypted, it's easy to learn about the info-hash, which is the key.
In the case of private torrents (where DHT and peer exchange are disabled) assuming the tracker runs HTTPS there aren't any obvious holes in it. However, my experience is that it's not uncommon for https trackers to have self signed certificates, and for clients to not authenticate trackers. Which means poisoning the DNS entry for the tracker may be enough to enter the swarm.
Torrent traffic can be encrypted, and there are VPNs/SOCKS proxies that can be used to redirect traffic, i.e., via another country through an encrypted tunnel before connecting to peers. That said, even if you use such services, there are a lot of ways of leaking traffic via side channels (e.g., DNS lookups, insecure trackers, compromised nodes), and most people aren't knowledgeable enough to follow all proper security/anonymity precautions. Furthermore, restricting yourself to communicating only with clients who have also forced encryption will limit the number of peers you can connect to.
The problem you're considering is the difference between point-to-point encryption where there are only two peers in a private context and an unbounded number of peers in a public context.
Decryption by any of the public peers can only be effected if there's a primer somewhere -- a decryption key that is available for all the public peers to use. In the case of protecting from the ISPs, they would also have access to that key unless there was some exclusionary protocol for only sharing the key amongst everyone else. It's not practical to do this.
In a point-to-point connection, a TLS key negotiation eventually creates a session encryption key that is shared by both peers. The key is pseudorandom and session-specific. Data shared on the internet this way would be unusable to clients that didn't participate in the key negotiation.
Bittorrent traffic (specifically the peer-peer protocol used to transfer the bulk of the data) can be encrypted. But it's the kind of encryption that does not provide strong confidentiality/authentication guarantees, similar (but not identical) to HTTP/2's opportunistic encryption.
Client-Tracker communication can be encrypted with HTTPS.
These two components give you a working, albeit restricted, bittorrent stack that's encrypted and whose contents are not visible to a passive observer.
ISPs may still be able to identify it as "bittorrent, probably" based on side-channel data (packet sizes/traffic patterns, domains contacted, ...) but they won't know exactly what is being transferred.
I am writing an application where the client side will be uploading data to the server through a wireless link.
The connection should be very reliable. The link is expected to break many times and there will be many clients connected to the server.
I am confused whether to use TCP or reliable UDP.
Please share your thoughts.
Thanks.
RUDP is not, of course, a formal standard, and there's no telling if you will find existing implementations you can use. Given a choice between rolling this from scratch and just re-making TCP connections, I'd choose TCP.
To be safe, I would go with TCP just because it's a reliable, standard protocol. RUDP has the disadvantage of not being an established standard (although it's been mentioned in several IETF discussions).
Good luck with your project!
It's likely that both your TCP and RUDP links would be broken by your environment, so the fact that you're using RUDP is unlikely to help there; there will likely be times when no datagrams can get through...
What you actually need to make sure of is that a) you can handle the number of connected clients, b) your application protocol can detect reasonably quickly when you've lost connectivity with a client (or server) and c) you can handle the required reconnection and maintenance of cross connection session state for clients.
As long as you deal with b) and c) it doesn't really matter if the connection keeps being broken. Make sure you design your application protocol so that you can get things done in short batches; so if you're uploading files, make sure that you're sending small blocks and that the application protocol can resume a transfer that was broken half way through; you don't want to get 99% of the way through a 2gb transfer and lose the connection and have to start again.
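To make the "short batches" idea concrete, here is a minimal sketch of how a client might work out which blocks are still outstanding after a reconnect. It assumes the server can report the highest byte offset it has safely stored; `remaining_blocks`, `confirmed_offset` and the 64 KiB block size are hypothetical names and values chosen for the example.

```python
from typing import Iterator, Tuple

BLOCK_SIZE = 64 * 1024  # assumed block size; tune for the link


def remaining_blocks(total_size: int, confirmed_offset: int,
                     block_size: int = BLOCK_SIZE) -> Iterator[Tuple[int, int]]:
    """Yield (offset, length) for every block the server has not confirmed yet."""
    offset = confirmed_offset
    while offset < total_size:
        length = min(block_size, total_size - offset)
        yield offset, length
        offset += length
```

After a dropped connection the client would ask the server for its confirmed offset and iterate from there, so at most one block is ever resent.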
For this to work your server needs some kind of client session state cache where you can keep the logical state of a client's connection beyond the life of the connection itself. Design from the start to expect a given session to include multiple separate connections. The session state should possibly have some kind of timeout so that if the client goes away for a long time it doesn't continue to consume resources on the server but, to be honest, it may simply be a case of saving the state off to disk after a while.
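A session state cache of this kind could be sketched roughly as follows. Everything here (the `SessionStore` class, the `confirmed_offset` field, the in-memory dictionary, the TTL) is an assumption for illustration; a real server might well persist idle sessions to disk instead of dropping them.

```python
import time
from typing import Callable, Dict, List


class SessionStore:
    """In-memory session state that outlives individual connections (sketch)."""

    def __init__(self, ttl_seconds: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._sessions: Dict[str, dict] = {}

    def touch(self, session_id: str) -> dict:
        """Return the state for session_id, creating it on first contact."""
        state = self._sessions.setdefault(session_id, {"confirmed_offset": 0})
        state["last_seen"] = self._clock()
        return state

    def record_block(self, session_id: str, offset: int, length: int) -> int:
        """Accept a block only if it continues the upload; return the new offset."""
        state = self.touch(session_id)
        if offset == state["confirmed_offset"]:
            state["confirmed_offset"] += length
        return state["confirmed_offset"]

    def expire(self) -> List[str]:
        """Drop sessions idle for longer than the TTL and return their ids."""
        now = self._clock()
        stale = [sid for sid, s in self._sessions.items()
                 if now - s["last_seen"] > self._ttl]
        for sid in stale:
            del self._sessions[sid]
        return stale
```

The key point is that the store is indexed by a session id the client presents on every reconnect, not by the socket, so the transport connection can come and go freely.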
In summary, I don't think the choice of transport matters and I'd go with TCP at least to start with. What will really matter is being able to manage your client's session state on the server and deal with the fact that clients will connect and disconnect regularly.
If you aren't sure, odds are that you should use TCP. For one thing, it's certain to be part of the network stack for anything supporting IP. "Reliable UDP" is rarely supported out of the box, so you'll have some extra support work for your clients.
I am designing an application protocol, and I am wondering if I still need to include a checksum in the protocol since TCP/IP already has a checksum.
What's your opinion?
The BitTorrent protocol has a heavy amount of additional error correction and detection layered on top of TCP, so clearly the protocol designers saw the need for it.
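The core of that extra detection is hashing the data in fixed-size pieces and checking each received piece against a known hash. The sketch below shows the general idea with SHA-1; the 256 KiB piece size and the function names are assumptions for the example, and this is not the actual .torrent format or any client's code.

```python
import hashlib
from typing import List

PIECE_SIZE = 256 * 1024  # assumed piece size for illustration


def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE) -> List[bytes]:
    """Split data into fixed-size pieces and return the SHA-1 digest of each."""
    return [hashlib.sha1(data[i:i + piece_size]).digest()
            for i in range(0, len(data), piece_size)]


def verify_piece(index: int, piece: bytes, expected: List[bytes]) -> bool:
    """Check a received piece against the digest published for that index."""
    return hashlib.sha1(piece).digest() == expected[index]
```

A piece that fails verification is simply discarded and requested again, so corruption that slips past TCP costs one piece rather than the whole download.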
The TCP checksum is quite weak, so you probably want an application level one if you are at all worried about reliability.
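One way to see the weakness: the checksum is a ones'-complement sum of 16-bit words, so it cannot tell when two aligned words swap places. The sketch below implements that sum in the style of RFC 1071 (ignoring the TCP pseudo-header, which does not change the property) and compares it with CRC-32; the sample messages are made up for the demonstration.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold the carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"PAY 0100 TO 0042"
reordered = b"PAY 0001 TO 0042"  # the words "01" and "00" have swapped places

same_sum = internet_checksum(original) == internet_checksum(reordered)
same_crc = zlib.crc32(original) == zlib.crc32(reordered)
```

Here `same_sum` is true because addition is commutative, while `same_crc` is false, so the reordering passes the 16-bit sum but is caught by the CRC.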
In particular the TCP checksum is not a secure hash, and there is no signature, so if you're worried about malicious changes then you need to add the security yourself.
To add to the other answers, you should probably look into Message Authentication Codes. MACs are a more robust way to detect errors than a simple TCP checksum.
If you want something robust, take a look at HMAC. HMAC provides both error detection and authentication (via shared keys).
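A minimal sketch using Python's standard `hmac` module might look like this. It assumes both ends already share a secret key (how they obtain it is out of scope), and the names `seal` and `open_sealed` and the tag-as-trailer layout are choices made for the example.

```python
import hashlib
import hmac

TAG_LEN = 32  # length of an HMAC-SHA256 tag in bytes


def seal(key: bytes, payload: bytes) -> bytes:
    """Append an HMAC-SHA256 tag computed over the payload."""
    return payload + hmac.new(key, payload, hashlib.sha256).digest()


def open_sealed(key: bytes, message: bytes) -> bytes:
    """Return the payload if the tag verifies, otherwise raise ValueError."""
    if len(message) < TAG_LEN:
        raise ValueError("message too short to contain a tag")
    payload, tag = message[:-TAG_LEN], message[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("bad MAC")
    return payload
```

Note that this authenticates each message on its own; it does not stop an attacker from replaying an old, validly tagged message, which would need a sequence number or nonce inside the payload.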
If you want something quick and dirty, why not use sha1 hashes?
I have just started writing socket programs. I came to know that a single UDP packet carries a source port, a destination port, a MAC address representing the router, and so on. I wonder why somebody couldn't create custom packets with fake information in them and send them over the internet. I would like to know how safe our PCs are. What should be done to secure them?
There are a couple of different aspects to the answer.
One is that the web relies on TCP, not UDP, which means that it is connection-oriented. Your packet will be rejected unless it appears to be part of an existing connection (which means, among other things, that it has to have the right source IP and port, and the right sequence number to fit into the receive window). This can still be faked without too much trouble, of course. But it does require you to know a bit about the packets being sent on the original connection.
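The receive-window test can be pictured with a few lines of modular arithmetic. This is a deliberately simplified sketch: real TCP stacks also take the segment length into account and apply further checks, and the names `in_receive_window`, `rcv_nxt` and `rcv_wnd` are used here only for illustration.

```python
SEQ_SPACE = 2 ** 32  # TCP sequence numbers wrap around at 32 bits


def in_receive_window(seq: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """True if seq lies in [rcv_nxt, rcv_nxt + rcv_wnd), allowing for wraparound."""
    return (seq - rcv_nxt) % SEQ_SPACE < rcv_wnd
```

A blind attacker who cannot observe the connection has to guess a value that lands inside this window, which is why knowing something about the original packets matters.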
Another part is that whenever we need to be sure that the sender of a packet is who they claim to be, we use encryption. :)
Most packets don't really need this. It's not a huge deal if someone sends a request to Google which appears to come from my IP. But when making credit card transactions, it becomes a bit more important.
Most of the TCP/IP stack "leaks trust", as I once put it -- and there isn't much that you, as a software developer (assuming you're looking for a programming solution; otherwise, stackoverflow's the wrong forum, go to serverfault or superuser;-) can do about it -- beyond choosing and carefully implementing protocols that are reasonable in terms of security expectations.
HTTPS (with strong checks of certificates, etc) is one reasonably strong approach; for stronger security, look into SSH and VPN-based approaches. Of course, nobody should assume privacy or strong authentication is in place unless they've taken specific steps towards it (if they HAVE taken such steps, they may be still subject to successful attacks, which is why using existing, more or less "proven" solutions such as HTTPS, SSH, VPNs, is advisable;-).
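For a sense of what "strong checks of certificates" means on the client side, here is a minimal sketch using Python's standard `ssl` module. The helper name `strict_client_context` and the choice of TLS 1.2 as a floor are assumptions for the example.

```python
import ssl


def strict_client_context() -> ssl.SSLContext:
    """Client-side TLS context that verifies the certificate chain and hostname."""
    ctx = ssl.create_default_context()  # enables CERT_REQUIRED and hostname checks
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed floor; refuse older versions
    return ctx
```

A client would then wrap its socket with `ctx.wrap_socket(sock, server_hostname=host)`. The common mistake is to switch these checks off to make a self-signed certificate "work", which removes the protection against someone impersonating the server.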
Yes, anyone can create packets with whatever data they want and send them out over the internet. Especially with UDP, you can pretend to be anyone you want (unless your ISP does egress filtering). Source addresses for UDP cannot be trusted. Source addresses for TCP can to an extent (you know the data has to be coming from the IP address in question, or someone along the route).
Welcome to the internet :)
Edit: just to clarify, egress filtering is something the sending ISP would have to do. As a receiver, there's not really anything you can do to verify the address on a UDP packet without communicating back to the sender. The only reason you can at least partially trust an incoming TCP connection is that TCP requires certain control data to flow back to the sender (and hence needs a valid IP address/port to set the connection up and maintain it).
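One common way of "communicating back to the sender" over UDP is a cookie exchange: the server replies to the claimed address with a token, and only does real work once the client echoes it. This technique is not described in the answer above; the sketch below is an assumed illustration, with hypothetical names `make_cookie` and `cookie_is_valid`.

```python
import hashlib
import hmac
import os
from typing import Tuple

SERVER_SECRET = os.urandom(32)  # per-process secret; a real server would rotate it


def make_cookie(addr: Tuple[str, int], secret: bytes = SERVER_SECRET) -> bytes:
    """Derive a token bound to the claimed source IP address and port."""
    ip, port = addr
    return hmac.new(secret, f"{ip}:{port}".encode(), hashlib.sha256).digest()


def cookie_is_valid(addr: Tuple[str, int], cookie: bytes,
                    secret: bytes = SERVER_SECRET) -> bool:
    """True only if the cookie was issued for this exact address."""
    return hmac.compare_digest(cookie, make_cookie(addr, secret))
```

A sender spoofing someone else's address never receives the cookie, so it cannot echo it. This proves only that the sender can receive packets at that address, not who the sender is.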
Well, many many people create invalid packets and send them over Internet; for instance, read Ping of death.
A completely secure computer is a computer turned off. To make your running PC more secure against this kind of threat, you should rely on firewall software or hardware, which can detect such malformed packets.
Custom packets with fake information can easily be created. Therefore you have to make sure you're not vulnerable to them.
I am considering using SCTP instead of TCP for a p2p app written in C. Should I do it? Also how does the speed of SCTP compare to the speed of TCP?
EDIT:
I found that SCTP can be tunneled over UDP with the only problem being tunneled SCTP is not interoperable with untunneled SCTP.
Have you considered whether your target systems will all have SCTP pre-installed on them or whether your application will need to include SCTP itself? In my experience I would not expect all systems to have SCTP installed, and on Windows I would expect it to be absent.
If you include SCTP in the application itself then that will more than double the number of messages being passed into and out of the kernel, which will impact performance when compared with using the pre-installed TCP.
Have you considered what benefits you want from SCTP? You mentioned fault tolerance, but for this to work with SCTP the host needs multiple Ethernet ports and IP addresses. Is this likely for your app?
As much as I love SCTP (!) I would seriously consider sticking with TCP unless you are sure SCTP is needed or unless you control the hosts your app is deployed on.
Regards
If it's for a local area network, sure go for it.
Note however that if you plan to use it on the open internet many consumer grade firewalls aren't flexible enough to permit unrecognised IP protocols through them.
How does it help you?
You're P2P, so every peer must have at least one socket open to every other peer.
If you've got a socket open, then you can do everything you need to do over that. If you've taken the approach of one socket per file and you have multiple files being transferred concurrently between two given peers, then SCTP will save you one socket per file. However, on a normal P2P network of any size, you will almost never have multiple files being transferred concurrently between two peers.
Just have one socket and have your own little protocol; send a packet with a header, where the header indicates the content type, e.g. a command or part of a file, and if the latter, which file and which byte range.
Of course, you get a little overhead for that, whereas if you have one socket for commands and one per file, you're more efficient. Is saving one socket per peer (assuming one download at a time) worth the time/hassle/complexity of using SCTP?
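The "little protocol" over a single socket could be framed roughly like this. The header layout (1-byte type, 4-byte file id, 8-byte offset, 4-byte payload length) and all names are assumptions made for the sketch, not an existing wire format.

```python
import struct
from typing import Optional, Tuple

# Network byte order: content type, file id, byte offset, payload length.
HEADER = struct.Struct("!BIQI")
TYPE_COMMAND = 0
TYPE_FILE_PART = 1


def encode_frame(kind: int, file_id: int, offset: int, payload: bytes) -> bytes:
    """Prefix the payload with a fixed-size header describing it."""
    return HEADER.pack(kind, file_id, offset, len(payload)) + payload


def decode_frame(buffer: bytes) -> Optional[Tuple[int, int, int, bytes, bytes]]:
    """Return (kind, file_id, offset, payload, rest), or None if incomplete."""
    if len(buffer) < HEADER.size:
        return None
    kind, file_id, offset, length = HEADER.unpack_from(buffer)
    end = HEADER.size + length
    if len(buffer) < end:
        return None  # wait for more bytes from the socket
    return kind, file_id, offset, buffer[HEADER.size:end], buffer[end:]
```

Because TCP is a byte stream, the receiver keeps appending to a buffer and calls `decode_frame` until it returns `None`. The 17-byte header is the per-message overhead traded for not opening one socket per file.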