Doubts Related to working of torrents?


I am trying to understand how torrents work.
After reading a lot on the web I now know the basics, but I have one very
important question about how torrents work:
In torrents, how do peer-to-peer connections take place?
Almost all peers have private IP addresses (e.g. 192.168.x.x), so how do connections take place without a server? (As I have read, there is no server involved in torrents.)
Thanks a lot!

There are a few alternatives:
Peers behind NAT simply don't connect to other peers behind NATs. This creates two classes of peers, where the ones that are connectable will have an advantage when trading pieces, and typically achieve faster download rates.
Peers behind NAT use UPnP or NAT-PMP to set up port forwarding in order to be connectable by other peers.
Peers using uTP and peer exchange can support a simple hole-punching mechanism (uTorrent and libtorrent support this, for instance). A peer can help by introducing two of its connections to each other; they try to connect to each other at the same time, and if one of them has a full-cone NAT, they are very likely to succeed in establishing the connection.
Peers supporting DHT and uTP may use a relatively new feature where the port announced to the DHT is derived from the source port of their UDP packets. Using the same socket for DHT and uTP increases the chances that a peer behind a full-cone NAT can accept incoming connections without UPnP or NAT-PMP set up, simply because the DHT traffic keeps a pinhole open on the NAT.
If you have a swarm of only peers behind symmetric NATs, nobody is going to be able to connect to anyone else, and bittorrent is not going to work. In practice (at least in moderately large swarms) there are always some peers that are connectable.
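To make the "two classes of peers" point concrete, here is a minimal sketch in Python. It assumes the simplest rule described above: a link can form only if at least one side accepts incoming connections. It deliberately ignores hole punching, and the peer names are made up for illustration.

```python
from itertools import combinations

def can_connect(a_connectable, b_connectable):
    """A link can form if at least one side accepts incoming connections."""
    return a_connectable or b_connectable

def usable_links(swarm):
    """Return every pair of peers in the swarm that can connect to each other."""
    return [(a, b) for a, b in combinations(sorted(swarm), 2)
            if can_connect(swarm[a], swarm[b])]

# Hypothetical swarms: True means the peer is connectable from outside.
mixed = {"alice": True, "bob": False, "carol": False}
all_natted = {"bob": False, "carol": False, "dave": False}

print(usable_links(mixed))       # bob and carol can each reach alice, but not each other
print(usable_links(all_natted))  # no links at all
```

In the mixed swarm the connectable peer ends up with the most links, which is the trading advantage mentioned above.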

Related

LibTorrent Nat Traversal

I am trying to connect to a peer via the add_peer() function in libtorrent. But what if the peer I want to download the file from is behind a NAT? Is there any support for NAT traversal in libtorrent?

The NAT traversal in libtorrent is limited to:
1) Explicit port forwards using UPnP, NAT-PMP and PCP.
2) Implicit (opportunistic) attempts to reach peers via their external port.
3) The peer receiving the connection attempt is not behind a NAT, but the initiating peer is. This is the case NATs are meant to support.
It sounds like you're mostly interested in (2), where we assume both peers are behind a NAT. This is commonly referred to as UDP hole-punching.
Generally, if you don't control or have any influence over the peer you're trying to connect to, you're limited in what measures you can take.
Also, if neither NAT is full-cone (or, let's say, p2p-friendly) it may not be possible for the peers to connect. A p2p-friendly NAT generally accepts incoming connections from IPs it has not had any previous interaction with.
The main two approaches used by libtorrent (and bittorrent clients generally) are:
Commonly connected peers may introduce two NATed peers to each other via the peer exchange extension. In this mode both peers try to connect to each other simultaneously, hoping that both NATs will open up pinholes for the ports that are being attempted. This only works if the swarm has at least one peer that's not behind a NAT. You can find more information about this in BEP 55.
Sharing the UDP port for uTP, DHT and UDP trackers and having the listen port be implied by the source port of the tracker and DHT announce. With some luck, that source port can also be used by other hosts to reach the NATed client. This works because uTP connections also run over UDP.
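The "implied port" idea can be sketched with two plain UDP sockets on loopback. This is not libtorrent code, just an illustration in Python: the "tracker" socket is a stand-in that records the source address of the announce instead of reading a port from the payload. There is no NAT on loopback, so the observed address is simply the client's own; behind a real NAT it would be the NAT's external mapping.

```python
import socket

# Stand-in for a tracker or DHT node: learns the sender's address from the packet itself.
tracker = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
tracker.bind(("127.0.0.1", 0))
tracker.settimeout(2.0)

# The client uses ONE UDP socket both for announcing and for receiving peer traffic.
client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client.bind(("127.0.0.1", 0))
client.settimeout(2.0)
client_addr = client.getsockname()

client.sendto(b"announce", tracker.getsockname())
payload, observed_addr = tracker.recvfrom(1024)

# No port field was parsed; the source address of the datagram is the "implied" port.
# Any host that is told observed_addr can now try to send to it. Here the tracker does.
tracker.sendto(b"hello from a peer", observed_addr)
reply, _ = client.recvfrom(1024)

print("observed address matches client socket:", observed_addr == client_addr)
print("client received:", reply)

tracker.close()
client.close()
```

Because the announce and the later peer traffic share one socket, the address the outside world observed is the same one the client is listening on.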

Writing client-server application in global network

I know, how to write a C# application that works through a local network.
I mean I know, how to make my client-side application access my server-side application in a single local network.
But I wonder: How do such apps, as Skype, TeamViewer, and many other connect via global network?
I apologise, if this question is simple or obvious, but I couldn't find any information about this stuff.
Please, help me, I'll be very grateful. Any information is accepted - articles, plain info, books,and so on...

The question is very broad, so I will try to give a short overview.
The major differences between a LAN (Local Area Network) and a WAN (Wide Area Network) are the following:
Network quality:
A LAN is more or less stable, while a WAN can have network issues like:
Packet loss (you need to use a transport that copes with loss, like TCP, or UDP with retransmits or packet loss concealment)
Packet jitter (inter-packet intervals on the receiving side may differ a lot from those on the sending side). The most common form is packet bursts.
Packet reordering
Packet duplication
Network connectivity:
A WAN is less stable than a LAN, so you need to properly handle things like:
Stale connections
Connection loss
Errors in the middle of a connection (if you use UDP, for example)
Addresses:
In a WAN you deal with different network equipment between client and server (or between peers, in the case of peer-to-peer communication). You need to take into account:
NATs - most clients are behind a NAT and you need to get through it. The corresponding techniques are called "NAT traversal".
Firewalls - an ISP may have its own rules about what a client can or cannot do. So if you do something specific, like a custom transport protocol, you may bump into ISP firewalls.
Routing - especially multicast and broadcast communication. In the common case multicast cannot be routed. Broadcasts are never routed. So you need to avoid these types of communication if you want to work over a WAN.
Maybe I forgot something, but these are the major points. You can find many articles about each of them.
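As an illustration of the reordering and duplication items listed above, here is a small Python sketch of a receive-side buffer that uses sequence numbers to deliver payloads in order and drop duplicates. The class name and the sequence-number scheme are assumptions made for this example. It does not handle loss: a packet that never arrives would stall delivery until some retransmit or timeout logic (not shown) fills the gap.

```python
class ReorderBuffer:
    """Deliver payloads in sequence order and silently drop duplicates."""

    def __init__(self):
        self.next_seq = 0   # next sequence number we are allowed to deliver
        self.pending = {}   # out-of-order packets waiting for the gap to fill

    def receive(self, seq, payload):
        """Accept one packet and return the payloads that became deliverable."""
        if seq < self.next_seq or seq in self.pending:
            return []  # already delivered or already buffered: a duplicate
        self.pending[seq] = payload
        delivered = []
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return delivered

buf = ReorderBuffer()
# Simulated WAN arrival order: packet 1 overtakes 0, packet 0 is duplicated, 3 overtakes 2.
arrivals = [(1, "b"), (0, "a"), (0, "a"), (3, "d"), (2, "c")]
out = []
for seq, payload in arrivals:
    out.extend(buf.receive(seq, payload))
print(out)
```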

In my peer to peer application, should I use multiple ports?

I am building a simple peer to peer application where about 8 participants all connect to each other (n*n). I will be using UDP with a reliability and ordering protocol layered on top. Each peer will be broadcasting a few KB of data per second.
It occurred to me that there are two ways of configuring the ports on each peer:
Each peer takes one port and all messages are received via that
Each peer takes a port for each other peer, and only communicates with a peer using its corresponding port
What are the advantages and disadvantages of each approach?

Using only one port to communicate with all peers is the best option. You create only one socket and use only one port to send and receive data with as many peers as you want. You can distinguish received data by looking at the source address of each packet. This approach makes for less complicated code and is more resource efficient.
Creating multiple ports has absolutely no advantage. It will complicate your code and use many more resources with no benefit. Resource consumption will grow as you add peers.
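A minimal sketch of the single-socket approach, in Python on loopback: one listening UDP socket receives from two peers and sorts the datagrams by the source address that recvfrom() reports. The message contents are made up for the example.

```python
import socket
from collections import defaultdict

# One socket, one port, for all incoming traffic.
listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
listener.bind(("127.0.0.1", 0))
listener.settimeout(2.0)

# Two peers, each with its own socket (they would normally be other machines).
peers = []
for _ in range(2):
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.bind(("127.0.0.1", 0))
    peers.append(s)
peer_addrs = [s.getsockname() for s in peers]

peers[0].sendto(b"from peer 0", listener.getsockname())
peers[1].sendto(b"from peer 1", listener.getsockname())
peers[0].sendto(b"peer 0 again", listener.getsockname())

# Demultiplex by source address: (ip, port) identifies the sending peer.
inbox = defaultdict(list)
for _ in range(3):
    data, source = listener.recvfrom(1024)
    inbox[source].append(data)

for source, messages in inbox.items():
    print(source, messages)

for s in peers + [listener]:
    s.close()
```

If peers sit behind NATs that may rebind ports, an in-protocol peer identifier is a sturdier key than the source address.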

I have never done anything with this model of application, but here are my few thoughts.
One big thing to consider is whether any firewalls are in place between the peers. If so, a hole will need to be punched through for each listening port. This can be done either once (e.g. a router port-forward rule) or dynamically (UPnP, etc.), but you may not be able to count on automatic full-cone NAT to do this for you. If you are expecting any firewalls between your peers, I would recommend using a single port for ease of programming on your part, and identifying your peers strictly by their remote address or some other in-protocol identifier.
Using a dedicated port per peer will make identifying communication much simpler, but keep in mind that the number of peer-to-peer links in a full mesh grows as n(n-1)/2 with the number of participants. If you never expect more than a small number (e.g. 20), port per peer will work decently well without much effort.
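For a feel of the numbers, here is a quick Python calculation. It assumes a full mesh where every peer talks to every other peer, and compares how many ports are open in total under each design. The function names are just for this example.

```python
def mesh_links(n):
    """Number of distinct peer pairs in a full mesh of n peers."""
    return n * (n - 1) // 2

def total_ports_port_per_peer(n):
    """Each of the n peers opens one port for each of the other n - 1 peers."""
    return n * (n - 1)

def total_ports_single_port(n):
    """Each peer opens exactly one port."""
    return n

for n in (8, 20):
    print(f"n={n}: links={mesh_links(n)}, "
          f"port-per-peer total={total_ports_port_per_peer(n)}, "
          f"single-port total={total_ports_single_port(n)}")
```

With 8 peers that is 28 links and 56 open ports for the port-per-peer design, against 8 ports for the single-port design.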
Another option for you (possibly) may be using multicast. If all of your peers are on the same broadcast domain, this will reduce bus contention and may make your coding somewhat cleaner as well.
Hope this helps. I apologize if this isn't what you're looking for. Good luck!

Each peer takes one port and all messages are received via that
If each peer can get the source IP/source port of the incoming datagram (and I bet it can), this is enough to differentiate the peers.
Each peer takes a port for each other peer, and only communicates with a peer using its corresponding port
See above, and most importantly this contradicts your base idea of broadcasting in the first place. It just adds a level of complexity (and is probably not very scalable, even if for now you envision just 8 peers).
In your base requirement I think you may have a dilemma between:
broadcast everything to everyone,
but still you want a peer to be able to "only communicates with a peer", which is inherently unicast.
This raises some problems, as you already realized by asking the question.
I see 2 other problems:
Scalability-wise, broadcasting everything when you sometimes actually need unicast is going to put useless load on the network. This is not pretty.
The broadcast approach dictates UDP, but still you want reliable data transfer, so as you stated you'll have to add a "reliability and ordering protocol layered on top". This (not so easy) work would not be needed if only we could use TCP.
There is a third approach:
use broadcast UDP for each peer to announce itself on the network, so that other peers can...
...discover it and then establish a unicast TCP connection with this peer. No more reliability and ordering problems + reduced network load.
This approach is used in SSDP (Simple Service Discovery Protocol), part of UPnP. I do not suggest you use SSDP, it's probably bloated for what you want to do, you said you wanted something simple.
All in all, you first have to resolve your dilemma: decide which data really needs to be broadcast and which belongs in the unicast part. YMMV.
PS: with broadcast UDP also comes the problem that though OK on a LAN, this will not pass a router unless you use multicast routing. But that's another story.
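The announce-then-connect approach can be sketched in Python as follows. Several things here are assumptions for the sake of a runnable example: the "HELLO <port>" announcement format is made up, and the announcement is sent as loopback unicast rather than a real broadcast. On a LAN you would set SO_BROADCAST on the announcing socket and send to the broadcast address instead.

```python
import socket
import threading

# Peer A: accepts TCP connections and will announce its TCP port over UDP.
tcp_server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tcp_server.bind(("127.0.0.1", 0))
tcp_server.listen(1)
tcp_server.settimeout(2.0)
tcp_port = tcp_server.getsockname()[1]

received = []

def serve_one():
    """Accept a single TCP connection, record what arrives, and acknowledge it."""
    conn, _ = tcp_server.accept()
    with conn:
        received.append(conn.recv(1024))
        conn.sendall(b"ack")

server_thread = threading.Thread(target=serve_one)
server_thread.start()

# Peer B: listens for announcements on a UDP socket.
discovery = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
discovery.bind(("127.0.0.1", 0))
discovery.settimeout(2.0)

# Peer A announces itself (hypothetical message format).
announcer = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
announcer.sendto(f"HELLO {tcp_port}".encode(), discovery.getsockname())

# Peer B takes A's IP from the datagram source and A's TCP port from the payload.
message, (peer_ip, _) = discovery.recvfrom(1024)
announced_port = int(message.decode().split()[1])

# From here on the data travels over TCP, which handles reliability and ordering.
with socket.create_connection((peer_ip, announced_port), timeout=2.0) as conn:
    conn.sendall(b"reliable payload")
    ack = conn.recv(1024)

server_thread.join(timeout=2.0)
print("server received:", received, "client got:", ack)

for s in (tcp_server, discovery, announcer):
    s.close()
```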

Connecting P2P over NAT?

I started to explore the option of connecting with others using a p2p connection, so I coded a simple socket program in Java for Android devices in which users can share simple messages p2p (I didn't know anything about NAT then). Now that I know about NAT, I need to establish a TCP connection with another user that uses a server for discovery while the payload is transferred p2p. I have also looked at XMPP (a very good and detailed explanation of how the protocol works is here) and UPnP, but I don't know how to implement them.
Another interesting question that arises is of BitTorrent because they can work on any device and even behind a NAT. I am not able to get any explanation of how BitTorrent works.
I have researched a lot but I am stuck.
My questions are:
1) A detailed explanation of BitTorrent (like here, not how torrents work) and how is it able to work around NAT?
2) Is there a way to make a NAT entry programmatically?
3) Is socket programming sufficient for p2p?
4) How difficult is it to create your own protocol and how can I build one?
5) If two devices D1 and D2 want to communicate p2p and they know each other's IP. D1 sends a request to D2 and that can't get through D2's NAT, but there should be an entry created in D1's NAT. So when D2 tries to send something, D1's NAT should discover an entry with D2's IP. Then why is the packet not allowed by it?

Another interesting question that arises is of BitTorrent because they can work on any device and even behind a NAT. I am not able to get any explanation of how BitTorrent works.
This statement looks like you assume that bittorrent needs full connectivity to operate.
That is incorrect.
Behind a NAT device you will still be able to establish outgoing TCP connections, which is generally sufficient for BitTorrent as long as there are other non-NATed (or NATed but properly port-forwarded) clients in the network that can accept incoming connections.
NAT has no impact on the flow direction of the data because connections are bi-directional once they are established. It is only problematic for the initial connection setup.
This works perfectly fine for BitTorrent because BitTorrent does not care which specific node you get your data from,
although better connectivity generally does improve performance.
If the identity of the node matters, or one-on-one transfers are an important use case, then other p2p protocols usually attempt NAT traversal first and, if that fails, rely on 3rd-party nodes relaying traffic between the nodes that cannot connect to each other directly.
Additionally, IPv6 support will become essential in the future to maintain end-to-end connectivity, because more and more ISPs are starting to roll out carrier-grade NAT for IPv4 while IPv6 will remain non-NATed.

One thing needs to be clear: 100% P2P connectivity between all types of NAT is impossible right now. There is no practical way to establish P2P connectivity between **symmetric and symmetric or port-restricted cone (PRC) NATs. In this scenario the connection is established through a TURN relay server.
I am answering from your 2nd question onwards because I don't know much about the first one.
2) Yes. You can send a packet through your NAT and there will be a mapping from your internal IP:Port to your NAT's external IP:Port. You can learn this external IP:Port by sending a STUN request. Note that this technique doesn't work for symmetric NAT.
3) Yes, socket programming is sufficient for p2p.
4) Why do you need a new protocol when several already exist? ICE is the best protocol today for NAT traversal, and I don't think it was easy to create. UPnP and NAT-PMP are really vulnerable in terms of security.
5) I think what usually happens is that a NAT blocks unknown packets coming to it. So when D1 sends a packet to D2, D2's NAT blocks all packets incoming from D1's IP:Port. That is why connection establishment fails. You have to employ a hole punching technique for D1 and D2 to successfully establish P2P connectivity.
**By symmetric NAT I mean symmetric NAT with random port allocation.
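To show what "sending a STUN request" from point 2 involves on the wire, here is a Python sketch that builds a STUN binding request and decodes the XOR-MAPPED-ADDRESS attribute of a response, following the RFC 5389 message layout. It runs offline: fake_response() is a hypothetical helper that hand-builds the reply a server would send, and the address in it is made up. To use it for real you would send the request over UDP to a STUN server of your choice and pass the reply to the parser.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020

def build_binding_request(transaction_id=None):
    """20-byte STUN header: type, length 0, magic cookie, 12-byte transaction ID."""
    tid = transaction_id or os.urandom(12)
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid

def parse_xor_mapped_address(response):
    """Return (ip, port) as seen by the server, taken from an IPv4 XOR-MAPPED-ADDRESS."""
    msg_type, length, cookie = struct.unpack("!HHI", response[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        raise ValueError("not a STUN binding success response")
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", response[offset:offset + 4])
        value = response[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and value[1] == 0x01:  # family 0x01 = IPv4
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        offset += 4 + attr_len + (-attr_len % 4)  # attribute values are padded to 4 bytes
    raise ValueError("no IPv4 XOR-MAPPED-ADDRESS attribute found")

def fake_response(tid, ip, port):
    """Hypothetical helper: hand-build the success response a server would return."""
    xport = port ^ (MAGIC_COOKIE >> 16)
    xaddr = struct.unpack("!I", socket.inet_aton(ip))[0] ^ MAGIC_COOKIE
    attr = struct.pack("!HHBBHI", XOR_MAPPED_ADDRESS, 8, 0, 0x01, xport, xaddr)
    return struct.pack("!HHI", BINDING_SUCCESS, len(attr), MAGIC_COOKIE) + tid + attr

request = build_binding_request()
reply = fake_response(request[8:], "203.0.113.7", 51413)
print(parse_xor_mapped_address(reply))
```

As noted above, the address learned this way is only useful to other peers if the NAT reuses the same mapping for them, which a symmetric NAT does not.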

There is a paper on "Peer-to-Peer Communication Across Network Address Translators" which describes the UDP hole punching method and extends it to be used over TCP as well.
Of course, you will always need a relay server for the cases where hole punching is not supported.

1) Recent versions of BitTorrent use µTP, which is layered above UDP, not TCP. µTorrent uses a private extension (ut_holepunch) that performs UDP hole punching; most other implementations don't bother (with the notable exception of Tixati).
2) Some NAT routers accept port forwarding requests using either the UPnP or the NAT-PMP protocol. Whether this is supported depends on the particular brand of router and its configuration.
3) Yes, socket programming is enough for P2P.
4) Difficult to answer. I suggest that you read the wikified and annotated BitTorrent specification for a start.
5) Yes, this is the principle behind UDP hole punching.
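The hole punching principle can be played through with a toy NAT model in Python. Everything here is a simplification made up for illustration: the three NAT kinds, the sequential port allocation, and the rendezvous server address. The model only tracks which external port each NAT assigns and which remote endpoints it will let back in.

```python
import itertools

class Nat:
    """Toy NAT: kind is 'full_cone', 'port_restricted' or 'symmetric'."""

    def __init__(self, public_ip, kind):
        self.public_ip = public_ip
        self.kind = kind
        self._ports = itertools.count(40000)
        self._mappings = {}    # mapping key -> external port
        self._allowed = set()  # (external port, remote endpoint) pairs opened by outbound traffic

    def send(self, remote):
        """Inner host sends to `remote`; return the external source endpoint used."""
        # A symmetric NAT allocates a new external port per destination; cone NATs reuse one.
        key = remote if self.kind == "symmetric" else "any"
        if key not in self._mappings:
            self._mappings[key] = next(self._ports)
        ext_port = self._mappings[key]
        self._allowed.add((ext_port, remote))
        return (self.public_ip, ext_port)

    def accepts(self, ext_port, remote):
        """Would an inbound packet from `remote` to `ext_port` be let through?"""
        if self.kind == "full_cone":
            return ext_port in self._mappings.values()
        return (ext_port, remote) in self._allowed

def hole_punch(nat_a, nat_b, server=("198.51.100.1", 3478)):
    # 1. Both peers contact a rendezvous server, which observes their external endpoints.
    a_seen = nat_a.send(server)
    b_seen = nat_b.send(server)
    # 2. Each peer sends to the endpoint the server reported for the other one.
    a_src = nat_a.send(b_seen)
    b_src = nat_b.send(a_seen)
    # 3. If either packet gets in, the receiver can reply to its source through the
    #    pinhole the sender's NAT already opened, so one direction is enough.
    return nat_b.accepts(b_seen[1], a_src) or nat_a.accepts(a_seen[1], b_src)

for kind_a, kind_b in [("port_restricted", "port_restricted"),
                       ("full_cone", "symmetric"),
                       ("port_restricted", "symmetric"),
                       ("symmetric", "symmetric")]:
    ok = hole_punch(Nat("203.0.113.1", kind_a), Nat("203.0.113.2", kind_b))
    print(f"{kind_a} <-> {kind_b}: {'connected' if ok else 'needs a relay'}")
```

Real NATs are messier than this (timeouts, port prediction, hairpinning), so treat the outcome per pair as a rule of thumb rather than a guarantee.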

How bit torrent works in private network?

Can somebody please explain how BitTorrent works from the perspective of a host in a private network, given that its IP address is not visible outside the private network? Is port forwarding necessary for BitTorrent to work?

Not really. The basic protocol still works if a client cannot accept incoming connections; it can rely on just outgoing connections. Of course, if several peers are not accepting incoming connections, none of them can connect directly to each other, and that's a bad thing for those peers and for the whole swarm. The number of unreachable (but active) peers is significant in practice, though very hard to measure precisely.
Also, consider that your client will be advertising itself as available, so other peers will be wasting connection attempts to your client, which will be rejected by the NAT device (or they won't even really go anywhere, if the client is silly enough to advertise its private IP address).
So in short, it will work, but it's not a good thing.
For the UDP-based protocols (UDP tracker, DHT, µTP), hole-punching can be used (except from behind a symmetric NAT), so typically no forwarding is required for those (as long as the client supports hole-punching).
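On the point about a client advertising its private IP address: a client can check whether an address is worth advertising before announcing it. A minimal Python sketch using the standard ipaddress module follows; the function name and the sample addresses are just for illustration.

```python
import ipaddress

def worth_advertising(address):
    """True if hosts elsewhere on the Internet could route to this address."""
    return ipaddress.ip_address(address).is_global

# Private LAN ranges and carrier-grade NAT space are not reachable from outside.
for candidate in ("192.168.1.20", "10.0.0.5", "100.64.3.9", "8.8.8.8"):
    print(candidate, worth_advertising(candidate))
```

A globally routable address is necessary but not sufficient: a firewall or NAT in front of it can still reject incoming connections.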