What's so hard about p2p Hole Punching? [closed] - networking

Closed. This question is not about programming or software development. It is not currently accepting answers.
Closed last month.
I am trying to experiment with some p2p networking. While doing some research, I learnt that one of the biggest obstacles is "What if a client is behind a NAT/firewall?". Later on I discovered hole punching, but also that it is not always guaranteed to work.
As far as I understand it, I don't see why it might fail. This is what I know so far:
Based on the diagram above, this is how I understand how a successful connection can be established.
Alice joins the network (1) by creating a connection to a directory server. When this happens, Alice's NAT creates a mapping from her public ip to her local ip.
The directory server receives the connection and stores Alice's public ip:port in the directory.
Bob does the same (2): he joins the network and publishes his ip:port in the directory.
Alice wants to communicate with Bob, so she looks up Bob's ip:port in the directory. (3)
Alice sends data to Bob's ip:port, which she got from the server. (5)
Since Bob also has a mapping from his public ip:port to his local ip:port, the NAT simply forwards any data received on Bob's public ip:port to his computer.
The same works for Alice.
I hope I was clear in explaining what I understand. My question is, what is so hard or unreliable about this? I must clearly be missing something. Can you explain to me what it is?
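For reference, the directory step described above can be sketched in a few lines of Python. This is only a minimal sketch run over loopback: the `JOIN` message format and the function names are made up for illustration. The point it shows is that the server records the source address it observes on the packet, which behind a NAT is the NAT's external mapping rather than the client's local address.

```python
import socket

def serve_one_join(server_sock, directory):
    """Handle one 'JOIN <name>' datagram and record the sender's observed address."""
    data, observed_addr = server_sock.recvfrom(1024)
    verb, _, name = data.decode().partition(" ")
    if verb == "JOIN" and name:
        # observed_addr is the source ip:port the packet arrived with.
        # Behind a NAT this is the external mapping, not the local address.
        directory[name] = observed_addr
        reply = f"OK {observed_addr[0]}:{observed_addr[1]}"
        server_sock.sendto(reply.encode(), observed_addr)
    return directory

def demo():
    """Run one registration over loopback (no NAT involved, so addresses match)."""
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    alice = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        server.bind(("127.0.0.1", 0))
        alice.bind(("127.0.0.1", 0))
        server.settimeout(2)
        alice.settimeout(2)
        alice.sendto(b"JOIN alice", server.getsockname())
        directory = serve_one_join(server, {})
        reply, _ = alice.recvfrom(1024)
        return directory, reply.decode(), alice.getsockname()
    finally:
        server.close()
        alice.close()
```

Whether the address stored in the directory is usable by Bob is exactly what the answers below discuss.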

One problem is that the NAT mappings in Alice's NAT server may time out, either after a fixed time, or after a period of inactivity.
A second potential problem is that the NAT server could make the restriction that Alice's NAT mapping is only "good" for TCP connections established by Alice, or connections between Alice and the initial IP "she" connected to. (In other words, direct communication between Alice & Bob may be blocked.)
And so on.
The problem is that the behaviour of a NAT server is highly dependent on the managing organization's configuration and policy decisions. Many of these decisions could mean that your particular P2P usage pattern won't work reliably ... or at all.
So then is my whole idea about hole punching wrong?
No. It just means that it won't always work.
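The first problem above (mappings timing out after inactivity) can be illustrated with a toy model. This is a sketch under stated assumptions: the class name, the 30-second timeout and the simulated clock values are all hypothetical, and real NATs differ widely in their timers.

```python
class NatMappingTable:
    """Toy NAT table whose entries expire after idle_timeout seconds without traffic."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self.last_seen = {}  # internal (ip, port) -> time of last outbound packet

    def outbound(self, internal, now):
        # Any outbound packet creates the mapping or refreshes its timer.
        self.last_seen[internal] = now

    def is_alive(self, internal, now):
        seen = self.last_seen.get(internal)
        return seen is not None and now - seen <= self.idle_timeout


# Without keepalives the mapping is gone after the idle period.
nat = NatMappingTable(idle_timeout=30)
alice = ("192.168.1.10", 5000)
nat.outbound(alice, now=0)
expired_without_keepalive = not nat.is_alive(alice, now=45)

# A keepalive sent at t=20 refreshes the timer, so the mapping survives to t=45.
nat.outbound(alice, now=20)
alive_with_keepalive = nat.is_alive(alice, now=45)
```

In practice this is why peers send small periodic keepalive packets to the directory server and to each other.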

Possibly the biggest problem in NAT hole punching is a lack of port consistency. For your implementation to work, at least one of the two NATs must support it.
Port consistency is where the same (local ip, local port) is mapped to the same (external ip, external port) regardless of the target (destination ip, destination port). Without this, the port seen by the directory server is not helpful to the client since it will not be the same port the clients will need to talk to each other.
(Note that this is a weaker requirement than port preservation, where external port == local port.)
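To make the difference concrete, here is a toy model of the two mapping behaviours. It is only a sketch: the class names, the starting port 40000 and the example addresses (taken from documentation ranges) are assumptions, and real NATs pick ports in many different ways.

```python
import itertools

class ConsistentNat:
    """Port-consistent mapping: one external port per internal (ip, port)."""

    def __init__(self, first_port=40000):
        self._next_port = itertools.count(first_port)
        self._mappings = {}

    def external_port(self, internal, destination):
        # The destination is ignored: the mapping depends only on the internal endpoint.
        if internal not in self._mappings:
            self._mappings[internal] = next(self._next_port)
        return self._mappings[internal]


class SymmetricNat:
    """Destination-dependent mapping: a new external port per (internal, destination)."""

    def __init__(self, first_port=40000):
        self._next_port = itertools.count(first_port)
        self._mappings = {}

    def external_port(self, internal, destination):
        key = (internal, destination)
        if key not in self._mappings:
            self._mappings[key] = next(self._next_port)
        return self._mappings[key]


alice = ("192.168.1.10", 5000)
directory = ("203.0.113.1", 3478)
bob = ("198.51.100.7", 6000)

cone = ConsistentNat()
port_seen_by_directory = cone.external_port(alice, directory)
port_seen_by_bob = cone.external_port(alice, bob)        # same port: the directory entry is useful

sym = SymmetricNat()
sym_port_seen_by_directory = sym.external_port(alice, directory)
sym_port_seen_by_bob = sym.external_port(alice, bob)     # different port: the directory entry is stale
```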
Unfortunately for P2P communication, most NATs are some flavor of Symmetric NAT and do not have consistent port mappings.

Firewalls are typically stateful. Bob (2) establishing communications with the outside directory server sets up a rule in his NAT server that allows Bob and the directory server to communicate. When the NAT server sees packets from Alice, it rejects/drops them because it hasn't seen Bob establish communications with Alice.
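A toy model of that stateful behaviour, as a sketch only (the class name and the example addresses are hypothetical): inbound packets are accepted only from remote endpoints the inside host has already sent to, which is why Bob also has to send something towards Alice before her packets get through.

```python
class StatefulFilter:
    """Accept inbound packets only from remotes the inside host has already sent to."""

    def __init__(self):
        self._allowed = set()

    def outbound(self, remote):
        # Sending to a remote endpoint opens the "hole" for replies from it.
        self._allowed.add(remote)

    def accepts_inbound(self, remote):
        return remote in self._allowed


directory = ("203.0.113.1", 3478)
alice_public = ("198.51.100.4", 41000)

bob_nat = StatefulFilter()
bob_nat.outbound(directory)                              # Bob registers with the directory
blocked_before_punch = not bob_nat.accepts_inbound(alice_public)

bob_nat.outbound(alice_public)                           # Bob sends a packet towards Alice
allowed_after_punch = bob_nat.accepts_inbound(alice_public)
```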

First of all, there are two types of hole punching:
1. UDP hole punching
2. TCP hole punching
The UDP hole punching success rate is 82%.
The TCP hole punching success rate is 64%.
I have done many UDP hole punching experiments and almost all of them were successful, but the same was not true of TCP hole punching.
The only reason TCP hole punching fails is the router's NAT table. I will try my best to explain:
Client 1 --> connect(client2) --Internet-- connect(client1) <-- Client 2
Now if Client 1's SYN packet reaches Client 2's router before Client 2's own SYN packet has been sent out, the router of Client 2 can do one of two things:
1. Send an RST packet back to Client 1, as if the connection were refused.
2. Drop the packet immediately and send no reply to Client 1.
If this happens, no connection will be established.
The only solution I can suggest is to keep the time difference between the connect calls of the two clients very small, on the order of milliseconds.
TIP: if you are testing on a local network, disable your firewall.
For Ubuntu users: sudo ufw disable
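A rough sketch of the "both sides call connect at nearly the same time" idea in Python. The function name and the retry parameters are assumptions for illustration. Each attempt binds to the same local address so the NAT mapping stays the same between retries; real implementations often keep a listening socket on that port as well, which is left out here.

```python
import socket
import time

def tcp_punch_connect(local_addr, peer_addr, attempts=20, delay=0.05, timeout=1.0):
    """Repeatedly connect() from a fixed local address; return the socket or None."""
    for _ in range(attempts):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Allow rebinding the same local port on the next attempt.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.settimeout(timeout)
        try:
            sock.bind(local_addr)
            sock.connect(peer_addr)
            return sock
        except OSError:
            # Refused (RST) or timed out (dropped): close, wait a little, retry.
            sock.close()
            time.sleep(delay)
    return None
```

Both clients would call this at roughly the same moment, each with the other's public ip:port as `peer_addr`. A short `delay` keeps the retries close together, which is the timing requirement described above.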

I think understanding how hole punching really works helps to see why it may fail.
It was first explored by Dan Kegel, read here. In this technique, both peers are generally assumed to be behind two different NATs. Both peers must be connected to an intermediate server called a Rendezvous/Signaling server; there are many well-known rendezvous protocols, and SIP (RFC 3261) is the most famous one. As they are connected to the server, they get to know each other’s public transport addresses through it. The public transport addresses are allocated by the NATs in front of them. The hole punching process in short:
Peer 1 and Peer 2 first discover their public transport addresses using a STUN Binding Request as described in RFC 8489.
Using a signaling/messaging mechanism, they exchange their public transport addresses. SDP (Session Description Protocol)[16] may be used to complete this. The public transport addresses of Peer 1 and Peer 2 will be assigned by NAT 1 and NAT 2 respectively.
Then both peers attempt to connect to each other using the received public transport addresses. With most NATs, the first message will be dropped by the NAT, except with a Full Cone NAT. But the subsequent packets will penetrate the NAT successfully, as by this time the NAT will have a mapping.
The NAT can be of any type. If it is, let's say, a Symmetric NAT (RFC 8489), hole punching won’t be possible, because only an external host that has received a packet from an internal host can send a packet back. If this is the case, then the only possible way is relaying.
Learn more about the current state of P2P communication: read RFC 5128.
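To give an idea of what the STUN step looks like on the wire, here is a minimal sketch that builds a Binding Request and decodes an IPv4 XOR-MAPPED-ADDRESS attribute following the RFC 8489 layout. The function names are my own. The sketch does not check that the transaction ID in the response matches the request, and it ignores IPv6, authentication and retransmission, so treat it as an illustration rather than a usable client.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001
BINDING_SUCCESS = 0x0101
XOR_MAPPED_ADDRESS = 0x0020

def build_binding_request(transaction_id=None):
    """20-byte STUN header: type, length (no attributes), magic cookie, transaction id."""
    transaction_id = transaction_id or os.urandom(12)
    return struct.pack("!HHI12s", BINDING_REQUEST, 0, MAGIC_COOKIE, transaction_id)

def parse_xor_mapped_address(message):
    """Return (ip, port) from an IPv4 XOR-MAPPED-ADDRESS attribute, or None."""
    if len(message) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", message[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[offset:offset + 4])
        value = message[offset + 4:offset + 4 + attr_len]
        if attr_type == XOR_MAPPED_ADDRESS and attr_len >= 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            # Port is XORed with the top 16 bits of the cookie, address with all 32.
            port = xport ^ (MAGIC_COOKIE >> 16)
            ip = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
            return ip, port
        # Attribute values are padded to a multiple of 4 bytes.
        offset += 4 + attr_len + (-attr_len % 4)
    return None
```

A peer would send the request over UDP to a STUN server of its choice and pass the reply to `parse_xor_mapped_address` to learn its public transport address.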

Related

Connecting P2P over NAT?

I started to explore the option of connecting with others using a p2p connection, so I coded a simple socket program in Java for Android devices in which users can share simple messages p2p (I didn't have any idea about NAT then). I got to know about NAT, so I now need to establish a TCP connection with another user that uses a server for discovery while the payload is transferred p2p. I have also looked at XMPP (a very good and detailed explanation of how the protocol works is here) and UPnP, but I don't know how to implement them.
Another interesting question that arises is of BitTorrent because they can work on any device and even behind a NAT. I am not able to get any explanation of how BitTorrent works.
I have researched a lot but I am stuck.
My questions are:
1. A detailed explanation of BitTorrent (like here, not how torrents work) and how it is able to work around NAT?
2. Is there a way to make a NAT entry programmatically?
3. Is socket programming sufficient for p2p?
4. How difficult is it to create your own protocol, and how can I build one?
5. Suppose two devices D1 and D2 want to communicate p2p and they know each other's IP. D1 sends a request to D2 that can't get through D2's NAT, but an entry should have been created in D1's NAT. So when D2 tries to send something, D1's NAT should find an entry with D2's IP. Then why is the packet not allowed through?
"Another interesting question that arises is of BitTorrent because they can work on any device and even behind a NAT. I am not able to get any explanation of how BitTorrent works."
This statement looks like you assume that bittorrent needs full connectivity to operate.
That is incorrect.
Behind a NAT device you will still be able to establish outgoing TCP connections. That is generally sufficient for BitTorrent as long as there are other, non-NATed (or NATed but properly port-forwarded) clients in the network that can accept incoming connections.
NAT has no impact on the flow direction of the data because connections are bi-directional once they are established. It is only problematic for the initial connection setup.
This works perfectly fine for BitTorrent because BitTorrent does not care which specific node you get your data from.
Better connectivity generally does improve performance, though.
If the identity of the node matters or one-on-one transfers are an important use case, then other p2p protocols usually attempt NAT traversal first and, if that fails, rely on third-party nodes relaying traffic between the nodes that cannot connect to each other directly.
Additionally, IPv6 support will become essential in the future to maintain end-to-end connectivity, because more and more ISPs are starting to roll out carrier-grade NAT for IPv4 while IPv6 will remain non-NATed.
One thing that needs to be clear is that 100% P2P connectivity between all types of NAT is impossible right now. There is no practical way to establish P2P connectivity between **Symmetric and Symmetric/PRC (port-restricted cone) NAT. In this scenario the connection is established through a relay server called TURN.
I am answering from your 2nd question onward because I don't know much about the first one.
2) Yes. You can send a packet through your NAT and there will be a mapping from your internal IP:port to your NAT's external IP:port. You can learn this external IP:port by sending a STUN request. Note that this technique doesn't work for Symmetric NAT.
3) Yes, socket programming is sufficient for p2p.
4) Why do you need a new protocol when several already exist? The ICE protocol is the best available today for NAT traversal, and I don't think it was easy to create. UPnP and NAT-PMP are really vulnerable in terms of security.
5) I think what happens is that a NAT usually blocks unknown packets coming to it. So when D1 sends a packet to D2, D2's NAT blocks all packets incoming from D1's IP:port. That is why connection establishment fails. You have to employ a hole punching technique for D1 and D2 to successfully establish P2P connectivity.
**By symmetric NAT I mean symmetric NAT with random port allocation.
There is a paper on "Peer-to-Peer Communication Across Network Address Translators" which describes the UDP hole punching method and extends it to be used over TCP as well.
Of course, you will always need a relay server for the cases where hole punching is not supported.
1. Recent versions of BitTorrent use µTP, which is layered above UDP, not TCP. µTorrent uses a private extension (ut_holepunch) that performs UDP hole punching; most other implementations don't bother (with the notable exception of Tixati).
2. Some NAT routers accept port forwarding requests using either the UPnP or the NAT-PMP protocol. Whether this is supported depends on the particular brand of router and its configuration.
3. Yes, socket programming is enough for P2P.
4. Difficult to answer. I suggest that you read the wikified and annotated BitTorrent specification for a start.
5. Yes, this is the principle behind UDP hole punching.
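For completeness, the UDP hole punching principle from the last point can be sketched with plain sockets. This is a minimal illustration, not a full implementation: the function name, the `b"punch"` payload and the retry values are made up, and it assumes both peers already learned each other's public ip:port from a server.

```python
import socket

def udp_punch(sock, peer_addr, attempts=10, interval=0.2):
    """Send probes to the peer's public address until a datagram from it arrives."""
    sock.settimeout(interval)
    for _ in range(attempts):
        # Each outbound probe creates or refreshes the mapping in our own NAT,
        # so that the peer's packets are let in.
        sock.sendto(b"punch", peer_addr)
        try:
            _, sender = sock.recvfrom(1024)
        except OSError:
            # Timeout or an ICMP-triggered error: just try again.
            continue
        if sender == peer_addr:
            return True
    return False
```

Both peers run the same loop against each other at about the same time. The first probes may be dropped by the remote NAT, but later ones get through once each side has sent at least one packet.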

Are there security measures against udp hole punching? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 7 years ago.
I want to establish a UDP communication between two peers, say Alice and Bob. Alice is behind a port restricted cone NAT (so that the same internal port gets mapped to the same external port even if the destination is changed), while Bob is behind a symmetric NAT (which means that the external port will change every time a new destination is chosen regardless of the internal port, thus making the external port unpredictable). I have a server in between and I want to make a UDP hole punch.
I implemented the following strategy:
Bob opens a large number of ports and from all of them sends a packet to Alice's external port (he gets to know it through the server).
Alice sends packets to Bob's NAT at random ports until the connection is established.
Having two NATs of those types at hand, I did some experiments. Bob opens 32 ports, and Alice sends 64 packets every 0.1 seconds. The connection is usually established within 1 or 2 seconds, which is more than suitable for my needs.
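As a back-of-the-envelope check of those numbers, the chance of a hit per round can be estimated as follows. This is a sketch under strong assumptions that are mine, not measured: Bob's NAT is taken to pick external ports uniformly at random from 64512 ports (1024 to 65535), and Alice's guesses are treated as independent draws.

```python
def hit_probability(open_ports, guesses, port_space=64512):
    """Chance that at least one of `guesses` random ports hits one of `open_ports` mappings."""
    miss_one = 1 - open_ports / port_space
    return 1 - miss_one ** guesses

def success_within(rounds, per_round):
    """Chance of at least one hit within the given number of independent rounds."""
    return 1 - (1 - per_round) ** rounds

# 32 open ports on Bob's side, 64 guesses by Alice per 0.1 s round.
per_round = hit_probability(open_ports=32, guesses=64)
after_one_second = success_within(rounds=10, per_round=per_round)
after_two_seconds = success_within(rounds=20, per_round=per_round)
```

Under these assumptions a single round succeeds roughly 3% of the time. If the NAT allocates from a narrower range than assumed here, the per-round chance is correspondingly higher.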
However, I was wondering if I could get in trouble with some strict NAT routers or firewalls. For example, could it happen that a router won't allow an internal peer to open 32 ports? Or (and this sounds somewhat more likely) could it happen that a router that sees a lot of incoming packets on random ports, all of which get dropped, will blacklist the IP and drop all its packets for some time? I read that this can sometimes happen in the case of a DoS attack, but my packet rate is something like 4 to 6 orders of magnitude lighter than a DoS attack.
I am asking about reasonable network configuration: I am pretty sure that in principle it is possible to setup a firewall to behave in that way. I will be targeting mainly users that lie behind standard home connections, so my main target is common internet providers that use NATs.
It's an interesting question.
First of all, I'm not sure anyone has the exact answer you're looking for. Different networks use different equipment and different configuration. Two ISPs can use ten different vendors for their routers, firewalls, NATs, intrusion detection equipment, DPI equipment etc; not to mention the number of possible configurations all of this equipment has.
And while commercial and corporate networks are bad enough, home networks are even worse. Here there are even more vendors selling modems, NAT boxes, and various software that affects network connectivity (such as firewalls and anti-viruses). All of which is in the hands of users who aren't technically savvy and leave it with the default settings, or worse.
Moreover, in both home and commercial networks there might be several layers of NAT. I know of a company that has a NAT for each lab (to isolate it from other labs and the R&D network). Each lab is then connected to the R&D NAT (to isolate it from other departments), which in turn is connected to the company-wide NAT, which, by the way, is also heavily firewalled. Add to that a possible ISP-level (carrier grade) NAT, and you're looking at up to 4 layers of NAT. Hopefully this is an extreme example, but two layers of NAT are quite common nowadays with home NAT and carrier grade NAT.
Given that, how likely it is for a random network to consider this behavior suspicious and limit it? Frankly, I don't know for sure and I don't think anyone else does with a high degree of certainty.
Despite that, my educated guess is that sane default configurations of communication equipment (NATs, routers, etc) should not block such behavior. After all, many applications open several ports; not to mention the fact that the NAT has no way of knowing that the IP sending this traffic isn't itself a NAT device with dozens of computers behind it - each of them with several open ports.
I also guess that simple firewalls should be fine with it as long as UDP itself isn't blocked and the usage of the various ports is allowed. Firewalls that attempt to block port scanning, and anti-DDoS equipment, however, might pose a problem, as this traffic might seem suspicious to them; it depends on the configuration and implementation details of such equipment and software. So unfortunately, the only way to tell how your strategy will behave in the real world is to try it out on a variety of different networks.
Second, I'd like to say a few words about your hole punching strategy. If both Alice and Bob have a shared server, and Alice is behind a cone NAT, I don't see the point in your strategy. A cone NAT is the simplest NAT to overcome. If you want Alice to be able to connect to Bob (which is tricky since he's behind a symmetric NAT), all you really have to do is to get Bob to connect to Alice upon Alice's request.
To do that, both Alice and Bob should always have a long-lasting TCP or UDP connection to the server. The connection shouldn't carry any data for the most part, and should be just kept alive once in a while.
When Alice wants to connect to Bob, she just opens a port (say port X), and connects from that port to the server. The server sees Alice's external port that corresponds to port X, say port Y. At this point, Alice informs the server that she would like Bob to connect to her. Since Bob is connected to the same server, the server informs Bob that he should connect to Alice at port Y. This should establish a connection between them without the need for any guessing.
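The server-side bookkeeping for this "ask Bob to connect back" approach could look roughly like the following. It is an in-memory sketch only: the class, the method names and the `CONNECT_TO` instruction are hypothetical, the addresses come from documentation ranges, and the actual transport (the long-lived connections to the server) is left out.

```python
class RendezvousServer:
    """In-memory stand-in for the shared server; real network transport is omitted."""

    def __init__(self):
        self.observed = {}  # peer name -> (external ip, external port) seen by the server
        self.outbox = {}    # peer name -> instructions queued on its keep-alive connection

    def register(self, name, observed_addr):
        self.observed[name] = observed_addr
        self.outbox.setdefault(name, [])

    def request_reverse_connect(self, requester, target):
        """Ask `target` to open a connection to `requester`'s observed address."""
        addr = self.observed[requester]
        self.outbox[target].append(("CONNECT_TO", requester, addr))
        return addr


server = RendezvousServer()
server.register("bob", ("203.0.113.50", 50123))
# Alice connects from local port X; the server observes external port Y (41000 here).
server.register("alice", ("198.51.100.4", 41000))
port_y_addr = server.request_reverse_connect(requester="alice", target="bob")
```

Bob then sends to that address from behind his symmetric NAT. Since Alice's mapping does not depend on the destination, port Y is the right port for him to use.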

NAT traversal while connecting mobile over http

Would anyone know the answer to this?
I was reading Practical JXTA II (also at http://www.scribd.com/doc/47538921/Practical-JXTA-II). I'm confused by the statement on page 92 second paragraph concerning establishing communication with a peer behind a NAT : "such peers remain inaccessible ...until either ... or b) inaccessible peer establishes a connection to the remote peer spontaneously."
This seems to imply that the NAT translation of IPv4 local addresses to public addresses is always the same. If the router is mapping a large set of addresses to a smaller set of public addresses wouldn't the results vary? Once the HTTP response is received then the session would be terminated and someone else could use that public IP, right? Once the HTTP session is over the router would no longer record the mapping used.
I'm trying to implement an idea for Web Services where an aspect of the application is P2P (I need both nodes to act as both client and server at times). The central server can have a DNS registered address but the various potentially mobile nodes might be behind NATs etc. After reading this I thought I would be ok if I had the nodes behind NATs establish a connection when they start up, telling the central DNS register node their public address, but now I'm thinking that address would likely change.
Here is my understanding of what Jérôme meant.
Say peer A is WAN-visible and peer B is behind a firewall. Peer A can send data to Peer B when either:
Peer A and Peer B both establish outbound connections to a relay server. Peer A sends data to the relay on its outbound request, and the relay forwards it to Peer B on the synchronous response (of Peer B's connection to the relay).
Peer B establishes a connection to Peer A, and Peer A sends data back to Peer B on the synchronous response. A "reverse-invoke" mechanism.
With respect to JXTA, a peer publishes to the network a local address and optionally a WAN address (an address is the couple host+port). There can only be one WAN address per peer if you want to establish a direct connection to that peer using NAT.
Having a central server is a bad idea in a P2P network: you create a single point of failure, which is exactly what P2P networks excel at avoiding.
Yet, as you hint, there is still a need to maintain a registry of "addresses/peer locations". This registry has to be distributed, however. This would need a book, but here are two approaches:
a Distributed Hash Table (DHT) on the nodes: every node holds and shares a copy of part of the registry. JXTA has such a mechanism, but check Kademlia on Wikipedia for a very successful algorithm.
a Global Index Nodes approach (I believe Skype-like): a limited number of dedicated peers/nodes that hold the registry using DHTs or other replication algorithms. The peers connect to the GINs for addresses using firewall friendly protocols (HTTP) and the GINs talk to each other using fast socket-to-socket connections (check Hazelcast for a quick way of implementing GINs).
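To give a flavour of the DHT approach, Kademlia decides which nodes are responsible for a key by comparing 160-bit identifiers with an XOR distance. A minimal sketch of that one idea follows. The function names are mine, SHA-1 is used here only because it yields 160 bits, and routing tables, buckets and the actual network lookups are all left out.

```python
import hashlib

def node_id(name):
    """Derive a 160-bit identifier from a name using SHA-1."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

def closest_nodes(key, known_ids, k=2):
    """Return the k known identifiers with the smallest XOR distance to the key."""
    return sorted(known_ids, key=lambda nid: nid ^ key)[:k]


# Small identifiers make the XOR distances easy to check by hand.
known = [0b1000, 0b0010, 0b1011, 0b0111]
responsible = closest_nodes(0b1010, known, k=2)

# With real identifiers, the registry entry for a peer would be stored
# on the nodes whose ids are closest to the id of that peer.
peers = [node_id(name) for name in ("peer-a", "peer-b", "peer-c")]
holders = closest_nodes(node_id("some-registry-key"), peers, k=2)
```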

Creating a TCP connection between 2 computers without a server

2 computers are in different subnets.
Both are Windows machines.
There are 2-5 IGMP-ready routers between them.
They can reach each other over multicast (they have joined the same multicast group and they know about each other's existence).
How to establish a reliable TCP connection between them without any public server?
Programming language: C++, WinAPI
(I need a TCP connection to send some big critical data, which I can not entrust to UDP)
You haven't specified a programming language, so this whole question may be off-topic.
Subnets are not the problem. Routability is the problem. Either there is routing set up or there isn't. If they are, for example, both behind NAT boxes, then you're at the mercy of the configuration of the nat boxes. If they are merely on two different subnets of a routed network, it's the job of the network admin to have set up routing. So, each has an IP address, and either can address the other.
On one machine, you are going to create a socket, bind it to some port of your choice, and listen. On the other, you will connect to the first machine's IP + the selected port.
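As an illustration of that listen/connect pattern, here is a short sketch. It is written in Python for brevity, but the same bind, listen, accept and connect calls exist in Winsock. The function names are made up, and the loopback address stands in for the routable IP of the listening machine.

```python
import socket

def start_listener(host="127.0.0.1", port=0):
    """Create a TCP socket, bind it to the chosen address and start listening."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, port))  # port 0 lets the OS pick a free port
    srv.listen(1)
    return srv

def send_payload(addr, payload):
    """Connect to the listener's IP and port and send the whole payload."""
    with socket.create_connection(addr, timeout=5) as conn:
        conn.sendall(payload)

def receive_all(srv):
    """Accept one connection and read until the sender closes it."""
    conn, _ = srv.accept()
    chunks = []
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

The listening machine calls `start_listener` and `receive_all`; the other machine calls `send_payload` with the listener's address. For large transfers the two sides would of course run concurrently on the two machines.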
edit
I'm going to try again, but I feel like there's a giant conceptual gap here.
Once upon a time, TCP/IP was invented. In the original conception, every item on the network had an IPv4 address, and every machine could reach every other machine, via routing, except for machines in the 'private' address space (10.x, etc).
In the very early days, the only 'subnets' were 'class A, class B, class C'. Later the idea of subdividing a network via bitmasks was added. The concept of 'subnet' is just a way of describing a piece of network in which all the hosts can deliver packets to each other by one hop over some transport or another. In a properly configured network, this is only of concern to operating system drivers. Ordinary programs just address packets over the network and they arrive.
The implementation of this connectivity was always via routing protocol. If you have a (physical) ethernet A over here, and a (physical) ethernet B over there, connected by some sort of point-to-point link, the machines on A need to know where to send packets for B. Or, to be exact, they need to know where to send 'not-A' packets, and whatever they send them to needs to know where to send 'B' packets. In simple cases, this is arranged via explicit configuration: routing rules stuffed into router boxes or even computers with multiple physical interfaces. In more complex cases, routing boxes intercommunicate via protocols like EGP or BGP or IGMP to learn the network topology.
If you use the Windows 'route' command, you will see the 'default route' that the system uses to send packets that need to leave the local subnet. It is generally the address of the router box responsible for moving information from the local subnet to everywhere else.
The whole goal of this routing is to arrange that a packet sent from a.b.c.d to e.f.g.h will get there. TCP is no different than UDP, except that you can't get there by multicast or broadcast: you need to know the exact address of your correspondent.
DNS was invented to allow hosts to learn each other's IP addresses without having human beings send them around in email messages.
All this stops working when people start using NAT and firewalls to turn off routing. The whole idea of NAT is that the computers behind the NAT box are not addressable at all. They all appear to have one IP address. They can send stuff out, but they can only receive stuff if the NAT box has gone to extra trouble to map them a port.
From your original message, I sort of doubt that NAT is in use here. I just don't understand your comment 'I don't have access to the network.' You say that you've sent UDP packets here and there. So how did you do that? What addresses did you use?

Direct TCP/IP connections in P2P apps

From one of Joel's posts on Copilot:
Direct Connect! We’ve always done everything we can to make sure that Fog Creek Copilot can connect in any networking situation, no matter what firewalls or NATs are in place. To make this happen, both parties make outbound connections to our server, which relays traffic on their behalf. Well, in many cases, this isn’t necessary. So version 2.0 does something rather clever: it sets up the initial connection through our servers, so you get connected right away with 100% reliability. But then once you’re all connected, it quietly, in the background, looks for a way to make a direct connection. If it can’t, no big deal: you just keep relaying through our server. If you can make a direct peer-to-peer connection, it silently shifts your data onto the direct connection. You won’t notice anything except, probably, much faster communication.
How do they change the server connection to a P2P connection?
It's pretty tricky and interesting. I'm sure I have some details wrong, but the overview is this:
The programs can already talk to each other through Joel's server, so they can exchange information with each other and Joel's server. Further, Joel has their external IP addresses, and they give Joel information about their internal IP addresses.
They decide to try this hole punch technique. Computer A initiates a TCP connection with Computer B using B's external IP address. It won't go through, but what it does is tell A's router that it needs to allow incoming packets from B on a given port.
Computer B does the same thing, but its message gets through to A since A's router opened a port/IP combination that matches what B sent (there's some port magic that happens here; this is non-trivial, but doable).
B's router remembers that B initiated a connection with A on a given port and IP, and so A's packets now flow into B past their router correctly as well.
So it's actually pretty straightforward, but the implementation has details, especially regarding how ports are given to new TCP connections, how NAT routers typically deal with TCP requests, and how they map to external ports. These details are the interesting, and difficult, bit.
-Adam
I believe the simple version is that they drop the server connection and replace it with the P2P connection.
Something along the lines of:
Machine1 connects to copilot's servers.
Machine2 subsequently connects, and they begin screen sharing.
Machine2 opens a port intended for Machine1 to connect to.
Machine1 tries to connect to the now open port on Machine2.
If this connection is established:
The connection to copilot's servers is severed.
Data is instead transferred over the direct (P2P) connection between the two machines.
There is a technique called "hole punching" that works well with "cone" NATs (cone is a technical family of routers). It is not a 100% reliable technique; today it works with UDP on about 80% of routers.
There are some library implementations for performing hole punching: STUN (Wikipedia)

Resources