Practical NAT traversal for reliable network connections - networking

I've seen and read a lot of similar questions, and the corresponding Wikipedia articles (NAT traversal, STUN, TURN, TCP hole punching), but the overwhelming amount of information doesn't really help me with my very simple problem:
I'm writing a P2P application, and I want two users of my application behind NAT to be able to connect to each other. The connection must be reliable (comparable to TCP's reliability) so I can't just switch to UDP. The solution should work on today's common systems without reconfiguration. If it helps, the solution may involve a connectible 3rd-party, as long as it doesn't have to proxy the entire data (for example, to get the peers' external (WAN) IP addresses).
As far as I know, my only option is to use a "reliable UDP" library + UDP hole punching. Is there a (C/C++) library for this? I found enet in a related question, but it only takes care of the first half of the solution.
Anything else? Things I've looked at:
Teredo tunnelling - requires support from the operating system and/or user configuration
UPnP port forwarding - UPnP isn't present/enabled everywhere
TCP hole punching seems to be experimental and only work in certain circumstances
SCTP is even less supported than IPv6. SCTP over UDP is just fancy reliable UDP (see above)
RUDP - nearly no mainstream support
From what I could understand of STUN, STUNT, TURN and ICE, none of them would help me here.

ICE collects a list of candidate IP/port targets to which to connect. Each peer collects these, and then each runs a connectivity check on each of the candidates in order, until either a check passes or a check fails.
When Alice tries to connect to Bob, she somehow gets a list of possible ways - determined by Bob - she may connect to Bob. ICE calls these candidates. Bob might say, for example: "my local socket's 192.168.1.1:1024/udp, my external NAT binding (found through STUN) is 196.25.1.1:4454/udp, and you can invoke a media relay (a middlebox) at 1.2.3.4:6675/udp". Bob puts that in an SDP packet (a description of these various candidates), and sends that to Alice in some way. (In SIP, the original use case for ICE, the SDP's carried in a SIP INVITE/200/ACK exchange, setting up a SIP session.)
ICE is pluggable, and you can configure the precise nature/number of candidates. You could try a direct link, followed by asking a STUN server for a binding (this punches a hole in your NAT, and tells you the external IP/port of that hole, which you put into your session description), and falling back on asking a TURN server to relay your data.
One downside to ICE is that your peers exchange SDP descriptions, which you may or may not like. Another is that TCP support's still in draft form, which may or may not be a problem for you. [UPDATE: ICE is now officially RFC 6544.]
Games often use UDP, because old data is useless. (This is why RTP usually runs over UDP.) Some P2P applications often use middleboxes or networks of middleboxes.
IRC uses a network of middleboxes: IRC servers form networks, and clients connect to a near server. Messages from one client to another may travel through the network of servers.
Failing all that, you could take a look at BitTorrent's architecture and see how they handle the NAT problem. As CodeShadow points out in the comments below, BitTorrent relies on reachable peers in the network: in a sense some peers form a network of middleboxes. If those middleboxes could act as relays, you'd have an IRC-like architecture, but one that's set up dynamically.

I recommend libjingle as it is used by some major video game companies which heavily relies on P2P network communication. (Have you heard about Steam? Vavle also uses libjingle , see the "Peer-to-peer networking" session in the page: https://partner.steamgames.com/documentation/api)
However, the always-work-solution would be using a relay server. Since there is no "standard" way to go through NAT, you should have this relay server option as a fall-back strategy if a connection has to be always established between any peers.

Related

Why does WebRTC needs ICE protocol to operate?

As far as I understand, ICE protocol is used for discovering the nodes/devices from the end-user device to "the outside".
I don't understand why it's needed. Isn't packet-routing is the responsibility of network devices like routers and switches? They should find the shortest path from the gateway to the end-user device (Actually, routers remembers those routes they previously discovered).
Moreover, NAT protocol is used to convert from an "internal ip" to "external ip" and vice-versa.
So again,
Why does the other user needs to be familiar with my internal network setup?
NAT is a kludge, put in place to try to conserve IPv4 addresses until IPv6 becomes ubiquitous, and it breaks the end-to-end connectivity which is the promise of IP. Because of that, some things don't work correctly through NAT. There are various kludges to work around the NAT kludge, and ICE is part of that. This is explained in RFC 5245, Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols:
Introduction
RFC 3264 [RFC3264] defines a two-phase exchange of Session Description
Protocol (SDP) messages [RFC4566] for the purposes of establishment of
multimedia sessions. This offer/answer mechanism is used by protocols
such as the Session Initiation Protocol (SIP) [RFC3261].
Protocols using offer/answer are difficult to operate through Network
Address Translators (NATs). Because their purpose is to establish a
flow of media packets, they tend to carry the IP addresses and ports
of media sources and sinks within their messages, which is known to be
problematic through NAT [RFC3235]. The protocols also seek to create
a media flow directly between participants, so that there is no
application layer intermediary between them. This is done to reduce
media latency, decrease packet loss, and reduce the operational costs
of deploying the application. However, this is difficult to
accomplish through NAT. A full treatment of the reasons for this is
beyond the scope of this specification.
Numerous solutions have been defined for allowing these protocols to
operate through NAT. These include Application Layer Gateways (ALGs),
the Middlebox Control Protocol [RFC3303], the original Simple
Traversal of UDP Through NAT (STUN) [RFC3489] specification, and Realm
Specific IP [RFC3102] [RFC3103] along with session description
extensions needed to make them work, such as the Session Description
Protocol (SDP) [RFC4566] attribute for the Real Time Control Protocol
(RTCP) [RFC3605]. Unfortunately, these techniques all have pros and
cons which, make each one optimal in some network topologies, but a
poor choice in others. The result is that administrators and
implementors are making assumptions about the topologies of the
networks in which their solutions will be deployed. This introduces
complexity and brittleness into the system. What is needed is a
single solution that is flexible enough to work well in all
situations.
This specification defines Interactive Connectivity Establishment
(ICE) as a technique for NAT traversal for UDP-based media streams
(though ICE can be extended to handle other transport protocols, such
as TCP [ICE-TCP]) established by the offer/answer model. ICE is an
extension to the offer/answer model, and works by including a
multiplicity of IP addresses and ports in SDP offers and answers,
which are then tested for connectivity by peer-to-peer connectivity
checks. The IP addresses and ports included in the SDP and the
connectivity checks are performed using the revised STUN specification
[RFC5389], now renamed to Session Traversal Utilities for NAT. The
new name and new specification reflect its new role as a tool that is
used with other NAT traversal techniques (namely ICE) rather than a
standalone NAT traversal solution, as the original STUN specification
was. ICE also makes use of Traversal Using Relays around NAT (TURN)
[RFC5766], an extension to STUN. Because ICE exchanges a multiplicity
of IP addresses and ports for each media stream, it also allows for
address selection for multihomed and dual- stack hosts, and for this
reason it deprecates RFC 4091 [RFC4091] and [RFC4092].
Firewalls. They're typically configured to bounce any unsolicited traffic from the world wide web to you. They only approve of you initiating contact with a server, which only then is allowed to back-traffic to you, and that's pretty much it. Unless your friends all own static IPs (which few people can justify) this is a hostile environment for peer to peer communication.
ICE tries to solve this, by enumerating addresses and ports at which the other side may be reached, and trying to connect to these addresses, by initiating outbound requests on both ends, or if all else fails, falling back to communicating through a TURN server, if specified.
See this WebRTCHacks article for more on the problem.
Why does the other user needs to be familiar with my internal network setup?
Because the other user is sometimes on your internal network. e.g. LAN games.

How to detect router?

I am trying to write a program that scan an ip range and detect if an ip is address of a router or not.
Currently i used traceroute from my computer to all host in the network. However, i believe there must be some way to directly "ask" a host at an ip if it is a router or not?
by the way, do you know any program/ opensource already does this?
Routers are supposed to talk couple of protocols (actually a neat bunch) that regular IP nodes do not, and then there are some which are more common (i.e. even non-router nodes do).
Router-only protocols:
VRRP
IGRP / EIGRP
OSPF
BGP
RIP
You could do active-probing on those, i.e. send a packet (behaving as if you are another router, or an end-node) and checking to see what kind of response the router (if at all) sends.
Alternatively you could do passive-probing, like 'sniffing', i.e. watching out for the kind of IP packets being sent out by various nodes. There are some which are usually sent out by Routers only (again, mostly from the above list).
Common protocol, but that can actually tell you a lot:
SNMP (esply the unsecure one's like v1/v2, are easy to deal with, without having to establish a secure session)
Other ways:
Portscanning (actually can tell you a real lot), for example all routers have some management ports (although, often they are locked down due to security concerns)
What you want to do is often what many 'Network Management' software do, to "discover" capabilities / functionality of other nodes in the network. And, there isn't a single size-fits all solution. They use bunch of different methods, heuristics to finally figure out what the other node is.
Any node which is hopped to and not just an endpoint is a router. However, this doesn't allow you to detect routers with no reachable devices hooked up. (Any input as to whether my answer has merit would be great!)

Standard Chat Applications

Quick question: do most chat applications (ie. AIM, Skype, Oovoo) use peer to peer UDP exchange for talking to other users or an echoing TCP connection with a server? Or some combination in-between?
Traditionally, most applications used a TURN-like solution (i.e., communication via a server) to overcome NAT traversal issues. Since chat does not consume much bandwidth, servers could support thousands of communications.
But now that P2P has evolved and the NAT traversal issues are now well understood, some use direct UDP communication provided that the users' NAT allows this (i.e., STUN-like communication). They still need a central server to punch the hole though. Direct communication is also helpful when lots of data needs to be transmitted.
I believe it is fair to say that most modern frameworks use a combination of both.
when you need small fragments of data, such as text messaging, there's no need of using P2P. data can be transmitted from client1 to server, and from server back to the client2.
When you need to transfer data quickly between clients, in cases such as VoIP (voice over IP), or file transfer, you will use P2P.
A pretty standard IM protocol is XMPP. I know it's used by Google Talk, as well as a few other big names in chat.

What percentage of users are behind symmetric NATs, such that "p2p" traffic needs to be relayed?

We're implementing a SIP-based solution and have configured the setup to work with RTPProxy. Right now, we're routing everything through RTPProxy as we were having some issues with media transport relying on ICE. If we're not mistaken, a central relay server is necessary for relaying streaming data between two clients if they're behind symmetric NATs. In practice, is this a large percentage of all consumer users? How much bandwidth woudl we save if we implemented proper routing to skip the relay server when not necessary. Are there better solutions we're missing?
In falling order of usefulness:
There is a direct connection between the two endpoints in both directions. You just connect and you are essentially done.
There is a direct connection between the two endpoints in one direction. In that case you just connect via the right direction by trying both.
Both parties are behind NATs of some kind.
Luckily, UPnP works in one end, you can then upgrade the connection to the above scheme
UPnP doesn't work, but STUN does. Use it to punch a hole in the NAT. There are a couple of different protocols but the general trick is to negotiate via a middle man that coordinates the NAT-piercing.
You fall back to let another node on the network act as a relaying proxy.
If you implement the full list above, then you have to give up very few connections and don't have to spend much time on bandwidth utilization at proxies. The BitTorrent protocol, of which I am somewhat familiar, usually stops at UPnP, but provides a built-in test to test for connectivity through the NAT.
One really wonders why IPv6 did not get implemented earlier - this is a waste of programmers time.
Real world NAT types survey (not a huge dataset, though):
http://nattest.net.in.tum.de/results.php
According to Google, about 8% of the traffic has to be relayed: http://code.google.com/apis/talk/libjingle/important_concepts.html
A large percentage (if not the majority) of home users uses NAT, as that is what those xDSL/cable routers use to provide network access to the local network.
You can theoretically use UPnP to open ports and set-up forwarding rules on the router to go through the NAT transparently. Unfortunately (or fortunately, depending on who you are) many users disable UPnP as a matter of course on their router and may not appreciate having to add forwarding rules manually.
What you might be able to do (and what Skype does AFAIK) is to have some of the users that have clear network paths and enough bandwidth act as relay nodes. Apart from the routing and QoS issues, you would at least have to find some way to ensure the privacy of any relayed data from anyone, including the owner of the relay node. In addition, there might be legal issues to settle with this approach, apart from the technical ones.

Why is SCTP not much used/known

I recently checked out the book "UNIX Network Programming, Vol. 1" by Richards Stevens and I found that there is a third transport layer standard besides TCP and UDP: SCTP.
Summary: SCTP is a transport-level protocol that is message-driven like UDP, but reliable like TCP. Here is a short introduction from IBM DeveloperWorks.
Honestly, I have never heard of SCTP before. I can't remember reading about it in any networking books or hearing about it in classes I had taken. Reading other stackoverflow questions that mentions SCTP suggests that I'm not alone with this lack of knowledge.
Why is SCTP so unknown? Why is it not much used?
Indeed, SCTP is used mostly in the telecom area. Traditionally, telecom switches use SS7 (Signaling System No. 7) to interconnect different entities in the telecom network. For example - the telecom provider's subscriber data base(HLR), with a switch (MSC), the subscriber is connected too (MSC).
The telecom area is moving to higher speeds and more reachable environment. One of these changes is to replace SS7 protocol by some more elegant, fast and flexible IP-based protocol.
The telecom area is very conservative. The SS7 network has been used here for decades. It is very a reliable and closed network. This means a regular user has no access to it.
The IP network, in contrast, is open and not reliable, and telecoms will not convert to it if it won't handle at least the load that SS7 handles. This is why SCTP was developed. It tries:
to mimic all advantages of the SS7 network accumulated over the decades.
to create a connection-oriented protocol better than TCP in speed, security, and redundancy
The latest releases of Linux already have SCTP support.
We have been deploying SCTP in several applications now, and encountered significant problem with SCTP support in various home routers. They simply don't handle SCTP correctly. I believe this is primarily a performance issue (the SCTP protocol specification require checksums for the whole packets to be recalculated and not just for headers).
Like many other promising protocols SCTP is sadly dead in the water until D-link and Netgear fixes their broken NAT boxes.
SCTP is not very much known and not used/deployed a lot because:
Widespread: Not widely integrated in TCP/IP stacks (in 2013: still missing natively in latest Mac OSX and Windows. 2020 update: still not in Windows nor Mac OS X)
Libraries: Few high level bindings in easy to use languages (Disclaimer: i'm maintainer of pysctp, SCTP easy stack support for Python)
NAT: Doesn't cross NAT very well/at all (less than 1% internet home & enterprise routers do NAT on SCTP).
Popularity: No general public app use it
Programming paradigm: it changed a bit: it's still a socket, but you can connect many hosts to many hosts (multihoming), datagram is ordered and reliable, erc...
Complexity: SCTP stack is complex to implement (due to above)
Competition: Multipath TCP is coming and should address multihoming needs / capabilities so people refrain from implementing SCTP if possible, waiting for MTCP
Niche: Needs SCTP fills are very peculiar (ordered reliable datagrams, multistream) and not needed by much applications
Security: SCTP evades security controls (some firewalls, most IDSes, all DLPs, does not appear on netstat except CentOS/Redhat/Fedora...)
Audit-ability: Something like 3 companies in the world routinely do audits of SCTP security (Disclaimer: I work in one of them)
Learning curve: Not much toolchain to play with SCTP (check the excellent withsctp that combines nicely with netcat or use socat, 2020 edit: nmap supports it for a few years now )
Under the hood: Used mostly in telecom and everytime you send SMS, start surfing the net on your mobile or make phone calls, you're often triggering messages that flow over SCTP (SIGTRAN/SS7 with GSM/UMTS, Diameter with LTE/IMS/RCS, S1AP/X2AP with LTE), so you actually use it a lot but you never know about it ;-) 2020 edit: it's being removed from the core 5G network (no more Diameter, HTTP/2 instead) and will be only used in the 5G radio access network between antennas and core.
SCTP requires more design within the application to get the best use of it. There are more options than TCP, the Sockets-like API came later, and it is young. However I think most people that take the time to understand it (and who know the shortcomings of TCP) appreciate it -- it is a well designed protocol that builds on our ~30 years of knowledge of TCP and UDP.
One of the aspects that requires some thought is that of streams. Streams provide (usually, I think you can turn it off) an order guarantee within them (much like a TCP connection) but there can be multiple streams per SCTP connection. If your application's data can be sent over multiple streams then you avoid head-of-line blocking where the receiver starves due to one mislaid packet. Effectively different conversations can be had over the same connection without impacting each other.
Another useful addition is that of multi-homing support -- one connection can be across multiple interfaces on both ends and it copes with failures. You can emulate this in TCP, but at the application layer.
Proper link heartbeating, which is the first thing any application using TCP for non-transient connections implements, is there for free.
My personal summary of SCTP is that it doesn't do anything you couldn't do another way (in TCP or UDP) with substantial application support. The thing it provides is the ability to not have to implement that code (badly) yourself.
FYI, SCTP is mandated as supported for Diameter (cf RADIUS next gen). see RFC 3588
Diameter clients MUST support either TCP or SCTP, while agents and
servers MUST support both. Future versions of this specification MAY
mandate that clients support SCTP.
p1. SCTP mapped directly over IPv4 requires support in NAT gateways, which has never been widely deployed anywhere, and without it the typical NAT gateway will only permit one private host per public address to be using SCTP at a time.
p2. SCTP mapped over UDP/IPv4 allows more private hosts per public address, but UDP mappings in IPv4/NAT gateways are notoriously tricky to establish and keep maintained, due to the fact that UDP is a connectionless transport without any explicit state for a NAT to track.
p3. SCTP mapped directly over IPv6 requires... well... IPv6. Have you tried to deploy IPv6? If so, have you tried to buy an IPv6 firewall? Does it support SCTP? How about a load balancer? A SSL accelerator?
p4. Finally, a lot of the Internet is pretty much constrained to what can fit through TCP port 80 and port 443, so SCTP of any flavor tends to lose there. Hence, you see efforts like the MPTCP working group in IETF.
Many of us will be using SCTP soon, since it's used by WebRTC datachannels to create a TCP-like reliable layer on top of UDP -- SCTP over DTLS over UDP: https://datatracker.ietf.org/doc/html/draft-ietf-rtcweb-data-channel-13#section-6
Reading the SCTP Wikipedia page I'd say that the main reason is that SCTP is a very young protocol (proposed in 2000) that is currently unsupported by the mainstream OSs (Windows, OS X, Linux).
If "very young" seems inappropriate to you, think about IPV6: "in December 2008, despite marking its 10th anniversary as a Standards Track protocol, IPv6 was only in its infancy in terms of general worldwide deployment."
SCTP is used extensively in the 4G LTE network where Diameter is used for AAA.
It might not be well known, but it's not unused. Quite recently there was a draft published at the IETF about Using SCTP as a Transport Layer Protocol for HTTP.
In reference to all of the comments about commercial routers being broken or lacking SCTP support, the issue is that SCTP with NAT is still in draft form with the IETF. So there is no RFC specification for them to implement it.
https://datatracker.ietf.org/doc/html/draft-ietf-behave-sctpnat-09
Sctp is born too late, and for many situation TCP is enough.
Also, as I know most of its usage is on telecommunication area.

Resources