libutp (µTP) and NAT traversal (UDP hole punching) - nat

According to the Wikipedia article the Micro Transport Protocol supports NAT traversal using UDP hole punching. But looking at the libutp's project page I can't find any such reference in the header files. Am I missing something obvious? Or has the NAT traversal been implemented somewhere else?

UDP hole punching is not really specific to any protocol, it does not even need to be supported by the protocol that needs to traverse the NAT. I.e. it can happen out of band or at a higher protocol layer.
In the case of bittorrent-over-utp it is negotiated and initiated with the BEP55 ut_holepunch extension message. In addition to the specification you could also read libtorrent's implementation.

Related

TCP/IP - Why does a part of a packet may use a connection-less services in a connection-oriented service.

While reading the book on TCP/IP I came across the words which are as "Although it looks as though the use of the flow label may make the source and destination addresses useless, the parts of the Internet that use connection-less service at the network layer still keep these addresses for several reasons.One reason is that part of the packet path may still be using the connection-less service. Another reason is that the protocol at the network layer is designed with these addresses and it may take a while before they can be changed". Now my question to you is if a connection has been formed between hosts in a connection-oriented manner then how come a path of a packet may still be using the connection-less services. Because as per my knowledge prevails the virtual path always be formed at while 3-way handshake is taking place which is the TCP/IP connection (which uses a connection-oriented service) ? And my second question for the second reason is that which protocol they are talking about since these words are stated below the Heading of "Connection-Oriented Services" therefore, it's making me pissed off to understand the literal meaning behind the words(The core conceptual understanding). And correct if anyone thinks I am having a wrong concept at any place. I'll be obliged. Thanks.
TCP as a connection-oriented protocol runs on top of IP which is connection-less. The routers used in transport only look at the IP packet, the TCP segment is simply payload and transported along. TCP provides several algorithms to form a virtual connection over a connection-less network.
The IP packet goes from hop to hop. On each hop, a router makes a forwarding decision solely based on the destination IP address. (More sophisticated devices may inspect more packet elements including source address and payload, but they aren't simple routers.)
The "path" is made up of all these individual hops. Because each hop is based on an independent routing decision the path can change at any time and for any packet. The path is not laid out by the TCP handshake.
Basically, you have to look at each protocol layer individually. Each one serves its own function.
I hope this also answers the second question.

Why is UDP required at all when some protocols ride directly over IP?

As I understand it TCP is required for congestion control and error recovery or reliable delivery of information from one node to another and its not the fastest of protocols for delivering information.
Some routing protocols such as EIGRP and OSPF ride directly on top of IP. Even ICMP rides directly over IP.
Why is UDP even required at all? Is it only required so that developers/programmers can identify what application the inbound packet should be sent to based on the destination port number contained within the packet?
If that is the case then how is information gathered from protocols that ride directly on top of IP sent to the appropriate process when there is no port number information present?
Why are voice and video sent over UDP? Why not directly over IP?
(Note that I do understand thoroughly the use case for TCP. I am not asking why use UDP over TCP or vice versa. I am asking why use UDP at all and how can some protocols use directly the IP layer. Whats the added advantage or purpose of UDP over IP?)
Your question makes more sense in terms of why is UDP useful (than why is UDP required).
UDP is a recognized protocol by the Internet Assigned Numbers Authority. UDP can be useful if you want to write a network protocol that's datagram based and you want to play more nicely with Internet devices.
Routers can have rules to do things like drop any packet that doesn't make sense. So if you try and send packets using say an unassigned IP protocol number between hosts separated by one or more routers, the packets may well never get delivered as you've intended. The same could happen with packets from an unrecognized UDP protocol but that's at least one less door to worry about whether your packet can make it through.
Internet endpoints (like hosts) may do similar filtering too. If you want to write your own datagram based protocol and use a typical host operating system, you're more likely to need to write your software as a privileged process if not as a kernel extension if you're trying to ride it as its own IP protocol (than if you'll be using UDP).
Hope this answer is useful!
First of all, IP and UDP are protocols on the different layers, IP by definition is Internet layer when UDP is transport layer. Layers were introduced to simplify network protocols architecture and to separate concerns. Application layer protocols are supposed to be based on transport layer (with some exceptions).
Most popular transport protocols (in IP network) are UDP and TCP. While TCP is feature rich but with many tradeoffs UDP is very simple but gives a lot of freedom and so typically is a base for other protocols.
The main feature of UDP is multiplexing: ports that allow multiple protocol instances (aka sockets) to coexist on the same node. This means that implementing your own protocol over IP instead of UDP either you won't be able to have multiple instances of your protocol on the same machine or you'll have to implement multiplexing yourself.
There're other features like segmentation and checksum. These features are not mandatory.
And as was mentioned in another answer there're lots of middleware like routers, NATs and firewalls that can ruin the idea of a custom "right over IP" protocol, but it's more like a collateral damage than a feature of UDP.

Connecting P2P over NAT?

I started to explore the option of connecting with other using a p2p connection, so I coded a simple socket program in JAVA for android devices in which the users can share simple messages p2p (I didn't have any idea about NAT then). I got to know about NAT, so I now need to establish a TCP connection with another user which uses a server for discovery but payload is transferred p2p. I have also looked at XMPP(a very good and detailed explanation of how protocol works is here) and UPnP but I dont know how to implement them.
Another interesting question that arises is of BitTorrent because they can work on any device and even behind a NAT. I am not able to get any explanation of how BitTorrent works.
I have researched a lot but I am stuck.
My questions are:
A detailed explanation of BitTorrent(like here, not how torrents work) and how is it able to work around NAT ?
Is there a way to make a NAT entry programmatically ?
Is socket programming sufficient for p2p ?
How difficult is it to create your own protocol and how can I build one ?
If two devices D1 and D2 want to communicate p2p and they know each other's IP. D1 sends a request to D2 and that can't get through the D2's NAT, but there should be an entry created in D1's NAT. So when D2 tries to send something D1's NAT should discover an entry with D2's IP. Then why is the packet not allowed by it ?
Another interesting question that arises is of BitTorrent because they can work on any device and even behind a NAT. I am not able to get any explanation of how BitTorrent works.
This statement looks like you assume that bittorrent needs full connectivity to operate.
That is incorrect.
Behind a NAT device you will still be able to establish outgoing TCP connections. Which generally is sufficient for bittorrent as long as there are other, non-NATed (or NATed but properly port-forwarded) clients in the network that can accept incoming connnections.
NAT has no impact on the flow direction of the data because connections are bi-directional once they are established. It only is problematic for the initial connection setup.
This works perfectly fine for bittorrent because bittorent does not care from which specific node you get your data.
Although better connectivity generally does improve performance.
If the identity of the node matters or one-on-one transfers are an important use-case then other p2p protocols usually attempt NAT traversal first and if that fails rely on 3rd party nodes relaying traffic between those nodes who cannot connect to each other directly.
Additionally, IPv6 support will become essential in the future to maintain end-to-end connectivity because more and more ISPs are starting to roll out carrier-grade NAT for IPv4 while IPv6 will remain non-NATed
One thing need to be clear is that 100% P2P between all type of NAT is impossible right now. There is no practical way to establish P2P connectivity between **Symmetric and Symmetric/PRC NAT. In this scenario connection is established through a relay server called TURN.
I am answering from your 2nd question because I don't know much about the first one.
2) Yes. You can send a packet through your NAT and there will be a mapping between your internal IP:Port to your NAT's external IP:Port. You can know these external IP:Port by sending a stun request. Note that this technique doesn't work for Symmetric NAT.
3)Yes socket programming sufficient for p2p.
4)Why do you need a protocol when there already exists several. ICE protocol is the best today for NAT traversal and I don't think it was easy to create. UPnP and NAT-PMP is really vulnerable in terms of security.
5)I think what happens is usually NAT blocks unknown packets coming to it. So when D1 sends a packet to D2, its NAT blocks all packets incoming from D1s IP:Port. That is why connection establishment fails. You have to employ hole punching technique for D1 and D2 to successfully establish P2P connectivity.
**By symmetric NAT I mean symmetric NAT with random port allocation.
There is a paper on "Peer-to-Peer Communication Across Network Address Translators" which describes the UDP hole punching method and extends it to be used over TCP as well.
Of course, you will always need a relay server for the cases where hole punching is not supported.
Recent versions of BitTorrent use µTP, which is layered above UDP, not TCP. µTorrent uses a private extension (ut_holepunch) that performs UDP hole punching, most other implementations don't bother (with the notable exception of Tixati).
Some NAT routers accept port forwarding requests using either the uPNP or the PMP protocol. Whether this is supported depends on the particular brand of router and its configuration.
Yes, socket programming is enough for P2P.
Difficult to answer. I suggest that you read the wikified and annotated BitTorrent specification for a start.
Yes, this is the principle behind UDP hole punching.

If TCP is connection oriented why do packets follow different paths?

According to my knowledge if an internet application has to be designed, we should use either a connection-oriented service or connection-less service, but not both.
Internet's connection oriented service is TCP and connection-less service is UDP, and both resides in the transport layer of Internet Protocol stack.
Internet's only network layer is IP, which is a connection-less service. So it means whatever application we design it eventually uses IP to transmit the packets.
Connection-oriented services use the same path to transmit all the packets, and connection-less does not.
Therefore my problem is
if a connection oriented application has been designed, it should transmit the packets using the same path. But IP breaks that rule by using different routes.So how do both TCP and IP work together in this sense? It totally confuses me.
You, my friend, are confusing the functionality of two different layers.
TCP is connection oriented in the sense that there's a connection establishment, between the two ends where they may negotiate different things like congestion-control mechanism among other things.
The transport layer protocols' general purpose is to provide process-to-process delivery meaning that it doesn't know anything about routes; how your packets reach the end system is beyond their scope, they're only concerned with how packets are being transmitted between the two end PROCESSES.
IP, on the other hand, the Network layer protocol for the Internet, is concerned with data-delivery between end-systems yet it's connection-less, it maintains no connection so each packet is handled independently of the other packets.
Leaving your system, each router will choose the path that it sees fit for EACH packet, and this path may change depending on availability/congestion.
How does that answer your question?
TCP will make sure packets reach the other process, it won't care HOW they got there.
IP, on the other hand, will not care if they reach the other end at all, it'll simply forward each different packet according to what it sees most fit for a particular packet.
Note:
Let's assume that IP was connection-oriented, would that mean packets would follow the same-path?
Not necessarily, it depends on what the word 'connection' at this layer means, if it means negotiating certain options related to security, for instance, you may still have all your packets being forwarded through different routes over the Internet.
EDIT:
Not to confuse you though, most connection-oriented services at the network-layer and below mean that the connection, when established, also establishes a virtual-path that all 'packets' must follow, for further information read about:
Virtual circuit and frame-relay networks
This link answers your question pretty well http://www.tcpipguide.com/free/t_ConnectionOrientedandConnectionlessProtocols-3.htm
Some people consider this (TCP) to be like a “simulation” of circuit-switching at higher network layers; this is perhaps a bit of a dubious analogy. Even though a TCP connection can be used to send data back and forth between devices, all that data is indeed still being sent as packets; there is no real circuit between the devices. This means that TCP must deal with all the potential pitfalls of packet-switched communication, such as the potential for data loss or receipt of data pieces in the incorrect order.
The TCP protocol deals with the problem of IP packets arriving out of order or being lost, to give you the feeling they arrive through a single FIFO channel. Yes, TCP is smart enough to do that, there's no need for a dedicated underlying channel.
The TCP protocal is implemented by the sending/receiving machines, once the packets leave the sending machine, the routers they travel along know nothing about TCP, they just use IP to get the packets from the source the to destination. Then, it is the destination machines job to, using TCP, make sure that all the packets arrive and that they arrive in the correct order. The internet itself doesn't know anything about TCP, it's just a layer (often software) that gives connection to a connectionless medium (the internet).
So onces a packet leaves a destination, it can go along any path (mostly) as long as it gets to the desintation, regardless of the higher level protocol (such as TCP or UDP).
I mean, it's a bit more complicated then that, but as far as I can remember that's the general Idea.
Refer my short points properly,
1) Connection oriented means ==> reserving resources(buffer,cpu,bandwidth etc.)..but "Where??".(where resources are reserved?? This where is reason of your confusion, so following is ans.).
2) Connection oriented at Transport Layer means ==> Reserving the resources at Both End processes/Ports.(Since TCP is a transport layer,then its responsibility is to reserve the resources at both end processes only,irrespective of whats happening in the intermediate path.)
3) Connection oriented at Network Layer means ==> Reserving the resources at Network Layers.(Now In the whole journey of a packet from source to destination, Network layer is found at all intermediate routers too(but not transport layer). Hence if any protocol at Network layer is connection oriented then,its responsibility is to reserve resources at all intermediate routeres too i.e. all packets will have to follow same intermediate path, But IP is connection less hence,intermediate resources will not be reserved. i.e journey of a packets may follow different paths etc.)
#CONCLUSION:==> Intermediate path is decided by Network Layer, hence if IP then paths may be different.(IP may contain TCP),But TCP is responsible for Resource reservation at both End processes ,irrespective of Intermediate path of packet.
router works on three layers only (physical , data link and Network layers) , so routers will take decision depending only on the info. of network layer (IP protocol ) hence there is no information available about its TCP or UDP at the router

Network: Does router breaks end-to-end principle?

Here is the part of end-to-end principle from Wikipedia
End-to-end connectivity is a property of the Internet that allows all
nodes of the network to send packets to all other nodes of the
network, without requiring intermediate network elements to further
interpret them.
But you should notice that routers (intermediate device) always among end nodes. So, as end-to-end principle above, not only NAT, but normally network with break end-to-end principle too.
Does it true ? If not, please explain for me why, normally network with router doesn't break end-to-end principle.
Thanks :)
A router has a special job in that it must facilitate the end-to-end principle.
A router's job, basically, is to interpret the destinatinaton IP address and determine the next router (or end node) to send it to. A router usually does two things:
Decrement the TTL
Find the next hop based on the destination IP address
It should not, in general, further interpret the packet.
Now, a NAT breaks end-to-end because it further interprets and transforms the packet. NAT might, for example, change the source address of the packet. In that case, you have broken end-to-end because another internet host will not be able to identify the specific host that sent the packet. They'll only be able to identify the source address of the NAT.
Edit: RFC 1812 makes this clear in section 2.2.1, where it describes the layering responsibilities:
o Transport Layer
The Transport Layer provides end-to-end communication services.
...
TCP is a reliable connection-oriented transport service that
provides end-to-end reliability, resequencing, and flow control.
...
o Internet Layer
All Internet transport protocols use the Internet Protocol (IP) to
carry data from source host to destination host. IP is a
connectionless or datagram internetwork service, providing no
end-to-end delivery guarantees.
In section 2.2.3, it goes on to state that:
Routers provide datagram transport only, and they seek to minimize
the state information necessary to sustain this service in the
interest of routing flexibility and robustness.
A "normal" router doesn't deal at all with the "end-to-end" principle. Its job is simply to deliver packets at the Internet layer, which is by definition "providing no end-to-end delivery guarantees". That's the Transport Layer's job. NAT "breaks end-to-end" by operating at the Transport layer rather than simply the Internet layer, and keeping too much state about each connection.
Hope this makes sense. I know it can be confusing.
I went through the article in wikipedia and found that there is a condition to it that it is only applicable to application-specific jobs not for other supportive tasks like address translations or redirecting, NAT etc. which are performed by intermediate routers... here is the part from that article
The end-to-end principle states that application-specific functions ought to reside in the end hosts of a network rather than in intermediary nodes – provided they can be implemented "completely and correctly" in the end hosts.
So, the principle is preserved as intermediate routers are not handling the application-specific functions , it is the end routers(nodes) which performs these functions.

Resources