I recently checked out the book "UNIX Network Programming, Vol. 1" by Richard Stevens and found that there is a third transport-layer standard besides TCP and UDP: SCTP.
Summary: SCTP is a transport-level protocol that is message-driven like UDP, but reliable like TCP. Here is a short introduction from IBM DeveloperWorks.
Honestly, I had never heard of SCTP before. I can't remember reading about it in any networking books or hearing about it in the classes I have taken. Reading other Stack Overflow questions that mention SCTP suggests that I'm not alone in this lack of knowledge.
Why is SCTP so unknown? Why is it not much used?
Indeed, SCTP is used mostly in the telecom area. Traditionally, telecom switches use SS7 (Signaling System No. 7) to interconnect different entities in the telecom network, for example the telecom provider's subscriber database (HLR) with the switch (MSC) that the subscriber is connected to.
The telecom area is moving to higher speeds and a more reachable environment. One of these changes is to replace the SS7 protocol with a more elegant, fast, and flexible IP-based protocol.
The telecom area is very conservative. The SS7 network has been used here for decades. It is a very reliable and closed network, which means a regular user has no access to it.
The IP network, in contrast, is open and not reliable, and telecoms will not convert to it unless it can handle at least the load that SS7 handles. This is why SCTP was developed. It tries:
to mimic all advantages of the SS7 network accumulated over the decades.
to create a connection-oriented protocol better than TCP in speed, security, and redundancy
The latest releases of Linux already have SCTP support.
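Whether a given machine actually has SCTP support can be probed from user space. The following is a minimal sketch, assuming a Python build whose socket module can request IP protocol number 132 (SCTP); the helper name sctp_supported is made up for illustration, and a False result only means that this particular attempt to create a socket failed.

```python
import socket

def sctp_supported() -> bool:
    """Return True if the local kernel lets us create a one-to-one SCTP socket."""
    # IPPROTO_SCTP is protocol number 132; fall back to the raw number if
    # this Python build does not export the constant.
    proto = getattr(socket, "IPPROTO_SCTP", 132)
    try:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM, proto)
    except OSError:
        # The kernel (or the loaded modules) does not offer SCTP.
        return False
    sock.close()
    return True

if __name__ == "__main__":
    print("SCTP available:", sctp_supported())
```

On a Linux box the result may also depend on whether the sctp kernel module is loaded or allowed to load.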
We have been deploying SCTP in several applications now, and have encountered significant problems with SCTP support in various home routers. They simply don't handle SCTP correctly. I believe this is primarily a performance issue (the SCTP specification requires the checksum to be recalculated over the whole packet, not just the headers).
Like many other promising protocols, SCTP is sadly dead in the water until D-Link and Netgear fix their broken NAT boxes.
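To make the checksum point concrete: SCTP uses CRC32c computed over the entire packet, so a NAT box that rewrites a port has to run the CRC over every byte again. The sketch below is a plain bitwise CRC32c in Python; the packet layout helper toy_sctp_packet, the port numbers, and the verification tag are made up for illustration, and the byte-order details of storing the checksum in a real packet are deliberately left out.

```python
import struct

def crc32c(data: bytes) -> int:
    """Bitwise CRC32c (Castagnoli), the checksum algorithm SCTP uses."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # 0x82F63B78 is the bit-reflected Castagnoli polynomial.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

def toy_sctp_packet(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Hypothetical packet: SCTP common header with a zeroed checksum, then payload."""
    verification_tag = 0x12345678  # made-up value for illustration
    return struct.pack("!HHII", src_port, dst_port, verification_tag, 0) + payload

payload = b"x" * 1200
before = crc32c(toy_sctp_packet(5000, 36412, payload))
after = crc32c(toy_sctp_packet(40001, 36412, payload))  # NAT rewrote the source port

print(hex(crc32c(b"123456789")))  # 0xe3069283, the standard CRC32c check value
print(before != after)            # True: the whole-packet checksum must be redone
```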
SCTP is not very much known and not used/deployed a lot because:
Widespread: Not widely integrated in TCP/IP stacks (in 2013: still missing natively in the latest Mac OS X and Windows; 2020 update: still not in Windows or Mac OS X)
Libraries: Few high-level bindings in easy-to-use languages (Disclaimer: I'm the maintainer of pysctp, easy SCTP stack support for Python)
NAT: Doesn't cross NAT very well, if at all (fewer than 1% of Internet home and enterprise routers do NAT on SCTP).
Popularity: No general-public app uses it
Programming paradigm: It changed a bit: it's still a socket, but you can connect many hosts to many hosts (multihoming), datagrams are ordered and reliable, etc.
Complexity: SCTP stack is complex to implement (due to above)
Competition: Multipath TCP is coming and should address multihoming needs/capabilities, so people refrain from implementing SCTP if possible, waiting for MPTCP
Niche: The needs SCTP fills are very peculiar (ordered reliable datagrams, multistreaming) and not shared by many applications
Security: SCTP evades security controls (some firewalls, most IDSes, all DLPs, does not appear on netstat except CentOS/Redhat/Fedora...)
Audit-ability: Something like 3 companies in the world routinely do audits of SCTP security (Disclaimer: I work in one of them)
Learning curve: Not much of a toolchain to play with SCTP (check the excellent withsctp, which combines nicely with netcat, or use socat; 2020 edit: nmap has supported it for a few years now)
Under the hood: Used mostly in telecom, and every time you send an SMS, start surfing the net on your mobile, or make a phone call, you're often triggering messages that flow over SCTP (SIGTRAN/SS7 with GSM/UMTS, Diameter with LTE/IMS/RCS, S1AP/X2AP with LTE), so you actually use it a lot but never know about it ;-) 2020 edit: it's being removed from the 5G core network (no more Diameter, HTTP/2 instead) and will only be used in the 5G radio access network between antennas and core.
SCTP requires more design within the application to get the best use of it. There are more options than TCP, the Sockets-like API came later, and it is young. However I think most people that take the time to understand it (and who know the shortcomings of TCP) appreciate it -- it is a well designed protocol that builds on our ~30 years of knowledge of TCP and UDP.
One of the aspects that requires some thought is that of streams. Streams provide (usually, I think you can turn it off) an order guarantee within them (much like a TCP connection) but there can be multiple streams per SCTP connection. If your application's data can be sent over multiple streams then you avoid head-of-line blocking where the receiver starves due to one mislaid packet. Effectively different conversations can be had over the same connection without impacting each other.
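The head-of-line blocking argument can be shown with a small simulation. This is only a sketch of the ordering rules, not of any real SCTP stack: messages are (sequence number, stream id) pairs, the function names are made up, and "deliverable" means "can be handed to the application without violating the ordering guarantee".

```python
from collections import defaultdict

def in_order_prefix(arrived: set) -> int:
    """Count how many messages 0, 1, 2, ... can be delivered before the first gap."""
    n = 0
    while n in arrived:
        n += 1
    return n

def deliverable_single_stream(messages, lost):
    """One ordered stream (TCP-like): a gap blocks everything sent after it."""
    arrived = {seq for seq, _stream in messages if seq not in lost}
    return in_order_prefix(arrived)

def deliverable_multi_stream(messages, lost):
    """Ordering enforced per stream (SCTP-like): a gap blocks only its own stream."""
    per_stream_arrived = defaultdict(set)
    per_stream_next = defaultdict(int)
    for seq, stream in messages:  # messages are listed in send order
        stream_seq = per_stream_next[stream]
        per_stream_next[stream] += 1
        if seq not in lost:
            per_stream_arrived[stream].add(stream_seq)
    return sum(in_order_prefix(s) for s in per_stream_arrived.values())

# Six messages alternating between two hypothetical streams; message 1 is lost.
messages = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0), (5, 1)]
lost = {1}
print(deliverable_single_stream(messages, lost))  # 1
print(deliverable_multi_stream(messages, lost))   # 3
```

With one ordered stream only the first message gets through until the retransmission arrives; with two streams, everything on the unaffected stream is still delivered.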
Another useful addition is that of multi-homing support -- one connection can be across multiple interfaces on both ends and it copes with failures. You can emulate this in TCP, but at the application layer.
Proper link heartbeating, which is the first thing any application using TCP for non-transient connections implements, is there for free.
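For comparison, this is roughly the bookkeeping an application on TCP ends up writing for itself. It is a minimal sketch with made-up names (HeartbeatMonitor, interval, max_missed); the clock is passed in explicitly so the logic can be followed without real timers or sockets.

```python
class HeartbeatMonitor:
    """Track when a peer was last heard from and decide whether it looks dead."""

    def __init__(self, interval: float, max_missed: int):
        self.interval = interval      # seconds between expected heartbeats
        self.max_missed = max_missed  # missed beats tolerated before giving up
        self.last_seen = None         # timestamp of the most recent heartbeat

    def record_heartbeat(self, now: float) -> None:
        self.last_seen = now

    def peer_alive(self, now: float) -> bool:
        if self.last_seen is None:
            return False  # never heard from the peer at all
        return (now - self.last_seen) <= self.interval * self.max_missed

monitor = HeartbeatMonitor(interval=5.0, max_missed=3)
monitor.record_heartbeat(now=100.0)
print(monitor.peer_alive(now=110.0))  # True: within the 15-second allowance
print(monitor.peer_alive(now=120.0))  # False: 20 seconds of silence
```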
My personal summary of SCTP is that it doesn't do anything you couldn't do another way (in TCP or UDP) with substantial application support. The thing it provides is the ability to not have to implement that code (badly) yourself.
FYI, SCTP support is mandated for Diameter (cf. RADIUS next gen); see RFC 3588:
Diameter clients MUST support either TCP or SCTP, while agents and
servers MUST support both. Future versions of this specification MAY
mandate that clients support SCTP.
p1. SCTP mapped directly over IPv4 requires support in NAT gateways, which has never been widely deployed anywhere, and without it the typical NAT gateway will only permit one private host per public address to be using SCTP at a time.
p2. SCTP mapped over UDP/IPv4 allows more private hosts per public address, but UDP mappings in IPv4/NAT gateways are notoriously tricky to establish and keep maintained, due to the fact that UDP is a connectionless transport without any explicit state for a NAT to track.
p3. SCTP mapped directly over IPv6 requires... well... IPv6. Have you tried to deploy IPv6? If so, have you tried to buy an IPv6 firewall? Does it support SCTP? How about a load balancer? An SSL accelerator?
p4. Finally, a lot of the Internet is pretty much constrained to what can fit through TCP port 80 and port 443, so SCTP of any flavor tends to lose there. Hence, you see efforts like the MPTCP working group in IETF.
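The first point (only one private host per public address when the NAT cannot translate SCTP ports) can be illustrated with a toy model. This is not how any real NAT is implemented; ToyNat, its fields, and the addresses are invented for the example, and the only behavior modeled is "rewrite ports for protocols I understand, otherwise bind the whole protocol to the first host that uses it".

```python
class ToyNat:
    """Toy NAT: rewrites ports for known protocols, one host per unknown protocol."""

    def __init__(self, port_aware_protocols):
        self.port_aware = set(port_aware_protocols)
        self.flows = {}        # (proto, private_host, private_port) -> public port
        self.proto_owner = {}  # proto -> the single private host allowed to use it
        self.next_port = 40000

    def outbound(self, proto, private_host, private_port):
        """Return the public port used for this flow, or None if it cannot be mapped."""
        if proto in self.port_aware:
            flow = (proto, private_host, private_port)
            if flow not in self.flows:
                self.flows[flow] = self.next_port
                self.next_port += 1
            return self.flows[flow]
        # No port translation: the first host to use the protocol owns it.
        owner = self.proto_owner.setdefault(proto, private_host)
        return private_port if owner == private_host else None

nat = ToyNat(port_aware_protocols={"tcp", "udp"})
print(nat.outbound("udp", "10.0.0.2", 5000))   # 40000
print(nat.outbound("udp", "10.0.0.3", 5000))   # 40001
print(nat.outbound("sctp", "10.0.0.2", 9000))  # 9000
print(nat.outbound("sctp", "10.0.0.3", 9000))  # None: a second host cannot be mapped
```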
Many of us will be using SCTP soon, since it's used by WebRTC datachannels to create a TCP-like reliable layer on top of UDP -- SCTP over DTLS over UDP: https://datatracker.ietf.org/doc/html/draft-ietf-rtcweb-data-channel-13#section-6
Reading the SCTP Wikipedia page I'd say that the main reason is that SCTP is a very young protocol (proposed in 2000) that is currently unsupported by the mainstream OSs (Windows, OS X, Linux).
If "very young" seems inappropriate to you, think about IPv6: "in December 2008, despite marking its 10th anniversary as a Standards Track protocol, IPv6 was only in its infancy in terms of general worldwide deployment."
SCTP is used extensively in the 4G LTE network where Diameter is used for AAA.
It might not be well known, but it's not unused. Quite recently there was a draft published at the IETF about Using SCTP as a Transport Layer Protocol for HTTP.
In reference to all of the comments about commercial routers being broken or lacking SCTP support, the issue is that SCTP with NAT is still in draft form with the IETF. So there is no RFC specification for them to implement it.
https://datatracker.ietf.org/doc/html/draft-ietf-behave-sctpnat-09
SCTP was born too late, and for many situations TCP is enough.
Also, as far as I know, most of its usage is in the telecommunications area.
As far as I understand, the ICE protocol is used for discovering the nodes/devices between the end-user device and "the outside".
I don't understand why it's needed. Isn't packet routing the responsibility of network devices like routers and switches? They should find the shortest path from the gateway to the end-user device (actually, routers remember the routes they previously discovered).
Moreover, NAT is used to convert from an "internal IP" to an "external IP" and vice versa.
So again,
Why does the other user need to be familiar with my internal network setup?
NAT is a kludge, put in place to try to conserve IPv4 addresses until IPv6 becomes ubiquitous, and it breaks the end-to-end connectivity which is the promise of IP. Because of that, some things don't work correctly through NAT. There are various kludges to work around the NAT kludge, and ICE is part of that. This is explained in RFC 5245, Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols:
Introduction
RFC 3264 [RFC3264] defines a two-phase exchange of Session Description
Protocol (SDP) messages [RFC4566] for the purposes of establishment of
multimedia sessions. This offer/answer mechanism is used by protocols
such as the Session Initiation Protocol (SIP) [RFC3261].
Protocols using offer/answer are difficult to operate through Network
Address Translators (NATs). Because their purpose is to establish a
flow of media packets, they tend to carry the IP addresses and ports
of media sources and sinks within their messages, which is known to be
problematic through NAT [RFC3235]. The protocols also seek to create
a media flow directly between participants, so that there is no
application layer intermediary between them. This is done to reduce
media latency, decrease packet loss, and reduce the operational costs
of deploying the application. However, this is difficult to
accomplish through NAT. A full treatment of the reasons for this is
beyond the scope of this specification.
Numerous solutions have been defined for allowing these protocols to
operate through NAT. These include Application Layer Gateways (ALGs),
the Middlebox Control Protocol [RFC3303], the original Simple
Traversal of UDP Through NAT (STUN) [RFC3489] specification, and Realm
Specific IP [RFC3102] [RFC3103] along with session description
extensions needed to make them work, such as the Session Description
Protocol (SDP) [RFC4566] attribute for the Real Time Control Protocol
(RTCP) [RFC3605]. Unfortunately, these techniques all have pros and
cons which, make each one optimal in some network topologies, but a
poor choice in others. The result is that administrators and
implementors are making assumptions about the topologies of the
networks in which their solutions will be deployed. This introduces
complexity and brittleness into the system. What is needed is a
single solution that is flexible enough to work well in all
situations.
This specification defines Interactive Connectivity Establishment
(ICE) as a technique for NAT traversal for UDP-based media streams
(though ICE can be extended to handle other transport protocols, such
as TCP [ICE-TCP]) established by the offer/answer model. ICE is an
extension to the offer/answer model, and works by including a
multiplicity of IP addresses and ports in SDP offers and answers,
which are then tested for connectivity by peer-to-peer connectivity
checks. The IP addresses and ports included in the SDP and the
connectivity checks are performed using the revised STUN specification
[RFC5389], now renamed to Session Traversal Utilities for NAT. The
new name and new specification reflect its new role as a tool that is
used with other NAT traversal techniques (namely ICE) rather than a
standalone NAT traversal solution, as the original STUN specification
was. ICE also makes use of Traversal Using Relays around NAT (TURN)
[RFC5766], an extension to STUN. Because ICE exchanges a multiplicity
of IP addresses and ports for each media stream, it also allows for
address selection for multihomed and dual- stack hosts, and for this
reason it deprecates RFC 4091 [RFC4091] and [RFC4092].
Firewalls. They're typically configured to bounce any unsolicited traffic from the Internet to you. They only approve of you initiating contact with a server, which only then is allowed to send traffic back to you, and that's pretty much it. Unless your friends all own static IPs (which few people can justify), this is a hostile environment for peer-to-peer communication.
ICE tries to solve this, by enumerating addresses and ports at which the other side may be reached, and trying to connect to these addresses, by initiating outbound requests on both ends, or if all else fails, falling back to communicating through a TURN server, if specified.
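Stripped of all protocol detail, the "try each address, then fall back to a relay" idea looks like the sketch below. The function first_reachable, the can_connect callback, and the addresses are made up for illustration; real ICE pairs local and remote candidates and runs STUN checks from both ends, none of which is modeled here.

```python
def first_reachable(candidates, can_connect, relay=None):
    """Try each candidate in order; return the first that passes, else the relay."""
    for candidate in candidates:
        if can_connect(candidate):
            return candidate
    return relay  # None when no TURN-style relay was configured

# Made-up addresses: a LAN address, a NAT-mapped address, and a relay.
candidates = [("192.168.1.1", 1024), ("196.25.1.1", 4454)]
relay = ("1.2.3.4", 6675)
reachable = {("196.25.1.1", 4454)}

print(first_reachable(candidates, lambda c: c in reachable, relay))  # ('196.25.1.1', 4454)
print(first_reachable(candidates, lambda c: False, relay))           # ('1.2.3.4', 6675)
```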
See this WebRTCHacks article for more on the problem.
Why does the other user need to be familiar with my internal network setup?
Because the other user is sometimes on your internal network, for example in LAN games.
Quick question: do most chat applications (e.g., AIM, Skype, ooVoo) use peer-to-peer UDP exchange for talking to other users, or an echoing TCP connection with a server? Or some combination in between?
Traditionally, most applications used a TURN-like solution (i.e., communication via a server) to overcome NAT traversal issues. Since chat does not consume much bandwidth, servers could support thousands of communications.
But now that P2P has evolved and the NAT traversal issues are now well understood, some use direct UDP communication provided that the users' NAT allows this (i.e., STUN-like communication). They still need a central server to punch the hole though. Direct communication is also helpful when lots of data needs to be transmitted.
I believe it is fair to say that most modern frameworks use a combination of both.
When you only need to send small fragments of data, such as text messages, there's no need to use P2P. Data can be transmitted from client1 to the server, and from the server on to client2.
When you need to transfer data quickly between clients, in cases such as VoIP (voice over IP), or file transfer, you will use P2P.
A pretty standard IM protocol is XMPP. I know it's used by Google Talk, as well as a few other big names in chat.
I've seen and read a lot of similar questions, and the corresponding Wikipedia articles (NAT traversal, STUN, TURN, TCP hole punching), but the overwhelming amount of information doesn't really help me with my very simple problem:
I'm writing a P2P application, and I want two users of my application behind NAT to be able to connect to each other. The connection must be reliable (comparable to TCP's reliability) so I can't just switch to UDP. The solution should work on today's common systems without reconfiguration. If it helps, the solution may involve a connectible 3rd-party, as long as it doesn't have to proxy the entire data (for example, to get the peers' external (WAN) IP addresses).
As far as I know, my only option is to use a "reliable UDP" library + UDP hole punching. Is there a (C/C++) library for this? I found enet in a related question, but it only takes care of the first half of the solution.
Anything else? Things I've looked at:
Teredo tunnelling - requires support from the operating system and/or user configuration
UPnP port forwarding - UPnP isn't present/enabled everywhere
TCP hole punching - seems to be experimental and to work only in certain circumstances
SCTP is even less supported than IPv6. SCTP over UDP is just fancy reliable UDP (see above)
RUDP - nearly no mainstream support
From what I could understand of STUN, STUNT, TURN and ICE, none of them would help me here.
ICE collects a list of candidate IP/port targets to which to connect. Each peer collects these, and then each runs a connectivity check on each of the candidates in order, until either a check passes or all of the checks have failed.
When Alice tries to connect to Bob, she somehow gets a list of possible ways - determined by Bob - she may connect to Bob. ICE calls these candidates. Bob might say, for example: "my local socket's 192.168.1.1:1024/udp, my external NAT binding (found through STUN) is 196.25.1.1:4454/udp, and you can invoke a media relay (a middlebox) at 1.2.3.4:6675/udp". Bob puts that in an SDP packet (a description of these various candidates), and sends that to Alice in some way. (In SIP, the original use case for ICE, the SDP's carried in a SIP INVITE/200/ACK exchange, setting up a SIP session.)
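The order in which candidates like Bob's get tried follows from a priority value. The sketch below uses the formula and the recommended type preferences from RFC 5245 (host 126, peer reflexive 110, server reflexive 100, relayed 0); the function name, the default local preference of 65535, and the candidate tuples are choices made for this example.

```python
# Recommended type preferences from RFC 5245, section 4.1.2.
TYPE_PREFERENCE = {"host": 126, "prflx": 110, "srflx": 100, "relay": 0}

def candidate_priority(cand_type: str, local_preference: int = 65535,
                       component_id: int = 1) -> int:
    """priority = 2^24 * type pref + 2^8 * local pref + (256 - component ID)."""
    return ((TYPE_PREFERENCE[cand_type] << 24)
            + (local_preference << 8)
            + (256 - component_id))

# Bob's three candidates from the example above.
bob = [
    ("relay", "1.2.3.4", 6675),
    ("host", "192.168.1.1", 1024),
    ("srflx", "196.25.1.1", 4454),
]
ordered = sorted(bob, key=lambda c: candidate_priority(c[0]), reverse=True)
for cand_type, ip, port in ordered:
    print(cand_type, ip, port, candidate_priority(cand_type))
```

The direct (host) candidate sorts first, the STUN-discovered binding second, and the relay last, which matches the "cheapest path first" intent.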
ICE is pluggable, and you can configure the precise nature/number of candidates. You could try a direct link, followed by asking a STUN server for a binding (this punches a hole in your NAT, and tells you the external IP/port of that hole, which you put into your session description), and falling back on asking a TURN server to relay your data.
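"Asking a STUN server for a binding" starts with a very small message. As a sketch, this builds the 20-byte STUN Binding Request header described in RFC 5389 (message type 0x0001, zero-length body, magic cookie 0x2112A442, 12-byte transaction ID). The function name is made up, and sending the request over UDP and parsing the XOR-MAPPED-ADDRESS attribute in the reply are left out.

```python
import os
import struct

STUN_BINDING_REQUEST = 0x0001
STUN_MAGIC_COOKIE = 0x2112A442

def build_binding_request(transaction_id=None) -> bytes:
    """Build a STUN Binding Request with no attributes."""
    if transaction_id is None:
        transaction_id = os.urandom(12)  # random ID to match the response later
    if len(transaction_id) != 12:
        raise ValueError("transaction ID must be 12 bytes")
    # Message type (2 bytes), message length (2 bytes, 0 because there are
    # no attributes), magic cookie (4 bytes), then the transaction ID.
    header = struct.pack("!HHI", STUN_BINDING_REQUEST, 0, STUN_MAGIC_COOKIE)
    return header + transaction_id

request = build_binding_request()
print(len(request))          # 20
print(request[:8].hex())     # 0001000021 12a442 without the space: 000100002112a442
```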
One downside to ICE is that your peers exchange SDP descriptions, which you may or may not like. Another is that TCP support's still in draft form, which may or may not be a problem for you. [UPDATE: ICE TCP is now officially RFC 6544.]
Games often use UDP, because old data is useless. (This is why RTP usually runs over UDP.) Some P2P applications use middleboxes or networks of middleboxes.
IRC uses a network of middleboxes: IRC servers form networks, and clients connect to a near server. Messages from one client to another may travel through the network of servers.
Failing all that, you could take a look at BitTorrent's architecture and see how they handle the NAT problem. As CodeShadow points out in the comments below, BitTorrent relies on reachable peers in the network: in a sense some peers form a network of middleboxes. If those middleboxes could act as relays, you'd have an IRC-like architecture, but one that's set up dynamically.
I recommend libjingle, as it is used by some major video game companies that rely heavily on P2P network communication. (Have you heard of Steam? Valve also uses libjingle; see the "Peer-to-peer networking" section of this page: https://partner.steamgames.com/documentation/api)
However, the solution that always works is a relay server. Since there is no "standard" way to get through NAT, you should keep a relay server as a fallback if a connection must always be established between any two peers.
I know that a protocol is a set of rules that governs communication between two computers on a network, but how are those rules implemented for the computer? Is a protocol basically a piece of code or, in other words, software?
Protocols are generally built upon each other. At the risk of sounding pedantic, here's an example of a protocol and where/how it's implemented:
Application Protocol - the way a particular application talks to another instance of itself or a corresponding server; this is implemented in the application code or a shared library
TCP (or UDP, or another layer) - the way that information is sent at the binary level and split up into usable chunks, then reassembled at the destination; this is usually implemented as part of the operating system, but it is still software code
IP - the way that information (having already been split or truncated by something like TCP or UDP) makes its way from one place to another by routing over one or more "hops"; this is always software code, but is sometimes implemented in the OS and sometimes implemented in the network device (your LAN card, for example)
base-T (ethernet), token ring, etc - Here we are physically getting into how the hardware talks to one another; ie, which wire corresponds to a particular type of signal; this is always implemented in hardware
electricity /photons - the laws that govern (or at least define) how electrons (or photons) flow over a conductive material or over the air; this is usually implemented in hardware ;)
In a sense, these are all "protocols" (a set of rules or expected behaviors that allow communication to take place), and they're built on one another.
Bear in mind that (aside from electricity) this is not an exhaustive list of the sort of protocols that exist at any of these layers!
Edit Thanks to dmckee for pointing out that electricity isn't the only physical process used in networking ;)
Networking protocols are not pieces of code or software; they are only a set of rules. When software uses a specific networking protocol, that software is known as an implementation. There can be many different software implementations of the same protocol (e.g., Windows and UNIX have different TCP/IP implementations). It is possible to understand networking protocols without any knowledge of programming.
EDIT: How are they implemented? Here's a paper on taking an abstract specification of a protocol and implementing it into C. You'll see that less-strict protocols leave out certain details that programmers have to guess on, which makes some implementations incompatible with others.
A network protocol is basically like a spoken language. It is implemented by code that sends and receives specially prepared messages over the network/internet, much like the vocal cords you need to speak (the network and hardware) and a brain to actually understand what someone said (the protocol stack/software).
Sometimes protocols are implemented directly in hardware for speed reasons (like the Ethernet protocol for LANs), but software is always required to do something useful with a protocol.
This might be interesting for you:
The OSI Model
Protocol (Computing)
Software implements the rules defined in the protocol; some protocols are formally defined and some informally.
A protocol is a set of rules governing the communication between two entities.
In the computer/programming context, a protocol is a set of rules governing the communication between two programs.
In the computer network context, a protocol is a set of rules governing the communication between two programs, well, over a network.
In computers, in the end everything is embodied in code...
Protocols are basically a set of rules. The way to implement them is to first draw a state machine diagram, since it spells out the current state, how the state changes in response to each input, and which output actions are performed.
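A state machine diagram translates almost directly into a lookup table. The following sketch uses an invented toy protocol (states IDLE, WAIT_ACK, CONNECTED and events open, ack, timeout, close); it does not describe TCP or any other real protocol, and the "actions" are just strings standing in for whatever the implementation would actually send.

```python
# (current state, input event) -> (next state, output action)
TRANSITIONS = {
    ("IDLE", "open"): ("WAIT_ACK", "send HELLO"),
    ("WAIT_ACK", "ack"): ("CONNECTED", "notify application"),
    ("WAIT_ACK", "timeout"): ("IDLE", "report failure"),
    ("CONNECTED", "close"): ("IDLE", "send BYE"),
}

class ProtocolMachine:
    """Table-driven state machine for the toy protocol above."""

    def __init__(self):
        self.state = "IDLE"

    def handle(self, event: str) -> str:
        """Apply one input event; return the output action to perform."""
        try:
            next_state, action = TRANSITIONS[(self.state, event)]
        except KeyError:
            raise ValueError(
                f"event {event!r} not allowed in state {self.state!r}"
            ) from None
        self.state = next_state
        return action

machine = ProtocolMachine()
print(machine.handle("open"))   # send HELLO
print(machine.handle("ack"))    # notify application
print(machine.state)            # CONNECTED
```

Keeping the rules in one table makes it easy to compare the code against the diagram, and any input the diagram does not allow is rejected explicitly.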
Your answer is a very short one:
BY READING THE RFC.
The main networking problem is sharing data between computers. Each networking protocol tries to solve only a small part of that larger problem. Some protocols are implemented in software, others in hardware. In short, protocols, like algorithms, can be implemented in many programming languages.
Back to TCP: it is implemented by the operating system.
I am considering using SCTP instead of TCP for a p2p app written in C. Should I do it? Also how does the speed of SCTP compare to the speed of TCP?
EDIT:
I found that SCTP can be tunneled over UDP, the only problem being that tunneled SCTP is not interoperable with untunneled SCTP.
Have you considered whether your target systems will all have SCTP pre-installed, or whether your application will need to include SCTP itself? In my experience, I would not expect all systems to have SCTP installed, and I would expect a Windows system not to have it.
If you include SCTP in the application itself, that will more than double the number of messages being passed into and out of the kernel, which will hurt performance compared with using the pre-installed TCP.
Have you considered what benefits you want from SCTP? You mentioned fault tolerance, but for this to work with SCTP the host needs multiple Ethernet ports and IP addresses. Is this likely for your app?
As much as I love SCTP (!) I would seriously consider sticking with TCP unless you are sure SCTP is needed or unless you control the hosts your app is deployed on.
Regards
If it's for a local area network, sure go for it.
Note however that if you plan to use it on the open internet many consumer grade firewalls aren't flexible enough to permit unrecognised IP protocols through them.
How does it help you?
You're P2P, so every peer must have at least one socket open to every other peer.
If you've got a socket open, then you can do everything you need to do over that. If you've taken the approach of one socket per file and you have multiple files being transferred concurrently between two given peers, then SCTP will save you one socket per file. However, on a normal P2P network of any size, you will almost never have multiple files being transferred concurrently between two peers.
Just have one socket and have your own little protocol: send a packet with a header, where the header indicates the content type, e.g. a command or part of a file, and if it is part of a file, which file and which byte range.
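Such a "little protocol" could be framed as in the sketch below. The header layout (1-byte content type, 4-byte file id, 8-byte offset, 4-byte payload length, network byte order) and all names are invented for illustration; the decoder returns None for the frame when the buffer does not yet hold a complete one, which is what you need when reading from a stream socket.

```python
import struct

# Hypothetical header: content type, file id, byte offset, payload length.
HEADER = struct.Struct("!BIQI")
TYPE_COMMAND = 0
TYPE_FILE_CHUNK = 1

def encode_frame(content_type: int, file_id: int, offset: int, payload: bytes) -> bytes:
    """Prefix the payload with a fixed-size header."""
    return HEADER.pack(content_type, file_id, offset, len(payload)) + payload

def decode_frame(buffer: bytes):
    """Return ((type, file_id, offset, payload), rest) or (None, buffer) if incomplete."""
    if len(buffer) < HEADER.size:
        return None, buffer
    content_type, file_id, offset, length = HEADER.unpack_from(buffer)
    end = HEADER.size + length
    if len(buffer) < end:
        return None, buffer
    return (content_type, file_id, offset, buffer[HEADER.size:end]), buffer[end:]

# A command and a file chunk multiplexed over the same byte stream.
stream = encode_frame(TYPE_COMMAND, 0, 0, b"LIST") + \
         encode_frame(TYPE_FILE_CHUNK, 7, 4096, b"chunk-bytes")
frame, stream = decode_frame(stream)
print(frame)  # (0, 0, 0, b'LIST')
frame, stream = decode_frame(stream)
print(frame)  # (1, 7, 4096, b'chunk-bytes')
```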
Of course, you get a little overhead for that, whereas if you have one socket for commands and one per file, you're more efficient. Is saving one socket per peer (assuming one download at a time) worth the time/hassle/complexity of using SCTP?