The idea is that two different machines (behind two different NATs) connect to a public server.
Each of them establishes a TCP connection with that public server.
Then perhaps the magic can happen while the server is proxying the data stream:
change the source and destination addresses across the whole TCP/IP stack during the session.
The goal is to remove this third party as a proxy from further communication.
First you need a server to which each peer sends some data, letting the server know that it needs to send a SYN-ACK to that peer.
Then Peer A sends a packet to Peer B's address with a low TTL value, so that it is dropped in transit and never reaches B's NAT. A keeps sending this packet until a packet from the server reaches it: a SYN-ACK whose source address is B's (source spoofing). A then completes the handshake with the server, but believes it is handshaking with B.
Exactly the same thing happens with B: B handshakes with the server but believes it is handshaking with A. After the handshake is complete on both ends, data transfer begins between A and B as a P2P connection.
This is source spoofing, because the server handshakes with each peer while pretending to be the other one. This is how both peers' NATs are opened to each other.
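The low-TTL packet in the scheme above can be produced with an ordinary socket option. The following is only a minimal sketch of that one step, assuming an IPv4 TCP socket; the function name and the TTL value of 2 are illustrative choices, and the spoofed SYN-ACK on the server side (which would need raw sockets and a network that permits spoofing) is not shown.

```python
import socket

def make_low_ttl_socket(ttl=2):
    """Create a TCP socket whose outgoing packets carry a small IP TTL.

    With a small enough TTL, the SYN sent by connect() expires at a router
    on the way and never reaches the other peer's NAT. The value 2 is an
    arbitrary example; the right value depends on the actual path.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    return sock

if __name__ == "__main__":
    s = make_low_ttl_socket()
    print("TTL on socket:", s.getsockopt(socket.IPPROTO_IP, socket.IP_TTL))
    s.close()
```

A peer would then call a non-blocking connect on such a socket towards the other peer's public address, so that its own NAT sees the outgoing SYN.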
Related
I would like to ask a general newbie question. I understand that for a computer in location A to connect to a server in location B, packets of data have to be sent to multiple data centers through multiple gateways and through multiple verification channels to ensure the connection request finds the right destination.
However after the connection is established, when the computer and the server send/receive data, do these data still need to go through [multiple data centers through multiple gateways and through multiple verification channels]?
Every TCP or UDP packet can take a different network path from source to destination. Establishing a TCP connection is stateful, but that state lives in the two endpoints and concerns parameters they agree on (such as the maximum segment size and other options), not the path.
At the network layer there is no connection state. Please read about the OSI model in detail; you can also refer to this https://www.ccnahub.com/wp-content/uploads/2013/09/watermarked-pc1-comm.jpg It has a good explanation of how OSI works.
A TCP packet being sent from computer A to computer B will be addressed to a particular IP address. If that IP address is not on the local LAN, the packet will go first through the local LAN to whatever is designated as the local gateway. That gateway then sends it on over the connection to an external network. At that point, it will be delivered to some router in your ISP. That router will look at the destination IP address and consult a routing table to find where it should next send the packet. That will typically be another router elsewhere in the network. This continues and (assuming good routing tables in each router) the packet will get closer to its end destination on each hop. Eventually, the packet will get to a router that has a routing table that knows about either the actual IP address or the home gateway for that IP address and the packet will be sent to that gateway. That home gateway can then deliver the packet to that actual IP address. In some cases, there may be a private network at either end where private IP addresses/port combinations are converted to public IP addresses and vice versa.
If computer A sends multiple packets to computer B, they do not have to all go the exact same path, though typically they will (assuming no problems or congestion in the network between the two endpoints).
In this scenario where A and B are on different private networks, there is no direct connection between computer A and computer B, so each packet has to follow the path from one router to the next until it arrives at the final gateway and then the destination address.
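The per-hop lookup described above can be sketched as follows. This is a toy illustration, not how any particular router is implemented: the routing table, the prefixes (taken from the documentation address ranges) and the next-hop names are all made up, and the rule used is the usual longest-prefix match, where the most specific matching route wins.

```python
import ipaddress

# Hypothetical routing table: (destination prefix, next hop). All values are invented.
ROUTES = [
    (ipaddress.ip_network("0.0.0.0/0"), "isp-router"),            # default route
    (ipaddress.ip_network("203.0.113.0/24"), "regional-router"),
    (ipaddress.ip_network("203.0.113.128/25"), "home-gateway-b"),
]

def next_hop(destination: str) -> str:
    """Return the next hop of the most specific route containing `destination`."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if addr in net]
    # The default route matches every IPv4 address, so `matches` is never empty here.
    _, hop = max(matches, key=lambda m: m[0].prefixlen)
    return hop

if __name__ == "__main__":
    for dst in ("198.51.100.7", "203.0.113.5", "203.0.113.200"):
        print(dst, "->", next_hop(dst))
```

Every packet, whether it is the first SYN or the thousandth data segment, goes through the same kind of lookup at each router.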
However after the connection is established, when the computer and the server send/receive data, do these data still need to go through [multiple data centers through multiple gateways and through multiple verification channels]?
If the routers are doing their job appropriately, the very first packet takes the most efficient path from A to B that the network knows. There is no "better" way to send subsequent packets. Subsequent packets will follow the same process (to a router, router looks up in routing table where to send for next hop and so on). If the two endpoints are a long way apart (in terms of network topology), then the packet may go through many routers. Routers are highly optimized pieces of equipment capable of forwarding millions of packets a second, as this is how data moves on any TCP/IP network like the internet.
There is no difference in how the first packet that initiates the TCP connection flows versus subsequent packets. At the network level, they are just packets traveling from a source IP address to a destination IP address. Once the connection is established, a reliability layer will be started to track packets that might get lost, initiate retransmissions, etc... but this doesn't have anything to do with how a given packet gets from A to B.
Which software libraries exist for this task on Linux and Windows?
Is there an RFC that describes how this should be done?
I'm interested in how I can implement functionality in my C++ project like that of this software: https://secure.logmein.com/ru/products/hamachi/download.aspx
There is not much difference if you want to make a connection through a TURN relay server. The only difference is how TCP and UDP create a connection, and nothing else.
There are some big differences if you want to make P2P connection.
If you are in the same network (behind the same NAT): with UDP, you send a STUN binding request to your peer candidate, and if you get a response back you know you are connected. With TCP, you create an active socket on one side and a passive socket on the other. The active socket sends a SYN, the passive socket receives it and replies with a SYN-ACK, the active socket then sends an ACK, and the connection is established.
If you are in different networks (behind different NATs): you have to use the TCP hole punching technique to make a connection, because your NAT won't let a TCP SYN packet through unless a packet was previously sent to the address the SYN is coming from.
TCP hole punching in detail:
You have to use a TCP simultaneous-open socket. This socket acts in both active and passive mode. Both ends need to know each other's private and public IP:Port.
TCP simultaneous open will happen as follows:
Peer A keeps sending SYN to Peer B
Peer B keeps sending SYN to Peer A
When NAT-a receives the outgoing SYN from Peer A, it creates a mapping in its state machine.
When NAT-b receives the outgoing SYN from Peer B, it creates a mapping in its state machine.
Both SYN cross somewhere along the network path, then:
SYN from Peer A reaches NAT-b, SYN from Peer B reaches NAT-a
Depending on the timing of these events (where in the network the SYN cross),
at least one of the NAT will let the incoming SYN through, and map it to the internal destination peer
Upon receipt of the SYN, the peer sends a SYN+ACK back and the connection is established.
From WIKI.
Also to learn about TCP simultaneous open connection read from here. To learn about NAT filtering behavior see this answer.
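The timing argument in the steps above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: each NAT is reduced to a set of remote endpoints its inside peer has already sent to, the public endpoints are invented documentation addresses, and the class, function and event names are hypothetical. It does not open any real sockets.

```python
class ToyNat:
    """Toy NAT: lets an inbound SYN through only if a mapping for its sender exists."""

    def __init__(self):
        self.allowed = set()  # remote (ip, port) pairs the inside peer has sent to

    def outbound(self, remote):
        # An outgoing SYN creates (or refreshes) the mapping for that remote endpoint.
        self.allowed.add(remote)

    def inbound(self, remote):
        return remote in self.allowed


A = ("198.51.100.1", 4000)  # Peer A's public endpoint (made up)
B = ("203.0.113.2", 5000)   # Peer B's public endpoint (made up)

def punch(order):
    """Replay an ordered list of events and report whose SYN got through."""
    nat_a, nat_b = ToyNat(), ToyNat()
    delivered = {"A": False, "B": False}
    for event in order:
        if event == "A_sends":
            nat_a.outbound(B)
        elif event == "B_sends":
            nat_b.outbound(A)
        elif event == "A_syn_arrives":   # A's SYN reaches NAT-b
            delivered["A"] = nat_b.inbound(A)
        elif event == "B_syn_arrives":   # B's SYN reaches NAT-a
            delivered["B"] = nat_a.inbound(B)
    return delivered

if __name__ == "__main__":
    # A's SYN arrives before B has sent anything: NAT-b drops it, NAT-a accepts B's SYN.
    print(punch(["A_sends", "A_syn_arrives", "B_sends", "B_syn_arrives"]))
    # Both SYNs are already in flight when they arrive: both get through.
    print(punch(["A_sends", "B_sends", "A_syn_arrives", "B_syn_arrives"]))
```

In both orderings at least one SYN is delivered, which is why the peers keep resending until the handshake completes.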
I started to learn Linux Networking and packets filtering. In the iptables documentation it is stated that:
If a packet is destined for this box, the packet passes downwards in the diagram, to the INPUT chain. If it passes this, any processes waiting for that packet will receive it.
So, suppose there're 3 server apps on a host. Servers A and B are TCP servers, and C is UDP server.
Is it true that if we receive a UDP packet, it is delivered at the IP level to apps A, B and C? Or would the sockets of apps A and B not receive this packet at all?
TCP servers and UDP servers operate in very different ways.
At most one TCP server will listen on a given TCP port (corner cases ignored for the sake of simplicity). Connection requests (encapsulated in IP packets) destined for that port are "accepted" by exactly one process (more accurately, accepted by a process that has a file descriptor corresponding to exactly one listening endpoint). The combination of [remote_address,remote_port] and [local_address,local_port] is unique. A TCP server doesn't really receive "packets", it receives a stream of data that doesn't have any specific relationship to the underlying packets that carry the data (packet "boundaries" are not directly visible to the receiving process). And a TCP packet that is neither a connection request nor associated with any existing connection would simply be discarded.
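With UDP, each UDP datagram is logically independent and may be received by multiple listening processes. That is, more than one process can bind to the same UDP endpoint (given socket options such as SO_REUSEADDR or SO_REUSEPORT) and receive datagrams sent to it; for multicast and broadcast traffic each such socket gets a copy, while a unicast datagram normally goes to just one of them. Typically, each datagram corresponds to a single IP packet, though it is possible for a datagram to be broken into multiple packets for transmission.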
So, in your example: no, a server that is listening for TCP requests (a "TCP server") will never receive a UDP packet. The port namespaces for TCP and UDP are completely separate.
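The separation of the two port namespaces can be checked on the loopback interface. The sketch below is an illustration under assumptions rather than a definitive test: the function names are made up, the port is whatever the OS hands out, and it assumes loopback traffic is not filtered on the machine.

```python
import socket

def bind_tcp_and_udp_same_port(host="127.0.0.1"):
    """Bind a TCP listener and a UDP socket to the same port number."""
    while True:
        tcp = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        tcp.bind((host, 0))            # let the OS pick a free TCP port
        tcp.listen()
        port = tcp.getsockname()[1]
        udp = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            udp.bind((host, port))     # same number, but in the UDP namespace
            return tcp, udp, port
        except OSError:                # that UDP port is taken by someone else; retry
            tcp.close()
            udp.close()

def demo():
    tcp, udp, port = bind_tcp_and_udp_same_port()
    tcp.setblocking(False)
    udp.settimeout(2.0)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sender.sendto(b"hello", ("127.0.0.1", port))
        data, _ = udp.recvfrom(1024)   # the UDP socket receives the datagram
        try:
            tcp.accept()               # the TCP listener has nothing pending
            tcp_saw_something = True
        except BlockingIOError:
            tcp_saw_something = False
        return data, tcp_saw_something
    finally:
        for s in (tcp, udp, sender):
            s.close()

if __name__ == "__main__":
    print(demo())
```

Both binds succeed even though the port number is identical, and only the UDP socket sees the datagram.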
The delivery of the packet will depend on its destination port.
Let's assume that servers A, B and C are listening on ports 1111, 2222 and 3333 respectively; when a packet with destination port 2222 arrives, it will be delivered to server B.
My question wasn't well formulated, unfortunately. I understood that when I saw the answers. Here is the explanation I was looking for, from http://www.cs.unh.edu/cnrg/people/gherrin/linux-net.html#tth_chAp6:
> When the process scheduler sees that there are networking tasks to do it runs the network bottom-half. This function pops packets off of the backlog queue, matches them to a known protocol (typically IP), and passes them to that protocol's receive function. The IP layer examines the packet for errors and routes it; the packet will go into an outgoing queue (if it is for another host) or up to the transport layer (such as TCP or UDP). This layer again checks for errors, looks up the socket associated with the port specified in the packet, and puts the packet at the end of that socket's receive queue.
The SYN packet has the same source and destination IP address and port as an established connection, so what will happen in this case?
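The server will not accept it as a new connection, since it already has a connection in the ESTABLISHED state for that 4-tuple. Depending on the TCP stack, the duplicate SYN is dropped, answered with an ACK, or causes a reset. At least one of the four values (client-ip, src-port, server-ip, dest-port) must be different for a new SYN to be accepted.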
The server will attempt a new connection.
In technical terms, it will send a SYN-ACK packet and wait for the client to finish the TCP handshake
and open the connection.
http://en.wikipedia.org/wiki/Transmission_Control_Protocol
will explain the process a lot better than I can.
The server sends some information in its SYN-ACK packet to identify the connection,
and that information is used to keep that connection separate from others.
Most of the time, the ports will not be the same,
but when they are, it can cause problems with low-grade NAT routers:
they try to rewrite the ports that are used and can get the connections confused.
The socket API is the de facto standard for TCP/IP and UDP/IP communications (that is, networking code as we know it). However, one of its core functions, accept(), is a bit magical.
To borrow a semi-formal definition:
accept() is used on the server side.
It accepts a received incoming attempt
to create a new TCP connection from
the remote client, and creates a new
socket associated with the socket
address pair of this connection.
In other words, accept returns a new socket through which the server can communicate with the newly connected client. The old socket (on which accept was called) stays open, on the same port, listening for new connections.
How does accept work? How is it implemented? There's a lot of confusion on this topic. Many people claim accept opens a new port and you communicate with the client through it. But this obviously isn't true, as no new port is opened. You actually can communicate through the same port with different clients, but how? When several threads call recv on the same port, how does the data know where to go?
I guess it's something along the lines of the client's address being associated with a socket descriptor, and whenever data comes through recv it's routed to the correct socket, but I'm not sure.
It'd be great to get a thorough explanation of the inner-workings of this mechanism.
Your confusion lies in thinking that a socket is identified by Server IP : Server Port, when in actuality sockets are uniquely identified by a quartet of information:
Client IP : Client Port and Server IP : Server Port
So while the Server IP and Server Port are constant in all accepted connections, the client side information is what allows it to keep track of where everything is going.
Example to clarify things:
Say we have a server at 192.168.1.1:80 and two clients, 10.0.0.1 and 10.0.0.2.
10.0.0.1 opens a connection on local port 1234 and connects to the server. Now the server has one socket identified as follows:
10.0.0.1:1234 - 192.168.1.1:80
Now 10.0.0.2 opens a connection on local port 5678 and connects to the server. Now the server has two sockets identified as follows:
10.0.0.1:1234 - 192.168.1.1:80
10.0.0.2:5678 - 192.168.1.1:80
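The same situation can be reproduced on a single machine. The following is a minimal sketch using the loopback interface and an OS-chosen port instead of the addresses in the example above; the function name is made up for illustration.

```python
import socket

def accept_two_clients(host="127.0.0.1"):
    """Connect two clients to one listener and return the accepted sockets' address pairs."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))           # port 0: let the OS choose a free port
    listener.listen()
    listener.settimeout(2.0)
    server_addr = listener.getsockname()

    # Each client gets its own ephemeral local port from the OS.
    clients = [socket.create_connection(server_addr) for _ in range(2)]
    accepted = [listener.accept()[0] for _ in clients]

    # (local address, remote address) as seen by each accepted server-side socket
    pairs = [(conn.getsockname(), conn.getpeername()) for conn in accepted]

    for s in clients + accepted + [listener]:
        s.close()
    return server_addr, pairs

if __name__ == "__main__":
    server_addr, pairs = accept_two_clients()
    print("listener:", server_addr)
    for local, remote in pairs:
        print(remote, "-", local)
```

Both accepted sockets report the listener's own address and port as their local side; only the remote side differs, which is what lets the kernel tell the connections apart.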
Just to add to the answer given by user "17 of 26"
The socket is actually identified by a 5-tuple: (source ip, source port, destination ip, destination port, protocol). Here the protocol could be TCP, UDP or any other transport layer protocol. It is identified from the 'protocol' field in the IP datagram.
Thus it is possible to have two different applications on the server communicating with the same client on exactly the same 4-tuple, differing only in the protocol field. For example
Apache at server side talking on (server1.com:880-client1:1234 on TCP)
and
World of Warcraft talking on (server1.com:880-client1:1234 on UDP)
Both the client and the server can handle this, because the protocol field in the IP packet differs between the two cases even though the other four fields are the same.
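As a rough illustration of the lookup, here is a toy table keyed by the 5-tuple. It is not how a kernel actually stores its sockets; the class, the table and the function name are hypothetical, and the entries simply reuse the example above.

```python
from typing import NamedTuple, Optional

class FiveTuple(NamedTuple):
    protocol: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int

# Toy demultiplexing table: same addresses and ports, different protocol.
SOCKETS = {
    FiveTuple("TCP", "client1", 1234, "server1.com", 880): "Apache",
    FiveTuple("UDP", "client1", 1234, "server1.com", 880): "World of Warcraft",
}

def deliver(protocol: str, src_ip: str, src_port: int,
            dst_ip: str, dst_port: int) -> Optional[str]:
    """Return the application owning the matching socket, or None if there is none."""
    return SOCKETS.get(FiveTuple(protocol, src_ip, src_port, dst_ip, dst_port))

if __name__ == "__main__":
    print(deliver("TCP", "client1", 1234, "server1.com", 880))
    print(deliver("UDP", "client1", 1234, "server1.com", 880))
```

Because the protocol is part of the key, the two entries never collide even though the other four fields are identical.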
What confused me when I was learning this, was that the terms socket and port suggest that they are something physical, when in fact they're just data structures the kernel uses to abstract the details of networking.
As such, the data structures are implemented to be able to distinguish connections from different clients. As to how they're implemented, the answer is either a.) it doesn't matter, the purpose of the sockets API is precisely that the implementation shouldn't matter or b.) just have a look. Apart from the highly recommended Stevens books providing a detailed description of one implementation, check out the source in Linux or Solaris or one of the BSD's.
As the other guy said, a socket is uniquely identified by a 4-tuple (Client IP, Client Port, Server IP, Server Port).
The server process running on the Server IP maintains a database (meaning I don't care what kind of table/list/tree/array/magic data structure it uses) of active sockets and listens on the Server Port. When it receives a message (via the server's TCP/IP stack), it checks the Client IP and Port against the database. If the Client IP and Client Port are found in a database entry, the message is handed off to an existing handler, else a new database entry is created and a new handler spawned to handle that socket.
In the early days of the ARPAnet, certain protocols (FTP for one) would listen to a specified port for connection requests, and reply with a handoff port. Further communications for that connection would go over the handoff port. This was done to improve per-packet performance: computers were several orders of magnitude slower in those days.