Writing client-server application in global network - networking

I know, how to write a C# application that works through a local network.
I mean I know, how to make my client-side application access my server-side application in a single local network.
But I wonder: How do such apps, as Skype, TeamViewer, and many other connect via global network?
I apologise, if this question is simple or obvious, but I couldn't find any information about this stuff.
Please, help me, I'll be very grateful. Any information is accepted - articles, plain info, books,and so on...

Question is very wide and I try to do short overview.
Following major difference between LAN (Local Area Network) and WAN (Wide Area Network):
Network quality:
LAN is more or less stable, WAN can be with network issues like:
Packet loss (you need use loss-tolerant transport like TCP or UDP with retransmits or packet loss concealment)
Packet jitter (interpacket intervals may differ a lot from sending part). Most common thing is packets bursts.
Packet reordering
Packet duplication
Network connectivity
WAN is less stable than LAN. So you need properly handle all things like:
Connection stale
Connection loss
Errors in the middle of the connection (if you use UDP for example)
Addresses:
In WAN you deal with different network equipment between client and server (or peers in case of peer-to-peer communication). You need to take in account:
NATs - most of the clients are behind NAT and you need to pass them through. According technics are called "NAT traversal"
Firewalls - may ISP has own rules what client can do or can't. So if you do something specific like custom transport protocol you may bump into ISP firewalls.
Routing - especially multicast and broadcast communication. In common case multicast is not possible to route. Broadcasts are never routed. So you need to avail this type of communication if you want to use WAN.
May be I forgot something. But these points are major. You can read many articles about any of them.

Related

How exactly does an ethernet switch work?

I understand that it's different than a hub in that instead of packets being broadcasted to all devices connected to the device, it knows exactly who requested the packet by looking at the MAC layer.
However, is it still possible to use a packet sniffer like Wireshark to intercept packets meant for other users of the switch? Or is this only a problem with ethernet hubs that doesn't affect switches due to the nature of how a switch works?
On a slightly off topic side note, what exactly is classified as a LAN? For example, imagine two separate ethernet switches are hooked up to a router. Would each switch be considered a separate LAN? What is the significance of having multiple LAN's within the same network?
it knows exactly who requested the packet by looking at the MAC layer.
More exactly, the switch uses the MAC destination address to forward a frame to the port associated with that address. Addresses are automatically learned by looking at the MAC source address on received frames.
A switch is stateless, ie. is has no memory who requested which data. A layer-2 switch also has no understanding of IP packets, addresses or protocols. All a basic switch does is learn source addresses and forward by destination address.
is it still possible to use a packet sniffer like Wireshark to intercept packets meant for other users of the switch?
Yes. You'll need a managed switch supporting port mirroring or SPANning. This doesn't intercept frames, it just copies them to the mirror port. If you need to actually intercept frames you have to put your interceptor in between the nodes (physically or logically).
With a repeater hub, every bit is repeated to every node in the collision domain, making monitoring effortless.
what exactly is classified as a LAN?
This depends on who you ask and on the context. A LAN can be a layer-1 segment/bus aka collision domain (obsolete), a layer-2 segment (broadcast domain), a layer-3 subnet (mostly identical with an L2 segment) or a complete local network installation (when contrasted with SAN or WAN).
Adding to #Zac67:
Regarding this question:
is it still possible to use a packet sniffer like Wireshark to
intercept packets meant for other users of the switch?
There are also active ways in which you can trick the Switch into sending you data that is meant for other machines. By exploiting the Switch's mechanism, one can send a frame with a spoofed source MAC, and then the Switch will transfer frames destined to this MAC - to the sender's port (until someone else sends a frame with that MAC address).
This video discusses this in detail:
https://www.youtube.com/watch?v=YVcBShtWFmo&list=PL9lx0DXCC4BMS7dB7vsrKI5wzFyVIk2Kg&index=18
In general, I recommend the following video that explains this in detail and in a visual way:
https://www.youtube.com/watch?v=Youk8eUjkgQ&list=PL9lx0DXCC4BMS7dB7vsrKI5wzFyVIk2Kg&index=17
what exactly is classified as a LAN?
So indeed this is one of the least-well-defined terms in Computer Networks. With regards to the Data Link Layer, a LAN can be defined as a segment, that is - a broadcast domain. In this case, two devices are regarded as part of the same segment iff they are one hop away from one another - that is, they can switch frames in the second layer.

Connecting P2P over NAT?

I started to explore the option of connecting with other using a p2p connection, so I coded a simple socket program in JAVA for android devices in which the users can share simple messages p2p (I didn't have any idea about NAT then). I got to know about NAT, so I now need to establish a TCP connection with another user which uses a server for discovery but payload is transferred p2p. I have also looked at XMPP(a very good and detailed explanation of how protocol works is here) and UPnP but I dont know how to implement them.
Another interesting question that arises is of BitTorrent because they can work on any device and even behind a NAT. I am not able to get any explanation of how BitTorrent works.
I have researched a lot but I am stuck.
My questions are:
A detailed explanation of BitTorrent(like here, not how torrents work) and how is it able to work around NAT ?
Is there a way to make a NAT entry programmatically ?
Is socket programming sufficient for p2p ?
How difficult is it to create your own protocol and how can I build one ?
If two devices D1 and D2 want to communicate p2p and they know each other's IP. D1 sends a request to D2 and that can't get through the D2's NAT, but there should be an entry created in D1's NAT. So when D2 tries to send something D1's NAT should discover an entry with D2's IP. Then why is the packet not allowed by it ?
Another interesting question that arises is of BitTorrent because they can work on any device and even behind a NAT. I am not able to get any explanation of how BitTorrent works.
This statement looks like you assume that bittorrent needs full connectivity to operate.
That is incorrect.
Behind a NAT device you will still be able to establish outgoing TCP connections. Which generally is sufficient for bittorrent as long as there are other, non-NATed (or NATed but properly port-forwarded) clients in the network that can accept incoming connnections.
NAT has no impact on the flow direction of the data because connections are bi-directional once they are established. It only is problematic for the initial connection setup.
This works perfectly fine for bittorrent because bittorent does not care from which specific node you get your data.
Although better connectivity generally does improve performance.
If the identity of the node matters or one-on-one transfers are an important use-case then other p2p protocols usually attempt NAT traversal first and if that fails rely on 3rd party nodes relaying traffic between those nodes who cannot connect to each other directly.
Additionally, IPv6 support will become essential in the future to maintain end-to-end connectivity because more and more ISPs are starting to roll out carrier-grade NAT for IPv4 while IPv6 will remain non-NATed
One thing need to be clear is that 100% P2P between all type of NAT is impossible right now. There is no practical way to establish P2P connectivity between **Symmetric and Symmetric/PRC NAT. In this scenario connection is established through a relay server called TURN.
I am answering from your 2nd question because I don't know much about the first one.
2) Yes. You can send a packet through your NAT and there will be a mapping between your internal IP:Port to your NAT's external IP:Port. You can know these external IP:Port by sending a stun request. Note that this technique doesn't work for Symmetric NAT.
3)Yes socket programming sufficient for p2p.
4)Why do you need a protocol when there already exists several. ICE protocol is the best today for NAT traversal and I don't think it was easy to create. UPnP and NAT-PMP is really vulnerable in terms of security.
5)I think what happens is usually NAT blocks unknown packets coming to it. So when D1 sends a packet to D2, its NAT blocks all packets incoming from D1s IP:Port. That is why connection establishment fails. You have to employ hole punching technique for D1 and D2 to successfully establish P2P connectivity.
**By symmetric NAT I mean symmetric NAT with random port allocation.
There is a paper on "Peer-to-Peer Communication Across Network Address Translators" which describes the UDP hole punching method and extends it to be used over TCP as well.
Of course, you will always need a relay server for the cases where hole punching is not supported.
Recent versions of BitTorrent use µTP, which is layered above UDP, not TCP. µTorrent uses a private extension (ut_holepunch) that performs UDP hole punching, most other implementations don't bother (with the notable exception of Tixati).
Some NAT routers accept port forwarding requests using either the uPNP or the PMP protocol. Whether this is supported depends on the particular brand of router and its configuration.
Yes, socket programming is enough for P2P.
Difficult to answer. I suggest that you read the wikified and annotated BitTorrent specification for a start.
Yes, this is the principle behind UDP hole punching.

If TCP is connection oriented why do packets follow different paths?

According to my knowledge if an internet application has to be designed, we should use either a connection-oriented service or connection-less service, but not both.
Internet's connection oriented service is TCP and connection-less service is UDP, and both resides in the transport layer of Internet Protocol stack.
Internet's only network layer is IP, which is a connection-less service. So it means whatever application we design it eventually uses IP to transmit the packets.
Connection-oriented services use the same path to transmit all the packets, and connection-less does not.
Therefore my problem is
if a connection oriented application has been designed, it should transmit the packets using the same path. But IP breaks that rule by using different routes.So how do both TCP and IP work together in this sense? It totally confuses me.
You, my friend, are confusing the functionality of two different layers.
TCP is connection oriented in the sense that there's a connection establishment, between the two ends where they may negotiate different things like congestion-control mechanism among other things.
The transport layer protocols' general purpose is to provide process-to-process delivery meaning that it doesn't know anything about routes; how your packets reach the end system is beyond their scope, they're only concerned with how packets are being transmitted between the two end PROCESSES.
IP, on the other hand, the Network layer protocol for the Internet, is concerned with data-delivery between end-systems yet it's connection-less, it maintains no connection so each packet is handled independently of the other packets.
Leaving your system, each router will choose the path that it sees fit for EACH packet, and this path may change depending on availability/congestion.
How does that answer your question?
TCP will make sure packets reach the other process, it won't care HOW they got there.
IP, on the other hand, will not care if they reach the other end at all, it'll simply forward each different packet according to what it sees most fit for a particular packet.
Note:
Let's assume that IP was connection-oriented, would that mean packets would follow the same-path?
Not necessarily, it depends on what the word 'connection' at this layer means, if it means negotiating certain options related to security, for instance, you may still have all your packets being forwarded through different routes over the Internet.
EDIT:
Not to confuse you though, most connection-oriented services at the network-layer and below mean that the connection, when established, also establishes a virtual-path that all 'packets' must follow, for further information read about:
Virtual circuit and frame-relay networks
This link answers your question pretty well http://www.tcpipguide.com/free/t_ConnectionOrientedandConnectionlessProtocols-3.htm
Some people consider this (TCP) to be like a “simulation” of circuit-switching at higher network layers; this is perhaps a bit of a dubious analogy. Even though a TCP connection can be used to send data back and forth between devices, all that data is indeed still being sent as packets; there is no real circuit between the devices. This means that TCP must deal with all the potential pitfalls of packet-switched communication, such as the potential for data loss or receipt of data pieces in the incorrect order.
The TCP protocol deals with the problem of IP packets arriving out of order or being lost, to give you the feeling they arrive through a single FIFO channel. Yes, TCP is smart enough to do that, there's no need for a dedicated underlying channel.
The TCP protocal is implemented by the sending/receiving machines, once the packets leave the sending machine, the routers they travel along know nothing about TCP, they just use IP to get the packets from the source the to destination. Then, it is the destination machines job to, using TCP, make sure that all the packets arrive and that they arrive in the correct order. The internet itself doesn't know anything about TCP, it's just a layer (often software) that gives connection to a connectionless medium (the internet).
So onces a packet leaves a destination, it can go along any path (mostly) as long as it gets to the desintation, regardless of the higher level protocol (such as TCP or UDP).
I mean, it's a bit more complicated then that, but as far as I can remember that's the general Idea.
Refer my short points properly,
1) Connection oriented means ==> reserving resources(buffer,cpu,bandwidth etc.)..but "Where??".(where resources are reserved?? This where is reason of your confusion, so following is ans.).
2) Connection oriented at Transport Layer means ==> Reserving the resources at Both End processes/Ports.(Since TCP is a transport layer,then its responsibility is to reserve the resources at both end processes only,irrespective of whats happening in the intermediate path.)
3) Connection oriented at Network Layer means ==> Reserving the resources at Network Layers.(Now In the whole journey of a packet from source to destination, Network layer is found at all intermediate routers too(but not transport layer). Hence if any protocol at Network layer is connection oriented then,its responsibility is to reserve resources at all intermediate routeres too i.e. all packets will have to follow same intermediate path, But IP is connection less hence,intermediate resources will not be reserved. i.e journey of a packets may follow different paths etc.)
#CONCLUSION:==> Intermediate path is decided by Network Layer, hence if IP then paths may be different.(IP may contain TCP),But TCP is responsible for Resource reservation at both End processes ,irrespective of Intermediate path of packet.
router works on three layers only (physical , data link and Network layers) , so routers will take decision depending only on the info. of network layer (IP protocol ) hence there is no information available about its TCP or UDP at the router

Creating a TCP connection between 2 computers without a server

2 computers are in different subnets.
Both are Windows machines.
There are 2-5 IGMP-ready routers between them.
They can connect each other over multicast protocol (they have joined the same multicast group and they know about each other's existance).
How to establish a reliable TCP connection between them without any public server?
Programming language: C++, WinAPI
(I need a TCP connection to send some big critical data, which I can not entrust to UDP)
You haven't specified a programming language, so this whole question may be off-topic.
Subnets are not the problem. Routability is the problem. Either there is routing set up or there isn't. If they are, for example, both behind NAT boxes, then you're at the mercy of the configuration of the nat boxes. If they are merely on two different subnets of a routed network, it's the job of the network admin to have set up routing. So, each has an IP address, and either can address the other.
On one machine, you are going to create a socket, bind it to some port of your choice, and listen. On the other, you will connect to the first machine's IP + the selected port.
edit
I'm going to try again, but I feel like there's a giant conceptual gap here.
Once upon a time, the TCP/IP was invented. In the original conception, every item on the network has an IPV4 address, and every machine could reach every other machine, via routing, except for machines in the 'private' address space (10.x, etc).
In the very early days, the only 'subnets' were 'class A, class B, class C'. Later the idea of subdividing a network via bitmasks was added. The concept of 'subnet' is just a way of describing a piece of network in which all the hosts can deliver packets to each other by one hop over some transport or another. In a properly configured network, this is only of concern to operating system drivers. Ordinary programs just address packets over the network and they arrive.
The implementation of this connectivity was always via routing protocol. If you have a (physical) ethernet A over here, and a (physical) ethernet B over there, connected by some sort of point-to-point link, the machines on A need to know where to send packets for B. Or, to be exact, they need to know where to send 'not-A' packets, and whatever they send them needs to know where to send 'B' packets. In simple cases, this is arranged via explicit configuration: routing rules stuffed into router boxes or even computers with multiple physical interfaces. In more complex cases, routing boxes intercommunicate via protocols like EGP or BGP or IGMP to learn the network topology.
If you use the Windows 'route' command, you will see the 'default route' that the system uses to send packets that need to leave the local subnet. It is generally the address of the router box responsible for moving information from the local subnet to everywhere else.
The whole goal of this routing is to arrange that a packet sent from a.b.c.d to e.f.g.h will get there. TCP is no different than UDP, except that you can't get there by multicast or broadcast: you need to know the exact address of your correspondent.
DNS was invented to allow hosts to learn each other's IP addresses without having human being send them around in email messages.
All this stops working when people start using NAT and firewalls to turn off routing. The whole idea of NAT is that the computers behind the NAT box are not addressable at all. They all appear to have one IP address. They can send stuff out, but they can only receive stuff if the NAT box has gone to extra trouble to map them a port.
From your original message, I sort of doubt that NAT is in use here. I just don't understand your comment 'I don't have access to the network.' You say that you've sent UDP packets here and there. So how did you do that? What addresses did you use?

Direct TCP/IP connections in P2P apps

From a Joel's post on Copilot:
Direct Connect! We’ve always done
everything we can to make sure that
Fog Creek Copilot can connect in any
networking situation, no matter what
firewalls or NATs are in place. To
make this happen, both parties make
outbound connections to our server,
which relays traffic on their behalf.
Well, in many cases, this isn’t
necessary. So version 2.0 does
something rather clever: it sets up
the initial connection through our
servers, so you get connected right
away with 100% reliability. But then
once you’re all connected, it quietly,
in the background, looks for a way to
make a direct connection. If it can’t,
no big deal: you just keep relaying
through our server. If you can make a
direct peer-to-peer connection, it
silently shifts your data onto the
direct connection. You won’t notice
anything except, probably, much faster
communication.
How do they change the server connection to a P2P connection?
It's pretty tricky and interesting. I'm sure I have some details wrong, but the overview is this:
The programs can already talk to each other through Joel's server, so they can exchange information with each other and Joel's server. Further, Joel has their external IP addresses, and they give joel information about their internal IP addresses.
They decide to try this hole punch technique. Computer A initiates a TCP connection with Computer B using B's external IP address. It won't go through, but what it does is tell's A's router that it needs to allow incoming packets from B on a given port.
Computer B does the same thing, but its message gets through to A since A's router opened a port/ip combination that matches what B sent (there's some port magic that happens here - this is non trivial, but doable).
B's router remembers that B initiated a connection with A on a given port and IP, and so A's packets now flow into B past their router correctly as well.
So it's actually pretty straight forward, but the implementation has details, especially regarding how ports are given to new TCP connections, and how NAT routers typically deal with TCP requests and how they map to external ports. These details are the interesting, and difficult, bit.
-Adam
I believe the simple version is that they drop the server connection and replace it with the P2P connection.
Something along the lines of:
Machine1 connects to copilot's servers.
Machine1 connects to copilot's servers.
Machine1 connects to copilot's servers.
Machine2 subsequently connects, and they begin screen sharing.
Machine2 opens a port intended for Machine1 to connect to.
Machine1 tries to connect to the now open port on Machine2.
If this connection is established:
The connection to copilot's servers is severed.
Data is instead transfered over the direct (P2P) connection between the two machines.
There is a technique called "Hole Punching" that works well with "Cone" NAT (Cone is a technical familly of router). That's not an 100% sure technique, today, it works well with UDP on about 80% of the router.
There is some implementations of library to realize Hole Punching: STUN (wikipedia)

Resources