In my peer to peer application, should I use multiple ports? - networking

I am building a simple peer to peer application where about 8 participants all connect to each other (n*n). I will be using UDP with a reliability and ordering protocol layered on top. Each peer will be broadcasting a few KB of data per second.
It occurred to me that there are two ways of configuring the ports on each peer:
Each peer takes one port and all messages are received via that
Each peer takes a port for each other peer, and only communicates with a peer using its corresponding port
What are the advantages and disadvantages of each approach?

Creating only one port to communicate with each peers is the best option. You create only one socket and use only one port to send/receive data with as many peers as you want. You can distinguish received data by seeing source address of each packet. This way has Less complicated code and is more resource efficient.
Creating multiple port has absolutely no advantage. It will complicate your code and will use much more resource with no benefit. Resource consumption will grow with more peers.

I have never done anything with this model of application, but here are my few thoughts.
1 big thing to consider will be whether any firewalls are in place between the peers. If so, a hole will need to be punched through for each listening port. This can be done either once (eg. router port-forward rule) or dynamically (UPnP, etc), but you may not be able to count on automatic full-cone NAT to do this for you. If you are expecting any firewalls between your peers, I would recommend using a single port for ease of programming on your part and identify your peers strictly by their remote address or some other in-protocol identifier.
Using a single port per user will make identifying communication much simpler, but if you expect your number of participants to grow by n(n-1)/2. If you never expect more than a small number (eg. 20), port per peer will work decently well without much effort.
Another option for you (possibly) may be using multicast. If all of your peers are on the same broadcast domain, this will reduce bus contention and may make your coding somewhat cleaner as well.
Hope this helps. I apologize if this isn't what you're looking for. Good luck!

Each peer takes one port and all messages are received via that
If each peer can get the source IP/source port of the incoming datagram (and I bet it can), this is enough to differentiate the peers.
Each peer takes a port for each other peer, and only communicates with
a peer using its corresponding port
See above, and most importantly this contradict your base idea of broadcasting in the first place. It just add a level of complexity (and is probably not very scalable, even if for now you envision just 8 peers).
In your base requirement I think you may have a dilemma between:
broadcast everything to everyone,
but still you want a peer to be able to "only communicates with a peer", which is inherently unicast.
This raises some problems, as you already realized by asking the question.
I see 2 other problems:
Scalability-wise, the broadcast everything approach whereas you sometime actually need unicast is going to put some useless load on the network. This is not pretty.
The broadcast approach dictates UDP, but still you want reliable data transfer, so as you stated you'll have to add a "reliability and ordering protocol layered on top". This (not so easy) work would not be needed if only we could use TCP.
There is a third approach:
use broadcast UDP for each peer to announce itself on the network, so that other peers can...
...discover it and then establish a unicast TCP connection with this peer. No more reliability and ordering problems + reduced network load.
This approach is used in SSDP (Simple Service Discovery Protocol), part of UPnP. I do not suggest you use SSDP, it's probably bloated for what you want to do, you said you wanted something simple.
All in all, you first have to resolve your dilemma: decide and differentiate the data that really need to be broadcasted vs the unicast part. YMMV.
PS: with broadcast UDP also comes the problem that though OK on a LAN, this will not pass a router unless you use multicast routing. But that's another story.

Related

TCP/IP - Why does a part of a packet may use a connection-less services in a connection-oriented service.

While reading the book on TCP/IP I came across the words which are as "Although it looks as though the use of the flow label may make the source and destination addresses useless, the parts of the Internet that use connection-less service at the network layer still keep these addresses for several reasons.One reason is that part of the packet path may still be using the connection-less service. Another reason is that the protocol at the network layer is designed with these addresses and it may take a while before they can be changed". Now my question to you is if a connection has been formed between hosts in a connection-oriented manner then how come a path of a packet may still be using the connection-less services. Because as per my knowledge prevails the virtual path always be formed at while 3-way handshake is taking place which is the TCP/IP connection (which uses a connection-oriented service) ? And my second question for the second reason is that which protocol they are talking about since these words are stated below the Heading of "Connection-Oriented Services" therefore, it's making me pissed off to understand the literal meaning behind the words(The core conceptual understanding). And correct if anyone thinks I am having a wrong concept at any place. I'll be obliged. Thanks.
TCP as a connection-oriented protocol runs on top of IP which is connection-less. The routers used in transport only look at the IP packet, the TCP segment is simply payload and transported along. TCP provides several algorithms to form a virtual connection over a connection-less network.
The IP packet goes from hop to hop. On each hop, a router makes a forwarding decision solely based on the destination IP address. (More sophisticated devices may inspect more packet elements including source address and payload, but they aren't simple routers.)
The "path" is made up of all these individual hops. Because each hop is based on an independent routing decision the path can change at any time and for any packet. The path is not laid out by the TCP handshake.
Basically, you have to look at each protocol layer individually. Each one serves its own function.
I hope this also answers the second question.

Writing client-server application in global network

I know, how to write a C# application that works through a local network.
I mean I know, how to make my client-side application access my server-side application in a single local network.
But I wonder: How do such apps, as Skype, TeamViewer, and many other connect via global network?
I apologise, if this question is simple or obvious, but I couldn't find any information about this stuff.
Please, help me, I'll be very grateful. Any information is accepted - articles, plain info, books,and so on...
Question is very wide and I try to do short overview.
Following major difference between LAN (Local Area Network) and WAN (Wide Area Network):
Network quality:
LAN is more or less stable, WAN can be with network issues like:
Packet loss (you need use loss-tolerant transport like TCP or UDP with retransmits or packet loss concealment)
Packet jitter (interpacket intervals may differ a lot from sending part). Most common thing is packets bursts.
Packet reordering
Packet duplication
Network connectivity
WAN is less stable than LAN. So you need properly handle all things like:
Connection stale
Connection loss
Errors in the middle of the connection (if you use UDP for example)
Addresses:
In WAN you deal with different network equipment between client and server (or peers in case of peer-to-peer communication). You need to take in account:
NATs - most of the clients are behind NAT and you need to pass them through. According technics are called "NAT traversal"
Firewalls - may ISP has own rules what client can do or can't. So if you do something specific like custom transport protocol you may bump into ISP firewalls.
Routing - especially multicast and broadcast communication. In common case multicast is not possible to route. Broadcasts are never routed. So you need to avail this type of communication if you want to use WAN.
May be I forgot something. But these points are major. You can read many articles about any of them.

Doubts Related to working of torrents?

I was trying to understand how torrents work?
And after reading a lot on web I now know the basics about it but I have a very
important question related to working of torrents!
In torrents how do peer-to-peer connections take place?
Almost all the peers have private-IP(for e.g 192.x.x.x) addresses then how does connections take place without a server(As I have read: There is no server involved in torrents) ?
Thanks a lot!
There are a few alternatives:
Peers behind NAT simply don't connect to other peers behind NATs. This creates two classes of peers, where the ones that are connectable will have an advantage when trading pieces, and typically achieve faster download rates.
Peers behind NAT use UPnP or NAT-PMP to set up port forwarding in order to be connectable by other peers
peers using uTP and Peer exchange can support a simple hole-punching mechanism (uTorrent and libtorrent supports this for instance). A peer can help in introducing two of its connections to each other, they try to connect to at the same time and of one of them have a full-cone NAT, they are very likely to succeed in establishing the connection.
Peers supporting DHT and uTP may use a relatively new feature where the port announced to the DHT is derived from their UDP packets. Using the same socket for DHT and uTP increases the chances that a peer behind a full-cone NAT can accept incoming connections without UPnP or NAT-PMP set up. Simply because the DHT traffic will keep a pinhole open on the NAT.
If you have a swarm of only peers behind symmetric NATs, nobody is going to be able to connect to anyone else, and bittorrent is not going to work. In practice (at least in moderately large swarms) there are always some peers that are connectable.

If TCP is connection oriented why do packets follow different paths?

According to my knowledge if an internet application has to be designed, we should use either a connection-oriented service or connection-less service, but not both.
Internet's connection oriented service is TCP and connection-less service is UDP, and both resides in the transport layer of Internet Protocol stack.
Internet's only network layer is IP, which is a connection-less service. So it means whatever application we design it eventually uses IP to transmit the packets.
Connection-oriented services use the same path to transmit all the packets, and connection-less does not.
Therefore my problem is
if a connection oriented application has been designed, it should transmit the packets using the same path. But IP breaks that rule by using different routes.So how do both TCP and IP work together in this sense? It totally confuses me.
You, my friend, are confusing the functionality of two different layers.
TCP is connection oriented in the sense that there's a connection establishment, between the two ends where they may negotiate different things like congestion-control mechanism among other things.
The transport layer protocols' general purpose is to provide process-to-process delivery meaning that it doesn't know anything about routes; how your packets reach the end system is beyond their scope, they're only concerned with how packets are being transmitted between the two end PROCESSES.
IP, on the other hand, the Network layer protocol for the Internet, is concerned with data-delivery between end-systems yet it's connection-less, it maintains no connection so each packet is handled independently of the other packets.
Leaving your system, each router will choose the path that it sees fit for EACH packet, and this path may change depending on availability/congestion.
How does that answer your question?
TCP will make sure packets reach the other process, it won't care HOW they got there.
IP, on the other hand, will not care if they reach the other end at all, it'll simply forward each different packet according to what it sees most fit for a particular packet.
Note:
Let's assume that IP was connection-oriented, would that mean packets would follow the same-path?
Not necessarily, it depends on what the word 'connection' at this layer means, if it means negotiating certain options related to security, for instance, you may still have all your packets being forwarded through different routes over the Internet.
EDIT:
Not to confuse you though, most connection-oriented services at the network-layer and below mean that the connection, when established, also establishes a virtual-path that all 'packets' must follow, for further information read about:
Virtual circuit and frame-relay networks
This link answers your question pretty well http://www.tcpipguide.com/free/t_ConnectionOrientedandConnectionlessProtocols-3.htm
Some people consider this (TCP) to be like a “simulation” of circuit-switching at higher network layers; this is perhaps a bit of a dubious analogy. Even though a TCP connection can be used to send data back and forth between devices, all that data is indeed still being sent as packets; there is no real circuit between the devices. This means that TCP must deal with all the potential pitfalls of packet-switched communication, such as the potential for data loss or receipt of data pieces in the incorrect order.
The TCP protocol deals with the problem of IP packets arriving out of order or being lost, to give you the feeling they arrive through a single FIFO channel. Yes, TCP is smart enough to do that, there's no need for a dedicated underlying channel.
The TCP protocal is implemented by the sending/receiving machines, once the packets leave the sending machine, the routers they travel along know nothing about TCP, they just use IP to get the packets from the source the to destination. Then, it is the destination machines job to, using TCP, make sure that all the packets arrive and that they arrive in the correct order. The internet itself doesn't know anything about TCP, it's just a layer (often software) that gives connection to a connectionless medium (the internet).
So onces a packet leaves a destination, it can go along any path (mostly) as long as it gets to the desintation, regardless of the higher level protocol (such as TCP or UDP).
I mean, it's a bit more complicated then that, but as far as I can remember that's the general Idea.
Refer my short points properly,
1) Connection oriented means ==> reserving resources(buffer,cpu,bandwidth etc.)..but "Where??".(where resources are reserved?? This where is reason of your confusion, so following is ans.).
2) Connection oriented at Transport Layer means ==> Reserving the resources at Both End processes/Ports.(Since TCP is a transport layer,then its responsibility is to reserve the resources at both end processes only,irrespective of whats happening in the intermediate path.)
3) Connection oriented at Network Layer means ==> Reserving the resources at Network Layers.(Now In the whole journey of a packet from source to destination, Network layer is found at all intermediate routers too(but not transport layer). Hence if any protocol at Network layer is connection oriented then,its responsibility is to reserve resources at all intermediate routeres too i.e. all packets will have to follow same intermediate path, But IP is connection less hence,intermediate resources will not be reserved. i.e journey of a packets may follow different paths etc.)
#CONCLUSION:==> Intermediate path is decided by Network Layer, hence if IP then paths may be different.(IP may contain TCP),But TCP is responsible for Resource reservation at both End processes ,irrespective of Intermediate path of packet.
router works on three layers only (physical , data link and Network layers) , so routers will take decision depending only on the info. of network layer (IP protocol ) hence there is no information available about its TCP or UDP at the router

How much additional load does a multicast subscriber impose on a switch?

If you have a switch with at least one subscriber to a multicast address, how much additional load would each additional subscriber add?
Example:
You have a 10G switch (with IGMP) with 10 servers and no other activity.
When Server1 subscribers to a 1G multicast feed, the switch will have 1G of load.
What would the load be after Server2 and Server3 subscribed?
Obviously traffic to the switch would not increase, but what about the switch's internal load?
Houw would the answer be different without IGMP?
The whole idea of multicast is that it is efficient. The presence of one subscriber downstream causes the switch to send an IGMP join request of its own upstream and pass incoming multicasts downstream, without duplication. The addition of further downstream subscribers has no effect at all except to increment an internal subscriber count for that group. When that goes back to zero it sends an IGMP leave request of its own upstream.
I don't know what you mean by 'without IGMP'. There is no such thing as UDP multicast without IGMP. It is a contradiction in terms.
Firstly, some background information for you.
The traditional definition of routers and switches are along the lines of:
Router: a device capable of routing a packet form one IP subnet to a different IP subnet
Switch: a device capable of switching a packet within the same IP subnet
However, this traditional definition no longer holds these days because we have switches that can route traffic from one IP subnet to another IP subnet and even perform complex operations such as QoS at wire speed.
Therefore it is often easier to redefine Routers and Switches as follows:
Router: a device that uses the CPU to route packets, often inspects parts of packets that are higher up the OSI layer.
Switch: a device with ASIC(s) (a.k.a switching chips) that switches/routes traffic at full wire speed. What this means is that if the switch has 24 1Gbps ports, it will be able to switch 24Gbps bi-directional traffic without dropping any packets.
Now, to answer your question, it is important to determine whether the ASIC in your switch is capable of handling multicast traffic or not. If so, adding "load" really isn't an issue, as long as you ensure that each switch port is not congested (e.g. 2Gbps of traffic trying to egress out of 1Gbps port). If the ASIC in your switch is NOT capable of handling multicast traffic, it is highly likely that the switch will simply send all multicast traffic up to the CPU. Then it would be up to the software to determine where each packet goes. CPUs on switches are not powerful, because their primary role isn't to route/switch packets, but to manage the switch (e.g. configure the ASIC so that packets get switched properly). Therefore, if your switch is sending packets up to the CPU, the switch will struggle. You won't get anywhere near 1Gbps of multicast via the CPU.
Without IGMP, switches, by default, will flood out the traffic on all ports. Again, this is not a problem for the switch itself because it can handle this at wirespeed. It may cause problems for other parts of the network because traffic is needlessly being duplicated.
The reason for this long answer is because the phrase "10G switch" in your example is quite misleading, and it led me to believe that you maybe thinking that a powerful CPU sits at the center of the switch that is capable of performing 10Gbps bi-directional switching. This is simply not the case, and talking about "load" on a switch therefore often makes little sense.
I hope this helps.

Resources