Calculating theoretical network bandwidth in topology

I'm in the process of building a discrete-event simulator, and need to be able to calculate the theoretical bandwidth available between two systems in a given network topology, so that I can "time" how long a transfer will take to occur and create an event at its expected completion time.
At the moment, for simplicity, I do not consider the switches' backplanes or the likelihood of collisions / congestion occurring within the network. I am simply interested in the maximum transfer rate between all communicating systems.
For instance, consider the following sample network topology:
We assume the following connections:
Source 1, Source 2 -> (sending to) Dest 1
Source 3, Source 4 -> (sending to) Dest 2
Given these connections, what is the maximum effective transfer rate of all sources?
If we visualize this as a graph, I can calculate this manually by starting from the sources and evaluating, at each switch level, the incoming traffic against the switch's uplink capacity.
For instance, Source #1 in this scenario has 50 Mbps of effective bandwidth to Dest 1
1 Gbps * S1(1/2) * S2(1) * S3(1/10) = 50 Mbps
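As a rough sketch, the manual walk above can be expressed in code; the share factors are the hypothetical values from the example (1 Gbps * 1/2 * 1 * 1/10):

```python
# Rough sketch of the manual path walk described above, using hypothetical
# share factors matching the worked example: at each switch on the path,
# the flow keeps only its share of the uplink.
LINK_RATE_MBPS = 1000  # 1 Gbps access link from Source 1

path_share_factors = [
    ("S1", 1 / 2),   # two sources contend for S1's uplink
    ("S2", 1),       # no contention at S2
    ("S3", 1 / 10),  # uplink is a tenth of the incoming traffic
]

effective_mbps = LINK_RATE_MBPS
for switch, share in path_share_factors:
    effective_mbps *= share

print(f"Source 1 -> Dest 1: {effective_mbps:.0f} Mbps")  # 50 Mbps
```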
However, I'm curious what other methods could be used to calculate this, or whether there is a more effective approach I could use to "predict" network traffic.
Any feedback is appreciated -- thanks.

This is essentially a max-min fairness problem.
https://en.wikipedia.org/wiki/Max-min_fairness
The progressive filling algorithm (described in the Wikipedia article) is a simple solution to this problem:
If resources are allocated in advance in the network nodes, max-min fairness can be obtained by using an algorithm of progressive filling. You start with all rates equal to 0 and grow all rates together at the same pace, until one or several link capacity limits are hit. The rates for the sources that use these links are not increased any more, and you continue increasing the rates for the other sources. All the sources that are stopped have a bottleneck link. This is because they use a saturated link, and all other sources using the saturated link are stopped at the same time, or were stopped before, and thus have a smaller or equal rate. The algorithm continues until it is not possible to increase. Lastly, when the algorithm terminates, all sources have been stopped at some time and thus have a bottleneck link. This allocation is max-min fair.
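A minimal Python sketch of progressive filling, assuming each flow is described by the set of links it traverses and each link has a known capacity (the flows and capacities in the example at the bottom are hypothetical):

```python
def progressive_filling(flows, capacity):
    """Max-min fair rate allocation via progressive filling.

    flows:    dict flow_id -> set of link ids the flow traverses
    capacity: dict link_id -> link capacity (e.g. in Mbps)
    Returns:  dict flow_id -> allocated rate
    """
    rate = {f: 0.0 for f in flows}
    remaining = dict(capacity)
    active = set(flows)

    while active:
        # How many still-active flows cross each link.
        load = {l: sum(1 for f in active if l in flows[f]) for l in capacity}
        used = {l: n for l, n in load.items() if n > 0}
        if not used:
            break  # remaining flows cross no capacity-limited link
        # Grow all active rates equally until the first link saturates.
        delta = min(remaining[l] / n for l, n in used.items())
        for f in active:
            rate[f] += delta
        for l, n in used.items():
            remaining[l] -= n * delta
        # Flows that cross a saturated link have found their bottleneck.
        saturated = {l for l in used if remaining[l] <= 1e-9}
        active = {f for f in active if not (flows[f] & saturated)}
    return rate


# Hypothetical example: flows f1 and f2 share link A (100 Mbps); f2 also
# crosses the slower link B (10 Mbps). Result: {'f1': 90.0, 'f2': 10.0}.
print(progressive_filling({"f1": {"A"}, "f2": {"A", "B"}},
                          {"A": 100, "B": 10}))
```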

Related

Trading off between User Bandwidth and Download Interval

I am designing a non-commercial open-source client app which needs to download exactly 100 KB of data from a server at a regular interval and show an alert in the client app based on changes in that data. Now I need to trade off between the user's bandwidth and the download interval.
Analysis:
If I set the interval = 1 hour, then within 1 month the app will download 30*24*100 KB = 72 MB.
If I set the interval = 30 mins, then within 1 month the app will download 30*48*100 KB = 144 MB.
And so on.
So far I am considering only the file size, but in practice some portion of the bandwidth will be used for control flow in addition to the data flow. For downloading a file of exactly 100 KB from the server, how much control-flow overhead should I allow for in my analysis for TCP communication? Is there any guideline/reference or research on this topic?
For example, if 10 KB were used for control flow, the total monthly usage would include 14.4 MB of extra data that needs to be accounted for in my analysis.
Note: (1) I am limited to analysing only the client-app part. (2) No changes on the server side can be made at the moment (i.e. switching from pull-based to push-based, a partial-data-change API, etc. cannot be applied). (3) I am limited to downloading the file using TCP. (4) Although this much granularity is not often considered in practice, let's assume that in my case the analysis needs to be granular enough that I need to know the data vs. control bandwidth ratio.
If you are asking only for the TCP/IP part, the payload/PDU ratio is 1460/1500 for IPv4 and 1440/1500 for IPv6, assuming an MTU of 1500 bytes (sources: this already mentioned discussion, this other discussion, this other article).
I also found this really nice page that allows you to see all the header sizes for an arbitrary protocol stack and this academic paper.
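For a rough back-of-the-envelope estimate using only that 1460/1500 ratio (and ignoring the handshake, ACKs, and retransmissions discussed below), the header overhead for one 100 KB download and the resulting monthly total work out roughly as follows:

```python
import math

# Rough estimate of the TCP/IPv4 header overhead for one 100 KB download,
# based on the 1460/1500 payload/PDU ratio above (40 bytes of TCP+IP headers
# per full-size segment with a 1500-byte MTU).
PAYLOAD = 100 * 1024   # bytes per download (exactly 100 KB)
MSS = 1460             # payload bytes per full-size TCP segment
HEADERS = 40           # TCP + IPv4 header bytes per segment

segments = math.ceil(PAYLOAD / MSS)
overhead = segments * HEADERS                      # header bytes per download
downloads_per_month = 30 * 48                      # 30-minute interval
monthly_total = downloads_per_month * (PAYLOAD + overhead)

print(f"{segments} segments, ~{overhead / 1024:.1f} KB header overhead per download")
print(f"~{monthly_total / 1024 ** 2:.1f} MB per month including headers")
```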
However, besides the protocol headers, there are further effects that reduce the effective bandwidth:
TCP will send additional messages, e.g. for performing a handshake when establishing the connection,
Retransmission of data may occur,
Actual frame sizes are negotiated on the lower communication layers, so TCP segments might be smaller than assumed.
In summary, this is not easy to answer precisely, because there are influences in the transmission process that are beyond your control.
Have you considered measuring the actual amount of data needed to transmit one (or more) 100 KB chunk(s) of payload, rather than performing a theoretical analysis?

Predicting/calculating congestion in telecom network

I have an application installed on my phone which provides the following details every minute: bandwidth, packet loss, signal strength, and RTT to google.com.
I am trying to predict congestion based on these 4 attributes, but somehow the result doesn't look accurate to me; previously I had only used bandwidth.
I want to predict congestion at any point in time more accurately, and would appreciate any recommendations.
I think you are saying you are trying to measure network 'responsiveness', and from these measurements get a sense of how congested the network is. You also mention you want to predict, which I guess means you want to estimate future 'responsiveness' based on your measurements and observations.
The items you are measuring look sensible, although you may want to include jitter if you are interested in VoIP or other real time streamed media.
The issue you have is that there are many variables which can affect your measurements, for example:
congestion in the radio cell you are in at the time
congestion in the backhaul network
delays in the server you are using to measure the RTT
congestion or faults with the particular APN your mobile is using to access data services
network faults
As some of these can occur irregularly but have a large impact, it is quite hard to build up an accurate view of the overall network 'responsiveness' with a single handset. For example, your local cell may be busy or have a problem while other users of Google.com in other cells get a perfectly good response, or Google.com may be busy or delayed while other users in your cell accessing a different server again get a perfectly good response.
It would likely be useful for you to look at some of the generally available web speedtest applications to see the type of information they provide - they have the advantage of being able to gather results from many thousands of users, and also generally have access to the servers to understand any issues on that side.
Depending on what you are trying to achieve it might be that a combination of measurements from one of the general speedtest services, combined with your own measurements will give you enough data to draw some sort of meaningful conclusions.
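If you still want a single number to track from your own handset, one simple (and admittedly crude) sketch is to normalise each metric against a baseline measured on a known-good connection and smooth the result with an EWMA; the baselines and weights below are placeholders you would have to calibrate yourself, and signal strength could be folded in the same way:

```python
# Crude sketch, not a validated model: each metric is compared against a
# baseline from a known-good connection, the deviations are combined into a
# weighted score, and the score is smoothed with an exponentially weighted
# moving average (EWMA) so single noisy samples do not dominate.
BASELINE = {"bandwidth_mbps": 20.0, "rtt_ms": 60.0, "loss_pct": 0.5, "jitter_ms": 10.0}
WEIGHT   = {"bandwidth_mbps": 0.4,  "rtt_ms": 0.3,  "loss_pct": 0.2, "jitter_ms": 0.1}
ALPHA = 0.3  # EWMA smoothing factor

def sample_score(sample):
    """0 = at or better than baseline; larger = worse than baseline."""
    score = 0.0
    for metric, weight in WEIGHT.items():
        value, base = sample[metric], BASELINE[metric]
        if metric == "bandwidth_mbps":
            ratio = base / max(value, 1e-6)   # lower bandwidth is worse
        else:
            ratio = value / base              # higher RTT/loss/jitter is worse
        score += weight * max(ratio - 1.0, 0.0)
    return score

smoothed = 0.0
for sample in [
    {"bandwidth_mbps": 18, "rtt_ms": 70,  "loss_pct": 0.4, "jitter_ms": 12},
    {"bandwidth_mbps": 5,  "rtt_ms": 250, "loss_pct": 3.0, "jitter_ms": 40},
]:
    smoothed = ALPHA * sample_score(sample) + (1 - ALPHA) * smoothed
    print(f"congestion score: {smoothed:.2f}")
```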

How to improve the speed of MPI_Scatter/MPI_Gather?

I found that the time used by MPI_Scatter/MPI_Gather increases continuously (roughly linearly) as the number of workers increases, especially when the workers are spread across different nodes.
I thought MPI_Scatter/MPI_Gather were parallel operations, and I wonder what leads to this increase. Is there any trick to make them faster, especially for workers distributed across compute nodes?
The root rank has to push a fixed amount of data to the other ranks. As long as all ranks reside on the same compute node, the process is limited by the memory bandwidth available. Once more nodes become involved, the network bandwidth, usually much lower than the memory bandwidth, becomes the limiting factor.
Also, the time to send a message is roughly divided into two parts: the initial latency (network setup and MPI protocol handshake), and the time it takes to physically transfer the actual data bits. As the amount of data is fixed, the total physical transfer time remains the same (as long as the transport type, and therefore the bandwidth, stays the same), but more setup/latency overhead is added with each new rank that data is scattered to or gathered from, hence the linear increase in the time it takes to complete the operation.
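As a rough illustration of that linear behaviour, here is a toy cost model for a flat (linear) scatter; the latency and bandwidth figures are assumptions rather than measurements, and real implementations may use tree-based algorithms instead:

```python
# Toy cost model for a flat scatter: the root sends one chunk per remote
# rank, so per-message setup latency accumulates with the rank count while
# the total payload transfer time stays constant.
LATENCY_S = 2e-6          # per-message setup / protocol handshake time (assumed)
BANDWIDTH_BPS = 12.5e9    # ~100 Gb/s interconnect, expressed in bytes/s (assumed)
TOTAL_BYTES = 1 << 30     # 1 GiB scattered from the root in total

def scatter_time(num_ranks):
    messages = num_ranks - 1  # one chunk per non-root rank
    return messages * LATENCY_S + TOTAL_BYTES / BANDWIDTH_BPS

for n in (2, 16, 128, 1024):
    print(f"{n:5d} ranks: ~{scatter_time(n) * 1e3:.2f} ms")
```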
How MPI_Scatter/MPI_Gather works varies between implementations. Some MPI implementations may choose to use a series of MPI_Send calls as the underlying mechanism.
The parameters that may affect how MPI_Scatter works are:
1. Number of processes
2. Size of data
3. Interconnect
For example, an implementation may avoid using a broadcast when a very small number of ranks are sending/receiving very large amounts of data.

Modeling communication costs in MPI

Does anyone know of any papers that discuss communication costs in MPI programs? I am trying to predict the time taken by (say) the communication step in two-phase I/O. That would depend on the number of processes, the size and number of messages sent/received, the network interconnect and architecture, etc. It would be helpful for us to come up with a formula to assess the time taken by communication alone. I have read some papers, but none of them handle the case where multiple processes are communicating at the same time.
The most critical elements in any time estimate will be the total data to be sent, and the speed of the interconnect. That should give you an effective "minimum" time for the message transfers.
After that, you can measure the actual time taken and use it to determine a rough efficiency rating for the MPI implementation. As the amount of data scales up, the time required will scale up by roughly the same factor. This is a very rough way to get an estimate. Keep in mind that as the data size crosses certain interesting thresholds (e.g. page size, cache size, and so on), the scale factor will likely need to be revised.
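As a sketch of that approach with placeholder numbers: take the theoretical minimum from the interconnect bandwidth, calibrate an efficiency factor from one measured run, and extrapolate from there, re-calibrating whenever the data size crosses one of those thresholds:

```python
# Sketch with placeholder numbers: theoretical minimum = data / bandwidth,
# efficiency = minimum / measured, prediction = minimum / efficiency.
LINK_BANDWIDTH = 12.5e9  # bytes/s, ~100 Gb/s interconnect (assumed)

def minimum_time(total_bytes):
    return total_bytes / LINK_BANDWIDTH

# Hypothetical calibration run: 256 MiB of communication took 30 ms.
measured_bytes, measured_time = 256 * 2**20, 0.030
efficiency = minimum_time(measured_bytes) / measured_time   # 0 < efficiency <= 1

def predicted_time(total_bytes):
    return minimum_time(total_bytes) / efficiency

print(f"efficiency ~ {efficiency:.2f}")
print(f"predicted time for 1 GiB: {predicted_time(2**30) * 1e3:.0f} ms")
```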

What is the optimal number of nodes in a BitTorrent swarm?

What is the optimal number of nodes in a BitTorrent swarm? I think that there is a mathematical way to express the most efficient number of nodes. To be honest I have a problem with just having an empirical number of X, without some rigor to back it up.
According to this specification the number is 30.
"Implementer's Note: Even 30 peers
is plenty, the official client
version 3 in fact only actively forms
new connections if it has less than 30
peers and will refuse connections if
it has 55. This value is important
to performance. When a new piece has
completed download, HAVE messages (see
below) will need to be sent to most
active peers. As a result the cost of
broadcast traffic grows in direct
proportion to the number of peers.
Above 25, new peers are highly
unlikely to increase download speed.
UI designers are strongly advised to
make this obscure and hard to change
as it is very rare to be useful to do
so."
The overhead this quote is referring to is the HAVE messages.
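To put a rough number on that overhead: a HAVE message is 9 bytes on the wire (4-byte length prefix, 1-byte message id, 4-byte piece index), and one is sent to every connected peer for each completed piece, so for a hypothetical torrent the broadcast cost scales like this:

```python
# Back-of-the-envelope estimate of the HAVE broadcast overhead: one 9-byte
# HAVE message per connected peer for every completed piece.
HAVE_BYTES = 9

def have_overhead_bytes(num_pieces, num_peers):
    return num_pieces * num_peers * HAVE_BYTES

# Hypothetical torrent: 1 GiB payload in 1 MiB pieces -> 1024 pieces.
for peers in (25, 30, 55, 200):
    kib = have_overhead_bytes(1024, peers) / 1024
    print(f"{peers:3d} peers: ~{kib:.0f} KiB of HAVE traffic for the whole torrent")
```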
It's not entirely clear what you mean by the number of nodes in a swarm. It sounds like you're referring to the total number of participants in a swarm, but your quote is referring to the number of peers you should connect to. Let's assume the question is the latter.
You also did not specify what performance metric to use. What does efficient mean to you?
If optimal means the lowest number of overhead bytes per payload byte, you want 1 connection (or maybe 0 connections).
Let's assume that you want to maximize your download rate. The answer to this question (how many peers should I connect to to maximize my download rate) is:
The lowest number of peers that will saturate your down-link.
Now, what does this mean? Well, it depends on the swarm, and what capacity other peers have, and it depends on how many distributed copies there are in the swarm.
The other question that also needs to be resolved is: how many peers should you upload to? The answer here is:
The largest number of peers you can divide your upload capacity among so that they all still reciprocate, or the smallest number that will saturate your down-link.
Note that the division does not need to be even; see the BitTyrant paper for details.
Now, you need at least that many connections to unchoke.
The trick in getting good download speed mostly comes down to sending fast enough to peers so that they reciprocate, but preferably not any faster than that. If there's spare upload capacity, it should be used to make another peer reciprocate. Being connected to many peers means that you can find good trading partners a bit faster, and you will be less affected by high churn in swarms.
If you are referring to the optimal number of nodes in the swarm, it is probably somewhere around infinity, since every leecher is best matched with one seeder that has the same upload speed as the leecher's download speed.
If you are referring to the optimal number of nodes to connect to as a leecher, this number cannot be found (or is extremely hard to find), because it depends on too many variables. Variables to consider:
number of nodes in the swarm ( 1 - 1000 )
seed/leech ratio of each node ( 0 - 10000% )
latency of each node ( 1 ms - 1 s )
max active connections of each node ( 0 - 1000 )
max upload speed of each node ( 1 kB/s - 1000 MB/s )
max download speed of each node ( 1 kB/s - 1000 MB/s )
torrent size ( 1 KB - 1 TB )
tracker intelligence (hard to quantify)
torrent piece size ( 1 KB - 4 MB )
number of torrent pieces ( 1 - 10000 )
So per node there are at least a million possible configurations, and every other node also has these options. So there are 1,000,000^1000 possible configurations for a swarm with 1000 nodes.
When there are a lot of low speed nodes, you will probably want to connect to a lot of nodes.
When there are a lot of high speed nodes, you will probably want to connect to just 1 or 2 nodes.
