Measure the latency in milliseconds between two public DNS servers - networking

Short Introduction: I'm trying to create a distance matrix of the latency between DNS servers in order to predict p2p latencies using matrix factorization. To use the prediction algorithm, I need about 20 DNS servers and the latencies between them.
Servers  | DNS1 | DNS2 | DNS3 | ... | Client 1
---------+------+------+------+-----+---------
DNS1     |  0   |  ?   |  ?   | ... |  ping
DNS2     |  ?   |  0   |  ?   | ... |  ping
DNS3     |  ?   |  ?   |  0   | ... |  ping
...      | ...  | ...  | ...  |  0  |  ping
Client 1 | ping | ping | ping | ... |  0
Knowing the distances between the DNS servers, I can add a client by pinging all DNS servers and entering the distances. Using this distance matrix, I can now use matrix factorization to predict the distance between two clients.
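To make the factorization step concrete, here is a minimal Python/NumPy sketch (not the exact algorithm from the paper linked below): it fits a small rank-k model to a partially observed RTT matrix with plain SGD and then reads off a prediction for an unmeasured pair. The 5x5 matrix, the rank, the learning rate, and the regularization are made-up example values.

```python
import numpy as np

# Made-up 5x5 example: RTT in ms between five servers, NaN = not measured yet.
D = np.array([
    [0.0,    20.0,   35.0,   np.nan, 50.0],
    [20.0,   0.0,    np.nan, 40.0,   60.0],
    [35.0,   np.nan, 0.0,    25.0,   np.nan],
    [np.nan, 40.0,   25.0,   0.0,    30.0],
    [50.0,   60.0,   np.nan, 30.0,   0.0],
])

rng = np.random.default_rng(0)
n, k = D.shape[0], 2               # rank-2 factorization, D ~ X @ Y.T
X, Y = rng.random((n, k)), rng.random((n, k))
observed = [(i, j) for i in range(n) for j in range(n) if not np.isnan(D[i, j])]

lr, reg = 1e-3, 0.1                # learning rate and regularization (arbitrary)
for _ in range(2000):              # plain SGD over the observed entries only
    for i, j in observed:
        err = D[i, j] - X[i] @ Y[j]
        xi = X[i].copy()
        X[i] += lr * (err * Y[j] - reg * X[i])
        Y[j] += lr * (err * xi - reg * Y[j])

pred = X @ Y.T                     # predictions for every pair, measured or not
print("predicted latency for the unmeasured pair (0, 3):", round(pred[0, 3], 1), "ms")
```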
The Problem: Without access to the DNS servers, I don't quite know how to get the latencies between them. Can I use a traceroute or recursive lookup?
I'm considering hosting about 20 pingable servers myself in order to get one initial matrix. But this costs a lot of money and is kind of a waste of resources.
Maybe someone has an idea how to get these distances from a set of servers. (They don't necessarily have to be DNS servers)
This is a project that has already collected a huge distance matrix but is super old: https://pdos.csail.mit.edu/archive/p2psim/
This is the paper for the algorithm: https://dl.acm.org/doi/pdf/10.1145/1028788.1028827

Alright, I might have found a great solution.
https://www.azurespeed.com/Azure/RegionToRegionLatency
Microsoft offers a distance matrix with round-trip times between their regions. The numbers are not perfectly up to date, but they should be enough.
There is also a method to measure these yourself: https://learn.microsoft.com/en-us/azure/network-watcher/view-relative-latencies

Related

Use a DHT for a gossip protocol?

I've been digging into DHTs, and especially Kademlia, for some time now.
I'm trying to implement a p2p network built on a Kademlia DHT. I want to be able to gossip a message to the whole network.
From my research, gossip protocols are used for that, but it seems odd to add a completely new protocol to spread messages when I already use the DHT to store peers.
Is there a gossip protocol that works over or with a DHT topology like Kademlia?
How concerned are you about efficiency? As a lower bound someone has to send a packet to all N nodes in the network to propagate an update to all nodes.
The most naive approach is to simply forward every message to all entries in your routing table. This will not do since it obviously leads to forwarding storms.
The second most naive approach is to forward only updates, i.e. data newer than what you have already seen. This results in roughly N * log(N) traffic.
If all your nodes are trusted and you don't care about the last quantum of efficiency you can already stop here.
If nodes are not trusted you will need a mechanism to limit who can send updates and to verify packets.
If you also care about efficiency, you can add a randomized backoff before forwarding and track which routing-table entry already has which version, to prune unnecessary forwarding attempts.
If you don't want to gossip with the whole network but only a subset thereof you can implement subnetworks which interested nodes can join, i.e. subscribe to. Bittorrent Enhancement Proposal 50 describes such an approach.
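A rough Python sketch of those forwarding rules (forward only newer versions, remember what each neighbour already has, randomize the order as a crude stand-in for the backoff). The routing table's neighbours() method and the transport's send() method are assumed interfaces of your DHT implementation, not real library calls; this is an illustration, not a hardened implementation.

```python
import random

class GossipNode:
    """Gossip on top of an existing Kademlia routing table (sketch)."""

    def __init__(self, node_id, transport, routing_table):
        self.node_id = node_id
        self.transport = transport          # assumed: has send(peer, msg)
        self.routing_table = routing_table  # assumed: has neighbours() -> list of peers
        self.latest = {}     # key -> newest version seen locally
        self.peer_has = {}   # (peer, key) -> newest version that peer is known to hold

    def on_message(self, key, version, payload, sender):
        # Remember what the sender already has so we never echo it back.
        self.peer_has[(sender, key)] = max(version, self.peer_has.get((sender, key), -1))
        if version <= self.latest.get(key, -1):
            return           # old news: this check is what prevents forwarding storms
        self.latest[key] = version
        self.forward(key, version, payload, exclude=sender)

    def forward(self, key, version, payload, exclude=None):
        # Only forward to routing-table entries not known to have this version yet.
        peers = [p for p in self.routing_table.neighbours()
                 if p != exclude and self.peer_has.get((p, key), -1) < version]
        random.shuffle(peers)   # randomized order as a crude stand-in for backoff
        for peer in peers:
            self.transport.send(peer, (key, version, payload, self.node_id))
            self.peer_has[(peer, key)] = version

# Tiny in-memory demo with three fully connected nodes.
class _Net:
    def __init__(self): self.nodes = {}
    def send(self, peer, msg): self.nodes[peer].on_message(*msg)

class _Table:
    def __init__(self, net, me): self.net, self.me = net, me
    def neighbours(self): return [p for p in self.net.nodes if p != self.me]

net = _Net()
for nid in ("A", "B", "C"):
    net.nodes[nid] = GossipNode(nid, net, _Table(net, nid))
net.nodes["A"].latest["greeting"] = 1
net.nodes["A"].forward("greeting", 1, "hello")
print({nid: node.latest for nid, node in net.nodes.items()})
```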

Estimating file transfer time over network?

I am transferring a file from one server to another. To estimate the time it would take to transfer several GB of data over the network, I ping that IP and take the average time.
For example: I ping 172.26.26.36 and get an average round-trip time of x ms. Since ping sends 32 bytes of data each time, I estimate the network speed to be 2*32*8 (bits) / x = y Mbps (the factor of 2 is because x is a round-trip time).
So transferring 5 GB of data will take 5000/y seconds.
Is my method of estimating the time correct?
If you find any mistake, or know a better method, please share.
It could also depend on the protocol. Ping is ICMP and FTP uses TCP; the delays need not be the same for both protocols. TCP tries to adjust to the network during congestion, and this means longer delays. Just send 100 MB or 500 MB files using FTP, collect the stats, and do the estimates (one way). Or, there is a tool called iperf/jperf that can pump TCP traffic of your interest and show bandwidth and time stats. Possibly you can try that.
No. Your method of estimating bandwidth is entirely incorrect. Ping can only tell you about delay. You have to send something large enough to saturate the network to get the bandwidth.
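Building on the answers above, here is a minimal Python sketch of the "transfer something large and time it" approach. HOST and PORT are placeholders, and something on the other end has to accept the connection and drain the data (for example a netcat listener redirected to /dev/null).

```python
import socket
import time

HOST = "172.26.26.36"          # hypothetical destination from the question
PORT = 5001                    # hypothetical listening port
CHUNK = b"\0" * (1 << 20)      # 1 MiB per send
CHUNKS = 100                   # 100 MiB in total

with socket.create_connection((HOST, PORT)) as sock:
    start = time.monotonic()
    for _ in range(CHUNKS):
        sock.sendall(CHUNK)
    sock.shutdown(socket.SHUT_WR)   # no more data; the peer will see EOF
    sock.recv(1)                    # wait until the peer has drained the data and closed
    elapsed = time.monotonic() - start

megabits = CHUNKS * len(CHUNK) * 8 / 1e6
print(f"sent {megabits:.0f} Mbit in {elapsed:.2f} s ~ {megabits / elapsed:.1f} Mbit/s")
```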

GCE Instances Network connection speeds on internal network

Coming from a background of vSphere VMs with vNICs defined at creation time: do GCE instances' internal and public IP network connections use a particular virtualised NIC, and if so, what speed is it: 100 Mbit/s, 1 Gbit/s, or 10 Gbit/s?
I'm not so much interested in the bandwidth in from the public internet, but rather in what kind of connection is possible between instances, given that networks can span regions.
Is it right to think of a GCE project network as a logical 100 Mbit/s, 1 Gbit/s, or 10 Gbit/s network spanning the Atlantic that I plug my instances into, or should there be no minimum expectation because too many variables exist: noisy neighbours, inter-region bandwidth, not to mention physical distance?
The virtual network adapter advertised in GCE conforms to the virtio-net specification (specifically virtio-net 0.9.5 with multiqueue). Within the same zone we offer up to 2Gbps/core of network throughput. The NIC itself does not advertise a specific speed. Performance between zones and between regions is subject to capacity limits and quality-of-service within Google's WAN.
The performance relevant features advertised by our virtual NIC as of December 2015 are support for:
IPv4 TCP Transport Segmentation Offload
IPv4 TCP Large Receive Offload
IPv4 TCP/UDP Tx checksum calculation offload
IPv4 TCP/UDP Rx checksum verification offload
Event based queue signaling/interrupt suppression.
In our testing, for best performance it is advantageous to enable all of these features. Images supplied by Google will take advantage of all the features available in the shipping kernel (that is, some images ship with older kernels for stability and may not be able to take advantage of all of these features).
I can see up to 1 Gb/s between instances within the same zone, but AFAIK that is not something which is guaranteed, especially for transatlantic communication. Things might change in the future, so I'd suggest following official product announcements.
There have been a few enhancements in the years since the original question and answers were posted. In particular, the "2Gbps/core" (really, per vCPU) is still there but there is now a minimum cap of 10 Gbps for VMs with two or more vCPUs. The maximum cap is currently 32 Gbps, with 50 Gbps and 100 Gbps caps in the works.
The per-VM egress caps remain "guaranteed not to exceed" not "guaranteed to achieve."
In terms of achieving peak trans-Atlantic performance, one suggestion would be the same as for any high-latency path: ensure that your sources and destinations are tuned to allow a sufficient TCP window to achieve the throughput you desire. In particular, this formula is in effect:
Throughput <= WindowSize / RoundTripTime
Of course that too is a "guaranteed not to exceed" rather than a "guaranteed to achieve" thing. As was stated before "Performance between zones and between regions is subject to capacity limits and quality-of-service within Google's WAN."
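As a worked example of that formula (with assumed numbers, not measured GCE figures): for a roughly 90 ms trans-Atlantic RTT and a 10 Gbit/s target, the TCP window needs to be on the order of 100 MB.

```python
# Illustration of Throughput <= WindowSize / RoundTripTime with assumed numbers.
rtt_s = 0.090                 # assumed ~90 ms trans-Atlantic round-trip time
target_bps = 10e9             # assumed target throughput: 10 Gbit/s
window_bytes = target_bps / 8 * rtt_s
print(f"required TCP window ~ {window_bytes / 1e6:.0f} MB")   # about 112 MB
```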

Why is latency smaller when running website from localhost?

When I “ping” a hostname from my remote website, the latency is larger than when I ping the same hostname from localhost (or using the ping command). I use the same machine for both tests. Why is this happening?
Because the network is zero length instead of possibly 12,500 miles, or much more in the case of satellite links, and consists of zero routers instead of possibly dozens.

Determine asymmetric latencies in a network

Imagine you have many clustered servers, across many hosts, in a heterogeneous network environment, such that the connections between servers may have wildly varying latencies and bandwidth. You want to build a map of the connections between servers by transferring data between them.
Of course, this map may become stale over time as the network topology changes - but lets ignore those complexities for now and assume the network is relatively static.
Given the latencies between nodes in this host graph, calculating the bandwidth is a relatively simple timing exercise. I'm having more difficulty with the latencies, however. To get the round-trip time, it is a simple matter of timing a return-trip ping from the local host to a remote host: both timing events (start, stop) occur on the local host.
What if I want one-way times, under the assumption that the latency is not equal in both directions? Assuming that the clocks on the various hosts are not precisely synchronized (or at least that their error is of the same magnitude as the latencies involved), how can I calculate the one-way latency?
A related question: is this asymmetric latency (where a link is quicker in one direction than the other) common in practice? For what reasons/hardware configurations? I'm certainly aware of asymmetric bandwidth scenarios, especially on last-mile consumer links such as DSL and cable, but I'm not so sure about latency.
Added: After considering the comment below, the second portion of the question is probably better off on serverfault.
To the best of my knowledge, asymmetric latencies -- especially "last mile" asymmetries -- cannot be automatically determined, because any network time synchronization protocol is equally affected by the same asymmetry, so you don't have a point of reference from which to evaluate the asymmetry.
If each endpoint had, for example, its own GPS clock, then you'd have a reference point to work from.
In "Fast Measurement of LogP Parameters for Message Passing Platforms", the authors note that latency measurement requires clock synchronization external to the system being measured. (Boldface emphasis mine, italics in original text.)
Asymmetric latency can only be measured by sending a message with a timestamp ts, and letting the receiver derive the latency from tr - ts, where tr is the receive time. This requires clock synchronization between sender and receiver. Without external clock synchronization (like using GPS receivers or specialized software like the network time protocol, NTP), clocks can only be synchronized up to a granularity of the roundtrip time between two hosts [10], which is useless for measuring network latency.
No network-based algorithm (such as NTP) will eliminate last-mile link issues, though, since every input to the algorithm will itself be uniformly subject to the performance characteristics of the last-mile link and is therefore not "external" in the sense given above. (I'm confident it's possible to construct a proof, but I don't have time to construct one right now.)
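For concreteness, a minimal Python sketch of the tr - ts idea from the quoted passage. It only produces a meaningful one-way delay if the sender's and receiver's clocks are already synchronized by some external means (which is exactly the hard part); otherwise the clock offset is folded into the result. The port number is arbitrary.

```python
import socket
import struct
import time

PORT = 9999   # arbitrary example port

def sender(host):
    """Send the current wall-clock time ts as a UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        ts = time.time()
        s.sendto(struct.pack("!d", ts), (host, PORT))

def receiver():
    """Receive one datagram and print tr - ts as the one-way delay."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.bind(("", PORT))
        data, _ = s.recvfrom(64)
        tr = time.time()
        (ts,) = struct.unpack("!d", data)
        # Only meaningful if both clocks are externally synchronized.
        print(f"one-way delay ~ {(tr - ts) * 1e3:.3f} ms (plus any clock offset)")
```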
There is a project called One-Way Ping (OWAMP) specifically to solve this issue. Activity can be seen in the LKML for adding high resolution timestamps to incoming packets (SO_TIMESTAMP, SO_TIMESTAMPNS, etc) to assist in the calculation of this statistic.
http://www.internet2.edu/performance/owamp/
There's even a Java version:
http://www.av.it.pt/jowamp/
Note that packet timestamping really needs hardware support, and many present-generation NICs only offer millisecond resolution, which may be out of sync with the host clock. There are MSDN articles in the DDK about synchronizing host and NIC clocks that demonstrate the potential problems. Nanosecond timestamps from the TSC are problematic due to differences between cores and may require the Nehalem architecture to work properly at the required resolutions.
http://msdn.microsoft.com/en-us/library/ff552492(v=VS.85).aspx
You can measure asymmetric latency on a link by sending different-sized packets to a port that returns a fixed-size packet, e.g. send UDP packets to a port that replies with an ICMP error message. The ICMP error message is always the same size, but you can adjust the size of the UDP packet you're sending.
see http://www.cs.columbia.edu/techreports/cucs-009-99.pdf
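A hedged Python sketch of that probing trick: HOST is a placeholder (a TEST-NET address here), the port is a traceroute-style, likely-closed UDP port, and many networks filter the ICMP replies entirely, in which case the probe simply times out. On Linux, a connected UDP socket surfaces the ICMP port-unreachable as ConnectionRefusedError on the next recv().

```python
import socket
import time

HOST = "192.0.2.1"    # placeholder target (TEST-NET-1); replace with a real host
PORT = 33434          # traceroute-style high UDP port, hopefully closed on the target

def probe(size, timeout=2.0):
    """Time one UDP probe of `size` bytes until the fixed-size ICMP error returns."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((HOST, PORT))   # connected socket so the ICMP error is reported to us
        s.settimeout(timeout)
        start = time.monotonic()
        s.send(b"\0" * size)
        try:
            s.recv(1)                          # no data is ever expected back
        except ConnectionRefusedError:
            return time.monotonic() - start    # ICMP port-unreachable arrived
        except socket.timeout:
            return None                        # reply filtered or host silent

for size in (64, 512, 1400):
    rtt = probe(size)
    print(size, "bytes:", "no ICMP reply" if rtt is None else f"{rtt * 1e3:.2f} ms")
```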
In the absence of a synchronized clock, the asymmetry cannot be measured, as proven in the 2011 paper "Fundamental limits on synchronizing clocks over networks".
https://www.researchgate.net/publication/224183858_Fundamental_Limits_on_Synchronizing_Clocks_Over_Networks
The sping tool is a new development in this space, which uses clock synchronization against nearby NTP servers, or an even more accurate source in the form of a GNSS box, to estimate asymmetric latencies.
The approach is covered in more detail in this blog post.
