Find out latency in a reliable way - networking

Background: I am developing a small game and use the player's latency to do lag compensation. The game is open source, so at the moment it is very easy to reverse engineer the system and delay one's responses to artificially inflate one's reported latency, resulting in possibly unfair advantages.
My current strategy for latency retrieval is:
Every fixed interval I send a message labeled as "ping" to a player. (This has nothing to do with ICMP.)
This ping message consists of a special "ping" opcode and a payload with a sequence number.
Once the client receives that message, it sends back one with a "pong" opcode and a payload containing the same sequence number.
When the server receives the message labeled as "pong", it calculates how much time passed between sending and receiving. This is the round-trip time (RTT).
Our latency is RTT / 2.
In pseudocode:
Server:
    function now() {
        return current UTC time in millis
    }

    // next sequence number used to label outgoing pings
    i = 0
    function nextSequence() {
        return i++
    }

    // maps sequence number -> time the ping was sent
    sendingTimestamps = []

    function onPingEvent() {
        id = nextSequence()
        sendingTimestamps[id] = now()
        sendPingMessage(id)
    }

    function onPongReceived(id) {
        received = now()
        sent = sendingTimestamps[id]
        rtt = received - sent    // round-trip time
        latency = rtt / 2        // assumes a symmetric path
    }

Client:
    function onPingReceived(id) {
        // echo the sequence number straight back
        sendPongMessage(id)
    }
As you can see, it's very easy for the client to just add a delay in his code to inflate his reported latency.
Is there a better way to get a client's latency, in order to leave them less room for cheating?

The answer below is a summary of the topics discussed in the comments, gathered in one place.
Lag compensation should rely on a precise timestamp of the event rather than on average packet delay
Transit time may vary drastically even for two successive packets. The suggested approach of measuring average latency and assuming, for lag compensation, that every received packet was sent "latency" ms ago is far too inaccurate. The following scheme should be applied instead:
The server starts simulating the world on its side and sends a START command to all clients. Each client starts simulating the world as well and counts ticks from its creation. Whenever an event occurs on the client side, the client sends it to the server with a timestamp, e.g. "user pressed fire at tick #183". The server's simulation is further ahead because of packet transit time, but the server can "go back in time" to handle the user's order and resolve the consequences.
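A minimal sketch of that rewind step, assuming a hypothetical snapshot type and tick rate; names like `recordSnapshot` and `rewindTo` are illustrative, not from the original discussion:

```typescript
// Minimal sketch of the server-side rewind step, assuming a hypothetical
// WorldState snapshot type and a 64-tick history window.
interface PlayerState { x: number; y: number; }
interface WorldState { tick: number; players: Record<string, PlayerState>; }

const HISTORY_TICKS = 64;              // keep ~1 second of snapshots at 64 Hz
const history: WorldState[] = [];      // oldest snapshot first

function recordSnapshot(state: WorldState): void {
  history.push(JSON.parse(JSON.stringify(state)));  // deep-copy the snapshot
  if (history.length > HISTORY_TICKS) history.shift();
}

// "user pressed fire at tick #183": look up where everyone was at that tick,
// resolve the shot against those historical positions, then apply the result
// to the present state.
function rewindTo(clientTick: number): WorldState | undefined {
  return history.find(s => s.tick === clientTick);  // undefined if too old
}
```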
Timestamps and events can still be faked
As far as I understand, the problem of verifying client input is generally unsolvable. Any algorithm implemented in the client can be recreated to fake events/timestamps/packets. Closed source can be reverse engineered, so it is not an answer either. Even widely played games like Counter-Strike or Overwatch have cheaters, even though they are developed by large companies which, I'd bet, have departments focused solely on game security. Some companies ship antivirus-like modules that check game file integrity or hash parts of a RAM snapshot, but even these can be bypassed.
The real question is the amount of effort required to fake the algorithm: the more effort needed, the fewer cheaters there will be. Trivial timestamp verification looks like the following (see the sketch after this list):
If you receive event #2 in the TCP stream after event #1, but its timestamp is before event #1's, then it's faked.
If the timestamp is far behind the server's time, warn and kick the player for an enormously bad delay. If it's a real player, the game is unplayable for them anyway; otherwise you've kicked a cheater. CS servers do this, if I'm not mistaken.
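A rough sketch of those two checks, assuming tick-based timestamps; the MAX_LAG_TICKS threshold and the event shape are made up for illustration:

```typescript
// Rough sketch of the two sanity checks above; values are assumptions.
interface ClientEvent { seq: number; tick: number; }

const MAX_LAG_TICKS = 128;     // ~2 seconds at 64 ticks/s (assumption)
let lastAcceptedTick = -Infinity;

function validateEvent(ev: ClientEvent, serverTick: number): boolean {
  // Check 1: on an ordered TCP stream, a later event must not carry an
  // earlier timestamp than one we already accepted.
  if (ev.tick < lastAcceptedTick) return false;          // faked or corrupted
  // Check 2: a timestamp far behind the server's clock means either an
  // unplayably bad connection or an inflated delay - kick in both cases.
  if (serverTick - ev.tick > MAX_LAG_TICKS) return false;
  lastAcceptedTick = ev.tick;
  return true;
}
```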

Related

Dealing with network packet loss in realtime games - TCP and UDP

Reading a lot on this for my first network game, I understand the core difference between guaranteed delivery and time-to-deliver for TCP vs. UDP. I've also read diametrically opposed views on whether realtime games should use UDP or TCP! ;)
What no one has covered well is how to handle a dropped packet.
TCP: I read an article about using TCP for an FPS that recommended only using TCP. How would an authoritative server taking client input over TCP handle a packet drop and the sudden epic spike in lag? Does the game just stop for a moment and then pick up where it left off? Is TCP packet loss so rare that it's not really much of an issue, and does an FPS over TCP actually work well?
UDP: Another article suggested only ever using UDP. Clearly one-shot UDP events like "grenade thrown" aren't reliable enough, as some of the time they won't arrive. Do you have to implement a message-received/resend protocol manually? Or is there some other solution?
My game is a tick-based authoritative server with 1/10th second updates from the server to clients and local simulation to keep things seeming more responsive, although the question is applicable to a lot more applications.
I did a real-time TV editing system. All real-time communication was via UDP, while non-real-time traffic used TCP, as it is simpler. Over UDP we would send a state packet every frame, e.g. "start video in 100 frames", 99, 98, … 3, 2, 1, 0, -1, -2, -3, so even if no message got through until -3, the receiver would start on the 4th frame (just skipping the first 3), hoping that no one would notice and knowing that this was better than lagging from there on in. We even started the countdown from around +¼ second out (as no one will notice); this way hardly any frames were dropped.
So in summary, we sent the same status packet every frame. It contained all real-time data about past, current, and future events.
The trick is keeping this data set small. So instead of sending a "play button pressed" event (there is an unbounded number of these), we send the video-id, frame-number, start-mask and stop-mask. (The start/stop masks are frame numbers: if the start-mask is positive and the stop-mask is negative, then show the video, at frame frame-number.)
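A sketch of what such a per-frame state packet might look like; the field names and the fixed slot array are assumptions for illustration, not the original system's wire format:

```typescript
// Sketch of a per-frame state packet; field names are assumptions.
interface VideoSlot {
  videoId: number;      // unique id, so a reused slot is never confused with the old video
  frameNumber: number;  // which frame of that video to show right now
  startMask: number;    // frame number at which the video starts
  stopMask: number;     // frame number at which the video stops
}

interface StatePacket {
  frame: number;        // sender's current frame counter
  slots: VideoSlot[];   // one slot per video that can play concurrently
}

// Sent every frame over UDP; any single packet is enough to reconstruct the
// whole real-time state, so a lost packet simply doesn't matter.
function encode(packet: StatePacket): string {
  return JSON.stringify(packet);   // toy encoding for the sketch
}
```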
Now we need to be able to start a video during another one, or shortly after it stops. So we consider how many videos can be playing at the same time: we need a slot for each, but can we reuse a slot immediately? Until stop is pressed we do not know the stop-mask, so if we reuse the slot before then, will the video stop? Well, there will no longer be a slot for that video, so we should stop it. So yes, we can reuse the slot immediately, as long as we use unique IDs.
Other tips: do not send "+1" events; instead send the current total. If two players have to update the same total, then each should have their own total; sum all the totals at the point of use, but never edit someone else's total.
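A tiny sketch of that last tip; the score example and names are made up:

```typescript
// "Own total per player, sum at point of use" - illustrative names only.
const totalsByPlayer: Record<string, number> = {};

// Each player only ever overwrites its own entry, sent as an absolute value
// in every state packet rather than as a "+1" delta.
function onPlayerTotal(playerId: string, playerTotal: number): void {
  totalsByPlayer[playerId] = playerTotal;
}

// The combined total is computed only where it is needed.
function combinedTotal(): number {
  return Object.values(totalsByPlayer).reduce((a, b) => a + b, 0);
}
```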

Calculate time offset using HTTP header `date`

I have a program that needs to do something exactly every hour. The catch is that the time needs to be relative to the remote server, which is not synchronised with a time server and is, in fact, about 6 seconds ahead (!). There is no way for me to change that server.
All I have is access to the headers returned by a HEAD request to the web server, which include a handy Date field (that's how I found out about the discrepancy).
Question: regardless of the language (I use nodeJS, but that's not the point), what would you do to calculate a precise offset between my server and the remote server?
I am especially worried about network latency: I have the following variables:
Local server time
Time when request was sent
Time when the response with the Date header arrived
Remote server time
However, the remote server time was generated when the server received the request -- something that might have taken up to 1 second. And, the time when the response arrived needs to take into account the time it took to receive it...
Right now I am assuming the Date header was generated halfway through the round trip, i.e. I offset by (time the response arrived - time the request was sent) / 2. However, it feels lame.
Is there a better, established way to deal with this?
Hmm, I know this kind of problem, though I never had the limitation of not being able to change one of the two 'actors'. I would say this approximation - half the round-trip time - feels OK. If you care more about it, you could experiment with the approximation in a 'benchmark' kind of way:
don't make one synchronization request but make 10 in sequence, then eliminate the first 3 offsets and the last 3 offsets and average the remaining 4
or:
don't make one synchronization request but make a burst of 10 in 10 different threads; this should theoretically eliminate the client-side (local-side) time it takes to create the requests and should block (if it blocks at all) on the server side (the remote side in your case). But this would involve some math, and I think it's too much trouble for the value
P.S. the number 10 is arbitrary (and hopefully the remote server doesn't ban/block you for making too many requests :)
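A minimal sketch of one reading of the first suggestion in Node (assuming Node 18+ with a global fetch; the sample count and the "trim three from each end" step mirror the arbitrary numbers above):

```typescript
// Sketch of the "several samples, drop the extremes" idea, assuming Node 18+
// (global fetch). Note the Date header has 1-second resolution, so the result
// is at best accurate to about a second.
async function sampleOffset(url: string): Promise<number> {
  const sent = Date.now();
  const res = await fetch(url, { method: "HEAD" });
  const arrived = Date.now();
  const serverTime = new Date(res.headers.get("date")!).getTime();
  // Assume the Date header was generated halfway through the round trip.
  return serverTime - (sent + arrived) / 2;
}

async function estimateOffset(url: string, samples = 10): Promise<number> {
  const offsets: number[] = [];
  for (let i = 0; i < samples; i++) offsets.push(await sampleOffset(url));
  offsets.sort((a, b) => a - b);
  const trimmed = offsets.slice(3, -3);   // drop the 3 lowest and 3 highest
  return trimmed.reduce((a, b) => a + b, 0) / trimmed.length;
}
```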

What is the difference between the delay and the jitter in the context of real time applications?

According to Wikipedia, jitter is the undesired deviation from true periodicity of an assumed periodic signal; according to a paper on QoS that I am reading, jitter is referred to as delay variation. Is there a definition of jitter in the context of real-time applications? Are there applications that are sensitive to jitter but not sensitive to delay? If, for example, a streaming application uses some kind of buffer to store packets before showing them to the user, is it possible that this application is not sensitive to delay but is sensitive to jitter?
Delay: the amount of time data (a signal) takes to reach the destination. A higher delay generally means congestion of some sort or a breaking communication link.
Jitter: the variation in delay. This happens when a system is not in a deterministic state; e.g. video streaming suffers from jitter a lot because the amount of data transferred is quite large, and hence there is no way of saying how long it might take to transfer.
If your application is sensitive to jitter it is definitely sensitive to delay.
In the Real-time Transport Protocol (RTP, RFC 3550), the header contains a timestamp field. Its value usually comes from a monotonically incremented counter, and the frequency of the increments is the clock-rate. This clock-rate must be the same for every participant that wants to do anything with the timestamp field. The counters have different base offsets, because the start times may differ, or a random offset is included for security reasons, etc. All in all, we say the clocks are not synchronized.
To show it with an example, let snd_timestamp be the most recent packet's sender timestamp taken from the RTP header field, and rcv_timestamp the receiver timestamp generated by the receiver using the same clock-rate.
The wrong conclusion would be that
delay_in_timestamp_unit = rcv_timestamp - snd_timestamp
Since the receiver's and sender's counters have different base offsets (and they do), this does not give you the delay; it also doesn't account for the wrap-around of the 32-bit unsigned integer.
But monitoring packet delivery time is still necessary if we want a proper playout adaptation algorithm, or if we want to detect and avoid congestion.
Also note that even if we have synchronized clocks, delay_in_timestamp_unit may not exactly represent the pure network delay, because components at the sender and/or receiver side may hold packets after the timestamp is added and/or before it is examined. So if you calculate a 2-second delay between the participants but you know your network delay is around 100 ms, then your packets suffer additional delays at the sender and/or receiver side. That additional delay is (or at least you hope it is) roughly constant, so the only delay that changes over time is - hopefully - the network delay. So you should not say "if packet delay > 500 ms then we have congestion", because you have no idea what the actual network delay is if you use only one packet's sender and receiver timestamps.
But the difference between the delays of two consecutive packets might give you some information about whether something is wrong in the network or not.
diff_delay = delay_t0 - delay_t1
If diff_delay equals 0, the delay is the same; if it is greater than 0, the newly arrived packet needed more time than the previous one; and if it is smaller than 0, it needed less time.
And from that relative information, based on two consecutive delays, you can say something.
How do you determine the difference between two delays if the clocks are not synchronized?
Suppose you stored the previous packet's timestamps in rcv_timestamp_t1 and snd_timestamp_t1:
diff_delay = (rcv_timestamp_t0 - snd_timestamp_t0) - (rcv_timestamp_t1 - snd_timestamp_t1)
but that would be a problem without knowing the base offsets of the sender and the receiver, so reorder it:
diff_delay = (rcv_timestamp_t0 - rcv_timestamp_t1) - (snd_timestamp_t0 - snd_timestamp_t1)
Here you subtract rcv timestamps from each other, which eliminates the base offsets that rcv and snd contain, and then you subtract snd_diff from rcv_diff, which gives you the difference between the delays of two consecutive packets in units of the clock-rate.
Now, according to RFC3550 jitter is "An estimate of the statistical variance of the RTP data packet interarrival time".
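As a sketch, a running estimator in the spirit of RFC 3550 can be built directly from that diff_delay, keeping the wrap-around caveat from above in mind (the helper names here are made up):

```typescript
// Sketch of the RFC 3550 interarrival-jitter estimator built from the
// diff_delay above: D = (rcv_t0 - rcv_t1) - (snd_t0 - snd_t1), and the
// running estimate J += (|D| - J) / 16. Units are clock-rate ticks.
const U32 = 0x1_0000_0000;

// Difference of two 32-bit RTP timestamps, tolerant of wrap-around.
function tsDiff(a: number, b: number): number {
  let d = (a - b) % U32;
  if (d > U32 / 2) d -= U32;          // interpret as a small backwards step
  if (d < -U32 / 2) d += U32;
  return d;
}

let jitter = 0;
let prevSnd: number | null = null;
let prevRcv: number | null = null;

function onPacket(sndTimestamp: number, rcvTimestamp: number): void {
  if (prevSnd !== null && prevRcv !== null) {
    const d = tsDiff(rcvTimestamp, prevRcv) - tsDiff(sndTimestamp, prevSnd);
    jitter += (Math.abs(d) - jitter) / 16;   // smoothing factor from RFC 3550
  }
  prevSnd = sndTimestamp;
  prevRcv = rcvTimestamp;
}
```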
To finally get to the point, your question was:
"What is the difference between the delay and the jitter in the context of real time applications?"
A tiny note: "real-time applications" usually refers to systems processing data with deadlines in the range of nanoseconds, so I think you are referring to end-to-end streaming systems.
Also, despite the several altered definitions of jitter, they all use the difference between the delays of arriving packets and thus give you information about the relative changes of the network delay, whereas delay itself is an absolute measure of delivery time.

Defining the time it takes to do something (latency, throughput, bandwidth)

I understand latency - the time it takes for a message to go from sender to recipient - and bandwidth - the maximum amount of data that can be transferred over a given time - but I am struggling to find the right term to describe a related thing:
If a protocol is conversation-based - the payload is split up over many to-and-fros between the ends - then latency affects 'throughput'.¹
¹ What is this called, and is there a nice concise explanation of it?
Surfing the web while trying to optimize the performance of my NAS (nas4free), I came across a page that described the answer to this question (IMHO). Specifically, this section caught my eye:
"In data transmission, TCP sends a certain amount of data then pauses. To ensure proper delivery of data, it doesn’t send more until it receives an acknowledgement from the remote host that all data was received. This is called the “TCP Window.” Data travels at the speed of light, and typically, most hosts are fairly close together. This “windowing” happens so fast we don’t even notice it. But as the distance between two hosts increases, the speed of light remains constant. Thus, the further away the two hosts, the longer it takes for the sender to receive the acknowledgement from the remote host, reducing overall throughput. This effect is called “Bandwidth Delay Product,” or BDP."
This sounds like the answer to your question.
BDP as Wikipedia describes it
To conclude, it's called Bandwidth Delay Product (BDP) and the shortest explanation I've found is the one above. (Flexo has noted this in his comment too.)
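As a rough worked illustration of the windowing effect described in that quote (the numbers below are made up):

```typescript
// Rough illustration of window-limited TCP throughput: with a 64 KiB window
// and no scaling, throughput can never exceed window / RTT, no matter how
// fat the pipe is. Numbers are made-up examples.
const windowBytes = 64 * 1024;        // 64 KiB TCP window
const rttSeconds = 0.1;               // 100 ms round trip

// At most one window per round trip can be "in flight".
const maxThroughputBps = (windowBytes * 8) / rttSeconds;
console.log(`max ~${(maxThroughputBps / 1e6).toFixed(1)} Mbit/s`);  // ~5.2 Mbit/s

// Conversely, to fill a 100 Mbit/s link at 100 ms RTT you need a window of
// at least bandwidth x delay = the Bandwidth Delay Product:
const bdpBytes = (100e6 / 8) * rttSeconds;                           // 1.25 MB
console.log(`BDP = ${(bdpBytes / 1e6).toFixed(2)} MB`);
```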
Could goodput be the term you are looking for?
According to Wikipedia:
In computer networks, goodput is the application level throughput, i.e. the number of useful bits per unit of time forwarded by the network from a certain source address to a certain destination, excluding protocol overhead, and excluding retransmitted data packets.
Wikipedia Goodput link
The problem you describe arises in communications which are synchronous in nature. If there were no need to acknowledge receipt of information and it were certain to arrive, then the sender could send as fast as possible and the throughput would be good regardless of the latency.
When there is a requirement for things to be acknowledged, it is this synchronisation that causes the drop in throughput, and the degree to which the communication (i.e. the sending of acknowledgements) is allowed to be asynchronous controls how much it hurts the throughput.
'Round-trip time' links latency and number of turns.
Or: Network latency is a function of two things:
(i) round-trip time (the time it takes to complete a trip across the network); and
(ii) the number of times the application has to traverse it (aka turns).
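A back-of-the-envelope sketch of how turns and round-trip time combine (the numbers are invented):

```typescript
// Back-of-the-envelope sketch: a chatty protocol's total time is dominated by
// turns x RTT once the link is reasonably fast. Numbers are invented.
function conversationTime(turns: number, rttSec: number, bytes: number, bandwidthBps: number): number {
  return turns * rttSec + (bytes * 8) / bandwidthBps;
}

// 1 MB transferred in a single turn vs. split over 50 request/response turns
// on a 100 Mbit/s link with 50 ms RTT:
console.log(conversationTime(1, 0.05, 1e6, 100e6));   // ~0.13 s
console.log(conversationTime(50, 0.05, 1e6, 100e6));  // ~2.58 s
```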

Measuring time difference between networked devices

I'm adding networked multiplayer to a game I've made. When the server sends an update packet to the client, I include a timestamp so that the client knows exactly when that information is valid. However, the server computer and the client computer might have their clocks set to different times (maybe even just a few seconds difference), so the timestamp from the server needs to be translated to the client's local time.
So, I'd like to know the best way to calculate the time difference between the server and the client. Currently, the client pings the server for a time stamp during initialization, takes note of when the request was sent and when it was answered, and guesses that the time stamp was generated roughly halfway along the journey. The client also runs 10 of these trials and takes the average.
But, the problem is that I'm getting different results over repeated runs of the program. Within each set of 10, each measurement rarely diverges by more than 400 milliseconds, which might be acceptable. But if I wait a few minutes between each run of the program, the resulting averages might disagree by as much as 2 seconds, which is not acceptable.
Is there a better way to figure out the difference between the clocks of two networked devices? Or is there at least a way to tweak my algorithm to yield more accurate results?
Details that may or may not be relevant: The devices are iPod Touches communicating over Bluetooth. I'm measuring pings to be anywhere from 50-200 milliseconds. I can't ask the users to sync up their clocks. :)
Update: With the help of the answers below, I wrote an Objective-C class to handle this. I posted it on my blog: http://scooops.blogspot.com/2010/09/timesync-was-time-sink.html
I recently took a one-hour class on this and it wasn't long enough, but I'll try to boil it down to get you pointed in the right direction. Get ready for a little algebra.
Let s equal the time according to the server. Let c equal the time according to the client. Let d = s - c. d is what is added to the client's time to correct it to the server's time, and is what we need to solve for.
First, the server sends a packet to the client with a timestamp. When that packet is received at the client, the client stores the difference between its own clock and the given timestamp as t1.
The client then sends a packet to the server with its own timestamp. The server computes the difference between its own clock and that timestamp and sends it back to the client as t2.
Note that t1 and t2 both combine the "travel time" t of the packet with the clock difference d. Assuming for the moment that the travel time is the same in both directions, we now have two equations in two unknowns, which can be solved:
t1 = t - d
t2 = t + d
t1 + d = t2 - d
d = (t2 - t1)/2
The trick comes because the travel time is not always constant, as evidenced by your pings between 50 and 200 ms. It turns out to be most accurate to use the timestamps with the minimum ping time. That's because your ping time is the sum of the "bare metal" delay plus any delays spent waiting in router queues. Every once in a while, a lucky packet gets through without any queuing delays, so you use that minimum time as the most repeatable time.
Also keep in mind that clocks run at different rates. For example, I can reset my computer at home to the millisecond and a day later it will be 8 seconds slow. That means you have to continually readjust d. You can use the slope of various values of d computed over time to calculate your drift and compensate for it in between measurements, but that's beyond the scope of an answer here.
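A condensed sketch of the whole approach from the client's point of view (equivalent to the t1/t2 algebra above; the names and message shape are assumptions, and drift compensation is left out):

```typescript
// Condensed sketch of the offset estimation above: collect several samples,
// keep the one with the smallest round trip (the least-queued packet), and
// compute d from it. The message shape and names here are assumptions.
interface Sample { rtt: number; offset: number; }   // offset estimates d = s - c

// One exchange: the client notes when it sent the request and when the reply
// arrived, and the server replies with its own clock reading.
function makeSample(clientSend: number, serverClock: number, clientRecv: number): Sample {
  const rtt = clientRecv - clientSend;
  // Assume the server stamped its clock halfway through the round trip.
  const offset = serverClock - (clientSend + rtt / 2);
  return { rtt, offset };
}

// Prefer the lucky packet: the sample with the minimum RTT is the one least
// distorted by queuing delays, so its offset is the most trustworthy.
function bestOffset(samples: Sample[]): number {
  return samples.reduce((best, s) => (s.rtt < best.rtt ? s : best)).offset;
}
```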
Hope that helps point you in the right direction.
Your algorithm will not be much more accurate unless you can use some statistical methods. First of all, 10 is probably not sufficient. The first and simplest change would be to gather 100 transit time samples and toss out the x longest and shortest.
Another thing to add would be that both clients send their own timestamp in each packet. Then you can also calculate how different their clocks are and check the average difference between the clocks.
You can also look at SNTP and NTP implementations specifically, as these protocols are designed to do exactly this.

Resources