Calculation of data delta - math

I'm writing a server that sends a "coordinates buffer" of game objects to clients every 300ms. But I don't want to send the full data each time. For example, suppose I have an array with elements that change over time:
0 0 100 50 -100 -50 at time t
0 10 100 51 -101 -50 at time t + 300ms
You can see that only the 2nd, 4th, and 5th elements have changed.
What is the right way to send not all the elements, but only the delta? Ideally I'd like a function that returns the complete data the first time and empty data when there are no changes.
Thanks.

Are you looking to optimize for efficiency, or is this a learning exercise? Some thoughts:
Unless there's a lot of data, it's probably easiest, and not terribly inefficient, to send all the data each time.
If you send deltas for all of the data points each time, you won't save much by sending zeroes for unchanged points instead of re-sending the previous values.
If you send data for only those points that change, you'll need to provide an index for each value. For example, if point 3 increases by 5 and point 8 decreases by 2, then you might send 3 5 8 -2. But now, since you're sending two values for each point that changes, you'll only win if fewer than half the points change.
If the values change relatively slowly, as compared to the rate at which you transmit updates, you might increase efficiency by transmitting the delta for each data point, but using only a few bits. For example, with 4 bits you can transmit values from -8 to +7. That would work as long as the deltas are never larger than that, or if it's ok to transmit several deltas before they "catch up" to the actual values (see the sketch after these notes).
It may not be worthwhile to have 2 different mechanisms: one to send the initial values, and another to send deltas. If you can tolerate the lag, it may make more sense to assume some constant initial value for every point, and then transmit only deltas.
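Regarding the few-bit delta idea above, here is a minimal sketch (Python; the names are just illustrative) that clamps each delta to the 4-bit range [-8, 7] and packs two deltas per byte:

def pack_nibble_deltas(previous, current):
    # clamp each delta to what 4 bits can hold; the receiver "catches up" over later updates
    deltas = [max(-8, min(7, c - p)) for p, c in zip(previous, current)]
    packed = bytearray()
    for i in range(0, len(deltas), 2):
        hi = deltas[i] & 0xF
        lo = (deltas[i + 1] & 0xF) if i + 1 < len(deltas) else 0
        packed.append((hi << 4) | lo)
    return bytes(packed)

# 6 values -> 3 bytes; note the delta of 10 gets clamped to 7 and catches up next update
print(pack_nibble_deltas([0, 0, 100, 50, -100, -50], [0, 10, 100, 51, -101, -50]).hex())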

There are lots of options. If most data isn't changing, just send (index,value) pairs of the changed elements. If most values change but the changes are small, compute deltas and gzip (or run length encode, or lots of other possibilities) the result.
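As a rough sketch of the (index, value) approach (Python; names are illustrative), here is a function that returns the full buffer on the first call, only the changed pairs afterwards, and an empty list when nothing changed:

def compute_delta(previous, current):
    if previous is None:
        return list(enumerate(current))   # first send: the complete data as (index, value) pairs
    return [(i, v) for i, (old, v) in enumerate(zip(previous, current)) if old != v]

buf_t0 = [0, 0, 100, 50, -100, -50]
buf_t1 = [0, 10, 100, 51, -101, -50]
print(compute_delta(None, buf_t0))    # full snapshot
print(compute_delta(buf_t0, buf_t1))  # [(1, 10), (3, 51), (4, -101)]  (0-based indices)
print(compute_delta(buf_t1, buf_t1))  # []  -> nothing to send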

Related

Is it possible to encode date AND time (with some caveats) into 12 bits?

I have at my disposal 16 bits. Of them, 4 bits are the header and cannot be touched. This leaves us with 12 bits. I would like to encode date and time data into them. These are essentially logs being sent over LPWAN.
Obviously, it's impossible to encode a proper generic date and time into it, since the Unix timestamp uses 32 bits, and projects like Compact Time Format use 5 bytes.
Let's say we don't really need the year, because this information is available elsewhere. Let's also say the time resolution of seconds doesn't have to be super accurate, so we can split the seconds into 30 second intervals. If we were to simply encode the data as is then:
4 bits month (0-11)
5 bits day (0-31)
5 bits hour (0-23)
6 bits minute (0-59)
1 bit second (0,30)
-----------------------------
21 bits
21 bits is much better than 32. But it's still not 12. I could subtract one bit from the minutes (rounding to the nearest even minute), and remove the seconds but that still leaves us with 19 bits. Which is still far from 12.
Just wondering if it's possible, and if anyone has any ideas.
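For concreteness, a quick sketch (Python) of the naive packing listed above, with the field widths exactly as in the breakdown:

def pack21(month, day, hour, minute, half_minute):
    # month: 4 bits, day: 5, hour: 5, minute: 6, half_minute: 1  -> 21 bits total
    return (month << 17) | (day << 12) | (hour << 7) | (minute << 1) | half_minute

print(pack21(11, 31, 23, 59, 1).bit_length())   # 21 -- still 9 bits over the 12-bit budget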
12 bits can hold 2^12 = 4096 values, which feels pretty tight for this task. I'm not sure much can be done in terms of compressing a date and time into 4096 distinct values; it is too little space to represent this data.
There are some workarounds, none of them able to achieve what you want, but maybe something you could use anyway:
Split date and time. Alternate with some algorithm between sending the date and the time; one bit can be used to indicate which is being sent. This leaves 11 bits to encode either the date or the time. You could go a bit further and split the time like this as well. The receiving side can then reconstruct a full date and time using the previously received data.
You could have a scheme where one full date packet is sent as a starting point, and subsequent packets count N-second intervals from that starting point (the epoch).
Remove the date and time from the data completely, saving 12 bits, but send it periodically as a stand-alone heartbeat/datetime packet.
You could try compressing the whole data packet, which could allow using more bits to represent the date and time while still fitting into a smaller overall packet size.
If data is sent at reasonably fixed intervals, you could use a circular counter of N-second intervals. This may work if you have few devices and you can keep track of when they start transmitting. For example: a satellite was launched at date/time XYZ and sends a counter every 30 seconds; if we receive a counter value of 100, we calculate the date with simple math, XYZ + 30*100 seconds.
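A rough sketch of that last counter idea (Python; the launch time and interval are made-up values, and the receiver is assumed to keep track of how many times the 12-bit counter has wrapped):

from datetime import datetime, timedelta

EPOCH = datetime(2024, 1, 1)      # "XYZ", known out-of-band
INTERVAL = 30                     # seconds per counter tick
MODULUS = 1 << 12                 # a 12-bit counter wraps at 4096 (~34 hours at 30 s/tick)

def decode(counter, wraps_seen=0):
    ticks = wraps_seen * MODULUS + counter
    return EPOCH + timedelta(seconds=ticks * INTERVAL)

print(decode(100))                # 2024-01-01 00:50:00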
No, unless you'd be happy with representing a span of less than a day and a half. You can just count 4096 30-second intervals and find that they will cover 34 hours and eight minutes. 4096 two-minute intervals is just four times that, or five days, 16 hours, and 32 minutes. Still a small fraction of a year.
If you can ensure that the difference between successive log entries is small, then you can stuff that in 12 bits. You will need a special entry to give an initial date, and maybe you could insert such an entry when the difference between successive entries is too large.
@oleksii has some other good suggestions as well.
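A hedged sketch of the delta idea (Python; the 0xFFF marker and 30-second granularity are assumptions, not part of any spec): each 12-bit entry holds the number of 30-second intervals since the previous log entry, and a reserved value flags that a full-date entry is needed instead.

RESYNC = 0xFFF                    # reserved: gap too large, emit a full-date entry instead
MAX_TICKS = RESYNC - 1            # up to ~34 hours between successive entries

def encode_gap(prev_seconds, now_seconds, interval=30):
    ticks = (now_seconds - prev_seconds) // interval
    return ticks if 0 <= ticks <= MAX_TICKS else RESYNC

print(encode_gap(0, 90))          # 3
print(encode_gap(0, 200_000))     # 4095 (RESYNC) -> send the full date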

Convention for comparing NCR/PCR values as ahead/behind

DVB-S2X satellite data communications has a Network Clock Reference. MPEG streams have a Program Clock Reference. In both cases they're a number of ticks on a 27MHz clock that wraps around at 300*2^33 because the number is defined as a 33-bit "base" plus a 9-bit "extension" with the latter running from 0..299 and the former incrementing by one tick each time the latter wraps around.
My question is this: given 2 NCR counter values A and B, is there a standards-defined method for determining whether A is "behind" or "ahead of" B?
For example, if A is 20 and B is 30 then intuitively you can see that A is behind B by 10 ticks.
Conversely, if A is 10 and B is 300*2^33-10 then intuitively A is ahead of B by 20 ticks with the counter having wrapped around as it incremented from 300*2^33-1 to 0.
Previously when I've worked with things like data packet sequence numbers, we've taken by convention that if max(A,B)-min(A,B) < WRAPVALUE/2 then the larger number is ahead of the smaller, otherwise it’s behind and the difference is given by min(A,B)+WRAPVALUE-max(A,B) (draw some number lines if you don't follow this).
Is this half ahead/behind heuristic simply a convention I've picked up over the years, or is there something more concrete in an ETSI spec somewhere mandating that this is how NCR/PCR values are to be compared and the difference between them calculated? Or any other wraparound number types, for that matter.
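For reference, a small sketch of that half-range convention (Python; this is just the heuristic described above, not something taken from an ETSI spec):

WRAP = 300 * 2**33                # NCR/PCR modulus

def signed_diff(a, b):
    # positive: a is ahead of b; negative: a is behind b (shortest way around the circle)
    d = (a - b) % WRAP
    return d if d < WRAP // 2 else d - WRAP

print(signed_diff(20, 30))                # -10 -> 20 is behind 30
print(signed_diff(10, 300 * 2**33 - 10))  #  20 -> 10 is ahead, across the wrap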

Game Logic Math headbreaker

I have 10 objects traveling at the same speed to 5 different destinations in my game world.
The 5 destinations take 5,10,15,20 and 25 seconds to reach. So each destination has 2 objects traveling to it.
My 10 objects all start from the same origin, at 5-second intervals. So when object1 is still traveling, after 5 seconds object2 starts to move, and so on. The problem is that the objects' destinations are random... so object1 may have the furthest destination in case 1, and in case 2 it may be that object10 has the furthest destination. In this particular example, I have 5 destinations that will each receive 2 objects.
How do I calculate the maximum time it could potentially take for all objects to reach their destination? Preferably breaking it down into a logical function that captures the above. It doesn't have to be in C# or anything like that, I just want some help in creating a function that could also capture more complex scenarios where I have more objects and more destinations...
So the variables are:
Objects,
Destinations + Time to reach the particular destination,
Interval at which they start.
For the avoidance of doubt: each destination will receive an equal number of objects. So the total number of objects traveling is always an even number.
The outcome should be the longest hypothetical time it would take for all cubes to reach their destination (and bonus points for the shortest amount of time it would take).
I have been trying to capture this in Excel to calculate a few scenarios but I fail miserably...
Apologies for the Grade 9 Highschool level question here, but this one has me really puzzled!
The maximum time it takes for all objects to reach the destination:
The latest time objects start to travel: 10 objects, one every 5 seconds, means the last one starts at 45s.
Plus:
The longest time to reach a target: 25s.
So the maximum time for the worst case is 45 + 25 = 70 seconds.
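A small sketch capturing that logic (Python; travel_times lists one entry per object, i.e. each destination's time repeated for however many objects it receives):

def worst_case_total(interval, travel_times):
    # worst case: the slowest destination is drawn by the last object to depart
    return (len(travel_times) - 1) * interval + max(travel_times)

def best_case_total(interval, travel_times):
    # best case: the longest trips go to the earliest departures
    starts = [i * interval for i in range(len(travel_times))]
    trips = sorted(travel_times, reverse=True)
    return max(s + t for s, t in zip(starts, trips))

times = [5, 10, 15, 20, 25] * 2          # 5 destinations, 2 objects each
print(worst_case_total(5, times))        # 70
print(best_case_total(5, times))         # 50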

Sliding window protocol, calculation of sequence number bits

I am preparing for my exams and was solving problems on the Sliding Window Protocol when I came across these questions.
A 1000 km long cable operates at 1MBPS. Propagation delay is 10 microsec/km. If the frame size is 1kB, then how many bits are required for the sequence number?
A) 3 B) 4 C) 5 D) 6
I got the answer as option C, as follows:
propagation time is 10 microsec/km
so, for 1000 km it is 10*1000 microsec, i.e. 10 millisec
then RTT will be 20 millisec
in 10^3 millisec the link carries 8*10^6 bits
so, in 20 millisec it carries X bits;
X = 20*(8*10^6)/10^3 = 160*10^3 bits
now, 1 frame is of size 1 kB, i.e. 8000 bits
so the total number of frames will be 20; this will be the window size.
hence, to represent 20 frames uniquely we need 5 bits.
The answer was correct as per the answer key... and then I came across this one:
Frames of 1000 bits are sent over a 10^6 bps duplex link between two hosts. The propagation time is
25ms. Frames are to be transmitted into this link to maximally pack them in transit (within the link).
What is the minimum number of bits (l) that will be required to represent the sequence numbers distinctly?
Assume that no time gap needs to be given between transmission of two frames.
(A) l=2 (B) l=3 (C) l=4 (D) l=5
As per the earlier one, I solved this one as follows:
propagation time is 25 ms
then RTT will be 50 ms
in 10^3 ms the link carries 10^6 bits
so, in 50 ms it carries X bits;
X = 50*(10^6)/10^3 = 50*10^3 bits
now, 1 frame is of size 1 kb, i.e. 1000 bits
so the total number of frames will be 50; this will be the window size.
hence, to represent 50 frames uniquely we need 6 bits.
and 6 is not even among the options. The answer key uses the same solution but takes the propagation time, not the RTT, for the calculation, and their answer is 5 bits. I am totally confused: which one is correct?
I don't see what RTT has to do with it. The frames are only being sent in one direction.
Round-trip time means that you have to take into account the ACK (acknowledgement message) you must receive, which tells you the frames you are sending are being received on the other side of the link. This 'time' window is the period in which you get to send the remaining frames that the window allows before you anticipate an ACK.
Ideally you want to be able to transmit continuously, i.e. not having to stop at the window limit to wait for an ACK (which essentially turns into a stop-and-wait situation if you have to stop and wait for the ACK). The solution to this question is the minimum number of frames that can be transmitted from the moment the first frame is sent to the moment you get an ACK (in other words, the required window size).
Your calculations look correct in both cases, and it would be safe to assume the answer choices for the second question are wrong.
Here it's a duplex channel, so your RTT = Tp; hence they have considered Tp.
Now you will get X = 25*10³ bits, i.e. 25 frames.
So the number of bits needed for the sequence numbers will be 5.
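A quick worked check of both problems (Python), treating the window as the number of frames that fit in the relevant time span and the sequence-number field as ceil(log2(window)), as the answer keys do:

import math

def window_and_bits(frame_bits, bandwidth_bps, span_seconds):
    tt = frame_bits / bandwidth_bps               # transmission time of one frame
    window = round(span_seconds / tt)             # frames that fit in the span
    return window, math.ceil(math.log2(window))

print(window_and_bits(8000, 8e6, 0.020))   # (20, 5): 1 kB frames, 1 MBps link, 20 ms RTT
print(window_and_bits(1000, 1e6, 0.025))   # (25, 5): 1000-bit frames, 1 Mbps link, 25 ms one-way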

Getting accurate graphite stats_counts

We have an etsy/statsd node application running that flushes stats to carbon/whisper every 10 seconds. If you send 100 increments (counts) in the first 10 seconds, graphite displays them properly, like:
localhost:3000/render?from=-20min&target=stats_counts.test.count&format=json
[{"target": "stats_counts.test.count", "datapoints": [
[0.0, 1372951380], [0.0, 1372951440], ...
[0.0, 1372952460], [100.0, 1372952520]]}]
However, 10 seconds later, this number falls to 0, null, and/or 33.3. Eventually it settles at a value 1/6th of the initial number of increments, in this case 16.6.
/opt/graphite/conf/storage-schemas.conf is:
[sixty_secs_for_1_days_then_15m_for_a_month]
pattern = .*
retentions = 10s:10m,1m:1d,15m:30d
I would like to get accurate counts. Is graphite perhaps averaging the data over the 60-second windows rather than summing it? Using the integral function, after some time has passed, obviously gives:
localhost:3000/render?from=-20min&target=integral(stats_counts.test.count)&format=json
[{"target": "stats_counts.test.count", "datapoints": [
[0.0, 1372951380], [16.6, 1372951440], ...
[16.6, 1372952460], [16.6, 1372952520]]}]
Graphite data storage
Graphite manages the retention of data using a combination of the settings stored in storage-schemas.conf and storage-aggregation.conf. I see that your retention policy (the snippet from your storage-schemas.conf) is telling Graphite to only store 1 data point for its highest resolution (e.g. 10s:10m) and that it should manage the aggregation of those data points as the data ages and moves into the older intervals (with the lower resolution defined - e.g. 1m:1d). In your case, the data crosses into the next retention interval at 10 minutes, and after 10 minutes the data will roll up according to the settings in storage-aggregation.conf.
Aggregation / Downsampling
Aggregation/downsampling happens when data ages and falls into a time interval that has a lower-resolution retention specified. In your case, you'll have been storing 1 data point for each 10-second interval, but once that data is over 10 minutes old Graphite will store it as 1 data point for a 1-minute interval. This means you must tell graphite how it should take the 10-second data points (of which you have 6 per minute) and aggregate them into 1 data point for the entire minute. Should it average? Should it sum? Depending on the type of data (e.g. timing, counter) this can make a big difference, as you hinted at in your post.
By default graphite will average data as it aggregates into lower resolution data. Using average to perform the aggregation makes sense when applied to timer (and even gauge) data. That said, you are dealing with counters so you'll want to sum.
For example, in storage-aggregation.conf:
[count]
pattern = \.count$
xFilesFactor = 0
aggregationMethod = sum
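To see why the value settles at 16.6 rather than 100, here's a back-of-the-envelope illustration (plain Python, not Graphite code): the rolled-up minute contains six 10-second datapoints, only one of which holds the 100 counts.

points = [100, 0, 0, 0, 0, 0]          # six 10s datapoints inside the 1-minute bucket
print(sum(points) / len(points))       # average -> 16.666...  (default aggregation)
print(sum(points))                     # sum     -> 100        (what a counter needs)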
UI (and raw data) aggregation / downsampling
It is also important to understand how the aggregated/downsampled data is represented when viewing a graph or looking at raw (json) data for different time periods, as the data retention schema thresholds directly impact the graphs. In your case you are querying render?from=-20min which crosses your 10s:10m boundary.
Graphite will display (and perform realtime downsampling of) data according to the lowest-resolution precision defined. Stated another way, it means if you graph data that spans one or more retention intervals you will get rollups accordingly. An example will help (assuming the retention of: retentions = 10s:10m,1m:1d,15m:30d)
Any graph with data no older than the last 10 minutes will be displaying 10 second aggregations. When you cross the 10 minute threshold, you will begin seeing 1 minute worth of count data rolled up according to the policy set in the storage-aggregation.conf.
Summary / tldr;
Because you are graphing/querying for 20 minutes worth of data (e.g. render?from=-20min) you are definitely falling into a lower precision storage setting (i.e. 10s:10m,1m:1d,15m:30d) which means that aggregation is occurring according to your aggregation policy. You should confirm that you are using sum for the correct pattern in the storage-aggregation.conf file. Additionally, you can shorten the graph/query time range to less than 10min which would avoid the dynamic rollup.
