Related
I am looking for possible techniques to gracefully handle race conditions in network protocol design. I find that in some cases, it is particularly hard to synchronize two nodes to enter a specific protocol state. Here is an example protocol with such a problem.
Let's say A and B are in an ESTABLISHED state and exchange data. All messages sent by A or B use a monotonically increasing sequence number, such that A can know the order of the messages sent by B, and A can know the order of the messages sent by B. At any time in this state, either A or B can send a ACTION_1 message to the other, in order to enter a different state where a strictly sequential exchange of message needs to happen:
send ACTION_1
recv ACTION_2
send ACTION_3
However, it is possible that both A and B send the ACTION_1 message at the same time, causing both of them to receive an ACTION_1 message, while they would expect to receive an ACTION_2 message as a result of sending ACTION_1.
Here are a few possible ways this could be handled:
1) change state after sending ACTION_1 to ACTION_1_SENT. If we receive ACTION_1 in this state, we detect the race condition, and proceed to arbitrate who gets to start the sequence. However, I have no idea how to fairly arbitrate this. Since both ends are likely going to detect the race condition at about the same time, any action that follows will be prone to other similar race conditions, such as sending ACTION_1 again.
2) Duplicate the entire sequence of messages. If we receive ACTION_1 in the ACTION_1_SENT state, we include the data of the other ACTION_1 message in the ACTION_2 message, etc. This can only work if there is no need to decide who is the "owner" of the action, since both ends will end up doing the same action to each other.
3) Use absolute time stamps, but then, accurate time synchronization is not an easy thing at all.
4) Use lamport clocks, but from what I understood these are only useful for events that are causally related. Since in this case the ACTION_1 messages are not causally related, I don't see how it could help solve the problem of figuring out which one happened first to discard the second one.
5) Use some predefined way of discarding one of the two messages on receipt by both ends. However, I cannot find a way to do this that is unflawed. A naive idea would be to include a random number on both sides, and select the message with the highest number as the "winner", discarding the one with the lowest number. However, we have a tie if both numbers are equal, and then we need another way to recover from this. A possible improvement would be to deal with arbitration once at connection time and repeat similar sequence until one of the two "wins", marking it as favourite. Every time a tie happens, the favourite wins.
Does anybody have further ideas on how to handle this?
EDIT:
Here is the current solution I came up with. Since I couldn't find 100% safe way to prevent ties, I decided to have my protocol elect a "favorite" during the connection sequence. Electing this favorite requires breaking possible ties, but in this case the protocol will allow for trying multiple times to elect the favorite until a consensus is reached. After the favorite is elected, all further ties are resolved by favoring the elected favorite. This isolates the problem of possible ties to a single part of the protocol.
As for fairness in the election process, I wrote something rather simple based on two values sent in each of the client/server packets. In this case, this number is a sequence number starting at a random value, but they could be anything as long as those numbers are fairly random to be fair.
When the client and server have to resolve a conflict, they both call this function with the send (their value) and the recv (the other value) values. The favorite calls this function with the favorite parameter set to TRUE. This function is guaranteed to give the opposite result on both ends, such that it is possible to break the tie without retransmitting a new message.
BOOL ResolveConflict(BOOL favorite, UINT32 sendVal, UINT32 recvVal)
{
BOOL winner;
int sendDiff;
int recvDiff;
UINT32 xorVal;
xorVal = sendVal ^ recvVal;
sendDiff = (xorVal < sendVal) ? sendVal - xorVal : xorVal - sendVal;
recvDiff = (xorVal < recvVal) ? recvVal - xorVal : xorVal - recvVal;
if (sendDiff != recvDiff)
winner = (sendDiff < recvDiff) ? TRUE : FALSE; /* closest value to xorVal wins */
else
winner = favorite; /* break tie, make favorite win */
return winner;
}
Let's say that both ends enter the ACTION_1_SENT state after sending the ACTION_1 message. Both will receive the ACTION_1 message in the ACTION_1_SENT state, but only one will win. The loser accepts the ACTION_1 message and enters the ACTION_1_RCVD state, while the winner discards the incoming ACTION_1 message. The rest of the sequence continues as if the loser had never sent ACTION_1 in a race condition with the winner.
Let me know what you think, and how this could be further improved.
To me, this whole idea that this ACTION_1 - ACTION_2 - ACTION_3 handshake must occur in sequence with no other message intervening is very onerous, and not at all in line with the reality of networks (or distributed systems in general). The complexity of some of your proposed solutions give reason to step back and rethink.
There are all kinds of complicating factors when dealing with systems distributed over a network: packets which don't arrive, arrive late, arrive out of order, arrive duplicated, clocks which are out of sync, clocks which go backwards sometimes, nodes which crash/reboot, etc. etc. You would like your protocol to be robust under any of these adverse conditions, and you would like to know with certainty that it is robust. That means making it simple enough that you can think through all the possible cases that may occur.
It also means abandoning the idea that there will always be "one true state" shared by all nodes, and the idea that you can make things happen in a very controlled, precise, "clockwork" sequence. You want to design for the case where the nodes do not agree on their shared state, and make the system self-healing under that condition. You also must assume that any possible message may occur in any order at all.
In this case, the problem is claiming "ownership" of a shared clipboard. Here's a basic question you need to think through first:
If all the nodes involved cannot communicate at some point in time, should a node which is trying to claim ownership just go ahead and behave as if it is the owner? (This means the system doesn't freeze when the network is down, but it means you will have multiple "owners" at times, and there will be divergent changes to the clipboard which have to be merged or otherwise "fixed up" later.)
Or, should no node ever assume it is the owner unless it receives confirmation from all other nodes? (This means the system will freeze sometimes, or just respond very slowly, but you will never have weird situations with divergent changes.)
If your answer is #1: don't focus so much on the protocol for claiming ownership. Come up with something simple which reduces the chances that two nodes will both become "owner" at the same time, but be very explicit that there can be more than one owner. Put more effort into the procedure for resolving divergence when it does happen. Think that part through extra carefully and make sure that the multiple owners will always converge. There should be no case where they can get stuck in an infinite loop trying to converge but failing.
If your answer is #2: here be dragons! You are trying to do something which buts up against some fundamental limitations.
Be very explicit that there is a state where a node is "seeking ownership", but has not obtained it yet.
When a node is seeking ownership, I would say that it should send a request to all other nodes, at intervals (in case another one misses the first request). Put a unique identifier on each such request, which is repeated in the reply (so delayed replies are not misinterpreted as applying to a request sent later).
To become owner, a node should receive a positive reply from all other nodes within a certain period of time. During that wait period, it should refuse to grant ownership to any other node. On the other hand, if a node has agreed to grant ownership to another node, it should not request ownership for another period of time (which must be somewhat longer).
If a node thinks it is owner, it should notify the others, and repeat the notification periodically.
You need to deal with the situation where two nodes both try to seek ownership at the same time, and both NAK (refuse ownership to) each other. You have to avoid a situation where they keep timing out, retrying, and then NAKing each other again (meaning that nobody would ever get ownership).
You could use exponential backoff, or you could make a simple tie-breaking rule (it doesn't have to be fair, since this should be a rare occurrence). Give each node a priority (you will have to figure out how to derive the priorities), and say that if a node which is seeking ownership receives a request for ownership from a higher-priority node, it will immediately stop seeking ownership and grant it to the high-priority node instead.
This will not result in more than one node becoming owner, because if the high-priority node had previously ACKed the request sent by the low-priority node, it would not send a request of its own until enough time had passed that it was sure its previous ACK was no longer valid.
You also have to consider what happens if a node becomes owner, and then "goes dark" -- stops responding. At what point are other nodes allowed to assume that ownership is "up for grabs" again? This is a very sticky issue, and I suspect you will not find any solution which eliminates the possibility of having multiple owners at the same time.
Probably, all the nodes will need to "ping" each other from time to time. (Not referring to an ICMP echo, but something built in to your own protocol.) If the clipboard owner can't reach the others for some period of time, it must assume that it is no longer owner. And if the others can't reach the owner for a longer period of time, they can assume that ownership is available and can be requested.
Here is a simplified answer for the protocol of interest here.
In this case, there is only a client and a server, communicating over TCP. The goal of the protocol is to two system clipboards. The regular state when outside of a particular sequence is simply "CLIPBOARD_ESTABLISHED".
Whenever one of the two systems pastes something onto its clipboard, it sends a ClipboardFormatListReq message, and transitions to the CLIPBOARD_FORMAT_LIST_REQ_SENT state. This message contains a sequence number that is incremented when sending the ClipboardFormatListReq message. Under normal circumstances, no race condition occurs and a ClipboardFormatListRsp message is sent back to acknowledge the new sequence number and owner. The list contained in the request is used to expose clipboard data formats offered by the owner, and any of these formats can be requested by an application on the remote system.
When an application requests one of the data formats from the clipboard owner, a ClipboardFormatDataReq message is sent with the sequence number, and format id from the list, the state is changed to CLIPBOARD_FORMAT_DATA_REQ_SENT. Under normal circumstances, there is no change of clipboard ownership during that time, and the data is returned in the ClipboardFormatDataRsp message. A timer should be used to timeout if no response is sent fast enough from the other system, and abort the sequence if it takes too long.
Now, for the special cases:
If we receive ClipboardFormatListReq in the CLIPBOARD_FORMAT_LIST_REQ_SENT state, it means both systems are trying to gain ownership at the same time. Only one owner should be selected, in this case, we can keep it simple an elect the client as the default winner. With the client as the default owner, the server should respond to the client with ClipboardFormatListRsp consider the client as the new owner.
If we receive ClipboardFormatDataReq in the CLIPBOARD_FORMAT_LIST_REQ_SENT state, it means we have just received a request for data from the previous list of data formats, since we have just sent a request to become the new owner with a new list of data formats. We can respond with a failure right away, and sequence numbers will not match.
Etc, etc. The main issue I was trying to solve here is fast recovery from such states, with going into a loop of retrying until it works. The main issue with immediate retrial is that it is going to happen with timing likely to cause new race conditions. We can solve the issue by expecting such inconsistent states as long as we can move back to proper protocol states when detecting them. The other part of the problem is with electing a "winner" that will have its request accepted without resending new messages. A default winner can be elected by default, such as the client or the server, or some sort of random voting system can be implemented with a default favorite to break ties.
Scenario:
I send encrypted information to a client program.
I want to the information to display 1 year later.
No further information will be sent by me.
If the user of the client program can do analysis of the binary file of the program, is it possible to prevent the early reveal of the information?
In general, such a thing is not possible. If the program is able to decrypt the data without further interaction, it must possess the key.
Therefore, even with signed timestamping, you cannot prevent someone from reverse-engineering your program, taking the key, and doing the decryption.
EDIT: Though you could at least in theory implement something like this indirectly, by requiring a computionally intensive puzzle to be solved for retrieving the key (which takes a year on the average!), but this is unreliable at best (faster/slower hardware) and will certainly not find acceptance among your users/customers. Be prepared to receive hate mails if you do that :-)
Interesting question. I think it is not possible as you described, unless server store a part of the secret and deliver it to the client at the right moment.
If you've sent the client all the info they need to do the decryption, there is no way to force them to wait a year before doing so.
You can always use a timemachine and base your key on a hash of, say, the Dow Jones index one year from now as well as some other data that can't be pre-calculated. So unless you have some inside info that only you know about the day the decryption should occur, I think you're facing a quite impossible task
I have a situation where a main orchestration is responsible for processing a convoy of messages. These messages belong to a set of customers, the orchestration will read the messages as they come in, and for each new customer id it finds, it will spin up a new orchestration that is responsible for processing the messages of a particular customer. I have to preserve the order of messages as they come in, so the newly created orchestrations should process the message it has and wait for additional messages from the main orchestration.
Tried different ways to tackle this, but was not able to successfuly implement it.
I would like to hear your opinions on how this could be done.
Thanks.
It sounds like what you want is a set of nested convoys. While it might be possible to get that working, it's going to... well, hurt. In particular, my first worry would be maintenance: any changes to the process would be a pain in the neck to make, and, much worse, deployment would really, really suck.
Personally, I would really try to find an alternative way to implement this and avoid the convoys if possible, but that would depend a lot on your specific scenario.
A few questions, if you don't mind:
What are your ordering requirements? For example, do you only need ordered processing for each customer on a single incoming batch, or across batches? If the latter, could you make do without the master orchestration and just force a single convoy'd instance per customer? Still not great, but would likely simplify things a lot.
What are you failure requirements with respect to ordering? Should it completely stop processing? Save message and keep going? What about retries?
Is ordering based purely on the arrival time of the message? Is there anything in the message that you could use to force ordering internally instead of relying purely on the arrival time?
What does the processing of the individual messages do? Is the ordering requirement only to ensure that certain preconditions are met when a specific message is processed (for example, messages represent some tree structure that requires parents are processed before children).
I don't think you need a master orchestration to start up the sub-orchestrations. I am assumin you are not talking about the master orchestration implmenting a convoy pattern. So, if that's the case, here's what I might do.
There is a brief example here on how to implment a singleton orchestration. This example shows you how to setup an orchestration that will only ever exist once. All the messages going to it will be lined up in order of receipt and processed one at a time. Your example differs in that you want to have this done by customer ID. This is pretty simple. Promote the customer ID in the inbound message and add it to the correlation type. Now, there will only ever be one instance of the orchestration per customer.
The problem with singletons is this. You have to kill them at some point or they will live forever as dehydrated orchestrations. So, you need to have them end. You can do this if there is a way for the last message for a given customer to signal the orchestration that it's time to die through an attribute or such. If this is not possible, then you need to set a timer. If no messags are received in x seconds, terminate the orch. This is all easy to do, but it can introduce Zombies. Zombies occur when that orchestration is in the process of being shut down when another message for that customer comes in. this can usually be solved by tweeking the time to wait. Regardless, it will cause the occasional Zombie.
A note fromt he field. We've done this and it's really not a great long term solution. We were receiving customer info updates and we had to ensure ordered processing. We did this singleton approach and it's been problematic from the Zombie issue and the exeption issue. If the Singleton orchestration throws an exception, it will block the processing for a all future messages for that customer. So - handle every single possible exception. The real solution would have been to have the far end system check the time stamps from the update messages and discard ones that were older than the last update. We wanted to go this way, but the receiving system didn't want to do this extra work.
I'm learning about the various networking technologies, specifically the protocols UDP and TCP.
I've read numerous times that games like Quake use UDP because, "it doesn't matter if you miss a position update packet for a missile or the like, because the next packet will put the missile where it needs to be."
This thought process is all well-and-good during the flight path of an object, but it's not good for when the missile reaches it's target. If one computer receives the message that the missile reached it's intended target, but that packet got dropped on a different computer, that would cause some trouble.
Clearly that type of thing doesn't really happen in games like Quake, so what strategy are they using to make sure that everyone is in sync with instantaneous type events, such as a collision?
You've identified two distinct kinds of information:
updates that can be safely missed, because the information they carry will be provided in the next update;
updates that can't be missed, because the information they carry is not part of the next regular update.
You're right - and what the games typically do is to separate out those two kinds of messages within their protocol, and require acknowledgements and retransmissions for the second type, but not for the first type. (If the underlying IP protocol is UDP, then these acknowledgements / retransmissions need to be provided at a higher layer).
When you say that "clearly doesn't happen", you clearly haven't played games on a lossy connection. A popular trick amongst the console crowd is to put a switch on the receive line of your ethernet connection so you can make your console temporarily stop receiving packets, so everybody is nice and still for you to shoot them all.
The reason that could happen is the console that did the shooting decides if it was a hit or not, and relays that information to the opponent. That ensures out of sync or laggy hit data can be deterministically decided. Even if the remote end didn't think that the shot was a hit, it should be close enough that it doesn't seem horribly bad. It works in a reasonable manner, except for what I've mentioned above. Of course, if you assume your players are not cheating, this approach works quite reasonably.
I'm no expert, but there seems to be two approaches you can take. Let the client decide if it's a hit or not (allows for cheating), or let the server decide.
With the former, if you shoot a bullet, and it looks like a hit, it will count as a hit. There may be a bit of a delay before everyone else receives this data though (i.e., you may hit someone, but they'll still be able to play for half a second, and then drop dead).
With the latter, as long as the server receives the information that you shot a bullet, it can use whatever positions it currently has to determine if there was a hit or not, then send that data back for you. This means neither you nor the victim will be aware of you hit or not until that data is sent back to you.
I guess to "smooth" it out you let the client decide for itself, and then if the server pipes in and says "no, that didn't happen" it corrects. Which I suppose could mean players popping back to life, but I reckon it would make more sense just to set their life to 0 and until you get a definitive answer so you don't have weird graphical things going on.
As for ensuring the server/client has received the event... I guess there are two more approaches. Either get the server/client to respond "Yeah, I received the event" or forget about events altogether and just think about everything in terms of state. There is no "hit" event, there's just HP before and after. Sooner or later, it'll receive the most up-to-date state.
I'm playing around with a native (non-web) single-player game I'm writing, and it occured to me that having a daily/weekly/all-time online highscore list (think Xbox Live Leaderboard) would make the game much more interesting, adding some (small) amount of community and competition. However, I'm afraid people would see such a feature as an invitation to hacking, which would discourage regular players due to impossibly high scores.
I thought about the obvious ways of preventing such attempts (public/private key encryption, for example), but I've figured out reasonably simple ways hackers could circumvent all of my ideas (extracting the public key from the binary and thus sending fake encrypted scores, for example).
Have you ever implemented an online highscore list or leaderboard? Did you find a reasonably hacker-proof way of implementing this? If so, how did you do it? What are your experiences with hacking attempts?
At the end of the day, you are relying on trusting the client. If the client sends replays to the server, it is easy enough to replicable or modify a successful playthrough and send that to the server.
Your best bet is to raise the bar for cheating above what a player would deem worth surmounting. To do this, there are a number of proven (but oft-unmentioned) techniques you can use:
Leave blacklisted cheaters in a honeypot. They can see their own scores, but no one else can. Unless they verify by logging in with a different account, they think they have successfully hacked your game.
When someone is flagged as a cheater, defer any account repercussions from transpiring until a given point in the future. Make this point random, within one to three days. Typically, a cheater will try multiple methods and will eventually succeed. By deferring account status feedback until a later date, they fail to understand what got them caught.
Capture all game user commands and send them to the server. Verify them against other scores within a given delta. For instance, if the player used the shoot action 200 times, but obtained a score of 200,000, but the neighboring players in the game shot 5,000 times to obtain a score of 210,000, it may trigger a threshold that flags the person for further or human investigation.
Add value and persistence to your user accounts. If your user accounts have unlockables for your game, or if your game requires purchase, the weight of a ban is greater as the user cannot regain his previous account status by simply creating a new account through a web-based proxy.
No solution is ever going to be perfect while the game is running on a system under the user's control, but there are a few steps you could take to make hacking the system more trouble. In the end, the goal can only be to make hacking the system more trouble than it's worth.
Send some additional information with the high score requests to validate one the server side. If you get 5 points for every X, and the game only contains 10 Xs, then you've got some extra hoops to make the hacker to jump through to get their score accepted as valid.
Have the server send a random challenge which must be met with a few bytes of the game's binary from that offset. That means the hacker must keep a pristine copy of the binary around (just a bit more trouble).
If you have license keys, require high scores to include them, so you can ban people caught hacking the system. This also lets you track invalid attempts as defined above, to ban people testing out the protocol before the ever even submit a valid score.
All in all though, getting the game popular enough for people to care to hack it is probably a far bigger challenge.
I honestly don't think it's possible.
I've done it before using pretty simple key encryption with a compressed binary which worked well enough for the security I required but I honestly think if somebody considers cracking your online high score table a hack it will be done.
There are some pretty sad people out there who also happen to be pretty bright unless you can get them all laid it's a lost cause.
If your game has a replay system built in, you can submit replays to the server and have the server calculate the score from the replay.
This method isn't perfect, you can still cheat by slowing down the game (if it is action-based), or by writing a bot.
I've been doing some of this with my Flash games, and it's a losing battle really. Especially for ActionScript that can be decompiled into somewhat readable code without too much effort.
The way I've been doing it is a rather conventional approach of sending the score and player name in plain text and then a hash of the two (properly salted). Very few people are determined enough to take the effort to figure that out, and the few who are would do it anyway, negating all the time you put into it.
To summarize, my philosophy is to spend the time on making the game better and just make it hard enough to cheat.
One thing that might be pretty effective is to have the game submit the score to the server several times as you are playing, sending a bit of gameplay information each time, allowing you to validate if the score is "realistic". But that might be a bit over-the-top really.
That's a really hard question.
I've never implemented such thing but here's a simple aproximmation.
Your main concern is due to hackers guessing what is it your application is doing and then sending their own results.
Well, first of all, unless your application has a great success I wouldn't be worried. Doing such thing is extremely difficult.
Encryption won't help with the problem. You see, encryption helps to protect the data on its way but it doesn't protect either of the sides of the transaction before the data is encrypted (which is where the main vulnerability may be). So if you encrypt the sure, the data will remain private but it won't be safe.
If you are really worried about it I will suggest obfuscating the code and designing the score system in a way which is not completely obvious what is doing. Here we can borrow some things from an encryption protocol. Here is an example:
Let's say the score is some number m
Compute some kind of check over the score (for example the CRC or any other system you see feet. In fact, if you just invent one, no matter how lame is it it will work better)
Obtain the private key of the user (D) from your remote server (over a secure connection obviously). You're the only one which know this key.
Compute X=m^D mod n (n being the public module of your public/private key algorithm) (that is, encrypt it :P)
As you see that's just obfuscation of another kind. You can go down that way as long as you want. For example you can lookup the nearest two prime numbers to X and use them to encrypt the CRC and send it also to the server so you'll have the CRC and the score separately and with different encryption schemes.
If you use that in conjunction with obfuscation I'd say that would be difficult to hack. Nontheless even that could be reverse engingeered, it all depends on the interest and ability of the hacker but ... seriously, what kind of freak takes so much effort to change its results on a game? (Unless is WoW or something)
One last note
Obfuscator for .NET
Obfuscator for Delphi/C++
Obfuscator for assembler (x86)
As the other answer says, you are forced to trust a potentially malicious client, and a simple deterant plus a little human monitoring is going to be enough for a small game.
If you want to get fancy, you then have to look for fraud patterns in the score data, simmular to a credit card company looking at charge data. The more state the client communicates onto your server, the potentially easier it is to find a pattern of correct or incorrect behavior via code. For example. say that the client had to upload a time based audit log of the score (which maybe you can also use to let another clients watch the top games), the server can then validate if the score log breaks any of the game rules.
In the end, this is still about making it expensive enough to discourage cheating the scoreboard. You would want a system where you can always improve the (easier to update)server code to deal with any new attacks on your validation system.
#Martin.
This is how I believe Mario Kart Wii works. The added bonus is that you can let all the other players watch how the high score holder got the high score. The funny thing about this is that if you check out the fastest "Grumble Volcano" time trail, you'll see that somebody found a shortcut that let you skip 95% of the track. I'm not sure if they still have that up as the fastest time.
You can't do it on a nontrusted client platform. In practice it is possible to defeat even some "trusted" platforms.
There are various attacks which are impossible to detect in the general case - mainly modifying variables in memory. If you can't trust your own program's variables, you can't really achieve very much.
The other techniques outlined above may help, but don't solve the basic problem of running on a nontrusted platform.
Out of interest, are you sure that people will try to hack a high score table? I have had a game online for over two years now with a trivially-crackabe high score table. Many people have played it but I have no evidence that anyone's tried to crack the high scores.
Usually, the biggest defender against cheating and hacking is a community watch. If a score seems rather suspicious, a user can report the score for cheating. And if enough people report that score, the replay can be checked by the admins for validity. It is fairly easy to see the difference between a bot an an actual player, if there's already a bunch of players playing the game in full legitimacy.
The admins must oversee only those scores that get questioned, because there is a small chance that a bunch of users might bandwagon to remove a perfectly hard-earned score. And the admins only have to view the few scores that do get reported, so it's not too much of their time, even less for a small game.
Even just knowing that if you work hard to make a bot, just to be shot down again by the report system, is a deterrent in itself.
Perhaps even encrypting the replay data wouldn't hurt, either. Replay data is often small, and encrypting it wouldn't take too much more space. And to help improve that, the server itself would try out the replay by the control log, and make sure it matches up with the score achieved.
If there's something the anti-cheat system can't find, users will find it.