Why are the time-out values so small? - asynchronous

I'm not sure whether this is a purely Stack Overflow question; it relates to general design practice. Since I cannot think of another relevant Stack Exchange site, I am posting it here.
In the general design practice of converting an async call to a sync one, we use a time-out and wait for the results. While this may not exactly be good practice from the point of view of responsiveness, it definitely makes the implementation easier.
I have seen many such implementations and have often noticed that developers tend to give a very small time-out value. I can understand that they may have had the need for a responsive system in mind. But many of the applications I have seen are data-critical ones where loss of data is very bad, so it is always better to wait longer and get as much data as possible than to time out early and show the user an error message. Situations where the server fails to return data, or the client cannot reach the server, are rare. For those situations I would expect a large time-out. After all, a time-out does not mean that the wait will last until the given value; the time-out value is only an upper limit. So I have always argued for higher values here. But I see low values used in more and more places, and now I'm wondering whether there is something else in this practice that I don't understand.
So, my question is: are there any arguments, other than the need for responsiveness, for implementing a very small time-out for waiting?
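To make the pattern in question concrete, here is a minimal sketch of turning an asynchronous call into a blocking one with a time-out. The names `fetch_data` and `call_sync` are made up for illustration, and the sleep merely stands in for a real server call. It shows the point made above: the wait ends as soon as the result arrives, and the time-out only caps it.

```python
import concurrent.futures
import time


def fetch_data(delay_s: float) -> str:
    """Hypothetical stand-in for an asynchronous operation such as a server call."""
    time.sleep(delay_s)
    return "payload"


def call_sync(delay_s: float, timeout_s: float) -> str:
    """Run fetch_data in a worker thread and block for at most timeout_s seconds."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    try:
        future = pool.submit(fetch_data, delay_s)
        # result() returns as soon as the work is done; timeout_s is only a ceiling.
        # It raises concurrent.futures.TimeoutError if the ceiling is reached.
        return future.result(timeout=timeout_s)
    finally:
        # Do not wait for a straggler here; the worker thread finishes on its own.
        pool.shutdown(wait=False)
```

A call such as `call_sync(0.01, 5.0)` returns after roughly 10 ms rather than 5 seconds, while `call_sync(0.3, 0.05)` raises a time-out error.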

As always, the right decision depends on the real-life data.
The timeout should be proportional to the time it usually takes to complete an operation successfully.
Sending a UDP message, for example, could take between 1 and 50 milliseconds, so a timeout of 100 milliseconds is more than reasonable. Copying a file over the wire, however, could take minutes or more, so a 100 millisecond timeout is laughable.
There are pros and cons to both short and long timeouts so it's a tradeoff. Longer timeouts use more resources (tasks, threads, memory, etc.) for the same amount of work while short timeouts, as you mentioned, may result in loss of data.
In conclusion, set a configurable timeout that sounds reasonable, then observe in production whether you time out too many operations or wait too long, and calibrate accordingly.
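One possible way to do that calibration, sketched under assumptions: collect the durations of operations that completed successfully and derive the timeout from a high percentile of them, multiplied by a safety factor. The function name, the nearest-rank percentile, and the default values are illustrative choices, not a recommendation from any particular library.

```python
import math


def suggest_timeout(durations_ms, percentile=99.0, safety_factor=2.0):
    """Return a timeout: a percentile of observed successful durations times a safety factor."""
    if not durations_ms:
        raise ValueError("need at least one observed duration")
    ordered = sorted(durations_ms)
    # Nearest-rank percentile: the smallest sample that has at least
    # `percentile` percent of all samples at or below it.
    rank = max(1, math.ceil(percentile * len(ordered) / 100.0))
    return ordered[rank - 1] * safety_factor
```

With samples spread over 1 to 50 ms, `suggest_timeout(samples, percentile=100, safety_factor=2)` gives 100 ms, which matches the UDP figure used above.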

Related

Am I misunderstanding how often IORING POLL is used?

I am watching a video that says that after a million operations it's faster to use io_uring_poll. I haven't used io_uring much, so I may be remembering things wrong, but does io_uring poll mean using io_uring_enter with the IORING_SETUP_SQPOLL flag? I understand that traffic sometimes drops, but wouldn't the code mostly be using atomics to check whether there are more items in the completion queue? Wouldn't enter/poll only be needed in the rare second when traffic drops? Or is it called constantly? I have never had to do that many operations, so I don't know.

Optimization of parallel programming

I want to use MPI to make my program parallel, and I want to send something to other computers. I want to know which is better: sending one huge buffer once, or sending smaller messages 3 times at different times during the execution instead of all at once?
It's almost always going to be faster to send the one big message than several smaller ones. Each time you do a Send/Receive pair, the two processes have to go through the entire process of sending a message to each other. With three separate sends, that adds up to at least 6 roundtrip messages. If you are just sending one larger message, there is a minimum of 2 roundtrip messages. Each of those messages can be very expensive (compared to doing things locally like packing all of your data into one buffer).
I'd encourage you to try it out both ways though to be sure that this applies to your application. It could be different if you're doing something unexpected.
Depending on your problem, sending all data may be more efficient, because the nodes have to be synced, every time. That may cause a delay.
I always try to send as much data as I can in a single MPI call. In my experience, sending many small bits of data greatly increases the overhead and network traffic, and I have even run into problems where I overwhelmed the computers' ability to keep up with the number of requests, because I was sending a large member of a complicated class, one integer at a time, to many workers. Therefore, when possible, send the entire data at once, unless you have some reason to believe it is too large.
Further, I strive to use 100% of all the CPUs my program claims. When you are working on shared resources, if you use a CPU, you need to actually use it. Otherwise, someone else who wants to use that core, or node, is blocked out while your program sits and does nothing. For example, on a Cray which I have used, even if you call for only two 'cores', the manager will reserve a full bank of 24 cores, essentially wasting 22. Or, perhaps one worker has nothing to do, while another chugs away -- again, wasting time. Hopefully, there is a way to balance the load, so to speak, to avoid unintentional waste of resources.
Back to the topic at hand. Demonstrate timing and efficiency of vector sending to yourself -- write a program which breaks up the vector into varying sizes of packets, and do the sends/receives. Test it with varying numbers of workers, and on several different configurations of computers, if you can. Before writing production code, do proof of concept, and what optimization you can. Test and time it!
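Before measuring, a back-of-envelope estimate can show why batching tends to win. The sketch below uses the common latency-plus-bandwidth cost model; the function name and the example figures (50 microseconds per message, 1 GB/s) are assumptions for illustration, not measurements of any real MPI installation, and it is no substitute for the timing tests described above.

```python
def transfer_time_s(total_bytes, n_messages, latency_s, bandwidth_bytes_per_s):
    """Estimate the time to move total_bytes split evenly across n_messages.

    Each message pays a fixed latency; the payload pays for bandwidth once overall.
    """
    return n_messages * latency_s + total_bytes / bandwidth_bytes_per_s


# Assumed figures: 1 MB payload, 50 microseconds latency, 1 GB/s bandwidth.
one_big = transfer_time_s(1_000_000, 1, 50e-6, 1e9)
many_small = transfer_time_s(1_000_000, 1000, 50e-6, 1e9)
```

Under these assumed numbers the single message costs about 1 ms, while a thousand small ones cost about 51 ms, almost all of it per-message latency.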

Asp.net guaranteed response time

Does anybody have any hints as to how to approach writing an ASP.net app that needs to have a guaranteed response time?
When under high load that would normally cause us to exceed our desired response time, we want to throw out an appropriate number of requests, so that the rest of the requests can return before the max response time. Throwing out requests based on exceeding a fixed req/s is not viable, as there are other external factors controlling response time that cause the max rps we can safely support to drift and fluctuate fairly drastically over time.
It's OK if a few requests take a little too long, but we'd like the great majority of them to meet the required response time window. We want to "throw out" the minimal or near-minimal number of requests so that we can process the rest of the requests in the allotted response time.
It should account for ASP.Net queuing time, ideally the network request time but that is less important.
We'd also love to do adaptive work, like make a db call if we have plenty of time, but do some computations if we're shorter on time.
Thanks!
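To illustrate the kind of adaptive behaviour being asked about, here is a rough sketch of a deadline-budget decision. It is written in Python rather than ASP.NET, and every name in it (`Request`, `decide`, the cost estimates) is hypothetical. It assumes the arrival time is recorded before any queuing, and that the cost figures come from something like a running average of recent requests.

```python
class Request:
    """Hypothetical request record that remembers when it arrived."""

    def __init__(self, arrived_at: float):
        self.arrived_at = arrived_at


def decide(request, now, max_response_s, db_cost_s, compute_cost_s):
    """Choose how to serve a request based on the response-time budget that is left."""
    # Time spent waiting in the queue counts against the budget.
    remaining = max_response_s - (now - request.arrived_at)
    if remaining >= db_cost_s:
        return "db"        # plenty of time: make the database call
    if remaining >= compute_cost_s:
        return "compute"   # short on time: fall back to the cheaper computation
    return "shed"          # cannot finish in time: reject the request early
```

Because the decision depends on measured waiting time rather than a fixed requests-per-second limit, the number of shed requests follows the actual load.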
SLAs with a guaranteed response time require a bit of work.
First off you need to spend a lot of time profiling your application. You want to understand exactly how it behaves under various load scenarios: light, medium, heavy, crushing. When doing this profiling step it is going to be critical that it's done on the exact same hardware / software configuration that production uses. Results from one set of hardware have no bearing on results from an even slightly different set of hardware. This isn't just about the servers either; I'm talking routers, switches, cable lengths, hard drives (make/model), everything. Even BIOS revisions on the machines, RAID controllers and any other device in the loop.
While profiling make sure the types of work loads represent an actual slice of what you are going to see. Obviously there are certain load mixes which will execute faster than others.
I'm not entirely sure what you mean by "throw out an appropriate number of requests". That sounds like you want to drop those requests... which sounds wrong on a number of levels. Doing this usually kills an SLA as being an "outage".
Next, you are going to have to actively monitor your servers for load. If load levels get within a certain percentage of your max then you need to add more hardware to increase capacity.
Another thing, monitoring result times internally is only part of it. You'll need to monitor them from various external locations as well depending on where your clients are.
And that's just about your application. There are other forces at work such as your connection to the Internet. You will need multiple providers with active failover in case one goes down... Or, if possible, go with a solid cloud provider.
Yes, in the last mvcConf one of the speakers compares the performance of various view engines for ASP.NET MVC. I think it was Steven Smith's presentation that did the comparison, but I'm not 100% sure.
You have to keep in mind, however, that ASP.NET will really only play a very minor role in the performance of your app; the DB is likely to be your biggest bottleneck.
Hope the video helps.

Practical value for concurrent-request-timeout parameter or options for avoiding concurrent access to conversation exception

In the Seam Reference Guide, one can find this paragraph:
We can set a sensible default for the concurrent request timeout (in ms) in components.xml:
<core:manager concurrent-request-timeout="500" />
However, we found that 500 ms is not nearly enough time for most of the cases we had to deal with, especially with the severe restriction seam places on conversation access.
In our application we have a combination of page scoped ajax requests (triggered by various user actions), some global scoped polling notification logic (part of the header, so included in every page) and regular links that invoke actions and/or navigate to other pages.
Therefore, we get the dreaded concurrent access to conversation exception way too often, even without any significant load on the site.
After researching the options for quite a bit, we ended up bumping this value to several seconds (we're debating whether to bump it up to 10s), as none of the recommended solutions seemed able to solve our issue completely (even forcing a global queue for all the ajax requests would still leave us exposed to a user deciding to click a link right when one of our polling calls was in progress). And we'd much rather have the users wait for a second or two instead of getting an error page just because they clicked a link at the wrong moment.
And now to the question: is there something obvious we're missing (like a way to allow concurrent access to conversations and taking care of the needed locking ourselves, for instance :)? How do people solve this problem (ajax requests mixed with user driven interaction) in seam? Disabling all the links on the page while ajax requests are in progress (as suggested by one blog page) is really not a viable option.
Any other suggestions?
TIA,
Andrei
We use 60000 or 120000 (1-2 minutes). Concurrent-request-timeout is designed to avoid deadlocks. Historically we have far more problems with timeouts than deadlocks. A better approach is to use a client-side queue (<a4j:ajaxQueue> if using RichFaces) to serialize and remove duplicate requests as much as possible, then set the timeout high enough to avoid any remaining problems.
There are many serious issues resulting from Seam's concurrent request timeouts:
The issue is the last request gets the ConcurrentRequestTimeoutException. If the user double-clicks or reloads the page, only the last request matters -- why should he get an error?
Usually the ConcurrentRequestTimeoutException is suppressed, and only secondary NullPointerExceptions and @In injection failures are shown, making debugging difficult.
Seam 2.2.1 has a severe problem where transactions, ThreadLocals, and locks may leak after a timeout occurs, especially when used with <spring:spring-transaction/>. Look at SeamPhaseListener.afterRestoreView: there's no finally block to clean up after restoreConversation fails!
In my opinion there are many poor aspects to this design, so it's best to use a much higher timeout and try to avoid the issues.
This is what we have and it works fine for us:
<core:manager concurrent-request-timeout="5000"
conversation-timeout="120000" conversation-id-parameter="cid"
parent-conversation-id-parameter="pid" />
We also use a much higher value for the concurrent-request-timeout.
At least for duplicate events you can use settings in the a4j components to filter and delay them with eventsQueue, requestDelay and ignoreDupResponses="true".
(Last point http://docs.jboss.org/seam/2.0.1.GA/reference/en/html/conversations.html )
Can you analyse which types of request are taking a long time? Is there a particular type which you could reduce the request time by doing the "work" asynchronously and getting the update back in your poll?
In my opinion, ajax requests should always complete fairly quickly, then you can calculate a max concurrent request time by (request time * max number of requests likely to be initiated)
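As a worked example of that estimate, with made-up figures: if a typical ajax request takes about 400 ms and up to 5 requests might be initiated against the same conversation, the last one could wait behind the others for roughly 2000 ms. The function name below is illustrative only.

```python
def concurrent_request_timeout_ms(request_time_ms, max_pending_requests):
    """Upper estimate of the wait: request time multiplied by the number of requests."""
    return request_time_ms * max_pending_requests


# Assumed figures: 400 ms per request, at most 5 requests initiated together.
estimate_ms = concurrent_request_timeout_ms(400, 5)
```

The result is the value one would then compare against the configured concurrent-request-timeout.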

Secure Online Highscore Lists for Non-Web Games

I'm playing around with a native (non-web) single-player game I'm writing, and it occurred to me that having a daily/weekly/all-time online highscore list (think Xbox Live Leaderboard) would make the game much more interesting, adding some (small) amount of community and competition. However, I'm afraid people would see such a feature as an invitation to hacking, which would discourage regular players due to impossibly high scores.
I thought about the obvious ways of preventing such attempts (public/private key encryption, for example), but I've figured out reasonably simple ways hackers could circumvent all of my ideas (extracting the public key from the binary and thus sending fake encrypted scores, for example).
Have you ever implemented an online highscore list or leaderboard? Did you find a reasonably hacker-proof way of implementing this? If so, how did you do it? What are your experiences with hacking attempts?
At the end of the day, you are relying on trusting the client. If the client sends replays to the server, it is easy enough to replicate or modify a successful playthrough and send that to the server.
Your best bet is to raise the bar for cheating above what a player would deem worth surmounting. To do this, there are a number of proven (but oft-unmentioned) techniques you can use:
Leave blacklisted cheaters in a honeypot. They can see their own scores, but no one else can. Unless they verify by logging in with a different account, they think they have successfully hacked your game.
When someone is flagged as a cheater, defer any account repercussions from transpiring until a given point in the future. Make this point random, within one to three days. Typically, a cheater will try multiple methods and will eventually succeed. By deferring account status feedback until a later date, they fail to understand what got them caught.
Capture all game user commands and send them to the server. Verify them against other scores within a given delta. For instance, if the player used the shoot action 200 times, but obtained a score of 200,000, but the neighboring players in the game shot 5,000 times to obtain a score of 210,000, it may trigger a threshold that flags the person for further or human investigation.
Add value and persistence to your user accounts. If your user accounts have unlockables for your game, or if your game requires purchase, the weight of a ban is greater as the user cannot regain his previous account status by simply creating a new account through a web-based proxy.
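The command-comparison technique from the list above could be sketched roughly as follows. The metric (points per action), the use of a median over neighboring scores, and the `tolerance` threshold are all assumptions made for illustration; a real game would need rules tuned to its own scoring.

```python
import statistics


def points_per_action(score, actions):
    """Score earned per recorded action; guards against division by zero."""
    return score / max(actions, 1)


def is_suspicious(score, actions, neighbors, tolerance=5.0):
    """Flag a submission whose points per action far exceed its neighbors'.

    neighbors is a list of (score, actions) pairs for players with similar scores.
    """
    baseline = statistics.median(points_per_action(s, a) for s, a in neighbors)
    return points_per_action(score, actions) > tolerance * baseline
```

With the figures from the example above, a player scoring 200,000 with 200 shots earns 1,000 points per shot against a neighbor baseline of about 42, so the submission is flagged for further or human investigation rather than rejected outright.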
No solution is ever going to be perfect while the game is running on a system under the user's control, but there are a few steps you could take to make hacking the system more trouble. In the end, the goal can only be to make hacking the system more trouble than it's worth.
Send some additional information with the high score requests to validate on the server side. If you get 5 points for every X, and the game only contains 10 Xs, then you've got some extra hoops to make the hacker jump through to get their score accepted as valid.
Have the server send a random challenge which must be met with a few bytes of the game's binary from that offset. That means the hacker must keep a pristine copy of the binary around (just a bit more trouble).
If you have license keys, require high scores to include them, so you can ban people caught hacking the system. This also lets you track invalid attempts as defined above, to ban people testing out the protocol before they ever even submit a valid score.
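The random-challenge idea from the steps above might look something like this. It is only a sketch: the function names and the 16-byte window are invented for illustration, and it assumes the server keeps its own pristine copy of the released binary to compare against.

```python
import hmac
import secrets


def make_challenge(binary_len, n_bytes=16):
    """Server side: pick a random offset that leaves room for n_bytes."""
    return secrets.randbelow(binary_len - n_bytes + 1)


def answer_challenge(binary, offset, n_bytes=16):
    """Client side: return the requested bytes of its own binary."""
    return binary[offset:offset + n_bytes]


def verify_answer(reference_binary, offset, response, n_bytes=16):
    """Server side: compare the response with the pristine copy in constant time."""
    expected = reference_binary[offset:offset + n_bytes]
    return hmac.compare_digest(expected, response)
```

A patched client can still answer correctly by keeping an unmodified copy of the binary around, so this only adds inconvenience, as the answer itself points out.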
All in all though, getting the game popular enough for people to care to hack it is probably a far bigger challenge.
I honestly don't think it's possible.
I've done it before using pretty simple key encryption with a compressed binary, which worked well enough for the security I required, but I honestly think that if somebody considers cracking your online high score table a hack, it will be done.
There are some pretty sad people out there who also happen to be pretty bright; unless you can get them all laid, it's a lost cause.
If your game has a replay system built in, you can submit replays to the server and have the server calculate the score from the replay.
This method isn't perfect, you can still cheat by slowing down the game (if it is action-based), or by writing a bot.
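A minimal sketch of server-side rescoring, assuming a replay is just a list of event names and using made-up scoring rules (`POINTS`). A real game would replay inputs through its actual simulation instead.

```python
# Hypothetical scoring rules; the real ones would live in the game logic.
POINTS = {"coin": 5, "enemy": 20}


def score_from_replay(events):
    """Recompute the score on the server from the submitted replay events."""
    total = 0
    for event in events:
        if event not in POINTS:
            raise ValueError(f"unknown event in replay: {event!r}")
        total += POINTS[event]
    return total


def accept_submission(claimed_score, events):
    """Accept only if the claimed score matches what the replay produces."""
    try:
        return score_from_replay(events) == claimed_score
    except ValueError:
        return False
```

This rejects a score that the replay does not support, but as noted above it cannot tell a slowed-down game or a bot from an honest run.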
I've been doing some of this with my Flash games, and it's a losing battle really. Especially for ActionScript that can be decompiled into somewhat readable code without too much effort.
The way I've been doing it is a rather conventional approach of sending the score and player name in plain text and then a hash of the two (properly salted). Very few people are determined enough to take the effort to figure that out, and the few who are would do it anyway, negating all the time you put into it.
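Here is roughly what such a scheme can look like. This sketch swaps in HMAC-SHA256 with an embedded secret in place of a hand-rolled salted hash; the secret value and the function names are placeholders, and, as with any client-side secret, it can be extracted from the game by someone determined enough.

```python
import hashlib
import hmac

# Placeholder secret shared by client and server; anyone who decompiles the client can read it.
SECRET = b"replace-with-your-own-secret"


def sign_score(name, score, secret=SECRET):
    """Client side: hash the player name and score together with the secret."""
    message = f"{name}:{score}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_score(name, score, digest, secret=SECRET):
    """Server side: recompute the hash and compare it in constant time."""
    return hmac.compare_digest(sign_score(name, score, secret), digest)
```

The client sends the name, the score, and the digest; the server rejects any submission whose digest does not match the plain-text values.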
To summarize, my philosophy is to spend the time on making the game better and just make it hard enough to cheat.
One thing that might be pretty effective is to have the game submit the score to the server several times as you are playing, sending a bit of gameplay information each time, allowing you to validate if the score is "realistic". But that might be a bit over-the-top really.
That's a really hard question.
I've never implemented such a thing, but here's a simple approximation.
Your main concern is hackers working out what your application is doing and then sending their own results.
Well, first of all, unless your application is a great success I wouldn't be worried. Doing such a thing is extremely difficult.
Encryption won't help with the problem. You see, encryption helps to protect the data on its way, but it doesn't protect either side of the transaction before the data is encrypted (which is where the main vulnerability may be). So if you encrypt the score, sure, the data will remain private, but it won't be safe.
If you are really worried about it, I would suggest obfuscating the code and designing the score system in a way that makes it not completely obvious what it is doing. Here we can borrow some things from an encryption protocol. Here is an example:
Let's say the score is some number m
Compute some kind of check over the score (for example the CRC, or any other scheme you see fit. In fact, if you just invent one, no matter how lame it is, it will work better).
Obtain the private key of the user (D) from your remote server (over a secure connection, obviously). You're the only one who knows this key.
Compute X=m^D mod n (n being the public modulus of your public/private key algorithm) (that is, encrypt it :P)
As you see that's just obfuscation of another kind. You can go down that way as long as you want. For example you can lookup the nearest two prime numbers to X and use them to encrypt the CRC and send it also to the server so you'll have the CRC and the score separately and with different encryption schemes.
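The steps above can be sketched with textbook RSA on deliberately tiny numbers. Everything here is a toy under stated assumptions: the primes 61 and 53, the exponents, and the function names are chosen only so the arithmetic is easy to follow, there is no padding, and the CRC is simply sent alongside rather than encrypted separately. It illustrates the obfuscation idea and offers no real security.

```python
import zlib

# Toy textbook-RSA parameters (far too small for real use).
P, Q = 61, 53
N = P * Q    # public modulus n = 3233
E = 17       # public exponent, used by the server to recover m
D = 2753     # private exponent; E * D = 1 (mod 3120)


def encode_score(m):
    """Client side: compute a CRC over the score and X = m^D mod n."""
    if not 0 <= m < N:
        raise ValueError("score must be smaller than the modulus in this toy example")
    check = zlib.crc32(str(m).encode("ascii"))
    x = pow(m, D, N)
    return x, check


def decode_score(x, check):
    """Server side: recover m = X^E mod n and confirm that the CRC matches."""
    m = pow(x, E, N)
    if zlib.crc32(str(m).encode("ascii")) != check:
        raise ValueError("checksum mismatch")
    return m
```

Since the client has to hold D in order to compute X, someone who reverse engineers the client can do the same, which is why this remains obfuscation rather than protection.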
If you use that in conjunction with obfuscation, I'd say it would be difficult to hack. Nonetheless, even that could be reverse engineered; it all depends on the interest and ability of the hacker, but ... seriously, what kind of freak puts in so much effort to change his results in a game? (Unless it's WoW or something)
One last note
Obfuscator for .NET
Obfuscator for Delphi/C++
Obfuscator for assembler (x86)
As the other answer says, you are forced to trust a potentially malicious client, and a simple deterrent plus a little human monitoring is going to be enough for a small game.
If you want to get fancy, you then have to look for fraud patterns in the score data, similar to a credit card company looking at charge data. The more state the client communicates to your server, the easier it potentially is to find a pattern of correct or incorrect behavior via code. For example, say that the client had to upload a time-based audit log of the score (which you could perhaps also use to let other clients watch the top games); the server can then validate whether the score log breaks any of the game rules.
In the end, this is still about making it expensive enough to discourage cheating the scoreboard. You would want a system where you can always improve the (easier to update) server code to deal with any new attacks on your validation system.
@Martin.
This is how I believe Mario Kart Wii works. The added bonus is that you can let all the other players watch how the high score holder got the high score. The funny thing about this is that if you check out the fastest "Grumble Volcano" time trial, you'll see that somebody found a shortcut that lets you skip 95% of the track. I'm not sure if they still have that up as the fastest time.
You can't do it on a nontrusted client platform. In practice it is possible to defeat even some "trusted" platforms.
There are various attacks which are impossible to detect in the general case - mainly modifying variables in memory. If you can't trust your own program's variables, you can't really achieve very much.
The other techniques outlined above may help, but don't solve the basic problem of running on a nontrusted platform.
Out of interest, are you sure that people will try to hack a high score table? I have had a game online for over two years now with a trivially crackable high score table. Many people have played it, but I have no evidence that anyone's tried to crack the high scores.
Usually, the biggest defender against cheating and hacking is a community watch. If a score seems rather suspicious, a user can report the score for cheating. And if enough people report that score, the replay can be checked by the admins for validity. It is fairly easy to see the difference between a bot and an actual player, if there's already a bunch of players playing the game in full legitimacy.
The admins must oversee only those scores that get questioned, because there is a small chance that a bunch of users might bandwagon to remove a perfectly hard-earned score. And the admins only have to view the few scores that do get reported, so it's not too much of their time, even less for a small game.
Even just knowing that if you work hard to make a bot, just to be shot down again by the report system, is a deterrent in itself.
Perhaps even encrypting the replay data wouldn't hurt, either. Replay data is often small, and encrypting it wouldn't take too much more space. And to help improve that, the server itself would try out the replay by the control log, and make sure it matches up with the score achieved.
If there's something the anti-cheat system can't find, users will find it.
