We are thinking of making some architectural changes in our application, which might affect the technologies we'll be using as a result.
The change I'm referring to in this post is this:
We've found out that some parts of our application have common data and common services, so we extracted those into a GlobalServices service, with its own master data db.
Now, this service will probably have its own cache, so that it won't have to retrieve data from the db on each call.
So when one client makes a call to that service that updates data, other clients might or might not be interested in that change; that depends on whether we decide to keep a cache on the clients too.
Meaning that if the clients have their own local cache, they will have to be notified somehow (and first register for notifications). If not, they will always get the data from the GlobalServices service.
I need your educated advice here guys:
1) Is it a good idea to keep a local cache on the clients to begin with?
2) If we do decide to keep a local cache on the clients, would you use SqlCacheDependency to notify the clients, or would you use WCF for notifications? (Each has its own pros and cons.)
Thanks a lot folks,
Avi
I like the sound of your SqlCacheDependency, but I will answer this from a different perspective, as I have worked with a team on a similar scenario. We created a master database and used triggers to create XML representations of data that was being changed in the master, and stored it in a TransactionQueue table, with a bit of metadata about what changed, when, and who changed it. The client databases would periodically check the queue for items they were interested in, process the XML, and update their own tables as necessary.
We also did the same in reverse for the client to update the master. We set up triggers and a TransactionQueue table on the client databases to send data back to the master. This in turn would update all of the other client databases when they next poll.
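A simplified sketch of the kind of client-side poller this involves (assuming SQL Server and System.Data.SqlClient; the TransactionQueue columns and the ApplyChangeLocally helper are illustrative, not our actual code):

using System.Data.SqlClient;
using System.Xml.Linq;

// Illustrative only: poll the master's TransactionQueue and apply the XML change
// records this client is interested in to its own tables.
public static class TransactionQueuePoller
{
    public static void ProcessPending(string masterConnectionString)
    {
        using (var connection = new SqlConnection(masterConnectionString))
        {
            connection.Open();

            var select = new SqlCommand(
                "SELECT Id, TableName, ChangeXml FROM TransactionQueue " +
                "WHERE Processed = 0 ORDER BY Id", connection);

            using (var reader = select.ExecuteReader())
            {
                while (reader.Read())
                {
                    string tableName = reader.GetString(1);
                    XElement change = XElement.Parse(reader.GetString(2));

                    // Placeholder: map the XML onto the client's own schema.
                    ApplyChangeLocally(tableName, change);
                }
            }

            // Simplified: in practice mark rows processed one by one, inside a
            // transaction, with proper de-duping and error handling.
            new SqlCommand(
                "UPDATE TransactionQueue SET Processed = 1 WHERE Processed = 0",
                connection).ExecuteNonQuery();
        }
    }

    private static void ApplyChangeLocally(string tableName, XElement change)
    {
        // Placeholder for the client-specific merge logic.
    }
}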
The nice thing about this is that it is fairly agnostic on client platform, and client data structure, so we were able to use the method on a range of legacy and third party systems. The other great point here is that you can take any of the databases out of the loop (including the master - e.g. connection failure) and the others will still work fine. This worked well for us as our master database was behind our corporate firewall, and the simpler web databases were sitting with our ISP.
There are obviously cons to this approach, like race hazards, so we were careful with the order of transaction processing, error handling, de-duping, etc. We also built a management GUI to provide a human interaction layer before important data was changed in the master.
Good luck! Tim
I am using SignalR with my ASP.NET application. What my application needs is to persist the group data that is updated from various servers. According to the SignalR documentation, it's my responsibility to do this. It means that I need to use an external server/service that will collect the data from one or more servers, so that I can query that data from a single place.
I first thought that Memcached was the best candidate, because it's fast and the data I need to put there is volatile. The problem is that I need to store collections, for example collection A with user IDs: collection A might hold 2,000 user IDs and collection B 40,000. I also need to update these collections, removing and inserting IDs very quickly. I'm afraid that because the commands will be initiated from several servers, and because I might need to read an entire collection and update it on any of the web servers, the data won't stay consistent: Web Server A might update the data, but Server B could read it before Server A has finished updating it. That's a concurrency conflict.
I'm searching for the best way to implement this kind of strategy in my ASP.NET 4.5 application. I think an in-memory database might be one option, or something else that can ensure data integrity.
I want to ask you what is the best solution for my problem.
Here's an example of my problem:
Memcached server - stores the collections (e.g. Collection A, B, C, D); each collection stores user IDs, which can number in the thousands or far more.
Web servers - my Amazon EC2 web servers with SignalR installed; they can be behind a load balancer. Those servers need to access the Memcached server and get a collection's complete list of items by collection name (e.g. "Collection_23"). They need to be able to remove items (user IDs) and add items. All of this should be as fast as possible.
I hope that I explained myself right. Thanks.
Alternatively, you can use Redis; like Memcached, everything is served from memory. Redis has many other capabilities beyond a simple key-value datastore, and for your specific case you might use Redis transactions, which ensure data consistency.
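As a rough illustration (not a drop-in solution): with the StackExchange.Redis client, each collection could be stored as a Redis set, where individual adds and removes are atomic on the server, and a MULTI/EXEC transaction can group several operations. The host and key names below are placeholders.

using StackExchange.Redis;

// Sketch only: host and key names are placeholders.
public static class GroupStore
{
    private static readonly ConnectionMultiplexer Redis =
        ConnectionMultiplexer.Connect("my-redis-host:6379");

    public static void Example()
    {
        IDatabase db = Redis.GetDatabase();

        // Each collection is a Redis set; SADD/SREM run atomically on the server,
        // so concurrent web servers cannot leave a collection half-updated.
        db.SetAdd("Collection_23", "user:1001");
        db.SetRemove("Collection_23", "user:42");

        // Read the whole collection.
        RedisValue[] members = db.SetMembers("Collection_23");

        // Group several changes so they apply as one atomic unit (MULTI/EXEC).
        ITransaction tran = db.CreateTransaction();
        var add = tran.SetAddAsync("Collection_23", "user:2002");
        var remove = tran.SetRemoveAsync("Collection_24", "user:2002");
        bool committed = tran.Execute();   // both commands apply together, or not at all
    }
}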
A comment on another post linked to a Redis provider. That link is broken; it seems it is now integrated into the main SignalR project: https://github.com/SignalR/SignalR/tree/master/src/Microsoft.AspNet.SignalR.Redis
The Redis NuGet package is here:
http://www.nuget.org/packages/Microsoft.AspNet.SignalR.Redis
and documentation here:
http://www.asp.net/signalr/overview/signalr-20/performance-and-scaling/scaleout-with-redis
I am facing a situation where I have to handle a very heavy traffic load while keeping performance high at the same time. Here is my scenario; please read it and advise me with your valuable opinion.
I am going to have three-way communication between my server, my client, and the visitor. When a visitor visits my client's website, he will be detected and sent to an intermediate Rule Engine, which performs some tasks and outputs a filtered list of different visitors on my server. On the other side, I have a client who will access those lists. My initial idea was to have a Web Service on my server that acts as the Rule Engine and outputs the resulting lists on an ASPX page. But this seems inefficient, because there will be huge traffic coming in and the clients will continuously be requesting data from those lists, so it will be a performance overhead. Kindly suggest what approach I should take to achieve this scenario so that no deadlock happens and things work smoothly. I also considered writing to and fetching from an XML file, but that's also not a very good approach in my case.
NOTE: Please remember that no DB will be involved initially; all work will remain outside the DB.
Wow, storing data efficiently without a database will be tricky. What you can possibly consider is the following:
Store the visitor data in an object list of some sort and keep it in the application cache on the server.
Periodically flush this list (say after 100 items in the list) to a file on the server - possibly storing it in XML for ease of access (you can associate a schema with it as well to make sure you always get the same structure you need). You can perform this file-writing asynchronously so as to avoid keeping the thread locked while writing the file.
The Web Service sounds like a good idea - make it feed off the XML file. Possibly consider breaking the XML file up into several files as well. You can even cache the contents of this file separately so the service feeds off the cached data for added performance benefits...
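A minimal sketch of the cache-and-flush part (the Visitor type, the 100-item threshold, and the output folder are assumptions for illustration):

using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using System.Web;
using System.Xml.Serialization;

// Sketch only: the Visitor type, flush threshold, and output folder are illustrative.
public class Visitor
{
    public string Id { get; set; }
    public DateTime SeenAt { get; set; }
}

public static class VisitorBuffer
{
    private const string CacheKey = "PendingVisitors";
    private static readonly object SyncRoot = new object();

    public static void Add(Visitor visitor)
    {
        List<Visitor> batch = null;
        lock (SyncRoot)
        {
            var pending = HttpRuntime.Cache[CacheKey] as List<Visitor>;
            if (pending == null)
            {
                pending = new List<Visitor>();
                HttpRuntime.Cache[CacheKey] = pending;
            }

            pending.Add(visitor);
            if (pending.Count >= 100)
            {
                batch = pending;                                    // take the full batch
                HttpRuntime.Cache[CacheKey] = new List<Visitor>();  // start a fresh one
            }
        }

        if (batch == null)
            return;

        // Write the batch to XML off the request thread so the response isn't blocked.
        Task.Run(() =>
        {
            var path = Path.Combine(@"C:\AppData",
                "visitors_" + Guid.NewGuid().ToString("N") + ".xml");
            var serializer = new XmlSerializer(typeof(List<Visitor>));
            using (var stream = File.Create(path))
                serializer.Serialize(stream, batch);
        });
    }
}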
Are there well-known best practices for synchronizing tasks across a server farm? For example if I have a forum based website running on a server farm, and there are two moderators trying to do some action which requires writing to multiple tables in the database, and the requests of those moderators are being handled by different servers in the server farm, how can one implement some locking functionality to ensure that they can't take that action on the same item at the same time?
So far, I'm thinking about using a table in the database to sync on: e.g. check for the ID of the item in the table; if it doesn't exist, insert it and proceed, otherwise return (roughly as in the sketch below). A shared cache could probably also be used for this, but I'm not using one at the moment.
Any other way?
By the way, I'm using MySQL as my database back-end.
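For reference, the lock-table idea I mentioned would look roughly like this (just a sketch; the MySql.Data connector is assumed and the table and column names are illustrative):

using MySql.Data.MySqlClient;

// Sketch only: item_locks has a unique key on item_id, so only one server can
// insert the lock row; everyone else gets 0 affected rows and backs off.
public static class ItemLocks
{
    public static bool TryAcquire(string connectionString, long itemId)
    {
        using (var connection = new MySqlConnection(connectionString))
        {
            connection.Open();
            var command = new MySqlCommand(
                "INSERT IGNORE INTO item_locks (item_id, locked_at) VALUES (@id, NOW())",
                connection);
            command.Parameters.AddWithValue("@id", itemId);

            return command.ExecuteNonQuery() == 1;
        }
    }
}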
Your question implies data level concurrency control -- in that case, use the RDBMS's concurrency control mechanisms.
That will not help you if later you wish to control application level actions which do not necessarily map one to one to a data entity (e.g. table record access). The general solution there is a reverse-proxy server that understands application level semantics and serializes accordingly if necessary. (That will negatively impact availability.)
It probably wouldn't hurt to read up on CAP theorem, as well!
You may want to investigate a distributed locking service such as Zookeeper. It's a reimplementation of a Google service that provides very high speed distributed resource locking coordination for applications. I don't know how easy it would be to incorporate into a web app, though.
If all the state is in the (central) database then the database transactions should take care of that for you.
See http://en.wikipedia.org/wiki/Transaction_(database)
It may be irrelevant for you because the question is old, but it may still be useful for others, so I'll post it anyway.
You can use a "SELECT FOR UPDATE" db query on a locking object, so you actually use the db for achieving the lock mechanism.
If you use an ORM, you can also do that. For example, in NHibernate you can do:
session.Lock(Member, LockMode.Upgrade);
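The raw-SQL equivalent looks something like this (just a sketch using the MySql.Data connector; the lock_objects table and the lock name are illustrative):

using MySql.Data.MySqlClient;

// Sketch only: SELECT ... FOR UPDATE holds a row lock on the locking row until the
// transaction ends, so other servers block (or time out) on the same row.
public static class ModeratorActionRunner
{
    public static void Run(string connectionString)
    {
        using (var connection = new MySqlConnection(connectionString))
        {
            connection.Open();
            using (var transaction = connection.BeginTransaction())
            {
                var lockCommand = new MySqlCommand(
                    "SELECT id FROM lock_objects WHERE name = @name FOR UPDATE",
                    connection, transaction);
                lockCommand.Parameters.AddWithValue("@name", "close-thread-42");
                lockCommand.ExecuteScalar();

                // ... perform the multi-table writes here, on the same transaction ...

                transaction.Commit();   // committing (or rolling back) releases the lock
            }
        }
    }
}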
Having a table of locks is an OK way to do it; it is simple and it works.
You could also have the code as a service on a single server, more of an SOA approach.
You could also use a timestamp field with transactions: if the timestamp has changed since you last got the data, you can revert the transaction. So if someone gets in first, they have priority.
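A rough sketch of that timestamp check (table and column names are illustrative, not a definitive implementation):

using System;
using MySql.Data.MySqlClient;

// Sketch only: the update succeeds only if nobody has modified the row since we read
// it; zero affected rows means someone else got in first, so we give up or retry.
public static class OptimisticUpdater
{
    public static bool TryUpdateTitle(string connectionString, long itemId,
        string newTitle, DateTime lastSeenTimestamp)
    {
        using (var connection = new MySqlConnection(connectionString))
        {
            connection.Open();
            var command = new MySqlCommand(
                "UPDATE items SET title = @title, last_modified = NOW() " +
                "WHERE id = @id AND last_modified = @seen", connection);
            command.Parameters.AddWithValue("@title", newTitle);
            command.Parameters.AddWithValue("@id", itemId);
            command.Parameters.AddWithValue("@seen", lastSeenTimestamp);

            return command.ExecuteNonQuery() == 1;
        }
    }
}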
I am working on a web application (ASP.NET) game that would consist of a single page, and on that page, there would be a game board akin to Monopoly. I am trying to determine what the best architectural approach would be. The main requirements I have identified thus far are:
Up to six users share a single game state object.
The users need to keep (relatively) up to date on the current state of the game, i.e. whose turn it is, what did the active user just roll, how much money does each other user have, etc.
I have thought about keeping the game state in a database, but it seems like overkill to keep updating the database when a game state object (say, in a cache) could be kept up to date. For example, the flow might go like this:
Receive request for data from a user.
Look up data in database. Create object from that data.
Verify user has permissions to perform request based on the game's state (i.e. make sure it's really their turn or have enough money to buy that property).
Update the game object.
Write the game object back to the database.
Repeat for every single request.
Consider that a single server would be serving several concurrent games.
I have thought about using AJAX to make requests to an ASP.NET page.
I have thought about using AJAX requests to a web service using Silverlight.
I have thought about using WCF duplex channels in Silverlight.
I can't figure out what the best approach is. All seem to have their drawbacks. Does anyone out there have experience with this sort of thing and care to share those experiences? Feel free to ask your own questions if I am being too ambiguous! Thanks.
Update: Does anyone have any suggestions for how to implement this connection to the server based on the three options I mention above?
You could use the ASP.Net Cache or the Application state to store the game object since these are shared between users. The cache would probably be the best place since objects can be removed from it to save memory.
If you store the game object in the cache using a unique key, you can then store the key in each visitor's Session and use this to retrieve the shared game object. If the cache has been cleared, you will recreate the object from the database.
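A rough sketch of that (the GameState type and the LoadGameFromDatabase helper stand in for your own types; the six-hour expiration is just an example):

using System;
using System.Web;
using System.Web.Caching;

// Sketch only.
public class GameState { /* board, players, whose turn it is, etc. */ }

public static class GameCacheHelper
{
    public static GameState GetGame(HttpContext context, int gameId)
    {
        string cacheKey = "game:" + gameId;
        context.Session["GameCacheKey"] = cacheKey;   // remember which game this visitor is in

        var game = context.Cache[cacheKey] as GameState;
        if (game == null)
        {
            // The cache was cleared (or never populated): rebuild from the database.
            game = LoadGameFromDatabase(gameId);
            context.Cache.Insert(cacheKey, game, null,
                DateTime.UtcNow.AddHours(6), Cache.NoSlidingExpiration);
        }
        return game;
    }

    private static GameState LoadGameFromDatabase(int gameId)
    {
        // Placeholder for your persistence code.
        return new GameState();
    }
}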
While updating a database seems like overkill, it has advantages when it comes time to scale up, as you can have multiple webheads talking to one backend.
A larger concern is how you communicate the game state to the clients. While a full update of the game state from time to time ensures that any changes are caught and all clients remain in synchronization even if they miss a message, gamestate is often quite large.
Consider as well that usually you want gamestate messages to trigger animations or other display updates to portray the action (for example, if a piece moves, it shouldn't just appear at the destination in most cases... it should move across the board).
Because of that, one solution that combines the best of both worlds is to keep a database that collects all of the actions performed in a table, with sequential IDs. When a client requests an update, it can be given all the actions after the last one it knew about, and the client can "act out" the moves. This means that even if a request fails, it can simply retry the request and none of the actions will be lost.
The server can then maintain an internal view of the gamestate as well, from the same data. It can also reject illegal actions and prevent them from entering the game action table (and thus prevent other clients from being incorrectly updated).
Finally, because the server does have the "one true" gamestate, the clients can periodically check against that (which will allow you to find errors in your client or server code). Because the server database should be considered the primary, you can retransmit the entire gamestate to any client that gets incorrect state, so minor client errors won't (potentially) ruin the experience (except perhaps a pause while the state is downloaded).
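A rough sketch of the action-log side of this (SQL Server client assumed; the table, column, and type names are illustrative):

using System.Collections.Generic;
using System.Data.SqlClient;

// Sketch only: GameActions has an identity (sequential) Id column, so clients can
// ask for everything after the last action they have already acted out.
public class GameAction
{
    public long Id { get; set; }
    public int GameId { get; set; }
    public string Payload { get; set; }   // e.g. JSON or XML describing the move
}

public static class GameActionLog
{
    public static IList<GameAction> GetActionsSince(
        string connectionString, int gameId, long lastKnownActionId)
    {
        var actions = new List<GameAction>();
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            var command = new SqlCommand(
                "SELECT Id, GameId, Payload FROM GameActions " +
                "WHERE GameId = @gameId AND Id > @lastId ORDER BY Id", connection);
            command.Parameters.AddWithValue("@gameId", gameId);
            command.Parameters.AddWithValue("@lastId", lastKnownActionId);

            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    actions.Add(new GameAction
                    {
                        Id = reader.GetInt64(0),
                        GameId = reader.GetInt32(1),
                        Payload = reader.GetString(2)
                    });
                }
            }
        }
        return actions;
    }
}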
Why don't you just create an application-level object to store your details? See Application State and Global Variables in ASP.NET for details. You can use the session ID as a key for each player's data.
You could also use the Cache to do the same thing with a long timeout. This does have the advantage that older data could be flushed from the Cache after a period of time, e.g. 6 hours or whatever.
In an Adobe Flex application using BlazeDS AMF remoting, what is the best strategy for keeping the local data fresh and in sync with the backend database?
In a typical web application, web pages refresh the view each time they are loaded, so the data in the view is never too old.
In a Flex application, there is the temptation to load more data up-front to be shared across tabs, panels, etc. This data is typically refreshed from the backend less often, so there is a greater chance of it being stale - leading to problems when saving, etc.
So, what's the best way to overcome this problem?
a. build the Flex application as if it was a web app - reload the backend data on every possible view change
b. ignore the problem and just deal with stale data issues when they occur (at the risk of annoying users who are more likely to be working with stale data)
c. something else
In my case, keeping the data channel open via LiveCycle RTMP is not an option.
a. Consider optimizing back-end changes through a proxy that does its own notification or polling: it knows if any of the data is dirty, and will quick-return (a la a 304) if not.
b. Often, users look more than they touch. Consider one level of refresh for looking and another when they start and continue to edit.
Look at BuzzWord: it locks on edit, but also automatically saves and unlocks frequently.
Cheers
If you can't use the messaging protocol in BlazeDS, then I would have to agree that you should do RTMP polling over HTTP. The data is compressed when using RTMP in AMF, which helps speed things up so the client isn't waiting long between updates. This would also allow you to later scale up to the push methods if the product's customer decides to pay up for the extra hardware and licenses.
You don't need LiveCycle and RTMP in order to have a notification mechanism; you can do it with the channels from BlazeDS and a streaming/long-polling strategy.
In the past I have gone with choice "a". If you were using Remote Objects you could set up some cache-style logic to keep them in sync on the remote end.
Sam
Can't you use RTMP over HTTP (HTTP Polling)?
That way you can still use RTMP, and although it is much slower than real RTMP you can still broadcast updates this way.
We have an app that uses RTMP to signal inserts, updates, and deletes by simply broadcasting RTMP messages containing the Table/PrimaryKey pair, leaving the app to automatically update its data. We do this over HTTP using RTMP.
I found this article about synchronization:
http://www.databasejournal.com/features/sybase/article.php/3769756/The-Missing-Sync.htm
It doesn't go into technical details, but you can guess what kind of coding will implement these strategies.
I also don't have fancy notifications from my server so I need synchronization strategies.
For instance, I have a list of companies in my modelLocator. It doesn't change very often and it's not big enough to consider pagination. I don't want to reload it all (removeAll()) on each user action, but I also don't want my application to crash or to UPDATE corrupt data in case it has been UPDATED or DELETED from another instance of the application.
What I do now is save the SELECT datetime in a SESSION variable. When I come back to refresh the data, I SELECT WHERE last_modified > $SESSION['lastLoad'].
This way I get only rows modified after I loaded the data (most of the time 0 rows).
Obviously you need to UPDATE last_modified on each INSERT and UPDATE.
For DELETE it's more tricky. As the guy points out in his article:
"How can we send up a record that no longer exists"
You need to tell Flex which item it should delete (say by ID), so you cannot really DELETE on DELETE :)
When a user deletes a company, you do an UPDATE instead: deleted=1
Then when refreshing companies, for rows where deleted=1 you just send back the ID to Flex so that it makes sure this company isn't in the model anymore.
Last but not least, you need to write a function that cleans up rows where deleted=1 and last_modified is older than, say, 3 days or whatever suits your needs.
The good thing is that if a user deletes a row by mistake, it's still in the database, and you can rescue it from a real delete within those 3 days.
Rather than caching on the Flex client, why not cache on the server side? Some reasons:
1) When you cache data on the server side, it's centralized and you can make sure all clients have the same state of the data.
2) There are much better caching options available on the server side than in Flex. Also, you can have a cron job which refreshes the data at some frequency, say every 24 hours.
3) As the data is cached on the server and doesn't need to be fetched from the db every time, communication with Flex will be much faster.
Regards,
Tejas