I would like some advice from anyone experienced with implementing something like "pessimistic locking" in an asp.net application. This is the behavior I'm looking for:
User A opens order #313
User B attempts to open order #313 but is told that User A has had the order opened exclusively for X minutes.
Since I haven't implemented this functionality before, I have a few design questions:
What data should I attach to the order record? I'm considering:
LockOwnedBy
LockAcquiredTime
LockRefreshedTime
I would consider a record unlocked if the LockRefreshedTime < (Now - 10 min).
How do I guarantee that locks aren't held for longer than necessary but don't expire unexpectedly either?
I'm pretty comfortable with jQuery so approaches which make use of client script are welcome. This would be an internal web application so I can be rather liberal with my use of bandwidth/cycles. I'm also wondering if "pessimistic locking" is an appropriate term for this concept.
It sounds like you are most of the way there. I don't think you really need LockRefreshedTime though, it doesn't really add anything. You may just as well use the LockAcquiredTime to decide when a lock has become stale.
The other thing you will want to do is make sure you make use of transactions. You need to wrap the checking and setting of the lock within a database transaction, so that you don't end up with two users who think they have a valid lock.
If you have tasks that require gaining locks on more than one resource (i.e. more than one record of a given type or more than one type of record) then you need to apply the locks in the same order wherever you do the locking. Otherwise you can have a deadlock, where one bit of code has record A locked and is waiting to lock record B while another bit of code has B locked and is waiting for record A.
As for how you ensure locks don't expire unexpectedly: if you have any long-running process that could run longer than your lock timeout, make sure it refreshes its lock during its run.
The term "explicit locking" is also used to describe this type of locking.
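To make the check-and-set point concrete, here is a minimal sketch, not a definitive implementation. It uses Python with sqlite3 purely as a stand-in for the real ASP.NET/SQL stack. The table layout, the function names and the single refresh timestamp are my assumptions, and it folds the check and the set into one conditional UPDATE instead of a separate SELECT and UPDATE inside a transaction. Calling the same function again as the current owner acts as the lock refresh for long-running work.

```python
import sqlite3

LOCK_TIMEOUT_SECONDS = 10 * 60  # the 10-minute staleness window from the question


def create_schema(conn):
    # Hypothetical schema: lock columns live on the order row itself.
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " lock_owned_by TEXT,"
        " lock_refreshed_time REAL)"
    )


def try_acquire_lock(conn, order_id, user, now):
    """Take or refresh the lock; return True if `user` holds it afterwards."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE orders SET lock_owned_by = ?, lock_refreshed_time = ? "
            "WHERE id = ? AND (lock_owned_by IS NULL OR lock_owned_by = ? "
            "OR lock_refreshed_time < ?)",
            (user, now, order_id, user, now - LOCK_TIMEOUT_SECONDS),
        )
    # Exactly one row changes only if the order was free, stale, or already ours.
    return cur.rowcount == 1


def release_lock(conn, order_id, user):
    """Clear the lock, but only if `user` is the current owner."""
    with conn:
        conn.execute(
            "UPDATE orders SET lock_owned_by = NULL, lock_refreshed_time = NULL "
            "WHERE id = ? AND lock_owned_by = ?",
            (order_id, user),
        )
```

Because the test and the write happen in one statement, two users cannot both come away believing they hold the lock.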
I have done this manually.
Store the primary key of the record in a lock table, and mark the record's mode attribute as "edit".
When another user tries to select this record, show it to that user as a read-only record.
Set a maximum time for which records can stay locked.
Refresh the page data for locked records. While one user is allowed to make changes, all other users are only allowed to view.
Lock table should have design similar to this:
User_ID, //who locked
Lock_start_Time,
Locked_Row_ID(Entity_ID), //this is primary key of the table of locked row.
Table_Name(Entity_Name) //table name of the locked row.
Remaining logic is something you have to figure out.
This is just an idea which I implemented 4 years ago at the special request of a client. Since then no one has asked me to do anything similar, so I haven't tried any other method.
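For what it's worth, the lock table above could be sketched roughly like this. It is Python with sqlite3 as a stand-in database; the 15-minute maximum, the function names and the composite primary key are my assumptions, not part of the original design. The primary key on (table name, row id) is what stops two users from holding the same row.

```python
import sqlite3

MAX_LOCK_SECONDS = 15 * 60  # hypothetical maximum lock duration


def create_lock_table(conn):
    conn.execute(
        "CREATE TABLE record_locks ("
        " table_name TEXT NOT NULL,"
        " locked_row_id INTEGER NOT NULL,"
        " user_id TEXT NOT NULL,"
        " lock_start_time REAL NOT NULL,"
        " PRIMARY KEY (table_name, locked_row_id))"
    )


def try_lock(conn, table_name, row_id, user_id, now):
    """Return True if user_id obtained the lock, False if the row is already locked."""
    try:
        with conn:
            # Drop an expired lock on this row so it no longer blocks anyone.
            conn.execute(
                "DELETE FROM record_locks WHERE table_name = ? "
                "AND locked_row_id = ? AND lock_start_time < ?",
                (table_name, row_id, now - MAX_LOCK_SECONDS),
            )
            # The primary key rejects this insert if a live lock remains.
            conn.execute(
                "INSERT INTO record_locks VALUES (?, ?, ?, ?)",
                (table_name, row_id, user_id, now),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def lock_holder(conn, table_name, row_id):
    """Return the user holding the lock, or None if the row is not locked."""
    row = conn.execute(
        "SELECT user_id FROM record_locks "
        "WHERE table_name = ? AND locked_row_id = ?",
        (table_name, row_id),
    ).fetchone()
    return row[0] if row else None
```

A caller that gets False back can use lock_holder to tell the second user who has the record open and show it read-only.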
Related
I want to implement a views counter like most forums, Youtube and several others have. So every time a user reads an article, that is stored and remembered. I also want to know who looked at the article.
My question is: How do you implement this efficiently? What is the best practice?
One way would be to call a stored procedure for every view, but that would result in a lot of unneeded calls to the database.
Another way would be to store this to some global application object, and then store in DB every 5 minutes or so (and can you even do that in a good way?)
What's the best way to do this?
Database operations are surprisingly cheap and really are not worth worrying about. In the event that a DB operation was even marginally expensive then you can always delegate the blocking operation to a new thread thus freeing-up your page-generation thread (you can trivially do this for UPDATE and INSERT operations that return nothing from the database - they are inconsequential).
Sprocs aren't really in-fashion right now - the performance advantage they might have had from pre-computed execution plans is almost eliminated because modern servers cache plans from all previous queries, and for trivial SELECT, INSERT, and UPDATEs you begin to suffer from increased code complexity. There's nothing wrong with inline SQL commands now.
Anyway, back on-topic and in summary: your assumptions are wrong. There is nothing wrong with running UPDATE Pages SET ViewCount = ViewCount + 1 WHERE PageId = @pageId on every page-view. There is also nothing wrong with doing this either: INSERT INTO UserPageviews (UserId, PageId, DateTime) VALUES ( @userId, @pageId, NOW() ). Both operations are very cheap and will execute in under 2-3 milliseconds on even an old and aged database server.
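As a rough illustration of those two statements issued on every page view, here is a sketch in Python with sqlite3 standing in for the real database. The schema, the column names and the record_view helper are assumptions made for the example; the point is only that both writes are parameterized and run in one short transaction.

```python
import sqlite3
import time


def create_schema(conn):
    # Hypothetical tables mirroring the two statements above.
    conn.execute(
        "CREATE TABLE pages ("
        " page_id INTEGER PRIMARY KEY,"
        " view_count INTEGER NOT NULL DEFAULT 0)"
    )
    conn.execute(
        "CREATE TABLE user_pageviews ("
        " user_id INTEGER, page_id INTEGER, viewed_at REAL)"
    )


def record_view(conn, user_id, page_id, viewed_at=None):
    """Bump the page counter and log who viewed it, in one transaction."""
    viewed_at = time.time() if viewed_at is None else viewed_at
    with conn:
        conn.execute(
            "UPDATE pages SET view_count = view_count + 1 WHERE page_id = ?",
            (page_id,),
        )
        conn.execute(
            "INSERT INTO user_pageviews (user_id, page_id, viewed_at) "
            "VALUES (?, ?, ?)",
            (user_id, page_id, viewed_at),
        )
```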
"Another way would be to store this to some global application object, and then store in DB every 5 minutes or so (and can you even do that in a good way?)"
This method is very prone to data loss unless you use a durable queueing mechanism (like MSMQ). Unless you anticipate massive traffic, I wouldn't even think about this approach.
Writes of this nature are inexpensive and hundreds of operations per second are not a big deal. I recently built a comment/rating framework that achieves throughput of 3000+ complete transactions per second just on my local all-in-one workstation. This included processing the request, validation, and creating multiple records within a transaction.
As a note, you should take steps to ensure that your statistics data isn't vulnerable to artificial inflation/manipulation. This part of the process will probably be more complex than the view tracking itself. For example, a user should not be able to sit and hold down the F5 key and inflate the number of views on their video. Nor should these values be manipulable by HTTP (e.g. creating a small script to send an AJAX request over and over).
This suggests that each INSERT would be preceded by a SELECT to ensure that the same user ID or IP hadn't already been recorded in some period of time. Of course, this isn't foolproof (unless you invest a great deal of effort), but it errs on the side of conservatism which is usually a good approach.
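A minimal sketch of that SELECT-before-INSERT check, in Python with sqlite3 as a stand-in. The 30-minute window, the visitor_key column (a user ID or an IP address, whichever you key on) and the function name are all assumptions for illustration, and as noted it is not foolproof.

```python
import sqlite3

DEDUP_WINDOW_SECONDS = 30 * 60  # hypothetical "same visitor" window


def create_schema(conn):
    conn.execute(
        "CREATE TABLE pageviews ("
        " visitor_key TEXT, page_id INTEGER, viewed_at REAL)"
    )
    # Narrow index so the recency check stays cheap.
    conn.execute(
        "CREATE INDEX ix_pageviews ON pageviews (page_id, visitor_key, viewed_at)"
    )


def record_view_once(conn, visitor_key, page_id, now):
    """Insert a view unless this visitor already has one inside the window."""
    with conn:
        recent = conn.execute(
            "SELECT 1 FROM pageviews WHERE page_id = ? AND visitor_key = ? "
            "AND viewed_at > ? LIMIT 1",
            (page_id, visitor_key, now - DEDUP_WINDOW_SECONDS),
        ).fetchone()
        if recent:
            return False  # repeat hit (e.g. holding F5), not counted
        conn.execute(
            "INSERT INTO pageviews VALUES (?, ?, ?)",
            (visitor_key, page_id, now),
        )
    return True
```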
"One way would be to call a stored procedure for every view, but that would result in a lot of unneeded calls to the database."
I regularly have to remind myself (and other developers) to not fear the database. People (me included) sometimes go to great lengths to avoid a few simple database calls. Keep your tables narrow and well-indexed, and operations like this are faster than you might think.
I have asked a few questions today as I try to think through to the solution of a problem.
We have a complex data structure where all of the various entities are tightly interconnected, with almost all entities heavily reliant/dependent upon entities of other types.
The project is a website (MVC3, .NET 4), and all of the logic is implemented using LINQ-to-SQL (2008) in the business layer.
What we need to do is have a user "lock" the system while they make their changes (there are other reasons for this which I won't go into here that are not database related). While this user is making their changes we want to be able to show them the original state of entities which they are updating, as well as a "preview" of the changes they have made. When finished, they need to be able to rollback/commit.
We have considered these options:
Holding open a transaction for the length of time a user takes to make multiple changes stinks, so that's out.
Holding a copy of all the data in memory (or cached to disk) is an option, but there is a heck of a lot of it, so that seems unreasonable.
Maintaining a set of secondary tables, or attempting to use session state to store changes, but this is complex and difficult to maintain.
Using two databases, flipping between them by connection string, and using T-SQL to manage replication, putting them back in sync after commit/rollback. I.e. switching on/off, forcing snapshot, reversing direction etc.
We're a bit stumped for a solution that is relatively easy to maintain. Any suggestions?
Our solution to a similar problem is to use a locking table that holds locks per entity type in our system. When the client application wants to edit an entity, we do a "GetWithLock" which gets the client the most up-to-date version of the entity's data as well as obtaining a lock (a GUID that is stored in the lock table along with the entity type and the entity ID). This prevents other users from editing the same entity. When you commit your changes with an update, you release the lock by deleting the lock record from the lock table. Since stored procedures are the api we use for interacting with the database, this allows a very straight forward way to lock/unlock access to specific entities.
On the client side, we implement IEditableObject on the UI model classes. Our model classes hold a reference to the instance of the service entity that was retrieved on the service call. This allows the UI to do a Begin/End/Cancel Edit and do the commit or rollback as necessary. By holding the instance of the original service entity, we are able to see the original and current data, which would allow the user to get that "preview" you're looking for.
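To illustrate the Begin/End/Cancel idea outside .NET, here is a small Python sketch. The class name, the dict-based entity and the pending_changes helper are hypothetical; only the three edit operations correspond to what IEditableObject provides.

```python
import copy


class EditableModel:
    """Keeps the last saved values next to a working copy (hypothetical class)."""

    def __init__(self, entity):
        # `entity` is a plain dict standing in for the service entity.
        self.original = copy.deepcopy(entity)
        self.current = copy.deepcopy(entity)
        self.editing = False

    def begin_edit(self):
        self.editing = True

    def cancel_edit(self):
        # Roll back: discard the working copy in favor of the original.
        self.current = copy.deepcopy(self.original)
        self.editing = False

    def end_edit(self):
        # Commit: the working copy becomes the new original.
        self.original = copy.deepcopy(self.current)
        self.editing = False

    def pending_changes(self):
        """Map each changed field to an (original, current) pair for a preview."""
        return {
            key: (self.original.get(key), value)
            for key, value in self.current.items()
            if self.original.get(key) != value
        }
```

Keeping both copies is what makes the original-versus-preview view cheap: the UI only has to render pending_changes before the user commits or cancels.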
While our solution does not implement LINQ, I don't believe there's anything unique in our approach that would prevent you from using LINQ as well.
HTH
Consider this:
Long transactions make the system less scalable. If you issue an UPDATE command, the update locks last until commit/rollback, preventing other transactions from proceeding.
Secondary tables or a second database can be modified by concurrent transactions, so you cannot rely on the data in them. The only way around that is to lock them, which brings you back to point 1.
A serializable transaction in some data engines uses versions of the data in your tables. After the first command is executed, the transaction sees the data exactly as it was at that command's execution time. This might help you show the changes made by the user, but you have no guarantee of being able to save them back to storage.
DataSets contain old/new versions of data. Unfortunately, that is outside your technology stack.
Use a set of secondary tables.
The problem is that your connection should see two versions of data while the other connections should see only one (or two, one of them being their own).
While it is possible theoretically and is implemented in Oracle using flashbacks, SQL Server does not support it natively, since it has no means to query previous versions of the records.
You can issue a query like this:
SELECT *
FROM mytable
AS OF TIMESTAMP
TO_TIMESTAMP('2010-01-17')
in Oracle but not in SQL Server.
This means that you need to implement this functionality yourself (placing the new versions of rows into your own tables).
Sounds like an ugly problem, and raises a whole lot of questions you won't be able to go into on SO. I got the following idea while reading your problem, and while it "smells" as bad as the others you list, it may help you work up an eventual solution.
First, have some kind of locking system, as described by #user580122, to flag/record the fact that one of these transactions is going on. (Be sure to include some kind of periodic automated check, to test for lost or abandoned transactions!)
Next, for every change you make to the database, log it somehow, either in the application or in a dedicated table somewhere. The idea is, given a copy of the database at state X, you could re-run the steps submitted by the user at any time.
Next up is figuring out how to use database snapshots. Read up on these in BOL; the general idea is you create a point-in-time snapshot of the database, do whatever you want with it, and eventually throw it away. (Only available in SQL 2005 and up, Enterprise edition only.)
So:
A user comes along and initiates one of these meta-transactions.
A flag is marked in the database showing what is going on. A new transaction cannot be started if one is already in process. (Again, check for lost transactions now and then!)
Every change made to the database is tracked and recorded in such a fashion that it could be repeated.
If the user decides to cancel the transaction, you just drop the snapshot, and nothing is changed.
If the user decides to keep the transaction, you drop the snapshot, and then immediately re-apply the logged changes to the "real" database. This should work, since your requirements imply that, while someone is working on one of these, no one else can touch the related parts of the database.
Yep, this sure smells, and it may not apply too well to your problem. Hopefully the ideas here help you work something out.
I have the scenario like this,
My environment is .Net2.0, VS 2008, Web Application
I need to lock a record when two members are trying to access at the same time.
We can do it in two ways,
On the front end: put the session ID and the record's unique number in a dictionary kept as a static or application variable. We release the lock when the response leaves that page, when the client is no longer connected, after the post button is clicked, or when the session ends.
On the back end: record locking in the DB itself (still to be studied; my team member is looking into it).
Are there any other ways to do this, and do I need to consider alternatives at each step?
Am I missing any conditions?
You do not lock records for clients, because locking a record for anything more than a few milliseconds is just about the most damaging thing one can do in a database. You should use instead Optimistic Concurrency: you detect if the record was changed since the last read and re-attempt the transaction (eg you re-display the screen to the user). How that is actually implemented, will depend on what DB technology you use (ADO.Net, DataSets, Linq, EF etc).
If the business domain requires lock-like behavior, those are always implemented as reservation logic in the database: when a record is displayed, it is 'reserved' so that no other users can attempt to make the same transaction. The reservation completes or times out or is canceled. But a 'reservation' is never done using locks, is always an explicit update of state from 'available' to 'reserved', or something similar.
This pattern is also described in P of EAA: Optimistic Offline Lock.
If you're talking about only reading data from a record in a SQL Server database, you don't need to do anything! SQL Server will handle managing concurrent access to records. But if you want to manipulate data, you have to use transactions.
I agree with Ramus. But if you still need it, create a bit column named something like IsInUse and set it to true while someone is accessing the record. Since other users will need the same data at the same time, you need to keep your app from crashing, so everywhere the data is retrieved you have to check whether IsInUse is false.
I have a requirement that my site always display the number of users currently online. For example, "35741 Users Currently Online". This is not based on a log in, simply how many users are currently on my site. I have tried using Session Start/Session End for this, however session end is not reliable. Therefore I get inflated numbers, as my session start adds numbers but session end doesn't remove them because it doesn't fire.
There is no additional information to be gathered from this (reporting, etc), it's simply requested that the number show up. Very simple request that's turning into a huge deal. Any help is appreciated.
EDIT:
I should specify that I have also tried using a database for this. Simple table that contains a session ID and a last activity column. With each page hit, I check to see if the session is in my database. If not, insert. If so, update with activity time. Then I run a procedure that sweeps the database looking for sessions with no activity in the last 20 minutes. This approach seemed to kill my SQL server and/or IIS. Had to restart the site.
Best way is like you do, but time it out via activity. If a given session doesn't access a page within 5 minutes or so, you may consider them no longer active.
If you're using ASP.Net membership, take a look at GetNumberOfUsersOnline.
For every user action that you can record, you need to consider them "online" for a certain window of time. Depending on the site, you may set that to 5 minutes. The actual web request should take less than a second. You have to make some assumption about how long they might stay on that page and do nothing but be considered online.
This approach requires that you keep track of the time of each user's last activity.
Use Performance Counters:
State Server Sessions Active: The number of active user sessions.
Expanding on what silky said in his answer: since HTTP is stateless, to determine whether a user is currently 'online' you can really only track how long it has been since the user last accessed your site and decide how long a gap between requests you still consider active.
Since you stated that this isn't based upon users logging in, maybe it's simply a matter of how many different IP addresses you received requests from in the past 5 minutes (or however long you consider the 'online' timeout to be).
Don't use sessions for this unless you also need sessions for something else; it's overkill otherwise.
Assuming a single-server installation, do something like this:
For each user, issue a cookie that contains a unique ID
Maintain a static table of unique IDs and their last access time
In an HttpModule (or Global.asax), enter new users into the table and update their access times (use appropriate locking to prevent race conditions)
Periodically, either from a background thread or in-line with a user request, remove entries from the table that haven't made a request within the last N minutes. You might also want to support an explicit "log out" feature.
Report the number of people online as the size of the table
If you do use sessions, you can use the Session ID as the unique identifier. However, keep in mind that Session IDs aren't issued until you store something in the Session dictionary, unless you have a Session_Start() event configured.
In a load balanced or web garden scenario, it gets a little more complicated, but you can use the same basic idea, just persisting the info in a database instead of in memory.
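The single-server steps above can be sketched as follows. This is Python rather than an HttpModule, and the class name, the 5-minute default and the injectable clock are assumptions made so the example can run on its own; in the real site the touch call would sit in the per-request hook.

```python
import threading
import time


class OnlineTracker:
    """In-memory table of visitor IDs and their last access time."""

    def __init__(self, timeout_seconds=300, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock  # injectable so the timeout can be exercised in tests
        self._last_seen = {}
        self._lock = threading.Lock()  # requests arrive on many threads

    def touch(self, visitor_id):
        """Call on every request with the ID from the visitor's cookie."""
        with self._lock:
            self._last_seen[visitor_id] = self._clock()

    def logout(self, visitor_id):
        """Explicit log out: forget the visitor immediately."""
        with self._lock:
            self._last_seen.pop(visitor_id, None)

    def count_online(self):
        """Prune visitors idle past the timeout, then report the table size."""
        cutoff = self._clock() - self._timeout
        with self._lock:
            stale = [v for v, seen in self._last_seen.items() if seen < cutoff]
            for visitor_id in stale:
                del self._last_seen[visitor_id]
            return len(self._last_seen)
```

Here the pruning runs in-line with the count; a background thread would work just as well, as mentioned above.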
When the user logs in write his user name into the HttpContext.Current.Cache with a sliding expiration (say 20 minutes).
Then in the Global.asax.cs in the Application_PreRequestHandlerExecute "touch" the cache entry for the current users so it resets the sliding expiration.
When a user explicitly logs out, remove his username from HttpContext.Current.Cache.
If you do this, at any given time HttpContext.Current.Cache.Count will give you the # of current users.
Note: this is assuming you aren't using the Cache for other purposes.
I have a web application at work that is similar to a ticket working system. Some users enter new issues. Other workers choose and resolve issues. All of the data is maintained in MS SQL server 2005.
The users working to resolve issues go to a page where they can view open issues. Because up to twenty people can be looking at this page at the same time, one potential problem I had to address was what happens if someone picks an issue that someone else picked just after their page loaded.
To address this, I did two things. First, the gridview displaying the issues to select uses an AJAX timer to update every second. Once an issue has been selected, it disappears one second later at most. In case they select one within this second, they get a message asking them to choose another.
The problem is that the AJAX part of this is sending too many updates (this is what I am assuming) and it is affecting the performance of the page and database. In addition, the updates are not performing every second. I find the timer to be unreliable when working to trigger stored procedures.
There has to be a better way, but I can't seem to find one. Does anyone have experience with a situation like this or have suggestions to keep multiple users from selecting the same record to maintain? I really do not want to disable the AJAX part entirely because I feel the message alone would make the application frustrating to use.
Thanks,
Put a lock timestamp field on the row in the database. Write a stored proc that returns true or false depending on whether the expiration timestamp is older than a specific time. Set the sessions in your web app to expire after the same period, a minute or two. When a user selects a row, they hit the stored proc, which helps the app decide whether it should let the user modify it.
Hope that makes sense....
Two things can help mitigate your problem.
First, after-selection notification that the case has been taken is needed regardless of your ajax update time frame. Even checking every second doesn't mean two people cannot click the same case at what they perceive to be the same time. In such cases, one of the users needs to be notified that their selection is invalid even though it appeared valid when selected. This notification doesn't need to be elaborate; keeping a light, helpful tone can improve user perception even in the light of disappointment. And if you identify the user who selected that record already, that will not only help your users coordinate in future but also divert attention from your program to the user who snaked the juicy case. (indeed, management may like giving your users the occasional collision as it will motivate them to select cases faster)
Second, a small tweak to how you display your cases can reduce selection collisions. Adding a random element to display order and/or filtering out every other case on display will help your users select different cases naturally. Human pattern recognition and task selection isn't really random, so small changes to presentation can equal big changes to selection behavior. Reductions in collision chance keep your collision notifications rare (and thus less frustrating to your users). This is even better if your users can be separated into classifications that can help determine useful case ordering/filtering.
Okay, a third thing that will help you over time is if you keep a log of when collisions occur (with helpful meta data about the collision—like who was involved and selection timing). Armed with solid collision data, you can find what works and what doesn't. Over time, you can hone your application to your actual use cases as well as identify potential problems early. Nothing reassures your users more than being on top of a problem (and able to explain your plans to solve it) before they're even aware it exists.
With these mitigating patterns, you'll probably find you can safely reduce your ajax query timeframe without affecting user experience. And with useful logging, you'll have the assurance that any tweaks you put in place are actually working (or not—which is maybe even more useful to know).
I did something similar where once a user opened a ticket (row) it assigned that ticket to that user and set a value on that record, like an FK to that particular user, so if anyone else tried to open that ticket (row) it would let them know it has already been assigned to someone else.
If possible, limit the system so that they just get the next open issue off the work queue, as opposed to letting them choose from all open issues.
If that isn't possible, I suppose you could check upon the choosing of an issue to see if it is still available. If it's not available, then make it disappear after the user clicks on it. This way you are only requesting when they actually click on something as opposed to constant polling of the data.
Have you tried increasing the time between refreshes? I would expect that once per 30 seconds would be sufficient. With twenty users, 40 requests/minute is a lot less load than 1200/minute. Your users may not even notice the difference.
If they do, how about providing a refresh button on the page so the users can manually refresh the list just prior to selecting an item to avoid the annoying message if they choose.
I fail to see the issue, especially since you mentioned you are already flagging tickets as in progress/being maintained and have a timestamp/version on the item.
Isn't the following enough:
User browses the tickets and sees a list of available tickets i.e. this excludes ones that are in the db as in progress. If you want the users to also see tickets in progress, you indicate it clearly in the ticket status and disable the option to take it.
User either flags a ticket as in progress explicitly or implicitly by opening the ticket (depends on the user experience / how it's presented to the users).
User explicitly moves the ticket to a different status i.e. completed, invalid, awaiting for feedback, etc.
When the items are retrieved in step 1, you include a timestamp/version. When step 2 happens, you use an optimistic concurrency approach to make sure that if two people try to take the ticket at the same time, only the first one will be successful.
What will happen is that for the second person, the update ... where ... timestamp = @timestamp will not find any records to update, and you will report back that the ticket was already taken.
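Here is a rough sketch of that optimistic check, in Python with sqlite3 standing in for SQL Server. The table, the status values and the integer version column (playing the role of the timestamp/version) are assumptions for the example.

```python
import sqlite3


def create_schema(conn):
    conn.execute(
        "CREATE TABLE tickets ("
        " id INTEGER PRIMARY KEY,"
        " status TEXT NOT NULL DEFAULT 'open',"
        " assigned_to TEXT,"
        " version INTEGER NOT NULL DEFAULT 0)"
    )


def list_open_tickets(conn):
    """Step 1: return (id, version) pairs so the page remembers what it saw."""
    return conn.execute(
        "SELECT id, version FROM tickets WHERE status = 'open' ORDER BY id"
    ).fetchall()


def take_ticket(conn, ticket_id, seen_version, user):
    """Step 2: succeed only if nobody changed the ticket since it was listed."""
    with conn:
        cur = conn.execute(
            "UPDATE tickets SET status = 'in progress', assigned_to = ?, "
            "version = version + 1 WHERE id = ? AND version = ?",
            (user, ticket_id, seen_version),
        )
    # Zero rows updated means someone else took the ticket first.
    return cur.rowcount == 1
```

When take_ticket returns False, the page shows the "already taken" message; no polling is needed for correctness.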
If you want, you can build on top of the above to update the UI as tickets are grabbed. This could be by just doing a full refresh of the current page of tickets after x time (maybe alerting/prompting the user), or even by using AJAX to retrieve a list of changed tickets for the page of tickets being shown. You still have the earlier steps in place, as this modification is just a convenience for the users.