Hi all, I have a question about concurrent stored procedure calls from a .NET web service written against the 3.5 framework.
I am building a web service that is used by many users and needs to return contact information from an Oracle database.
The issue is that when more than one user clicks at the same time, the database returns the same contact info to each of them. The status update query is in the stored procedure. I only have this issue when two or more requests read the same record before the status update happens. I have tried a transaction and TransactionScope, but that doesn't solve the issue. Can anyone tell me whether I am tackling the problem the right way, or whether I should be looking at some other approach? Any help is greatly appreciated.
Sounds like your stored procedure code is what we term in the trade 'dodgy'.
Generally it should be a single statement along these lines:
UPDATE table
SET status = 'READ'
WHERE ...
RETURNING col_1, col_2 INTO var1, var2;
RETURN;
Your procedure is probably doing a SELECT and then an UPDATE based on ID, without checking whether the status has been changed by another transaction in between.
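A minimal sketch of that difference, using Python's built-in sqlite3 in place of Oracle purely so that it runs anywhere. The table `contacts`, its columns, and the function `claim_next_contact` are hypothetical names, not taken from the question. The point is that the UPDATE re-checks the status in its own WHERE clause, so a request that lost the race updates zero rows and moves on to the next candidate:

```python
import sqlite3

def claim_next_contact(conn):
    """Claim one unread contact; return (id, name) or None if none are left."""
    while True:
        row = conn.execute(
            "SELECT id, name FROM contacts WHERE status = 'NEW' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # Re-check the status in the UPDATE itself: if another request
        # claimed this row after our SELECT, rowcount is 0 and we retry.
        cur = conn.execute(
            "UPDATE contacts SET status = 'READ' WHERE id = ? AND status = 'NEW'",
            (row[0],),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, name TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO contacts (name, status) VALUES (?, 'NEW')",
    [("Alice",), ("Bob",)],
)
conn.commit()

first = claim_next_contact(conn)
second = claim_next_contact(conn)
third = claim_next_contact(conn)   # nothing left to claim
```

In Oracle the same check can live inside the single UPDATE ... RETURNING statement shown above, which removes the need for the retry loop.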
Your "read" operation on the database isn't really a read, since it updates the record in question. Sounds like what you need is for the first read-and-update operation to complete and commit before the second read-and-update operation begins.
I'm not a database/Oracle guru, so I apologize if this answer is somewhat vague: is there a way to lock the table in question at the start of the read-and-update operation, perhaps through an Oracle transaction setting? If so, that should accomplish what you're looking for.
Note that you're essentially single-threading access to that table, which could potentially have performance implications, depending on the number of concurrent users and how often the table is accessed. You might want to consider alternate ways of accomplishing the requirement.
I read this document and, among several very relevant topics, some are key to a scalability problem I am facing.
Basically, the document states that it is possible to overcome the limit of one update per second per entity, the limit that drove me to Redis in a use case that would not otherwise have demanded it.
"a (google) software engineer in the Datastore team had mentioned a technique to obtain much higher throughput than one update per second on an entity group"
"The basic idea of Job Aggregation is to use a single thread to process a batch of updates. Because there is only one thread and only one transaction open on the entity group, there are no transaction failures due to concurrent updates. You can find similar ideas in other storage products such as VoltDb and Redis."
This is very useful to me, but I don't have any clue how it works.
Could just creating a service and serialising (pull queue) upserts to a specific Kind solve the issue? How could Datastore be sure that no other thread would suddenly begin to upsert?
Thanks
It is important to keep in mind that Job Aggregation is not part of Datastore. As the documentation says, you need to apply the updates as a single batch. Take a look at Batch operations to see how to upsert multiple entities.
As for your second question, Datastore is not responsible for ensuring that no other thread begins to upsert; you must ensure that this does not happen in order to get better performance.
Datastore best practices lists other practices that Google recommends for better performance.
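To make the Job Aggregation idea concrete, here is a rough sketch in plain Python. It does not use the Datastore API at all: the dict `store` is a hypothetical stand-in for one entity group, and `pending`, `aggregator` and `producer` are illustrative names. Any number of producers enqueue updates, but only one thread ever writes, and it applies everything it finds queued as one batch:

```python
import queue
import threading

store = {}                 # hypothetical stand-in for one entity group
pending = queue.Queue()    # updates submitted by any number of producers
commits = []               # sizes of the batches that were applied

def aggregator():
    """Single writer: drain whatever is queued and apply it as one batch."""
    while True:
        batch = [pending.get()]          # block until there is work
        while True:
            try:
                batch.append(pending.get_nowait())
            except queue.Empty:
                break
        stop = None in batch             # None is the shutdown signal
        updates = [u for u in batch if u is not None]
        if updates:
            # In a real system this loop would be a single transaction.
            for key, delta in updates:
                store[key] = store.get(key, 0) + delta
            commits.append(len(updates))
        if stop:
            return

def producer(n):
    for _ in range(n):
        pending.put(("counter", 1))

writer = threading.Thread(target=aggregator)
writer.start()
producers = [threading.Thread(target=producer, args=(100,)) for _ in range(5)]
for t in producers:
    t.start()
for t in producers:
    t.join()
pending.put(None)          # all producers are done: tell the writer to stop
writer.join()
```

Nothing in the storage layer enforces the single-writer rule here; it holds only because every update goes through the queue, which is the discipline the answer above says the application has to provide itself.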
I am pretty new to ASP.NET and SQL Server, so please provide as much detail as possible; I really appreciate it. Here is my situation:
I am building the registration system for a website using ASP.NET and SQL Server 2012 Express.
When users register, they first need to fill out their account information (name, address, phone, email, password...) and then the information for the company where they work (company name, company industry, website, description...). So in the database I have two tables: User and Company.
The problem is this: when they have filled out all the required fields in the registration form and click the SUBMIT button, how do I check that the data has been inserted into both tables successfully? If for any reason the insert into the User table succeeds but the insert into the Company table fails, that is going to be a problem. I don't know if there is any way to check for this, or to reverse the process in SQL Server if any insert query fails.
I don't know if this makes sense to all of you; please ask if you don't understand my question.
Thank you!
Without any code this is hard to answer.
In .NET you can use try { } catch { } blocks to handle errors and act on them.
A better approach is to commit the User and Company in one transaction. How you do this depends on your data access approach. In LINQ to SQL, for example, you need to have the two insertions in one SubmitChanges() call. Or do it yourself with TransactionScope: http://msdn.microsoft.com/en-us/library/system.transactions.transactionscope.aspx
Update: read the Remarks in the link:
The System.Transactions infrastructure provides both an explicit
programming model based on the Transaction class, as well as an
implicit programming model using the TransactionScope class, in which
transactions are automatically managed by the infrastructure.
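The all-or-nothing behaviour can be illustrated with a small sketch. It uses Python's built-in sqlite3 rather than SQL Server and TransactionScope, only so that it is runnable as-is; the tables `users` and `companies` and the function `register` are hypothetical names. If the second insert fails, the first one is rolled back as well:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE companies ("
    "id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, name TEXT NOT NULL)"
)

def register(conn, email, company_name):
    """Insert the user and the company together; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            cur = conn.execute("INSERT INTO users (email) VALUES (?)", (email,))
            conn.execute(
                "INSERT INTO companies (user_id, name) VALUES (?, ?)",
                (cur.lastrowid, company_name),
            )
        return True
    except sqlite3.Error:
        return False

ok = register(conn, "a@example.com", "Acme")
failed = register(conn, "b@example.com", None)   # violates NOT NULL on companies.name
user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
```

After the failed call, the second user is not left behind in `users`, so there is nothing to check or undo by hand.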
My client is asking for a "suggestion" based lookup to be added to a particular form field.
In other words, as you start typing into a field there should be a "Google style" popup which suggests possible results to select from. There will be in the order of "tens of thousands" of possible suggestions - this is the best estimate I currently have on the quantity.
Using AJAX to send/retrieve the result, my question is whether it is better to store ALL the suggestions within .NET cache and process there, or whether it's better to run a stored-procedure based query on SQL Server for each request?
This would be using .NET 2.0 and SQL Server 2005
There is one trick I use every time I am faced with such a task: do not run the search on every keystroke. Put the launching of the search on a sliding timeout.
The intent here is to launch the search only when the user has paused in their typing. I usually set the timeout at 0.1 to 0.2 seconds. Psychologically it still feels instantaneous, but it considerably reduces the load on whatever you use to search.
Your bottleneck will be transporting the data from the server to the browser. You can easily produce a large result from the database in almost no time at all, but it takes forever to return to the browser.
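The sliding timeout can be sketched as follows. In the browser this would normally be written in JavaScript with setTimeout and clearTimeout; the version below uses Python's threading.Timer only to show the mechanism, and `Debouncer` and `searches` are hypothetical names. Each keystroke cancels the pending timer and starts a new one, so the search runs only after a pause:

```python
import threading
import time

class Debouncer:
    """Run `action` only after `delay` seconds without a new call."""

    def __init__(self, delay, action):
        self.delay = delay
        self.action = action
        self._timer = None
        self._lock = threading.Lock()

    def trigger(self, text):
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()      # slide the timeout forward
            self._timer = threading.Timer(self.delay, self.action, args=(text,))
            self._timer.daemon = True
            self._timer.start()

searches = []                             # records each search that actually ran
debouncer = Debouncer(0.15, searches.append)

for typed in ("h", "ha", "hat"):          # three quick keystrokes
    debouncer.trigger(typed)
    time.sleep(0.01)

time.sleep(0.5)                           # the user pauses; the timer fires once
```

Only the final text reaches the search, so three keystrokes cost one request instead of three.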
You will have to find a way to limit the result that the server returns, so that you only fetch what the user has to see right now.
When you have the data traffic down to a reasonable level, you can start looking at optimising the server part. With some caching and logic that should be easy, considering that as the user types, the result is often a subset of the previous result, e.g. the match for "hat" is a subset of the match for "ha".
When I've seen "suggest as you type" searches done in SQL Server environments, the best performance has come from some sort of caching mechanism, typically a distributed one such as memcached. Even if your code is well optimized, your database is well tuned, and the query takes only 10 ms or less to call, process, and return, that is still 10 ms on every keystroke as they type.
It depends on the number of items. If you can cache the items in a .NET process without running out of memory, this will definitely be faster.
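A small sketch of both ideas, limiting what is returned and narrowing the previous result instead of searching everything again. The class `SuggestionCache` and its fields are hypothetical, and a real implementation would keep this state per user or per session:

```python
class SuggestionCache:
    """Reuse the previous matches when the new prefix extends the old one."""

    def __init__(self, items, limit=10):
        self.items = sorted(items)
        self.limit = limit
        self.last_prefix = None
        self.last_matches = []
        self.full_scans = 0

    def suggest(self, prefix):
        prefix = prefix.lower()
        if self.last_prefix is not None and prefix.startswith(self.last_prefix):
            candidates = self.last_matches     # "hat" narrows the "ha" result
        else:
            candidates = self.items            # unrelated prefix: scan everything
            self.full_scans += 1
        matches = [s for s in candidates if s.lower().startswith(prefix)]
        self.last_prefix, self.last_matches = prefix, matches
        return matches[: self.limit]           # send only what the user can see

cache = SuggestionCache(["hat", "hammer", "hatch", "house", "apple"], limit=2)
first = cache.suggest("ha")
second = cache.suggest("hat")
```

The second call filters three cached matches rather than the whole list, and each response carries at most `limit` entries.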
But if that can't be done you are stuck with accessing the database on each request. A stored procedure would be a nice way to do that.
But there are some design patterns that can guide you. I've used the following two while developing something similar to what you are doing:
Submission throttling
Browser side cache
I currently have a problem where the performance of my database is impacted by several million lines of updates that run (it takes more or less 3 days, so we usually run them over a weekend).
However, since the site is live, search performance is impacted. A query that normally takes 3 seconds to pull 1.3 million records and page through them sometimes exceeds the default timeout values in SQL Server. This obviously creates a user experience no one wants (or can afford) to have happen.
My question now: if I set up replication from the Master to a Slave on the same server, would I be able to point the website to the Slave and avoid that performance impact? Or would it just duplicate the same problem, since the Master will push any updates through to the Slave in any case?
I don't think replication is going to help you here; it is only going to make things on the source system worse, IMHO.
Is it possible you can make a static copy of the data for the users running queries while the updates are going on? For reporting solutions that don't need to be up-to-the-minute, I've done this in several cases using two schemas - one that holds the static versions of the tables for querying, and one where the work is being done; when the work is done, switch. I go into this methodology in a little more detail here: What is the best way to refresh a rollup table under load?
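The "work on a copy, then switch" idea can be sketched as below. The answer describes switching between two schemas in SQL Server; this sketch swaps table names in sqlite3 instead, purely so that it is self-contained, and the table names `products`, `products_work` and `products_old` are hypothetical:

```python
import sqlite3

# isolation_level=None: transactions are managed by hand with BEGIN/COMMIT.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL)")
conn.execute("INSERT INTO products VALUES (1, 10.0)")

# 1. Do the slow work on a separate copy; readers keep querying `products`.
conn.execute("CREATE TABLE products_work AS SELECT * FROM products")
conn.execute("UPDATE products_work SET price = price * 2")

# 2. Switch: swap the names in one short transaction.
conn.execute("BEGIN")
conn.execute("ALTER TABLE products RENAME TO products_old")
conn.execute("ALTER TABLE products_work RENAME TO products")
conn.execute("COMMIT")

# 3. Discard the stale copy once the switch has succeeded.
conn.execute("DROP TABLE products_old")

price = conn.execute("SELECT price FROM products WHERE id = 1").fetchone()[0]
```

Readers are only affected for the duration of the swap itself, not for the days the update job takes.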
Perhaps another thought is to make your updates more efficient, such that they don't take 3 days? Do you only do this on long weekends?
The question is, what is the nature of your site? Do users use it just to search, or does it also do CRUD operations? If it is just for search and report generation, then I agree with @Aaron. You can have a database just for reporting purposes; you can even use Log Shipping to update your reporting database automatically at very brief intervals.
Is it possible that users can change data while records are being updated by your update process? In that case, you will have to update your primary database using your update job, and then update the primary database again with the changes users made against the Slave database.
Are there well-known best practices for synchronizing tasks across a server farm? For example if I have a forum based website running on a server farm, and there are two moderators trying to do some action which requires writing to multiple tables in the database, and the requests of those moderators are being handled by different servers in the server farm, how can one implement some locking functionality to ensure that they can't take that action on the same item at the same time?
So far, I'm thinking about using a table in the database to sync: check for the id of the item in the table; if it doesn't exist, insert it and proceed, otherwise return. A shared cache could probably also be used for this, but I'm not using one at the moment.
Any other way?
By the way, I'm using MySQL as my database back-end.
Your question implies data level concurrency control -- in that case, use the RDBMS's concurrency control mechanisms.
That will not help you if later you wish to control application level actions which do not necessarily map one to one to a data entity (e.g. table record access). The general solution there is a reverse-proxy server that understands application level semantics and serializes accordingly if necessary. (That will negatively impact availability.)
It probably wouldn't hurt to read up on CAP theorem, as well!
You may want to investigate a distributed locking service such as ZooKeeper. It's a reimplementation of a Google service that provides very high speed distributed resource locking coordination for applications. I don't know how easy it would be to incorporate into a web app, though.
If all the state is in the (central) database then the database transactions should take care of that for you.
See http://en.wikipedia.org/wiki/Transaction_(database)
It may be irrelevant for you because the question is old, but it may still be useful for others, so I'll post it anyway.
You can use a "SELECT FOR UPDATE" query on a locking object, so that you are actually using the database as the lock mechanism.
If you use an ORM, you can do the same thing there. For example, in NHibernate you can do:
session.Lock(Member, LockMode.Upgrade);
Having a table of locks is an OK way to do it; it is simple and it works.
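A minimal sketch of such a lock table, using sqlite3 in place of MySQL so that it runs on its own. The table `item_locks` and the functions `try_lock` and `unlock` are hypothetical names. The primary key does the work: only one insert per item can succeed, regardless of which web server issued it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item_locks (item_id INTEGER PRIMARY KEY, owner TEXT NOT NULL)"
)

def try_lock(conn, item_id, owner):
    """Return True if `owner` obtained the lock, False if it is already held."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO item_locks (item_id, owner) VALUES (?, ?)",
                (item_id, owner),
            )
        return True
    except sqlite3.IntegrityError:   # primary key already exists: someone holds it
        return False

def unlock(conn, item_id, owner):
    """Release the lock, but only if `owner` is the one holding it."""
    with conn:
        conn.execute(
            "DELETE FROM item_locks WHERE item_id = ? AND owner = ?",
            (item_id, owner),
        )

a = try_lock(conn, 42, "moderator-a")
b = try_lock(conn, 42, "moderator-b")   # refused while moderator-a holds the lock
unlock(conn, 42, "moderator-a")
c = try_lock(conn, 42, "moderator-b")
```

One caveat with this design is that a holder that crashes leaves its row behind, so in practice some form of expiry or cleanup is probably needed.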
You could also have the code as a Service on a Single Server, more of a SOA approach.
You could also use a TimeStamp field with transactions: if the timestamp has changed since you last read the data, you roll the transaction back. So whoever gets in first has priority.
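The timestamp approach can be sketched like this. An integer `version` column is used here instead of a timestamp, which is an assumption made to keep the example deterministic; the table `posts` and the functions `read_post` and `save_post` are hypothetical names. The write succeeds only if the row still carries the version that was read:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO posts VALUES (1, 'original', 1)")
conn.commit()

def read_post(conn, post_id):
    """Return (body, version) for the given post."""
    return conn.execute(
        "SELECT body, version FROM posts WHERE id = ?", (post_id,)
    ).fetchone()

def save_post(conn, post_id, new_body, seen_version):
    """Write only if nobody changed the row since `seen_version` was read."""
    with conn:
        cur = conn.execute(
            "UPDATE posts SET body = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_body, post_id, seen_version),
        )
    return cur.rowcount == 1

# Two moderators read the same version, then both try to save.
_, v_a = read_post(conn, 1)
_, v_b = read_post(conn, 1)
a_saved = save_post(conn, 1, "edited by A", v_a)
b_saved = save_post(conn, 1, "edited by B", v_b)   # stale version: rejected
final_body = read_post(conn, 1)[0]
```

The second moderator's save is rejected and can be retried after re-reading the row, so no lock is held while either of them is working.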