I am learning Dynamics CRM 2015 (specifically the differences from previous versions). My experience is in Dynamics CRM 2011, so it is a bit of a jump; I am taking some tutorials on YouTube and came across the term "Optimistic" Asynchronous.
I have searched this term and am having a hard time getting the answer that I need. Can somebody explain this to me? Also, is this a term that is mostly used within Dynamics CRM?
Thanks!
In Dynamics CRM, heavy use of workflows can lead to frequent deadlocks in the SQL Server database: the asyncoperation table tends to grow quickly, and multiple concurrent threads processing workflow jobs often compete for exclusive access to shared database resources.
Database (pessimistic) locking tends to scale badly, especially in high-traffic transactional systems, so SQL Server 2005 introduced another technique called optimistic locking. (Nice explanations can be found here and here.) When a record is retrieved from the database, a column holding a unique time stamp or version number is included. When that record is to be updated, the database simply checks whether the time stamp or version number is still unchanged. If it is not, another process has modified the same row since the original read, so the update is based on stale data; it must be cancelled and the transaction rolled back.
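To make that mechanism concrete, here is a minimal sketch of such a version check in plain Java/JDBC (this is not the CRM SDK; the table and column names, account and row_version, are just placeholders):

import java.sql.*;

// The UPDATE only succeeds if the version we read earlier is still the current one.
static void updateAccountName(Connection con, long id, long versionRead, String newName)
        throws SQLException {
    try (PreparedStatement ps = con.prepareStatement(
            "UPDATE account SET name = ?, row_version = row_version + 1 " +
            "WHERE id = ? AND row_version = ?")) {
        ps.setString(1, newName);
        ps.setLong(2, id);
        ps.setLong(3, versionRead);
        if (ps.executeUpdate() == 0) {
            // 0 rows affected: someone else changed the row after we read it,
            // so the caller must roll back and re-read (or report a conflict)
            throw new SQLException("Optimistic concurrency conflict on account " + id);
        }
    }
}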
Dynamics CRM 2015 Update 1 introduced optimistic concurrency for entities through RowVersion attributes, as you can read on MSDN.
I need to integrate a database into one of our products and I wonder which one would be best suited to our needs (easy automatic deployment, no administration, good performance); sqlite seems to be a good solution. The problem is that the database could potentially face high concurrency: it is accessed through PHP (Apache) each time a client connects to the server the database is running on. One client connects (and executes an INSERT query) approximately every 10 seconds, and there could be more than 100 clients running.
When executing an INSERT query, sqlite locks the entire database for a certain duration. Is there a way to compute that duration? If not, do you think sqlite (v3.3.7) is still suited to the conditions above?
I try to avoid emotive replies and hyperbole but I am truly astonished at the lack of knowledge about sqlite displayed on this page. Different database implementations serve different needs and from the operational specs you provide, sqlite3 seems ideal for your needs. To elaborate:
sqlite3 is fully ACID compliant, meaning it ensures atomic commits, which is something neither MySQL (good as it may be) nor Oracle can brag about. See more here
Also, sqlite3 has a deceptively simple mechanism for ensuring maximum concurrency (which is also thread-safe) as described in their File locking and Concurrency document.
By their (sqlite3 developers') own estimation, sqlite3 is capable of up to 50,000 INSERTs per second; the transaction rate, however, is limited by disk rotation speed. ACID compliance requires sqlite3 to confirm that a database commit has been written to disk, so an INSERT, UPDATE or DELETE transaction requires two full disk rotations, effectively limiting it to 60 transactions/s on a 7200 rpm disk drive. This is outlined in the sqlite FAQ linked in another answer, and it gives some idea of the engine's data throughput capability in production. But what about concurrent reading and writing?
The File locking and Concurrency document linked earlier explains how sqlite3 avoids "writer starvation" - a condition whereby heavy database read access prevents a process/thread seeking to write to the database from ever acquiring a lock. The escalation of the locking state from SHARED through RESERVED and PENDING to EXCLUSIVE happens as sqlite3 encounters an INSERT (or UPDATE or DELETE) statement and then again upon COMMIT, meaning that the full database lock is delayed until the last moment before an actual write is performed. The outcome of sqlite's clever file-locking mechanism is that should a writer join the queue (PENDING lock), existing reads (SHARED locks) complete, the writer is granted an EXCLUSIVE lock, and reading then resumes. This takes only a few milliseconds, so the effective transaction throughput will hardly move from the 60/s rate quoted above.
I believe the default sqlite3 WAIT on an EXCLUSIVE lock is 3 seconds, so given the fact that 60 transactions per second is a reasonable expectation and that you seek to write to the database on average once every 10 seconds - I'd say sqlite3 is well up to the task and will only require the introduction of clustering once your traffic increases by a factor of 500.
Not bad and perfect for your requirement.
I don't think that SQLite would be a good solution for those requirements. SQLite is designed for local and lightweight use only, not to serve hundreds of requests.
I would recommend some other solution, for example MySQL or PostgreSQL; both can be scripted quite well. So, if I were you, I would put my efforts into the setup scripting.
To avoid the flame war between SQLite believers and haters, let me draw your attention to the often-cited SQLite When-To-Use document (I believe it is considered a credible source). Here they state the following:
Situations Where A Client/Server RDBMS May Work Better
High Concurrency
SQLite supports an unlimited number of simultaneous readers, but it will only allow one writer at any instant in time. For many situations, this is not a problem. Writers queue up. Each application does its database work quickly and moves on, and no lock lasts for more than a few dozen milliseconds. But there are some applications that require more concurrency, and those applications may need to seek a different solution.
I think the question referred to here involves many writes, and if the OP went with SQLite, it would result in a non-scalable solution.
Here is what SQLite has to say about appropriate uses of SQLite: http://www.sqlite.org/whentouse.html In particular, that page says SQLite is good for low to medium traffic sites, exactly the sort of application that you're contemplating.
Seems like SQLite would work for you, unless you expect substantial growth. Depending on what you do in each request, I would expect that a query rate of 0.17 queries per second to be well within SQLite's capabilities.
For a good user experience, you should design your site so that the queries needed to service a single request take ~200 milliseconds. To achieve this, result sets should probably not touch more than a few score rows and should rely on indexes, not full table scans. If you hit that, then you'll have enough headroom to serve 5 queries per second (at peak). That's 30x the requirement you state in your question.
The SQLite FAQ covers this topic: http://www.sqlite.org/faq.html (See: "(5) Can multiple applications or multiple instances of the same application access a single database file at the same time?")
But for your particular use, you'd probably want to do some stress testing to verify it'll meet your needs. 100 concurrent users might be a bit much for SQLite.
I read through the FAQs, and it seems that SQLite has some pretty decent support for concurrency, but may require the use of transactions to be sure things are going to go well.
The comments above regarding Apache concurrency are correct: one Apache server can serve multiple requests, the number depending on how many processes are run. On most of the servers I run, it's set to 3-5 processes, while on larger installations it might reach 20. The point here is that SQLite can more than handle a small to medium traffic web site, as it can do thousands of inserts a second.
I plan on using SQLite for my current project, but to be safe, I fully intend on using BEGIN TRANSACTION and COMMIT for any writing or concurrency-sensitive parts.
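For what it's worth, here is a minimal sketch of that BEGIN TRANSACTION / COMMIT pattern in Java, assuming the sqlite-jdbc driver is on the classpath (the original question uses PHP, but the idea is the same; the file and table names are placeholders):

import java.sql.*;

static void recordConnection(int clientId) throws SQLException {
    try (Connection con = DriverManager.getConnection("jdbc:sqlite:clients.db")) {
        con.setAutoCommit(false);                       // implicit BEGIN TRANSACTION
        try (PreparedStatement ps = con.prepareStatement(
                "INSERT INTO connections (client_id, connected_at) VALUES (?, ?)")) {
            ps.setInt(1, clientId);
            ps.setLong(2, System.currentTimeMillis());
            ps.executeUpdate();
            con.commit();                               // COMMIT: exclusive lock held only briefly
        } catch (SQLException e) {
            con.rollback();                             // release locks on failure
            throw e;
        }
    }
}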
Bottom line, as usual, Read The Manual.
In addition to the discussion above about sqlite3, one more awesome feature, introduced in sqlite v3.7.0, is WAL mode, in which multiple processes can read while one process writes, all at the same time.
Have a look at http://www.sqlite.org/wal.html
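As a rough illustration (again assuming the sqlite-jdbc driver; the pragma itself is the same from any language), switching a database to WAL mode is a one-off statement:

import java.sql.*;

// WAL mode persists in the database file once set (SQLite 3.7.0 or later).
static void enableWal(String dbPath) throws SQLException {
    try (Connection con = DriverManager.getConnection("jdbc:sqlite:" + dbPath);
         Statement st = con.createStatement()) {
        st.execute("PRAGMA journal_mode=WAL;");   // readers no longer block the single writer
    }
}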
We are using the core Berkeley DB base API and have already written and read a CDR file of about 9.7 lakh (966,320) records. We are now seeing a difference in performance between runs with and without transactions, as follows:
We write/read the S-Max CDR file using the secondary database concept:
Using the secondary database concept (with transactions):

Operation        Rows      Time
Write CDR file   966,320   5.8 minutes
Read CDR file    966,320   1.36 minutes

Using the secondary database concept (without transactions):

Operation        Rows      Time
Write CDR file   966,320   3.27 minutes
Read CDR file    966,320   1.28 minutes
We have already set the relevant parameter via environConfig.setTransaction(true), and the same on the database.
Transaction txn = databaseEnv.getDbEnv().beginTransaction(null, null);
and passed it here:
databaseEnv.getCdrDb().put(txn, thekey, theData);
Could someone explain how transactions are used internally on the database and environment?
From subsequent questions that you've asked, it sounds like you've moved beyond this initial question. For the record, when configured for transactions BDB performs writes to the transaction log in order to provide recoverability in the event of an application or a system crash (automatic recovery is not available when BDB is used without transactions).
When using transactions, by default BDB will wait until the transaction has been written all the way to the disk (see the first two sections of Getting started with Transactions to get more information about transaction durability and how to increase throughput by relaxing the durability constraints).
In addition to relaxing the durability constraints, as indicated above, you can also improve throughput by a) including multiple operations (inserts) in the same transaction rather than inserting each record in a separate transaction, and b) using the bulk put API via the putMultiple call (this will reduce the per-call JNI overhead).
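As a rough sketch of point (a), reusing the beginTransaction/put calls from your snippet (CdrRecord, records and batchSize are placeholders, not part of your code):

// Commit once per batch instead of once per record: far fewer log flushes.
int batchSize = 1000;
int count = 0;
Transaction txn = databaseEnv.getDbEnv().beginTransaction(null, null);
for (CdrRecord record : records) {
    DatabaseEntry theKey = new DatabaseEntry(record.keyBytes());
    DatabaseEntry theData = new DatabaseEntry(record.dataBytes());
    databaseEnv.getCdrDb().put(txn, theKey, theData);
    if (++count % batchSize == 0) {
        txn.commit();                                              // one durable log write per 1000 puts
        txn = databaseEnv.getDbEnv().beginTransaction(null, null);
    }
}
txn.commit();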
In the future, these questions may be more quickly answered on the Berkeley DB technical forum on OTN.
I hope that helps.
Regards,
Dave
I'm about to begin designing the architecture of a personal project that has the following characteristics:
Essentially a "game" containing several concurrent users based on a sport.
Matches in this sport are simulated on a regular basis and their results stored in a database.
Users can view the details of a simulated match "live" when it is occurring as well as see results after they have occurred.
I developed a similar web application with a much smaller scope as the previous iteration of this project. In that case, however, I chose to go with SQLite as my DB provider since I also had a redistributable desktop application that could be used to manually simulate matches (and in fact that ran as a standalone simulator outside of the web application). My constraints have now shifted to be only a web application, so I don't have to worry about this additional level of complexity.
My main problem with my previous implementation was handling concurrent requests. I made the mistake of using one database (which was represented by a single file on disk) to power both the simulation aspect (which ran in a separate process on the server) and the web application. Hence, when users were accessing the website concurrently with a live simulation happening, there were all sorts of database access issues since it was getting locked by one process. I fixed this by implementing a cross-process mutex on database operations but this drastically slowed down the performance of the website.
The tools I will be using are:
ASP.NET for the web application.
SQL Server 2008 R2 for the database... probably with an NHibernate layer for object relational mapping.
My question is: how do I design this so I achieve optimal efficiency as well as concurrent access? Obviously shifting from a single file to an actual DB server will have its positives, but do I need two separate servers, one for the simulation process and one for the web server process?
Any suggestions would be appreciated!
Thanks.
You should be fine doing both on the same database. Concurrent access is what modern database engines are designed for. Concurrent reads are usually no problem at all; concurrent writes lock the minimum possible amount of data (a table, or even just a number of rows), not the entire database.
A few things you should keep in mind though:
Use transactions wisely. On the one hand, a transaction is an important tool in making sure your database is always consistent - in short, a transaction either happens completely, or not at all. On the other hand, two concurrent transactions can cause deadlocks, and those buggers can be extremely hard to debug.
Normalize, and use constraints to protect your data integrity. Enforcing foreign keys can save the day, even though it often leads to more cumbersome administration.
Minimize the amount of time spent on data access: don't keep connections around when you don't need them, make absolutely sure you're not leaking any connections, don't fetch data you know you don't need, and do as much data-related processing as possible (especially things that can be solved using joins, subqueries, groupings, views, etc.) in SQL instead of in code.
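As a small illustration of the last two points (the stack in the question is ASP.NET/NHibernate, where the equivalent is a using-block around the session; this Java/JDBC sketch just shows the pattern, and the table and column names are made up):

import java.sql.*;
import javax.sql.DataSource;

// The connection is closed even on error (no leaks), and the aggregation is done in SQL
// rather than by pulling every row into application code.
static int goalsScoredOn(DataSource ds, int teamId, java.sql.Date matchDate) throws SQLException {
    try (Connection con = ds.getConnection();
         PreparedStatement ps = con.prepareStatement(
             "SELECT COALESCE(SUM(goals), 0) FROM match_events WHERE team_id = ? AND match_date = ?")) {
        ps.setInt(1, teamId);
        ps.setDate(2, matchDate);
        try (ResultSet rs = ps.executeQuery()) {
            rs.next();
            return rs.getInt(1);
        }
    }
}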
The topic of SQL Server deadlocks has been discussed many times; however, I was not sure that even two simultaneous inserts on a table could end up in a deadlock situation.
Scenario:
While testing our application (SQL Server 2005 as backend, ASP.net 3.5) we inserted records into a table simultaneously (simplified overview) and that resulted in a deadlock for more than 70% of users.
I could not understand how an insert could deadlock, since this is not a case of multiple resources. After a detailed analysis (reproducing the bug with two users) I found that both processes were holding a RangeS-S lock on the primary key index of the table and were trying to convert it into a RangeI-N lock, which resulted in a deadlock with one transaction being killed.
Question:
Can we avoid or reduce these kinds of deadlocks, given that this is not a case of changing the order in which resources are accessed? Can't we force the transaction to take an exclusive lock initially so that it blocks the other process and avoids the deadlock? What (adverse) effects might that have?
Also, could someone explain more about the RangeI-N lock?
The isolation level for this was Serializable.
Any suggestion will be helpful.
Thanks,
Gaurav
Change your ADO isolation level. Unless you have clear requirements for Serializable, you shouldn't use it. If you do use it, then you must clearly understand the consequences, and frequent deadlocks due to range locks are one of these consequences.
The isolation level of System.Transactions is controlled by the IsolationLevel property.
Use sp_getapplock to acquire a custom exclusive lock
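As a rough sketch of the second option (the answer above is about ADO.NET/System.Transactions, but sp_getapplock is plain T-SQL and can be called from any client; here it is from Java/JDBC, with the resource, table and column names made up):

import java.sql.*;

// Serialise the competing inserts behind an application lock instead of relying on
// Serializable range locks. The lock is tied to the transaction and released at commit.
static void insertOrder(Connection con, int customerId) throws SQLException {
    con.setAutoCommit(false);
    try (CallableStatement lock = con.prepareCall("{? = call sp_getapplock(?, ?, ?, ?)}")) {
        lock.registerOutParameter(1, Types.INTEGER);
        lock.setString(2, "InsertOrders");   // @Resource
        lock.setString(3, "Exclusive");      // @LockMode
        lock.setString(4, "Transaction");    // @LockOwner: released when the transaction ends
        lock.setInt(5, 5000);                // @LockTimeout in milliseconds
        lock.execute();
        if (lock.getInt(1) < 0) {
            con.rollback();
            throw new SQLException("Could not acquire application lock");
        }
    }
    try (PreparedStatement ps = con.prepareStatement(
            "INSERT INTO Orders (CustomerId) VALUES (?)")) {
        ps.setInt(1, customerId);
        ps.executeUpdate();
    }
    con.commit();
}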
I have a situation where a single Oracle system is the data master for two separate CRM systems (PeopleSoft and Siebel). The Oracle system sends CRUD messages to BizTalk for customer data, inventory data, product info and product pricing. BizTalk formats and forwards the messages on to the PeopleSoft and Siebel web service interfaces for action. After the initial synchronization of the data, ongoing operation has created a situation where the data isn't accurate in the outlying Siebel and PeopleSoft systems despite successful delivery of the data (this is another conversation about what these systems mean when they return a 'Success' to BizTalk).
What do other similar implementations do to reconcile system data in this distributed service-oriented approach? Do they run a periodic dump from all systems for comparison? Are there any other techniques or methodologies for spotting failed updates and ensuring synchronization?
Your thoughts and experiences are appreciated. Thanks!
Additional Info
So why do the systems get out of sync? When a destination system acknowledges to BizTalk that it has received the message, that can mean many things. Sometimes an HTTP 200 means "I've got it, put it in a staging table, and will commit it in a bit" - sometimes that commit succeeds, sometimes it does not because of various data issues. Sometimes the HTTP 200 means "yes, I have received and committed the data." Using HTTP, there can also be issues with ordered delivery. All of these problems could have been solved with a lot of architectural planning up front; it was not done. There are no update/create timestamps to prevent out-of-order delivery from stepping on data. There is no full round-trip acknowledgement of data commit from the destination systems. All of this adds up to things getting out of sync.
(Sorry, this is an answer and not a comment; I'm working my way up to 50 points.)
Can the data be updated in the other systems or is it essentially read only?
Could you implement some further validation in the BizTalk layer to ensure that updates wouldn't fail because of data issues?
Can you grab any sort of notification that the update failed from the destination systems which would allow you to compensate in the BizTalk layer?
FWIW, in situations like this I have usually ended up with a central data store that contains at least the data keys from the three systems and acts as the new golden repository for the data; however, this is usually to compensate for multiple update sources. We also usually seem to operate some sort of manual error queue that users must maintain.
As for your idea of batch reconciliation, I have seen that be quite common as a way to compensate for transactional errors, especially in the financial services realm.
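To make the batch-reconciliation idea a bit more concrete, here is a minimal sketch: pull the key plus a last-modified marker from the master and from one destination system, then flag anything missing or stale (all record and field names here are invented for illustration):

import java.util.*;

record CustomerVersion(String customerId, long lastModified) {}

static List<String> findOutOfSync(Map<String, CustomerVersion> master,
                                  Map<String, CustomerVersion> destination) {
    List<String> discrepancies = new ArrayList<>();
    for (CustomerVersion m : master.values()) {
        CustomerVersion d = destination.get(m.customerId());
        if (d == null) {
            discrepancies.add(m.customerId() + ": missing in destination");
        } else if (d.lastModified() < m.lastModified()) {
            discrepancies.add(m.customerId() + ": destination copy is stale");
        }
    }
    return discrepancies;   // feed these into a manual error queue or a re-send process
}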