isolating database operations for integration tests - asp.net

I am using NHibernate and ASP.NET/MVC. I use one session per request to handle database operations. For integration testing I am looking for a way to have each test run in an isolated mode that will not change the database and interfere with other tests running in parallel. Something like a transaction that can be rolled back at the end of the test. The main challenge is that each test can make multiple requests. If one request changes data the next request must be able to see these changes etc.
I tried binding the session to the auth cookie to create child sessions for the following requests of a test. But that does not work well, as neither sessions nor transactions are threadsafe in NHibernate. (it results in trying to open multiple DataReaders on the same connection)
I also checked if TransactionScope could be a way, but could not figure out how to use it from multple threads/requests.
What could be a good way to make this happen?

I typically do this by operating on different data.
for example, say I have an integration test which checks a basket total for an e-commerce website.
I would go and create a new user, activate it, add some items to a basket, calculate the total, delete all created data and assert on whatever I need.
so the flow is : create the data you need, operate on it, check it, delete it. This way all the tests can run in parallel and don't interfere with each other, plus the data is always cleaned up.

Related

DDD: persisting domain objects into two databases. How many repositories should I use?

I need to persist my domain objects into two different databases. This use case is purely write-only. I don't need to read back from the databases.
Following Domain Driven Design, I typically create a repository for each aggregate root.
I see two alternatives. I can create one single repository for my AG, and implement it so that it persists the domain object into the two databases.
The second alternative is to create two repositories, one each for each database.
From a domain driven design perspective, which alternative is correct?
My requirement is that it must persist the AR in both databases - all or nothing. So if the first one goes through and the second fails, I would need to remove the AG from the first one.
If you had a transaction manager that were to span across those two databases, you would use that manager to automatically roll back all of the transactions if one of them fails. A transaction manager like that would necessarily add overhead to your writes, as it would have to ensure that all transactions succeeded, and while doing so, maintain a lock on the tables being written to.
If you consider what the transaction manager is doing, it is effectively writing to one database and ensuring that write is successful, writing to the next, and then committing those transactions. You could implement the same type of process using a two-phase commit process. Unfortunately, this can be complicated because the process of keeping two databases in sync is inherently complex.
You would use a process manager or saga to manage the process of ensuring that the databases are consistent:
Write to the first database and leave the record in a PENDING status (not visible to user reads).
Make a request to second database to write the record in a PENDING status.
Make a request to the first database to leave the record in a VALID status (visible to user reads).
Make a request to the second database to leave the record in a VALID status.
The issue with this approach is that the process can fail at any point. In this case, you would need to account for those failures. For example,
You can have a process that comes through and finds records in PENDING status that are older than X minutes and continues pushing them through the workflow.
You can can have a process that cleans up any PENDING records after X minutes and purges them from the database.
Ideally, you are using something like a queue based workflow that allows you to fire and forget these commands and a saga or process manager to identify and react to failures.
The second alternative is to create two repositories, one each for each database.
Based on the above, hopefully you can understand why this is the correct option.
If you don't need to write why don't build some sort of commands log?
The log acts as a queue, you write the operation in it, and two processes pulls new command from it and each one update a database, if you can accept that in worst case scenario the two dbs can have different version of the data, with the guarantees that eventually they will be consistent it seems to me much easier than does transactions spanning two different dbs.
I'm not sure how much DDD is your use case, as if you don't need to read back you don't have any state to manage, so no need for entities/aggregates

Designing an asynchronous task library for ASP.NET

The ASP.NET runtime is meant for short work loads that can be run in parallel. I need to be able to schedule periodic events and background tasks that may or may not run for much longer periods.
Given the above I have the following problems to deal with:
The AppDomain can shutdown due to changes (Web.config, bin, App_Code, etc.)
IIS recycles the AppPool on a regular basis (daily)
IIS itself might restart, or for that matter the server might crash
I'm not convinced that running this code inside ASP.NET is not the right thing to do, becuase it would allow for a simpler programming model. But doing so would require that an external service periodically makes requests to the app so that the application is keept running and that all background tasks are programmed with utter most care. They will have to be able to pause and resume thier work, in the event of an unexpected error.
My current line of thinking goes something like this:
If all jobs are registered in the database, it should be possible to use the database as a bookkeeping mechanism. In the case of an error, the database would contain all state necessary to resume the operation at the next opportunity given.
I'd really appriecate some feedback/advice, on this matter. I've been considering running a windows service and using some RPC solution as well, but it doesn't have the same appeal to me. And I'd instead have a lot of deployment issues and sycnhronizing tasks and code cross several applications. Due to my business needs this is less than optimial.
This is a shot in the dark since I don't know what database you use, but I'd recommend you to consider dialog timers and activation. Assuming that most of the jobs have to do some data manipulation, and is likely that all have to do only data manipulation, leveraging activation and timers give an extremely reliable job scheduling solution, entirely embedded in the database (no need for an external process/service, not dependencies outside the database bounds like msdb), and is a solution that ensures scheduled jobs can survive restarts, failover events and even disaster recovery restores. Simply put, once a job is scheduled it will run even if the database is restored one week later on a different machine.
Have a look at Asynchronous procedure execution for a related example.
And if this is too radical, at least have a look at Using Tables as Queues since storing the scheduled items in the database often falls under the 'pending queue' case.
I recommend that you have a look at Quartz.Net. It is open source and it will give you some ideas.
Using the database as a state-keeping mechanism is a completely valid idea. How complex it will be depends on how far you want to take it. In many cases you will ended up pairing your database logic with a Windows service to achieve the desired result.
FWIW, it is typically not a good practice to manually use the thread pool inside an ASP.Net application, though (contrary to what you may read) it actually works quite nicely other than the huge caveat that you can't guarantee it will work.
So if you needed a background thread that examined the state of some object every 30 seconds and you didn't care if it fired every 30 seconds or 29 seconds or 2 minutes (such as in a long app pool recycle), an ASP.Net-spawned thread is a quick and very dirty solution.
Asynchronously fired callbacks (such as on the ASP.Net Cache object) can also perform a sort of "behind the scenes" role.
I have faced similar challenges and ultimately opted for a Windows service that uses a combination of building blocks for maximum flexibility. Namely, I use:
1) WCF with implementation-specific types OR
2) Types that are meant to transport and manage objects that wrap a job OR
3) Completely generic, serializable objects contained in a custom wrapper. Since they are just a binary payload, this allows any object to be passed to the service. Once in the service, the wrapper defines what should happen to the object (e.g. invoke a method, gather a result, and optionally make that result available for return).
Ultimately, the web site is responsible for querying the service about its state. This querying can be as simple as polling or can use asynchronous callbacks with WCF (though I believe this also uses some sort of polling behind the scenes).
I tell you what I have do.
I have create a class called Atzenta that have a timer (1-2 second trigger).
I have also create a table on my temporary database that keep the jobs. The table knows the jobID, other parameters, priority, job status, messages.
I can add, or delete a job on this class. When there is no action to be done the timer is stop. When I add a job, then the timer starts again. (the timer is a thread by him self that can do parallel work). I use the System.Timers and not other timers for this.
The jobs can have different priority.
Now let say that I place a job on this table using the Atzenta class. The next time that the timer is trigger is check the query on this table and find the first available job and just run it. No other jobs run until this one is end.
Every synchronize and flags are done from the table. In the table I have flags for every job that show if its |wait to run|request to run|run|pause|finish|killed|
All jobs are all ready known functions or class (eg the creation of statistics).
For stop and start, I use the global.asax and the Application_Start, Application_End to start and pause the object that keep the tasks. For example when I do a job, and I get the Application_End ether I wait to finish and then stop the app, ether I stop the action, notify the table, and start again on application_start.
So I say, Atzenta.RunTheJob(Jobs.StatisticUpdate, ProductID); and then I add this job on table, open the timer, and then on trigger this job is run and I update the statistics for the given product id.
I use a table on a database to synchronize many pools that run the same web app and in fact its work that way. With a common table the synchronize of the jobs is easy and you avoid 2 pools to run the same job at the same time.
On my back office I have a simple table view to see the status of all jobs.

Dealing with Long running processes in ASP.NET

I would like to know the best way to deal with long running processes started on demand from an ASP.NET webpage.
The process may consist of various steps (like upload files to the server, run SSIS packages on them, execute some stored procedures etc.) and sometimes the process could take up to couple of hours to finish.
If I go for asynchronous execution using a WCF service, then what happens if the user closes the browser while the process is running, how the process success or failure result should be displayed to the user? To solve this, I choose one-way WCF service calls, but the problem with this is I need to create a process table and store the result (and error messages if it fails in any of the steps and which steps have completed successfully) in that table which is an additional overhead because there are many such processes with various steps that the user can invoke from the web page and user needs to be made aware of the progress (in simplest case, the status can be "process xyz running") and once it is done, the output needs to be displayed to the user (for example by running a stored procedure).
What is the best way to design the solution for this?
As I see it, you have three options
Have a long running page where the user waits for the response. If this is several hours, you're going to have many usability problems, so I wouldn't even consider it.
Create a process table to store the results of operations. Run service functions asynchronously and delegate logging the results to the service. There can be a page that the user refreshes which gets the latest results of this table.
If you really don't want to create a table, then store all the current process details in the users' session state, and have a current processes page as above. You have the possible issue that the session might timeout, or the web app might restart and you'll lose all this.
I can't see that number 2 is such a great hardship. You could make the table fairly generic to encompass all types of processes: process details could just be encoded as binary or xml and interpreted by the web application. You then have the most robust solution.
I cant say what the best way would be but using Windows Workflow Foundation for such long running processes is definitely one way to go about it.
You can do tracking of the process to see what stage it is at, even persist it if you have steps where it is awaiting user input etc.
WF provides a lot of features out of the box (especially if your storage medium is SQL Server) and may be a good option to consider.
http://www.codeproject.com/KB/WF/WF4Extensions.aspx might help give you some more insight into the same.
I think you are in the right track. You should run the process asynchronously, store the execution somewhere (a table), and keep status of the running process in there.
Your user should see a pending display label while the process is executing, and a finished label with the result when the process finished. If the user closed the browser, she will see the result of her running process next time she logs in.

Listing Currently Running Workflows in .Net 4.0

I've got a .Net 4.0 Workflow application hosted in WCF that takes a request to process some information. This information is passed to a secondary system via a web service and returns a bool indicating that its going to process that information.
My workflow then loops, sleeping for 5 minutes and then querying the secondary system to see if processing of the information is complete.
When its complete the workflow finishes.
I have this persisting in SQL, and it works perfectly.
My question is how do I retrieve a list of the persisted workflows in such a way that I can tie them back to the original request? I'd like my UI to be able to list the running workflows in a grid with the elapsed time that they've been run.
I've thought about storing the workflow GUID in my primary DB and generating the list that way, but what I'd really like is to be able to reconcile what I think is running, and what the persistant store thinks is running.
I'd also like to be able to select a running workflow and kill it off or completely restart it if the user determines that its gone screwy.
You can promote data from the workflow using the SqlWorkflowInstanceStore. The result is they are stored alongside the workflow data in the InstancesTable using the InstancePromotedPropertiesTable. Using the InstancePromotedProperties view is the easiest way of querying you data.
This blog post will show you the code you need.
Another option, use the WorkflowRuntime GetAllServices().
Then you can loop through each one to pull out the data you need. I would cache the results, given this may be an expensive operation. If you have only 100 or less running workflows, and only a few users on your page, don't bother caching.
This way you don't have to create a DAL or Repo layer. Especially if you are using sql for persistence.
http://msdn.microsoft.com/en-us/library/ms594874(v=vs.100).aspx

Does the concept of shared sessions exist in ASP.NET?

I am working on a web application (ASP.NET) game that would consist of a single page, and on that page, there would be a game board akin to Monopoly. I am trying to determine what the best architectural approach would be. The main requirements I have identified thus far are:
Up to six users share a single game state object.
The users need to keep (relatively) up to date on the current state of the game, i.e. whose turn it is, what did the active user just roll, how much money does each other user have, etc.
I have thought about keeping the game state in a database, but it seems like overkill to keep updating the database when a game state object (say, in a cache) could be kept up to date. For example, the flow might go like this:
Receive request for data from a user.
Look up data in database. Create object from that data.
Verify user has permissions to perform request based on the game's state (i.e. make sure it's really their turn or have enough money to buy that property).
Update the game object.
Write the game object back to the database.
Repeat for every single request.
Consider that a single server would be serving several concurrent games.
I have thought about using AJAX to make requests to an an ASP.NET page.
I have thought about using AJAX requests to a web service using silverlight.
I have thought about using WCF duplex channels in silverlight.
I can't figure out what the best approach is. All seem to have their drawbacks. Does anyone out there have experience with this sort of thing and care to share those experiences? Feel free to ask your own questions if I am being too ambiguous! Thanks.
Update: Does anyone have any suggestions for how to implement this connection to the server based on the three options I mention above?
You could use the ASP.Net Cache or the Application state to store the game object since these are shared between users. The cache would probably be the best place since objects can be removed from it to save memory.
If you store the game object in cache using a unique key you can then store the key in each visitors Session and use this to retrieve the shared game object. If the cache has been cleared you will recreate the object from the database.
While updating a database seems like overkill, it has advantages when it comes time to scale up, as you can have multiple webheads talking to one backend.
A larger concern is how you communicate the game state to the clients. While a full update of the game state from time to time ensures that any changes are caught and all clients remain in synchronization even if they miss a message, gamestate is often quite large.
Consider as well that usually you want gamestate messages to trigger animations or other display updates to portray the action (for example, of a piece moves, it shouldn't just appear at the destination in most cases... it should move across the board).
Because of that, one solution that combines the best of both worlds is to keep a database that collects all of the actions performed in a table, with sequential IDs. When a client requests an update, it can give all the actions after the last one it knew about, and the client can "act out" the moves. This means even if an request fails, it can simply retry the request and none of the actions will be lost.
The server can then maintain an internal view of the gamestate as well, from the same data. It can also reject illegal actions and prevent them from entering the game action table (and thus prevent other clients from being incorrectly updated).
Finally, because the server does have the "one true" gamestate, the clients can periodically check against that (which will allow you to find errors in your client or server code). Because the server database should be considered the primary, you can retransmit the entire gamestate to any client that gets incorrect state, so minor client errors won't (potentially) ruin the experience (except perhaps a pause while the state is downloaded).
Why don't you just create an application level object to store your details. See Application State and Global Variables in ASP.NET for details. You can use the sessionID to act as a key for the data for each player.
You could also use the Cache to do the same thing using a long time out. This does have the advantage that older data could be flushed from the Cache after a period of time ie 6 hours or whatever.

Resources