How to automate functional/integration tests and database rollbacks - asp.net

In contrast to my previous question, I'll try to give my requirements.
I am trying to find some framework/methodology/"thing" that would fit the following:
Ability to write an automated test, preferably written in Visual Studio, using C#.
Test should drive a web browser and interact with the SUT just like a user would.
Test should be able to setup a test scenario in DB.
Test should be able to assert that user interactions had the expected effect in DB.
After the test is completed, it should be able to roll back all changes it made in the DB.
My first attempt was to use an NUnit test to drive Selenium (and WatiN before that), but I faced a bit of a problem (check the link above) while using TransactionScope to roll back the changes the Selenium-driven browser made in the DB.
Has anyone done anything like this in the "real world"? I've found some references through Google, but haven't been able to find any concrete examples of how to implement this. There wouldn't be any problem if I were doing unit testing; in that case TransactionScope would be quite enough.
Edit: R. Harvey pointed me to this question, which is almost identical to my situation.
However, that question is only almost identical. My application is part of a family of services, all of them accessing the same set of database tables. The amount of test data required does not allow for efficient use of drop/create scripts, so is there some alternative solution for this?
We are using SQL Server 2005, and I'm not very proficient in database magic, so if there's some way to use SQL scripting other than drop/create, that could be an option.
Edit 2:
Based on the answers and some additional head scratching, we'll go for more lightweight databases for developers to perform unit, integration and functional testing. This enables us to use SQL scripts for setting up and tearing down each test.
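For illustration, a rough sketch of what such a script-driven test fixture might look like with NUnit (the script names, connection string and schema below are invented):

    // Minimal sketch: run hand-written SQL scripts around each test.
    // The connection string, script paths and table names are hypothetical.
    using System.Data.SqlClient;
    using System.IO;
    using NUnit.Framework;

    [TestFixture]
    public class CheckoutScenarioTests
    {
        private const string ConnectionString =
            @"Data Source=.\SQLEXPRESS;Initial Catalog=ShopTest;Integrated Security=True";

        [SetUp]
        public void SetUpScenario()
        {
            // Load the scenario data the test expects to find.
            RunScript("Scripts/setup-checkout-scenario.sql");
        }

        [TearDown]
        public void TearDownScenario()
        {
            // Remove everything the scenario (and the test) created.
            RunScript("Scripts/teardown-checkout-scenario.sql");
        }

        [Test]
        public void PlacingAnOrderWritesAnOrderRow()
        {
            // ... drive the browser with Selenium here ...
            // then assert against the database:
            using (var connection = new SqlConnection(ConnectionString))
            using (var command = new SqlCommand(
                "SELECT COUNT(*) FROM Orders WHERE CustomerId = 42", connection))
            {
                connection.Open();
                Assert.AreEqual(1, (int)command.ExecuteScalar());
            }
        }

        private static void RunScript(string path)
        {
            // Assumes the script contains no GO batch separators.
            using (var connection = new SqlConnection(ConnectionString))
            using (var command = new SqlCommand(File.ReadAllText(path), connection))
            {
                connection.Open();
                command.ExecuteNonQuery();
            }
        }
    }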

Changes made in a transaction are only visible inside said transaction. Also, wrapping the test in a transaction scope (if possible) would make the test behave differently from the real thing in a very critical aspect (transactions).
It is much better to use a database image that you restore before every test suite. This way after the suite completes and the verification is done, you drop the test database. The next run, during the suite setup, the database is re-created from the saved image in a pristine state ready for testing. Even better would be to have a script that deploys the database from scratch and run that script during suite setup.
Btw, it is not feasible to restore to a pristine state before every test. More generally, it is not feasible to have lengthy individual test setup and cleanup steps. As you add more tests, the time spent restoring the database to a test-ready condition between tests becomes unmanageable. Suites with hundreds of tests are quite common, and full test runs of tens of thousands of tests would mean hours and hours spent just restoring the database for tests. Design your individual tests so that they can be run independently, i.e. test N has to produce valid results even if test N-1 failed.
Another thing to consider is failure investigation: you want a failed test to leave the database in a state that can be investigated for meaningful info, and you want subsequent tests to be able to run and produce valid results. Sometimes these requirements will contradict each other, but you must take them into consideration and design your tests around them.
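To illustrate the suite-level restore idea, a rough sketch using NUnit and a plain RESTORE DATABASE command (the database name, backup path and connection string are assumptions, the database is assumed to already exist, and the attribute names are NUnit 3's):

    // Rough sketch: restore the test database from a known-good backup once per suite.
    using System.Data.SqlClient;
    using NUnit.Framework;

    [SetUpFixture]
    public class TestDatabaseSetup
    {
        [OneTimeSetUp]
        public void RestorePristineDatabase()
        {
            // Connect to master, not to the database being restored.
            using (var connection = new SqlConnection(
                @"Data Source=.\SQLEXPRESS;Initial Catalog=master;Integrated Security=True"))
            {
                connection.Open();
                var restore = new SqlCommand(@"
                    ALTER DATABASE ShopTest SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
                    RESTORE DATABASE ShopTest
                        FROM DISK = 'C:\Backups\ShopTest_pristine.bak'
                        WITH REPLACE;
                    ALTER DATABASE ShopTest SET MULTI_USER;", connection);
                restore.CommandTimeout = 0;   // restores can take a while
                restore.ExecuteNonQuery();
            }
        }
    }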

If the amount of data required to restore the database to a known-good state makes drop/create scripts prohibitive and you are running your tests on the Developer or Enterprise edition of SQL Server 2005, you could look into creating a database snapshot of the good state and reverting to it before each test. This is considerably faster than a full restore, although it may still be too time consuming if you have hundreds of tests.
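As a rough sketch of what that could look like (the database name, logical file name and snapshot path are invented; snapshots require the Developer or Enterprise edition):

    // Sketch: create a snapshot of the known-good state once, revert to it before each test.
    using System.Data.SqlClient;

    public static class DatabaseSnapshot
    {
        private const string MasterConnection =
            @"Data Source=.\SQLEXPRESS;Initial Catalog=master;Integrated Security=True";

        // Run once, against the known-good database.
        public static void Create()
        {
            Execute(@"
                CREATE DATABASE ShopTest_Snapshot
                    ON (NAME = ShopTest_Data, FILENAME = 'C:\Snapshots\ShopTest.ss')
                    AS SNAPSHOT OF ShopTest;");
        }

        // Run in each test's setup: reverting is much faster than a full restore.
        public static void RevertToSnapshot()
        {
            Execute(@"
                ALTER DATABASE ShopTest SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
                RESTORE DATABASE ShopTest FROM DATABASE_SNAPSHOT = 'ShopTest_Snapshot';
                ALTER DATABASE ShopTest SET MULTI_USER;");
        }

        private static void Execute(string sql)
        {
            using (var connection = new SqlConnection(MasterConnection))
            using (var command = new SqlCommand(sql, connection))
            {
                connection.Open();
                command.ExecuteNonQuery();
            }
        }
    }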

Don't miss Amnesia, which I recommended on this related question.

Related

How Do I Copy Corda Production Env To Dev For Debugging?

Very often in enterprise applications, something doesn't work as expected and you need to debug and create a fix.
Obviously you can't test in production, as you might have to save something in order to debug it, and you don't want to be responsible for accidentally sending a $1M transaction!
With traditional applications, this process is done by copying the database from production to a dev environment (maybe redacting sensitive data) and duplicating and debugging the problem there.
In Corda you have multiple nodes involved, the nodes have specific keys and the network has a truststore hierarchy.
What is the process to replicate the production structure and copy all the data from production to development in order to debug?
I think it depends on how complicated your setup is.
The easy way to do this rigorously is within a MockNetwork during unit testing (this is the most common setup; example here: https://github.com/corda/samples-kotlin/blob/master/Advanced/obligation-cordapp/workflows/src/test/kotlin/net/corda/samples/obligation/flow/IOUSettleFlowTests.kt).
Something I like to do a lot is to use IntelliJ breakpoints in the flow / unit tests in order to be sure something works the way I expect.
Another way to do it is potentially using the testnet (again, depends on your use case): https://docs.corda.net/docs/corda-os/4.7/corda-testnet-intro.html
Another way to do this is to write a script that performs all of the transactions you want the nodes to do while running them locally on your machine, using the Corda shell on each of the local nodes and feeding the transactions in directly that way.
Copying data from production to apply to a local network is going to be hard, because you can't fake all the transaction/state history without a lot of really painful editing of all the tables on each node.

Should we have separate database instance for each developer?

What is the best way of developing a database-based application? We can have two approaches.
One common database for all the developers.
A separate database for each developer.
What are the pros and cons of each? And which one is better way?
Edit: More than one developer is supposed to update the database, and we already have SQL Server Express 2005 on each developer machine.
Edit: Most of us are suggesting a common database. However, if one of the devs has modified the code and the database schema, and he has not committed the code changes but the schema changes have gone to the common database, will that not possibly break the other developers' code?
Both -
I like a single database that changes are tested on before going live, or going to a 'formal' test environment. This is your developers' sanity check; it stays up to date with the live system and it makes sure they always consider each other's changes. The rule should be that changes don't go on here if they might break something else.
A database per developer is great (even essential) when more than one developer is making updates. It allows them all the development flexibility they want without breaking things for other developers.
The key is to have a process for moving database changes from development through to your live system, and stick to your process.
Shared database
Simpler
Fewer cases of "It works on my machine".
Forces integration
Issues are found quickly (fail fast)
Individual databases
Changes never affect other developers, but this is also a bad thing for continuous integration.
We use a shared development database and it works out nicely. Our schema rarely changes in a way that makes it backwards incompatible, but occasionally a design change will occur before we go live, and we simply ask the other developers to update.
We do have separate development application (web) servers, but they share the same database. Our developers do have the option to use their own database, as they know how to set this up, and will do that on occasion, but only temporarily. The norm, for us, is to share the database.
Thought I'd throw this out there, but why not let every developer host their own instance of SQL Server Developer on their desktops and then have a shared server for each of the other environments (development, QA, and prod)? I think even the basic MSDN that comes with Visual Studio Pro (if you opt for it) includes a license for SQL Server Developer.
The developer can work on their desktop without impacting the others and then you can have them move the code to the next shared environment as you see fit (at will, with daily/weekly builds, etc.).
EDIT:
I should add that the desktop instance allows developers to do things that the DBAs often restrict on shared environments. This includes database creation, backup/restore, profiler, etc. These things are not essential, but they allow the developer to become much more productive while reducing the demands they make on your DBAs.
The shared environment is completely necessary for testing - I would not recommend going from desktop to production. But you can add so much by allowing the developers to have 100% control over a given database environment (including isolation from others) with a relatively minor cost.
Depends on your development, testing and maintenance cycles. Also on the size and location of the development team (and of course organization). If you support several versions of the database you might need even more environments.
In the real world I found the following approach rather satisfying:
single central database/application for testing purposes, gets all the changes by various developers periodically merged into it
local copies for development (so you are free to drop and reload the whole database)
upgrade scripts are maintained for any changes to schema, auxiliary and sample data sets
Here are some further points:
If two developers (two teams) are working on changes that can affect each other then they should complete their tasks independently and then integrate/merge and test. For this it is much better to have separate development environments (unless they have to work together in which case I consider them to be a part of the same team; still they can work on their own copies of the database and share it if necessary)
If they work on the changes that do not influence each other they could work on the main server. Or on their own local copies of the database.
So, developing on the local copy has all the benefits with no risk in a general case (when you support multiple versions of the system and maintain upgrade scripts anyway).
Still it is great if you can share test cases so ability to dump/restore the database easily and quickly is a big plus.
EDIT:
All of the above assume that having a copy on the local machine of the whole system for testing purposes is feasible (size, performance, licenses, etc).
I would opt for solution #1 : One common database for all the developers.
Pros
Less expensive for the infrastructure;
Only one dump is required when it's time to refresh the development database;
Everyone develops with the same data, so it closely represents the production environment;
Cons
If one developer performs a bad operation, this could impact a larger amount of developers.
As for solution #2: One independent database for each of the developers;
Pros
This could be useful for new features developments, when development requires isolation;
Cons
More expensive for the company (infrastructure, licences...);
Multiplication of problems caused by overly isolated development environments (works in the developer's environment, not integrated);
Multiplication of dumps by the DBAs of the same copy from the production environment.
Considering the above, I would recommend, depending on your company size:
One database for development;
One database for testing the integration;
One database for acceptance tests;
One for new feature development that will perhaps require integration tests.
If your company doesn't require integration tests, then go with acceptance tests, this step is crucial before going to production.
One per developer plus a continuous integration and build server to run unit and integration tests. That gives you the best of both worlds.
Having all developers modify a single dev database quickly becomes less productive once the amount of database change reaches a certain level, because it forces a developer to deploy changes to the shared database before he is ready to check in, which means other parts of the code line may break unnecessarily.
Simple answer:
Have one development database, and if the developers want their own, they can just run their own instance on their own machines. Just be sure to test/publish on the shared.
We do both:
We use code generation where I'm at and our database is generated as well. So we have an instance on each developer's box where the database is generated. Then we use the scripts that are generated to apply the changes to a central test database. If that goes well we apply the changes to the production database during a release.
What's nice with this approach is that when our "source of truth" is checked in to source control, all the database changes are automatically distributed to the other developers when they rebase and regenerate. It works well for us.
The best way is single database on Test/QA server and one database (probably on developer's local computer) for each developer (so, 10 developers work with 10 + 1 databases).
The same approach as for general development: each developer has their own copy of the source code on the local machine.
Also, the multiple-database approach simplifies keeping the database schema in version control systems. We keep the database creation scripts in SVN.
We are using the approach, described here:
http://www.sqlaccessories.com/Howto/Version_Control.aspx
You might also want to look at Refactoring Databases. Aside from discussing database changes, the authors include discussions on going from development to production in a way that reduces risk.
Why on earth would you want a separate database for all developers?
Have one common database for all; that way the table structure is consistent and the SQL statements are as well.
The biggest problems with developers having their own databases are:
First, it is unlikely to be the size of the real production database (if you take all the databases we need to work with here, they would take up several hundred gigabytes of space, which I don't have available on my machine); this causes bad code to be written that will never work on a large database for performance reasons. SQL code should never be written against a data set significantly smaller than the one on prod.
Second, developers who use their own database create problems when they spend a long time developing something and then find out only after they merge with a real database that it affects something else. You find this stuff much faster when you share the environment, so in the end there is less wasted development time.
Third, developers working on related things need to know about the changes you are making; it will affect their change.
When you know you are going to affect others, I think you tend to be more careful about what you do, which is a plus in my book.
Now the shared database server should have what we call a scratch database, a place where people can create and test table changes. So if they are doing something that might need to drop and recreate a table (which should be a rare case!), they can test the process first by copying the table to the scratch database and running their process there, and then change over to the real database when they are sure it works. Or we often copy a backup table to the scratch database before testing a particular change, so we can easily recreate the old data if it goes bad.
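For illustration only, the kind of helper you could script for that copy step (the server, database and table names are made up):

    // Sketch: copy a table into the scratch database before testing a risky change,
    // so the old data can be recreated if the change goes bad.
    using System.Data.SqlClient;

    public static class ScratchDb
    {
        public static void BackUpOrdersToScratch()
        {
            using (var connection = new SqlConnection(
                @"Data Source=DevServer;Initial Catalog=Scratch;Integrated Security=True"))
            using (var command = new SqlCommand(
                "SELECT * INTO Scratch.dbo.Orders_backup FROM SharedDev.dbo.Orders;", connection))
            {
                connection.Open();
                command.ExecuteNonQuery();
            }
        }
    }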
I see no advantages at all to using individual databases.

Web application: Acceptance testing: Initial state for a test and test isolation?

Greetings,
I am currently exploring some extreme programming and try to stick as much to it as possible. This means, I will need to turn my (by now, unexpectedly thick stack of) user stories into acceptance tests once I begin an iteration (after planning the release, of course).
I am not entirely sure about the implementation language I am going to use, however, I am sure that this is going to be a dynamic web application with a database backend, served by a webserver. Right now, I plan to develop the first release on a local machine with a local testing environment, so it is possible to assume that security is no concern on the acceptance tests (so, I can give the acceptance tests root access to the testing database involved, for example). I am still a bit unsure about the acceptance test framework to use, however, since this is going to be a web application, I think I will use Selenium RC in order to write the tests and run them (I mention this in case someone is able to point me to something better :) ).
However, there is still a dark area left: I do not have data for this application yet, because I am implementing a new, fresh application. Thus, I cannot grab a snapshot of the current production database in order to build a test database, and additionally, the application is stateful (as any web application with a database backend is), so using a single database for all acceptance tests is going to cause ugly problems with regard to test isolation (and at least for unit tests, that reads as "This can result in great fun and lots of gray hair").
So, how do I solve this problem? Do I create artificial testing databases (and maintain them whenever the database schema changes) and write the acceptance tests such that each acceptance test loads the appropriate database state into the testing database before running the test? (How fast or slow will it be to load a dozen records a hundred times, when a lot of acceptance tests run?) Should I create a single example database, load this for all tests and hope for the best? Should I recreate the test data I need all the time in the acceptance tests? Or, how do people do this?
According to further research, the proper way to do this is to bring the database into a defined state using the appropriate setUp methods. This potentially involves deleting all existing data in the tables, adding a certain test set of data to the tables and then running the test on exactly this data. Afterwards, the tearDown method cleans up whatever was done to the tables (either setUp drops everything, or tearDown drops everything again). There are tools like dbUnit to simplify this process. This results in some reduction of testing speed; however, it establishes total isolation of tests, which is a good thing, because then green simply means green and red simply means red, and not "Given the current order of test execution, this works".
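Since no implementation language has been chosen yet, here is a rough C# sketch of that setUp/tearDown pattern, purely for illustration (table names and data are invented; a tool like dbUnit gives you the same thing with data sets defined in files):

    // Sketch of the "defined state" pattern: wipe the affected tables, insert a
    // known data set, run the test, then wipe again afterwards.
    using System.Data.SqlClient;
    using NUnit.Framework;

    [TestFixture]
    public class RegistrationAcceptanceTests
    {
        private const string ConnectionString =
            @"Data Source=.\SQLEXPRESS;Initial Catalog=AppTest;Integrated Security=True";

        [SetUp]
        public void BringDatabaseIntoDefinedState()
        {
            Execute("DELETE FROM Registrations; DELETE FROM Users;");
            Execute("INSERT INTO Users (Id, Name) VALUES (1, 'existing-user');");
        }

        [TearDown]
        public void CleanUp()
        {
            // Drop everything again so the next test starts from the same point,
            // regardless of what this test did.
            Execute("DELETE FROM Registrations; DELETE FROM Users;");
        }

        [Test]
        public void RegisteringADuplicateNameIsRejected()
        {
            // ... drive the browser (e.g. via Selenium RC) and assert on the result ...
        }

        private static void Execute(string sql)
        {
            using (var connection = new SqlConnection(ConnectionString))
            using (var command = new SqlCommand(sql, connection))
            {
                connection.Open();
                command.ExecuteNonQuery();
            }
        }
    }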
Besides that, the speed issue will probably be less significant for me, as I can focus on a small number of tests while developing code for a single user story and have my CI server run all the tests (which then takes more time) in the background when I think I am done.

batch manipulations for online web app

A customer has a web-based inventory management system. The system is proprietary and complicated: it has around 100 tables in the DB with complex relationships between them, and it holds ~1,500,000 items.
The customer is doing some reorganisation of his processes and now needs to make massive updates and manipulations to the data (only data changes, no structural changes). The online screens do not permit such work, since they were designed at the beginning without this requirement in mind.
The database is MS SQL Server 2005, and the application is an ASP.NET application running on IIS.
One solution is to build new screens for him where he could visualize the data in grids and do the required job on a large number of records. This would permit us to use the already existing functions that deal with single items (we just need to implement a loop). At this moment the customer is aware of 2 kinds of such massive manipulations he wants to do, but says there will be others. This will require design, coding, and testing every time we have a request.
However, the customer's needs are urgent because of some regulatory requirements, so I am wondering if it would be more efficient to use some kind of mapping between MSSQL and Excel or Access to expose the needed information, make the changes in Excel or Access, then save them back to the DB, maybe using SSIS to do this.
I am not familiar with SSIS or other technologies that do such things, which is why I am not able to judge whether the second solution is indeed efficient and better than the first. Of course the second solution will require some work and testing, but will it be quicker and less expensive?
The other question is: are there any other ways to do this?
Any ideas will be greatly appreciated.
Either way, you are going to need testing.
Say you export 40,000 products to Excel, he reorganizes them and then you bring them back into a staging table (or tables) and apply the changes to your SQL table(s). Since Excel is basically a freeform system, what happens if he introduces invalid situations? Your update will need to detect it, fail and roll back, or handle it in some specified way.
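As a rough sketch of that validate-then-apply idea (table and column names are made up):

    // Sketch: import the Excel output into a staging table first, validate it,
    // and only apply it inside a transaction if validation passes.
    using System;
    using System.Data.SqlClient;

    public static class StagedUpdate
    {
        public static void Apply(string connectionString)
        {
            using (var connection = new SqlConnection(connectionString))
            {
                connection.Open();
                using (var transaction = connection.BeginTransaction())
                {
                    // 1. Validate: e.g. reject staged rows that reference unknown items.
                    var invalid = (int)new SqlCommand(@"
                        SELECT COUNT(*) FROM Staging_ItemUpdates s
                        LEFT JOIN Items i ON i.ItemId = s.ItemId
                        WHERE i.ItemId IS NULL;", connection, transaction).ExecuteScalar();

                    if (invalid > 0)
                    {
                        transaction.Rollback();
                        throw new InvalidOperationException(
                            invalid + " staged rows reference unknown items.");
                    }

                    // 2. Apply the staged changes.
                    new SqlCommand(@"
                        UPDATE i SET i.CategoryId = s.NewCategoryId
                        FROM Items i
                        JOIN Staging_ItemUpdates s ON s.ItemId = i.ItemId;",
                        connection, transaction).ExecuteNonQuery();

                    transaction.Commit();
                }
            }
        }
    }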
Anyway, both your suggestions can be made workable.
Personally, for large changes like this, I prefer to have an experienced database developer develop the changes in straight SQL (either hardcoded or table-driven), test it on production data in a test environment (doing a table compare between before and after) and deploy the same script to production. This also allows the use of the existing stored procedures (you are using SPs to enforce a consistent interface to this complex database, right?) so that basic database logic already in place is simply re-used.
I doubt Excel will be able to deal with 1.5 million rows.
When you say visualise data in grids - how will your customer make changes? Manually, or is there some automation behind it? I would strongly encourage automation (since you know about only 2 types of changes at the moment). Maybe even a simple standalone "converter" application - don't make it part of the main program - it will be too tempting for them in the future to manually edit data straight in the DB tables.
Here is a strategy that I think will get you from A to B in the shortest amount of time.
one solution is to build for him new screens where he could visualize the data in grids and do the required job on a large amount of records.
It's rarely a good idea to build an interface into the main system that you will only use once or twice. It takes extra time and you'll probably spend more time maintaining it than using it.
This will permit us to use the already existing functions that deal with single items (we just need to implement a loop)
Hack together your own crappy little interface in a .NET Application, whose sole purpose is to fulfill this one task. Keep it around in your "stuff I might use later" folder.
Since you're dealing with such massive amounts of data, make sure you're not running your app from a remote location.
Obtain a copy of SQL 2005 and install it on a virtualization layer. Copy your production database over to this virtualized SQL Server. Take a snapshot of your virtualized copy before you begin testing. Write and test your app against this virtualized copy. Roll back to your original snapshot each time you test. Keep changing your code, testing, and rolling back until your app can flawlessly perform the desired changes.
Then, when the time comes for you to change the production database, you can sit back and relax while your app does all of the changes. Since the process will likely take a while, add some logging so you can check the status as it runs.
Oh yeah, make sure you have a fresh backup before you run your big update.

WatiN test data reset/clean up

I'm wondering how people are currently resetting their data / cleaning up test remnants for their WatiN/Watir tests?
For example, let's say there's a test to add a user into the system and the username has to be unique. Obviously the first run without any users should work fine, but the second run will fail without manual intervention.
There are a couple of strategies that you could use for this. I am assuming that you are using WatiN, with NUnit or VS unit tests to run your tests.
Use transactions
An approach that is used when unit testing is to "wrap" the whole test in a transaction and, at the completion of the test, roll the transaction back. In .NET you can use System.Transactions for this.
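A minimal sketch of that pattern (note that this only helps when the data access enlists in the ambient transaction, i.e. runs in the test's own process):

    // Sketch: wrap each test in an ambient transaction and never complete it,
    // so everything is rolled back when the scope is disposed.
    using System.Transactions;
    using NUnit.Framework;

    [TestFixture]
    public class AddUserTests
    {
        private TransactionScope _scope;

        [SetUp]
        public void BeginTransaction()
        {
            _scope = new TransactionScope();
        }

        [TearDown]
        public void RollBack()
        {
            // Disposing without calling Complete() rolls everything back.
            _scope.Dispose();
        }

        [Test]
        public void AddingAUserInsertsARow()
        {
            // ... exercise the code under test and assert against the database ...
        }
    }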
Build a "stub page"
Build a page in your application that uses the existing business logic to delete your data. This page would need to be secured and ideally not even deployed to production.
This is the approach that I would recommend.
Call a web service
Develop a web service, or call one directly from the app tier of the application, to perform the delete. You will probably need to develop this as well.
Clean up directly
Build some classes in your test code to access the data and clean it up.
With any of these you will need to clean up before and after you run your test, i.e. in the test setup and test cleanup methods. The reason to do it twice is that you should assume that your test has failed and not cleaned up properly.
Use LINQ to SQL. AFAIK, if you are using LINQ to SQL, it works in memory and wraps the whole update in a transaction for you automatically. If you simply don't call the SubmitChanges() method then you should be fine, but I haven't tested this myself.
I have asked a developer to make a script that will reset the database. After a suite of tests, I just call that script and start from a clean database.
Mike - your question isn't unique for Watir/WatiN. It applies for any UI testing, so search around for similar solutions for Selenium, Windmill, and even headless integration tests (HtmlUnit, API tests, etc). I've answered this question a couple times personally on StackOverflow.
WatiN is for UI testing.
In order to test the scenario you are looking for, you can generate the user id in C# code so that it is unique on each run (as opposed to the fixed value stored when you created the test).
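For example, something along these lines (purely illustrative):

    using System;

    public static class TestData
    {
        // Build a username that is unique on every run,
        // instead of the fixed value recorded with the test.
        public static string UniqueUsername()
        {
            return "testuser_" + Guid.NewGuid().ToString("N").Substring(0, 8);
        }
    }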
