How to maintain SQL Server DB with lots of branches

I have an ASP.NET project under git where we follow the convention of using a branch for a feature. We just started using SQL Server Data Tools to manage schema changes (quite new to it, so I suspect it may have features that get me to what I need).
I am looking for some strategies that have worked for other teams that manage switching between branches that have different DB schemas and then successfully merging branches together. Ideally, after merging all the features, I would have implicitly created a change script(s) to deploy for the release to production.
Note: I am using SQL Server 2008 R2.

There are multiple parts to this strategy. One aspect is the handling of the storage of the different branches, and what has worked well for my teams has been to use different SQL Server instances for each branch (rather than naming individual databases with branch-specific prefixes or suffixes, e.g., MyDatabase_FeatureBranchX, which can get out of hand). This enables the corresponding database(s) in each branch to have the same names (for clarity) but also allows for physical and logical isolation of a given branch's SQL resources (data files, access permissions, etc.).
As for the second, more interesting aspect (which I think is the main intent of your question), you might consider utilizing a code-based "migrations" approach -- e.g., using FluentMigrator or the like. Provided that you've got a standard baseline schema from which each branch was initially created, you can create the appropriate migrations in code as part of your feature development in each branch (and apply them to that branch's SQL instance). When it comes time to merge the branch into trunk, you'd also be merging and then applying that branch's migrations.
At best, this means that you could simply run the migration tool against your trunk instance after the merge, in order to apply all the branch's migrations, since tools like this automatically keep track of which migrations have been applied (via a custom database table) and do not reapply them. Provided that you're also doing periodic merges of your trunk code (including its migrations) into your feature branch throughout its development, and you're applying those migrations, you would also be ensuring that your feature branch's schema is being kept up to date, which minimizes the nasty surprises at merge time.
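The "tracks which migrations have been applied" behaviour can be sketched roughly as follows. This is a toy model in Python with SQLite, not FluentMigrator's actual implementation; the `version_info` table name, the `migrate` function, and the sample migrations are all hypothetical.

```python
import sqlite3

# Hypothetical migrations: (id, SQL). Real tools define these in code or files.
MIGRATIONS = [
    (201210241033, "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)"),
    (201210250905, "ALTER TABLE customer ADD COLUMN email TEXT"),
]

def migrate(conn, migrations):
    """Apply migrations not yet recorded in the tracking table; return their IDs."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS version_info (version INTEGER PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM version_info")}
    newly_applied = []
    for version, sql in sorted(migrations):
        if version in applied:
            continue  # already applied on this database, so skip it
        conn.execute(sql)
        conn.execute("INSERT INTO version_info (version) VALUES (?)", (version,))
        newly_applied.append(version)
    conn.commit()
    return newly_applied

conn = sqlite3.connect(":memory:")
first_run = migrate(conn, MIGRATIONS[:1])  # trunk before the merge
second_run = migrate(conn, MIGRATIONS)     # after the merge: only the new one runs
third_run = migrate(conn, MIGRATIONS)      # nothing left to apply
```

Running the same tool against trunk after a merge therefore only applies what the branch added, which is the property the workflow above relies on.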
When it comes time to deploy your trunk to production, these same migrations would be applied once again. FluentMigrator offers various runners: a console application, NAnt, MSBuild, and Rake.
I would highly recommend using a timestamp-based (e.g., 201210241033) migration ID strategy, rather than simple sequential integers (1, 2, ...), to minimize the likelihood of collisions and changes being applied out of the intended sequence.
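As a rough illustration of why timestamp IDs collide less often than sequential integers, here is a small Python sketch. It is not part of any migration tool; the function name and the two branch timestamps are made up for the example.

```python
from datetime import datetime

def timestamp_migration_id(moment: datetime) -> int:
    """Build a migration ID such as 201210241033 (yyyyMMddHHmm)."""
    return int(moment.strftime("%Y%m%d%H%M"))

# Hypothetical scenario: two feature branches each add one migration.
# With sequential integers, both branches pick the next free number, 3.
sequential_a, sequential_b = 3, 3
sequential_collides = sequential_a == sequential_b

# With timestamps, the IDs differ unless both were created in the same minute.
id_a = timestamp_migration_id(datetime(2012, 10, 24, 10, 33))
id_b = timestamp_migration_id(datetime(2012, 10, 25, 9, 5))
timestamp_collides = id_a == id_b

# After a merge, sorting by ID gives a single order by creation time.
merged_order = sorted([id_b, id_a])
```

Timestamps do not remove ordering problems entirely (a migration written earlier can still be merged later), but they make duplicate IDs unlikely.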

Related

Can flyway be used in project with manual DB changes?

We have a production system with a large DB (several hundred tables) and would like to begin using Flyway to manage DDL changes that occur through the dev cycle. However, the organization is set up in such a way that production DB changes sometimes occur outside of a data migration tool; these are mostly data changes, but possibly DDL as well. While this is obviously an organizational challenge, does this fact alone cripple a tool like Flyway? Or is there a workflow where Flyway could rebuild its indices on demand such that any out-of-band DB change like this is pulled in?
We'd love to use Flyway, but would need to integrate it incrementally until all teams using the system are trained/bought in.

When introducing Flyway to a DB with existing data, you will need to baseline the database so that Flyway can integrate with it. See the baseline command in the Flyway documentation.
For changes made after this, Flyway will only track and version changes made by its own migration scripts, not changes made outside of it. This does not mean you cannot use the two together, but you will need to keep a closer eye on your database structure to avoid conflicts between your Flyway migrations and the external changes.

Transactional data changes made to production shouldn't impact Flyway as these won't be versioned.
If you're referring to static data (e.g., lookup data) that you'd like Flyway to manage, then this isn't detected by Flyway (at least not today). If you discover that you have drift, you'll need to add the changes as a new migration script using idempotent syntax, so that the next time it runs against production it doesn't try to make the same changes again.
For out-of-band schema changes, the Enterprise edition of Flyway has a drift check, so at least you'd be made aware of them. However, as with the data changes described above, you'll need to manually add these schema changes as an idempotent migration script.
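To show what "idempotent" means here, the following sketch runs a script twice against a database that already contains part of the change. It uses Python with SQLite purely for illustration; the `country` table and its rows are hypothetical, and the guard syntax (`IF NOT EXISTS`, `INSERT OR IGNORE`) is SQLite's and differs on other databases.

```python
import sqlite3

# An idempotent script: safe to run whether or not the change is already there.
IDEMPOTENT_MIGRATION = """
CREATE TABLE IF NOT EXISTS country (code TEXT PRIMARY KEY, name TEXT NOT NULL);
INSERT OR IGNORE INTO country (code, name) VALUES ('US', 'United States');
INSERT OR IGNORE INTO country (code, name) VALUES ('CA', 'Canada');
"""

def run_script(conn):
    """Run the migration and return the lookup codes now present."""
    conn.executescript(IDEMPOTENT_MIGRATION)
    return conn.execute("SELECT code FROM country ORDER BY code").fetchall()

conn = sqlite3.connect(":memory:")
# Simulate drift: someone already created the table and one row by hand.
conn.execute("CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO country VALUES ('US', 'United States')")
conn.commit()

rows_after_first = run_script(conn)   # fills in only what is missing
rows_after_second = run_script(conn)  # running again changes nothing
```

The same script therefore works on production (where the change already exists) and on environments that never received the manual change.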

Integrating Flyway into an existing database

We have not used Flyway from the beginning of our project, and we are now at an advanced stage of development. An expert review has suggested using Flyway in our project.
The problem is that we have moved part of our services (microservices) into another testing environment as well.
What is the best way to properly implement Flyway? The requirements are:
- In the Development environment, there is no need to alter the existing schema, but all new scripts should be applied using Flyway.
- In the Testing environment, there is no need to alter the existing schema, but whatever is missing there should be created automatically by Flyway when we migrate the project from Dev to Test.
- When we migrate to a totally new environment (UAT, Production, etc.), the entire schema should be created automatically by Flyway.
From the documentation, what I understood is:
1. Take a backup of the development schema (both DDL and DML) as SQL script files, with a file name like V1_0_1__initial.sql.
2. Clean the development database using "flyway clean".
3. Baseline the development database using "flyway baseline -baselineVersion=1.0.0".
4. Execute "flyway migrate", which will apply the SQL script file V1_0_1__initial.sql.
5. Write any new scripts with higher version numbers (like V2_0_1__account_table.sql).
Is this the correct way or is there any better way to do this?
The problem is that I have a test database with a different set of data (the data in Dev and Test differ, and I would like to keep the data as it is in both environments). If so, is it a good idea to separate the DDL and DML into different script files when we take them from the Dev environment, and apply them separately in each environment? The DML can be added manually as required, but I am a bit confused about whether I am doing the right thing.
Thanks in advance.

So, there are actually two questions here: data management and Flyway management.
In terms of data management, yes, that should be a separate thing. Data grows and grows. Trying to manage data, beyond simple lookup tables, from source control quickly becomes very problematic. Not to mention that you want different data in different environments. This also makes automating deployments much more difficult (branching would be your friend if you insist on going this route, one branch for each data set, then deploy appropriately).
You can implement Flyway on an existing project, yes. The key is establishing the baseline. You don't have to do all the steps you outlined above. Let's say you have an existing database. You have to get the script that defines that database. That single script should include all appropriate DDL (and, if you want, DML). Name it following the Flyway standards. Something like V1.0__Baseline.sql.
With that in place, all you must do is run:
flyway baseline
That will establish your existing code base as the starting point. From there, you just have to create scripts following the naming standard (for example V1.1__xxx.sql, V2.0__xxx.sql, V53000.1__xxx.sql) and run:
flyway migrate
to deploy the appropriate changes.
The only caveat to this is that, as the documentation states, you must ensure that all your databases match this V1.0 that you're creating and marking as the baseline. Any deviation will cause errors as you introduce new changes and migrate them into place. As long as you've got matching baseline points, you should be able to proceed with different data in different environments with no issues.
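The effect of a baseline can be pictured with a toy model: everything at or below the baseline version is assumed to exist already, and only higher, not-yet-applied versions are run. The Python below is only a sketch of that idea, not Flyway's code; `pending_migrations` and the version lists are hypothetical.

```python
def parse_version(text):
    """'53000.1' -> (53000, 1), so versions compare numerically part by part."""
    return tuple(int(part) for part in text.split("."))

def pending_migrations(available, baseline, applied=()):
    """Return the versions a migrate run would still apply, in version order.

    Anything at or below the baseline is treated as already present.
    Assumes all versions use the same number of dot-separated parts.
    """
    base = parse_version(baseline)
    done = {parse_version(v) for v in applied}
    todo = [
        v for v in available
        if parse_version(v) > base and parse_version(v) not in done
    ]
    return sorted(todo, key=parse_version)

available = ["1.0", "1.1", "2.0", "53000.1"]

# Dev was just baselined at 1.0; Test was baselined and has already received 1.1.
dev_pending = pending_migrations(available, baseline="1.0")
test_pending = pending_migrations(available, baseline="1.0", applied=["1.1"])
```

This is also why every environment has to match the baseline script: the tool never checks that the database really looks like V1.0, it simply trusts the marker.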

Here is my how-to on integrating Flyway with a production DB: https://delicious-snipe-938.notion.site/How-to-integrate-Flyway-with-existing-MySQL-DB-in-Prod-PostgreSQL-is-similar-1eabafa8a0e844e88205c2f32513bbbe

Multiple developers working on Flyway and GIT

I have a question regarding working on Flyway with Git.
How to organize the work?
Two developers can create new SQL versions but on Git I think there should be one version of code to see all changes.
I mean when I am creating VBA code I am pushing to Git only one workbook and all changes there are updated when new version is pushed.
What about Flyway and creating multiple files?
How to do it?

Working with git usually implies working with different branches across developers.
As long as each developer is working on an individual branch, you will not experience any problems with "seeing all changes" (from the perspective of that developer).
You will, however, experience problems when those developers apply their (potentially different) Flyway migrations to the very same database schema. In that case you can either drop the schema before applying the Flyway migrations or use a separate database instance per developer.
Remember that Flyway uses migrations to express the sequence of changes that will form the final schema (or database content). Migrations in Flyway are just files (text files with SQL code, or Java classes). For further details, consult the Flyway documentation.
If you are looking at a single branch, e.g. as a result of merging various (remote) copies or other branches, you will encounter the following cases:
- New migration (a new file): the file is added to the total set of migrations on the target branch. The main problem is ensuring (using a proper naming convention) that the new migration is executed at the proper place within the sequence of migrations.
- Modified existing migration (a change to a file): after the merge, the modifications are part of the file and as such visible to the users of that branch.
- Deleted migration: the deletion becomes visible on the target branch immediately after the merge.
In any case, whether you keep the complete set of migrations as a single file (not recommended!) or as a set of files (possibly distributed across various folders), there will be an exact version of the "current" state of the migrations for a given branch.
EDIT:
Consider the following example based on your additional information:
Branch b1 has the following migrations:
03.02.12__creating table.sql
05.02.12__Altering table.sql
Branch b2 has:
04.02.12__Adding Column.sql
After merging both branches into the master branch, you will end up with:
03.02.12__creating table.sql
04.02.12__Adding Column.sql
05.02.12__Altering table.sql
Since the dates are the versions, the list above is the sequence in which Flyway applies these migrations from the master branch (ordered by version, compared part by part).
As no two files share a name across the git branches, the files simply live side by side in the final version on the master branch.
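The merge result can be reproduced with a few lines of Python. This is only a sketch of the ordering idea, using the file names from the example; `version_key` is a hypothetical helper, not a Flyway API.

```python
def version_key(filename):
    """'03.02.12__creating table.sql' -> (3, 2, 12): numeric parts before '__'."""
    version = filename.split("__", 1)[0]
    return tuple(int(part) for part in version.split("."))

branch_b1 = ["03.02.12__creating table.sql", "05.02.12__Altering table.sql"]
branch_b2 = ["04.02.12__Adding Column.sql"]

# A git merge with no same-named files is simply the union of both file sets.
master = sorted(set(branch_b1) | set(branch_b2), key=version_key)

# Numeric comparison matters: plain string sorting would put '10' before '9'.
numeric_order = sorted(["9.0__x.sql", "10.0__x.sql"], key=version_key)
text_order = sorted(["9.0__x.sql", "10.0__x.sql"])
```

One caveat worth checking in the Flyway documentation: if 05.02.12 has already been applied to a database before 04.02.12 arrives through a merge, the older version is not applied by default, and the outOfOrder setting controls that behaviour.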

Flyway support to re-run SQL file multiple times using placeholder params

Does FlywayDB support the use-case where a script can be re-run multiple times using different parameter sets through "placeholder" and be treated either as separate versions or repeatable migration (though with different SQL files)? I have a requirement where we'd want to run the same set of scripts to organize data according to "regions" (US, UK, CA, etc.)
For example:
Files:
sql/V1__customer_info.sql
sql/V2__customer_address.sql
Commands:
# Migrate US customers
mvn -Dflyway.placeholders.region_id=us flyway:migrate
# Migrate UK customers
mvn -Dflyway.placeholders.region_id=uk flyway:migrate
# Migrate Australian customers
mvn -Dflyway.placeholders.region_id=au flyway:migrate

The short answer is no. I have had a couple of ideas you might like to explore further:
- Implement it in a callback, specifically afterMigrate.sql, and then call it as per your example. afterMigrate is called even if there are no pending migrations to apply. This is "extending" the callback feature, and you would be constrained to a single SQL file, so you would need to combine info and address into one file. Java callbacks are more flexible; however, I have not used them.
- Pass a list into the placeholder and have your database split it and loop over it. This would be achievable with Oracle and PL/SQL, but may be tricky with other databases or if you need to support multiple database types.

Keep local version of file in TFS without being checked out.

I have two development environments, one for production and one for development. In TFS is there a way I can keep different versions of a file for each environment?
I would like to do this for my Web.config file, where I keep a different connection string for each environment. Right now I either have to keep that file checked out in both environments with their respective values, or update it every time I change environments.

In TFS you can do that using branching and merging: create one branch for production and one for development.
Branching is a feature that allows a collection of files to evolve along two or more divergent paths. Branching is frequently used when teams have to maintain two or more similar code bases.
Merging is the process of combining the changes in two distinct branches. A merge operation takes changes that have occurred in the source branch and integrates them into the target branch. Merging integrates all types of changes in the source branch, including name changes, file edits, file additions, and file delete and undelete changes. If items have been modified in both the source and target branches, you will be prompted to resolve conflicts.
The TFS documentation covers branching and merging in more detail.
