Flyway support to re-run SQL file multiple times using placeholder params - flyway

Does FlywayDB support the use-case where a script can be re-run multiple times using different parameter sets through "placeholder" and be treated either as separate versions or repeatable migration (though with different SQL files)? I have a requirement where we'd want to run the same set of scripts to organize data according to "regions" (US, UK, CA, etc.)
e.g...
Files:
sql/V1__customer_info.sql
sql/V2__customer_address.sql
Commands:
# Migrate US customers
mvn -Dflyway.placeholders.region_id=us flyway:migrate
# Migrate UK customers
mvn -Dflyway.placeholders.region_id=uk flyway:migrate
# Migrate Australian customers
mvn -Dflyway.placeholders.region_id=au flyway:migrate

No is the short answer. I have had a couple of ideas you might like to explore further:
Implement it in a callback specifically afterMigrate.sql and then call as per your example. afterMigrate is called even if there are no pending migrations to apply. This is "extending" the callback feature and you would be constrained by a single sql file so would need to combine info and address into a single file. Java callbacks are more flexible however I have not used them.
Pass a list into the placeholder and have your database split and loop over it. This would be achievable with Oracle and PLSQL but may be tricky with other databases or if you need to support multiple database types.

Related

Integrating Flyway into an existing database

We have not used Flyway from the beginning of our project. We are at an advanced state of development. An expert review has suggested to use Flyway in our project.
The problem is that we have moved part of our services (microservices) into another testing environment as well.
What is the best way to properly implement Flyway? The requirements are:
In Development environment, no need to alter the schema which is already existing. But all new scripts should be done using Flyway.
In Testing environment, no need to alter the schema which is already existing. But what is not available in testing environment should be created automatically using Flyway when we do migrate project from Dev to test.
When we do migration to a totally new envrionment (UAT, Production etc) the entire schema should be created automatically using Flyway.
From the documentation, what I understood is:
Take a backup of the development schema (both DDL and DML) as SQL script files, give a file name like V1_0_1__initial.sql.
Clean the development database using "flyway clean".
Baseline the Development database "flyway baseline -baselineversion=1.0.0"
Now, execute "flyway migrate" which will apply the SQL script file V1_0_1__initial.sql.
Any new scripts should be written with higher version numbers (like V2_0_1__account_table.sql)
Is this the correct way or is there any better way to do this?
The problem is that I have a test database where we have different set of data (Data in Dev and test are different and I would like to keep the data as it is in both the environments). If so, is it good to separate the DDL and DML in different script files when we take it from the Dev environment and apply them separately in each environment? The DML can be added manually as required; but bit confused if I am doing the right thing.
Thanks in advance.
So, there are actually two questions here. Data management and Flyway management.
In terms of data management, yes, that should be a separate thing. Data grows and grows. Trying to manage data, beyond simple lookup tables, from source control quickly becomes very problematic. Not to mention that you want different data in different environments. This also makes automating deployments much more difficult (branching would be your friend if you insist on going this route, one branch for each data set, then deploy appropriately).
You can implement Flyway on an existing project, yes. The key is establishing the baseline. You don't have to do all the steps you outlined above. Let's say you have an existing database. You have to get the script that defines that database. That single script should include all appropriate DDL (and, if you want, DML). Name it following the Flyway standards. Something like V1.0__Baseline.sql.
With that in place, all you must do is run:
flyway baseline
That will establish your existing code base as the start point. From there, you just have to create scripts following the naming standard: V1.1xxx V2.0xxx V53000.1xxx. And run
flyway migrate
To deploy appropriate changes.
The only caveat to this is that, as the documentation states, you must ensure that all your databases match this V1.0 that you're creating and marking as the baseline. Any deviation will cause errors as you introduce new changes and migrate them into place. As long as you've got matching baseline points, you should be able to proceed with different data in different environments with no issues.
This is my how-to instruction on integration flyway with prod DB: https://delicious-snipe-938.notion.site/How-to-integrate-Flyway-with-existing-MySQL-DB-in-Prod-PostgreSQL-is-similar-1eabafa8a0e844e88205c2f32513bbbe.

Is this a Flyway use case

I have delivered a Product to the customer. Now I have upgraded the Product, which includes changes to the database.
Customer wants to upgrade the Product. Now will Flyway help in the migration of Customer data from older version to newer version. Please let me know, if this is a valid use case. The flyway documentation talks about its use during development only.
Flyway allows you to change your database by running a set of scripts in a defined order. These scripts are called 'migrations' as they allow you to 'migrate' your database from one version to another.
The idea is you can start with an an empty database and each migration script will successively bring that database up from empty up to the current version. However, it's also possible to start with an existing database by creating a 'baseline' migration.
As SudhirR said, Flyway's primary use case is to define schema changes. However, it's perfectly possible to change data also. Since Flyway is just running plain SQL, in principle almost anything you can do in a SQL script you can also do in a Flyway migration.
In the case you described it should be possible to use Flyway to migrate the customer database. The steps you could take are:
Generate a sql script that includes the entire DDL (including indexes, triggers, procedures, ...) of the production database. To do this you will need to add insert statements for all the reference data present in the database.
Save this script in your Flyway project as something like 'V1__base_version.sql'
Run the flyway baseline command against your production database
This will set up your production database for use with Flyway
Add a new migration script to migrate your customer's data to the new version
e.g. create new table, copy data from old table to new table, delete old table
Run flyway migrate to upgrade production
These steps are adapted from the Flyway documentation page here.
Of course you should read the Flyway docs and manually test on a throwaway DB before you run anything against production. However I think in principle Flyway could be a good fit for your use case.
Flyway should be used for schema migrations and any reference data (basic data that is required by the system/application in order to function properly).
Putting client specific data migrations would not be a use case. However, if you can represent the data migration "generically" by not using IDs and instead use names or types than it could be a candidate. Meaning if you could write a migration in a way that could be applied to all clients, then that would be the use case to put it in as a flyway migration.
Otherwise data migrations would be applied in some other way outside of the process like requesting special access to the database or having some team that manages the database to apply the scripts.
If you are doing custom data modifications quite often then I'd say something is wrong in some other area of the SDLC and you may need to increase testing so that bugs don't mess up the data in the first place.

Best common practice for data insert/update scripts in flyway

scenario: I have two databases.
The first database is a blank database used for testing. I essentially run flyway:migrate and build the database with complete schema and run my integration tests against that blank database. Any data that the integration tests need are inserted before the tests are run. Finally, the database is tore down by using flyway:clean to make sure the next build that comes through has a clean db to work with.
The second database has data in it.
Problem: The build fails in the integration phase because I have migration scripts that depends on data which database 1 doesn't have. Basically I'm inserting data based on certain data existing in the db.
Is the best common practice for flyway to only have ddl change type migration scripts and no data insert/update scripts?
Consider adding your reference data behind an IF statement in an afterMigrate callback:
http://flywaydb.org/documentation/callbacks.html
In the best case you add it as a migration and change it in the future via migrations. Including production. Things can be more complicated if that data can be changed on real environments by other means. In such case I would personally prefer to have a (shared) test fixture to insert the sample data.

Migrating databases of different schema versions automagically using Flyway

I think the documentation (http://flywaydb.org/getstarted/existingDatabaseSetup.html) is not clear enough and would like the description to be illustrated with an example. I have one for you:
Let's say we have two different versions (1 and 2) of the production database whose schema version is implicit but deterministic by querying the existing tables. How will we then achieve what is described in the documentation?
In my example the two versions both have a script attached:
Version 1: Create table A
Version 2: Create table B
I have created java migration files matching the scripts for versions 1 and 2 but since the flyway metadata is missing I need to query the database whether the scripts have been run and skip them in that case. The problem is that the application crashes since Flyway has not been initialised.
I don't want to initialise Flyway from the commandline since I want this to be done automatically upon deployment (Flyway in embedded mode). From what I've seen this only works with empty databases.
Is there a simple solution to this problem?
For single PROD databases, you can use flyway.initOnMigrate
In your case you would have to wrap Flyway and recreate this manually by inspecting your tables and calling either init with flyway.initialVersion=1 or with flyway.initialVersion=2, followed by a call to migrate.

How to maintain SQL Server DB with lots of branches

I have an ASP.NET project under git where we follow the convention of using a branch for a feature. We just started using SQL Server Data Tools to manage schema changes (quite new to it, so I suspect it may have features that get me to what I need).
I am looking for some strategies that have worked for other teams that manage switching between branches that have different DB schemas and then successfully merging branches together. Ideally, after merging all the features, I would have implicitly created a change script(s) to deploy for the release to production.
Note I am using SQL Server 2008 R2
There are multiple parts to this strategy. One aspect is the handling of the storage of the different branches, and what has worked well for my teams has been to use different SQL Server instances for each branch (rather than naming individual databases with branch-specific prefixes or suffixes, e.g., MyDatabase_FeatureBranchX, which can get out of hand). This enables the corresponding database(s) in each branch to have the same names (for clarity) but also allows for physical and logical isolation of a given branch's SQL resources (data files, access permissions, etc.).
As for the second, more interesting aspect (which I think is the main intent of your question), you might consider utilizing a code-based "migrations" approach -- e.g., using FluentMigrator or the like. Provided that you've got a standard baseline schema from which each branch was initially created, you can create the appropriate migrations in code as part of your feature development in each branch (and apply them to that branch's SQL instance). When it comes time to merge the branch into trunk, you'd also be merging and then applying that branch's migrations.
At best, this means that you could simply run the migration tool against your trunk instance after the merge, in order to apply all the branch's migrations, since tools like this automatically keep track of which migrations have been applied (via a custom database table) and do not reapply them. Provided that you're also doing periodic merges of your trunk code (including its migrations) into your feature branch throughout its development, and you're applying those migrations, you would also be ensuring that your feature branch's schema is being kept up to date, which minimizes the nasty surprises at merge time.
When it comes time to deploy your trunk to production, these same migrations would be applied once again. FluentMigrator offers various runners: a console application, NAnt, MSBuild, and Rake.
I would highly recommend using a timestamp-based (e.g., 201210241033) migration ID strategy, rather than simple sequential integers (1, 2, ...), to minimize the likelihood of collisions and changes being applied out of the intended sequence.

Resources