Flyway: How to remove a large migration script from migrations

My current project has a few Flyway migrations in place that are used to import initial data into a database. This data is convenient, especially for developers, who can use it to set up the project quickly. Production data is imported through some batch jobs and has a newer version.
Some of these migrations are quite big (~20MB), so every time the application starts, Flyway takes some time to calculate the checksums of the migrations. This is also a problem for integration tests, which take longer because of it.
I consider this approach to be a misuse of Flyway; I think migration tools should be used mainly for structural data.
I want to remove those files from our application and rather use a configuration management tool (e.g. Vagrant, Puppet, Chef) to import test data on developer environments. However, I can't just delete the migration files from the application as Flyway will complain that a migration has been recorded in the database but is not present in the application migrations.
My first thought was to create a new migration with a high-priority version number that will:
1. Delete the test data
2. Delete the migration from the schema_version table
and then remove the migration scripts. This does not work, however: Flyway still complains that the removed migration script is missing.
Is there a restriction that you cannot interact with the schema_version table in migrations?
What other options do I have? If at all possible I want to do this using Flyway and not manually.

Repair is your best bet. Empty those data migrations and run the repair command to have their checksums recalculated based on the empty files.
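To see why emptying the files and then running repair clears the error, here is a minimal Python sketch of checksum-based validation. It is an illustration only, not Flyway's implementation: zlib.crc32 stands in for the real checksum calculation, the history dict stands in for the schema_version table, and checksum, validate and repair are hypothetical helper names.

```python
import zlib

def checksum(script: str) -> int:
    """CRC32 of the script text (a stand-in for the tool's real checksum)."""
    return zlib.crc32(script.encode("utf-8"))

def validate(history: dict, scripts: dict) -> list:
    """Report recorded versions whose file is missing or whose checksum changed."""
    problems = []
    for version, recorded in history.items():
        if version not in scripts:
            problems.append((version, "missing"))
        elif checksum(scripts[version]) != recorded:
            problems.append((version, "checksum mismatch"))
    return problems

def repair(history: dict, scripts: dict) -> None:
    """Realign recorded checksums with the files that are currently present."""
    for version in history:
        if version in scripts:
            history[version] = checksum(scripts[version])

# Hypothetical state: version "2" was a large data import.
scripts = {"1": "CREATE TABLE t (id INT);", "2": "INSERT INTO t VALUES (1);" * 1000}
history = {v: checksum(s) for v, s in scripts.items()}

scripts["2"] = ""                    # empty the data migration but keep the file
print(validate(history, scripts))    # version 2 is reported as a mismatch
repair(history, scripts)
print(validate(history, scripts))    # nothing left to report
```

The sketch also shows why deleting the file outright does not help: a recorded version with no matching file is still reported, whereas an empty file with a repaired checksum validates cleanly and is cheap to hash.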

Related

Can flyway be used in project with manual DB changes?

We have a production system with a large DB (several hundred tables) and would like to begin using Flyway to manage DDL changes that occur through the dev cycle. However, the organization is set up in such a way that there are production DB changes that sometimes occur, mostly just data changes but possibly DDL, that will happen outside of a data migration tool. While this is obviously an organizational challenge, does this fact alone cripple a tool like Flyway? Or is there a workflow where Flyway could rebuild its indices on demand such that any out-of-band DB change like this is pulled in?
We'd love to use Flyway, but would need to integrate it incrementally until all teams using the system are trained/bought in.
When introducing Flyway to a DB with existing data, you will need to baseline the database so that Flyway can integrate with what is already there. See baseline.
For changes made after this, Flyway will only track and version changes made by its own migration scripts, not changes made outside of it. This does not mean you cannot use the two together, but you will need to be more aware of your database structure to avoid conflicts between your Flyway migrations and external changes.
Transactional data changes made to production shouldn't impact Flyway as these won't be versioned.
If you're referring to static data (e.g. lookup data) that you'd like Flyway to manage, then changes to it aren't detected by Flyway (at least not today). If you discover that you have drift, you'll need to add the changes as a new migration script using idempotent syntax, so that the next time it runs against production it doesn't try to make the same changes again.
For out-of-band schema changes, the Enterprise edition of Flyway has a drift check, so at least you'd be made aware of them. However, as with the data changes described above, you'll need to manually add these schema changes as an idempotent migration script.
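An idempotent migration of the kind described above could look roughly like the following. The example uses Python's built-in sqlite3 module purely so that it is runnable; the country table and its rows are hypothetical, and the exact idempotent syntax (IF NOT EXISTS, INSERT OR IGNORE, MERGE, and so on) varies by database.

```python
import sqlite3

# Idempotent script: safe to run whether or not the out-of-band
# change has already been made on the target database.
MIGRATION = """
CREATE TABLE IF NOT EXISTS country (
    code TEXT PRIMARY KEY,
    name TEXT NOT NULL
);
INSERT OR IGNORE INTO country (code, name) VALUES ('DE', 'Germany');
INSERT OR IGNORE INTO country (code, name) VALUES ('FR', 'France');
"""

conn = sqlite3.connect(":memory:")

# Simulate drift: someone already created the table and one row by hand.
conn.execute("CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO country VALUES ('DE', 'Germany')")

conn.executescript(MIGRATION)   # first run fills in only what is missing
conn.executescript(MIGRATION)   # second run changes nothing

rows = conn.execute("SELECT code, name FROM country ORDER BY code").fetchall()
print(rows)   # [('DE', 'Germany'), ('FR', 'France')]
```

Because every statement checks for existing state first, the same script succeeds on a database that already received the manual change and on one that did not.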

How to avoid deploy errors with EF Core Migrations and Custom Migration Operations (custom SQL)

I'm using EF Core Migrations in my .NET Core projects, and deploy using DevOps pipelines.
In my build pipeline, I build a migration SQL script using a Command Line task to execute a dotnet ef migrations script command (using the --idempotent option), then execute it in the release pipeline using an Azure SQL Database deployment task. All fairly standard, to judge from a simple Google search (e.g. here), which is how I learned of the approach in the first place.
This is my problem: To achieve certain desired results, in some of my migrations I use Custom Migration Operations (migrationBuilder.Sql("...")) to execute some hand-crafted SQL as part of a migration.
However, as the project develops, and my DB schema changes over time, older such migrations inevitably contain SQL which no longer fits the schema. One would think this isn't a problem, as any migration is meant to be applied to a database with some specific schema version only. However, as it turns out, the dotnet ef migrations script command builds a SQL script with a bunch of conditional SQL blocks like this:
IF NOT EXISTS(SELECT * FROM [__EFMigrationsHistory] WHERE [MigrationId] = N'20190123160628_SomeMigrationId')
BEGIN
/* Generated or custom SQL here */
END;
Note: every migration is included in the script, and the IF NOT EXISTS clauses ensure that only the not-yet-applied ones are executed in the database.
However, the custom migration operation SQL statements are included as-is in the script, and if they are outdated, they no longer compile. Then the DB deployment task fails, as does the whole deployment.
Has anyone else had and solved this problem, or know another way to deploy migrations which doesn't have this issue? It seems to me this seriously affects the utility of the MigrationBuilder.Sql() command, which is the only way to arbitrarily "massage" data as part of a migration.

Is this a Flyway use case

I have delivered a Product to the customer. Now I have upgraded the Product, which includes changes to the database.
The customer wants to upgrade the Product. Will Flyway help with migrating the customer's data from the older version to the newer version? Please let me know if this is a valid use case. The Flyway documentation talks about its use during development only.
Flyway allows you to change your database by running a set of scripts in a defined order. These scripts are called 'migrations' as they allow you to 'migrate' your database from one version to another.
The idea is that you can start with an empty database and each migration script will successively bring that database from empty up to the current version. However, it's also possible to start with an existing database by creating a 'baseline' migration.
As SudhirR said, Flyway's primary use case is to define schema changes. However, it's perfectly possible to change data also. Since Flyway is just running plain SQL, in principle almost anything you can do in a SQL script you can also do in a Flyway migration.
In the case you described it should be possible to use Flyway to migrate the customer database. The steps you could take are:
1. Generate a SQL script that includes the entire DDL (including indexes, triggers, procedures, ...) of the production database. The script also needs INSERT statements for all the reference data present in the database.
2. Save this script in your Flyway project as something like 'V1__base_version.sql'.
3. Run the flyway baseline command against your production database. This will set up your production database for use with Flyway.
4. Add a new migration script to migrate your customer's data to the new version, e.g. create new table, copy data from old table to new table, delete old table.
5. Run flyway migrate to upgrade production.
These steps are adapted from the Flyway documentation page here.
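The effect of the baseline and migrate steps above can be sketched with a toy model. This is not Flyway's implementation: the history table and the baseline and migrate functions are hypothetical stand-ins, written in Python with sqlite3 only to show that a baselined database skips the base script and receives just the later migrations, while an empty database receives everything.

```python
import sqlite3

# Hypothetical migrations, keyed by version number.
MIGRATIONS = {
    1: "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);",  # base version
    2: "ALTER TABLE customer ADD COLUMN email TEXT;",                 # the upgrade
}

def ensure_history(conn):
    conn.execute("CREATE TABLE IF NOT EXISTS history (version INTEGER PRIMARY KEY)")

def baseline(conn, version):
    """Record an existing database as already being at `version`."""
    ensure_history(conn)
    conn.execute("INSERT INTO history (version) VALUES (?)", (version,))
    conn.commit()

def migrate(conn):
    """Apply, in order, every migration newer than the highest recorded version."""
    ensure_history(conn)
    current = conn.execute("SELECT COALESCE(MAX(version), 0) FROM history").fetchone()[0]
    applied = []
    for version in sorted(MIGRATIONS):
        if version > current:
            conn.executescript(MIGRATIONS[version])
            conn.execute("INSERT INTO history (version) VALUES (?)", (version,))
            conn.commit()
            applied.append(version)
    return applied

# Existing customer database, created before any migration tool was used.
prod = sqlite3.connect(":memory:")
prod.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
prod.execute("INSERT INTO customer (name) VALUES ('Acme')")
prod.commit()

baseline(prod, 1)
print(migrate(prod))    # [2]: the base script is skipped on the existing database

# A fresh, empty database gets both scripts.
fresh = sqlite3.connect(":memory:")
print(migrate(fresh))   # [1, 2]
```

Without the baseline call, the base script would run against the existing database and fail because the customer table is already there.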
Of course, you should read the Flyway docs and manually test on a throwaway DB before you run anything against production. However, I think in principle Flyway could be a good fit for your use case.
Flyway should be used for schema migrations and any reference data (basic data that is required by the system/application in order to function properly).
Putting client-specific data migrations in Flyway would not be a use case. However, if you can represent the data migration "generically", by using names or types instead of IDs, then it could be a candidate. That is, if you can write a migration in a way that can be applied to all clients, then it is a valid use case for a Flyway migration.
Otherwise, data migrations would be applied in some other way outside of the process, such as requesting special access to the database or having the team that manages the database apply the scripts.
If you are doing custom data modifications quite often then I'd say something is wrong in some other area of the SDLC and you may need to increase testing so that bugs don't mess up the data in the first place.

How can I remove issues with my flyway springboot project?

While building a new database using the migration scripts in our Spring Boot Flyway project, we realized we had made some mistakes.
Some old scripts need to be changed so that we do not face these issues when we create a new database schema again. The issues are mostly related: an info table was never populated with entries by the project, yet some scripts refer to that data. The data does not exist because we never included a script to insert it.
How can we correct this project? The only way I can think of is to correct the scripts so that all inserts are replaced by insert-if-not-exists and all create statements by create-if-not-exists, and then delete all entries in the schema version table and re-run this on all the databases that use this schema.
I cannot go back and correct my script because then the migration project will fail because of checksum issues.
You are right: if this project and its scripts are already running in existing environments, you cannot modify them, because the checksum validation would fail.
The cleanest way I can think of would be to add a new file called "DB-GENERAL-FIXES" or something like that, containing all the SQL checks and fixes needed to bring the DB back to a stable state. For new installations it means extra work, first building the schema wrongly and then cleaning it up, but if the same code is already in production right now, it is the best option.

Can I replicate a schema using flyway?

Can I replicate a schema from one environment to another using Flyway?
Is it possible to replicate table by table, or the whole schema, from Dev to Prod?
You could certainly share a set of migrations to be applied across multiple databases.
For example, you could have a structure:
db/migration/
--V2__base_schema.sql
--V3__base_data.sql
--V4__change_table.sql
--R__function.sql
as a resource bundle and provide the applicable runtime parameters to each environment in order to have the same migrations carried out on each database, with each database maintaining its own schema_version, of course.
If you are asking whether Flyway is the tool to somehow dump and restore, there is no such functionality; look to your database's native tools for that (e.g. pg_dump / pg_restore for PostgreSQL).
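To illustrate how one shared set of migrations plays out across environments, here is a small Python sketch. It assumes Flyway's default file naming (a V prefix plus version, or an R prefix, followed by a double underscore, a description and .sql); the plan and pending helpers and the per-environment histories are hypothetical, and the handling of repeatable scripts is simplified, since Flyway re-applies them when their content changes.

```python
import re

# V<version>__<description>.sql or R__<description>.sql
NAME = re.compile(r"^(?:V(?P<version>\d+(?:[._]\d+)*)|R)__(?P<desc>.+)\.sql$")

def plan(filenames):
    """Order versioned scripts by version number, then list repeatable scripts."""
    versioned, repeatable = [], []
    for name in filenames:
        match = NAME.match(name)
        if match is None:
            continue  # not a migration file under this naming scheme
        version = match.group("version")
        if version is None:
            repeatable.append(name)
        else:
            key = tuple(int(part) for part in re.split(r"[._]", version))
            versioned.append((key, name))
    return [name for _, name in sorted(versioned)] + sorted(repeatable)

def pending(filenames, applied):
    """Scripts from the shared bundle that one environment has not recorded yet."""
    return [name for name in plan(filenames) if name not in applied]

# The shared bundle, in arbitrary order.
FILES = ["R__function.sql", "V4__change_table.sql", "V2__base_schema.sql", "V3__base_data.sql"]

# Each environment keeps its own record of what has been applied.
histories = {
    "dev": {"V2__base_schema.sql", "V3__base_data.sql", "V4__change_table.sql", "R__function.sql"},
    "prod": {"V2__base_schema.sql"},
}

for env, applied in histories.items():
    print(env, pending(FILES, applied))
```

The bundle is identical everywhere; only the per-database history decides what still has to run, which is why Dev and Prod converge on the same schema without any table-by-table copying.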
