We are using Flyway to migrate the database schema and we already have more than 100 migration scripts.
Once we "squashed" multiple migrations into a single first-version migration, this is ok during development, as we drop and recreate the schema. But in production this wouldn't work, as Flyway won't be able to validate the migrations.
I couldn't find any documentation or best practice of what to do in this case. The problem is that the file quantity increases constantly, I don't want to see thousands of migration files everytime, essentially if production is already in the latest version. I mean, the migration scripts that have a version number that is lower than the version in production are irrelevant to us, it would be awesome if we could squash those files into a single migration.
We are using MySQL.
How should we handle this?
Isn't that what re-baselining would do?
I'm still new to flyway, but this is how I think it would work. Please test the following first before taking my word for it.
Delete the schema_version table.
Delete your migration scripts.
Run flyway baseline
(this recreates the schema_version table and adds a baseline record as version 1)
Now you're good to go. Bear in mind that you will not be able to 'migrate' to any prior version as you've removed all of your migration scripts, but this might not be a problem for you.
Step by step solution:
drop table schema_version;
Export database structure as a script via MySQL Workbench, for example. Name this script V1__Baseline.sql
Delete all migration scripts and add V1__Baseline.sql to your scripts folder, so it is the only script available for Flyway
Run Flyway's "baseline" command
Done
We do this to allow us to compress scripts for building new DB in dev environments but also run against existing production DB without having to log on and delete the flyway_version_history table, and we can keep the scripts (mainly for reference):
Compress all the scripts to a new script e.g. V1 to V42 into a new scripts V43.
Convert V1 to V42 to text files by putting .txt on the end.
Set the baseline to 43.
Set flyway to ignore missing migrations.
In script V43 use an 'if' block to protect the create/insert statements so that they don't run for the existing production database. We use postgres so it is something like this:
DO $$
DECLARE
flywayVersion INTEGER;
BEGIN
SELECT coalesce(max(installed_rank), 0) INTO flywayVersion FROM flyway_schema_history;
RAISE NOTICE 'flywayVersion = %', flywayVersion;
IF flywayVersion = 0 THEN
RAISE NOTICE 'Creating the DB from scratch';
CREATE TABLE...
.....
END IF;
END$$;
The flyway command looks something like this:
Flyway.configure()
.dataSource(...)
.baselineVersion("43")
.ignoreMissingMigrations(true)
.load()
.migrate()
I haven't tried this, but what if you deleted all the migrations, create a new migration that creates the new starting point as version 1, set it as the baseline version -- and then modify your configuration to use a different table (e.g. flyway_schema_history_2)?
In existing databases, Flyway will see that you have a non-empty schema with no (recognized) flyway table and ignore the baseline migration. In new environments it will run the baseline migration too.
Am I missing anything?
(Of course a separate problem is how to generate the "compressed" migration. If you don't need any seed data you can just do a schema-only backup of your database and use that. If your migrations populate data too you will probably have to work that out manually.)
I think this article answers your question best:
https://medium.com/att-israel/flyway-squashing-migrations-2993d75dae96
For postgres a reusable script has been created that you could execute every so many months for instance. You can of course adapt the script to MySQL specific things instead of postgres:
https://github.com/the-serious-programmer/flyway-postgres-squash-script
Related
I have delivered a Product to the customer. Now I have upgraded the Product, which includes changes to the database.
Customer wants to upgrade the Product. Now will Flyway help in the migration of Customer data from older version to newer version. Please let me know, if this is a valid use case. The flyway documentation talks about its use during development only.
Flyway allows you to change your database by running a set of scripts in a defined order. These scripts are called 'migrations' as they allow you to 'migrate' your database from one version to another.
The idea is you can start with an an empty database and each migration script will successively bring that database up from empty up to the current version. However, it's also possible to start with an existing database by creating a 'baseline' migration.
As SudhirR said, Flyway's primary use case is to define schema changes. However, it's perfectly possible to change data also. Since Flyway is just running plain SQL, in principle almost anything you can do in a SQL script you can also do in a Flyway migration.
In the case you described it should be possible to use Flyway to migrate the customer database. The steps you could take are:
Generate a sql script that includes the entire DDL (including indexes, triggers, procedures, ...) of the production database. To do this you will need to add insert statements for all the reference data present in the database.
Save this script in your Flyway project as something like 'V1__base_version.sql'
Run the flyway baseline command against your production database
This will set up your production database for use with Flyway
Add a new migration script to migrate your customer's data to the new version
e.g. create new table, copy data from old table to new table, delete old table
Run flyway migrate to upgrade production
These steps are adapted from the Flyway documentation page here.
Of course you should read the Flyway docs and manually test on a throwaway DB before you run anything against production. However I think in principle Flyway could be a good fit for your use case.
Flyway should be used for schema migrations and any reference data (basic data that is required by the system/application in order to function properly).
Putting client specific data migrations would not be a use case. However, if you can represent the data migration "generically" by not using IDs and instead use names or types than it could be a candidate. Meaning if you could write a migration in a way that could be applied to all clients, then that would be the use case to put it in as a flyway migration.
Otherwise data migrations would be applied in some other way outside of the process like requesting special access to the database or having some team that manages the database to apply the scripts.
If you are doing custom data modifications quite often then I'd say something is wrong in some other area of the SDLC and you may need to increase testing so that bugs don't mess up the data in the first place.
So while building a new database using our database migration scripts written in a springboot flyway project, we realized we made some mistakes.
Some old scripts need to be changed to ensure that we do not face these issues when we make a new database schema again. These issues are mostly related - an info table was not populated with entries in the project and there are scripts that refer to the data in the migration project -- this data does not exist because we never included a script to include data.
How can we correct this project - the only way I can think of is to correct scripts such that all inserts are replaced by - insert if not exists or replace create statements by create if not exists.
and then delete all entries in schema version and re-run this on all the database which are using this schema.
I cannot go back and correct my script because then the migration project will fail because of checksum issues.
You are rigth, if this project and the scripts are running in some existing projects you can not modify them because the checksum would fail.
Then the cleanest way I can think would be add a file called "DB-GENERAL-FIXES" or something like that, where you can add all SQL validations to restore the DB to a stable status. For the new implementations will be extra work first build it wrongly and then clean it, but if you are sharing the same code in production right now...is the best option
I am working with doctrine:migrations:diff in order to prepare database evolutions.
This command creates files into app/DoctrineMigrations
Thoses files contains sql commands in order to upgrade or downgrade database scructure.
I want to store those sql commands into the database itself. In fact, i have several instances of databases. If sql commands are store into files, it is a big problem.
I have read somewhere that DoctrineMigrations bundle can create a table called "migration_versions", but i do not manage to find where i have read this...
I cannot really understand what you're trying to do.
Migrations are used when your code needs altered database structure. For example, a new table or a new column. These new requirements for a table or column comes from your newly written code, so it's only natural to place the migrations also as a code in your repository.
How and when would migrations even get to your database? How would you guarantee that migration is executed before the code changes, which use that new structure?
Generally, migrations are used in this way:
You develop your code, add new features, change existing ones. Your code needs changes to database.
You generate doctrine migration class, which contains needed SQL statements for your current database to get to the needed state.
You alter the class adding any more required SQL statements. For example, UPDATE statements to migrate your data, not only the structure.
You execute migration locally.
You test your code with database changes. If you need more changes, you either add new migration, or execute migration down, delete it and regenerate it. Never change the migration class, as you'll loose what's supposed to be in the database and what's not.
You commit your migration together with code that uses it.
Then comes the deployment part:
- For each server, upload the code, clear and warm-up cache, run other installation scripts. Then run migrations. And only then switch to the new code.
This way your database is always in-sync with current code in the server that uses that database.
migration_versions database table is created automatically by doctrine migrations. It holds only the version numbers of migration classes - it's used for keeping track which migrations were already run and which was not.
This way when you run doctrine:migrations:migrate all not-yet-ran migrations are executed. This allows to migrate few commits at once, have several migrations in a commit etc.
I looked at the Flyway samples and documentation and tried to understand if it is useful in my environment.
The following conceptual detail is unclear to me: How does Flyway manage the changes between database versions? It obviously does NOT compare database life-instances (see answer here:Can Flyway find out and generate migration files from datamodel?)
In detail my setup looks like this:
I create SQL create and insert scripts when coding (automatically and manually). This means every version of my database is represented by a number of insert/create statements.
In my world I execute these scripts through a database tool (sqlplus from Oracle). Each run would setup the database _from_scratch_ (!).
Can I put these very same scripts 1 to 1 inside the "migration" path of Flyway? What happens if the target database is way older than the last "migration step" I did (or flyway did not yet exist when it was installed)?
Update:
I got some input from another Flyway user:
It seems like each "migration" (version of the database) has to be hand-written SQL/Java code and contains only "updates" from the previous "migration" of database.
If this is true, I wonder how this can be used with traditional coding technics: in my world SQL statements are generated automatically and contain all database init/create statements, not just "updates" to some previous version. If my SQL code generator could do that, then I wouldn't even need a tool like Flyway :-).
Your question about "how to handle a DB that has a longer history than there are migration scripts?" You need to create a V1_ migration/sql script that matches/recreates your latest DB schema. Something that can take a blank DB to what you have today. Create/generate that sql script using your existing DB tools and then put it in flyways migration directory. (And test V1 by using flyway against a clean DB and see if you get what you expect.) http://flywaydb.org/documentation/existing.html
After that point in time, all later versions must be added in as you work. When you decide you need a new table, in your dev environment, write a new V*_.sql that modifies your schema to the way you need it.
This blog goes over this situation for a Spring/SQL application. https://blog.synyx.de/2012/10/database-migration-using-flyway-and-spring-and-existing-data/
OUR SYSTEM
We are trying to put migrations as .sql files under version control. Developers would write a VN__*.sql file, commit to version control and a job that runs every 5 minutes would automatically migrate to a Dev and Test database. Once the change proved to not cause problems, someone else would run a manual job to run the migration on Production.
MY PROBLEM:
I had a demo migration that created a few tables. I checked V4__DemoTables.sql into version control on my PC.
On our Linux box a job that runs every 5 minutes extracted the new file from version control, then ran the flyway.sh file. It detected the file and executed it.
But the .sql file had a typo. And we are using Neteeza which has problems with flyway automatically wrapping a migration in a BEGIN TRAN ... END TRAN. So the migration created 2 tables, then aborted before the third.
No problem I thought. I dropped the 2 tables that the .sql file did create. Checked V4__ out of version control, fixed the typo and re-submitted it.
Five minutes later the update was extracted but flyway complains that the checksum does not match. So it will NOT run the updated V4__DemoTables.sql file.
How do I get flyway to accept the updated file and update the checksum in the SCHEMA_VERSION file in case of a typo?
Reading the docs it seems like the developers suggest I should have created a new V4_1_DemoTables.sql file with the fix's. But this would have collided with the commands in the V4__ file so this seemed wrong.
So here is what the docs imply I need to do:
Leave V4__ as a 'successful' migration according to the
SCHEMA_VERSION table.
Create V4_1_ to delete the tables that were created before the typo
line in V4__.
Create V4_2_ which has the typo fix from the original file to do all
the real work.
Is this correct?
If the migration completes successfully, but some of the db objects are not quite right yet (typo in column name, ...), do as you said and push a follow-up script that fixes it (rename column, ...).
If the migration fails, and it didn't run on a DB with DDL transaction, the DB must be cleaned manually. This means:
Reverting the effects of the migration on the DB
Removing the version from the SCHEMA_VERSION table and marking the previous one as current
This second step will be automated in the future with the introduction of the flyway.repair() command.