I have been reading a blog post about Flyway, called Lessons Learned Using Flyway DB with Distributed Version Control. One of the author's suggestions is to create idempotent migrations.
Quoting from the article:
In a perfect world, each migration will only be run once against each
database.
In a perfect world, that is.
In actuality, there will be cases where you’ll need to re-run
migrations against the same database. Often this will be due to a
failed migration somewhere along the line, causing you to have to
retrace your steps of successful migrations before to get the database
back in a working state. When this happens, it’s incredibly helpful
for the migration to be written in an idempotent manner.
Assuming I am using a database that supports DDL transactions, should I be worried about idempotency while creating these migration SQL scripts?
In general no, especially when you have a database that supports DDL transactions.
Versioned migrations are designed to run exactly once; they can be idempotent, but don't have to be (there is virtually no benefit).
Repeatable migrations, on the other hand, have to be idempotent, as they'll be run over and over again.
Flyway lets you very easily recreate your database from scratch, and that's the approach you should favor when experimenting.
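To illustrate the distinction, here is a minimal sketch (Python with SQLite; the table and view names are invented) of why a versioned migration needs no idempotency while a repeatable one does:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Equivalent of V1__create_users.sql -- a versioned migration runs
# exactly once, so a plain CREATE TABLE is fine (re-running it would fail).
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# Equivalent of R__user_names.sql -- a repeatable migration is re-applied
# whenever its checksum changes, so it must be written idempotently.
repeatable = """
DROP VIEW IF EXISTS user_names;
CREATE VIEW user_names AS SELECT name FROM users;
"""
conn.executescript(repeatable)
conn.executescript(repeatable)  # safe to run a second time
```

Running the versioned statement twice raises an error, while the repeatable script can be applied any number of times.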
Related
I wanted to do some experiments on our on-prem DevOps collection database. Alas, I'm having a bit of trouble actually duplicating said collection. I mean, it's easy enough to take a backup of the database, but now I'd like to restore the collection database under a new name in DevOps.
Except DevOps doesn't even see the restored collection database. Now, granted, chances are it's because the backup was performed while the collection was attached. (Since this is for testing, I'm not particularly worried about breaking anything on the duplicated collection; the same cannot be said about the rest of the system, obviously.) There's also a very good chance that some collection IDs are duplicated, given this is a copy.
Is it possible to duplicate a collection in DevOps on-prem? Or would any kind of duplication / testing require a whole new DevOps server instance?
PS. While not important to this question, it might shed some light on what I am trying to do. I wanted to see if the SQL changes suggested by someone here https://developercommunity.visualstudio.com/t/bring-inherited-process-to-existing-projects-for-a/614232 are actually worth it. I don't want to stop our main / production DevOps collection for obvious reasons and instead wanted to duplicate it and run the SQL scripts on such a cloned collection...
We have a database which stores, among others, identifiers from an external system. Now the identifiers changed (the system changed the scheme), and the database needs to be updated. It can be done - I have a mapping so I can generate just enough SQL to make it work, and in the end this will need to be done like this.
The question is: is this a use case for a Flyway Java migration? I tend to think it's not, but I can't really say why; it's a gut feeling. The external system's schema is not versioned, at least not by us, so I feel it doesn't fit into our Flyway migrations at all; I think it should be executed just once, outside of Flyway.
Can anybody with more experience help explain why or why not?
This is mostly opinion-based, but to me it seems like using a sledgehammer to crack nuts. Flyway is a very useful tool for periodic migrations and for cases where there are a number of databases you have to recreate from scratch or update regularly, rather than for a single use.
What is the reason to include a relatively large framework in your project, spend time making it work, and use it only once? Furthermore, Flyway needs an extra table in your DB to store its internal info about the current version and applied migrations. I don't think that's something you really want in your case.
As for me, I think that if you have to do this task just once and can do it without Flyway, then just do it that way.
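For what it's worth, the one-off approach can be as small as a short script that applies the mapping in a single transaction. A sketch (the table, column, and mapping values below are invented, and SQLite stands in for the real database):

```python
import sqlite3

# Hypothetical mapping from old external identifiers to new ones.
id_mapping = {"OLD-001": "EXT-9001", "OLD-002": "EXT-9002"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, external_id TEXT)")
conn.executemany("INSERT INTO items (external_id) VALUES (?)",
                 [("OLD-001",), ("OLD-002",)])

# One-off remapping, wrapped in a single transaction so a failure
# leaves the table untouched.
with conn:
    for old, new in id_mapping.items():
        conn.execute("UPDATE items SET external_id = ? WHERE external_id = ?",
                     (new, old))

print(sorted(row[0] for row in conn.execute("SELECT external_id FROM items")))
# ['EXT-9001', 'EXT-9002']
```

Run it once against the database, keep the script and the mapping in source control for the record, and you avoid carrying a migration framework for a single task.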
I think one question we should be asking ourselves when we determine whether or not to write a Flyway script for our data migrations is "Is this necessary when creating this db from scratch?"
Flyway uses a versioning system so in your case, would it make sense to flip the values from the old version to the new version when standing up a new environment? What about multiple modifications? Does it make sense to store the old values and apply them sequentially if you are creating a new environment?
If you answer "NO", then Flyway is probably not the way to go. Flyway is better utilized for schema changes, where the structure of the database is changed and data is converted into the new structure. If you're just changing configuration values, I believe Flyway is probably not your best bet, simply because it is not necessary to store all the changes to these configuration values.
We have many projects running on many servers that all use one database, and we're thinking of setting up Flyway in every project to control our database structure.
But we are worried about concurrent migration problems if several projects redeploy at the same time. (Of course, we always take care with the "IF EXISTS" guards in our SQL syntax.)
How does Flyway behave when there are concurrent changes to the same table or other structures?
It works as expected. See the answer in the FAQ: https://flywaydb.org/documentation/learnmore/faq.html#parallel
Can multiple nodes migrate in parallel?
Yes! Flyway uses the locking technology of your database to coordinate multiple nodes. This ensures that even if multiple instances of your application attempt to migrate the database at the same time, it still works. Cluster configurations are fully supported.
I have one application using a single database schema.
Nonetheless, the application has a core (having its own DB objects) and can be extended with plugin logic (each plugin having its own DB objects).
Core DB objects and Plugins DB objects are distinct sets, since plugins are optional and may exist or may not.
Thus I need separate versioning and migration control for the core and each single plugin.
I'm wondering if there is some way, using Flyway, to manage these separate "migration paths".
The only thing I can think about is creating, under the same, single DB schema hosting the application, many different Flyway metadata tables (like schema_version_core, schema_version_plugin1, etc.) and managing migrations of each component independently.
Is this doable? Any smarter suggestion?
Many thanks
I highly recommend you split your DB into schemas, as this is really what they are for: managing sets of objects.
If that is not an option, the alternative you suggested works fine. Just be careful when invoking clean, as it will then clean the entire schema not just the part for one of your plugins.
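For reference, the suggested setup maps onto Flyway's `table` and `locations` settings, e.g. one configuration per component (the file names and paths here are invented):

```
# flyway-core.conf
flyway.table=schema_version_core
flyway.locations=filesystem:sql/core

# flyway-plugin1.conf
flyway.table=schema_version_plugin1
flyway.locations=filesystem:sql/plugin1
```

Each component is then migrated independently, e.g. `flyway -configFiles=flyway-core.conf migrate` (the exact flag name depends on your Flyway version). If the schema already contains another component's objects, you may additionally need a baseline setting such as `flyway.baselineOnMigrate`, again depending on your version.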
I am currently struggling with the same problem: An application which is made of several "base" components which all could have their own database objects.
I tried the option of putting everything in the same schema and using different Flyway metadata tables, but this does not work: when Flyway comes to process, e.g., the second metadata table for the second module and discovers that the schema is not empty (because the first module has already migrated its DB changes), it stops, as Flyway then has no way to determine the state of the DB and its migrations.
Using baseline versions does not help either, as in that case the changes below the baseline would be skipped for the next modules... I think the only reasonable solution with Flyway would be to use different schemas.
We have an ASP.NET web application and need to maintain the database creation and initialization script.
Are there any industry best practices that people know of for maintaining database creation and initialization scripts? I can think of two main approaches:
1. Maintain a T-SQL creation script directly by hand.
2. Maintain a master database and generate the script from it, which is then checked into SourceSafe.
Also the script should be able to be tracked through source control, i.e. table order should be controllable.
If possible, it should also include the ability to track initialization data, either in the same or a separate script.
Currently we generate the script from Management Studio, but the order of the tables seems to be random.
And the more automated the solution the better.
The problem is not maintaining the script, nor maintaining a 'master' copy of the database. The real problem is upgrading existing database(s). You make your modifications in the developer environment, which are then propagated to the test environment, and finally pushed into the production environment. While at the developer and test stages it is possible to start from scratch, in production you always have to upgrade the existing deployment.
In my experience the best practice is to use upgrade scripts. This practice is useful even with a single deployed site, but it becomes invaluable with multiple locations that may be at different versions. Even with one single operational site it is still useful to be able to test the upgrade repeatedly (starting from backups of the current version), keep the changes in source control, and have a well formalized and peer reviewed change procedure (the upgrade script). Upgrade scripts can also be tailored to the specific needs of the operational site, like handling a large table with special care, or dealing with encrypted data, or any of the myriad details that diff-based tools neglect or ignore. The main disadvantage is that the scripts have to be written, which requires real T-SQL knowledge (forget all the 'designers' in your favorite management tool).
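The upgrade-script practice described above can be sketched as follows (Python with SQLite for brevity; in a real SQL Server setup each entry would be a hand-written, reviewed T-SQL file, and the table and column names are invented). A version table records the current schema version, and each upgrade script runs only if the database is below its target version:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schema_version (version INTEGER NOT NULL)")
conn.execute("INSERT INTO schema_version VALUES (0)")

# Ordered upgrade scripts: (target_version, script). In a real project
# each script is a file kept in source control and peer reviewed.
upgrades = [
    (1, "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)"),
    (2, "ALTER TABLE customers ADD COLUMN email TEXT"),
]

def upgrade(conn):
    (current,) = conn.execute("SELECT version FROM schema_version").fetchone()
    for target, script in upgrades:
        if target > current:
            conn.executescript(script)
            conn.execute("UPDATE schema_version SET version = ?", (target,))
            conn.commit()

upgrade(conn)  # applies both scripts
upgrade(conn)  # already at version 2, so this is a no-op
```

Because the version check gates each step, the same script set can safely be run against a fresh database, a one-version-old site, or an already-current one.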
You might want to check out RedGate SQL Source Control.
Are you looking for Visual Studio Database Projects?
I use database projects to store all database objects (tables, views, functions, keys, triggers, indexes across schemas) and keep versioning in TFS. You can build the database to ensure that everything is valid. You can deploy to a fresh database, or do a schema comparison with an existing database.
I also keep all reference and setup data in post deployment scripts which are automatically run after deployment.
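Because post-deployment scripts run after every deployment, they have to be written idempotently. A sketch of the usual pattern (SQLite's INSERT OR IGNORE via Python here; in T-SQL the equivalent is typically MERGE or INSERT ... WHERE NOT EXISTS, and the table below is invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE statuses (id INTEGER PRIMARY KEY, name TEXT)")

# Reference data seeded on every deployment; the primary-key conflict
# makes re-runs harmless.
seed = """
INSERT OR IGNORE INTO statuses (id, name) VALUES (1, 'Open');
INSERT OR IGNORE INTO statuses (id, name) VALUES (2, 'Closed');
"""
conn.executescript(seed)
conn.executescript(seed)  # re-running after the next deployment is safe

print(conn.execute("SELECT count(*) FROM statuses").fetchone()[0])  # 2
```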