This question is a bit like asking which came first, the chicken or the egg.
Let's imagine we have some source code, written with Symfony or Yii, that includes DB migration code handling some database changes.
Now we have some commits that update our code (for example, new classes) and some DB changes (altering old columns or adding new tables).
When we develop on localhost or update our dev servers, it's fine to take the time to stop services and update the server. But when we try to do the same on the production server we will break everything for a while, and that is not an option.
Why does this happen? When we pull (git/mercurial), our code is updated but NOT the database, so when the code runs it throws database exceptions. To fix that we have to run the framework's built-in migrations. So in the end the server stays broken until the migrations have been run.
Code and migrations should be updated at the same time.
What is the best practice to handle it?
ADDED:
A solution like "run the pull, then run the migrations, in one call" is not an option on a high-load project, because under high load even a single second of mismatch can break some entries/calls.
We cannot stop the server either.
Pulling off a zero-downtime deployment can be a bit tricky and there are many ways to achieve it.
As for the database, it is recommended to make changes in a backwards-compatible fashion. For example, adding a nullable column or a new table will not affect your existing code base and can be done safely. So if you want to add a new non-nullable column you would do it in 3 steps:
1. Add the new column as nullable
2. Populate it with data to make sure there are no null values
3. Make the column NOT NULL
You will need a new deployment for steps 1 and 3 at the very least. When modifying a column it's pretty much the same: you create a new column, transfer the data over, release the code that uses the new column (optionally with the old column as a fallback) and then remove the old column (plus the fallback code) in a third deployment.
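A minimal sketch of those three steps as plain SQL migrations (the table and column names are invented here, and the syntax is Postgres-flavoured; MySQL uses MODIFY COLUMN for the last step):

-- Migration / deployment 1: add the column as nullable
ALTER TABLE users ADD COLUMN preferred_locale VARCHAR(10) NULL;

-- Backfill so no NULL values remain (can run as a data migration or a batched job)
UPDATE users SET preferred_locale = 'en' WHERE preferred_locale IS NULL;

-- Migration / deployment 2 (or later): enforce the constraint
ALTER TABLE users ALTER COLUMN preferred_locale SET NOT NULL;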
This way you make sure that your database changes will not cause a downtime in your existing application. This takes great care and obviously requires you to have a good deployment pipeline allowing for fast releases. If it takes hours to get a release out this method will not be fun.
You could copy the database (or even the whole system), do a migration and then switch to that instance, but in most applications this is not feasible because it will make it a pain to keep both instances in sync between deployments. I cannot recommend investing too much time in that, but I might be biased from my experience.
When it comes to switching the current version of your code for a newer one you have multiple options. The fancy cloud-based solutions like Kubernetes make this kind of easy: you create a second cluster with your new version and then slowly route traffic from the old cluster to the new one. If you have a single server, it is quite common to deploy a new release to a separate folder, do all the management tasks like warming caches, and then, when the release is ready to be used, switch a symlink to the newest release.

Both methods require meticulous planning and tweaking if you really want them to be zero downtime. There are all kinds of things that can cause issues, from a shared cache being accidentally cleared to sessions not being transferred over correctly to the new release.

Whenever something that's stored in a session changes, you have to take a similar approach as with the database: slowly move the state over to the new format while running the code, or have a fallback that can still handle the old data. Otherwise you might get errors when reading the session, causing 500 pages for your customers.
The key to deploying with as few outages and glitches as possible is good monitoring of the systems and the application, so you can see where things go wrong during a deployment and make it more stable over time.
You can create a backup server with content that mirrors your current server. Then do some error detection.
If an error is detected on your primary server, update your DNS record to divert your traffic to your secondary server.
Once the primary is back up and running, traffic moves back to it, and you then sync the changes into your secondary.
These are called failover servers.
We have a production system with a large DB (several hundred tables) and would like to begin using Flyway to manage DDL changes that occur through the dev cycle. However, the organization is setup in such a way that there are production DB changes that sometimes occur, mostly just data changes but possibly DDL, that will happen outside of a data migration tool. While this is obviously an organizational challenge, does this fact alone cripple a tool like Flyway? Or is there a workflow where Flyway could rebuild its indices on demand such that any out-of-band DB change like this is pulled in?
We'd love to use Flyway, but would need to integrate it incrementally until all teams using the system are trained/bought in.
When introducing Flyway to a DB with existing data you will need to baseline Flyway so it can integrate with that existing schema. See baseline.
For changes made after this, Flyway will only track and version changes made from its own migration scripts and not changes made externally to it. However, this does not mean you cannot use the two together, but you would need to be more aware of your database structure to avoid conflicts between your flyway migrations and external changes.
Transactional data changes made to production shouldn't impact Flyway as these won't be versioned.
If you're referring to static data (e.g. lookup data) that you'd like Flyway to manage, then this isn't detected by Flyway (at least not today). If you discover that you have drift, you'll need to add the changes as a new migration script using idempotent syntax, to ensure that the next time it runs against production it doesn't try to make the same changes again.
For out-of-band schema changes, the enterprise edition of Flyway has a drift check, so at least you'd be made aware of them. However, as with the data changes described above, you'll need to manually add these schema changes as an idempotent migration script.
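As a rough illustration of what "idempotent syntax" means here (the table and column names are invented, and ADD COLUMN IF NOT EXISTS is Postgres-style; on MySQL you would guard the DDL with a check against information_schema instead):

-- Re-runnable schema change: only adds the column if it is not already there
ALTER TABLE orders ADD COLUMN IF NOT EXISTS shipped_at TIMESTAMP NULL;

-- Re-runnable lookup-data change: only inserts the row if it is missing
INSERT INTO order_status (code, label)
SELECT 'SHIPPED', 'Shipped'
WHERE NOT EXISTS (SELECT 1 FROM order_status WHERE code = 'SHIPPED');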
We have not used Flyway from the beginning of our project. We are at an advanced state of development. An expert review has suggested using Flyway in our project.
The problem is that we have moved part of our services (microservices) into another testing environment as well.
What is the best way to properly implement Flyway? The requirements are:
In the Development environment, there is no need to alter the schema that already exists, but all new scripts should be done using Flyway.
In the Testing environment, there is no need to alter the schema that already exists, but whatever is not available in the testing environment should be created automatically using Flyway when we migrate the project from Dev to Test.
When we migrate to a totally new environment (UAT, Production etc.) the entire schema should be created automatically using Flyway.
From the documentation, what I understood is:
Take a backup of the development schema (both DDL and DML) as SQL script files, give a file name like V1_0_1__initial.sql.
Clean the development database using "flyway clean".
Baseline the Development database "flyway baseline -baselineversion=1.0.0"
Now, execute "flyway migrate" which will apply the SQL script file V1_0_1__initial.sql.
Any new scripts should be written with higher version numbers (like V2_0_1__account_table.sql)
Is this the correct way or is there any better way to do this?
The problem is that I have a test database with a different set of data (the data in Dev and Test are different, and I would like to keep the data as it is in both environments). Given that, is it good to separate the DDL and DML into different script files when we take them from the Dev environment, and apply them separately in each environment? The DML can be added manually as required, but I am a bit confused about whether I am doing the right thing.
Thanks in advance.
So, there are actually two questions here: data management and Flyway management.
In terms of data management, yes, that should be a separate thing. Data grows and grows. Trying to manage data, beyond simple lookup tables, from source control quickly becomes very problematic. Not to mention that you want different data in different environments. This also makes automating deployments much more difficult (branching would be your friend if you insist on going this route, one branch for each data set, then deploy appropriately).
You can implement Flyway on an existing project, yes. The key is establishing the baseline. You don't have to do all the steps you outlined above. Let's say you have an existing database. You have to get the script that defines that database. That single script should include all appropriate DDL (and, if you want, DML). Name it following the Flyway standards. Something like V1.0__Baseline.sql.
With that in place, all you must do is run:
flyway baseline
That will establish your existing code base as the start point. From there, you just have to create scripts following the naming standard: V1.1xxx V2.0xxx V53000.1xxx. And run
flyway migrate
To deploy appropriate changes.
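For example, a follow-up script could look like the one below (reusing the V2_0_1__account_table.sql name from the question; the table definition itself is just an illustration). flyway migrate picks it up because its version is higher than the baseline:

-- V2_0_1__account_table.sql
CREATE TABLE account (
    id          BIGINT    NOT NULL PRIMARY KEY,
    customer_id BIGINT    NOT NULL,
    created_at  TIMESTAMP NOT NULL
);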
The only caveat to this is that, as the documentation states, you must ensure that all your databases match this V1.0 that you're creating and marking as the baseline. Any deviation will cause errors as you introduce new changes and migrate them into place. As long as you've got matching baseline points, you should be able to proceed with different data in different environments with no issues.
This is my how-to instruction on integrating Flyway with a prod DB: https://delicious-snipe-938.notion.site/How-to-integrate-Flyway-with-existing-MySQL-DB-in-Prod-PostgreSQL-is-similar-1eabafa8a0e844e88205c2f32513bbbe.
The normal way of dealing with Doctrine Migrations is via the standard Commands: during development one runs the commands manually, e.g. to generate diffs and apply the migrations, and deployment typically involves applying them by the same approach, but automatically. Occasionally, when working in a team on a local instance, there are new migrations, but I've updated my source from version control rather than done a deployment, so I need to apply the new migrations manually, and I need to know that I need to do that! An improvement could be to display a warning on a rendered webpage that migrations are out of sync and action needs to be taken.
Is there a way to access the Migrations API directly in PHP/Symfony code, so that I could detect a mismatch between committed and applied migrations? I haven't found any documentation about that. I've had an initial poke around the code and it seems heavily skewed towards Commands (reasonably enough).
Firstly, updating your source code from version control is a deployment too, and applying Doctrine Migrations should be part of that. You should create a checklist of all the steps you need to do during a deployment, including rollbacks. Depending on the complexity of the application, many things could go wrong.
To answer your question, you can execute, in your code, a diff migration with the Process component and parse the output to determine if there are migrations to be applied.
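Another rough option, if you are comfortable relying on Doctrine Migrations' own metadata table (by default named doctrine_migration_versions in recent versions, migration_versions in older ones — check your configuration), is to query it directly and compare the result against the migration classes committed in your repository:

-- Versions that have actually been applied to this database;
-- anything committed in the repo but missing here still needs doctrine:migrations:migrate
SELECT version FROM doctrine_migration_versions ORDER BY version;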
I am working on an ASP.NET MVC4 project where the same project needs to be deployed to many clients on a daily basis; each client will have its own domain/subdomain, a separate app pool and a separate DB (MSSQL).
Doing each deployment manually could take at least 1-2 hours if everything goes well. Is there any way I can automate this?
Moreover, we also need to update all of the apps when a new version is released, maybe one by one or all of them at the same time. However, doing this manually could take weeks, and once we have more clients it will not be possible to do these updates manually.
The update involves suspending the app for some time, taking a full backup of files and DB, updating application code/files in the app folder, upgrading the DB with a script, then starting the app and running a diagnostic script to check whether the update was successful; if not, we need to check what went wrong.
How can we automate these updates? Any idea would be great on how to approach this issue.
As a developer for BuildMaster, I can say that this scenario, known as the "Core Version" pattern, is a common one. If you're OK with a paid solution, you can set up deployment plans within the tool that do exactly what you described.
As a more concrete example, we experience this exact situation in a slightly different way. BuildMaster has a set of 60+ extensions that rely on a specific SDK version. In our recent 4.0 release, we had to re-deploy every extension because of breaking API changes within the SDK. This is essentially equivalent to having a bunch of customers and deploying to them all at once. We have set up our deployment plans such that any time we create a new release of the SDK application, we have the option to set a variable that says to build every extension that relies on the SDK:
In BuildMaster, the idea is to promote a build (i.e. an immutable object that travels through various environments like Dev, Test, Staging, Prod) to its final environment (where it becomes the deployed build for the release). In your case, this would be pushing your MVC application to its final environment, and that would then trigger the deployments of all dependent applications (i.e. your customers' instances of your application). For our SDK, the plan looks like this:
For your scenario, you would only need the single action, "Promote Build". As I mentioned before, any dependents would then be promoted to their final environments, so all your customer deployments would kick off once that action is run during deployment. As an example, our Azure extension's deployment plan for its final environment looks like this (internal URLs redacted):
You may have noticed that these plans are marked "Shared", which means every extension we have has the exact same deployment plan, but utilizes different variables to handle the minor differences like names, paths, etc.
Since this is such an enormous topic I could go on for ages, but I think that should be sufficient for your use-case if you wanted to try it out.
There are others, but you could set up Team Foundation Server to deploy automated builds.
http://msdn.microsoft.com/en-us/library/ff650529.aspx
I find the easiest way to do this from an MVC project is to create a publish profile.
This is done by right-clicking your project selecting publish and then configuring it to your needs.
Then from TFS you create a new build definition; this kicks off a wizard which takes you through it.
There are quite a few options which would be too long to go into for every scenario.
The change I usually find most important is to set an MSBuild argument to deploy with the publish profile.
This can be found at Process > Advanced > MSBuild Arguments.
Once this is configured correctly it's a simple case of right-clicking and queue new build to build and deploy.
You will need a different PublishProfile/Build configuration per deployment environment.
For backups I use a powershell script which can be called manually or from TFS.
You also have a drop folder in TFS which keeps a backup of x many releases.
The databases are automatically configured via SQL Server to back up; TBH I didn't set that up, it was a DB admin guy who is also involved with releases.
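If you do end up scripting a database backup yourself (for example from that PowerShell script), a plain T-SQL full backup is enough; the database name and path below are placeholders:

-- Full backup of one client database before the upgrade script runs
BACKUP DATABASE ClientDb
TO DISK = N'D:\Backups\ClientDb_pre_upgrade.bak'
WITH INIT, CHECKSUM;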
From a dev testing side I use jMeter (http://jmeter.apache.org/) to run some automated scripts that check that users can log in and view certain screens, just to confirm nothing major has gone wrong. However, there is usually a testing team to run more detailed tests, again not set up by me.
All of the above will probably take you some time to set up, but in the long run it will literally save you weeks of time over a year.
A free alternative to TFS is http://www.cruisecontrolnet.org/; I have used this in the past too and it is pretty good.
You can automate your .Net deployments with Beanstalk, which will give you a way to trigger deployments with a single click, watch progress, manage permissions and see history of deployments. Check out this guide on the topic:
http://guides.beanstalkapp.com/deployments/deploy-dotnet.html
I hope you will find it useful.
P.S. - I work at Beanstalk.
The scenario is this: I have a SQL Server database online that I am using to demo an application. During development, I have added extra fields, modified field types, changed keys and added some new tables locally.
What's the best way for me to update the online database with the new structure and not lose the data? The database is a SQL Server 2005 one.
Download a trial of Red Gate SQL Compare, compare your two servers and you are done. If you do this often, it is well worth the $400, or get one of their bundles for a better bang for the buck.
And I do not work for Red Gate, just a happy customer!
Write update scripts to modify your live database structure to the new structure and to insert any data that is required.
You may find it necessary to use temporary tables to do this.
It's probably best if you test this process on a test environment, before running the scripts on the live environment.
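A rough sketch of what such a script might look like in T-SQL (the table and column names are invented here): a column type change staged through a temporary table. Any values that cannot be converted would need cleaning up before the UPDATE step.

-- Stash the existing values in a temporary table
SELECT OrderId, Quantity INTO #QuantityBackup FROM dbo.Orders;

-- Drop and recreate the column with the new type
ALTER TABLE dbo.Orders DROP COLUMN Quantity;
ALTER TABLE dbo.Orders ADD Quantity INT NULL;

-- Copy the data back, converting as we go
UPDATE o
SET o.Quantity = CAST(b.Quantity AS INT)
FROM dbo.Orders AS o
JOIN #QuantityBackup AS b ON b.OrderId = o.OrderId;

DROP TABLE #QuantityBackup;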
Depending on what exactly you've done you may be able to get away with alter statements, though from the sounds of it (removing keys and whatnot) you're doing some heavy lifting that may make that a less-than-ideal solution. You should probably look into creating a maintenance plan or, better yet, a SQL Server Integration Services project in Visual Studio. You should be able to migrate the data in the existing database to a new one using those tools.
This probably isn't of huge help retrospectively, but I always script all structural DB changes to my development database, and then, using a version number to determine the current version of the DB, I can run the required scripts on the live DB, bringing it back in line at the same time as the new code is uploaded.
This also works for any content changes; for instance, if a change in the underlying structure affects the content stored, you can also write scripts to migrate the data accordingly.
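A rough sketch of that version-tracking approach in T-SQL (the table name and the example change are invented for illustration):

-- One row per schema version that has been applied to this database
CREATE TABLE schema_version (
    version    INT      NOT NULL PRIMARY KEY,
    applied_at DATETIME NOT NULL DEFAULT GETDATE()
);

-- Each change script checks the current version before applying itself
IF NOT EXISTS (SELECT 1 FROM schema_version WHERE version = 42)
BEGIN
    ALTER TABLE dbo.Customers ADD MiddleName NVARCHAR(50) NULL;  -- the actual change
    INSERT INTO schema_version (version) VALUES (42);            -- record that it ran
END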
1. Make a copy of the existing database to copy from.
2. Make another copy and alter it to your new schema. Save the DDL for reuse.
3. Write queries that copy data from #1 to #2 (a sketch follows below). Save the queries for reuse.
4. Check the results.
5. Repeat until done.
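A rough sketch of step 3, copying one table from the old copy into the new-schema copy (the database, table and column names are made up for illustration):

-- Old and new copies live side by side on the same server
INSERT INTO NewSchemaCopy.dbo.Customers (CustomerId, FullName, CreatedAt)
SELECT c.CustomerId,
       c.FirstName + ' ' + c.LastName,   -- two old columns merged into one new column
       c.CreatedAt
FROM   OldCopy.dbo.Customers AS c;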