We have a database which stores, among other things, identifiers from an external system. Now the identifiers have changed (the system changed its scheme), and the database needs to be updated. It can be done - I have a mapping, so I can generate just enough SQL to make it work, and in the end it will have to be done that way.
The question is - is this a use case for a Flyway Java migration? I tend to think it isn't, but I can't really say why; it's a gut feeling. The external system's schema is not versioned, at least not by us, so I feel it doesn't fit into our Flyway migrations at all; I think it should be executed just once, outside of Flyway.
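For concreteness, the Java-based variant I'm weighing would look roughly like this (just a sketch - the items table, the external_id column and the external_id_mapping helper table are made-up names standing in for the real ones):

```java
package db.migration;

import java.sql.Statement;

import org.flywaydb.core.api.migration.BaseJavaMigration;
import org.flywaydb.core.api.migration.Context;

// A sketch of what the Java migration variant could look like. The table and
// column names (items, external_id, external_id_mapping) are invented for
// illustration; the old->new mapping is assumed to sit in a helper table.
public class V42__Remap_external_identifiers extends BaseJavaMigration {

    @Override
    public void migrate(Context context) throws Exception {
        try (Statement stmt = context.getConnection().createStatement()) {
            stmt.executeUpdate(
                "UPDATE items "
              + "SET external_id = (SELECT m.new_id FROM external_id_mapping m "
              + "                    WHERE m.old_id = items.external_id) "
              + "WHERE external_id IN (SELECT old_id FROM external_id_mapping)");
        }
    }
}
```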
Can anybody with more experience help explain why or why not?
It's mostly opinion based, but to me it seems like using a steam-hammer to crack nuts. Flyway is a very useful tool for recurring migrations and for cases where there are a number of databases that you have to recreate from scratch or update regularly, rather than for a single one-off change.
What is the reason to include a relatively large framework in your project, spend time making it work, and use it only once? Furthermore, Flyway needs an extra table in your DB to store its internal info about the current version and applied migrations. I don't think that is something you really want in your case.
As for me, I think that if you have to do this task just once and can do it without Flyway, then just do it that way.
I think one question we should be asking ourselves when deciding whether or not to write a Flyway script for a data migration is: "Is this necessary when creating this DB from scratch?"
Flyway uses a versioning system, so in your case: would it make sense to flip the values from the old version to the new version when standing up a new environment? What about multiple modifications? Does it make sense to store the old values and apply them sequentially when creating a new environment?
If you answer "no", then Flyway is probably not the way to go. Flyway is better utilized for schema changes, where the structure of the database is changed and data is converted into the new structure. If you're just changing configuration values, I believe Flyway is probably not your best bet, simply because it is not necessary to store every change to those values.
Related
We have a modular application; each module creates its own tables (typically one or two) and manages its data.
We use Flyway in our main application but also need it for our modules. However, if we add the module patches to our main application, the ALTER TABLE queries won't work for some deployments, because the corresponding module may not be installed.
One way to solve this is to perform the schema evolution with multiple Flyway instances: each module gets its own Flyway and manages itself. However, since Flyway creates a table to manage its state, we end up with too many tables, as we have ~20 modules right now.
What's an elegant way to solve this issue?
I would say having the migrations managed by the unit of software they support is the cleanest approach and trumps "too many tables". In terms of neat organisation of those tables, you can silo them using a schema (if your RDBMS supports schemas), and Flyway lets you name the history table used by each migration-managed application.
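As a rough sketch of that last point (the module names, migration locations and history table names below are made-up examples, not anything Flyway prescribes), each module can run its own Flyway instance against its own history table:

```java
import org.flywaydb.core.Flyway;

public class ModuleMigrations {

    // One Flyway instance per installed module, each keeping its own history table,
    // so the main application and every module stay independent of each other.
    public static void migrate(String url, String user, String password) {
        migrateModule(url, user, password, "core");
        migrateModule(url, user, password, "module_a");
        // ...repeat for whichever of the ~20 modules are actually installed
    }

    private static void migrateModule(String url, String user, String password, String module) {
        Flyway.configure()
              .dataSource(url, user, password)
              .locations("classpath:db/migration/" + module) // module-specific scripts
              .table("flyway_history_" + module)             // module-specific history table
              .load()
              .migrate();
    }
}
```

A deployment then only runs the instances for the modules it actually ships with, so module-specific ALTER TABLE scripts never touch an installation that lacks the module.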
The key thing here is "modules". From your description, it sounds like not all applications are made up of the same modules. I would ask you: if we go to the effort of making our modules discrete to create decoupled / reusable software, why should the database schemas of those modules be treated any differently?
To counter your concern about "too many tables", let's try to debunk the costs of that.
Volume. Your RDBMS is made to handle thousands of tables; there is no cost there.
Operational. Flyway does all the management here; the tables are effectively opaque to you.
Performance. They are a deployment concern, not a runtime liability.
Organisational. Hide them / name them with the methods mentioned above.
Our natural urge is to aggregate related things, but that doesn't always lead to the best outcome, so we must be pragmatic. In this situation, good/flexible design trumps aggregation.
I have one application using a single database schema.
Nonetheless, the application has a core (with its own DB objects) and can be extended with plugin logic (each plugin having its own DB objects).
Core DB objects and plugin DB objects are distinct sets, since plugins are optional and may or may not be present.
Thus I need separate versioning and migration control for the core and for each single plugin.
I'm wondering if there is some way, using Flyway, to manage these separate "migration paths".
The only thing I can think of is creating, under the same single DB schema hosting the application, many different Flyway metadata tables (like schema_version_core, schema_version_plugin1, etc.) and managing the migrations of each component independently.
Is this doable? Any smarter suggestion?
Many thanks
I highly recommend you split your DB into schemas, as this is really what they are for: managing sets of objects.
If that is not an option, the alternative you suggested works fine. Just be careful when invoking clean, as it will then clean the entire schema, not just the part belonging to one of your plugins.
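For example (the schema and location names here are purely illustrative), the core and each plugin can then be migrated independently, each against its own schema and history table:

```java
import org.flywaydb.core.Flyway;

public class ComponentMigrations {

    // Core and each plugin live in their own schema, so every component gets its
    // own flyway_schema_history and can be migrated (or cleaned) in isolation.
    public static void migrateComponent(String url, String user, String password,
                                        String component) {
        Flyway.configure()
              .dataSource(url, user, password)
              .schemas(component)                               // e.g. "core", "plugin1"
              .locations("classpath:db/migration/" + component) // component-specific scripts
              .load()
              .migrate();
    }
}
```

With this setup, clean invoked on one of these instances only touches that component's schema, which avoids the risk mentioned above.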
I am currently struggling with the same problem: an application made of several "base" components, each of which can have its own database objects.
I tried the option of putting everything in the same schema and using different Flyway metadata tables, but this does not work: when Flyway comes to process e.g. the second metadata table for the second module and discovers that the schema is not empty (because the first module has already migrated its DB changes), it stops, as Flyway now has no way to determine the state of the DB and its migrations.
Using baseline versions does not help either, as in that case the changes covered by the baseline would be skipped for the next modules... I think the only reasonable solution with Flyway is to use different schemas.
Is it possible to build an ASP.NET website using EF where each customer logging in has separately stored data? We have customers demanding that their data won’t be stored in the same tables as other customers’ data.
I've read that EF can't work with several databases, but is it possible to switch databases at runtime depending on input parameters? I have a feeling it won't be possible, since the migration features are tightly connected to the database being used, but I'm not sure.
One solution could be to have a separate website deployment and database for each customer. They'll get separate domains to access them, but that's not a problem. This solution feels a bit clumsy, though, if you have many customers, especially with deployment and future upgrades.
Am I missing some smart ways of solving this or is this a very tricky issue?
Is the structure (of the DB) the same?
If so, you could switch connections - not without issues, but it should work. For details on how that can be done, check the long discussion we've had here (and the linked previous questions, etc.):
Code first custom connection string and migrations without using IDbContextFactory
My team is doing web development (ASP.NET, WCF), and we are at an early stage where everyone needs to make DB changes and use their own sample data.
We use a dedicated DB server, and we want each developer to develop against a separate DB.
What we appear to need is the ability to configure the connection string on a per-developer basis in a source-controlled way. We might also have other configuration settings that need custom values, and finally, we'll need to maintain a set of configuration settings that are common to all developers.
Can anyone suggest a best practice here?
PS: A similar issue appears when we want to deploy a built application to different environments (test, stage, production) without having to manually tweak configurations (except perhaps for configuring the environment name).
You can use config transforms for your deployment to different environments. That's easy enough. Scott Hanselman did a pretty awesome video on it here.
For your individual developer db problem, there isn't any particularly elegant solution I can think of. Letting each developer have a unique configuration isn't really a "best practice" to begin with. Once everyone starts integrating their code, you could have a very ugly situation on your hands if everyone wrote their code against a unique db and configuration set. It almost guarantees that code won't perform the same way for two developers.
Here is what I would recommend, and have done in the past.
Create a basic framework for your database, in one database on your test DB server.
Create a Database Project as part of your solution.
Use .NET's built-in Schema Compare to write your existing database to the database project.
When someone needs to change the database, they should first get latest on the Database Project, then make their changes, and then repeat the Schema Compare step to add their changes to the project.
Using this method, it is also very easy for developers to deploy a local instance of the database that matches the "main" database, make changes, and write those changes back to the project.
OK, maybe not the most elegant solution, but we've chosen to read the connection string from a different place when the project is built using the Debug configuration.
We use the registry, and it has to be maintained manually.
It requires some extra coding, but the code that reads the registry is only compiled in debug builds (#if DEBUG), so there is no performance hit in production.
Hope this helps as well.
A customer has a web-based inventory management system. The system is proprietary and complicated: it has around 100 tables in the DB, with complex relationships between them, and holds ~1.5 million items.
The customer is doing some reorganisation of his processes and now needs to make massive updates and manipulations to the data (only data changes, no structural changes). The online screens do not permit such work, since they were designed at the beginning without this requirement in mind.
The database is MS SQL Server 2005, and the application is an ASP.NET application running on IIS.
One solution is to build new screens for him where he could visualize the data in grids and do the required job on a large number of records. This will permit us to use the already existing functions that deal with single items (we just need to implement a loop). At the moment the customer is aware of two kinds of such massive manipulations he wants to do, but says there will be others. This will require design, coding, and testing every time we get a request.
However, the customer's needs are urgent because of some regulatory requirements, so I am wondering whether it would be more efficient to use some kind of mapping between MSSQL and Excel or Access to expose the needed information, make the changes in Excel or Access, and then save them back to the DB, maybe using SSIS to do this.
I am not familiar with SSIS or other technologies that do such things, which is why I cannot judge whether the second solution is indeed efficient and better than the first. Of course the second solution will also require some work and testing, but will it be quicker and less expensive?
The other question is: are there any other ways to do this?
Any ideas will be greatly appreciated.
Either way, you are going to need testing.
Say you export 40,000 products to Excel, he reorganizes them, and then you bring them back into a staging table (or tables) and apply the changes to your SQL tables. Since Excel is basically a freeform system, what happens if he introduces invalid situations? Your update will need to detect that and fail and roll back, or handle it in some specified way.
Anyway, both your suggestions can be made workable.
Personally, for large changes like this, I prefer to have an experienced database developer develop the changes in straight SQL (either hardcoded or table-driven), test it on production data in a test environment (doing a table compare between before and after), and deploy the same script to production. This also allows the use of the existing stored procedures (you are using SPs to enforce a consistent interface to this complex database, right?) so that basic database logic already in place is simply re-used.
I doubt Excel will be able to deal with 1.5 million elements/rows.
When you say visualise the data in grids - how will your customer make changes? Manually, or is there some automation behind it? I would strongly encourage automation (since you only know about two types of changes at the moment). Maybe even a simple standalone "converter" application - don't make it part of the main program, or it will be too tempting for them in the future to manually edit data straight in the DB tables.
Here is a strategy that I think will get you from A to B in the shortest amount of time.
"One solution is to build new screens for him where he could visualize the data in grids and do the required job on a large number of records."
It's rarely a good idea to build an interface into the main system that you will only use once or twice. It takes extra time and you'll probably spend more time maintaining it than using it.
"This will permit us to use the already existing functions that deal with single items (we just need to implement a loop)."
Hack together your own crappy little interface in a .NET Application, whose sole purpose is to fulfill this one task. Keep it around in your "stuff I might use later" folder.
Since you're dealing with such massive amounts of data, make sure you're not running your app from a remote location.
Obtain a copy of SQL Server 2005 and install it on a virtualization layer. Copy your production database over to this virtualized SQL Server. Take a snapshot of your virtualized copy before you begin testing. Write and test your app against this virtualized copy. Roll back to your original snapshot each time you test. Keep changing your code, testing, and rolling back until your app can flawlessly perform the desired changes.
Then, when the time comes for you to change the production database, you can sit back and relax while your app does all of the changes. Since the process will likely take a while, add some logging so you can check the status as it runs.
Oh yeah, make sure you have a fresh backup before you run your big update.