Preserving/remapping intid-based relations when moving Plone sites

I'm migrating some Plone 4 sites and noticed a problem when moving the entire site after migration. The actual migration is fine - relatedItems and other relations are handled well by the migration scripts. The problem is that my staging environment has a different ZODB mount structure than the destination production environment, and this seems to break intid-based relations. For instance, a site might come in as /dbXX/mysite and end up at /dbYY/mysite, which makes both the to_object and the from_object of a RelationValue inaccurate. I took a look at five.intid and it looks like it is based on the database oid, which is perhaps dependent on the path?
I have been unable to find any information on how to handle this. I get the impression it just isn't meant to work with changes to the path, but I don't really have the option of not changing it here; the path is not arbitrary, it is based on actual ZODB mount points. For background, I am moving a zexp to a staging environment in Plone 4 (identical buildout to the Plone 4 production), then moving the entire Data.fs to a Plone 5 staging environment (identical buildout to the Plone 5 production), migrating, then moving a zexp to production. I am considering two options and would like feedback on them:
1. When moving from Plone 4 to the staging environment, have the Zope path mirror what it will be at the final destination. At that point relatedItems etc. should be UID-based, which won't change. The drawback is that any intid relations from Plone 4 will break at this point, but I don't think there are many/any of those.
2. Create a mapping of all relations based on the RelationCatalog, then rebuild and clean up. I'm thinking I would do this by noting the from_attribute, from_path, and to_path, and recreating the relations with new intids after the final move. I might have to know something about the structure of the from_attribute at this point. Any other drawbacks? A rough sketch of this approach follows below.
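Something like this is what I have in mind for option 2 - a rough, untested sketch assuming z3c.relationfield-style relations and the usual intid/relation catalog utilities. The export/rebuild function names and the OLD_PREFIX/NEW_PREFIX paths are placeholders for the real mount points:

```python
# Untested sketch: dump all relations before the final move, then recreate
# them against the new mount point. Assumes Plone 5 with z3c.relationfield /
# plone.app.relationfield; OLD_PREFIX and NEW_PREFIX are made-up examples of
# the mount-point change, and the function names are mine.
from zc.relation.interfaces import ICatalog
from zope.component import getUtility
from zope.intid.interfaces import IIntIds
from zope.lifecycleevent import modified
from z3c.relationfield import RelationValue

OLD_PREFIX = '/dbXX/mysite'  # hypothetical staging mount point
NEW_PREFIX = '/dbYY/mysite'  # hypothetical production mount point


def export_relations():
    """Record (from_attribute, from_path, to_path) for every intact relation."""
    catalog = getUtility(ICatalog)
    dump = []
    # with no query this should walk every relation in the catalog
    for rel in catalog.findRelations():
        if rel.isBroken():
            continue
        dump.append({
            'from_attribute': rel.from_attribute,
            'from_path': rel.from_path,
            'to_path': rel.to_path,
        })
    return dump


def rebuild_relations(app, dump):
    """Recreate the dumped relations with fresh intids after the final move."""
    intids = getUtility(IIntIds)
    for info in dump:
        from_path = info['from_path'].replace(OLD_PREFIX, NEW_PREFIX, 1)
        to_path = info['to_path'].replace(OLD_PREFIX, NEW_PREFIX, 1)
        source = app.unrestrictedTraverse(from_path, None)
        target = app.unrestrictedTraverse(to_path, None)
        if source is None or target is None:
            continue
        # assumes the moved objects are already registered with the intid utility
        rel = RelationValue(intids.getId(target))
        existing = getattr(source, info['from_attribute'], None)
        if isinstance(existing, list):
            # RelationList fields (e.g. relatedItems) hold a list of values;
            # stale values pointing at old intids may need cleaning out first.
            existing.append(rel)
            setattr(source, info['from_attribute'], existing)
        else:
            # single-valued RelationChoice field
            setattr(source, info['from_attribute'], rel)
        # fire ObjectModifiedEvent so z3c.relationfield reindexes the relation
        modified(source)
```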

Related

When migrating from an old Artifactory instance to a new one, what is the point of copying $ARTIFACTORY_HOME/data/filestore?

Artifactory recommends the steps outlined here when moving from an old Artifactory server to a new one: https://jfrog.com/knowledge-base/what-is-the-best-way-to-migrate-a-large-artifactory-instance-with-minimal-downtime/
Under both methods it says that you're supposed to copy over $ARTIFACTORY_HOME/data/filestore, but then you just go ahead and export the old data and import it into the new instance, and in the first method you also rsync the files. This seems like you're just doing the exact same thing three times in a row. JFrog really doesn't explain why each of these steps is necessary, and I don't understand what each does differently that cannot be done by the others.
When migrating an Artifactory instance, we need to take two things into consideration:
Artifactory Database - Contains the information about the binaries, configurations, security information (users, groups, permission targets, etc)
Artifactory Filestore - Contains all the binaries
Aside from your questions, I would like to add that, from my experience, in the case of a big filestore (500GB+) it is recommended to use a skeleton export (export the database only, without the filestore; this can be done by checking "Exclude Content" in the System Export) and to copy the filestore with a 3rd-party tool such as rsync.
I hope this clarifies further.
The main purpose of this article is to provide a somewhat faster migration compared to a simple full export & import.
The idea of both methods is to select "Exclude Content". The content we exclude is exactly what is stored in $ARTIFACTORY_HOME/data/filestore/.
The difference between the methods is that method #1 involves some downtime, as you will have to shut down Artifactory at a certain point, sync the diffs, and start the new instance.
Method #2, on the other hand, involves a somewhat more complex process that includes in-app replication to sync the diffs.
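For illustration only, the skeleton-export-plus-rsync split might look roughly like this. The hostname and paths are invented, and in practice you would run rsync directly rather than through Python; the System Export with "Exclude Content" carries the database side, while rsync carries the filestore:

```python
# Rough illustration of the filestore side of the migration described above.
import subprocess

OLD_HOST = "old-artifactory.example.com"   # hypothetical old server
FILESTORE = "/var/opt/jfrog/artifactory/data/filestore/"  # $ARTIFACTORY_HOME/data/filestore

# Initial bulk copy while the old instance is still serving traffic...
subprocess.run(["rsync", "-a", f"{OLD_HOST}:{FILESTORE}", FILESTORE], check=True)

# ...then, after stopping the old instance (method #1), a second pass picks up
# only the files that changed since the first run, keeping the downtime short.
subprocess.run(["rsync", "-a", "--delete", f"{OLD_HOST}:{FILESTORE}", FILESTORE], check=True)
```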
Hope that makes more sense.

Continuous deployment and db migration

This question is a bit like "Which came first, the chicken or the egg?".
Let's imagine we have some source code, written using Symfony or Yii, which includes DB migration code that handles some database changes.
Now we have some commits that update our code (for example, new classes) and some DB changes (altering old columns or adding new tables).
When we are developing on localhost or updating our dev servers, it's OK to take the time to stop services/any activity and update the server. But if we try to do that on a production server, we will break everything for a while, and that is not an option.
Why this happens: when we pull (git/mercurial), our code is updated but NOT the database, and when the code is executed it throws database exceptions. To fix that we have to run the built-in framework migrations, so in the end our server is broken until the migrations have run.
Code and migrations should be updated "at the same time".
What is the best practice to handle it?
ADDED:
A solution like "run pull, then run migrations in one call" is not an option in a high-load project, because under high load even a second of mismatch can break some entries/calls.
We cannot stop the server either.
Pulling off a zero downtime deployment can be a bit tricky and there are many ways to achieve this.
As for the database, it is recommended to make changes in a backwards-compatible fashion. For example, adding a nullable column or a new table will not affect your existing code base and can be done safely. So if you want to add a new non-nullable column, you would do it in 3 steps:
Add new column as nullable
Populate with data to make sure there are no null-values
Make the column NOT NULL
You will need a new deployment for steps 1 & 3 at the very least. Modifying a column is pretty much the same: you create a new column, transfer the data over, release the code that uses the new column (optionally with the old column as a fallback), and then remove the old column (plus the fallback code) in a third deployment.
This way you make sure that your database changes will not cause downtime in your existing application. This takes great care and obviously requires a good deployment pipeline that allows for fast releases. If it takes hours to get a release out, this method will not be fun.
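As a minimal sketch of those stages (invented table and column names, PostgreSQL-style ALTER syntax; use whatever migration tool Symfony/Yii provides), it might look like this:

```python
# Backwards-compatible column addition split across deployments; each stage is
# run by a separate migration, never all at once.
MIGRATION_PLAN = {
    "deployment_1": [
        # Nullable column: the currently running code never notices it.
        "ALTER TABLE users ADD COLUMN phone VARCHAR(32) NULL",
    ],
    "backfill": [
        # Run once the new code writes the column on every insert/update.
        "UPDATE users SET phone = '' WHERE phone IS NULL",
    ],
    "deployment_2": [
        # Only now tighten the constraint; nothing running can produce NULLs.
        "ALTER TABLE users ALTER COLUMN phone SET NOT NULL",
    ],
}

def run_stage(cursor, stage):
    """Execute one stage with whatever DB-API cursor the project uses."""
    for sql in MIGRATION_PLAN[stage]:
        cursor.execute(sql)
```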
You could copy the database (or even the whole system), do a migration and then switch to that instance, but in most applications this is not feasible because it will make it a pain to keep both instances in sync between deployments. I cannot recommend investing too much time in that, but I might be biased from my experience.
When it comes to switching the current version of your code for a newer one, you have multiple options. The fancy cloud-based solutions like Kubernetes make this kind of easy: you create a second cluster with your new version and then slowly route traffic from the old cluster to the new one. If you have a single server, it is quite common to deploy a new release to a separate folder, do all the management tasks like warming caches, and then, when the release is ready to be used, switch a symlink to the newest release.
Both methods require meticulous planning and tweaking if you really want them to be zero downtime. There are all kinds of things that can cause issues, from a shared cache being accidentally cleared to sessions not being transferred over correctly to the new release. Whenever something that is stored in a session changes, you have to take a similar approach as with the database and slowly move the state over while running code that has a fallback to still handle the old data; otherwise you might get errors when reading the session, causing 500 pages for your customers.
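To make the symlink idea concrete, a bare-bones release switch could look like the following sketch. The paths are invented; real tools such as Capistrano or Deployer add shared directories, rollbacks and hooks on top:

```python
# Atomically point the "current" symlink at a freshly prepared release folder.
import os

RELEASES_DIR = "/var/www/app/releases"   # hypothetical layout
CURRENT_LINK = "/var/www/app/current"    # what the web server serves


def activate_release(release_name: str) -> None:
    target = os.path.join(RELEASES_DIR, release_name)
    tmp_link = CURRENT_LINK + ".tmp"
    if os.path.lexists(tmp_link):        # clean up a leftover from a failed run
        os.remove(tmp_link)
    os.symlink(target, tmp_link)         # build the new link next to the old one
    os.replace(tmp_link, CURRENT_LINK)   # atomic rename: requests never see a gap
```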
The key to deploying with as few outages and glitches as possible is good monitoring of the systems and the application, so you can see where things go wrong during a deployment and make it more stable over time.
You can create a backup server with content that mirrors your current server. Then do some error detection.
If an error is detected on your primary server, update your DNS record to divert your traffic to your secondary server.
Once the primary is back up and running, traffic moves back to the primary, and then you sync the changes into your secondary.
These are called failover servers.

Handling TFS Branches and multiple folders for large solution ASP.NET website

We're running TFS 2015 and VS.NET 2015 for a large solution with an ASP.NET web app as the main project and several class library projects.
I'd like our team to start utilizing branches but the concept of branches being in separate folders is causing all sorts of issues with configuration.
Once the branch is completed, the entire folder structure, web.config values, project references, reference paths, etc. are all different, as the solution is being opened from a different folder than the main branch.
We use IIS virtual directories so that also doesn't work due to the new folder for the branch.
If I go ahead and make all of these manual changes so our solution works from the new branch folder, then every time we do a forward integration from main -> branch all of this configuration of course gets overwritten, and every developer on the team would need to redo it.
Surely there's a better method to handle branches for larger solutions which have a high level of config and customization. Is there a way to keep a single physical folder and just specify which branch you want to work on?
Don't use long-term branches. After moving from long-term branches to a single main branch for all our teams, I would never go back. The merges were always terrible, even for seemingly simple changes.
We now use Release Readiness analysis to allow multiple developers to work in parallel on different features. Check it out -
https://dotnetcatch.com/2016/02/16/are-you-release-ready/

Synchronizing Plone 4 sites

I'm using Plone 4 for my sites and I was wondering if there is a way to synchronize two Plone sites, i.e. to be able to synchronize my development site with my production site.
I have looked at the Zsyncer product and it appears it is no longer maintained; besides, the last version is not compatible with Plone 4.
I am thinking of writing a custom script that will handle exporting the Data.fs files and the src files as explained in these two articles:
Copying a remote site database
Copying a Plone site
Is there a better way of synchronizing two plone sites as described by my use case above?
For keeping the code synchronized, you want collective.hostout
For the database, use collective.recipe.backup - you could probably also use hostout to import the backups
Not sure if this solution will fit all your needs, but I use DemoStorage, which has been built into ZODB since version 3.9 (Plone 4 uses it).
You set up DemoStorage on the development instance and use the Data.fs from production. All changes will be stored in memory or in a separate file (depending on how you configure it), so changes in dev will not be visible on production. If you have both instances on the same server, you can use the Data.fs directly (without copying it), so it will always be synchronized.
To configure it you have to modify your buildout. See: https://pypi.python.org/pypi/plone.recipe.zope2instance#advanced-options
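Just to illustrate what DemoStorage does at the ZODB level (paths made up; in a real Plone setup you would enable it through the buildout options linked above rather than opening the storages by hand):

```python
# Changes land in a separate "changes" storage (or in memory if none is given),
# while the production Data.fs stays untouched as the read-only base.
from ZODB.DB import DB
from ZODB.DemoStorage import DemoStorage
from ZODB.FileStorage import FileStorage

base = FileStorage('/path/to/production/Data.fs', read_only=True)
changes = FileStorage('/path/to/dev-changes.fs')  # omit to keep changes in memory
db = DB(DemoStorage(base=base, changes=changes))
```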
When transactions on prod and on dev change the same objects (it happens occasionally), DemoStorage can show errors. Then you just have to restart the dev instance (if you use in-memory change storage) or remove the file with the changes and then restart.

Keep local version of file in TFS without being checked out.

I have two environments, one for production and one for development. In TFS, is there a way I can keep different versions of a file for each environment?
I would like to do this with my Web.config file, where I keep different connection strings for each environment. Right now I either have to keep that file checked out in both environments with their respective variables, or update it every time I change environments.
In TFS you can do that using branching and merging: create one branch for production and one for development.
Branching is a feature that allows a collection of files to evolve along two or more divergent paths. Branching is frequently used when teams have to maintain two or more similar code bases.
Merging is the process of combining the changes in two distinct branches. A merge operation takes changes that have occurred in the source branch and integrates them into the target branch. Merging integrates all types of changes in the source branch including name changes, file edits, file additions, and file delete and undelete changes. If items have been modified in both the source and target branches, you will be prompted to resolve conflicts.
you can find more on branching and merging here
