How to use vhd-util to manage snapshots - xen

I'm running several VMs under Xen, and now I'm trying to create/revert snapshots of my VMs.
Along with Xen and blktap2, another utility, vhd-util, is also shipped, and from its description I gather I can use it to create/revert VM snapshots.
Creating a snapshot is actually easy; I just call:
vhd-util snapshot -n aSnapShot.vhd -p theVMtoBackup.vhd
But when it comes to reverting a snapshot, things get really annoying.
The "revert" command requires a mandatory "journal" argument, like this:
vhd-util revert -n aSnapShot.vhd -j someThingCalledJournalOfWhichIHaveNoIdea
And vhd-util expects certain information from the journal, so it's not just some empty file you can write logs into.
But I've gone through the code and searched the internet, and I still have no idea where this journal comes from.
A similar question was asked at
http://xen.1045712.n5.nabble.com/snapshots-with-vhd-util-blktap2-td4639476.html but it was never answered.
Hope someone here could help me out.

Creating snapshots in VHD works by putting an overlay over the existing VHD image, so that any change gets written into the overlay file instead of overwriting existing data. Reads return the top-most data: either the data from the overlay, if that sector/cluster has already been overwritten, or the data from the original VHD file if it has not.
The vhd-util command creates such an overlay VHD file, which uses the existing VHD image as its so-called "backing file". It is important to remember that the backing file must never be changed while snapshots that use it still exist. Otherwise the data would change in all those snapshots as well (unless it had already been overwritten there).
The process of using backing files can be repeated multiple times, which leads to a chain of VHD files. Only the top-most file should ever be written to; all other files should be treated as immutable.
Reverting to a snapshot is as easy as deleting the current top-most overlay file and creating a new, empty overlay file, which once again exposes the data from the backing file containing the snapshot. This is done with the same command shown above. It preserves your current snapshot and lets you repeat the process as often as you like.
(Renaming the file instead would be more like "revert to and delete the last snapshot".)
Warning: before re-creating the snapshot file, make sure that no other snapshot uses this (intermediate) VHD file as its backing file. Otherwise you would lose not only this snapshot but every other snapshot depending on it.
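Putting the pieces together, a revert might be sketched like this. The file names are the ones from the question; the run wrapper defaults to only printing the commands, since deleting the overlay must not happen while other snapshots still depend on it:

```shell
#!/bin/sh
# Sketch: revert to a snapshot by discarding and re-creating the overlay.
# File names come from the question above; vhd-util is only invoked for
# real when DRY_RUN is explicitly emptied.
BASE=theVMtoBackup.vhd
OVERLAY=aSnapShot.vhd
DRY_RUN=${DRY_RUN-1}   # default to printing the commands in this sketch

run() { if [ -n "$DRY_RUN" ]; then echo "$@"; else "$@"; fi; }

run rm -f "$OVERLAY"                            # drop all changes since the snapshot
run vhd-util snapshot -n "$OVERLAY" -p "$BASE"  # fresh empty overlay on the same base
```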

You don't need to use revert at all: shut down the VM, rename aSnapShot.vhd to theVMtoBackup.vhd, and restart the VM.

Related

Stop rsync from backing up if too many files are being changed

Does anyone know of a way to tell rsync not to perform a backup if it detects that more than X amount of data will be changed? For example, if a run detects that 25% of the data in the destination directory would change, could it automatically abort so that I can evaluate the situation and decide whether to allow it?
I back up my machine every night, but what worries me is this: if my machine gets hit by ransomware, or some other issue destroys or loses a large amount of my data, I really don't want that to propagate into my backup. I used a tool called synconvery which had this feature, but the tool doesn't seem well supported, and I get a lot of permission and read errors with it that I don't see with any other tool. GoodSync also has this feature, but even though it runs on the Mac it doesn't support special characters in file names and replaces them with an underscore when the file is copied. I suspect that will cause problems when I try to restore a file that is referenced with the special character but can't be found because it has a damn underscore.
I like using rsync, and I will eventually retrofit my script to use msrsync, but I can't trust it without this protection in place.

How to convert a snapshot to an Image in OpenStack?

It seems that snapshots and images are very similar (e.g. https://serverfault.com/questions/527449/why-does-openstack-distinguish-images-from-snapshots).
However, I've been unable to share snapshots publicly/globally (i.e. across all projects). Note that I'm a user of the OpenStack installation, not an administrator of it.
Assuming that images don't suffer the same limitation as snapshots, is there a procedure for converting a snapshot to an image? If not, maybe I should ask a separate question, but my cloud admin tells me it needs to be an image.
For downloading there is glance image-download.
Initially I tried this:
openstack image save --file NixOS.qcow2 5426bbcf-06b3-42f3-b117-37395e7dde83
However, the reported size of NixOS.qcow2 was always 0 bytes. Not good. The issue was apparently related to the fact that 0 bytes is also what OpenStack and Horizon reported for the size of the snapshot. So something weird was going on, but functionally I could still use the snapshot to create instances without issue.
I then created a volume from the snapshot in Horizon (with the instance shut off; I couldn't create a volume while it was shelved), then used this command to create an image from the newly created volume (NixSnapVol):
openstack image create --volume NixSnapVol NixSnapVol-img
Interestingly, the reported size went from 41 GB to 45 GB; maybe that was part of the issue. Anyway, it seems to work now, and as a bonus the image is now RAW instead of qcow2, so I don't need to do the conversion (our system benefits considerably from RAW since we have a Ceph backend).
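The whole route can be sketched on the CLI as follows. The names and the size are placeholders from this thread, and the wrapper defaults to printing the commands only, since they need a real OpenStack environment and, for publishing, admin-level rights:

```shell
#!/bin/sh
# Sketch of the snapshot -> volume -> image route described above.
DRY_RUN=${DRY_RUN-1}   # default to printing the commands in this sketch
run() { if [ -n "$DRY_RUN" ]; then echo "$@"; else "$@"; fi; }

# An instance snapshot is itself an image, so a volume can be created from it.
run openstack volume create --image NixSnap --size 45 NixSnapVol
# Turn the volume into a (RAW) image.
run openstack image create --volume NixSnapVol NixSnapVol-img
# Making an image public usually requires admin rights; "openstack image set
# --shared" plus member adds may be an alternative on some clouds.
run openstack image set --public NixSnapVol-img
```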

.goutputstream-XXXXX - possible to relocate?

I've been trying to create a union file system for a college project. One of its features that differentiates it from unionfs is the fact that there are no copy-ups. This means that if a file is located in a certain branch, it will remain there even if it is written to.
But my current problem with that is the fact that .goutputstream-XXXXX are created, renamed, and deleted whenever a write operation occurs. This is actually OK if the file being written to is in the highest priority branch (i.e. the default branch where files can be created), but makes my kernel crash if I try to write to a file in a lower branch.
How do I deal with this? How can I arrange for all .goutputstream-XXXXX files to be written to only one location? These .goutputstream-XXXXX files seem to be intricately connected to the files they correspond to, and seem to work only in the same directory as the file being written to.
I've also noticed that .goutputstream-XXXXX files appear when a directory is read. What are they for, anyway?
A bug has been submitted to the Ubuntu Launchpad in which the creation of .goutputstream-XXXXX files is discussed:
https://bugs.launchpad.net/ubuntu/+source/lightdm/+bug/984785
From what I can see, these files are created when shutting down without a preceding logout, but several other sources may produce them as well, such as Evince or possibly gedit. Maybe LightDM has something to do with their creation.
Which distribution did you use? Maybe changing the distribution would help.
.goutputstream-XXXXX files are created by gedit, and there is no simple way (via menu or settings) to relocate them.

why do you copy the SQLite DB before using it?

Everything I have read so far says that you copy the DB from assets to a "working directory" before it is used. If I have an existing SQLite DB, I put it in assets. Then I have to copy it before it is used.
Does anyone know why this is the case?
I can see a possible reason for that: one doesn't want to accidentally corrupt the database during a write. But in that case one would have to move the database back when done working on it; otherwise, the next time the program is run it will start from the "default" database state.
That might be another use case: you might always want to start program execution with a known data state. The previous state might have been set by an external application.
Thanks everyone for your ideas.
I think what I might have figured out is that the installer cannot put a DB directly into the /data directory.
In Eclipse there is no /data directory, which is where most of the discussions I have read say to put it.
This is one of the several I found:
http://www.reigndesign.com/blog/using-your-own-sqlite-database-in-android-applications/comment-page-4/#comment-37008

How to let humans and programs access the same file without stepping on each others' toes

Suppose I have a file, urls.txt, that contains a list of URLs I'm monitoring. My monitoring script edits that file occasionally, say, to indicate whether each URL is reachable. I'd like to also manually edit that file, to add to or change the list of URLs. How can I allow that such that I don't have to think about it when manually editing?
Here are some possible answers. What would you do?
Engage in hackery like having the program check for the lockfiles that vim or emacs create. Since this is just for me, this would actually work.
If the human edits always take precedence, just have the human clobber the program's changes (e.g., ignore the editor's warning that the file has changed on disk). The program can then redo its changes on its next loop. Still, changing the file while the user is editing it is not so nice.
Never let a human touch a file that a program makes ongoing modifications to. Rethink the design and have one file that only the human edits and another file that only the program edits.
Give the human a custom tool to edit the file that does the appropriate file locking. That could be as crude as locking the file and then launching an editor, or a custom interface (perhaps a simple command line interface) for inserting/changing/deleting entries from the file.
Use a database instead of a flat file and then the locking is all taken care of automatically.
(Note that I concocted the URL monitoring example to make this more concrete and because what I actually have in mind is perhaps too weird and distracting -- this question is strictly about how to let humans and programs both modify the same state file.)
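For what it's worth, option 4 above can be almost trivial on Linux with flock(1) from util-linux. This sketch assumes both sides agree on a lock file name (urls.txt.lock is an assumption), with the monitor taking the same lock around each of its read-modify-write cycles:

```shell
#!/bin/sh
# Sketch of option 4: serialize access to urls.txt via one agreed lock file.
LOCK=${LOCK:-urls.txt.lock}

# Human side: hold the lock for the whole editing session.
edit_urls() {
    flock "$LOCK" -c "${EDITOR:-vi} urls.txt"
}

# Program side: hold the lock only around each read-modify-write.
mark_checked() {
    flock "$LOCK" -c 'echo "# checked $(date -u)" >> urls.txt'
}
```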
I'd use a database since that's basically what you're going to have to build to achieve what you want. Why re-invent the wheel?
If a full-blown DBMS is too heavy, split the file into two and synchronize them periodically. Whether a URL is reachable doesn't sound like something the user would be changing, so it should not be editable by them.
During the synchronization process (which would have to lock out both the monitor and the user, although it could be a sub-function of the monitor), remove entries from the monitor file that aren't in the user file, and add to the monitor file those that have been added to the user file (and start monitoring them).
But, I'd go the database method with a special front-end for the user, since you can get relatively good light-weight databases nowadays.
Use a sensible version control system!
(Git would work well here).
That said, the nature of the problem implies that a real database would be best - and they will generally have either database-level, table-level, or row-level locking - but then put any scripts you need into version control.
I would go with option 3. In fact, I would have the program read the human-edited input file, and append the results of each query to a log file. In this way, you can also analyse the reachability of sites over time. You can also have the program maintain a file that indicates the current reachability state of each site in the input file, as a snapshot of the current state.
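A minimal sketch of that split (curl, the 10-second timeout, and the file names are assumptions): the program treats urls.txt purely as human-edited input and appends one status line per check to a separate log:

```shell
#!/bin/sh
# Sketch: urls.txt is human-edited input only; results go to an append-only log.
URLS=${URLS:-urls.txt}
LOG=${LOG:-reachability.log}

check_all() {
    while IFS= read -r url; do
        case $url in ''|'#'*) continue ;; esac   # skip blanks and comments
        if curl -fsS -o /dev/null --max-time 10 "$url"; then
            status=up
        else
            status=down
        fi
        # One timestamped line per check; the log is never edited by humans.
        printf '%s %s %s\n' "$(date -u +%FT%TZ)" "$status" "$url" >> "$LOG"
    done < "$URLS"
}
```

Since the program only reads the input file and only appends to the log, an editor session on urls.txt can never collide with an in-place rewrite.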
One other option is using two files, one for automated access and one for manual. You'd need a way in the user file to indicate modifications or deletions, but you'd have similar problems with some of the other solutions as well.
