Berkeley DB platform migration

I have a large (several GB) Berkeley DB database that I am thinking of migrating from Windows (2K) to Linux (either Red Hat or Ubuntu). I am not sure how to go about this. Can I simply move the database files across, or do I need a special conversion utility?

Database and log files are portable across systems with different endianness. Berkeley DB will recognize the kind of system it is on and swap bytes accordingly for the data structures it manages that make up the database itself. Berkeley DB's region files, which are memory mapped, are not portable. That is not a big deal in practice: the region files hold the cache and locks, and because your application will not be running during the transition, they will simply be re-created on the new system.
But be careful: Berkeley DB doesn't know anything about the byte order or types in your data (in your keys and values, stored as DBTs). Your application code is responsible for knowing what kind of system it is running on, how it stored the data (big- or little-endian), and how to transition it (or simply re-order it on access). Also, pay close attention to your btree comparison function. That too may behave differently depending on your system's architecture.
Database and log files are also portable across operating systems with the same caveat as with byte-ordering -- the application's data is the application's responsibility.
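To make that concrete, here is a minimal Python sketch of the usual discipline (illustrative only; the helper names are hypothetical, not part of the Berkeley DB API): encode keys and values in one fixed byte order on disk so the same files read back correctly on either machine.

    import struct
    import sys

    def encode_key(n: int) -> bytes:
        # Always big-endian on disk, regardless of the host's byte order.
        # A fixed encoding also keeps the default lexicographic btree
        # comparison sorting unsigned integers numerically.
        return struct.pack(">Q", n)

    def decode_key(raw: bytes) -> int:
        return struct.unpack(">Q", raw)[0]

    # The on-disk representation is identical on both machines:
    assert decode_key(encode_key(42)) == 42
    print(sys.byteorder, encode_key(42).hex())  # e.g. little 000000000000002a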
You might consider reviewing the following:
Selecting a Byte Order
DB->set_lorder()
Berkeley DB's Getting Started Guide for Transactional Applications
Berkeley DB's Reference Guide
Voice-over presentation about Berkeley DB/DS (Data Store)
Voice-over presentation about Berkeley DB/CDS (Concurrent Data Store)
Berkeley DB's Documentation
Disclosure: I work for Oracle as a product manager for Berkeley DB products. :)

There's a cross-platform file transfer utility described here.
You may also need to be concerned about the byte order on your machine, but that's discussed a little here.
If you're using the Java edition of Berkeley DB, though, it shouldn't matter.

Related

Is RocksDB and LevelDB just like Riak?

I have a question regarding some NoSQL databases. In Ehcache we have, for example, the JCache API; in MapDB, the Map interface; and Riak KV runs as its own process with clusters. How exactly do I find out which database fits which implementation type? For example, for RocksDB (I assume that it runs as a process), and the same for LevelDB.
For reference, RocksDB and LevelDB perform very similar functions and can be interchangeable in some situations.
Given your question of Is RocksDB and LevelDB just like Riak?, I can say that they are not the same: Riak provides a scalable distributed platform that can connect to one or more backend databases simultaneously (currently supported backends are Bitcask, LevelDB, Leveled and memory). RocksDB and LevelDB are essentially standalone database engines that can be used as such or be utilised by other software, such as Riak, as a backend. While you could technically implement RocksDB as a backend for Riak KV without needing a mountain of custom code, you probably wouldn't want to, as RocksDB does not scale well.
How do I exactly find out which database fits to which implementation type? is rather a broad question. You might want to rephrase it as Which databases offer me {my list of desired implementations/functions}? to make it easier for community members to answer. Please note that some NoSQL databases have multiple uses available. For example, in Riak KV, as you mentioned, we have Maps, Sets, GSets, Flags, Registers, Solr Search, 2i and the standard CRDT options as well, but some of those are tied to other requirements: 2i only works with a LevelDB/Leveled backend, and Solr Search requires the Yokozuna package version of Riak KV 3.0.0 and above but is built in for all Riak 2.x.x versions.
What you may also want to try to do is download a few different options to a VM or bare metal rig, have a play and see how it works out. There are often cases where two competing products do something very similar on paper but in your specific use case, one outperforms the other significantly.
To get you started, here are links to Riak 2.9.8 (the latest release of the 2.x.x series) and to the Riak 2.2.6 docs (the 2.9.x docs should be out later this month).
I'm not sure if this has directly answered your question but, hopefully, it will give you some pointers as to where to go next.

SQLite backup using Windows DFS Replication [closed]

I have an application that uses SQLite for storage, and I'm wondering whether it is safe to use Windows DFS Replication to back up the database file to a second server, which has a cold-standby instance of the application installed.
Potentially relevant details:
Although DFS supports two-way replication, in this case it is only the master DB file that is written to, so the replication is effectively one-way.
The master DB file is located on the same server as the process that is writing to it.
Currently SQLite is configured to use the standard Rollback Journal, but I could switch to Write-Ahead Log if necessary.
If DFS locks the master DB file during replication, then I think this approach could work as long as the lock isn't held for too long. However, I can't find sufficient information on how DFS is implemented.
UPDATE: I've implemented this in a test environment, and have had it running for several days. During that time I have not encountered any problems, so I am tempted to go with this solution.
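Note that "no problems for several days" does not by itself prove the replicated copies are sound; one cheap extra check is to run SQLite's integrity check against the replicated file on the standby server. A minimal Python sketch for illustration (file name hypothetical):

    import sqlite3

    # Open the replicated copy read-only so the check cannot modify it.
    conn = sqlite3.connect("file:replica.db?mode=ro", uri=True)
    result = conn.execute("PRAGMA integrity_check").fetchone()[0]
    conn.close()
    print("replica status:", result)  # "ok" means the copy is structurally sound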
Considering that DFS Replication is oriented towards files and folders:
DFS Replication is an efficient, multiple-master replication engine that you can use to keep folders synchronized between servers across limited bandwidth network connections.
I would probably avoid it if you care about consistency and keeping all of your data. As the SQLite backup documentation states:
Historically, backups (copies) of SQLite databases have been created using the following method:
1. Establish a shared lock on the database file using the SQLite API (i.e. the shell tool).
2. Copy the database file using an external tool (for example the unix 'cp' utility or the DOS 'copy' command).
3. Relinquish the shared lock on the database file obtained in step 1.
This procedure works well in many scenarios and is usually very fast.
However, this technique has the following shortcomings:
Any database clients wishing to write to the database file while a backup is being created must wait until the shared lock is relinquished.
It cannot be used to copy data to or from in-memory databases.
If a power failure or operating system failure occurs while copying the database file, the backup database may be corrupted following system recovery.
In the case of DFS, it wouldn't even lock the database prior to copying.
I think your best bet would be to use some kind of hot replication. You might want to use the SQLite Online Backup API; see this tutorial on creating a hot backup with the Online Backup API.
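As a quick illustration, Python's built-in sqlite3 module exposes this API through Connection.backup() (Python 3.7 and later). A minimal sketch, with hypothetical file names:

    import sqlite3

    # Source is the live database, destination the backup (names hypothetical).
    src = sqlite3.connect("master.db")
    dst = sqlite3.connect("backup.db")
    with dst:
        # Copies the database page by page while other connections keep
        # working; pages=16 yields between batches instead of locking once.
        src.backup(dst, pages=16)
    dst.close()
    src.close()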
Or if you want something simpler, you might try with SymmetricDS, an open source database replication system, compatible with SQLite.
There are other options (like litereplicator.io), but that one went closed source and is limited to old SQLite versions and databases of roughly 50 MB.
PS: I would probably move away from SQLite if you really need HA, replication or features of this kind. Depending on your programming language of choice, you most probably already have the DB layer abstracted, and you could use MySQL or PostgreSQL.

SQLite, prevent a shared network file

We have an application that can use both Postgres and SQLite as its backend, depending on what the customer requires: High load and concurrency, or easy setup.
But I fear that some customers will be too clever for their own good, and to get around the (slightly) complex Postgres installation, they use SQLite, but put the file on a network disk.
Then the inevitable happens and the database gets corrupted, because SQLite is not meant for that, and we have Postgres support for that very reason.
Is there a reliable way to prevent SQLite from working on a network drive? There are some questionable ideas, like looking for a \\ at the beginning, or the colon in "C:\" (it's purely a Windows app), or parsing for IPs, but none of these are fool-proof, and they can be circumvented by making a link to a network disk.
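For what it's worth, a more robust check than string matching is to ask Windows itself what kind of drive the fully resolved path lives on. Here is a minimal sketch in Python for illustration; the same Win32 call, GetDriveTypeW, is available from any language, and resolving the real path first defeats links to network disks:

    import ctypes
    import os

    DRIVE_REMOTE = 4  # constant from the Win32 GetDriveType documentation

    def is_on_network_drive(path: str) -> bool:
        # Resolve symlinks/junctions first so a local link that points at a
        # share is caught too (os.path.realpath resolves these on 3.8+).
        real = os.path.realpath(path)
        # GetDriveTypeW expects a root such as "C:\" or "\\server\share\".
        drive, _ = os.path.splitdrive(real)
        return ctypes.windll.kernel32.GetDriveTypeW(drive + "\\") == DRIVE_REMOTE

    # e.g. refuse to open the database when is_on_network_drive(db_path) is True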

How does the DBMS affect application performance? And Informix GUI tools?

I use Informix DBMS in all my web applications. My question has two parts:
Does the DBMS have a big effect on the performance of my applications? And if so, how do Informix and MS SQL Server compare in this respect?
I want some GUI tools to make my job easier when writing queries, creating databases, relationships, ERDs, etc. The Informix client is so bad; there are no facilities at all. I want tools like SQL Server Management Studio.
As a GUI tool for Informix you can use Aqua Data Studio from Aquafold. It also supports MS SQL Server.
As for performance: it depends. How well is your database designed? Do you use indexes? Are your queries well written? And so on. It is very hard to answer your question; we just don't know enough.
To design a solution that performs best, one needs to know the nature of the application you are building. For example, if you are building a system that needs to process and compute large volumes of data, and the computations can be distributed, a "traditional" relational database is not a good option no matter which vendor you choose. You would be better off with an option that supports sharding and Hadoop, likely based on some kind of NoSQL solution.
If you are sticking with an RDBMS and building something that has a lot of reads and not a lot of writes, go for a database that supports snapshot isolation, which allows your readers not to be blocked by writers.
Cost also plays into this: some RDBMS products are free, some are not. Your question is way too general to be answered specifically.
Aqua Data Studio is good but quite expensive. The open source tool SQL Workbench/J is also effective for Informix.
Informix has its own charm, but I would not say that MS SQL Server is slower or performs poorly. You should choose a DBMS according to the nature of your application. There are many techniques to optimize database performance: applying indexes, avoiding too many joins, optimizing queries, using stored procedures, splitting across multiple databases, and so on.
Once, when I needed to develop a social media site, I used MySQL for the project, but for posts alone I installed MongoDB.

Using MonoCross, where should the creation of SQLite databases live?

Due to the differences in file structure etc. between platforms, I was wondering whether the database creation (with connection strings) needs to be platform-specific, or whether there's a way to create a database from OnAppLoad() in a platform-agnostic way.
The SQLite file format is fully portable.
A database in SQLite is a single disk file. Furthermore, the file format is cross-platform. A database that is created on one machine can be copied and used on a different machine with a different architecture. SQLite databases are portable across 32-bit and 64-bit machines and between big-endian and little-endian architectures.
You do not need to worry about it at all. The few things worth watching out for are not platform-related; one is the WAL journal mode, because WAL databases cannot be read by SQLite libraries older than 3.7.0.
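If that matters for your deployment, it is cheap to check and, if needed, switch the file back to a rollback journal before distributing it. A minimal sketch using Python's sqlite3 for illustration (file name hypothetical):

    import sqlite3

    conn = sqlite3.connect("app.db")
    mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
    if mode.lower() == "wal":
        # DELETE is the default rollback-journal mode; converting back keeps
        # the file readable by SQLite libraries older than 3.7.0.
        conn.execute("PRAGMA journal_mode=DELETE")
    conn.close()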
You can also read:
http://www.sqlite.org/atomiccommit.html#sect_9_0
and:
http://www.sqlite.org/lockingv3.html#how_to_corrupt
