How log files are created in Berkeley DB Java Edition DB base API - berkeley-db

We are using the Berkeley DB Java Edition DB base API. We have already read/written a CDR file of 9 lakh (900,000) rows, both with and without transactions, using the secondary database concept. The issues we are getting are as follows:
With transactions: the size of the database environment is 1.63 GB, due to the number of log files created, each of 10 MB.
Without transactions: the size of the database environment is 588 MB, and only one log file of 10 MB is created. We want to know how this happens.
How are log files created? What does it mean to use or not use transactions in a DB environment? And what are these files __db.001, __db.002, __db.003, __db.004, __db.005 and the log files like log.0000000001...? Please reply soon.

It looks like this question was already answered here: What are log files and why are they created during transactions in the Berkeley DB core API (DB API)?
From your description it actually looks like you're using Berkeley DB core, not Java Edition. __db.001 through __db.005 are the environment's shared memory region files. The environment files are described here. The log.* files are the transaction log files; they are described in the answer referenced above.
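To make the difference concrete, here is a minimal sketch of how a transactional environment is typically opened with the core Java API (com.sleepycat.db); the home directory is illustrative. The region subsystems are what back the __db.00N files, and enabling logging/transactions is what produces the log.* files:

```java
import java.io.File;
import com.sleepycat.db.Environment;
import com.sleepycat.db.EnvironmentConfig;

// A minimal sketch, assuming the Berkeley DB core Java API (com.sleepycat.db).
// The environment home directory is illustrative.
public class EnvExample {
    public static void main(String[] args) throws Exception {
        EnvironmentConfig envConfig = new EnvironmentConfig();
        envConfig.setAllowCreate(true);

        // These subsystems are backed by the shared memory region files
        // (__db.001, __db.002, ...) created in the environment home.
        envConfig.setInitializeCache(true);    // memory pool
        envConfig.setInitializeLocking(true);  // lock subsystem

        // Enabling logging and transactions is what produces the
        // log.0000000001, log.0000000002, ... files: every transactional
        // write is recorded there before it is applied, so the environment
        // can be recovered after a crash.
        envConfig.setInitializeLogging(true);
        envConfig.setTransactional(true);

        Environment env = new Environment(new File("/path/to/env/home"), envConfig);
        // ... open databases, run transactions ...
        env.close();
    }
}
```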
These types of questions can often be more easily/quickly answered on the Berkeley DB forum on OTN.
Regards,
Dave

Related

Generating ZIP files in azure blob storage

What is the best method to zip large files present in Azure blob storage and download them to the user in an archive file (zip/rar)?
Can using Azure Batch help?
Currently we implement this function in a traditional way: we read the stream, generate the zip file and return the result, but this takes a lot of resources on the server and time for the users.
I am asking about the best technical solution and technologies (preferably using Microsoft technologies).
There are a few ways you can do this **from an Azure Batch point of view**. (For the initial part, your code owns whatever zip API it uses to zip the files; once the archive is in blob storage and you want to use it on the nodes, the options below apply.)
For the initial part of your question I found this, which could come in handy: https://microsoft.github.io/AzureTipsAndTricks/blog/tip141.html (this is mainly for the idea; you know your requirements better and will need to design your solution space accordingly).
In options 1 and 3 below you need to make sure your code handles unzipping or unpacking the archive. Option 2 is the Batch built-in feature for *.zip files, at both pool and task level.
Option 1: You could add your *.rar or *.zip file as an Azure Batch resource file and then unzip it at the start-task level, once the resource file is downloaded. See: Azure Batch pool start task to download a resource file from Blob/FileShare.
Option 2: The best option, if you have a zip (rather than rar) file in play, is the Azure Batch application packages feature, linked here: https://learn.microsoft.com/en-us/azure/batch/batch-application-packages
The application packages feature of Azure Batch provides easy management of task applications and their deployment to the compute nodes in your pool. With application packages, you can upload and manage multiple versions of the applications your tasks run, including their supporting files. You can then automatically deploy one or more of these applications to the compute nodes in your pool.
https://learn.microsoft.com/en-us/azure/batch/batch-application-packages#application-packages
An application package is a .zip file that contains the application binaries and supporting files that are required for your tasks to run the application. Each application package represents a specific version of the application.
With regard to size: refer to the maximum allowed blob size linked in the document above.
Option 3: (Not sure if this will fit your scenario; it is a long shot.) You could also mount a blob container as a virtual drive when nodes join the pool, via the mount feature in Azure Batch, and then write code in the start task (or similar) to unzip from the mounted location.
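Separately from Batch, if the main cost today is building the whole archive in memory on your server, a streaming approach can help: read each source blob as a stream and write it straight into a ZipOutputStream that targets the destination blob. A rough sketch, assuming the Java azure-storage-blob v12 SDK; the connection string, container and blob names are placeholders:

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

import com.azure.storage.blob.BlobContainerClient;
import com.azure.storage.blob.BlobServiceClientBuilder;

// Sketch only: streams source blobs into a zip blob without holding whole
// files in memory. Assumes the azure-storage-blob v12 SDK; all names and the
// connection string are placeholders.
public class ZipBlobsSketch {
    public static void zipBlobs(String connectionString,
                                String containerName,
                                List<String> sourceBlobNames,
                                String targetZipBlobName) throws Exception {
        BlobContainerClient container = new BlobServiceClientBuilder()
                .connectionString(connectionString)
                .buildClient()
                .getBlobContainerClient(containerName);

        // Open an output stream directly on the destination blob and wrap it
        // in a ZipOutputStream, so data is uploaded as it is compressed.
        try (OutputStream blobOut = container.getBlobClient(targetZipBlobName)
                                             .getBlockBlobClient()
                                             .getBlobOutputStream(true);
             ZipOutputStream zipOut = new ZipOutputStream(blobOut)) {

            byte[] buffer = new byte[8192];
            for (String name : sourceBlobNames) {
                zipOut.putNextEntry(new ZipEntry(name));
                try (InputStream in = container.getBlobClient(name).openInputStream()) {
                    int read;
                    while ((read = in.read(buffer)) != -1) {
                        zipOut.write(buffer, 0, read);
                    }
                }
                zipOut.closeEntry();
            }
        }
    }
}
```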
Hope this helps :)

Transfer content from one Alfresco instance to another (same version) on another server

What would be the best/better way to transfer repository content from one Alfresco (Enterprise Edition) instance to another running on a different server? Currently we copy the entire Alfresco database and file system under alf_data, but that requires downtime on the servers.
I need a mechanism without downtime, where the repository data is copied from one instance to another. Is there any way this is possible?
In addition to Heiko's solution, you might be interested in:
The out-of-the-box replication service, which wouldn't be good for replicating your entire repo, but can be used for replicating a handful of nodes from one server to another.
A solution from Parashift which allows one- and two-way replication of nodes between servers.
An Alfresco presentation on using Apache Camel and Apache Kafka to replicate nodes between servers. This is available through Alfresco's professional services organization, but it may make it into the product at some point. Or you could use it as inspiration to write your own solution.
What is your intention? A standby system, a real copy, an external private cloud with a subset of data?
If you just need a 100% clone you can script backup & restore without downtime on the source server. Downtime is limited to the db and index restore on the target system. Your script shouldn't copy live data from the Solr index - use the backup created by the Solr backup job instead. Depending on the database you use, an online db backup shouldn't be an issue.
Our Alfresco Virtual Appliance has preconfigured scripts and jobs for this task to start an additional alfresco instance from snapshot backups without copying the contentstore (we call this Alfresco Time Machine).
If your aim is an external private cloud server or a road-warrior solution, ecm4u has a commercial Alfresco module to sync a subset of modified nodes very efficiently, including metadata/types/aspects (the list of types and aspects needs to be defined). This sync provides a REST interface for automation and also manual execution from Alfresco's admin console. We support a mix of Alfresco versions and editions. At the moment the sync is implemented as a unidirectional sync, but it could be extended to a bidirectional sync.
I recently installed 2 Alfresco instances on my local machine, running on 2 different ports.
While performing some tasks, I realized that the 2 instances having the same repository ID was creating issues.
I was able to change the repository ID of one of them with the steps below:
Update alfresco-global.properties:
db.name="Add new DB Name"
(Alfresco will create a database with the name given in db.name while initializing.)
Then restart the server.
If you are still facing issues, try deleting the Solr indexes under the alf_data folder.

Open SQLite database on Google App Engine

Is there any way to open and read a SQLite database file on GAE?
I am currently uploading dbs to blobstore as admin and serving them publicly to user clients. I just can't read them in the GAE admin interface.
You can use SQLite on Google App Engine. The problem has nothing to do with the support of certain libraries; it has to do with the read-only file system. There is, however, a writable /tmp directory. If your app, on startup, first copies the db.sqlite3 file to /tmp/db.sqlite3 and references this path as the database path, it will work.
There are, however, drawbacks.
1. This is not a "real" directory, i.e. it's stored in memory. If the database is too large, you will run into problems.
2. Each instance has its own copy of the db.sqlite3 file, so this does not scale well.
Here is a django example:
Using SQLITE for local Django development for Google App Engine?
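And here is a minimal Java sketch of the same copy-to-/tmp idea, assuming the org.xerial sqlite-jdbc driver is on the classpath; the file names are illustrative:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

// Minimal sketch of "copy the bundled database to /tmp at startup, then open
// it from there". Assumes the org.xerial sqlite-jdbc driver on the classpath;
// paths are illustrative. Remember: changes to /tmp are per-instance and are
// lost on restart.
public class TmpSqliteSketch {
    public static void main(String[] args) throws Exception {
        Path bundled = Paths.get("db.sqlite3");       // read-only, shipped with the app
        Path writable = Paths.get("/tmp/db.sqlite3"); // writable /tmp directory

        Files.copy(bundled, writable, StandardCopyOption.REPLACE_EXISTING);

        try (Connection conn = DriverManager.getConnection("jdbc:sqlite:" + writable);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            while (rs.next()) {
                System.out.println(rs.getInt(1));
            }
        }
    }
}
```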
Short answer: no, it is not possible to use a SQLite database in a standard Google App Engine application, as it is not currently supported. However, you can try implementing your own configuration with the App Engine flexible environment, which lets you take advantage of custom libraries through infrastructure customization.
In case you want to experiment, here is a sample Django application designed to be run with its default SQLite database on the App Engine flexible environment. Still, make sure to read the database notice, which provides alternative data storage options and explains that SQLite data does not persist upon instance restart.

copying sqlite3 db while being read

I have a script that was reading data from a sqlite3 database, and while this script was running I made a copy of the database with cp mydatabase mydatabase.bak. Will this affect either the script that was reading from the db or the copy of the db? I had a look at the SQLite documentation here [0], but I didn't put a lock on the db as per the instructions.
[0] http://www.sqlite.org/backup.html
Copying the file should be analogous to another application reading the database, so it shouldn't be a problem. Multiple applications can safely read the database file at the same time (per the SQLite FAQ).
As another point, consider that you can read from a database even if the database and its directory both lack write permissions. Since in that scenario there's no way for the reading application to be modifying the database file or creating a temp file that needs to be incorporated into it, there's no way for any of a number of simultaneously reading applications to affect what any of the others see.
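If you ever need a copy that is guaranteed consistent even while a writer happens to be active, the online backup API described on the page linked in the question is the safer route. A small sketch, assuming the org.xerial sqlite-jdbc driver, which (as far as I know) exposes that API through a "backup to" convenience statement:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// Sketch: take a consistent copy of a live SQLite database via the online
// backup API. Assumes the org.xerial sqlite-jdbc driver, which exposes the
// backup API as a "backup to" convenience statement. File names are
// illustrative.
public class SqliteBackupSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:sqlite:mydatabase");
             Statement stmt = conn.createStatement()) {
            stmt.executeUpdate("backup to mydatabase.bak");
        }
    }
}
```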

What are log files and why are they created during transactions in the Berkeley DB core API (DB API)?

We are using Berkeley DB Java Edition (core API) to read/write CDR files, and we are having a problem with log files.
When we write 9 lakh (900,000) records to the database, multiple log files are created with a huge total size of 1.08 GB. We want to know why multiple log files are created while using transactions. Is it due to every commit statement after writing data to the database, or is there another reason?
This is normal. The log files contain ongoing transactions, as well as information you can use to recover the database (which means they are suitable for use in backup and disaster recovery).
Read chapter 5 of the documentation carefully, as well as this section which explains the periodic maintenance you need to do on your database.
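As a rough illustration of that periodic maintenance (take a checkpoint so committed changes reach the database files, then remove log files that are no longer needed for recovery), here is a sketch against the core Java API; the method names below are from memory, so verify them against the javadoc of your release:

```java
import com.sleepycat.db.CheckpointConfig;
import com.sleepycat.db.Environment;

// Rough sketch of periodic log maintenance with the core Java API
// (com.sleepycat.db). Method names are from memory -- verify against the
// javadoc of your Berkeley DB release before relying on this.
public class LogMaintenanceSketch {
    public static void maintain(Environment env) throws Exception {
        // Write a checkpoint so that committed changes are flushed from the
        // transaction log into the database files.
        CheckpointConfig checkpointConfig = new CheckpointConfig();
        checkpointConfig.setForce(true);
        env.checkpoint(checkpointConfig);

        // Remove log files that are no longer needed for normal recovery.
        // Keep copies elsewhere first if you rely on them for catastrophic
        // recovery or backups.
        env.removeOldLogFiles();
    }
}
```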
