How can we get the local path of documents in MarkLogic? - xquery

How can we get the local path of documents in MarkLogic? Is there any XQuery or Java method to get the local path of a document?
For example: I have loaded a few documents from my local D drive into MarkLogic. If I later want to fetch the local path from MarkLogic,
is there any query to get that path?
How can I know the local path of each document in MarkLogic?
Does MarkLogic store that information anywhere in a local directory?
I have installed MarkLogic Server at the following path: C:\Program Files\MarkLogic\data. Will it store this information anywhere under this path?
I am not able to find any data related to the documents in the forest location.

When documents are ingested into a MarkLogic database they are stored in a proprietary binary format, along with their index entries, on disk in one of the forests associated with that database.
The default location on Windows for forests is
C:\Program Files\MarkLogic\Data\Forests
The default location on Linux for forests is
/var/opt/MarkLogic/Forests
Documents are tracked by URI, which can look similar to a disk path but is not associated with any particular disk location; the original filesystem path is not retained unless you capture it at load time (for example, by using it as the document URI or storing it in the document's properties). If you want to determine the particular forest a document is stored in, you can take the URI and use xdmp:document-forest with xdmp:forest-name to find the forest name.
xdmp:forest-name(xdmp:document-forest("/my/uri/path/example.xml"))
MarkLogic offers free self-paced and free instructor-led training, and I would suggest starting with the MarkLogic Fundamentals course, as it covers some of these concepts.

Related

How to get sqlite database path from sql.DB instance

I want to get the path, or even the connection string, of a sql.DB instance. The package I'm writing doesn't know what the database path is, and the application may have multiple database files. However, I need to be able to map a specific database to a specific buffered channel, and ignore any additional calls for a database the package has already seen.
The application uses the github.com/mutecomm/go-sqlcipher driver to instantiate the database, if that makes any difference.
Is it possible to discriminate between instances of sql.DB based on the file path of the source database? If so, how do I do it?
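For what it's worth, database/sql does not expose the DSN or file path once sql.Open has returned, so a common workaround is to have the caller pass the path in alongside the *sql.DB and key a registry on it. A minimal sketch, assuming the caller knows the path at registration time (the package name and channel type are hypothetical):

    package dbreg

    import (
        "database/sql"
        "sync"
    )

    type entry struct {
        db *sql.DB
        ch chan string // hypothetical per-database buffered channel
    }

    var (
        mu       sync.Mutex
        registry = make(map[string]entry)
    )

    // Register maps a database, identified by its file path, to a buffered
    // channel. A path the package has already seen is ignored and the
    // existing channel is returned instead.
    func Register(path string, db *sql.DB) chan string {
        mu.Lock()
        defer mu.Unlock()
        if e, ok := registry[path]; ok {
            return e.ch
        }
        e := entry{db: db, ch: make(chan string, 64)}
        registry[path] = e
        return e.ch
    }

Comparing *sql.DB pointers does distinguish instances, but two sql.Open calls on the same file yield different pointers, so a normalized path (e.g. via filepath.Abs) is the more reliable key.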

After a certain limit it should point to a different VM

I have a file system in which files are stored under a numeric ID (a SQL index), but my VM's disk is full and I can't move my files to a different cloud or anywhere else.
My file system URLs look like
https://example.com/5374/randomstring.jpg
where 5374 is the file number saved in the SQL DB and the random string is generated.
What I'm planning is to use nginx to redirect: right now I have files up to 56770 on one VM; if a user uploads a new file, it should be saved on a different VM, and if a user requests 56771, nginx should point to that VM.
You will make your life easier by choosing the cutoff point yourself; it's not essential, but it will make the matching regular expression a lot more concise.
If you said 56000 and above lived on VM2, then for five-digit IDs the regex is as simple as /(5[6-9][0-9]{3}|[6-9][0-9]{4})/: the first alternative matches 56000-59999 and the second matches 60000-99999.
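If you would rather keep the split in application code than in an nginx regex, the same cutoff can be applied in a small reverse proxy. A Go sketch of that alternative (the backend addresses and cutoff value are assumptions for illustration):

    package main

    import (
        "log"
        "net/http"
        "net/http/httputil"
        "net/url"
        "strconv"
        "strings"
    )

    func main() {
        vm1, _ := url.Parse("http://vm1.internal") // hypothetical backends
        vm2, _ := url.Parse("http://vm2.internal")
        oldProxy := httputil.NewSingleHostReverseProxy(vm1)
        newProxy := httputil.NewSingleHostReverseProxy(vm2)

        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            // URLs look like /5374/randomstring.jpg; the first path
            // segment is the numeric file ID.
            seg := strings.SplitN(strings.TrimPrefix(r.URL.Path, "/"), "/", 2)
            id, err := strconv.Atoi(seg[0])
            if err != nil {
                http.NotFound(w, r)
                return
            }
            if id > 56770 { // files past the cutoff live on the second VM
                newProxy.ServeHTTP(w, r)
                return
            }
            oldProxy.ServeHTTP(w, r)
        })
        log.Fatal(http.ListenAndServe(":8080", nil))
    }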

cosmosdb - archive data older than n years into cold storage

I have researched in several places and could not find any direction on what options there are to archive old data from Cosmos DB into cold storage. I see that for DynamoDB in AWS it is mentioned that you can move DynamoDB data into S3, but I'm not sure what the options are for Cosmos DB. I understand there is a time-to-live option where data is deleted after a certain date, but I am interested in archiving rather than deleting. Any direction would be greatly appreciated. Thanks
I don't think there is a single-click built-in feature in Cosmos DB to achieve that.
Still, since you mentioned appreciating any direction, I suggest you consider the DocumentDB Data Migration Tool.
Notes about the Data Migration Tool:
- You can specify a query to extract only the cold data (for example, by a creation date stored within the documents).
- It supports exporting to various targets (JSON file, blob storage, DB, another Cosmos DB collection, etc.).
- It compacts the data in the process: it can merge documents into a single array document and zip it.
- Once you have the configuration set up, you can script this to be triggered automatically using your favorite scheduling tool.
- You can easily reverse the source and target to restore the cold data to the active store (or to dev, test, backup, etc.).
To remove the exported data you could use the TTL feature mentioned above, but that could cause data loss should your export step fail. I would suggest writing and executing a stored procedure to query and delete all exported documents with a single call. That SP would not execute automatically, but could be included in the automation script and executed only if the data was exported successfully first (see the sketch below).
See: Azure Cosmos DB server-side programming: Stored procedures, database triggers, and UDFs.
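As one concrete shape for that automation step, here is a Go sketch that queries the already-exported cold documents and deletes them via the azcosmos SDK, as a client-side alternative to a server-side SP; the connection string, database/container names, partition key, and date filter are all assumptions:

    package main

    import (
        "context"
        "encoding/json"
        "log"

        "github.com/Azure/azure-sdk-for-go/sdk/data/azcosmos"
    )

    func main() {
        ctx := context.Background()
        client, err := azcosmos.NewClientFromConnectionString("<connection-string>", nil)
        if err != nil {
            log.Fatal(err)
        }
        container, err := client.NewContainer("mydb", "mycoll") // hypothetical names
        if err != nil {
            log.Fatal(err)
        }

        // Same filter the export step used, so only exported docs are removed.
        pk := azcosmos.NewPartitionKeyString("tenant-1") // hypothetical key
        pager := container.NewQueryItemsPager(
            "SELECT c.id FROM c WHERE c.createdAt < '2020-01-01'", pk, nil)
        for pager.More() {
            page, err := pager.NextPage(ctx)
            if err != nil {
                log.Fatal(err)
            }
            for _, raw := range page.Items {
                var doc struct {
                    ID string `json:"id"`
                }
                if err := json.Unmarshal(raw, &doc); err != nil {
                    log.Fatal(err)
                }
                // Only run this step after the export has verifiably succeeded.
                if _, err := container.DeleteItem(ctx, pk, doc.ID, nil); err != nil {
                    log.Fatal(err)
                }
            }
        }
    }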
UPDATE:
Cosmos DB has since added the Change feed, which really simplifies writing a carbon copy somewhere else.

Where does OpenStack Swift store the rings?

Does anybody know where OpenStack Swift stores the "rings"? Is there a distributed algorithm, or is it just one table somewhere on some of the storage nodes with information about all (!) the physical object locations? I cannot believe that, because from my understanding of object storage it should scale to exabytes, and that would require a huge number of entries in such a table...
This page could not help me: http://docs.openstack.org/developer/swift/overview_ring.html
Thanks in advance for your help!
Ring Builder
The rings are built and managed manually by a utility called the ring-builder. The ring-builder assigns partitions to devices and writes an optimized Python structure to a gzipped, serialized file on disk for shipping out to the servers. The server processes just check the modification time of the file occasionally and reload their in-memory copies of the ring structure as needed.
So, the ring is stored on all servers.
If you were asking about the path of the ring.gz files, they live under /etc/swift by default.
Also, these ring files can be updated from the .builder files when a swift rebalance is run.

Are non-stored Solr fields stored as encrypted data at rest?

If Solr puts my documents in an index that includes personally identifiable information, will the index files contain meaningful information that would allow someone to read the documents in plain text from the disk (i.e., at rest)?
Solr will not encrypt the information stored in the index files.
So a person who has access to the index files or your Solr environment could copy the index files or the whole Solr environment, run the index/Solr installation on a different host, and walk/read through the index.
