upload_archive() operation in boto3 for glacier - python-3.4

Does the upload_archive() operation in boto3 for glacier automatically use multi-part upload when the data to be uploaded is larger than 100MB?
I believe this is the case in boto2 (see lenrok258's answer in Boto Glacier - Upload file larger than 4 GB using multipart upload).
I have tried different ways to view the source code for the upload_archive() operation in boto3 for Glacier, but I haven't been able to find it using inspect or ipython. If anyone happens to know how to do this and is willing to share, it would be much appreciated.

Unlike boto2, boto3 does not automatically use multi-part upload.
From a comment by a member of the boto project on an issue on the boto3 GitHub:
... boto3 does not have the ability to automatically handle
multipart uploads to Glacier. That would be a feature request. There
are some features that exist in boto2 that have not been implemented
in boto3.
You'll have to implement it yourself using the initiate_multipart_upload functionality.
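For reference, a rough sketch of what that could look like with the plain boto3 Glacier client. The vault name, file name, and part size below are hypothetical, and the tree-hash checksums are computed with botocore's calculate_tree_hash helper:

import io
import os

import boto3
from botocore.utils import calculate_tree_hash

VAULT = 'my-vault'             # hypothetical vault name
FILE_PATH = 'archive.bin'      # hypothetical local file
PART_SIZE = 8 * 1024 * 1024    # must be a power-of-two multiple of 1 MiB

glacier = boto3.client('glacier')
size = os.path.getsize(FILE_PATH)

# Start the multipart upload; partSize is passed as a string.
upload = glacier.initiate_multipart_upload(
    vaultName=VAULT,
    archiveDescription='multipart upload sketch',
    partSize=str(PART_SIZE),
)
upload_id = upload['uploadId']

# Upload each part with an explicit byte range and its tree-hash checksum.
with open(FILE_PATH, 'rb') as f:
    offset = 0
    while offset < size:
        chunk = f.read(PART_SIZE)
        glacier.upload_multipart_part(
            vaultName=VAULT,
            uploadId=upload_id,
            range='bytes {}-{}/*'.format(offset, offset + len(chunk) - 1),
            checksum=calculate_tree_hash(io.BytesIO(chunk)),
            body=chunk,
        )
        offset += len(chunk)

# Finish by supplying the size and tree hash of the whole archive.
with open(FILE_PATH, 'rb') as f:
    archive_checksum = calculate_tree_hash(f)

result = glacier.complete_multipart_upload(
    vaultName=VAULT,
    uploadId=upload_id,
    archiveSize=str(size),
    checksum=archive_checksum,
)
print(result['archiveId'])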
Or, as another commenter on the issue suggests instead:
The optimal usage pattern for interacting with Glacier is generally
to upload to S3 and use S3 lifecycle policies to transition the
object to Glacier.
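If you go that route, the lifecycle rule itself can also be created with boto3. A minimal sketch, with a hypothetical bucket name and prefix:

import boto3

s3 = boto3.client('s3')

# Transition everything under the archives/ prefix to Glacier after 30 days.
s3.put_bucket_lifecycle_configuration(
    Bucket='my-bucket',              # hypothetical bucket name
    LifecycleConfiguration={
        'Rules': [
            {
                'ID': 'archive-to-glacier',
                'Filter': {'Prefix': 'archives/'},
                'Status': 'Enabled',
                'Transitions': [
                    {'Days': 30, 'StorageClass': 'GLACIER'},
                ],
            },
        ],
    },
)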

Related

How to download dynamodb backup to localhost

I have a backup of a DynamoDB table and I want to download it to my localhost in order to restore it on a local DynamoDB instance. I couldn't find any documentation; every tool I found, like dynamodump, creates an on-demand backup and then downloads it. Can anyone help me?
Your best bet is to do an Export to S3 and then you’ll have direct access to the objects in S3. Hopefully that satisfies your need?
The backup you mention belongs to DynamoDB and is not directly accessible to you; its only purpose is to restore DynamoDB tables in the cloud.
You have 2 options
1. Export to S3
As hunterhacker stated, you can do an export to S3 and then download the data from there.
2. Scan
A more cost-effective solution is to write a local script that does a Scan (or, for a large amount of data, a parallel Scan) and stores the data locally, as in the sketch below.
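A rough sketch of that approach with boto3, assuming a hypothetical table named my-table and writing the items to a local JSON file (DynamoDB numbers come back as Decimal, so they are stringified for serialization):

import json

import boto3

table = boto3.resource('dynamodb').Table('my-table')  # hypothetical table name

items = []
kwargs = {}
while True:
    # Scan one page at a time; LastEvaluatedKey marks where to resume.
    response = table.scan(**kwargs)
    items.extend(response['Items'])
    if 'LastEvaluatedKey' not in response:
        break
    kwargs['ExclusiveStartKey'] = response['LastEvaluatedKey']

with open('my-table.json', 'w') as f:
    # default=str turns Decimal (and other non-JSON types) into strings.
    json.dump(items, f, default=str)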

import individual files automatically to firestore with POST: https://firestore.googleapis.com/v1beta1/{database}/documents.commit?

As the title suggests, can I POST a single JSON file directly to my Firestore database with https://firestore.googleapis.com/v1beta1/{database}/documents.commit, and will it be processed and added to the collections, etc.? Or should I go with POST projects.databases.documents.createDocument? I was reading this documentation.
I want to put JSON files from different sources into my Firestore database to build up my collection.
And where should I put the filename of the JSON file that I want to upload?
Thanks!!
You can see here [1] the usage of both calls:
documents.commit = Commits a transaction, while optionally updating documents
documents.createDocument = Creates a new document
To send the JSON in the API call you need to make a POST request; check this question [2].
Also, regarding your last comment, you can start collections and add documents using the Firestore UI, but also you can do that using client libraries in different languages (Python, Java, Go...). Here is a list of "How to"s regarding Firestore [3].
In case you think that some features are missing, you can always file a Feature Request following this link [4] (as Firestore is still not there, I would choose Datastore), but keep in mind that Firestore is still in Beta.
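To illustrate the createDocument route with a local JSON file, here is a rough Python sketch using the requests library. The project ID, collection, file name, and OAuth token are all hypothetical, and note that the REST API expects typed field values (stringValue, integerValue, ...) rather than raw JSON values:

import json

import requests

PROJECT_ID = 'my-project'        # hypothetical project ID
COLLECTION = 'my-collection'     # hypothetical collection
ACCESS_TOKEN = '...'             # OAuth2 token with the Firestore scope

def to_firestore_value(value):
    """Wrap a plain JSON value in the typed format the REST API expects."""
    if isinstance(value, bool):
        return {'booleanValue': value}
    if isinstance(value, int):
        return {'integerValue': str(value)}
    if isinstance(value, float):
        return {'doubleValue': value}
    if isinstance(value, str):
        return {'stringValue': value}
    if isinstance(value, list):
        return {'arrayValue': {'values': [to_firestore_value(v) for v in value]}}
    if isinstance(value, dict):
        return {'mapValue': {'fields': {k: to_firestore_value(v) for k, v in value.items()}}}
    return {'nullValue': None}

with open('source.json') as f:   # hypothetical input file
    data = json.load(f)

url = ('https://firestore.googleapis.com/v1beta1/projects/{}/databases/'
       '(default)/documents/{}'.format(PROJECT_ID, COLLECTION))
response = requests.post(
    url,
    headers={'Authorization': 'Bearer {}'.format(ACCESS_TOKEN)},
    json={'fields': {k: to_firestore_value(v) for k, v in data.items()}},
)
print(response.status_code, response.json())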

cosmosdb - archive data older than n years into cold storage

I researched in several places and could not find any direction on what options there are to archive old data from Cosmos DB into cold storage. I see that for DynamoDB in AWS it is mentioned that you can move DynamoDB data into S3, but I'm not sure what the options are for Cosmos DB. I understand there is a time-to-live option where the data will be deleted after a certain date, but I am interested in archiving rather than deleting. Any direction would be greatly appreciated. Thanks
I don't think there is a single-click built-in feature in CosmosDB to achieve that.
Still, since you mentioned that any direction would be appreciated, I suggest you consider the DocumentDB Data Migration Tool.
Notes about the Data Migration Tool:
- You can specify a query to extract only the cold data (for example, by a creation date stored within the documents).
- It supports exporting to various targets (JSON file, blob storage, DB, another Cosmos DB collection, etc.).
- It compacts the data in the process: it can merge documents into a single array document and zip it.
- Once you have the configuration set up, you can script it to be triggered automatically using your favorite scheduling tool.
- You can easily reverse the source and target to restore the cold data to the active store (or to dev, test, backup, etc.).
To remove the exported data you could use the TTL feature you mentioned, but that could cause data loss should your export step fail. I would suggest writing and executing a stored procedure to query and delete all exported documents with a single call. That SP would not execute automatically, but it could be included in the automation script and executed only if the data was exported successfully first.
See: Azure Cosmos DB server-side programming: Stored procedures, database triggers, and UDFs.
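If you prefer to do that cleanup from the client side instead of a stored procedure, a rough sketch with the azure-cosmos Python SDK could look like the following. The account URL, key, database, container, query, and partition key field are all hypothetical, and this issues one delete per document rather than a single server-side call:

from azure.cosmos import CosmosClient

# Hypothetical endpoint, key, and names.
client = CosmosClient('https://my-account.documents.azure.com:443/', credential='<account-key>')
container = client.get_database_client('my-database').get_container_client('my-container')

# The same kind of filter used for the export step (here: by creation date).
query = "SELECT * FROM c WHERE c.createdAt < '2015-01-01'"

for doc in container.query_items(query=query, enable_cross_partition_query=True):
    # Partition key field name is hypothetical; adjust to your container's definition.
    container.delete_item(item=doc['id'], partition_key=doc['partitionKey'])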
UPDATE:
These days Cosmos DB has added a change feed. This really simplifies writing a carbon copy somewhere else.

Using the Nexus3 API how do I get a list of artifacts in a repository

We are migrating from Nexus Repository Manager 2.1.4 to Nexus 3.1.0-04. With version 2 we have been able to use the API to get a list of artifacts by repository, however we are struggling to find a way to do this with the Nexus 3 API.
Having read https://books.sonatype.com/nexus-book/reference3/scripting.html chapter 16 we have been able to get artifact information for a specific blob using a groovy script like:
import org.sonatype.nexus.blobstore.api.BlobId
def properties = blobStore.blobStoreManager.get("default").get(new BlobId("7f6379d32f8dd78f98b5b181166703b6")).getProperties()
return [headers: properties.headers, metrics: properties.metrics]
However we can't find a way to iterate over the contents of a blob store. We can get a blob store object:
blobStore.blobStoreManager.get("default")
however the API does not appear to give us a way to get a list of all blobs within that store. We need to get a list of the blobIDs within a blob store.
Is there a way to do this via the Nexus 3 API?
One of our internal team members put this together. It doesn't use the blobStore but accomplishes I believe what you are trying to do (and a bit more): https://gist.github.com/kellyrob99/2d1483828c5de0e41732327ded3ab224
For some background, think of a blobStore as just where we store the bits, with no information about them. OrientDB has Component/Asset records and stores all the info about them. You'll generally want to use that instead of the blobStore for Asset information as a result.
Once your migration is done, it may be worth investigating an upgrade of your Nexus version.
That way, you will be able to use the new (still in beta) REST API for Nexus. It's available by default from version 3.3.0 onwards: http://localhost:8082/swagger-ui/
Basically, you retrieve the JSON output from this URL: http://localhost:8082/service/siesta/rest/beta/assets?repositoryId=YOURREPO
Only 10 records will be displayed at a time and you will have to use the continuationToken provided to request the next 10 records for your repository by calling: http://localhost:8082/service/siesta/rest/beta/assets?continuationToken=46525652a978be9a87aa345bdb627d12&repositoryId=YOURREPO
More information here: http://blog.sonatype.com/nexus-repository-new-beta-rest-api-for-content
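A small sketch of walking those pages with Python's requests library, using the URLs quoted above. The host, port, repository name, and credentials are placeholders, and parameter and field names may differ slightly between Nexus versions:

import requests

BASE_URL = 'http://localhost:8082/service/siesta/rest/beta/assets'
REPO = 'YOURREPO'                       # placeholder repository name
AUTH = ('admin', 'admin123')            # placeholder credentials

assets = []
params = {'repositoryId': REPO}
while True:
    page = requests.get(BASE_URL, params=params, auth=AUTH).json()
    assets.extend(page['items'])
    # continuationToken is null on the last page.
    if not page.get('continuationToken'):
        break
    params['continuationToken'] = page['continuationToken']

for asset in assets:
    print(asset['downloadUrl'])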

Where does OpenStack Swift store the rings?

Does anybody know where OpenStack Swift stores the "Rings"? Is there a distributed algorithm, or is it just one table somewhere on some of the storage nodes with information about all (!) the physical object locations? (I cannot believe that, because from my understanding of object storage it should scale to exabytes, and that would need lots of entries in such a table...)
This page could not help me: http://docs.openstack.org/developer/swift/overview_ring.html
Thanks in advance for your help!
Ring Builder
The rings are built and managed manually by a utility called the ring-builder. The ring-builder assigns partitions to devices and writes an optimized Python structure to a gzipped, serialized file on disk for shipping out to the servers. The server processes just check the modification time of the file occasionally and reload their in-memory copies of the ring structure as needed.
So, it's stored on all servers.
If you were asking about the path of the .ring.gz files, they are under /etc/swift by default.
Also, these ring files can be updated from the .builder files when a rebalance is run.
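If you want to look inside one of those files, the swift Python package itself can load it. A minimal sketch, assuming swift is installed and a standard /etc/swift/object.ring.gz exists (the account, container, and object names are made up):

from swift.common.ring import Ring

# Load /etc/swift/object.ring.gz; the Ring object reloads it if the file changes on disk.
ring = Ring('/etc/swift', ring_name='object')

# Every device (disk) the ring knows about.
for dev in ring.devs:
    if dev is not None:
        print(dev['ip'], dev['port'], dev['device'])

# Which partition and which storage nodes hold a given object.
partition, nodes = ring.get_nodes('AUTH_test', 'my-container', 'my-object')
print(partition, [node['device'] for node in nodes])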
