gsutil zip directory on google cloud storage

Is it possible to compress a file or directory in Google Cloud Storage without downloading it first and re-uploading it?
I think I need a tool similar to http://googlegenomics.readthedocs.org/en/latest/use_cases/compress_or_decompress_many_files/
Thank you.

No. There is no way to ask GCS to directly compress or decompress objects entirely within GCS. You can certainly copy them elsewhere in the cloud (GCE, for instance) and operate on them there, or you could download an uncompressed object as a compressed object simply by using the Accept-Encoding: gzip header, but, again, not without taking it out of GCS in some fashion.
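To make that round trip concrete, here is a minimal Python sketch of the download-compress-reupload flow using the google-cloud-storage client; it is not an in-place operation, and the bucket and object names are placeholders:
import gzip
from google.cloud import storage

client = storage.Client()
bucket = client.bucket('your-bucket')             # placeholder bucket name

src = bucket.blob('images/big-file.txt')          # placeholder object name
data = src.download_as_bytes()                    # the object still leaves GCS at this point

compressed = gzip.compress(data)
dst = bucket.blob('images/big-file.txt.gz')
dst.upload_from_string(compressed, content_type='application/gzip')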

From
https://console.cloud.google.com/storage/browser/your-bucket
open a Cloud Shell (top right of the page), then:
gsutil cp -r gs://your-bucket .
zip your-bucket.zip ./your-bucket
gsutil cp your-bucket.zip gs://your-bucket
Strictly speaking, this does not happen in place within your storage bucket, but it all happens within the cloud.

Related

How to keep/re-create object metadata during gsutil cp on storage bucket

I would like to sync all of the files in my Google Cloud Storage bucket with the exported files in my Firebase Storage Emulator.
I downloaded all of my cloud files using gsutil to my local machine.
I used BeyondCompare to move all of the new files to the '../storage_export/blobs/' directory.
How do I update/create the JSON metadata in '../storage_export/metadata' to reflect these new files and make them available when I run the emulator and import them in?
Edit:
The gsutil docs mention the following:
when you download data from the cloud, it ends up in a file with no associated metadata, unless you have some way to keep or re-create that metadata.
How would one "keep" or "re-create" that metadata during a gsutil cp download?
You can use gsutil or the SDK to get each object's metadata and then write it out to a JSON file; however, there's currently no native way to import Google Cloud Storage data into the Storage Emulator. But as I stated in my answer to this post, you can study how the emulator registers objects by uploading sample files within the emulator and then running the export: you will see that the emulator requires one object and one JSON file that contains its metadata.
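For the first half of that (reading each object's metadata with the SDK and writing it to JSON), a minimal Python sketch might look like the following; the bucket name and output file are placeholders, and it does not reproduce the emulator's exact metadata layout:
import json
from google.cloud import storage

client = storage.Client()
bucket = client.bucket('your-project.appspot.com')   # placeholder bucket name

records = []
for blob in bucket.list_blobs():
    records.append({
        'name': blob.name,
        'contentType': blob.content_type,
        'size': blob.size,
        'md5Hash': blob.md5_hash,
        'customMetadata': blob.metadata,              # user-defined key/value metadata, if any
    })

with open('metadata_dump.json', 'w') as f:            # placeholder output path
    json.dump(records, f, indent=2)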
Lastly, as stated in this post, you can add the --export-on-exit option when starting the emulator: download all the data from the real Firebase project, upload everything through the emulator, then stop the emulator so it exports the data on exit.
Note: This is not a documented feature! Firebase doesn't expose the concept of download tokens in its public SDKs or APIs, so manipulating tokens this way feels a bit "hacky". For further reference, check this post.

How to delete all files starting with "foo" in Firebase Storage

I have a long list of files in Firebase Storage, which I have uploaded from a python script.
Many of those files have this kind of names:
foo_8346gr.msb
foo_8333ys.msb
foo_134as.mbb
...
I know there is no programmatic way to delete a folder in Storage (they are not even real folders), but how could I remove all files starting with "foo_" programmatically, from Python?
You can use the Cloud Storage List API to find all files with a certain prefix, then delete them. That page has code samples for a variety of languages, including Python. Here's how you list files with a prefix:
from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.get_bucket(bucket_name)
blobs = bucket.list_blobs(prefix=prefix, delimiter=delimiter)
print('Blobs:')
for blob in blobs:
    print(blob.name)
if delimiter:
    print('Prefixes:')
    for prefix in blobs.prefixes:
        print(prefix)
You will have to add the bit of code that deletes the file if you believe it should be deleted. The documentation goes into more detail about the List API.
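As a sketch of that missing delete step (the bucket name is a placeholder), iterating the listing and deleting each blob could look like this:
from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.get_bucket('your-bucket')   # placeholder bucket name

# Delete every object whose name starts with "foo_".
for blob in bucket.list_blobs(prefix='foo_'):
    print('Deleting', blob.name)
    blob.delete()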
Firebase provides a wrapper around Cloud Storage that allows you to directly access the files in storage from the client, and that secures access to those files. Firebase does not provide a Python SDK for accessing these files, but since it is built around Google Cloud Storage, you can use the GCP SDK for Python to do so.
There is no API to do a wildcard delete in there, but you can simply list all files with a specific prefix, and then delete them one by one. For an example of this, see the answer here: How to delete GCS folder from Python?

How to download multiple files from the Firebase Storage Console?

I have an images directory in Firebase Storage and I am trying to download all the files in that directory from the console. It gives me the option to select all files, and a download button appears, but when I click it only one image is downloaded.
Is there a way to download all the images via the Firebase Console?
You can use the gsutil tool to download all files from your Firebase Storage bucket. Use the cp command:
gsutil -m cp -r gs://{{bucket_url}} {{local_path_to_save_downloads}}
-m performs a multi-threaded/multi-process copy; use this when downloading a large number of files so the downloads run in parallel
-r copies an entire directory tree recursively
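If you'd rather script it than use the gsutil CLI, here is a rough Python sketch of the same bulk download using the google-cloud-storage client; the bucket name, prefix, and local directory are placeholders:
import os
from google.cloud import storage

client = storage.Client()
bucket = client.bucket('your-project.appspot.com')        # placeholder bucket name
local_root = 'downloads'                                   # placeholder local directory

for blob in bucket.list_blobs(prefix='images/'):           # placeholder prefix
    if blob.name.endswith('/'):                            # skip zero-byte "folder" placeholder objects
        continue
    target = os.path.join(local_root, blob.name)
    os.makedirs(os.path.dirname(target), exist_ok=True)    # recreate the folder structure locally
    blob.download_to_filename(target)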
Just had the same issue. Turns out it's a known issue.
Here's the response from Firebase support:
This is a known bug when trying to download directly on the Storage Console. Our engineers are working on getting it fixed, however I can't provide any timelines right now. We apologize for the inconvenience this may have caused you. For the time being, you may download the file by right-clicking the image previewed, then choose "Save image as...".

Uncompress file on Rackspace cloud files container for CDN

I created a Rackspace account earlier today to use its CDN to serve my OpenCart images.
I have created a container where I will upload over 500,000 images, but I would prefer to upload them as a single compressed file, which feels more flexible.
If I upload all the images in a compressed file, how do I extract it once it is in the container, and what compression formats would work?
The answer may depend on how you are attempting to upload your file/files. Since this was not specified, I will answer your question using the CLI from a *nix environment.
Answer to your question (using curl)
Using curl, you can upload a compressed file and have it extracted using the extract-archive feature.
$ tar cf archive.tar directory_to_be_archived
$ curl -i -XPUT -H'x-auth-token: AUTH_TOKEN' https://storage101.iad3.clouddrive.com/v1/MossoCloudFS_aaa-aaa-aaa-aaa?extract-archive=tar -T ./archive.tar
You can find the documentation for this feature here: http://docs.rackspace.com/files/api/v1/cf-devguide/content/Extract_Archive-d1e2338.html
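For reference, the same extract-archive call can be made from Python with the requests library; this is only a sketch of the HTTP request shown above, with the storage URL, token, and archive path as placeholders:
import requests

storage_url = 'https://storage101.iad3.clouddrive.com/v1/MossoCloudFS_aaa-aaa-aaa-aaa'  # placeholder
token = 'AUTH_TOKEN'                                                                    # placeholder

with open('archive.tar', 'rb') as f:
    resp = requests.put(
        storage_url,
        params={'extract-archive': 'tar'},   # ask Cloud Files to unpack the tarball server-side
        headers={'x-auth-token': token},
        data=f,                              # stream the tar file as the request body
    )
print(resp.status_code, resp.text)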
Recommended solution (using Swiftly)
Uploading and extracting that many objects using the above method might take a long time to complete. Additionally, if there is a network interruption during that time, you will have to start over from the beginning.
I would recommend instead using a tool like Swiftly, which will allow you to upload your files concurrently. This way, if there is a problem during the upload, you don't have to re-upload objects that have already been successfully uploaded.
An example of how to do this is as follows:
$ swiftly --auth-url="https://identity.api.rackspacecloud.com/v2.0" \
--auth-user="{username}" --auth-key="{api_key}" --region="DFW" \
--concurrency=10 put container_name -i images/
If there is a network interruption while uploading, or you have to stop/restart uploading your files, you can add the "--different" option after the 'put' in the above command. This will tell Swiftly to HEAD the object first and only upload if the time or size of the local file does not match its corresponding object, skipping objects that have already been uploaded.
Swiftly can be found on github here: https://github.com/gholt/swiftly
There are other clients that possibly do the same things, but I know Swiftly works, so I recommend it.

Enable S3 bucket contents to always be public

I've managed to use S3FS to mount an Amazon S3 folder into my WordPress site. Basically, my gallery folder for NextGEN Gallery is a symlink to a mounted S3FS folder of the bucket, so when I upload an image, the file is automatically added to the S3 bucket.
I'm busy writing an Apache rewrite rule to replace the links so gallery images are fetched from S3 instead, without having to hack or change anything in NextGEN, but one problem I'm finding is that images are not public by default on S3.
Is there a way to change a parent folder, to make its children always be public, including new files as they are generated?
Is it possible or advisable to use a cron task to manually make a folder public using the S3 command line API?
I'm the lead developer and maintainer of the open-source project RioFS: a userspace filesystem to mount Amazon S3 buckets.
Our project is an alternative to the s3fs project; its main advantages compared to s3fs are simplicity, the speed of operations, and bug-free code. Currently the project is in the beta state, but it has been running on several heavily loaded file servers for quite some time.
We are looking for more people to join our project and help with testing. From our side, we offer quick bug fixes and will listen to your requests for new features.
Regarding your issue:
if you use RioFS, you could mount a bucket and have read/write access to it using the following command (assuming you have installed RioFS and have exported the AWSACCESSKEYID and AWSSECRETACCESSKEY environment variables):
riofs -o allow_other http://s3.amazonaws.com bucket_name /mnt/static.example.com
(please refer to project description for command line arguments)
Please note that the project is still in development, and there could still be a number of bugs left.
If you find that something doesn't work as expected, please file an issue report on the project's GitHub page.
Hope it helps, and we look forward to seeing you join our community!
I downloaded s3curl and used that to add the bucket policy to S3.
See this link: http://blog.travelmarx.com/2012/09/working-with-s3curl-and-amazon-s3.html
You can generate your bucket policies using the Amazon Policy Generator:
http://awspolicygen.s3.amazonaws.com/policygen.html
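If you'd rather script the policy instead of using s3curl, here is a rough boto3 sketch that applies a standard public-read bucket policy; the bucket name is a placeholder, and you should review the policy (for example with the generator above) before applying it:
import json
import boto3

bucket_name = 'your-bucket'   # placeholder bucket name

# Standard public-read policy: anyone may GET any object under this bucket.
policy = {
    'Version': '2012-10-17',
    'Statement': [{
        'Sid': 'PublicReadGetObject',
        'Effect': 'Allow',
        'Principal': '*',
        'Action': 's3:GetObject',
        'Resource': 'arn:aws:s3:::{}/*'.format(bucket_name),
    }],
}

s3 = boto3.client('s3')
s3.put_bucket_policy(Bucket=bucket_name, Policy=json.dumps(policy))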
