Encrypting existing buckets and their Objects - encryption

I have tried this CLI command:
aws s3 cp s3://test2/ s3://test2 --recursive --sse AES256
It doesn't work: it gives me no error, but it doesn't encrypt the bucket objects either. Please help!
There are thousands of buckets and objects in each account. I need a solid way to encrypt them.
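In case it helps, a minimal sketch of one way to do this programmatically with boto3: list every object and copy it onto itself with server-side encryption requested. The bucket name is the one from the question; credentials, error handling, and multipart copies for objects over 5 GB are left out and would need adapting.
import boto3

# Sketch: re-copy every object onto itself so S3 stores it with SSE (AES256).
s3 = boto3.client("s3")
bucket = "test2"  # placeholder bucket name

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=bucket):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        s3.copy_object(
            Bucket=bucket,
            Key=key,
            CopySource={"Bucket": bucket, "Key": key},
            ServerSideEncryption="AES256",
            MetadataDirective="COPY",  # keep the existing metadata
        )
        print(f"Encrypted {key}")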

Related

What is data.json.gz firebase

I received an existing Firebase project from another developer. While setting up a bucket for making backups programmatically, I found a bucket named "backups" containing documents with the extension data.json.gz that are created every day at 3 am, but I'm not sure what they are. Does anyone know what they could be? The client is asking me whether there is a backup of the database, but as far as I know Firestore backups have files with the extension overall_export_metadata.
As mentioned by Alexander N, on Linux run gzip -d filename in your terminal; this will create the un-gzipped file and you'll be able to read the data. In this case, these files are backups of one of the Firestore collections and of the Firestore DB rules.
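If you'd rather peek inside such a file without unpacking it on disk first, a quick sketch in Python (the filename is a placeholder and the file is assumed to hold a single JSON document):
import gzip
import json

# Sketch: read a gzipped JSON backup and show its top-level keys.
with gzip.open("data.json.gz", "rt", encoding="utf-8") as f:
    data = json.load(f)

print(list(data)[:10])  # peek at the first few top-level keys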

Using Airflow's S3Hook, is there a way to copy objects between buckets with different connection ids?

I'm copying files from an external company's bucket; they've sent me an access key/secret that I've set up as an env variable. I want to be able to copy objects from their bucket. I've used the below, but that's for moving objects with the same connection. How do I use S3Hook to copy objects with a different conn id?
s3 = S3Hook(self.aws_conn_id)
s3_conn = s3.get_conn()

ext_s3 = S3Hook(self.ext_aws_conn_id)
ext_s3_conn = ext_s3.get_conn()

# this moves objects w. the same connection...
s3_conn.copy_object(
    Bucket="bucket",
    Key=f'dest_key',
    CopySource={
        'Bucket': self.partition.bucket,
        'Key': key,
    },
    ContentEncoding='csv')
From my point of view this is impossible. First of all, you can only declare one URL endpoint.
Secondly, the Airflow S3Hook works with boto3 in the background, and your two connections will most likely have different access_key and secret_key values used to create the boto3 resource/client. As explained in this post, if you wish to copy between different buckets, then you will need to use a single set of credentials that have:
GetObject permission on the source bucket
PutObject permission on the destination bucket
Again, in the S3Hook you can only declare a single set of credentials. You could maybe use the credentials given by your client and declare a bucket in your account with PutObject permission, but that implies you are allowed to do this in your enterprise (not very wise in terms of security), and even then your S3Hook will still reference only a single endpoint.
To sum up: I have been dealing with the same problem and ended up creating two S3 hooks, using the first one to download from the original bucket and the second to upload to my enterprise bucket.
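For what it's worth, a minimal sketch of that two-hook approach (connection ids, bucket names and keys are placeholders), using the S3Hook.download_file and S3Hook.load_file helpers:
from airflow.providers.amazon.aws.hooks.s3 import S3Hook

# Sketch: download with the external connection, upload with our own.
ext_s3 = S3Hook(aws_conn_id="ext_aws_conn")   # external company's credentials
own_s3 = S3Hook(aws_conn_id="aws_default")    # our enterprise credentials

local_path = ext_s3.download_file(
    key="incoming/file.csv",
    bucket_name="their-bucket",
)
own_s3.load_file(
    filename=local_path,
    key="landing/file.csv",
    bucket_name="our-bucket",
    replace=True,
)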

Check if azure databricks mount point exists from .NET

I work on an app which does some kind of data engineering, and we use Azure ADLS for data storage and Databricks for data manipulation. There are two approaches to retrieving the data: the first one uses the storage account and the storage account secret key, and the other uses a mount point. With the first approach I can successfully check, from .NET, whether the storage account and its corresponding secret key match, and return a message saying whether the credentials are right or not. However, I need to do the same thing with the mount point, i.e. determine whether the mount point exists in dbutils.fs.mounts() or anywhere in the storage (I don't know exactly how a mount point works and whether it stores data in blob).
The flow for Storage account and Secret key is the following:
Try to connect using the BlobServiceClient API from Microsoft;
If it fails, return a message to the user that the credentials are invalid;
If it doesn't fail, proceed further.
I'm not that familiar with /mnt/ and the like because I mostly do .NET, but is there a way to check from .NET whether a mount point exists or not?
A mount point is just a kind of reference to the underlying cloud storage. The dbutils.fs.mounts() command needs to be executed on some cluster; it's doable, but it's slow and cumbersome.
The simplest way to check that is to use the List command of the DBFS REST API, passing the mount point name /mnt/<something> as the path parameter. If it doesn't exist, you'll get the error message RESOURCE_DOES_NOT_EXIST:
{
  "error_code": "RESOURCE_DOES_NOT_EXIST",
  "message": "No file or directory exists on path /mnt/test22/."
}
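As a rough sketch of that REST call (shown in Python for brevity; the same request can be issued from .NET with HttpClient), where the workspace URL, token and mount name are all placeholders:
import requests

host = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder workspace URL
token = "dapiXXXXXXXXXXXX"                                    # placeholder personal access token

resp = requests.get(
    f"{host}/api/2.0/dbfs/list",
    headers={"Authorization": f"Bearer {token}"},
    params={"path": "/mnt/test22"},
)

if resp.status_code == 200:
    print("Mount point exists")
elif resp.json().get("error_code") == "RESOURCE_DOES_NOT_EXIST":
    print("Mount point does not exist")
else:
    resp.raise_for_status()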

How do I specify encryption type when using s3remote for DVC

I have just started to explore DVC. I am using S3 as my DVC remote. When I run the dvc push command, I get the generic error saying
An error occurred (AccessDenied) when calling the PutObject operation: Access Denied
and I know for a fact that I get that error when I don't specify the encryption. It is similar to running aws s3 cp with the --sse flag or specifying ServerSideEncryption when using the boto3 library. How can I specify the encryption type when using DVC? Since DVC uses boto3 underneath, there must be an easy way to do this.
Got the answer for this immediately in the DVC discord channel!! By default, no encryption is used. We should specify what server-side encryption algorithm should be used.
Running dvc remote modify worked for me!
dvc remote modify my-s3-remote sse AES256
There are a bunch of things that we can configure here. All this command does is add an entry sse = AES256 under the ['remote "my-s3-remote"'] section inside the .dvc/config file.
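For reference, the resulting section of .dvc/config should look roughly like this (the remote name is the one from above, the url is a placeholder):
['remote "my-s3-remote"']
    url = s3://my-bucket/dvc-storage
    sse = AES256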
More on this here
https://dvc.org/doc/command-reference/remote/modify

AWS Auto Scaling Launch Configuration Encrypted EBS Cloud Formation Example

I am creating a CloudFormation script which will have an ELB. In the Auto Scaling launch configuration, I want to add an encrypted EBS volume, but I couldn't find an encrypted property within BlockDeviceMapping. I need to encrypt the volume. How can I attach an encrypted EBS volume to an EC2 instance through an Auto Scaling launch configuration?
For some strange reason there is no such property when using launch configurations; however, it is there when using BlockDeviceMappings with simple EC2 instances. See
launchconfig-blockdev vs ec2-blockdev
So you'll either have to use simple instances instead of autoscaling groups, or you can try this workaround:
SnapshotIds are accepted for launchconf blockdev too, and as stated here "Snapshots that are taken from encrypted volumes are automatically encrypted. Volumes that are created from encrypted snapshots are also automatically encrypted."
Create a snapshot from an encrypted empty EBS volume and use it in the CloudFormation template. If your template should work in multiple regions then of course you'll have to create the snapshot in every region and use a Mapping in the template.
As Marton says, there is no such property (unfortunately it often takes a while for CloudFormation to catch up with the main APIs).
Normally each encrypted volume you create will have a different key. However, when using the workaround mentioned (of using an encrypted snapshot) the resulting encrypted volumes will inherit the encryption key from the snapshot and all be the same.
From a cryptography point of view this is a bad idea as you potentially have multiple, different volumes and snapshots with the same key. If an attacker has access to all of these then he can potentially use differences to infer information about the key more easily.
An alternative is to write a script that creates and attaches a new encrypted volume at boot time of an instance. This is fairly easy to do. You'll need to give the instance permissions to create and attach volumes, and either install the AWS CLI tool or a library for your preferred scripting language. Once you have that, you can, from the instance that is booting, create a volume and attach it.
You can find a starting point for such a script here: https://github.com/guardian/machine-images/blob/master/packer/resources/features/ebs/add-encrypted.sh
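In case a full shell script is overkill, a rough sketch of the create-and-attach step with boto3 (the availability zone, size, instance id and device name are all placeholders):
import boto3

ec2 = boto3.client("ec2")

# Sketch: create an encrypted volume and attach it to the booting instance.
volume = ec2.create_volume(
    AvailabilityZone="eu-west-1a",    # placeholder: the instance's AZ
    Size=100,                         # GiB, placeholder
    VolumeType="gp3",
    Encrypted=True,                   # uses the account's default EBS key unless KmsKeyId is given
)
ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])

ec2.attach_volume(
    VolumeId=volume["VolumeId"],
    InstanceId="i-0123456789abcdef0",  # placeholder: this instance's id
    Device="/dev/xvdf",                # placeholder device name
)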
There is an AutoScaling EBS Block Device type which provides the "Encrypted" option:
http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-as-launchconfig-blockdev-template.html
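In a JSON template that would look roughly like this (resource name, AMI, instance type and sizes are placeholders):
"MyLaunchConfig": {
  "Type": "AWS::AutoScaling::LaunchConfiguration",
  "Properties": {
    "ImageId": "ami-00000000",
    "InstanceType": "t3.micro",
    "BlockDeviceMappings": [
      {
        "DeviceName": "/dev/xvdf",
        "Ebs": {
          "VolumeSize": 100,
          "Encrypted": true
        }
      }
    ]
  }
}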
Hope this helps!
AWS recently announced Default Encryption for New EBS Volumes. You can enable this per region via
EC2 Console > Settings > Always encrypt new EBS volumes
https://aws.amazon.com/blogs/aws/new-opt-in-to-default-encryption-for-new-ebs-volumes/
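The same setting can also be flipped programmatically; a minimal sketch with boto3, to be repeated per region (the region shown is a placeholder):
import boto3

# Sketch: opt in to default encryption for new EBS volumes in one region.
ec2 = boto3.client("ec2", region_name="eu-west-1")
ec2.enable_ebs_encryption_by_default()
print(ec2.get_ebs_encryption_by_default())  # expect {'EbsEncryptionByDefault': True, ...}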
