How can I add EFS to an Airflow deployment on Amazon EKS?

Kubernetes and EKS newbie here.
I've set up an Elastic Kubernetes Service (EKS) cluster and added an Airflow deployment on top of it using the official Helm chart for Apache Airflow. I configured git-sync and can successfully run my DAGs. For some of the DAGs, I need to save data to an Amazon EFS file system. I installed the Amazon EFS CSI driver on EKS following the instructions in the Amazon documentation.
Now I can create a new pod with access to the NFS, but the Airflow deployment broke and stays in a state of Back-off restarting failed container. I also got the events with kubectl -n airflow get events --sort-by='{.lastTimestamp}' and I get the following messages:
TYPE     REASON              OBJECT                                            MESSAGE
Warning  BackOff             pod/airflow-scheduler-599fc856dc-c4pgz            Back-off restarting failed container
Normal   FailedBinding       persistentvolumeclaim/redis-db-airflow-redis-0    no persistent volumes available for this claim and no storage class is set
Warning  ProvisioningFailed  persistentvolumeclaim/ebs-claim                   storageclass.storage.k8s.io "ebs-sc" not found
Normal   FailedBinding       persistentvolumeclaim/data-airflow-postgresql-0   no persistent volumes available for this claim and no storage class is set
I have tried this on EKS version 1.22.
I understand from this that Airflow is expecting to get an EBS volume for its pods, but the NFS driver changed the configuration of the PVs.
The PVs before I installed the driver were these:
NAME        CAPACITY  ACCESS MODES  RECLAIM POLICY  STATUS  CLAIM                              STORAGECLASS  REASON  AGE
pvc-######  100Gi     RWO           Delete          Bound   airflow/logs-airflow-worker-0      gp2                   1d
pvc-######  8Gi       RWO           Delete          Bound   airflow/data-airflow-postgresql-0  gp2                   1d
pvc-######  1Gi       RWO           Delete          Bound   airflow/redis-db-airflow-redis-0   gp2                   1d
After installing the EFS CSI driver, I see the PVs have changed.
NAME     CAPACITY  ACCESS MODES  RECLAIM POLICY  STATUS  CLAIM              STORAGECLASS  REASON  AGE
efs-pvc  5Gi       RWX           Retain          Bound   efs-storage-claim  efs-sc                2d
I have tried deploying Airflow both before and after installing the EFS driver, and in both cases I get the same error.
How can I get access to the NFS from within Airflow without breaking the Airflow deployment on EKS? Any help would be appreciated.

As stated in the errors above (no persistent volumes available for this claim and no storage class is set, and storageclass.storage.k8s.io "ebs-sc" not found), you have to deploy a storage class called efs-sc using the EFS CSI driver as a provisioner.
Further documentation can be found here.
An example of creating the missing storage class and persistent volume can be found here.
These steps are also described in the AWS EKS user guide.
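As a rough sketch of what those resources can look like with static provisioning (the file system ID fs-12345678, the resource names, the namespace, and the 5Gi size are placeholders for your own values, so treat this as an illustration rather than a drop-in manifest):
kubectl apply -f - <<'EOF'
# Storage class backed by the EFS CSI driver
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
---
# Statically provisioned persistent volume pointing at the EFS file system
apiVersion: v1
kind: PersistentVolume
metadata:
  name: airflow-efs-pv
spec:
  capacity:
    storage: 5Gi                  # EFS does not enforce a size, but the field is required
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: efs-sc
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-12345678     # placeholder: your EFS file system ID
---
# Claim that the DAG pods can mount
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: airflow-efs-claim
  namespace: airflow
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: efs-sc
  resources:
    requests:
      storage: 5Gi
EOF
Note that this only covers the EFS volume; the chart's own claims (Postgres, Redis, logs) still bind to an EBS-backed class such as gp2, so that class needs to remain available as well.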

Related

How to save files from R Studio-server on GCE VM to a Google Cloud Bucket? Possible settings issue

I have a VM on GCE with RStudio Server installed. I have mounted a bucket from Google Cloud in my VM such that I am able to access and load my data in R. My goal is to be able to save output back to this bucket. How can I do so? When I tried to save data using the R function saveRDS(), I got the following error message:
Error in gzfile(file, mode) : cannot open the connection
In addition: Warning message:
In gzfile(file, mode) :
cannot open compressed file 'remote/processed_input_reproduction/file.rds', probable reason 'Permission denied'
Investigating the problem, I got to this link. One of the suggestions in it is: "It is possible that the GCE instance is not running with the scope "storage-full" configured. For example, if you created the GCE instance with the default Cloud API access scopes, the instance's Storage access scope is set to read only. In that case, you can change the instance's access scope for Storage to "Full"." This had also been suggested to me in another question I asked recently. However, when I changed the Storage scope to full, I was unable to connect to my VM instance again. Changing the Storage scope back to read only solved the issue. How can I avoid this problem when setting the Storage scope to full? And how can I write files from RStudio Server back to my mounted bucket?
About the issue: when the scope was set to full, I would get the following error message (see the bottom for it) by running gcloud compute ssh my-project --project=roberto --zone=us-west1-b --troubleshoot:
Starting ssh troubleshooting for instance https://compute.googleapis.com/compute/v1/projects/roberto/zones/us-west1-b/instances/my-project in zone us-west1-b
Start time: 2023-01-30 19:56:09.194370
---- Checking network connectivity ----
The Network Management API is needed to check the VM's network connectivity.
Is it OK to enable it and check the VM's network connectivity? (Y/n)? Y
Enabling service [networkmanagement.googleapis.com] on project [emlab-gcp]...
Operation "operations/acat.p2-458956698118-ed65bbb9-e27f-45f4-a929-6020411326dc" finished successfully.
Your source IP address is 72.xxx.xx.xx
Network Connectivity Test Result: REACHABLE
To view complete details of this test, see https://console.cloud.google.com/net-intelligence/connectivity/tests/details/ssh-troubleshoot-7meui?project=roberto
Help for connectivity tests:
https://cloud.google.com/network-intelligence-center/docs/connectivity-tests/concepts/overview
---- Checking user permissions ----
User permissions: 0 issue(s) found.
---- Checking VPC settings ----
VPC settings: 0 issue(s) found.
---- Checking VM status ----
The Monitoring API is needed to check the VM's Status.
Is it OK to enable it and check the VM's Status? (Y/n)? Y
Enabling service [monitoring.googleapis.com] on project [roberto]...
VM status: 0 issue(s) found.
---- Checking VM boot status ----
VM boot: 1 issue(s) found.
The VM may not be running. The serial console logs show the VM has been unable to complete the boot process. Check your serial console logs to see if the VM has been dropped into an "emergency shell" or has reached "Emergency Mode". If that is the case, try restarting the VM to see if the problem is reproducible.
Hi RobertoAS, instead of changing the scopes there is another way: you can use gcsfuse to mount the storage bucket on your Compute Engine instance. When mounting it, mount it under the user's directory or give the necessary permissions to the service that runs the RStudio server. For mounting your Cloud Storage bucket with gcsfuse, go through this document. This worked for me and I hope it will solve your issue as well.
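A minimal sketch of that approach, assuming the bucket is called my-data-bucket and RStudio Server runs as the user rstudio (both are placeholders):
# Install gcsfuse on a Debian/Ubuntu VM (see the gcsfuse docs for other distributions)
export GCSFUSE_REPO=gcsfuse-$(lsb_release -c -s)
echo "deb https://packages.cloud.google.com/apt $GCSFUSE_REPO main" | sudo tee /etc/apt/sources.list.d/gcsfuse.list
curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
sudo apt-get update && sudo apt-get install -y gcsfuse

# Mount the bucket inside the RStudio user's home, as that user, so R can write to it
sudo -u rstudio mkdir -p /home/rstudio/remote
sudo -u rstudio gcsfuse my-data-bucket /home/rstudio/remote
After that, anything R writes under /home/rstudio/remote (for example with saveRDS()) ends up in the bucket, provided the VM's service account has write access to it.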

Kubernetes: using OpenStack Cinder from one cloud provider while nodes on another

Maybe my question does not make sense, but this is what I'm trying to do:
I have a Kubernetes cluster running on CoreOS on bare metal.
I am trying to mount block storage from an OpenStack cloud provider with Cinder.
From my readings, to be able to connect to the block storage provider, I need kubelet to be configured with cloud-provider=openstack and to use a cloud.conf file for the credentials.
I did that (roughly as in the sketch below) and the auth part seems to work fine (i.e. I successfully connect to the cloud provider); however, kubelet then complains that it cannot find my node on the OpenStack provider.
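A rough sketch of that configuration, with placeholder credentials and endpoints (this reflects the in-tree OpenStack cloud provider, not any vendor-specific setup):
# /etc/kubernetes/cloud.conf -- credentials for the OpenStack cloud provider (placeholder values)
cat > /etc/kubernetes/cloud.conf <<'EOF'
[Global]
auth-url = https://keystone.example.com:5000/v2.0
username = demo
password = secret
tenant-name = demo
region = RegionOne
EOF

# kubelet started with the cloud provider flags
kubelet --cloud-provider=openstack --cloud-config=/etc/kubernetes/cloud.conf ...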
I get:
Unable to construct api.Node object for kubelet: failed to get external ID from cloud provider: Failed to find object
This is similar to this question:
Unable to construct api.Node object for kubelet: failed to get external ID from cloud provider: Failed to find object
However, I know kubelet will not find my node at the OpenStack provider since it is not hosted there! The error makes sense, but how do I avoid it?
In short, how do I tell kubelet not to look for my node there, as I only need it to look up the storage block to mount it?
Is it even possible to mount block storage this way? Am I misunderstanding how this works?
There seem to be new ways to attach Cinder storage to bare metal, but it's apparently just a PoC:
http://blog.e0ne.info/post/Attach-Cinder-Volume-to-the-Ironic-Instance-without-Nova.aspx
Unfortunately, I don't think you can decouple the cloud provider for the node from the one for the volume, at least not in vanilla Kubernetes.

Secondary storage not recognized in apache cloud stack

I am trying to set up a CloudStack (v4.4 on CentOS 6.5) management instance to talk to one physical host with XenServer (6.2) on it.
I have got as far as setting up the zone/pod/cluster/host, and it can see the XenServer machine. Primary storage is also visible to it - I can see it in the dashboard. However, it can't see the secondary storage and thus I can't download templates/ISOs. The dashboard says 0 KB of 0 KB in use for secondary storage.
I have tried having the secondary storage local to the CloudStack management instance (while setting the use.local global setting to true). I have also tried setting up a new host and making that the NFS share, and it did not work.
I have checked in both cases that the shares I created are mountable - and they are. I have also seeded them with the system VM template by running the command outlined in the installation guide. Both places I set up as secondary storage had ample space available - one greater than 200 GB, the other around 70 GB. I have also restarted the management machine a few times.
Any help would be much appreciated!
You need secondary storage enabled in order to supply templates to your hosts. The simplest way to achieve that is to create an NFS export that is available to the host; I usually do it on the host itself, which in your case would be the XenServer. Then, in the management server, add the secondary storage under Infrastructure -> Secondary Storage -> Add Secondary Storage.
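As a rough sketch (the export path and the open client range are placeholders; restrict the range to your management/pod network in practice), the NFS export on the XenServer host could look like this:
# Create and export a directory for secondary storage
mkdir -p /export/secondary
echo "/export/secondary *(rw,async,no_root_squash,no_subtree_check)" >> /etc/exports
exportfs -a
service nfs start
You would then point Add Secondary Storage at the host's IP and the /export/secondary path.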
Secondary storage is provided by a dedicated system VM. Once you add secondary storage, CloudStack will create a system VM for it. Start by checking the status of the system VMs in: Infrastructure -> System VMs
The one you are looking for should be called Secondary Storage VM.
It should be running and the agent should be ready (two green circles). If the agent is not ready, first SSH to your XenServer host and then to the system VM using its link-local IP (you can see the IP in the details of the VM) with the following command:
ssh -i /root/.ssh/id_rsa.cloud -p 3922 LINK_LOCAL_IP_ADDRESS
Then in the system VM, run a diagnostic tool to check what could be wrong:
/usr/local/cloud/systemvm/ssvm-check.sh

Hosting wordpress blog on AWS

I have hosted a WordPress blog on AWS using a t1.micro EC2 instance (Ubuntu).
I am not an expert on Linux administration. However, after going through a few tutorials, I managed to get WordPress running successfully.
I noticed a warning on AWS console that "In case if your EC2 instance terminates, you will lose your data including wordpress files and data stored by MySql service."
Does that mean I should use S3 service for storing data to avoid any accidental data loss? Or my data will remain safe in an EBS volume even if my EC2 instance terminates?
By default, the root volume of an EC2 instance is deleted when the instance is terminated. The instance can only be terminated automatically if it's running as a Spot Instance; otherwise it is only terminated if you do it yourself.
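If you want to check (or change) that behaviour for your instance, the DeleteOnTermination flag on the root volume can be inspected and modified; the instance ID and device name below are placeholders:
# Show the block device mappings, including DeleteOnTermination, for the instance
aws ec2 describe-instances --instance-ids i-0123456789abcdef0 \
  --query 'Reservations[].Instances[].BlockDeviceMappings'

# Keep the root volume around even after the instance is terminated
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 \
  --block-device-mappings '[{"DeviceName":"/dev/sda1","Ebs":{"DeleteOnTermination":false}}]'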
Now, with that in mind, EBS volumes are not failure-proof; they have a small chance of failing. To recover from this, you should either create regular snapshots of your EBS volume or back up the contents of your instance to S3 or another storage service.
You can set up a snapshot lifecycle policy to create scheduled volume snapshots.
https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/snapshot-lifecycle.html
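As a sketch of such a policy with the AWS CLI (the role ARN, tag, and schedule are placeholders, and the default Data Lifecycle Manager role must already exist in your account):
# Snapshot every tagged volume daily at 03:00 UTC and keep the last 7 snapshots
aws dlm create-lifecycle-policy \
  --description "Daily snapshots of the WordPress data volume" \
  --state ENABLED \
  --execution-role-arn arn:aws:iam::123456789012:role/AWSDataLifecycleManagerDefaultRole \
  --policy-details '{
    "ResourceTypes": ["VOLUME"],
    "TargetTags": [{"Key": "Backup", "Value": "wordpress"}],
    "Schedules": [{
      "Name": "DailySnapshots",
      "CreateRule": {"Interval": 24, "IntervalUnit": "HOURS", "Times": ["03:00"]},
      "RetainRule": {"Count": 7},
      "CopyTags": true
    }]
  }'
The volumes to snapshot are selected by the Backup=wordpress tag, so the EBS volume holding the WordPress files and database needs that tag.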

How to mount a EBS in Cloudify after the creation of a VM

I want to share some data with my VMs thanks to a mounted EBS.
How can I say to Cloudify that every created VM should have additional mounted EBS?
(I'm talking about EBS in the case of Amazon EC2, but I want to do the same with OpenStack, and other IaaS)
For EC2, you would need to set the template options in the template section of the cloud configuration file as follows:
options ([
    "securityGroups" : ["default"] as String[],
    "keyPair" : "XXXXX",
    // MapEBSSnapshotToDevice(deviceName, snapshotId, sizeInGib, deleteOnTermination)
    "blockDeviceMappings" : [
        new org.jclouds.ec2.domain.BlockDeviceMapping.MapEBSSnapshotToDevice("/dev/sda1/", "aa", 20, true)
    ]
])
Cloudify uses the jclouds multi-cloud library to handle API calls to amazon services. For more details on using EBS with EC2, see:
http://demobox.github.com/jclouds-maven-site-1.4.0/1.4.0/jclouds-multi/apidocs/org/jclouds/ec2/domain/BlockDeviceMapping.MapEBSSnapshotToDevice.html
http://demobox.github.com/jclouds-maven-site-1.4.0/1.4.0/jclouds-multi/apidocs/org/jclouds/ec2/domain/BlockDeviceMapping.MapNewVolumeToDevice.html
Please note that these settings are specific to EC2 and are not portable across clouds.
With regard to OpenStack, the Cloudify OpenStack cloud driver does not currently support using volumes, the OpenStack equivalent of EBS. This is accurate for versions 2.1.1 and 2.2 of Cloudify, though this feature is expected to become available in the near future.
