Kubernetes: using OpenStack Cinder from one cloud provider while nodes on another

Maybe my question does not make sense, but this is what I'm trying to do:
I have a running Kubernetes cluster running on CoreOS on bare metal.
I am trying to mount block storage from an OpenStack cloud provider with Cinder.
From my readings, to be able to connect to the block storage provider, I need kubelet to be configured with cloud-provider=openstack, and use a cloud.conf file for the configuration of credentials.
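For reference, a minimal cloud.conf for the OpenStack cloud provider looks roughly like this (endpoint and credential values below are placeholders, not from my setup), with kubelet then started with --cloud-provider=openstack --cloud-config=/etc/kubernetes/cloud.conf:

```ini
[Global]
auth-url = https://keystone.example.com:5000/v2.0
username = admin
password = secret
tenant-name = demo
region = RegionOne
```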
I did that and the auth part seems to work fine (i.e. I successfully connect to the cloud provider), however kubelet then complains that it cannot find my node on the openstack provider.
I get:
Unable to construct api.Node object for kubelet: failed to get external ID from cloud provider: Failed to find object
This is similar to this question:
Unable to construct api.Node object for kubelet: failed to get external ID from cloud provider: Failed to find object
However, I know kubelet will not find my node at the OpenStack provider since it is not hosted there! The error makes sense, but how do I avoid it?
In short, how do I tell kubelet not to look for my node there, as I only need it to look up the storage block to mount it?
Is it even possible to mount block storage this way? Am I misunderstanding how this works?

There seem to be new ways to attach Cinder storage to bare metal, but it's apparently just a PoC:
http://blog.e0ne.info/post/Attach-Cinder-Volume-to-the-Ironic-Instance-without-Nova.aspx

Unfortunately, I don't think you can decouple the cloud provider for the node from the one for the volume, at least not in vanilla Kubernetes.
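For context, even a manually pre-provisioned Cinder volume is declared with a cinder volume source in the PV spec, and attaching it still goes through the same configured cloud provider (the volume ID below is hypothetical):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: cinder-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  cinder:
    volumeID: 573e024d-5235-49ce-8332-be1576d323f8
    fsType: ext4
```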

Related

How can I add EFS to an Airflow deployment on Amazon-EKS?

Kubernetes and EKS newbie here.
I've set up an Elastic Kubernetes Service (EKS) cluster and added an Airflow deployment on top of it using the official Helm chart for Apache Airflow. I configured git-sync and can successfully run my DAGs. For some of the DAGs, I need to save data to an Amazon EFS. I installed the Amazon EFS CSI driver on EKS following the instructions in the Amazon documentation.
Now I can create a new pod with access to the NFS, but the Airflow deployment broke and stays in a Back-off restarting failed container state. I also got the events with kubectl -n airflow get events --sort-by='{.lastTimestamp}' and I get the following messages:
TYPE REASON OBJECT MESSAGE
Warning BackOff pod/airflow-scheduler-599fc856dc-c4pgz Back-off restarting failed container
Normal FailedBinding persistentvolumeclaim/redis-db-airflow-redis-0 no persistent volumes available for this claim and no storage class is set
Warning ProvisioningFailed persistentvolumeclaim/ebs-claim storageclass.storage.k8s.io "ebs-sc" not found
Normal FailedBinding persistentvolumeclaim/data-airflow-postgresql-0 no persistent volumes available for this claim and no storage class is set
I have tried this on EKS version 1.22.
I understand from this that Airflow is expecting to get an EBS volume for its pods, but the NFS driver changed the configuration of the PVs.
The PVs before I installed the driver were:
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-###### 100Gi RWO Delete Bound airflow/logs-airflow-worker-0 gp2 1d
pvc-###### 8Gi RWO Delete Bound airflow/data-airflow-postgresql-0 gp2 1d
pvc-###### 1Gi RWO Delete Bound airflow/redis-db-airflow-redis-0 gp2 1d
After installing the EFS CSI driver, I see the PVs have changed:
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
efs-pvc 5Gi RWX Retain Bound efs-storage-claim efs-sc 2d
I have tried deploying Airflow before and after installing the EFS driver, and in both cases I get the same error.
How can I get access to the NFS from within Airflow without breaking the Airflow deployment on EKS? Any help would be appreciated.
As stated in the errors above, no persistent volumes available for this claim and no storage class is set and storageclass.storage.k8s.io "ebs-sc" not found, you have to deploy a storage class called efs-sc using the EFS CSI driver as a provisioner.
Further documentation can be found here
An example of creating the missing storage class and persistent volume can be found here
These steps are also described in the AWS EKS user guide
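A minimal sketch of the missing storage class, assuming the EFS CSI driver is installed and fs-12345678 stands in for your actual file system ID:

```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: efs-sc
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap
  fileSystemId: fs-12345678
  directoryPerms: "700"
```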

Can't connect to Google Cloud SQL from Cloud Run - using R Shiny

I have created an R Shiny app that connects to a Cloud SQL instance. It runs fine on my local server, but when I upload to either shinyapps.io or to Cloud Run via Dockerfile, it is unable to connect.
Here is the code I am using to connect using RPostgres package:
conn <- dbConnect(
  drv = RPostgres::Postgres(),
  dbname = 'postgres',
  sslrootcert = 'path/to/server-ca.pem',
  sslcert = 'path/to/client-cert.pem',
  sslkey = 'path/to/client-key.pem',
  host = 'xxxxxxxxxxxxxxxxxxx',
  port = 5432,
  user = 'username',
  password = 'password_string',
  sslmode = 'verify-ca'
)
I've checked the logs in Cloud Run, the error message I am seeing is the following:
Warning: Error in : unable to find an inherited method for function 'dbGetQuery' for signature '"character", "character"'
The dbGetQuery() function is called after the dbConnect function, and since it runs fine on my local server, I am fairly confident that what I am seeing is a connection issue rather than a package namespace issue. But I could be wrong.
I have opened up to all IPs by adding 0.0.0.0/0 as an allowed network. The weird thing is that OCCASIONALLY I CAN CONNECT from shinyapps.io, but most of the time it fails. I have not yet got it to work once from Cloud Run. This is leading me to think that it could be a problem with a dynamic IP address or something similar?
Do I need to go through the Cloud Auth proxy to connect directly between Cloud Run and Cloud SQL? Or can I just connect via the dbConnect method above? I figured that 0.0.0.0/0 would also include Cloud Run IPs but I probably don't understand how it works well enough. Any explanations would be greatly appreciated.
Thanks very much!
I have opened up to all IPs by adding 0.0.0.0/0 as an allowed network.
From a security standpoint, this is a terrible, horrible, no good idea. It essentially means the entire world can attempt to connect to your database.
As @john-hanley stated in the comment, the Connecting Cloud Run to Cloud SQL documentation details how to connect. There are two options:
via Public IP (the internet) using the Unix domain socket on /cloudsql/CLOUD_SQL_CONNECTION_NAME
via Private IP, which connects through a VPC using the Serverless VPC Access
If a Unix domain socket is not supported by your library, you'll have to use a different library or choose option 2 and connect over TCP. Note that the Serverless VPC Access connector has additional costs associated with it.
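With RPostgres (which uses libpq), connecting over the Unix domain socket amounts to pointing host at the socket directory. A sketch, assuming your instance connection name is project:region:instance and the credential values are placeholders:

```r
conn <- dbConnect(
  drv = RPostgres::Postgres(),
  dbname = 'postgres',
  # libpq treats a host starting with "/" as a socket directory;
  # Cloud Run mounts the socket under /cloudsql/<CONNECTION_NAME>
  host = '/cloudsql/project:region:instance',
  user = 'username',
  password = 'password_string'
)
```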

How to configure Octavia in OpenStack Kolla?

I'm trying to deploy Octavia with Kolla OpenStack; my globals.yml is:
config_strategy: "COPY_ALWAYS"
kolla_base_distro: "ubuntu"
kolla_install_type: "source"
kolla_internal_vip_address: "169.254.1.11"
network_interface: "eth0"
neutron_external_interface: "eth1"
neutron_plugin_agent: "openvswitch"
enable_neutron_provider_networks: "yes"
enable_haproxy: "yes"
enable_cinder: "yes"
enable_cinder_backend_lvm: "yes"
keystone_token_provider: 'fernet'
cinder_volume_group: "openstack_cinder"
nova_compute_virt_type: "kvm"
enable_octavia: "yes"
octavia_network_interface: "eth2"
I use the default/automatic configuration; the keypair, network, and flavor are created in the service project. Then I create the amphora image for this project.
All this is indicated in the OpenStack guide, but it doesn't work.
When I create a load balancer, the amphora is deployed but the load balancer stays in "Pending Create" status. I saw that the created network is a VXLAN tenant network, and I think it should have external connectivity; I tried that but it didn't work.
I checked the Open vSwitch configuration and don't see any difference deploying with or without Octavia.
Am I missing something? I don't know what to do at this point; I even tried the manual config but couldn't make it work.
I can't speak to the kolla part of this issue, but with the load balancer in PENDING_CREATE, the controller (worker) logs should show where it is retrying to take some action against your cloud and failing. It will retry for some time, then move to an ERROR status if the cloud issue is not resolved in time.
Without seeing the logs, my guess is kolla did not set up the lb-mgmt-net correctly.
I don't know how to get it working with an external network - but for the tenant network it appears the solution is:
Setting octavia_network_interface will make kolla create that interface, so any name will do. Other setups (e.g. the devstack plugin) name this o-hm0, so that is what I did.
Set octavia_network_type to tenant in globals.yml. Note this requires the host to have dhclient available; kolla didn't seem to install this for me.
I tested this on stable/zed and it appears to work for me.
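Putting the two settings above together, the globals.yml additions would be something like (the interface name just follows the devstack convention):

```yaml
octavia_network_interface: "o-hm0"
octavia_network_type: "tenant"
```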

Google Cloud function times out when connecting to Redis on Compute Engine internal IP

I created a Redis instance using https://console.cloud.google.com/launcher/details/bitnami-launchpad/redis-ha (a screenshot of its network interface was attached to the original question).
I'm trying to connect to this Redis instance from a Firebase trigger.
The question is: what firewall rule do I need to connect from a cloud function to a compute instance?
Please provide as many details as possible, e.g. IP ranges, ingress/egress, etc, and whether I have to connect the Redis client to the instance on the internal IP, or the external IP.
This is the code:
const redis = require('redis');

let redisInstance = redis.createClient({
  /* surely the external IP needn't be used
     here as it's all GCP infra? */
  host: '10.1.2.3',
  port: 6379
});

redisInstance.on('connect', () => {
  console.log('connected');
});

redisInstance.on('error', (err) => {
  console.log(`Connection error ${err}`);
});
The error in the log is
Connection error Error: Redis connection to 10.1.2.3:6379 failed - connect ETIMEDOUT 10.1.2.3:6379
I've looked at Google Cloud Function cannot connect to Redis but it's not specific enough about the options when setting up a rule.
What I've tried
I tried to set up a firewall rule with these settings:
ingress
network: default
source filter: my firebase service account
protocols/ports: all
targets: all
Just a note about the service account:
created by Firebase
has the Editor role in IAM
is known to work with BigQuery and other Firebase services from my Firebase triggers
This same firewall rule has been in effect for a few hours now, and I've also redeployed the trigger which tests Redis, but I'm still getting ETIMEDOUT.
UPDATES
2018-06-25 morning
I phoned GCP Gold support and the problem isn't obvious to the operator, so they'll open a case, investigate, and leave some notes.
2018-06-25 afternoon
Using a permissive firewall rule (source 0.0.0.0/0, destination "all targets") and connecting to the Redis instance's external IP address works (of course!). However, as I mentioned many times on the phone call, I don't want the Redis instance to be open to the Internet; I asked whether there's some sort of solution involving a networking bridge/VPN so I can connect to the 10.x.x.x address from the Cloud Function.
The operator said they'll get back to me in 2 days.
2018-06-25 bit later in the afternoon
I've self-answered that it doesn't seem to be possible to connect to a Compute Engine internal IP from a cloud function.
It looks like it's currently NOT possible to connect to a Google Compute Engine internal IP from Google Cloud Functions, so all my (and my helpful Gold support operator's) efforts have been in vain.
Here's the (open) issue: https://issuetracker.google.com/issues/36859738
As explained in the question you referred to, when you create a new firewall rule you change the Source Filter field from IP ranges to Service Account. In the following step you won't need to specify any IPs, only the name of the service account for Cloud Functions.
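With the gcloud CLI, that rule would look roughly like this (the rule name, port, and service account email are placeholders, and I'm assuming the console steps above map to these flags):

```shell
gcloud compute firewall-rules create allow-functions-to-redis \
  --network=default \
  --direction=INGRESS \
  --action=ALLOW \
  --rules=tcp:6379 \
  --source-service-accounts=my-project@appspot.gserviceaccount.com
```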

Integration of Docker with OpenStack via the Docker Heat Plugin

I'm trying to integrate Docker with OpenStack (Icehouse) via the Docker Heat plugin and I'm facing a problem.
OpenStack is configured according to the OpenStack tutorial for Ubuntu. I'm using a controller node and a compute node (just the two nodes) with legacy nova-networking.
Things to keep in mind:
Controller Node: 1 network interface - management interface
Compute Node: 2 network interfaces - management interface and the external interface (VM instances have IPs in the same subnet as that external interface)
With OpenStack everything works perfectly, except for the following (which might be related to the problem I'm facing with Docker):
1- You can't reach (ping) the deployed VM instances from the controller node [makes sense, I think no problem in that one]
2- You can't reach (ping) the deployed VM instances from the compute node (ping: operation not permitted) [might be the issue] - but you can ping from a VM instance to the compute node
3- The virtual machines themselves don't see each other [but I think this has no relation to the issue I'm facing]
For Docker, the plug-in is installed. I assume it's installed correctly, since the syntax for Docker resources (DockerInc::Docker ...) is accepted, but when I try to run the example posted in the Docker blog - making the adjustments required - the compute instance is created but the Docker container is not. I'm getting this error:
When I try it as a user with the admin role:
MissingSchema: Invalid URL u'192.168.122.26/v1.9/containers/None/json': No schema supplied. Perhaps you meant http://192.168.122.26/v1.9/containers/None/json
When I try it as a user with just the member role:
MissingSchema: Invalid URL u'192.168.122.26/v1.9/containers/create': No schema supplied. Perhaps you meant http://192.168.122.26/v1.9/containers/create
Notes:
192.168.122.26 is the IP of the created VM instance.
I've tried not only with cirros but also coreos and ubuntu-precise (same error).
Docker itself is installed on both the controller and the compute node.
The Docker plugin and its requirements are only installed on the controller node.
Finally, both the controller and the compute nodes run as virtual machines themselves.
I would be really glad if you had an idea. Thanks for your time,
Kindest Regards,
M. El Sioufy
My guess is that you haven't allowed communication to the VMs from the outside world (which the controller and/or the compute node will be from the VM's point of view). By default, communications from VMs to the outside world are allowed, but not inbound to the VMs. Try adding an "allow all TCP" rule to the default security group of the tenant that the VMs live in. This may fix your HTTP timeout.
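With legacy nova-networking of that era, such a rule can be added along these lines (the security group name assumes the VMs are in the tenant's default group):

```shell
# allow all inbound TCP to instances in the default security group
nova secgroup-add-rule default tcp 1 65535 0.0.0.0/0
# and ICMP too, so ping works
nova secgroup-add-rule default icmp -1 -1 0.0.0.0/0
```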