AWS Data Pipeline ServiceAccessSecurityGroup - EMR

When I try to create an EmrCluster resource with these properties:
Emr Managed Master Security Group Id
Emr Managed Slave Security Group Id
I get this error: Terminated with errors. You must also specify a ServiceAccessSecurityGroup if you use custom security groups.

Service Access Security Group: besides the firewall settings in the two security groups mentioned above, traffic between the AWS EMR service servers (you don't have any control over these; they are completely managed by AWS) and your slave EMR instances has to be allowed.
This security group contains two inbound entries (type, protocol, port, source):
HTTPS (8443)  TCP (6)  8443  ElasticMapReduce-Slave-Private (sg-id)
HTTPS (8443)  TCP (6)  8443  default security group of the VPC
Without this, EMR will not work with Data Pipeline, and Data Pipeline does not provide a way to specify this group in the pipeline definition. The AWS team is aware of this.
So, as a workaround, please use the custom template provided by AWS: clone it and edit it according to your needs.
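For reference, a hedged sketch of creating such a service access group with the AWS CLI (all IDs are placeholders; the two ingress rules mirror the entries listed above):
aws ec2 create-security-group --group-name ElasticMapReduce-ServiceAccess \
    --description "EMR service access" --vpc-id vpc-xxxxxxxx
# 8443 from the EMR slave (private) security group
aws ec2 authorize-security-group-ingress --group-id sg-serviceaccess \
    --protocol tcp --port 8443 --source-group sg-slave-private
# 8443 from the default security group of the VPC
aws ec2 authorize-security-group-ingress --group-id sg-serviceaccess \
    --protocol tcp --port 8443 --source-group sg-vpc-default
The group still has to be attached to the cluster, which is why the template workaround above is needed.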
Thanks to #blamblam for pointing that out. The previous steps assume the servers have already been created in the private subnets and that you need to allow this communication.
To launch in a private subnet, include one more setting, Subnet Id; this will launch your EMR cluster in the private subnet.
Hope that helps.

Related

How do I point traffic from a GCE external IP to a secondary internal IP?

I currently have a GCE instance that is running Jenkins, and I want to be able to access it from the browser. It's running on an IP address OTHER than the primary internal address Google gives me. So for example, the primary internal IP is 10.128.0.8, but Jenkins is running at 10.0.1.15:8081.
How do I direct traffic from <EXTERNAL_IP>:8081 to 10.0.1.15:8081 ?
Please note that my Linux skills are shaky and my networking skills are non-existent, so if you can tell me HOW to do whatever it is I need to do, bonus. :) Thanks!
1- First you need to create a firewall rule on the current instance's network, e.g.:
gcloud beta compute --project=<project-name> firewall-rules create jenkins --description="8081 port jenkins" --target-tags=jenkins --network=<network-name> --action=ALLOW --rules=tcp:8081
Then you have to apply that rule to the instance by adding the tag created above, e.g.:
gcloud compute instances add-tags <instance-name> --tags jenkins
2- The other way is via the Cloud Console, under VPC network / Firewall rules, and then adding the firewall rule's tag to your instance.
However, you should consider using Alias IP Ranges (the documentation on them may answer your question, together with the firewall rule created above for the external IP).
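If 10.0.1.15 is an alias/secondary IP on the same instance, another option is a simple DNAT on the instance itself. A sketch, assuming the firewall rule above already allows tcp:8081 and that the primary interface is eth0 (an assumption):
sudo iptables -t nat -A PREROUTING -i eth0 -p tcp --dport 8081 \
    -j DNAT --to-destination 10.0.1.15:8081
This just rewrites traffic arriving on the primary internal IP (which is where the external IP is mapped) so it reaches the address Jenkins is actually bound to.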

Route Propagation on AWS via Terraform

My company uses AWS as a cloud provider and Terraform for our infrastructure-as-code piece. I need to make a change to the way our traffic routes in AWS. We currently have one NAT gateway, so if the AZ it lives in went down, we'd lose external connectivity from the instances on our private subnets.
I've created two extra NAT gateways, one in each AZ. I have done all of this through Terraform OK, but I've run into a stumbling block when it comes to the routing.
I've created the type of setup where you have a route table for the private and the public subnets in each AZ:
(diagram: NAT GW architecture and routing)
We have a Direct Connect and use BGP to advertise our Datacentre networks to AWS. I can't seem to figure out how to enable route propagation on the private subnet route tables so that our on-prem networks get populated in these route tables.
resource "aws_route_table" "private-subnet-a-routes" {
vpc_id = "${aws_vpc.foo.id}"
propogating_vgws "${aws_vgw.foo.id}"
I have tried that but get the error below:
resource 'aws_route_table.private-subnet-a-routes' config: unknown resource 'aws_vgw.foo' referenced in variable aws_vgw.foo.id
Does anyone know how to set routes to be propagated on a route table from the main VGW in your VPC?
Thanks in advance
Chris
I'm not sure this answers your question, but it might give you some ideas:
resource "aws_route" "nat" {
count = "${var.num_availability_zones}"
route_table_id = "${element(aws_route_table.private.*.id, count.index)}"
destination_cidr_block = "0.0.0.0/0"
nat_gateway_id = "${element(aws_nat_gateway.nat.*.id, count.index)}"
depends_on = ["aws_internet_gateway.main", "aws_route_table.private"]
}
https://www.terraform.io/docs/providers/aws/d/route.html
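On the propagation question itself, a hedged sketch using the names from the question. It assumes the virtual private gateway is declared in Terraform as an aws_vpn_gateway resource named foo (there is no aws_vgw resource type, which is what the error is complaining about), and note the argument is spelled propagating_vgws:
resource "aws_route_table" "private-subnet-a-routes" {
  vpc_id           = "${aws_vpc.foo.id}"
  propagating_vgws = ["${aws_vpn_gateway.foo.id}"]
}
Alternatively, the standalone aws_vpn_gateway_route_propagation resource takes a vpn_gateway_id and a route_table_id and achieves the same thing; if the gateway was created outside Terraform, you can pass its vgw-... ID directly.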

Kubernetes update changes static+reserved external IPs for nodes in Google Cloud

I have three nodes in my google container cluster.
Every time I perform a Kubernetes update through the web UI on the cluster in Google Container Engine, my external IPs change, and I have to manually reassign the previous IPs to all three instances in the Google Cloud Console.
These are reserved static external IPs, set up using the following guide:
Reserving a static external IP
Has anyone run into the same problem? Starting to think this is a bug.
Perhaps you can set up the same static outbound external IP for all the instances to use, but I cannot find any information on how to do so. That would be a solution as long as it persists through updates; otherwise we've got the same issue.
It's only updates that cause this, not restarts.
I was having the same problem as you. We found some solutions.
KubeIP - but this needs a cluster on 1.10 or higher; ours is 1.8.
NAT - the GCP documentation describes this method, but it was too complex for me.
Our Solution
We followed the documentation for assigning IP addresses on GCE, using the command line.
Using this method, we haven't had any problems so far. I don't know the risks of it yet; if anyone has an idea, it would be good to hear.
We basically just ran:
gcloud compute instances delete-access-config [INSTANCE_NAME] --access-config-name [CONFIG_NAME]
gcloud compute instances add-access-config [INSTANCE_NAME] --access-config-name "external-nat-static" --address [IP_ADDRESS]
If anyone has any feedback on this solution, please share it with us.
#Ahmet Alp Balkan - Google
You should not rely on the IP addresses of each individual node. Instances can come and go (especially when you use Cluster Autoscaler), and their IP addresses can change.
You should always expose your applications with a Service or Ingress; the IP addresses of the load balancers created by these resources do not change between upgrades. Furthermore, you can convert the IP address of a load balancer to a static (reserved) IP address.
I see that you're assigning static IP addresses to your nodes. I don't see any reason to do that. When you expose your services with Service/Ingress resources, you can associate a static external IP with them.
See this tutorial: https://cloud.google.com/container-engine/docs/tutorials/http-balancer#step_5_optional_configuring_a_static_ip_address
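A minimal sketch of that approach for a Service of type LoadBalancer (the service name, region, and address below are placeholders; the Ingress route in the linked tutorial uses a global address and an annotation instead):
# Reserve a regional static external IP and read it back
gcloud compute addresses create my-lb-ip --region us-central1
gcloud compute addresses describe my-lb-ip --region us-central1 --format='value(address)'
# Pin the Service's load balancer to the reserved address
kubectl patch service my-service -p '{"spec": {"loadBalancerIP": "RESERVED_ADDRESS"}}'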

Running Kubernetes on vCenter

So Kubernetes has a pretty novel network model that I believe is based on what it perceives to be a shortcoming of default Docker networking. While I'm still struggling to understand (1) what it perceives the actual shortcoming(s) to be, and (2) what Kubernetes' general solution is, I'm now reaching a point where I'd like to just implement the solution, and perhaps that will clue me in a little better.
Whereas the rest of the Kubernetes documentation is very mature and well-written, the instructions for configuring the network are sparse, largely incoherent, and span many disparate articles, instead of being located in one particular place.
I'm hoping someone who has set up a Kubernetes cluster before (from scratch) can help walk me through the basic procedures. I'm not interested in running on GCE or AWS, and for now I'm not interested in using any kind of overlay network like flannel.
My basic understanding is:
Carve out a /16 subnet for all your pods. This will limit you to some 65K pods, which should be sufficient for most normal applications. All IPs in this subnet must be "public" and not inside of some traditionally-private (classful) range.
Create a cbr0 bridge somewhere and make sure it's persistent (but on what machine?)
Remove/disable the MASQUERADE rule installed by Docker.
Somehow configure iptables routes (again, where?) so that each pod spun up by Kubernetes receives one of those public IPs (see the sketch after this list).
Some other setup is required to make use of load-balanced Services and dynamic DNS.
Provision 5 VMs: 1 master, 4 minions
Install/configure Docker on all 5 VMs
Install/configure kubectl, controller-manager, apiserver and etcd to the master, and run them as services/daemons
Install/configure kubelet and kube-proxy on each minion and run them as services/daemons
This is the best I could collect from two full days of research, and these steps are likely wrong (or misdirected), out of order, and utterly incomplete.
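For items 2-4 in the list above, here is a rough per-node sketch of what this kind of "cluster from scratch" setup boils down to. The bridge name cbr0 comes from the list; the example node pod subnet 10.244.1.0/24, the interface commands, and the Docker daemon flags are assumptions that may vary with your Docker version:
# On each minion: create a persistent cbr0 bridge with that node's pod subnet (needs bridge-utils)
sudo brctl addbr cbr0
sudo ip addr add 10.244.1.1/24 dev cbr0    # 10.244.1.0/24 is an assumed example range
sudo ip link set dev cbr0 up
# Point Docker at the bridge and stop it from NATing pod traffic
# (daemon flags; put these in your Docker service configuration)
#   --bridge=cbr0 --iptables=false --ip-masq=false
# Remove the MASQUERADE rule Docker may have already installed, e.g. the default one:
sudo iptables -t nat -D POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE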
I have unbridled access to create VMs in an on-premise vCenter cluster. If changes need to be made to VLAN/Switches/etc. I can get infrastructure involved.
How many VMs should I set up for Kubernetes (for a small-to-medium sized cluster), and why? What exact corrections do I need to make to my vague instructions above, so as to get networking totally configured?
I'm good with installing/configuring all the binaries. Just totally choking on the network side of the setup.
For a general introduction into kubernetes networking, I found http://www.slideshare.net/enakai/architecture-overview-kubernetes-with-red-hat-enterprise-linux-71 pretty helpful.
On your items (1) and (2): IMHO they are nicely described in https://github.com/kubernetes/kubernetes/blob/master/docs/admin/networking.md#docker-model .
From my experience: what is the problem with the Docker NAT-type approach? Sometimes you need to configure into the software all the endpoints of all nodes (172.168.10.1:8080, 172.168.10.2:8080, etc.). In Kubernetes you can simply configure the IPs of the pods into each other's pods; Docker complicates this with its NAT indirection.
See also Setting up the network for Kubernetes for a nice answer.
Comments on your other points:
1.
All IPs in this subnet must be "public" and not inside of some traditionally-private (classful) range.
The "internal network" of kubernetes normally uses private IP's, see also slides above, which uses 10.x.x.x as example. I guess confusion comes from some kubernetes texts that refer to "public" as "visible outside of the node", but they do not mean "Internet Public IP Address Range".
For anyone who is interested in doing the same, here is my current plan.
I found the kube-up.sh script, which installs a production-ish quality Kubernetes cluster on your AWS account. Essentially it creates 1 Kubernetes master EC2 instance and 4 minion instances.
On the master it installs etcd, the apiserver, the controller manager, and the scheduler. On the minions it installs the kubelet and kube-proxy. It also creates an auto-scaling group for the minions (nice), and creates a whole slew of security- and networking-centric things on AWS for you. If you run the script and it fails creating the AWS S3 bucket, create a bucket with the exact same name manually and then re-run the script.
When the script is finished you will have Kubernetes up and running and ready for near-production usage (I keep saying "near" and "production-ish" because I'm too new to Kubernetes to know what actually constitutes a real-deal production cluster). You will need the AWS CLI installed and configured with a user that has full admin access to your AWS account (the script goes ahead and creates IAM roles, etc.).
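For reference, a sketch of how it is invoked (this matches the getting-started docs of the time; paths and variables may differ in newer releases):
# From the root of an unpacked Kubernetes release, with the AWS CLI configured
export KUBERNETES_PROVIDER=aws
./cluster/kube-up.sh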
My game plan will be to:
Get comfortable working with Kubernetes on AWS
Keep hounding the Kubernetes team on Slack to help me understand how Kubernetes works under the hood
Reverse engineer the kube-up.sh script so that I can get Kubernetes running on premise (vCenter)
Blog about this process
Update this answer with a link to said blog.
Give me some time and I'll follow through.

Integration of Docker with OpenStack via the Docker Heat Plugin

I'm trying to integrate Docker with OpenStack (Icehouse) via the Docker Heat plugin, and I'm facing a problem.
OpenStack is configured according to the tutorial by OpenStack for Ubuntu. I'm using a controller node and a compute node (just the 2 nodes) with the legacy nova-networking.
Things to keep in mind:
Controller Node: 1 network interface - management interface
Compute Node: 2 network interfaces - the management interface and the external interface (VM instances have IPs in the same subnet as that external interface)
With OpenStack everything works perfectly, except for the following (which might be the cause of the problem I'm facing with Docker):
1- You can't reach (ping) the deployed VM instances from the controller node [makes sense, I think that's not a problem]
2- You can't reach (ping) the deployed VM instances from the compute node (ping: operation not permitted) [this might be the issue] - but you can ping from a VM instance to the compute node
3- The virtual machines themselves don't see each other [but I don't think this is related to the issue I'm facing]
For Docker, the plugin is installed. I assume it is installed correctly, since the DockerInc::Docker ... syntax is accepted, but when I try to run the example posted in the Docker blog - making the required adjustments - the compute instance is created but the Docker container is not. I'm getting this error:
When I try it as a user with the admin role:
MissingSchema: Invalid URL u'192.168.122.26/v1.9/containers/None/json': No schema supplied. Perhaps you meant http://192.168.122.26/v1.9/containers/None/json
When I try it as a user with just the member role:
MissingSchema: Invalid URL u'192.168.122.26/v1.9/containers/create': No schema supplied. Perhaps you meant http://192.168.122.26/v1.9/containers/create
Notes:
192.168.122.26 is the IP of the created VM instance.
I've tried not only with CirrOS but also CoreOS and ubuntu-precise (same error).
Docker itself is installed on both the controller and the compute node.
The Docker plugin and its requirements are installed only on the controller node.
Finally, both the controller and the compute nodes run as virtual machines themselves.
I would be really glad if you had an idea. Thanks for your time,
Kindest Regards,
M. El Sioufy
My guess is that you haven't allowed communication to the VMs from the outside world (which the controller and/or the compute node will be from the VM's point of view). By default, communications from VMs to the outside world are allowed, but not inbound to the VMs. Try adding an "allow all TCP" rule to the default security group of the tenant that the VMs live in. This may fix your HTTP timeout.
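With legacy nova-networking, a hedged sketch of adding such rules to the tenant's default security group (run as that tenant; the wide port range follows the "allow all TCP" suggestion and can be tightened to the Docker API port once things work):
# Allow all inbound TCP, plus ICMP so ping works
nova secgroup-add-rule default tcp 1 65535 0.0.0.0/0
nova secgroup-add-rule default icmp -1 -1 0.0.0.0/0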
