How can I set up a Docker network with restricted communication? - networking

I'm trying to create something like this:
The server containers each have port 8080 exposed, and accept requests from the client, but crucially, they are not allowed to communicate with each other.
The problem here is that the server containers are launched after the client container, so I can't pass container link flags to the client like I used to, since the containers it's supposed to link to don't exist yet.
I've been looking at the newer Docker networking stuff, but I can't use a bridge because I don't want server cross-communication to be possible. It also seems to me like one bridge per server doesn't scale well, and would be difficult to manage within the client container.
Is there some kind of switch-like docker construct that can do this?

It seems like you will need to create multiple bridge networks, one per container. To simplify that, you may want to use docker-compose to specify how the networks and containers should be provisioned, and have the docker-compose tool wire it all up correctly.
Resources:
https://docs.docker.com/engine/userguide/networking/dockernetworks/
https://docs.docker.com/compose/
https://docs.docker.com/compose/compose-file/#version-2
One more side note: I think that exposed ports are accessible to all networks. If that's right, you may be able to set all of the server networking to none and rely on the exposed ports to reach the servers.

Hope this is relevant to your use-case - I'm attempting to draw context regards your actual application from the diagram and comments. I'd recommend you go the Service Discovery route. It may involve a little bit of simple API over a central store (say Redis, or SkyDNS), but would make things simple in the long run.
Kubernetes, for instance, uses SkyDNS to do so with DNS. At the end of the day, any orchestration tool of your choice would most likely do something like this out of the box: https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/dns
The idea is simple:
Use a DNS container that keeps entries of newly spawned servers
Allow the Client Container to query it for a list of servers. e.g. Picture a DNS response with a bunch of server-<<ISO Timestamp of Server Creation>>s
Disallow client containers read-access to this DNS (how to manage this permission-configuration without indirection, i.e. without proxying through an endpoint that allows writing into the DNS Container, but not reading, is going to exotic)
Bonus Edit: I just realised you can use a simpler Redis-like setup to do this, and that DNS might just be overengineering :)

Related

How to spin up proxies on demand to use for webscraping?

I've been using web scraping services for a bit but I need such a high volume that I can't pay for those services anymore.
I was thinking to build my own proxy pool to mimic these services for myself but didn't know how to do it.
My solutions:
My first thought was to use kali linux's proxychain to get the same result but I don't know if that would work since I can't find the list or proxies in the code.
I also thought that I could create a server config with Ansible, and use Terraform to spin cloud servers either on demand or until blacklisted but I'm not sure it'll be fast and reliable or if cloud providers could block me.
Lastly, I thought about using VMs or Containers since you can attribute them random IPs and bridge networked VMs have random IPs so if it gets banned just delete it and spin up a new one.
Does someone know which way between the three makes the most sense to achieve what I'm looking for ?
I don't know what is required to get a pool so reliable that some companies have created business of offering them for webscraping such as ScrapingBee but just on the reliable proxy pool side, because all the free proxy pools online are not really good.
I imagine that if it was public knowledge, those companies wouldn't exist, but I'm sure I'm missing something there.
Thank you.

Floating IPs usage on Digital Ocean

I am looking for a basic thing yet I have not found not even a single good documentation on getting it done.
I want to allocate a floating IP, then associate it to a network interface of a droplet other than eth0.
The reason is I want to have the ability to very easily switch from one IP to the other with a programming language.
In a few words, I want to be able to do these two commands and both should provide a different response.
curl --interface eth0 https://icanhazip.com
curl --interface eth1 https://icanhazip.com
Also, I want to know what to do once I release the Floating IP, how do I roll back to the starting point.
All documentation I read, rely heavily on "ip route" and "route", most did not even work, some worked but replaced completely the old IP by the floating and that's not what I want, and also they did not show how to rollback the introduced configuration changes.
Please help, I spent 1 whole day now trying to get this to work for a project, and no results so far.
I guess there is no need to know DigitalOcean, how to make this work on other Cloud Providers would apply here too I think.
Update
After asking this on DigitalOcean community forum (https://www.digitalocean.com/community/questions/clear-guide-on-outbound-network-through-floating-ip), they claim that is not supported, although there may be some solutions to this if somebody can provide such a "hacky" solution I would take it too. Thanks
In the cloud (AWS. GCP etc.) ARP is emulated by the virtual network layer, meaning that only IPs assigned to VMs by the cloud platform can be resolved. Most of the L2 failover protocols do break for that reason. Even if ARP worked,the IP allocation process for these IPs (often called “floating IPs”) would not integrate with the virtual network in a standard way, so your OS can't just "grab" the IP using ARP and route the packets to itself.
I have not personally done this on Digital Ocean, but I assume that you can call the cloud's proprietary API to do this functionality if you would like to go this route.
See this link on GCP about floating IPs and their implementation. Hope this is helpful.
Here's an idea that needs to be tested:
Let's say you have Node1(10.1.1.1/24) and Node2(10.1.1.2/24)
Create a loopback interface on both VMs and set the same IP address for both like (10.2.1.1/32)
Start a heartbeat send/receive between them
When NodeA starts it automatically makes an API call to create a route for 10.2.1.1/32 and points to itself with preference 2
When NodeB starts it automatically makes an API call to create a route for 10.2.1.1/32 and points to itself with preference 1
The nodes could monitor each other to withdraw the static routes if the other fails. Ideally you would need a 3rd node to reach quorum and prevent split brain scenarios, but you get the idea right?

What is the replacement for `--net=container` in new docker networking?

In the pre docker 1.9 days I used to have a vpn provider container which I could use as the network gateway for my other containers by passing the option --net=container:[container-name].
This was very simple but had a major limitation in that the provider container had to exist prior to starting the consumers and it could not be restarted.
The new docker networking stack seems to have dropped this provision in favour of creating networks which does sound better, but I'm struggling to get equivalent behaviour.
Right now I have created an internal network docker network create isolated --internal --subnet=172.32.0.0/16 and brought up 2 containers one of which is attached only to internal network and one which is attached to both the default bridge and the internal network.
Now I need to route all network traffic from the isolated container through the connected one. I've messed around with some iptable rules but tbh this is not my strongest area.
So my questions are simply: Is my approach along the right lines? What rules need to be in place in the two containers to get this working as --net=container?

How can two services discover each other without static addresses?

Supposed I have two services that need to share and / or exchange data. Both instances are separate from each other, and they shall not know anything about where the other part is located.
Now in order for them to be able to share and / or exchange data, they need to connect to each other.
How do they find each other without the need to configure the IP addresses explicitly? In other words: How could they detect each other automatically?
Basically, I have two ideas:
Pull: You need to have a central service where they register. Then you can ask that service for the address of a service, and that service then returns those data. While this works, it has the drawback that it only shifts the problem to the next level: What if I have multiple instances of that service, and I don't want them to know each other in advance?
Push: Each service broadcasts its own address, so that other services get it to know. Each service repeats this from time to time. Drawback: This does hardly work in the internet.
Any idea of how I could solve this in an intelligent way?
PS: If you want to say so, I'm looking for a way to handle dynamic IPs without the need for a central DNS server.
The usual way is to have some fault-tolerant server where services register and can then look for other services - Curator framework implements that over zookeeper.
If you want autodiscovery then you should probably implement some sort of gossip protocol so that the servers would know which other servers are out there in a reliable way. You should keep in mind that getting gossip protocols right is tricky (e.g. some of past Amazon cloud failures where due to problems in their implementation)
"broadcast packets are not forwarded everywhere on a network, but only to devices within a broadcast domain."
If your devices are on different broadcast domains then broadcasting is not going to work.
You are probably going to have to implement your own central service, unless you can use one of the free dynamic dns servers, for example: Free

Best way to prevent denial of service attacks on a website

I have a web app and I would like to prevent DOS attacks by blocking an IP address if it make many request in a short period of time.
For example, if the same IP address makes 100 request in a second, I can assume that it's some kind of attack and I would like to block this IP.
However, making this check in the application layer seems too expensive - what is the optimal way to make this check?
Should I make this kind of check at my:
firewall
router
apache config
someplace else entirely ...
If you want to block IP addresses when they make a certain number of requests, this is best done at the Network layer. This would suggest that you do this either in your host machine's network stack or using a router (which operates at the network layer).
Some things you might want to consider though are:
- Are you really wanting to block access to the entire host based on an IP address, or do you want to block access to a specific application running on a specific port.
- Sometimes, by using NATs, one IP address may be making requests on behalf of many real hosts.
With any security application you need to have many layers of defence, so it would be a good idea to invest in a good firewall as well.
Some apps for generating APIs in django implement some methods for limiting the amount of request per second.
For example django-piston use throttling method to do that.
django-piston throttling
Thats an easy way to solve the problem.

Resources