Will a request to api.myapp.com be slower than a request to api-myapp.herokuapp.com when hosted on Heroku? - http

I'm trying to understand the best way to handle SOA on Heroku. I've got it into my head that making requests to custom domains will somehow be slower. Or do all requests go "out" via the internet?
On previous SOA projects we had dedicated hosting, so we could make requests like http://blogs/ (on the internal network). I'm wondering whether Heroku treats *.herokuapp.com requests as "internal", or whether it is clever enough to know that myapp.com is actually myapp.herokuapp.com and route locally. Or am I missing the point completely, and in fact all requests are "external"?

What you are asking about comes down to how internet requests work in general.
Whenever your application makes a request to, say, example.com, the domain name is first translated into an IP address by DNS servers.
So it does not matter whether you request myapp.com or myapp.herokuapp.com: you always request information from a specific IP address, and the domain name you asked for is passed along as part of the request headers (the Host header).
The server that receives the request looks up this domain name in its internal records and handles the request accordingly.
The conclusion is that it does not matter whether you use myapp.com or myapp.herokuapp.com; the speed of the request will be the same.
PS: As Heroku load balances your requests between the running instances of myapp.com, the speed will depend on several factors: how quickly your application responds, how many instances you have running, the load average per instance, and how heavily loaded the load balancer is at that moment. But it will not depend on which domain name you use.
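To make the point about the request headers concrete, here is a minimal sketch in Python. The helper name build_request is made up for illustration, and it only builds the bytes a client would send after the DNS lookup; it does not contact Heroku or any other server.

```python
def build_request(host, path="/"):
    """Return the bytes of a minimal HTTP/1.1 GET request for the given host.

    Hypothetical helper for illustration only: nothing is sent over the network.
    """
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",       # the domain name travels here, not in the IP packet
        "Connection: close",
        "",                    # blank line that ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


custom = build_request("api.myapp.com")
default = build_request("api-myapp.herokuapp.com")

# The two requests are identical apart from the value of the Host header.
print(custom.decode("ascii"))
print(default.decode("ascii"))
```

Whichever name is used, the receiving side sees a request of the same shape and only the Host value differs.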

Related

Kubernetes sticky session / load balance by header value

I'm working on a project where I want to use Kubernetes and Docker. The microservice I'm about to implement must create a permanent HTTP/2 connection to another service (provided by others; I can't modify anything in that service) per user/client in order to send asynchronous, cloud-initiated messages to that user. Also, each subsequent request from that client must use the same connection.
Obviously that is a challenge in terms of scalability, because every request from a client must be routed to the same instance of my microservice, the one that created the permanent connection to the other service. What makes things worse is that my clients can change their IPs and can't use cookies. What they can do is send a custom header value which identifies them.
I thought about HAProxy and nginx, but can't find an option in either of them to load balance requests by a header value. Is there really no way to do that? How would you approach that issue? Any ideas?
Thanks!

What is "Reverse Proxy" and "Load Balancing" in Nginx / Web server terms?

These are two phrases I hear very often, mainly in connection with Nginx. Can someone give me a layman's definition?
Definitions are often difficult to understand. I guess you just need an explanation of the use cases.
The short explanation: load balancing is one of the functions of a reverse proxy, and a reverse proxy is one kind of software that can do load balancing.
The long explanation is given below.
For example, suppose a service of your company has customers in the UK and Germany. Because the policy differs between the two countries, your company has two web servers, uk.myservice.com for the UK and de.myservice.com for Germany, each with different business logic. In addition, your company wants a single unified endpoint, myservice.com, for the service. In this case you set up a reverse proxy as the unified endpoint. The proxy accepts requests for myservice.com and rewrites the URL of each incoming request so that requests from the UK (determined by source IP) go to uk.myservice.com and requests from Germany go to de.myservice.com. A client in the UK never knows that the response was actually generated by uk.myservice.com.
Here the request traffic to the service is balanced across the servers at uk.myservice.com and de.myservice.com only as a side effect, so we normally do not call it a load balancer; we just call it a reverse proxy.
But suppose your company uses the same policy for all countries and has two servers, a.myservice.com and b.myservice.com, only because the workload is too heavy for one machine. In this case we normally call the reverse proxy a load balancer, to emphasize why it is being used.
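The two situations can be sketched side by side in Python. This is only an illustration, not how Nginx implements either feature: the function names are made up, and the country code is assumed to have been worked out from the source IP by some other component.

```python
import itertools

# Reverse proxy case: the backend is chosen by business rule (the client's country).
REGION_BACKENDS = {
    "UK": "uk.myservice.com",
    "DE": "de.myservice.com",
}


def route_by_region(country_code):
    """Pick the backend that holds the business logic for this country."""
    return REGION_BACKENDS[country_code]


# Load balancer case: the backends are interchangeable, so just take turns.
def round_robin(backends):
    """Return an endless iterator that cycles through the backends in order."""
    return itertools.cycle(backends)


print(route_by_region("UK"))

pool = round_robin(["a.myservice.com", "b.myservice.com"])
print([next(pool) for _ in range(3)])
```

In both cases the client only ever talks to myservice.com; the difference is the reason a particular backend is chosen.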
Here is the basic definition:
A reverse proxy is a proxy host that receives requests from a client and sends each one to one of the servers behind it. Nginx and Apache httpd are commonly used as reverse proxies. They sit in the administrative network of the web server that serves the request.
This is in contrast with a (forward) proxy, which sits in front of a client and sends requests to a web server on behalf of that client. As an example, your corporate network address translator acts much like a forward proxy. These sit in the administrative network of the client from which the request originates.
Load balancing is a function performed by reverse proxies. Client requests are received by the load balancer, which tries to send each request to one of the nodes (hosts) in the server pool, in an attempt to balance the load across the nodes.
I see both of them as functions of an HTTP/web server.
A load balancer's job is to distribute the workload between server nodes in a way that makes the best use of them.
A reverse proxy is an interface to the external world, forwarding requests to a server node (even when there is only a single node).
Its other use cases include caching of static content, compression, etc.

Can I forge the HTTP HOST-header param in order to fake a request to a non-mapped subdomain?

Scenario: I want a staging environment at a customer's site. The customer owns www.example.com. I want to map the site to staging.example.com reachable from the outside, but I haven't got time to wait for the bureaucracy surrounding either the purchase of the new subdomain or opening of secondary HTTP ports.
Assumption: If I spoof the HTTP Host header to be staging.example.com on the client side, but actually make the request to the IP of www.example.com, IIS will route the request to the site configured for staging.example.com. Am I right?
So is there any client tool that can help me with that? I'm fairly familiar with Fiddler, but it seems to override my rewrites of the Host header. I would also need to configure it to do this for every request, not just one, to make testing trivial.
Are there simpler solutions to this problem?
I'm not entirely sure what you're asking.
Inside Fiddler, by clicking Tools > HOSTS you can send all traffic targeting one site, e.g. dev.example.com, to the IP of your choice. The target site (namely dev.example.com) doesn't need to exist at all in this case. Your client (e.g. the browser) has no idea that Fiddler is retargeting the traffic; it just thinks that it is talking to dev.example.com.
If you have the Fiddler book, check out the Retargeting Traffic section for many other ways to retarget traffic.
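If a script is acceptable instead of a GUI tool, the same idea can be tried with Python's standard library: connect to the IP directly and set the Host header yourself. The sketch below is self-contained and uses a throwaway local server as a stand-in for IIS; the SITES table, the handler and fetch_with_host are all made up for illustration, and real IIS bindings may behave differently.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical virtual hosts, all served from one IP address.
SITES = {
    "www.example.com": "production site",
    "staging.example.com": "staging site",
}


class VirtualHostHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Choose the site from the Host header, ignoring any ":port" suffix.
        host = (self.headers.get("Host") or "").split(":")[0]
        body = SITES.get(host, "unknown site").encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def fetch_with_host(ip, port, host):
    """Connect to ip:port but present `host` in the Host header."""
    conn = http.client.HTTPConnection(ip, port, timeout=5)
    try:
        # Supplying our own Host header stops http.client from adding one.
        conn.request("GET", "/", headers={"Host": host})
        return conn.getresponse().read().decode()
    finally:
        conn.close()


def demo():
    server = HTTPServer(("127.0.0.1", 0), VirtualHostHandler)  # port 0 = any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return [fetch_with_host("127.0.0.1", port, name)
                for name in ("www.example.com", "staging.example.com")]
    finally:
        server.shutdown()
        server.server_close()


print(demo())
```

Both requests go to the same IP and port; only the Host header decides which site answers.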

How to make a fault tolerant system which can immediately handle the situation when a server goes down

Before XYZ.com went down I noticed that my requests were being routed to IP address 192.33.31.xxx, and when it came back up I noticed that they were routed to IP 50.17.196.xxx. Is this some sort of server switching? Isn't dynamic server switching on a fault a way to make a system fault tolerant?
Most heavily used websites work with load balancers that direct requests to multiple servers. The logic that defines the routing ranges from very simple to very complex.
For example, the logic can be:
Once a request has been handled by a particular server, further requests from the same IP are handled by the same server, OR
Use a round-robin way of handling requests (in this case the IP has no role to play), OR
Route the request to the least used server, etc.
A website can also use a hybrid of all the above.
Now, if the website you were accessing was using IP-based routing, the sequence would have been:
This is your first request. You are assigned a server which will handle your request. (This can be by round robin or other logic, e.g. the server with the least load.)
The load balancer might then have logic that for the next X (say 100) requests, or for a period of Y (say 1 hour), or both, all requests from the same IP are routed to the same server.
So in your case, if you make a request to XYZ it routes to some server a.b.c.d, and if you then make another call after 1 hour, or you make your 101st call, you might get a different server handling your request.
Now, if a server goes down, you can have fallback mechanisms such as:
If a server does not respond within some preconfigured time period, remove it from the live server list in the load balancer configuration and reroute the request.
Or keep idle, already-running standby servers (this is costly, but appropriate when every request has to be served without fail). As soon as a server goes down, bring in a standby server and assign it the IP that was assigned to the server that went down.
There can be better and much more complex solutions to this problem, though.
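As a rough sketch of the hybrid described above (stick to one server per IP for X requests, pick by round robin otherwise, and drop a server from the live list when it fails), something like the following Python would do. The class and method names are made up, the time-based limit is left out, and how a failure is detected (timeouts, health checks) is assumed to happen elsewhere.

```python
class StickyBalancer:
    """Toy balancer: per-IP stickiness for a fixed number of requests, with failover."""

    def __init__(self, servers, max_requests=100):
        self.live = list(servers)         # servers currently considered healthy
        self.max_requests = max_requests  # the "X" from the description
        self._next = 0                    # round-robin counter
        self._assigned = {}               # client IP -> [server, requests remaining]

    def _pick(self):
        """Choose the next live server in round-robin order."""
        if not self.live:
            raise RuntimeError("no live servers")
        server = self.live[self._next % len(self.live)]
        self._next += 1
        return server

    def route(self, client_ip):
        """Return the server that should handle this client's request."""
        entry = self._assigned.get(client_ip)
        # Reassign on first contact, when the quota is used up,
        # or when the assigned server has been removed from the live list.
        if entry is None or entry[1] == 0 or entry[0] not in self.live:
            entry = [self._pick(), self.max_requests]
            self._assigned[client_ip] = entry
        entry[1] -= 1
        return entry[0]

    def mark_down(self, server):
        """Remove a failed server from the live list so requests are rerouted."""
        if server in self.live:
            self.live.remove(server)


balancer = StickyBalancer(["a", "b"], max_requests=2)
print([balancer.route("203.0.113.7") for _ in range(3)])  # two sticky requests, then a new pick
balancer.mark_down("b")
print(balancer.route("203.0.113.7"))                      # rerouted to a server that is still live
```

From the client's side this looks like what the question describes: the serving machine changes after a while, or right after the one it was using goes down.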

Get domain the server was reached over?

In general, on any non-HTTP server, would there be a way to detect which domain was used to reach the IP?
I know HTTP servers get the domain passed in the request header, but would this be possible with any other server that does not require the client to send this information?
I'm especially looking for a way to do this with the Minecraft server (Bukkit), so my preferred language (if you need one to answer) would be Java. But I'd like the theory not to be language specific.
In general, no, which is why the HTTP protocol includes it in the headers.
In order to reach your server, first a DNS lookup is performed to resolve your IP, which is then followed by the connection itself. These two steps are separate, and hard to link together.
Logging what domain was last requested by a client is tricky, too, as DNS information is often cached, so the DNS request may not even reach your DNS server before being answered.
If it isn't cached, it also often isn't directly looked up by the end client, but rather by a caching DNS server operated, for instance, by the ISP.
No. The only way to get the DNS name used to connect to a server is to have the client provide it.
No. If the protocol itself has no means for this, like the Host header in HTTP, you cannot find out which hostname the client used to resolve your IP address.
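If you control both the client and the server, the usual way out is to do what HTTP does and have the client state the name it used. Below is a minimal sketch in Python (rather than Java, to keep it short) of a made-up one-line handshake; the HELLO verb and both function names are hypothetical and not part of any real protocol.

```python
def build_handshake(hostname):
    """Client side: announce the hostname that was used to reach the server."""
    return f"HELLO {hostname}\n".encode("utf-8")


def parse_handshake(data):
    """Server side: recover the hostname from the first line the client sent."""
    first_line = data.decode("utf-8").split("\n", 1)[0]
    verb, _, hostname = first_line.partition(" ")
    if verb != "HELLO" or not hostname:
        raise ValueError("malformed handshake")
    return hostname


message = build_handshake("play.example.com")
print(parse_handshake(message))
```

The server only learns the name because the client chose to send it, so it should be treated as untrusted input.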

Resources