Amazon Cloudfront settings to reduce waiting time - cdn

UPDATE: This was my mistake, see my comment the below. Now Cloudfront works great with new settings.
Sometimes dns waits 600ms and than it will wait another half second which makes 90kb file waiting more than 1 second. Sometimes pingdom wait time shows even 1 second. If I try another test, it will go sometimes to 90ms all together.
I understand that first request will take more time because cloudfront needs first to take file from our server. I set cache time to 86400 s which means if it should get file from cache for whole 24 hours. But if I try pingdom just 2 hours after first test it will go again very slow.
The below are my results and settings. Am I missing something?

Most of the cases its the DNS that makes the delay because amazon is really scalable.
I had similar issues with my ISP and was able to resolve its quickly by changing the DNS servers.
Try changing your DNS to Google DNS
IP V4
8.8.8.8
8.8.4.4
IP V6
2001:4860:4860::8888
2001:4860:4860::8844
Google Public Dns Documentation
Or use OPEN DNS
208.67.220.220
208.67.222.222
OPEN DNS Documentation

CloudFront is not only scalable, it also eliminates bottlenecks, but aims to speed it up.
AWS CloudFront is a service with low latency and fast transfer rates.
Here are some of the symptoms that may be slower when using CloudFront.
(This includes most problems.)
The requesting Edge may be receiving a large number of requests.
The edge server closest to the client may be farther than the web host server.
(Geographic delay)
DNS lookups can be delayed.
There is not much of this possibility, but make sure the x-edge is in the "View in cloud front" state.
Cache may be missing.
Detailed troubleshooting is difficult because you do not know what the test is or what the condition is.
If logging is enabled, further troubleshooting is possible.
It is generally recommended to enable logging.
If you have any questions, please feel free to ask!
thank you.

Related

Can't stream more than 5-6 static videos simultaneously on a single client

Intro
Hi! First, let me say that I am a networking n00b. With that in mind :
We have a small video conferencing app, dedicated to eLearning, which does live streaming through janus (webRTC). We offer the possibility to our clients to record their videos. The live streaming part we got running pretty well now, but we are having a small issue with video playback.
When we are playing back the videos, as long as there aren't many large video files to stream, everything plays back correctly. However, we had fringe cases of trouble for a while and I was sent to investigate them.
Problem encountered
When playing back multiple large video files, I noticed that I can't get more than 4-5 large files to stream simultaneously on a single client.
When the limit is hit, there seems to be some kind of race condition lock happening : a few videos are stuck, and a few are playing (usually 2 or 3). Then one of the playing videos will get stuck, and one of the stuck videos will start playing.
However, it doesn't affect the playback on other clients. When it get stuck, I can't even connect to the MinIO web interface from the same client, but I can from another client (i.e. another browser, or from another machine). I can also stream as much from the other client as I can from the one that is stuck.
I've been testing different configurations on a test minio server by loading the same file many times from different tabs in Chrome, which seems to recreate the problem.
Server Description
The files are hosted on a cloud storage that offers > 7 Gpbs bandwidth. The storage is mounted on a MinIO instance in a kubernetes cluster, behind a NGINX Ingress Controller that serves as the single point of entry to the cluster, and so it also controls traffic to the other micro-services on the k8s cluster.
Each k8s node has a guaranteed bandwitdth of > 250 Mbps, if that matters in this case.
The MinIO instance is mainly used to create transient sharing rights. We call the videos simply by pointing to their location using the DNS we set up for the service.
What has been investigated and tried
At first, I thought it might be a MinIO misconfiguration. However, I looked at the config files and the documentation and couldn't find anything that seemed to limit the number of connections / requests per client.
While reading, I stumbled upon the fact that HTML/1.1 didn't allow for more than 6 connections on Chrome and thought I hit the jackpot. But then I went and looked and the protocol used to get the files is already HTTP2 (h2).
Then I went one level higher and looked through the configuration of the NGINX Ingress Controller. Here again, everything seems ok :
events {
multi_accept on;
worker_connections 16384;
use epoll;
}
[...]
http2_max_field_size 4k;
http2_max_header_size 16k;
http2_max_requests 1000;
http2_max_concurrent_streams 128;
[...]
So I've been scouring the net for a good while now and I'm getting more and more confused by what I could investigate next, so I thought I'd come here and ask my very first StackOverflow question.
So, is there anything I could do with the current setup to make it so we can stream more large files simultaneously? If not, what are your thoughts and recommendations?
Edit :
I've found a workaround by searching hard enough : Increase Concurrent HTTP calls
At first I was not a fan - HTTP2 is supposed, from my understanding, to support a lot of parallel requests. However, I think I found the crux of the problem here : https://en.wikipedia.org/wiki/Head-of-line_blocking
Further research led me to find these mitigations to that problem : https://tools.ietf.org/id/draft-scharf-tcpm-reordering-00.html#rfc.section.3.2
I'll have to look into SCTP and see if it is something I'd like to implement, however. At first glance, that seems rather complicated and might not be worth the time investment.

Squid Cache Behavior - Multiple Requests

I'm having to learn Squid on the fly for a project and have tried to Google this, but not sure how to ask the question so Google can find it. I have a customer that wants to distribute larger amounts of video to multiple clients, but they have a slow WAN. The idea is to have Squid as a reverse proxy and cache to hold the videos on the local networks for faster transfer to the end users and to avoid saturating the WAN.
My question is this. If multiple users try to download a video that Squid has not yet cached, or fully cached, at the same time, what happens? In my particular scenario (insert unnecessary explanation here), this will happen frequently. Is a separate download of the video from the host opened for each request to Squid, or is Squid smart enough to manage it and only have one connection back to the host actually downloading the video?
This problem is called thundering herd or cache miss storm. It happens with new resources and also on higly used resources's cache invalidation.
For squid used as a reverse Proxy there are some options available to handle this, like the Collapsed Forwarding. The idea is to temporize parallel queries to the same resources, and use only one query fetching the resource on the backend.
Check the link for details, because depending on the squid version you may have to try different configuration tokens and values.

How do I set up global load balancing using Digital Ocean DNS and Nginx?

UPDATE: See the answer I've provided below for the solution I eventually got set up on AWS.
I'm currently experimenting with methods to implement a global load-balancing layer for my app servers on Digital Ocean and there's a few pieces I've yet to put together.
The Goal
Offer highly-available service to my users by routing all connections to the closest 'cluster' of servers in SFO, NYC, LON, and eventually Singapore.
Additionally, I would eventually like to automate the maintenance of this by writing a daemon that can monitor, scale, and heal any of the servers on the system. Or I'll combine various services to achieve the same automation goals. First I need to figure out how to do it manually.
The Stack
Ubuntu 14.04
Nginx 1.4.6
node.js
MongoDB from Compose.io (formerly MongoHQ)
Global Domain Breakdown
Once I rig everything up, my domain would look something like this:
**GLOBAL**
global-balancing-1.myapp.com
global-balancing-2.myapp.com
global-balancing-3.myapp.com
**NYC**
nyc-load-balancing-1.myapp.com
nyc-load-balancing-2.myapp.com
nyc-load-balancing-3.myapp.com
nyc-app-1.myapp.com
nyc-app-2.myapp.com
nyc-app-3.myapp.com
nyc-api-1.myapp.com
nyc-api-2.myapp.com
nyc-api-3.myapp.com
**SFO**
sfo-load-balancing-1.myapp.com
sfo-load-balancing-2.myapp.com
sfo-load-balancing-3.myapp.com
sfo-app-1.myapp.com
sfo-app-2.myapp.com
sfo-app-3.myapp.com
sfo-api-1.myapp.com
sfo-api-2.myapp.com
sfo-api-3.myapp.com
**LON**
lon-load-balancing-1.myapp.com
lon-load-balancing-2.myapp.com
lon-load-balancing-3.myapp.com
lon-app-1.myapp.com
lon-app-2.myapp.com
lon-app-3.myapp.com
lon-api-1.myapp.com
lon-api-2.myapp.com
lon-api-3.myapp.com
And then if there's any strain on any given layer, in any given region, I can just spin up a new droplet to help out: nyc-app-4.myapp.com, lon-load-balancing-5.myapp.com, etc…
Current Working Methodology
A (minimum) trio of global-balancing servers receive all traffic.
These servers are "DNS Round-Robin" balanced as illustrated in this
(frankly confusing) article: How To Configure DNS Round-Robin Load
Balancing.
Using the Nginx GeoIP
Module and
MaxMind GeoIP Data
the origin of any given request is determined down to the
$geoip_city_continent_code.
The global-balancing layer then routes the request to the least
connected server on the load-balancing layer of the appropriate
cluster: nyc-load-balancing-1, sfo-load-balancing-3,
lon-load-balancing-2, etc.. This layer is also a (minimum) trio of
droplets.
The regional load-balancing layer then routes the request to the
least connected server in the app or api layer: nyc-app-2,
sfo-api-1, lon-api-3, etc…
The details of the Nginx kung fu can be found in this tutorial:
Villiage Idiot: Setting up Nginx with GSLB/Reverse Proxy on
AWS. More general info about Nginx load-balancing is available
here
and
here.
Questions
Where do I put the global-balancing servers?
It strikes me as odd that I would put them either all in one place, or spread that layer out around the globe either. Say, for instance, I put them all in NYC. Then someone from France hits my domain. The request would go from France, to NYC, and then be routed back to LON. Or if I put one of each in SFO, NYC, and LON then isn't it still possible that a user from Toronto (Parkdale, represent) could send a request that ends up going to LON only to be routed back to NYC?
Do subsequent requests get routed to the same IP?
As in, if a user from Toronto sends a request that the global-balancing layer determines should be going to NYC, does the next request from that origin go directly to NYC, or is it still luck of the draw that it will hit the nearest global-balancing server (NYC in this case).
What about sessions?
I've configured Nginx to use the ip_hash; directive so it will direct the user to the same app or api endpoint (a node process, in my case) but how will global balancing affect this, if at all?
Any DNS Examples?
I'm not exactly a DNS expert (I'm currently trying to figure out why my CNAME records aren't resolving) but I'm a quick study when provided with a solid example. Has anyone gone through this process before and can provide a sample of what the DNS records look like for a successful setup?
What about SSL/TLS?
Would I need a certificate for every server, or just for the three global-balancing servers since that's the only public-facing gateway?
If you read this whole thing then reward yourself with a cupcake. Thanks in advance for any help.
The Goal: Offer highly-available service to my users by routing all connections to the closest 'cluster' of servers in SFO, NYC, LON, and eventually Singapore.
The global-balancing layer then routes the request to theleast
connected server...
If I'm reading your configuration correctly, you're actually proxying from your global balancers to the balancers at each region. This does not meet your goal of routing users to the nearest region.
There are three ways that I know of to get what you're looking for:
30x Redirect Your global balancers receive the HTTP request and then redirect it to a server group in or near the region it thinks the request is coming from, based on IP address. This sounds like what you were trying to set up. This method has side effects for some applications, and also increases the time it takes for a user to get data since you're adding a ton of overhead. This only makes sense if the resources you're redirecting to are very large, and the local regional cluster will be able to serve much more efficiently.
Anycast (taking advantage of BGP routing) This is what the big players like Akamai use for their CDN. Basically, there are multiple servers out on the internet with the exact same routable IP address. Suppose I have servers in several regions, and they have the IP address of 192.0.2.1. If I'm in the US and try to connect to 192.0.2.1, and someone is in Europe that tries to connect to 192.0.2.1, it's likely that we'll be routed to the nearest server. This uses the internet's own routing to find the best path (based on network conditions) for the traffic. Unfortunately, you can't just use this method. You need your own AS number, and physical hardware. If you find a VPS provider that lets you have a chunk of their Anycast block, let me know!
Geo-DNS There are some DNS providers that provide a service often marketed as "Geo-DNS". They have a bunch of DNS servers hosted on anycast addresses which can route traffic to your nearest servers. If a client queries a European DNS server, it should return the address for your European region servers, vs. some in other regions. There are many variations on the Geo DNS services. Others simply maintain a geo-IP database and return the server for the region they think is closer, just like the redirect method but for DNS before the HTTP request is ever made. This is usually the good option, for price and ease of use.
Do subsequent requests get routed to the same IP?
Many load balancers have a "stickiness" option that says requests from the same network address should be routed to the same end server (provided that end server is still up and running).
What about sessions?
This is exactly why you would want that stickiness. When it comes to session data, you are going to have to find a way to keep all your servers up-to-date. Realistically, this isn't always guaranteed. How you handle it depends on your application. Can you keep a Redis instance or whatever out there for all your servers to reliably hit from around the world? Do you really need that session data in every region? Or can you have your main application servers dealing with session data in one location?
Any DNS Examples?
Post separate questions for these. Everyone's "successful setup" looks differently.
What about SSL/TLS?
If you're proxying data, only your global balancers need to handle HTTPS. If you're redirecting, then all the servers need to handle it.
A Working Solution
I've had a wild ride over the past few months figuring out the whole Global-HA setup. Tonnes of fun and I've finally settled with a rig that works very well, and is nothing like the one outlined in the above question.
I still plan on writing this up in tutorial form, but time is scarce as I head into the final sprint to get my app launched early next year, so here's a quick outline of the working rig I ended up with.
Overview
I ended up moving my entire deployment to AWS. I love Digital Ocean, but the frank reality is that AWS is light years ahead of them (and everyone, really) when it comes to the services offered under one roof. My monthly expenses went up slightly, but once I was done tweaking and streamlining I ended up with a solution that costs about $75/month per region for the most basic deployment (2 instances behind an ELB). And a new region can be spun up and deployed within about 30 minutes.
Global Balancing
I quickly found out (thanks to #Brad's answer above) that trying to spin up my own global balancing DNS layer is insane. It was a hell of a lot of fun figuring out how a layer like this works, but short of getting on a plane and scraping my knuckles installing millions of dollars worth of equipment around the world, it was not going to be possible to roll my own.
When I finally figured out what I was looking for, I found my new best friend: AWS Route 53. It offers a robust DNS network with about 50-odd nodes globally and the ability to do some really cool routing tricks like location-based routing, latency-based routing (which is kinda awesome), and AWS Alias records that 'automagically' route traffic to other AWS Services you'll be using (Like ELB for load balancing).
I ended up using latency-based routing that directs the global traffic to the closest regional Elastic Load Balancer, which has an Auto-Scaling Group attached to it in any given region.
I'll leave it up to you to do your homework on the other providers: www.f5.com, www.dyn.com, www.akamai.com, www.dnsmadeeasy.com. Depending on your needs, there may be a better solution for you, but this works very well for me.
Content Delivery Network
Route 53 integrates with AWS Cloudfront very nicely. I setup an S3 bucket that I'm using to store all the static media files that my users will upload, and I've configured a Cloudfront distribution to source from my media.myapp.com S3 bucket. There are other CDN providers, so do your shopping. But Cloudfront gets pretty good reviews and it's a snap to setup.
Load Balancing & SSL Termination
I'm currently using AWS Elastic Load Balancer to balance the load across my application instances, which live in an Auto-Scaling Group. The request is first received by ELB, at which point SSL is terminated and the request is passed through to an instance in the Auto-Scaling Group.
NOTE: One giant caveat for ELB is that, somewhat ironically, it doesn't handle massive spikes very well. It can take up to 15 minutes for an ELB to trigger a scale-up event for itself, creating 500/timeouts in the meantime. A steady, constant increase in traffic is supposedly handled quite well, but if you get hit with a spike it can fail you. If you know you're going to get hit, you can 'call ahead' and AWS will warm up your ELB for you, which is pretty ridiculous and anti-pattern to the essence of AWS, but I imaging they're either working on it, or ignoring it because it's not really that big of a problem. You can always spin up your own HAProxy or Nginx load-balancing layer if ELB doesn't work for you.
Auto-Scaling Group
Each region has an ASG which is programmed to scale when the load passes a certain metric:
IF CPU > 90% FOR 5 MINUTES: SCALEUP
IF CPU < 70% FOR 5 MINUTES: SCALEDN
I haven't yet put the ELB/ASG combo through its paces. That's a little way down my To-Do list, but I do know that there are many others using this setup and it doesn't seem to have any major performance issues.
The config for an Auto-Scaling Group is a little convoluted in my opinion. It's actually a three-step process:
Create an AMI configured to your liking.
Create a Launch Configuration that uses the AMI you've created.
Create an Auto-Scaling Group that uses the Launch Configuration you've created to determine what AMI and instance type to launch for any given SCALEUP event.
To handle config and app deployment when any instance launches, you use the "User Data" field to input a script that will run once any given instance launches. This is possibly the worst nomenclature in the history of time. How "User Data" describes a startup script only the author knows. Anyhow, that's where you stick the script that handles all your apt-gets, mkdirs, git clones, etc.
Instances & Internal Balancing
I've also added an additional 'internal balancing layer' using Nginx that allows me to 'flat-pack' all my Node.js apps (app.myapp.com, api.myapp.com, mobile.myapp.com, www.myapp.com, etc.myapp.com) on every instance. When an instance receives a request passed to it from ELB, Nginx handles routing the request to the correct Node.js port for any given application. Sort of like a poor-mans containerization. This has the added benefit that any time one of my apps needs to talk to the other (like when app. needs to send a request to api.) it's done via localhost:XXXX rather than having to go out across the AWS network, or the internet itself.
This setup also maximizes usage of my resources by eliminating any idle infrastructure if the app layer it hosts happens to be receiving light traffic. It also obviates the need to have and ELB/ASG combo for every app, saving more cash.
There's no gotchas or caveats that I've run into using this sort of setup, but there is one work-around that needs to be in place with regard to health-checking (see below).
There's also a nice benefit in that all instances have an IAM role which means that your AWS creds are 'baked in' to each instance upon birth and accessible via your ENV vars. And AWS 'automagically' rotates your creds for you. Very secure, very cool.
Health Checks
If you go the route of the above setup, flat-packing all your apps on one box and running an internal load-balancer, then you need to create a little utility to handle the ELB Health Checks. What I did was create an additional app called ping.myapp.com. And then I configured my ELB Health Checks to send any health checks to the port that my ping app is running on, like so:
Ping Protocol: HTTP
Ping Port: XXXX
Ping Path: /ping
This sends all health checks to my little ping helper, which in turn hits localhost:XXXX/ping on all the apps residing on the instance. If they all return a 200 response, my ping app then returns a 200 response to the ELB health check and the instances gets to live for another 30 seconds.
NOTE: Do not use Auto-Scaling Health Checks if you're using an ELB. Use the ELB health checks. It's kinda confusing, I thought they were the same thing, they're not. You have the option to enable one or the other. Go with ELB.
The Data Layer
One thing that is glaringly absent from my setup is the data layer. I use Compose.io as my managed data-layer provider and I deploy on AWS so I get very low latency between my app layers and my data layer. I've done some prelim investigation on how I would roll my data layer out globally and found that it's very complex — and very expensive — so I've kicked it down my list as a problem that doesn't yet need to be solved. Worst case is that I'll be running my data layer in US-East only and beefing up the hardware. This isn't the worst thing in the world since my API is strictly JSON data on the wire so the average response is relatively tiny. But I can see this becoming a bottleneck at very large, global scale — if I ever get there. If anyone has any input on this layer I'd love to hear what you have to say.
Ta-Da!
Global High Availability On A Beer Budget. Only took me 6 months to figure it out.
Love to hear any input or ideas from anyone that happens to read this.
You can use Anycast for your webservice for free if using Cloudflare free plan.
Digital Ocean now supports Load Balancing of servers itself. It is extremely easy to set up and works great! Saves you having to add in unnecessary components such as nginx (if you only want to use for load balancing).
We were having issues using SSL file uploads with nginx on a digital ocean server, however since the Digital Ocean update, we have removed nginx and now use Digital Ocean's load balancing feature and it works just as we need it to!

http charging high volume of http requests

Sorry for banal question, but i would like to know, while i launch many many http requests to a site, will site goes down or just i will be banned after n request limit (according to server settings)?
As is most always the case the answer is: it depends :)
Now, seriously, it depends on a few things but most likely if you're just hitting "a" server with a single client computer you most likely will not be able to make it go down as servers usually have more bandwidth (networking & processing powers) available to them than most client computers. Plus, when you're making requests to a website you might actually be making requests to a load-balanced network of servers which definitely means that a single client will not be able to outperform the site. If this is the case your IP will probably get blacklisted or maybe just ignored.
The other possibility is to make a lot of requests with a number of different clients at the same time - using different IPs. This is what is usually called a coordinated or distributed denial of service attack. In that case you might be able to make the web server(s) go down for a while.
This is just a simple answer but I hope you get the point.

Is more than 100 active connection to SQL server db not normal in an ASP.NET website?

While tracing the active connection on my db i found that some times the connections exceeds 100, is that normal?
and after few minutes it return back to 20 or 25 active connection
more details about my problem
Traffic on the site is around 200 visitor per day.
Why i am asking? because the default MaxPool in the asp.net connection string is 100
Also i am using Connection in the website IIS
That really depends on your site and your traffic. I've seen a site peek out at over 350 active connections to SQL during its peak time. That was for roughly 7,000 concurent web users, on two web servers, plus various backend processes.
Edit
Some additional information that we need to give you a better answer:
How many Web Processes hit your sql
server? For example are you using web
gardens? Do you have multiple servers
how many if you do? This is important because then you can calculate how many connections you can have by figuring out how many worker threads per process you have configured. Assume worse case, each thread is running which would add a connection to the pool.
Are you using connection pooling? If so your going to see the connections stick around after the user's request ends. By default its enabled.
How many concurent users do you have?
But, I think your going after this wrong, your having an issue with no free connections available in your pool. The first thing I'd look for is any leaked connections (connections being held open for longer then they should). For example passing a data reader up to the Web Page, could be a sign of this.
Next thing is to evaluate the default settings. Maybee you should run a web garden which should give you more connections, or increase the number of connections available.
The last thing I would do is try to opitmize queries like in your last question. Let's say you cut those queries in half, all you've done is bought yourself more time until more users come onto the system, and your right back here, only this time you might not be able to optimize that query yet again.
You're leaving out some details making it difficult to answer correctly but...
It depends, really. If you're not using connection pooling then each time a page is hit that requires access to the database a new connection is going to be opened. So sure, it could be perfectly normal.
I would also look into caching. Cache pages, cache query results, etc. You might be surprised how many times you go back to the database to get a list of US States...

Resources