Content delivery vs. virtual server - CDN

I've been using a VPS from InMotion Hosting for about 10 years; the cost is about $1,500 annually, which is rather expensive considering what I use it for. I have a small one-page website that isn't visited much; my primary use is to host JS, CSS, and image files that fantasy football users can embed on their sites. I'd say there are about 2,400 users of my files, and they make HTTPS requests to my server about 12,000 times per day for files that range from 50 KB to 2 MB.
What would be your recommendations?
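For a rough sense of scale, here is a minimal back-of-envelope sketch; the average object size and the per-GB price are assumptions for illustration, not quotes from any particular CDN:

```python
# Back-of-envelope estimate of monthly egress for the workload described above.
# The average object size and per-GB price are assumptions, not provider quotes.

REQUESTS_PER_DAY = 12_000
AVG_OBJECT_MB = 0.5          # assumed average somewhere between 50 KB and 2 MB
ASSUMED_PRICE_PER_GB = 0.09  # hypothetical CDN egress price in USD

monthly_gb = REQUESTS_PER_DAY * 30 * AVG_OBJECT_MB / 1024
monthly_cost = monthly_gb * ASSUMED_PRICE_PER_GB

print(f"~{monthly_gb:.0f} GB/month -> ~${monthly_cost:.2f}/month at the assumed rate")
# ~176 GB/month -> ~$15.82/month at the assumed rate
```

Plugging in your own numbers gives a quick way to compare against the roughly $125/month the VPS currently costs.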

Related

What is considered data transfer in Firebase Hosting?

What is considered data transfer in Firebase Hosting, according to the docs:
Data transfer (GB/month) — The amount of data transferred to end users from our CDN. Every Hosting site is automatically backed by our global CDN at no charge
So when a user goes to my domain and Firebase sends them my website, which let's say for the sake of argument is 100 MB, will each 10 requests cost me 0.15 USD according to the Firebase pricing?
Data transfer $0.15/GB
My current concern is that my React build folder is 3 MB in size since it has too many PNGs in it … so will I pay for the transfer of this build folder to the end client each time a client calls this site?
Let's say for the sake of argument it is 100 MB, so each 10 requests will cost me 0.15 USD
Yes, you are charged for the data downloaded from the CDN/server.
My current concern is that my React build folder is 3 MB in size since it has too many PNGs in it
It's best to optimise and compress your images to reduce costs and also improve loading speed.
so will I pay for the transfer of this build folder to the end client each time a client calls this site
Some static assets should get cached locally, so the next time the user loads the site it may load them from cache instead of from the server. So it won't always be 3 MB.
You can get a rough estimate of the data being loaded in the Network tab of the browser's developer tools.
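To make the arithmetic concrete, here is a minimal sketch of the cost calculation; the $0.15/GB rate comes from the question, and the download counts are just examples:

```python
# Worked example of the Firebase Hosting data-transfer arithmetic discussed above.
# The $0.15/GB rate is taken from the question; the download counts are examples.

PRICE_PER_GB = 0.15

def transfer_cost(payload_mb: float, downloads: int) -> float:
    """Cost of serving `payload_mb` megabytes `downloads` times, in USD."""
    return payload_mb * downloads / 1024 * PRICE_PER_GB

# 100 MB site, 10 full downloads -> roughly 1 GB -> about $0.15
print(f"${transfer_cost(100, 10):.2f}")    # $0.15

# 3 MB React build, 1,000 uncached first visits -> ~2.9 GB -> about $0.44
print(f"${transfer_cost(3, 1_000):.2f}")   # $0.44
```

Repeat visitors who hit the browser cache pull much less than the full 3 MB, so the second number is closer to a worst case.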

BigBlueButton with Scalelite Load Balancer

I am using twenty 4-core/8 GB servers ($40, shared CPU) for 15K BigBlueButton users. But we are facing lots of problems every day, like the HTML5 client hanging on the three dots, audio not working, and video not working. Does anyone have any idea about setting up BigBlueButton on DigitalOcean? Should I go for 8-10 big servers, like 8/16-core CPU with 16/32 GB RAM?
How many users are online simultaneously when you face these problems? Are all 15K users joining a single online session, or are they divided across multiple online sessions conducted on different schedules? How many users are using webcams?
On BBB you would want to keep the maximum number of users in a live session to around 100. To take the example of a school, maybe you can have 30-40 users in each grade (and each grade could have multiple sections).
You can leverage Scalelite to balance the users over the BBB servers dynamically.
Full documentation can be found here
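As a rough planning aid, here is a minimal sizing sketch. The peak-concurrency share and the per-server capacity are pure assumptions for illustration; they are not BBB benchmarks and should be replaced with your own measurements:

```python
# Back-of-envelope sizing for spreading BBB sessions across servers behind
# Scalelite. The concurrency share and per-server capacity are assumptions
# for illustration only; measure your own servers before relying on them.
import math

TOTAL_USERS = 15_000
MAX_USERS_PER_SESSION = 100      # keep live sessions to roughly this size
ASSUMED_PEAK_SHARE = 0.2         # assume ~20% of users are online at peak
ASSUMED_USERS_PER_SERVER = 150   # hypothetical comfortable load per BBB server

peak_users = int(TOTAL_USERS * ASSUMED_PEAK_SHARE)
sessions = math.ceil(peak_users / MAX_USERS_PER_SESSION)
servers = math.ceil(peak_users / ASSUMED_USERS_PER_SERVER)

print(f"peak users: {peak_users}, sessions: {sessions}, servers: {servers}")
# peak users: 3000, sessions: 30, servers: 20
```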

SEO crawler DDoSing sites

I have a customer that runs 36 websites (many thousands of pages) on a round-robin, sticky-affinity, load-balanced set of IIS servers - the infrastructure is entirely AWS-based (r3.2xl - 8 vCPU, 60.5 GiB RAM).
To get straight to the point, the site is configured to 'cache on access' using standard in-memory caching with ASP.NET 4.6, with static assets served through CloudFront. On a 'cold start' the site makes both SQL Server queries for content and separate Elasticsearch queries at runtime to determine hreflang alternate-language tags - this basically queries which versions of the URL are available in different languages for SEO reasons. This query has been optimised from a cross-index wildcard query down to a lookup on a single index. As mentioned, the entire result is cached for 24h once all this has executed.
Under normal use conditions the site works perfectly. As there are 36 sites running on a single box, the private set space gets allocated up to the max (99%) of physical RAM over time, as more and more content gets cached in memory. I can end up with App Pools in excess of 1.5 GiB, which isn't ideal. After this point, presumably the .NET LRU cache eviction algorithm is working overtime.
The problem I have, after some post-mortem review of the IIS logs, is that the customer is using an SEO bot tool, SEMrush, which essentially triggers a denial-of-service attack against the sites (thundering herd?) because of simultaneous requests for the 'long tail' of pages which are never viewed by a user and hence aren't stored in the cache.
The net result is a server brought to its knees, App Pool CPU usage all over the place, an Elasticsearch queue length > 1000, huge ES heap growth, a rising rejection rate - and eventually a crash.
The solutions I've thought about but haven't implemented:
CloudFront all the sites and use a warm-up script (although I don't think this will actually help, as it's a cold-start problem when all the pages expire - unless I could have a most-recently-used cache invalidation mechanism which invalidated pages based on number of requests, say > 100, and left everything else persistent)
AWS Shield/WAF to provide some sort of rate limiting (a minimal sketch of this idea is below)
Remove the runtime ES lookup altogether and move to an eventually-consistent model which computes the hreflang lookup table elsewhere in a separate process. However, the ES setup, whilst on v1.3.1, which is old, is a 3-node cluster with a lot of CPU power and each node set to a 16 GiB min/max heap, so it should be able to take that level of throughput?
Or all 3!
Has anyone come across this problem before, and what was your solution? It must be fairly common, especially for large sites which are hammered by SEO/DQM web crawlers.
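For the rate-limiting option above, here is a minimal sketch of the idea: a per-client token bucket, the same shape of control an AWS WAF rate-based rule or custom middleware would apply. The thresholds and the `should_serve` helper are illustrative, not a production configuration:

```python
# Minimal per-client token-bucket throttle, sketching the rate-limiting idea
# from option 2 above. Thresholds and names are illustrative only.
import time
from collections import defaultdict

RATE_PER_SEC = 5   # sustained requests allowed per second per client
BURST = 20         # short burst allowance before throttling kicks in

class TokenBucket:
    def __init__(self) -> None:
        self.tokens = float(BURST)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(BURST, self.tokens + (now - self.last) * RATE_PER_SEC)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False   # caller should respond with 429 plus a Retry-After header

buckets: dict[str, TokenBucket] = defaultdict(TokenBucket)

def should_serve(client_key: str) -> bool:
    """Key by client IP or User-Agent (e.g. an SEO crawler) and throttle the rest."""
    return buckets[client_key].allow()
```

Well-behaved crawlers typically back off when they see 429s, so the long tail still gets crawled, just not all at once.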

Limitations of the free plan in Firebase

What is the meaning of the following?
50 Max Connections, 5 GB Data Transfer, 100 MB Data Storage.
Can anyone explain this to me? Thanks.
EDIT - Generous limits for hobbyists
Firebase has now updated the free plan limits
Now you have
100 max connections
10 GB data transfer
1 GB storage
That means you can have only 50 active users at once, transfer only 5 GB of data within one month, and store only 100 MB of your data.
E.g. you have an online web store: only 50 users can be there at once, only 100 MB of data (title, price, image of each item) can be stored in the DB, and the 5 GB of transfer means that your website will only be able to deliver 5 GB of data to users (i.e. if your page is 1 MB in size, users will be able to load that page only about 5,000 times).
UPD: to verify the size of a certain page (to determine whether 5 GB is enough for you), in Google Chrome right-click anywhere on the page, choose "Inspect Element", and switch to the "Network" tab. Then refresh the page. The bottom status bar shows the amount of transferred data (the current Stack Overflow page, for example, is about 25 KB).
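A minimal sketch of that quota arithmetic (the page size is just an example value):

```python
# How far the free-tier transfer quota stretches, following the arithmetic
# in the answer above. The page size is an example value.

TRANSFER_QUOTA_GB = 5
PAGE_SIZE_MB = 1.0   # example page weight

page_loads = TRANSFER_QUOTA_GB * 1024 / PAGE_SIZE_MB
print(f"~{page_loads:,.0f} page loads fit in {TRANSFER_QUOTA_GB} GB per month")
# ~5,120 page loads fit in 5 GB per month
```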
From the same page where the question was copied/pasted:
What is a concurrent connection?
A connection is a measure of the number of users that are using your app or site simultaneously. It's any open network connection to our servers. This isn't the same as the total number of visitors to your site or the total number of users of your app. In our experience, 1 concurrent corresponds to roughly 1,400 monthly visits.
Our Hacker Plan has a hard limit on the number of connections. All of the paid Firebases, however, are “burstable”, which means there are no hard caps on usage. REST API requests don't count towards your connection limits.
Data transfer refers to the number of bytes sent back and forth between the client and server. This includes all data sent via listeners--e.g. on('child_added'...)--and read/write ops. This does not include hosted assets like CSS, HTML, and JavaScript files uploaded with firebase deploy.
Data storage refers to the amount of persistent data that can live in the database. This also does not include hosted assets like CSS, HTML, and JavaScript files uploaded with firebase deploy.
These limits, mentioned and discussed in the answers, are per project.
The number of free projects is not documented. Since this is an abuse vector, the number of free projects is based on some super secret sauce--i.e. your reputation with Cloud. Somewhere in the range of 5-10 seems to be the norm.
Note also that deleted projects take around a week to be deleted and they continue to count against your quota for that time frame.

Is WordPress suitable for a site which has 317k pageviews per week?

I had a meeting with a local newspaper company's owner. They are planning to have a newly designed website. Their current website is static and doesn't use any kind of database, but their weekly pageview figure is around 317k. This figure will surely increase in the future.
The question is: if I create a WordPress system for them, will the website run smoothly with new functionality (news, maybe galleries)? It is not necessary to use lots of plugins. Can their current server support the WordPress package without any upgrade?
Or should I think about using plain PHP to design the website?
Yes - so long as the machinery for it is adequate, and you configure it properly.
If the company uses a CDN (like Akamai), ask them if this thing can piggyback on their account, then make them do it anyway when they throw up a political barrier. Then stop sweating it, turn keepalive on, and ignore anything below this line. Otherwise:
If this is on a VPS, make sure it has guaranteed memory and I/O resources - otherwise host it on a hardware machine. If you're paranoid, something with a 10k RPM drive and 2-3 gigs of ram will do (memory for apache and mysql to have breathing room and hard drive for unexpected swap file compensation.)
Make sure the 317k/w figure is accurate (a rough conversion to requests per second is sketched after this list):
If it comes from GA/Omniture/another vendor tracking suite, increase the figures by about 33-50% to account for robots that they can't track.
If the number comes from house stats/httpd logs, assume it's 10-20% less (since robots don't typically hit you up for stylesheets and images.)
If it comes from combined reports by an analyst whose job it is to report on their own traffic performance, scratch your head and flip a coin.
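For perspective, here is a minimal sketch converting the weekly figure into request rates; the peak multiplier is an assumption, not a measurement:

```python
# Convert the weekly pageview figure into average and assumed-peak rates.
# The burst multiplier is an assumption covering the lunchtime/wind-down spikes.

WEEKLY_PAGEVIEWS = 317_000
ASSUMED_PEAK_MULTIPLIER = 5   # assumed burst factor over the flat average

avg_per_sec = WEEKLY_PAGEVIEWS / (7 * 24 * 3600)
peak_per_sec = avg_per_sec * ASSUMED_PEAK_MULTIPLIER

print(f"average: {avg_per_sec:.2f} pageviews/sec, assumed peak: {peak_per_sec:.1f}/sec")
# average: 0.52 pageviews/sec, assumed peak: 2.6/sec
```

Even allowing generously for assets and robots, a properly cached WordPress install handles single-digit pageviews per second comfortably.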
Apache: News sites in America have lunchtime and workday wind-down traffic bursts at around 11 am and 4 pm, so you may want to turn KeepAlive off (having it on will improve things during slow traffic periods, but during burst times the machine can spin into an unrecoverable state).
PHP: Make sure some kind of opcode caching is enabled on the hosting machine (either APC or eAccelerator). With opcode caching, the memory footprint drops significantly and the machine doesn't have to borrow as much from the swap file on the hard drive.
WP: Make sure you use WP 3.4, as ticket http://core.trac.wordpress.org/ticket/10964 was closed in favor of this ticket's fix: http://core.trac.wordpress.org/ticket/18536. Both longstanding issues address query performance on large-volume sites, but the overall improvements/fixes help everywhere else too.
Secondly, make sure to use something like the WP Super Cache caching plugin and configure it appropriately. If the volume of content on this site is going to stay permanently small, you shouldn't have to take any special precautions - otherwise you may want to alter the plugin/rules so as to permanently archive older content into static files. There is no reason why two-year-old content should be constantly re-spidered at full resource cost.
Robots.txt: prepare and properly register a dynamic sitemap with Google/Bing/etc. If you expect posts to be unnecessarily peppered with a bunch of tags and categories by people who don't understand what they actually do, you may want to Disallow /page/*, /category/* and /tag/*. Otherwise, when spider robots swarm the site, every post will be hit an extra number of times proportional to how many tags/categories it has. And then some.
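A hypothetical robots.txt along those lines might look like the following; the hostname is a placeholder, the paths should be adjusted to the site's actual permalink structure, and note that plain Disallow rules are prefix matches:

```
# Hypothetical example - adjust paths to the real permalink structure
User-agent: *
Disallow: /page/
Disallow: /category/
Disallow: /tag/

Sitemap: https://www.example.com/sitemap.xml
```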
For several years The Baltimore Sun hosted their reader reward, sports, and editorial database projects directly off a single colocated machine. The combined traffic volume was several times larger than what you mention, but adequately met.
Here's a video of httpd status w/keepalive on during a slow hour, at about 30 req./sec: http://www.youtube.com/watch?v=NAHz4GRY0WM#t=09
I would not exclude WordPress for this project based only on a weekly pageview count of under a million. I have hosted WordPress sites that receive much, much more traffic and were still very functional. Whether or not WordPress is the best solution for this type of project, though, based on the other criteria you have, is completely up to you.
Best of luck and happy coding!
WP is capable of handling huge traffic. See this list of companies who are using WP VIP services:
Time, Dow Jones, NBC Sports, CNN, and many more.
Visit the WordPress VIP site: http://vip.wordpress.com/clients/
