Are there performance benefits to offloading WordPress media folder and database to GCP? - wordpress

I have been looking into offloading a WordPress database to Google Cloud SQL and the media folder to Google Cloud Storage.
What are the performance benefits of doing so? At what point is it worth it?

For this answer I assume your WordPress site is running on a GCE instance.
Moving the database and static resources to Cloud SQL and GCS is likely to slightly increase the amount of traffic a single instance can handle before becoming overloaded. Moving the database to Cloud SQL is likely to slightly slow down individual requests, since every database hit now takes a network round trip.
Where Cloud SQL and GCS really help is scalability and, potentially, reliability:
Scalability improves because, with static resources and data moved to shared services, you no longer need to keep state on the GCE instance itself. That means you can add new GCE instances serving WordPress from the same database behind a load balancer and handle far more traffic than a single instance could.
If you add multiple instances, you also gain reliability because you no longer have a single point of failure: if one GCE instance goes down, the others handle the traffic and the load balancer keeps the site up. With HA configured on Cloud SQL you gain database reliability as well, and GCS has plenty of redundancy built in. More importantly, you can spread your instances across zones and avoid the impact of single-zone issues in GCP.
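For the database part, the change on the WordPress side is small: point wp-config.php at the Cloud SQL instance instead of a local MySQL server. A minimal sketch, assuming the Cloud SQL Auth Proxy runs on the GCE instance; the credentials shown are placeholders:

```php
<?php
// wp-config.php (excerpt) -- database settings pointing at Cloud SQL.
// Placeholder values; in practice the Cloud SQL Auth Proxy listens on
// 127.0.0.1:3306 on the GCE instance, or you connect over private IP.
define( 'DB_NAME', 'wordpress' );
define( 'DB_USER', 'wp_user' );          // hypothetical user
define( 'DB_PASSWORD', 'change-me' );    // keep real secrets out of version control
define( 'DB_HOST', '127.0.0.1:3306' );   // Cloud SQL Auth Proxy endpoint,
                                         // or the Cloud SQL instance's private IP

// Offloading the media folder to GCS is normally handled by a plugin
// rather than hand-written code, so nothing else changes in this file.
```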

Related

Which one is the cheapest to use, AWS RDS or my own database?

I have a WordPress application running on EC2 on AWS. I haven't decided between Amazon RDS and my own database on different hosting. Which one is the cheapest to use? Say I have my own MySQL database at Lunarpages or Bluehost hosting, and I let my WordPress site on the EC2 instance connect remotely to that database instead of to Amazon RDS. I've heard people say Amazon RDS is very expensive, so I thought that, to save costs, I could connect WordPress to my own database rather than to RDS. I don't know whether that's true, or how well it performs. Which one is the best option? Any suggestions appreciated. Thank you.
I don't agree with that. In AWS, the first thing you do is set up a VPC (Virtual Private Cloud) and create the corresponding network interfaces. My experience working with heavy CMSs is that the architecture is much more stable with EC2 + RDS, each on its own instance. In addition, RDS has automated maintenance, and it is much less likely to fail or crash than a MySQL (or similar) server running on the same virtual machine as the application.
Also, in terms of speed and performance, with this scheme (for example with WordPress) the system flies; the speed is much higher and noticeable even with small machines.
Running the database at a different hosting provider will add extra latency to every query.
Let's do the math on AWS RDS for the smallest instances (taking the eu-west-1 region as an example):
Running on RDS: a db.t2.micro is $0.018 per hour, or $12.96 per month. Free for the first year under the AWS free tier.
Running on EC2: a t2.micro (you configure MySQL, backups, and so on yourself) is $0.0126 per hour, or $9.07 per month. Also free for the first year under the AWS free tier.
If your application is small enough, you could host both your database and your application on the same machine (the second option).
Performance-wise, it is not good to have the database on a completely different network from the site itself: every query pays the extra latency, and a page that makes many calls multiplies that delay.
You can host the database locally on the EC2 instance itself; this would be the best choice.
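Whichever database you pick, the switch on the WordPress side is a single setting in wp-config.php. A minimal sketch, where the RDS and external hostnames are placeholders:

```php
<?php
// wp-config.php (excerpt) -- DB_HOST decides where every query goes.

// Option 1: MySQL running on the same EC2 instance (lowest latency).
define( 'DB_HOST', 'localhost' );

// Option 2: an RDS instance in the same VPC (placeholder endpoint).
// define( 'DB_HOST', 'mydb.xxxxxxxx.eu-west-1.rds.amazonaws.com' );

// Option 3: a database at an external host such as Lunarpages/Bluehost.
// Every query now crosses the public internet, which adds latency.
// define( 'DB_HOST', 'mysql.example-host.com' );
```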

High performance server for 1x1 pixels (500M GET requests per day)

I need to set up a tracking server that will only serve 1x1 pixels and log all requests.
I initially thought of using Amazon's S3 or CloudFront but their costs are prohibitively high for me. I need to serve 500M pixels a day, and S3 charges $0.4 per 1M GET requests, so even without the data transfer costs I'm at $6,000/month.
I am considering setting up nginx or lighttpd on an EC2 instance. What performance should I expect with those two (e.g. per one large EC2 instance)? Are there better free products for this task?
Nginx is indeed a good candidate for this and already has built-in support for empty GIFs (see http://wiki.nginx.org/HttpEmptyGifModule).
Disk I/O will probably be the biggest issue for this server because of the access logging: 500M requests per day averages roughly 5,800 requests per second, and peaks will be higher. The only way to figure out the performance of the different EC2 instance types is to test them.
If one EC2 instance does not offer the performance you need, or if you need any redundancy for this service, you should also look into using a load balancer (either an AWS Elastic Load Balancer or your own custom one).
You could also set up multiple smaller servers in different geographical regions and use DNS latency-based routing to route requests to them (either AWS Route 53 latency-based routing or another DNS solution). This would significantly reduce the connection time to your server and would distribute the load across several data centers.

Shared data for Amazon EC2 instances

To handle high traffic, I'm planning to scale out and run my web application (WordPress-based) on several EC2 instances (I'm very new to AWS). The instances need to work on the same data (images, videos...).
I am thinking about using S3 as the storage for this shared data.
My questions are:
If I use S3, do I need to write extra code in my application to upload data to and fetch data from S3? Or is there a magic way to mount S3 on the EC2 instances so that they can access it like local storage?
I've heard that S3 is a bit slow since it is accessed through web service calls (for example, when users upload files it takes time to push them on to S3). So is there a better way to store shared data?
I've read some documents about the scalability of Amazon EC2, but none of them mention how to handle shared data. Any help is highly appreciated. Thanks.
There is no native facility to 'mount' an S3 bucket as storage on an EC2 instance, although there are several third-party tools that offer mechanisms to make S3 storage available as a virtual drive or repository. Most of them offer a preset amount of free storage and then a tiered charging mechanism for larger amounts; Google for 'S3 storage interface' and take a look.
Whether you write code to use S3 through the API or use an interface layer, there will always be some latency between your app and the storage. That's a fact of physics and there's nothing you can do to eliminate the delay, because the S3 repository is not local to the EC2 cluster - so you will never achieve 'local' storage access speeds.
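To give a feel for the "write code to use S3 through the API" option: with the AWS SDK for PHP, uploading and fetching an object takes only a few lines. This is a minimal sketch; the bucket name, key, and file paths are placeholders, and in a WordPress setup you would more often use a media-offload plugin than hand-rolled calls:

```php
<?php
// Minimal sketch using the AWS SDK for PHP (v3); bucket/key/paths are placeholders.
require 'vendor/autoload.php';

use Aws\S3\S3Client;

$s3 = new S3Client([
    'region'  => 'us-east-1',
    'version' => 'latest',
]);

// Upload a local media file to the shared bucket.
$s3->putObject([
    'Bucket'     => 'my-shared-media-bucket',     // placeholder
    'Key'        => 'uploads/photo.jpg',
    'SourceFile' => '/var/www/uploads/photo.jpg',
]);

// Fetch it back. Each call is an HTTP round trip, which is the latency
// discussed above.
$result = $s3->getObject([
    'Bucket' => 'my-shared-media-bucket',
    'Key'    => 'uploads/photo.jpg',
]);
file_put_contents('/tmp/photo.jpg', (string) $result['Body']);
```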
An alternative might be to use EBS which is local to your EC2 instance - it has different properties to S3 (for example, it does not offer edge locations for regionally-localised access) but is much faster for application use because it is inside the EC2 cluster and mounted as local storage.
You could mount the S3 bucket onto all your EC2 instances. Since every instance reads and writes the same bucket, they all see the same files. You could use s3fs to do the mounting.
I used this guide and set up mine pretty quick: Mount S3 onto EC2
If you are then concerned about speed, you can use Amazon ElastiCache or even use EBS as a cache drive.
For starters, your question lacks some details about your application architecture, but there are some possibilities.
First, if your project is medium-sized, you could run GlusterFS on your main nodes as servers and clients at the same time (using the native or NFS protocol), with an RDS Multi-AZ MySQL instance for the database and CloudFront as a CDN via the CDN Linker or W3 Total Cache plugins. Also, put an ELB in front.
In this particular case I would recommend at least a couple of c3.large instances.
Second, as your project grows, you should build an AMI of the instance and create an auto-scaling group whose instances simply connect to your main storage and compute instances. (Also consider moving compute load off these rather small instances.)
Additional things to consider:
A good article about WordPress clusters on AWS is http://harish11g.blogspot.ru/2012/01/scaling-wordpress-aws-amazon-ec2-high.html
An alternative to the GlusterFS solution could be the Ceph File System.
You could (or maybe should) also use an SSD cache (for example flashcache) for the GlusterFS volume.

Planning server infrastructure when hosting duplicated web-product over multiple servers

We have a web-application product that we sell to companies that is hosted at our servers.
The product contains couple of web applications, windows services and SQL server db.
Right now we have only one client using our product. We have two servers: one for the web apps and services, and the other for the DB.
In order to add the product for another client, we have to 'duplicate' all the apps and the DB and run them separately.
As we start expanding, and some companies will require more server power than others, I need to plan the server infrastructure.
Having two servers for each client sounds ridiculous. Hosting costs will be huge. What will happen when I have 10 clients? And probably some servers will need more power than others, leaving some servers at 30% of their capacity while others are at 70%.
One thing I really care about is keeping each client's DB separate, so that if a server is compromised only one DB is at risk.
So... I thought about Virtual Machines...
Does that sound right?
Do I need two super servers to hold virtual machine instances? (one for web and other for db?)
What about Load balancing / etc..?
Will it require more maintenance time just because I use virtual machines?
Are there any hardware recommendations?
Any help will be appreciated
Many thanks
Virtual machines are definitely the safest way to separate clients and will give you the flexibility to allocate a specific percentage of resources to specific clients.
However, using separate processes on the same physical machine will perform better (but not always significantly) and will allow more dynamic use of resources (i.e., if one spikes, it will use the resources it needs). This setup will not allow you to control the resource allocation nearly as easily though. You'll also have to build your own monitoring tools to see and analyze what processes (clients) are using what resources (piggyback on perfmon).
Using separate processes also is dangerous if your application wasn't designed for this. Anywhere the application caches data on the file system or accesses anything besides memory and the database needs to be thoroughly scrubbed to make sure data from clients is not co-mingled or shared.
Separate virtual machines are more work to manage: each one is pretty much its own computer, so you have to manage all the VMs plus the physical machine.
You may also want to consider hosting in a more dynamic environment like Amazon AWS or Microsoft Azure, which will let you scale up/down as necessary more easily than a VM at a traditional host.

web service that can withstand 1000 concurrent users with responses in 25 milliseconds

Our client's requirement is to develop a WCF service that can withstand 1-2k concurrent website users, with responses in around 25 milliseconds.
The service reads a couple of columns from a database and will be consumed by different vendors.
Can you suggest an architecture, or any extra steps I should take while developing? And how do we work out the server hardware configuration needed to cope with the load?
Thanks in advance.
Hardly possible. You need a network connection to the service, service activation, business-logic processing, a database connection (another network hop), and the database query itself. Because of the 2,000 concurrent users you need several application servers, which puts a load balancer in the network path as well. I can't imagine a network and hardware infrastructure able to complete all of that within 25 ms for 2,000 concurrent users. Such a requirement is not realistic.
I suspect that if you simply run the database query from your machine against the remote DB, you will see that even that simple step does not complete in 25 ms.
A few principles:
Test early, test often.
Successful systems get more traffic
Reliability is usually important
Caching is often a key to performance
To elaborate: build a simple version of the system right now. Even if the business logic is very simplified, as long as it has the web service and the database access you can performance-test it. Test with one user. What do you see? Where does the time go? As you develop the system and add real code, keep running that test. Reasons: a) you know right now whether 25 ms is even achievable; b) you spot any code change that hurts performance immediately. Then test with lots of users: what degradation patterns do you hit? This starts to give you an indication of your platform's capabilities.
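One way to make "test early, test often" concrete is a crude single-user latency probe you run after every significant change. A minimal PHP sketch; the endpoint URL is a hypothetical placeholder for the service under test:

```php
<?php
// Rough single-user latency probe; run it after each significant code change.
$url = 'http://localhost:8080/vendor-data?id=42';   // hypothetical endpoint
$samples = [];

for ($i = 0; $i < 200; $i++) {
    $start = microtime(true);
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_exec($ch);
    curl_close($ch);
    $samples[] = (microtime(true) - $start) * 1000;  // milliseconds
}

sort($samples);
printf("median: %.1f ms, p95: %.1f ms, max: %.1f ms\n",
    $samples[(int) (count($samples) * 0.50)],
    $samples[(int) (count($samples) * 0.95)],
    end($samples));
```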
I suspect the outcome will be that a single machine won't cut it for you. And even if it does, success brings more traffic, so plan to use more than one server.
You need more than one server for reliability reasons anyway. And all sorts of interesting implementation details fall out when you can't assume a single server, e.g. you don't have singletons any more ;-)
Most of the time we get good performance by using a cache. Will many users ask for the same data? Can you cache it? Are there updates to consider, in which case do you need a distributed cache with clustered invalidation? That multi-server case emerges again.
Why do you need WCF?
Could you shift as much of that service as possible into static serving and cache lookups?
If I understand your question, thousands of users will be hitting your website and executing queries against your DB. You should definitely look into connection pooling for your WCF connections, but your best bet will be to avoid DB lookups altogether and have your website return data from cache hits.
I'd also look into why you couldn't just connect directly to the database for your lookups; do you actually need a WCF service in the way at all?
Look into Memcached.
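To make the caching advice concrete, here is a minimal cache-aside sketch. The service in the question is WCF/.NET, but the pattern is the same in any stack; this version uses the PHP Memcached extension, and the key name, TTL, table, and columns are hypothetical:

```php
<?php
// Minimal cache-aside sketch using the PHP Memcached extension.
// A .NET memcached client in a WCF service would follow the same steps.
$cache = new Memcached();
$cache->addServer('127.0.0.1', 11211);

function get_vendor_row(Memcached $cache, PDO $db, int $id): array
{
    $key = "vendor_row:$id";   // hypothetical key scheme

    // 1. Try the cache first; a hit avoids the DB round trip entirely.
    $row = $cache->get($key);
    if ($cache->getResultCode() === Memcached::RES_SUCCESS) {
        return $row;
    }

    // 2. Cache miss: read the couple of columns from the database.
    $stmt = $db->prepare('SELECT col_a, col_b FROM vendor_data WHERE id = ?');
    $stmt->execute([$id]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC) ?: [];

    // 3. Store it with a short TTL so updates eventually show up.
    $cache->set($key, $row, 60);

    return $row;
}
```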
