how many volumes can ceph support? - volume

You can use ceph-volume lvm create --filestore --data example_vg/data_lv --journal example_vg/journal_lv to create a Ceph volume, but I want to know how many volumes Ceph can support. Can it be infinite?

Ceph can serve an infinite number of volumes to clients, but that is not really your question.
ceph-volume is used to prepare a disk to be consumed by Ceph for serving capacity to users. The prepared volume will be served by an OSD and join the RADOS cluster, adding its capacity to the cluster’s.
If your question is how many disks you can attach to a single cluster today, the sensible answer is “a few thousand”. You can push farther using a few tricks. Scale increases over time, but I would say 2,500-5,000 OSDs is a reasonable limit today.

Related

I want to scale mariadb database for huge number of query requests

I use Moodle on CentOS 7 with PHP, MariaDB, and Nginx. A huge number of users use this Moodle. If the number of users grows beyond 300 users per second, Moodle responds with a delay and appears to hang!
I read about:
Galera (multi master clustering with 3nodes)
slave-master (separate read and write)
MaxScale
increase ram and cpu (I have up to: 288GB ram, 24coreCPU, SSD drive)
What is the best practice for serving a huge number of requests without delay? How can I scale my database (because it is the bottleneck)? I want to scale it to serve a huge number of requests (most of them are reads from the database).
MariaDB (and MySQL) can scale 'infinitely' for reads by using Replication and sending read requests to Slave servers.
500 connections per second is very high. (But I don't know what the practical limit is.)
There are several extra tools that can do "connection pooling". Search for this; it may let you go well past 500 logical connections on a single server.
In the case of Galera, you could have 3 read-write nodes, plus any number of Slaves hanging off each of the 3.
For simple Master-Slave, there can be any number of Slaves hanging off the one Master.
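The read scaling described above depends on something deciding, per statement, whether to use the Master or a Slave. The following is a minimal Python sketch of that decision, not a replacement for a real proxy or connection pooler. The host names and the mdl_user table are illustrative assumptions, and the class name ReadWriteRouter is hypothetical.

```python
import itertools


class ReadWriteRouter:
    """Pick a server per statement: writes go to the primary, reads rotate over replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary when no replicas are configured.
        self._replicas = itertools.cycle(replicas or [primary])

    def route(self, sql):
        # Treat only statements starting with SELECT as reads; everything else is a write.
        words = sql.lstrip().split(None, 1)
        is_read = bool(words) and words[0].upper() == "SELECT"
        return next(self._replicas) if is_read else self.primary


# Hypothetical hosts: one Master and two Slaves.
router = ReadWriteRouter("primary:3306", ["replica1:3306", "replica2:3306"])
targets = [
    router.route(q)
    for q in (
        "SELECT * FROM mdl_user",
        "UPDATE mdl_user SET city = 'x'",
        "select 1",
    )
]
print(targets)
```

This sketch ignores transactions, SELECT ... FOR UPDATE, and replication lag, all of which a production router has to handle.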
Obviously you can do generic MySQL/MariaDB tuning first, and use a recent version of Moodle (3.7 is current right now).
After that, one thing you can check is how you have sessions implemented.
https://docs.moodle.org/37/en/Session_handling
This page also has many more tips:
https://docs.moodle.org/37/en/Performance_recommendations

AWS EC2 issue slow instance VolumeQueueLength

I am experiencing an issue with my EC2 instance. I am scraping different websites using R and it works fine, but after some hours my EC2 instance freezes.
After I raised a ticket with AWS support, they noticed that this was caused by a rise in "VolumeQueueLength", which then drained the BurstBalance credits from 100 to 0.
See below when I tried around June 19th:
Would you know what is causing this VolumeQueueLength to go up?
Thanks a ton!
From I/O Characteristics and Monitoring - Amazon Elastic Compute Cloud:
If your I/O latency is higher than you require, check VolumeQueueLength to make sure your application is not trying to drive more IOPS than you have provisioned. If your application requires a greater number of IOPS than your volume can provide, you should consider using a larger gp2 volume with a higher base performance level or an io1 volume with more provisioned IOPS to achieve faster latencies.
For more information about Amazon EBS I/O characteristics, see the Amazon EBS: Designing for Performance re:Invent presentation on this topic.
This is basically saying that the IO allocated to an Amazon EBS 'General Purpose' volume is proportional to its size, so a larger volume might solve your IO problems. Alternatively, you could consider moving to a Provisioned IOPS volume (which is faster, but more expensive).
Your application seems to be using more IO than has been allocated for the volume.
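To make the size-to-IO relationship concrete, here is a rough Python sketch of the gp2 baseline and burst arithmetic. The constants (3 IOPS per GiB, a 100 IOPS floor, a 16,000 IOPS ceiling, a 3,000 IOPS burst limit, and a 5.4 million credit bucket) are the gp2 figures as I understand them from the AWS documentation. Treat them as assumptions and check the current EBS docs before relying on them.

```python
# Assumed gp2 figures; verify against the current AWS EBS documentation.
IOPS_PER_GIB = 3
MIN_BASELINE_IOPS = 100
MAX_BASELINE_IOPS = 16_000
BURST_IOPS = 3_000
CREDIT_BUCKET = 5_400_000  # I/O credits in a full bucket


def gp2_baseline_iops(size_gib):
    """Baseline IOPS grows with volume size, within a floor and a ceiling."""
    return max(MIN_BASELINE_IOPS, min(MAX_BASELINE_IOPS, IOPS_PER_GIB * size_gib))


def burst_seconds(size_gib, demand_iops=BURST_IOPS):
    """Seconds until a full credit bucket is drained at a sustained demand."""
    baseline = gp2_baseline_iops(size_gib)
    demand = min(demand_iops, BURST_IOPS)
    if demand <= baseline:
        return float("inf")  # demand fits within the baseline; credits never drain
    return CREDIT_BUCKET / (demand - baseline)


for size in (8, 100, 1000):
    print(size, gp2_baseline_iops(size), burst_seconds(size))
```

Under these assumptions a 100 GiB volume has a 300 IOPS baseline and can sustain the full burst rate for 2,000 seconds (a little over half an hour) before BurstBalance reaches 0. A job that runs "fine for some hours" and then freezes fits a demand that sits only modestly above the baseline.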

Is it "okay" to host a small wordpress blog on one AWS EC2 Instance without load balancers/beanstalk?

This is a very simple question for those with the knowledge, but I'm a newbie.
In essence, I just need to know if it would be considered okay to run a small, approx. 700 visitors/day bitnami wordpress blog on just one t2.medium EC2 instance (without any auto-scaling, beanstalk).
Am I at risk of it crashing? What stats should I monitor to be aware of potential dangers? Sorry for the basic nature of these questions, but this is new to me.
tl;dr: It might be "okay", but it's not ideal.
If your question is because of:
Initial setup time - Load-balancing and auto-scaling will be less expensive (more time-efficient) over time.
Cost - Auto-scaling spins down instances that aren't being used to reduce cost.
Minimal setup for a great user experience - The goal of a great AWS setup is to ensure that capacity matches demand
Am I at risk of it crashing?
Possibly, yes. If you average 700 visitors a day, the risk is a traffic spike if all visitors hit at the same time. It also depends on your maximum number of visitors, which could vary widely from the average (or not).
What stats should I monitor to be aware of potential dangers?
Monitor the usage on high-traffic days (i.e. public holiday sales)
Set up billing alerts
Set up the right metrics:
See John Rotenstein's SO answer:
CPU Utilization is not always the right measure to use -- your
application might only be able to handle a limited number of
connections, it might be squeezed on RAM and the types of requests
might vary too.
You can use normal monitoring tools, or you can write something that
pushes metrics to Amazon CloudWatch, so that you go beyond the basic
CPU and Network metrics that CloudWatch normally provides. You could
even use the Load Balancer's Latency metric to trigger scaling when
the application slows down (custom code required).
I'd start with:
Two or more instances - to deal with instance redundancy (an instance going down)
Several t2.small rather than one t2.medium can work out to be more cost-efficient, and more cost efficient than EC in some use cases.
Add auto-scaling - automatically spin up or down instances based on minimum and maximum counts
Load balancing - to re-route users from unhealthy to healthy instances. And also to keep all of the spun up instances all working as evenly as possible (rather than a single instance handling 80% of the workload while the others bludge).
You can always reduce your instances after time with monitoring.
In my opinion, with 700 visitors a day, the safer option would be to run a load-balanced, auto-scaling environment on Elastic Beanstalk with at least 2 instances. The problem with running just one instance is that you are at great risk of crashing if traffic increases or the instance goes down, and with just one running you will not have a fallback. You can easily set up CloudWatch monitoring on NetworkIn and NetworkOut to get a sense of the number of requests your site is receiving and serving, and set up CPU usage monitoring as well. The trade-off with running a load-balanced environment over a single-instance environment is that the cost might increase significantly as you introduce other things into your environment, such as a load balancer. Also, if you introduce a load balancer, consider reducing the instance size to maybe a t2.small, which could help reduce the cost.
It actually depends. The question is broad, and you have multiple options here.
You can use a single EC2 instance for that many visitors, or even more if your application allows. You can also consider caching if your app needs it.
You may add the instance to an auto-scaling group, so that if you need more resources you can scale out horizontally.
You can also add load balancers later on. You just need to add user data to the launch configuration attached to the auto-scaling group, so that when an instance comes up it automatically registers itself with your load balancer.
For monitoring, you can check the request metrics in CloudWatch for the ELB. You have to keep an eye on your CPU and trigger the scale-out policy once it reaches a particular threshold.

How to scale up write speed on galera cluster? using maxscale as db proxy

Currently, I am researching Galera Cluster using many servers (Linux, CentOS). Scaling up read traffic is very effective and easy, but scaling writes seems difficult (no improvement).
I have used many servers, with MaxScale as the router (readconnroute) to distribute write queries in parallel to all servers, but the write speed has not improved.
One option would be to use the Spider storage engine in MariaDB. It supports sharding of tables and should improve write speeds compared to a Galera cluster. On the other hand, you will lose the high availability of the Galera cluster in favor of increased write speeds.
This slide set by Kentoku Shiba on Spider is a good overview of how Spider improves write scalability.
Galera does not improve write speed, as all servers will have to process all writes. MySQL is very poor for scaling writes. You could do it with a proxy (like you mentioned maxscale). Then you can shard your data. You have to pick a key for each table to distribute keys to multiple servers.
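Picking a key and distributing it across servers can be sketched in a few lines of Python. This is only an illustration of hash-based sharding under simple assumptions: the server names are hypothetical, and a real deployment also needs a plan for resharding and for queries that span shards.

```python
import hashlib

SHARDS = ["db0", "db1", "db2", "db3"]  # hypothetical server names


def shard_for(key, shards=SHARDS):
    """Map a sharding key to one server using a stable hash."""
    # Use hashlib rather than built-in hash(), which is randomized per process for strings.
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


# Each write goes to exactly one server, so write load is split across servers.
for user_id in (1, 2, 3, 42):
    print(user_id, shard_for(user_id))
```

With plain modulo placement, changing the number of shards remaps most keys. That is one reason to choose the shard count and key carefully up front, or to look at consistent hashing.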
I would suggest using a NoSQL server instead, e.g. MongoDB, which has sharding capabilities built in for write-heavy use cases. MongoDB is much easier to set up and maintain than MySQL for this job.

High performance server for 1x1 pixels (500M GET requests per day)

I need to set up a tracking server that will only serve 1x1 pixels and log all requests.
I initially thought of using Amazon's S3 or CloudFront but their costs are prohibitively high for me. I need to serve 500M pixels a day, and S3 charges $0.4 per 1M GET requests, so even without the data transfer costs I'm at $6,000/month.
I am considering setting up nginx or lighttpd on an EC2 instance. What performance should I expect with those two (e.g. per one large EC2 instance)? Are there better free products for this task?
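For reference, the numbers in the question can be checked with a short Python calculation. The 30-day month is an assumption, the price is the figure quoted above, and the per-second rate is only an average, so real peaks will likely be higher.

```python
REQUESTS_PER_DAY = 500_000_000
PRICE_PER_MILLION_GETS = 0.4  # USD, figure quoted in the question
DAYS_PER_MONTH = 30           # assumption
SECONDS_PER_DAY = 86_400

monthly_cost = REQUESTS_PER_DAY / 1_000_000 * PRICE_PER_MILLION_GETS * DAYS_PER_MONTH
average_rps = REQUESTS_PER_DAY / SECONDS_PER_DAY

print(f"monthly request cost: ${monthly_cost:,.0f}")
print(f"average load: {average_rps:,.0f} requests/s")
```

This confirms the $6,000/month estimate and shows that any self-hosted setup has to sustain an average of roughly 5,800 requests per second, including the log writes.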
Nginx is indeed a good candidate for this and already has built in support for empty GIFs (see http://wiki.nginx.org/HttpEmptyGifModule).
Disk I/O will probably be the biggest issue for this server because of the access logging. The only way to figure out the performance of the different EC2 instances is to test them.
If one EC2 instance does not offer the performance you need, or if you need any redundancy for this service, you should also look into using a load balancer (either an AWS Elastic Load Balancer or your own custom one).
You could also set up multiple smaller servers in different geographical regions and use DNS latency based routing to route requests to them (use either AWS Route 53 latency based routing or another DNS solution). This would significantly reduce the connection time to your server and would distribute the load across several data centers.
