Our entire infrastructure is built on Linode, in the Singapore region. The problem is that, as of now, Linode does not provide any block storage option. We have a 3-node Cassandra cluster that is growing at a rate of 4-5 GB per day. Each node has a 192 GB SSD allotted to it. Adding a Cassandra node is simple, but it comes at the cost of maintaining it. At the rate we are growing, we'd need 20-30 servers in 3 months' time.
DigitalOcean's Singapore region, on the other hand, does have a block storage option. There we could simply attach more SSD storage to our servers rather than provision 30 small servers.
Data is streamed from Kafka and stored in Cassandra.
What would be the pros and cons of having the Cassandra cluster in a different data center but in the same country? The latency between the two is about 2 ms. The ratio between reads and writes is roughly 5% read ops and 95% write ops.
I have the following use case:
we have a single-write-region Azure Cosmos DB account
the DB will be replicated to other Azure regions (e.g. 5 additional Azure regions treated as read replicas)
we have a daily ETL job that cannot interrupt users querying the database. Because of that, we rate-limit the requests we make to Cosmos in the application layer - e.g. we consume only 5k RU/s out of the 10k RU/s provisioned (to be strict, we provision 1k RU/s with the autoscale setting). Thanks to that, while we're doing the ETL job we're consuming 50% of the available RUs.
Question:
Is it possible that during replication we will hit 100% RU utilization in one of the read replicas, because Cosmos will try to replicate everything as fast as possible?
It depends on (1) whether the ETL is reading from Cosmos DB as a source or writing to Cosmos DB as a target and (2) what the aggregate workload (ETL + app) looks like.
I'll explain -
The best way to think about RUs is as a proxy metric for the physical system resources it takes to perform a request (CPU, memory, IOPS).
Writes must be applied to all regions, and therefore consume RUs (CPU/memory/IOPS) in each of the replicated regions. Given an example 3-region setup consisting of West US + East US + North Europe, writing a record will result in RU consumption in West US, East US, and North Europe.
Reads can be served out of a single region independently of the other regions. Given the same 3-region setup, reading a record in West US has no impact on East US or North Europe.
As you suggested -
Rate-limiting the ETL job is a good choice. Depending on what your ETL tool is, a few of them have easy-to-use client rate-limiting configuration options (notably, Azure Data Factory data flows and the Spark connector for Cosmos DB's Core (SQL) API have a "write throughput budget" concept). Alternatively, you can scale down the ETL job itself so that the ETL job is a natural bottleneck.
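If your ETL tool doesn't expose a throughput budget, application-layer rate limiting can be done by hand. Here is a minimal sketch assuming the azure-cosmos Python SDK; the endpoint, key, database, and container names are placeholders, and the way the per-request RU charge is read from the response headers may differ slightly between SDK versions.

```python
# Minimal sketch of client-side RU rate limiting for an ETL writer.
# Endpoint, key, database, and container names below are placeholders.
import time
from azure.cosmos import CosmosClient

RU_BUDGET_PER_SECOND = 5000  # stay at ~50% of the 10K RU/s autoscale maximum

client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
container = client.get_database_client("etl-db").get_container_client("target")


class RuThrottle:
    """Very simple fixed-window budget: sleep when the per-second RU budget is spent."""

    def __init__(self, budget):
        self.budget = budget
        self.window_start = time.monotonic()
        self.spent = 0.0

    def charge(self, request_units):
        now = time.monotonic()
        if now - self.window_start >= 1.0:       # new one-second window
            self.window_start, self.spent = now, 0.0
        self.spent += request_units
        if self.spent >= self.budget:            # budget exhausted: wait out the window
            time.sleep(max(0.0, 1.0 - (time.monotonic() - self.window_start)))
            self.window_start, self.spent = time.monotonic(), 0.0


throttle = RuThrottle(RU_BUDGET_PER_SECOND)


def upsert_with_budget(item):
    container.upsert_item(item)
    # The RU charge of the last request is reported in the response headers.
    charge = container.client_connection.last_response_headers.get("x-ms-request-charge", "0")
    throttle.charge(float(charge))
```

A real ETL would want this per worker thread and with retries on 429s, but the idea is the same: meter actual RU charges, not request counts.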
Configuring autoscale maximums to have sufficient headroom for [RU/sec needed for the rate-limited ETL] + [upper bound for expected RU/sec needed for the application] is a good call as well, while also noting that Cosmos DB's autoscale comes with a 10x scaling factor (e.g. configuring a 20K RU/sec maximum on Cosmos DB autoscale results in automatic scaling between 2K and 20K RU/sec).
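As a concrete sizing example using the ETL budget from your question (the application peak figure here is an assumption for illustration):

```python
# Hypothetical headroom sizing; etl_budget comes from the question, app_peak is assumed.
etl_budget = 5_000                                 # RU/s the rate-limited ETL may consume
app_peak = 4_000                                   # assumed upper bound for application RU/s
needed = etl_budget + app_peak                     # 9,000 RU/s worst case
autoscale_max = ((needed + 999) // 1000) * 1000    # autoscale maximums are set in 1K steps
autoscale_min = autoscale_max // 10                # 10x scaling factor
print(autoscale_max, autoscale_min)                # 9000 900
```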
One side note worth mentioning: depending on what the use case for the ETL job is, if this is a classic ETL from OLTP => OLAP, it may be worthwhile to look at Cosmos DB's analytical storage + Synapse Link feature set as an easier out-of-the-box solution.
In my company we are currently running a Kafka cluster across 3 AZs in a single AWS region. There are multiple topics with many partitions and their replicas. We know Amazon does not provide official stats for inter-AZ latency, but when we test it, it is extremely low, roughly sub-millisecond (under 1 ms). Our producers and consumers are also in the same AWS region, within those same AZs.
To save on operational costs, I am currently investigating the impact of moving this entire Kafka cluster, along with the producers and consumers, back on-prem and creating a similar cluster across 3 DCs.
I am asking this question because the inter-DC latency in the company is roughly 10 ms one way, end to end. So what would be the impact if this latency increases roughly 10-fold, from say 1 ms to 10 ms?
Apart from the producers and consumers, the brokers also communicate with each other across DCs (e.g. replicas fetch messages from leaders, the controller informs other brokers about changes), and some changes are written as metadata to ZooKeeper. What are the pitfalls I should watch out for?
Sorry for such an open question, but I want to know if anyone has experience and wants to share issues, pitfalls, etc. How do the Kafka brokers and ZooKeeper get impacted if a cluster spans DCs whose latency is higher than AWS inter-AZ latency? Is it even practical?
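One way to get a feel for the effect before committing to the move is to measure synchronous produce latency against a test cluster in the target topology. Below is a rough sketch using the kafka-python client; broker addresses and the topic name are placeholders, and it assumes a topic with replication factor 3 so that acks="all" forces the leader to wait for in-sync followers, making cross-DC replication latency show up directly in each send.

```python
# Rough produce-latency probe; broker addresses and topic name are placeholders.
import time
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers=["dc1-broker:9092", "dc2-broker:9092", "dc3-broker:9092"],
    acks="all",      # wait for in-sync replicas, so inter-DC latency is included
    linger_ms=0,
)

latencies = []
for _ in range(1000):
    start = time.monotonic()
    producer.send("latency-test", b"x" * 1024).get(timeout=30)  # block until acked
    latencies.append((time.monotonic() - start) * 1000)

latencies.sort()
print("p50 %.1f ms, p99 %.1f ms" % (latencies[len(latencies) // 2],
                                    latencies[int(len(latencies) * 0.99)]))
```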
I'm executing a Spark job on EMR. The job is currently bottlenecked by the network (reading data from S3). Looking at the metrics in Ganglia, I get a straight line at around 600 MB/s. I'm using the i2.8xlarge instance type, which is supposed to give 10 Gbps, i.e. ~1280 MB/s. I have verified that enhanced networking is turned on and that the VirtualizationType is hvm. Am I missing something? Is there any other way to increase network throughput?
Networking capacity of Amazon EC2 instances is based upon Instance Type. The larger the instance, the more networking capacity is available. You are using the largest instance type within the i2 family, so that is good.
Enhanced Networking lowers network latency and jitter and is available on a limited number of instance types. You are using it, so that is good.
The i2.8xlarge is listed as having 10 Gbps of network throughput, but this is limited to traffic within the same placement group. My testing shows that EMR instances are not launched within a placement group, so they might not receive the full network throughput possible.
You could experiment by using more smaller instances rather than fewer large instances. For example, 2 x i2.4xlarge cost the same as 1 x i2.8xlarge.
The S3->EC2 bandwidth is actually rate limited at 5 Gbps (625 MB/s) on the largest instance types of each EC2 family, even on instances with Enhanced Networking and a 20 Gbps network interface. This has been confirmed by the S3 team, and it matches what I observed in my experiments. Smaller instances get rate limited at lower rates.
S3's time-to-first-byte is about 80-100 ms, and after the first byte it is able to deliver data to a single thread at 85 MB/s, in theory. However, we've only observed about 60 MB/s per thread on average (IIRC). S3 confirmed that this is expected, and slightly higher than what their customers observe. We used an HTTP client that kept connections alive to the S3 endpoint. The main reason small objects yield low throughput is the high time-to-first-byte.
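As a back-of-the-envelope illustration of why small objects hurt, using the ~90 ms time-to-first-byte and ~60 MB/s per-thread stream rate quoted above (the object sizes are just examples):

```python
# Rough per-thread throughput model: size / (time_to_first_byte + size / stream_rate).
TTFB_S = 0.09        # ~90 ms until the first byte arrives
STREAM_MB_S = 60.0   # observed steady-state rate per thread

for size_mb in (1, 8, 32, 256):
    effective = size_mb / (TTFB_S + size_mb / STREAM_MB_S)
    print(f"{size_mb:>4} MB object -> ~{effective:.0f} MB/s per thread")
# 1 MB ~9 MB/s, 8 MB ~36 MB/s, 32 MB ~51 MB/s, 256 MB ~59 MB/s
```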
The following is the max bandwidth we've observed (in MB/s) when downloading from S3 using various EC2 instances:
Instance MB/s
C3.2XL 114
C3.4XL 245
C3.8XL 600
C4.L 67
C4.XL 101
C4.2XL 266
C4.4XL 580
C4.8XL 600
I2.8XL 600
M3.XL 117
M3.2XL 117
M4.XL 95
M4.10XL 585
X1.32XL 612
We did the above test with 32 MB objects and a thread count between 10 and 16.
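For reference, a minimal sketch of that kind of measurement, assuming boto3, a thread pool, and pre-uploaded 32 MB objects; the bucket and key names are placeholders:

```python
# Download a set of 32 MB objects from S3 with a thread pool and report aggregate throughput.
import time
from concurrent.futures import ThreadPoolExecutor

import boto3

BUCKET = "my-benchmark-bucket"                          # placeholder
KEYS = [f"bench/object-{i:04d}" for i in range(200)]    # pre-uploaded 32 MB objects
THREADS = 16

s3 = boto3.client("s3")


def fetch(key):
    # Read the whole object body so the download is fully measured.
    return len(s3.get_object(Bucket=BUCKET, Key=key)["Body"].read())


start = time.monotonic()
with ThreadPoolExecutor(max_workers=THREADS) as pool:
    total_bytes = sum(pool.map(fetch, KEYS))
elapsed = time.monotonic() - start
print(f"{total_bytes / elapsed / 1024 / 1024:.0f} MB/s aggregate with {THREADS} threads")
```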
Also,
The network performance quoted in the EC2 instance type matrix is benchmarked as described here. That figure is the network bandwidth between Amazon EC2 Linux instances in the same VPC. The bandwidth we are observing between S3 and EC2 instances is not what that matrix promises.
EC2 Instance Network performance appears to be categorized as:
Low
Moderate
High
10 Gigabit
20 Gigabit
Determining the network bandwidth of an instance designated Low, Moderate, or High appears to be done on a case-by-case basis.
C3, C4, R3, I2, M4 and D2 instances use the Intel 82599 Virtual Function interface and provide Enhanced Networking with 10 Gigabit interfaces in the largest instance sizes.
10 and 20 Gigabit interfaces are only able to achieve that speed when communicating within a common placement group, typically in support of HPC. Network traffic outside a placement group has a maximum limit of 5 Gbps.
Summary: the network bandwidth quoted is between two EC2 instances, not between S3 and EC2. Even between two instances, the 10/20 Gigabit figures are only achievable when they are in the same placement group, typically in support of HPC.
In my Android app I use Amazon DynamoDB. I created 10 tables, each with a read capacity of 10 and a write capacity of 5. Today I received an email from Amazon: it cost me $11.36.
I don't understand the meaning of free tier. Here is what I read from Amazon:
DynamoDB customers get 25 GB of free storage, as well as up to 25 write capacity units and 25 read capacity units of ongoing throughput capacity (enough throughput to handle up to 200 million requests per month) and 2.5 million read requests from DynamoDB Streams for free.
Please tell me more clearly what the free tier of 25 read and 25 write capacity units means!
Amazon considers the aggregate read and write capacity across all your tables, not the capacity of each individual table.
In your case the aggregate read capacity is 100 units (10 tables x 10) and the aggregate write capacity is 50 units (10 tables x 5). After subtracting the free tier of 25 units each, you are charged for 75 read capacity units and 25 write capacity units for every hour they are provisioned.
Please plan the read and write capacity of each table carefully, otherwise you will end up paying higher bills.
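For illustration, here is a small sketch (assuming boto3 with credentials already configured) that sums the provisioned capacity across all tables in a region so you can compare the total against the 25/25 free-tier allowance:

```python
# Sum provisioned read/write capacity across all DynamoDB tables in the region
# and compare against the free tier (25 RCU / 25 WCU).
import boto3

FREE_RCU, FREE_WCU = 25, 25

dynamodb = boto3.client("dynamodb")
total_rcu = total_wcu = 0

for page in dynamodb.get_paginator("list_tables").paginate():
    for name in page["TableNames"]:
        throughput = dynamodb.describe_table(TableName=name)["Table"]["ProvisionedThroughput"]
        total_rcu += throughput["ReadCapacityUnits"]
        total_wcu += throughput["WriteCapacityUnits"]

print(f"Provisioned: {total_rcu} RCU / {total_wcu} WCU")
print(f"Billable beyond free tier: {max(0, total_rcu - FREE_RCU)} RCU / "
      f"{max(0, total_wcu - FREE_WCU)} WCU")
```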
Amazon DynamoDB allows the customer to provision the throughput of reads and writes independently. I have read the Amazon Dynamo paper about the system that preceded DynamoDB and read about how Cassandra and Riak implemented these ideas.
I understand how it is possible to increase the throughput of these systems by adding nodes to the cluster which then divides the hash keyspace of tables across more nodes, thereby allowing greater throughput as long as access is relatively random across hash keys. But in systems like Cassandra and Riak this adds throughput to both reads and writes at the same time.
How is DynamoDB architected differently that they are able to scale reads and write independently? Or are they not and Amazon is just charging for them independently even though they essentially have to allocate enough nodes to cover the greater of the two?
You are correct that adding nodes to a cluster should increase the amount of available throughput, but that happens on a cluster basis, not a table basis. The DynamoDB cluster is a shared resource across many tables and many accounts. It's like an EC2 node: you are paying for a virtual machine, but that virtual machine is hosted on a real machine shared among several EC2 virtual machines, and depending on the instance type, you get a certain amount of memory, CPU, network I/O, etc.
What you are paying for when you pay for throughput is I/O, and reads and writes can be throttled independently. Paying for more throughput does not cause Amazon to partition your table across more nodes. The only thing that causes a table to be partitioned further is the size of your table growing to the point where more partitions are needed to store its data. The maximum size of a partition, from what I have gathered talking to DynamoDB engineers, is based on the size of the SSDs of the nodes in the cluster.
The trick with provisioned throughput is that it is divided among the partitions. So if you have a hot partition, you could get throttling and ProvisionedThroughputExceededException errors even if your total requests aren't exceeding the table's total read or write throughput. This is contrary to what your question assumes: you would expect that if your table is divided among more partitions/nodes, you'd get more throughput, but in reality it is the opposite unless you scale your provisioned throughput with the size of your table.
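A worked example of that division (the partition count and key distribution here are made up purely for illustration):

```python
# Hypothetical example: provisioned throughput is split evenly across partitions,
# so a hot partition can throttle even when the table-level total is not exceeded.
provisioned_rcu = 400
partitions = 4
per_partition_rcu = provisioned_rcu / partitions           # 100 RCU each

# Suppose one hot key attracts most of the traffic:
requested = {"p0": 150, "p1": 20, "p2": 20, "p3": 20}      # RCU/s hitting each partition

total = sum(requested.values())                             # 210 RCU/s, well under 400
throttled = {p: r for p, r in requested.items() if r > per_partition_rcu}
print(total, throttled)   # 210 {'p0': 150} -> p0 throttles despite table-level headroom
```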