How do shards know about each other in SolrCloud?

Suppose we start a SolrCloud cluster with 2 shards. Documents are distributed across the 2 shards by a hash function (MurmurHash). It is claimed that we can send a query to any of the cores and it will reach the right shard because the shards know about each other. I want to know: how do they know about each other?

Solr becomes SolrCloud with a ZooKeeper ensemble, so it is ZooKeeper that makes it possible for the nodes in the SolrCloud cluster to know about each other. You can think of ZooKeeper as a central repository for the SolrCloud configuration and cluster state.
You can send a query to any node in the cloud; that node consults ZooKeeper to find out which nodes are alive in each shard and distributes the query across the shards (to a live replica of each shard). Every node that receives a sub-request executes it and sends its results back to the node that distributed the request. That node then combines the results returned by the shards and sends the merged response back to the client.
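This is also why the SolrJ CloudSolrClient is pointed at the ZooKeeper ensemble rather than at an individual Solr node: it reads the cluster state from ZooKeeper and routes documents and queries itself. A minimal sketch (the ZooKeeper hosts, collection name and fields are made up, and the builder details vary a bit between SolrJ versions):

import java.util.Arrays;
import java.util.Optional;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrInputDocument;

public class SolrCloudExample {
    public static void main(String[] args) throws Exception {
        // The client watches the cluster state (live nodes, shard ranges, leaders) in ZooKeeper.
        try (CloudSolrClient client = new CloudSolrClient.Builder(
                Arrays.asList("zk1:2181", "zk2:2181", "zk3:2181"), // hypothetical ZooKeeper ensemble
                Optional.empty()).build()) {
            client.setDefaultCollection("mycollection"); // hypothetical collection

            // Indexing: the document id is hashed (MurmurHash) to pick the target shard,
            // and the update is sent to that shard's leader.
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-1");
            doc.addField("title", "hello solrcloud");
            client.add(doc);
            client.commit();

            // Querying: the request is fanned out to one live replica per shard
            // and the partial results are merged before being returned.
            QueryResponse rsp = client.query(new SolrQuery("title:hello"));
            System.out.println("hits: " + rsp.getResults().getNumFound());
        }
    }
}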

Related

Using endpoints of AWS ElastiCache for Redis

I am using AWS ElastiCache for Redis as the caching solution for my Spring Boot application. I am using spring-boot-starter-data-redis and the Jedis client to connect to my cache.
Imagine that my cache runs with cluster mode enabled and has 3 shards with 2 nodes in each. I understand that the best way to connect is to use the configuration endpoint. Alternatively, I can list the endpoints of all the nodes and get the job done.
However, even if I use a single node's endpoint from one of the shards, my caching solution works. That doesn't look right to me. I feel that even if it works, it might cause problems in the cluster in the long run, since there are altogether 6 nodes partitioned into 3 shards but I am only using one node's endpoint. I have the following questions:
Does using one node's endpoint create an imbalance in the cluster?
or
Is that handled automatically by AWS ElastiCache for Redis?
If I use only one node's endpoint, does that mean the other nodes will never be used?
Thank you!
To answer your questions:
Does using one node's endpoint create an imbalance in the cluster?
No.
Is that handled automatically by AWS ElastiCache for Redis?
Somewhat.
If I use only one node's endpoint, does that mean the other nodes will never be used?
No. All nodes are being used.
This is how cluster mode enabled works. In your case, you have 3 shards, meaning all your hash slots (where the key-value data is stored) are divided among 3 sub-clusters, i.e. shards.
This was explained in this answer as well - https://stackoverflow.com/a/72058580/6024431
So, essentially, your nodes are smart enough to redirect your requests to the node that owns the hash slot where your data needs to be stored. So, no imbalance. Redis handles the redirection for you.
Now, while using node endpoints, you're going to face other problems.
ElastiCache runs in the cloud (which is essentially AWS hardware), and all hardware eventually has issues. Say you have 3 primaries (1p, 2p, 3p) and 3 replicas (1r, 2r, 3r).
If a primary goes down due to a hardware issue (let's say 1p), its replica (1r) will be promoted to become the new primary for that shard.
Now the problem is that your application is connected directly to 1p, which has been demoted to a replica, so all WRITE operations will fail.
And you will have to change the application configuration manually whenever this happens.
Alternatively, if you were using the configuration endpoint (or another cluster-level endpoint) instead of node endpoints, this issue would be at most a blip for your application, perhaps for 1-2 seconds.
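To make this concrete: a cluster-aware client only needs one or more seed endpoints to discover the full topology and to follow redirections and failovers on its own. A rough sketch with plain Jedis (the endpoint hostname and key are made up; with spring-boot-starter-data-redis you would get the same behaviour by putting the configuration endpoint in the cluster nodes property):

import java.util.HashSet;
import java.util.Set;
import redis.clients.jedis.HostAndPort;
import redis.clients.jedis.JedisCluster;

public class ElastiCacheClusterExample {
    public static void main(String[] args) {
        // Seed the client with the cluster's configuration endpoint (hypothetical hostname).
        // From this single seed Jedis discovers every shard and node in the cluster.
        Set<HostAndPort> seeds = new HashSet<>();
        seeds.add(new HostAndPort("my-cache.xxxxxx.clustercfg.use1.cache.amazonaws.com", 6379));

        try (JedisCluster cluster = new JedisCluster(seeds)) {
            // The key is hashed to one of the 16384 slots; the command goes straight to the
            // node that owns that slot, and MOVED responses after a failover or reshard
            // are followed transparently.
            cluster.set("user:42:name", "alice");
            System.out.println(cluster.get("user:42:name"));
        }
    }
}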
Cheers!

Is it possible to scale out HashiCorp Vault using the DynamoDB storage backend?

I am using Vault on AWS with the DynamoDB backend. The backend supports HA.
storage "dynamodb" {
  ha_enabled = "true"
  region     = "us-west-2"
  table      = "vault-data"
}
Reading the HA concept documentation:
https://www.vaultproject.io/docs/concepts/ha.html
To be highly available, one of the Vault server nodes grabs a lock within the data store. The successful server node then becomes the active node; all other nodes become standby nodes. At this point, if the standby nodes receive a request, they will either forward the request or redirect the client depending on the current configuration and state of the cluster -- see the sections below for details. Due to this architecture, HA does not enable increased scalability.
I am not interested in having a fleet of EC2 instances behind an ELB, where only 1 instance behaves like a master and talks to DynamoDB.
I would like to run N EC2 instances running Vault that read from and write to DynamoDB independently.
Because DynamoDB supports reads and writes from multiple EC2 instances, I would expect to be able to unseal Vault on multiple instances simultaneously and perform read and write operations. This should even work with ha_enabled = "false", without doing any leader election.
Why is this architecture not suggested in the documentation? Why should it not work? Is there a cryptographic limitation that I am missing?
Thank you.
What you are describing is a feature of Vault Enterprise. With it, you can set up a primary cluster and as many "secondary" clusters as you like, better known as performance replicas. Each cluster has its own storage and unseal mechanism, so you could have one cluster on DynamoDB and another on Raft. If both are on DynamoDB, you'll need a separate DynamoDB table for each.
But keep in mind that performance replicas always forward write operations to the primary cluster. A write operation is something that affects the global state of Vault; in that sense, a POST to /transit is not considered a write operation.
Another possibility is to mount your kv store locally (with the -local flag). It will then behave as if it were on a primary even when mounted on a performance replica, at the price of not being replicated to the other clusters.
A final note: a DR cluster is an exact copy of the cluster it protects. Each cluster, whether a primary or a replica, can have its own DR cluster.
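If you do run several open-source Vault nodes against the same DynamoDB table with ha_enabled = "true", you can see the active/standby split for yourself by polling each node's unauthenticated sys/health endpoint; only one node holds the lock and reports itself as active. A small sketch (the node URLs are made up):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;

public class VaultHealthCheck {
    public static void main(String[] args) throws Exception {
        HttpClient http = HttpClient.newHttpClient();
        // Hypothetical node addresses; only one of them is the active node at any time.
        List<String> nodes = List.of(
                "http://vault-1.internal:8200",
                "http://vault-2.internal:8200",
                "http://vault-3.internal:8200");

        for (String node : nodes) {
            // By default the active node answers 200 and a standby answers 429;
            // the JSON body also contains a "standby" flag.
            HttpRequest req = HttpRequest.newBuilder(
                    URI.create(node + "/v1/sys/health")).GET().build();
            HttpResponse<String> rsp = http.send(req, HttpResponse.BodyHandlers.ofString());
            System.out.println(node + " -> HTTP " + rsp.statusCode() + " " + rsp.body());
        }
    }
}

The standby nodes forward or redirect writes to the active node, which is why adding nodes this way buys availability rather than write throughput.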

MariaDB Galera cluster does not come up after killing the mysql process

I have a MariaDB Galera cluster with 2 nodes and it is up and running.
Before moving to production, I want to make sure that if a node crashes abruptly, it comes back up on its own.
I tried using systemd "restart", but after killing the mysql process the mariadb service does not come back up. Is there any tool or method I can use to automate bringing the nodes back up after a crash?
Galera clusters need quorum (at least 3 nodes).
In order to avoid a split-brain condition, the minimum recommended number of nodes in a cluster is 3. Blocking state transfer is yet another reason to require a minimum of 3 nodes in order to enjoy service availability in case one of the members fails and needs to be restarted. While two of the members will be engaged in state transfer, the remaining member(s) will be able to keep on serving client requests.
You can read more here.

MariaDB Galera cluster: are replicate-do-db filters applied before or after data is sent?

I would like to synchronize only some databases on a cluster, with replicate-do-db.
→ If I use a Galera cluster, is all the data sent over the network, or are the nodes smart enough to only fetch their specific databases?
With "classic" master/slave MariaDB replication, filtering is done by the slave, which causes unnecessary network traffic if you don't replicate a given database. You have to configure a blackhole proxy that filters the binary logs to avoid this (setup example), but administering it afterwards is not really easy. So it would be perfect if I could achieve the same thing with a cluster :)
The binlog_* filters are applied on the sending (master) node.
The replicate_* filters are applied on the receiving (slave) node.
Is this filtered server part of the cluster? If so, you are destroying much of the beauty of Galera.
On the other hand, if this is a slave hanging off one of the Galera nodes and the slave does not participate in the "cluster", this is a reasonable architecture.

Which node should I push data to in a cluster?

I've set up a Kafka cluster with 3 nodes.
kafka01.example.com
kafka02.example.com
kafka03.example.com
Kafka does replication, so any node in the cluster can be removed without losing data.
Normally I would send all data to kafka01; however, then everything breaks if that one node goes down.
What is the industry best practice when dealing with clusters? I'm evaluating setting up an NGINX reverse proxy with round-robin load balancing. Then I can point all data producers at the proxy and it will divvy the data up between the nodes.
I need to ensure that no data is lost if one of the nodes becomes unavailable.
Is an NGINX reverse proxy an appropriate tool for this use case?
Is my assumption correct that a round-robin reverse proxy will distribute the data and increase reliability without data loss?
Is there a different approach that I haven't considered?
Normally your producer takes care of distributing the data to all (or a selected set of) nodes that are up and running by using a partitioning function, either in round-robin mode or based on some semantics of your choice. The producer publishes to a partition of a topic, and different nodes are leaders for different partitions of the same topic. If a broker node becomes unavailable, it falls out of the in-sync replica set (ISR) and new leaders are elected for the partitions it was leading. Through metadata requests/responses, your producer becomes aware of this and pushes messages to other nodes that are currently up.
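In practice that means you do not need an NGINX proxy at all: give the producer all three brokers as bootstrap servers and let it handle partitioning and failover itself. A minimal sketch (the topic name and port are assumptions):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ClusterAwareProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Any reachable broker is enough to fetch the full cluster metadata,
        // but listing all three means startup still works if one of them is down.
        props.put("bootstrap.servers",
                "kafka01.example.com:9092,kafka02.example.com:9092,kafka03.example.com:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());
        // Do not acknowledge a write until all in-sync replicas have it,
        // so a single broker failure cannot lose acknowledged data.
        props.put("acks", "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The partitioner picks a partition (by key hash here, since a key is set);
            // the record is sent directly to that partition's current leader.
            producer.send(new ProducerRecord<>("events", "user-42", "clicked")); // hypothetical topic
            producer.flush();
        }
    }
}

Durability also depends on the topic itself being created with a replication factor of at least 2 (preferably 3) and an appropriate min.insync.replicas.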
