Using endpoints of AWS ElastiCache for Redis - spring-data-redis

I am using AWS ElastiCache for Redis as the caching solution for my Spring Boot application. I am using spring-boot-starter-data-redis and the Jedis client to connect to my cache.
Imagine that my cache runs in cluster mode enabled, with 3 shards and 2 nodes in each. I agree that the best way to connect is to use the configuration endpoint. Alternatively, I can list the endpoints of all nodes and get the job done.
However, even if I use a single node's endpoint from one of the shards, my caching solution still works. That doesn't look right to me. I feel that even if it works, it might cause problems in the cluster in the long run, since there are altogether 6 nodes partitioned into 3 shards but only one node's endpoint is being used. I have the following questions.
Does using one node's endpoint create an imbalance in the cluster?
or
Is that handled automatically by AWS ElastiCache for Redis?
If I use only one node's endpoint, does that mean the other nodes will never be used?
Thank you!

To answer your questions:
Does using one node's endpoint create an imbalance in the cluster?
No.
Is that handled automatically by AWS ElastiCache for Redis?
Somewhat.
If I use only one node's endpoint, does that mean the other nodes will never be used?
No. All nodes are being used.
This is how cluster mode enabled works. In your case, you have 3 shards, meaning all your hash slots (where key-value data is stored) are divided among 3 sub-clusters, i.e. shards.
This was explained in this answer as well - https://stackoverflow.com/a/72058580/6024431
So, essentially, your nodes are smart enough to redirect your requests to the node that holds the key slot where your data needs to be stored. So, no imbalance. Redis handles the redirection for you.
Now, while using node endpoints, you're going to face other problems.
ElastiCache runs in the cloud (which is essentially AWS hardware), and all hardware faces issues. You have 3 primaries (1p, 2p, 3p) and 3 replicas (1r, 2r, 3r).
So, if a primary goes down due to a hardware issue (let's say 1p), its replica (1r) will get promoted to become the new primary for that shard.
Now the problem is that your application is connected directly to 1p, which has just been demoted to a replica. So, all the WRITE operations will fail.
And you will have to update the endpoint in your application manually whenever this happens.
Alternatively, if you were using the configuration endpoint (or another cluster-level endpoint) instead of node endpoints, this issue would be at most a brief blip for your application, perhaps for 1-2 seconds.
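For illustration, here is a minimal sketch (Java config for spring-boot-starter-data-redis with Jedis) of pointing the client at the cluster-level endpoint; the endpoint string below is a placeholder for your cluster's clustercfg configuration endpoint:
```java
import java.util.Collections;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.redis.connection.RedisClusterConfiguration;
import org.springframework.data.redis.connection.jedis.JedisConnectionFactory;

@Configuration
public class RedisClusterConfig {

    @Bean
    public JedisConnectionFactory redisConnectionFactory() {
        // Placeholder: replace with your ElastiCache "clustercfg" configuration endpoint.
        RedisClusterConfiguration clusterConfiguration = new RedisClusterConfiguration(
                Collections.singletonList("my-cluster.abc123.clustercfg.use1.cache.amazonaws.com:6379"));
        return new JedisConnectionFactory(clusterConfiguration);
    }
}
```
The same thing can be done purely through properties (spring.redis.cluster.nodes in Spring Boot 2.x) if you prefer not to declare the bean yourself.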
Cheers!

Related

MariaDB Spider with Galera Clusters failover solutions

I am having problems trying to build a database solution for my experiment that ensures HA and performance (sharding).
Right now, I have a Spider node and two Galera clusters (3 nodes in each cluster), as shown in the figure below, and this configuration works well in general cases.
However, as far as I know, when the Spider engine performs sharding, it must be given a primary IP to distribute SQL statements to the two nodes in the different Galera clusters.
So my first question here is:
Q1): When machine .12 shuts down due to failure, how can I make .13 or .14 (one of them) automatically replace .12?
(Figure: the servers that the Spider engine knows about.)
Q2): Are there any open-source tools (or technologies) that can help me deal with this situation? If so, please explain how they work. (Maybe MaxScale? But I don't really know what it is or what it can do.)
Q3): The motivation for this experiment is as follows. An automated factory has many machines, and each machine generates data that must be recorded during the production process (perhaps hundreds or thousands of data points per second) in order to observe the operation of the machines and maximize the quality of each batch of products.
So my question is: is this architecture (Figure 1) suitable? Or please provide your suggestions.
You could use MaxScale in front of the Galera cluster to make the individual nodes appear like a combined cluster. This way Spider will be able to seamlessly access the shard even if one of the nodes fails. You can take a look at the MaxScale tutorial for instructions on how to configure it for a Galera cluster.
Something like this should work:
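As a rough sketch only (assuming one MaxScale per Galera cluster; the server names, addresses, ports and credentials are placeholders for your own values), the relevant parts of maxscale.cnf could look like this:
```ini
# Monitor that tracks which Galera nodes are synced and designates a write master
[Galera-Monitor]
type=monitor
module=galeramon
servers=galera-1,galera-2,galera-3
user=maxscale
password=maxscale_pw
monitor_interval=2000ms

# Routing service: readwritesplit sends writes to the current master node
[Galera-Service]
type=service
router=readwritesplit
servers=galera-1,galera-2,galera-3
user=maxscale
password=maxscale_pw

# The port Spider connects to instead of a single Galera node
[Galera-Listener]
type=listener
service=Galera-Service
protocol=MariaDBClient
port=4006

# One server section per Galera node (galera-2 and galera-3 are defined the same way)
[galera-1]
type=server
address=10.0.0.12
port=3306
protocol=MariaDBBackend
```
Spider's server definition for that shard would then point at the MaxScale listener (port 4006 in this sketch), so a failed Galera node is handled by MaxScale instead of requiring a manual change on the Spider side.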
This of course has the same limitation that a single database node has: if the MaxScale server goes down, you'll have to switch to a different MaxScale for that cluster. The benefit of using MaxScale is that it is, in some sense, stateless, which means it can be started and stopped almost instantly. A network load balancer (e.g. ELB) can already provide some protection against this problem.

uWSGI and Flask: keep objects in memory between requests

My stack is currently uWSGI, Flask and nginx. I need to store data between requests (basically, the server receives push notifications about events from another service, and I want to store those events in server memory so the client can simply poll the server every n milliseconds for the latest update).
Normally this would not work, for many reasons. One is that a good production deployment requires several uWSGI processes (and maybe even several machines to scale this out). But my case is very specific: I'm building a web app for a piece of hardware (think of your home router's configuration page as a good example). This means there is no need to scale. I also do not have a database (at least not a traditional one) and will normally have only 1-2 clients connected simultaneously.
If I specify --processes 1 --threads 4 in uWSGI, is this enough to ensure the data is kept in memory as a single instance? Or do I also need to use --threads 1?
I'm also aware that some web servers randomly clear memory from time to time and restart the hosted app. Does nginx/uWSGI do that, and where can I read about the rules?
I'd also welcome advice on how to design all of this, if there are better ways to handle it. Please note that I am not considering any persistent storage for this - it isn't worth the effort and may even be impossible due to hardware limitations.
Just to clarify: when I talk about one instance of the data, I mean my app.py executing exactly once and keeping the objects defined there alive for as long as the server lives.
If you don't need data to persist across a server restart, why not just build a cache object into your application that can do push and pop operations?
A simple array of objects should suffice: one Flask route pushes new data onto the array and another pops data off it.

How to block flows from other nodes on Corda network?

My dev node is part of the Corda test network, and when I open the logs I see something like (node etc.. sent you a flow which you don't have installed, you can kill it with kill flow). So I have 2 questions:
How do I reject these calls? I know the purpose of being part of the Corda network is to give CorDapps of different orgs the ability to transact, but I don't want to go with the segregated network model (because it's more expensive for prod and pre-prod Corda networks).
Can a node on the network perform a DoS (Denial of Service) attack by sending me flows that I don't have installed and eventually bring my node down?
I'm not sure if I'm entirely right about this, but as far as I know the Corda Network is designed on a need-to-know basis, which you are already aware of. I had the same doubt when I first started with Corda, but I found out that one can simply block a node from sending you any undesired flows, which could otherwise cost you unnecessary CPU runtime. The explanation for this is given in this link.
Apart from this, I have gone through a Medium post which explained how a ResponderFlow can validate the information passed through flows. One of the points mentioned there was to verify the identity of the flow initiator (so as to decide whether we want to service that flow), which can't be done within a contract, so it needs to be done inside the flow.
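To illustrate that initiator check, here is a minimal Java sketch of a responder flow that rejects sessions from parties that are not on an allow-list; the initiating flow and the X.500 name are placeholders rather than anything from your CorDapp:
```java
import co.paralleluniverse.fibers.Suspendable;
import net.corda.core.flows.FlowException;
import net.corda.core.flows.FlowLogic;
import net.corda.core.flows.FlowSession;
import net.corda.core.identity.CordaX500Name;
import net.corda.core.identity.Party;

// @InitiatedBy(YourInitiatorFlow.class)  // register against your (placeholder) initiating flow
public class GuardedResponderFlow extends FlowLogic<Void> {

    private final FlowSession counterpartySession;

    public GuardedResponderFlow(FlowSession counterpartySession) {
        this.counterpartySession = counterpartySession;
    }

    @Suspendable
    @Override
    public Void call() throws FlowException {
        // Placeholder identity of the only org we are willing to respond to
        CordaX500Name allowedInitiator = CordaX500Name.parse("O=PartnerOrg,L=London,C=GB");

        Party initiator = counterpartySession.getCounterparty();
        if (!initiator.getName().equals(allowedInitiator)) {
            // Bail out early so no further work is done for this peer
            throw new FlowException("Rejecting flow from unexpected initiator: " + initiator.getName());
        }

        // ... normal responder logic (e.g. SignTransactionFlow / ReceiveFinalityFlow) goes here
        return null;
    }
}
```
This check only applies to flows you actually have a responder installed for; it is just meant to illustrate the identity-validation point above.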
Also, one can't keep flooding a node with a flow, because a flow has a timeout, maxRestartCount and backOffBase, which really help in determining how the flow is retried and propagated across the network.
I hope this helps you construct a solution.

How to get the proxy node in openstack swift cluster?

I know the command swift-ring-builder /etc/swift/object.builder can list all the storage nodes in a Swift cluster. Now I want to know if there is a similar command to get the proxy nodes in the cluster.
Every controller node itself acts as a proxy server first. Requests hit the proxy-server code on the controller node, which resolves the functions and methods to be called and acts on them.
The list of storage nodes MUST be accessible to all nodes in the cluster.
However, Swift is agnostic about the list of proxies it has, so there is no such command.
One suggestion, if you really need this information, would be to look at the storage nodes' logs and find the IPs making the requests. This way you can discover some or all of the proxies. However, this method is totally imprecise.

Which node should I push data to in a cluster?

I've setup a kafka cluster with 3 nodes.
kafka01.example.com
kafka02.example.com
kafka03.example.com
Kafka does replication so that any node in the cluster can be removed without losing data.
Normally I would send all data to kafka01; however, that will break everything if that one node goes down.
What is industry best practice when dealing with clusters? I'm evaluating setting up an NGINX reverse proxy with round-robin load balancing. Then I can point all data producers at the proxy and it will divvy up the traffic between the nodes.
I need to ensure that no data is lost if one of the nodes becomes unavailable.
Is an nginx reverse proxy an appropriate tool for this use case?
Is my assumption correct that a round robin reverse proxy will distribute the data and increase reliability without data loss?
Is there a different approach that I haven't considered?
Normally your producer takes care of distributing the data to all (or a selected set of) nodes that are up and running by using a partitioning function, either in a round-robin fashion or using some semantics of your choice. The producer publishes to a partition of a topic, and different nodes are leaders for different partitions of one topic. If a broker node becomes unavailable, it falls out of the in-sync replica set (ISR) and new leaders are elected for the partitions it was leading. Through metadata requests/responses, your producer becomes aware of this and pushes messages to other nodes that are currently up.
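For illustration, here is a minimal sketch using the standard Kafka Java producer. The broker hostnames come from your list; the port, topic name and key/value types are assumptions, and acks=all plus retries is what covers the no-data-loss requirement (assuming the topic's replication factor is greater than 1):
```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ResilientProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // List all brokers so bootstrapping still works if one of them is down.
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG,
                "kafka01.example.com:9092,kafka02.example.com:9092,kafka03.example.com:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // Wait for all in-sync replicas to acknowledge each write and retry transient failures,
        // so losing a single broker does not lose acknowledged data.
        props.put(ProducerConfig.ACKS_CONFIG, "all");
        props.put(ProducerConfig.RETRIES_CONFIG, Integer.MAX_VALUE);

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Placeholder topic; the default partitioner spreads records across partitions,
            // whose leaders live on different brokers.
            producer.send(new ProducerRecord<>("machine-events", "machine-1", "example payload"));
            producer.flush();
        }
    }
}
```
Because the client bootstraps from any reachable broker in that list and then discovers partition leaders on its own, an NGINX reverse proxy in front of the brokers is generally not needed for this.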
