Is Jedis synchronous or asynchronous? What consistency guarantee does Redis provide?

1) Are Jedis calls synchronous or asynchronous? When I make two consecutive writes through Jedis, can I be sure that they will reach Redis in the same order I made them?
2) What consistency guarantee does Redis provide? When I set some value through Jedis, is it guaranteed that every subsequent read will see that write? I know that in eventually consistent systems you don't have such a guarantee.
Is it sequential consistency, like ZooKeeper? I read the documentation; it just says that Redis is not strongly consistent, which I assume means it is not linearizable?

Are Jedis calls synchronous or asynchronous?
From the client-server perspective it is synchronous: when the Jedis client sends a request, the Redis server processes it and returns the result as the response to that request. But for some commands, like UNLINK, things are different: in some cases Redis returns immediately and does the actual work in the background. So whether a command's effect is synchronous or asynchronous depends on what the Redis server does with it.
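For illustration, a minimal sketch of this with Jedis (assuming a local Redis on the default port and a Jedis version that has unlink(), i.e. 3.x or later):

```java
import redis.clients.jedis.Jedis;

public class JedisSyncExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // set() blocks until the server replies, so the call itself is synchronous
            String reply = jedis.set("greeting", "hello");   // reply == "OK"
            // unlink() also returns as soon as the server acknowledges the command,
            // but the actual memory reclamation happens in the background
            long removed = jedis.unlink("greeting");
            System.out.println(reply + " / removed=" + removed);
        }
    }
}
```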
When I make two consecutive writes through Jedis, can I be sure that they will reach Redis in the same order I made them?
Yes. Redis uses a single-threaded model with a request queue, so all commands are processed sequentially. There are some cases you should watch out for, though: if more than one client (or process) sends commands to the server, their commands can interleave and produce inconsistent data. You can use MULTI/EXEC, WATCH, or a distributed lock to solve this, depending on the case.
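As a sketch, this is what the MULTI/EXEC route looks like in Jedis (the key name is made up for the example):

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Transaction;

public class JedisMultiExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // MULTI/EXEC queues both writes on the server and applies them
            // atomically, so no other client's command can interleave between them
            Transaction tx = jedis.multi();
            tx.set("balance", "100");
            tx.incrBy("balance", 50);
            tx.exec();
        }
    }
}
```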
What consistency guarantee does Redis provide? When I set some value through Jedis, is it guaranteed that every subsequent read will see that write?
If the read command arrives after the write command, the read is guaranteed to see that write; again, the single-threaded model keeps the world simple. But if you use master-slave mode (through Sentinel, or Redis Cluster), things are different: replication from master to slave is asynchronous, so a read from a slave is not guaranteed to see the most recent write. That cannot happen with a single Redis instance.
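A small sketch of the difference (assuming a Jedis version that exposes the WAIT command as waitReplicas(); the key name is made up):

```java
import redis.clients.jedis.Jedis;

public class ReadYourWrites {
    public static void main(String[] args) {
        try (Jedis master = new Jedis("localhost", 6379)) {
            master.set("k", "v");
            // On a single instance, a subsequent read always sees the write:
            String v = master.get("k");      // "v"
            // With replicas, WAIT blocks until the write has reached N replicas,
            // which narrows (but does not eliminate) the stale-read window:
            master.waitReplicas(1, 100);     // wait for 1 replica, 100 ms timeout
            System.out.println(v);
        }
    }
}
```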

Related

The asynchronous connection pool implementation in Rust

I have a Tokio TCP back-end application which, briefly, after receiving a request, reads something from Redis, writes something to PostgreSQL, uploads something via HTTP, sends something to RabbitMQ, etc. Processing each request takes a lot of time, so a separate task is created for each request. As sharing connections is impossible in asynchronous models, some connection pooling is required. For now, new connections are established on each request, which is extremely wasteful.
I have been looking for an asynchronous connection pool implementation in Rust, but have not found any that are up to date.
I would like to hear some advice on how to implement it myself.
The only idea I have come up with is:
Implement a Stream/Sink object with an inner collection of connections. It does not matter whether it is LIFO or FIFO, since the connections are identical. On application startup, N connections are allocated.
I am not sure whether it is possible to share such a pool among tasks, but if it were possible, tasks would poll the stream for a connection instance (instead of establishing their own), use it, and then put it back.
If there were no connections available, the stream might establish more of them or ask the task to hang on (depending on its configuration).
If a connection fails, it gets dropped and the pool now contains N-1 connections, so it may decide to allocate a new one on the next request.
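For what it's worth, here is the check-out/check-in idea from the sketch above expressed with a plain blocking queue (Java, to match the other examples in this document; SimplePool is a made-up name, and an async version would return a future from acquire() instead of blocking):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// A deliberately simple fixed-size pool illustrating check-out/check-in;
// C is a stand-in for a real connection handle.
class SimplePool<C> {
    private final BlockingQueue<C> idle;

    SimplePool(Iterable<C> connections, int capacity) {
        idle = new ArrayBlockingQueue<>(capacity);
        connections.forEach(idle::add);
    }

    // Blocks until a connection is free; the async variant would resolve
    // a future here instead of parking the calling thread.
    C acquire() throws InterruptedException {
        return idle.take();
    }

    // Healthy connections go back; a broken one is simply dropped,
    // shrinking the pool to N-1 as described above.
    void release(C conn, boolean healthy) {
        if (healthy) idle.offer(conn);
    }
}
```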
So I have two problems I cannot find proper answers anywhere:
Must/can/should I share the stream/sink-pool among tasks in some way? I do see a Shared future in the futures crate.
There are some murky points in the tokio/futures tutorial. For example, it does not explain how to notify the uppermost task, that is, how to implement the mythical innermost future, which does not poll anything itself but still has to notify the upper futures.
Or is my approach completely wrong? I could start playing with it myself, but I have a strong suspicion that I have missed something, e.g. a one-click solution.

Is it necessary to use a pool with an async Redis client?

I use the async Redis API (hiredis) in my project with one connection. Because of the non-blocking I/O, I think one connection is enough for me, but I want to figure out whether a sequence of commands will be executed first-in, first-out.
The documentation seems to suggest it is first-in, first-out.
https://github.com/redis/hiredis
In an asynchronous context, commands are automatically pipelined due
to the nature of an event loop
Yes, you don't need a pool. One connection should be sufficient.
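The same ordering can be observed from Java with a Jedis pipeline, which, like the hiredis event loop, writes commands to a single connection and reads the replies back in the same order (local Redis and key name assumed for the example):

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;
import redis.clients.jedis.Response;

public class PipelineOrdering {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            Pipeline p = jedis.pipelined();
            // Both commands travel on one connection, in order; Redis answers
            // them in the same order (FIFO)
            Response<Long> first  = p.rpush("log", "a");
            Response<Long> second = p.rpush("log", "b");
            p.sync();   // flush the commands and read the replies in order
            System.out.println(first.get() + ", " + second.get());   // 1, 2
        }
    }
}
```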

Logging Web API calls MVC 4

I need to log every call to my Web API to the database.
Of course, I don't want to hit the database on every call.
So let's say I have a dictionary or a hash table object in my cache,
and every 10,000 records I go to the database.
I still don't want every 10,000th user to have to wait for this operation.
And I can't start a separate thread for long operations, since the application pool
can be recycled at basically any time.
What is the best solution for this scenario?
Thanks
I would argue that your view of durability is rather inconsistent. Your cache of 10,000 objects could also be lost at any time due to an app pool recycle or server crash.
But to the original question of how to perform a large operation without causing the user to wait:
Put constraints on app pool recycling and deal with the potential data loss.
Periodically dump the cached messages to a Windows service for further processing. This is still not 100% guaranteed to preserve data, e.g. the service/server could crash.
Use a message queue (MSMQ), possibly with WCF. A message queue can persist to disk, so this can be considered reasonably reliable.
Message Queuing (MSMQ) technology enables applications running at different times to communicate across heterogeneous networks and systems that may be temporarily offline. Applications send messages to queues and read messages from queues.
Message Queuing provides guaranteed message delivery, efficient routing, security, and priority-based messaging. It can be used to implement solutions to both asynchronous and synchronous scenarios requiring high performance.
Taking this a step further...
Depending on your requirements and/or environment, you could probably eliminate your cache and write all messages immediately (and rapidly) to a message queue, without worrying about performance loss or a large write operation.
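To make the buffered-write idea concrete, here is a minimal sketch of the cache-and-flush approach from the question (Java, to match the other examples here; LogStore is a hypothetical stand-in for the database or MSMQ sender, and, as noted above, entries still in the queue are lost on a recycle):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Request threads enqueue and return immediately; a single background
// worker batches the entries and does one bulk write per batch.
class BatchingLogger {
    interface LogStore { void writeAll(List<String> entries); }

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    BatchingLogger(int batchSize, LogStore store) {
        Thread worker = new Thread(() -> {
            List<String> batch = new ArrayList<>(batchSize);
            try {
                while (true) {
                    batch.add(queue.take());                        // wait for one entry
                    queue.drainTo(batch, batchSize - batch.size()); // grab what else is queued
                    store.writeAll(batch);                          // bulk write, off the request path
                    batch.clear();
                }
            } catch (InterruptedException ignored) { }
        });
        worker.setDaemon(true);   // entries still queued are lost on recycle
        worker.start();
    }

    void log(String entry) {
        queue.offer(entry);       // never blocks the caller
    }
}
```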

How to best implement a blocking/waiting actor?

I'm fairly new to Akka and to writing concurrent applications, and I'm wondering: what is a good way to implement an actor that waits on a Redis list and, once an item becomes available, processes it or sends it to a different actor to process?
Would using the blocking command BRPOPLPUSH be better, or would a scheduler that asks the actor to poll Redis every second be a better way?
Also, on a normal system, how many of these actors can I spawn concurrently without consuming all the resources the system has to offer? How does one decide how many actors of each type an actor system should run on the system it is running on?
As a rule of thumb, you should never block inside receive. Each actor should rely only on CPU and never wait, sleep, or block on I/O. When these conditions are met, you can create even millions of actors working concurrently. Each actor is supposed to have a 600-650 byte memory footprint (see: Concurrency, Scalability & Fault-tolerance 2.0 with Akka Actors & STM).
Back to your main question. Unfortunately, there is no official Redis client "compatible" with the Akka philosophy, that is, completely asynchronous. What you need is a client that, instead of blocking, returns a Future object of some sort and lets you register a callback for when the result is available. There are such clients, e.g. for Perl and node.js.
However, I found the independent fyrie-redis project, which you might find useful. If you are bound to a synchronous client, the best you can do is either:
poll Redis periodically without blocking and inform some actor by sending it a message with the Redis reply, or
block inside an actor and understand the consequences.
See also
Redis client library recommendations for use from Scala
BRPOPLPUSH will block for a long time (up to the timeout you specify), so I would favour a scheduler instead, which still blocks, but for a shorter amount of time every second or so.
Whichever way you go, because you are blocking, you should read this section of the Akka docs which describes methods for working with blocking libraries.
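A sketch of the scheduler-driven variant, assuming the Akka classic Java API (2.6+) and Jedis; the list names "work" and "processing" are made up for the example:

```java
import akka.actor.AbstractActor;
import akka.actor.ActorRef;
import akka.actor.ActorSystem;
import akka.actor.Props;
import redis.clients.jedis.Jedis;
import java.time.Duration;

// A tick message makes the actor do one short, non-blocking check of the
// list instead of parking in BRPOPLPUSH.
public class PollingActor extends AbstractActor {
    private static final String POLL = "poll";
    private final Jedis jedis = new Jedis("localhost", 6379);

    @Override
    public void preStart() {
        getContext().getSystem().scheduler().scheduleWithFixedDelay(
            Duration.ofSeconds(1),    // initial delay
            Duration.ofSeconds(1),    // interval between ticks
            getSelf(), POLL, getContext().getDispatcher(), ActorRef.noSender());
    }

    @Override
    public Receive createReceive() {
        return receiveBuilder()
            .matchEquals(POLL, tick -> {
                // RPOPLPUSH (non-blocking) returns null when the list is empty
                String item = jedis.rpoplpush("work", "processing");
                if (item != null) process(item);
            })
            .build();
    }

    private void process(String item) {
        System.out.println("got " + item);
    }

    public static void main(String[] args) {
        ActorSystem system = ActorSystem.create("demo");
        system.actorOf(Props.create(PollingActor.class), "poller");
    }
}
```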
Do you have control over the code that inserts the item into Redis? If so, you could have that code send your Akka code a message (maybe over ActiveMQ, using the Akka Camel support) to notify it when the item has been inserted into Redis. This would be a more event-driven way of working and would save you from having to poll, or from blocking for very long periods of time.

When a queue should be used?

Suppose we were to implement a network application, such as a chat with a central server and several clients. We assume that all communication must go through the central server, which must pick up messages from some clients and forward them to the target clients, and so on.
Regardless of the technology used (sockets, web services, etc..), it is possible to think that there are some producer threads (that generate messages) and some consumer threads (that read messages).
For example, you could use a single queue for both incoming and outgoing messages, but with a single queue you couldn't receive and send messages simultaneously, because only one thread at a time can access the queue.
Perhaps it would be more appropriate to use two queues: for example, this article explains a way to manage a double queue so that producers and consumers can work almost simultaneously. This scenario may be fine if there is only one producer and one consumer, but if there are many clients:
How to make so that the central server can receive data simultaneously from multiple input streams?
How to make so that the central server can send data simultaneously to multiple output streams?
To resolve this problem, my idea is to use a double queue for each client: on the central server, each client connection would be associated with two queues, one for incoming messages from that client and one for outgoing messages addressed to that client. In this way the central server could send and receive data simultaneously on almost all the connections with the clients...
There are probably other ways to manage the queues... What are the parameters for determining how many queues are needed and how to organize them? Are there cases that do not need any queue at all?
To me, this idea of using a queue per client or multiple queues per client seems to miss the point. First of all, it is absolutely possible to build a queue which can be accessed simultaneously by 2 threads (one can be enqueueing an item while a different one is dequeueing another item). If you want to know how, post a specific question about that.
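To make the first point concrete, here is a small demonstration (Java's LinkedBlockingQueue internally uses separate locks for the head and the tail, so one thread can enqueue while another dequeues):

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class TwoSidedQueueDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();

        // The producer inserts at the tail while the consumer removes from
        // the head; the two operations can proceed at the same time.
        Thread producer = new Thread(() -> {
            for (int i = 0; i < 1_000; i++) queue.offer(i);
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 1_000; i++) queue.take();
            } catch (InterruptedException ignored) { }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
        System.out.println("remaining: " + queue.size());   // 0
    }
}
```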
Second, even if we assume that only 1 thread at a time can access a single queue, and even if we assume that the server will be receiving or sending data to/from all the clients simultaneously, it still doesn't follow that you need a different queue for each client. To avoid limiting system performance, you just need to allow enough concurrency to utilize all the server's CPUs. Even with a single, system-wide queue, if dequeueing/enqueueing messages is fast enough compared to the other work the server is doing, it might not be a bottleneck. (And with an efficient implementation, simply inserting an item or removing an item from a queue should be very fast. It's a very simple operation.) For that message queue to become the bottleneck limiting performance, either you would need a LOT of CPUs, or everything else the server was doing would have to be very fast. In that case, you could work out some scheme with 2 or 4 system-wide queues, to allow 2x or 4x more concurrency.
The whole idea of using work queues in a multi-threaded system is that they 1) allow multiple consumers to all grab work from a single location, so producers can "dump" whatever work they need done at that single location without worrying about which consumer will do it, and 2) function as a load-balancing mechanism for the consumers. (Additionally, a work queue can act as a "buffer" if producers temporarily generate work too fast for the consumers.) If you have a dedicated pair of producer-consumer threads for each client, it calls into question why you need to use queues at all. Why not just do a synchronous "pass off" from dedicated producer to corresponding dedicated consumer? Or, why not use a single thread per client which acts as both producer and consumer? Using queues in the way which you are proposing doesn't seem to really gain anything.
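As a sketch of that single, system-wide work queue: a thread pool is exactly this pattern, one shared queue with N consumers load-balancing the work among themselves (the task body here is a placeholder):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SharedWorkQueue {
    public static void main(String[] args) {
        // One shared queue, four consumers: producers "dump" work in a single
        // place and the pool balances it across the workers automatically.
        ExecutorService workers = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 20; i++) {
            final int msg = i;
            workers.submit(() ->
                System.out.println(Thread.currentThread().getName() + " handled " + msg));
        }
        workers.shutdown();
    }
}
```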
