RxJava Connection pool - asynchronous

The context : I want to break a module down into submodules, each with a single responsibility. After the breakdown, I would like to compose / orchestrate the services provided by each submodule to build aggregate objects.
My current solution : The composition uses RxJava. First I load a list of ObjectA. Then, inside a flatMap, I call multiple services asynchronously (using defer and Schedulers.io), and finally, still inside the flatMap, I use the zip operator to build a new object.
objectsA.flatMap(objectA -> {
    Observable<A> oa = loadAAsync();
    Observable<B> ob = loadBAsync();
    Observable<C> oc = loadCAsync();
    return Observable.zip(oa, ob, oc, (a, b, c) -> build());
});
The loadXAsync methods call a service that loads an object from the database.
My problem : This solution is currently slow: about 5 times slower than the original one (a single SQL request). I was a bit disappointed. In fact, the solution gets slower each time I add a new loading part: with only A it is quite good, adding B decreases performance, adding C decreases it further, and so on. My first feeling is that the connection pool used (HikariCP) cannot provide enough database connections and is becoming a bottleneck. Currently I have a pool size of 10. The HikariCP wiki says that is large enough.
My question : How do you handle this case? Is it normal for RxJava "experts"? Do you increase the pool size, or do you limit the number of threads?
Update 1
#divers : Perhaps I'm totally wrong about the connection pool. I will try to explain my thinking.
The calls are made asynchronously. Each new thread will pick a connection and make a request to the DB. Schedulers.io uses a thread pool that grows as needed. If hundreds of ObjectA are emitted, each one will make 5 SQL requests.
I think (I need to check this) that flatMap isn't blocking. So I will have a lot of threads and a lot of batches of 5 requests in parallel. E.g. objectA#1 is emitted -> 5 requests; objectA#2 is emitted before objectA#1 has finished, so 5 more requests run in parallel.
My pool size is fixed and contains 10 connections. That's why I thought my problem comes from here.
Update 2 : Here is the SO Question / code that I tried on my project
Update 3 : I configured my pool size to 50. When I monitored the Hikari MBean, I saw that, during a short period of time:
ActiveConnections = 50 (so all connections were in use)
IdleConnections = 0
ThreadsAwaitingConnection : the max value I saw was 210.
Please correct me :)
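For what it's worth, here is a minimal sketch of the "limit the number of threads" option from the question. It uses plain java.util.concurrent rather than RxJava so that it runs on its own: every blocking load goes through a fixed-size executor whose size matches the pool (10), so at most 10 loads, and therefore at most 10 connections, are in use at once and nothing queues up inside the pool. The names BoundedLoadSketch, load and buildAll are made up for illustration, and load() only sleeps instead of touching a database. In RxJava the same idea would, as far as I know, be expressed with Schedulers.from(executor) instead of Schedulers.io, or with the flatMap overload that takes a maximum concurrency.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class BoundedLoadSketch {
    // Assumption: matches the connection pool size mentioned in the question.
    static final int POOL_SIZE = 10;
    static final AtomicInteger inFlight = new AtomicInteger();
    static final AtomicInteger maxInFlight = new AtomicInteger();

    // Hypothetical stand-in for a blocking loadX() call that holds one connection.
    static String load(String what, int id) {
        int now = inFlight.incrementAndGet();
        maxInFlight.accumulateAndGet(now, Math::max); // record the peak concurrency
        try {
            Thread.sleep(5); // pretend to wait on the database
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            inFlight.decrementAndGet();
        }
        return what + id;
    }

    // Builds one aggregate per id; all loads share one bounded executor.
    static List<String> buildAll(int count) {
        ExecutorService dbExecutor = Executors.newFixedThreadPool(POOL_SIZE);
        try {
            List<CompletableFuture<String>> aggregates = IntStream.range(0, count)
                .mapToObj(id -> {
                    CompletableFuture<String> a =
                        CompletableFuture.supplyAsync(() -> load("A", id), dbExecutor);
                    CompletableFuture<String> b =
                        CompletableFuture.supplyAsync(() -> load("B", id), dbExecutor);
                    CompletableFuture<String> c =
                        CompletableFuture.supplyAsync(() -> load("C", id), dbExecutor);
                    // Rough equivalent of zip: combine the three results once all are ready.
                    return a.thenCombine(b, (x, y) -> x + "+" + y)
                            .thenCombine(c, (xy, z) -> xy + "+" + z);
                })
                .collect(Collectors.toList());
            return aggregates.stream().map(CompletableFuture::join).collect(Collectors.toList());
        } finally {
            dbExecutor.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> result = buildAll(100);
        System.out.println(result.size() + " aggregates, max concurrent loads = " + maxInFlight.get());
    }
}
```

With this shape the waiting happens in the executor's queue rather than in ThreadsAwaitingConnection. Whether that actually helps throughput here depends on how long each query takes, which the sketch does not model.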

Related

Making blocking http call in akka stream processing

I am new to Akka and still trying to understand the different Akka and streaming concepts. For a new feature I need to add an HTTP call to an already existing stream which works on an internal object. Something like this -
val step1Flow = Flow[SampleObject].filter(...) // filtering condition
val step2Flow = Flow[SampleObject].map(obj => {
  ...
  // business logic to update values in the obj
  ...
})
...
override val flowGraph: Flow[SampleObject, SampleObject, NotUsed] =
  bufferIn.via(Flow.fromGraph(GraphDSL.create() {
    implicit builder =>
      import GraphDSL.Implicits._
      ...
      val step1 = builder.add(step1Flow)
      val step2 = builder.add(step2Flow)
      val step3 = builder.add(step3Flow)
      ...
      source ~> step1 ~> step2 ~> step3 ~> merge
      ...
  }))
I need to add the new HTTP request flow (let's call it newFlow) after step1. All these flows have SampleObject as both Inlet and Outlet. Now my understanding is that newFlow would need to be blocking, because the outlet needs to be SampleObject only. For that I have used Await on the HTTP call's future. The code looks like this -
val responseFuture: Future[(Try[HttpResponse], SomeContext)] =
Source
.single(httpRequest -> context)
.via(Retry(retrySettings).join(clientFlow))
.runWith(Sink.head)
...
val (httpTry, passedAlongContext) = Await.result(responseFuture, 30.seconds)
// logic to process response and return SampleObject
Now this works fine, but I think there should be a better way to do this without using Await. I also think this would block the main thread until the request completes, which is going to affect the app throughput.
Could you please tell me whether the approach I used is correct or not? And how do I make use of some other thread pool to handle these blocking calls, so that my main thread pool is not affected?
This question seems very similar to mine but I do not understand it completely - connect Akka HTTP to Akka stream. Also, I can't change step2 or the later flows.
EDIT : Added some code details for the stream
I ended up using the approach mentioned in the question because I couldn't find anything better after looking around. Adding this step decreased the throughput of my application, as expected, but there are approaches that can be used to increase it. Check these awesome blogs by Colin Breck -
https://blog.colinbreck.com/maximizing-throughput-for-akka-streams/
https://blog.colinbreck.com/partitioning-akka-streams-to-maximize-throughput/
To summarize -
Use Asynchronous Boundaries for flows which are blocking.
Use Futures if possible and add callbacks to futures. There are several ways to do that.
Use Buffers. There are several types of buffers available, choose what suits your needs.
Other than these, you can use inbuilt flows like -
Use "Broadcast" to broadcast your events to multiple consumers.
Use "Partition" to partition your stream into multiple streams based on some condition.
Use "Balance" to partition your stream when there is no logical way to partition your events or they all could have different work loads.
You could use any one or multiple things from above options.

MDriven ECO_ID duplicates

We appear to have a problem with MDriven generating the same ECO_ID for multiple objects. For the most part it seems to happen in conjunction with unexpected process shutdowns and/or server shutdowns, but it does also happen during normal activity.
Our system consists of one ASP.NET application and one WinForms application. The ASP.NET app is set up in IIS to use a single worker process. We have a mixture of WebForms and MVC, including ApiControllers. We're using a rather old version of the ECO packages: 7.0.0.10021. We're on VS 2017, target framework is 4.7.1.
We have it configured to use 64 bit integers for object id:s. Database is Firebird. SQL configuration is set to use ReadCommitted transaction isolation.
As far as I can tell we have configured EcoSpaceStrategyHandler with EcoSpaceStrategyHandler.SessionStateMode.Never, which should mean that EcoSpaces are not reused at all, right? (Why would I even use EcoSpaceStrategyHandler in this case, instead of just creating EcoSpace normally with the new keyword?)
We have created MasterController : Controller and MasterApiController : ApiController classes that we use for all our controllers. These have a EcoSpace property that simply does this:
if (ecoSpace == null)
{
if (ecoSpaceStrategyHandler == null)
ecoSpaceStrategyHandler = new EcoSpaceStrategyHandler(
EcoSpaceStrategyHandler.SessionStateMode.Never,
typeof(DiamondsEcoSpace),
null,
false
);
ecoSpace = (DiamondsEcoSpace)ecoSpaceStrategyHandler.GetEcoSpace();
}
return ecoSpace;
I.e. if no strategy handler has been created, create one specifying no pooling and no session state persisting of eco spaces. Then, if no ecospace has been fetched, fetch one from the strategy handler. Return the ecospace. Is this an acceptable approach? Why would it be better than simply doing this:
if (ecoSpace == null)
ecoSpace = new DiamondsEcoSpace();
return ecoSpace;
In aspx we have a master page that has an EcoSpaceManager. It has been configured to use a pool but SessionStateMode is Never. It has EnableViewState set to true. Is this acceptable? Does it mean that EcoSpaces will be pooled but inactivated between round trips?
It is possible that we receive multiple incoming API calls in tight succession, so that one API call hasn't been completed before the next one comes in. I assume that this means that multiple instances of MasterApiController can execute simultaneously but in separate threads. There may of course also be MasterController instances executing MVC requests and also the WinForms app may be running some batch job or other.
But as far as I understand id reservation is made at the beginning of any UpdateDatabase call, in this way:
update "ECO_ID" set "BOLD_ID" = "BOLD_ID" + :N;
select "BOLD_ID" from "ECO_ID";
If the returned value is K, this will reserve N new id:s ranging from K - N to K - 1. Using ReadCommitted transactions everywhere should ensure that the update locks the id data row, forcing any concurrent save operations to wait, then fetches the update result without interference from other transactions, then commits. At that point any other pending save operation can proceed with its own id reservation. I fail to see how this could result in the same ID being used for multiple objects.
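To make the arithmetic concrete, here is a minimal Java sketch of the reservation as I understand it. This is not MDriven code: IdReservationSketch and reserve are hypothetical names, the field boldId stands in for the single BOLD_ID value in the ECO_ID table, and the synchronized keyword plays the role I am assuming the row lock plays in the database.

```java
class IdReservationSketch {
    // Stand-in for the single BOLD_ID value stored in the ECO_ID table.
    private long boldId;

    IdReservationSketch(long start) {
        this.boldId = start;
    }

    // Mirrors "update ECO_ID set BOLD_ID = BOLD_ID + N; select BOLD_ID from ECO_ID".
    // Returns {first, last} of the reserved range, i.e. K - N .. K - 1.
    // synchronized models the assumption that update and select are not interleaved.
    synchronized long[] reserve(int n) {
        boldId += n;            // the update
        long k = boldId;        // the select
        return new long[] { k - n, k - 1 };
    }

    public static void main(String[] args) {
        IdReservationSketch ids = new IdReservationSketch(1000);
        long[] first = ids.reserve(5);
        long[] second = ids.reserve(3);
        System.out.println(first[0] + ".." + first[1]);   // prints 1000..1004
        System.out.println(second[0] + ".." + second[1]); // prints 1005..1007
    }
}
```

As long as the update and the select really behave as one uninterrupted step, two reservations can never overlap, which is why the duplicates surprise me.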
I should note that it does seem like it sometimes produces id duplicates within one single UpdateDatabase, i.e. when saving a set of new related objects, some of them end up with the same id. I haven't really confirmed this though.
Any ideas what might be going on here? What should I look for?
The issue is most likely that you use ReadCommitted isolation.
This allows two systems to simultaneously start a transaction, read the current value, increase it by their batch size, and then save one after the other.
You must use Serializable isolation for key generation, i.e. only read things not currently in a write operation.
MDriven uses two settings for isolation level: UpdateIsolationLevel and FetchIsolationLevel.
Set your UpdateIsolationLevel to Serializable.

ASP.net Thread Safety Confusion

I have a very long running process in an ASP.net application that we desperately need to dramatically shorten. The process in question is charging a large number of credit cards. Currently it performs at about 1 charge per second. We need this to be more like 10 per second.
So we decided that utilizing multiple simultaneous threads would be one way to go. So we basically take this large list of orders to process, divide the list into ten lists and then spawn a new thread to process each of the ten lists simultaneously.
An additional complication of this process is that we need to report progress on this process, and not only to the user session that initiated the process, but to any user, in any session in the application. So for example, if I log in and start this process, I will see a progress bar. If after I initiate the process, and it is still running, another user logs in elsewhere and goes to this same page, they will also see the progress bar.
I did some research and thought that I could use Application variables to store the relevant bits of information required to report progress. The client polls the server on a regular basis whenever on this page to see if there are any threads running, and if so, it returns various statistics on the progress of the process back to the client.
It would seem that this approach does not work. A simple counter of the number of currently running threads does not work as expected. It seems that the so-called thread safety of the Application object is safe in that no two threads will be able to access the same variable simultaneously, but not safe in that if two threads both attempt to increment a variable, one of them will be able to increment it, and the other will not, and rather than queue up and increment it in turn, the second thread just moves on. I'm sure this is my thread safety ignorance shining through.
Another issue is that using Debug.Print or Debug.WriteLine seem to be the same kind of "thread-safe" as the Application object. As each thread starts, we use Debug.WriteLine to output the name and start time of the thread, and as it completes, we do the same thing to write that it completed. We consistently see ten threads start and four threads end in the debug window.
I don't think we need to use Application.Lock() and Application.Unlock(), but I have tried it both with and without those calls before and after every write operation, to no avail: the results are the same either way.
I have a ton of code, so I'm not sure exactly which parts to share, but here are some of the relevant parts:
This is how we create and start the threads:
For Each oBatch As List(Of Guid) In oOrderBatches
Dim t As New Threading.Thread(Sub() ProcessPaymentBatch(oBatch, clubrunid, oToken.UserID))
t.IsBackground = True
t.Start()
Next
Here is the sub that is started by each thread:
Private Sub ProcessPaymentBatch(oBatch As List(Of Guid), clubrunid As String, UserID As Guid)
ThreadsRunning(clubrunid) += 1
Try
Debug.Print("Thread Start")
For Each oID As Guid In oBatch
'Do a bunch of processing stuff…
Next
Finally
ThreadsRunning(clubrunid) -= 1
Debug.Print("Thread End")
End Try
End Sub
Finally, this is an example of one of the application variables that the threads attempt to access, but seems to be failing.
Private Const _THREADSRUNNING As String = "ThreadsRunningThisRun_"
Public Property ThreadsRunning(clubid As String) As Integer
Get
Dim sToken As String = _THREADSRUNNING & clubid
If Application(sToken) Is Nothing Then
ThreadsRunning(clubid) = 0
End If
Return Application(sToken)
End Get
Set(ByVal value As Integer)
Debug.Print(value)
Dim sToken As String = _THREADSRUNNING & clubid
Application.Lock()
Application(sToken) = value
Application.UnLock()
End Set
End Property
The Debug output from this property looks something like this:
Thread Start
1
Thread Start
Thread Start
1
1
4
Thread End
5
3
Thread Start
6
3
1
-1
Thread End
-2
-3
I can't understand why there would be a different number of "Thread Start" and "Thread End" debug statements, and I don't understand how the thread count could get to negative numbers. This is why I am confused by the thread safety of the Application and Debug objects.
Your help in this matter would be greatly appreciated!
Nevermind, I was just being an idiot. The problem had nothing to do with the Application or Debug objects not being thread safe, the problem was in my methodology (as was expected really).
To clarify, the issue was that we were locking the global variables in the application object when writing, but not when reading. We then tried also locking when reading, but still had the same problem. What we failed to realize was that when incrementing a value, you are getting the current value, adding onto that, then setting the new value. The lock needed to bridge all three of those operations, so it goes like this:
Lock
Get
Add
Set
Unlock
What we were doing previously was:
Lock
Get
Unlock
Add
Lock
Set
Unlock
Which allowed for multiple threads to Get and then Set the same values as one another, which explains all of the oddities we were seeing in the debug window.
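For anyone who wants to see the working sequence in code, here is a minimal sketch in Java rather than VB.NET, since the idea is language-independent. SharedCounterSketch and its methods are hypothetical names, and a HashMap guarded by a private lock object stands in for the Application object. The point is that add() holds the lock across the Get, the Add and the Set.

```java
import java.util.HashMap;
import java.util.Map;

class SharedCounterSketch {
    // Stand-in for the Application state bag.
    private final Map<String, Integer> store = new HashMap<>();
    private final Object lock = new Object();

    // Lock, Get, Add, Set, Unlock: the whole read-modify-write is one critical section.
    void add(String key, int delta) {
        synchronized (lock) {
            int current = store.getOrDefault(key, 0); // Get
            store.put(key, current + delta);          // Add + Set
        }
    }

    int get(String key) {
        synchronized (lock) {
            return store.getOrDefault(key, 0);
        }
    }

    // Starts several threads that each increment the same counter many times.
    static int run(int threads, int incrementsPerThread) {
        SharedCounterSketch counter = new SharedCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.add("ThreadsRunning", 1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return counter.get("ThreadsRunning");
    }

    public static void main(String[] args) {
        System.out.println(run(10, 10_000)); // prints 100000
    }
}
```

If add() instead released the lock between reading and writing, as in our original code, two threads could read the same value and one increment would be lost, so the total would usually come out below 100000.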

Adobe Air SQLite synchronous busy timeout / SQLite concurrent access / avoid busy loop

This is my first post here. I'm asking because I ran out of clues and was unable to find anything about this specific issue.
My question is: In Adobe AIR, is there a way to do a synchronous usleep() equivalent (delay execution of 200ms), alternatively is there a way to specify the SQLite busy timeout somewhere?
I have an AIR application which uses the database in synchronous mode because the code cannot cope with the need of events/callbacks in SQL queries.
The database is sometimes accessed from another application, so it is busy. Hence the execute() of a statement throws SQLError 3119 detail 2206. In this case the command shall be retried after a short delay.
As there is another application running on the computer I want to try to avoid busy waiting, however I'm stuck with it because of three things:
First, I was unable to find a way to give the SQLConnection a busy timeout value, like it is possible in C with the function sqlite3_busy_timeout()
Second, I was unable to find the equivalent of the C usleep() command in Adobe AIR / Actionscript.
Third, I am unable to use events/timers/callbacks etc. at this location. The SQL execute() must be synchronous because it is called from deeply nested classes and functions in a zillion places all around the application.
If the application could cope with events/callbacks while doing SQL I would use an asynchronous database anyway, so this problem cannot be solved using events. The retry must be done on the lowest level without using the AIR event processing facility.
The lowest level of code looks like:
private static function retried(fn:Function):void {
var loops:int = 0;
for (;;) {
try {
fn();
if (loops)
trace("database available again, "+loops+" loops");
return;
} catch (e:Error) {
if (e is SQLError && e.errorID==3119) {
if (!loops)
trace("database locked, retrying");
loops++;
// Braindead AIR does not provide a synchronous sleep
// so we busy loop here
continue;
}
trace(e.getStackTrace());
trace(e);
throw e;
}
}
}
One sample use of this function is:
protected static function begin(conn:SQLConnection):void {
retried(function():void{
conn.begin(SQLTransactionLockType.EXCLUSIVE);
});
}
Output of this code is something like:
database locked, retrying
database available again, 5100 loops
Read: The application loops over 500 times a second. I would like to reduce this to 5 loops somehow to reduce CPU load while waiting, because the App shall run on Laptops while on battery.
Thanks.
-Tino

create a queue of process in classic asp

Here is the problem:
There is a classic ASP app which calls lame.exe to encode MP3s many times per day,
and there is no control over how several users call lame.exe; in other words, there is no queue for that purpose.
So here is what I am thinking about:
//the code below is all pseudo-code
//process_flag, mp3 and processID all reside in a database
function addQ(string mp3)
    add a record to database
    and set process_flag to undone
    then goto checkQ
end function
function checkQ()
    if there is a process in queue list and process_flag is undone
        sort it by processID asc
        for each processID
            processQ(processID)
        end for
    end if
end function
function processQ(int processID)
    run lame.exe with the help of wscript.exe
    after doing the job set the process_flag to done
end function
So I just want to know: is there any better solution,
or are there any other approaches out there?
Regards.
Looks like a reasonable approach for classic asp.
Just make sure that in your checkQ function, you are only retrieving queue items that have the process_flag set to undone, or you might be trying to re-process the same items over and over.
Read this article for another approach using MSMQ - it starts by creating a new Public Queue, then sending messages to it from your ASP page. It also requires an additional executable to process queued items.
This is a perfect application for MSMQ. Let proven code handle the reliable messaging, concurrency control etc. so you can just focus on the application logic.
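The core idea shared by both approaches, one worker draining a queue so that only one encode runs at a time, can be sketched in a few lines. The sketch below is in Java rather than classic ASP, keeps the queue in memory instead of in a database or MSMQ, and only records the file name where the real code would launch lame.exe. EncodeQueueSketch, drain and demo are hypothetical names; addQ and processQ are borrowed from the pseudo-code in the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class EncodeQueueSketch {
    // A single worker thread: submitted jobs run one at a time, in submission order.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final List<String> done = Collections.synchronizedList(new ArrayList<>());

    // Called by any number of users; returns immediately after queuing the job.
    void addQ(String mp3) {
        worker.submit(() -> processQ(mp3));
    }

    // Hypothetical stand-in for running lame.exe and marking the job as done.
    private void processQ(String mp3) {
        done.add(mp3);
    }

    // Stops accepting jobs, waits for the queued ones, and returns them in completion order.
    List<String> drain() {
        worker.shutdown();
        try {
            worker.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new ArrayList<>(done);
    }

    static List<String> demo() {
        EncodeQueueSketch queue = new EncodeQueueSketch();
        queue.addQ("a.mp3");
        queue.addQ("b.mp3");
        queue.addQ("c.mp3");
        return queue.drain();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a.mp3, b.mp3, c.mp3]
    }
}
```

An in-memory queue like this loses pending jobs if the process restarts, which is the gap the database flag in the question or a durable queue such as MSMQ is meant to cover.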

Resources