EJB 3.1 asynchronous method and thread pool

I need to process about 250,000 documents per day with an EJB 3.1 asynchronous method in order to cope with an overall long-running task.
I do this to use more threads and process more documents concurrently. Here's an example in pseudo-code:
// this returns the roughly 250,000 documents to process for the day
List<Document> documentList = Persistence.listDocumentsToProcess();
for (Document currentDocument : documentList) {
    // this is the asynchronous call
    ejbInstance.processAsynchronously(currentDocument);
}
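For reference, the bean behind ejbInstance would look roughly like this; a minimal sketch assuming a stateless session bean (the class name is illustrative, Document is the type from the code above):

import javax.ejb.Asynchronous;
import javax.ejb.Stateless;

@Stateless
public class DocumentProcessorBean {

    // the container returns control to the caller immediately and later runs
    // this method on a thread from the EJB container's async thread pool
    @Asynchronous
    public void processAsynchronously(Document document) {
        // long-running processing of a single document goes here
    }
}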
Suppose I have a thread pool of size 10 and 4 processor cores; my questions are:
how many documents will the application server process SIMULTANEOUSLY?
what happens when all threads in the pool are processing documents and one more asynchronous call comes in? Will this work like a sort of JMS queue?
would I have any improvement adopting a JMS queue solution?
I work with Java EE 6 and WebSphere 8.5.5.2.

The default configuration for asynchronous EJB method calls is as follows (from the infocenter):
The EJB container work manager has the following thread pool settings:
Minimum number of threads = 1
Maximum number of threads = 5
Work request queue size = 0 work objects
Work request queue full action = Block
Remote Future object duration = 86400 seconds
So trying to answer your questions:
how many documents will the application server process SIMULTANEOUSLY? (assuming a thread pool of size 10)
This thread pool is shared by all EJB async calls, so first you need to assume that your application is the only one using them. Then you will potentially have 10 runnable instances that will be processed in parallel. Whether they actually run concurrently depends on the number of cores/threads available in the system, so you can't get an exact number (some cores/threads may be doing web work, for example, or other processes may be using the CPU).
what happens when all threads in the pool are processing documents and one more asynchronous call comes in?
It depends on the Work request queue size and Work request queue full action settings. If there are no available threads in the pool, requests will be queued until the queue size is reached. After that it depends on the action, which may be Block or Fail.
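For example, with the Fail action the caller has to be prepared for the rejection; a minimal sketch, assuming the container surfaces the rejection as an EJBException:

try {
    // same call as in the question
    ejbInstance.processAsynchronously(currentDocument);
} catch (javax.ejb.EJBException e) {
    // with the Fail action, the container rejects the submission once the
    // work request queue is full; decide here whether to retry, wait, or skip
}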
would I have any improvement adopting a JMS queue solution?
That depends on your needs. Here are some pros and cons of a JMS solution.
Pros:
Persistence - with JMS your asynchronous tasks can be persistent, so in case of a server failure you will not lose them, and they will be processed after restart or by another cluster member. The EJB async queue is held only in memory, so queued tasks are lost in case of failure.
Scalability - if you put tasks on a queue, they can be processed concurrently by many servers in the cluster, not limited to a single JVM.
Expiration and priorities - you can define different expiration times or priorities for your messages.
Cons:
More complex application - you will need to implement an MDB to process your tasks (see the sketch after this list).
More complex infrastructure - you will need a database to store the queues (the file system can be used for a single server, and a shared file system for clusters), or an external messaging solution like WebSphere MQ.
A bit lower performance for processing a single item and higher load on the server, as each message has to be serialized/deserialized to persistent storage.
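A minimal sketch of the MDB side, assuming the producer puts each document id on a queue as a text message (all names are illustrative):

import javax.ejb.ActivationConfigProperty;
import javax.ejb.MessageDriven;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageListener;
import javax.jms.TextMessage;

@MessageDriven(activationConfig = {
    @ActivationConfigProperty(propertyName = "destinationType",
                              propertyValue = "javax.jms.Queue")
})
public class DocumentProcessorMDB implements MessageListener {

    @Override
    public void onMessage(Message message) {
        try {
            // assumes the producer sent the document id as a TextMessage
            String documentId = ((TextMessage) message).getText();
            // load the document by id and process it here
        } catch (JMSException e) {
            // rethrowing rolls the delivery back so the message can be redelivered
            throw new RuntimeException(e);
        }
    }
}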

Related

Mule 4 Async vs VM Scope: which is preferred for processing a flow asynchronously?

From what I can briefly make out, both of them process the flow asynchronously, with the VM scope using more resources as it creates a new context with separate properties and variables. Is there any particular reason to choose one over the other if the goal is just to process the flow asynchronously?
Async is a scope that is executed immediately, in parallel with respect to the flow, if there are resources (i.e. threads) available. VM is a connector that implements an in-memory queue. I usually recommend the VM connector, because with Async, if there are no threads available, the execution can fail. With the VM connector the messages are queued until the flow that reads from the VM queue is able to take the next message. Note that if messages are queued faster than they are processed, you will eventually run out of memory or exceed the queue allocation, causing another error.
Always remember that threads are a limited resource. In Mule it is not possible to control the number of threads used, only the concurrency. Also keep in mind that threads are not free; they consume memory and CPU.
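The tradeoff is not Mule-specific; in plain Java it is roughly the difference between a thread pool with no queue and one backed by an in-memory queue. An illustrative sketch (not Mule API):

import java.util.concurrent.*;

public class AsyncVsQueueSketch {
    public static void main(String[] args) {
        Runnable slowTask = () -> {
            try { Thread.sleep(1000); } catch (InterruptedException ignored) { }
        };

        // Async-like: no buffering, so work is rejected when all threads are busy
        ExecutorService direct = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new SynchronousQueue<>());

        // VM-like: an in-memory queue holds work until a thread frees up
        ExecutorService queued = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(1000));

        for (int i = 0; i < 4; i++) {
            try {
                direct.execute(slowTask);
            } catch (RejectedExecutionException e) {
                System.out.println("no thread free, task " + i + " rejected");
            }
            queued.execute(slowTask); // queued instead of rejected
        }
        direct.shutdown();
        queued.shutdown();
    }
}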

In ASP.NET, how many worker threads will be tied up using the async technique?

An app makes 3 simultaneous HTTP requests to a web server. Using an asynchronous technique, how many worker threads will be tied up waiting for the data?
None would be tied up, because when you're doing asynchronous work, you're not always using a thread.
For example, if you make an async web service request, your client will not be using any threads between the “send” and the “receive”.
You unwind after the “send”, and the “receive” occurs on an I/O completion port, at which time your callback is invoked and you will then be using a thread again. (Note that in this scenario your callback is executed on an I/O thread; ASP.NET only uses worker threads, and only counts worker threads when it tallies the threads in use.)
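The same idea can be sketched in Java for illustration (java.net.http, Java 11+): the calling thread issues the request and is free until the response callback fires.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.concurrent.CompletableFuture;

public class AsyncRequestSketch {
    public static void main(String[] args) {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create("https://example.com")).build();

        // no thread waits between "send" and "receive"; a pool thread
        // runs the callback only once the response has arrived
        CompletableFuture<Void> pending = client
                .sendAsync(request, HttpResponse.BodyHandlers.ofString())
                .thenAccept(response -> System.out.println(response.statusCode()));

        // the calling thread is free to do other work here
        pending.join(); // only so this demo JVM does not exit early
    }
}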

Processing tasks in multiple processes with an application managing the pool of processes

Is there a standard name for this kind of design, and are there any existing frameworks in .NET to make use of?
Multiple Process.exe instances run on the server. A ProcessPoolManager is responsible for spawning these exes on a need basis. Client(s) send tasks to a queue; the PoolManager reads the task queue, has the tasks processed in the spawned process.exe instances, and puts the responses back on the queue. The client then gets an async response from the queue when one is available.
Do you know what kind of design this is, and how to achieve it with any existing frameworks?
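The described design is essentially a worker pool fed by request/response queues, often called the master/worker or competing-consumers pattern. A rough Java sketch of the shape, using in-process workers in place of separate executables (all names illustrative):

import java.util.concurrent.*;

public class PoolManagerSketch {
    public static void main(String[] args) throws Exception {
        BlockingQueue<String> taskQueue = new LinkedBlockingQueue<>();
        BlockingQueue<String> responseQueue = new LinkedBlockingQueue<>();
        ExecutorService workerPool = Executors.newFixedThreadPool(4); // stands in for the process pool

        // workers compete for tasks and publish responses
        for (int i = 0; i < 4; i++) {
            workerPool.submit(() -> {
                try {
                    while (true) {
                        String task = taskQueue.take();
                        responseQueue.put("processed: " + task);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        taskQueue.put("task-1");                  // a client enqueues a task
        System.out.println(responseQueue.take()); // and later picks up the response
        workerPool.shutdownNow();
    }
}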

SOA Suite 11g BPEL request handling is slow when many concurrent requests are sent

We have a composite containing one mediator with a sequential routing rule to a BPEL process behind it.
When a single request is sent to the composite, it is handled pretty fast (min = 600 ms, max = 2 s).
But when we send 60 concurrent requests, handling is much slower (min = 2 s, avg = 6 s, max = 25 s).
During the investigation we found out that:
datasource pools were not exhausted (SOA_INFRA)
CPUs on the SOA server and the database servers were nearly idle (5-10% usage)
there is a 15 s lag between when a request reaches the mediator and when it reaches BPEL.
It seems there is some other limited resource, e.g. a maximum number of BPEL instances running concurrently, but we are not able to find it or how to tune it.
How can we tune SOA 11g to serve concurrent requests faster?
Thanks!
By default, BPEL components are "async" in the sense that the message first gets persisted to the soainfra database and then gets invoked using dispatcher invoke threads (even for sync request/reply components).
See the following Oracle doc for changing the BPEL process to be truly sync and run in the existing thread: http://docs.oracle.com/cd/E23943_01/dev.1111/e10224/soa_transactions.htm#CHDBIDAA
See the following Oracle doc for increasing the number of dispatcher invoke threads if you prefer not to touch the BPEL transaction properties: http://docs.oracle.com/cd/E25054_01/core.1111/e10108/bpel.htm#BABBGEFA

ASP.NET, IIS/CLR threads & requests in relation to synchronous vs. asynchronous programming

I'm just trying to clear up some concepts here. If anyone is willing to share their expertise on this matter, it's greatly appreciated it.
The following is my understanding of how IIS works in relation to threads, please correct me if I'm wrong.
HTTP.sys
As I understand it, for IIS 6.0 (I'll leave IIS 7.0 aside for now), the web browser makes a request, which gets picked up by the HTTP.sys kernel driver. HTTP.sys hands it over to IIS 6.0's thread pool (an I/O thread?) and thereby frees itself up.
IIS 6.0 Thread/ThreadPool
The IIS 6.0 thread in turn hands the request over to ASP.NET, which returns a temporary HSE_STATUS_PENDING to IIS 6.0. That frees up the IIS 6.0 thread, and the request is then forwarded to a CLR thread.
CLR Thread/ThreadPool
When ASP.NET picks up a free thread from the CLR thread pool, it executes the request. If there are no available CLR threads, the request gets queued in the application-level queue (which has bad performance).
So based on the previous understanding, my questions are the following.
In synchronous mode, does that mean 1 request per 1 CLR thread?
If so, how many CONCURRENT requests can be served on 1 CPU? Or should I ask the reverse: how many CLR threads are allowed per CPU? Say 50 CLR threads are allowed; does that mean it is limited to serving 50 requests at any given time? Confused.
If I set "requestQueueLimit" in the "processModel" configuration to 5000, what does it really mean? You can queue up 5000 requests in the application queue? Isn't that really bad? Why would you ever set it so high, since the application queue has bad performance?
If you are programming an asynchronous page, where exactly in the above process does the benefit kick in?
I researched and saw that by default IIS 6.0's thread pool size is 256. If 5000 concurrent requests come in, they are handled by the 256 IIS 6.0 threads, and each of those threads hands off to CLR threads, whose default count I'm guessing is even lower. Isn't that in itself asynchronous? A bit confused. In addition, where and when does the bottleneck start to show up in synchronous mode? And in asynchronous mode? (Not sure if I'm making any sense; I'm just confused.)
What happens when the IIS thread pool (all 256 threads) is busy?
What happens when all CLR threads are busy? (I assume all requests are then queued in the application-level queue.)
What happens when the application queue exceeds the requestQueueLimit?
Thanks a lot for reading, greatly appreciate your expertise on this matter.
You're pretty spot-on with the handoff process to the CLR, but here's where things get interesting:
If every step of the request is CPU-bound/otherwise synchronous, yes: that request will suck up that thread for its lifetime.
However, if any part of the request processing calls out to anything asynchronous, or even anything I/O related outside of purely managed code (db connection, file read/write, etc.), it is possible, if not probable, that this will happen:
Request comes into CLR-land, picked up by thread A
Request calls out to filesystem
Under the hood, the transition to unmanaged code happens at some level, which results in an I/O completion port thread (different from a normal thread pool thread) being allocated in a callback-like manner.
Once that handoff occurs, Thread A returns to the thread pool, where it is able to service other requests.
Once the I/O task completes, execution is re-queued, and, since let's say Thread A is busy, Thread B picks up the request.
This sort of "fun" behavior is also called "Thread Agility", and is one reason to avoid using ANYTHING that is Thread Static in an ASP.NET application if you can.
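For illustration only, the same "continue on whichever thread is free" behavior can be sketched with Java's CompletableFuture (not ASP.NET code, but the agility is analogous):

import java.util.concurrent.CompletableFuture;

public class ThreadAgilitySketch {
    public static void main(String[] args) {
        System.out.println("started on: " + Thread.currentThread().getName());

        CompletableFuture
            .supplyAsync(() -> "io-result") // simulated I/O work on a pool thread
            .thenAccept(result ->
                // the continuation typically runs on a pool thread,
                // not the thread that kicked off the request
                System.out.println("continued on: " + Thread.currentThread().getName()))
            .join();
    }
}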
Now, to some of your questions:
The request queue limit is the number of requests that can be "in line" before requests start getting flat-out dropped. If you had, say, an exceptionally "bursty" application, where you may get a LOT of very short-lived requests, setting this high would prevent dropped requests, since they would bunch up in the queue but drain equally quickly.
Asynchronous handlers allow you to create the same "call me when you're done" type of behavior that the above scenario has. For example, if you needed to make a web service call, calling it synchronously via, say, an HttpWebRequest call would by default block until completion, locking up that thread until it was done. Calling the same service asynchronously (or via an asynchronous handler, any Begin/EndXXX pattern...) gives you some control over which thread actually gets tied up: your calling thread can continue performing actions until that web service returns, which might actually be after the request has completed.
One thing to note: there is but one ThreadPool - all non-I/O threads are pulled from there - so if you move everything to asynchronous processing, you may just bite yourself by exhausting your thread pool doing background work and not servicing requests.
