What's the relationship among enclave, thread and process?
Does SGX support multi-threading or multi-processing?
What will happen if I call "fork" to create a new process inside an enclave?
What's the relationship among enclave, thread and process?
An enclave can be considered part of a process. A process can add enclave pages to its memory. After initializing the enclave, the process can execute the enclave code by issuing EENTER [1]. When the enclave call returns, it returns execution to non-enclave (untrusted) memory via EEXIT.
A thread is one of possibly several threads of execution within a process.
Does SGX support multi-thread or multi-process?
You can't run multiple processes in the same enclave, but you can run multiple threads in the same enclave. Each thread must have its own Thread Control Structure (TCS), which is supported by SGX [2]. With the SGX2 extensions (which were not supported by any CPU at the time of writing) it is also possible to add and remove TCS pages after enclave initialization, thereby allowing the enclave to dynamically adjust the number of threads.
What will happen if I call "fork" to create a new process inside an enclave?
fork is a system call, and the instructions used to issue a system call (such as SYSCALL) are illegal inside an enclave, so the call will result in an exception [3].
Sources: The following chapters in https://software.intel.com/sites/default/files/managed/7c/f1/332831-sdm-vol-3d.pdf:
[1] 36.3 Enclave Life Cycle
[2] 38.8 TCS
[3] 38.6.1 Illegal Instructions
From what I understand, both of them process the flow asynchronously, with the VM scope using more resources because it creates a new context with separate properties and variables. Is there any other reason to choose one over the other if the goal is just to process the flow asynchronously?
Async is a scope that is executed immediately, in parallel with the flow, if there are resources (i.e. threads) available. VM is a connector that implements an in-memory queue. I usually recommend the VM connector, because Async can fail to execute if no threads are available. With the VM connector, messages are queued until the flow that reads from the VM queue is able to read the next message. Note that if messages are queued faster than they are processed, the application will eventually run out of memory or exceed the queue allocation, causing a different error.
Always remember that threads are a limited resource. In Mule it is not possible to control the number of threads used, only the concurrency. Also keep in mind that threads are not free: they consume memory and CPU.
I'm trying to understand the difference between SGX threads enabled by TCS and untrusted threading provided by SDK.
If I understand correctly, TCS enables multiple logical processors to enter the same enclave. Each logical processor will have its own TCS and hence its own entry point (the OENTRY field in the TCS). Each thread runs until an AEX happens or it reaches the end of the thread. However, these threads enabled by TCS have no way to synchronize with each other yet. At least, there is no SGX instruction for synchronization.
Then, on the other hand, the SGX SDK offers a set of Thread Synchronization Primitives, mainly mutex and condition variable. These primitives are not trusted since they're eventually served by OS.
My question is, are these Thread Synchronization Primitives meant to be used by TCS threads? If so, wouldn't this weaken security? The OS is able to play with scheduling as it wishes.
First, let us deal with your somewhat unclear terminology of
SGX threads enabled by TCS and untrusted threading provided by SDK.
Inside an enclave, only "trusted" threads can execute. There is no "untrusted" threading inside an enclave. Possibly, the following sentence in the SDK Guide [1] misled you:
Creating threads inside the enclave is not supported. Threads that run inside the enclave are created within the (untrusted) application.
The untrusted application has to set up the TCS pages (for more background on TCS see [2]). So how can the TCS set up by the untrusted application be the foundation for trusted threads inside the enclave? [2] gives the answer:
EENTER is only guaranteed to perform controlled jumps inside an enclave’s code if the contents of all the TCS pages are measured.
By measuring the TCS pages, the integrity of the threads (the TCS defines the allowed entry points) can be verified through enclave attestation. So only known-good execution paths can be executed within the enclave.
Second, let us look at the synchronization primitives.
The SDK does offer synchronization primitives, which you say are not to be trusted because they are eventually served by the OS. Let's look at the description of these primitives in [1]:
sgx_spin_lock() and unlock operate solely within the enclave (using atomic operations), with no need for OS interaction (no OCALL). Using a spinlock, you could yourself implement higher-level primitives.
sgx_thread_mutex_init() also does not make an OCALL. The mutex data structure is initialized within the enclave.
sgx_thread_mutex_lock() and unlock potentially perform OCALLs. However, since the mutex data is within the enclave, they can always enforce correctness of locking within the secure enclave.
Looking at the descriptions of the mutex functions, my guess is that the OCALLs serve to implement non-busy waiting outside the enclave. This is indeed handled by the OS, and susceptible to attacks. The OS may choose not to wake a thread waiting outside the enclave. But it can also choose to interrupt a thread running inside an enclave. SGX does not protect against DoS attacks (Denial of Service) from the (potentially compromised) OS.
To summarize, spin-locks (and by extension any higher-level synchronization) can be implemented securely inside an enclave. However, SGX does not protect against DoS attacks, and therefore you cannot assume that a thread will run. This also applies to locking primitives: a thread waiting on a mutex might not be awakened when the mutex is freed. Realizing this inherent limitation, the SDK designers chose to use (untrusted) OCALLs to efficiently implement some synchronization primitives (i.e. non-busy waiting).
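To make the spin-lock point concrete, here is a minimal sketch of a lock built only from an atomic compare-and-set, with no call into the OS while waiting. It is written in plain Java as an analogy and is not SGX SDK code; the class SpinLockSketch and the method runDemo are hypothetical names used only for illustration.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Analogy only: a spin lock that relies solely on an atomic flag,
// the same idea the answer attributes to sgx_spin_lock().
class SpinLockSketch {
    private final AtomicBoolean locked = new AtomicBoolean(false);

    void lock() {
        // Busy-wait until we flip the flag from false to true.
        while (!locked.compareAndSet(false, true)) {
            // Spin: no blocking call, so no help from the scheduler is requested.
        }
    }

    void unlock() {
        locked.set(false);
    }

    // Hypothetical demo: several threads increment a shared counter under the lock.
    static int runDemo(int threadCount, int incrementsPerThread) {
        SpinLockSketch lock = new SpinLockSketch();
        int[] counter = {0};
        Thread[] workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    lock.lock();
                    try {
                        counter[0]++;
                    } finally {
                        lock.unlock();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter[0];
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 1000));
    }
}
```

The lock stays correct no matter how the threads are scheduled, but a thread that is never scheduled never makes progress, which is exactly the DoS caveat described above.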
[1] SGX SDK Guide
[2] SGX Explained
qweruiop, regarding your question in the comment (my answer is too long for a comment):
I would still count that as a DoS attack: the OS, which manages the resources of enclaves, denies T access to the resource CPU processing time.
But I agree, you do have to design the other threads running in that enclave with the awareness that T might never run. The semantics are different from running threads on a platform you control. If you want to be absolutely sure that the condition variable is checked, you have to do so on a platform you control.
The sgx_status_t returned by each proxy function (e.g. when making an ECALL into an enclave) can be SGX_ERROR_OUT_OF_TCS. So the SDK should handle all threading for you: just make ECALLs from two different ("untrusted") threads A and B outside the enclave, and the execution flow should continue in two ("trusted") threads inside the enclave, each bound to a separate TCS (assuming 2 unused TCS are available).
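As a rough mental model of that behaviour (not SGX SDK code), the TCS pages can be thought of as a fixed pool of slots: a thread entering the enclave claims a free slot, and if none is free the call fails instead of waiting. The Java sketch below assumes that model; TcsPoolSketch, enter and exit are hypothetical names, and a false return stands in for SGX_ERROR_OUT_OF_TCS.

```java
import java.util.concurrent.Semaphore;

// Conceptual model only: a fixed number of TCS slots, fixed at enclave build time.
class TcsPoolSketch {
    private final Semaphore freeSlots;

    TcsPoolSketch(int tcsCount) {
        this.freeSlots = new Semaphore(tcsCount);
    }

    // Models an ECALL entering the enclave.
    // Returns false where the real proxy would report SGX_ERROR_OUT_OF_TCS.
    boolean enter() {
        return freeSlots.tryAcquire();
    }

    // Models the ECALL returning, which frees the slot for another thread.
    void exit() {
        freeSlots.release();
    }

    public static void main(String[] args) {
        TcsPoolSketch enclave = new TcsPoolSketch(2);
        System.out.println(enclave.enter()); // thread A gets a slot
        System.out.println(enclave.enter()); // thread B gets a slot
        System.out.println(enclave.enter()); // a third caller finds no free slot
    }
}
```

Under this model, two untrusted threads can be inside the enclave at once only if the enclave was built with at least two TCS.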
I need to process about 250,000 documents per day with an EJB 3.1 asynchronous method, because processing them all is a long-running task.
I do this to use more threads and process more documents concurrently. Here's an example in pseudo code:
// This returns about 250,000 documents per day.
List<Document> documentList = Persistence.listDocumentsToProcess();
for (Document currentDocument : documentList) {
    // This is the asynchronous call.
    ejbInstance.processAsynchronously(currentDocument);
}
Suppose I have a thread pool of size 10 and 4 processor cores. My questions are:
How many documents will the application server process SIMULTANEOUSLY?
What happens when all threads in the pool are processing documents and one more asynchronous call comes in? Will this work like a sort of JMS queue?
Would I gain any improvement by adopting a JMS queue solution?
I work with Java EE 6 and WebSphere 8.5.5.2
The default configuration for asynchronous EJB method calls is as follows (from the infocenter):
The EJB container work manager has the following thread pool settings:
Minimum number of threads = 1
Maximum number of threads = 5
Work request queue size = 0 work objects
Work request queue full action = Block
Remote Future object duration = 86400 seconds
So trying to answer your questions:
how many documents will the application server process SIMULTANEOUSLY? (assuming 10 size thread pool)
This thread pool is shared by all EJB async calls, so first you need to assume that your application is the only one using EJB async calls. Then you will potentially have 10 runnable instances that can be processed concurrently. Whether they actually run at the same time depends on the number of cores/threads available in the system, so you can't get an accurate number (some cores/threads may be doing web work, for example, or another process may be using the CPU).
what happens when all threads in the pool are processing documents and one more asynchronous call comes in?
It depends on the Work request queue size and Work request queue full action settings. If there are no available threads in the pool, requests are queued until the queue size is reached. After that it depends on the action, which can be Block or Fail.
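The interaction between pool size, queue size and the "full" action can be sketched with a plain Java SE ThreadPoolExecutor. This is only an analogy for the WebSphere work manager settings above, not its actual implementation; BoundedPoolSketch and countRejected are hypothetical names. AbortPolicy plays the role of the Fail action here.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedPoolSketch {
    // Submits `tasks` long-running tasks and returns how many were rejected
    // because all threads were busy and the queue was full.
    static int countRejected(int poolSize, int queueSize, int tasks) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueSize),
                new ThreadPoolExecutor.AbortPolicy()); // "Fail": throw when saturated
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // keep the worker busy until released
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++;
                }
            }
        } finally {
            release.countDown(); // let the accepted tasks finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        // 2 threads + 1 queue slot accept 3 tasks; the remaining 2 are rejected.
        System.out.println(countRejected(2, 1, 5));
    }
}
```

A Block-style action could be approximated with a custom RejectedExecutionHandler that waits for queue space instead of throwing, at the cost of stalling the caller.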
would I have any improvement adopting a JMS Queue solution
It depends on your needs. Here are some pros and cons of a JMS solution.
Pros:
Persistence - with JMS your asynchronous tasks can be persistent, so in case of a server failure you will not lose them; they will be processed after restart or by another cluster member. The EJB async queue is held only in memory, so tasks in the queue are lost in case of failure.
Scalability - if you put tasks on the queue, they can be processed concurrently by many servers in the cluster, not limited to a single JVM.
Expiration and priorities - you can define different expiration time or priorities for your messages.
Cons:
More complex application - you will need to implement an MDB to process your tasks.
More complex infrastructure - you will need a database to store the queues (the file system can be used for a single server, and a shared file system for clusters), or an external messaging solution like WebSphere MQ.
Slightly lower performance for processing a single item and higher load on the server, as each item has to be serialized/deserialized to persistent storage.
I've read through this page on some improvements that can be made for an ASP.NET application. However, I'm still not sure how maxWorkerThreads, maxIoThreads, and minFreeThreads function together.
From what I understand, the minFreeThreads setting specifies the minimum number of worker threads kept available for callbacks, etc. If I have a page that makes an asynchronous call to a web service, does this call use one of these free threads or does it use an I/O thread? When does an I/O thread get used and when are these free threads used?
I believe that I/O threads are consumed (and released) by I/O operations (System.IO namespace). I'm guessing that this would typically be disk reads/writes. This resource is a bit old, but I think most of the concepts have remained the same.
http://www.guidanceshare.com/wiki/ASP.NET_2.0_Performance_Guidelines_-_Threading
Looking at the processModel element in the Web.Config there are two attributes.
maxWorkerThreads="25"
maxIoThreads="25"
What is the difference between worker threads and I/O threads?
Fundamentally not a lot, it's all about how ASP.NET and IIS allocate I/O wait objects and manage the contention and latency of communicating over the network and transferring data.
I/O threads are set aside as such because they will be doing I/O (as the name implies) and may have to wait for "long" periods of time (hundreds of milliseconds). They also can be optimized and used differently to take advantage of I/O completion port functionality in the Windows kernel. A single I/O thread may be managing multiple completion ports to maintain throughput.
Windows has a lot of capabilities for dealing with I/O blocking whereas ASP.NET/.NET has a plain concept of "Thread". ASP.NET can optimize for I/O by using more of the unmanaged threading capabilities in the OS. You wouldn't want to do this all the time for every thread as you lose a lot of capabilities that .NET gives you which is why there is a distinction between how the threads are intended to be used.
Worker threads are threads upon which regular "work" or just plain code/processing happens. Worker threads are unlikely to block a lot or wait on anything and will be short running and therefore require more aggressive scheduling to maximize processing power and throughput.
[Edit]: I also found this link which is particularly relevant to this question:
http://blogs.msdn.com/ericeil/archive/2008/06/20/windows-i-o-threads-vs-managed-i-o-threads.aspx
Just to add on to chadmyers...
Seems like I/O threads were the old way ASP.NET serviced requests:
"Requests in IIS 5.0 are typically serviced over I/O threads, or threads performing asynchronous I/O because requests are dispatched to the worker process using asynchronous writes to a named pipe."
With IIS 6.0 this has changed:
"Thus all requests are now serviced by worker threads drawn from the CLR thread pool and never on I/O threads."
Source: http://msdn.microsoft.com/hi-in/magazine/cc164128(en-us).aspx