What is a call out table in Unix?

Can anybody tell me what a 'call out table' is in Unix? Maurice J. Bach gives an explanation in his book Design of the UNIX Operating System, but I'm having difficulty understanding the examples, especially the one explaining the reason for negative time-out fields. Why are software interrupts used there?
Thanks!

Interrupts stop the current code and start the execution of a high-priority handler; while this handler runs, nothing else can get the CPU. So if you need to do something complex, your interrupt handler would hang the whole system.
The solution: fill a data structure with all the necessary data, then store this data structure, together with a pointer to the handler, in the call out table. Some service (usually the clock handler) will eventually visit the table and execute the entries one by one in a standard context (i.e. one which doesn't block process switching).
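To make this concrete, here is a hedged C sketch of the idea (the names are illustrative, not the actual System V kernel declarations). Each entry stores its timeout as a delta relative to the previous entry, so the clock handler only ever has to touch the head of the list. This is also where Bach's negative time fields come from: the clock interrupt keeps decrementing the first entry on every tick, but the expired entries are only executed later by a lower-priority software interrupt, so if that software interrupt is delayed, further ticks drive the first field below zero.

struct callout {
    int    c_time;                 /* ticks remaining, stored relative
                                      to the previous entry           */
    char  *c_arg;                  /* argument handed to the function */
    void (*c_func)(char *);        /* function to "call out"          */
    struct callout *c_next;
};

/* Run on every clock tick, in high-priority interrupt context: */
void clock_tick(struct callout *head)
{
    if (head != NULL) {
        head->c_time--;
        if (head->c_time <= 0)
            raise_soft_interrupt();  /* hypothetical helper: schedules
                                        the lower-priority handler that
                                        actually walks the table and
                                        runs the expired entries      */
    }
}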

In System V unix, the kernel or device drivers could schedule some function to be run (or "called out") by the kernel at a later time. The kernel clock handler was in charge of making sure such registered call outs were executed. The call out table was the kernel data structure in which such registered "call outs" were stored.
I don't know to what end they were generally used.

Related

A MailboxProcessor that operates with a LIFO logic

I am learning about F# agents (MailboxProcessor).
I am dealing with a rather unconventional problem.
I have one agent (dataSource) which is a source of streaming data. The data has to be processed by an array of agents (dataProcessor). We can consider dataProcessor as some sort of tracking device.
Data may flow in faster than the speed with which the dataProcessor may be able to process its input.
It is OK to have some delay. However, I have to ensure that the agent stays on top of its work and does not get piled under obsolete observations.
I am exploring ways to deal with this problem.
The first idea is to implement a stack (LIFO) in dataSource. dataSource would send over the latest observation available when dataProcessor becomes available to receive and process the data. This solution may work, but it may get complicated, as dataProcessor may need to be blocked and re-activated, and to communicate its status to dataSource, leading to a two-way communication problem. This may boil down to a blocking queue in the producer-consumer problem, but I am not sure...
The second idea is to have dataProcessor take care of message sorting. In this architecture, dataSource will simply post updates to dataProcessor's queue. dataProcessor will use Scan to fetch the latest data available in its queue. This may be the way to go. However, I am not sure if, in the current design of MailboxProcessor, it is possible to clear a queue of messages, deleting the older, obsolete ones. Furthermore, here, it is written that:
Unfortunately, the TryScan function in the current version of F# is
broken in two ways. Firstly, the whole point is to specify a timeout
but the implementation does not actually honor it. Specifically,
irrelevant messages reset the timer. Secondly, as with the other Scan
function, the message queue is examined under a lock that prevents any
other threads from posting for the duration of the scan, which can be
an arbitrarily long time. Consequently, the TryScan function itself
tends to lock-up concurrent systems and can even introduce deadlocks
because the caller's code is evaluated inside the lock (e.g. posting
from the function argument to Scan or TryScan can deadlock the agent
when the code under the lock blocks waiting to acquire the lock it is
already under).
Having the latest observation bounced back may be a problem.
The author of this post, Jon Harrop, suggests that
I managed to architect around it and the resulting architecture was actually better. In essence, I eagerly Receive all messages and filter using my own local queue.
This idea is surely worth exploring but, before starting to play around with code, I would welcome some inputs on how I could structure my solution.
Thank you.
Sounds like you might need a destructive scan version of the mailbox processor. I implemented this with TPL Dataflow in a blog series that you might be interested in.
My blog is currently down for maintenance but I can point you to the posts in markdown format.
Part1
Part2
Part3
You can also check out the code on github
I also wrote about the issues with scan in my lurking horror post
Hope that helps...
tl;dr I would try this: take the Mailbox implementation from FSharp.Actor or Zach Bray's blog post, replace ConcurrentQueue with ConcurrentStack (plus add some bounded-capacity logic) and use this changed agent as a dispatcher to pass messages from dataSource to an army of dataProcessors implemented as ordinary MBPs or Actors.
tl;dr2 If workers are a scarce and slow resource and we need to process the message that is the latest at the moment a worker becomes ready, then it all boils down to an agent with a stack instead of a queue (with some bounded-capacity logic) plus a BlockingQueue of workers. The dispatcher dequeues a ready worker, then pops a message from the stack and sends this message to the worker. After the job is done, the worker enqueues itself back to the queue when it becomes ready (e.g. before let! msg = inbox.Receive()). The dispatcher's consumer thread then blocks until any worker is ready, while the producer thread keeps the bounded stack updated. (A bounded stack could be done with an array + offset + size inside a lock, as in the sketch just below; the implementations linked further down are more complex than needed.)
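To make that parenthetical concrete, here is a minimal, hedged sketch of such a bounded "latest-wins" stack - plain C with pthreads purely for illustration, since the .NET types (ConcurrentStack, BlockingCollection) discussed below are what you would actually use:

#include <pthread.h>

#define CAPACITY 64

/* Bounded stack as an array + offset + size inside a lock. When the
   stack is full, push discards the oldest message, so consumers always
   see the most recent work. Initialize lock with
   PTHREAD_MUTEX_INITIALIZER. */
typedef struct {
    void *items[CAPACITY];
    int bottom;                  /* index of the oldest message */
    int size;                    /* number of stored messages   */
    pthread_mutex_t lock;
} bounded_stack;

void bs_push(bounded_stack *s, void *msg)
{
    pthread_mutex_lock(&s->lock);
    if (s->size == CAPACITY)
        s->bottom = (s->bottom + 1) % CAPACITY;  /* drop the oldest */
    else
        s->size++;
    s->items[(s->bottom + s->size - 1) % CAPACITY] = msg;
    pthread_mutex_unlock(&s->lock);
}

/* Returns the newest message, or NULL if the stack is empty. */
void *bs_pop(bounded_stack *s)
{
    pthread_mutex_lock(&s->lock);
    void *msg = NULL;
    if (s->size > 0) {
        msg = s->items[(s->bottom + s->size - 1) % CAPACITY];
        s->size--;
    }
    pthread_mutex_unlock(&s->lock);
    return msg;
}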
Details
MailboxProcessor is designed to have only one consumer. This is even commented on in the source code of the MBP here (search for the word 'DRAGONS' :) )
If you post your data to an MBP, then only one thread can take it from the internal queue or stack.
In your particular use case I would use ConcurrentStack directly, or better, wrapped in a BlockingCollection:
It will allow many concurrent consumers
It is very fast and thread safe
BlockingCollection has a BoundedCapacity property that allows you to limit the size of the collection. It throws on Add, but you could catch the exception or use TryAdd. If A is the main stack and B is a standby, then TryAdd to A; on false, Add to B and swap the two with Interlocked.Exchange, then process the needed messages in A, clear it, and make it the new standby (or use three stacks if processing A could take longer than B takes to fill again). This way you do not block and do not lose any messages, but can discard unneeded ones in a controlled way.
BlockingCollection has methods like AddToAny/TakeFromAny, which work on arrays of BlockingCollections. This could help, e.g.:
dataSource produces messages to a BlockingCollection with ConcurrentStack implementation (BCCS)
another thread consumes messages from BCCS and sends them to an array of processing BCCSs. You said that there is a lot of data. You may sacrifice one thread to be blocking and dispatching your messages indefinitely
each processing agent has its own BCCS, or is implemented as an Agent/Actor/MBP to which the dispatcher posts messages. In your case you need to send a message to only one processorAgent, so you may store the processing agents in a circular buffer to always dispatch a message to the least recently used processor.
Something like this:
(data stream produces 'T)
|
[dispatcher's BCSC]
|
(a dispatcher thread consumes 'T and pushes to processors, manages capacity of BCCS and LRU queue)
| |
[processor1's BCCS/Actor/MBP] ... [processorN's BCCS/Actor/MBP]
| |
(process) (process)
Instead of ConcurrentStack, you may want to read about the heap data structure. If you need your latest messages by some property of the messages, e.g. a timestamp, rather than by the order in which they arrive on the stack (e.g. if there could be delays in transit and arrival order <> creation order), you can get the latest message by using a heap.
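For illustration, a minimal binary max-heap keyed by timestamp could look like this (a hedged C sketch; msg_t and its ts field are made up for the example):

#define HEAP_CAP 256

typedef struct { long ts; void *payload; } msg_t;   /* hypothetical message */
typedef struct { msg_t items[HEAP_CAP]; int size; } heap_t;

static void swap_msg(msg_t *a, msg_t *b) { msg_t t = *a; *a = *b; *b = t; }

/* Push: sift the new message up while it is newer than its parent. */
int heap_push(heap_t *h, msg_t m)
{
    if (h->size == HEAP_CAP) return -1;
    int i = h->size++;
    h->items[i] = m;
    while (i > 0 && h->items[(i - 1) / 2].ts < h->items[i].ts) {
        swap_msg(&h->items[(i - 1) / 2], &h->items[i]);
        i = (i - 1) / 2;
    }
    return 0;
}

/* Pop: remove and return the newest message (largest timestamp). */
int heap_pop(heap_t *h, msg_t *out)
{
    if (h->size == 0) return -1;
    *out = h->items[0];
    h->items[0] = h->items[--h->size];
    for (int i = 0;;) {
        int l = 2 * i + 1, r = 2 * i + 2, m = i;
        if (l < h->size && h->items[l].ts > h->items[m].ts) m = l;
        if (r < h->size && h->items[r].ts > h->items[m].ts) m = r;
        if (m == i) break;
        swap_msg(&h->items[i], &h->items[m]);
        i = m;
    }
    return 0;
}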
If you still need Agents semantics/API, you could read several sources in addition to Dave's links, and somehow adapt the implementation to multiple concurrent consumers:
An interesting article by Zach Bray on an efficient Actors implementation. There you do need to replace (under the comment // Might want to schedule this call on another thread.) the line execute true with a line like async { execute true } |> Async.Start, because otherwise the producing thread will also be the consuming thread - not good for a single fast producer. However, for a dispatcher like the one described above, this is exactly what is needed.
The FSharp.Actor (aka Fakka) development branch and the F# MBP source code (first link above) could be very useful for implementation details. The FSharp.Actor library has been frozen for several months, but there is some activity in the dev branch.
You should also not miss the discussion about Fakka in Google Groups in this context.
I have a somewhat similar use case, and for the last two days I have researched everything I could find on F# Agents/Actors. This answer is a kind of TODO for myself to try these ideas, half of which were born while writing it.
The simplest solution is to greedily eat all messages in the inbox when one arrives and discard all but the most recent. Easily done using TryReceive:
let rec readLatestLoop oldMsg =
    async { let! newMsg = inbox.TryReceive 0
            match newMsg with
            | None -> return oldMsg
            | Some newMsg -> return! readLatestLoop newMsg }
let readLatest() =
    async { let! msg = inbox.Receive()
            return! readLatestLoop msg }
When faced with the same problem, I architected a more sophisticated and efficient solution I called cancellable streaming, described in an F# Journal article here. The idea is to start processing messages and then cancel that processing if they are superseded. This significantly improves concurrency if significant processing is being done.

Asynchronous MPI with SysV shared memory

We have a large Fortran/MPI code base which makes use of System V shared memory segments on a node. We run on fat nodes with 32 processors but only 2 or 4 NICs, and relatively little memory per CPU; so the idea is that we set up a shared memory segment, on which each CPU performs its calculation (in its block of the SMP array). MPI is then used to handle inter-node communications, but only on the master in the SMP group. The procedure is double-buffered and has worked nicely for us.
The problem came when we decided to switch to asynchronous comms, for a bit of latency hiding. Since only a couple of CPUs on the node communicate over MPI, but all of the CPUs see the received array (via shared memory), a CPU doesn't know when the communicating CPU has finished, unless we enact some kind of barrier, and then why do asynchronous comms?
The ideal, hypothetical solution would be to put the request tags in an SMP segment and run mpi_request_get_status on the CPU which needs to know. Of course, the request tag is only registered on the communicating CPU, so it doesn't work! Another proposed possibility was to branch a thread off on the communicating thread and use it to run mpi_request_get_status in a loop, with the flag argument in a shared memory segment, so all the other images can see. Unfortunately, that's not an option either, since we are constrained not to use threading libraries.
The only viable option we've come up with seems to work, but feels like a dirty hack. We put an impossible value in the upper-bound address of the receive buffer; that way, once the mpi_irecv has completed, the value has changed, and hence every CPU knows when it can safely use the buffer. Is that OK? It seems it would only work reliably if the MPI implementation can be guaranteed to transfer data consecutively. That almost sounds convincing, since we've written this thing in Fortran and so our arrays are contiguous; I would imagine that the access would be contiguous as well.
Any thoughts?
Thanks,
Joly
Here's a pseudo-code template of the kind of thing I'm doing. Haven't got the code as a reference at home, so I hope I haven't forgotten anything crucial, but I'll make sure when I'm back to the office...
pseudo(array_arg1(:,:), array_arg2(:,:)...)
integer, parameter : num_buffers=2
Complex64bit, smp : buffer(:,:,num_buffers)
integer : prev_node, next_node
integer : send_tag(num_buffers), recv_tag(num_buffers)
integer : current, next
integer : num_nodes
boolean : do_comms
boolean, smp : safe(num_buffers)
boolean, smp : calc_complete(num_cores_on_node,num_buffers)
allocate_arrays(...)
work_out_neighbours(prev_node,next_node)
am_i_a_slave(do_comms)
setup_ipc(buffer,...)
setup_ipc(safe,...)
setup_ipc(calc_complete,...)
current = 1
next = mod(current,num_buffers)+1
safe=true
calc_complete=false
work_out_num_nodes_in_ring(num_nodes)
do i=1,num_nodes
  if(do_comms)
    check_all_tags_and_set_safe_flags(send_tag, recv_tag, safe) # just in case anything else has finished.
    check_tags_and_wait_if_need_be(current, send_tag, recv_tag)
    safe(current)=true
  else
    wait_until_true(safe(current))
  end if
  calc_complete(my_rank,current)=false
  calc_complete(my_rank,current)=calculate_stuff(array_arg1,array_arg2..., buffer(current), bounds_on_process)
  if(not calc_complete(my_rank,current)) error("fail!")
  if(do_comms)
    check_all_tags_and_set_safe(send_tag, recv_tag, safe)
    check_tags_and_wait_if_need_be(next, send_tag, recv_tag)
    recv(prev_node, buffer(next), recv_tag(next))
    safe(next)=false
    wait_until_true(all(calc_complete(:,current)))
    check_tags_and_wait_if_need_be(current, send_tag, recv_tag)
    send(next_node, buffer(current), send_tag(current))
    safe(current)=false
  end if
  work_out_new_bounds()
  current=next
  next=mod(next,num_buffers)+1
end do
end pseudo
So ideally, I would have liked to have run "check_all_tags_and_set_safe_flags" in a loop in another thread on the communicating process, or even better: do away with "safe flags" and make the handle to the sends / receives available on the slaves, then I could run: "check_tags_and_wait_if_need_be(current, send_tag, recv_tag)" (mpi_wait) before the calculation on the slaves instead of "wait_until_true(safe(current))".
"...unless we enact some kind of barrier, and then why do asynchronous comms?"
That sentence is a bit confused. The purpose of asynchronous communications is to overlap communications and computations, so that you can hopefully get some real work done while the communication is going on. But this means you now have two tasks occurring which eventually have to be synchronized, so there has to be something which blocks the tasks at the end of the first communications phase before they go on to the second computation phase (or whatever).
The question of what to do in this case to implement things nicely (it seems like what you've got now works but you're rightly concerned about the fragility of the result) depends on how you're doing the implementation. You use the word threads, but (a) you're using sysv shared memory segments, which you wouldn't need to do if you had threads, and (b) you're constrained not to be using threading libraries, so presumably you actually mean you're fork()ing processes after MPI_Init() or something?
I agree with Hristo that your best bet is almost certainly to use OpenMP for on-node distribution of computation, and would probably greatly simplify your code. It would help to know more about your constraint to not use threading libraries.
Another approach which would still avoid you having to "roll your own" process-based communication layer that you use in addition to MPI would be to have all the processes on the node be MPI processes, but create a few communicators - one to do the global communications, and one "local" communicator per node. Only a couple of processes per node would be a part of a communicator which actually does off-node communications, and the others do work on the shared memory segment. Then you could use MPI-based methods for synchronization (Wait, or Barrier) for the on-node synchronization. The upcoming MPI3 will actually have some explicit support for using local shared memory segments this way.
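As a sketch of that communicator setup (hedged: MPI_Comm_split_type with MPI_COMM_TYPE_SHARED is the MPI-3 feature alluded to above; on MPI-2 you would MPI_Comm_split on some node identifier instead):

#include <mpi.h>

static MPI_Comm node_comm;
static int node_rank;

void setup_node_comm(void)
{
    /* Group the ranks that can share memory, i.e. the ranks on one node. */
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED,
                        0, MPI_INFO_NULL, &node_comm);
    MPI_Comm_rank(node_comm, &node_rank);
}

void exchange(void)
{
    if (node_rank == 0) {
        /* only this rank does the off-node MPI_Isend/MPI_Irecv/MPI_Wait */
    }
    /* on-node synchronization replaces the hand-rolled "safe" flags */
    MPI_Barrier(node_comm);
}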
Finally, if you're absolutely bound and determined to keep doing things through what's essentially your own local-node-only IPC implementation --- since you're already using SysV shared memory segments, you might as well use SysV semaphores to do the synchronization. You're already using your own (somewhat delicate) semaphore-like mechanism to "flag" when the data is ready for computation; here you could use a more robust, already-written semaphore to let the non-MPI processes know when the data is ready for computation (and a similar mechanism to let the MPI process know when the others are done with the computation).
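For illustration, a hedged C sketch of such a semaphore handshake (key selection and error handling elided; with several compute-only processes the MPI process would post once per waiter):

#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

union semun { int val; };       /* the caller must define this union */

static int semid;

void init_ready_sem(key_t key)
{
    union semun arg = { 0 };                 /* start out "not ready" */
    semid = semget(key, 1, IPC_CREAT | 0600);
    semctl(semid, 0, SETVAL, arg);
}

/* MPI process: signal once MPI_Wait says the buffer is complete. */
void post_ready(void)
{
    struct sembuf up = { 0, +1, 0 };
    semop(semid, &up, 1);
}

/* Compute-only process: block until the buffer is ready. */
void wait_ready(void)
{
    struct sembuf down = { 0, -1, 0 };
    semop(semid, &down, 1);
}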

How to automatically update time in CICS

I have two questions first is the main one.
1. I was able to display the date in a CICS map, but what I need is for it to be ticking, i.e., the display should be updated every second.
2. I have a COBOL-DB2 program which automatically writes data from the database (DB2) to a file. I want this program to be called on a schedule, i.e., every hour, every two hours, or every day.
Thank you
You can do this, but you will need to modify the traditional pseudo-conversational approach. Instead of returning and waiting for a user event, you can start your tran after some number of seconds with your current commarea and quit. If a user event occurs in that time, you can cancel your start request; if it doesn't, you can refresh the screen timestamp and repeat.
It is kinda a pain just to get a timestamp refreshed. Doesn't make much sense to bother with unless you have a really good reason.
The DB2 stuff is plain easy. Start your tran using interval control, the same START AFTER() described above, and you can have it run hourly, or bihourly, or whatever.
I don't think that you need to modify your pseudo-conversational approach to achieve what you need. Just issue an EXEC CICS START command with a one-second delay (just do this once) for a small program that just issues a Send Map (or TC Write) to the terminal facility. Ideally, reserve a common area on the screen so all transactions can use a common program. At some point, when the updates are no longer required, CANCEL the START request. The way I see it, the timer update transaction will mix in nicely with your user-initiated transaction flow. If a user transaction is active when the start timer pops, the timer update program will just be delayed a little.
While this should work, you need to bear in mind that you might be driving 3,600 transactions per hour for each user. Is this feature really worth all that?
This is not possible in standard CICS using maps. The 3270 protocol does not lend itself to continually updating screens. The majority of automatic updating screens such as consoles and monitoring displays use native VTAM methods, building their own data streams.
It might be possible to do this using unformatted data, but I would not recommend it in CICS. Pseudo-conversational CICS does not have a program in control during screen display, and conversational programming is highly discouraged.
You can't really do this in CICS, which was designed for pseudo-interactive responses at best. It was designed for use on mainframes where your terminal was sent a whole page or screen; the program read the screen as received (it had some fields the user could update, and if you didn't change them the terminal did not send the data back); then the CICS transaction, having taken the part of the screen containing changes, sent the response back and quit.
This makes for very efficient data entry and inquiry programs. But realize: when the program has finished processing the screen, it has quit, it's gone, and it's not even in memory any more; all the resources have been reclaimed. This allows a company to run a mainframe with 300 terminals and maybe 10 megabytes of real memory, because when the program is waiting for you to respond, it's not using any resources at all. If there are 200 people running a data entry program, they are all running the same copy of a re-entrant program, and the only thing they're using is maybe 1K of writable storage per user for the part that has to read a screen or a file record and do some calculations. Think about that: 200 people are running the same program, and all of them, simultaneously, are using one module that uses 20K of memory for the application - and it's the same 20K for every single one of them - and 1K each of actual read/write data.
Think about that for a moment: the first user to start that data entry program uses 20K of memory for the application, plus 1K for the writable data. Each user after that who runs the program uses an additional 1K of memory, that's all. When they're sitting there looking at the terminal, all they might be using is 4 bytes in a table to tell the system there's a terminal connected. No other resources are used at all.
To be able to have a screen updated on a regular basis means that something has to keep running, which is not something CICS does very well. CICS is not intended to be used for interactive processing the way a PC does because you're actually running live on the PC.
EXEC CICS ASKTIME END-EXEC to update the timestamp.
EXEC CICS SEND MAP DATA ONLY END-EXEC to update the screen.
However, using the suggested
EXEC CICS START TRANSID ('name' | namefld)
DELAY (time)
END-EXEC.
is actually the better way.

How to write integration tests for systems that interact asynchronously

Assume that I have a function called PlaceOrder which, when called, inserts the order details into a local DB and puts a message (the order details) onto a TIBCO EMS queue.
Once the message is received, TIBCO BW will then invoke some other system (say, ExternalSystem) to pass on the order details.
Now, the way I wrote my integration tests is:
Call PlaceOrder
Sleep, and check the details exist in the local DB
Sleep, and check the details exist in ExternalSystem.
Is the above approach correct? The above test gives me confidence that the end-to-end integration is working, but is there a better way to test the above scenario?
The problem you describe is quite common, and your approach is a very typical solution.
The problem with this solution is that if the delay is too short, your tests may sometimes pass and sometimes fail, but if the delay is very long, then you're just wasting time waiting, and with many tests that can add a lot of delay. But unless you can get some signal to tell you the order arrived in the database, you just have to wait.
You can reduce the delay by doing lots of checks at short intervals. If your order is not there after a timeout, then you fail the test.
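For example, a sampling loop of that shape might look like the following (sketched in C only because the question does not name a language; condition stands in for a check such as "the order row exists in the DB"):

#include <stdbool.h>
#include <time.h>

/* Poll condition() every interval_ms until it holds or timeout_ms passes. */
bool wait_until(bool (*condition)(void), int timeout_ms, int interval_ms)
{
    struct timespec nap = { interval_ms / 1000,
                            (interval_ms % 1000) * 1000000L };
    for (int waited = 0; waited < timeout_ms; waited += interval_ms) {
        if (condition())
            return true;              /* observed the state we wanted */
        nanosleep(&nap, NULL);
    }
    return condition();               /* one last sample before failing */
}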
In "Growing Object-Oriented Software, Guided by Tests"*, there is a chapter on this very subject, so you might want to get a copy if you will be doing a lot of this sort of testing.
"There are two ways a test can observe the system: by sampling its observable
state or by listening for events that it sends out. Of these, sampling is
often the only option because many systems don’t send any monitoring
events. It’s quite common for a test to include both techniques to interact
with different “ends” of its system"
(*) http://my.safaribooksonline.com/book/software-engineering-and-development/software-testing/9780321574442

How to properly write a SIGPROF handler that invokes AsyncGetCallTrace?

I am writing a short and simple profiler (in C), which is intended to print out stack traces for threads in various Java clients at regular intervals. I have to use the undocumented function AsyncGetCallTrace instead of GetStackTrace to minimize intrusion and allow for stack traces regardless of thread state. The source code for the function can be found here: http://download.java.net/openjdk/jdk6/promoted/b20/openjdk-6-src-b20-21_jun_2010.tar.gz
in hotspot/src/share/vm/prims/forte.cpp. I found some man pages documenting JVMTI, signal handling, and timing, as well as a blog with details on how to set up the AsyncGetCallTrace call: http://jeremymanson.blogspot.com/2007/05/profiling-with-jvmtijvmpi-sigprof-and.html
What this blog is missing is the code to actually invoke the function within the signal handler (the author assumes the reader can do this on his/her own). I am asking for help in doing exactly this. I am not sure how and where to create the struct ASGCT_CallTrace (and the internal struct ASGCT_CallFrame), as defined in the aforementioned file forte.cpp. The struct ASGCT_CallTrace is one of the parameters passed to AsyncGetCallTrace, so I do need to create it, but I don't know how to obtain the correct values for its fields: JNIEnv *env_id, jint num_frames, and JVMPI_CallFrame *frames. Furthermore, I do not know what the third parameter passed to AsyncGetCallTrace (void* ucontext) is supposed to be.
The above problem is the main one I am having. However, other issues I am faced with include:
SIGPROF doesn't seem to be raised by the timer exactly at the specified intervals, but rather a bit less frequently. That is, if I set the timer to send a SIGPROF every second (1 sec, 0 usec), then in a 5 second run, I am getting fewer than 5 SIGPROF handler outputs (usually 1-3)
SIGPROF handler outputs do not appear at all during a Thread.sleep in the Java code. So, if a SIGPROF is to be sent every second, and I have Thread.sleep(5000);, I will not get any handler outputs during the execution of that code.
Any help would be appreciated. Additional details (as well as parts of code and sample outputs) will be posted upon request.
I finally got a positive result, but since little discussion was spawned here, my own answer will be brief.
The ASGCT_CallTrace structure (and the underlying ASGCT_CallFrame array) can simply be declared in the signal handler, thus existing only on the stack:
ASGCT_CallTrace trace;
JNIEnv *env;
global_VM_pointer->AttachCurrentThread((void **) &env, NULL);
trace.env_id = env;
trace.num_frames = 0;
ASGCT_CallFrame storage[25];
trace.frames = storage;
The following gets the uContext:
ucontext_t uContext;
getcontext(&uContext);
And then the call is just:
AsyncGetCallTrace(&trace, 25, &uContext);
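As an aside, if the handler is installed with sigaction and SA_SIGINFO, the kernel already hands the interrupted thread's context to the handler as its third argument, so that can be passed straight through instead of calling getcontext. A hedged sketch of the setup (the handler body is elided; note also that ITIMER_PROF counts CPU time consumed, which is consistent with seeing no handler output while the Java code sleeps):

#include <signal.h>
#include <sys/time.h>

static void sigprof_handler(int sig, siginfo_t *info, void *ucontext)
{
    /* build the ASGCT_CallTrace as shown above, then:
       AsyncGetCallTrace(&trace, 25, ucontext); */
}

int install_profiler(void)
{
    struct sigaction sa;
    sa.sa_sigaction = sigprof_handler;        /* SA_SIGINFO form */
    sa.sa_flags = SA_RESTART | SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGPROF, &sa, NULL) != 0)
        return -1;

    /* fire SIGPROF after each second of consumed CPU time */
    struct itimerval timer = { { 1, 0 }, { 1, 0 } };
    return setitimer(ITIMER_PROF, &timer, NULL);
}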
I am sure there are some other nuances that I had to take care of in the process, but I did not really document them. I am not sure I can disclose the full code I currently have, which successfully and asynchronously requests and obtains stack traces of any Java program at fixed intervals. But if someone is interested in or stuck on the same problem, I am now able to help (I think).
On the other two issues:
[1] If a thread is sleeping and a SIGPROF is generated, the thread handles that signal only after waking up. This is normal, since it is the thread's job to handle the signal.
[2] The timer imperfections do not seem to appear anymore. Perhaps I mis-measured.
