how can variables be accessed from both main() and the IRQ() functions? - microcontroller

I'm studying for a test next week and we're learning about microcontrollers. We just did a sample code with interrupts that told us the temperature in F and C when we pushed a button (the interrupt). How can the C and F values be accessed from both main() and the IRQ() functions?

The simplest way to share a variable between IRQ handlers and the main thread on any bare metal system:
- Make sure that the variable type is one which the CPU can atomically read and write.
- Make the variable global and declare it volatile, so that the compiler cannot optimize away accesses to the shared variable in the generated machine code.
- To read a value, use something like const atomic_type local_copy = shared_variable; and work with that local copy (see the sketch below). An expression like shared_variable * shared_variable might use different values for shared_variable.
- Make sure that only one IRQ handler (which must only run once at a time), or only the main thread, writes to the shared variable. All other parts of the code are only allowed to read the shared variable.
If the data you want to communicate between IRQ handler and main thread does not fit inside an atomic type, have fun with complex locking protocols.
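Here is a minimal sketch of that pattern in C, assuming a CPU on which a 16-bit load/store is atomic; the handler name and the read_sensor()/display() helpers are hypothetical placeholders for your hardware and board support code.

#include <stdint.h>

uint16_t read_sensor(void);             /* hypothetical: read the raw temperature */
void display(uint16_t c, uint16_t f);   /* hypothetical: show both values */

/* One writer (the IRQ handler), readers everywhere else. */
static volatile uint16_t temperature_c;
static volatile uint16_t temperature_f;

void TEMP_IRQHandler(void)              /* hypothetical IRQ handler name */
{
    uint16_t c = read_sensor();
    temperature_c = c;
    temperature_f = (uint16_t)(c * 9 / 5 + 32);
}

int main(void)
{
    for (;;) {
        /* take local copies so each shared variable is read exactly once */
        const uint16_t c = temperature_c;
        const uint16_t f = temperature_f;
        display(c, f);
    }
}

Note that while each variable is read atomically, the pair (C, F) can still be momentarily inconsistent if the interrupt fires between the two reads; if that matters, keep only one shared value and derive the other in main().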

Related

how are variables found and is it done in constant time

One thing I was recently thinking about is how a computer finds its variables. When we run a program, it creates multiple layers on the stack, one layer for every new scope it opens, and puts either the variable's value, or a pointer in the case of heap storage, in this scope. When the scope is done, it and all its variables are destroyed. But how does a computer know where its variables are? And which one to use if the same variable name happens to be present more than once?
How I imagine it, the computer searches the scope it is in like an array, and if it doesn't find the variable it follows the stack downwards like a linked list and searches the next scope like an array.
That leads to the assumption that a global variable is the slowest to use, since it has to traverse all the way back to the outermost scope. So it would have a computational cost of a * n (a = average number of variables per scope, n = number of scopes). If I now assume that my code is recursive and within the recursion it refers to a global variable (let's say I have defined the variable const PI = 3.1416 and I use it in every recursion), then it would traverse backwards again for every single call, and if my recursive function takes 1000 recursions, then it does that 1000 times.
But on the other hand, while learning about recursion, I have never heard that referring to variables that are not found inside the recursive scope is to be avoided if possible. Therefore I wonder if I am right with my thoughts. Can someone please shed some light on the issue?
You got it the other way around: scopes, frames, heaps don't make variables, variables make scopes, frames, heaps.
Both are a bit of a stretch actually but my point is to avoid focusing on the lifetime of a variable (that's what terms like heap and stack really mean) and instead take a look under the hood.
Memory is a form of storage where each cell is assigned a number; the cell is called a word and the number is called its address.
The set of addresses is called address space, an address space is usually a range of addresses or a union of ranges of addresses.
The compiler assumes the program data will be loaded at a specific address, say X, and that there is enough memory after X (i.e. X+1, X+2, X+3, ..., all exist) for all the data.
Variables are then laid out sequentially from X onward; it is the job of the compiler to keep the association between the address X+k and the variable instance.
Note that a variable may be instantiated more than once; calling a function twice and recursion are both examples of that.
In the first case, the two instances can share the same address X+k since they don't overlap in time (by the time the second instance is alive, the first is over).
In the second case, the two instances overlap in time and two addresses must be used.
So we see that it is the lifetime of a variable that affects how the mapping between the variable name and its address (a.k.a. the allocation of the variable) is done.
Two common strategies are:
A stack
We start from an address X+b and allocate new instances at successive addresses X+b+1, X+b+2, etc.
The current address (e.g. X+b+54) is stored somewhere (it is the stack pointer).
When we want to free a variable we set the stack pointer back (e.g. from X+b+54 to X+b+53).
We can see that it's impossible to free a variable that is not the last allocated.
This allows for a very fast allocation/deallocation and naturally fits the need of a function frame that holds the local variables: when a function is invoked the new variables are allocated, when it ends they are removed.
From what we noted above, we see that if f calls g (i.e. f is the parent of g) then the variables of f cannot be deallocated before those of g.
This again naturally fits the semantics of functions.
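If it helps to see the mechanism spelled out, here is a toy model of that bump-pointer stack in plain C; the array, names and sizes are made up purely for illustration (real stacks are managed by the compiler and CPU, not by user code).

#include <stdio.h>

static int memory[1024];       /* the region starting at "X+b" */
static int stack_pointer = 0;  /* index of the next free word */

static int alloc_word(void)    /* allocate one word: just bump the pointer */
{
    return stack_pointer++;
}

static void free_back_to(int old_sp)  /* only the most recently allocated words can go */
{
    stack_pointer = old_sp;
}

static void some_function(void)
{
    int frame = stack_pointer;        /* entering: remember where the frame starts */
    int local_a = alloc_word();       /* "addresses" of this call's local variables */
    int local_b = alloc_word();
    memory[local_a] = 1;
    memory[local_b] = 2;
    free_back_to(frame);              /* leaving: the whole frame disappears at once */
}

int main(void)
{
    some_function();
    some_function();                  /* a second call reuses the very same addresses */
    printf("stack pointer is back at %d\n", stack_pointer);
    return 0;
}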
The heap
This strategy dynamically allocates a variable instance at an address X+o.
The runtime reserves a block of addresses and manages their status (free, occupied), when asked, it can give a free address and mark it occupied.
This is useful to allocate an object whose size depends on the user input, for example.
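A familiar C example of this case, in case it helps: the size below comes from user input, so the block cannot be laid out on the stack or in the static area at compile time.

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int n;
    if (scanf("%d", &n) != 1 || n <= 0)
        return 1;
    double *values = malloc(n * sizeof *values);  /* the runtime hands out a free block */
    if (values == NULL)
        return 1;
    for (int i = 0; i < n; i++)
        values[i] = i * 0.5;
    printf("%f\n", values[n - 1]);
    free(values);                                 /* give the block back */
    return 0;
}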
The heap (static)
Some variables have the lifespan of the program but their size and number are known at compile time.
In this case, the compiler simply assigns each instance a unique address X+i.
They cannot be deallocated, they are loaded in memory in batch along with the program code and stay there until the program is unloaded.
I left out some details, like the fact that the stack more often than not grows from higher to lower addresses (so it can be put at the far edge of the memory) and that variables may occupy more than one address.
Some programming languages, especially interpreted ones, don't associate addresses to variable instances, instead, they keep a map between the variable name (properly qualified) and the variable value, this way the lifespan of a variable can be controlled in many particular ways (see Closure in Javascript).
Global variables are allocated in the static heap, only one instance is present (only one address).
Each recursive function that uses it always references directly to the sole instance because the unique address is known at compile time.
Local variables in a function are allocated in the stack and each invocation of a function (recursive or not) uses a new set of instances (the addresses don't need to be the same each time, but they could).
Simply put, there is no lookup: variables are allocated so that the code can access them once compiled (either relatively, on the stack, or absolutely, in the static heap).
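Tying this back to your PI example, here is a small C sketch (the function itself is made up for illustration): PI has exactly one statically allocated instance whose address is fixed when the program is built, so every one of the 1000 recursive calls refers to it directly, with no search through scopes; only the local parameter gets a fresh stack instance per call.

#include <stdio.h>

static const double PI = 3.1416;   /* one instance, one fixed address */

static double spiral_length(int depth)
{
    if (depth == 0)
        return 0.0;
    /* no lookup here: the compiler emitted a direct reference to PI's address */
    return PI / depth + spiral_length(depth - 1);
}

int main(void)
{
    printf("%f\n", spiral_length(1000));   /* 1000 recursions, same cost per access */
    return 0;
}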

Asterisk pre-emption and callers in a channel

I would like to have pre-emption calls in Asterisk. I think there is no Asterisk support for this feature, so I'm trying to implement it following an algorithm similar to the one shown in this thread: Asterisk - Pre-emption calls
So I'm having problems with this step:
check if B is in a call with a lower priority caller (ASTDB or REALTIME or a fastagi script).
I know how to check if B is in a call using, for example, the DEVICE_STATE(device) command, but I can't manage to find out who the other caller is in order to see their priority.
So, how can I know if a user is in a call and who the other caller inside this call is?
Thanks a lot.
You can read variables of any channel using
SHARED(varname[,channel])
-= Info about function 'SHARED' =-
[Synopsis]
Gets or sets the shared variable specified.
[Description]
Implements a shared variable area, in which you may share variables between
channels.
The variables used in this space are separate from the general namespace
of the channel and thus ${SHARED(foo)} and ${foo} represent two completely
different variables, despite sharing the same name.
Finally, realize that there is an inherent race between channels operating
at the same time, fiddling with each others' internal variables, which is
why this special variable namespace exists; it is to remind you that variables
in the SHARED namespace may change at any time, without warning. You should
therefore take special care to ensure that when using the SHARED namespace,
you retrieve the variable and store it in a regular channel variable before
using it in a set of calculations (or you might be surprised by the
result).
Be sure you have set the variables first.
You can store the name of the channel currently in the call in channel variables or in ASTDB using an in-call macro.
The general complexity of any solution like the one you want is above average; it needs a person with at least 1-2 years of extensive experience with Asterisk.

Why does System V shared memory have separate get and attach functions?

Using System V shared memory IPC requires calls to the following two functions:
int shmget(key_t key, size_t size, int shmflg);
void *shmat(int shmid, const void *shmaddr, int shmflg);
Why are they designed to be separate, instead of having a single function that accepts these arguments, performs both functions and simply returns the address?
We can consider files as an analogy. open on a string (the file path) gives us a file descriptor, and we use that to read/write from the file. We close the file descriptor when we're done. This design seems natural: we don't have to open with a string to get a descriptor and then separately attach to that descriptor.
As an example of what I have in mind, take a look at the FreeBSD sendmail shared memory implementation.
This kind of separation (shm_open and mmap) also exists with POSIX shared memory, but the reason was that mmap existed before shm_open was implemented and could be reused, and mmap requires a descriptor (source: UNIX Network Programming Vol. 2, R. Stevens, chapter 13, page 326).
Shared memory is probably one of the fastest ways of allowing IPC, since data need not be copied; the problem associated with it, though, is synchronizing access between multiple threads. You could do this using semaphores or record locks; on Unix we end up using the latter for shared memory even though they are not as efficient, because they are simple, the system cleans up well after them, and you don't need some of the bling that semaphores bring along.
Let's look into how these work to understand why they are implemented as such.
In comes the shmid_ds structure used by the Linux kernel (http://www.tldp.org/LDP/lpg/node68.html).
shm_nattch is the unsigned int counter of current attaches. shmget gets you a shm id and sets things like the ipc_perm permissions, the dates, the pid, atime/ctime, and the requested segment size (shm_segsz).
Next, shmctl kicks in and handles the IPC control operations (IPC_STAT, IPC_RMID, IPC_SET), like setting permissions, querying or removing the shm id for a segment, or even locking or unlocking it.
Once the segment is ready, shmat is used by a process to attach it to its address space, depending on the flag and address parameters. Once it attaches, the kernel increments shm_nattch. When detaching we call shmdt. Removal of the identifier and the associated data structure is not automated; some process has to do this by calling shmctl with IPC_RMID, subject to shm_perm.
As you can see this is all very similar to how one would use semaphores and the implementation makes sense.
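To make the sequence concrete, here is a minimal C sketch of the full get/attach/detach/remove cycle; the key, size and permissions are arbitrary example values and error handling is kept to a minimum.

#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
    key_t key = 0x1234;                                /* example key; ftok() is another option */
    int shmid = shmget(key, 4096, IPC_CREAT | 0600);   /* reserve the segment, get its id */
    if (shmid == -1) { perror("shmget"); return 1; }

    char *mem = shmat(shmid, NULL, 0);                 /* attach: shm_nattch goes up */
    if (mem == (void *) -1) { perror("shmat"); return 1; }

    strcpy(mem, "hello");                              /* use it like ordinary memory */

    shmdt(mem);                                        /* detach: shm_nattch goes down */
    shmctl(shmid, IPC_RMID, NULL);                     /* removal is explicit, never automatic */
    return 0;
}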
One possible reason I could think of is this:
(From the manpage of shmget)
After a fork(2) the child inherits the attached shared memory segments.
After an execve(2) all attached shared memory segments are detached from the process.
Upon _exit(2) all attached shared memory segments are detached from the process.
Well, technically attaching and detaching is basic reference counting on the shared memory segment that is reserved during shmget.
The functionality of allocating the shared memory segment (via shmget) and that of reference counting it (up via shmat, down via shmdt) are separate so that code can be reused during fork and exec.
If they were both packed into the same function, you would still need a separate function that just does the reference counting (to be invoked during fork/exec). So I think this design is simply to promote code reuse and avoid code duplication.

Proper way to inform OpenCL kernels of many memory objects?

In my OpenCL program, I am going to end up with 60+ global memory buffers that each kernel is going to need to be able to access. What's the recommended way to let each kernel know the location of each of these buffers?
The buffers themselves are stable throughout the life of the application -- that is, we will allocate the buffers at application start, call multiple kernels, then only deallocate the buffers at application end. Their contents, however, may change as the kernels read/write from them.
In CUDA, the way I did this was to create 60+ program scope global variables in my CUDA code. I would then, on the host, write the address of the device buffers I allocated into these global variables. Then kernels would simply use these global variables to find the buffer it needed to work with.
What would be the best way to do this in OpenCL? It seems that CL's global variables are a bit different from CUDA's, but I can't find a clear answer on whether my CUDA method will work, and if so, how to go about transferring the buffer pointers into global variables. If that won't work, what's the best way otherwise?
60 global variables sure is a lot! Are you sure there isn't a way to refactor your algorithm a bit to use smaller data chunks? Remember, each kernel should be a minimum work unit, not something colossal!
However, there is one possible solution. Assuming your 60 arrays are of known size, you could store them all into one big buffer, and then use offsets to access various parts of that large array. Here's a very simple example with three arrays:
A is 100 elements
B is 200 elements
C is 100 elements
big_array = A[0:100] B[0:200] C[0:100]
offsets = [0, 100, 300]
Then, you only need to pass big_array and offsets to your kernel, and you can access each array. For example:
A[50] = big_array[offsets[0] + 50]
B[20] = big_array[offsets[1] + 20]
C[0] = big_array[offsets[2] + 0]
I'm not sure how this would affect caching on your particular device, but my initial guess is "not well." This kind of array access is a little nasty, as well. I'm not sure if it would be valid, but you could start each of your kernels with some code that extracts each offset and adds it to a copy of the original pointer.
On the host side, in order to keep your arrays more accessible, you can use clCreateSubBuffer: http://www.khronos.org/registry/cl/sdk/1.2/docs/man/xhtml/clCreateSubBuffer.html which would also allow you to pass references to specific arrays without the offsets array.
I don't think this solution will be any better than passing the 60 kernel arguments, but depending on your OpenCL implementation's clSetKernelArg, it might be faster. It will certainly reduce the length of your argument list.
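If you do go the big-buffer-plus-offsets route, the kernel side could look roughly like this OpenCL C sketch; the kernel name, the fixed number of arrays and the element counts are only illustrative.

kernel void use_packed_arrays(global float *big_array, global const int *offsets)
{
    size_t gid = get_global_id(0);

    /* recover a "pointer per array" by adding each offset to the base pointer */
    global float *A = big_array + offsets[0];
    global float *B = big_array + offsets[1];
    global float *C = big_array + offsets[2];

    /* example use: C[i] = A[i] + B[i] over the first 100 elements */
    if (gid < 100)
        C[gid] = A[gid] + B[gid];
}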
You need to do two things. First, each kernel that uses each global memory buffer should declare an argument for each one, something like this:
kernel void awesome_parallel_stuff(global float* buf1, ..., global float* buf60)
so that each utilized buffer for that kernel is listed. And then, on the host side, you need to create each buffer and use clSetKernelArg to attach a given memory buffer to a given kernel argument before calling clEnqueueNDRangeKernel to get the party started.
Note that if the kernels will keep using the same buffer with each kernel execution, you only need to setup the kernel arguments one time. A common mistake I see, which can bleed host-side performance, is to repeatedly call clSetKernelArg in situations where it is completely unnecessary.
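For illustration, a rough host-side sketch of that "set the arguments once, enqueue many times" pattern; it assumes the kernel, queue and buffers have already been created, ignores error codes, and the function name is made up.

#include <CL/cl.h>

void run_many_times(cl_command_queue queue, cl_kernel kernel,
                    cl_mem *bufs, cl_uint nbufs,
                    size_t global_size, int iterations)
{
    /* attach each buffer to its kernel argument exactly once */
    for (cl_uint i = 0; i < nbufs; ++i)
        clSetKernelArg(kernel, i, sizeof(cl_mem), &bufs[i]);

    /* the arguments stay attached, so repeated launches need no further clSetKernelArg calls */
    for (int it = 0; it < iterations; ++it)
        clEnqueueNDRangeKernel(queue, kernel, 1, NULL,
                               &global_size, NULL, 0, NULL, NULL);

    clFinish(queue);   /* wait for the enqueued work to finish */
}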

Does MPI_Type_commit implicitly call a barrier on all the processes in MPI_COMM_WORLD?

In my code I define a new MPI user-defined data type.
I was wondering if an MPI_Barrier call must follow the MPI_Type_commit, or must be placed at the point where the first use of the new data type appears, so that all the processes acknowledge and agree on the definition of the new data type.
Thanks.
No - there's no communication within the MPI_Type commands, they're completely local. In particular, processes don't necessarily have to agree on the definition of a new type.
If rank 1 sends a new data type to rank 0, all they have to agree on is the amount of data, not the layout of the type. For instance, imagine rank 1 was sending all of its (say, 2D) local array to rank 0 - it might just choose to send an MPI_Type_contiguous of NX*NY floats. But rank 0 might be receiving this into a larger global array; it might choose to receive it into a subarray type of that global array. Even if those data types had the same names, they can describe different final layouts in memory, as long as the total amount of data is the same.
MPI datatypes are the private business of the process that creates them. They do not need to match; in fact it is possible and perfectly legal for a receiving process to use a type map that differs from that of the sending process (as long as it doesn't lead to memory corruption, of course). As such, there is no synchronization whatsoever when defining or committing a type.
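A small C/MPI sketch of the point both answers make; the sizes and tag are arbitrary, and note that no MPI_Barrier appears anywhere: the type is created, committed and used purely locally.

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    float buf[100];
    MPI_Datatype chunk;                           /* known only to this process */
    MPI_Type_contiguous(100, MPI_FLOAT, &chunk);
    MPI_Type_commit(&chunk);                      /* purely local, no communication */

    if (rank == 1) {
        for (int i = 0; i < 100; i++)
            buf[i] = (float)i;
        MPI_Send(buf, 1, chunk, 0, 0, MPI_COMM_WORLD);
    } else if (rank == 0) {
        /* the receiver may use a completely different type map,
           as long as the total amount of data matches */
        MPI_Recv(buf, 100, MPI_FLOAT, 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    MPI_Type_free(&chunk);
    MPI_Finalize();
    return 0;
}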

Resources