How to use boost::compute::atan2? - opencl

I would like to compute the phase of a complex number using boost::compute.
Here is my attempt; I expect the result to be equal to atan2(2.0f, 1.0f), the phase of 1 + 2i:
namespace bc = boost::compute;
bc::vector<std::complex<float>> vec{ {1.0f, 2.0f} };
bc::vector<float> result(1);
bc::transform(vec.begin(), vec.end(), result.begin(), bc::atan2<float>());
However, I get a compilation error claiming "Non-unary function invoked one argument".

boost::compute's atan2 would appear to be a binary function just like std::atan2.
I'm assuming you're trying to obtain the phase angle of your complex number? The standard C++ function for this would be std::arg() - I don't see this one being defined in boost::compute, though I might have missed it.
If arg() is indeed missing, you're quite right it's implemented via atan2 - you'll need to extract the imaginary (boost::compute::imag()) and real (boost::compute::real()) components first though, and pass them as individual arguments to atan2.
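To make the relationship between arg() and atan2 concrete, here is a minimal host-side sketch using only the C++ standard library, not Boost.Compute. The function names phase_via_atan2 and phase_via_arg are made up for illustration; the point is only the argument order, imaginary part first and real part second.

```cpp
#include <cmath>
#include <complex>

// Phase angle computed by hand: atan2(imaginary part, real part).
float phase_via_atan2(const std::complex<float>& z) {
    return std::atan2(z.imag(), z.real());
}

// Phase angle from the standard library, for comparison.
float phase_via_arg(const std::complex<float>& z) {
    return std::arg(z);
}
```

For z = 1 + 2i both functions should agree up to rounding, which is the value the device-side transform is expected to produce once the components are passed to atan2 separately.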

I think you can also use Boost.Compute's lambda expressions for this:
bc::vector<bc::float2_> input{ {1.0f, 2.0f}, {3.0f, 4.0f}, {5.0f, 6.0f} };
bc::vector<float> output(3);
using boost::compute::lambda::atan2;
using boost::compute::_1;
using boost::compute::lambda::get;
// queue is assumed to be an existing bc::command_queue
bc::transform(
input.begin(),
input.end(),
output.begin(),
atan2(get<1>(_1), get<0>(_1)), // atan2(imaginary, real)
queue
);
float2_ is basically a complex in Boost.Compute. You can also check test_lambda.cpp.

I found a way to make it work.
Stage 1: allocate two vectors:
bc::vector<std::complex<float>> vec{ {1.0f, 2.0f}, {3.0f, 4.0f}, {5.0f, 6.0f} };
bc::vector<float> result(3);
Stage 2: interpret the complex vector as a float buffer iterator.
buffer_iterator is quite useful when you have a strongly typed vector and would like to pass it to an algorithm as a different type.
auto beginf = bc::make_buffer_iterator<float>(vec.get_buffer(), 0);
auto endf = bc::make_buffer_iterator<float>(vec.get_buffer(), 6); // note: the end points to the final index + 1
Stage 3: define strided iterators so that the same buffer can supply both arguments of atan2. Each iterator walks the buffer in strides of 2 indices, so together they give atan2 interleaved access to the buffer:
auto begin_a = bc::make_strided_iterator(beginf + 1, 2); // access imaginary part
auto end_a = bc::make_strided_iterator_end(beginf + 1, endf , 2);
auto begin_b = bc::make_strided_iterator(beginf, 2); // access real part
Finally, call transform:
bc::transform(begin_a, end_a, begin_b, result.begin(), bc::atan2<float>()); // atan2(a, b), i.e. atan(a/b)
bc::system::default_queue().finish();
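For readers who want to see the interleaved access pattern without a device, here is a rough host-side sketch in plain C++ (no Boost.Compute involved). It relies on the standard guarantee that an array of std::complex<float> can be read as consecutive {real, imag} float pairs. The function name phases_from_interleaved is hypothetical.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Reads a complex vector as a flat float array and walks it with stride 2,
// mirroring what the two strided iterators do on the device buffer.
std::vector<float> phases_from_interleaved(const std::vector<std::complex<float>>& vec) {
    // std::complex<float> is laid out as {real, imag}, so this view is permitted.
    const float* f = reinterpret_cast<const float*>(vec.data());
    std::vector<float> result(vec.size());
    for (std::size_t i = 0; i < vec.size(); ++i) {
        const float real = f[2 * i];      // what begin_b would yield
        const float imag = f[2 * i + 1];  // what begin_a would yield
        result[i] = std::atan2(imag, real);
    }
    return result;
}
```

Each output element should match std::arg of the corresponding input, which is a convenient way to check the device result on small inputs.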

Related

How to define persistent matrix variable inside the Scilab function?

I have been developing a Scilab function where I need to have persistent variable of the matrix type. Based on my similar question I have chosen the same approach. Below is the code I have used for test of this approach.
function [u] = FuncXYZ(x)
global A;
global init;
if init == 0 then
init = 1;
A = eye(4, 4);
endif
u = A(1, 1);
endfunction
As soon as I integrated the function into my Xcos simulation, I was surprised to see "0" at the output of the scifunc_block_m.
Nevertheless, I have found that if I use the following command to return from the function
u = A(3, 3);
the function really returns the expected "1". Additionally, if I look at the Variable Browser in the top right corner of the Scilab window, I can't see the expected A 4x4 item. It seems that I am doing something wrong.
Can anybody give me advice on how to define a persistent variable of the matrix type inside a Scilab function?
Thanks in advance for any ideas.
Global variables are initialized with an empty matrix by default. Hence, you should detect the first call with isempty():
function [u] = FuncXYZ(x)
global A;
global init;
if isempty(init)
init = 1;
A = eye(4, 4);
end
u = A(1, 1);
endfunction
BTW, your code is incorrect, there is no endif in Scilab.

java 8: How to convert following code to functional?

Instead of using the for loop, how do I use the Stream API of Java 8 on array of booleans? How do I use methods such as forEach, reduce etc.?
I want to get rid of the two variables totalRelevant and retrieved which I am using to maintain state.
The difficulty is that a lambda expression can only reference final (or effectively final) variables from its lexical context.
import java.util.Arrays;
import java.util.List;
public class IRLab {
public static void main(String[] args) {
// predefined list of either document is relevant or not
List<Boolean> documentRelivency = Arrays.asList(true, false, true, true, false);
System.out.println("Precision\tRecall\tF-Measure");
// variables for output
double totalRelevant = 0.0;
double retrieved = 0.0;
for (int i = 0; i < documentRelivency.size(); ++i) {
Boolean isRelevant = documentRelivency.get(i);
// check if document is relevant
if (isRelevant) totalRelevant += 1;
// total number of retrieved documents will be equal to
// number of document being processed currently, i.e. retrieved = i + 1
retrieved += 1;
// storing values using formulas
double precision = totalRelevant / retrieved;
double recall = totalRelevant / totalRelevant;
double fmeasure = (2 * precision * recall) / (precision + recall);
// Printing the final calculated values
System.out.format("%9.2f\t%.2f\t%.2f\t\n", precision, recall, fmeasure);
}
}
}
How do I convert above code to functional code using the Java 8 Stream API and Lambda Expressions? I need to maintain state for two variables as above.
Generally, converting imperative to a functional code will only be an improvement when you manage to get rid of mutable state that causes the processing of one element to depend on the processing of the previous one.
There are workarounds that allow you to incorporate mutable state, but you should first try to find a different representation of your problem that works without it. In your example, the processing of each element depends on two values, totalRelevant and retrieved. The latter is just an ascending number and can therefore be represented as a range, e.g. IntStream.range(startValue, endValue). The former stems from your list of boolean values and is the number of true values inside the sublist (0, retrieved) (inclusive).
You could recalculate that value without needing the previous value, but reiterating the list in each step could turn out to be expensive. So instead, first collect your list into a single int number representing a bitset, i.e. [true, false, true, true, false] becomes 0b10110. Then you can get the number of one bits using intrinsic operations:
List<Boolean> documentRelivency = Arrays.asList(true, false, true, true, false);
int numBits=documentRelivency.size(), bitset=IntStream.range(0, numBits)
.map(i -> documentRelivency.get(i)? 1<<(numBits-i-1): 0).reduce(0, (i,j) -> i|j);
System.out.println("Precision\tRecall\tF-Measure");
IntStream.rangeClosed(1, numBits)
.mapToObj(retrieved -> {
double totalRelevant = Integer.bitCount(bitset&(-1<<(numBits-retrieved)));
return String.format("%9.2f\t%.2f\t%.2f",
totalRelevant/retrieved, 1f, 2/(1+retrieved/totalRelevant));
})
.forEach(System.out::println);
This way, you have expressed the entire operation in a functional way where the processing of one element does not depend on the previous one. It could even run in parallel, though this would offer no benefit here.
If the list size exceeds 32, you have to resort to long, or java.util.BitSet for more than 64.
But the whole operation is more an example of how to change the thinking from “this is a number I increment in each iteration” to “I’m processing a continuous range of values” and from “this is a number I increment when the element is true” to “this is the count of true values in a range of this list”.
It's unclear why you need to change your code to lambdas. Currently it's quite short, and lambdas will not make it shorter or cleaner. However, if you really want to, you may encapsulate your shared state in a separate object:
static class Stats {
private int totalRelevant, retrieved;
public void add(boolean relevant) {
if(relevant)
totalRelevant++;
retrieved++;
}
public double getPrecision() {
return ((double)totalRelevant) / retrieved;
}
public double getRecall() {
return 1.0; // ??? was totalRelevant/totalRelevant in original code
}
public double getFMeasure() {
double precision = getPrecision();
double recall = getRecall();
return (2 * precision * recall) / (precision + recall);
}
}
And use with lambda like this:
Stats stats = new Stats();
documentRelivency.forEach(relevant -> {
stats.add(relevant);
System.out.format("%9.2f\t%.2f\t%.2f\t\n", stats.getPrecision(),
stats.getRecall(), stats.getFMeasure());
});
This uses a lambda, but not the Stream API. Involving the Stream API for such a problem does not seem to be a very good idea, as you need to output the intermediate states of a mutable container which must be mutated strictly in the given order. Still, if you desperately need the Stream API, replace .forEach with .stream().forEachOrdered.

gcl_memcpy auto detection of pointer types

I have a trivial kernel running on OS X that returns a single int. The essential bits are:
cl_int d;
cl_int* dptr = &d;
void* dev_d = gcl_malloc(sizeof(cl_int),NULL,CL_MEM_WRITE_ONLY);
// ... stuff to setup dispatch queue
dispatch_sync(queue, ^{
// ... running the kernel stuff
gcl_memcpy((void*)&d, dev_d, sizeof(cl_int)); // this gives d==0
gcl_memcpy((void*)dptr, dev_d, sizeof(cl_int)); // this gives correct d
});
Question is, what is the difference between &d and dptr? I've always thought of them as essentially interchangeable, but gcl_memcpy seems to be making a distinction. Any ideas? I can obviously just use the dptr solution, but I'm still curious what's happening.
I don't think this has to do with the gcl_memcpy call specifically. I think it has to do with your GCD call.
When you call dispatch_sync, your block gets a copy of the variables you use in it. In fact, in similar situations, I get a warning from my compiler about using &d in the block, since it's probably a common mistake.
So in your main function you have a variable d at Address1 with value 0 and a variable dptr at Address2 with value Address1. In your dispatch block you have a variable d at Address3 with value 0 and a variable dptr at Address4 with value Address1. So when you write to &d within your dispatch block, you are putting the value in Address3 which you won't see outside of your dispatch block. When you write to dptr in your dispatch block, you are putting the value in Address1, which is what you expect.
Or to put it another way, your call to dispatch_sync is like calling a function defined like
void myfunction(cl_int d, cl_int* dptr).
If you're skeptical of my answer, I suggest you try this with a simple assignment instead of the gcl_memcpy call.
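The function-call analogy can be tried out directly. The following is only a sketch of that analogy in plain C++, not GCD or OpenCL code, and both function names are invented for illustration: one function receives d by value (like the block's captured copy), the other receives a copy of the pointer.

```cpp
// Mimics a block that captured d by value: the parameter is a private copy.
void write_to_captured_copy(int d, int value) {
    int* p = &d;   // address of the copy, not of the caller's variable
    *p = value;    // this write is lost when the function returns
}

// Mimics a block that captured the pointer dptr by value: the copied pointer
// still holds the address of the caller's variable.
void write_through_captured_pointer(int* dptr, int value) {
    *dptr = value; // this write reaches the caller's variable
}
```

Calling the first function leaves the caller's d unchanged, while calling the second with &d updates it, which matches the behavior seen with the two gcl_memcpy calls.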

How to store the contents of several arrays in a matrix in C

I am trying to store the contents of different vectors in a matrix.
The lengths of the vectors are different and they all contain strings. Let's say:
A=["MXAA", "MXBB", "MXCC"]
B=["JJJ", "LKLKLKL"]
so the new matrix should look like the following:
C= [MXAA, MXBB, MXCC;JJJ, LKLKLKL, 0]
Is there a way to do that in C?
Thanks
You would need to create an array of pointers to pointer to the element type (which in your case is a pointer to char).
The problem you need to consider is that every array is different size; so I suggest you store the size of the arrays, or you will quickly end up running over the bounds of an array. This sounds a bit like a custom type.
#include <stdlib.h> /* malloc */
typedef struct {
int n; /* number of strings in this row */
char **strArr; /* the strings of this row */
} stringArray;
stringArray *str2d;
/* one stringArray per row of the "matrix" */
str2d = (stringArray*) malloc(2*sizeof(stringArray));
str2d[0].n=3;
str2d[0].strArr = (char**)malloc(3*sizeof(char*));
str2d[0].strArr[0] = "MXAA";
str2d[0].strArr[1] = "MXBB";
str2d[0].strArr[2] = "MXCC";
str2d[1].n = 2;
str2d[1].strArr = (char**)malloc(2*sizeof(char*));
str2d[1].strArr[0] = "JJJ";
str2d[1].strArr[1] = "LKLKLKL";
If you want to access an element you use similar addressing - but check that you stay within bounds!
I deliberately did this in very explicit steps, hoping this makes the principle clear. There are better ways to do this but they are more obscure (or not "standard C")

Define a new mathematical function in TCL using Tcl_CreateMathFunc

I use TCL 8.4 and for that version I need to add a new mathematical function into TCL interpreter by using TCL library function, particularly Tcl_CreateMathFunc. But I could not find a single example of how it can be done. Please could you write for me a very simple example, assuming that in the C code you have a Tcl_Interp *interp to which you should add a math function (say, a function that multiplies two double numbers).
I once did some alternative implementations of random number generators for Tcl and you can look at some examples at the git repository. The files in generic implement both a tcl command and a tcl math function for each PRNG.
So, for instance, in the Mersenne Twister implementation, the package init function adds the new function to the interpreter by declaring
Tcl_CreateMathFunc(interp, "mt_rand", 0, (Tcl_ValueType *)NULL, RandProc, (ClientData)state);
This registers the C function RandProc for us. In this case the function takes no arguments, but the seeding equivalent (srand) shows how to handle a single parameter.
/*
* A Tcl math function that implements rand() using the Mersenne Twister
* Pseudo-random number generator.
*/
static int
RandProc(ClientData clientData, Tcl_Interp *interp, Tcl_Value *args, Tcl_Value *resultPtr)
{
State * state = (State *)clientData;
if (! (state->flags & Initialized)) {
unsigned long seed;
/* This is based upon the standard Tcl rand() initializer */
seed = time(NULL) + ((long)Tcl_GetCurrentThread()<<12);
InitState(state, seed);
}
resultPtr->type = TCL_DOUBLE;
resultPtr->doubleValue = RandomDouble(state);
return TCL_OK;
}
Be aware that this is an API that is very unlikely to survive indefinitely (for reasons such as its weird types, inflexible argument handling, and the inability to easily use it from Tcl itself). However, here's how to do an add(x,y) with both arguments being doubles:
Registration
Tcl_ValueType types[2] = { TCL_DOUBLE, TCL_DOUBLE };
Tcl_CreateMathFunc(interp, "add", 2, types, AddFunc, NULL);
Implementation
static int AddFunc(ClientData ignored, Tcl_Interp *interp,
Tcl_Value *args, Tcl_Value *resultPtr) {
double x = args[0].doubleValue;
double y = args[1].doubleValue;
resultPtr->doubleValue = x + y;
resultPtr->type = TCL_DOUBLE;
return TCL_OK;
}
Note that because this API is always working with a fixed number of arguments to the function (and argument type conversions are handled for you) then the code you write can be pretty short. (Writing it to be type-flexible with TCL_EITHER — only permissible in the registration/declaration — makes things quite a lot more complex, and you really are stuck with a fixed argument count.)
