do not understand result of opencl select statement


I have a simple kernel in OpenCL that has the following structure:
kernel void simple_select(global double *input, global double *output) {
    size_t i = get_global_id(0);
    printf("input %d\n", (int)(input[i] != 0.0));
    output[i] = select((float)0.0, (float)1.0, (int)(input[i] != 0.0));
    //output[i] = select((float)0.0, (float)1.0, 1);
}
Equivalently this can be:
kernel void simple_select(global double *input, global double *output) {
    size_t i = get_global_id(0);
    printf("input %d\n", (int)(input[i] != 0.0));
    output[i] = input[i] != 0.0 ? 1.0 : 0.0;
    //output[i] = 1 ? 1.0 : 0.0;
}
When I print to the command line, I see:
input 1
input 1
input 1
But the output array has all 0.0. However, if I uncomment the last line of the kernel and comment out the second-to-last-line (meaning if I use the scalar 1 in the select statement) then it works as expected and the output array has all 1.0. So what is the difference between these two lines that leads to two different results?

Here is the answer.
It's a quirk of OpenCL. The true/false values for scalars are 1/0 (as printf has shown you), but the true/false values for vectors are -1/0, and this is also what select() expects in its last argument (more precisely, it expects the most significant bit to be set, which is the case for any negative integer).
That said, I think the ternary operator on scalars should still work as expected; if it doesn't, I would consider it a bug.
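To make the most-significant-bit rule concrete, here is a minimal sketch in plain C++ that runs on the host. It is not OpenCL and does not call select(); the names select_component and to_vector_mask are hypothetical and only model the rule the answer describes for a single 32-bit component. Whether a particular OpenCL implementation applies this rule to scalar arguments is not something the sketch settles.

```cpp
#include <cstdint>

// Hypothetical host-side model of the MSB rule for ONE 32-bit component:
// the second value is chosen only when the most significant bit of c is set.
float select_component(float a, float b, std::int32_t c) {
    const bool msb_set = (static_cast<std::uint32_t>(c) & 0x80000000u) != 0;
    return msb_set ? b : a;
}

// Turn a scalar-style truth value (1 or 0) into a vector-style mask (-1 or 0).
std::int32_t to_vector_mask(bool condition) {
    return condition ? -1 : 0;
}
```

Under this model, a condition value of 1 selects the first value because its top bit is clear, while -1 (all bits set) selects the second. Converting the comparison result with something like to_vector_mask before passing it on gives the mask form.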

Related

Memoization code for "Longest Common Substring" doesn't work as expected

I was able to come up with a recursive solution for the problem "Longest Common Substring", but when I try to memoize it, it doesn't work as I expected and returns a wrong answer.
Here is the recursive code.
int lcs(string X, string Y,int i, int j, int count)
{
    if (i == 0 || j == 0)
        return count;
    if (X[i - 1] == Y[j - 1])
        count = lcs(X,Y,i - 1, j - 1, count + 1);
    count = max(count,max(lcs(X,Y,i, j-1, 0),lcs(X,Y,i - 1, j, 0)));
    return count;
}
int longestCommonSubstr(string S1, string S2, int n, int m)
{
    // The plain recursive version takes no dp table
    return lcs(S1,S2,n,m,0);
}
And here is the memoized code.
int lcs(string X, string Y,int i, int j, int count,vector<vector<vector<int>>>& dp)
{
    if (i == 0 || j == 0)
        return count;
    if(dp[i - 1][j - 1][count] != -1)
        return dp[i - 1][j - 1][count];
    if (X[i - 1] == Y[j - 1])
        count = lcs(X, Y, i - 1, j - 1, count + 1, dp);
    count = max(count,max(lcs(X,Y,i, j-1, 0,dp),lcs(X,Y,i - 1, j, 0,dp)));
    return dp[i-1][j-1][count]=count;
}
int longestCommonSubstr(string S1, string S2, int n, int m)
{
    int maxSize=max(n,m);
    vector<vector<vector<int>>> dp(n,vector<vector<int>>(m,vector<int>(maxSize,-1)));
    return lcs(S1,S2,n,m,0,dp);
}
I know that the problem can also be solved with a 2D DP vector, but my objective was to convert my original recursive solution into a memoized one, not to write a solution from scratch. Since three parameters change between calls, it should use a 3D DP table.
Can anyone figure out what's wrong, or help me with a 3D DP solution whose recursive code is the same as or similar to mine?
Note:
An interesting observation: the arguments of the max function are for some reason evaluated from left to right on my Mac and on Ubuntu running under Parallels, but from right to left on a Windows machine and in online compilers. I do not know the reason, but I would be happy to learn it. I'm running the code on an M1 Mac; I don't know whether the ARM compiler differs from the x86 Mac compiler.
Another thing, the memoized code gives different answers depending upon which recursive call is called first on the line,
count = max(count,max(lcs(X,Y,i, j-1, 0),lcs(X,Y,i - 1, j, 0)));
If I swap the positions of the two function calls, it gives the correct output, but only for that specific test case and probably similar ones.
This memoized solution also gets TLE on large test cases, and I do not know why.
I recently started studying DP and this is the only question which I wasn't able to solve by just modifying the original recursive solution. It has been two days and I just can't figure out the proper reasons.
Submission Link:- https://practice.geeksforgeeks.org/problems/longest-common-substring1452/1/#
Any help in this regard would be great.
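On the observation about max: in C++ the order in which the arguments of a function call are evaluated is unspecified, so different compilers may legitimately call the two recursive functions in either order. The sketch below only illustrates that point and one way to pin the order down; it is not a fix for the memoization itself, and the helper names (traced, max_unspecified_order, max_forced_order) are hypothetical.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: records the order in which it is called, then returns its value.
int traced(int value, std::vector<int>& order) {
    order.push_back(value);
    return value;
}

// The two arguments of std::max may be evaluated in either order,
// so `order` can end up as {1, 2} or {2, 1} depending on the compiler.
int max_unspecified_order(std::vector<int>& order) {
    return std::max(traced(1, order), traced(2, order));
}

// Storing each result in a local variable first fixes the order: 1, then 2.
int max_forced_order(std::vector<int>& order) {
    const int first = traced(1, order);
    const int second = traced(2, order);
    return std::max(first, second);
}
```

If the result of a computation depends on which call runs first, as described above, that usually means the calls share state (here, the dp table) in a way that makes the cached values depend on the call history.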

How does "runif" function work internally in R?

I am trying to generate a set of uniformly distributed numbers in R. I know that we can use the function "runif" in R to do that. But I really want to understand the idea behind how this function was developed, that is, how the code for "runif" works. So, in a nutshell, I want to create my own function that does the same task as "runif".

Ultimately, runif calls a pseudorandom number generator. One of the simpler ones is defined in C within the R code base and should be straightforward to emulate:
static unsigned int I1=1234, I2=5678;
void set_seed(unsigned int i1, unsigned int i2)
{
    I1 = i1; I2 = i2;
}
void get_seed(unsigned int *i1, unsigned int *i2)
{
    *i1 = I1; *i2 = I2;
}
double unif_rand(void)
{
    I1= 36969*(I1 & 0177777) + (I1>>16);
    I2= 18000*(I2 & 0177777) + (I2>>16);
    return ((I1 << 16)^(I2 & 0177777)) * 2.328306437080797e-10; /* in [0,1) */
}
So effectively this takes the initial integer seed values, shuffles them bitwise, then recasts them as double precision floating point numbers via multiplying by a small constant that normalises the doubles into the [0, 1) range.
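As a rough sketch of how one might emulate this, the C++ below ports the generator above and adds a runif-like wrapper. The names MulticarryGen and my_runif are hypothetical, and the wrapper's (n, lo, hi) parameters are an assumption modelled on how runif is usually called; this is not R's actual implementation of runif.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Port of the two-seed generator shown above (struct name is hypothetical).
struct MulticarryGen {
    std::uint32_t i1 = 1234;  // same default seeds as the C snippet
    std::uint32_t i2 = 5678;

    // Returns the next pseudorandom value, scaled into the unit interval.
    double next() {
        i1 = 36969u * (i1 & 0xFFFFu) + (i1 >> 16);  // 0xFFFF == 0177777 octal
        i2 = 18000u * (i2 & 0xFFFFu) + (i2 >> 16);
        const std::uint32_t bits = (i1 << 16) ^ (i2 & 0xFFFFu);
        return bits * 2.328306437080797e-10;
    }
};

// Hypothetical runif-like helper: n values stretched from the unit interval to [lo, hi].
std::vector<double> my_runif(std::size_t n, double lo, double hi, MulticarryGen& gen) {
    std::vector<double> out;
    out.reserve(n);
    for (std::size_t k = 0; k < n; ++k) {
        out.push_back(lo + (hi - lo) * gen.next());
    }
    return out;
}
```

Two generators started from the same seeds produce the same sequence, which is what makes results reproducible after setting a seed.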

How to find a pair of numbers in a list given a specific range?

The problem is as follows:
Given an array of N numbers, find two numbers in the array whose range (max - min) equals K.
For example:
input:
5 3
25 9 1 6 8
output:
9 6
So far, what I've tried is first sorting the array and then finding two complementary numbers using a nested loop. However, because this is a brute-force method, I don't think it is as efficient as other possible approaches.
import java.util.*;
public class Main {
    public static void main(String[] args) {
        Scanner sc = new Scanner(System.in);
        int n = sc.nextInt(), k = sc.nextInt();
        int[] arr = new int[n];
        for(int i = 0; i < n; i++) {
            arr[i] = sc.nextInt();
        }
        Arrays.sort(arr);
        int count = 0;
        // Initialized so the code compiles even when no pair is found
        int a = 0, b = 0;
        for(int i = 0; i < n; i++) {
            for(int j = i; j < n; j++) {
                if(Math.max(arr[i], arr[j]) - Math.min(arr[i], arr[j]) == k) {
                    a = arr[i];
                    b = arr[j];
                }
            }
        }
        System.out.println(a + " " + b);
    }
}
Much appreciated if the solution was in code (any language).

Here is code in Python 3 that solves your problem. This should be easy to understand, even if you do not know Python.
This routine uses your idea of sorting the array, but I use two variables left and right (which define two places in the array) where each makes just one pass through the array. So other than the sort, the time efficiency of my code is O(N). The sort makes the entire routine O(N log N). This is better than your code, which is O(N^2).
I never use the inputted value of N, since Python can easily handle the actual size of the array. I add a sentinel value to the end of the array to make the inner short loops simpler and quicker. This involves another pass through the array to calculate the sentinel value, but this adds little to the running time. It is possible to reduce the number of array accesses, at the cost of a few more lines of code--I'll leave that to you. I added input prompts to aid my testing--you can remove those to make my results closer to what you seem to want. My code prints the larger of the two numbers first, then the smaller, which matches your sample output. But you may have wanted the order of the two numbers to match the order in the original, un-sorted array--if that is the case, I'll let you handle that as well (I see multiple ways to do that).
# Get input
N, K = [int(s) for s in input('Input N and K: ').split()]
arr = [int(s) for s in input('Input the array: ').split()]
arr.sort()
sentinel = max(arr) + K + 2
arr.append(sentinel)
left = right = 0
while arr[right] < sentinel:
    # Move the right index until the difference is too large
    while arr[right] - arr[left] < K:
        right += 1
    # Move the left index until the difference is too small
    while arr[right] - arr[left] > K:
        left += 1
    # Check if we are done
    if arr[right] - arr[left] == K:
        print(arr[right], arr[left])
        break

Increment (++) a dereferenced pointer in a macro --> result is +2 instead of +1

I have the following code:
#include <stdio.h>
#define MIN(x, y) ((x) <= (y) ? (x) : (y))
int main ()
{
    int x=5, y=0, least;
    int *p;
    p = &y;
    least = MIN((*p)++, x);
    printf("y=%d", y);
    printf("\nleast=%d", least);
    return 0;
}
I would expect the following result:
y=1
least=1
but instead y=2.
Can somebody explain why y is now 2 and not 1? I suppose it is because of some double increment, but I do not understand the mechanism behind it.
Thanks.

Preprocessor macros work by text substitution. So your line:
least = MIN((*p)++, x);
gets expanded to
least = (((*p)++) <= (x) ? ((*p)++) : (x));
The double-increment is clear.

It's because you are using a macro. Since you pass the dereferenced pointer together with the increment as the macro's first argument, the preprocessor pastes that whole expression into each place where the first parameter, x, shows up in the macro body. Since x shows up twice there (once in the comparison and once in the selected branch), the increment happens two times.
If you do the increment before you call the macro, y should only be 1.
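One common way to avoid the double evaluation is to use a function instead of a macro, since a function evaluates each argument exactly once before the call. The sketch below contrasts the two; MIN_MACRO, min_fn, with_macro and with_function are hypothetical names, and each helper returns the pair {y, least} for the same setup as in the question.

```cpp
#include <utility>

#define MIN_MACRO(x, y) ((x) <= (y) ? (x) : (y))

// An inline function evaluates each argument exactly once before the call.
inline int min_fn(int x, int y) { return x <= y ? x : y; }

// Returns {y, least} after calling the macro with (*p)++, as in the question.
std::pair<int, int> with_macro() {
    int x = 5, y = 0;
    int* p = &y;
    int least = MIN_MACRO((*p)++, x);  // expands to two (*p)++ expressions
    return {y, least};
}

// Returns {y, least} after calling the function with the same argument.
std::pair<int, int> with_function() {
    int x = 5, y = 0;
    int* p = &y;
    int least = min_fn((*p)++, x);  // (*p)++ is evaluated once
    return {y, least};
}
```

With the macro, y ends up as 2 and least as 1. With the function, y is 1 and least is 0, because the post-increment passes the old value. Passing ++(*p) to the function instead would give y=1 and least=1.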

How can I use `vector <unsigned int*> vec;` properly

I am new in C++ and I want to use vector <unsigned int*> vec;
I try this code:
vector <unsigned int*> vec;
unsigned int* tmpV= new unsigned int[4];
for(unsigned int i=0; i<4;i++){
    tmpV[i]=i;
}
vec.push_back(tmpV);
unsigned int* tmpV2=vec.at(0);
cout<<"A) tmpV2[1]: "<<tmpV2[1] <<endl;
cout<<"vec.size(): "<<vec.size()<<endl;
for(unsigned int i=0; i<4;i++){
    tmpV[i]=i+4;
}
vec.push_back(tmpV);
tmpV2=vec.at(0);
cout<<"vec.size(): "<<vec.size()<<endl;
cout<<"B) tmpV2[1]: "<<tmpV2[1]<<endl;
The problem here is that I wanted it to output the same value for A) and B),
but it outputs
A) tmpV2[1]: 1
B) tmpV2[1]: 5
I want to be able to handle different elements in this vector of pointers.
I can roughly understand why this is going on but I couldn't find a solution.
Have in mind that I don't want to use: vector < vector <unsigned int> >

This happens because you changed the values that the pointer stored at index 0 of the vector vec points to. Both push_back calls store the same pointer, tmpV, so vec[0] and vec[1] refer to the same array.
If you print both elements again at the end with
cout<<"B) tmpV2[1]: "<<tmpV2[1]<<endl;
cout<<"B) vec[1][1]: "<<vec[1][1]<<endl;
both will show the same result.
What you have done so far is:
You have a vector of unsigned int pointers.
You allocated an array, initialized it, and pushed its pointer into the vector.
You took a temporary pointer, tmpV2, to the element at index 0 of the vector.
Using this pointer, you printed the value at index 1 of the array.
After that, you overwrote all the values in that array (each is now 4 larger).
Then you pushed the same pointer again and printed the value at index 1 once more.
Both prints read the same memory; the only difference is that you printed the value, modified it, and printed it again. If you print both vec[1][1] and tmpV2[1] at the end, you will find that they are the same.
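One way to get independent elements while still using a vector of raw pointers is to allocate a separate array for every push_back. This is a minimal sketch under the assumption that each entry should own its own 4-element array; make_block, build_blocks and free_blocks are hypothetical helper names, not anything from the question.

```cpp
#include <vector>

// Allocates a fresh 4-element array filled with start, start+1, start+2, start+3.
unsigned int* make_block(unsigned int start) {
    unsigned int* block = new unsigned int[4];
    for (unsigned int i = 0; i < 4; i++) {
        block[i] = start + i;
    }
    return block;
}

// Each push_back stores a pointer to a different array, so the entries do not alias.
std::vector<unsigned int*> build_blocks() {
    std::vector<unsigned int*> vec;
    vec.push_back(make_block(0));  // 0 1 2 3
    vec.push_back(make_block(4));  // 4 5 6 7
    return vec;
}

// Raw pointers are not freed automatically, so release every array explicitly.
void free_blocks(std::vector<unsigned int*>& vec) {
    for (unsigned int* block : vec) {
        delete[] block;
    }
    vec.clear();
}
```

With this layout, vec[0][1] stays 1 after the second array is filled, and vec[1][1] is 5. The explicit free_blocks call is the price of storing raw pointers.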
