3D Arrays in OpenCL

I am new to OpenCL programming and my input is a 3D array. I am calculating the index as:
int gidX = get_global_id(0)?1:get_global_id(0);
int gidY = get_global_id(1)?1:get_global_id(1);
int gidZ = get_global_id(2)?1:get_global_id(2);
int index = gidX + (gidY*SizeX) + (gidZ*SizeY*SizeZ);
Is this the right way to do it? How do I use the local thread IDs with 3D arrays? I had used them with a 2D array as:
int tid = get_local_id(0);
int gid = get_global_id(0);
int index = tid + gid*width;
And, is there a way I could use image3d_t type for my 3D volume?
Thanks,
Sayan

What you seem to need is some basic information about the functionality and working principles of OpenCL. Please have a look at the following links:
http://developer.download.nvidia.com/compute/cuda/3_2_prod/toolkit/docs/OpenCL_Programming_Guide.pdf
http://www.nvidia.com/object/cuda_opencl_new.html
http://developer.download.nvidia.com/compute/cuda/3_0/sdk/website/OpenCL/website/samples.html
Your code samples for getting gidX, gidY and gidZ do not make much sense, and the calculation of the index is wrong, too. The calculation depends on the ordering of your 3D matrix; it should look something like:
int index = x + y * sizeX + z * sizeX * sizeY;
But you should check the documentation first; in particular, the working principles of local IDs are not quickly explained.
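To make the indexing concrete, here is a minimal kernel sketch, assuming a layout where x varies fastest and the sizes are passed in as kernel arguments (the kernel and argument names are illustrative, not from the original post):
__kernel void copy3d(__global const float *in, __global float *out,
                     int sizeX, int sizeY, int sizeZ)
{
    int x = get_global_id(0);
    int y = get_global_id(1);
    int z = get_global_id(2);
    if (x >= sizeX || y >= sizeY || z >= sizeZ)
        return;
    int index = x + y * sizeX + z * sizeX * sizeY;
    out[index] = in[index];
}
get_local_id(0..2) gives the position within the work-group in the same three dimensions; you only need it when tiling into local memory, not for computing the global index.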

It depends on how you have your 3D array linearized into memory, but Rick's answer coded as an inline function would work fine. Another optimization you may want is prefetching into local memory when possible.
/* Visualize the volume as a cube viewed from the front in (x, y) coordinates,
   with z as depth. It is stored by starting at (x=0, y=0) and placing the
   depth-z runs of elements one after another in a contiguous array. */
inline int matrix3D_lookup(int x, int y, int z, int sizeZ, int sizeX)
{
    return z + sizeZ * x + (sizeZ * sizeX * y);
}
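As for the image3d_t part of the original question: reading a 3D image through a sampler is supported directly, while writing to a 3D image typically requires the cl_khr_3d_image_writes extension. A minimal sketch of the read side, assuming a non-normalized nearest sampler (kernel and argument names are illustrative):
__constant sampler_t smp = CLK_NORMALIZED_COORDS_FALSE | CLK_ADDRESS_CLAMP_TO_EDGE | CLK_FILTER_NEAREST;

__kernel void sample_volume(__read_only image3d_t vol, __global float *out,
                            int sizeX, int sizeY)
{
    int x = get_global_id(0);
    int y = get_global_id(1);
    int z = get_global_id(2);
    float4 v = read_imagef(vol, smp, (int4)(x, y, z, 0));
    out[x + y * sizeX + z * sizeX * sizeY] = v.x;   // store one channel linearly
}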

Related

3D Projection Modification - Encode Z/W into Z

This is a little tricky to explain, so bear with me. I'm attempting to design a 2D projection matrix that takes 2D pixel coordinates along with a custom world-space depth value and converts them to clip space.
The idea is that it would allow drawing elements based on screen coordinates, but at specific depths, so that these elements would interact on the depth buffer with normal 3D elements. However, I want x and y coordinates to remain the same scale at every depth. I only want depth to influence the depth buffer, and not coordinates or scale.
After the vertex shader, the GPU sets depth_buffer = z/w. However, it also scales x/w and y/w, which creates the depth scaling I want to avoid. This means I must make sure my final clip-space w coordinate ends up being 1.0 to avoid those effects. I think I could also opt to scale x and y by w to cancel out the divide, but I would rather do the former, if possible.
This is the process that my 3D projection matrix uses to convert depth into clip space (d = depth, n = near distance, f = far distance)
z = f/(f-n) * d + f/(f-n) * -n;
w = d;
This is how I would like to setup my 2D projection matrix. Compared to the 3D version, it would divide both attributes by the input depth. This would simulate having z/w encoded into just the z value.
z = ( f/(f-n) * d + f/(f-n) * -n ) / d;
w = d / d;
I think this turns into something like..
r = f/(f-n); // for less crazy math
z = r + ( r * -n ) / d;
w = 1.0;
However, I can't seem to wrap my math around the values that I would need to plug into my matrix to get this result. It looks like I would need to set my matrix up to perform a division by depth. Is that even possible? Can anyone help me figure out the values I need to plug into my matrix at m[2][2] and m[3][2] (m._33 and m._43) to make something like this happen?
Note my 3D projection matrix uses the following properties to generate the final z value:
m._33 = f / (f-n); // depth scale
m._43 = -(f / (f-n)) * n; // depth offset
Edit: After thinking about this a little more, I realized that the rate of change of the depth buffer is not linear, and I'm pretty sure a matrix can only perform linear change when its input is linear. If that is the case, then what I'm trying to do wouldn't be possible. However, I'm still open to any ideas that are in the same ball park, if anyone has one. I know that I can get what I want by simply doing pos.z /= pos.w; pos.w = 1; in the vertex shader, but I was really hoping to make it all happen in the projection matrix, if possible.
In case anyone is attempting to do this: it cannot be done. Short of black magic, there is apparently no way to divide values with a matrix unless the divisor is a constant, in which case you can swap the scale factor for 1/x. I resorted to performing the operation in the shader in the end.
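For completeness, a hedged GLSL sketch of the vertex-shader workaround described above (the uniform and attribute names are illustrative, not from the original post):
vec4 pos = u_proj2D * a_position;   // clip-space position whose w carries the depth d
pos.z /= pos.w;                     // bake z/w into z ahead of time
pos.w = 1.0;                        // so the hardware perspective divide leaves x, y and z unchanged
gl_Position = pos;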

Giving a direction to a moving object

I wish to create a tower defense game in SDL. Before starting the project, I am experimenting with everything I will need to do when programming the game. In the test I am doing currently, there is a tower (static object), targets (moving objects) that are in its range, and shoots (moving objects) that are fired from the turret at the targets. What I am failing to do is find a way to give the 'shoot' objects a direction. By shoot object, I mean the object that is fired by the tower when targets are in range. Also, whatever the direction is, the shoot shall always have the same speed, which forbids the use of the formula dirx = x2 - x1.
Shoots are structures defined as the following:
typedef struct shoot
{
    SDL_Surface *img;   // Visual representation of the shoot object.
    SDL_Rect pos;       // Position of the object; a structure containing
                        // coordinates x and y (these are of type int).
    int index;
    float dirx;         // dirx and diry are the movement done by the shoot object in
                        // one frame, so that for each frame the object is moved dirx
                        // pixels on the x axis and diry pixels on the y axis
    float diry;         // (the program deals with the fact that the movement will be done
                        // with integers and not with floats; that is no problem to me)
    float posx;         // posx and posy are the real, precise coordinates of the shoot
    float posy;
    struct shoot *prev;
    struct shoot *next;
} shoot;
What I need is a way to calculate the position of the object shoot in the next frame, given its position and direction in the current frame.
This is the best I could find (please note that it is a formula written on paper, so the names are simplified and differ from the names in the code):
dirx = d * ((p2x - p1x) / ((p2x - p1x) + (p2y - p1y)))
diry = d * ((p2y - p1y) / ((p2x - p1x) + (p2y - p1y)))
dirx and diry correspond to the movement done, in the pixel, by the shoot on the axis x and y, in one frame.
d is a multiplier and the big parenthesis (all of what is not d) is a coefficient.
p2 is the point the shoot shall aim for (the center of the target aimed for). p1 is the current position of the shoot object. x or y means that we use the coordinate x or y of the point.
The problem with this formula is that it gives me inexact values. For example, aiming diagonally will make the shoot slower than aiming straight north. Moreover, it doesn't go in the right direction, and I can't figure out why, since my paper tests show me I'm right...
I would love some help here to find a formula that makes the shoot move correctly.
If p1 is the source of a shoot object, p2 is the destination, and s is the speed you want to move it at (units per frame or per second; the latter is better), then the velocity of the object is given by
float dx = p2.x - p1.x, dy = p2.y - p1.y;
float inv = s / sqrt(dx * dx + dy * dy);
float velx = inv * dx, vely = inv * dy;
(You should probably change dir to vel as it is a more sensible variable name)
Your attempt seems to be normalizing the direction vector by Manhattan distance, which is wrong - you must normalize by the Euclidean distance, which is given by the sqrt term.
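A hedged C sketch of how that velocity plugs into the shoot struct from the question (the helper names, target point and speed parameter are illustrative, not from the original post):
#include <math.h>

void shoot_set_velocity(shoot *s, float targetX, float targetY, float speed)
{
    float dx = targetX - s->posx;
    float dy = targetY - s->posy;
    float inv = speed / sqrtf(dx * dx + dy * dy);  /* normalize by the Euclidean length */
    s->dirx = inv * dx;
    s->diry = inv * dy;
}

void shoot_update(shoot *s)
{
    s->posx += s->dirx;        /* advance the precise position by one frame's velocity */
    s->posy += s->diry;
    s->pos.x = (int)s->posx;   /* round into the SDL_Rect used for drawing */
    s->pos.y = (int)s->posy;
}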

Generating movement based on time t for real time ocean waves from an initial spectrum

I've spent the last week or so rendering a simple ocean using Gerstner waves, but I'm having issues with tiling, so I decided to start rendering them "properly" and dip my toes into the murky waters of rendering a heightfield using an iFFT.
There are plenty of papers explaining the basic gist -
1) Calculate a frequency spectrum.
2) Use this spectrum to create a heightfield, using an iFFT to convert from the frequency domain to the spatial domain, animating with time t.
Since the beginning of this journey I have learned about things like the complex plane, the complex exponential equation, the FFT in more detail, etc., but after the initial step of creating an initial spectrum (rendering a texture full of Gaussian numbers with mean 0 and standard deviation 1, filtered by the Phillips spectrum) I am still totally lost.
My code for creating the initial data is here (GLSL):
float PhillipsSpectrum(vec2 k){
    // kLen is the length of the vector from the centre of the texture
    float kLen = length(k);
    float kSq = kLen * kLen;
    // Amp is the wave amplitude, passed in as a uniform
    float Amp = amplitude;
    // L = velocity * velocity / gravity
    float L = (velocity*velocity)/9.81;
    float dir = dot(normalize(waveDir),normalize(k));
    return Amp * (dir*dir) * exp(-1.0/(kSq * L * L)) / (kSq * kSq);
}

void main(){
    vec3 sums;
    // get screen position - the centre is 0.0 and it ranges from -0.5 to 0.5
    // in both directions
    vec2 screenPos = vec2(gl_FragCoord.x,gl_FragCoord.y)/texSize - vec2(0.5,0.5);
    // get a random Gaussian number
    vec2 randomGauss = vec2(rand(screenPos),rand(screenPos.yx));
    // use the Phillips spectrum as a filter depending on position in the frequency domain
    float Phil = sqrt(PhillipsSpectrum(screenPos));
    float coeff = 1.0/sqrt(2.0);
    color = vec3(coeff * randomGauss.x * Phil, coeff * randomGauss.y * Phil, 0.0);
}
which creates a texture like this:
Now I am totally lost as to how to:
a) derive spectra in three directions from the initial texture
b) animate this according to time t, as mentioned in this paper (https://developer.nvidia.com/sites/default/files/akamai/gamedev/files/sdk/11/OceanCS_Slides.pdf) on slide 5
I might be completely stupid and overlooking something really obvious - I've looked at a bunch of papers and just get lost in formulae even after acquainting myself with their meaning. Please help.
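For what it is worth, the time animation in the linked slides follows Tessendorf's formulation: h(k,t) = h0(k) * exp(i*w*t) + conj(h0(-k)) * exp(-i*w*t), with the deep-water dispersion relation w(k) = sqrt(g * |k|). A hedged GLSL sketch of that step, assuming the initial-spectrum texture above supplies h0(k) and conj(h0(-k)) as two-component values (function and variable names are illustrative):
vec2 complexMul(vec2 a, vec2 b) {
    return vec2(a.x * b.x - a.y * b.y, a.x * b.y + a.y * b.x);
}

// h0k      = h0(k) sampled from the initial spectrum
// h0MinusK = conj(h0(-k)), i.e. the spectrum at -k with its imaginary part negated
vec2 hTilde(vec2 h0k, vec2 h0MinusK, float kLen, float t) {
    float w   = sqrt(9.81 * kLen);              // dispersion relation w(k) = sqrt(g|k|)
    vec2 ePos = vec2(cos(w * t), sin(w * t));   // exp(+i w t)
    vec2 eNeg = vec2(ePos.x, -ePos.y);          // exp(-i w t)
    return complexMul(h0k, ePos) + complexMul(h0MinusK, eNeg);
}
An inverse FFT of h(k,t) over all k then gives the heightfield for that frame.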

Need an algorithm for 3D vectors intersection

I have 2 vectors, each defined by 2 Point3D (origin and direction). I need to find out the point of their intersection.
A little bit of help is always welcome.
I will post my function, which gives me the wrong output.
public static CurvIntersect3D Intersect2Linii3D (Vector3D dr1, Vector3D dr2) {
    CurvIntersect3D result = new CurvIntersect3D(0, null);
    double x = Math3D.VectorNorm3D(dr1.getDirectie());
    double t = Math3D.VectorNorm3D(dr2.getDirectie());
    double cosa = (dr1.getDirectie().getX()*dr2.getDirectie().getX()
                 + dr1.getDirectie().getY()*dr2.getDirectie().getY()
                 + dr1.getDirectie().getZ()*dr2.getDirectie().getZ()) / (t*x);
    Punct3D p1 = dr1.getOrigine();
    Punct3D p2 = new Punct3D(), p3 = new Punct3D();
    for (int i=0; i<3; i++)
    {
        p2.set(i, dr1.getOrigine().get(i) + dr1.getDirectie().get(i));
        p3.set(i, dr1.getOrigine().get(i) + dr2.getDirectie().get(i));
    }
    Matrici.Matrice3x3 rot = Math3D.GetMatriceRotatie(p1, p2, p3);
    Punct3D orig = new Punct3D();
    for (int i=0; i<3; i++)
        orig.set(i, rot.getElement(i, 0) * (dr2.getOrigine().getX()-dr1.getOrigine().getX())
                  + rot.getElement(i, 1) * (dr2.getOrigine().getY()-dr1.getOrigine().getY())
                  + rot.getElement(i, 2) * (dr2.getOrigine().getZ()-dr1.getOrigine().getZ()));
    x = orig.getY() - orig.getZ() * cosa / Math.sqrt(1 - cosa*cosa);
    p1 = new Punct3D();
    for (int i=0; i<3; i++)
        p1.set(i, dr1.getOrigine().get(i) + x*dr1.getDirectie().get(i));
    result.setCount(1);
    result.add(p1);
    return result;
}
CurvIntersect3D is a structure that stores an array of points and its length.
As mentioned before, the two lines may not meet at a single point. The best you can do in general is find the point on line1 closest to line2 and vice versa; connecting those two points gives the common normal direction.
Given two lines passing through 3D points r1=[r1x,r1y,r1z] and r2=[r2x,r2y,r2z] and having unit directions e1=[e1x,e1y,e1z] and e2=[e2x,e2y,e2z] you can find the points on the line which are closest to the other line like this:
1) Find the direction projection u = Dot(e1,e2) = e1x*e2x + e1y*e2y + e1z*e2z
2) If u == 1 then the lines are parallel and no intersection exists.
3) Find the separation projections t1 = Dot(r2-r1, e1) and t2 = Dot(r2-r1, e2)
4) Find the distance along line1: d1 = (t1 - u*t2)/(1 - u*u)
5) Find the distance along line2: d2 = (t2 - u*t1)/(u*u - 1)
6) Find the point on line1: p1 = Add(r1, Scale(d1, e1))
7) Find the point on line2: p2 = Add(r2, Scale(d2, e2))
Note: You must have the directions as unit vectors, Dot(e1,e1)=1 and Dot(e2,e2)=1.
The function Dot() is the vector dot product. The function Add() adds the components of vectors, and the function Scale() multiplies the components of the vector with a number.
Good luck.
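A minimal C sketch of those steps, assuming small vec3 helpers and unit-length direction vectors (the type and function names are illustrative):
#include <math.h>

typedef struct { double x, y, z; } vec3;

static double dot3(vec3 a, vec3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
static vec3 sub3(vec3 a, vec3 b) { vec3 r = { a.x-b.x, a.y-b.y, a.z-b.z }; return r; }
static vec3 add3(vec3 a, vec3 b) { vec3 r = { a.x+b.x, a.y+b.y, a.z+b.z }; return r; }
static vec3 scale3(double s, vec3 a) { vec3 r = { s*a.x, s*a.y, s*a.z }; return r; }

/* Returns 0 for parallel lines; otherwise writes the mutually closest points to *p1 and *p2.
   If the lines truly intersect, p1 and p2 coincide. */
int closest_points(vec3 r1, vec3 e1, vec3 r2, vec3 e2, vec3 *p1, vec3 *p2)
{
    double u = dot3(e1, e2);
    if (fabs(1.0 - u*u) < 1e-12) return 0;            /* parallel: no unique solution */
    vec3 r12 = sub3(r2, r1);
    double t1 = dot3(r12, e1), t2 = dot3(r12, e2);
    double d1 = (t1 - u*t2) / (1.0 - u*u);
    double d2 = (t2 - u*t1) / (u*u - 1.0);
    *p1 = add3(r1, scale3(d1, e1));
    *p2 = add3(r2, scale3(d2, e2));
    return 1;
}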
Are you sure that your lines have an intersection?
If it is guaranteed, then the problem is rather simple: get the parametric equations of the lines and solve a system of two linear equations like these:
A_X0 + t*A_DirX = B_X0 + u*B_DirX, where X0 is the base point and Dir is the direction vector (take into account any pair of coordinates with a non-zero cross product)
If not, then it is necessary to calculate the distance between the two skew lines first. If that distance is zero, then we can find the intersection point.

Writing CUDA surface backed by an array of vectors

I am trying to write to a 2-dimensional cudaArray through a surface<void, 2>.
The array has a channel format {32, 32, 0, 0, cudaChannelFormatKindFloat} or to put it more simply, holds vector2s.
I am trying to write a vector2 to the surface at the position indicated by integer coordinates (x, y). The following works well:
// write the float2 vector d to outSurf
surf2Dwrite(d.x, outSurf, x * sizeof(float2), y);
surf2Dwrite(d.y, outSurf, x * sizeof(float2) + sizeof(float), y);
However, if I do
surf2Dwrite(d, outSurf, x * sizeof(float2), y);
only the x component of the vector is being written. What is the reason for this slightly unintuitive behaviour?
I find it hard to believe that any of those surf2Dwrite calls actually do what you think they do. To write a float2 I would use this:
surf2Dwrite<float2>(d, outSurf, x, y);
The x and y arguments are the coordinates on the surface you are writing to and the template parameter tells the call the size of the type being accessed.
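A hedged sketch of a complete kernel built around the component-wise writes the question reports as working (the surface reference must be bound to the float2-formatted cudaArray on the host; the grid setup and payload are illustrative). Note that the x coordinate of a surface write is expressed in bytes:
surface<void, 2> outSurf;

__global__ void writeVec2(int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;

    float2 d = make_float2((float)x, (float)y);   // illustrative payload
    surf2Dwrite(d.x, outSurf, x * sizeof(float2), y);
    surf2Dwrite(d.y, outSurf, x * sizeof(float2) + sizeof(float), y);
}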
