clCreateSubBuffer not found oO - opencl

i can't seem to find clCreateSubBuffer in cl.h or cl.hpp (only error macro). it is mentioned in the specifications, any idea about this? or any other way to create a sub buffer?
all i can think of is recreating the buffers using an incremented pointer.

Note that clCreateSubBuffer is a OpenCL 1.1 function, maybe your using/looking into openCL 1.0 header files.
What platform are you btw using? (im pretty sure that NVIDIA supports until now "only" OpenCL 1.0)

Related

Translate OpenCL SPIR-V to Vulkan SPIR-V

Is it possible to translate OpenCL-style SPIR-V to Vulkan-style SPIR-V?
I know that it is possible to use clspv to compile OpenCL C to Vulkan-style SPIR-V, but I haven't seen any indication that it also supports ingesting OpenCL-style SPIR-V.
Thank you for any suggestions if you know how to achieve this :)
I know that it is possible to use clspv to compile OpenCL C to
Vulkan-style SPIR-V, but I haven't seen any indication that it also
supports ingesting OpenCL-style SPIR-V.
clspv compiles to "Opencl-style SPIR-V". IOW, it uses OpenCL execution model and also OpenCL memory model. The answer to your question is no (in general). The problem is that e.g. GLSL uses logical memory model, which means pointers are abstract, so you can't have pointers to pointers. While OpenCL allows this, because it uses physical memory model. Plus there are other things in OpenCL which cannot be expressed in GLSL. You could try to write some translator, and it might work for some very simple code, but that's about it.

openCL on consoles for General purpose GPU?

Can we use openCL on consoles like Xbox One and PS4 for General purpose GPU? If yes, can we use openCL framework like ArrayFire - http://arrayfire.com/ ?
No.
While the AMD GPU hardware on the most recent version of each console is similar enough to desktop hardware that has OpenCL support, the console vendors do not offer OpenCL as a programming API (unless it's not public information and only available under NDA). If enough game devs asked for it perhaps it would happen.
I was also looking for it too! And I've found this discution: here
Apparently, up to now, the only way out is to use HLSL for Xbox, and PSSL on Playstation. According to a guy on the topic above. They (the APIs) are different, but very similar in such a way that shall be possible to write a code that compiles under both platform using some preprocessors.
PS.: I'm also looking for a good answer for that question. So, if you've found anything around, please post here, I'd love to know ^^

Is there a general binary intermediate representation for OpenCL kernel programming?

as I understood, the OpenCL uses a modified C language (by adding some keywords like __global) as the general purpose for defining kernel function. And now I am doing a front-end inside F# language, which has a code quotation feature that can do meta programming (you can think it as some kind of reflection tech). So I would like to know if there is a general binary intermediate representation for the kernel instead of C source file.
I know that CUDA supports LLVM IR for the binary intermediate representation, so we can create kernel programmatically, and I want to do the same thing with OpenCL. But the document says that the binary format is not specified, each implementation can use their own binary format. So is there any general purpose IR which can be generated by program and can also run with NVIDIA, AMD, Intel implementation of OpenCL?
Thansk.
No, not yet. Khronos is working on SPIR (the spec is still provisional), which would hopefully become this. As far as I can tell, none of the major implementations support it yet. Unless you want to bet your project on its success and possibly delay your project for a year or two, you should probably start with generating code in the C dialect.

Does CUDA support recursion?

Does CUDA support recursion?
It does on NVIDIA hardware supporting compute capability 2.0 and CUDA 3.1:
New language features added to CUDA C
/ C++ include:
Support for function
pointers and recursion make it easier
to port many existing algorithms to
Fermi GPUs
http://developer.nvidia.com/object/cuda_3_1_downloads.html
Function pointers:
http://developer.download.nvidia.com/compute/cuda/sdk/website/CUDA_Advanced_Topics.html#FunctionPointers
Recursion:
I can't find a code sample on NVIDIA's website, but on the forum someone post this:
__device__ int fact(int f)
{
if (f == 0)
return 1;
else
return f * fact(f - 1);
}
Yes, see the NVIDIA CUDA Programming Guide:
device functions only support recursion in device code compiled for devices
of compute capability 2.0.
You need a Fermi card to use them.
Even though it only supports recursion for specific chips, you can sometimes get away with "emulated" recursion: see how I used compile-time recursion for my CUDA raytracer.
In CUDA 4.1 release CUDA supports recursion only for __device__ function but not for __global__ function.
Only after 2.0 compute capability on compatible devices
Any recursive algorithm can be implemented with a stack and a loop. It's way more of a pain, but if you really need recursion, this can work.
Sure it does, but it requires the Kepler architecture to do so.
Check out their latest example on the classic quick sort.
http://blogs.nvidia.com/2012/09/how-tesla-k20-speeds-up-quicksort-a-familiar-comp-sci-code/
As far as i know, only latest Kepler GK110 supports dynamic parallelism, which allow this kind of recursive call and spawning of new threads within the kernel. Before Kepler GK110, it was not possible. And note that not all Kepler architecture supports this, only GK110 does.
If you need recursion, you probably need the Tesla K20.
I'm not sure if Fermi does supports it,never read of it. :\
But Kepler sure does. =)
CUDA 3.1 supports recursion
If your algorithm invovles alot of recursions, then support or not, it is not designed for GPUs, either redesign your algorthims or get a better CPU, either way it will be better (I bet in many cases, maginitudes better) then do recurisons on GPUs.
Yeah, it is supported on the actual version. But despite the fact it is possible to execute recursive functions, you must have in mind that the memory allocation from the execution stack cannot be predicted (the recursive function must be executed in order to know the true depth of the recursion), so your stack could result being not enough for your purposes and it could need a manual increment of the default stack size
Yes, it does support recursion. However, it is not a good idea to do recursion on GPU. Because each thread is going to do it.
Tried just now on my pc with a NVIDIA GPU with 1.1 Compute capability. It says recursion not yet supported. So its not got anything to do with the runtime but the hardware itself

OpenCL vs. DirectCompute?

I'm looking for comparisons between OpenCL and DirectCompute, but I haven't found anything. OpenCL's advantages of being cross-platform and having a wider range of supported GPUs don't matter to me. I'm fine with coding on Windows against DX11 GPUs only. Assuming that, what are the pros and cons of each API?
I know this question was raised before, but I'm looking for more details.
I'm not interested in CUDA, since I don't want to restrict myself to only Nvidia hardware.
Probably the biggest difference for a coder is that DirectCompute is programmed by a language which is similar to HLSL, and OpenCL is programmed via a C-like language.
Another difference to consider is that, generally, for commodity level GPUs, the DirectX support is better (faster and less buggy) than OpenGL support on Windows. This may translate to more stable support for DirectCompute, but really, this is just speculation.
Well the major advantage of OpenCL is that it is not just limited to graphics cards. You can make use of your multicore CPU, Graphics Card and potentially any number of other hardware acceleration devices (DSPs etc) all from the same program.
I'm not sure if DirectCompute allows that freedom.
The OpenCL cross-platform-ness is not just a detail, as the host code (the one calling the OpenCL API and submitting kernels) can itself be cross-platform (see link text, link text...).
Write once, run on any GPGPU, anywhere.
Otherwise the OpenCL tooling is really getting better, with an ATI Stream plugin for Visual Studio, the NVidia & ATI SDKs that contains tons of samples, etc...
Another option now is C++ AMP which gives you modern C++ syntax without a need for a seperate compiler while still preserving hardware portability. Please follow links from here for more info and feel free to post questions as you have them: http://blogs.msdn.com/b/nativeconcurrency/archive/2011/09/13/c-amp-in-a-nutshell.aspx
I use OpenCL because i can easily port my App to Linux but with DirectCompute this is not possible.
I think also that the performance of the OpenCL implementation will increase with time (that it comes at the same Level like CUDA for NVidia Cards) and also that the (driver)bugs will (hopefully ;) ) be eliminated with time.

Resources