OpenCL code that compiles on Linux doesn't compile on Windows

I've been writing some OpenCL code lately on Linux (Ubuntu 10.04, ATI Catalyst 10.4 and ATI Stream SDK v2.1), and it works great there.
When I tried to run the same code on Windows, I got program build errors complaining about
"this declaration has no storage class or type specifier"
and then "global variable must be declared in addrSpace constant".
Even an empty void kernel doesn't build; I commented out all the code and it still gave the same errors!
Oddly enough, the SDK samples work just fine. When I copied my code into the sample projects, it gave the same errors.
I'm using Windows 7 32-bit, ATI Stream SDK v2.1 and the 10.6 drivers (I couldn't find 10.4 for Windows anywhere, which is sad, since 10.6 isn't guaranteed to support OpenCL).
I cut out all the kernels and left just this one, and I still get the same errors. Here it is:
__kernel void set_float(__global float* buff, float v) {
    buff[get_global_id(0)] = v;
}

Man, no matter how many times you get bitten by strings, you never learn.
It was just a string that wasn't null-terminated.

It works for me: it compiled successfully with the AMD Stream Kernel Analyzer on Win7 64-bit with SDK v2.1 and the 10.6 drivers. Your formatting is horrible, though.

Related

OpenCL for Intel CPU and Nvidia GPU simultaneously

I am trying to get started with some OpenCL coding.
I've installed the NVidia CUDA OpenCL on my computer and have managed to build a simple "Hello World!" application using Visual Studio 2017.
I have also installed the Intel OpenCL SDK (installation warned me that I needed to update my OpenCL drivers but the Intel update manager was telling me that everything was up to date, so I'm not sure whether this could be an issue).
Now whenever I query the OpenCL platforms on my PC like so:
std::vector< cl::Platform > platformList;
cl::Platform::get(&platformList);
I only get back my NVIDIA OpenCL platform, with my GPU as the only device. I am not getting anything back for my CPU.
Can anyone help? Is it possible to perform both CPU and GPU OpenCL computations in the same project (in different OpenCL contexts)? How would I go about doing this?
It seems that the Intel GPU driver was not installed properly. You can install the CPU-only runtime package instead:
https://software.intel.com/en-us/articles/opencl-drivers#latest_CPU_runtime

CL_PLATFORM_NOT_FOUND_KHR in opencl

This is a very strange situation. Why do I get the error
CL_PLATFORM_NOT_FOUND_KHR
when I'm calling this function:
clGetPlatformIDs(0, NULL, &platformCount);
This error did not occur before. I have installed the driver and SDK from both Intel and Nvidia. Are there any suggestions?
The documentation explains why this error can occur. clGetPlatformIDs returns CL_SUCCESS if the function executes successfully and a non-zero number of platforms is available. Otherwise, it can return CL_PLATFORM_NOT_FOUND_KHR if the cl_khr_icd extension is enabled and no platforms are found.
You are in luck. Well, sort of, seeing as this is 3 years later.
Disclaimer: I HAVE NO CLUE WHY THIS WORKS:
Machine: x64 windows 10.
Graphics Card: Geforce GTX 960
Total Failure To Load Library : LoadLibraryA( "OpenCL64.dll" )
WRONG (but loads) : LoadLibraryA( "C:/Program Files/NVIDIA Corporation/OpenCL/OpenCL64.dll" )
WRONG (but loads) : LoadLibraryA( "C:/Program Files/NVIDIA Corporation/OpenCL/OpenCL.dll" )
CORRECT: LoadLibraryA( "OpenCL.dll" )
Here is the really insidious thing: both of my "WRONG" answers will let you
grab function pointers, but when you call clGetPlatformIDs the return status
will be 0xFFFFFC17 ( CL_PLATFORM_NOT_FOUND_KHR ).
Then you'll be examining your function call correctness.
Maybe you'll even look at the calling convention. Maybe you'll check
the header files and make sure there are not any typos there. And yet,
you are looking in all the wrong places because the original problem happened
more steps back than you think.
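To connect the hex value above to the named error: a debugger often shows the 32-bit status as unsigned, while OpenCL returns a signed cl_int. The sketch below avoids the OpenCL headers, so the constant is written out by hand (-1001 is the value of CL_PLATFORM_NOT_FOUND_KHR); the function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

// Value of CL_PLATFORM_NOT_FOUND_KHR, written out here so the sketch
// compiles without the OpenCL headers.
constexpr std::int32_t kPlatformNotFoundKhr = -1001;

// Reinterpret a status shown as unsigned hex (e.g. 0xFFFFFC17) as the
// signed 32-bit value that OpenCL actually returned.
std::int32_t as_cl_int(std::uint32_t raw) {
    std::int32_t value;
    std::memcpy(&value, &raw, sizeof value);
    return value;
}

// Hypothetical helper: name the two statuses discussed in this thread.
std::string describe_status(std::int32_t status) {
    if (status == 0) return "CL_SUCCESS";
    if (status == kPlatformNotFoundKhr) return "CL_PLATFORM_NOT_FOUND_KHR";
    return "other OpenCL error";
}
```

With this, as_cl_int(0xFFFFFC17u) evaluates to -1001, which describe_status maps to CL_PLATFORM_NOT_FOUND_KHR.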
Because of this problem, I build code into my programs that reads a file,
"OPEN_CL_SEARCH_PATHS.TXT", so the user of the software can change which DLL file
the program attempts to load.
While I am here, I would also like to add that there seems to be a bug with the
driver that makes it so OpenCL <==> OpenGL sharing is NOT a zero-copy share and
is incredibly laggy. Now I've got to go figure out Vulkan to make my fractal
rendering engine even though OpenCL's abstraction better suits the problem.
It is probably important to note that I am NOT using an SDK or any
validation layers. In fact, I am not even using
windows.h.
I wrote assembly code to grab the addresses of GetProcAddress and LoadLibrary by walking the PEB (Process Environment Block). I am also not using cl.h or cl_platform.h.
I reconstruct the structs I need from the documentation. I am also not
bothering with prototypes for function signatures either. For example,
I call "clGetPlatformIDs" by casting it to type "F_03" and then
calling it that way.
typedef void* (F_03)( void*, void*, void* );
My machine doesn't have a GPU, so I had to use hashcat with OpenCL on the CPU alone. It has an Intel Core i3, so I downloaded the OpenCL software from Intel's website and installed it manually, and the error went away.
Source: https://youtu.be/AieYqNQ6ADM

clBuildProgram crash on NVidia cards

I have an OpenCL application that runs fine when using an AMD GPU.
When using an NVidia card, the clBuildProgram call crashes the application (it does not even return a failure value, it just crashes). When debugging, the crash yields:
read access violation in the nvopencl.dll module, code 0xc0000005. The debugger indicates the clGetExportTable function (inside nvopencl.dll) as the source of the violation.
By commenting out random parts of the kernels, I have narrowed it down to this code fragment:
if (something) {
    // some stuff
    float3 gradient = (float3)(0, 1, 0);
    gradient = normalize(gradient);
    return;
}
If I delete the "gradient = normalize(gradient);" line, clBuildProgram does not crash; if I leave it in, the program crashes. The gradient variable is not even used anywhere else in the kernel, so it is not related to any other part of it. And the normalize function by itself should not be the source of the problem, because it is used in other parts of the code.
I think it may be related to a driver bug, because installing the latest CUDA version (6.5) makes the OpenCL Volume Rendering sample binaries distributed by NVidia misbehave, while with a CUDA 6 installation the Volume Rendering sample works properly.
My code uses volume rendering techniques, which is why I think the two may be related, but my problem appears with both the CUDA 6.5 and CUDA 6 installations.
Have you experienced something similar? What could be the cause of the problem, and how can I handle it?
Thank you.
After further analysis, the problem seems to be a bug in the drivers, as Xapa mentioned.

OpenCL clGetDeviceIDs seg fault (imx6 (Freescale) with OpenCL on a Linux SUSE distribution (armv7))

I'm developing an application with OpenCL on an imx6q (Freescale, Vivante gc200 with OpenCL EP) running a Linux SUSE 13.1 distribution adapted for armv7.
I'm following this tutorial: https://community.freescale.com/docs/DOC-93984#comment-12585. I installed the following package: gpu-viv-bin-mx6q.
When I try the example code, it works on a laptop, but on the imx6 it gives me a segmentation fault when calling the function clGetDeviceIDs.
The program compiles correctly but does not work when run.
I tried passing different null arguments to the function. I'm not sure whether it's due to memory allocation (since the same code works on my laptop, I suppose this is not the problem). When I launch it in debug mode, the program seems unable to find the file "gc_hal_user_query.c" (HAL stands for Hardware Abstraction Layer).
I can't find sufficient documentation on the web, and I'm quite new to Linux and OpenCL, so I'd appreciate any help. Thanks in advance.
I guess the issue is that, when you call
clGetPlatformIDs(1, &cpPlatform, NULL);
cpPlatform receives 0 if no platform is detected. This leads to a segmentation fault during the next call to
clGetDeviceIDs(cpPlatform, CL_DEVICE_TYPE_GPU, 1, &cdDevice, NULL);
I unfortunately can't help further, I have the same problem.
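The guess above suggests checking both the status and the platform count before using a platform handle. Here is a minimal sketch of that check, written as a template so it compiles without the OpenCL headers: QueryFn stands in for clGetPlatformIDs and PlatformId for cl_platform_id. The helper name first_platform is hypothetical.

```cpp
#include <vector>

// Generic over the query function so the same logic works with the real
// clGetPlatformIDs or with a stub. Returns false instead of handing back
// an unset handle when the query fails or reports zero platforms.
template <typename PlatformId, typename QueryFn>
bool first_platform(QueryFn query, PlatformId* out) {
    unsigned int count = 0;
    // First call: ask only how many platforms exist.
    if (query(0u, static_cast<PlatformId*>(nullptr), &count) != 0 || count == 0) {
        return false;  // error status, or no platform installed
    }
    std::vector<PlatformId> ids(count);
    // Second call: fetch the platform handles themselves.
    if (query(count, ids.data(), static_cast<unsigned int*>(nullptr)) != 0) {
        return false;
    }
    *out = ids.front();
    return true;
}
```

With the OpenCL headers available, the intended call would look like first_platform(clGetPlatformIDs, &cpPlatform), and clGetDeviceIDs would only be called when it returns true.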
You are running with insufficient permissions. Try running as root.

Qt/MinGW32 memory usage limitation?

I wrote an application with Qt 4.8.1 and MinGW32 (Nokia Qt SDK). I am trying to load a large file with this app, but the app always crashes when memory usage reaches 1,868 MB. If I reduce the size of the input file, the app works fine. Are there any memory limitations on Qt apps or MinGW32? What should I do if I really want my app to use more memory? My Windows is 64-bit.
P.S. Adding "QMAKE_LFLAGS_WINDOWS += -Wl,--stack,32000000" to the .pro file doesn't help.
Thanks very much!
P.P.S. I have seen that many programs are capable of using 10+ GB, e.g. Matlab. How can I do that in a Qt app?
Your copy of Windows may be 64-bit, but MinGW32 is a 32-bit compiler, so any app built with it has all the standard limits inherent to 32-bit Windows. Effectively, you won't be able to get more than around 2 GB of memory for your app to use.
There's a method to get that up to 3G, but beyond that you need a 64 bit compiler.
The 2 GB limit applies per process only.
You can spread your application across N 32-bit processes to allocate N x 2 GB. The operating system must still be 64-bit.
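Both answers hinge on whether the executable itself is a 32-bit or 64-bit build, independent of the Windows edition it runs on. A small sketch of how a program could check this at compile time; the function names are hypothetical.

```cpp
#include <climits>
#include <cstddef>

// Number of bits in a pointer for this build: 32 when compiled with a
// 32-bit toolchain such as MinGW32, 64 with a 64-bit toolchain, regardless
// of whether Windows itself is 64-bit.
constexpr std::size_t pointer_bits() { return sizeof(void*) * CHAR_BIT; }

// True only when the build can address more than 4 GB in one process.
constexpr bool is_64bit_build() { return pointer_bits() == 64; }
```

A loader for very large files could use is_64bit_build() to refuse, or to fall back to processing the input in chunks, instead of crashing partway through an allocation.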

Resources