This is a very strange situation. Why do I get error
CL_PLATFORM_NOT_FOUND_KHR
when I'm calling this function:
clGetPlatformIDs(0, NULL, &platformCount);
Earlier this error was not. I have installed the driver and SDK from Intel and Nvidia. Are there any suggestions?
Here is explained why such error can occur. clGetPlatformIDs returns CL_SUCCESS if the function is executed successfully and there are a non-zero number of platforms available. Otherwise it can return CL_PLATFORM_NOT_FOUND_KHR if the cl_khr_icd extension is enabled and no platforms are found.
You are in luck. Well sort of... Seeing this is 3 years later.
Disclaimer: I HAVE NO CLUE WHY THIS WORKS:
Machine: x64 windows 10.
Graphics Card: Geforce GTX 960
Total Failure To Load Library : LoadLibraryA( "OpenCL64.dll" )
WRONG (but loads) : LoadLibraryA( "C:/Program Files/NVIDIA Corporation/OpenCL/OpenCL64.dll" )
WRONG (but loads) : LoadLibraryA( "C:/Program Files/NVIDIA Corporation/OpenCL/OpenCL.dll" )
CORRECT: LoadLibraryA( "OpenCL.dll" )
Here is the really insideous thing: Both of my "WRONG" answers will let you
grab function pointers, but when you call clGetPlatformIDs the return status
will be 0xFFFFFC17 ( CL_PLATFORM_NOT_FOUND_KHR ).
Then you'll be examining your function call correctness.
Maybe you'll even look at the calling convention. Maybe you'll check
the header files and make sure there are not any typos there. And yet,
you are looking in all the wrong places because the original problem happened
more steps back than you think.
Because of this problem, I build into my programs code that reads a file:
"OPEN_CL_SEARCH_PATHS.TXT" so the user of the software can change what DLL file
the program attempts to load.
While I am here, I would also like to add that there seems to be a bug with the
driver that makes it so OpenCL <==> OpenGL sharing is NOT a zero-copy share and
is incredibly laggy. Now I've got to go figure out Vulkan to make my fractal
rendering engine even though OpenCL's abstraction better suits the problem.
It is probably important to note that I am NOT using an SDK or any
validation layers. In fact, I am not even using
windows.h.
I wrote assembly code to grab the address of GetProcAddress and LoadLibrary by navigating the PEB file. I am also not using cl.h or cl_platform.h.
I reconstruct the structs I need from the documentation. I am also not
bothering with prototypes for function signatures either. For example,
I call "clGetPlatformIDs" by casting it to type "F_03" and then
calling it that way.
typedef void* (F_03)( void, void*, void* );
My machine doesn't have GPU and so had to use hashcat with OpenCL for CPU alone. My machine was Intel core i3, so I have downloaded the OpenCL softwares from Intel website and installed manually and the error gone.
Source: https://youtu.be/AieYqNQ6ADM
Related
I'm working on some OpenCL code within a larger project. The code only gets compiled at run-time - but I don't want to deploy a version and start it up just for that. Is there some way for me to have the syntax of those kernels checked (even without consider), or even compile them, at least under some restrictions, to make it easier to catch errors earlier?
I will be targeting AMD and/or NVIDIA GPUs.
The type of program you are looking for is an "offline compiler" for OpenCL kernels - knowing this will hopefully help with your search. They exist for many OpenCL implementations, you should check availability for the specific implementation you are using; otherwise, a quick web search suggests there are some generic open source ones which may or may not fit the bill for you.
If your build machine is also your deployment machine (i.e. your target OpenCL implementation is available on your build machine), you can of course also put together a very basic offline compiler yourself by simply wrapping clBuildProgram() and friends in a basic command line utility.
I have an OpenCL application that runs fine when using an AMD GPU.
When using an NVidia card, the clBuildProgram call crashes the application (does not even return a failure value, just a crash). When debugging, the crash yields:
read access violation in the nvopencl.dll module. code 0xc0000005. The debugger indicates the clGetExportTable function (inside nvopencl.dll) as source of the violation.
By commenting random parts of the kernels, I have reached this point:
In the code fragment:
if (something){
//some stuff
float3 gradient = (float3)(0,1,0);
gradient = normalize(gradient);
return;
}
By deleting the "gradient = normalize(gradient);" line, the clBuildProgram does not crash, but letting it there, crashed the program. the gradient variable is not even used inside the kernel, so it is not related to any other part of it. And the normalize funcion by itself should not be the source of the problem, because it is used in other parts of the code.
I think it may be related to some driver bug. Because installing the latest CUDA version (6.5) makes the OpenCL Volume Rendering sample binaries distributed by NVidia to misbehave, while using a CUDA 6 installation make the Volume Rendering sample to work properly.
My code is related to volume rendering techniques, that is why I think that it may be related, but my problem appears with both CUDA 6.5 and CUDA 6 installations.
Have you experienced something similar? What could be the cause of the problem, and how can I handle it?
Thank you.
After further analysis, the problem seems to be a bug in the drivers, as Xapa mentioned.
I'm developing an application with openCL on an imx6q (freescale - Vivante gc200 with openCL EP) with a Linux suse 13.1 distribution adapted for armv7.
I'm based on the following tutorial : https://community.freescale.com/docs/DOC-93984#comment-12585. I installed the following package : gpu-viv-bin-mx6q.
When I try the example code, it works on a laptop version, but on the imx6, it gives me a segmentation fault when calling the function clGetDeviceIds.
The program is compiled correctly but not work when running;
I tried by passing different null variables in the function. I'm not sure if it's due to memory allocation (as the same code work on my laptop, i can suppose this is not the problem). When I launch it in debug mode, the program seems not to find the file : "gc_hal_user_query.c" (hal is for Hardware Abstraction Layer).
I can't find sufficient documentation on the web, and i'm quite newbie on linux and openCL, so if anybody could help me. Thanks in advance.
I guess the issue is that, when you call
clGetPlatformIDs(1, &cpPlatform, NULL);
cpPlatform receives 0 if no platform is detected. This leads to a segmentation fault during the next call to
clGetDeviceIDs(cpPlatform, CL_DEVICE_TYPE_GPU, 1, &cdDevice, NULL);
I unfortunately can't help further, I have the same problem.
You are running with insufficient permissions. Try running as root.
One of my customers had a problem with a Xeon E5 machine: they were having one gpu (I believe it was an NVIDIA one) hanging and they solved by adding the
intel_iommu = igfx_off
in the grub loader.
What is this value and what does it? I read around but couldn't just figure that out in simple terms
Quoting from the "Intel-IOMMU.txt" file included in the Linux kernel documentation:
"If you encounter issues with graphics devices, you can try adding option intel_iommu=igfx_off to turn off the integrated graphics engine. If this fixes anything, please ensure you file a bug reporting the problem."
Apparently the GPU in this case was not working properly with the DMAR (DMA Remapping) feature provided by the Intel chipset. Using the "igfx_off" parameter allows the GPU to access the physical memory directly without going through the DMAR.
The purpose of the DMAR feature is to enable things like direct assignment of hardware to virtualized guests. If you have to use the "igfx_off" parameter then you probably won't be able to use this GPU in such a direct-assigned virtualization scenario.
The linker for an iOS simulator target I have is reporting the following warning:
ld: warning: too many personality routines for compact unwind to encode
No line number is given, nor anything else that is actionable. Googling turned up some Apple open source code, but I'm not groking it.
What does it mean and what can I do to address it?
I found some information in the C++ ABI for Itanium docs that sheds some light on what this means.
The personality routine is the function in the C++ (or other language) runtime library which serves as an interface between the system unwind library and language-specific exception handling semantics.
Extrapolating, this warning indicates that you've got too many kinds of exception handling in the same binary, and at least one of them may fail. Indeed this is exactly what is observed in this question.
Sadly, there's no clear way to fix this, only workarounds. You can suppress the warning, remove code, rewrite code in a different language, disable a language's exception handing and possibly others.
If you have a crash on exception with this warning, i.e. ::terminate() call on every throw, the workaround is to use old dwarf exception-handling tables. Add following flags to Build Settings/Linking/Other Linker Flags:
-Wl,-keep_dwarf_unwind -Wl,-no_compact_unwind
You can try silencing the warning with -Wl,-no_compact_unwind for the Other Linker Flags setting. I think it's harmless.
I bumped into this issue as well when trying to run on the iOS simulator while my code was running correctly on the device. In my case, it was not a warning but a linker error.
Luckily, i remembered that I added two flags for luajit to run correctly following the instructions of this page under the section Embedding LuaJIT:
-pagezero_size 10000 -image_base 100000000
My guess is that the image_base address is simply out of bounds on the host CPU.
Anyway, removing these settings made it work in my configuration.
So go to your project's settings and look for any hard wired values of this kind if not the same.
Hope this helps,
M