Using the square root function (sqrt) with doubles in OpenCL

I've got a kernel that uses the OpenCL built-in square root function (sqrt), but when I try to run it on the GPU I get an "unrecognized command" error at build time. It works fine if I use floats, but not with doubles. I'm running Mac OS X 10.7.5 and my graphics card is an ATI Radeon HD 6750.
Does anyone know what the problem could be?

Apparently your GPU doesn't support double-precision floats:
http://clbenchmark.com/device-environment.jsp?config=12011396
AMD cards that do support doubles report the extension cl_khr_fp64 (or cl_amd_fp64).
You could check at OpenCL compile time this way:
#ifdef cl_khr_fp64
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
#elif defined(cl_amd_fp64)
#pragma OPENCL EXTENSION cl_amd_fp64 : enable
#else
#error "Double precision floating point not supported by OpenCL implementation."
#endif
Or you could check from the host, without invoking the OpenCL compiler (a configFp64 value of 0 means doubles are not supported):
status = clGetDeviceInfo (oclInfo->device, CL_DEVICE_DOUBLE_FP_CONFIG, sizeof configFp64, &configFp64, NULL);
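A third option is to look for the extension name in the device's extension string. The sketch below assumes that string has already been fetched on the host (for example with clGetDeviceInfo and CL_DEVICE_EXTENSIONS) and only shows the matching step; has_extension and supports_fp64 are hypothetical helper names, not part of OpenCL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `name` appears as a whole space-separated token in
 * `ext_list`. A plain strstr() would also accept longer names that merely
 * contain `name`, so the token boundaries are checked explicitly. */
static bool has_extension(const char *ext_list, const char *name)
{
    size_t len = strlen(name);
    if (len == 0)
        return false;

    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return true;
        p++; /* keep searching after this partial match */
    }
    return false;
}

/* Hypothetical wrapper: true if either double-precision extension is listed. */
static bool supports_fp64(const char *ext_list)
{
    return has_extension(ext_list, "cl_khr_fp64") ||
           has_extension(ext_list, "cl_amd_fp64");
}
```

If supports_fp64 returns false for the device's extension string, a kernel that uses double would be expected to fail at build time, as described in the question.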

Related

Load SPIR binary with clBuildProgram on Windows

I am trying to load a SPIR binary I created with clang+LLVM 6.0.1.
I created a few different files with:
clang -target spir-unknown-unknown -cl-std=CL1.2 -c -emit-llvm -Xclang -finclude-default-header OCLkernel.cl
clang -target amdgcn-amd-amdhsa -cl-std=CL1.2 -c -emit-llvm -Xclang -finclude-default-header OCLkernel.cl
clang -cc1 -emit-llvm-bc -triple spir-unknown-unknown -cl-std=CL1.2 -include "include\opencl-c.h" OCLkernel.cl
This is all happening on Windows, with AMD APP SDK 3 and the Adrenalin 18.6.1 drivers installed.
After this I try to create a program from the binary:
clCreateProgramWithBinary(context, 1, &device, &programSrcSize, (const unsigned char**)&programSrc, 0 , &status)
This all goes OK and I don't get any errors here, but I do when trying to build it afterwards:
clBuildProgram(program, 1, &device, " –x spir -spir-std=1.2", NULL, NULL);
The error I get is:
Error CL_INVALID_COMPILER_OPTIONS when calling clBuildProgram
I tried without the "-x spir..." options too, but then I just get:
error: Invalid value (Producer: 'LLVM6.0.1' Reader: 'LLVM 3.9.0svn')
EDIT:
CL_DEVICE_NAME: gfx900
CL_DEVICE_VERSION: OpenCL 2.0 AMD-APP (2580.6)
CL_DEVICE_OPENCL_C_VERSION: OpenCL C 2.0
CL_DRIVER_VERSION: 2580.6 (PAL,HSAIL)
CL_DEVICE_SPIR_VERSIONS: 1.2
After running clCreateProgramWithBinary I query the program with clGetProgramBuildInfo and get:
CL_PROGRAM_BINARY_TYPE = [CL_PROGRAM_BINARY_TYPE_INTERMEDIATE]
So that should mean the binary is being recognised; otherwise I guess it would return CL_PROGRAM_BINARY_TYPE_NONE.
EDIT2:
I think clang isn't creating a 'good' binary, but how should I create it then?
Appreciate your help!
Unfortunately, support for SPIR was silently removed from the AMD drivers; see dipak's answers in this thread on the AMD community forum:
https://community.amd.com/thread/232093
Regarding your second question: stock clang+LLVM (as opposed to the private version tuned by AMD and shipped in their proprietary drivers) still cannot produce binaries compatible with the general-purpose Windows AMD drivers. It is possible on Linux, however: AMD's newer ROCm, AMD PAL and Mesa 3D runtimes are all covered.
It is a mystery to me why the LLVM AMDGPU backend developers do not prioritize producing binaries for the Windows drivers, as there are a couple of GCN assembler projects that provide such functionality through the Windows OpenCL interface, to name a few: CLRadeonExtender, ASM4GCN, HepPas, etc. Moreover, I know of an undocumented fork of clang+LLVM that (as its author states) produces such OpenCL binaries! "There are more things in heaven and earth, Horatio, Than are dreamt of in your philosophy."
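One detail worth ruling out separately from the driver issue: the options string as quoted in the question starts with an en dash (–x) rather than an ASCII hyphen (-x). That may only be a pasting artifact, but a non-ASCII byte in the build options could plausibly trigger CL_INVALID_COMPILER_OPTIONS by itself. A minimal sketch of a check, where first_non_ascii is a hypothetical helper name:

```c
#include <stddef.h>

/* Returns the index of the first byte outside 7-bit ASCII in `s`,
 * or -1 if every byte of the string is plain ASCII. */
static long first_non_ascii(const char *s)
{
    for (size_t i = 0; s[i] != '\0'; ++i) {
        if ((unsigned char)s[i] > 0x7F)
            return (long)i;
    }
    return -1;
}
```

Running this over the options string before passing it to clBuildProgram would show whether a typographic dash slipped in. Even with a clean string, the build would still be expected to fail on drivers that no longer accept SPIR.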

CodeXL cannot run GPU profile

I wrote an OpenCL program and want to profile it with CodeXL, but the "GPU: Performance Counters" mode doesn't work. The program is a very simple vector-add example and it runs properly from Visual Studio 2017. CodeXL reports that it cannot open vecAdd.cl and that it failed to create the CL program from source. This is strange; can anyone give me some advice? The operating system is Windows 10 Pro x64, with CodeXL 2.5.67, an AMD FirePro W7100 and AMD APP SDK 3.0 x86.
The vecAdd.cl is as follows:
__kernel void vector_add(global const float *a, global const float *b,
                         global float *result)
{
    int gid = get_global_id(0);
    result[gid] = a[gid] + b[gid];
}
OK, I have solved it. I had set the wrong directory, so CodeXL could not find vecAdd.cl.
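A "cannot open vecAdd.cl" failure is typical when the host program opens the kernel file by a relative path and the profiler launches it from a different working directory than the IDE does. The following is a minimal sketch of a loader that reports which path it tried; read_kernel_source is a hypothetical helper name, and the original program's loading code is not shown in the question.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads a whole file into a NUL-terminated heap buffer that the caller
 * must free. Returns NULL on failure; if the file cannot be opened, the
 * path and the OS error message are printed to stderr. */
static char *read_kernel_source(const char *path, size_t *out_len)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        fprintf(stderr, "cannot open %s: %s\n", path, strerror(errno));
        return NULL;
    }
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long size = ftell(f);
    if (size < 0) { fclose(f); return NULL; }
    rewind(f);

    char *buf = malloc((size_t)size + 1);
    if (buf == NULL) { fclose(f); return NULL; }

    size_t got = fread(buf, 1, (size_t)size, f);
    fclose(f);
    buf[got] = '\0';
    if (out_len != NULL)
        *out_len = got;
    return buf;
}
```

Printing the attempted path makes it easier to see that the working directory configured in the profiler differs from the one used by Visual Studio.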

clCreateFromGLTexture() returns CL_INVALID_CONTEXT on certain platforms only

After successfully creating the shared context between OpenGL and OpenCL using the following:
cl_context_properties cps[] = {
    CL_GL_CONTEXT_KHR,   (cl_context_properties)glXGetCurrentContext(),
    CL_GLX_DISPLAY_KHR,  (cl_context_properties)glXGetCurrentDisplay(),
    CL_CONTEXT_PLATFORM, (cl_context_properties)platform_id,
    0
};
// Create an OpenCL context
m_contextCL = clCreateContext(cps, 1, &device_id, NULL, NULL, &err);
I try to create a shared texture:
cl_mem mem = clCreateFromGLTexture(
    m_contextCL,
    CL_MEM_READ_ONLY,
    GL_TEXTURE_2D,
    0,
    qt_fbo->texture(),
    &err
);
The call currently succeeds only on Xubuntu 16.04 with an NVIDIA Quadro K620, using proprietary driver version 387.26 and the OpenCL implementation shipped with the CUDA package.
However, on a Toshiba laptop with Intel HD Graphics 520 running Manjaro with the Beignet OpenCL implementation, clCreateFromGLTexture(...) fails and returns CL_INVALID_CONTEXT.
Additionally, I tried another platform with Ubuntu 16.04 and an Intel Iris IGP (Integrated Graphics Processor) using both the Intel SDK and Beignet OpenCL. It fails at the same point, when creating the shared texture.
I created a minimal working example that compares the two GPU techniques (OpenGL and OpenCL) and their interoperability with Qt:
https://github.com/pietrzakmat/opengl-opencl-qt-interop.
All the steps are derived from two tutorials:
1. https://www.codeproject.com/Articles/685281/OpenGL-OpenCL-Interoperability-A-Case-Study-Using
2. https://software.intel.com/en-us/articles/opencl-and-opengl-interoperability-tutorial
Could anyone point out what I am doing wrong, and why creating the shared texture fails on platforms with Intel integrated graphics? Is this a problem with the drivers or with the OpenCL implementations? I managed to build and run the samples included in Beignet and intel_ocl_examples, so I think the installation is correct.
1) Is the cl_khr_gl_sharing extension supported? Did you try this code on Windows or macOS with an Intel GPU?
2) Did you try using a texture that is not attached to an FBO?
Anyway, I think this is a problem in the OpenCL implementation on the Linux platform.

How to enable OpenCL extensions?

I am trying to enable the OpenCL extension cl_khr_gl_depth_images to make the following work:
glGenRenderbuffers(1, &gl_depthbuffer);
glBindRenderbuffer(GL_RENDERBUFFER, gl_depthbuffer);
glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT32F, width, height);
glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, gl_depthbuffer);
...
cl_depth = clCreateFromGLRenderbuffer(context, CL_MEM_READ_ONLY, gl_depthbuffer, &error);
At the moment the clCreateFromGLRenderbuffer call fails with the error CL_INVALID_IMAGE_FORMAT_DESCRIPTOR.
I added the following lines to the top of my cpp file:
#include <CL/cl.hpp>
#pragma OPENCL EXTENSION cl_khr_gl_sharing : enable
#pragma OPENCL EXTENSION cl_khr_gl_depth_images : enable
But my compiler gives two unknown pragma warnings and I am still getting the CL_INVALID_IMAGE_FORMAT_DESCRIPTOR error.
Am I including the extensions wrong, or can one not use depth renderbuffers in OpenCL?
Edit: My device supports the extensions in question!
The specification!
As doqtor already pointed out, put the lines
#pragma OPENCL EXTENSION cl_khr_gl_sharing : enable
#pragma OPENCL EXTENSION cl_khr_gl_depth_images : enable
at the top of your OpenCL C source code and not in your C++ code.
The C++ part of all available extensions is enabled by default and the required functions of the extension are automatically compiled into the executable.
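To illustrate where the pragma lives, here is a minimal sketch in which the host code builds the kernel source string with the pragma as its first line. The helper name with_extension_pragma is hypothetical; the resulting string is what would then be handed to clCreateProgramWithSource. Whether this also resolves the CL_INVALID_IMAGE_FORMAT_DESCRIPTOR error is a separate question that the sketch does not address.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of the OpenCL C source `src` with an
 * "enable" pragma for extension `ext` prepended as the first line.
 * The caller frees the result. Returns NULL on failure. */
static char *with_extension_pragma(const char *ext, const char *src)
{
    static const char fmt[] = "#pragma OPENCL EXTENSION %s : enable\n%s";

    /* First pass computes the required length, second pass writes. */
    int need = snprintf(NULL, 0, fmt, ext, src);
    if (need < 0)
        return NULL;

    char *out = malloc((size_t)need + 1);
    if (out == NULL)
        return NULL;

    snprintf(out, (size_t)need + 1, fmt, ext, src);
    return out;
}
```

The pragma ends up inside the text compiled by the OpenCL C compiler, so the C++ compiler never sees it and the "unknown pragma" warnings go away.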

How to do OpenCL programming in the newest Xilinx Vivado (2014.2)?

I tried a simple "Hello, world" OpenCL program in version 2014.2 of the Xilinx Vivado IDE, which advertises OpenCL support. One of the code snippets is as follows:
#include <CL/opencl.h>
...
// Connect to a compute device
//
int gpu = 1;
err = clGetDeviceIDs(NULL, gpu ? CL_DEVICE_TYPE_GPU : CL_DEVICE_TYPE_CPU, 1, &device_id, NULL);
if (err != CL_SUCCESS)
{
    printf("Error: Failed to create a device group!\n");
    return EXIT_FAILURE;
}
However, it seems that Vivado doesn't recognize the header "CL/opencl.h" or the cl-related functions. I resolved the header problem by manually putting an external CL directory (taken from the CUDA SDK) into my current Vivado HLS project, but it still reports errors like "function 'clGetDeviceIDs' has no function body".
On Mac OS X the include is #include <OpenCL/opencl.h>, while on Windows it is usually #include <CL/cl.h>. Have you located your CL include folder? Have you told the IDE where it is? It sounds like your second problem (after you worked around the first) is that you're not linking against OpenCL.lib (or whatever the library extension is on your platform). You need to locate that too and link to it. On an ICD-supporting platform, the Khronos library can be used and it dynamically locates the installed drivers, but on your platform it is probably different, so consult the Xilinx instructions.
It seems that including clc.h in my Vivado 2015.2 did the trick.
