I am trying to load a SPIR binary i created with clang+llvm 6.0.1.
Created a few different files with :
clang -target spir-unknown-unknown -cl-std=CL1.2 -c -emit-llvm -Xclang -finclude-default-header OCLkernel.cl
clang -target amdgcn-amd-amdhsa -cl-std=CL1.2 -c -emit-llvm -Xclang -finclude-default-header OCLkernel.cl
clang -cc1 -emit-llvm-bc -triple spir-unknown-unknown -cl-std=CL1.2 -include "include\opencl-c.h" OCLkernel.cl
This is all happening on windows, installed AMD APP SDK 3 and Adrenalin 18.6.1 drivers.
After this i try to create a program from the binary :
clCreateProgramWithBinary(context, 1, &device, &programSrcSize, (const unsigned char**)&programSrc, 0 , &status)
This all goes OK, i don't get any errors here, but i do when trying to build it afterwards :
clBuildProgram(program, 1, &device, " –x spir -spir-std=1.2", NULL, NULL);
The error i get is :
Error CL_INVALID_COMPILER_OPTIONS when calling clBuildProgram
I tried without the "-x spir..." stuff too, but then i just get a :
error: Invalid value (Producer: 'LLVM6.0.1' Reader: 'LLVM 3.9.0svn')
EDIT:
CL_DEVICE_NAME: gfx900
CL_DEVICE_VERSION: OpenCL 2.0 AMD-APP (2580.6)
CL_DEVICE_OPENCL_C_VERSION: OpenCL C 2.0
CL_DRIVER_VERSION: 2580.6 (PAL,HSAIL)
CL_DEVICE_SPIR_VERSIONS: 1.2
After running clCreateProgramWithBinary i query the device with clGetProgramBuildInfo and get :
CL_PROGRAM_BINARY_TYPE = [CL_PROGRAM_BINARY_TYPE_INTERMEDIATE]
So that should mean the binary is being recognised, else i guess it would return CL_PROGRAM_BINARY_TYPE_NONE
EDIT2:
I think clang isn't creating a 'good' binary, but how to create it then?
Appreciate your help!
Unfortunately the support for SPIR was silently removed from AMD drivers, see dipak answers in this thread of AMD community forum:
https://community.amd.com/thread/232093
Regarding your second question: general clang+LLVM (not the secret version tuned by AMD and included in their proprietary drivers) still cannot produce binaries compatible with general-purpose Windows AMD drivers, however it is possible for Linux: all new AMD’s ROCm, AMD PAL and Mesa 3D runtime are covered.
It is a mystery for me why LLVM AMDGPU backend developers do not prioritize the task to produce binaries for Windows drivers, as there is a couple of GCN assembler projects that provide such a functionality through Windows OpenCL interface, to name a few: CLRadeonExtender, ASM4GCN, HepPas, etc. Moreover I know an undocumented fork of clang+LLVM that (as its author states) produce such OpenCL binaries! "There are more things in heaven and earth, Horatio, Than are dreamt of in your philosophy."
Related
I am trying to use Intel oneAPI advisor beta to do a GPU offloading analysis (via analyze.py and collect.py). I have the problem that all non offloaded regions show Cannot be modelled: No Execution Count.
Furthermore I get the warning
advixe: Warning: A symbol file is not found. The call stack passing through `...../programm.out' module may be incorrect.
I've already tried the troubleshooting described here and here. Moreover I tried using a programm with larger runtime.
I compiled with the compiler flags (according to this) (notice that debugging information is turned on):
-O2 -std=c++11 -fopenmp -g -no-ipo -debug inline-debug-info
I am using Intel(R) Advisor 2021.1 beta07 (build 606302), and Intel(R) C Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 2021.1 Beta Build 202006. The program uses OpenMP.
What could I do to solve this problem?
The problem occurred because the program had too large a workload / the memory of the machine was not sufficient.
Try
running collect.py with --no-track-heap-objects (might reduce precision)
reducing the runtime and memory usage of the program that is analysed
pausing and resuming only on relevant parts via the libittnotify API
I would like to find out, whether built-in Intel graphics cards (e.g. Intel Iris Plus Graphics 655) support OpenACC directives? Would anyone be able to direct me to any relevant information?
The PGI C compiler does not support Intel as a target architecture, where architecture can be specified with the -ta option:
pgcc -I../common -acc -ta=nvidia,time -Minfo=accel -o laplace2d_acc laplace2d.c
Compiler issues the following warning:
pgcc-Warning-OpenACC for GPUs no longer supported on macOS, enabling multicore CPU code generation. Use -ta=multicore to avoid this warning
That means that no GPUs are supported on macOS, but it is still possible to compile the code with OpenACC directives aiming for execution on multiple cores of the CPU with -ta=multicore:
pgcc -I../common -acc -ta=multicore,time -Minfo=accel -o laplace2d_acc laplace2d.c
GNU C compiler (starting from version 7) supports OpenACC (ver. 7 and 8 support OpenACC 2.0a, ver. 9--OpenACC 2.5), where the acc directives are enabled with the -fopenacc option:
gcc -I../common -fopenacc -o laplace2d_acc laplace2d.c
However, I wasn't able to find compiler flags to target the Intel Iris card specifically.
After positive creation of the shared context between OpenGL and OpenCL using following:
cl_context_properties cps[] = {
CL_GL_CONTEXT_KHR,
(cl_context_properties)glXGetCurrentContext(),
CL_GLX_DISPLAY_KHR,
(cl_context_properties)glXGetCurrentDisplay(),
CL_CONTEXT_PLATFORM,
(cl_context_properties)platform_id,
0
};
// Create an OpenCL context
m_contextCL = clCreateContext( cps, 1, &device_id, NULL, NULL, &err);
I try to create a shared texture:
cl_mem mem = clCreateFromGLTexture(
m_contextCL ,
CL_MEM_READ_ONLY ,
GL_TEXTURE_2D ,
0 ,
qt_fbo->texture() ,
&err
);
Now the call is successful only on xubuntu 16.04 with NVIDIA Quadro K620 using proprietary driver version 387.26 and OpenCL delivered with CUDA implementation package.
However when trying it on Toshiba laptop with Intel HD Graphics 520 on Manjaro OS and Beignet OpenCL implementation. The clCreateFromGLTexture(...) is failing by returning CL_INVALID_CONTEXT,
Additionally I tried another platform with Ubuntu 16.04 and Intel Iris IGP (Integrated Graphics Processor) using both Intel SDK and Beignet OpenCL. It fails at the same point of shared texture creation.
I created minimum working example for comparison two GPU techniques (OpenGL and OpenCL) and its interoperability with Qt:
https://github.com/pietrzakmat/opengl-opencl-qt-interop.
All the steps are derived from two tutorials:
1. https://www.codeproject.com/Articles/685281/OpenGL-OpenCL-Interoperability-A-Case-Study-Using
2. https://software.intel.com/en-us/articles/opencl-and-opengl-interoperability-tutorial
Anyone could point out what am I doing wrong and why the creation of shared texture fails on the platforms with integrated graphics or IGP Intel cpu? Is this some problem with drivers or OpenCL implementations? I managed to build and run the samples included in Beignet or intel_ocl_examples so I think the installation is correct.
1) is supported cl_khr_gl_sharing extension? Did you try to use this code on Windows Platform/macOS for Intel GPU ?
2) Did you try to use texture without attached to FBO ?
any way, i think this is problem in OpenCL implementation on Linux platform.
We have an OpenCL program that works fine on my OS X machine. We just set up a machine with a Xeon Phi and Intel MPSS. However, even when not using the Phi but the Xeon CPU, the CL_PROGRAM_BUILD_STATUS we get is CL_BUILD_NONE.
Unfortunately, we cannot find any documentation on what might cause CL_BUILD_NONE. Do you have any suggestion on how to debug this?
Thank you in advance!
Versions:
[#memphis:~] $ cat /etc/SuSE-release
SUSE Linux Enterprise Server 11 (x86_64)
VERSION = 11
PATCHLEVEL = 2
[#memphis:~] $ uname -a
Linux memphis 3.0.13-0.27-default #1 SMP Wed Feb 15 13:33:49 UTC 2012 (d73692b) x86_64 x86_64 x86_64 GNU/Linux
[#memphis:~] 1 $ rpm -qa |grep intel
intel-mic-2.1.6720-15.suse
intel-mic-mpm-2.1.6720-15.suse
opencl-1.2-intel-mic-3.0.67279-1
intel-mic-sysmgmt-2.1.6720-15.suse
intel-mic-kmod-2.1.6720-15.3.0.13.0.suse
intel-mic-gdb-2.1.6720-15.suse
intel-mic-flash-2.1.386-3.suse
intel-mic-cdt-2.1.6720-15.suse
opencl-1.2-intel-devel-3.0.67279-1
intel-mic-micmgmt-2.1.6720-15.3.0.13.0.suse
opencl-1.2-intel-cpu-3.0.67279-1
intel-mic-gpl-2.1.6720-15.suse
intel-mic-crashmgr-2.1.6720-15.suse
The documentation for clGetProgramBuildInfo seems pretty straightforward:
CL_BUILD_NONE. The build status returned if no clBuildProgram, clCompileProgram or clLinkProgram has been performed on the specified program object for device.
You mention that your program worked on other platforms, but maybe you ended up with a slightly different flow between platforms which led to those methods not being properly invoked in the new flow? I'd suggest carefully verifying the return value from the earlier invoked functions to see you get what you expect to get.
Found it. I am not sure why I had &ret (cl_int return value) as the last parameter instead of having it as the return value of clBuildProgram. Moving it and setting the last parameter to NULL fixes the problem:
wrong:
clBuildProgram(*program, 1, &device_id, opts.str().c_str(), NULL, &ret);
correct:
ret = clBuildProgram(*program, 1, &device_id, opts.str().c_str(), NULL, NULL);
I do understand why this problem occured - apparently the compiler / the OpenCL libraries understood that I wanted to use pfn_notify and asynchronously build my kernel. I am, however, not sure if this behavior is fully conformant to the OpenCL documentation:
If pfn_notify is NULL, clBuildProgram does not return until the build has completed.
In my code, the pfn_notify argument was actually NULL, however user_data was (erroneously) not. While my code didn't make any sense, I believe that user_data should be ignored when pfn_notify is NULL.
I posted this on the Intel forums to see if they agree with my interpretation of the documentation.
I was trying to get my problem solved for hours, but I did not find any usefull hints. Hopefully you guys can help me out:
Some usefull data:
OS: Windows 8 Basic 64bit
Library: Intel OpenCL SDK
Compiler: MinGW(-gcc) (latest version)
IDE: Code::Blocks (latest version)
Minimal not working Code:
#include <stdlib.h>
#include <CL/cl.h>
int main(void)
{
cl_uint available;
cl_platform_id* platforms = (cl_platform_id*)malloc(sizeof(cl_platform_id));
cl_int result = clGetPlatformIDs(1, platforms, &available);
free(platforms);
if(result == CL_SUCCESS)
return 0;
return -1;
}
Code::Blocks Global Compiler Settings:
Linker Settings: Added path to Intel's OpenCL.lib ([...]\Intel\OpenCL SDK\3.0\lib\x64\OpenCL.lib) (tried -lOpenCL as Other Options as well)
Search-Directories for Compiler: Path to Intels OpenCL-SDK include directory ([...]\Intel\OpenCL SDK\3.0\include)
Search-Directories for Linker: Path to Intels OpenCL-Lib directory ([...]\Intel\OpenCL SDK\3.0\lib\x64)
Build-Log:
mingw32-g++.exe -L"[...]\Intel\OpenCL SDK\3.0\lib\x64" -o bin\Release\openCLTest.exe obj\Release\main.o -s "[...]\Intel\OpenCL SDK\3.0\lib\x64\OpenCL.lib"
obj\Release\main.o:main.c:(.text.startup+0x39): undefined reference to `clGetPlatformIDs#12'
collect2.exe: error: ld returned 1 exit status
Process terminated with status 1 (0 minutes, 0 seconds)
1 errors, 0 warnings (0 minutes, 0 seconds)
I do not know why he does not link properly.
The [...] in the text is modified by me to shorten the path, normally it would be "C:\Program Files (x86)...".
Hopefully you guys can help me! It is really frustrating! :(
Do you need more information?
EDIT:
Okay... one additional hour and I solved my own problem.
Hope this hint can help some other ppl:
I had to link additionally against the x86-library (seems that some functions are not implemented in X64).
Good to know -.-'''
I got the same problem and I tried hard to figure out the solution and finally I did :)
First of all my hardware are Intel Processor Intel(R) Core(TM) i5-2500 CPU # 3.30GHz and Intel(R) HD Graphics then I installed Intel OpenCL SDK 1.2 after updating the drivers. After that I configure the code::blocks to the new paths for include folder and lib folder as mentioned on the following link: http://www.obellianne.fr/alexandre/tutorials/OpenCL/tuto_opencl_codeblocks.php
Then I tried to compile the examples and I got linking problem as follows:
opencl.o(.text+0x6f):opencl.c: undefined reference to `clGetPlatformIDs#12'
opencl.o(.text+0xa7):opencl.c: undefined reference to `clGetDeviceIDs#24'
opencl.o(.text+0x142):opencl.c: undefined reference to `clGetDeviceInfo#20'
opencl.o(.text+0x263):opencl.c: undefined reference to `clGetDeviceInfo#20'
collect2: ld returned 1 exit status
Process terminated with status 1 (0 minutes, 0 seconds)
4 errors, 0 warnings (0 minutes, 0 seconds)
I tried to use command line and I got the same error then I tried to uninstall Intel sdk and replace it with AMD sdk 2.8 which is support X86 CPU with SSE (Streaming SIMD Extension which is designed by Intel ) 2.x orlater
http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-accelerated-parallel-processing-app-sdk/system-requirements-driver-compatibility/
Finally it works :)
I hope you find this comment useful.
According to an external source i stumbled upon along my own path of enlightenment to this problem i found out something is actually wrong with the mingw-w64 linker. mingw-w64's ld.exe does not want to link with the standard libopencl.a.. whether this is intel SDK specific or not im not sure but here is the link to the solution.
http://sourceforge.net/p/mingw-w64/support-requests/46/
you just have to link to the supplied libopencl.a instead of the default one.
still dont know exactly why the linker gives a problem but i have verified that the solution does (some how) solve the problem.