I get the following error when I try to run the pyopencl code below on my Linux Mint computer with a Radeon graphics card:
unsupported initializer for address space in wordpuzzle
The program is easy to understand: I have an int array which I copy to the kernel. I read the first number of the int array and, based on that number, I want to pick the corresponding char from charSSet and return it.
When I replace
wordbuff[0] = charSSet[a_g[0]];
with
wordbuff[0] = charSSet[10];
then it works well!
Why?
The code:
import numpy as np
import pyopencl as cl
a_np = np.array([10,22,5,3,8])
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags
a_g = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
wordValue = np.array(['\0','\0','\0','\0','\0','\0'])
word_buffer = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=wordValue)
prg = cl.Program(ctx, """
__kernel void wordpuzzle(
    __global const int *a_g, __global unsigned char *wordbuff)
{
    int gid = get_global_id(0);
    char charSSet[53] = {'a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z','A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y','Z'};
    if (gid == 0){
        wordbuff[0] = charSSet[a_g[0]];
    }
}
""").build()
res_g = cl.Buffer(ctx, mf.WRITE_ONLY, a_np.nbytes)
prg.wordpuzzle(queue, (1,), (1,), a_g, word_buffer)
theChars = np.empty_like(wordValue)
cl.enqueue_read_buffer(queue,word_buffer,theChars)
print(theChars)
THE WHOLE ERROR REPORT:
Choose platform: [0]
Choice [0]: Set the environment variable PYOPENCL_CTX='' to avoid being asked again.
Traceback (most recent call last):
  File "demo.py", line 35, in <module>
    """).build()
  File "/usr/lib/python2.7/dist-packages/pyopencl/__init__.py", line 213, in build
    options=options, source=self._source)
  File "/usr/lib/python2.7/dist-packages/pyopencl/__init__.py", line 253, in _build_and_catch_errors
    raise err
pyopencl.RuntimeError: clBuildProgram failed: build program failure -
Build on :
unsupported initializer for address space in wordpuzzle
(options: -I /usr/lib/python2.7/dist-packages/pyopencl/cl)
(source saved as /tmp/tmp7PxezZ.cl)
I'm reading a Unix book, specifically the part about the execve() system call. The book says that file descriptors for open files are passed on to child processes and, by default, also stay open after a process calls execve().
However, when I tried the code below to read from an open file descriptor passed to a program started with execve(), it doesn't seem to work. What's the problem?
Program that calls execve():
int main(int arg,char *argv[],char **env){
    int fd;
    if ( (fd = open("text.txt",O_RDWR | O_CREAT, ALL_OWNER )) == -1 ){
        printf("Open failed\n");
        exit(1);
    };
    printf("%d\n",fd); // 3
    char buff [] = "Hello World\n";
    write(fd,buff,strlen(buff));
    int res;
    if ( (res = execl("./demo",(char *)0)) == -1 ){
        exit(1);
    };
}
Program demo invoked by execve() :
setbuf(stdout,NULL);
printf("Demo executing...\n");
ssize_t r;
char buff[1024];
while ( (r = read(3,buff,sizeof(buff))) > 0 ){
    write(STDOUT_FILENO,buff,r);
}
I'm using macOS.
The "demo" process does inherit the file descriptor and can read the file, but after the write() the file offset is at the end of the file, so read() returns 0 immediately. Call lseek(fd, 0, SEEK_SET) before calling execl(), or do it in "demo" before reading the file.
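The offset behaviour described in the answer can be reproduced without exec at all, because the offset belongs to the open file description, which is the thing execve() preserves. Below is a minimal Python sketch of that idea; the helper name read_after_write and the temporary file are made up for illustration and are not part of the original programs.

```python
import os
import tempfile


def read_after_write(rewind):
    """Write to a fresh file, optionally rewind, then read from the same descriptor."""
    with tempfile.TemporaryDirectory() as tmp:
        fd = os.open(os.path.join(tmp, "text.txt"), os.O_RDWR | os.O_CREAT, 0o600)
        try:
            os.write(fd, b"Hello World\n")    # the offset now sits at end of file
            if rewind:
                os.lseek(fd, 0, os.SEEK_SET)  # move the offset back to the start
            return os.read(fd, 1024)          # read() starts at the current offset
        finally:
            os.close(fd)


print(read_after_write(rewind=False))  # nothing left to read after the write
print(read_after_write(rewind=True))   # the written bytes come back
```

Without the rewind the read returns an empty result, which is what the loop in "demo" sees on its first read() call.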
I am trying to get this minimal Rcpp/Intel pragma code to work, but I am running into some pretty big errors that I am struggling to overcome.
Code
This is the full code that I am trying to run. It is a simple text readout showing whether the target Xeon Phi device is engaged, as described on this website: Offload Computations from Servers with an Intel® Xeon Phi™ Processor. It is based on sample code found here: Lightning-Fast R Machine Learning Algorithms:
library(Rcpp)
library(inline)
# Create and register a Rcpp plugin
plug <- Rcpp:::Rcpp.plugin.maker(
include.before = "#include <stdint.h>
#include <stdio.h>
#include <omp.h>"
)
registerPlugin("daalNB", plug)
whatCPU <-
'
#pragma omp declare target
void what_cpu()
{
    uint32_t eax;
    const uint32_t xeon_phi_x100_id = 0x00010;
    const uint32_t xeon_phi_x200_id = 0x50070;
    __asm volatile("cpuid":"=a"(eax):"a"(1));
    uint32_t this_cpu_id = eax & 0xF00F0;
    if (this_cpu_id == xeon_phi_x100_id)
        printf("This CPU is Intel(R) XeonPhi(TM) x100 Processor!");
    else
    if (this_cpu_id == xeon_phi_x200_id)
        printf("This CPU is Intel(R) XeonPhi(TM) x200 Processor!");
    else
        printf("This CPU is other Intel(R) Processor.");
}
'
offloadExampleRcpp <-
'
//[[Rcpp::plugins(openmp)]]
printf("Running on host: ");
what_cpu();
#pragma offload target(mic:0)
{
printf("Running on target: ");
what_cpu();
}
'
runOffloadExample <- cxxfunction(sig = signature(), body = offloadExampleRcpp, plugin="daalNB", includes = '
//[[Rcpp::plugins(openmp)]]
#pragma omp declare target
void what_cpu()
{
    uint32_t eax;
    const uint32_t xeon_phi_x100_id = 0x225d;
    const uint32_t xeon_phi_x200_id = 0x50070;
    __asm volatile("cpuid":"=a"(eax):"a"(1));
    uint32_t this_cpu_id = eax & 0xF00F0;
    if (this_cpu_id == xeon_phi_x100_id)
        printf("This CPU is Intel(R) XeonPhi(TM) x100 Processor!");
    else
    if (this_cpu_id == xeon_phi_x200_id)
        printf("This CPU is Intel(R) XeonPhi(TM) x200 Processor!");
    else
        printf("This CPU is other Intel(R) Processor.");
}
', verbose = 2)
runOffloadExample()
What I have tried, and the errors:
I have set up the software stack for the Xeon Phi properly. I can confirm this because when I compile the .c code (the code wrapped inside Rcpp above) outside of R with the Intel icc compiler, it succeeds and I get the exact output shown on the Intel website.
It seems, however, that when the same .c code is wrapped inside Rcpp, the following errors arise and prevent compilation (this is a sample from a much longer readout):
Compilation argument:
/usr/local/lib64/R/bin/R CMD SHLIB file306737f15222.cpp 2> file306737f15222.cpp.err.txt
/opt/intel/compilers_and_libraries_2017.1.132/linux/bin/intel64/icc -I/usr/local/lib64/R/include -DNDEBUG -I"/home/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/include" -I/usr/local/include -fpic -qopenmp -c file306737f15222.cpp -o file306737f15222.o
file306737f15222.cpp(52): warning #2571: variable has not been declared with compatible "target" attribute
BEGIN_RCPP
^
file306737f15222.cpp(63): warning #2570: function has not been declared with compatible "target" attribute
END_RCPP
^
file306737f15222.cpp(63): warning #2570: function has not been declared with compatible "target" attribute
END_RCPP
^
Furthermore, the readout above is followed by a great many messages like the ones below (again, I have only included a sample for the sake of brevity, but I believe the others all relate to the same issue as posed in my question):
/home/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/include/Rcpp/protection/Shelter.h(34): warning #2570: function has not been declared with compatible "target" attribute
Rcpp_unprotect(nprotected) ;
^
detected during:
instantiation of "Rcpp::Shelter<T>::~Shelter() [with T=SEXP]" at line 323 of "/home/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/include/Rcpp/exceptions.h"
instantiation of "SEXP exception_to_condition_template(const Exception &, bool) [with Exception=Rcpp::exception]" at line 339 of "/home/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/include/Rcpp/exceptions.h"
/home/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/include/Rcpp/protection/Shelter.h(30): warning #2570: function has not been declared with compatible "target" attribute
return Rcpp_protect(x) ;
^
detected during:
instantiation of "SEXP Rcpp::Shelter<T>::operator()(SEXP) [with T=SEXP]" at line 326 of "/home/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/include/Rcpp/exceptions.h"
instantiation of "SEXP exception_to_condition_template(const Exception &, bool) [with Exception=Rcpp::exception]" at line 339 of "/home/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/include/Rcpp/exceptions.h"
/home/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/include/Rcpp/utils/tinyformat/tinyformat.h(218): warning #2570: function has not been declared with compatible "target" attribute
static void invoke(std::ostream& /*out*/, const T& /*value*/) { TINYFORMAT_ASSERT(0); }
^
detected during:
instantiation of "void tinyformat::detail::formatValueAsType<T, fmtT, convertible>::invoke(std::ostream &, const T &) [with T=const char *, fmtT=char, convertible=false]" at line 329
instantiation of "void tinyformat::formatValue(std::ostream &, const char *, const char *, int, const T &) [with T=const char *]" at line 528
instantiation of "void tinyformat::detail::FormatArg::formatImpl<T>(std::ostream &, const char *, const char *, int, const void *) [with T=const char *]" at line 504
instantiation of "tinyformat::detail::FormatArg::FormatArg(const T &) [with T=const char *]" at line 881
instantiation of "tinyformat::detail::FormatListN<N>::FormatListN(const Args &...) [with N=1, Args=<const char *>]" at line 930
instantiation of "tinyformat::detail::FormatListN<<expression>> tinyformat::makeFormatList(const Args &...) [with Args=<const char *>]" at line 966
instantiation of "void tinyformat::format(std::ostream &, const char *, const Args &...) [with Args=<const char *>]" at line 975
instantiation of "std::string tinyformat::format(const char *, const Args &...) [with Args=<const char *>]" at line 226 of "/home/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/include/Rcpp/exceptions.h"
instantiation of "Rcpp::not_compatible::not_compatible(const char *, Args &&...) [with Args=<const char *const &>]" at line 37 of "/home/R/x86_64-pc-linux-gnu-library/3.4/Rcpp/include/Rcpp/r_cast.h"
Can anyone help point me to the issues that cause the above errors? I recognise that I might be missing something fundamental owing to my unfamiliarity with C and the Rcpp package, so please excuse this.
Many thanks in advance.
I just started learning OpenCL and set up a Visual Studio project using VS2015. Somehow, the code finds only 1 platform (I guess it should be the CPU) and cannot find the GPU device. Can someone please help? The details are as follows:
GPU: Nvidia Quadro K4000
CUDA Installation
CUDA is at: “C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5”
OpenCL related files are located at "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\include\CL" and "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v7.5\lib\Win32" (assuming 32bit system)
The installer created two environment variables “CUDA_PATH” and “CUDA_PATH_V7_5”. They both point to the above location.
In Visual Studio, the project is set up as
"Project Properties" -> "C/C++" -> "Additional Include Directories" -> "$(CUDA_PATH)\include"
"Project Properties" -> "Linker" -> "Additional Library Directories" -> "$(CUDA_PATH)\lib\Win32"
"Project Properties" -> "Linker" -> "Input" -> "Additional Dependencies" -> "OpenCL.lib"
The code is very simple:
#include "stdafx.h"
#include <iostream>
#include <CL/cl.h>
using namespace std;

int main()
{
    cl_int err;
    cl_uint numPlatforms;
    err = clGetPlatformIDs(0, NULL, &numPlatforms);
    if (CL_SUCCESS == err)
        cout << "Detected OpenCL platforms: " << numPlatforms << endl;
    else
        cout << "Error calling clGetPlatformIDs. Error code:" << err << endl;
    cl_device_id device = NULL;
    err = clGetDeviceIDs(NULL, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
    if (err == CL_SUCCESS)
        cout << device << endl;
    return 0;
}
The code compiles and runs, but it cannot find the GPU device. Specifically, the returned value of the variable device is device = 0x00000000 <NULL>. What could be the problem? Thanks for the help.
This is not the way to use the OpenCL API.
You need to obtain a valid cl_platform_id object, which is then used to retrieve a cl_device_id. You are always passing NULL as the platform, and that can't work.
The first call to clGetPlatformIDs is made to obtain the number of platforms in the system. After that, you need to call the function again to retrieve the actual cl_platform_ids:
cl_uint numPlatforms;  // clGetPlatformIDs reports the count as a cl_uint, not a size_t
err = clGetPlatformIDs(0, NULL, &numPlatforms);
assert(err == CL_SUCCESS && numPlatforms > 0);
cl_platform_id platform_ids[numPlatforms];
err = clGetPlatformIDs(numPlatforms, platform_ids, NULL);
However, if you already know there is only one platform in the system, you can take the following shortcut, but make sure to check for errors:
cl_platform_id platform_id;
err = clGetPlatformIDs(1, &platform_id, NULL);
assert(err == CL_SUCCESS);
After you have obtained a platform you need to follow the same procedure to first obtain the number of devices and then retrieve the list of OpenCL devices (which you then will need to build a cl_context, queues...):
// Note: this has to be done for each `cl_platform_id`
// until you find the device you were looking for
cl_uint numDevices;  // the device count is a cl_uint as well
err = clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_GPU, 0, NULL, &numDevices);
assert(err == CL_SUCCESS && numDevices > 0);
cl_device_id devices[numDevices];
err = clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_GPU, numDevices, devices, NULL);
I guess you understand the procedure now. If, as above, you already know that there is only 1 GPU device in the system, you can get its cl_device_id directly as follows:
cl_device_id device;
err = clGetDeviceIDs(platform_id, CL_DEVICE_TYPE_GPU, 1, &device, NULL);
assert(err == CL_SUCCESS);
Here is the command line I use on Linux:
icc test.c -o test.o -L/opt/intel/current/mkl/intel64 -I/opt/intel/current/mkl/include -lmkl_intel_ilp64 -lmkl_core -lmkl_scalapack_ilp64
After running this command I get a long list of undefined reference errors. I have also tried in Eclipse but could not resolve the linking problem there either. I would be happy if anyone could help me get a small program like this to run:
//test.c- a sample code from user guide
#include "mkl.h"
#define N 5
void main()
{
    int n, inca = 1, incb = 1, i;
    typedef struct{ double re; double im; } complex16;
    complex16 a[N], b[N], c;
    void zdotc();
    n = N;
    for( i = 0; i < n; i++ ){
        a[i].re = (double)i; a[i].im = (double)i * 2.0;
        b[i].re = (double)(n - i); b[i].im = (double)i * 2.0;
    }
    zdotc( &c, &n, a, &inca, b, &incb );
    printf( "The complex dot product is: ( %6.2f, %6.2f) ", c.re, c.im );
}
My server:
> MKLROOT: /opt/intel/current/mkl/
> library: $MKLROOT/lib/intel64/
> include:$MKLROOT/include
The 64-bit ICC is installed.
Thanks in advance.
The best way to get the right link line for Intel MKL is to use the MKL Link Line Advisor. Even with the right LD_LIBRARY_PATH, your compiler options and the set of libraries you link don't look right. They should be:
-DMKL_ILP64 -I$(MKLROOT)/include -L$(MKLROOT)/lib/intel64 -lmkl_intel_ilp64 -lmkl_intel_thread -lmkl_core -openmp -lpthread -lm
I have worked a little with OpenCL, but recently "clBuildProgram" failed in one of my programs. An excerpt of my code is below:
cl_program program;
program = clCreateProgramWithSource(context, 1, (const char**) &kernel_string, NULL, &err);
if(err != CL_SUCCESS)
{
    cout<<"Unable to create Program Object. Error code = "<<err<<endl;
    exit(1);
}
if(clBuildProgram(program, 0, NULL, NULL, NULL, NULL) != CL_SUCCESS)
{
    cout<<"Program Build failed\n";
    size_t length;
    char buffer[2048];
    clGetProgramBuildInfo(program, device_id[0], CL_PROGRAM_BUILD_LOG, sizeof(buffer), buffer, &length);
    cout<<"--- Build log ---\n "<<buffer<<endl;
    exit(1);
}
Previously, whenever "clBuildProgram" failed, "clGetProgramBuildInfo()" gave me the syntax or other errors from the kernel file at this point. But when this program runs, the console only prints:
Program Build failed
--- Build log ---
And when I print the error code returned by "clBuildProgram", it is "-11".
What could be wrong with my kernel file such that I don't get any build failure information?
You can learn the meaning of OpenCL error codes by searching in cl.h. In this case, -11 is just what you'd expect, CL_BUILD_PROGRAM_FAILURE. It's certainly curious that the build log is empty. Two questions:
1.) What is the return value from clGetProgramBuildInfo?
2.) What platform are you on? If you are using Apple's OpenCL implementation, you could try setting CL_LOG_ERRORS=stdout in your environment. For example, from Terminal:
$ CL_LOG_ERRORS=stdout ./myprog
It's also pretty easy to set this in Xcode (Edit Scheme -> Arguments -> Environment Variables).
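The lookup of error codes in cl.h mentioned above can be sketched as a small table. This is only an illustrative Python helper (the names CL_ERROR_NAMES and cl_error_name are made up here), and it covers just a handful of the codes defined in the Khronos cl.h header:

```python
# A small subset of the status codes defined in the Khronos cl.h header.
CL_ERROR_NAMES = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
    -11: "CL_BUILD_PROGRAM_FAILURE",
    -30: "CL_INVALID_VALUE",
    -32: "CL_INVALID_PLATFORM",
    -33: "CL_INVALID_DEVICE",
    -44: "CL_INVALID_PROGRAM",
}


def cl_error_name(code):
    """Return the symbolic name for an OpenCL status code, if it is in the table."""
    return CL_ERROR_NAMES.get(code, "unknown OpenCL error (%d)" % code)


print(cl_error_name(-11))
```

Passing the return value of clGetProgramBuildInfo through the same kind of lookup is one way to answer the first question: CL_INVALID_DEVICE or CL_INVALID_VALUE there would explain why the log buffer was never filled.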
If you are using C instead of C++:
err = clBuildProgram(program, 0, NULL, NULL, NULL, NULL);
////////////////Add the following lines to see the log file///////////
if (err != CL_SUCCESS) {
    char *buff_erro;
    cl_int errcode;
    size_t build_log_len;
    errcode = clGetProgramBuildInfo(program, devices[0], CL_PROGRAM_BUILD_LOG, 0, NULL, &build_log_len);
    if (errcode) {
        printf("clGetProgramBuildInfo failed at line %d\n", __LINE__);
        exit(-1);
    }
    buff_erro = malloc(build_log_len);
    if (!buff_erro) {
        printf("malloc failed at line %d\n", __LINE__);
        exit(-2);
    }
    errcode = clGetProgramBuildInfo(program, devices[0], CL_PROGRAM_BUILD_LOG, build_log_len, buff_erro, NULL);
    if (errcode) {
        printf("clGetProgramBuildInfo failed at line %d\n", __LINE__);
        exit(-3);
    }
    fprintf(stderr,"Build log: \n%s\n", buff_erro); //Be careful with the fprint
    free(buff_erro);
    fprintf(stderr,"clBuildProgram failed\n");
    exit(EXIT_FAILURE);
}
I encountered the same problem with an empty build log. I was testing my OpenCL kernel on a different computer that had two platforms instead of one: one Intel GPU and one AMD GPU. I only had the AMD OpenCL SDK installed. Installing the Intel OpenCL SDK fixed the problem. Selecting the AMD platform instead of the Intel GPU platform also fixed it.
I've seen this happen on macOS 10.14.6 when the OpenCL kernel source is missing the __kernel qualifier. If both the __kernel qualifier and the return type are missing, it seems to crash the system OpenCL compiler daemon, which then takes a few seconds to restart before new kernels will compile again.