Error in compile-time arguments with AMD - opencl

This is about compile-time arguments in OpenCL.
I have a fixed-size array of constants, and I am passing it as a compile-time argument, as follows:
-DCOEFF=0.1f,0.2f,0.5f,0.2f,0.1f
In the kernel, I read it as:
__kernel void Smoothing(__global const float *in, __global float *out)
{
float chnWeight[] = {COEFF};
}
With the Intel SDK, this gives a considerable performance benefit compared to passing the coefficients as another argument to the kernel.
The problem is that on AMD this does not compile. I get the following error:
0.2f:
Catastrophic error: cannot open source file "0.2f"
1 catastrophic error detected in the compilation of "0.2f".
Compilation terminated.
I understand that on AMD the comma is also treated as a separator between compile-time arguments, and that this is causing the error.
Any help to solve this problem will be appreciated. Thanks in advance.

This problem was introduced into AMD OpenCL sometime between versions 937.2 and 1268.1. Here is a work-around:
Replace,
-DCOEFF=0.1f,0.2f,0.5f,0.2f,0.1f
with
-D COEFF=0.1f,0.2f,0.5f,0.2f,0.1f

Try quoting the string: -DCOEFF="0.1f,0.2f,0.5f,0.2f,0.1f"
It looks as if the compiler is looking for a file named "0.2f", which is the second element, so after the first element and the comma the compiler has already stopped interpreting the input as part of the COEFF define.
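For anyone assembling the option on the host side, here is a minimal sketch of building the string with the space after -D, as in the first workaround. The helper name make_define_option is made up for illustration, and whether the spaced form gets past a particular AMD driver is only what the answer above reports, not something this code checks.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of any OpenCL SDK): builds
// "-D NAME=v0f,v1f,..." for use in an OpenCL build-options string.
std::string make_define_option(const std::string& name,
                               const std::vector<float>& values) {
    assert(!values.empty() && "need at least one coefficient");
    std::ostringstream opt;
    // Fixed notation always prints a decimal point, so appending 'f'
    // gives a valid float literal such as 0.100000f.
    opt << std::fixed << std::setprecision(6);
    opt << "-D " << name << '=';  // note the space after -D
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (i != 0) opt << ',';
        opt << values[i] << 'f';
    }
    return opt.str();
}
```

The resulting string would then be passed as the options argument of clBuildProgram. Six fixed decimals is an arbitrary choice for this sketch; very small coefficients would need a different format.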

Related

Compatibility of Rcpp code involving std::transform in package

I am in the process of finishing a package I've been working on. All checks look good and it compiles without problems on my computer. win-builder has no problems with the package either. As a further check, I tried to install it from source on a colleague's computer, and it fails. The problem comes from an Rcpp function that I took from a StackOverflow thread on vector powers in Rcpp:
NumericVector vecpow(const NumericVector base, const NumericVector exp) {
NumericVector out(base.size());
std::transform(base.begin(), base.end(),
exp.begin(), out.begin(), ::pow);
return out;
}
This compiles and works fine for me, but throws an error for my colleague when installing, involving
error: no matching function for call to 'transform'
and
candidate function template not viable: requires 4 arguments, but 5 were provided.
I can reproduce the error for instance by replacing ::pow in the original code by pow. I'm on Windows 8.1, my colleague is on Mac. The colleague maintains his own packages involving extensive amounts of Rcpp code and usually has no problems with compiling.
I assume that this might be a compiler problem. The original thread has some alternative code involving C++11 (thread is already five years old), so in principle, I could replace the problematic code using alternatives. However, as I'm not very experienced with this, this would be trial and error. My question is: Is there a simple reason why this error pops up? And how can I modify my code in order to be sure that the package will be installable and usable for most users?
The error occurs because the compiler cannot match pow as the binary operation (probably because it has at least two overloads and the compiler cannot tell whether the float or the double one is meant), which gives rise to the following note:
note: candidate template ignored: couldn't infer template argument '_BinaryOperation'
It then falls back to the unary std::transform, which takes only 4 arguments, producing the second note:
note: candidate function template not viable: requires 4 arguments, but 5 were provided
Compilation stops because no valid std::transform overload was found for the specified arguments.
Switching from pow to powf avoids the issue because the compiler no longer has to resolve any overloads; however, precision might be lost with this change:
Rcpp::cppFunction("NumericVector vecpow(const NumericVector base, const NumericVector exp) {
NumericVector out(base.size());
std::transform(base.begin(), base.end(),
exp.begin(), out.begin(), ::powf);
return out;
}
") -> pow
pow(1:5,5:1)
[1] 1 16 27 16 5
Another workaround is to use a static_cast, i.e. replace ::pow with static_cast<double(*)(double, double)>(::pow), to tell the compiler to use the double overload of pow.
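To make the disambiguation concrete outside of Rcpp, here is a small standalone sketch using std::vector<double> in place of NumericVector. The names vecpow_cast and vecpow_lambda are invented for this example, and the cast is applied to std::pow rather than ::pow. The lambda variant is an alternative not mentioned above; it needs C++11 and sidesteps the overload question because the call inside it is resolved from the argument types.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Element-wise power; the cast names the (double, double) overload
// explicitly, so template argument deduction for the functor succeeds.
std::vector<double> vecpow_cast(const std::vector<double>& base,
                                const std::vector<double>& expo) {
    assert(base.size() == expo.size());
    std::vector<double> out(base.size());
    std::transform(base.begin(), base.end(), expo.begin(), out.begin(),
                   static_cast<double (*)(double, double)>(std::pow));
    return out;
}

// Same computation with a lambda: the functor has a single call
// signature, and std::pow is chosen inside it by ordinary overload
// resolution on two double arguments.
std::vector<double> vecpow_lambda(const std::vector<double>& base,
                                  const std::vector<double>& expo) {
    assert(base.size() == expo.size());
    std::vector<double> out(base.size());
    std::transform(base.begin(), base.end(), expo.begin(), out.begin(),
                   [](double b, double e) { return std::pow(b, e); });
    return out;
}
```

Both variants stay in double precision, unlike the powf version.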

Julia ccall outb - Problems with libc

I run the following ccall's:
status = ccall((:ioperm, "libc"), Int32, (Uint, Uint, Int32), 0x378, 5, 1)
ccall((:outb, "libc"), Void, (Uint8, Uint16), 0x00, 0x378)
After the second ccall I receive the following Error message:
ERROR: ccall: could not find function outb in library libc
in anonymous at no file
in include at ./boot.jl:245
in include_from_node1 at loading.jl:128
in process_options at ./client.jl:285
After some research and messing around I found the following information:
ioperm is in libc, but outb is not
However, both ioperm and outb are defined in the same header file <sys/io.h>
An equivalent version of C code compiles and runs smoothly.
outb is part of glibc; however, on the system glibc is installed as libc
Same problem with full path names /lib/x86_64-linux-gnu/libc.so.6
EDIT:
Thanks for the insight @Employed Russian! I did not look closely enough to realize the extern declaration. Now, all of my above notes make total sense!
Great, we found that ioperm is a libc function that is declared in <sys/io.h>, and that outb is not in libc, but is defined in <sys/io.h> as a volatile assembly instruction.
Which library, or file path should I use?
Implementation of ccall.
However, both ioperm and outb are defined in the same header file <sys/io.h>
By "defined" you actually mean "declared". They are different. On my system:
extern int ioperm (unsigned long int __from, unsigned long int __num,
int __turn_on) __attribute__ ((__nothrow__ , __leaf__));
static __inline void
outb (unsigned char __value, unsigned short int __port)
{
__asm__ __volatile__ ("outb %b0,%w1": :"a" (__value), "Nd" (__port));
}
It should now be obvious why you can call ioperm but not outb.
Update 1
I am still lost as to how to correct the error.
You can't import outb from libc. You would have to provide your own native library, e.g.
void my_outb(unsigned char value, unsigned short port) {
outb(value, port);
}
and import my_outb from it. For symmetry, you should probably implement my_ioperm the same way, so you are importing both functions from the same native library.
Update 2
Making a library worked, but in terms of performance it is horrible.
I guess that's why the original is implemented as an inline function: you are only executing a single outb instruction, so the overhead of a function call is significant.
Unoptimized python does x5 better.
Probably by having that same outb instruction inlined into it.
Do you know if outb exists in some other library, not in libc?
That is not going to help: you will still have the function call overhead. I am guessing that when you call the imported function from Julia, you probably execute a dlopen and dlsym call, which would impose an overhead of several hundred additional instructions.
There is probably a way to "bind" the function dynamically once, and then use it repeatedly to make the call (thus avoiding repeated dlopen and dlsym). That should help.
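As a rough illustration of the "bind once, call many times" idea in plain C++ (this is not how Julia's ccall is implemented, which I have not checked), the sketch below resolves a symbol with dlopen/dlsym a single time and caches the function pointer. It looks up abs from the C library purely as a stand-in, since outb itself has no symbol to look up; lookup_abs and bound_abs are invented names.

```cpp
#include <cassert>
#include <dlfcn.h>

using abs_fn = int (*)(int);

// Slow path: get a handle to the running program and resolve the
// symbol by name. This is the part you do not want to repeat per call.
abs_fn lookup_abs() {
    void* self = dlopen(nullptr, RTLD_NOW);
    assert(self != nullptr);
    void* sym = dlsym(self, "abs");
    assert(sym != nullptr);
    // Converting void* to a function pointer is the usual POSIX idiom.
    return reinterpret_cast<abs_fn>(sym);
}

// Bound once: the function-local static is initialized on the first
// call only, so later calls just return the cached pointer.
abs_fn bound_abs() {
    static const abs_fn fn = lookup_abs();
    return fn;
}
```

Calling bound_abs()(-7) pays for the lookup only the first time. On older glibc versions this needs linking with -ldl.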

LLVM converting a Constant to a Value

I am writing a custom LLVM pass. The problem occurs when the pass encounters a store
where the compiler has converted the stored value to a Constant; e.g. there is an explicit store:
X[gidx] = 10;
Then LLVM will generate this error:
aoc: ../../../Instructions.cpp:1056: void llvm::StoreInst::AssertOK(): Assertion `getOperand(0)->getType() == cast<PointerType>(getOperand(1)->getType())->getElementType() && "Ptr must be a pointer to Val type!"' failed.
The inheritance order is Value<-User<-Constant, so this shouldn't be an issue, but it is. Using a cast on the ConstantInt or ConstantFP has no effect on this error.
So I've tried this bloated solution:
Value *new_value;
if(isa<ConstantInt>(old_value) || isa<ConstantFP>(old_value)){
Instruction *allocInst = builder.CreateAlloca(old_value->getType());
builder.CreateStore(old_value, allocInst);
new_value = builder.CreateLoad(allocInst);
}
However, this solution creates its own register errors when different types are involved, so I'd like to avoid it.
Does anyone know how to convert a Constant to a Value? It must be a simple issue that I'm not seeing. I'm developing on Ubuntu 12.04, LLVM 3, AMD gpu, OpenCL kernels.
Thanks ahead of time.
EDIT:
The original code that produces the first error listed is simply:
builder.CreateStore(old_value, store_addr);
EDIT2:
This old_value is declared as
Value *old_value = current_instruction->getOperand(0);
So I'm grabbing the value to be stored, in this case "10" from the first code line.
You didn't provide the code that caused this first assertion, but its wording is pretty clear: you are trying to create a store where the value operand and the pointer operand do not agree on their types. It would be useful for the question if you'd provide the code that generated that error.
Your second, so-called "bloated" solution, is the correct way to store old_value into the stack and then load it again. You write:
However, this solution creates its own register errors when different types are involved
These "register errors" are the real issue you should be addressing.
In any case, the whole premise of "converting a constant to a value" is flawed - as you have correctly observed, all constants are values. There's no point storing a value into the stack with the sole purpose of loading it again, and indeed the standard LLVM pass "mem2reg" will completely remove such a sequence, replacing all uses of the load with the original value.

Matlab: Attempt to reference field of non-structure array

I am using the Kernel Density Estimator toolbox from http://www.ics.uci.edu/~ihler/code/kde.html . But I am getting the following error when I try to execute the demo files:
>> demo_kde_3
KDE Example #3 : Product sampling methods (single, anecdotal run)
Attempt to reference field of non-structure array.
Error in double (line 10)
if (npd.N > 0) d = 1; % return 1 if the density exists
Error in repmat (line 49)
nelems = prod(double(siz));
Error in kde (line 39)
if (size(ks,1) == 1) ks = repmat(ks,[size(points,1),1]); end;
Error in demo_kde_3 (line 8)
p = kde([.1,.45,.55,.8],.05); % create a mixture of 4 gaussians for testing
Can anyone suggest what might be wrong? I am new to Matlab and having a hard time figuring out the problem.
Thank You,
Try changing your current directory away from the @kde folder; you may have to add the @kde folder to your path when you do this. For example, run:
cd('c:\');
addpath('full\path\to\the\folder\@kde');
You may also need to add
addpath('full\path\to\the\folder\@kde\examples');
Then see if it works.
It looks like the function repmat (a MathWorks function) is picking up the @kde class's version of the double function, causing an error. Usually, only objects of the class @kde can invoke the functions that are in the @kde folder.
I rarely use the @folder form of class definitions, so I'm not completely sure of the semantics; I'm curious whether this has any effect on the error.
In general, I would not recommend using the @folder class format for any development that you do. MathWorks overhauled their OO paradigm a few versions ago to a much more familiar (and useful) format. Use help classdef to see more. This @kde code seems to predate this upgrade.
MATLAB gives you the code line where the error occurs. As double and repmat belong to MATLAB, the bug is probably in kde.m, line 39. Open that file in the MATLAB debugger and set a breakpoint on that line (so that execution stops immediately before that specific line runs). When the code is stopped there, check the situation. Try the entire code line in the console (copy-paste or type it; do not single-step, as causing an uncaught error while single-stepping ends the execution of the code in the debugger). It should give you an error, but without stopping execution. Then try pieces of that code line to see what works as it should and what does not, e.g. whether the result of size(points, 1) makes any sense.
However, debugging unfamiliar code is not an easy task, especially if you're a beginner in MATLAB. But if you learn and understand the essential datatypes of MATLAB (arrays, cell arrays and structs) and the different ways they can be addressed, and apply that knowledge to the situation on the line 39 of kde.m, hopefully you can fix the bug.
repmat calls double and expects the built-in double to be called.
However, I would guess that this is not part of that code:
if (npd.N > 0) d = 1; % return 1 if the density exists
So if all is correct, this means that the built-in function double has been overloaded, and that this is the reason why the code crashes.
EDIT:
I see that @Pursuit has already addressed the issue but I will leave my answer in place as it describes the method of detection a bit more.

Global variable touched by a passed-in parameter becomes unusable

folks!
I pass a struct full of data to my kernel, and I run into the following difficulty using it (very stripped down):
[edit: mac osx / xcode 3.2 on mac book pro; this compile is obviously for cpu]
typedef struct
{
float xoom;
int sizex;
} varholder;
float zX, xd;
__kernel void Harlan( __global varholder * vh )
{
int X = get_global_id(0), Y = get_global_id(1);
zX = ( ( X - vh->sizex/2 ) / vh->xoom + vh->sizex/2 ); // (a)
xd = zX; // (b) BOOM!!
}
after executing line (a), the line marked (b), a simple assignment, gives "LLVM compiler failed to compile a function".
if, however, we do not execute line (a), then line (b) is fine.
So, through my fiddling around a LOT with this, it seems as if it is the assignment statement (a), which uses a passed-in parameter, that messes up the future access of the variable zX. However, of course I need to be able to use the results of calculations further down the line.
I have zX and xd declared at the file level because my helper functions need them.
Any thoughts?
Thanks!
David
p.s. I'm now registered so will be able to upvote and accept answers, which I am sadly unable to do for the last person who helped me (used same username to register, but can't seem to vote on the old post; sorry!).
No, say it ain't so!
I am sincerely hoping that this is not a "correct" answer to my own question. I found on another forum (though not the same question asked!) the following, and I am afraid that it refers to what I'm trying to do:
(quote)
You're doing something the standard prohibits. Section 6.5 says:
'All program scope variables must be declared in the __constant address space.'
In other words, program scope variables cannot be mutable.
(end quote)
... well, tcha!!!! What an astoundingly inconvenient restriction! I'm sure there's reasoning behind it.
[edit: Not At All inconvenient! it was in fact astonishingly easy to work around, given a fresh start the next morning. (And no alcohol.)]
You guys & dolls all knew this, right, and didn't have the heart to tell me?...
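The post does not say what the workaround was, so the following is only one plausible shape for it, written as plain C++ so it can be compiled anywhere: keep zX and xd in a small per-work-item struct declared inside the kernel and hand a pointer to the helpers, instead of using mutable file-scope variables. WorkState, update_xd and harlan_item are invented names; in OpenCL C the entry point would carry the __kernel and __global qualifiers and take X from get_global_id(0).

```cpp
#include <cassert>

// Mirrors the struct from the question.
struct VarHolder {
    float xoom;
    int sizex;
};

// Hypothetical per-work-item state, replacing the mutable
// file-scope variables zX and xd.
struct WorkState {
    float zX;
    float xd;
};

// A helper that would previously have touched the file-scope variables
// now receives the state explicitly.
void update_xd(WorkState* s) {
    s->xd = s->zX;  // line (b) from the question
}

// Stand-in for the kernel body; X is passed in because plain C++
// has no get_global_id.
void harlan_item(const VarHolder* vh, int X, WorkState* s) {
    assert(vh->xoom != 0.0f);
    s->zX = (X - vh->sizex / 2) / vh->xoom + vh->sizex / 2;  // line (a)
    update_xd(s);
}
```

With sizex = 100, xoom = 2 and X = 60, this leaves zX and xd both at 55.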
