I'm trying to clarify some structs and syntax in OpenCL. Currently I'm working with VS2013 and OpenCL Emulator-Debugger. I started working with the demo project which comes with the emulator and stuck into this:
__Kernel(hello)
__ArgNULL
{
...
}
Just two lines above there is this:
//__kernel void
//hello()
What's the difference between them? As far as I understand from the documentation (here: http://amd-dev.wpengine.netdna-cdn.com/wordpress/media/2012/10/OpenCL-Emu-Documentation-2.pdf
and here: https://www.khronos.org/registry/cl/specs/opencl-1.x-latest.pdf) the first one is just a Macro definition in the OCL-Emu environment for the second one, but there isn't a clear and definite answer. Is this right?
Yes, it is right, the first one is a macro.
__Kernel() is a macro, and __kernel is a special CL flag to declare a C99 function as a GPU code entry function (kernel function).
So this __Kernel(hello) expands to __kernel hello
And __ArgNULL expands to ().
Giving you normal CL code: __kernel hello() { ... }
In this Emu-CL case, the macros are probably needed, since it doesn't internally expand to CL code. The macros are a way to simplify and adapt the language to a CL-like expressions.
Related
The code should run on Windows 10. I tried asking on Reddit, but the ideas are Unix/Linux only. There's also CFFI, but I didn't understand how to use it for this problem (the main usability part of the documentation I found is just an elaborate example not related to this problem).
I also looked through the SetCursorPos of Python, and found that it calls ctypes.windll.user32.SetCursorPos(x, y), but I have no clue what that would look like in CL.
And finally, there's CommonQt, but while there seems to be QtCursor::setPos in Qt, I couldn't find the CL version.
The function called by the Python example seems to be documented here. It is part of a shared library user32.dll, which you can load with CFFI,
(ql:quickload :cffi)
#+win32
(progn
(cffi:define-foreign-library user32
(:windows "user32.dll"))
(cffi:use-foreign-library user32))
The #+win32 means that this is only evaluated on Windows.
Then you can declare the foreign SetCursorPos-function with CFFI:DEFCFUN. According to the documentation, it takes in two ints and returns a BOOL. CFFI has a :BOOL-type, however the Windows BOOL seems to actually be an int. You could probably use cffi-grovel to automatically find the typedef from some Windows header, but I'll just use :INT directly here.
#+win32
(cffi:defcfun ("SetCursorPos" %set-cursor-pos) (:boolean :int)
(x :int)
(y :int))
I put a % in the name to indicate this is an internal function that should not be called directly (because it is only available on Windows). You should then write a wrapper that works on different platforms (actually implementing it on other platforms is left out here).
(defun set-cursor-pos (x y)
(check-type x integer)
(check-type y integer)
#+win32 (%set-cursor-pos x y)
#-win32 (error "Not supported on this platform"))
Now calling (set-cursor-pos 100 100) should move the mouse near the top left corner.
There are two problems here:
How to move the mouse in windows
How to call that function from CL.
It seems you have figured out a suitable win32 function exists so the challenge is to load the relevant library, declare the functions name and type, and then call it. I can’t really help you with that unfortunately.
Some other solutions you might try:
Write and compile a trivial C library to call the function you want and see if you can call that from CL (maybe this is easier?)
Write and compile a trivial C library and see if you can work out how to call it from CL
Write/find some trivial program in another language to move the mouse based on arguments/stdin and run that from CL
In the article "How to set up Xcode to run OpenCL code, and how to verify the kernels before building" NeXTCoder referred to some code as the "Short Answer", i.e. https://developer.apple.com/library/mac/#documentation/Performance/Conceptual/OpenCL_MacProgGuide/XCodeHelloWorld/XCodeHelloWorld.html.
In that code the author says "Wrap your kernel code into a kernel block:" without explaining what is a "kernel block". (The OpenCL Programmer Guide for Mac OS X by Apple makes no mention of kernel block.)
The host program calls "square_kernel" but the sample kernel is called "square" and the sample kernel block is labelled "kernelName" (in italics). Can you please tell me how to put the 3 pieces together:kernel, kernel block & host program to run in Xcode 5.1? I only have one kernel. Thanks.
It's not really jargon. It's closure-like entity.
OpenCL C 2.0 adds support for the clang block syntax. You use the ^ operator to declare a Block variable and to indicate the beginning of a Block literal. The body of the Block itself is contained within {}, as shown in the example (as usual with C, ; indicates the end of the statement).The Block is able to make use of variables from the same scope in which it was defined.
Example:
int multiplier = 7;
int (^myBlock)(int) = ^(int num) {
return num * multiplier;
};
printf(“%d\n”, myBlock(3));
// prints 21
Source:
https://www.khronos.org/registry/cl/sdk/2.1/docs/man/xhtml/blocks.html
The term "kernel block" only seems to be a jargon to refer to the "part of the code that is the kernel". Particularly, the kernel block in this case is simply the function that is declared to be a kernel, by adding kernel before its declaration. Or, even simpler, and from the way how the term is used on this website, I would say that "kernel block" is the same as "kernel".
The kernelName (in italics) is a placeholder. The code there shows the general pattern of how to define any kernel:
It is prefixed with kernel
It returns void
It has a name ... the kernelName, which may for example be square
It has several input- and output parameters
The reason why the kernel is called square, but invoked with square_kernel seems to be some magic that is done by XCode: It seems to read the .cl file, and creates a .h file that contains additional declarations that are derived from the .cl file (as can be seen in this question, where a kernel called rebound is defined, and GCL generated a rebound_kernel declaration).
Is it possible to have one function to wrap both MPI_Init and MPI_Init_thread? The purpose of this is to have a cleaner API while maintaining backward compatibility. What happens to a call to MPI_Init_thread when it is not supported by the MPI run time? How do I keep my wrapper function working for MPI implementations when MPI_Init_thread is not supported?
MPI_INIT_THREAD is part of the MPI-2.0 specification, which was released 15 years ago. Virtually all existing MPI implementations are MPI-2 compliant except for some really archaic ones. You might not get the desired level of thread support, but the function should be there and you should still be able to call it instead of MPI_INIT.
You best and most portable option is to have a configure-like mechanism probe for MPI_Init_thread in the MPI library, e.g. by trying to compile a very simple MPI program and see if it fails with an unresolved symbol reference, or you can directly examine the export table of the MPI library with nm (for archives) or objdump (for shared ELF objects). Once you've determined that the MPI library has MPI_Init_thread, you can have a preprocessor symbol defined, e.g. CONFIG_HAS_INITTHREAD. Then have your wrapped similar to this one:
int init_mpi(int *pargc, char ***pargv, int desired, int *provided)
{
#if defined(CONFIG_HAS_INITTHREAD)
return MPI_Init_thread(pargc, pargv, desired, provided);
#else
*provided = MPI_THREAD_SINGLE;
return MPI_Init(pargc, pargv);
#endif
}
Of course, if the MPI library is missing MPI_INIT_THREAD, then MPI_THREAD_SINGLE and the other thread support level constants will also not be defined in mpi.h, so you might need to define them somewhere.
I am using the Kernel Density Estimator toolbox form http://www.ics.uci.edu/~ihler/code/kde.html . But I am getting the following error when I try to execute the demo files -
>> demo_kde_3
KDE Example #3 : Product sampling methods (single, anecdotal run)
Attempt to reference field of non-structure array.
Error in double (line 10)
if (npd.N > 0) d = 1; % return 1 if the density exists
Error in repmat (line 49)
nelems = prod(double(siz));
Error in kde (line 39)
if (size(ks,1) == 1) ks = repmat(ks,[size(points,1),1]); end;
Error in demo_kde_3 (line 8)
p = kde([.1,.45,.55,.8],.05); % create a mixture of 4 gaussians for
testing
Can anyone suggest what might be wrong? I am new to Matlab and having a hard time to figure out the problem.
Thank You,
Try changing your current directory away from the #kde folder; you may have to add the #kde folder to your path when you do this. For example run:
cd('c:\');
addpath('full\path\to\the\folder\#kde');
You may also need to add
addpath('full\path\to\the\folder\#kde\examples');
Then see if it works.
It looks like function repmat (a mathworks function) is picking up the #kde class's version of the double function, causing an error. Usually, only objects of the class #kde can invoke that functions which are in the #kde folder.
I rarely use the #folder form of class definitions, so I'm not completely sure of the semantics; I'm curious if this has any effect on the error.
In general, I would not recommend using the #folder class format for any development that you do. The mathworks overhauled their OO paradigm a few versions ago to a much more familiar (and useful) format. Use help classdef to see more. This #kde code seems to predate this upgrade.
MATLAB gives you the code line where the error occurs. As double and repmat belong to MATLAB, the bug probably is in kde.m line 39. Open that file in MATLAB debugger, set a breakpoint on that line (so the execution stops immediately before the execution of that specific line), and then when the code is stopped there, check the situation. Try the entire code line in console (copy-paste or type it, do not single-step, as causing an uncatched error while single-stepping ends the execution of code in debugger), it should give you an error (but doesn't stop execution). Then try pieces of the code of that code line, what works as it should and what not, eg. does the result of size(points, 1) make any sense.
However, debugging unfamiliar code is not an easy task, especially if you're a beginner in MATLAB. But if you learn and understand the essential datatypes of MATLAB (arrays, cell arrays and structs) and the different ways they can be addressed, and apply that knowledge to the situation on the line 39 of kde.m, hopefully you can fix the bug.
Repmat calls double and expects the built-in double to be called.
However I would guess that this is not part of that code:
if (npd.N > 0) d = 1; % return 1 if the density exists
So if all is correct this means that the buil-tin function double has been overloaded, and that this is the reason why the code crashes.
EDIT:
I see that #Pursuit has already addressed the issue but I will leave my answer in place as it describes the method of detection a bit more.
folks!
I pass a struct full of data to my kernel, and I run into the following difficulty using it (very stripped down):
[edit: mac osx / xcode 3.2 on mac book pro; this compile is obviously for cpu]
typedef struct
{
float xoom;
int sizex;
} varholder;
float zX, xd;
__kernel void Harlan( __global varholder * vh )
{
int X = get_global_id(0), Y = get_global_id(1);
zX = ( ( X - vh->sizex/2 ) / vh->xoom + vh->sizex/2 ); // (a)
xd = zX; // (b) BOOM!!
}
after executing line (a), the line marked (b), a simple assignment, gives "LLVM compiler failed to compile a function".
if, however, we do not execute line (a), then line (b) is fine.
So, through my fiddling around a LOT with this, it seems as if it is the assignment statement (a), which uses a passed-in parameter, that messes up the future access of the variable zX. However, of course I need to be able to use the results of calculations further down the line.
I have zX and xd declared at the file level because my helper functions need them.
Any thoughts?
Thanks!
David
p.s. I'm now registered so will be able to upvote and accept answers, which I am sadly unable to do for the last person who helped me (used same username to register, but can't seem to vote on the old post; sorry!).
No, say it ain't so!
I am sincerely hoping that this is not a "correct" answer to my own question. I found on another forum (though not the same question asked!) the following, and I am afraid that it refers to what I'm trying to do:
(quote)
You're doing something the standard prohibits. Section 6.5 says:
'All program scope variables must be declared in the __constant address space.'
In other words, program scope variables cannot be mutable.
(end quote)
... well, tcha!!!! What an astoundingly inconvenient restriction! I'm sure there's reasoning behind it.
[edit: Not At All inconvenient! it was in fact astonishingly easy to work around, given a fresh start the next morning. (And no alcohol.)]
You guys & dolls all knew this, right, and didn't have the heart to tell me?...