How to use clang to compile OpenCL to ptx code?

Clang 3.0 is able to compile OpenCL to ptx and use Nvidia's tool to launch the ptx code on GPU. How can I do this? Please be specific.

With the current version of LLVM (3.4), libclc and the NVPTX back-end, the compilation process has changed slightly.
You have to explicitly tell the NVPTX back-end which driver interface to use; your options are nvptx-nvidia-cuda or nvptx-nvidia-nvcl (for OpenCL) and their 64-bit equivalents nvptx64-nvidia-cuda or nvptx64-nvidia-nvcl.
The generated .ptx code differs slightly according to the chosen interface. In the assembly code produced for the CUDA driver API, the .global and .ptr attributes are dropped from entry-function parameters, but OpenCL requires them. I've modified Mikael's compile steps slightly to produce code that can be run with an OpenCL host:
Compile to LLVM IR:
clang -Dcl_clang_storage_class_specifiers -isystem libclc/generic/include -include clc/clc.h -target nvptx64-nvidia-nvcl -xcl test.cl -emit-llvm -S -o test.ll
Link kernel:
llvm-link libclc/built_libs/nvptx64--nvidiacl.bc test.ll -o test.linked.bc
Compile to Ptx:
clang -target nvptx64-nvidia-nvcl test.linked.bc -S -o test.nvptx.s

See Justin Holewinski's blog for a specific example or this thread for some more detailed steps and links to samples.

Here is a brief guide on how to do it with Clang trunk (3.4 at this point) and libclc. I assume you have basic knowledge of how to configure and compile LLVM and Clang, so I have listed only the configure flags I used.
square.cl:
__kernel void vector_square(__global float4* input, __global float4* output) {
    int i = get_global_id(0);
    output[i] = input[i]*input[i];
}
Compile llvm and clang with nvptx support:
../llvm-trunk/configure --prefix=$PWD/../install-trunk --enable-debug-runtime --enable-jit --enable-targets=x86,x86_64,nvptx
make install
Get libclc (git clone http://llvm.org/git/libclc.git) and compile it.
./configure.py --with-llvm-config=$PWD/../install-trunk/bin/llvm-config
make
If you have problems compiling this, you might need to fix a couple of headers in ./utils/prepare-builtins.cpp:
-#include "llvm/Function.h"
-#include "llvm/GlobalVariable.h"
-#include "llvm/LLVMContext.h"
-#include "llvm/Module.h"
+#include "llvm/IR/Function.h"
+#include "llvm/IR/GlobalVariable.h"
+#include "llvm/IR/LLVMContext.h"
+#include "llvm/IR/Module.h"
Compile the kernel to LLVM IR assembly:
clang -Dcl_clang_storage_class_specifiers -isystem libclc/generic/include -include clc/clc.h -target nvptx -xcl square.cl -emit-llvm -S -o square.ll
Link kernel with builtin implementations from libclc
llvm-link libclc/nvptx--nvidiacl/lib/builtins.bc square.ll -o square.linked.bc
Compile fully linked LLVM IR to PTX
clang -target nvptx square.linked.bc -S -o square.nvptx.s
square.nvptx.s:
//
// Generated by LLVM NVPTX Back-End
//
.version 3.1
.target sm_20, texmode_independent
.address_size 32
// .globl vector_square
.entry vector_square(
.param .u32 .ptr .global .align 16 vector_square_param_0,
.param .u32 .ptr .global .align 16 vector_square_param_1
)
{
.reg .pred %p<396>;
.reg .s16 %rc<396>;
.reg .s16 %rs<396>;
.reg .s32 %r<396>;
.reg .s64 %rl<396>;
.reg .f32 %f<396>;
.reg .f64 %fl<396>;
ld.param.u32 %r0, [vector_square_param_0];
mov.u32 %r1, %ctaid.x;
ld.param.u32 %r2, [vector_square_param_1];
mov.u32 %r3, %ntid.x;
mov.u32 %r4, %tid.x;
mad.lo.s32 %r1, %r3, %r1, %r4;
shl.b32 %r1, %r1, 4;
add.s32 %r0, %r0, %r1;
ld.global.v4.f32 {%f0, %f1, %f2, %f3}, [%r0];
mul.f32 %f0, %f0, %f0;
mul.f32 %f1, %f1, %f1;
mul.f32 %f2, %f2, %f2;
mul.f32 %f3, %f3, %f3;
add.s32 %r0, %r2, %r1;
st.global.f32 [%r0+12], %f3;
st.global.f32 [%r0+8], %f2;
st.global.f32 [%r0+4], %f1;
st.global.f32 [%r0], %f0;
ret;
}

Related

__e_acsl_assert is not getting added for all given assert in .i file

I am new to Frama-C. I specifically need to use the E-ACSL plugin for verification purposes. I used the following first.i file:
int main(void) {
int x = 0;
/∗# assert x == 0; ∗/
/∗# assert x == 1; ∗/
return 0;
}
I created monitored_first.c from first.i using the following command:
$ frama-c -e-acsl first.i -then-last -print -ocode monitored_first.c
The main function inside the monitored_first.c looks like the one below.
int main(void)
{
    int __retres;
    __e_acsl_memory_init((int *)0,(char ***)0,8UL);
    int x = 0;
    __retres = 0;
    __e_acsl_memory_clean();
    return __retres;
}
It does not add an E-ACSL assertion for x == 1.
I also tried the e-acsl-gcc.sh script, which generated monitored_first.i, but the main function in monitored_first.i is the same as the one in monitored_first.c.
$ e-acsl-gcc.sh -c -omonitored_first.i first.i
The above command generated two executables, a.out.e-acsl and a.out. It also produces the following warnings when run on Ubuntu 22.04:
/home/amrutha/.opam/4.11.1/bin/frama-c -remove-unused-specified-functions -machdep gcc_x86_64 '-cpp-extra-args= -std=c99 -D_DEFAULT_SOURCE -D__NO_CTYPE -D__FC_MACHDEP_X86_64 ' -no-frama-c-stdlib first.i -e-acsl -e-acsl-share=/home/amrutha/.opam/4.11.1/bin/../share/frama-c/e-acsl -then-last -print -ocode monitored_first.i
[kernel] Parsing first.i (no preprocessing)
[e-acsl] beginning translation.
[kernel] Parsing FRAMAC_SHARE/e-acsl/e_acsl.h (with preprocessing)
/tmp/ppannot15ad34.c:362: warning: "__STDC_IEC_60559_BFP__" redefined
362 | #define __STDC_IEC_60559_BFP__ 201404L
|
In file included from <command-line>:
/usr/include/stdc-predef.h:39: note: this is the location of the previous definition
39 | # define __STDC_IEC_60559_BFP__ 201404L
|
/tmp/ppannot15ad34.c:363: warning: "__STDC_IEC_60559_COMPLEX__" redefined
363 | #define __STDC_IEC_60559_COMPLEX__ 201404L
|
In file included from <command-line>:
/usr/include/stdc-predef.h:49: note: this is the location of the previous definition
49 | # define __STDC_IEC_60559_COMPLEX__ 201404L
|
[e-acsl] translation done in project "e-acsl".
+ gcc -std=c99 -m64 -g -O2 -fno-builtin -fno-merge-constants -Wall -Wno-long-long -Wno-attributes -Wno-nonnull -Wno-undef -Wno-unused -Wno-unused-function -Wno-unused-result -Wno-unused-value -Wno-unused-function -Wno-unused-variable -Wno-unused-but-set-variable -Wno-implicit-function-declaration -Wno-empty-body first.i -o a.out
+ gcc -DE_ACSL_SEGMENT_MMODEL -std=c99 -m64 -g -O2 -fno-builtin -fno-merge-constants -Wall -Wno-long-long -Wno-attributes -Wno-nonnull -Wno-undef -Wno-unused -Wno-unused-function -Wno-unused-result -Wno-unused-value -Wno-unused-function -Wno-unused-variable -Wno-unused-but-set-variable -Wno-implicit-function-declaration -Wno-empty-body -I/home/amrutha/.opam/4.11.1/bin/../share/frama-c/e-acsl -o a.out.e-acsl monitored_first.i /home/amrutha/.opam/4.11.1/bin/../share/frama-c/e-acsl/e_acsl_rtl.c /home/amrutha/.opam/4.11.1/bin/../lib/frama-c/e-acsl/libeacsl-dlmalloc.a -lgmp -lm
On Ubuntu 20.04 there are no warnings; only the final part is displayed. Running ./a.out.e-acsl simply runs the code without printing any message, which is not what I expected. The expected output should look like this:
$ ./a.out.e-acsl
first.i: In function 'main'
first.i:4: Error: Assertion failed:
The failing predicate is:
x == 1.
Aborted (core dumped)
$ echo $?
134
I tried it on Ubuntu 22.04 with opam 2.1.2 and Frama-C 25.0,
and on Ubuntu 20.04 with opam 2.0.5 and Frama-C 25.0.
The same issue was posted to Frama-C's public bug tracker, and it seems the cause might have been the non-ASCII asterisk characters used in the ACSL annotations: ∗ instead of *.
I still don't understand how the comments could parse at all (my compiler gives a syntax error), but the user seems to indicate that replacing them solved the problem.
In any case, in similar situations one can either use the Frama-C GUI to open the parsed file and check if Frama-C recognizes the ACSL annotations (they should show up in the CIL normalized code), or try other analyses, e.g. running frama-c -eva and checking that it detects the annotations.

using R's c code in my standalone executables: "Undefined symbols"

Suppose I have the following c source code, in dexp_test.c:
#include <stdio.h>
double dexp(double x, double scale, int log);
int main() {
    double x;
    x = dexp(1 , 2, 0);
    printf("Value: %f\n", x);
    return 0;
}
where dexp is defined in R's source code (https://github.com/wch/r-source/blob/trunk/src/nmath/dexp.c). I would like to compile this to a standalone executable. I have R 4.0 installed on my system. I have the following gcc lines:
gcc r-source/src/nmath/dexp.c -I/Library/Frameworks/R.framework/Versions/4.0/Resources/include -c -o a.o
gcc dexp_test.c -c -o b.o
These lines run just fine on my system and I am left with new files a.o and b.o without errors.
When I run this line to get an executable:
gcc -o test_exp a.o b.o
...I get these errors:
Undefined symbols for architecture x86_64:
"_R_NaN", referenced from:
_Rf_dexp in a.o
"_R_NegInf", referenced from:
_Rf_dexp in a.o
"_dexp", referenced from:
_main in b.o
(maybe you meant: _Rf_dexp)
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
I'm definitely missing something conceptually here; how do I get this to compile? If it helps, I'm on macOS 10.15.6, and the output of gcc -v is
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 12.0.0 (clang-1200.0.32.2)
Target: x86_64-apple-darwin19.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
You should mention the operating system you are using.
You need to include the appropriate headers and tell the linker which libraries to use and where to find them.
So your source should be
#include <stdio.h>
#include <R.h>
#include <Rmath.h>
int main() {
    double x;
    x = dexp(1 , 2, 0);
    printf("Value: %f\n", x);
    return 0;
}
And on the command line you should use the following
gcc -I/Library/Frameworks/R.framework/Versions/4.0/Resources/include -L/Library/Frameworks/R.framework/Versions/4.0/Resources/lib -lR -o test_exp trydexp.c

How to make a loadable kernel module on Solaris (not Linux)?

1. How do I create a loadable kernel module on Solaris 11?
I want a simple loadable kernel module (hello world).
I searched, but the results only showed how to create a Linux kernel module.
On Linux the header is linux/kernel.h, but that header does not exist on Solaris.
2. How do I compile a loadable kernel module on Solaris 11?
gcc -D_KERNEL -m64 -c cpluscplus.cpp
Is it appropriate to compile as above?
The target is 64-bit x86.
Here's the minimal hello world kernel module I can come up with:
#include <sys/modctl.h>
#include <sys/cmn_err.h>
/*
 * Module linkage information for the kernel.
 */
static struct modlmisc modlmisc = {
    &mod_miscops, "test module"
};
static struct modlinkage modlinkage = {
    MODREV_1, (void *)&modlmisc, NULL
};
int
_init(void)
{
    return (mod_install(&modlinkage));
}
int
_fini(void)
{
    return (mod_remove(&modlinkage));
}
int
_info(struct modinfo *modinfop)
{
    cmn_err(CE_NOTE, "hello kernel");
    return (mod_info(&modlinkage, modinfop));
}
Compile this as a 64-bit binary with Oracle Developer Studio 12.6 and the Solaris linker like so:
cc -D_KERNEL -I include -m64 -c foomod.c
ld -64 -z type=kmod -znodefs -o foomod foomod.o
For GCC you will likely need a distinct set of options.
Then load it with:
modload ./foomod
This will complain about signature verification. This is innocuous unless you are running the system with Verified Boot enabled.
Check that the module is loaded:
# modinfo -i foomod
ID LOADADDR SIZE INFO REV NAMEDESC
312 fffffffff7a8ddc0 268 -- 1 foomod (test module)
# dmesg | tail -1
Mar 16 12:22:57 ST091 foomod: [ID 548715 kern.notice] NOTICE: hello kernel
This works on Solaris 11.4 SRU 33 running on an x86 machine (a VirtualBox instance, in fact).

How to include nmath.h?

I would like to include the header nmath.h for my C code (within an R package) to find R_FINITE and ML_ERR_return_NAN. I found that one cannot include nmath.h directly. For R_FINITE to be found, I could include R_ext/libextern.h. But I don't know what to include so that ML_ERR_return_NAN is found. Any ideas? I found here that Prof. Brian Ripley referred to Writing R Extensions, but I couldn't find nmath.h being addressed there (where exactly?)
On Debian or Ubuntu:
sudo apt-get install r-mathlib
after which you can build test programs such as this:
// -*- mode: C++; c-indent-level: 4; c-basic-offset: 4;
// compile-command: "gcc -s -Wall -O3 \
// -I/usr/share/R/include -o rmath_rnorm \
// rmath_rnorm.c -lRmath -lm" -*-
// Compare to
// $ Rscript -e "RNGkind('Marsaglia'); \
// .Random.seed[2:3] <- c(123L, 456L); rnorm(2)"
// [1] -0.2934974 -0.3343770
#include <stdio.h>
#define MATHLIB_STANDALONE 1
#include <Rmath.h>
int main(void) {
    set_seed(123, 456);
    printf("rnorm: %f %f\n", rnorm(0.0, 1.0), rnorm(0.0, 1.0));
    return 0;
}
Note: The first four lines should be a single line in the file you save; then M-x compile builds the program for you. Ditto for the Rscript invocation: one line.
Edit: Drats. Answered the wrong question :) nmath.h (src/nmath/nmath.h) appears not to be exported, but this R Math library is what R Core exports for use by others, whereas the nmath.h file has
/* Private header file for use during compilation of Mathlib */
#ifndef MATHLIB_PRIVATE_H
#define MATHLIB_PRIVATE_H
so you are not supposed to rely on it.

Qt Moc generates code with undefined behaviour

I have a simple class with a Qt signal that gives me trouble when compiling the moc-generated code. I don't use the qmake build system; my SCons build script calls Qt's moc command directly.
The relevant source file "write_qstring.h" is:
#ifndef LOG_WRITE_QSTRING_POLICY_H
#define LOG_WRITE_QSTRING_POLICY_H
#include <QObject>
#include <QString>
namespace Log
{
    class WriteQString : public QObject
    {
        Q_OBJECT
    public:
    signals:
        void Changed(QString newString);
    };
}
#endif // LOG_WRITE_QSTRING_POLICY_H
I run the moc compiler with the command:
/opt/Qt/5.2.0/gcc_64/bin/moc -DQT_CORE_LIB -I/opt/Qt/5.2.0/gcc_64/include/QtCore -I/opt/Qt/5.2.0/gcc_64/include -o moc_write_qstring.cc write_qstring.h
and then I compile the generated moc_write_qstring.cc using clang++:
clang++ -o moc_write_qstring.o -c -Weverything -pedantic -g -std=c++11 -fcxx-exceptions -pthread -fdiagnostics-fixit-info -fPIC -Wno-c++98-compat -Wno-documentation-unknown-command -Wno-documentation -Wno-padded -Wno-weak-vtables -Wno-exit-time-destructors -Wno-global-constructors -isystem/opt/Qt/5.2.0/gcc_64/include -isystem/opt/Qt/5.2.0/gcc_64/include/QtCore -DQT_CORE_LIB -I/opt/Qt/5.2.0/gcc_64/include/QtCore moc_write_qstring.cc
This gives a warning about undefined behaviour:
warning: dereference of type '_t ' (aka 'void (Log::WriteQString::*)(QString)') that was reinterpret_cast from type 'void **' has undefined behavior [-Wundefined-reinterpret-cast]
With the relevant line being:
if (*reinterpret_cast<_t *>(func) ==
static_cast<_t>(&WriteQString::Changed)) {
Since the compiler is warning about undefined behaviour, I can only conclude that the code is broken, but why is the metacompiler emitting broken code? What did I do wrong here? Is it a C++11 problem?
Output of clang++ --version
Ubuntu clang version 3.5-1~exp1 (trunk) (based on LLVM 3.5)
Target: x86_64-pc-linux-gnu
Thread model: posix
Output of qmake --version
QMake version 3.0
Using Qt version 5.2.0 in /opt/Qt/5.2.0/gcc_64/lib

Resources