How to get Rcpp and BH (Boost headers) working with regex_replace for Windows build

I'm using the excellent Rcpp project to implement some faster text processing in R, but cannot get the package to build correctly under Windows. (It builds fine under OS X and Linux.) I am using Rcpp and the BH header package. I can build the example at http://gallery.rcpp.org/articles/boost-regular-expressions/, and the code below builds on every platform except Windows.
To isolate my problem I removed it from my larger package, and put it into a simple package so that it can be installed using devtools::install_github("kbenoit/boostTest").
The files are:
Makevars.win:
PKG_LIBS = -lboost_regex
src/clean.cpp:
#include <Rcpp.h>
#include <string>
#include <boost/regex.hpp>
// [[Rcpp::depends(BH)]]
const boost::regex re_digits("[[:digit:]]");
const boost::regex re_punct("[[:punct:]]");
const std::string space0("");
std::string removeDigits(const std::string& s) {
    return boost::regex_replace(s, re_digits, space0, boost::match_default | boost::format_sed);
}
std::string removePunct(const std::string& s) {
    return boost::regex_replace(s, re_punct, space0, boost::match_default | boost::format_sed);
}
// [[Rcpp::export]]
Rcpp::DataFrame cleanCpp(std::vector<std::string> str) {
    int n = str.size();
    for (int i=0; i<n; i++) {
        str[i] = removeDigits(str[i]);
        str[i] = removePunct(str[i]);
    }
    return Rcpp::DataFrame::create (Rcpp::Named("text") = str);
}
and cleanC.R is exported as:
cleanC <- function(x) as.character(cleanCpp(x)[, 1])
(I did this because I am so new to Rcpp that I could not figure out how to return a CharacterVector, but could get the Rcpp::DataFrame return type working by following the boost-regular-expressions example linked above.)
In my DESCRIPTION file I have:
LinkingTo: Rcpp,BH
The problem is that when I build the package using devtools::build_win(), it fails, even though it builds fine on Linux and OS X. The output can be seen here:
* installing *source* package 'boostTest' ...
** libs
*** arch - i386
g++ -I"D:/RCompile/recent/R-3.1.3/include" -I"d:/RCompile/CRANpkg/lib/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/lib/3.1/RcppArmadillo/include" -I"d:/RCompile/CRANpkg/lib/3.1/BH/include" -I"d:/RCompile/r-compiling/local/local320/include" -O3 -Wall -mtune=core2 -c RcppExports.cpp -o RcppExports.o
g++ -I"D:/RCompile/recent/R-3.1.3/include" -I"d:/RCompile/CRANpkg/lib/3.1/Rcpp/include" -I"d:/RCompile/CRANpkg/lib/3.1/RcppArmadillo/include" -I"d:/RCompile/CRANpkg/lib/3.1/BH/include" -I"d:/RCompile/r-compiling/local/local320/include" -O3 -Wall -mtune=core2 -c clean.cpp -o clean.o
g++ -shared -s -static-libgcc -o boostTest.dll tmp.def RcppExports.o clean.o -lboost_regex -Ld:/RCompile/r-compiling/local/local320/lib/i386 -Ld:/RCompile/r-compiling/local/local320/lib -LD:/RCompile/recent/R-3.1.3/bin/i386 -lR
d:/compiler/gcc-4.6.3/bin/../lib/gcc/i686-w64-mingw32/4.6.3/../../../../i686-w64-mingw32/bin/ld.exe: cannot find -lboost_regex
collect2: ld returned 1 exit status
no DLL was created
ERROR: compilation failed for package 'boostTest'
* removing 'd:/RCompile/CRANguest/R-release/lib/boostTest'
Any help appreciated!
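The link step in the log fails because -lboost_regex asks for a compiled Boost.Regex library, while the BH package ships only headers. One way to avoid the link dependency altogether, sketched here as an assumption and not as the fix adopted for this package, is to swap boost::regex for std::regex from the C++11 standard library. The function names mirror clean.cpp; cleanStrings is a hypothetical stand-in for cleanCpp without the Rcpp types. It needs a toolchain with a working <regex> (GCC 4.9 or later), which the gcc 4.6.3 in the log above is not.

```cpp
#include <regex>
#include <string>
#include <vector>

// Patterns are compiled once. POSIX character classes such as
// [[:digit:]] are accepted inside bracket expressions by the
// default ECMAScript grammar of std::regex.
static const std::regex re_digits("[[:digit:]]");
static const std::regex re_punct("[[:punct:]]");

// Remove every digit from s.
std::string removeDigits(const std::string& s) {
    return std::regex_replace(s, re_digits, "");
}

// Remove every punctuation character from s.
std::string removePunct(const std::string& s) {
    return std::regex_replace(s, re_punct, "");
}

// Hypothetical stand-in for cleanCpp(): same loop, plain C++ types.
std::vector<std::string> cleanStrings(std::vector<std::string> str) {
    for (auto& s : str) {
        s = removePunct(removeDigits(s));
    }
    return str;
}
```

Calling cleanStrings({"abc123, def!", "R2-D2"}) gives "abc def" and "RD". Because nothing outside the standard library is linked, no PKG_LIBS entry would be needed for this variant.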

Related

R dyn.load gives "undefined symbol" error

I am writing a shared library that itself uses the uthash library in C under Ubuntu. I am going to use this shared library in R. For this purpose I have created a package.c file where the main code is stored. In order to use CALLOC function (a part of the uthash) I have included uthash.h in the beginning of package.c. In the body of package.c the CALLOC function is called.
I can build package.c without any problem by running:
R CMD SHLIB package.c
This is the output of the above command in the console:
gcc -I"/usr/local/lib/R/include" -DNDEBUG -I/usr/local/include -fpic -g -O2 -c package.c -o package.o
gcc -shared -L/usr/local/lib/R/lib -L/usr/local/lib -o package.so package.o -L/usr/local/lib/R/lib -lR
It produces two files, "package.o" and "package.so". I load package.so in R using dyn.load:
dyn.load("package.so")
But it gives the following error message:
Error in dyn.load("package.so") :
unable to load shared object '/home/me/package/package.so':
/home/me/package/package.so: undefined symbol: memoryMap
I searched for a solution and found this. Following what is written there, here is the output of the "ldd package.so" and "nm -g package.so" commands:
"ldd package.so" output:
$ ldd package.so
linux-vdso.so.1 (0x00007ffd283da000)
libR.so => not found
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd3643b4000)
/lib64/ld-linux-x86-64.so.2 (0x00007fd3649a8000)
"nm -g package.so" output:
$ nm -g package.so
0000000000202088 B __bss_start
U calloc@@GLIBC_2.2.5
w __cxa_finalize@@GLIBC_2.2.5
0000000000202088 D _edata
0000000000202090 B _end
00000000000016d8 T _fini
U free@@GLIBC_2.2.5
0000000000001260 T getSteadyStateDistribution_R
w __gmon_start__
00000000000008b0 T _init
U INTEGER
w _ITM_deregisterTMCloneTable
w _ITM_registerTMCloneTable
U malloc@@GLIBC_2.2.5
U memcpy@@GLIBC_2.14
U memoryMap
U __printf_chk@@GLIBC_2.3.4
U REAL
U Rf_error
U Rf_isNull
U Rf_length
0000000000000d70 T simulate
0000000000000e50 T simulate_R
U __stack_chk_fail@@GLIBC_2.4
0000000000000aa0 T stateTransition
U unif_rand
I have also read this post but still could not find a solution for my problem.
Edit:
Here is an example to reproduce the error:
#include <R.h>
#include <Rinternals.h>
#include "uthash.h"
typedef struct
{
    // a pointer to the allocated memory
    void * ptr;
    // used by the hash table
    UT_hash_handle hh;
} AllocatedMemory;
// map that stores all allocated memory pointers
// to free them on interrupt
extern AllocatedMemory * memoryMap;
static inline void* CALLOC(size_t n, size_t sz)
{
    void * ptr = calloc(n, sz);
    if (ptr == NULL)
        error("Out of memory!");
    AllocatedMemory * m = calloc(1, sizeof(AllocatedMemory));
    m->ptr = ptr;
    HASH_ADD_PTR(memoryMap, ptr, m);
    return ptr;
}
SEXP example_R(SEXP vec_R) {
    unsigned int n = length(vec_R);
    unsigned int * v = CALLOC(n, sizeof(unsigned int));
}
If the above code is stored in "example.c", compile it using:
R CMD SHLIB example.c
It will produce the files example.o and example.so. Then load the so file in R:
dyn.load('example.so')
Here is the minimal change to make your example compile and load, i.e. link dynamically:
$ diff -u question.orig.c question.c
--- question.orig.c 2022-12-22 09:41:46.483509755 -0600
+++ question.c 2022-12-22 09:44:45.217420621 -0600
@@ -12,7 +12,8 @@
// map that stores all allocated memory pointers
// to free them on interrupt
-extern AllocatedMemory * memoryMap;
+AllocatedMemory memoryMapInstance;
+AllocatedMemory *memoryMap = &memoryMapInstance;
static inline void* CALLOC(size_t n, size_t sz)
$
Instead of referencing an extern instance that you never supply, put one instance (cheaply) in static storage at file scope and define the pointer you need as its address. In a real program you would allocate this on the heap.
Another remaining error is that your example_R is wrong: it is declared to take and return a SEXP but never returns one. But we leave this for another time.
Now with a simple wrapper question.sh
#!/bin/bash
R CMD SHLIB question.c
R -e 'dyn.load("question.so")'
we get another warning because the return from CALLOC is unused -- but no longer an error on load.
$ ./question.sh
ccache gcc -I"/usr/share/R/include" -DNDEBUG -fpic -g -O3 -Wall -pipe -pedantic -Wno-ignored-attributes -std=gnu99 -c question.c -o question.o
question.c: In function ‘example_R’:
question.c:34:20: warning: unused variable ‘v’ [-Wunused-variable]
34 | unsigned int * v = CALLOC(n, sizeof(unsigned int));
| ^
question.c:35:1: warning: control reaches end of non-void function [-Wreturn-type]
35 | }
| ^
ccache gcc -Wl,-S -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -flto=auto -ffat-lto-objects -flto=auto -Wl,-z,relro -o question.so question.o -L/usr/lib/R/lib -lR
R version 4.2.2 Patched (2022-11-10 r83330) -- "Innocent and Trusting"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> dyn.load("question.so")
>
>
$
No error.
(You can ignore the compiler settings, and use of ccache I have locally. The warning is real.)
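The undefined symbol: memoryMap error above comes down to the difference between declaring a global and defining it. The sketch below illustrates that with hypothetical names (Node, head, trackedCalloc, trackedCount, freeAll) and a plain linked list in place of uthash. It is written in C++ and, unlike the diff above, initializes the head pointer to null instead of pointing it at an instance, so treat it as an illustration of the linkage rule and not as a drop-in replacement.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the question's AllocatedMemory record,
// using a plain linked list instead of uthash.
struct Node {
    void* ptr;   // the tracked allocation
    Node* next;  // next record in the list
};

// Declaration only: tells the compiler that `head` exists somewhere.
extern Node* head;

// Definition: reserves storage for `head`. If no translation unit
// contains a line like this, the shared object keeps `head` as an
// undefined symbol and loading it fails.
Node* head = nullptr;

// calloc wrapper that records every successful allocation.
void* trackedCalloc(std::size_t n, std::size_t sz) {
    void* p = std::calloc(n, sz);
    if (p == nullptr) return nullptr;
    Node* m = static_cast<Node*>(std::calloc(1, sizeof(Node)));
    if (m == nullptr) { std::free(p); return nullptr; }
    m->ptr = p;
    m->next = head;
    head = m;
    return p;
}

// Number of allocations currently tracked.
std::size_t trackedCount() {
    std::size_t k = 0;
    for (Node* m = head; m != nullptr; m = m->next) ++k;
    return k;
}

// Free every tracked allocation together with its record.
void freeAll() {
    while (head != nullptr) {
        Node* m = head;
        head = m->next;
        std::free(m->ptr);
        std::free(m);
    }
}
```

With the definition line removed, this file still compiles to an object file, but nm -g would list head with a U, just as memoryMap appears in the question's output.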

Rcpp with embedded R code, show no output of R code

I have a cpp file that defines c++ and R functions, which are sourced into R using Rcpp::sourceCpp().
When I source the file, the R code is (partially) printed as well, even when I specify showOutput = FALSE (I guess it only applies to cpp code?!).
The question now is: how can I suppress the partial R output without using capture.output() or similar tricks?
MWE
in tester.cpp
#include <Rcpp.h>
// [[Rcpp::export]]
Rcpp::NumericVector timesTwo(Rcpp::NumericVector x) {
return x * 2;
}
/*** R
foo <- function(x) timesTwo(x)
*/
When sourcing the file, I see the following:
Rcpp::sourceCpp("tester.cpp", showOutput = FALSE)
#> foo <- function(x) timesTwo(x)
Even shorter MWE
Rcpp::sourceCpp(code='
#include <Rcpp.h>
// [[Rcpp::export]]
Rcpp::NumericVector timesTwo(Rcpp::NumericVector x) {
return x * 2;
}
/*** R
foo <- function(x) timesTwo(x)
*/
')
Could this question be rooted in a misunderstanding of what showOutput is for?
Looking at help(sourceCpp) we see
showOutput: ‘TRUE’ to print ‘R CMD SHLIB’ output to the console.
So it affects the actual compilation step and has nothing to do with any R code included as an optional chunk to be run if present.
The following example should make this clear (and reveal a few of my CXX and other settings):
> cppFunction("int doubleMe(int i) { return i+i; }", showOutput=TRUE)
/usr/lib/R/bin/R CMD SHLIB -o 'sourceCpp_10.so' 'file99f11710553a7.cpp'
ccache g++ -I"/usr/share/R/include" -DNDEBUG -I"/usr/local/lib/R/site-library/Rcpp/include" -I"/tmp/RtmpC7dZ23/sourceCpp-x86_64-pc-linux-gnu-1.0.5.4" -fpic -g -O3 -Wall -pipe -pedantic -c file99f11710553a7.cpp -o file99f11710553a7.o
ccache g++ -Wl,-S -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o sourceCpp_10.so file99f11710553a7.o -L/usr/lib/R/lib -lR
>
> cppFunction("int trippleMe(int i) { return i+i+i; }", showOutput=FALSE)
>
Now, suppressing the output of R code is a different topic, orthogonal to whether such code contains elements created via Rcpp.
And lastly, @MrFlick is spot on: if you don't want R code sourced as part of a sourceCpp() call, then just don't include such code! Or just break the regexp /*** R.

using R's c code in my standalone executables: "Undefined symbols"

Suppose I have the following c source code, in dexp_test.c:
#include <stdio.h>
double dexp(double x, double scale, int log);
int main() {
    double x;
    x = dexp(1 , 2, 0);
    printf("Value: %f\n", x);
    return 0;
}
where dexp is defined in R's source code (https://github.com/wch/r-source/blob/trunk/src/nmath/dexp.c). I would like to compile this to a standalone executable. I have R 4.0 installed on my system. I have the following gcc lines:
gcc r-source/src/nmath/dexp.c -I/Library/Frameworks/R.framework/Versions/4.0/Resources/include -c -o a.o
gcc dexp_test.c -c -o b.o
These lines run just fine on my system and I am left with new files a.o and b.o without errors.
When I run this line to get an executable:
gcc -o test_exp a.o b.o
...I get these errors:
Undefined symbols for architecture x86_64:
"_R_NaN", referenced from:
_Rf_dexp in a.o
"_R_NegInf", referenced from:
_Rf_dexp in a.o
"_dexp", referenced from:
_main in b.o
(maybe you meant: _Rf_dexp)
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
I'm definitely missing something conceptually here; how do I get this to compile? If it helps, I'm on macOS 10.15.6, and the output of gcc -v is
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 12.0.0 (clang-1200.0.32.2)
Target: x86_64-apple-darwin19.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
You should mention the operating system you are using.
You need to include the appropriate headers, and tell the linker which libraries you want to use and where to find them.
So your source should be
#include <stdio.h>
#include <R.h>
#include <Rmath.h>
int main() {
    double x;
    x = dexp(1 , 2, 0);
    printf("Value: %f\n", x);
    return 0;
}
And on the command line you should use the following (here the source file is saved as trydexp.c):
gcc -I/Library/Frameworks/R.framework/Versions/4.0/Resources/include -L/Library/Frameworks/R.framework/Versions/4.0/Resources/lib -lR -o test_exp trydexp.c
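As a sanity check on the value the linked program should print, the density can also be computed directly. This is a minimal sketch assuming the scale parameterization used by dexp in the linked dexp.c, that is exp(-x/scale)/scale for x >= 0 and 0 otherwise. The name exp_density is hypothetical, and the sketch leaves out R's NaN and argument checks (for example a non-positive scale).

```cpp
#include <cmath>
#include <limits>

// Exponential density with mean `scale`; returns the log density
// when give_log is true. Argument validation is intentionally omitted.
double exp_density(double x, double scale, bool give_log) {
    if (x < 0.0) {
        // Outside the support: density 0, log density -infinity.
        return give_log ? -std::numeric_limits<double>::infinity() : 0.0;
    }
    double log_d = -x / scale - std::log(scale);
    return give_log ? log_d : std::exp(log_d);
}
```

For x = 1 and scale = 2 this evaluates to exp(-0.5)/2, about 0.303265, which is what the call dexp(1, 2, 0) in the program above should report once it links against R.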

How to write a premake5.lua file?

I am using premake and the GNU Scientific Library on Mac OS X to compile a simple example:
#include <stdio.h>
#include <gsl/gsl_sf_bessel.h>
int
main (void)
{
    double x = 15.0;
    double y = gsl_sf_bessel_J0 (x);
    printf ("J0(%g) = %.18e\n", x, y);
    return 0;
}
To compile this test_gsl.c from the terminal I can do
clang -Wall -I/usr/local/include -c test_gsl.c
clang -L/usr/local/lib test_gsl.o -lgsl -lgslcblas -lm
I cannot figure out how to write the premake5.lua file.

Rcpp Compilation ERROR: 'clang: error: no such file or directory: '/usr/local/lib/libfontconfig.a'

I was trying to run this piece of code in R (credit to the author):
require(Rcpp)
require(RcppArmadillo)
require(inline)
cosineRcpp <- cxxfunction(
signature(Xs = "matrix"),
plugin = c("RcppArmadillo"),
body='
Rcpp::NumericMatrix Xr(Xs); // creates Rcpp matrix from SEXP
int n = Xr.nrow(), k = Xr.ncol();
arma::mat X(Xr.begin(), n, k, false); // reuses memory and avoids extra copy
arma::mat Y = arma::trans(X) * X; // matrix product
arma::mat res = (1 - Y / (arma::sqrt(arma::diagvec(Y)) * arma::trans(arma::sqrt(arma::diagvec(Y)))));
return Rcpp::wrap(res);
')
And got, after a few fixes, the following error:
Error in compileCode(f, code, language = language, verbose = verbose) :
Compilation ERROR, function(s)/method(s) not created!
clang: error: no such file or directory: '/usr/local/lib/libfontconfig.a'
clang: error: no such file or directory: '/usr/local/lib/libreadline.a'
make: *** [file5a681e35ebe1.so] Error 1
In addition: Warning message:
running command '/Library/Frameworks/R.framework/Resources/bin/R CMD SHLIB file5a681e35ebe1.cpp 2> file5a681e35ebe1.cpp.err.txt' had status 1
I used to use Rcpp a lot in the past. But between now and then my computer has been reformatted and all the installation re-done using homebrew.
I installed cairo with brew: brew install cairo
the libreadline.a error was solved with:
brew link --force readline
But the same did not work for libfontconfig.a, since it was already linked:
brew link --force fontconfig
Warning: Already linked: /usr/local/Cellar/fontconfig/2.11.1
To relink: brew unlink fontconfig && brew link fontconfig
I would have assumed that fontconfig is within cairo. In fact, when I type
brew install fontconfig
Warning: fontconfig-2.11.1 already installed
But the truth is that there is no libfontconfig.a at /usr/local/lib/:
ls /usr/local/lib/libfont*
/usr/local/lib/libfontconfig.1.dylib
/usr/local/lib/libfontconfig.dylib
Using the very questionable approach of going here and downloading it, the code runs but still gives the corresponding warning, since the file was built for a different OS X architecture (I did not find one for 10.9):
+ . + ld: warning: ignoring file /usr/local/lib/libfontconfig.a, missing required architecture x86_64 in file /usr/local/lib/libfontconfig.a (2 slices)
So at this stage I am a little lost.
How do I install libfontconfig.a or find the 10.9 version?
In case it is of any use, I have Xcode installed, I am on Mac OS X 10.9.5,
and based on this very nice and detailed answer my ~/.R/Makevars file looks like:
CC=clang
CXX=clang++
FLIBS=-L/usr/local/bin/
Your system setup is broken. Neither R nor Rcpp have anything to do with clang (unless you chose clang as your system compiler) or fontconfig.
So start simpler:
R> library(Rcpp)
R> evalCpp("2 + 2")
[1] 4
R>
This just showed that my system has a working compiler that R (and Rcpp) can talk to. We can make it more explicit:
R> evalCpp("2 + 2", verbose=TRUE)
Generated code for function definition:
--------------------------------------------------------
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
SEXP get_value(){ return wrap( 2 + 2 ) ; }
No rebuild required (use rebuild = TRUE to force a rebuild)
[1] 4
R>
and R is clever enough not to rebuild. We can then force a build
R> evalCpp("2 + 2", verbose=TRUE, rebuild=TRUE)
Generated code for function definition:
--------------------------------------------------------
#include <Rcpp.h>
using namespace Rcpp;
// [[Rcpp::export]]
SEXP get_value(){ return wrap( 2 + 2 ) ; }
Generated extern "C" functions
--------------------------------------------------------
#include <Rcpp.h>
// get_value
SEXP get_value();
RcppExport SEXP sourceCpp_0_get_value() {
BEGIN_RCPP
Rcpp::RObject __result;
Rcpp::RNGScope __rngScope;
__result = Rcpp::wrap(get_value());
return __result;
END_RCPP
}
Generated R functions
-------------------------------------------------------
`.sourceCpp_0_DLLInfo` <- dyn.load('/tmp/Rtmpeuaiu4/sourcecpp_6a7c7c8295fc/sourceCpp_2.so')
get_value <- Rcpp:::sourceCppFunction(function() {}, FALSE, `.sourceCpp_0_DLLInfo`, 'sourceCpp_0_get_value')
rm(`.sourceCpp_0_DLLInfo`)
Building shared library
--------------------------------------------------------
DIR: /tmp/Rtmpeuaiu4/sourcecpp_6a7c7c8295fc
/usr/lib/R/bin/R CMD SHLIB -o 'sourceCpp_2.so' --preclean 'file6a7c6d1fc2d6.cpp'
ccache g++ -I/usr/share/R/include -DNDEBUG -I"/usr/local/lib/R/site-library/Rcpp/include" -I"/tmp/Rtmpeuaiu4" -fpic -g -O3 -Wall -pipe -Wno-unused -pedantic -c file6a7c6d1fc2d6.cpp -o file6a7c6d1fc2d6.o
g++ -shared -L/usr/lib/R/lib -Wl,-Bsymbolic-functions -Wl,-z,relro -o sourceCpp_2.so file6a7c6d1fc2d6.o -L/usr/lib/R/lib -lR
[1] 4
R>
Here you can see system details on my side (Linux, also using ccache) that will be different for you.
After that, try (Rcpp)Armadillo one-liners and so on.
