Pass an R package on CRAN with issues on macOS due to OpenMP

I have an R package with Fortran and OpenMP that can't pass CRAN. I receive the following message:
Your package no longer installs on macOS with OpenMP issues.
My Makevars file is:
USE_FC_TO_LINK =
PKG_FFLAGS = $(SHLIB_OPENMP_FFLAGS)
PKG_LIBS = $(SHLIB_OPENMP_FFLAGS)
C_OBJS = init.o
FT_OBJS = e_bottomup.o e_topdown.o check_nt.o
all:
#$(MAKE) $(SHLIB)
#rm -f *.o
$(SHLIB): $(FT_OBJS) $(C_OBJS)
init.o: e_bottomup.o e_topdown.o check_nt.o
How to solve this issue? Thanks.
Edit 1:
I tried adding the flag cpp:
USE_FC_TO_LINK =
PKG_FFLAGS = $(SHLIB_OPENMP_FFLAGS) -cpp
PKG_LIBS = $(SHLIB_OPENMP_FFLAGS)
to add the condition #ifdef _OPENMP in the Fortran code before the !$omp directives.
But with R CMD Check I got the message:
Non-portable flags in variable 'PKG_FFLAGS': -cpp

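The Makevars file is fine. The OpenMP-specific lines in the Fortran code must start with the !$ conditional compilation sentinel, including the USE OMP_LIB statement, so that a compiler without OpenMP support treats them as ordinary comments.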
For instance, I created an R package with Fortran and OMP to test (and play with it).
I included an R function to return the max number of threads in each machine:
get_threads
The Fortran code is :
SUBROUTINE checkntf (nt)
!$ USE OMP_LIB
IMPLICIT NONE
INTEGER nt
!$ nt = OMP_GET_MAX_THREADS()
RETURN
END
The package already installs on Windows, Ubuntu and macOS, as shown here

You can look at how the data.table package deals with that using #ifdef _OPENMP (https://github.com/Rdatatable/data.table/blob/master/src/myomp.h). It should be pretty similar in Fortran, I guess:
#ifdef _OPENMP
#include <omp.h>
#else
// for machines with compilers void of openmp support
#define omp_get_num_threads() 1
#define omp_get_thread_num() 0
#define omp_get_max_threads() 1
#define omp_get_thread_limit() 1
#define omp_get_num_procs() 1
#define omp_set_nested(a) // empty statement to remove the call
#define omp_get_wtime() 0
#endif

Related

Error in including boost tokenizer.hpp file when building R package on MacOS/CRAN

I'm running into issues publishing my R package to CRAN, because of a specific error when including boost libraries. The top of my one .cpp file in the package is
#include <Rcpp.h>
#include <boost/tokenizer.hpp>
#include <boost/algorithm/string.hpp>
#include <algorithm>
#include <string>
#include <unordered_map>
#include <omp.h>
#include <vector>
// [[Rcpp::depends(BH)]]
// [[Rcpp::plugins(openmp)]]
When running through the check on macOS (via rhub::check(platform = "macos-highsierra-release-cran")), I get the following error:
In file included from wgt_jaccard.cpp:6:
/Users/user2suimGYX/R/BH/include/boost/tokenizer.hpp:63:9: error: field of type 'std::__1::__wrap_iter<const char *>' has private constructor
: first(c.begin()), last(c.end()), f(f) { }
^
wgt_jaccard.cpp:117:19: note: in instantiation of function template specialization 'boost::tokenizer<boost::char_separator<char, std::__1::char_traits >, std::__1::__wrap_iter<const char *>, std::__1::basic_string ::tokenizer<Rcpp::internal::string_proxy<16, PreserveStorage> >' requested here
tokenizer tokens(y(i), sep);
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/iterator:1420:31: note: declared private here
_LIBCPP_INLINE_VISIBILITY __wrap_iter(iterator_type __x) _NOEXCEPT_DEBUG : __i(__x) {}
^
My Makevars file's contents are
PKG_CXXFLAGS = $(SHLIB_OPENMP_CXXFLAGS) -DBOOST_NO_AUTO_PTR
PKG_LIBS = $(SHLIB_OPENMP_CXXFLAGS)
CXX_STD = CXX11
I've tried searching around, but can't find much on this error. The full package is located here. Any help is much appreciated.
Relative to the repo you kindly supplied, we found a need for two changes:
First, do not unconditionally #include <omp.h>, as OpenMP can be optional, especially on macOS. A simple #ifdef _OPENMP does the job.
Second, the (arguably near-incomprehensible) compiler message had to do with the fact that the Boost type / class for tokenizer was puzzled by the Rcpp object you gave it by directly indexing from an Rcpp::CharacterVector. Been there, done that -- a more conservative approach is to first assign to std::string and to then pass that on.
With those two changes, it's all roses and it compiles on macOS under clang++ as well.
The (by now merged, thanks) PR #2 has the gory details, but is short.
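The first of those two changes can be sketched in plain C++ as follows. This is only a minimal illustration of the guard, not the code from the PR; the helper name max_threads is hypothetical, and _OPENMP is the macro that compilers define when OpenMP is enabled.

```cpp
// Include the OpenMP header only when the compiler was invoked with OpenMP enabled.
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: the number of threads OpenMP would use,
// or 1 when the code was built without OpenMP support.
int max_threads() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

Built without OpenMP, the function simply reports a single thread, so the rest of the code can call it without any further guards.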

Should -fsanitize=address go into CFLAGS or LDFLAGS?

I'm trying to use address sanitizer using (-fsanitize=address) and I'm not sure if it belongs into CFLAGS or LDFLAGS. It actually seems to work fine when added just to LDFLAGS, but I do not know if that is a coincidence or if it is supposed to be like that.
Is -fsanitize=address needed for the compilation itself, or does it suffice to provide the flag for the linking step?
Address Sanitizer instruments source code to insert additional checks, and so must be present at compilation time.
Providing the argument only on the link line results in the asan runtime being linked into the process, but no checks actually being done, except for a small subset: namely the checks achievable by interposing new, delete, malloc, free, and other standard functions.
Example:
#include <stdlib.h>
#include <stdio.h>

void fn(int *ip)
{
    ip[0] = 1; // BUG: heap buffer overflow
}

int main()
{
    int *ip = malloc(1); // Allocation too small.
    printf("%d\n", ip[0]); // BUG: heap buffer overflow
    free(ip);
    free(ip); // BUG: double free
}
With no instrumentation, only the double-free is detected:
gcc -g -c t.c && gcc -fsanitize=address t.o && ./a.out
190
=================================================================
==55787==ERROR: AddressSanitizer: attempting double-free on 0x602000000010 in thread T0:
With instrumentation: both the bug in printf and the bug in fn are also detected.
gcc -g -c -fsanitize=address t.c && gcc -fsanitize=address t.o && ./a.out
=================================================================
==58202==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000000010 at pc 0x564565639252 bp 0x7ffe36b0a560 sp 0x7ffe36b0a558
READ of size 4 at 0x602000000010 thread T0
#0 0x564565639251 in main /tmp/t.c:12

RcppZiggurat unable to compile example code

I am trying to employ the Ziggurat sampler in R, but I actually want to use it directly in my C++ code. I installed the GSL library, RcppGSL and RcppZiggurat, and using zrnorm() in R works just fine. I thought, OK, let's try to compile the code sample provided in RcppZiggurat.pdf and go from there to implement the Ziggurat sampler directly in my C++ code... but the following happens.
From the pdf file I thought I can simply utilize:
#include <Rcpp.h>
#include <Ziggurat.h>
static Ziggurat::Ziggurat::Ziggurat zigg;
// [[Rcpp::export]]
Rcpp::NumericVector zrnorm(int n) {
Rcpp::NumericVector x(n);
for (int i=0; i<n; i++) {
x[i] = zigg.norm();
}
return x;
}
// [[Rcpp::export]]
void zsetseed(unsigned long int s) {
zigg.setSeed(s);
return;
}
Error:
official_zigg_code.cpp:2:10: fatal error: 'Ziggurat.h' file not found
#include <Ziggurat.h>
^
1 error generated.
make: *** [official_zigg_code.o] Error 1
clang++ -I/Library/Frameworks/R.framework/Resources/include -DNDEBUG -I/usr/local/include -I/usr/local/include/freetype2 -I/opt/X11/include -I"/Library/Frameworks/R.framework/Versions/3.1/Resources/library/Rcpp/include" -fPIC -Wall -mtune=core2 -g -O2 -c official_zigg_code.cpp -o official_zigg_code.o
Error in Rcpp::sourceCpp("official_zigg_code.cpp") :
Error 1 occurred building shared library.
I have absolutely no clue how to proceed from here. I desperately tried to find answers on stack exchange but nothing could help me to solve this. From what I understand the RcppZiggurat package actually uses the above function so how can I fail to compile it, when I am able to use zrnorm() directly?
The error is fairly obvious:
fatal error: 'Ziggurat.h' file not found
This means that you did not tell R / the compiler about RcppZiggurat.
The fix is easy. In the case of an Rcpp-driven compilation via sourceCpp(), add
this one line
// [[Rcpp::depends(RcppZiggurat)]]
which does just that. All this is documented with Rcpp, and you are more or less expected to read at least some of its documentation.
If you want to build outside of Rcpp, you need to make sure the compiler finds the header file(s). One commonly uses the -I flag for that; this is typically discussed wherever compilers are introduced.

After including RcppArmadillo.h, errors occur when compiling code

I would like to use some of the functionalities included in RcppArmadillo. As I read in another post on SO, if RcppArmadillo.h is included, Rcpp.h should not be included. I did just that, but when trying to compile the .cpp file, I got some error messages. EDIT. Per Dirk's suggestion, I only included RcppArmadillo.h, which significantly reduced the number of error messages: The minimally reproducible code is below:
#include <RcppArmadillo.h>
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
using namespace arma;
template <class RandomAccessIterator, class StrictWeakOrdering>
void sort(RandomAccessIterator first, RandomAccessIterator last, StrictWeakOrdering comp);
struct val_order{
int order;
double value;
};
bool compare(const val_order & a, const val_order & b){return (a.value<b.value);}
// [[Rcpp::export]]
IntegerVector order(NumericVector x){
int n=x.size();
std::vector<int> output(n);
std::vector<val_order> index(n);
for(int i=0;i<x.size();i++){
index[i].value=x(i);
index[i].order=i;
}
std::sort(index.begin(), index.end(), compare);
for(int i=0;i<x.size();i++){
output[i]=index[i].order;
}
return wrap(output);
}
The error message is below:
Error in sourceCpp("functions.cpp") :
Error 1 occurred building shared library.
ld: warning: directory not found for option '-L/usr/local/lib/gcc/i686-apple-darwin8/4.2.3/x86_64'
ld: warning: directory not found for option '-L/usr/local/lib/x86_64'
ld: warning: directory not found for option '-L/usr/local/lib/gcc/i686-apple-darwin8/4.2.3'
ld: library not found for -lgfortran
collect2: ld returned 1 exit status
make: *** [sourceCpp_44748.so] Error 1
Just to reiterate, this code has no problem compiling when I use Rcpp.h.
A few things:
Your post is not helpful. We do not need dozens of lines of error messages, but we do need reproducible code, which you did not include.
You include several C++ headers and several R headers. Don't.
Include one header only: RcppArmadillo.h and if that fails, post a reproducible example.
Edit: Thanks for your update. Your code now compiles, your error is a linker error. You simply need to install the Fortran compiler under OS X.

Is there a way for a C binary code to figure out where it is stored? [duplicate]

Is there a platform-agnostic and filesystem-agnostic method to obtain the full path of the directory from where a program is running using C/C++? Not to be confused with the current working directory. (Please don't suggest libraries unless they're standard ones like clib or STL.)
(If there's no platform/filesystem-agnostic method, suggestions that work in Windows and Linux for specific filesystems are welcome too.)
Here's code to get the full path to the executing app:
Variable declarations:
char pBuf[256];
size_t len = sizeof(pBuf);
Windows:
int bytes = GetModuleFileName(NULL, pBuf, len);
return bytes ? bytes : -1;
Linux:
ssize_t bytes = readlink("/proc/self/exe", pBuf, len - 1); /* leave room for the terminator */
if (bytes >= 0)
    pBuf[bytes] = '\0'; /* readlink does not null-terminate */
return bytes;
If you fetch the current directory when your program first starts, then you effectively have the directory your program was started from. Store the value in a variable and refer to it later in your program. This is distinct from the directory that holds the current executable program file. It isn't necessarily the same directory; if someone runs the program from a command prompt, then the program is being run from the command prompt's current working directory even though the program file lives elsewhere.
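The idea of recording the working directory at startup can be sketched in C++17 as below. This is only an illustration under assumptions: the helper name startup_directory is hypothetical, and because a function-local static is initialized on the first call, the function has to be called once at the top of main() before anything changes the directory.

```cpp
#include <filesystem>

// Hypothetical helper: remembers the working directory seen on the first call.
// Call it once at the top of main(), before anything changes the directory.
const std::filesystem::path& startup_directory() {
    static const std::filesystem::path dir = std::filesystem::current_path();
    return dir;
}
```

Later calls keep returning the remembered value even after the process changes its working directory. As noted above, this is the directory the program was started from, not the directory that holds the executable file.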
getcwd is a POSIX function and is supported out of the box by all POSIX-compliant platforms. You would not have to do anything special (apart from including the right headers: unistd.h on Unix and direct.h on Windows).
Since you are creating a C program it will link with the default c run time library which is linked to by ALL processes in the system (specially crafted exceptions avoided) and it will include this function by default. The CRT is never considered an external library because that provides the basic standard compliant interface to the OS.
On windows getcwd function has been deprecated in favour of _getcwd. I think you could use it in this fashion.
#include <stdio.h> /* defines FILENAME_MAX */
#include <errno.h> /* defines errno */
#ifdef _WIN32
#include <direct.h>
#define GetCurrentDir _getcwd
#else
#include <unistd.h>
#define GetCurrentDir getcwd
#endif
char cCurrentPath[FILENAME_MAX];
if (!GetCurrentDir(cCurrentPath, sizeof(cCurrentPath)))
{
return errno;
}
cCurrentPath[sizeof(cCurrentPath) - 1] = '\0'; /* not really required */
printf ("The current working directory is %s", cCurrentPath);
This is from the cplusplus forum
On windows:
#include <string>
#include <windows.h>
std::string getexepath()
{
char result[ MAX_PATH ];
return std::string( result, GetModuleFileName( NULL, result, MAX_PATH ) );
}
On Linux:
#include <string>
#include <limits.h>
#include <unistd.h>
std::string getexepath()
{
char result[ PATH_MAX ];
ssize_t count = readlink( "/proc/self/exe", result, PATH_MAX );
return std::string( result, (count > 0) ? count : 0 );
}
On HP-UX:
#include <string>
#include <limits.h>
#define _PSTAT64
#include <sys/pstat.h>
#include <sys/types.h>
#include <unistd.h>
std::string getexepath()
{
char result[ PATH_MAX ];
struct pst_status ps;
if (pstat_getproc( &ps, sizeof( ps ), 0, getpid() ) < 0)
return std::string();
if (pstat_getpathname( result, PATH_MAX, &ps.pst_fid_text ) < 0)
return std::string();
return std::string( result );
}
If you want a standard way without libraries: No. The whole concept of a directory is not included in the standard.
If you agree that some (portable) dependency on a near-standard lib is okay: Use Boost's filesystem library and ask for the initial_path().
IMHO that's as close as you can get, with good karma (Boost is a well-established high quality set of libraries)
I know it is very late in the day to throw an answer at this one, but I found that none of the answers were as useful to me as my own solution. A very simple way to get the path from your CWD to your bin folder is like this:
int main(int argc, char* argv[])
{
std::string argv_str(argv[0]);
std::string base = argv_str.substr(0, argv_str.find_last_of("/"));
}
You can now just use this as a base for your relative path. So for example I have this directory structure:
main
----> test
----> src
----> bin
and I want to compile my source code to bin and write a log to test I can just add this line to my code.
std::string pathToWrite = base + "/../test/test.log";
I have tried this approach on Linux using full path, alias etc. and it works just fine.
NOTE:
If you are on windows you should use a '\' as the file separator not '/'. You will have to escape this too for example:
std::string base = argv_str.substr(0, argv_str.find_last_of("\\"));
I think this should work but haven't tested, so comment would be appreciated if it works or a fix if not.
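One gap in the substr approach is the case where argv[0] contains no separator at all (for example when the program is found through PATH): find_last_of then returns npos and the whole program name is kept as the "directory". A minimal sketch that handles this case is below. The helper name directory_of is hypothetical, and treating both '/' and '\' as separators is an assumption made for brevity (on POSIX a backslash is a legal file name character). Whether argv[0] holds a usable path at all is still up to whoever launched the process.

```cpp
#include <string>

// Hypothetical helper: directory part of argv[0].
// Returns "." when argv[0] contains no path separator.
std::string directory_of(const std::string& argv0) {
    const std::string::size_type pos = argv0.find_last_of("/\\");
    if (pos == std::string::npos) {
        return ".";                    // bare program name, no directory part
    }
    if (pos == 0) {
        return argv0.substr(0, 1);     // program sits directly under the root
    }
    return argv0.substr(0, pos);       // everything before the last separator
}
```

In main() this would be called as directory_of(argv[0]), and the result can be used as the base in the same way as above.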
Filesystem TS is now a standard ( and supported by gcc 5.3+ and clang 3.9+ ), so you can use current_path() function from it:
std::string path = std::experimental::filesystem::current_path();
In gcc (5.3+) to include Filesystem you need to use:
#include <experimental/filesystem>
and link your code with -lstdc++fs flag.
If you want to use Filesystem with Microsoft Visual Studio, then read this.
No, there's no standard way. I believe that the C/C++ standards don't even consider the existence of directories (or other file system organizations).
On Windows the GetModuleFileName() will return the full path to the executable file of the current process when the hModule parameter is set to NULL. I can't help with Linux.
Also you should clarify whether you want the current directory or the directory that the program image/executable resides. As it stands your question is a little ambiguous on this point.
On Windows the simplest way is to use the _get_pgmptr function in stdlib.h to get a pointer to a string which represents the absolute path to the executable, including the executables name.
char* path;
_get_pgmptr(&path);
printf("%s\n", path); // Example output: C:/Projects/Hello/World.exe
Maybe concatenate the current working directory with argv[0]? I'm not sure if that would work in Windows but it works in linux.
For example:
#include <stdio.h>
#include <unistd.h>
#include <string.h>
int main(int argc, char **argv) {
char the_path[256];
getcwd(the_path, 255);
strcat(the_path, "/");
strcat(the_path, argv[0]);
printf("%s\n", the_path);
return 0;
}
When run, it outputs:
jeremy@jeremy-desktop:~/Desktop$ ./test
/home/jeremy/Desktop/./test
For Win32 GetCurrentDirectory should do the trick.
You cannot use argv[0] for that purpose. It usually does contain the full path to the executable, but not necessarily: a process can be created with an arbitrary value in that field.
Also mind you, the current directory and the directory with the executable are two different things, so getcwd() won't help you either.
On Windows use GetModuleFileName(); on Linux read the /proc/<pid>/exe symbolic link (/proc/self/exe for the current process).
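For the Linux case, one possible sketch in C++17 resolves the /proc/self/exe symbolic link with std::filesystem::read_symlink. The function names executable_path and executable_directory are hypothetical, and the code assumes a Linux system with /proc mounted; elsewhere it just returns an empty path.

```cpp
#include <filesystem>
#include <system_error>

// Linux-only sketch: resolve the /proc/self/exe symlink to the running executable.
// Returns an empty path if the link cannot be read (e.g. /proc is not available).
std::filesystem::path executable_path() {
    std::error_code ec;
    std::filesystem::path p = std::filesystem::read_symlink("/proc/self/exe", ec);
    return ec ? std::filesystem::path{} : p;
}

// Directory that contains the running executable (empty on failure).
std::filesystem::path executable_directory() {
    return executable_path().parent_path();
}
```

Unlike argv[0] or getcwd(), this asks the operating system where the executable image lives, so it does not depend on how the process was launched.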
Just my two cents, but doesn't the following code portably work in C++17?
#include <iostream>
#include <filesystem>
namespace fs = std::filesystem;
int main(int argc, char* argv[])
{
std::cout << "Path is " << fs::path(argv[0]).parent_path() << '\n';
}
Seems to work for me on Linux at least.
Based on the previous idea, I now have:
std::filesystem::path prepend_exe_path(const std::string& filename, const std::string& exe_path = "");
With implementation:
fs::path prepend_exe_path(const std::string& filename, const std::string& exe_path)
{
static auto exe_parent_path = fs::path(exe_path).parent_path();
return exe_parent_path / filename;
}
And initialization trick in main():
(void) prepend_exe_path("", argv[0]);
Thanks @Sam Redway for the argv[0] idea. And of course, I understand that C++17 was not around for many years when the OP asked the question.
Just to belatedly pile on here,...
there is no standard solution, because the languages are agnostic of underlying file systems, so as others have said, the concept of a directory based file system is outside the scope of the c / c++ languages.
on top of that, you want not the current working directory but the directory the program is running in, which must take into account how the program got to where it is, i.e. whether it was spawned as a new process via a fork, etc. To get the directory a program is running in, as the solutions have demonstrated, you need to get that information from the process control structures of the operating system in question, which is the only authority on this question. Thus, by definition, it's an OS-specific solution.
#include <string>
#include <windows.h>
using namespace std;
// The path returned by the native GetCurrentDirectory() has no trailing backslash
string getCurrentDirectoryOnWindows()
{
const unsigned long maxDir = 260;
char currentDir[maxDir];
GetCurrentDirectory(maxDir, currentDir);
return string(currentDir);
}
On Windows, a console program can call system("dir"); the console then prints information about the current directory and its contents. Read about the dir command in cmd for details. For Unix-like systems I don't know of an equivalent; you would have to read up on the bash commands, and ls by itself does not display the directory path.
Example:
int main()
{
system("dir");
system("pause"); //this wait for Enter-key-press;
return 0;
}
This works starting from C++11 using the experimental filesystem, and with C++14 to C++17 using the official filesystem.
application.h:
#pragma once
//
// https://en.cppreference.com/w/User:D41D8CD98F/feature_testing_macros
//
#ifdef __cpp_lib_filesystem
#include <filesystem>
#else
#include <experimental/filesystem>
namespace std {
namespace filesystem = experimental::filesystem;
}
#endif
std::filesystem::path getexepath();
application.cpp:
#include "application.h"
#ifdef _WIN32
#include <windows.h> //GetModuleFileNameW
#else
#include <limits.h>
#include <unistd.h> //readlink
#endif
std::filesystem::path getexepath()
{
#ifdef _WIN32
wchar_t path[MAX_PATH] = { 0 };
GetModuleFileNameW(NULL, path, MAX_PATH);
return path;
#else
char result[PATH_MAX];
ssize_t count = readlink("/proc/self/exe", result, PATH_MAX);
return std::string(result, (count > 0) ? count : 0);
#endif
}
For relative paths, here's what I did. I am aware of the age of this question, I simply want to contribute a simpler answer that works in the majority of cases:
Say you have a path like this:
"path/to/file/folder"
For some reason, Linux-built executables made in eclipse work fine with this. However, windows gets very confused if given a path like this to work with!
As stated above there are several ways to get the current path to the executable, but the easiest way I find works a charm in the majority of cases is appending this to the FRONT of your path:
"./path/to/file/folder"
Just adding "./" should get you sorted! :) Then you can start loading from whatever directory you wish, so long as it is with the executable itself.
EDIT: This won't work if you try to launch the executable from code::blocks if that's the development environment being used, as for some reason, code::blocks doesn't load stuff right... :D
EDIT2: Some new things I have found is that if you specify a static path like this one in your code (Assuming Example.data is something you need to load):
"resources/Example.data"
If you then launch your app from the actual directory (or in Windows, you make a shortcut, and set the working dir to your app dir) then it will work like that.
Keep this in mind when debugging issues related to missing resource/file paths. (Especially in IDEs that set the wrong working dir when launching a build exe from the IDE)
A library solution (although I know this was not asked for).
If you happen to use Qt:
QCoreApplication::applicationDirPath()
Path to the current .exe
#include <Windows.h>
std::wstring getexepathW()
{
wchar_t result[MAX_PATH];
return std::wstring(result, GetModuleFileNameW(NULL, result, MAX_PATH));
}
std::wcout << getexepathW() << std::endl;
// -------- OR --------
std::string getexepathA()
{
char result[MAX_PATH];
return std::string(result, GetModuleFileNameA(NULL, result, MAX_PATH));
}
std::cout << getexepathA() << std::endl;
This question was asked 15 years ago, so the existing answers are now incorrect. If you're using C++17 or greater, the solution is very straightforward today:
#include <filesystem>
std::cout << std::filesystem::current_path();
See cppreference.com for more information.
On POSIX platforms, you can use getcwd().
On Windows, you may use _getcwd(), as use of getcwd() has been deprecated.
For standard libraries, if Boost were standard enough for you, I would have suggested Boost::filesystem, but they seem to have removed path normalization from the proposal. You may have to wait until TR2 becomes readily available for a fully standard solution.
Boost Filesystem's initial_path() behaves like POSIX's getcwd(), and neither does what you want by itself, but appending argv[0] to either of them should do it.
You may note that the result is not always pretty--you may get things like /foo/bar/../../baz/a.out or /foo/bar//baz/a.out, but I believe that it always results in a valid path which names the executable (note that consecutive slashes in a path are collapsed to one).
I previously wrote a solution using envp (the third argument to main()), which worked on Linux but didn't seem workable on Windows, so I'm essentially recommending the same solution as someone else did previously, but with the additional explanation of why it is actually correct even if the results are not pretty.
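If the untidy result is a concern, one option in C++17 (not part of the Boost or TR2 proposals mentioned above) is to normalize the path lexically with std::filesystem::path::lexically_normal(). The sketch below assumes POSIX-style paths, and the helper name tidy_path is hypothetical.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper: tidy a path purely lexically, without touching the filesystem.
// Collapses repeated separators and resolves "." and ".." components textually.
std::string tidy_path(const std::string& raw) {
    return std::filesystem::path(raw).lexically_normal().generic_string();
}
```

With the examples above, /foo/bar/../../baz/a.out becomes /baz/a.out and /foo/bar//baz/a.out becomes /foo/bar/baz/a.out. Because the normalization is purely textual, it can name a different file than the original path when one of the components is a symbolic link.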
As Minok mentioned, there is no such functionality specified in the C standard or the C++ standard. This is considered to be a purely OS-specific feature, and it is specified in the POSIX standard, for example.
Thorsten79 has given good suggestion, it is Boost.Filesystem library. However, it may be inconvenient in case you don't want to have any link-time dependencies in binary form for your program.
A good alternative I would recommend is the collection of 100% header-only STLSoft C++ Libraries by Matthew Wilson (author of must-read books about C++). Its portable facade, PlatformSTL, gives access to the system-specific APIs: WinSTL for Windows and UnixSTL for Unix, so it is a portable solution. All the system-specific elements are specified using traits and policies, so it is an extensible framework. A filesystem library is provided, of course.
The Linux bash command which progname will report a path to the program.
Even if one could issue the which command from within your program and direct the output to a tmp file that the program subsequently reads, it will not tell you if that program is the one executing. It only tells you where a program having that name is located.
What is required is to obtain your process id number, and to parse out the path to the name.
In my program I want to know if the program was executed from the user's bin directory, from another directory in the path, or from /usr/bin. /usr/bin would contain the supported version.
My feeling is that in Linux there is the one solution that is portable.
Use realpath() in stdlib.h like this:
char *working_dir_path = realpath(".", NULL);
The following worked well for me on macOS 10.15.7
brew install boost
main.cpp
#include <iostream>
#include <boost/filesystem.hpp>
int main(int argc, char* argv[]){
boost::filesystem::path p{argv[0]};
p = absolute(p).parent_path();
std::cout << p << std::endl;
return 0;
}
Compiling
g++ -Wall -std=c++11 -l boost_filesystem main.cpp
