wait for user input while debugging Rcpp [duplicate] - r

This question already has an answer here:
Getting user input from R console: Rcpp and std::cin
(1 answer)
Closed 5 years ago.
When debugging Rcpp, I usually find myself having to use old-school Rprintf statements to watch variables and program flow (as per the comment from #RomainFrancois on this answer). However, for situations such as large loops, where too much rapid output from Rprintf can quickly disappear over the top of the console, I would like to be able to pause execution until I press a key.
If I try using a std::cin.get() call in the c++ script and running it from Rstudio, the console gets stuck with no obvious way to direct keystrokes to the c++ call that is awaiting them. NB this is an RSudio specific problem - it does not occur when running the same code from a terminal window, which works as expected.
A simple reproducible example:
library(Rcpp)
cppFunction('
void test() {
for (int i = 0; i < 100; i++) {
Rprintf(\"i = %i\\n\", i);
std::cin.get();
}
}
')
test()
How can I get this to work so I can step through Rcpp functions interactively to debug them?

OK, I figured out a workaround - which is to use an R function (readline) to read user input, rather than a c++ one. I'd still like to know if there is a way for RStudio to pass input to c++ functions.
cppFunction('
void test(){
Environment base = Environment("package:base");
Function readline = base["readline"];
for (int i = 0; i < 10; i++) {
Rprintf(\"i = %i\", i);
readline("");
}
}
')
test()

Related

Assigning Rcpp::XPtr on the R side

I'm learning about external pointers, XPtr, in Rcpp. I made the following test functions:
// [[Rcpp::export]]
Rcpp::XPtr<arma::Mat<int>> create_xptr(int i, int j) {
arma::Mat<int>* ptr(new arma::Mat<int>(i, j));
Rcpp::XPtr<arma::Mat<int>> p(ptr, true);
return p;
}
// [[Rcpp::export]]
void fill_xptr(Rcpp::XPtr<arma::Mat<int>>& xptr, int k) {
(*xptr).fill(k);
}
// [[Rcpp::export]]
arma::Mat<int> return_val(Rcpp::XPtr<arma::Mat<int>>& xptr) {
return *xptr;
}
Now on the R side I can of-course create an instance and work with it:
x <- create_xptr(1000, 1000) # (1)
Say, for some reason I accidentally called create_xptr again and assigned the result to x, i.e
x <- create_xptr(1000, 1000) # (2)
Then, I have no longer access to the pointer i created in (1) which makes sense and hence, I cannot free the memory. What I would like is, that the second time (2) it just overwrite the first one (1). And secondly, if I create an external pointer in some local scope (say a simple for loop), should the memory used then be freed automatically when it goes out of scope? I've tried the following:
for (i in 1:3) {
a <- create_xptr(1000, 100000)
fill_xptr(a, 1)
}
But it just adds to the memory usage for each i.
I have tried reading some code on different git-repos, old posts here on stack read a little about finalizers and garbage collection in R. I can't seem to put together the puzzle.
We use external pointers for things that do not have already existing interfaces such as database connections or objects from other new libraries. You may be better off looking at some existing uses of XPtr (or the other external pointer variants in some other packages, there are two small ones on CRAN).
And I don't think I can think of an example directly referencing this in R. It is mostly for "wrapping" external objects to hold on to them and to pass them around for further use elsewhere. And you are correct that you need to read up a little on finalizers. I find reading Writing R Extensions, as dense as it is, to be the best source because you need to get the initial "basics" in C right first.

What is the rule behind instruction count in Intel PIN?

I wanted to count instructions in simple recursive fibo function O(2^n). I succeded to do so with bubble sort and matrix multiplication, but in this case it seemed like instruction count ignored my fibo function. Here is the code used for instrumentation:
// Insert a call at the entry point of a routine to increment the call count
RTN_InsertCall(rtn, IPOINT_BEFORE, (AFUNPTR)docount, IARG_PTR, &(rc->_rtnCount), IARG_END);
// For each instruction of the routine
for (INS ins = RTN_InsHead(rtn); INS_Valid(ins); ins = INS_Next(ins))
{
// Insert a call to docount to increment the instruction counter for this rtn
INS_InsertCall(ins, IPOINT_BEFORE, (AFUNPTR)docount, IARG_PTR, &(rc->_icount), IARG_END);
}
I started to wonder what's the difference between this program and the previous ones and my first thought was: here I'm not using an array.
This is what I realised after some manual tests:
a = 5; // instruction ignored by PIN and
// pretty much everything not using array
fibo[1] = 1 // instruction counted properly
a = fibo[1] // instruction ignored by PIN
So it seems like only instructions counted are writes to the memory (that's what I assume). After I changed my fibo function to this it works:
long fibonacciNumber(int n, long *fiboNumbers)
{
if (n < 2) {
fiboNumbers[n] = n;
return n;
}
fiboNumbers[n] = fiboNumbers[n-1] + fiboNumbers[n-2];
return fibonacciNumber(n - 1, fiboNumbers) + fibonacciNumber(n - 2, fiboNumbers);
}
But I would like to count instructions also for programs that aren't written by me. Is there a way to count all type of instrunctions? Is there any particular reason why only this instructions are counted? Any help appreciated.
//Edit
I used disassembly option in Visual Studio to check how it looks and it still makes no sense for me. I can't find the reason why only assingment to array is interpreted by PIN as instruction.
instruction_comparison
This exceeded all my expectations, counted as 2 instructions:
even 2 instructions, not one
PIN, like other low-level profiling and analysis tools, measures individual instructions, low-level orders like "add these two registers" or "load a value from that memory address". The sequence of instructions which a program comprises are generally produced from a high-level language like C++ through a compiler. An individual line of C++ code might be transformed into exactly one instruction, but it's also common for a line to translate to several instructions or even to zero instructions; and the instructions for a line of code may be interleaved with those of other instructions.
Your compiler can output an assembly-language file for your source code, showing what instructions were produced for which lines of code. (For GCC and Clang, this is done with the -S flag.) Note that reading the assembly code output from a compiler is not the best way to learn assembly. Also, I would point you to godbolt.org, a very convenient tool for analyzing assembly output.

OpenCL Programm freezes everytime private variables used

I got a kernel method, simplified looks like this.
__kernel void calculate (__global float *a, __global float *b, __global int *res) {
int workItem = get_global_id(0); // Syntax may not right, but you get the idea
int found = 0;
for (int i = 0; i<100000000; i++) {
float c = a[i]*3;
float d = b[i]*2;
if (c<d) {
found++;
}
}
res[workItem]= found;
}
So, nothing much but simple calculation and a very big loop, the problem is the programm freezes all the time when i run this code. I have to force reset the computer everytime this happens.
But if i make some changes, like this
if (true) {
found++;
}
Or
if (1<2) {
found++
}
Then the programm works like a charm, and very fast! So i wonder if is there any thing wrong with variables c and d ? I tried to use things like
__private float c= ..;
__private float d= ..;
It didnt work either.
I read the return code of every step while creating programm and kernel, so its not the problem.
What did I do wrong here ?
You have several issues here.
You are running a giant loop, presumably with a single work item, which is exactly what you want to avoid in OpenCL, as the entire concept of parallelism is lost. OpenCL will not magically make your code fast if you don't use it correctly.
Even if you are using multiple work items, you are not making use of the value you retrieve from get_global_id() except for output. But, as they're all using the same input, you'll get the same output for every single work item!
Work items and their associated global IDs are intended to allow you to partition your processing into discrete units, rather than one big monolithic loop. I suggest you look at some tutorials like this one to understand the concept better. Don't start writing your own code until you understand his.
As for why your program freezes your PC, I can only speculate without seeing your host code. Perhaps you are getting buffer overrun?

Calling the original function after tweaking arguments is an incorrect use of the R-API?

I am trying to create a small extension for R here for embedding the current time on the R prompt: https://github.com/musically-ut/extPrompt
Things seem to be working overall, but R CMD check . raised a warning:
File '[truncated]..Rcheck/extPrompt/libs/extPrompt.so’:
Found non-API call to R: ‘ptr_R_ReadConsole’
Compiled code should not call non-API entry points in R.
The concerned file is this: https://github.com/musically-ut/extPrompt/blob/master/src/extPrompt.c and occurs on line 38, I think.
void extPrompt() {
// Initialize the plugin by replacing the R_ReadConsole function
old_R_ReadConsole = ptr_R_ReadConsole;
ptr_R_ReadConsole = extPrompt_ReadConsole;
// ...
}
int extPrompt_ReadConsole(const char *old_prompt, unsigned char *buf, int len,
int addtohistory) {
// ...
// Call the old function with the `new_prompt`
return (*old_R_ReadConsole)(new_prompt, buf, len, addtohistory);
}
I am trying to make the R_ReadConsole API call. However, since a different plugin (like mine) could have overridden it already, I do not want to directly invoke R_ReadConsole but the function which previously was at ptr_R_ReadConsole.
Is this an incorrect use of the API?
From the r-devel mailing list:
As for incorrectly using the API - yes. Since R CMD check is telling
you this is a problem, it is officially a problem. This is of no
consequence if the package works for you and any users of it, and
the package is not to be hosted on CRAN. However, using this symbol in
your code may not work on all platforms, and is not guaranteed to
work in the future (but probably will!).

Extract long[] from R object

I'm trying to make a wrapper for some C-based sparse-matrix-handling code (see previous question). In order to call the workhorse C function, I need to create a structure that looks like this:
struct smat {
long rows;
long cols;
long vals; /* Total non-zero entries. */
long *pointr; /* For each col (plus 1), index of first non-zero entry. */
long *rowind; /* For each nz entry, the row index. */
double *value; /* For each nz entry, the value. */
};
These correspond nicely to the slots in a dgCMatrix sparse matrix. So ideally I'd just point to the internal arrays in the dgCMatrix (after verifying that the C function won't twiddle with the data [which I haven't done yet]).
For *value, it looks like I'll be able to use REALSXP or something to get a double[] as desired. But for *pointr and *rowind, I'm not sure the best way to get at an appropriate array. Will I need to loop through the entries and copy them to new arrays, casting as I go? Or can Rcpp provide some sugar here? This is the first time I've really used Rcpp much and I'm not well-versed in it yet.
Thanks.
Edit: I'm also having some linking trouble that I don't understand:
Error in dyn.load(libLFile) :
unable to load shared object '/var/folders/TL/TL+wXnanH5uhWm4RtUrrjE+++TM/-Tmp-//RtmpAA9upc/file2d4606aa.so':
dlopen(/var/folders/TL/TL+wXnanH5uhWm4RtUrrjE+++TM/-Tmp-//RtmpAA9upc/file2d4606aa.so, 6): Symbol not found: __Z8svdLAS2AP4smatl
Referenced from: /var/folders/TL/TL+wXnanH5uhWm4RtUrrjE+++TM/-Tmp-//RtmpAA9upc/file2d4606aa.so
Expected in: flat namespace
in /var/folders/TL/TL+wXnanH5uhWm4RtUrrjE+++TM/-Tmp-//RtmpAA9upc/file2d4606aa.so
Do I need to be creating my library with some special compilation flags?
Edit 2: it looks like my libargs parameter has no effect, so libsvd symbols never make it into the library. I can find no way to include libraries using cxxfunction() - here's what I'd tried, but the extra parameters (wishful-thinkingly-borrowed from cfunction()) are silently gobbled up:
fn <- cxxfunction(sig=c(nrow="integer", mi="long", mp="long", mx="numeric"),
body=code,
includes="#include <svdlib.h>\n",
cppargs="-I/Users/u0048513/Downloads/SVDLIBC",
libargs="-L/Users/u0048513/Downloads/SVDLIBC -lsvd",
plugin="Rcpp",
verbose=TRUE)
I feel like I'm going about this whole process wrong, since nothing's working. Anyone kick me in the right direction?
I decided to also post a query on the Rcpp-devel mailing list, and got some good advice & help from Dirk and Doug:
http://lists.r-forge.r-project.org/pipermail/rcpp-devel/2011-February/001851.html
I'm still not super-facile with this stuff, but getting there. =)
I've done something similar for a [R]-Smalltalk-interface last year and went about it more generic to be able to pass all data back-and-forth by using byte-arrays:
In C i have:
DLLIMPORT void getLengthOfNextMessage(byte* a);
DLLIMPORT void getNextMessage(byte* a);
In R:
getLengthOfNextMessage <- function() {
tmp1 <- as.raw(rep(0,4))
tmp2<-.C("getLengthOfNextMessage", tmp1)
return(bvToInt(tmp2))
}
receiveMessage <- function() {
#if(getNumberOfMessages()==0) {
# print("error: no messages")
# return();
#}
tmp1<-as.raw(rep(0, getLengthOfNextMessage()+getSizeOfMessages()))
tmp2<-.C("getNextMessage", tmp1)
msg<-as.raw(tmp2[[1]])
print(":::confirm received")
print(bvToInt(msg[13:16]))
# confirmReceived(bvToInt(msg[13:16]))
return(msg)
}
I have commented-out the use of the functions getNumberOfMessages() and confirmReceived() which are specific to the problem i had to solve (multiple back-and-forth communication). Essentially, the code uses the argument byte-array to transfer the information, first the 4-byte-long length-info, then the actual data. This seems less elegant (even to me) than to use structs, but i found it to be more generic and i can hook into any dll, transfering any datatype.

Resources