Possible to decompile R bytecode? - r

Is it possible to go from compiled R code found in packages back to R source code? I would like to get the source code for various functions from packages installed from CRAN or other sources. I know I can download the full source via separate downloads.

You can extract the text of functions in a package using args() and body(). To list all the objects in a package you can use ls() and specify the package environment.
Caveat: The approach below will give you the source code, but not the NAMESPACE or DESCRIPTION.
For example, to print the source code of everything in ggplot2, try this:
library(ggplot2)
pkg <- as.environment("package:ggplot2")
allfuns <- ls(envir = pkg)
for(f in allfuns[1:2]){
args <- capture.output(print(args(f)))[1]
body <- paste(capture.output(print(body(f))), collapse = "\n")
cat(sprintf("%s <- %s\n%s\n\n", f, args, body))
}
This will give you:
%+% <- function (e1, e2)
{
e2name <- deparse(substitute(e2))
if (is.theme(e1))
add_theme(e1, e2, e2name)
else if (is.ggplot(e1))
add_ggplot(e1, e2, e2name)
}
%+replace% <- function (e1, e2)
{
if (!is.theme(e1) || !is.theme(e2)) {
stop("%+replace% requires two theme objects", call. = FALSE)
}
e1[names(e2)] <- e2
e1
}

Andrie's answer above has been hugely helpful for me. I had to use a package that is not on CRAN or git, and was only distributed as a compiled package built for R 3.0.1. Obviously when I tried to use it with R 4 it didn't work, and I didn't want to keep switching back and forth between R versions, so I had to decompile the package, rebuild the source, and reinstall.
However, Andrie's code example has a few shortcomings, most importantly that it only decompiles exported functions, not internal ones. Obviously this post is pretty old, but I'm posting my updates here in case it's helpful for anyone else trying to do the same thing.
packagename <- "ggplot2" #input package name here
pkg <- asNamespace(packagename) # importing as namespace rather than package accesses internals
allfuns <- ls(name = pkg)
for(f in allfuns){
# Andrie's [1] subset didn't work if the function arguments were more than one line long
args <- head(capture.output(print(args(getFromNamespace(f, packagename)))), -1)
body <- paste(capture.output(print(body(getFromNamespace(f, packagename)))),
collapse = "\n")
# This now writes directly to an R code file, rather than the console, to avoid copy/paste
# You could tweak this to create a separate file for each function, if desired.
cat(sprintf("%s <- %s\n%s\n\n", f, args, body),
file = paste(packagename, "functions.R"),
append = TRUE)
}

Related

Use Rcpp function in foreach %dopar%

I have created an Rcpp test package called "test" using the Rcpp package skeleton to try to run c++ code in parallel but keep running into errors. I'm running R 4.1.2 on Mac OS and have updated all parallel computing packages. I added to the package skeleton an R script containing
# wrap c++ function in R function
test_func <- function()
{
return(rcpp_hello_world())
}
# attempt to parallelize
parallelize <- function()
{
# create cluster
cl <- parallel::makeCluster(parallel::detectCores() - 1)
parallel::clusterExport(cl,varlist = c("test_func","rcpp_hello_world"),envir = environment())
doParallel::registerDoParallel(cl)
# call test_func in parallel
res <- foreach::`%dopar%`(foreach::foreach(i = 1:5,.combine = c),ex = {test_func()})
# clean up
parallel::stopCluster(cl)
return(res)
}
I loaded my package using devtools::load_all(), but typing parallelize() in my console I get the error "Error in { : task 1 failed - "object '_test_rcpp_hello_world' not found" ". When I add "_test_rcpp_hello_world" to the clusterExport call I get the error "Error in { : task 1 failed - "NULL value passed as symbol address" ".
Everything works fine when I switch %dopar% to %do%, but I'm hoping to be able to still parallelize.
I know that similar questions have been asked here, but I can't use a solution which calls sourceCpp on each worker (the c++ code in my actual R package is huge and this operation would defeat the purpose of parallelizing).
Any help would be greatly appreciated!!
(Continuing from the comments)
The key is that to execute 'local' code on a node, you cannot send a (compiled) function to the node. The node needs to have it, and the best way it to have the node(s) have access to the same package(s), load them and thus be ready to run code using them. I just glanced at some old slide decks from presentations I gave and I didn't find an perfect example -- but a pointer to a (thirteen-plus (!!) year old) directory of example scripts including this for running (cpu-wise expensive) DieHarder tests on nodes via Rmpi:
#!/usr/bin/env r
suppressMessages(library(Rmpi))
suppressMessages(library(snow))
cl <- NULL
mpirank <- mpi.comm.rank(0)
if (mpirank == 0) {
cl <- makeMPIcluster()
} else { # or are we a slave?
sink(file="/dev/null")
slaveLoop(makeMPImaster())
mpi.finalize()
q()
}
clusterEvalQ(cl, library(RDieHarder))
res <- parLapply(cl, c("mt19937","mt19937_1999",
"mt19937_1998", "R_mersenne_twister"),
function(x) {
dieharder(rng=x, test="operm5",
psamples=100, seed=12345)
})
stopCluster(cl)
print( do.call(rbind, lapply(res, function(x) { x[[1]] } )))
mpi.quit()
The key is in the middle: clusterEvalQ(cl, library(RDieHarder)) All worker nodes are asked to load the RDieHarder package. Conceptually, you want to do the same here, and the foreach family lets you do it too.

Why Do "rstudio:run:" Hyperlinks Work Only for Functions Exported by "rlang"?

Background
I've often aspired to prototype an R package which serves as a source of truth for styling conventions in (say) diagnostic output, and which enables central updates to those conventions.
After many failed experiments with crayon::hyperlink(), I was excited to stumble across Console hyperlinks to functions like `rlang::last_error()`
which preview documentation when hovered
and execute the code when clicked!
Attempt
On a whim, I prototyped a family of functions (see Code section below) that make universal provisions for such functionality. By inspecting the source code, I was able to replicate the functionality above for other rlang functions. Here we pipe (|>) my function style_run_call0() through cat()
rlang::quo("test") |> style_run_call0() |> cat()
to display in the Console a hyperlink to live code
which previews documentation when hovered
and executes the code when clicked:
Problem
This all works well enough for rlang functions. But for functions from other packages
base::sum(1:10) |> style_run_call0() |> cat()
it wrongly displays a "broken" link:
Even with rlang, the links are broken for any arguments that are calls themselves
rlang::quo(rlang::as_string("test")) |> style_run_call0() |> cat()
and for all private functions
rlang:::ansi_alert() |> style_run_call0() |> cat()
though the links do work for "simple" arguments with operators:
rlang::quo(TRUE || FALSE) |> style_run_call0() |> cat()
#> `rlang::quo(TRUE || FALSE)`
style_run_expr0(TRUE || FALSE) |> cat()
#> `TRUE || FALSE`
Question
I am 99% sure this problem boils down to the rstudio:run: format
style_rlang_run <- function(code) {
style_hyperlink(
paste0("rlang::", code),
paste0("rstudio:run:rlang::", code)
)
}
and its limitations for hyperlinking code.
But why would rlang:::style_rlang_run() need to specify "rlang::" if the "rstudio:run:" accommodated only rlang functions and nothing else?
Code
Text Links
These links are generated from character strings.
# Hyperlink to a URL.
style_hyperlink_url <- function (url, text = NULL, params = NULL) {
# Display text defaults to URL.
if (is.null(text))
text <- url
# Underline the link in blue for classic URL style.
crayon::underline$blue(rlang:::style_hyperlink(text, url, params))
}
# Hyperlink to run code (given as text) interactively in RStudio.
style_run_code <- function(code, text = NULL) {
# Display text defaults to code in backticks.
if (is.null(text))
text <- paste0("`", code, "`")
# Underline the hyperlink in silver to designate code link.
crayon::underline$silver(rlang:::style_hyperlink(text, paste0("rstudio:run:", code)))
}
Live Code
These code links are generated from R language itself.
# Hyperlink to run a (simple) 'call' object as an interactive command in RStudio.
style_run_call <- function(call, text = NULL) {
call_expr <- rlang::get_expr(call)
call_qual <- call_qualify(call_expr)
style_run_code(base::deparse1(call_qual), text)
}
# Hyperlink to run a (simple) literal call as an interactive command in RStudio.
style_run_call0 <- function(call, text = NULL) {
call_quo <- rlang::enquo0(call)
style_run_call(call_quo, text)
}
# Hyperlink to run a (simple) 'expression' object as an interactive command in RStudio.
style_run_expr <- function(expr, text = NULL) {
expr_expr <- rlang::get_expr(expr)
call_quo <- rlang::quo(rlang::eval_bare(!!expr_expr))
# Text defaults to the expression itself, not the code evaluating it.
if (is.null(text))
text <- paste0("`", base::deparse1(expr_expr), "`")
style_run_call(call_quo, text)
}
# Hyperlink to run a (simple) literal 'expression' as an interactive command in RStudio.
style_run_expr0 <- function(expr, text = NULL) {
expr_quo <- rlang::enquo0(expr)
style_run_expr(expr_quo, text)
}
Call Qualification
This function qualifies a call to fn() as pkg::fn().
call_qualify <- function(call) {
if (!rlang::is_call(call))
rlang::abort("`call` must be a call")
if (!rlang::is_call_simple(call))
rlang::abort("`call` must be a simple call")
# Check the namespace that qualifies the function.
call_ns_name <- rlang::call_ns(call)
# Qualify if necessary.
if (is.null(call_ns_name)) {
call_fn <- rlang.call_fn(call)
call_fn_name <- rlang::call_name(call)
call_ns_name <- rlang::ns_env_name(call_fn)
call_ns_sym <- rlang::sym(call_ns_name)
call_fn_sym <- rlang::sym(call_fn_name)
# TODO: Check if namespace exports the function.
if (fn_is_exported(call_fn_name, call_ns_name)) {
qual_sym <- quote(`::`)
} else {
qual_sym <- quote(`:::`)
}
# Assemble the qualified function name.
qual_expr <- quote(`::`(pkg = NULL, name = NULL))
qual_expr[[1]] <- qual_sym
qual_expr$pkg <- call_ns_sym
qual_expr$name <- call_fn_sym
# Assemble the qualified call.
call_expr <- rlang::get_expr(call)
call_expr[[1]] <- qual_expr
call <- rlang::set_expr(call, call_expr)
}
# Return the qualified call.
call
}
Helper Functions
# Function to check if a function is exported (TRUE) from its namespace, or internal (FALSE).
fn_is_exported <- function(fn_name, ns_name) {
# Placeholder.
TRUE
# TODO: Figure out an efficient algorithm.
# Since a function object may be assigned to a new name, perhaps we should match by bytecode instead?
}
# Current styler from "rlang".
.rlang.style_hyperlink <- rlang:::style_hyperlink
# Function to extract the function from a call. Deprecated in "rlang" and reconstructed here.
.rlang.call_fn <- function(call, env = caller_env()) {
expr <- rlang::get_expr(call)
env <- rlang::get_env(call, env)
if (!rlang::is_call(expr)) {
rlang:::abort_call_input_type("call")
}
switch(rlang:::call_type(expr),
recursive = rlang::abort("`call` does not call a named or inlined function"),
inlined = rlang:::node_car(expr),
named = , namespaced = ,
rlang::eval_bare(rlang:::node_car(expr), env)
)
}
Executing base:: functions is explicitly forbidden, see the RStudio PR and discussion in the issue.
If I understand the test code correctly, you can run code of your own package with:
cli::style_hyperlink("show style code", "ide:run:yourpackage::style()")
If yourpackage is not installed, it should just be copied into the console, but not executed.
I'm not really sure what your use case is, but maybe another option would be to generate a link to a help page or vignette in your package?
cli::style_hyperlink("help page", "ide:help:yourpackage::correct_style")

R: Understand the ".call" function in R

I am using R. I am working with a library called "mco" : https://cran.r-project.org/web/packages/mco/index.html
I was looking over some of the function definitions used in this library at the github repository, for example: https://github.com/olafmersmann/mco/blob/master/R/nsga2.R
Over here, I came across the following lines of code:
res <- .Call(do_nsga2,
ff, cf, sys.frame(),
as.integer(odim),
as.integer(cdim),
as.integer(idim),
lower.bounds, upper.bounds,
as.integer(popsize), as.integer(generations),
cprob, as.integer(cdist),
mprob, as.integer(mdist))
if (1 == length(res)) {
res <- res[[1]]
names(res) <- c("par", "value", "pareto.optimal")
class(res) <- c("nsga2", "mco")
} else {
for (i in 1:length(res)) {
names(res[[i]]) <- c("par", "value", "pareto.optimal")
class(res[[i]]) <- c("nsga2", "mco")
}
class(res) <- "nsga2.collection"
}
return (res)
}
In the very first line of this code, it makes reference to some object called "do_nsga2". But apart from this function, I can't find any reference to "do_nsga2" within the entire package.
Does anyone know what exactly is being "called"?
Thanks
Note: I am trying to copy/paste all the functions from the github repository into my R session, since I am working with an older computer in which directly installing libraries from CRAN is not possible. When I tried to copy/paste all these functions, I got the following error:
Error in nsga2....
object 'do_nsga2' not found

Adding additional documentation to a package in R

Aside from a vignette, I wish to add an additional document as PDF to my package. I can, of course, copy it to the inst/doc directory and it will then be included in the package documentation.
However, I would like to make it easy for the user to display this file. The authors of the edgeR package decided to do the following: the main users manual is distributed as PDF (and is not a regular vignette), and the authors include a function called edgeRUsersGuide() which shows the PDF by means of the following code:
edgeRUsersGuide <- function (view = TRUE) {
f <- system.file("doc", "edgeRUsersGuide.pdf", package = "edgeR")
if (view) {
if (.Platform$OS.type == "windows")
shell.exec(f)
else system(paste(Sys.getenv("R_PDFVIEWER"), f, "&"))
}
return(f)
}
It appears to work. Do you think it is a reasonable approach?
Or should one use something else? Potentially, the following code would also work and be more robust:
z <- list(PDF="edgeR.pdf", Dir=system.file(package="edgeR"))
class(z) <- "vignette"
return(z)
My solution was to ape the code in utils:::print.vignette():
function(docfile) {
## code inspired by tools:::print.vignette
pdfviewer <- getOption("pdfviewer")
f <- system.file("doc", docfile, package = "tmod")
if(identical(pdfviewer, "false"))
stop(sprintf("Cannot display the file %s", f))
if (.Platform$OS.type == "windows" &&
identical(pdfviewer, file.path(R.home("bin"), "open.exe"))) {
shell.exec(f)
} else {
system2(pdfviewer, shQuote(f), wait = FALSE)
}
return(invisible(f))
}

Patch base::library with wrapper in R

Inside an R package, I'm trying to patch the base::library() function in R to specifically set the position of the loaded packages in the search path. I haveve defined several environments (all named env:<something>) and want to make sure that libraries are placed below these environments in the search path.
# wrap around library function.
library = function(..., pos = NULL) {
print("NEW LIBRARY FUNCTION!")
if (is.null(pos)) {
pos <- grep("env:", search())
pos <- if (length(pos) == 0) 2 else max(pos) + 1
}
base::library(..., pos=pos)
}
When I assign this function in the console, everything runs fine:
> library(stats)
[1] "NEW LIBRARY FUNCTION!"
> eval(parse(text = "library(stats)"))
[1] "NEW LIBRARY FUNCTION!"
> eval(parse(text = "library(stats)"), envir = globalenv())
[1] "NEW LIBRARY FUNCTION!"
When I define the above wrapper function inside my package, build it and load it in a new R Session, the following executes as expected:
> library(mypackage)
> mypackage:::library(stats)
[1] "NEW LIBRARY FUNCTION!"
But, when using eval() with the envir argument inside a function in mypackage, my new definition of library() is not retrieved:
# Functions defined in mypackage
testlibrary1 = function(...) {
library(...)
}
testlibrary2 = function(code) {
eval(parse(text = code))
}
testlibrary3 = function(code) {
eval(parse(text = code), envir = globalenv())
}
In console, I get the following results:
> mypackage:::testlibrary1(stats)
[1] "NEW LIBRARY FUNCTION!"
> mypackage:::testlibrary2("library(stats)")
[1] "NEW LIBRARY FUNCTION!"
> mypackage:::testlibrary3("library(stats)")
>
The last function, testlibrary3(), did not use the new wrapper function.
I want all functions that call library() inside mypackage to use my wrapper function. Can somebody help me out?
I guess the problem is the following, but as your question did not include a fully reproducible example (i.e., by uploading the package somewhere) it is difficult to tell.
As long as your library function is not exported from your package via the NAMESPACE it is not visible. Consequently, the only available library function to eval is base::library().
Note that while your function resides in the namespace of the package the calling environment for mypackage:::testlibraryX() is still the global environment. There your library functions is not available. Try to export is and see if this helps.

Resources