Extracting source code from an R package

I am trying to install the R package sowas, but it is too old to install under current versions of R.
According to the author, you can still use the package by calling source() on its code, but I have not been able to figure out how to do that.
Any help is appreciated.
Here is a link to the package, since it is not on CRAN: http://tocsy.pik-potsdam.de/wavelets/

The .zip file is a Windows binary, so it won't be very interesting. What you'll want to look at is the contents of the .tar.gz archive. You can extract those contents and then look at the code in the R subdirectory.
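A minimal sketch of that extraction step, using only base R. The helper name source_package_code and the tarball file name in the usage comment are assumptions for illustration; the real file name depends on what you download. Any compiled code in a src directory is not loaded this way.

```r
# Hypothetical helper: unpack a source tarball and source() every file
# found in an "R" subdirectory into the chosen environment.
source_package_code <- function(tarball,
                                exdir = tempfile("pkgsrc"),
                                envir = globalenv()) {
  dir.create(exdir, showWarnings = FALSE, recursive = TRUE)
  utils::untar(tarball, exdir = exdir)
  r_files <- list.files(exdir, pattern = "\\.[Rr]$",
                        recursive = TRUE, full.names = TRUE)
  # Keep only files that sit directly inside a directory named "R"
  r_files <- r_files[basename(dirname(r_files)) == "R"]
  for (f in sort(r_files)) source(f, local = envir)
  invisible(r_files)
}

# Usage (the file name is an assumption):
# source_package_code("sowas.tar.gz")
```

Sourcing into a dedicated environment (envir = new.env()) keeps the package's functions from overwriting objects in your workspace.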
You could also update the package to work with new versions of R so that you can actually build and install the package. To do so you could unpack the .tar.gz as before but now you'll need to add a NAMESPACE file. This is just a plaintext file at the top of the package directory that has a form like:
export(createar)
export(createwgn)
export(criticalvaluesWCO)
export(criticalvaluesWSP)
export(cwt.ts)
export(plot.wt)
export(plotwt)
export(readmatrix)
export(readts)
export(rk)
export(wco)
export(wcs)
export(writematrix)
export(wsp)
There should be one export statement for each function in the package that you actually want to be able to use. If a function isn't exported, the other functions in the package still have access to it, but the user can't call it (as easily). Once you have added the file, you should be able to build and install the package.
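A small sketch of generating such a NAMESPACE file from a vector of function names. The helper write_namespace and the demo directory are made up for illustration; in practice pkg_dir would be the top-level directory of the unpacked package.

```r
# Hypothetical helper: write one export() directive per function name
write_namespace <- function(pkg_dir, exports) {
  lines <- sprintf("export(%s)", exports)
  writeLines(lines, file.path(pkg_dir, "NAMESPACE"))
  invisible(lines)
}

# Stand-in directory for the unpacked package (an assumption)
pkg_dir <- file.path(tempdir(), "demo_pkg")
dir.create(pkg_dir, showWarnings = FALSE)

write_namespace(pkg_dir, c("cwt.ts", "wco", "wsp"))
namespace_lines <- readLines(file.path(pkg_dir, "NAMESPACE"))
print(namespace_lines)
```

With the file in place, the package can be installed from R with install.packages(pkg_dir, repos = NULL, type = "source"), or from a shell with R CMD build followed by R CMD INSTALL.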
I took the liberty of doing some of this already. I haven't actually taken the time to figure out which functions are useful and should be exported and just assumed that if a help page was written for the function that it should be exported and if there wasn't a help page then I didn't export it. I used Rd2roxygen to convert the help pages to roxygen code (because that's how I roll) and had to do a little bit of cleanup after that but it seems to install just fine.
So if you have the devtools package installed, you should be able to install the version I modified directly by using the following commands:
library(devtools)
install_github("Dasonk/SOWAS")
Personally I would recommend that you go the route of adding the NAMESPACE file and what not directly as then you'll have more control over the code and be more able to fix any problems that might occur when using the package. Or if you use git you could fork my repo and continue fixing things from there. Good luck.

If you want to see the source code of a particular function, just type the name of the function without the parentheses and press Enter. You will see the code.
For example, type var at the R prompt to see its code.
> var
function (x, y = NULL, na.rm = FALSE, use)
{
    if (missing(use))
        use <- if (na.rm)
            "na.or.complete"
        else "everything"
    na.method <- pmatch(use, c("all.obs", "complete.obs", "pairwise.complete.obs",
        "everything", "na.or.complete"))
    if (is.na(na.method))
        stop("invalid 'use' argument")
    if (is.data.frame(x))
        x <- as.matrix(x)
    else stopifnot(is.atomic(x))
    if (is.data.frame(y))
        y <- as.matrix(y)
    else stopifnot(is.atomic(y))
    .Call(C_cov, x, y, na.method, FALSE)
}
<bytecode: 0x0000000008c97980>
<environment: namespace:stats>
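Typing the name only works for functions that are visible on the search path. As a hedged sketch, the following shows base R ways to reach functions that are not exported, using the print method for lm objects in stats as the example (chosen here for illustration, not taken from the answer above).

```r
# Exported functions can be deparsed (or printed) directly
var_source <- deparse(stats::var)

# S3 methods are often registered but not exported;
# getS3method() retrieves them by generic and class
print_lm <- utils::getS3method("print", "lm")
method_home <- environmentName(environment(print_lm))

# package:::name reaches any object inside a namespace
same_fn <- stats:::print.lm
```

utils::getAnywhere("print.lm") is another option when you do not know which namespace holds the function.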


How to tell RStudio to autocomplete my function's arguments with package names?

According to RStudio:
In addition, certain functions, such as library() and require(), expect package names for completions. RStudio automatically infers whether a particular function expects a package name and provides those names as completion...
My question is: how? I'm writing a custom function that takes package names as arguments, yet RStudio's only completing the arguments with object & function names, and I can't tell what it is about the library() and require() code that RStudio is picking up on.
My function is:
unpack <- function(...,
                   lib = NULL,
                   repos = getOption("repos")) {
  pkgs <- sapply(match.call(expand.dots = TRUE)[-1], as.character)
  new.pkgs <-
    pkgs[!(
      pkgs %in% installed.packages(lib.loc = lib)[, "Package"]
    )]
  if (length(new.pkgs))
    install.packages(new.pkgs,
                     lib = lib,
                     repos = repos)
  sapply(pkgs, require,
         lib.loc = lib,
         character.only = TRUE)
}
As @hrbrmstr pointed out, there is both Java and R code in RStudio that specifically names the four functions that autocomplete with package names, so the solution is either to mask one of those and cross your fingers, or to add your function's name to those lists in both source files (or maybe just the R one, I wonder).
I recently created a package that adds a few more autocompletions (totally experimental, and implemented as extra code only).
It can be seen here https://github.com/r-rudra/patch/blob/main/inst/embedded/usecases.R
Maybe soon enough all these will be available by default in RStudio.
Check this comment

R: Patching a package function and reloading base libraries

Occasionally one wants to patch a function in a package, without recompiling the whole package.
For example, in Emacs ESS, the function install.packages() might get stuck if tcltk is not loaded. One might want to patch install.packages() in order to require tcltk before installation and unload it after the package setup.
A patched version of install.packages(), stored in temp, might be:
## Get original args without ending NULL
temp=rev(rev(deparse(args(install.packages)))[-1])
temp=paste(paste(temp, collapse="\n"),
## Add code to load tcltk
"{",
" wasloaded= 'package:tcltk' %in% search()",
" require(tcltk)",
## Add original body without braces
paste(rev(rev(deparse(body(install.packages))[-1])[-1]), collapse="\n"),
## Unload tcltk if it was not loaded before by user
" if(!wasloaded) detach('package:tcltk', unload=TRUE)",
"}\n",
sep="\n")
## Eval patched function
temp=eval(parse(text=temp))
# temp
Now we want to replace the original install.packages() and perhaps insert the code in Rprofile.
To this end it is worth noting that:
getAnywhere("install.packages")
# A single object matching 'install.packages' was found
# It was found in the following places
# package:utils
# namespace:utils
# with value
#
# ... install.packages() source follows (quite lengthy)
That is, the function is stored inside the package/namespace of utils. This environment is sealed and therefore install.packages() should be unlocked before being replaced:
## Override original function
unlockBinding("install.packages", as.environment("package:utils"))
assign("install.packages", temp, envir=as.environment("package:utils"))
unlockBinding("install.packages", asNamespace("utils"))
assign("install.packages", temp, envir=asNamespace("utils"))
rm(temp)
Using getAnywhere() again, we get:
getAnywhere("install.packages")
# A single object matching 'install.packages' was found
# It was found in the following places
# package:utils
# namespace:utils
# with value
#
# ... the *new* install.packages() source follows
It seems that the patched function is placed in the right place.
Unfortunately, running it gives:
Error in install.packages(xxxxx) :
could not find function "getDependencies"
getDependencies() is a function inside the same utils package, but not exported; therefore it is not accessible outside its namespace.
Despite the output of getAnywhere("install.packages"), the patched install.packages() is still misplaced.
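One likely contributing factor, offered here as an assumption rather than something established in the question: a function created with eval(parse(text = ...)) at top level is enclosed by the global environment, not by the utils namespace, so non-exported helpers such as getDependencies() are invisible to it wherever the function is assigned. The toy function below is made up to illustrate the point.

```r
# A function built from text at top level is enclosed by the global environment
patched <- eval(parse(text = "function(x) head(x, 1)"), envir = globalenv())
env_before <- environmentName(environment(patched))

# The original lives in the utils namespace
env_original <- environmentName(environment(utils::install.packages))

# Re-homing the closure lets it see the namespace's internal objects
environment(patched) <- asNamespace("utils")
env_after <- environmentName(environment(patched))

print(c(before = env_before, original = env_original, after = env_after))
```

Under that assumption, setting environment(temp) <- asNamespace("utils") before assigning the patched function would address the lookup error without detaching any packages.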
The problem is that we need to reload the utils library to obtain the desired effect, which also requires unloading other libraries importing it.
detach("package:stats", unload=TRUE)
detach("package:graphics", unload=TRUE)
detach("package:grDevices", unload=TRUE)
detach("package:utils", unload=TRUE)
library(utils)
install.packages() works now.
Of course, we need to reload the other libraries too. Given the dependencies, using
library(stats)
should reload everything. But there is a problem when reloading the graphics library, at least on Windows:
library(graphics)
# Error in FUN(X[[i]], ...) :
# no such symbol C_contour in package path/to/library/graphics/libs/x64/graphics.dll
Which is the correct way of (re)loading the graphics library?
Patching functions in packages is a low-level operation that should be avoided, because it may break internal assumptions of the execution environment and lead to unpredictable behavior or crashes. If there is a problem with tcltk/ESS (I didn't try to reproduce it), perhaps it should be fixed, or there may be a workaround. Changing locked bindings in particular is something to avoid.
If you really wanted to run some code at the start/end of say install.packages, you can use trace. It will do some of the low-level operations mentioned in the question, but the good part is you don't have to worry about fixing this whenever some new internals of R change.
trace(install.packages,
tracer=quote(cat("Starting install.packages\n")),
exit=quote(cat("Ending install packages.\n"))
)
Replace tracer and exit accordingly - maybe exit is not needed anyway, maybe you don't need to unload the package. Still, trace is a very useful tool for debugging.
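The same pattern can be tried safely on a toy function first. In this sketch slow_sum is a made-up name, and print = FALSE only suppresses trace's own "Tracing ..." messages so that just the tracer output appears.

```r
# Toy function standing in for install.packages
slow_sum <- function(x) sum(x)

# Run code on entry and on exit without editing the function itself
trace(slow_sum,
      tracer = quote(cat("enter\n")),
      exit = quote(cat("exit\n")),
      print = FALSE)

traced_output <- capture.output(result <- slow_sum(1:3))

# Restore the original function
untrace(slow_sum)
plain_output <- capture.output(slow_sum(1:3))
```

After untrace(), the function behaves as before, which is the main advantage over overwriting a binding by hand.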
I am not sure if that will solve your problem - if it would work with ESS - but in general you can also wrap install.packages in a function you define say in your workspace:
install.packages <- function(...) {
  cat("Entry.\n")
  on.exit(cat("Exit.\n"))
  utils::install.packages(...)
}
This is the cleanest option indeed.

Automatically install list of packages in R if necessary

I would like to check, at the beginning of my R script, whether the required packages are installed and, if not, install them.
I would like to use something like the following:
RequiredPackages <- c("stockPortfolio","quadprog")
for (i in RequiredPackages) { #Installs packages if not yet installed
if (!require(i)) install.packages(i)
}
However, this gives me error messages because R tries to install a package named 'i'. If instead I use...
if (!require(i)) install.packages(get(i))
...in the relevant line, I still get error messages.
Anybody know how to solve this?
Although the problem has been solved by @Thomas's answer, I would like to point out that pacman might be a better yet simple choice:
First install pacman:
install.packages("pacman")
Then load packages. Pacman will check whether each package has been installed, and if not, will install it automatically.
pacman::p_load("stockPortfolio","quadprog")
That's it.
Relevant links:
pacman GitHub page
Introduction to pacman
I think this is close to what you want:
requiredPackages <- c("stockPortfolio","quadprog")
for (package in requiredPackages) { #Installs packages if not yet installed
if (!requireNamespace(package, quietly = TRUE))
install.packages(package)
}
HERE is the source code and an explanation of the requireNamespace function.
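As a sketch of the same check without the install step, the helper below (missing_packages is a hypothetical name) only reports which packages cannot be loaded, which can be handy for a dry run. The non-existent package name is invented for the example.

```r
# Hypothetical helper: return the names of packages that are not installed
missing_packages <- function(pkgs) {
  available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!available]
}

# "stats" ships with R; the second name is deliberately made up
to_install <- missing_packages(c("stats", "thisPackageDoesNotExist123"))
print(to_install)
```

The result could then be passed to install.packages() in a single call instead of once per package.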
Both library and require use non-standard evaluation on their first argument by default. This makes them hard to use in programming. However, they both take a character.only argument (Default is FALSE), which you can use to achieve your result:
RequiredPackages <- c("stockPortfolio","quadprog")
for (i in RequiredPackages) { #Installs packages if not yet installed
if (!require(i, character.only = TRUE)) install.packages(i)
}
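The difference between the two evaluation modes can be seen in a short sketch. The variable name pkg_in_variable is arbitrary and assumes no package of that name is installed.

```r
pkg_in_variable <- "stats"

# Non-standard evaluation: R looks for a package literally
# named "pkg_in_variable" and fails
found_by_symbol <- suppressWarnings(
  suppressMessages(require(pkg_in_variable))
)

# character.only = TRUE: R uses the value stored in the variable
found_by_value <- require(pkg_in_variable, character.only = TRUE)

print(c(by_symbol = found_by_symbol, by_value = found_by_value))
```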
I have by now written the following function (and put it into a package), which essentially does what @Thomas and @federico propose:
SPLoadPackages <- function(packages) {
  for (fP in packages) {
    # Install the package if it cannot be loaded, then attach it
    if (!require(fP, character.only = TRUE)) install.packages(fP)
    library(fP, character.only = TRUE)
  }
}

R Functions require package declaration when they are included from another file?

I am writing some data manipulation scripts in R, and I finally decided to create an external .r file and call my functions from there. But it started giving me some problems when I try calling some functions. Simple example:
This one works with no problem:
change_column_names <- function(file,new_columns,seperation){
new_data <- read.table(file, header=TRUE, sep=seperation)
colnames(new_data) <- new_columns
write.table(new_data, file=file, sep=seperation, quote=FALSE, row.names = FALSE)
}
change_column_names("myfile.txt",c("Column1", "Column2", "Cost"),"|")
When I create a file "data_manipulation.r", put the above change_column_names function in there, and do this
sys.source("data_manipulation.r")
change_column_names("myfile.txt",c("Column1", "Column2", "Cost"),"|")
it does not work. It gives me a could not find function "read.table" error. I fixed it by changing the function calls to utils::read.table and utils::write.table.
But this is getting frustrating. Now I have the same issue with the aggregate function, and I do not even know what package it belongs to.
My questions: Which package does aggregate belong to? How can I easily find out which package a function comes from? Is there a cleaner way to handle this issue?
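By default, sys.source() evaluates the file inside the base environment (which contains only the objects of the base package, so functions from utils or stats are not visible from there) rather than the global environment (where you usually evaluate code). You probably should just be using source() instead.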
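A sketch of the effect, using two throwaway environments instead of writing into the base environment itself. The script and the function first_row are invented for the example; head() stands in for read.table() because both live in utils.

```r
# Write a tiny script that defines a function relying on utils::head()
script <- tempfile(fileext = ".R")
writeLines("first_row <- function(df) head(df, 1)", script)

# An environment whose only ancestor is the base environment
# (mimics what code evaluated with envir = baseenv() can see)
base_only <- new.env(parent = baseenv())
sys.source(script, envir = base_only)
res_base <- try(base_only$first_row(data.frame(a = 1:3)), silent = TRUE)

# An environment that inherits from the global environment
# sees the whole search path, including package:utils
with_search <- new.env(parent = globalenv())
sys.source(script, envir = with_search)
res_global <- with_search$first_row(data.frame(a = 1:3))
```

Here res_base holds a "could not find function" error while res_global is a one-row data frame, which matches the behavior described in the question.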
You can also see where functions come from by looking at their environment.
environment(aggregate)
# <environment: namespace:stats>
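For a lookup by name, utils::find() reports where on the search path a function is defined. The wrapper package_of below is a hypothetical convenience, and it only finds functions from packages that are currently attached.

```r
# Hypothetical helper: which attached package(s) define this function?
package_of <- function(fun_name) {
  hits <- utils::find(fun_name, mode = "function")
  sub("^package:", "", hits)
}

aggregate_home <- package_of("aggregate")
read_table_home <- package_of("read.table")
print(c(aggregate = aggregate_home[1], read.table = read_table_home[1]))
```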
For the first part of your question: If you want to find what package a function belongs to, and that function is working properly you can do one of two (probably more) things:
1.) Access the help file: run ?aggregate and you will see the package it belongs to at the top of the help page.
Another way is to simply type aggregate, without parentheses, into the R console:
> aggregate
function (x, ...)
UseMethod("aggregate")
<bytecode: 0x7fa7a2328b40>
<environment: namespace:stats>
The namespace is the package it belongs to.
2.) Both of those functions that you are having trouble with are base R functions and should always be loaded. I was not able to recreate the issue. Try using source instead of sys.source and let me know if it alleviates your error.

How to debug (placing break point,etc) an installed R package in RStudio?

I need to run a function line by line to understand how it works. But the function is part of an installed package, and I don't know where R stores the source of installed packages (say MultiPhen). I am using RStudio 0.98.501 and R 3.0.2 on Ubuntu 12 (64-bit). Apparently the source code of installed packages is not stored, right? Sorry if it is a naive question; I am new to R.
If the sources are not stored, is there any way to re-install a package with its source and debug it (basically, place a breakpoint)?
Thanks,
Kayhan
Look at trace. Here is an example that adds a breakpoint at the sixth statement of the function var in the stats package, by asking trace to invoke the function browser at that statement:
> trace(var, browser, at=6)
Tracing function "var" in package "stats"
[1] "var"
> var(1:10)
Tracing var(1:10) step 6
Called from: eval(expr, envir, enclos)
Browse[1]> n
debug: if (is.data.frame(y)) y <- as.matrix(y) else stopifnot(is.atomic(y))
Browse[2]> n
debug: stopifnot(is.atomic(y))
Browse[2]> n
debug: .Call(C_cov, x, y, na.method, FALSE)
Browse[2]> n
[1] 9.166667
Remember to untrace when you're done. You can do fairly complex stuff with trace, though in most cases trace(fun.name, browser) is probably enough.
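Base R also offers debug(), debugonce() and undebug(), which flag a function so that the next call steps through it in the browser. The sketch below only sets and clears the flag, so it runs non-interactively; calling var() while the flag is set would open the browser.

```r
# Flag stats::var so that every call opens the browser
debug(var)
flagged <- isdebugged(var)

# Clear the flag again
undebug(var)
cleared <- !isdebugged(var)

# debugonce(var) would set the flag for a single call only
print(c(flagged = flagged, cleared = cleared))
```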
Alternatively, you can just load the package and type the name of the function on the command line like so:
> var
function (x, y = NULL, na.rm = FALSE, use)
{
    if (missing(use))
        use <- if (na.rm)
            "na.or.complete"
        else "everything"
    na.method <- pmatch(use, c("all.obs", "complete.obs", "pairwise.complete.obs",
        "everything", "na.or.complete"))
    if (is.na(na.method))
        stop("invalid 'use' argument")
    if (is.data.frame(x))
        x <- as.matrix(x)
    else stopifnot(is.atomic(x))
    if (is.data.frame(y))
        y <- as.matrix(y)
    else stopifnot(is.atomic(y))
    .Call(C_cov, x, y, na.method, FALSE)
}
<bytecode: 0x000000000928ad30>
<environment: namespace:stats>
You can then copy that into your editor and play around with it, add your browser statement, and step through the results.
I think that when you type install.packages('MultiPhen') you get a binary version of the package. I think that there is no way to set a breakpoint and step through the code with that version of the package.
All R packages are open source, and the source is available on the CRAN page for the package. For example, this is the CRAN page for MultiPhen. If you click on the link next to the text "Package source:" you will download the source.
In terms of what to do when you have the source: all R packages have the same directory structure. What matters for your situation is that all the R code for the package is in the directory called "R".
I recommend uninstalling the package from RStudio and sourcing the code in the directory "R", setting breakpoints and stepping thru code as you see fit.
Please let us know if this solves your problem.
I found an easy way to do this. First, write a script that calls the function, and set a breakpoint in it. Run the script, and it will stop at the breakpoint. You will then see several options for running the code: next line, step into the function, continue, stop, and so on. Now you can run the code line by line and step into your function.
