Reliably extract srclines and srcfile from a function

I need to extract the exact lines of source code that were parsed to create an R function, for use in coverage analysis. deparse() is not accurate enough, because exact line numbers matter when doing coverage analysis with the covr package.
If there is a srcfile, I just need the filename. If there isn't (e.g. the function was created in the console), I need to create an equivalent temporary file that could have been, line by line, the source file for that function.
I see several functions for extracting src information from a function, like getSrcFilename or getSrcref, but none specifically for getting the source code.
getSrcLines looked promising, but doesn't take functions as arguments. I tried using attributes() to get to the srcref and extract the information that way, but it doesn't seem to be stored consistently -- clearly I am missing something.
Sometimes attributes(body(cover.fun))$srcfile works and sometimes attributes(attributes(cover.fun)$srcref)$srcfile does, and in the srcref itself I found the source in srcfile$lines or srcfile$original$lines. Of course these look like experiments, not The Right Way to implement this.
I need something that handles functions created in a package (with or without source retention) or interactively. If the filename is available, that's all I need; otherwise I need the source lines. Thanks.
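To make the goal concrete, here is the shape of helper I'm imagining -- a rough sketch only, assuming srcrefs were kept (options(keep.source = TRUE), or a package installed with keep.source); fun_source() and its return value are names I made up:

fun_source <- function(fun) {
  src <- attr(fun, "srcref")
  if (!is.null(src)) {
    lines <- as.character(src)  # exact source lines of the definition
    file <- utils::getSrcFilename(fun, full.names = TRUE)
    if (length(file) == 1 && nzchar(file) && file.exists(file)) {
      return(list(file = file, lines = lines))  # real source file on disk
    }
    tmp <- tempfile(fileext = ".R")  # console-defined: fabricate a file
    writeLines(lines, tmp)
    return(list(file = tmp, lines = lines))
  }
  # no srcref at all: fall back to deparse(), losing original line numbers
  tmp <- tempfile(fileext = ".R")
  writeLines(deparse(fun), tmp)
  list(file = tmp, lines = deparse(fun))
}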

Related

How can I feed a string that includes quotes to a system2 call in R?

I have a somewhat niche question that I'm hoping someone can answer for me or help me find a work-around to:
I've written a script in R that will run an ImageJ macro for sets of images I produce as a part of my workflow.
Because this is work that I may publish at some point, and that may be used by other researchers in the lab after me, I like to keep a copy of the R script and ImageJ macro within each dataset's folder. I sometimes modify the script a little for a certain series of images, and this makes it very clear which version of the code I used to process which set of images.
I am somewhat new to coding, so I'm slowly trying to make this piece of code more streamlined, with fewer places in the script that need to be modified each time I copy it to a new dataset's folder, which is where I'm running into an issue.
In the script, I call the macro using the following code:
macro <- function(i) {
  system2('/Applications/Fiji.app/Contents/MacOS/ImageJ-macosx',
          args = c('-batch "/Users/xxxx/yyyy/zzzz/current experiment/ImageJ Macro.ijm"', i))
}
For each new project I need to edit the filepath manually. I would love a way to define a variable at the beginning of the script which could be passed into the arguments as a string, but I can't figure out how to do it.
I've tried creating a variable just for the filepath, but R won't substitute a variable into the middle of the string that includes '-batch...'.
I've also tried creating a variable containing the entire string that needs to be passed to args, but that doesn't work either. Here's what I coded for that attempt:
ImageJMacro <- paste(getwd(),"/ImageJ Macro.ijm",sep="")
batch1 = sprintf('-batch "%s"', ImageJMacro)
batchline = sprintf("'%s'", batch1)
As you can see, I had to do this in two steps, because putting single quotes outside of double quotes was giving an error. I thought this would work, because when I run:
cat(batchline)
The string looks correct, but when passed into the arguments clause of the system command like so:
macro <- function(i) {
  system2('/Applications/Fiji.app/Contents/MacOS/ImageJ-macosx',
          args = c(batchline, i))
}
it still throws an error.
Any ideas? Other solutions I should try? Thanks in advance for your help, I appreciate it!
Editing to add additional clarification as requested by @rmagn0:
ImageJ is an image analysis program which allows you to write 'macros' (hard-coded scripts of repetitive analyses) and apply them across many images. I'm not including the ImageJ macro code here because it's not relevant to my question. Suffice it to say that it expects to receive a string argument as input, which it then parses into several components used in the image processing. These components are separated by an asterisk delimiter, as described in this Stack Overflow question: Calling an ImageJ Macro from R
I am trying to use R to pass a list of arguments to my ImageJ macro, one for each data file I need analyzed, as demonstrated in the code above. Note on the above: I named the R function 'macro', but it is really just calling the command-line instance of my ImageJ macro.
If I were to run one instance of the list in the command line, it would look like this:
Contents/MacOS/ImageJ-macosx -batch "/Users/xxxx/yyyy/zzzz/current experiment/ImageJ Macro.ijm" ImageName.tif*Xcoord*Ycoord*/Users/xxxx/yyyy/zzzz/InputDirectory*/Users/xxxx/yyyy/zzzz/OutputDirectory*Suffix
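For reference, one direction I'm considering but haven't verified: pass the pieces of the command as separate args and let shQuote() do the quoting, instead of embedding quotes manually. The paths here are the same placeholders as above:

imagej <- '/Applications/Fiji.app/Contents/MacOS/ImageJ-macosx'
macro_file <- file.path(getwd(), "ImageJ Macro.ijm")  # path contains a space

macro <- function(i) {
  # system2() joins args with spaces; shQuote() protects the space in the path
  system2(imagej, args = c("-batch", shQuote(macro_file), i))
}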

How to get roxygen2 to interpret backticks as code formatting?

The standard way of writing documentation with code formatting is by using \code{}.
That is, #' @param foo value passed on to \code{bar()} becomes
Arguments
foo    value passed on to bar()
However, I've seen some packages (e.g. dplyr) use backticks instead of \code{} to the same effect. This is much better, since it's less clunky and allows for very nice syntax highlighting.
However, if I try that on my own package, the backticks get interpreted as... just backticks, like any other character.
The documentation for dplyr::across(), for example, starts with:
#' @description
#' `across()` makes it easy to apply the same transformation to multiple [...]
which gets compiled and displayed in the man page as:
Description
across() makes it easy to apply the same transformation to multiple [...]
But if I try something similar on my package, I get:
Description
`across()` makes it easy to apply the same transformation to multiple [...]
Weirdly, I've forked the glue package (which also manages to use backticks for code formatting) for a simple PR, and if I build that package locally, the backticks work (I get code formatting). I can't for the life of me figure out why it works there but not in my package.
So, is there some setting I need to modify to get this to work? I checked the dplyr.Rproj but found nothing relevant. I also glanced at the Doxyfile, but didn't know what it did or what I'd even be looking for there.
All credit goes to @rawr's comment on the question; I'm just formalizing it as an answer:
The secret is in the roxygen2 documentation: just add the following to the end of the package DESCRIPTION file:
# DESCRIPTION file
# [... rest of file ...]
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.1.2 # actually works since 6.0.0
As that code would imply, this sets roxygen2 to interpret the comments as good ol' Markdown, like we're used to using here on SO and elsewhere. This also enables all the other standard Markdown syntax, such as **bold** and *italics*, as well as [text](http://www.url.com) links, code blocks delimited by ```, itemized and enumerated lists, etc. It's a huge improvement across the board.
Be careful and take a look at the documentation, though, since there are a few gotchas. For instance, an empty line isn't necessary to start a list, so don't start any lines with #' 1. [...] or #' * [...] or you'll accidentally create one! There are also a few things which don't work yet, but they're pretty minor.
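For illustration, here is a roxygen block written with markdown enabled (my_fun() and bar() are made-up names):

#' @description
#' `my_fun()` simply wraps `bar()`; inline code, **bold**, *italics*, and
#' [links](https://roxygen2.r-lib.org) all render as expected.
#'
#' @param foo value passed on to `bar()`
#' @export
my_fun <- function(foo) bar(foo)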

Function parameters - replace by reference

Thanks for all your advice. My remaining question is this:
Can I replace the column name 'sulfate' in the following statement ...
dataclean <- datatable$sulfate[!datanas]
... with a reference to a parameter 'pollutant', which may or may not have the value 'sulfate'?
When you pass values to arguments, they behave as if they were objects in your workspace, except that their environment is not the workspace but that of the function.
So in your case, directory would be a character string and it would work, but only the first time: your working directory is then changed, and you need to revert to the previous one for the function to work again. This can get pretty messy, so what I like to do is just refer to raw files by their full paths. See ?list.files for more info.
For your second question, your best bet for referring to a particular column of the variable is to do
x[, pollutant]
It is convenient to add the drop = FALSE argument there, in order to keep what I'm assuming is a data.frame.
You could improve your function by also implementing the datatable argument. That way you have all the objects bundled together nicely.
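A sketch of what that refactor might look like (the function and argument names here are my own invention, based on the question):

# pollutant is passed in as a string, so 'sulfate' is no longer hard-coded
clean_pollutant <- function(datatable, pollutant) {
  datanas <- is.na(datatable[, pollutant])
  datatable[!datanas, pollutant, drop = FALSE]  # drop = FALSE keeps a data.frame
}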
The most important thing to note here is "debugging". You should learn to use at least browser(). This function stops the execution of your function at the very step where it is called, which enables you, at the R console, to inspect elements inside the function and run code to see what's going on. This way you can speed up the development of your code, at least initially, when you usually haven't internalized all the data structures and paradigms yet.
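For example, dropping browser() into the sketch above pauses execution right there:

clean_pollutant <- function(datatable, pollutant) {
  browser()  # pauses here: inspect datatable and pollutant at the console,
             # then type n to step, c to continue, or Q to quit the debugger
  datanas <- is.na(datatable[, pollutant])
  datatable[!datanas, pollutant, drop = FALSE]
}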

R: How do I add an extra function to a package?

I would like to add an idiosyncratically modified function to a package written by someone else, via an R script, i.e. just for the session, not permanently. The specific example: let's say bls_map_county2() is added to the blscrapeR package, where bls_map_county2() is just a copy of the bls_map_county() function with an added ... argument, for the purpose of changing a few of the map-drawing parameters. I have not yet inserted the additional parameters. Running the function as-is, I get the error:
Error in BLS_map_county(map_data = df, fill_rate = "unemployed_rate", :
  could not find function "geom_map"
I assume this is because my function does not point to the blscrapeR namespace. How do I assign my function to the (installed, loaded) blscrapeR namespace, and is there anything else I need to do to let it access whatever machinery from the package it requires?
When I am hacking on a function in a particular package that in turn calls other functions, I often use this form after the definition:
mod_func <- function(args) { <hacked body> }
environment(mod_func) <- environment(old_func)
But I think the function you might really want is assignInNamespace. These methods will allow access to non-exported functions in loaded packages. They will not, however, succeed if the package is not loaded, so you may want a stopifnot() check around require(pkgname).
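Concretely, something like this for the blscrapeR example (a sketch; the actual body edits are elided):

# start from a copy of the original and hack on it
bls_map_county2 <- blscrapeR::bls_map_county
# ... edit body(bls_map_county2) as needed ...

# option 1: borrow the package environment so internal lookups resolve
environment(bls_map_county2) <- environment(blscrapeR::bls_map_county)

# option 2: swap the modified copy into the namespace for this session
stopifnot(require(blscrapeR))
utils::assignInNamespace("bls_map_county", bls_map_county2, ns = "blscrapeR")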
There are two parts to this answer: first a generic answer to your question, and second a specific answer for the particular function you reference, where the problem is something slightly different.
1) generic solution to accessing internal functions when you edit a package function
You should already have access to the package namespace, since you loaded it, so it is only the unexported functions that will give you issues.
I usually just prepend the package name and the ::: operator to the non-exported functions. That is, find every instance of a call to some_internal_function() and replace it with PackageName:::some_internal_function(). If several different internal functions are called within the function you are editing, you may need to do this for each of the offending calls.
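That is (PackageName and some_internal_function are placeholders):

# before: result <- some_internal_function(x)   # fails, not exported
# after, reaching into the package's namespace explicitly:
result <- PackageName:::some_internal_function(x)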
The help page for ::: does contain these warnings:
Beware -- use ':::' at your own risk!
and
It is typically a design mistake to use ::: in your code since the
corresponding object has probably been kept internal for a good
reason. Consider contacting the package maintainer if you feel the
need to access the object for anything but mere inspection.
But for what you are doing, i.e. temporarily hacking another function from the same package for your own use, these warnings should be safe to ignore (at your own risk, of course, as it says in the manual).
2) In the case of blscrapeR::bls_map_county()
The offending line in this case is
ggplot2::ggplot() + geom_map(...
in which the package writers have specified the ggplot2 namespace for ggplot() but forgotten to do so for geom_map(), which is also part of ggplot2 (and not an internal function in blscrapeR).
In this case, just load ggplot2, and you should be good to go.
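That is, simply:

library(ggplot2)  # puts geom_map() on the search path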
You may also consider contacting the package maintainer to inform them of this error.

Can I load an RData file while bypassing loading the namespaces?

Let's say some of my users cannot alter their R environments, but I need them to be able to open RData files. These environment files require a package to be loaded (httpuv, to be exact). We don't care about the package and we don't need its capabilities; we just need to get at the data. Is there a way either to force R to bypass loading namespaces when loading the RData file, or to force it to save without namespace dependencies at the originating end? Thanks.
To reproduce, install Shiny. Create and save some R objects to the server's file system from within a Shiny applet, as an RData file. Copy the file over to a computer that doesn't have Shiny or the httpuv package installed. Try loading the RData file, even if the actual objects you saved are completely ordinary data.frames that have nothing to do with Shiny or httpuv.
I ran strings on the RData file, and the damn thing is full of references to httpuv. The software loads the file and then actively decides not to continue, inside the internal loadFromConn2() function. Therefore there must be a way to make it stop doing so.
Really, @baptiste should get credit for the link in his comment to some general solutions, especially the R CMD INSTALL --fake trick, and I will accept that if he reposts it as an answer. That is why I am not accepting the following answer of my own to the specific problem that caused this in my case, but I am posting it in case it helps someone else.
Some of the objects I was saving were lm fitted objects. Those contain formula/terms objects (at least two each, for some reason; maybe because they've been through stepAIC), and those formulas in turn each have an environment attribute. The environment attribute is .GlobalEnv, which probably does contain copies of package functions someplace. When I dug through the objects inside the fitted models, then the objects inside all the attributes of those objects, then the objects inside the attributes of the attributes of those objects, and set every environment attribute I could find to NULL, I was eventually able to save a fitted model to a file that could be opened from a different R installation without the error about not being able to load a namespace.
I suppose I could also write a function that iterates through the objects within a fitted model and their attributes, removing environments, but that sounds ugly and dangerous. Maybe there is a way to force formulas and fitted models not to retain environments, which would be better. For the time being, instead of saving fitted models, I will save their call attributes after scrubbing any environment attributes I find there. If that doesn't work, I'll deparse them into character strings.
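For reference, the kind of scrubbing described above looks roughly like this (a sketch on a throwaway lm fit; NULLing these attributes can break predict() and friends, so treat it as a hack):

fit <- lm(mpg ~ wt, data = mtcars)  # stand-in for the real fitted model

# strip the captured environments before saving
attr(fit$terms, ".Environment") <- NULL
attr(attr(fit$model, "terms"), ".Environment") <- NULL

saveRDS(fit, "fit.rds")  # or save(fit, file = "fit.RData")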
PS: I used the RDS format and haven't yet tested this with RData, but I suspect the problem was the saving of the evaluation environment in some of the attributes, and had nothing to do with the format in which the objects get saved. I'll post an update if it turns out that this doesn't also work with RData.
PPS: I suspect I'm not the only one here who's hearing about the R CMD INSTALL --fake trick for the first time, and perhaps the word should be spread about this... because to the extent other R users don't know about it, this remains an obvious vector for denial-of-service attacks against R!
I will accept my own answer to get rid of the SO auto-nagger, but will unaccept it and accept @baptiste's if they make that possible by posting it as an answer. Thanks.
