My R workflow is usually such that I have a file open into which I type R commands, and I’d like to execute those commands in a separately opened R shell.
The easiest way of doing this is to say source('the-file.r') inside R. However, this always reloads the whole file which may take considerable time if big amounts of data are processed. It also requires me to specify the filename again.
Ideally, I’d like to source only a specific line (or lines) from the file (I’m working on a terminal where copy&paste doesn’t work).
source doesn’t seem to offer this functionality. Is there another way of achieving this?
Here's another way with just R:
source2 <- function(file, start, end, ...) {
file.lines <- scan(file, what=character(), skip=start-1, nlines=end-start+1, sep='\n')
file.lines.collapsed <- paste(file.lines, collapse='\n')
source(textConnection(file.lines.collapsed), ...)
}
Using the right tool for the job …
As discussed in the comments, the real solution is to use an IDE that allows sourcing specific parts of a file. There are many existing solutions:
For Vim, there’s Nvim-R.
For Emacs, there’s ESS.
And of course there’s the excellent stand-alone RStudio IDE.
As a special point of note, all of the above solutions work both locally and on a server (accessed via an SSH connection, say). R can even be run on an HPC cluster — it can still communicate with the IDEs if set up properly.
… or … not.
If, for whatever reason, none of the solutions above work, here’s a small module[gist] that can do the job. I generally don’t recommend using it, though.1
#' (Re-)source parts of a file
#'
#' \code{rs} loads, parses and executes parts of a file as if entered into the R
#' console directly (but without implicit echoing).
#'
#' #param filename character string of the filename to read from. If missing,
#' use the last-read filename.
#' #param from first line to parse.
#' #param to last line to parse.
#' #return the value of the last evaluated expression in the source file.
#'
#' #details If both \code{from} and \code{to} are missing, the default is to
#' read the whole file.
rs = local({
last_file = NULL
function (filename, from, to = if (missing(from)) -1 else from) {
if (missing(filename)) filename = last_file
stopifnot(! is.null(filename))
stopifnot(is.character(filename))
force(to)
if (missing(from)) from = 1
source_lines = scan(filename, what = character(), sep = '\n',
skip = from - 1, n = to - from + 1,
encoding = 'UTF-8', quiet = TRUE)
result = withVisible(eval.parent(parse(text = source_lines)))
last_file <<- filename # Only save filename once successfully sourced.
if (result$visible) result$value else invisible(result$value)
}
})
Usage example:
# Source the whole file:
rs('some_file.r')
# Re-soure everything (same file):
rs()
# Re-source just the fifth line:
rs(from = 5)
# Re-source lines 5–10
rs(from = 5, to = 10)
# Re-source everything up until line 7:
rs(to = 7)
1 Funny story: I recently found myself on a cluster with a messed-up configuration that made it impossible to install the required software, but desperately needing to debug an R workflow due to a looming deadline. I literally had no choice but to copy and paste lines of R code into the console manually. This is a situation in which the above might come in handy. And yes, that actually happened.
Related
Is it possible to get R to write a plot in bitmap format (e.g. PNG) to standard output? If so, how?
Specifically I would like to run Rscript myscript.R | other_prog_that_reads_a_png_from_stdin. I realise it's possible to create a temporary file and use that, but it's inconvenient as there will potentially be many copies of this pipeline running at the same time, necessitating schemes for choosing unique filenames and removing them afterwards.
I have so far tried setting outf <- file("stdout") and then running either bitmap(file=outf, ...) or png(filename=outf, ...), but both complain ('file' must be a non-empty character string and invalid 'filename' argument, respectively), which is in line with the official documentation for these functions.
Since I was able to persuade R's read.table() function to read from standard input, I'm hoping there's a way. I wasn't able to find anything relevant here on SO by searching for [r] stdout plot, or any of the variations with stdout replaced by "standard output" (with or without double quotes), and/or plot replaced by png.
Thanks!
Unfortunately the {grDevices} (and, by implication, {ggplot2}) seems to fundamentally not support this.
The obvious approach to work around this is: let a graphics device write to a temporary file, and then read that temporary file back into the R session and write it to stdout.
But this fails because, on the one hand, the data cannot be read into a string: character strings in R do not support embedded null characters (if you try you’ll get an error such as “nul character not allowed”). On the other hand, readBin and writeBin fail because writeBin categorically refuses to write to any device that’s hooked up to stdout, which is in text mode (ignoring the fact that, on POSIX system, the two are identical).
This can only be circumvented in incredibly hacky ways, e.g. by opening a binary pipe to a command such as cat:
dev_stdout = function (underlying_device = png, ...) {
filename = tempfile()
underlying_device(filename, ...)
filename
}
dev_stdout_off = function (filename) {
dev.off()
on.exit(unlink(filename))
fake_stdout = pipe('cat', 'wb')
on.exit(close(fake_stdout), add = TRUE)
writeBin(readBin(filename, 'raw', file.info(filename)$size), fake_stdout)
}
To use it:
tmp_dev = dev_stdout()
contour(volcano)
dev_stdout_off(tmp_dev)
On systems where /dev/stdout exists (which are most but not all POSIX systems), the dev_stdout_off function can be simplified slightly by removing the command redirection:
dev_stdout_off = function (filename) {
dev.off()
on.exit(unlink(filename))
fake_stdout = file('/dev/stdout', 'wb')
on.exit(close(fake_stdout), add = TRUE)
writeBin(readBin(filename, 'raw', file.info(filename)$size), fake_stdout)
}
This might not be a complete answer, but it's the best I've got: can you open a connection using the stdout() command? I know that png() will change the output device to a file connection, but that's not what you want, so it might work to simply substitute png by stdout. I don't know enough about standard outputs to test this theory, however.
The help page suggests that this connection might be text-only. In that case, a solution might be to generate a random string to use as a filename, and pass the name of the file through stdout so that the next step in your pipeline knows where to find your file.
My R workflow is usually such that I have a file open into which I type R commands, and I’d like to execute those commands in a separately opened R shell.
The easiest way of doing this is to say source('the-file.r') inside R. However, this always reloads the whole file which may take considerable time if big amounts of data are processed. It also requires me to specify the filename again.
Ideally, I’d like to source only a specific line (or lines) from the file (I’m working on a terminal where copy&paste doesn’t work).
source doesn’t seem to offer this functionality. Is there another way of achieving this?
Here's another way with just R:
source2 <- function(file, start, end, ...) {
file.lines <- scan(file, what=character(), skip=start-1, nlines=end-start+1, sep='\n')
file.lines.collapsed <- paste(file.lines, collapse='\n')
source(textConnection(file.lines.collapsed), ...)
}
Using the right tool for the job …
As discussed in the comments, the real solution is to use an IDE that allows sourcing specific parts of a file. There are many existing solutions:
For Vim, there’s Nvim-R.
For Emacs, there’s ESS.
And of course there’s the excellent stand-alone RStudio IDE.
As a special point of note, all of the above solutions work both locally and on a server (accessed via an SSH connection, say). R can even be run on an HPC cluster — it can still communicate with the IDEs if set up properly.
… or … not.
If, for whatever reason, none of the solutions above work, here’s a small module[gist] that can do the job. I generally don’t recommend using it, though.1
#' (Re-)source parts of a file
#'
#' \code{rs} loads, parses and executes parts of a file as if entered into the R
#' console directly (but without implicit echoing).
#'
#' #param filename character string of the filename to read from. If missing,
#' use the last-read filename.
#' #param from first line to parse.
#' #param to last line to parse.
#' #return the value of the last evaluated expression in the source file.
#'
#' #details If both \code{from} and \code{to} are missing, the default is to
#' read the whole file.
rs = local({
last_file = NULL
function (filename, from, to = if (missing(from)) -1 else from) {
if (missing(filename)) filename = last_file
stopifnot(! is.null(filename))
stopifnot(is.character(filename))
force(to)
if (missing(from)) from = 1
source_lines = scan(filename, what = character(), sep = '\n',
skip = from - 1, n = to - from + 1,
encoding = 'UTF-8', quiet = TRUE)
result = withVisible(eval.parent(parse(text = source_lines)))
last_file <<- filename # Only save filename once successfully sourced.
if (result$visible) result$value else invisible(result$value)
}
})
Usage example:
# Source the whole file:
rs('some_file.r')
# Re-soure everything (same file):
rs()
# Re-source just the fifth line:
rs(from = 5)
# Re-source lines 5–10
rs(from = 5, to = 10)
# Re-source everything up until line 7:
rs(to = 7)
1 Funny story: I recently found myself on a cluster with a messed-up configuration that made it impossible to install the required software, but desperately needing to debug an R workflow due to a looming deadline. I literally had no choice but to copy and paste lines of R code into the console manually. This is a situation in which the above might come in handy. And yes, that actually happened.
I'd like to store a knit()ted document directly in R as an R object, as a character vector.
I know I can do this by knit()ing to a tempfile() and then import the result, like so:
library(knitr)
library(readr)
ex_file <- tempfile(fileext = ".tex")
knitr::knit(text = "foo", output = ex_file)
knitted_obj <- readr::read_file(ex_file)
knitted_obj
returns
# [1] "foo\n"
as intended.
Is there a way to do this without using a tempfile() and by directly "piping" the result to a vector?
Why on earth would I want this, you ask?
*.tex string will be programmatically saved to disc, and rendered to PDF later. Reading rendered *.tex from disc in downstream functions would make code more complicated.
Caching is just a whole lot easier, and moving this cache to a different machine.
I am just really scared of side effects in general and file system shenanigans across machines/OSes in particular. I want to isolate those to as few (print(), save(), plot()) functions as possible.
Does that make me a bad (or just OCD) R developer?
It should be as straightforward as a single line like this:
knitted_obj = knitr::knit(text = "foo")
You may want to read the help page ?knitr::knit again to know what it returns.
You can use con <- textConnection("varname", "w") to create a connection that writes its output to variable varname, and use output=con in the call to knit(). For example:
library(knitr)
con <- textConnection("knitted_obj", "w")
knit(text="foo", output = con)
close(con)
knitted_obj
returns the same as your tempfile approach, except for the newline. Multiple lines will show up as different elements of knitted_obj. I haven't timed it, but text connections have a reputation for being slow, so this isn't necessarily as fast as writing to the file system.
I am trying to run a script that has lots of comments to explain each table, statistical test and graph. I am using RStudio IDE as follows
source(filename, echo=T)
That ensures that the script outputs everything to the console. If I run the following sequence it will send all the output to a txt file and then switch off the output diversion
sink("filenameIwantforoutput.txt")
source(filename, echo=T)
sink()
Alas, I am finding that a lot of my comments are not being outputted. Instead I get
"...but only if we had had an exclusively b .... [TRUNCATED]".
Once before I learned where to preserve the output but that was a few months ago and now I cannot remember. Can you?
Set the max.deparse.length= argument to source. You probably need something greater than the default of 150. For example:
source(filename, echo=TRUE, max.deparse.length=1e3)
And note the last paragraph in the Details section of ?source reads:
If ‘echo’ is true and a deparsed
expression exceeds
‘max.deparse.length’, that many
characters are output followed by ‘
.... [TRUNCATED] ’.
You can make this behavior the default by overriding the source() function in your .Rprofile.
This seems like a reasonable case for overriding a function because in theory the change should only affect the screen output. We could contrive an example where this is not the case, e.g., can capture the screen output and use as a variable like capture.output(source("somefile.R")) but it seems unlikely. Changing a function in a way that the return value is changed will likely come back to bite you or whoever you share your code with (e.g., if you change a default of a function's na.rm argument).
.source_modified <- source
formals(.source_modified)$max.deparse.length <- Inf
# Use 'nms' because we don't want to do it for all because some were already
# unlocked. Thus if we lock them again, then we are changing the previous
# state.
# (https://stackoverflow.com/a/62563023/1376404)
rlang::env_binding_unlock(env = baseenv(), nms = "source")
assign(x = "source", value = .source_modified, envir = baseenv())
rlang::env_binding_lock(env = baseenv(), nms = "source")
rm(.source_modified)
An alternative is to create your own 'alias'-like function. I use the following in my .Rprofile:
s <- source
formals(s)$max.deparse.length <- Inf
formals(s)$echo <- TRUE
I have a program in R. Sometimes when I save history, they do not write into my history file. I lost some histories a few times and this really drive me crazy.
Any recommendation on how to avoid this?
First check your working directory (getwd()). savehistory() saves the history in the current working directory. And to be honest, you better specify the filename, as the default is .History. Say :
savehistory('C:/MyWorkingDir/MySession.RHistory')
which allows you to :
loadhistory('C:/MyWorkingDir/MySession.RHistory')
So the history is not lost, it's just in a place and under a name you weren't aware of. See also ?history.
To clarify : the history is no more than a text file containing all commands of that current session. So it's a nice log of what you've done, but I almost never use it. I construct my "analysis log" myself by using scripts, as hinted in another answer.
#Stedy has provided a workable solution to your immediate question. I would encourage you to learn how to use .R files and a proper text editor, or use an integrated development environment (see this SO page for suggestions). You can then source() in your .R file so that you can consistently replicate your analysis.
For even better replicability, invest the time into learning Sweave. You'll be glad you did.
Check the Rstudio_Desktop/history_database file - it stores every command for any working directory.
See here for more details How to save the whole sequence of commands from a specific day to a file?
Logging your console on a regular basis to **dated* files is handy. The package TeachingDemos has a great function for logging your console session, but it's written as a singleton, which is problematic for automatic logging, since you wouldn't be able to use that function to create teaching demo's if you use it for logging. I re-used that function using a bit of meta-programming to make a copy of that functionality that I include in the .First function in my local .Rprofile, as follows:
.Logger <- (function(){
# copy local versions of the txtStart,
locStart <- TeachingDemos::txtStart
locStop <- TeachingDemos::txtStop
locR2txt <- TeachingDemos:::R2txt
# creat a local environment and link it to each function
.e. <- new.env()
.e.$R2txt.vars <- new.env()
environment(locStart) <- .e.
environment(locStop) <- .e.
environment(locR2txt) <- .e.
# reference the local functions in the calls to `addTaskCallback`
# and `removeTaskCallback`
body(locStart)[[length(body(locStart))-1]] <-
substitute(addTaskCallback(locR2txt, name='locR2txt'))
body(locStop)[[2]] <-
substitute(removeTaskCallback('locR2txt'))
list(start=function(logDir){
op <- options()
locStart(file.path(logDir,format(Sys.time(), "%Y_%m_%d_%H_%M_%S.txt")),
results=FALSE)
options(op)
}, stop = function(){
op <- options()
locStop()
options(op)
})
})()
.First <- function(){
if( interactive() ){
# JUST FOR FUN
cat("\nWelcome",Sys.info()['login'],"at", date(), "\n")
if('fortunes' %in% utils::installed.packages()[,1] )
print(fortunes::fortune())
# CONSTANTS
TIME <- Sys.time()
logDir <- "~/temp/Rconsole.logfiles"
# CREATE THE TEMP DIRECORY IF IT DOES NOT ALREADY EXIST
dir.create(logDir, showWarnings = FALSE)
# DELETE FILES OLDER THAN A WEEK
for(fname in list.files(logDir))
if(difftime(TIME,
file.info(file.path(logDir,fname))$mtime,
units="days") > 7 )
file.remove(file.path(logDir,fname))
# sink() A COPY OF THE TERMINAL OUTPUT TO A DATED LOG FILE
if('TeachingDemos' %in% utils::installed.packages()[,1] )
.Logger$start(logDir)
else
cat('install package `TeachingDemos` to enable console logging')
}
}
.Last <- function(){
.Logger$stop()
}
This causes a copy of the terminal contents to be copied to a dated log file. The nice thing about having dated files is that if you use multiple R sessions the log files won't conflict, unless you start multiple interactive sessions in the same second).