How to automatically load a function in R

I have created some functions in R, and whenever I need one of them I have to re-create it. Please suggest the way and steps so that I can use those functions directly in any R session without recreating them.

While Carl's answer is acceptable, I personally think that this is exactly the situation where you should package your functions and simply call them as a library.
There are very good reasons to do this:
Documentation (with emphasis!)
Tests
Easy loading (library(mypackage))
Easy to share and portable across systems
Easy to use within reporting (Rmd/knitr)
Reduces potential for duplication
Learning the R package system will add a strong tool to your toolbox, and the other benefits of organizing your code appropriately will become apparent, as the minimal sketch below illustrates.
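As a hedged sketch of that workflow, assuming the usethis and devtools packages are installed (mypackage, my_fun, and the path are placeholder names, not anything from the question):
# scaffold a new package -- "mypackage" is a placeholder name
usethis::create_package("~/Rdev/mypackage")
# put each function in a file under R/, e.g. R/my_fun.R containing:
#   my_fun <- function(x) x + 1
devtools::document("~/Rdev/mypackage")   # generate help files from roxygen comments
devtools::install("~/Rdev/mypackage")    # install into your local library
library(mypackage)                       # my_fun() is now available in any session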

I have a series of functions that I need across all sessions. The trick is to add them to your .First function so that they are sourced into every session globally.
A helper function to find your Rprofile file:
find.first <- function(edit = FALSE, show_lib = TRUE){
  # candidate profile locations, in the order R consults them at startup
  candidates <- c(Sys.getenv("R_PROFILE"),
                  file.path(Sys.getenv("R_HOME"), "etc", "Rprofile.site"),
                  Sys.getenv("R_PROFILE_USER"),
                  file.path(getwd(), ".Rprofile"))
  first_hit <- Filter(file.exists, candidates)
  if (show_lib && !edit) {
    return(first_hit)
  } else {
    file.edit(first_hit)
  }
}
Say the scripts you use everywhere live in '/mystuff/R':
# Pop open the first Rprofile file.
find.first(edit = TRUE)
You will see something like this:
##Emacs please make this -*- R -*-
## empty Rprofile.site for R on Debian
##
## Copyright (C) 2008 Dirk Eddelbuettel and GPL'ed
##
## see help(Startup) for documentation on ~/.Rprofile and Rprofile.site
# ## Example of .Rprofile
# options(width=65, digits=5)
# options(show.signif.stars=FALSE)
# setHook(packageEvent("grDevices", "onLoad"),
# function(...) grDevices::ps.options(horizontal=FALSE))
# set.seed(1234)
#.First <- function(){}
#
#
Uncomment and edit the .First function to something like:
.First <- function(){
  # source every .R file under /mystuff/R, skipping any that fail
  all_my_r <- list.files('/mystuff/R', full.names = TRUE,
                         recursive = TRUE, pattern = "\\.R$")
  lapply(all_my_r, function(i){
    tryCatch(source(i), error = function(e) NULL)
  })
}
Save the file. Then restart the session.
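To confirm that the sourcing worked after the restart, a quick check (my_helper is a placeholder for one of your function names):
exists("my_helper")  # TRUE once the .First hook has sourced your scripts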

Related

Execute part of a script from R console or terminal [duplicate]

My R workflow is usually such that I have a file open into which I type R commands, and I’d like to execute those commands in a separately opened R shell.
The easiest way of doing this is to say source('the-file.r') inside R. However, this always reloads the whole file which may take considerable time if big amounts of data are processed. It also requires me to specify the filename again.
Ideally, I’d like to source only a specific line (or lines) from the file (I’m working on a terminal where copy&paste doesn’t work).
source doesn’t seem to offer this functionality. Is there another way of achieving this?
Here's another way with just R:
source2 <- function(file, start, end, ...) {
  # read only lines start..end, then source them as a single script
  file.lines <- scan(file, what = character(), skip = start - 1,
                     nlines = end - start + 1, sep = '\n')
  file.lines.collapsed <- paste(file.lines, collapse = '\n')
  source(textConnection(file.lines.collapsed), ...)
}
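A hypothetical call, assuming the commands you want to re-run sit on lines 10 through 20 of the-file.r:
source2('the-file.r', 10, 20)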
Using the right tool for the job …
As discussed in the comments, the real solution is to use an IDE that allows sourcing specific parts of a file. There are many existing solutions:
For Vim, there’s Nvim-R.
For Emacs, there’s ESS.
And of course there’s the excellent stand-alone RStudio IDE.
As a special point of note, all of the above solutions work both locally and on a server (accessed via an SSH connection, say). R can even be run on an HPC cluster — it can still communicate with the IDEs if set up properly.
… or … not.
If, for whatever reason, none of the solutions above work, here's a small module that can do the job. I generally don't recommend using it, though.1
#' (Re-)source parts of a file
#'
#' \code{rs} loads, parses and executes parts of a file as if entered into the R
#' console directly (but without implicit echoing).
#'
#' @param filename character string of the filename to read from. If missing,
#'   use the last-read filename.
#' @param from first line to parse.
#' @param to last line to parse.
#' @return the value of the last evaluated expression in the source file.
#'
#' @details If both \code{from} and \code{to} are missing, the default is to
#'   read the whole file.
rs = local({
    last_file = NULL

    function (filename, from, to = if (missing(from)) -1 else from) {
        if (missing(filename)) filename = last_file
        stopifnot(! is.null(filename))
        stopifnot(is.character(filename))
        force(to)
        if (missing(from)) from = 1
        source_lines = scan(filename, what = character(), sep = '\n',
                            skip = from - 1, n = to - from + 1,
                            encoding = 'UTF-8', quiet = TRUE)
        result = withVisible(eval.parent(parse(text = source_lines)))
        last_file <<- filename # Only save the filename once successfully sourced.
        if (result$visible) result$value else invisible(result$value)
    }
})
Usage example:
# Source the whole file:
rs('some_file.r')
# Re-source everything (same file):
rs()
# Re-source just the fifth line:
rs(from = 5)
# Re-source lines 5–10:
rs(from = 5, to = 10)
# Re-source everything up until line 7:
rs(to = 7)
1 Funny story: I recently found myself on a cluster with a messed-up configuration that made it impossible to install the required software, but desperately needing to debug an R workflow due to a looming deadline. I literally had no choice but to copy and paste lines of R code into the console manually. This is a situation in which the above might come in handy. And yes, that actually happened.

How to use an R script from GitHub?

I am trying to use an R script hosted on GitHub, plugin-draw.R. How should I use this plugin?
You can simply use source_url from the devtools package:
library(devtools)
source_url("https://raw.github.com/tonybreyal/Blog-Reference-Functions/master/R/bingSearchXScraper/bingSearchXScraper.R")
Based on @Matifou's reply, but using the "new" method of appending ?raw=TRUE at the end of your URL:
devtools::source_url("https://github.com/tonybreyal/Blog-Reference-Functions/blob/master/R/bingSearchXScraper/bingSearchXScraper.R?raw=TRUE")
You can use the solution offered on R-Bloggers:
source_github <- function(u) {
  # load the RCurl package
  require(RCurl)
  # read the script from the website
  script <- getURL(u, ssl.verifypeer = FALSE)
  # parse the lines and evaluate them (inside this function's environment)
  eval(parse(text = script))
}
source_github("https://raw.github.com/tonybreyal/Blog-Reference-Functions/master/R/bingSearchXScraper/bingSearchXScraper.R")
For the script to be evaluated in the global environment (I'm guessing that you will prefer this solution), you can use:
source_https <- function(u, unlink.tmp.certs = FALSE) {
  # load the RCurl package
  require(RCurl)
  # read the script from the website, using a security certificate
  if (!file.exists("cacert.pem"))
    download.file(url = "http://curl.haxx.se/ca/cacert.pem", destfile = "cacert.pem")
  script <- getURL(u, followlocation = TRUE, cainfo = "cacert.pem")
  if (unlink.tmp.certs) unlink("cacert.pem")
  # parse the lines and evaluate them in the global environment
  eval(parse(text = script), envir = .GlobalEnv)
}
source_https("https://raw.github.com/tonybreyal/Blog-Reference-Functions/master/R/bingSearchXScraper/bingSearchXScraper.R")
source_https("https://raw.github.com/tonybreyal/Blog-Reference-Functions/master/R/htmlToText/htmlToText.R", unlink.tmp.certs = TRUE)
As mentioned in the original article by Tony Breyal, this discussion on SO should also be credited, as it is relevant to the question discussed.
If the file lives on GitHub where you can click Raw (next to Blame), you can actually just use ordinary base::source. Go to the R script of your choice and find the Raw button. The link will then contain raw.githubusercontent.com, and the page shows nothing but the R script itself. Then, for this example,
source(paste0(
  "https://raw.githubusercontent.com/betanalpha/knitr_case_studies/master/",
  "stan_intro/stan_utility.R"
))
(paste0 was used just to fit the URL into a narrower screen.)

"File is still in use": How to free all resources / force deletion of files?

Question: How do I free all file handles / connections R is using? In Python, one could have a look at which objects are still alive. Is there anything comparable in R?
Within a function, I create a directory with some files. At the end of the function, it should be deleted again. I am facing the problem that I am unable to delete the files, presumably because a file handle is still open. The example uses the MetaSKAT package, but I'm interested in a general solution. The example data can be found here: https://groups.google.com/group/skat_slee/attach/28a76339619d8358/Datasets.zip?part=4&authuser=0
# Code author: Seunggeun (Shawn) Lee
setwd('./Datasets')
foo <- function(dir.name) {
  ###### Preparation stuff ################################################
  if (!require(MetaSKAT)) { install.packages('MetaSKAT'); require(MetaSKAT) }
  dir.create(file.path('.', dir.name), showWarnings = FALSE)
  dir.path <- paste("./", dir.name, sep = "")
  file.copy(c("01.fam", "01.bed", "01.bim", "01_3.SetID"), dir.path)
  setwd(dir.path)
  FAM <- read.table("01.fam", header = FALSE)
  y <- FAM[, 6]
  N.Sample <- length(y)
  x1 <- rnorm(N.Sample)
  x2 <- rbinom(N.Sample, 1, 0.5)
  obj <- SKAT_Null_Model(y ~ cbind(x1, x2))
  re  <- Generate_Meta_Files(obj, "01.bed", "01.bim", "01_3.SetID",
                             "01.MSSD", "01.MInfo", N.Sample)
  ###### Problem ##########################################################
  # problem: cannot delete (note that file.remove() has no 'force' argument)
  print(file.remove(list.files()))
  # curiously, sometimes there is 1, sometimes 2 FALSE...
  ###### My different tries to solve it ###################################
  rm(re)
  closeAllConnections()
  sink.number()  # shows 0
  rm(list = ls())
  gc()
  ###### Problem is still there ###########################################
  print(file.remove(list.files()))
  setwd('..')
  # print(unlink(dir.path, recursive = TRUE))  # I finally want to delete the directory
}
debug(foo)
foo("temp2")
I am using RStudio. Even if I try to delete the files manually in Windows while R is still open, it tells me that the file is being used by a program. I can only delete them after I close R.
So how can I force R to free these files? I will try to solve the problem at the root and look at the source code of Generate_Meta_Files(), but I thought there must be a global function in R that forces it to free everything. (Note: I am well aware that it does not make sense to create the files and delete them directly afterwards; it's just an example.)
Edit: After a hint, I tried it under Linux. It turns out that although it reports a problem deleting one of the six files, everything is in fact properly deleted, so I guess this is a Windows-specific problem. Any hints as to what causes it?
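As for inspecting what R still holds open (the closest analogue to Python's live-object inspection), base R can at least enumerate its connections. A minimal sketch, with the caveat that handles opened by a package's compiled code are not R connections and will not show up here:
showConnections(all = TRUE)  # list connections, including stdin/stdout/stderr
closeAllConnections()        # close every user-opened connection
# a handle held by compiled code may only be released by unloading the
# package (an untested suggestion, not confirmed for MetaSKAT):
# detach("package:MetaSKAT", unload = TRUE)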

R sometimes does not save my history

I have a program in R. Sometimes when I save my history, it is not written to my history file. I have lost histories a few times and this really drives me crazy.
Any recommendation on how to avoid this?
First check your working directory (getwd()). savehistory() saves the history in the current working directory. And to be honest, you had better specify the filename, as the default is .Rhistory. Say:
savehistory('C:/MyWorkingDir/MySession.RHistory')
which allows you to :
loadhistory('C:/MyWorkingDir/MySession.RHistory')
So the history is not lost; it's just in a place, and under a name, you weren't aware of. See also ?history.
To clarify: the history is no more than a text file containing all commands of the current session. So it's a nice log of what you've done, but I almost never use it. I construct my "analysis log" myself by using scripts, as hinted in another answer.
@Stedy has provided a workable solution to your immediate question. I would encourage you to learn how to use .R files with a proper text editor, or use an integrated development environment (see this SO page for suggestions). You can then source() in your .R file so that you can consistently replicate your analysis.
For even better replicability, invest the time into learning Sweave. You'll be glad you did.
Check the Rstudio_Desktop/history_database file - it stores every command for any working directory.
See here for more details: How to save the whole sequence of commands from a specific day to a file?
Logging your console to dated files on a regular basis is handy. The TeachingDemos package has a great function for logging your console session, but it's written as a singleton, which is problematic for automatic logging: you wouldn't be able to use that function to create teaching demos if you were using it for logging. I reused that function with a bit of metaprogramming to make a copy of its functionality, which I include in the .First function in my local .Rprofile, as follows:
.Logger <- (function(){
    # copy local versions of txtStart, txtStop and the internal R2txt
    locStart <- TeachingDemos::txtStart
    locStop  <- TeachingDemos::txtStop
    locR2txt <- TeachingDemos:::R2txt
    # create a local environment and link it to each function
    .e. <- new.env()
    .e.$R2txt.vars <- new.env()
    environment(locStart) <- .e.
    environment(locStop)  <- .e.
    environment(locR2txt) <- .e.
    # reference the local functions in the calls to `addTaskCallback`
    # and `removeTaskCallback`
    body(locStart)[[length(body(locStart))-1]] <-
        substitute(addTaskCallback(locR2txt, name='locR2txt'))
    body(locStop)[[2]] <-
        substitute(removeTaskCallback('locR2txt'))
    list(start=function(logDir){
        op <- options()
        locStart(file.path(logDir, format(Sys.time(), "%Y_%m_%d_%H_%M_%S.txt")),
                 results=FALSE)
        options(op)
    }, stop = function(){
        op <- options()
        locStop()
        options(op)
    })
})()
.First <- function(){
    if( interactive() ){
        # JUST FOR FUN
        cat("\nWelcome", Sys.info()['login'], "at", date(), "\n")
        if('fortunes' %in% utils::installed.packages()[,1] )
            print(fortunes::fortune())
        # CONSTANTS
        TIME <- Sys.time()
        logDir <- "~/temp/Rconsole.logfiles"
        # CREATE THE TEMP DIRECTORY IF IT DOES NOT ALREADY EXIST
        dir.create(logDir, showWarnings = FALSE)
        # DELETE FILES OLDER THAN A WEEK
        for(fname in list.files(logDir))
            if(difftime(TIME,
                        file.info(file.path(logDir, fname))$mtime,
                        units="days") > 7 )
                file.remove(file.path(logDir, fname))
        # sink() A COPY OF THE TERMINAL OUTPUT TO A DATED LOG FILE
        if('TeachingDemos' %in% utils::installed.packages()[,1] )
            .Logger$start(logDir)
        else
            cat('install package `TeachingDemos` to enable console logging')
    }
}
.Last <- function(){
    .Logger$stop()
}
This causes a copy of the terminal contents to be written to a dated log file. The nice thing about dated files is that if you use multiple R sessions, the log files won't conflict (unless you start multiple interactive sessions in the same second).

How to obtain a list of directories within a directory, like list.files(), but instead "list.dirs()"

I am able to use list.files() to obtain a list of files in a given directory, but if I want to get a list of directories, how would I do this? Is it somehow right in front of me as an option within list.files()?
Also, I'm using Windows, so if the answer is to shell out to some Linux/unix command, that won't work for me.
.NET for example has a Directory.GetFiles() method, and a separate Directory.GetDirectories()
method, so I figured R would have an analogous pair.
Update: A list.dirs function was added to the base package in revision 54353, which was included in the R 2.13.0 release in April 2011.
list.dirs(path = ".", full.names = TRUE, recursive = TRUE)
So my function below was only useful for a few months. :)
I couldn't find a base R function to do this, but it would be pretty easy to write your own using:
dir()[file.info(dir())$isdir]
Update: here's a function (now corrected for Timothy Jones' comment):
list.dirs <- function(path = ".", pattern = NULL, all.dirs = FALSE,
                      full.names = FALSE, ignore.case = FALSE) {
  # use full.names = TRUE internally so that file.info() gets valid paths
  all <- list.files(path, pattern, all.dirs,
                    full.names = TRUE, recursive = FALSE, ignore.case)
  dirs <- all[file.info(all)$isdir]
  # determine whether to return full names or just the directory names
  if (isTRUE(full.names))
    return(dirs)
  else
    return(basename(dirs))
}
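A hypothetical call, listing the visible subdirectories of the working directory:
list.dirs()                   # directory names only
list.dirs(full.names = TRUE)  # full paths instead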
base R now includes a list.dirs function, so home-brewed variants are no longer necessary.
For example:
list.dirs('.', recursive=FALSE)
Just to update this thread:
I see that in the newer versions of R (currently I'm using 2.15.1), there is now a list.dirs function included in the base install:
list.dirs implicitly has all.files = TRUE, and if recursive = TRUE,
the answer includes path itself (provided it is a readable directory).
list.dirs <- function(...) {
  x <- dir(...)
  x[file_test("-d", x)]
}
might be of use?
How might we do this recursively? (The recursive argument of dir breaks these functions, because it never returns directory names, just the files within each directory.)
What about something like this? Give it a try:
dir('.')[file.info(dir('.', full.names = TRUE))$isdir]
You mention that you don't want to shell out to a Linux/UNIX command, but I assume it's OK to shell out to a Windows command. In that case this would do it:
shell("dir/ad/b", intern = TRUE)
and this would do it recursively:
shell("dir/ad/b/s", intern = TRUE)
Normally I would prefer the platform-independent solutions others have given here, but particularly for interactive use, where you are just concerned with getting the answer as simply and directly as possible, this may be less work.
I had this problem a while back and used this recursive code to find all directories. Perhaps this can be of use?
list.dirs <- function(parent = ".")  # recursively find directories
{
  if (length(parent) > 1)            # work on the first path, then the rest
    return(c(list.dirs(parent[1]), list.dirs(parent[-1])))
  else {                             # length(parent) == 1
    if (!is.dir(parent))
      return(NULL)                   # not a directory, don't return anything
    child <- list.files(parent, full = TRUE)
    if (!any(is.dir(child)))
      return(parent)                 # no directories below, return parent
    else
      return(list.dirs(child))       # recurse
  }
}
is.dir <- function(x)                # helper function
{
  ret <- file.info(x)$isdir
  ret[is.na(ret)] <- FALSE
  ret
}
