I wrote a function that helps execute pipe chains step by step.
To use it, the user has to copy the instructions to the clipboard, execute the function, and then move to the console to proceed.
I would like to build an addin that would allow me to select the instructions and run the function with Ctrl + P, without the awkward steps.
Ideally, the addin would:
capture the selection
run the function
move the cursor to the console
be triggered by Ctrl + P
I believe it's extremely similar to what the reprex addin is doing but I don't know where to start as I'm 100% new to addins.
I looked into rstudioapi::getActiveDocumentContext() but there was nothing there of interest to me.
How can I make this work?
The function
debug_pipe <- function(.expr){
.pchain <-
if (missing(.expr)) readClipboard() # windows only , else try clipr::read_clip()
else deparse(substitute(.expr))
.lhs <- if (grepl("^\\s*[[:alnum:]_.]*\\s*<-",.pchain[1])) {
sub("^\\s*([[:alnum:]_.]*)\\s*<-.*","\\1",.pchain[1])
} else NA
.pchain <- sub("[^%]*<-\\s*","",.pchain) # remove lhs of assignment if exists
.pchain <- paste(.pchain,collapse = " ") # collapse
.pchain <- gsub("\\s+"," ",.pchain) # multiple spaces to single
.pchain <- strsplit(.pchain,"\\s*%>%\\s*")[[1]] # split by pipe
.pchain <- as.list(.pchain)
for (i in rev(seq_along(.pchain))) {
# function to count matches
.f <- function(x) sum(gregexpr(x,.pchain[i],fixed = TRUE)[[1]] != -1)
# check if unbalanced operators
.balanced <-
all(c(.f("{"),.f("("),.f("[")) == c(.f("}"),.f(")"),.f("]"))) &
!.f("'") %% 2 &
!.f('"') %% 2
if (!.balanced) {
# if unbalanced, combine with previous
.pchain[[i - 1]] <- paste(.pchain[[i - 1]],"%>%", .pchain[[i]])
.pchain[[i]] <- NULL
}
}
.calls <- Reduce( # build calls to display
function(x,y) paste0(x," %>%\n ",y),
.pchain, accumulate = TRUE)
.xinit <- eval(parse(text = .pchain[1]))
.values <- Reduce(function(x,y){ # compute all values
if (inherits(x,"try-error")) NULL
else try(eval(parse(text = paste("x %>%", y))),silent = TRUE)},
.pchain[-1], .xinit, accumulate = TRUE)
message("press enter to show, 's' to skip, 'q' to quit, lhs can be accessed with `.`")
for (.i in (seq_along(.pchain))) {
cat("\n",.calls[.i])
.rdl_ <- readline()
. <- .values[[.i]]
# while environment is explored
while (!.rdl_ %in% c("q","s","")) {
# if not an assignment, should be printed
if (!grepl("^\\s*[[:alnum:]_.]*\\s*<-",.rdl_)) .rdl_ <- paste0("print(",.rdl_,")")
# wrap into `try` to safely fail
try(eval(parse(text = .rdl_)))
.rdl_ <- readline()
}
if (.rdl_ == "q") return(invisible(NULL))
if (.rdl_ != "s") {
if (inherits(.values[[.i]],"try-error")) {
# a trick to be able to use stop without showing that
# debug_pipe failed in the output
opt <- options(show.error.messages = FALSE)
on.exit(options(opt))
message(.values[[.i]])
stop()
} else
{
print(.)
}
}
}
# assign the final value itself (tail() on a list would return a one-element list)
if (!is.na(.lhs)) assign(.lhs,.values[[length(.values)]],envir = parent.frame())
invisible(NULL)
}
Example code:
library(dplyr)
# copy following 4 lines to clipboard, no need to execute
test <- iris %>%
slice(1:2) %>%
select(1:3) %>%
mutate(x=3)
debug_pipe()
# or wrap expression
debug_pipe(
test <- iris %>%
slice(1:2) %>%
select(1:3) %>%
mutate(x=3)
)
Here are the steps I came up with:
Two good resources were:
The reprex addin's code from Jenny Bryan
This RStudio webinar
1. create a new package
New Project/R package/Name package as pipedebug
2. build R file
Put the function's code into a .R file in the R folder. We rename the function pdbg as I realised that magrittr already has a function called debug_pipe that does something different (it executes browser and returns input).
We must add a second function, without parameters, that the addin will trigger. We can name it however we want:
pdbg_addin <- function(){
  selection <- rstudioapi::primary_selection(
    rstudioapi::getSourceEditorContext())[["text"]]
  rstudioapi::sendToConsole("", execute = FALSE)
  eval(parse(text = paste0("pdbg(", selection, ")")))
}
The first line captures the selection, adapted from reprex's code.
The second line is sending an empty string to the console and not executing it, that's all I found to move the cursor, but there might be a better way.
The third line is running the main function with the selection as an argument.
3. Create dcf file
The next step is to create the file inst/rstudio/addins.dcf with the following content:
Name: debug pipe
Description: debug pipes step by step
Binding: pdbg_addin
Interactive: false
usethis::use_addin("pdbg_addin") will create the file, fill it with a template and open it so you can edit it.
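As a quick sanity check, the DCF fields above can be read back with base R's read.dcf() before building the package. This is only a minimal sketch: it writes the same four fields to a temporary file (the temporary path is for illustration; the real file lives at inst/rstudio/addins.dcf) and confirms that the Binding field names the addin function.

```r
# Write the addin registration fields to a temporary file (illustrative path only)
dcf_lines <- c(
  "Name: debug pipe",
  "Description: debug pipes step by step",
  "Binding: pdbg_addin",
  "Interactive: false"
)
dcf_path <- tempfile(fileext = ".dcf")
writeLines(dcf_lines, dcf_path)

# read.dcf() returns a character matrix with one column per field
addin <- read.dcf(dcf_path)
binding <- unname(addin[1, "Binding"])
print(binding)
```

If Binding does not match the name of an exported function in the package, the addin has nothing to call, so this field is the one worth double-checking.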
4. build package
Ctrl+Shift+B
5. Add shortcut
Tools / addins / browse addins / keyboard shortcuts / debug pipe / Ctrl+P
6. Test it
Copy in text editor / select / Ctrl+P
test <- iris %>%
slice(1:2) %>%
select(1:3) %>%
mutate(x=3)
Find a rough version here:
devtools::install_github("moodymudskipper/pipedebug")
?pdbg
Similar efforts:
@Alistaire did this and advertised this other effort on his page.
This is probably not correct terminology, but hopefully I can get my point across.
I frequently end up doing something like:
myVar = 1
f <- function(myvar) { return(myVar); }
# f(2) = 1 now
R happily uses the variable outside of the function's scope, which leaves me scratching my head, wondering how I could possibly be getting the results I am.
Is there any option which says "force me to only use variables which have previously been assigned values in this function's scope"? Perl's use strict does something like this, for example. But I don't know that R has an equivalent of my.
EDIT: Thank you, I am aware that I capitalized them differently. Indeed, the example was created specifically to illustrate this problem!
I want to know if there is a way that R can automatically warn me when I do this.
EDIT 2: Also, if Rkward or another IDE offers this functionality I'd like to know that too.
As far as I know, R does not provide a "use strict" mode. So you are left with two options:
1 - Ensure all your "strict" functions don't have globalenv as environment. You could define a nice wrapper function for this, but the simplest is to call local:
# Use "local" directly to control the function environment
f <- local( function(myvar) { return(myVar); }, as.environment(2))
f(3) # Error in f(3) : object 'myVar' not found
# Create a wrapper function "strict" to do it for you...
strict <- function(f, pos=2) eval(substitute(f), as.environment(pos))
f <- strict( function(myvar) { return(myVar); } )
f(3) # Error in f(3) : object 'myVar' not found
2 - Do a code analysis that warns you of "bad" usage.
Here's a function checkStrict that hopefully does what you want. It uses the excellent codetools package.
# Checks a function for use of global variables
# Returns TRUE if ok, FALSE if globals were found.
checkStrict <- function(f, silent=FALSE) {
vars <- codetools::findGlobals(f)
found <- !vapply(vars, exists, logical(1), envir=as.environment(2))
if (!silent && any(found)) {
warning("global variables used: ", paste(names(found)[found], collapse=', '))
return(invisible(FALSE))
}
!any(found)
}
And trying it out:
> myVar = 1
> f <- function(myvar) { return(myVar); }
> checkStrict(f)
Warning message:
In checkStrict(f) : global variables used: myVar
checkUsage in the codetools package is helpful, but doesn't get you all the way there.
In a clean session where myVar is not defined,
f <- function(myvar) { return(myVar); }
codetools::checkUsage(f)
gives
<anonymous>: no visible binding for global variable ‘myVar’
but once you define myVar, checkUsage is happy.
See ?codetools in the codetools package: it's possible that something there is useful:
> findGlobals(f)
[1] "{" "myVar" "return"
> findLocals(f)
character(0)
You need to fix the typo: myvar != myVar. Then it will all work...
Scope resolution is 'from the inside out' starting from the current one, then the enclosing and so on.
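A minimal sketch of that inside-out lookup, using made-up names (outer_fn, inner_fn, no_local) purely for illustration: a name that is not defined in the current function is searched for in the enclosing function, and only then in the environment where that function was defined.

```r
x <- "global"

outer_fn <- function() {
  x <- "enclosing"
  inner_fn <- function() {
    x  # not defined here, so R looks in outer_fn's environment first
  }
  inner_fn()
}
result_inner <- outer_fn()

# No local or enclosing-function definition: the lookup continues outward
no_local <- function() x
result_global <- no_local()

print(result_inner)
print(result_global)
```

This is the same mechanism that lets f(2) in the question silently pick up the outer myVar.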
Edit: Now that you have clarified your question, look at the codetools package (a recommended package that ships with standard R installations):
R> library(codetools)
R> f <- function(myVAR) { return(myvar) }
R> checkUsage(f)
<anonymous>: no visible binding for global variable 'myvar'
R>
Using get(x, inherits=FALSE) will force local scope.
myVar = 1
f2 <- function(myvar) get("myVar", inherits=FALSE)
f3 <- function(myvar){
myVar <- myvar
get("myVar", inherits=FALSE)
}
output:
> f2(8)
Error in get("myVar", inherits = FALSE) : object 'myVar' not found
> f3(8)
[1] 8
You are of course doing it wrong. Don't expect static code checking tools to find all your mistakes. Check your code with tests. And more tests. Any decent test written to run in a clean environment will spot this kind of mistake. Write tests for your functions, and use them. Look at the glory that is the testthat package on CRAN.
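As a rough illustration of what "a clean environment" can mean in such a test, here is a base-R sketch that does not rely on testthat. The helper name runs_in_clean_env is hypothetical; it re-homes a copy of the function so that only base R is visible, then reports whether the call still succeeds.

```r
myVar <- 1
f <- function(myvar) { return(myVar) }   # typo: uses myVar instead of myvar
g <- function(myvar) { return(myvar) }   # correct version

# Hypothetical helper: evaluate a copy of `fn` where only base R is visible
runs_in_clean_env <- function(fn, ...) {
  environment(fn) <- new.env(parent = baseenv())
  !inherits(try(fn(...), silent = TRUE), "try-error")
}

f_ok <- runs_in_clean_env(f, 2)  # the stray global is no longer reachable
g_ok <- runs_in_clean_env(g, 2)
print(c(f = f_ok, g = g_ok))
```

Functions that legitimately call other packages would fail this check too, so it is only a sketch of the idea, not a general-purpose test.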
There is a new package modules on CRAN which addresses this common issue (see the vignette here). With modules, the function raises an error instead of silently returning the wrong result.
# without modules
myVar <- 1
f <- function(myvar) { return(myVar) }
f(2)
[1] 1
# with modules
library(modules)
m <- module({
f <- function(myvar) { return(myVar) }
})
m$f(2)
Error in m$f(2) : object 'myVar' not found
This is the first time I use it. It seems to be straightforward so I might include it in my regular workflow to prevent time consuming mishaps.
You can dynamically change the environment tree like this:
a <- 1
f <- function(){
b <- 1
print(b)
print(a)
}
environment(f) <- new.env(parent = baseenv())
f()
Inside f, b can be found, while a cannot.
But probably it will do more harm than good.
You can test to see if the variable is defined locally:
myVar = 1
f <- function(myvar) {
if( exists('myVar', environment(), inherits = FALSE) ) return( myVar) else cat("myVar was not found locally\n")
}
> f(2)
myVar was not found locally
But I find it very artificial if the only thing you are trying to do is to protect yourself from spelling mistakes.
The exists function searches for the variable name in the particular environment. inherits = FALSE tells it not to look into the enclosing frames.
environment(fun) = parent.env(environment(fun))
will remove the 'workspace' from your search path, leave everything else. This is probably closest to what you want.
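A small sketch of the effect, assuming the default set of attached packages (in particular utils) and writing the target explicitly as parent.env(globalenv()): the workspace variable becomes invisible, while functions from attached packages are still found.

```r
myVar <- 1
f <- function(myvar) myVar       # relies on a workspace variable
g <- function(v) head(v, 1)      # head() comes from the attached utils package

# Skip the workspace: start the lookup at the first attached package instead
environment(f) <- parent.env(globalenv())
environment(g) <- parent.env(globalenv())

f_result <- tryCatch(f(2), error = function(e) "myVar not found")
g_result <- g(c(10, 20, 30))

print(f_result)
print(g_result)
```

Note that the re-homed functions also lose sight of other functions defined in the workspace, which is the catch one of the other answers describes.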
@Tommy gave a very good answer and I used it to create 3 functions that I think are more convenient in practice.
strict
To make a function strict, you just have to call
strict(f,x,y)
instead of
f(x,y)
example:
my_fun1 <- function(a,b,c){a+b+c}
my_fun2 <- function(a,b,c){a+B+c}
B <- 1
my_fun1(1,2,3) # 6
strict(my_fun1,1,2,3) # 6
my_fun2(1,2,3) # 5
strict(my_fun2,1,2,3) # Error in (function (a, b, c) : object 'B' not found
checkStrict1
To get a diagnosis, execute checkStrict1(f) with optional Boolean parameters to show more or less.
checkStrict1("my_fun1") # nothing
checkStrict1("my_fun2") # my_fun2 : B
A more complicated case:
A <- 1 # unambiguous variable defined OUTSIDE AND INSIDE my_fun3
# B unambiguous variable defined only INSIDE my_fun3
C <- 1 # defined OUTSIDE AND INSIDE with ambiguous name (C is also a base function)
D <- 1 # defined only OUTSIDE my_fun3 (D is also a base function)
E <- 1 # unambiguous variable defined only OUTSIDE my_fun3
# G unambiguous variable defined only INSIDE my_fun3
# H is undeclared and doesn't exist at all
# I is undeclared (though I is also base function)
# v defined only INSIDE (v is also a base function)
my_fun3 <- function(a,b,c){
A<-1;B<-1;C<-1;G<-1
a+b+A+B+C+D+E+G+H+I+v+ my_fun1(1,2,3)
}
checkStrict1("my_fun3",show_global_functions = TRUE ,show_ambiguous = TRUE , show_inexistent = TRUE)
# my_fun3 : E
# my_fun3 Ambiguous : D
# my_fun3 Inexistent : H
# my_fun3 Global functions : my_fun1
I chose to show only inexistent by default out of the 3 optional additions. You can change it easily in the function definition.
checkStrictAll
Get a diagnostic of all your potentially problematic functions, with the same parameters.
checkStrictAll()
my_fun2 : B
my_fun3 : E
my_fun3 Inexistent : H
sources
strict <- function(f1,...){
function_text <- deparse(f1)
function_text <- paste(function_text[1],function_text[2],paste(function_text[c(-1,-2,-length(function_text))],collapse=";"),"}",collapse="")
strict0 <- function(f1, pos=2) eval(substitute(f1), as.environment(pos))
f1 <- eval(parse(text=paste0("strict0(",function_text,")")))
do.call(f1,list(...))
}
checkStrict1 <- function(f_str,exceptions = NULL,n_char = nchar(f_str),show_global_functions = FALSE,show_ambiguous = FALSE, show_inexistent = TRUE){
functions <- c(lsf.str(envir=globalenv()))
f <- try(eval(parse(text=f_str)),silent=TRUE)
if(inherits(f, "try-error")) {return(NULL)}
vars <- codetools::findGlobals(f)
vars <- vars[!vars %in% exceptions]
global_functions <- vars %in% functions
in_global_env <- vapply(vars, exists, logical(1), envir=globalenv())
in_local_env <- vapply(vars, exists, logical(1), envir=as.environment(2))
in_global_env_but_not_function <- rep(FALSE,length(vars))
for (my_mode in c("logical", "integer", "double", "complex", "character", "raw","list", "NULL")){
in_global_env_but_not_function <- in_global_env_but_not_function | vapply(vars, exists, logical(1), envir=globalenv(),mode = my_mode)
}
found <- in_global_env_but_not_function & !in_local_env
ambiguous <- in_global_env_but_not_function & in_local_env
inexistent <- (!in_local_env) & (!in_global_env)
if(typeof(f)=="closure"){
if(any(found)) {cat(paste(f_str,paste(rep(" ",n_char-nchar(f_str)),collapse=""),":", paste(names(found)[found], collapse=', '),"\n"))}
if(show_ambiguous & any(ambiguous)) {cat(paste(f_str,paste(rep(" ",n_char-nchar(f_str)),collapse=""),"Ambiguous :", paste(names(found)[ambiguous], collapse=', '),"\n"))}
if(show_inexistent & any(inexistent)) {cat(paste(f_str,paste(rep(" ",n_char-nchar(f_str)),collapse=""),"Inexistent :", paste(names(found)[inexistent], collapse=', '),"\n"))}
if(show_global_functions & any(global_functions)){cat(paste(f_str,paste(rep(" ",n_char-nchar(f_str)),collapse=""),"Global functions :", paste(names(found)[global_functions], collapse=', '),"\n"))}
return(invisible(FALSE))
} else {return(invisible(TRUE))}
}
checkStrictAll <- function(exceptions = NULL,show_global_functions = FALSE,show_ambiguous = FALSE, show_inexistent = TRUE){
functions <- c(lsf.str(envir=globalenv()))
n_char <- max(nchar(functions))
invisible(sapply(functions,checkStrict1,exceptions,n_char = n_char,show_global_functions,show_ambiguous, show_inexistent))
}
What works for me, based on @c-urchin's answer, is to define a script which reads all my functions and then excludes the global environment:
filenames <- Sys.glob('fun/*.R')
for (filename in filenames) {
  source(filename, local=TRUE)
  # escape the dot so that only a literal ".R" extension is stripped
  funname <- sub('^fun/(.*)\\.R$', "\\1", filename)
  eval(parse(text=paste('environment(',funname,') <- parent.env(globalenv())',sep='')))
}
I assume that
all functions and nothing else are contained in the relative directory ./fun and
every .R file contains exactly one function with an identical name as the file.
The catch is that if one of my functions calls another one of my functions, then the outer function has to also call this script first, and it is essential to call it with local=T:
source('readfun.R', local=T)
assuming of course that the script file is called readfun.R.
I have always found startup profile files of other people both useful and instructive about the language. Moreover, while I have some customization for Bash and Vim, I have nothing for R.
For example, one thing I always wanted is different colors for input and output text in a window terminal, and maybe even syntax highlighting.
Here is mine. It won't help you with the coloring but I get that from ESS and Emacs...
options("width"=160) # wide display with multiple monitors
options("digits.secs"=3) # show sub-second time stamps
r <- getOption("repos") # hard code the US repo for CRAN
r["CRAN"] <- "http://cran.us.r-project.org"
options(repos = r)
rm(r)
## put something like this in your .Rprofile to customize the defaults
setHook(packageEvent("grDevices", "onLoad"),
function(...) grDevices::X11.options(width=8, height=8,
xpos=0, pointsize=10,
#type="nbcairo")) # Cairo device
#type="cairo")) # other Cairo dev
type="xlib")) # old default
## from the AER book by Zeileis and Kleiber
options(prompt="R> ", digits=4, show.signif.stars=FALSE)
options("pdfviewer"="okular") # on Linux, use okular as the pdf viewer
I hate to type the full words 'head', 'summary', 'names' every time, so I use aliases.
You can put aliases into your .Rprofile file, but you have to use the full path to the function (e.g. utils::head) otherwise it won't work.
# aliases
s <- base::summary
h <- utils::head
n <- base::names
EDIT: to answer your question, you can use the colorout package to have different colors in the terminal. Cool! :-)
options(stringsAsFactors=FALSE)
Although I don't actually have that in my .Rprofile, because it might break my coauthors' code, I wish it was the default. Why?
1) Character vectors use less memory (but only barely);
2) More importantly, we would avoid problems such as:
> x <- factor(c("a","b","c"))
> x
[1] a b c
Levels: a b c
> x <- c(x, "d")
> x
[1] "1" "2" "3" "d"
and
> x <- factor(c("a","b","c"))
> x[1:2] <- c("c", "d")
Warning message:
In `[<-.factor`(`*tmp*`, 1:2, value = c("c", "d")) :
invalid factor level, NAs generated
Factors are great when you need them (e.g. implementing ordering in graphs) but a nuisance most of the time.
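For completeness, a short sketch of how both pitfalls above can be avoided when a factor is really needed: convert to character before appending, and declare every level up front before assigning.

```r
x <- factor(c("a", "b", "c"))

# Appending: go through character, then rebuild the factor
x_extended <- factor(c(as.character(x), "d"))
print(levels(x_extended))

# Assigning: declare the extra level first, so "d" is a valid value
y <- factor(c("a", "b", "c"), levels = c("a", "b", "c", "d"))
y[1:2] <- c("c", "d")
print(as.character(y))
```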
I like saving my R command history and having it available each time I run R:
In the shell or .bashrc:
export R_HISTFILE=~/.Rhistory
in .Rprofile:
.Last <- function() {
if (!any(commandArgs()=='--no-readline') && interactive()){
require(utils)
try(savehistory(Sys.getenv("R_HISTFILE")))
}
}
Here are two functions I find handy for working with Windows.
The first converts the \s to /.
.repath <- function() {
cat('Paste windows file path and hit RETURN twice')
x <- scan(what = "")
xa <- gsub('\\\\', '/', x)
writeClipboard(paste(xa, collapse=" "))
cat('Here\'s your de-windowsified path. (It\'s also on the clipboard.)\n', xa, '\n')
}
The second opens the working directory in a new explorer window.
getw <- function() {
suppressWarnings(shell(paste("explorer", gsub('/', '\\\\', getwd()))))
}
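The core of the first function is a single substitution, which can be tried on any platform without the clipboard. A minimal sketch, with to_forward_slashes as a hypothetical helper name and a made-up path:

```r
# Hypothetical helper: replace every backslash with a forward slash
to_forward_slashes <- function(path) {
  gsub("\\\\", "/", path)  # "\\\\" is the regex for one literal backslash
}

win_path <- "C:\\data\\rprojects\\functions"  # i.e. C:\data\rprojects\functions
unix_style <- to_forward_slashes(win_path)
print(unix_style)
```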
Here's mine. I always use the main CRAN repository, and have code to make it easy to source in-development package code.
.First <- function() {
library(graphics)
options("repos" = c(CRAN = "http://cran.r-project.org/"))
options("device" = "quartz")
}
packages <- list(
"describedisplay" = "~/ggobi/describedisplay",
"linval" = "~/ggobi/linval",
"ggplot2" = "~/documents/ggplot/ggplot",
"qtpaint" = "~/documents/cranvas/qtpaint",
"tourr" = "~/documents/tour/tourr",
"tourrgui" = "~/documents/tour/tourr-gui",
"prodplot" = "~/documents/categorical-grammar"
)
l <- function(pkg) {
pkg <- tolower(deparse(substitute(pkg)))
if (is.null(packages[[pkg]])) {
path <- file.path("~/documents", pkg, pkg)
} else {
path <- packages[[pkg]]
}
source(file.path(path, "load.r"))
}
test <- function(path) {
path <- deparse(substitute(path))
source(file.path("~/documents", path, path, "test.r"))
}
I've got this, more dynamic trick to use full terminal width, which tries to read from the COLUMNS environment variable (on Linux):
tryCatch(
{options(
width = as.integer(Sys.getenv("COLUMNS")))},
error = function(err) {
write("Can't get your terminal width. Put ``export COLUMNS'' in your \
.bashrc. Or something. Setting width to 120 chars",
stderr());
options(width=120)}
)
This way R will use the full width even as you resize your terminal window.
Most of my personal functions and loaded libraries are in the Rfunctions.r script
source("c:\\data\\rprojects\\functions\\Rfunctions.r")
.First <- function(){
cat("\n Rrrr! The statistics program for Pirates !\n\n")
}
.Last <- function(){
cat("\n Rrrr! Avast Ye, YO HO!\n\n")
}
#===============================================================
# Tinn-R: necessary packages
#===============================================================
library(utils)
necessary = c('svIDE', 'svIO', 'svSocket', 'R2HTML')
if(!all(necessary %in% installed.packages()[, 'Package']))
install.packages(c('SciViews', 'R2HTML'), dep = T)
options(IDE = 'C:/Tinn-R/bin/Tinn-R.exe')
options(use.DDE = T)
library(svIDE)
library(svIO)
library(svSocket)
library(R2HTML)
guiDDEInstall()
shell(paste("mkdir C:\\data\\rplots\\plottemp", gsub('-','',Sys.Date()), sep=""))
pldir <- paste("C:\\data\\rplots\\plottemp", gsub('-','',Sys.Date()), sep="")
plot.str <-c('savePlot(paste(pldir,script,"\\BeachSurveyFreq.pdf",sep=""),type="pdf")')
Here's an excerpt from my ~/.Rprofile, designed for Mac and Linux.
These make errors easier to see.
options(showWarnCalls=T, showErrorCalls=T)
I hate the CRAN menu choice, so set to a good one.
options(repos=c("http://cran.cnr.Berkeley.edu","http://cran.stat.ucla.edu"))
More history!
Sys.setenv(R_HISTSIZE='100000')
The following is for running on Mac OSX from the terminal (which I greatly prefer to R.app because it's more stable, and you can organize your work by directory; also make sure to get a good ~/.inputrc). By default, you get an X11 display, which doesn't look as nice; this instead gives a quartz display same as the GUI. The if statement is supposed to catch the case when you're running R from the terminal on Mac.
f = pipe("uname")
if (.Platform$GUI == "X11" && readLines(f)=="Darwin") {
# http://www.rforge.net/CarbonEL/
library("grDevices")
library("CarbonEL")
options(device='quartz')
Sys.unsetenv("DISPLAY")
}
close(f); rm(f)
And preload a few libraries,
library(plyr)
library(stringr)
library(RColorBrewer)
if (file.exists("~/util.r")) {
source("~/util.r")
}
where util.r is a random bag of stuff I use, under flux.
Also, since other people were mentioning console width, here's how I do it.
if ( (numcol <-Sys.getenv("COLUMNS")) != "") {
numcol = as.integer(numcol)
options(width= numcol - 1)
} else if (system("stty -a &>/dev/null") == 0) {
# mac specific? probably bad in the R GUI too.
numcol = as.integer(sub(".* ([0-9]+) column.*", "\\1", system("stty -a", intern=T)[1]))
if (numcol > 0)
options(width= numcol - 1 )
}
rm(numcol)
This actually isn't in .Rprofile because you have to re-run it every time you resize the terminal window. I have it in util.r then I just source it as necessary.
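The COLUMNS-based part of these width tricks can be condensed into one small helper that never errors. This is a sketch only: columns_or_default is a hypothetical name, the default of 80 is arbitrary, and the 10 to 10000 range is the one R accepts for the width option.

```r
# Hypothetical helper: terminal width from COLUMNS, or a default if unusable
columns_or_default <- function(default = 80L) {
  cols <- suppressWarnings(as.integer(Sys.getenv("COLUMNS")))
  if (is.na(cols) || cols < 10L || cols > 10000L) default else cols
}

options(width = columns_or_default())
print(getOption("width"))
```

As noted above, COLUMNS must be exported by the shell for R to see it, and the helper has to be re-run after the terminal is resized.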
Here are mine:
.First <- function () {
options(device="quartz")
}
.Last <- function () {
if (!any(commandArgs() == '--no-readline') && interactive()) {
require(utils)
try(savehistory(Sys.getenv("R_HISTFILE")))
}
}
# Slightly more flexible than as.Date
# my.as.Date("2009-01-01") == my.as.Date(2009, 1, 1) == as.Date("2009-01-01")
my.as.Date <- function (a, b=NULL, c=NULL, ...) {
if (!is.character(a))
return (as.Date(sprintf("%d-%02d-%02d", a, b, c)))
else
return (as.Date(a))
}
# Some useful aliases
cd <- setwd
pwd <- getwd
lss <- dir
asd <- my.as.Date # examples: asd("2009-01-01") == asd(2009, 1, 1) == as.Date("2009-01-01")
last <- function (x, n=1, ...) tail(x, n=n, ...)
# Set proxy for all web requests
Sys.setenv(http_proxy="http://192.168.0.200:80/")
# Search RPATH for file <fn>. If found, return full path to it
search.path <- function(fn,
paths = strsplit(chartr("\\", "/", Sys.getenv("RPATH")), split =
switch(.Platform$OS.type, windows = ";", ":"))[[1]]) {
for(d in paths)
if (file.exists(f <- file.path(d, fn)))
return(f)
return(NULL)
}
# If loading in an environment that doesn't respect my RPATH environment
# variable, set it here
if (Sys.getenv("RPATH") == "") {
Sys.setenv(RPATH=file.path(path.expand("~"), "Library", "R", "source"))
}
# Load commonly used functions
if (interactive())
source(search.path("afazio.r"))
# If no R_HISTFILE environment variable, set default
if (Sys.getenv("R_HISTFILE") == "") {
Sys.setenv(R_HISTFILE=file.path("~", ".Rhistory"))
}
# Override q() to not save by default.
# Same as saying q("no")
q <- function (save="no", ...) {
quit(save=save, ...)
}
# ---------- My Environments ----------
#
# Rather than starting R from within different directories, I prefer to
# switch my "environment" easily with these functions. An "environment" is
# simply a directory that contains analysis of a particular topic.
# Example usage:
# > load.env("markets") # Load US equity markets analysis environment
# > # ... edit some .r files in my environment
# > reload() # Re-source .r/.R files in my environment
#
# On next startup of R, I will automatically be placed into the last
# environment I entered
# My current environment
.curr.env = NULL
# File contains name of the last environment I entered
.last.env.file = file.path(path.expand("~"), ".Rlastenv")
# Parent directory where all of my "environment"s are contained
.parent.env.dir = file.path(path.expand("~"), "Analysis")
# Create parent directory if it doesn't already exist
if (!file.exists(.parent.env.dir))
dir.create(.parent.env.dir)
load.env <- function (string, save=TRUE) {
# Load all .r/.R files in <.parent.env.dir>/<string>/
cd(file.path(.parent.env.dir, string))
for (file in lss()) {
if (substr(file, nchar(file)-1, nchar(file)+1) %in% c(".r", ".R"))
source(file)
}
.curr.env <<- string
# Save current environment name to file
if (save == TRUE) writeLines(.curr.env, .last.env.file)
# Let user know environment switch was successful
print (paste(" -- in ", string, " environment -- "))
}
# "reload" current environment.
reload <- resource <- function () {
if (!is.null(.curr.env))
load.env(.curr.env, save=FALSE)
else
print (" -- not in environment -- ")
}
# On startup, go straight to the environment I was last working in
if (interactive() && file.exists(.last.env.file)) {
load.env(readLines(.last.env.file))
}
sink(file = 'R.log', split=T)
options(scipen=5)
.ls.objects <- function (pos = 1, pattern, order.by = "Size", decreasing=TRUE, head = TRUE, n = 10) {
# based on postings by Petr Pikal and David Hinds to the r-help list in 2004
# modified by: Dirk Eddelbuettel (http://stackoverflow.com/questions/1358003/tricks-to-manage-the-available-memory-in-an-r-session)
# I then gave it a few tweaks (show size as megabytes and use defaults that I like)
# a data frame of the objects and their associated storage needs.
napply <- function(names, fn) sapply(names, function(x)
fn(get(x, pos = pos)))
names <- ls(pos = pos, pattern = pattern)
obj.class <- napply(names, function(x) as.character(class(x))[1])
obj.mode <- napply(names, mode)
obj.type <- ifelse(is.na(obj.class), obj.mode, obj.class)
obj.size <- napply(names, object.size) / 10^6 # megabytes
obj.dim <- t(napply(names, function(x)
as.numeric(dim(x))[1:2]))
vec <- is.na(obj.dim)[, 1] & (obj.type != "function")
obj.dim[vec, 1] <- napply(names, length)[vec]
out <- data.frame(obj.type, obj.size, obj.dim)
names(out) <- c("Type", "Size", "Rows", "Columns")
out <- out[order(out[[order.by]], decreasing=decreasing), ]
if (head)
out <- head(out, n)
out
}
Make data.frames display somewhat like 'head', only without having to type 'head'
print.data.frame <- function(df) {
if (nrow(df) > 10) {
base::print.data.frame(head(df, 5))
cat("----\n")
base::print.data.frame(tail(df, 5))
} else {
base::print.data.frame(df)
}
}
(From How to make 'head' be applied automatically to output? )
I often have a chain of debug calls to make, and uncommenting them can be very tedious. With the help of the SO community, I went for the following solution and inserted it into my Rprofile.site. The # BROWSER tag is there for my Eclipse Tasks, so that I have an overview of browser calls in the Task View window.
# turn debugging on or off
# place "browser(expr = isTRUE(getOption("debug"))) # BROWSER" in your function
# and turn debugging on or off by bugon() or bugoff()
bugon <- function() options("debug" = TRUE)
bugoff <- function() options("debug" = FALSE) #pun intended
Mine is not too fancy:
# So the mac gui can find latex
Sys.setenv("PATH" = paste(Sys.getenv("PATH"),"/usr/texbin",sep=":"))
#Use last(x) instead of x[length(x)], works on matrices too
last <- function(x) { tail(x, n = 1) }
#For tikzDevice caching
options( tikzMetricsDictionary='/Users/cameron/.tikzMetricsDictionary' )
setwd("C://path//to//my//prefered//working//directory")
library("ggplot2")
library("RMySQL")
library("foreign")
answer <- readline("What database would you like to connect to? ")
con <- dbConnect(MySQL(),user="root",password="mypass", dbname=answer)
I do a lot of work from MySQL databases, so connecting right away is a godsend. I only wish there was a way of listing the available databases so I wouldn't have to remember all the different names.
Stephen Turner's post on .Rprofiles has several useful aliases and starter functions.
I find myself using his ht and hh often.
#ht==headtail, i.e., show the first and last 10 items of an object
ht <- function(d) rbind(head(d,10),tail(d,10))
# Show the first 5 rows and first 5 columns of a data frame or matrix
hh <- function(d) d[1:5,1:5]
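One caveat: hh as written misbehaves on inputs smaller than 5 x 5, because the indices 1:5 run past the end. A possible variant that clamps the indices is sketched below; hh_safe is a hypothetical name, not part of the original post.

```r
# Hypothetical variant of hh(): show at most n rows and n columns
hh_safe <- function(d, n = 5) {
  d[seq_len(min(n, nrow(d))), seq_len(min(n, ncol(d))), drop = FALSE]
}

small <- data.frame(a = 1:3, b = 4:6)   # smaller than 5 x 5
big <- matrix(1:100, nrow = 10)         # larger than 5 x 5

small_dim <- dim(hh_safe(small))
big_dim <- dim(hh_safe(big))
print(small_dim)
print(big_dim)
```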
Here's mine, including some of the mentioned ideas.
Two things you might want to look at:
.set.width() / w() update your print width to match the terminal's width. Unfortunately I did not find a way to do this automatically on terminal resize, although the R documentation mentions that some R interpreters do this.
history is saved every time together with a timestamp and the working directory
.set.width <- function() {
  cols <- as.integer(Sys.getenv("COLUMNS"))
  if (is.na(cols) || cols > 10000 || cols < 10)
    options(width=100) # fall back when COLUMNS is unset or out of range
  else
    options(width=cols)
}
.First <- function() {
options(digits.secs=3) # show sub-second time stamps
options(max.print=1000) # do not print more than 1000 lines
options("repos" = c(CRAN="http://cran.at.r-project.org"))
options(prompt="R> ", digits=4, show.signif.stars=FALSE)
}
# aliases
w <- .set.width
.Last <- function() {
if (!any(commandArgs()=='--no-readline') && interactive()){
timestamp(,prefix=paste("##------ [",getwd(),"] ",sep=""))
try(savehistory("~/.Rhistory"))
}
}
I use the following to get cacheSweave (or pgfSweave) to work with the "Compile PDF" button in RStudio:
library(cacheSweave)
assignInNamespace("RweaveLatex", cacheSweave::cacheSweaveDriver, "utils")
Mine includes options(menu.graphics=FALSE) because I like to disable the tcltk popup for CRAN mirror selection in R.
Here's mine. Nothing too innovative. Thoughts on why particular choices:
I went with setting a default for stringsAsFactors because I find it extremely draining to pass it as an argument each time I read a CSV in. That said, it has already caused me some minor vexation when using code written on my usual computer on a computer which did not have my .Rprofile. I'm keeping it, though, as the troubles it has caused pale in comparison to the troubles that not having it set used to cause every day.
If you don't load the utils package before options(error=recover), it cannot find recover when placed inside an interactive() block.
I used .db for my dropbox setting rather than options(dropbox=...) because I use it all the time inside file.path and it saves much typing. The leading . keeps it from appearing with ls().
Without further ado:
if(interactive()) {
options(stringsAsFactors=FALSE)
options(max.print=50)
options(repos="http://cran.mirrors.hoobly.com")
}
.db <- "~/Dropbox"
# `=` <- function(...) stop("Assignment by = disabled, use <- instead")
options(BingMapsKey="blahblahblah") # Used by taRifx.geo::geocode()
.First <- function() {
if(interactive()) {
require(functional)
require(taRifx)
require(taRifx.geo)
require(ggplot2)
require(foreign)
require(R.utils)
require(stringr)
require(reshape2)
require(devtools)
require(codetools)
require(testthat)
require(utils)
options(error=recover)
}
}
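The point about the leading dot in .db can be checked in a scratch environment. A minimal sketch (the environment e and the file names are made up for illustration): dot-prefixed names are hidden from a plain ls() but remain usable.

```r
e <- new.env()
assign(".db", "~/Dropbox", envir = e)
assign("visible", 1, envir = e)

default_listing <- ls(e)                   # dot-prefixed names are hidden
full_listing <- ls(e, all.names = TRUE)    # ...unless explicitly requested

# Typical use inside file.path(), as described above
project_path <- file.path(get(".db", envir = e), "project", "data.csv")

print(default_listing)
print(project_path)
```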
Here's a little snippet for use exporting tables to LaTeX. It changes all the column names to math mode for the many reports I write. The rest of my .Rprofile is pretty standard and mostly covered above.
# Puts $dollar signs in front and behind all column names col_{sub} -> $col_{sub}$
amscols<-function(x){
colnames(x) <- paste("$", colnames(x), "$", sep = "")
x
}
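A quick usage sketch for the helper above, assuming a small made-up data frame (the column names alpha and beta_1 are purely illustrative):

```r
# Same helper as above, repeated so the example is self-contained.
amscols <- function(x) {
  colnames(x) <- paste("$", colnames(x), "$", sep = "")
  x
}

# Hypothetical table; only the column names matter here.
df <- data.frame(alpha = 1:2, beta_1 = 3:4)
out <- amscols(df)
print(colnames(out))
```

The values are untouched; only the header row changes, so the result can be handed to whatever LaTeX table exporter is in use, provided that exporter is told not to escape the dollar signs.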
I set my lattice color theme in my profile. Here are two other tweaks I use:
# Display working directory in the titlebar
# Note: This causes demo(graphics) to fail
utils::setWindowTitle(base::getwd())
utils::assignInNamespace("setwd",function(dir) {.Internal(setwd(dir));setWindowTitle(base::getwd())},"base")
# Don't print more than 2000 entries
options(max.print=2000)
I have an environment variable R_USER_WORKSPACE which points to the top directory of my packages. In .Rprofile I define a function devlib which sets the working directory (so that data() works) and sources all .R files in the R subdirectory. It is quite similar to Hadley's l() function above.
devlib <- function(pkg) {
setwd(file.path(Sys.getenv("R_USER_WORKSPACE", "."), deparse(substitute(pkg)), "dev"))
sapply(list.files("R", pattern="\\.r$", ignore.case=TRUE, full.names=TRUE), source) # escape the dot so only .r/.R files match
invisible(NULL)
}
.First <- function() {
setwd(Sys.getenv("R_USER_WORKSPACE", "."))
options("repos" = c(CRAN = "http://mirrors.softliste.de/cran/", CRANextra="http://www.stats.ox.ac.uk/pub/RWin"))
}
.Last <- function() update.packages(ask="graphics")
I find two functions really necessary. First, when I have set debug() on several functions and have resolved the bug, I want to undebug() all of them at once, not one by one. The undebug_all() function given in the accepted answer here is the best.
Second, when I have defined many functions and am looking for a specific variable name, it's hard to find it among all the results of ls(), which include the function names. The lsnofun() function posted here is really good.
I am currently writing a package using reference classes. I have come across
an issue which from reading various sources:
Method initialisation in R reference classes
Can't reliably use RefClass methods in Snowfall
I gather is caused because reference methods are not all copied to every object
in the class; rather, they are copied when first accessed.
https://stat.ethz.ch/pipermail/r-devel/2011-June/061261.html
As an example define:
test <- setRefClass("TEST",
fields = list( a = "numeric"),
methods = list(
addone = function(){
a <<- a+1
},
initialize = function(){
a <<- 1
}
)
)
example <- test$new()
So example is a new object of class TEST. Typing example$ and tabbing in the
console gives
> example$
# example$.->a example$.refClassDef example$.self
# example$a example$initialize
so the method addone is not presented as an option. It is available to
call however:
example$addone()
Now tabbing again reveals
# >
# > example
# Reference class object of class "TEST"
# Field "a":
# [1] 2
# > example$
# example$.->a example$.refClassDef example$.self
# example$a example$addone example$field
# example$initialize example$show
so now addone and field and show are presented as options.
In one of the links above, Martin Morgan advises forcing definition of the methods. This
works well:
test <- setRefClass("TEST",
fields = list( a = "numeric"),
methods = list(
addone = function(){
a <<- a+1
},
initialize = function(){
a <<- 1
.self$addone #force definition
}
)
)
example <- test$new()
so now tabbing gives:
# > example$
# example$.->a example$.refClassDef example$.self
# example$a example$addone example$initialize
Some of my classes have over 30 methods so I would like to do this as succinctly as possible.
I have defined:
test <- setRefClass("TEST",
fields = list( a = "numeric"),
methods = list(
addone = function(){
a <<- a+1
},
initialize = function(){
a <<- 1
eval(parse(text=paste0('.self$',ls(test$def@refMethods))))
}
)
)
example <- test$new()
tabbing now gives:
# > example$
# example$.->a example$.refClassDef example$.self
# example$a example$addone example$callSuper
# example$copy example$export example$field
# example$getClass example$getRefClass example$import
# example$initFields example$initialize example$show
# example$trace example$untrace
Whilst this works, it feels a bit clumsy. Also, test$def@refMethods is used rather than getRefClass("TEST")$def@refMethods, so that
feels a bit wrong. Has anyone dealt with this issue before?
Is there a better way to approach a solution? Thanks for any advice, and apologies if the question is overly drawn out.
I wonder what your objective is? Function names showing up with tab completion? Then it's worth a post to the R-devel mailing list with a feature request. The original scenario is more elegantly handled with usingMethods as documented on ?setRefClass. A continued hack might be
initialize = function(...) {
methods <- getRefClass(class(.self))$methods()
eval(parse(text=paste0(".self$", methods)))
callSuper(...)
}
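The usingMethods route mentioned above could look roughly like the sketch below. It assumes a hypothetical class named Tally (not from the question); per ?setRefClass, usingMethods() only informs the code analysis and does nothing at run time. Whether tab completion then lists the declared method straight away may depend on the R version and front end, so that part is worth checking interactively.

```r
library(methods)

# Hypothetical class for illustration; mirrors the TEST class in the question.
Tally <- setRefClass("Tally",
  fields = list(a = "numeric"),
  methods = list(
    addone = function() {
      a <<- a + 1
    },
    initialize = function(...) {
      # Declare that this method relies on addone, so the declared method
      # is included when initialize itself is installed in the object.
      usingMethods(addone)
      a <<- 1
      callSuper(...)
    }
  )
)

obj <- Tally$new()
obj$addone()
print(obj$a)
```

With more than 30 methods the declaration still has to name each one, but it avoids building code with eval(parse()).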
Tab completions can be customized via .DollarNames in the utils package, so
.DollarNames.TEST <- function(x, pattern)
grep(pattern, getRefClass(class(x))$methods(), value=TRUE)
Maybe an S3 method could be written at the base of your class hierarchy for this?
I know this is an old question, but it is still the top entry when searching for refClass tab completion on Google, so I'll just add an update:
Instead of using grep in the .DollarNames function as suggested by Martin, use findMatches from the utils package, as it plays better with the different R GUIs around (grep will delete your partially typed name upon hitting tab):
.DollarNames.TEST <- function(x, pattern){
utils:::findMatches(pattern, getRefClass(class(x))$methods())
}
This is also how tab completion is handled internally for lists and data.frames
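A .DollarNames method can be checked without pressing Tab by calling the generic directly. The sketch below assumes a hypothetical class named Counter and uses the grep variant from the earlier answer, since findMatches is an unexported utils function whose arguments may differ between R versions. The pattern "^add" stands in for what a completion front end would pass after typing obj$add.

```r
library(methods)

# Hypothetical class for illustration only.
Counter <- setRefClass("Counter",
  fields = list(a = "numeric"),
  methods = list(
    addone = function() {
      a <<- a + 1
    },
    initialize = function(...) {
      a <<- 1
      callSuper(...)
    }
  )
)

# S3 method for the completion generic: offer every method name of the
# class that matches the pattern supplied by the completion machinery.
.DollarNames.Counter <- function(x, pattern = "") {
  grep(pattern, getRefClass(class(x))$methods(), value = TRUE)
}

obj <- Counter$new()
completions <- utils::.DollarNames(obj, "^add")
print(completions)
```

Note that this version lists methods only; fields such as a would need to be added to the candidate vector if they should complete as well.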
@Martin Morgan noted that this is termed tab completion. The packages rcompletion and later rcompgen were tasked with achieving this. They have now been moved to utils.
rcompletion update
I looked through the code for completion.R and, from what I could determine, utils:::.DollarNames.environment was handling tab completion for reference classes.
completion.R
Redefining the function seemed to achieve tab completion:
assignInNamespace( x = ".DollarNames.environment",
function(x, pattern = "") {
y <- NULL
if(isS4(x) && !is.null(x[['.refClassDef']])){
if(.hasSlot(x$.refClassDef,'refMethods')){
y<-x$.refClassDef@refMethods
y<-ls(y, all.names = TRUE, pattern = pattern)
}
}
x<-ls(x, all.names = TRUE, pattern = pattern)
unique(c(x,y))
}
,ns = "utils")
Some things to note:
I would only use this for my own use. Currently I am debugging and documenting a package. I had some longish method names and couldn't remember exactly what they were, so tab completion will help greatly.
Usage of assignInNamespace in a package is frowned upon (if not banned); see ?assignInNamespace.
Forced definition of methods is more advisable.
Is there any way to "check" or "verify" a source code file in R when sourcing it?
For example, I have this function in a file "source.R"
MyFunction <- function(x)
{
print(x+y)
}
When sourcing "source.R", I would like to see some sort of warning: MyFunction refers to an undefined object y.
Any hints on how to check / verify R code?
Cheers!
I use a function like this one for scanning all the functions in a file:
critic <- function(file) {
require(codetools)
tmp.env <- new.env()
sys.source(file, envir = tmp.env)
checkUsageEnv(tmp.env, all = TRUE)
}
Assuming source.R contains the definitions of two rather poorly written functions:
MyFunction <- function(x) {
print(x+y)
}
MyFunction2 <- function(x, z) {
a <- 10
x <- x + 1
print(x)
}
Here is the output:
critic("source.R")
# MyFunction: no visible binding for global variable ‘y’
# MyFunction2: local variable ‘a’ assigned but may not be used
# MyFunction2: parameter ‘x’ changed by assignment
# MyFunction2: parameter ‘z’ may not be used
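If the messages need to be handled programmatically rather than read off the console, one option is the report argument of codetools::checkUsage, which receives each message as a string. The following is a minimal sketch; bad_fun and undefined_offset are hypothetical names, and the check is run on a single function instead of a whole file.

```r
library(codetools)

# Hypothetical function that refers to a variable defined nowhere.
bad_fun <- function(x) {
  print(x + undefined_offset)
}

# Collect the messages in a character vector instead of printing them.
msgs <- character(0)
checkUsage(bad_fun, name = "bad_fun",
           report = function(m) msgs <<- c(msgs, m))

print(msgs)
```

A wrapper around source() could then call warning() for each collected message, which is close to what the question asks for.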
You can use the codetools package in base R for that. And if you had your code in a package, it would tell you about this: