Repeating and incrementing a readline() user input x number of times

This function should take the integer that the user enters at the first readline() prompt and then run the following readline() prompts based on that integer.
If numSampFolders == 1, it only requires one sample folder name, as in the code below. Similarly, if numSampFolders == 2, it requires two sample folder names.
There must be a better way to write this than how I've set it up, though (and I don't really want to end up with dozens of if() statements just to cover every eventuality...). For example, if numSampFolders = 24, the code should ask the user, via readline(), to enter 24 names and store those names as global variables. (I understand that this could be done more easily with list.files(); however, there are folders in the same directory that are not required.)
Thanks for ideas.
set_params <- function() {
  numSampFolders <- readline("How many sample folders are you using? ")
  numSampFolders <<- as.numeric(numSampFolders)
  if (numSampFolders == 1) {
    SampFolder1 <<- readline("Enter the name of your only sample folder: ")
  }
  if (numSampFolders == 2) {
    SampFolder1 <<- readline("Enter the name of sample folder 1: ")
    SampFolder2 <<- readline("Enter the name of sample folder 2: ")
  }
}
if(interactive()) set_params()

The general approach is to use lapply (or sapply/for) for iteration together with the R function assign. assign behaves like <-, and the trick to making it behave like <<- is to specify the environment with envir = .GlobalEnv. See the following solution:
set_params <- function() {
  numSampFolders <- readline("How many sample folders are you using? ")
  numSampFolders <<- as.numeric(numSampFolders)
  if (numSampFolders == 1) {
    SampFolder1 <<- readline("Enter the name of your only sample folder: ")
  } else if (numSampFolders > 1) {
    lapply(seq_len(numSampFolders), function(i)
      assign(paste0("SampFolder", i),
             readline(paste0("Enter the name of sample folder ", i, ": ")),
             envir = .GlobalEnv))
  }
}
if(interactive()) set_params()
Here's the output from my console:
ls()
# character(0)
# How many sample folders are you using? 2
# Enter the name of sample folder 1: ok
# Enter the name of sample folder 2: whatever
# [[1]]
# [1] "ok"
# [[2]]
# [1] "whatever"
ls()
# [1] "numSampFolders" "SampFolder1" "SampFolder2" "set_params"
SampFolder1
# [1] "ok"
You may not like that the list returned by lapply ends up printed at the console (it is the value returned by set_params(), so it auto-prints at the top level). The following is what I'm referring to:
# [[1]]
# [1] "ok"
In that case, you can use purrr::walk, which calls the function purely for its side effects and returns its input invisibly, so nothing extra is printed.
library(purrr)
set_params <- function() {
  numSampFolders <- readline("How many sample folders are you using? ")
  numSampFolders <<- as.numeric(numSampFolders)
  if (numSampFolders == 1) {
    SampFolder1 <<- readline("Enter the name of your only sample folder: ")
  } else if (numSampFolders > 1) {
    walk(seq_len(numSampFolders), function(i)
      assign(paste0("SampFolder", i),
             readline(paste0("Enter the name of sample folder ", i, ": ")),
             envir = .GlobalEnv))
  }
}
if(interactive()) set_params()
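As an aside, here is a hedged sketch (my addition, not part of the answer above) that avoids global assignment entirely by returning the folder names as one character vector, which is usually easier to work with than SampFolder1, SampFolder2, and so on:
ask_samp_folders <- function() {
  n <- as.integer(readline("How many sample folders are you using? "))
  vapply(seq_len(n), function(i) {
    readline(paste0("Enter the name of sample folder ", i, ": "))
  }, character(1))
}
# usage: sampFolders <- ask_samp_folders(); then sampFolders[1], sampFolders[2], ...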

Related

Assign string to object based on filename with multiple conditions

I'm using some switches in my R script based on the provided data, and I would like to automate the recognition of that data. The files I'm using contain the required information in their names; I'm looking for a good way to match and assign these parts.
file names:
# Folder1:
T090_V4_plate1_S90_L001_R1_001.fastq.gz
T090_V4_plate1_S90_L001_R2_001.fastq.gz
# Folder2:
T091_V4_plate2_S1_L001_R1_001.fastq.gz
T091_V4_plate2_S1_L001_R2_001.fastq.gz
# Folder3:
TNT_2017_13_V34_plate4_S13_L001_R1_001.fastq.gz
TNT_2017_13_V34_plate4_S13_L001_R2_001.fastq.gz
TNT_2017_14_V34_plate4_S14_L001_R1_001.fastq.gz
TNT_2017_14_V34_plate4_S14_L001_R2_001.fastq.gz
The two values I would like to assign to objects are V4 or V34 to the object primerset, and plate[1-4] to plate. I tried it like this:
if (length(list.files(pattern = "plate1")) > 1) {
  plate <<- "plate1"
} else if (length(list.files(pattern = "plate2")) > 1) {
  plate <<- "plate2"
} else if (length(list.files(pattern = "plate3")) > 1) {
  plate <<- "plate3"
} else if (length(list.files(pattern = "plate4")) > 1) {
  plate <<- "plate4"
}
if (length(list.files(pattern = "V4")) > 1) {
  primerset <<- "V4"
} else if (length(list.files(pattern = "V34")) > 1) {
  primerset <<- "V34"
}
# print message based on detected values from file names
if (primerset == "V34") {
  cat("sequence length is 301 bp")
} else if (primerset == "V4") {
  cat("sequence length is 250 bp")
}
It works fine, but it looks complicated and error-prone. Is there a more elegant solution? I would prefer not to load a package just for this task.
Additionally I don't know how to add a break if more than one condition is met, e.g. plate1 and plate2 in the same folder (I have the data sets separated, but just to be on the safe side).
Solution:
Based on the answers below these two versions also test if only one instance of primerset or plate is present:
filenames <- list.files()
if (length(unique(sub(".*_(plate\\d)_.*", "\\1", filenames))) == 1) {
plate <- unique(sub(".*_(plate\\d)_.*", "\\1", filenames))
}
matches = stringr::str_match(filenames, '_(V\\d+)_(plate\\d)')
if (length(unique(matches[, 2])) == 1) {
primerset = unique(matches[, 2])
}
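For the "break if more than one condition is met" part, a minimal sketch (my addition, not from the original answers) is to compute the unique matches once and stop() when there is not exactly one, reusing the filenames vector from above:
plates <- unique(sub(".*_(plate\\d)_.*", "\\1", filenames))
if (length(plates) != 1) {
  stop("Expected exactly one plate per folder, found: ",
       paste(plates, collapse = ", "))
}
plate <- plates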
In base R, we can use sub to extract specific part of the string.
primerset <- sub(".*_(V4|V34)_.*", "\\1", x)
#Or more generally
#primerset <- sub(".*_(V\\d+)_.*", "\\1", x)
plate <- sub(".*_(plate\\d)_.*", "\\1", x)
where x is a vector of all the filenames:
x <- c("T090_V4_plate1_S90_L001_R1_001.fastq.gz",
"T090_V4_plate1_S90_L001_R2_001.fastq.gz",
"T091_V4_plate2_S1_L001_R1_001.fastq.gz",
"T091_V4_plate2_S1_L001_R2_001.fastq.gz",
"TNT_2017_13_V34_plate4_S13_L001_R1_001.fastq.gz",
"TNT_2017_13_V34_plate4_S13_L001_R2_001.fastq.gz",
"TNT_2017_14_V34_plate4_S14_L001_R1_001.fastq.gz",
"TNT_2017_14_V34_plate4_S14_L001_R2_001.fastq.gz")
This calls for a regular expression. Using the {stringr} package, you would write:
matches = stringr::str_match(x, '_(V\\d+)_(plate\\d)')
primerset = matches[, 2]
plate = matches[, 3]
That is: match an underscore, followed by 'V' and one or more digits, followed by an underscore, followed by 'plate' and a single digit. You can extend the expression to also match the lane, mate and replicate.
Best of all, the above is vectorised so it works correctly with a vector of filenames.
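As a quick hedged illustration (my example, using the x vector and matches from above), the per-file values collapse to the distinct ones found:
unique(matches[, 2])  # "V4"  "V34"
unique(matches[, 3])  # "plate1" "plate2" "plate4"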
Note that, either way, you should not be using <<- here (this performs global rather than local assignment and is very rarely appropriate).

Getting name of an object from list in Map

Given the following data:
list_A <- list(data_cars = mtcars,
data_air = AirPassengers,
data_list = list(A = 1,
B = 2))
I would like to print names of objects available across list_A.
Example:
Map(
f = function(x) {
nm <- deparse(match.call()$x)
print(nm)
# nm object is only needed to properly name flat file that may be
# produced within Map call
if (any(class(x) == "list")) {
length(x) + 1
} else {
length(x) + 1e6
saveRDS(object = x,
file = tempfile(pattern = make.names(nm), fileext = ".RDS"))
}
},
list_A
)
returns:
[1] "dots[[1L]][[1L]]"
[1] "dots[[1L]][[2L]]"
[1] "dots[[1L]][[3L]]"
$data_cars
NULL
$data_air
NULL
$data_list
[1] 3
Desired results
I would like to get:
`data_cars`
`data_air`
`data_list`
Update
Following the comments, I have modified the example to make it more reflective of my actual needs, which are:
While using Map to iterate over list_A, I'm performing some operations on each element of the list.
Periodically I want to create a flat file with a name reflecting the name of the object that was processed.
In addition to list_A, there are also list_B, list_C and so forth. Therefore, I would like to avoid calling names(list) inside the function f of the Map, as I would have to modify it n times. The solution I'm looking for should lend itself to:
Map(function(l){...}, list_A)
So I can later replace list_A. It does not have to rely on Map. Any of the apply functions would do; the same applies to purrr-based solutions.
Alternative example
do_stuff <- function(x) {
nm <- deparse(match.call()$x)
print(nm)
# nm object is only needed to properly name flat file that may be
# produced within Map call
if (any(class(x) == "list")) {
length(x) + 1
} else {
length(x) + 1e6
saveRDS(object = x,
file = tempfile(pattern = make.names(nm), fileext = ".RDS"))
}
}
Map(do_stuff, list_A)
As per the notes below, I want to avoid having to modify do_stuff function as I will be looking to do:
Map(do_stuff, list_A)
Map(do_stuff, list_B)
Map(do_stuff, list_...)
We could wrap it into a function, and do it in two steps:
myFun <- function(myList){
# do stuff
res <- Map(
f = function(x) {
#do stuff
head(x)
},
myList)
# write to a file, here we might add control
# if list is empty do not output to a file
for(i in names(res)){
write.table(res[[ i ]], file = paste0(i, ".txt"))
}
}
myFun(list_A)
Would something like this work?
list_A2 <- Map(list, x = list_A, nm = names(list_A))
trace(do_stuff, quote({ nm <- x$nm; x <- x$x }), at = 3)
Map(do_stuff, list_A2)
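Since the question also allows purrr-based solutions, another hedged option (my sketch; note it does require do_stuff to accept the name as a second argument, which the question hoped to avoid) is purrr::imap(), which passes each element together with its name:
library(purrr)
do_stuff_named <- function(x, nm) {
  print(nm)
  if (inherits(x, "list")) {
    length(x) + 1
  } else {
    saveRDS(x, file = tempfile(pattern = make.names(nm), fileext = ".RDS"))
  }
}
imap(list_A, do_stuff_named)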

Dynamic variable names in plots, files and compatibility with loop

I am trying to write a function that makes a plot and saves it into a file automatically.
The trick I struggle with is to do both dynamically [plotname = varname & filename = varname],
and to make it compatible with calling it from a loop.
# Create data
my_df = cbind(uni=runif (100),norm=rnorm (100),bino=rbinom(100,20, 0.5)); head (my_df)
my_vec = my_df[,'uni'];
# How to make plot and file-name meaningful if you call the variable in a loop?
# if you call by name, the plotname is telling. It is similar what I would like to see.
hist(my_df[,'bino'])
for (plotit in colnames(my_df)) {
hist(my_df[,plotit])
print (plotit)
# this is already not meaningful
}
# step 2 write it into files
hist_auto <- function(variable, col ="gold1", ...) {
if ( length (variable) > 0 ) {
plotname = paste(substitute(variable), sep="", collapse = "_"); print (plotname); is (plotname)
# I would like to define plotname, and later tune it according to my needs
FnP = paste (getwd(),'/',plotname, '.hist.pdf', collapse = "", sep=""); print (FnP)
hist (variable, main = plotname)
#this is apparently not working: I do not get my_df[, "bino"] or anything similar
dev.copy2pdf (file=FnP )
} else { print ("var empty") }
}
hist_auto (my_vec)
# name works, and is meaningful [as much as the var name ... ]
hist_auto (my_df[,'bino'])
# name sort of works, but falls apart
assign (plotit, my_df[,'bino'])
hist_auto (get(plotit))
# name works, but meaningless
# Now in a loop
for (plotit in colnames(my_df)) {
my_df[,plotit]
hist(my_df[,plotit])
## name works, but meaningless and NOT UNIQUE > overwritten by next
}
for (plotit in colnames(my_df)) {
hist_auto(my_df[,plotit])
## name works, but meaningless and NOT UNIQUE > overwritten by next
}
for (plotit in colnames(my_df)) {
assign (plotit, my_df[,plotit])
hist_auto (get(plotit))
## name works, but meaningless and NOT UNIQUE > overwritten by next
}
My aim is to have a function that iterates over e.g. the columns of a matrix, and plots and saves each with a unique and meaningful name.
The solution will probably involve a smart combination of substitute(), parse(), eval() and paste(), but lacking a solid understanding I failed to figure it out.
My basis of experimentation was:
how to dynamically call a variable?
How about something like this? You may need to install.packages("ggplot2")
library(ggplot2)
my_df <- data.frame(uni=runif(100),
norm=rnorm(100),
bino=rbinom(100, 20, 0.5))
get_histogram <- function(df, varname, binwidth=1, save=T) {
stopifnot(varname %in% names(df))
title <- sprintf("Histogram of %s", varname)
p <- (ggplot(df, aes_string(x=varname)) +
geom_histogram(binwidth=binwidth) +
ggtitle(title))
if(save) {
filename <- sprintf("histogram_%s.png", gsub(" ", "_", varname))
ggsave(filename, p, width=10, height=8)
}
return(p)
}
for(var in names(my_df))
get_histogram(my_df, var, binwidth=0.5) # If you want to save them
get_histogram(my_df, "uni", binwidth=0.1, save=F) # If you want to look at a specific one
So I ended up with 2 functions, one that can iterate over data frames, and another that takes a single vectors. Using parts of Adrian's [thanks!] solution:
hist_dataframe <- function(df, colName, col = "gold1", ...) {
  stopifnot(colName %in% colnames(df))
  variable <- df[, colName]
  stopifnot(length(variable) > 1)
  plotname <- paste(substitute(df), '__', colName, sep = "")
  FnP <- paste(getwd(), '/', plotname, '.hist.pdf', collapse = "", sep = ""); print(FnP)
  hist(variable, main = plotname, col = col)
  dev.copy2pdf(file = FnP)
}
And the one for simple vectors stays as in Q.
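A hedged usage sketch (my example, assuming the corrected signature above): looping over the columns now produces one uniquely named plot and PDF per column.
for (colName in colnames(my_df)) {
  hist_dataframe(my_df, colName)   # e.g. my_df__uni.hist.pdf, my_df__norm.hist.pdf, ...
}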

R Language - Waiting for user input with scan or readline

I'm trying to get the user to input a few keywords for a query, and in my script I used either scan or readline. I tried it using the R-embedded script editor (Windows), but when I execute the code, it uses the next lines of my script as the standard input.
Here is my (part of) script
keywords <- scan(what=character(), nlines=1)
keywords <- paste(keywords, collapse=",")
keywords
And here is the output when executed from the editor
> keywords <- scan(what=character(), nlines=1)
1: keywords <- paste(keywords, collapse=",")
Read 4 items
> keywords
[1] "keywords" "<-" "paste(keywords," "collapse=\",\")"
Meanwhile, when I use the source() command, my user input is respected.
So is there any way to input some things while executing the code right from the R software?
This is how I use readLines:
FUN <- function(x) {
  if (missing(x)) {
    message("Uhh you forgot to enter x...\nPlease enter it now.")
    x <- readLines(n = 1)
  }
  x
}
FUN()
Or maybe something along these lines:
FUN2 <- function() {
  message("How many fruits will you buy?")
  x <- readLines(n = 1)
  message(sprintf("Good, you want to buy %s fruits.\nEnter them now.", x))
  y <- readLines(n = as.integer(x))
  paste(y, collapse = ", ")
}
FUN2()
EDIT: With your approach in Rgui...
FUN3 <- function(n=2) {
keywords <- scan(what=character(), nlines=n)
paste(keywords, collapse=",")
}
## > FUN3 <- function(n=2) {
## + keywords <- scan(what=character(), nlines=n)
## + paste(keywords, collapse=",")
## + }
## > FUN3()
## 1: apple
## 2: chicken
## Read 2 items
## [1] "apple,chicken"

Customizing R profile [duplicate]

I have always found startup profile files of other people both useful and instructive about the language. Moreover, while I have some customization for Bash and Vim, I have nothing for R.
For example, one thing I always wanted is different colors for input and output text in a window terminal, and maybe even syntax highlighting.
Here is mine. It won't help you with the coloring but I get that from ESS and Emacs...
options("width"=160) # wide display with multiple monitors
options("digits.secs"=3) # show sub-second time stamps
r <- getOption("repos") # hard code the US repo for CRAN
r["CRAN"] <- "http://cran.us.r-project.org"
options(repos = r)
rm(r)
## put something like this in your .Rprofile to customize the defaults
setHook(packageEvent("grDevices", "onLoad"),
        function(...) grDevices::X11.options(width=8, height=8,
                                             xpos=0, pointsize=10,
                                             #type="nbcairo")) # Cairo device
                                             #type="cairo"))   # other Cairo dev
                                             type="xlib"))     # old default
## from the AER book by Zeileis and Kleiber
options(prompt="R> ", digits=4, show.signif.stars=FALSE)
options("pdfviewer"="okular") # on Linux, use okular as the pdf viewer
I hate to type the full words 'head', 'summary', 'names' every time, so I use aliases.
You can put aliases into your .Rprofile file, but you have to use the full path to the function (e.g. utils::head) otherwise it won't work.
# aliases
s <- base::summary
h <- utils::head
n <- base::names
EDIT: to answer your question, you can use the colorout package to have different colors in the terminal. Cool! :-)
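A hedged sketch of wiring that into a .Rprofile (my example; the load is guarded so it only runs when the package is actually installed and you are in an interactive terminal session):
if (interactive() && Sys.getenv("TERM") != "" &&
    requireNamespace("colorout", quietly = TRUE)) {
  library(colorout)   # colored input/output in the terminal
}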
options(stringsAsFactors=FALSE)
Although I don't actually have that in my .Rprofile, because it might break my coauthors' code, I wish it were the default. Why?
1) Character vectors use less memory (but only barely);
2) More importantly, we would avoid problems such as:
> x <- factor(c("a","b","c"))
> x
[1] a b c
Levels: a b c
> x <- c(x, "d")
> x
[1] "1" "2" "3" "d"
and
> x <- factor(c("a","b","c"))
> x[1:2] <- c("c", "d")
Warning message:
In `[<-.factor`(`*tmp*`, 1:2, value = c("c", "d")) :
invalid factor level, NAs generated
Factors are great when you need them (e.g. implementing ordering in graphs) but a nuisance most of the time.
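For contrast, a small hedged sketch (my example): with stringsAsFactors = FALSE the column stays character, so the replacement above just works.
d <- data.frame(x = c("a", "b", "c"), stringsAsFactors = FALSE)
d$x[1:2] <- c("c", "d")   # no "invalid factor level" warning, no NAs
d$x
# [1] "c" "d" "c"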
I like saving my R command history and having it available each time I run R:
In the shell or .bashrc:
export R_HISTFILE=~/.Rhistory
in .Rprofile:
.Last <- function() {
if (!any(commandArgs()=='--no-readline') && interactive()){
require(utils)
try(savehistory(Sys.getenv("R_HISTFILE")))
}
}
Here are two functions I find handy for working with windows.
The first converts the \s to /.
.repath <- function() {
cat('Paste windows file path and hit RETURN twice')
x <- scan(what = "")
xa <- gsub('\\\\', '/', x)
writeClipboard(paste(xa, collapse=" "))
cat('Here\'s your de-windowsified path. (It\'s also on the clipboard.)\n', xa, '\n')
}
The second opens the working directory in a new explorer window.
getw <- function() {
suppressWarnings(shell(paste("explorer", gsub('/', '\\\\', getwd()))))
}
Here's mine. I always use the main cran repository, and have code to make it easy to source in-development package code.
.First <- function() {
library(graphics)
options("repos" = c(CRAN = "http://cran.r-project.org/"))
options("device" = "quartz")
}
packages <- list(
"describedisplay" = "~/ggobi/describedisplay",
"linval" = "~/ggobi/linval",
"ggplot2" = "~/documents/ggplot/ggplot",
"qtpaint" = "~/documents/cranvas/qtpaint",
"tourr" = "~/documents/tour/tourr",
"tourrgui" = "~/documents/tour/tourr-gui",
"prodplot" = "~/documents/categorical-grammar"
)
l <- function(pkg) {
pkg <- tolower(deparse(substitute(pkg)))
if (is.null(packages[[pkg]])) {
path <- file.path("~/documents", pkg, pkg)
} else {
path <- packages[pkg]
}
source(file.path(path, "load.r"))
}
test <- function(path) {
path <- deparse(substitute(path))
source(file.path("~/documents", path, path, "test.r"))
}
I've got this, more dynamic trick to use full terminal width, which tries to read from the COLUMNS environment variable (on Linux):
tryCatch(
  {options(width = as.integer(Sys.getenv("COLUMNS")))},
  error = function(err) {
    write("Can't get your terminal width. Put ``export COLUMNS'' in your .bashrc. Or something. Setting width to 120 chars",
          stderr())
    options(width = 120)
  }
)
This way R will use the full width even as you resize your terminal window.
Most of my personal functions and loaded libraries are in the Rfunctions.r script
source("c:\\data\\rprojects\\functions\\Rfunctions.r")
.First <- function(){
cat("\n Rrrr! The statistics program for Pirates !\n\n")
}
.Last <- function(){
cat("\n Rrrr! Avast Ye, YO HO!\n\n")
}
#===============================================================
# Tinn-R: necessary packages
#===============================================================
library(utils)
necessary = c('svIDE', 'svIO', 'svSocket', 'R2HTML')
if(!all(necessary %in% installed.packages()[, 'Package']))
install.packages(c('SciViews', 'R2HTML'), dep = T)
options(IDE = 'C:/Tinn-R/bin/Tinn-R.exe')
options(use.DDE = T)
library(svIDE)
library(svIO)
library(svSocket)
library(R2HTML)
guiDDEInstall()
shell(paste("mkdir C:\\data\\rplots\\plottemp", gsub('-','',Sys.Date()), sep=""))
pldir <- paste("C:\\data\\rplots\\plottemp", gsub('-','',Sys.Date()), sep="")
plot.str <-c('savePlot(paste(pldir,script,"\\BeachSurveyFreq.pdf",sep=""),type="pdf")')
Here's from my ~/.Rprofile, designed for Mac and Linux.
These make errors easier to see.
options(showWarnCalls=T, showErrorCalls=T)
I hate the CRAN menu choice, so set to a good one.
options(repos=c("http://cran.cnr.Berkeley.edu","http://cran.stat.ucla.edu"))
More history!
Sys.setenv(R_HISTSIZE='100000')
The following is for running on Mac OSX from the terminal (which I greatly prefer to R.app because it's more stable, and you can organize your work by directory; also make sure to get a good ~/.inputrc). By default, you get an X11 display, which doesn't look as nice; this instead gives a quartz display same as the GUI. The if statement is supposed to catch the case when you're running R from the terminal on Mac.
f = pipe("uname")
if (.Platform$GUI == "X11" && readLines(f)=="Darwin") {
# http://www.rforge.net/CarbonEL/
library("grDevices")
library("CarbonEL")
options(device='quartz')
Sys.unsetenv("DISPLAY")
}
close(f); rm(f)
And preload a few libraries,
library(plyr)
library(stringr)
library(RColorBrewer)
if (file.exists("~/util.r")) {
source("~/util.r")
}
where util.r is a random bag of stuff I use, under flux.
Also, since other people were mentioning console width, here's how I do it.
if ( (numcol <-Sys.getenv("COLUMNS")) != "") {
numcol = as.integer(numcol)
options(width= numcol - 1)
} else if (system("stty -a &>/dev/null") == 0) {
# mac specific? probably bad in the R GUI too.
numcol = as.integer(sub(".* ([0-9]+) column.*", "\\1", system("stty -a", intern=T)[1]))
if (numcol > 0)
options(width= numcol - 1 )
}
rm(numcol)
This actually isn't in .Rprofile because you have to re-run it every time you resize the terminal window. I have it in util.r then I just source it as necessary.
Here are mine:
.First <- function () {
options(device="quartz")
}
.Last <- function () {
if (!any(commandArgs() == '--no-readline') && interactive()) {
require(utils)
try(savehistory(Sys.getenv("R_HISTFILE")))
}
}
# Slightly more flexible than as.Date
# my.as.Date("2009-01-01") == my.as.Date(2009, 1, 1) == as.Date("2009-01-01")
my.as.Date <- function (a, b=NULL, c=NULL, ...) {
if (class(a) != "character")
return (as.Date(sprintf("%d-%02d-%02d", a, b, c)))
else
return (as.Date(a))
}
# Some useful aliases
cd <- setwd
pwd <- getwd
lss <- dir
asd <- my.as.Date # examples: asd("2009-01-01") == asd(2009, 1, 1) == as.Date("2009-01-01")
last <- function (x, n=1, ...) tail(x, n=n, ...)
# Set proxy for all web requests
Sys.setenv(http_proxy="http://192.168.0.200:80/")
# Search RPATH for file <fn>. If found, return full path to it
search.path <- function(fn,
paths = strsplit(chartr("\\", "/", Sys.getenv("RPATH")), split =
switch(.Platform$OS.type, windows = ";", ":"))[[1]]) {
for(d in paths)
if (file.exists(f <- file.path(d, fn)))
return(f)
return(NULL)
}
# If loading in an environment that doesn't respect my RPATH environment
# variable, set it here
if (Sys.getenv("RPATH") == "") {
Sys.setenv(RPATH=file.path(path.expand("~"), "Library", "R", "source"))
}
# Load commonly used functions
if (interactive())
source(search.path("afazio.r"))
# If no R_HISTFILE environment variable, set default
if (Sys.getenv("R_HISTFILE") == "") {
Sys.setenv(R_HISTFILE=file.path("~", ".Rhistory"))
}
# Override q() to not save by default.
# Same as saying q("no")
q <- function (save="no", ...) {
quit(save=save, ...)
}
# ---------- My Environments ----------
#
# Rather than starting R from within different directories, I prefer to
# switch my "environment" easily with these functions. An "environment" is
# simply a directory that contains analysis of a particular topic.
# Example usage:
# > load.env("markets") # Load US equity markets analysis environment
# > # ... edit some .r files in my environment
# > reload() # Re-source .r/.R files in my environment
#
# On next startup of R, I will automatically be placed into the last
# environment I entered
# My current environment
.curr.env = NULL
# File contains name of the last environment I entered
.last.env.file = file.path(path.expand("~"), ".Rlastenv")
# Parent directory where all of my "environment"s are contained
.parent.env.dir = file.path(path.expand("~"), "Analysis")
# Create parent directory if it doesn't already exist
if (!file.exists(.parent.env.dir))
dir.create(.parent.env.dir)
load.env <- function (string, save=TRUE) {
# Load all .r/.R files in <.parent.env.dir>/<string>/
cd(file.path(.parent.env.dir, string))
for (file in lss()) {
if (substr(file, nchar(file)-1, nchar(file)+1) %in% c(".r", ".R"))
source(file)
}
.curr.env <<- string
# Save current environment name to file
if (save == TRUE) writeLines(.curr.env, .last.env.file)
# Let user know environment switch was successful
print (paste(" -- in ", string, " environment -- "))
}
# "reload" current environment.
reload <- resource <- function () {
if (!is.null(.curr.env))
load.env(.curr.env, save=FALSE)
else
print (" -- not in environment -- ")
}
# On startup, go straight to the environment I was last working in
if (interactive() && file.exists(.last.env.file)) {
load.env(readLines(.last.env.file))
}
sink(file = 'R.log', split=T)
options(scipen=5)
.ls.objects <- function (pos = 1, pattern, order.by = "Size", decreasing=TRUE, head = TRUE, n = 10) {
# based on postings by Petr Pikal and David Hinds to the r-help list in 2004
# modified by: Dirk Eddelbuettel (http://stackoverflow.com/questions/1358003/tricks-to-manage-the-available-memory-in-an-r-session)
# I then gave it a few tweaks (show size as megabytes and use defaults that I like)
# a data frame of the objects and their associated storage needs.
napply <- function(names, fn) sapply(names, function(x)
fn(get(x, pos = pos)))
names <- ls(pos = pos, pattern = pattern)
obj.class <- napply(names, function(x) as.character(class(x))[1])
obj.mode <- napply(names, mode)
obj.type <- ifelse(is.na(obj.class), obj.mode, obj.class)
obj.size <- napply(names, object.size) / 10^6 # megabytes
obj.dim <- t(napply(names, function(x)
as.numeric(dim(x))[1:2]))
vec <- is.na(obj.dim)[, 1] & (obj.type != "function")
obj.dim[vec, 1] <- napply(names, length)[vec]
out <- data.frame(obj.type, obj.size, obj.dim)
names(out) <- c("Type", "Size", "Rows", "Columns")
out <- out[order(out[[order.by]], decreasing=decreasing), ]
if (head)
out <- head(out, n)
out
}
Make data.frames display somewhat like 'head', only without having to type 'head'
print.data.frame <- function(df) {
if (nrow(df) > 10) {
base::print.data.frame(head(df, 5))
cat("----\n")
base::print.data.frame(tail(df, 5))
} else {
base::print.data.frame(df)
}
}
(From How to make 'head' be applied automatically to output? )
I often have a chain of debug calls I need to make, and commenting and uncommenting them can be very tedious. With the help of the SO community, I went for the following solution and inserted this into my .Rprofile.site. The # BROWSER comment is there for my Eclipse Tasks so that I have an overview of browser calls in the Task View window.
# turn debugging on or off
# place "browser(expr = isTRUE(getOption("debug"))) # BROWSER" in your function
# and turn debugging on or off by bugon() or bugoff()
bugon <- function() options("debug" = TRUE)
bugoff <- function() options("debug" = FALSE) #pun intended
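A hedged usage sketch (my example): drop the guarded browser() line into any function you may want to step into, then toggle with bugon()/bugoff().
myfun <- function(x) {
  browser(expr = isTRUE(getOption("debug"))) # BROWSER
  x^2
}
bugon()    # myfun(2) now drops into the browser at that line
bugoff()   # calls run straight through again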
Mine is not too fancy:
# So the mac gui can find latex
Sys.setenv("PATH" = paste(Sys.getenv("PATH"),"/usr/texbin",sep=":"))
#Use last(x) instead of x[length(x)], works on matrices too
last <- function(x) { tail(x, n = 1) }
#For tikzDevice caching
options( tikzMetricsDictionary='/Users/cameron/.tikzMetricsDictionary' )
setwd("C://path//to//my//prefered//working//directory")
library("ggplot2")
library("RMySQL")
library("foreign")
answer <- readline("What database would you like to connect to? ")
con <- dbConnect(MySQL(),user="root",password="mypass", dbname=answer)
I do a lot of work from MySQL databases, so connecting right away is a godsend. I only wish there was a way of listing the available databases so I wouldn't have to remember all the different names.
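For that wish, a hedged sketch (my addition, assuming the same RMySQL credentials as above): MySQL itself can list the databases via SHOW DATABASES, which could be printed before the readline() prompt.
con0 <- dbConnect(MySQL(), user = "root", password = "mypass")
print(dbGetQuery(con0, "SHOW DATABASES"))   # list the available databases
dbDisconnect(con0)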
Stephen Turner's post on .Rprofiles has several useful aliases and starter functions.
I find myself using his ht and hh often.
#ht==headtail, i.e., show the first and last 10 items of an object
ht <- function(d) rbind(head(d,10),tail(d,10))
# Show the first 5 rows and first 5 columns of a data frame or matrix
hh <- function(d) d[1:5,1:5]
Here's mine, including some of the mentioned ideas.
Two things you might want to look at:
.set.width() / w() updates your print width to that of the terminal. Unfortunately I did not find a way to do this automatically on terminal resize - the R documentation mentions this is done by some R interpreters.
history is saved every time together with a timestamp and the working directory
.set.width <- function() {
  cols <- as.integer(Sys.getenv("COLUMNS"))
  if (is.na(cols) || cols > 10000 || cols < 10) {
    options(width=100)
  } else {
    options(width=cols)
  }
}
.First <- function() {
options(digits.secs=3) # show sub-second time stamps
options(max.print=1000) # do not print more than 1000 lines
options("report" = c(CRAN="http://cran.at.r-project.org"))
options(prompt="R> ", digits=4, show.signif.stars=FALSE)
}
# aliases
w <- .set.width
.Last <- function() {
if (!any(commandArgs()=='--no-readline') && interactive()){
timestamp(,prefix=paste("##------ [",getwd(),"] ",sep=""))
try(savehistory("~/.Rhistory"))
}
}
I use the following to get cacheSweave (or pgfSweave) to work with the "Compile PDF" button in RStudio:
library(cacheSweave)
assignInNamespace("RweaveLatex", cacheSweave::cacheSweaveDriver, "utils")
Mine includes options(menu.graphics=FALSE) because I like to Disable/suppress tcltk popup for CRAN mirror selection in R.
Here's mine. Nothing too innovative. Thoughts on why particular choices:
I went with setting a default for stringsAsFactors because I find
it extremely draining to pass it as an argument each time I read a CSV in. That said, it has already caused me some minor vexation when using code written on my usual computer on a computer that did not have my .Rprofile. I'm keeping it, though, as the trouble it has caused pales in comparison to the trouble that not having it set every day used to cause.
If you don't load the utils package before options(error=recover), it cannot find recover when placed inside an interactive() block.
I used .db for my dropbox setting rather than options(dropbox=...) because I use it all the time inside file.path and it saves much typing. The leading . keeps it from appearing with ls().
Without further ado:
if(interactive()) {
options(stringsAsFactors=FALSE)
options(max.print=50)
options(repos="http://cran.mirrors.hoobly.com")
}
.db <- "~/Dropbox"
# `=` <- function(...) stop("Assignment by = disabled, use <- instead")
options(BingMapsKey="blahblahblah") # Used by taRifx.geo::geocode()
.First <- function() {
if(interactive()) {
require(functional)
require(taRifx)
require(taRifx.geo)
require(ggplot2)
require(foreign)
require(R.utils)
require(stringr)
require(reshape2)
require(devtools)
require(codetools)
require(testthat)
require(utils)
options(error=recover)
}
}
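A hedged usage sketch of the .db shortcut above (hypothetical path, my example):
# Reads ~/Dropbox/projects/analysis/data.csv without retyping the Dropbox root
dat <- read.csv(file.path(.db, "projects", "analysis", "data.csv"))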
Here's a little snippet for use when exporting tables to LaTeX. It changes all the column names to math mode for the many reports I write. The rest of my .Rprofile is pretty standard and mostly covered above.
# Puts $dollar signs in front and behind all column names col_{sub} -> $col_{sub}$
amscols<-function(x){
colnames(x) <- paste("$", colnames(x), "$", sep = "")
x
}
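A hedged usage sketch (my example, assuming the xtable package): pass the renamed data frame to xtable and leave the column names unsanitized so the $...$ survive in the LaTeX output.
library(xtable)
df <- data.frame(x_1 = 1:3, y_2 = 4:6)
print(xtable(amscols(df)), sanitize.colnames.function = identity)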
I set my lattice color theme in my profile. Here are two other tweaks I use:
# Display working directory in the titlebar
# Note: This causes demo(graphics) to fail
utils::setWindowTitle(base::getwd())
utils::assignInNamespace("setwd",function(dir) {.Internal(setwd(dir));setWindowTitle(base::getwd())},"base")
# Don't print more than 2000 lines
options(max.print=2000)
I have an environment variable R_USER_WORKSPACE which points to the top directory of my packages. In .Rprofile I define a function devlib which sets the working directory (so that data() works) and sources all .R files in the R subdirectory. It is quite similar to Hadley's l() function above.
devlib <- function(pkg) {
setwd(file.path(Sys.getenv("R_USER_WORKSPACE", "."), deparse(substitute(pkg)), "dev"))
sapply(list.files("R", pattern=".r$", ignore.case=TRUE, full.names=TRUE), source)
invisible(NULL)
}
.First <- function() {
setwd(Sys.getenv("R_USER_WORKSPACE", "."))
options("repos" = c(CRAN = "http://mirrors.softliste.de/cran/", CRANextra="http://www.stats.ox.ac.uk/pub/RWin"))
}
.Last <- function() update.packages(ask="graphics")
I found two functions really necessary: First, when I have set debug() on several functions and have resolved the bug, I want to undebug() all of them at once - not one by one. The undebug_all() function added as the accepted answer here is the best.
Second, when I have defined many functions and am looking for a specific variable name, it's hard to find it among all the results of ls(), which include the function names. The lsnofun() function posted here is really good.
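For reference, hedged sketches of what such helpers can look like (my approximations, not the exact code from the linked answers):
# Remove the debug flag from every debugged function in the global environment
undebug_all <- function(env = globalenv()) {
  for (nm in ls(envir = env)) {
    obj <- get(nm, envir = env)
    if (is.function(obj) && isdebugged(obj)) undebug(obj)
  }
}
# ls() without the function names
lsnofun <- function(env = globalenv()) {
  nms <- ls(envir = env)
  nms[!vapply(nms, function(nm) is.function(get(nm, envir = env)), logical(1))]
}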

Resources