I have 11 lists of different length, imported into R as p1,p2,p3,...,p11. Now I want to get the rollmean (library TTR) from all lists and name the result p1y,p2y,...,p11y.
This seems to be the job for a loop, but I read that this is often not good practice in R. I tried something (foolish) like
sample=10
for (i in 1:11){
paste("p",i,"y",sep="")<-rollmean(paste("p",i,sep=""),sample)
}
which does not work.
I also tried to use it in combination with assign(), but as I understand assign can only take a variable and a single value.
As always it strikes me that I am missing some fundamental function of R.
As Manuel pointed out, your life will be easier if you combine the variables into a list. For this, you want mget (short for "multiple get").
var_names <- paste("p", 1:11, sep = "")
p_all <- mget(var_names, envir = globalenv())
Now simply use lapply to call rollmean on each element of your list.
sample <- 10
rolling_means <- lapply(p_all, rollmean, sample)
(Also, consider renaming sample to something that isn't already the name of a base R function.)
I suggest leaving the answers as a list, but if you really like the idea of having separate rolling mean variables to match the separate p1, ..., p11 variables, then use list2env.
names(rolling_means) <- paste(var_names, "y", sep = "")
list2env(rolling_means, envir = globalenv())
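The mget / lapply / list2env steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: the three short vectors stand in for the imported p1, ..., p11, and roll_mean_base is a hypothetical base-R substitute for rollmean so that the sketch runs without extra packages. It also uses environment() (the current environment, which is the global environment at top level) where the answer uses globalenv().

```r
# Hypothetical stand-ins for the imported vectors (different lengths).
p1 <- c(1, 2, 3, 4, 5)
p2 <- c(10, 20, 30, 40)
p3 <- c(2, 4, 6, 8, 10, 12)

# Assumed helper, not part of any package: mean of every run of k
# consecutive values. embed() lays the windows out as matrix rows.
roll_mean_base <- function(x, k) rowMeans(embed(x, k))

window_size <- 3
var_names <- paste0("p", 1:3)

# Collect the separate variables into one named list.
p_all <- mget(var_names, envir = environment())

# Apply the rolling mean to every element of the list.
rolling_means <- lapply(p_all, roll_mean_base, window_size)

# Optional: copy the results back out as p1y, p2y, p3y.
names(rolling_means) <- paste0(var_names, "y")
list2env(rolling_means, envir = environment())

p1y
```

With real data you would pass rollmean in place of roll_mean_base, exactly as in the answer above.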
You could group your lists into one and do the following
sample <- 10
mylist <- list(p1, p2, p3, p4, p5, p6, p7, p8, p9, p10, p11)
for(i in 1:11) assign(paste('p',i,'y',sep=''), rollmean(mylist[[i]], sample))
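When indexing into a list inside a loop like this, the choice between single and double brackets matters. A small sketch with made-up data shows the difference: single brackets return a sub-list, double brackets return the element itself, which is what a numeric function needs.

```r
# Made-up data for illustration only.
mylist <- list(c(1, 2, 3), c(4, 5))

one_bracket <- mylist[1]    # still a list, of length 1
two_bracket <- mylist[[1]]  # the numeric vector stored in slot 1

class(one_bracket)  # "list"
class(two_bracket)  # "numeric"
mean(two_bracket)   # 2
```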
This can be done with get and do.call (see ?get and ?do.call).
x1<-1:3
x2 <- seq(3.5,5.5,1)
for (i in 1:2) {
  sx <- do.call("sin", list(get(paste0("x", i))))
  cat(sx, "\n")
}
Sloppy example, but you get the idea, I hope.
I am fairly new to R and I like to understand the concept of using the "apply"-family functions to avoid loop and custom functions. Unfortunately I am failing at the very first exercise.
Here is my minimum reproducible example:
x <- data.frame(Hours=cbind(c(rep(5,5),rep(6,5),rep(7,5),rep(8,5),rep(9,5))),Price=c(cbind(seq(48,50.4, by=0.1),seq(48,52.8, by=0.2),seq(48,55.2, by=0.3),seq(48,57.8, by=0.4),seq(48,60.0, by=0.5))),Volume=seq(10000:10024))
f1 <- approxfun(x$Volume,x$Price, rule=2)
plot(x$Volume, x$Price)
curve(f1, add=TRUE)
However, I would like to perform approxfun() for every unique hour in x$Hours.
How would I have to approach this?
Thank you for your help.
This solution was provided by bunk.
The idiom is split/apply/combine: split the data, apply the function, combine the results. R, the *plyr packages, data.table, etc. have many functions to do this:
fns <- lapply(split(x, x$Hours), function(dat) approxfun(dat$Volume, dat$Price, rule=2))
plot(x$Volume, x$Price)
cols <- 1
for(fn in fns) curve(fn, add=TRUE, col=(cols<<-cols+1))
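To see what the split/apply step produces without plotting anything, here is a minimal sketch on a tiny made-up data frame (dat and its values are assumptions chosen so the interpolated results are easy to check by hand). The result is a named list with one interpolation function per hour.

```r
# Made-up data: two hours, three (Volume, Price) points each.
dat <- data.frame(
  Hours  = rep(c(5, 6), each = 3),
  Volume = rep(c(1, 2, 3), times = 2),
  Price  = c(10, 20, 30, 100, 200, 300)
)

# Split by hour, then build one approxfun() per piece.
fns <- lapply(split(dat, dat$Hours), function(d) {
  approxfun(d$Volume, d$Price, rule = 2)
})

names(fns)        # "5" "6"
fns[["5"]](1.5)   # 15: halfway between 10 and 20
fns[["6"]](2.5)   # 250: halfway between 200 and 300
fns[["6"]](10)    # 300: rule = 2 clamps to the nearest end value
```

The list names come from the split factor, so each function can be looked up by its hour as a string.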
I'm trying to create a y-axis label that is generated by pasting together two vectors that are the same length. The catch is that the first element needs to be italicized. Here's an example...
n <- 1:5
t <- LETTERS[1:5]
together <- paste(t, n)
plot(x=1:5, y=1:5, yaxt="n")
axis(2, at=1:5, label=together, las=2)
So, I'd like the t elements italicized. I've looked around expression, bquote, and substitute and am not making much progress. Anyone got a hint to help me here?
This is a bit tricky because the expression function expects a list of expressions. Therefore you need to convert the strings returned by paste to a list of unevaluated expressions. One way is like this
together <- do.call(expression, as.list(parse(text = paste0("italic(", t, ")~", n))))
You could use bquote
together <- as.expression(sapply(seq_along(t), function(i)
bquote(italic(.(t[i]))*.(n[i]))))
Or using for loop
v1 <- c()
for(i in seq_along(t)){
v1 <- c(v1, bquote(italic(.(t[i]))*.(n[i])))
}
together <- as.expression(v1)
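A slightly shorter variant of the first approach is sketched below. It relies on parse(text = ...) already returning an expression vector with one element per input string, so the do.call wrapper may not be needed. The sketch only builds and inspects the labels; no plot is drawn.

```r
n <- 1:5
t <- LETTERS[1:5]

# Build strings such as "italic(A)~1" and parse them into plotmath expressions.
label_strings <- paste0("italic(", t, ")~", n)
together <- parse(text = label_strings)

is.expression(together)  # TRUE
length(together)         # 5
```

The result can then be passed as the labels in the axis() call from the question.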
My memory is getting clogged by a bunch of intermediate files (call them temp1, temp2, etc.), and I would like to know if it is possible to remove them from memory without doing repeated rm calls (i.e. rm(temp1), rm(temp2))?
I tried rm(list(temp1, temp2, etc.)), but that doesn't seem to work.
Make the list a character vector (not a vector of names)
rm(list = c('temp1','temp2'))
or
rm(temp1, temp2)
Another solution: rm(list=ls(pattern="temp")) removes all objects whose names match the pattern.
Or using regular expressions
"rmlike" <- function(...) {
names <- sapply(
match.call(expand.dots = FALSE)$..., as.character)
names = paste(names,collapse="|")
Vars <- ls(1)
r <- Vars[grep(paste("^(",names,").*",sep=""),Vars)]
rm(list=r,pos=1)
}
rmlike(temp)
Another variation (expanding on @mnel's answer), useful if you have many temp'x' variables. Here, n is the number of temp variables present:
rm(list = c(paste("temp",c(1:n),sep="")))
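One caveat with pattern-based removal is that an unanchored pattern such as "temp" also matches names like attempt or template. The sketch below works in a scratch environment (scratch and the object names are made up for illustration) so that nothing in your workspace is touched, and it anchors the regular expression to names of the form temp followed by digits.

```r
# A scratch environment so the example cannot delete real objects.
scratch <- new.env()
assign("temp1", 1, envir = scratch)
assign("temp2", 2, envir = scratch)
assign("attempt", 3, envir = scratch)
assign("template", 4, envir = scratch)

# Anchored pattern: only names that are exactly "temp" plus digits.
doomed <- ls(scratch, pattern = "^temp[0-9]+$")
rm(list = doomed, envir = scratch)

ls(scratch)  # "attempt" "template"
```

In an interactive session the envir arguments can be dropped, in which case ls() and rm() act on the current environment.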
It seems like every question involving loops in R is met with "Loops are bad" and "You're doing it wrong" with advice to use list, or tapply or whatnot.
I'm learning R, and have implemented the following loop to create image files for each factor level, with the # of factor levels changing each time I run it:
for(i in unique(df$factor)) {
lnam <- paste("test_", i, sep="")
assign(lnam, subset(df, factor==i))
lfile <- paste(lnam, ".png", sep="")
png(file = lfile, bg="transparent")
with(get(lnam), hist(x, main = paste("Histogram of x for ", i, " factor", sep="")))
dev.off()
}
This works. I want to expand it to perhaps run various tests on those subgroups (also output to files), etc.
Is this a valid and legitimate use of loops? Or is there a preferred way to skin this cat?
There's nothing wrong with loops in general. Sometimes, particularly when you're working with files or calling functions for their side-effects rather than their outputs, loops can be easier to follow than *apply calls. However, when you use a loop to simulate an operation that can be vectorised, it's often much slower, hence the recommendation to avoid them.
Re your specific example, though, I'd make the following comments:
If you want to do something for each level in a factor, it's more straightforward to use levels(factor) rather than unique(factor).
You don't need to create a new data frame specifically for each factor level.
With that in mind:
for(i in levels(df$factor))
{
  lf <- paste("test_", i, ".png", sep="")
  png(file=lf, bg="transparent")
  with(subset(df, factor == i), hist(x, ...))  # "..." stands for your remaining hist() arguments
  dev.off()
}
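The two ways of enumerating groups are not quite interchangeable, which a small made-up factor illustrates. levels() returns every declared level in level order, including levels with no observations, while unique() returns only the values present, in order of appearance.

```r
# Made-up factor with one unused level ("c").
f <- factor(c("b", "a", "b"), levels = c("a", "b", "c"))

all_levels  <- levels(f)                # "a" "b" "c"
seen_values <- as.character(unique(f))  # "b" "a"
used_levels <- levels(droplevels(f))    # "a" "b"
```

If some levels may be empty, looping over levels() would hand an empty subset to hist(), so dropping unused levels first (as in the last line) is one way to guard against that.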
In this case, a reasonable option is to use split to convert your data frame into a list of data frames, each containing the subset of rows with a specific factor level.
split_df <- split(df, df$factor)
As Colin mentioned, paste can be vectorised, so you only need to call it once.
lfile <- paste("test_", names(split_df), ".png", sep = "")
Group all your plotting code into a function.
draw_and_save_histogram <- function(data, file)
{
png(file)
with(data, hist(x))
dev.off()
}
Now you can more easily compare the difference between a plain loop and an *apply function (in this case mapply, since we need two inputs).
for(i in seq_along(split_df))
{
draw_and_save_histogram(split_df[[i]], lfile[i])
}
mapply(
draw_and_save_histogram,
split_df,
lfile
)
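For a version of this comparison that runs anywhere, the sketch below swaps the plotting function for a stand-in that writes a one-line text summary per group into tempdir(). The data frame, the save_summary function and the file names are all assumptions made for illustration; the loop and the mapply call have the same shape as above.

```r
# Made-up data: two factor levels.
df <- data.frame(
  factor = factor(c("a", "a", "b", "b", "b")),
  x      = c(1, 2, 3, 4, 5)
)

split_df  <- split(df, df$factor)
out_files <- file.path(tempdir(), paste0("test_", names(split_df), ".txt"))

# Stand-in for draw_and_save_histogram: called for its side effect (a file).
save_summary <- function(data, file) {
  writeLines(sprintf("n = %d, mean = %.2f", nrow(data), mean(data$x)), file)
  invisible(file)
}

# Plain loop.
for (i in seq_along(split_df)) {
  save_summary(split_df[[i]], out_files[i])
}

# Equivalent mapply call; invisible() hides the returned file names.
invisible(mapply(save_summary, split_df, out_files))

readLines(out_files[1])  # "n = 2, mean = 1.50"
```

Both forms call the function once per group with matching data and file name; which one reads better is largely a matter of taste when the function is used only for its side effects.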
Rather than drawing lots of histograms to be saved in different files, it is often preferable to draw one plot with several panels using lattice or ggplot2.
library(lattice)
histogram(~ x | factor, df)
library(ggplot2)
ggplot(df, aes(x)) + geom_histogram() + facet_wrap(~ factor)
R's named vectors are incredibly handy, however, I want to combine two vectors which contain the estimates of coefficients and the standard errors for those estimates, and both vectors have the same names:
> coefficients_estimated
         y0         Xit (Intercept)
        1.1         2.2         3.3
> ses_estimated
         y0         Xit (Intercept)
        .04         .11        .007
This would be easy to solve if I knew what order the elements were in for sure, but this isn't guaranteed in my script, so I can't simply do names(ses_estimated) <- whatever. All I want to do is add either "coef" or "se" to the end of each name, and to do this, I've come up with what I think is a pretty ugly hack:
names(coefficients_estimated) <- sapply(names(coefficients_estimated),
function(name)return(paste(name,"coef",sep=""))
)
names(ses_estimated) <- sapply(names(ses_estimated),
function(name)return(paste(name,"se",sep=""))
)
Is there an idiomatic way to do this? Or am I going to have to stick with what I've written?
Assuming you're combining the vectors using c(), I don't believe there's a "pure" way to do this.
In your code above, you don't even need to use sapply. Just paste(names(coefficients_estimated), "coef", sep="") will get you the same thing. You can get a little simpler still by applying the names to the combined vector vs. separately.
If these were data frames, the suffixes argument of merge() would be exactly what you want.
setNames is helpful here:
# Make fake data for test:
namedData <- function(x) setNames(x, c("y0", "Xit", "(Intercept)"))
coefficients_estimated <- namedData(c(1.1, 2.2, 3.3))
ses_estimated <- namedData(c(.04, .11, .007))
# Do it:
withNameSuffix <- function(obj, suffix) setNames(obj, paste(names(obj), suffix, sep=""))
combined <- c(withNameSuffix(coefficients_estimated, "coef"),
withNameSuffix(ses_estimated, "se"))
coef_ses_estimated <- c(coefficients_estimated,ses_estimated)
names(coef_ses_estimated) <- as.vector(outer(names(coefficients_estimated),
c("coef","se"),paste,sep="_"))
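Since the question stresses that the element order is not guaranteed, one possible safeguard is to reorder the standard errors by the coefficient names before suffixing. The sketch below assumes both vectors carry exactly the same set of names; the values and the deliberately shuffled order are made up for illustration.

```r
coefs <- c(y0 = 1.1, Xit = 2.2, "(Intercept)" = 3.3)
# Same names, deliberately in a different order.
ses <- c("(Intercept)" = 0.007, y0 = 0.04, Xit = 0.11)

# Align by name rather than by position.
ses <- ses[names(coefs)]

# paste0() is vectorised over the names, so no sapply is needed.
combined <- c(
  setNames(coefs, paste0(names(coefs), "_coef")),
  setNames(ses,   paste0(names(ses),   "_se"))
)

names(combined)
```

After the alignment step, each "_se" entry sits in the same relative position as its "_coef" counterpart, so positional approaches such as the outer() one above also give correctly matched names.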