I would like to add a column to every data frame in my R environment, all of which have the same format.
I can create the column I want with a simple assignment like this:
x[,8] <- x[,4]/(x[,4]+x[,5])
When I try to put this in a for loop that will iterate over every object in the environment, I get an error.
control_data <- ls()
for (i in control_data) {(i[,8] <- i[,4]/(i[,4]+i[,5]))}
Error: unexpected '[' in "for (i in control_data) {["
Here is what the input files look like:
ENSMUSG00000030088 Aldh1l1 chr6:90436420-90550197 1.5082200 3.130860 0.671814 0.0000000
ENSMUSG00000020932 Gfap chr11:102748649-102762226 7.0861500 44.182700 20.901700 0.2320750
ENSMUSG00000024411 Aqp4 chr18:15547902-15562193 3.4920400 3.474880 2.463230 0.0331238
ENSMUSG00000023913 Pla2g7 chr17:43705046-43749150 1.5105400 24.275600 11.422400 1.5111100
ENSMUSG00000035805 Mlc1 chr15:88786313-88809437 1.9010200 7.147400 5.313190 0.6358940
ENSMUSG00000007682 Dio2 chr12:91962993-91976878 1.7322900 12.094200 6.738320 1.0736900
ENSMUSG00000017390 Aldoc chr11:78136469-78141283 55.4562000 199.958000 91.328300 22.9541000
ENSMUSG00000005089 Slc1a2 chr2:102498815-102630941 63.7394000 130.729000 103.710000 10.0406000
ENSMUSG00000070880 Gad1 chr2:70391128-70440071 2.6501400 14.907500 13.730200 1.3992200
ENSMUSG00000026787 Gad2 chr2:22477724-22549394 3.9908200 11.308600 28.221500 1.4530500
Thank you for any help you could provide. Is there a better way to do this using an apply function?
As mentioned in the comment, your error happens because ls() returns the names of the objects as character strings, not the objects themselves.
To keep the for loop, you have to look each object up by its name, either with eval(parse(...)) or, more simply, with get(). You can also do this with lapply and a function.
myfun <- function(x) {
df <- get(x)
df[,8] <- df[,4] / (df[,4] + df[,5])
return(df)
}
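# ls() also returns "myfun" itself, so keep only the names that refer to data frames
control_data <- Filter(function(nm) is.data.frame(get(nm, envir = .GlobalEnv)), ls())
lapply(control_data, myfun)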
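Or, as suggested in the comment, keep the for loop and use get() and assign() so that each modified data frame is written back under its own name: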
for(i in control_data) {
df <- get(i)
df[,8] <- df[,4] / (df[,4] + df[,5])
assign(i, df)
}
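If the data frames do not have to stay as separate objects, one alternative is to collect them into a named list and transform the list instead. The sketch below is only an illustration: the scratch environment env, the helper make_df, the sample values, and the column name ratio are all made up here, and columns 4 and 5 are assumed to be numeric as in the question.

```r
# Hypothetical stand-in for the real data: two 7-column data frames plus one
# object that is not a data frame, stored in a scratch environment.
make_df <- function(v4, v5) {
  data.frame(id = c("a", "b"), gene = c("g1", "g2"), locus = c("chr1", "chr2"),
             V4 = v4, V5 = v5, V6 = c(0.5, 0.6), V7 = c(0.1, 0.2))
}
env <- new.env()
assign("sample_a", make_df(c(1, 3), c(3, 1)), envir = env)
assign("sample_b", make_df(c(2, 2), c(2, 6)), envir = env)
assign("not_a_df", 42, envir = env)

# Fetch every object by name, then keep only the data frames.
frames <- Filter(is.data.frame, mget(ls(env), envir = env))

# Add the ratio as an eighth column to each data frame in the list.
frames <- lapply(frames, function(df) {
  df$ratio <- df[[4]] / (df[[4]] + df[[5]])
  df
})

names(frames)
frames$sample_a$ratio
```

For objects that live in the global environment, env would be replaced by globalenv().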
I have a variable named SAL_mean created like this (I want to put this in a loop once I figure it out):
watersheds <- c('ANE', 'SAL', 'CER')
assign(paste0(watersheds[2], '_mean'), read.csv(paste0(watersheds[2], '_mean.csv')))
The next step should be something like this (which works):
cols_dont_want <- c('B1', 'B2', 'B3')
assign(paste0(watersheds[2], '_mean'), SAL_mean[, !names(SAL_mean) %in% cols_dont_want])
But I wanted to ask how to replace "SAL_mean" with an expression built from watersheds[2], because this line of code doesn't work:
assign(paste0(watersheds[2], '_mean'), paste0(watersheds[2], '_mean')[, !names(paste0(watersheds[2], '_mean')) %in% cols_dont_want])
I think it treats "paste0(watersheds[2], '_mean')" as a string and not as the name of a variable, but I haven't been able to find a solution. (I tried the as.name function, for example, but it gave me the error "object of type 'symbol' is not subsettable".)
Keep the data frames in a list (see ?lapply); it is then easier to apply the same transformation to all of them. Something like:
# set vars
watersheds <- c('ANE', 'SAL', 'CER')
cols_dont_want <- c('B1', 'B2', 'B3')
# result, all dataframes in one list
myList <- lapply(watersheds, function(i){
# read the file
x <- read.csv(paste0(i, "_mean.csv"))
# exclude columns and return
x[, !colnames(x) %in% cols_dont_want]
} )
Replace
paste0(watersheds[2], '_mean')
with
eval(parse(text = paste0(watersheds[2], '_mean')))
and it should work. Your guess is correct: paste0 just gives you a string, whereas you need the variable that the string names, and eval(parse(text = ...)) performs that lookup.
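get() performs the same lookup by name without parsing any code. A minimal sketch, assuming SAL_mean already exists; the small data frame below is made up and stands in for the result of read.csv:

```r
watersheds <- c('ANE', 'SAL', 'CER')
cols_dont_want <- c('B1', 'B2', 'B3')

# Made-up stand-in for read.csv('SAL_mean.csv')
SAL_mean <- data.frame(A1 = 1:2, B1 = 3:4, B2 = 5:6, C1 = 7:8)

# Build the variable name once, fetch the object, and write the result back.
nm <- paste0(watersheds[2], '_mean')
df <- get(nm)
assign(nm, df[, !names(df) %in% cols_dont_want, drop = FALSE])

names(SAL_mean)
```

drop = FALSE keeps the result a data frame even if only one column survives the filter.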
Or you can do it in a for loop (some find the syntax more understandable). It's equivalent to zx8754's solution, except that it assigns a name to each data frame as per the OP. It's trivial to modify zx8754's solution to do the same.
watersheds <- c('ANE', 'SAL', 'CER')
cols_dont_want <- c('B1', 'B2', 'B3')
ws.list <- list()
for (i in seq_along(watersheds)) {
ws.list[[i]] <- read.csv(paste0(watersheds[i], '_mean.csv'))
names(ws.list)[i] <- paste0(watersheds[i], '_mean')
ws.list[[i]] <- ws.list[[i]][!names(ws.list[[i]]) %in% cols_dont_want]
}
names(ws.list)
# "ANE_mean" "SAL_mean" "CER_mean"
# If you absolutely want to call the data.frames by their
# individual names, you can do so after you attach() the list.
attach(ws.list)
ANE_mean
Each of my datasets looks like this, and I have a list of them.
  Plot_ID Canopy_infection_rate DAI
1    YO01                     5   7
2    YO01                     8  14
3    YO01                    10  21
What I want to do is apply a function called "audpc_Canopyinfactionrate" to a list of data frames.
However, when I run lapply, I get the error below:
Error in FUN(X[[i]], ...) : argument "DAI" is missing, with no default
I've checked that the columns in my data are not shifted.
Does anyone know what's wrong? Thanks
Here is part of my code:
#Read files in to list
for(i in 1:length(files)) {
lst[[i]] <- read.delim(files[i], header = TRUE, sep=" ")
}
#Apply a function to the list
densities <- list()
densities<- lapply(lst, audpc_Canopyinfactionrate)
#canopy infection rate
audpc_Canopyinfactionrate <- function(Canopy_infection_rate,DAI){
n <- length(DAI)
meanvec <- matrix(-1,(n-1))
intvec <- matrix(-1,(n-1))
for(i in 1:(n-1)){
meanvec[i] <- mean(c(Canopy_infection_rate[i],
Canopy_infection_rate[i+1]))
intvec[i] <- DAI[i+1] - DAI[i]
}
infprod <- meanvec * intvec
sum(infprod)
}
As pointed out in the comments, the problem lies in the way you are using lapply.
Its signature is lapply(X, FUN, ...). FUN is the function that gets applied to each element of the list (or vector) X. So far so good.
Back to your case: you want to apply the function audpc_Canopyinfactionrate() to every data frame in lst. This function takes two arguments, and I think this is where things got mixed up in your code. The way you are calling lapply, lst[[1]], lst[[2]], etc. are each passed as the only argument to audpc_Canopyinfactionrate(), so DAI is never supplied, even though the function requires two arguments!
If you reformulate your function a bit, you can pass lst[[1]], lst[[2]], etc. as its only argument, because you know that each data frame contains the columns you need, Canopy_infection_rate and DAI:
audpc_Canopyinfactionrate <- function(df){
n <- nrow(df)
meanvec <- matrix(-1, (n-1))
intvec <- matrix(-1, (n-1))
for(i in 1:(n-1)){
meanvec[i] <- mean(c(df$Canopy_infection_rate[i],
df$Canopy_infection_rate[i+1]))
intvec[i] <- df$DAI[i+1] - df$DAI[i]
}
infprod <- meanvec * intvec
return(sum(infprod))
}
Call lapply in the following way:
lapply(lst, audpc_Canopyinfactionrate)
Note: lapply can also pass additional arguments to FUN through the ... in lapply(X, FUN, ...). In your case, however, I think this is not the best option.
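For completeness, here is a rough sketch of the two other ways to call a multi-argument function from lapply. The function names audpc_xy and audpc_cols, and the parameters rate_col and dai_col, are hypothetical; the data are the three rows shown in the question, and the calculation is a vectorised form of the same trapezoid sum.

```r
# Data in the shape shown in the question.
lst <- list(
  data.frame(Plot_ID = "YO01",
             Canopy_infection_rate = c(5, 8, 10),
             DAI = c(7, 14, 21))
)

# Two-argument version: mean of neighbouring rates times the interval length.
audpc_xy <- function(rate, dai) {
  sum((head(rate, -1) + tail(rate, -1)) / 2 * diff(dai))
}

# Option 1: an anonymous wrapper pulls both columns out of each data frame.
res_wrapper <- lapply(lst, function(d) audpc_xy(d$Canopy_infection_rate, d$DAI))

# Option 2: arguments given after FUN are forwarded to every call through `...`.
audpc_cols <- function(df, rate_col, dai_col) {
  audpc_xy(df[[rate_col]], df[[dai_col]])
}
res_dots <- lapply(lst, audpc_cols,
                   rate_col = "Canopy_infection_rate", dai_col = "DAI")

res_wrapper[[1]]  # 6.5 * 7 + 9 * 7 = 108.5
```

Note that ... passes the same extra values to every call, which is why it cannot by itself supply a different DAI vector for each data frame.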
I want to write a function that creates a time series, but I'd like it to generate the name of the time series as part of the call.
Sort of
makeTS(my.data.frame, string(dateName), string(varName)){
-create time series tsAux from my.data.frame, dateName and varName
-create string tsName
(-the creation of tsAux is not a problem)
assign(tsName, tsAux)
return(tsName)
}
This, perhaps not surprisingly, returns the string tsName, but is there any way that I can make it return a named object?
I've tried with
do.call('<-', list(tsName, tsAux))
and I've also tried using
as.name(tsName) <- tsAux
but nothing seems to work.
I know that
tsName <- makeTS2(my.data.frame, dateName, varName)
would do the trick (where makeTS2() just generates the time series tsAux and returns it), but is there any way to make it work with one function call?
Thanks!
Can you? Sure:
makeTS <- function(dat, varName) {
result <- NA
assign( varName, result, envir = .GlobalEnv )
result
}
> makeTS(NA, "test")
[1] NA
> test
[1] NA
Should you? Almost surely not.
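A common alternative to creating global variables from inside a function is to return the object and store it in a named list, so the generated name becomes a list name. This is only a sketch: make_ts is a hypothetical stand-in for the real makeTS2(), and the data frame, its columns, and the "_ts" suffix are invented for illustration.

```r
# Hypothetical stand-in for makeTS2(): builds and returns a yearly series.
make_ts <- function(dat, dateName, varName) {
  ts(dat[[varName]], start = min(dat[[dateName]]))
}

# Made-up example data.
my.data.frame <- data.frame(year = 2001:2004,
                            gdp = c(1.2, 1.5, 1.9, 2.4),
                            cpi = c(100, 102, 105, 109))

# One series per variable, stored under a generated name.
vars <- c("gdp", "cpi")
series <- lapply(vars, function(v) make_ts(my.data.frame, "year", v))
names(series) <- paste0(vars, "_ts")

series[["gdp_ts"]]
```

The caller still gets every series by its generated name (series$gdp_ts), but nothing is written into the global environment behind its back.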
Ari B.'s answer is good. You could also use assign() with a variable.
> makeTS <- function(dat) {
+ return(666)
+ }
> varName <- "tmp"
> tmp
Error: object 'tmp' not found
> assign(varName, makeTS(1))
> tmp
[1] 666
In improving an rbind method, I'd like to extract the names of the objects passed to it so that I might generate unique IDs from those.
I've tried all.names(match.call()) but that just gives me:
[1] "rbind" "deparse.level" "..1" "..2"
Generic example:
rbind.test <- function(...) {
dots <- list(...)
all.names(match.call())
}
t1 <- t2 <- ""
class(t1) <- class(t2) <- "test"
> rbind(t1,t2)
[1] "rbind" "deparse.level" "..1" "..2"
Whereas I'd like to be able to retrieve c("t1","t2").
I'm aware that in general one cannot retrieve the names of objects passed to functions, but it seems like with ... it might be possible, as substitute(...) returns t1 in the above example.
I picked this one up from Bill Dunlap on the R-help mailing list:
rbind.test <- function(...) {
sapply(substitute(...()), as.character)
}
I think this gives you what you want.
Using the guidance from "How to use R's ellipsis feature when writing your own function?",
e.g. substitute(list(...)),
and combining it with as.character:
rbind.test <- function(...) {
.x <- as.list(substitute(list(...)))[-1]
as.character(.x)
}
You can also use
rbind.test <- function(...){as.character(match.call(expand.dots = FALSE)$...)}
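To see the substitute(list(...)) idea in isolation, here is a small sketch using an ordinary function rather than an rbind method. The name arg_names is made up, and how the result behaves under rbind's internal dispatch is not checked here.

```r
# Capture the unevaluated arguments and turn each one into a string.
arg_names <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]  # drop the leading `list` symbol
  unname(vapply(exprs,
                function(e) paste(deparse(e), collapse = " "),
                character(1)))
}

t1 <- t2 <- ""
arg_names(t1, t2)
```

deparse() is used instead of as.character() so that an argument that is an expression, such as a function call, comes back as a single string.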
Thanks in advance, and sorry if this question has been answered previously - I have looked pretty extensively. I have a dataset containing a row with concatenated information, specifically: name, color code, some function expression. For example, one value may be:
cost#FF0033#log(x)+6.
I have all of the code to extract the information, and I end up with a vector of expressions that I would like to convert to a list of actual functions.
For example:
func.list <- list()
test.func <- c("x","x+1","x+2","x+3","x+4")
where test.func is the vector of expressions. What I would like is:
func.list[[3]]
To be equivalent to
function(x){x+2}
I know that I can create a function using:
somefunc <- function(x){eval(parse(text="x+1"))}
to convert a character value into a function. The problem comes when I try and loop through to make multiple functions. For an example of something I tried that didn't work:
for(i in 1:length(test.func)){
temp <- test.func[i]
f <- assign(function(x){eval(expr=parse(text=temp))})
func.list[[i]] <- f
}
Based on another post (http://stats.stackexchange.com/questions/3836/how-to-create-a-vector-of-functions) I also tried this:
makefunc <- function(y){y;function(x){y}}
for(i in 1:length(test.func)){
func.list[[i]] <- assign(x=paste("f",i,sep=""),value=makefunc(eval(parse(text=test.func[i]))))
}
Which gives the following error: Error in eval(expr, envir, enclos) : object 'x' not found
The eventual goal is to take the list of functions and apply the jth function to the jth column of the data.frame, so that the user of the script can specify how to normalize each column within the concatenated information given by the column header.
Maybe initialize your list with a single generic function, and then update the body of each copy like this:
> foo <- function(x){x+3}
> body(foo) <- quote(x+4)
> foo
function (x)
x + 4
More specifically, starting from a character, you'd probably do something like:
body(foo) <- parse(text = "x+5")
Just to add onto joran's answer, this is what finally worked:
test.data <- matrix(data=rep(1,25),5,5)
test.data <- data.frame(test.data)
test.func <- c("x","x+1","x+2","x+3","x+4")
func.list <- list()
for(i in 1:length(test.func)){
func.list[[i]] <- function(x){}
body(func.list[[i]]) <- parse(text=test.func[i])
}
processed <- mapply(do.call,func.list,lapply(test.data,list))
Thanks again, joran.
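The same idea can be wrapped in a small helper so that no explicit loop is needed. This is a sketch under the same assumptions as above (each string is a single expression in x); the helper name make_fun is invented, and Map is used here in place of mapply(do.call, ...).

```r
# Turn one expression string into a function of x.
make_fun <- function(txt) {
  f <- function(x) NULL
  body(f) <- parse(text = txt)[[1]]  # first (and only) parsed expression
  f
}

test.func <- c("x", "x+1", "x+2", "x+3", "x+4")
func.list <- lapply(test.func, make_fun)

# Same 5 x 5 data frame of ones as above.
test.data <- data.frame(matrix(1, 5, 5))

# Apply the j-th function to the j-th column; names come from test.data.
processed <- as.data.frame(Map(function(col, f) f(col), test.data, func.list))

func.list[[3]]
processed
```

Since the strings are evaluated as code, this should only be used on input that is trusted.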
This is what I do:
f <- list(identity="x",plus1 = "x+1", square= "x^2")
funCreator <- function(snippet){
txt <- snippet
function(x){
exprs <- parse(text = txt)
eval(exprs)
}
}
listOfFunctions <- lapply(setNames(f,names(f)),function(x){funCreator(x)}) # I like to have some control of the names of the functions
listOfFunctions[[1]] # try to see what the actual function looks like?
library(pryr)
unenclose(listOfFunctions[[3]]) # good way to see the actual function http://adv-r.had.co.nz/Functional-programming.html
# Call your functions
listOfFunctions[[2]](3) # 3+1 = 4
do.call(listOfFunctions[[3]],list(3)) # 3^2 = 9
attach(listOfFunctions) # you can also attach your list of functions and call them by name
square(3) # 3^2 = 9
identity(7) # 7 ## masked object identity, better detach it now!
detach(listOfFunctions)