I'm trying to use as.name(x) to refer to a list that I pass into a function. Here's a simplified version of my stats function, followed by the for loop I'm using to output all the data at once.
get<-function(data,x) {
for (i in x) {
lm(as.formula(paste(i,'~',variable)),data)
}
}
lists<-c("a","b","c")
# where each of a, b, and c are lists that refer to column names of my data
for (j in lists) {
get(data,as.name(j))
}
I keep getting the following error:
Error in for (i in x) { : invalid for() loop sequence
If I just do get(data,a) each time it works but not when I try and do a loop.
Is each of a, b and c a list that contains only one value? I ask because the left-hand side of your lm() formula is i, which can only be a single vector.
If that's the case, then replacing as.name(j) with j should make your code work.
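For what it's worth, as.name("a") returns a symbol (the name a), not the list stored under that name, and for() cannot iterate over a symbol, which is where the "invalid for() loop sequence" error comes from. A minimal sketch of one alternative, assuming made-up columns y1, y2, y3 and a predictor column called variable: keep the lists together in one named list and loop over that. The name fit_all is only illustrative, chosen so that base R's get() is not masked.

```r
# Hypothetical data: 'variable' is the predictor, y1..y3 are responses
set.seed(42)
data <- data.frame(variable = rnorm(20), y1 = rnorm(20),
                   y2 = rnorm(20), y3 = rnorm(20))

# a and b are lists of column names, as in the question
a <- list("y1", "y2")
b <- list("y3")

# Fit one model per column name and return the fits
# (a plain for loop would silently discard the lm() results)
fit_all <- function(data, x) {
  lapply(x, function(i) lm(as.formula(paste(i, "~ variable")), data = data))
}

# Keep the lists in a named list instead of looking them up by name
groups <- list(a = a, b = b)
fits <- lapply(groups, function(g) fit_all(data, g))
```

fits$a then holds the two models for y1 and y2, and fits$b holds the model for y3.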
I have a few problems concerning the same topic.
(1) I am trying to loop over:
premium1999 <- as.data.frame(coef(summary(data1999_mod))[c(19:44), 1])
for 10 years, for which I wrote:
for (year in seq(1999,2008)) {
paste0('premium',year) <- as.data.frame(coef(summary(paste0('data',year,'_mod')))[c(19:44), 1])
}
Note:
data1999_mod holds the regression results, from which I want to extract some of the estimates as a data frame.
The coef(summary(data1999_mod)) looks like this:
#A matrix: ... of type dbl
              Estimate   Std. Error     t value     Pr(>|t|)
age       0.0388573570 2.196772e-03  17.6883885  3.362887e-6
age_sqr  -0.0003065876 2.790296e-05 -10.9876373 5.826926e-28
relation  0.0724525759 9.168118e-03   7.9026659 2.950318e-15
sex      -0.1348453659 8.970138e-03 -15.0326966 1.201003e-50
marital   0.0782049161 8.928773e-03   8.7587533 2.217825e-18
reg       0.1691004469 1.132230e-02  14.9351735 5.082589e-50
...
However, it returns Error: $ operator is invalid for atomic vectors, even though I did not use the $ operator here.
(2) Also,
I want to create a column 'year' containing repeated values of the associated year and am trying to loop over this:
premium1999$year <- 1999
for which I wrote:
for (i in seq(1999,2008)) {
assign(paste0('premium',i)[['year']], i)
}
In this case, it returns Error in paste0("premium", i)[["year"]]: subscript out of bounds
(3) Moreover, I'd like to repeat some rows and loop over:
premium1999 <- rbind(premium1999, premium1999[rep(1, 2),])
for 10 years again and I wrote:
for (year in seq(1999,2008)) {
paste0('premium',year) <- rbind(paste0('premium',year), paste0('premium',year)[rep(1, 2),])
}
This time it returns Error in paste0("premium", year)[rep(1, 2), ]: incorrect number of dimensions
I also tried to loop over a few other similar things, but I always get an error.
Each piece of code works fine individually.
I could not find what I did wrong. Any help or suggestions would be very much appreciated.
The problem is that paste0() returns a character string; it does not look up the object whose name is that string. For example, paste0('data',year,'_mod') returns a character vector of length 1, i.e. "data1999_mod", not the object data1999_mod.
To make this concrete, there is a huge difference between "data1999_mod"["Estimate"] and data1999_mod["Estimate"]. Subsetting the result of paste0() gives the former, whereas the expected output comes only from the latter. That is why you are getting Error: $ operator is invalid for atomic vectors.
The same mistake appears in all of your code. In order to retrieve the object named by the output of paste0(), we need to wrap it in get().
As you have not supplied a reproducible sample, I couldn't test this. However, you can try running these.
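# get() fetches the object named by a string; assign() creates or overwrites one.
# (paste0(...) <- value is not valid R, so assign() is needed on the left-hand side.)
#(1)
for (year in seq(1999,2008)) {
  mod <- get(paste0('data',year,'_mod'))
  assign(paste0('premium',year), as.data.frame(coef(summary(mod))[c(19:44), 1]))
}
#(2)
for (i in seq(1999,2008)) {
  tmp <- get(paste0('premium',i))
  tmp[['year']] <- i              # add the year column to the copy
  assign(paste0('premium',i), tmp)
}
#(3)
for (year in seq(1999,2008)) {
  tmp <- get(paste0('premium',year))
  # drop = FALSE keeps a data frame even if tmp has a single column
  assign(paste0('premium',year), rbind(tmp, tmp[rep(1, 2), , drop = FALSE]))
}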
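As an alternative to building object names from strings, the models can be kept in a named list and processed with lapply(). The sketch below is only an illustration: the models are hypothetical stand-ins fitted on the built-in mtcars data, there are three years instead of ten, and rows 2:3 take the place of the original 19:44.

```r
# Hypothetical stand-ins for data1999_mod, data2000_mod, data2001_mod
years <- 1999:2001
mods <- lapply(years, function(y) lm(mpg ~ wt + hp, data = mtcars))
names(mods) <- paste0("data", years, "_mod")

premiums <- lapply(seq_along(years), function(k) {
  est <- coef(summary(mods[[k]]))[2:3, 1]     # (1) pick some estimates
  out <- as.data.frame(est)
  out$year <- years[k]                        # (2) add the year column
  rbind(out, out[rep(1, 2), , drop = FALSE])  # (3) repeat the first row twice
})
names(premiums) <- paste0("premium", years)
```

Each result is then available as premiums[["premium1999"]] and so on, without any call to get() or assign().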
I have data
dat1 <- data.frame(a=1:3, b=rnorm(3))
dat2 <- data.frame(a=c(rep(1,3),rep(2,5),rep(3,4)), c=runif(12,1,50))
and a function that takes both data frames as inputs
foo <- function(dat1,dat2,par){
if(par< 25){return(dat1$b*par)}
if(par>=25){return(sum(dat2$c>par))}
}
which might work if it was embedded in a loop over different values of a.
However, I would like to find the value of par that minimizes the output of foo across all values of a. The optim() function should be able to do just this, but my problem is that I need to pass it two data frames of different dimensions. I suspect some form of list could help, but I wouldn't know how.
From the help documentation on optim:
fn - A function to be minimized (or maximized), with first argument the vector of parameters over which minimization is to take place. It should return a scalar result.
Your function does not return a scalar when par < 25. Since you are not changing the data frames during the optimization, you do not have to pass them in again. Below is an example usage of optim in your case:
foo <- function(par) {
if(par < 25) {
return(sum(dat1$b*par))
} else {
return(sum(dat2$c>par))
}
}
optim(0, foo, method="Brent", lower=-1e6, upper=1e6)
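If you would rather not rely on dat1 and dat2 being found in the global environment, optim() forwards extra named arguments to fn through its ... argument. Here is a sketch under that assumption; the name foo2, the seed, and the narrower bounds of -100 and 100 are made up for illustration.

```r
set.seed(1)
dat1 <- data.frame(a = 1:3, b = rnorm(3))
dat2 <- data.frame(a = c(rep(1, 3), rep(2, 5), rep(3, 4)), c = runif(12, 1, 50))

# par comes first; the data frames are ordinary extra arguments
foo2 <- function(par, dat1, dat2) {
  if (par < 25) {
    sum(dat1$b * par)
  } else {
    sum(dat2$c > par)
  }
}

# dat1 = and dat2 = are passed on to foo2 via optim's '...'
res <- optim(0, foo2, dat1 = dat1, dat2 = dat2,
             method = "Brent", lower = -100, upper = 100)
res$par
```

Because the two data frames are separate named arguments, their different dimensions do not matter and no list wrapper is needed.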
My dataset test[[1]] can be found here.
I'm defining a function and using it in a for loop in the following code. The function is supposed to concatenate strings such as (test[[1]], '$', names(test[[1]])[1]) before converting them into an R variable. So in this example, these strings go in and out comes test[[1]]$V1.
I then iterate the function over the variables in test[[1]].
Unfortunately, I keep getting this error: Error in stvar(test[[1]], j) <- NULL : could not find function "stvar<-".
stvar <- function(df,num) {
eval(parse(text=paste(deparse(substitute(df)),'$',names(df)[num],sep='')))
}
for (j in 1:length(names(test[[1]]))){
if (trimws(as.character(stvar(test[[1]],j)[1]))=="Div" &
grepl("^M",stvar(test[[1]],j)[3])==0) {
stvar(test[[1]],j) <- NULL
}
}
Also, not sure if this is important, but the for-loop finds columns containing certain characteristics (first observation == "Div", third observation doesn't start with 'M') and removes matching columns.
Is there a way I can make the loop recognize my function?
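On the error itself: an assignment of the form stvar(x, j) <- NULL makes R look for a replacement function named `stvar<-`, which has not been defined, so the loop cannot work as written. A minimal sketch of one way around it, assuming a small made-up data frame in place of test[[1]]: compute a logical flag per column and drop the flagged columns in one step, without eval(parse()).

```r
# Hypothetical stand-in for test[[1]]
df <- data.frame(V1 = c("Div", "x", "M1"),
                 V2 = c("Div", "y", "Z9"),
                 V3 = c("Tot", "z", "Q2"),
                 stringsAsFactors = FALSE)

# TRUE for columns whose first value is "Div" and whose third value
# does not start with "M"
drop_col <- vapply(df, function(col) {
  trimws(as.character(col[1])) == "Div" && !grepl("^M", col[3])
}, logical(1))

# Keep only the columns that were not flagged
df <- df[, !drop_col, drop = FALSE]
names(df)
```

In this made-up example only V2 is flagged, so V1 and V3 remain.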
My apologies if this has been answered somewhere else. I've defined two functions in R and then nested them with good results. Now I would like to evaluate these two nested functions by changing a variable in the second function. I've tried creating a list for the changing variable and then using lapply to evaluate each element, but I'm getting an error.
My code looks something like this:
# First function
FirstFun <- function(a, b, c, d) {
answer1 <- (a + b)/(1-(0.2*(c/d))-(0.8*(c/d)^2))
return(answer1)
}
# First function evaluated
FirstFun(13,387,1728,1980)
# Second function
SecondFun <- function(answer1,c,d) {
answer2 <- answer1*(1-(0.2*(c/d))-(0.8*(c/d)^2))
return(answer2)
}
# Nested function evaluated
SecondFun(FirstFun(13,387,1728,1980),1728,1980)
# Nested function evaluated with elements of a list
c <- list(0:1980)
lapply(c, SecondFun(FirstFun(13,387,1728,1980),c,1980))
If I understand you correctly, you are looking for:
SecondFun(FirstFun(13,387,1728,1980),0:1980,1980)
or maybe this:
SecondFun(FirstFun(13,387,1728,0:1980),0:1980,1980)
Both return a numeric vector of length 1981.
Two things:
1. There is no need for a list; a range works because the arithmetic is vectorized.
2. Calling a variable 'c' is a bad idea: c is the name of a base R function, so reusing it is confusing.
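To illustrate the point about vectorization, here is a short sketch comparing the direct vectorized call with an equivalent sapply() over each value. The names c_values and base_answer are introduced only for this example.

```r
FirstFun <- function(a, b, c, d) {
  (a + b) / (1 - (0.2 * (c / d)) - (0.8 * (c / d)^2))
}
SecondFun <- function(answer1, c, d) {
  answer1 * (1 - (0.2 * (c / d)) - (0.8 * (c / d)^2))
}

c_values <- 0:1980                      # a plain range, no list needed
base_answer <- FirstFun(13, 387, 1728, 1980)

# Vectorized: one call handles every value of c
res_vec <- SecondFun(base_answer, c_values, 1980)

# Same result element by element; lapply/sapply need a function, not a call
res_apply <- sapply(c_values, function(cv) SecondFun(base_answer, cv, 1980))

stopifnot(isTRUE(all.equal(res_vec, res_apply)))
```

The original lapply() call failed because its second argument was the result of calling SecondFun(), not a function to apply.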
I'm currently writing a utility to run a series of tests on a set of data. I have the data in a data.frame and would like to run N tests on each row of data. (Apologies if my terminology isn't all there: I've been using R for all of five hours).
In my utility, I would like to split the tests into different files and in the main program, load all those tests and run them once for each data.frame row. Here's what I'm doing to source the relevant files:
file.sources = list.files(pattern="validator-.*.R$")
sapply(file.sources,source,verbose = TRUE)
This works well, and if I do this in each matched file:
b <- function(a) {
  if(grep("^[[:blank:]]*$", a)) {
    return(FALSE)
  } else {
    return(TRUE)
  }
}
test.functions <- append(test.functions, b)
Then I end up with a test.functions list which accurately contains all the test functions to run, but this is where I get stuck. I've tried variations of sapply() and I think do.call() is also relevant here. This is my current attempt:
process.entry <- function(a) {
lapply(test.functions,do.call,a)
}
sapply(all.data,process.entry)
My attempt here was to create a function which takes one row of data as its argument, iterates over test.functions and calls do.call() with the function and row of data as arguments. This doesn't quite seem to work, and the error thrown is:
Error in FUN(X[[i]], ...) : second argument must be a list
However, I'm not entirely sure where this error occurs, and quite possibly: there are other, cleaner, ways of doing what I intend!
# I would write it like this:
process.entry <- function(a) {
  # call each test function on a;
  # I think an anonymous function is easier here
  lapply(test.functions, function(f) f(a))
}
# sapply iterates over the columns of a data.frame by default;
# to iterate over rows, use a for loop or apply
apply(all.data, 1, process.entry)
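A self-contained sketch of the same idea, with two made-up test functions and a tiny made-up data frame. It uses grepl() in place of grep(), since grep() returns an empty vector when nothing matches and if() cannot handle that, and vapply() so that each row yields a logical vector.

```r
# Hypothetical test functions; each takes one row and returns TRUE or FALSE
test.functions <- list(
  not_blank      = function(a) !any(grepl("^[[:blank:]]*$", a)),
  has_two_fields = function(a) length(a) == 2
)

# Hypothetical data: the second row contains a blank value
all.data <- data.frame(id = c("x1", "x2"), value = c("ok", " "),
                       stringsAsFactors = FALSE)

# Run every test on one row
process.entry <- function(a) {
  vapply(test.functions, function(f) f(a), logical(1))
}

# MARGIN = 1 iterates over rows; each row arrives as a character vector
results <- apply(all.data, 1, process.entry)
results
```

results is a logical matrix with one row per test and one column per data row. Note that apply() converts the data frame to a matrix first, so mixed column types are coerced to character.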