Pasting a string in a loop - r

I am writing some code for a loop and I want to paste a string in the loop. However, for some reason the command "paste" does not seem to work.
A simple example:
### Creating some variables
test1<-c(1,2,3,4,5,6,7,8,9,10)
test2<-c(4,6,7,2,5,3,6,2,7,1)
test3<-c(3,5,6,7,7,7,7,3,5,3)
### An example of a loop
for (i in 1:2)
{
name<-paste("test",i,sep="")
fit <- lm(name~test2+test3)
}
I don't understand why this works:
fit <- lm(test1~test2+test3)
But this doesn't:
fit <- lm(name~test2+test3)
even though the value of name, as built by paste, is "test1".
Any help would be much appreciated. Ideally I would like to use a loop rather than apply.

First, put your vectors in a data.frame. Second, you don't need a loop in this example.
DF <- data.frame(test1,
test2,
test3)
fits <- lm(do.call(cbind, DF[, 1:2]) ~ test2 + test3, data=DF)
#Coefficients:
# test1 test2
#(Intercept) 7.655e+00 1.123e-15
#test2 -3.669e-01 1.000e+00
#test3 -1.089e-01 3.594e-17
Note that the test2 column of the result differs from lm(test2 ~ test2 + test3): with a matrix response, test2 is not removed from the right-hand side, so it is regressed on itself.

get returns the value of a named object:
fit2 <- lm(get(name)~test2+test3)
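As a sketch of how the loop from the question could be completed, the pasted name can also be turned into a formula with as.formula and the fits stored in a list. The names DF and fits are illustrative, and only test3 is used as a predictor here (an assumption, so that the response never also appears on the right-hand side).

```r
test1 <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
test2 <- c(4, 6, 7, 2, 5, 3, 6, 2, 7, 1)
test3 <- c(3, 5, 6, 7, 7, 7, 7, 3, 5, 3)
DF <- data.frame(test1, test2, test3)

fits <- list()  # one fitted model per response variable
for (i in 1:2) {
  name <- paste0("test", i)                # "test1", then "test2"
  f <- as.formula(paste(name, "~ test3"))  # string -> formula
  fits[[name]] <- lm(f, data = DF)
}
names(fits)
```

Each model can then be retrieved by name, for example fits[["test1"]].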

Related

How to use a for loop in R to generate indexed variable names and apply a function to each

I would like to apply these functions to 37 data sets as follows:
The first data set is V1
library(trend)
Q1 <- API37[,"V1"]
mk.test(Q1)
sens.slope( Q1);
The second data set is V2
Q2 <- API37[,"V2"]
mk.test(Q2)
sens.slope( Q2);
.
.
.
Q37<- API37[,"V37"]
mk.test(Q37)
sens.slope( Q37);
How can I write R code using a for loop or a while loop to run these functions for each data set?
I tried to write this, but it did not work.
for (i in 1:37) {
Q[i] <- API37[,"V[i]"]
mk.test(Q[i])
sens.slope( Q[i]);
print (Q[i])
}
I need to generate results for each Vi, i = 1, 2, ..., 37, separately and print them as a list.
Could anyone help, please?
In your expression API37[,"V[i]"], the part "V[i]" will never be equal to "V1", "V2", etc., while you're incrementing your counter i. It's simply the character string "V[i]", remaining constant at each iteration. Replacing this expression by something like API37[, paste("V", i, sep = "")] should work.
However, there are more efficient ways to implement what you are trying to do. Posting a minimal working example would help to give you such solutions, but here is an example using a fake dataset randomly generated:
library(trend)
### 1. Generate an example dataset:
dat <- matrix(rnorm(100), ncol = 10)
colnames(dat) <- paste0("V", 1:10)
head(dat)
### 2. Perform a MK-test for each variable:
apply(dat, MARGIN = 2, mk.test)
You can adapt this to your dataset.
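If a for loop is preferred, the column name can be built with paste0 on each iteration and the results stored in a list. This sketch uses a made-up data frame in place of API37. So that it runs without extra packages, it calls cor.test with method = "kendall" against the time index as a stand-in for mk.test; that substitution is an assumption for illustration only.

```r
set.seed(1)
# Hypothetical stand-in for API37: 5 columns, named V1..V5 by default
API37 <- as.data.frame(matrix(rnorm(100), ncol = 5))

results <- list()
for (i in seq_len(ncol(API37))) {
  col_name <- paste0("V", i)  # "V1", "V2", ...
  Q <- API37[, col_name]
  # Base R stand-in; with library(trend) loaded this line could instead be
  # list(mk = mk.test(Q), sens = sens.slope(Q))
  results[[col_name]] <- cor.test(Q, seq_along(Q), method = "kendall")
}
names(results)
```

Printing results then shows one test per column, and results[["V3"]] retrieves a single one.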

Write a loop for my function in r

I am currently trying to write my first loop for lagged regressions on 30 variables. Variables are labeled as rx1, rx2, ..., rx30, and the data frame is called my_num_data.
I have created a loop that looks like this:
z <- zoo(my_num_data)
for (i in 1:30)
{dyn$lm(my_num_data$rx[i] ~ lag(my_num_data$rx[i], 1)
+ lag(my_num_data$rx[i], 2))
}
But I received an error message:
Error in model.frame.default(formula = dyn(my_num_data$rx[i] ~ lag(my_num_data$rx[i], :
invalid type (NULL) for variable 'my_num_data$rx[i]'
Can anyone tell me what the problem is with the loop?
Thanks!
This produces a list, L, whose ith component has the name of the ith column of z and whose content is the regression of the ith column of z on its first two lags. Lag is the same as lag except that the sign of argument k is reversed.
library(dyn)
z <- zoo(anscombe) # test input using builtin data.frame anscombe
Lag <- function(x, k) lag(x, -k)
L <- lapply(as.list(z), function(x) dyn$lm(x ~ Lag(x, 1:2)))
First, dyn$lm() is the interface of the dyn package; if you meant the dynlm package instead, the function is dynlm(), without the $ character. Second, using $rx[i] doesn't concatenate rx and the contents of i; it selects the (single) element with index i of a column named rx, and since there is no such column the result is NULL. Try this (I don't have your data, so I can't test it on my machine):
results <- list()
for (i in 1:30) {
results[[i]] <- dynlm(my_num_data[,i] ~ lag(my_num_data[,i], 1)
+ lag(my_num_data[,i], 2))
}
and then list element results[[1]] will be the results from the first regression, and so on.
Note that this assumes your my_num_data data.frame ONLY consists of columns rx1, rx2, etc.
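The NULL in the error message can be reproduced with a small, made-up data frame (the two-column my_num_data below is hypothetical). The sketch also shows selecting a column by a pasted name with [[ ]].

```r
# Hypothetical miniature of my_num_data with only two columns
my_num_data <- data.frame(rx1 = 1:3, rx2 = 4:6)
i <- 2

# "rx" matches both rx1 and rx2 only partially, so $ finds no column
no_such_column <- my_num_data$rx

# Build the name as a string and select the column with [[ ]]
second_column <- my_num_data[[paste0("rx", i)]]

is.null(no_such_column)
second_column
```

Indexing by position, as in my_num_data[, i], works as well when the columns are in the expected order.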
I am not very familiar with R, but it appears you are trying to increment the index in the name rx. Is rx a vector with values at different indices?
If not, the solution may be to build the column name as a string:
for (i in 1:30){
varName <- paste0("rx", i) # "rx1", "rx2", ...
x <- my_num_data[[varName]] # select the column by its name
dyn$lm(x ~ lag(x, 1) + lag(x, 2))
}
Again, I may be way off here, as this is my first post and R is still pretty new to me.

Removing quotes in function output in R

I am trying to write a function in R for a simple time series regression (the result of this function is the input for more complicated ones). In the first part I define the variables and create some lags for the function, which are named ar_i depending on the lag used.
However, in the second part I try to combine these lags in a matrix using cbind on the variables initially defined. As you can see, the output is not the expected matrix, but the names of the lags themselves. I tried to solve this by using the noquote() and cat() functions, but these don't seem to work.
Do you have any suggestions? Thanks in advance!
P.S.: The code and the results are below.
trans <- dlpib
ar <- dlpib
linear <- 1:4
for (i in linear){
assign(paste("ar_",i,sep = ""), lag(ar,k=-i))
}
linear_dat <- cbind(paste("ar_",linear, collapse=',', sep = ""))
> linear_dat
[,1]
[1,] "ar_1,ar_2,ar_3,ar_4"
I think you could go about this more efficiently with lapply:
linear <- 1:4
linear_list <- lapply(linear, function(i) lag(ar, k=-i))
linear_dat <- do.call(cbind, linear_list)
colnames(linear_dat) <- paste0("ar_", linear)
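If the ar_i objects have already been created with assign, they can be fetched by name with mget instead of pasting their names into a single string. A minimal sketch, assuming a short ts object as a stand-in for dlpib (the real series is not shown in the question):

```r
ar <- ts(1:10)  # hypothetical stand-in for dlpib
linear <- 1:4
env <- environment()  # the environment in which the lags are created

for (i in linear) {
  assign(paste0("ar_", i), lag(ar, k = -i), envir = env)
}

# mget() returns the objects themselves in a named list;
# do.call() then passes them to cbind() as separate arguments
linear_dat <- do.call(cbind, mget(paste0("ar_", linear), envir = env))
dim(linear_dat)
```

The lapply version above avoids creating the intermediate objects altogether, which is usually easier to maintain.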

Reformulate a for loop to use apply in R

I am currently learning R, and I tried to change a for loop to use apply.
The context is a data frame galton with 2 variables, parent (height in inches) and child (height in inches). I want to sample repeatedly from this, fit a linear model (using lm) each time, and save the results in a list.
library(UsingR)
sampleLm <- vector(100,mode="list")
for(i in 1:100) {
sampleGalton <- galton[sample(1:length(galton$child),size=50,replace=F),]
sampleLm[[i]] <- lm(sampleGalton$child ~ sampleGalton$parent)
}
I tried this:
sampleLm <- vector(100,mode="list")
sapply(samples, function(x) {
sampleGalton <- galton[sample(1:length(galton$child),size=50,replace=F),]
x <- lm(sampleGalton$child ~ sampleGalton$parent)
})
The data are the Galton heights of children given their parents' heights.
You can get this data in the UsingR package, which provides galton, but really it could be any regular data frame.
While the code executes properly, the sampleLm list isn't updated and contains only NULL. I get the impression this is normal because of the "no side effect" rule I found in the R documentation.
There must be a way to reformulate this so the for is replaced with apply. The question is how?
The easiest way here is replicate:
sampleLm <- replicate(100, lm(child ~ parent, data = galton,
subset = sample(seq(nrow(galton)), size = 50)),
simplify = FALSE)
You don't need to preallocate sampleLm when using the *apply family. You just need to write the function so that it returns the result of interest, and then store the final result in a variable. Use lapply over an index vector so that the fitted models are kept in a list:
sampleLm <- lapply(1:100, function(i) {
sampleGalton <- galton[sample(1:length(galton$child),size=50,replace=F),]
lm(sampleGalton$child ~ sampleGalton$parent)
})
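The same pattern can be tried without installing UsingR. This is only a sketch on the built-in mtcars data, with mpg, wt and a sample size of 20 chosen arbitrarily for illustration; it also shows how to pull one number out of each fitted model afterwards.

```r
set.seed(42)

# Fit 100 models, each on a random subset of 20 rows
fits <- lapply(seq_len(100), function(i) {
  rows <- sample(nrow(mtcars), size = 20)
  lm(mpg ~ wt, data = mtcars[rows, ])
})

# Extract the slope from every model into a numeric vector
slopes <- vapply(fits, function(fit) coef(fit)[["wt"]], numeric(1))
summary(slopes)
```

Because the function returns the model rather than assigning it to an outer variable, nothing needs to be preallocated.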

Anova in R: Dataframe selection

I just ran into a problem when using a variable in the aov term. Normally I would use "AGE" directly in the term, but I run it all in a loop, so myvar will change.
myvar=as.name("AGE")
x=summary( aov (dat ~ contrasts*myvar + Error(ID/(contrasts)), data =set))
names(set) = "contrasts" "AGE" "ID" "dat"
It's like when I want to select:
set$myvar
does not work, but set$AGE does.
Is there any code for this?
You need to create a string representation of the model formula, then convert it using as.formula.
myvar <- "AGE"
f <- as.formula(paste("dat ~", myvar))
aov(f, data = set)
As Richie wrote, pasting seems like the simplest solution. Here's a more complete example:
myvar <- "AGE"
f <- as.formula(paste("dat ~ contrasts *", myvar, "+ Error(ID/contrasts)"))
x <- summary( aov(f, data=set) )
...and instead of set$myvar you would write
set[[myvar]]
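Put together in a loop, the paste approach might look like the sketch below. The data frame is simulated, the second variable SEX is made up, and the Error() term is left out to keep the toy data simple; all of these are assumptions for illustration.

```r
set.seed(1)
# Simulated stand-in for the real data; SEX is a hypothetical second variable
set <- data.frame(
  dat       = rnorm(20),
  contrasts = gl(2, 10),
  AGE       = rnorm(20),
  SEX       = gl(2, 1, 20)
)

models <- list()
for (myvar in c("AGE", "SEX")) {
  f <- as.formula(paste("dat ~ contrasts *", myvar))  # build the formula
  models[[myvar]] <- aov(f, data = set)
}

# Column selection by name, the replacement for set$myvar
age_values <- set[["AGE"]]
names(models)
```

summary(models[["AGE"]]) then gives the table for one variable.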
A more advanced answer is that a formula is actually a call to the "~" operator. You can modify the call directly, which would be slightly more efficient inside the loop:
> f <- dat ~ contrasts * PLACEHOLDER + Error(ID/contrasts) # outside loop
> f[[3]][[2]][[3]] <- as.name(myvar) # inside loop
> f # see what it looks like...
dat ~ contrasts * AGE + Error(ID/contrasts)
The magic [[3]][[2]][[3]] specifies the part of the formula you want to replace. The formula actually looks something like this (a parse tree):
`~`(dat, `+`(`*`(contrasts, PLACEHOLDER), Error(`/`(ID, contrasts))))
Play around with indexing the formula and you'll understand:
> f[[3]]
contrasts * AGE + Error(ID/contrasts)
> f[[3]][[2]]
contrasts * AGE
UPDATE: What are the benefits of this? Well, it is more robust - especially if you don't control the data's column names. If myvar <- "AGE GROUP" the current paste solution doesn't work. And if myvar <- "file.create('~/OWNED')", you have a serious security risk...
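The robustness point can be checked directly. In this sketch the column name "AGE GROUP" is taken from the text above; the variable names pasted and is_error are illustrative.

```r
myvar <- "AGE GROUP"  # a column name that is not a valid R symbol

# Pasting produces text that cannot be parsed as a formula
pasted <- tryCatch(
  as.formula(paste("dat ~ contrasts *", myvar)),
  error = function(e) e
)
is_error <- inherits(pasted, "error")

# Replacing the placeholder in the call works with any name
f <- dat ~ contrasts * PLACEHOLDER + Error(ID/contrasts)
f[[3]][[2]][[3]] <- as.name(myvar)

is_error
all.vars(f)
```

With the paste approach, such names would have to be wrapped in backticks by hand before building the string.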
