store results of for loop in unique objects - r

Here is a simple loop
for (i in seq(1,30)) {
mdl<-i
}
How do I get 30 mdl objects rather than just one? Only one survives because, within the loop, mdl is overwritten at every iteration. What I want is 30 objects with names like mdl1, mdl2, ..., mdl30.
I tried this:
for (i in seq(1,30)) {
mdli<-i
}
But if I type mdl1, it says mdl1 not found whereas typing mdli gives me the value of i=5
Thank you

You can create your storage variable beforehand without deciding how many values it will hold. If you want a separate variable for each value, take a look at the paste function.
x<- NULL
for (i in 1:10){
x[i] <- i*2
}
*edit: The comment above is right. This way is not the most efficient one, but I still use it when computation time is not an issue.
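As for why mdli did not work: R treats mdli as one literal name, so the i inside it is never substituted. A minimal sketch of two alternatives follows. The prefix mdl comes from the question; storing i itself is only a placeholder for the real result. A named list is usually easier to work with afterwards, while assign() with paste0() creates genuinely separate objects.

```r
# Option 1: keep all 30 results together in one named list
mdl <- vector("list", 30)
for (i in seq_len(30)) {
  mdl[[i]] <- i          # placeholder; store the real result here
}
names(mdl) <- paste0("mdl", seq_len(30))
mdl[["mdl5"]]            # access one result by name

# Option 2: create 30 separate objects mdl1, mdl2, ..., mdl30
for (i in seq_len(30)) {
  assign(paste0("mdl", i), i)
}
mdl1
```

With the list, later steps such as lapply(mdl, summary) work on all results at once; with separate objects each one has to be fetched by name, for example with get("mdl7").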

Related

Cannot figure out how to use IF statement

I want to create a categorical variable for my database: a "Same_Region" group that includes all the people who live and work in the same region, and a "Diff_Region" group for those who don't. I tried to use an if statement, but I don't know how to properly say "if the variables Region of residence and Region of work are the same, return...". It's the very first time I have tried to approach R by myself, and I feel a little bit lost.
I tried storing the two variables (two-letter codes, e.g. "BO") as characters and using the "grep" command, but that gave no results.
Then I tried converting both variables to factors, and nothing much changed.
----In R-----
extractSamepr <- function(RegionOfRes, RegionOfWo){
if(RegionOfRes== RegionOfWo){
return("SamePr")
}
else {
return("DiffPr")
}
SamePr <- NULL
for (i in 1:nrow(Data.Base)) {
SamePr <- c(SamePr, extractSamepr(Data.Base[i, "RegionOfRes", "RegionOfWo"]))
}
The ifelse way proposed in @deepseefan's comment is a standard way of solving this type of problem.
Here is another one. It uses the fact that FALSE/TRUE are coded as integers 0/1 to create a logical vector based on equality and then add 1 to that vector, giving a vector of 1/2 values. This result is used in the function's final instruction to index a vector with the two possible outcomes.
extractSamepr <- function(DF){
i <- 1 + (DF[["RegionOfRes"]] == DF[["RegionOfWo"]])
c("DiffPr", "SamePr")[i]
}
Data.Base$SamePr <- extractSamepr(Data.Base)
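The comment mentioned above is not reproduced here, so the following is only a sketch of what an ifelse solution could look like. The data frame contents are made up for illustration; the column names follow the question. The as.character() calls are a precaution: comparing two factors with == fails when their level sets differ.

```r
# Hypothetical data; only the column names come from the question
Data.Base <- data.frame(
  RegionOfRes = c("BO", "MI", "RM", "BO"),
  RegionOfWo  = c("BO", "TO", "RM", "MI"),
  stringsAsFactors = TRUE
)

# Compare as character so factors with different levels do not raise an error
same <- as.character(Data.Base$RegionOfRes) == as.character(Data.Base$RegionOfWo)

# ifelse() is vectorised, so no loop over rows is needed
Data.Base$SamePr <- ifelse(same, "SamePr", "DiffPr")
Data.Base$SamePr
```

Unlike if, which expects a single TRUE or FALSE, ifelse works on the whole column at once.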

Multiple regressions with loop in loop in R

I want to run the following regressions. The variable that causes the problem is EP, a dummy variable for which I must check different cases; z (length = 1000) is the threshold variable. I want to create 1000 different versions of EP from the z variable and save the coefficients. I use a loop within a loop, but the results are completely wrong. The code runs properly and does not raise an error. The square brackets and parentheses are the code I run. The problem is that there is a huge delay and after two hours it is still running.
I reduced the sample by 99% and again did not get a result; the code ran without a problem.
I do not want anything special, just to run a different regression for each value of z and end up with the estimates stored. I cannot understand why it takes so long. Any idea?
for (k in 1:1000){
z<-u[k]
for (i in 1:length(dS)){
if (dS[i]>=z) {
EP[i]=1
} else {
EP[i]=0
}
fitT <- dynlm(dR ~ L(dR,1)+L(EN)+L(EP)+L(ΚΜ,1)
prob[[k]] <- summary(fitT)$coefficients[1, 2]
}
You don't have a closing } for the i-loop; you also don't have a closing ) for dynlm.
Note, you can really replace your i-loop by
EP <- as.integer(dS >= z)
Next time you ask a question, be clear and specific. What do you mean by "I use a loop in loop but the results are completely wrong"? Is there an error message, etc.?
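To show the overall shape with balanced braces, here is a self-contained sketch. It makes several assumptions: the data are simulated, base lm() with a manually lagged regressor stands in for dynlm so that no extra package is needed, and the stored quantity is the estimated coefficient on the lagged dummy. Only the names dS, dR, u, EP and z come from the question.

```r
set.seed(1)
n  <- 200
dS <- rnorm(n)                      # simulated stand-ins for the real series
dR <- rnorm(n)
u  <- quantile(dS, probs = seq(0.1, 0.9, length.out = 20))  # thresholds

est <- numeric(length(u))           # pre-allocated storage, one slot per threshold
for (k in seq_along(u)) {
  z  <- u[k]
  EP <- as.integer(dS >= z)         # vectorised; replaces the whole inner loop
  # Regress dR on its own first lag and on lagged EP (lags built by hand)
  fit <- lm(dR[-1] ~ dR[-n] + EP[-n])
  est[k] <- coef(fit)[3]            # coefficient on lagged EP
}
est
```

The regression runs once per threshold, after EP is complete. If the model were fitted inside the inner loop instead, it would be re-estimated length(dS) times for every threshold, which could explain a run time of hours.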

Trying to find mean of each column in a data set

Hello everyone, I am fairly new to R programming and I was wondering if someone could help me out. I was just playing with R and wanted to make a function that returns a vector of the means of each column in a data set that the user passes in as an argument. The catch is that I am trying to do it without mean or the apply functions, so I am working it out manually, and I feel I am very close to finishing it. I just wanted to ask if someone could check it to see where I made an error.
Here is my code:
findMeans<- function(data)
{
meanVec <- numeric()
for(i in 1:6)
{
mean=0
for( j in 1:153)
{
value=0
count=0
if(is.na(data[j,i])==FALSE)
{
value= value + data[i,j]
count=count+1
}
else
{
value= value +0
}
}
mean =value/count
meanVec[i]<-mean
}
meanVec
}
and when I try to list the vector it just gives this
> meanVec
numeric(0)
could anyone possibly shed some light on what I am doing wrong?
If you're looking for function-writing practice, and are already aware of the colMeans function, there are a couple of errors I spotted.
1) I assume that when you're going from 1:6, you're going through each column in your data frame, and 1:153, you're going through each row. If this is accurate, your value=0 and count = 0 statements should be moved a level up, next to mean = 0. Otherwise, you're resetting the value to zero every row you go through, which won't do anything but report the last value it comes across.
2) In the line value= value + data[i,j], you need data[j,i] instead. You reversed the row and column values.
With those two changes, your function seems to work for a data set with 6 columns and 153 rows. For more practice, I'd recommend trying to find a way to generalize the function for any number of columns and rows.
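A sketch of the function with both changes applied and the hard-coded 6 and 153 replaced by ncol() and nrow(). The built-in airquality data set, which happens to have 153 rows and 6 columns, is used here purely for illustration; the question does not say which data were used.

```r
findMeans <- function(data) {
  meanVec <- numeric(ncol(data))
  for (i in seq_len(ncol(data))) {        # i indexes columns
    value <- 0                            # reset once per column, not per row
    count <- 0
    for (j in seq_len(nrow(data))) {      # j indexes rows
      if (!is.na(data[j, i])) {           # row first, then column
        value <- value + data[j, i]
        count <- count + 1
      }
    }
    meanVec[i] <- value / count
  }
  meanVec
}

result <- findMeans(airquality)
result
```

Note that meanVec exists only inside the function, so the returned vector has to be assigned to a name (result above) before it can be inspected.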

Looping in R to create transformed variables

I have a dataset of 80 variables, and I want to loop though a subset of 50 of them and construct returns. I have a list of the names of the variables for which I want to construct returns, and am attempting to use the dplyr command mutate to construct the variables in a loop. Specifically my code is:
for (i in returnvars) {
alldta <- mutate(alldta,paste("r",i,sep="") = (i - lag(i,1))/lag(i,1))}
where returnvars is my list, and alldta is my dataset. When I run this code outside the loop with just one of the `i' values, it works fine. The code for that looks like this:
alldta <- mutate(alldta,rVar = (Var- lag(Var,1))/lag(Var,1))
However, when I run it in the loop (e.g., attempting to do the previous line of code 50 times for 50 different variables), I get the following error:
Error: unexpected '=' in:
"for (i in returnvars) {
alldta <- mutate(alldta,paste("r",i,sep="") ="
I am unsure why this issue is coming up. I have looked into a number of ways to try and do this, and have attempted solutions that use lapply as well, without success.
Any help would be much appreciated! If there is an easy way to do this with one of the apply commands as well, that would be great. I did not provide a dataset because my question is not data specific, I'm simply trying to understand, as a relative R beginner, how to construct many transformed variables at once and add them to my data frame.
EDIT: As per Frank's comment, I updated the code to the following:
for (i in returnvars) {
varname <- paste("r",i,sep="")
alldta <- mutate(alldta,varname = (i - lag(i,1))/lag(i,1))}
This fixes the previous error, but I am still not referencing the variable correctly, so I get the error
Error in "Var" - lag("Var", 1) :
non-numeric argument to binary operator
Which I assume is because R sees my variable name Var as a string, rather than as a variable. How would I correctly reference the variable in my dataset alldta? I tried get(i) and alldta$get(i), both without success.
I'm also still open to (and actively curious about), more R-style ways to do this entire process, as opposed to using a loop.
Using mutate inside a loop might not be a good idea either. I am not sure whether mutate makes a copy of the data frame, but it's generally not good practice to grow a data frame inside a loop. Instead, create a separate data frame with the output and then name the columns based on your logic.
result = as.data.frame(do.call(cbind, lapply(returnvars, function(i) {...})))
names(result) = paste("r", returnvars, sep="")
After playing around with this more, I discovered (thanks to Frank's suggestion), that the following works:
extended <- alldta # Make a copy of my dataset
for (i in returnvars) {
varname <- paste("r",i,sep="")
extended[[varname]] = (extended[[i]] - lag(extended[[i]],1))/lag(extended[[i]],1)}
This is still not very R-styled in that I am using a loop, but for a task that is only repeating about 50 times, this shouldn't be a large issue.
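For a loop-free variant, here is a minimal base-R sketch. The data frame contents and the helper simple_return are hypothetical; only the names alldta, returnvars and extended come from the question. The helper builds the lag by hand because lag() behaves differently depending on whether dplyr is loaded: stats::lag does not shift the values of a plain vector.

```r
# Hypothetical data standing in for the real 80-variable data set
alldta <- data.frame(A = c(100, 110, 121), B = c(50, 55, 44), id = 1:3)
returnvars <- c("A", "B")

# Hypothetical helper: simple return relative to the previous row
simple_return <- function(x) {
  prev <- c(NA, head(x, -1))      # explicit one-step lag
  (x - prev) / prev
}

extended <- alldta
# lapply() returns one vector per variable; assign them all as new columns at once
extended[paste0("r", returnvars)] <- lapply(alldta[returnvars], simple_return)
extended
```

The first row of each new column is NA because there is no previous value to compare against.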

How to simplify several for loops into a single loop or function in R

I am trying to combine several for loops into a single loop or function. Each loop evaluates whether an individual is present at a protected site and, based on that, assigns a number (numbers represent sites) at each time step. After that, the results for each time step are stored in a matrix and later used in other analyses. The problem is that I am repeating the same loop several times to evaluate the different scenarios (10%, 50%, 100% of sites protected). Since I need to store my results for each scenario, I can't think of a better way to simplify this into a single loop or function. Any ideas or suggestions will be appreciated. This is a very small and simplified version of the problem. I would like to keep the structure of the loop since my original loop uses several if statements. The only thing that changes is the proportion of sites that are protected.
N<-10 # number of sites
sites<-factor(seq(from=1,to=N))
sites10<-as.factor(sample(sites,N*1))
sites5<-as.factor(sample(sites,N*0.5))
sites1<-as.factor(sample(sites,N*0.1))
steps<-10
P.stay<-0.9
# storing results
result<-matrix(0,nrow=steps)
time.step<-seq(1,steps)
time.step<-data.frame(time.step)
time.step$event<-0
j<-numeric(steps)
j[1]<-sample(1:N,1)
time.step$event[1]<-j[1]
for(i in 1:(steps-1)){
if(j[i] %in% sites1){
if(rbinom(1,1,P.stay)==1){time.step$event[i+1]<-j[i+1]<-j[i]} else
time.step$event[i+1]<-0
}
time.step$event[i+1]<-j[i+1]<-sample(1:N,1)
}
results.sites1<-as.factor(result)
###
result<-matrix(0,nrow=steps)
time.step<-seq(1,steps)
time.step<-data.frame(time.step)
time.step$event<-0
j<-numeric(steps)
j[1]<-sample(1:N,1)
time.step$event[1]<-j[1]
for(i in 1:(steps-1)){
if(j[i] %in% sites5){
if(rbinom(1,1,P.stay)==1){time.step$event[i+1]<-j[i+1]<-j[i]} else
time.step$event[i+1]<-0
}
time.step$event[i+1]<-j[i+1]<-sample(1:N,1)
}
results.sites5<-as.factor(result)
###
result<-matrix(0,nrow=steps)
time.step<-seq(1,steps)
time.step<-data.frame(time.step)
time.step$event<-0
j<-numeric(steps)
j[1]<-sample(1:N,1)
time.step$event[1]<-j[1]
for(i in 1:(steps-1)){
if(j[i] %in% sites10){
if(rbinom(1,1,P.stay)==1){time.step$event[i+1]<-j[i+1]<-j[i]} else
time.step$event[i+1]<-0
}
time.step$event[i+1]<-j[i+1]<-sample(1:N,1)
}
results.sites10<-as.factor(result)
#
results.sites1
results.sites5
results.sites10
Instead of doing this:
sites10<-as.factor(sample(sites,N*1))
sites5<-as.factor(sample(sites,N*0.5))
sites1<-as.factor(sample(sites,N*0.1))
and running distinct loops for each of the three variables, you can make a general loop and put it in a function, then use one of the -apply functions to call it with specific parameters. For example:
N<-10 # number of sites
sites<-factor(seq(from=1,to=N))
steps<-10
P.stay<-0.9
simulate.n.sites <- function(n) {
n.sites <- sample(sites, n)
result<-matrix(0,nrow=steps)
time.step<-seq(1,steps)
time.step<-data.frame(time.step)
time.step$event<-0
j<-numeric(steps)
j[1]<-sample(1:N,1)
time.step$event[1]<-j[1]
for(i in 1:(steps-1)){
if(j[i] %in% n.sites){
...etc...
return(result)
}
results <- lapply(c(1, 5, 10), simulate.n.sites)
Now results will be a list, with three matrix elements.
The key is to identify places where you repeat yourself, and then refactor those areas into functions. Not only is this more concise, but it's also easy to extend in the future. Want to sample for 2 sites? Put a 2 in the vector you pass to lapply.
If you're unfamiliar with the -apply family of functions, definitely look into those.
I also suspect that much of the rest of your code could be simplified, but I think you've gutted it too much for me to make sense of it. For example, you define an element of time.step$event based on a condition, but then you overwrite that element. Surely this isn't what the actual code does?
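Since the loop body is elided above, the following complete version is only a sketch under an assumed movement rule: an individual at a protected site stays with probability P.stay, and otherwise moves to a randomly chosen site. That rule and the returned vector of visited sites are guesses at the intent, not the original logic.

```r
N      <- 10     # number of sites
steps  <- 10
P.stay <- 0.9

simulate.n.sites <- function(n) {
  protected <- sample(seq_len(N), n)      # n protected sites chosen at random
  j <- numeric(steps)
  j[1] <- sample(seq_len(N), 1)           # random starting site
  for (i in seq_len(steps - 1)) {
    # Assumed rule: stay only if the current site is protected and the draw succeeds
    stays <- j[i] %in% protected && rbinom(1, 1, P.stay) == 1
    j[i + 1] <- if (stays) j[i] else sample(seq_len(N), 1)
  }
  j                                       # site occupied at each time step
}

set.seed(42)
results <- lapply(c(1, 5, 10), simulate.n.sites)
names(results) <- paste0("sites", c(1, 5, 10))
results
```

Each scenario is then available by name, for example results$sites5, and adding a scenario only means adding a number to the vector passed to lapply.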
