I am new to R and have a question about the following exercise: loop through all the combinations of unique days and unique individuals in the activity_budget dataset. For each iteration of the inner loop, subset the data on the current values of day and individual from your loops. Calculate the mean time value for this subset and store it in a vector called my_vector.
I wrote the code below, but I received an error. Thank you in advance.
setwd("C:/ /")
activity_budget <- read.csv("activity_budget.csv")
getwd()
str(activity_budget)
head(activity_budget)
my_vector<-NULL
for(i in unique(activity_budget$day)){
for(j in unique(activity_budget$individual)){
subset_data<-subset(activity_budget, activity_budget$day == i & activity_budget$individual== j)
my_vector<-mean(activity_budget$time[subset(activity_budget, activity_budget$day & activity_budget$individual)],na.rm=TRUE)
}
}
my_vector<-NULL
unique(activity_budget$day)
unique(activity_budget$individual)
unique(activity_budget$time)
mean(activity_budget$time)
activity_budget$day==i& activity_budget$individual==j
my_vector <- NULL
index <- 0
for (i in unique(activity_budget$day)) {
  for (j in unique(activity_budget$individual)) {
    subset_data <- activity_budget$day == i & activity_budget$individual == j
    index <- index + 1
    my_vector[index] <- mean(activity_budget$time[subset_data], na.rm = TRUE)
  }
}
my_vector
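For comparison, here is a minimal sketch of the same calculation without explicit loops, using tapply. The data frame toy_budget and its values are made up for illustration; only the column names day, individual and time come from the question. One difference to be aware of: for a day/individual combination with no rows, the loop stores NaN, whereas tapply returns NA.

```r
# Toy data (hypothetical values); column names follow the question.
toy_budget <- data.frame(
  day        = c(1, 1, 2, 2, 1, 2),
  individual = c("a", "b", "a", "b", "a", "b"),
  time       = c(10, 20, 30, 40, 50, NA),
  stringsAsFactors = FALSE
)

# Mean time per day (rows) and individual (columns), ignoring NAs.
mean_matrix <- tapply(
  toy_budget$time,
  list(toy_budget$day, toy_budget$individual),
  mean,
  na.rm = TRUE
)

# Flatten row by row so the order matches the nested loop
# (days in the outer loop, individuals in the inner loop).
my_vector_sketch <- as.vector(t(mean_matrix))
my_vector_sketch
```

The first attempt in the question fails for two separate reasons: subset() returns a data frame, which cannot be used to index activity_budget$time, and my_vector <- ... overwrites the result on every iteration instead of storing it at a new position.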
Related
This is the code that I am trying to run and it's taking a while.
districts is a data frame of 39299 rows and 16 columns, and lm_data is a data frame of 59804 rows and 16 variables. I want to set up a new variable in lm_data called tentativeStartDate, which takes on the value of districts$firstDay[j] if a couple of conditions are met. Is there a more efficient way to do this?
for (i in 1: nrow(lm_data)){
for (j in 1: nrow(districts)){
if (lm_data$DISTORGID[i] == districts$DISTORGID[j] & lm_data$gradeCode[i] == districts$gradeCode[j]){
lm_data$tentativeStartDate[i] = districts$firstDay[j]
}
}
}
Not sure if this will work since I can't test it, but if it does work it should be much faster.
# get the indices
idx <- which(lm_data$DISTORGID == districts$DISTORGID & lm_data$gradeCode == districts$gradeCode)
lm_data$tentativeStartDate[idx] <- districts$firstDay[idx]
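One caveat with the element-wise == comparison: it compares row i of lm_data with row i of districts (recycling the shorter vector), so it only reproduces the nested loop if the two data frames happen to be aligned row by row. A lookup on a combined key may be closer to what the loop does. The following is a sketch on made-up data (lm_toy and districts_toy are hypothetical; the column names come from the question). Note that match() returns the first matching row, while the loop keeps the last one, so the two differ if districts contains duplicate DISTORGID/gradeCode pairs.

```r
# Toy stand-ins for lm_data and districts (values are invented).
lm_toy <- data.frame(
  DISTORGID = c(1, 2, 2, 3),
  gradeCode = c("K", "K", "1", "K"),
  stringsAsFactors = FALSE
)
districts_toy <- data.frame(
  DISTORGID = c(2, 1, 2),
  gradeCode = c("1", "K", "K"),
  firstDay  = c("2020-08-10", "2020-08-17", "2020-08-12"),
  stringsAsFactors = FALSE
)

# Build one key per row from the two matching columns.
lm_key       <- paste(lm_toy$DISTORGID, lm_toy$gradeCode, sep = "_")
district_key <- paste(districts_toy$DISTORGID, districts_toy$gradeCode, sep = "_")

# For each lm row, find the first district row with the same key
# (NA if there is none) and copy its firstDay.
lm_toy$tentativeStartDate <- districts_toy$firstDay[match(lm_key, district_key)]
lm_toy
```

Rows of lm_toy without a matching district get NA, which is also what the loop would leave behind for a freshly created column.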
I have a list of dataframes. For each of these dataframes I want to do computations that involve only specific rows and columns. These computations use a for loop that counts the number of cases for a specific number range and stores this number in an object called counter.
The code works if I apply it to just one of the dataframes.
counter=0
for (val in df[7,10:109]) {
if (val <= 5000 & val !=-1) {counter=counter+1}
else {break}
}
Now I want it to calculate the same for all the dataframes. I tried this using the sapply function:
counter=0
sapply(filelist, function(x){
x<-get(x)
for (val in x[7,10:109]) {
if (val <= 5000 & val !=-1) {counter=counter+1}
else {break}
print(counter)
}
})
However, the results of the computation are not saved in the counter. When I include the print(counter) command I can see that the results are in fact saved for each of the data frames temporarily.
How do I have them added up in one object that I can then manipulate further?
Your code does not look optimal, but if you run sapply on filelist you should expect to get back a vector of counts. So you could return the count from each call inside sapply and then assign the result to a variable:
results <- sapply(filelist, function(x) {
  x <- get(x)
  counter <- 0
  for (val in x[7,10:109]) {
    if (val <= 5000 & val !=-1) {counter=counter+1}
    else {break}
    print(counter)
  }
  return(counter)
})
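The reason the original attempt loses the counts is that counter=counter+1 inside the anonymous function only changes a local copy; the counter defined outside is untouched. As a possible simplification, the loop with break can be replaced by counting the leading run of values that pass the test. The sketch below assumes a list of data frames rather than names passed to get(), and uses small made-up data frames (df_a, df_b) with a shorter row and column range than the question's x[7,10:109]; count_leading is a hypothetical helper name.

```r
# Count how many values at the start of `vals` satisfy the condition,
# stopping at the first one that fails (same effect as the loop with break).
count_leading <- function(vals, limit = 5000, missing_code = -1) {
  ok <- vals <= limit & vals != missing_code
  sum(cumprod(ok))  # cumprod turns everything after the first FALSE into 0
}

# Toy data frames (hypothetical); the row/column range is shortened here.
df_a <- data.frame(id = 1, v1 = 100, v2 = 4000, v3 = 6000, v4 = 10)
df_b <- data.frame(id = 1, v1 = 20,  v2 = -1,   v3 = 30,   v4 = 40)
toy_list <- list(a = df_a, b = df_b)

# One count per data frame, returned as a named vector.
counts <- sapply(toy_list, function(d) count_leading(unlist(d[1, 2:5])))
counts
total <- sum(counts)  # add them up if a single number is needed
```

Because sapply returns the per-data-frame counts as a vector, they can be summed, compared or stored without any shared counter.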
I'm trying to save each iteration of this for loop in a vector.
for (i in 1:177) {
a <- geomean(er1$CW[1:i])
}
Basically, I have a list of 177 values and I'd like the script to find the cumulative geometric mean of the list, going one by one. Right now it only gives me the final value; it doesn't save each loop iteration as a separate value in a list or vector.
The reason your code does not work is that the object a is overwritten in each iteration. The following code, for instance, does precisely what you desire:
a <- c()
for(i in 1:177){
a[i] <- geomean(er1$CW[1:i])
}
Alternatively, this would work as well:
for(i in 1:177){
if(i != 1){
a <- rbind(a, geomean(er1$CW[1:i]))
}
if(i == 1){
a <- geomean(er1$CW[1:i])
}
}
I started down a similar path with rbind as @nate_edwinton did, but couldn't figure it out. I did, however, come up with something effective. Hmmmm, geo_mean. Cool. Coerce back to a list.
MyNums <- data.frame(x=(1:177))
a <- data.frame(x=integer())
for(i in 1:177){
a[i,1] <- geomean(MyNums$x[1:i])
}
a<-as.list(a)
You can try defining the variable that will hold the results first, and then append to it in each iteration:
b <- c()
for (i in 1:177) {
a <- geomean(er1$CW[1:i])
b <- c(b,a)
}
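The answers above all grow the result one element at a time. A minimal sketch of two alternatives follows: preallocating the result vector, and computing the cumulative geometric mean in one vectorized step. Since geomean is not a base R function, the sketch assumes it behaves like exp(mean(log(x))) for positive values and defines a stand-in called geomean_sketch; the input x is made up.

```r
# Assumed stand-in for geomean(): geometric mean of positive numbers.
geomean_sketch <- function(x) exp(mean(log(x)))

x <- c(2, 8, 4)  # toy input; the question uses er1$CW with 177 values

# Option 1: preallocate the result, then fill it in the loop.
a <- numeric(length(x))
for (i in seq_along(x)) {
  a[i] <- geomean_sketch(x[1:i])
}

# Option 2: no loop. The geometric mean of the first i values is
# exp(sum of their logs / i), so a cumulative sum of logs is enough.
a_vectorized <- exp(cumsum(log(x)) / seq_along(x))

a
a_vectorized
```

Both options give the same values; the vectorized form avoids recomputing the mean of the first i elements from scratch at every step.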
UPDATE
Thanks to the help and suggestions of @CarlWitthoft, my code was simplified to this:
model <- unlist(sapply(1:length(model.list),
function(i) ifelse(length(model.list[[i]][model.lookup[[i]]] == "") == 0,
NA, model.list[[i]][model.lookup[[i]]])))
ORIGINAL POST
Recently I read an article on how vectorizing operations in R instead of using for loops is good practice. I have a piece of code where I used a big for loop, and I'm trying to turn it into a vector operation, but I cannot find the answer. Could someone help me? Is it possible, or do I need to change my approach? My code works fine with the for loop, but I want to try the other way.
model <- c(0)
price <- c(0)
size <- c(0)
reviews <- c(0)
for(i in 1:length(model.list)) {
if(length(model.list[[i]][model.lookup[[i]]] == "") == 0) {
model[i] <- NA
} else {
model[i] <- model.list[[i]][model.lookup[[i]]]
}
if(length(model.list[[i]][price.lookup[[i]]] == "") == 0) {
price[i] <- NA
} else {
price[i] <- model.list[[i]][price.lookup[[i]]]
}
if(length(model.list[[i]][reviews.lookup[[i]]] == "") == 0) {
reviews[i] <- NA
} else {
reviews[i] <- model.list[[i]][reviews.lookup[[i]]]
}
size[i] <- product.link[[i]][size.lookup[[i]]]
}
Basically, the model.list variable is a list from which I want to extract a particular vector. The location within each element is given by the variables model.lookup, price.lookup and reviews.lookup, which contain logical vectors with just one TRUE value that is used to return the desired vector from model.list. In every cycle of the for loop, the extracted vectors are stored in the variables model, price, size and reviews.
Could this be changed to a vector operation?
In general, try to avoid if when not needed. I think your desired output can be built as follows.
model <- unlist(sapply(1:length(model.list), function(i) model.list[[i]][model.lookup[[i]]]))
model[model=='']<-NA
And the same for your other variables. This assumes that all model.lookup[[i]] are of length one. If they aren't, you won't be able to write the output to a single element of model in the first place.
I would also note that you are grossly overcoding, e.g. x<-0 is better than x<-c(0), and don't bother with length evaluation on a single item.
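Since the same extract-or-NA step is repeated for every variable, one option is to put it in a small helper and apply it with mapply. This is only a sketch: extract_one is a hypothetical helper name, and the lists below are tiny made-up stand-ins that reuse the question's variable names. The third lookup in model.lookup has no TRUE, to show the NA case.

```r
# Return the element of `values` selected by the logical `lookup`,
# or NA if nothing is selected.
extract_one <- function(values, lookup) {
  hit <- values[lookup]
  if (length(hit) == 0) NA_character_ else hit[1]
}

# Toy stand-ins (invented values) for the question's lists.
model.list   <- list(c("A1", "9.99", "5"), c("B2", "19.99", "3"), c("x", "y"))
model.lookup <- list(c(TRUE, FALSE, FALSE), c(TRUE, FALSE, FALSE), c(FALSE, FALSE))
price.lookup <- list(c(FALSE, TRUE, FALSE), c(FALSE, TRUE, FALSE), c(FALSE, TRUE))

# mapply walks both lists in parallel, one pair per call.
model <- unname(mapply(extract_one, model.list, model.lookup))
price <- unname(mapply(extract_one, model.list, price.lookup))

model
price
```

Unlike unlist(sapply(...)), this keeps one result per list element even when a lookup selects nothing, so the output stays aligned with model.list.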
I have a command that generates a variable every 10 loops in R (index1, index2, index3... and so on). The command I have is functional, but I am thinking of a smarter way to write this command. Here's what my command looks like:
for (counter in 1:10){
for (i in 1:100){
if (counter == 1){
index1 <- data1 ## some really long command here, I just changed it to this simple command to illustrate the idea
}
if (counter == 2){
index2 <- data2
}
.
.
.
# until I reach index10
} ## indexing closure
} ## counter closure
Is there a way to write this without having to write the conditional if commands? I would like to generate index1, index2.... I am sure there is some easy way to do this but I just cannot think of it.
Thanks.
What you need is the modulo operator %% inside the inner loop, together with the assign function. For example, 100%%10 returns 0, 101%%10 returns 1, and 92%%10 returns 2; in other words, if the number is a multiple of 10 then you get 0.
Note: You no longer need the outer loop used in your example.
So, to create a variable at every 10th iteration, do something like this:
for(i in 1:100){
#check if i is multiple of 10
if(i%%10==0){
myVar<-log(i)
assign(paste("index",i/10,sep=""), myVar)
}
}
ls() #shows that index1, index2, ...index10 objects have been created.
index1 #returns 2.302585
update:
Alternatively, you can store results in a vector
index<-numeric(10)
for(i in 1:100){
#check if i is multiple of 10
if(i%%10==0){
index[i/10]<-log(i)
}
}
index #returns a vector with 10 elements, each a result at end of an iteration that is a multiple of 10.
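If the results are larger objects rather than single numbers, a named list is a common alternative to creating index1, index2, ... with assign. The sketch below assumes, as in the answer above, that log(i) stands in for the "really long command"; compute_index is a hypothetical name for it.

```r
# Hypothetical stand-in for the "really long command" in the question.
compute_index <- function(i) log(i)

# The iterations at which a result should be kept: 10, 20, ..., 100.
checkpoints <- seq(10, 100, by = 10)

# One list element per checkpoint, named index1 ... index10.
index_list <- lapply(checkpoints, compute_index)
names(index_list) <- paste0("index", seq_along(checkpoints))

index_list$index1        # same value as index1 in the assign() version
index_list[["index10"]]  # elements can also be looked up by name
```

Keeping everything in one list makes it easy to loop over the results later, which is awkward with ten separate variables.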