R Language For Loop on factorial - r

I have a question about for loops.
Below are two versions of a loop; my goal is to compute a factorial with a function that uses a for loop.
----------------------------------
Method 1
s<-function(input){
stu<-1
for(i in 1:input){
stu<-1*((1:input)[i])
}
return(stu)
}
----------------------------------------
Method 2
k <- function(input){
y <- 1
for(i in 1:input){
y <-y*((1:input)[i])
}
return(y)
}
But the result of Method 1 is
> s(1)
[1] 1
> s(4)
[1] 4
> s(8)
[1] 8
and the result of Method 2 is
> k(1)
[1] 1
> k(4)
[1] 24
> k(8)
[1] 40320
-------------------------------
Obviously Method 2 is correct and Method 1 is incorrect. But why? What is the difference between them? Why can't I use stu<-1*((1:input)[i]) instead of stu<-stu*((1:input)[i])?

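It's because stu does not accumulate inside the for loop: each iteration overwrites it with 1 times the i-th element, so the previous product is lost. Printing the values inside the loop shows this: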
s<-function(input){
stu<-1
for(i in 1:input){
stu<-1*((1:input)[i])
message(paste(i,stu,sep="\t"))
}
return(stu)
}
s(5)
1 1 # at the first loop, 1 x 1 is calculated
2 2 # at the 2nd loop, 1 x 2 is calculated
3 3 # at the 3rd loop, 1 x 3 is calculated
4 4 # at the 4th loop, 1 x 4 is calculated
5 5 # at the 5th loop, 1 x 5 is calculated
[1] 5
However, if you use stu<-stu*((1:input)[i]) instead of stu<-1*((1:input)[i]), the output is:
s(5)
1 1 # at the first loop, 1 x 1 is calculated.
2 2 # at the second loop, 1 x 2 is calculated.
3 6 # at the third loop, 2 x 3 is calculated.
4 24 # at the fourth loop, 6 x 4 is calculated.
5 120 # at the fifth loop, 24 x 5 is calculated.
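As a minimal sketch of the same idea (the names fact_loop and result are only for illustration, they are not from the question): the accumulator has to appear on both sides of the assignment so that each iteration builds on the previous product. seq_len(n) is used here instead of 1:n, on the assumption that an input of 0 should return 1; with 1:0 the loop would run over c(1, 0) and return 0.

```r
# Factorial with a for loop: multiply the running product by each i in turn
fact_loop <- function(n) {
  result <- 1                      # running product, starts at 1
  for (i in seq_len(n)) {          # seq_len(0) is empty, so the loop is skipped
    result <- result * i           # update uses the previous value of result
  }
  result
}

fact_loop(5)   # 120
fact_loop(0)   # 1
```

For real work the built-in factorial(n) or prod(seq_len(n)) does the same without an explicit loop.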

Related

How do I run my loop for each simulation and create a new vector with these values?

This is my data frame:
time<-rep(c(1:5),4)
sim1<-rep(c(paste("sim",1)),5)
sim2<-rep(c(paste("sim",2)),5)
sim3<-rep(c(paste("sim",3)),5)
sim4<-rep(c(paste("sim",4)),5)
sim<-c(sim1,sim2,sim3,sim4)
id<-as.vector(replicate(4,sample(1:5)))
df<-data.frame(time,sim,id)
# factor() is needed in R >= 4.0, where data.frame() no longer converts strings to factors
df$simnu<-as.numeric(factor(df$sim))
Which should look something like this:
time sim id simnu
1 1 sim 1 1 1
2 2 sim 1 3 1
3 3 sim 1 2 1
4 4 sim 1 4 1
5 5 sim 1 5 1
6 1 sim 2 1 2
7 2 sim 2 5 2
8 3 sim 2 4 2
9 4 sim 2 2 2
10 5 sim 2 3 2
11 1 sim 3 2 3
12 2 sim 3 3 3
13 3 sim 3 4 3
14 4 sim 3 1 3
15 5 sim 3 5 3
16 1 sim 4 3 4
17 2 sim 4 5 4
18 3 sim 4 2 4
19 4 sim 4 1 4
20 5 sim 4 4 4
I have created this loop that subsets the data by simulation and then calculates the output I want:
surveillance<-5
n<-1
simsub<-df[which(df$simnu==1),names(df)%in%c("time","sim","id")]
while (n<=surveillance){
print (n)
rndid<-df[sample(nrow(simsub),1),]
print(rndid)
if(n<rndid$time){
n<-n+1
} else {
tinf<-sum(length(df[which(simsub$time<=n),1]))
prev<-tinf/length(simsub[,1])
print(paste(prev,"prevalence"))
break
}
}
My question is how do I run this loop for each simulation and return the values of this as a vector?
My suggestion is to take a look at the lapply function (or sapply and vapply) and to avoid using while. To be honest, it's a bit tricky to help without really knowing what is happening in your code, but here is an example of how you can use lapply. Since I don't know what your code should return, I can't be sure that the output is correct.
I added comments and questions next to your original lines (my_df below stands for your data frame). Hope this helps.
# first define a function that takes one simnu and returns whatever you want it to return
my_calc_fun <- function(sim_nr){
## you can subset the DF without which, names, or %in%
# simres[[i]] <- my_df[which(my_df$simnu==i),names(my_df)%in%c("time","sim","id")]
sim_df <- my_df[my_df$simnu == sim_nr, c("time","sim","id")]
for(n in 1:surveillance){
## I'm not sure that is what you meant to do,
## you are sampling the full DF, but you want a sample
## from the subset i.e., simres[[i]]
# rndid<-my_df[sample(nrow(simres[[i]]),1),]
row_id <- sample(nrow(sim_df), 1)
rndid <- sim_df[row_id, ]
if(n >= rndid$time){
## what are you trying to sum here?
## because you are giving the function one number length(....)
## and just like above you are subsetting the full DF here
# tinf<-sum(length(my_df[which(simres[[i]]$time<=n),1]))
tinf <- length(sim_df[sim_df$time<=n, 1])
# is this the value you want to return for each simnu?
# nrow() counts the rows of the subset; length(sim_df["time"]) would be 1 (the number of columns)
prev <- tinf/nrow(sim_df)
break
}
}
return(c('simnu'=sim_nr, 'prev' = prev))
}
# apply this function on all values of simnu and save to list
result_all <- lapply(unique(my_df$simnu), my_calc_fun)
result_all
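Since the question asks for a vector rather than a list, here is a small sketch of the same pattern with vapply. The data frame toy and the function share_up_to are hypothetical stand-ins (a deterministic summary instead of the random sampling above), just to show the shape of the call.

```r
# Toy data with the same kind of columns as in the question
toy <- data.frame(time = rep(1:5, 4), simnu = rep(1:4, each = 5))

# Hypothetical per-simulation summary: share of rows with time <= cutoff
share_up_to <- function(sim_nr, cutoff = 3) {
  sim_df <- toy[toy$simnu == sim_nr, ]
  sum(sim_df$time <= cutoff) / nrow(sim_df)
}

sims <- unique(toy$simnu)
# vapply returns a plain numeric vector and checks that each result is one number
prev_vec <- vapply(sims, share_up_to, numeric(1))
names(prev_vec) <- sims
prev_vec
```

vapply is stricter than sapply: it fails loudly if the function ever returns something other than a single number, which is usually what you want inside a simulation.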

R: Aggregate dataframe if column has less than 3 zeros, else return zero

I have ratings of images by several raters:
data <- as.data.frame(matrix(c(rep(1,6),rep(2,6),rep(1:6,2),
0,2,1,0,1,0,0,0,3,0,0,0),12,3))
colnames(data) <- c("image", "rater", "rating")
print(data)
# image rater rating
# 1 1 1 0
# 2 1 2 2
# 3 1 3 1
# 4 1 4 0
# 5 1 5 1
# 6 1 6 0
# 7 2 1 0
# 8 2 2 0
# 9 2 3 3
# 10 2 4 0
# 11 2 5 0
# 12 2 6 0
I want to aggregate (mean) ratings by image, but only if there are fewer than 3 zero ratings for a given image. Otherwise (i.e. if there are 3 zeros or more), the aggregated rating should be zero. The zeros should only be counted for raters 1-5.
So for the above data:
# image rating
# 1 1 0.8
# 2 2 0.0
For image 1 ratings are aggregated because the third zero belongs to rater 6. For image 2, the aggregated rating is zero because there are more than 2 zeros.
On top of that, I want the aggregation to take into account a) only the first 5 ratings for each image, and b) only positive ratings.
I can manage the last 2 conditions using aggregate:
aggregate(rating ~ image, data = data[data$rater <= 5 & data$rating != 0,], mean)
# Result:
# image rating
# 1 1 1.333333
# 2 2 3.000000
But I can't figure out the first condition.
Correct results should be:
# image rating
# 1 1 1.333333
# 2 2 0.000000
Can anyone please help? Thanks.
Here is a nice method using base R:
data$this <- ave(data$rating, data$image,
FUN=function(i) if(sum(i[1:5] > 0) > 2) mean(i[1:5]) else 0)
I use i[1:5] to take the first five ratings of each image (this assumes the rows are ordered by rater within each image), so if an image has fewer than 5 raters you will get an error. ave returns the group value repeated for every row of the group, in case that is of interest. You can use the same function to produce the aggregation table you mentioned:
aggregate(data$rating, data["image"],
FUN=function(i) if(sum(i[1:5] > 0) > 2) mean(i[1:5]) else 0)
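Note that mean(i[1:5]) averages the zeros as well, so image 1 comes out as 0.8. If the mean should use only positive ratings, as in the "correct results" of the question, one possible variant is sketched below. The names ratings, first5 and agg are made up for this example, and it filters on the rater column instead of relying on row order.

```r
# Same ratings as in the question, built as a data frame
ratings <- data.frame(
  image  = rep(1:2, each = 6),
  rater  = rep(1:6, times = 2),
  rating = c(0, 2, 1, 0, 1, 0, 0, 0, 3, 0, 0, 0)
)

# Keep raters 1-5 only, zeros included so they can be counted
first5 <- ratings[ratings$rater <= 5, ]

# Fewer than 3 zeros: mean of the positive ratings; otherwise 0
agg <- aggregate(rating ~ image, data = first5, FUN = function(r) {
  if (sum(r == 0) < 3) mean(r[r > 0]) else 0
})
agg
```

For the data above this should give 1.333333 for image 1 and 0 for image 2.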

Replace some component value in a vector with some other value

In R, I have a vector (i.e. a 1-dimensional matrix) and would like to change the components with value 3 to value 1, and the components with value 4 to value 2. How can I do that? Thanks!
The idiomatic R way is to use [<-, in the form
x[index] <- result
If you are dealing with integers / factors or character variables, then == will work reliably for the indexing,
x <- rep(1:5,3)
x[x==3] <- 1
x[x==4] <- 2
x
## [1] 1 2 1 2 5 1 2 1 2 5 1 2 1 2 5
The car package has a useful function, recode (a wrapper for [<-), that lets you combine all the recoding in a single call, e.g.
library(car)
x <- rep(1:5,3)
xr <- recode(x, '3=1; 4=2')
x
## [1] 1 2 3 4 5 1 2 3 4 5 1 2 3 4 5
xr
## [1] 1 2 1 2 5 1 2 1 2 5 1 2 1 2 5
Thanks to @joran for mentioning mapvalues from the plyr package, another wrapper for [<-
library(plyr)
x <- rep(1:5,3)
mapvalues(x, from = c(3,4), to = c(1,2))
plyr::revalue is a wrapper for mapvalues specifically for factor or character variables.
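If you prefer to stay in base R and do all replacements in one step, a lookup with match is one option. This is only a sketch; from, to, idx, hit and x_new are names chosen for the example. Unlike a sequence of x[x==a] <- b assignments, every element is looked up against the original values, so the result does not depend on the order of the rules (which matters when one rule's output is another rule's input, e.g. 3 to 1 and 1 to 2).

```r
x    <- rep(1:5, 3)
from <- c(3L, 4L)       # values to replace
to   <- c(1L, 2L)       # their replacements, in the same order

idx <- match(x, from)   # position in 'from' for each element, NA if no rule applies
hit <- !is.na(idx)

x_new <- x
x_new[hit] <- to[idx[hit]]
x_new
## [1] 1 2 1 2 5 1 2 1 2 5 1 2 1 2 5
```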

Count and label observations per participant using loop

I have repeated-measures data.
I need to create a loop that will incrementally count each observation, within a participant, and label it.
I am new to writing loops. My logic was to say, for each item in the list of unique ids, count each row in that, and apply some function to that row.
Could someone point out what I am doing wrong?
data$Ob <- 0
for (i in unique(data$id)) {
count <- 1
for (u in data[data$id == i,]) {
data[data$id ==u,]$Ob <- count
count <- count + 1
print(count)
}
}
Thanks!
Justin
You can also use ave:
set.seed(1)
data <- data.frame(id = sample(4, 10, TRUE))
data$Ob = ave(data$id, data$id, FUN=seq_along)
data
id Ob
1 2 1
2 2 2
3 3 1
4 4 1
5 1 1
6 4 2
7 4 3
8 3 2
9 3 3
10 1 2
# Generate some dummy data
data <- data.frame(Ob=0, id=sample(4,20,TRUE))
# Go through every id value
for(i in unique(data$id)){
# Label observations
data$Ob[data$id == i] = 1:sum(data$id == i)
}
Be aware though that for loops can be slow in R compared with vectorized code. In this simple case they work fine, but if you have millions of rows in your data frame you'd better use something purely vectorized.
But you don't need a loop...
data <- data.frame (id = sample (4, 10, TRUE))
## id
## 1 3
## 2 4
## 3 1
## 4 3
## 5 3
## 6 4
## 7 2
## 8 1
## 9 1
## 10 4
data$Ob [order (data$id)] <- sequence (table (data$id))
## id Ob
## 1 3 1
## 2 4 1
## 3 1 1
## 4 3 2
## 5 3 3
## 6 4 2
## 7 2 1
## 8 1 2
## 9 1 3
## 10 4 3
(works also with character or factor IDs)
(isn't R just cool!?)
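As for what went wrong in the original loop: for (u in data[data$id == i,]) iterates over the columns of the subset data frame, not over its rows, so u is a whole column and data$id == u does not pick out one row. If you do want an explicit nested loop, a minimal sketch is to loop over row positions instead (d and rows are made-up names for this example):

```r
d <- data.frame(id = c(2, 2, 3, 1, 3, 2))
d$Ob <- 0

for (i in unique(d$id)) {
  rows  <- which(d$id == i)   # row positions that belong to this id
  count <- 1
  for (r in rows) {           # r is a single row number
    d$Ob[r] <- count
    count <- count + 1
  }
}
d$Ob
## [1] 1 2 1 1 2 3
```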

extracting row labels (?) from a data.frame

Starting with a data.frame...
df = data.frame(k=c(1,5,4,7,6), v=c(3,1,4,1,5))
> df
k v
1 1 3
2 5 1
3 4 4
4 7 1
5 6 5
I might run some number of arbitrary manipulations...
> foo1 = df[df$k>3,]
> foo2 = head(foo1[order(foo1$v),], 2)
> foo2
k v
2 5 1
4 7 1
At this point foo2 has somehow retained the original row numbers from df (in this case 2 and 4).
How do I extract these?
> insert_magic_function_here(foo2)
[1] 2 4
I think you're looking for rownames.
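A short sketch with the data from the question: rownames returns the labels as a character vector, so wrap it in as.integer if you need numbers (this assumes the row names are still the default automatic ones).

```r
df   <- data.frame(k = c(1, 5, 4, 7, 6), v = c(3, 1, 4, 1, 5))
foo1 <- df[df$k > 3, ]
foo2 <- head(foo1[order(foo1$v), ], 2)

rownames(foo2)               # character: "2" "4"
as.integer(rownames(foo2))   # integer: 2 4
```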
