I have created a function that calls another function on each row of a dataset, and I would like the output as a vector. As you can see below, the function prints the results to the screen, but I cannot figure out how to store the output in a vector that I can use outside the function.
n_markers <- nrow(data)
p_values <-rep(0, n_markers)
test_markers <- function()
{
  for (i in 1:n_markers)
  {
    hets <- data[i, 2]
    hom_1 <- data[i, 3]
    hom_2 <- data[i, 4]
    p_values[i] <- SNPHWE(hets, hom_1, hom_2)
  }
  return(p_values)
}
test_markers()
Did you just take this code from here? I worry that you didn't even try to figure it out on your own first, but hopefully I am wrong.
You might be overthinking this. Simply store the results of your function in a vector like you do with other functions:
stored_vector <- test_markers()
But, as mentioned in the comments, your function could probably be reduced to:
stored_vector <- sapply(1:nrow(data), function(i) SNPHWE(data[i,2],data[i,3],data[i,4]) )
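A minimal sketch of why the global p_values never seems to change, and of a version that takes its inputs as arguments. Inside a function, p_values[i] <- ... modifies a local copy, so the result only survives if the returned value is assigned. Here toy_test(), toy_data and test_markers2() are hypothetical stand-ins, since SNPHWE() and the real data are not shown in the question.

```r
# Hypothetical stand-in for SNPHWE(); the real function is not shown here.
toy_test <- function(hets, hom_1, hom_2) hets / (hets + hom_1 + hom_2)

# Toy data laid out like the question's: counts in columns 2 to 4.
toy_data <- data.frame(id    = c("m1", "m2", "m3"),
                       hets  = c(10, 20, 30),
                       hom_1 = c(5, 5, 5),
                       hom_2 = c(5, 15, 25))

# Take the data and the test function as arguments and return a vector,
# instead of reading and writing global variables.
test_markers2 <- function(d, f) {
  vapply(seq_len(nrow(d)),
         function(i) f(d[i, 2], d[i, 3], d[i, 4]),
         numeric(1))          # each call must return exactly one number
}

# The result exists outside the function only because it is assigned here.
p_vals <- test_markers2(toy_data, toy_test)
```

vapply() is used rather than sapply() so that an unexpected return value raises an error instead of silently changing the shape of the result.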
I'm trying to save each iteration of this for loop in a vector.
for (i in 1:177) {
a <- geomean(er1$CW[1:i])
}
Basically, I have a list of 177 values and I'd like the script to find the cumulative geometric mean of the list, going one by one. Right now it only gives me the final value; it doesn't save each loop iteration as a separate value in a list or vector.
Your code does not work because the object a is overwritten in each iteration. The following code, for instance, does precisely what you want:
a <- c()
for(i in 1:177){
a[i] <- geomean(er1$CW[1:i])
}
Alternatively, this would work as well:
for(i in 1:177){
if(i != 1){
a <- rbind(a, geomean(er1$CW[1:i]))
}
if(i == 1){
a <- geomean(er1$CW[1:i])
}
}
I started down a similar path with rbind as @nate_edwinton did, but couldn't figure it out. I did, however, come up with something that works. Hmmmm, geo_mean. Cool. Coerce back to a list at the end.
MyNums <- data.frame(x=(1:177))
a <- data.frame(x=integer())
for(i in 1:177){
a[i,1] <- geomean(MyNums$x[1:i])
}
a<-as.list(a)
You can define the variable that will hold the results first, then append to it in each iteration:
b <- c()
for (i in 1:177) {
a <- geomean(er1$CW[1:i])
b <- c(b,a)
}
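The answers above all grow the result one element at a time. A minimal sketch of two alternatives follows: preallocating the vector, and computing the running geometric mean without a loop. geomean() is not defined anywhere in the thread, so the definition below is an assumption (it requires strictly positive inputs), and x is toy data standing in for er1$CW.

```r
# Assumed definition of geomean(); valid for strictly positive values only.
geomean <- function(v) exp(mean(log(v)))

x <- c(2, 8, 4, 1)            # toy data in place of er1$CW

# Preallocate the result, then fill it in place.
a <- numeric(length(x))
for (i in seq_along(x)) {
  a[i] <- geomean(x[1:i])
}

# Same values without a loop: running mean of the logs, then exponentiate.
b <- exp(cumsum(log(x)) / seq_along(x))
```

Both a and b hold one cumulative geometric mean per input value; the second form avoids recomputing the mean from scratch at every step.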
I just read that vectorization improves performance and significantly reduces computation time, and that for if() else the best choice is ifelse().
My problem is that I have some if statements inside a for loop, and each if statement contains multiple assignments, like the following:
x <- matrix(1:10,10,3)
criteria <- matrix(c(1,1,1,0,1,0,0,1,0,0,
1,1,1,1,1,0,0,1,1,0,
1,1,1,1,1,1,1,1,1,1),10,3) #criteria for the ifs
output1 <- rep(list(NA),10) #storage list for output
for (i in 1:10) {
  if (criteria[i,1]>=1) {
    output1[[i]] <- colMeans(x)
    output1[[i]] <- output1[[i]][1] #part of the somefunction output
  } else {
    if(criteria[i,2]>=1) {
      output1[[i]] <- colSums(x)
      output1[[i]] <- output1[[i]][1] #the same
    } else {
      output1[[i]] <- colSums(x+1)
      output1[[i]] <- output1[[i]][1] #the same
    }
  }
}
How can I translate this into ifelse?
Thanks in advance!
Note that you don't need a for loop as all operations used are vectorized:
output2 <- ifelse(criteria[, 1] >= 1,
colMeans(x)[1],
ifelse(criteria[, 2] >= 1,
colSums(x)[1],
colSums(x+1)[1]))
identical(output1, as.list(output2))
## [1] TRUE
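The same vectorized result can also be reached with plain logical indexing instead of nested ifelse() calls. This is only a sketch of an alternative, reusing x and criteria from the question; the names val_mean, val_sum, val_sum1 and out are introduced here for illustration.

```r
x <- matrix(1:10, 10, 3)
criteria <- matrix(c(1,1,1,0,1,0,0,1,0,0,
                     1,1,1,1,1,0,0,1,1,0,
                     1,1,1,1,1,1,1,1,1,1), 10, 3)

# Compute each candidate value once, outside any loop.
val_mean <- colMeans(x)[1]
val_sum  <- colSums(x)[1]
val_sum1 <- colSums(x + 1)[1]

# Start from the fallback value, then overwrite where a criterion holds.
# The first criterion is applied last so that it takes priority,
# matching the order of the if / else chain.
out <- rep(val_sum1, nrow(criteria))
out[criteria[, 2] >= 1] <- val_sum
out[criteria[, 1] >= 1] <- val_mean
```

With more than two or three branches this form can be easier to read than deeply nested ifelse() calls, at the cost of having to order the assignments carefully.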
At least you can convert two assignments into one. So instead of
output[[i]] <- somefunction(arg1,arg2,...)
output[[i]] <- output[[i]]$thing #part of the somefunction output
you can refer directly to the only part you are interested in.
output[[i]] <- somefunction(arg1,arg2,...)$thing #part of the somefunction output
Hope that it helps!
It seems I found the answer while building the example:
output2 <- rep(list(NA),10) #storage list for output
for (i in 1:10) {
output2[[i]] <- ifelse(criteria[i,1]>=1,
yes=colMeans(x)[1],
no=ifelse(criteria[i,2]>=1,
yes=colSums(x)[1],
no=colSums(x+1)[1]))}
I have problems storing user-defined functions in an R list when they are added to it in a for loop.
I have to define some segment-specific functions based on some parameters, so I create the functions and put them in a list while looping through the segments with a for loop. The problem is that I get the same function at every position of the resulting list.
The code looks like this:
n <- 100
segmenty <- 1:n
segment_functions <- list()
for (i in segmenty){
segment_functions[[i]] <- function(){return(i)}
}
When I run the code, I get the same function (the last one created in the loop) for all indexes:
## for all k
segment_functions[[k]]()
[1] 100
There is no problem when I put the functions on list manually e.g.
segment_functions[[1]] <- function(){return(1)}
segment_functions[[2]] <- function(){return(2)}
segment_functions[[3]] <- function(){return(3)}
works just fine.
I honestly have no idea what's wrong. Could you help?
Create each function through a factory and call force() inside it, so that i is evaluated when the function is created rather than when it is later called:
n <- 100
segmenty <- 1:n
segment_functions <- list()
f <- function(i) { force(i); function() return(i) }
for (i in segmenty){
segment_functions[[i]] <- f(i)
}
I'd use lapply and capture i in a closure of the wrapper:
segment_functions <- lapply(1:100, function(i) function() i)
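A small sketch of what goes wrong and of two ways to capture the current value. Functions created directly in the loop all share one environment, so they all see whatever i holds when they are eventually called. The names bad, good and fs are introduced here for illustration only.

```r
# Problem: every function looks up the same i, which is 3 after the loop.
bad <- list()
for (i in 1:3) bad[[i]] <- function() i

# Fix 1: local() creates a fresh environment per iteration.
good <- list()
for (i in 1:3) {
  good[[i]] <- local({
    j <- i            # copy the current value into the new environment
    function() j
  })
}

# Fix 2: a factory function; force() evaluates the argument immediately.
make_fun <- function(i) {
  force(i)
  function() i
}
fs <- lapply(1:3, make_fun)
```

Calling bad[[1]]() returns the last loop value, while good[[1]]() and fs[[1]]() return the value that was current when each function was created.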
I've created a simple loop to calculate the efficiency of some simulated data. It works perfectly well as a standalone loop:
NSE_cal <- NULL
for(i in 1:6) {
Qobs <- flowSummary_NSE1[[i]][[3]]
Qsim <- flowSummary_NSE1[[i]][[1]]
object_cal <- NSEsums("NSE")
NSE_cal <- c(NSE_cal, object_cal)
}
#NSE_cal
#[1] 0.8466699 0.7577019 0.8128499 0.9163561 0.7868013 0.8462228
However, I want to apply this loop quite a few times - I need to vary the object flowSummary_NSE# and I have four different transformation types to apply. As a start, I put the loop inside a function, with only transformation needing to be specified, like so:
badFunction <- function(transformation){
NSE_cal <- NULL
for(i in 1:6) {
Qobs <- flowSummary_NSE1[[i]][[3]]
Qsim <- flowSummary_NSE1[[i]][[1]]
object_cal <- NSEsums(transformation)
NSE_cal <- c(NSE_cal, object_cal)
}
print(NSE_cal)
}
badFunction("NSE")
# [1] 0.8462228 0.8462228 0.8462228 0.8462228 0.8462228 0.8462228
The function has exactly the same information input as in the for loop on its own, except, for some reason, it outputs the same value for each case of i.
It is clear that I have done something wrong, and as far as I can see it must be something simple confined to the function itself. However, in case the error is elsewhere, I have attached the code that generates the necessary data and dependent functions (here)
Any help would be much appreciated
You need to pass objects into the nested function as arguments.
In your function_NSEsums.r script change the first line to NSEsums <- function(i, Qobs, Qsim) {
In your example_script.r change your code to the following:
badFunction <- function(transformation){
NSE_cal <- NULL
for(i in 1:6) {
Qobs <- flowSummary_NSE1[[i]][[3]]
Qsim <- flowSummary_NSE1[[i]][[1]]
object_cal <- NSEsums(transformation, Qobs = Qobs, Qsim = Qsim)
NSE_cal <- c(NSE_cal, object_cal)
}
print(NSE_cal)
}
badFunction("NSE")
[1] 0.8466699 0.7577019 0.8128499 0.9163561 0.7868013 0.8462228
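A minimal sketch of the scoping behaviour behind this fix. R looks up a function's free variables in the environment where the function was defined, not in the function that calls it, so a helper that reads Qobs and Qsim as free variables keeps seeing the leftover global values. The names below (uses_global, uses_arg, val, wrapper_bad, wrapper_good) are hypothetical and only mimic the structure of the question; NSEsums() itself is not shown here.

```r
# Hypothetical helper that reads a free variable instead of an argument.
uses_global <- function() val * 2
val <- 100                      # value left over from earlier code

wrapper_bad <- function() {
  out <- NULL
  for (v in 1:3) {
    val <- v                    # local to wrapper_bad; the helper never sees it
    out <- c(out, uses_global())
  }
  out
}

# Passing the value explicitly removes the dependence on outside state.
uses_arg <- function(val) val * 2
wrapper_good <- function() vapply(1:3, uses_arg, numeric(1))

res_bad  <- wrapper_bad()       # same value repeated
res_good <- wrapper_good()      # one value per iteration
```

This mirrors the repeated 0.8462228 in the question: the standalone loop happened to work because it assigned Qobs and Qsim in the same environment that the helper reads from.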
I'm a novice R user and have created a small script that does some trigonometry with movement data. I need to add a final column that removes repeated values from the column before it.
I've tried adding an if else statement that seems to work in isolation, but I keep getting errors when it is put into the for loop. I'd appreciate any advice.
# trig loop
list.df <- vector("list", max(Sp_test$ID))
names1 <- c(1:max(Sp_test$ID))
for(i in 1:max(Sp_test$ID)) {
  if(i %in% unique(Sp_test$ID)) {
    idata <- subset(Sp_test, ID == i)
    idata$originx <- idata[1,3]
    idata$originy <- idata[1,4]
    idata$deltax <- idata[,"UTME"]-idata[,"originx"]
    idata$deltay <- idata[,"UTMN"]-idata[,"originy"]
    idata$length <- sqrt((idata[,"deltax"])^2+(idata[,"deltay"]^2))
    idata$arad <- atan2(idata[,"deltay"],idata[,"deltax"])
    idata$xnorm <- idata[,"deltax"]/idata[,"length"]
    idata$ynorm <- idata[,"deltay"]/idata[,"length"]
    sumy <- sum(idata$ynorm, na.rm=TRUE)
    sumx <- sum(idata$xnorm, na.rm=TRUE)
    idata$vecsum <- atan2(sumy,sumx)
    idata$width <- idata$length*sin(idata$arad-idata$vecsum)
    # need if else statement excluding a repeat from the position just before it
    list.df[[i]] <- idata
    names1[i] <- i
  }
}
# this works alone, I think the problem is when it gets to the first of the dataset and there is not one before it
if (idata$width[j]==idata$width[j-1]) {
print("NA")
} else {
print(idata$width[j])
}
I think you want to use the function diff for this. diff(idata$width) will give the differences between successive values of idata$width. Then
idata$width[c(FALSE, diff(idata$width) == 0)] <- NA
should do what you want. The leading FALSE is there because diff() returns one fewer element than its input: the first element has no predecessor (as you rightly noted), so it is never marked as a repeat.
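A short worked example of the diff() approach on toy data. The vector width below is made up for illustration. Because the real widths come from floating-point trigonometry, the second variant compares against a small tolerance instead of testing for exact equality; the tolerance value is an assumption and would need adjusting to the data.

```r
width <- c(1.5, 1.5, 2.0, 2.0, 2.0, 3.1)   # toy values

# Exact comparison: mark every element equal to its predecessor.
exact <- width
exact[c(FALSE, diff(exact) == 0)] <- NA

# Tolerance-based comparison for values that are only nearly equal.
tol <- 1e-8                                 # assumed tolerance
approx <- width
approx[c(FALSE, abs(diff(approx)) < tol)] <- NA
```

In both results the first value of each run is kept and the repeats that follow it become NA.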