R: Result vector is not showing up in for loop

OK, so I am new to R, but I've had some pretty good success so far - I am running a statistical test between corresponding rows of two dataframes (well, one is just a string of values since it has just one column). I wish to use the following For-Loop:
zvalues = NULL
zvalues = numeric(0)
for(i in seq(nrow(geneexpx))){
zvalues[i] <- try(unname((geneexpx[i]-rowMeans(geneexpy[i,])) / rowSds(geneexpy[i,])))
}
The problem is, the resultant zvalues numeric is empty. I have no idea why. I can run the same function for a single row and it works fine. For instance:
s = unname(geneexpx[4]-rowMeans(geneexpy[4,])) / rowSds(geneexpy[4,])
s
[1] -2.431277e+156
Please let me know if you have any ideas as to what might be the problem.
EDIT:
head of geneexpx:
c(1.501400411, -0.818584726, -0.455614921, -0.138022494, -1.213938495, -0.536465133)
geneexpy is very large, but each column is similar to geneexpx above.

You have a couple of things going on here. First, you need to define zvalues as a vector of the right length before the loop. Second, rowMeans and rowSds operate on matrices, not vectors. By selecting geneexpy[i, ] you are selecting the ith row of the matrix, which will be a vector.
You did not provide geneexpy so I made one up:
zvalues = rep(NA, length(geneexpx))
geneexpy <- matrix(runif(60), nrow = 6)
for(i in seq_along(geneexpx)){
zvalues[i] <- (geneexpx[i] - mean(geneexpy[i,])) / sd(geneexpy[i,])
}
> zvalues
[1] 3.772994 -4.283168 -2.812811 -2.074548 -5.649359 -4.323920
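The same calculation can also be written without an explicit loop. This is only a minimal sketch, assuming geneexpx is a numeric vector with one value per row of geneexpy and that geneexpy is a numeric matrix. The names row_means, row_sds and zvalues_vec are made up for illustration, and set.seed is there only so the stand-in data is reproducible.

```r
set.seed(42)  # only so the made-up matrix is reproducible
geneexpx <- c(1.501400411, -0.818584726, -0.455614921,
              -0.138022494, -1.213938495, -0.536465133)
geneexpy <- matrix(runif(60), nrow = 6)  # stand-in data, not the asker's real matrix

row_means <- rowMeans(geneexpy)       # one mean per row
row_sds   <- apply(geneexpy, 1, sd)   # one standard deviation per row

# Element-wise arithmetic on three length-6 vectors, no loop needed
zvalues_vec <- (geneexpx - row_means) / row_sds
zvalues_vec
```

Here rowMeans is applied to the whole matrix at once, which is the use it was designed for, and apply(..., 1, sd) plays the role of rowSds without needing an extra package.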

Related

Looping through an R vector to apply a formula

I am trying to loop through two vectors and apply a formula to the data but I can't figure out how to do what I want to do...
Basically here is some sample data...two vectors that I need to reference...
v1 <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17)
v2 <- c(0.01,0.06,0.11,0.16,0.21,0.26,0.31,0.36,0.41,0.46,0.51,0.56,0.61,0.66,0.71,0.76,0.81)
This is the formula I would like to apply...this would be the first instance...
(1+v2[2])^v1[2] / (1+(v2[1]^v1[1]))
The second iteration would be...
(1+v2[3])^v1[3] / (1+(v2[2]^v1[2]))
So I am trying to do this but I can't figure out how to iterate the vector index...I am trying along the lines of this but I can't increase the index via a counter, e.g., i...
for (i in seq_along(v1)) {
i <- 0
res[i] <- (1+v2[2+i])^v1[2+i] / (1+(v2[1+i]^v1[1+i]))
i <- i+1
}
After going through the loop I would get the following output...
> res
[1] 1.112475 1.217187 1.323924 1.432501 1.542753 1.654534 1.767715 1.882178 1.997821 2.114550 2.232280 2.350938 2.470454 2.590767 2.711821 2.833564
I have done lots of searching but I am coming up blank, any suggestions?
You can do this in one go with res <- (1+v2[-1])^v1[-1] / (1+(v2[-17]^v1[-17]))
Basically this uses length-16 vectors (v1 and v2 without their first element in the numerator, and without their last element in the denominator) and takes advantage of R's vectorized arithmetic.
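A sketch of the same idea using head() and tail(), which avoids hard-coding the length 17. One observation that is not stated in the original posts: the output listed in the question appears to correspond to a denominator of (1+v2[i])^v1[i] rather than 1+(v2[i]^v1[i]) as typed, so both variants are shown. The names res_typed and res_ratio are illustrative.

```r
v1 <- c(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17)
v2 <- c(0.01,0.06,0.11,0.16,0.21,0.26,0.31,0.36,0.41,
        0.46,0.51,0.56,0.61,0.66,0.71,0.76,0.81)

# tail(x, -1) drops the first element, head(x, -1) drops the last one
num <- (1 + tail(v2, -1))^tail(v1, -1)

# Formula exactly as typed in the question
res_typed <- num / (1 + head(v2, -1)^head(v1, -1))

# Variant with the denominator parenthesized like the numerator
res_ratio <- num / (1 + head(v2, -1))^head(v1, -1)

round(res_ratio[1:3], 6)
```

If the listed output is what you are after, compare the first few values of res_ratio against it before picking one of the two forms.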

How to append a vector to a vector r - in a vectorized style

We all know that appending a vector to a vector within a for loop in R is a bad thing because it costs time. A solution would be to do it in a vectorized style. Here is a nice example by Joshua Ulrich. It is important to first create a vector with known length and then fill it up, instead of appending each new piece to an existing piece within the loop.
Still, in his example he demonstrates 'only' how to append one data piece at a time. I am now struggling with how to fill a vector with vectors rather than scalars.
Imagine I have a vector with a length of 100
vector <- numeric(length=100)
and a smaller vector that would fit 10 times into the first vector
vec <- seq(1,10,1)
How would I have to construct a loop that adds the smaller vector to the large vector without using c() or append ?
EDIT: This example is simplified - vec does not always consist of the same sequence but is generated within a for loop and should be added to vector.
You could just use normal vector indexing within the loop to accomplish this:
vector <- numeric(length=100)
for (i in 1:10) {
vector[(10*i-9):(10*i)] <- 1:10
}
all.equal(vector, rep(1:10, 10))
# [1] TRUE
Of course if you were just trying to repeat a vector a certain number of times rep(vec, 10) would be the preferred solution.
A similar approach, perhaps a little more clear if your new vectors are of variable length:
# Let's over-allocate so that we know the big vector is big enough
big_vec = numeric(1e4)
this.index = 1
for (i in 1:10) {
# Generate a new vector of random length
new_vec = runif(sample(1:20, size = 1))
# Stick it in big_vec by index
big_vec[this.index:(this.index + length(new_vec) - 1)] = new_vec
# update the starting index
this.index = this.index + length(new_vec)
}
# truncate to only include the added values
big_vec = big_vec[1:(this.index - 1)]
As @josilber suggested in comments, lists would be more R-ish. This is a much cleaner approach, unless the new vector generation depends on the previous vectors, in which case the for loop might be necessary.
vec_list = list()
for (i in 1:10) {
# Generate a new vector of random length
vec_list[[i]] = runif(sample(1:20, size = 1))
}
# Or, use lapply
vec_list = lapply(1:10, FUN = function(x) {runif(sample(1:20, size = 1))})
# Then combine with do.call
do.call(c, vec_list)
# or more simply, just unlist
unlist(vec_list)
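For the case where each new vector does depend on the previous one, the loop and the list can be combined. A minimal sketch, assuming a made-up dependency rule (each vector is one element longer than the one before); n_iter, prev_len and combined are illustrative names.

```r
set.seed(1)
n_iter <- 10
vec_list <- vector("list", n_iter)   # pre-allocate the list slots

prev_len <- 1
for (i in seq_len(n_iter)) {
  # Hypothetical rule: the length depends on the previous vector
  vec_list[[i]] <- runif(prev_len)
  prev_len <- length(vec_list[[i]]) + 1
}

# Flatten once at the end instead of growing a vector inside the loop
combined <- unlist(vec_list)
length(combined) == sum(lengths(vec_list))
```

The list only stores references to the pieces, so nothing large is copied until the single unlist() call.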

Replace rbind in for-loop with lapply? (2nd circle of hell)

I am having trouble optimising a piece of R code. The following example code should illustrate my optimisation problem:
Some initialisations and a function definition:
a <- c(10,20,30,40,50,60,70,80)
b <- c("a","b","c","d","z","g","h","r")
c <- c(1,2,3,4,5,6,7,8)
myframe <- data.frame(a,b,c)
values <- vector(length=columns)
solution <- matrix(nrow=nrow(myframe),ncol=columns+3)
myfunction <- function(frame,columns){
athing = 0
if(columns == 5){
athing = 100
}
else{
athing = 1000
}
values[columns+1] = athing
return(values)}
The problematic for-loop looks like this:
columns = 6
for(i in 1:nrow(myframe)){
values <- myfunction(as.matrix(myframe[i,]), columns)
values[columns+2] = i
values[columns+3] = myframe[i,3]
#more columns added with simple operations (i.e. sum)
solution <- rbind(solution,values)
#solution is a large matrix from outside the for-loop
}
The problem seems to be the rbind function. I frequently get error messages regarding the size of solution, which seems to be too large after a while (more than 50 MB).
I want to replace this loop and the rbind with a list and lapply and/or foreach. I have started with converting myframe to a list.
myframe_list <- lapply(seq_len(nrow(myframe)), function(i) myframe[i,])
I have not really come further than this, although I tried applying this very good introduction to parallel processing.
How do I have to reconstruct the for-loop without having to change myfunction? Obviously I am open to different solutions...
Edit: This problem seems to be straight from the 2nd circle of hell from the R Inferno. Any suggestions?
Using rbind in a loop like this is bad practice because each iteration enlarges solution and copies it to a new object, which is slow and can also lead to memory problems. One way around this is to create a list whose ith component stores the output of the ith loop iteration. The final step is to call rbind on that list (just once at the end). This will look something like
my.list <- vector("list", nrow(myframe))
for(i in 1:nrow(myframe)){
# Call all necessary commands to create values
my.list[[i]] <- values
}
solution <- rbind(solution, do.call(rbind, my.list))
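A self-contained sketch of that pattern on a small data frame. The helper make_row is a hypothetical stand-in for whatever per-row work the real loop does (it is not the asker's myfunction), and the column names it returns are made up.

```r
myframe <- data.frame(a = c(10, 20, 30, 40),
                      b = c("a", "b", "c", "d"),
                      c = c(1, 2, 3, 4))

# Hypothetical stand-in for the per-row computation
make_row <- function(i, frame) {
  c(row = i, a_value = frame$a[i], c_value = frame$c[i])
}

my.list <- vector("list", nrow(myframe))   # one slot per row
for (i in seq_len(nrow(myframe))) {
  my.list[[i]] <- make_row(i, myframe)
}

# A single rbind call at the end builds the matrix in one step
solution <- do.call(rbind, my.list)
dim(solution)
```

The same list could be produced with lapply(seq_len(nrow(myframe)), make_row, frame = myframe), which drops the explicit loop while keeping the single rbind at the end.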
A bit too long for a comment, so I put it here:
If columns is known in advance:
myfunction <- function(frame){
athing = 0
if(columns == 5){
athing = 100
}
else{
athing = 1000
}
values[columns+1] = athing
return(values)}
apply(myframe, 1, myfunction)
If columns is not given via environment, you can use:
apply(myframe, 1, myfunction, columns) with your original myfunction definition. (MARGIN = 1 applies the function to each row, matching the row-wise loop in the question.)

How to make a matrix from a given vector by using for loop

I am trying to make an $n\times 4$ matrix by retrieving the n-th four elements in a given vector. Since I am new to R, I don't know how to use loop functions properly.
My code is like
x<-runif(150,-2,2)
x1<-c(0,0,0,0,x)
for (i in 0:150)
{ai<-x1[1+i,4+i]
}
However, I got: Error in x1[1 + i, 4 + i] : incorrect number of dimensions.
I also want to combine these ai into a matrix, and each ai will be the i+1-th row of the matrix. Guess I should use the cbind function?
Any help will be appreciated. Thanks in advance.
You can do this directly with the matrix command:
x <- 1:36
xmat <- matrix(x, nrow = 9, byrow = TRUE)
Maybe this helps:
n <- length(x1)-1
res <- sapply((4:n)-3, function(i) x1[(i+3):i])
dim(res)
#[1] 4 150
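If the goal is one row per window in the original order, i.e. row i+1 holding x1[(1+i):(4+i)], something along these lines might be closer to what the question asks for. This is a sketch under that assumption; win and win2 are illustrative names. Note that indexing a vector takes a range like (1+i):(4+i), not two comma-separated indices, which is what triggered the "incorrect number of dimensions" error.

```r
set.seed(1)
x  <- runif(150, -2, 2)
x1 <- c(0, 0, 0, 0, x)   # 154 values, as in the question

# sapply returns one column per window, so transpose to get one row per window
win <- t(sapply(0:150, function(i) x1[(1 + i):(4 + i)]))
dim(win)

# embed() from the stats package builds the same windows,
# but with the columns in reverse order, hence the [, 4:1]
win2 <- embed(x1, 4)[, 4:1]
all.equal(win, win2)
```

Building the whole matrix in one call avoids having to cbind or rbind the pieces together afterwards.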

Cannot create an empty vector and append new elements in R

I am just beginning to learn R and am having an issue that is leaving me fairly confused. My goal is to create an empty vector and append elements to it. Seems easy enough, but solutions that I have seen on stackoverflow don't seem to be working.
To wit,
> a <- numeric()
> append(a,1)
[1] 1
> a
numeric(0)
I can't quite figure out what I'm doing wrong. Anyone want to help a newbie?
append does something that is somewhat different from what you are thinking. See ?append.
In particular, note that append does not modify its argument. It returns the result.
You want the function c:
> a <- numeric()
> a <- c(a, 1)
> a
[1] 1
Your a vector is not being passed by reference, so when it is modified you have to store it back into a. You cannot access a and expect it to be updated.
You just need to assign the return value to your vector, just as Matt did:
> a <- numeric()
> a <- append(a, 1)
> a
[1] 1
Matt is right that c() is preferable (fewer keystrokes and more versatile) though your use of append() is fine.
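To make the difference concrete, here is a short sketch. It also shows the after argument of append(), which is the one thing c() does not offer directly; the variable b is just for illustration.

```r
a <- numeric()
append(a, 1)          # returns a new vector; a itself is unchanged
length(a)             # still 0

a <- append(a, 1)     # assign the result to keep it
a <- c(a, 2, 3)       # c() follows the same rule

# append() can also insert in the middle via its `after` argument
b <- append(1:5, 99, after = 2)
b
```

In both cases nothing is modified in place: the function builds a new vector and the assignment rebinds the name to it.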
