How to save results into a empty vector in R? - r

I have problem to create a empty vector in R, and save the results of another vector into them. This is my code:
k<-vector(mode = "numeric", length = 0)
for (j in length(Pe)){
if ((Pe[j])>0) {
k[j]<-Pe[j]
}
}
The lenght of the vector Pe is 1000. I need only to save the values mayor than zero in the vector k, but when I type the vector k the display window show:
numerical(0)
This is the correct way to initiate a empty vector in R (k)?
Thanks

in fact, it can be much more easy. Type
k <- c()
instead. But I think this won't get you what you want.
What happens when element p is not > 0? R will fill k[p] with NA, while I think you want k to be a shorter vector of only the elements of Pe which are > 0, not to be the same length but with NA's?
If so, you don't even need a loop. Try
k <- Pe[Pe > 0]
This will get you a vector only containing the elements of Pe > 0, no NA's.
Excuse my bad english, hope I helped you

As MaxPD pointed out
for (j in length(Pe)) print(j)
would only print the length of Pe, you should
for (j in seq_len(length(Pe))) print(j)
## or
for (j in seq_along(Pe)) print(j)
## or
for (j in 1:length(Pe)) print(j)
but in your case i wouldn't even use a loop
k<-vector(mode = "numeric", length = 0)
k[Pe > 0] <- Pe[Pe > 0]
should do the trick if both objects are vectors and have the same length.

Related

Efficient algorithm to turn matrix subdiagonal to columns r

I have a non-square matrix and need to do some calculations on it's subdiagonals. I figure out that the best way is too turn subdiagonals to columns/rows and use functions like cumprod. Right now I use a for loop and exdiag defined as below:
exdiag <- function(mat, off=0) {mat[row(mat) == col(mat)+off]}
However it to be not really efficient. Do you know any other algorithm to achieve that kind of results.
A little example to show what I am doing:
exdiag <- function(mat, off=0) {mat[row(mat) == col(mat)+off]}
mat <- matrix(1:72, nrow = 12, ncol = 6)
newmat <- matrix(nrow=11, ncol=6)
for (i in 1:11){
newmat[i,] <- c(cumprod(exdiag(mat,i)),rep(0,max(6-12+i,0)))
}
Best regards,
Artur
The fastest but by far the most cryptic solution to get all possible diagonals from a non-square matrix, would be to treat your matrix as a vector and simply construct an id vector for selection. In the end you can transform it back to a matrix if you want.
The following function does that:
exdiag <- function(mat){
NR <- nrow(mat)
NC <- ncol(mat)
smalldim <- min(NC,NR)
if(NC > NR){
id <- seq_len(NR) +
seq.int(0,NR-1)*NR +
rep(seq.int(1,NC - 1), each = NR)*NR
} else if(NC < NR){
id <- seq_len(NC) +
seq.int(0,NC-1)*NR +
rep(seq.int(1,NR - 1), each = NC)
} else {
return(diag(mat))
}
out <- matrix(mat[id],nrow = smalldim)
id <- (ncol(out) + 1 - row(out)) - col(out) < 0
out[id] <- NA
return(out)
}
Keep in mind you have to take into account how your matrix is formed.
In both cases I follow the same logic:
first construct a sequence indicating positions along the smallest dimension
To this sequence, add 0, 1, 2, ... times the row length.
This creates the first diagonal. After doing this, you simply add a sequence that shifts the entire previous sequence by 1 (either down or to the right) until you reach the end of the matrix. To shift right, I need to multiply this sequence by the number of rows.
In the end you can use these indices to select the correct positions from mat, and return all that as a matrix. Due to the vectorized nature of this code, you have to check that the last subdiagonals are correct. These contain less elements than the first, so you have to replace the values not part of that subdiagonal by NA. Also here you can simply use an indexing trick.
You can use it as follows:
> diag1 <- exdiag(amatrix)
> diag2 <- exdiag(t(amatrix))
> identical(diag1, diag2)
[1] TRUE
In order to come to your result
amatrix <- matrix(1:72, ncol = 6)
diag1 <- exdiag(amatrix)
res <- apply(diag1,2,cumprod)
res[is.na(res)] <- 0
t(res)
You can modify the diag() function.
exdiag <- function(mat, off=0) {mat[row(mat) == col(mat)+off]}
exdiag2 <- function(matrix, off){diag(matrix[-1:-off,])}
Speed Test:
mat = diag(10, 10000,10000)
off = 4
> system.time(exdiag(mat,4))
user system elapsed
7.083 2.973 10.054
> system.time(exdiag2(mat,4))
user system elapsed
5.370 0.155 5.524
> system.time(diag(mat))
user system elapsed
0.002 0.000 0.002
It looks like that the subsetting from matrix take a lot of time, but it still performs better than your implementation. May be there are a lot of other subsetting approaches, which outperforms my solution. :)

R loop missing value where TRUE/FALSE needed

I've got this code in R:
j <- 1
k <- nrow(group_IDs)
while (j <= k)
{
d_clust <- Mclust(Customers_Attibutes_s[which (Customers_Attibutes_s$Group_ID == group_IDs$Group_ID[j]),3:7], G=2:7)
temp <- cbind(Customers_Attibutes[which (Customers_Attibutes$Group_ID == group_IDs$Group_ID[j]),], as.data.frame (predict.Mclust(d_clust, Customers_Attibutes[which(Customers_Attibutes$Group_ID == group_IDs$Group_ID[j]), 3:7]))[1])
temp_ <- rbind(temp,temp_)
j <- j+1
}
j <= k in the while statement is returning this error:
missing value where TRUE/FALSE needed.
group_IDs is not null and it actually contains the value 8 in this case.
It seems to get into the loop and crash at the second round.
You can get around the indexing issues using for, e.g.:
for (ID in group_IDs) {}
This, of course, assumes that group_IDs is a vector of values.
Note: Your code shows the following inside the loop group_IDs$Group_ID[j] which implies something other than a vector; perhaps you meant group_IDs[j]?
Since group_ IDsis a vector, try length(group_IDs) instead of nrow. A vector doesn't have rows, so the equivalent is length.
Here's what I suspect is happening:
> group_IDs <- 8
> nrow(group_IDs)
NULL

Create a function that takes in a vector and returns a matrix in R

I am trying to create a function that will take in a vector k and return to me a matrix with dimensions length(distMat[1,]) by length(k). distMat is a huge matrix and indSpam is a long vector. In particular to my situation, length(distMat[1,]) is 2412. When I enter in k as a vector of length one, I get a vector of length 2412. I want to be able to enter in k as a vector of length two and get a matrix of 2412x2. I am trying to use a while loop to let it go through the length of k, but it only returns to me a vector of length 2412. What am I doing wrong?
predNeighbor = function(k, distMat, indSpam){
counter = 1
while (counter<(length(k)+1))
{
preMatrix = apply(distMat, 1, order)
orderedMatrix = t(preMatrix)
truncate = orderedMatrix[,1:k[counter]]
checking = indSpam[truncate]
checking2 = matrix(checking, ncol = k[counter])
number = apply(checking2, 1, sum)
return(number[1:length(distMat[1,])] > (k[counter]/2))
counter = counter + 1
}
}
I am trying to create a function that will take in a vector k and return to me a matrix with dimensions length(distMat[1,]) by length(k)
Here's a function that does this.
foo <- function(k, distMat) {
return(matrix(0, nrow = length(distMat[1, ]), ncol = length(k)))
}
If you have other requirements, please describe them in words.
Based on your comment, I think I understand better your goal. You have a function that returns a vector of length k and you want to save it's output as rows in a matrix. This is a pretty common task. Let's do a simple example where k starts out as 1:10, and say we want to add some noise to it with a function foo() and see how the rank changes.
In the case where the input to the function is always the same, replicate() works very well. It will automatically put everything in a matrix
k <- 1:10
noise_and_rank <- function(k) {
rank(k + runif(length(k), min = -2, max = 2))
}
results <- replicate(n = 8, expr = {noise_and_rank(k)})
In the case where you want to iterate, i.e., the output from the one go is the input for the next, a for loop is good, and we just pre-allocate a matrix with 0's, to fill in one column/row at a time
k <- 1:10
n.sim <- 8
results <- matrix(0, nrow = length(k), ncol = n.sim)
results[, 1] <- k
for(i in 2:n.sim) {
results[, i] <- noise_and_rank(results[, i - 1])
}
What your original question seems to be about is how to do the pre-allocation. If the input is always the same, using replicate() means you don't worry about it. If the input is is different each time, then pre-allocate using matrix(), you don't need to write any special function.

Store/make sparse vector in R

I am generating a sparse vector length >50,000. I am producing it in a for loop. I wonder if there is an efficient way of storing the zeros?
Basically the code looks like
score = c()
for (i in 1:length(someList)) {
score[i] = getScore(input[i], other_inputs)
if (score[i] == numeric(0))
score[i] = 0 ###I would want to do something about the zeros
}
This code will not work. You should preallocate score vector size before looping. Preallocating also will create a vector with zeros. So, no need to assign zeros values, you can only assign numeric results from getScore function.
N <- length(someList) ## create a vector with zeros
score = vector('numeric',N)
for (i in 1:N) {
ss <- getScore(input[i], other_inputs)
if (length(ss)!=0)
score[i] <- ss
}

Subtracting from random values in a weighted matrix in R

and thanks in advance for your help!
This question is related to one I posted before, but I think it deserves its own post because it is a separate challenge.
Last time I asked about randomly selecting values from a matrix after adding a vector. In that example, the matrix and the vector were both binary. Now I would like to change the values in a weighted matrix after adding a weighted vector. Here is some example code to play with.
require(gamlss.dist)
mat1<-matrix(c(0,0,0,0,1,0, 0,10,0,0,0,5, 0,0,0,0,1,0, 0,0,3,0,0,0, 0,0,0,0,3,0,
0,0,2,0,0,0, 2,1,0,1,0,1, 0,0,0,0,37,0, 0,0,0,2,0,0, 0,0,0,0,0,1, 1,0,0,0,0,0,
0,1,1,0,0,0), byrow=T, ncol=6, nrow=12)
vec1<-c(0,0,0,1,1,1)
ones <- which(vec1 == 1L)
temp=rZIP(sum(vec1)) #rZIP is a function from gamlss.dist that randomly selects values from a zero-inflated distribution
vec1[ones]<-temp
The values in the vector are sampled from a zero-inflated distribution (thanks to this question). When I bind the vector to the matrix, I want to randomly select a non zero value from the same column, and subtract the vector value from it. I can see a further complication arising if the vector value is greater than the randomly selected value in the same column. In such an instance, it would simply set that value to zero.
Here is some modified code from the earlier question that does not work for this problem but maybe will be helpful.
foo <- function(mat, vec) {
nr <- nrow(mat)
nc <- ncol(mat)
cols <- which(vec != 0) #select matrix columns where the vector is not zero
rows <- sapply(seq_along(cols),
function(x, mat, cols) {
ones <- which(mat[,cols[x]] != 0)
out <- if(length(ones) != 0) {
ones
} else {
sample(ones, 1)
}
out
}, mat = mat, cols = cols)
ind <- (nr*(cols-1)) + rows #this line doesn't work b/c it is not binary
mat[ind] <- 0 #here is where I would like to subtract the vector value
mat <- rbind(mat, vec)
rownames(mat) <- NULL
mat
}
Any ideas? Thanks again for all of the fantastic help!
EDIT:
Thanks to help from bnaul down below, I am a lot closer to the answer, but we have run into the same problem we hit last time. The sample function doesn't work properly on columns where there is only one nonzero value. I have fixed this using Gavin Simpson's if else statement (which was the solution in the previous case). I've adjusted the matrix to have columns with only one nonzero value.
mat1<-matrix(c(0,0,0,0,1,0, 0,0,0,0,0,5, 0,0,0,0,1,0, 0,0,0,0,0,0, 0,0,0,0,3,0,
0,0,2,0,0,0, 2,1,0,1,0,1, 0,0,0,0,37,0, 0,0,0,2,0,0, 0,0,0,0,0,1, 1,0,0,0,0,0,
0,0,0,0,0,0), byrow=T, ncol=6, nrow=12)
vec1<-c(0,1,0,0,1,1)
ones <- which(vec1 == 1L)
temp=rZIP(sum(vec1))
vec1[ones]<-temp
mat2 = rbind(mat1, vec1)
apply(mat2, 2, function(col) { #Returns matrix of integers indicating their column
#number in matrix-like object
nonzero = which(head(col,-1) != 0); #negative integer means all but last # of elements in x
sample_ind = if(length(nonzero) == 1){
nonzero
} else{
sample(nonzero, 1)
}
; #sample nonzero elements one time
col[sample_ind] = max(0, col[sample_ind] - tail(col,1)); #take max of either 0 or selected value minus Inv
return(col)
}
)
Thanks again!
mat2 = rbind(mat1, vec1)
apply(mat2, 2, function(col) {
nonzero = which(head(col,-1) != 0);
sample_ind = sample(nonzero, 1);
col[sample_ind] = max(0, col[sample_ind] - tail(col,1));
return(col)
}
)
I made a couple of simplifications; hopefully they don't conflict with what you had in mind. First, I ignore the requirement that you only operate on the nonzero elements of the vector, since subtracting 0 from anything will not change it. Second, I bind the matrix and vector and then perform the operation column-wise on the result, since this is a bit easier than tracking the indices in two separate data structures and then combining them afterward.

Resources