How to declare a vector of zeros in R

I suppose this is trivial, but I can't find how to declare a vector of zeros in R.
For example, in Matlab, I would write:
X = zeros(1,3);

You have several options
integer(3)
numeric(3)
rep(0, 3)
rep(0L, 3)
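These four options are not interchangeable in type: two produce integer vectors and two produce doubles. A minimal sketch that checks this with typeof() (the variable name zero_types is only for illustration):

```r
# typeof() reveals the storage type of each zero vector
zero_types <- c(
  typeof(integer(3)),   # "integer"
  typeof(numeric(3)),   # "double"
  typeof(rep(0, 3)),    # "double"
  typeof(rep(0L, 3))    # "integer"
)
zero_types

# Same values, different types: identical() is strict, == is not
identical(integer(3), numeric(3))  # FALSE
all(integer(3) == numeric(3))      # TRUE
```

For most arithmetic the difference does not matter, since R promotes integers to doubles as needed.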

You can also use the matrix command to create a matrix with n rows and m columns, filled with zeros.
matrix(0, n, m)

replicate is another option:
replicate(10, 0)
# [1] 0 0 0 0 0 0 0 0 0 0
replicate(5, 1)
# [1] 1 1 1 1 1
To create a matrix:
replicate( 5, numeric(3) )
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    0    0    0    0    0
# [2,]    0    0    0    0    0
# [3,]    0    0    0    0    0

X <- c(1:3)*0
Maybe this is not the most efficient way to initialize a vector to zero, but it only requires remembering the c() function, which tutorials frequently cite as the usual way to declare a vector.
As a side note: to someone learning her way into R from other languages, the multitude of functions that do the same thing in R may be mind-blowing, as the previous answers here demonstrate.

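Here are four ways to create a one-dimensional vector of zeros, followed by a check that they are all identical:
numeric(2) -> a; double(2) -> b; vector("double", 2) -> c; vector("numeric", 2) -> d
# identical() compares only two objects at a time, so compare b, c and d against a
all(sapply(list(b, c, d), identical, a))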
In the iteration chapter of R for Data Science they use the "c" option, vector("double", n), to create this type of vector.

Related

how to see if any element of a list contains only a certain value in R

I have a list of many matrices, say my_list. I want to check whether any of those matrices has only zeros as its elements and, if so, which matrices in the list they are.
library(R.utils)
output_vec <- vector()
for(i in 1:length(my_list)){
asZero(as.vector(my_list[[i]]))}
This gives me TRUE/FALSE values, but I am not able to return the indices of the matrices whose elements are all zero. I appreciate any help with this.
We may need to wrap the comparison in all: loop over the list of matrices with sapply, create a logical expression (x == 0), and wrap it in all to return a single TRUE/FALSE per matrix. If all values, excluding NAs (na.rm = TRUE), are 0, this returns TRUE; otherwise it returns FALSE.
sapply(my_list, function(x) all(x == 0, na.rm = TRUE))
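To get the positions rather than the logical vector, the result can be passed to which(). A small sketch on a made-up list (the contents of my_list and the name all_zero are assumptions for illustration):

```r
# Hypothetical example list: matrices 1 and 3 contain only zeros
my_list <- list(matrix(0, 2, 2), matrix(c(1, 0, 0, 0), 2, 2), matrix(0, 3, 1))

# One TRUE/FALSE per matrix, then the indices of the TRUE entries
all_zero <- sapply(my_list, function(x) all(x == 0, na.rm = TRUE))
which(all_zero)
# [1] 1 3
```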
You can use norm to check whether all entries in a matrix are zero, e.g.,
sapply(my_list, norm) == 0
since the norm of a matrix is 0 if and only if all of its values are zero.
Example
> my_list
[[1]]
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    0
[3,]    0    0    0
[[2]]
     [,1] [,2]
[1,]    1    1
[2,]    0    0
> sapply(my_list, norm) == 0
[1]  TRUE FALSE

How to create matrix of all 2^n binary sequences of length n using recursion in R?

I know I can use expand.grid for this, but I am trying to learn actual programming. My goal is to take what I have below and use a recursion to get all 2^n binary sequences of length n.
I can do this for n = 1, but I don't understand how I would use the same function in a recursive way to get the answer for higher dimensions.
Here is for n = 1:
binseq <- function(n){
  binmat <- matrix(nrow = 2^n, ncol = n)
  r <- 0 #row counter
  for (i in 0:1) {
    r <- r + 1
    binmat[r,] <- i
  }
  return(binmat)
}
I know I have to use probably a cbind in the return statement. My intuition says the return statement should be something like cbind(binseq(n-1), binseq(n)). But, honestly, I'm completely lost at this point.
The desired output should produce something like what expand.grid gives:
n = 5
expand.grid(replicate(n, 0:1, simplify = FALSE))
It should just be a matrix as binmat is being filled recursively.
As requested in a comment (below), here is a limited implementation for binary sequences only:
eg.binary <- function(n, digits=0:1) {
  if (n <= 0) return(matrix(0,0,0))
  if (n == 1) return(matrix(digits, 2))
  x <- eg.binary(n-1, digits)  # pass digits along so non-default digits survive the recursion
  rbind(cbind(digits[1], x), cbind(digits[2], x))
}
After taking care of an initial case that R cannot handle correctly, it treats the "base case" of n=1 and then recursively obtains all (n-1)-digit binary strings and prepends each digit to each of them. The digits are prepended so that the binary strings end up in their usual lexicographic order (expand.grid produces the same set of rows, but with the first column varying fastest).
Example:
eg.binary(3)
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    1
[3,]    0    1    0
[4,]    0    1    1
[5,]    1    0    0
[6,]    1    0    1
[7,]    1    1    0
[8,]    1    1    1
A general explanation (with a more flexible solution) follows.
Distill the problem down to the basic operation of tacking the values of an array y onto the rows of a dataframe X, associating a whole copy of X with each value (via cbind) and appending the whole lot (via rbind):
cross <- function(X, y) {
  do.call("rbind", lapply(y, function(z) cbind(X, z)))
}
For example,
cross(data.frame(A=1:2, b=letters[1:2]), c("X","Y"))
  A b z
1 1 a X
2 2 b X
3 1 a Y
4 2 b Y
(Let's worry about the column names later.)
The recursive solution for a list of such arrays y assumes you have already carried out these operations for all but the last element of the list. It has to start somewhere, which evidently consists of converting an array into a one-column data frame. Thus:
eg_ <- function(y) {
  n <- length(y)
  if (n <= 1) {
    as.data.frame(y)
  } else {
    cross(eg_(y[-n]), y[[n]])
  }
}
Why the funny name? Because we might want to do some post-processing, such as giving the result nice names. Here's a fuller implementation:
eg <- function(y) {
  # (Define `eg_` here to keep it local to `eg` if you like)
  X <- eg_(y)
  names.default <- paste0("Var", seq.int(length(y)))
  if (is.null(names(y))) {
    colnames(X) <- names.default
  } else {
    colnames(X) <- ifelse(names(y)=="", names.default, names(y))
  }
  X
}
For example:
eg(replicate(3, 0:1, simplify=FALSE))
  Var1 Var2 Var3
1    0    0    0
2    1    0    0
3    0    1    0
4    1    1    0
5    0    0    1
6    1    0    1
7    0    1    1
8    1    1    1
eg(list(0:1, B=2:3))
  Var1 B
1    0 2
2    1 2
3    0 3
4    1 3
Apparently this was the desired recursive code:
binseq <- function(n){
  if(n == 1){
    binmat <- matrix(c(0,1), nrow = 2, ncol = 1)
  }else if(n > 1){
    A <- binseq(n-1)
    B <- cbind(rep(0, nrow(A)), A)
    C <- cbind(rep(1, nrow(A)), A)
    binmat <- rbind(B,C)
  }
  return(binmat)
}
Basically, for n = 1 we create a [0, 1] matrix. For every n thereafter, we add a column of 0's to the matrix from the previous step and, separately, a column of 1's. Then we rbind the two matrices to get the final product. So I get what the algorithm is doing, but I don't really understand what the recursion is doing. For example, I don't understand the step from n = 2 to n = 3 based on the algorithm.
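One way to see the step from n = 2 to n = 3 is to carry it out by hand. The sketch below uses a condensed restatement of binseq (cbind recycles the scalar 0 or 1, so rep is not needed) and rebuilds binseq(3) from binseq(2); it is an illustration, not the original poster's exact code:

```r
# Condensed restatement of the recursive function
binseq <- function(n) {
  if (n == 1) return(matrix(c(0, 1), nrow = 2, ncol = 1))
  A <- binseq(n - 1)          # all sequences of length n - 1
  rbind(cbind(0, A),          # each of them prefixed with 0
        cbind(1, A))          # each of them prefixed with 1
}

# The call binseq(3) first asks for binseq(2), then doubles it:
A <- binseq(2)    # rows: 00, 01, 10, 11
B <- cbind(0, A)  # rows: 000, 001, 010, 011
C <- cbind(1, A)  # rows: 100, 101, 110, 111
identical(rbind(B, C), binseq(3))
# [1] TRUE
```

Each call only has to trust that the call for n - 1 returns every sequence of length n - 1; prefixing all of them with 0 and with 1 then yields every sequence of length n.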

Creating canonical basis vectors in R?

Is there an efficient way of creating the canonical basis vectors:
e_1=c(1,0,0,...),
e_2=c(0,1,0,...),
e_3=c(0,0,1,...),
...
for arbitrary lengths in R and probably large p?
I know that I could do
e_1 = rep(0,p)
e_1[1] = 1
and so on, or diag(p)[, 1], ... But I wonder if there is a more efficient way, since I only need one at a time in a loop.
It can be somewhat shorter with replace:
make_basis <- function(k, p = 10) replace(numeric(p), k, 1)
# usage
e_4 = make_basis(4)
e_4
# [1] 0 0 0 1 0 0 0 0 0 0

How to generate matrices where A) each row has a single value of one; B) rows sum to one

This is a two-part problem: the first is to create an N x N square matrix in which exactly one randomly chosen element in each row is 1 and the others are zero (i.e. the sum of the elements in each row is 1).
The second is to create an N x N square matrix in which the sum of the elements in each row is 1, but each element follows a distribution, e.g. a normal distribution.
Related questions include (Create a matrix with conditional sum in each row -R)
Matlab seems to do what I want automatically (Why this thing happens with random matrix such that all rows sum up to 1?), but I am looking for a solution in r.
Here is what I tried:
# PART 1
N <- 50
x <- matrix(0,N,N)
lapply(1:N, function(y){
  x[y,sample(N,1)]<- 1
})
(I get zeroes still)
# PART 2
N <- 50
x <- matrix(0,N,N)
lapply(1:N, function(y){
  x[y,]<- rnorm(N)
})
(It needs scaling)
Here's another loop-less solution that uses the two column addressing facility using the "[<-" function. This creates a two-column index matrix whose first column is simply an ascending series that assigns the row locations, and whose second column (the one responsible for picking the column positions) is a random integer value. (It's a vectorized version of Matthew's "easiest method", and I suspect would be faster since there is only one call to sample.):
M <- matrix(0,N,N)
M[cbind(1:N, sample(1:N, N, replace = TRUE))] <- 1
> rowSums(M)
[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
If you didn't specify replace = TRUE, then colSums(M) would all have been ones as well, but that was not what you requested. Sampling with replacement does mean the rank of the resulting matrix may be less than N; if you left out replace = TRUE, the result would be a permutation matrix and therefore full rank.
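The remark about sampling without replacement can be checked directly. A minimal sketch, assuming a small N and an arbitrary seed (the name P is only for illustration):

```r
set.seed(1)  # arbitrary seed so the example is reproducible
N <- 5
P <- matrix(0, N, N)

# sample(N) is a random permutation of 1:N, so every column is used exactly once
P[cbind(1:N, sample(N))] <- 1

rowSums(P)  # all ones
colSums(P)  # all ones as well: P is a permutation matrix
```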
Here you see why lapply doesn't always replace a loop. You're trying to iterate through the rows of x and modify the matrix, but what you're modifying is a copy of the x from the global environment.
The easiest fix is to use a for loop:
for (y in 1:N) {
  x[y,sample(N,1)]<- 1
}
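The copy behaviour can be made visible with a tiny example. In this sketch (a 3 x 3 matrix chosen only for illustration), each function call changes its own local copy of x, and the outer x stays untouched:

```r
x <- matrix(0, 3, 3)

res <- lapply(1:3, function(i) {
  x[i, 1] <- 1   # modifies a copy of x that is local to this function call
  sum(x)         # the local copy does contain the 1
})

unlist(res)  # 1 1 1: every call saw exactly one modified element
sum(x)       # 0: the outer x is unchanged
```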
The apply family of functions should be used for its return value, rather than for programming functions with side effects.
A way to do this is to return the rows, then rbind them into a matrix. The second example is shown here, as this more closely resembles an apply:
do.call(rbind, lapply((1:N), function(i) rnorm(N)))
However, this is more readable:
matrix(rnorm(N*N), N, N)
Now to scale this so that the row sums equal 1. You use the fact that a matrix is stored column by column and that vectors are recycled, which means that you can divide a matrix M by rowSums(M). Using a more reasonable N=5:
N <- 5
m <- matrix(rnorm(N*N), N, N)
m/rowSums(m)
##           [,1]       [,2]        [,3]        [,4]        [,5]
## [1,] 0.1788692  0.5398464  0.24980924 -0.01282655  0.04430168
## [2,] 0.4176512  0.2564463  0.11553143  0.35432975 -0.14395871
## [3,] 0.3480568  0.7634421 -0.38433940  0.34175983 -0.06891932
## [4,] 1.1807180 -0.0192272  0.16500179 -0.31201400 -0.01447859
## [5,] 1.1601173 -0.1279919 -0.07447043  0.20865963 -0.16631458
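The recycling trick works because element [i, j] of a column-major matrix lines up with element i of the recycled row-sum vector. A short sketch, with an arbitrary seed, that confirms the row sums and compares the result with the more explicit sweep():

```r
set.seed(42)  # arbitrary seed so the example is reproducible
N <- 5
m <- matrix(rnorm(N * N), N, N)

scaled <- m / rowSums(m)   # rowSums(m) is recycled down each column
rowSums(scaled)            # each entry is 1 up to floating-point error

# sweep() states the intent explicitly: divide along margin 1 (rows)
all.equal(scaled, sweep(m, 1, rowSums(m), "/"))
```

Note that the same shortcut does not work for columns: m / colSums(m) would still recycle down the columns, so scaling columns needs sweep(m, 2, colSums(m), "/") or a transpose.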
No-loop solution :)
n <- 5
# on which column in each row insert 1s
s <- sample(n,n,TRUE)
# indexes for each row
w <- seq(1,n*n,by=n)-1
index <- s+w
# vector of 0s
vec <- integer(n*n)
# put 1s
vec[index] <- 1
# voila :)
matrix(vec,n,byrow = TRUE)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    0    0    0
[2,]    0    0    0    1    0
[3,]    0    0    0    0    1
[4,]    1    0    0    0    0
[5,]    1    0    0    0    0

R equivalent to diag(x,k) in matlab

I guess I have a two-level question about diag in R and MATLAB.
1) I was wondering whether there is an existing way to access different diagonals of matrices in R, similar to the way it is done in MATLAB (see http://www.mathworks.com/help/techdoc/ref/diag.html).
2) If there is no such function already, how can my code be improved so that it behaves like R's diag, where
diag(x = 1, nrow, ncol) # returns the values of the diagonal
diag(x) <- value # inserts values on the diagonal
At present my code returns the elements on the k-th diagonal, but how can it be written so that, when called in the second form above, it lets me insert values on that diagonal? Currently I use diag.ind to get the indices and then use those indices to insert the values on the k-th diagonal.
Here is the code:
library(plyr)  # provides rbind.fill.matrix
'diag.ind'<-function(x,k=0){
  if(k=='') k=0
  x<-as.matrix(x)
  if(dim(x)[2]==dim(x)[1]){
    stp_pt_r<-dim(x)[1]
    stp_pt_c<-dim(x)[2]
  }
  if(ncol(x)> dim(x)[1]){
    stp_pt_r<-dim(x)[1]
    stp_pt_c<-stp_pt_r + 1
  }
  if(ncol(x)< dim(x)[1]){
    stp_pt_c<-dim(x)[2]
    stp_pt_r<-stp_pt_c+1
  }
  if(k==0){
    r<-as.matrix(seq(1,stp_pt_r,by=1))
    c<-as.matrix(seq(1,stp_pt_c,by=1))
    ind.r<- cbind(r,c)
  }
  if(k>0){
    r<-t(as.matrix(seq(1,stp_pt_r,by=1)))
    c<-t(as.matrix(seq((1+k),stp_pt_c,by=1)))
    ind<-t(rbind.fill.matrix(r,c))
    ind.r<-ind[!is.na(ind[,2]),]
  }
  if(k<0){
    k<-abs(k)
    r<-t(as.matrix(seq((1+k),stp_pt_r,by=1)))
    c<-t(as.matrix(seq(1,stp_pt_c,by=1)))
    ind<-t(rbind.fill.matrix(r,c))
    ind.r<-ind[!is.na(ind[,1]),]
  }
  diag.x<-x[ind.r]
  output<-list(diag.x=diag.x, diag.ind=ind.r)
  return(output)
}
This is kind of clunky and I feel like I must be reinventing the wheel. Thanks in advance for any insight!
After your reply to Andrie this may satisfy:
exdiag <- function(mat, off) {mat[row(mat)+off == col(mat)]}
x <- matrix(1:16, ncol=4)
exdiag(x,1)
#[1] 5 10 15
I was thinking you wanted a function that can assign or return a diagonal, sub-diagonal or super-diagonal matrix. This is the constructor function:
subdiag <- function(vec, size, offset=0){
  M <- matrix(0, size, size)
  M[row(M)-offset == col(M)] <- vec
  return(M)
}
> subdiag(1, 5, 1)
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    0    0    0
[2,]    1    0    0    0    0
[3,]    0    1    0    0    0
[4,]    0    0    1    0    0
[5,]    0    0    0    1    0
Called with only two arguments you would get a diagonal matrix. You can construct super-diagonal matrices with negative offsets. If this is what you wanted for the constructor, then it should not be too hard to construct a similar subdiag<- function to go along with it.
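A replacement function along those lines might look like the sketch below. The name `exdiag<-` is hypothetical (chosen to pair with exdiag above, so a positive off selects a super-diagonal, the opposite sign convention from subdiag); R only requires that the function name end in <- and that the last argument be called value:

```r
# Hypothetical replacement function; `off` follows exdiag's convention
# (off > 0 selects a super-diagonal, off < 0 a sub-diagonal)
`exdiag<-` <- function(mat, off = 0, value) {
  mat[row(mat) + off == col(mat)] <- value
  mat
}

x <- matrix(0, 4, 4)
exdiag(x, 1) <- 1:3   # fill the first super-diagonal
exdiag(x, -1) <- 9    # fill the first sub-diagonal (value is recycled)
x
```

After these two assignments x[1, 2], x[2, 3] and x[3, 4] hold 1, 2 and 3, and the three entries just below the main diagonal hold 9.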
In MATLAB, to assign the values x to the diagonal of A:
n = size(A,1);
A(1:n+1:end) = x
Look up linear indexing.
Although, that might not be what you asked.
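R matrices are also stored column by column, so the same linear-indexing idea carries over. A minimal sketch for a square matrix (the variable names are illustrative): stepping through the linear indices in strides of n + 1 walks down a diagonal.

```r
n <- 4
A <- matrix(0, n, n)

# Linear indices 1, 6, 11, 16 are the main diagonal of a 4 x 4 matrix
A[seq(1, n * n, by = n + 1)] <- 1:n
diag(A)

# Starting at k * n + 1 instead gives the k-th super-diagonal (here k = 1)
x <- matrix(1:16, ncol = 4)
k <- 1
x[seq(k * n + 1, n * n, by = n + 1)]
# [1]  5 10 15
```

The last result matches what exdiag(x, 1) returns for the same matrix.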
