I'm looking to generate all possible binary vectors of length n in R. What is the best way (preferably both computationally efficient and readable code) to do this?
n = 3
expand.grid(replicate(n, 0:1, simplify = FALSE))
# Var1 Var2 Var3
#1 0 0 0
#2 1 0 0
#3 0 1 0
#4 1 1 0
#5 0 0 1
#6 1 0 1
#7 0 1 1
#8 1 1 1
Inspired by this question generating all possible binary vectors of length n containing less than m 1s, I've extended this code to produce all possible combinations. It's not pretty, though.
> n <- 3
> z <- rep(0, n)
> do.call(rbind, lapply(0:n, function(i) t(apply(combn(1:n,i), 2, function(k) {z[k]=1;z}))))
[,1] [,2] [,3]
[1,] 0 0 0
[2,] 1 0 0
[3,] 0 1 0
[4,] 0 0 1
[5,] 1 1 0
[6,] 1 0 1
[7,] 0 1 1
[8,] 1 1 1
What is it doing? Once we strip it back, the core of this one-liner is the following:
apply(combn(1:n,i), 2, function(k) {z[k]=1;z})
To understand this, let's step back one level further. The function combn(x,m) generates all possible combinations of x taken m at a time.
> combn(1:n, 1)
[,1] [,2] [,3]
[1,] 1 2 3
> combn(1:n, 2)
[,1] [,2] [,3]
[1,] 1 1 2
[2,] 2 3 3
> combn(1:n, 3)
[,1]
[1,] 1
[2,] 2
[3,] 3
Using apply with MARGIN = 2, we pass one column of this matrix at a time to our inner function, function(k) {z[k]=1;z}, which simply replaces the values at the indices k with 1. Since they were originally all 0, this gives us each possible binary vector.
The rest is just window dressing. combn gives us a wide, short matrix; we transpose it with t. lapply returns a list; we bind the matrices in each element of the list together with do.call(rbind, .).
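As a quick sanity check (a sketch, not part of the original answer), the combn construction and the expand.grid one-liner should produce the same set of vectors, only in a different row order. The helper name key is made up for this illustration.

```r
n <- 3
z <- rep(0, n)

# combn-based construction from above
by_combn <- do.call(rbind, lapply(0:n, function(i) {
  t(apply(combn(1:n, i), 2, function(k) { z[k] <- 1; z }))
}))

# expand.grid-based construction, converted to a plain matrix
by_grid <- as.matrix(expand.grid(replicate(n, 0:1, simplify = FALSE)))

# key() is a hypothetical helper: collapse each row to a string, then sort
key <- function(m) sort(unname(apply(m, 1, paste, collapse = "")))

same_set <- identical(key(by_combn), key(by_grid))
same_set
```

Comparing sorted row strings rather than the matrices themselves sidesteps the different row order and the column names that expand.grid adds.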
You should define what "the best way" means (fastest? shortest code? etc.).
One way is to use the package R.utils and the function intToBin for converting decimal numbers to binary numbers. See the example.
require(R.utils)
n <- 5
strsplit(intToBin(0:(2 ^ n - 1)), split = "")
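If you would rather avoid a package dependency, here is a base-R sketch of the same idea using intToBits. Two things to keep in mind: the bits come out least significant first, and intToBits works on 32-bit integers, so this only makes sense for n below 32. The variable name bits is just for illustration.

```r
n <- 3

# intToBits() returns 32 raw bits per integer, least significant first;
# keep the first n of them and convert to 0/1 integers, one row per number
bits <- do.call(rbind, lapply(0:(2^n - 1), function(i) {
  as.integer(intToBits(i))[1:n]
}))
bits
```

Unlike the strsplit result, this is a numeric matrix rather than a list of character vectors, which may be easier to work with downstream.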
I have two arrays.
Using numpy.append we can merge two arrays.
How can we do the same thing in R?
merge cannot do that.
Python Output/Example:
a=np.array([1,2,3,4,5,5])
b=np.array([0,0,0,0,0,0])
np.append(a,b)
array([1, 2, 3, 4, 5, 5, 0, 0, 0, 0, 0, 0]) # this is what I want
x<-c(mat , (0.0) * (l - length(demeaned)))
mat is a matrix (20 elements)
l - length(demeaned) is 10
In the end I want a vector of length 30.
The c function concatenates its arguments. A vector can be a concatenation of numbers or of other vectors:
a = c(1,2,3,4,5,5)
b = c(0,0,0,0,0,0)
c(a,b)
[1] 1 2 3 4 5 5 0 0 0 0 0 0
At least for one-dimensional arrays like in your Python example, this is equivalent to np.append.
Adding to the previous answer, you can use rbind or cbind to create two-dimensional arrays (matrices) from simple arrays (vectors):
cbind(a,b)
# output
a b
[1,] 1 0
[2,] 2 0
[3,] 3 0
[4,] 4 0
[5,] 5 0
[6,] 5 0
or
rbind(a,b)
# output
[,1] [,2] [,3] [,4] [,5] [,6]
a 1 2 3 4 5 5
b 0 0 0 0 0 0
If you want to convert it back to a vector, use as.vector. This
as.vector(rbind(a,b))
will give you a joined vector with alternating elements.
Also, note that c can flatten lists if you use the recursive=TRUE argument:
a <- list(1,list(1,2,list(3,4)))
b <- 10
c(a,b, recursive = TRUE)
# output
[1] 1 1 2 3 4 10
Finally, you can use rep to generate sequences of repeating numbers:
rep(0,10)
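Putting c and rep together, the padding from the question could be sketched like this. The contents of mat and the name pad are made up for the example; pad stands in for l - length(demeaned), which the question says is 10.

```r
mat <- matrix(1:20, nrow = 4)   # hypothetical 20-element matrix
pad <- 10                       # stands in for l - length(demeaned)

# c() drops the matrix dimensions (column by column) and appends the zeros
x <- c(mat, rep(0, pad))
length(x)
```

Note that c flattens the matrix in column-major order, so the first 20 elements of x follow the columns of mat, not its rows.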
I am trying to fill some rows of a (500,2) matrix with the row vector (1,0) using this code, last line is to verify the result:
data<-matrix(ncol=2,nrow=500)
data[41:150,]<-matrix(c(1,0),nrow=1,ncol=2,byrow=TRUE)
data[41:45,]
But the result is
> data[41:45,]
[,1] [,2]
[1,] 1 1
[2,] 0 0
[3,] 1 1
[4,] 0 0
[5,] 1 1
instead of
> data[41:45,]
[,1] [,2]
[1,] 1 0
[2,] 1 0
[3,] 1 0
[4,] 1 0
[5,] 1 0
(1) What am I doing wrong?
(2) Why aren't the row indices in the result 41, 42, 43, 44 and 45?
You're trying to fill a part of the matrix, so the block you're trying to drop in there should be of the right size:
data[41:150,]<-matrix(c(1,0),nrow=110,ncol=2,byrow=TRUE)
# nrow = 110, instead of 1 !!!!
Otherwise your piece-to-be-added will be treated as a plain vector, recycled, and filled in column-wise. Try, for example, this:
data[41:150,] <- matrix(c(1,2,3,4,5), nrow=5, ncol=2, byrow=TRUE)
data[41:45,]
[,1] [,2]
[1,] 1 1
[2,] 3 3
[3,] 5 5
[4,] 2 2
[5,] 4 4
Can one complain? Yes and no. No, because R behaves as documented (matrices are vectors with dimension attributes, and recycling works on vectors). Yes, because although recycling can be convenient, it may create false expectations.
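The recycling behaviour is easier to see on a small matrix. This is only an illustrative sketch; the names m, recycled and intended are not from the question.

```r
m <- matrix(NA_real_, nrow = 4, ncol = 2)

# The value c(1, 0) is recycled down column 1, then down column 2
m[1:4, ] <- c(1, 0)
recycled <- m

# Repeating each value once per row gives the layout the question wanted
m[1:4, ] <- rep(c(1, 0), each = 4)
intended <- m

recycled
intended
```

In the first assignment both columns end up as 1, 0, 1, 0; in the second, column 1 is all ones and column 2 is all zeros.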
Why aren't the row indices 41, 42, 43, ...? That's just the way matrices and vectors behave: the printed labels are positions within the result, not within the original object.
> (1:10)[5:6]
[1] 5 6
(Notice there's [1] in the output, not [5].)
Data frames behave differently, so you would see the original line numbers for slices:
as.data.frame(data)[45:50,]
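If you want a matrix slice to keep its original row labels, one option (a sketch, not something proposed in the answer itself) is to give the matrix explicit row names first. The name slice is just for illustration.

```r
data <- matrix(0, nrow = 500, ncol = 2)

# Row names are stored as character labels "1", "2", ..., "500"
rownames(data) <- seq_len(nrow(data))

# Subsetting carries the labels along, so the slice prints 41..45
slice <- data[41:45, ]
rownames(slice)
```

These are labels, not indices: after subsetting, slice[1, ] is still the first row of the slice, and the original position is only visible through rownames.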
It will be cleaner to just do this column-wise:
data[41:150, 1L] = 1
data[41:150, 2L] = 0
You could also accomplish this in one line with matrix indexing like so:
data[cbind(rep(41:150, each = 2L), 1:2)] = 1:0
You could use rep.
data[41:150,] <- rep(1:0, each=150-41+1)
#> data[41:45,]
# [,1] [,2]
#[1,] 1 0
#[2,] 1 0
#[3,] 1 0
#[4,] 1 0
#[5,] 1 0
I think MichaelChirico's approach is the cleanest/safest to use.
Suppose I have a list of matrices. Suppose further I have found the smallest values by the column.
Here is my last question
I really need to know from which matrix each smallest value is selected. My original function is very complicated, so I have provided a simple example. I have one idea but do not know how to implement it correctly in R.
My idea is:
Suppose that [i,j] is the elements of the matrix. Then,
if(d[[1]][i,j] < d[[2]][i,j]){
d[[1]][i,j] <- "x"
}else { d[[2]][i,j] <- "z"}
So, I would like to assign the name of the matrix that corresponds to each smallest value, and then store the names in a separate matrix. That way I can see the values in one matrix and their corresponding names (where they come from) in another matrix.
For example,
y <- c(3,2,4,5,6, 4,5,5,6,7)
x <- matrix(0, 5, 5) # start from a 5 x 5 matrix of zeros
x[lower.tri(x,diag=F)] <- y
> x
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 0 0 0
[2,] 3 0 0 0 0
[3,] 2 6 0 0 0
[4,] 4 4 5 0 0
[5,] 5 5 6 7 0
k <- c(1,4,5,2,5,-4,4,4,4,5)
z <- matrix(0, 5, 5) # start from a 5 x 5 matrix of zeros
z[lower.tri(z,diag=F)] <- k
> z
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 0 0 0
[2,] 1 0 0 0 0
[3,] 4 5 0 0 0
[4,] 5 -4 4 0 0
[5,] 2 4 4 5 0
d <- list(z, x)
Then:
do.call(pmin, d) (answered by @akrun)
Then I only get the matrix of smallest values. I would like to know which matrix each value comes from.
Any ideas or help, please?
You can use Map and do.call to create your own functions that will be applied element-wise to a list of inputs,
in your case a list of matrices.
pwhich.min <- function(...) {
which.min(c(...)) # which.min takes a single vector as input
}
di <- unlist(do.call(Map, c(list(f = pwhich.min), d)))
dim(di) <- dim(x) # take dimension from one of the inputs
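di # the output below corresponds to d <- list(x, z): 1 = value from x, 2 = value from z (ties give 1)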
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 2 1 1 1 1
[3,] 1 2 1 1 1
[4,] 1 2 2 1 1
[5,] 2 2 2 2 1
EDIT:
To elaborate, you could do something like Map(f = min, z, x) to apply min to each pair of values in z and x, although in that case min already supports an arbitrary number of inputs through an ellipsis (...). By contrast, which.min only takes a single vector as input, so you need a wrapper with an ellipsis that combines all values into a vector (pwhich.min above). Since you may want to have more than two matrices, you can put them all in a list and use do.call to pass each element of the list as an argument to Map, which hands them on to the function you specify in f.
Or another option would be to convert it to a 3D array and use apply with which.min
apply(array(unlist(d), c(5, 5, 2)), c(1, 2), which.min)
Or with pmap from purrr
library(purrr)
pmap_int(d, ~ which.min(c(...))) %>%
array(., dim(x))
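To get the matrix of names the question asks for, the index matrix can be used to look up names. The following is a sketch under one assumption that is not in the original: the list elements are named, here as d <- list(z = z, x = x). The names mins, idx and origin are illustrative.

```r
x <- matrix(0, 5, 5); x[lower.tri(x)] <- c(3, 2, 4, 5, 6, 4, 5, 5, 6, 7)
z <- matrix(0, 5, 5); z[lower.tri(z)] <- c(1, 4, 5, 2, 5, -4, 4, 4, 4, 5)
d <- list(z = z, x = x)   # naming the elements is an assumption of this sketch

# Element-wise minimum across all matrices in the list
mins <- do.call(pmin, unname(d))

# For each cell, the position in d of the matrix holding the minimum
idx <- apply(array(unlist(d, use.names = FALSE), c(dim(x), length(d))),
             c(1, 2), which.min)

# Translate positions into names, keeping the matrix shape
origin <- matrix(names(d)[idx], nrow = nrow(idx))
origin
```

Because which.min returns the first minimum, ties (such as the zeros in the upper triangle) are attributed to whichever matrix comes first in the list.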
I've run into an R-programming problem that I can't seem to wrap my head around. I have data like the following:
data = data.frame("start"=c(1,2,4,5),
"length"=c(2,2,2,3),
"decision"=c("yes","no","yes","yes"))
Which looks like:
start length decision
1 1 2 yes
2 2 2 no
3 4 2 yes
4 5 3 yes
Row one stands for a sequence of integers that start at 1 for length 2 (1,2). Row 3 is 2 integers starting at 4 (4,5). I'm looking for intersections between entries that have a 'yes' decision variable. When the decision variable is 'no', then the sequence is thrown out. Here's what I've attempted so far.
I think I need to create a sequence list first.
sequence.list = lapply(seq(dim(data)[1]),
function(d){
seq(data$start[d],(data$start[d]+data$length[d]-1),by=1)
})
This outputs:
sequence.list
[[1]]
[1] 1 2
[[2]]
[1] 2 3
[[3]]
[1] 4 5
[[4]]
[1] 5 6 7
Which is a start. Then I create a list that counts intersections between items on my list (I stole this idea from another post on here).
count.intersect = lapply(sequence.list,function(a) {
sapply(seq(length(sequence.list)),
function(b) length(intersect(sequence.list[[b]], a)))
})
This creates the list:
count.intersect
[[1]]
[1] 2 1 0 0
[[2]]
[1] 1 2 0 0
[[3]]
[1] 0 0 2 1
[[4]]
[1] 0 0 1 3
The way to read this is that entry 1 in the data frame has 2 trivial intersections with itself and 1 intersection with entry 2.
Here's where I get fuzzy on what to do. Make it a matrix?
intersect.matrix = do.call(rbind,count.intersect)
Then set the rows and columns of non-used entries to zero?
intersect.matrix[,data$decision=="no"]=0
intersect.matrix[data$decision=="no",]=0
intersect.matrix
[,1] [,2] [,3] [,4]
[1,] 2 0 0 0
[2,] 0 0 0 0
[3,] 0 0 2 1
[4,] 0 0 1 3
Now, I would like to return indices 3 and 4 somehow. I want to find the rows (or columns) containing non zeros that are also not on the diagonal.
Sorry for posting the whole procedure, I also want to know if there is a shorter way to go from the starting dataframe to finding intersections in used entries.
Since you are not interested in non zero values on the diagonal, first I would subtract them away:
diag.mat <- diag(intersect.matrix) * diag(ncol(intersect.matrix))
which gives:
intersect.matrix - diag.mat
[,1] [,2] [,3] [,4]
[1,] 0 0 0 0
[2,] 0 0 0 0
[3,] 0 0 0 1
[4,] 0 0 1 0
Then identify which of the columns still hold non zero entries using which:
which(colSums(intersect.matrix - diag.mat) != 0)
[1] 3 4
You asked whether there is a short way to go from your data frame data to the indices. Here it is.
(Note: This may be hard to understand if you're new to R.)
1) Create the sequence list:
sequence.list <- apply(data[1:2], 1, function(x) seq_len(x[2]) + x[1] - 1)
# [[1]]
# [1] 1 2
#
# [[2]]
# [1] 2 3
#
# [[3]]
# [1] 4 5
#
# [[4]]
# [1] 5 6 7
2) Count intersects and create the intersect matrix
# compare sequences a and b only (not everything between them)
intersect.matrix <- outer(s <- seq_along(sequence.list), s,
Vectorize(function(a, b)
length(intersect(sequence.list[[a]], sequence.list[[b]]))))
# [,1] [,2] [,3] [,4]
# [1,] 2 1 0 0
# [2,] 1 2 0 0
# [3,] 0 0 2 1
# [4,] 0 0 1 3
3) Set cells corresponding to "no" to zero
idx <- data$decision == "no"
intersect.matrix[idx, ] <- intersect.matrix[ , idx] <- 0
# [,1] [,2] [,3] [,4]
# [1,] 2 0 0 0
# [2,] 0 0 0 0
# [3,] 0 0 2 1
# [4,] 0 0 1 3
4) Find indices of non-zero rows/columns (except diagonal)
result <- which(as.logical(colSums("diag<-"(intersect.matrix, 0))))
# [1] 3 4
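Since each row describes a run of consecutive integers, the overlap counts can also be computed arithmetically without building the sequences at all. This is an alternative sketch, not the method used above; it assumes every length is at least 1, and the names end, overlap and keep are made up for the example.

```r
data <- data.frame(start    = c(1, 2, 4, 5),
                   length   = c(2, 2, 2, 3),
                   decision = c("yes", "no", "yes", "yes"))

end <- data$start + data$length - 1

# Two integer ranges [s_a, e_a] and [s_b, e_b] share
# max(0, min(e_a, e_b) - max(s_a, s_b) + 1) integers
overlap <- pmax(outer(end, end, pmin) - outer(data$start, data$start, pmax) + 1, 0)

# Discard "no" entries and the trivial self-overlaps on the diagonal
keep <- data$decision == "yes"
overlap[!keep, ] <- 0
overlap[, !keep] <- 0
diag(overlap) <- 0

result <- which(rowSums(overlap) > 0)
result
```

The matrix is passed as the first argument to pmax on purpose, because pmax copies attributes such as dim from its first argument.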
Okay, I asked this question earlier but I got bashed (deservedly) for not specifying anything and showing no sign of previous attempt. So let me try again..
I'm using R, and I have a 463✕463 matrix. What I would like to do is to replace all elements other than the diagonal ones (X11, X22, X33,...,Xjj) with zero.
E.g. I want:
[1 4 5
2 3 5
3 9 8]
to be:
[1 0 0
0 3 0
0 0 8]
When I use the diag() function, it simply gives me a column vector of the diagonal values. I imagine I can use the replace() function somehow combined with a "if not diagonal" logic...but I am lost.
And yes, as some here have guessed, I am probably much younger than many people here and am completely new at this...so please put me in the right direction. Really appreciate all your help!
In R, the diag function has two uses.
It returns the diagonal of a matrix, e.g.
m <- matrix(1:9, ncol=3)
m
# [,1] [,2] [,3]
# [1,] 1 4 7
# [2,] 2 5 8
# [3,] 3 6 9
diag(m)
# [1] 1 5 9
It can construct a diagonal matrix.
diag(1:3)
# [,1] [,2] [,3]
# [1,] 1 0 0
# [2,] 0 2 0
# [3,] 0 0 3
So in your case, extract the diagonal from your existing matrix and supply it to diag:
diag(diag(m))
# [,1] [,2] [,3]
# [1,] 1 0 0
# [2,] 0 5 0
# [3,] 0 0 9
using outer
You can use the following to compute a logical matrix which describes the non-diagonal entries of an n×n matrix:
outer(1:n, 1:n, function(i,j) i!=j)
Applied to your example:
> m <- matrix(c(1,2,3,4,3,9,5,5,8),ncol=3)
> m
[,1] [,2] [,3]
[1,] 1 4 5
[2,] 2 3 5
[3,] 3 9 8
> m[outer(1:3, 1:3, function(i,j) i!=j)] <- 0
> m
[,1] [,2] [,3]
[1,] 1 0 0
[2,] 0 3 0
[3,] 0 0 8
using triangles
A possible alternative would be combining the two triangles on either side of the diagonal. In this case, you use the matrix m itself as input to determine the size.
upper.tri(m) | lower.tri(m)
Applied to your use case:
> m[upper.tri(m) | lower.tri(m)] <- 0
It seems you already got this answer in response to your original post…
m[ col(m)!=row(m) ] <- 0
> m <- matrix(1:9, 3)
> m[ col(m)==row(m) ]
[1] 1 5 9
> m[ col(m)!=row(m) ] <- 0
> m
[,1] [,2] [,3]
[1,] 1 0 0
[2,] 0 5 0
[3,] 0 0 9
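For a matrix of the size mentioned in the question, the approaches from the answers can be checked against each other. This is only a sketch: the random matrix is a hypothetical stand-in for the real 463 x 463 data, and the variable names are illustrative.

```r
set.seed(1)
m <- matrix(rnorm(463 * 463), nrow = 463)  # hypothetical stand-in for the real data

# Rebuild a diagonal matrix from the extracted diagonal
via_diag <- diag(diag(m))

# Zero out every cell whose row and column indices differ
via_index <- m
via_index[row(m) != col(m)] <- 0

# Zero out the upper and lower triangles
via_tri <- m
via_tri[upper.tri(m) | lower.tri(m)] <- 0

all_equal <- identical(via_diag, via_index) && identical(via_index, via_tri)
all_equal
```

One caveat with diag(diag(m)): when diag is given a single number it builds an identity matrix of that size, so the indexing variants are the safer choice if m could ever be 1 x 1.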