All binary sequences in R - r

I'm trying to write a recursive function that takes as input an integer n and returns a matrix that contains all binary sequences of length n.
I wrote this code, but it does not produce any output:
binseq <- function(n){
binsequ <- matrix(nrow = length(n), ncol = n)
r <- 0 # current row of binseq
for (i in 0:n) {
for (j in 0:n) {
for (k in 0:n) {
r <- r + 1
return (binsequ[r,] <- c(i, j, k))
}
}
}
}
I tried to run it using n=3
binseq(3)
But with no success.
However, when I do not use the function command and give specific numbers, it works. For example,
binseq <- matrix(nrow = 8, ncol = 3)
r <- 0 # current row of binseq
for (i in 0:1) {
for (j in 0:1) {
for (k in 0:1) {
r <- r + 1
binseq[r,] <- c(i, j, k)
}
}
}
binseq
the output is:
[,1] [,2] [,3]
[1,] 0 0 0
[2,] 0 0 1
[3,] 0 1 0
[4,] 0 1 1
[5,] 1 0 0
[6,] 1 0 1
[7,] 1 1 0
[8,] 1 1 1

I wrote a function for this. I was planning to make it recursive, but it ended up using a loop. I landed here hoping to see how others approached it, since it sounds like a fairly basic question. In case it helps, here is my function.
##k: number of digits, k>1, eg k=3 for 101 etc
binary_gen <- function(k) {
base <- c(0L,1L)
##initialize
bi.set <- integer(2^(k))
bi.set[1:2] <- base
##create set through loop
for (i in 2:k) {
##put a leading 1 (as 10^(i-1)) in front of every value built so far
bi.set[(2^(i-1)+1):2^i] <- 10^(i-1)+bi.set[1:2^(i-1)]
}
return(bi.set)
}
Here is output for k=4.
> binary_gen(4L)
[1] 0 1 10 11 100 101 110 111 1000 1001 1010 1011 1100 1101 1110 1111
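You can reshape the output vector into a matrix of whatever shape you need (for example k rows and 2^k columns). Note that each entry is a decimal-coded number such as 101, not a vector of digits. The numbers with exactly k digits (a leading 1) sit at positions 2^(k-1)+1 through 2^k of the vector.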
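For the matrix the question asks for, a recursive version is also fairly short. The following is a minimal sketch, not part of the answer above; it reuses the name binseq from the question and builds the sequences of length n by prefixing 0, and then 1, to every sequence of length n - 1:

```r
# Recursive sketch: all binary sequences of length n as rows of a matrix.
binseq <- function(n) {
  stopifnot(n >= 1)
  # base case: the two sequences of length 1
  if (n == 1) return(matrix(0:1, ncol = 1))
  prev <- binseq(n - 1)            # 2^(n-1) rows, n-1 columns
  # prefix every shorter sequence with 0, then with 1
  rbind(cbind(0L, prev), cbind(1L, prev))
}

binseq(3)
```

For n = 3 this gives the same eight rows, in the same order, as the hard-coded triple loop in the question.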

Related

conditional which.min function

I have two sets of data, one is coordinates of machines, one is coordinates of the nearest repair shop.
I have a working model that has assigned each machine to the nearest store. However one store only has 1 machine and another has 7 machines assigned to it.
What I want is to add a condition so that each store is assigned at least 2 machines but no more than 4.
library(geosphere)
library(ggplot2)
#machine Locations
machine.x <- c(-122.37, -111.72, -111.87, -112.05, -87.17, -86.57, -86.54, -88.04, -86.61, -88.04, -86.61)
machine.y <- c(37.56, 35.23, 33.38, 33.57, 30.36, 30.75, 30.46, 30.68, 30.42, 30.68, 30.42)
machines <- data.frame(machine.x, machine.y)
#store locations
store.x <- c(-121.98, -112.17, -86.57)
store.y <- c(37.56, 33.59, 30.75)
stores <- data.frame(store.x, store.y)
centers<-data.frame(x=stores$store.x, y=stores$store.y)
pts<-data.frame(x=(machines$machine.x), y=(machines$machine.y))
#allocate space
distance<-matrix(-1, nrow = length(pts$x), ncol= length(centers$x))
#calculate the dist matrix - the define centers to each point
#columns represent centers and the rows are the data points
dm<-apply(data.frame(1:length(centers$x)), 1, function(x){ replace(distance[,x], 1:length(pts$x), distGeo(centers[x,], pts))})
#find the column with the smallest distance
closestcenter<-apply(dm, 1, which.min)
#color code the original data for verification
colors<-c(stores)
#create a scatter plot of assets color coded by which fe they belong to
plot(pts, col=closestcenter, pch=9)
So what I want is for each group to have a minimum count of 2 and a maximum count of 4. I tried adding an if/else statement to the closestcenter variable, but it didn't come close to working the way I expected, and I've looked around online but can't find any way to add a counting condition to the which.min call.
Note: My actual data set has several thousand machines and over 100 stores.
If M is an 11 x 3 zero-one matrix where M[i,j] = 1 if machine i is assigned to store j and 0 otherwise, then each row of M must sum to 1, each column must sum to between 2 and 4 inclusive, and we want the M that minimizes the total distance sum(M * dm). This gives the 0-1 linear program shown below. A is constructed so that A %*% c(M) is the same as rowSums(M), and B so that B %*% c(M) is the same as colSums(M).
library(lpSolve)
k <- 3
n <- 11
dir <- "min"
objective.in <- c(dm)
A <- t(rep(1, k)) %x% diag(n)
B <- diag(k) %x% t(rep(1, n))
const.mat <- rbind(A, B, B)
const.dir <- c(rep("==", n), rep(">=", k), rep("<=", k))
const.rhs <- c(rep(1, n), rep(2, k), rep(4, k))
res <- lp(dir, objective.in, const.mat, const.dir, const.rhs, all.bin = TRUE)
res
## Success: the objective function is 9025807
soln <- matrix(res$solution, n, k)
and this solution:
> soln
[,1] [,2] [,3]
[1,] 1 0 0
[2,] 1 0 0
[3,] 0 1 0
[4,] 0 1 0
[5,] 0 1 0
[6,] 0 0 1
[7,] 0 0 1
[8,] 1 0 0
[9,] 0 0 1
[10,] 0 1 0
[11,] 0 0 1
or in terms of the vector of store numbers assigned to each machine:
c(soln %*% (1:k))
## [1] 1 1 2 2 2 3 3 1 3 2 3
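To see why A and B pick out row and column sums, they can be checked on a small assignment matrix without solving anything. This sketch uses only base R; the 4 x 3 matrix M and its assignments are made up for illustration:

```r
n <- 4  # machines (rows of M)
k <- 3  # stores (columns of M)

# hypothetical assignment: machine i goes to store c(1, 2, 2, 3)[i]
M <- matrix(0, n, k)
M[cbind(1:n, c(1, 2, 2, 3))] <- 1

A <- t(rep(1, k)) %x% diag(n)   # n x (n*k): k identity blocks side by side
B <- diag(k) %x% t(rep(1, n))   # k x (n*k): one block of n ones per store

# c(M) stacks the columns of M into one long vector
row_sums <- drop(A %*% c(M))
col_sums <- drop(B %*% c(M))

stopifnot(all(row_sums == rowSums(M)), all(col_sums == colSums(M)))
```

Both matrices have n*k columns, one per machine/store pair, so the constraint matrix grows quickly with the size of the real data set.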

how to make double for loops faster in R

I am trying to do the calculation below in R. My function is recursive and uses a double for loop to fill the "result" matrix. Is there a way to replace the for loops, or to evaluate the if condition faster?
x<-rnorm(2400,0, 3)
y<-rnorm(400,0,3)
no_row<-length(x)
no_col<-length(y)
input<-matrix(data=1,nrow = no_row, ncol = no_col)
result<-matrix(nrow = no_row, ncol = no_col)
calculation<-function(x,y)
{
for(i in 1:no_row)
{
for(j in 1:no_col)
{
z<-exp(x[i]-y[j])
result[i,j]<-(z/1+z)
}
}
new_x<-x-1
new_y<-y-1
residual<-input-result
sq_sum_residulas<-sum((rowSums(residual, na.rm = T))^2)
if(sq_sum_residulas>=1){calculation(new_x,new_y)}
else(return(residual))
}
output<-calculation(x,y)
To complement Benjamin's answer: you shouldn't use a recursive function here. Use a while loop with a max_iter parameter instead.
Reusing Benjamin's function:
calculation2 <- function(x, y){
result <- outer(x, y, function(x, y) { z <- exp(x - y); z / 1 + z})
result
}
calculation <- function(x, y, max_iter = 10){
input <- matrix(data=1,nrow = length(x), ncol = length(y))
sq_sum_residulas <- 1 # Initialize it to enter while loop
new_x <- x # Computation x: it will be updated at each loop
new_y <- y # Computation y
n_iter <- 1 # Counter of iteration
while(sq_sum_residulas >= 1 && n_iter < max_iter){
result <- calculation2(new_x, new_y)
new_x <- new_x - 1 # update the previous iterate, not the original x
new_y <- new_y - 1
residual <- input - result
sq_sum_residulas <- sum((rowSums(residual, na.rm = T))^2)
n_iter <- n_iter + 1
}
if (n_iter == max_iter){
stop("Didn't converge")
}
return(residual)
}
If you run this code, you will see that it doesn't converge. I guess there is a mistake in your computation, especially in z/1 + z?
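The precedence issue is easy to check in isolation. In R, / binds tighter than +, so z/1 + z is just 2*z, whereas z/(1 + z) is the logistic function of x - y (which base R's plogis also computes). A small sketch with an arbitrary value:

```r
d <- 0.5                   # stands in for one x[i] - y[j]
z <- exp(d)

as_written <- z / 1 + z    # parsed as (z / 1) + z
logistic   <- z / (1 + z)  # probably what was intended

stopifnot(isTRUE(all.equal(as_written, 2 * z)))
stopifnot(isTRUE(all.equal(logistic, plogis(d))))
```

With the parenthesized form every entry of result lies strictly between 0 and 1, which is what a residual against a 0/1 input matrix would need.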
The outer function is the tool you are looking for.
Compare these two functions that only generate the result matrix
x<-rnorm(100,0, 3)
y<-rnorm(100,0,3)
calculation<-function(x,y)
{
result <- matrix(nrow = length(x), ncol = length(y))
for(i in seq_along(x))
{
for(j in seq_along(y))
{
z<-exp(x[i]-y[j])
result[i,j]<-(z/1+z)
}
}
result
}
calculation2 <- function(x, y){
result <- outer(x, y, function(x, y) { z <- exp(x - y); z / 1 + z})
result
}
library(microbenchmark)
microbenchmark(
calculation(x, y),
calculation2(x, y)
)
Unit: microseconds
expr min lq mean median uq max neval
calculation(x, y) 1862.40 1868.119 1941.5523 1871.490 1876.1825 8375.666 100
calculation2(x, y) 466.26 469.192 515.3696 471.392 480.9225 4481.371 100
That discrepancy in time seems to grow as the length of the vectors increases.
Note, this will solve the speed for your double for loop, but there seem to be other issues in your function. It isn't clear to me what you are trying to do, or why you are calling calculation from within itself. As you have it written, there are no changes to x and y before it gets to calling itself again, so it would be stuck in a loop forever, if it worked at all (it doesn't on my machine)
@Benjamin @Emmanuel-Lin Thanks for the solutions :) I was able to solve the issue with your input. Please find the sample data set and code below. The solution converges when sq_sum_residual becomes less than 0.01. This is more than 12x faster than my code with the double for loops. Sorry for the confusion created by the sample data and the new_x, new_y calculation in the question.
Input is a dichotomous 9x10 matrix
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1 NA 1 1 1 1 1 1 1 0 1
2 1 1 1 1 1 1 1 0 1 0
3 1 1 1 1 1 1 0 1 0 0
4 1 1 1 1 1 1 0 1 0 0
5 1 1 1 1 1 1 0 1 0 0
6 1 1 1 1 1 0 1 0 0 0
7 1 1 1 1 0 1 0 0 0 0
8 1 0 1 0 1 0 0 0 0 0
9 0 1 0 1 0 0 0 0 0 0
x<-c( 2.0794415,1.3862944,0.8472979, 0.8472979, 0.8472979,0.4054651,0.0000000, -0.8472979, -1.3862944)
y<-c(-1.4404130, -1.5739444, -1.5739444, -1.5739444, -0.7472659, -0.1876501, 1.1986443 , 0.7286407,2.5849387,2.5849387 )
result<-matrix(nrow = length(x), ncol = length(y))
calculation<-function(x,y)
{
result<-outer(x,y,function(x,y){ z<-exp(x-y);z/(1+z)})
result[!is.finite(result)]<-NA
variance_result<-result*(1-result)
row_var<- (-1)*rowSums(variance_result,na.rm=T)
col_var<- (-1)*colSums(variance_result,na.rm=T)
residual<-input-result
row_residual<-rowSums(residual,na.rm=T)#(not to be multiplied by -1)
col_residual<-(-1)*colSums(residual,na.rm=T)
new_x<-x-(row_residual/row_var)
new_x[!is.finite(new_x)]<-NA
new_x<-as.array(new_x)
new_y<-y-(col_residual/col_var)
new_y[!is.finite(new_y)]<-NA
avg_new_y<-mean(new_y, na.rm = T)
new_y<-new_y-avg_new_y
new_y<-as.array(new_y)
sq_sum_residual<-round(sum(row_residual^2),5)
if(sq_sum_residual>=.01)
{calculation(new_x,new_y)}
else(return(residual))
}
calculation(x,y)
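The code above refers to input without defining it. Here is a sketch of how the 9x10 matrix shown above could be entered so that the function can run; the values are copied from the table, and the X1..X10 column names follow the printout:

```r
# Dichotomous 9 x 10 input matrix, entered row by row from the table above.
input <- matrix(c(
  NA, 1, 1, 1, 1, 1, 1, 1, 0, 1,
   1, 1, 1, 1, 1, 1, 1, 0, 1, 0,
   1, 1, 1, 1, 1, 1, 0, 1, 0, 0,
   1, 1, 1, 1, 1, 1, 0, 1, 0, 0,
   1, 1, 1, 1, 1, 1, 0, 1, 0, 0,
   1, 1, 1, 1, 1, 0, 1, 0, 0, 0,
   1, 1, 1, 1, 0, 1, 0, 0, 0, 0,
   1, 0, 1, 0, 1, 0, 0, 0, 0, 0,
   0, 1, 0, 1, 0, 0, 0, 0, 0, 0
), nrow = 9, ncol = 10, byrow = TRUE,
dimnames = list(as.character(1:9), paste0("X", 1:10)))

dim(input)
```

The starting values in x appear to be the row-wise logits of this matrix, which can be checked with qlogis(rowMeans(input, na.rm = TRUE)).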

Comparing list elements in R

I have created two lists: list1, which has 4 elements of 4 bits each, and list2, which has 1 element of 4 bits. I want to compare them and, if any element of list1 is the same as the single element of list2, delete that element from list1. I've implemented the following code, but I am not getting the correct result.
list1<-c()
n<-4
#Creating list1 with 4 vectors having 4 bits each
for(i in 1:5)
{
rndno<-round(runif(1, 1, 2^n -1),0)
bn<-bin(rndno)
pad<-rep.int(0,n-length(bn))
bn<-c(pad,bn)
list1<-rbind(list1,bn)
}
list2<-c()
rndno<-round(runif(1, 1, 2^n -1),0)
bn<-bin(rndno)
pad<-rep.int(0,n-length(bn))
bn<-c(pad,bn)
list2<-rbind(list2,bn)
for(i in 1:nrow(k))
{
if(list2[1,] == list2[i,])
{
print(i)
}
}
Please Help.
The problem is that you didn't define the bin function in your example, so we can only guess what you would like to achieve. But I think you would like to do something like this:
bin <- function(x) {
i <- 0
string <- numeric(32)
while(x > 0) {
string[32 - i] <- x %% 2
x <- x %/% 2
i <- i + 1
}
first <- match(1, string)
string[first:32]
}
This function converts a positive decimal number into a vector of binary digits.
It is also useful to set the random number generator to a given value
in order to make your script reproducible:
set.seed(1)
Now, this is your code below, although it may not be the most efficient way to create your data. Note also that you named your objects lists, but actually, you're dealing with matrices here.
list1<-c()
n<-4
#Creating list1 with 5 vectors (the loop runs 1:5) having 4 bits each
for(i in 1:5)
{
rndno<-round(runif(1, 1, 2^n -1),0)
bn<- bin(rndno)
pad<-rep.int(0,n-length(bn))
bn<-c(pad,bn)
list1<-rbind(list1,bn)
}
# [,1] [,2] [,3] [,4]
# bn 0 1 0 1
# bn 0 1 1 0
# bn 1 0 0 1
# bn 1 1 1 0
# bn 0 1 0 0
list2<-c()
rndno<-round(runif(1, 1, 2^n -1),0)
bn<-bin(rndno)
pad<-rep.int(0,n-length(bn))
bn<-c(pad,bn)
list2<-rbind(list2,bn)
# [,1] [,2] [,3] [,4]
# bn 1 1 1 0
Now, in order to delete the values specified in list2 from list1 you compare the rows of the matrix to your target vector and you select only the rows where there is no complete match:
list1[apply(apply(list1, 1, function(x) x == list2), 2, function(x) any(x == FALSE)),]
# [,1] [,2] [,3] [,4]
# bn 0 1 0 1
# bn 0 1 1 0
# bn 1 0 0 1
# bn 0 1 0 0
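The nested apply can be hard to read. An equivalent single-pass sketch, with the two matrices hard-coded from the output shown above so it runs on its own (the names target, is_match and filtered are just for illustration):

```r
# Matrices as printed above (row names omitted).
list1 <- rbind(c(0, 1, 0, 1),
               c(0, 1, 1, 0),
               c(1, 0, 0, 1),
               c(1, 1, 1, 0),
               c(0, 1, 0, 0))
list2 <- rbind(c(1, 1, 1, 0))

target <- list2[1, ]
# TRUE for each row of list1 that equals the target in every position
is_match <- apply(list1, 1, function(row) all(row == target))
# drop = FALSE keeps a matrix even if only one row survives
filtered <- list1[!is_match, , drop = FALSE]
filtered
```

Only the fourth row matches list2 here, so filtered keeps the other four rows.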

R, Create K-nearest neighbors weights in a Matrix

I have a 2-column data frame corresponding to X and Y cartesian coordinates of a sample of 500 georeferenced observations.
I want to generate a weight matrix W where each element W[i,j] is equal to:
* 1 if observation j is one of the k nearest neighbors of observation i, and
* 0 otherwise.
Suppose we have this data frame:
df=as.data.frame(cbind(x=rnorm(500), y=rnorm(500)))
Suppose k = 20. How can I create this matrix in R?
Using CRAN's FastKNN package... Let's say you have your distance matrix of 5 * 5 as follows:
library(FastKNN)
df <- as.data.frame(cbind(x = rnorm(5), y=rnorm(5)))
dist_mat <- as.matrix(dist(df, method = "euclidean", upper = TRUE, diag=TRUE))
## Let's say k = 2...
k <- 2
nrst <- lapply(1:nrow(dist_mat), function(i) k.nearest.neighbors(i, dist_mat, k = k))
## Build w
w <- matrix(0, nrow = nrow(dist_mat), ncol = ncol(dist_mat)) ## start with all zeros
for(i in 1:length(nrst)) for(j in nrst[[i]]) w[i,j] = 1
So my df looked like this:
> df
x y
1 -0.2109351 -0.315256132
2 0.5172415 0.003352551
3 1.5700413 -0.737475081
4 -0.2699282 -0.198414683
5 1.3997493 -0.241382737
And my w ended up looking like this:
> w
[,1] [,2] [,3] [,4] [,5]
[1,] 0 1 0 1 0
[2,] 1 0 0 1 0
[3,] 0 1 0 0 1
[4,] 1 1 0 0 0
[5,] 0 1 1 0 0
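If you would rather not depend on a package, the same kind of matrix can be built with base R. This is a sketch under the assumptions that Euclidean distance is wanted and that a point does not count as its own neighbor; ties are broken by whatever order() returns first. The seed is arbitrary:

```r
set.seed(42)
df <- data.frame(x = rnorm(500), y = rnorm(500))
k <- 20

d <- as.matrix(dist(df))   # full 500 x 500 Euclidean distance matrix
diag(d) <- Inf             # exclude each point from its own neighbor list

w <- matrix(0L, nrow(d), ncol(d))
for (i in seq_len(nrow(d))) {
  # indices of the k smallest distances in row i
  w[i, order(d[i, ])[seq_len(k)]] <- 1L
}

stopifnot(all(rowSums(w) == k), all(diag(w) == 0))
```

Note that w is generally not symmetric: j can be among the k nearest neighbors of i without i being among those of j.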

How to fill off-diagonals and ignore diagonals in matrix in R?

I am trying to fill a matrix in R where the final result will ignore the diagonal entries and the values will be filled in around the diagonal. A simple example of what I mean is, if I take a simple 3x3 matrix like the one shown below:
ab <- c(1:9)
mat <- matrix(ab,nrow=3,ncol=3)
colnames(mat)<- paste0("x", 1:3)
rownames(mat)<- paste0("y", 1:3)
mat
x1 x2 x3
y1 1 4 7
y2 2 5 8
y3 3 6 9
What I want to achieve is to fill the diagonals with 0 and shift all the other values around the diagonal. So, for example if I just use diag(mat)<-0 that results in this:
x1 x2 x3
y1 0 4 7
y2 2 0 8
y3 3 6 0
Whereas, the result I'm looking for is something like this (where the values get wrapped around the diagonal):
x1 x2 x3
y1 0 3 5
y2 1 0 6
y3 2 4 0
I'm not worried about the values that are pushed out of the matrix (i.e., 7,8,9).
Any suggestions?
Thanks
EDIT: The upvoted solution below seems to have solved the problem.
One solution that works for your example is to first declare a matrix full of ones except on the diagonal:
M <- 1 - diag(3)
And then to replace all the ones by the desired off-diagonal values
M[M == 1] <- 1:6
M
# [,1] [,2] [,3]
# [1,] 0 3 5
# [2,] 1 0 6
# [3,] 2 4 0
A more complicated scenario (e.g. diagonal coefficients that are not 0, or an unknown number of off-diagonal elements) might need a little bit of additional work.
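One way that additional work could look, as a sketch: build a logical mask of the off-diagonal cells with row() and col() and assign into it, which works for any square size and any diagonal value. The function name fill_off_diagonal and its arguments are hypothetical:

```r
# Fill the off-diagonal cells of an n x n matrix column by column,
# leaving diag_value on the diagonal. Extra values are ignored; if there
# are too few values, the remaining cells become NA.
fill_off_diagonal <- function(values, n, diag_value = 0) {
  m <- matrix(diag_value, nrow = n, ncol = n)
  off <- row(m) != col(m)              # TRUE everywhere except the diagonal
  m[off] <- values[seq_len(sum(off))]  # column-major fill order
  m
}

fill_off_diagonal(1:9, 3)
```

With 1:9 and n = 3 this gives the matrix requested in the question, dropping 7, 8 and 9.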
You may need a loop:
n <- 9
seqs <- seq_len(n)
mats <- matrix(0, nrow = 3, ncol = 3)
ind <- 0
for(i in 1:nrow(mats)){
for(j in 1:nrow(mats)){
if(i == j) {
mats[i,j] <- 0 }
else {
ind <- ind + 1
mats[j,i] <- seqs[ind]
}
}
}
Resulting in:
>mats
[,1] [,2] [,3]
[1,] 0 3 5
[2,] 1 0 6
[3,] 2 4 0
This works for your example. I'm not sure both n1 and n2 are needed; they could be collapsed into one value if the matrix is always square.
# original data
ab <- c(1:9)
n1 <- 3
n2 <- 3
# You could add the 0's to the diagonal, by adding a 0 before every n1 split
# of the data e.g. 0,1,2,3 & 0,4,5,6 & 0,7,8,9
split_ab <- split(ab, ceiling((1:length(ab))/n1))
update_split_ab <- lapply(split_ab, function(x){
c(0, x)
})
new_ab <- unlist(update_split_ab)
mat <- matrix(new_ab, nrow=n1, ncol=n2)
colnames(mat)<- paste0("x", 1:n2)
rownames(mat)<- paste0("y", 1:n1)
mat
# turn this in to a function
makeShiftedMatrix <- function(ab=1:9, n1=3, n2=3){
split_ab <- split(ab, ceiling((1:length(ab))/n1))
update_split_ab <- lapply(split_ab, function(x){
c(0, x)
})
new_ab <- unlist(update_split_ab)
mat <- matrix(new_ab, nrow=n1, ncol=n2)
colnames(mat)<- paste0("x", 1:n2)
rownames(mat)<- paste0("y", 1:n1)
mat
return(mat)
}
# default
makeShiftedMatrix()
# to read in original matrix and shift:
old_mat <- matrix(ab, nrow=n1, ncol=n2)
makeShiftedMatrix(ab=unlist(old_mat))

Resources