Comparing list elements in R - r

I have created two lists: the first, list1, has 4 elements of 4 bits each, and the second, list2, has 1 element of 4 bits. I want to compare the lists and, if any element in list1 is the same as the only element in list2, delete that element from list1. I've implemented the following code but am not getting the correct result.
list1<-c()
n<-4
#Creating list1 with 4 vectors having 4 bits each
for(i in 1:5)
{
rndno<-round(runif(1, 1, 2^n -1),0)
bn<-bin(rndno)
pad<-rep.int(0,n-length(bn))
bn<-c(pad,bn)
list1<-rbind(list1,bn)
}
list2<-c()
rndno<-round(runif(1, 1, 2^n -1),0)
bn<-bin(rndno)
pad<-rep.int(0,n-length(bn))
bn<-c(pad,bn)
list2<-rbind(list2,bn)
for(i in 1:nrow(k))
{
if(list2[1,] == list2[i,])
{
print(i)
}
}
Please Help.

The problem is that you didn't define the bin function in your example, so we can only guess what you would like to achieve. But I think you would like to do something like this:
bin <- function(x) {
i <- 0
string <- numeric(32)
while(x > 0) {
string[32 - i] <- x %% 2
x <- x %/% 2
i <- i + 1
}
first <- match(1, string)
string[first:32]
}
This function converts a decimal number into a vector of binary digits.
It is also useful to set the seed of the random number generator to a fixed value
in order to make your script reproducible:
set.seed(1)
Now, this is your code below, although it may not be the most efficient way to create your data. Note also that you named your objects lists, but actually, you're dealing with matrices here.
list1<-c()
n<-4
#Creating list1 with 4 vectors having 4 bits each
for(i in 1:5)
{
rndno<-round(runif(1, 1, 2^n -1),0)
bn<- bin(rndno)
pad<-rep.int(0,n-length(bn))
bn<-c(pad,bn)
list1<-rbind(list1,bn)
}
#    [,1] [,2] [,3] [,4]
# bn    0    1    0    1
# bn    0    1    1    0
# bn    1    0    0    1
# bn    1    1    1    0
# bn    0    1    0    0
list2<-c()
rndno<-round(runif(1, 1, 2^n -1),0)
bn<-bin(rndno)
pad<-rep.int(0,n-length(bn))
bn<-c(pad,bn)
list2<-rbind(list2,bn)
#    [,1] [,2] [,3] [,4]
# bn    1    1    1    0
Now, in order to delete the values specified in list2 from list1 you compare the rows of the matrix to your target vector and you select only the rows where there is no complete match:
list1[apply(apply(list1, 1, function(x) x == list2), 2, function(x) any(x == FALSE)),]
#    [,1] [,2] [,3] [,4]
# bn    0    1    0    1
# bn    0    1    1    0
# bn    1    0    0    1
# bn    0    1    0    0
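The loop in the question most likely fails because if() expects a single TRUE or FALSE, while comparing two rows with == returns one value per bit. A minimal sketch of the same row filter, written with all() so that each row yields exactly one logical value, is given below. The matrices are typed in by hand from the output above, and the names target, is_match and result are only illustrative.

```r
# Rows of list1 and list2 copied from the printed output above
list1 <- rbind(c(0, 1, 0, 1),
               c(0, 1, 1, 0),
               c(1, 0, 0, 1),
               c(1, 1, 1, 0),
               c(0, 1, 0, 0))
list2 <- rbind(c(1, 1, 1, 0))

# The single row we want to remove from list1
target <- list2[1, ]

# One TRUE/FALSE per row: TRUE only if every bit matches the target
is_match <- apply(list1, 1, function(row) all(row == target))

# Keep the non-matching rows; drop = FALSE keeps the result a matrix
result <- list1[!is_match, , drop = FALSE]
result
```

Using drop = FALSE matters when only one row survives, because otherwise R would silently return a plain vector instead of a one-row matrix.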

Related

Distribute a sum randomly across columns

Suppose I have a vector of length n. Let's say n=3, so the vector is v=c(3,5,4).
I have a matrix of zeros with n rows and m columns. Let's say m=5
mymatrix <- matrix(rep(0, 3*5), nrow=3)
What I want to do is randomly distribute the values of v across columns, for each row. So in this example, the first row would sum to 3, the second row would sum to 5, and the third row would sum to 4. E.g. this would be one possible random assignment:
0 1 2 0 0
3 1 0 0 1
0 2 2 0 0
The sums of the rows are 3,5,4, which are the values of v.
How can I accomplish this? My thought was to start with
sapply(v, function(i){sample(1:m, i, replace=TRUE)})
and go from there but this gives me a list, since each result is of a different length, and I'm not sure how to proceed from there.
EDIT: the intention is no negative numbers, so 0 9 1 1 -8 summing to 3 would not be a valid row.
Using purrr:
library(tibble)
library(dplyr)
library(purrr)
n <- 3
m <- 5
v <- c(3,5,4)
y <- sapply(v, function(i){sample(1:m, i, replace=TRUE)})
do_it <- function(x) {
tmp <- tibble(
index = x,
cnt = 1
) %>%
group_by(index) %>%
summarise(cnt = sum(cnt))
out <- rep(0, m)
out[tmp$index] <- tmp$cnt
return(out)
}
y %>%
map(~do_it(.x)) %>%
unlist() %>%
matrix(nrow = 3, byrow = TRUE)
Maybe parts() from the partitions package is the tool you need for your objective, i.e.,
library(partitions)
# define your custom function `f` to generate random combinations and positions, with row sum subject to the given value
f <- function(k) replace(rep(0,ncol(mymatrix)),
sample(ncol(mymatrix),k),
as.data.frame.matrix(parts(k))[sample(k),sample(k,1)])
such that
set.seed(1)
M <- t(sapply(v, f))
> M
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    3    0    0
[2,]    1    0    0    4    0
[3,]    0    4    0    0    0
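The sample() idea from the question can also be finished in base R. The sketch below is not taken from either answer: it draws k column indices with replacement for each row and counts how often each column was drawn with tabulate(), so every row sums to its value in v and no entry is negative. It assumes v holds non-negative whole numbers.

```r
set.seed(1)          # only to make the random draw reproducible
v <- c(3, 5, 4)      # required row sums
m <- 5               # number of columns

# For each k in v: draw k column indices from 1..m with replacement,
# then count the draws per column (nbins = m keeps trailing zeros)
M <- t(sapply(v, function(k) tabulate(sample(m, k, replace = TRUE), nbins = m)))

M
rowSums(M)           # should reproduce v
```

Each unit of a row sum is assigned to a column independently and uniformly, which is one reasonable reading of "randomly distribute"; other distributions over the valid rows are possible.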

Generate matrices using positive integer solutions of the indefinite equation

I asked a question similar to this one previously, but this one is a little more tricky. I have a matrix (say A) of POSITIVE INTEGER solutions (previously NON-NEGATIVE solutions) to the indefinite equation x1+x2+x3 = 8. I also have another matrix (say B) with columns
0 1 0 1
0 0 1 1
I want to generate matrices using rows of A and the columns of B.
For example, let (2,2,4) be one solution (one row) of the matrix A. In this case, I cannot simply use rep. So I tried to generate all the three-column matrices from matrix B and then apply rep, but couldn't figure that out. I use the following lines to generate a list of all three-column matrices.
cols <- combn(ncol(B), 3, simplify=F, FUN=as.numeric)
M3 <- lapply(cols, function(x) cbind(B[,x]))
For example, cols[[1]]
[1] 1 2 3
Then, the columns of my new matrix would be
0 0 1 1 0 0 0 0
0 0 0 0 1 1 1 1
The columns of this new matrix are repeats of the columns of B, i.e., the first column 2 times, the second column 2 times and the third column 4 times. I want to apply this procedure to all the rows of matrix A. How do I do this?
The help page ?rep says about rep(x, times):
if times is a vector of the same length as x (after replication by
each), the result consists of x[1] repeated times[1] times, x[2]
repeated times[2] times and so on.
The basic idea is:
B <- matrix(c(0, 1, 0, 1, 0, 0, 1, 1), byrow = T, nrow = 2)
cols <- combn(ncol(B), 3, simplify=F, FUN=as.numeric)
a1 <- c(2, 2, 4)
cols[[1]] # [1] 1 2 3
rep(cols[[1]], a1) # [1] 1 1 2 2 3 3 3 3
B[, rep(cols[[1]], a1)]
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
# [1,]    0    0    1    1    0    0    0    0
# [2,]    0    0    0    0    1    1    1    1
testA <- rbind(c(2,2,4), c(2,1,5), c(2,3,3))
## apply(..., lapply(...)) approach (output is list in list)
apply(testA, 1, function(x) lapply(cols, function(y) B[, rep(y, x)]))
## other approach using combination of indices
ind <- expand.grid(ind_cols = 1:length(cols), ind_A = 1:nrow(testA))
col_ind <- apply(ind, 1, function(x) rep(cols[[x[1]]], testA[x[2],]))
lapply(1:ncol(col_ind), function(x) B[, col_ind[,x]]) # output is list
library(dplyr)
apply(col_ind, 2, function(x) t(B[, x])) %>% matrix(ncol = 8, byrow=T) # output is matrix

All binary sequences in R

I'm trying to write a recursive function that takes as input an integer n and returns a matrix that contains all binary sequences of length n.
I wrote this code, but it does not give any output:
binseq <- function(n){
binsequ <- matrix(nrow = length(n), ncol = n)
r <- 0 # current row of binseq
for (i in 0:n) {
for (j in 0:n) {
for (k in 0:n) {
r <- r + 1
return (binsequ[r,] <- c(i, j, k))
}
}
}
}
I tried to run it using n=3
binseq(3)
But with no success.
However, when I do not use the function command and give specific numbers, it works. For example,
binseq <- matrix(nrow = 8, ncol = 3)
r <- 0 # current row of binseq
for (i in 0:1) {
for (j in 0:1) {
for (k in 0:1) {
r <- r + 1
binseq[r,] <- c(i, j, k)
}
}
}
binseq
the output is:
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    1
[3,]    0    1    0
[4,]    0    1    1
[5,]    1    0    0
[6,]    1    0    1
[7,]    1    1    0
[8,]    1    1    1
I wrote a function for this. I was thinking of doing it recursively, but it turned out to use a loop. I landed here hoping to see how others did this, as it sounds like quite a basic question. In case it helps, here is my function.
##k: number of digits, k>1, eg k=3 for 101 etc
binary_gen <- function(k) {
base <- c(0L,1L)
##initialize
bi.set <- integer(2^(k))
bi.set[1:2] <- base
##create set through loop
for (i in 2:k) {
bi.set[(2^(i-1)+1):2^i] <- 10^(i-1)+bi.set[1:2^(i-1)]
bi.set
}
return(bi.set)
}
Here is output for k=4.
> binary_gen(4L)
[1] 0 1 10 11 100 101 110 111 1000 1001 1010 1011 1100 1101 1110 1111
You can manipulate the output vector into a matrix of the desired format (# of rows=k, # of columns=2^k). Based on k, the binary numbers with exactly k digits are located at positions 2^(k-1)+1 to 2^k.
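Since the question asked for a recursive function that returns a matrix, here is a minimal recursive sketch. It is not from the answer above, and the name binseq_rec is hypothetical. The idea is that all sequences of length n are the sequences of length n - 1 with a 0 in front, followed by the same sequences with a 1 in front. It assumes n is a whole number of at least 1.

```r
# All binary sequences of length n as a 2^n x n integer matrix
binseq_rec <- function(n) {
  # Base case: the two sequences of length 1
  if (n == 1) {
    return(matrix(c(0L, 1L), ncol = 1))
  }
  # Recursive case: prefix every shorter sequence with 0, then with 1
  shorter <- binseq_rec(n - 1)
  rbind(cbind(0L, shorter), cbind(1L, shorter))
}

binseq_rec(3)
```

For n = 3 this gives the same eight rows, in the same order, as the nested loops in the question. The number of rows doubles with every level, so the function is only practical for small n.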

Merge all possible combinations of multiple data frames

I would like to merge by columns all the possible pair combinations of these three data frames (i.e. nine combinations)
frame1 = data.frame(a=c(1,2,3), b=c(1,2,3), c=c(1,2,3))
frame2 = data.frame(a=c(2,1,3), b=c(2,1,3), c=c(2,1,3))
frame3 = data.frame(a=c(3,2,1), b=c(3,2,1), c=c(3,2,1))
Each contains the same 3 rows, but not in the same order, so I would also like the merge to match rows on the pair of values in columns a and b of the two frames being merged. Example:
a b c
1 1 1
2 2 2
3 3 3
+
a b c
2 2 2
1 1 1
3 3 3
=
a.x b.x c.x a.y b.y c.y
1 1 1 1 1 1
2 2 2 2 2 2
3 3 3 3 3 3
I then want to obtain the absolute difference between each pair of values in columns c.x and c.y of each merged frame, and sum all these differences to obtain a "score" (of course this would be zero in this example). I would like to add this score to an empty 3x3 matrix in the corresponding cell (i.e., the score of frame1 vs. frame2 should be located in cell [2,1], etc.):
nframes = 3
frames = c(frame1,frame2,frame3)
matrix = matrix(, nrow = nframes, ncol = nframes)
matrix_scores = data.frame(matrix)
for (i in frames){
for (j in frames)
{
x = merge(i, j, by=c("a","b"))
score = sum(abs(x$c.x - x$c.y))
matrix_scores[j,i] <- score
}
}
However, when I run the loop I obtain the following message:
Error in fix.by(by.x, x) : 'by' must specify uniquely valid columns
Also, I understand that the line
matrix_scores[j,i] <- score
will give an error, too, but I do not know how to express that I want the score to be stored in cell [1,1], for the first iteration of the loop (frame1 vs. frame1).
The resulting matrix should be a 3x3 matrix containing all zeros:
f1 f2 f3
frame1 0 0 0
frame2 0 0 0
frame3 0 0 0
You can do:
# Put all frames in a list
d <- list(frame1, frame2, frame3)
# get all merge-combinations
gr <- expand.grid(1:length(d), 1:length(d))
# function to merge and get the sum diff:
foo <- function(i, x, gr){
tmp <- merge(x[[gr[i, 1]]], x[[gr[i, 2]]], by=c("a", "b"))
sum(abs(tmp$c.x - tmp$c.y))
}
# result matrix
matrix(sapply(1:nrow(gr), foo, d, gr), length(d), length(d), byrow = T)
     [,1] [,2] [,3]
[1,]    0    0    0
[2,]    0    0    0
[3,]    0    0    0
# The scores are arranged as follows:
matrix(apply(gr, 1, paste, collapse="_"), 3, 3, byrow = T)
     [,1]  [,2]  [,3]
[1,] "1_1" "2_1" "3_1"
[2,] "1_2" "2_2" "3_2"
[3,] "1_3" "2_3" "3_3"
# alternative using apply:
# function to merge and get the sum diff:
foo <- function(y, x){
tmp <- merge(x[[ y[1] ]], x[[ y[2] ]], by=c("a", "b"))
sum(abs(tmp$c.x - tmp$c.y))
}
# result matrix
matrix(apply(gr, 1, foo, d), length(d), length(d), byrow = T)
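The error in the question most likely comes from c(frame1, frame2, frame3): calling c() on data frames flattens them into one list of nine column vectors, so the loop hands plain vectors to merge(). Storing the frames in a list() avoids this. The sketch below is a variation on the answer, not part of it. It uses outer() with a vectorised helper so that cell [i, j] holds the score of frame i against frame j; the name score is only illustrative.

```r
frame1 <- data.frame(a = c(1, 2, 3), b = c(1, 2, 3), c = c(1, 2, 3))
frame2 <- data.frame(a = c(2, 1, 3), b = c(2, 1, 3), c = c(2, 1, 3))
frame3 <- data.frame(a = c(3, 2, 1), b = c(3, 2, 1), c = c(3, 2, 1))

# A named list keeps each data frame intact (unlike c())
frames <- list(frame1 = frame1, frame2 = frame2, frame3 = frame3)

# Score for one pair of frames, addressed by their positions in the list
score <- function(i, j) {
  merged <- merge(frames[[i]], frames[[j]], by = c("a", "b"))
  sum(abs(merged$c.x - merged$c.y))
}

# outer() passes whole index vectors, so the helper must be vectorised
idx <- seq_along(frames)
matrix_scores <- outer(idx, idx, Vectorize(score))
dimnames(matrix_scores) <- list(names(frames), names(frames))
matrix_scores
```

Looping over indices rather than over the frames themselves is also what makes it possible to write into a specific cell, which was the second problem raised in the question.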

R, Create K-nearest neighbors weights in a Matrix

I have a 2-column data frame corresponding to X and Y cartesian coordinates of a sample of 500 georeferenced observations.
I want to generate a weight matrix W where each element is equal to:
* 1 if observation j is one of the k nearest neighbors of observation i, and
* 0 otherwise.
Suppose we have this data frame:
df=as.data.frame(cbind(x=rnorm(500), y=rnorm(500)))
And let's suppose k = 20. How can I create this matrix in R?
Using CRAN's FastKNN package... Let's say you have your distance matrix of 5 * 5 as follows:
library(FastKNN)
df <- as.data.frame(cbind(x = rnorm(5), y=rnorm(5)))
dist_mat <- as.matrix(dist(df, method = "euclidean", upper = TRUE, diag=TRUE))
## Let's say k = 2...
k <- 2
nrst <- lapply(1:nrow(dist_mat), function(i) k.nearest.neighbors(i, dist_mat, k = k))
## Build w
w <- matrix(0, nrow = nrow(dist_mat), ncol = ncol(dist_mat)) ## start with all zeros
for(i in 1:length(nrst)) for(j in nrst[[i]]) w[i,j] = 1
So my df looked like this:
> df
           x            y
1 -0.2109351 -0.315256132
2  0.5172415  0.003352551
3  1.5700413 -0.737475081
4 -0.2699282 -0.198414683
5  1.3997493 -0.241382737
And my w ended up looking like this:
> w
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    1    0    1    0
[2,]    1    0    0    1    0
[3,]    0    1    0    0    1
[4,]    1    1    0    0    0
[5,]    0    1    1    0    0
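The same matrix can be built without an extra package. The following is a minimal base R sketch, not the FastKNN approach above, using the sizes from the question (500 points, k = 20). For each observation it ranks the distances to all other points with order() and marks the k closest ones. Ties are broken by position, which is an assumption of this sketch.

```r
set.seed(42)                         # only for reproducibility
pts <- data.frame(x = rnorm(500), y = rnorm(500))
k <- 20

# Full symmetric matrix of Euclidean distances
dist_mat <- as.matrix(dist(pts))
n <- nrow(dist_mat)

w <- matrix(0, nrow = n, ncol = n)
for (i in seq_len(n)) {
  d <- dist_mat[i, ]
  d[i] <- Inf                        # a point is not its own neighbour
  nearest <- order(d)[seq_len(k)]    # indices of the k smallest distances
  w[i, nearest] <- 1
}

table(rowSums(w))                    # every row should contain exactly k ones
```

As in the 5 x 5 example, w is generally not symmetric: j can be among the k nearest neighbours of i without i being among those of j.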
