Fill a vector with integers subject to constraints - R

I'm using R and
I want to fill a vector of length 8 with integers from 1 to 4, subject to the conditions below:
vector[i] <= vector[i+1]
all integers should be present
Example:
1 1 1 1 2 2 3 4 may be a solution
1 2 1 1 2 3 3 4 isn't a solution to my problem
I am also wondering if there is a way to list all solutions.

To get all solutions, reserve four slots for the numbers 1:4 (since every number must appear at least once), and consider all possible length-4 sequences of 1:4 to fill the remaining slots. Sorting and removing duplicates leaves you with 35 non-decreasing sequences:
# The sequences will be the rows of a matrix. First, the 'reserved' slots:
reserved = matrix(1:4, 256, 4, byrow=TRUE)
# Add all combinations of 1:4 to fill the remaining four slots:
result = cbind(reserved,
unname(as.matrix(expand.grid(1:4, 1:4, 1:4, 1:4))) )
# Now simply sort and de-duplicate along rows:
result = t(apply(result, 1, sort))
result = unique(result)
> head(result)
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
# [1,] 1 1 1 1 1 2 3 4
# [2,] 1 1 1 1 2 2 3 4
# [3,] 1 1 1 1 2 3 3 4
# [4,] 1 1 1 1 2 3 4 4
# [5,] 1 1 1 2 2 2 3 4
# [6,] 1 1 1 2 2 3 3 4
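The same 35 vectors can also be enumerated directly, without sorting and de-duplicating 256 rows. This is only a sketch of an alternative, not part of the answer above: a non-decreasing vector of length 8 that uses every value in 1:4 is fixed by choosing the 3 gaps (out of the 7 between neighbouring positions) where the value steps up by one. The names n, k, jumps and solutions are made up for this illustration.

```r
n <- 8  # vector length
k <- 4  # values 1..k must all appear

# Each column of `jumps` holds the k - 1 gaps after which the value increases
jumps <- combn(n - 1, k - 1)

solutions <- t(apply(jumps, 2, function(j) {
  steps <- integer(n)   # 0 means "same value as the previous position"
  steps[j + 1] <- 1L    # step up by one right after each chosen gap
  1L + cumsum(steps)    # starts at 1 and ends at k
}))

nrow(solutions)
# [1] 35

# One random solution
solutions[sample(nrow(solutions), 1), ]
```

The row count is choose(7, 3) = 35, which agrees with the number of rows left after unique() above.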

Related

Quickly calculate the number of shared neighbor between any pair of nodes from an adjacency matrix

I'm wondering if there's a way to quickly calculate the number of shared neighbors between any pairs of nodes (i.e., the number of nodes connected to both node i and j) from an adjacency matrix (like the one below) and then return the output in matrix format?
I've referenced these following posts,
Find common neighbors of selected vertices
Counting the number of neighbors in common between two vertices in R
Calculate number of common neighbors in igraph R
but can't seem to find many clues to do what I want to achieve. I was told that this can be directly calculated as adjm'*adjm, but I am not sure if this makes sense. It will be really appreciated if someone could shed some light on this.
# create an adj. matrix
adjm <- matrix(sample(0:1, 100, replace=TRUE, prob=c(0.6,0.4)), nc=10)
# set diagonal element to 0
diag(adjm) <- 0
# making it symmetric
adjm[lower.tri(adjm)] = t(adjm)[lower.tri(adjm)]
Yes, you can get the number of shared neighbors by computing the matrix
product of adjm' and adjm. Since you are using R, adjm'*adjm means
the component-wise product of the matrices. We want the matrix product,
so you need to use %*%. I will use that below.
To simplify the notation, I will denote adjm = A where
A[i,j] is 1 if there is a link between nodes i and j (they are neighbors)
and A[i,j] = 0 otherwise.
Let's compute t(A) %*% A.
The i-jth coordinate of t(A) %*% A is
(t(A) %*% A)[i,j] =
sum(t(A)[i,k] * A[k,j]) =
sum(A[k,i] * A[k,j])
All of the products in the sum are either 0 or 1.
If both A[k,i]=1 and A[k,j]=1, the product is 1,
otherwise it is zero. So (t(A)%*%A)[i,j] is equal to the
number of different k's for which both A[k,i]=1 and
A[k,j]=1. But A[k,i]=1 means k is a neighbor of i
and A[k,j]=1 means k is a neighbor of j, so
(t(A)%*%A)[i,j] is equal to the number of different k's
for which k is a neighbor of both i and j.
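As a quick sanity check of this argument, here is a small sketch on a hand-made graph (a hypothetical toy example with edges 1-2, 2-3 and 2-4, not the data from the question). It compares the matrix product with a brute-force count of common neighbors.

```r
# Toy undirected graph: edges 1-2, 2-3, 2-4
A <- matrix(0, 4, 4)
edges <- rbind(c(1, 2), c(2, 3), c(2, 4))
A[edges] <- 1            # i -> j
A[edges[, 2:1]] <- 1     # j -> i, making A symmetric

shared <- t(A) %*% A

# Brute force: count the k with A[k, i] = 1 and A[k, j] = 1
brute <- outer(1:4, 1:4, Vectorize(function(i, j) sum(A[, i] == 1 & A[, j] == 1)))

all(shared == brute)
# [1] TRUE
shared[1, 3]   # nodes 1 and 3 share exactly one neighbor (node 2)
# [1] 1
diag(shared)   # the diagonal holds each node's degree
# [1] 1 3 1 1
```

Note that the diagonal is not a count of shared neighbors between two different nodes; set it to 0 or ignore it if only pairs i != j matter.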
Let's try it out on your example. In order to make the results
reproducible, I set the random seed with set.seed().
library(igraph)
## For reproducibility
set.seed(1492)
# create an adj. matrix
adjm <- matrix(sample(0:1, 100, replace=TRUE, prob=c(0.6,0.4)), nc=10)
# set diagonal element to 0
diag(adjm) <- 0
# making it symmetric
adjm[lower.tri(adjm)] = t(adjm)[lower.tri(adjm)]
Shared = t(adjm) %*% adjm
g = graph_from_adjacency_matrix(adjm, mode = "undirected")
plot(g)
Notice, for example, that Shared[1,4] = 4.
That is because nodes 1 and 4 have four shared neighbors:
nodes 2, 3, 6 and 9. Shared[5,7] = 0 because nodes 5 and 7
have no neighbors in common.
I think @G5W's solution is super concise already, highly recommended!
Below is a graph theory way to solve it, maybe not that efficient:
> nb <- neighborhood(g,mindist = 1)
> outer(nb, nb, FUN = Vectorize(function(a, b) length(intersect(a, b))))
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 6 1 1 4 2 2 2 5 3 1
[2,] 1 3 2 0 1 3 0 1 2 1
[3,] 1 2 4 1 2 4 1 1 3 3
[4,] 4 0 1 4 2 1 1 3 2 1
[5,] 2 1 2 2 4 2 0 1 3 2
[6,] 2 3 4 1 2 5 1 2 3 3
[7,] 2 0 1 1 0 1 2 2 1 1
[8,] 5 1 1 3 1 2 2 5 3 1
[9,] 3 2 3 2 3 3 1 3 7 3
[10,] 1 1 3 1 2 3 1 1 3 4
> t(adjm) %*% adjm
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 6 1 1 4 2 2 2 5 3 1
[2,] 1 3 2 0 1 3 0 1 2 1
[3,] 1 2 4 1 2 4 1 1 3 3
[4,] 4 0 1 4 2 1 1 3 2 1
[5,] 2 1 2 2 4 2 0 1 3 2
[6,] 2 3 4 1 2 5 1 2 3 3
[7,] 2 0 1 1 0 1 2 2 1 1
[8,] 5 1 1 3 1 2 2 5 3 1
[9,] 3 2 3 2 3 3 1 3 7 3
[10,] 1 1 3 1 2 3 1 1 3 4
If the graph is undirected, then the matrix of the number of shared neighbours is A.A, where I used . to denote the matrix product. As others note, the correct operator in R is %*%. Transposition is not necessary, since in the undirected case, the adjacency matrix is symmetric: A' = A.
If the graph is directed, let's assume that A_{ij} denotes the number of connections from i to j (this is igraph's interpretation).
Then the i,j element of A'.A gives the number of nodes that point to both i and j (i <- v -> j), and A.A' gives the number of nodes that both i and j point to (i -> v <- j). These quantities are sometimes called cocitation and bibliographic coupling, respectively, and can be computed using cocitation() and bibcoupling() in igraph. In the undirected case, both of these functions return the same result, so you may use either.
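For the directed case, the two products can be sketched in base R with crossprod() and tcrossprod(), which compute t(A) %*% A and A %*% t(A) respectively. The small graph below (edges 1->2, 1->3, 2->3, 4->3) is a hypothetical example, and cocit and coupl are names chosen here for illustration.

```r
# Directed toy graph: A[i, j] = 1 means an edge from i to j
A <- matrix(0, 4, 4)
A[rbind(c(1, 2), c(1, 3), c(2, 3), c(4, 3))] <- 1

cocit <- crossprod(A)    # t(A) %*% A: nodes pointing to both i and j
coupl <- tcrossprod(A)   # A %*% t(A): nodes that both i and j point to

cocit[2, 3]   # node 1 points to both 2 and 3
# [1] 1
coupl[1, 2]   # nodes 1 and 2 both point to node 3
# [1] 1
coupl[1, 4]   # nodes 1 and 4 both point to node 3
# [1] 1
```

In these raw products the diagonals hold the in-degrees (cocit) and out-degrees (coupl), so treat them separately if you only want counts for pairs of distinct nodes.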

sample with replacement but constrain the max frequency of each member to be drawn

Is it possible to extend the sample function in R to not return more than say 2 of the same element when replace = TRUE?
Suppose I have a list:
l = c(1,1,2,3,4,5)
To sample 3 elements with replacement, I would do:
sample(l, 3, replace = TRUE)
Is there a way to constrain its output so that only a maximum of 2 of the same elements are returned? So (1,1,2) or (1,3,3) is allowed, but (1,1,1) or (3,3,3) is excluded?
set.seed(0)
The basic idea is to convert sampling with replacement to sampling without replacement.
ll <- unique(l) ## unique values
#[1] 1 2 3 4 5
pool <- rep.int(ll, 2) ## replicate each unique so they each appear twice
#[1] 1 2 3 4 5 1 2 3 4 5
sample(pool, 3) ## draw 3 samples without replacement
#[1] 4 3 5
## replicate it a few times
## each column is a sample, after the "simplification" done by `replicate`
replicate(5, sample(pool, 3))
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1 4 2 2 3
#[2,] 4 5 1 2 5
#[3,] 2 1 2 4 1
If you want different values to appear up to different numbers of times, you can do, for example:
pool <- rep.int(ll, c(2, 3, 3, 4, 1))
#[1] 1 1 2 2 2 3 3 3 4 4 4 4 5
## draw 9 samples; replicate 5 times
oo <- replicate(5, sample(pool, 9))
# [,1] [,2] [,3] [,4] [,5]
# [1,] 5 1 4 3 2
# [2,] 2 2 4 4 1
# [3,] 4 4 1 1 1
# [4,] 4 2 3 2 5
# [5,] 1 4 2 5 2
# [6,] 3 4 3 3 3
# [7,] 1 4 2 2 2
# [8,] 4 1 4 3 3
# [9,] 3 3 2 2 4
We can call tabulate on each column to count the frequency of 1, 2, 3, 4, 5:
## set `nbins` in `tabulate` so frequency table of each column has the same length
apply(oo, 2L, tabulate, nbins = 5)
# [,1] [,2] [,3] [,4] [,5]
#[1,] 2 2 1 1 2
#[2,] 1 2 3 3 3
#[3,] 2 1 2 3 2
#[4,] 3 4 3 1 1
#[5,] 1 0 0 1 1
The counts in all columns meet the frequency upper bound c(2, 3, 3, 4, 1) we have set.
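The idea can be wrapped in a small helper. This is a sketch only: sample_capped and its arguments are hypothetical names, and, like the code above, it works on the unique values, so the duplicated 1 in l does not get any extra weight.

```r
# Draw `size` values from the unique values of `x`, each appearing at most
# `max_freq` times (a single cap, or one cap per unique value).
sample_capped <- function(x, size, max_freq = 2L) {
  vals <- unique(x)
  caps <- rep_len(max_freq, length(vals))
  pool <- rep.int(vals, caps)
  stopifnot(size <= length(pool))
  # sample.int on positions avoids the sample(x) surprise when `pool`
  # happens to be a single number
  pool[sample.int(length(pool), size)]
}

set.seed(1)
l <- c(1, 1, 2, 3, 4, 5)
draws <- replicate(1000, sample_capped(l, 3))

# Largest frequency of any value within a single draw
max(apply(draws, 2, function(s) max(table(s))))
```

Because every unique value occurs only max_freq times in the pool, no draw can contain more than max_freq copies of it, whatever the seed.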
Would you explain the difference between rep and rep.int?
rep.int is not the "integer" method for rep. It is just a faster, simplified version of rep with less functionality. You can get more details of rep, rep.int and rep_len from the doc page ?rep.
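A short illustration of the difference, using a made-up vector x:

```r
x <- c("a", "b", "c")

r1 <- rep.int(x, 2)            # whole vector repeated twice
r2 <- rep.int(x, c(2, 1, 3))   # one repeat count per element
r3 <- rep(x, each = 2)         # `each` exists only in rep()
r4 <- rep_len(x, 5)            # recycle to a fixed length

r1
# [1] "a" "b" "c" "a" "b" "c"
r2
# [1] "a" "a" "b" "c" "c" "c"
r3
# [1] "a" "a" "b" "b" "c" "c"
r4
# [1] "a" "b" "c" "a" "b"
```

rep.int() takes only x and times, and rep_len() only x and length.out; anything else (each, or combinations of arguments) needs rep().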

Forcing sample() in R to sample a given fraction of rows over all of k subarrays

My problem is as follows and I have considered many derivations:
I have an array, say with dimensions dims = c(10, 5, 2) - that is 10 rows, 5 columns and 2 subarrays.
I would like to be able to use the sample() function to sample a given proportion (say m) of rows in EACH subarray and move them BETWEEN subarrays.
So, say swap Row 5 subarray 1 with Row 10 of subarray 2 (see example below).
I asked a similar question
Moving rows between subarrays
and got some great help.
The solution is useful but restricted to random sampling of rows (meaning that not all subarrays will be sampled).
, , 1
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 4 3 4 4 3 4 5 2 4 4
[2,] 1 4 3 5 4 5 4 5 2 4
[3,] 1 5 2 1 1 2 1 4 5 1
[4,] 3 1 1 3 5 4 2 4 4 4
[5,] 3 2 5 1 2 2 5 5 4 3 <-- e.g., switch this row
[6,] 4 5 5 2 3 4 1 3 5 5
[7,] 5 5 5 5 1 4 3 1 2 5
[8,] 3 4 3 1 3 3 4 3 2 3
[9,] 1 1 3 2 4 4 1 4 2 3
[10,] 1 4 4 2 4 2 4 2 2 1
, , 2
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 5 5 1 1 5 2 1 4 3 1
[2,] 4 3 2 4 3 5 5 5 4 3
[3,] 2 4 1 1 4 2 2 2 3 4
[4,] 5 1 4 5 4 4 3 4 4 5
[5,] 1 5 5 4 3 3 5 2 2 2
[6,] 2 2 2 2 5 5 3 4 3 5
[7,] 5 2 1 1 2 5 3 4 4 2
[8,] 3 4 3 3 1 3 3 2 3 5
[9,] 2 1 4 4 3 2 4 5 5 2
[10,] 5 3 4 5 4 3 5 1 2 3 <-- with this row
In the above example, m = 0.10, that is 10% of the rows (1 row) in each subarray are sampled and then swapped.
Any ideas on how to force sample() to sample within ALL subarrays? Ideally, the number of rows in each subarray will be very large (10000 or more).
Though I have only included 2 subarrays, where a random row or rows swap(s) with a random row or rows in subarray 2 (dictated by m), I need a routine that is generalizable to k subarrays. So if k = 3, then sampling occurs within ALL subarrays and random rows are swapped with neighbouring subarrays.
So, a random row or rows in subarray 1 has an equal chance of moving to either subarray 2 or subarray 3 (it doesn't matter which subarray rows go to, so long as they are always moving between subarrays). Then, the corresponding row or rows from subarray 2 or 3 will go to subarray 1.
The number of rows must remain constant. For example, there cannot be 11 rows in subarray 1 and only 9 in subarray 2 -- it has to be 10 and 10.
I don't know of any packages that will do this. The goal here is to simulate movement of animals.
Any ideas are greatly appreciated.
This is verbose and could be cleaned up, but should get you there.
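set.seed(4)
# Example array: 10 rows, 10 columns, 2 subarrays
x <- array(sample(1:10, 200, replace = TRUE), dim = c(10, 10, 2))
nrows_x <- dim(x)[1]
proportion <- 0.10
# Pick the row indices to swap; the same indices are used in both subarrays
idx <- sample(1:nrows_x, nrows_x * proportion)
# Keep copies of the selected rows from each subarray
rows_dim_1 <- x[idx, , 1]
rows_dim_2 <- x[idx, , 2]
# Swap them between the two subarrays
x[idx, , 1] <- rows_dim_2
x[idx, , 2] <- rows_dim_1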
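One possible way to extend this to k subarrays, offered as a sketch under stated assumptions rather than a tested simulation routine: sample the same number of rows independently in every subarray, then send each subarray's sampled rows to the next subarray in a random cycle, writing them into the slots that the receiving subarray has just vacated. The function swap_rows and its arguments are hypothetical names. Using a random cycle is an assumption made here; it guarantees that rows always change subarray and that row counts stay constant, but it is not the only way to choose destinations.

```r
swap_rows <- function(x, m) {
  n <- dim(x)[1]                      # rows per subarray
  k <- dim(x)[3]                      # number of subarrays
  n_move <- max(1L, round(n * m))     # rows to move out of each subarray
  stopifnot(k >= 2, n_move <= n)

  # Column s holds the row indices sampled in subarray s
  idx <- matrix(replicate(k, sample.int(n, n_move)), nrow = n_move, ncol = k)

  # Random cycle over the subarrays: each one sends to its successor
  cyc <- sample.int(k)
  dest <- integer(k)
  dest[cyc] <- cyc[c(2:k, 1)]

  out <- x
  for (s in seq_len(k)) {
    d <- dest[s]
    # Rows leaving s fill the slots vacated by the rows leaving d
    out[idx[, d], , d] <- x[idx[, s], , s]
  }
  out
}

set.seed(4)
x <- array(sample(1:10, 300, replace = TRUE), dim = c(10, 10, 3))
y <- swap_rows(x, m = 0.10)
dim(y)
# [1] 10 10  3
```

All reads come from the original x and all writes go to the copy out, so a row is never overwritten before it has been moved.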

Indices of matching of rows between matrices

Let n be a positive integer. We have a matrix B that has n columns, whose entries are integers between 1 and n. The aim is to match the rows of B with the rows of permutations(n), storing the indices in a vector v.
For example, let us consider the following. If
permutations(3)=
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 1 3 2
[3,] 2 1 3
[4,] 2 3 1
[5,] 3 1 2
[6,] 3 2 1
and
B=
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 1 2 3
[3,] 3 1 2
[4,] 2 3 1
[5,] 3 1 2
Then the vector v is
1 1 5 4 5
because the first two rows of B are equal to the row number 1 of permutations(3), the third row of B is the row number 5 of permutations(3), and so on.
I tried to apply the command
row.match
but the latter returns the error:
Error in do.call("paste", c(x[, , drop = FALSE], sep = "\r")) :
second argument must be a list
One way is to use match (here m1 holds the matrix permutations(3)):
match(do.call(paste, data.frame(B)), do.call(paste, data.frame(m1)))
#[1] 1 1 5 4 5
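For a version that runs on its own, here is a base R sketch that types in the two matrices from the question by hand and matches rows through a string key. The helper row_key and the name perms are made up for this example.

```r
perms <- matrix(c(1, 2, 3,
                  1, 3, 2,
                  2, 1, 3,
                  2, 3, 1,
                  3, 1, 2,
                  3, 2, 1), ncol = 3, byrow = TRUE)
B <- matrix(c(1, 2, 3,
              1, 2, 3,
              3, 1, 2,
              2, 3, 1,
              3, 1, 2), ncol = 3, byrow = TRUE)

# Collapse each row into one string so that rows can be compared with match()
row_key <- function(m) apply(m, 1, paste, collapse = " ")

v <- match(row_key(B), row_key(perms))
v
# [1] 1 1 5 4 5
```

Rows of B that do not occur in perms come back as NA, which is worth checking for since B is only required to have entries between 1 and n.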
One possible way is to turn your matrices into dataframes and join them:
A = read.table(text = "
1 2 3
1 3 2
2 1 3
2 3 1
3 1 2
3 2 1
")
B = read.table(text = "
1 2 3
1 2 3
3 1 2
2 3 1
3 1 2
")
library(dplyr)
# Start from B so that the result follows the row order of B
B %>%
left_join(mutate(A, row_id = row_number())) %>%
pull(row_id)
# [1] 1 1 5 4 5

Creating a matrix of increasing concentric rings of numbers in R

I need to write a function in R that creates a matrix of increasing concentric rings of numbers. The function's argument is the number of layers. For example, if x = 3, the matrix will look like the following:
1 1 1 1 1
1 2 2 2 1
1 2 3 2 1
1 2 2 2 1
1 1 1 1 1
I have no idea how to do it. I would really appreciate any suggestions.
1) Try this:
x <- 3 # input
n <- 2*x-1
m <- diag(n)
x - pmax(abs(row(m) - x), abs(col(m) - x))
giving:
[,1] [,2] [,3] [,4] [,5]
[1,] 1 1 1 1 1
[2,] 1 2 2 2 1
[3,] 1 2 3 2 1
[4,] 1 2 2 2 1
[5,] 1 1 1 1 1
2) A second approach is:
x <- 3 # input
n <- 2*x-1
mid <- pmin(1:n, n:1) # middle row/column
outer(mid, mid, pmin)
giving the same result as before.
3) Yet another approach, with some similarities to the prior two, is:
x <- 3 # input
n <- 2*x-1
Dist <- abs(seq_len(n) - x)
x - outer(Dist, Dist, pmax)
Note: The above gives the sample matrix shown in the question, but the subject of the question says the rings should be increasing, which may mean increasing from the center to the outside. If that is what is wanted, try one of the following, where m, mid and Dist are as before:
pmax(abs(row(m) - x), abs(col(m) - x)) + 1
or
x - outer(mid, mid, pmin) + 1
or
outer(Dist, Dist, pmax) + 1
Any of these give:
[,1] [,2] [,3] [,4] [,5]
[1,] 3 3 3 3 3
[2,] 3 2 2 2 3
[3,] 3 2 1 2 3
[4,] 3 2 2 2 3
[5,] 3 3 3 3 3
Try this:
x<-3
res<-matrix(nrow=2*x-1,ncol=2*x-1)
for (i in 1:x) res[i:(2*x-i),i:(2*x-i)]<-i
res
# [,1] [,2] [,3] [,4] [,5]
#[1,] 1 1 1 1 1
#[2,] 1 2 2 2 1
#[3,] 1 2 3 2 1
#[4,] 1 2 2 2 1
#[5,] 1 1 1 1 1
A recursive solution for kicks (odd n only)
f <- function(n) if (n == 1) 1 else `[<-`(matrix(1,n,n), 2:(n-1), 2:(n-1), 1+Recall(n-2))
f(5)
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 1 1 1 1
# [2,] 1 2 2 2 1
# [3,] 1 2 3 2 1
# [4,] 1 2 2 2 1
# [5,] 1 1 1 1 1
Here's the logic; implement it yourself in R.
Create a matrix with number of rows and columns equal to 2*x-1,
fill it with zeros, and traverse it from (0,0) to
(2*x-2,2*x-2) (0-based indices; R itself is 1-based).
At each cell, calculate the 'level' of the cell. The level of
the cell is its shortest distance from the four borders of
the matrix, i.e. min(i,j,2*x-2-i,2*x-2-j).
The value to put in the cell is this level plus 1, so that the outermost ring is 1.
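A minimal sketch of that logic in R, assuming 1-based indices: the distance to the nearest border then becomes min(i, j, n + 1 - i, n + 1 - j), which already equals 1 on the outermost ring. The function name rings is hypothetical.

```r
rings <- function(x) {
  n <- 2 * x - 1
  m <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # Distance to the nearest border, counted from 1
      m[i, j] <- min(i, j, n + 1 - i, n + 1 - j)
    }
  }
  m
}

rings(3)
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    1    1    1    1    1
# [2,]    1    2    2    2    1
# [3,]    1    2    3    2    1
# [4,]    1    2    2    2    1
# [5,]    1    1    1    1    1
```

The explicit double loop is the slowest of the approaches shown here; the vectorised outer() versions above compute the same matrix.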
