assign vector elements to matrix - r

I wish to assign vector elements to a matrix. In my example I have a population of 10 kinds of fruit available for purchase, in my.fruit. A sample of five pieces of fruit are purchased in the sequence shown in sequence.of.purchased.item. I want to create a matrix containing the sequence in which the fruit was purchased as shown in desired.result.
Here is a somewhat long description of how desired.result is constructed. The vector my.fruit is essentially the row names of the matrix desired.result. The first plum purchased was the first piece of fruit purchased. The sequence number of 1 is placed in the first column of the fifth row, the row representing plums. The second plum purchased was the third piece of fruit purchased. The sequence number of 3 is placed in the second column of the plum row. The first apple purchased was the second piece of fruit purchased. So, the number 2 is placed in the first column of the first row, the row representing apples. The fourth piece of purchased fruit was an orange. So, a 4 is placed in the first column of the second row, the row representing oranges.
The tenth row represents olives, but no olives were purchased. Similarly, no cherries, peaches, apricots, pears, grapefruit or figs were purchased. So, their rows are all zero in desired.result.
my.fruit <- c('apple', 'orange', 'cherry', 'peach', 'plum',
'apricot', 'pear', 'grapefruit', 'fig', 'olive')
sequence.of.purchased.item <- c('plum', 'apple', 'plum', 'orange', 'plum')
desired.result <- matrix(c(
2, 0, 0,
4, 0, 0,
0, 0, 0,
0, 0, 0,
1, 3, 5,
0, 0, 0,
0, 0, 0,
0, 0, 0,
0, 0, 0,
0, 0, 0), ncol = 3, byrow = TRUE)
I can obtain the row and column indices for each piece of purchased fruit using:
purchase.order.row <- match(sequence.of.purchased.item, my.fruit)
purchase.order.row
#[1] 5 1 5 2 5
purchase.order.col <- sapply(1:length(sequence.of.purchased.item),
function(i) {sum(sequence.of.purchased.item[i] == sequence.of.purchased.item[1:i])})
purchase.order.col
#[1] 1 1 2 1 3
Here I attempt to assign the sequence number of each purchased fruit to an output matrix using sapply:
my.output <- matrix(0, ncol = 3, nrow = 10)
sapply(1:5, function(x) my.output[purchase.order.row[x],
purchase.order.col[x]] = x)
However, the sapply statement is not returning the desired output.
my.output
[,1] [,2] [,3]
[1,] 0 0 0
[2,] 0 0 0
[3,] 0 0 0
[4,] 0 0 0
[5,] 0 0 0
[6,] 0 0 0
[7,] 0 0 0
[8,] 0 0 0
[9,] 0 0 0
[10,] 0 0 0

This will probably be much easier:
my.output <- matrix(0, ncol = 3, nrow = 10)
i <- cbind(purchase.order.row,purchase.order.col)
my.output[i] <- 1:5
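For reference, here is a self-contained sketch of the matrix-indexing idea above, run on the question's data. The use of ave() to build the column index and the row names on the output are my own additions for illustration, not part of the original answer.

```r
my.fruit <- c('apple', 'orange', 'cherry', 'peach', 'plum',
              'apricot', 'pear', 'grapefruit', 'fig', 'olive')
sequence.of.purchased.item <- c('plum', 'apple', 'plum', 'orange', 'plum')

# Row index: position of each purchased fruit in my.fruit
purchase.order.row <- match(sequence.of.purchased.item, my.fruit)

# Column index: running count of each fruit within the purchase sequence
# (ave() is an assumed alternative to the sapply() call in the question)
purchase.order.col <- ave(seq_along(sequence.of.purchased.item),
                          sequence.of.purchased.item, FUN = seq_along)

my.output <- matrix(0, nrow = length(my.fruit), ncol = max(purchase.order.col),
                    dimnames = list(my.fruit, NULL))

# A two-column matrix used as an index addresses one (row, col) cell per row
my.output[cbind(purchase.order.row, purchase.order.col)] <-
  seq_along(sequence.of.purchased.item)

my.output
```

Indexing with a two-column matrix fills all five cells in one vectorised assignment, so no loop or sapply is needed.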

Use <<- instead of = inside your sapply function so that the assignment modifies the globally defined my.output rather than a copy that is local to the function.
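A minimal sketch of that suggestion, assuming the index vectors computed in the question (hard-coded here so the snippet runs on its own). The invisible() wrapper is my addition and only suppresses the values that sapply returns.

```r
purchase.order.row <- c(5, 1, 5, 2, 5)
purchase.order.col <- c(1, 1, 2, 1, 3)

my.output <- matrix(0, ncol = 3, nrow = 10)

# <<- assigns into my.output in the enclosing environment;
# a plain = or <- here would only change a copy local to the function
invisible(sapply(1:5, function(x) {
  my.output[purchase.order.row[x], purchase.order.col[x]] <<- x
}))

my.output[c(1, 2, 5), ]
```

A plain for loop over 1:5 with an ordinary <- assignment would have the same effect without relying on <<-.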

Related

How to create a vector every iteration of a for loop?

I have a vector containing different names e.g. (apple, apple, apple, pear, grape, apple)
For each of these names, I would like to create a vector, whose elements are a binary 1/0 depending on whether that name appears in the corresponding entry.
So I want my nested for-loop to return, from my example, three vectors:
apple <- c(1, 1, 1, 0, 0, 1)
pear <- c(0, 0, 0, 1, 0, 0)
grape <- c(0, 0, 0, 0, 1, 0)
Here is the code I've tried:
for(i in SurgList) {
for(j in Surgeons) {
assign("i", "i")
ifelse(i==j, append(i, 1), append(i,0))
}
}
Thank you for your help!
v0 <- c('apple', 'pear', 'grape') # Targets to be checked
v1 <- c('apple', 'apple', 'apple', 'pear', 'grape', 'apple') # Input vector
list_01 <- list() # Create an empty list to store each vectors
for (i in v0){ # v0 = Targets to be checked
list_ <- list(i = +(v1 == i)) # Compare if the input vector meet the case
names(list_) <- i
list_01 <- append(list_01, list_) # Store the vector into list
}
list_01 # The binaries are stored in this list
apple <- list_01$apple # Each groups' binaries could be
pear <- list_01$pear # extracted as a vector
grape <- list_01$grape
Console Output:
> list_01
$apple
[1] 1 1 1 0 0 1
$pear
[1] 0 0 0 1 0 0
$grape
[1] 0 0 0 0 1 0
> apple
[1] 1 1 1 0 0 1
> pear
[1] 0 0 0 1 0 0
> grape
[1] 0 0 0 0 1 0
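As a possible alternative to the explicit loop, the same indicators can be sketched with sapply, which returns one named column per target. The name indicators is hypothetical and chosen only for this example.

```r
v0 <- c('apple', 'pear', 'grape')                               # targets to check
v1 <- c('apple', 'apple', 'apple', 'pear', 'grape', 'apple')    # input vector

# One column per target; each column is the 0/1 indicator for that name
indicators <- sapply(v0, function(target) as.integer(v1 == target))

indicators[, "apple"]
indicators[, "pear"]
indicators[, "grape"]
```

If the targets are not known in advance, unique(v1) could be used in place of v0.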

Cluster groups based on pairwise distances

I have an n x n matrix with pairwise distances as entries. The matrix looks for example like this:
m = matrix (c(0, 0, 1, 1, 1, 1,0, 0, 1, 1, 0, 1,1, 1, 0, 1, 1, 0,1, 1, 1, 0, 1, 1,1, 0, 1, 1, 0, 1,1, 1, 0, 1, 1, 0),ncol=6, byrow=TRUE)
colnames(m) <- c("A","B","C","D","E","F")
rownames(m) <- c("A","B","C","D","E","F")
Now I want to put every letter in the same cluster if the distance to any other letter is 0. For the example above, I should get three clusters consisting of:
(A,B,E)
(C,F)
(D)
I would be interested in the number of entries in each cluster. At the end, I want to have a vector like:
clustersizes = c(3,2,1)
I assume it is possible by using the hclust function, but I'm not able to extract the three clusters. I also tried the cutree function, but if I don't know the number of clusters before and also not the cutoff for the height, how should I do it?
This is what I tried:
h <- hclust(dist(m),method="single")
plot(h)
Thanks!
Welcome to SO.
There are several ways to handle this but an easy choice is to use the igraph package.
First we convert your matrix m to an adjacency matrix, in which 1 marks a connection between two nodes and 0 means no connection. Since your matrix uses 0 for connected pairs, we subtract it from 1:
mm <- 1 - m
diag(mm) <- 0 # We don't allow loops
This gives
> mm
A B C D E F
A 0 1 0 0 0 0
B 1 0 0 0 1 0
C 0 0 0 0 0 1
D 0 0 0 0 0 0
E 0 1 0 0 0 0
F 0 0 1 0 0 0
Then we just need to feed it to igraph to compute communities
library("igraph")
fastgreedy.community(as.undirected(graph.adjacency(mm)))
which produces
IGRAPH clustering fast greedy, groups: 3, mod: 0.44
+ groups:
$`1`
[1] "A" "B" "E"
$`2`
[1] "C" "F"
$`3`
[1] "D"
Now if you save that result you can get the community sizes right away
res <- fastgreedy.community(as.undirected(graph.adjacency(mm)))
sizes(res)
which yields
Community sizes
1 2 3
3 2 1
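For readers who want to stay with hclust as in the question, here is a hedged base-R sketch. It is an alternative to the igraph answer, not part of it. It assumes m already holds the distances, so it uses as.dist(m); dist(m) would instead compute Euclidean distances between the rows of m. The cut height of 0.5 is an assumption that works here because all distances are either 0 or 1.

```r
m <- matrix(c(0, 0, 1, 1, 1, 1,
              0, 0, 1, 1, 0, 1,
              1, 1, 0, 1, 1, 0,
              1, 1, 1, 0, 1, 1,
              1, 0, 1, 1, 0, 1,
              1, 1, 0, 1, 1, 0), ncol = 6, byrow = TRUE)
colnames(m) <- rownames(m) <- c("A", "B", "C", "D", "E", "F")

# Single linkage merges two groups as soon as any pair is at distance 0
h <- hclust(as.dist(m), method = "single")

# Cutting anywhere strictly between 0 and 1 keeps only the zero-distance merges
cl <- cutree(h, h = 0.5)

split(names(cl), cl)                   # cluster members
clustersizes <- as.vector(table(cl))   # cluster sizes
clustersizes
```

With single linkage, cutting just above 0 yields the groups of letters linked by chains of zero distances, which is the grouping the question describes.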

Take value from list and use it as index

I have a matrix (A) like this (the row and column names are identification codes, IDs):
1 3 10 38 46
1 0 0.4 0 0 0
3 0 0 0 0 0
10 0 0 0.9 0.8 0
38 0 0 0 0 0
46 0 0.1 0 0 0
And another matrix (B) like this:
a b c
1 2.676651e-04 4.404911e-06 9.604227e-06
3 6.073389e-10 3.273222e-05 3.360321e-04
10 4.156392e-08 1.269607e-06 7.509217e-06
38 4.200699e-08 3.227431e-02 8.286920e-11
46 9.352353e-05 3.318948e-20 8.694981e-06
I would like to get the indices of the elements of A that are > 0, so I used this command:
temp <- apply(A,1, FUN=function(x) which(x>0))
It returned a list with the correct indices of the elements > 0.
Next, I would like to use these indices to select rows of B and multiply the elements within each selected row. In particular, I would like to compute something like this for each row of A:
1: 6.073389e-10*3.273222e-05*3.360321e-04
Here I used the information in A (the second column of the first row holds a value > 0) as the index of the row to take from B.
For the second row, the result should be 0 because no element of A[2,] is > 0.
For the third row, I would like a result built like the first row's, but summing the two products:
10: 4.156392e-08*1.269607e-06*7.509217e-06 +4.200699e-08*3.227431e-02*8.286920e-11
I have tried to unlist the list, but that gives a vector and loses the correspondence with the IDs.
A <-
matrix(
c(0, 0.4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.9, 0.8, 0, 0, 0, 0, 0, 0, 0, 0.1, 0, 0, 0),
nrow = 5,
ncol = 5,
byrow = T
)
B <-
matrix(
c(
2.676651e-04, 4.404911e-06, 9.604227e-06,
6.073389e-10, 3.273222e-05, 3.360321e-04,
4.156392e-08, 1.269607e-06, 7.509217e-06,
4.200699e-08, 3.227431e-02, 8.286920e-11,
9.352353e-05, 3.318948e-20, 8.694981e-06
),
nrow = 5,
ncol = 3,
byrow = T
)
idx<-which(A>0, arr.ind = T)
result <- 0;
for (i in 1:nrow(idx)) {
cat(A[idx[i,1],idx[i,2]], sep="\n")
cat(B[idx[i,2], ], sep="\n")
result = result + sum(A[idx[i,1],idx[i,2]] * B[idx[i,2],])
}
cat("result=")
cat(result)
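The loop above accumulates a single number. If one value per row of A is wanted, as the worked examples in the question suggest, a sketch under that reading could look like the following. The names ids, row_prod and per_row are hypothetical, and the interpretation (the product of each selected B row, summed over the positive entries in a row of A) is my assumption from the examples for IDs 1 and 10.

```r
ids <- c("1", "3", "10", "38", "46")

A <- matrix(c(0, 0.4, 0,   0,   0,
              0, 0,   0,   0,   0,
              0, 0,   0.9, 0.8, 0,
              0, 0,   0,   0,   0,
              0, 0.1, 0,   0,   0),
            nrow = 5, byrow = TRUE, dimnames = list(ids, ids))

B <- matrix(c(2.676651e-04, 4.404911e-06, 9.604227e-06,
              6.073389e-10, 3.273222e-05, 3.360321e-04,
              4.156392e-08, 1.269607e-06, 7.509217e-06,
              4.200699e-08, 3.227431e-02, 8.286920e-11,
              9.352353e-05, 3.318948e-20, 8.694981e-06),
            nrow = 5, byrow = TRUE, dimnames = list(ids, c("a", "b", "c")))

# Product of the three entries in each row of B
row_prod <- apply(B, 1, prod)

# 0/1 mask of positive entries in A; the matrix product sums the selected row products
per_row <- as.vector(((A > 0) * 1) %*% row_prod)
names(per_row) <- ids
per_row
```

Rows of A without any positive entry come out as 0, and the result keeps the IDs as names, so the correspondence is not lost.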

Generate matrices using positive integer solutions of the indefinite equation

I asked a similar question previously, but this one is a little trickier. I have a matrix (say A) of POSITIVE INTEGER solutions (previously NON-NEGATIVE solutions) to the indefinite equation x1+x2+x3 = 8. I also have another matrix (say B) with columns
0 1 0 1
0 0 1 1
I want to generate matrices using rows of A and the columns of B.
For example, let (2,2,4) be one solution (one row) of matrix A. In this case, I cannot simply use rep. So I tried to generate all the three-column matrices from matrix B and then apply rep, but couldn't figure that out. I use the following lines to generate a list of all three-column matrices.
cols <- combn(ncol(B), 3, simplify=F, FUN=as.numeric)
M3 <- lapply(cols, function(x) cbind(B[,x]))
For an example, cols[[1]]
[1] 1 2 3
Then, the columns of my new matrix would be
0 0 1 1 0 0 0 0
0 0 0 0 1 1 1 1
The columns of this new matrix are repeated columns of B, i.e., the first column 2 times, the second column 2 times and the third column 4 times. I want to apply this procedure to all the rows of matrix A. How do I do this?
?rep(x, times) says:
if times is a vector of the same length as x (after replication by
each), the result consists of x[1] repeated times[1] times, x[2]
repeated times[2] times and so on.
The basic idea is:
B <- matrix(c(0, 1, 0, 1, 0, 0, 1, 1), byrow = T, nrow = 2)
cols <- combn(ncol(B), 3, simplify=F, FUN=as.numeric)
a1 <- c(2, 2, 4)
cols[[1]] # [1] 1 2 3
rep(cols[[1]], a1) # [1] 1 1 2 2 3 3 3 3
B[, rep(cols[[1]], a1)]
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
# [1,] 0 0 1 1 0 0 0 0
# [2,] 0 0 0 0 1 1 1 1
testA <- rbind(c(2,2,4), c(2,1,5), c(2,3,3))
## apply(..., lapply(...)) approach (output is list in list)
apply(testA, 1, function(x) lapply(cols, function(y) B[, rep(y, x)]))
## other approach using combination of indices
ind <- expand.grid(ind_cols = 1:length(cols), ind_A = 1:nrow(testA))
col_ind <- apply(ind, 1, function(x) rep(cols[[x[1]]], testA[x[2],]))
lapply(1:ncol(col_ind), function(x) B[, col_ind[,x]]) # output is list
library(dplyr)
apply(col_ind, 2, function(x) t(B[, x])) %>% matrix(ncol = 8, byrow=T) # output is matrix
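The same idea can also be sketched in base R only, returning one flat list with a matrix for every combination of a column triple and a row of testA. The names pairs and mats are hypothetical, and Map() is my substitution for the nested apply/lapply above.

```r
B <- matrix(c(0, 1, 0, 1,
              0, 0, 1, 1), nrow = 2, byrow = TRUE)
testA <- rbind(c(2, 2, 4), c(2, 1, 5), c(2, 3, 3))

# All ways of choosing three columns of B
cols <- combn(ncol(B), 3, simplify = FALSE)

# Every (column triple, solution row) pair; col_set varies fastest
pairs <- expand.grid(col_set = seq_along(cols), sol = seq_len(nrow(testA)))

# Repeat each chosen column as many times as the solution row says
mats <- Map(function(k, r) B[, rep(cols[[k]], times = testA[r, ]), drop = FALSE],
            pairs$col_set, pairs$sol)

length(mats)
mats[[1]]
```

Each element of mats has 8 columns because every row of testA sums to 8.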

make data frame with binaries to sum to 1

I have a data frame with only zeros and ones, e.g.
df <- data.frame(v1 = rbinom(100, 1, 0.5),
v2 = rbinom(100, 1, 0.2),
v3 = rbinom(100, 1, 0.4))
Now I want to modify this data set so that each row sums to 1.
So this
1 0 0
1 1 0
0 0 1
1 1 1
0 0 0
should become this:
1 0 0
0.5 0.5 0
0 0 1
0.33 0.33 0.33
0 0 0
edit: rows with all zeros should be left as is
As already pointed out by @lmo, the data.frame (or matrix) can be modified with
df <- df / rowSums(df)
In the case of rows containing only zeros this will lead to rows containing only NaN. Since these rows should be kept as they were, the easiest way is probably to correct for this afterwards with
df[is.na(df)] <- 0
Here is a quick method:
# create matrix
temp <- matrix(c(1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1), ncol=3, byrow=T)
temp / rowSums(temp)
This exploits the fact that matrices are stored column-wise, so that the element-by-element division by rowSums and the recycling are aligned.
In the case that all elements in a row are zero, and you don't want a NaN, another method from @RHertel is the following:
# save rowSum:
mySums <- rowSums(temp)
temp / ifelse(mySums != 0, mySums, 1)
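Putting the two answers together, here is a small sketch on the five example rows from the question, applied to a data frame rather than a matrix. The names row_totals and res are hypothetical; replacing zero totals by 1 before dividing is the same idea as the ifelse() call above.

```r
df <- data.frame(v1 = c(1, 1, 0, 1, 0),
                 v2 = c(0, 1, 0, 1, 0),
                 v3 = c(0, 0, 1, 1, 0))

row_totals <- rowSums(df)

# Divide all-zero rows by 1 so they stay 0 instead of becoming NaN
row_totals[row_totals == 0] <- 1

# The vector is recycled down each column, so row i is divided by row_totals[i]
res <- df / row_totals
res
```

Every row of res then sums to 1, except the all-zero row, which is left unchanged.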
