How to understand a specific function in R


I have a simple function in R, shown below:
no.dimnames <- function(a) {
  ## Remove all dimension names from an array for compact printing.
  d <- list()
  l <- 0
  for(i in dim(a)) {
    d[[l <- l + 1]] <- rep("", i)
  }
  dimnames(a) <- d
  a
}
The goal of this function is to drop all dimension names from an array. However, I don't understand what the following indexing does:
d[[l <- l + 1]]
Here d is initially an empty list and l is 0, so what does d[[0 <- 1]] mean?
> x <- matrix(sample(1:5,20,replace=TRUE),nrow = 5)
> x
     [,1] [,2] [,3] [,4]
[1,]    5    4    5    3
[2,]    2    1    5    1
[3,]    1    3    4    4
[4,]    3    1    4    3
[5,]    5    3    5    5
> no.dimnames(x)
5 4 5 3
2 1 5 1
1 3 4 4
3 1 4 3
5 3 5 5

It looks like you understand the increment code d[[l <- l + 1]] but are still asking about rep("", i). It replaces the dimension names with empty strings; i is the number of empty strings needed for that dimension.
Suppose we had a 4x5 matrix. It would have four row names and five column names. To blank them all, we need four empty strings for the rows, rep("", 4), and five for the columns, rep("", 5). That is what the code does:
mat <- matrix(1:20, 4,5)
rownames(mat) <- month.abb[1:4]
colnames(mat) <- letters[1:5]
mat
# a b c d e
# Jan 1 5 9 13 17
# Feb 2 6 10 14 18
# Mar 3 7 11 15 19
# Apr 4 8 12 16 20
dimnames(mat)
# [[1]]
# [1] "Jan" "Feb" "Mar" "Apr"
#
# [[2]]
# [1] "a" "b" "c" "d" "e"
#What we need
list(rep("", 4), rep("", 5))
# [[1]]
# [1] "" "" "" ""
#
# [[2]]
# [1] "" "" "" "" ""
dimnames(mat) <- list(rep("", 4), rep("", 5))
mat
#
# 1 5 9 13 17
# 2 6 10 14 18
# 3 7 11 15 19
# 4 8 12 16 20

d[[0 <- 1]] isn't valid: it would mean assigning 1 to 0, which can't be done. What actually happens is that l is set to l + 1, and since l is initially 0, this is l <- 0 + 1.
Forget about what a is or what it could be.
Just type the following into RStudio, or whatever you are using, and check each variable to see what happens.
> d <- list()
> l <- 0
> d[[l <- l + 1]] <- rep("", 1)
> d
> l
The one part worth explaining: when you type d[[l <- l + 1]], R first assigns l + 1 to l and then uses the new value of l as the index for [[ ]].
So d[[l <- l + 1]] breaks down like this:
l <- l + 1
l <- 0 + 1
l <- 1
[l]
[[l]]
d[[l]]
d[[1]]
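The breakdown above can be checked directly. The following is a minimal sketch, not part of the original function: it shows that an assignment is itself an expression that returns the assigned value, and that the in-index increment is equivalent to incrementing on a separate line. The names value, k, d1, d2 and no_dimnames2 are made up for illustration.

```r
# An assignment is an expression: it returns the assigned value (invisibly).
l <- 0
value <- (l <- l + 1)
stopifnot(l == 1, value == 1)

# Filling a list with the increment inside the index ...
d1 <- list(); k <- 0
for (i in c(4, 5)) d1[[k <- k + 1]] <- rep("", i)

# ... is equivalent to incrementing first and indexing afterwards.
d2 <- list(); k <- 0
for (i in c(4, 5)) {
  k <- k + 1
  d2[[k]] <- rep("", i)
}
identical(d1, d2)

# A loop-free variant of the same idea (hypothetical name).
no_dimnames2 <- function(a) {
  dimnames(a) <- lapply(dim(a), function(n) rep("", n))
  a
}
```

The lapply version avoids the counter entirely, which may be easier to read than the increment hidden inside [[ ]].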

Related

Substring each element of the data frame in R

I would like to use this code, which I found here: Generate N random integers that sum to M in R. I have an example data frame and I would like to apply this function to each Value in the data frame.
rand_vect <- function(N, M, sd = 1, pos.only = TRUE) {
vec <- rnorm(N, M/N, sd)
if (abs(sum(vec)) < 0.01) vec <- vec + 1
vec <- round(vec / sum(vec) * M)
deviation <- M - sum(vec)
for (. in seq_len(abs(deviation))) {
vec[i] <- vec[i <- sample(N, 1)] + sign(deviation)
}
if (pos.only) while (any(vec < 0)) {
negs <- vec < 0
pos <- vec > 0
vec[negs][i] <- vec[negs][i <- sample(sum(negs), 1)] + 1
vec[pos][i] <- vec[pos ][i <- sample(sum(pos ), 1)] - 1
}
vec
}
abc <- data.frame(Product = c("A", "B", "C"),
Value =c(33, 23, 12))

You can vectorize your function rand_vect over M so that it accepts a vector of values:
vrand_vect <- Vectorize(rand_vect, "M", SIMPLIFY = FALSE)
which gives
res <- vrand_vect(N= 7, M = abc$Value)
> res
[[1]]
[1] 3 5 6 6 4 3 6
[[2]]
[1] 1 3 4 5 3 4 3
[[3]]
[1] 1 2 3 3 1 2 0

You can use sapply to run rand_vect on each Value with N = 7:
sapply(abc$Value, rand_vect, N = 7)
# [,1] [,2] [,3]
#[1,] 4 4 0
#[2,] 5 2 2
#[3,] 5 4 2
#[4,] 4 3 3
#[5,] 5 5 3
#[6,] 6 2 1
#[7,] 4 3 1
This gives 7 random numbers per Value (one column each), and each column sums up to its Value.
We can verify it with:
colSums(sapply(abc$Value, rand_vect, N = 7))
#[1] 33 23 12
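If you would rather keep the results as a named list (one vector per Product) instead of a matrix, something like the following sketch may help. rand_vect is reproduced from the question so the example runs on its own; the names parts and sums and the seed value are assumptions for illustration only.

```r
# rand_vect() as given in the question, reproduced so the example is self-contained.
rand_vect <- function(N, M, sd = 1, pos.only = TRUE) {
  vec <- rnorm(N, M / N, sd)
  if (abs(sum(vec)) < 0.01) vec <- vec + 1
  vec <- round(vec / sum(vec) * M)
  deviation <- M - sum(vec)
  for (. in seq_len(abs(deviation))) {
    vec[i] <- vec[i <- sample(N, 1)] + sign(deviation)
  }
  if (pos.only) while (any(vec < 0)) {
    negs <- vec < 0
    pos  <- vec > 0
    vec[negs][i] <- vec[negs][i <- sample(sum(negs), 1)] + 1
    vec[pos][i]  <- vec[pos][i <- sample(sum(pos), 1)] - 1
  }
  vec
}

abc <- data.frame(Product = c("A", "B", "C"), Value = c(33, 23, 12))

set.seed(1)  # arbitrary seed, only so repeated runs give the same draws

# One vector of 7 integers per row; each element of Value is passed as M.
parts <- lapply(abc$Value, rand_vect, N = 7)
names(parts) <- abc$Product

# Each vector should sum to the matching Value.
sums <- vapply(parts, sum, numeric(1))
all(sums == abc$Value)
```

A list is convenient when N differs per product, because the vectors no longer need to have equal length.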

Generalize R %in% operator to match tuples

I spent a while the other day looking for a way to check if a row vector is contained in some set of row vectors in R. Basically, I want to generalize the %in% operator to match a tuple instead of each entry in a vector. For example, I want:
row.vec = c("A", 3)
row.vec
# [1] "A" "3"
data.set = rbind(c("A",1),c("B",3),c("C",2))
data.set
# [,1] [,2]
# [1,] "A" "1"
# [2,] "B" "3"
# [3,] "C" "2"
row.vec %tuple.in% data.set
# [1] FALSE
for my made-up operator %tuple.in% because the row vector c("A",3) is not a row vector in data.set. Using the %in% operator gives:
row.vec %in% data.set
# [1] TRUE TRUE
because "A" and 3 are in data.set, which is not what I want.
I have two questions. First, are there any good existing solutions to this?
Second, since I couldn't find them (even if they exist), I tried to write my own function to do it. It works for an input matrix of row vectors, but I'm wondering if any experts have proposed improvements:
is.tuple.in <- function(matrix1, matrix2){
  # Apply rbind() so that matrix1 has columns even if it is a row vector.
  matrix1 = rbind(matrix1)
  if(ncol(matrix1) != ncol(matrix2)){
    stop("Matrices must have the same number of columns.") }
  # Now check for the first row and handle other rows recursively
  row.vec = matrix1[1,]
  tuple.found = FALSE
  for(i in 1:nrow(matrix2)){
    # If we find a match, then this row exists in matrix 2 and we can break the loop
    if(all(row.vec == matrix2[i,])){
      tuple.found = TRUE
      break
    }
  }
  # If there are more rows to be checked, use a recursive call
  if(nrow(matrix1) > 1){
    return(c(tuple.found, is.tuple.in(matrix1[2:nrow(matrix1),],matrix2)))
  } else {
    return(tuple.found)
  }
}
I see a couple of problems with this that I'm not sure how to fix. First, I'd like the base case to be clear at the start of the function. I didn't manage to do this because I pass matrix1[2:nrow(matrix1),] in the recursive call, which produces an error if matrix1 has one row. So instead of reaching a case where matrix1 is empty, I have an if condition at the end deciding whether more iterations are necessary.
Second, I think the use of rbind() at the start is sloppy, but I needed it for when matrix1 had been reduced to a single row. Without using rbind(), ncol(matrix1) produced an error in the 1-row case. I figure my trouble here has to do with a lack of knowledge about R data types.
Any help would be appreciated.

I'm wondering if you have made this a bit more complicated than it is. For example,
set.seed(1618)
vec <- c(1,3)
mat <- matrix(rpois(1000,3), ncol = 2)
rownames(mat) <- 1:nrow(mat)
mat[sapply(1:nrow(mat), function(x) all(vec %in% mat[x, ])), ]
# gives me this
# [,1] [,2]
# 6 3 1
# 38 3 1
# 39 3 1
# 85 1 3
# 88 1 3
# 89 1 3
# 95 3 1
# 113 1 3
# ...
You could subset this further if you care about the order, or you could modify the function slightly:
mat[sapply(1:nrow(mat), function(x)
all(paste(vec, collapse = '') %in% paste(mat[x, ], collapse = ''))), ]
# [,1] [,2]
# 85 1 3
# 88 1 3
# 89 1 3
# 113 1 3
# 133 1 3
# 139 1 3
# 187 1 3
# ...
another example with a longer vector
set.seed(1618)
vec <- c(1,4,5,2)
mat <- matrix(rpois(10000, 3), ncol = 4)
rownames(mat) <- 1:nrow(mat)
mat[sapply(1:nrow(mat), function(x) all(vec %in% mat[x, ])), ]
# [,1] [,2] [,3] [,4]
# 57 2 5 1 4
# 147 1 5 2 4
# 279 1 2 5 4
# 303 1 5 2 4
# 437 1 5 4 2
# 443 1 4 5 2
# 580 5 4 2 1
# ...
I see a couple that match:
mat[sapply(1:nrow(mat), function(x)
all(paste(vec, collapse = '') %in% paste(mat[x, ], collapse = ''))), ]
# [,1] [,2] [,3] [,4]
# 443 1 4 5 2
# 901 1 4 5 2
# 1047 1 4 5 2
but only three.
For your single-row case:
vec <- c(1,4,5,2)
mat <- matrix(c(1,4,5,2), ncol = 4)
rownames(mat) <- 1:nrow(mat)
mat[sapply(1:nrow(mat), function(x)
all(paste(vec, collapse = '') %in% paste(mat[x, ], collapse = ''))), ]
# [1] 1 4 5 2
here is a simple function with the above code
is.tuplein <- function(vec, mat, exact = TRUE) {
  rownames(mat) <- 1:nrow(mat)
  if (exact)
    tmp <- mat[sapply(1:nrow(mat), function(x)
      all(paste(vec, collapse = '') %in% paste(mat[x, ], collapse = ''))), ]
  else tmp <- mat[sapply(1:nrow(mat), function(x) all(vec %in% mat[x, ])), ]
  return(tmp)
}
is.tuplein(vec = vec, mat = mat)
# [1] 1 4 5 2
seems to work, so let's make our own %in% operator:
`%tuple%` <- function(x, y) is.tuplein(vec = x, mat = y, exact = TRUE)
`%tuple1%` <- function(x, y) is.tuplein(vec = x, mat = y, exact = FALSE)
and try her out
set.seed(1618)
c(1,2,3) %tuple% matrix(rpois(1002,3), ncol = 3)
# [,1] [,2] [,3]
# 133 1 2 3
# 190 1 2 3
# 321 1 2 3
set.seed(1618)
c(1,2,3) %tuple1% matrix(rpois(1002,3), ncol = 3)
# [,1] [,2] [,3]
# 48 2 3 1
# 64 2 3 1
# 71 1 3 2
# 73 3 1 2
# 108 3 1 2
# 112 1 3 2
# 133 1 2 3
# 166 2 1 3

Does this do what you want (even for more than 2 columns)?
paste(row.vec,collapse="_") %in% apply(data.set,1,paste,collapse="_")
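The paste idea can be wrapped into the operator the question asks for. This is only a sketch under stated assumptions: the operator name %tuple.in% comes from the question, while the helper key and the choice of separator are made up here. The separator must be a string that never occurs in the data; with an empty separator, rows such as c(1, 23) and c(12, 3) would collapse to the same key.

```r
# Hypothetical operator: for each row of x, is it also a row of table?
`%tuple.in%` <- function(x, table) {
  x <- rbind(x)                        # a plain vector becomes a 1-row matrix
  stopifnot(ncol(x) == ncol(table))
  sep <- "\r"                          # assumed not to occur in the data
  key <- function(m) apply(m, 1, paste, collapse = sep)  # one string per row
  unname(key(x) %in% key(table))
}

data.set <- rbind(c("A", 1), c("B", 3), c("C", 2))

c("A", 3) %tuple.in% data.set                     # row is absent
c("B", 3) %tuple.in% data.set                     # row is present
rbind(c("A", 1), c("C", 3)) %tuple.in% data.set   # one result per row of x
```

Because the comparison happens on pasted strings, numeric rows are compared by their character representation, which may or may not be what you want for non-integer values.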

interleave rows of matrix stored in a list in R

I want to create an interleaved matrix from a list of matrices.
Example input:
> l <- list(a=matrix(1:4,2),b=matrix(5:8,2))
> l
$a
[,1] [,2]
[1,] 1 3
[2,] 2 4
$b
[,1] [,2]
[1,] 5 7
[2,] 6 8
Expected output:
1 3
5 7
2 4
6 8
I have checked the interleave function in gdata but it does not show this behaviour for lists. Any help appreciated.

Here is a one-liner:
do.call(rbind, l)[order(sequence(sapply(l, nrow))), ]
# [,1] [,2]
# [1,] 1 3
# [2,] 5 7
# [3,] 2 4
# [4,] 6 8
To help understand, the matrices are first stacked on top of each other with do.call(rbind, l), then the rows are extracted in the right order:
sequence(sapply(l, nrow))
# a1 a2 b1 b2
# 1 2 1 2
order(sequence(sapply(l, nrow)))
# [1] 1 3 2 4
It will work with any number of matrices and it will do "the right thing" (subjective) even if they don't have the same number of rows.
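To illustrate the unequal-rows case, here is a small sketch with made-up input (l2, idx and res are names chosen for this example). The extra row of the longer matrix simply ends up last.

```r
# Two matrices with different numbers of rows (3 and 2).
l2 <- list(a = matrix(1:6, 3), b = matrix(7:10, 2))

# Row counters within each matrix: 1 2 3 for a, then 1 2 for b.
# unname() only drops names that some R versions attach.
within_idx <- unname(sequence(sapply(l2, nrow)))

# order() is stable for ties, so all first rows come first (a, then b),
# then all second rows, and finally the leftover third row of a.
idx <- order(within_idx)

res <- do.call(rbind, l2)[idx, ]
res
```

Here idx is 1 4 2 5 3, so the stacked rows are picked as a1, b1, a2, b2, a3.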

Rather than reinventing the wheel, you can just modify it to get you to your destination.
The interleave function from "gdata" starts with ... to let you specify a number of data.frames or matrices to put together. The first few lines of the function look like this:
head(interleave)
#
# 1 function (..., append.source = TRUE, sep = ": ", drop = FALSE)
# 2 {
# 3 sources <- list(...)
# 4 sources[sapply(sources, is.null)] <- NULL
# 5 sources <- lapply(sources, function(x) if (is.matrix(x) ||
# 6 is.data.frame(x))
You can just rewrite lines 1 and 3 as I did in this Gist to create a list version of interleave (here, I've called it Interleave)
head(Interleave)
#
# 1 function (myList, append.source = TRUE, sep = ": ", drop = FALSE)
# 2 {
# 3 sources <- myList
# 4 sources[sapply(sources, is.null)] <- NULL
# 5 sources <- lapply(sources, function(x) if (is.matrix(x) ||
# 6 is.data.frame(x))
Does it work?
l <- list(a=matrix(1:4,2),b=matrix(5:8,2), c=matrix(9:12,2))
Interleave(l)
# [,1] [,2]
# a 1 3
# b 5 7
# c 9 11
# a 2 4
# b 6 8
# c 10 12

Naming elements of matrix dimensions one at a time, when dimname is NULL

When dimnames is currently NULL, is it possible to rename a matrix's dimension names one at a time?
For example, this fails:
mtx <- matrix(1:16,4)
dimnames(mtx)[[2]][1] <- 'col1'
with the error: Error in dimnames(mtx)[[2]][1] <- "col1" : 'dimnames' must be a list
However this works:
mtx <- matrix(1:16,4)
dimnames(mtx)[[1]] <- letters[1:4]
dimnames(mtx)[[2]] <- LETTERS[1:4]
dimnames(mtx)[[2]][1] <- 'col1'
dimnames(mtx)[[2]][2] <- 'col2'
My objective is to separately replace dimnames(mtx)[[2]][1], dimnames(mtx)[[2]][2], and so on. If this is not possible, I can rewrite the loop.
Thanks folks, I have ended up with the below -- I pass the names in via prepend:
mtxNameSticker <- function(mtx, prepend = NULL, MARGIN=2)
{
  if (MARGIN == 1) max <- nrow(mtx) else
    max <- ncol(mtx)
  if (is.null(prepend)) if (MARGIN == 2) prepend <- 'C' else
    prepend <- 'R'
  if (length(prepend) == 1) prepend <- paste0(prepend, 1:dim(mtx)[[MARGIN]])
  dimnames(mtx)[[MARGIN]] <- seq(from=1, by=1, length.out=dim(mtx)[[MARGIN]])
  for (i in 1:max){
    dimnames(mtx)[[MARGIN]][i] <- prepend[i]
  }
  return(mtx)
}

For as long as dimnames is NULL and not an appropriate list, you cannot make assignments to it at particular positions. One easy way to create a dummy but complete list of dimnames is to run:
dimnames(mtx) <- lapply(dim(mtx), seq_len)
mtx
# 1 2 3 4
# 1 1 5 9 13
# 2 2 6 10 14
# 3 3 7 11 15
# 4 4 8 12 16
Then, you can make assignments one at a time like you were wishing:
dimnames(mtx)[[2]][1] <- 'col1'
mtx
# col1 2 3 4
# 1 1 5 9 13
# 2 2 6 10 14
# 3 3 7 11 15
# 4 4 8 12 16

You are assigning a vector even though you are asked to supply a list.
Try this:
R> M <- matrix(1:4,2,2)
R> M
[,1] [,2]
[1,] 1 3
[2,] 2 4
R>
Columns:
R> M1 <- M; dimnames(M1) <- list(NULL, c("a","b")); M1
a b
[1,] 1 3
[2,] 2 4
R>
Rows:
R> M2 <- M; dimnames(M2) <- list(c("A","B"), NULL); M2
[,1] [,2]
A 1 3
B 2 4
R>

In response to your comment: @DirkEddelbuettel is correct, you are assigning a vector to what should be a list.
This happens because you are assigning into the dimnames while they are still NULL (not set yet).
Consider how R evaluates the following:
x <- NULL
x[[2]][1] <- 'col1'
str(x)
## chr [1:2] NA "col1"
R returns a vector of length 2, not a list of length 2.
For your assignment to work, R would have to evaluate
x <- NULL
x[[2]][1] <- 'col1'
str(x)
to give
## List of 2
## $ : NULL
## $ : chr "col1"
That is what would happen if x had originally been defined as x <- list(NULL, NULL).
However, the dimnames must be either NULL or a list of vectors of the appropriate lengths.
The following does work (and is really @flodel's solution):
dimnames(mtx) <- list(character(nrow(mtx)), character(ncol(mtx)))
# or
# dimnames(mtx) <- lapply(dim(mtx), character)
dimnames(mtx)[[2]][1] <- 'col1'
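To see the failure and the fix side by side, here is a minimal sketch. The variable res is introduced only to capture the error; everything else uses the calls already shown above.

```r
mtx <- matrix(1:16, 4)
is.null(dimnames(mtx))   # no dimnames have been set yet

# Assigning one name while dimnames is NULL fails; try() captures the error
# and leaves mtx unchanged.
res <- try(dimnames(mtx)[[2]][1] <- "col1", silent = TRUE)
inherits(res, "try-error")

# Initialise dimnames with empty strings of the right lengths ...
dimnames(mtx) <- lapply(dim(mtx), character)

# ... after which names can be assigned one position at a time.
dimnames(mtx)[[2]][1] <- "col1"
colnames(mtx)
```

The remaining names stay as empty strings until they are assigned, so the matrix prints with blank labels in those positions.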

It seems you are allowed to set the name of the dimension without actually having any names for the dimension:
dimnames(mtx) = list(NULL,col1=NULL)
mtx
# col1
# [,1] [,2] [,3] [,4]
# [1,] 1 5 9 13
# [2,] 2 6 10 14
# [3,] 3 7 11 15
# [4,] 4 8 12 16

in R, how to retrieve a complete matrix using combn?

My problem, stripped of its specific purpose, looks like this.
How can I transform a combination as follows?
First, use combn(letters[1:4], 2) to compute the combinations:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] "a" "a" "a" "b" "b" "c"
[2,] "b" "c" "d" "c" "d" "d"
Then use each column to obtain a value, which gives another data frame:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] 1 2 3 4 5 6
Each element is obtained from the corresponding column above; for example, the first element comes from the first column.
How can I then transform this data frame into a matrix like the following result?
a b c d
a 0 1 2 3
b 1 0 4 5
c 2 4 0 6
d 3 5 6 0
Elements with the same row and column name are zero, and the others take the corresponding value from above.

Here is one way that works:
inputs <- letters[1:4]
combs <- combn(inputs, 2)
N <- seq_len(ncol(combs))
nams <- unique(as.vector(combs))
out <- matrix(ncol = length(nams), nrow = length(nams))
out[lower.tri(out)] <- N
out <- t(out)
out[lower.tri(out)] <- N
out <- t(out)
diag(out) <- 0
rownames(out) <- colnames(out) <- inputs
Which gives:
> out
a b c d
a 0 1 2 3
b 1 0 4 5
c 2 4 0 6
d 3 5 6 0
If I had to do this a lot, I'd wrap those function calls into a function.
Another option is to use as.matrix.dist() to do the conversion for us by setting up a "dist" object by hand. Using some of the objects from earlier:
## Far easier
out2 <- N
class(out2) <- "dist"
attr(out2, "Labels") <- as.character(inputs)
attr(out2, "Size") <- length(inputs)
attr(out2, "Diag") <- attr(out2, "Upper") <- FALSE
out2 <- as.matrix(out2)
Which gives:
> out2
a b c d
a 0 1 2 3
b 1 0 4 5
c 2 4 0 6
d 3 5 6 0
Again, I'd wrap this in a function if I had to do it more than once.
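As a rough idea of what such a wrapper could look like, here is a sketch. The function name pairs_to_matrix and its arguments are made up for illustration, and it uses a slightly different technique from the two approaches above: it fills the lower triangle once and adds the transpose. It relies on lower.tri() selecting cells column by column, which is the same order in which combn() lists the pairs.

```r
# Hypothetical helper: build a symmetric matrix with a zero diagonal from
# one value per pair, with pairs ordered as in combn(labels, 2).
pairs_to_matrix <- function(labels, values) {
  n <- length(labels)
  stopifnot(length(values) == n * (n - 1) / 2)
  out <- matrix(0, n, n, dimnames = list(labels, labels))
  out[lower.tri(out)] <- values   # column-major fill matches combn() order
  out + t(out)                    # mirror the lower triangle to the upper
}

inputs <- letters[1:4]
result <- pairs_to_matrix(inputs, seq_len(ncol(combn(inputs, 2))))
result
```

For the four letters used above this reproduces the same 4x4 matrix, with 1 for the a/b pair through 6 for the c/d pair.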

Does it have to be a mirror matrix with zeros over the diagonal?
combo <- combn(letters[1:4], 2)
in.combo <- matrix(1:6, nrow = 1)
combo <- rbind(combo, in.combo)
out.combo <- matrix(rep(NA, 16), ncol = 4)
colnames(out.combo) <- letters[1:4]
rownames(out.combo) <- letters[1:4]
for(cols in 1:ncol(combo)) {
vec1 <- combo[, cols]
out.combo[vec1[1], vec1[2]] <- as.numeric(vec1[3])
}
> out.combo
a b c d
a NA 1 2 3
b NA NA 4 5
c NA NA NA 6
d NA NA NA NA
