Assume I have a sparse m by n binary matrix, and I already use a row-indexed list to represent the ones. For example, the following 3 by 3 matrix
[,1] [,2] [,3]
[1,] 1 1 0
[2,] 0 1 0
[3,] 0 0 1
is represented by a list M_row:
> M_row
[[1]]
[1] 1 2
[[2]]
[1] 2
[[3]]
[1] 3
Here the i-th element in the list corresponds to the positions of ones in the i-th row.
I want to convert this list to a column-indexed list, where the j-th element in the new list corresponds to the (row) positions of ones in the j-th column. For the previous example, I want:
> M_col
[[1]]
[1] 1
[[2]]
[1] 1 2
[[3]]
[1] 3
Is there an efficient way to do this without writing many loops?
Try this
M_row <- list(1:2 , 2, 3) # this is the beginning list
#----------------------------------
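n_col <- max(unlist(M_row)) # number of columns = largest column index (the matrix need not be square)
m <- matrix(0 , length(M_row) , n_col)
for(i in seq_len(nrow(m))) {
  m[ i , M_row[[i]]] <- 1
}
# lapply always returns a list; apply() may simplify to a vector or matrix
M_col <- lapply(seq_len(ncol(m)) , \(j) which(m[ , j] == 1))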
#----------------------------------
M_col # this is the required list
#> [[1]]
#> [1] 1
#>
#> [[2]]
#> [1] 1 2
#>
#> [[3]]
#> [1] 3
Here is an algorithm that doesn't create the matrix.
Get the number of columns with sapply/max and create a results list M_col of the required length;
for each input list member, update M_col by appending the row number to it.
M_row <- list(1:2 , 2, 3)
Max_col <- max(sapply(M_row, max))
M_col <- vector("list", length = Max_col)
for(i in seq_along(M_row)) {
for(j in M_row[[i]]) {
M_col[[j]] <- c(M_col[[j]], i)
}
}
M_col
#> [[1]]
#> [1] 1
#>
#> [[2]]
#> [1] 1 2
#>
#> [[3]]
#> [1] 3
Created on 2022-06-19 by the reprex package (v2.0.1)
You could use stack + unstack:
M_row <- list(1:2 , 2, 3) # this is the beginning list
d <- type.convert(stack(setNames(M_row, seq_along(M_row))), as.is = TRUE)
d
values ind
1 1 1
2 2 1
3 2 2
4 3 3
d holds the (column, row) pairs of the ones: values is the column position of each one, while ind is the index of the row it came from.
Column-wise:
unstack(d, ind~values)
$`1`
[1] 1
$`2`
[1] 1 2
$`3`
[1] 3
Rowwise:
unstack(d, values~ind)
$`1`
[1] 1 2
$`2`
[1] 2
$`3`
[1] 3
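A loop-free base R variant of the same idea, as a minimal sketch: record the row index of every one, then group those row indices by column with split. It assumes the number of columns equals the largest column index that appears in M_row (trailing all-zero columns cannot be recovered from the list alone); the names n_col, row_id and col_id are only illustrative.

```r
# Row-indexed input: element i holds the column positions of the ones in row i
M_row <- list(1:2, 2, 3)

# Assumption: the number of columns equals the largest column index present
n_col <- max(unlist(M_row))

# Row index of every one, repeated once per entry in that row
row_id <- rep(seq_along(M_row), lengths(M_row))

# Column index of every one, as a factor so that empty columns keep a slot
col_id <- factor(unlist(M_row), levels = seq_len(n_col))

# Group the row indices by column and drop the names added by split()
M_col <- unname(split(row_id, col_id))
M_col
```

Because the factor levels are fixed up front, a column without any ones ends up as an empty integer vector rather than being dropped.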
I am pretty new to R. I have been given a vector of positive integers like
index <- 1:3
and I want to use this vector to find all possible combinations of its elements without repetition, which I do like this:
for (i in 1:length(index)) {
combn(index,i)
j = 1
while (j <= nrow(t(combn(index,i)))) {
print(t(combn(index,i))[j,])
j = j + 1
}
}
This gives me output as
[1] 1
[1] 2
[1] 3
[1] 1 2
[1] 1 3
[1] 2 3
[1] 1 2 3
But when I create a list comb <- list() and try to append each output as below:
for (i in 1:length(index)) {
combn(index,i)
j = 1
while (j <= nrow(t(combn(index,i)))) {
append(comb, t(combn(index,i))[j,])
j = j + 1
}
}
The problem is that I get an empty list when I call
comb
list()
I wish to create a list with those elements and use them to retrieve those index rows from a data frame. Do you have any idea how I can achieve this? Any help is welcome. Thanks!
We can use unlist + lapply like below
unlist(
lapply(
seq_along(index),
combn,
x = index,
simplify = FALSE
),
recursive = FALSE
)
which gives
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
[[4]]
[1] 1 2
[[5]]
[1] 1 3
[[6]]
[1] 2 3
[[7]]
[1] 1 2 3
This seems to give what you want
index <- 1:3
comb <- list()
for (i in 1:length(index)) {
combn(index,i)
j = 1
while (j <= nrow(t(combn(index,i)))) {
comb <- c(comb, list(t(combn(index,i))[j,]))
j = j + 1
}
}
comb
Output
[[1]]
[1] 1
[[2]]
[1] 2
[[3]]
[1] 3
[[4]]
[1] 1 2
[[5]]
[1] 1 3
[[6]]
[1] 2 3
[[7]]
[1] 1 2 3
Note that you have to assign the appended list back to comb, because append returns a new list rather than modifying its argument. Also, if you append a bare vector to a list, each element of the vector becomes a separate element of the new list, so you have to wrap the vector in list() to append it as a single element.
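The two points in the note can be checked in isolation with a small sketch (the variable names are only illustrative):

```r
comb <- list()

# The result of append() is discarded here, so comb stays empty
append(comb, list(1:2))
n_unassigned <- length(comb)

# Assigning the result back keeps the new element
comb <- append(comb, list(1:2))
n_wrapped <- length(comb)

# A bare vector is spliced in element by element: 3 and 4 become two entries
comb <- append(comb, 3:4)
n_bare <- length(comb)

c(n_unassigned, n_wrapped, n_bare)
```

Wrapping the vector as list(3:4) instead would have added a single element.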
I want to match a number within a list containing vectors of different lengths. However, my solution (below) doesn't match anything beyond the first item of each vector.
seq_ <- seq(1:10)
list_ <- list(seq_[1:3], seq_[4:7], seq_[8:10])
list_
# [[1]]
# [1] 1 2 3
#
# [[2]]
# [1] 4 5 6 7
#
# [[3]]
# [1] 8 9 10
but
for (i in seq_) {
print(match(i,list_))
}
# [1] 1
# [1] NA
# [1] NA
# [1] 3
# [1] NA
# [1] NA
# [1] NA
# [1] NA
# [1] NA
# [1] NA
EDIT: rewrote to show the full looping over values.
In the general case, you will probably be happier with which, as in:
seq_ <- seq(1:10)
list_ <- list(seq_[1:3], seq_[4:7], seq_[8:10])
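matchlist <- vector("list", length(list_)) # pre-allocate one slot per list element
for (j in seq_along(list_)) {
  # positions within list_[[j]] at which the values of seq_ occur
  matchlist[[j]] <- unlist(lapply(seq_, function(k) which(list_[[j]] == k)))
}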
That will return the locations of all matches. It's probably more clear what's happening if you create an input like my.list <- list(sample(1:10,4,replace=TRUE), sample(1:10,7,replace=TRUE))
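If what you need is the index of the list element that contains each number, rather than the position inside that element, one possible sketch is to test membership with %in% for every element. The name containing is just an illustration, and when a value occurs in several elements only the first is kept.

```r
seq_ <- 1:10
list_ <- list(seq_[1:3], seq_[4:7], seq_[8:10])

# For each value k, find the first list element whose vector contains k;
# which(...)[1] yields NA when no element contains k
containing <- vapply(seq_, function(k) {
  which(vapply(list_, function(v) k %in% v, logical(1)))[1]
}, integer(1))

containing
```

Here the values 1 to 3 map to element 1, 4 to 7 to element 2, and 8 to 10 to element 3.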
I am trying out the lapply function:
lapply(1:3, function(i) print(i))
# [1] 1
# [1] 2
# [1] 3
# [[1]]
# [1] 1
# [[2]]
# [1] 2
# [[3]]
# [1] 3
I understand that lapply should apply print(i) to each element i of 1:3.
But why does the output look like this?
Also, when I use unlist, I get the following output:
unlist(lapply(1:3, function(i) print(i)))
# [1] 1
# [1] 2
# [1] 3
# [1] 1 2 3
The description of lapply function is the following:
"lapply returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X."
Your example:
lapply(1:3, function(x) print(x))
Prints the object x and returns a list of length 3.
str(lapply(1:3, function(x) print(x)))
# [1] 1
# [1] 2
# [1] 3
# List of 3
# $ : int 1
# $ : int 2
# $ : int 3
There are a few ways to avoid this as mentioned in the comments:
1) Using invisible
lapply(1:3, function(x) invisible(x))
# [[1]]
# [1] 1
# [[2]]
# [1] 2
# [[3]]
# [1] 3
unlist(lapply(1:3, function(x) invisible(x)))
# [1] 1 2 3
2) Without explicitly printing inside the function
unlist(lapply(1:3, function(x) x))
# [1] 1 2 3
3) Assigning the list to an object:
l1 <- lapply(1:3, function(x) print(x))
unlist(l1)
# [1] 1 2 3
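To see the two sources of output separately, here is a small sketch that captures what print writes while keeping the list that lapply returns (the names shown and res are only illustrative):

```r
# print() shows its argument and also returns it (invisibly),
# so lapply() collects those return values into a list.
shown <- capture.output(res <- lapply(1:3, function(i) print(i)))

shown   # the three lines written by print()
res     # the list that lapply() returned

# If only the side effect is wanted, hide the returned list:
invisible(lapply(1:3, function(i) print(i)))
```

At the console, the first three lines of the original output come from print and the rest from the auto-printing of the returned list.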
Is there a way to create a "dictionary" in R, such that it has pairs?
Something to the effect of:
x=dictionary(c("Hi","Why","water") , c(1,5,4))
x["Why"]=5
I'm asking because I am actually looking for a function of two categorical variables.
So that if x=dictionary(c("a","b"),c(5,2))
x val
1 a 5
2 b 2
I want to compute x1^2+x2 on all combinations of x keys
x1 x2 val1 val2 x1^2+x2
1 a a 5 5 30
2 b a 2 5 9
3 a b 5 2 27
4 b b 2 2 6
And then I want to be able to retrieve the result using x1 and x2. Something to the effect of:
get_result["b","a"] = 9
What is the best, most efficient way to do this?
I know three R packages for dictionaries: hash, hashmap, and dict.
Update July 2018: a new one, container.
Update September 2018: a new one, collections
hash
Keys must be character strings. A value can be any R object.
library(hash)
## hash-2.2.6 provided by Decision Patterns
h <- hash()
# set values
h[["1"]] <- 42
h[["foo"]] <- "bar"
h[["4"]] <- list(a=1, b=2)
# get values
h[["1"]]
## [1] 42
h[["4"]]
## $a
## [1] 1
##
## $b
## [1] 2
h[c("1", "foo")]
## <hash> containing 2 key-value pair(s).
## 1 : 42
## foo : bar
h[["key not here"]]
## NULL
To get keys:
keys(h)
## [1] "1" "4" "foo"
To get values:
values(h)
## $`1`
## [1] 42
##
## $`4`
## $`4`$a
## [1] 1
##
## $`4`$b
## [1] 2
##
##
## $foo
## [1] "bar"
The print instance:
h
## <hash> containing 3 key-value pair(s).
## 1 : 42
## 4 : 1 2
## foo : bar
The values function accepts the arguments of sapply:
values(h, USE.NAMES=FALSE)
## [[1]]
## [1] 42
##
## [[2]]
## [[2]]$a
## [1] 1
##
## [[2]]$b
## [1] 2
##
##
## [[3]]
## [1] "bar"
values(h, keys="4")
## 4
## a 1
## b 2
values(h, keys="4", simplify=FALSE)
## $`4`
## $`4`$a
## [1] 1
##
## $`4`$b
## [1] 2
hashmap
See https://cran.r-project.org/web/packages/hashmap/README.html.
hashmap does not offer the flexibility to store arbitrary types of objects.
Keys and values are restricted to "scalar" objects (length-one character, numeric, etc.). The values must be of the same type.
library(hashmap)
H <- hashmap(c("a", "b"), rnorm(2))
H[["a"]]
## [1] 0.1549271
H[[c("a","b")]]
## [1] 0.1549271 -0.1222048
H[[1]] <- 9
Beautiful print instance:
H
## ## (character) => (numeric)
## ## [1] => [+9.000000]
## ## [b] => [-0.122205]
## ## [a] => [+0.154927]
Errors:
H[[2]] <- "Z"
## Error in x$`[[<-`(i, value): Not compatible with requested type: [type=character; target=double].
H[[2]] <- c(1,3)
## Warning in x$`[[<-`(i, value): length(keys) != length(values)!
dict
Currently available only on Github: https://github.com/mkuhn/dict
Strengths: arbitrary keys and values, and fast.
library(dict)
d <- dict()
d[[1]] <- 42
d[[c(2, 3)]] <- "Hello!" # c(2,3) is the key
d[["foo"]] <- "bar"
d[[4]] <- list(a=1, b=2)
d[[1]]
## [1] 42
d[[c(2, 3)]]
## [1] "Hello!"
d[[4]]
## $a
## [1] 1
##
## $b
## [1] 2
Accessing a non-existent key throws an error:
d[["not here"]]
## Error in d$get_or_stop(key): Key error: [1] "not here"
But there is a nice feature to deal with that:
d$get("not here", "default value for missing key")
## [1] "default value for missing key"
Get keys:
d$keys()
## [[1]]
## [1] 4
##
## [[2]]
## [1] 1
##
## [[3]]
## [1] 2 3
##
## [[4]]
## [1] "foo"
Get values:
d$values()
## [[1]]
## [1] 42
##
## [[2]]
## [1] "Hello!"
##
## [[3]]
## [1] "bar"
##
## [[4]]
## [[4]]$a
## [1] 1
##
## [[4]]$b
## [1] 2
Get items:
d$items()
## [[1]]
## [[1]]$key
## [1] 4
##
## [[1]]$value
## [[1]]$value$a
## [1] 1
##
## [[1]]$value$b
## [1] 2
##
##
##
## [[2]]
## [[2]]$key
## [1] 1
##
## [[2]]$value
## [1] 42
##
##
## [[3]]
## [[3]]$key
## [1] 2 3
##
## [[3]]$value
## [1] "Hello!"
##
##
## [[4]]
## [[4]]$key
## [1] "foo"
##
## [[4]]$value
## [1] "bar"
No print instance.
The package also provides the function numvecdict to deal with a dictionary in which numbers and strings (including vectors of each) can be used as keys, and that can only store vectors of numbers.
You simply create a vector with your key value pairs.
animal_sounds <- c(
'cat' = 'meow',
'dog' = 'woof',
'cow' = 'moo'
)
print(animal_sounds[['cat']]) # [[ ]] returns the value without its name
# [1] "meow"
Update: To answer the 2nd portion of the question, you can create a dataframe and compute the values like this:
val1 <- c(5,2,5,2) # Create val1 column
val2 <- c(5,5,2,2) # Create val2 column
df <- data.frame(val1, val2) # create dataframe variable
df['x1^2+x2'] <- val1^2 + val2 # create expression column
Output:
val1 val2 x1^2+x2
1 5 5 30
2 2 5 9
3 5 2 27
4 2 2 6
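Instead of typing the val1 and val2 columns by hand, the combinations could also be generated from the keys. This is only a sketch, assuming the dictionary is stored as a named vector x and using expand.grid to build all key pairs; the column name result is illustrative.

```r
# "Dictionary" as a named numeric vector
x <- c(a = 5, b = 2)

# All combinations of keys; the first column varies fastest
grid <- expand.grid(x1 = names(x), x2 = names(x), stringsAsFactors = FALSE)

# Look up the value for each key, then compute x1^2 + x2
grid$val1 <- unname(x[grid$x1])
grid$val2 <- unname(x[grid$x2])
grid$result <- grid$val1^2 + grid$val2

# Retrieve the result for the key pair ("b", "a")
res_ba <- grid$result[grid$x1 == "b" & grid$x2 == "a"]
res_ba
```

The lookup by key pair is a row filter here, so for many lookups a named matrix is likely to be more convenient.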
You can use just data.frame and row.names to do this:
x=data.frame(row.names=c("Hi","Why","water") , val=c(1,5,4))
x["Why",]
[1] 5
Since vectors, matrices, lists, etc. behave as "dictionaries" in R, you can do something like the following:
> (x <- structure(c(5,2),names=c("a","b"))) ## "dictionary"
a b
5 2
> (result <- outer(x,x,function(x1,x2) x1^2+x2))
a b
a 30 27
b 9 6
> result["b","a"]
[1] 9
If you wanted a table as you've shown in your example, just reshape your array...
> library(reshape)
> (dfr <- melt(result,varnames=c("x1","x2")))
x1 x2 value
1 a a 30
2 b a 9
3 a b 27
4 b b 6
> transform(dfr,val1=x[x1],val2=x[x2])
x1 x2 value val1 val2
1 a a 30 5 5
2 b a 9 2 5
3 a b 27 5 2
4 b b 6 2 2
See my answer to a very recent question. In essence, you use environments for this type of functionality.
For the higher-dimensional case, you may be better off using an array (two-dimensional here) if you want easy syntax for retrieving the result (you can name the rows and columns). As an alternative, you can paste the two keys together with a separator that doesn't occur in either of them, and then use that as a unique identifier.
To be specific, something like this:
tmp<-data.frame(x=c("a", "b"), val=c(5,2))
tmp2<-outer(seq_len(nrow(tmp)), seq_len(nrow(tmp)), function(lhs, rhs){tmp$val[lhs]^2 + tmp$val[rhs]}) # x1^2 + x2, as in the question
dimnames(tmp2)<-list(tmp$x, tmp$x)
tmp2
tmp2["a", "b"]
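For the environment-based approach mentioned at the start of this answer, a minimal base R sketch could look like the following. The variable name dict and the keys are taken from the question or made up for illustration; this is not the code of the linked answer.

```r
# An environment used as a mutable key-value store
dict <- new.env(hash = TRUE)

# Set values, either with assign() or with [[<-
assign("Hi", 1, envir = dict)
dict[["Why"]] <- 5
dict[["water"]] <- 4

# Get a single value by key
why_val <- dict[["Why"]]

# Check for a key without searching enclosing environments
has_milk <- exists("milk", envir = dict, inherits = FALSE)

# List the keys and fetch several values at once
all_keys <- sort(ls(dict))
some_vals <- mget(c("Hi", "water"), envir = dict)
```

Keys must be character strings here, and environments are modified in place, unlike lists and vectors.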
Using tidyverse
Adding an answer using more recent tidyverse approaches.
There are probably cleaner ways of handling the crossing (which creates all combinations) and unnesting, but this is a quick and dirty approach.
library(tidyverse)
my_tbl <- tibble(x = c("A", "B"), val=c(5,2)) %>%
crossing(x1 = ., x2 = .) %>% # Create all combinations
unnest_wider(everything(), names_sep="_") %>% # Unpack into distinct columns
mutate(result = x1_val^2 + x2_val) # Calculate result
# Access result by accessing the row in the data frame
my_tbl %>%
filter(x1_x == "A", x2_x == "B") %>%
pull(result)
#> [1] 27
# Convert tibble to a named vector that could be accessed more easily.
# However, this is limited to string names.
my_named_vector <- my_tbl %>%
transmute(name = str_c(x1_x, "_", x2_x), value=result) %>%
deframe()
my_named_vector[["A_B"]]
#> [1] 27
Created on 2022-04-06 by the reprex package (v2.0.1)
tibble version 3.1.6
dplyr version 1.0.8
tidyr version 1.2.0
stringr version 1.4.0