I have a list of named elements that I want to transform into a named list of those elements:
el1 = c(a = 1)
el2 = c(b = 2)
b = list(el1, el2)
What I want is something like a = list(a = 1, b = 2)
so that identical(a, b) returns TRUE
Is there a transformation I can apply to b to do this? I tried a combination of unlist/unname, but that didn't seem to get me any closer.
We can use as.list after concatenating the named vectors
b <- as.list(c(el1, el2))
b
#$a
#[1] 1
#$b
#[1] 2
identical(a, b)
#[1] TRUE
Also, if the list is already created as in the OP's post, unlist and use as.list
b <- list(el1, el2)
b <- as.list(unlist(b))
identical(a, b)
#[1] TRUE
Here is another base R option, but not as simple as the solution by akrun
> do.call(c,Map(as.list,b))
$a
[1] 1
$b
[1] 2
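One caveat worth checking, sketched here with a made-up character element that is not in the original post: unlist coerces all values to a common type, so the unlist-based approach and the do.call approach can give different results when the elements are not all of the same type.

```r
el1 <- c(a = 1)
el2 <- c(b = "x")   # hypothetical character element, for illustration only
b <- list(el1, el2)

# unlist() coerces to a common type, so the number becomes a string
via_unlist <- as.list(unlist(b))

# converting each element to a list first keeps each value's own type
via_docall <- do.call(c, Map(as.list, b))

class(via_unlist$a)  # "character"
class(via_docall$a)  # "numeric"
```

When all elements share one type, as in the question, both approaches give the same list.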
I'd like to remove elements from a list if they contain fewer than 3 values.
For this I tried:
#Create a list
my_list <- list(a = c(3,5,6), b = c(3,1,0), c = 4, d = NA)
my_list
$a
[1] 3 5 6
$b
[1] 3 1 0
$c
[1] 4
$d
[1] NA
# Then I create a function to remove the elements by my condition:
delete.F <- function(x.list){
x.list[unlist(lapply(x.list, function(x) ncol(x)) < 3)]}
delete.F(my_list)
And I get this output:
Error in unlist(lapply(x.list, function(x) ncol(x)) < 3) :
(list) object cannot be coerced to type 'double'
Any ideas, please?
An option is to create a logical expression with lengths and use that for subsetting the list
my_list[lengths(my_list) >=3]
#$a
#[1] 3 5 6
#$b
#[1] 3 1 0
Note that the example is a list of vectors, not a list of data frames. ncol/nrow only apply to objects with a dim attribute, such as a matrix or a data.frame; for a plain vector they return NULL, which is why the comparison in delete.F fails.
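As a quick check of that point, here is a minimal sketch. The name delete_short and its min_len argument are hypothetical, not from the original post; the function is just the lengths-based subsetting wrapped up in place of delete.F.

```r
my_list <- list(a = c(3, 5, 6), b = c(3, 1, 0), c = 4, d = NA)

# ncol() reads the dim attribute, which a plain vector does not have
is.null(ncol(my_list$a))     # TRUE
ncol(matrix(1:6, nrow = 2))  # 3

# Hypothetical replacement for delete.F: keep elements with at least min_len values
delete_short <- function(x_list, min_len = 3) {
  x_list[lengths(x_list) >= min_len]
}

kept <- delete_short(my_list)
names(kept)  # "a" "b"
```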
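If we want to use lapply anyway (because of some constraint), build the condition with length. Note that unlist here flattens the kept elements into a single named vector rather than returning a list: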
unlist(lapply(my_list, function(x) if(length(x) >=3 ) x))
If we need to create the index with lapply, use length (but it would be slower than lengths)
my_list[unlist(lapply(my_list, length)) >= 3]
Here are a few more options. Using Filter in base R:
Filter(function(x) length(x) >=3, my_list)
#$a
#[1] 3 5 6
#$b
#[1] 3 1 0
Or using purrr's keep and discard
purrr::keep(my_list, ~length(.) >= 3)
purrr::discard(my_list, ~length(.) < 3)
I have two data frames with the same two columns. I want to check whether the datasets are identical. The original datasets have some 700K records, but I'm trying to figure out a way to do it using dummy datasets.
I tried using compare, identical, all, all_equal, etc. None of them returns TRUE.
The dummy datasets are -
a <- data.frame(x = 1:10, b = 20:11)
c <- data.frame(x = 10:1, b = 11:20)
all(a==c)
[1] FALSE
compare(a,c)
FALSE [FALSE, FALSE]
identical(a,c)
[1] FALSE
all.equal(a,c)
[1] "Component “x”: Mean relative difference: 0.9090909" "Component “b”: Mean relative difference: 0.3225806"
The datasets are entirely the same, except for the order of the records. If these functions only work when the rows are in the same order, then I must try something else. If that is the case, can someone help me get TRUE for these two (unordered) datasets?
dplyr's setdiff works on data frames, I would suggest
library(dplyr)
nrow(setdiff(a, c)) == 0 & nrow(setdiff(c, a)) == 0
# [1] TRUE
Note that this will not account for the number of duplicate rows (i.e., if a has multiple copies of a row and c has only one copy of that row, it will still return TRUE). Not sure how you want duplicate rows handled...
If you do care about having the same number of duplicates, then I would suggest two possibilities: (a) adding an ID column to differentiate the duplicates and using the approach above, or (b) sorting, resetting the row names (annoyingly), and using identical.
(a) adding an ID column
library(dplyr)
a_id = group_by_all(a) %>% mutate(id = row_number())
c_id = group_by_all(c) %>% mutate(id = row_number())
nrow(setdiff(a_id, c_id)) == 0 & nrow(setdiff(c_id, a_id)) == 0
# [1] TRUE
(b) sorting
a_sort = a[do.call(order, a), ]
row.names(a_sort) = NULL
c_sort = c[do.call(order, c), ]
row.names(c_sort) = NULL
identical(a_sort, c_sort)
# [1] TRUE
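Option (b) can be wrapped in a small helper. This is only a sketch in base R: the name same_rows is hypothetical, and it assumes both data frames have the same columns in the same order. The columns are passed to order unnamed so that a column called, say, decreasing cannot be mistaken for an argument of order.

```r
# Hypothetical helper: TRUE if x and y hold the same rows, including duplicates,
# regardless of row order
same_rows <- function(x, y) {
  if (!identical(dim(x), dim(y))) return(FALSE)
  # sort rows by all columns, first column varying slowest
  x <- x[do.call(order, unname(as.list(x))), , drop = FALSE]
  y <- y[do.call(order, unname(as.list(y))), , drop = FALSE]
  # reset row names so that only the contents are compared
  row.names(x) <- NULL
  row.names(y) <- NULL
  identical(x, y)
}

a <- data.frame(x = 1:10, b = 20:11)
c <- data.frame(x = 10:1, b = 11:20)

same_rows(a, c)                                # TRUE
same_rows(rbind(a, a[1, ]), c)                 # FALSE: different number of rows
same_rows(rbind(a, a[1, ]), rbind(c, c[1, ]))  # FALSE: different rows duplicated
```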
Maybe a function that sorts each column before comparison is what you need, but it will be slow on large data frames.
unordered_equal <- function(X, Y, exact = FALSE){
X[] <- lapply(X, sort)
Y[] <- lapply(Y, sort)
if(exact) identical(X, Y) else all.equal(X, Y)
}
unordered_equal(a, c)
#[1] TRUE
unordered_equal(a, c, TRUE)
#[1] TRUE
a$x <- a$x + .Machine$double.eps
unordered_equal(a, c)
#[1] TRUE
unordered_equal(a, c, TRUE)
#[1] FALSE
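A point to keep in mind with this approach, illustrated with two made-up data frames p and q: each column is sorted on its own, so the pairing of values within a row is lost. The function therefore checks that every column holds the same values, not that the rows match.

```r
unordered_equal <- function(X, Y, exact = FALSE) {
  X[] <- lapply(X, sort)
  Y[] <- lapply(Y, sort)
  if (exact) identical(X, Y) else all.equal(X, Y)
}

# Hypothetical inputs: same values per column, but paired differently per row
p <- data.frame(x = c(1, 2), y = c(3, 4))   # rows (1, 3) and (2, 4)
q <- data.frame(x = c(1, 2), y = c(4, 3))   # rows (1, 4) and (2, 3)

res_loose <- unordered_equal(p, q)          # TRUE although no row of p appears in q
res_exact <- unordered_equal(p, q, TRUE)    # TRUE as well
```

If row-wise equality matters, sorting whole rows (ordering by all columns at once) avoids this.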
Basically what you want may be to compare the ordered underlying matrices.
all.equal(matrix(unlist(a[order(a[1]), ]), dim(a)),
matrix(unlist(c[order(c[1]), ]), dim(c)))
# [1] TRUE
identical(matrix(unlist(a[order(a[1]), ]), dim(a)),
matrix(unlist(c[order(c[1]), ]), dim(c)))
# [1] TRUE
You could wrap this into a function for more convenience:
om <- function(d) matrix(unlist(d[order(d[1]), ]), dim(d))
all.equal(om(a), om(c))
# [1] TRUE
You can use the new package called waldo
library(waldo)
a <- data.frame(x = 1:10, b = 20:11)
c <- data.frame(x = 10:1, b = 11:20)
compare(a,c)
And you get:
`old$x`: 1 2 3 4 5 6 7 8 9 10 and 9 more...
`new$x`: 10 ...
`old$b`: 20 19 18 17 16 15 14 13 12 11 and 9 more...
`new$b`:
I want to include a list element c in a list L in R and name it C.
The example is as follows:
a=c(1,2,3)
b=c("a","b","c")
c=rnorm(3)
L<-list(A=a,
B=b,
C=c)
print(L)
## $A
## [1] 1 2 3
##
## $B
## [1] "a" "b" "c"
##
## $C
## [1] -2.2398424 0.9561929 -0.6172520
Now I want to introduce a condition on C, so it is only included in
the list if C.bool==T:
C.bool<-T
L<-list(A=a,
B=b,
if(C.bool) C=c)
print(L)
## $A
## [1] 1 2 3
##
## $B
## [1] "a" "b" "c"
##
## [[3]]
## [1] -2.2398424 0.9561929 -0.6172520
Now, however, the list element of c is not being named as specified in
the list statement. What's the trick here?
Edit: The intention is to include the element in the list only if the condition is met (no NULL should be included otherwise). Can this be done within the core definition of the list?
I don't know why you want to do it "without adding C outside the core definition of the list?" but if you're content with two lists in a single c then:
L <- c(list(A=a, B=b), if(C.bool) list(C=c))
If you really want one list but don't mind subsetting after creation then
L <- list(A=a, B=b, C=if(C.bool) c)[c(TRUE, TRUE, C.bool)]
(pace David Arenburg, isTRUE() omitted for brevity)
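The first variant works because if without an else returns NULL when the condition is FALSE, and c() drops NULL arguments. A minimal sketch, with a hypothetical wrapper build_list so both settings of the flag can be compared:

```r
a <- c(1, 2, 3)
b <- c("a", "b", "c")
c <- rnorm(3)

# Hypothetical wrapper around the one-liner above
build_list <- function(include_c) {
  c(list(A = a, B = b), if (include_c) list(C = c))
}

with_c    <- build_list(TRUE)
without_c <- build_list(FALSE)

names(with_c)     # "A" "B" "C"
names(without_c)  # "A" "B"
```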
You can try this if you want to keep the names:
L2 <-list(A=a,
B=b,
C = if (TRUE) c)
You can of course replace TRUE with the statement containing C.bool
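Be aware of what happens when the condition is FALSE, shown in this short sketch: the name C is kept, but its value is NULL, so the list still has three elements.

```r
a <- c(1, 2, 3)
b <- c("a", "b", "c")
c <- rnorm(3)

# Same construction as above, with the condition set to FALSE
L2 <- list(A = a, B = b, C = if (FALSE) c)

length(L2)     # 3
names(L2)      # "A" "B" "C"
is.null(L2$C)  # TRUE
```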
You could place the if statement outside the core definition of the list, like this:
L <- list(A = a, B= b)
if (isTRUE(C.bool)) L$C <- c
#> L
#$A
#[1] 1 2 3
#
#$B
#[1] "a" "b" "c"
#
#$C
#[1] -0.7631459 0.7353929 -0.2085646
(Edit with isTRUE() owing to the comment by @DavidArenburg)
As a combination of the previous answers by @MamounBenghezal and @user20637
and the comment made by @DavidArenburg, I would suggest this generalized
version that does not depend on the length of the list:
L <- Filter(Negate(is.null),
x = list(A = a, B = b, C = if (isTRUE(C.bool)) c, D = "foo"))
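To see how this behaves for different values of the flag, here is a small sketch with a hypothetical wrapper build_filtered. It also shows why isTRUE is useful: an NA flag is treated as FALSE, whereas a bare if (NA) would raise an error.

```r
a <- c(1, 2, 3)
b <- c("a", "b", "c")
c <- rnorm(3)

# Hypothetical wrapper around the Filter() call above
build_filtered <- function(flag) {
  Filter(Negate(is.null),
         x = list(A = a, B = b, C = if (isTRUE(flag)) c, D = "foo"))
}

names(build_filtered(TRUE))   # "A" "B" "C" "D"
names(build_filtered(FALSE))  # "A" "B" "D"
names(build_filtered(NA))     # "A" "B" "D"
```

Note that Filter removes every NULL element, including any that were put in the list on purpose.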
Suppose I start with a list A that contains lists N and M, each with the same number of elements:
A = list( N=list( a=c(1,1), b=c(2,2)),
M=list( a=c(1,1), b=c(2,2)) )
I would do the following to combine each of the elements from the lists N and M into new lists
B = mapply( FUN=list, A[[1]], A[[2]], SIMPLIFY=FALSE )
to get
>B
$a
$a[[1]]
[1] 1 1
$a[[2]]
[1] 1 1
$b
$b[[1]]
[1] 2 2
$b[[2]]
[1] 2 2
How can I do the same thing as above if I don't know beforehand the number of lists that the list A will have?
There is probably a better solution to your problem, but if you literally want to "do the same thing, but without knowing size of A", you can do the following:
do.call(function(...) mapply(..., FUN = list, SIMPLIFY = FALSE), A)
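A quick sketch of the same call on a list with three sublists. A3 and its values are made up for illustration; the call itself is unchanged.

```r
# Hypothetical input with three sublists instead of two
A3 <- list(N = list(a = c(1, 1), b = c(2, 2)),
           M = list(a = c(3, 3), b = c(4, 4)),
           P = list(a = c(5, 5), b = c(6, 6)))

# do.call() spreads the sublists of A3 out as separate arguments to mapply()
B3 <- do.call(function(...) mapply(..., FUN = list, SIMPLIFY = FALSE), A3)

names(B3)      # "a" "b"
length(B3$a)   # 3, one entry per sublist of A3
```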
Try using tapply. First flatten A by one level and build a grouping index, then group:
U <- unlist(A, recursive=FALSE)
inds <- rep(seq_along(A[[1]]), length(A))
tapply(U, inds, list)
I have two lists with named elements:
a <- list(a=1, b=2)
b <- list(b=3, c=4)
I want to combine these lists, so that any elements in a that have the same names will be overwritten by the list b, so I get this out:
list(a=1, b=3, c=4)
I know I could do this in a loop, but is there a more compact way of doing this in R?
R has a built-in function for that, modifyList:
modifyList(a, b)
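A short sketch of the result, plus two behaviours of modifyList that may matter for other inputs (the extra lists below are made up for illustration): a NULL in the second list removes that element, and nested lists are merged recursively rather than replaced wholesale.

```r
a <- list(a = 1, b = 2)
b <- list(b = 3, c = 4)

merged <- modifyList(a, b)                 # list(a = 1, b = 3, c = 4)

# A NULL value in the second list deletes the element from the first
dropped <- modifyList(a, list(b = NULL))   # list(a = 1)

# Nested lists are merged element by element
nested <- modifyList(list(p = list(x = 1, y = 2)),
                     list(p = list(y = 3)))  # list(p = list(x = 1, y = 3))
```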
Here's a simple solution:
# create new list
newlist <- c(a,b)
# remove list element(s)
newlist[!duplicated(names(newlist), fromLast = TRUE)]
The result:
$a
[1] 1
$b
[1] 3
$c
[1] 4
An even simpler solution with setdiff:
c(a[setdiff(names(a), names(b))], b)
$a
[1] 1
$b
[1] 3
$c
[1] 4
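The approaches agree on the example above, but the order of the elements can differ for other inputs. A sketch with hypothetical lists x and y, where the shared name comes first in x:

```r
x <- list(b = 2, a = 1)   # hypothetical input, shared name "b" listed first
y <- list(b = 3, c = 4)

# modifyList() overwrites in place, so "b" keeps its position in x
by_modify <- modifyList(x, y)

# the setdiff() and duplicated() versions move the overwritten element
# to its position in y
by_setdiff <- c(x[setdiff(names(x), names(y))], y)
combined   <- c(x, y)
by_dupes   <- combined[!duplicated(names(combined), fromLast = TRUE)]

names(by_modify)   # "b" "a" "c"
names(by_setdiff)  # "a" "b" "c"
names(by_dupes)    # "a" "b" "c"
```

All three hold the same values; only the ordering differs, which matters if the result is later compared with identical.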