I have a list
ls<-list(c("a"="one","b"="two"),"x"="t4",c("y"="t5","z"="t6"))
I would like to extract the list elements by names rather than indexing. Is there a way to do it?
As in
ls["a"]
> "one"
ls["y"]
> "t5"
I want only the output "one" and "t5". I will use these outputs either to combine them with another string or to perform arithmetic (if the outputs are numbers) with other variables.
I found a similar question asked before, R: get element by name from a nested list.
But it doesn't work for this case. Any thoughts?
With plyr:
plyr::llply(ls, function(x) x["a"])
or, dropping the NA entries:
Filter(Negate(is.na), plyr::llply(ls, function(x) x["y"]))
[[1]]
y
"t5"
You can automate it by making it a function.
An attempt at automating the process (might be slow):
purrr::map(c("a", "y"),
  function(x) lapply(ls, function(z) z[x]))
The following might be sufficient in your specific case, given that the component names are unique (otherwise there is an identifiability issue).
## data
ls <- list(c(a = "one", b = "two"), x = "t4", list(c(y = "t5", z = "t6")))
getElement <- function(ls, name) unlist(ls)[[grep(name, names(unlist(ls)))]]
getElement(ls, "a")
#> [1] "one"
getElement(ls, "b")
#> [1] "two"
getElement(ls, "x")
#> [1] "t4"
getElement(ls, "y")
#> [1] "t5"
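The grep()-based helper above treats the name as a regular expression, so a name such as "a" would also match any other name that contains an "a". A minimal sketch of an exact-match variant follows; the function name get_by_name and the variable demo are hypothetical, chosen here to avoid masking base::getElement and base::ls. Note that unlist() prefixes inner names with the outer name when the outer element is itself named (for example "p.a"), so this only works when the flattened names are the ones you expect.

```r
# Hypothetical helper: look up a single element by its exact flattened name.
get_by_name <- function(lst, name) {
  flat <- unlist(lst)                    # flatten; inner names are kept
  hits <- which(names(flat) == name)     # exact comparison, no regex
  if (length(hits) != 1L) {
    stop("expected exactly one element named '", name, "', found ", length(hits))
  }
  flat[[hits]]                           # [[ drops the name
}

demo <- list(c(a = "one", b = "two"), x = "t4", list(c(y = "t5", z = "t6")))
get_by_name(demo, "a")
get_by_name(demo, "y")
```

Failing loudly on zero or multiple matches is a design choice for this sketch; it makes the identifiability issue mentioned above visible instead of silently returning the first hit.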
We can just unlist the list and use the [[ operator, which returns an unnamed one-element vector:
unlist(ls)[["a"]]
# [1] "one"
unlist(ls)[["y"]]
# [1] "t5"
If we want to keep the name, use [:
unlist(ls)["a"]
# a
# "one"
unlist(ls)["y"]
# y
# "t5"
Related
I am doing a fuzzy name matching exercise and am trying to reduce the number of spelling variations of the same name using tidystringdist. I end up with a data frame of matches containing two vectors: one has the original value and the other has the value it needs to be changed into. So I need to go back to the original vector of names and change them based on the data frame of match values. Normally this would be easy: left_join() on the original names and done. But my original names can have anywhere from 1 to 4 values in them (multiple owners on properties), so the values to be changed are actually a list of vectors. Here is a reprex of what I have done so far:
library(dplyr)
data_to_change <- data.frame(house_number = c(1,2,3),
animal = rbind(c("dog|cat|monkey"),
c("goldfish"),
c("mouse|dog|rabbit|squirrel"))) %>%
mutate(animal_split = strsplit(animal, "[|]"))
new_names <- data.frame(cbind(V1 = c("dog", "rabbit"),
V2 = c("doggy", "bunny")))
The original data looks like this:
[[1]]
[1] "dog" "cat" "monkey"
[[2]]
[1] "goldfish"
[[3]]
[1] "mouse" "dog" "rabbit" "squirrel"
And I would like to change the animal names so the result looks like this:
[[1]]
[1] "doggy" "cat" "monkey"
[[2]]
[1] "goldfish"
[[3]]
[1] "mouse" "doggy" "bunny" "squirrel"
I don't believe I can simply use replace, because the target and match df list are of different lengths. And I don't think I can unlist it and change it because I need to preserve the association with the house number and other animals in the house.
You can use lapply() to iterate over your list, and use stringi::stri_replace_all_fixed() to replace the text.
library(stringi)
data_to_change$animal_split <- lapply(data_to_change$animal_split, stri_replace_all_fixed, new_names$V1, new_names$V2, vectorize_all = FALSE)
data_to_change$animal_split
[[1]]
[1] "doggy" "cat" "monkey"
[[2]]
[1] "goldfish"
[[3]]
[1] "mouse" "doggy" "bunny" "squirrel"
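One caveat with a fixed-pattern replacement is that it replaces substrings, so a value such as "dogfish" would become "doggyfish". If only whole elements should be swapped, a base R lookup with match() is one option. This is a minimal sketch; the helper name replace_whole and the standalone animal_split list are hypothetical stand-ins for the column in the question.

```r
# Hypothetical helper: replace elements of x that are exactly equal to an
# entry of `from` with the corresponding entry of `to`; leave the rest alone.
replace_whole <- function(x, from, to) {
  idx <- match(x, from)        # position in `from`, NA when there is no match
  hit <- !is.na(idx)
  x[hit] <- to[idx[hit]]
  x
}

animal_split <- list(
  c("dog", "cat", "monkey"),
  "goldfish",
  c("mouse", "dog", "rabbit", "squirrel")
)
from <- c("dog", "rabbit")
to   <- c("doggy", "bunny")

result <- lapply(animal_split, replace_whole, from = from, to = to)
result
```

Because lapply() returns a list of the same length and order, the association with house_number is preserved when the result is assigned back to the list column.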
As these are fixed matches, we can use deframe() to convert the data.frame into a named vector (names from V1, values from V2). We then loop over the list with map(), look up each element by name, and coalesce() the result with the original vector so that the NAs produced for unmatched elements are replaced by the original values.
library(dplyr)
library(tibble)
library(purrr)
data_to_change %>%
mutate(animal_split = map(animal_split,
~ coalesce(deframe(new_names)[.x], .x)))
Output:
  house_number                    animal                  animal_split
1            1            dog|cat|monkey            doggy, cat, monkey
2            2                  goldfish                      goldfish
3            3 mouse|dog|rabbit|squirrel mouse, doggy, bunny, squirrel
I want to look up indexes of variables in a data.frame given a chain of (partial) variable names. An example:
df <- data.frame(var = c("az","bz","cz"), stringsAsFactors = FALSE)
Now I have a chain given as:
v <- c("a > b")
I now want to find the corresponding variable names in the data.frame, in the order given by the chain.
I do this with:
df$var[grep(paste(trimws(unlist(strsplit(v, ">"))), collapse = "|"), df$var)]
[1] "az" "bz"
This works in the first example. For the second example this fails:
v <- c("b > a")
df$var[grep(paste(trimws(unlist(strsplit(v, ">"))), collapse = "|"), df$var)]
[1] "az" "bz"
It returns [1] "az" "bz", whereas I expect [1] "bz" "az".
How can I achieve this?
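Collapsing the parts into a single regex (b|a) makes grep() return the matches in the order in which they appear in df$var, so the order of the chain is lost. If you instead keep the parts as the vector produced by strsplit() and call grep() once per part, the matches come back in chain order, i.e. as the indices c(2, 1):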
df$var[sapply(trimws(unlist(strsplit(v, ">"))), function(i)grep(i, df$var))]
#[1] "bz" "az"
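The per-part lookup can be wrapped in a small function. This is a minimal sketch, not the answerer's code: the name lookup_chain is hypothetical, and it assumes the chain parts should be matched as literal substrings (fixed = TRUE) rather than as regular expressions. Using unlist(lapply(...)) instead of sapply() keeps the result a plain character vector even when a part matches several names or none.

```r
# Hypothetical helper: return the candidate names matching each part of a
# "a > b" style chain, in the order the parts appear in the chain.
lookup_chain <- function(chain, candidates, sep = ">") {
  parts <- trimws(strsplit(chain, sep, fixed = TRUE)[[1]])
  matches <- lapply(parts, function(p) {
    grep(p, candidates, fixed = TRUE, value = TRUE)  # literal substring match
  })
  unlist(matches, use.names = FALSE)
}

vars <- c("az", "bz", "cz")
lookup_chain("a > b", vars)
lookup_chain("b > a", vars)
```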
I want to check if two nested lists have the same names at the last level.
If unlist gave an option not to concatenate names this would be trivial. However, it looks like I need some function leaf.names():
X <- list(list(a = pi, b = list(alpha.c = 1:5, 7.12)), d = "a test")
leaf.names(X)
[1] "a" "alpha.c" "NA" "d"
I want to avoid any inelegant grepping if possible. I feel like there should be some easy way to do this with rapply or unlist...
leaf.names <- function(X) names(rlang::squash(X))
or
leaf.names <- function(X){
while(any(sapply(X, is.list))) X <- purrr::flatten(X)
names(X)
}
gives
leaf.names(X)
# [1] "a" "alpha.c" "" "d"
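For comparison, the same result can be obtained without extra packages by recursing over the list. This is a minimal base R sketch under the assumption that a leaf is any element that is not itself a list; the function name leaf_names is hypothetical. Unnamed leaves are reported as empty strings, as in the output above.

```r
# Hypothetical helper: collect the names of all non-list leaves of a nested
# list, without the prefixes that unlist() would add.
leaf_names <- function(x) {
  nms <- names(x)
  if (is.null(nms)) nms <- rep("", length(x))   # fully unnamed level
  per_element <- Map(function(el, nm) {
    if (is.list(el)) leaf_names(el) else nm     # recurse into sub-lists
  }, x, nms)
  unlist(per_element, use.names = FALSE)
}

X <- list(list(a = pi, b = list(alpha.c = 1:5, 7.12)), d = "a test")
leaf_names(X)
```

Two nested lists can then be compared with identical(leaf_names(A), leaf_names(B)), which is order-sensitive; wrap both sides in sort() if order should not matter.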
I have a variable that is a vector containing row names. I want to take the union of this vector with the row names of another matrix, but when I do this it does not work properly: it basically puts everything together and does not remove the duplicates.
Here is my attempt:
Step 1: put the names in a vector, which I read from a list of matrices:
name<-c()
name<-lapply(ismr0, function(x){
name<-union(name, rownames(x))
return(name)
})
> length(name)
[1] 733
>
Step 2, which does not work properly:
rn <- union(rownames(ismr0[[1]]), name)
> length(rn)
[1] 1180
>
> ismr0[[1]][1:4,]
mature RPM
MIMAT0000062 mature 49791.5560
MIMAT0000063 mature 92858.1285
MIMAT0000064 mature 10418.8532
MIMAT0000065 mature 404.7618
>
But I would expect a length of 733, because the row names of ismr0[[1]] are a subset of the names in the name variable.
Could someone help me solve this problem?
As you guessed in the comments, you are calling union on a character vector and a list. If you need all unique row names from the list, try this example:
#dummy data
a<-matrix(1:4,ncol=1)
b<-matrix(1:4,ncol=1)
c<-matrix(1:4,ncol=1)
rownames(a) <- letters[c(2,3,5,7)]
rownames(b) <- letters[c(2,4,5,7)]
rownames(c) <- letters[c(2,3,6,7)]
ismr0 <-list(a,b,c)
#get unique names
name <- unique(unlist(lapply(ismr0,rownames)))
#check with union
rn <- union(rownames(ismr0[[1]]), name)
length(name)==length(rn)
You don't get what you expect because lapply returns a list. I ran an example of a list with 3 data.frames and it gave me :
[[1]]
[1] "l1" "l2" "l3" "l4" "l5" # first df rownames
[[2]]
[1] "l6" "l7" "l8" "l9" "l10" # second df rownames
[[3]]
[1] "l11" "l12" "l13" "l14" "l15" # third df rownames
which is a list.
Then the line union(rownames(ismr0[[1]]), name) combines the individual row names with that list. None of the single names equals a whole list element, so nothing is dropped as a duplicate and you get something like:
[[1]]
[1] "l1" "l2" "l3" "l4" "l5"
[[2]]
[1] "l6" "l7" "l8" "l9" "l10"
[[3]]
[1] "l11" "l12" "l13" "l14" "l15"
[[4]]
[1] "l1"
[[5]]
[1] "l2"
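You need a character vector instead of a list. sapply() only simplifies to a vector when every result has length one, so flattening the lapply() result with unlist() is the more reliable route.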
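Another way to end up with a single character vector is to fold union() over the per-matrix row names with Reduce(). The sketch below uses small dummy matrices, not the asker's data, and the variable names all_names and rn are illustrative.

```r
# Dummy data: three one-column matrices with overlapping row names.
a <- matrix(1:4, ncol = 1, dimnames = list(letters[c(2, 3, 5, 7)], NULL))
b <- matrix(1:4, ncol = 1, dimnames = list(letters[c(2, 4, 5, 7)], NULL))
c3 <- matrix(1:4, ncol = 1, dimnames = list(letters[c(2, 3, 6, 7)], NULL))
ismr0 <- list(a, b, c3)

# lapply() yields a list of character vectors; Reduce() applies union()
# pairwise, so the result is one character vector without duplicates.
all_names <- Reduce(union, lapply(ismr0, rownames))

# The union with the first matrix's row names now adds nothing new.
rn <- union(rownames(ismr0[[1]]), all_names)
length(rn) == length(all_names)
```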
I have a very large list. The dput() output for two of its elements is shown below:
> dput(mydata).....
`NA` = c("SHC2", "GRB2", "HRAS", "KRAS", "NRAS", "SHC3",
"MAPK1", "MAPK3", "MAP2K1", "MAP2K2", "RAF1", "SHC1", "SOS1",
"YWHAB", "CDK1"), `NA` = c("NUP50", "NUPL2", "PSIP1", "NUP35",
"NUP205", "NUP210", "NUP188", "NUP62", "SLC25A4", "SLC25A5",
"SLC25A6", "HMGA1", "NUP43", "KPNA1", "NUP88", "NUP54", "NUP133",
"NUP107", "RANBP2", "LOC645870", "TPR", "NUP37", "NUP85",
"NUP214", "AAAS", "SEH1L", "RAE1", "BANF1", "NUP155", "NUP93",
"NUPL1", "POM121", "NUP153"), ....
I also have a file containing the names, but I can't assign them:
names(mydata)<-list("a", "b")# clears former data and replaces with "a" and "b"
names(mydata)<-c("a", "b")
I have tried using names(mydata) but it doesn't do what I need. I think "N" should be the name, but I don't know how to access it. Is that right?
If so, what should I do? Regards
I'm not sure what you are trying to do. If you want to name the elements of a list with the names from another file, here's how to do it:
x <- list (1,2,3,4,5)
y <- LETTERS [1:5]
names (x) <- y
Thanks
The problem was that I was using [[ ]] to retrieve the names, but [ ] should be used to see the names:
x <- list (1,2,3,4,5)
y <- LETTERS [1:5]
names (x) <- y
> x[[1]]
[1] 1
> x[1]
$A
[1] 1
> x[2]
$B
[1] 2
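The difference between the two operators can be checked directly. This short sketch reuses the x and y objects from the example above: [[ extracts the element itself and drops the name, [ returns a sub-list that keeps it, and names() returns all names at once.

```r
x <- list(1, 2, 3, 4, 5)
y <- LETTERS[1:5]
names(x) <- y

names(x)        # all names as a character vector
el  <- x[["B"]] # the element itself: a bare number, name dropped
sub <- x["B"]   # a list of length one that still carries the name "B"

is.list(el)     # FALSE
is.list(sub)    # TRUE
names(sub)      # "B"
```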