Get a consistent vector of list element names - r

A list with no names returns NULL for its names:
> names(list(1,2,3))
NULL
but add one named thing and suddenly the names has the length of the list:
> names(list(1,2,3,a=4))
[1] "" "" "" "a"
because this is now a named list. What I'd like is a function, rnames say, to make any list into a named list, such that:
rnames(list(1,2,3)) == c("","","")
identical(rnames(list()), character(0))
length(rnames(foo)) == length(foo) # for all foo
and the following, which is what names() does anyway:
rnames(list(1,2,3,a=3)) == c("","","","a")
rnames(list(a=1,b=1)) == c("a","b")
My current hacky method is to add a named thing to the list, get the names, and then chop it off:
rnames = function(l){names(c(monkey=1,l))[-1]}
but is there a better/proper way to do this?

An approach that feels slightly cleaner is to assign names to the list:
x <- list(1,2,3)
names(x) <- rep("", length(x))
names(x)
[1] "" "" ""
Turning it into a function:
rnames <- function(x){
if(is.null(names(x))) names(x) <- rep("", length(x))
return(x)
}
Test cases:
x1 <- rnames(list(1,2,3))
x2 <- rnames(list(1,2,3,a=3))
x3 <- rnames(list(a=1,b=1))
names(x1)
[1] "" "" ""
names(x2)
[1] "" "" "" "a"
names(x3)
[1] "a" "b"

How about something like this?
rnames <- function(x) {
if(is.null(names(x)))
character(length(x))
else
names(x)
}
It handles the list() and no names cases; and it doesn't do anything if there are already names.

The names are in an attribute named ... names:
> lis2 <- list("a", "b")
> attributes(lis2)
NULL
> if(is.null(names(lis2)) ) {names(lis2) <-
vector(mode="character", length=length(lis2))}
> lis2
[[1]]
[1] "a"
[[2]]
[1] "b"
> names(lis2)
[1] "" ""
> attributes(lis2)
$names
[1] "" ""

Related

Cannot append a vector inside a list in R

I have a set of vectors inside a list wherein I want to append certain values to each vector. When I used append() outside the loop, it worked perfectly fine but inside a loop it doesn't seem to work.
factors <- list(c("K3BG","9"),c("RTCKO","4"))
len <- length(factors)
for (i in 1:length)
{
rejig_score <- factors[[i]][2]
rejig_score <- as.numeric(rejig_score)
if(rejig_score > 5)
{
factors[[i]] <- append(factors[[i]],"Approved")
}
else
{
factors[[i]] <- append(factors[[i]],"Disapproved")
}
}
I changed 1:lenght to 1:len inside for
factors <- list(c("K3BG","9"),c("RTCKO","4"))
len <- length(factors)
for (i in 1:len)
{
rejig_score <- factors[[i]][2]
rejig_score <- as.numeric(rejig_score)
if(rejig_score > 5)
{
factors[[i]] <- append(factors[[i]],"Approved")
}
else
{
factors[[i]] <- append(factors[[i]],"Disapproved")
}
}
factors
[[1]]
[1] "K3BG" "9" "Approved"
[[2]]
[1] "RTCKO" "4" "Disapproved"
Using lapply
lapply(factors, function(x) c(x, if(as.numeric(x[2]) > 5)
"Approved" else "Disapproved"))
-output
[[1]]
[1] "K3BG" "9" "Approved"
[[2]]
[1] "RTCKO" "4" "Disapproved"
Or another option is to extract the second element from the list and do the comparison outside, create the vector values and append
new <- c("Disapproved", "Approved")[1 +
(as.numeric(sapply(factors, `[[`, 2)) > 5)]
Map(c, factors, new)
[[1]]
[1] "K3BG" "9" "Approved"
[[2]]
[1] "RTCKO" "4" "Disapproved"

How can I use an arguement supplied to a user defined function as both an input and a character string? [duplicate]

This question already has answers here:
In R, how to get an object's name after it is sent to a function?
(4 answers)
Closed 2 years ago.
I find I often am comparing two character vectors to see where they don't match up (typically columns in two different data frames). Because I'm doing this often, I want to write a function to make it easier. This is what I've come up with so far:
x <- c("A", "B", "C")
y <- c("B", "C", "D", "X")
check_mismatch <- function(vec1, vec2) {
vec1 <- unique(as.character(vec1))
vec2 <- unique(as.character(vec2))
missing_from_1 <- vec2[vec2 %notin% vec1]
missing_from_2 <- vec1[vec1 %notin% vec2]
print("Missing from vector 1")
print(missing_from_1)
print("Missing from vector 2")
print(missing_from_2)
}
check_mismatch(x,y)
[1] "Missing from vector 1"
[1] "D" "X"
[1] "Missing from vector 2"
[1] "A"
What I would really like is "Missing from x" instead of "Missing from vector 1". I would like the function to output the name of the actual argument that was entered. Another example of how I would like the function to work:
check_mismatch(all_polygons_df$Plot, sb_year$Plot)
[1] "Missing from all_polygons_df$Plot"
[1] "KWI-1314B"
[1] "Missing from sb_year$Plot"
character(0)
Any suggestions on how I could do this? I'm open to other ways of displaying the output too - perhaps some kind of table. But the output needs to be flexible to different lengths of output.
Up front, deparse(substitute(...)) is what you're asking for, and that is what makes your initial question a duplicate.
Some recommendations, however:
printing things to the console is a little off (IMO), since it prepends [1] to everything you print. Consider message (or cat). Since many R environments color things based on comments, etc, I have found it useful to prepend # before some text to break it out from other portions of the same text.
Your function is operating solely in side-effect, printing something to the console and then losing it forever. The function does happen to return a single object (the value of missing_from_2, accidentally), but it might be more useful if the function returned the mismatches.
With that, I offer an alternative:
check_mismatch <- function(vec1, vec2) {
nm1 <- deparse(substitute(vec1))
nm2 <- deparse(substitute(vec2))
vec1 <- unique(as.character(vec1))
vec2 <- unique(as.character(vec2))
missing_from_1 <- vec2[!vec2 %in% vec1]
missing_from_2 <- vec1[!vec1 %in% vec2]
setNames(list(missing_from_1, missing_from_2), c(nm1, nm2))
}
check_mismatch(x, y)
# $x
# [1] "D" "X"
# $y
# [1] "A"
One immediate benefit is that we can look for specific differences in one of the vectors immediately:
mis <- check_mismatch(x, y)
mis$x
# [1] "D" "X"
However, this uses the names of the variables presented to it. Realize that with non-standard evaluation comes responsibility and consequence. Consider:
mis <- check_mismatch(x, c("A", "B", "E"))
mis
# $x
# [1] "E"
# $`c("A", "B", "E")`
# [1] "C"
The name of the second element is atrocious. Fortunately, if all you care about is what the differences are for the second element, once can still use [[2]] to retrieve the character vector without issue. (This is mostly aesthetic.)
mis[[2]]
# [1] "C"
Also, one might want to repeat this for more than two vectors, so generalizing it might be useful (for "1 or more"):
check_mismatch_many <- function(...) {
dots <- list(...)
if (!length(dots)) {
out <- list()
} else {
nms <- as.character(match.call()[-1])
out <- lapply(seq_along(dots), function(i) {
b <- unique(unlist(dots[-i]))
b[!b %in% dots[[i]]]
})
out <- replace(out, sapply(out, is.null), list(dots[[1]][0]))
names(out) <- nms
}
out
}
z <- c("Y","Z")
check_mismatch_many()
# list()
check_mismatch_many(x)
# $x
# character(0)
check_mismatch_many(x, y)
# $x
# [1] "D" "X"
# $y
# [1] "A"
check_mismatch_many(x, y, z)
# $x
# [1] "D" "X" "Y" "Z"
# $y
# [1] "A" "Y" "Z"
# $z
# [1] "A" "B" "C" "D" "X"
And finally, if you want to be a little "personal" with the presentation on the console, you can go overboard and class it with an additional print.myclass S3 method.
check_mismatch_many <- function(...) {
dots <- list(...)
if (!length(dots)) {
out <- list()
} else {
nms <- as.character(match.call()[-1])
out <- lapply(seq_along(dots), function(i) {
b <- unique(unlist(dots[-i]))
b[!b %in% dots[[i]]]
})
out <- replace(out, sapply(out, is.null), list(dots[[1]][0]))
names(out) <- nms
}
class(out) <- c("mismatch", "list")
out
}
print.mismatch <- function(x, ...) {
cat("<Mismatch>\n")
cat(str(x, give.attr = FALSE, no.list = TRUE))
invisible(x)
}
mis <- check_mismatch_many(x, y)
mis
# <Mismatch>
# $ x: chr [1:2] "D" "X"
# $ y: chr "A"
(There are a lot more things you can do in the print.mismatch method, obviously. str is the major component of it, and it is the swiss-army-knife of depicting structure.)

Extract list element by successive / cascading index chain

ll<-list(list(c('A', 'B', 'C'),"Peter"),"John","Hans")
looks like:
[[1]]
[[1]][[1]]
[1] "A" "B" "C"
[[1]][[2]]
[1] "Peter"
[[2]]
[1] "John"
[[3]]
[1] "Hans"
Lets say I have the indices in a list for "Peter" and "B" respectively.
peter.ind <- list(1,2) # correlates with ll[[1]][[2]]
B.ind <- list(1,1,2) # correlates with ll[[1]][[1]][[2]]
So how can I most effectively extract a "tangled" list element by its cascaded index chain?
Here is my already working function:
extract0r <- function(x,l) {
for(ind in l) {
x <- x[[ind]]
}
return(x)
}
call function:
extract0r(ll,peter.ind) #evals [1] "Peter"
extract0r(ll,B.ind) #evals [1] "B"
Is there a neater alternative to my function?
You can use a recursive function:
ll <- list(list(c('A', 'B', 'C'),"Peter"),"John","Hans")
my.ind <- function(L, ind) {
if (length(ind)==1) return(L[[ind]])
my.ind(L[[ind[1]]], ind[-1])
}
my.ind(ll, c(1,2))
my.ind(ll, c(1,1,2))
# > my.ind(ll, c(1,2))
# [1] "Peter"
# > my.ind(ll, c(1,1,2))
# [1] "B"
The recursive function has a (relative) clear coding, but during execution it has an overhead for the deep function calls.
There are many ways of doing this.
For example, you can build the commands from character strings:
my.ind.str <- function(L, ind) {
command <- paste0(c("L",sprintf("[[%i]]", ind)),collapse="")
return(eval(parse(text=command)))
}
With your example, I had to convert the lists of indices to vectors:
my.ind.str(ll, unlist(peter.ind))
[1] "Peter"
my.ind.str(ll, unlist(B.ind))
[1] "B"

Keep list index number unchanged after removing NULL and character[0] in R

Is there any way in R to keep the index number same as before, while removing NULL and character[0] element from list. For example:
ld:
[[1]]
[1]"D2HGDH"
[[2]]
character(0)
[[3]]
character(0)
[[4]]
[1] "TNF" "IL6" "CXCL1"
[[5]]
character(0)
[[6]]
character(0)
[[7]]
[1] "ICAM1"
[[8]]
NULL
Than removing NULL and character[0] using the following code.
#remove character 0 element
rmchar <- lapply(seq(ld) , function(x) ld[[x]][!ld[[x]] == "character(0)" ])
# remove null
is.NullOb <- function(x) if(!(is.function(x))) is.null(x) | all(sapply(x, is.null)) else FALSE
rmNullObs <- function(x) {
x <- Filter(Negate(is.NullOb), x)
lapply(x, function(x) if (is.list(x)) rmNullObs(x) else x)
}
rmnull <- rmNullObs(rmchar)
so after running this the index is changed like:
[[1]]
[1] "D2HGDH"
[[2]]
[1] "TNF" "IL6" "CXCL1"
[[3]]
[1] "ICAM1"
which i don't want. I want the index number same as in list ld. So basically my desired output is looking something like:
[[1]]
[1] "D2HGDH"
[[4]]
[1] "TNF" "IL6" "CXCL1"
[[7]]
[1] "ICAM1"
Any way to do this? If yes than How?
I appreciated any help. Thank You.
Screenshot of my output.
Similar to #Richards's comments here is a solution for your case
ld = list(c("D2HGDH"), character(length = 0), character(length = 0),
c("TNF","IL6","CXCL1"),character(length = 0),
character(length = 0), c("ICAM1"), NULL)
# Convert the list to names
ld <- setNames(as.list(ld), c(1:length(ld)))
Lines below copied from R Remove NULL elements from list of lists
#Remove null and character values
is.NullOb <- function(x) is.null(x) | all(sapply(x, is.null))
## Recursively step down into list, removing all such objects
rmNullObs <- function(x) {
x <- Filter(Negate(is.NullOb), x)
lapply(x, function(x) if (is.list(x)) rmNullObs(x) else x)
}
newList <- rmNullObs(ld)
To get the indexes of list that were left use
names(newList)

list(...) vs as.list(...) when using the triple dot argument

I would like to get a list with the "..." parameters passed to the function.
myfunction <- function(..., a=1){
parameters <- as.list(...)
for(i in parameters){
print(i)
}
}
But when calling myfunction("x","y","z") I get a vector with one item:
## [1] "x"
Howerver, if I replace as.list(...) by simply list(...)
myfunction <- function(..., a=1){
parameters <- list(...)
for(i in parameters){
print(i)
}
}
it works:
## [1] "x"
## [1] "y"
## [1] "z"
So why is as.list(...) behaving differently?
Cheers.
You may be looking for the c concatenate function.
as.list(c('x', 'y', 'z'))
#[[1]]
#[1] "x"
#
#[[2]]
#[1] "y"
#
#[[3]]
#[1] "z"
myfunction <- function(..., a=1){
parameters <- as.list(c(...))
for(i in parameters){
print(i)
}
}
myfunction('x', 'y', 'z')
#[1] "x"
#[1] "y"
#[1] "z"
I don't want to get the explanation wrong, so I'll let someone else explain why.

Resources