Make elements NA depending on a predicate function - r

How can I easily change elements of a list or vector to NA depending on a predicate?
I need it to be done in a single call so that it integrates smoothly into dplyr::mutate calls and the like.
expected output:
make_na(1:10,`>`,5)
# [1] 1 2 3 4 5 NA NA NA NA NA
my_list <- list(1,"a",NULL,character(0))
make_na(my_list, is.null)
# [[1]]
# [1] 1
#
# [[2]]
# [1] "a"
#
# [[3]]
# [1] NA
#
# [[4]]
# character(0)
Note:
I answered my own question because I have one solution figured out, but I'd be happy to see alternatives. This functionality may also already exist in base R or in a prominent package.
Was inspired by my frustration in my answer to this post

We can build the following function:
make_na <- function(.x,.predicate,...) {
is.na(.x) <- sapply(.x,.predicate,...)
.x
}
Or, a bit better, we can leverage purrr when it is available:
make_na <- function(.x,.predicate,...) {
if (requireNamespace("purrr", quietly = TRUE)) {
is.na(.x) <- purrr::map_lgl(.x,.predicate,...)
} else {
if("formula" %in% class(.predicate))
stop("Formulas aren't supported unless package 'purrr' is installed")
is.na(.x) <- sapply(.x,.predicate,...)
}
.x
}
This way we use purrr::map_lgl if the purrr package is installed, and sapply otherwise.
If depending on purrr is acceptable, the function reduces to:
make_na <- function(.x,.predicate,...) {
is.na(.x) <- purrr::map_lgl(.x,.predicate,...)
.x
}
Some use cases:
make_na(1:10,`>`,5)
# [1] 1 2 3 4 5 NA NA NA NA NA
my_list <- list(1,"a",NULL,character(0))
make_na(my_list, is.null)
# [[1]]
# [1] 1
#
# [[2]]
# [1] "a"
#
# [[3]]
# [1] NA
#
# [[4]]
# character(0)
make_na(my_list, function(x) length(x)==0)
# [[1]]
# [1] 1
#
# [[2]]
# [1] "a"
#
# [[3]]
# [1] NA
#
# [[4]]
# [1] NA
If purrr is installed we can use this short form:
make_na(my_list, ~length(.x)==0)
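For atomic vectors and a predicate that is already vectorized, base R's replace() can do the same job without a per-element loop. The following is a minimal sketch under that assumption; make_na_vec is a hypothetical name, not an existing function.

```r
# Hypothetical helper: assumes .predicate is vectorized over .x
# and returns a logical vector of the same length.
make_na_vec <- function(.x, .predicate, ...) {
  replace(.x, .predicate(.x, ...), NA)
}

x <- make_na_vec(1:10, `>`, 5)
x
# [1]  1  2  3  4  5 NA NA NA NA NA
```

This does not cover lists with NULL elements such as my_list, because is.null is not applied element by element; there an element-wise approach like sapply or purrr::map_lgl is still needed.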

Related

Is there a way to recycle elements of the shorter list in purrr::map2 or purrr::walk2?

purrr does not seem to support recycling the elements of a vector when one of the two inputs is shorter than the other (with purrr::map2 or purrr::walk2), unlike base R, where we just get a warning if the length of the longer vector is not a multiple of the length of the shorter one.
Consider this toy example:
This works:
map2(1:3,4:6,sum)
#
#[[1]]
#[1] 5
#[[2]]
#[1] 7
#[[3]]
#[1] 9
And this doesn't work:
map2(1:3,4:9,sum)
Error: .x (3) and .y (6) are different lengths
I understand very well why this is not allowed - as it can make catching bugs very difficult. But is there any way in purrr I can force this to happen? Perhaps using some base R trick with purrr?
You can put both vectors in a data frame and let data.frame() recycle the shorter one (this only works when the longer length is a multiple of the shorter one):
input <- data.frame(a = 1:3, b = 4:9)
purrr::map2(input$a, input$b, sum)
It's by design with purrr but you can use Map :
Map(sum,1:3,4:9)
# [[1]]
# [1] 5
#
# [[2]]
# [1] 7
#
# [[3]]
# [1] 9
#
# [[4]]
# [1] 8
#
# [[5]]
# [1] 10
#
# [[6]]
# [1] 12
And here's how I would recycle if I had to :
x <- 1:3
y <- 4:9
l <- max(length(y), length(x))
map2(rep(x, length.out = l), rep(y, length.out = l), sum)
# [[1]]
# [1] 5
#
# [[2]]
# [1] 7
#
# [[3]]
# [1] 9
#
# [[4]]
# [1] 8
#
# [[5]]
# [1] 10
#
# [[6]]
# [1] 12
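The manual recycling above can be wrapped in a small helper so the common length is computed in one place. This is only a sketch: recycle2 is a hypothetical name (it is not part of purrr), and the zero-length handling is an assumption chosen to mimic base R, where a zero-length input yields a zero-length result.

```r
# Hypothetical helper: recycle two vectors to a common length with rep_len().
recycle2 <- function(x, y) {
  n <- max(length(x), length(y))
  # Assumption: if either input is empty, the result is empty (as in base R)
  if (length(x) == 0L || length(y) == 0L) n <- 0L
  list(x = rep_len(x, n), y = rep_len(y, n))
}

args <- recycle2(1:3, 4:9)
res <- mapply(sum, args$x, args$y)
res
# [1]  5  7  9  8 10 12
```

The recycled pair can equally be passed on as purrr::map2(args$x, args$y, sum), since both elements now have the same length.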

How to Display or Print Contents of Environment in R

I hope this question is not a duplicate; I searched and didn't find an answer (if it is a dupe, please let me know and I will remove it).
I am trying to print/display the contents of an environment, but I am unable to do so.
library(rlang)
e1 <- env(a = 1:10, b= letters[1:5])
When I use print, it just gives me the memory address, not the contents (names and values) of that environment.
> print(e1)
<environment: 0x00000000211fbae8>
Note: I can see the environment's contents in the RStudio Environment tab. I am using R version 3.4.2 and rlang 0.2.0.
My question is: what is the right function to print the contents of an environment? Sorry if the question is naive, but I am unable to figure it out.
Thanks in advance
We can use get with envir parameter to get values out of specific environment
sapply(ls(e1), function(x) get(x, envir = e1))
#$a
# [1] 1 2 3 4 5 6 7 8 9 10
#$b
#[1] "a" "b" "c" "d" "e"
where
ls(e1) # gives
#[1] "a" "b"
We can use mget
mget(ls(e1), envir = e1)
#$a
#[1] 1 2 3 4 5 6 7 8 9 10
#$b
#[1] "a" "b" "c" "d" "e"
Another option is:
lapply(ls(),function(x)get(x))
which, when run at top level, returns the objects in the global environment.
#Result:
# [[1]]
# [1] 1 2
#
# [[2]]
# [1] 1 4
#
# [[3]]
# [1] 1 1
#
# [[4]]
# function (snlq)
# {
# j <- 1
# for (i in 1:length(snlq)) {
# ind <- index(snlq[[i]])
# if (identical(ind[length(ind)], "2018-05-04") == FALSE) {
# ss[j] <- i
# j <- j + 1
# }
# }
# return(ss)
# }
# <bytecode: 0x000000001fa07290>
#
#... so on
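Base R also lets an environment be converted to a named list directly, which avoids the explicit ls()/get() loop. A minimal sketch, using new.env() instead of rlang::env() so that it runs without extra packages (the variable names are only for illustration):

```r
# Build a small environment with base R only
e1 <- new.env()
assign("a", 1:10, envir = e1)
assign("b", letters[1:5], envir = e1)

# as.list() on an environment returns a named list of its objects;
# the order is not guaranteed, so sort by name for a stable display
contents <- as.list(e1)
contents <- contents[order(names(contents))]
str(contents)

# ls.str() gives a compact overview of names and values
ls.str(e1)
```

Note that as.list() skips names starting with a dot unless all.names = TRUE is passed.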

List all combinations of strings that together cover all given elements

Say I am given the following strings:
1:{a,b,c,t}
2:{b,c,d}
3:{a,c,d}
4:{a,t}
I want to make a program that will give me all different combinations of these strings, where each combination has to include each given letter.
For example, valid combinations of the strings above include {1&2, 1&3, 2&3&4, 1&2&3&4, 2&4}.
I was thinking of doing this with for loops, where the program would look at the first string, find which elements are missing, then work down through the list to find strings which have these letters. However, I think this idea will only find combinations of two strings, and it also requires listing all the letters for the program, which seems very uneconomical.
I think something like this should work.
sets <- list(c('a', 'b', 'c', 't'),
c('b', 'c', 'd'),
c('a', 'c', 'd'),
c('a', 't'))
combinations <- lapply(2:length(sets),
function(x) combn(1:length(sets), x, simplify=FALSE))
combinations <- unlist(combinations, FALSE)
combinations
# [[1]]
# [1] 1 2
#
# [[2]]
# [1] 1 3
#
# [[3]]
# [1] 1 4
#
# [[4]]
# [1] 2 3
#
# [[5]]
# [1] 2 4
#
# [[6]]
# [1] 3 4
#
# [[7]]
# [1] 1 2 3
#
# [[8]]
# [1] 1 2 4
#
# [[9]]
# [1] 1 3 4
#
# [[10]]
# [1] 2 3 4
#
# [[11]]
# [1] 1 2 3 4
u <- unique(unlist(sets))
u
# [1] "a" "b" "c" "t" "d"
Filter(function(x) length(setdiff(u, unlist(sets[x]))) == 0, combinations)
# [[1]]
# [1] 1 2
#
# [[2]]
# [1] 1 3
#
# [[3]]
# [1] 2 4
#
# [[4]]
# [1] 1 2 3
#
# [[5]]
# [1] 1 2 4
#
# [[6]]
# [1] 1 3 4
#
# [[7]]
# [1] 2 3 4
#
# [[8]]
# [1] 1 2 3 4
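The two steps above (enumerate the index combinations, then keep those whose union covers every letter) can be collected into one function. This is a sketch, with covering_combinations as a hypothetical name; unlike the code above it also considers single sets, which makes no difference for this input because no single set contains all five letters.

```r
# Hypothetical wrapper around the enumerate-then-filter approach.
covering_combinations <- function(sets) {
  universe <- unique(unlist(sets))
  idx <- seq_along(sets)
  # All index combinations of size 1..length(sets), as a flat list
  candidates <- unlist(
    lapply(idx, function(m) combn(idx, m, simplify = FALSE)),
    recursive = FALSE
  )
  # Keep combinations whose union contains every element of the universe
  Filter(function(k) all(universe %in% unlist(sets[k])), candidates)
}

sets <- list(c("a", "b", "c", "t"),
             c("b", "c", "d"),
             c("a", "c", "d"),
             c("a", "t"))
res <- covering_combinations(sets)
length(res)
# [1] 8
```

The number of candidates grows as 2^n with the number of sets, so this brute-force enumeration is only practical for small inputs.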
As a start...
I'll edit this answer when I have time. The following result is dependent on the order of choice. I haven't figured out how to flatten the list yet. If I could flatten it, I would sort each result then remove duplicates.
v = list(c("a","b","c","t"),c("b","c","d"),c("a","c","d"),c("a","t"))
allChars <- Reduce(union, v) # [1] "a" "b" "c" "t" "d"
charInList <- function(ch, li) which(sapply(li, function(vect) ch %in% vect))
locations <- sapply(allChars, function(ch) charInList(ch, v) )
# > locations
# $a
# [1] 1 3 4
#
# $b
# [1] 1 2
#
# $c
# [1] 1 2 3
#
# $t
# [1] 1 4
#
# $d
# [1] 2 3
findStillNeeded <- function(chosen) {
  haveChars <- Reduce(union, v[chosen])
  stillNeed <- allChars[!allChars %in% haveChars]
  if (length(stillNeed) == 0) return(chosen) # terminate if you don't need any more characters
  return(lapply(1:length(stillNeed), function(i) { # for each of the characters you still need
    loc <- locations[[stillNeed[i]]] # find where the character is located
    lapply(loc, function(j) {
      findStillNeeded(c(chosen, j)) # add this location to the choices; terminate if you don't need any more characters
    })
  }))
}
result <- lapply(1:length(v), function(i) {
  findStillNeeded(i)
})
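One possible way to do the flattening mentioned above is a small recursive helper that collects the non-list leaves, after which each leaf can be sorted and duplicates removed. This is a sketch, not part of the original answer: flatten_leaves is a hypothetical name, and the nested list below is a made-up stand-in for result.

```r
# Hypothetical helper: collect all non-list leaves of a nested list.
flatten_leaves <- function(x) {
  if (!is.list(x)) return(list(x))      # a leaf: wrap it so c() keeps it whole
  do.call(c, lapply(x, flatten_leaves)) # concatenate the leaves of each child
}

# Made-up nested structure shaped like the output of the recursion above
nested <- list(list(c(1, 2)), list(list(c(2, 1), c(1, 3))))

leaves <- flatten_leaves(nested)
# Sort each combination so order of choice no longer matters, then deduplicate
combos <- unique(lapply(leaves, function(v) sort(unique(v))))
combos
```

Applied to the answer's own output, the same two lines would be used with result in place of nested.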

Match number within list of different length vectors

I want to match a number within a list containing vectors of different lengths. However, my solution (below) doesn't match anything beyond the first item of each vector.
seq_ <- seq(1:10)
list_ <- list(seq_[1:3], seq_[4:7], seq_[8:10])
list_
# [[1]]
# [1] 1 2 3
#
# [[2]]
# [1] 4 5 6 7
#
# [[3]]
# [1] 8 9 10
but
for (i in seq_) {
print(match(i,list_))
}
# [1] 1
# [1] NA
# [1] NA
# [1] 3
# [1] NA
# [1] NA
# [1] NA
# [1] NA
# [1] NA
# [1] NA
In the general case, you will probably be happier with which, as in the following (edited to show the full loop over values):
seq_ <- seq(1:10)
list_ <- list(seq_[1:3], seq_[4:7], seq_[8:10])
matchlist <- vector("list", length(list_)) # preallocate one slot per list element
for( j in 1:length(list_)) {
matchlist[[j]] <- unlist(sapply(seq_, function(k) which(list_[[j]]==k) ))
}
That will return the locations of all matches. It's probably more clear what's happening if you create an input like my.list <- list(sample(1:10,4,replace=TRUE), sample(1:10,7,replace=TRUE))
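If the goal is instead to know which list element contains each number, %in% can be combined with which. This is a sketch under that assumed reading of the question; which_element is a name introduced here for illustration.

```r
seq_ <- 1:10
list_ <- list(seq_[1:3], seq_[4:7], seq_[8:10])

# For each value, the index of the first list element that contains it
# (NA if no element contains it)
which_element <- vapply(seq_, function(k) {
  hit <- which(vapply(list_, function(v) k %in% v, logical(1)))
  if (length(hit)) hit[1] else NA_integer_
}, integer(1))
which_element
# [1] 1 1 1 2 2 2 2 3 3 3
```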

Function usage and output of lapply

I am experimenting with lapply:
lapply(1:3, function(i) print(i))
# [1] 1
# [1] 2
# [1] 3
# [[1]]
# [1] 1
# [[2]]
# [1] 2
# [[3]]
# [1] 3
I understand that lapply should apply print(i) to each element i of 1:3,
but why does the output look like this?
Also, when I use unlist, I get the following output:
unlist(lapply(1:3, function(i) print(i)))
# [1] 1
# [1] 2
# [1] 3
# [1] 1 2 3
The description of lapply function is the following:
"lapply returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X."
Your example:
lapply(1:3, function(x) print(x))
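print(x) prints x as a side effect and returns it invisibly, so each element is printed once during the loop, and lapply then returns a list of length 3, which the console prints again.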
str(lapply(1:3, function(x) print(x)))
# [1] 1
# [1] 2
# [1] 3
# List of 3
# $ : int 1
# $ : int 2
# $ : int 3
There are a few ways to avoid this as mentioned in the comments:
1) Using invisible
lapply(1:3, function(x) invisible(x))
# [[1]]
# [1] 1
# [[2]]
# [1] 2
# [[3]]
# [1] 3
unlist(lapply(1:3, function(x) invisible(x)))
# [1] 1 2 3
2) Without explicitly printing inside the function
unlist(lapply(1:3, function(x) x))
# [1] 1 2 3
3) Assigning the list to an object:
l1 <- lapply(1:3, function(x) print(x))
unlist(l1)
# [1] 1 2 3
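When the printing is the point and the returned list is not needed, the whole lapply call can be wrapped in invisible() so that only the side-effect output appears. A minimal sketch:

```r
# print() returns its argument invisibly, so lapply still collects the values
res <- lapply(1:3, print) # prints 1, 2, 3 once, as a side effect
# [1] 1
# [1] 2
# [1] 3

# Wrapping the call in invisible() suppresses auto-printing of the returned list
invisible(lapply(1:3, print))
```

A plain for loop, for (i in 1:3) print(i), has the same visible effect and makes it explicit that no result is being collected.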
