Checking if list contains numbers in R - r

I have a list:
list(c(1,2,3,4), c(3,2,6,8),c(6,4,3))
How can I filter the list to keep only the vectors that contain both 2 and 3? (They do not necessarily have to be in ascending or descending order.)
Thank you!

Use Filter like this:
L <- list(c(1,2,3,4), c(3,2,6,8), c(6,4,3))
Filter(function(x) all(2:3 %in% x), L)
giving:
[[1]]
[1] 1 2 3 4
[[2]]
[1] 3 2 6 8
The above uses no packages but if we were to use fn from the gsubfn package it could be shortened to the following. The formula is regarded as the specification of a function whose body is the right hand side and whose arguments are the free variables in the body, in this case just x.
library(gsubfn)
fn$Filter(~ all(2:3 %in% x), L)
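As a rough illustration of why the predicate in these answers works, here is a sketch in base R. The helper name contains_all is made up for this example and does not come from any package. 2:3 %in% x returns one logical value per required number, and all() collapses them into a single TRUE or FALSE.

```r
L <- list(c(1, 2, 3, 4), c(3, 2, 6, 8), c(6, 4, 3))

# %in% checks each required value against the vector, regardless of order
2:3 %in% L[[3]]       # 2 is missing from c(6, 4, 3), 3 is present
all(2:3 %in% L[[3]])  # so this vector is dropped by the filter

# hypothetical helper: TRUE when every element of `values` occurs in `x`
contains_all <- function(x, values) all(values %in% x)

kept <- Filter(function(x) contains_all(x, c(2, 3)), L)
kept
```

Passing the required values as an argument makes it easy to reuse the same filter for other sets of numbers.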

If we are using the tidyverse, one option is keep from purrr (here L is the list defined above):
library(purrr)
keep(L, ~all(2:3 %in% .x))
#[[1]]
#[1] 1 2 3 4
#[[2]]
#[1] 3 2 6 8

concatenate vectors from two lists by names [duplicate]

I'm heavily simplifying my actual problem, but I am trying to find a way to append the values inside vectors from one list to the values inside vectors in another list, matching by name (assuming the two lists are not in the same order). This is the setup (the numbers themselves are arbitrary):
Data1 <- list( c(1),c(2),c(3))
names(Data1) <- c("A", "B","C")
Data2 <- list(c(11), c(12), c(13))
names(Data2) <- c("B","A","C")
What I'm trying to do is find a way to get a third list, say Data3, so that calling Data3[["A"]] gives the same result as calling c(1,12):
[1] 1 12
so Data3 should give:
[1] 1 12
[2] 2 11
[3] 3 13
Essentially I'm looking to append many values from one list of vectors to another list of vectors, and to do it by name rather than by order, if that makes sense. (I did think about trying some loops, but I feel like there should be a simpler way.)
nm = names(Data1)
setNames(lapply(nm, function(x){
c(Data1[[x]], Data2[[x]])
}), nm)
#$A
#[1] 1 12
#$B
#[1] 2 11
#$C
#[1] 3 13
# note: cbind pairs the elements by position, not by name (A is paired with 11 here)
list(do.call("cbind", list(Data1, Data2)))
[,1] [,2]
A 1 11
B 2 12
C 3 13
If you don't mind the output being a data frame:
Data3 <- rbind(data.frame(Data1), data.frame(Data2))
Then Data3[["A"]] will give you:
[1] 1 12
We can use Map and arrange the elements of Data2 in the same order as Data1 (or vice versa) using names and then concatenate them.
Map(c, Data1, Data2[names(Data1)])
#$A
#[1] 1 12
#$B
#[1] 2 11
#$C
#[1] 3 13
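To make the role of Data2[names(Data1)] concrete, here is a small sketch comparing Map with and without the reordering step. The variable names by_position and by_name are just for illustration.

```r
Data1 <- list(A = 1, B = 2, C = 3)
Data2 <- list(B = 11, A = 12, C = 13)

# without reordering, Map pairs the elements by position
by_position <- Map(c, Data1, Data2)
by_position[["A"]]   # 1 is paired with 11, the value stored under B in Data2

# indexing Data2 by the names of Data1 lines the elements up first
by_name <- Map(c, Data1, Data2[names(Data1)])
by_name[["A"]]       # 1 is paired with 12, as requested
```

Note that this sketch assumes every name in Data1 also exists in Data2.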

How can I remove elements by columns number from a list?

I'd like to remove elements from a list if they contain fewer than 3 elements.
For this I tried:
#Create a list
my_list <- list(a = c(3,5,6), b = c(3,1,0), c = 4, d = NA)
my_list
$a
[1] 3 5 6
$b
[1] 3 1 0
$c
[1] 4
$d
[1] NA
# Then I create a function to remove the elements that meet my condition:
delete.F <- function(x.list){
x.list[unlist(lapply(x.list, function(x) ncol(x)) < 3)]}
delete.F(my_list)
And I get this output:
Error in unlist(lapply(x.list, function(x) ncol(x)) < 3) :
(list) object cannot be coerced to type 'double'
Any ideas, please?
An option is to create a logical expression with lengths and use that for subsetting the list
my_list[lengths(my_list) >=3]
#$a
#[1] 3 5 6
#$b
#[1] 3 1 0
Note that the example is a list of vectors, not a list of data frames. ncol/nrow only return a value when the object has dimensions, as a matrix or a data.frame does; for a plain vector, ncol returns NULL.
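The following sketch, using the my_list from the question, shows where the original attempt goes wrong and why length-based checks behave differently. The name res is only used here to capture the error.

```r
my_list <- list(a = c(3, 5, 6), b = c(3, 1, 0), c = 4, d = NA)

# a plain vector has no dimensions, so ncol() gives NULL
ncol(my_list$a)
is.null(dim(my_list$a))

# lapply() therefore returns a list of NULLs, and comparing that list
# with 3 is what raises the error shown in the question
res <- try(lapply(my_list, ncol) < 3, silent = TRUE)
inherits(res, "try-error")

# lengths() counts the elements of each vector instead
lengths(my_list)
```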
If we want to use lapply (based on some constraints), build the logic with length; note that unlist flattens the result into a single named vector rather than a list
unlist(lapply(my_list, function(x) if(length(x) >=3 ) x))
If we need to create the index with lapply, use length (but it would be slower than lengths)
my_list[unlist(lapply(my_list, length)) >= 3]
Here are a few more options. Using Filter in base R
Filter(function(x) length(x) >=3, my_list)
#$a
#[1] 3 5 6
#$b
#[1] 3 1 0
Or using purrr's keep and discard
purrr::keep(my_list, ~length(.) >= 3)
purrr::discard(my_list, ~length(.) < 3)

Coercing String to Vector

I'm trying to create a calculator that multiplies permutation groups written in cyclic form (the process of which is described in this post, for anyone unfamiliar: https://math.stackexchange.com/questions/31763/multiplication-in-permutation-groups-written-in-cyclic-notation). Although I know this would be easier to do with Python or something else, I wanted to practice writing code in R since it is relatively new to me.
My game plan is to take an input such as "(1 2 3)(2 4 1)" and split it into two separate lists or vectors. However, I am having trouble getting started: from my understanding of character functions (which I researched here: https://www.statmethods.net/management/functions.html), I will ultimately have to use grep() to find the points where ")(" occurs in my string and split there. However, grep only takes vectors as its argument, so I am trying to coerce my string into a vector. In researching this problem, I have mostly seen people suggest as.integer(unlist(str_split())), but this doesn't work for me because, when I split, not every piece is an integer and those values become NA, as seen in this example.
library(tidyverse)
x <- "(1 2 3)(2 4 1)"
x <- as.integer(unlist(str_split(x," ")))
x
Is there an alternative way to turn a string into a vector when it does not contain only integers? I also realize that the way I am trying to split up the two permutations is very roundabout, but of the character functions I researched, this seemed like the only option. If there are other functions that would make this easier, please let me know.
Thank you!
Comments in the code.
x <- "(1 2 3)(2 4 1)"
out1 <- strsplit(x, split = ")(", fixed = TRUE)[[1]] # split on close and open bracket
out2 <- gsub("[\\(|\\)]", replacement = "", out1) # remove brackets
out3 <- strsplit(out2, " ") # tease out numbers between spaces
lapply(out3, as.integer)
[[1]]
[1] 1 2 3
[[2]]
[1] 2 4 1
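The same steps can be wrapped in a small function so they are easy to reuse on other inputs. This is only a sketch: the name parse_cycles is hypothetical, and it assumes cycles are written as digits separated by single spaces, as in the example.

```r
# hypothetical helper: turn "(1 2 3)(2 4 1)" into a list of integer vectors
parse_cycles <- function(s) {
  # split on the ")(" boundary between adjacent cycles
  groups <- strsplit(s, split = ")(", fixed = TRUE)[[1]]
  # drop the remaining opening and closing brackets
  groups <- gsub("[()]", "", groups)
  # split each cycle on spaces and convert the pieces to integers
  lapply(strsplit(groups, " ", fixed = TRUE), as.integer)
}

cycles <- parse_cycles("(1 2 3)(2 4 1)")
cycles
```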
There aren't really any scalars in R. Single values like 1, TRUE, and "a" are all 1-element vectors. grep(pattern, x) will work fine on your original string. As a starting point for getting towards your desired goal, I would suggest splitting the groups using:
> str_extract_all(x, "\\([0-9 ]+\\)")
[[1]]
[1] "(1 2 3)" "(2 4 1)"
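A quick check in base R illustrates the point about scalars. This is a sketch; fixed = TRUE is used so that ")(" is treated as literal text rather than as a regular expression.

```r
x <- "(1 2 3)(2 4 1)"

# a single string is already a character vector of length 1
is.vector(x)
length(x)

# so the pattern-matching functions accept it directly
grepl(")(", x, fixed = TRUE)
pos <- regexpr(")(", x, fixed = TRUE)
as.integer(pos)   # ")(" starts at character 7
```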
If we need to split the strings with the brackets
strsplit(x, "(?<=\\))(?=\\()", perl = TRUE)[[1]]
#[1] "(1 2 3)" "(2 4 1)"
Or we can use a convenient wrapper from qdapRegex
library(qdapRegex)
ex_round(x, include.marker = TRUE)[[1]]
#[1] "(1 2 3)" "(2 4 1)"
An alternative using magrittr:
library(magrittr)
x <- "(1 2 3)(2 4 1)"
x %>%
gsub("^\\(","c(",.) %>% gsub("\\)\\(","),c(",.) %>% gsub("(?=\\s\\d)",", ",.,perl=T) %>%
paste0("list(",.,")") %>% {eval(parse(text=.))}
result:
# [[1]]
# [1] 1 2 3
#
# [[2]]
# [1] 2 4 1
You could use chartr with read.table :
read.table(text= chartr("()"," \n",x))
# V1 V2 V3
# 1 1 2 3
# 2 2 4 1

Paste column values together in a data frame

I am trying to paste together the row name along with the data in the desired columns. I wrote the following code but somehow could not find a way to do it correctly.
The desired output will be: "a,1,11" "b,2,22" "c,3,33"
x = data.frame(cbind(f1 = c(1,2,3), f2 = c(5,6,7), f3=c(11,22,33)), row.names= c('a','b','c'))
x
# f1 f2 f3
# a 1 5 11
# b 2 6 22
# c 3 7 33
do.call("paste", c(rownames(x), x[c('f1','f3')], sep=","))
# [1] "a,b,c,1,11" "a,b,c,2,22" "a,b,c,3,33"
Two main points:
Use apply instead of do.call(paste, .)
Use cbind instead of c in this case.
If you would rather use c, you would need to coerce the row names to a list or column first, eg: c(list(rownames(x)), x)
Try the following:
apply(cbind(rownames(x), x[c('f1','f3')]), 1, paste, collapse=",")
a b c
"a,1,11" "b,2,22" "c,3,33"
Your do.call instructs R to paste the list c(rownames(x), x[c('f1','f3')]) together. But take a look at your list.
> c(rownames(x), x[c('f1','f3')])
[[1]]
[1] "a"
[[2]]
[1] "b"
[[3]]
[1] "c"
$f1
[1] 1 2 3
$f3
[1] 11 22 33
The c command takes the elements of each argument and joins them together. This properly deconstructs x[c('f1','f3')] but also deconstructs rownames(x) in a way you don't want. Obeying the standard recycling rule, paste then takes an item from each list element and patches them together with sep=",".
You could fix this by encapsulating rownames(x) inside a list structure so that your list of arguments comes out properly:
do.call("paste", c(list(rownames(x)), x[c('f1','f3')], sep=","))
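The difference between the two argument lists can be checked directly. In this sketch, the names flat and wrapped are only for illustration, and the data frame is built with data.frame() directly rather than through cbind.

```r
x <- data.frame(f1 = c(1, 2, 3), f2 = c(5, 6, 7), f3 = c(11, 22, 33),
                row.names = c("a", "b", "c"))

# c() splits the row names into three separate list elements
flat <- c(rownames(x), x[c("f1", "f3")])
length(flat)      # 5 arguments would be passed to paste

# wrapping the row names in list() keeps them together as one element
wrapped <- c(list(rownames(x)), x[c("f1", "f3")])
length(wrapped)   # 3 arguments, one per "column"

result <- do.call("paste", c(wrapped, sep = ","))
result
```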
No need for do.call or apply:
paste(rownames(x),x[[1]],x[[3]] , sep=",")
[1] "a,1,11" "b,2,22" "c,3,33"

Extracting unique numbers from string in R

I have a list of strings which contain random characters such as:
list=list()
list[1] = "djud7+dg[a]hs667"
list[2] = "7fd*hac11(5)"
list[3] = "2tu,g7gka5"
I'd like to know which numbers are present at least once (unique()) in this list. The solution of my example is:
solution: c(7,667,11,5,2)
If someone has a method that does not consider 11 as "eleven" but as "one and one", it would also be useful. The solution in this condition would be:
solution: c(7,6,1,5,2)
(I found this post on a related subject: Extracting numbers from vectors of strings)
For the second answer, you can use gsub to remove everything from the string that's not a number, then split the string as follows:
unique(as.numeric(unlist(strsplit(gsub("[^0-9]", "", unlist(ll)), ""))))
# [1] 7 6 1 5 2
For the first answer, similarly using strsplit,
unique(na.omit(as.numeric(unlist(strsplit(unlist(ll), "[^0-9]+")))))
# [1] 7 667 11 5 2
PS: don't name your variable list (as there's an inbuilt function list). I've named your data as ll.
Here is yet another answer, this one using gregexpr to find the numbers, and regmatches to extract them:
l <- c("djud7+dg[a]hs667", "7fd*hac11(5)", "2tu,g7gka5")
temp1 <- gregexpr("[0-9]", l) # Individual digits
temp2 <- gregexpr("[0-9]+", l) # Numbers with any number of digits
as.numeric(unique(unlist(regmatches(l, temp1))))
# [1] 7 6 1 5 2
as.numeric(unique(unlist(regmatches(l, temp2))))
# [1] 7 667 11 5 2
A solution using stringi:
library(stringi)
# extract the numbers (unlist turns the list into a character vector):
nums <- stri_extract_all_regex(unlist(list), "[0-9]+")
# Make vector and get unique numbers:
nums <- unlist(nums)
nums <- unique(nums)
And that's your first solution
For the second solution, split the extracted numbers into single digits (substr(x, 1, 1) alone would only keep the first digit of each number):
nums_digits <- unique(unlist(strsplit(nums, "")))
You could use ?strsplit (as suggested in @Arun's answer in Extracting numbers from vectors (of strings)):
l <- c("djud7+dg[a]hs667", "7fd*hac11(5)", "2tu,g7gka5")
## split string at non-digits
s <- strsplit(l, "[^[:digit:]]")
## convert strings to numeric ("" become NA)
solution <- as.numeric(unlist(s))
## remove NA and duplicates
solution <- unique(solution[!is.na(solution)])
# [1] 7 667 11 5 2
A stringr solution with str_match_all and piped operators. For the first solution:
library(stringr)
str_match_all(ll, "[0-9]+") %>% unlist %>% unique %>% as.numeric
Second solution:
str_match_all(ll, "[0-9]") %>% unlist %>% unique %>% as.numeric
(Note: I've also called the list ll)
Use strsplit with a pattern that matches anything other than the digits 0-9.
For the example you have provided, do this:
tmp <- sapply(list, function (k) strsplit(k, "[^0-9]"))
Then simply take the union of all the "sets" in the list, like so:
tmp <- Reduce(union, tmp)
Then you only have to remove the empty string.
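Putting those steps together, including the final removal of the empty string, might look like the following sketch. The names strings, pieces, all_pieces and numbers are chosen here for illustration, and lapply is used in place of sapply so that each element is a plain character vector.

```r
strings <- list("djud7+dg[a]hs667", "7fd*hac11(5)", "2tu,g7gka5")

# split every string at each non-digit character
pieces <- lapply(strings, function(k) strsplit(k, "[^0-9]")[[1]])

# union() keeps each distinct piece once, in order of first appearance
all_pieces <- Reduce(union, pieces)

# drop the empty strings left between adjacent non-digits, then convert
numbers <- as.numeric(all_pieces[all_pieces != ""])
numbers
```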
Check out the str_extract_numbers() function from the strex package.
pacman::p_load(strex)
list=list()
list[1] = "djud7+dg[a]hs667"
list[2] = "7fd*hac11(5)"
list[3] = "2tu,g7gka5"
charvec <- unlist(list)
print(charvec)
#> [1] "djud7+dg[a]hs667" "7fd*hac11(5)" "2tu,g7gka5"
str_extract_numbers(charvec)
#> [[1]]
#> [1] 7 667
#>
#> [[2]]
#> [1] 7 11 5
#>
#> [[3]]
#> [1] 2 7 5
unique(unlist(str_extract_numbers(charvec)))
#> [1] 7 667 11 5 2
Created on 2018-09-03 by the reprex package (v0.2.0).
