Searching for specified pattern in R list - r

I have a dataset stored as a list DataList
[[1]]
[1] a
[2] f
[3] e
[4] a
[[2]]
[1] f
[2] f
[3] e
I am trying to write a function GetFrequence that returns the frequency of a given pattern in the list DataList:
GetFrequence<- function(pattern, DataList)
{
freq= 0
i = 1
while (i<= List.length())
{
if (.....)
freq= freq + 1
}
return freq
}
My question is: how can I check whether the given pattern exists in the list?

I assume that by pattern you mean the different elements in your list. If so, something like this might be helpful.
First, let us create a list roughly similar to the one you have provided above:
a <- list(letters[1:3], letters[1:2], letters[1:5])
[[1]]
[1] "a" "b" "c"
[[2]]
[1] "a" "b"
[[3]]
[1] "a" "b" "c" "d" "e"
Now, to get the frequency of each item across the whole list, we can unlist the list, which stacks everything into one vector. Once we have a simple vector, we can use table.
table(unlist(a))
a b c d e
3 3 2 1 1
Note that you may have to use unlist several times, depending on your actual list-structure. That is, if you have a list of lists, it might be necessary to adjust the code somewhat. In that case, please post str(your_list).
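To connect this back to the function in the question, here is a minimal sketch of GetFrequence built on the same unlist idea. It assumes that DataList is a flat list of character vectors and that "pattern" means a single element such as "a". The second helper, CountSublistsWith, is a hypothetical name introduced only for illustration; it counts how many sub-vectors contain the pattern at least once.

```r
# Example data, assumed to mirror the list shown in the question
DataList <- list(c("a", "f", "e", "a"), c("f", "f", "e"))

# Total number of occurrences of `pattern` across all sub-vectors
GetFrequence <- function(pattern, DataList) {
  sum(unlist(DataList) == pattern)
}

# Hypothetical variant: number of sub-vectors that contain `pattern` at least once
CountSublistsWith <- function(pattern, DataList) {
  sum(vapply(DataList, function(v) pattern %in% v, logical(1)))
}

GetFrequence("a", DataList)       # 2
CountSublistsWith("a", DataList)  # 1
```

No explicit while loop is needed, because the comparison and sum are vectorized over the unlisted data.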

Related

Subsetting Elements in a "Hypothetical" List

I found this R function (Algorithm to calculate power set (all possible subsets) of a set in R) that can return the "power set" for a set of letters. Below I assign it to f() then use it to return a power set in a list.
f <- function(set) {
n <- length(set)
masks <- 2^(1:n-1)
lapply( 1:2^n-1, function(u) set[ bitwAnd(u, masks) != 0 ] )
}
results = f(LETTERS[1:3])
[[1]]
character(0)
[[2]]
[1] "A"
[[3]]
[1] "B"
[[4]]
[1] "A" "B"
[[5]]
[1] "C"
[[6]]
[1] "A" "C"
[[7]]
[1] "B" "C"
[[8]]
[1] "A" "B" "C"
I could then draw some random sequence of letters from the power set object using an index value:
> random = sample.int(length(results), 1)
[1] 6
> results[random]
[[1]]
[1] "A" "C"
Suppose now I want to make a list of the power set for all 26 letters. Since this list would have 2^26 = 67108864 elements, it would be too large to store in memory:
# too big
big_results = f(LETTERS[1:26])
But suppose I were to instead generate some random number I know is an index of the big_results results:
big_random = sample.int(67108864, 1)
13626980
Is there some way of knowing which subset of letters big_random would correspond to in the big_results list, without actually computing big_results?
For example:
# too big to run (hypothetical list)
big_results[big_random]
Could some enumeration or recursion formula be used to figure out some pattern and then somehow determine that "13626980" corresponds to the letters "P H R T L D U Z" for instance?
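One possible approach, sketched under the assumption that big_results would be produced by the same f() shown above: f() builds element k of its output from the integer u = k - 1 by testing each bit of u against the masks, so a single element can be computed directly from its index. The helper name subset_at is hypothetical, and the sketch assumes the set has between 1 and 31 elements so that the masks fit in R's integer range for bitwAnd.

```r
# Hypothetical helper: returns the k-th element (1-based) that f(set) would
# produce, without building the full power set.
subset_at <- function(k, set) {
  n <- length(set)
  masks <- 2^(0:(n - 1))  # same bit masks as in f()
  u <- k - 1              # f() enumerates u = 0, 1, ..., 2^n - 1
  set[bitwAnd(u, masks) != 0]
}

# Check against the small example: element 6 of f(LETTERS[1:3]) is "A" "C"
subset_at(6, LETTERS[1:3])

# Look up one element of the 26-letter power set without materializing it
big_random <- 13626980
subset_at(big_random, LETTERS)
```

Because each index maps to exactly one bit pattern, the lookup costs a single pass over the 26 masks rather than 2^26 list elements.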

Working with names and values of objects in a list in R using loops

How do you retrieve the names of the objects of a list in a loop? I want to do something like this:
lst = list(a = c(1,2), b = 1)
for(x in lst){
#print the name of the object x in the list
# print the multiplication of the values
}
Desired results:
"a"
2 4
"b"
2
In Python one can use dictionary and with the below code get the desired results:
lst = {"a":[1,2,3], "b":1}
for key, value in lst.items():
    print(key)
    print(value * 2)
Since R has no built-in dictionary data structure, I am trying to achieve this using lists, but I don't know how to return the object names. Any help would be appreciated.
We can get the names directly
names(lst)
[1] "a" "b"
Or, to print inside a loop, iterate over the names of the list, printing each name and then the corresponding element (extracted by name) multiplied by 2:
for(nm in names(lst)) {
print(nm)
print(lst[[nm]] * 2)
}
[1] "a"
[1] 2 4
[1] "b"
[1] 2
Or another option is iwalk
library(purrr)
iwalk(lst, ~ {print(.y); print(.x * 2)})
[1] "a"
[1] 2 4
[1] "b"
[1] 2
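As a further illustration of looping over the sequence rather than the names, here is a small base R sketch. It assumes every element of lst is named; the variable doubled is introduced only for this example and shows how to keep the results instead of just printing them.

```r
lst <- list(a = c(1, 2), b = 1)

# Loop by position; names(lst)[i] gives the name of the i-th element
for (i in seq_along(lst)) {
  print(names(lst)[i])
  print(lst[[i]] * 2)
}

# To keep the results, lapply() returns a list that preserves the names
doubled <- lapply(lst, function(v) v * 2)
```

Indexing by position also works when names are duplicated, which lookup by name does not handle well.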

How to use a list as an argument in a function in R

myfunction <- function(x) {
  if (x == "g") {
    g_var <- x
    g_nvar <- length(g_var)
  }
  return(g_nvar)
}
I have written the above function to count specific elements in a list. The argument x will be a list when I call the function, but R does not treat x as a list. How can I write a function such that, when I provide a list, the output is the count of the elements I have specified in the function?
m
[[1]]
[[1]][[1]]
[1] "g" "g" "h" "g" "g" "g" "k" "l"
[[2]]
[[2]][[1]]
[1] "g" "h" "k" "k" "l" "g"
Expected result
[[1]] 5 # No. of g
[[2]] 2 # No. of g
Similarly, I would like to obtain the counts for h, k and l. I pass m as x when calling the function.
For example: myfunction(m)
Your case is somewhat complicated by the fact that your m is not simply a list of character vectors, but, in your example, a list of 2 lists of 1 vector of characters, as would be generated by
m = list(strsplit("gghgggkl", ""), strsplit("ghkklg", ""))
If we want myfunction to operate on this data structure, we have to refer to the component of the length-1-lists with the operation [[1]] (see x[[1]] below), and, as loki suggested, we can use lapply to work on all components of the outer list, and sum with a logical expression to obtain the desired count:
myfunction = function(m) lapply(m, function(x) sum(x[[1]]=='g'))
myfunction(m)
result:
[[1]]
[1] 5
[[2]]
[1] 2
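The question also asks for the counts of h, k and l. A minimal sketch of one way to extend the same idea follows, assuming m has the nested structure shown above. The function name count_letters and its wanted argument are hypothetical names used only for illustration.

```r
m <- list(strsplit("gghgggkl", ""), strsplit("ghkklg", ""))

# For each component of m, count every letter in `wanted`;
# x[[1]] unwraps the inner length-1 list, as in myfunction above
count_letters <- function(m, wanted = c("g", "h", "k", "l")) {
  lapply(m, function(x) {
    vapply(wanted, function(ch) sum(x[[1]] == ch), integer(1))
  })
}

counts <- count_letters(m)
counts[[1]]  # named integer vector: g = 5, h = 1, k = 1, l = 1
counts[[2]]  # named integer vector: g = 2, h = 1, k = 2, l = 1
```

Each result is a named integer vector, so a single count can be read off with, for example, counts[[1]]["g"].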

Access second to last element of vectors nested in list in R

Similar questions exist, but they don't answer my specific question.
In a nested list, what's an elegant way of accessing the second-to-last element of each vector? Take the following list:
l <- list(c("a","b","c"),c("c","d","e","f"))
How do I produce a new list (or vector) which contains the second to last element of each vector in list l? The output should look like this:
[[1]]
[1] "b"
[[2]]
[1] "e"
I can access the last element of each vector via lapply(l, dplyr::last), but I am not sure how to select the second-to-last elements. Much appreciated.
Try this:
l <- list(c("a","b","c"),c("c","d","e","f"))
lapply(l, function(x) x[length(x) -1])
#> [[1]]
#> [1] "b"
#>
#> [[2]]
#> [1] "e"
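Note that x[length(x) - 1] returns an empty vector when x has only one element. The following sketch adds a guard for that case and also shows how to get a plain vector instead of a list. It assumes the list contains character vectors; second_to_last is a hypothetical helper name.

```r
l <- list(c("a", "b", "c"), c("c", "d", "e", "f"))

# Return the second-to-last element, or NA if the vector is too short
second_to_last <- function(x) {
  if (length(x) >= 2) x[length(x) - 1] else NA_character_
}

as_list   <- lapply(l, second_to_last)                # list("b", "e")
as_vector <- vapply(l, second_to_last, character(1))  # c("b", "e")
```

vapply is used for the vector form because it checks that every result is a single character value.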

extracting element of a vector based on index given in a list in R

I am looking to extract elements of a character vector based on the indices in a list. For example, I have a vector CV, and the indices that I would like to extract are in a list AL. I can extract them individually (through a loop), but I was wondering if there is a way to do it without a loop (perhaps using an apply function). I tried using sapply unsuccessfully.
CV = c("a","b","c","d")
AL = list(c(1,2),c(2,3,4),c(2))
CV[AL[[1]]]
[1] "a" "b"
sapply(CV,'[',AL)
Your problem is that
sapply(CV,'[',AL)
will (attempt to) iterate over each element of CV, but you want to iterate over each element of AL:
sapply(AL, function(z) CV[z])
# [[1]]
# [1] "a" "b"
#
# [[2]]
# [1] "b" "c" "d"
#
# [[3]]
# [1] "b"
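One caveat worth illustrating: sapply simplifies its result to a vector or matrix when every index set has the same length, so the return type depends on the data. A minimal sketch using lapply, which always returns a list, is shown below; the variable name picked is introduced only for this example.

```r
CV <- c("a", "b", "c", "d")
AL <- list(c(1, 2), c(2, 3, 4), c(2))

# lapply iterates over the index sets in AL and always returns a list
picked <- lapply(AL, function(idx) CV[idx])
picked
```

If downstream code expects a list regardless of the input, lapply is the safer choice of the two.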
