I have a list of records:
z <- list(list(a=1),list(a=4),list(a=2))
and I try to add fields to each of them.
Alas, neither
lapply(z,function(l) l$b <- 1+l$a)
nor
for(l in z) l$b <- 1+l$a
modifies z.
In this simple case I can, of course, do
z <- lapply(z,function(l) c(list(b= 1+l$a),l))
but this quickly gets out of hand when the lists have more nesting:
z <- list(list(a=list(b=1)),list(a=list(b=4)),list(a=list(b=2)))
How do I turn it into
list(list(a=list(b=1,c=2)),list(a=list(b=4,c=5)),list(a=list(b=2,c=3)))
without repeating the definition of the whole structure?
Each element of z has many fields, not just a; and z[[10]]$a has many subfields, not just b.
Your first code example doesn't modify the list because you need to return the list in your call to lapply:
z <- list(list(a=1),list(a=4),list(a=2))
expected <- list(list(a=1, b=2), list(a=4, b=5), list(a=2, b=3))
outcome <- lapply(z,function(l) {l$b <- 1+l$a ; l})
all.equal(expected, outcome)
# [1] TRUE
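The for loop in the question fails for a related reason: l is a copy of each element, so assigning to it never touches z. A minimal sketch of an index-based loop that does modify z, using the data from the question:

```r
z <- list(list(a = 1), list(a = 4), list(a = 2))

# Loop over positions, not elements, so the assignment targets z itself
for (i in seq_along(z)) {
  z[[i]]$b <- 1 + z[[i]]$a
}

str(z)
```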
In the doubly nested example, you could use lapply within lapply, again making sure to return the list in the inner lapply:
z <- list(list(a=list(b=1)),list(a=list(b=4)),list(a=list(b=2)))
expected <- list(list(a=list(b=1, c=2)), list(a=list(b=4, c=5)), list(a=list(b=2, c=3)))
obtained <- lapply(z, function(l1) { lapply(l1, function(l2) {l2$c = l2$b+1 ; l2 } )})
all.equal(expected, obtained)
# [1] TRUE
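Note that the nested lapply applies the inner function to every field of each record. If the records have other fields besides a, as the question states, a sketch that touches only the a sub-list could look like this (the field other is made up for illustration):

```r
z <- list(list(a = list(b = 1), other = "x"),
          list(a = list(b = 4), other = "y"))

# Nested $<- works on the local copy; return the record so lapply keeps it
z2 <- lapply(z, function(l) {
  l$a$c <- l$a$b + 1
  l
})

str(z2)
```

Only a gains the new subfield c; the other fields are passed through unchanged.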
Another, somewhat convoluted, option:
z <- list(list(a=1),list(a=4),list(a=2))
res <- list(list(a=list(b=1,c=2)),list(a=list(b=4,c=5)),list(a=list(b=2,c=3)))
res1 <- rapply(z,function(x) list(b = x,c = x+1),how = "replace")
all.equal(res,res1)
# [1] TRUE
I only say convoluted because rapply can be tricky to use at times (for me at least).
g <- c(1,2,3,4)
vapply(g, function(x) {
length(x) # This prints out a value of 1.
# Is there code I can write that will allow me to uniquely access just the second element in this vector?
})
I am looking to use vapply specifically, but would like to know how I can access just a single element (any index I want) in the vector that vapply is being acted on. How can I achieve this?
The usual R workaround for something like this is to iterate over the indices:
g <- c(1,2,3,4)
vapply(seq_along(g), function(i) {
  ## i is your index number
  ## g[i] is the ith element in g
  g[i] # return a single double, as promised by double(1)
}, double(1))
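As a concrete sketch of the index-based approach, the following treats the second element differently from the rest. The multiply-by-10 rule is only a placeholder.

```r
g <- c(1, 2, 3, 4)

# Multiply only the second element by 10; leave the others unchanged
out <- vapply(seq_along(g), function(i) {
  if (i == 2) g[i] * 10 else g[i]
}, double(1))

out
# [1]  1 20  3  4
```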
When I have a list or vector of things and I want to vapply (or lapply or ...) on one or more of them (technically zero-or-more), I use a "subset-assignment" technique:
g <- c(1, 2, 3, 4)
ind <- 2
### could also be:
# ind <- c(2, 4)
# ind <- c(TRUE, FALSE, TRUE, TRUE) # logical variant
# ind <- c() # empty
g[ind] <- lapply(g[ind], function(x) ...)
The result does not need to be assigned back into g; it could be a new object:
newg <- lapply(g[ind], function(x) ...)
where in this case length(newg) would be the same as length(ind) (or sum(ind) if using the logical variant).
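A concrete sketch of the subset-assignment idea. Here vapply is used instead of lapply so that g stays a numeric vector (assigning a list into g would turn g into a list); the multiply-by-10 function is only a placeholder.

```r
g <- c(1, 2, 3, 4)
ind <- c(2, 4)

# Transform only the selected positions and write them back
g[ind] <- vapply(g[ind], function(x) x * 10, numeric(1))

g
# [1]  1 20  3 40
```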
I'm trying to make a list with 10 elements, each element consisting of 5 * i items drawn from a uniform distribution, i being the ith entry, and I want to use lapply.
Currently I made this function:
z_list <- list()
z_list_generator <- function(n) {
for(i in 1:n){
a <- runif(5 * i)
tmp <- list(a)
mybiglist[[i]] <- tmp
}
mybiglist
}
This function does give the correct outcome when I just call z_list_generator(2): it prints a list with the first element consisting of 5 elements and the second of 10 elements.
What I want to achieve is that I do lapply(some number, z_list_generator) such that it generates this same list, and such that when I do length(lapply(some number, z_list_generator)), the outcome is 'some number'.
Do you mean something like this?
z_list_generator <- function(k) lapply(1:k, function(i) runif(5 * i))
set.seed(2018) # Fixed random seed for reproducibility
z_list_generator(2)
#[[1]]
#[1] 0.33615347 0.46372327 0.06058539 0.19743361 0.47431419
#
#[[2]]
# [1] 0.3010486 0.6067589 0.1300121 0.9586547 0.5468495 0.3956160 0.6645386
# [8] 0.9821123 0.6782154 0.8060278
length(z_list_generator(2))
#[1] 2
Your z_list_generator is strange.
1) You do not initialise mybiglist in your function code. It probably relies on some global variable of that name.
2) You assign each element of mybiglist another list (of length 1), whose first element contains a sample from a uniform distribution. Better to assign a, not tmp, there.
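Putting both points together, a loop-based version that keeps the structure of the original function might look like this. It is a sketch: the list is created inside the function and the sample is stored directly.

```r
z_list_generator <- function(n) {
  mybiglist <- vector("list", n)    # initialise inside the function
  for (i in seq_len(n)) {
    mybiglist[[i]] <- runif(5 * i)  # store the sample itself, not list(a)
  }
  mybiglist
}

lengths(z_list_generator(3))
# [1]  5 10 15
```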
my_list = list()
my_list[[1]] = c("Fast","Slow","Heavy","Light")
my_list[[2]] = c("Fast","Small","Intelligent","Light")
my_list[[3]] = c("Dumb","Slow","Heavy","Light")
my_list[[4]] = c("Slow","Intelligent","Dumb","Heavy")
my_list[[5]] = c("Heavy","Light","Intelligent","Tall")
This is a simplified version of what I am trying to do. How can I filter a list so that any vector containing two contradictory strings (i.e. Fast and Slow, Tall and Small, Heavy and Light, or Intelligent and Dumb) is removed, leaving a final list of sensible vectors?
I have been trying to do this with an IF function, is that the most appropriate way?
This would be what you want:
cont_check <- function(x) {
cont_words <- list(c("Fast", "Slow"),
c("Heavy", "Light"))
found <- sum(sapply(cont_words, function (y) sum(y %in% x) == 2)) > 0
return(found)
}
sapply(my_list, cont_check)
# keep only the vectors that contain no contradictory pair:
# my_list[!sapply(my_list, cont_check)]
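A variant of the same idea, sketched with all four pairs from the question and Filter to keep only the non-contradictory vectors. The names pairs and has_contradiction are made up for this example.

```r
my_list <- list(
  c("Fast", "Slow", "Heavy", "Light"),
  c("Fast", "Small", "Intelligent", "Light"),
  c("Dumb", "Slow", "Heavy", "Light"),
  c("Slow", "Intelligent", "Dumb", "Heavy"),
  c("Heavy", "Light", "Intelligent", "Tall")
)

pairs <- list(c("Fast", "Slow"), c("Tall", "Small"),
              c("Heavy", "Light"), c("Intelligent", "Dumb"))

# TRUE if x contains both words of at least one pair
has_contradiction <- function(x) {
  any(vapply(pairs, function(p) all(p %in% x), logical(1)))
}

kept <- Filter(Negate(has_contradiction), my_list)
kept
```

With this data only the second vector survives, since every other vector contains at least one complete pair.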
Say I have a list, preds, that contains 3 different sets of predictions based on unique models.
set.seed(206)
preds <- list(rnorm(1:100), rnorm(1:100), rnorm(1:100))
I am having trouble iterating through the list and creating a new list. I want a list of new vectors that are composed of the corresponding index in the other vectors. So for example, I can get the first element in each vector of the list using sapply() like this:
sapply(preds, "[[", 1)
Which works well, but I cannot figure out how to iterate through each element in each vector to construct a new list. I want something like this (doesn't work):
sapply(preds, "[[", 1:100)
Using the above seed and preds list, my desired output would be something like:
first_element <- sapply(preds, "[[", 1)
second_element <- sapply(preds, "[[", 2)
third_element <- sapply(preds, "[[", 3)
desired_result <- rbind(first_element, second_element, third_element)
But for all elements in the list, which in this example is 100.
Please let me know if this is not clear, and I appreciate any help!
We need to use [ rather than [[, because [[ extracts exactly one element while [ accepts a vector of indices:
res <- sapply(preds, `[`, 1:3 )
all.equal(res, desired_result, check.attributes=FALSE)
#[1] TRUE
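To cover all 100 elements rather than the first three, the same call can be given the full index range. A sketch, assuming every vector in preds has the same length (res_all is a name chosen for this example):

```r
set.seed(206)
preds <- list(rnorm(100), rnorm(100), rnorm(100))

# One row per element position, one column per model
res_all <- sapply(preds, `[`, seq_along(preds[[1]]))
dim(res_all)
# [1] 100   3

# Row i holds the i-th prediction of every model
res_all[1, ]
```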
Here's an alternative approach:
nth_element <- function(dat, n){
res <- sapply(dat, "[[", n)
res
}
res2 <- do.call(rbind, lapply(c(1:3), function(x) nth_element(dat = preds, n = x)))
Borrowing akrun's check code:
all.equal(res2, desired_result, check.attributes=FALSE)
I have a vector sliced from a data frame.
I want to count the number of times a certain sequence appears in it.
For example
main <- c(A,B,C,A,B,V,A,B,C,D,E)
p <- c(A,B,C)
q <- c(A,B)
someFunction(main,p)
2
someFunction(main,q)
3
I've been messing around with rle, but it also counts every sub-repetition, which is undesirable.
Is there a quick solution I'm missing?
You can use one of the regular expression tools in R since this is really a pattern matching exercise, specifically gregexpr for this question. The p and q vectors represent the search pattern and main is where we want to search for those patterns. From the help page for gregexpr:
gregexpr returns a list of the same length as text each element of which is of
the same form as the return value for regexpr, except that the starting positions
of every (disjoint) match are given.
So we can count the starting positions in the first element of the list returned by gregexpr. Because gregexpr reports -1 when there is no match, we count only the positive positions. We'll first collapse the vectors and then do the searching:
someFunction <- function(haystack, needle) {
  haystack <- paste(haystack, collapse = "")
  needle <- paste(needle, collapse = "")
  out <- gregexpr(needle, haystack)
  # gregexpr returns -1 for "no match", so count only real positions
  out.length <- sum(out[[1]] > 0)
  return(out.length)
}
> someFunction(main, p)
[1] 2
> someFunction(main, q)
[1] 3
Note: you also need to put quotes around the elements of your main, p, and q vectors, unless you have variables A, B, C, etc. defined.
main <- c("A","B","C","A","B","V","A","B","C","D","E")
p <- c("A","B","C")
q <- c("A","B")
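A detail of gregexpr that affects counting: when nothing matches, it still returns a vector of length one containing -1. A small sketch of both cases (fixed = TRUE is added here so the needle is treated as a literal string rather than a regular expression; hit and miss are names chosen for this example):

```r
hit  <- gregexpr("ABC", "ABCABVABCDE", fixed = TRUE)[[1]]
miss <- gregexpr("XYZ", "ABCABVABCDE", fixed = TRUE)[[1]]

as.integer(hit)    # start positions of the matches
# [1] 1 7
as.integer(miss)   # -1 signals "no match"
# [1] -1

# Counting positive positions gives 0 for the no-match case
sum(hit > 0)
sum(miss > 0)
```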
I'm not sure if this is the best way, but you can do it recursively:
f <- function(a,b) {
  if (length(a) > length(b)) 0
  else all(head(b, length(a)) == a) + Recall(a, tail(b, -1))
}
Someone may or may not find a built-in function.
Using sapply:
find_x_in_y <- function(x, y){
  sum(sapply(
    # "+ 1" so that the final window at the end of y is also checked
    seq_len(length(y)-length(x)+1),
    function(i)as.numeric(all(y[i:(i+length(x)-1)]==x))
  ))
}
find_x_in_y(c("A", "B", "C"), main)
[1] 2
find_x_in_y(c("A", "B"), main)
[1] 3
Here's a way to do it using embed(v,n), which returns a matrix of all n-length sub-sequences of vector v:
find_x_in_y <- function(x, y)
sum( apply( embed( y, length(x)), 1,
identical, rev(x)))
> find_x_in_y(p, main)
[1] 2
> find_x_in_y(q, main)
[1] 3
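The rev(x) is needed because embed stores each window in reverse order. A small sketch that makes this visible on a short integer vector:

```r
m <- embed(1:5, 2)
m
#      [,1] [,2]
# [1,]    2    1
# [2,]    3    2
# [3,]    4    3
# [4,]    5    4

# Each row is a window of the input read backwards, e.g. row 1 is c(2, 1)
identical(m[1, ], rev(1:2))
# [1] TRUE
```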