How to use stopifnot with a list in R

Suppose I have a list of vectors, and I would like to impose a condition on their lengths. That is, I would like my function to return an error if the lengths of these vectors are not equal.
For example,
x <- c(1:4)
y <- c(1:5)
z <- c(1:4)
k <- list(x, y, z)
I would like to check that their lengths are equal.
stopifnot(length(k[[1]]) == length(k[[2]]), length(k[[2]]) == length(k[[3]]))
How could I generalize this code and make it work for an arbitrary number of list elements?

We can use lengths with unique
stopifnot(length(unique(lengths(k)))==1)
Error: length(unique(lengths(k))) == 1 is not TRUE
lengths returns the length of each vector in the list as an integer vector. unique reduces that to its distinct values, and we check that exactly one distinct length remains. If that condition is not TRUE, stopifnot raises an error.
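As a sketch of how this check could be wrapped for reuse, the helper below reports the offending lengths in its error message. The name check_equal_lengths is hypothetical, and it uses stop() instead of stopifnot() only so that the message can be customised.

```r
# Hypothetical helper: raise an error unless all list elements have the same length.
check_equal_lengths <- function(lst) {
  n <- lengths(lst)                      # length of each element, as an integer vector
  if (length(unique(n)) > 1) {
    stop("elements have unequal lengths: ", paste(n, collapse = ", "))
  }
  invisible(TRUE)
}

k <- list(1:4, 1:5, 1:4)
# Capture the error message instead of aborting, for illustration.
res <- tryCatch(check_equal_lengths(k), error = function(e) conditionMessage(e))
res

ok <- check_equal_lengths(list(1:4, 5:8, 9:12))   # passes silently
```

An empty list or a list with a single element passes the check, since there is at most one distinct length.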

Related

How is it possible that X acts like a vector and a single value at the same time (R)?

I'll use the following code example:
minmax_norm <- function(X) {
  (X - min(X)) / diff(range(X))
}
dSeeds_minmax <- apply(dSeeds,2,minmax_norm)
How is it possible that X acts as a whole column of the data frame and also as each individual value in that column? I mean, min(X) is the minimum value of the column, but X - min(X) is the current value minus the minimum of the whole column.
Thanks for your attention.
You're almost always dealing with vectors in R. Even:
x <- 1
x here is just a vector of length 1. If R encounters a vector operation on two vectors of different lengths, like:
X - min(X)
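R recycles the shorter vector, repeating it until it matches the length of the longer one. If the longer length is not a multiple of the shorter, R still recycles but issues a warning. Here min(X) has length 1, so it is subtracted from every element of X. apply(dSeeds, 2, minmax_norm) passes each column to the function as the vector X, which is why X stands for a whole column and the arithmetic still works element by element.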
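To make the recycling concrete, here is a small sketch with a made-up vector (the values are illustrative and not taken from dSeeds). It shows that subtracting the length-1 vector min(X) is the same as subtracting a copy of it repeated to the length of X.

```r
X <- c(2, 4, 10)                       # stands in for one column passed in by apply()
m <- min(X)                            # a vector of length 1
centered <- X - m                      # m is recycled against every element of X
explicit <- X - rep(m, length(X))      # the same operation written out by hand
scaled <- centered / diff(range(X))    # min-max normalisation, as in minmax_norm
scaled
```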

Finding longest length out of 3 different vectors in R

I do not know if there is a function for this, but I have 3 data frames with different lengths. I was wondering if there is a way to find which one is the longest and assign it to a variable. For example:
x <- c(1:10)
y <- c(1:20)
z <- c(1:40)
I would want to use z as my variable because it has the longest length. Is there a function that can search through these 3 variables (x,y,z) and give me back the one with the longest length?
Thanks
We can place the vectors in a list, use lengths with which.max to find the index of the longest one, and extract that element from the list
lst[which.max(lengths(lst))]
data
lst <- list(x, y, z)
If you have data frames rather than vectors:
lst[which.max(sapply(lst,nrow))]
data
lst <- list(df1, df2, df3)
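One detail worth noting: single brackets return a list of length one, while double brackets return the element itself. A short sketch using the vectors from the question (the list names are added here for illustration only):

```r
x <- 1:10
y <- 1:20
z <- 1:40
lst <- list(x = x, y = y, z = z)   # names are an assumption, added for readability

idx <- which.max(lengths(lst))     # position of the first longest element
longest <- lst[[idx]]              # the vector itself, not a one-element list
length(longest)
names(lst)[idx]
```

If several elements tie for the maximum length, which.max returns the first of them.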

conditional vectorized matching to named list of vectors

I have a named list of vectors, y. The names of the list correspond to the values of variable, x. I need to return the value of the vector in y that matches the value of x at position i. For example, if x == "b" at index 25, I expect to return the 25th value of the "b" vector contained in the list y.
This is my current solution:
x <- sample(letters[1:4], 100, replace = TRUE)
y <- list("a"=rnorm(100), "b"=rnorm(100), "c"=rnorm(100))
i <- match(x, names(y))
m <- sapply(i, function(i) {out <- rep(0,3); out[i] <- 1; out})
final <- apply(t(m) * do.call(cbind, y), 1, sum)
I am hoping for something more idiomatic. As part of the solution, the answer should handle cases where values in x do not appear in the names of y.
The real world use case I am trying to solve is the case where I have several segmented model predictions applied to the entire population that I need to assign to their appropriate segment.
EDIT
Also, trying to avoid the clunky usage of ifelse. Since the names are known, I shouldn't have to specify them manually.
Using matrix subsetting with a two-column index matrix of (row, column) pairs, you could simply do the following. Values of x with no match in names(y) come back as NA:
do.call(cbind, y)[cbind(1:length(i), i)]
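The following sketch runs the matrix-indexing approach end to end and cross-checks it against an explicit loop. The seed is arbitrary, and the names mat and expected are introduced here for illustration. Note that unmatched values yield NA here, whereas the code in the question yields 0 for them.

```r
set.seed(1)                                      # arbitrary seed, for reproducibility
x <- sample(letters[1:4], 100, replace = TRUE)   # "d" has no matching vector in y
y <- list(a = rnorm(100), b = rnorm(100), c = rnorm(100))

i <- match(x, names(y))                          # column index per position; NA for "d"
mat <- do.call(cbind, y)                         # 100 x 3 matrix, one column per name
final <- mat[cbind(seq_along(i), i)]             # element (r, i[r]) for each position r

# Cross-check against a plain loop over positions.
expected <- vapply(seq_along(x), function(r) {
  if (is.na(i[r])) NA_real_ else y[[i[r]]][r]
}, numeric(1))
```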

Returning head and tail means from list of vectors

I need to calculate the mean (or other summary functions) on the top x and bottom x portions on list of vectors of varying lengths.
Here is a list of 3 vectors of different lengths similar in format with what I am working with:
t <- list(a = exp(-4:3), b = exp(-2:12), c = exp(-5:3))
Ideally, I would like a single vector of numbers for each type of mean (I manually ran mean(head(t$a, 2)) and mean(tail(t$a, 2)) for each vector):
Ideal output yielding a nameless vector of means of the first two elements from each vector:
[1] 0.2516074 1.859141 0.09256118
Second vector of means for last two entries in each vector:
[1] 1.859141 15064.77 1.859141
Looking for a clever lapply-type construct to get a vector of numbers for each means without the attached names (in this case a,b,c). Thanks!
What about
n = 2
v = lapply(t, function(i) mean(head(i, n)))
The variable v is a list. To get a vector, just use unlist
v = unlist(v)
To drop the names and keep only the numbers, use as.vector
as.vector(v)
For the tail, just use
lapply(t, function(i) mean(tail(i, n)))
Using sapply, you can compute both means in a single call:
sapply(t, function(x, n = 2)
c(mean(head(x, n)), mean(tail(x, n))))
# Returns a 2 x 3 matrix with columns a, b, c.
# Row 1 holds the head means: 0.03405135 0.2516074 0.01252679
# Row 2 holds the corresponding tail means.
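As a variation, here is a minimal sketch that returns the two nameless vectors directly. It uses vapply, which also checks that each result is a single number, and unname to drop the a, b, c labels. The names head_means and tail_means are introduced here for illustration.

```r
t <- list(a = exp(-4:3), b = exp(-2:12), c = exp(-5:3))   # data from the question
n <- 2

# One mean per list element, returned as a plain numeric vector without names.
head_means <- unname(vapply(t, function(v) mean(head(v, n)), numeric(1)))
tail_means <- unname(vapply(t, function(v) mean(tail(v, n)), numeric(1)))
```

Swapping mean for another summary function such as median only requires changing the function called inside each anonymous function.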

Subsetting a list of different data types

I have a list of different data types (factors, data.frames, and vectors, all the same length or number of rows). What I would like to do is subset each element of the list by a vector (let's call it rows) that represents row names.
If it was a data.frame() I would:
x <- x[rows,]
If it was a vector() or factor() I would:
x <- x[rows]
So, I've been playing around with this:
x <- lapply(my_list, function(x) ifelse(is.data.frame(x), x[rows,], x[rows]))
So, how do I accomplish my goal of getting a list of subsetted data?
I think this is YAIEP (Yet Another If Else Problem). From ?ifelse:
ifelse returns a value with the same shape as test which is filled
with elements selected from either yes or no depending on whether the
element of test is TRUE or FALSE.
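See the trouble? Same shape as test. Here test is is.data.frame(x), a single TRUE or FALSE, so ifelse returns a result of length one instead of the whole subset.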
So just do this:
l <- list(a = data.frame(x=1:10,y=1:10),b = 1:10, c = factor(letters[1:20]))
rows <- 1:3
fun <- function(x) {
  if (is.data.frame(x)) {
    x[rows, ]
  } else {
    x[rows]
  }
}
lapply(l,fun)
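To see the difference in isolation, here is a small sketch on a plain vector. The names v, bad and good are introduced here for illustration. ifelse is vectorised over its test, so a length-one test produces a length-one result, while if/else evaluates and returns a whole branch.

```r
rows <- 1:3
v <- 1:10

# ifelse(): the test has length 1, so only the first element of v[rows] survives.
bad <- ifelse(is.data.frame(v), NA, v[rows])

# if/else: the chosen branch is returned in full.
good <- if (is.data.frame(v)) v[rows, ] else v[rows]

length(bad)
length(good)
```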
