Why are these sequences reversed when generated with the colon operator? - r

I've noticed that when I try to generate a list of sequences with the : operator (without an anonymous function), the sequences are always reversed. Take the following example.
x <- c(4, 6, 3)
lapply(x, ":", from = 1)
# [[1]]
# [1] 4 3 2 1
#
# [[2]]
# [1] 6 5 4 3 2 1
#
# [[3]]
# [1] 3 2 1
But when I use seq, everything is fine.
lapply(x, seq, from = 1)
# [[1]]
# [1] 1 2 3 4
#
# [[2]]
# [1] 1 2 3 4 5 6
#
# [[3]]
# [1] 1 2 3
And help(":") states that
For other arguments from:to is equivalent to seq(from, to), and generates a sequence from from to to in steps of 1 or -1.
Why is the first list of sequences reversed?
Can I generate forward sequences this way, using the colon operator with lapply?
Or do I always have to use lapply(x, function(y) 1:y)?

The ":" operator is implemented as the primitive do_colon function in C. This primitive function does not have named arguments. It simply takes the first argument as the "from" and the second as the "to", ignoring any argument names. For example:
`:`(to=10, from=5)
# [1] 10 9 8 7 6 5
Additionally, lapply only passes each of its values as the leading unnamed argument in the function call. You cannot pass values to a primitive function via lapply as the second positional argument.
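If the goal is simply forward sequences without writing an anonymous function, a few alternatives sidestep the argument-order problem. The following is a minimal sketch, not part of the original answer; it assumes the same x as in the question, and the names forward_seq, forward_map and forward_fun are introduced here for illustration only.

```r
x <- c(4, 6, 3)

# seq_len(n) always counts up from 1 to n
forward_seq <- lapply(x, seq_len)

# Map() hands its arguments to `:` positionally,
# so 1 becomes "from" and each element of x becomes "to"
forward_map <- Map(`:`, 1, x)

# The anonymous-function version from the question, for comparison
forward_fun <- lapply(x, function(y) 1:y)

identical(forward_seq, forward_fun)
identical(forward_map, forward_fun)
```

seq_len is probably the most direct choice when the sequences always start at 1; Map is more general because both endpoints can vary.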

Related

Replace empty element in a list using sapply

I have two lists. The first one has an empty element. I'd like to replace that empty element with the first vector of the third list element of another list.
l1 <- list(a=1:3,b=4:9,c="")
l2 <- list(aa=11:13,bb=14:19,cc=data.frame(matrix(100:103,ncol=2)))
l1[sapply(l1, `[[`, 1)==""] <- l2[[3]][[1]]
Using sapply, I can identify which elements are empty. However, when I try to assign a vector to this empty element, I get this warning:
Warning message: In l1[sapply(l1, `[[`, 1) == ""] <- l2[[3]][[1]] :
number of items to replace is not a multiple of replacement length
This is only a warning, but the result I get is not the one I want. This is the l1 I get:
> l1
$a
[1] 1 2 3
$b
[1] 4 5 6 7 8 9
$c
[1] 100
This is what I need (two elements in $c):
> l1
$a
[1] 1 2 3
$b
[1] 4 5 6 7 8 9
$c
[1] 100 101
Just use l2[[3]][1] on the right-hand side (single [, not [[), so that the replacement is a one-element list rather than a bare vector.
The right-hand side should be a list, since you're replacing a list element. So you want that to be
... <- list(l2[[3]][[1]])
In addition, you might consider using !nzchar(l1) in place of sapply(...) == "". It might be more efficient. The final expression would be:
l1[!nzchar(l1)] <- list(l2[[3]][[1]])
giving the updated l1:
$a
[1] 1 2 3
$b
[1] 4 5 6 7 8 9
$c
[1] 100 101
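To see why the original assignment only warned and kept a single value, it can help to compare the two forms side by side. The following is a minimal sketch using l1 and l2 from the question; the names empty, l1_vec, l1_list and l1_dbl are introduced here for illustration only.

```r
l1 <- list(a = 1:3, b = 4:9, c = "")
l2 <- list(aa = 11:13, bb = 14:19, cc = data.frame(matrix(100:103, ncol = 2)))

empty <- !nzchar(l1)   # TRUE only for the "" element

# Bare vector on the right: two values compete for one list slot,
# so R warns and keeps only the first value
l1_vec <- l1
suppressWarnings(l1_vec[empty] <- l2[[3]][[1]])

# One-element list on the right: the whole vector fills the slot
l1_list <- l1
l1_list[empty] <- list(l2[[3]][[1]])

# When the target is a single known element, [[<- also works
l1_dbl <- l1
l1_dbl[["c"]] <- l2[[3]][[1]]

l1_vec$c
l1_list$c
identical(l1_list, l1_dbl)
```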

remove certain vectors from a list

I want to remove certain vectors from a list. I have for example this:
a<-c(1,2,5)
b<-c(1,1,1)
c<-c(1,2,3,4)
d<-c(1,2,3,4,5)
exampleList<-list(a,b,c,d)
exampleList returns of course:
[[1]]
[1] 1 2 5
[[2]]
[1] 1 1 1
[[3]]
[1] 1 2 3 4
[[4]]
[1] 1 2 3 4 5
Is there a way to remove certain vectors from a list in R? I want to remove all vectors in exampleList that contain both 1 and 5 (not vectors that contain only one of them, but both). Thanks in advance!
Use Filter:
filteredList <- Filter(function(v) !(1 %in% v & 5 %in% v), exampleList)
print(filteredList)
#> [[1]]
#> [1] 1 1 1
#>
#> [[2]]
#> [1] 1 2 3 4
Filter uses a functional style. The first argument you pass is a function that returns TRUE for an element you want to keep in the list, and FALSE for an element you want to remove from the list. The second argument is just the list itself.
We can use sapply on every list element and remove those elements where both the values 1 and 5 are present.
exampleList[!sapply(exampleList, function(x) any(x == 1) & any(x == 5))]
#[[1]]
#[1] 1 1 1
#[[2]]
#[1] 1 2 3 4
Here is a solution in two steps:
exampleList<-list(a=c(1,2,5), b=c(1,1,1), c=c(1,2,3,4), d=c(1,2,3,4,5))
L <- lapply(exampleList, function(x) if (!all(c(1,5) %in% x)) x)
L[!sapply(L, is.null)]
# $b
# [1] 1 1 1
#
# $c
# [1] 1 2 3 4
Here is a one-step variant without any definition of a new function
exampleList[!apply(sapply(exampleList, '%in%', x=c(1,5)), 2, all)]
(... but it has two calls to apply-functions)

R: how to find index of all repetition vector values order by unique vector without using loop?

I have a vector of integers like this:
a <- c(2,3,4,1,2,1,3,5,6,3,2)
values<-c(1,2,3,4,5,6)
I want to list, for every unique value in my vector (the unique values being ordered), the positions of its occurrences. My desired output:
rep_indx<-data.frame(c(4,6),c(1,5,11),c(2,7,10),c(3),c(8),c(9))
split fits pretty well here, which returns a list of indexes for each unique value in a:
indList <- split(seq_along(a), a)
indList
# $`1`
# [1] 4 6
#
# $`2`
# [1] 1 5 11
#
# $`3`
# [1] 2 7 10
#
# $`4`
# [1] 3
#
# $`5`
# [1] 8
#
# $`6`
# [1] 9
And you can access the index by passing the value as a character, i.e.:
indList[["1"]]
# [1] 4 6
You can do this, using sapply. The ordering that you need is ensured by the sort function.
sapply(sort(unique(a)), function(x) which(a %in% x))
#### [[1]]
#### [1] 4 6
####
#### [[2]]
#### [1] 1 5 11
#### ...
It will result in a list, giving the indices of your repetitions. It can't be a data.frame because a data.frame needs to have columns of same lengths.
sort(unique(a)) is exactly your values vector.
NOTE: you can also use lapply to force the output to be a list. With sapply, you get a list except if by chance the number of replicates is always the same, then the output will be a matrix... so, your choice!
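The simplification behaviour mentioned in the note can be seen with a small made-up vector in which every value occurs equally often. This is only an illustrative sketch; the vector b and the names as_sapply and as_lapply are not from the question.

```r
b <- c(1, 2, 1, 2)   # every value occurs exactly twice

as_sapply <- sapply(sort(unique(b)), function(v) which(b == v))
as_lapply <- lapply(sort(unique(b)), function(v) which(b == v))

is.matrix(as_sapply)   # sapply simplified the equal-length results
is.list(as_lapply)     # lapply always returns a list
```

If downstream code expects a list regardless of the data, lapply is the safer choice.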
Perhaps this also works
order(match(a, values))
#[1] 4 6 1 5 11 2 7 10 3 8 9
You can use the lapply function to return a list with the indexes.
lapply(values, function (x) which(a == x))

Check if there is overlap between elements of a list

I have a list of integers and I want to check if the elements are all unique ones.
set.seed(2)
x <- list(a=sample(10,3),b=sample(10,5),c=sample(10,7))
x
# $a
# [1] 2 7 5
# $b
# [1] 2 9 8 1 6
# $c
# [1] 5 10 9 2 8 1 7
For this example, all of the following situations fail the check: 1) 2 appears in all entries, 2) 5 appears in $a and $c, 3) 8 appears in $b and $c, 4) 1 appears in $b and $c, etc.
y <- list(a=c(1,3,5),b=c(7,4),c=c(6,10))
There is no overlapping between elements of y, so it passes the check.
The expected output should be just TRUE/FALSE, indicating whether the list passes the check.
You can convert the list to a vector with unlist and then check if any elements are duplicated in the vector with any and duplicated.
!any(duplicated(unlist(x)))
# [1] FALSE
!any(duplicated(unlist(y)))
# [1] TRUE
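Note that the unlist approach also flags a value that is repeated inside a single element. If only overlap between different elements should fail the check, one possible variant de-duplicates each element first. This is a sketch rather than part of the original answer: the helper names no_overlap and shared_values and the list z are made up here, and x is typed in from the printed output above instead of being regenerated with sample.

```r
no_overlap <- function(lst) {
  # Drop repeats within each element, then look for repeats across elements
  anyDuplicated(unlist(lapply(lst, unique), use.names = FALSE)) == 0L
}

shared_values <- function(lst) {
  # Values that occur in more than one element
  v <- unlist(lapply(lst, unique), use.names = FALSE)
  sort(unique(v[duplicated(v)]))
}

x <- list(a = c(2, 7, 5), b = c(2, 9, 8, 1, 6), c = c(5, 10, 9, 2, 8, 1, 7))
y <- list(a = c(1, 3, 5), b = c(7, 4), c = c(6, 10))
z <- list(a = c(1, 1), b = 2)   # repeat inside one element only

no_overlap(x)
no_overlap(y)
no_overlap(z)
shared_values(x)
```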

Assignment to the result of a function changes variable

Looking through the ave function, I found a remarkable line:
split(x, g) <- lapply(split(x, g), FUN) # From ave
Interestingly, this line changes the value of x, which I found unexpected. I expected that split(x,g) would result in a list, which could be assigned to, but discarded afterward. My question is, why does the value of x change?
Another example may explain better:
a <- data.frame(id=c(1,1,2,2), value=c(4,5,7,6))
# id value
# 1 1 4
# 2 1 5
# 3 2 7
# 4 2 6
split(a,a$id) # Split a row-wise by id into a list of size 2
# $`1`
# id value
# 1 1 4
# 2 1 5
# $`2`
# id value
# 3 2 7
# 4 2 6
# Find the row with highest value for each id
lapply(split(a,a$id),function(x) x[which.max(x$value),])
# $`1`
# id value
# 2 1 5
# $`2`
# id value
# 3 2 7
# Assigning to the split changes the data.frame a!
split(a,a$id)<-lapply(split(a,a$id),function(x) x[which.max(x$value),])
a
# id value
# 1 1 5
# 2 1 5
# 3 2 7
# 4 2 7
Not only has a changed, but it changed to a value that does not look like the right hand side of the assignment! Even if assigning to split(a,a$id) somehow changes a (which I don't understand), why does it result in a data.frame instead of a list?
Note that I understand that there are better ways to accomplish this task. My question is why does split(a,a$id)<-lapply(split(a,a$id),function(x) x[which.max(x$value),]) change a?
The help page for split says in its header: "The replacement forms replace values corresponding to such a division." So it really should not be unexpected, although I admit it is not widely used. I do not understand how your example illustrates that the assigned values "do not look like the RHS of the assignment!". The max values are assigned to the 'value' lists within categories defined by the second argument factor.
(I do thank you for the question. I had not realized that split<- was at the core of ave. I guess it is more widely used than I realized, since I think ave is a wonderfully useful function.)
Right after defining a, run split(a, a$id) = 1. The result is:
> a
id value
1 1 1
2 1 1
3 1 1
4 1 1
The key here is that split<- actually modifies the object on the LHS using the RHS values.
Here's an example:
> x <- c(1,2,3);
> split(x,x==2)
$`FALSE`
[1] 1 3
$`TRUE`
[1] 2
> split(x,x==2) <- split(c(10,20,30),c(10,20,30)==20)
> x
[1] 10 20 30
Note the line where I re-assign split(x,x==2) <- . This actually reassigns x.
As the comments below have stated, you can look up the definition of split<- like so
> `split<-.default`
function (x, f, drop = FALSE, ..., value)
{
    ix <- split(seq_along(x), f, drop = drop, ...)
    n <- length(value)
    j <- 0
    for (i in ix) {
        j <- j%%n + 1
        x[i] <- value[[j]]
    }
    x
}
<bytecode: 0x1e18ef8>
<environment: namespace:base>
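The reason x itself changes is R's general rule for replacement functions: an assignment of the form f(x, ...) <- value is evaluated roughly as x <- `f<-`(x, ..., value = value). The sketch below illustrates this with the split example from above; the copies x1, x2 and x3 and the names g, id and new_values are introduced here purely for illustration.

```r
x <- c(1, 2, 3)
g <- x == 2
new_values <- list(`FALSE` = c(10, 30), `TRUE` = 20)

# Replacement-function syntax
x1 <- x
split(x1, g) <- new_values

# Roughly what R does behind the scenes
x2 <- x
x2 <- `split<-`(x2, g, value = new_values)

identical(x1, x2)

# The same mechanism as the line in ave: each group is overwritten
# with the (recycled) result of the function applied to that group
x3 <- c(4, 5, 7, 6)
id <- c(1, 1, 2, 2)
split(x3, id) <- lapply(split(x3, id), max)
x3
```

The last assignment mirrors the data frame example in the question: each group's positions receive that group's single maximum, which is recycled to fill the group.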
