counting elements in a list based on another list - r

I have two lists looking like this:
mylist <- list(a=c(1:5),
b = c(5:12),
c = c(2:8))
list.id <- list(a=2, b=8, c=5)
I want to count the number of elements in each vector of mylist that are higher than the corresponding element in list.id, and divide the result by the length of that vector. I have written this function:
perm.fun <- function(x,y){length(which(x[[i]] < y[[i]]))/length(x[[i]])}
However, when I do lapply(mylist, perm.fun, list.id), I do not obtain the expected result.
Thanks

Using lapply, you would need to loop on the indices (1, 2, 3) so they can be used to extract the elements from both mylist and list.id:
perm.fun <- function(i, x, y) mean(x[[i]] > y[[i]])
lapply(seq_along(mylist), perm.fun, mylist, list.id)
But mapply is a much better tool for that task. From the doc:
mapply applies FUN to the first elements of each ... argument, the second elements, the third elements, and so on.
So your code can just be:
mapply(function(x, y) mean(x > y), mylist, list.id)
# a b c
# 0.6000000 0.5000000 0.4285714
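As a quick sanity check, here is a minimal sketch that runs the index-based approach and the mapply approach side by side on the data from the question. Using vapply instead of lapply is only my choice here, to get a plain numeric vector that is easy to compare.

```r
mylist <- list(a = 1:5, b = 5:12, c = 2:8)
list.id <- list(a = 2, b = 8, c = 5)

# Index-based version: loop over positions and pull the matching elements.
by_index <- vapply(seq_along(mylist),
                   function(i) mean(mylist[[i]] > list.id[[i]]),
                   numeric(1))

# Pairwise version: mapply walks both lists in parallel.
by_pair <- mapply(function(x, y) mean(x > y), mylist, list.id)

# Same proportions either way; mapply additionally keeps the names a, b, c.
all.equal(by_index, unname(by_pair))
```

Note that mean() of a logical vector is the proportion of TRUE values, which is why no explicit length() or which() is needed.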

Related

Select random and unique elements from a vector

Say I have a simple vector with repeated elements:
a <- c(1,1,1,2,2,3,3,3)
Is there a way to randomly select a unique element from each of the repeated elements? I.e. one random draw pointing which elements to keep would be:
1,4,6 ## here I selected the first 1, the first 2 and the first 3
And another:
1,5,8 ## here I selected the first 1, the second 2 and the third 3
I could do this with a loop for each repeated elements, but I am sure there must be a faster way to do this?
EDIT:
Ideally the solution should also always select a particular element if it is already a unique element. I.e. my vector could also be:
b <- c(1,1,1,2,2,3,3,3,4) ## The number four is unique and should always be drawn
Using base R ave we could do something like
unique(ave(seq_along(a), a, FUN = function(x) if(length(x) > 1) head(sample(x), 1) else x))
#[1] 3 5 6
unique(ave(seq_along(a), a, FUN = function(x) if(length(x) > 1) head(sample(x), 1) else x))
#[1] 3 4 7
This generates an index for every value of a, grouped by a and then selects one random index value in each group.
Using same logic with sapply and split
sapply(split(seq_along(a), a), function(x) if(length(x) > 1) head(sample(x), 1) else x)
And it would also work with tapply
tapply(seq_along(a), a, function(x) if(length(x) > 1) head(sample(x), 1) else x)
The reason why we need to check the length (if(length(x) > 1)) is because from ?sample
If x has length 1, is numeric (in the sense of is.numeric) and x >= 1, sampling via sample takes place from 1:x.
Hence, when only one number (n) is passed to sample(), it samples from 1:n (and not from n itself), so we need to check its length.
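To illustrate the length-one behaviour, here is a small sketch. The helper pick_one is a hypothetical name of mine, not part of base R; it samples a position with sample.int and indexes into x, so a single value is returned as-is without the explicit length check. The vector a below is also made up for the demonstration (its unique value sits at position 9).

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# A single number >= 1 is treated as "sample from 1:x", not "return x".
draws <- replicate(20, sample(10, 1))
range(draws)  # every draw lies somewhere in 1..10

# Hypothetical helper: sample a position, then index into x.
pick_one <- function(x) x[sample.int(length(x), 1)]

a <- c(1, 1, 1, 2, 2, 3, 3, 3, 10)  # the value 10 occurs once, at position 9
keep <- sapply(split(seq_along(a), a), pick_one)
a[keep]  # one element per distinct value; position 9 is always kept
```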

how to loop for division function in r

Suppose that I have a list of vectors (my real function is more complex than this example, and the number of vectors varies from one data set to another). The idea of my function is to divide each element of the first vector by the element-wise sum of all vectors in the list. That is, the denominator should be something like this:
mylist[[1]]+mylist[[2]]+....+mylist[[100]]
x <- c(1,2,3,4)
y <- c(3,4,5,6)
z <- list(x,y)
Then I would like to divide the first vector in the list, element by element, by the sum of the first and second vectors, such that:
z[[1]] / (z[[1]] + z[[2]])
The output is:
[1] 0.2500000 0.3333333 0.3750000 0.4000000
I would like the division to be done automatically, because the number of vectors inside the list varies. So I used lapply:
f <- lapply(z, function(x) (x/sum(x)))
The output is:
> f
[[1]]
[1] 0.1 0.2 0.3 0.4
[[2]]
[1] 0.1666667 0.2222222 0.2777778 0.3333333
The output is wrong: my function f divides each vector by the sum of its own elements instead of by the element-wise sum across vectors. So, how can I fix it? Any help, please?
I also tried a for loop:
for (i in 1:2){
z[[i]] / (sum(z[[i]]))
}
This also did not solve my problem. In the denominator I would like to sum across all the elements of the list.
After working hard on this problem, I found this to be the easiest way:
t <- list()
for (i in 1:2){
t[[i]] <- z[[i]] / Reduce("+", z)
}
And with lapply becomes:
lapply(1:2, function(i) z[[i]] / Reduce("+", z))
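Since the number of vectors varies, a minimal sketch that avoids the hard-coded 1:2 may help. It assumes all vectors in the list have the same length; the names total and shares are just illustrative.

```r
x <- c(1, 2, 3, 4)
y <- c(3, 4, 5, 6)
z <- list(x, y)

# Element-wise sum across all vectors, computed once.
total <- Reduce(`+`, z)

# Divide every vector by that total; works for any number of vectors.
shares <- lapply(z, function(v) v / total)
shares[[1]]
```

Computing the total once outside lapply also avoids repeating the Reduce call for every element.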

Create a vector with the sum of the positive elements of each column of a m*n numeric matrix in R

I need to create an R function which takes a numeric matrix A of arbitrary format n*m as input and returns a vector that is as long as A's number of columns that contains the sum of the positive elements of each column.
I must do this in 2 ways - the first in a nested loop and the second as a one liner using vector/matrix operations.
So far I have come up with the following code which creates the vector the size of the amounts of columns of matrix A but I can only seem to get it to give me the sum of all positive elements of the matrix instead of each column:
colSumPos <- function(A){
columns <- ncol(A)
v1 <- vector("numeric", columns)
for(i in 1:columns)
{
v1[i] <- sum(A[which(A>0)])
}
}
Could someone please explain how I get the sum of each column separately and then how I can simplify the code to dispose of the nested loop?
Thanks in advance for the help!
We can use apply with MARGIN=2 to loop through the columns and get the sum of elements that are greater than 0
apply(A, 2, function(x) sum(x[x >0], na.rm = TRUE))
#[1] 1.8036685 0.7129192 0.9305136 2.6625824 0.0000000
Or another option is colSums after replacing the values less than or equal to 0 with NA
colSums(A*NA^(A<=0), na.rm = TRUE)
#[1] 1.8036685 0.7129192 0.9305136 2.6625824 0.0000000
Or by more direct approach
colSums(replace(A, A<=0, NA), na.rm = TRUE)
#[1] 1.8036685 0.7129192 0.9305136 2.6625824 0.0000000
Or if there are no NA elements (no need for na.rm=TRUE), we can replace the values that are less than or equal to 0 with 0 and make it compact (as @ikop commented)
colSums(A*(A>0))
#[1] 1.8036685 0.7129192 0.9305136 2.6625824 0.0000000
data
set.seed(24)
A <- matrix(rnorm(25), 5, 5)
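To check these one-liners by hand, here is a small sketch on a made-up matrix B (hypothetical data, not the random A above) whose positive column sums are easy to work out: 4, 0 and 9.

```r
# Small hand-checkable matrix, filled column by column.
B <- matrix(c( 1, -2,  3,
              -1, -1, -1,
               0,  4,  5), nrow = 3)

via_apply   <- apply(B, 2, function(x) sum(x[x > 0], na.rm = TRUE))
via_replace <- colSums(replace(B, B <= 0, NA), na.rm = TRUE)
via_mask    <- colSums(B * (B > 0))

via_apply
```

The second column has no positive entries, so it shows that all three variants return 0 rather than NA in that case.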
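If you want to use a nested for loop, try the following code:
sumColum <- function(A){
  out <- numeric(ncol(A))              # one running total per column
  for(j in seq_len(ncol(A))){          # outer loop: columns
    for(i in seq_len(nrow(A))){        # inner loop: rows
      # add only positive, non-missing entries
      if(!is.na(A[i, j]) && A[i, j] > 0) out[j] <- out[j] + A[i, j]
    }
  }
  out
}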

product of elements of the vector

I want to write a function that, given a vector v computes the product of all the entries in v. (There is a function in R that does this, but I want to write one myself.)
I tried the following, but how can I make it return the product for any vector?
product <- function(v){
out <- 1
for(i in 1:length(v)){
out <- out*v[i]
}
out
}
If you use ... as the argument to your function, you can pass it several objects or just one. Inside the function, you can convert to a list and use Reduce to apply a function (*) recursively to the list. If you combine list, unlist and as.list you can make this very general. The following will work with a vector, or with 2 or more numbers, or a mixture of vectors and single numbers.
> product <- function(...) Reduce("*", as.list(unlist(list(...))))
> product(2, 7, 3)
[1] 42
> product(c(2, 7, 3))
[1] 42
> product(2, c(7, 3))
[1] 42
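If a plain loop is preferred over Reduce, here is a minimal sketch of the loop from the question with one change: it iterates over seq_along(v) instead of 1:length(v). The name product_loop is hypothetical. The change matters for an empty vector, where 1:length(v) would be 1:0 and the loop body would run on indices 1 and 0.

```r
# Hypothetical name; same idea as the loop in the question.
product_loop <- function(v) {
  out <- 1
  for (i in seq_along(v)) {  # empty sequence when v is empty
    out <- out * v[i]
  }
  out
}

product_loop(c(2, 7, 3))
product_loop(numeric(0))  # stays at the starting value 1, like prod(numeric(0))
```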
Using Recall for a recursive version (the base case is a vector of length 1, so single-element input also works):
prd2 <- function(x)
if(length(x) == 1) { x } else x[1] * Recall(x[-1])
prd2(c(2,3,4))
#[1] 24

R Combine mapply and sapply

I have a function in R that takes in 3 parameters, say foo(x,y,z).
When I call the function, I really have a list of elements for x and a list for y, but only one element for z. If z were a list and I wanted to apply foo to each element, mapply(foo, x, y, z) would work.
However, since z is not a list, mapply(foo,x,y,z) does not work.
More specifically, if x and y are lists of 3 elements each, the following does work however: mapply(foo, x, y, list(z, z, z)).
Is there a way I can perhaps combine mapply and sapply without me first making z into a list of 3 elements? I want z to just be reused!
Edit 1: I was asked for an example:
mylist1 <- list(c(5,4), c(7,9), c(8,3))
mylist2<- list(c(2,3), c(6,7), c(10,11))
item3 <- matrix(data = 15, nrow = 3, ncol = 3)
foo <- function(x,y,z){
return(x[1]+y[2]+z[2,2])
}
The following works:
> mapply(foo, mylist1, mylist2, list(item3,item3, item3))
[1] 23 29 34
The following does not work:
mapply(foo, mylist1, mylist2, item3)
Error in z[2, 2] : incorrect number of dimensions
Use the MoreArgs argument to mapply
mapply(foo, x = mylist1, y= mylist2, MoreArgs = list(z = item3))
## [1] 23 29 34
You just have to put the last item in a list, and R will recycle it just fine:
mapply(foo, mylist1, mylist2, list(item3))
Note that the documentation specifically says that the arguments you pass need to be:
arguments to vectorize over (vectors or lists of strictly positive length, or all of zero length)
and you were trying to pass a matrix.
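To see why the bare matrix fails, here is a small sketch using the data from the question. Instead of foo it uses a throwaway function that only reports whether z arrives as a matrix; the names seen_bare and seen_more are illustrative.

```r
mylist1 <- list(c(5, 4), c(7, 9), c(8, 3))
mylist2 <- list(c(2, 3), c(6, 7), c(10, 11))
item3 <- matrix(data = 15, nrow = 3, ncol = 3)

# Passed in ..., the matrix is vectorized over: one call per cell,
# and each call receives a single number as z.
seen_bare <- mapply(function(x, y, z) is.matrix(z), mylist1, mylist2, item3)

# Passed via MoreArgs, the whole matrix is handed to every call.
seen_more <- mapply(function(x, y, z) is.matrix(z), mylist1, mylist2,
                    MoreArgs = list(z = item3))

length(seen_bare)  # 9 calls, one per matrix cell
any(seen_bare)     # FALSE: z was never a matrix
seen_more          # TRUE TRUE TRUE
```

This is why z[2, 2] raises "incorrect number of dimensions" in the original call: by then z is a length-one numeric vector.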
