Triplicates in R

I have a set of 80 samples, with 2 variables, each measured as triplicate:
sample var1a var1b var1c var2a var2b var2c
1 -169.784 -155.414 -146.555 -175.295 -159.534 -132.511
2 -180.577 -180.792 -178.192 -177.294 -171.809 -166.147
3 -178.605 -184.183 -177.672 -167.321 -168.572 -165.335
and so on. How do I apply functions like mean, sd, se etc. for each row for var1 and var2? Also, the dataset contains NAs. Thanks for bothering with such basic questions

What is your expected result when there are NAs? apply(df[-1], 1, mean) (or whatever function) will work, but it would give NA as a result for the row. If you can replace NA with 0 then you could do df[is.na(df)] <- 0 first, and then the apply function in order to get the results.
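If the missing replicates should simply be ignored rather than counted as zeros, one option is to pass na.rm = TRUE through apply() to mean(). The sketch below uses a small made-up data frame in the question's layout (the NA position is an assumption for illustration) and contrasts the two choices:

```r
# Hypothetical data in the layout from the question, with one missing value
df <- data.frame(
  sample = 1:3,
  var1a = c(-169.784, -180.577, -178.605),
  var1b = c(-155.414, NA, -184.183),
  var1c = c(-146.555, -178.192, -177.672)
)

# Extra arguments to apply() are handed on to mean(), so NAs are skipped
row_means <- apply(df[-1], 1, mean, na.rm = TRUE)

# Replacing NA with 0 instead pulls the mean of row 2 toward zero
df0 <- df
df0[is.na(df0)] <- 0
row_means_zero <- apply(df0[-1], 1, mean)
```

Here row 2 averages the two remaining replicates with na.rm = TRUE, whereas the zero-filled version averages three values, one of which is an artificial 0.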

One approach could be to reshape your data set. Another one might be just apply a function over rows of a subset of the data frame.
So, for var2X you have:
apply(dat[5:7], 1, function(x){m <- mean(x); s <- sd(x); da <-c(m, s) })
[,1] [,2] [,3]
[1,] -155.78000 -171.750000 -167.076000
[2,] 21.63763 5.573734 1.632348
and for var1X:
apply(dat[2:4], 1, function(x){m <- mean(x); s <- sd(x); da <-c(m, s) })
[,1] [,2] [,3]
[1,] -157.25100 -179.853667 -180.153333
[2,] 11.72295 1.443055 3.520835
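The question also asks for the standard error and mentions NAs. A minimal sketch of how the same apply() call could cover both, assuming the three sample rows shown in the question and a hypothetical helper named row_stats:

```r
dat <- data.frame(
  sample = 1:3,
  var1a = c(-169.784, -180.577, -178.605),
  var1b = c(-155.414, -180.792, -184.183),
  var1c = c(-146.555, -178.192, -177.672),
  var2a = c(-175.295, -177.294, -167.321),
  var2b = c(-159.534, -171.809, -168.572),
  var2c = c(-132.511, -166.147, -165.335)
)

# Hypothetical helper: mean, sd and standard error of one row, ignoring NAs
row_stats <- function(x) {
  n <- sum(!is.na(x))          # number of replicates actually measured
  s <- sd(x, na.rm = TRUE)
  c(mean = mean(x, na.rm = TRUE), sd = s, se = s / sqrt(n))
}

# t() puts one sample per row and one statistic per column
var1_stats <- t(apply(dat[2:4], 1, row_stats))
var2_stats <- t(apply(dat[5:7], 1, row_stats))
```

Dividing by the count of non-missing values, rather than a fixed 3, keeps the standard error meaningful for rows where a replicate is missing.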

Related

How to write an apply() function to limit each element in a matrix column to a maximum allowable value?

I'm trying to learn how to use the apply() functions.
Suppose we have a 3 row, 2 column matrix of test <- matrix(c(1,2,3,4,5,6), ncol = 2), and we would like the maximum value of each element in the first column (1, 2, 3) to not exceed 2 for example, so we end up with a matrix of (1,2,2,4,5,6).
How would one write an apply() function to do this?
Here's my latest attempt: test1 <- apply(test[,1], 2, function(x) {if(x > 2){return(x = 2)} else {return(x)}})
We may use pmin on the first column with value 2 as the second argument, so that it does elementwise checking with the recycled 2 and gets the minimum for each value from the first column
test[,1] <- pmin(test[,1], 2)
-output
> test
[,1] [,2]
[1,] 1 4
[2,] 2 5
[3,] 2 6
Note that apply needs its 'X' argument to be an array/matrix, or another object with dimensions. When we subset only a single column or row, the dimensions are dropped because drop = TRUE is the default, so test[,1] is a plain vector and apply fails on it.
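The dropped dimensions can be seen directly. This is a small sketch using the test matrix from the question; the names res and capped are just illustrative:

```r
test <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2)

dim(test[, 1])                # NULL: a plain vector, dimensions dropped
dim(test[, 1, drop = FALSE])  # 3 1: still a matrix

# apply() on the dimensionless vector stops with an error
res <- tryCatch(apply(test[, 1], 2, identity), error = function(e) "error")

# With drop = FALSE the column stays a 3 x 1 matrix and apply() works
capped <- apply(test[, 1, drop = FALSE], 2, function(x) pmin(x, 2))
```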
If you really want to use the apply() function, I guess you're looking for something like this:
t(apply(test, 1, function(x) c(min(x[1], 2), x[2])))
## [,1] [,2]
## [1,] 1 4
## [2,] 2 5
## [3,] 2 6
But if you want my opinion, akrun's suggestion is definitely better.

How to write an apply() function that only applies to odd-numbered columns in r matrix?

Suppose we have a "test" matrix that looks like this: (1,2,3, 4,5,6, 7,8,9, 10,11,12) generated by running test <- matrix(1:12, ncol = 4). A simple 3 x 4 (rows x columns) matrix of numbers running from 1 to 12.
Now suppose we'd like to add a value of 1 to each element in each odd-numbered matrix column, so we end up with a matrix of the following values: (2,3,4, 4,5,6, 8,9,10, 10,11,12). How would we use an apply() function to do this?
Note that this is a simplified example. In the more complete code I'm working with, the matrix dynamically expands/contracts based on user inputs so I need an apply() function that counts the actual number of matrix columns, rather than using a fixed assumption of 4 columns per the above example. (And I'm not adding a value of 1 to the elements; I'm running the parallel minima function test[,1] <- pmin(test1[,1], 5) to say limit each value to a max of 5).
With my current limited understanding of the apply() family of functions, all I can so far do is apply(test, 2, function(x) {return(x+1)}) but this is adding a value of 1 to all elements in all columns rather than only the odd-numbered columns.
You may simply subset the input data frame to access only odd or even numbered columns. Consider:
test[c(TRUE, FALSE)] <- apply(test[c(TRUE, FALSE)], 2, function(x) f(x))
test[c(FALSE, TRUE)] <- apply(test[c(FALSE, TRUE)], 2, function(x) f(x))
This works because the recycling rules in R will cause e.g. c(TRUE, FALSE) to be repeated however many times is needed to cover all columns in the input test data frame.
For a matrix, we need to use the drop=FALSE flag when subsetting the matrix in order to keep it in matrix form when using apply():
test <- matrix(1:12, ncol = 4)
test[,c(TRUE, FALSE)] <- apply(test[,c(TRUE, FALSE),drop=FALSE], 2, function(x) x+1)
test
[,1] [,2] [,3] [,4]
[1,] 2 4 8 10
[2,] 3 5 9 11
[3,] 4 6 10 12
^ ^ ... these columns incremented by 1
You may use modulo %% 2.
odd <- seq_len(ncol(test)) %% 2 == 1
test[, odd] <- apply(test[, odd, drop = FALSE], 2, function(x) {return(x + 1)})
# [,1] [,2] [,3] [,4]
# [1,] 2 4 8 10
# [2,] 3 5 9 11
# [3,] 4 6 10 12
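For the actual use case mentioned in the question (capping values at 5 with pmin), apply() may not be needed at all, since pmin() is already vectorised. A sketch under that assumption, reusing the modulo index so it adapts to any number of columns:

```r
test <- matrix(1:12, ncol = 4)

# Logical index of odd-numbered columns, whatever ncol(test) happens to be
odd <- seq_len(ncol(test)) %% 2 == 1

# pmin() works elementwise, so it can cap the whole sub-matrix at once
test[, odd] <- pmin(test[, odd], 5)
```

After this, columns 1 and 3 hold 1, 2, 3 and 5, 5, 5, while the even-numbered columns are unchanged.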

How can we add two matrices with different rows and columns in R?

I have two matrices. For example:
temp1 <- matrix(c(1,2,3,4,5,6),2,3,byrow = T)
temp2 <- matrix(c(7,8,9),1,3,byrow = T)
temp1
[,1] [,2] [,3]
[1,] 1 2 3
[2,] 4 5 6
temp2
[,1] [,2] [,3]
[1,] 7 8 9
I have two matrices with the same number of columns, but different numbers of rows. I would like to add the single row of temp2 to every row of temp1, as follows. I wonder if there is a way to do this in R without for statements and apply functions.
temp <- do.call(rbind,lapply(1:2,function(x){temp1[x,]+temp2}))
temp
[,1] [,2] [,3]
[1,] 8 10 12
[2,] 11 13 15
This example is simple, but in practice I need to do the above with a 100 * 100 matrix and a 1 * 100 matrix. In this case, it takes too long, so I do not want to use for statements and apply functions.
You can use ?sweep:
temp1 <- matrix(c(1,2,3,4,5,6),2,3,byrow = T)
temp2 <- matrix(c(7,8,9),1,3,byrow = T)
sweep(temp1, 2, temp2, '+')
Unfortunately the help for sweep is really difficult to understand, but in this example you apply the function '+' with the argument temp2 along the second dimension (the columns) of temp1, so temp2[j] is added to every element of column j.
For more examples, see: How to use the 'sweep' function
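At the 100 * 100 scale mentioned in the question, the same call should carry over unchanged. The sketch below uses random data (the seed and the names big, row_vec, swept and stacked are assumptions for illustration) and cross-checks sweep against plain matrix addition on a row-replicated copy:

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
big <- matrix(rnorm(100 * 100), nrow = 100)
row_vec <- matrix(rnorm(100), nrow = 1)

# sweep() adds row_vec[j] to every element of column j
swept <- sweep(big, 2, row_vec, "+")

# Same result by stacking copies of the single row to match big's shape
stacked <- big + row_vec[rep(1, nrow(big)), ]
```

Both forms avoid an explicit R-level loop over rows; which one is faster in practice would need to be measured on the real data.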

applying list of functions columns in data frame

I have 2 variables, a and b. a is potentially very large. b is always a vector of functions that can be applied to each column in the data frame a.
a <- data.frame(col1=c(1, 2, 3), col2=c(4, 5, 6))
b <- c(as.double, function(x) {1+x})
So the result that I want is that function b[1] is applied to col1, b[2] is applied to col2, and so on for all columns. I feel lapply should be used here but documentation seems to say that it can only have one function. I could use a loop I suppose but a "vectorised" way would be nice.
This is one way to do it:
results <- mapply(function(i,j) b[[i]](a[[j]]), i=1:length(b), j=1:length(a))
It gives you:
> results
[,1] [,2]
[1,] 1 5
[2,] 2 6
[3,] 3 7
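An alternative sketch with Map(), which pairs the k-th function with the k-th column directly instead of going through integer indices. It assumes b has exactly one function per column of a; res_list and res are illustrative names:

```r
a <- data.frame(col1 = c(1, 2, 3), col2 = c(4, 5, 6))
b <- c(as.double, function(x) 1 + x)

# Map() walks b and the columns of a in parallel
res_list <- Map(function(f, col) f(col), b, a)

# Rebuild a data frame with the original column names
res <- as.data.frame(setNames(res_list, names(a)))
```

Unlike mapply with its default simplification, this keeps each column as its own list element, so columns of different types would not be coerced into one matrix.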

R: finding maximum value every two rows in each column

I want to find maximum value in each column for every 2 rows (say). How to do that in R? For example
matrix(c(3,1,20,5,4,12,6,2,9,7,8,7), byrow=T, ncol=3)
I want the output like this
matrix(c(5,4,20,7,8,9), byrow=T, ncol=3)
Here is one way of doing it.
Define a vector that contains information about the groups you want. In this case, I use rep to repeat a sequence of numbers.
Then define a helper function to calculate the column maximum of an array — this is a simple apply of max.
Finally, use sapply with an anonymous function that applies colMax to each of your grouped array subsets.
The code:
x <- matrix(c(3,1,20,5,4,12,6,2,9,7,8,7), byrow=T, ncol=3) # the matrix from the question
groups <- rep(1:2, each=2)
colMax <- function(x)apply(x, 2, max)
t(
sapply(unique(groups), function(i)colMax(x[which(groups==i), ]))
)
The results:
[,1] [,2] [,3]
[1,] 5 4 20
[2,] 7 8 9
A long one-liner:
t(sapply(seq(1,nrow(df1),by=2),function(i) apply(df1[seq(i,1+i),],2,max)))
Another option,
do.call(rbind, by(m, gl(nrow(m)/2, 2), function(x) apply(x, 2, max)))
apply(mat, 2, function(x) tapply(x, # work on each column
# create groups of 2 in a vector of the proper length: 1,1,2,2,3,3,4,4 ...
rep(1:(length(x)/2), each=2, len=length(x)),
max))
[,1] [,2] [,3]
1 5 4 20
2 7 8 9
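The answers above each assume a different variable name for the input (x, df1, m, mat). A self-contained sketch of the tapply approach on the question's matrix, with the grouping vector built by ceiling() (an assumption chosen here so that an odd number of rows also gets valid groups):

```r
mat <- matrix(c(3, 1, 20, 5, 4, 12, 6, 2, 9, 7, 8, 7), byrow = TRUE, ncol = 3)

# Rows 1-2 form group 1, rows 3-4 form group 2, and so on;
# with an odd row count the last group would contain a single row
grp <- ceiling(seq_len(nrow(mat)) / 2)

# For each column, take the maximum within each row group
group_max <- apply(mat, 2, function(col) tapply(col, grp, max))
```

For this input, group_max holds the same values as the desired output in the question, with the group labels as row names.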
