My data has 3 surveys per year (for 10 years), where 1 represents presence and 0 represents absence. A subset looks like this:
x <- structure(c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1,
0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1,
0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1),
.Dim = c(4L, 3L, 4L))
I want to collapse these three columns into one, so that every row with a 1 in any survey shows 1 in the result and otherwise shows 0.
Collapse the second dimension of the array with apply:
apply(x, c(1L, 3L), function(y) as.integer(any(as.logical(y))))
## [,1] [,2] [,3] [,4]
## [1,] 0 0 0 0
## [2,] 1 1 1 1
## [3,] 0 1 1 1
## [4,] 1 1 1 1
The result is a [site, year] matrix.
We could use max
apply(x, c(1, 3), FUN = max)
[,1] [,2] [,3] [,4]
[1,] 0 0 0 0
[2,] 1 1 1 1
[3,] 0 1 1 1
[4,] 1 1 1 1
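For larger arrays, the same collapse can be sketched without calling a function once per cell. This is only a sketch: it assumes the array is laid out as [site, survey, year] and contains no NA values, and the names x_perm, sums and collapsed are made up for the example.

```r
x <- structure(c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1,
                 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1,
                 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1,
                 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1),
               .Dim = c(4L, 3L, 4L))

# Move the survey dimension last, giving [site, year, survey]
x_perm <- aperm(x, c(1, 3, 2))

# With dims = 2 the first two dimensions are kept and the rest are summed over
sums <- rowSums(x_perm, dims = 2)

# A positive count means presence in at least one survey
collapsed <- (sums > 0) + 0L
collapsed
```

If the data contain NA values, rowSums() accepts na.rm = TRUE, but whether a missing survey should count as absence is a modelling decision rather than a coding one.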
Here is the data set 'before' and 'after' shifting.
library(data.table)
# Data set 'before'
df_before <- t(data.table(
x = c(1, 2, 3, 4, 5),
y = c(0, 6, 7, 8, 9),
z = c(0, 0, 11, 12, 13)))
# Shift operation
# ...
# Data set 'after'
df_after <- t(data.table(
x = c(1, 2, 3, 4, 5),
y = c(6, 7, 8, 9, NA),
z = c(11, 12, 13, NA, NA)))
How can I apply this kind of shift, one more cell for each successive row, to all rows?
Thanks!
Something like this? Start each row one position further along than the previous row, then reset its length to the full number of columns. Resetting the length pads the row with NAs.
t(sapply(1:nrow(DF), function(x) `length<-`(DF[x, x:ncol(DF)], ncol(DF))))
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 2 3 4 5
# [2,] 6 7 8 9 NA
# [3,] 11 12 13 NA NA
Data
DF <- structure(c(1, 0, 0, 2, 6, 0, 3, 7, 11, 4, 8, 12, 5, 9, 13), .Dim = c(3L,
5L), .Dimnames = list(c("x", "y", "z"), NULL))
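The same idea can be sketched as a small reusable function. The name shift_rows_left is hypothetical, and the sketch assumes a numeric matrix in which row i should be shifted left by i - 1 cells, as in the question.

```r
DF <- structure(c(1, 0, 0, 2, 6, 0, 3, 7, 11, 4, 8, 12, 5, 9, 13),
                .Dim = c(3L, 5L), .Dimnames = list(c("x", "y", "z"), NULL))

# Hypothetical helper: shift row i left by i - 1 cells and pad with NA
shift_rows_left <- function(m) {
  n <- ncol(m)
  out <- t(vapply(seq_len(nrow(m)), function(i) {
    k <- min(i - 1L, n)                      # cells to drop from the front
    c(unname(m[i, seq_len(n - k) + k]),      # the values that remain
      rep(NA_real_, k))                      # NA padding on the right
  }, numeric(n)))
  rownames(out) <- rownames(m)
  out
}

shifted <- shift_rows_left(DF)
shifted
```

Capping the shift at the number of columns is a design choice for the sketch: x:ncol(DF) counts backwards once the row index exceeds the number of columns, so the one-liner above only works when the matrix has at least as many columns as rows.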
Taking a guess at the logic:
t(apply(df_before, 1, function(x) `length<-`(x[x != 0], ncol(df_before))))
[,1] [,2] [,3] [,4] [,5]
x 1 2 3 4 5
y 6 7 8 9 NA
z 11 12 13 NA NA
You can keep df_before un-transposed (as a data.table) and then use the lead function from dplyr to shift the columns:
library(data.table)
library(dplyr)
df_before <- data.table(
x = c(1, 2, 3, 4, 5),
y = c(0, 6, 7, 8, 9),
z = c(0, 0, 11, 12, 13))
df_after <- t(data.table(
x = c(1, 2, 3, 4, 5),
y = c(6, 7, 8, 9, NA),
z = c(11, 12, 13, NA, NA)))
df_before[] <-lapply(1:ncol(df_before), function(x){
dplyr::lead(df_before[[x]],n= x-1)
})
If you need to transpose the data after this step:
df_after2 <- t(df_before)
all.equal(df_after,df_after2) # TRUE
I'm trying to compare two matrices. When the values aren't equal, I want to use the value from mat2 as long as it is greater than 0; if it is zero, I want the value from mat1. As the code currently stands, it appears to always return the value of mat1.
Here is my attempt:
mat.data1 <- c(1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1)
mat1 <- matrix(data = mat.data1, nrow = 5, ncol = 5, byrow = TRUE)
mat.data2 <- c(0, 0, 0, 0, 0, 0, 1, 2, 0, 0, 0, 1, 2, 2, 0, 0, 0, 1, 2, 2, 0, 2, 1, 0, 1)
mat2 <- matrix(data = mat.data2, nrow = 5, ncol = 5, byrow = TRUE)
mat3 = if(mat1 == mat2){mat1} else {if(mat2>0){mat2} else {mat1}}
The expected output is:
1 0 1 1 1
0 1 2 1 1
1 1 2 2 0
1 1 1 2 2
1 2 1 0 1
Here is one potential way to do it.
mat.data1 <- c(1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1)
mat1 <- matrix(data = mat.data1, nrow = 5, ncol = 5, byrow = TRUE)
mat.data2 <- c(0, 0, 0, 0, 0, 0, 1, 2, 0, 0, 0, 1, 2, 2, 0, 0, 0, 1, 2, 2, 0, 2, 1, 0, 1)
mat2 <- matrix(data = mat.data2, nrow = 5, ncol = 5, byrow = TRUE)
mat3 <- mat1
to_change <- which(mat2 != mat1 & mat2 > 0)
mat3[to_change] <- mat2[to_change]
This use of which finds the positions where mat2 differs from mat1 AND mat2 is greater than zero. You can then index with those positions and copy the corresponding mat2 values into mat3.
The output is then:
> mat3
[,1] [,2] [,3] [,4] [,5]
[1,] 1 0 1 1 1
[2,] 0 1 2 1 1
[3,] 1 1 2 2 0
[4,] 1 1 1 2 2
[5,] 1 2 1 0 1
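The attempt in the question fails because if() expects a single TRUE or FALSE rather than a matrix of them (older versions of R used only the first element and gave a warning; R 4.2.0 and later raise an error). A vectorised sketch of the same rule uses ifelse(). It relies on the observation that wherever mat2 is greater than zero the result is the mat2 value, whether or not it equals mat1. The names mat3_alt and expected are invented for this example.

```r
mat.data1 <- c(1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1)
mat1 <- matrix(data = mat.data1, nrow = 5, ncol = 5, byrow = TRUE)
mat.data2 <- c(0, 0, 0, 0, 0, 0, 1, 2, 0, 0, 0, 1, 2, 2, 0, 0, 0, 1, 2, 2, 0, 2, 1, 0, 1)
mat2 <- matrix(data = mat.data2, nrow = 5, ncol = 5, byrow = TRUE)

# Element by element: take mat2 where it is positive, otherwise fall back to mat1
mat3_alt <- ifelse(mat2 > 0, mat2, mat1)

# Result worked out by hand from the rule in the question
expected <- matrix(c(1, 0, 1, 1, 1,
                     0, 1, 2, 1, 1,
                     1, 1, 2, 2, 0,
                     1, 1, 1, 2, 2,
                     1, 2, 1, 0, 1), nrow = 5, byrow = TRUE)
all(mat3_alt == expected)
```

ifelse() keeps the dimensions of its test argument, so the result is still a 5 x 5 matrix.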
We can use coalesce
library(dplyr)
out <- coalesce(replace(mat2, !mat2, NA), replace(mat1, !mat1, NA))
replace(out, is.na(out), 0)
Or, as @Axeman mentioned:
coalesce(out, 0)
I wish to number the non-zero elements in a matrix by row. Here is a small data set and the desired result. I would prefer a solution in base R.
my.data <- matrix(c(10, 0, 0, 0, 0,
0, 3, 9, 0, 1,
2, 12, 0, 0, 0,
5, 5, 5, 0, 5,
0, 0, 0, 0, 0), nrow = 5, byrow = TRUE)
desired.result <- matrix(c( 1, 0, 0, 0, 0,
0, 1, 2, 0, 3,
1, 2, 0, 0, 0,
1, 2, 3, 0, 4,
0, 0, 0, 0, 0), nrow = 5, byrow = TRUE)
A couple of other options:
# create new matrix with multiplication
t(apply(my.data != 0, 1, cumsum)) * (my.data != 0)
# alternative:
# replace elements in original matrix
my.data[my.data != 0] = t(apply(my.data != 0, 1, cumsum))[my.data != 0]
my.data
# [,1] [,2] [,3] [,4] [,5]
# [1,] 1 0 0 0 0
# [2,] 0 1 2 0 3
# [3,] 1 2 0 0 0
# [4,] 1 2 3 0 4
# [5,] 0 0 0 0 0
Here's a relatively naive base R method:
t(apply(my.data, 1, function(x) {
x[x != 0] <- seq_len(sum(x != 0))
x
}))
[,1] [,2] [,3] [,4] [,5]
[1,] 1 0 0 0 0
[2,] 0 1 2 0 3
[3,] 1 2 0 0 0
[4,] 1 2 3 0 4
[5,] 0 0 0 0 0
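A further base R sketch avoids apply() and the transpose by grouping the cells by row index with ave(). The names nz, counts and result are made up for the example.

```r
my.data <- matrix(c(10,  0, 0, 0, 0,
                     0,  3, 9, 0, 1,
                     2, 12, 0, 0, 0,
                     5,  5, 5, 0, 5,
                     0,  0, 0, 0, 0), nrow = 5, byrow = TRUE)

# TRUE where a cell is non-zero
nz <- my.data != 0

# Running count of non-zero cells within each row; the cells of a row
# appear in column order inside the column-major vector
counts <- ave(as.integer(nz), as.vector(row(nz)), FUN = cumsum)

# Zero out the positions that were zero in the original matrix
result <- matrix(counts * nz, nrow = nrow(my.data))
result
```

Unlike the in-place replacement above, this leaves my.data unchanged.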
I have two matrices
A = matrix(c(2, 2, 2, 3, 3, 3),nrow=3,ncol=2)
> A
[,1] [,2]
[1,] 2 3
[2,] 2 3
[3,] 2 3
B = matrix(c(2, 4, 3, 1, 5, 7),nrow=3, ncol=2)
> B
[,1] [,2]
[1,] 2 1
[2,] 4 5
[3,] 3 7
I want to take the mean of all values in B that correspond to 2 in A, and the mean of all values in B that correspond to 3 in A, and then create a matrix with only the means.
Wanted matrix:
C
[,1] [,2]
[1,] 3 4.3
[2,] 3 4.3
[3,] 3 4.3
When the groups are not column specific this might help:
A <- matrix( c(2, 2, 2, 3, 3, 3),nrow=3,ncol=2)
B <- matrix(c(2, 4, 3, 1, 5, 7),nrow=3, ncol=2)
C <- matrix(nrow = dim(A)[1], ncol=dim(A)[2])
groups <- unique(c(A))
for(group in groups) {
C[which(A==group)] <- mean(B[which(A==group)])
}
If A contains NA values, then use
groups <- na.omit(unique(c(A)))
What about:
A <- matrix(c(2, 2, 2, 3, 3, 2, 3, 2), nrow=4, ncol=2)
B <- matrix(c(2, 4, 3, 1, 5, 7, 4, 2), nrow=4, ncol=2)
matrix(tapply(B, A, mean)[as.character(A)], nrow=nrow(A))
?
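Base R's ave() offers another sketch of the same group-mean idea, since its default summary function is mean. It is applied here to the 3 x 2 matrices from the question, and C is filled in place so that the matrix shape is kept.

```r
A <- matrix(c(2, 2, 2, 3, 3, 3), nrow = 3, ncol = 2)
B <- matrix(c(2, 4, 3, 1, 5, 7), nrow = 3, ncol = 2)

# Start from a copy of B so that C has the same dimensions
C <- B

# ave() replaces each value of B with the mean of its group in A
C[] <- ave(as.vector(B), as.vector(A))
C
```

The first column comes out as 3 and the second as 13/3, which is the 4.3 shown in the wanted matrix after rounding.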