I have three lists with Inf as a numeric and "NaN" as a character variable.
v1<-list(1,Inf,3,4,5,6,"NaN")
v2<-list(1,"NaN",3,4,Inf,6,7)
v3<-list(1,2,3,4,5,6,"NaN")
For the moment I can make a matrix with cbind, but the desired result is the one shown in B).
A) What I got:
matrix<-cbind(v1,v2,v3)
v1 v2 v3
[1,] 1 1 1
[2,] Inf "NaN" 2
[3,] 3 3 3
[4,] 4 4 4
[5,] 5 Inf 5
[6,] 6 6 6
[7,] "NaN" 7 "NaN"
B) I want to get:
v1 v2 v3
[1,] 1 1 1
[2,] 3 3 3
[3,] 4 4 4
[4,] 6 6 6
Context:
I wanted to export some results held in 3 lists to a .txt file. The easiest way for me was to use cbind to get a matrix and then call
write.table(matrix, file="mymatrix.txt", row.names=FALSE, col.names=FALSE)
A couple of suggestions regarding the problem:
1) It is better not to name a matrix object matrix, because it shares its name with the base function matrix().
2) NaN and NA have a special meaning and are not character strings. Quoting the value as "NaN" makes it difficult to apply the built-in functions is.nan/is.na, so we have to resort to ==/!=.
3) It is not clear why the individual lists are cbinded into a matrix.
Based on the input data, we can loop through the columns of 'matrix' with apply, then loop through each of the list elements with sapply and flag every element that is not finite or is the string "NaN". rowSums counts the flagged elements per row, and negating that count (!) turns 0 into TRUE (every element in the row is valid) and any other value into FALSE. Use this logical index to subset the rows.
matrix[!rowSums(apply(matrix, 2, FUN = function(x)
sapply(x, function(y) !(is.finite(y) & y !="NaN")))),]
# v1 v2 v3
#[1,] 1 1 1
#[2,] 3 3 3
#[3,] 4 4 4
#[4,] 6 6 6
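An alternative is to convert the lists to numeric vectors before binding, so that is.finite can be applied to the whole matrix at once. The following is a minimal sketch, not the method used above; it assumes v2 holds Inf in position 5 and 7 in position 7, as in the printed matrix, and the names to_num, m and clean are made up for illustration.

```r
v1 <- list(1, Inf, 3, 4, 5, 6, "NaN")
v2 <- list(1, "NaN", 3, 4, Inf, 6, 7)   # assumed to match the printed matrix
v3 <- list(1, 2, 3, 4, 5, 6, "NaN")

# unlist() coerces every element to character; as.numeric() then turns
# "Inf" into Inf and "NaN" into NaN
to_num <- function(l) as.numeric(unlist(l))

m <- cbind(v1 = to_num(v1), v2 = to_num(v2), v3 = to_num(v3))

# keep only the rows in which every element is finite
clean <- m[rowSums(!is.finite(m)) == 0, , drop = FALSE]
clean
```

Because the matrix is numeric rather than a list matrix, it can be passed to write.table directly.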
Related
When combining a data frame and a vector with different number of rows/lengths, bind_cols gives an error, whereas cbind repeats rows – why is this?
(And is it really wise to have that as a default behavior of cbind?)
See example data below.
# Example data
library(dplyr)  # provides tibble() and bind_cols()
x10 <- c(1:10)
y10 <- c(1:10)
xy10 <- tibble(x10, y10)
z20 <- c(1:20)
# get an error
xyz20 <- dplyr::bind_cols(xy10, z20)
# why does cbind repeat rows of xy10 to suit z20?
xyz20 <- cbind(xy10, z20)
xyz20
base::cbind is a generic function whose behavior differs for matrices and data frames.
For matrices, it warns if a vector argument does not fit the number of rows of the matrix (see the Note below).
cbind(as.matrix(xy10), z20)
# x10 y10 z20
# [1,] 1 1 1
# [2,] 2 2 2
# [3,] 3 3 3
# [4,] 4 4 4
# [5,] 5 5 5
# [6,] 6 6 6
# [7,] 7 7 7
# [8,] 8 8 8
# [9,] 9 9 9
#[10,] 10 10 10
#Warning message:
#In cbind(as.matrix(xy10), z20) :
# number of rows of result is not a multiple of vector length (arg 2)
But for data frames, it actually creates a data frame from scratch. So the following two calls are equivalent, both giving a data frame of 20 rows:
cbind(xy10, z20)
## in this way, R's recycling rule steps in
data.frame(xy10[, 1], xy10[, 2], z20)
From ?cbind:
The ‘cbind’ data frame method is just a wrapper for ‘data.frame(..., check.names = FALSE)’. This means that it will split matrix columns in data frame arguments, and convert character columns to factors unless ‘stringsAsFactors = FALSE’ is specified.
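The recycling can be reproduced with a plain base data frame. This is a small sketch under the assumption that a data.frame behaves like the tibble from the question for this purpose; the 3-element vector is an extra case added here for illustration.

```r
xy10 <- data.frame(x10 = 1:10, y10 = 1:10)
z20 <- 1:20

# 20 is a whole multiple of 10, so the rows of xy10 are repeated twice
xyz20 <- cbind(xy10, z20)
nrow(xyz20)
all(xyz20$x10 == rep(1:10, times = 2))

# 10 is not a whole multiple of 3, so data.frame() refuses to recycle
res <- try(cbind(xy10, 1:3), silent = TRUE)
inherits(res, "try-error")
```

So the data frame method recycles only when the longer length is an exact multiple of the shorter one, and raises an error otherwise.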
Note: In non-data.frame cases, matrices are not allowed to grow bigger. Only vectors will be recycled or truncated.
## handling two vectors
## vector of shorter length is recycled
cbind(1:2, 1:4)
# [,1] [,2]
#[1,] 1 1
#[2,] 2 2
#[3,] 1 3
#[4,] 2 4
## handling two matrices
## has strict requirement on dimensions
cbind(as.matrix(1:2), as.matrix(1:4))
#Error in cbind(as.matrix(1:2), as.matrix(1:4)) :
# number of rows of matrices must match (see arg 2)
## handling a matrix and a vector
## vector of shorter length is recycled
cbind(1:2, as.matrix(1:4))
# [,1] [,2]
#[1,] 1 1
#[2,] 2 2
#[3,] 1 3
#[4,] 2 4
## handling a matrix and a vector
## vector of longer length is truncated
cbind(as.matrix(1:2), 1:4)
# [,1] [,2]
#[1,] 1 1
#[2,] 2 2
#Warning message:
#In cbind(as.matrix(1:2), 1:4) :
# number of rows of result is not a multiple of vector length (arg 2)
From ?cbind:
If there are several matrix arguments, they must all have the same number of rows....
If all the arguments are vectors, ..., values in shorter arguments are recycled to achieve this length...
When the arguments consist of a mix of matrices and vectors, the number of rows of the result is determined by the number of rows of the matrix arguments... vectors... are recycled or subsetted to achieve this length.
I need help removing the rows for which the elements in the first column (V1) do not correspond to those in the column X1. My dataset is much larger; I just used this example to make it easier to understand.
V:
     V1 V2 V3
[1,]  1  2  3
[2,]  2  2  3
[3,]  1  2  3
[4,]  1  2  3
[5,]  3  2  3
X:
     X1 X2 X3
[1,]  1  2  3
[2,]  1  2  3
[3,]  1  2  3
For this example I would require the code to remove the rows [2,] and [5,] as they are different from what you have in X1 (The general idea is to fix my dataset so they have the same elements and remove those that are in V but not in X).
The code I thought is this:
for(k in 1:nrow(V)){
if(V[k,1]!=X[k,1]){
V[-k,]
}
}
but it does not work because the size for V is different than X.
Any help would be greatly appreciated!!
We can use %in% to get a logical index of the elements of 'V1' (in matrix 'm1') that are present in 'X1' (from 'm2'). Negating it (!) selects the rows whose 'V1' value is not present in 'X1', i.e. the rows to be removed; dropping the ! keeps the matching rows instead.
m1[!m1[,'V1'] %in% m2[, 'X1'],]
# V1 V2 V3
#[1,] 2 2 3
#[2,] 3 2 3
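To make the two directions explicit, here is a small sketch with the example matrices rebuilt by hand (the construction of m1 and m2 and the names keep, kept and dropped are assumptions for illustration). The un-negated index keeps the rows whose V1 value occurs in X1; the negated index returns the rows that would be removed.

```r
m1 <- matrix(c(1, 2, 1, 1, 3,
               2, 2, 2, 2, 2,
               3, 3, 3, 3, 3),
             ncol = 3, dimnames = list(NULL, c("V1", "V2", "V3")))
m2 <- matrix(c(1, 1, 1,
               2, 2, 2,
               3, 3, 3),
             ncol = 3, dimnames = list(NULL, c("X1", "X2", "X3")))

# TRUE where the V1 value also occurs somewhere in X1
keep <- m1[, "V1"] %in% m2[, "X1"]

kept    <- m1[keep, , drop = FALSE]    # rows 1, 3 and 4 of m1
dropped <- m1[!keep, , drop = FALSE]   # rows 2 and 5 of m1
kept
```

Since %in% compares values rather than positions, it does not matter that the two matrices have different numbers of rows.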
I have one matrix a with 24 rows and 44 columns and another one b with 44 rows and one column. I would like to multiply the first row of matrix b with the entire first column of matrix a, the second row of matrix b with the entire second column of matrix a, and so forth. How can I do that?
We can replicate the elements in the second matrix ('m2') to make the lengths same as in 'm1' and then do the multiplication.
m1*m2[col(m1)]
For replicating the elements, we used col, which returns the numeric index of the columns of the matrix ('m1')
col(m1)
# [,1] [,2] [,3] [,4]
#[1,] 1 2 3 4
#[2,] 1 2 3 4
#[3,] 1 2 3 4
#[4,] 1 2 3 4
#[5,] 1 2 3 4
By doing m2[col(m1)], the first element in 'm2' i.e. row1 column1 element is replicated 5 times, second 5 times, and so on.
m2[col(m1)]
#[1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 4 4 4 4 4
data
m1 <- matrix(1:20, ncol=4)
m2 <- matrix(1:4, nrow=4)
This alternative uses vector recycling:
t(t(m1) * as.vector(m2))
Since the vector is recycled in the same way a matrix is filled (by columns), we need to first transpose m1 and then transpose the result again.
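As a quick check, the two approaches can be compared against sweep(), which applies a vector along a chosen margin. This is only a sketch using the small m1 and m2 from the data above; the names by_index, by_transpose and by_sweep are made up here.

```r
m1 <- matrix(1:20, ncol = 4)
m2 <- matrix(1:4, nrow = 4)

by_index     <- m1 * m2[col(m1)]
by_transpose <- t(t(m1) * as.vector(m2))
# MARGIN = 2 pairs each element of the vector with one column of m1
by_sweep     <- sweep(m1, 2, as.vector(m2), "*")

all(by_index == by_transpose)
all(by_index == by_sweep)
by_index[, 4]   # column 4 of m1 (16:20) multiplied by 4
```

All three give the same matrix; which one to use is mostly a question of readability.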
I want to use sapply to create a matrix with 2 columns and 8 rows. The first column runs from 1 to 8 and the second column is the square of the first. I did sapply(1:8, function(x) c(x,x^2)), so I got 8 columns and 2 rows instead of 2 columns and 8 rows. How can I swap the columns and rows?
Try using t
> t(sapply(1:8, function(x) c(x,x^2)))
[,1] [,2]
[1,] 1 1
[2,] 2 4
[3,] 3 9
[4,] 4 16
[5,] 5 25
[6,] 6 36
[7,] 7 49
[8,] 8 64
Actually no need to use sapply for that, just use matrix
> x <- 1:8
> matrix(c(x,x^2), ncol=2)
The default for sapply is to essentially cbind the final output. You can either tell it not to simplify or just transpose your result.
# manual rbind
do.call("rbind", sapply(1:8, function(x) c(x,x^2), simplify=FALSE))
# transpose result
t(sapply(1:8, function(x) c(x,x^2)))
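If the result must always be a numeric matrix, vapply is a stricter variant of sapply that declares the shape of each result up front. A minimal sketch (the names sq and direct are illustrative):

```r
# each call returns a numeric vector of length 2, so the raw result is 2 x 8;
# t() turns it into the 8 x 2 layout asked for
sq <- t(vapply(1:8, function(x) c(x, x^2), numeric(2)))
dim(sq)

# the same values built without any loop
x <- 1:8
direct <- cbind(x, x^2)
all(sq == direct)
```

vapply raises an error if any call returns something other than a length-2 numeric vector, which sapply would silently simplify differently.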
I have a matrix
[,1] [,2]
[1,] 2 3
[2,] 3 5
[3,] 7 9
[4,] 11 3
[5,] 11 8
and I want to merge rows 1, 2, 4 and 5 by their common values.
The output should be:
2 3 5 11 8
Test case:
m <- matrix(c(2,3,7,11,11,3,5,9,3,8),ncol=2)
I'm not sure this is what you want, but it gives the right answer:
unique(c(t(m[c(1,2,4,5),])))
Only two tricky bits here:
need to use c() to collapse the matrix into a single vector
need to use t() to get the matrix collapsed row-wise rather than column-wise to get the ordering as you specified.
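The effect of the two steps can be seen by running them separately. This sketch reuses the test case above; the intermediate names rows, by_row and by_col are introduced only for illustration.

```r
m <- matrix(c(2, 3, 7, 11, 11, 3, 5, 9, 3, 8), ncol = 2)

rows   <- m[c(1, 2, 4, 5), ]   # leave out row 3, which shares no value
by_row <- c(t(rows))           # 2 3 3 5 11 3 11 8
by_col <- c(rows)              # 2 3 11 11 3 5 3 8

unique(by_row)                 # 2 3 5 11 8
unique(by_col)                 # 2 3 11 5 8
```

Without t(), the same values come out, but in column order rather than the order requested.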