I have a dataframe with the following elements:
> x[3536:3540,]
V1 V2
3536 2 6
3537 13 6
3538 9 6
3539 6 6
3540 2 2
I want to remove rows with the same elements in all columns.
My desired result is as follows:
> x[3536:3540,]
V1 V2
3536 2 6
3537 13 6
3538 9 6
I tried this:
x<-x[,1] != x[,2]
But I only get a logical vector with one value per row, not a data frame containing just the rows whose columns differ:
> x
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[15] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[29] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[43] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[57] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[71] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[85] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[99] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[113] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE
Any help would be greatly appreciated.
You need to subset/filter:
Base R:
x_new <- x[x$V1 != x$V2,]
dplyr:
library(dplyr)
x_new <- x %>%
filter(V1 != V2)
Result:
x_new
V1 V2
2 1 2
3 1 3
Data:
x <- data.frame(
V1 = c(1,1,1,1),
V2 = c(1,2,3,1)
)
The following assumes you want to subset within specific rows, as in the original post (here the data frame is called df):
library(data.table)
setDT(df)
df <- df[3536:3540][V1 != V2]
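If the data frame has more than two columns, the same idea can probably be generalized by comparing every column against the first one and keeping the rows where at least one comparison differs. This is only a sketch: the third column V3 and the values below are hypothetical, not part of the original data.

```r
# Hypothetical three-column example; V3 is not in the original post
x <- data.frame(V1 = c(2, 13, 9, 6, 2),
                V2 = c(6, 6, 6, 6, 2),
                V3 = c(6, 1, 9, 6, 2))

# Compare each column with the first; TRUE marks a cell that differs
differs <- rowSums(x != x[[1]]) > 0

# Keep only the rows that are not identical across all columns
x_new <- x[differs, ]
```

Rows whose values are the same in every column give a row sum of 0 and are dropped.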
I have a vector of indexes:
indexes
[1] 25 2 16 23
and another vector with logical:
logical
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[19] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
I want to keep all items of the logical vector except those whose positions are stored in indexes.
I thought this would have an easy solution, but mine doesn't work:
for(index in indexes){
logical[index] = NULL
}
You could just use negative (-) indexing:
indexes <- c(25, 2, 16, 23)
logicals <- sample(c(TRUE, FALSE), 25, replace = TRUE)
logicals
#> [1] FALSE TRUE TRUE TRUE FALSE FALSE TRUE FALSE TRUE TRUE FALSE TRUE
#> [13] FALSE FALSE TRUE FALSE FALSE FALSE FALSE TRUE FALSE TRUE TRUE TRUE
#> [25] FALSE
logicals[-indexes]
#> [1] FALSE TRUE TRUE FALSE FALSE TRUE FALSE TRUE TRUE FALSE TRUE FALSE
#> [13] FALSE TRUE FALSE FALSE FALSE TRUE FALSE TRUE TRUE
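One caveat with negative indexing: if the vector of indexes happens to be empty, `logicals[-indexes]` returns an empty vector instead of the whole vector. A minimal sketch of a safer variant using `setdiff()` follows; the short `logicals` vector and the name `no_indexes` are made up for illustration.

```r
logicals <- c(TRUE, FALSE, TRUE, TRUE, FALSE)

# Negative indexing with an empty index vector selects nothing at all
no_indexes <- integer(0)
dropped_naive <- logicals[-no_indexes]

# setdiff() keeps every position that is not listed, even if none are listed
kept_all  <- logicals[setdiff(seq_along(logicals), no_indexes)]
kept_some <- logicals[setdiff(seq_along(logicals), c(2, 5))]
```

With a non-empty index vector, both approaches give the same result.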
I tried:
mdf$CLAVE.EMISORA %in% BMV[[9]]$`CLAVE EMISORA`
But it only returns:
logical(0)
For some reason the reverse seems to work:
BMV[[9]]$`CLAVE EMISORA` %in% mdf$CLAVE.EMISORA
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[20] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[39] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[58] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[77] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
My data (mdf): I have it but I don't know how to embed
My list (BMV): .... I don't know how to copy a list to clipboard sorry...
logical(0) is a vector of base type logical with 0 length.
You're getting this because you're checking whether any element of a zero-length vector is present in BMV[[9]]$`CLAVE EMISORA`.
If you run
length(mdf$CLAVE.EMISORA)
you'll get 0 as output.
The reverse works because %in% returns one result per element of its left-hand operand: there you're checking whether each element of a non-empty vector is present in a vector of length 0, so every result is FALSE.
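One possible cause, which is only an assumption since the data was not posted, is that the column name contains a space rather than a dot, so `mdf$CLAVE.EMISORA` is `NULL`. The sketch below reproduces both behaviours with a hypothetical data frame and made-up codes.

```r
codes <- c("AC", "ALFA", "AMX")  # hypothetical values

# Hypothetical data frame whose column name contains a space, not a dot
mdf <- data.frame(`CLAVE EMISORA` = c("ALFA", "BIMBO"), check.names = FALSE)

missing_col <- mdf$CLAVE.EMISORA        # no such column, so this is NULL

left_empty  <- missing_col %in% codes   # zero-length result
right_empty <- codes %in% missing_col   # one FALSE per element of codes

# Using the actual column name gives one result per row of mdf
correct <- mdf$`CLAVE EMISORA` %in% codes
```

Checking `names(mdf)` first would show which spelling the column really has.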
I have a logic vector in R something like this:
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
[19] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[55] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[73] FALSE FALSE TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
I want to construct another numeric vector that contains a 1 if the logic vector is true and a 0 if it is false. I have tried the following code
## create an empty vector
numericvec <- vector(mode="numeric", length=0)
## for loop
for (i in logicvec){
if(i == TRUE){
c(numericvec, 1)
} else {
c(numericvec, 0)
}
}
The for loop syntax seems ok because I don't get errors when I run it but it isn't currently adding any values to the numeric vector.
This should work:
numericvec <- as.numeric(logicvec)
No need for a for() loop. Most R functions are vectorized and operate on entire vectors at once.
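For reference, the loop in the question adds nothing because the result of `c(numericvec, 1)` is never assigned back to `numericvec`. A minimal sketch of a working loop, next to the vectorized version, using a short made-up `logicvec`:

```r
logicvec <- c(FALSE, TRUE, FALSE)  # short made-up example

# Preallocate, then assign into each position
numericvec <- numeric(length(logicvec))
for (i in seq_along(logicvec)) {
  numericvec[i] <- if (logicvec[i]) 1 else 0
}

# The vectorized conversion gives the same values without a loop
vectorized <- as.numeric(logicvec)
```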
I'd like to check if a data.frame has any non-finite elements.
This seems to evaluate each column, returning FALSE for each (I'm guessing it's evaluating the data.frame as a list):
any( !is.finite( x ) )
I don't understand why this behaves differently from the above, but it works fine if just checking for NAs:
any( !is.na( x ) )
I'd like the solution to be as efficient as possible. I realize I can just do...
any( !is.finite( as.matrix( x ) ) )
If you type methods(is.na) you'll see that it has a data.frame method, which probably explains why it works the way you expect, where is.finite does not. The usual solution would be to write one yourself, since it's only one line. Something like this maybe,
is.finite.data.frame <- function(obj){
sapply(obj,FUN = function(x) all(is.finite(x)))
}
I'm assuming the error you are getting is the following:
> any( is.infinite( z ) )
Error in is.infinite(z) : default method not implemented for type 'list'
This error is because the is.infinite() and the is.finite() functions are not implemented with a method for data.frames. The is.na() function does have a data.frame method.
The way to work around this is to apply the function to every column of the data.frame. Here's an example using sapply() to apply the is.infinite() function to each column:
x <- c(1:10, NA)
y <- c(1:11)
z <- data.frame(x,y)
any( sapply(z, is.infinite) )
## or, to also catch NA and NaN (is.infinite() only flags Inf and -Inf)
any( ! sapply(z, is.finite) )
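The two checks above are not equivalent, because missing values are neither finite nor infinite. A small sketch on a made-up vector shows the difference:

```r
v <- c(1, NA, NaN, Inf, -Inf)

finite_flags   <- is.finite(v)    # TRUE only for ordinary numbers
infinite_flags <- is.infinite(v)  # TRUE only for Inf and -Inf

# NA and NaN count as "non-finite" but not as "infinite"
non_finite_count <- sum(!finite_flags)
infinite_count   <- sum(infinite_flags)
```

So `!is.finite()` is the stricter test if NA and NaN should also be reported.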
Your solution of calling as.matrix will only work if the data.frame only has numeric columns. Otherwise, the matrix will typically become a character matrix and the result will be false everywhere...
@joran has a good approach, but you'll have problems with factor columns unless you add a method for factors too:
is.finite(letters[1:3]) # FALSE - OK
is.finite(factor(letters[1:3])) # TRUE - WRONG!!
is.finite.factor <- function(obj){
logical(length(obj))
}
is.finite(factor(letters[1:3])) # FALSE - OK
Also, if you want the check to be as fast as possible, you should avoid sapply and go for vapply instead.
d <- data.frame(matrix(runif(1e6), nrow=10), letters[1:10])
# @joran's method
is.finite.data.frame <- function(obj){
sapply(obj,FUN = function(x) all(is.finite(x)))
}
system.time( x <- is.finite(d) ) # 0.42 secs
# Using vapply instead...
is.finite.data.frame <- function(obj) {
vapply(obj,FUN = function(x) all(is.finite(x)), logical(1))
}
system.time( y <- is.finite(d) ) # 0.20 secs
identical(x,y) # TRUE
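If all that is needed is a single TRUE/FALSE answer, a plain helper that stops at the first offending column may be enough. The following is a sketch rather than a benchmarked solution; the function name `has_nonfinite` is hypothetical, and it assumes non-numeric columns (characters, factors) should simply be ignored.

```r
# Hypothetical helper: TRUE if any numeric column contains NA, NaN, Inf or -Inf
has_nonfinite <- function(df) {
  numeric_cols <- vapply(df, is.numeric, logical(1))
  for (col in df[numeric_cols]) {
    # Return as soon as one column fails, without scanning the rest
    if (!all(is.finite(col))) return(TRUE)
  }
  FALSE
}

clean <- has_nonfinite(data.frame(a = c(1, 2), b = c("x", "y")))
dirty <- has_nonfinite(data.frame(a = c(1, Inf), b = c("x", "y")))
```

Skipping non-numeric columns avoids the factor problem shown above, at the cost of not reporting missing values in those columns.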
One difference is that is.na and is.finite are different types of functions. is.na is a generic and will dispatch based on the class of the argument.
> methods("is.na")
[1] is.na.data.frame is.na.numeric_version is.na.POSIXlt
[4] is.na.raster*
Non-visible functions are asterisked
Note in particular that there is an is.na.data.frame function. Looking at that function:
> is.na.data.frame
function (x)
{
y <- do.call("cbind", lapply(x, "is.na"))
if (.row_names_info(x) > 0L)
rownames(y) <- row.names(x)
y
}
<bytecode: 00000000054F40F0>
<environment: namespace:base>
the part that does the work is the do.call("cbind", lapply(x, "is.na")) call which puts columns together (cbind) which are the result of lapply(x, "is.na"). Running just this with an example data.frame (mtcars):
> lapply(mtcars, "is.na")
$mpg
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
$cyl
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
$disp
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
$hp
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
$drat
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
$wt
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
$qsec
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
$vs
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
$am
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
$gear
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
$carb
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
We see that this is really just a column-wise computation, with the results bound back together into a logical matrix.
Compare that to is.finite which does not have a specific function for data.frames:
> methods("is.finite")
no methods were found
In fact, it is a primitive function, meaning that its implementation is in C code, not R code.
> is.finite
function (x) .Primitive("is.finite")
If you want to do a column-wise computation with is.finite, you can wrap it like is.na.data.frame does.
> do.call(cbind, lapply(mtcars, is.finite))
mpg cyl disp hp drat wt qsec vs am gear carb
[1,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[2,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[3,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[4,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[5,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[6,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[7,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[8,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[9,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[10,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[11,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[12,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[13,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[14,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[15,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[16,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[17,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[18,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[19,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[20,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[21,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[22,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[23,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[24,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[25,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[26,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[27,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[28,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[29,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[30,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[31,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[32,] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
The same result could also be obtained with
sapply(mtcars, is.finite)
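As a quick check, the two forms can be compared and then reduced to the single answer the question asks for. This is a small sketch using the built-in mtcars data; the variable names are arbitrary.

```r
m1 <- do.call(cbind, lapply(mtcars, is.finite))
m2 <- sapply(mtcars, is.finite)

# Both are logical matrices with one column per variable
same_values <- all(m1 == m2)

# Collapse to one value: does the data.frame contain any non-finite element?
any_nonfinite <- any(!m2)
```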
I have not tested which of these would be most efficient, though.