Extracting data from a weird looking object in R script

In my R script, I have an object myObject that looks like this:
> myObject
m convInfo data call dataClasses control
FALSE FALSE FALSE FALSE FALSE FALSE
It is the result of is.na(obj), where obj is an nls fit.
I'm trying to test whether that first item is FALSE rather than TRUE. How can I extract it? I tried myObject$m, but that didn't work.

You have a named (logical) vector.
> v <- 1:5
> names(v) <- LETTERS[1:5]
> is.na(v)
A B C D E
FALSE FALSE FALSE FALSE FALSE
> myObj <- .Last.value
You address it like any other atomic vector:
> myObj[1]
A
FALSE
> myObj[1] == FALSE
A
TRUE

The object returned by nls() is a list. The behaviour of is.na() on a list is somewhat peculiar in terms of what is and is not NA. From ?is.na:
Value:
The default method for ‘is.na’ applied to an atomic vector returns
a logical vector of the same length as its argument ‘x’,
containing ‘TRUE’ for those elements marked ‘NA’ or, for numeric
or complex vectors, ‘NaN’ (!) and ‘FALSE’ otherwise. ‘dim’,
‘dimnames’ and ‘names’ attributes are preserved.
The default method also works for lists and pairlists: the result
for an element is false unless that element is a length-one atomic
vector and the single element of that vector is regarded as ‘NA’
or ‘NaN’.
So t (your myObject) is a logical vector, with its TRUE and FALSE values determined as per the quoted text above. Therefore all of
t[1]
t["m"]
head(t, 1)
extract the first element of t. If you want to test for FALSE then I might try:
!isTRUE(t[1])
E.g.
> set.seed(1)
> logi <- sample(c(TRUE,FALSE), 5, replace = TRUE)
> logi
[1] TRUE TRUE FALSE FALSE TRUE
> !isTRUE(logi[1])
[1] FALSE
The reason the $ version won't work is that $ is documented to apply only to non-atomic vectors. logi (or your t) is an atomic vector, in that it contains elements of the same type.
> is.atomic(logi)
[1] TRUE
> names(logi) <- letters[1:5]
> logi$a
Error in logi$a : $ operator is invalid for atomic vectors
> logi["a"]
a
TRUE
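
A minimal sketch of the extraction, using a small hand-built list as a stand-in for an nls fit (the names obj and flags and the list contents are made up for illustration). As far as I know, isFALSE() needs R 3.5.0 or later, and before that version isTRUE() did not accept a named TRUE, so pulling the element out with [[ (which drops the name) may be the safer habit:

```r
# Hypothetical stand-in for an nls fit: a list with a few named components
obj <- list(m = list(1, 2), convInfo = "ok", control = NA)

# is.na() on a list gives a named logical vector, one entry per component
flags <- is.na(obj)

# `[` keeps the name, `[[` returns the bare value
flags["m"]
first <- flags[["m"]]

# Test explicitly for FALSE (isFALSE() assumes R >= 3.5.0)
isFALSE(first)
```

Only the control component is a length-one NA here, so it is the only entry of flags that is TRUE.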

Related

R function returning boolean of which element in vector is maximum

I need a boolean that tells me whether an element of a vector is that vector's maximum. It should return something like this:
vec <- c(3,4,1,5)
maxBoolFunct(vec)
[1] FALSE FALSE FALSE TRUE
max() just tells me what the maximum value actually is and which.max simply gives me the position in the vector. I need a boolean.
You can use a vectorized comparison, which returns a logical vector:
> vec = c(3,4,1,5)
> vec == max(vec)
[1] FALSE FALSE FALSE TRUE
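
The comparison can be wrapped in a function with the name used in the question. This is only a sketch: the na.rm = TRUE choice is an assumption about how missing values should be treated, and ties all come back TRUE.

```r
# Returns TRUE at every position that holds the maximum value.
# na.rm = TRUE is an assumption; NA inputs still yield NA in the result.
maxBoolFunct <- function(x) {
  x == max(x, na.rm = TRUE)
}

vec <- c(3, 4, 1, 5)
res <- maxBoolFunct(vec)

# Ties: both 5s are flagged
res_ties <- maxBoolFunct(c(5, 2, 5))
```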

Force is.unsorted() to false

I have a dataframe object that is presorted, and I am trying to call a function that requires it to be sorted. Somehow is.unsorted() is returning true. R then proceeds to sort it.
Unfortunately, there are about 2 million entries, and I don't have the memory. Is there a way to force is.unsorted to be false?
A quick check of the documentation for is.unsorted turns up the following note:
Note:
This function is designed for objects with one-dimensional indices, as described above. Data frames, matrices and other arrays may give surprising results.
Therefore, you should avoid calling this function on a complete data frame and run it on individual columns instead.
Take the code snippet below as an example. The function works as expected on one-dimensional objects (vectors), but gives a surprising result on a data frame: it returns FALSE where TRUE is expected.
However, when the data frame is subset (using the $ operator) and is.unsorted() is run on the individual columns, it returns the expected result.
> vec <- c(1,2,3,4,5)
> is.unsorted(vec) # Expected: FALSE
[1] FALSE
> vec <- c(1,3,2,5,4)
> is.unsorted(vec) # Expected: TRUE
[1] TRUE
> vec <- c("A","B","C","D","E")
> is.unsorted(vec) # Expected: FALSE
[1] FALSE
> vec <- c("A","C","B","E","D")
> is.unsorted(vec) # Expected: TRUE
[1] TRUE
> dat <- data.frame(num=c(1,2,3,4,5)
+ ,chr=c("A","B","C","D","E")
+ ,stringsAsFactors=FALSE
+ )
> is.unsorted(dat) # Expected: FALSE
[1] FALSE
> dat <- data.frame(num=c(1,3,2,5,4)
+ ,chr=c("A","B","C","D","E")
+ ,stringsAsFactors=FALSE
+ )
> is.unsorted(dat) # Expected: TRUE
[1] FALSE
> is.unsorted(dat$num) # Expected: TRUE
[1] TRUE
> is.unsorted(dat$chr) # Expected: FALSE
[1] FALSE
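
To check every column in one go rather than one $ at a time, something along these lines should work. It is a sketch using the same toy data frame as above; unsorted_cols is a name made up here.

```r
dat <- data.frame(num = c(1, 3, 2, 5, 4),
                  chr = c("A", "B", "C", "D", "E"),
                  stringsAsFactors = FALSE)

# Apply is.unsorted() to each column separately; vapply() guarantees
# exactly one logical value per column
unsorted_cols <- vapply(dat, is.unsorted, logical(1))
unsorted_cols
```

Here num is flagged as unsorted and chr is not, matching the per-column calls above.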

Why do logical operators negate their argument when there is only one argument in R?

When passing only a single vector to the logical and/or operator, the operator negates the argument:
> x = c(F,T,T)
> `&`(x)
[1] TRUE FALSE FALSE
> `|`(x)
[1] TRUE FALSE FALSE
To make the operator return its argument unchanged, one needs to pass a single-element vector as the second argument:
> `&`(x,T)
[1] FALSE TRUE TRUE
> `|`(x,F)
[1] FALSE TRUE TRUE
Why do the logical operators negate their argument when there is only one argument passed?
This was modified in R 3.2.1 as a result of a bug report. As you've pointed out, the previous behavior made little sense:

Difference between the following two codes

I have a data frame named newdata. It has two columns, BONUS and GENDER.
When I write the following code in R:
> newdata <- within(newdata,{
PROMOTION=ifelse(BONUS>=1500,1,0)})
it works even though I haven't used a loop, but the following code doesn't work without a loop. Why?
> add <- with(newdata,
if(GENDER==F)sum(PROMOTION))
Warning message:
In if (GENDER == F) sum(PROMOTION) :
the condition has length > 1 and only the first element will be used
My question is: why are all elements used in the first piece of code?
ifelse is vectorized, but if is not. For example:
> x <- rbinom(20,1,.5)
> ifelse(x,TRUE,FALSE)
[1] TRUE TRUE FALSE TRUE FALSE TRUE FALSE TRUE TRUE TRUE TRUE FALSE
[13] FALSE TRUE TRUE FALSE TRUE TRUE TRUE TRUE
> if(x) {TRUE} else {FALSE}
[1] TRUE
Warning message:
In if (x) { :
the condition has length > 1 and only the first element will be used
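
For the sum in the question, a vectorized alternative to if is to subset with a logical vector. The sketch below uses made-up data and assumes GENDER holds the string "F"; in the question's code, the bare F is R's shorthand for FALSE, which may not be what was intended. Note also that, as far as I know, a length > 1 condition in if is an error rather than a warning from R 4.2.0 onwards.

```r
# Made-up data standing in for the question's newdata
newdata <- data.frame(BONUS  = c(1000, 1500, 2000, 500),
                      GENDER = c("F", "M", "F", "F"),
                      stringsAsFactors = FALSE)

# ifelse() is vectorized: one result per row
newdata <- within(newdata, PROMOTION <- ifelse(BONUS >= 1500, 1, 0))

# Logical subsetting instead of if(): sum PROMOTION over the "F" rows only
add <- with(newdata, sum(PROMOTION[GENDER == "F"]))
add
```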

R is there a way to find Inf/-Inf values?

I'm trying to run a randomForest on a large-ish data set (5000x300). Unfortunately I'm getting an error message as follows:
> RF <- randomForest(prePrior1, postPrior1[,6]
+ ,,do.trace=TRUE,importance=TRUE,ntree=100,,forest=TRUE)
Error in randomForest.default(prePrior1, postPrior1[, 6], , do.trace = TRUE, :
NA/NaN/Inf in foreign function call (arg 1)
So I tried to find any NAs using:
> df2 <- prePrior1[is.na(prePrior1)]
> df2
character(0)
> df2 <- postPrior1[is.na(postPrior1[,6])]
> df2
numeric(0)
which leads me to believe that it's Inf's that are the problem as there don't seem to be any NA's.
Any suggestions for how to root out Inf's?
You're probably looking for is.finite, though I'm not 100% certain that the problem is Infs in your input data.
Be sure to read the help for is.finite carefully about which combinations of missing, infinite, etc. it picks out. Specifically, this:
> is.finite(c(1,NA,-Inf,NaN))
[1] TRUE FALSE FALSE FALSE
> is.infinite(c(1,NA,-Inf,NaN))
[1] FALSE FALSE TRUE FALSE
One of these things is not like the others. Not surprisingly, there's an is.nan function as well.
randomForest's 'NA/NaN/Inf in foreign function call' error is often misleading, and really irritating:
- you will get this if any of the variables passed is character
- actual NaNs and Infs almost never happen in clean data
My fast-and-dirty trick to narrow things down: do a binary search on your variable list, and use token parameters like ntree=2 to get an instant pass/fail on the subset of variables:
RF <- randomForest(prePrior1[m:n],ntree=2,...)
In analogy to is.na, you can use is.infinite to find occurrences of infinites.
Take a look at with, e.g.:
> with(df, df == Inf)
foo bar baz abc ...
[1,] FALSE FALSE TRUE FALSE ...
[2,] FALSE TRUE FALSE FALSE ...
...
joran's answer is informative and is what you want. For more details about is.na() and is.infinite(), you should check out https://stat.ethz.ch/R-manual/R-devel/library/Matrix/html/is.na-methods.html
Besides, once you have the logical vector that says whether each element of the original vector is NA/Inf, you can use the which() function to get the indices, like this:
> v1 <- c(1, Inf, 2, NaN, Inf, 3, NaN, Inf)
> is.infinite(v1)
[1] FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE
> which(is.infinite(v1))
[1] 2 5 8
> is.na(v1)
[1] FALSE FALSE FALSE TRUE FALSE FALSE TRUE FALSE
> which(is.na(v1))
[1] 4 7
The documentation for which() is here: https://stat.ethz.ch/R-manual/R-devel/library/base/html/which.html
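
Putting the answers together, a per-column scan of a data frame might look like the sketch below. The data frame and the names num_cols and bad_counts are invented for illustration. The scan goes column by column because, as far as I know, is.finite() and is.infinite() have no data frame method, and it lists non-numeric columns separately since those can trigger the randomForest error on their own.

```r
# Made-up data: one Inf, one NA, and a character column
df <- data.frame(foo = c(1, Inf, 3),
                 bar = c(NA, 2, 3),
                 baz = c("a", "b", "c"),
                 stringsAsFactors = FALSE)

# Which columns are numeric at all?
num_cols <- vapply(df, is.numeric, logical(1))

# Count NA/NaN/Inf/-Inf per numeric column (is.finite() is FALSE for all four)
bad_counts <- vapply(df[num_cols], function(col) sum(!is.finite(col)), integer(1))
bad_counts

# Non-numeric columns worth checking before calling randomForest
names(df)[!num_cols]
```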
