I would like to determine if a vector is either always increasing or always decreasing in R.
Ideally, if I had these three vectors:
asc=c(1,2,3,4,5)
des=c(5,4,3,2,1)
non=c(1,3,5,4,2)
I would hope that the first two would return TRUE, and the last would return FALSE.
I tried a few approaches. First, I tried:
> is.ordered(asc)
[1] FALSE
> is.ordered(des)
[1] FALSE
> is.ordered(non)
[1] FALSE
And I also tried:
> order(non)
[1] 1 5 2 4 3
I hoped that I could simply compare this vector with 1,2,3,4,5 and 5,4,3,2,1, but even that returns a vector of logicals rather than a single TRUE or FALSE:
> order(non)==c(1,2,3,4,5)
[1] TRUE FALSE FALSE TRUE FALSE
Maybe is.unsorted is the function you're looking for:
> is.unsorted(asc)
[1] FALSE
> is.unsorted(rev(des)) # here you need 'rev'
[1] FALSE
> is.unsorted(non)
[1] TRUE
From the Description of is.unsorted you can find:
Test if an object is not sorted (in increasing order), without the cost of sorting it.
Here's one way using ?is.unsorted:
is.sorted <- function(x, ...) {
  # TRUE if x is sorted in increasing or in decreasing order
  !is.unsorted(x, ...) || !is.unsorted(rev(x), ...)
}
Have a look at the additional arguments to is.unsorted, which can be passed here as well.
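As a rough illustration of those additional arguments, the sketch below passes strictly = TRUE through to is.unsorted, so that ties count as unsorted. It defines is.sorted locally so it runs on its own, and the vector with_ties is made up for the demonstration.

```r
# Local copy of the helper so this snippet is self-contained
is.sorted <- function(x, ...) {
  !is.unsorted(x, ...) || !is.unsorted(rev(x), ...)
}

with_ties <- c(1, 2, 2, 3)  # hypothetical example data

# Default: ties are allowed, so this counts as sorted
ties_ok <- is.sorted(with_ties)
print(ties_ok)      # TRUE

# strictly = TRUE: the repeated 2 makes the vector "unsorted"
ties_strict <- is.sorted(with_ties, strictly = TRUE)
print(ties_strict)  # FALSE
```

Whether ties should count as "always increasing" depends on what you need, so pick the variant that matches your definition.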
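Here is one way to check whether a vector is sorted without is.unsorted(). This function returns TRUE if the elements of the given vector are in ascending order and FALSE otherwise. Note that it only detects ascending order, and that it sorts the vector, which costs more than is.unsorted():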
is.sorted <- function(x) {
if(all(sort(x, decreasing = FALSE) == x)) {
return(TRUE)
} else {
return(FALSE)
}
}
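If you need both directions without sorting, one possible alternative (not from the answers above, just a sketch) is to look at the signs of the successive differences. The name is_monotonic is made up here, and ties are treated as allowed.

```r
# Sketch: TRUE if x never decreases or never increases (ties allowed)
is_monotonic <- function(x) {
  d <- diff(x)                # successive differences
  all(d >= 0) || all(d <= 0)  # all non-negative, or all non-positive
}

asc <- c(1, 2, 3, 4, 5)
des <- c(5, 4, 3, 2, 1)
non <- c(1, 3, 5, 4, 2)

print(is_monotonic(asc))  # TRUE
print(is_monotonic(des))  # TRUE
print(is_monotonic(non))  # FALSE
```

This assumes the vector has no NA values; with NAs the comparisons can come back as NA.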
If I understand, rlang::quo_is_missing evaluates a quosure and checks whether it contains a missing value. If it does, it should return TRUE, FALSE if not. Yet, I've tried the following combinations and it always returns FALSE:
rlang::quo_is_missing(quo(NA))
rlang::quo_is_missing(quo(NA_character_))
rlang::quo_is_missing(quo(NA_integer_))
If I try non-NA values, it also returns FALSE, as expected:
rlang::quo_is_missing(quo("hello"))
Why is it returning FALSE when the value is obviously missing?
"Missing" is a special term that refers to values that are not present at all. NA is not the same as "missing" -- NA is itself a value. In base R you can compare the functions is.na() and missing() each of which do different things. quo_is_missing is like the missing() function, not is.na and returns true only when there is no value at all:
rlang::quo_is_missing(quo())
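The same distinction can be seen in base R without rlang. In this small sketch, was_supplied is a made-up helper that only reports whether its argument was passed at all:

```r
# Hypothetical helper: reports whether the argument x was supplied
was_supplied <- function(x) !missing(x)

no_arg <- was_supplied()    # nothing passed at all: "missing"
na_arg <- was_supplied(NA)  # NA is a value, so the argument is not missing
na_val <- is.na(NA)         # is.na() tests the value itself

print(no_arg)  # FALSE
print(na_arg)  # TRUE
print(na_val)  # TRUE
```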
If you want to check for NA, you could write a helper
quo_is_na <- function(x) {
!rlang::quo_is_symbolic(x) &&
!rlang::quo_is_missing(x) &&
!rlang::quo_is_null(x) &&
is.na(rlang::quo_get_expr(x))
}
quo_is_na(quo())
# [1] FALSE
quo_is_na(quo(x+y))
# [1] FALSE
quo_is_na(quo(NULL))
# [1] FALSE
quo_is_na(quo(42))
# [1] FALSE
quo_is_na(quo(NA))
# [1] TRUE
quo_is_na(quo(NA_character_))
# [1] TRUE
Is there an easy, straightforward way (possibly a builtin function) that could match one vector as a whole in another vector?
Example:
target <- c(1,2,3)
A <- c(4,5,6,1,2,3)
B <- c(4,5,6,3,2,1)
my_match(target, A) # TRUE
my_match(target, B) # FALSE
I tried %in%, match and pmatch but these won't give the desired result. For example, both target %in% A and target %in% B will give the result [1] TRUE TRUE TRUE, which is not what I want.
Here is another version:
multi_match=function(target,A) {
lA=length(A)
lt=length(target)
if (lt>lA) return(FALSE)
any(colSums(sapply(1:(lA-lt+1),function(i) A[i:(i+lt-1)])==target)==lt)
}
Let's try it with some data
target <- c(1,2,3)
A <- c(4,5,6,1,2,3,1,2,3,1,3)
B <- c(4,5,6,3,2,1)
multi_match(target,A)
#TRUE
multi_match(target,B)
#FALSE
#"wrong" input order - trivially no match
multi_match(A,target)
#FALSE
And an extension of the multi_match function above to multi_which.
multi_which=function(target,A) {
lA=length(A)
lt=length(target)
if (lt>lA) return(integer(0))
which(colSums(sapply(1:(lA-lt+1),function(i) A[i:(i+lt-1)])==target)==lt)
}
multi_which(target,A)
#[1] 4 7
multi_which(target,B)
#integer(0)
#"wrong" input order - trivially no match
multi_which(A,target)
#integer(0)
Try:
grepl(paste(target,collapse=","),paste(A,collapse=","))
grepl(paste(target,collapse=","),paste(B,collapse=","))
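This concatenates the vectors into strings and looks for a substring in the second argument that matches the first. Note that the plain calls above treat the pattern as a regular expression and can give false positives when a number is only partially matched (for example, 1,2,3 is a substring of 11,2,31).
You could put this into a function that returns TRUE or FALSE; wrapping both strings in the delimiter and using fixed=TRUE avoids those partial matches:
my_match <- function(x,y,dlm=",") grepl(paste0(dlm,paste(x,collapse=dlm),dlm),paste0(dlm,paste(y,collapse=dlm),dlm),fixed=TRUE)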
my_match(target,A)
[1] TRUE
my_match(target,B)
[1] FALSE
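One possible way is to use match and check that the resulting positions are consecutive. Note that match only returns the first occurrence of each element, so this can miss a match when elements of target also occur earlier in the vector, and it gives NA positions when an element is absent (hence the anyNA check):
!anyNA(match(target, A)) && all(diff(match(target, A)) == 1)
Or as a function
> exact_match <- function(p, x) { m <- match(p, x); !anyNA(m) && all(diff(m) == 1) }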
> exact_match(target,A)
[1] TRUE
> exact_match(target,B)
[1] FALSE
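For vectors with repeated values, a plain loop over every possible start position may be easier to reason about. The following is only a sketch; contains_run is a made-up name, and it compares with == so it assumes values that can be compared exactly.

```r
# Sketch: TRUE if target occurs in x as a contiguous run
contains_run <- function(target, x) {
  lt <- length(target)
  lx <- length(x)
  if (lt == 0L) return(TRUE)   # an empty target trivially matches
  if (lt > lx) return(FALSE)   # target longer than x: no match possible
  for (start in seq_len(lx - lt + 1L)) {
    window <- x[start:(start + lt - 1L)]
    # isTRUE() turns an NA comparison result into FALSE
    if (isTRUE(all(window == target))) return(TRUE)
  }
  FALSE
}

print(contains_run(c(1, 2, 3), c(4, 5, 6, 1, 2, 3)))  # TRUE
print(contains_run(c(1, 2, 3), c(4, 5, 6, 3, 2, 1)))  # FALSE
print(contains_run(c(1, 2, 3), c(1, 5, 1, 2, 3)))     # TRUE (first 1 is a decoy)
```

It is slower than the vectorised versions for long vectors, but it also handles a target of length 1 and repeated elements.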
I have a set of dates (as.Date from RQuantLib) stored as a list or row in my_dates
I would like to select the first value for which this condition is true
businessDaysBetween("UnitedStates", TodayDate, my_dates)>10
where TodayDate<-as.Date(format(Sys.time(), "%Y%m%d"), "%Y%m%d") is today's date.
Thank you.
This can be very simple, if you want only the values or the first value.
A conditional statement applied to a vector results in a true/false vector
A true/false vector can be used as an index to a vector, and returns a shorter vector of the values where the condition is true.
Example: Initialize x to contain 10 random numbers. Is each one bigger than 0.7? Now return those that are bigger than 0.7. Then return the first one.
x = rnorm(10)
x
[1] -0.96029244 0.41779224 0.08058894 -0.02026729 -0.65383370 -0.83572926
[7] 0.92722221 0.49157700 0.88779718 -1.09073923
x>0.7
[1] FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE TRUE FALSE
x[x>0.7]
[1] 0.9272222 0.8877972
first one...
x[x>0.7][1]
[1] 0.9272222
Use which on the result of your businessDaysBetween call.
set.seed(21)
my_dates <- Sys.Date()+sample(50,10)
bizday_gt10 <- sapply(my_dates, businessDaysBetween,
calendar="UnitedStates", from=Sys.Date()) > 10
if(any(bizday_gt10)) {
first_bizday_gt10 <- which(bizday_gt10)[1]
my_dates[first_bizday_gt10]
} else {
stop("no business days between today and today+10")
}
Note that you need to check that there's at least one TRUE value in bizday_gt10; otherwise which will return a zero-length vector, and indexing it with [1] gives NA.
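A variation on the same idea is match(TRUE, cond), which gives the position of the first TRUE, or NA if there is none. The sketch below uses plain calendar days instead of businessDaysBetween so that it runs without RQuantLib; the fixed date and the offsets are made up for the example.

```r
# Hypothetical data: a fixed "today" keeps the example reproducible
today <- as.Date("2024-01-01")
my_dates <- today + c(3, 8, 12, 20)

# Stand-in condition: more than 10 calendar days after today
cond <- as.numeric(my_dates - today) > 10

idx <- match(TRUE, cond)  # position of the first TRUE, or NA if none
first_date <- if (is.na(idx)) NULL else my_dates[idx]

print(idx)         # 3
print(first_date)  # "2024-01-13"

# With no TRUE at all, match() returns NA instead of a zero-length vector
none_idx <- match(TRUE, c(FALSE, FALSE))
print(none_idx)    # NA
```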
Can someone help me modify the function below to check if a number is numeric?
# handy function that checks if something is numeric
check.numeric <- function(N){
!length(grep("[^[:digit:]]", as.character(N)))
}
check.numeric(3243)
#TRUE
check.numeric("sdds")
#FALSE
check.numeric(3.14)
#FALSE
I want check.numeric() to return TRUE when it's a decimal like 3.14.
You could use is.finite to test whether the value is numeric, finite, and non-NA. This will work for numeric, integer, and complex values (if both real/imaginary parts are finite).
> is.finite(NA)
[1] FALSE
> is.finite(NaN)
[1] FALSE
> is.finite(Inf)
[1] FALSE
> is.finite(1L)
[1] TRUE
> is.finite(1.0)
[1] TRUE
> is.finite("A")
[1] FALSE
> is.finite(pi)
[1] TRUE
> is.finite(1+0i)
[1] TRUE
Sounds like you want a function like this:
f <- function(x) is.numeric(x) & !is.na(x)
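If you would rather keep the string-based approach from the question (for example because the input may arrive as text such as "3.14"), one option is to extend the regular expression to allow an optional sign and an optional decimal part. This is only a sketch; check_numeric_str is a made-up name, and values that R prints in scientific notation (as.character(100000) gives "1e+05") will not match.

```r
# Sketch: TRUE if the text form of N looks like a plain decimal number
check_numeric_str <- function(N) {
  # optional minus sign, digits, then optionally a dot followed by digits
  grepl("^-?[[:digit:]]+(\\.[[:digit:]]+)?$", as.character(N))
}

print(check_numeric_str(3243))    # TRUE
print(check_numeric_str("sdds"))  # FALSE
print(check_numeric_str(3.14))    # TRUE
print(check_numeric_str("3.14"))  # TRUE
```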
I'm trying to run a randomForest on a large-ish data set (5000x300). Unfortunately I'm getting an error message as follows:
> RF <- randomForest(prePrior1, postPrior1[,6]
+ ,,do.trace=TRUE,importance=TRUE,ntree=100,,forest=TRUE)
Error in randomForest.default(prePrior1, postPrior1[, 6], , do.trace = TRUE, :
NA/NaN/Inf in foreign function call (arg 1)
So I tried to find any NAs using:
> df2 <- prePrior1[is.na(prePrior1)]
> df2
character(0)
> df2 <- postPrior1[is.na(postPrior1[,6])]
> df2
numeric(0)
which leads me to believe that it's Inf's that are the problem as there don't seem to be any NA's.
Any suggestions for how to root out Inf's?
You're probably looking for is.finite, though I'm not 100% certain that the problem is Infs in your input data.
Be sure to read the help for is.finite carefully about which combinations of missing, infinite, etc. it picks out. Specifically, this:
> is.finite(c(1,NA,-Inf,NaN))
[1] TRUE FALSE FALSE FALSE
> is.infinite(c(1,NA,-Inf,NaN))
[1] FALSE FALSE TRUE FALSE
One of these things is not like the others. Not surprisingly, there's an is.nan function as well.
randomForest's 'NA/NaN/Inf in foreign function call' is often a misleading error message, and really irritating:
you will get this if any of the variables passed is character
actual NaNs and Infs almost never happen in clean data
My quick-and-dirty trick to narrow things down is to do a binary search on your variable list, using token parameters like ntree=2 to get an instant pass/fail on each subset of variables:
RF <- randomForest(prePrior1[m:n],ntree=2,...)
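Before bisecting, it may be quicker to scan the columns directly. The sketch below uses a small made-up data frame (not the asker's prePrior1) to list the non-numeric columns and to count non-finite values in the numeric ones.

```r
# Hypothetical data standing in for the real predictors
df <- data.frame(
  a = c(1, Inf, 3),
  b = c("x", "y", "z"),
  c = c(1, NA, NaN),
  stringsAsFactors = FALSE
)

# Which columns are not numeric at all (e.g. character)?
is_num <- vapply(df, is.numeric, logical(1))
non_numeric <- names(df)[!is_num]
print(non_numeric)  # "b"

# For the numeric columns, count NA, NaN, Inf and -Inf together
bad_counts <- vapply(df[is_num], function(col) sum(!is.finite(col)), integer(1))
print(bad_counts)   # a: 1, c: 2
```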
In analogy to is.na, you can use is.infinite to find occurrences of infinites.
Take a look at with, e.g.:
> with(df, df == Inf)
foo bar baz abc ...
[1,] FALSE FALSE TRUE FALSE ...
[2,] FALSE TRUE FALSE FALSE ...
...
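One thing to keep in mind is that comparing with == Inf does not catch -Inf. As a sketch of an alternative, you can apply is.infinite column by column and use which(..., arr.ind = TRUE) to get the row and column positions. The data frame here is made up, and it assumes all columns are numeric.

```r
# Hypothetical all-numeric data frame
df <- data.frame(foo = c(1, 2), bar = c(3, Inf), baz = c(-Inf, 4))

# Logical matrix: TRUE wherever a value is Inf or -Inf
inf_mask <- sapply(df, is.infinite)

# Row/column positions of the infinite values
pos <- which(inf_mask, arr.ind = TRUE)
print(pos)

# For comparison, == Inf misses the -Inf in baz
n_eq_inf <- sum(df == Inf)
print(n_eq_inf)  # 1
```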
joran's answer is what you want, and it is informative. For more details about is.na() and is.infinite(), you should check out https://stat.ethz.ch/R-manual/R-devel/library/Matrix/html/is.na-methods.html
Besides, once you have the logical vector that says whether each element of the original vector is NA/Inf, you can use the which() function to get the indices, like this:
> v1 <- c(1, Inf, 2, NaN, Inf, 3, NaN, Inf)
> is.infinite(v1)
[1] FALSE TRUE FALSE FALSE TRUE FALSE FALSE TRUE
> which(is.infinite(v1))
[1] 2 5 8
> is.na(v1)
[1] FALSE FALSE FALSE TRUE FALSE FALSE TRUE FALSE
> which(is.na(v1))
[1] 4 7
The documentation for which() is here: https://stat.ethz.ch/R-manual/R-devel/library/base/html/which.html