Cumany function applied on NA values - r

I have the following vector:
x <- c(FALSE,FALSE,NA,TRUE,FALSE)
I use the cumany() function to see if there is at least one TRUE value within a window of the first element up to each element in the vector or in other words in the window [1, 1:length(x)].
library(dplyr)
cumany(x)
[1] FALSE FALSE NA NA NA
The output surprises me. I would expect cumany() to work as follows:
for (i in seq_along(x)) {
  print(any(x[1:i]))
}
Therefore I would expect the following output:
[1] FALSE FALSE NA TRUE TRUE
How is the cumany() function defined when it comes to NA values?
Update:
This was a bug in previous dplyr versions and has been corrected. Just update the package if you have the same problem.

To answer the question of how cumany() is implemented, we need to look at its source, which is written in C++.
As you can see below, the output vector is initialized with NAs. The crucial line of code is the one that carries forward the information that at least one TRUE value was seen before any NA:
out[i] = current || out[i - 1];
There is a brief discussion about expected behaviour on GitHub.
If your result is different from what you expected, then there is a high chance that you need to update the dplyr package.
For more implementation details see this code below:
LogicalVector cumany(LogicalVector x) {
  int n = x.length();
  LogicalVector out(n, NA_LOGICAL);
  int current = out[0] = x[0];
  if (current == NA_LOGICAL) return out;
  if (current == TRUE) {
    std::fill(out.begin(), out.end(), TRUE);
    return out;
  }
  for (int i = 1; i < n; i++) {
    current = x[i];
    if (current == NA_LOGICAL) break;
    if (current == TRUE) {
      std::fill(out.begin() + i, out.end(), TRUE);
      break;
    }
    out[i] = current || out[i - 1];
  }
  return out;
}
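Tracing the listing above by hand on the example vector, the loop appears to stop at the first NA, which would leave the remaining positions as NA (the output shown in the question). For comparison, here is a minimal sketch of the behaviour the asker expected, where each position is the three-valued OR of everything up to it. This is not dplyr's code; the names Tri, tri_or and cumulative_any are hypothetical and plain C++ stands in for Rcpp types.

```cpp
#include <vector>

// Three-valued logical, mirroring R's FALSE / TRUE / NA (hypothetical type).
enum class Tri { False, True, NA };

// Three-valued OR: TRUE wins over NA, and NA wins over FALSE.
Tri tri_or(Tri a, Tri b) {
    if (a == Tri::True || b == Tri::True) return Tri::True;
    if (a == Tri::NA || b == Tri::NA) return Tri::NA;
    return Tri::False;
}

// out[i] is the three-valued OR of x[0..i], the analogue of any(x[1:i]) in R.
std::vector<Tri> cumulative_any(const std::vector<Tri>& x) {
    std::vector<Tri> out;
    out.reserve(x.size());
    Tri acc = Tri::False;  // FALSE is the identity element of OR
    for (Tri v : x) {
        acc = tri_or(acc, v);
        out.push_back(acc);
    }
    return out;
}
```

For the input FALSE, FALSE, NA, TRUE, FALSE this yields FALSE, FALSE, NA, TRUE, TRUE: an NA only stays NA until a later TRUE settles the answer.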

One option would be to replace the NAs with FALSE, apply cumany(), and then combine the result with the original vector using | so that the NA reappears in its original position:
cumany(replace(x, is.na(x), FALSE))|x
#[1] FALSE FALSE NA TRUE TRUE

To rewrite it in all base R,
Reduce(any, x, accumulate = TRUE) | x
#> [1] FALSE FALSE NA TRUE TRUE

Related

sum function in Julia is giving error if the array is empty

I am trying to write code that determines whether the elements of an array are monotonic.
I wrote the code below and got an error:
function isMonotonic(array)
    if length(array) <= 2
        return true
    end
    check_up = []
    check_down = []
    for i in range(2, length(array))
        if array[i] <= array[i-1]
            append!(check_up, 1)
        end
        if array[i] >= array[i - 1]
            append!(check_down, 1)
        end
    end
    if sum(check_up) == length(array) - 1 || sum(check_down) == length(array) - 1
        return true
    else
        return false
    end
end
isMonotonic([1, 2, 3, 4, 5, 6 , 7])
I am getting the error below:
ERROR: MethodError: no method matching zero(::Type{Any})
I think it is because I am trying to sum an empty array. I have a workaround for the code above, but I want to understand the reason for the error and how to handle it in general. I do not want to check whether the array is empty before calling sum.
If you wanted to save yourself lots of effort, the simplest solution would just be:
my_ismonotonic(x) = issorted(x) || issorted(x ; rev=true)
This will return true if x is sorted either forwards, or in reverse, and false otherwise.
We could maybe make it a little more efficient using a check so we only need a single call to issorted.
function my_ismonotonic(x)
    length(x) <= 2 && return true
    for n = 2:length(x)
        if x[n] > x[1]
            return issorted(x)
        elseif x[n] < x[1]
            return issorted(x ; rev=true)
        end
    end
    return true
end
# Alternatively, a neater version using findfirst
function my_ismonotonic(x)
    length(x) <= 2 && return true
    ii = findfirst(a -> a != x[1], x)
    isnothing(ii) && return true # All elements in x are equal
    if x[ii] > x[1]
        return issorted(x)
    else
        return issorted(x ; rev=true)
    end
end
The loop detects the first occurrence of an element greater than or less than the first element and then calls the appropriate issorted as soon as this occurs. If all elements in the array are equal then the loop runs over the whole array and returns true.
There are a few problems of efficiency in your approach, but the reason you are getting an actual error message is because given the input, either this expression sum(check_up) or this expression sum(check_down) will effectively result in the following call:
sum(Any[])
There is no obvious return value for this since the array could have any type, so instead you get an error. If you had used the following earlier in your function:
check_up = Int[]
check_down = Int[]
then you shouldn't have the same problem, because:
julia> sum(Int[])
0
Note also that append! is usually for appending a vector to a vector. If you just want to add a single element to a vector use push!.
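The same two ideas carry over to other typed languages, so here is a small sketch in C++ (used only for illustration, since the question is about Julia). Summing works on an empty container once the element type, and therefore the zero value, is known, and the monotonicity check itself needs no counting arrays at all. The function names sum_or_zero and is_monotonic are hypothetical.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Sum with an explicit starting value of 0: an empty input simply returns 0,
// the analogue of sum(Int[]) in Julia.
int sum_or_zero(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0);
}

// True if v is entirely non-decreasing or entirely non-increasing.
// One pass over the data, no temporary arrays.
bool is_monotonic(const std::vector<int>& v) {
    bool non_decreasing = true;
    bool non_increasing = true;
    for (std::size_t i = 1; i < v.size(); ++i) {
        if (v[i] < v[i - 1]) non_decreasing = false;
        if (v[i] > v[i - 1]) non_increasing = false;
    }
    return non_decreasing || non_increasing;
}
```

Tracking two flags avoids building and summing the check_up and check_down arrays, so the empty-sum problem never arises in the first place.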

if else statement concatenation - R

This is a very common question: 1, 2, 3, 4, 5, and still I cannot find an answer to my problem.
If a == 1, then do X.
If a == 0, then do Y.
If a == 0 and b == 1, then do Z.
Just to explain: the if else statement has to do Y if a == 0, no matter the value of b. But if b == 1 and a == 0, Z will make additional changes on top of those already made by Y.
My current code and its error:
if (a == 1){
  X
} else if(a == 0){
  Y
} else if (a == 0 & b == 1){
  Z
}
Error in !criterion : invalid argument type
An else only happens if a previous if hasn't happened.
When you say
But if b == 1 and a == 0, Z will do additional changes to those already done by Y
Then you have two options:
## Option 1: nest Z inside Y
if (a == 1) {
  X
} else if (a == 0) {
  Y
  if (b == 1) {
    Z
  }
}
## Option 2: just use `if` again (not `else if`):
if (a == 1) {
  X
} else if (a == 0) {
  Y
}
if (a == 0 && b == 1) {
  Z
}
Really, you don't need any else here at all.
## This will work just as well
## (assuming that `X` can't change the value of a from 1 to 0)
if (a == 1) {
  X
}
if (a == 0) {
  Y
  if (b == 1) {
    Z
  }
}
Typically else is needed when you want to have a "final" action that is done only if none of the previous if options were used, for example:
# try to guess my number between 1 and 10
if (your_guess == 8) {
  print("Congratulations, you guessed my number!")
} else if (your_guess == 7 || your_guess == 9) {
  print("Close, but not quite")
} else {
  print("Wrong. Not even close!")
}
In the above, else is useful because I don't want to have to enumerate all the other possible guesses (or even bad inputs) that a user might enter. If they guess 8, they win. If they guess 7 or 9, I tell them they were close. Anything else, no matter what it is, I just say "wrong".
Note: this is true for programming languages in general. It is not unique to R.
However, since this is in the R tag, I should mention that R has if{}else{} and ifelse(), and they are different.
if{} (and optionally else{}) evaluates a single condition, and you can run code to do anything in {} depending on that condition.
ifelse() is a vectorized function; its arguments are test, yes, and no. The test evaluates to a logical vector of TRUE and FALSE values. The yes and no arguments should be vectors of the same length as test (shorter ones are recycled). The result is a vector of the same length as test, taking the corresponding values of yes (where test is TRUE) and no (where test is FALSE).
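To make the element-wise nature of ifelse() concrete, here is a rough sketch of the same idea in C++. It only illustrates the selection rule and is not how R implements ifelse(): the function name select_elementwise is hypothetical, and the sketch assumes all three inputs have the same length, with no recycling and no NA handling.

```cpp
#include <cstddef>
#include <vector>

// Element-wise choice: out[i] = test[i] ? yes[i] : no[i].
// Assumes test, yes and no all have the same length.
std::vector<int> select_elementwise(const std::vector<bool>& test,
                                    const std::vector<int>& yes,
                                    const std::vector<int>& no) {
    std::vector<int> out(test.size());
    for (std::size_t i = 0; i < test.size(); ++i) {
        out[i] = test[i] ? yes[i] : no[i];
    }
    return out;
}
```

A plain if, by contrast, looks at one condition and runs one whole block of code, which is why it is the right tool for the X / Y / Z branching in the question.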
I believe you want to include Z in the second condition like this:
if (a == 1) {
  X
} else if (a == 0) {
  Y
  if (b == 1) {
    Z
  }
}

Boolean AND and OR selected rows/columns in R, without creation of a temporary copy?

I have an extremely large matrix full of boolean TRUEs and FALSEs. I need to check certain column combinations to find rows where either all of the specified columns are true, or (in some cases) any of the specified columns are true.
I can do it using apply() and all():
> toymat <- matrix(sample(c(F,T),50,rep=T),5,10)
> toymat[,c(1,5,6)]
[,1] [,2] [,3]
[1,] TRUE FALSE FALSE
[2,] FALSE FALSE TRUE
[3,] TRUE FALSE FALSE
[4,] TRUE TRUE FALSE
[5,] FALSE FALSE TRUE
> apply(toymat[, c(1,5,6)],1,all)
[1] FALSE FALSE FALSE FALSE FALSE
But if I invoke apply with a function that would change a value, it seems to be passing by value, not passing by reference. In other words it's creating a temporary copy of "toymat[, c(1,5,6)]" to run apply on (which would not be desirable, because the actual matrix is huge and the code will be doing this many times).
Is there a way I can AND or OR together an arbitrary number of selected columns or selected rows without a temporary copy being created?
This is a perfect use case for Rcpp. Just use:
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
IntegerVector rowsums_bool(const LogicalMatrix& x,
                           const IntegerVector& ind_col) {
  int i, j, j2, n = x.nrow(), m = ind_col.size();
  IntegerVector res(n);
  for (j = 0; j < m; j++) {
    j2 = ind_col[j] - 1;  // convert the 1-based R index to 0-based
    for (i = 0; i < n; i++) {
      if (x(i, j2)) res[i]++;
    }
  }
  return res;
}

/*** R
toymat <- matrix(sample(c(F,T),50,rep=T),5,10)
toymat[,c(1,5,6)]
(tmp <- rowsums_bool(toymat, c(1,5,6)))
tmp == 3 ## ALL
tmp != 0 ## ANY
*/
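For readers who want to see the indexing without the Rcpp types, the following is a minimal standalone sketch of the same counting idea. It assumes a column-major layout (the layout R uses for matrices) and 0-based column indices; BoolMatrix and row_counts are hypothetical names, and NA values are not modelled.

```cpp
#include <cstddef>
#include <vector>

// Column-major logical matrix: element (i, j) lives at data[i + j * nrow].
struct BoolMatrix {
    std::size_t nrow;
    std::size_t ncol;
    std::vector<char> data;  // 0 = FALSE, 1 = TRUE
    bool at(std::size_t i, std::size_t j) const { return data[i + j * nrow] != 0; }
};

// For each row, count how many of the selected (0-based) columns are TRUE.
// The matrix is read in place; only the small result vector is allocated.
std::vector<int> row_counts(const BoolMatrix& m, const std::vector<std::size_t>& cols) {
    std::vector<int> res(m.nrow, 0);
    for (std::size_t j : cols) {
        for (std::size_t i = 0; i < m.nrow; ++i) {
            if (m.at(i, j)) ++res[i];
        }
    }
    return res;
}
```

A row satisfies ALL when its count equals cols.size() and ANY when its count is non-zero. Looping over columns in the outer loop walks memory contiguously in a column-major layout.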

What does "&&" do?

I can't seem to find the resource I need. What does && do in code that compares variables to determine whether they are true? A link to a list of these comparison symbols would be greatly appreciated.
Example: Expression 1: r = !z && (x % 2);
In most programming languages that use &&, it's the boolean "and" operator. For example, the pseudocode if (x && y) means "if x is true and y is true."
In the example you gave, it's not clear what language you're using, but
r = !z && (x % 2);
probably means this:
r = (not z) and (x mod 2)
= (z is not true) and (x mod 2 is true)
= (z is not true) and (x mod 2 is not zero)
= (z is not true) and (x is odd)
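Assuming the expression comes from a C-like language, the reading above can be checked with a tiny C++ sketch. The function name expression1 is hypothetical; z and x are taken to be integers.

```cpp
// r is true exactly when z is zero (false) and x % 2 is non-zero (x is odd).
bool expression1(int z, int x) {
    return !z && (x % 2);
}
```

Any non-zero remainder counts as true here, so a negative odd x (where x % 2 is -1) also makes the right-hand side true.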
In most programming languages, the operator && is the logical AND operator. It connects two boolean expressions and returns true only when both sides are true.
Here is an example:
int var1 = 0;
int var2 = 1;
if (var1 == 0 && var2 == 0) {
    // This won't get executed.
} else if (var1 == 0 && var2 == 1) {
    // This piece will, however.
}
Although var1 == 0 evaluates to true, var2 is not equal to 0. Therefore, because we are using the && operator, the program won't enter the first block.
Another operator you will often see is ||, representing OR. It evaluates to true if at least one of the two expressions is true. In the code example from above, using the OR operator would look like this:
int var1 = 0;
int var2 = 1;
if (var1 == 0 || var2 == 0) {
    // This will get executed.
}
I hope you now understand what these do and how to use them!
PS: Some languages have the same functionality, but are using other keywords. Python, e.g. has the keyword and instead of &&.
It is the logical AND operator
(&&) returns the boolean value true if both operands are true and returns false otherwise.
boolean a = true;
boolean b = true;
if (a && b) {
    System.out.println("Both are true"); // Both conditions are satisfied
}
Output
Both are true
The exact answer to your question depends on which language you are coding in. In R, the & operator does the AND operation pairwise over two vectors, as in:
c(T,F,T,F) & c(T,T,F,F)
#> TRUE FALSE FALSE FALSE
whereas in older versions of R the && operator looked only at the first element of each vector, as in:
c(T,F,T,F) && c(T,T,F,F)
#> TRUE
Since R 4.3.0, passing a vector of length greater than one to && or || is an error, so use them only with length-one conditions.
The OR operators (| and ||) behave similarly. Different languages will have different meanings for these operators.
In C, && is the logical AND: each operand counts as true unless it equals 0, and the result is 1 (true) or 0 (false).
In contrast, & is the bitwise AND, which keeps only the bits that are set in both operands.
That is, 1 && 2 and 1 && 3 are both true.
But 1 & 2 is 0 (false), while 1 & 3 is 1 (true).
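The difference is easy to verify with a few one-line functions. This is a small sketch in C++ (the same operators behave this way in C); the function names are hypothetical. The third function also shows short-circuit evaluation, a property of && that the bitwise & does not have.

```cpp
// Logical AND: each operand is first reduced to true (non-zero) or false (zero).
bool logical_and(int a, int b) { return a && b; }

// Bitwise AND: keeps only the bits that are set in both operands.
int bitwise_and(int a, int b) { return a & b; }

// Short-circuit: the right-hand side of && is skipped when the left side is
// false, so the remainder is never computed for divisor == 0.
bool divides_evenly(int value, int divisor) {
    return divisor != 0 && value % divisor == 0;
}
```

With these definitions, logical_and(1, 2) is true, while bitwise_and(1, 2) is 0 because binary 01 and 10 share no set bits, and bitwise_and(1, 3) is 1 because binary 01 and 11 share the lowest bit.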
Let's imagine the following situation (pseudocode):
a = 1
b = 2
if a == 1 && b == 2
    return "a is 1 and b is 2"
if a == 1 && b == 3
    return "a is 1 and b is 3"
In this situation, because a equals 1 AND b equals 2, the condition of the first if block is true and "a is 1 and b is 2" is returned. In the second if block, a equals 1 but b does not equal 3; because only one of the two comparisons is true, the second result is not returned. && is exactly the same as saying "and", as in "if a is 1 and b is 2".

How to test Rcpp::CharacterVector elements for equality?

I am trying to write some simple Rcpp code examples. This is remarkably easy with the Rcpp and inline packages.
But I am stumped on how to test two character elements for equality. The following example compares the first elements of two character vectors, but I can't get it to compile.
What is the trick?
library(Rcpp)
library(inline)
cCode <- '
Rcpp::CharacterVector cx(x);
Rcpp::CharacterVector cy(y);
Rcpp::LogicalVector r(1);
r[0] = (cx[0] == cy[0]);
return(r);
'
cCharCompare <- cxxfunction(signature(x="character", y="character"),
plugin="Rcpp", body=cCode)
cCharCompare("a", "b")
--
The comparison using == works perfectly fine if one of the two elements is a constant. The following code compiles and gives expected results:
cCode <- '
Rcpp::CharacterVector cx(x);
Rcpp::LogicalVector r(1);
r[0] = (cx[0] == "a");
return(r);
'
cCharCompareA <- cxxfunction(signature(x="character"), plugin="Rcpp", body=cCode)
cCharCompareA("a")
[1] TRUE
cCharCompareA("b")
[1] FALSE
The equality operator was introduced in Rcpp 0.10.4. The implementation looks like this in the string_proxy class:
bool operator==( const string_proxy& other){
  return strcmp( begin(), other.begin() ) == 0 ;
}
So now we can write:
#include <Rcpp.h>
using namespace Rcpp ;

// [[Rcpp::export]]
LogicalVector test( CharacterVector x, CharacterVector y){
  Rcpp::LogicalVector r(x.size());
  for( int i=0; i<x.size(); i++){
    r[i] = (x[i] == y[i]);
  }
  return(r);
}
And something similar is used in our unit tests:
> test(letters, letters)
[1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
[16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
Try this:
// r[0] = (cx[0] == cy[0]);
// r[0] = ((char*)cx[0] == (char*)cy[0]); <- this is wrong: it compares pointer addresses, not contents
r[0] = (*(char*)cx[0] == *(char*)cy[0]); // this compiles, but compares only the first character
It is not easy to explain, but:
CharacterVector is not char[].
operator[] returns a StringProxy.
StringProxy is not a char type.
StringProxy has a conversion operator that converts the StringProxy to char*.
So (char*)cx[0] is probably a pointer.
I have forgotten many things about C++ syntax...
The reason the compilation fails is that type inference fails for the overloaded operator == on StringProxy.
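The three comparisons discussed in this thread (pointer against pointer, first character against first character, and full contents via strcmp) behave quite differently. Here is a standalone sketch on plain C strings, without any Rcpp types; the function names are hypothetical.

```cpp
#include <cstring>

// Compares addresses: true only if both pointers refer to the same buffer.
bool same_pointer(const char* a, const char* b) { return a == b; }

// Compares only the first character of each string.
bool same_first_char(const char* a, const char* b) { return *a == *b; }

// Compares the full contents, character by character.
bool same_contents(const char* a, const char* b) { return std::strcmp(a, b) == 0; }
```

For two separate buffers that both hold "apple", same_pointer is false while same_contents is true. For "apple" and "avocado", same_first_char is true even though the strings differ, which is why dereferencing the char* is only suitable when you really mean to compare the first character.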
Very nice (technical) answer by @kohske, but here is something more C++-ish: just compare strings!
library(inline) ## implies library(Rcpp) when we use the plugin
cCode <- '
std::string cx = Rcpp::as<std::string>(x);
std::string cy = Rcpp::as<std::string>(y);
bool res = (cx == cy);
return(Rcpp::wrap(res));
'
cCharCompare <- cxxfunction(signature(x="character", y="character"),
plugin="Rcpp", body=cCode)
cCharCompare("a", "b")
If you really want to compare just the first character of the strings, then you can go from x to x.c_str() and either index its initial element, or just dereference the pointer to the first char.
A more R-ish answer could maybe sweep over actual vectors of strings...

Resources