R's behaviour using ifelse and eval in combination - r

Disclaimer: this code is bad practice and only works because of something bug-like. Never use it in a real situation. This question is only about the interesting behaviour of R.
After reading this question I got pretty puzzled. Apparently, ifelse can access information that should be hidden.
Say we do :
> x <- expression(dd <- 1:3)
> y <- expression(dd <- 4:6)
> z <- c(1,0)
> eval(x)
> eval(y)
>
We get no output. That is logical, as both expressions are actually assignments to a vector dd, so eval() is not supposed to print anything. But strangely enough, when you try the funny code
> ifelse(z==0,eval(x),eval(y))
[1] 4 2
You get output. Does somebody have an explanation for this?
It's not as simple as "R evaluates and then uses dd". Whatever order you give z, whatever condition you use, dd always ends up holding the value of the last-mentioned eval().
> ifelse(z==0,eval(x),eval(y))
> dd
[1] 4 5 6
> ifelse(z==1,eval(x),eval(y))
> dd
[1] 4 5 6
> z <- c(0,1)
> ifelse(z==0,eval(x),eval(y))
> dd
[1] 4 5 6
> ifelse(z==1,eval(x),eval(y))
> dd
[1] 4 5 6
> ifelse(z==1,eval(y),eval(x))
> dd
[1] 1 2 3
EDIT:
A closer look at the source code of ifelse shows that the call responsible for this is rep():
> x <- expression(dd <- 1:3)
> eval(x)
> rep(eval(x),2)
[1] 1 2 3 1 2 3
Still, that doesn't answer the question...

This is not a bug
Whether the result of a command is printed to the console is conditional. The function itself can decide this, for example:
> f=function(x)x;
> g=function(x)invisible(x);
> f(1)
[1] 1
> g(2)
> .Last.value
[1] 2
The value is still returned just fine; it's just not printed on the console.
What's happening here is that eval returns the result of the assignment invisibly, but rep and ifelse do not; in effect they strip the invisible property off their input.
It appears that invisibility is a special property attached to the result of a call rather than to the value itself, and it is not passed through the rep operation. It's also not passed through assignment:
> h=function(x){y=x;y;}
> f(g(1))
> h(g(1))
[1] 1
>
See ?invisible for a little more background.
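The visibility described above can be inspected directly with base R's withVisible(), which returns the value together with a visible flag. This is only a minimal sketch; the variable names (vis_eval, vis_rep and so on) are made up for illustration.

```r
x <- expression(dd <- 1:3)
g <- function(x) invisible(x)

# eval() of an assignment keeps the result invisible
vis_eval <- withVisible(eval(x))$visible      # FALSE

# wrapping the same call in rep() makes the result visible again
vis_rep <- withVisible(rep(eval(x), 2))$visible   # TRUE

# a function that ends with invisible() returns invisibly ...
vis_g <- withVisible(g(2))$visible            # FALSE

# ... and parentheses make the result visible again
vis_paren <- withVisible((g(2)))$visible      # TRUE
```

In every case the value itself is unchanged; only the flag that controls auto-printing differs.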

ifelse evaluates both alternatives whenever the test contains both TRUE and FALSE values. You can rationalize that as being necessary in order to choose which item in each vector to return to the calling environment. The opposite is true of if (cond) {affirm-conseq} else {neg-conseq}, which evaluates only one branch. The reason "dd" always ends up set by the third ifelse argument is revealed when one looks at the code for ifelse: the "no" vector is evaluated after the "yes" vector, in order to choose which items of the negative-consequent vector get assigned to the "ans" output vector.
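The evaluation order can be made visible with two helpers that record when they are called. This is a sketch only: yes_val, no_val and calls are hypothetical names, not part of ifelse or of the original question.

```r
# Hypothetical helpers that log their own evaluation
calls <- character(0)
yes_val <- function() { calls <<- c(calls, "yes"); 1:3 }
no_val  <- function() { calls <<- c(calls, "no");  4:6 }

# Mixed test: both arguments are forced, "yes" first and "no" second
res_mixed <- ifelse(c(FALSE, TRUE), yes_val(), no_val())
calls_mixed <- calls   # "yes" "no"

# All-TRUE test: the "no" argument is never forced
calls <- character(0)
res_true <- ifelse(c(TRUE, TRUE), yes_val(), no_val())
calls_true <- calls    # "yes"
```

Because the "no" argument is forced last whenever it is needed, an assignment hidden inside it is the one that survives, which matches the dd results in the question.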

Related

Why when I pass a dataframe of integers to apply function in R the variable gets transformed?

I'm quite new to R, so please be indulgent. This must be something really simple. I have a function that I would like to execute over every element of a dataframe. Minimal example:
agenericfunction = function(pos) {
print(pos)
}
When I call the function like this:
apply(as.data.frame(1:5), 1, agenericfunction)
This is what I get:
1:5
1
1:5
2
1:5
3
1:5
4
1:5
5
[1] 1 2 3 4 5
But if I modify the function like this:
agenericfunction = function(pos) {
print(paste0("_",pos))
}
I then get what one would normally expect:
[1] "_1"
[1] "_2"
[1] "_3"
[1] "_4"
[1] "_5"
[1] "_1" "_2" "_3" "_4" "_5"
I do not understand why my integer 'pos' variable gets converted in the first case into some weird thing that produces that output. If I use the "class" function on "pos", it always says that it is an integer (in both of the cases above). Could someone explain this behaviour?
Thanks in advance and best regards
What you are seeing in the first case is the column name printed above each integer. When as.data.frame is called, it creates a single column whose name is the text of the expression, here "1:5".
If you do df <- as.data.frame(1:5) and then colnames(df) you will see this:
> colnames(df)
[1] "1:5"
As to why it works with paste0: paste0 drops the name and just returns the values as character strings. Modifying your function to end with return(paste0(pos)) will yield the result you want.
If you want to avoid that, best create a matrix instead of a data.frame:
apply(matrix(1:5), 1, agenericfunction)
Output:
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 1 2 3 4 5
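To check where the extra label comes from, one can look at the names carried by the value that apply passes to the function. The sketch below assumes the same as.data.frame(1:5) input as the question; seen_names and plain are illustrative variable names.

```r
df <- as.data.frame(1:5)

# The single column is named after the expression used to build it
col_name <- colnames(df)

# Each row reaches the function as a length-one vector named after the column
seen_names <- apply(df, 1, function(pos) names(pos))

# Dropping the name inside the function gives plain values without
# switching to a matrix
plain <- apply(df, 1, function(pos) unname(pos))
```

Using unname() is an alternative to building a matrix when the data already lives in a data frame.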

Why doesn't this function print to screen when called in a loop?

Can anyone help me understand this behavior?
test<-c(1,2,3,4)
adding<-function(file){
file2<- file + 1
return(file2)
}
yields upon calling:
> adding(file = 1)
[1] 2
but when I try:
for(number in test){
adding(number)
print(number)
}
I get:
> for(number in test){
+ adding(number)
+ print(number)
+ }
[1] 1
[1] 2
[1] 3
[1] 4
when I would've expected:
[1] 2
[1] 1
[1] 3
[1] 2
[1] 4
[1] 3
[1] 5
[1] 4
I'm using this basis for another for loop that I'm working on and wondering why it's not behaving as I expect.
Opt for
for(number in test){
print(adding(number))
print(number)
}
to get your expected behavior; otherwise adding by itself won't print to the screen.
Your example works exactly as expected. You never explicitly printed the return value from the function call in the loop. Automatic printing is turned off in loops.
If you want the result of the function applied to each element of the vector as a vector itself, then
purrr::map_dbl(test, adding)
## [1] 2 3 4 5
or
sapply(test, adding)
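If the goal is to keep the values rather than print them, a loop can also store each result explicitly. A minimal sketch, reusing test and adding from the question; results and results_vapply are made-up names:

```r
test <- c(1, 2, 3, 4)
adding <- function(file) {
  file + 1
}

# Store each return value instead of relying on auto-printing
results <- numeric(length(test))
for (i in seq_along(test)) {
  results[i] <- adding(test[i])
}

# vapply is a stricter base R alternative to sapply: it checks that
# every result is a single number
results_vapply <- vapply(test, adding, numeric(1))
```

Both vectors hold 2 3 4 5, and nothing is printed until you ask for it.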
Printing a value to the screen without a print statement only happens at the top level (the outermost scope), for example when you type an expression at the interactive prompt. The variables inside the function have their own scope, and those variables are not printed.
Food for thought: this is different in MATLAB, for example, where the result of every statement not ending with ; is printed. In Python an additional requirement is that such a statement is executed alone. So executing two statements will not print anything, even if the last one is something that might be printed.

What is the reason to add quotation marks around R function names?

What is the difference between defining a function called myfunction as
"myfunction" <- function(<arguments>){<body>}
and
myfunction <- function(<arguments>){<body>}
Furthermore, what about the comments which are usually placed around such a function, i.e.
#myfunction{{{
"myfunction" <- function(<arguments>){<body>}
#}}}
are they just for documentation or are they really necessary (if so for what)?
EDIT: I have been asked for an example where comments like
#myfunction{{{
are used: For example here https://github.com/cran/quantmod/blob/master/R/getSymbols.R
The quoted version allows otherwise illegal function names:
> "my function" <- function() NULL
> "my function"()
NULL
Note that most people use backticks to make it clear they are referring to a name rather than a character string. This allows you to do some really odd things as alluded to in ?assign:
> a <- 1:3
> "a[1]" <- 55
> a[1]
[1] 1
> "a[1]"
[1] "a[1]"
> `a[1]`
[1] 55
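The same non-syntactic names can be created and read programmatically with assign() and get(), or written with backticks. A short sketch; the function body and the variable names res, val and first are illustrative only.

```r
# Backticks define and call a function whose name contains a space
`my function` <- function() "called"
res <- `my function`()

# assign() and get() take the name as a string, so any name is allowed
a <- 1:3
assign("a[1]", 55)
val <- get("a[1]")   # the variable literally named a[1]
first <- a[1]        # the first element of a, unchanged
```

Here val is 55 while first is still 1, because "a[1]" as a name has nothing to do with subsetting a.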

R Error in x$ed : $ operator is invalid for atomic vectors

Here is my code:
x<-c(1,2)
x
names(x)<- c("bob","ed")
x$ed
Why do I get the following error?
Error in x$ed : $ operator is invalid for atomic vectors
From the help file about $ (See ?"$") you can read:
$ is only valid for recursive objects, and is only discussed in the section below on recursive objects.
Now, let's check whether x is recursive
> is.recursive(x)
[1] FALSE
A recursive object has a list-like structure. A vector is not recursive, it is an atomic object instead, let's check
> is.atomic(x)
[1] TRUE
Therefore you get an error when applying $ to a vector (non-recursive object), use [ instead:
> x["ed"]
ed
2
You can also use getElement
> getElement(x, "ed")
[1] 2
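The alternatives can be put side by side. This sketch assumes the same named vector as the question; as.list() is one more way to get a recursive object that $ accepts, and the variable names are illustrative.

```r
x <- c(bob = 1, ed = 2)

# $ fails on an atomic vector; capture the message instead of stopping
msg <- tryCatch(x$ed, error = function(e) conditionMessage(e))

by_single <- x["ed"]     # keeps the name
by_double <- x[["ed"]]   # drops the name

# A list is recursive, so $ works on it
lst <- as.list(x)
by_dollar <- lst$ed
```

Use x[["ed"]] when you only want the value, and x["ed"] when the name should be kept.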
The reason you are getting this error is that you have a vector.
If you want to use the $ operator, you simply need to convert it to a data.frame. But since you only have one row in this particular case, you would also need to transpose it; otherwise bob and ed will become your row names instead of your column names which is what I think you want.
x <- c(1, 2)
x
names(x) <- c("bob", "ed")
x <- as.data.frame(t(x))
x$ed
[1] 2
Because $ does not work on atomic vectors. Use [ or [[ instead. From the help file for $:
The default methods work somewhat differently for atomic vectors, matrices/arrays and for recursive (list-like, see is.recursive) objects. $ is only valid for recursive objects, and is only discussed in the section below on recursive objects.
x[["ed"]] will work.
Here x is a vector.
You need to convert it into a dataframe for using $ operator.
x <- as.data.frame(x)
will work for you.
x<-c(1,2)
names(x)<- c("bob","ed")
x <- as.data.frame(x)
will give you output of x as:
    x
bob 1
ed  2
And, will give you output of x$ed as:
NULL
If you want bob and ed as column names then you need to transpose the dataframe like x <- as.data.frame(t(x))
So your code becomes
x<-c(1,2)
x
names(x)<- c("bob","ed")
x <- as.data.frame(t(x))
x$ed
Now the output of x$ed is:
[1] 2
You get this error, despite everything being in line, because of a conflict caused by one of the packages that are currently loaded in your R environment.
So, to solve this issue, detach all the packages that are not needed from the R environment. For example, when I had the same issue, I did the following:
detach(package:neuralnet)
bottom line: detach all the libraries no longer needed for execution... and the problem will be solved.
This solution worked for me
data<- transform(data, ColonName =as.integer(ColonName))
Recursive (list-like) collections are accessible by $
Atomic collections are not. Rather, [[ ]] or [ ] is used
Browse[1]> is.atomic(list())
[1] FALSE
Browse[1]> is.atomic(data.frame())
[1] FALSE
Browse[1]> is.atomic(class(list(foo="bar")))
[1] TRUE
Browse[1]> is.atomic(c(" lang "))
[1] TRUE
R can be funny sometimes
a = list(1,2,3)
b = data.frame(a)
d = rbind("?",c(b))
e = exp(1)
f = list(d)
print(data.frame(c(list(f,e))))
X1 X2 X3 X2.71828182845905
1 ? ? ? 2.718282
2 1 2 3 2.718282

How does R treat nonexistent index values?

Consider this experiment:
Rgames> oof<-c(6,7,8,0,10)
Rgames> badoof<-vector()
Rgames> oof[badoof]
numeric(0)
Rgames> oof[-badoof]
numeric(0)
Rgames> oof[-0]
numeric(0)
Rgames> oof[-10]
[1] 6 7 8 0 10
Rgames> oof[-c(10,0)]
[1] 6 7 8 0 10
Rgames> oof[!(1:length(oof)%in% c(badoof) )]
[1] 6 7 8 0 10
I know that "only negative integers can be mixed with zeroes" is a limit on subsetting. The part that isn't clear to me is why both oof[badoof] and oof[-badoof] return nothing. The background here is that I've got a function which searches for bad data and removes them from a vector of data. I was hoping not to have to treat the case where no bad items were found (i.e. badoof has no elements) separately or via an if/else. The very last example above, the one using %in% works, but I wonder why R doesn't accept the index "-badoof" construction.
Edit: in light of Hong Ooi's answer, I should ask as well: isn't it the case that the use of a negative sign inside [] really is a "NOT" operation, rather than changing the actual value of the designated indices?
It doesn't work because of this:
> x <- numeric(0)
> x
numeric(0)
> -x
numeric(0)
> identical(x, -x)
[1] TRUE
In other words, negating a vector with no elements leaves the vector unchanged, and hence the indexing operation using the vector will also be unchanged.
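One way to avoid special-casing the empty index is to compute the positions to keep rather than the positions to drop. The helper below is a hypothetical sketch (drop_idx is not a base R function), using setdiff() on seq_along().

```r
# Hypothetical helper: remove the positions in idx, even when idx is empty
drop_idx <- function(v, idx) {
  v[setdiff(seq_along(v), idx)]
}

oof <- c(6, 7, 8, 0, 10)
badoof <- vector()

kept_all  <- drop_idx(oof, badoof)    # nothing to drop: all of oof
kept_some <- drop_idx(oof, c(1, 4))   # drops the 1st and 4th elements
```

This does the same job as the %in% construction from the question, and positions that do not exist in the vector are simply ignored.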
