Basic R malfunction - r

R beginner. Why doesn't this code return the number 3?
my_mean <- function(my_vector){
sum(my_vector)/length(my_vector)
my_mean
}
my_vector <- c(1, 3, 5)
my_mean
I'm not allowed to use mean(). Thanks

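Returning a value by assigning to the function name is a convention from languages such as Visual Basic, Pascal, and Fortran. R does not work that way: a bare my_mean on the last line of the function simply evaluates to the function object itself.
If you want to return a value in R, use return():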
mymean <- function(x)
{
val <- sum(x)/length(x)
return(val)
}
But there's a shorter way to achieve the same result. If R reaches the end of a function without an explicit return, it will return the value of the last expression it evaluated.
mymean <- function(x)
{
val <- sum(x)/length(x)
val # value of last expression is returned
}
But this can be shortened further. The variable val is only used once, as the last statement in the function. So we could omit it entirely, and just return the computed value itself without storing it in a variable first:
mymean <- function(x)
{
sum(x)/length(x)
}

Problem solved. I made a mistake in the way I called the function. The call should be:
my_mean(c(1,3,5))
not:
my_vector <- c(1, 3, 5)
my_mean
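To make the difference concrete, here is a minimal sketch using the names from the question. Typing my_mean on its own refers to the function object, while my_mean(my_vector) actually calls it.

```r
# The function from the question, with the stray last line removed
my_mean <- function(my_vector) {
  sum(my_vector) / length(my_vector)
}

my_vector <- c(1, 3, 5)

# Without parentheses, my_mean is just the function object
is_fun <- is.function(my_mean)

# With parentheses and an argument, the function is called
result <- my_mean(my_vector)
print(result)
```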

Related

Function does not work with lubridate/mutate/across but works with a loop

I try to fix dates (years) using a function
change_century <- function(x){
a <- year(x)
ifelse(test = a >2020,yes = year(x) <- (year(x)-100),no = year(x) <- a)
return(x)
}
The function works for a specific row, or when using a loop over one column (here, date of birth):
for (i in c(1:nrow(Df))){
Df_recode$DOB[i] <- change_century(Df$DOB[i])
}
Then I try to use mutate/across:
Df_recode <- Df %>% mutate(across(list_variable_date,~change_century(.)))
It does not work. Is there something I am getting wrong? Thank you!
Try:
change_century <- function(x){
a <- year(x)
newx <- ifelse(test = a > 2020, yes = a - 100, no = a)
return(newx)
}
(Frankly, the use of newx as a temporary storage and then returning it was done that way solely to introduce minimal changes in your code. In general, in this case one does not need return, in fact theoretically it adds an unnecessary function to the evaluation stack. I would tend to have two lines in that function: a <- year(x) and ifelse(..), without assignment. The default behavior in R is to return the value of the last expression, which in my case would be the results of ifelse, which is what we want. Assigning it to newx and then return(newx) or even just newx as the last expression has exactly the same effect.)
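Note that the function above returns year numbers, not dates. If the columns should stay dates, one option is to rewrite only the year component and return the date. The following is a base-R sketch, not the answer's own method: it assumes the input is a Date vector, and change_century_date and dob are hypothetical names chosen for illustration.

```r
# Hypothetical helper: move years after 2020 back one century, keep Date class
change_century_date <- function(x) {
  lt <- as.POSIXlt(x)                 # broken-down time; $year counts from 1900
  yr <- lt$year + 1900
  # capture the result of ifelse instead of assigning inside it
  lt$year <- ifelse(yr > 2020, yr - 100, yr) - 1900
  as.Date(lt)
}

dob <- as.Date(c("2045-03-01", "1999-12-31", NA))
fixed <- change_century_date(dob)
print(fixed)
```

Because the function takes a whole vector and returns a vector of the same length, it should also be usable as the function argument inside mutate(across(...)).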
Rationale
ifelse cannot have variable assignment within it. That's not to say that it is a syntax error (it is not), but that it is counter to its intent. You are asking the function to go through each condition found in test=, and return a value based on it. Whenever test= contains both TRUE and FALSE values, both yes= and no= are evaluated completely, and then ifelse joins them together as needed.
For demonstration,
ifelse(test = c(TRUE, FALSE, TRUE), yes = 1:3, no = 11:13)
The return value is something like:
c(
if (test[1]) yes[1] else no[1],
if (test[2]) yes[2] else no[2],
if (test[3]) yes[3] else no[3]
)
# c(1, 12, 3)
To capture the results of the zipped-together yeses and nos c(1, 12, 3), one must capture the return value from ifelse itself, not inside of the call to ifelse.
Another point that may be relevant: ifelse(cond, yes, no) is not at all a shortcut for if (cond) { yes } else { no }. Some key differences:
In if, the cond must always be exactly length 1, no more, no less.
In R < 4.2, length 0 returns the error "argument is of length zero", while length 2 or more produces the warning "the condition has length > 1 and only the first element will be used".
In R >= 4.2, both cases produce an error (no warnings).
ifelse is intended to be vectorized, so the cond (its test= argument) can be any length. yes= and no= should either be the same length as test= or length 1 (recycling is in effect here); test= should really be as long as the longer of yes= and no=, because the result always has the length of test=.
if benefits from the short-circuiting of || and &&, meaning that if (TRUE || stop("quux")) 1 will never attempt to evaluate stop. This can be very useful when one condition will fail (logically or with a literal error) if attempted on a NULL object, such as if (!is.null(quux) && quux > 5) ....
Conversely, ifelse does no element-wise short-circuiting: test= is evaluated in full, yes= is evaluated in full whenever any element of test= is TRUE, and no= is evaluated in full whenever any element is FALSE.
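The differences between if and ifelse can be checked with a few lines of base R. This is a small sketch with made-up values; the last call relies on the behaviour described in ?ifelse, where yes is only evaluated if at least one element of test is TRUE.

```r
# ifelse is vectorized: one result per element of test
test <- c(TRUE, FALSE, TRUE)
zipped <- ifelse(test, 1:3, 11:13)

# if with && short-circuits: quux > 5 is never evaluated when quux is NULL
quux <- NULL
guarded <- if (!is.null(quux) && quux > 5) "big" else "not big"

# yes= is not evaluated at all here, because no element of test is TRUE
all_no <- ifelse(c(FALSE, FALSE), stop("never evaluated"), 0)

print(zipped)
print(guarded)
print(all_no)
```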

Function raises error with return statement

I want to apply a custom function to every cell using the calc function of the "raster" package.
Everything works perfectly when I print the final result of the function (the value I want to return), but when I use a return statement, I get an error:
Error in .local(x, values, ...) :
values must be numeric, integer or logical.
Here is the code leading to that error
inR <- 'D://test/TS_combined_clipped.tif'
outR <- 'D://test/R_test3.tif'
rasterB <- brick(inR)
fun1 <-function(x){
years = seq(1, 345)
na_idx = which(is.na(x))
years = years[-na_idx]
x <- na.omit(x)
idx = detectChangePoint(x, cpmType='Student', ARL0=500)$changePoint
return(years[idx]) # this raises error
# print(years[idx]) # This does *not* raises any error
}
r <- calc(rasterB, fun=fun1, filename=outR, overwrite=TRUE)
How can a return statement make it fail?
Some of my tests suggest that the process fails just after calc has run the function on the very last cell of the rasterBrick.
But I have no clue where to start fixing this.
Input image is available here
[EDIT]
I just noticed that if I use return(idx) instead of return(years[idx]), the process works without raising an error.
So the problem seems to lie in fetching the value from the years variable.
Is there anything in particular that I missed about indexing in R?
A comment from user2554330 put me on the right track: the issue was that calc cannot handle a numeric(0) result, which is what years[idx] yields when idx is 0.
The updated code is:
inR <- 'D://test/TS_combined_clipped.tif'
outR <- 'D://test/R_test3.tif'
rasterB <- brick(inR)
fun1 <-function(x){
years = seq(1, 345)
keep = !is.na(x) # logical mask; years[-which(is.na(x))] would drop every year when x has no NA
years = years[keep]
x <- na.omit(x)
idx = detectChangePoint(x, cpmType='Student', ARL0=500)$changePoint
if (idx==0){
return(0)
} else {
return(as.integer(years[idx]))
}
}
r <- calc(rasterB, fun=fun1, filename=outR, overwrite=TRUE)
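The indexing behaviour behind this error can be reproduced without the raster package. The sketch below uses a short made-up vector in place of the real time series; it shows that a zero index selects nothing, and that negating an empty index vector also selects nothing, which is a related pitfall when a cell contains no NA values.

```r
years <- seq(1, 5)

# A zero index returns a zero-length vector, not an error
zero_pick <- years[0]

# A cell without any NA values (made-up data)
x <- c(10, 20, 30, 40, 50)
na_idx <- which(is.na(x))      # integer(0)

# Negating an empty index drops everything instead of keeping everything
dropped_all <- years[-na_idx]

# A logical mask keeps the intended elements in both cases
kept <- years[!is.na(x)]

print(length(zero_pick))
print(length(dropped_all))
print(kept)
```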

Check the name of a function passed as an argument in an R function

Suppose I wrote a function that accepts another function as an argument:
fct1 <- function(FUN) {
# If FUN is rnorm, then do this.
# If FUN is rbeta, then do this.
}
How should I check whether FUN is rnorm?
I know I can convert the function to a list with as.list() and then coerce it to a string with toString():
toString(as.list(rnorm))
The result is:
", 0, 1, .Call(C_rnorm, n, mean, sd)"
I can then check the content for C_rnorm.
But I think this is not a good solution. I also read somewhere (I cannot remember the source) that coercing a closure to a list and then a string is possible just for backward compatibility, and is not encouraged.
I also thought about body(). For example,
body(rnorm)
The result is:
.Call(C_rnorm, n, mean, sd)
However, then how can I check if C_rnorm is inside the call? I tried to use as.list() and then toString():
toString(as.list(body(rnorm)))
This is the result:
".Call, C_rnorm, n, mean, sd"
However, is this a good practice?
You can use match.call:
fct1 <- function(FUN) {
called <- match.call()$FUN
if(called == "rnorm") {
return("Passed rnorm")
} else {
return("Not rnorm")
}
}
fct1(rnorm)
# [1] "Passed rnorm"
fct1(rlnorm)
# [1] "Not rnorm"
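A name-based check like the one above looks at the expression written at the call site, so it would not recognise rnorm passed under another name. As a possible alternative, the function object itself can be compared with identical(). This is only a sketch: fct2 and my_alias are hypothetical names, and it assumes the functions of interest are the ones exported by the stats package.

```r
# Hypothetical variant that compares function objects instead of names
fct2 <- function(FUN) {
  if (identical(FUN, stats::rnorm)) {
    "Passed rnorm"
  } else if (identical(FUN, stats::rbeta)) {
    "Passed rbeta"
  } else {
    "Something else"
  }
}

my_alias <- rnorm   # same function under a different name

print(fct2(rnorm))
print(fct2(my_alias))
print(fct2(rbeta))
print(fct2(rlnorm))
```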

R: on.exit - use returned value without knowing its name

I have the function below. I cannot alter the function in any way except for the first block of code in it.
In this simple example I want to apply some function to the returned object.
The point is that the name of the variable returned by the function may vary, and I am not able to guess it.
Obviously I also cannot wrap the f function in { x <- f(); myfun(x); x }.
The .Last.value in my on.exit call below is meant to represent the value to be returned by f.
f <- function(param){
# the only code I know - start
on.exit(if("character" %in% class(.Last.value)) message(print(.Last.value)) else message(class(.Last.value)))
# the only code I know - end
# real processing of f()
a <- "aaa"
"somethiiiing"
if(param==1L) return(a)
b <- 5L
"somethiiiing"
if(param==2L) return(b)
"somethiiiing"
return(32)
}
f(1L)
# function
# [1] "aaa"
f(2L)
# aaa
# [1] 5
f(3L)
# integer
# [1] 32
The above code with .Last.value seems to work with a lag (so in fact it does not work). Also, .Last.value is probably not the way to go, because I want to use the value several times, as in if(fun0(x)) fun1(x) else fun2(x), and because the returned value might be a big object, copying it on the side is also a bad approach.
Is there any way to use on.exit, or any other function, to run my function on the result of f without knowing the name of the result variable?
In a similar way to how you are modifying the function, you could easily wrap it as well. Here's a reproducible example.
library(data.table)
append.log<-function(x) {
cat(paste("value:",x,"\n"))
}
idx.dt <- data.table:::`[.data.table`
environment(idx.dt)<-asNamespace("data.table")
idx.wrap <- function(...) {
x<-do.call(idx.dt, as.list(substitute(...())), envir=parent.frame())
append.log(if(is(x, "data.table")) {
nrow(x)
} else { NA })
x
}
environment(idx.wrap)<-asNamespace("data.table")
(unlockBinding)("[.data.table",asNamespace("data.table"))
assign("[.data.table",idx.wrap,envir=asNamespace("data.table"),inherits=FALSE)
dt<-data.table(a=1:10, b=seq(2, 20, by=2), c=letters[1:10])
dt[a%%2==0]
Since R 3.2.0 this is possible, thanks to the new function returnValue().
A working example is below.
f <- function(x, err = FALSE){
pt <- proc.time()[[3L]]
on.exit(message(paste("proc.time:",round(proc.time()[[3L]]-pt,4),"\nnrow:",as.integer(nrow(returnValue()))[1L])))
Sys.sleep(0.001)
if(err) stop("some error")
return(x)
}
dt <- data.frame(a = 1:5, b = letters[1:5])
f(dt)
f(dt, err=T)
f(dt)
f(dt[dt$a %in% 2:3 & dt$b %in% c("c","d"),])
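Applied to the original question, the same idea can be sketched in a smaller form: the on.exit handler inspects whatever value is about to be returned, regardless of which variable held it. The function name describe_result is hypothetical, and the example assumes R >= 3.2.0 so that returnValue() is available.

```r
# Hypothetical function: the on.exit block is the only part that "knows" nothing
# about the names used in the body
describe_result <- function(param) {
  on.exit({
    value <- returnValue()
    message("class of result: ", class(value)[1L])
  })
  a <- "aaa"
  if (param == 1L) return(a)
  b <- 5L
  if (param == 2L) return(b)
  return(32)
}

r1 <- describe_result(1L)
r2 <- describe_result(2L)
r3 <- describe_result(3L)
```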

My function works in R console but not in R script

I tried to write some functions to calculate ANOVA power and sample size using the non-centrality parameter.
There are some very good functions in R for this, but my functions were meant to learn and reproduce the line of thought from a biostatistics book.
Despite the math involved, my "nc" and "fpower" functions work well, as expected:
nc <- function(diff,n,sd) {
nonc <- (diff^2/2)*(n/sd^2)
return(nonc)
}
fpower <- function(k,n,diff,sd,alpha=0.05) {
nonc <- nc(diff,n,sd)
dfn <- k - 1
dfd <- k*(n-1)
f1 <- qf(1-alpha,dfn,dfd)
f2 <- pf(f1,dfn,dfd,nonc)
return(1-f2)
}
However, my "fsample" function doesn't work as expected. It returns 2, the first n in the sequence.
fsample <- function(k,diff,sd,alpha=0.05,power=0.9){
for(n in 2:5000){
if ( fpower(k,n,sd,alpha) >= power) break
}
return(n)
}
But if I run this code by hand in the console, it works as expected and returns the right n value.
What's wrong?
You didn't pass the diff argument to fpower, so the arguments aren't in the order you think they are. fsample should be:
fsample <- function(k,diff,sd,alpha=0.05,power=0.9){
for(n in 2:5000){
if ( fpower(k,n,diff,sd,alpha) >= power) break
}
return(n)
}
Note that this wouldn't have been a problem if you had named the arguments when you called fpower because you would have received an error about diff being missing and not having a default value:
# this will error
fsample <- function(k,diff,sd,alpha=0.05,power=0.9){
for(n in 2:5000){
if ( fpower(k=k,n=n,sd=sd,alpha=alpha) >= power) break
}
return(n)
}
Also, you might want to avoid giving data objects the same name as functions (e.g. diff, sd, and power are also functions), otherwise you may confuse yourself.
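The effect of the missing argument can be seen by comparing a positional call with the equivalent named call. The sketch below restates fpower from the question in compact form so that it runs on its own; the numeric inputs are made up for illustration.

```r
# fpower as in the question, with nc() inlined
fpower <- function(k, n, diff, sd, alpha = 0.05) {
  nonc <- (diff^2 / 2) * (n / sd^2)
  dfn <- k - 1
  dfd <- k * (n - 1)
  1 - pf(qf(1 - alpha, dfn, dfd), dfn, dfd, nonc)
}

# Positional call with diff left out: sd (here 1) is matched to diff,
# alpha (here 0.05) is matched to sd, and alpha falls back to its default
positional <- fpower(3, 2, 1, 0.05)
explicit   <- fpower(k = 3, n = 2, diff = 1, sd = 0.05)

# Naming the arguments turns the silent mismatch into an error
named_err <- tryCatch(
  fpower(k = 3, n = 2, sd = 1, alpha = 0.05),
  error = function(e) conditionMessage(e)
)

print(positional)
print(named_err)
```

With a standard deviation this small the computed power is already very high at n = 2, which would explain why the loop stops at its first iteration.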
