I wrote a function mean() that takes no arguments. What I actually wanted was to calculate the mean of some numbers:
mean <- function() {
ozonev <- data[1]
mv <- is.na(ozonev)
mean(ozonev)
}
Because R already has a built-in function mean(), this code recurses infinitely. I also tried renaming my function, but my earlier definition of mean still exists. Can anybody help me remove the mean function I defined, so that the original mean() works again?
> source("solution.R")
> ls()
[1] "colname" "data" "firtr" "fsr" "last" "mean" "meano"
[8] "missv" "rowno" "x" "y"
solution.R is the script and mean is the function. meano is the renamed function.
You should use rm to remove your mean function.
rm(mean)
# or, if you defined it in another environment
rm(mean, envir = your_env)  # your_env must be an environment object, not a string
But you can't remove mean from the base package; it's locked against modification.
Try this.
mean <- NULL
(I don’t know exactly why this works. But this works.)
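As a minimal sketch of why both suggestions work: when R evaluates a call such as mean(...), it looks only for function objects under that name and skips any non-function binding, so a NULL stored under mean does not hide base::mean. The stand-in user-defined mean below is hypothetical.

```r
# A user-defined mean() masks base::mean() in the current environment.
mean <- function() stop("this is the user-defined mean")

# Option 1: remove the user-defined binding entirely.
rm(mean)
restored <- identical(mean, base::mean)

# Option 2: bind a non-function to the name. In a call such as mean(...),
# R looks only for functions, skips the NULL binding, and finds base::mean.
mean <- NULL
value <- mean(c(1, 2, 3))

# Clean up the NULL binding as well.
rm(mean)
```

Of the two, rm(mean) is the tidier fix, because it leaves no stray object named mean in the workspace.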
There’s a very easy way of avoiding the recursion: call base::mean explicitly:
mean <- function() {
ozonev <- data[1]
mv <- is.na(ozonev)
base::mean(ozonev)
}
Alternatively, you can do something like this:
old_mean <- mean
mean <- function() {
ozonev <- data[1]
mv <- is.na(ozonev)
old_mean(ozonev)
}
This has the advantage that it will call any pre-existing mean function that may previously have overridden base::mean. However, a cleaner approach is usually to make the function an S3 generic and write a method for a certain class.
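A minimal sketch of the S3 approach, assuming the data can carry its own class: mean() in base R is already an S3 generic, so defining a method is enough and nothing gets masked. The class name ozone and the sample values are hypothetical.

```r
# Method for the hypothetical class "ozone": drop the class, ignore NAs,
# and delegate to the base implementation.
mean.ozone <- function(x, ...) {
  base::mean(unclass(x), na.rm = TRUE)
}

# Hypothetical sample data tagged with the class.
ozone <- structure(c(41, NA, 12, 18), class = "ozone")

# Dispatches to mean.ozone(); base::mean itself is untouched.
ozone_mean <- mean(ozone)

# Plain numeric vectors still use the default method.
plain_mean <- mean(c(1, 2, 3))
```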
You can edit the function inside the R script, or inside your interactive R session by typing the command
mean <- edit(mean)
(Note that you need to reassign the result of edit, as I’ve done here!)
Let's say I have the function
mean_wrapper <- function(x) {
mean(x)
}
How can I check whether the mean function is called?
One use case is checking this behavior in a unit test.
EDIT:
Here is another example to be clearer. Consider this function:
library(readr)
library(magrittr)
read_data <- function(file_name) {
read_csv(file_name) %>%
validate_data()
}
The aim of read_data is to read a CSV file and validate it. validate_data performs some checks on the data. It raises an error if one of them fails; otherwise it returns the input object.
I want to test both functions, but I don't want to replicate in the tests for read_data the same tests I wrote for validate_data. I still have to check that validate_data has been called in read_data, so I would like to write a test that does this for me.
You could trace mean:
trace(mean, tracer = quote(message("mean was called")))
mean_wrapper(3)
#Tracing mean(x) on entry
#mean was called
#[1] 3
untrace(mean)
#Untracing function "mean" in package "base"
Instead of a message you can use anything (e.g., assignment to a variable in the enclosing environment) as tracer.
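A minimal sketch of the variable-assignment variant, as it might be used in a unit test. The counter name mean_calls is hypothetical; it is kept in the global environment because the tracer expression is evaluated inside the frame of mean, and from there the global environment is always reachable.

```r
mean_wrapper <- function(x) {
  mean(x)
}

# Hypothetical counter, stored in the global environment.
assign("mean_calls", 0, envir = globalenv())

trace(
  mean,
  tracer = quote(
    assign("mean_calls",
           get("mean_calls", envir = globalenv()) + 1,
           envir = globalenv())
  ),
  print = FALSE  # suppress the "Tracing ... on entry" message
)

result <- mean_wrapper(3)

# Always restore the original function once the check is done.
untrace(mean)

n_calls <- get("mean_calls", envir = globalenv())
```

A test can then assert that n_calls is 1. In a real test file it is safer to put untrace(mean) in an on.exit() handler, so the trace is removed even if the test fails.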
Sorry for my poor English, but I cannot think of a title that concisely describes my problem, which is a little more complicated than the title suggests.
Here is what I'd like to achieve:
In the global environment, one can get the name of a variable, say xyz, by calling deparse(substitute(xyz)). I need to use this at several places in my code so I decided to make it a function: getVarName <- function(x){deparse(substitute(x))}.
Then I write some function myfunc in which I need to call getVarName:
myfunc <- function(x){cat(getVarName(x))}
Now here is the problem: when I call myfunc(y), instead of printing out y, it still prints out x. I suspect it has something to do with the environment in which substitute() does the trick, but I had no luck in that direction.
What do I have to do to make it right?
P.S. It would be nice if someone could edit the title with a better description of this question, thank you!
From what I saw while testing your code, it appears that deparse(substitute(x)) only recovers the expression that was passed to the function it is called in, that is, one level up the call stack. In your example:
getVarName <- function(x){ deparse(substitute(x)) }
myfunc <- function(x){ cat(getVarName(x)) }
myfunc(y)
The call to getVarName() receives its argument from myfunc(), where the variable is called x. In effect, the name y that you passed to myfunc() is no longer visible to substitute() at that level.
Solution:
Just use deparse(substitute(x)) directly in the function where you want to print the name of the variable. It's concise, and you could justify not having a helper function as easily as having one.
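A minimal sketch of that suggestion, reusing myfunc from the question. The local name var_name and the invisible return value are additions for illustration only.

```r
myfunc <- function(x) {
  # substitute() is called in the same function that received the
  # argument, so it sees the expression the caller wrote.
  var_name <- deparse(substitute(x))
  cat(var_name, "\n")
  invisible(var_name)
}

y <- 42
printed <- myfunc(y)
```

Here substitute(x) returns the symbol y because it runs in the frame that was called with y, not one level further down.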
This is typically the kind of functional programming problem where you can use a decorator:
decorator = function(f)
{
function(...)
{
print(as.list(match.call()[-1]))
f(...)
}
}
foo = function(x,y,z=2) paste0(x,y,z)
superFoo = decorator(foo)
Results:
> xx=34
> superFoo('bigwhale',xx)
[[1]]
[1] "bigwhale"
[[2]]
xx
[1] "bigwhale342"
The following simple example will help me address a problem in my program implementation.
fun2<-function(j)
{
x<-rnorm(10)
y<-runif(10)
Sum<-sum(x,y)
Prod<-prod(x,y)
return(Sum)
}
j=1:10
Try<-lapply(j,fun2)
I want to store "Prod" at each iteration so I can access it after running the function fun2. I tried using assign() to create space, assign("Prod",numeric(10),pos=1), and then assigning Prod at the j-th iteration to Prod[j], but it does not work.
Any idea how this can be done?
Thank you
You can return anything you like with return(). You could return a list, return(list(Sum,Prod)), or a data frame, return(data.frame("In"=j,"Sum"=Sum,"Prod"=Prod)).
I would then combine that list of data frames into a single data frame:
Try2 <- do.call(rbind,Try)
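Put together, the data frame variant might look like the sketch below. It reuses fun2 and the column names from this answer; the set.seed() call is an addition so the run is repeatable.

```r
fun2 <- function(j) {
  x <- rnorm(10)
  y <- runif(10)
  # Return both results, plus the iteration index, as a one-row data frame.
  data.frame(In = j, Sum = sum(x, y), Prod = prod(x, y))
}

set.seed(1)  # only for repeatability
Try <- lapply(1:10, fun2)

# Stack the ten one-row data frames into a single data frame.
Try2 <- do.call(rbind, Try)
```

Try2$Prod then holds the product from every iteration, with no side effects outside the function.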
Maybe re-think the problem in a more vectorized way, taking advantage of the implied symmetry to represent intermediate values as a matrix and operating on that
ni = 10; nj = 20
x = matrix(rnorm(ni * nj), ni)
y = matrix(runif(ni * nj), ni)
sums = colSums(x + y)
prods = apply(x * y, 2, prod)
Thinking about a vectorized version applies to whatever your 'real' problem is just as much as it does to the sum / prod example. In practice, when thinking in terms of vectors fails, I've never used the environment or concatenation approaches in other answers, but rather the simple solution of returning a list or vector.
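I have done this before, and it works. It's good for a quick fix, but it's kind of sloppy. The <<- operator assigns to a variable outside the function: it searches the enclosing environments for an existing variable of that name and, if it finds none, creates one in the global environment.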
fun2<-function(j){
x<-rnorm(10)
y<-runif(10)
Sum<-sum(x,y)
Prod[j]<<-prod(x,y)
}
j=1:10
Prod <- numeric(length(j))
Try<-lapply(j,fun2)
Prod
thelatemail and JeremyS's solutions are probably what you want. Using lists is the normal way to pass back a bunch of different data items and I would encourage you to use it. Quoted here so no one thinks I'm advocating the direct option.
return(list(Sum,Prod))
Having said that, suppose that you really don't want to pass them back, you could also put them directly in the parent environment from within the function using either assign or the superassignment operator. This practice can be looked down on by functional programming purists, but it does work. This is basically what you were originally trying to do.
Here's the superassignment version
fun2<-function(j)
{
x<-rnorm(10)
y<-runif(10)
Sum<-sum(x,y)
Prod[j] <<- prod(x,y)
return(Sum)
}
j=1:10
Prod <- numeric(10)
Try<-lapply(j,fun2)
Note that the superassignment searches back for the first environment in which the variable exists and modifies it there. It's not appropriate for creating new variables above where you are.
And an example version using the environment directly
fun2<-function(j,env)
{
x<-rnorm(10)
y<-runif(10)
Sum<-sum(x,y)
env$Prod[j] <- prod(x,y)
return(Sum)
}
j=1:10
Prod <- numeric(10)
Try<-lapply(j,fun2,env=parent.frame())
Notice that if you had called parent.frame() from within the function you would need to go back two frames because lapply() creates its own. This approach has the advantage that you could pass it any environment you want instead of parent.frame() and the value would be modified there. This is the seldom-used R implementation of writeable passing by reference. It's safer than superassignment because you know where the variable is that is being modified.
I am running the following code:
disc<-for (i in 1:33) {
m=n[i]
xbar<-sum(data[i,],na.rm=TRUE)/m
Sx <- sqrt(sum((data[i,]-xbar)^2,na.rm=TRUE)/(m-1))
Sx
i=i+1}
Running it:
>disc
NULL
Why is it giving me NULL?
This is from the documentation for for, accessible via ?`for`:
‘for’, ‘while’ and ‘repeat’ return ‘NULL’ invisibly.
Perhaps you are looking for something along the following lines:
library(plyr)
disc <- llply(1:33, function(i) {
m=n[i]
xbar<-sum(data[i,],na.rm=TRUE)/m
Sx <- sqrt(sum((data[i,]-xbar)^2,na.rm=TRUE)/(m-1))
Sx
})
Other variants exist: the ll in llply stands for "list in, list out". Perhaps your intended final result is a data frame or an array; appropriate functions exist for those too.
The code above is a plain transformation of your example. We might be able to do better by splitting data right away and forgetting the otherwise useless count variable i (untested, as you have provided no data):
disc <- daply(cbind(data, n=n), .(), function(data.i) {
m=data.i$n
xbar<-sum(data.i,na.rm=TRUE)/m
sqrt(sum((data.i-xbar)^2,na.rm=TRUE)/(m-1))
})
See also the plyr website for more information.
Related (if not a duplicate): R - How to turn a loop to a function in R
krlmlr's answer shows you how to fix your code, but to explain your original problem in more abstract terms: a for loop allows you to run the same piece of code multiple times, but it doesn't store the results of running that code for you; you have to do that yourself.
Your current code only really assigns a single value, Sx, for each run of the for loop. On the next run, a new value is put into the Sx variable, so you lose all the previous values. At the end, you'll just end up with whatever the value of Sx was on the last run through the loop.
To save the results of a for loop, you generally need to add them to a vector as you go through, e.g.
# Create the empty results vector outside the loop
results = numeric(0)
for (i in 1:10) {
current_result = 3 + i
results = c(results, current_result)
}
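Growing a vector with c() copies it on every iteration. As a sketch of two common alternatives for the same toy computation (the names n_iter and results2 are hypothetical): pre-allocate the vector and fill it by index, or let vapply() collect the results.

```r
n_iter <- 10

# Pre-allocate the full results vector, then fill it by index.
results <- numeric(n_iter)
for (i in seq_len(n_iter)) {
  results[i] <- 3 + i
}

# The same computation with vapply(), which collects the values itself
# and checks that each one is a single number.
results2 <- vapply(seq_len(n_iter), function(i) 3 + i, numeric(1))
```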
In R, a for loop does not return a useful value (it returns NULL invisibly). To get a value out, assign it inside the loop and return it from a function. So one solution here is to wrap your loop in a function. For example:
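getSx <- function(){
# one slot per row, so no result is overwritten
Sx <- numeric(33)
for (i in 1:33) {
m=n[i]
xbar <- sum(data[i,],na.rm=TRUE)/m
# store the standard deviation of row i
Sx[i] <- sqrt(sum((data[i,]-xbar)^2,na.rm=TRUE)/(m-1))
}
# return all 33 values
Sx
}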
Then you call it:
getSx()
Of course, you can avoid the side effects of a for loop by using lapply or a vectorized solution. But that is another problem: you should give a reproducible example and explain what you are trying to compute.
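As a sketch of what such a vectorized solution might look like, assuming (this is not stated in the question) that n[i] is the number of non-missing values in row i, so that each Sx is just the sample standard deviation of a row. The small matrix below is made-up example data.

```r
# Made-up stand-in for the asker's data: 2 rows, one missing value.
data <- matrix(c(1, 2, 3, NA,
                 4, 5, 6, 7),
               nrow = 2, byrow = TRUE)

# Sample standard deviation of every row, ignoring missing values.
disc <- apply(data, 1, sd, na.rm = TRUE)
```

If n counts something other than the non-missing values, the explicit formula from the question has to be kept instead of sd().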
How can I ensure that when a function is called it is not allowed to grab variables from the global environment?
I would like the following code to give me an error. The reason is that I might have mistyped z (I meant to type y).
z <- 10
temp <- function(x,y) {
y <- y + 2
return(x+z)
}
> temp(2,1)
[1] 12
I'm guessing the answer has to do with environments, but I haven't understood those yet.
Is there a way to make my desired behavior default (e.g. by setting an option)?
> library(codetools)
> checkUsage(temp)
<anonymous>: no visible binding for global variable 'z'
The function doesn't change, so no need to check it each time it's used. findGlobals is more general, and a little more cryptic. Something like
Filter(Negate(is.null), eapply(.GlobalEnv, function(elt) {
if (is.function(elt))
findGlobals(elt)
}))
could visit all functions in an environment, but if there are several functions then maybe it's time to think about writing a package (it's not that hard).
environment(temp) = baseenv()
See also http://cran.r-project.org/doc/manuals/R-lang.html#Scope-of-variables and ?environment.
environment(fun) = parent.env(environment(fun))
(I'm using 'fun' in place of your function name 'temp' for clarity)
This will remove the "workspace" environment (.GlobalEnv) from the function's lookup path and leave everything else (e.g. all attached packages).
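A minimal sketch of the effect, using temp and z from the question. The tryCatch() wrapper and the name msg are additions so the error can be inspected instead of stopping the script.

```r
z <- 10
temp <- function(x, y) {
  y <- y + 2
  return(x + z)
}

# Skip the environment where temp (and z) were defined; lookups now start
# at its parent, so attached packages and base are still visible.
environment(temp) <- parent.env(environment(temp))

# z can no longer be found, so the call fails instead of returning 12.
msg <- tryCatch(temp(2, 1), error = function(e) conditionMessage(e))
```

The catch is that the function also loses access to any helper functions defined in the workspace, so this suits small self-contained functions best.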