I'm new to R and I'm just figuring out environments and how to work with them. In the code below, I understand that we are creating a list of 4 functions. But in the environment of makeVector, an object m is also created and initially set to NULL. What I don't understand (coming from a C background) is: how is this m stored?
I understand that makeVector has its own environment and has 6 things under it - x, m, set, get, setmean, getmean.
makeVector <- function(x = numeric()) {
  m <- NULL
  set <- function(y) {
    x <<- y
    m <<- NULL
  }
  get <- function() x
  setmean <- function(mean) m <<- mean
  getmean <- function() m
  list(set = set, get = get,
       setmean = setmean,
       getmean = getmean)
}
For example,
x1 <- makeVector(as.numeric(1:4))
x2 <- makeVector(as.numeric(1:5))
Now, my question is: do x1 and x2 each have their own copy of m, or do they share the same object m?
Again, if they do have different copies of m, isn't makeVector like a class in C, with its own objects and functions? Can someone clarify, because I'm having a very tough time figuring it out myself.
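A quick way to check this, sketched here with the makeVector definition from the question: store a mean through x1 only and see whether x2 is affected. As far as R's scoping rules go, every call to makeVector() creates a fresh environment, so each returned list should carry its own x and m.

```r
makeVector <- function(x = numeric()) {
  m <- NULL
  set <- function(y) {
    x <<- y
    m <<- NULL
  }
  get <- function() x
  setmean <- function(mean) m <<- mean
  getmean <- function() m
  list(set = set, get = get, setmean = setmean, getmean = getmean)
}

x1 <- makeVector(as.numeric(1:4))
x2 <- makeVector(as.numeric(1:5))

x1$setmean(mean(x1$get()))  # stores 2.5 in x1's environment only
x1$getmean()                # 2.5
x2$getmean()                # NULL: x2's m is untouched

# The two sets of functions are enclosed by different environments
identical(environment(x1$get), environment(x2$get))  # FALSE
```

In that sense the comparison with an object holding private fields and methods is a reasonable mental model, although R implements it with closures rather than a class declaration.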
Related
I'm looking to deal with call evaluations but am out of my depth when it comes to S3 methods. Basically, I am wondering why a variable that I pass to a function call is not evaluated but remains the name of the variable rather than its value. And all of this depends on whether I name the argument in the function call or not.
Let me illustrate with a short example:
I first create a quick function to create a sample class to be used with S3 Methods:
create_myS3 <- function(a, b){
out <- list()
out$a <- a
out$b <- b
class(out) <- "myS3"
return(out)
}
Now the set-up that I am interested in features a number of functions within each other. I first create an S3 method for this myS3 class, let's call it m and we define a specific routine for the myS3 class as well as a default method. Note that the myS3 version calls the default version.
m <- function(x, ...){UseMethod("m")}
m.myS3 <- function(x, estimator = NULL){
y <- list()
y$a <- x$a + 1
y$b <- x$b + 1
out <- m.default(y,
estimator)
return(out)
}
m.default <- function(x, estimator = NULL, ...){
out <- list()
out$call <- sys.call()
out$result <- x$a - x$b
out$aux$estimator <- estimator
return(out)
}
Now that we have defined the functions, we can look at the results function that I'm interested in:
h <- function(x){
out <- list()
out$result_call <- if(is.null(x$call$estimator)){"Success"}else{"Fail"}
out$result_list <- if(is.null(x$aux$estimator)){"Success"}else{"Fail"}
return(out)
}
Its entire purpose is to check whether the estimator element is present in the object passed to it and to give a message based on that.
Ok, now let's put it all together:
g <- function(x){
object <- m(x)
out <- h(object)
return(out)
}
initial <- create_myS3(10,5)
g(initial)
The g() function now calls m() on the input, which was created with the create_myS3 function - so is of class myS3 and is therefore passed to m.myS3 before it is passed to m.default. The resulting object is then passed to h() - in all cases we have not set the estimator argument, which then defaults to NULL and both my check statements in h() return Success.
Now all I do is change one tiny thing: I now modify m.myS3 to call the m.default not just with the order of the input variables but now I also specify the option - in my mind the more robust way. So to clarify, from this m.default(y, estimator) I change it to m.default(x = y, estimator = estimator).
This change then changes my results from h() to Fail for the evaluation if(is.null(x$call$estimator)){"Success"}else{"Fail"} while if(is.null(x$aux$estimator)){"Success"}else{"Fail"} results in Success.
The reason for this is that the call statement evaluates to estimator rather than to its true value NULL.
Is there an easy way to evaluate this call to its true value (I have tried eval and deparse)? Or, even better, is there a way to ensure that in m.myS3 the value is always passed rather than the variable?
Here below is the total code for convenience:
create_myS3 <- function(a, b){
out <- list()
out$a <- a
out$b <- b
class(out) <- "myS3"
return(out)
}
m <- function(x, ...){UseMethod("m")}
m.myS3 <- function(x, estimator = NULL){
y <- list()
y$a <- x$a + 1
y$b <- x$b + 1
out <- m.default(y,
estimator)
return(out)
}
m.default <- function(x, estimator = NULL, ...){
out <- list()
out$call <- sys.call()
out$result <- x$a - x$b
out$aux$estimator <- estimator
return(out)
}
h <- function(x){
out <- list()
out$result_call <- if(is.null(x$call$estimator)){"Success"}else{"Fail"}
out$result_list <- if(is.null(x$aux$estimator)){"Success"}else{"Fail"}
return(out)
}
g <- function(x){
object <- m(x)
out <- h(object)
return(out)
}
initial <- create_myS3(10,5)
g(initial)
$result_call
[1] "Success"
$result_list
[1] "Success"
## Changing m.myS3 (only change is to name the option of function m.default)
m.myS3 <- function(x, estimator = NULL){
y <- list()
y$a <- x$a + 1
y$b <- x$b + 1
out <- m.default(x = y,
estimator = estimator)
return(out)
}
g(initial)
$result_call
[1] "Fail"
$result_list
[1] "Success"
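The behaviour above can be reproduced in a stripped-down sketch. The function names record_call, wrapper_positional, wrapper_named, record_value and wrapper below are hypothetical and only serve the illustration. sys.call() stores the argument expressions as written, so a named argument estimator = estimator shows up in the call as the symbol estimator, while a positional argument has no name to look up with $. One possible way to recover the value is to evaluate the recorded expression in the caller's frame.

```r
# Hypothetical helper: records the call and the evaluated argument
record_call <- function(x, estimator = NULL, ...) {
  list(call = sys.call(), estimator = estimator)
}

wrapper_positional <- function(estimator = NULL) record_call(1, estimator)
wrapper_named <- function(estimator = NULL) record_call(x = 1, estimator = estimator)

pos <- wrapper_positional()
nam <- wrapper_named()

is.null(pos$call$estimator)  # TRUE: no element of the call is named "estimator"
is.name(nam$call$estimator)  # TRUE: the call holds the symbol, not its value
is.null(nam$estimator)       # TRUE: the evaluated argument is still NULL

# Hypothetical variant: evaluate the recorded expression where it was written
record_value <- function(x, estimator = NULL) {
  cl <- sys.call()
  eval(cl$estimator, parent.frame())
}
wrapper <- function(estimator = NULL) record_value(x = 1, estimator = estimator)

wrapper()       # NULL
wrapper("ols")  # "ols"
```

If only the value matters, checking the evaluated argument (as x$aux$estimator does in h()) avoids having to inspect the call at all.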
In this piece of code, list appears to be used to capture the functions created above it. I'm not entirely sure what it is good for or how the pieces fit together. Can someone clarify what this is doing?
makeCacheMatrix <- function(x = matrix()) {
  j <- NULL
  set <- function(y) {
    x <<- y
    j <<- NULL
  }
  get <- function() x
  setInverse <- function(inverse) j <<- inverse
  getInverse <- function() j
  list(set = set, get = get,  # list of functions
       setInverse = setInverse,
       getInverse = getInverse)
}
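The same pattern in its smallest form may make the role of the list clearer. The sketch below uses a hypothetical make_counter function (not part of the code above): the inner functions would be lost when the outer function returns, so they are handed back in a named list, and each one keeps access to the variables of the call that created it.

```r
# Hypothetical example of the same "list of closures" pattern
make_counter <- function() {
  count <- 0
  increment <- function() {
    count <<- count + 1  # modifies count in the enclosing environment
    invisible(count)
  }
  get <- function() count
  list(increment = increment, get = get)
}

counter <- make_counter()
counter$increment()
counter$increment()
counter$get()  # 2: the state survives between calls
```

In makeCacheMatrix the list plays the same role: x and j are the hidden state, and set, get, setInverse and getInverse are the only way to reach them from outside.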
Environments and the like have always confused me incredibly in R. I guess therefore this is more of a reference request, since I've been surfing the site for the last hour in search of an answer to no avail.
I have a simple R function called target defined as follows
target <- function(x,scale,shape){
s <- scale
b <- shape
value <- 0.5*(sin(s*x)^b + x + 1)
return(value)
}
I then define the function AR
AR <- function(n,f,...){
variates <- NULL
for(i in 1:n){
z <- runif(1)
u <- runif(1)
if(u < f(z, scale, shape)/c){
variates[i] <- z
}else{next}
}
variates <- variates[!is.na(variates)]
return(variates)
}
in which the function target is being evaluated. Unfortunately, the call returns the following error
sample <- AR(n = 10000, f = target, shape = 8, scale = 5)
Error in fun(z, scale, shape) : object 'shape' not found
I know this has to do with the function AR not knowing where to look for the objects shape and scale, but I thought that was exactly the job of the ellipsis: allowing me to sort of put argument definition "on hold" until one actually calls the function. Where am I wrong and could anyone give me a lead as to where to look for insight on this specific problem?
You are very close, you just need to make use of your ellipses...
NB: c was not defined in AR so I added it and gave it a value.
NB2: I would refrain from using c and sample as names in your code, as these are themselves functions and could cause some confusion down the road.
AR <- function(n, f, c, ...) {
  variates <- NULL
  for (i in 1:n) {
    z <- runif(1)
    u <- runif(1)
    # Instead of naming shape and scale, pass the ellipsis: R forwards any
    # arguments not matched by AR's own parameters (n, f, c) to f
    if (u < f(z, ...) / c) {
      variates[i] <- z
    } else {
      next
    }
  }
  variates <- variates[!is.na(variates)]
  return(variates)
}
sample <- AR(n = 10000, f = target, shape = 8, scale = 5, c = 100)
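To see what the ellipsis actually carries, here is a small sketch. forward_to is a hypothetical helper, and target is the function from the question written more compactly. Arguments that do not match the wrapper's own parameters are collected in ... and passed on by name, which is why scale and shape never need to exist as variables inside the wrapper.

```r
target <- function(x, scale, shape) {
  0.5 * (sin(scale * x)^shape + x + 1)
}

# Hypothetical helper: reports what arrived in ... and forwards it to f
forward_to <- function(f, z, ...) {
  extra <- list(...)
  message("forwarding: ", paste(names(extra), collapse = ", "))
  f(z, ...)
}

via_dots <- forward_to(target, 0.3, shape = 8, scale = 5)
direct <- target(0.3, scale = 5, shape = 8)
all.equal(via_dots, direct)  # TRUE
```

Writing f(z, scale, shape) inside the wrapper instead asks R to look up variables named scale and shape, which is what produced the "object not found" error.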
Consider a hypothetical example:
sim <- function(n,p){
x <- rbinom(n,1,p)
y <- (x==0) * rnorm(n)
z <- (x==1) * rnorm(n,5,2)
dat <- data.frame(x, y, z)
return(dat)
}
Now I want to write another function simfun that calls the above sim function and checks whether the y and z columns of the data frame are both less than or equal to a value k.
simfun <- function(n, p, k){
dat <- sim(n, p)
dat$threshold <- (dat$y<=k & dat$z<=k)
return(dat$threshold)
}
But is it standard to use the argument of sim as the argument of simfun? Can I write simfun <- function(k) and call the sim function inside simfun?
I'd say it's fairly standard to do this sort of thing in R. A few pointers to consider:
Usually you should explicitly declare the argument names so as not to create any unwanted behaviour if changes are made. I.e., instead of sim(n, p), write sim(n = n, p = p).
To get simfun() down to just a k argument will require default values for n and p. There are lots of ways to do this. One way would be to hardcode inside simfun itself. E.g.:
simfun <- function(k) {
dat <- sim(n = 100, p = c(.4, .6))
dat$threshold <- (dat$y<=k & dat$z<=k)
return(dat$threshold)
}
simfun(.5)
A more flexible way would be to add default values in the function declaration. When you do this, it's good practice to put variables with default values AFTER variables without default values. So k would come first, as follows:
simfun <- function(k, n = 100, p = c(.4, .6)){
dat <- sim(n, p)
dat$threshold <- (dat$y<=k & dat$z<=k)
return(dat$threshold)
}
simfun(.5)
The second option is generally preferable because you can still change n or p if you need to.
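As a rough illustration of that flexibility, the sketch below repeats sim and the default-argument version of simfun (with the arguments to sim named, as suggested earlier) and calls it once with the defaults and once with n and p overridden. The seed value is arbitrary.

```r
sim <- function(n, p) {
  x <- rbinom(n, 1, p)
  y <- (x == 0) * rnorm(n)
  z <- (x == 1) * rnorm(n, 5, 2)
  data.frame(x, y, z)
}

simfun <- function(k, n = 100, p = c(.4, .6)) {
  dat <- sim(n = n, p = p)
  dat$threshold <- (dat$y <= k & dat$z <= k)
  dat$threshold
}

set.seed(1)  # arbitrary seed, only for reproducibility
default_run <- simfun(.5)                # uses n = 100 and p = c(.4, .6)
custom_run <- simfun(.5, n = 10, p = .3) # overrides both defaults

length(default_run)  # 100
length(custom_run)   # 10
```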
While not great, you could also define n and p separately in the global environment; simfun then finds them there when it runs:
n <- 1
p <- .5
simfun <- function(k){
dat <- sim(n, p)
dat$threshold <- (dat$y<=k & dat$z<=k)
return(dat$threshold)
}
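The reason this is "not great" can be sketched with a tiny example (report_n and shadow are hypothetical names): the function looks n up in the environment where it was defined at the moment it is called, so its result silently changes whenever the global n changes, and a local n in the caller does not help.

```r
n <- 1
report_n <- function() n  # n is a free variable, found where report_n was defined

report_n()  # 1
n <- 5
report_n()  # 5: the lookup happens at call time

# A local n in the caller is not seen (lexical, not dynamic, scoping)
shadow <- function() {
  n <- 10
  report_n()
}
shadow()  # 5
```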
You can read more about R Environments here: http://adv-r.had.co.nz/Environments.html
I am following a Data Science course on Coursera and I have a question regarding one of the assignments, where I have to invert a matrix and then cache the result.
Basically I have been googling away and I found the answer but there are parts of the answer that I do not yet understand. For this reason I don't want to submit my assignment yet since I don't want to submit anything that I do not fully understand.
The part that I do not understand in the code below is where setInverse is defined. Where does the 'function(inverse) inv' come from? In particular, 'inverse' was never defined.
After this a list is returned, which does not make much sense to me either.
If someone could take the time to explain this function to me I would be very grateful!
makeCacheMatrix <- function(x = matrix()) {
inv <- NULL
set <- function(y) {
x <<- y
inv <<- NULL
}
get <- function() x
setInverse <- function(inverse) inv <<- inverse
getInverse <- function() inv
list(set = set,
get = get,
setInverse = setInverse,
getInverse = getInverse)
}
## Write a short comment describing this function
cacheSolve <- function(x, ...) {
## Return a matrix that is the inverse of 'x'
inv <- x$getInverse()
if (!is.null(inv)) {
message("getting cached data")
return(inv)
}
mat <- x$get()
inv <- solve(mat, ...)
x$setInverse(inv)
inv
}
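A short usage sketch may help with both points. It repeats the two functions from the question (cacheSolve slightly condensed) and assumes a small invertible matrix as input. Here inverse is simply the parameter name of the anonymous function stored in setInverse, so it takes whatever value the caller passes, and the list is what lets cacheSolve reach the four functions through x$...

```r
makeCacheMatrix <- function(x = matrix()) {
  inv <- NULL
  set <- function(y) {
    x <<- y
    inv <<- NULL
  }
  get <- function() x
  setInverse <- function(inverse) inv <<- inverse  # 'inverse' is a parameter
  getInverse <- function() inv
  list(set = set, get = get,
       setInverse = setInverse, getInverse = getInverse)
}

cacheSolve <- function(x, ...) {
  inv <- x$getInverse()
  if (!is.null(inv)) {
    message("getting cached data")
    return(inv)
  }
  inv <- solve(x$get(), ...)
  x$setInverse(inv)
  inv
}

cm <- makeCacheMatrix(matrix(c(2, 0, 0, 2), nrow = 2))
first <- cacheSolve(cm)   # computed with solve()
second <- cacheSolve(cm)  # prints "getting cached data"
identical(first, second)  # TRUE

cm$set(matrix(c(1, 2, 3, 4), nrow = 2))
is.null(cm$getInverse())  # TRUE: set() cleared the cache
```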
I don't know your exact assignment, but I would change your function slightly:
makeCacheMatrix <- function(x = matrix()) {
  inv <- NULL
  set <- function(y) {
    x <<- y
    inv <<- NULL
  }
  get <- function() x
  setInverse <- function() inv <<- solve(x)  # calculate the inverse
  getInverse <- function() inv
  list(set = set,
       get = get,
       setInverse = setInverse,
       getInverse = getInverse)
}
You can then use it like this:
funs <- makeCacheMatrix()
funs$set(matrix(1:4, 2))
funs$get()
#      [,1] [,2]
# [1,]    1    3
# [2,]    2    4
funs$setInverse()
funs$getInverse()
#      [,1] [,2]
# [1,]   -2  1.5
# [2,]    1 -0.5
The exercise is probably intended to teach you closures. The point is that x and inv are stored in the enclosing environment of the set, get, setInverse, getInverse functions. That means the environment within which they were defined, i.e., the environment created by the makeCacheMatrix() call. See this:
ls(environment(funs$set))
#[1] "get" "getInverse" "inv" "set" "setInverse" "x"
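As you can see, this environment holds not only the four functions but also the x and inv objects: x is the argument of makeCacheMatrix and inv is created by inv <- NULL. The <<- operator in set and setInverse then modifies these existing bindings rather than creating local variables, and the get and getInverse functions simply fetch them from their enclosing environment.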
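The shared environment can also be inspected directly. The sketch below repeats the modified makeCacheMatrix from this answer and looks at inv through the environment before and after setInverse() is called. The variable name env is only for illustration.

```r
makeCacheMatrix <- function(x = matrix()) {
  inv <- NULL
  set <- function(y) {
    x <<- y
    inv <<- NULL
  }
  get <- function() x
  setInverse <- function() inv <<- solve(x)
  getInverse <- function() inv
  list(set = set, get = get,
       setInverse = setInverse, getInverse = getInverse)
}

funs <- makeCacheMatrix(matrix(1:4, 2))
env <- environment(funs$set)

# All four functions are enclosed by the same environment
identical(env, environment(funs$getInverse))  # TRUE

is.null(env$inv)    # TRUE: nothing cached yet
funs$setInverse()   # <<- updates inv in env
env$inv             # now holds the inverse of x
identical(env$inv, funs$getInverse())  # TRUE
```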