I'd like to use the censored boxplot in the R package NADA but I want to reorder the X-axis.
library(NADA)
data(Golden)
#this should reorder the factor and change the x-axis but does not
Golden$DosageGroup <-factor(Golden$DosageGroup, levels=c("Low","High"))
cenboxplot(Golden$Blood, Golden$BloodCen, Golden$DosageGroup)
The help says the output is the default boxplot method but I cannot seem to get it to work.
PS - similar to this post but no answers were given
Modifying a function in a loaded existing package consists of several steps:
Get the code and store its environment:
cenboxplot # just typing the name of a function should bring up its code.
# this appears
function (obs, cen, group, log = TRUE, range = 0, ...)
{
    if (log)
        log = "y"
    else log = ""
    if (missing(group))
        ret = boxplot(cenros(obs, cen), log = log, range = range,
            ...)
    else {
        modeled = numeric()
        groups = character()
        for (i in levels(as.factor(group))) {
            mod = suppressWarnings(cenros(obs[group == i],
                cen[group == i])$modeled)
            grp = rep(i, length(mod))
            modeled = c(modeled, mod)
            groups = c(groups, grp)
        }
        # problem with levels of the `groups` object
        boxplot(modeled ~ as.factor(groups), log = log, range = range,
            ...)
        ret = data.frame(ros.model = modeled, group = groups)
    }
    abline(h = max(obs[cen]))
    invisible(ret)
}
<environment: 0x55a8acb2d708>
cbp_env <- environment(cenboxplot)
Figure out where the function is deficient and make a copy that is ready to be fixed:
It's because the constructed groups object doesn't inherit the levels from the group argument. When looking at the code, I select the console output starting with the function name and ending just before the <environment: ...> designation, and paste it back into the console. I then put an assignment arrow (<-) right after the function name.
Modify the code before hitting Enter (or copy it to an editor if it's going to require major surgery). After the code is modified, assign the new value to the existing name (or to a new name, at your discretion).
The modification that succeeds: put this line just below the curly brace that ends the for loop. (It's also just before the boxplot call.)
groups=factor(groups, levels=levels(group)) # adhere to user's intent
Assign the same environment to the new version that the old version had:
environment(cenboxplot) <- cbp_env # which was stored above.
Now running your code yields the censored boxplot with the x-axis ordered as requested: Low, then High.
There are other options besides the fname <- old_fname; environment(fname) <- environment(old_fname) strategy. There is reassignInPackage in the R.utils package. And apparently you can do: environment(cenboxplot) <- asNamespace('NADA')
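To make those concrete, here is a hedged sketch of both alternatives; the argument names for reassignInPackage are taken from my reading of the R.utils documentation, so treat them as an assumption rather than verified usage:
# Option 1 (sketch, untested): replace the function inside the NADA namespace itself
R.utils::reassignInPackage("cenboxplot", pkgName = "NADA", cenboxplot)
# Option 2: keep the patched copy in the global environment, but point its
# environment at the package namespace so it can still see NADA internals
environment(cenboxplot) <- asNamespace('NADA')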
Does anyone have an idea (or a solution) for the following: if you define a View() method for an S4 class, intending to show a certain slot in the RStudio data.frame viewer, it just won't display it as expected.
What I mean is: if you start a fresh R session, then e.g.
View(mtcars)
will work as expected and deliver the usual data.frame viewer (including filters etc.).
But then if you define some S4 class eg
foo <- setClass("foo", slots = c(df = "data.frame"))
myfoo <- new('foo', df = mtcars)
and then a View method
setMethod("View", "foo", function(x, title) View(x@df, title))
You will suddenly face one of these (two differing) messages:
Creating a generic function for 'View' from package 'stats' in the global environment
OR this one
Creating a generic function for 'View' from package 'utils' in the global environment
Which imo is already "disturbing", since afaik the stats package does not even seem to have a View function.
But let's continue and get to what I really wonder about: why do I not get the same result in RStudio's data.frame viewer, but some form of list view instead?
Can this somehow be avoided and made to produce the expected, i.e. the same type of "View" as for the standalone data.frame?
And to top it all off, once you do that within an R package, you end up with this (final pic below), where in the background you see the View(mtcars) output from before devtools::load_all() was run, and in the foreground you see the result of View(mtcars) after load_all() picked up all methods/functions in the package. Is that an RStudio bug, or am I doing something wrong here?
Here is a (not-so-nice) workaround until a better solution comes along.
foo <- setClass("foo", slots = c(df = "data.frame"))
myfoo <- new('foo', df = mtcars)
old <- View  # keep a reference to the current View
View <- function(...) {
    if (isS4(...)) {
        # extract the df slot, format it, and pass it on to the saved View
        unclass(...)@df |> format.data.frame() |> old()
    } else {
        old(...)
    }
}
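A quick way to exercise the workaround (using foo and myfoo as defined above):
View(mtcars)  # not S4: falls through to the else branch, the normal viewer
View(myfoo)   # S4: the branch that formats the df slot before viewing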
The viewer with the red lines (in the screenshot) is the default utils::View:
function (x, title)
{
    check <- Sys.getenv("_R_CHECK_SCREEN_DEVICE_", "")
    msg <- "View() should not be used in examples etc"
    if (identical(check, "stop"))
        stop(msg, domain = NA)
    else if (identical(check, "warn"))
        warning(msg, immediate. = TRUE, noBreaks. = TRUE, domain = NA)
    if (missing(title))
        title <- paste("Data:", deparse(substitute(x))[1])
    x0 <- as.data.frame(x)
    x <- as.list(format.data.frame(x0))
    rn <- row.names(x0)
    if (any(rn != seq_along(rn)))
        x <- c(list(row.names = rn), x)
    if (!is.list(x) || !length(x) || !all(sapply(x, is.atomic)) ||
        !max(lengths(x)))
        stop("invalid 'x' argument")
    if (grepl("darwin", R.version$os))
        check_for_XQuartz()
    invisible(.External2(C_dataviewer, x, title))
}
RStudio modifies View somewhere[1], maybe here, to a more user-friendly interface, via
function (...)
    .rs.callAs(name, hook, original, ...)
where original is utils::View (see e <- environment(View); e$original).
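Based on that environment trick, you can poke at the shim from the console (inside RStudio only; a sketch, since the shim's internals are undocumented):
e <- environment(View)  # the shim's enclosing environment
e$original              # should print the source of utils::View shown above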
[1] I have not figured out (yet) where exactly.
I have a data set where I want to calculate the 6-month return of stocks with tq_get (see the example below)
Dataset called top
ticker 6month
AKO.A
BIG
BGFV
Function
library(tidyverse)
library(dplyr)
library(tidyquant)
library(riingo)
calculate <- function (x) {
    (tq_get(x, get = "tiingo", from = yesterday, to = yesterday)$adjusted /
        tq_get(x, get = "tiingo", from = before, to = before)$adjusted) - 1
}
top[2] <- lapply(top[1], function(x) calculate(x))
Unfortunately, for some of the tickers no value exists, which results in an error message when simply using lapply or mutate, since the resulting vector has fewer rows than the existing dataset. Resolving it with tryCatch did not work.
I now wanted to apply a workaround: checking whether each ticker is available with is_supported_ticker(), provided by the riingo package:
calculate <- function (x) {
    if (is_supported_ticker(x, type = "tiingo") == TRUE) {
        (tq_get(x, get = "tiingo", from = yesterday, to = yesterday)$adjusted /
            tq_get(x, get = "tiingo", from = before, to = before)$adjusted) - 1
    } else {
        NA
    }
}
top[2] <- lapply(top[1], function(x) calculate(x))
But now I receive the error message: ticker must be length 1, but is actually length 3.
I assume this is based on the fact that the whole first column of my dataset is used as input for is_supported_ticker() instead of row by row. How can I resolve this issue?
Glancing at the documentation, it looks like tq_get supports multiple symbols; only is_supported_ticker goes one at a time. So you should probably check all the tickers to see if they are supported, and then use tq_get once on all the supported ones. Something like this (untested, as I don't have any of these packages):
calculate <- function (x) {
    supported = sapply(x, is_supported_ticker, type = "tiingo")
    result = rep(NA, length(x))
    result[supported] =
        (
            tq_get(x[supported], get = "tiingo", from = yesterday, to = yesterday)$adjusted /
            tq_get(x[supported], get = "tiingo", from = before, to = before)$adjusted
        ) - 1
    return(result)
}
It worries me that before and yesterday aren't function arguments - they're just assumed to be there in the global environment. I'd suggest passing them in as arguments to calculate(), like this:
calculate <- function (x, before, yesterday) {
    supported = sapply(x, is_supported_ticker, type = "tiingo")
    result = rep(NA, length(x))
    result[supported] =
        (
            tq_get(x[supported], get = "tiingo", from = yesterday, to = yesterday)$adjusted /
            tq_get(x[supported], get = "tiingo", from = before, to = before)$adjusted
        ) - 1
    return(result)
}
# then calling it
calculate(top$ticker, before = <...>, yesterday = <...>)
This way you can pass values in for before and yesterday on the fly. If they are objects in your global environment, you can simply use calculate(top$ticker, before, yesterday), but it gives you freedom to vary those arguments without redefining those names in your global environment.
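For instance, if the two dates should be yesterday and six months before that, a base-R sketch could look like the following; note this assumes Tiingo actually has price bars on those exact dates, which weekends and holidays can break:
yesterday <- Sys.Date() - 1
before <- seq(yesterday, length.out = 2, by = "-6 months")[2]
top$`6month` <- calculate(top$ticker, before = before, yesterday = yesterday)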
This question differs from my original; it adheres more closely to a minimal reproducible example and incorporates a recommendation by be_green against silently loading entire libraries within the context of a function.
The outer function starts by defining a number of cases, default values, and a list of any case exceptions. The inner function assembles each case by using the default values in a computation unless exceptions are defined. Finally, the outer function assembles these cases into a data frame.
Here is the function:
outerfun <- function(cases, var_default, exceptions = list()) {
    # Inner Function to create a case
    innerfun <- function(var = var_default) { # Case
        result = var
        return(result)
    }
    # Combine Cases
    datlist <- list()
    for (case in 1:cases) {
        datlist[[paste0("X", case)]] <- do.call(innerfun, as.list(exceptions[[paste0("X", case)]]))
    }
    casedata <- do.call(dplyr::data_frame, datlist)
    return(casedata)
}
This function works fine when I define values for the inner function as exceptions:
data <- outerfun(cases = 3, var_default = 10, exceptions = list("X2" = c(var = 14)))
But not when I mix the two:
data <- outerfun(cases = 3, var_default = 10,
                 exceptions = list("X2" = c(var = var_default + 4)))
Being able to mix the two is important, since it makes the function more intuitive and easier to program for a variety of cases.
I think the problem might result from using do.call and have seen other threads detailing this issue (having to do with environments and frames), but I haven't been able to find an optimal solution for me. I like do.call since I can pass a list of arguments into a function. I could turn the inner function into a list (think: function(...) { }) but then I would have to define every variable instead of relying on the default.
Any help or suggestions you might have would be great.
The problem is that lvl_default is not defined outside the context of the function, and yet you call it as an input to a parameter. Because there is no variable called lvl_default in the global environment, when the function tries to evaluate the parameter exceptions = list(X3 = c(lvl = lvl_default + 10)), it fails to find a variable to evaluate. You are not able to specify parameters by setting them equal to the names of other unevaluated parameters.
Instead, what I would recommend doing is setting a variable outside the function associated with the value you were hoping to pass into lvl_default and then pass it into the function like so:
level <- 1000
data <- genCaseData(n_signals = 3, datestart = "2017-07-01T15:00:00",
                    n_cycles = 4, period_default = 10, phase_default = 0,
                    ampl_default = 15, lvl_default = level,
                    exceptions = list(X1 = c(lvl = 980),
                                      X3 = c(lvl = level + 10)))
Also, as I noted in a comment, I would recommend against silently loading entire libraries within the context of a function. You can end up masking things you didn't mean to, and running into strange errors, because require doesn't actually throw an error if a library is unavailable. Instead I would reference the functions through pkgname::fncname.
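For illustration, the difference is just this (dplyr::data_frame chosen only because it already appears in this thread):
# inside a function body, avoid:
#   require(dplyr)
#   data_frame(x = 1)
# and reference the function explicitly instead:
dplyr::data_frame(x = 1)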
be_green did solve this first, but I wanted to follow-up with what I actually did for my project.
As be_green pointed out, I couldn't call var_default within the exception list since it hadn't yet been defined. I didn't understand this at first since you can actually define the default of an argument to a variable defined within the function itself:
addfun <- function(x, y = z + x + 2) {
    z = 20
    c(x, y)
}
addfun(x = 20)
[1] 20 42
This is because function arguments in R are lazily evaluated. I thought this gave me a pass to call the function like this:
addfun(x = 10, y = x + z)
Error in addfun(x = 10, y = x + z) : object 'x' not found
If you remove x, it throws an error for z instead. So even though the default for y depends on x and z, you can't refer to x or z when calling the function.
be_green suggested that I pass arguments in a string and then parse it within the function. But I was afraid that others on my team would find the resulting syntax confusing.
Instead, I used the ellipsis (...) and evaluated the ellipsis arguments within my function, using this line of code:
list2env(eval(substitute(alist(...))), envir = as.environment(-1))
The eval(substitute(alist(...))) pattern is common; it results in a named list of the still-unevaluated arguments. For my purposes it was more convenient to have the arguments available as objects within the function, which list2env(x, envir = as.environment(-1)) accomplishes. One additional step remains: once an argument is used, you need to evaluate the captured call explicitly. So if I wanted to change my addfun() above:
addfun <- function(x, ...) {
    z = 20
    list2env(eval(substitute(alist(...))), envir = as.environment(-1))
    c(x, eval(y))
}
addfun(x = 10, y = x + z)
This is a trite example: I now need to define y even though it's not an argument in the function. But now I can even re-define z within the function call:
addfun(x = 10, y = z + 2, z = 10)
This is all possible because of non-standard evaluation. There can be trade-offs, but in my application it increased the usability and flexibility of the function while making it more intuitive to use.
Final code:
outerfun <- function(caseIDs, var_default, ...) {
    list2env(eval(substitute(alist(...))), envir = as.environment(-1))
    # Inner Function to create a case
    innerfun <- function(var = var_default) { # Case
        result = var
        return(result)
    }
    # Combine Cases
    datlist <- lapply(caseIDs, function(case) {
        do.call(innerfun, eval(get0(case, ifnotfound = list())))
    })
    names(datlist) <- caseIDs
    casedata <- do.call(dplyr::data_frame, datlist)
    return(casedata)
}
Now both examples work with full functionality:
data <- outerfun(caseIDs = c("X1","X2","X3"), var_default = 10,
X2 = list(var = 14))
data <- outerfun(caseIDs = c("X1","X2","X3"), var_default = 10,
X2 = list(var = var_default + 4))
I hope this helps someone else! Enjoy!
I am trying to analyze how varying starting conditions and variable values in a set of differential equations (that describe the progression of a disease through a population) influences the dynamics of the system (as seen via graph). I have written the code and it works perfectly well, but forces me to change a value and then re-run the whole code.
I am therefore trying to put this code inside of manipulate() so I can manipulate the variables and immediately see the effect on the produced graph:
library(deSolve)
library(manipulate)
xyz <- function(time, state, parameters) {
    with(as.list(c(state, parameters)), {
        dt <- 1
        dv <- v0*v1*cos(2*pi*t/33)
        dX <- (v*N - B*X*Y - (mu + N/K)*X)
        dY <- (B*X*Y - (mu + m + g + N/K)*Y)
        dZ <- (g*Y - (mu + N/K)*Z)
        dN <- (v - mu - N/K)*N - m*Y
        return(list(c(dt, dv, dX, dY, dZ, dN)))
    })
}
times <- seq(0,365*3,by = 1)
init <- c(t=0,v=0.02,X=995,Y=5,Z=0,N=1000)
parameters <- c(B=0.14,mu=.01,m=.075,g=.025,K=10000,v0=.02,v1=.5)
manipulate(
    out <- as.data.frame(ode(y = init, times = seq(0, 365*3, by = 1), func = xyz, parms = parameters)),
    matplot(times, out[4:6], type="l", xlab="Time", ylab="Susceptibles and Recovereds",
            main="SIR Model", lwd=1, lty=1, bty="l", col=2:4),
    B = slider(0, 1, initial = 0.14, step = 0.01)
)
I keep getting error messages regardless if I have all or part of the code inside manipulate(), define variables outside or inside of it, or anything else. Any help would be greatly appreciated!
When I first ran your code, the error I encountered was:
Error in manipulate(out <- as.data.frame(ode(y = init, times = seq(0, :
all controls passed to manipulate must be named
This error occurred because the second argument to manipulate was the matplot() command rather than a named control argument (such as a slider). So I placed the first two lines within curly braces to make them a single expression:
manipulate({
    out <- as.data.frame(ode(y = init, times = seq(0, 365*3, by = 1), func = xyz, parms = parameters))
    matplot(times, out[4:6], type="l", xlab="Time", ylab="Susceptibles and Recovereds",
            main="SIR Model", lwd=1, lty=1, bty="l", col=2:4)
}, B = slider(0, 1, initial = 0.14, step = 0.01))
This eliminates the error, but moving the slider doesn't do anything to the plot. Why? Because the slider named B doesn't refer to anything within the expression passed to manipulate(). I solved that by moving the parameters <- ... line into the manipulate expression and then changing that line so that there was a variable B (not just a name in the list); in other words, we need B=B instead of B=0.14. Now the plot changes when you move the slider, which I believe is what you wanted:
manipulate({
    parameters <- c(B=B, mu=.01, m=.075, g=.025, K=10000, v0=.02, v1=.5)
    out <- as.data.frame(ode(y = init, times = seq(0, 365*3, by = 1), func = xyz, parms = parameters))
    matplot(times, out[4:6], type="l", xlab="Time", ylab="Susceptibles and Recovereds",
            main="SIR Model", lwd=1, lty=1, bty="l", col=2:4)
}, B = slider(0, 1, initial = 0.14, step = 0.01))
An equivalent, somewhat tidier variant is to wrap everything in a plotting function and hand manipulate() just that function plus the slider:
times <- seq(0, 365*3, by = 1)
init <- c(t = 0, v = 0.02, X = 995, Y = 5, Z = 0, N = 1000)
plot.ode <- function(B.param) {
    parameters <- c(B=B.param, mu=.01, m=.075, g=.025, K=10000, v0=.02, v1=.5)
    out <- as.data.frame(ode(y = init, times = seq(0, 365*3, by = 1), func = xyz, parms = parameters))
    matplot(times, out[4:6], type="l", xlab="Time", ylab="Susceptibles and Recovereds",
            main="SIR Model", lwd=1, lty=1, bty="l", col=2:4, ylim=c(0,200))
}
manipulate(plot.ode(B), B = slider(0, 1, initial = 0.14, step = 0.01))
Seems a little odd that only the red curve is influenced by changing B.
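As a side note, manipulate() accepts several named controls, so the same pattern extends to the other parameters if you want to explore that; a hypothetical sketch (plot.ode2 is a made-up name) adding a slider for the recovery rate g:
plot.ode2 <- function(B.param, g.param) {
    parameters <- c(B=B.param, mu=.01, m=.075, g=g.param, K=10000, v0=.02, v1=.5)
    out <- as.data.frame(ode(y = init, times = times, func = xyz, parms = parameters))
    matplot(times, out[4:6], type="l", xlab="Time", ylab="Susceptibles and Recovereds",
            main="SIR Model", lwd=1, lty=1, bty="l", col=2:4, ylim=c(0,200))
}
manipulate(plot.ode2(B, g),
           B = slider(0, 1, initial = 0.14, step = 0.01),
           g = slider(0, 0.1, initial = 0.025, step = 0.005))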
I'd like to write a function that will create and return a set of parameters to be used in a function mySimulation I've created. Until now, I've basically been doing, e.g., mySimulation(parm1 = 3, parm2 = 4). This is now suboptimal because (1) in the actual version, the number of parameters is becoming unwieldy and (2) I'd like to keep track of different combinations of the parameters that produce the different models I'm using. So, I wrote createParms (a minimally sufficient version shown below) to do the trick. My whole approach just seems so clunky though. With all the statisticians using R, I'm sure there's a more standard way of handling my issue...right?
createParms <- function(model = "default", ...) {
    # Returns a list `parms` of parameters which will then be used in
    # mySimulation(parms)
    #
    # Args:
    #   model: ["default" | "mymodel"] character string representation of a model
    #          with known parameters
    #   ...:   parameters of the existing `model` to overwrite.
    #          If nothing is supplied, the model parameters will be left as is.
    #          Passed variables must be named,
    #          e.g., `parm1 = 10, parm2 = 20` is good; `10, 20` is bad.
    #
    # Returns:
    #   parms: a list of parameters to be used in mySimulation(parms)
    #
    parms.names <- c("parm1", "parm2")
    parms <- vector(mode = "list", length = length(parms.names))
    names(parms) <- parms.names
    overwrite <- list(...)
    overwrite.names <- names(overwrite)
    if (model == "default") {
        parms$parm1 <- 0
        parms$parm2 <- 0
    } else if (model == "mymodel") {
        parms$parm1 <- 1
        parms$parm2 <- 2
    }
    if (length(overwrite) != 0) {
        parms[overwrite.names] <- overwrite
    }
    return(parms)
}
I think if you know the combination of parameters to be used for each model, then it is better to create a data frame of model names and parameters, as shown below:
# create a data frame with model names and parameters
# NOTE: i am assuming all models have equal number of parameters
# if they are unequal, then store as list of models
model = c('default', 'mymodel');
parm1 = c(0.5, 0.75);
parm2 = c(1, 2);
models.df = data.frame(model, parm1, parm2)
You can now simulate any of the models by passing it as an argument to your mySimulation function. I have used a dummy simulation example, which you can replace with your code.
# function to run simulation based on model name
mySimulation = function(model = 'default') {
    # find row corresponding to model of interest
    mod.row = match(model, models.df$model)
    # extract parameters corresponding to model
    parms = models.df[mod.row, -1]
    # run dummy simulation of drawing normal random variables
    sim.df = rnorm(100, mean = parms[,1], sd = parms[,2])
    return(sim.df)
}
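For a single model, the call is just (the parameter values follow from models.df above):
sim = mySimulation('mymodel')  # 100 draws from a normal with mean 0.75 and sd 2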
If you now want to run all your simulations in one step, you can use the excellent plyr package and invoke
library(plyr)
sim.all = ldply(models.df$model, mySimulation)
If each of your simulations returns an unequal number of values, then you can use llply instead of ldply.
If you provide more information about the return values of your simulation and details on what it does, this code can be easily tweaked to get what you want.
Let me know if this works
If the simulation function always takes the same set of arguments, then Ramnath's approach of storing them in a data frame is best. For the more general case of variable inputs to mySimulation, you should store each set of inputs in a list – probably using a list of lists for running several simulations.
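A minimal sketch of that list-of-lists idea (the names here are illustrative, and it assumes mySimulation accepts the parameters as named arguments):
param_sets <- list(
    default = list(parm1 = 0, parm2 = 0),
    mymodel = list(parm1 = 1, parm2 = 2)  # each set could hold a different collection of arguments
)
results <- lapply(param_sets, function(p) do.call(mySimulation, p))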
The idea behind your createParms function looks sound; you can simplify the code a little bit.
createParms <- function(model = "default", ...)
{
    # default case
    parms <- list(
        parm1 = 0,
        parm2 = 0
    )
    # other special cases
    if (model == "mymodel") {
        parms <- within(parms, {
            parm1 <- 1
            parm2 <- 2
        })
    }
    # overwrite from ...
    dots <- list(...)
    parms[names(dots)] <- dots
    parms
}
Test this with, e.g.,
createParms()
createParms("mymodel")
createParms("mymodel", parm2 = 3)
do.call may come in handy for running your simulation, as in
do.call(mySimulation, createParms())
EDIT: What do.call does for you
If you have parms <- createParms(), then
do.call(mySimulation, parms)
is the same as
with(parms, mySimulation(parm1, parm2))
The main advantage is that you don't need to spell out each parameter that you are passing into mySimulation (or to modify that function to accept the parameters in list form).