sapply() misbehaving in R

I'm trying to have R substitute each element of c(1/2, 1, sqrt(2)/2) for the rscale argument using sapply(). Why do I get the same answer three times when I should get three different answers?
ttype = 1
t = -.742
N1 = 102
N2 = ifelse(ttype==1, NA, 102)
rscale = sqrt(2)/2
tl = 1
dexp = -1
library(BayesFactor)
Gi1 <- ttest.tstat(t, N1, ifelse(ttype==1, F, N2),nullInterval =
c(ifelse(dexp==-1, -Inf, Inf), ifelse(tl==1, 0, Inf)),rscale = rscale, simple = TRUE)
UrUr <- sapply(c(1/2, 1, sqrt(2)/2 ), function(rscale) Gi1 )## HERE I get 3 same answers!

As @HubertL said, Gi1 is a number, not a function: it was computed once, before sapply() ran, so the anonymous function returns that same stored value for every element. You need to write a function that takes a parameter and calls ttest.tstat with it, passing the new variable as the "rscale" argument. For example,
library(BayesFactor)
Gi1 <- function(x) {
ttest.tstat(t, N1, ifelse(ttype==1, F, N2),
nullInterval = c(ifelse(dexp==-1, -Inf, Inf),
ifelse(tl==1, 0, Inf)),rscale = x, simple = TRUE) }
UrUr <- sapply(c(1/2, 1, sqrt(2)/2 ), Gi1)
UrUr
And you should get three different answers.
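The difference is easier to see without BayesFactor. Here is a minimal base-R sketch of the same mistake and its fix; precomputed, double_it and the numbers are made up for illustration and are not part of the original code:

```r
# A value computed once, before sapply() runs.
rscale <- 10
precomputed <- rscale * 2   # 20, fixed from here on

# The anonymous function ignores its argument and returns the stored value.
same <- sapply(c(1, 2, 3), function(rscale) precomputed)

# A function that actually uses its argument is re-evaluated for each element.
double_it <- function(x) x * 2
different <- sapply(c(1, 2, 3), double_it)

print(same)       # 20 20 20
print(different)  # 2 4 6
```

Naming the anonymous function's argument rscale does not help, because the stored value no longer depends on anything called rscale.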

Agreed with the previous answers. You can also try using sapply like this:
sapply(c(1/2, 1, sqrt(2)/2), function(x) ttest.tstat(t, N1, ifelse(ttype==1, F, N2),nullInterval = c(ifelse(dexp==-1, -Inf, Inf), ifelse(tl==1, 0, Inf)),rscale = x, simple = TRUE))
sapply will then cycle through your vector, using the parameter "x" as a placeholder for each element of c(1/2, 1, sqrt(2)/2).

Related

Prp plot - Coloring positive and negative values differently

I am fitting regression trees via the function rpart(). Given my data, I am going to have both positive and negative estimates in nodes. Is there a way to color them differently?
In particular, what I would like to have is a tree whose nodes are shaded in blue for negative values and in red for positive values, where darker colors signal stronger absolute values.
I attach a minimal reproducible example.
library(rpart)
library(rpart.plot)
# Simulating data.
set.seed(1986)
X = matrix(rnorm(2000, 0, 1), nrow = 1000, ncol = 2)
epsilon = matrix(rnorm(1000, 0, 0.01), nrow = 1000)
y = X[, 1] + X[, 2] + epsilon
dta = data.frame(X, y)
# Fitting regression tree.
my.tree = rpart(y ~ X1 + X2, data = dta, method = "anova", maxdepth = 3)
# Plotting.
prp(my.tree,
type = 2,
clip.right.labs = FALSE,
extra = 101,
under = FALSE,
under.cex = 1,
fallen.leaves = TRUE,
box.palette = "BuRd",
branch = 1,
round = 0,
leaf.round = 0,
prefix = "" ,
main = "",
cex.main = 1.5,
branch.col = "gray",
branch.lwd = 3)
# Repeating, with median(y) != 0.
X = matrix(rnorm(2000, 5, 1), nrow = 1000, ncol = 2)
epsilon = matrix(rnorm(1000, 0, 0.01), nrow = 1000)
y = X[, 1] + X[, 2] + epsilon
dta = data.frame(X, y)
my.tree = rpart(y ~ X1 + X2, data = dta, method = "anova", maxdepth = 3)
# HERE I NEED HELP!
prp(my.tree,
type = 2,
clip.right.labs = FALSE,
extra = 101,
under = FALSE,
under.cex = 1,
fallen.leaves = TRUE,
box.palette = "BuRd",
branch = 1,
round = 0,
leaf.round = 0,
prefix = "" ,
main = "",
cex.main = 1.5,
branch.col = "gray",
branch.lwd = 3)
As far as I understand, the box.palette option gave me the result I need in the first setting only because median(y) is close to zero.
In the second setting it does not: I get blue shades for values below median(y) and red shades for values above it. How can I impose zero as the threshold between the two colors?
To be more specific, I would like a command that automatically ensures the two-colors system in any tree.
OK, I answered my own question. The solution is actually quite simple: if the box.palette option is a two-color diverging palette (as in my example), we can use pal.thresh to set the threshold we want. In my case:
prp(my.tree,
type = 2,
clip.right.labs = FALSE,
extra = 101,
under = FALSE,
under.cex = 1,
fallen.leaves = TRUE,
box.palette = "BuRd",
branch = 1,
round = 0,
leaf.round = 0,
prefix = "" ,
main = "",
cex.main = 1.5,
branch.col = "gray",
branch.lwd = 3,
pal.thresh = 0) # HERE THE SOLUTION!
Even if this is probably bad for me, I will leave here the answer for future users and close the question, rather than deleting it.

How to make `integrate()` accept a vector in an R function?

I am wondering how I could make my function Bpp accept a vector for its first argument t.
Bpp = function(t, n1, n2 = NULL){
N = ifelse(is.null(n2), n1, n1*n2/(n1+n2))
df = ifelse(is.null(n2), n1 - 1, n1 + n2 - 2)
H1 = integrate(function(delta)dcauchy(delta, 0, sqrt(2)/2)*dt(t, df, delta*sqrt(N)), -Inf, Inf)[[1]]
H0 = dt(t, df)
BF10 = H1/H0
p.value = 2*(1-pt(abs(t), df))
list(BF10 = BF10, p.value = p.value)
}
Bpp(t = -6:6, 20, 20) ## This will give error because `t` is now a vector?
Here is a quick answer, written without testing. Use the following in place of the H1 line in your Bpp:
# joint density
joint <- function(delta, t) dcauchy(delta, 0, sqrt(2)/2) * dt(t, df, delta*sqrt(N))
# marginal density of `t`
marginal.t <- function (t) integrate(joint, lower = -Inf, upper = Inf, t = t)[[1]]
H1 <- sapply(t, marginal.t)
So we could also use Vectorize here; what would that look like?
Use your original Bpp:
Bpp <- Vectorize(Bpp, vectorize.args = "t")
Bpp(-6:6, 20, 20)
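For reference, here is a sketch of how the sapply() version might look once assembled into a complete function. The name Bpp_vec and the test values c(-2, 0, 2) are assumptions made for illustration; the formulas are the ones from the question. The key point is that integrate() is called once per element of t, so the integrand only ever sees a scalar t.

```r
# Sketch: a Bpp variant that accepts a vector t (Bpp_vec is a hypothetical name).
Bpp_vec <- function(t, n1, n2 = NULL) {
  # n2 is a single value or NULL, so plain if/else is enough here.
  N  <- if (is.null(n2)) n1 else n1 * n2 / (n1 + n2)
  df <- if (is.null(n2)) n1 - 1 else n1 + n2 - 2

  # Joint density of delta and t; integrate() vectorises over delta only.
  joint <- function(delta, t) {
    dcauchy(delta, 0, sqrt(2) / 2) * dt(t, df, delta * sqrt(N))
  }
  # Marginal density of a single t value.
  marginal.t <- function(t) {
    integrate(joint, lower = -Inf, upper = Inf, t = t)[[1]]
  }

  H1 <- sapply(t, marginal.t)   # one integral per element of t
  H0 <- dt(t, df)               # dt() is already vectorised
  list(BF10 = H1 / H0, p.value = 2 * (1 - pt(abs(t), df)))
}

res <- Bpp_vec(c(-2, 0, 2), 20, 20)
print(res)
```

Unlike the Vectorize() wrapper, this returns a single list holding two vectors, which may be easier to work with afterwards.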

Having a function only output the numeric answer

I was wondering if there is a way to have Gi1, and subsequently UrUr, in the R code below return only the numeric answer and not the extra label B10 that currently comes with it.
Note: I'm aware of using [[ith output]], but I need the function itself to return a plain number from the beginning so that I don't need many [[]] later in my code.
library(BayesFactor)
ttype = 1
t = -.742
N1 = 102
N2 = ifelse(ttype==1, NA, 102)
rscale = sqrt(2)/2
tl = 1
dexp = -1
Gi1 <- function(x) {
ttest.tstat(t, N1, ifelse(ttype==1, F, N2),
nullInterval = c(ifelse(dexp==-1, -Inf, Inf),
ifelse(tl==1, 0, Inf)),rscale = x, simple = TRUE) }
UrUr <- sapply(c(1/2, 1, sqrt(2)/2 ), Gi1)
UrUr
Just use unname within the Gi1:
Gi1 <- function(x) {
unname(ttest.tstat(t, N1, ifelse(ttype==1, F, N2),
nullInterval = c(ifelse(dexp==-1, -Inf, Inf),
ifelse(tl==1, 0, Inf)),rscale = x, simple = TRUE)) }
And then it will be fine.
> UrUr <- sapply(c(1/2, 1, sqrt(2)/2 ), Gi1)
> UrUr
[1] 0.3015327 0.1579931 0.2197861
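The label is just a names attribute on the returned vector, which sapply() carries through. A small base-R sketch of the effect, where labelled() is a hypothetical stand-in for the named value that ttest.tstat returns in the question:

```r
# Hypothetical stand-in: returns a length-one vector labelled "B10".
labelled <- function(x) c(B10 = x / 2)

# sapply() keeps the inner names, so every element is labelled "B10".
with_names <- sapply(c(2, 4, 6), labelled)

# Dropping the names inside the function gives a plain numeric vector.
without_names <- sapply(c(2, 4, 6), function(x) unname(labelled(x)))

# Alternative: strip the names once, after the loop.
stripped <- unname(with_names)

print(names(with_names))   # "B10" "B10" "B10"
print(without_names)       # 1 2 3
```

Either placement of unname() works; putting it inside the function matches the requirement that the function itself returns a bare number.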

R: extract parameter estimates from object of class 'mle'

I was wondering how one extracts the estimated parameters stored in an R object of class mle-class.
Here is an example:
x <- matrix(rnorm(300), ncol = 3)
x[x > 1] <- 1
require(tmvtnorm)
fit1 <- mle.tmvnorm(X = x, lower = rep(-Inf, 3), upper = rep(1, 3))
Now, fit1 is an object of class:
class(fit1)
[1] "mle"
attr(,"package")
[1] "stats4"
fit1 itself gives me:
fit1
Call:
mle(minuslogl = function (mu_1 = 0, mu_2 = 0, mu_3 = 0, sigma_1.1 = 1,
sigma_1.2 = 0, sigma_1.3 = 0, sigma_2.2 = 1, sigma_2.3 = 0,
sigma_3.3 = 1)
{
nf <- names(formals())
theta <- sapply(nf, function(x) {
eval(parse(text = x))
})
mean <- theta[1:n]
if (cholesky) {
L <- inv_vech(theta[-(1:n)])
L[lower.tri(L, diag = FALSE)] <- 0
sigma <- t(L) %*% L
}
else {
sigma <- inv_vech(theta[-(1:n)])
}
if (det(sigma) <= 0 || any(diag(sigma) < 0)) {
return(.Machine$integer.max)
}
f <- -(sum(dmvnorm(X, mean, sigma, log = TRUE)) - nrow(X) *
log(pmvnorm(lower = lower, upper = upper, mean = mean,
sigma = sigma)))
if (is.infinite(f) || is.na(f)) {
return(.Machine$integer.max)
}
f
}, start = as.list(c(0, 0, 0, 1, 0, 0, 1, 0, 1)), method = "BFGS",
fixed = list())
Coefficients:
mu_1 mu_2 mu_3 sigma_1.1 sigma_1.2 sigma_1.3
0.64218198 1.51720543 0.97047201 1.73395947 -0.03889188 0.14627774
sigma_2.2 sigma_2.3 sigma_3.3
2.18020597 0.38822509 1.49854600
My question is: how do I extract these coefficients from the object fit1?
Thanks again for your time, and for your help in answering this question!
coef is a generic function which extracts model coefficients from objects returned by modeling functions. coefficients is an alias for it.
Usage
coef(object, ...)
coefficients(object, ...)
So, fit1@coef should work.
https://stat.ethz.ch/R-manual/R-devel/library/stats/html/coef.html
Sorry for this silly question: I will keep it just in case someone ends up looking.
fit1@coef
mu_1 mu_2 mu_3 sigma_1.1 sigma_1.2 sigma_1.3
0.64218198 1.51720543 0.97047201 1.73395947 -0.03889188 0.14627774
sigma_2.2 sigma_2.3 sigma_3.3
2.18020597 0.38822509 1.49854600
solves the query. Duh!
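For readers without tmvtnorm installed, the same extraction can be tried on any object of class mle. The sketch below uses only the stats4 package that ships with R; the data y and the likelihood nll are made up for illustration and have nothing to do with the truncated-normal fit above.

```r
library(stats4)

set.seed(1)
y <- rnorm(50, mean = 3, sd = 1)   # toy data, for illustration only

# Negative log-likelihood of a normal mean with known sd = 1.
nll <- function(mu = 0) -sum(dnorm(y, mean = mu, sd = 1, log = TRUE))

fit <- mle(nll, start = list(mu = 0))

class(fit)   # "mle", from package stats4
fit@coef     # slot access on the S4 object
coef(fit)    # accessor method, available once stats4 is attached
```

With stats4 attached, coef(fit) and fit@coef give the same estimates when no parameters are held fixed.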

Regarding the argument of d-dimensional copula function in R

I have a simple question about R. This is simple code to generate random variables from a bivariate Clayton copula with normally distributed margins. How could I do this neatly if I had d identically distributed margins, without having to write c("norm","norm","norm", ... ) etc.?
myMvd1 <- mvdc(copula = archmCopula(family = "clayton", param = 2),
margins = c("norm", "norm"), paramMargins = list(list(mean = 0,
sd = 1), list(mean = 0, sd = 1)))
You can use rep:
d <- 5
mvdc(copula = archmCopula(family = "clayton", param = 2, dim = d),
margins = rep("norm", d),
paramMargins = rep(list(list(mean = 0, sd = 1)), d))
(As far as I can tell, param is the Clayton dependence parameter and can stay at 2; it is the copula's dim argument, which defaults to 2, that has to match the number of margins d.)
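The double list() in the rep call matters. A base-R sketch of what the two rep calls build, with d = 3 chosen arbitrarily and the variable names made up for illustration:

```r
d <- 3

# One margin name per dimension.
margins <- rep("norm", d)

# Wrapping the parameter list in an outer list() makes rep() repeat it whole:
# the result is a list of d lists, one per margin.
param_margins <- rep(list(list(mean = 0, sd = 1)), d)
str(param_margins)

# Without the outer list(), rep() repeats the individual elements instead,
# giving a flat list of length 2 * d, which is not what paramMargins expects.
flat <- rep(list(mean = 0, sd = 1), d)
length(flat)   # 6
```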
You can do something like this:
matrix(rMvdc(d*nRow, myMvd1),nRow,d)
