Generating random t variates, each with a different df and ncp, in R

I'm trying to generate 5 random t variates using rt(), with each of the 5 having a particular df (respectively, 1 to 5) and a particular ncp (respectively, seq(0, 1, l = 5)). So, 5 random t variates, each with a different df and a different ncp.
To achieve this, I tried the code below without success. What would be an efficient way to do this in R?
vec.rt = Vectorize(function(n, df, ncp) rt(n, df, ncp), c("n", "df", "ncp"))
vec.rt(n = 5, df = 1:5, ncp = seq(0, 1, l = 5))
Or
mapply(FUN = rt, n = 5, df = 1:5, ncp = seq(0, 1, l = 5))
Note that for:
rt(n = 5, df = 1:5, ncp = seq(0, 1, l = 5))
R gives the following warning:
Warning message:
In if (is.na(ncp)) { :
the condition has length > 1 and only the first element will be used

Rephrasing your question helps to find an answer: you want a sample of length 1 (n = 1) from each of 5 random variables, each with its own parameters.
mapply(FUN = rt, n = 1, df = 1:5, ncp = seq(0, 1, l = 5))
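For reference, an equivalent sketch with sapply(), drawing one variate per (df, ncp) pair from the question:
ncp <- seq(0, 1, length.out = 5)
sapply(1:5, function(i) rt(1, df = i, ncp = ncp[i]))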

Related

Optimize / solve equation for unknown exponent

I have a data frame with the following variables: a and b as predictors and c as the outcome. My formula is:
c = (a^x) / (a^x + b^x)
How can I solve for x?
Example data:
dat <- data.frame(a = runif(5, 1, 100), b = runif(5, 10, 20), c = runif(5, 0, 1))
Reply to comment:
What is your expected output? A single x-value from least squares fitting, or a column x?
The whole column (sum of all row errors). I want to minimize the error for every row.
You can use the following code:
library(minpack.lm)
dataset <- data.frame(a = runif(5, 1, 100), b = runif(5, 10, 20), c = runif(5, 0, 1))
fun <- as.formula(c ~ a^x / (a^x + b^x))
# Fit the model with the Levenberg-Marquardt algorithm from minpack.lm
nls.out1 <- nlsLM(fun,
                  data = dataset,
                  start = list(x = 1),
                  algorithm = "LM",
                  control = nls.lm.control(maxiter = 500))
summary(nls.out1)
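If you prefer base R, here is a minimal sketch of the same least-squares idea using optimize(); dat is the example data frame from the question, and the search interval c(0, 10) for x is an assumption:
# Sum of squared row errors for a candidate exponent x
sse <- function(x, d) sum((d$c - d$a^x / (d$a^x + d$b^x))^2)
# The interval is an assumed search range for x
optimize(sse, interval = c(0, 10), d = dat)$minimum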

How can I predict values in factorial experiments (2^k) with centre points in R?

How can I predict values in factorial experiments with centre points in R, using the FrF2 package together with the predict function or the broom package?
My code:
library(FrF2)
plan.person = FrF2(nfactors = 5, resolution = 5, replications = 2,
                   ncenter = 1, randomize = FALSE,
                   factor.names = list(
                     A = c(8, 5),
                     B = c(70, 30),
                     C = c(0.5, 0),
                     D = c(1000, 700),
                     E = c(70, 10)))
resp <- c(84.55, 66.34, -1, 69.18, 73.01, 64.52, 0.73, 47.61, 68.18, 59.87,
          26, 72.57, 78.08, 73.81, 26, 59.38, 71.41, 88.64, 64.92, 4, 68.81,
          80, 69.66, -1.36, 54.50, 79.24, 78.53, -1, 72.63, 89.97, 87.98,
          -11, 65.68, 82.46)
newplan <- add.response(design = plan.person, response = resp)
model <- lm(newplan, use.center = T)
# summary(model)
d <- within(newplan, {
  A <- as.numeric(as.character(A))
  B <- as.numeric(as.character(B))
  C <- as.numeric(as.character(C))
  D <- as.numeric(as.character(D))
  E <- as.numeric(as.character(E))
})
A = seq(5, 8, 1)
B = seq(30, 70, length.out = length(A))
C = seq(0, 0.5, length.out = length(A))
D = seq(700, 1000, length.out = length(A))
E = seq(10, 70, length.out = length(A))
data <- expand.grid(A = A, B = B,
                    C = C, D = D,
                    E = E)
data$p <- predict(model, newdata = data)
Because of the center point the following message appears.
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = object$xlevels) :
  variable lengths differ (found for 'center')
"A two-level experiment with center points can detect, but not fit, quadratic effects."
(https://www.itl.nist.gov/div898/handbook/pri/section3/pri336.htm)
That is, R can't predict these values because you need to make additional assumptions about what the curve looks like to predict points not at your design points.
Note that, computationally, you can get the software to work by adding a center term. The error arises because this term is in the regression but not in the data set. You could add one with data$center <- FALSE (because none of the points in data are at the center), but this will not do the right thing: it will not take the potential curvature into account when predicting non-central points, and will simply predict a twisted plane (that is, linear with interactions) with a single bump at the center.
Of course, it's also equivalent to just fitting the model with use.center=FALSE, as the center point doesn't affect the fit of the other points.
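A minimal sketch of that route, reusing newplan from the question:
# Refit without the center term; the center run is not used in this fit
model2 <- lm(newplan, use.center = FALSE)
summary(model2)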
If you remove the center value, you can do the following after model <- lm(newplan, use.center = T):
1. Filter the coefficients with p-values < 0.05:
library(dplyr)
coe <- broom::tidy(model) %>%
  slice(-7) %>%                # remove the center term
  filter(p.value < 0.05)
m_beta <- coe$estimate
2. Build a grid:
A <- seq(5, 8, 0.5)
B <- seq(30, 70, length.out = length(A))
exp <- expand.grid(A = A, B = B) %>%
  mutate(bo = 1) %>%           # intercept column
  mutate(ult = A * B) %>%      # A:B interaction column
  select(bo, A, B, ult) %>%
  as.matrix()
3. Compute the predictions from the retained coefficients:
reg <- t(m_beta %*% t(exp))    # X %*% beta for each grid row
exp <- cbind(exp, reg) %>%
  as.data.frame() %>%
  rename(reg = V5)
But I believe this only addresses the computational problem, or at best simplifies it; the regression itself should probably be refit as well. Still, with this code you can explore the results and see what other issues remain.
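For illustration only, a heavily hedged sketch of what such a refit could look like, using the numeric copy d built with within() in the question; the response column name resp and the main-effects-only formula are assumptions:
# Ordinary lm on the numeric columns (assumes the response column is named 'resp')
fit_num <- lm(resp ~ A + B + C + D + E, data = d)
# Predict over the expand.grid() grid built in the question
data$p <- predict(fit_num, newdata = data)
head(data)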

Creating names for a matrix of data in R

I have a simple 12 x 2 matrix called m that contains my dataset (see below).
Question
I was wondering why, when I use dimnames(m) to create names for the two columns of my data, I run into an error. Is there a better way to create column names for this data in R?
Here is my R code:
Group1 = rnorm(6, 7) ; Group2 = rnorm(6, 9)
Level = gl(n = 2, k = 6)
m = matrix(c(Group1, Group2, Level), nrow = 12, ncol = 2)
dimnames(m) <- list( DV = Group1, Level = Level)
The error occurs because dimnames() expects a list whose first element supplies one name per row (12 here) and whose second supplies one name per column (2 here); Group1 and Level have the wrong lengths. To name just the columns, replace the dimnames(m) line with:
colnames(m) <- c("DV", "Level")
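If you do want to use dimnames() itself, an equivalent sketch: the first list element names the 12 rows (left NULL here) and the second names the 2 columns.
dimnames(m) <- list(NULL, c("DV", "Level"))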

MASS package's fitdistr: error when dealing with manipulated random data

Background:
Below I generate some random beta data in R and manipulate its shape a bit to arrive at what I call "Final" in my code, which I then histogram.
Question:
I'm wondering why, when trying to fit a "beta" distribution to the "Final" data using the MASS package's fitdistr function, I get the following error. Any suggestions on how to avoid it?
Error in stats::optim(x = c(0.461379379270288, 0.0694261016478062, 0.76934266883081, :
initial value in 'vmmin' is not finite
Here is my R code:
require(MASS)
## Generate some data and manipulate it
set.seed(47)
Initial = rbeta(1e5, 2, 3)
d <- density(Initial)
b.5 <- dbeta(seq(0, 1, length.out = length(d$y)), 50, 50)
b.5 <- b.5 / (max(b.5) / max(d$y)) # Scale down to max of original density
b.6 <- dbeta(seq(0, 1, length.out = length(d$y)), 60, 40)
b.6 <- b.6 / (max(b.6) / max(d$y))
# Collect maximum densities at each x to use as sample probability weights
p <- pmax(d$y, b.5, b.6)
Final <- sample(d$x, 1e4, replace = TRUE, prob = p) ## THIS IS MY FINAL DATA
hist(Final, freq = F, ylim = c(0, 2)) ## HERE IS A HISTOGRAM
m <- MASS::fitdistr(Final, "beta",   ## RUN THIS TO SEE HOW THE ERROR COMES UP
                    start = list(shape1 = 1, shape2 = 1))
Here is the code. It is the same as yours, except that the non-positive values (which break the beta likelihood) are replaced with the smallest positive value.
library(MASS)
set.seed(47)
Initial = rbeta(1e5, 2, 3)
d <- density(Initial)
b.5 <- dbeta(seq(0, 1, length.out = length(d$y)), 50, 50)
b.5 <- b.5 / (max(b.5) / max(d$y)) # Scale down to max of original density
b.6 <- dbeta(seq(0, 1, length.out = length(d$y)), 60, 40)
b.6 <- b.6 / (max(b.6) / max(d$y))
# Collect maximum densities at each x to use as sample probability weights
p <- pmax(d$y, b.5, b.6)
Final <- sample(d$x, 1e4, replace = TRUE, prob = p) ## THIS IS MY FINAL DATA
hist(Final, freq = F, ylim = c(0, 2)) ## HERE IS A HISTOGRAM
# replace negative beta values with smallest value > 0
Final[Final<= 0] <- min(Final[Final>0])
hist(Final, freq = F, ylim = c(0, 2))
m <- MASS::fitdistr(x = Final, densfun = "beta",
                    start = list(shape1 = 1, shape2 = 1))
Here are the shape parameters:
> m
      shape1       shape2
  1.99240852   2.90219720
 (0.02649853) (0.04010168)
Note that the fit still produces some warnings.
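As an optional check (a sketch, assuming the histogram drawn above is still the active plot), you can overlay the fitted beta density using the estimates stored in m:
curve(dbeta(x, m$estimate["shape1"], m$estimate["shape2"]),
      from = 0, to = 1, add = TRUE, col = "red", lwd = 2)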

R package heatmap.plus: ColSideColors parameter gives an error on dimensions

I need to plot a heatmap of a matrix with annotations, so with the heatmap.plus R package I need to use the ColSideColors parameter. The issue is that it complains that the dimensions don't match even when they appear to be equal...
> m <- matrix(rnorm(100,1, 20), 10, 10)
> c <- t(as.matrix(rep('gold', 10), ncol=10, nrow=10))
> heatmap.plus(m, ColSideColors=c)
Error in heatmap.plus(m, ColSideColors = c) :
'ColSideColors' dim()[2] must be of length ncol(x)
> dim(c)[2]
[1] 10
> ncol(m)
[1] 10
UPDATE
What about the case of the following code?
> m <- matrix(rnorm(100,1, 20), 10, 10)
> c <- t(as.matrix(cbind(rep('gold', 10), rep('blue', 10)), ncol=2, nrow=10))
> heatmap.plus(m, ColSideColors=c)
Error in heatmap.plus(m, ColSideColors = c) :
'ColSideColors' dim()[2] must be of length ncol(x)
> dim(c)[2]
[1] 10
> ncol(m)
[1] 10
In other words, what should I do when I want to build the matrix from vectors?
I think your c is causing the issue. Your c is a 1 (row) x 10 (column) matrix, but heatmap.plus expects 10 rows (one per column of m). Following your example, this is what I have done:
m <- matrix(rnorm(100, 1, 20), 10, 10)
c <- matrix("gold", ncol = 10, nrow = 10)
heatmap.plus(m, ColSideColors=c)
I also followed the example in the CRAN manual and did the following; have a look if it helps.
m <- matrix(rnorm(100, 1, 20), 10, 10)
rlab <- matrix(c("gold", "green", "blue", "red"), nrow = 10, ncol = 4)
clab <- matrix(c("green", "blue"), nrow = 10, ncol = 2)
colnames(rlab) <- LETTERS[1:dim(rlab)[2]]
colnames(clab) <- 1:dim(clab)[2]
heatmap.plus(m, ColSideColors = clab, RowSideColors = rlab)
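Regarding the UPDATE (building the annotation matrix from vectors): a sketch that should work is to bind the vectors as columns without the transpose, so the result has 10 rows matching ncol(m); the column names gold/blue are just illustrative.
# 10 x 2 character matrix: one row per column of m, one column per annotation track
clab2 <- cbind(gold = rep("gold", 10), blue = rep("blue", 10))
heatmap.plus(m, ColSideColors = clab2)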
