How to generate a random variable from two different distributions in R

Suppose a random variable Z is drawn with equal probability from two different distributions: a standard normal N(0,1) and an exponential Exp(1) with rate = 1. I want to generate the random variable Z.
My approach in R is Z = 0.5X + 0.5Y, so that Z comes from the mixture of N(0,1) and Exp(1). The R code would be:
x <- rnorm(1)
y <- rexp(1)
z <- 0.5*x + 0.5*y
My question is: can I obtain Z by just adding up x and y weighted by their probabilities, or do I have to consider the correlation between the variables?

Unfortunately not. You need another variable U: a Bernoulli random variable with p = 0.5, independent of X and Y. Define Z = U*X + (1-U)*Y. In R, you can do:
x <- rnorm(1)          # draw from N(0,1)
y <- rexp(1)           # draw from Exp(1)
u <- rbinom(1, 1, 0.5) # fair coin: which component to use
z <- u*x + (1-u)*y     # one draw from the mixture
Averaging X and Y results in a totally different distribution, not the mixture of distributions you want.
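To see the difference, here is a minimal sketch comparing the two recipes over many draws (n = 10000 and the seed are illustrative choices, not from the question):
set.seed(42)
n <- 1e4
x <- rnorm(n)                    # N(0,1) draws
y <- rexp(n)                     # Exp(1) draws
u <- rbinom(n, 1, 0.5)           # independent fair coin flips
mixture <- u*x + (1 - u)*y       # 50/50 mixture of N(0,1) and Exp(1)
average <- 0.5*x + 0.5*y         # the averaged variable instead
plot(density(mixture))
lines(density(average), lty = 2) # visibly narrower than the mixture
The averaged variable has variance 0.25*Var(X) + 0.25*Var(Y) = 0.5, while the mixture's variance is 1.25, which is why the dashed density is so much narrower.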

Related

get pairwise difference from emmeans with quadratic covariate interaction

I have a factor X with three levels and a continuous covariate Z.
To predict the continuous variable Y, I have the model
model<-lm(Y ~ X*poly(Z,2,raw=TRUE))
I know that the emmeans package in R has the function emtrends() to estimate pairwise differences between factor-level slopes and apply a p-value adjustment.
emtrends(model, pairwise ~ X, var = "Z")
However, this works when Z enters linearly, whereas here I have a quadratic term. I guess this means I have to look at pairwise differences at pre-specified values of Z and get something like the local "slope" (tangent trend)?
Is this possible to do with emmeans? How would I need to do the p-value adjustment: does it scale with the number of grid points, so that as the number of grid values where I do the comparison increases, Bonferroni becomes too conservative?
Also, how would I do the pairwise comparison of the means (predictions) at different grid values with emmeans (or is this the same regardless of using poly(), since it relies only on model predictions)?
Thanks.
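One hedged possibility (the grid values 0, 1, and 2 below are illustrative, not from the question): emtrends() passes an at argument through to the reference grid, so you can estimate and compare the local tangent slopes at pre-specified Z values, and emmeans() can compare the predicted means at the same grid:
library(emmeans)
# Pairwise differences in the local slope of the quadratic at chosen Z values:
emtrends(model, pairwise ~ X | Z, var = "Z", at = list(Z = c(0, 1, 2)))
# Pairwise comparisons of the predicted means at the same grid values:
emmeans(model, pairwise ~ X | Z, at = list(Z = c(0, 1, 2)))
By default the adjustment is applied within each Z value separately; correcting across all grid points as well would indeed grow stricter as the grid gets larger.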

Generate two negative binomial distributed random variables with predefined correlation

Assume I have a negative-binomial-distributed variable X1 ~ NB(mu = MU1, size = s1) and a negative-binomial-distributed variable X2 ~ NB(mu = MU2, size = s2).
I fitted a negative binomial regression to estimate the mu and size parameters from my data.
I can use the rnbinom() function in R to generate random draws from these distributions.
X1model<-rnbinom(n=1000,mu=MU1fitted,size=s1fitted)
X2model<-rnbinom(n=1000,mu=MU2fitted,size=s2fitted)
Those draws are independent. However, how can I draw from the two distributions so that the draws exhibit a predefined correlation r, namely the correlation I observe between my original data X1 and X2, so that:
cor(X1, X2, method = "spearman") = r = cor(X1model, X2model, method = "spearman")
Or, even better, how can I draw from them with any arbitrary preset correlation r?
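One common approach is a Gaussian copula. Here is a hedged sketch (MU1fitted, s1fitted, etc. are the fitted parameters from the question; the latent normal correlation rho does not map exactly onto the resulting Spearman r, especially for discrete margins, so in practice you would tune rho toward the target):
library(MASS)  # for mvrnorm
rho <- 0.7     # latent correlation; tune until the output matches the target r
n <- 1000
Sigma <- matrix(c(1, rho, rho, 1), 2, 2)
zn <- mvrnorm(n, mu = c(0, 0), Sigma = Sigma)  # correlated standard normals
u <- pnorm(zn)                                 # correlated uniforms
X1model <- qnbinom(u[, 1], mu = MU1fitted, size = s1fitted)
X2model <- qnbinom(u[, 2], mu = MU2fitted, size = s2fitted)
cor(X1model, X2model, method = "spearman")     # roughly close to rho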

Testing First-Order Stochastic Dominance Using R

I ran a simulation and generated two (random) variables, X and Y.
I would like to test in R whether X first-order stochastically dominates Y.
That is, how can I check whether X's empirical CDF lies to the right of (i.e., never above) Y's empirical CDF over the whole support?
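A minimal sketch of the pointwise ECDF check (the simulated X and Y below are placeholders for your variables): X first-order dominates Y exactly when ecdf(X) <= ecdf(Y) everywhere.
set.seed(1)
X <- rnorm(1000, mean = 1)     # placeholder for your simulated X
Y <- rnorm(1000, mean = 0)     # placeholder for your simulated Y
grid <- sort(unique(c(X, Y)))  # evaluate on the pooled sample points
Fx <- ecdf(X)
Fy <- ecdf(Y)
all(Fx(grid) <= Fy(grid))      # TRUE: X's ECDF never lies above Y's
For a formal hypothesis test rather than a yes/no check, a one-sided two-sample Kolmogorov-Smirnov test (ks.test() with a one-sided alternative) is one option, though its alternative is phrased in terms of the CDFs, so the direction needs care.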

What does it mean to put an `rnorm` as an argument of another `rnorm` in R?

I have difficulty understanding what it means when an rnorm() is used as one of the arguments of another rnorm(). (I'll explain more below.)
For example, in the first line of my R code below, I use an rnorm() and call it mu.
mu consists of 10,000 values.
Now, let me put mu itself as the mean argument of a new rnorm() called distribution.
My question is: how can mu, which itself has 10,000 values, be used as the mean argument of this new rnorm() called distribution?
P.S.: The mean argument of any normal distribution can be a single number, and with only ONE mean we get a single, complete normal. How come using 10,000 mu values still results in a single normal?
mu <- rnorm(1e4, 178, 20); plot(density(mu))
distribution <- rnorm(1e4, mu, 1); plot(density(distribution))
Your distribution is sampled conditionally on mu, while the density you draw with plot(density(distribution)) is a marginal density.
Statistically speaking, you first have a normal random variable mu ~ N(178, 20), then another random variable y | mu ~ N(mu, 1). The plot you produce is the marginal density of y.
Mathematically, p(y) is the integral of the joint density p(y | mu) * p(mu) over mu, integrating mu out.
@李哲源ZheyuanLi, ahhh! So when we use a vector as the mean argument or sd argument of an rnorm, the single, final plot is the result of the integral, right?
It means you are sampling from the marginal distribution. The density estimate approximates the Monte Carlo integral from the samples.
This kind of thing is often seen in Bayesian computation. "Toy R code on Bayesian inference for mean of a normal distribution [data of snowfall amount]" gives a full example, but there the integral is computed by numerical integration.
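A small sanity check of that statement (using the known fact that compounding a normal mean with a normal prior gives another normal, with the variances adding):
set.seed(1)
mu <- rnorm(1e4, 178, 20)
y  <- rnorm(1e4, mu, 1)  # rnorm recycles mu: the i-th draw uses mean mu[i]
c(mean(y), sd(y))        # approximately 178 and sqrt(20^2 + 1^2) ~ 20.02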

Distribution of multivariate uniform random variable in R

I have k > 2 independent uniform random variables with different upper and lower bounds,
for example: X ~ U[-1,1], Y ~ U[0,1], Z ~ U[-1,0], and so on.
For the random variable RV = X + Y + Z, I have to calculate the probability that it is greater than some scalar x, for example x = 0.2; that is, Pr(RV > x). Is there an easy way to calculate this in R? I have an array of more than 1000 random variables and I need to calculate this for each possible combination of these variables, so I am trying to avoid the sampling route.
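One sampling-free possibility is numerical convolution of the densities on a fine grid. A sketch for the three example variables (the step h is an accuracy knob; this is a Riemann approximation, not an exact formula):
h  <- 0.001              # grid step; smaller = more accurate
gx <- seq(-1, 1, by = h)
gy <- seq(0, 1, by = h)
gz <- seq(-1, 0, by = h)
fx <- dunif(gx, -1, 1)
fy <- dunif(gy, 0, 1)
fz <- dunif(gz, -1, 0)
fxy  <- convolve(fx, rev(fy), type = "open") * h   # density of X + Y
fxyz <- convolve(fxy, rev(fz), type = "open") * h  # density of X + Y + Z
s <- seq(-2, 2, length.out = length(fxyz))         # support of the sum
sum(fxyz[s > 0.2]) * h                             # ~ Pr(X + Y + Z > 0.2)
The same two-at-a-time convolution extends to any number of independent uniforms, since each pairwise convolution just produces a new density vector on a wider support.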
