Multivariate normal distribution in R

I would like to simulate a multivariate normal distribution in R. I've seen that I need the values of mu and sigma. Unfortunately, I don't know how to obtain them.
In the following link you will find my data in a csv file "Input.csv". Thanks https://www.dropbox.com/sh/blnr3jvius8f3eh/AACOhqyzZGiDHAOPmyE__873a?dl=0
Please, could you show me an example? Raúl

Your link is broken, but I understand that you want to generate random samples from a multivariate normal distribution whose parameters are estimated from your data. You can do it like this, assuming df is your data.frame with the data (all columns numeric):
library('MASS')
Sigma <- var(df)
Means <- colMeans(df)
simulation <- mvrnorm(n = 1000, Means, Sigma)
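Since the original data file is not available, here is a hedged, self-contained sketch of the same idea. It assumes the four numeric columns of the built-in iris data set as a stand-in for df, and it checks that the simulated sample reproduces the estimated mean vector and covariance matrix reasonably well.

```r
library(MASS)

# Stand-in for the asker's data (an assumption): numeric columns of iris
df <- iris[, 1:4]

Means <- colMeans(df)  # estimate of mu
Sigma <- cov(df)       # estimate of Sigma (same result as var(df))

set.seed(42)
simulation <- mvrnorm(n = 1000, mu = Means, Sigma = Sigma)

# Differences between simulated and estimated moments should be small
print(round(colMeans(simulation) - Means, 2))
print(round(cov(simulation) - Sigma, 2))
```

If the simulated sample should match the estimated mean and covariance exactly rather than approximately, mvrnorm also accepts empirical = TRUE.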

Related

R code to find the inverse of Cumulative Distribution Function of a Multivariate Joint Distribution (Copula)

I am new to R and I am working with copulas.
I have read the R documentation, and so far I understand how to create a copula and how to calculate the PDF and CDF.
library(copula)
#Generate Normal Copula
coef_ <- 0.7
mycopula <- normalCopula(coef_, dim = 2)
v <- rCopula(4000, mycopula)
# Compute the density
pdf_ <- dCopula(v, mycopula)
# Compute the CDF
cdf <- pCopula(v, mycopula)
However, I need a function to retrieve the inverse of the CDF of the multivariate normal distribution, as I need to find the 99th percentile.
Does anyone know how to do that? Thanks!
I am not sure if you are still interested. As far as I know, the copula package has no general qCopula function, so the simplest route is qnorm(v). This transforms your copula sample (uniform margins) into data with standard normal margins.
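A minimal sketch of the qnorm step, under some assumptions: the Gaussian copula sample is built here from MASS::mvrnorm and pnorm instead of the copula package (so only packages shipped with R are needed), and the names z, u and x are illustrative. It also shows that the 99th percentile of each standard normal margin is simply qnorm(0.99).

```r
library(MASS)

set.seed(1)
rho <- 0.7
Sigma <- matrix(c(1, rho, rho, 1), nrow = 2)

# Correlated standard normals, then uniform margins: a Gaussian copula sample
z <- mvrnorm(4000, mu = c(0, 0), Sigma = Sigma)
u <- pnorm(z)

# qnorm maps the uniform margins back to standard normal margins
x <- qnorm(u)

print(cor(x)[1, 2])                         # sample correlation, near rho
print(qnorm(0.99))                          # theoretical 99th percentile per margin
print(apply(x, 2, quantile, probs = 0.99))  # empirical 99th percentiles
```

Note that this gives a percentile per margin. A joint CDF in two or more dimensions does not have a single-valued inverse, so "the" 99th percentile has to be defined margin by margin or along some chosen direction.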

How to graph increasing exponential decay in R?

My prof decided that our first experience with coding was going to be trying to fit the function z(t) = A(1-e^(-t/T)) to a given dataset from class using R. I'm completely lost. I keep using the lm and nls functions without quite knowing how they work. So far, I have the data graphed, but I have no clue how to get any sort of line more complicated than
mod3<-lm(y~I(x^1/5))
pre3<-predict(mod3)
lines(pre3)
To sum up: how do I find the A and T parameters? Do I use nls for the formula? Anything helps. I'll include a picture of the graph and the data. Please ignore the random lines on the plot.
[Image: graph depicting my dataset] [Image: dataset I have to use]
One could attempt to transform your expression into a linear relationship, but sometimes it is easier to just let the computer do the work. As mentioned in the comments, R has the nls function to perform nonlinear regression.
Here is an example using some dummy data. Supply the nls function with your equation, the data frame containing the data, and initial estimates of the parameters.
See comments for additional details.
# create dummy data
A <- 0.8
T1 <- 13
t <- seq(2, 50, 3)
z <- A * (1 - exp(-t / T1))
z <- z + rnorm(length(z), 0, 0.005)  # add noise
# starting data frame
df <- data.frame(t, z)
# solve non-linear model
model <- nls(z ~ A * (1 - exp(-t / Tc)), data = df, start = list(A = 1, Tc = 1))
print(summary(model))
# predict
pred_y <- predict(model, data.frame(t))
# plot
plot(x = t, y = z)
lines(y = pred_y, x = t, col = "blue")
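Because nls can fail to converge from poor starting values, it may help to derive them from the data. The following is a sketch under stated assumptions, not the only way to do it: A is started at the largest observed z (the curve levels off at A), and Tc at the first time where z reaches about 63.2% of that level (since 1 - e^(-1) ≈ 0.632). The names A0, T0 and est are illustrative.

```r
set.seed(1)

# Dummy data with known parameters, as in the example above
A_true <- 0.8
T_true <- 13
t <- seq(2, 50, 3)
z <- A_true * (1 - exp(-t / T_true)) + rnorm(length(t), 0, 0.005)
df <- data.frame(t, z)

# Data-driven starting values
A0 <- max(df$z)                           # plateau estimate
T0 <- df$t[which(df$z >= 0.632 * A0)[1]]  # time to reach ~63.2% of plateau

model <- nls(z ~ A * (1 - exp(-t / Tc)), data = df,
             start = list(A = A0, Tc = T0))

# Fitted parameters and their standard errors
est <- coef(model)
print(est)
print(summary(model)$coefficients)
```

The estimates in est["A"] and est["Tc"] are the A and T asked for in the question.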

Maximum likelihood lognormal R and SAS

I am converting SAS code to R, and there is a feature that fits a lognormal distribution in the SAS univariate procedure using histograms and midpoints. The result is a table containing the following variables:
EXPPCT - estimated percent of population in histogram interval determined from optional fitted distribution (here it is lognormal)
OBSPCT - percent of variable values in histogram interval
VAR - variable name
MIDPT - midpoint of histogram interval
There is an option in SAS to consider the MLE of the zeta, theta and sigma parameters while applying the distribution.
Now I was able to figure out the way to do this in R. My only problem arises in the likelihood estimation, when the three parameters are being estimated in SAS. R gives me different values.
I am using the following for MLE in R.
library(fitdistrplus)
set.seed(0)
cd <- rlnorm(40,4)
pars <- coef(fitdist(cd, "lnorm"))
meanlog sdlog
4.0549354 0.8620153
I am using the following for MLE in SAS. (the est option)
proc univariate data = testing;
histogram cd /lognormal (theta = est zeta=est sigma=est)
midpoints = 1 to &maxx. by 100
outhistogram = this;
run;
&maxx denotes the maximum of the input. The results of the run from SAS can be found here.
I am new to statistics and unable to find the method used for the MLE in SAS and have no clue as to how to estimate the same in R.
Thanks in advance.
I found the packages EnvStats and FAdist, which let me estimate the threshold parameter and use these parameters to fit the three-parameter lognormal distribution. Backlin was right about the parameters. Right now, the parameters are not an exact match, but the end result is the same as in SAS. Thank you very much.
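To make the relationship between the two parameterizations more concrete, here is a hedged base-R sketch. It assumes the usual SAS naming, where theta is the threshold, zeta the mean of log(x - theta) and sigma its standard deviation. For a fixed threshold, the maximum likelihood estimates of zeta and sigma have a closed form; lnorm_mle is a hypothetical helper name, not a function from any package.

```r
set.seed(0)
cd <- rlnorm(40, 4)

# MLE of (zeta, sigma) for a given, fixed threshold theta (hypothetical helper)
lnorm_mle <- function(x, theta = 0) {
  stopifnot(all(x > theta))
  y <- log(x - theta)
  zeta <- mean(y)
  sigma <- sqrt(mean((y - zeta)^2))  # ML estimate: divisor n, not n - 1
  c(theta = theta, zeta = zeta, sigma = sigma)
}

# theta = 0 is the two-parameter case that fitdist(cd, "lnorm") fits
est2 <- lnorm_mle(cd)
print(est2)
```

With theta = 0 the result should land very close to the meanlog and sdlog values reported by fitdist. SAS with theta = est additionally estimates the threshold, which is why its zeta and sigma differ from the two-parameter fit.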

How to get chi-square p-value from gofstat in R

I'm trying to compare sample data with lists of distributions (thanks to help on StackOverflow), but have hit a bit of a roadblock. gofstat seems to be working splendidly, and the graphical output is exactly what is desired. However, the final point of this piece of code is to find the most fitting distribution to the sample data (which will eventually be read from a text file, and will not be ideal at all), and the parameters of said distribution.
The first step for me is to find the appropriate p-value for the chi-square test statistic for each distribution and the data, and then find the largest of these p-values, which should indicate the most fitting distribution. However, I can't seem to get proper output from the code. Whenever I run the code below, I receive a NULL output (twice, because of the loop in the code). According to documentation, this is the output when the test statistic is not calculated. How do I get gofstat to calculate it, and display the p-value? (and the distribution parameters, if possible)
library(fitdistrplus)
set.seed(1)
testData <- rlnorm(1000)
distlist <- c("norm","unif")
# Loop through list of distributions
for(i in 1:length(distlist)){
x <- fitdist(testData, distlist[i])
gofstat(x)
print(x$chisqpvalue)
plot(x)
}
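One likely cause of the NULL output is that the p-value is looked up on the fitdist object, while gofstat returns its own result object that is discarded in the loop. The sketch below stores that object and reads chisqpvalue from it. Treat it as an illustration under assumptions: "unif" is replaced by "lnorm" so that one candidate actually matches the simulated data, and the names pvals and params are illustrative.

```r
library(fitdistrplus)

set.seed(1)
testData <- rlnorm(1000)
distlist <- c("norm", "lnorm")

# Fit each candidate once and keep the fits
fits <- lapply(distlist, function(d) fitdist(testData, d))
names(fits) <- distlist

# gofstat() returns an object; the chi-square p-value is stored inside it
pvals <- sapply(fits, function(fit) unname(gofstat(fit)$chisqpvalue))
print(pvals)

# Estimated parameters of each fitted distribution
params <- lapply(fits, coef)
print(params)
```

Picking the largest p-value is only a rough heuristic. The object returned by gofstat also carries aic and bic, which are often used to compare candidate fits.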

How to choose the param in fitCopula

Suppose I have two random variables X1 and X2, both normally distributed. In R I would generate samples like this:
> X1 <- rnorm(1000,mean=2.4,sd=1.3)
> X2 <- rnorm(1000,mean=1.5,sd=0.9)
Here I assume they are normally distributed with the corresponding mean and sd. My goal is to fit a copula C to these samples, assuming a certain family for the copula C. For simplicity, assume the bivariate distribution follows a t distribution.
The first step would be to transform them to (pseudo-)uniform margins. We would look at U1=F1(X1) and U2=F2(X2). In R I would do this with the following code:
> U1 <- pnorm(X1,mean=2.4,sd=1.3)
> U2 <- pnorm(X2,mean=1.5,sd=0.9)
Then I would fit a t-copula using the copula package. I know that I could directly fit a multivariate t distribution. However, I would like to know how these things work in the package. The function fitCopula needs an object of class copula. Obviously I would hand over a t-copula. I'm not sure how to choose the parameter, since it is the thing that should be estimated. So how can I fit a t-copula to U1 and U2?
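A hedged sketch of one way this is commonly done with the copula package: pass a t-copula whose parameter is left unspecified, and let fitCopula estimate both the correlation and the degrees of freedom. To have something to recover, the example assumes data simulated from a known t-copula (rho = 0.6, df = 5) instead of the independent X1 and X2 above; the names true_cop, U_hat and fit are illustrative.

```r
library(copula)

set.seed(123)

# Stand-in data with known dependence (an assumption, not the asker's data)
true_cop <- tCopula(0.6, dim = 2, df = 5)
U <- rCopula(1000, true_cop)
X1 <- qnorm(U[, 1], mean = 2.4, sd = 1.3)
X2 <- qnorm(U[, 2], mean = 1.5, sd = 0.9)

# Rank-based pseudo-observations; pnorm with known margins would also work
U_hat <- pobs(cbind(X1, X2))

# The t-copula is passed without a parameter value; fitCopula estimates it
fit <- fitCopula(tCopula(dim = 2), data = U_hat, method = "mpl")
print(coef(fit))  # estimated correlation and degrees of freedom
```

The parameter given to tCopula only matters as a placeholder or starting point here. When the margins are known exactly, as with the pnorm calls above, method = "ml" on cbind(U1, U2) is the corresponding full maximum likelihood fit.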
