GVLMA in R makes an assetment on the linear model asumption using a global test. I need to store in a table whether the assumpions hold or not.
First i make a regression:
z<- data.frame(replicate(2, rnorm(10)))
colnames(z) <- c("X","Y")
fit <- lm(Y ~ X, data=z)
Now i run GVLMA:
gvlma(fit)
The tables that comes out of this; How do I access this (store it,use the values)?
I recently had the same need as you, this is how I did it:
gvmodel = gvlma(fit)
mtrx = summary(gvmodel)
Then I could just take what I needed from it like any other matrix. For instance, the decision for Kurtosis (3rd row):
mtrx[3,]$Decision
Related
As a newcomer who just gets started in R, I am confused about the result of the mediation analysis.
My model is simple: IV 'T1Incivi', Mediator 'T1Envied', DV 'T2PSRB'. I ran the same model in SPSS using PROCESS, but the result was insignificant in PROCESS; however, the indirect effect is significant in R. Since I am not that familiar with R, could you please help me to see if there is anything wrong with my code? And tell me why the result is significant in R but not in SPSS?Thanks a bunch!!!
My code in R:
X predict M
apath <- lm(T1Envied~T1Incivi, data=dat)
summary(apath)
X and M predict Y
bpath <- lm(T2PSRB~T1Envied+T1Incivi, data=dat)
summary(bpath)
Bootstrapping for indirect effect
getindirect <- function(dataset,random){
d=dataset[random,]
apath <- lm(T1Envied~T1Incivi, data=d)
bpath <- lm(T2PSRB~T1Envied+T1Incivi, data=dat)
indirect <- apath$coefficients["T1Incivi"]*bpath$coefficients["T1Envied"]
return(indirect)
}
library(boot)
set.seed(6452234)
Ind1 <- boot(data=dat,
statistic=getindirect,
R=5000)
boot.ci(Ind1,
conf = .95,
type = "norm")`*PSRB as outcome*
In your function getindirect all linear regressions should be based on the freshly shuffled data in d.
However there is the line
bpath <- lm(T2PSRB~T1Envied+T1Incivi, data=dat)
that makes the wrong reference to the variable dat which should really not be used within this function. That alone can explain incoherent results.
Using the dlm package in R I fit a dynamic linear model to a time series data set, consisting of 20 observations. I then use the dlmForecast function to predict future values (which I can validate against the genuine data for said period).
I use the following code to create a prediction interval;
ciTheory <- (outer(sapply(fut1$Q, FUN=function(x) sqrt(diag(x))), qnorm(c(0.05,0.95))) +
as.vector(t(fut1$f)))
However my data does not follow a normal distribution and I wondered whether it would be possible to
adapt the qnorm function for other distributions. I have tried qt, but am unable to apply qgamma.......
Just wondered if anyone knew how you would go about sorting this.....
Below is a reproduced version of my code...
library(dlm)
data <- c(20.68502, 17.28549, 12.18363, 13.53479, 15.38779, 16.14770, 20.17536, 43.39321, 42.91027, 49.41402, 59.22262, 55.42043)
mod.build <- function(par) {
dlmModPoly(1, dV = exp(par[1]), dW = exp(par[2]))
}
# Returns most likely estimate of relevant values for parameters
mle <- dlmMLE(a2, rep(0,2), mod.build); #nileMLE$conv
if(mle$convergence==0) print("converged") else print("did not converge")
mod1 <- dlmModPoly(dV = v, dW = c(0, w))
mod1Filt <- dlmFilter(a1, mod1)
fut1 <- dlmForecast(mod1Filt, n = 7)
Cheers
I´m a begginer in R and programming and struggling in doing problably a simple task.
I've made a code that creates a second model order and i want to input variables in this model and find the "Y value"
I´ve tried to use the predict function, but is actually pretty complex and I can't got anywhere.
I did this so far:
modFOI <- rsm(Rendimento~FO(x1,x2,x3,x4)+TWI(x1,x2,x3,x4)+PQ(x1,x2,x3,x4),data=CR) # com interações
summary(modFOI)
print(modFOI)
With that, i found the SO model, but now i want to create variables like x1,x2,x3 and input that in the model and find the Y. I also woud like to find the optimum Y
Simplest way to create a polynomial (2nd order) that I can think of is the following:
DF <- data.frame(x = runif(10,0,1),
y = runif(10,0,1) )
mod <- lm(DF$y ~ DF$x + I(DF$x^2))
predict(mod, new.data=data.frame(x=c(1,2,3,4,5)))
NB. when using predict the new.data must be in a data.frame format, and the variable must have the same name as the variable in the model (here, x)
Hope this helps
The optimum value is shown as the stationary point in the output of summary(modFOI). You may also run steepest(modFOI) to see a trace of the estimated values along the path of steepest ascent.
To predict, create a data frame with the desired sets of x values. For example,
testdat <- data.frame(x1 = -1:1, x2 = 0, x3 = 0, x4 = 1)
Then use the predict() function with this is newdata:
predict(modFOI, newdata = testdat)
I have a loop that needs to be executed; within which are 6 models. The objects that those models are stored in then need to get passed into a function that executes an AIC analysis. However, sometimes one of the models does not work, which then breaks the code for the AIC function because it does not recognize whatever model that failed because it was not stored as an object.
So, I need a way to pull those models that worked into the AIC function.
Here is an example, but keep in mind it is important that this can all be executed within a loop. Here are three hypothetical models:
hn.1 <- ds(data)
hn.1.obs <- ds(data,formula = ~OBSCODE)
hn.1.obs.mas <- ds(dataformula = ~OBSCODE+MAS)
And this would be my AIC function that compares the models:
summarize_ds_models(hn.1, hn.1.obs, hn.1.obs.mas)
But I get an error if say, the hn.1.obs.mas model failed.
I tried to use "get" and "ls" and I successfully pull the models that exist when I call:
get(ls(pattern='hn.15*'))
But that just returns a character vector, so that when I call:
summarize_ds_models(get(ls(pattern='hn.15*')))
it only conducts the AIC analysis on the first model in the above character vector.
Am I on the right track or is there a better way to do this?
UPDATE with a reproducible example.
Here is a simplified version of my problem:
create and fill two data frames that will be put into a list:
data.frame <- data.frame(x = integer(4),
y = integer(4),
z = integer(4),
i = integer(4))
data.frame$x <- c(1,2,3,4)
data.frame$y <- c(1,4,9,16)
data.frame$z <- c(1,3,8,10)
data.frame$i <- c(1,5,10,15)
data.frame.2 <- data.frame[1:4,1:3]
my.list <- list(data.frame,data.frame.2)
create df to fill with best models from AIC analyses
bestmodels <- data.frame(modelname = character(2))
Here is the function that will run the loop:
myfun <- function(list) {
for (i in 1:length(my.list)){
mod.1 = lm(y ~ x, data = my.list[[i]])
mod.2 = lm(y ~ x + z, data = my.list[[i]])
mod.3 = lm(y ~ i, data = my.list[[i]])
bestmodels[i,1] <- rownames(AIC(mod.1,mod.2,mod.3))[1]#bestmodel is 1st row
}
print(bestmodels)
}
However, on the second iteration of the loop, the AIC function will fail because mod.3 will fail. So, is there a generic way to make it so the AIC function will only execute for those models that worked? The outcome I would want here would be:
> bestmodels
modelname
1 mod.1
2 mod.1
since mod.1 would be chosen for both AIC analyses.
Gregor's comment:
Use a list instead of individual named objects. Then do.call(summarize_ds_models, my_list_of_models). If it isn't done already, you can Filter the list first to make sure only working models are in the list.
solved my problem. Thanks
This is a follow-up question to this (see data and previous commands).
Starting with a list of models in mods, i am now able to find the model with the least AIC (corresponds to the best model):
mods <- lapply(methods, function(m)
update(amod.null, correlation = getFunction(m)(1, form = ~ x + y), method="ML"))
names(mods) <- methods
list.AIC <- lapply(mods, function(x) AIC(x))
best.mod <- names(which.min(list.AIC))
Now, i need to do some testing on the model, e.g. Tukey between dates. The syntax is very simple, e.g. for amod.null
library(multcomp)
res <- glht(amod.null, mcp(Date = "Tukey"))
The tricky part is, how can i tell glht to use the model which was put into best.mod (note: this is all happening within a loop). I tried
res <- glht(paste("mods$", as.factor(best.mod),sep = "") , mcp(Date = "Tukey"))
but to no avail, as glht needs to find a model-object in the first argument.
/edit:
Possibly useful:
names(mods)
[1] "corExp" "corGaus" "corLin" "corRatio" "corSpher"
Since the models are stored in the list mods, you can access the "best model" by using the index of which.min(list.AIC):
list.AIC <- sapply(mods, AIC)
best.mod <- mods[which.min(list.AIC)]
best.mod[[1]]