I am doing an SVAR (structural vector autoregression) analysis in which I want to plot IRFs (impulse response functions). My time series have length 137 and I use only 3 variables; furthermore, I select 1 lag when specifying the VAR model.
Specifying the VAR model works fine, but when I want to summarize it I get the following error message:
VAR_reduced <- VAR(VAR_data_1, p = 1, type = "both")
summary(VAR_reduced)
Error in solve.default(Sigma) :
system is computationally singular: reciprocal condition number = 1.03353e-16
From what I read in another question, this error usually comes up when there are not enough observations, leading to overfitting. In my example that should not be a problem, as I have enough observations.
Since R does not display an error message as long as I don't run the summary command, it is still possible to calculate the IRFs using:
plot(irf(VAR_reduced, n.ahead = 40))
But the plot seems rather counter-intuitive, as there is no reaction from any variable other than assets. Therefore, my guess is that the error message hints at something I got wrong but haven't realised yet.
Is this correct? That is, do I need to solve that error, or do my IRFs have nothing to do with it?
For completeness, here is all the code:
library(quantmod)
library(urca)
library(vars)
library(tseries)
getSymbols('CPILFESL',src='FRED')
getSymbols('INDPRO',src='FRED')
getSymbols('WALCL',src='FRED')
CPI <- ts(CPILFESL, frequency = 12, start = c(1957, 1))
output <- ts(INDPRO, frequency = 12, start = c(1919, 1))
assets <- as.xts(WALCL)
assets <- to.monthly(assets, indexAt = 'yearmon', drop.time = TRUE)  # aggregate the weekly Fed balance sheet series to monthly
assets <- ts(assets[, 4], frequency = 12, start = c(2002, 12))       # keep the monthly close column
assets <- window(assets, start = c(2008, 9), end = c(2020, 1))
CPI <- window(CPI, start = c(2008, 9), end = c(2020, 1))
output <- window(output, start = c(2008, 9), end = c(2020, 1))
loutput <- log(output)
lCPI <- log(CPI)
data_0 <- cbind(loutput, lCPI, assets)
plot(data_0)
VAR_data_1 <- ts.intersect(diff(loutput), diff(lCPI), diff(assets, differences = 2))  # log first differences; assets twice-differenced in levels
VAR_reduced <- VAR(VAR_data_1, p = 1, type = "both")
summary(VAR_reduced)
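A quick way to check whether the data itself is driving the near-singularity (a diagnostic sketch, assuming the VAR_data_1 object built above): look at the scale and pairwise correlation of the differenced series, since columns on wildly different scales or near-collinear columns both push the residual covariance matrix towards singularity.
apply(VAR_data_1, 2, sd)   # column standard deviations: huge scale differences are a warning sign
cor(VAR_data_1)            # entries near +/-1 indicate near-collinearity
kappa(cov(VAR_data_1))     # a very large condition number means the covariance matrix is numerically singular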
Related
I am building a vector autoregressive model and got stuck on a problem.
My regressors are some sentiment and financial values. To test robustness, I wanted to add several other economic variables to the model.
The problem I encounter is that when I add a fourth regressor, I only get an error message in R.
I can use any combination of three, but as soon as I add a fourth one, it won't work (see the error message below).
My Code:
library(dplyr)
library(readr)
library(tidyverse)
library(urca)
library(vars)
library(tseries)
library(forecast)
library(stargazer)
tr <- ts(TR$tr, start = c(2011, 1), frequency = 4) #4 because quarterly
Index1 <- ts(Index1$Value, start = c(2011, 1), frequency = 4)
Index2 <- ts(Index2$Value, start = c(2011, 1), frequency = 4)
Control1 <- ts(CPI$Value, start = c(2011, 1), frequency = 4)
Control2 <- ts(Spread$Value, start = c(2011, 1), frequency = 4)
# for finding optimal lags
tr.bv <- cbind(TR$tr, Index1$Value, Index2$Value, CPI$Value, Spread$Value)
colnames(tr.bv) <- c("Total Return", "Index1", "Index2", "CPI", "Spread")
lagselect <- VARselect(tr.bv, lag.max = 10, type = "const")
lagselect$selection
# Building the model
Model <- VAR(tr.bv, p = 10, type = "const", season = NULL, exog = NULL)
summary(Model)
The error message I get:
Error in solve.default(Sigma) :
Lapack routine dgesv: system is exactly singular: U[1,1] = 0
In addition: Warning message:
In cor(resids) : Standarddeviation equals zero
I built the same model in Python using the statsmodels VAR function; there I only get 0s as p-values or NaNs...
Hopefully someone can help me.
The problem likely lies with your data and the final parameter you have added to your model (possibly multicollinearity or overfitting). A reproducible example would be helpful here.
See: https://stats.stackexchange.com/questions/446707/var-model-error-in-solve-defaultsigma-system-is-computationally-singular-r
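With five variables and p = 10 lags, for example, each VAR equation estimates 5 * 10 + 1 = 51 coefficients, which can easily exceed the number of quarterly observations available since 2011; on top of that, exact or near collinearity among the regressors will also make the system singular. A minimal check, assuming the tr.bv matrix built in the question:
round(cor(tr.bv, use = "pairwise.complete.obs"), 3)  # entries near +/-1 flag near-duplicate columns
qr(na.omit(tr.bv))$rank                              # a rank below ncol(tr.bv) means exact collinearity
nrow(na.omit(tr.bv))                                 # compare with the 51 coefficients per equation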
Can someone help me with my code? I have code that calculates a lot of logistic regressions at the same time. I also used this code for an lm model, where it worked quite well; however, when I tried to adapt it to a glm model, it no longer works.
Output_logistic <- data.frame()
glm_output = glm(test[,1] ~ test_2[,1], family = binomial ('logit'))
Output_2 <- data.frame(R_spuared = summary(glm_output)$r.squared)
Output_2$P_value <- summary(glm_output)$coefficients[2,4]
Output_2$Variabele <- paste(colnames(test))
Output_2$Variabele_1 <- paste(colnames(test_2))
Output_2$N_NA <- length(glm_output$na.action)
Output_2$df <- paste(glm_output$df.residual)
Output_logistic <- rbind(Output_logistic,Output_2)
Running this code gives the following error:
Error in `$<-.data.frame`(`*tmp*`, "P_value", value = 9.66218350888067e-05) :
  replacement has 1 row, data has 0
Does anybody know what I have to adapt so that the code will work?
Thanks in advance
Your Output_2 is an empty data.frame (it has no rows) because summary(glm_output)$r.squared does not exist: glm does not report this value.
If you need the R-squared value you'll have to calculate it yourself. But to fix the error you can simply change your code to construct the data frame from the existing data in the summary:
Output_2 <- data.frame(
  P_value = summary(glm_output)$coefficients[2, 4],
  Variable = colnames(test),
  # … etc.
)
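If an R-squared-like number is still wanted for the logistic model, one common option is McFadden's pseudo R-squared, which can be computed from the deviances the fitted glm object already stores (a minimal sketch using the glm_output fit from the question):
# McFadden's pseudo R-squared: 1 - residual deviance / null deviance
# (not the same quantity as the R-squared reported by summary.lm)
Output_2$Pseudo_R_squared <- 1 - glm_output$deviance / glm_output$null.deviance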
New to Stack Overflow. I'm working on a project with NHIS data, but I cannot get the svyglm function to work even for a simple, unadjusted logistic regression with a binary predictor and a binary outcome variable (ultimately I'd like to use multiple categorical predictors, but one step at a time).
El_under_glm<-svyglm(ElUnder~SO2, design=SAMPdesign, subset=NULL, family=binomial(link="logit"), rescale=FALSE, correlation=TRUE)
Error in eval(extras, data, env) :
object '.survey.prob.weights' not found
I changed the variables to 0 and 1 instead:
Under_narm$SO2REG<-ifelse(Under_narm$SO2=="Heterosexual", 0, 1)
Under_narm$ElUnderREG<-ifelse(Under_narm$ElUnder=="No", 0, 1)
But then I get a different issue:
El_under_glm<-svyglm(ElUnderREG~SO2REG, design=SAMPdesign, subset=NULL, family=binomial(link="logit"), rescale=FALSE, correlation=TRUE)
Error in svyglm.survey.design(ElUnderREG ~ SO2REG, design = SAMPdesign, :
all variables must be in design= argument
This is the design I'm using to account for the weights -- I'm pretty sure it's correct:
SAMPdesign=svydesign(data=Under_narm, id= ~NHISPID, weight= ~SAMPWEIGHT)
Any and all assistance appreciated! I've got a good grasp of stats but am a slow coder. Let me know if I can provide any other information.
Using some make-believe sample data, I was able to get your model to run by setting rescale = TRUE. The documentation states:
Rescaling of weights, to improve numerical stability. The default
rescales weights to sum to the sample size. Use FALSE to not rescale
weights.
So one solution may simply be to set rescale = TRUE.
library(survey)

# sample data
Under_narm <- data.frame(SO2 = factor(rep(1:2, 1000)),
                         ElUnder = sample(0:1, 1000, replace = TRUE),
                         NHISPID = paste0("id", 1:1000),
                         SAMPWEIGHT = sample(c(0.5, 2), 1000, replace = TRUE))

# with 'rescale' = TRUE
SAMPdesign <- svydesign(ids = ~NHISPID,
                        data = Under_narm,
                        weights = ~SAMPWEIGHT)

El_under_glm <- svyglm(formula = ElUnder ~ SO2,
                       design = SAMPdesign,
                       family = quasibinomial(),  # this family avoids warnings
                       rescale = TRUE)            # weights rescaled to sum to the sample size

summary(El_under_glm, correlation = TRUE)  # use correlation with summary()
Otherwise, looking at the code for this function's method with survey:::svyglm.survey.design, it seems like there may be a bug. I could be wrong, but by my reading, when rescale is FALSE, .survey.prob.weights does not appear to get assigned a value.
if (is.null(g$weights))
g$weights <- quote(.survey.prob.weights)
else g$weights <- bquote(.survey.prob.weights * .(g$weights)) # bug?
g$data <- quote(data)
g[[1]] <- quote(glm)
if (rescale)
data$.survey.prob.weights <- (1/design$prob)/mean(1/design$prob)
There may be a workaround if you assign a vector of numeric values to .survey.prob.weights in the global environment. I have no idea what these values should be, but your error goes away if you do something like the following. (.survey.prob.weights needs to be double the length of the data.)
SAMPdesign <- svydesign(ids = ~NHISPID,
                        data = Under_narm,
                        weights = ~SAMPWEIGHT)

.survey.prob.weights <- rep(1, 2000)

El_under_glm <- svyglm(formula = ElUnder ~ SO2,
                       design = SAMPdesign,
                       family = quasibinomial(),
                       rescale = FALSE)

summary(El_under_glm, correlation = TRUE)
I have a list of pre-filtered genomic regions (based on previous GWAS and some enrichment analysis performed on GSEA) and I am looking for interesting gene-gene interactions.
I have a binary phenotype, so of course I have used glm=T in the model.
I have followed in detail the WISH-R guide - https://github.com/QSG-Group/WISH - and generated the correlations matrix without issues.
I am now struggling to use the generate.modules function, so I am writing here for some help.
I have tried several times to run generate.modules(correlations, values="Coefficients", thread=2).
Before that, as suggested, I also ran:
correlations$Coefficients[(is.na(correlations$Coefficients))]<-0
correlations$Pvalues[(is.na(correlations$Pvalues))]<-1
This is my R code:
library(WISH)
library(data.table)
ped <- fread("D:/Dati/GWAS_ITALIAN_PBC_Mike_files/EPISTASI/epistasi_all SNPs_all_TF/file_epistasi_per_wish/all_snp_tf_recoded.ped", data.table=F)
tped <- fread("D:/Dati/GWAS_ITALIAN_PBC_Mike_files/EPISTASI/epistasi_all SNPs_all_TF/file_epistasi_per_wish/all_snp_tf_recoded.tped", data.table=F)
pval <- fread("D:/Dati/GWAS_ITALIAN_PBC_Mike_files/EPISTASI/epistasi_all SNPs_all_TF/file_epistasi_per_wish/ALL_SNP_TF_p.txt", data.table=F)
id <- fread("D:/Dati/GWAS_ITALIAN_PBC_Mike_files/EPISTASI/epistasi_all SNPs_all_TF/file_epistasi_per_wish/ALL_SNP_TF_id.txt", data.table=F)
genotype <-generate.genotype(ped,tped,snp.id=id, pvalue=0.005,id.select=NULL,gwas.p=pval,major.freq=0.95,fast.read=T)
LD_genotype<-LD_blocks(genotype)
genotype <- LD_genotype$genotype
pheno<-fread("D:/Dati/GWAS_ITALIAN_PBC_Mike_files/EPISTASI/epistasi_all SNPs_all_TF/file_epistasi_per_wish/pheno.txt",data.table=F)
pheno<-ifelse(pheno=="1","0","1")
pheno<-as.numeric(pheno)
correlations<-epistatic.correlation(pheno, genotype,threads = 2 ,test=F,glm=T)
genome.interaction(tped,correlations,quantile = 0.9)
correlations$Coefficients[(is.na(correlations$Coefficients))]<-0
correlations$Pvalues[(is.na(correlations$Pvalues))]<-1
generate.modules(correlations,values="Coefficients",thread=2)
I get the following error:
Error in seq.default(from = min(k), to = max(k), length = nBreaks + 1) :
'from' must be a finite number.
Do you have any hints for debugging this error?
What is the main issue here?
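One thing that may be worth checking, judging only from the error text: seq() complains that 'from' is not finite, which happens when min() is taken over values that are Inf, and is.na() catches NA and NaN but not Inf. A small check, assuming the correlations object from the code above:
sum(!is.finite(correlations$Coefficients))   # Inf values survive the is.na() replacement above
sum(!is.finite(correlations$Pvalues))
# if any show up, they can be recoded the same way as the NAs:
correlations$Coefficients[!is.finite(correlations$Coefficients)] <- 0
correlations$Pvalues[!is.finite(correlations$Pvalues)] <- 1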
I am trying to estimate a MIDAS regression on a subsample of my data using the window function. However, when I do this, the midas_r() function throws back the error:
Error in prepmidas_r(y, X, mt, Zenv, cl, args, start, Ofunction, weight_gradients, :
Starting values for weight parameters must be supplied
Here is my code:
install.packages("midasr")
library(midasr)
yrs <- 10
x <- ts(rnorm(12*yrs),start=c(1900,1),frequency = 12)
y <- ts(rnorm(yrs),start=c(1900,1))
midas_r(y~fmls(x,3,12,nealmon),start=list(x=rep(0,3)))
x_est <- window(x,end=c(1910,0))
y_est <- window(y,end=(1910))
midas_r(y_est~fmls(x_est,3,12,nealmon)+1,start=list(x=rep(0,3)))
Does anyone know what the issue is? Thanks in advance!
The issue is in list(x = rep(0, 3)). The list does indeed have to be named, but the name needs to coincide with the variable name as it appears in the formula. Hence,
midas_r(y_est ~ fmls(x_est, 3, 12, nealmon), start = list(x_est = rep(0, 3)))
works.