I'm trying to use esttab to output regression results in R. However, every time I run it I get an error:
Error in FUN(X[[i]], ...) : variable names are limited to 10000 bytes
Any ideas how to solve it? My code is below:
reg <- lm(y ~ ln_gdp + diffxln_gdp + diff + year, data=df)
eststo(reg)
esttab(store=reg)
The input data consists of approx. 25,000 observations, all coded as numeric. I can share more information if it is deemed relevant, but I don't know what that would be right now.
Thanks!
I'm working on a project for my Economics capstone with a very large data set. This is my first time ever programming and I had to merge multiple data sets, 16 in total, with anywhere between 30,000-130,000 observations. I did experience an issue merging the data sets since certain data sets contained more columns than others, but I was able to address it using "rbind.fill". Afterwards, I attempted to run a regression but I encountered an error. The error was
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) :
0 (non-NA) cases
Here is the original code for the regression
ols_reg_mortcur1 <- lm(MORTCUR ~ EST_ST + WEEK + TBIRTH_YEAR + EGENDER + RHISPANIC +
RRACE + EEDUC + MS + THHLD_NUMPER + THHLD_NUMKID + THHLD_NUMADLT + WRKLOSS + ANYWORK +
KINDWORK + RSNNOWRK + UNEMPPAY + INCOME + TENURE + MORTCONF, data = set_up_weeks15st)
I googled the error for possible solutions and found suggestions like "na.omit", "na.exclude", etc. I tried these to no avail. This leads me to think I didn't implement them correctly, or perhaps something went wrong with the merge itself. While I was cleaning the data I set unknown or missing values, listed as -88 or -99 in the data sets, to NA since I had to create a summary stats table. I'll attach my R doc. I do apologize for the length of the attached code below; I wasn't sure whether to attach just the sections leading up to the regression or to include other lines.
Based on the error message, 0 (non-NA) cases, the likely reason is that you have at least one NA in each of your rows. (This is easy to check by using na.omit(set_up_weeks15st), which should return zero rows.)
In this case, setting na.action to na.omit or na.exclude is not going to help.
Try to find columns with most NA's and remove them, or impute the missing values using an appropriate method.
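As a rough sketch of that check, the base R snippet below counts complete rows and ranks columns by their share of missing values. The data frame df and its columns are made up for illustration; substitute your own data frame (for example set_up_weeks15st).

```r
# Hypothetical toy data in which every row has at least one NA
df <- data.frame(
  y  = c(1, 2, NA, 4),
  x1 = c(NA, 5, 6, 7),
  x2 = c(3, NA, 1, NA)
)

# Number of rows with no missing values; lm() has nothing to fit if this is 0
n_complete <- sum(complete.cases(df))

# Count of NAs in each column
na_per_column <- colSums(is.na(df))

# Share of NAs per column, worst offenders first
na_share <- sort(colMeans(is.na(df)), decreasing = TRUE)

print(n_complete)
print(na_share)
```

Columns at the top of na_share are the first candidates to drop or impute. Only the variables that appear in the regression formula matter, so it may be worth running the check on that subset of columns.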
I've been trying to run a Fama Macbeth regression using the pmg function for my data "Dev_Panel" but I keep getting this error message:
Error in pmg(BooktoMarket ~ Returns + Profitability + BEtoMEpersistence, :
Insufficient number of time periods
I've read in other posts on here that this could be due to NAs in the data. But I've already removed these from the panel.
Additionally, I've used the pmg function on the data frame "Em_Panel" for which I have undertaken the exact same data cleaning measures as for the "Dev_Panel". The regression for this panel worked, but it only produces a coefficient for the intercept. The other coefficients are NA.
Here's the code I used for the Em_Panel:
require(foreign)
require(plm)
require(lmtest)
Em_Panel <- read.csv2("Em_Panel.csv", na="NA")
FMR_Em <- pmg(BooktoMarket~Returns+Profitability+BEtoMEpersistence, Em_Panel, index = c("companyID", "years"))
And here's the code for the Dev_Panel:
Em_Panel <- read.csv2("Dev_Panel.csv", na="NA")
FMR_Dev <- pmg(BooktoMarket~Returns+Profitability+BEtoMEpersistence, Dev_Panel, index = c("companyID", "years"))
Since this seemingly is a problem concerning my data I will gladly provide it:
http://www.filedropper.com/empanel
http://www.filedropper.com/devpanel
Thank you so much for any help!!!
Edit
After switching the arguments as suggested the error is now produced by the Dev_Panel and not the Em_Panel.
Also the regression for the Em_Panel now only provides a coefficient for the intercept. The other coefficients are NA.
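One way to look into an "Insufficient number of time periods" message is to count how many distinct periods each company actually has, before and after dropping incomplete rows. The sketch below uses base R only; the toy panel and its values are made up, and only the column names companyID, years and Returns follow the question. It does not reproduce the exact condition that pmg checks internally.

```r
# Hypothetical toy panel; values are invented for illustration
panel <- data.frame(
  companyID = c(1, 1, 1, 2, 2, 3),
  years     = c(2001, 2002, 2003, 2001, 2002, 2001),
  Returns   = c(0.1, NA, 0.3, 0.2, 0.1, 0.4)
)

# Distinct time periods observed for each company
periods_per_company <- tapply(panel$years, panel$companyID,
                              function(x) length(unique(x)))

# The same count after removing rows that contain any NA
complete <- panel[complete.cases(panel), ]
periods_complete <- tapply(complete$years, complete$companyID,
                           function(x) length(unique(x)))

print(periods_per_company)
print(periods_complete)
```

If many companies end up with very few periods, or the counts look swapped (few companies, many "periods"), that would suggest the index columns are in the wrong order or the panel is thinner than expected.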
I'm trying to carry out covariate balancing using the ebal package. The basic code is:
W1 <- weightit(Conformidad ~ SexoCon + DurPetFiscPrisión1 +
Edad + HojaHistPen + NacionCon + AnteVivos +
TipoAbog + Reincidencia + Habitualidad + Delitos,
data = Suspension1,
method = "ebal", estimand = "ATT")
I then want to check the balance using the summary function:
summary(W1)
This originally worked fine but now I get the error message:
Error in rep(" ", spaces1) : invalid 'times' argument
It's the same dataset and same code, except I changed some of the covariates. But now even when I go back to original covariates I get the same error. Any ideas would be very appreciated!!
I'm the author of WeightIt. That looks like a bug. I'll take a look at it. Are you using the most updated version of WeightIt?
Also, summary() doesn't assess balance. To do that, you need to use cobalt::bal.tab(). summary() summarizes the distribution of the weights, which is less critical than examining balance. bal.tab() displays the effective sample size as well, which is probably the most important statistic produced by summary().
I encountered the same error message. This happens when the treatment variable is coded as factor or character, but not as numeric in weightit.
To make summary() work, you need to use 1 and 0.
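A minimal sketch of that recoding in base R follows. The vector treat_factor and its levels "No" and "Yes" are hypothetical stand-ins for a treatment column such as Conformidad; the actual level names in your data will differ.

```r
# Hypothetical treatment column stored as a factor
treat_factor <- factor(c("No", "Yes", "Yes", "No"))

# Recode to numeric 0/1: 1 where the level is "Yes", 0 otherwise
treat_numeric <- as.numeric(treat_factor == "Yes")

# Caution: converting the factor directly gives the internal level codes (1/2),
# not 0/1, so it is not a substitute for the comparison above
level_codes <- as.numeric(treat_factor)

print(treat_numeric)
print(level_codes)
```

The recoded vector can then be assigned back to the data frame before calling weightit().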
I am a complete beginner at R and don't have much time to complete this analysis.
I need to run propensity score matching. I am using RStudio and have:
Uploaded my dataset, which is called 'R' and was saved on my desktop
Installed and loaded the MatchIt package
My dataset has the following headings:
BA (my grouping variable, which indicates whether someone is on BA or not; 0=off, 1=on),
then age, sex, timesincediagnosis, TVS, and tscore, which are my matching variables.
I have adapted the following code which I have found online
m.nn <- matchit(ba ~ age + sex + timesincediagnosis + TVS + tscore,
data = R, method= " nearest", ratio = 1)
summary(m.nn)
I am getting the following errors:
Error in summary(m.nn) : object 'm.nn' not found
Error in matchit(ba ~ age + sex + timesincediagnosis + TVS + tscore,
data = R, : nearestnot supported.
I would really appreciate any help with why I am getting the errors or how I can change my code.
Thank you!
Credit goes to @MrFlick for noticing this, but the problem is that " nearest" is not an acceptable value to be passed to method. What you want is "nearest" (without the leading space in the string). (Note that the default method is nearest neighbor matching, so you can omit the method argument entirely if this is what you want to do.)
The error printed first (Error in summary(m.nn) : object 'm.nn' not found) occurs because R didn't create the m.nn object because of the other error.
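Both points can be illustrated in base R without MatchIt. In the sketch below, the names method_typed and fit_demo_obj are hypothetical, and stop() merely stands in for a function call that fails.

```r
# A leading space makes the string different from the accepted value
method_typed <- " nearest"
same_as_nearest <- identical(method_typed, "nearest")   # FALSE

# trimws() strips leading and trailing whitespace
method_clean <- trimws(method_typed)

# When the right-hand side of an assignment errors, no object is created
outcome <- tryCatch({
  fit_demo_obj <- stop("method not supported")  # stand-in for the failing call
  "created"
}, error = function(e) "failed")

obj_exists <- exists("fit_demo_obj")

print(same_as_nearest)
print(method_clean)
print(outcome)
print(obj_exists)
```

This is why fixing the first real error (the method string) also makes the "object not found" error disappear.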
I'm new to data analysis, and I have a couple questions about using lm() in R to create a linear regression model of my data.
My data looks like this:
testID userID timeSpentStudying testGrade
12345 007 10 90
09876 008 0 75
And my model:
model <- lm(formula = data$testGrade ~ timeSpentStudying, data = data)
I'm getting the following warning (twice) in RStudio, across just under 60 rows of data:
Warning messages:
1: In sqrt(crit * p * (1 - hh)/hh) : NaNs produced
2: In sqrt(crit * p * (1 - hh)/hh) : NaNs produced
My question is, does the problem have to do with the data containing many instances of zero being the value, such as above under the 'timeSpentStudying' column? If so, how do I handle that? Shouldn't lm() be able to handle values of zero, especially if that would give significance to the data itself?
Thanks!
So far I have been unable to replicate this, e.g.:
dd <- data.frame(y=rnorm(1000),x=c(rep(0,990),1:10))
model <- lm(y~x, data = dd)
summary(model)
Searching the R code base for the code listed in your error and tracing back indicates that the relevant lines are in plot.lm, the function that plots diagnostics, and that the problem is that you are somehow getting a value >1 for the leverage or "hat values" of one of your data points. However, I can't see how you could be achieving that. Data would make this much clearer!
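To inspect the leverages directly, something like the following sketch could be run on the fitted model. It reuses the simulated data from above (with a seed added so the result is reproducible) rather than the asker's data, which is not available. In exact arithmetic, hat values lie between 0 and 1 and sum to the number of estimated coefficients.

```r
set.seed(1)  # seed added here only for reproducibility

# Simulated data as in the replication attempt: mostly zeros in x
dd <- data.frame(y = rnorm(1000), x = c(rep(0, 990), 1:10))
model <- lm(y ~ x, data = dd)

# Leverage ("hat value") of each observation
hv <- hatvalues(model)

# Range of the leverages; anything at or above 1 would be suspicious
hv_range <- range(hv)

# Should equal the number of coefficients (intercept + slope = 2)
hv_sum <- sum(hv)

# Indices of observations with extreme leverage, if any
extreme <- which(hv > 0.99)

print(hv_range)
print(hv_sum)
print(extreme)
```

Running the same three checks on the real model would show whether any observation has a leverage at or very near 1, which is where the diagnostic plot is most likely to produce NaNs.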