Chinese characters in summary that I cannot knit to pdf - r

```
model2 = lm(1/glyhb~ gender + age + gender:age ,data = diabetes)
summary(model2)
```
Call:
lm(formula = 1/glyhb ~ gender + age + gender:age, data = diabetes)
Residuals:
Min 1Q Median 3Q Max
-0.149383 -0.019681 0.002455 0.029739 0.163164
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.2498158 0.0120415 20.746 < 2e-16 ***
genderfemale 0.0123141 0.0151608 0.812 0.417
age -0.0011384 0.0002363 -4.817 2.09e-06 ***
genderfemale:age -0.0002195 0.0003030 -0.724 0.469
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.04778 on 386 degrees of freedom
(因为不存在,13个观察量被删除了)
Multiple R-squared: 0.164, Adjusted R-squared: 0.1575
F-statistic: 25.24 on 3 and 386 DF, p-value: 6.264e-15
Error message:
! LaTeX Error: Unicode character 因 (U+56E0)
not set up for use with LaTeX.
Can anyone help me to change this line under residual standard error to English please....
Thanks in advance!

output:
pdf_document:
latex_engine: xelatex
works!

Related

lm(summary) : essentially perfect fit: summary may be unreliable error for standardized datas

I am a beginner in R and statistics in general. I am trying to build a regression model with 4 variables (one of them is nominal data, with 3 alt categories). I think i managed to build a model with my raw data. But I wanted to standardize my data-set and the lm essentially perfect fit: summary may be unreliable message.
This is the summary of the model with raw data;
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.0713 5.9131 -0.350 0.727
Vdownload 8.6046 0.5286 16.279 < 2e-16 ***
DownloadDegisim 2.8854 0.6822 4.229 4.25e-05 ***
Vupload -4.2877 0.5418 -7.914 7.32e-13 ***
Saglayici2 -8.2084 0.6043 -13.583 < 2e-16 ***
Saglayici3 -9.8869 0.5944 -16.634 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.885 on 138 degrees of freedom
Multiple R-squared: 0.8993, Adjusted R-squared: 0.8956
F-statistic: 246.5 on 5 and 138 DF, p-value: < 2.2e-16
I wrote these codes to standardize my data
memnuniyet_scaled <-scale(Vdownload, center = TRUE, scale = TRUE)
Vdownload_scaled <-scale(Vdownload, center = TRUE, scale = TRUE)
Vupload_scaled <-scale(Vupload, center = TRUE, scale = TRUE)
DownloadD_scaled <- scale(DownloadDegisim, center = TRUE, scale = TRUE)
result<-lm(memnuniyet_scaled~Vdownload_scaled+DownloadD_scaled+Vupload_scaled+Saglayıcı)
summary(result)
And this is the summary of my standardized data
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.079e-17 5.493e-17 9.250e-01 0.357
Vdownload_scaled 1.000e+00 6.667e-17 1.500e+16 <2e-16 ***
DownloadD_scaled -4.591e-17 8.189e-17 -5.610e-01 0.576
Vupload_scaled 9.476e-18 6.337e-17 1.500e-01 0.881
Saglayici2 -6.523e-17 7.854e-17 -8.300e-01 0.408
Saglayici3 -8.669e-17 7.725e-17 -1.122e+00 0.264
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 3.75e-16 on 138 degrees of freedom
Multiple R-squared: 1, Adjusted R-squared: 1
F-statistic: 2.034e+32 on 5 and 138 DF, p-value: < 2.2e-16
I do know that R value should not have changed with standardization and have no idea what I did wrong.

Small sample (20-25 observations) - Robust standard errors (Newey-West) do not change coefficients/standard errors. Is this normal?

I am running a simple regression (OLS)
> lm_1 <- lm(Dependent_variable_1 ~ Independent_variable_1, data = data_1)
> summary(lm_1)
Call:
lm(formula = Dependent_variable_1 ~ Independent_variable_1,
data = data_1)
Residuals:
Min 1Q Median 3Q Max
-143187 -34084 -4990 37524 136293
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 330853 13016 25.418 < 2e-16 ***
`GDP YoY% - Base` 3164631 689599 4.589 0.000118 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 66160 on 24 degrees of freedom
(4 observations deleted due to missingness)
Multiple R-squared: 0.4674, Adjusted R-squared: 0.4452
F-statistic: 21.06 on 1 and 24 DF, p-value: 0.0001181
The autocorrelation and heteroskedasticity tests follow:
> dwtest(lm_1,alternative="two.sided")
Durbin-Watson test
data: lm_1
DW = 0.93914, p-value = 0.001591
alternative hypothesis: true autocorrelation is not 0
> bptest(lm_1)
studentized Breusch-Pagan test
data: lm_1
BP = 9.261, df = 1, p-value = 0.002341
then I run a robust regression for autocorrelation and heteroskedasticity (HAC - Newey-West):
> coeftest(lm_1, vocv=NeweyWest(lm_1,lag=2, prewhite=FALSE))
t test of coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 330853 13016 25.4185 < 2.2e-16 ***
Independent_variable_1 3164631 689599 4.5891 0.0001181 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
and I get the same results for coefficients / standard errors.
Is this normal? Is this due to the small sample size?

Separate output when results='hold' in knitr

Is there a way to separate or section the output in a {r results='hold'} block in knitr without needing to manually insert a print('--------') command in between or the like? I want to avoid such extra print lines is that it makes the code much harder to read.
For example, you need to be quite trained to see where the output from one line stops and the next begins in the output from this:
```{r, results='hold'}
summary(lm(Sepal.Length ~ Species, iris))
#print('-------------') # Not a solution
summary(aov(Sepal.Length ~ Species, iris))
```
Output:
Call:
lm(formula = Sepal.Length ~ Species, data = iris)
Residuals:
Min 1Q Median 3Q Max
-1.6880 -0.3285 -0.0060 0.3120 1.3120
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.0060 0.0728 68.762 < 2e-16 ***
Speciesversicolor 0.9300 0.1030 9.033 8.77e-16 ***
Speciesvirginica 1.5820 0.1030 15.366 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.5148 on 147 degrees of freedom
Multiple R-squared: 0.6187, Adjusted R-squared: 0.6135
F-statistic: 119.3 on 2 and 147 DF, p-value: < 2.2e-16
Df Sum Sq Mean Sq F value Pr(>F)
Species 2 63.21 31.606 119.3 <2e-16 ***
Residuals 147 38.96 0.265
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Maybe some knitr option or a more general R printing option?

Display category labels in regression output in R

Using the this R linear modelling tutorial I'm finding the format of the model output is annoyingly different to that provided in the text and I can't for the life of me work out why. For example, here is the code:
pitch = c(233,204,242,130,112,142)
sex = c(rep("female",3),rep("male",3))
my.df = data.frame(sex,pitch)
xmdl = lm(pitch ~ sex, my.df)
summary(xmdl)
Here is the output I get:
Call:
lm(formula = pitch ~ sex, data = my.df)
Residuals:
1 2 3 4 5 6
6.667 -22.333 15.667 2.000 -16.000 14.000
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 177.167 7.201 24.601 1.62e-05 ***
sex1 49.167 7.201 6.827 0.00241 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 17.64 on 4 degrees of freedom
Multiple R-squared: 0.921, Adjusted R-squared: 0.9012
F-statistic: 46.61 on 1 and 4 DF, p-value: 0.002407
In the tutorial the line for Coefficients has "sexmale" instead of "sex1". What setting do I need to activate to achieve this?

R- How to save the console data into a row/matrix or data frame for future use?

I'm performing the multiple regression to find the best model to predict the prices. See as following for the output in the R console.
I'd like to store the first column (Estimates) into a row/matrix or data frame for future use such as using R shiny to deploy on the web.
*(Price = 698.8+0.116*voltage-70.72*VendorCHICONY
-36.6*VendorDELTA-66.8*VendorLITEON-14.86*H)*
Can somebody kindly advise?? Thanks in advance.
Call:
lm(formula = Price ~ Voltage + Vendor + H, data = PSU2)
Residuals:
Min 1Q Median 3Q Max
-10.9950 -0.6251 0.0000 3.0134 11.0360
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 698.821309 276.240098 2.530 0.0280 *
Voltage 0.116958 0.005126 22.818 1.29e-10 ***
VendorCHICONY -70.721088 9.308563 -7.597 1.06e-05 ***
VendorDELTA -36.639685 5.866688 -6.245 6.30e-05 ***
VendorLITEON -66.796531 6.120925 -10.913 3.07e-07 ***
H -14.869478 6.897259 -2.156 0.0541 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 7.307 on 11 degrees of freedom
Multiple R-squared: 0.9861, Adjusted R-squared: 0.9799
F-statistic: 156.6 on 5 and 11 DF, p-value: 7.766e-10
Use coef on your lm output.
e.g.
m <- lm(Sepal.Length ~ Sepal.Width + Species, iris)
summary(m)
# Call:
# lm(formula = Sepal.Length ~ Sepal.Width + Species, data = iris)
# Residuals:
# Min 1Q Median 3Q Max
# -1.30711 -0.25713 -0.05325 0.19542 1.41253
#
# Coefficients:
# Estimate Std. Error t value Pr(>|t|)
# (Intercept) 2.2514 0.3698 6.089 9.57e-09 ***
# Sepal.Width 0.8036 0.1063 7.557 4.19e-12 ***
# Speciesversicolor 1.4587 0.1121 13.012 < 2e-16 ***
# Speciesvirginica 1.9468 0.1000 19.465 < 2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
# Residual standard error: 0.438 on 146 degrees of freedom
# Multiple R-squared: 0.7259, Adjusted R-squared: 0.7203
# F-statistic: 128.9 on 3 and 146 DF, p-value: < 2.2e-16
coef(m)
# (Intercept) Sepal.Width Speciesversicolor Speciesvirginica
# 2.2513932 0.8035609 1.4587431 1.9468166
See also names(m) which shows you some things you can extract, e.g. m$residuals (or equivalently, resid(m)).
And also methods(class='lm') will show you some other functions that work on a lm.
> methods(class='lm')
[1] add1 alias anova case.names coerce confint cooks.distance deviance dfbeta dfbetas drop1 dummy.coef effects extractAIC family
[16] formula hatvalues influence initialize kappa labels logLik model.frame model.matrix nobs plot predict print proj qr
[31] residuals rstandard rstudent show simulate slotsFromS3 summary variable.names vcov
(oddly, 'coef' is not in there? ah well)
Besides, I'd like to know if there is command to show the "residual percentage"
=(actual value-fitted value)/actual value"; currently the "residuals()" command can
only show the below info but I need the percentage instead.
residuals(fit3ab)
1 2 3 4 5 6
-5.625491e-01 -5.625491e-01 7.676578e-15 -8.293815e+00 -5.646900e+00 3.443652e+00

Resources