I would like to know the easiest way to export regression output (an splm object) to TeX. stargazer, texreg and latex do not recognize this type of object, so the table has to be built more or less manually. I have already put the coefficients and standard errors in a matrix (each standard error below its coefficient), with one column per regression:
[,1] [,2] [,3] [,4] [,5] [,6]
lambda -0.550153770 -0.606755198 -1.0894505645 0.703821961 -0.560769652 -0.698232106
0.056878033 0.056878033 0.0568780329 0.056878033 0.056878033 0.056878033
rho 0.571742772 0.618236404 0.7365074175 -1.017060680 0.745559212 0.733598140
0.034064728 0.034064728 0.0340647282 0.034064728 0.034064728 0.034064728
However, I don't know how to add the significance stars to this matrix (they are stored in a vector), put parentheses around the standard errors, and finally export the matrix to TeX including the row names.
It's not a perfect answer, but you can piece something together with pander:
library(pander)
smry <- summary(splm_lag)
pander(data.frame(R.Square = smry$rsqr))
pander(smry$CoefTable)
----------
R.Square
----------
0.9161
----------
-----------------------------------------------------------------------
Estimate Std. Error t-value Pr(>|t|)
-------------------------- ---------- ------------ --------- ----------
**lambda** 0.574 0.05808 9.883 4.943e-23
**PC1** -0.06165 0.03741 -1.648 0.09931
**PC2** 0.05824 0.02296 2.537 0.01118
**PC3** 0.02966 0.01937 1.531 0.1258
**PC4** -0.04165 0.02289 -1.82 0.06879
**I(as.numeric(years))** 0.03059 0.00939 3.258 0.001122
-----------------------------------------------------------------------
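For the matrix layout described in the question, the table can also be assembled by hand in base R. The following is only a minimal sketch: the `est` and `se` values are hypothetical stand-ins loosely based on the question, and the stars are derived from two-sided normal p-values, which is an assumption (substitute your own star vector if you already have one).

```r
# Hypothetical inputs: rows are terms, columns are regressions
est <- matrix(c(-0.5502, 0.5717, -0.6068, 0.6182), nrow = 2,
              dimnames = list(c("lambda", "rho"), c("(1)", "(2)")))
se  <- matrix(c(0.0569, 0.0341, 0.0569, 0.0341), nrow = 2,
              dimnames = dimnames(est))

# Assumption: stars from two-sided normal p-values
pval  <- 2 * pnorm(-abs(est / se))
stars <- ifelse(pval < 0.01, "***",
         ifelse(pval < 0.05, "**",
         ifelse(pval < 0.10, "*", "")))

# Coefficients with stars, standard errors in parentheses
coef_txt <- matrix(paste0(formatC(est, format = "f", digits = 3), stars),
                   nrow = nrow(est))
se_txt   <- matrix(paste0("(", formatC(se, format = "f", digits = 3), ")"),
                   nrow = nrow(se))

# Interleave: each coefficient row is followed by its standard-error row
body <- matrix("", nrow = 2 * nrow(est), ncol = ncol(est))
body[seq(1, by = 2, length.out = nrow(est)), ] <- coef_txt
body[seq(2, by = 2, length.out = nrow(est)), ] <- se_txt
labels <- as.vector(rbind(rownames(est), ""))

# Assemble a plain LaTeX tabular, keeping the row names
rows <- paste0(labels, " & ", apply(body, 1, paste, collapse = " & "), " \\\\")
tex <- c(paste0("\\begin{tabular}{l", strrep("c", ncol(est)), "}"),
         "\\hline",
         paste0(" & ", paste(colnames(est), collapse = " & "), " \\\\"),
         "\\hline", rows, "\\hline",
         "\\end{tabular}")
cat(tex, sep = "\n")
```

The printed lines can be pasted into a .tex file or written out with writeLines(tex, "table.tex").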
I am trying to run a meta-analysis using the package "gemtc", and the code performs very well on my test data.
The code is listed:
data <- read.csv("input.txt", sep=",", header=T)
network <- mtc.network(data, description="Example")
result.anohe <- mtc.anohe(network, n.adapt=10000, n.iter=50000)
#The file (problem.txt) is also attached.
However, when I use my real data, I get an error that I have not been able to fix:
Error in decompose.study(study.samples[, colIndexes, drop = FALSE], studies[i]) :
Decomposed variance ill-defined for 1. Most likely the USE did not converge:
[,1] [,2] [,3] [,4]
[1,] 0.000 2478.307 2491.482 2485.044
[2,] 2478.307 0.000 1106288.727 -440067.825
[3,] 2491.482 1106288.727 0.000 -1459996.199
[4,] 2485.044 -440067.825 -1459996.199 0.000
Thanks very much in advance!
The input file causing the problem is attached:
file
I'm looking for nicely formatted markdown output of test results that are produced within a for loop and structured with headings. For example:
df <- data.frame(x = rnorm(1000),
y = rnorm(1000),
z = rnorm(1000))
for (v in c("y","z")) {
cat("##", v, " (model 0)\n")
summary(lm(x~1, df))
cat("##", v, " (model 1)\n")
summary(lm(as.formula(paste0("x~1+",v)), df))
}
whereas the output should be
y (model 0)
Call:
lm(formula = x ~ 1, data = df)
Residuals:
Min 1Q Median 3Q Max
-3.8663 -0.6969 -0.0465 0.6998 3.1648
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.05267 0.03293 -1.6 0.11
Residual standard error: 1.041 on 999 degrees of freedom
y (model 1)
Call:
lm(formula = as.formula(paste0("x~1+", v)), data = df)
Residuals:
Min 1Q Median 3Q Max
-3.8686 -0.6915 -0.0447 0.6921 3.1504
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.05374 0.03297 -1.630 0.103
y -0.02399 0.03189 -0.752 0.452
Residual standard error: 1.042 on 998 degrees of freedom
Multiple R-squared: 0.0005668, Adjusted R-squared: -0.0004346
F-statistic: 0.566 on 1 and 998 DF, p-value: 0.452
z (model 0)
and so on...
Several existing answers cover parts of the question, like here or here, suggesting the results='asis' chunk option in combination with cat(). This one includes headers.
The closest to my request seems to be this question from two years ago. However, even though the suggestions are highly appreciated, some of them are deprecated, like asis_output, and others I can't get to work under general conditions, like the formattable suggestion (e.g. with lm output). As two years have passed since then, I wonder whether there is a modern approach that facilitates what I'm looking for.
Solution Type 1
You could do a capture.output(cat(.)) approach with some lapply-looping. Send the output to a file and use rmarkdown::render(.).
This is the R code producing a *.pdf.
capture.output(cat("---
title: 'Test Results'
author: 'Tom & co.'
date: '11 10 2019'
output: pdf_document
---\n\n```{r setup, include=FALSE}\n
knitr::opts_chunk$set(echo = TRUE)\n
mtcars <- data.frame(mtcars)\n```\n"), file="_RMD/Tom.Rmd") # here of course your own data
lapply(seq(mtcars), function(i)
capture.output(cat("# Model", i, "\n\n```{r chunk", i, ", comment='', echo=FALSE}\n\
print(summary(lm(mpg ~ ", names(mtcars)[i] ,", mtcars)))\n```\n"),
file="_RMD/Tom.Rmd", append=TRUE))
rmarkdown::render("_RMD/Tom.Rmd")
Produces:
Solution Type 2
When we want to automate the output of multiple model summaries in the rmarkdown document itself, we can choose between (1) setting the chunk option results='asis', which renders headlines such as # Model 1 as headings but destroys the code formatting of the output, or (2) not setting it, which keeps the code formatting but prints # Model 1 literally. The solution is to use the option and combine it with inline code formatting that we paste() together in another sapply() loop nested within the sapply() over the models.
In the main sapply() we apply @G.Grothendieck's venerable solution to nicely substitute the Call: line of the output using do.call("lm", list(.)). We need to wrap invisible(.) around it to avoid the unnecessary sapply() output [[1]] [[2]]... of the empty lists produced.
I included a ". " in the cat(), because otherwise the leading white space in lines 6 and 10 of the summary outputs would be lost: ` this` would be rendered as this.
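To see what the do.call() trick buys, here is a small sketch comparing the stored call of a model fitted with a pasted formula against one fitted through do.call(). The variable name `v` and the choice of `cyl` are illustrative only.

```r
v <- "cyl"  # illustrative predictor from mtcars

# Naive version: the Call: line repeats the unevaluated expression
naive <- lm(as.formula(paste0("mpg ~ ", v)), data = mtcars)
print(naive$call)

# do.call() version: the formula is evaluated before lm() records its call,
# and quote(mtcars) keeps the data argument as a name rather than a dump
fo   <- reformulate(v, response = "mpg")
tidy <- do.call("lm", list(fo, quote(mtcars)))
print(tidy$call)
```

Both fits have identical coefficients; only the recorded call differs, which is what ends up in the Call: line of the summary.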
This is the rmarkdown script producing a *.pdf; it can also be executed ordinarily, line by line:
---
title: "Test results"
author: "Tom & co."
date: "15 10 2019"
output: pdf_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
# Overview
This is an example of an ordinary code block with output that had to be included.
```{r mtcars, fig.width=3, fig.height=3}
head(mtcars)
```
# Test results in detail
The test results follow fully automated in detail.
```{r mtcars2, echo=FALSE, message=FALSE, results="asis"}
invisible(sapply(tail(seq(mtcars), -2), function(i) {
fo <- reformulate(names(mtcars)[i], response="mpg")
s <- summary(do.call("lm", list(fo, quote(mtcars))))
cat("\n## Model", i - 2, "\n")
sapply(1:19, function(j)
cat(paste0("`", ". ", capture.output(s)[j]), "` \n"))
cat(" \n")
}))
```
***Note:*** This is a concluding remark to show that we still can do other stuff afterwards.
Produces:
(Note: Site 3 omitted)
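A lighter variant of the same results='asis' idea, shown here only as a sketch, is to emit the heading with cat() and wrap the captured summary in a markdown code fence, so that headings are rendered while the output keeps its monospace layout. The helper name `emit_model` is made up for illustration, and the code is meant to run inside a chunk with results='asis'.

```r
# Stand-in data, mirroring the question
set.seed(42)
df <- data.frame(x = rnorm(100), y = rnorm(100), z = rnorm(100))

fence <- strrep("`", 3)  # markdown code fence, built at run time

# Emit a level-2 heading followed by the model summary in a verbatim block
emit_model <- function(title, fit) {
  cat("\n\n## ", title, "\n\n", sep = "")
  cat(fence, "\n", sep = "")
  cat(capture.output(print(summary(fit))), sep = "\n")
  cat(fence, "\n\n", sep = "")
}

for (v in c("y", "z")) {
  emit_model(paste(v, "(model 0)"), lm(x ~ 1, data = df))
  emit_model(paste(v, "(model 1)"), lm(reformulate(v, response = "x"), data = df))
}
```

Run interactively, the loop simply prints the raw markdown, which makes it easy to check before knitting.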
Context
I was hit by the same need as that of OP when trying to generate multiple plots in a loop, but one of them would apparently crash the graphical device (because of unpredictable bad input) even when called using try() and prevent all the remaining figures from being generated. I needed really independent code blocks, like in the proposed solution.
Solution
I've thought of preprocessing the source file before it was passed to knitr, preferably inside R, and found that the jinjar package was a good candidate. It uses a dynamic template syntax based on the Jinja2 templating engine from Python/Django. There are no syntax clashes with document formats accepted by R Markdown, but the tricky part was integrating it nicely with its machinery.
My hackish solution was to create a wrapper rmarkdown::output_format() that executes some code inside the rmarkdown::render() call environment to process the source file:
preprocess_jinjar <- function(base_format) {
if (is.character(base_format)) {
base_format <- rmarkdown:::create_output_format_function(base_format)
}
function(...) {
# Find the rmarkdown::render() environment.
callers <- sapply(sys.calls(), function(x) deparse(as.list(x)[[1]]))
target <- grep('^(rmarkdown::)?render$', callers)
target <- target[length(target)] # render may be called recursively
render_envir <- sys.frames()[[target]]
# Modify input with jinjar.
input_paths <- evalq(envir = render_envir, expr = {
original_knit_input <- sub('(\\.[[:alnum:]]+)$', '.jinjar\\1', knit_input)
file.rename(knit_input, original_knit_input)
input_lines <- jinjar::render(paste(input_lines, collapse = '\n'))
writeLines(input_lines, knit_input)
normalize_path(c(knit_input, original_knit_input))
})
# Add an on_exit hook to revert the modification.
rmarkdown::output_format(
knitr = NULL,
pandoc = NULL,
on_exit = function() file.rename(input_paths[2], input_paths[1]),
base_format = base_format(...),
)
}
}
Then I can call, for example:
rmarkdown::render('input.Rmd', output_format = preprocess_jinjar('html_document'))
Or, more programmatically, with the output format specified in the source file metadata as usual:
html_jinjar <- preprocess_jinjar('html_document')
rmarkdown::render('input.Rmd')
Here is a minimal example for input.Rmd:
---
output:
html_jinjar:
toc: false
---
{% for n in [1, 2, 3] %}
# Section {{ n }}
```{r block-{{ n }}}
print({{ n }}**2)
```
{% endfor %}
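To preview what the preprocessing step produces without involving rmarkdown at all, the template loop can be rendered directly. This is a sketch that assumes the jinjar package is installed; it uses the same jinjar::render() call as the wrapper above, applied to a template string.

```r
library(jinjar)  # assumed to be installed

# The loop from the example, reduced to the headings
template <- paste(
  "{% for n in [1, 2, 3] %}",
  "# Section {{ n }}",
  "{% endfor %}",
  sep = "\n"
)

# Expand the template into plain text, one heading per iteration
expanded <- jinjar::render(template)
cat(expanded)
```

Each iteration becomes its own literal section, so in the full example every generated chunk is independent of the others.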
Caveats
It's a hack. This code depends on the internal logic of rmarkdown::render() and there are likely edge cases where it won't work. Use at your own risk.
For this solution to work, the output format constructor must be called by render(). Therefore, evaluating it before passing it to render() will fail:
render('input.Rmd', output_format = 'html_jinjar') # works
render('input.Rmd', output_format = html_jinjar) # works
render('input.Rmd', output_format = html_jinjar()) # fails
This second limitation could be circumvented by putting the preprocessing code inside the pre_knit() hook, but then it would only run after other output format hooks, like intermediates_generator() and other pre_knit() hooks of the format.
With the program below, I get the error:
solve.default(Sigma0[cs.idx, cs.idx]) : 'a' is 0-diml
However, when I run the body of the em() function step by step, statement by statement outside the function, solve() raises no error. So I am confused and would appreciate any help. Thank you!
###----------------------------------------------------------------
### Maximum likelihood estimation of mean and covariance
### for multivariate normal distribution by EM algorithm,
### for demonstration purposes only
###----------------------------------------------------------------
em<-function(xdata,mu0,Sigma0){
n<-nrow(xdata)
p<-ncol(xdata)
err<-function(mu0,Sigma0,mu1,Sigma1){
th0<-c(mu0,as.vector(Sigma0))
th1<-c(mu1,as.vector(Sigma1))
sqrt(sum((th0-th1)*(th0-th1)))
}
mu1<-mu0+1
Sigma1<-Sigma0+1
while(err(mu0,Sigma0,mu1,Sigma1)>1e-6){
mu1<-mu0
Sigma1<-Sigma0
zdata<-xdata
Ai<-matrix(0,p,p)
for(i in 1:n){
if(any(is.na(xdata[i,]))){
zi<-xdata[i,]
na.idx<-(1:p)[is.na(zi)]
cs.idx<-(1:p)[-na.idx]
Sigma012<-Sigma0[na.idx,cs.idx,drop=FALSE]
Sigma022.iv<-solve(Sigma0[cs.idx,cs.idx])
zdata[i,na.idx]<-mu0[na.idx]+(Sigma012%*%Sigma022.iv)%*%(zi[cs.idx]-mu0[cs.idx])
Ai[na.idx,na.idx]<-Ai[na.idx,na.idx]+Sigma0[na.idx,na.idx]-Sigma012%*%Sigma022.iv%*%t(Sigma012)
}
}
mu0<-colMeans(zdata)
Sigma0<-(n-1)*cov(zdata)/n+Ai/n
}
return(list(mu=mu0,Sigma=Sigma0))
}
##A simulation example
library(MASS)
set.seed(1200)
p=3
mu<-c(1,0,-1)
n<-1000
Sig <- matrix(c(1, .7, .6, .7, 1, .4, .6, .4, 1), nrow = 3)
triv<-mvrnorm(n,mu,Sig)
misp<-0.2 #MCAR probability
misidx<-matrix(rbinom(3*n,1,misp)==1,nrow=n)
triv[misidx]<-NA
# exclude the cases in which all elements are missing
er<-which(apply(apply(triv,1,is.na),2,sum)==p)
if(length(er)>=1) triv<-triv[-er,]
#initial values
mu0<-rep(0,p)
Sigma0<-diag(p)
system.time(rlt<-em(triv,mu0,Sigma0))
# better initial values
mu0<-apply(triv,2,mean,na.rm=TRUE)
nas<-is.na(triv)
na.num<-apply(nas,2,sum)
zdata<-triv
zdata[nas]<-rep(mu0,na.num)
Sigma0<-cov(zdata)
system.time(rlt<-em(triv,mu0,Sigma0))
Your line er<-which(apply(apply(triv,1,is.na),2,sum)==) is not valid code. As the comment above it states, you want to remove the cases that are entirely NA. If so, er<-which(apply(apply(triv,1,is.na),2,sum)==ncol(triv)) is the right code.
The error itself occurs when a completely missing case is still present in triv when it is passed to em(). At some point cs.idx is empty, so Sigma0[cs.idx,cs.idx] is an empty (0-by-0) matrix, which is what the error message reports.
However, if the correction above is applied, everything runs fine:
> system.time(rlt<-em(triv,mu0,Sigma0))
user system elapsed
0.46 0.00 0.47
> rlt
$mu
[1] 0.963058487 -0.006246175 -1.024260183
$Sigma
[,1] [,2] [,3]
[1,] 0.9721301 0.6603700 0.5549126
[2,] 0.6603700 1.0292379 0.3745184
[3,] 0.5549126 0.3745184 0.9373208
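The failure mode described above can be reproduced in isolation. The sketch below uses a small made-up matrix `x` rather than the simulated data; it first triggers the same solve() error with an empty index and then drops all-NA rows with a logical index, which stays safe when no row needs to be removed.

```r
# An empty index yields a 0-by-0 matrix, which solve() refuses
Sigma0 <- diag(3)
cs.idx <- integer(0)  # what (1:p)[-na.idx] gives when every entry is NA
msg <- tryCatch(solve(Sigma0[cs.idx, cs.idx]),
                error = function(e) conditionMessage(e))
print(msg)

# Made-up data: the second row is entirely missing
x <- rbind(c(1, NA, 3),
           c(NA, NA, NA),
           c(4, 5, 6))

# Flag rows in which every element is NA, then keep the others
all_na  <- rowSums(is.na(x)) == ncol(x)
x_clean <- x[!all_na, , drop = FALSE]
print(x_clean)
```

Unlike x[-which(all_na), ], the logical form needs no length check, because negating an empty which() result would select zero rows.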
I'm working on a text mining/clustering project and am trying to create a table with one row per number of clusters and 6 columns for the following metrics:
max.diameter, min.separation, average.within, average.between, avg.silwidth, dunn.
I need to create the tables for 3 methods: kmeans, pam and hclust.
I was able to create something for kmeans:
dtm0.90Dist = dist(dtm0.90)
foreachcluster = function(k) {
kmeans.result = kmeans(dtm0.90, k);
kmeans.stats = cluster.stats(dtm0.90Dist,kmeans.result$cluster);
c(kmeans.stats$min.separation, kmeans.stats$max.diameter,
kmeans.stats$average.within, kmeans.stats$avearge.between,
kmeans.stats$avg.silwidth, kmeans.stats$dunn)
}
rbind(foreachcluster(2), foreachcluster(3), foreachcluster(4), foreachcluster(5),
foreachcluster(6), foreachcluster(7),foreachcluster(8))
and I get the following output
[,1] [,2] [,3] [,4] [,5]
[1,] 3.162278 30.19934 5.831550 0.5403872 0.10471348
[2,] 2.236068 28.37252 5.006058 0.3923446 0.07881104
[3,] 1.000000 28.37252 4.995478 0.2496066 0.03524537
[4,] 1.000000 26.40076 4.387212 0.2633338 0.03787770
[5,] 1.000000 26.40076 4.353248 0.2681947 0.03787770
[6,] 1.000000 26.40076 4.163757 0.1633954 0.03787770
[7,] 1.000000 26.40076 4.128927 0.2676423 0.03787770
I need similar output for the hclust and pam methods, but for the life of me I can't get the same function to work for either of the two.
OK, so I was able to write the function for hclust:
forhclust=function(k){dfDist = dist(dtm0.90);
hclust.result = hclust(dfDist);
hclust.cluster = (cutree(hclust.result, k));
cluster.stats(dfDist,hclust.cluster);c(cluster.stats$min.separation)}
But I get an error when I run this:
Error in cluster.stats$min.separation :
object of type 'closure' is not subsettable
What I need is for it to print "min.separation" output.
I would really appreciate all the help and perhaps some guidance in understanding why my approach is failing in hclust.
Also, is there a good source that can explain the functioning and application of these methods, step by step, in detail?
Thank You
foreachcluster2 = function(k) {
hc = hclust(mDist, method = "ave")
hresult = cutree(hc, k)
h.stats = cluster.stats(mDist,hresult);
c( max.dia=h.stats$max.diameter,
min.sep=h.stats$min.separation,
avg.wi=h.stats$average.within,
avg.bw=h.stats$average.between,
silwidth=h.stats$avg.silwidth,
dunn=h.stats$dunn)
}
t2 = rbind(foreachcluster2(2), foreachcluster2(3), foreachcluster2(4), foreachcluster2(5),foreachcluster2(6),
foreachcluster2(7), foreachcluster2(8), foreachcluster2(9), foreachcluster2(10),
foreachcluster2(11), foreachcluster2(12),foreachcluster2(13),foreachcluster2(14))
rownames(t2) = 2:14
t2
This should work for hclust. For pam():
pamC <- pam(x=m, k=2)
pamC
pamC$clustering
Use $clustering instead of $cluster; the rest is the same.
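The three methods can share one helper once each is reduced to a vector of cluster labels. What follows is a sketch under assumptions: `m` is random stand-in data, the helper names `assign_clusters` and `stats_table` are made up, and the fpc package (for cluster.stats()) is assumed to be installed alongside cluster. It also sidesteps two pitfalls from the question: cluster.stats$min.separation subsets the function itself because the result of the call was never stored, and a misspelled element such as avearge.between returns NULL, which c() silently drops (that would explain the five-column kmeans table).

```r
library(fpc)      # cluster.stats(); assumed to be installed
library(cluster)  # pam()

set.seed(1)
m <- matrix(rnorm(200), ncol = 4)  # stand-in for the document-term matrix
d <- dist(m)

metrics <- c("max.diameter", "min.separation", "average.within",
             "average.between", "avg.silwidth", "dunn")

# Reduce each method to a plain vector of cluster labels
assign_clusters <- function(method, k) {
  switch(method,
         kmeans = kmeans(m, k)$cluster,
         hclust = cutree(hclust(d, method = "average"), k),
         pam    = pam(d, k)$clustering)
}

# One row per k, one column per metric
stats_table <- function(method, ks = 2:6) {
  out <- t(sapply(ks, function(k) {
    st <- cluster.stats(d, assign_clusters(method, k))  # store the result first
    unlist(st[metrics])
  }))
  rownames(out) <- ks
  out
}

stats_table("hclust")
```

Selecting the metrics by a character vector keeps the column names in the result, so a misspelled name shows up as a missing column name instead of disappearing unnoticed.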
Is there a way to display only part of the R output with knitr? I want to display only part of the summary output from an lm model in a beamer presentation so that it doesn't run off the slide. (As a side note, why is my code not wrapping?) A minimal example is provided below.
\documentclass{beamer}
\begin{document}
\title{My talk}
\author{Me}
\maketitle
\begin{frame}[fragile, t]{Slide 1}
<<setup, include=FALSE, cache=FALSE, tidy=TRUE>>=
options(width=60, digits=5, show.signif.stars=FALSE)
@
<<mod1, tidy=TRUE>>=
data(cars) # load data
g <- lm(dist ~ speed + I(speed^2) + I(speed^3), data = cars)
summary(g)
@
\end{frame}
\end{document}
To be very specific, say that I wanted to return only the following output:
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -19.50505 28.40530 -0.687 0.496
speed 6.80111 6.80113 1.000 0.323
I(speed^2) -0.34966 0.49988 -0.699 0.488
I(speed^3) 0.01025 0.01130 0.907 0.369
Residual standard error: 15.2 on 46 degrees of freedom
Multiple R-squared: 0.6732, Adjusted R-squared: 0.6519
F-statistic: 31.58 on 3 and 46 DF, p-value: 3.074e-11
There's probably a better way to do this, but the following should work for you. It uses capture.output to select what parts of the printed output to display:
\documentclass{beamer}
\begin{document}
\title{My talk}
\author{Me}
\maketitle
\begin{frame}[fragile, t]{Slide 1}
<<setup, include=FALSE, cache=FALSE, tidy=TRUE>>=
options(width=60, digits=5, show.signif.stars=FALSE)
@
<<mod1, tidy=TRUE>>=
data(cars) # load data
g <- lm(dist ~ speed + I(speed^2) + I(speed^3), data = cars)
tmp <- capture.output(summary(g))
cat(tmp[9:length(tmp)], sep='\n')
@
\end{frame}
\end{document}
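The hard-coded start index 9 depends on how many lines the call and residual summary occupy. A slightly more robust sketch locates the first line to keep by matching the "Coefficients:" header; the variable names are illustrative.

```r
g <- lm(dist ~ speed + I(speed^2) + I(speed^3), data = cars)

# Capture the printed summary as a character vector, one element per line
out <- capture.output(summary(g))

# Keep everything from the "Coefficients:" header to the end
start <- grep("^Coefficients:", out)[1]
cat(out[start:length(out)], sep = "\n")
```

The same pattern works inside the knitr chunk in place of the fixed index.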
The summary.lm() method being invoked here returns a list of relevant outputs formatted nicely with print.summary.lm. If you want individual components of the list, try double brackets:
Input:
summary(g)[[4]]
summary(g)[[6]]
summary(g)[[7]]
summary(g)[[8]]
Output:
> summary(g)[[4]]
Estimate Std. Error t value Pr(>|t|)
(Intercept) -19.50504910 28.40530273 -0.6866693 0.4957383
speed 6.80110597 6.80113480 0.9999958 0.3225441
I(speed^2) -0.34965781 0.49988277 -0.6994796 0.4877745
I(speed^3) 0.01025205 0.01129813 0.9074113 0.3689186
> summary(g)[[6]]
[1] 15.20466
> summary(g)[[7]]
[1] 4 46 4
> summary(g)[[8]]
[1] 0.6731808
There must be a better way to combine the niceness of the summary method with list indexing, though.
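One option, sketched here, is to index the summary by component name instead of position and to pass the coefficient matrix to printCoefmat() from the stats package, which applies the same kind of formatting that the printed summary uses for its coefficient table.

```r
g <- lm(dist ~ speed + I(speed^2) + I(speed^3), data = cars)
s <- summary(g)

# Named access is easier to read than [[4]], [[6]], [[7]], [[8]]
s$coefficients  # same object as s[[4]]
s$sigma         # residual standard error, s[[6]]
s$df            # degrees of freedom, s[[7]]
s$r.squared     # multiple R-squared, s[[8]]

# Print only the coefficient table, formatted like the summary output
printCoefmat(s$coefficients, signif.stars = FALSE)
```

names(s) lists all available components if you need others, such as adj.r.squared or fstatistic.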