I am not very familiar with loops in R, and am having a hard time stating a variable such that it is recognized by a function, DESeqDataSetFromMatrix.
pls is a table of integers. metaData is a data frame containing sample IDs and conditions corresponding to pls. I verified that the steps below run error-free when each element of cond is supplied individually.
I reviewed relevant posts on referencing variables in R:
How to reference variable names in a for loop in R?
How to reference a variable in a for loop?
Based on these posts, I modified i in line 3 with single brackets, double brackets and "as.name". No luck. DESeqDataSetFromMatrix is reading the literal text after ~ and spits out an error.
cond=c("wt","dhx","mpp","taz")
for(i in cond){
dds <- DESeqDataSetFromMatrix(countData=pls,colData=metaData,design=~i, tidy = TRUE)
"sizeFactors"(dds) <- 1
paste0("PLS",i)<-DESeq(dds)
pdf <- paste(i,"-PLS_MA.pdf",sep="")
tsv <- paste(i,"-PLS.tsv",sep="")
pdf(file=pdf,paper = "a4r", width = 0, height = 0)
plotMA(paste0("PLS",i),ylim=c(-10,10))
dev.off()
write.table(results(paste0("PLS",i)),file = tsv,quote=FALSE, sep='\t', col.names = NA)
}
With brackets, R throws an unexpected symbol error.
With i alone, DESeqDataSetFromMatrix looks for a column literally named "i" in my metaData.
Is R just not capable of reading variables in some situations? Generally speaking, is it better to write loops outside of R in a more straightforward language, then push as standalone commands? Thanks for the help—I hope there is an easy fix.
For anyone else who may be having trouble looping with DESeq2 functions, comments above addressed my issue.
Correct input:
dds <- DESeqDataSetFromMatrix(countData=pls,colData=metaData,design=as.formula(paste0("~", i)), tidy = TRUE)
as.formula worked well with all DESeq functions that I tested.
reformulate(i) also worked well in most situations.
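As a minimal sketch of why this works, the snippet below builds the design formula from the loop string using base R only. It does not call DESeq2, and results_by_cond is a hypothetical name used for illustration. It also stores each result in a named list, because an expression such as paste0("PLS", i) <- ... is not a valid assignment target in R.

```r
cond <- c("wt", "dhx", "mpp", "taz")

# Two equivalent ways to turn a column name held in a string into a one-sided formula.
design_a <- lapply(cond, function(i) as.formula(paste0("~", i)))
design_b <- lapply(cond, function(i) reformulate(i))

# Keep per-condition results in a named list (hypothetical name) rather than
# trying to assign to paste0("PLS", i).
results_by_cond <- list()
for (k in seq_along(cond)) {
  key <- paste0("PLS", cond[k])
  # Stand-in for DESeq(dds): record which variable the formula refers to.
  results_by_cond[[key]] <- all.vars(design_a[[k]])
}
```

Inside the real loop, results_by_cond[[key]] would hold the DESeq() result, and later calls such as plotMA() would take results_by_cond[[key]] instead of a pasted string.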
Thanks, everyone for the help!
I have created an r chunk in an r markdown document, in which I have calculated some parameter values that I am required to find for a homework. Now I would like for my knitted PDF-document to show the sentence "Our estimate of $\beta_1$ is -0.2186". However, the portion of the code for the greek letter beta ($\beta_1$) is being shown in the PDF the same way it's written here, not as the actual greek letter.
I have already tried installing LaTeX-packages in the document header (e.g. \usepackage{mathtools}), which made no difference.
cigs_mean <- mean(smoke$cigs) #find y-bar
educ_mean <- mean(smoke$educ) #find x-bar
beta1_hat <- (cov(smoke$educ,smoke$cigs))/(var(smoke$educ)) #find beta1-hat
beta0_hat <- (cigs_mean-(beta1_hat*educ_mean)) #find beta0-hat
print(paste0("Our estimate of $\beta_1$ is ", round(beta1_hat, digits=4)))
I just want for the document to show a greek letter beta with subscript 1, rather than replicating the code I have written ($\beta_1$)
Backslashes in R character strings have a special meaning as escape characters, and must themselves be escaped. Otherwise, your string '$\beta$' is read by R as '$' ‹backspace› 'e' 't' 'a' '$'.
Furthermore, print is the wrong function to use here: its purpose is to provide output in the interactive R console, never for actual output to a document. Use cat instead.
Finally, if you haven’t already done so, you need to tell knitr to interpret the results of this code chunk as-is instead of rendering them as a result:
```{r results = 'asis'}
…
cat(paste0("Our estimate of $\\beta_1$ is ", round(beta1_hat, digits=4), "\n"))
```
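The escaping rule can be checked directly in the console. This is a small sketch with a made-up value for beta1_hat (the real one comes from the smoke data, which is not shown here):

```r
beta1_hat <- -0.2186  # placeholder value for illustration only

unescaped <- "$\beta_1$"   # "\b" is parsed as a single backspace character
escaped   <- "$\\beta_1$"  # "\\" is one literal backslash

n_unescaped <- nchar(unescaped)  # one character shorter than the escaped version
n_escaped   <- nchar(escaped)

# cat() writes the raw text; print() adds the [1] index, quotes, and re-escapes the string.
cat_out <- capture.output(
  cat(paste0("Our estimate of ", escaped, " is ", round(beta1_hat, digits = 4), "\n"))
)
print_out <- capture.output(print(escaped))
```

Comparing cat_out with print_out shows why cat is the right choice for text that knitr should pass through as-is.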
I used J48 to generate the decision tree below. When I try to plot it using
if(require("party",quietly=TRUE)) plot(fit_1)
it gives an error:
Error: all(sapply(split, tail, 1) %in% mf_levels[[var_id]]) is not TRUE
What does this error mean?
I stumbled upon the same issue and reached out to the CRAN community for help. The issue lies primarily in the encoding of your data file, which uses non-ASCII labels without the encoding being declared when the file is imported.
Achim Zeileis (admin of the partykit library) provided the following answer:
To avoid the problem you can either:
Avoid non-ASCII labels, which is usually the most robust solution because you have to pay less attention to exactly what you are doing.
Or properly declare the encoding of your data file as follows:
library(RWeka)     # provides J48 and Weka_control
library(partykit)  # provides as.party
dataframe <- read.csv("mynonASCII_data.csv", header = TRUE, sep = ",",
                      encoding = "proper_encoding")
# "proper_encoding" is a placeholder for the file's actual encoding, for example "latin1"
resultJ48 <- J48(class ~ ., dataframe, control = Weka_control(M = 2, C = 0.5))
x <- as.party(resultJ48)
as.party should now work for your plot.
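Before choosing between the two options, it can help to see which labels are actually affected. The helper below is a hypothetical base R sketch (has_non_ascii is not part of RWeka or partykit) that flags labels containing characters outside the ASCII range; the example data frame is made up.

```r
# Returns TRUE for each label containing a code point outside the ASCII range.
has_non_ascii <- function(labels) {
  vapply(as.character(labels), function(s) {
    !is.na(s) && any(utf8ToInt(enc2utf8(s)) > 127L)
  }, logical(1), USE.NAMES = FALSE)
}

# Made-up data with one non-ASCII factor level ("caf\u00e9" is "cafe" with an accented e).
dataframe <- data.frame(class = factor(c("caf\u00e9", "tea", "caf\u00e9")), x = 1:3)
flagged <- levels(dataframe$class)[has_non_ascii(levels(dataframe$class))]
```

Any level that ends up in flagged is a candidate for renaming, or a sign that the file's encoding should be declared on import.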
I need to share data sets that I've imported into R as ffdf objects. My aim is to easily be able to export my ffdf datasets into CSV format, without having to worry about NA values which just inflate the size of the output file.
If I were working with a simple dataframe, I would use the following syntax:
write.csv(df, "C:/path/data.csv", row.names=FALSE, na="")
But the write.csv.ffdf function doesn't seem to take "na" as an argument. Can anyone tell me the correct syntax so that I don't have to do post processing on the output file to take away the NA values?
I think you are mischaracterizing the behavior of write.csv.ffdf.
require(ff)
# What follows is a minor modification of the first example in the `write.*` help page.
> x <- data.frame(log=rep(c(FALSE, TRUE), length.out=26), int=c(NA, 2:26),
dbl=c(1:25,NA) + 0.1, fac=factor(c(letters[2:26], NA)),
ord=c(NA, ordered(LETTERS[2:26])), dct=Sys.time()+1:26,
dat=seq(as.Date("1910/1/1"), length.out=26, by=1))
> ffx <- as.ffdf(x)
> write.csv(ffx, na="")
"","log","int","dbl","fac","ord","dct","dat"
"1",FALSE,,1.1,"b",,2012-12-18 12:18:23,1910-01-01
"2",TRUE,2,2.1,"c",1,2012-12-18 12:18:24,1910-01-02
"3",FALSE,3,3.1,"d",2,2012-12-18 12:18:25,1910-01-03
"4",TRUE,4,4.1,"e",3,2012-12-18 12:18:26,1910-01-04
"5",FALSE,5,5.1,"f",4,2012-12-18 12:18:27,1910-01-05
"6",TRUE,6,6.1,"g",5,2012-12-18 12:18:28,1910-01-06
"7",FALSE,7,7.1,"h",6,2012-12-18 12:18:29,1910-01-07
"8",TRUE,8,8.1,"i",7,2012-12-18 12:18:30,1910-01-08
"9",FALSE,9,9.1,"j",8,2012-12-18 12:18:31,1910-01-09
"10",TRUE,10,10.1,"k",9,2012-12-18 12:18:32,1910-01-10
"11",FALSE,11,11.1,"l",10,2012-12-18 12:18:33,1910-01-11
"12",TRUE,12,12.1,"m",11,2012-12-18 12:18:34,1910-01-12
"13",FALSE,13,13.1,"n",12,2012-12-18 12:18:35,1910-01-13
"14",TRUE,14,14.1,"o",13,2012-12-18 12:18:36,1910-01-14
"15",FALSE,15,15.1,"p",14,2012-12-18 12:18:37,1910-01-15
"16",TRUE,16,16.1,"q",15,2012-12-18 12:18:38,1910-01-16
"17",FALSE,17,17.1,"r",16,2012-12-18 12:18:39,1910-01-17
"18",TRUE,18,18.1,"s",17,2012-12-18 12:18:40,1910-01-18
"19",FALSE,19,19.1,"t",18,2012-12-18 12:18:41,1910-01-19
"20",TRUE,20,20.1,"u",19,2012-12-18 12:18:42,1910-01-20
"21",FALSE,21,21.1,"v",20,2012-12-18 12:18:43,1910-01-21
"22",TRUE,22,22.1,"w",21,2012-12-18 12:18:44,1910-01-22
"23",FALSE,23,23.1,"x",22,2012-12-18 12:18:45,1910-01-23
"24",TRUE,24,24.1,"y",23,2012-12-18 12:18:46,1910-01-24
"25",FALSE,25,25.1,"z",24,2012-12-18 12:18:47,1910-01-25
"26",TRUE,26,,,25,2012-12-18 12:18:48,1910-01-26
If your goal is minimizing the RAM footprint during write operations, then first look at:
getOption("ffbatchbytes")
write.csv.ffdf does not have an na parameter, but write.table.ffdf passes the na parameter onto the write.table1 function that it wraps.
Just use sep="," as well and you are good to go.
This will work even for large ff variables.
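For reference, this is what the combination of sep = "," and na = "" does with base write.table on an ordinary data frame. The data frame and temporary file are made up for illustration. Per the answer above, the same two arguments would be passed to write.table.ffdf for an ffdf object (not run here, since it needs the ff package).

```r
# Small made-up data frame with one missing value.
df <- data.frame(id = 1:3, value = c(1.5, NA, 3.5))

out <- tempfile(fileext = ".csv")

# sep = "," gives CSV output; na = "" writes missing values as empty fields.
write.table(df, file = out, sep = ",", na = "", row.names = FALSE)

csv_lines <- readLines(out)
unlink(out)
```

The row with the missing value ends in a bare comma, so no NA text is written to the file.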
On general request, a community wiki on producing latex tables in R. In this post I'll give an overview of the most commonly used packages and blogs with code for producing latex tables from less straight-forward objects. Please feel free to add any I missed, and/or give tips, hints and little tricks on how to produce nicely formatted latex tables with R.
Packages :
xtable : for standard tables of most simple objects. A nice gallery with examples can be found here.
memisc : tool for management of survey data, contains some tools for latex tables of (basic) regression model estimates.
Hmisc contains a function latex() that creates a tex file containing the object of choice. It is pretty flexible, and can also output longtable latex tables. There's a lot of info in the help file ?latex
miscFuncs has a neat function 'latextable' that converts matrix data with mixed alphabetic and numeric entries into a LaTeX table and prints them to the console, so they can be copied and pasted into a LaTeX document.
texreg package (JSS paper) converts statistical model output into LaTeX tables. Merges multiple models. Can cope with about 50 different model types, including network models and multilevel models (lme and lme4).
reporttools package (JSS paper) is another option for descriptive statistics on continuous, categorical and date variables.
tables package is perhaps the most general LaTeX table making package in R for descriptive statistics
stargazer package makes nice comparative statistical model summary tables
Blogs and code snippets
There is the outreg function of Paul Johnson that gives Stata-like tables in Latex for the output of regressions. This one works great.
As given in an earlier question, there's a code snippet to adapt the memisc package for lme4 objects.
Related questions :
Suggestion for R/LaTeX table creation package
Rreport/LaTeX quality output package
sorting a table for latex output with xtable
Any way to produce a LaTeX table from an lme4 mer model fit object?
R data.frame with stacked specified titles for latex output with xtable
Automating adding tables fast to latex from R, with a very flexible and interesting syntax using the formula language
I'd like to add a mention of the "brew" package. You can write a brew template file which would be LaTeX with placeholders, and then "brew" it up to create a .tex file to \include or \input into your LaTeX. Something like:
\begin{tabular}{l l}
A & <%= fit$A %> \\
B & <%= fit$B %> \\
\end{tabular}
The brew syntax can also handle loops, so you can create a table row for each row of a dataframe.
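The same row-per-record idea can be sketched without brew, using plain base R string formatting. This is a stand-in for illustration rather than brew's template syntax, and fit_summary with its values is made up:

```r
# Made-up values standing in for the quantities a fitted model would supply.
fit_summary <- data.frame(term = c("A", "B"), estimate = c(1.25, -0.5))

# One LaTeX table row per data frame row: "<term> & <estimate> \\"
rows <- sprintf("%s & %s \\\\",
                fit_summary$term,
                formatC(fit_summary$estimate, format = "f", digits = 2))

tex <- c("\\begin{tabular}{l l}", rows, "\\end{tabular}")
writeLines(tex)
```

The resulting lines could be written to a .tex file with writeLines(tex, "table.tex") and pulled into the main document with \input.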
Thanks Joris for creating this question. Hopefully, it will be made into a community wiki.
The booktabs package in LaTeX produces nice-looking tables. Here is a blog post on how to use xtable to create LaTeX tables that use booktabs.
I would also add the apsrtable package to the mix as it produces nice looking regression tables.
Another Idea: Some of these packages (esp. memisc and apsrtable) allow easy extensions of the code to produce tables for different regression objects. One such example is the lme4 memisc code shown in the question. It might make sense to start a github repository to collect such code snippets, and over time maybe even add it to the memisc package. Any takers?
The stargazer package is another good option. It supports objects from many commonly used functions and packages (lm, glm, svyreg, survival, pscl, AER), as well as from zelig. In addition to regression tables, it can also output summary statistics for data frames, or directly output the content of data frames.
I have a few tricks and workarounds for some interesting 'features' of xtable and LaTeX that I'll share here.
Trick #1: Removing Duplicates in Columns and Trick #2: Using Booktabs
First, load packages and define my clean function
<<label=first, include=FALSE, echo=FALSE>>=
library(xtable)
library(plyr)
cleanf <- function(x){
oldx <- c(FALSE, x[-1]==x[-length(x)])
# is the value equal to the previous?
res <- x
res[oldx] <- NA
return(res)}
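As a quick check of what cleanf does, here is the same logic applied to a short made-up vector (the function is repeated so the snippet runs on its own):

```r
cleanf <- function(x) {
  # TRUE wherever a value equals the one directly before it
  same_as_previous <- c(FALSE, x[-1] == x[-length(x)])
  res <- x
  res[same_as_previous] <- NA
  res
}

cleaned <- cleanf(c("cat", "cat", "dog", "dog", "cat"))
```

Only consecutive repeats are blanked, so the final "cat" survives because the value before it is "dog". This is why the table needs to be sorted by the label columns first.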
Now generate some fake data
data<-data.frame(animal=sample(c("elephant", "dog", "cat", "fish", "snake"), 100,replace=TRUE),
colour=sample(c("red", "blue", "green", "yellow"), 100,replace=TRUE),
size=rnorm(100,mean=500, sd=150),
age=rlnorm(100, meanlog=3, sdlog=0.5))
#generate a table
datatable<-ddply(data, .(animal, colour), function(df) {
return(data.frame(size=mean(df$size), age=mean(df$age)))
})
Now we can generate a table, and use the clean function to remove duplicate entries in the label columns.
cleandata<-datatable
cleandata$animal<-cleanf(cleandata$animal)
cleandata$colour<-cleanf(cleandata$colour)
@
This is a normal xtable:
<<label=normal, results=tex, echo=FALSE>>=
print(
xtable(
datatable
),
tabular.environment='longtable',
latex.environments=c("center"),
floating=FALSE,
include.rownames=FALSE
)
@
This is a normal xtable where a custom function has turned duplicates into NA:
<<label=cleandata, results=tex, echo=FALSE>>=
print(
xtable(
cleandata
),
tabular.environment='longtable',
latex.environments=c("center"),
floating=FALSE,
include.rownames=FALSE
)
@
This table uses the booktabs package (and needs \usepackage{booktabs} in the preamble):
\begin{table}[!h]
\centering
\caption{table using booktabs.}
\label{tab:mytable}
<<label=booktabs, echo=F,results=tex>>=
mat <- xtable(cleandata,digits=rep(2,ncol(cleandata)+1))
foo<-0:(length(mat$animal))
bar<-foo[!is.na(mat$animal)]
print(mat,
sanitize.text.function = function(x){x},
floating=FALSE,
include.rownames=FALSE,
hline.after=NULL,
add.to.row=list(pos=list(-1,bar,nrow(mat)),
command=c("\\toprule ", "\\midrule ", "\\bottomrule ")))
#could extend this with \cmidrule to have a partial line over
#a sub category column and \addlinespace to add space before a total row
@
\end{table}
Two utilities in the taRifx package can be used in concert to produce multi-row tables of nested hierarchies.
library(datasets)
library(taRifx)
library(xtable)
test.by <- bytable(ChickWeight$weight, list( ChickWeight$Chick, ChickWeight$Diet) )
colnames(test.by) <- c('Diet','Chick','Mean Weight')
print(latex.table.by(test.by), include.rownames = FALSE, include.colnames = TRUE, sanitize.text.function = force)
# then add \usepackage{multirow} to the preamble of your LaTeX document
# for longtable support, add ,tabular.environment='longtable' to the print command (plus add in ,floating=FALSE), then \usepackage{longtable} to the LaTeX preamble
... and Trick #3 Multiline entries in an Xtable
Generate some more data
moredata<-data.frame(Nominal=c(1:5), n=rep(5,5),
MeanLinBias=signif(rnorm(5, mean=0, sd=10), digits=4),
LinCI=paste("(",signif(rnorm(5,mean=-2, sd=5), digits=4),
", ", signif(rnorm(5, mean=2, sd=5), digits=4),")",sep=""),
MeanQuadBias=signif(rnorm(5, mean=0, sd=10), digits=4),
QuadCI=paste("(",signif(rnorm(5,mean=-2, sd=5), digits=4),
", ", signif(rnorm(5, mean=2, sd=5), digits=4),")",sep=""))
names(moredata)<-c("Nominal", "n","Linear Model \nBias","Linear \nCI", "Quadratic Model \nBias", "Quadratic \nCI")
Now produce our xtable, using the sanitize function to replace column names with the correct Latex newline commands (including double backslashes so R is happy)
<<label=multilinetable, results=tex, echo=FALSE>>=
foo<-xtable(moredata)
align(foo) <- c( rep('c',3),'p{1.8in}','p{2in}','p{1.8in}','p{2in}' )
print(foo,
floating=FALSE,
include.rownames=FALSE,
sanitize.text.function = function(str) {
str<-gsub("\n","\\\\", str, fixed=TRUE)
return(str)
},
sanitize.colnames.function = function(str) {
str<-c("Nominal", "n","\\centering Linear Model\\\\ \\% Bias","\\centering Linear \\\\ 95\\%CI", "\\centering Quadratic Model\\\\ \\%Bias", "\\centering Quadratic \\\\ 95\\%CI \\tabularnewline")
return(str)
})
@
(This isn't perfect: we need \tabularnewline so the table is formatted correctly, and xtable still puts in a final \\, so we end up with a blank line below the table header.)
You can also use the latextable function from the R package miscFuncs:
http://cran.r-project.org/web/packages/miscFuncs/index.html
latextable(M) where M is a matrix with mixed alphabetic and numeric entries outputs a basic LaTeX table onto screen, which can be copied and pasted into a LaTeX document. Where there are small numbers, it also replaces these with index notation (eg 1.2x10^{-3}).
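To illustrate the kind of index notation described here, the following base R sketch converts a number to that form. It is a hypothetical helper written for this example (to_index_notation is not a miscFuncs function) and makes no claim to match how latextable does it internally.

```r
# Format numbers as mantissa x 10^{exponent}, e.g. 0.0012 becomes "1.2x10^{-3}".
to_index_notation <- function(x, digits = 2) {
  vapply(x, function(v) {
    s <- formatC(v, format = "e", digits = digits - 1)  # e.g. "1.2e-03"
    parts <- strsplit(s, "e", fixed = TRUE)[[1]]
    paste0(parts[1], "x10^{", as.integer(parts[2]), "}")
  }, character(1), USE.NAMES = FALSE)
}

small <- to_index_notation(0.0012)
```

In a LaTeX document the result would normally be wrapped in math mode, for example $1.2\times10^{-3}$.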
Another R package for aggregating multiple regression models into LaTeX tables is texreg.