How do I print superscripts in a table using xtable and sweave? - r

So my problem statement is as follows :
I have defined a data frame in my Sweave document (.Rnw extension) using the following code:
<<label=table2_1,echo=FALSE>>=
table2_1_rows <- c('Students with compulsory Evaluations',
'Teachers with compulsory evaluations1',
'Teachers without Evaluation2',
'Students without compulsory evaluations3'
)
table2_1_data <- c(1,2,3,4)
table2_1final <- data.frame(table2_1_rows,table2_1_data)
@
<<label=tab1,echo=FALSE,results=tex>>=
print(xtable(table2_1final,caption= ' ',align="|c|c|c|c|c|c|"),include.rownames=FALSE)
@
How do I make xtable print the numbers 1,2,3 (following the word Evaluation) as superscripts?

The actual superscript is quite easy. \\textsuperscript{1} in conjunction with sanitize.text.function = identity will get you there.
I had to make some other changes to your example: align listed too many columns, and the underscores in the column names cause problems when compiling the TeX.
<<label=table2_1,echo=FALSE>>=
require(xtable)
table2_1_rows <- c('Students with compulsory Evaluations',
'Teachers with compulsory evaluations\\textsuperscript{1}',
'Teachers without Evaluation\\textsuperscript{2}',
'Students without compulsory evaluations\\textsuperscript{3}'
)
table2_1_data <- c(1,2,3,4)
table2_1final <- data.frame(table2_1_rows,table2_1_data)
names(table2_1final) <- c("rows", "data")
@
<<label=tab1,echo=FALSE,results=tex>>=
print(xtable(table2_1final,caption= ' ',align="|c|c|c|"),include.rownames=FALSE,
sanitize.text.function = identity)
@
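A minimal sketch of the same effect outside Sweave, assuming the xtable package is installed. The names `df`, `escaped` and `raw` are illustrative only. It captures the LaTeX that print() writes with and without sanitize.text.function = identity, so the two can be compared:

```r
library(xtable)

# One-row data frame whose text already contains LaTeX markup.
df <- data.frame(
  rows = "Teachers with compulsory evaluations\\textsuperscript{1}",
  data = 1
)

# Default behaviour: special characters such as \ and { are escaped,
# so the markup would be typeset literally.
escaped <- capture.output(print(xtable(df), include.rownames = FALSE))

# With identity as the sanitizer, the text is passed through unchanged.
raw <- capture.output(
  print(xtable(df), include.rownames = FALSE,
        sanitize.text.function = identity)
)

# The intact markup appears only in the unsanitized output.
any(grepl("\\textsuperscript{1}", escaped, fixed = TRUE))
any(grepl("\\textsuperscript{1}", raw, fixed = TRUE))
```

Because identity switches off all escaping, any %, _ or & in the cells then has to be escaped by hand.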

Related

Passing special character in DT Table

I am trying to escape the special character "--" in a column of a DT table, but formatPercentage is not letting this happen. Manually passing formatPercentage(c(2,3,5)) works, but I want to make it dynamic, so I am looking for a solution through which columns containing "--" can be displayed in the DT table.
I have tried ifelse, but it doesn't work. This code is just a part of my function.
df <- mtcars[1:6,1:5]
df$drat <- "--"
df$disp <- "--"
datatable(df, escape = FALSE) %>%
formatPercentage(2:5)
So the actual problem is that I am trying to mask one column in my DT table output, but formatPercentage is not providing the required output, so I am looking for a solution.
My function is big; that's why I am unable to create a reproducible example.
You can exclude the columns that have '--' in them.
library(DT)
all_cols <- 2:5
format_cols <- setdiff(all_cols, which(colSums(df == '--') > 0))
datatable(df, escape = FALSE) %>% formatPercentage(format_cols)
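The column selection can be checked on its own, without DT. The following is a sketch in base R; `is_masked` is a name introduced here for illustration, and the vapply() form is a swapped-in variant of the colSums() test that also tolerates missing values:

```r
df <- mtcars[1:6, 1:5]
df$drat <- "--"
df$disp <- "--"

# TRUE for every column that contains the placeholder at least once.
# na.rm = TRUE keeps a missing value from turning the result into NA.
is_masked <- vapply(df, function(col) any(col == "--", na.rm = TRUE), logical(1))

# Keep only the candidate columns that are not masked.
all_cols <- 2:5
format_cols <- setdiff(all_cols, which(is_masked))
format_cols
```

Here disp and drat are columns 3 and 5, so only columns 2 and 4 would be handed to formatPercentage().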

knitr changes (1) to <ol> when rendering html?

The following content of a .Rmd file:
---
title: "Untitled"
output:
  html_document: default
---
```{r cars}
mtcars$am <- sprintf("(%s)", as.character(mtcars$am))
knitr::kable(mtcars, format = "html")
```
Will show ordered lists <ol><li></li></ol> in the am column, instead of the numbers in brackets (as produced with the sprintf) after rendering to html.
Is this intended? How can I work around this and have numbers in brackets show as they are in the html output?
The output of knitr::kable seems to be fine, showing:
<td style="text-align:left;"> (1) </td>
Details:
Using knitr 1.20
RStudio Server 1.1.453
Note that removing format = "html" does not resolve the issue, because in the real-life context I would like to do advanced formatting with CSS, e.g. based on the classes of the produced tables.
A quick workaround solution based on Michael Harper's accepted answer may be a method like so:
replacechars <- function(x) UseMethod("replacechars")
replacechars.default <- function(x) x
replacechars.character <- function(x) {
  x <- gsub("(", "&lpar;", x, fixed = TRUE)
  x <- gsub(")", "&rpar;", x, fixed = TRUE)
  x
}
replacechars.factor <- function(x) {
  levels(x) <- replacechars(levels(x))
  x
}
replacechars.data.frame <- function(x) {
  dfnames <- names(x)
  x <- data.frame(lapply(x, replacechars), stringsAsFactors = FALSE)
  names(x) <- dfnames
  x
}
Example use:
mtcars <- datasets::mtcars
# Create a character with issues
mtcars$am <- sprintf("(%s)", as.character(mtcars$am))
# Create a factor with issues
mtcars$hp <- as.factor(mtcars$hp)
levels(mtcars$hp) <- sprintf("(%s)", levels(mtcars$hp))
replacechars(mtcars)
If you don't want to remove the format="html" argument, you could try using the HTML character entities for the parentheses (&lpar; and &rpar;) and then add the argument escape = FALSE:
```{r cars}
mtcars$am <- sprintf("&lpar;%s&rpar;", as.character(mtcars$am))
knitr::kable(mtcars, format = "html", escape = FALSE)
```
Still not entirely sure of what is causing the error though. It seems that the specific combination of parentheses is being processed strangely by knitr.
An alternative solution is to escape the parentheses, e.g.,
mtcars$am <- sprintf("\\(%s)", as.character(mtcars$am))
Then you won't need escape = FALSE.
See https://pandoc.org/MANUAL.html#backslash-escapes in Pandoc's Manual.
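Pandoc's fancy_lists extension accepts (1) as an ordered-list marker, which is presumably why the cell is turned into a list. A small sketch of the backslash approach applied to an existing character vector; `escape_leading_paren` is a hypothetical helper, not part of knitr:

```r
# Hypothetical helper: prefix a leading "(" with a backslash so that Pandoc
# reads it as a literal parenthesis rather than as a list marker.
escape_leading_paren <- function(x) sub("^\\(", "\\\\(", x)

am <- sprintf("(%s)", c(1, 0, 1))
escaped <- escape_leading_paren(am)

# writeLines() shows the strings as Pandoc will see them.
writeLines(escaped)
```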

R, creating variables on the fly in a list using assign statement

I want to create variable names on the fly inside a list and assign them values in R, but I am unable to get the desired result. Here is the logic of my code:
On the function call dat_in <- readf(1,2), an input file is read based on a product and a site. After reading, a particular column (the 13th here) is assigned to a variable aot500. I want the function to return this variable for each combination of product and site, with list element names such as aot500.AF, aot500.CM and aot500.RB.
I am having trouble with the return statement: there is no error, but there is nothing in dat_in, whereas I expect it to have dat_in$aot500.AF etc. What is wrong with the return statement?
Furthermore, I want to read the files for all combinations in a single call to the function, say using a for loop, and I wonder how the return statement would handle a list with more variables.
prod <- c('inv','tot')
site <- c('AF','CM','RB')
readf <- function(pp, kk) {
fname.dsa <- paste("../data/site_data_",prod[pp],"/daily_",site[kk],".dat",sep="")
inp.aod <- read.csv(fname.dsa,skip=4,sep=",",stringsAsFactors=F,na.strings="N/A")
aot500 <- inp.aod[,13]
return(list(assign(paste("aot500",siteabbr[kk],sep="."),aot500)))
}
There is almost never a need to use assign(). We can solve the problem in two steps: read the files into a list, then give the elements names.
(Not tested as we don't have your files)
prod <- c('inv', 'tot')
site <- c('AF', 'CM', 'RB')
# get combo of site and prod
prod_site <- expand.grid(prod, site)
colnames(prod_site) <- c("prod", "site")
# Step 1: read the files into a list
res <- lapply(1:nrow(prod_site), function(i){
fname.dsa <- paste0("../data/site_data_",
prod_site[i, "prod"],
"/daily_",
prod_site[i, "site"],
".dat")
inp.aod <- read.csv(fname.dsa,
skip = 4,
stringsAsFactors = FALSE,
na.strings = "N/A")
inp.aod[, 13]
})
# Step 2: assign names to a list
names(res) <- paste("aot500", prod_site$prod, prod_site$site, sep = ".")
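To see why the original return statement gives no usable names, here is a sketch with made-up values instead of files. `with_assign`, `with_names` and `aot` are illustrative names only:

```r
# Returns an unnamed list: assign() only hands back the value, and the
# variable it creates lives inside the function's own environment.
with_assign <- function(site_name, values) {
  list(assign(paste("aot500", site_name, sep = "."), values))
}

# Returns a list whose element carries the generated name.
with_names <- function(site_name, values) {
  setNames(list(values), paste("aot500", site_name, sep = "."))
}

aot <- c(0.12, 0.34, 0.56)  # made-up values standing in for column 13

names(with_assign("AF", aot))  # no names at all
names(with_names("AF", aot))   # the generated name

# Several sites in one call: build a single named list.
sites <- c("AF", "CM", "RB")
all_sites <- setNames(lapply(sites, function(s) aot),
                      paste("aot500", sites, sep = "."))
names(all_sites)
```

With the named version, all_sites$aot500.CM works as the question expects.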
I propose two answers, one based on dplyr and one based on base R.
You'll probably have to adapt the filename in the readAOT_500 function to your particular case.
Base R answer
#' Function that reads AOT_500 from the given product and site file
#' @param prodsite character vector containing 2 elements
#' name of a product and name of a site
readAOT_500 <- function(prodsite,
selectedcolumn = c("AOT_500"),
path = tempdir()){
cat(path, prodsite)
filename <- paste0(path, prodsite[1],
prodsite[2], ".csv")
dtf <- read.csv(filename, stringsAsFactors = FALSE)
dtf <- dtf[selectedcolumn]
dtf$prod <- prodsite[1]
dtf$site <- prodsite[2]
return(dtf)
}
# Load one file for example
readAOT_500(c("inv", "AF"))
listofsites <- list(c("inv","AF"),
c("tot","AF"),
c("inv", "CM"),
c( "tot", "CM"),
c("inv", "RB"),
c("tot", "RB"))
# Load all files in a list of data frames
prodsitedata <- lapply(listofsites, readAOT_500)
# Combine all data frames together
prodsitedata <- Reduce(rbind,prodsitedata)
dplyr answer
I use Hadley Wickham's packages to clean data.
library(dplyr)
library(tidyr)
daily_CM <- read.csv("~/downloads/daily_CM.dat",skip=4,sep=",",stringsAsFactors=F,na.strings="N/A")
# Generate all combinations of product and site.
prodsite <- expand.grid(prod = c('inv','tot'),
site = c('AF','CM','RB')) %>%
# Group variables to use do() later on
group_by(prod, site)
Create 6 fake files by sampling from the data you provided
You can skip this section when you have real data.
I used various sample length so that the number of observations
differs for each site.
prodsite$samplelength <- sample(1:495,nrow(prodsite))
prodsite %>%
do(stuff = write.csv(sample_n(daily_CM,.$samplelength),
paste0(tempdir(),.$prod,.$site,".csv")))
Read many files using dplyr::do()
prodsitedata <- prodsite %>%
do(read.csv(paste0(tempdir(),.$prod,.$site,".csv"),
stringsAsFactors = FALSE))
# Select only the columns you are interested in
prodsitedata2 <- prodsitedata %>%
select(prod, site, AOT_500)

Can't get sanitize text function to work in xtable when not in math mode

EDIT I found the answer and posted below. The only reason I had thought it was working in math mode was because I was running an example and never saw the sanitize-text-function argument was being passed to the print method. I'll accept this answer once it becomes available.
I am typesetting a manuscript and doing a data analysis for it. In this analysis, I'm generating a table 1 and looking to indent some row names in the table to give it a cascading feel.
An example of the data I have is:
require(xtable)
data <- data.frame(
'case'=sample(c('case', 'control'), 100, replace=TRUE),
'age'=sample(c('40-50,', '50-60', '60-70'), 100, replace=TRUE),
'sex'=sample(c('male', 'female'), 100, replace=TRUE),
'income'=sample(c('under 50,000', '50-100,000', 'over 10000'), 100, replace=TRUE)
)
tables <- lapply(data[, -1], table, data[, 1])
tables <- lapply(tables, function(x) {
rownames(x) <- paste('\\hspace{5mm}', rownames(x))
x
})
tablenames <- names(tables)
tables <- Reduce(rbind, mapply(rbind, '', tables))
rownames(tables)[rownames(tables) == ''] <- tablenames
xtable(tables)
xtable(tables, type='latex', sanitize.text.function=identity)
I understand the last two xtable commands should return different tables. I'm using the most recent version of R and xtable.
Welp... Apparently, sanitize.text.function is an argument to print.xtable and not to xtable itself. Doing
print(xtable(tables), type='latex', sanitize.text.function=identity)
solves the problem.
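A reduced sketch of the fix, assuming xtable is installed; the 2 x 2 matrix `m` is made up for illustration. It shows that the \hspace in the row names survives once the sanitizer is given to print():

```r
library(xtable)

# Small table whose row names carry LaTeX indentation.
m <- matrix(1:4, nrow = 2,
            dimnames = list(paste("\\hspace{5mm}", c("male", "female")),
                            c("case", "control")))

# The sanitizer is an argument of print(), not of xtable().
out <- capture.output(
  print(xtable(m), type = "latex", sanitize.text.function = identity)
)

# The row names reach the LaTeX output unchanged.
any(grepl("\\hspace{5mm} male", out, fixed = TRUE))
```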

Avoid that space in column name is replaced with period (".") when using read.csv()

I am using R to do some data pre-processing, and here is the problem I am faced with: I read the data using read.csv(filename,header=TRUE), and the spaces in variable names become ".". For example, a variable named Full Code becomes Full.Code in the resulting data frame. After processing, I use write.xlsx(filename) to export the results, but the variable names stay changed. How can I address this problem?
Besides, in the output .xlsx file, the first column becomes an index (i.e., 1 to N), which is not what I am expecting.
If you set check.names=FALSE in read.csv when you read the data in, then the names will not be changed and you will not need to edit them before writing the data back out. This of course means that you would need to quote the column names (with backquotes in some cases) or refer to the columns by location rather than by name while editing.
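A short sketch of the difference, using an inline CSV string in place of a real file. `csv_text` and `dat` are made-up names for illustration:

```r
csv_text <- "Full Code,Value\nA1,10\nB2,20"

# Default: names are passed through make.names(), so the space becomes a dot.
names(read.csv(text = csv_text))

# check.names = FALSE keeps the header exactly as written.
dat <- read.csv(text = csv_text, check.names = FALSE)
names(dat)

# Non-syntactic names need backticks (or [[ ]]) when used in code.
dat$`Full Code`
dat[["Value"]]
```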
To get spaces back in the names, do this (right before you export - R does let you have spaces in variable names, but it's a pain):
# A simple regular expression to replace dots with spaces
# This might have unintended consequences, so be sure to check the results
names(yourdata) <- gsub(x = names(yourdata),
pattern = "\\.",
replacement = " ")
To drop the first-column index, just add row.names = FALSE to your write.xlsx(). That's a common argument for functions that write out data in tabular format (write.csv() has it, too).
Here's a function (sorry, I know it could be refactored) that makes nice column names even if there are multiple consecutive dots and trailing dots:
makeColNamesUserFriendly <- function(ds) {
  # FIXME: Repetitive.
  # Convert any number of consecutive dots to a single space.
  names(ds) <- gsub(x = names(ds),
                    pattern = "(\\.)+",
                    replacement = " ")
  # Drop the trailing spaces.
  names(ds) <- gsub(x = names(ds),
                    pattern = "( )+$",
                    replacement = "")
  ds
}
Example usage:
ds <- makeColNamesUserFriendly(ds)
Just to add to the answers already provided, here is another way of replacing the "." or any other kind of punctuation in column names, using a regex with the stringr package:
require("stringr")
colnames(data) <- str_replace_all(colnames(data), "[:punct:]", " ")
For example try:
data <- data.frame(variable.x = 1:10, variable.y = 21:30, variable.z = "const")
colnames(data) <- str_replace_all(colnames(data), "[:punct:]", " ")
and
colnames(data)
will give you
[1] "variable x" "variable y" "variable z"
