I have a data frame in R that I have converted to a flextable. It looks okay, but I need to set the column widths manually. I've tried several approaches and so far none has worked. What I find especially odd is that when I run dim(myflextable) to look at the dimensions, the returned widths are the ones I set, but the rendered flextable does not have those dimensions. Any insight would be appreciated.
dataframe + libraries
library(dplyr)
library(flextable)
library(officer)
extractionWS <- structure(list(Sample = c(12740L, 13231L, 13232L, 13233L, 13234L),
Full.name = c("Method Negative Control In Vitro Extraction 600 uL RLT-TCEP 23",
"G-A12-T5_7H9tw_co2_0-05_p0-8_b0-8_48h_tube49",
"G-A12-T4_7H9tw_co2_0-05_p0-8_b0-8_h0-1_96h_tube90",
"G-A13-T3_7H9tw_co2_0-05_untx_192h_tube71",
"G-A12-T5_7H9tw_co2_0-05_p0-8_b0-8_opc0-00195_48h_tube77"),
Providing.lab.key = c(NA, "VOS22_00349",
"VOS22_00290",
"VOS22_00675",
"VOS22_00377"),
kitshelf = c("", "", "", "", ""),
kitbox = c(8, 8, 8, 8, 8),
kitpos = c("A1", "A2", "A3", "A4", "A5"),
extractdate = c("", "", "", "", ""),
concentration = c("", "", "", "", ""),
rnashelf = c("", "", "", "", ""),
rnabox = c("", "", "", "", ""),
rnapos = c("", "", "", "", ""),
comment = c("", "", "", "", "")),
class = "data.frame", row.names = c(NA, -5L))
extractionWS <- extractionWS %>%
setNames(c("WL Number", 'Full Name', "Providing Lab Key",
"Kit Shelf", "Kit Box", "Kit Position",
"Extraction Date", "RNA Conc. (ng/ul)",
"Extracted RNA Shelf", "Extracted RNA Box","Extracted RNA Position", "Comment"))
flextable format attempt
#Create flextable
worksheet <- flextable(extractionWS)
#font style
worksheet <- bold(worksheet, bold = TRUE, part="header")
worksheet <- align(worksheet, align = "center", part = "all")
worksheet <- colformat_num(worksheet, big.mark="")
#border
border.outer = fp_border(color="black", width=2.5)
border.horizontal = fp_border(color="black", width=1.5)
border.vertical = fp_border(color="black", width=1.5)
worksheet <- border_outer(worksheet, border=border.outer, part="all")
worksheet <- border_inner_h(worksheet, part="all", border = border.horizontal)
worksheet <- border_inner_v(worksheet, part="all", border = border.vertical)
#Set fontsize
worksheet <- fontsize(worksheet, size = 12, part = "body")
worksheet <- fontsize(worksheet, j = 2, size = 8, part = "body")
#dimensions
worksheet <- width(worksheet, 12, width = 6)
dim(worksheet) output - indicates I successfully changed the column width:
$widths
           WL.num         Full.name Providing.lab.key          kitshelf            kitbox
             0.75              0.75              0.75              0.75              0.75
           kitpos       extractdate     concentration          rnashelf            rnabox
             0.75              0.75              0.75              0.75              0.75
           rnapos           comment      Kit Position
             0.75              6.00              0.75
But this is not reflected in the table:
Sorry if I'm missing something obvious, I've spent a lot of time on this simple thing and gone over the documentation but haven't been able to find a solution.
I have a data frame with several distinct columns.
Each column has several different gene names.
I would like to know:
if there are repeated gene names in the whole data frame,
if possible, how many times each gene is repeated.
This is part of my data frame:
DS_struct <-
structure(
list(
`12941` = c("", "", "", "", ""),
`14520` = c("ABAT",
"ABCA6", "ABCA8", "ABCB4", "ABCG2"),
`22405` = c("ACSL4", "ADFP",
"ADH1A", "ADH1B", "ADH1C"),
`25097` = c("AATF", "ABCB8", "ABLIM3",
"ACCN2", "ACSM3"),
`33006` = c("ADAMTS1", "ADAMTS13", "ADGRA3",
"ADGRG7", "ADH1B"),
`36376` = c("ACAA2", "ACACB", "ACAD11", "ACOT12",
"ACSL1"),
`39791` = c("ABAT", "ACACB", "ACSL4", "ACSM5", "ADAMTSL2"),
`41804` = c("A2M-AS1", "A2MP1", "AADAT", "ABCA8", "ACADL"),
`46408` = c("A1CF", "A2M", "AADAT", "AASS", "ABAT"),
`50579` = c("AASS",
"ABAT", "ABCA8", "ABCB10", "ABLIM2"),
`55191` = c("", "",
"", "", ""),
`57555` = c("", "", "", "", ""),
`57957` = c("ACSL4",
"ACSM3", "ADAMTSL2", "ADGRG2", "ADH1B"),
`57958` = c("",
"", "", "", ""),
`58043` = c("", "", "", "", ""),
`60502` = c("ABAT",
"ABCA6", "ABCA8", "ABCB4", "ABT1"),
`62232` = c("AADAT",
"AASS", "AASS", "ABCA8", "ABCC4"),
`76427` = c("ADGRG7",
"ADIRF", "ALPL", "ANXA10", "ASPDH"),
`84005` = c("", "",
"", "", ""),
`84402` = c("AADAT", "AASS", "ABAT", "ABCA6",
"ABCA8"),
`89186` = c("", "", "", "", ""),
`101685` = c("AADAT",
"AASS", "ABAT", "ABCA9", "ABCC4"),
`101728` = c("5-??", "5_8S_rRNA",
"A1BG", "A2M", "AACS"),
`113996` = c("", "", "", "", ""),
`117361` = c("", "", "", "", ""),
`121248` = c("ABI3BP",
"ACADL", "ACOT12", "ACSL4", "ACSM3"),
`136247` = c("", "",
"", "", ""),
`138178` = c("", "", "", "", ""),
`166163` = c("",
"", "", "", "")
),
row.names = 2:6,
class = "data.frame"
)
First of all, let's convert those blank values to NAs. That way we won't be counting blanks as actual genes when we go to count them up.
DS_struct[DS_struct == ""] <- NA
Now we can look at how many of each gene name is in the data frame.
gene_counts <- sort(table(unlist(DS_struct)))
gene_counts
We can test if there are repeated gene names in the data frame.
repeated_genes <- any(gene_counts > 1)
repeated_genes
And have a look at which gene names are repeated.
gene_counts[which(gene_counts > 1)]
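As a quick check of the approach above, here is a minimal sketch on a made-up two-column data frame. The `toy` object and its values are invented for illustration and are not taken from the question's data. It also lists which columns each repeated gene appears in, something the counts alone do not show.

```r
# Toy data (hypothetical): two columns of gene names, blanks mark empty cells
toy <- data.frame(
  a = c("ABAT", "", "AASS"),
  b = c("ABAT", "ACSL4", ""),
  stringsAsFactors = FALSE
)

# Blank strings become NA so table() ignores them
toy[toy == ""] <- NA

# Count every gene across all columns and keep those seen more than once
counts <- table(unlist(toy))
repeated <- counts[counts > 1]

# For each repeated gene, list the columns it occurs in
where <- lapply(
  setNames(nm = names(repeated)),
  function(g) names(toy)[colSums(toy == g, na.rm = TRUE) > 0]
)

print(repeated)
print(where)
```

Using `lapply()` rather than `sapply()` keeps the result a list even when every gene happens to appear in the same number of columns.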
I'm creating a document with rmarkdown, ultimately for pdf output.
I'd like to make a table that has multiple sections with subheadings (title, abstract, introduction, etc.), such as the table below.
I've made the following so far, but I'd like vertical lines on every row except the heading rows ("Title", "Abstract", etc.):
{r prch}
pc = structure(list(`Section/topic` = c("\\textbf{Title}", "Title",
"\\textbf{Abstract}", "Structured summary"), `Item No` = c("",
"1", "", "2"), `Checklist item` = c("", "Identify the report as a systematic review, meta-analysis, or both",
"", "Provide a structured summary including, as applicable, background, objectives, data sources, study eligibility criteria, participants, interventions, "
), `Reported on page No` = c("", "", "", "")), row.names = c(NA,
-4L), class = c("tbl_df", "tbl", "data.frame"))
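library(kableExtra)  # kbl(), column_spec(), kable_styling(); also re-exports %>%
pc %>%
  kbl(longtable = TRUE, escape = FALSE, booktabs = TRUE) %>%
  column_spec(1, width = "8em") %>%
  column_spec(3, width = "20em") %>%
  column_spec(4, width = "6em") %>%
  kable_styling(latex_options = "repeat_header")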
Here's a huxtable-based solution. (My package.)
library(huxtable)
ht <- tribble_hux(
~ "Section/topic", ~ "Item no", ~ "Checklist item", ~ "Reported on Page No",
"Title" , "" , "" , "",
"Title" , "1" , "Identify..." , "",
"Abstract" , "" , "" , "",
"Structured summary", "2" , "Provide..." , ""
# et cetera...
)
# using the pipe from 4.1.0...
ht |>
set_header_rows(c(2, 4), TRUE) |>
merge_across(c(2, 4), everywhere) |>
style_header_rows(bold = TRUE) |>
set_all_borders(brdr(0.4, "solid", "grey70")) |>
set_background_color("grey97") |>
set_background_color(1, 1:3, "grey90") |>
set_col_width(c(0.2, 0.05, 0.55, 0.2)) |>
set_font("cmss") |>
quick_pdf()
I have over 100 csv files that contain data like this...
> dput(head(hobo.temp))
structure(list(Serial = c("Plot Title: 20461693", "#", "1",
"2", "3", "4"), Date = c("", "Date Time, GMT-05:00", "02/14/20 10:14:50 AM",
"02/14/20 10:14:57 AM", "02/14/20 11:14:50 AM", "02/14/20 12:14:50 PM"
), Temp = c("", "Temp, °C (LGR S/N: 20461693, SEN S/N: 20461693)",
"18.866", "", "20.817", "20.913"), X1 = c("", "Coupler Detached (LGR S/N: 20461693, SEN S/N: 20461693)",
"", "Logged", "", ""), X2 = c("", "Coupler Attached (LGR S/N: 20461693, SEN S/N: 20461693)",
"", "", "", ""), X3 = c("", "Host Connected (LGR S/N: 20461693, SEN S/N: 20461693)",
"", "", "", ""), X4 = c("", "End Of File (LGR S/N: 20461693, SEN S/N: 20461693)",
"", "", "", "")), row.names = c(NA, 6L), class = "data.frame")
It's nasty so I wrote code to clean it up...
hobo.temp <- read.csv("20461693_suw_main_01_19_2021.csv",
colClasses = c(rep("character", 3), rep("NULL", 4)),
col.names = c("Serial", "Date", "Temp", 1, 2, 3, 4),
header = FALSE, fill = TRUE, stringsAsFactors = FALSE)
hobo.temp$Date = as.POSIXct(hobo.temp$Date, format="%m/%d/%y %H:%M")
hobo.temp[,1] <- hobo.temp[1,1]
hobo.temp <- hobo.temp[-c(1:4),]
hobo.temp <- na.omit(hobo.temp)
hobo.temp <- arrange(hobo.temp, Date)
row.names(hobo.temp) <- NULL
hobo.temp$Serial <- gsub("Plot Title: ", "", hobo.temp$Serial, fixed = TRUE)
hobo.temp$Temp <- as.numeric(hobo.temp$Temp)
return(hobo.temp)
But when I tried to convert it to a function and iterate it using this code.
filenames <- list.files(path = ".", pattern='^.*\\.csv$')
hobo.read <- function(fnam) {
hobo.temp <- read.csv(fnam, colClasses = c(rep("character", 3), rep("NULL", 4)),
col.names = c("Serial", "Date", "Temp", 1, 2, 3, 4),
header = FALSE, fill = TRUE, stringsAsFactors = FALSE)
hobo.temp$Date = as.POSIXct(hobo.temp$Date, format="%m/%d/%y %H:%M")
hobo.temp[,1] <- hobo.temp[1,1]
hobo.temp <- hobo.temp[-c(1:4),]
hobo.temp <- na.omit(hobo.temp)
hobo.temp <- arrange(hobo.temp, Date)
row.names(hobo.temp) <- NULL
hobo.temp$Serial <- gsub("Plot Title: ", "", hobo.temp$Serial, fixed = TRUE)
hobo.temp$Temp <- as.numeric(hobo.temp$Temp)
return(hobo.temp)
}
my.df <- do.call("rbind", lapply(filenames, hobo.read))
I get this error
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
more columns than column names
I'm terrible at writing functions so I apologize in advance.
I realized that a few of the files had 8 columns. I had suspected this might be the case and was trying to account for it in my original question code with colClasses = c(rep("character", 3), rep("NULL", 4)). When I switched the 4 to a 5, giving rep("NULL", 5), it correctly dropped that 8th column. I modified my original question code to be slightly more readable (maybe). This is my first real function, and it's nested to boot. It's sloppy but I'm pretty proud of it.
#reads filenames from the working directory
filenames <- list.files(path = ".", pattern='^.*\\.csv$')
#first function imports data
hobo.read <- function(x) {
  # colClasses and col.names each need one entry per column of the widest csv
  # (8 here: 3 kept as character, 5 dropped via "NULL"); fill = TRUE pads narrower files
  df1 <- read.csv(x, colClasses = c(rep("character", 3), rep("NULL", 5)),
                  col.names = c("Serial", "Date", "Temp", 1, 2, 3, 4, 5),
                  header = FALSE, fill = TRUE, stringsAsFactors = FALSE)
  # apply the cleaning function defined below and return its result
  hobo.fix(df1)
}
#function of actions to apply within 1st function
hobo.fix <- function(hobo.temp) {
hobo.temp[,1] <- hobo.temp[1,1]
hobo.temp <- hobo.temp[-c(1:4),]
hobo.temp$Serial <- gsub("Plot Title: ", "", hobo.temp$Serial, fixed = TRUE)
hobo.temp$Temp <- as.numeric(hobo.temp$Temp)
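# the sample timestamps use a 12-hour clock with seconds and AM/PM ("02/14/20 12:14:50 PM"),
# so parse with %I, %S and %p; "%H:%M" would silently ignore the AM/PM marker
hobo.temp$Date <- as.POSIXct(hobo.temp$Date, format = "%m/%d/%y %I:%M:%S %p")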
hobo.temp <- na.omit(hobo.temp)
hobo.temp <- dplyr::arrange(hobo.temp, Date)
row.names(hobo.temp) <- NULL
return(hobo.temp)
}
hobo <- do.call("rbind", lapply(filenames, hobo.read))
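The error from the question can be reproduced without any files, which makes it easier to see how `col.names` and `colClasses` interact. The following is a minimal sketch on invented inline data (the `csv_text` rows are hypothetical stand-ins for a logger file with 8 fields per row). It assumes that `read.csv()` needs at least as many column names as the widest of the first few rows.

```r
# Inline stand-in for a logger file with 8 fields per row (hypothetical data)
csv_text <- paste(
  "s1,02/14/20 10:14:50 AM,18.866,a,b,c,d,e",
  "s1,02/14/20 11:14:50 AM,20.817,a,b,c,d,e",
  sep = "\n"
)

seven_names <- c("Serial", "Date", "Temp", 1, 2, 3, 4)
eight_names <- c(seven_names, 5)

# Seven names for eight columns: read.csv() stops with an error
too_few <- try(
  read.csv(text = csv_text, header = FALSE, col.names = seven_names,
           stringsAsFactors = FALSE),
  silent = TRUE
)

# Eight names plus eight colClasses entries: the five unwanted columns are dropped
ok <- read.csv(text = csv_text, header = FALSE, fill = TRUE,
               col.names = eight_names,
               colClasses = c(rep("character", 3), rep("NULL", 5)),
               stringsAsFactors = FALSE)

print(inherits(too_few, "try-error"))
print(names(ok))
```

If some files turn out to be wider still, both vectors would need to grow together; counting fields up front with `count.fields()` is one way to find the maximum.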
I'm having trouble automatically generating borders when exporting an Excel file with R. Below are my code, the output I currently get, and the output I would like.
I have tried to help myself with the solution here, but could not make it work on my example.
Here is some code to reproduce the problem:
#some dataframes to export as excel files
Agent1 <- data.frame("QUEUE" = c("call PA", "call", "Call", "call CB"), "NR" = c(6,15,3,7), "Client" = c("xyz company", "some other company", "Company abs", "BNM"), stringsAsFactors = FALSE)
Agent2 <- data.frame("QUEUE" = c("call PA", "call", "Call", "call CB"), "NR" = c(7,13,5,3), "Client" = c("xyz company", "some other company", "Company abs", "BNM"), stringsAsFactors = FALSE)
Agent3 <- data.frame("QUEUE" = c("call PA", "call", "Call", "call CB"), "NR" = c(4,4,3,7), "Client" = c("xyz company", "some other company", "Company abs", "BNM"), stringsAsFactors = FALSE)
nr_of_agents <- 3
# Variable creation for counting cases per agent
for (a in 1 : nr_of_agents) {
agent_s <- paste0("Agent",a,"sum")
assign(agent_s, 0)
}
for (a in 1:nr_of_agents){ #Counting cases per agent
agent <- paste0("Agent",a)
tempv <- eval(as.name(agent))
agent_s <- paste0("Agent",a,"sum")
tempv1 <- eval(as.name(agent_s))
tempv1 <- sum(tempv$NR)
assign(agent_s, paste("Total cases: ", tempv1))
}
## EXCEL OUTPUT
library(xlsx)  # createWorkbook(), CellStyle(), addDataFrame(), saveWorkbook()
wb<-createWorkbook(type="xlsx")
TITLE_STYLE <- CellStyle(wb)+ Font(wb, heightInPoints=16, color=NULL, isBold=TRUE) +
Alignment(h="ALIGN_CENTER")
TEXT_STYLE <- CellStyle(wb)+ Font(wb, heightInPoints=12, color=NULL, isBold=FALSE) +
Alignment(h="ALIGN_RIGHT")+
Border(color="black", position=c("TOP"),
pen=c("BORDER_THIN"))
# Styles for the data table row/column names
TABLE_ROWNAMES_STYLE <- CellStyle(wb) + Font(wb, isBold=TRUE)
TABLE_COLNAMES_STYLE <- CellStyle(wb) + Font(wb,color="#FFFAFA", heightInPoints=12, name="Calibri Light", isBold=TRUE) +
Fill(foregroundColor="#9e2b11",pattern="SOLID_FOREGROUND")+#, backgroundColor="lightblue")
Alignment(wrapText=TRUE, horizontal="ALIGN_CENTER")+
Border(color="black", position=c("TOP", "BOTTOM", "LEFT", "RIGHT"),
pen=c("BORDER_THIN"))
#Code to add title
xlsx.addTitle<-function(sheet, rowIndex, title, titleStyle){
rows <-createRow(sheet,rowIndex=rowIndex)
sheetTitle <-createCell(rows, colIndex=3)
setCellValue(sheetTitle[[1,1]], title)
setCellStyle(sheetTitle[[1,1]], titleStyle)
}
#Code to add sums of cases per agent
xlsx.addsums<-function(sheet, rowIndex, title, titleStyle){
rows <-createRow(sheet,rowIndex=rowIndex)
sheetTitle <-createCell(rows, colIndex=3)
setCellValue(sheetTitle[[1,1]], title)
setCellStyle(sheetTitle[[1,1]], titleStyle)
}
names <- c("Mark", "Neli", "Sara") # Agents names
for (a in 1 : nr_of_agents) {
agent <- paste0("Agent",a)
tempv <- eval(as.name(agent))
agent_S <- paste0("Agent",a,"sum")
tempv1 <- eval(as.name(agent_S))
sheet<-createSheet(wb, sheetName = names[a]) #sheet creation
xlsx.addTitle(sheet, rowIndex=1, title=names[a], #Adding title to each sheet
titleStyle = TITLE_STYLE)
addDataFrame(tempv, sheet, startRow=3, startColumn=1, #Adding the dataframes
colnamesStyle = TABLE_COLNAMES_STYLE,
rownamesStyle = TABLE_ROWNAMES_STYLE
)
xlsx.addsums(sheet, rowIndex=(3+ nrow(tempv)+1), title= tempv1, #Adding total sum for every agent
titleStyle = TEXT_STYLE)
autoSizeColumn(sheet, colIndex=c(1:ncol(tempv))) #Auto size columns
}
saveWorkbook(wb, paste0(Sys.Date()," Test_file",".xlsx"))
Picture of current and desired output
As the picture shows, the automatic column width is also not working correctly: it depends on the length of the column header rather than on the longest entry in the column. Any idea how to solve this?
Thanks for the help!
You can do this with the openxlsx package.
library(openxlsx)
# Data
Agent1 <- data.frame("QUEUE" = c("call PA", "call", "Call", "call CB"), "NR" = c(6,15,3,7), "Client" = c("xyz company", "some other company", "Company abs", "BNM"), stringsAsFactors = FALSE)
Agent2 <- data.frame("QUEUE" = c("call PA", "call", "Call", "call CB"), "NR" = c(7,13,5,3), "Client" = c("xyz company", "some other company", "Company abs", "BNM"), stringsAsFactors = FALSE)
Agent3 <- data.frame("QUEUE" = c("call PA", "call", "Call", "call CB"), "NR" = c(4,4,3,7), "Client" = c("xyz company", "some other company", "Company abs", "BNM"), stringsAsFactors = FALSE)
agents <- c("Mark", "Neli", "Sara")
wb <- createWorkbook()
for (i in seq_along(agents)) {
agent <- paste0("Agent", i)
agent_nam <- agents[i]
agent_df <- eval(as.name(agent))
# Add sheet
addWorksheet(wb, agent_nam)
# Save Header (agent name)
writeData(wb, sheet = agent_nam, x = agent_nam, startRow = 1, startCol = 3)
# Write Dataframe
writeData(wb, sheet = agent_nam, x = agent_df, startRow = 3, rowNames = TRUE)
# Total cases
writeData(wb, sheet = agent_nam, x = paste0("Total cases: ", sum(agent_df$NR)), startRow = 8, startCol = 3)
# style 1: Agent names in bold
s1 <- createStyle(fontSize = 16, textDecoration = c("BOLD"), halign = "center")
# style 2: Bold white font with red background fill for table header
s2 <- createStyle(fontName = "Calibri Light", fontColour = "#FFFFFF",
fgFill = "#9e2b11", textDecoration = c("BOLD"), halign = "center",
border = "TopBottomLeftRight")
# style 3: border around the data
s3 <- createStyle(border = "TopBottomLeftRight")
# style 4: Text in the center for Total cases
s4 <- createStyle(halign = "center")
# Apply styles to the workbook
addStyle(wb, sheet = agent_nam, style = s1, rows = 1, cols = 3, gridExpand = TRUE)
addStyle(wb, sheet = agent_nam, style = s2, rows = 3, cols = 2:4, gridExpand = TRUE)
addStyle(wb, sheet = agent_nam, style = s3, rows = 4:7, cols = 2:4, gridExpand = TRUE)
addStyle(wb, sheet = agent_nam, style = s4, rows = 8, cols = 3, gridExpand = TRUE)
# Column widths
setColWidths(wb, sheet = agent_nam, cols = 1:4, widths = "auto")
}
saveWorkbook(wb, paste0(Sys.Date()," Test_file (openxlsx)",".xlsx"))
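The loop above hard-codes rows 4:7 and 8, which only fits tables with exactly four data rows. A minimal base R sketch of deriving those positions from each data frame instead is shown below. The names `start_row` and `layout` are introduced here for illustration, and the data frames are trimmed copies of the example data. The resulting values could then be passed to the `rows` and `startRow` arguments used in the answer.

```r
# Trimmed copies of the example data (only the columns needed here)
Agent1 <- data.frame(QUEUE = c("call PA", "call", "Call", "call CB"), NR = c(6, 15, 3, 7))
Agent2 <- data.frame(QUEUE = c("call PA", "call", "Call", "call CB"), NR = c(7, 13, 5, 3))
Agent3 <- data.frame(QUEUE = c("call PA", "call", "Call", "call CB"), NR = c(4, 4, 3, 7))

# A named list replaces the paste0() / eval(as.name()) lookups
agents <- list(Mark = Agent1, Neli = Agent2, Sara = Agent3)

start_row <- 3  # row of the table header, as in the answer

# Work out, per agent, where the header, data and total rows land
layout <- lapply(agents, function(df) {
  list(
    header_row = start_row,
    data_rows  = start_row + seq_len(nrow(df)),
    total_row  = start_row + nrow(df) + 1,
    total_text = paste0("Total cases: ", sum(df$NR))
  )
})

print(layout$Mark$data_rows)
print(layout$Mark$total_text)
```

Iterating over `names(agents)` then gives both the sheet name and the data frame without building variable names from strings.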
Can anyone give me any hints on how to output this table in R (link)? For example, how to bold the first line, add a dotted line, and make a double title for the table. I prefer R Markdown; RTF is also OK. I am trying to avoid LaTeX. Thank you so much!
Here is one way using the htmlTable package
install.packages('devtools')
devtools::install_github('gforge/htmlTable')
(you can also find htmlTable, the function, in the Gmisc package on cran (install.packages('Gmisc')), but it will be removed soon and available in a stand-alone package called htmlTable)
out <-
structure(c("37(34%)", "1 (Ref)", "1 (Ref)", "1 (Ref)", "1 (Ref)",
"1 (Ref)", "45(23%)", "0.68 (0.63, 0.73)", "0.38 (0.32, 0.44)",
"0.21 (0.17, 0.28)", "0.08 (0.05, 0.13)", "0.05 (0.02, 0.11)",
"", "0.03", "0.04", "0.03", "0.02", "0.02", "110(34%)", "0.68 (0.65, 0.71)",
"0.38 (0.34, 0.42)", "0.21 (0.18, 0.25)", "0.08 (0.06, 0.11)",
"0.05 (0.03, 0.08)", "", "0.03", "0.04", "0.03", "0.02", "0.02"
),
.Dim = c(6L, 5L),
.Dimnames = list(NULL, c("r", "hr", "p", "hr", "p")))
## format rows/cols
colnames(out) <- c('(n = ***)','HR (92% CI)','P','HR (92% CI)','P')
rownames(out) <- c('PD No (%)','None','Age','Age (<60 vs > 60)',
' Age > 60',' Age < 60')
## add padding row called subset
out <- rbind(out[1:4, ], 'Subsets:' = '', out[5:6, ])
## bolding rownames
rownames(out) <- sprintf('<b>%s</b>', rownames(out))
## table column headers (with line breaks (<br />))
cgroup <- c('', 'R + C<br />(n = ***)', 'R + S<br />(n = ***)')
# zzz <- `rownames<-`(out, NULL)
library(htmlTable)
htmlTable(out, rowlabel = 'Adjustment<sup>†</sup>',
ctable = TRUE, align = 'ccccc',
## number of columns that each cgroup label spans:
n.cgroup = c(1, 2, 2), cgroup = cgroup,
## insert two table spanning sections:
tspanner = c('',''), # no labels
n.tspanner = c(4, 3), # number of rows to span (must sum to nrow(out))
# css.tspanner.sep = "border-bottom: 1px dotted grey;",
caption = "Table 1: Hazard ratios and <i>p</i> values of two models and
something something.",
tfoot = '<font size=1><sup>†</sup>Some note.</font>')
Gives me this
I ran into problems (headaches) with the dotted line, so I suggest just using a solid one.
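The padding-row trick and the requirement that n.tspanner sums to the number of rows can be checked in base R before calling htmlTable. This is a small sketch on an invented character matrix (`m` and its labels are placeholders, not the table from the question).

```r
# Hypothetical 3 x 2 character matrix standing in for the results table
m <- matrix(
  c("a1", "a2", "a3", "b1", "b2", "b3"),
  nrow = 3,
  dimnames = list(c("None", "Age", "Age > 60"), c("HR", "P"))
)

# A named length-one argument to rbind() becomes a row with that name,
# recycled across all columns; here it is an empty separator row
padded <- rbind(m[1:2, ], "Subsets:" = "", m[3, , drop = FALSE])

# Row counts per spanner section; they must add up to nrow(padded)
n.tspanner <- c(2, 2)

print(rownames(padded))
print(sum(n.tspanner) == nrow(padded))
```

`drop = FALSE` matters when a section has a single row, since otherwise the row name is lost when the matrix collapses to a vector.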
You could use package ReporteRs. There are output examples and corresponding R code here.
You have control over table borders, fonts, paragraphs, etc.