How to display period class from lubridate in datatable from DT? - r

I have runtime data for various devices that can vary widely, from a few minutes to several months, that I would like to display in a datatable. I thought the seconds_to_period() function from lubridate provides a neat format for printing this data. However, I seem unable to display it within a datatable from DT, which is what I want to do (within a Shiny app).
Some example data:
library(lubridate)
library(DT)
names <- c("A","B","C","D","E","F")
timevec <- c(225,2250,22500,225000,2250000,22500000)
timevec <- seconds_to_period(timevec)
Writing this into a datatable without any formatting does not work as it only displays the seconds without considering the minutes/hours etc.:
##### This cuts off at the seconds -> useless
table <- data.frame(name = names, time = timevec)
my_table <- datatable(table)
Formatting the time column with formatDate also doesn't work since it is not a date or POSIXct object. I can print the desired format by typecasting it as a string, but then the sorting of the column doesn't work as it is sorted alphabetically:
##### This prints the period format, but sorting does not work
table <- data.frame(name = names, time = as.character(timevec))
my_table <- datatable(table)
and of course I could just print the total time in seconds, but as I said I find this very unintuitive to read:
##### This prints the seconds -> unintuitive to read
table <- data.frame(name = names, time = as.duration(timevec))
my_table <- datatable(table)
Any ideas on how to achieve this, or alternative suggestions for how to display duration data intuitively?

A solution is to program DT to sort the visible character column by a hidden numeric column via columnDefs:
library(tidyverse)
library(lubridate)
library(DT)

names <- c("A", "B", "C", "D", "E", "F")
timevec_raw <- c(225, 2250, 22500, 225000, 2250000, 22500000)
timevec_period <- seconds_to_period(timevec_raw)

(table <- tibble(
  name = names,
  timenum = timevec_raw,
  timechar = as.character(timevec_period)
))

my_table <- datatable(table,
  options = list(
    columnDefs = list(
      list(
        visible = FALSE, targets = 2  # hide column 2, the numeric one
      ),
      list(
        orderData = c(2),  # the ordering of column 3 comes from hidden column 2
        targets = c(3)
      )
    )
  )
)
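Since this is going into a Shiny app, a minimal sketch of embedding the table in an app (the output id runtime_table is just a placeholder) could look like this:

library(shiny)
library(DT)

ui <- fluidPage(
  DTOutput("runtime_table")
)

server <- function(input, output, session) {
  output$runtime_table <- renderDT(my_table)
}

shinyApp(ui, server)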

Related

Combining the results of nested `tar_map` calls

I am creating a pipeline that accepts an arbitrary number of dataset names, all of which are put through similar cleaning processes. To do this, I am using the targets package, and with the tar_map function from tarchetypes I subject each dataset to a series of tidying and wrangling functions.
My issue now is that one dataset needs to be split into three datasets by a factor (a la split) while the rest should remain untouched. The pipeline would then theoretically move on by processing each dataset individually, including the three 'daughter' datasets.
Here's my best attempt:
library(targets)
library(tarchetypes)
library(tidyverse)

# dir.create("./data")
# tibble(nums = 1:300, groups = rep(letters[1:3], each = 100)) |>
#   write_csv("./data/td1.csv")
# tibble(nums = 301:600, groups = rep(letters[1:3], each = 100)) |>
#   write_csv("./data/td2.csv")
# tibble(nums = 601:900, groups = rep(letters[1:3], each = 100)) |>
#   write_csv("./data/td3.csv")

tar_option_set(
  packages = c("tidyverse")
)

read_data <- function(paths) {
  read_csv(paths)
}

get_group <- function(data, groups) {
  filter(data, groups == groups)
}

do_nothing <- function(data) {
  data
}

list(
  map1 <- tar_map(
    values = tibble(datasets = c("./data/td1.csv", "./data/td2.csv", "./data/td3.csv")),
    tar_target(data, read_data(datasets)),
    map2 <- tar_map(
      values = tibble(groups = c("a", "b", "c")),
      tar_skip(tester, get_group(data, groups), !str_detect(tar_name(), "td3\\.csv$"))
    ),
    tar_target(dn, do_nothing(list(data, tester)))
  )
)
The skipping method is a bit clumsy; I may be thinking about that wrong as well.
I'm obviously combining the code poorly at the end there by putting the targets in a list, but I'm at a loss as to what else to do.
The datasets can't be combined by, say, rbind, since in actuality they are SummarizedExperiment objects.
Any help is appreciated - let me know if any further clarification is needed.
If you know the levels of that factor in advance, you can handle the splitting of that third dataset with a separate tar_map() call similar to what you do now. If you do not know the factor levels in advance, then the splitting needs to be handled with dynamic branching, and I recommend something like tarchetypes::tar_group_by().
I do not think tar_skip() is relevant here, and I recommend removing it.
If you start with physical files (or write physical files) then I strongly suggest you track them with format = "file": https://books.ropensci.org/targets/files.html#external-input-files.
library(targets)
library(tarchetypes)

tar_option_set(packages = "tidyverse")

list(
  tar_map(
    values = list(paths = c("data/td1.csv", "data/td2.csv")),
    tar_target(file, paths, format = "file"),
    tar_target(data, read_csv(file, col_types = cols()))
  ),
  tar_target(file3, "data/td3.csv", format = "file"),
  tar_group_by(data3, read_csv(file3, col_types = cols()), groups),
  tar_target(
    data3_row_counts,
    tibble(group = data3$groups[1], n = nrow(data3)),
    pattern = map(data3)
  )
)
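To run the pipeline and inspect the dynamic branches, a sketch along these lines should work (tar_read() on a pattern target returns the branches combined):

tar_make()                  # builds the static targets and one branch per group of td3.csv
tar_read(data3_row_counts)  # per-group row counts, combined across branches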

How to order a DataTable using a hidden column

I am rather new to R and I am trying to prepare an interactive data table using the DT package. My data contains numeric values, but some of these values are preceded by a < or > sign. What I want is for my data table to allow interactive sorting on the numeric values, regardless of whether there is a < or > sign in front of them. So, for example, >10, <5, 9, >8 should sort to <5, >8, 9, >10.
My initial approach was to duplicate the column containing the numeric values with the < and > signs, remove the < and > signs from the duplicate column, and convert it to numeric values, giving a column with only the numeric values. I would then like to order the data in the table on these numeric values, but triggered by clicking the ordering button of the column containing the values with the < and > signs. Therefore, I want to hide the numeric-only column (since I do not want it to be visible in the table), but somehow link the ordering of the original column to this hidden column.
Here are some example data and a script in which I have already duplicated the column (b to c), removed the < and > signs, and converted the result to numeric values to obtain column c, which I have then hidden:
library(DT)

df <- data.frame(a = 1:5, b = c('10', '5.0', '2.0', '< 1.0', '> 20'), c = c(10, 5, 2, 1, 20))

DT <- DT::datatable(df,
  options = list(
    columnDefs = list(
      list(visible = FALSE, targets = 3)
    )
  )
)
DT
I have not been able to find a way to sort the data in the table on this hidden column c by using the sorting button of column b.
I have found that this should be possible in JavaScript: jQuery DataTables - Ordering dates by hidden column
However, I am not able to figure out how to do the same in R, either by using a suitable function in R, or by providing it in JavaScript using the JS() function.
Could anyone help me with this problem?
Here is a solution using render:
library(DT)

render <- c(
  "function(data, type, row){",
  "  if(type === 'sort'){",
  "    return parseFloat(data.match(/\\d+\\.?\\d+/)[0]);",
  "  }else{",
  "    return data;",
  "  }",
  "}"
)

df <- data.frame(
  a = 1:5,
  b = c('10', '5.0', '2.0', '< 1.0', '> 20')
)

DT <- datatable(df,
  options = list(
    columnDefs = list(
      list(render = JS(render), type = "num", targets = 2)
    )
  )
)
DT
This solution does not require a hidden column.
Here's a way to do it. To get the "sorting key", use order():
library(DT)
library(magrittr)  # for %>%

# df <- data.frame(a = 1:5, b = c('10', '5.0', '2.0', '< 1.0', '> 20'), c = c(10, 5, 2, 1, 20))
df <- data.frame(a = 1:5, b = c('10', '5.0', '2.0', '< 1.0', '> 20'))
df

# ONE APPROACH
df$c <- stringr::str_replace(string = df$b,
                             pattern = "[<>]",
                             replacement = "") %>%
  as.numeric()

# ANOTHER APPROACH
df$c <- gsub("[<>]", "", df$b) %>% as.numeric()

DT::datatable(df[order(df$c), -3], rownames = FALSE)
library(DT)

df <- data.frame(a = 1:5, b = c('10', '5.0', '2.0', '< 1.0', '> 20'), c = c(10, 5, 2, 1, 20))

DT <- DT::datatable(df,
  options = list(
    columnDefs = list(
      list(visible = FALSE, targets = 3),
      list(orderData = 3, targets = 2)
    )
  )
)
DT
Note: This answer is based on this one here, but DT now uses R indexing instead of JS indexing.

R: Conditional Formatting across excel files

I am trying to highlight rows of an excel file based on a match from the columns in a separate excel file. Pretty much, I want to highlight a row in file1 if a cell in that row matches a cell in file2.
I saw the R package "conditionalFormatting" has some of this functionality, but I cannot figure out how to use it.
The pseudo-code, I think, would look something like this:
file1 <- read_excel("file1")
file2 <- read_excel("file2")
conditionalFormatting(file1, sheet = 1, cols = 1:end, rows = 1:22,
rule = "number in file1 is found in a specific column of file 2")
Please let me know if this makes sense or if i need to clarify something.
Thanks!
The conditionalFormatting() function embeds active conditional formatting into the Excel document, but is likely more complicated than you need for a one-time highlight. I'd suggest loading each file into a dataframe, determining which rows contain a matching cell, creating a highlight style (yellow background), loading the file as a workbook object, setting the appropriate rows to the highlight style, and saving the updated workbook object.
The following function is used to determine which rows have a match. The magrittr package provides the %>% pipe and the data.table package provides the transpose() function.
find_matched_rows <- function(df1, df2) {
  require(magrittr)
  require(data.table)

  # the dataframe object treats each column as a list, making it much easier and
  # faster to search via column than row. Transpose the original file1 dataframe
  # to treat the rows as columns.
  df1_transposed <- data.table::transpose(df1)

  # assuming that the location of the match in the second file is irrelevant,
  # unlist the file2 dataframe so that each value in file1 can be searched in a
  # vector
  df2_as_vector <- unlist(df2)

  # determine which columns contain a match. If one or more matches are found,
  # attribute the row as 'TRUE' in the output vector to be used to subset the
  # row numbers
  match_map <- lapply(df1_transposed, FUN = `%in%`, df2_as_vector) %>%
    as.data.frame(stringsAsFactors = FALSE) %>%
    sapply(function(x) sum(x) > 0)

  # make a vector of row numbers using the logical match_map vector to subset
  matched_rows <- seq(1:nrow(df1))[match_map]

  return(matched_rows)
}
The following code loads the data, finds the matched rows, applies the highlight, and saves over the original file1.xlsx. The second tst_df1 and tst_df2 provide an easy way of testing the find_matched_rows() function. As expected, it finds that the 1st and 3rd rows of the first dataframe contain a cell that matches a cell in the second dataframe.
# used to ensure that the correct rows are highlighted. the dataframe does not
# include the header as an independent row, unlike excel.
file1_header_row <- 1
file2_header_row <- 1

tst_df1 <- openxlsx::read.xlsx("./file1.xlsx",
                               startRow = file1_header_row)
tst_df2 <- openxlsx::read.xlsx("./file2.xlsx",
                               startRow = file2_header_row)

# example data for testing
tst_df1 <- data.frame(fname = c("John", "Bob", "Bill"),
                      lname = c("Smith", "Johnson", "Samson"),
                      wage = c(10, 15.23, 137.38),
                      stringsAsFactors = FALSE)
tst_df2 <- data.frame(a = c(10, 34, 284.2),
                      b = c("Billy", "Bill", "Billy-Bob"),
                      c = c("Samson", "Johansson", NA),
                      stringsAsFactors = FALSE)

df_matched_rows <- find_matched_rows(tst_df1, tst_df2)

# any color found in colours() can be used here, or a hex color beginning with "#"
highlight_style <- openxlsx::createStyle(fgFill = "yellow")

file1_wb <- openxlsx::loadWorkbook(file = "./file1.xlsx")

openxlsx::addStyle(wb = file1_wb,
                   sheet = 1,
                   style = highlight_style,
                   rows = file1_header_row + df_matched_rows,
                   cols = 1:ncol(tst_df1),
                   stack = TRUE,
                   gridExpand = TRUE)

openxlsx::saveWorkbook(wb = file1_wb,
                       file = "./file1.xlsx",
                       overwrite = TRUE)
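For reference, if you do want the highlighting to live inside the workbook as an active rule, a minimal sketch of the openxlsx::conditionalFormatting() function mentioned above might look like the following. The sheet, cell range, and rule are placeholders for illustration (the rule here simply highlights rows whose wage in column C exceeds 100, not the cross-file match from the question, which is exactly why the one-time approach above is simpler); note that conditional-formatting styles use bgFill rather than fgFill:

cf_style <- openxlsx::createStyle(bgFill = "yellow")
openxlsx::conditionalFormatting(wb = file1_wb,
                                sheet = 1,
                                cols = 1:ncol(tst_df1),
                                rows = 2:(nrow(tst_df1) + 1),
                                rule = "$C2>100",
                                style = cf_style)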

Format date in Datatable output

library(DT)
seq_dates <- data.frame(dates = as.Date("2017-01-01") + 1:6 * 100)
datatable(seq_dates) %>% formatDate(1, "toDateString")
I get a datatable in the viewer pane displaying dates in the following format: "Mon May 22 2017".
Q: How can I format the date column as "MM-YY"?
If I do,
dplyr::mutate(seq_dates, dates = format(dates, format = "%b-%Y")) %>%
  datatable()
I get the required date format, but in this second case column sorting doesn't work (sorting is alphabetical rather than by date).
P.S. I'm implementing this in Shiny.
In these cases, I think the best solution is to add a dummy column with the dates in the original format and have the dates column sorted according to the values in the DUMMY column. This is quite easily done in DT. Example code below.
library(DT)
library(dplyr)

seq_dates <- data.frame(dates = as.Date("2017-01-01") + 1:6 * 100)

datatable(seq_dates %>% mutate(DUMMY = dates, dates = format(dates, format = "%b-%Y")),
  options = list(
    columnDefs = list(
      list(targets = 1, orderData = 2),
      list(targets = 2, visible = FALSE)
    )
  )
)
For what it's worth (and using formatDate), the best that I can do is as follows:
datatable(seq_dates) %>%
  formatDate(
    columns = 1,
    method = "toLocaleDateString",
    params = list(
      'en-US',
      list(
        year = 'numeric',
        month = 'numeric'
      )
    )
  )
And this yields date values like 4/2017 and 10/2017.
I've tried to find these parameter options (in github and the original datatables documentation) but to no avail. The only example in DT uses the parameters of short, long and numeric.
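I have not verified whether DT passes these through untouched, but the underlying toLocaleDateString options also accept '2-digit', so a sketch along the same lines might get closer to "MM-YY":

datatable(seq_dates) %>%
  formatDate(
    columns = 1,
    method = "toLocaleDateString",
    params = list(
      'en-US',
      list(
        year = '2-digit',
        month = '2-digit'
      )
    )
  )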
Converting "%b-%y" "dates" back to a proper date format is not an easy thing, as far as I can see...
If you're not too attached to displaying the "%b-%y" format, the easy way is to use the "%Y-%m" or "%y-%m" format and the filter will work just fine:
library(DT)

seq_dates <- as.data.frame(seq(Sys.Date() - 100, Sys.Date(), by = "m"))
seq_dates <- format(seq_dates, format = "%y-%m")
datatable(seq_dates)

# resulting in
# 1 2017-02
# 2 2017-03
# 3 2017-04
# 4 2017-05
# or
# 1 17-02
# 2 17-03
# 3 17-04
# 4 17-05
There is a render method that you can use inside the columnDefs of your datatable() call, for example:
datatable(seq_dates,
  options = list(
    columnDefs = list(
      list(targets = c(1), render = JS(
        "function(data, type, row, meta) {",
        "  return type === 'display' ? new Date(data).toLocaleString() : data;",
        "}"
      ))
    )
  )
)

Dynamic Reporting in R

I am looking for help generating an 'rtf' report from R (a data frame).
I am trying to output data with many columns into an 'rtf' file using the following code:
library(rtf)

inp.data <- cbind(ChickWeight, ChickWeight, ChickWeight)

outputFileName <- "test.out"
rtf <- RTF(paste0(".../", outputFileName, ".rtf"), width = 11, height = 8.5, font.size = 10, omi = c(.5, .5, .5, .5))
addTable(rtf, inp.data, row.names = F, NA.string = "-", col.widths = rep(1, 12), header.col.justify = rep("C", 12))
done(rtf)
The problem I face is that some of the columns are getting hidden (as you can see, the last 2 columns are cut off). I am expecting these columns to print on the next page (without reducing the column width).
Can anyone suggest packages/techniques for this scenario?
Thanks
Six years later, there is finally a package that can do exactly what you wanted. It is called reporter (small "r", no "s"). It will wrap columns to the next page if they exceed the available content width.
library(reporter)
library(magrittr)

# Prepare sample data
inp.data <- cbind(ChickWeight, ChickWeight, ChickWeight)

# Make unique column names
nm <- c("weight", "Time", "Chick", "Diet")
nms <- paste0(nm, c(rep(1, 4), rep(2, 4), rep(3, 4)))
names(inp.data) <- nms

# Create table
tbl <- create_table(inp.data) %>%
  column_defaults(width = 1, align = "center")

# Create report and add table to report
rpt <- create_report("test.rtf", output_type = "RTF", missing = "-") %>%
  set_margins(left = .5, right = .5) %>%
  add_content(tbl)

# Write the report
write_report(rpt)
The only thing is that you need unique column names, so I added a bit of code to do that.
If the docx format can replace the rtf format, use the package ReporteRs:
library(ReporteRs)

inp.data <- cbind(ChickWeight, ChickWeight, ChickWeight)

doc = docx()

# uncomment the addSection blocks if you want to change the page
# orientation to landscape
# doc = addSection(doc, landscape = TRUE)
doc = addFlexTable(doc, vanilla.table(inp.data))
# doc = addSection(doc, landscape = FALSE)

writeDoc(doc, file = "inp.data.docx")
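If ReporteRs is not available (it has since been retired in favour of the officer and flextable packages), a rough equivalent sketch using those packages, assuming their current API, could look like this:

library(officer)
library(flextable)

inp.data <- cbind(ChickWeight, ChickWeight, ChickWeight)
names(inp.data) <- make.unique(names(inp.data))  # flextable also wants unique column names

doc <- read_docx()
doc <- body_add_flextable(doc, qflextable(inp.data))
print(doc, target = "inp.data.docx")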
