Knitr kable/pandoc/pander/geom_title/grid.table custom cell formatting - r

I would like to add symbols and letters before and after some numbers when using knitr's kable function, but I do not know how to do this efficiently. I am also willing to consider pandoc/pander if it is better or more efficient.
The end result should be an HTML table...or very good graphic of one....
Please see the following code as a mock reproducible example that is in a .Rmd file:
### Notional and Cumulative P&L
```{r echo=FALSE}
Notional <- 10000
yday_pnl <- -2942
wtd_pnl <- 2300
mtd_pnl <- -3334
ytd_pnl <- 5024
yday_rtn <- (yday_pnl/Notional)*10000
wtd_rtn <- (wtd_pnl/Notional)*10000
mtd_rtn <- (mtd_pnl/Notional)*10000
ytd_rtn <- (ytd_pnl/Notional)*10000
Value <- c(Notional,yday_pnl,wtd_pnl,mtd_pnl,ytd_pnl)
rtn <- c(NA,yday_rtn,wtd_rtn,mtd_rtn,ytd_rtn)
COB.basics <- as.data.frame(cbind(Value,rtn))
rownames(COB.basics) <- c('Notional','yday pnl','wtd_pnl','mtd_pnl','ytd_pnl')
```
```{r results='asis',echo=FALSE}
knitr::kable(COB.basics, digits = 2)
```
Similar to Excel's currency or accounting formats, I would like the Value column to show a $ sign, and the rtn column to show the string bps after the numbers. For readability, is it also possible to insert a comma after every three digits before the decimal point, i.e. as a thousands separator?
Also, is it possible to colour the cells, and to colour the text/numbers too, e.g. red for negative values?

Partial solution with pander:
Set "big mark" for pander so that it would be used for all numbers:
panderOptions('big.mark', ',')
You can also set the table syntax to rmarkdown (optional, as rmarkdown v2 now also uses Pandoc, where the multiline format has some cool features compared to what the rmarkdown format offered before):
panderOptions('table.style', 'rmarkdown')
You can highlight some cells with e.g. which and some custom R expression:
emphasize.strong.cells(which(COB.basics > 0, arr.ind = TRUE))
Simply call pander on your data.frame:
> library(pander)
> emphasize.strong.cells(which(COB.basics > 0, arr.ind = TRUE))
> panderOptions('big.mark', ',')
> pander(COB.basics)
-----------------------------------
                 Value       rtn
-------------- ---------- ---------
 **Notional**  **10,000**    NA
 **yday pnl**    -2,942    -2,942
 **wtd_pnl**   **2,300**  **2,300**
 **mtd_pnl**     -3,334    -3,334
 **ytd_pnl**   **5,024**  **5,024**
-----------------------------------
> panderOptions('table.style', 'rmarkdown')
> pander(COB.basics)
| | Value | rtn |
|:--------------:|:-------:|:------:|
| **Notional** | 10,000 | NA |
| **yday pnl** | -2,942 | -2,942 |
| **wtd_pnl** | 2,300 | 2,300 |
| **mtd_pnl** | -3,334 | -3,334 |
| **ytd_pnl** | 5,024 | 5,024 |
To color the cells, you could add some custom HTML/CSS markup manually (or LaTeX if you are working with PDF in the long run). The same holds for adding % or other symbols/strings to your cells, e.g. with paste and apply. Please feel free to submit a feature request at https://github.com/Rapporter/pander
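The manual approach mentioned above (paste plus some HTML markup) could look roughly like the sketch below. It uses only base R to build a character data frame; `fmt_num` and `red_if_negative` are hypothetical helper names made up for this illustration, and the red `<span>` only has an effect when the final output is HTML.

```r
# Mock data from the question
Value <- c(10000, -2942, 2300, -3334, 5024)
rtn   <- c(NA, Value[-1] / Value[1] * 10000)

# Format numbers with a thousands separator; NA becomes an empty string.
# `prefix` and `suffix` are illustrative arguments, not part of any package.
fmt_num <- function(x, prefix = "", suffix = "") {
  out <- rep("", length(x))
  ok <- !is.na(x)
  sgn <- ifelse(x[ok] < 0, "-", "")
  digits_txt <- formatC(abs(x[ok]), format = "f", digits = 0, big.mark = ",")
  out[ok] <- paste0(sgn, prefix, digits_txt, suffix)
  out
}

# Wrap negative entries in a red <span> (assumes HTML output).
red_if_negative <- function(txt, x) {
  neg <- !is.na(x) & x < 0
  txt[neg] <- sprintf('<span style="color:red">%s</span>', txt[neg])
  txt
}

COB.fmt <- data.frame(
  Value = red_if_negative(fmt_num(Value, prefix = "$"), Value),
  rtn   = red_if_negative(fmt_num(rtn, suffix = " bps"), rtn),
  row.names = c("Notional", "yday pnl", "wtd_pnl", "mtd_pnl", "ytd_pnl"),
  stringsAsFactors = FALSE
)
print(COB.fmt)

# In the .Rmd chunk (results='asis'), assuming knitr is installed:
# knitr::kable(COB.fmt, format = "html", escape = FALSE)
```

Because the cells are already strings, the table function no longer needs to know anything about currency or basis points; `escape = FALSE` is what lets the `<span>` markup through to the HTML output.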

R dataframe : find list of elements in column 1 of corresponding to items in column 2

Say that I have a dataframe
xy.df <- data.frame(x = runif(10), y = runif(10))
What I want to do is:
Create a list of non-redundant items in column 1
For each item in this list (items in column 1), identify the list of corresponding items in column 2
I have tried some tests with dplyr but I still don't get it!
df = xy.df %>% group_by(xy.df$x)
Any help would be appreciated.
Try this:
Your data.frame:
db<-data.frame(idProcess=c("5aa78","5aa78","9a978"),
ip=c("128.55.12.81","128.55.12.81","130.50.12.99"),
port=c(9265,59264,63925))
Building your output (this is not the most efficient way, but it is clear what I'm doing):
# use `result` rather than `list`, to avoid shadowing base::list
result <- list()
id_unique <- as.character(unique(db$idProcess))
for (id in id_unique)
{
  # rows belonging to the current process
  rows <- as.character(db$idProcess) == id
  ip_i <- unique(as.character(db[rows, "ip"]))
  port_i <- unique(as.character(db[rows, "port"]))
  result[[id]] <- c(ip_i, port_i)
}
Your output:
result
$`5aa78`
[1] "128.55.12.81" "9265" "59264"
$`9a978`
[1] "130.50.12.99" "63925"
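A shorter variant of the loop above, as a sketch using base R's split and lapply. It assumes the same db columns as in the answer and returns the same kind of named list; `result` is just an illustrative name.

```r
db <- data.frame(
  idProcess = c("5aa78", "5aa78", "9a978"),
  ip = c("128.55.12.81", "128.55.12.81", "130.50.12.99"),
  port = c(9265, 59264, 63925),
  stringsAsFactors = FALSE
)

# Split the rows by process id, then collect the unique ip and port
# values of each group into one character vector.
result <- lapply(split(db, db$idProcess), function(g) {
  c(unique(as.character(g$ip)), unique(as.character(g$port)))
})
str(result)
```

The list names come from the idProcess values, so a single process can be looked up with `result[["5aa78"]]`.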
Sorry, I wanted to simplify my problem with the previous examples, so here is a small example of the data frame:
idProcess | ip | port
5aa78 | 128.55.12.81 | 9265
5aa78 | 128.55.12.81 | 59264
9a978 | 130.50.12.99 | 63925
.....
So what I want is a list of lists, where each entry in the global list is named after a process and holds, in one list, the non-redundant IPs and non-redundant ports for that process, i.e.
List["5aa78"]=(128.55.12.81, 9265 , 59264)
List["9a978"]=( 130.50.12.99 , 63925)
....
thanks

Pander formats tables weirdly when using significance stars and pandoc

If I run a linear regression with significance stars, render it through pander, and "Knit PDF", such as this:
pander(lm(crimerate ~ conscripted + birthyr + indigenous + naturalized, data = data), add.significance.stars = TRUE)
I occasionally get output where there are weird spacing issues between rows in the output table.
I've tried setting pander options to report fewer digits panderOptions('digits', 2), but the problem persists.
Does anybody have any ideas?
I had the same problem. Something is wrong with the cell alignment; the error disappeared when I changed the style to rmarkdown.
library(data.table)
library(pander)
dt <- data.table(Test = c("0 - 10 000"),
ALDT = "99.18 %")
First (spacing problem in the table):
pandoc.table(dt, justify = c("left", "right"))
# From pandoc below
------------------
Test ALDT
---------- -------
0 - 10 000 99.18 %
------------------
Second (good formatting):
pandoc.table(dt, style = "rmarkdown", justify = c("left", "right"))
# From pandoc below
| Test | ALDT |
|:--------------|--------:|
| 0 - 10 000 | 99.18 % |
The first attempt doesn't work; something is wrong with the formatting pandoc gives us. But if you specify the style as rmarkdown, the formatting comes out as it should.

writing a data.frame using cat

How can I add/append the data.frame abc to a text file that I have opened previously? I am writing some important information to that file, and then I want to append the data.frame below that information. I get an error when I try to write the data.frame abc using cat.
fileConn<-file("metadata.txt","w+")
smoke <- matrix(c(51,43,22,92,28,21,68,22,9),ncol=3,byrow=TRUE)
smoke <- as.data.frame(smoke)
table <- sapply (smoke, class)
abc <- data.frame(nm = names(smoke), cl = sapply(unname(smoke), class))
cat("some imp info","\n", file=fileConn)
cat(abc,"\n", file=fileConn)
close(fileConn)
class(abc)
Just use the standard tools for writing data frames, i.e. write.table:
write.table(abc, 'yourfile', append=TRUE) # plus whatever additional params
Try this:
capture.output(abc, file = fileConn)
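To make the two suggestions above concrete, here is a minimal sketch that writes the note with cat and then the data frame with write.table through the same open connection. It writes to a temporary file instead of metadata.txt, and the tab separator and `row.names = FALSE` are choices made for this illustration, not something the question requires.

```r
smoke <- as.data.frame(matrix(c(51, 43, 22, 92, 28, 21, 68, 22, 9),
                              ncol = 3, byrow = TRUE))
abc <- data.frame(nm = names(smoke), cl = sapply(smoke, class),
                  row.names = NULL)

path <- tempfile(fileext = ".txt")  # stand-in for "metadata.txt"
con <- file(path, "w")

# cat() handles plain text; a data frame needs a table writer instead
cat("some imp info\n", file = con)
write.table(abc, file = con, row.names = FALSE, quote = FALSE, sep = "\t")
close(con)

lines <- readLines(path)
print(lines)
```

Since the connection stays open between the two calls, the table lands directly below the note without needing `append = TRUE`.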
To make sure the output is readable, you could also use knitr::kable(). This will print your table as character, which has the advantage that you can embed it directly within the cat() call. It also has several printing options (digits, align, row.names, etc.) that make it easy to control how your table is printed:
tab <- knitr::kable(head(swiss))
cat("This is my file:",
"Some important note about it",
tab,
sep="\n")
#> This is my file:
#> Some important note about it
#> | | Fertility| Agriculture| Examination| Education| Catholic| Infant.Mortality|
#> |:------------|---------:|-----------:|-----------:|---------:|--------:|----------------:|
#> |Courtelary | 80.2| 17.0| 15| 12| 9.96| 22.2|
#> |Delemont | 83.1| 45.1| 6| 9| 84.84| 22.2|
#> |Franches-Mnt | 92.5| 39.7| 5| 5| 93.40| 20.2|
#> |Moutier | 85.8| 36.5| 12| 7| 33.77| 20.3|
#> |Neuveville | 76.9| 43.5| 17| 15| 5.16| 20.6|
#> |Porrentruy | 76.1| 35.3| 9| 7| 90.57| 26.6|

ascii package printing unnecessary characters

Running the following example
require(ascii)
mat <- matrix(c(1,11,2,12),nrow=2)
rownames(mat)<-letters[1:2]
colnames(mat)<-letters[11:12]
tab<- as.table(mat)
ascitab <- ascii(tab,digits=0,align="r")
print(ascitab,type="t2t")
produces the following output:
|| | k | l
| a | 1 | 2
| b | 11 | 12
Warning messages:
1: In rep(rownames, length = nrow(x)) :
'x' is NULL so the result will be NULL
2: In rep(colnames, length = ncol(x)) :
'x' is NULL so the result will be NULL
There are 2 issues:
The double vertical bar at the very beginning is incorrect, it should only be one bar.
And the warnings are very strange. The table has a sensible format.
Changing the print statement to
print(ascitab,type="t2t",include.rownames=TRUE,include.colnames=TRUE)
does not solve the problem.
Can anybody help?
I know a clumsy solution which includes capturing and postprocessing the output,
but I would like to see a clean solution.

change data frame so thousands are separated by dots

At the moment, I'm working with RMarkdown and Pandoc. My data.frames in R look like this:
3.538e+01 3.542e+01 3.540e+01
9.583e+00 9.406e+00 9.494e+00
2.601e+05 2.712e+05 5.313e+05
After I ran pandoc, the result looks like this:
35.380 35.420 35.400
9.583 9.406 9.494
260116.000 271217.000 531333.000
What it should look like is:
35,380 35,420 35,400
9,583 9,406 9,494
260.116 271.217 531.333
So I want commas instead of dots as the decimal mark, and I want no comma or dot after 260116 (the numbers in the thousands). Dots to separate the thousands would be nice. Is there a way to change the appearance directly in R, or do I have to set options in knitr/markdown?
Thanks
Here's an example of some of the conversions that can be done with format():
x <- c(35.38, 35.42, 35.4, 9.583, 9.406, 9.494, 260100, 271200, 531300)
format(x, decimal.mark=",", big.mark=".", scientific=FALSE)
# [1] "     35,380" "     35,420" "     35,400" "      9,583" "      9,406"
# [6] "      9,494" "260.100,000" "271.200,000" "531.300,000"
There are several other options, such as trim, justify, and so on that might be of interest in getting your output ready for pandoc.
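If the goal is the exact layout from the question (three decimals for small values, none for the large ones), one possible sketch is to format each value with formatC and pick the number of decimals by magnitude. The helper name `fmt_de` and the cut-off of 1000 are assumptions made for this illustration; the input values are taken from the question's pandoc output.

```r
x <- c(35.38, 35.42, 35.4, 9.583, 9.406, 9.494, 260116, 271217, 531333)

# Hypothetical helper: values at or above `cutoff` get no decimals,
# smaller values keep three. The cut-off itself is an assumption.
fmt_de <- function(x, cutoff = 1000) {
  small <- formatC(x, format = "f", digits = 3,
                   big.mark = ".", decimal.mark = ",")
  large <- formatC(x, format = "f", digits = 0,
                   big.mark = ".", decimal.mark = ",")
  ifelse(abs(x) >= cutoff, large, small)
}

fmt_de(x)
```

Unlike format(), formatC() does not pad the results to a common width by default, so the strings come back without leading spaces.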
As this question was really inspiring, I recently introduced the big.mark feature in my pander package, which can return markdown formatted tables from R objects with predefined options (building on format, by the way). Small demo:
Load the package (installed from GH until this feature gets to CRAN):
> library(pander)
Create a demo data.frame:
> x <- matrix(c(35.38, 35.42, 35.4, 9.583, 9.406, 9.494, 260100, 271200, 531300), 3, byrow = TRUE)
Set your default options: (values for US context may need to be switched)
> panderOptions('decimal.mark', ',')
> panderOptions('big.mark', '.')
Let pander do the rest:
> pander(x)
------- ------- -------
35,38 35,42 35,4
9,583 9,406 9,494
260.100 271.200 531.300
------- ------- -------
You can find and use even more options in pander (like the markdown syntax for the table).
