Indentation in the first column of a flextable object - r

I am building flextable objects to show tables, and sometimes I would like to add one or several indentations in the first column, where I show some rows' names.
Next I share some code to simulate some data and have a reproducible example. The true starting point of my question is ft (Table 1):
library(dplyr)
library(flextable)
# Simulate data
g_A <- expand.grid(x = "A", y = c("A_1", "A_2"), z = c("A_1_a", "A_1_b", "A_2_a", "A_2_b"))
g_B <- expand.grid(x = "B", y = c("B_1", "B_2"), z = c("B_1_a", "B_1_b", "B_2_a", "B_2_b"))
g <- rbind(g_A, g_B)
n <- 123
set.seed(1)
df <- sample_n(g, n, replace = TRUE)
# Build table
tmp <- c(table(df$x)[1],
table(df$y)[1],
table(df$z)[1:2],
table(df$y)[2],
table(df$z)[3:4],
table(df$x)[2],
table(df$y)[3],
table(df$z)[5:6],
table(df$y)[4],
table(df$z)[7:8])
my_tab <- data.frame("tmp" = names(tmp), "counts" = tmp, "percentages" = round(tmp/n*100, 2))
# flextable operations
ft <- flextable(my_tab)
ft <- set_header_labels(ft, tmp = "")
ft <- align(ft, align = "center")
ft <- align(ft, j = 1, align = "left")
# ft
Now, I would like to indent some names in the first column. For example, to indent A_1 I have tried the following strategies:
compose(ft, i = 2, j = 1, as_paragraph(" A_1"))
compose(ft, i = 2, j = 1, as_paragraph("\t A_1"))
# Or
# colformat_char(ft, i = 2, j = 1, prefix = " ")
# colformat_char(ft, i = 2, j = 1, prefix = "\t")
But they don't work (the result is the same as in Table 1). A "second best" strategy could be the following one (Table 2):
compose(ft, i = 2, j = 1, as_paragraph("- A_1"))
# Or
# colformat_char(ft, i = 2, j = 1, prefix = "- ")
However, I would like a proper indentation.
Finally, I share Table 3, my expected final result, with an indentation in place of each "-".
Waiting for your insights!
Ciao

To indent cells in a flextable, you can use the padding() function:
ft <- padding(ft, i=2, j=1, padding.left=20)
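To indent several rows at different depths, as in the expected Table 3, the same call can be repeated once per nesting level. The following is only a minimal sketch: toy and level are hypothetical stand-ins (a cut-down version of my_tab and an assumed hierarchy depth per row), and the step of 20 simply reuses the value from above.

```r
library(flextable)

# Toy stand-in for my_tab (hypothetical data, not the question's counts)
toy <- data.frame(
  label  = c("A", "A_1", "A_1_a", "A_1_b", "B", "B_1"),
  counts = c(60, 30, 14, 16, 63, 31)
)
# Assumed nesting depth of each row: 0 = top level, 1 = first indent, ...
level <- c(0, 1, 2, 2, 0, 1)

ft_toy <- flextable(toy)
ft_toy <- align(ft_toy, j = 1, align = "left")

# Add 20 more units of left padding for every nesting level
for (lv in setdiff(unique(level), 0)) {
  ft_toy <- padding(ft_toy, i = which(level == lv), j = 1,
                    padding.left = 20 * lv)
}
```

Printing ft_toy should then show the first column with stepped indentation.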

Related

Unlist LAST level of a list in R

I have a list of list like ll:
ll <- list(a = list(data.frame(c = 1, d = 2), data.frame(h = 3, j = 4)), b = list(data.frame(c = 5, d = 6), data.frame(h = 7, j = 9)))
I want to unnest/unlist the last level of the structure (the interior list). Note that every list contains the same structure. I want to obtain lj:
lj <- list(a = (data.frame(c = 1, d = 2, h = 3, j = 4)), b = data.frame(c = 5, d = 6, h = 7, j = 9))
I have tried the following code without any success:
lj_not_success <- unlist(ll, recursive = F)
However, this code unlists the FIRST level, not the LAST one.
Any clue?
We may need to cbind the inner list elements instead of unlisting, as the expected output is also a list of data.frames
ll_new <- lapply(ll, function(x) do.call(cbind, x))
Checking:
> identical(lj, ll_new)
[1] TRUE

Conditional statement: change one variable in a data list based on certain input

Can I use conditional statement to change one variable in a data list based on certain input?
For instance, take the data list below. I need d to be either perd or phyd depending on the input: with dlist[x], d = perd; with dlist[y], d = phyd. x and y can be anything; I just need a way to give a selector and get perd or phyd.
perd <- c(1, 2, 3)
phyd <- c(4, 5, 6)
dlist <- list(
Nsubjects = 1,
Ntrials = 2,
d = perd
)
Can you create another named list to store perd and phyd?
plist <- list(x = c (1,2,3), y = c (4,5,6))
You can then extract the data from it by its name.
val <- 'x'
dlist <- list(
Nsubjects = 1,
Ntrials = 2,
d = plist[[val]]
)
Without creating plist, you can do:
list(
Nsubjects = 1,
Ntrials = 2,
d = if(val == 1) c(1,2,3) else c(4,5,6)
)
Or also:
list(
Nsubjects = 1,
Ntrials = 2,
d = list(c(1,2,3),c(4,5,6))[[val]]
)
where val is 1 or 2.
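A variant of the same idea can be sketched with base R's switch(), which selects by name and lets you fail loudly on an unexpected input. make_dlist is a hypothetical helper name used only for illustration; perd and phyd are the vectors from the question.

```r
perd <- c(1, 2, 3)
phyd <- c(4, 5, 6)

# Hypothetical helper: build the list, choosing d from a character selector
make_dlist <- function(input) {
  list(
    Nsubjects = 1,
    Ntrials = 2,
    # Named arguments are matched against input; the unnamed one is the default
    d = switch(input,
               x = perd,
               y = phyd,
               stop("unknown input: ", input))
  )
}

dlist_x <- make_dlist("x")  # d is perd
dlist_y <- make_dlist("y")  # d is phyd
```

Only the selected branch is evaluated, so the stop() call runs only when input matches neither name.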

ggarrange generates an empty pdf file

I am working on a function that takes a big data frame (36 rows and 194 columns), performs a Principal Component Analysis, and then generates a list of plots, one for each pairwise combination of 26 Principal Components (325 in total), using 'expand.grid'.
My problem is that when I am using ggarrange(), from ggpubr, to merge all the plots in only one pdf file, this file is empty.
My code:
a = 26
row.pairs = 325
PC.Graph <- function(df, col1, col2, tag, id){
df1 <- df[,-c(col1:col2)]
pca <- prcomp(df1, scale. = T)
pc.summ <- summary(pca)
a <- sum(pc.summ$importance[3,] < 0.975)
b <- c(1:a)
pc.grid <- expand.grid(b, b)
pc.pairs <- pc.grid[pc.grid$Var1 < pc.grid$Var2,]
row.pairs <- nrow(pc.pairs)
components <- c(1:row.pairs)
S.apply.FUN <- function(x){
c <- sapply(pc.pairs, "[", x, simplify = F)
pcx <- c$Var1
pcy <- c$Var2
df2 <- df
row.names(df2) <- df[, tag]
name = paste("PCA_", pcx, "_vs_", pcy)
autoplot(pca, data = df2, colour = id, label = T, label.repel = T, main = name,
x = pcx, y = pcy)
}
all.plots <- Map(S.apply.FUN, components)
pdf(file = "All_PC.pdf", width = 50, height = 70)
print(ggarrange(all.plots))
dev.off()
}
PC.Graph(Final_DF, col1 = 1, col2 = 5, tag = "Sample", id = "Maturation")
You would have to pass a plotlist to ggarrange, but I am not sure you would get any useful plot out of that plot area in the PDF file, so I would advise you to split the plotlist into chunks (e.g. of 20) and plot these to multiple pages.
Specifically, I would return all.plots from your PC.Graph function (and remove the code that writes the PDF there).
I would also change the expand.grid(b, b) to t(combn(b, 2)), since you don't need to plot the PC combinations twice.
Then I would do something like this:
# export the full list of plots
plots <- PC.Graph(Final_DF, col1 = 1, col2 = 5, tag = "Sample", id = "Maturation")
# split the plotlist
splitPlots <- split(plots, ceiling(seq_along(plots)/20))
plotPlots <- function(x){
out <- cowplot::plot_grid(plotlist = x, ncol = 5, nrow = 4)
plot(out)
}
pdf(file = "All_PC.pdf", width = 50, height = 45)
lapply(splitPlots, plotPlots)
dev.off()
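If you prefer to stay with ggpubr, the plotlist route mentioned at the start might look like the sketch below. toy_plots is a hypothetical stand-in for all.plots, built from the built-in mtcars data. It also assumes that ggarrange() spreads the plots over several pages when both ncol and nrow are set and there are more plots than one page holds, which is my understanding of its documented behaviour.

```r
library(ggplot2)
library(ggpubr)

# Hypothetical stand-in for all.plots: six small scatter plots
toy_plots <- lapply(1:6, function(k) {
  ggplot(mtcars, aes(wt, mpg)) +
    geom_point() +
    ggtitle(paste("Plot", k))
})

# The list must be passed by name via plotlist, not as the first argument.
# Six plots on a 2 x 2 grid do not fit on one page, so a list of pages
# is expected back.
pages <- ggarrange(plotlist = toy_plots, ncol = 2, nrow = 2)

out_file <- tempfile(fileext = ".pdf")
pdf(file = out_file, width = 10, height = 10)
invisible(lapply(pages, print))  # one PDF page per element
dev.off()
```

Note that if everything fits on a single page, ggarrange() returns one plot rather than a list, so the lapply() step would need adjusting in that case.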

Creating subplot (facets) with custom x,y position of the subplots in ggplot2

How can we customize the position of the panels/subplots in ggplot2?
Concretely, I have a grouped time series and I want to produce one subplot per time series with custom subplot positions, not necessarily in a grid.
The facet_grid() and facet_wrap() functions do not provide full customization of the panel positions, as they use a grid.
library(tidyverse)
df = data.frame(group = LETTERS[1:5],
x = c(1,2,3,1.5,2.5),
y =c(2,1,2,3,3),
stringsAsFactors = F)%>%
group_by(group)%>%
expand_grid(time = 1:20)%>%
ungroup()%>%
mutate(dv = rnorm(n()))%>%
arrange(group,time)
## plot in grid
df%>%
ggplot()+
geom_line(aes(x=time,y=dv))+
facet_grid(~group)
## plot with custom x, y position
## Is there an equivalent of facet_custom()?
df%>%
ggplot()+
geom_line(aes(x=time,y=dv))+
facet_custom(~group, x.subplot = x, y.subplot = y)
FYI: This dataset is only an example. My data are EEG data where each group represents an electrode (up to 64), and I want to plot the EEG signal of each electrode according to the position of the electrode on the head.
Well, I guess this would not really be a 'facet plot' any more. I therefore don't think there is a specific function out there.
But you can use the fantastic patchwork package for that, in particular the layout option in wrap_plots.
As the main package author Thomas describes in the vignette, the below option using area() may be a bit verbose, but it would give you full programmatic options about positioning all your plots.
library(tidyverse)
library(patchwork)
mydf <- data.frame(
group = LETTERS[1:5],
x = c(1, 2, 3, 1.5, 2.5),
y = c(2, 1, 2, 3, 3),
stringsAsFactors = F
) %>%
group_by(group) %>%
expand_grid(time = 1:20) %>%
ungroup() %>%
mutate(dv = rnorm(n())) %>%
arrange(group, time)
## plot in grid
mylist <-
mydf %>%
split(., .$group)
p_list <-
map(1:length(mylist), function(i){
ggplot(mylist[[i]]) +
geom_line(aes(x = time, y = dv)) +
ggtitle(names(mylist)[i])
}
)
layout <- c(
area(t = 1, l = 1, b = 2, r = 2),
area(t = 2, l = 3, b = 3, r = 4),
area(t = 3, l = 5, b = 4, r = 6),
area(t = 4, l = 3, b = 5, r = 4),
area(t = 5, l = 1, b = 6, r = 2)
)
wrap_plots(p_list, design = layout)
#> result not shown, it's the same as below
For a more programmatic approach, one option is to create the required "patch_area" object manually.
t = 1:5
b = t+1
l = c(1,3,5,3,1)
r = l+1
list_area <- list(t = t, b = b, l = l, r = r)
class(list_area) <- "patch_area"
wrap_plots(p_list, design = list_area)
Created on 2020-04-22 by the reprex package (v0.3.0)
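To tie the layout to the x and y columns from the question instead of typing the areas by hand, the coordinates can be converted to grid cells first. This is only a sketch under stated assumptions: positions are assumed to lie on a half-unit lattice (as in the example data), each panel is assumed to span two rows and two columns, and pos, l, r, t, b and layout_xy are names introduced here for illustration.

```r
library(patchwork)

# One row per group, with the subplot positions from the question
pos <- data.frame(
  group = LETTERS[1:5],
  x = c(1, 2, 3, 1.5, 2.5),
  y = c(2, 1, 2, 3, 3)
)

# Map half-unit x steps to integer columns; each panel is two columns wide
l <- round(2 * pos$x) - 1
r <- l + 1
# Row 1 is the top of the layout, so the largest y goes first
t <- round(2 * (max(pos$y) - pos$y)) + 1
b <- t + 1

# Build one area() per group and combine them as in the manual layout above
layout_xy <- do.call(c, unname(Map(area, t = t, l = l, b = b, r = r)))
```

The result can then be passed on as wrap_plots(p_list, design = layout_xy), provided the plots in p_list are in the same order as the rows of pos.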

apply statement to sample columns, across rows of different lengths

I'm trying to write a simple R function to sample 5-element substrings across two columns of a single data frame. The lengths of the strings are equal within each row, but they differ down the columns. The function works when I specify a row and col to act on, but I can't get the apply statement to work on each row and each column. As written, it will only pull random samples based on the length of the first instance, so if the first instance is shorter than any of the other strings, the output for the other rows is sometimes fewer than 5 elements.
example df:
BP TF
1 CGTCTCTATTCTAGGCAAGA TTTFFFFTFFFTFFFTFTTT
2 AAGTCACTCGAATTCGGATGCCCCCTAGGC TTFFFFFTFFFFTTFTFFTTTFTTTTFTFF
3 TGCTCATGACGGGAC FFFTFTFFFFTFTFT
'intended output:'
1 CTATT FFTFF
2 CCTAG TTTFT
3 TCATG TFTFF
'reproducible example code:'
#make fake data frame
BaseP1 <- paste(sample(size = 20, x = c("A","C","T","G"), replace = TRUE), collapse = "")
BaseP2 <- paste(sample(size = 30, x = c("A","C","T","G"), replace = TRUE), collapse = "")
BaseP3 <- paste(sample(size = 15, x = c("A","C","T","G"), replace = TRUE), collapse = "")
TrueFalse1 <- paste(sample(size = 20, x = c("T","F"), replace = TRUE), collapse = "")
TrueFalse2 <- paste(sample(size = 30, x = c("T","F"), replace = TRUE), collapse = "")
TrueFalse3 <- paste(sample(size = 15, x = c("T","F"), replace = TRUE), collapse = "")
my_df <- data.frame(c(BaseP1,BaseP2,BaseP3), c(TrueFalse1, TrueFalse2, TrueFalse3))
Fragment = function(string) {
nStart = sample(1:nchar(string) -5, 1)
substr(string, nStart, nStart + 4)
}
Fragment(string = my_df[1,1])#works for the first row, first col.
but this does not work:
apply(my_df, c(1,2), function(x) Fragment(string = my_df[1:nrow(my_df),1:ncol(my_df)]))
There was an error in your function:
Fragment = function(string) {
nStart = sample(1:(nchar(string) -5), 1)
substr(string, nStart, nStart + 4)
}
It was missing parentheses around nchar(string) - 5. Because : binds more tightly than -, 1:nchar(string) - 5 is evaluated as (1:nchar(string)) - 5, which made the subsetting go wrong.
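The difference is easy to see on a small case. The snippet below is just an illustration with an assumed length of 20 and a made-up string; it is not part of the original function.

```r
n <- 20  # assumed string length, for illustration only

# Without parentheses: the whole sequence 1:20 is shifted down by 5
starts_wrong <- 1:n - 5      # runs from -4 to 15
# With parentheses: the sequence stops at 15
starts_right <- 1:(n - 5)    # runs from 1 to 15

# A start below 1 makes substr() return fewer than 5 characters
short <- substr("ABCDEFGH", -2, 2)   # "AB"
```

With the unparenthesized version, any sampled start of 0 or less yields such a truncated fragment.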
You can then simply use apply(my_df, c(1,2), Fragment) as suggested in the comments.
To show that this works now:
for(i in 1:10000){
stopifnot(all(5 == sapply(apply(my_df, c(1,2), Fragment), nchar)))
}
This shows that in 10000 tries, it always produced 5 characters as output.
