ggpattern and swimplot: issue with aligning patterns and repeating x-axis values - r

I am having a hard time figuring out how to fix this graph. I thought it was a bug in ggpattern (see here and the response to my bug report here); however, others think it is not a bug but an issue with "overlapping columns". When no pattern values repeat (e.g., "NA", "S", "V+S"), the x-axis and patterns align, but when some repeat, as in the code below, the problems occur. I do not know how to make these two packages work together to produce a graph with the correct patterns and x-axis values; both are consistently wrong. Thanks!
library(swimplot)
library(ggpattern)
library(tidyverse)
df <- data.frame(
study_id = c(3, 3, 3, 3), primary_therapy = c("Si", "Si", "Si", "Si"),
additional_therapy = c("NA", "S", "NA", "V+S"), end_yr = c(0.08, 0.39, 3.03, 3.4))
df <- df %>% mutate(additional_therapy = factor(additional_therapy,
levels = c("S", "V", "V+S", "NA")))
swimmer_plot(
df = df, id = "study_id",
end = "end_yr", name_fill = "primary_therapy",
width = 0.85, color = NA) +
geom_col_pattern(aes(study_id, end_yr,
pattern = additional_therapy), color=NA,
fill = NA,
show.legend=FALSE, width=0.85,
pattern_spacing = 0.01, pattern_fill="black", pattern_color=NA) +
scale_pattern_manual(name="Additional Therapy", values = c("S"="stripe","V"="circle","V+S"="crosshatch","NA"="none"),
breaks=c("S","V","V+S"))
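One possible reading of the "overlapping columns" explanation, offered as an assumption rather than a confirmed diagnosis: end_yr holds cumulative end times, so a stacking column geom that receives those values directly would add them together instead of drawing back-to-back segments. The base R sketch below derives a start, a duration, and a unique segment id for each row; the column names start_yr, duration, and segment are made up for illustration.

```r
# Same values as in the question; end_yr is treated as a cumulative end time
df <- data.frame(
  study_id = c(3, 3, 3, 3),
  additional_therapy = c("NA", "S", "NA", "V+S"),
  end_yr = c(0.08, 0.39, 3.03, 3.4)
)

# Order segments within each subject by their end time
df <- df[order(df$study_id, df$end_yr), ]

# Each segment starts where the previous one for the same subject ended
df$start_yr <- ave(df$end_yr, df$study_id,
                   FUN = function(e) c(0, head(e, -1)))

# Length of each segment, and a unique id so repeated levels stay separate
df$duration <- df$end_yr - df$start_yr
df$segment <- seq_len(nrow(df))

print(df)

# Stacking the raw end points would give a bar this long ...
sum(df$end_yr)
# ... whereas the durations add up to the last end point
sum(df$duration)
```

If this reading is right, mapping start_yr and end_yr to the edges of a rectangle geom, or stacking duration with segment as the grouping variable, might keep repeated "NA" segments from being merged. That part is untested here.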

Related

ggpattern missing pattern after na values

Using ggpattern, how can I not show a pattern for "NA" values but then have patterns continue after that point? When I run this code, there are no more patterns after an "NA" value even though there should be.
library(swimplot)
library(ggpattern)
library(tidyverse)
df <- data.frame(
study_id = c(3, 3, 3,3), primary_therapy = c("Si", "Si", "Si", "Si"),
additional_therapy = c("NA", "S", "NA", "V+S"), end_yr = c(0.08, 0.39, 3.03, 3.4)
)
swimmer_plot(
df = df, id = "study_id",
end = "end_yr", name_fill = "primary_therapy",
width = 0.85, color = NA) +
geom_col_pattern(aes(study_id, end_yr,
pattern = additional_therapy), color=NA,
fill = NA,
show.legend=FALSE, width=0.85,
pattern_spacing = 0.01, pattern_fill="black", pattern_color=NA,
pattern_size = 0.5, pattern_density=0.1,
pattern_linetype = 0.5, pattern_orientation="vertical") +
scale_pattern_manual(name="Additional Therapy", values = c("S"="stripe","V"="circle","V+S"="crosshatch","NA"="none"))
Your question has two parts: 1) controlling the legend labels and 2) fixing the x-axis for geom_col_pattern().
For your first question, you can remove "NA" from the legend by converting additional_therapy to a factor:
df <- df %>%
mutate(additional_therapy = factor(additional_therapy,
levels = c("S", "V", "V+S", "NA")))
Then, you can specify breaks to control which legend labels show up:
+ scale_pattern_manual(name="Additional Therapy", breaks = c("S", "V", "V+S"), values = c("S"="stripe", "V"="circle", "V+S"="crosshatch", "NA"="none"))
However, to the best of my knowledge, there is a problem with ggpattern when it comes to repeating patterns. I was able to recreate this issue using plain ggplot2 with ggpattern, without swimplot, which suggests that swimplot is not the cause.
In the code below, you can see that fill behaves as expected, but a pattern never appears twice, which causes the pattern from "S" to spill over into the second red bar, which shouldn't have any pattern because it's "NA". I recommend submitting an issue on GitHub.
library(ggpattern)
library(tidyverse)
df <- data.frame(
study_id = c(3, 3, 3, 3), primary_therapy = c("Si", "Si", "Si", "Si"),
additional_therapy = c("NA", "S", "NA", "V+S"), end_yr = c(0.08, 0.39, 3.03, 3.4))
df %>%
ggplot(aes(x = end_yr, y = study_id)) +
geom_col_pattern(aes(pattern = additional_therapy,
fill = additional_therapy),
color = NA,
show.legend = TRUE,
position = "fill",
width = 3,
pattern_spacing = 0.01,
pattern_fill = "black",
pattern_color = NA,
pattern_size = 0.5,
pattern_density = 0.1,
pattern_linetype = 0.5,
pattern_orientation = "vertical") +
scale_pattern_manual(
name = "Additional Therapy",
breaks = c("S", "V", "V+S"),
values = c("S" = "stripe", "V" = "circle", "V+S" = "crosshatch", "NA" = "none"))

ggpattern change stripe angle for one element

How can I change the angle of the stripe pattern in only one element? For example, I want only the stripe pattern of "V" to be at -30 degrees.
Also, there seems to be an issue with my "NA" values / "none" pattern: no pattern appears after an "NA" value.
library(swimplot)
library(ggpattern)
library(tidyverse)
df <- data.frame(
study_id = c(3, 3, 3,3), primary_therapy = c("Si", "Si", "Si", "Si"),
additional_therapy = c("NA", "S", "NA", "V+S"), end_yr = c(0.08, 0.39, 3.03, 3.4)
)
swimmer_plot(
df = df, id = "study_id",
end = "end_yr", name_fill = "primary_therapy",
width = 0.85, color = NA) +
geom_col_pattern(aes(study_id, end_yr,
pattern = additional_therapy), color=NA,
fill = NA,
show.legend=FALSE, width=0.85,
pattern_spacing = 0.01, pattern_fill="black", pattern_color=NA,
pattern_size = 0.5, pattern_density=0.1,
pattern_linetype = 0.5, pattern_orientation="vertical") +
scale_pattern_manual(name="Additional Therapy", values = c("S"="stripe","V"="stripe","V+S"="crosshatch","NA"="none"))
For demonstration purposes, I changed your dataframe so you can see how the levels of additional_therapy get plotted, since your example dataframe didn't include any appearances of the level "V".
To achieve your goal of changing the stripe element for one level of additional_therapy, you need to add the argument pattern_angle back into geom_col_pattern and then add an extra line for scale_pattern_angle_manual() to specify which levels' patterns get set at which angles.
library(swimplot)
library(ggpattern)
library(tidyverse)
df <- data.frame(
study_id = c(3, 3, 3, 3, 3),
primary_therapy = c("Si", "Si", "Si", "Si", "Si"),
additional_therapy = c("NA", "NA", "S", "V", "V+S"),
end_yr = c(0.08, 1.11, 2.11, 3.03, 3.4)
)
# Convert additional_therapy to a factor with an explicit level order (highly recommended)
# This sets the order of the items in the Additional Therapy legend and, because the angle
# values below are unnamed, which level each angle is assigned to
df <- df %>% mutate(additional_therapy = factor(additional_therapy, levels = c("S", "V", "V+S", "NA")))
swimmer_plot(
df = df, id = "study_id",
end = "end_yr", name_fill = "primary_therapy",
width = 0.85, color = NA) +
geom_col_pattern(aes(study_id, end_yr,
pattern = additional_therapy,
pattern_angle = additional_therapy
),
color=NA,
fill = NA,
show.legend=TRUE, # so you can see the legend
width=0.85,
pattern_spacing = 0.01,
pattern_fill="black",
pattern_color=NA,
pattern_size = 0.5,
pattern_density=0.1,
pattern_linetype = 0.5,
pattern_orientation="vertical") +
scale_pattern_manual(name="Additional Therapy", values = c("S"="stripe","V"="stripe","V+S"="crosshatch","NA"="none")) +
scale_pattern_angle_manual(name="Additional Therapy", values = c(30, -30, 30, 30))
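Because the angle values above are unnamed, they are matched to the factor levels by position. A small base R sketch of building a named vector instead, assuming (as with ggplot2's manual scales) that named values are matched by level name; therapy_levels and angles are illustrative names:

```r
# Levels in the same order as the factor defined above
therapy_levels <- c("S", "V", "V+S", "NA")

# Start every level at 30 degrees, then override only "V"
angles <- setNames(rep(30, length(therapy_levels)), therapy_levels)
angles["V"] <- -30

print(angles)
```

The resulting vector could then be passed as values = angles to scale_pattern_angle_manual(), so the mapping no longer depends on level order.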
Unfortunately, to the best of my knowledge, there is a problem with ggpattern that is causing the issue with the x-axis. I discussed it in another of your questions here. I confirmed that it wasn't an issue with swimplot.

R: how to add significance differences between sub groups on a line plot

I have a line plot for 3 different groups and I would like to present on the chart, in addition to the significant differences, the significance of the differences between subgroups. For example, add the significance of the difference between each of the 3 populations' weights at age 1. I saw the stat_compare_means() function but failed to use it vertically between lines in my chart.
my current code:
library(tidyverse)
#pivot data into long format
df <- data.frame(
stringsAsFactors = FALSE,
ID = c(1L, 2L, 3L, 4L, 5L),
POPULATION = c("A", "A", "B", "B", "C"),
weight.at.age.1 = c(13.37, 5.19, 7.68, 6.96, 10.35),
weight.at.age.2 = c(14.15, 15.34, 6.92, 15.12, 8.86),
weight.at.age.3 = c(17.36, NA, 19.42, 36.39, 26.33)
) %>%
pivot_longer(cols = weight.at.age.1:weight.at.age.3,
names_to = 'age',
values_to = 'weight') %>%
mutate(age = str_remove(age, 'weight.at.age.'))
#plot data (ggline() and stat_compare_means() come from ggpubr)
library(ggpubr)
ggline(df,
x = "age", y = "weight",
add = "mean_se", color = "POPULATION") +
stat_compare_means(aes(group = POPULATION), method = "anova", label = "p.signif",
label.y = 56)
Thanks!

Error in ggtexttable (ggpubr)

I'm trying to create a publication-ready table using the ggtexttable function from ggpubr. I have a data frame:
dput(df)
structure(list(feature = list("start_codon", "stop_codon", "intergenic",
"3UTR", "5UTR", "exon", "intron", "ncRNA", "pseudogene"),
observed = list(structure(1L, .Names = "start_codon"), structure(1L, .Names = "stop_codon"),
structure(418L, .Names = "intergenic"), structure(48L, .Names = "3UTR"),
structure(28L, .Names = "5UTR"), structure(223L, .Names = "exon"),
structure(578L, .Names = "intron"), structure(20L, .Names = "ncRNA"),
structure(1L, .Names = "pseudogene")), expected = list(
0.286, 0.286, 369.02, 72.461, 33.165, 257.869, 631.189,
48.491, 3.172), fc = list(3.5, 3.5, 1.1, 0.7, 0.8, 0.9,
0.9, 0.4, 0.3), test = list("enrichment", "enrichment",
"enrichment", "depletion", "depletion", "depletion",
"depletion", "depletion", "depletion"), sig = list("F",
"F", "T", "T", "F", "T", "T", "T", "F"), p_val = list(
"0.249", "0.249", "0.00186", "0.00116", "0.209", "0.00814",
"0.00237", "<1e-04", "0.175")), class = "data.frame", row.names = c(NA,
-9L), .Names = c("feature", "observed", "expected", "fc", "test",
"sig", "p_val"))
And when I try to turn this into a table:
ggtexttable(df)
I get the error:
Error in (function (label, parse = FALSE, col = "black", fontsize =
12, : unused arguments (label.feature = dots[[5]][1],
label.observed = dots[[6]][1], label.expected = dots[[7]][1],
label.fc = dots[[8]][1], label.test = dots[[9]][1], label.sig_val
= dots[[10]][1], label.p_val = dots[[11]][1])
Does anyone know what might be causing this?
This works fine:
df <- head(iris)
ggtexttable(df)
I found the problem and a solution that should work for you. Your data is not in the format ggtexttable() expects: every column is a list (a nested list) rather than an atomic vector, which is why you get this error when trying to display it. You can check the structure of the data by running str(df) in your console.
Here is how to convert your data to a regular data.frame:
first.step <- lapply(df, unlist)
second.step <- as.data.frame(first.step, stringsAsFactors = FALSE)
Then you can call ggtexttable(second.step) and it displays the table with your data.
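As a minimal sketch of the same check-and-flatten idea, the following builds a small illustrative data frame (nested, a made-up name, with three values borrowed from the question) whose columns are lists, and uses base R only:

```r
# Illustrative data frame with list columns, mimicking the problem
nested <- data.frame(id = 1:3)
nested$feature  <- list("exon", "intron", "ncRNA")
nested$observed <- list(223L, 578L, 20L)

# Which columns are lists rather than atomic vectors?
list_cols <- vapply(nested, is.list, logical(1))
print(list_cols)

# Flatten every column, then rebuild the data frame
flat <- as.data.frame(lapply(nested, unlist), stringsAsFactors = FALSE)
str(flat)
```

After flattening, no column of flat is a list, which is the shape table-drawing functions generally expect.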

Setting row order in R using levels

Please don't mark this as a duplicate, as I will take this question down once I find out what's wrong. I have used levels with a very high degree of success, and today it refuses to work no matter what I try. Here's what I'm trying to achieve. I have two data frames with an identical column. I am using the simple merge() function as follows:
mergedData<-merge(df1, df2, by='Index')
Now, I want to reorder the 'Index' column i.e. reorder the rows in the 'mergedData' file to match the order in either of the original dataframes. This is the command I am using to achieve the reordering:
mergedData$Index<-factor(mergedData$Index,
levels=c("ND","TC","PR","W","MI"))
When I test the levels after running the above command it shows the desired order however when I export the table it retains the original order. I am extremely confused as to why this isn't working. I have other scripts wherein I've used this approach of setting the desired order and it is working perfectly fine except in this instance.
Any help/suggestions/advice would be greatly appreciated.
I have attached data from the two dataframes for you all to play around with:
df1
structure(list(Index = structure(1:5, .Label = c("ND", "TC",
"PR", "W", "MI"), class = "factor"), `CP` = c(0.7102,
0.059, -0.0469, 1.0137, 0.6116), FA1 = c(0.5218, 0.0249, -0.0532,
0.9561, 1.1676), FA2 = c(0.5625, 0.0397, -0.0712, 0.9636, 0.9569
), FA3 = c(0.5934, 0.0332, -0.0442, 0.9873, 0.8929)), .Names = c("Index",
"CP", "FA1", "FA2", "FA3"), row.names = c(NA, 5L), class = "data.frame")
df2
structure(list(Index = structure(1:5, .Label = c("ND", "TC",
"PR", "W", "MI"), class = "factor"), `CP SD` = c(0.0241,
0.0184, 0.0021, 0.0114, 0.0947), `FA1 SD` = c(0.1891, 0.0171,
0.0104, 0.0559, 0.5321), `FA2 SD` = c(0.1273, 0.0243, 0.0173,
0.0565, 0.3292), `FA3 SD` = c(0.0518, 0.0094, 0.0078, 0.0195,
0.1581)), .Names = c("Index", "CP SD", "FA1 SD", "FA2 SD",
"FA3 SD"), row.names = c(NA, 5L), class = "data.frame")
Thanks
levels only controls the order of the factor levels (how it will be displayed by levels(x), in table, etc.), not the order of the rows in a data.frame. To order a data frame, use this:
mergedData <- mergedData[order(mergedData$Index),]
Or with dplyr:
library(dplyr)
mergedData <- arrange(mergedData,Index)
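To illustrate the difference, here is a small base R sketch. It uses a cut-down stand-in for mergedData (only the Index and CP columns, with rows deliberately placed in alphabetical order as an assumption about what the merged result looked like):

```r
# Stand-in for the merged data, rows in alphabetical order of Index
mergedData <- data.frame(
  Index = c("MI", "ND", "PR", "TC", "W"),
  CP = c(0.6116, 0.7102, -0.0469, 0.059, 1.0137),
  stringsAsFactors = FALSE
)

# Setting the levels changes the factor's level order ...
mergedData$Index <- factor(mergedData$Index,
                           levels = c("ND", "TC", "PR", "W", "MI"))
levels(mergedData$Index)

# ... but the rows themselves stay where they were
as.character(mergedData$Index)

# order() sorts a factor by its level order, so this reorders the rows
sorted <- mergedData[order(mergedData$Index), ]
as.character(sorted$Index)
```

Exporting sorted rather than mergedData would then write the rows in the level order.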
