I want to overlay two plots with different y-axis scales. I use stat_count() and geom_line(). However, the geom_line() layer doesn't appear on the plot.
I use the following code.
library(ggplot2)
ggplot(X1, aes(x = Week)) +
stat_count() +
scale_x_continuous(breaks = seq(from = 0, to = 21, by = 1))+
scale_y_continuous(
name = expression("Count"),
limits = c(0, 20),
sec.axis = sec_axis(~ . * 15000 / 20, name = "Views"))+
geom_line(aes(y = Views), inherit.aes = T)
Here is the reproducible example of my data frame X1.
structure(list(Views = c(1749, 241, 309, 326, 237, 276, 2281,
1573, 10790, 1089, 1732, 3263, 2601, 2638, 2929, 3767, 2947,
65, 161), Week = c(1, 2, 2, 2, 3, 3, 4, 5, 5, 5, 6, 8, 8, 8,
8, 9, 10, 10, 10)), row.names = c(NA, -19L), class = c("tbl_df",
"tbl", "data.frame"))
Could you help me to put geom_line on the plot, please?
You have to rescale the y values so that they fit inside the limits of the primary y-axis, i.e. apply the inverse of the transformation used for the secondary y-axis inside geom_line. Try this:
X1 <- structure(list(Views = c(1749, 241, 309, 326, 237, 276, 2281,
1573, 10790, 1089, 1732, 3263, 2601, 2638, 2929, 3767, 2947,
65, 161), Week = c(1, 2, 2, 2, 3, 3, 4, 5, 5, 5, 6, 8, 8, 8,
8, 9, 10, 10, 10)), row.names = c(NA, -19L), class = c("tbl_df",
"tbl", "data.frame"))
library(ggplot2)
ggplot(X1, aes(x = Week)) +
stat_count() +
scale_x_continuous(breaks = seq(from = 0, to = 21, by = 1))+
scale_y_continuous(
name = expression("Count"),
limits = c(0, 20),
sec.axis = sec_axis(~ . * 15000 / 20, name = "Views"))+
geom_line(aes(y = Views / 15000 * 20), inherit.aes = T)
Created on 2020-05-21 by the reprex package (v0.3.0)
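The rescaling above can also be written with a single scale factor, so the line and the secondary axis cannot drift apart. This is only a sketch: the names prim_max, sec_max and k are made up for illustration, and the data frame is rebuilt with data.frame() for brevity.

```r
library(ggplot2)

# Same values as X1 above, rebuilt with data.frame() for brevity.
X1 <- data.frame(
  Views = c(1749, 241, 309, 326, 237, 276, 2281, 1573, 10790, 1089,
            1732, 3263, 2601, 2638, 2929, 3767, 2947, 65, 161),
  Week  = c(1, 2, 2, 2, 3, 3, 4, 5, 5, 5, 6, 8, 8, 8, 8, 9, 10, 10, 10)
)

# Hypothetical helper names (not from the original answer).
prim_max <- 20            # upper limit of the primary (count) axis
sec_max  <- 15000         # upper limit of the secondary (views) axis
k <- sec_max / prim_max   # one unit on the count axis corresponds to k views

p <- ggplot(X1, aes(x = Week)) +
  stat_count() +
  # Data side: divide by k so the views fit inside the primary limits.
  geom_line(aes(y = Views / k)) +
  scale_y_continuous(
    name = "Count",
    limits = c(0, prim_max),
    # Axis side: multiply by k so the secondary axis is labelled in views.
    sec.axis = sec_axis(~ . * k, name = "Views")
  )
p
```

If the limits change later, only prim_max and sec_max need to be edited.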
I also summarised the data frame by week to make the week 5 spike easier to interpret, and plotted the bars and the line as separate layers:
x1 <- structure(list(Views = c(1749, 241, 309, 326, 237, 276, 2281,
1573, 10790, 1089, 1732, 3263, 2601, 2638, 2929, 3767, 2947,
65, 161), Week = c(1, 2, 2, 2, 3, 3, 4, 5, 5, 5, 6, 8, 8, 8,
8, 9, 10, 10, 10)), row.names = c(NA, -19L), class = c("tbl_df",
"tbl", "data.frame"))
library(dplyr)
library(ggplot2)
# Total views per week, so the line has one point per week
x2 <- x1 %>%
group_by(Week) %>%
summarise(Views = sum(Views))
ggplot() +
geom_line(data = x2, mapping = aes(x = Week, y = Views/15000 * 20))+
geom_bar(data = x1, mapping = aes(x = Week), stat = 'count')+
scale_x_continuous(breaks = seq(from = 0, to = 21, by = 1))+
scale_y_continuous(name = expression("Count"),
limits = c(0, 20),
sec.axis = sec_axis(~ . * 15000 / 20, name = "Views"))
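To check that the weekly totals still fit under the 0 to 20 primary limits after rescaling, the summary can be reproduced in base R. This is a sketch that uses aggregate() as a stand-in for the dplyr pipeline; the name weekly is made up for illustration.

```r
# Same values as x1 above, rebuilt with data.frame() for brevity.
x1 <- data.frame(
  Views = c(1749, 241, 309, 326, 237, 276, 2281, 1573, 10790, 1089,
            1732, 3263, 2601, 2638, 2929, 3767, 2947, 65, 161),
  Week  = c(1, 2, 2, 2, 3, 3, 4, 5, 5, 5, 6, 8, 8, 8, 8, 9, 10, 10, 10)
)

# One row per week with the summed views (hypothetical name: weekly).
weekly <- aggregate(Views ~ Week, data = x1, FUN = sum)
weekly

# The same totals expressed in primary-axis units; all must be <= 20.
weekly$scaled <- weekly$Views / 15000 * 20
range(weekly$scaled)
```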
I am conducting a Kruskal-Wallis test to determine whether a measurement differs significantly between three groups. I use ggbetweenstats to determine between which groups there is a statistically significant difference.
Here is the code for sample data and the plot:
sampledata <- structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20), group = c(1, 2, 3, 1, 2, 3,
1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2), measurement = c(0,
1, 200, 30, 1000, 6000, 1, 0, 0, 10000, 20000, 700, 65, 1, 8,
11000, 13000, 7000, 500, 3000)), class = "data.frame", row.names = c(NA,
20L))
library(ggstatsplot)
library(ggplot2)
ggbetweenstats(
data = sampledata,
x = group,
y = measurement,
type = "nonparametric",
plot.type = "box",
pairwise.comparisons = TRUE,
pairwise.display = "all",
centrality.plotting = FALSE,
bf.message = FALSE
)
You can see the results of the Kruskal-Wallis test at the top of the plot, as well as the pairwise group comparisons in the plot itself. Now I want to change the y-axis to a logarithmic scale:
ggbetweenstats(
data = sampledata,
x = group,
y = measurement,
type = "nonparametric",
plot.type = "box",
pairwise.comparisons = TRUE,
pairwise.display = "all",
centrality.plotting = FALSE,
bf.message = FALSE
) +
ggplot2::scale_y_continuous(trans=scales::pseudo_log_trans(sigma = 1, base = exp(1)), limits = c(0,25000), breaks = c(0,1,10,100,1000,10000)
)
However, this removes the pairwise group comparisons. I have tried different scaling solutions and browsed SO for a solution but couldn't find anything. Thank you for your help!
It seems that the y_position parameter in the geom_signif component is not affected by the y axis transformation. You will need to pass the log values of the desired bracket heights manually. In theory, you can pass these via the ggsignif.args parameter, but it seems that in the latest version of ggstatsplot this isn't possible because the y_position is hard-coded.
One way around this is to store the plot, then change the y positions after the fact. Here's a full reprex with the latest versions of ggplot2, ggstatsplot and their dependencies (at the time of writing):
sampledata <- structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20), group = c(1, 2, 3, 1, 2, 3,
1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2), measurement = c(0,
1, 200, 30, 1000, 6000, 1, 0, 0, 10000, 20000, 700, 65, 1, 8,
11000, 13000, 7000, 500, 3000)), class = "data.frame", row.names = c(NA,
20L))
library(ggstatsplot)
library(ggplot2)
library(scales)
p <- ggbetweenstats(
data = sampledata,
x = group,
y = measurement,
type = "nonparametric",
plot.type = "box",
pairwise.comparisons = TRUE,
pairwise.display = "all",
centrality.plotting = FALSE,
bf.message = FALSE
) + scale_y_continuous(trans = pseudo_log_trans(sigma = 1, base = exp(1)),
limits = c(0, exp(13)),
breaks = c(0, 10^(0:5)),
labels = comma)
#> Scale for y is already present.
#> Adding another scale for y, which will replace the existing scale.
i <- which(sapply(p$layers, function(x) inherits(x$geom, "GeomSignif")))
p$layers[[i]]$stat_params$y_position <- c(10, 10.8, 11.6)
p
Created on 2023-01-15 with reprex v2.0.2
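The values 10, 10.8 and 11.6 above are positions on the transformed axis, not measurement values. A small sketch of converting between the two with the same pseudo-log transformation follows; the bracket heights in heights are hypothetical and chosen only for illustration.

```r
library(scales)

# The same transformation object that is passed to scale_y_continuous() above.
tr <- pseudo_log_trans(sigma = 1, base = exp(1))

# Hypothetical bracket heights on the original measurement scale.
heights <- c(25000, 60000, 150000)

# Corresponding positions on the transformed scale, i.e. the kind of
# values assigned to y_position in the reprex.
y_pos <- tr$transform(heights)
round(y_pos, 2)

# The other direction: measurement values for the positions used above.
back <- tr$inverse(c(10, 10.8, 11.6))
round(back)
```

For large values this transformation behaves almost like a natural log, which is why the positions sit just below the upper limit of exp(13), i.e. 13 on the transformed scale.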
I am trying to create a bar chart or column chart plot to compare pre and post scores between participants. I managed to do this in a line graph, however, I am struggling to visualise this within a bar chart, can anyone help me with this?
Here is the data I am using:
structure(list(Participant = c(2, 3, 5, 7), PRE_QUIP_RS = c(24,
24, 20, 20), POST_QUIP_RS = c(10, 23, 24, 14), PRE_PDQ8 = c(11,
8, 10, 4), POST_PDQ8 = c(7, 7, 9, 4), PRE_GDS = c(1, 7, 1, 0),
POST_GDS = c(1, 4, 2, 0), PRE_PERSISTENT = c(9, 13, 6, 2),
POST_PERSISTENT = c(9, 13, 11, 3), PRE_EPISODIC = c(3, 4,
2, 0), POST_EPISODIC = c(2, 5, 6, 2), PRE_AVOIDANCE = c(6,
3, 0, 2), POST_AVOIDANCE = c(3, 3, 4, 1), PRE_IPQ = c(39,
48, 40, 37), POST_IPQ = c(16, 44, 30, 17), PRE_GSE = c(28,
31, 36, 29), POST_GSE = c(29, 30, 30, 29), PRE_BCI = c(11,
9, 5, 3), POST_BCI = c(3, 15, 0, 0)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -4L))
In terms of how I roughly want it to look, I want the bars to be placed together for pre and post for each participant, kind of like this:
You may try the following, where df is the data frame from the question:
library(tidyverse)
df %>%
select(Participant, PRE_QUIP_RS, POST_QUIP_RS) %>%
pivot_longer(cols = c(PRE_QUIP_RS, POST_QUIP_RS), names_to = "group") %>%
mutate(group = str_split(group, "_", simplify = T)[,1],
Participant = as.factor(Participant)) %>%
ggplot(aes(x = Participant, y = value, group = group, fill = group)) +
geom_col(position = "dodge")
To show the bars in PRE, POST order, set the factor levels explicitly:
df %>%
select(Participant, PRE_QUIP_RS, POST_QUIP_RS) %>%
pivot_longer(cols = c(PRE_QUIP_RS, POST_QUIP_RS), names_to = "group") %>%
mutate(group = str_split(group, "_", simplify = T)[,1] %>%
factor(., levels = c("PRE", "POST")), # HERE
Participant = as.factor(Participant)) %>%
ggplot(aes(x = Participant, y = value, group = group, fill = group)) +
geom_col(position = "dodge")
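The same reshaping can be extended to several measures at once. The sketch below is an assumption about how one might generalise the answer, not part of it: it uses only two of the question's measures, splits the column names with names_pattern, and facets by measure. The names long, time and measure are made up for illustration.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Subset of the question's data (two measures only), for illustration.
df <- data.frame(
  Participant  = c(2, 3, 5, 7),
  PRE_QUIP_RS  = c(24, 24, 20, 20),
  POST_QUIP_RS = c(10, 23, 24, 14),
  PRE_PDQ8     = c(11, 8, 10, 4),
  POST_PDQ8    = c(7, 7, 9, 4)
)

# Split each column name into a PRE/POST part and a measure part.
long <- df %>%
  pivot_longer(
    cols = -Participant,
    names_to = c("time", "measure"),
    names_pattern = "^(PRE|POST)_(.*)$"
  ) %>%
  mutate(
    time = factor(time, levels = c("PRE", "POST")),  # PRE before POST
    Participant = factor(Participant)
  )

# One panel per measure, PRE and POST bars side by side per participant.
p <- ggplot(long, aes(x = Participant, y = value, fill = time)) +
  geom_col(position = "dodge") +
  facet_wrap(~ measure, scales = "free_y")
p
```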
I'm making a stacked barplot using ggplot, but for some reason, it keeps leaving 2 bars unfilled, despite filling in other ones using the same criteria. Why is it doing this and how can I prevent this from happening?
library(ggplot2)
library(dplyr)
library(scales)
#Code to replicate
data <- tibble(team = factor(c(rep("Team 1", 10), rep("Team 2", 10), rep("Team 3", 10), rep("Team 4", 10)), levels = c("Team 1", "Team 2", "Team 3", "Team 4")),
state = factor(c(rep(c("Won", "Tied",
"Rematch", "Postponed", "Forfeit",
"Lost", "Withdrew", "Ongoing",
"Undetermined", "Unknown"), 4)), levels = c("Won", "Tied",
"Rematch", "Postponed", "Forfeit",
"Lost", "Withdrew", "Ongoing",
"Undetermined", "Unknown")),
count = c(1920, 80, 241, 5, 310, 99, 2, 127, 20, 33,
48, 1, 8, 0, 11, 3, 0, 4, 3, 3,
140, 5, 8, 0, 17, 2, 0, 5, 3, 7,
477, 20, 59, 1, 106, 1, 0, 33, 7, 10))
data <- data %>%
group_by(team) %>%
mutate(percentage = round((count/sum(count, na.rm = TRUE)), 2))
data %>%
ggplot(aes(fill= state, y = percentage, x = team)) +
geom_col(position="stack",width = 0.4) +
coord_flip() +
scale_y_continuous(labels = scales::percent_format(accuracy = 1), limits = c(0, 1)) +
geom_text(aes(label = scales::percent(percentage, accuracy = 1)),
position = position_stack(vjust = .5),
check_overlap = TRUE )
Here's how it looks; the floating 75% and 59% for Team 3 and Team 2, respectively, should be in the salmon color that is used for Teams 4 and 1. I know it's not a typo because I'm using the same title for each.
Change the position argument of geom_col to "fill":
data %>%
ggplot(aes(fill= state, y = percentage, x = team)) +
geom_col(position="fill",width = 0.4) +
coord_flip() +
scale_y_continuous(labels = scales::percent_format(accuracy = 1), limits = c(0, 1)) +
geom_text(aes(label = scales::percent(percentage, accuracy = 1)),
position = position_stack(vjust = .5),
check_overlap = TRUE )
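A likely reason the original stacked version drops those two bars, which the answer above does not spell out: after rounding to two decimals, the shares for Team 2 and Team 3 add up to slightly more than 1, so the top of the stack falls outside limits = c(0, 1), and a continuous scale drops out-of-range values by default. A quick base R check of the rounded totals (the names counts and totals are made up for illustration):

```r
# Counts per team, copied from the question.
counts <- list(
  "Team 1" = c(1920, 80, 241, 5, 310, 99, 2, 127, 20, 33),
  "Team 2" = c(48, 1, 8, 0, 11, 3, 0, 4, 3, 3),
  "Team 3" = c(140, 5, 8, 0, 17, 2, 0, 5, 3, 7),
  "Team 4" = c(477, 20, 59, 1, 106, 1, 0, 33, 7, 10)
)

# Sum of the rounded shares per team, as computed in the question's mutate().
totals <- sapply(counts, function(x) sum(round(x / sum(x), 2)))
round(totals, 2)

# Teams whose stacked bar would extend past the upper limit of 1.
names(totals)[round(totals, 2) > 1]
```

With position = "fill" each stack is renormalised to exactly 1, which avoids the problem. Widening the limits or dropping the rounding before plotting would be alternative fixes.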
I'm trying to change the x-axis tick labels in ggplot but I can't get it to work for some reason. I have the following code and plot:
ggplot(over36mo, aes(x=raceeth,y=pt,fill=factor(year.2cat))) +
geom_bar(stat="identity",position="dodge") +
geom_errorbar(aes(ymax=pt+se, ymin=pt-se), width=0.2, position=position_dodge(0.9)) +
scale_fill_discrete(guide=FALSE) +
scale_y_continuous(breaks=seq(0, 0.26, 0.02), limits=c(0,0.26)) +
theme(axis.line.x=element_line(color="black"),
axis.line.y=element_line(color="black"),
panel.background=element_blank(),
panel.border=element_blank(),
panel.grid.major=element_blank(),
panel.grid.minor=element_blank(),
plot.background=element_blank()) +
xlab("All ages") + ylab("")
But when I try to change 1, 2, 3, 4, 5 to different labels with scale_x_discrete, the x-axis disappears like so:
ggplot(over36mo, aes(x=raceeth,y=pt,fill=factor(year.2cat))) +
geom_bar(stat="identity",position="dodge") +
geom_errorbar(aes(ymax=pt+se, ymin=pt-se), width=0.2, position=position_dodge(0.9)) +
scale_fill_discrete(guide=FALSE) +
scale_y_continuous(breaks=seq(0, 0.26, 0.02), limits=c(0,0.26)) +
theme(axis.line.x=element_line(color="black"),
axis.line.y=element_line(color="black"),
panel.background=element_blank(),
panel.border=element_blank(),
panel.grid.major=element_blank(),
panel.grid.minor=element_blank(),
plot.background=element_blank()) +
xlab("All ages") + ylab("") +
scale_x_discrete(breaks=c("1","2","3","4","5"), labels=c("NHW","NHB","NHNA/PI","NHA","H"))
It's probably obvious what's wrong but I can't figure it out. Here's a dput of my data if someone wants to give it a shot!
dput(over36mo)
structure(list(z.surv.mos = c(36, 36, 36, 36, 36, 36, 36, 36,
36, 36), raceeth = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5), year.2cat = c(1,
2, 1, 2, 1, 2, 1, 2, 1, 2), pt = c(0.10896243930756, 0.12919986395988,
0.10763696166101, 0.0918969557367, 0.14186152615109, 0.12701814940611,
0.05405405405405, 0.09393141727008, 0.08880901672474, 0.11716939090588
), nevent = c(9, 3, 0, 0, 2, 1, 0, 0, 1, 1), ncensor = c(0, 9,
0, 1, 0, 2, 0, 1, 0, 0), nrisk = c(311, 96, 33, 9, 72, 21, 2,
2, 48, 20), cum.ev = c(2474, 2469, 287, 342, 440, 496, 35, 40,
505, 616), cum.cen = c(1, 958, 4, 107, 12, 198, 0, 13, 19, 239
), pointflg = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1), pe = c(0.89103756069243,
0.87080013604011, 0.89236303833898, 0.90810304426329, 0.8581384738489,
0.87298185059388, 0.94594594594594, 0.90606858272991, 0.91119098327525,
0.88283060909411), se = c(0.00591553159512, 0.00860912091676,
0.01746946721576, 0.01975702415208, 0.01550071018085, 0.01904081251339,
0.03717461110299, 0.05797150600236, 0.01228353765126, 0.01608823714602
), lower.cl = c(0.09796374785164, 0.11338170396883, 0.07830897003442,
0.06029765195198, 0.11451353670001, 0.09468155080317, 0.01404207131432,
0.02802051731609, 0.06772108402588, 0.08952365586359), upper.cl = c(0.12119598770184,
0.14722485430136, 0.14794876641234, 0.1400560419898, 0.17574073058836,
0.17039866945242, 0.20807761862723, 0.31488038035974, 0.11646360310182,
0.15335238527538)), .Names = c("z.surv.mos", "raceeth", "year.2cat",
"pt", "nevent", "ncensor", "nrisk", "cum.ev", "cum.cen", "pointflg",
"pe", "se", "lower.cl", "upper.cl"), row.names = c("38", "134",
"183", "246", "289", "366", "412", "452", "491", "563"), class = "data.frame")
It's because you are setting a discrete x scale but your x values are numeric. If you want to treat them as discrete, convert to a factor. Just change the first part to
ggplot(over36mo, aes(x=factor(raceeth), y=pt, fill=factor(year.2cat)))
and it should work just fine.
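For completeness, here is a reduced sketch of the whole plot with the factor conversion and the relabelled axis together. It keeps only the columns the plot uses, with values rounded from the dput above, and leaves out the theme() call.

```r
library(ggplot2)

# Reduced copy of over36mo: only the plotted columns, values rounded.
over36mo <- data.frame(
  raceeth   = c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5),
  year.2cat = c(1, 2, 1, 2, 1, 2, 1, 2, 1, 2),
  pt = c(0.109, 0.129, 0.108, 0.092, 0.142, 0.127, 0.054, 0.094, 0.089, 0.117),
  se = c(0.006, 0.009, 0.017, 0.020, 0.016, 0.019, 0.037, 0.058, 0.012, 0.016)
)

# factor(raceeth) makes x discrete, so scale_x_discrete() can match its
# breaks "1".."5" against the factor levels and relabel them.
p <- ggplot(over36mo, aes(x = factor(raceeth), y = pt, fill = factor(year.2cat))) +
  geom_col(position = "dodge") +
  geom_errorbar(aes(ymin = pt - se, ymax = pt + se),
                width = 0.2, position = position_dodge(0.9)) +
  scale_x_discrete(breaks = c("1", "2", "3", "4", "5"),
                   labels = c("NHW", "NHB", "NHNA/PI", "NHA", "H")) +
  xlab("All ages") + ylab("")
p
```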
To visualize my data, I used qplot.
Question: why doesn't "colour" change, and is it possible to get type = "h" as in a base plot?
print(qplot(roundpop, Observation, data=roundpopus), shape = 5, colour = "blue") # I tried with quotes and without.
And if it's possible to draw vertical lines like a histogram, as in the second picture, can I draw a line along the tops of those lines?
Like that:
And maybe add labels (states) at the top of the lines, because I only know how to label points in a base plot.
Thank you!
Here are some options, which you may want to tweak according to your needs:
library(ggplot2)
df <- structure(list(x = c(1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5, 5, 5, 6,
6, 6, 7, 7, 7, 7, 8, 9, 10, 10, 10, 12, 13, 13, 20, 20, 27, 39
), y = c(33, 124, 45, 294, 160, 105, 276, 178, 377, 506, 176,
393, 247, 378, 221, 796, 503, 162, 801, 486, 268, 575, 828, 493,
252, 495, 836, 551, 413, 832, 1841, 1927), lab = c("i8g8Q", "oXlWk",
"NC2WO", "pYxBL", "Xfsy6", "FJcOl", "Ke98f", "K2mCW", "g4XYi",
"ICzWp", "7nqrK", "dzhlC", "JagAW", "0bObp", "8ljIW", "E8OZR",
"6Tuxz", "3Grbq", "xqsld", "BvuJT", "JXi2N", "eSDYS", "OYVWN",
"vyWzK", "6AKxk", "nCgPx", "8lHrq", "kWAGm", "E08Rd", "cmIYY",
"btoUm", "k6Iek")), .Names = c("x", "y", "lab"), row.names = c(NA,
-32L), class = "data.frame")
p <- ggplot(df, aes(x, y))
gridExtra::grid.arrange(
p + geom_point(),
p + geom_point() + geom_text(aes(label = lab), angle = 60, hjust = 0, size = 2),
p + geom_segment(aes(xend=x, yend=0)),
p + geom_segment(aes(xend=x, yend=0)) + geom_line(color = "red", size = 2) ,
p + geom_segment(aes(xend=x, yend=0)) + geom_smooth(span = .4, se = FALSE, color = "red", size = 2)
)
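On the colour part of the question: in the original call, shape and colour are arguments to print(), not to the plotting function, which is probably why they had no effect. A minimal sketch that sets them on the layers instead, combined with the type = "h" style segments, follows. The data frame df_small is a made-up subset of the example data above.

```r
library(ggplot2)

# Hypothetical subset of the example data, for illustration only.
df_small <- data.frame(
  x   = c(1, 2, 3, 4, 5),
  y   = c(33, 124, 160, 377, 176),
  lab = c("i8g8Q", "oXlWk", "Xfsy6", "g4XYi", "7nqrK")
)

p <- ggplot(df_small, aes(x, y)) +
  # Vertical segments from y down to 0, similar to type = "h" in base plot().
  geom_segment(aes(xend = x, yend = 0), colour = "blue") +
  # Fixed (non-mapped) aesthetics go outside aes(), on the layer itself.
  geom_point(shape = 5, colour = "blue") +
  # Labels slightly above the top of each segment.
  geom_text(aes(label = lab), vjust = -1, size = 3)
p
```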