How to present data containing "<" in a plot - R

I would like to draw a line-and-point plot, but some of my values contain "<". Is it possible to use a special marker for the points with "<"? Any suggestions on how to better present this information?
Sample data:
df<-structure(list(Day = c(1, 3, 6, 7, 9, 12, 15), Score = c("0.1",
"0.5", "<1.3", "0.2", "<1.55", "0.8", "1.2")), row.names = c(NA,
-7L), class = c("tbl_df", "tbl", "data.frame"))
Here is my plot code and sample:
df<- df %>%
mutate(Score1=gsub("<", "", Score))
ggplot(data=df, aes(x=Day,y=Score1, group=1)) +
geom_line()+
geom_point()

Note that your Score1 is still of type character, so the points are not plotted in proportion to their values. Here's one approach that plots the value without the "<" but keeps the "<" in the label.
There are lots of options here. A few below:
add the "<" to the axis labels
add a visual indicator (could be color, text, an arrow, etc.) to note "smaller than" values.
Color differently and use a legend. I like ggtext for this as you can use markup to color in specific words, which is great for incorporating color legends into explanatory text.
Perhaps "<1.3" could be interpreted, based on situational knowledge, that the measurement was somewhere below 1.3 but not below 1.2. Then we could show simulated possibilities.
ggplot(data=df, aes(x=Day,y=as.numeric(Score1), group=1)) +
geom_line()+
geom_point() +
scale_y_continuous(breaks = as.numeric(df$Score1), labels = df$Score,
minor_breaks = NULL)
Or you might indicate visually that the values are smaller, esp. if there's some plausible range that they might be lower.
ggplot(data=df, aes(x=Day,y=as.numeric(Score1), group=1)) +
geom_line()+
geom_point() +
geom_segment(data = . %>% filter(Score1 != Score),
aes(xend = Day, yend = as.numeric(Score1) - 0.2),
arrow = arrow(length = unit(0.02, "npc")), color = "gray60") +
scale_y_continuous(breaks = as.numeric(df$Score1), labels = df$Score, minor_breaks = NULL)
library(ggtext)
ggplot(data=df, aes(x=Day, y=as.numeric(Score1), group = 1,
shape = Score1 == Score)) +
geom_line()+
geom_point(aes(color = Score1 == Score)) +
scale_shape_discrete(guide = "none") +
scale_color_manual(values = c("red", "black"), guide = "none") +
labs(caption = "<span style = 'color:#FF0000'>Red dots</span> were recorded with a '<'") +
theme(plot.caption = element_markdown())
Another idea is to show possibilities that are consistent with the measurement, based on our situational understanding of what "<1.3" means, i.e. maybe it means the value was "somewhere between 1.2 and 1.3."
df_possibilities <- df %>%
filter(Score1 != Score) %>%
tidyr::uncount(10) %>% # uncount() comes from tidyr
rowwise() %>%
mutate(adjusted = as.numeric(Score1) - runif(1, max = 0.1))
ggplot(data=df, aes(x=Day,y=as.numeric(Score1), group=1)) +
geom_line()+
geom_point() +
scale_y_continuous(breaks = as.numeric(df$Score1), labels = df$Score,
minor_breaks = NULL) +
geom_point(data = df_possibilities,
aes(y = adjusted), alpha = 0.1)
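The simulation step can also be written without rowwise(), since runif() is vectorised. A minimal base-R sketch, assuming (as above) that "<1.3" means "somewhere within 0.1 below the stated limit". The 0.1 width, the seed, and the column names censored, value and adjusted are illustrative assumptions, not part of the original data:

```r
# Sample data from the question
df <- data.frame(
  Day   = c(1, 3, 6, 7, 9, 12, 15),
  Score = c("0.1", "0.5", "<1.3", "0.2", "<1.55", "0.8", "1.2")
)

# Split each score into a numeric value and a "<" flag
df$censored <- grepl("<", df$Score, fixed = TRUE)
df$value    <- as.numeric(sub("<", "", df$Score, fixed = TRUE))

# Repeat each "<" row 10 times, then draw one offset per repeated row
set.seed(1)  # arbitrary seed, only so the draws are reproducible
flagged <- df[df$censored, ]
sims    <- flagged[rep(seq_len(nrow(flagged)), each = 10), ]
sims$adjusted <- sims$value - runif(nrow(sims), max = 0.1)
```

sims could then be passed as the data of an extra geom_point(aes(y = adjusted), alpha = 0.1) layer, in the same way as df_possibilities.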

A couple of alternatives, included in the same graph:
a key, using a coloured geom_point, or
annotation with geom_text
This is just to give an impression; both methods can be enhanced and modified to give whatever appearance you think provides the best visualisation.
library(ggplot2)
library(dplyr)
library(stringr)
df1 <-
df%>%
mutate(y = as.numeric(str_extract(Score, "\\d+\\.\\d{1,2}")),
less_than = str_detect(Score, "<"))
ggplot(df1, aes(Day, y))+
geom_point(aes(colour = less_than))+
geom_line()+
geom_text(aes(label = Score), hjust = -0.2)
Created on 2021-04-15 by the reprex package (v2.0.0)
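The value/flag extraction used above can be checked on its own before plotting. A small base-R sketch; the pattern is one possible choice that assumes the scores are plain non-negative decimals, and it is not taken from the answer itself:

```r
score <- c("0.1", "0.5", "<1.3", "0.2", "<1.55", "0.8", "1.2")

# Digits, optionally followed by a literal dot and more digits
pattern <- "[0-9]+(\\.[0-9]+)?"
y <- as.numeric(regmatches(score, regexpr(pattern, score)))

# TRUE where the original string carried a "<"
less_than <- grepl("<", score, fixed = TRUE)

print(data.frame(score, y, less_than))
```

Escaping the dot matters here: an unescaped "." matches any character, so a malformed score could otherwise slip through and become NA only at the as.numeric() step.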

UPDATE
Labels idea from Peter. Thanks.
You can use shape to draw different point shapes.
With ggpubr, more sophisticated options are available. Here is an overview of the shape numbers:
ggplot(data=df, aes(x=factor(Day),y=Score1, group=1)) +
geom_line()+
geom_point() +
geom_point(data=df[c(3,5),], aes(x=factor(Day), y=Score1), colour="red", size=5, shape=25) +
geom_text(aes(label = Score), hjust = -0.2)+
theme_bw()
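Rather than hard-coding the rows with df[c(3,5),], the "<" rows could be derived from the data and mapped to shape, which also yields a legend. A sketch under that assumption; the column names below_limit and value and the legend wording are made up for illustration:

```r
library(ggplot2)

df <- data.frame(
  Day   = c(1, 3, 6, 7, 9, 12, 15),
  Score = c("0.1", "0.5", "<1.3", "0.2", "<1.55", "0.8", "1.2")
)

# Flag the "<" rows and keep a numeric copy of the score
df$below_limit <- grepl("<", df$Score, fixed = TRUE)
df$value       <- as.numeric(sub("<", "", df$Score, fixed = TRUE))

p <- ggplot(df, aes(x = Day, y = value)) +
  geom_line() +
  # Shape and colour both follow the flag, so no row indices are needed
  geom_point(aes(shape = below_limit, colour = below_limit), size = 3) +
  scale_shape_manual(values = c(`FALSE` = 16, `TRUE` = 25),
                     labels = c("measured", "recorded with <"), name = NULL) +
  scale_colour_manual(values = c(`FALSE` = "black", `TRUE` = "red"),
                      labels = c("measured", "recorded with <"), name = NULL) +
  geom_text(aes(label = Score), hjust = -0.2) +
  theme_bw()
```

Because value is numeric here, the x and y axes are both continuous, unlike the factor(Day) and character Score1 used in the plot above.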

Related

How to label plot with value of bars?

I'm trying to make a barplot with two categorical values. This particular thread was very helpful
My code was this
ggplot(DF, aes(Participant.Type, ..count..)) +
geom_bar(aes(fill=Sex), position ="dodge") +
theme_classic() +
ggtitle("Main phenotypes stated for the PCDH19 cohort on GEL") +
scale_fill_viridis(option ="viridis")
This was my resulting graph. I'm now trying to add the count of the particular bars on top - like Female proband is 135, Male proband is 165 and so on. I tried adding different iterations of the geom_text command so I could achieve this. Commands here:
+ geom_text(aes(label= ..count))
+ geom_text(aes(label= Sex))
Could anyone please help?
With some sample data from that question you linked you can do it like this:
library(ggplot2)
library(viridis)
#> Loading required package: viridisLite
Fruit <- c(rep("Apple", 3), rep("Orange", 5))
Bug <- c("worm", "spider", "spider", "worm", "worm", "worm", "worm", "spider")
df <- data.frame(Fruit, Bug)
ggplot(df, aes(Fruit, fill = Bug)) + geom_bar(position = "dodge") +
geom_text(
aes(label = after_stat(count)),
stat = "count",
vjust = -0.5,
position = position_dodge(width = 0.9)
) +
geom_text(
aes(y = after_stat(count), label = Bug),
stat = "count",
vjust = -1.5,
position = position_dodge(width = 0.9)
) +
scale_y_continuous(expand = expansion(add = c(0, 1))) +
scale_fill_viridis(option = "viridis", discrete = TRUE)
A few things to note:
geom_bar doesn't need ..count.. passed as a y-value - it defaults to counting
after_stat(count) is the updated form of .. notation
Text labels need dodges added - default width is 0.9 for bars so this width matches the placement of the bars.
I can't test the process without your input data, but here's something for you to give a try:
+ geom_text(stat='count', aes(label=..count..), vjust=-1)
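Another option, if computed statistics feel opaque, is to count first and then plot the counts directly with geom_col. A sketch using the same sample fruit data; the column name n is an arbitrary choice:

```r
library(ggplot2)

Fruit <- c(rep("Apple", 3), rep("Orange", 5))
Bug <- c("worm", "spider", "spider", "worm", "worm", "worm", "worm", "spider")

# One row per Fruit/Bug combination, with its count in column n
counts <- as.data.frame(table(Fruit = Fruit, Bug = Bug), responseName = "n")

# Use the same dodge for bars and labels so they line up
dodge <- position_dodge(width = 0.9)
p <- ggplot(counts, aes(Fruit, n, fill = Bug)) +
  geom_col(position = dodge) +
  geom_text(aes(label = n), position = dodge, vjust = -0.5)
```

With the counts in an ordinary column, the labels can be formatted or filtered like any other variable before plotting.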

How can I add a nested y-axis title in my graph?

I created a ggplot graph using geom_segment for certain subcategories and their cost.
df <- data.frame(category = c("A","A","A","A","A","A","B","B","B","B","B","B","B"),
subcat = c("S1","S2","S3","S4","S5","S6","S7","S8","S9","S10","S11","S12","S13"),
value = c(100,200,300,400,500,600,700,800,900,1000,1100,1200,1300))
df2 <- df %>%
arrange(desc(value)) %>%
mutate(subcat=factor(subcat, levels = subcat)) %>%
ggplot(aes(x=subcat, y=value)) +
geom_segment(aes(xend=subcat, yend=0)) +
geom_point(size=4, color="steelblue") +
geom_text(data=df, aes(x=subcat, y=value, label = dollar(value, accuracy = 1)), position = position_nudge(x = -0.3), hjust = "inward") +
theme_classic() +
coord_flip() +
scale_y_continuous(labels = scales::dollar_format()) +
ylab("Cost Value") +
xlab("subcategory")
df2
This code results in a graph that is shown below:
My main issue is I want the category variable on the left of the subcategory variables. It should look like this:
How do I add the category variables in the y-axis, such that it looks nested?
As mentioned in my comment, and adapting this post by @AllanCameron to your case, one option to achieve your desired result is the "facet trick", which uses faceting to get the nesting and some styling to remove the facet look:
Facet by category and free the scales and the space so that the distance between categories is the same.
Remove the spacing between panels and place the strip text outside of the axis text.
Additionally, set the expansion of the discrete x scale to .5 to ensure that the distance between categories is the same at the facet boundaries as inside the facets.
library(dplyr)
library(ggplot2)
library(scales)
df1 <- df %>%
arrange(desc(value)) %>%
mutate(subcat=factor(subcat, levels = subcat))
ggplot(df1, aes(x=subcat, y=value)) +
geom_segment(aes(xend=subcat, yend=0)) +
geom_point(size=4, color="steelblue") +
geom_text(data=df, aes(x=subcat, y=value, label = dollar(value, accuracy = 1)), position = position_nudge(x = -0.3), hjust = "inward") +
theme_classic() +
coord_flip() +
scale_y_continuous(labels = scales::dollar_format()) +
scale_x_discrete(expand = c(0, .5)) +
facet_grid(category~., scales = "free_y", switch = "y", space = "free_y") +
ylab("Cost Value") +
xlab("subcategory") +
theme(panel.spacing.y = unit(0, "pt"), strip.placement = "outside")
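The ordering step (arrange, then factor with levels = subcat) is what fixes the order of the subcategories on the axis. A base-R sketch of the same idea, using order() in place of the dplyr verbs, which can be inspected without drawing anything:

```r
df <- data.frame(
  category = rep(c("A", "B"), times = c(6, 7)),
  subcat   = paste0("S", 1:13),
  value    = seq(100, 1300, by = 100)
)

# Sort rows by descending value, then freeze that order as the factor levels
df1 <- df[order(-df$value), ]
df1$subcat <- factor(df1$subcat, levels = df1$subcat)

# The level order is the order ggplot uses along the discrete axis
print(levels(df1$subcat))

# Number of subcategories per category; with space = "free_y" the height of
# each facet should roughly follow these counts
print(table(df1$category))
```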

Problem with the x-axis labels in ggplot2 using n.breaks

I use n.breaks to get a labelled x-axis tick mark for each cluster. This works well for 4, 5 or 6 clusters, but when I tried it with two clusters it no longer worked.
I build the graphs like this:
country_plot <- ggplot(Data) + aes(x = Cluster) +
theme(legend.title = element_blank(), axis.title.y = element_blank()) +
geom_bar(aes(fill = country), stat = "count", position = "fill", width = 0.85) +
scale_fill_manual(values = color_map_3, drop = F) +
scale_x_continuous(n.breaks = max(unique(Data$Cluster))) + scale_y_continuous(labels = percent) +
ggtitle("Country")
and export it like this:
ggsave("country_plot.png", plot = country_plot, device = "png", width = 16, height = 8, units = "cm")
When it works it looks something like this:
But with two clusters I get something like this, with only a single tick mark, labelled 2.5, beyond the actual bars:
I manually checked the return value of
max(unique(Data$Cluster))
and it returns 2, which in my understanding should produce two x-axis marks, 1 and 2, just as it does with more clusters.
edit:
mutate(country = factor(country, levels = 1:3)) %>%
mutate(country =fct_recode(country,!!!country_factor_naming))%>%
mutate(Gender = factor(Gender, levels = 1:2)) %>%
mutate(Gender = fct_recode(Gender, !!!gender_factor_naming))%>%
If I understand correctly, the issue is caused by Cluster being treated as a continuous variable. It needs to be turned into a factor.
Here is a minimal, reproducible example using the mtcars dataset that reproduces the unwanted behaviour:
First attempt (continuous x-axis)
library(ggplot2)
library(scales)
ggplot(mtcars) +
aes(x = gear, fill = factor(vs)) +
geom_bar(stat = "count", position = "fill", width = 0.85) +
scale_y_continuous(labels = percent)
In this example, gear takes over the role of Cluster and is assigned to the x-axis.
There are unwanted labeled tick marks at x = 2.5, 3.5, 4.5, 5.5 which are due to the continuous scale.
Second attempt (continuous x-axis with n.breaks given)
ggplot(mtcars) +
aes(x = gear, fill = factor(vs)) +
geom_bar(stat = "count", position = "fill", width = 0.85) +
scale_x_continuous(n.breaks = length(unique(mtcars$gear))) +
scale_y_continuous(labels = percent)
Specifying n.breaks in scale_x_continuous() does not change the x-axis to discrete.
Third attempt (discrete x-axis, gear as factor)
When gear is turned into a factor, we get a labeled tick mark for each factor value:
ggplot(mtcars) +
aes(x = factor(gear), fill = factor(vs)) +
geom_bar(stat = "count", position = "fill", width = 0.85) +
scale_y_continuous(labels = percent)
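If the x-axis has to stay continuous, a possible alternative is to pass the break positions explicitly instead of relying on n.breaks, which as far as I know is only a hint to the break algorithm and not a guaranteed count. A sketch with mtcars again; gear_breaks is just an illustrative name:

```r
library(ggplot2)
library(scales)

# The distinct x values that should each get a tick mark
gear_breaks <- sort(unique(mtcars$gear))

p <- ggplot(mtcars) +
  aes(x = gear, fill = factor(vs)) +
  geom_bar(stat = "count", position = "fill", width = 0.85) +
  # Explicit breaks: one labelled tick per observed value, nothing in between
  scale_x_continuous(breaks = gear_breaks) +
  scale_y_continuous(labels = percent)
```

For the original data this would correspond to breaks = sort(unique(Data$Cluster)), assuming Cluster is numeric.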

How do I represent percent of a variable in a filled barplot?

I have a data frame (t1) and I want to illustrate the shares of companies by size.
I added a dummy variable so that I get a single filled bar rather than three:
t1$row <- 1
Company sizes are split into medium, small and micro:
f_size <- factor(t1$size,
ordered = TRUE,
levels = c("medium", "small", "micro"))
The plot is built with the Economist theme (theme_economist):
ggplot(t1, aes(x = "Size", y = prop.table(row), fill = f_size)) +
geom_col() +
geom_text(aes(label = as.numeric(f_size)),
position = position_stack(vjust = 0.5)) +
theme_economist(base_size = 14) +
scale_fill_economist() +
theme(legend.position = "right",
legend.title = element_blank()) +
theme(axis.title.y = element_text(margin = margin(r = 20))) +
ylab("Percentage") +
xlab(NULL)
How can I modify my code to get the share for medium, small and micro in the middle of the three filled parts in the barplot?
Thanks in advance!
Your question isn't quite clear to me and I suggest you re-phrase it for clarity. But I believe you're trying to get the annotations accurately aligned on the y-axis. For this, pre-calculate the labels and then use annotate.
library(data.table)
library(ggplot2)
set.seed(3432)
df <- data.table(
cat= sample(LETTERS[1:3], 1000, replace = TRUE)
, x= rpois(1000, lambda = 5)
)
tmp <- df[, .(pct= sum(x) / sum(df[,x])), cat][, cumsum := cumsum(pct)]
ggplot(tmp, aes(x= 'size', y= pct, fill= cat)) + geom_bar(stat='identity') +
annotate('text', y= tmp[,cumsum] - 0.15, x= 1, label= as.character(tmp[,pct]))
But this is a poor choice graphically. A bar built from shares sums to 100% by construction. Rather than labelling the components with text, just let the graphic do this for you via the axis labels:
ggplot(tmp, aes(x= cat, y= pct, fill= cat)) + geom_bar(stat='identity') + coord_flip() +
scale_y_continuous(breaks= seq(0,1,.05))
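If the labels inside the single bar are still wanted, one way that avoids computing the y positions by hand is position_stack(vjust = 0.5), which centres each label within its own segment. A sketch with made-up company sizes; the t1 data below is hypothetical and only mirrors the structure described in the question:

```r
library(ggplot2)

# Hypothetical stand-in for t1: 2 medium, 3 small and 5 micro companies
t1 <- data.frame(size = c(rep("medium", 2), rep("small", 3), rep("micro", 5)))
t1$size <- factor(t1$size, levels = c("medium", "small", "micro"))

# Share of each size class, plus a ready-made percent label
shares <- as.data.frame(prop.table(table(size = t1$size)), responseName = "share")
shares$label <- paste0(round(100 * shares$share), "%")

p <- ggplot(shares, aes(x = "Size", y = share, fill = size)) +
  geom_col() +
  # vjust = 0.5 places each label in the middle of its stacked segment
  geom_text(aes(label = label), position = position_stack(vjust = 0.5)) +
  scale_y_continuous(labels = scales::percent) +
  ylab("Percentage") +
  xlab(NULL)
```

Because geom_col and geom_text share the same data and the same stacking, the labels follow the segments whatever the stacking order is.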

Plotting multiple Pie Charts with label in one plot

I came across this question the other day and tried to re-create it for myself: "ggplot, facet, piechart: placing text in the middle of pie chart slices". My data is in a very similar format, but sadly the accepted answer did not help, which is why I am re-posting.
I essentially want to create the accepted answer but with my own data, yet the issue I run into is that coord_polar does not support free scale. Using the first answer:
I tried it using the second version of the answer, with the ddplyr version, but I also do not get my desired output. Using the second answer:
Clearly neither of these has the desired effect. I would prefer to create one with six pie charts, but only showed four as an example, as follows:
This I did in Excel, but with one legend and no background grid.
Code
title<-c(1,1,2,2,3,3,4,4,5,5,6,6)
type<-c('A','B','A','B','A','B','A','B','A','B','A','B')
value<-c(0.25,0.75,0.3,0.7,0.4,0.6,0.5,0.5,0.1,0.9,0.15,0.85)
piec<-data.frame(title,type,value)
library(tidyverse)
p1<-ggplot(data = piec, aes(x = "", y = value, fill = type)) +
geom_bar(stat = "identity") +
geom_text(aes(label = value), position = position_stack(vjust = 0.5)) +
coord_polar(theta = "y")
#facet_grid(title ~ ., scales = "free")
p1
piec <- piec %>% group_by(title) %>% mutate(pos=cumsum(value)-0.5*value)
p2<-ggplot(data = piec) +
geom_bar(aes(x = "", y = value, fill = type), stat = "identity") +
geom_text(aes(x = "", y = pos, label = value)) +
coord_polar(theta = "y")
#facet_grid(Channel ~ ., scales = "free")
p2
You don't have to supply different y values for geom_text and geom_bar (use y = value for both of them). Next you have to specify position in geom_text. Finally, remove scales from facets.
library(ggplot2)
title<-c(1,1,2,2,3,3,4,4,5,5,6,6)
type<-c('A','B','A','B','A','B','A','B','A','B','A','B')
value<-c(0.25,0.75,0.3,0.7,0.4,0.6,0.5,0.5,0.1,0.9,0.15,0.85)
piec<-data.frame(title,type,value)
ggplot(piec, aes("", value, fill = type)) +
geom_bar(stat = "identity", color = "white", size = 1) +
geom_text(aes(label = paste0(value * 100, "%")),
position = position_stack(vjust = 0.5),
color = "white", size = 3) +
coord_polar(theta = "y") +
facet_wrap(~ title, ncol = 3) +
scale_fill_manual(values = c("#0048cc", "#cc8400")) +
theme_void()
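Dropping the free scales works here because every pie has the same total. A quick base-R check of that, plus a rescaling step that could be used if the totals differed; the share and label columns are illustrative additions, not part of the answer:

```r
title <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6)
type  <- c("A", "B", "A", "B", "A", "B", "A", "B", "A", "B", "A", "B")
value <- c(0.25, 0.75, 0.3, 0.7, 0.4, 0.6, 0.5, 0.5, 0.1, 0.9, 0.15, 0.85)
piec  <- data.frame(title, type, value)

# Total per pie; equal totals mean every facet fills the whole circle
totals <- tapply(piec$value, piec$title, sum)
print(totals)

# If the totals were unequal, rescaling within each title would restore that
piec$share <- piec$value / ave(piec$value, piec$title, FUN = sum)
piec$label <- paste0(round(piec$share * 100), "%")
```

With unequal totals and a shared scale, the smaller pies would be drawn as partial circles, which is usually the reason free scales get requested in the first place.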

Resources