R - make geom_line dotted/dashed/solid by category - r

My df looks something like this:
df <- read.table(text="
cat eff count segment segment2
1 1 0 123 plane plane_0
2 2 25 12 plane plane_25
3 3 50 54 plane plane_50
4 4 75 34 plane plane_75
5 1 50 62 car car_50
6 2 75 12 car car_75
7 1 50 11 boat boat_50
8 2 75 10 boat boat_75
", header=TRUE)
I need to plot this data frame as a line graph. I created the code below, but I need to vary the lines by color and by line type.
Plane should be red, car should be green and boat should be blue. If eff is 0 the line should be solid, if eff is 25 it should be dashed, 50 = dotted, 75 = twodash.
ggplot(df, aes(x = as.numeric(cat), y = eff, color = segment2)) +
geom_line(stat = "identity", size = 1.5, linetype = "dashed") +
geom_point(size = 3.5)

You can try this:
ggplot(df,
aes(x = as.numeric(cat), y = eff)) +
geom_line(aes(linetype = factor(eff)), size = 1.5) +
geom_point(aes(color = segment), size = 3.5) +
scale_color_manual(values = c("boat" = "blue", "car" = "green", "plane" = "red")) +
scale_linetype_manual(values = c("0" = "solid", "25" = "dashed",
"50" = "dotted", "75" = "twodash"))
(Note: based on the sample data, the points corresponding to car are completely hidden beneath other points.)
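The overlap mentioned in the note can be checked directly from the sample data. This is a minimal base R sketch, not part of the original answer; the names coords and overlap are introduced here for illustration only.

```r
# Sample data from the question (the first column becomes the row names)
df <- read.table(text = "
cat eff count segment segment2
1 1 0 123 plane plane_0
2 2 25 12 plane plane_25
3 3 50 54 plane plane_50
4 4 75 34 plane plane_75
5 1 50 62 car car_50
6 2 75 12 car car_75
7 1 50 11 boat boat_50
8 2 75 10 boat boat_75
", header = TRUE)

# Flag every row whose (cat, eff) position is shared with another row
coords  <- df[, c("cat", "eff")]
overlap <- duplicated(coords) | duplicated(coords, fromLast = TRUE)

# Rows that are drawn on top of each other
df[overlap, c("cat", "eff", "segment")]
```

In this sample the car and boat rows share the same positions, and the boat rows come later in the data, so they are drawn over the car points.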

Related

Change legend labels and position dodge

I created an interaction plot with ggplot and added outliers from a different data frame to the same plot. I want to change the legend labels (yes and no), but a new legend is added instead. Here is the code:
the theme I'm using:
theme_apa(
legend.pos = "right",
legend.use.title = FALSE,
legend.font.size = 12,
x.font.size = 12,
y.font.size = 12,
facet.title.size = 12,
remove.y.gridlines = TRUE,
remove.x.gridlines = TRUE
)
the plot:
InteractionWithOutliers <- ggplot() +
geom_line(data=data2, aes(x=Messzeitpunkt,
y = Sum_PCLMean,group = TB2,linetype=TB2),) +
scale_color_manual(labels = c("test", "test"),values=c('#000000','#000000'))+
geom_point(data = outliersDF, aes(Messzeitpunkt,Sum_PCL,
shape=TB2, color=TB2, size=TB2),) +
geom_point(data = data2, aes(Messzeitpunkt,Sum_PCLMean,
shape=TB2, color=TB2, size=TB2), ) +
scale_shape_manual(values=c(15, 17))+
scale_size_manual(values=c(2,2)) +
ylim(0, 60) +
scale_x_continuous(breaks = seq(0,2)) +
geom_errorbar(data=data2,aes(x = Messzeitpunkt,ymin=Sum_PCLMean-Sum_PCLSD, ymax=Sum_PCLMean+Sum_PCLSD), width=.2,)
InteractionWithOutliers + theme_apa() +
labs(x ="Measurement Period", y = "PTSS mean scores")
Image of the Graph:
Furthermore, when I try to use position_dodge to separate the interaction plot from the outliers, not everything moves the same way.
Code:
InteractionWithOutliers <- ggplot() +
geom_line(data=data2, aes(x=Messzeitpunkt,
y = Sum_PCLMean,group = TB2,linetype=TB2),position = position_dodge(width = 0.4)) +
scale_color_manual(labels = c("test", "test"),values=c('#000000','#000000'))+
geom_point(data = outliersDF, aes(Messzeitpunkt,Sum_PCL,
shape=TB2, color=TB2, size=TB2),position = position_dodge(width = 0.4)) +
geom_point(data = data2, aes(Messzeitpunkt,Sum_PCLMean,
shape=TB2, color=TB2, size=TB2),position = position_dodge(width = 0.4) ) +
scale_shape_manual(values=c(15, 17))+
scale_size_manual(values=c(2,2)) +
ylim(0, 60) +
scale_x_continuous(breaks = seq(0,2)) +
geom_errorbar(data=data2,aes(x = Messzeitpunkt,ymin=Sum_PCLMean-Sum_PCLSD, ymax=Sum_PCLMean+Sum_PCLSD),
width=.2,position = position_dodge(width = 0.4))
InteractionWithOutliers + theme_apa() +
labs(x ="Measurement Period", y = "PTSS mean scores")
Thank you for your help!
Edit: Data for the Outliers:
Messzeitpunkt Sum_PCL TB2
0 38 no
0 37 yes
0 40 yes
0 41 yes
0 38 yes
1 56 no
1 33 no
2 39 no
2 33 no
Data for the interaction plots:
Messzeitpunkt Sum_PCLMean TB2 Sum_PCLSD
0 9 no 11
0 12 yes 11
1 9 no 15
1 18 yes 16
2 8 no 12
2 14 yes 12
Merging legends can be painful. If your variables are already labelled (as in your example), you don't need to specify breaks or labels (see the first example).
However, a good rule is: don't map an aesthetic if you don't really need it. Size and color are constant in your case, so you could (and should) set them as constants outside of aes() (see the second example).
P.S. I have slightly changed the plot to make the essentials more visible. I personally prefer to keep my plots in the order geoms -> scales -> coordinates -> labels -> theme, which helps me keep an overview of the layers.
library(ggplot2)
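## Note: the data frame names are swapped relative to the question
## (data2 holds the raw Sum_PCL values here); the plot code below uses them consistently.
data2 <- read.table(text = "Messzeitpunkt Sum_PCL TB2
0 38 no
0 37 yes
0 40 yes
0 41 yes
0 38 yes
1 56 no
1 33 no
2 39 no
2 33 no", header = TRUE)
outliersDF <- read.table(text = "Messzeitpunkt Sum_PCLMean TB2 Sum_PCLSD
0 9 no 11
0 12 yes 11
1 9 no 15
1 18 yes 16
2 8 no 12
2 14 yes 12", header = TRUE)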
ggplot() +
geom_line(data = data2, aes(
x = Messzeitpunkt,
y = Sum_PCL, group = TB2, linetype = TB2
)) +
geom_point(data = outliersDF, aes(Messzeitpunkt, Sum_PCLMean,
shape = TB2, color = TB2, size = TB2
)) +
geom_point(data = data2, aes(Messzeitpunkt, Sum_PCL,
shape = TB2, color = TB2, size = TB2
)) +
## if your variable is labelled, no need to specify breaks or labels
scale_color_manual(values = c("#000000", "#000000")) +
scale_shape_manual(values = c(15, 17)) +
scale_size_manual(values = c(2, 2))
## Better, if you have constant aesthetics, not to use aes(), but
## add the values as constants instead
ggplot() +
geom_line(data = data2, aes(
x = Messzeitpunkt,
y = Sum_PCL, group = TB2, linetype = TB2
)) +
geom_point(data = outliersDF, aes(Messzeitpunkt, Sum_PCLMean,
shape = TB2
), size = 2) +
geom_point(data = data2, aes(Messzeitpunkt, Sum_PCL,
shape = TB2
## black color is default, this is just for demonstration
), color = "black", size = 2) +
scale_shape_manual(values = c(15, 17))
Created on 2022-07-15 by the reprex package (v2.0.1)
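The answer above does not show the relabelling itself. One way that may work is to give every scale mapped to TB2 the same breaks and labels, so ggplot2 can keep them in a single legend. This is only a sketch: the labels "Group A" and "Group B", the line types, and the names means, new_labels and p_legend are placeholders, not taken from the question.

```r
library(ggplot2)

# Means from the question ("Data for the interaction plots")
means <- read.table(header = TRUE, text = "Messzeitpunkt Sum_PCLMean TB2 Sum_PCLSD
0 9 no 11
0 12 yes 11
1 9 no 15
1 18 yes 16
2 8 no 12
2 14 yes 12")

# Hypothetical replacement labels for "no" and "yes"
new_labels <- c("Group A", "Group B")

p_legend <- ggplot(means, aes(Messzeitpunkt, Sum_PCLMean, group = TB2)) +
  geom_line(aes(linetype = TB2)) +
  geom_point(aes(shape = TB2), size = 2) +
  # identical breaks and labels on both scales keep one merged legend
  scale_linetype_manual(values = c(no = "solid", yes = "dashed"),
                        breaks = c("no", "yes"), labels = new_labels) +
  scale_shape_manual(values = c(no = 15, yes = 17),
                     breaks = c("no", "yes"), labels = new_labels)

p_legend
```

If only one of the scales gets the new labels, the legends no longer match and ggplot2 draws them separately, which is probably what produced the extra legend in the question.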

label mean lines in ggplot that are mapped in a group

I have density plots for each shift and year. The means are plotted by grouping in a df called mu. I also add vertical reference lines which I can label without issue but I cannot seem to get the labels on the grouped vertical lines. You will see my latest attempt which throws an error "Aesthetics must be either length 1 or the same as the data (134): x"
My code
library(ggplot2)
library(dplyr)
df <- read.csv("f4_bna_no_cup.csv")
head(df)
ï..n yr s ys x
1 1 2021 1 2021-1 116.83
2 2 2021 1 2021-1 114.83
3 3 2021 1 2021-1 115.50
4 4 2021 1 2021-1 115.42
5 5 2021 1 2021-1 115.58
6 6 2021 1 2021-1 115.58
#summarize means by ys (year-shift)
mu <- df %>%
group_by(ys,s) %>%
summarise(grp.mean = mean(x))
mu
ys s grp.mean
<chr> <int> <dbl>
1 2021-1 1 116.
2 2021-2 2 117.
3 2022-1 1 114.
4 2022-2 2 115.
llab<-mu
shift <- c("Shift 1", "Shift 2")
#density charts on df
ggplot(data=df, aes(x=x,group =ys, fill = yr, color = yr)) +
geom_density(alpha = 0.4) +
scale_x_continuous(limits=c(112,120))+
geom_vline(aes(xintercept = grp.mean), data = mu, linetype = "dashed", size = 0.5) +
geom_text(aes(x=llab$grp.mean, y=.6), label = llab$ys) + #this throws the error
geom_vline(aes(xintercept=114.8), linetype="dashed", size=0.5, color = 'green3') +
geom_text(aes(x=114.8, y=.6), label = "Target", angle = 90, color="black",size=3) +
geom_vline(aes(xintercept=114.1), linetype="solid", size=0.5, color = 'limegreen') +
geom_text(aes(x=114.1, y=.55), label = "Potential", angle = 90, color="black",size=3 ) +
geom_vline(aes(xintercept=113.4), linetype="solid", size=0.5, color = 'firebrick3') +
geom_text(aes(x=113.4, y=.62), label = "Label wt", angle = 90,
color="black",size=3, family = "Times New Roman", vjust=0) +
facet_grid(
.~s,
labeller = labeller(
s = c(`1` = "Shift 1", `2` = "Shift 2")
))+
theme_light()+
theme(legend.position = "none")
Output so far...I'm so close.
Persistence pays off. I figured it out and thought I would share it in case someone else has a similar problem:
All code remains the same as in my question except a slight change to grouping for the mu df, AND replace the line that I noted as throwing the error as follows:
#small change to group_by, retaining yr
mu <- df %>%
group_by(yr,s,ys) %>%
summarise(grp.mean = mean(x))
Replace: geom_text(aes(x=llab$grp.mean, y=.6), label = llab$ys), with
geom_text(data = mu, aes(label = yr), x = mu$grp.mean, y = .60, color = "black", angle = 90, vjust = 0)
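A variation that may be less fragile is to map x inside aes() so the positions come from the same data frame as the labels. The sketch below uses made-up data in place of the CSV from the question (same column names, random values) and base aggregate() instead of dplyr; p_means is a name introduced here for illustration.

```r
library(ggplot2)

# Made-up stand-in for the question's CSV: same column names, random values
set.seed(1)
df <- data.frame(
  yr = rep(c(2021, 2022), each = 100),
  s  = rep(rep(1:2, each = 50), times = 2)
)
df$ys <- paste(df$yr, df$s, sep = "-")
df$x  <- rnorm(nrow(df), mean = 115, sd = 1)

# One mean per year-shift, keeping yr (label) and s (facet variable)
mu <- aggregate(x ~ yr + s + ys, data = df, FUN = mean)
names(mu)[names(mu) == "x"] <- "grp.mean"

p_means <- ggplot(df, aes(x = x, group = ys, fill = factor(yr))) +
  geom_density(alpha = 0.4) +
  geom_vline(data = mu, aes(xintercept = grp.mean), linetype = "dashed") +
  # x, y and label all come from mu; nothing is inherited from ggplot()
  geom_text(data = mu, aes(x = grp.mean, y = 0.6, label = yr),
            inherit.aes = FALSE, angle = 90, vjust = 0) +
  facet_grid(. ~ s)

p_means
```

Because s is a column of mu, each label ends up in the facet of its own shift.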

Define each label in heatmap clearly in ggplot in R

I have following data frame:
ID position hum_chr_pos CHROM a1 a2 a3 a4 ID_rn
rs1 197_V 897738 1 0.343442666 0.074361225 1 0.028854932 1
rs3 1582_N 2114271 2 0.015863115 1 0.003432604 0.840242328 2
rs6 2266_I 79522907 3 0.177445544 0.090282782 1 0.038199399 3
rs8 521_D 86959173 4 0.542804846 0.088721027 1 0.047758851 4
rs98 1368_G 92252015 5 0.02861059 0.979995611 0.007545923 1 5
rs23 540_A 96162102 5 0.343781806 0.062643599 1 0.024992095 6
rs43 2358_S 147351955 6 0.042592955 0.862087128 0.013001476 1 7
rs65 577_E 168572720 6 0.517111734 0.080471431 1 0.034521778 8
rs602 1932_T 169483561 6 0.043270585 1 0.009731403 0.988762282 9
rs601 1932_T 169511878 6 0.042963813 0.911392392 0.010562154 1 10
rs603 1932_T 169513583 6 0.04096538 0.956129216 0.010983517 1 11
rs606 1936_T 169513573 7 0.04838 0.0126129216 0.090983517 1 12
rs609 1935_T 169513574 7 0.056 0.045 0.086 1 13
I created a heatmap with the values a1, a2, a3, a4:
For this I used this code:
df_melt <- melt(dummy, id.vars=c("ID", "position","hum_chr_pos","CHROM","ID_rn"))
pos <- df_melt %>%
group_by(CHROM) %>%
summarize(avg = round(mean(ID_rn))) %>%
pull(avg)
ggplot(df_melt, aes(x=variable, y=ID_rn)) + geom_tile(aes(fill=value))+theme_bw()+
scale_fill_gradient2(low="lightblue", mid="white", high="darkblue", midpoint=0.5, limits=range(df_melt$value))+
theme_classic()+ labs(title="graph", x= "a", fill = "value")+
ylab("CHROM") +
scale_y_discrete(limits = pos,labels = unique(limits = pos,df_melt$CHROM))
I would like to find a way to see the separation of each factor on the y axis more clearly. At the moment it is not really clear which rows belong to which label on the y axis. So I would like to have something like this:
It is also odd that the numbers are sometimes not centered on their factor. For example, the 5 and 7 on the y axis are not centered.
I have been searching for how to do this, but couldn't find anything.
You could use geom_hline:
ggplot(df_melt, aes(x = variable, y = ID_rn)) +
geom_tile(aes(fill = value)) +
theme_bw() +
scale_fill_gradient2(low = "lightblue", mid = "white", high = "darkblue",
midpoint = 0.5, limits = range(df_melt$value)) +
theme_classic() +
labs(title="graph", x= "a", fill = "value", y = "CHROM") +
scale_y_discrete(limits = c(1, 2, 3, 4, 5.5, 9, 12.5),
labels = unique(df_melt$CHROM)) +
geom_hline(yintercept = c(1, 2, 3, 4, 6, 11, 13) + 0.5, color = 'red')
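The hard-coded positions above can also be derived from the data. The sketch below is plain base R; rows, label_pos and line_pos are names introduced here for illustration. It also shows why 5 and 7 were off-center in the question: round(mean(ID_rn)) turns 5.5 into 6 and 12.5 into 12.

```r
# CHROM and ID_rn columns from the question's data frame
rows <- data.frame(
  CHROM = c(1, 2, 3, 4, 5, 5, 6, 6, 6, 6, 6, 7, 7),
  ID_rn = 1:13
)

# Label position: the middle of each CHROM block (no rounding)
label_pos <- tapply(rows$ID_rn, rows$CHROM, function(v) mean(range(v)))

# Separator position: half a row above the last row of each block
line_pos <- tapply(rows$ID_rn, rows$CHROM, max) + 0.5

label_pos
line_pos
```

These two vectors could then replace the literal numbers passed to limits and yintercept in the answer.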

Scatterpie: How to add annotation over line connecting pies to mark percent change in y-values between pies

I have a scatterpie plot with pies plotted over x and y axes and a "trend line" connecting them. In the spirit of this answer, I would like to add an annotation over each line segment to mark the percent increase/decrease between the y-values of each pair of adjacent pies.
My data
library(tidyverse)
library(scatterpie)
my_df <- structure(list(day_in_july = 13:20, yes_and_yes = c(0.611814345991561,
0.574750830564784, 0.593323216995448, 0.610539845758355, 0.650602409638554,
0.57429718875502, 0.575971731448763, 0.545454545454545), yes_but_no = c(0.388185654008439,
0.425249169435216, 0.406676783004552, 0.389460154241645, 0.349397590361446,
0.42570281124498, 0.424028268551237, 0.454545454545455), y = c(0.388185654008439,
0.425249169435216, 0.406676783004552, 0.389460154241645, 0.349397590361446,
0.42570281124498, 0.424028268551237, 0.454545454545455)), row.names = c(NA,
-8L), class = c("tbl_df", "tbl", "data.frame"))
My current visualization
p <- ggplot(data = my_df) +
geom_path(aes(x=day_in_july, y = y*50)) +
geom_scatterpie(aes(x = day_in_july, y = y*50, r = 0.3),
data = my_df,
cols = colnames(my_df)[2:3],
color = "red") +
geom_text(aes(y = y*50, x = day_in_july,
label = paste0(formatC(y*100, digits = 3), "%")),
nudge_y = 0.07, nudge_x = -0.25, size = 3) +
geom_text(aes(y = y*50, x = day_in_july,
label = paste0(formatC((1-y)*100, digits = 3), "%")),
nudge_y = -0.07, nudge_x = 0.25, size = 3) +
scale_fill_manual(values = c("pink", "seagreen3")) +
scale_x_continuous(labels = xvals, breaks = xvals) +
scale_y_continuous(name = "yes but no",
labels = function(x) x/50) +
coord_fixed()
> p
I want to add percent increase/decrease between y-values of adjacent pies
The y-value of the first pie (at day_in_july = 13) is 0.388. From this y-value to the next pie's y-value (0.425) there's a percent increase of 9.53%. Therefore, I want to mark the line that connects the two pies with a label of +9.53% .
Ultimately, I want the plot to look like this one:
On the way to the solution
This answer already has the relevant mechanism to get what I'm looking for.
The idea is to use ggplot_build() to access the data underlying the plot, then calculate the percent change between two consecutive values, then rebuild the plot with the lines annotated accordingly. However, this solution isn't working for me with the scatterpie plot since the underlying data outputted from ggplot_build is of its own kind.
plot_data <- ggplot_build(p)$data[[1]] %>% as_tibble()
> plot_data
## # A tibble: 2,904 x 13
## fill group index amount PANEL stringsAsFactors nControl x y colour size linetype alpha
## <chr> <chr> <dbl> <dbl> <fct> <lgl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <lgl>
## 1 pink 1 0 0.612 1 FALSE 221 13 19.7 red 0.5 1 NA
## 2 pink 1 0.00452 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## 3 pink 1 0.00905 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## 4 pink 1 0.0136 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## 5 pink 1 0.0181 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## 6 pink 1 0.0226 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## 7 pink 1 0.0271 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## 8 pink 1 0.0317 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## 9 pink 1 0.0362 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## 10 pink 1 0.0407 0.612 1 FALSE 221 13.0 19.7 red 0.5 1 NA
## # ... with 2,894 more rows
Where are the actual y-values that I need for calculating the percent change between pies' y-values? Obviously, I can get the y-values from the data. But in order to reconstruct the plot, this data from ggplot_build() doesn't make sense to me, and I don't know how to utilize the technique to add the percentage change between pies to the plot line.
Here is my attempt with the ggrepel package. I created a new data frame, foo, containing the information geom_label_repel() needs: the label position (midway between each pair of adjacent pies) and the percent change. I invested a bit of time in finding good positions for the labels, and this is what I could do for now. If you are not happy with the positions, you will have to play around with nudge_x, direction and vjust yourself.
library(ggrepel)
foo <- tibble(day_in_july = my_df$day_in_july + 0.5,
y = my_df$y * 50 + (((lead(my_df$y * 50) - (my_df$y * 50))) / 2),
gap = ((lead(my_df$yes_but_no) / my_df$yes_but_no) - 1) * 100) %>%
## pick the colour while gap is still numeric, then turn gap into a label
mutate(hue = ifelse(gap > 0, "green", "red"),
gap = paste(round(gap, digits = 2), "%", sep = ""))
p <- ggplot(data = my_df) +
geom_path(aes(x = day_in_july, y = y*50)) +
geom_scatterpie(aes(x = day_in_july, y = y*50, r = 0.3),
data = my_df,
cols = colnames(my_df)[2:3],
color = "red") +
geom_text(aes(y = y * 50, x = day_in_july,
label = paste0(formatC(y * 100, digits = 3), "%")),
nudge_y = 0.07, nudge_x = -0.25, size = 3) +
geom_text(aes(y = y * 50, x = day_in_july,
label = paste0(formatC((1-y) * 100, digits = 3), "%")),
nudge_y = -0.07, nudge_x = 0.25, size = 3) +
scale_fill_manual(values = c("pink", "seagreen3")) +
geom_label_repel(data = foo,
aes(x = day_in_july, y = y,
color = hue, label = as.character(gap)),
show.legend = FALSE,
nudge_x = 0.3,
direction = "y",
vjust = -1.0) +
scale_color_manual(values = c("green", "red"))
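If the labels should carry an explicit sign, as in the "+9.53%" example from the question, the percent change can be formatted with sprintf(). This is a small base R sketch using the y values from my_df; pct and pct_label are names introduced here for illustration.

```r
# y values from my_df in the question
y <- c(0.388185654008439, 0.425249169435216, 0.406676783004552,
       0.389460154241645, 0.349397590361446, 0.42570281124498,
       0.424028268551237, 0.454545454545455)

# Percent change from each pie to the next (one value fewer than pies)
pct <- (y[-1] / y[-length(y)] - 1) * 100

# "%+.2f" always prints the sign; "%%" is a literal percent character
pct_label <- sprintf("%+.2f%%", pct)
pct_label
```

The resulting character vector has one entry per line segment and could be used as the label column in foo (after dropping its last, empty row).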

geom_text() to label two separate points from different plots in ggplot

I am trying to create individual plots facetted by 'iid' using 'facet_multiple', in the following dataset (first 3 rows of data)
iid Age iop al baseIOP baseAGE baseAL agesurg
1 1 1189 20 27.9 21 336 24.9 336
2 2 877 11 21.5 16 98 20.3 98
3 2 1198 15 21.7 16 98 20.3 98
and wrote the following code:
# Install gg_plus from GitHub
remotes::install_github("guiastrennec/ggplus")
# Load libraries
library(ggplot2)
library(ggplus)
# Generate ggplot object
p <- ggplot(data_longF1, aes(x = Age, y = al)) +
geom_point(alpha = 0.5) +
geom_point(aes(x= baseAGE, y=baseAL)) +
labs(x = 'Age (days)',
y = 'Axial length (mm)',
title = 'Individual plots of Axial length v time')
p1 <- p+geom_vline(aes(xintercept = agesurg),
linetype = "dotted",
colour = "red",
size =1.0)
p2<- p1 + geom_text(aes(label=iop ,hjust=-1, vjust=-1))
p3 <- p2 + geom_text(aes(label = baseIOP, hjust=-1, vjust=-1))
# Plot on multiple pages (output plot to R/Rstudio)
facet_multiple(plot = p3,
facets = 'iid',
ncol = 1,
nrow = 1,
scales = 'free')
The main issue I am having is labeling the points. The points corresponding to (x=Age, y=al) get labelled fine, but the labels for the second group of points (x=baseAGE, y=baseAL) get put in the wrong place.
I have had a look at similar issues on Stack Overflow, e.g. ggplot combining two plots from different data.frames,
but I have not been able to correct my code.
Thanks for your help
You need to define the x and y coordinates for the labels, or they will inherit the ones set in ggplot() (x = Age, y = al).
Thus the geom_text() definitions should look something like:
data_longF1 <-read.table(header=TRUE, text="iid Age iop al baseIOP baseAGE baseAL agesurg
1 1 1189 20 27.9 21 336 24.9 336
2 2 877 11 21.5 16 98 20.3 98
3 2 1198 15 21.7 16 98 20.3 98")
# Generate ggplot object
p <- ggplot(data_longF1, aes(x = Age, y = al)) +
geom_point(alpha = 0.5) +
geom_point(aes(x= baseAGE, y=baseAL)) +
labs(x = 'Age (days)',
y = 'Axial length (mm)',
title = 'Individual plots of Axial length v time')
p1 <- p+geom_vline(aes(xintercept = agesurg),
linetype = "dotted",
colour = "red",
size =1.0)
# Specify x and y for each label layer; otherwise geom_text() inherits x = Age, y = al from ggplot()
p2<- p1 + geom_text(aes(x=Age, y= al, label=iop ,hjust=-1, vjust=-1))
p3 <- p2 + geom_text(aes(x=baseAGE, y= baseAL, label = baseIOP, hjust=-1, vjust=-1))
print(p3)
