I created an interaction plot with ggplot and added outliers from a different data frame to the same plot. I want to change the legend labels (yes and no), but a new legend is added instead of the existing labels being changed. Here is the code:
the theme I'm using:
theme_apa(
legend.pos = "right",
legend.use.title = FALSE,
legend.font.size = 12,
x.font.size = 12,
y.font.size = 12,
facet.title.size = 12,
remove.y.gridlines = TRUE,
remove.x.gridlines = TRUE
)
the plot:
InteractionWithOutliers <- ggplot() +
geom_line(data=data2, aes(x=Messzeitpunkt,
y = Sum_PCLMean,group = TB2,linetype=TB2),) +
scale_color_manual(labels = c("test", "test"),values=c('#000000','#000000'))+
geom_point(data = outliersDF, aes(Messzeitpunkt,Sum_PCL,
shape=TB2, color=TB2, size=TB2),) +
geom_point(data = data2, aes(Messzeitpunkt,Sum_PCLMean,
shape=TB2, color=TB2, size=TB2), ) +
scale_shape_manual(values=c(15, 17))+
scale_size_manual(values=c(2,2)) +
ylim(0, 60) +
scale_x_continuous(breaks = seq(0,2)) +
geom_errorbar(data=data2,aes(x = Messzeitpunkt,ymin=Sum_PCLMean-Sum_PCLSD, ymax=Sum_PCLMean+Sum_PCLSD), width=.2,)
InteractionWithOutliers + theme_apa() +
labs(x ="Measurement Period", y = "PTSS mean scores")
Image of the Graph:
Furthermore, when I try to use position_dodge to separate the interaction plot and the outliers, not all elements are shifted in the same way.
Code:
InteractionWithOutliers <- ggplot() +
geom_line(data=data2, aes(x=Messzeitpunkt,
y = Sum_PCLMean,group = TB2,linetype=TB2),position = position_dodge(width = 0.4)) +
scale_color_manual(labels = c("test", "test"),values=c('#000000','#000000'))+
geom_point(data = outliersDF, aes(Messzeitpunkt,Sum_PCL,
shape=TB2, color=TB2, size=TB2),position = position_dodge(width = 0.4)) +
geom_point(data = data2, aes(Messzeitpunkt,Sum_PCLMean,
shape=TB2, color=TB2, size=TB2),position = position_dodge(width = 0.4) ) +
scale_shape_manual(values=c(15, 17))+
scale_size_manual(values=c(2,2)) +
ylim(0, 60) +
scale_x_continuous(breaks = seq(0,2)) +
geom_errorbar(data=data2,aes(x = Messzeitpunkt,ymin=Sum_PCLMean-Sum_PCLSD, ymax=Sum_PCLMean+Sum_PCLSD),
width=.2,position = position_dodge(width = 0.4))
InteractionWithOutliers + theme_apa() +
labs(x ="Measurement Period", y = "PTSS mean scores")
Thank you for your help!
Edit: Data for the Outliers:
Messzeitpunkt Sum_PCL TB2
0 38 no
0 37 yes
0 40 yes
0 41 yes
0 38 yes
1 56 no
1 33 no
2 39 no
2 33 no
Data for the interaction plots:
Messzeitpunkt Sum_PCLMean TB2 Sum_PCLSD
0 9 no 11
0 12 yes 11
1 9 no 15
1 18 yes 16
2 8 no 12
2 14 yes 12
Merging legends can sometimes be painful. If your variables are already labelled (as in your example), you don't need to specify breaks or labels either (see the first example).
However, a good rule is: don't map an aesthetic unless you really need it. Size and color are constant in your case, so you could (and should) set them as constants outside of aes().
P.S. I have slightly changed the plot to make the essentials more visible. I personally prefer to keep my plots in the order geoms -> scales -> coordinates -> labels -> theme; this helps me keep an overview of the layers.
library(ggplot2)
data2 <- read.table(text = "Messzeitpunkt Sum_PCL TB2
0 38 no
0 37 yes
0 40 yes
0 41 yes
0 38 yes
1 56 no
1 33 no
2 39 no
2 33 no", header = TRUE)
outliersDF <- read.table(text = "Messzeitpunkt Sum_PCLMean TB2 Sum_PCLSD
0 9 no 11
0 12 yes 11
1 9 no 15
1 18 yes 16
2 8 no 12
2 14 yes 12", header = TRUE)
ggplot() +
geom_line(data = data2, aes(
x = Messzeitpunkt,
y = Sum_PCL, group = TB2, linetype = TB2
)) +
geom_point(data = outliersDF, aes(Messzeitpunkt, Sum_PCLMean,
shape = TB2, color = TB2, size = TB2
)) +
geom_point(data = data2, aes(Messzeitpunkt, Sum_PCL,
shape = TB2, color = TB2, size = TB2
)) +
## if your variable is labelled, no need to specify breaks or labels
scale_color_manual(values = c("#000000", "#000000")) +
scale_shape_manual(values = c(15, 17)) +
scale_size_manual(values = c(2, 2))
## Better, if you have constant aesthetics, not to use aes(), but
## add the values as constants instead
ggplot() +
geom_line(data = data2, aes(
x = Messzeitpunkt,
y = Sum_PCL, group = TB2, linetype = TB2
)) +
geom_point(data = outliersDF, aes(Messzeitpunkt, Sum_PCLMean,
shape = TB2
), size = 2) +
geom_point(data = data2, aes(Messzeitpunkt, Sum_PCL,
shape = TB2
## black color is default, this is just for demonstration
), color = "black", size = 2) +
scale_shape_manual(values = c(15, 17))
Created on 2022-07-15 by the reprex package (v2.0.1)
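The question also asked for custom legend labels and for consistent dodging, so here is a minimal sketch of one way to get both. It rests on a few assumptions: the names `means_df`, `outliers_df`, `new_labels` and `dodge_x()` are mine and not from the question, and the label texts are placeholders. Giving the shape and linetype scales the same labels should let ggplot2 keep them in one merged legend. As far as I understand, `position_dodge()` works out its offsets separately for each layer from the groups it finds there, so a layer that lacks a group at some x position can end up shifted differently; computing the offset by hand avoids that.

```r
library(ggplot2)

# Data from the question (means with SDs, and the individual outliers)
means_df <- read.table(text = "Messzeitpunkt Sum_PCLMean TB2 Sum_PCLSD
0 9 no 11
0 12 yes 11
1 9 no 15
1 18 yes 16
2 8 no 12
2 14 yes 12", header = TRUE)

outliers_df <- read.table(text = "Messzeitpunkt Sum_PCL TB2
0 38 no
0 37 yes
0 40 yes
0 41 yes
0 38 yes
1 56 no
1 33 no
2 39 no
2 33 no", header = TRUE)

# Hypothetical helper: shift "no" to the left and "yes" to the right
dodge_x <- function(x, group, offset = 0.1) {
  x + ifelse(group == "yes", offset, -offset)
}
means_df$x_dodged <- dodge_x(means_df$Messzeitpunkt, means_df$TB2)
outliers_df$x_dodged <- dodge_x(outliers_df$Messzeitpunkt, outliers_df$TB2)

# Placeholder labels, given in the order of the levels ("no", then "yes")
new_labels <- c("Group A", "Group B")

p <- ggplot(means_df, aes(x = x_dodged, y = Sum_PCLMean, group = TB2)) +
  geom_line(aes(linetype = TB2)) +
  geom_errorbar(aes(ymin = Sum_PCLMean - Sum_PCLSD,
                    ymax = Sum_PCLMean + Sum_PCLSD), width = 0.2) +
  geom_point(aes(shape = TB2), size = 2) +
  geom_point(data = outliers_df, aes(y = Sum_PCL, shape = TB2), size = 2) +
  # identical labels on both scales so the legends can be merged
  scale_shape_manual(values = c(15, 17), labels = new_labels) +
  scale_linetype_manual(values = c("solid", "dashed"), labels = new_labels) +
  scale_x_continuous(breaks = 0:2) +
  labs(x = "Measurement Period", y = "PTSS mean scores")
p
```

I left out `ylim(0, 60)` here because some error bars extend below zero and would be dropped by that limit.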
I have density plots for each shift and year. The means are plotted by grouping in a df called mu. I also add vertical reference lines which I can label without issue but I cannot seem to get the labels on the grouped vertical lines. You will see my latest attempt which throws an error "Aesthetics must be either length 1 or the same as the data (134): x"
My code
library(ggplot2)
library(dplyr)
df <- read.csv("f4_bna_no_cup.csv")
head(df)
ï..n yr s ys x
1 1 2021 1 2021-1 116.83
2 2 2021 1 2021-1 114.83
3 3 2021 1 2021-1 115.50
4 4 2021 1 2021-1 115.42
5 5 2021 1 2021-1 115.58
6 6 2021 1 2021-1 115.58
#summarize means by ys (year-shift)
mu <- df %>%
group_by(ys,s) %>%
summarise(grp.mean = mean(x))
mu
ys s grp.mean
<chr> <int> <dbl>
1 2021-1 1 116.
2 2021-2 2 117.
3 2022-1 1 114.
4 2022-2 2 115.
llab<-mu
shift <- c("Shift 1", "Shift 2")
#density charts on df
ggplot(data=df, aes(x=x,group =ys, fill = yr, color = yr)) +
geom_density(alpha = 0.4) +
scale_x_continuous(limits=c(112,120))+
geom_vline(aes(xintercept = grp.mean), data = mu, linetype = "dashed", size = 0.5) +
geom_text(aes(x=llab$grp.mean, y=.6), label = llab$ys) + #this throws the error
geom_vline(aes(xintercept=114.8), linetype="dashed", size=0.5, color = 'green3') +
geom_text(aes(x=114.8, y=.6), label = "Target", angle = 90, color="black",size=3) +
geom_vline(aes(xintercept=114.1), linetype="solid", size=0.5, color = 'limegreen') +
geom_text(aes(x=114.1, y=.55), label = "Potential", angle = 90, color="black",size=3 ) +
geom_vline(aes(xintercept=113.4), linetype="solid", size=0.5, color = 'firebrick3') +
geom_text(aes(x=113.4, y=.62), label = "Label wt", angle = 90,
color="black",size=3, family = "Times New Roman", vjust=0) +
facet_grid(
.~s,
labeller = labeller(
s = c(`1` = "Shift 1", `2` = "Shift 2")
))+
theme_light()+
theme(legend.position = "none")
Output so far...I'm so close.
Persistence pays off. I figured it out and thought I would share it in case someone else has a similar problem:
All the code stays the same as in my question, except for a slight change to the grouping of the mu df and a replacement for the line I marked as throwing the error:
#small change to group_by, retaining yr
mu <- df %>%
group_by(yr,s,ys) %>%
summarise(grp.mean = mean(x))
Replace: geom_text(aes(x=llab$grp.mean, y=.6), label = llab$ys), with
geom_text(data = mu, aes(label = yr), x = mu$grp.mean, y = .60, color = "black", angle = 90, vjust = 0)
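For anyone who would rather keep everything inside aes(), here is a sketch of the same idea with the x position mapped from the summary data frame instead of being passed as `mu$grp.mean`. The data below is simulated, because the original CSV is not available, so the numbers are made up; only the column names follow the question.

```r
library(ggplot2)
library(dplyr)

# Simulated stand-in for f4_bna_no_cup.csv (values are invented)
set.seed(1)
df <- expand.grid(yr = c(2021, 2022), s = 1:2, rep = 1:30) %>%
  mutate(ys = paste(yr, s, sep = "-"),
         x = rnorm(n(), mean = 115 + s - (yr - 2021), sd = 0.8))

# One mean per year and shift, keeping yr for the labels
mu <- df %>%
  group_by(yr, s, ys) %>%
  summarise(grp.mean = mean(x), .groups = "drop")

p <- ggplot(df, aes(x = x, group = ys, fill = factor(yr), colour = factor(yr))) +
  geom_density(alpha = 0.4) +
  geom_vline(data = mu, aes(xintercept = grp.mean), linetype = "dashed") +
  # x, y and label all come from mu, so each facet only gets its own rows
  geom_text(data = mu, aes(x = grp.mean, y = 0.6, label = yr),
            colour = "black", angle = 90, vjust = 0, inherit.aes = FALSE) +
  facet_grid(. ~ s, labeller = labeller(s = c(`1` = "Shift 1", `2` = "Shift 2"))) +
  theme_light() +
  theme(legend.position = "none")
p
```

Because the label layer has its own `data` and does not inherit the plot-level mapping, its length no longer has to match the 134 rows of `df`, which is what the original error message was complaining about.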
I am trying to cache a big data.table and then make a plot from it. The code is as follows:
{r gen-data, tidy=TRUE, warning=FALSE, tidy.opts=list(width.cutoff=60), cache = TRUE, cache.lazy=FALSE}
DT = fread("reference.txt.gz", header = FALSE)
vc = c("chromosome_1", "chromosome_2", "chromosome_3", "chromosome_4", "chromosome_5", "chromosome_6")
colnames(DT) = c("chrom", "position", "score", "corrected base", "score of the corrected base")
DT=setDT(DT, key = "chrom")[J(vc), nomatch = 0]
{r, cache=TRUE, tidy=TRUE, warning=FALSE, tidy.opts=list(width.cutoff=60), dependson='gen-data'}
plot = ggplot(data = DT) + geom_line(aes(x = position, y = score, group = 1), stat = "summary_bin", fun.y = "mean", binwidth = 100000, color = ghibli_palette("MononokeMedium")[2])
ttle = paste0("coverage of the 6 longest scaffolds of Shasta + instagraal assembly")
plot = plot + labs(
title = ttle) + theme(plot.title = element_markdown(lineheight = 1.5, size = 12), legend.text = element_markdown(size = 14))
plot = plot + theme(axis.title = element_markdown(size = 12)) + theme(axis.text.x = element_text(size=5)) + theme(axis.text.y = element_text(size=3))
plot = plot + theme(legend.title = element_markdown(size = 12))
p = plot + facet_wrap(~chrom, scales = "free_x") +xlab( "position") + ylab("mean score per 100 Kb windows")
v = ggplotly(p) %>%
layout(
xaxis = list(automargin=TRUE),
yaxis = list(automargin=TRUE)
)
v
My thinking was that the first chunk reads the data into a data.table, applies the relevant selection, and finally caches the DT object.
However, the first chunk is evaluated every time, no matter what. I must be doing something wrong, but I can't see what.
Thanks for any help.
EDIT:
Adding some of the data: here is a sample of reference.txt (yes, it is normal that it has only 3 columns here; some lines can have up to 5).
chromosome_1 1 91
chromosome_1 2 91
chromosome_1 3 91
chromosome_1 4 91
chromosome_1 5 91
chromosome_1 6 91
chromosome_1 7 91
chromosome_1 8 91
chromosome_1 9 91
chromosome_1 10 91
# Data:
zz <- "Small Large Lat Long
1 51 2 11 10
2 49 0 12 11
3 77 7 13 13
4 46 5 12 15
5 32 6 13 14
6 54 3 15 17
7 68 0 14 10
8 39 5 12 13"
Data <- as.data.frame(read.table(text=zz, header = TRUE))
I have a continuous variable, a ratio (small/large), which I am plotting successfully.
However, some 0s exist in the 'large' variable. When this occurs, I just want to plot the 'small' number, since a ratio is impossible. To do this I have the following:
ratio.both <- Data %>%
filter(Large > 0) %>%
mutate(Ratio = Small/Large)
only.sml<- Data %>%
filter(Large < 1)
I then plot both on the same graph (by lat long data):
ggplot() +
geom_point(data = ratio.both,
aes(x = Long,
y = Lat,
size = Ratio),
stroke = 0,
colour = '#3B3B3B',
shape=16) +
#
geom_point(data = only.sml,
aes(x = Long,
y = Lat,
size = Small,
shape=1),
stroke = 1,
shape=1)
Notice the difference in shape. This plots the following
(not the nicest graph, but it demonstrates the example)
The difference between the points that are a ratio (filled) and those that are just the small value is clear on the map but hard to see in the legend.
I want the following in the legend:
#Title
Size = both.ratio$Ratio,
Shape/fill = Ratio or small value #whichever is easier
It is much easier to contrast the data through a variable in the table and the built-in aesthetic mapping than to create separate geoms for the ratio and small-value data. For example, you can create a new variable that records which "type" each data point belongs to. You can then map shape, color, size or whatever you want in the aesthetics and, optionally, add manual scales for them.
Data %>%
mutate(is_large = ifelse(Large > 0, "Ratio", "Small"),
size = ifelse(is_large == "Ratio", Small/Large, Small)) %>%
ggplot(aes(Long, Lat,
size = size,
shape = is_large)) +
geom_point() +
scale_shape_manual(values = c("Ratio" = 16, "Small" = 1),
name = "Size") +
scale_size_continuous(name = "Ratio/small value")
Or if you want to contrast by point color:
Data %>%
mutate(is_large = ifelse(Large > 0, "Ratio", "Small"),
size = ifelse(is_large == "Ratio", Small/Large, Small)) %>%
ggplot(aes(Long, Lat,
size = size,
color = is_large)) +
geom_point() +
scale_color_manual(values = c("Ratio" = "blue", "Small" = "red"),
name = "Size") +
scale_size_continuous(name = "Ratio/small value")
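As a quick sanity check on the derived columns, here is a small sketch that runs only the mutate step on the sample data from the question. The column name `type` is my own choice for illustration. The point to watch is that the second `ifelse()` has to compare against the label that was actually assigned ("Ratio"); otherwise every row falls back to the plain `Small` value.

```r
library(dplyr)

zz <- "Small Large Lat Long
1 51 2 11 10
2 49 0 12 11
3 77 7 13 13
4 46 5 12 15
5 32 6 13 14
6 54 3 15 17
7 68 0 14 10
8 39 5 12 13"
Data <- read.table(text = zz, header = TRUE)

plot_data <- Data %>%
  mutate(type = ifelse(Large > 0, "Ratio", "Small"),          # which kind of value is plotted
         size = ifelse(type == "Ratio", Small / Large, Small)) # ratio where possible, else Small

# Rows 2 and 7 have Large == 0, so they keep the raw Small value
plot_data[, c("Small", "Large", "type", "size")]
```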
My df looks something like this:
df <- read.table(text="
cat eff count segment segment2
1 1 0 123 plane plane_0
2 2 25 12 plane plane_25
3 3 50 54 plane plane_50
4 4 75 34 plane plane_75
5 1 50 62 car car_50
6 2 75 12 car car_75
7 1 50 11 boat boat_50
8 2 75 10 boat boat_75
", header=TRUE)
I need to plot this data frame as a line graph. I created the code below, but I need to distinguish the series by color and line type.
Plane should be red, car green and boat blue. If eff is 0 the line should be solid; if eff is 25, dashed; 50, dotted; 75, twodash.
ggplot(df, aes(x = as.numeric(cat), y = eff, color = segment2)) +
geom_line(stat = "identity", size = 1.5, linetype = "dashed") +
geom_point(size = 3.5)
You can try this:
ggplot(df,
aes(x = as.numeric(cat), y = eff)) +
geom_line(aes(linetype = factor(eff)), size = 1.5) +
geom_point(aes(color = segment), size = 3.5) +
scale_color_manual(values = c("boat" = "blue", "car" = "green", "plane" = "red")) +
scale_linetype_manual(values = c("0" = "solid", "25" = "dashed",
"50" = "dotted", "75" = "twodash"))
(Note: based on the sample data, the points corresponding to car are completely hidden beneath other points.)
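If the hidden points matter, one option is to nudge the points apart horizontally. The following is only a sketch of the point layer, using the sample data from the question; the dodge width of 0.3 is an arbitrary choice of mine.

```r
library(ggplot2)

df <- read.table(text = "
  cat eff count segment segment2
1   1   0   123   plane  plane_0
2   2  25    12   plane  plane_25
3   3  50    54   plane  plane_50
4   4  75    34   plane  plane_75
5   1  50    62   car    car_50
6   2  75    12   car    car_75
7   1  50    11   boat   boat_50
8   2  75    10   boat   boat_75
", header = TRUE)

# Points that share the same cat value are spread out side by side,
# so the car points no longer sit exactly underneath the boat points.
p <- ggplot(df, aes(x = as.numeric(cat), y = eff)) +
  geom_point(aes(color = segment), size = 3.5,
             position = position_dodge(width = 0.3)) +
  scale_color_manual(values = c("boat" = "blue", "car" = "green", "plane" = "red"))
p
```

The trade-off is that the x positions are no longer exact, so this suits a plot where cat is read as a category rather than as a precise number.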
Given some data like:
my.data <- data.frame(time = rep(1:3, 2),
means = 2:7,
lowerCI = 1:6,
upperCI = 3:8,
scenario = rep(c("A","Z"), each=3))
my.data
# time means lowerCI upperCI scenario
# 1 1 2 1 3 A
# 2 2 3 2 4 A
# 3 3 4 3 5 A
# 4 1 5 4 6 Z
# 5 2 6 5 7 Z
# 6 3 7 6 8 Z
I need to make a plot like the one below, but a label for the dotted (confidence) lines should also appear in the legend. The order matters; it should be something like Z, A, CI-Z, CI-A (see below).
This is the corresponding code:
ggplot(data = my.data) +
# add the average lines
geom_line(aes(x=time, y=means, color=scenario)) +
# add "confidence" lines
geom_line(aes(x=time, y=lowerCI, color=scenario), linetype="dotted") +
geom_line(aes(x=time, y=upperCI, color=scenario), linetype="dotted") +
# set color manually
scale_color_manual(name = 'Scenario',
breaks = c("Z", "A"),
values = c("Z" = "red",
"A" = "blue"))
Below is my attempt after checking two similar SO questions. I get close enough, but I don't want the "CI" labels to appear in a separate legend.
ggplot(data = my.data) +
# add the average lines
geom_line(aes(x=time, y=means, color=scenario)) +
# add "confidence" lines
geom_line(aes(x=time, y=lowerCI, color=scenario, linetype="CI")) +
geom_line(aes(x=time, y=upperCI, color=scenario, linetype="CI")) +
# set color manually
scale_color_manual(name = 'Scenario',
breaks = c("Z", "A"),
values = c("Z" = "red",
"A" = "blue")) +
# set line type manually
scale_linetype_manual(name = 'Scenario',
breaks = c("Z", "A", "CI"),
values = c("Z" = "solid",
"A" = "solid",
"CI" = "dotted"))
I also tried something using geom_ribbon, but I could not find a clear way to make it display only the edge lines and add them as desired in the legend. All in all, I don't need to display bands, but lines.
I'm sure there is an obvious way, but for now I'm stuck here...
We can use guide_legend to specify dashed linetypes for the CI's. I think this is close to what you want:
ggplot(my.data, aes(x = time, y = means))+
geom_line(aes(colour = scenario))+
geom_line(aes(y = lowerCI, colour = paste(scenario, 'CI')),
linetype = 'dashed')+
geom_line(aes(y = upperCI, colour = paste(scenario, 'CI')),
linetype = 'dashed')+
scale_colour_manual(values = c('A' = 'red','Z' = 'blue',
'A CI' = 'red','Z CI' = 'blue'),
breaks = c('Z', 'Z CI', 'A', 'A CI'))+
guides(colour = guide_legend(override.aes = list(linetype = rep(c('solid', 'dashed'), 2))))+
ggtitle('Dashed lines represent X% CI')
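If the legend should follow the exact order from the question (Z, A, CI-Z, CI-A) with dotted confidence lines and Z in red, the same approach can be adapted as in the sketch below. The "CI-" prefix for the extra colour levels is my own naming; the linetypes in `override.aes` are listed in the same order as `breaks`.

```r
library(ggplot2)

my.data <- data.frame(time     = rep(1:3, 2),
                      means    = 2:7,
                      lowerCI  = 1:6,
                      upperCI  = 3:8,
                      scenario = rep(c("A", "Z"), each = 3))

p <- ggplot(my.data, aes(x = time)) +
  geom_line(aes(y = means, colour = scenario)) +
  # the CI lines get their own colour levels so they receive legend entries
  geom_line(aes(y = lowerCI, colour = paste0("CI-", scenario)), linetype = "dotted") +
  geom_line(aes(y = upperCI, colour = paste0("CI-", scenario)), linetype = "dotted") +
  scale_colour_manual(name = "Scenario",
                      values = c("Z" = "red", "A" = "blue",
                                 "CI-Z" = "red", "CI-A" = "blue"),
                      breaks = c("Z", "A", "CI-Z", "CI-A")) +
  # one linetype per legend key, in the order given by breaks
  guides(colour = guide_legend(
    override.aes = list(linetype = c("solid", "solid", "dotted", "dotted"))))
p
```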