Changing ggplot2 legend title without altering graphical parameters - r

I have found many topics about the legend title with ggplot2 but after a couple of hours I have not been able to handle my situation.
Here is the dataset:
> dat
FACTOR1 FACTOR2 lsmean lower.CL upper.CL
1 A aa 26.2 25.6 26.8
2 B aa 24.8 23.9 25.7
3 A bb 26.0 25.2 26.7
4 B bb 24.9 23.9 25.9
5 A cc 24.4 23.9 24.8
6 B cc 23.9 22.9 25.0
7 A dd 24.9 24.3 25.6
8 B dd 23.2 22.3 24.0
And the graphic of interest:
gp0 <- ggplot(dat, aes(x=FACTOR2, y=lsmean, group=FACTOR1, colour=FACTOR1))
( gp1 <- gp0 + geom_line(aes(linetype=FACTOR1), size=.6) +
geom_point(aes(shape=FACTOR1), size=3) +
geom_errorbar(aes(ymax=upper.CL, ymin=lower.CL), width=.1) )
If I use scale_colour_manual() to change the legend title then I get an unexpected additional legend:
gp1 + scale_colour_manual("NEW TITLE",values=c("red","blue"))
I can suppress this additional legend with scale_<aes>_manual(guide = "none", values = ...), but then I don't understand how to keep control of the graphical parameters (the style of the points and lines):
gp1 + scale_colour_manual("NEW TITLE",values=c("red","blue")) +
scale_shape_manual(guide = 'none', values=c(1,2)) +
scale_linetype_manual(guide = 'none', values=c(1,3))
How can I reproduce the first plot with a new legend title and no other change?

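The extra legend appears because colour, linetype and shape are all mapped to FACTOR1, and ggplot2 only merges their legends when the titles match. Set the same title for every aesthetic you have mapped, for example with labs():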
gp1 + scale_colour_manual(values=c("red","blue"))+
labs(colour="NEW TITLE",linetype="NEW TITLE",shape="NEW TITLE")
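For anyone who wants to try this without the original session, here is a self-contained sketch that rebuilds the question's data and applies the labs() approach. The object name p is only illustrative, and the line size is left at its default.

```r
library(ggplot2)

# Data copied from the question
dat <- data.frame(
  FACTOR1  = rep(c("A", "B"), times = 4),
  FACTOR2  = rep(c("aa", "bb", "cc", "dd"), each = 2),
  lsmean   = c(26.2, 24.8, 26.0, 24.9, 24.4, 23.9, 24.9, 23.2),
  lower.CL = c(25.6, 23.9, 25.2, 23.9, 23.9, 22.9, 24.3, 22.3),
  upper.CL = c(26.8, 25.7, 26.7, 25.9, 24.8, 25.0, 25.6, 24.0)
)

p <- ggplot(dat, aes(x = FACTOR2, y = lsmean, group = FACTOR1, colour = FACTOR1)) +
  geom_line(aes(linetype = FACTOR1)) +
  geom_point(aes(shape = FACTOR1), size = 3) +
  geom_errorbar(aes(ymin = lower.CL, ymax = upper.CL), width = 0.1) +
  scale_colour_manual(values = c("red", "blue")) +
  # One shared title for every aesthetic mapped to FACTOR1 keeps a single legend
  labs(colour = "NEW TITLE", linetype = "NEW TITLE", shape = "NEW TITLE")

# print(p) draws the plot
```

If one of the three titles differs from the others, the legends should split apart again, which is what happened in the question.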

Related

ggplot boxplot with mean and confidence interval by group

I'd like to make a boxplot with the mean instead of the median. Moreover, I would like the whiskers to stop at the 5% (lower) and 95% (upper) quantiles. Here is the code:
ggplot(data, aes(x=Cement, y=Mean_Gap, fill=Material)) +
geom_boxplot(fatten = NULL,aes(fill=Material), position=position_dodge(.9)) +
xlab("Cement") + ylab("Mean cement layer thickness") +
stat_summary(fun=mean, geom="point", aes(group=Material), position=position_dodge(.9),color="black")
I'd like to change geom to errorbar, but this doesn't work. I tried middle = mean(Mean_Gap), but this doesn't work either. I tried ymin = quantile(y,0.05), but nothing was changing. Can anyone help me?
The standard boxplot using ggplot. fill is Material:
Here is how you can create the boxplot using custom parameters for the box and whiskers. It's the solution shown by @lukeA in stackoverflow.com/a/34529614/6288065, but this one also shows how to draw several boxes by group.
The R built-in data set called "ToothGrowth" is similar to your data structure so I will use that as an example. We will plot the length of tooth growth (len) for each vitamin C supplement group (supp), separated/filled by dosage level (dose).
# "ToothGrowth" at a glance
head(ToothGrowth)
# len supp dose
#1 4.2 VC 0.5
#2 11.5 VC 0.5
#3 7.3 VC 0.5
#4 5.8 VC 0.5
#5 6.4 VC 0.5
#6 10.0 VC 0.5
library(dplyr)
library(ggplot2)
# recreate the data structure with specific "len" coordinates to plot for each group
df <- ToothGrowth %>%
group_by(supp, dose) %>%
summarise(
y0 = quantile(len, 0.05),
y25 = quantile(len, 0.25),
y50 = mean(len),
y75 = quantile(len, 0.75),
y100 = quantile(len, 0.95))
df
## A tibble: 6 x 7
## Groups: supp [2]
# supp dose y0 y25 y50 y75 y100
# <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 OJ 0.5 8.74 9.7 13.2 16.2 19.7
#2 OJ 1 16.8 20.3 22.7 25.6 26.9
#3 OJ 2 22.7 24.6 26.1 27.1 30.2
#4 VC 0.5 4.65 5.95 7.98 10.9 11.4
#5 VC 1 14.0 15.3 16.8 17.3 20.8
#6 VC 2 19.8 23.4 26.1 28.8 33.3
# boxplot using the mean for the middle line and the 5%/95% quantiles for the whiskers
ggplot(df, aes(supp, fill = as.factor(dose))) +
geom_boxplot(
aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),
stat = "identity"
) +
labs(y = "len", title = "Boxplot with Mean Middle Line") +
theme(plot.title = element_text(hjust = 0.5))
In the figure above, the boxplot on the left is the standard boxplot with regular median line and regular min/max whiskers. The boxplot on the right uses the mean middle line and 5%/95% quantile whiskers.
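A possible variation, sketched here and not part of the original answer, is to skip the pre-computed table and let stat_summary() compute the same five values per group. The helper name mean_box is hypothetical; it returns the columns that geom_boxplot expects (ymin, lower, middle, upper, ymax).

```r
library(ggplot2)

# Hypothetical helper: the five values a boxplot needs, with the mean
# in place of the median and the 5%/95% quantiles as whisker ends.
mean_box <- function(x) {
  q <- quantile(x, c(0.05, 0.25, 0.75, 0.95), names = FALSE)
  data.frame(ymin = q[1], lower = q[2], middle = mean(x),
             upper = q[3], ymax = q[4])
}

p <- ggplot(ToothGrowth, aes(supp, len, fill = factor(dose))) +
  # fun.data is applied to len within each supp/dose group
  stat_summary(fun.data = mean_box, geom = "boxplot",
               position = position_dodge(width = 0.9)) +
  labs(y = "len", fill = "dose")

# print(p) draws the plot
```

This keeps the raw data in the plot call, so the same helper can be reused on another data frame without rebuilding the summary table.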

Visualize multiple box plots selecting different rows of a dataframe

I am developing an EDA (Estimation of Distribution Algorithm). I'm collecting all the measures of the Pareto front's solutions under distinct configurations.
I have a structure with all values:
> metrics20
# A tibble: 320 x 6
File Hypervolume `Modified Hypervolume` Spread Spacing Time
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 001-unif-0.csv 25771 26294. 391. 30.1 16.8
2 002-unif-0.csv 27481 28416. 534. 41.1 16.5
3 003-unif-0.csv 26394 26842. 356. 29.6 16.5
4 004-unif-0.csv 30828 31696 418. 38.0 16.5
5 005-unif-0.csv 28146 28727 444. 34.2 16.6
6 006-unif-0.csv 30176 31006 451. 50.1 16.6
7 007-unif-0.csv 29374 30216 537. 35.8 16.5
8 008-unif-0.csv 27434 28156. 439. 31.4 16.5
9 009-unif-0.csv 28944 29426 471. 33.7 16.4
10 010-unif-0.csv 28339 29302. 576. 44.3 16.4
I want to visualize the values in the following way. Taking the Hypervolume column as an example, I split the data by the distribution named in the File column (-unif-, -sat-, -eff- and -prod-) and, within each distribution, show the values for -0.csv, -0.25.csv, -0.5.csv and -0.75.csv along the x axis.
Reproducible example:
library(readr)
metrics20 <- read_csv("./metrics20.csv")
Data: Link
Hopefully this is a step towards what you're looking for:
library(readr)
library(dplyr)
library(ggplot2)
metrics20 <- read_csv("metrics20.csv")
metrics20 %>%
mutate(tag = factor(gsub("(^\\d+-)(\\w+)(-.*$)", "\\2", .$File), levels = c("unif", "sat", "eff", "prod")),
level = gsub("(^\\d+-\\w+-)(.*)(\\.csv$)", "\\2", .$File)) %>%
ggplot(aes(x = level, y = Hypervolume)) +
geom_boxplot() +
facet_wrap(~tag, nrow = 1)+
theme_minimal() +
theme(panel.border = element_rect(colour = "black", fill = NA),
panel.grid = element_blank())
From here there may be other things you want to tweak if you need to adjust it to be more like the example plot. You should be able to find all next steps in the help for the functions used.
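To see what the two gsub() calls extract, here is a small base R sketch. Only the first file name appears in the question's output; the other three are assumed examples that follow the naming pattern described there.

```r
# Illustrative file names; only the first one is taken from the question
files <- c("001-unif-0.csv", "002-sat-0.25.csv",
           "003-eff-0.5.csv", "004-prod-0.75.csv")

# Keep the second capture group: the distribution name between the dashes
tag <- gsub("(^\\d+-)(\\w+)(-.*$)", "\\2", files)

# Keep the second capture group: the value between the last dash and ".csv"
level <- gsub("(^\\d+-\\w+-)(.*)(\\.csv$)", "\\2", files)

data.frame(files, tag, level)
```

Each pattern splits the name into three groups and the replacement "\\2" keeps only the middle one, so tag becomes the facet variable and level the x axis variable.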

How to draw a multiline plot in R

I have a dataframe with 6 features like this:
X1 X2 X3 X4 X5 X6
Modern Dog 9.7 21.0 19.4 7.7 32.0 36.5
Golden Jackal 8.1 16.7 18.3 7.0 30.3 32.9
Chinese Wolf 13.5 27.3 26.8 10.6 41.9 48.1
Indian Wolf 11.5 24.3 24.5 9.3 40.0 44.6
Cuon 10.7 23.5 21.4 8.5 28.8 37.6
Dingo 9.6 22.6 21.1 8.3 34.4 43.1
I want to draw a line plot like this:
I'm trying this:
plot(df$X1, type = "o",col = "red", xlab = "Month", ylab = "Rain fall")
lines(c(df$X2, df$X3, df$X4, df$X5, df$X6), type = "o", col = "blue")
But it's only plotting a single variable. I'm sorry if this question is annoying; I'm totally new to R and I just don't know how to get this done. I would really appreciate any help on this.
Thanks in advance
The easiest way would be to convert your dataset to a long format (e.g. by using the gather function in the tidyr package), and then plotting using the group aesthetic in ggplot.
I recreate your dataset, assuming your group variable is named "Group":
df <- read.table(text = "
Group X1 X2 X3 X4 X5 X6
Modern_Dog 9.7 21.0 19.4 7.7 32.0 36.5
Golden_Jackal 8.1 16.7 18.3 7.0 30.3 32.9
Chinese_Wolf 13.5 27.3 26.8 10.6 41.9 48.1
Indian_Wolf 11.5 24.3 24.5 9.3 40.0 44.6
Cuon 10.7 23.5 21.4 8.5 28.8 37.6
Dingo 9.6 22.6 21.1 8.3 34.4 43.1 ",
header = TRUE, stringsAsFactors = FALSE)
Then convert the dataset to long format and plot:
library(tidyr)
library(ggplot2)
df_long <- df %>% gather(X1:X6, key = "Month", value = "Rainfall")
ggplot(df_long, aes(x = Month, y = Rainfall, group = Group, shape = Group)) +
geom_line() +
geom_point() +
theme(legend.position = "bottom")
See also the answers here: Group data and plot multiple lines.
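If you would rather stay in base graphics, as in the question's attempt, a rough sketch with matplot() could look like this. The matrix name m and the axis labels "Feature" and "Value" are assumptions made for illustration.

```r
df <- read.table(text = "
Group X1 X2 X3 X4 X5 X6
Modern_Dog 9.7 21.0 19.4 7.7 32.0 36.5
Golden_Jackal 8.1 16.7 18.3 7.0 30.3 32.9
Chinese_Wolf 13.5 27.3 26.8 10.6 41.9 48.1
Indian_Wolf 11.5 24.3 24.5 9.3 40.0 44.6
Cuon 10.7 23.5 21.4 8.5 28.8 37.6
Dingo 9.6 22.6 21.1 8.3 34.4 43.1 ",
header = TRUE, stringsAsFactors = FALSE)

# matplot() draws one line per column, so transpose to get one column per group
m <- t(as.matrix(df[, -1]))
colnames(m) <- df$Group

matplot(m, type = "o", pch = 1, lty = 1, col = seq_len(ncol(m)),
        xaxt = "n", xlab = "Feature", ylab = "Value")
axis(1, at = seq_len(nrow(m)), labels = rownames(m))  # X1 ... X6 on the x axis
legend("topleft", legend = colnames(m), col = seq_len(ncol(m)),
       lty = 1, pch = 1)
```

The original attempt failed because lines() was given all the remaining columns concatenated into one long vector; matplot() handles each column as its own series.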

How to make the speed profile of a moving object?

I am an R beginner user and I face the following problem. I have the following data frame:
distance speed
1 61.0 36.4
2 51.4 35.3
3 42.2 34.2
4 33.4 32.8
5 24.9 31.3
6 17.5 28.4
7 11.5 24.1
8 7.1 19.4
9 3.3 16.9
10 0.5 15.5
11 4.4 15.1
12 8.5 15.5
13 13.1 17.3
14 18.8 20.5
15 25.7 24.1
16 33.3 26.3
17 41.0 27.0
18 48.7 27.7
19 56.6 28.4
20 64.8 29.2
21 73.6 31.7
22 83.3 34.2
23 93.4 35.3
The column distance represents the distance of a moving object from a specific point, and the column speed is the object's speed. As you can see, the object gets closer to the point and then moves away from it. I am trying to make its speed profile. I tried the following code, but it didn't give me the plot I want (I want to show how the speed changes as the object approaches and then passes the reference point):
ggplot(speedprofile, aes(x = distance, y = speed)) + #speedprofile is the data frame
geom_line(color = "red") +
geom_smooth() +
geom_vline(xintercept = 0) # the vline is the reference line
The plot is the following:
Then, I tried to set the first 10 distances as negative manually which are prior to zero (0). So I get a plot closer to that I want:
But there is a problem. The distance can't be defined as negative.
To sum up, the expected plot is the following (and I am sorry for the quality).
Do you have any ideas on how to solve this?
Thank you in advance!
You can do something like this to auto-compute the change point (to know when the distance should be negative) and then set the axis labels to be positive.
Your data (in case anyone needs it to answer):
read.table(text="distance speed
61.0 36.4
51.4 35.3
42.2 34.2
33.4 32.8
24.9 31.3
17.5 28.4
11.5 24.1
7.1 19.4
3.3 16.9
0.5 15.5
4.4 15.1
8.5 15.5
13.1 17.3
18.8 20.5
25.7 24.1
33.3 26.3
41.0 27.0
48.7 27.7
56.6 28.4
64.8 29.2
73.6 31.7
83.3 34.2
93.4 35.3", stringsAsFactors=FALSE, header=TRUE) -> speed_profile
Now, compute the "real" distance (negative for approaching, positive for receding):
speed_profile$real_distance <- c(-1, sign(diff(speed_profile$distance))) * speed_profile$distance
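To make the sign trick easier to follow, here is the same computation on a toy vector (the values are made up). It assumes that no two consecutive distances are equal, since sign() would then return 0 and zero out that point.

```r
distance <- c(5, 3, 1, 2, 4)        # toy values: approach, then recede

# -1 while the distance shrinks, +1 while it grows
step_sign <- sign(diff(distance))

# diff() returns one element fewer, so prepend -1 for the first (approaching) point
real_distance <- c(-1, step_sign) * distance
real_distance
# [1] -5 -3 -1  2  4
```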
Now, compute the X axis breaks ahead of time:
breaks <- scales::pretty_breaks(10)(range(speed_profile$real_distance))
ggplot(speed_profile, aes(real_distance, speed)) +
geom_smooth(linetype = "dashed") +
geom_line(color = "#cb181d", size = 1) +
scale_x_continuous(
name = "distance",
breaks = breaks,
labels = abs(breaks) # make all the labels for the axis positive
)
Provided fonts are working well on your system you could even do:
labels <- abs(breaks)
labels[breaks != 0] <- sprintf("%s\n→", labels[breaks != 0])
ggplot(speed_profile, aes(real_distance, speed)) +
geom_smooth(linetype = "dashed") +
geom_line(color = "#cb181d", size = 1) +
scale_x_continuous(
name = "distance",
breaks = breaks,
labels = labels
)

Alternate geom_text position with hjust

I'm plotting a stacked bar graph and use geom_text to insert the value of each stack. The difficulty I'm facing is that some stacks are very small/narrow, so that the text of two stacks overlap each other and hence is not very readable. I would like to adjust the text positioning in a way that for example the text position alternates between hjust == 1 and hjust == -1 for each stack, so that there will be no overlaps (or any other method that will result in readable text).
Here's an example of what I'm currently doing (a dput of mydf is provided below):
library(ggplot2)
ggplot(mydf, aes(x=variable, y = value, fill = Category)) +
geom_bar(stat="identity") +
geom_text(aes(label = value, y = pos-(value/2)), size = 3)
What I tried so far is:
Using position = position_dodge(width = 0.5) and position = position_jitter(h =0.5, w = 0.5) but none resulted in what I was trying to do.
My first thought was to define hjust = c(1,-1) hoping that it would be recycled and texts would alternate between hjust == 1 and hjust == -1 but it results in the error message:
Error: Incompatible lengths for set aesthetics: size, hjust
I also tried defining size = c(3,3,3,3,3,3,3,3,3), hjust = c(1,-1,1,-1,1,-1,1,-1,1) but this results in the same error message.
I would appreciate some advice on how to achieve this the right way (and I'm open to other suggestions as well).
I couldn't figure out why the dput didn't work (it didn't work for me either), so here's the data in a readable format:
Category variable value pos maxpos
1 AX WW 47.8 47.8 184.1
2 AY WW 5.6 53.4 184.1
3 AZ WW 15.8 69.2 184.1
4 BX WW 31.4 100.6 184.1
5 BY WW 11.7 112.3 184.1
6 BZ WW 10.7 123.0 184.1
7 CX WW 2.2 125.2 184.1
8 CY WW 21.4 146.6 184.1
9 CZ WW 37.5 184.1 184.1
10 AX SM 39.8 39.8 148.6
11 AY SM 2.9 42.7 148.6
12 AZ SM 13.2 55.9 148.6
13 BX SM 22.7 78.6 148.6
14 BY SM 7.3 85.9 148.6
15 BZ SM 8.9 94.8 148.6
16 CX SM 1.6 96.4 148.6
17 CY SM 17.3 113.7 148.6
18 CZ SM 34.9 148.6 148.6
19 AX AsIs 156.9 156.9 519.0
20 AY AsIs 13.1 170.0 519.0
21 AZ AsIs 70.5 240.5 519.0
22 BX AsIs 72.6 313.1 519.0
23 BY AsIs 30.7 343.8 519.0
24 BZ AsIs 35.6 379.4 519.0
25 CX AsIs 5.2 384.6 519.0
26 CY AsIs 44.8 429.4 519.0
27 CZ AsIs 89.6 519.0 519.0
By creating a hjust variable, you can achieve the desired result. The code:
mydf$hj <- rep(c(1,0,-1), length.out=27)
ggplot(mydf, aes(x=variable, y=value, fill=Category)) +
geom_bar(stat="identity") +
geom_text(aes(label=value, y=pos-(value/2), hjust=hj), size=4)
which gives:
A slightly different solution proposed by @konvas:
ggplot(mydf, aes(x=variable, y=value, fill=Category)) +
geom_bar(stat="identity") +
geom_text(aes(label=value, y=pos-(value/2), hjust=rep(c(1,0,-1), length.out=length(value))), size=4)
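Both versions rely on rep() with length.out to stretch the short pattern to one value per row. The error in the question says that a set aesthetic (one given outside aes()) had an incompatible length, and a length-2 vector does not match the 27 rows. A quick base R check of the recycling, where n is simply the row count of mydf:

```r
n <- 27                                  # number of rows in mydf

# Repeat the pattern 1, 0, -1 until exactly n values are produced
hj <- rep(c(1, 0, -1), length.out = n)

length(hj)
# [1] 27
head(hj, 6)
# [1]  1  0 -1  1  0 -1
```

Because each variable has nine categories and the pattern has three values, the cycle restarts cleanly at the bottom of every bar.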
