I would like certain points I have created with ggplot to be labelled at the side of the graph, but I am not able to do that with my current code.
Ceplane1 is a matrix with two columns and 100 rows (it can hold any random numbers). I want to plot column 2 on the x-axis and column 1 on the y-axis, and I have done this part using the code below. Now I want to change the code so that the labels sit at the side of the graph and not in the plot area itself. Additionally, I want the axis labels in comma format. You can take results.table[1,1] and results.table[3,1] to be any numbers when suggesting a solution.
ggplot(Ceplane1, aes(x = Ceplane1[,2], y = Ceplane1[,1])) +
geom_point(colour="blue")+geom_abline(slope = -results.table[5,1],intercept = 0,colour="darkred",size=1.25)+
geom_point(aes(mean(Ceplane1[,2]),mean(Ceplane1[,1])),colour="red")+
geom_point(aes(results.table[1,1],results.table[3,1],colour="darkred"))+ggtitle("CE-Plane: Drug A vs Drug P")+
xlab("QALY Difference")+ylab("Cost Difference")+xlim(-0.05,0.05)+ylim(-6000,6000)+
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(),plot.background = element_rect(fill = "white", colour = "black", size = 0.5))+
geom_vline(xintercept = 0,colour="black")+geom_hline(yintercept = 0,colour="black")+
geom_label(aes(mean(Ceplane1[,2]),mean(Ceplane1[,1])),label="mean")+
geom_label(aes(results.table[1,1],results.table[3,1]),label="Base ICER")
I want to put the labels at the side of the graph and not on the points themselves. Please suggest a way to do that.
I think the best way is to add the mean and Base ICER points to your dataset. Then add a column for the legend and you will see them show up as matching in the chart and the legend:
library(ggplot2)
set.seed(1)
Ceplane1 <- data.frame(y = rnorm(100),
x = rnorm(100))
results.table <- data.frame(z = rnorm(100))
Ceplane1$Legend <- "Data"
meanPoint <- data.frame(y = mean(Ceplane1[,1]), x = mean(Ceplane1[,2]), Legend = "Mean")
basePoint <- data.frame(y = results.table[3,1], x = results.table[1,1], Legend = "Base ICER")
Ceplane1 <- rbind(Ceplane1, meanPoint)
Ceplane1 <- rbind(Ceplane1, basePoint)
ggplot(Ceplane1, aes(x = x, y = y, color = Legend)) +
geom_point() +
geom_abline(slope = -results.table[5,1],intercept = 0,colour="darkred",size=1.25) +
ggtitle("CE-Plane: Drug A vs Drug P")+ xlab("QALY Difference")+ylab("Cost Difference") +
xlim(-3,3) + ylim(-3,3) +
theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
panel.background = element_blank(),plot.background = element_rect(fill = "white", colour = "black", size = 0.5)) +
geom_vline(xintercept = 0,colour="black") +
geom_hline(yintercept = 0,colour="black")
This gives me the following:
Note that I changed the xlim and ylim to match the random data I created.
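The question also asked for comma-formatted axis labels, which the code above does not cover. A minimal sketch of one way to do it, assuming the scales package is installed; the data frame ce and its columns qaly and cost are hypothetical stand-ins for Ceplane1:

```r
library(ggplot2)
library(scales)  # assumption: installed; provides label_comma()

set.seed(1)
# Hypothetical stand-in for Ceplane1: QALY differences and cost differences
ce <- data.frame(qaly = rnorm(100, sd = 0.02),
                 cost = rnorm(100, sd = 2000))

# label_comma() returns a function that inserts thousands separators
comma_fmt <- label_comma()

p <- ggplot(ce, aes(x = qaly, y = cost)) +
  geom_point(colour = "blue") +
  # set the limits here instead of ylim(), so only one y scale is defined
  scale_y_continuous(labels = comma_fmt, limits = c(-6000, 6000)) +
  labs(x = "QALY Difference", y = "Cost Difference")

print(p)
```

Setting the limits inside scale_y_continuous() rather than adding ylim() avoids defining the y scale twice, since a second y scale replaces the first.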
I've been stuck on an issue and can't find a solution. I've tried many suggestions on Stack Overflow and elsewhere about manually ordering a stacked bar chart, since that should be a pretty simple fix, but those suggestions don't work with the huge complicated mess of code I plucked from many places. My only issue is y-axis item ordering.
I'm making a series of stacked bar charts, and ggplot2 changes the ordering of the items on the y-axis depending on which dataframe I am trying to plot. I'm trying to make 39 of these plots and want them to all have the same ordering. I think ggplot2 only wants to plot them in ascending order of their numeric mean or something, but I'd like all of the bar charts to first display the group "Bird Advocates" and then "Cat Advocates." (This is also the order they appear in my data frame, but that ordering is lost at the coord_flip() point in plotting.)
I think that taking the data frame through so many changes is why I can't just add something simple at the end or use the reorder() function. Adding things into aes() also doesn't work, since the stacked bar chart I'm creating seems to depend on those items being exactly a certain way.
Here's one of my data frames where ggplot2 is ordering my y-axis items incorrectly, plotting "Cat Advocates" before "Bird Advocates":
Group,Strongly Opposed,Opposed,Slightly Opposed,Neutral,Slightly Support,Support,Strongly Support
Bird Advocates,0.005473026,0.010946052,0.012509773,0.058639562,0.071149335,0.31118061,0.530101642
Cat Advocates,0.04491726,0.07013396,0.03624901,0.23719464,0.09141056,0.23404255,0.28605201
And here's all the code that takes that and turns it into a plot:
library(ggplot2)
library(reshape2)
library(plotly)
#Importing data from a .csv file
data <- read.csv("data.csv", header=TRUE)
data$s.Strongly.Opposed <- 0-data$Strongly.Opposed-data$Opposed-data$Slightly.Opposed-.5*data$Neutral
data$s.Opposed <- 0-data$Opposed-data$Slightly.Opposed-.5*data$Neutral
data$s.Slightly.Opposed <- 0-data$Slightly.Opposed-.5*data$Neutral
data$s.Neutral <- 0-.5*data$Neutral
data$s.Slightly.Support <- 0+.5*data$Neutral
data$s.Support <- 0+data$Slightly.Support+.5*data$Neutral
data$s.Strongly.Support <- 0+data$Support+data$Slightly.Support+.5*data$Neutral
#to percents
data[,2:15]<-data[,2:15]*100
#melting
mdfr <- melt(data, id=c("Group"))
mdfr<-cbind(mdfr[1:14,],mdfr[15:28,3])
colnames(mdfr)<-c("Group","variable","value","start")
#remove dot in level names
mylevels<-c("Strongly Opposed","Opposed","Slightly Opposed","Neutral","Slightly Support","Support","Strongly Support")
mdfr$variable<-droplevels(mdfr$variable)
levels(mdfr$variable)<-mylevels
pal<-c("#bd7523", "#e9aa61", "#f6d1a7", "#999999", "#c8cbc0", "#65806d", "#334e3b")
ggplot(data=mdfr) +
geom_segment(aes(x = Group, y = start, xend = Group, yend = start+value, colour = variable,
text=paste("Group: ",Group,"<br>Percent: ",value,"%")), size = 5) +
geom_hline(yintercept = 0, color =c("#646464")) +
coord_flip() +
theme(legend.position="top") +
theme(legend.key.width=unit(0.5,"cm")) +
guides(col = guide_legend(ncol = 12)) + #has 7 real columns, using to adjust legend position
scale_color_manual("Response", labels = mylevels, values = pal, guide="legend") +
theme(legend.title = element_blank()) +
theme(axis.title.x = element_blank()) +
theme(axis.title.y = element_blank()) +
theme(axis.ticks = element_blank()) +
theme(axis.text.x = element_blank()) +
theme(legend.key = element_rect(fill = "white")) +
scale_y_continuous(breaks=seq(-100,100,100), limits=c(-100,100)) +
theme(panel.background = element_rect(fill = "#ffffff"),
panel.grid.major = element_line(colour = "#CBCBCB"))
The plot:
I think this works, you may need to play around with the axis limits/breaks:
library(dplyr)
mdfr <- mdfr %>%
mutate(group_n = as.integer(case_when(Group == "Bird Advocates" ~ 2,
Group == "Cat Advocates" ~ 1)))
ggplot(data=mdfr) +
geom_segment(aes(x = group_n, y = start, xend = group_n, yend = start + value, colour = variable,
text=paste("Group: ",Group,"<br>Percent: ",value,"%")), size = 5) +
scale_x_continuous(limits = c(0,3), breaks = c(1, 2), labels = c("Cat", "Bird")) +
geom_hline(yintercept = 0, color =c("#646464")) +
theme(legend.position="top") +
theme(legend.key.width=unit(0.5,"cm")) +
coord_flip() +
guides(col = guide_legend(ncol = 12)) + #has 7 real columns, using to adjust legend position
scale_color_manual("Response", labels = mylevels, values = pal, guide="legend") +
theme(legend.title = element_blank()) +
theme(axis.title.x = element_blank()) +
theme(axis.title.y = element_blank()) +
theme(axis.ticks = element_blank()) +
theme(axis.text.x = element_blank()) +
theme(legend.key = element_rect(fill = "white"))+
scale_y_continuous(breaks=seq(-100,100,100), limits=c(-100,100)) +
theme(panel.background = element_rect(fill = "#ffffff"),
panel.grid.major = element_line(colour = "#CBCBCB"))
produces this plot:
You want to factor the 'Group' variable in the order by which you want the bars to appear.
mdfr$Group <- factor(mdfr$Group, levels = c("Bird Advocates", "Cat Advocates"))
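A small sketch of how the factor levels interact with coord_flip(), using a hypothetical two-row data frame mini in place of mdfr. On a discrete axis the first level is drawn first, which after coord_flip() means at the bottom, so if "Bird Advocates" should appear on top it may need to be listed last:

```r
library(ggplot2)

# Hypothetical miniature of mdfr with made-up values
mini <- data.frame(Group = c("Bird Advocates", "Cat Advocates"),
                   value = c(53, 29))

# After coord_flip() the first level sits at the bottom of the axis,
# so the group that should be on top goes last in `levels`.
mini$Group <- factor(mini$Group,
                     levels = c("Cat Advocates", "Bird Advocates"))

p <- ggplot(mini, aes(x = Group, y = value)) +
  geom_col() +
  coord_flip()

print(p)
```

Applying the same factor() call to every data frame before plotting should keep the order consistent across all plots, as long as the group names are spelled identically in each.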
I plot a 2 geom_point graph with the following code:
source("http://www.openintro.org/stat/data/arbuthnot.R")
library(ggplot2)
ggplot() +
geom_point(aes(x = year,y = boys),data=arbuthnot,colour = '#3399ff') +
geom_point(aes(x = year,y = girls),data=arbuthnot,shape = 17,colour = '#ff00ff') +
xlab(label = 'Year') +
ylab(label = 'Rate')
I simply want to know how to add a legend on the right side, with the same shapes and colors. The pink triangle should have the legend "women" and the blue circle the legend "men". It seems quite simple, but after many trials I could not do it. (I'm a beginner with ggplot.)
If you rename the columns of the original data frame and then melt it into long format with reshape2::melt, it's much easier to handle in ggplot2. By specifying the color and shape aesthetics in the ggplot command, and specifying the scales for the colors and shapes manually, the legend will appear.
source("http://www.openintro.org/stat/data/arbuthnot.R")
library(ggplot2)
library(reshape2)
names(arbuthnot) <- c("Year", "Men", "Women")
arbuthnot.melt <- melt(arbuthnot, id.vars = 'Year', variable.name = 'Sex',
value.name = 'Rate')
ggplot(arbuthnot.melt, aes(x = Year, y = Rate, shape = Sex, color = Sex))+
geom_point() + scale_color_manual(values = c("Women" = '#ff00ff','Men' = '#3399ff')) +
scale_shape_manual(values = c('Women' = 17, 'Men' = 16))
This is the trick that I usually use. Add colour argument to the aes and use it as an indicator for the label names.
ggplot() +
geom_point(aes(x = year,y = boys, colour = 'Boys'),data=arbuthnot) +
geom_point(aes(x = year,y = girls, colour = 'Girls'),data=arbuthnot,shape = 17) +
xlab(label = 'Year') +
ylab(label = 'Rate')
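With this trick the colours come from ggplot2's default palette and the shape is not part of the legend. A hedged sketch of how the requested colours and shapes could be restored by mapping both aesthetics to the same strings and adding manual scales; the data frame toy and its values are made up for illustration:

```r
library(ggplot2)

# Made-up stand-in for the arbuthnot data
toy <- data.frame(year  = c(1701, 1702, 1703),
                  boys  = c(50, 60, 70),
                  girls = c(40, 55, 65))

p <- ggplot(toy, aes(x = year)) +
  # the strings inside aes() become the legend keys
  geom_point(aes(y = boys,  colour = "Boys",  shape = "Boys")) +
  geom_point(aes(y = girls, colour = "Girls", shape = "Girls")) +
  # same name on both scales so the two legends merge into one
  scale_colour_manual(name = "Sex",
                      values = c(Boys = "#3399ff", Girls = "#ff00ff")) +
  scale_shape_manual(name = "Sex",
                     values = c(Boys = 16, Girls = 17)) +
  labs(x = "Year", y = "Rate")

print(p)
```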
Here is a way of doing this without using reshape2::melt. reshape2::melt works, but you can get into a bind if you want to add other things to the graph, such as line segments. The code below uses the original organization of the data. The key to modifying the legend is to make sure the name and labels arguments to scale_color_manual(...) and scale_shape_manual(...) are identical; otherwise you will get two legends.
source("http://www.openintro.org/stat/data/arbuthnot.R")
library(ggplot2)
library(reshape2)
ptheme <- theme (
axis.text = element_text(size = 9), # tick labels
axis.title = element_text(size = 9), # axis labels
axis.ticks = element_line(colour = "grey70", size = 0.25),
panel.background = element_rect(fill = "white", colour = NA),
panel.border = element_rect(fill = NA, colour = "grey70", size = 0.25),
panel.grid.major = element_line(colour = "grey85", size = 0.25),
panel.grid.minor = element_line(colour = "grey93", size = 0.125),
panel.spacing = unit(0 , "lines"), # panel.margin is deprecated in newer ggplot2
legend.justification = c(1, 0),
legend.position = c(1, 0.1),
legend.text = element_text(size = 8),
plot.margin = unit(c(0.1, 0.1, 0.1, 0.01), "npc") # c(top, right, bottom, left), values can be negative
)
cols <- c( "c1" = "#3399ff", "c2" = "#ff00ff" ) # c1 = boys (blue), c2 = girls (pink), as asked in the question
shapes <- c("s1" = 16, "s2" = 17)
p1 <- ggplot(data = arbuthnot, aes(x = year))
p1 <- p1 + geom_point(aes( y = boys, color = "c1", shape = "s1"))
p1 <- p1 + geom_point(aes( y = girls, color = "c2", shape = "s2"))
p1 <- p1 + labs( x = "Year", y = "Rate" )
p1 <- p1 + scale_color_manual(name = "Sex",
breaks = c("c1", "c2"),
values = cols,
labels = c("boys", "girls"))
p1 <- p1 + scale_shape_manual(name = "Sex",
breaks = c("s1", "s2"),
values = shapes,
labels = c("boys", "girls"))
p1 <- p1 + ptheme
print(p1)
output results
Here is an answer based on the tidyverse package, where one can use the pipe, %>%, to chain functions together. This creates the plot in one continuous chain, without the need to create temporary variables. More on the pipe can be found in this post: What does %>% function mean in R?
As far as I know, legends in ggplot2 are only based on aesthetic variables. So to add a discrete legend one uses a category column and changes the aesthetics according to the category. In ggplot this is done, for example, by aes(color=category).
So to add two (or more) different variables of a data frame to the legend, one needs to transform the data frame so that there is a category column telling us which column (variable) is being plotted, and a second column that actually holds the value. The tidyr::gather function, which is also loaded by tidyverse, does exactly that.
Then one creates the legend by just specifying which aesthetics variables need to be different. In this example the code would look as follows:
source("http://www.openintro.org/stat/data/arbuthnot.R")
library(tidyverse)
arbuthnot %>%
rename(Year=year,Men=boys,Women=girls) %>%
gather(Men,Women,key = "Sex",value = "Rate") %>%
ggplot() +
geom_point(aes(x = Year, y=Rate, color=Sex, shape=Sex)) +
scale_color_manual(values = c("Men" = "#3399ff","Women"= "#ff00ff")) +
scale_shape_manual(values = c("Men" = 16, "Women" = 17))
Notice that the tidyverse package also automatically loads the ggplot2 package. An overview of the packages it installs can be found on its website, tidyverse.org.
In the code above I also used the function dplyr::rename (also loaded by tidyverse) to first rename the columns to the wanted labels, since the legend automatically takes its labels from the category names.
There is a second way to rename the legend labels, which involves specifying them explicitly in the scale_<aesthetic>_manual functions via the labels = argument. For examples see the legends cookbook. However, this is not recommended since it gets messy quickly with more variables.
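For completeness, a minimal sketch of that second approach, where the data keep their original category names and the legend text is set through labels =. The data frame toy_long and its values are made up for illustration:

```r
library(ggplot2)

# Made-up long-format data: one row per year and category
toy_long <- data.frame(
  year = rep(c(1701, 1702, 1703), times = 2),
  sex  = rep(c("boys", "girls"), each = 3),
  rate = c(50, 60, 70, 40, 55, 65)
)

p <- ggplot(toy_long, aes(x = year, y = rate, colour = sex, shape = sex)) +
  geom_point() +
  # breaks name the categories in the data; labels give the legend text.
  # Both scales need the same name and labels to share one legend.
  scale_colour_manual(name = "Sex",
                      breaks = c("boys", "girls"),
                      values = c(boys = "#3399ff", girls = "#ff00ff"),
                      labels = c("Men", "Women")) +
  scale_shape_manual(name = "Sex",
                     breaks = c("boys", "girls"),
                     values = c(boys = 16, girls = 17),
                     labels = c("Men", "Women"))

print(p)
```

The repetition of breaks and labels in every scale is what makes this approach grow unwieldy as more variables are added.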
I'm drawing a slope graph with ggplot, but the labels get clustered together and are not shown properly because of the scale of the two axes.
Any ideas?
My code and the graph are below. Is there any way to adjust the step of the scale?
Thanks a lot!
#Read file as numeric data
betterlife<-read.csv("betterlife.csv",skip=4,stringsAsFactors = F)
num_data <- data.frame(data.matrix(betterlife))
numeric_columns <- sapply(num_data,function(x){mean(as.numeric(is.na(x)))<0.5})
final_data <- data.frame(num_data[,numeric_columns],
betterlife[,!numeric_columns])
## rescale selected columns data frame
final_data <- data.frame(lapply(final_data[,c(3,4,5,6,7,10,11)], function(x) scale(x, center = FALSE, scale = max(x, na.rm = TRUE)/100)))
## Add country names as indicator
final_data["INDICATOR"] <- NA
final_data$INDICATOR <- betterlife$INDICATOR
employment.data <- final_data[5:30,]
indicator <- employment.data$INDICATOR
## Melt data to draw graph
employment.melt <- melt(employment.data)
#plot
sg = ggplot(employment.melt, aes(factor(variable), value,
group = indicator,
colour = indicator,
label = indicator)) +
theme(legend.position = "none",
axis.text.x = element_text(size=5),
axis.text.y=element_blank(),
axis.title.x=element_blank(),
axis.title.y=element_blank(),
axis.ticks=element_blank(),
axis.line=element_blank(),
panel.grid.major.x = element_line("black", size = 0.1),
panel.grid.major.y = element_blank(),
panel.grid.minor.y = element_blank(),
panel.background = element_blank())
# plot the right-most labels
sg1 = sg + geom_line(size=0.15) +
geom_text(data = subset(employment.melt, variable == "Life.expectancy"),
aes(x = factor(variable), label=sprintf(" %2f %s",value,INDICATOR)), size = 1.75, hjust = 0)
# plot the left-most labels
sg1 = sg1 + geom_text(data = subset(employment.melt, variable == "Employment.rate"),
aes(x = factor(variable), label=sprintf("%s %2f ",INDICATOR,value)), size = 1.75, hjust = 1)
sg1
Have you tried setting up a scale, for example for x (though I think you should do it for y too)?
scale_x_continuous(breaks = seq(0, 100, 5))
where 0 - 100 is the range and 5 is the step size. You need to adjust these values according to your graph.
I'm trying to create a framework for easy plotting of our data sets. The current idea is to initiate a ggplot graph, add layers to it, then display or save it. My code looks like this:
initPlot <- function(title = "", data = NULL){
if(is.null(data)) data <- GLOBDATA
plot <- ggplot(data, aes(jahr))
plot <- plot + scale_x_continuous(breaks = seq(2001, 2012, 1))
textTheme <- element_text(size=6, face="plain", color="black", family="AvantGarde")
lineTheme <- element_line(color="black", size=0)
plot <- plot + theme(
text = textTheme,
axis.text = textTheme,
axis.ticks = lineTheme,
axis.line = lineTheme,
axis.title = element_blank(),
plot.background = element_rect(fill="#f0f0f0"),
strip.background = element_rect(fill="#f0f0f0"),
panel.background = element_rect(fill="#f0f0f0"),
panel.grid = element_blank(),
legend.position = "bottom"
)
plot <- plot + guides(color = guide_legend(title = title))
PLOTGLOB <<- plot
plot
}
plotConfidence <- function(columns, color = "red", title = "", label = "", plot = NULL){
plot <- plotLine(columns, "black", label, plot, 1)
plot <- plot + geom_ribbon(columns, alpha = 0.3, fill = color, linetype=0)
PLOTGLOB <<- plot
plot
}
plotLine <- function(column, color = "black", label = "", plot = NULL, size = 1){
if(is.null(plot)) plot <- PLOTGLOB
plot <- plot + geom_line(column, color = color, size = size)
PLOTGLOB <<- plot
plot
}
I then call my code like this:
initPlot("title")
plotConfidence(
aes(
y = jSOEP_aqne_ip_fgt060_f_alle,
ymin = jSOEP_aqne_ip_lfgt060_f_alle,
ymax = jSOEP_aqne_ip_ufgt060_f_alle, color="Alle", fill="Alle"
),
"red")
plotConfidence(
aes(
y = jSOEP_aqne_ip_fgt060_f_mann,
ymin = jSOEP_aqne_ip_lfgt060_f_mann,
ymax = jSOEP_aqne_ip_ufgt060_f_mann, color="Männer", fill="Männer"
),
"blue", , label="Männer")
Which produces the following graphic:
As you can see, the legend colors don't match up with the corresponding geom_ribbons, in fact, both are of the color "blue" (found that out by setting the alpha to 1 temporarily). How do I fix this?
Here's the data I want to plot:
GLOBDATA <- structure(list(jSOEP_aqne_ip_fgt060_f_alle = c(0.117169998586178,
0.122670002281666, 0.131659999489784, 0.132029995322227, 0.140119999647141,
0.142869994044304, 0.136739999055862, 0.140990003943443, 0.146730005741119,
0.149069994688034, 0.141920000314713, 0.142879992723465), jSOEP_aqne_ip_lfgt060_f_alle = c(0.114249996840954,
0.119199998676777, 0.128110006451607, 0.12814000248909, 0.136230006814003,
0.139119997620583, 0.132400006055832, 0.137409999966621, 0.142560005187988,
0.14478999376297, 0.137840002775192, 0.138579994440079), jSOEP_aqne_ip_ufgt060_f_alle = c(0.120090000331402,
0.126139998435974, 0.135220006108284, 0.135920003056526, 0.143999993801117,
0.146630004048347, 0.141090005636215, 0.144580006599426, 0.15090000629425,
0.153359994292259, 0.146009996533394, 0.147180005908012), jSOEP_aqne_ip_fgt060_f_mann = c(0.100199997425079,
0.106820002198219, 0.117770001292229, 0.117349997162819, 0.126489996910095,
0.130469992756844, 0.12601999938488, 0.127340003848076, 0.132960006594658,
0.135379999876022, 0.132510006427765, 0.13782000541687), jSOEP_aqne_ip_lfgt060_f_mann = c(0.0951400026679039,
0.101929999887943, 0.112829998135567, 0.112510003149509, 0.121720001101494,
0.12372999638319, 0.120829999446869, 0.121650002896786, 0.127389997243881,
0.128470003604889, 0.12533999979496, 0.131980001926422), jSOEP_aqne_ip_ufgt060_f_mann = c(0.105259999632835,
0.111709997057915, 0.122720003128052, 0.122189998626709, 0.131270006299019,
0.137209996581078, 0.131219998002052, 0.133019998669624, 0.138539999723434,
0.142289996147156, 0.139679998159409, 0.143659994006157)), .Names = c("jSOEP_aqne_ip_fgt060_f_alle",
"jSOEP_aqne_ip_lfgt060_f_alle", "jSOEP_aqne_ip_ufgt060_f_alle",
"jSOEP_aqne_ip_fgt060_f_mann", "jSOEP_aqne_ip_lfgt060_f_mann",
"jSOEP_aqne_ip_ufgt060_f_mann"))
Thanks for sharing your data. Unfortunately, as it stands it does not run: GLOBDATA is a list structure and there is no jahr column, amongst some other omissions.
This answer does not attempt to create a general function or amend yours but hopefully does suggest another way to structure your data.
By restructuring your data, you can map variables to colours and this will automatically produce the legend.
library(ggplot2)
# create dataframe from your list
temp <- do.call(cbind.data.frame, GLOBDATA)
# Change data format
# your data is organised in wide format as mean, lower CI, upper CI (I think)
# for both 'alle' and 'mann'. By stacking these after renaming for consistent
# column names, we can then easily map aesthetics in ggplot.
# create a grouping variable (grp) to map aesthetics to.
df1 <- setNames(temp[grepl('alle', names(temp))], c('mn', 'lower', 'upper'))
df1$grp <- 'alle'
df2 <- setNames(temp[grepl('mann', names(temp))], c('mn', 'lower', 'upper'))
df2$grp <- 'mann'
df <- rbind(df1, df2)
# add year
df$year <- 2000 + seq(nrow(temp))
# plot
p <- ggplot(df, aes(x=year, y=mn , ymin=lower, ymax=upper, colour=grp, fill=grp)) +
geom_line(size = 1, colour="black") +
geom_ribbon(alpha = 0.3, linetype=0) +
scale_x_continuous(breaks = seq(2001, 2012, 1)) +
scale_fill_manual(values=c('alle' = 'red', 'mann'='blue'))
p <- p +
theme(
text = element_text(size=6, face="plain", color="black", family="AvantGarde"),
axis.text = element_text(size=6, face="plain", color="black", family="AvantGarde"),
axis.ticks = element_line(color="black", size=0.5),
axis.line = element_line(color="black", size=0.5),
axis.title = element_blank(),
plot.background = element_rect(fill="#f0f0f0"),
strip.background = element_rect(fill="#f0f0f0"),
panel.background = element_rect(fill="#f0f0f0"),
panel.grid = element_blank(),
legend.position = "bottom",
legend.title=element_blank()
)
So by tweaking how your data is organised and your functions a little you should be able to map variables to aesthetics and automatically generate a legend.