I need to draw a plot with two y-axes in R with ggplot2. The first y-axis should go from -10 to 10, and the second from 0 to 10. I have attached an example of the plot I want to replicate. Please let me know if there is a way to do this with ggplot2.
I used the code below, but in the result the first axis runs from -5 to 10 and the second from 5 to 10. I want to get the axis ranges I described above.
df %>%
filter(Country == "Chile" & year >= 1973) %>%
ggplot(aes(x = year)) +
geom_line(aes(y = polity2, colour = "Polity 2")) +
geom_line(aes(y = gee_totGDP, colour = "gee_totGDP")) +
scale_y_continuous(sec.axis = sec_axis(~ . * -1, name = "gee_totGDP")) +
scale_colour_manual(values = c("blue", "red"))
I generated some fake data with four rows based on your example image.
To make the plot, I set the limits for the first axis using the limits argument of scale_y_continuous(). Then I set up the second axis using a transformation formula, like you attempted. The transformation should be axis2 = (axis1 + 10)/2.
library(tidyverse)
df <- tibble(year = seq(1985, 2000, 5),
ed = c(6, 6, 8, 5),
polity = c(-10, -10, -8, -8))
df %>%
ggplot(aes(x = year)) +
geom_line(aes(y = polity)) +
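# ed is measured on the 0-10 scale, so map it onto the primary axis
# with the inverse of the sec_axis() formula: axis1 = axis2 * 2 - 10
geom_line(aes(y = ed * 2 - 10)) +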
scale_y_continuous(limits = c(-10, 10),
sec.axis = sec_axis(~(. + 10)/2))
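To check the transformation, it can help to write the mapping and its inverse as plain functions. The sketch below is only an illustration of the arithmetic; `to_secondary` and `to_primary` are hypothetical helper names, not part of ggplot2.

```r
# Hypothetical helpers (names are illustrative, not part of ggplot2).
# Primary axis runs from -10 to 10, secondary axis from 0 to 10.
to_secondary <- function(y1) (y1 + 10) / 2  # same formula as in sec_axis()
to_primary   <- function(y2) y2 * 2 - 10    # inverse, used to rescale the data

sec_ticks <- to_secondary(c(-10, 0, 10))  # ends and midpoint of the primary axis
pri_ticks <- to_primary(c(0, 5, 10))      # ends and midpoint of the secondary axis
```

A series measured on the secondary scale is drawn at `to_primary()` of its values, so that it lines up with the labels of the secondary axis.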
You can set both axes in a single scale_y_continuous() call (a second scale_y_continuous() call would replace the first one rather than add to it):
ggplot(data = df, aes(x = year)) +
geom_line(aes(y = polity2, color = "Polity 2")) +
# rescale gee_totGDP from the 0-10 range onto the -10 to 10 primary axis
geom_line(aes(y = gee_totGDP * 2 - 10, color = "gee_totGDP")) +
scale_y_continuous(limits = c(-10, 10), name = "Polity 2",
sec.axis = sec_axis(~ (. + 10) / 2, name = "gee_totGDP")) +
scale_color_manual(values = c("blue", "red"))
I've been trying for a while now and also doing a lot of research, but I just can't get it to add a simple legend for my two lines.
I have two lines in my chart and I just want to add a legend for the two lines.
scale_color_manual did not work. I suspect it is because I am using scale_y_break. In other plots (without scale_y_break) scale_color_manual works without problems.
Here is my code:
day <- c(1:5)
altimeter <- c(8.291, 8.872, 7.212, 8.1, 5.92)
slope_kilometers <- c(30.23, 34.8, 29.34, 32.98, 21.23)
df2 <- data.frame(day, altimeter, slope_kilometers)
library(ggbreak)
altimeter_color <- "steelblue"
slope_kilometers_color <- "darkred"
ggplot(df2, aes(x = day)) +
#Altimeter data
geom_line(aes(y = altimeter),
linetype = 2,
linewidth = 1,
color = altimeter_color) +
geom_point(y = altimeter, size = 3, color = altimeter_color) +
#Slope kilometers data
geom_line(aes(y = slope_kilometers),
linetype = 2,
linewidth = 1,
color = slope_kilometers_color) +
geom_point(y = slope_kilometers, size = 3, color = slope_kilometers_color) +
#Y-Axis
scale_y_break( c(9, 20), scales = 1.5) +
#Label
labs(x = "Tage",
y = "[km]") +
#Legend
scale_color_manual(values = c(altimeter_color, slope_kilometers_color)) +
#Title
ggtitle("Höhenmeter und Pistenkilometer meines 5-tägigen Skiurlaubs")
I tried different versions of scale_color_manual, labs, aes(fill="")
Update: I tweaked the former plot (removed):
One way to achieve what you want:
First bring the data into long format, then put color inside aes().
Rule of thumb: whatever is mapped inside aes() gets a legend.
library(tidyverse)
library(ggbreak)
df2 %>%
pivot_longer(-day) %>%
ggplot(aes(x = day)) +
#Altimeter data
# after pivot_longer() the measurements are in the `value` column
geom_line(data = . %>% filter(name == "altimeter"), aes(y = value, color = name),
linetype = 2, linewidth = 1) +
geom_point(data = . %>% filter(name == "altimeter"), aes(y = value, color = name), size = 3) +
#Slope kilometers data
geom_line(data = . %>% filter(name == "slope_kilometers"), aes(y = value, color = name),
linetype = 2, linewidth = 1) +
geom_point(data = . %>% filter(name == "slope_kilometers"), aes(y = value, color = name), size = 3) +
#Y-Axis
scale_y_break( c(9, 20), scales = 1.5) +
#Label
labs(x = "Tage", y = "[km]", color = "") +
#Legend
scale_color_manual(values = c(altimeter_color, slope_kilometers_color)) +
#Title
ggtitle("Höhenmeter und Pistenkilometer meines 5-tägigen Skiurlaubs") +
theme(legend.position = "bottom")
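Once the data are in long format, the per-series filters are not strictly needed, because a single color mapping already splits the lines. The following is a minimal sketch under that assumption; it leaves out scale_y_break() so that it only depends on ggplot2 and tidyr, and `df_long` and `p` are illustrative names.

```r
library(ggplot2)
library(tidyr)

df2 <- data.frame(
  day = 1:5,
  altimeter = c(8.291, 8.872, 7.212, 8.1, 5.92),
  slope_kilometers = c(30.23, 34.8, 29.34, 32.98, 21.23)
)

# Long format: one row per day and series, with columns day, name, value
df_long <- pivot_longer(df2, -day)

# Mapping color inside aes() is what produces the legend
p <- ggplot(df_long, aes(x = day, y = value, color = name)) +
  geom_line(linetype = 2, linewidth = 1) +
  geom_point(size = 3) +
  scale_color_manual(values = c(altimeter = "steelblue",
                                slope_kilometers = "darkred")) +
  labs(x = "Tage", y = "[km]", color = NULL) +
  theme(legend.position = "bottom")
```

Naming the entries of `values` ties each color to a series by name, so the result does not depend on the order of the factor levels.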
I'm using ggplot2 to plot the annual occurrence of events in states. I want the state labels to be in the same order as shown in the data table, "AZ CT NH NM DE...", but ggplot automatically reorders the state labels alphabetically, "AZ CT DE NH...". I created groups so I could display ranges in "num" values (e.g. NM and TN). Please ignore the group numbering; I took out some data points to make the table smaller.
ggplot(guidelines, aes(x = state, y = num, group = grp)) +
geom_point() + geom_line(linetype = "dotted") +
labs(x = "State", y = "Number") +
labs(title = "A") +
scale_y_continuous(breaks = seq(0, 11, 1),
limits=c(0,11))
I have tried the suggestions of previous posts to use factor and levels like so:
guidelines$state <- factor(guidelines$state, levels = unique(guidelines$state))
But it does not work because I am using groups and repeating state names. Any ideas on how to get around this?
We can use ordered
library(dplyr)
library(ggplot2)
guidelines %>%
mutate(state =ordered(state, levels = unique(state))) %>%
ggplot(aes(x = state, y = num, group = grp)) +
geom_point() +
geom_line(linetype = "dotted") +
labs(x = "State", y = "Number") +
labs(title = "A") +
scale_y_continuous(breaks = seq(0, 11, 1),
limits=c(0,11))
Try this. You were close: you do need unique(). Adding ordered = T inside factor() will keep the desired order. Here is the code (next time, please share your data using dput(), as it can be difficult to use data from screenshots if they are really big):
library(ggplot2)
#Data
guidelines <- data.frame(state=c('AZ','CT','NH','NM','NM','DE','NJ','TN','TN'),
num=c(10,10,10,5,10,5,5,2,5),
grp=c(3,4,17,19,19,5,18,25,25),stringsAsFactors = F)
#Format factor
guidelines$state <- factor(guidelines$state,levels = unique(guidelines$state),ordered = T)
#Plot
ggplot(guidelines, aes(x = state, y = num, group = grp)) +
geom_point() + geom_line(linetype = "dotted") +
labs(x = "State", y = "Number") +
labs(title = "A") +
scale_y_continuous(breaks = seq(0, 11, 1),
limits=c(0,11))
Or, as mentioned in the comments by @TTS, you can use scale_x_discrete() with the limits option:
#Data
guidelines <- data.frame(state=c('AZ','CT','NH','NM','NM','DE','NJ','TN','TN'),
num=c(10,10,10,5,10,5,5,2,5),
grp=c(3,4,17,19,19,5,18,25,25),stringsAsFactors = F)
#Plot 2
ggplot(guidelines, aes(x = state, y = num, group = grp)) +
geom_point() + geom_line(linetype = "dotted") +
labs(x = "State", y = "Number") +
labs(title = "A") +
scale_y_continuous(breaks = seq(0, 11, 1),
limits=c(0,11))+
scale_x_discrete(limits=unique(guidelines$state))
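To see why levels = unique(...) still works with repeated state names, the levels can be inspected without plotting anything. This is a small base R sketch; `alphabetical` and `as_in_data` are illustrative variable names.

```r
state <- c("AZ", "CT", "NH", "NM", "NM", "DE", "NJ", "TN", "TN")

# Default: levels are sorted alphabetically
alphabetical <- levels(factor(state))

# unique() drops the repeats and keeps the order of first appearance
as_in_data <- levels(factor(state, levels = unique(state)))
```

ggplot2 places the categories of a discrete axis in the order of the factor levels, which is why the second version keeps the order of the data table.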
I would like to use palette colours for my stacked plot:
p <- ggplot() + theme_bw() +
geom_bar(aes(fill = a, y = b, x= c), data = df, width = 0.7,
position="stack", stat="identity") + theme(legend.position="bottom")
I tried the following, but it didn't work:
p + scale_color_brewer(palette = "PuOr")
Furthermore, I would like to plot a line showing the mean over the bar plot. Maybe somebody has an idea how to do this.
Some thoughts:
1) It is better to use geom_col() than geom_bar() when the bar heights should represent values in the data; see the documentation.
2) I used factor(...) to make continuous variables discrete.
3) Your code will be easier to read if you follow the order of arguments as set out in the documentation, although of course the order does not matter.
4) Updated to reflect the request for a mean for each x value.
library(ggplot2)
library(dplyr)
df <- data.frame(a = c(2001, 2001, 2001, 2002, 2002, 2003),
x = c(6, 7, 8, 6, 7, 6),
y = c(1, 258, 1, 3, 9, 11))
#data frame for means
df_y_mean <-
df %>%
group_by(x) %>%
summarise(y_mean = mean(y))
ggplot() +
geom_col(data = df, aes(x = factor(x), y = y, fill = factor(a)), width = 0.7) +
geom_line(data = df_y_mean, aes(factor(x), y_mean, colour = "red"), group = 1, size = 1) +
scale_fill_brewer(palette = "PuOr", name = "Year") +
guides(colour = guide_legend(title = "Mean", label = FALSE)) +
theme_bw() +
theme(legend.position = "bottom")
Created on 2020-05-20 by the reprex package (v0.3.0)
You are defining fill but using scale_colour_brewer(). Use scale_fill_brewer() to modify fill.
To draw a horizontal line add geom_hline() to your plot call.
p <- ggplot() + theme_bw() +
geom_bar(aes(fill = a, y = b, x= c), data = df, width = 0.7,
position="stack", stat="identity") +
theme(legend.position="bottom")
my.mean <- mean(df$b) ## can be any value, change as needed
p + scale_fill_brewer(palette = "PuOr") + geom_hline(yintercept = my.mean)
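The two answers draw different kinds of mean: one overall value, which suits geom_hline(yintercept = ...), and one value per x category, which suits geom_line(). The sketch below computes both in base R using the sample data from the first answer; `overall_mean` and `mean_by_x` are illustrative names.

```r
# Sample data copied from the first answer
df <- data.frame(a = c(2001, 2001, 2001, 2002, 2002, 2003),
                 x = c(6, 7, 8, 6, 7, 6),
                 y = c(1, 258, 1, 3, 9, 11))

# A single number: suitable as the yintercept of a horizontal line
overall_mean <- mean(df$y)

# One mean per x value: suitable as the data for a line across the bars
mean_by_x <- aggregate(y ~ x, data = df, FUN = mean)
```

Which one is appropriate depends on whether the line should summarise the whole data set or each bar position separately.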
I have a data frame with multiple columns. Here is an example.
my_df <- data.frame(x = 1:5, y = c(50, 22, 15, 33, 49))
colnames(my_df) <- c("ID", "values")
my_df
I am trying to make a scatterplot in which subsets of this data frame are treated as outliers and drawn in different colors from the non-outliers. On top of this, I am also trying to label these outliers with their associated values.
Here is an example attempt:
ggplot(data=my_df, aes(x = seq(1, length(values)), y = my_df$values))+
geom_point(data = subset(my_df, values > 48), aes(color = "blue"))+
geom_point(data = subset(my_df, values < 24, aes(color = "red"))+
geom_text(data = subset(my_df, values > 48), aes(label = values))
The geom_text line of code provides this error.
Error: Aesthetics must be either length 1 or the same as the data (2): colour, x, y
Secondly, as a different attempt, I have tried using ifelse() to color the values by group. However, I do not know how to label the differently colored groups with their numbers, or how to add a legend with a name for each color group. Even with geom_text() added, or with my attempts at adding a legend, I cannot get the plot I intend. Here is the code that works as a baseline:
ggplot(data=my_df, aes(x = seq(1, length(values)), y = my_df$values))+
geom_point(color = ifelse(my_df$values > 25, "red", "blue"))
If anyone can help, I'll be so thankful, as I've been struggling with this for over a week now.
EDIT: The answers provided below have answered my question. This is the code for my resulting plot, including a legend title and names for each variable as a reference for those looking this up afterwards.
ggplot(my_df, aes(ID, values, color = factor(cut(values, c(0,24,48,Inf))))) +
geom_point(size=3) +
geom_text_repel(data = . %>% filter(values> 48), aes(label = values), show.legend = F)+
geom_text_repel(data = . %>% filter(values< 24), aes(label = values), show.legend = F)+
labs(title = "Beautiful Scatterplot", x = "ID", y = "Values", color = "Legend Title") +
scale_color_manual(labels = c("Below 24", "Between 24 and 48", "Above 48"), values = c("blue", "red", "purple"))
You can try
library(tidyverse)
library(ggrepel)
my_df %>%
mutate(col=case_when(values > 48 ~ 4,
values < 24 ~ 2,
T ~ 1)) %>%
ggplot(aes(ID, values, color = factor(col))) +
geom_point(size=3) +
geom_text_repel(data = . %>% filter(values> 48), aes(label = values)) +
scale_color_identity()
Or using only ggplot
ggplot(my_df, aes(ID, values, color = factor(cut(values, c(0,24,48,Inf))))) +
geom_point(size=3) +
geom_text_repel(data = . %>% filter(values> 48), aes(label = values), show.legend = F)
ggplot(data=my_df,aes(x=ID,y=values,label=ifelse(values>48,values,"")))+
geom_point(size=4,color = ifelse(my_df$values > 48, "red", "blue"))+
geom_text(vjust = 1.3,nudge_x = 0.15,aes(colour="red"),fontface = "bold",show.legend=F)
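The cut() call used for the color mapping can be checked on its own before plotting. This is a small base R sketch with the values from the question; the labels are the ones from the question's final code, and `grp` is an illustrative name.

```r
values <- c(50, 22, 15, 33, 49)

# Intervals are right-closed by default: (0,24], (24,48], (48,Inf]
grp <- cut(values,
           breaks = c(0, 24, 48, Inf),
           labels = c("Below 24", "Between 24 and 48", "Above 48"))

as.character(grp)
```

Because the intervals are closed on the right, a value of exactly 24 would fall into the first bin and a value of exactly 48 into the second.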
I have a dataset that has a wide range of values for one group. Using ggplot's facet_wrap, I would like to plot the y axis on a log scale for one group (the group that has the widest range of values) and on a regular axis for the other group.
Below is a reproducible example.
set.seed(123)
FiveLetters <- LETTERS[1:2]
df <- data.frame(MonthlyCount = sample(1:10, 36, replace=TRUE),
CustName = factor(sample(FiveLetters,size=36, replace=TRUE)),
ServiceDate = format(seq(ISOdate(2003,1,1), by='day', length=36),
format='%Y-%m-%d'), stringsAsFactors = F)
df$ServiceDate <- as.Date(df$ServiceDate)
# replace some counts to really high numbers for group A
df$MonthlyCount[df$CustName =="A" & df$MonthlyCount >= 9 ] <-300
df
library(ggplot2)
library(scales)
ggplot(data = df, aes(x = ServiceDate, y = MonthlyCount)) +
geom_point() +
facet_wrap(~ CustName, ncol = 1, scales = "free_y" ) +
scale_x_date("Date",
labels = date_format("%Y-%m-%d"),
breaks = date_breaks("1 week")) +
theme(axis.text.x = element_text(colour = "black",
size = 16,
angle = 90,
vjust = .5))
The resulting graph has two facets. The facet for group A has dots at the top and the bottom of the graph, which are difficult to compare; the facet for B is easier to read. I would like to plot the facet for group A on a log scale and leave the other "free".
This does the job (note that it applies the log scale to both facets):
ggplot(data = df, aes(x = ServiceDate, y = MonthlyCount)) +
geom_point() +
facet_wrap(~ CustName, ncol = 1, scales = "free_y" ) +
scale_x_date("Date",
labels = date_format("%Y-%m-%d"),
breaks = date_breaks("1 week")) +
scale_y_continuous(trans=log_trans(), breaks=c(1,3,10,50,100,300),
labels = comma_format())+
theme(axis.text.x = element_text(colour = "black",
size = 16,
angle = 90,
vjust = .5))
You can make a transformed monthly count and use that as the y-axis.
## modify monthly count
df$mcount <- with(df, ifelse(CustName == "A", log(MonthlyCount), MonthlyCount))
ggplot(data = df, aes(x = ServiceDate, y = mcount)) +
geom_point() +
facet_wrap(~ CustName, ncol = 1, scales = "free_y" ) +
scale_x_date("Date",
labels = date_format("%Y-%m-%d"),
breaks = date_breaks("1 week")) +
theme(axis.text.x = element_text(colour = "black",
size = 16,
angle = 90,
vjust = .5))
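With this approach the y-axis of facet A shows natural-log units rather than counts, because log() in R is the natural logarithm. The sketch below only illustrates the arithmetic for a few example counts; `counts` and `log_counts` are illustrative names.

```r
counts <- c(1, 10, 300)

# Natural log, as applied to group A by the ifelse() call
log_counts <- log(counts)

# exp() maps an axis value back to the original count
back <- exp(log_counts)
```

A tick at about 5.7 in facet A therefore corresponds to a count of 300, which is worth stating in the axis title or caption.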