ggplot geom_line - setting colour of lines doesn't work? - r

I'm trying to plot several lines and then colour them grey. However, whatever colour I set, I get black lines. And if I put colour inside the aesthetic, then I get different colours (as expected), even if I specify the colour argument again outside aes().
I'm sure I'm missing something very basic here!
library(tidyverse)
library(ggplot2)
country <- c(rep("A", 10), rep("B",10), rep("C", 10))
year <- c(2000:2009, 2000:2009, 2000:2009)
value <- c(rnorm(10), rnorm(10, mean = 0.5), rnorm(10, mean = 1.1))
myData <- tibble(country, year, value) %>%
mutate(avg = mean(value))
ggplot(myData,
aes(x = year, y = value, country = country),
colour = "grey") +
geom_line()

Try this:
ggplot(myData, aes(x = year, y = value, country = country, colour = I("grey"))) +
geom_line()
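A more conventional variant, offered as a minimal sketch rather than the only fix: a constant colour is usually passed to the layer itself, outside aes(), and the lines are kept separate with the group aesthetic. The seed and the use of data.frame() instead of tibble() are assumptions made only to keep the example self-contained and reproducible.

```r
library(ggplot2)

set.seed(1)  # assumption: added only so the example is reproducible
myData <- data.frame(
  country = rep(c("A", "B", "C"), each = 10),
  year    = rep(2000:2009, times = 3),
  value   = c(rnorm(10), rnorm(10, mean = 0.5), rnorm(10, mean = 1.1))
)

# group = country draws one line per country;
# colour = "grey" is a fixed parameter of the layer, not a mapping
p <- ggplot(myData, aes(x = year, y = value, group = country)) +
  geom_line(colour = "grey")
p
```

Arguments passed to ggplot() outside aes() are not applied to the layers, which is why colour = "grey" in the original call had no effect.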

Here is another approach, using scale_color_manual:
p <- ggplot(myData, aes(x = year, y = value, color=country)) +
geom_line()
p + scale_color_manual(values=c("#a6a6a6", "#a6a6a6", "#a6a6a6"))
Instead of using hex color you could also use:
p + scale_color_manual(values=c("gray69", "gray69", "gray69"))

Related

How can I change the size of a bar in a grouped bar chart when one group has no data? [duplicate]

Is there a way to set a constant width for geom_bar() in the event of missing data in the time series example below? I've tried setting width in aes() with no luck. Compare May '11 to June '11 width of bars in the plot below the code example.
colours <- c("#FF0000", "#33CC33", "#CCCCCC", "#FFA500", "#000000" )
iris$Month <- rep(seq(from=as.Date("2011-01-01"), to=as.Date("2011-10-01"), by="month"), 15)
d<-aggregate(iris$Sepal.Length, by=list(iris$Month, iris$Species), sum)
d$quota<-seq(from=2000, to=60000, by=2000)
colnames(d) <- c("Month", "Species", "Sepal.Width", "Quota")
d$Sepal.Width<-d$Sepal.Width * 1000
g1 <- ggplot(data=d, aes(x=Month, y=Quota, color="Quota")) + geom_line(size=1)
g1 + geom_bar(data=d[c(-1:-5),], aes(x=Month, y=Sepal.Width, width=10, group=Species, fill=Species), stat="identity", position="dodge") + scale_fill_manual(values=colours)
Some new options for position_dodge() and the new position_dodge2(), introduced in ggplot2 3.0.0 can help.
You can use preserve = "single" in position_dodge() to base the widths off a single element, so the widths of all bars will be the same.
ggplot(data = d, aes(x = Month, y = Quota, color = "Quota")) +
geom_line(size = 1) +
geom_col(data = d[c(-1:-5),], aes(y = Sepal.Width, fill = Species),
position = position_dodge(preserve = "single") ) +
scale_fill_manual(values = colours)
Using position_dodge2() changes the way things are centered, centering each set of bars at each x axis location. It has some padding built in, so use padding = 0 to remove.
ggplot(data = d, aes(x = Month, y = Quota, color = "Quota")) +
geom_line(size = 1) +
geom_col(data = d[c(-1:-5),], aes(y = Sepal.Width, fill = Species),
position = position_dodge2(preserve = "single", padding = 0) ) +
scale_fill_manual(values = colours)
The easiest way is to supplement your data set so that every combination is present, even if it has NA as its value. Taking a simpler example (as yours has a lot of unneeded features):
dat <- data.frame(a=rep(LETTERS[1:3],3),
b=rep(letters[1:3],each=3),
v=1:9)[-2,]
ggplot(dat, aes(x=a, y=v, colour=b)) +
geom_bar(aes(fill=b), stat="identity", position="dodge")
This shows the behavior you are trying to avoid: in group "B", there is no group "a", so the bars are wider. Supplement dat with a dataframe with all the combinations of a and b:
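# unique() works whether a and b are factors or character vectors
# (since R 4.0, data.frame() no longer converts strings to factors, so levels() would return NULL)
dat.all <- rbind(dat, cbind(expand.grid(a=unique(dat$a), b=unique(dat$b)), v=NA))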
ggplot(dat.all, aes(x=a, y=v, colour=b)) +
geom_bar(aes(fill=b), stat="identity", position="dodge")
I had the same problem but was looking for a solution that works with the pipe (%>%). Using tidyr::spread and tidyr::gather from the tidyverse does the trick. I use the same data as @Brian Diggs, but with uppercase variable names to avoid ending up with duplicate variable names when transforming to wide:
library(tidyverse)
dat <- data.frame(A = rep(LETTERS[1:3], 3),
B = rep(letters[1:3], each = 3),
V = 1:9)[-2, ]
dat %>%
spread(key = B, value = V, fill = NA) %>% # turn data to wide, using fill = NA to generate missing values
gather(key = B, value = V, -A) %>% # go back to long, with the missings
ggplot(aes(x = A, y = V, fill = B)) +
geom_col(position = position_dodge())
Edit:
There is actually an even simpler solution to this problem that works with the pipe. Using tidyr::complete gives the same result in one line:
dat %>%
complete(A, B) %>%
ggplot(aes(x = A, y = V, fill = B)) +
geom_col(position = position_dodge())

Only display label per category

I have the following dataset:
year <- as.factor(c(1999,2000,2001))
era <- c(0.4,0.6,0.7)
player_id <- as.factor(c(2,2,2))
df <- data.frame(year, era, player_id)
Using this data I created the following graph:
ggplot(data = df, aes(x = year, y=era, colour = player_id))+
geom_line() +
geom_text(aes(label = player_id), hjust=0.7)
The problem is that I now get a label at every data point. I only want a label at the end of each line.
Any thoughts on what I should change so that I get only one label?
If I understand correctly, you want a label at the end of each line. You could do this using the directlabels package, as below:
library(ggplot2)
library(directlabels)
ggplot(data = df, aes(x = year, y=era, group = player_id, colour = player_id))+
geom_line() +
scale_colour_discrete(guide = 'none') +
scale_x_discrete(expand=c(0, 1)) +
geom_dl(aes(label = player_id), method = list(dl.combine("last.points"), cex = 0.8))
If I am understanding correctly what you want, then you can replace the geom_text(...) with geom_point()
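If an extra package is not wanted, another option is to give geom_text() its own data containing only the last observation of each player. This is a minimal sketch, not taken from the answers above; the helper name last_points is hypothetical, and it assumes the rows are already ordered by year within each player.

```r
library(ggplot2)

df <- data.frame(
  year      = factor(c(1999, 2000, 2001)),
  era       = c(0.4, 0.6, 0.7),
  player_id = factor(c(2, 2, 2))
)

# Keep the last row of each player (assumes rows are sorted by year per player)
last_points <- df[!duplicated(df$player_id, fromLast = TRUE), ]

p <- ggplot(df, aes(x = year, y = era, group = player_id, colour = player_id)) +
  geom_line() +
  # the label layer only sees the filtered data, so one label per line is drawn
  geom_text(data = last_points, aes(label = player_id), nudge_x = 0.1)
p
```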

ggplot2 change line type

I've been trying to plot two line graphs, one dashed and the other solid. I succeeded in doing so in the plot area, but the legend is problematic.
I looked at posts such as "Changing the line type in the ggplot legend", but I can't seem to get the solution to work. Where have I gone wrong?
library(ggplot2)
year <- 2005:2015
variablea <- 1000:1010
variableb <- 1010:1020
df = data.frame(year, variablea, variableb)
p <- ggplot(df, aes(x = df$year)) +
geom_line(aes(y = df$variablea, colour="variablea", linetype="longdash")) +
geom_line(aes(y = df$variableb, colour="variableb")) +
xlab("Year") +
ylab("Value") +
scale_colour_manual("", breaks=c("variablea", "variableb")
, values=c("variablea"="red", "variableb"="blue")) +
scale_linetype_manual("", breaks=c("variablea", "variableb")
, values=c("longdash", "solid"))
p
Notice that both lines appear as solid in the legend.
ggplot likes long data, so you can map linetype and color to a variable. For example,
library(tidyverse)
df %>% gather(variable, value, -year) %>%
ggplot(aes(x = year, y = value, colour = variable, linetype = variable)) +
geom_line()
Adjust color and linetype scales with the appropriate scale_*_* functions, if you like.
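As a sketch of that last step, assuming the colours and line types from the question are the ones wanted: giving both manual scales the same name and the same named values lets ggplot2 merge them into a single legend that shows both colour and line type. Here tidyr::pivot_longer() is swapped in for gather(); the column names variable and value are the same as in the answer above.

```r
library(ggplot2)
library(tidyr)

df <- data.frame(year = 2005:2015, variablea = 1000:1010, variableb = 1010:1020)

# Reshape to long format: one row per year and variable
long <- pivot_longer(df, cols = -year, names_to = "variable", values_to = "value")

p <- ggplot(long, aes(x = year, y = value, colour = variable, linetype = variable)) +
  geom_line() +
  # identical scale names (here empty) allow the two legends to be merged
  scale_colour_manual(name = "", values = c(variablea = "red", variableb = "blue")) +
  scale_linetype_manual(name = "", values = c(variablea = "longdash", variableb = "solid")) +
  labs(x = "Year", y = "Value")
p
```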

Plot discrete values with different color

Given a dataframe with discrete values,
d=data.frame(id=1:6, a=c(1,1,1,0,0,0), b=c(0,0,0,1,1,1), c=c(10,20,30,30,10,20))
I want to make a plot like
However, I want different colours for each layer, say red and green for "a", yellow and blue for "b".
The idea is to reshape your data (define coordinates to draw the rectangles) in order to use geom_rect from ggplot:
library(ggplot2)
library(reshape2)
i = setNames(expand.grid(1:nrow(d),1:ncol(d[-1])),c('x1','y1'))
ggplot(cbind(i,melt(d, id.vars='id')),
aes(xmin=x1, xmax=x1+1, ymin=y1, ymax=y1+1, color=variable, fill=value)) +
geom_rect()
Try geom_tile(). But you need to reshape your data to get exactly the same figure as you presented.
df <- data.frame(id=factor(c(1:6)), a=c(1,1,1,0,0,0), b=c(0,0,0,1,1,1), c=c(10,20,30,30,10,20))
library(reshape2)
df <- melt(df, id.vars = "id")  # id.vars takes the name of the identifier column
library(ggplot2)
ggplot(aes(x = id, y = variable, fill = value), data = df) + geom_tile()
require("dplyr")
require("tidyr")
require("ggplot2")
d=data.frame(id=1:6, a=c(1,1,1,0,0,0), b=c(0,0,0,1,1,1), c=c(10,20,30,30,10,20))
ggplot(d %>% gather(type, value, a, b, c) %>% mutate(value = paste0(type, value)),
aes(x = id, y = type)) +
geom_tile(aes(fill = value), color = "white") +
scale_fill_manual(values = c("forestgreen", "indianred", "lightgoldenrod1",
"royalblue", "plum1", "plum2", "plum3"))
First we use reshape2 to transform the data from wide to long. Then to get discrete values we use as.factor(value) and finally we use scale_fill_manual to assign the 5 different colours we need. In geom_tile we specify the colour of the tile borders.
library(reshape2)
library(ggplot2)
df <- data.frame(id=1:6, a=c(1,1,1,0,0,0), b=c(0,0,0,1,1,1), c=c(10,20,30,30,10,20))
df <- melt(df, id.vars=c("id"))
ggplot(df, aes(id, variable, fill = as.factor(value))) + geom_tile(colour = "white") +
scale_fill_manual(values = c("lightblue", "steelblue2", "steelblue3", "steelblue4", "darkblue"), name = "Values")+
scale_x_continuous(breaks = 1:6)  # id is numeric, so use a continuous scale with one tick per id

How to enforce stack ordering in ggplot geom_area

Is it possible to enforce the stack order when using geom_area()? I cannot figure out why geom_area(position = "stack") produces this strange fluctuation in stack order around 1605.
There are no missing values in the data frame.
library(ggplot2)
counts <- read.csv("https://gist.githubusercontent.com/mdlincoln/d5e1bf64a897ecb84fd6/raw/34c6d484e699e0c4676bb7b765b1b5d4022054af/counts.csv")
ggplot(counts, aes(x = year, y = artists_strict, fill = factor(nationality))) + geom_area()
You need to order your data. In your data, the first value found for each year is 'Flemish' until 1605, and from 1606 the first value is 'Dutch'. So, if we do this:
ggplot(counts[order(counts$nationality),],
aes(x = year, y = artists_strict, fill = factor(nationality))) + geom_area()
This gives a consistent stacking order across all years.
Further illustration if we use random ordering:
set.seed(123)
ggplot(counts[sample(nrow(counts)),],
aes(x = year, y = artists_strict, fill = factor(nationality))) + geom_area()
As randy said, ggplot2 2.2.0 does automatic ordering. If you want to change the order, just reorder the factors used for fill. If you want to switch which group is on top in the legend but not the plot, you can use scale_fill_manual() with the limits option.
(Code to generate ggplot colors from John Colby)
gg_color_hue <- function(n) {
hues = seq(15, 375, length = n + 1)
hcl(h = hues, l = 65, c = 100)[1:n]
}
cols <- gg_color_hue(2)
Default ordering in legend
ggplot(counts,
aes(x = year, y = artists_strict, fill = factor(nationality))) +
geom_area()+
scale_fill_manual(values=c("Dutch" = cols[1],"Flemish"=cols[2]),
limits=c("Dutch","Flemish"))
Reversed ordering in legend
ggplot(counts,
aes(x = year, y = artists_strict, fill = factor(nationality))) +
geom_area()+
scale_fill_manual(values=c("Dutch" = cols[1],"Flemish"=cols[2]),
limits=c("Flemish","Dutch"))
Reversed ordering in plot and legend
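# factor() first, so this also works when read.csv() returns nationality as character (R >= 4.0)
counts$nationality <- factor(counts$nationality, levels = rev(levels(factor(counts$nationality))))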
ggplot(counts,
aes(x = year, y = artists_strict, fill = factor(nationality))) +
geom_area()+
scale_fill_manual(values=c("Dutch" = cols[1],"Flemish"=cols[2]),
limits=c("Flemish","Dutch"))
This should do it for you:
ggplot(counts[order(counts$nationality),],
aes(x = year, y = artists_strict, fill = factor(nationality))) + geom_area()
Hope this helps.

Resources