Even though I found Hadley's post in the google group on POSIXct and geom_vline, I could not get it done. I have a time series from and would like to draw a vertical line for years 1998, 2005 and 2010 for example. I tried with ggplot and qplot syntax, but still I either see no vertical line at all or the vertical line is drawn at the very first vertical grid and the whole series is shifted somewhat strangely to the right.
gg <- ggplot(data=mydata,aes(y=somevalues,x=datefield,color=category)) +
layer(geom="line")
gg + geom_vline(xintercept=mydata$datefield[120],linetype=4)
# returns just the time series plot I had before,
# interestingly the legend contains dotted vertical lines
My date field has format "1993-07-01" and is of class Date.
Try as.numeric(mydata$datefield[120]):
gg + geom_vline(xintercept=as.numeric(mydata$datefield[120]), linetype=4)
A simple test example:
library("ggplot2")
tmp <- data.frame(x=rep(seq(as.Date(0, origin="1970-01-01"),
length=36, by="1 month"), 2),
y=rnorm(72),
category=gl(2,36))
p <- ggplot(tmp, aes(x, y, colour=category)) +
geom_line() +
geom_vline(xintercept=as.numeric(tmp$x[c(13, 24)]),
linetype=4, colour="black")
print(p)
You could also do geom_vline(xintercept = as.numeric(as.Date("2015-01-01")), linetype=4) if you want the line to stay in place whether or not your date is in the 120th row.
Depending on how you pass your "Dates" column to aes, either as.numeric or as.POSIXct works:
library(ggplot2)
using aes(as.Date(Dates),...)
ggplot(df, aes(as.Date(Dates), value)) +
geom_line() +
geom_vline(xintercept = as.numeric(as.Date("2020-11-20")),
color = "red",
lwd = 2)
using aes(Dates, ...)
ggplot(df, aes(Dates, value)) +
geom_line() +
geom_vline(xintercept = as.POSIXct(as.Date("2020-11-20")),
color = "red",
lwd = 2)
as.numeric works to me
ggplot(data=bmelt)+
geom_line(aes(x=day,y=value,colour=type),size=0.9)+
scale_color_manual(labels = c("Observed","Counterfactual"),values = c("1","2"))+
geom_ribbon(data=ita3,aes(x=day,
y=expcumresponse, ymin=exp.cr.ll,ymax=exp.cr.uu),alpha=0.2) +
labs(title="Italy Confirmed cases",
y ="# Cases ", x = "Date",color="Output")+
geom_vline(xintercept = as.numeric(ymd("2020-03-13")), linetype="dashed",
color = "blue", size=1.5)+
theme_minimal()
Related
I'm graphing data with ggplot and animating it using gganimate. I have colors as my labels and when I add a geom_label_repel() part to my graph it adds the labels but displays the colors as the color code. (Example Blue is displayed as #0000FFFF). Here is a snapshot of my animated graph.
My data is just cumulative percentages for m&m's colors in each bag.
Here is what I currently have for my ggplot() and am wondering if I have something wrong or if there is a reason why it's changing the labels from color names to the number codes.
df<- data.frame(x=mm_data$Bag,
y= c(mm_data$totalp_red,mm_data$totalp_blue,
mm_data$totalp_orange,mm_data$totalp_yellow,
mm_data$totalp_brown,mm_data$totalp_green),
group = c(rep("Red", nrow(mm_data)),
rep("Blue", nrow(mm_data)),
rep("Orange", nrow(mm_data)),
rep("Yellow", nrow(mm_data)),
rep("Brown", nrow(mm_data)),
rep("Green", nrow(mm_data))))
group.colors <- c( "blue3","sandybrown","green3","darkorange"
,"red2","yellow")
ggplot(df, aes(x, y, group=group, color=group)) +
geom_line() +
geom_point() +
geom_label_repel(label= df$group,max.overlaps = Inf)+
scale_color_manual(values = group.colors)+
ggtitle("Colors present in my M&M Bag") +
ylab("Distribution percentage") +
xlab("Bags of M&M's")+
transition_reveal(x)```
Don't use df$ in your ggplot verbs when you are referring to the same data;
You need aes(..), even in geom_label_repel.
Try this.
ggplot(df, aes(x, y, group=group, color=group)) +
geom_line() +
geom_point() +
geom_label_repel(aes(label=group),max.overlaps = Inf)+
scale_color_manual(values = group.colors)+
ggtitle("Colors present in my M&M Bag") +
ylab("Distribution percentage") +
xlab("Bags of M&M's")+
transition_reveal(x)
For some reason when the labels are as.character() they will show up like this. All I had to do was save them as as.factor and they appeared as "BLUE", "GREEN", "YELLOW", etc instead of the color code.
I am trying to plot data using ggplot2 in R.
The datapoints occur for each 2^i-th x-value (4, 8, 16, 32,...). For that reason, I want to scale my x-Axis by log_2 so that my datapoints are spread out evenly. Currently most of the datapoints are clustered on the left side, making my plot hard to read (see first image).
I used the following command to get this image:
ggplot(summary, aes(x=xData, y=yData, colour=groups)) +
geom_errorbar(aes(ymin=yData-se, ymax=yData+se), width=2000, position=pd) +
geom_line(position=pd) +
geom_point(size=3, position=pd)
However trying to scale my x-axis with log2_trans yields the second image, which is not what I expected and does not follow my data.
Code used:
ggplot(summary, aes(x=settings.numPoints, y=benchmark.costs.average, colour=solver.name)) +
geom_errorbar(aes(ymin=benchmark.costs.average-se, ymax=benchmark.costs.average+se), width=2000, position=pd) +
geom_line(position=pd) +
geom_point(size=3, position=pd) +
scale_x_continuous(trans = log2_trans(),
breaks = trans_breaks("log2", function(x) 2^x),
labels = trans_format("log2", math_format(2^.x)))
Using scale_x_continuous(trans = log2_trans()) only doesn't help either.
EDIT:
Attached the data for reproducing the results:
https://pastebin.com/N1W0z11x
EDIT 2:
I have used the function pd <- position_dodge(1000) to avoid overlapping of my error bars, which caused the problem.
Removing the position=pd statements solved the issue
Here is a way you could format your x-axis:
# Generate dummy data
x <- 2^seq(1, 10)
df <- data.frame(
x = c(x, x, x),
y = c(0.5*x, x, 1.5*x),
z = rep(letters[seq_len(3)], each = length(x))
)
The plot of this would look like this:
ggplot(df, aes(x, y, colour = z)) +
geom_point() +
geom_line()
Adjusting the x-axis would work like so:
ggplot(df, aes(x, y, colour = z)) +
geom_point() +
geom_line() +
scale_x_continuous(
trans = "log2",
labels = scales::math_format(2^.x, format = log2)
)
The labels argument is just so you have labels in the format 2^x, you could change that to whatever you like.
I have used the function pd <- position_dodge(1000) to avoid overlapping of my error bars, which caused the problem.
Adjusting the amount of position dodge and the with of the error bars according to the new scaling solved the problem.
pd <- position_dodge(0.2) # move them .2 to the left and right
ggplot(summary, aes(x=settings.numPoints, y=benchmark.costs.average, colour=algorithm)) +
geom_errorbar(aes(ymin=benchmark.costs.average-se, ymax=benchmark.costs.average+se), width=0.4, position=pd) +
geom_line(position=pd) +
geom_point(size=3, position=pd) +
scale_x_continuous(
trans = "log2",
labels = scales::math_format(2^.x, format = log2)
)
Adding scale_y_continuous(trans="log2") yields the results I was looking for:
In a previous question, I asked about moving the label position of a barplot outside of the bar if the bar was too small. I was provided this following example:
library(ggplot2)
options(scipen=2)
dataset <- data.frame(Riserva_Riv_Fine_Periodo = 1:10 * 10^6 + 1,
Anno = 1:10)
ggplot(data = dataset,
aes(x = Anno,
y = Riserva_Riv_Fine_Periodo)) +
geom_bar(stat = "identity",
width=0.8,
position="dodge") +
geom_text(aes( y = Riserva_Riv_Fine_Periodo,
label = round(Riserva_Riv_Fine_Periodo, 0),
angle=90,
hjust= ifelse(Riserva_Riv_Fine_Periodo < 3000000, -0.1, 1.2)),
col="red",
size=4,
position = position_dodge(0.9))
And I obtain this graph:
The problem with the example is that the value at which the label is moved must be hard-coded into the plot, and an ifelse statement is used to reposition the label. Is there a way to automatically extract the value to cut?
A slightly better option might be to base the test and the positioning of the labels on the height of the bar relative to the height of the highest bar. That way, the cutoff value and label-shift are scaled to the actual vertical range of the plot. For example:
ydiff = max(dataset$Riserva_Riv_Fine_Periodo)
ggplot(dataset, aes(x = Anno, y = Riserva_Riv_Fine_Periodo)) +
geom_bar(stat = "identity", width=0.8) +
geom_text(aes(label = round(Riserva_Riv_Fine_Periodo, 0), angle=90,
y = ifelse(Riserva_Riv_Fine_Periodo < 0.3*ydiff,
Riserva_Riv_Fine_Periodo + 0.1*ydiff,
Riserva_Riv_Fine_Periodo - 0.1*ydiff)),
col="red", size=4)
You would still need to tweak the fractional cutoff in the test condition (I've used 0.3 in this case), depending on the physical size at which you render the plot. But you could package the code into a function to make the any manual adjustments a bit easier.
It's probably possible to automate this by determining the actual sizes of the various grobs that make up the plot and setting the condition and the positioning based on those sizes, but I'm not sure how to do that.
Just as an editorial comment, a plot with labels inside some bars and above others risks confusing the visual mapping of magnitudes to bar heights. I think it would be better to find a way to shrink, abbreviate, recode, or otherwise tweak the labels so that they contain the information you want to convey while being able to have all the labels inside the bars. Maybe something like this:
library(scales)
ggplot(dataset, aes(x = Anno, y = Riserva_Riv_Fine_Periodo/1000)) +
geom_col(width=0.8, fill="grey30") +
geom_text(aes(label = format(Riserva_Riv_Fine_Periodo/1000, big.mark=",", digits=0),
y = 0.5*Riserva_Riv_Fine_Periodo/1000),
col="white", size=3) +
scale_y_continuous(label=dollar, expand=c(0,1e2)) +
theme_classic() +
labs(y="Riserva (thousands)")
Or maybe go with a line plot instead of bars:
ggplot(dataset, aes(Anno, Riserva_Riv_Fine_Periodo/1e3)) +
geom_line(linetype="11", size=0.3, colour="grey50") +
geom_text(aes(label=format(Riserva_Riv_Fine_Periodo/1e3, big.mark=",", digits=0)),
size=3) +
theme_classic() +
scale_y_continuous(label=dollar, expand=c(0,1e2)) +
expand_limits(y=0) +
labs(y="Riserva (thousands)")
I am trying to make an extremely single heatmap of percentages using ggplot2 which ideally will just be two single thin columns. I tried the following code, believing that the width option in aes would solve the problem.
p_prev_tg <- ggplot(tg_melt, aes(x = variable , y = OTU, fill = value,
width=.3)) + geom_tile() +
scale_fill_gradientn(colours = hm.palette2(10)) +
xlab(NULL) + ylab(NULL) +
theme(axis.text=element_text(size=7))
p_prev_tg
Unfortunately, this returns a plot with lots of empty space as shown. The plot I would like is those two bars side by side, how can I do this in ggplot?
thanks
What about this solution ?
set.seed(1234)
tg_melt <- data.frame(variable=rep(c("Prevalence_T","Prevalence_NT"), each=10),
OTU=rep(paste0("OTU_",1:10),2),
value=rnorm(20))
library(RColorBrewer)
library(ggplot2)
hm.palette2 <- colorRampPalette(rev(brewer.pal(11, 'Spectral')))
p_prev_tg <- ggplot(tg_melt, aes(x = as.numeric(variable), y = OTU, fill = value)) +
geom_tile() +
scale_fill_gradientn(colours = hm.palette2(10)) +
xlab(NULL) + ylab(NULL) +
theme(axis.text=element_text(size=7)) +
scale_x_continuous(breaks=c(1,2),
limits=c(0,3),
labels=levels(tg_melt$variable))+
theme_bw()
p_prev_tg
Even though I found Hadley's post in the google group on POSIXct and geom_vline, I could not get it done. I have a time series from and would like to draw a vertical line for years 1998, 2005 and 2010 for example. I tried with ggplot and qplot syntax, but still I either see no vertical line at all or the vertical line is drawn at the very first vertical grid and the whole series is shifted somewhat strangely to the right.
gg <- ggplot(data=mydata,aes(y=somevalues,x=datefield,color=category)) +
layer(geom="line")
gg + geom_vline(xintercept=mydata$datefield[120],linetype=4)
# returns just the time series plot I had before,
# interestingly the legend contains dotted vertical lines
My date field has format "1993-07-01" and is of class Date.
Try as.numeric(mydata$datefield[120]):
gg + geom_vline(xintercept=as.numeric(mydata$datefield[120]), linetype=4)
A simple test example:
library("ggplot2")
tmp <- data.frame(x=rep(seq(as.Date(0, origin="1970-01-01"),
length=36, by="1 month"), 2),
y=rnorm(72),
category=gl(2,36))
p <- ggplot(tmp, aes(x, y, colour=category)) +
geom_line() +
geom_vline(xintercept=as.numeric(tmp$x[c(13, 24)]),
linetype=4, colour="black")
print(p)
You could also do geom_vline(xintercept = as.numeric(as.Date("2015-01-01")), linetype=4) if you want the line to stay in place whether or not your date is in the 120th row.
Depending on how you pass your "Dates" column to aes, either as.numeric or as.POSIXct works:
library(ggplot2)
using aes(as.Date(Dates),...)
ggplot(df, aes(as.Date(Dates), value)) +
geom_line() +
geom_vline(xintercept = as.numeric(as.Date("2020-11-20")),
color = "red",
lwd = 2)
using aes(Dates, ...)
ggplot(df, aes(Dates, value)) +
geom_line() +
geom_vline(xintercept = as.POSIXct(as.Date("2020-11-20")),
color = "red",
lwd = 2)
as.numeric works to me
ggplot(data=bmelt)+
geom_line(aes(x=day,y=value,colour=type),size=0.9)+
scale_color_manual(labels = c("Observed","Counterfactual"),values = c("1","2"))+
geom_ribbon(data=ita3,aes(x=day,
y=expcumresponse, ymin=exp.cr.ll,ymax=exp.cr.uu),alpha=0.2) +
labs(title="Italy Confirmed cases",
y ="# Cases ", x = "Date",color="Output")+
geom_vline(xintercept = as.numeric(ymd("2020-03-13")), linetype="dashed",
color = "blue", size=1.5)+
theme_minimal()