Legends and labelling smooth fitted lines + additional lines using ggplot2 - r

I am working on visualising some patterns in network data and have some issues labelling lines, where I have multiple classes of lines:
1. loess lines for each factor (network)
2. a baseline at y=4000
3. a gam line that acts on all of the data (not factored)
Now, Stack Overflow has helped get me to this point (thanks!), but I feel like I have run into a brick wall for what I need to do:
A. provide a legend entry for line #3
B. label each line on the graph (as per #1 #2 #3 - so 8 lines total)
Here is the code that I have so far:
p <- ggplot(network_data, aes(x=timeofday,y=dspeed, colour=factor(network)))+stat_smooth(method="loess",formula=y~x,se=FALSE)
p <- p + stat_function(fun=function(x)4000, geom="line", linetype="dashed", aes(colour="Baseline"))
p <- p + xlab("Time of Day (hr)") + ylab("Download Speed (ms)")
p <- p + theme(axis.line=element_line(colour="black"))
# add the gam line, colouring it purple for now
q <- stat_smooth(data=network_data, mapping=aes(x=timeofday,y=dspeed), method="gam", formula=y~s(x), se=FALSE, colour="purple", inherit.aes=FALSE)
graph <- p+q # add the layer
#legend
graph <- graph+scale_colour_discrete(name="network")
# set up the origin correctly and axes etc
graph2 <- graph + scale_y_continuous(limits=c(0,6500), expand=c(0,0), breaks=c(0,1000,2000,3000,4000,5000,6000)) + scale_x_datetime(limits=as.POSIXct(c("2015-04-13 00:00:01","2015-04-13 23:59:59")), expand = c(0, 0), breaks=date_breaks("1 hour"), labels=date_format("%H"))
Happy to consider other packages, but ggplot2 seems to be the best so far.
Is there any way to do this 'automatically' (through programming), as I am trying to automate the generation of these graphs?
I have made the data available here as a .Rda file:
https://dl.dropboxusercontent.com/u/5268020/network_data.Rda
And here is an image of the current plot:

For question B, try annotate() and manually specify the location and text of the label for each line. That seems unnecessary given the legend, though.
http://docs.ggplot2.org/current/annotate.html
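A minimal sketch of one way to get both a legend entry for the overall smooth (A) and a label at the end of each line (B) without hand-coding positions. The data here are simulated stand-ins for network_data, and the names label_pos, "All networks" and yint are made up for illustration. The idea is to map colour to a constant string inside aes() so the line appears in the legend, and to compute label positions from a separate loess fit per network. method="gam" is assumed to have the mgcv package available.

```r
library(ggplot2)

# Hypothetical stand-in for network_data: three networks sampled over a day
set.seed(1)
network_data <- data.frame(
  timeofday = rep(seq(0, 23, by = 0.5), times = 3),
  network   = rep(c("A", "B", "C"), each = 47)
)
network_data$dspeed <- 3000 + 500 * sin(network_data$timeofday / 4) +
  300 * as.integer(factor(network_data$network)) +
  rnorm(nrow(network_data), sd = 200)

# Label positions: the loess fit of each network evaluated at the last x value
label_pos <- do.call(rbind, lapply(split(network_data, network_data$network), function(d) {
  fit  <- loess(dspeed ~ timeofday, data = d)
  last <- max(d$timeofday)
  data.frame(network   = d$network[1],
             timeofday = last,
             dspeed    = predict(fit, newdata = data.frame(timeofday = last)))
}))

p <- ggplot(network_data, aes(timeofday, dspeed)) +
  # one loess line per network
  stat_smooth(aes(colour = network), method = "loess", formula = y ~ x, se = FALSE) +
  # baseline: colour mapped inside aes() so it gets a legend entry
  geom_hline(data = data.frame(yint = 4000),
             aes(yintercept = yint, colour = "Baseline"), linetype = "dashed") +
  # overall gam line: also mapped inside aes() so it appears in the legend
  stat_smooth(aes(colour = "All networks"), method = "gam",
              formula = y ~ s(x), se = FALSE) +
  # text labels at the right-hand end of each loess line
  geom_text(data = label_pos, aes(label = network, colour = network),
            hjust = 0, nudge_x = 0.3, show.legend = FALSE) +
  expand_limits(x = 25) +   # leave room for the labels
  scale_colour_discrete(name = "network")
```

Calling print(p) draws the plot. Because the label positions are computed from the data, the same code can be reused for each automatically generated graph.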

Related

How to change dots in forest plot?

I have an excel table with the data of the Odds Ratios of different diseases for my study. I want to make a forestplot with the R package ggplot2. I have used this script:
library(ggplot2)
df <- readxl::read_excel("excel.xlsx") # read the spreadsheet (assumes the readxl package is installed)
fp <- ggplot(data=df, aes(x=Disease, y=OR, ymin=Lower, ymax=Upper)) +
geom_pointrange() +
geom_hline(yintercept=1, lty=2) + # add a dotted line at x=1 after flip
coord_flip() + # flip coordinates (puts labels on y axis)
xlab("Disease") + ylab("OR (95% CI)") +
theme_bw() # use a white background
print(fp)
This makes round black spots for all diseases. I would like to change the shape of the points to squares or another shape, but only for some diseases: the points corresponding to rows 6, 8, 14 and 16 should change, and the rest should stay as they are now.
Thank you in advance.
I have tried this script but it makes only black spots.
The example code was not reproducible when I wrote this answer, but I think you just need to specify shape inside aes().
This question includes a complete example with multiple shapes
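As a rough illustration of the "shape in aes" suggestion, the sketch below flags the rows of interest in a new column and maps that column to shape. The data frame is simulated (the original spreadsheet is not available), and the column name highlight and the chosen shape codes (16 for a filled circle, 15 for a filled square) are assumptions for the example.

```r
library(ggplot2)

# Hypothetical data standing in for the Excel sheet (16 diseases)
set.seed(42)
df <- data.frame(
  Disease = paste("Disease", sprintf("%02d", 1:16)),
  OR      = exp(rnorm(16, 0, 0.3))
)
df$Lower <- df$OR * 0.7
df$Upper <- df$OR * 1.4

# Flag the rows whose marker should differ (rows 6, 8, 14 and 16 in the question)
df$highlight <- seq_len(nrow(df)) %in% c(6, 8, 14, 16)

fp <- ggplot(df, aes(x = Disease, y = OR, ymin = Lower, ymax = Upper,
                     shape = highlight)) +
  geom_pointrange() +
  geom_hline(yintercept = 1, linetype = 2) +            # reference line at OR = 1
  scale_shape_manual(values = c(`FALSE` = 16, `TRUE` = 15),
                     guide = "none") +                   # circles vs. squares, no legend
  coord_flip() +
  xlab("Disease") + ylab("OR (95% CI)") +
  theme_bw()
print(fp)
```

Mapping a logical column keeps the row selection in the data rather than in the plotting call, so changing which diseases are highlighted only means changing the vector of row numbers.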

plot points in front of lines for each group/ggplot2 equivalent of type="o"

Suppose I want to plot a graph with both points and lines where points appear in front of their corresponding lines in each group. In particular, I want group 1 to be plotted with red filled points, where the points are connected by a line, but group 2 to be plotted with (just) a blue line, but I want group 2 to be plotted over group 1. For example, in base graphics:
set.seed(101)
dd <- data.frame(x=rep(1:10,2),
y=rep(1:10,2),
f=factor(rep(1:2,each=10)))
dd$y[11:20] <- dd$y[11:20] + rnorm(10)
d1 <- subset(dd,f=="1")
d2 <- subset(dd,f=="2")
par(cex=1.5)
plot(y~x,data=d1,bg="red",pch=21,type="o")
lines(y~x,data=d2,col="blue",lwd=2)
legend("bottomright",c("group 1","group 2"),
col=c("black","blue"),
pch=c(21,NA),
pt.bg=c("red",NA),
lty=1,
lwd=c(1,2))
(My real data are a little more complex.) I'm going a little nuts trying to do this cleanly in ggplot2.
If I draw points before lines, group 1's points get overlaid by the lines in the same group:
library(ggplot2); theme_set(theme_bw())
g0 <- ggplot(dd,aes(x,y,fill=f,colour=f,shape=f))+
scale_fill_manual(values=c("red",NA))+
scale_colour_manual(values=c("black","blue")) +
scale_shape_manual(values=c(21,NA))
g0 + geom_point()+ geom_line()
ggsave("order2.png",width=3,height=3)
If I draw lines before points, group 2's lines get overlaid by group 1's points:
g0 + geom_line()+ geom_point()
ggsave("order3.png",width=3,height=3)
The desired order is (group 1 lines), (group 1 points), (group 2 lines). I can do this by manually overlaying the geoms again, one group at a time, but this is way ugly.
g0 + geom_line() + geom_point()+
geom_point(data=d1)+
geom_line(data=d2,show.legend=FALSE)
ggsave("order4.png",width=3,height=3)
I think the "best" solution to this is to write a low-level geom_linepoint that works as desired; I've looked into this a bit and it's not entirely trivial ... can anyone suggest a cleaner, simpler solution?
Here's a "low tech" [1] solution. Below is a function that adds a line layer and then a point layer successively for each level of a given grouping variable.
linepoint = function(data, group.var, lsize=1.2, psize=4) {
  # one (line, point) pair of layers per group, so each group's points sit on top of its own line
  lapply(split(data, data[,group.var]), function(dg) {
    list(geom_line(data=dg, size=lsize),
         geom_point(data=dg, size=psize))
  })
}
ggplot(dd, aes(x,y, fill=f, colour=f,shape=f))+
  scale_fill_manual(values=c("red",NA))+
  scale_colour_manual(values=c("black","blue")) +
  scale_shape_manual(values=c(21,NA)) +
  linepoint(dd, "f")
[1] "Low tech" compared to writing a new geom. @baptiste's (now deleted) answer does create a new geom and seems to get the job done, so I'm not sure why he deleted it.

Plot log density of a distribution in ggplot2 [duplicate]

I'm using ggplot as described in "Smoothed density estimates" and entered the following in the R console:
m <- ggplot(movies, aes(x = rating))
m + geom_density()
This works, but is there some way to remove the connection between the x-axis and the density plot (the vertical lines which connect the density plot to the x-axis)?
The most consistent way to do so is (thanks to @baptiste):
m + stat_density(geom="line")
My original proposal was to use geom_line with an appropriate stat:
m + geom_line(stat="density")
but it is no longer recommended, since I have received reports that it does not work in every case in newer versions of ggplot.
The suggested answers don't provide exactly the same results as geom_density. Why not draw a white line over the baseline?
+ geom_hline(yintercept=0, colour="white", size=1)
This worked for me.
Another way would be to calculate the density separately and then draw it. Something like this:
a <- density(movies$rating)
b <- data.frame(a$x, a$y)
ggplot(b, aes(x=a.x, y=a.y)) + geom_line()
It's not exactly the same, but pretty close.

Add geom_vlines to - or colour - a facet wrap plot

I have a plot generated by the following R code, basically a panel of many histograms/bars. To each one I'd like to add a vertical line, but the position of the vertical line is different for each facet. Alternatively, I'd like to colour the bars red depending on whether the x value is higher than a threshold. How do I do this to such a plot with ggplot2 / R?
I generated the chart like so:
Histogramplot3 <- ggplot(completeFrame, aes(P_Value)) + geom_bar() + facet_wrap(~ Generation)
Where completeFrame is my dataframe, P_Value is my x variable, and the Facet Wrap Variable Generation is a factor.
It's easier to help with specific examples, but simulating some data, maybe this will help:
#simulate data
completeFrame<-data.frame(P_Value=rnorm(200,0.8,0.1),Generation=rep(1:4,times=50))
#draw the basic plot
h3 <- ggplot(completeFrame, aes(x=P_Value)) +
  geom_histogram(binwidth=0.02, col="black", fill="black") +
  # overlay the "red" bars for the subset of data
  geom_histogram(data=completeFrame[which(completeFrame$P_Value>0.8),],binwidth=0.02, col="black", fill="red") +
  facet_wrap(~ Generation)
#add lines to the subsets
h3 <- h3+geom_hline(data=completeFrame[which(completeFrame$Generation==2),],aes(yintercept=max(P_Value)))
h3 <- h3+geom_hline(data=completeFrame[which(completeFrame$Generation==1),],aes(yintercept=2.5))
h3 <- h3+geom_hline(data=completeFrame[which(completeFrame$Generation==3),],aes(yintercept=mean(P_Value)))
h3
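For the vertical-line variant asked about in the question, one possible sketch is to put one row per facet into a small data frame and pass it to geom_vline; because that data frame carries the faceting variable, each line lands only in its own panel. The names vlines, xint, above and threshold are made up for the example, and placing each line at the facet mean is just an assumed rule.

```r
library(ggplot2)

# Simulated data, as in the answer above
set.seed(1)
completeFrame <- data.frame(
  P_Value    = rnorm(200, 0.8, 0.1),
  Generation = factor(rep(1:4, times = 50))
)

# One row per facet: here each line sits at that facet's mean (assumed rule)
vlines <- aggregate(P_Value ~ Generation, data = completeFrame, FUN = mean)
names(vlines)[2] <- "xint"

# Flag values above an assumed threshold so the bars can be coloured
threshold <- 0.8
completeFrame$above <- completeFrame$P_Value > threshold

h <- ggplot(completeFrame, aes(P_Value, fill = above)) +
  # bins are aligned to the threshold so no bar mixes both colours
  geom_histogram(binwidth = 0.02, colour = "black", boundary = threshold) +
  scale_fill_manual(values = c(`FALSE` = "black", `TRUE` = "red"),
                    guide = "none") +
  # a different vertical line in each facet, taken from the vlines data frame
  geom_vline(data = vlines, aes(xintercept = xint), linetype = "dashed") +
  facet_wrap(~ Generation)
print(h)
```

Any per-facet position can be used in place of the mean, as long as the data frame passed to geom_vline has a Generation column matching the facets.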

Is it possible to create 3 series (2 lines and one point) faceted plot in ggplot?

I am trying to port code that I wrote with R's base graphics package to ggplot2.
The graph I obtained using the base graphics package is as follows:
I was wondering whether this type of graph is possible to create in ggplot2. I think we could create it by using panels, but is it possible to use faceting for this kind of plot? The major difficulty I encountered is that the maximum and minimum series have the same length, whereas the observed data are not continuous and are recorded at quite different intervals.
Any thoughts on arranging the data for this type of plot would be very helpful. Thank you so much.
Jdbaba,
From your comments, you mentioned that you'd like the geom_point to have just the . in the legend. This is a feature that has yet to be implemented for direct use in ggplot2 (if I am right). However, there's a fix/work-around given by @Aniko in this post. It's a bit tricky but brilliant! And it works great. Here's a version that I tried out. Hope it is what you expected.
# bind both your data.frames
df <- rbind(tempcal, tempobs)
p <- ggplot(data = df, aes(x = time, y = data, colour = group1,
linetype = group1, shape = group1))
p <- p + geom_line() + geom_point()
p <- p + scale_shape_manual("", values=c(NA, NA, 19))
p <- p + scale_linetype_manual("", values=c(1,1,0))
p <- p + scale_colour_manual("", values=c("#F0E442", "#0072B2", "#D55E00"))
p <- p + facet_wrap(~ id, ncol = 1)
p
The idea is to first create a plot with all necessary attributes set in the aesthetics section, plot what you want, and then change the settings manually afterwards using the scale_*_manual functions. For example, you can unset lines with a 0 in scale_linetype_manual. Similarly, you can unset points for lines using NA in scale_shape_manual. Here, the first two values are for group1 = maximum and minimum, and the last is for observed. So we set the shape to NA for the first two (maximum and minimum) and set the linetype to 0 for observed.
And this is the plot:
Solution found:
Thanks to Arun and Andrie
Just in case somebody needs the solution of this sort of problem.
The code I used was as follows:
library(ggplot2)
tempcal <- read.csv("temp data ggplot.csv",header=T, sep=",")
tempobs <- read.csv("temp data observed ggplot.csv",header=T, sep=",")
p <- ggplot(tempcal,aes(x=time,y=data))+geom_line(aes(x=time,y=data,color=group1))+geom_point(data=tempobs,aes(x=time,y=data,colour=group1))+facet_wrap(~id)
p
The dataset used were https://www.dropbox.com/s/95sdo0n3gvk71o7/temp%20data%20observed%20ggplot.csv
https://www.dropbox.com/s/4opftofvvsueh5c/temp%20data%20ggplot.csv
The plot obtained was as follows:
Jdbaba
