Make multiple ggplots have the same point colours in R

I need to show 3 ggplot scatterplots and one dendrogram on one page. How can I make the point colours the same in each scatter plot (i.e. I need the points for group two to be the same colour for all 3 graphs).
require(graphics)
require(ggplot2)
require(ggdendro)
#Scatter plots
df1<-data.frame(x=c(3,4,5),y=c(15,20,25),grp=c(1,2,2))
df1$grp =factor(df1$grp)
colnames(df1)[3]="Group"
p<-ggplot(df1,aes(x,y))
p<-p+ geom_point(aes(colour=factor(Group)),size=4)
p1<-p + coord_fixed()
df2<-data.frame(x=c(3,4,5,6),y=c(15,20,25,30),grp=c(1,2,2,3))
df2$grp =factor(df2$grp)
colnames(df2)[3]="Group"
p<-ggplot(df2,aes(x,y))
p<-p+ geom_point(aes(colour=factor(Group)),size=4)
p2<-p + coord_fixed()
df3<-data.frame(x=c(3,4,5,6,7),y=c(15,20,25,30,35),grp=c(1,2,2,3,4))
df3$grp =factor(df3$grp)
colnames(df3)[3]="Group"
p<-ggplot(df3,aes(x,y))
p<-p+ geom_point(aes(colour=factor(Group)),size=4)
p3<-p + coord_fixed()
#Dendrogram
dis <- hclust(dist(USArrests), "ave")
d<-as.dendrogram(dis)
ddata<-dendro_data(d,type="rectangle")
dp<-ggplot(segment(ddata)) + geom_segment(aes(x=x,y=y,xend=xend,yend=yend))
dp<-dp+geom_hline(aes(yintercept=50),colour="red")
I tried using the multiplot function
multiplot(p1,p2,p3,dp,cols=2)
and got:
Bonus: The graphs all have a fixed aspect ratio, so the scatterplots come out at different sizes. That is fine, but I really don't need the scatterplots to take up so much space. How can I control how much space each plot is given in the final figure?
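One possible approach, sketched here under assumptions rather than taken from the original post: define a single named colour vector that covers every group level occurring in any of the data frames, and pass it to scale_colour_manual() in each plot. A level then maps to the same colour no matter which levels a particular plot contains. The name group_cols, the helper make_scatter and the hex colours are illustrative choices. For the bonus question, gridExtra::grid.arrange() accepts widths and heights arguments for relative sizing; that part assumes the gridExtra package is installed.

```r
library(ggplot2)

# Hypothetical shared palette: one fixed colour per group level (values are arbitrary).
group_cols <- c("1" = "#1b9e77", "2" = "#d95f02", "3" = "#7570b3", "4" = "#e7298a")

df1 <- data.frame(x = c(3, 4, 5), y = c(15, 20, 25),
                  Group = factor(c(1, 2, 2)))
df2 <- data.frame(x = c(3, 4, 5, 6), y = c(15, 20, 25, 30),
                  Group = factor(c(1, 2, 2, 3)))
df3 <- data.frame(x = c(3, 4, 5, 6, 7), y = c(15, 20, 25, 30, 35),
                  Group = factor(c(1, 2, 2, 3, 4)))

# Illustrative helper: builds one scatter plot that uses the shared colour scale.
make_scatter <- function(df) {
  ggplot(df, aes(x, y, colour = Group)) +
    geom_point(size = 4) +
    scale_colour_manual(values = group_cols) +
    coord_fixed()
}

p1 <- make_scatter(df1)
p2 <- make_scatter(df2)
p3 <- make_scatter(df3)

# Optional layout control, assuming gridExtra is available:
# relative column widths decide how much horizontal space each panel gets.
if (requireNamespace("gridExtra", quietly = TRUE)) {
  gridExtra::grid.arrange(p1, p2, p3, ncol = 3, widths = c(1, 1, 2))
}
```

Because the palette is named, group 2 is drawn in the same colour in p1, p2 and p3 even though each data frame has a different set of levels.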

Related

How to change dots in forest plot?

I have an excel table with the data of the Odds Ratios of different diseases for my study. I want to make a forestplot with the R package ggplot2. I have used this script:
library(ggplot2)
df <- readxl::read_excel("excel.xlsx") # assumes the readxl package; columns Disease, OR, Lower, Upper
fp <- ggplot(data=df, aes(x=Disease, y=OR, ymin=Lower, ymax=Upper)) +
geom_pointrange() +
geom_hline(yintercept=1, lty=2) + # add a dashed reference line at OR = 1 (vertical after the flip)
coord_flip() + # flip coordinates (puts labels on y axis)
xlab("Disease") + ylab("OR (95% CI)") +
theme_bw() # use a white background
print(fp)
This draws round black points for all diseases. I would like to change the shape of the points to squares (or some other shape), but only for some diseases: the points corresponding to rows 6, 8, 14 and 16 should change, and the rest should stay as they are now.
Thank you in advance.
I have tried this script, but it only produces black points.
The example code is not reproducible as I write this answer, but I think you just need to specify shape in aes().
This question includes a complete example with multiple shapes.
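A minimal sketch of the shape-in-aes() suggestion, under assumptions: the data below is made up to stand in for the Excel table, and the column name Highlight is a hypothetical helper. A logical column flags rows 6, 8, 14 and 16, is mapped to shape, and scale_shape_manual() assigns squares to the flagged rows and circles to the rest.

```r
library(ggplot2)

# Made-up odds ratios standing in for the Excel table (16 rows).
set.seed(1)
df <- data.frame(Disease = sprintf("Disease %02d", 1:16),
                 OR = round(runif(16, 0.6, 2), 2))
df$Lower <- df$OR * 0.7
df$Upper <- df$OR * 1.4

# Flag the rows whose points should be drawn with a different shape.
df$Highlight <- seq_len(nrow(df)) %in% c(6, 8, 14, 16)

fp <- ggplot(df, aes(x = Disease, y = OR, ymin = Lower, ymax = Upper,
                     shape = Highlight)) +
  geom_pointrange() +
  # 16 = filled circle (default look), 15 = filled square
  scale_shape_manual(values = c("FALSE" = 16, "TRUE" = 15), guide = "none") +
  geom_hline(yintercept = 1, lty = 2) +
  coord_flip() +
  xlab("Disease") + ylab("OR (95% CI)") +
  theme_bw()
print(fp)
```

Flagging by row position mirrors the question; flagging by disease name (for example with %in% on the Disease column) would be more robust if the table is ever re-sorted.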

How to make half-whiskers in a ggplot2 line graph?

I'm making very slow progress in R, but I'm now able to do some things.
Right now I'm plotting the effects of 4 treatments on plant growth in one graph. As you can see, the error bars overlap, which is why I gave them different colours. To make the graph clearer, I think it would be better to draw only the lower error bars ("half-whiskers") for the lower two lines, and only the upper error bars for the top two lines (as I have now); see the attached image for reference.
Is that doable with the way my script is set up now?
Here is the part of my script that specifies the plot itself (leaving out the aesthetics and so on). Thanks in advance.
"soda1" is my reshaped data frame, "sdtv" holds the standard deviations for each time point/treatment, "oppervlak" is my y variable and "Measuring Date" is my x variable. "Tray ID" is the treatment, so it is my grouping variable.
p <- ggplot(soda1, aes(x=reorder(`Measuring Date`, oppervlak), y=`oppervlak`, group=`Tray ID`, fill=`Tray ID`, colour = `Tray ID` )) +
scale_fill_brewer(palette = "Spectral") +
geom_errorbar(data=soda1, mapping=aes(ymin=oppervlak, ymax=oppervlak+sdtv, group=`Tray ID`), width=0.1) +
geom_line(aes(linetype=`Tray ID`)) +
geom_point(mapping=aes(x=`Measuring Date`, y=oppervlak, shape=`Tray ID`))
print(p)
Showing only one side of errorbars can hide an overlap in the uncertainty between the distribution of two or more variables or measurements.
Instead of hiding this overlap, you could adjust the position of your errorbars horizontally very easily by adding position=position_dodge(width=) to your call to geom_errorbar().
For example:
library(ggplot2)
# some random data with two factors
df <- data.frame(a=rep(1:10, times=2),
b=runif(20),
treat=as.factor(rep(c(0,1), each=10)),
errormax=runif(20),
errormin=runif(20))
# plotting both sides of the errorbars, but dodging them horizontally
p <- ggplot(data=df, aes(x=a, y=b, colour=treat)) +
geom_line() +
geom_errorbar(data=df, aes(ymin=b-errormin, ymax=b+errormax),
position=position_dodge(width=0.25))
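If one-sided error bars are still wanted despite the caveat above, one way to choose the direction per treatment is to compute the ymin and ymax columns before plotting. This is only a sketch under assumptions: the data is made up, and the column names sd, ymin and ymax and the treatment labels "low" and "high" are hypothetical, not taken from the question's data frame.

```r
library(ggplot2)

# Illustrative data: two treatments over ten time points (all values made up).
set.seed(42)
df <- data.frame(a = rep(1:10, times = 2),
                 treat = factor(rep(c("low", "high"), each = 10)),
                 b = c(runif(10, 1, 2), runif(10, 3, 4)),
                 sd = runif(20, 0.1, 0.5))

# The lower line gets a downward whisker only, the upper line an upward one.
is_low <- df$treat == "low"
df$ymin <- ifelse(is_low, df$b - df$sd, df$b)
df$ymax <- ifelse(is_low, df$b, df$b + df$sd)

p <- ggplot(df, aes(x = a, y = b, colour = treat)) +
  geom_line() +
  geom_errorbar(aes(ymin = ymin, ymax = ymax), width = 0.1)
print(p)
```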

ggplot2 adding stacked barchart to heatmap

I would like to add functional information to a HeatMap (geom_tile). I've got the following simplified DataFrame and R code producing a HeatMap and a separate stacked BarPlot (in the right order, corresponding to the HeatMap).
Question:
How can I add the BarPlot to the right edge/side of the Heatmap?? It shouldn't overlap with any of the tiles, and the tiles of the BarPlot should align with the tiles of the HeatMap.
Data:
AccessionNumber <- c('A4PU48','A9YWS0','B7FKR5','G4W9I5','B7FGU7','B7FIR4','DY615543_2','G7I6Q7','G7I9C1','G7I9Z0','A4PU48','A9YWS0','B7FKR5','G4W9I5','B7FGU7','B7FIR4','DY615543_2','G7I6Q7','G7I9C1','G7I9Z0','A4PU48','A9YWS0','B7FKR5','G4W9I5','B7FGU7','B7FIR4','DY615543_2','G7I6Q7','G7I9C1','G7I9Z0','A4PU48','A9YWS0','B7FKR5','G4W9I5','B7FGU7','B7FIR4','DY615543_2','G7I6Q7','G7I9C1','G7I9Z0')
Bincode <- c(13,25,29,19,1,1,35,16,4,1,13,25,29,19,1,1,35,16,4,1,13,25,29,19,1,1,35,16,4,1,13,25,29,19,1,1,35,16,4,1)
MMName <- c('amino acid metabolism','C1-metabolism','protein','tetrapyrrole synthesis','PS','PS','not assigned','secondary metabolism','glycolysis','PS','amino acid metabolism','C1-metabolism','protein','tetrapyrrole synthesis','PS','PS','not assigned','secondary metabolism','glycolysis','PS','amino acid metabolism','C1-metabolism','protein','tetrapyrrole synthesis','PS','PS','not assigned','secondary metabolism','glycolysis','PS','amino acid metabolism','C1-metabolism','protein','tetrapyrrole synthesis','PS','PS','not assigned','secondary metabolism','glycolysis','PS')
cluster <- c(1,2,2,2,3,3,4,4,4,4,1,2,2,2,3,3,4,4,4,4,1,2,2,2,3,3,4,4,4,4,1,2,2,2,3,3,4,4,4,4)
variable <- c('rd2c_24','rd2c_24','rd2c_24','rd2c_24','rd2c_24','rd2c_24','rd2c_24','rd2c_24','rd2c_24','rd2c_24','rd2c_48','rd2c_48','rd2c_48','rd2c_48','rd2c_48','rd2c_48','rd2c_48','rd2c_48','rd2c_48','rd2c_48','rd2c_72','rd2c_72','rd2c_72','rd2c_72','rd2c_72','rd2c_72','rd2c_72','rd2c_72','rd2c_72','rd2c_72','rd2c_96','rd2c_96','rd2c_96','rd2c_96','rd2c_96','rd2c_96','rd2c_96','rd2c_96','rd2c_96','rd2c_96')
value <- c(2.15724042939,1.48366099919,1.29388509992,1.59969471112,1.82681962192,2.13347487296,1.08298157478,1.20709456306,1.02011775131,0.88018823632,1.41435923375,1.31680079684,1.32041325076,1.23402873856,2.04977975574,1.90651971106,0.911615352178,1.05021352328,1.18437303394,1.05620421143,1.02132613918,1.22080237755,1.40759491365,1.43131574695,1.65848581311,1.91886008221,0.639581269674,1.11779720968,1.09406554542,1.02259316617,1.00529867534,1.30885290475,1.39376458384,1.35503544429,1.81418617518,1.92505106722,0.862870707741,1.0832577668,1.03118887309,1.21310404226)
df <- data.frame(AccessionNumber, Bincode, MMName, cluster, variable, value)
HeatMap plot:
hm <- ggplot(df, aes(x=variable, y=AccessionNumber))
hm + geom_tile(aes(fill=value), colour = 'white') + scale_fill_gradient2(low='blue', midpoint=1, high='red')
stacked BarPlot:
bp <- ggplot(df, aes(x=sum(df$Bincode), fill=MMName))
bp + stat_bin(aes(ymax = ..count..), binwidth = 1, geom='bar')
Thank you very much for your help/support!!
The variables of the y-axis are sorted first by increasing "cluster" then alphabetically by "AccessionNumber". This is true for both the HeatMap as well as the BarPlot. The values appear in the same order in both plots, but show two different variables (same amount of rows and in the same order, but different content). The HeatMap displays a continuous variable in contrast to the BarPlot which displays a categorical variable. Therefore, the plots could be combined, displaying additional information.
Please help!

Can the minimum y-value be adjusted when using scales = "free" in ggplot?

Using the following data set:
day <- gl(8,1,48,labels=c("Mon","Tues","Wed","Thurs","Fri","Sat","Sun","Avg"))
day <- factor(day, levels=c("Mon","Tues","Wed","Thurs","Fri","Sat","Sun","Avg"))
month<-gl(3,8,48,labels=c("Jan","Mar","Apr"))
month<-factor(month,levels=c("Jan","Mar","Apr"))
snow<-gl(2,24,48,labels=c("Y","N"))
snow<-factor(snow,levels=c("Y","N"))
count <- c(.94,.95,.96,.98,.93,.94,.99,.9557143,.82,.84,.83,.86,.91,.89,.93,.8685714,1.07,.99,.86,1.03,.81,.92,.88,.9371429,.94,.95,.96,.98,.93,.94,.99,.9557143,.82,.84,.83,.86,.91,.89,.93,.8685714,1.07,.99,.86,1.03,.81,.92,.88,.9371429)
d <- data.frame(day=day,count=count,month=month,snow=snow)
I like the y-scale in this graph, but not the bars:
library(ggplot2)
library(scales) # provides percent_format()
ggplot()+
geom_line(data=d[d$day!="Avg",],aes(x=day, y=count, group=month, colour=month))+
geom_bar(data=d[d$day=="Avg",],aes(x=day, y=count, fill=month),position="dodge", group=month)+
scale_x_discrete(limits=levels(d$day))+
facet_wrap(~snow,ncol=1,scales="free")+
scale_y_continuous(labels = percent_format())
I like the points, but not the scale:
ggplot(data=d[d$day=="Avg",],aes(x=day, y=count, fill=month,group=month,label=month),show_guide=F)+
facet_wrap(~snow,ncol=1,scales="free")+
geom_line(data=d[d$day!="Avg",],aes(x=day, y=count, group=month, colour=month), show_guide=F)+
scale_x_discrete(limits=levels(d$day))+
scale_y_continuous(labels = percent_format())+
geom_point(aes(colour = month),size = 4,position=position_dodge(width=1.2))
How to combine the desirable qualities in the above graphs?
Essentially, I'm asking: How can I graph the points with a varied y-max while setting the y-min to zero?
Note: The solution that I'm aiming to find will apply to about 27 graphs built from one dataframe. So I'll vote up those solutions that avoid alterations to individual graphs. I'm hoping for a solution that applies to all the facet wrapped graphs.
Minor Questions (possibly for a separate post):
- How can I add a legend to each of the facet wrapped graphs?
- How can I change the title of the legend to read "Weekly Average"?
- How can the shape/color of the lines/points be varied and then reported in one single legend?
There's expand_limits(y=0), which essentially adds a dummy layer with an invisible geom_blank whose only purpose is to stretch the scales.
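A small sketch of how that could look with free facet scales. The data here is made up for illustration (the column name panel and its values are hypothetical); the point is that each facet keeps its own y maximum while its range is stretched down to include zero.

```r
library(ggplot2)

# Small made-up data set with two facets on very different scales.
d <- data.frame(day = rep(c("Mon", "Tues", "Wed"), times = 2),
                count = c(0.94, 0.95, 0.96, 9.2, 9.8, 9.5),
                panel = rep(c("A", "B"), each = 3))

p <- ggplot(d, aes(x = day, y = count)) +
  geom_point(size = 4) +
  facet_wrap(~panel, ncol = 1, scales = "free") +
  expand_limits(y = 0) # stretches every facet's y range to include 0
print(p)
```

Since expand_limits() is added once to the plot object, it applies to all facets, which fits the requirement of not altering individual graphs.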

Problems making a graphic in ggplot

I am working with ggplot and want to design a graphic with two continuous variables. I would like to get a graphic like this:
Here x and y are the continuous variables. My problem is that I can't get it to show circles on the line of the plot. I would like the plot to have a circle for each pair of observations from the continuous variables. For example, the attached graphic has a circle for the pairs (1,1), (2,2) and (3,3). Is it possible to get this? (The colour of the line doesn't matter.)
# dummy data
dat <- data.frame(x = 1:5, y = 1:5)
ggplot(dat, aes(x,y,color=x)) +
geom_line(size=3) +
geom_point(size=10) +
scale_colour_continuous(low="blue",high="red")
Playing with low/high will change the colours.
In general, to remove the legend, use + theme(legend.position="none")
