In ggplot2, how to add another y-axis in the figure? - r

Here is my data (a CSV file, read into a data frame called "data"):
attitude,order,min,max,mean,SpRate
Commanding,7,0.023005096,1.6517,0.681777825,5.66572238
Friendly,10,0.20565908,1.7535,0.843770095,6.191950464
Hostile,12,0.105828885,2.4161,1.128603777,6.493506494
Insincere,1,0.110689225,1.5551,0.730545923,5.115089514
Irony,4,0.089307133,2.2395,0.955312553,5.249343832
Joking,2,0.165717303,2.1871,0.94512688,5.141388175
Neutral,5,-0.044620705,1.5322,0.696879247,5.420054201
Polite,11,0.170151929,1.8467,0.873735105,6.191950464
Praising,8,0.192402573,2.0631,0.972857404,5.797101449
Rude,13,0.249746688,2.2885,1.100819511,6.644518272
Serious,6,0.011312206,1.7195,0.693606814,5.649717514
Sincere,9,-0.09135461,1.6409,0.659525513,5.813953488
Suggesting,3,0.072541529,1.8345,0.82999014,5.249343832
Here is my code:
library(ggplot2)
ggplot (data, aes(x=order))+
geom_rect(aes(xmin=order-0.1, xmax=order+0.1, ymin = min, ymax=max), size=1, alpha=0,color="black")+
geom_bar(aes(y=SpRate, fill="SpRate"),stat="identity", alpha=0.2, width=0.9)+
geom_point(aes(y=min, shape="min"), size=5, fill="white")+
geom_point(aes(y=mean, shape="mean"), size=5)+
geom_point(aes(y=max, shape="max"), size=5)+
scale_x_continuous(breaks=c(1:13), labels=c("Insincere","Joking","Suggesting","Irony","Neutral","Serious","Commanding","Praising","Sincere","Friendly","Polite","Hostile", "Rude"))+
xlab("")+ylab("")+theme_bw()+
theme(axis.text.x=element_text(size=25,angle=45, vjust=0.5, color="black"))+
theme(legend.text = element_text(size = 20))+
theme(legend.title = element_text(size = 20))+
labs(shape = "f0:", fill = "SpRate:")+
scale_shape_manual(values=c("min"=15,"max"=16,"mean"=18))+
scale_fill_manual(values= "black")+
theme(axis.text.y = element_text(size=20))
So, as you can see from the plot, there are really two plots: a rectangle plot with points and a bar plot. The y-axis of the bar plot clearly does not fit the y-axis shown, so how can I add a second y-axis on the right of the plot that suits the bar plot better? (That is, I want the y-axis for the rectangles to run from 0 to 2.5 and the one for the bars from 0 to 7.)

You can add a second y-axis in ggplot2.
For a single-panel plot, see this example (http://rpubs.com/kohske/dual_axis_in_ggplot2).
For a multi-panel plot, see my example (Dual y axis in ggplot2 for multiple panel figure).
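As a rough sketch of the same idea on ggplot2 2.2.0 or later (which provides sec_axis()), the bar values can be rescaled onto the primary axis and the secondary axis labelled with the inverse transformation. The data frame below copies four rows from the question; the name ratio, the axis titles, and the simplified geoms are assumptions for illustration and are not taken from the linked examples.

```r
library(ggplot2)

# Four rows copied from the question's data, enough to illustrate the idea
df <- data.frame(
  attitude = c("Insincere", "Joking", "Suggesting", "Irony"),
  order    = c(1, 2, 3, 4),
  min      = c(0.110689225, 0.165717303, 0.072541529, 0.089307133),
  max      = c(1.5551, 2.1871, 1.8345, 2.2395),
  mean     = c(0.730545923, 0.94512688, 0.82999014, 0.955312553),
  SpRate   = c(5.115089514, 5.141388175, 5.249343832, 5.249343832)
)

# Ratio between the two requested ranges: 0-7 (right axis) and 0-2.5 (left axis)
ratio <- 7 / 2.5

p <- ggplot(df, aes(x = order)) +
  # Bars are drawn in left-axis units, so SpRate is divided by the ratio
  geom_col(aes(y = SpRate / ratio), alpha = 0.2, width = 0.9) +
  geom_linerange(aes(ymin = min, ymax = max)) +
  geom_point(aes(y = mean), size = 3) +
  scale_x_continuous(breaks = df$order, labels = df$attitude) +
  scale_y_continuous(
    name = "f0", limits = c(0, 2.5),
    # The right axis undoes the rescaling, so its labels read in SpRate units
    sec.axis = sec_axis(~ . * ratio, name = "SpRate")
  ) +
  theme_bw()
```

Printing p draws the plot. The right axis is only a relabelling of the left one, which is why the bar heights have to be rescaled by hand.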

Related

ggplot2 dodged boxplot with geom_point dodging and unequal number of subgroups

I am attempting to plot a dodged boxplot but I am running into a couple of difficulties. The x-axis has two levels of grouping: the "letter groups" (A, B, C, etc.) are the main groups, which I specify as my x aesthetic (X_main_group). Within each main group I have subgroups called "X_group", and the boxes are coloured by subgroup. The problem is that each letter group has a different number of subgroups, e.g. for x=A I have 4 subgroups but for x=B I have only one. First, the dodging of the plotted points no longer works (see the example plot below): the points do not align with the dodged boxplots. Second, the boxes are no longer centered around the x-axis tick, which is most obvious for x=B. How can I fix this?
I would also like to have small x-axis ticks below each subgroup (so 4 ticks for x=A, 1 tick for x=B, 3 for x=C, etc.), but this has lower priority. I have attached the figure, and in red I drew some examples of what I hope to achieve with the tick marks. The ggplot2 code is shown below. I would like to provide a reproducible example, but I cannot manage to write code that creates a data frame with unequal numbers of subgroups, so that people who want to help can run it. I can only make "symmetrical" data frames...
cbpallette <- c("#999999", "#666666", "#333333", "#000000", "#003300")
p1 <- ggplot(data=df, aes(x=X_main_group,y=Intensity, colour=factor(X_group))) + stat_boxplot(geom = "errorbar", width=.4, position = position_dodge(0.5, preserve="single")) + geom_boxplot(width=0.5, outlier.shape=NA, position=position_dodge(preserve = "single")) + theme_classic() + geom_point(position=position_jitterdodge(), alpha=0.3)
p2 <- p1 + scale_colour_manual(values = cbpallette) + theme(legend.position = "none") + theme(axis.ticks.length = unit(-0.1, "cm"), axis.text.x = element_text(size=30, vjust=-0.4), axis.text.y=element_text(size=35, hjust = 0.5, angle=45), axis.title = element_blank())
p3 <- p2 + theme(axis.text.x = element_text(margin = margin(t = .5, unit = "cm")), axis.text.y = element_text(margin = margin(r = .5, unit = "cm")))
p3
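For the reproducibility problem mentioned above, one possible way to build an unbalanced data frame in base R is sketched here. The column names follow the question; the subgroup counts, the number of observations per subgroup, and the simulated Intensity values are assumptions made up for illustration.

```r
set.seed(1)

# Number of subgroups in each main group (4 for A, 1 for B, 3 for C); assumed values
n_sub <- c(A = 4, B = 1, C = 3)
n_obs <- 10  # observations per subgroup; also an assumption

# One row per (main group, subgroup) combination that actually exists
layout <- data.frame(
  X_main_group = rep(names(n_sub), times = n_sub),
  X_group      = unlist(lapply(n_sub, seq_len), use.names = FALSE)
)

# Repeat each combination n_obs times and attach simulated measurements
df <- layout[rep(seq_len(nrow(layout)), each = n_obs), ]
df$Intensity <- rnorm(nrow(df), mean = 10, sd = 2)
rownames(df) <- NULL

# Cross-tabulation showing the unequal number of subgroups per main group
table(df$X_main_group, df$X_group)
```

A data frame built this way can be passed as df to the plotting code above.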

Plot a line graph linked to a secondary y axis [duplicate]

I am trying to replicate what can easily be done in Excel.
Using ggplot, I tried to plot the following:
A bar chart, where the left y-axis shows counts (0-600).
A line graph, where the right y-axis shows percentages (0-100).
Qn 1. Can someone explain how I can link my percentage data to my secondary axis? Currently the line graph (which should represent the %) is plotted against the primary y-axis, using the counts scale.
Qn 2. How can I change the two scales independently?
Qn 3. How can I name the two scales independently?
ggplot() +
geom_bar(data=data,aes(x=sch,y=count,fill=category),stat = "identity")+
scale_fill_manual(values=c("darkcyan", "indianred1")) +
geom_line(data=data_percentage, aes(x=sch, y=count, group=1)) +
geom_point(data=data_percentage, aes(x=sch, y=count, group=1)) +
geom_text(data=data_percentage,aes(x=sch,y=count,label=paste(count,"%",sep="")),size=3) +
scale_y_continuous(sec.axis = sec_axis(~./2), name="%")+
theme(panel.background = element_blank(),
axis.line = element_line(colour = "black", size = 0.5, linetype = "solid"),
plot.title = element_text(size=11, face="bold", hjust=0.3),
legend.position = "top", legend.text = element_text(size=9)) +
labs(fill="") + guides(fill = guide_legend(reverse=TRUE))+
ylab("No. Recruited") + ggtitle("2. No. of students")
Answer 1: You don't link the geom to an axis. Instead, you scale its values up or down to be consistent with your secondary axis. In the example you provided, sec.axis is scaled by ~./2, so the y aesthetic in both geom_line and geom_point should be count*2. This gives the appearance that the line is linked to the secondary axis.
Answer 2: You can't. In ggplot, the secondary axis must be a one-to-one transformation of the primary axis. I don't know whether another package can do that.
Answer 3: Move the name argument from scale_y_continuous into sec_axis, as in the example code below.
The code will look something like this:
ggplot() +
.
.
geom_line(data = data_percentage, aes(x=sch, y=count*2, group=1)) +
geom_point(data = data_percentage, aes(x=sch, y=count*2, group=1)) +
.
.
scale_y_continuous(sec.axis = sec_axis(~./2, name="%"))+
.
.
.
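A fuller version of the outline above might look like the following sketch. The question's data frames are not shown, so counts and pct here are hypothetical stand-ins, and the factor k is chosen so that 100% lines up with a count of 600 instead of the ~./2 used in the question.

```r
library(ggplot2)

# Hypothetical data; the question's data and data_percentage are not shown
counts <- data.frame(sch = c("A", "B", "C"), count = c(600, 450, 300))
pct    <- data.frame(sch = c("A", "B", "C"), count = c(80, 55, 30))

# The primary axis spans 0-600 and the secondary 0-100, so 1% corresponds to 6 counts
k <- 600 / 100

p <- ggplot() +
  geom_col(data = counts, aes(x = sch, y = count), fill = "darkcyan") +
  # Percentages are multiplied by k so they are drawn in primary-axis units
  geom_line(data = pct, aes(x = sch, y = count * k, group = 1)) +
  geom_point(data = pct, aes(x = sch, y = count * k)) +
  geom_text(data = pct,
            aes(x = sch, y = count * k, label = paste0(count, "%")),
            vjust = -1, size = 3) +
  scale_y_continuous(
    name = "No. Recruited",
    # The secondary axis divides by k, so its labels read as percentages
    sec.axis = sec_axis(~ . / k, name = "%")
  )
```

Each axis gets its own title through the name argument of scale_y_continuous() and sec_axis() respectively.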

How to vertically arrange ggplots with single set of axes and legend?

I'd like to vertically arrange my stacked geom_bar objects and display them with unbroken vertical lines (see concept below) and a single set of axes and legend. I'm using plot_grid now but should perhaps be using facet wrapping? I'm unsure whether that would allow me to place vertical lines. The code that generates my current plot is here.
my concept:
my current plot:
You could create your plots and disable the axis text, line and ticks. Then make the axis titles match the background color so they are not visible (but retain the same graph dimensions) and plot them with plot_grid() as you are doing. Then overlay a full sized plot with zero data, the axis titles and vertical lines over the top of it using draw_plot(). For the single legend, leverage the following SO answer:
Align multiple plots in ggplot2 when some have legends and others don't
The code:
#!/usr/bin/env Rscript
if (!require("pacman")) install.packages("pacman")
pacman::p_load(ggplot2, cowplot)
### Create some garbage data to plot
d0 <- data.frame(foo=c(0,0,0,0,0),bar=c("SX_RUNNYNOSE","SX_COUGH","SX_HEADACHE","SX_MALAISE","SX_MYALGIA"))
d1 <- data.frame(foo=c(1,2,3,4,5),bar=c("SX_RUNNYNOSE","SX_COUGH","SX_HEADACHE","SX_MALAISE","SX_MYALGIA"))
### Create a plot with 0 data but having the axis titles and vertical lines
p0 <- ggplot(d0, aes(x=seq(1,5), y=foo, fill=bar)) +
geom_bar(stat="identity") +
theme(axis.text.x=element_blank(),
axis.text.y=element_blank(),
axis.line.x=element_blank(),
axis.line.y=element_blank(),
axis.ticks.x=element_blank(),
axis.ticks.y=element_blank()
) +
theme(legend.position = "none") +
geom_segment(aes(x=2, y=0, xend=2, yend=4.9), color='red') +
geom_text(aes(x=2, y=max(d1$foo), label="T0")) +
geom_segment(aes(x=3, y=0, xend=3, yend=4.9), color='red') +
geom_text(aes(x=3, y=max(d1$foo), label="T24")) +
labs(y="Continued Symptom Count Among Samples", x="Time Elapsed Since Viral Challenge")
### A bar plot with the sample data and only the bars (no axis, etc)
### Make color of axis titles white to match the background color so they are not visible
p1 <- ggplot(d1, aes(x=seq(1,5), y=foo, fill=bar)) +
geom_bar(stat="identity") +
theme(axis.text.x=element_blank(),
axis.text.y=element_blank(),
axis.line.x=element_blank(),
axis.line.y=element_blank(),
axis.ticks.x=element_blank(),
axis.ticks.y=element_blank(),
axis.title.x = element_text(colour = "white"),
axis.title.y = element_text(colour = "white")
) +
theme(legend.title=element_blank())
### Arrange bar plots and legends in a grid and use draw_plot to
### overlay the single axis titles and vertical bars across all
### plots
g <- plot_grid(
plot_grid(
p1 + theme(legend.position = "none")
, p1 + theme(legend.position = "none")
, p1 + theme(legend.position = "none")
, ncol = 1
, align = "h"
, labels=c("Rhinovirus", "H3N2", "H1N1")
, hjust=c(-0.5,-1,-1)) +
draw_plot(p0, 0, 0, 1, 1, 1)
, plot_grid(
ggplot()
, get_legend(p1)
, ggplot()
, ncol =1)
, rel_widths = c(9,3)
)
g
The result:

How to label the first few points and create non-overlapping labels on a plot using direct.label

How can I easily create a plot where the text labels do not overlap?
Also, how could I create a plot where I label only the first few points? As in the image below, I always want to label the bottom left-hand part of the plot.
xx<-c(2.25,5.5,5,9.5,7.75,14,24.5,20.75,28,25.5,11.25,17.75,11.75,20.5,23.5,5,10.5,5.5,11,12.5,15,26.75,15.25,24.25,27.75,10.25,22,11.25,18,22.5)
yy<-c(2.75,10.5,9.25,13.5,12,20,24.75,22,29,26.75,13,16.75,13.5,21,23,5.75,7.75,6.75,10.5,6.25,13.5,24.75,14,25.5,26.75,9.5,16.25,10.5,14.5,15)
nm_plot<-c("lastrem_0.5_NN","lastrem_0.25_NN","pt_0.5_NN","pt_0.25_NN","lastrem_NN","lastrem_0.5_area","lastrem_0.25_area","pt_0.5_area","pt_0.25_area","lastrem_area","lastrem_0.5_100","lastrem_100","lastrem_0.25_100","pt_0.5_100","pt_0.25_100","lastrem_0.5_100area","lastrem_100area","lastrem_0.25_100area","pt_0.5_100area","pt_0.25_100area","lastrem_0.5_200","lastrem_200","lastrem_0.25_200","pt_0.5_200","pt_0.25_200","lastrem_0.5_200area","lastrem_200area","lastrem_0.25_200area","pt_0.5_200area","pt_0.25_200area")
library(lattice)       # xyplot
library(directlabels)  # direct.label
library(grid)          # textGrob, gpar
direct.label(xyplot(yy~xx,groups=nm_plot,col="Black",
main=textGrob("7Q10",gp=gpar(fontsize=20,fontface="bold")),xlab="",ylab="",
scales=list(tck=c(1,0),cex=1.5),xlim=c(0,35),ylim=c(0,35)),list("last.bumpup",cex=1.5))
How can I create the plot below in R
Found a simple solution using ggplot2 and ggrepel.
xx<-c(2.25,5.5,5,9.5,7.75,14,24.5,20.75,28,25.5,11.25,17.75,11.75,20.5,23.5,5,10.5,5.5,11,12.5,15,26.75,15.25,24.25,27.75,10.25,22,11.25,18,22.5)
yy<-c(2.75,10.5,9.25,13.5,12,20,24.75,22,29,26.75,13,16.75,13.5,21,23,5.75,7.75,6.75,10.5,6.25,13.5,24.75,14,25.5,26.75,9.5,16.25,10.5,14.5,15)
nm_plot<-c("lastrem_0.5_NN","lastrem_0.25_NN","pt_0.5_NN","pt_0.25_NN","lastrem_NN","lastrem_0.5_area","lastrem_0.25_area","pt_0.5_area","pt_0.25_area","lastrem_area","lastrem_0.5_100","lastrem_100","lastrem_0.25_100","pt_0.5_100","pt_0.25_100","lastrem_0.5_100area","lastrem_100area","lastrem_0.25_100area","pt_0.5_100area","pt_0.25_100area","lastrem_0.5_200","lastrem_200","lastrem_0.25_200","pt_0.5_200","pt_0.25_200","lastrem_0.5_200area","lastrem_200area","lastrem_0.25_200area","pt_0.5_200area","pt_0.25_200area")
library(ggrepel)
library(ggplot2)
pp<-data.frame(xx,yy)
row.names(pp)<-nm_plot
plot1<-ggplot(pp) +
geom_point(aes(xx, yy), color = 'red') +
geom_text_repel(aes(xx, yy, label = rownames(pp))) +
theme_classic(base_size = 16)+theme_bw()+theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())+
theme(axis.title.x = element_blank())+theme(axis.title.y = element_blank())+
scale_y_continuous(breaks=seq(0,30,5))+scale_x_continuous(breaks=seq(0,30,5))+
ggtitle("7Q10")+theme(plot.title = element_text(lineheight=.8, face="bold"))
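For the second part of the question (labelling only the points in the bottom left), one option is to pass a subset of the data to geom_text_repel(). This is a sketch and not part of the answer above: it uses the first ten points from the question, and the cutoffs xx <= 10 and yy <= 15 are arbitrary assumptions chosen only to pick out that corner.

```r
library(ggplot2)
library(ggrepel)

# First ten points from the question, with the labels stored as a column
pp <- data.frame(
  xx = c(2.25, 5.5, 5, 9.5, 7.75, 14, 24.5, 20.75, 28, 25.5),
  yy = c(2.75, 10.5, 9.25, 13.5, 12, 20, 24.75, 22, 29, 26.75),
  label = c("lastrem_0.5_NN", "lastrem_0.25_NN", "pt_0.5_NN", "pt_0.25_NN",
            "lastrem_NN", "lastrem_0.5_area", "lastrem_0.25_area",
            "pt_0.5_area", "pt_0.25_area", "lastrem_area")
)

# Keep only the bottom-left points for labelling; the cutoffs are assumptions
corner <- subset(pp, xx <= 10 & yy <= 15)

plot2 <- ggplot(pp, aes(xx, yy)) +
  geom_point(color = "red") +                           # all points are drawn
  geom_text_repel(data = corner, aes(label = label)) +  # only the subset is labelled
  theme_bw()
```

Printing plot2 draws all ten points but labels only the five in the corner.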

R: ggplot2: Adding count labels to histogram with density overlay

I have a time series that I'm examining for data heterogeneity, and I wish to explain some important facets of it to some data analysts. I have a density histogram overlaid with a KDE plot (in order to see both plots, obviously). However, the original data are counts, and I want to place the count values as labels above the histogram bars.
Here is some code:
library(ggplot2)
# In R, a continued expression must end each line with "+", not start the next line with it
tix_hist <- ggplot(tix, aes(x=Tix_Cnt)) +
geom_histogram(aes(y = ..density..), colour="black", fill="orange", binwidth=50) +
xlab("Bin") + ylab("Density") + geom_density(aes(y = ..density..), fill=NA, colour="blue") +
scale_x_continuous(breaks=seq(1,1700,by=100))
# opts() and theme_text() were removed from ggplot2; ggtitle(), theme() and element_text() replace them
tix_hist + ggtitle("Ticket Density To-Date") + theme(
plot.title = element_text(face="bold", size=18),
axis.title.x = element_text(face="bold", size=16),
axis.title.y = element_text(face="bold", size=14, angle=90),
axis.text.x = element_text(face="bold", size=14),
axis.text.y = element_text(face="bold", size=14)
)
I thought about extrapolating count values using the KDE bandwidth, etc. Is it possible to put the numeric output of a ggplot frequency histogram into a data frame and add it as a 'layer'? I'm not savvy with the layer() function yet, but any ideas would be helpful. Many thanks!
If you want the y-axis to show the bin counts and also want a density curve on the same histogram, first use geom_histogram() and note the binwidth value (this is very important!), then add a geom_density() layer to show the fitted curve.
If you don't know how to choose the binwidth value, you can calculate:
my_binwidth = (max(tix$Tix_Cnt)-min(tix$Tix_Cnt))/30;
(This is roughly what geom_histogram does by default, since it uses 30 bins.)
The code is given below:
(suppose the binwidth value you just calculated is 0.001)
tix_hist <- ggplot(tix, aes(x=Tix_Cnt)) ;
tix_hist<- tix_hist + geom_histogram(aes(y=..count..),colour="blue",fill="white",binwidth=0.001);
tix_hist<- tix_hist + geom_density(aes(y=0.001*..count..),alpha=0.2,fill="#FF6666",adjust=4);
print(tix_hist);
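To also place the counts above the bars, which is what the question asked for, a stat_bin() layer drawn as text can reuse the same binwidth. The sketch below makes several assumptions: the tix data are simulated because the real data are not shown, the names bw and tix_hist are illustrative, and after_stat() requires ggplot2 3.3.0 or later (older versions would use ..count.. as above). Multiplying the density layer's count by the binwidth works because density times the number of observations times the bin width approximates the expected count per bin.

```r
library(ggplot2)

set.seed(42)
# Simulated stand-in for the question's data (assumption)
tix <- data.frame(Tix_Cnt = rpois(500, lambda = 400))

# One thirtieth of the data range, as suggested above
bw <- diff(range(tix$Tix_Cnt)) / 30

tix_hist <- ggplot(tix, aes(x = Tix_Cnt)) +
  geom_histogram(aes(y = after_stat(count)), binwidth = bw,
                 colour = "blue", fill = "white") +
  # For geom_density, count is density * n; times the binwidth it is on the bar scale
  geom_density(aes(y = bw * after_stat(count)), alpha = 0.2, fill = "#FF6666") +
  # Same binning as the histogram, drawn as text just above each bar
  stat_bin(aes(y = after_stat(count), label = after_stat(count)),
           geom = "text", binwidth = bw, vjust = -0.5, size = 3)
```

Printing tix_hist draws the histogram with one count label per bin, including zeros for empty bins.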
