Two lines appear on my plot when there should be one - r

I am making a plot in ggplot, and when I add the geom_line() layer it includes 2 lines instead of one. Can anyone help me understand why it's doing this?
Code:
library(ggplot2)
a <- data.frame(SubjectId=c(1:3, 1:3, 1:3, 1:3),
Cycle=c(1,1.1,1.2, 2,2.1,2.2, 3,3.1,3.2, 4,4.1,4.2),
Dose=c(sort(rep(1:3,3)), 3,3,3),
DLT=c("No","No","Yes","No","No","No","No","Yes","Yes","No","Yes","Yes"))
ggplot(aes(x=Cycle, y=Dose, fill=DLT), data = a) +
scale_fill_manual(values = c("white", "black")) +
geom_line(colour="grey20", size=1) +
geom_point(shape=21, size=5) +
xlim(1, 4.5) +
ylim(1, 4) +
ylab("Dose Level") +
theme_classic() +
theme(axis.text =element_text(size=10),
axis.title =element_text(size=12, face="bold", colour="grey20"),
legend.text =element_text(size=10),
legend.title=element_text(size=12, face="bold", colour="grey20"))
I just want one line to go through the points in order of Cycle, but sorting a by Cycle doesn't change the line at all. What am I doing wrong?

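Put fill=DLT in the geom_point() layer, not in the top-level aes(). A discrete aesthetic in the top-level aes() is inherited by every layer and splits the data into groups, so geom_line() draws one line per DLT level. E.g.: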
ggplot(aes(x=Cycle, y=Dose), data = a) +
scale_fill_manual(values = c("white", "darkred")) +
geom_line(colour="grey20", size=1) +
geom_point(shape=23, size=5, aes(fill=DLT)) +
xlim(1, 4.5) +
ylim(1, 4) +
ylab("Dose Level") +
theme_classic() +
theme(axis.text =element_text(size=10),
axis.title =element_text(size=12, face="bold", colour="grey20"),
legend.text =element_text(size=10),
legend.title=element_text(size=12, face="bold", colour="grey20"))
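As an alternative sketch, assuming you would rather keep fill=DLT in the top-level aes(): ggplot2 groups each layer by the discrete variables mapped in it, so overriding the group with a constant inside geom_line() should give a single line through all points. The data frame is the one from the question; the theming is left out for brevity.

```r
library(ggplot2)

# Same data as in the question
a <- data.frame(
  SubjectId = c(1:3, 1:3, 1:3, 1:3),
  Cycle = c(1, 1.1, 1.2, 2, 2.1, 2.2, 3, 3.1, 3.2, 4, 4.1, 4.2),
  Dose  = c(sort(rep(1:3, 3)), 3, 3, 3),
  DLT   = c("No", "No", "Yes", "No", "No", "No",
            "No", "Yes", "Yes", "No", "Yes", "Yes")
)

p <- ggplot(a, aes(x = Cycle, y = Dose, fill = DLT)) +
  # group = 1 overrides the grouping implied by fill = DLT,
  # so every row belongs to the same line
  geom_line(aes(group = 1), colour = "grey20") +
  geom_point(shape = 21, size = 5) +
  scale_fill_manual(values = c("white", "black"))

p
```

geom_line() connects points in order of x within each group, which is also why sorting the data frame by Cycle made no difference.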

Related

Scale_y_continuous unexpectedly changes my data

I am trying to plot a histogram of a large data set (>50,000 data points). There is a very clear distribution: an initial peak followed by a second, smaller peak. However, when we put the y axis on a log scale, the first peak pales in comparison to the second, even though it should remain the higher value after logging. FYI, I will be viewing my data zoomed in; I have only zoomed out here to demonstrate the effect better.
ggplot(sman,aes(x=V1, fill = V3)) +
geom_histogram(color ="#000000", center=100,binwidth = 100)+
scale_fill_manual(values = met.brewer("NewKingdom",4)) +
xlab("Feature Length (bp)") + ylab("Frequency") +
theme(axis.text = element_text(size=14)) + theme(axis.title = element_text(size=20)) +
theme(legend.text = element_text(face="italic")) +
theme(strip.text = element_text(size =18, face="italic")) + labs(fill="Species") +
theme(axis.text = element_text(size=18), axis.title = element_text(size=22, face="bold"))
ggplot(sman,aes(x=V1, fill = V3)) +
geom_histogram(color ="#000000", center=100,binwidth = 100)+
scale_fill_manual(values = met.brewer("NewKingdom",4)) +
xlab("Feature Length (bp)") + ylab("Frequency") + scale_y_continuous(trans="log10") +
theme(axis.text = element_text(size=14)) + theme(axis.title = element_text(size=20)) +
theme(legend.text = element_text(face="italic")) +
theme(strip.text = element_text(size =18, face="italic")) + labs(fill="Species") +
theme(axis.text = element_text(size=18), axis.title = element_text(size=22, face="bold"))
The data is formatted as below. Note that there are no 0 values in the dataset (I have used V1+1 to transform them):
V1   V2           V3
1    S. mansoni   TE
16   S. mansoni   noTE
etc.

How to plot geom_line over bar chart grouped by x variable not fill variable?

I have a data frame df
  Group Time_Period     mean      uci      lci
1     A      Before 4.712195 5.054539 4.369852
2     A       After 5.881463 6.241784 5.521142
3     B      Before 5.349754 5.872940 4.826567
4     B       After 6.653595 7.246231 6.060959
I want to plot this to illustrate that there is no difference in the mean increase between groups. I tried the following:
ggplot(df, aes(x=Time_Period, y=mean, fill=Group)) +
geom_bar(stat="identity", position=position_dodge(width = 1), color="black") +
geom_point(position = position_dodge(width = 1))+
geom_line(aes(group=Group, color=Group), color=c("cyan4","firebrick","cyan4","firebrick"), size =1, position = position_dodge(width = 1)) +
geom_errorbar(aes(ymin=lci, ymax=uci), position=position_dodge(width = 1)) +
theme_bw() +
scale_y_continuous(limits=c(-0.2,8), breaks= seq(0,300,1), minor_breaks=seq(0,300,0.5)) +
theme(panel.grid.minor = element_line(colour="lightgrey", size=0.5)) +
theme(panel.grid.major = element_line(colour="grey", size=0.5)) +
labs(y="Sales", x="Time Period", fill="Category") +
theme(axis.text.x = element_text(face="bold", size=12)) +
theme(axis.text.y = element_text(face="bold", size=12)) +
theme(axis.title.x = element_text(face="bold", size=16)) +
theme(axis.title.y = element_text(face="bold", size=16)) +
theme(legend.text= element_text(face="bold", size=12)) +
theme(legend.title= element_text(face="bold", size=16))
which plots:
However my manager is concerned it is difficult to differentiate the two lines due to the overlap, so he told me to rearrange the columns so that x is Group and fill is Time Period.
I tried the following:
ggplot(df, aes(x=Group, y=mean, fill=Time_Period)) +
geom_bar(stat="identity", position=position_dodge(width = 1), color="black") +
geom_point(position = position_dodge(width = 1))+
geom_line(aes(group=Group), color="black", size =1, position = position_dodge(width = 1)) +
geom_errorbar(aes(ymin=lci, ymax=uci), position=position_dodge(width = 1)) +
theme_bw() +
scale_y_continuous(limits=c(-0.2,8), breaks= seq(0,300,1), minor_breaks=seq(0,300,0.5)) +
theme(panel.grid.minor = element_line(colour="lightgrey", size=0.5)) +
theme(panel.grid.major = element_line(colour="grey", size=0.5)) +
labs(y="Sales", x="Group", fill="Time Period") +
theme(axis.text.x = element_text(face="bold", size=12)) +
theme(axis.text.y = element_text(face="bold", size=12)) +
theme(axis.title.x = element_text(face="bold", size=16)) +
theme(axis.title.y = element_text(face="bold", size=16)) +
theme(legend.text= element_text(face="bold", size=12)) +
theme(legend.title= element_text(face="bold", size=16))
But I can't work out how to get the lines to plot correctly between the two bars instead of just vertically in the centre, even if I adjust the "width" argument for position_dodge:
Please could anyone advise me on how to fix the plot?
You're looking for position_dodge2(). There's a little about it in the ggplot2 dodge reference, and a little more in the actual code on GitHub. The relevant section is below, with some emphasis added:
Dodging preserves the vertical position of an geom while adjusting the
horizontal position. position_dodge2 is a special case of position_dodge
for arranging box plots, which can have variable widths. position_dodge2
also works with bars and rectangles. But unlike position_dodge,
position_dodge2 works without a grouping variable in a layer.
So here's the code, with some of the theming removed.
library(tidyverse)
txt = "
Group Time_Period mean uci lci
1 A Before 4.712195 5.054539 4.369852
2 A After 5.881463 6.241784 5.521142
3 B Before 5.349754 5.872940 4.826567
4 B After 6.653595 7.246231 6.060959"
df <- read.table(text = txt, header = TRUE) %>%
mutate(Group = fct_relevel(Group, "A", "B")) %>%
mutate(Time_Period = fct_relevel(Time_Period, "Before", "After"))
ggplot(df, aes(x=Group, y=mean, fill=Time_Period)) +
geom_bar(stat="identity", position=position_dodge(width = 1), color="black") +
geom_point(position = position_dodge(width = 1))+
geom_line(aes(group=Group), color="black", size =1,
position = position_dodge2(width = 1)) +
geom_errorbar(aes(ymin=lci, ymax=uci), position=position_dodge(width = 1)) +
theme_bw() +
scale_y_continuous(limits=c(-0.2,8), breaks= seq(0,300,1), minor_breaks=seq(0,300,0.5)) +
labs(y="Sales", x="Group", fill="Time Period")
Created on 2019-11-21 by the reprex package (v0.3.0)

ggplot2: missing legend and how to add?

I want to plot a bar and line chart from a dataframe. Code below,
library("ggplot2")
numb <- c(1,2,3,4,5,6,7,8,9)
mydist <- c(53.846154,15.384615,15.384615,7.692308,7.692308,0,0,0,0)
basedist <- c(30.103,17.609126,12.493874,9.691001,7.918125,6.694679,5.799195,5.115252,4.575749)
df <- data.frame(numb, mydist, basedist)
ggplot(data=df,aes(x=numb)) +
geom_bar(stat="identity", aes(y=mydist), colour="green", fill="green") +
geom_line(aes(y=basedist,group=1, colour="base distribution")) +
geom_point(aes(y=basedist), colour="red") +
ggtitle("My Chart") +
labs(x="numb", y="percentage") +
scale_x_discrete(limits=c("1","2","3","4","5","6","7","8","9")) +
scale_y_continuous(breaks=seq(0,100,10)) +
theme(axis.title.x = element_text(size=10, colour ="#666666")) +
theme(axis.title.y = element_text(size=10, color="#666666")) +
theme(plot.title = element_text(size=16, face="bold", hjust=0, color="#666666")) +
theme(axis.text = element_text(size=12)) +
theme(legend.title = element_text(colour="white", size = 16, face='bold'))
The result is not what I wanted because there is no legend for the bars.
I reproduced the chart I need in Excel with the same data set (below).
What do I need to change in my code to get the chart I need?
Thanks,
Lobbie
Here is a brief example. In general, I would recommend building the plot up by repeated assignment (i.e. gp <- gp + ...) to make debugging easier.
gp <- ggplot(data=df, aes(x=numb))
gp <- gp + geom_bar( aes(y = mydist, fill = "green"), stat="identity", color="green")
gp <- gp + geom_line( aes( y = basedist, group = 1, colour = "base distribution"))
gp <- gp + scale_fill_manual(values = "green", labels = "my distribution")
gp <- gp + geom_point(aes(y=basedist), colour="red")
gp <- gp + ggtitle("My Chart")
gp <- gp + labs(x="numb", y="percentage")
gp <- gp + scale_x_discrete(limits=c("1","2","3","4","5","6","7","8","9"))
gp <- gp + scale_y_continuous(breaks=seq(0,100,10))
gp <- gp + theme(axis.title.x = element_text(size=10, colour ="#666666"))
gp <- gp + theme(axis.title.y = element_text(size=10, color="#666666"))
gp <- gp + theme(plot.title = element_text(size=16, face="bold", hjust=0, color="#666666"))
gp <- gp + theme(axis.text = element_text(size=12))
gp <- gp + theme(legend.title = element_text(colour="white", size = 16, face='bold'))
gp <- gp + theme(legend.key = element_blank(), legend.title=element_blank(), legend.box ="vertical")
gp
Without changing much of the original code, you only need to put your fill into aes mapping, then add the scale to set the colour values and labels:
ggplot(data=df,aes(x=numb)) +
geom_bar(stat="identity", aes(y=mydist, fill="green"), colour="green") +
geom_line(aes(y=basedist,group=1, colour="base distribution")) +
geom_point(aes(y=basedist), colour="red") +
ggtitle("My Chart") +
labs(x="numb", y="percentage") +
scale_x_discrete(limits=c("1","2","3","4","5","6","7","8","9")) +
scale_y_continuous(breaks=seq(0,100,10)) +
scale_fill_manual(values = "green", labels = "my distribution") +
theme(axis.title.x = element_text(size=10, colour ="#666666")) +
theme(axis.title.y = element_text(size=10, color="#666666")) +
theme(plot.title = element_text(size=16, face="bold", hjust=0, color="#666666")) +
theme(axis.text = element_text(size=12)) +
theme(legend.title = element_text(colour="white", size = 16, face='bold'))
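A minimal sketch of the same idea, with my own choices rather than anything from the answers above: the legend labels are written directly inside aes() as constant strings, and named vectors in scale_fill_manual() and scale_colour_manual() tie each label to its colour. Using geom_col(), scale_x_continuous() and name = NULL are assumptions made for brevity.

```r
library(ggplot2)

# Same data as in the question
df <- data.frame(
  numb = 1:9,
  mydist = c(53.846154, 15.384615, 15.384615, 7.692308, 7.692308, 0, 0, 0, 0),
  basedist = c(30.103, 17.609126, 12.493874, 9.691001, 7.918125,
               6.694679, 5.799195, 5.115252, 4.575749)
)

p <- ggplot(df, aes(x = numb)) +
  # A constant string inside aes() creates a one-entry legend for that aesthetic
  geom_col(aes(y = mydist, fill = "my distribution")) +
  geom_line(aes(y = basedist, colour = "base distribution")) +
  geom_point(aes(y = basedist, colour = "base distribution")) +
  # Named values map each legend label to the colour actually drawn
  scale_fill_manual(name = NULL, values = c("my distribution" = "green")) +
  scale_colour_manual(name = NULL, values = c("base distribution" = "red")) +
  scale_x_continuous(breaks = 1:9) +
  labs(x = "numb", y = "percentage", title = "My Chart")

p
```

Anything set outside aes() (such as fill="green" in the original geom_bar() call) is a fixed property of the layer and never produces a legend entry, which is why the bars had none.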

Why does geom_freqpoly start adding lines from ZERO even though there is no data there?

I am using ggplot2 to produce a frequency plot. The X-axis of my data is AGE, which starts at 18 and ends at 44. However, when I produce a frequency plot of AGE, the graph looks like this:
I want those lines to start at 18 and end at 44. How is that possible?
The code is:
ggplot(matched.frame, aes(x=AGE, fill=as.factor(DRUG_KEY), color=as.factor(DRUG_KEY))) +
stat_bin(aes(ymax=..count..,), alpha=.5, ymin=0, geom="ribbon", binwidth =5, position="identity", pad=T) +
geom_freqpoly(binwidth=5, size=2) +
scale_fill_discrete(labels = c("26"="foo", "27"="bar"), name = "Labels") +
scale_color_discrete(labels = c("26"="foo", "27"="bar"), name = "Labels") +
scale_x_continuous(breaks=seq(18, 44, 2)) +
xlab("AGE") + ylab("Number of Patients") +
theme(axis.text.x = element_text(face="bold", size=12)) +
theme(axis.text.y = element_text(face="bold", size=12)) +
theme(axis.title.x = element_text(face="bold", size=14)) +
theme(axis.title.y = element_text(face="bold", size=14))

ggplot2 Boxplot value 0 grid color and line type

I am wondering how I can change the grid line at value=0 in my graph (marked with a cross in the image) to show the change from positive to negative. I would like to have it marked in red with the same thickness. Thank you.
=== Updated based on the comment
@Mtoto: My apologies. Here is the script.
df.boxplot<- ggplot(melt(df[,c(2:7)]), aes(variable, value))
df.boxplot +
geom_boxplot(lwd=1.2)+ theme_economist() + scale_colour_economist()+
scale_y_continuous(minor_breaks=seq(-5, 10, 0.5),name="Linear Measurements (mm)", breaks=seq(-5, 10, 1)) +
theme(axis.title.x = element_text(face="bold", colour="Black", size=20),
axis.text.x = element_text(face="bold", colour="Black", vjust=0.5, size=20)) +
scale_x_discrete(name="",labels=c("T0 A","T1 B","Δ AB","T0 C","T1 D","Δ CD")) +
theme(axis.title.y = element_text(face="bold", colour="Black", size=30,margin=margin(0,20,0,0)),
axis.text.y = element_text(angle=90, vjust=1, size=20)) +
theme(panel.grid.minor = element_line(colour="White",size=0.2))+
theme(axis.ticks = element_blank())+
ggtitle(" Title")+
theme(plot.title = element_text(size=25,lineheight=2, hjust =0.5, vjust=0.5, margin = margin(20, 10, 20, 0)))
I would also like to add a gap (one x unit/level) between the first three boxplots and the second three boxplots. I tried adding an NA column and using drop=FALSE, but it didn't work.
I think you want to look at geom_hline -- I'm sure this is a duplicate...
library(ggplot2)
df <- data.frame(x = gl(5, 25),
y = rnorm(125))
ggplot(df, aes(x, y)) +
geom_boxplot() +
geom_hline(yintercept = 0, color = "red")
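For the gap between the first three and the last three boxplots, one possible sketch is to add a placeholder level to the limits of scale_x_discrete() and give it an empty label. The level name "gap", the letters a to f and the random data are all hypothetical stand-ins for the melted columns in the question. The zero line is drawn with geom_hline(yintercept = 0) and added before geom_boxplot() so that it sits behind the boxes.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for the melted data: six categories, 20 values each
df <- data.frame(
  x = rep(c("a", "b", "c", "d", "e", "f"), each = 20),
  y = rnorm(120)
)

p <- ggplot(df, aes(x, y)) +
  # Red reference line at zero, drawn first so the boxplots overlay it
  geom_hline(yintercept = 0, colour = "red") +
  geom_boxplot() +
  # "gap" has no data, so it only reserves an empty slot on the axis
  scale_x_discrete(
    limits = c("a", "b", "c", "gap", "d", "e", "f"),
    labels = c("a", "b", "c", "", "d", "e", "f")
  )

p
```

With these limits the boxplots should land at positions 1 to 3 and 5 to 7, leaving position 4 empty.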
