How to add a legend to a distribution plot with ggplot2 in R

I want to plot a negative binomial distribution and a Poisson distribution alongside my real data to see how well they fit, but I don't know how to add a legend. Can anyone help? My code and plot are as follows:
ggplot() +
geom_density(aes(a),color="red",lwd=2) +
geom_density(aes(x=rpois(50,1.57)),color="purple",lwd=2) +
geom_smooth() +
geom_density(aes(x=rnbinom(100,size=0.2,mu=1.57)),color="blue",lwd=2) +
geom_smooth() +
coord_cartesian(xlim=c(0,10)) + labs(x="count")
And my data was uploaded here:
https://www.jianguoyun.com/p/DSHXKgMQm5CLBhiKjCc.

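The easiest way to add a legend is to map a variable to color inside aes(). ggplot2 only builds a legend for aesthetics that are mapped, not for colors set as fixed arguments outside aes(). For example: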
ggplot() +
geom_density(aes(a, color="data"),lwd=2) +
geom_density(aes(x=rpois(50,1.57), color="poisson"),lwd=2) +
geom_smooth() +
geom_density(aes(x=rnbinom(100,size=0.2,mu=1.57),color="negative binomial"),lwd=2) +
geom_smooth() +
coord_cartesian(xlim=c(0,10)) + labs(x="count")
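Mapping color to a string lets ggplot2 pick its default palette, so the original red, purple, and blue are lost. One way to get them back is scale_color_manual(). The following is a minimal sketch, not the asker's exact setup: the vector `a` is simulated here as a stand-in for the real data, the legend title "distribution" is an arbitrary choice, and the empty geom_smooth() layers are left out.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for the asker's data vector `a`
a <- rnbinom(100, size = 0.2, mu = 1.57)

p <- ggplot() +
  # Each string inside aes() becomes one legend entry
  geom_density(aes(x = a, color = "data"), lwd = 2) +
  geom_density(aes(x = rpois(50, 1.57), color = "poisson"), lwd = 2) +
  geom_density(aes(x = rnbinom(100, size = 0.2, mu = 1.57),
                   color = "negative binomial"), lwd = 2) +
  # Assign a specific color to each legend entry by name
  scale_color_manual(
    name = "distribution",
    values = c("data" = "red",
               "poisson" = "purple",
               "negative binomial" = "blue")
  ) +
  coord_cartesian(xlim = c(0, 10)) +
  labs(x = "count")

print(p)
```

The names in `values` must match the strings used inside aes() exactly, otherwise the corresponding curve gets no color.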

Related

ggplot scatterplot with missing x values, trendlines won't connect

I am new to R, and I'm trying to use ggplot2 to plot some data as a scatterplot. I'm missing a day in my samples, and the trendline I made won't connect all of the data together. Below is the code I have and what the graph looks like.
ggplot(SiExptTEPa, aes(x=Timepoint..dpi., y=(TEPcells),group=(Treatment))) +
geom_point(size=5,aes(colour=Nutrient)) +
scale_color_manual(values=c('yellow','light blue')) +
geom_errorbar(aes(ymin=TEPcells-se, ymax=TEPcells+se), width=.1) +
facet_wrap(~Nutrient, scales="free") +
scale_y_continuous(labels = scientific) +
theme_classic() +
xlab("Time Post Infection (Days)") +
ylab("TEP by Total Cells") +
ylim(3e-08,2e-07) +
geom_line(aes(linetype=Treatment)) +
scale_linetype_manual(values=c("solid", "dashed"))
Incorrect graph
How can I close the gap in the middle of both panels so that each line is continuous?

ggplot. Adding regression lines by group

If I plot this
dodge <- position_dodge(.35)
ggplot(mediat, aes(x=t, y=Value, colour=factor(act), group=id)) +
geom_point(position=dodge) +
geom_errorbar(aes(ymin=Value-sdt, ymax=Value+sdt), width=0, position=dodge) +
theme_bw() +
geom_smooth(method="lm", se=FALSE, fullrange=TRUE)
I get this
As you can see, the regression line is not plotted.
Adding + stat_smooth(method=lm, fullrange=TRUE, se=FALSE) gives the same result.
I've found that removing group=id gives me the regression lines, but then
ggplot(mediat, aes(x=t, y=Value, colour=factor(act))) +
geom_point(position=dodge) +
geom_errorbar(aes(ymin=Value-sdt, ymax=Value+sdt), width=0, position=dodge) +
theme_bw() +
geom_smooth(method="lm", se=FALSE, fullrange=TRUE)
As you can see, it now plots the lines, but I lose the dodging by group.
How can I get both at once? That is, regression lines by "id" on the first, uncluttered plot?
Any other solution with base plot, lattice, or any other common package would also be welcome.
Regards

Draw mean and outlier points for box plots using ggplot2

I am trying to plot the outliers and the mean point for the box plots below, using the data available here. The dataset has 3 different factors and 1 value column, with 3600 rows.
When I run the code below, it shows the mean points but doesn't draw the outliers properly:
ggplot(df, aes(x=Representations, y=Values, fill=Methods)) +
geom_boxplot() +
facet_wrap(~Metrics) +
stat_summary(fun.y=mean, colour="black", geom="point", position=position_dodge(width=0.75)) +
geom_point() +
theme_bw()
When I modify the code as below, the mean points disappear:
ggplot(df, aes(x=Representations, y=Values, colour=Methods)) +
geom_boxplot() +
facet_wrap(~Metrics) +
stat_summary(fun.y=mean, colour="black", geom="point", position=position_dodge(width=0.75)) +
geom_point() +
theme_bw()
In both cases I get the message "ymax not defined: adjusting position using y instead" 3 times.
Any suggestions on how to fix it? I would like to draw the mean points within the individual box plots and show the outliers in the same colour as the boxes.
EDIT:
The original data set does not have any outliers, and that was the reason for my confusion. Thanks to MrFlick's answer, which uses randomly generated data and makes this clear.
Rather than downloading the data, I just made a random sample.
set.seed(18)
gg <- expand.grid (
Methods=c("BC","FD","FDFND","NC"),
Metrics=c("DM","DTI","LB"),
Representations=c("CHG","QR","HQR")
)
df <- data.frame(
gg,
Values=rnorm(nrow(gg)*50)
)
Then you should be able to create the plot you want with
library(ggplot2)
ggplot(df, aes(x=Representations, y=Values, fill=Methods)) +
geom_boxplot() +
stat_summary(fun.y="mean", geom="point",
position=position_dodge(width=0.75), color="white") +
facet_wrap(~Metrics)
which gave me
I was using ggplot2 version 0.9.3.1
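The answer above targets ggplot2 0.9.3.1. In ggplot2 3.3.0 and later, the `fun.y` argument of stat_summary() is deprecated in favour of `fun`. A sketch of the same plot for a recent version, assuming ggplot2 >= 3.3.0 and reusing the simulated data from the answer:

```r
library(ggplot2)

set.seed(18)
# Same simulated data as in the answer: 36 factor combinations, 50 values each
gg <- expand.grid(
  Methods = c("BC", "FD", "FDFND", "NC"),
  Metrics = c("DM", "DTI", "LB"),
  Representations = c("CHG", "QR", "HQR")
)
df <- data.frame(gg, Values = rnorm(nrow(gg) * 50))

p <- ggplot(df, aes(x = Representations, y = Values, fill = Methods)) +
  geom_boxplot() +
  # `fun` replaces the deprecated `fun.y`; the dodge width matches
  # geom_boxplot's default so each mean sits inside its own box
  stat_summary(fun = mean, geom = "point",
               position = position_dodge(width = 0.75), color = "white") +
  facet_wrap(~Metrics)

print(p)
```

Because the values are drawn from a normal distribution, some points fall beyond the whiskers and geom_boxplot() draws them as outliers without an extra geom_point() layer.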

Error in a Histogram with GGPLOT2 (R)

I am trying to draw a histogram. This is my code:
ggplot(data, aes(x=skus, fill=as.factor(stars))) +
geom_histogram(binwidth=.5, alpha=.5, position="identity") +
geom_vline(data=cdf, aes(xintercept=rating.mean, colour=as.factor(stars)),
linetype="dashed", size=1)
When I run this code I get the following plot:
This is not a histogram. What is wrong with my code?
Thanks!
I have just found my mistake. I had set binwidth to .5, which is far too narrow for the scale of my data. Increasing it, for example to binwidth=50, gives a proper histogram.
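To illustrate why the bin width matters, here is a small sketch with made-up data. The data frame, its value range, and the column contents are assumptions for illustration only, since the asker's data is not shown. When x spans hundreds of units, binwidth = 0.5 produces thousands of nearly empty bins, while binwidth = 50 produces a few dozen.

```r
library(ggplot2)

set.seed(42)
# Hypothetical data: `skus` on a scale of hundreds, `stars` from 1 to 5
data <- data.frame(
  skus  = rnorm(300, mean = 1000, sd = 200),
  stars = sample(1:5, 300, replace = TRUE)
)

make_hist <- function(bw) {
  ggplot(data, aes(x = skus, fill = as.factor(stars))) +
    geom_histogram(binwidth = bw, alpha = .5, position = "identity")
}

narrow <- make_hist(0.5)  # bins far narrower than the spread of the data
wide   <- make_hist(50)   # bins on the same scale as the data

# Count the bars each plot would draw (bins times fill groups)
n_narrow <- nrow(ggplot_build(narrow)$data[[1]])
n_wide   <- nrow(ggplot_build(wide)$data[[1]])

print(wide)
```

A reasonable starting point is a bin width that gives a few dozen bins across the range of x, adjusted by eye from there.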

Plotting a regression line through the origin

I am plotting some data series along with regression lines using this code:
ggplot(dt1.melt, aes(x=lower, y=value, group=variable, colour=variable)) +
geom_point(shape=1) +
geom_smooth(method=lm,
se=FALSE)
However, I need to constrain the regression line to be through the origin for all series - in the same way as abline(lm(Q75~-1+lower,data=dt1)) would achieve on a standard R plot.
Can anyone explain how to do this in ggplot?
You need to specify this in the formula argument to geom_smooth:
... + geom_smooth(method=lm, se=FALSE, formula=y~x-1)
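A complete sketch of that approach follows. The data frame `dt` is a made-up stand-in for `dt1.melt`, with column names copied from the question; the slopes and noise level are arbitrary.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for dt1.melt: two series, both roughly linear in `lower`
dt <- data.frame(
  lower    = rep(0:10, times = 2),
  variable = rep(c("Q25", "Q75"), each = 11)
)
dt$value <- ifelse(dt$variable == "Q25", 1.5, 3) * dt$lower +
  rnorm(nrow(dt), sd = 0.5)

p <- ggplot(dt, aes(x = lower, y = value, group = variable, colour = variable)) +
  geom_point(shape = 1) +
  # `- 1` drops the intercept, so each fitted line passes through (0, 0)
  geom_smooth(method = lm, se = FALSE, formula = y ~ x - 1)

# Extract the fitted lines to check their value at x = 0
fit <- ggplot_build(p)$data[[2]]

print(p)
```

The formula is written in terms of the aesthetics `x` and `y`, not the column names. Writing `y ~ 0 + x` is an equivalent way to remove the intercept.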
