Problem adding a trendline to a graph in R ggplot2 - r

I tried to use geom_smooth(method = "lm") and it doesn't work...
percentage.no.work <- cleanData %>% group_by(AREA) %>%
summarise(percentage = mean(ESTIMATED.CITY.UNEMPLOYMENT))
ggplot() +
geom_point(data=percentage.no.work, aes(x=AREA, y=percentage), alpha=0.6, color="purple", size=2) +
geom_smooth(method = "lm") +
theme_minimal() + ggtitle("Percentage Estimated City Unemployment") +
ylab("Percentage")

You need to provide the aesthetics for geom_smooth() as well, either by including them in ggplot() or in geom_smooth():
ggplot() +
geom_point(data=percentage.no.work, aes(x=AREA, y=percentage), alpha=0.6, color="purple", size=2) +
geom_smooth(aes(x=AREA, y=percentage), method = "lm") +
theme_minimal() + ggtitle("Percentage Estimated City Unemployment") +
ylab("Percentage")
You can avoid repeating the aesthetics by putting them in ggplot():
ggplot(data=percentage.no.work, aes(x=AREA, y=percentage)) +
geom_point(alpha=0.6, color="purple", size=2) +
geom_smooth(method = "lm") +
theme_minimal() + ggtitle("Percentage Estimated City Unemployment") +
ylab("Percentage")
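A minimal sketch of the same inheritance rule, using the built-in mtcars data as a stand-in (an assumption, since cleanData is not shown): aesthetics declared in ggplot() are shared by every layer, so geom_smooth() finds x and y without repeating them. If AREA is a categorical variable, geom_smooth() may additionally need aes(group = 1) to fit a single line across all areas.

```r
library(ggplot2)

# mtcars stands in for the asker's data; wt and mpg are both numeric.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(alpha = 0.6, color = "purple", size = 2) +
  geom_smooth(method = "lm", formula = y ~ x) +  # inherits x and y from ggplot()
  theme_minimal()

# layer_data() returns the data computed for a layer; layer 2 is the trendline.
trend <- layer_data(p, 2)
head(trend[, c("x", "y")])
```

If the trendline layer had no x and y available, ggplot2 would stop with a missing-aesthetics error instead of returning fitted values.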

Related

How to delete legend in ggplot with regression model?

I'd like to ask how to delete legend in ggplot with regression model.
I already added theme(legend.position = "None")
but the legend cannot be deleted. Could you tell me what I was doing wrong?
Extra question!!
In my current code, how can I change the symbol size and shape for N0 and N1? I want a bigger 'open circle' and a 'closed square' shape.
Many thanks!!!
ggplot(data=x, aes(x=agw, y=pgw)) +
geom_point (data=x, aes(x=agw, y=pgw, color=Nitrogen)) +
stat_smooth(method = 'lm', se=FALSE, color="Black") +
scale_color_manual(values = c("Dark gray","Black")) +
theme(legend.position = "None") +
geom_text(x=30, y=70, label="", size=3.5, col="Black") +
geom_text(x=30, y=60, label="", size=3.5, col="Black") +
scale_x_continuous(breaks = seq(0,80,10),limits = c(0,80)) +
scale_y_continuous(breaks = seq(0,80,10), limits = c(0,80)) +
theme_bw() +
theme(panel.grid = element_blank())
This should work, although it is untested for lack of reproducible data. Be careful: complete themes like theme_bw() replace previous theme() settings, as mentioned by @Ronald, so it is better to add theme() adjustments at the end of the plot. For shapes, you can map shape in aes() like this and set the values with scale_shape_manual() (the numbers are the codes of the shapes you want):
library(ggplot2)
#Code
ggplot(data=x, aes(x=agw, y=pgw)) +
geom_point(aes(color=Nitrogen, shape=Nitrogen), size=3) + # size is a constant, so it goes outside aes()
stat_smooth(method = 'lm', se=FALSE, color="Black") +
scale_color_manual(values = c("Dark gray","Black")) +
scale_shape_manual(values = c(1,15))+
geom_text(x=30, y=70, label="", size=3.5, col="Black") +
geom_text(x=30, y=60, label="", size=3.5, col="Black") +
scale_x_continuous(breaks = seq(0,80,10),limits = c(0,80)) +
scale_y_continuous(breaks = seq(0,80,10), limits = c(0,80)) +
theme_bw() +
theme(panel.grid = element_blank(),
legend.position = 'none')
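The ordering issue can be sketched on its own with built-in data (mtcars is an assumption here, standing in for the asker's x). The two plots differ only in where theme_bw() is added:

```r
library(ggplot2)

base <- ggplot(mtcars, aes(wt, mpg, color = factor(cyl))) +
  geom_point()

# theme_bw() is a complete theme, so it replaces the earlier legend setting
# and the legend comes back.
legend_kept <- base + theme(legend.position = "none") + theme_bw()

# Adjustments made after theme_bw() survive, so the legend stays hidden.
legend_removed <- base + theme_bw() + theme(legend.position = "none")
```

Printing legend_kept and legend_removed side by side shows the difference.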
For the legend: add the argument show.legend = F inside geom_point. For the different point size: can you give us an example of your dataset? We may need to reshape it.
ggplot(data=x, aes(x=agw, y=pgw)) +
geom_point (data=x, aes(x=agw, y=pgw, color=Nitrogen), show.legend = F) +
stat_smooth(method = 'lm', se=FALSE, color="Black") +
scale_color_manual(values = c("Dark gray","Black")) +
theme(legend.position = "None") +
geom_text(x=30, y=70, label="", size=3.5, col="Black") +
geom_text(x=30, y=60, label="", size=3.5, col="Black") +
scale_x_continuous(breaks = seq(0,80,10),limits = c(0,80)) +
scale_y_continuous(breaks = seq(0,80,10), limits = c(0,80)) +
theme_bw() +
theme(panel.grid = element_blank())

Adjusting percentage decimals for a bar plot with facet_grid()

I have the following line:
p1 <- ggplot(mtcars, aes(x= cyl)) + geom_bar(aes(fill = vs), stat = "count") + geom_text(aes(label = scales::percent(..prop..), ymax= ..prop..), stat = "count", vjust = -0.5) + theme_classic() + ylab("Count") + facet_grid(vs ~ .) + ylim(0, 15)
which gives this plot. This is a plot where I want to keep the count integers on the y-axis, but I want the percentages displayed above each bar.
I would like to edit the number of decimals over each bar plot. However, when using the line below:
p2 <- ggplot(mtcars, aes(x= cyl)) + geom_bar(aes(fill = vs), stat = "count") + geom_text(aes(label = scales::percent(round((..count..)/sum(..count..),1)), ymax= ((..count..)/sum(..count..))), stat="count", vjust = -.25) + theme_classic() + ylab("Count") + facet_grid(vs ~ .) + ylim(0, 15)
The percentages are now off (see below), displaying the percentages for the whole plot, and not the separated facets. Is there a way to round the percentages without compromising the numbers?
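You can use the accuracy argument of scales::percent(). It is the number to round to, so accuracy = 2 (used below) rounds to the nearest 2%, accuracy = 1 gives whole percentages, and accuracy = 0.1 keeps one decimal: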
p1 <- ggplot(mtcars, aes(x= cyl)) + geom_bar(aes(fill = vs), stat = "count") +
geom_text(aes(label = scales::percent(..prop.., accuracy = 2), ymax= ..prop..), stat = "count", vjust = -0.5) +
theme_classic() + ylab("Count") + facet_grid(vs ~ .) + ylim(0, 15)
p1
There is an accuracy option in scales::percent:
p1 <- ggplot(mtcars, aes(x= cyl)) +
geom_bar(aes(fill = vs), stat = "count") +
geom_text(aes(label = scales::percent(..prop..,accuracy=2)),
stat = "count", vjust = -0.5) +
theme_classic() + ylab("Count") + facet_grid(vs ~ .) + ylim(0, 15)
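To see what accuracy does outside of a plot, here is a small sketch that calls scales::percent() directly on a made-up proportion (the value 0.4567 is purely illustrative):

```r
library(scales)

prop <- 0.4567  # hypothetical proportion, not taken from mtcars

whole_percent <- percent(prop, accuracy = 1)    # round to the nearest 1%
one_decimal   <- percent(prop, accuracy = 0.1)  # round to the nearest 0.1%

print(c(whole_percent, one_decimal))
```

Because the rounding happens inside percent(), ..prop.. is still computed per facet, so the labels keep matching the bars in each panel.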

grid.arrange + ggplot2 on Impulse Response Function (IRF)

I'm working on an Impulse-Response function plot (from a Vector AutoRegressive model) with ggplot2 + grid.arrange. Below are my current plot and the original one from the vars package. I would really appreciate any hint to improve the final result.
It would be nice to at least place both plots closer together.
This is not a full question, but a request for improvement.
Here is the full code:
library(vars)
library(ggplot2)   # ggplot() and friends
library(gridExtra) # grid.arrange()
# Define lags
lag = VARselect(my_data, lag.max=12)
# Estimating var
my_var = VAR(my_data, min(lag$selection), type='both')
# Set the Impulse-Response data
impulse <- irf(my_var)
# Prepare plot data
number_ticks <- function(n) {function(limits) pretty(limits, n)}
lags <- c(1:11)
irf1<-data.frame(impulse$irf$PIB[,1],impulse$Lower$PIB[,1],
impulse$Upper$PIB[,1], lags)
irf2<-data.frame(impulse$irf$PIB[,2],impulse$Lower$PIB[,2],
impulse$Upper$PIB[,2])
# creating plots
PIB_PIB <- ggplot(data = irf1,aes(lags,impulse.irf.PIB...1.)) +
geom_line(aes(y = impulse.Upper.PIB...1.), colour = 'lightblue2') +
geom_line(aes(y = impulse.Lower.PIB...1.), colour = 'lightblue')+
geom_line(aes(y = impulse.irf.PIB...1.))+
geom_ribbon(aes(x=lags, ymax=impulse.Upper.PIB...1., ymin=impulse.Lower.PIB...1.), fill="lightblue", alpha=.1) +
xlab("") + ylab("PIB") + ggtitle("Orthogonal Impulse Response from PIB") +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank()) +
geom_line(colour = 'black')
PIB_CON <- ggplot(data = irf2,aes(lags,impulse.irf.PIB...2.)) +
geom_line(aes(y = impulse.Upper.PIB...2.), colour = 'lightblue2') +
geom_line(aes(y = impulse.Lower.PIB...2.), colour = 'lightblue')+
geom_line(aes(y = impulse.irf.PIB...2.))+
geom_ribbon(aes(x=lags, ymax=impulse.Upper.PIB...2., ymin=impulse.Lower.PIB...2.), fill="lightblue", alpha=.1) +
scale_x_continuous(breaks=number_ticks(10)) +
xlab("") + ylab("CONSUMO") + ggtitle("") +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank()) +
geom_line(colour = 'black')
# Generating plot
grid.arrange(PIB_PIB, PIB_CON, nrow=2)
Actual Output
Desired Style [when you call plot(irf(my_var))]
I got something very close to the desired style.
Here are the changed plots:
PIB_PIB <- ggplot(data = irf1,aes(lags,impulse.irf.PIB...1.)) +
geom_line(aes(y = impulse.Upper.PIB...1.), colour = 'lightblue2') +
geom_line(aes(y = impulse.Lower.PIB...1.), colour = 'lightblue')+
geom_line(aes(y = impulse.irf.PIB...1.))+
geom_ribbon(aes(x=lags, ymax=impulse.Upper.PIB...1., ymin=impulse.Lower.PIB...1.), fill="lightblue", alpha=.1) +
xlab("") + ylab("PIB") + ggtitle("Orthogonal Impulse Response from PIB") +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
plot.margin = unit(c(2,10,2,10), "mm"))+
scale_x_continuous(breaks=number_ticks(10)) +
geom_line(colour = 'black')
PIB_CON <- ggplot(data = irf2,aes(lags,impulse.irf.PIB...2.)) +
geom_line(aes(y = impulse.Upper.PIB...2.), colour = 'lightblue2') +
geom_line(aes(y = impulse.Lower.PIB...2.), colour = 'lightblue')+
geom_line(aes(y = impulse.irf.PIB...2.))+
geom_ribbon(aes(x=lags, ymax=impulse.Upper.PIB...2., ymin=impulse.Lower.PIB...2.), fill="lightblue", alpha=.1) +
xlab("") + ylab("CONSUMO") + ggtitle("") +
theme(axis.title.x=element_blank(),
# axis.text.x=element_blank(),
# axis.ticks.x=element_blank(),
plot.margin = unit(c(-10,10,4,10), "mm"))+
scale_x_continuous(breaks=number_ticks(10)) +
geom_line(colour = 'black')
grid.arrange(PIB_PIB, PIB_CON, nrow=2)
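The margin trick can be tried without a VAR model. The sketch below uses the economics data shipped with ggplot2 as a stand-in (an assumption; the margin values are illustrative and would need tuning for the real IRF plots). plot.margin is ordered top, right, bottom, left, and a negative top margin on the lower panel pulls it up towards the upper one.

```r
library(ggplot2)
library(gridExtra)

# Upper panel: hide the x axis and leave no space below it.
top_panel <- ggplot(economics, aes(date, unemploy)) +
  geom_line() +
  theme(axis.title.x = element_blank(),
        axis.text.x  = element_blank(),
        axis.ticks.x = element_blank(),
        plot.margin  = grid::unit(c(2, 10, 0, 10), "mm"))

# Lower panel: keep the x axis and use a negative top margin to close the gap.
bottom_panel <- ggplot(economics, aes(date, psavert)) +
  geom_line() +
  theme(axis.title.x = element_blank(),
        plot.margin  = grid::unit(c(-2, 10, 4, 10), "mm"))

# arrangeGrob() builds the layout without drawing; grid.draw() renders it.
stacked <- arrangeGrob(top_panel, bottom_panel, nrow = 2)
grid::grid.draw(stacked)
```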

ggplot: conflict between geom_text and ggplot(fill)

When I use geom_text on a ggplot, there is a conflict with the ggplot "fill" option.
Here is a clear example of the problem:
library(ggplot2)
a=ChickWeight
str(a)
xx=data.frame(level=levels(a$Chick),letter=1:50)
# a graph with the fill option alone
x11();ggplot(a, aes(x=Chick, y=weight,fill=Diet)) + geom_boxplot(notch=F) +
stat_summary(fun.y="mean", geom="point", shape=23, size=3, fill="white") +
xlab("Chick") +
ylab("Weight")
# a graph with the geom_text option alone
x11();ggplot(a, aes(x=Chick, y=weight)) + geom_boxplot(notch=F) +
stat_summary(fun.y="mean", geom="point", shape=23, size=3, fill="white") +
geom_text(data=xx, aes(x=level,y=450,label = letter)) +
xlab("Chick") +
ylab("Weight")
# a graph with both options
x11();ggplot(a, aes(x=Chick, y=weight,fill=Diet)) + geom_boxplot(notch=F) +
stat_summary(fun.y="mean", geom="point", shape=23, size=3, fill="white") +
geom_text(data=xx, aes(x=level,y=1750,label = letter)) +
xlab("Chick") +
ylab("Weight")
If you only want the fill to affect the boxplot, move the aes() into the boxplot layer. Any aes() aesthetics in the ggplot() call itself are propagated to all layers:
ggplot(a, aes(x=Chick, y=weight)) + geom_boxplot(aes(fill=Diet), notch=F) +
stat_summary(fun.y="mean", geom="point", shape=23, size=3, fill="white") +
geom_text(data=xx, aes(x=level,y=1750,label = letter)) +
xlab("Chick") +
ylab("Weight")
You can also disable the fill= aesthetic in the text layer with fill=NULL:
ggplot(a, aes(x=Chick, y=weight, fill=Diet)) + geom_boxplot(notch=F) +
stat_summary(fun.y="mean", geom="point", shape=23, size=3, fill="white") +
geom_text(data=xx, aes(x=level,y=1750,label = letter, fill=NULL)) +
xlab("Chick") +
ylab("Weight")
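A third variant, not mentioned in the answer above and offered only as a sketch: inherit.aes = FALSE tells a layer to ignore every aesthetic set in ggplot(), so the text layer uses only what is given in its own aes(). The label height of 400 is an arbitrary choice for illustration.

```r
library(ggplot2)

# One label per chick, as in the question (labels_df is a hypothetical name).
labels_df <- data.frame(level = levels(ChickWeight$Chick), letter = 1:50)

p <- ggplot(ChickWeight, aes(x = Chick, y = weight, fill = Diet)) +
  geom_boxplot() +
  # The text layer does not inherit fill = Diet, which labels_df lacks.
  geom_text(data = labels_df,
            aes(x = level, y = 400, label = letter),
            inherit.aes = FALSE) +
  xlab("Chick") +
  ylab("Weight")

text_layer <- layer_data(p, 2)  # data actually drawn by geom_text()
```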

How to create a legend for lines in scatterplot?

I want to add a legend for the main diagonal and the regression line to the scatter plot.
What I have got now:
library(ggplot2)
df = data.frame(x = 1:10, y = 1:10)
p <- ggplot(df, aes(x, y)) +
geom_point(size=1.2) +
scale_x_continuous(expand=c(0,0)) +
scale_y_continuous(expand=c(0,0)) +
geom_smooth(method="lm", se=FALSE, formula=y~x, colour="blue", fill=NA, size=1.2) +
geom_abline(intercept=0, slope=1, size=1.2, colour="red") +
geom_text(aes(x=max(df[,1])/1.4, y=max(df[,2])/1.2, label=lm_eqn(df)), colour="blue", parse=TRUE) +
# doesn't work: scale_colour_manual("Lines", labels=c("Main Diagonal", "Regression"), values=c("red", "blue")) +
labs(x="X", y="Y")
Use show_guide=TRUE (renamed show.legend in later ggplot2 versions), e.g.:
p <- ggplot(df, aes(x, y)) +
geom_point(size=1.2) +
scale_x_continuous(expand=c(0,0)) +
scale_y_continuous(expand=c(0,0)) +
geom_smooth(method="lm", se=FALSE, formula=y~x, colour="blue", fill=NA, size=1.2) +
geom_abline(aes(colour="red"),intercept=0, slope=1, size=1.2,show_guide=TRUE) +
geom_text(aes(x=max(df[,1])/1.4, y=max(df[,2])/1.2, label="lm_eqn(df)"), colour="blue", parse=TRUE) +
# doesn't work: scale_colour_manual("Lines", labels=c("Main Diagonal", "Regression"), values=c("red", "blue")) +
labs(x="X", y="Y") + opts(legend.position = 'left')
Plus, you can move legends around with things like + opts(legend.position = 'left') to put the legend on the left (opts() has since been replaced by theme()). I suggest you look at the link provided by Tyler Rinker and also the following:
https://github.com/hadley/ggplot2/wiki/Legend-Attributes
Also, I have no idea what lm_eqn is, so in my code I have surrounded it with "" so that it appears as written.
I finally managed to create a legend for the regression and the diagonal line. It is located in the bottom right corner, which makes sense:
p <- ggplot(df, aes(x, y)) +
geom_point(size=1.2) +
scale_x_continuous(expand=c(0,0)) +
scale_y_continuous(expand=c(0,0)) +
geom_abline(intercept=0, slope=1, size=1.2, aes(colour="1"), show_guide=TRUE) + # new code
geom_smooth(method="lm", se=FALSE, formula=y~x, fill=NA, size=1.2, aes(colour="2"), show_guide=TRUE) + # new code
scale_colour_manual("Lines", labels=c("Diagonal", "Regression"), values=c("red", "blue")) +
opts(legend.position = c(0.85, 0.15)) # relative values, must be set individually
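The answers above were written for an old ggplot2 release. As a hedged sketch for recent versions, where opts() and show_guide are no longer available: mapping colour to a constant label inside aes() creates the legend entry, and scale_colour_manual() assigns the actual colours. The data are made up, the labels "Diagonal" and "Regression" are illustrative, and the legend is placed at the bottom rather than inside the panel to keep the example version-independent.

```r
library(ggplot2)

# Hypothetical data with some scatter around the diagonal.
df <- data.frame(x = 1:10, y = c(2, 1, 4, 3, 6, 5, 8, 7, 10, 9))

p <- ggplot(df, aes(x, y)) +
  geom_point(size = 1.2) +
  # intercept and slope go inside aes() so the colour mapping is kept.
  geom_abline(aes(intercept = 0, slope = 1, colour = "Diagonal")) +
  geom_smooth(aes(colour = "Regression"), method = "lm",
              formula = y ~ x, se = FALSE) +
  scale_colour_manual("Lines",
                      values = c(Diagonal = "red", Regression = "blue")) +
  theme(legend.position = "bottom")

diagonal_layer   <- layer_data(p, 2)  # geom_abline()
regression_layer <- layer_data(p, 3)  # geom_smooth()
```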
