How to Unearth the Buried Regression Line in GGPLOT - r

Currently my regression plot looks like this. Notice that
the regression line is deeply buried.
Is there any way I can modify my code here, to show it on top of the dots?
I know I can increase the size but it's still underneath the dots.
p <- ggplot(data=my_df, aes(x=x,y=y)) +
xlab("x") +
ylab("y")+
geom_smooth(method="lm",se=FALSE,color="red",formula=y~x,size=1.5) +
geom_point()
p

Just change the order:
p <- ggplot(data=my_df, aes(x=x,y=y)) +
xlab("x") +
ylab("y")+
geom_point() +
geom_smooth(method="lm",se=FALSE,color="red",formula=y~x,size=1.5)
p

The issue is not the color or the size but the order of the geoms: ggplot2 draws layers in the order they are added to the plot.
If you call geom_point() first and then geom_smooth(),
the latter is drawn on top of the former.
Plot the following for comparison:
Before <-
ggplot(data=my_df, aes(x=x,y=y)) +
xlab("x") +
ylab("y")+
geom_smooth(method="lm",se=FALSE,color="red",formula=y~x,size=1.5) +
geom_point()
After <-
ggplot(data=my_df, aes(x=x,y=y)) +
xlab("x") +
ylab("y")+
geom_point() +
geom_smooth(method="lm",se=FALSE,color="red",formula=y~x,size=1.5)
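To confirm the draw order without rendering anything, you can inspect the layer list stored on the plot object. This is only a minimal sketch under a few assumptions: my_df here is made-up data (the original is not shown), layer_order() is a hypothetical helper written for this example, and it relies on p$layers holding the layers in the order they were added, with later layers drawn on top.

```r
library(ggplot2)

# Made-up stand-in for my_df; the original data is not shown in the question.
set.seed(1)
my_df <- data.frame(x = runif(200))
my_df$y <- my_df$x + rnorm(200, sd = 0.05)

before <- ggplot(my_df, aes(x = x, y = y)) +
  geom_smooth(method = "lm", se = FALSE, color = "red", formula = y ~ x) +
  geom_point()

after <- ggplot(my_df, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "red", formula = y ~ x)

# Hypothetical helper: the geom class of each layer, in the order the layers are drawn.
layer_order <- function(p) {
  vapply(p$layers, function(l) class(l$geom)[1], character(1))
}

layer_order(before)  # smooth is listed first, so the points cover the line
layer_order(after)   # points are listed first, so the line sits on top
```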

How about transparent points?
library(ggplot2)
# set.seed() fixes the random stream; a bare `seed=616` only creates an unused variable
set.seed(616)
x1<- sort(runif(1000))
set.seed(626)
x2<- rnorm(1000)*0.02+sort(runif(1000))
my_df<- data.frame(x= x1, y = x2)
p <- ggplot(data=my_df, aes(x=x,y=y)) +
xlab("x") +
ylab("y")+
geom_smooth(method="lm",se=FALSE,color="red",formula=y~x,size=1.5)+
geom_point(size = I(2), alpha = I(0.1))
p

Related

How to overlay geom_bar and geom_line plots with different number of elements using ggplot2?

Assuming I have two data.frames with different data but in the same range of x-values
a <-data.frame(x=c(1,1,1,2,2,2,3,3,3),
y=c(0.3,0.4,0.3,0.2,0.5,0.3,0.4,0.4,0.2),
z=c("do","re","mi","do","re","mi","do","re","mi"))
b <- data.frame(x=c(1,2,3),y=c(10,15,8))
Both a and b cover the same x values (1, 2, 3), but a is a data.frame with 9 rows while b is a data.frame with 3 rows.
I use geom_bar in order to plot the distribution of values of a, like this:
ggplot(a, aes(x=x, y=y, fill=z)) +
geom_bar(position="stack",stat="identity") +
ylab("") +
xlab("x")
And I use geom_line to plot b data, like this:
ggplot(b, aes(x=x, y=y)) +
geom_line(stat="identity") +
ylab("") + xlab("x") + ylim(0,15)
Now I would like to overlay this geom_line plot on the previous geom_bar plot. My first try was the following:
ggplot(a, aes(x=x, y=y, fill=z)) +
geom_bar(position="stack",stat="identity") +
ylab("") + xlab("x") +
ggplot(b, aes(x=x, y=y)) +
geom_line(stat="identity") +
ylab("") + xlab("x") + ylim(0,15)
With no success.
How can I overlay a geom_line plot on a geom_bar plot?
Try this
p <- ggplot()
p <- p + geom_bar(data = a, aes(x=x, y=y, fill=z), position="stack",stat="identity")
p <- p + geom_line(data = b, aes(x=x, y=y/max(y)), stat="identity")
p
Update:
You can rescale one of the y variables so that both are on the same scale. As I don't know the relationship between the two ys, I rescaled the line by using y/max(y). Does this solve your problem?
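If the line should stay readable in its own units after rescaling, one option is a secondary axis that undoes the scaling. This is only a sketch: the factor k (chosen so the largest b value lines up with the tallest stacked bar) and the axis title "b" are choices made here for illustration, not part of the answer above, and it assumes a ggplot2 version that provides sec_axis().

```r
library(ggplot2)

a <- data.frame(x = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                y = c(0.3, 0.4, 0.3, 0.2, 0.5, 0.3, 0.4, 0.4, 0.2),
                z = c("do", "re", "mi", "do", "re", "mi", "do", "re", "mi"))
b <- data.frame(x = c(1, 2, 3), y = c(10, 15, 8))

# Assumed scaling: match the largest b value to the tallest stacked bar.
bar_top <- max(tapply(a$y, a$x, sum))
k <- max(b$y) / bar_top

p <- ggplot() +
  geom_bar(data = a, aes(x = x, y = y, fill = z),
           position = "stack", stat = "identity") +
  # The line is drawn on the bar scale (y / k) ...
  geom_line(data = b, aes(x = x, y = y / k)) +
  # ... and the right-hand axis multiplies by k to show b's original units.
  scale_y_continuous(sec.axis = sec_axis(~ . * k, name = "b")) +
  ylab("") + xlab("x")
p
```

The primary axis still describes the bars, so both series can be read off the same panel.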
Try merging the datasets first, then plotting, like this:
require(ggplot2)
df <- merge(a,b,by="x")
ggplot(df, aes(x=x, y=y.x, fill=z)) +
geom_bar(position="stack",stat="identity") +
geom_line(aes(x=x, y=y.y)) +
ylab("") + xlab("x")
Output:
I edited the sample data to better illustrate the effects, because the y-axis scaling of the original data would not have matched well:
a <-data.frame(x=c(1,1,1,2,2,2,3,3,3),
y=c(0.3,0.4,0.3,0.2,0.5,0.3,0.4,0.4,0.2),
z=c("do","re","mi","do","re","mi","do","re","mi"))
b <- data.frame(x=c(1,2,3),y=c(.4,1,.4))
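The y.x and y.y names used in the merged plot come from merge() itself: both inputs have a column called y, so merge() appends its default suffixes to tell them apart. A quick check with the edited sample data, as a minimal sketch:

```r
a <- data.frame(x = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                y = c(0.3, 0.4, 0.3, 0.2, 0.5, 0.3, 0.4, 0.4, 0.2),
                z = c("do", "re", "mi", "do", "re", "mi", "do", "re", "mi"))
b <- data.frame(x = c(1, 2, 3), y = c(.4, 1, .4))

df <- merge(a, b, by = "x")

names(df)  # the shared column "y" is split into y.x (from a) and y.y (from b)
nrow(df)   # every row of a is kept and paired with the matching row of b
```

Because each b value is repeated once per matching row of a, the line layer receives duplicated points; they overlap exactly, so the drawn line looks the same.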

Visualize overlapping and non-overlapping ranges

I'm working on some flattening of overlapping ranges and would like to visualize the initial data (overlapping) and the resulting set (flattened) the following way:
Initial data:
Resulting set:
Is this possible with R and, for example, ggplot2?
read.table(header=TRUE, sep=",", text="color,start,end
red,12.5,13.8
blue,0.0,5.4
green,2.0,12.0
yellow,3.5,6.7
orange,6.7,10.0", stringsAsFactors=FALSE) -> df
library(ggplot2)
df$color <- factor(df$color, levels=rev(df$color))
ggplot(df) +
geom_segment(aes(x=start, xend=end, y=color, yend=color, color=color), size=10) +
scale_x_continuous(expand=c(0,0)) +
scale_color_identity() +
labs(x=NULL, y=NULL) +
theme_minimal() +
theme(panel.grid=element_blank()) +
theme(axis.text.x=element_blank()) +
theme(plot.margin=margin(30,30,30,30))
There are other posts on SO that show how to get the y labels like you have shown (we can't do all the work for you ;-)
The second part of the question can be answered by building on @hrbrmstr's great answer for the first part. We can use overplotting to our advantage and simply set the y coordinates of all segments to a fixed value (for example 1, which is where "red" is):
p <- ggplot(df) +
geom_segment(aes(x=start, xend=end, color=color),
y=1, yend=1, size=10) +
scale_x_continuous(expand=c(0,0)) + scale_color_identity() +
labs(x=NULL, y=NULL) +
theme_minimal() +theme(panel.grid=element_blank()) +
theme(axis.text.x=element_blank()) +
theme(plot.margin=margin(30,30,30,30))
print(p)

Shift text in ggplot up

Using this code gives the plot printed below. As you can see, the percentages are printed on the top border of the bars. I would like to have them above the bars. Is there a way to achieve this?
p <- ggplot(data=iris, aes(x=factor(Species), fill=factor(Species)))
p + geom_bar() + scale_fill_discrete(name="Species") + labs(x="") +geom_text(aes(y = (..count..),label = scales::percent((..count..)/sum(..count..))), stat="bin",colour="darkgreen") + theme(legend.position="none")
Just add an arbitrary value to y.
p <- ggplot(data=iris, aes(x=factor(Species), fill=factor(Species)))
p + geom_bar() + scale_fill_discrete(name="Species") + labs(x="") +geom_text(aes(y = (..count..) + 10,label = scales::percent((..count..)/sum(..count..))), stat="bin",colour="darkgreen") + theme(legend.position="none")
Or, as per Heroka's comment, use vjust, which is a better solution
p <- ggplot(data=iris, aes(x=factor(Species), fill=factor(Species)))
p + geom_bar() + scale_fill_discrete(name="Species") + labs(x="") +
geom_text(aes(y = (..count..),
label = scales::percent((..count..)/sum(..count..))),
stat="bin",
colour="darkgreen", vjust = -0.5) +
theme(legend.position="none")
But as this makes things quite cramped at the top you might want to add + expand_limits(y = c(0, 60)) to give you a bit more space for the labels.
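The snippets above use the older ..count.. notation together with stat="bin". On recent ggplot2 releases, stat="bin" expects a continuous x, so for a discrete x like Species the counting stat is the one to use. The following is a hedged sketch of the same vjust approach, assuming ggplot2 3.3.0 or later (where after_stat() is available):

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Species, fill = Species)) +
  geom_bar() +
  # stat = "count" supplies y (the bar height); the label is the share of all rows.
  geom_text(aes(label = scales::percent(after_stat(count / sum(count)))),
            stat = "count", colour = "darkgreen", vjust = -0.5) +
  # Extra headroom so the labels above the bars are not clipped.
  expand_limits(y = c(0, 60)) +
  labs(x = "") +
  theme(legend.position = "none")
p
```

A negative vjust moves the text upward relative to its anchor, which is why the labels end up above the bars rather than on their top edge.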

Draw lines between two facets in ggplot2

How can I draw several lines between two facets?
I attempted this by plotting points at the min value of the top graph but they are not between the two facets. See picture below.
This is my code so far:
t <- seq(1:1000)
y1 <- rexp(1000)
y2 <- cumsum(y1)
z <- rep(NA, length(t))
z[100:200] <- 1
df <- data.frame(t=t, values=c(y2,y1), type=rep(c("Bytes","Changes"), each=1000))
points <- data.frame(x=c(10:200,300:350), y=min(y2), type=rep("Bytes",242))
vline.data <- data.frame(type = c("Bytes","Bytes","Changes","Changes"), vl=c(1,5,20,5))
g <- ggplot(data=df, aes(x=t, y=values)) +
geom_line(colour=I("black")) +
facet_grid(type ~ ., scales="free") +
scale_y_continuous(trans="log10") +
ylab("Log values") +
theme(axis.text.x = element_text(angle = 90, hjust = 1), panel.margin = unit(0, "lines"))+
geom_point(data=points, aes(x = x, y = y), colour="green")
g
In order to achieve that, you have to set the margins inside the plot to zero. You can do that with expand=c(0,0). The changes I made to your code:
When you use scale_y_continuous, you can define the axis label inside that call, so you don't need a separate ylab.
Changed colour=I("black") to colour="black" inside geom_line.
Added expand=c(0,0) to scale_x_continuous and scale_y_continuous.
The complete code:
ggplot(data=df, aes(x=t, y=values)) +
geom_line(colour="black") +
geom_point(data=points, aes(x = x, y = y), colour="green") +
facet_grid(type ~ ., scales="free") +
scale_x_continuous("t", expand=c(0,0)) +
scale_y_continuous("Log values", trans="log10", expand=c(0,0)) +
theme(axis.text.x=element_text(angle=90, vjust=0.5), panel.margin=unit(0, "lines"))
which gives:
Adding lines can also be done with geom_segment. Normally the lines (segments) will appear in both facets. If you want them to appear between the two facets, you have to restrict the layer to one facet through its data argument:
ggplot(data=df, aes(x=t, y=values)) +
geom_line(colour="black") +
geom_segment(data=df[df$type=="Bytes",], aes(x=10, y=0, xend=200, yend=0), colour="green", size=2) +
geom_segment(data=df[df$type=="Bytes",], aes(x=300, y=0, xend=350, yend=0), colour="green", size=1) +
facet_grid(type ~ ., scales="free") +
scale_x_continuous("t", expand=c(0,0)) +
scale_y_continuous("Log values", trans="log10", expand=c(0,0)) +
theme(axis.text.x=element_text(angle=90, vjust=0.5), panel.margin=unit(0, "lines"))
which gives:
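The code above uses panel.margin, which was deprecated in favor of panel.spacing in ggplot2 2.2.0. The following sketch shows the same zero-gap layout with the newer name. The seed is an arbitrary choice made here for reproducibility and the green markers are left out to keep the example short.

```r
library(ggplot2)
library(grid)  # unit()

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
t  <- 1:1000
y1 <- rexp(1000)
y2 <- cumsum(y1)
df <- data.frame(t = t, values = c(y2, y1),
                 type = rep(c("Bytes", "Changes"), each = 1000))

g <- ggplot(df, aes(x = t, y = values)) +
  geom_line(colour = "black") +
  facet_grid(type ~ ., scales = "free") +
  # expand = c(0, 0) removes the padding inside each panel ...
  scale_x_continuous("t", expand = c(0, 0)) +
  scale_y_continuous("Log values", trans = "log10", expand = c(0, 0)) +
  # ... and panel.spacing = 0 removes the gap between the panels.
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5),
        panel.spacing = unit(0, "lines"))
g
```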

nested panels in ggplot

First, sorry for my English and any mistakes.
I have a plot like this:
data <- data.frame(site=rep(letters[1:6],each=3), year=rep(2001:2003, 6), nb=round(runif(18, min=20, max=60)), group=c(rep("A",9),rep("B", 6),rep("C",3)))
ggplot(data=data, aes(x= factor(year), y= nb)) +
geom_point() +
facet_wrap(~site)
And I would like to add the other panel "group". In fact I would like to make this graph without empty parts:
ggplot(data=data, aes(x= factor(year), y= nb)) +
geom_point() +
facet_grid(group~site)
Does anyone have an idea? Thanks for your help!
There is this solution, which looks like what I want, but I thought there would be a simpler one:
plt1 <- ggplot(data=data[data$group=="A",], aes(x= factor(year), y= nb)) +
geom_point() +
ggtitle("A")+
facet_grid(~site)+
xlab("") + ylab("")
plt2 <- ggplot(data=data[data$group=="B",], aes(x= factor(year), y= nb)) +
geom_point() +
ggtitle("B")+
facet_grid(~site)+
xlab("") + ylab("")
plt3 <- ggplot(data=data[data$group=="C",], aes(x= factor(year), y= nb)) +
geom_point() +
ggtitle("C")+
facet_grid(~site)+
xlab("") + ylab("")
library(grid)       # textGrob() comes from grid, not gridExtra
library(gridExtra)
grid.arrange(arrangeGrob(plt1,plt2, plt3),
left = textGrob("nb",rot=90))
You can combine the site and group inside the facet_wrap() - so you will have only "full" facets.
ggplot(data=data, aes(x= factor(year), y= nb)) +
geom_point() +
facet_wrap(~site+group)
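A small variation, offered only as a sketch: putting group first keeps the panels of one group next to each other, and label_both prints the variable names in the strips. The seed and the single-row layout (nrow = 1) are choices made here for illustration.

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only so the sketch is reproducible
data <- data.frame(site  = rep(letters[1:6], each = 3),
                   year  = rep(2001:2003, 6),
                   nb    = round(runif(18, min = 20, max = 60)),
                   group = c(rep("A", 9), rep("B", 6), rep("C", 3)))

p <- ggplot(data, aes(x = factor(year), y = nb)) +
  geom_point() +
  # Only group/site combinations that occur in the data get a panel.
  facet_wrap(~ group + site, nrow = 1, labeller = label_both)
p

# Number of panels to expect: one per observed combination.
nrow(unique(data[c("group", "site")]))
```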
