ggplot2: stat_smooth produces too many confidence bands for factor

I am plotting an interaction with ggplot using a script from last year. Last year this worked fine, but now that I have installed a new version of ggplot2, it seems to have issues. The first issue was that the classic theme did not draw the X and Y axes. I managed to solve this by adding the axis lines to the theme formatting. But now stat_smooth produces three confidence bands when I have a two-level factor. I am not sure why this is happening.
This is the code:
gp <- ggplot(data=myData, aes(x=Sbfld,y=mem,colour=factor(status))) + geom_point(shape=17, size=8, na.rm=TRUE)
gp <- gp +
stat_smooth(method="lm", size=2, na.rm=TRUE) +
scale_y_continuous(breaks=seq(-4, max(mem)*1.1, 0.5)) +
theme_classic(base_size=35) +
theme(legend.position="bottom",
legend.title=element_blank(),
legend.text=element_text(size=30, face="bold"),
legend.key.size=unit(2, "cm"),
legend.background = element_rect(colour="black"))+
theme(axis.line.x=element_line(colour="black", size=0.5, linetype="solid"),
axis.line.y=element_line(colour="black", size=0.5, linetype="solid"),
axis.title.y=element_text(vjust=1.6, size = 40, face="bold"),
axis.title.x = element_text(vjust=-0.2, size = 40, face="bold"),
axis.text.x = element_text(size=25,colour="#333333"),
axis.text.y = element_text(size=25,colour="#333333"),
panel.grid.minor=element_blank())
Status has two levels: positive and negative and there are about 7 missing values. X and Y are continuous and there are no missing values there.
This is the output: [image: ggplot output]
Is this a bug in ggplot? Does anybody know how to solve this?
Thanks!
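A possible explanation, offered as a sketch rather than a confirmed diagnosis: if the roughly 7 missing values are in status, factor(status) keeps them as NA, and ggplot2 treats NA in a discrete aesthetic as a group of its own, so stat_smooth fits a third line and band. Note that na.rm=TRUE only affects rows with missing x or y. The data below are simulated (the names myData, Sbfld, mem and status come from the question; the values are made up), and dropping the rows with missing status before plotting leaves two bands.

```r
library(ggplot2)

# Simulated stand-in for the asker's data: 60 rows, 7 with missing status.
set.seed(42)
myData <- data.frame(
  Sbfld  = rnorm(60),
  mem    = rnorm(60),
  status = c(rep(c("positive", "negative"), length.out = 53), rep(NA, 7))
)

# NA in a discrete aesthetic forms its own group, so three fits are drawn.
p_na <- ggplot(myData, aes(x = Sbfld, y = mem, colour = factor(status))) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x)

# Dropping rows with missing status leaves one fit per factor level.
complete <- myData[!is.na(myData$status), ]
p_ok <- ggplot(complete, aes(x = Sbfld, y = mem, colour = factor(status))) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ x)

# Count the groups the smoothing layer (layer 2) was computed for.
n_groups_na <- length(unique(layer_data(p_na, 2)$group))
n_groups_ok <- length(unique(layer_data(p_ok, 2)$group))
```

If n_groups_na is larger than the number of factor levels on the real data, the missing status values are the likely source of the extra band.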

Related

R ggplot2 hexbin plot graphics bug?

I always spot a handful of strange white horizontal dotted lines that seem to appear on the borders between vertically adjacent hexagons on every hexbin density plot that I create using ggplot2 in the R environment. They certainly do not belong there... This behavior (bug) is reproducible on several R installations on different computers with different MS Windows versions and different graphics drivers. I don't have a clue on how to get rid of these dotted lines.
Below is a plot showing the problem, and the code that produced the plot.
Is this a known problem? Any hint to overcome this? Thanks a bunch for any help!
## Load libraries
library(ggplot2)
library(hexbin)
## Create some data
set.seed(222)
myx <- data.frame(x = 0.4 + rnorm(10000, mean = 0, sd = 0.1))
myy <- data.frame(y = 0.5 + rnorm(10000, mean = 0, sd = 0.1))
MyData <- data.frame(xvariable=myx$x*100, yvariable=myy$y*100)
## Prepare the plot window
dev.off(2)
windows.options(width=6, height=6)
op <- par(mfrow = c(1,1))
op <- par(oma = c(0,0,0,0) + 0.1,
mar = c(5.1, 5.1, 4.1, 2.1))
## Create the plot
ggplot(MyData, aes(x=xvariable, y=yvariable) ) +
geom_hex(bins=66) +
xlim(0, 100) +
ylim(0, 100) +
ggtitle("My plot") +
scale_fill_continuous(type = "viridis") +
theme_bw() +
theme(plot.title = element_text(color="black", size=17, face="bold"),
axis.title.x = element_text(color="black", size=17, face="bold"),
axis.title.y = element_text(color="black", size=17, face="bold"),
axis.text=element_text(color="black", size=15, face="bold"),
legend.title = element_text(color = "black", size = 15),
legend.text = element_text(color = "black", size=12),
legend.position = c(0.9,0.2), legend.direction = "vertical")
It looks like if you save the plot as a PDF and then convert that to an image, the lines do not appear.
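A minimal sketch of the PDF route, assuming ggsave() is an acceptable way to write the file. The output path and the simplified plot are placeholders, not part of the original post, and the later conversion from PDF to an image is left to whatever tool you prefer.

```r
library(ggplot2)
library(hexbin)  # needed by geom_hex()

# Same kind of data as in the question, built in one step.
set.seed(222)
MyData <- data.frame(xvariable = 40 + rnorm(10000, sd = 10),
                     yvariable = 50 + rnorm(10000, sd = 10))

hex_plot <- ggplot(MyData, aes(x = xvariable, y = yvariable)) +
  geom_hex(bins = 66) +
  scale_fill_viridis_c() +
  theme_bw()

# Write to a vector format; the file name is a placeholder.
pdf_path <- file.path(tempdir(), "hexplot.pdf")
ggsave(pdf_path, plot = hex_plot, width = 6, height = 6)
```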
I have this issue too, with RStudio version 1.4.1717 "Juliet Rose" (df86b69e, 2021-05-24) for Windows. White lines run across the 'Plots' tab if I use geom_hex(). If I resize the output window vertically one pixel at a time, the lines disappear.
Maybe there are too many hex bins to begin with, and the resolution causes gaps every so often.
[images: the plot before and after resizing]

Prevent Background Image from Covering Plot

I am generating bubble charts from NBA shot data clusters. The final form of the data is:
Where Group.1 is the index of the cluster, ad.SHOT_MADE_FLAG is the field goal percent for the cluster, coords.x1 and x2 are the mean x and y coordinates of the points in that cluster, and x is the number of shots (x and y points) in that cluster.
I am plotting the data with the following:
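library(ggplot2)
library(grid)   # rasterGrob(), unit()
library(jpeg)   # readJPEG()
library(RCurl)  # getURLContent()
courtImg.URL <- "https://thedatagame.files.wordpress.com/2016/03/nba_court.jpg"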
court <- rasterGrob(readJPEG(getURLContent(courtImg.URL)),
width=unit(1,"npc"), height=unit(1,"npc"))
p6 <- ggplot(final, aes(x = final$coords.x1, y = final$coords.x2, size =
final$x,fill=final$ad.SHOT_MADE_FLAG)) +
geom_point(shape = 21) +
annotation_custom(court, -250, 250, -52, 418) +
scale_x_continuous() +
coord_fixed() +
scale_fill_gradientn(colours = c("Blue","Red")) +
theme(line = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
legend.title = element_blank(),
plot.title = element_text(size = 17, lineheight = 1.2, face = "bold")) +
ggtitle("Stephen Curry Shot Chart")
p6
This outputs the following chart
I want to solve two issues with this. First, the background image is covering up the majority of the data. Second, I only want to show the plot below the 418 point on the y axis. I don't want to show shots from the backcourt, as they aren't as relevant. Just for reference, when I remove the annotation_custom() line, it shows the following plot:
So the implementation of the annotation_custom line appears to be part of the problem. Any help would be greatly appreciated. Thanks!
ggplot2 draws plot layers in the order you specify them. To move the image of the court below the points, put it first in the drawing order. The other fix that might make your plot a little nicer is to make the panel background transparent so that you can see the points on top of the image, which I assume is what you're going for.
You can set the ends of the plots using the limits argument in scale_y_continuous().
Updated plotting code:
p6 <- ggplot(final, aes(x = coords.x1, y = coords.x2, size = x,
fill = ad.SHOT_MADE_FLAG)) +
annotation_custom(court, -250, 250, -52, 418) +
geom_point(shape = 21) +
scale_x_continuous() +
scale_y_continuous(limits=c(-52,418)) +
coord_fixed() +
scale_fill_gradientn(colours = c("Blue","Red")) +
theme(line = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
legend.title = element_blank(),
panel.background = element_rect(fill="transparent"),
plot.title = element_text(size = 17, lineheight = 1.2, face = "bold")) +
ggtitle("Stephen Curry Shot Chart")
p6
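The layer-order point can be checked without downloading the court image. The sketch below is an illustration under assumptions: a plain grey rectangle grob stands in for the court, and the data values are made up (only the column names come from the question).

```r
library(ggplot2)
library(grid)

# Hypothetical stand-in for the court image: a plain grey rectangle grob.
backdrop <- rectGrob(gp = gpar(fill = "grey80", col = NA))

# Made-up cluster data using the column names from the question.
final <- data.frame(coords.x1 = c(-100, 0, 120),
                    coords.x2 = c(50, 200, 300),
                    x = c(10, 40, 25),
                    ad.SHOT_MADE_FLAG = c(0.35, 0.5, 0.42))

p <- ggplot(final, aes(coords.x1, coords.x2, size = x,
                       fill = ad.SHOT_MADE_FLAG)) +
  annotation_custom(backdrop, -250, 250, -52, 418) +  # added first, drawn underneath
  geom_point(shape = 21) +                            # added second, drawn on top
  scale_y_continuous(limits = c(-52, 418)) +
  coord_fixed()

# Layers are stored in the order they were added, which is the drawing order.
layer_geoms <- vapply(p$layers, function(l) class(l$geom)[1], character(1))
```

Here layer_geoms lists the annotation before the points, so the points are drawn over the backdrop.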

ggplot2: How to crop out of the blank area on top and bottom of a plot?

This is a follow-up to the question "How to fit custom long annotations geom_text inside plot area for a Donuts plot?". As the accepted answer there shows, the resulting plot understandably has extra blank area at the top and at the bottom. How can I get rid of those blank areas? I looked at the theme's aspect.ratio, but that is not what I want: it does the job but distorts the plot. I want to crop the plot from a square to a landscape form.
How can I do that?
UPDATE: This is a self-contained example of my use case:
library(ggplot2); library(dplyr); library(stringr)
df <- data.frame(group = c("Cars", "Trucks", "Motorbikes"),n = c(25, 25, 50),
label2=c("Cars are blah blah blah", "Trucks some of the best in town", "Motorbikes are great if you ..."))
df$ymax = cumsum(df$n)
df$ymin = cumsum(df$n)-df$n
df$ypos = df$ymin+df$n/2
df$hjust = c(0,0,1)
ggplot(df %>%
mutate(label2 = str_wrap(label2, width = 10)), #change width to adjust width of annotations
aes(x="", y=n, fill=group)) +
geom_rect(aes_string(ymax="ymax", ymin="ymin", xmax="2.5", xmin="2.0")) +
expand_limits(x = c(2, 4)) + #change x-axis range limits here
# no change to theme
theme(axis.title=element_blank(),axis.text=element_blank(),
panel.background = element_rect(fill = "white", colour = "grey50"),
panel.grid=element_blank(),
axis.ticks.length=unit(0,"cm"),axis.ticks.margin=unit(0,"cm"),
legend.position="none",panel.spacing=unit(0,"lines"),
plot.margin=unit(c(0,0,0,0),"lines"),complete=TRUE) +
geom_text(aes_string(label="label2",x="3",y="ypos",hjust="hjust")) +
coord_polar("y", start=0) + scale_x_discrete()
And this is the result. I'd like to find a way to remove the blank spaces annotated in it:
This is a multi-part solution to answer this and the other related question you've posted.
First, for changing the margins in a single graph, @Keith_H was on the right track: using plot.margin inside theme() is a convenient way. However, as mentioned, this alone won't solve the issue if the goal is to combine multiple plots, as in the other question linked above.
To do that, you'll need a combination of plot.margin and a specific plotting order within arrangeGrob(). The order matters because plots are drawn in the order you pass them, so it is easier to shrink the margins of a plot that sits behind another plot than of one that sits in front. Think of it as covering the margins we want to shrink by expanding the plot drawn on top of them. See the graphs below for illustration:
Before plot.margin setting:
#Main code for the 1st graph can be found in the original question.
After plot.margin setting:
#Main code for 2nd graph:
ggplot(df %>%
mutate(label2 = str_wrap(label2, width = 10)),
aes(x="", y=n, fill=group)) +
geom_rect(aes_string(ymax="ymax", ymin="ymin", xmax="2.5", xmin="2.0")) +
geom_text(aes_string(label="label2",x="3",y="ypos",hjust="hjust")) +
coord_polar(theta='y') +
expand_limits(x = c(2, 4)) +
guides(fill=guide_legend(override.aes=list(colour=NA))) +
theme(axis.line = element_blank(),
axis.ticks=element_blank(),
axis.title=element_blank(),
axis.text.y=element_blank(),
axis.text.x=element_blank(),
panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = "white"),
plot.margin = unit(c(-2, 0, -2, -.1), "cm"),
legend.position = "none") +
scale_x_discrete(limits=c(0, 1))
After combining plot.margin setting and arrangeGrob() reordering:
#Main code for 3rd graph:
library(gridExtra) # provides arrangeGrob()
p1 <- ggplot(mtcars,aes(x=1:nrow(mtcars),y=mpg)) + geom_point()
p2 <- ggplot(df %>%
mutate(label2 = str_wrap(label2, width = 10)), #change width to adjust width of annotations
aes(x="", y=n, fill=group)) +
geom_rect(aes_string(ymax="ymax", ymin="ymin", xmax="2.5", xmin="2.0")) +
geom_text(aes_string(label="label2",x="3",y="ypos",hjust="hjust")) +
coord_polar(theta='y') +
expand_limits(x = c(2, 4)) + #change x-axis range limits here
guides(fill=guide_legend(override.aes=list(colour=NA))) +
theme(axis.line = element_blank(),
axis.ticks=element_blank(),
axis.title=element_blank(),
axis.text.y=element_blank(),
axis.text.x=element_blank(),
panel.border = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
panel.background = element_rect(fill = "white"),
plot.margin = unit(c(-2, 0, -2, -.1), "cm"),
legend.position = "none") +
scale_x_discrete(limits=c(0, 1))
final <- arrangeGrob(p2,p1,layout_matrix = rbind(c(1),c(2)),
widths=c(4),heights=c(2.5,4),respect=TRUE)
Note that in the final code, I reversed the order you had in arrangeGrob() from p1,p2 to p2,p1. I then adjusted the height of the first object plotted, which is the one we want to shrink. This adjustment allows the earlier plot.margin setting to take effect, and once it does, the graph drawn last, p1, starts to take up the space that used to be the margins of p2. If you make one of these adjustments without the others, the solution won't work. Each of these steps is needed to produce the end result above.
You can set the plot.margins to negative values for the top and bottom of the plot.
plot.margin=unit(c(-4,0,-4,0),"cm"),complete=TRUE)
edit: here is the output:

ggplot2 script plots axes on one computer, not another

I wrote a ggplot2 script that produces a plot with 3 points and associated error bars. On my computer it works fine, but the same script run on a colleague's computer omits the x and y axes. Any idea why the same ggplot2 code would produce axes on one computer but not another?
I've pasted the code below. for_group_code is a factor with 3 levels. CIlo, CIhi, and y are continuous variables (mean and associated 95% confidence intervals):
Plotnewdat.fg <- ggplot(newdat.fg, aes(x=for_group_code, y=y)) +
geom_point(aes(shape=dum, size=10)) +
scale_shape_manual(values=c(1,0,15)) +
geom_errorbar(aes(ymin=(CIlo), ymax=(CIhi)), size=1, width=0) +
scale_fill_identity()+
scale_x_discrete(labels=newdat.fg$for_group_code)+
xlab("")+
ylab("Density") +
scale_y_continuous(expand=c(0,0), limits=c(0, ymax), breaks=c(0,round((ymax/2), digits=1),ymax)) +
theme(axis.text.x=element_text(size=10, colour="#000000"), axis.text.y=element_text(size=10, colour="#000000"),
axis.title.x=element_text(size=10, colour="#000000"), axis.title.y=element_text(size=10, vjust=1.0),
panel.background = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(),
axis.line = element_line(colour = "black"), plot.title=element_text(face="bold", size=10, vjust=0.9),
legend.position="none")
Plotnewdat.fg
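No answer is recorded for this question. One thing worth checking, offered as a guess: the two computers may have different ggplot2 versions installed. The first question in this thread reports axes disappearing after an upgrade and restores them by setting axis.line.x and axis.line.y explicitly. The sketch below shows both steps on a toy plot; mtcars is only a stand-in for newdat.fg.

```r
library(ggplot2)

# Run this on both computers and compare the results.
installed_version <- packageVersion("ggplot2")
print(installed_version)

# Set the x and y axis lines explicitly, as in the first question's theme() call.
axis_fix <- theme(
  axis.line.x = element_line(colour = "black"),
  axis.line.y = element_line(colour = "black")
)

# Toy plot (mtcars ships with R); add axis_fix to the real plot the same way.
p <- ggplot(mtcars, aes(factor(cyl), mpg)) +
  geom_point() +
  theme(panel.background = element_blank()) +
  axis_fix
```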

Removing some tick labels in boxplots in ggplot2

I'm relatively new to ggplot, so I apologize if this is easy, but I couldn't find anything online.
I want to display 29 boxplots (numbered 1.1 to 4.0) next to each other in ggplot2 (which I can do), but once I make the tick labels the appropriate size (which I can also do), the labels overlap, and I only want a few of them (1.5, 2, 2.5, etc.) anyway. How can I remove only some of the tick labels? Also, is there any way I can include a blank tick mark at 1.0 so my tick labels are nice, round numbers?
My data is a list, which I 'melted' since each boxplot has a different number of observations.
My current code:
list = list(data11, data12, ... data39, data40) # Ellipsis denotes the rest of the sequence
df = melt(list)
ggplot(df, aes(factor(variable), value)) +
geom_boxplot(outlier.size=1.5, colour="black") +
xlab("Xlabel") +
ylab("Ylabel") +
theme_classic() +
theme(
axis.text.x = element_text(size=12),
axis.text.y = element_text(size=12),
axis.title.x = element_text(size=14),
axis.title.y = element_text(size=14, angle=90),
axis.line = element_line(size=0.75)
)
This is not a difficult question indeed. The key idea can be easily found here.
Basically, you are missing a single line of code. Since you did not share a sample of your data (shame on you! see this), I generated some. Here's the solution:
library(ggplot2); library(reshape2) # melt() comes from reshape2
df.so1 <- runif(10); df.so2 <- runif(10); df.so3 <- runif(10)
list.so = list(df.so1, df.so2, df.so3)
df.so = melt(list.so)
ggplot(df.so, aes(factor(L1), value)) +
geom_boxplot(outlier.size=1.5, colour="black") +
xlab("Xlabel") + ylab("Ylabel") +
theme_classic() +
theme(
axis.text.x = element_text(size=12),
axis.text.y = element_text(size=12),
axis.title.x = element_text(size=14),
axis.title.y = element_text(size=14, angle=90),
axis.line = element_line(size=0.75)
) +
scale_x_discrete(breaks = c(1,3))
Note that you have full control over axis, ticks, tick labels, etc. See ggplot2 documentation for more.
Upd.
Don't forget to check out related questions before posting: bump1, bump2.
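Applied to the labelling in the question, the same idea might look like the sketch below. It is an illustration under assumptions: the data are made up, the groups are labelled "1.1" to "4.0" in steps of 0.1, and df_demo, group_labels and wanted are hypothetical names. Prepending "1.0" to limits is one way to reserve an empty slot with its own tick.

```r
library(ggplot2)

# Made-up data: groups labelled "1.1" to "4.0", 20 values each.
set.seed(1)
group_labels <- sprintf("%.1f", seq(11, 40) / 10)
df_demo <- data.frame(
  group = factor(rep(group_labels, each = 20), levels = group_labels),
  value = rnorm(length(group_labels) * 20)
)

# Label only these positions; every box is still drawn.
wanted <- c("1.5", "2.0", "2.5", "3.0", "3.5", "4.0")

# Prepending "1.0" to the limits reserves an empty slot for it on the axis.
p <- ggplot(df_demo, aes(group, value)) +
  geom_boxplot(outlier.size = 1.5, colour = "black") +
  scale_x_discrete(limits = c("1.0", group_labels),
                   breaks = c("1.0", wanted)) +
  theme_classic()
```

On a discrete scale, breaks only controls which ticks and labels appear, so the remaining boxes keep their positions.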
