ggplot2: increasing space between categorical axis ticks with geom_point - r

I have a geom_point plot with a large number of categories on a discrete axis, and a size parameter mapped to a continuous variable. When I make the plot, the categories are too close together, and the large points in adjacent categories overlap. Is there any way to give the axis a little breathing room so that this doesn't happen? I'm aware that an alternative solution is simply to use scale_size_area(max_size = 3) to narrow the range of point sizes, but I'd prefer not to do this as it makes the sizes too difficult to tell apart.
Here's the code:
plot <- ggplot(allcazfull, aes(x = Family, y = ifelse(Percentage==0,NA, Percentage), fill = Treatment, size = ifelse(Number == 0, NA,Number))) +
facet_wrap(~ Pathogen, scales = "free_x") +
geom_point(shape = 21) +
scale_fill_manual(values = alpha(c("#98fb98","#f77e17","#0d5a0d","#8d0707"),.6)) +
theme_bw() +
theme(panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
aspect.ratio = 4/1,
strip.background = element_rect(fill="white", linetype = "blank"),
strip.text = element_blank()) +
scale_x_discrete(limits = rev(levels(allcazfull$Family))) +
xlab("") +
ylab("") +
guides(fill = FALSE, size = FALSE) +
coord_flip()
plot
And here's the resulting figure:


Points not remaining clear over bar when using dodge

When I make my bar graph using this code, it becomes unclear which bar the points are supposed to be above. This seems to happen when I add the col aesthetic.
Any help would be great!
ggplot(Data_Task1, aes(Type, Percentage_Correct_WND, fill = Condition, col = Animal)) +
geom_bar(stat = "summary", col = "black", position = "dodge") +
geom_point(position = position_dodge(0.9)) +
labs(x = "", y = "% Correct Without No Digs") +
scale_fill_brewer(palette = "Blues") +
scale_colour_manual(values = c("#000000", "#FF9933", "#00FF33", "#FF0000", "#FFFF00", "#FF00FF")) +
theme(
panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
axis.line.x = element_line(colour = "black", size = 0.5),
axis.line.y = element_line(colour = "black", size = 0.5)
)
The issue is that ggplot2 does not know the column in your dataset it should use for the dodging when it comes to the point geom. For the bar geom, it's obvious because different bars are drawn with different fill colors at the same x axis value. For your point geom, ggplot2 is dodging based on the color aesthetic. If you want to force grouping or dodging based on a specific column, you should assign that column to the group= aesthetic.
Here's an example using mtcars. I'm forcing numeric variables to be discrete via as.factor() here.
library(ggplot2)
# plot with incorrect dodging
ggplot(mtcars, aes(x=as.factor(carb), y=mpg, fill=as.factor(cyl), color=as.factor(gear))) +
geom_bar(position="dodge", stat="summary", col='black') +
geom_point(position=position_dodge(0.9), size=3) +
theme_classic()
The bars are dodging based on as.factor(cyl), assigned to the fill aesthetic, but the points are dodging based on as.factor(gear), assigned to the color aesthetic. We override the color aesthetic in the geom_bar() command (as OP did) by defining col='black'.
The solution is to force the points to be grouped (and therefore dodged) based on the same column used for the fill aesthetic, so the mapping is group=as.factor(cyl).
ggplot(mtcars, aes(x=as.factor(carb), y=mpg, fill=as.factor(cyl), color=as.factor(gear))) +
geom_bar(position="dodge", stat="summary", col='black') +
geom_point(aes(group=as.factor(cyl)), position=position_dodge(0.9), size=3) +
theme_classic()
Applied to OP's case, the dodging should work with this code:
ggplot(Data_Task1, aes(Type, Percentage_Correct_WND, fill = Condition, col = Animal)) +
geom_bar(stat = "summary", col = "black", position = "dodge") +
# adjust group here to Condition (same as fill)
geom_point(aes(group = Condition), position = position_dodge(0.9)) +
labs(x = "", y = "% Correct Without No Digs") +
scale_fill_brewer(palette = "Blues") +
scale_colour_manual(values = c("#000000", "#FF9933", "#00FF33", "#FF0000", "#FFFF00", "#FF00FF")) +
theme(
panel.grid.major.x = element_blank(),
panel.grid.minor.x = element_blank(),
axis.line.x = element_line(colour = "black", size = 0.5),
axis.line.y = element_line(colour = "black", size = 0.5)
)
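To check that the points really land on the bars, the dodged x positions of both layers can be compared with layer_data(). This is only a minimal sketch: the toy data frame and its column names (type, cond, animal, value) are made up for illustration and merely mimic the structure of OP's data.

```r
library(ggplot2)

# Hypothetical toy data: two conditions and two animals within each type
toy <- data.frame(
  type   = rep(c("A", "B"), each = 4),
  cond   = rep(c("c1", "c2"), times = 4),
  animal = rep(c("a1", "a1", "a2", "a2"), times = 2),
  value  = 1:8
)

p <- ggplot(toy, aes(type, value, fill = cond, colour = animal)) +
  geom_bar(stat = "summary", fun = mean, colour = "black", position = "dodge") +
  # group by the same column that defines the bars
  geom_point(aes(group = cond), position = position_dodge(0.9))

# layer_data() returns each layer's data after stats and position adjustments
bar_x   <- sort(unique(as.numeric(layer_data(p, 1)$x)))
point_x <- sort(unique(as.numeric(layer_data(p, 2)$x)))

print(bar_x)
print(point_x)
```

If the grouping is right, both vectors should contain the same positions. Dropping aes(group = cond) from geom_point() should instead spread the points over more positions than there are bars.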

In ggplot2 how can I scale the legend when using two graph types?

I'm using ggplot2 with both + geom_line() + geom_point(). I have the colors/shapes worked out, but I can't scale the legend appropriately. If I do nothing it's tiny, and if I enlarge it, the color blocks the shape.
For example:
You can see that the shapes and colors are both in the legend, but the shapes are being drawn over by the colors. I would like to have shapes of the appropriate color drawn in the legend, but can't figure out how to do it.
My plot is being drawn as follows:
ggplot(data=melted, aes(x=gene, y=value, colour=variable, shape=variable, group = variable, stroke=3, reorder(gene, value))) +
theme_solarized() +
scale_colour_solarized("blue") +
geom_line() +
geom_point() +
theme(axis.text.x = element_text(angle = 90, hjust = 1), plot.title = element_text(size=16, face="bold"), legend.title=element_blank(), legend.text=element_text(size=20)) +
ggtitle('Signiture Profiles') +
labs(x="Gene", y=expression(paste("Expression"), title="Expression")) +
scale_colour_manual(name = "Virus / Time", labels = c("Mock", "ACali09_day1", "ACali09_day3", "ACali09_day8", "AShng113_day1", "AShng113_day3", "AShng113_day8", "AChkShng113_day1", "AChkShng113_day3", "AChkShng113_day8"), values = c("#ff420e","#89da59","#89da59","#89da59","#376467","#376467","#376467","#00293c","#00293c","#00293c")) +
scale_shape_manual(name = "Virus / Time", labels = c("Mock", "ACali09_day1", "ACali09_day3", "ACali09_day8", "AShng113_day1", "AShng113_day3", "AShng113_day8", "AChkShng113_day1", "AChkShng113_day3", "AChkShng113_day8"), values = c(0,1,2,3,1,2,3,1,2,3)) +
guides(colour = guide_legend(override.aes = list(size=12)))
Here is some example data, as requested: Example Data
Thanks in advance for any help you can provide.
You could perhaps rethink how you are differentiating your variables.
You could do something like the following. Note the changes in the first line, where I have separated the component parts of variable rather than setting colours and shapes via your scale statements. (I haven't got your theme, so I left that out).
ggplot(data=melted, aes(x=gene,
y=value,
colour=gsub("_.*","",variable),
shape=gsub(".*_","",variable),
group = variable,
stroke=3,
reorder(gene, value))) +
geom_line() +
geom_point() +
theme(axis.text.x = element_text(angle = 90, hjust = 1),
plot.title = element_text(size=16, face="bold"),
legend.title=element_blank(),
legend.text=element_text(size=20)) +
ggtitle('Signiture Profiles') +
labs(x="Gene", y=expression(paste("Expression"), title="Expression")) +
guides(shape = guide_legend(override.aes = list(size=5)),
colour = guide_legend(override.aes = list(size=5)))
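To see what the two gsub() calls in the first line do, here is a small base R sketch. The labels are copied from the scale_*_manual() calls in the question; the names virus and time are only illustrative.

```r
# Labels taken from the scale_*_manual() calls in the question
variable <- c("Mock", "ACali09_day1", "ACali09_day3", "AShng113_day8")

# Drop the first underscore and everything after it: the virus part
virus <- gsub("_.*", "", variable)

# Drop everything up to and including the last underscore: the time part
time <- gsub(".*_", "", variable)

print(data.frame(variable, virus, time))
```

Note that "Mock" contains no underscore, so it is left unchanged by both calls and appears as a level in both the colour and the shape legend.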

ggplot2, facet wrap, fixed y scale for each row, free scale between rows

I would like to produce a plot using facet_wrap that has a different y scale for each row of the wrap. In other words, with fixed scales on the same row, free scales on different rows, with a fixed x scale. Free scales don't give me exactly what I'm looking for, nor does facet_grid. If possible, I'd like to avoid creating 2 separate plots and then pasting them together. I'm looking for a result like the plot below, but with a y scale max of 300 for the first row, and a y scale max of 50 in the second row. Thanks for any help!
Here is my code:
library(ggplot2)
library(reshape)
# set up data frame
dat <- data.frame(jack = c(150,160,170),
surgeon = c(155,265,175),
snapper = c(10,15,12),
grouper = c(5,12,50))
dat$island<-c("Oahu","Hawaii","Maui")
df<-melt(dat)
# plot
ggplot(df, aes(fill=variable, y=value, x=island)) +
geom_bar(width = 0.85, position= position_dodge(width=0.5),stat="identity", colour="black") +
facet_wrap(~variable, scales = "free_y",ncol=2) +
theme_bw() +
theme(strip.text = element_text(size=15, face="bold"))+
theme(legend.position="none")+
theme(panel.grid.major = element_line(colour = "white", size = 0.2))+
theme(panel.grid.minor = element_line(colour = "white", size = 0.5))+
theme(axis.text.x = element_text(angle = 90, hjust =1, vjust =0.5, size=18))+
labs(y = expression(paste("Yearly catch (kg)")))
Drawing on one of the lower ranked answers from the link Eric commented, you can add a layer that blends into the background to enforce the axes.
Here I created a second data frame (df2) that puts a single point at "Hawaii" and the max value you wanted (300 or 50) for the four variable/fish types. By manually setting the color of the geom_point to white, it fades into the background.
library(ggplot2)
library(reshape)
# set up data frame
dat <- data.frame(jack = c(150,160,170),
surgeon = c(155,265,175),
snapper = c(10,15,12),
grouper = c(5,12,50))
dat$island<-c("Oahu","Hawaii","Maui")
df<-melt(dat)
#> Using island as id variables
df2 <- data.frame(island = rep("Hawaii",4), variable = c("jack","surgeon","snapper","grouper"),value = c(300,300,50,50))
ggplot(df, aes(fill=variable, y=value, x=island)) +
geom_bar(width = 0.85, position= position_dodge(width=0.5),stat="identity", colour="black") +
geom_point(data = df2, aes(x = island, y = value), colour = "white") +
facet_wrap(~variable, scales = "free_y",ncol=2) +
theme_bw() +
theme(strip.text = element_text(size=15, face="bold"))+
theme(legend.position="none")+
theme(panel.grid.major = element_line(colour = "white", size = 0.2))+
theme(panel.grid.minor = element_line(colour = "white", size = 0.5))+
theme(axis.text.x = element_text(angle = 90, hjust =1, vjust =0.5, size=18))+
labs(y = expression(paste("Yearly catch (kg)")))
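A variation on the same idea, not part of the answer above, is to use geom_blank() for the dummy layer, which draws nothing but still takes part in scale training, so no colour has to be matched to the background. The sketch below is a trimmed-down version under that assumption: the data are built directly in long format to avoid the reshape dependency, and the names catch and limits are made up here.

```r
library(ggplot2)

fish <- c("jack", "surgeon", "snapper", "grouper")

# Same catch data as above, built directly in long format
catch <- data.frame(
  island   = rep(c("Oahu", "Hawaii", "Maui"), times = 4),
  variable = factor(rep(fish, each = 3), levels = fish),
  value    = c(150, 160, 170, 155, 265, 175, 10, 15, 12, 5, 12, 50)
)

# One dummy row per facet holding the desired upper y limit
limits <- data.frame(
  island   = "Hawaii",
  variable = factor(fish, levels = fish),
  value    = c(300, 300, 50, 50)
)

p <- ggplot(catch, aes(x = island, y = value, fill = variable)) +
  geom_col(colour = "black", width = 0.85) +
  # invisible layer that only stretches the y scale of each panel
  geom_blank(data = limits, aes(x = island, y = value), inherit.aes = FALSE) +
  facet_wrap(~variable, scales = "free_y", ncol = 2) +
  theme_bw() +
  theme(legend.position = "none")

# Trained y range of the panel in row 1, column 1 (jack)
print(layer_scales(p, 1, 1)$y$range$range)
```

The facet order depends on the factor levels of variable, so the dummy rows only line up row by row if both data frames use the same levels in the same order.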

Merging two plots into one, each with a separate legend using R

I have made two separate scatter plots using ggplot2 and I need to combine them into one single plot. Each plot is for a population of lizards under three different treatments (backgrounds).
For each plot I have the following:
csMS = data.frame()
ellMS = data.frame()
centroidsMS = data.frame()
csplotMS = ggplot(csMS, aes(x = RG, y = GB, colour = Background)) + geom_point(size = 3, shape = 17) + #colour by background, circles size 3
geom_path(data = ell.AS, aes(x = RG, y = GB ,colour = Background), size = 1, linetype = 2) + #adding the ellipses
geom_point(data = centroidsMS, size = 3, shape = 17) + #added centroids
geom_errorbar(data = centroidsMS, aes(ymin = GB - se.GB, ymax = GB + se.GB), width = 0) + #add y error bars
geom_errorbarh(data = centroidsMS, aes(xmin = RG - se.RG, xmax = RG + se.RG), height = 0) +
theme_bw() + #white background
theme(axis.title.y = element_text(vjust = 2), axis.title.x = element_text(vjust = -0.3)) + #distance of axis titles from axis
theme(panel.border = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), #no grids
axis.line = element_line(colour = "black")) + #black axes
theme(text = element_text(size = 30)) + #font size
ylab("(G-B)/(G+B)") + xlab("(R-G)/(R+G)") + # Set text for axes labels
scale_colour_manual(values = c("black","#FF6600", "yellow1")) + #changed default colours
labs(colour = "Murray Sunset NP") +
theme(legend.title = element_text(size = "20")) + #changes the legend title
theme(legend.text = element_text(size = "20")) + #changes the legend title
theme(legend.key = element_blank()) + #removed little squares around legend symbols
theme(legend.direction = "horizontal", legend.position = c(.5, .85))
I tried
csASMS = csplotAS + csplotMS
but I get an error message: "Error in p + o : non-numeric argument to binary operator In addition: Warning message: Incompatible methods ("+.gg", "Ops.data.frame") for "+" "
I also tried
csASMS = grid.arrange(csplotAS, csplotMS)
but this places one plot on top of the other. I need to combine both plots so that they are basically one plot with two separate legends, as each plot uses different conventions to indicate the different lizard populations.
Any help will be greatly appreciated.
****EDIT**** Dec 12/ 2014
I have managed to combine the two plots into one but still have the problem of the separate legends. To try to simplify the question and as per cdeterman's request I'm adding a simpler form of the code with some sample data:
data frames: p1 and p2
> p1
  treatment x y
1     Black 1 1
2    Orange 2 2
3    Yellow 3 3
> p2
  treatment x y
1     Black 4 4
2    Orange 5 5
3    Yellow 6 6
I used the following code to make a plot that includes both data frames:
plot = ggplot(p1, aes(x = x, y = y, colour = treatment)) + geom_point(size = 3) + #colour by background, circles size 3
theme_bw() + #white background
theme(axis.title.y = element_text(vjust = 2), axis.title.x = element_text(vjust = -0.3)) + #distance of axis titles from axis
theme(panel.border = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), #no grids
axis.line = element_line(colour = "black")) + #black axes
theme(text = element_text(size = 30)) + #font size
scale_colour_manual(values = c("black","#FF6600", "yellow1")) + #changed default colours
labs(colour = "p1") +
theme(legend.title = element_text(size = "20")) + #changes the legend title
theme(legend.text = element_text(size = "20")) + #changes the legend title
theme(legend.key = element_blank()) + #removed little squares around legend symbols
theme(legend.direction = "horizontal", legend.position = c(.33, 1)) +
# Now to add the second plot/ No need to code for axis titles, titles positions,etc b/c it's already coded in the first plot
geom_point(data = p2, aes(x = x, y = y, colour = treatment), size = 3, shape = 17)
This produces a graph with each data frame represented by a different symbol (circles for p1 and triangles for p2) but with only one combined legend (with triangles superimposed over circles). How can I get two separate legends, one for each data frame?
Thank you!
After doing some research and trying different things I was able to solve PART of my problem. To add two plots together, one needs to be plotted first and the other added on top of it using
geom_point()
My new code looks like this:
csplotASMS = ggplot(csAS, aes(x = RG, y = GB, colour = Background)) + geom_point(size = 3) + #colour by background, circles size 3
geom_path(data = ell.AS, aes(x = RG, y = GB ,colour = Background), size = 1, linetype = 1) + #adding the ellipses
geom_point(data = centroidsAS, size = 4) + #added centroids
geom_errorbar(data = centroidsAS, aes(ymin = GB - se.GB, ymax = GB + se.GB), width = 0) + #add y error bars
geom_errorbarh(data = centroidsAS, aes(xmin = RG - se.RG, xmax = RG + se.RG), height = 0) +
theme_bw() + #white background
theme(axis.title.y = element_text(vjust = 2), axis.title.x = element_text(vjust = -0.3)) + #distance of axis titles from axis
theme(panel.border = element_blank(), panel.grid.major = element_blank(), panel.grid.minor = element_blank(), #no grids
axis.line = element_line(colour = "black")) + #black axes
theme(text = element_text(size = 30)) + #font size
ylab("(G-B)/(G+B)") + xlab("(R-G)/(R+G)") + # Set text for axes labels
scale_colour_manual(values = c("black","#FF6600", "yellow1")) + #changed default colours
labs(colour = "Alice Springs") +
theme(legend.title = element_text(size = "20")) + #changes the legend title
theme(legend.text = element_text(size = "20")) + #changes the legend title
theme(legend.key = element_blank()) + #removed little squares around legend symbols
theme(legend.direction = "horizontal", legend.position = c(.33, 1)) +
# Now to add the second plot/ No need to code for axis titles, titles positions,etc b/c it's already coded in the first plot
geom_point(data = csMS, aes(x = RG, y = GB, colour = Background), size = 3, shape = 17) +
geom_path(data = ell.MS, aes(x = RG, y = GB ,colour = Background), size = 1, linetype = 2) + #adding the ellipses
geom_point(data = centroidsMS, size = 4, shape = 17) + #added centroids
geom_errorbar(data = centroidsMS, aes(ymin = GB - se.GB, ymax = GB + se.GB), width = 0) + #add y error bars
geom_errorbarh(data = centroidsMS, aes(xmin = RG - se.RG, xmax = RG + se.RG), height = 0) #add x error bars
and the graph depicts a scatterplot for two populations, each with three treatments. Because the treatments are the same for both populations, I want to use the same colours but different symbols to denote the differences in populations. One population is circles and the other one is triangles.
Now, the part I can't answer yet is how to have two separate legends, one for each "plot", i.e. one for the circles and one for the triangles. At the moment there is a "combined" legend showing triangles superimposed on circles. Each legend should have its own title.
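One possible direction, which is not from the original thread, is to stack the two data frames, add a column identifying the population, and map that column to shape. This gives a colour legend for the treatments and a separate shape legend for the populations, each with its own title, which is not quite the same as one full legend per population. The sketch below uses the sample p1 and p2 data from the edit; the population column and the legend titles are assumptions made for illustration.

```r
library(ggplot2)

# Sample data from the edit above, with a hypothetical `population` column added
p1 <- data.frame(treatment = c("Black", "Orange", "Yellow"), x = 1:3, y = 1:3)
p2 <- data.frame(treatment = c("Black", "Orange", "Yellow"), x = 4:6, y = 4:6)
p1$population <- "p1"
p2$population <- "p2"
both <- rbind(p1, p2)

plot2 <- ggplot(both, aes(x = x, y = y, colour = treatment, shape = population)) +
  geom_point(size = 3) +
  scale_colour_manual(values = c("black", "#FF6600", "yellow1")) +
  scale_shape_manual(values = c(p1 = 16, p2 = 17)) +  # 16 = circle, 17 = triangle
  labs(colour = "Treatment", shape = "Population") +
  theme_bw()

# Shapes and colours actually assigned to the points
print(unique(layer_data(plot2)[, c("shape", "colour")]))
```

Because colour and shape are mapped to different variables, ggplot2 keeps their legends separate instead of merging them into one.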

Create different colours for regression lines

I am creating a scatterplot (around 1,500,000 points), and I am running regressions through it based on a factor variable (see "halfs" below). This is the picture:
As you can see it is difficult to see the "red" regression lines.
This is the data:
For_Cov Per_chg halfs
1 0.8372001 0.002400000 upper half
2 0.7236001 0.002800111 upper half
3 0.6036000 0.000800000 upper half
4 0.8540000 0.000000000 upper half
5 0.9080001 0.003200000 upper half
6 0.8248000 0.000000000 upper half
7 0.1132000 0.000000000 upper half
8 0.2044000 0.007600000 upper half
9 0.2476001 0.085200000 upper half
10 0.2368000 0.003600000 upper half
This is the code:
ggplot(grid_gdp_full, aes(x = For_Cov, y = Def_Chg, group = factor(halfs))) +
geom_point(aes(colour = halfs), alpha = 0.1) +
stat_smooth(method = "lm", formula = y ~ x + I(x^2), size = 4, fullrange=TRUE, aes(group = halfs, colour = halfs), alpha = 1) +
xlab("Initial") +
ylab("Percent") +
ylim(0,0.10) +
scale_x_reverse() +
theme_bw() +theme(
axis.text=element_text(size=12)
,axis.title=element_text(size=14,face="bold")
,plot.background = element_blank()
,panel.grid.major = element_blank()
,panel.grid.minor = element_blank()
,panel.border = element_blank()
,panel.background = element_blank()
) +
theme(axis.line = element_line(color = 'black'))
Does anyone know how to change the code so that the regression lines (created by stat_smooth) have different colors, possibly grey and blue for the two factor levels, without changing the initial color of the points?
Here is the code I am using at the moment and I am still not able to get the two different colored lines:
ggplot(grid_gdp_full, aes(x = For_Cov, y = Def_Chg, fill = factor(halfs))) +
geom_point(aes(colour = halfs), alpha = 0.1, colour="transparent",shape=21) +
stat_smooth(method = "lm", formula = y ~ x + I(x^2), size = 2, fullrange=TRUE, aes(group = factor(halfs)), alpha = 1) +
xlab("Initial") +
ylab("Percent") +
ylim(0,0.1) +
scale_x_reverse() +
theme_bw() +theme(
axis.text=element_text(size=12)
,axis.title=element_text(size=14,face="bold")
,plot.background = element_blank()
,panel.grid.major = element_blank()
,panel.grid.minor = element_blank()
,panel.border = element_blank()
,panel.background = element_blank()
) +
theme(axis.line = element_line(color = 'black')) +
scale_colour_manual(values = c("red","blue")) +
scale_fill_manual(values = c("grey","green"))
dev.off()
Taking advantage of point shape 21, we can use fill for the points and colour for the lines, setting each with the corresponding scale_*_manual() as we want. Note colour="transparent" in geom_point, which removes the coloured borders around the points.
ggplot(mtcars, aes(x=cyl,y=mpg,fill=factor(gear))) +
geom_point(aes(fill=factor(gear)),colour="transparent",shape=21) +
stat_smooth(aes(colour=factor(gear)),method="lm",se = FALSE) +
scale_colour_manual(values = c("red","blue", "green")) +
scale_fill_manual(values = c("orange","pink", "red"))
OK. I see what you want now (I think...): you want the line groups to be a different color than the point groups. In essence you need four colors: two for point groups and two for the line groups.
This worked on your sample, recoded with For_Cov<0.5 to "lower" to split the data.
grid_gdp_full$halfs <- ifelse(grid_gdp_full$For_Cov<0.5,"lower half","upper half")
grid_gdp_full <- cbind(grid_gdp_full,col =
ifelse(grid_gdp_full$halfs=="lower half","lm: lower","lm: upper"))
ggplot(grid_gdp_full, aes(x = For_Cov, y = Per_chg)) +
geom_point(aes(colour = factor(halfs)), alpha = 0.9, size=5) +
stat_smooth(method = "lm", formula = y ~ x + I(x^2), size = 4, fullrange=TRUE,
aes(group = col, colour = factor(col)), alpha = 1) +
xlab("Initial") +
ylab("Percent") +
ylim(0,0.10) +
scale_x_reverse() +
theme_bw() +theme(
axis.text=element_text(size=12)
,axis.title=element_text(size=14,face="bold")
,plot.background = element_blank()
,panel.grid.major = element_blank()
,panel.grid.minor = element_blank()
,panel.border = element_blank()
,panel.background = element_blank()
) +
theme(axis.line = element_line(color = 'black'))
Produces this:
I made the points bigger and set alpha=0.9 so you can see the points in this small sample. Also you reversed the x axis, which makes "lower" and "upper" kind of confusing.
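The code above relies on the default palette for the four levels. To pick the four colours explicitly, a named vector can be passed to scale_colour_manual(). The following is a sketch on synthetic data: the toy data frame stands in for grid_gdp_full, and the colour choices are arbitrary.

```r
library(ggplot2)

set.seed(1)
# Synthetic stand-in for grid_gdp_full, using the column names from the sample
n <- 200
toy <- data.frame(For_Cov = runif(n))
toy$Per_chg <- 0.05 * toy$For_Cov^2 + rnorm(n, sd = 0.005)
toy$halfs <- ifelse(toy$For_Cov < 0.5, "lower half", "upper half")
toy$col <- ifelse(toy$halfs == "lower half", "lm: lower", "lm: upper")

p <- ggplot(toy, aes(x = For_Cov, y = Per_chg)) +
  geom_point(aes(colour = halfs), alpha = 0.3) +
  stat_smooth(aes(group = col, colour = col), method = "lm",
              formula = y ~ x + I(x^2), se = FALSE) +
  # one colour scale with four levels: two for points, two for lines
  scale_colour_manual(values = c(
    "lower half" = "grey70", "upper half" = "grey40",
    "lm: lower"  = "blue",   "lm: upper"  = "red"
  ))

print(unique(layer_data(p, 1)$colour))  # point colours
print(unique(layer_data(p, 2)$colour))  # line colours
```

Naming the values ties each colour to a level regardless of the order in which the levels appear in the combined scale.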
