ggplot won't apply alpha value to some data points - r

Absolute beginner on ggplot. I am plotting the iris dataset and when I set alpha=0.5, it won't apply to some data points.
Here is the code:
ggplot(iris)+
geom_point(aes(Sepal.Length,Sepal.Width,colour=Species),size=5,alpha=0.5)+
labs(x="Sepal Length",y="Sepal Width")+
theme_minimal()
Here is the output I got. As can be seen, the alpha value is not consistent throughout the data points.

The alpha value is being applied, but what you are seeing is due to some points overlapping others exactly. You can see this if you select geom_jitter instead of geom_point:
ggplot(iris)+
geom_jitter(aes(Sepal.Length,Sepal.Width,colour=Species),size=5,alpha=0.5)+
labs(x="Sepal Length",y="Sepal Width")+
theme_minimal()
Each point is semi-transparent on its own, but opacity accumulates where points stack: with alpha set to 0.5, two exactly overlapping points of the same colour combine to an opacity of 1 - (1 - 0.5)^2 = 0.75, and each additional point pushes the spot closer to full colour. If you want points to stay visibly transparent even where they overlap, choose a lower alpha value. The darkening with overlap is a nice property, because it shows where there are large clusters of points.
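The darkening can be checked numerically. The sketch below assumes standard "over" compositing of identical, same-coloured points; the helper name `stacked_alpha` is made up for illustration. It also counts how many iris rows repeat an earlier Sepal.Length/Sepal.Width pair, which are the points drawn exactly on top of another point:

```r
# Hypothetical helper: opacity of n points drawn on top of each other,
# assuming each point is composited "over" the ones beneath it.
stacked_alpha <- function(alpha, n) 1 - (1 - alpha)^n

# Combined opacity for 1 to 4 stacked points at alpha = 0.5
stacked_alpha(0.5, 1:4)

# Rows of iris whose (Sepal.Length, Sepal.Width) pair already appeared earlier
xy <- iris[, c("Sepal.Length", "Sepal.Width")]
n_overlapping <- sum(duplicated(xy))
n_overlapping
```

If `n_overlapping` is greater than zero, some points coincide exactly, which is consistent with the uneven transparency in the plot.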

Related

Controlling size/scale of additional points in ggplot

I have a plot in which I am trying to highlight different points along the line. My plot looks like this:
library(ggplot2)
X<-c(seq(1:20))
Y<-c(2,4,6,3,5,8,6,5,4,3,2,4,6,6,9,8,9,5,4,3)
Col<-as.character(c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5))
value<-c(NA,NA,NA,NA,10,NA,NA,NA,20,NA,NA,20,20,10,NA,NA,NA,10,NA,NA)
DF<-data.frame(X,Y,Col,value)
p<-ggplot(DF, aes(x=X, y=Y)) +geom_line(size=.02) +geom_point(col='black', size=.5)+ geom_jitter(aes(color = Col, size = value),position = position_jitter(height = .2, width = .2)) + scale_colour_manual(values=c("red", "violetred","orange",'blue','steelblue'))
p+guides(size = "none")
On top of the basic line plot, I have 5 different colored dot possibilities defined by the "Col" column. I also have 2 different size classes I'd like to represent using the "value" column. My issue is that the default sizes chosen to represent the "value" column are scaled so that the larger size appears too big in the plot in comparison to the smaller size.
Is there a way to manually set the scale so I can control the drawn size of the added points (hopefully making them closer in size to each other)? The final plot should be similar to the one here, only the value=10 points would be only slightly smaller than the value=20 points.
I've tried using some of the "manual" options, but when I do, it always plots the other "Y" column points, which for now are suppressed because of the NAs.
In the same way you manually define the colours for your points using a scale_colour_ variant, you can define the sizes using scale_size.
In this case you can define the range of sizes to use. Adding + scale_size(range = c(4,6)) seems to give results that fit your description. You can tweak the size of, and the difference in size between the points by changing the numbers.
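A minimal sketch of that suggestion, reusing the data from the question. The range c(4, 6) is the one proposed above; the plot object name `p` and the simplified line layer are just for illustration:

```r
library(ggplot2)

# Data from the question
DF <- data.frame(
  X = 1:20,
  Y = c(2,4,6,3,5,8,6,5,4,3,2,4,6,6,9,8,9,5,4,3),
  Col = as.character(rep(1:5, each = 4)),
  value = c(NA,NA,NA,NA,10,NA,NA,NA,20,NA,NA,20,20,10,NA,NA,NA,10,NA,NA)
)

p <- ggplot(DF, aes(x = X, y = Y)) +
  geom_line() +
  geom_point(colour = "black", size = 0.5) +
  geom_jitter(aes(colour = Col, size = value),
              position = position_jitter(height = 0.2, width = 0.2)) +
  scale_colour_manual(values = c("red", "violetred", "orange", "blue", "steelblue")) +
  scale_size(range = c(4, 6)) +  # smallest value drawn at size 4, largest at size 6
  guides(size = "none")
p
```

Narrowing the range (for example c(5, 6)) brings the two size classes closer together; rows whose value is NA still get no sized point.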

re-sizing ggplot geom_dotplot

I'm having trouble creating a figure with ggplot2. I am using geom_dotplot with center stacking to display my data which are discrete values for 4 categories.
For aesthetic reasons I want to customize the positions of the dots so that:
1. the empty space between dots along the y axis is reduced (i.e. the dots are 1 value large)
2. the distributions fit and don't overlap
I've adjusted the binwidth and dotsize to achieve aesthetic goal 1, but that requires me to fiddle with the ylim() parameter to make sure that the groups fit in the plot. This results in a plot with more white space and few numbers on the y axis.
Question: Can anyone explain a way to resize the empty space on this plot?
My code is below:
plot <- ggplot(figdata, aes(y=Counts, x=category, col=strain)) +
geom_dotplot(aes(fill=strain), dotsize=1, binwidth=.7,
binaxis= "y",stackdir ="centerwhole", stackratio=.7) +
ylim(18,59)
plot + scale_color_manual(values=c("#E69F00", "#56B4E9")) +
geom_errorbar(stat="hline", yintercept="mean",
aes( ymax=..y..,ymin=..y.., group = category, width = 0.5),
color="black")
Which produces:
EDIT: Incorporating jitter will allow all the data to fit, but I don't want to add noise to this data and would prefer to show it as discrete data.
Adjusting the binwidth and dotsize to 0.3 as suggested below also fits all the data; however, it leaves too much white space.
I think that I might have to transform my data so that the values are steps smaller than 1, in order to get everything to fit horizontally and the dot sizes to be large enough to reduce white space.
I think the easiest way is using coord_cartesian:
plot + scale_color_manual(values=c("#E69F00", "#56B4E9")) +
geom_errorbar(stat="hline", yintercept="mean",
aes( ymax=..y..,ymin=..y.., group = category, width = 0.5),
color="black") +
coord_cartesian(ylim=c(17,40))
Which gives me this plot (with fake data that are not as neatly distributed as yours):
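For reference, a self-contained sketch of the same idea. It assumes a recent ggplot2 in which `stat = "hline"` is no longer available, so `stat_summary()` is swapped in to draw the mean bars; the data frame below is made up and only stands in for the question's `figdata`:

```r
library(ggplot2)

# Made-up data standing in for the question's `figdata`
set.seed(1)
figdata <- data.frame(
  category = rep(c("A", "B", "C", "D"), each = 20),
  strain   = rep(c("s1", "s2"), times = 40),
  Counts   = sample(18:59, 80, replace = TRUE)
)

p <- ggplot(figdata, aes(x = category, y = Counts, colour = strain)) +
  geom_dotplot(aes(fill = strain), binaxis = "y", stackdir = "centerwhole",
               binwidth = 0.7, dotsize = 1, stackratio = 0.7) +
  # Horizontal bar at each category mean (stand-in for stat = "hline")
  stat_summary(aes(group = category), fun = mean, fun.min = mean, fun.max = mean,
               geom = "errorbar", width = 0.5, colour = "black") +
  scale_colour_manual(values = c("#E69F00", "#56B4E9")) +
  # Zoom the view; unlike ylim(), this does not drop data outside the window
  coord_cartesian(ylim = c(17, 40))
p
```

The practical difference from `ylim()` is that `coord_cartesian()` only changes the visible window, so the means are still computed from all observations.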

Create a bivariate color gradient legend using lattice for an spplot overlaying polygons with alpha

I've created a map by overlaying polygons using spplot and with the alpha value of the fill set to 10/255 so that areas with more polygons overlapping have a more saturated color. The polygons are set to two different colors (blue and red) based on a binary variable in the attribute table. Thus, while the color saturation depends on the number of polygons overlapping, the color depends on the ratio of the blue and red classes of polygons.
There is, of course, no easy built-in legend for this so I need to create one from scratch. There is a nice solution to this in base graphics found here. I also came up with a not-so-good hack to do this in ggplot based on this post from kohske. A similar question was posted here and I did my best to give some solutions, but couldn't really come up with a solid answer. Now I need to do the same for myself, but I specifically would like to use R and also use grid graphics.
This is the ggplot hack I came up with
library(ggplot2)
library(reshape2) # for melt()
Variable_A <- 100 # max of variable
Variable_B <- 100
x <- melt(outer(1:Variable_A, 1:Variable_B)) # set up the data frame to plot from
p <- ggplot(x) + theme_classic() + scale_alpha(range=c(0,0.5), guide="none") +
geom_tile(aes(x=Var1, y=Var2, fill="Variable_A", col.regions="red", alpha=Var1)) +
geom_tile(aes(x=Var1, y=Var2, fill="Variable_B", col.regions="blue", alpha=Var2)) +
scale_x_continuous(limits = c(0, Variable_A), expand = c(0, 0)) +
scale_y_continuous(limits = c(0, Variable_B), expand = c(0, 0)) +
xlab("Variable_A") + ylab("Variable_B") +
guides(fill="none")
p
Which gives this:
This doesn't work for my purposes for two reasons. 1) Because the alpha value varies, the second color plotted (blue in this case) overwhelms the first one as the alpha values get higher. The correct legend should have blue and red mixed evenly along the 1:1 diagonal. In addition, the colors don't really properly correspond to the map colors. 2) I don't know how to overlay a ggplot object on the lattice map created with spplot. I tried to create a grob using ggplotGrob(p), but still couldn't figure out how to add the grob to the spplot map.
The ideal solution would be to create a similar figure using lattice graphics. I think that using tiles is probably the right solution, but what would be best is to have the alpha values stay constant and vary the number of tiles plotted going from left to right (for red) and bottom to top (for blue). Thus, the colors and saturation should properly match the map (I think...).
Any help is much appreciated!
How about mapping the angle to color, and alpha to the sum of the two variables -- does this do what you want?
d <- expand.grid(x=1:100, y=1:100)
ggplot(d, aes(x, y, fill=atan(y/x), alpha=x+y)) +
geom_tile() +
scale_fill_gradient(high="red", low="blue")+
theme(legend.position="none", panel.background=element_blank())
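One way to see why this puts an even mix on the 1:1 line: atan(y/x) is pi/4 for every cell on the diagonal, and on a square grid the angles are spread symmetrically around pi/4, so the diagonal sits at the midpoint of the blue-to-red gradient. A quick check using only the grid from the answer (the column names `angle` and `total` are added here for illustration); whether the resulting colours match the spplot map is not something this check can show:

```r
d <- expand.grid(x = 1:100, y = 1:100)
d$angle <- atan(d$y / d$x)  # value mapped to fill
d$total <- d$x + d$y        # value mapped to alpha

# Angle on the 1:1 diagonal (a single value if all diagonal cells agree)
diag_angle <- unique(d$angle[d$x == d$y])
diag_angle

# Midpoint of the data range seen by the fill scale
fill_mid <- mean(range(d$angle))
fill_mid
```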

diverging size scale ggplot2

I am trying to map model/obs differences at locations in space. I have color mapped to the diff but I would also like to map size to diff. I'm currently mapping size to abs(diff), but it produces two different legends that I would like to combine. Mapping size to diff creates small points for negative values and large points for positive values, but I really only want the magnitude represented for over/under predictions. In case it makes a difference, I would also like the scales to be discrete.
Thanks
Some test data:
so.df=data.frame(lat=runif(50,33,43),long=runif(50,-112,-104),diff=runif(50,-2,2))
ggplot()+
geom_point(data=so.df,aes(x=long,y=lat,color=diff,size=abs(diff)),alpha=0.8)+
scale_color_gradient2(low='red',mid='white',high='blue',limits=c(-0.6,0.6),breaks=c(-0.6,-0.4,-0.2,-0.1,0,0.1,0.2,0.6))+
guides(color=guide_legend('SWE Differences (m)'),size=guide_legend('SWE Difference\nMagnitudes (m)'))+
coord_cartesian(xlim=c(-112.5,-104.25),ylim=c(33,44))+
theme_bw()
EDIT:
To use a discrete color scale I'm using the following (open to suggestions):
so.df$cuts=cut_interval(so.df$diff,length=0.15)
ggplot()+
geom_path(data=states,aes(x=long,y=lat,group=group),color='grey10')+
geom_point(data=so.df,aes(x=long,y=lat,color=cuts,size=abs(diff)),alpha=0.8)+
coord_cartesian(xlim=c(-112.5,-104.25),ylim=c(33,44))+
scale_colour_brewer(type='div',palette='RdYlBu')+
guides(color=guide_legend('SWE Differences (m)'),size=guide_legend('SWE Difference\nMagnitudes (m)'))+
theme_bw()
If you map your sizes to your cuts column and use scale_size_manual you can get the diverging size scale you want.
ggplot() +
geom_point(data=so.df,aes(x=long,y=lat,size=cuts,color=cuts)) +
scale_size_manual(values = c(8,6,4,2,1,2,4,6,8),guide="legend") +
coord_cartesian(xlim=c(-112.5,-104.25),ylim=c(33,44)) +
scale_colour_brewer(type='div',palette='RdBu')
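A self-contained sketch of this approach follows. Two things in it are assumptions rather than part of the original answer: the test data are drawn from (-0.6, 0.6) instead of (-2, 2) so the number of bins stays small, and the bins come from fixed breaks that are symmetric around zero, so that one size can be listed per bin:

```r
library(ggplot2)

set.seed(42)
# Test data shaped like the question's, with diff limited to (-0.6, 0.6)
so.df <- data.frame(lat  = runif(50, 33, 43),
                    long = runif(50, -112, -104),
                    diff = runif(50, -0.6, 0.6))

# Fixed breaks symmetric around zero: 9 breaks give 8 bins
so.df$cuts <- cut(so.df$diff, breaks = seq(-0.6, 0.6, length.out = 9),
                  include.lowest = TRUE)

# One size per bin: largest at both extremes, smallest next to zero
bin_sizes <- c(8, 6, 4, 2, 2, 4, 6, 8)

p <- ggplot(so.df, aes(x = long, y = lat, size = cuts, colour = cuts)) +
  geom_point(alpha = 0.8) +
  # drop = FALSE keeps empty bins, so sizes and colours stay aligned with bins
  scale_size_manual(values = bin_sizes, drop = FALSE) +
  scale_colour_brewer(type = "div", palette = "RdBu", drop = FALSE) +
  theme_bw()
p
```

Because size and colour are mapped to the same variable with the same levels, ggplot2 should be able to merge them into a single legend. The length of `bin_sizes` has to match the number of bins.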

controlling tick limits and direction in scale_colour_gradient in ggplot2 in R?

I am using scale_colour_gradient with a limits= argument and a transformation to a log scale. I want the scale gradient to end exactly in those ticks, and have the ticks face outward rather than inward. Is this possible?
For example:
library(scales) # for log2_trans()
ggplot(iris) +
geom_point(aes(x=Sepal.Width, y=Sepal.Length, colour=Sepal.Length)) +
scale_colour_gradient(limits=c(3,8), trans=log2_trans())
I'd like the scale to start exactly at 3 (i.e. show no colors with values less than 3) and end exactly at 8 (show no values higher than 8) and have the ticks facing outward (preferably only on the right side of the scale, like in this figure except not horizontal). The limits are important since in real data sometimes you clip values based on a cutoff and want to colour only the points that are in that range, so it's confusing to have a scale bar that shows colors outside the range (since points were selected to be in that range only.)
I'm not aware of a (built-in) option to alter the colorbar ticks in the manner you describe. But I think this hits the rest of your points:
ggplot(iris) +
geom_point(aes(x=Sepal.Width, y=Sepal.Length, colour=Sepal.Length)) +
scale_colour_gradient(limits=c(3,8),
breaks = round(seq(3,8,length.out = 4),1),
trans=log2_trans()) +
guides(colour = guide_colorbar(draw.ulim = FALSE,draw.llim = FALSE))
You could remove the ticks entirely with ticks = FALSE. For different styles of tick marks you'd probably have to hack the grid code itself.
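A sketch of the `ticks = FALSE` variant, kept self-contained. It passes the transformation by name (`"log2"`) rather than calling `log2_trans()`, and the explicit break values are chosen here only for illustration; recent ggplot2 releases may print a deprecation message for the `trans` argument:

```r
library(ggplot2)

p <- ggplot(iris) +
  geom_point(aes(x = Sepal.Width, y = Sepal.Length, colour = Sepal.Length)) +
  scale_colour_gradient(limits = c(3, 8),
                        breaks = c(3, 4, 6, 8),  # labels at both ends of the bar
                        trans = "log2") +        # same transform as log2_trans()
  guides(colour = guide_colourbar(ticks = FALSE))  # colour bar without tick marks
p
```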
