diverging size scale ggplot2 - r

I am trying to map model/obs differences at locations in space. I have color mapped to the diff, but I would also like to map size to diff. I'm currently mapping size to abs(diff), but that produces two separate legends that I would like to combine. Mapping size to diff directly creates small points for negative values and large points for positive values, whereas I only want the magnitude of the over/under prediction represented. In case it makes a difference, I would also like the scales to be discrete.
Thanks
Some test data:
so.df=data.frame(lat=runif(50,33,43),long=runif(50,-112,-104),diff=runif(50,-2,2))
ggplot()+
geom_point(data=so.df,aes(x=long,y=lat,color=diff,size=abs(diff)),alpha=0.8)+
scale_color_gradient2(low='red',mid='white',high='blue',limits=c(-0.6,0.6),breaks=c(-0.6,-0.4,-0.2,-0.1,0,0.1,0.2,0.6))+
guides(color=guide_legend('SWE Differences (m)'),size=guide_legend('SWE Difference\nMagnitudes (m)'))+
coord_cartesian(xlim=c(-112.5,-104.25),ylim=c(33,44))+
theme_bw()
EDIT:
to use a discrete color scale I'm using the following (open to suggestions)
so.df$cuts=cut_interval(so.df$diff,length=0.15)
ggplot()+
geom_path(data=states,aes(x=long,y=lat,group=group),color='grey10')+
geom_point(data=so.df,aes(x=long,y=lat,color=cuts,size=abs(diff)),alpha=0.8)+
coord_cartesian(xlim=c(-112.5,-104.25),ylim=c(33,44))+
scale_colour_brewer(type='div',palette='RdYlBu')+
guides(color=guide_legend('SWE Differences (m)'),size=guide_legend('SWE Difference\nMagnitudes (m)'))+
theme_bw()

If you map your sizes to your cuts column and use scale_size_manual you can get the diverging size scale you want.
ggplot() + geom_point(data=so.df,aes(x=long,y=lat,size=cuts,color=cuts)) +
# values needs one entry per level of cuts (nine bins assumed here)
scale_size_manual(values =
c(8,6,4,2,1,2,4,6,8),guide="legend") +
coord_cartesian(xlim=c(-112.5,-104.25),ylim=c(33,44)) +
scale_colour_brewer(type='div',palette='RdBu')
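As a rough sketch of the same idea, the sizes can be derived from the bins instead of typed by hand, so that the number of sizes always matches the number of levels. The bin width of 0.5, the size formula 1 + 4 * |midpoint|, and the fixed seed are assumptions made for illustration only; they are not part of the original question or answer.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed only so the sketch is reproducible
so.df <- data.frame(lat  = runif(50, 33, 43),
                    long = runif(50, -112, -104),
                    diff = runif(50, -2, 2))

# Symmetric breaks around zero (hypothetical bin width of 0.5)
brks <- seq(-2, 2, by = 0.5)
so.df$cuts <- cut(so.df$diff, breaks = brks, include.lowest = TRUE)

# One size per bin, growing with the magnitude of the bin midpoint,
# so bins at equal distance from zero get the same size
mids  <- (head(brks, -1) + tail(brks, -1)) / 2
sizes <- setNames(1 + 4 * abs(mids), levels(so.df$cuts))

# Mapping both color and size to cuts lets ggplot2 merge the two legends
p_div <- ggplot(so.df, aes(x = long, y = lat, color = cuts, size = cuts)) +
  geom_point(alpha = 0.8) +
  scale_size_manual(values = sizes, drop = FALSE) +
  scale_colour_brewer(type = "div", palette = "RdBu", drop = FALSE) +
  labs(color = "SWE Differences (m)", size = "SWE Differences (m)") +
  theme_bw()
```

Because both aesthetics share the same variable and the same legend title, a single combined legend should be drawn.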

Related

Controlling size/scale of additional points in ggplot

I have a plot in which I am trying to highlight different points along the line. My plot looks like this:
library(ggplot2)
X<-c(seq(1:20))
Y<-c(2,4,6,3,5,8,6,5,4,3,2,4,6,6,9,8,9,5,4,3)
Col<-as.character(c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5))
value<-c(NA,NA,NA,NA,10,NA,NA,NA,20,NA,NA,20,20,10,NA,NA,NA,10,NA,NA)
DF<-data.frame(X,Y,Col,value)
p <- ggplot(DF, aes(x=X, y=Y)) +
  geom_line(size=.02) +
  geom_point(col='black', size=.5) +
  geom_jitter(aes(color = Col, size = value),
              position = position_jitter(height = .2, width = .2)) +
  scale_colour_manual(values=c("red", "violetred", "orange", 'blue', 'steelblue'))
p+guides(size = "none")
On top of the basic line plot, I have 5 different colored dot possibilities defined by the "Col" column. I also have 2 different size classes I'd like to represent using the "value" column. My issue is that the default sizes chosen to represent the "value" column are scaled so that the larger size appears too big in the plot compared with the smaller size.
Is there a way to manually set the scale so I can control the drawn size of the added points (ideally making them closer in size to each other)? The final plot should be similar to the one here, except that the value=10 points would be only slightly smaller than the value=20 points.
I've tried using some of the "manual" options, but when I do, it always plots the other "Y" column points, which for now are suppressed because of the NAs.
In the same way you manually define the colours for your points using a scale_colour_ variant, you can define the sizes using scale_size.
In this case you can define the range of sizes to use. Adding + scale_size(range = c(4,6)) seems to give results that fit your description. You can tweak the size of, and the difference in size between the points by changing the numbers.
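A minimal sketch of that suggestion, using the data from the question. It swaps geom_jitter for geom_point and drops the line width setting to keep the example short, and na.rm = TRUE is an addition that only silences the warning about rows with a missing size.

```r
library(ggplot2)

DF <- data.frame(
  X     = 1:20,
  Y     = c(2,4,6,3,5,8,6,5,4,3,2,4,6,6,9,8,9,5,4,3),
  Col   = as.character(rep(1:5, each = 4)),
  value = c(NA,NA,NA,NA,10,NA,NA,NA,20,NA,NA,20,20,10,NA,NA,NA,10,NA,NA)
)

p <- ggplot(DF, aes(x = X, y = Y)) +
  geom_line() +
  geom_point(colour = "black", size = 0.5) +
  # rows with value = NA have no size and are not drawn by this layer
  geom_point(aes(colour = Col, size = value), na.rm = TRUE) +
  scale_colour_manual(values = c("red", "violetred", "orange", "blue", "steelblue")) +
  # smallest value is drawn at size 4, largest at size 6
  scale_size(range = c(4, 6)) +
  guides(size = "none")
```

Narrowing the range (for example c(5, 6)) brings the two size classes closer together.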

re-sizing ggplot geom_dotplot

I'm having trouble creating a figure with ggplot2. I am using geom_dotplot with center stacking to display my data which are discrete values for 4 categories.
For aesthetic reasons I want to customize the positions of the dots so that:
1. the empty space between dots along the y axis is reduced (i.e. the dots are 1 value large)
2. the distributions fit and don't overlap
I've adjusted the binwidth and dotsize to achieve aesthetic goal 1, but that requires me to fiddle with the ylim() parameter to make sure that the groups fit in the plot. This results in a plot with more white space and few numbers on the y axis.
Question: Can anyone explain a way to resize the empty space on this plot?
My code is below:
plot <- ggplot(figdata, aes(y=Counts, x=category, col=strain)) +
geom_dotplot(aes(fill=strain), dotsize=1, binwidth=.7,
binaxis= "y",stackdir ="centerwhole", stackratio=.7) +
ylim(18,59)
plot + scale_color_manual(values=c("#E69F00", "#56B4E9")) +
geom_errorbar(stat="hline", yintercept="mean",
aes( ymax=..y..,ymin=..y.., group = category, width = 0.5),
color="black")
Which produces:
EDIT: Incorporating jitter will allow all the data to fit, but I don't want to add noise to this data and would prefer to show it as discrete data.
Adjusting the binwidth and dotsize to 0.3 as suggested below also fits all the data; however, it leaves too much white space.
I think that I might have to transform my data so that the values are in steps smaller than 1, in order to get everything to fit horizontally and make the dots large enough to reduce the white space.
I think the easiest way is using coord_cartesian:
plot + scale_color_manual(values=c("#E69F00", "#56B4E9")) +
geom_errorbar(stat="hline", yintercept="mean",
aes( ymax=..y..,ymin=..y.., group = category, width = 0.5),
color="black") +
coord_cartesian(ylim=c(17,40))
Which gives me this plot (with fake data that are not as neatly distributed as yours):
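To illustrate why coord_cartesian behaves differently from ylim(), here is a small sketch with made-up data (the data frame `fake` is a hypothetical stand-in for figdata, and geom_point is used instead of geom_dotplot to keep it short). ylim() sets scale limits, so values outside them are turned into NA before drawing, whereas coord_cartesian only zooms the view and keeps every row.

```r
library(ggplot2)

# Hypothetical stand-in for figdata
fake <- data.frame(category = rep(c("a", "b"), each = 5),
                   Counts   = c(10, 20, 30, 40, 50, 15, 25, 35, 45, 55))

base <- ggplot(fake, aes(x = category, y = Counts)) + geom_point()

p_zoom <- base + coord_cartesian(ylim = c(17, 40))  # zoom only; data untouched
p_clip <- base + ylim(17, 40)                       # out-of-range values become NA

# Compare the y values each plot actually works with
y_zoom <- layer_data(p_zoom)$y
y_clip <- layer_data(p_clip)$y
```

With a binned geom such as geom_dotplot the distinction matters more, since the binning is computed on whatever data survives the scale limits.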

Using a uniform color palette among different ggplot2 graphs with factor variable

I am using ggplot2 to create several plots about the same data. In particular I am interested in plotting observations according to a factor variable with 6 levels ("cluster").
But the plots produced by ggplot2 use different palettes every time!
For example, if I make a bar plot with this formula I get this result (this palette is what I expect to obtain):
qplot(cluster, data = data, fill = cluster) + ggtitle("Clusters")
And if I make a scatter plot and I try to color the observations according to their belonging to a cluster I get this result (notice that the color palette is different):
ggplot(data, aes(liens_ratio,RT_ratio)) +
geom_point(col=data$cluster, size=data$nombre_de_tweet/100+2) +
geom_smooth() +
ggtitle("Links - RTs")
Any idea on how to solve this issue?
I can't be certain this will work in your specific case without a reproducible example, but I'm reasonably confident that all you need to do is set your color inside an aes() call within the geom you want to color. That is,
ggplot(data, aes(x = liens_ratio, y = RT_ratio)) +
geom_point(aes(color = cluster, size = nombre_de_tweet/100+2)) +
geom_smooth() +
ggtitle("Links - RTs")
If all plots you make use the same data and this basic format, the color palette should be the same regardless of the geom used. Additional elements, such as the line from geom_smooth() will not be changed unless they are also explicitly colored.
The palette will just be the default one, of course; to change it look into scale_color_manual.
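If a specific palette should be shared across plots, one option is a single named colour vector passed to both scale_fill_manual and scale_colour_manual. The sketch below uses made-up data (`dat`) and arbitrary hex colours; only the column names come from the question.

```r
library(ggplot2)

# Hypothetical data with a 6-level factor called cluster
set.seed(42)
dat <- data.frame(
  cluster     = factor(rep(1:6, each = 10)),
  liens_ratio = runif(60),
  RT_ratio    = runif(60)
)

# One named vector of colours, keyed by factor level (colours are arbitrary)
cluster_cols <- setNames(
  c("#E41A1C", "#377EB8", "#4DAF4A", "#984EA3", "#FF7F00", "#A65628"),
  levels(dat$cluster)
)

bars <- ggplot(dat, aes(x = cluster, fill = cluster)) +
  geom_bar() +
  scale_fill_manual(values = cluster_cols)

pts <- ggplot(dat, aes(x = liens_ratio, y = RT_ratio, colour = cluster)) +
  geom_point() +
  scale_colour_manual(values = cluster_cols)
```

Because the vector is named by level, each cluster keeps its colour even in a plot where some levels are absent.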

Create a bivariate color gradient legend using lattice for an spplot overlaying polygons with alpha

I've created a map by overlaying polygons using spplot and with the alpha value of the fill set to 10/255 so that areas with more polygons overlapping have a more saturated color. The polygons are set to two different colors (blue and red) based on a binary variable in the attribute table. Thus, while the color saturation depends on the number of polygons overlapping, the color depends on the ratio of the blue and red classes of polygons.
There is, of course, no easy built-in legend for this so I need to create one from scratch. There is a nice solution to this in base graphics found here. I also came up with a not-so-good hack to do this in ggplot based on this post from kohske. A similar question was posted here and I did my best to give some solutions, but couldn't really come up with a solid answer. Now I need to do the same for myself, but I specifically would like to use R and also use grid graphics.
This is the ggplot hack I came up with
library(ggplot2)
library(reshape2) # provides melt()
Variable_A <- 100 # max of variable
Variable_B <- 100
x <- melt(outer(1:Variable_A, 1:Variable_B)) # set up the data frame to plot from
p <- ggplot(x) + theme_classic() + scale_alpha(range=c(0,0.5), guide="none") +
geom_tile(aes(x=Var1, y=Var2, fill="Variable_A", col.regions="red", alpha=Var1)) +
geom_tile(aes(x=Var1, y=Var2, fill="Variable_B", col.regions="blue", alpha=Var2)) +
scale_x_continuous(limits = c(0, Variable_A), expand = c(0, 0)) +
scale_y_continuous(limits = c(0, Variable_B), expand = c(0, 0)) +
xlab("Variable_A") + ylab("Variable_B") +
guides(fill="none")
p
Which gives this:
This doesn't work for my purposes for two reasons. 1) Because the alpha value varies, the second color plotted (blue in this case) overwhelms the first one as the alpha values get higher. The correct legend should have blue and red mixed evenly along the 1:1 diagonal. In addition, the colors don't really properly correspond to the map colors. 2) I don't know how to overlay a ggplot object on the lattice map created with spplot. I tried to create a grob using ggplotGrob(p), but still couldn't figure out how to add the grob to the spplot map.
The ideal solution would be to create a similar figure using lattice graphics. I think that using tiles is probably the right solution, but what would be best is to have the alpha values stay constant and vary the number of tiles plotted going from left to right (for red) and bottom to top (for blue). Thus, the colors and saturation should properly match the map (I think...).
Any help is much appreciated!
How about mapping the angle to color, and alpha to the sum of the two variables -- does this do what you want?
d <- expand.grid(x=1:100, y=1:100)
ggplot(d, aes(x, y, fill=atan(y/x), alpha=x+y)) +
geom_tile() +
scale_fill_gradient(high="red", low="blue")+
theme(legend.position="none", panel.background=element_blank())
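A quick numerical check of why this mixes the two colours evenly along the 1:1 diagonal, under the same 100 x 100 grid: atan(y/x) equals pi/4 wherever x == y, and that is also the midpoint of the range the fill scale is trained on. The column name `angle` is introduced here only for illustration.

```r
d <- expand.grid(x = 1:100, y = 1:100)
d$angle <- atan(d$y / d$x)

# Range of the fill variable, and its midpoint
angle_range <- range(d$angle)
angle_mid   <- mean(angle_range)

# Angles along the diagonal x == y
diag_angles <- unique(d$angle[d$x == d$y])
```

Both angle_mid and diag_angles come out at pi/4, so the diagonal sits exactly halfway between the low and high colours.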

How can I create a colormap with a fixed color for 0 in R / ggplot

I am producing heatmaps of measurement using ggplot2. The data contains positive and negative values and I use the rainbow() palette for coloring.
I have different data sets and would like to scale the colors so that the minimum, maximum and 0 values of each data set are assigned the same colors. I could only find out how to set the minimum and maximum using limits=...
How can I also define a given color for 0?
Here is my minimal example, if I would for example use rainbow(5), I would like the 3rd color to be the zero color.
data <- read.csv("http://protzkeule.de/data.csv")
ggplot(data=data, aes(x=variable, y=meas)) +
geom_tile(aes(fill=value)) +
scale_fill_gradientn(colours=rev(rainbow(255)),limits=c(-.2,.4))
Perhaps a different approach: for my plots I found it easier to cap the values mapped to the colors:
ggplot(...) +
stat_bin2d (aes(fill=ifelse(..count..>20,20,..count..)), bins = 10) +
scale_fill_gradientn("Count", colours=c("blue", "yellow", "red")) + ...
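Another possible route to the original question is the values argument of scale_fill_gradientn, which places each colour at a chosen position on the rescaled 0 to 1 range. The sketch below pins the third of rainbow(5) to zero for limits of -0.2 and 0.4; the intermediate stop positions (halfway to each limit) and the small data frame `tiles` are assumptions for illustration, not taken from the question's data.

```r
library(ggplot2)
library(scales)  # for rescale()

lims <- c(-0.2, 0.4)
cols <- rainbow(5)

# Data values at which each of the 5 colours should sit; the third one is 0
anchors <- c(lims[1], lims[1] / 2, 0, lims[2] / 2, lims[2])
# Convert them to positions in [0, 1] relative to the limits
stops <- rescale(anchors, from = lims)

# Hypothetical data; the real data set is not reproduced here
tiles <- expand.grid(variable = c("a", "b", "c"), meas = 1:3)
tiles$value <- c(-0.2, -0.1, 0, 0.05, 0.1, 0.2, 0.3, 0.35, 0.4)

p0 <- ggplot(tiles, aes(x = variable, y = meas)) +
  geom_tile(aes(fill = value)) +
  scale_fill_gradientn(colours = cols, values = stops, limits = lims)
```

Reusing the same cols and recomputing stops from each data set's own limits keeps the minimum, zero and maximum on the same colours across plots.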
