I have a plot in which I am trying to highlight different points along the line. My plot looks like this:
library(ggplot2)
X<-c(seq(1:20))
Y<-c(2,4,6,3,5,8,6,5,4,3,2,4,6,6,9,8,9,5,4,3)
Col<-as.character(c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4,5,5,5,5))
value<-c(NA,NA,NA,NA,10,NA,NA,NA,20,NA,NA,20,20,10,NA,NA,NA,10,NA,NA)
DF<-data.frame(X,Y,Col,value)
p<-ggplot(DF, aes(x=X, y=Y)) +geom_line(size=.02) +geom_point(col='black', size=.5)+ geom_jitter(aes(color = Col, size = value),position = position_jitter(height = .2, width = .2)) + scale_colour_manual(values=c("red", "violetred","orange",'blue','steelblue'))
p+guides(size = "none")
On top of the basic line plot, I have 5 different colored dot possibilities defined by the "Col" column. I also have 2 different size classes I'd like to represent using the "value" column. My issue is that the default sizes chosen to represent the "value" column are scaled so that the larger size appears too big in the plot in comparison to the smaller size.
Is there a way to manually set the scale so I can control the drawn size of the added points (hopefully making them closer in size to each other)? The final plot should be similar to the one here, only the value=10 points would be just slightly smaller than the value=20 points.
I've tried using some of the "manual" options, but when I do, it always plots the other "Y" column points, which for now are suppressed because of the NAs.
In the same way you manually define the colours for your points using a scale_colour_ variant, you can define the sizes using scale_size.
In this case you can define the range of sizes to use. Adding + scale_size(range = c(4,6)) seems to give results that fit your description. You can tweak the size of the points, and the difference in size between them, by changing the numbers.
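As a minimal sketch of that suggestion, the question's data can be rebuilt and plotted with the size range narrowed. The na.rm = TRUE argument and the tidier data construction are my additions, not part of the original code; the range values are the ones suggested above.

```r
library(ggplot2)

# Same data as in the question, built in one data.frame() call
DF <- data.frame(
  X = 1:20,
  Y = c(2,4,6,3,5,8,6,5,4,3,2,4,6,6,9,8,9,5,4,3),
  Col = as.character(rep(1:5, each = 4)),
  value = c(NA,NA,NA,NA,10,NA,NA,NA,20,NA,NA,20,20,10,NA,NA,NA,10,NA,NA)
)

p <- ggplot(DF, aes(x = X, y = Y)) +
  geom_line() +
  geom_point(colour = "black", size = 0.5) +
  # rows with value = NA have no size and are dropped silently by na.rm = TRUE
  geom_jitter(aes(colour = Col, size = value), na.rm = TRUE,
              position = position_jitter(height = 0.2, width = 0.2)) +
  scale_colour_manual(values = c("red", "violetred", "orange", "blue", "steelblue")) +
  # smallest value is drawn at size 4, largest at size 6
  scale_size(range = c(4, 6)) +
  guides(size = "none")
```

Printing p draws the plot. Moving the two numbers closer together (for example c(5, 6)) makes the two size classes more similar.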
Absolute beginner on ggplot. I am plotting the iris dataset and when I set alpha=0.5, it won't apply to some data points.
Here is the code:
ggplot(iris)+
geom_point(aes(Sepal.Length,Sepal.Width,colour=Species),size=5,alpha=0.5)+
labs(x="Sepal Length",y="Sepal Width")+
theme_minimal()
Here is the output I got. As can be seen, the alpha value is not consistent throughout the data points.
The alpha value is being applied, but what you are seeing is due to some points overlapping others exactly. You can see this if you select geom_jitter instead of geom_point:
ggplot(iris)+
geom_jitter(aes(Sepal.Length,Sepal.Width,colour=Species),size=5,alpha=0.5)+
labs(x="Sepal Length",y="Sepal Width")+
theme_minimal()
Each point is drawn at alpha = 0.5, and where points overlap their opacities combine, so the overlap looks darker than a single point. With standard alpha blending, two stacked points at 0.5 give a combined opacity of 1 - 0.5 * 0.5 = 0.75, and additional stacked points approach full color. If you wish to have points remain transparent even when there is overlap, then you can simply select a lower alpha value. The fact that it gets darker with overlap is a nice property, because it means that you can see where there might be large clusters of points.
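To check both claims numerically, here is a small sketch. It counts how many iris rows sit exactly on top of an earlier row, and computes the combined opacity of stacked points. The helper combined_opacity is a hypothetical name of mine, and the formula assumes the usual "over" compositing rule; the exact blending can depend on the graphics device.

```r
# Rows whose (Sepal.Length, Sepal.Width) pair already appeared in an earlier row;
# these points are drawn exactly on top of another point.
n_overplotted <- sum(duplicated(iris[, c("Sepal.Length", "Sepal.Width")]))

# Combined opacity of n stacked points that share the same alpha,
# assuming standard "over" compositing: 1 - (1 - alpha)^n
combined_opacity <- function(alpha, n) 1 - (1 - alpha)^n

combined_opacity(0.5, 1:4)
# 0.5000 0.7500 0.8750 0.9375
```

A lower alpha, such as 0.2, needs many more stacked points before the overlap looks solid.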
I am trying to create a scatterplot based on four values. My data is just lists of prices (BASIC,VALUE,DELUXE,ULTIMATE). I want VALUE and DELUXE to be the two axis (x,y) and then have the size and color of the points represent the data for the other two columns.
It is hard to set up a reproducible example, because it is only an issue when I get a lot of values listed. I have about 300 points, with about 30 different color/value labels (for ULTIMATE) and 20 size/value labels (for BASIC).
> gg <- ggplot(d, aes(x=DELUXE_PRICE, y=VALUE_PRICE,color=ULTIMATE_PRICE,size=BASIC_PRICE)) + geom_point(alpha = 1)
> plot(gg)
My code does this well, and lists the colors/size with the corresponding value on the side. This is great, but I would like to alter how that is displayed, so that it is not cut off. I would like to be able to "wrap" the values into more columns, or shrink the display size of those so that they fit.
Currently, this lists ULTIMATE in three columns, to the right of the plot area, but cuts off the top of the labels (it extends well above the plot area)
This lists BASIC size/value labels to the right of the plot area, below ULTIMATE labels, in one column, so about half are cut off at the bottom.
I can increase the margins with:
> gg <- ggplot(d, aes(x=DELUXE_PRICE, y=VALUE_PRICE,color=ULTIMATE_PRICE,size=BASIC_PRICE)) + geom_point(alpha = 1) +theme(plot.margin = unit(c(4,2,4,2), "cm"))
> plot(gg)
This gets more of it in, but creates lots of white area and a smaller view of the plot. I would like to be able to just increase the right margin if necessary, and "wrap" the labels in more columns extending to the right (i.e. put ULTIMATE into 4 columns instead of 3, and put BASIC into 3-4 columns instead of 1), so that they are shorter and don't run out of the plot area.
I found some built-in functionality that does this. Add a guides() call to the plot, name the legend you are dealing with (color or size), and set the number of columns with "ncol = " (you can also specify rows with "nrow = "). Giving each legend an order value also lets you control the order in which the legends appear, so my resulting code was:
> gg <- ggplot(Table, aes(x=DELUXE_PRICE, y=VALUE_PRICE,color=ULTIMATE_PRICE,size=BASIC_PRICE)) + geom_point(alpha = 1) + guides(color = guide_legend(order = 0,ncol = 4),size = guide_legend(order = 1,ncol = 4))
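For anyone who wants to try this without the original table, here is a self-contained sketch. The data frame d is a made-up stand-in (random prices, with the two legend variables stored as factors so that every distinct value gets its own legend key); only the guides() call mirrors the code above.

```r
library(ggplot2)

# Hypothetical stand-in for the price table: 300 rows,
# up to 30 distinct ULTIMATE prices and up to 20 distinct BASIC prices
set.seed(1)
d <- data.frame(
  DELUXE_PRICE   = runif(300, 20, 60),
  VALUE_PRICE    = runif(300, 10, 40),
  ULTIMATE_PRICE = factor(sample(seq(50, 195, by = 5), 300, replace = TRUE)),
  BASIC_PRICE    = factor(sample(1:20, 300, replace = TRUE))
)

gg <- ggplot(d, aes(x = DELUXE_PRICE, y = VALUE_PRICE,
                    colour = ULTIMATE_PRICE, size = BASIC_PRICE)) +
  geom_point() +
  # wrap each legend into 4 columns; lower order values are drawn first
  guides(colour = guide_legend(order = 1, ncol = 4),
         size   = guide_legend(order = 2, ncol = 4))
```

ggplot2 warns that mapping size to a discrete variable is not advised; the mapping is kept here only because it matches the many-label size legend described in the question.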
I'm having trouble creating a figure with ggplot2. I am using geom_dotplot with center stacking to display my data which are discrete values for 4 categories.
For aesthetic reasons I want to customize the positions of the dots so that:
the empty space between dots along the y axis is reduced (i.e. each dot is 1 value tall)
the distributions fit and don't overlap
I've adjusted the binwidth and dotsize to achieve aesthetic goal 1, but that requires me to fiddle with the ylim() parameter to make sure that the groups fit in the plot. This results in a plot with more white space and few numbers on the y axis.
Question: Can anyone explain a way to resize the empty space on this plot?
My code is below:
plot <- ggplot(figdata, aes(y=Counts, x=category, col=strain)) +
geom_dotplot(aes(fill=strain), dotsize=1, binwidth=.7,
binaxis= "y",stackdir ="centerwhole", stackratio=.7) +
ylim(18,59)
plot + scale_color_manual(values=c("#E69F00", "#56B4E9")) +
geom_errorbar(stat="hline", yintercept="mean",
aes( ymax=..y..,ymin=..y.., group = category, width = 0.5),
color="black")
Which produces:
EDIT: Incorporating jitter will allow all the data to fit, but I don't want to add noise to this data and would prefer to show it as discrete data.
Adjusting the binwidth and dotsize to 0.3 as suggested below also fits all the data; however, it leaves too much white space.
I think that I might have to transform my data so that the values are in steps smaller than 1, in order to get everything to fit horizontally and the dots to be large enough to reduce white space.
I think the easiest way is using coord_cartesian:
plot + scale_color_manual(values=c("#E69F00", "#56B4E9")) +
geom_errorbar(stat="hline", yintercept="mean",
aes( ymax=..y..,ymin=..y.., group = category, width = 0.5),
color="black") +
coord_cartesian(ylim=c(17,40))
Which gives me this plot (with fake data that are not as neatly distributed as yours):
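The reason coord_cartesian helps is that it only zooms the view, while ylim() sets scale limits and discards values outside them before the dots are binned. A rough sketch of the two side by side follows; figdata here is fake data of my own (the category and strain labels are invented), and the error-bar layer is left out to keep it short.

```r
library(ggplot2)

# Hypothetical stand-in for figdata: integer counts, 4 categories, 2 strains
set.seed(42)
figdata <- data.frame(
  category = rep(c("A", "B", "C", "D"), each = 30),
  strain   = rep(c("s1", "s2"), times = 60),
  Counts   = sample(20:38, 120, replace = TRUE)
)

base <- ggplot(figdata, aes(x = category, y = Counts, colour = strain)) +
  geom_dotplot(aes(fill = strain), binaxis = "y", stackdir = "centerwhole",
               binwidth = 0.7, dotsize = 1, stackratio = 0.7)

# ylim() changes the scale limits: values outside them are dropped before binning
p_ylim <- base + ylim(18, 59)

# coord_cartesian() only changes the visible window: all data is still binned
p_zoom <- base + coord_cartesian(ylim = c(17, 40))
```

Tightening the ylim values in coord_cartesian trims the white space without affecting which dots are computed.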
I am trying to map model/obs differences at locations in space. I have color mapped to diff, but I would also like to map size to diff. I'm currently mapping size to abs(diff), but that produces two different legends that I would like to combine. Mapping size to diff creates small points for negative values and large points for positive values, but I really only want the magnitude represented for over/under predictions. In case it makes a difference, I would also like the scales to be discrete.
Thanks
Some test data:
so.df=data.frame(lat=runif(50,33,43),long=runif(50,-112,-104),diff=runif(50,-2,2))
ggplot()+
geom_point(data=so.df,aes(x=long,y=lat,color=diff,size=abs(diff)),alpha=0.8)+
scale_color_gradient2(low='red',mid='white',high='blue',limits=c(-0.6,0.6),breaks=c(-0.6,-0.4,-0.2,-0.1,0,0.1,0.2,0.6))+
guides(color=guide_legend('SWE Differences (m)'),size=guide_legend('SWE Difference\nMagnitudes (m)'))+
coord_cartesian(xlim=c(-112.5,-104.25),ylim=c(33,44))+
theme_bw()
EDIT:
To use a discrete color scale I'm using the following (open to suggestions):
so.df$cuts=cut_interval(so.df$diff,length=0.15)
ggplot()+
geom_path(data=states,aes(x=long,y=lat,group=group),color='grey10')+
geom_point(data=so.df,aes(x=long,y=lat,color=cuts,size=abs(diff)),alpha=0.8)+
coord_cartesian(xlim=c(-112.5,-104.25),ylim=c(33,44))+
scale_colour_brewer(type='div',palette='RdYlBu')+
guides(color=guide_legend('SWE Differences (m)'),size=guide_legend('SWE Difference\nMagnitudes (m)'))+
theme_bw()
If you map your sizes to your cuts column and use scale_size_manual you can get the diverging size scale you want.
ggplot() + geom_point(data=so.df,aes(x=long,y=lat,size=cuts,color=cuts)) +
# scale_size_manual needs one size per level of cuts
scale_size_manual(values =
c(8,6,4,2,1,2,4,6,8),guide="legend") +
coord_cartesian(xlim=c(-112.5,-104.25),ylim=c(33,44)) +
scale_colour_brewer(type='div',palette='RdBu')
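Here is a self-contained sketch of the same idea. Two things in it are my assumptions rather than part of the answer above: the bins are built with base cut() on nine equal-width intervals between -2 and 2 (so the middle bin is centred on zero), and the sizes are named by level so that they stay aligned even if a bin happens to be empty.

```r
library(ggplot2)

set.seed(1)
so.df <- data.frame(lat  = runif(50, 33, 43),
                    long = runif(50, -112, -104),
                    diff = runif(50, -2, 2))

# Nine equal-width bins symmetric around zero
so.df$cuts <- cut(so.df$diff, breaks = seq(-2, 2, length.out = 10),
                  include.lowest = TRUE)

# Largest dots at both extremes, smallest in the middle bin;
# naming the values ties each size to its level
sizes <- setNames(c(8, 6, 4, 2, 1, 2, 4, 6, 8), levels(so.df$cuts))

p <- ggplot(so.df, aes(x = long, y = lat, colour = cuts, size = cuts)) +
  geom_point(alpha = 0.8) +
  scale_size_manual(values = sizes) +
  scale_colour_brewer(type = "div", palette = "RdBu") +
  theme_bw()
```

Because colour and size are mapped to the same variable and share the same title and labels, ggplot2 merges them into a single legend.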
I am using scale_colour_gradient with a limits= argument and a transformation to a log scale. I want the scale gradient to end exactly in those ticks, and have the ticks face outward rather than inward. Is this possible?
For example:
library(scales) # provides log2_trans()
ggplot(iris) +
geom_point(aes(x=Sepal.Width, y=Sepal.Length, colour=Sepal.Length)) +
scale_colour_gradient(limits=c(3,8), trans=log2_trans())
I'd like the scale to start exactly at 3 (i.e. show no colors with values less than 3) and end exactly at 8 (show no values higher than 8) and have the ticks facing outward (preferably only on the right side of the scale, like in this figure except not horizontal). The limits are important since in real data sometimes you clip values based on a cutoff and want to colour only the points that are in that range, so it's confusing to have a scale bar that shows colors outside the range (since points were selected to be in that range only.)
I'm not aware of a (built-in) option to alter the colorbar ticks in the manner you describe. But I think this hits the rest of your points:
library(scales) # provides log2_trans()
ggplot(iris) +
geom_point(aes(x=Sepal.Width, y=Sepal.Length, colour=Sepal.Length)) +
scale_colour_gradient(limits=c(3,8),
breaks = round(seq(3,8,length.out = 4),1),
trans=log2_trans()) +
guides(colour = guide_colorbar(draw.ulim = FALSE,draw.llim = FALSE))
You could remove the ticks entirely with ticks = FALSE. For different styles of tick marks you'd probably have to hack the grid code itself.