How to reverse point size in ggplot? - r

Please help me with this. I need the positive values to be represented as small points and the negative values as large points. If I put a minus sign before z in the size aesthetic, the point sizes are right but the legend changes as well:
df=data.frame(x=rnorm(20),y=runif(20),z=rnorm(20))
ggplot(df,aes(x=x,y=y))+geom_point(aes(size=-z))
so that does not work for me.

One solution would be to use scale_size() and set your own breaks and then labels in opposite direction. Changed range of z values to get better representation.
df=data.frame(x=rnorm(20),y=runif(20),z=(-13:6))
ggplot(df,aes(x=x,y=y))+geom_point(aes(size=-z))+
scale_size("New legend",breaks=c(-10,-5,0,5,10),labels=c(10,5,0,-5,-10))

A little late, but an easier way would be just to add trans='reverse' to scale_size.
Example:
df=data.frame(x=rnorm(20),y=runif(20),z=(-13:6))
ggplot(df,aes(x=x,y=y))+
geom_point(aes(size=z)) +
scale_size(trans = 'reverse')

While this question is very old - and has an accepted answer - the comment from baptiste that suggests using last_plot() + scale_size(range = c(5,1)) + guides(size = guide_legend(reverse=TRUE)) works very elegantly and simply. For my data where I needed to produce the same result as OP, this worked with zero modification required.
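As a minimal sketch of the approach described in that comment, the snippet below builds the plot explicitly instead of relying on last_plot(). The data frame mirrors the one used in the answers above; set.seed() and the variable names are only there for illustration.

```r
library(ggplot2)

set.seed(1)  # illustrative seed so the example is reproducible
df <- data.frame(x = rnorm(20), y = runif(20), z = -13:6)

p <- ggplot(df, aes(x = x, y = y)) +
  geom_point(aes(size = z)) +
  # A decreasing range maps the smallest z to the largest point
  scale_size(range = c(5, 1)) +
  # Flip the order of the legend keys as well
  guides(size = guide_legend(reverse = TRUE))

# Inspect the point sizes that ggplot2 computed for the layer
sizes <- layer_data(p, 1)$size
```

Unlike the size=-z approach, the legend labels keep the original z values, so no manual relabelling is needed.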

Related

ggplot geom_point varying size with windows size

I have an issue creating a map with ggplot2 on which I plot points using geom_point. When exporting to PDF or another format, the point size varies relative to the map (because it is absolute and not axis-relative). I've searched for how to change that and found a lot of answers saying that this is on purpose, because otherwise the points would turn into ellipses each time the axis proportions change. I understand that; however, because I work on a map, I use coord_fixed to fix the output and avoid distortions of my map, so if I were able to fix the point size relative to the plot size, it wouldn't be a problem.
Is there some solution to do that? I've read some interesting things suggesting using geom_polygon to artificially create ellipses. But I have two problems with this method:
First, I don't know how to implement that with my data. I know where the centers of my points are, but how would I then define a filled circular polygon around each center?
Second, I have used scale_size_continuous to plot smaller or bigger points relative to another variable. How could I implement that with geom_polygon?
In short: I would be happy either with a way to specify the point size in a relative unit, or with some help understanding how to create the same thing with geom_polygon.
I tried to put together a small reproducible example here. It is only an example; the problem with my data is that I have a lot of close, small values (mainly 1, like the small dot in the reproducible example). They look fine on screen, but when exporting they can become much bigger and cause a lot of overplotting, which is why I need to fix this ratio.
Link for the map information and second link for map information
dat <- data.frame(postcode=c(3012, 2000, 1669, 4054, 6558), n=c(1, 20, 40, 60, 80))
ch <- read.csv("location/PLZO_CSV_LV03/PLZO_CSV_LV03.csv", sep=";")#first link, to attribute a geographical location for each postcode
ch <- ch%>%
distinct(PLZ, .keep_all=TRUE)%>%
group_by(PLZ, N, E)%>%
summarise
ch <- ch%>%
filter(PLZ %in% dat$postcode)
ch <- ch%>%
arrange(desc(as.numeric(PLZ)))
dat <- dat%>%
arrange(desc(as.numeric(postcode)))
datmap <- bind_cols(dat, ch)
ch2 <- readOGR("location/PLZO_SHP_LV03/PLZO_PLZ.shp")#second link, to make the shape of the country
ch2 <- fortify(ch2)
a <- ggplot()+
geom_polygon(data=ch2, aes(x=long, y=lat, group=group), colour="grey75", fill="grey75")+
geom_jitter(data=datmap, aes(x=E, y=N, group=FALSE, size=n), color=c("red"))+ #here I put geom_jitter, but geom_point is fine too
scale_size_continuous(range=c(0.7, 5))+
coord_fixed()
print(a)
Thanks in advance for the help!
You can use ggsave() to save the last plot and adjust the scaling factor used for points/lines etc. Try this:
ggplot(data = ch2) +
geom_polygon(aes(x=long, y=lat, group=group),
colour="grey85", fill="grey90") +
geom_point(data=datmap, aes(x=E, y=N, group=FALSE, size=n),
color=c("red"), alpha = 0.5) +
scale_size_continuous(range=c(0.7, 5)) +
coord_fixed() +
theme_void()
ggsave(filename = 'plot.pdf', scale = 2, width = 3, height = 3)
Play around with the scale parameter (and optionally the width and height) until you are happy with the result.
DO NOT use geom_jitter(): this will add random XY variation to your points. To deal with overplotting you can try adding transparency - I added an alpha parameter for this. I also used theme_void() to get rid of axes and background.
Your shape file with map information is quite heavy: you can try a simple one with Swiss cantons, like this one.
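To see the effect of the scale argument without the map files, here is a small sketch that saves the same plot twice. The built-in mtcars data and the file names are stand-ins chosen for illustration, not the data from the question.

```r
library(ggplot2)

# Stand-in plot: point size mapped to a variable, as in the question
p <- ggplot(mtcars, aes(x = wt, y = mpg, size = hp)) +
  geom_point(colour = "red", alpha = 0.5) +
  scale_size_continuous(range = c(0.7, 5)) +
  theme_void()

# Hypothetical output paths in the session's temporary directory
out_small <- file.path(tempdir(), "points_scale1.pdf")
out_large <- file.path(tempdir(), "points_scale2.pdf")

# scale multiplies width and height: 3 x 3 in versus 6 x 6 in
ggsave(out_small, plot = p, scale = 1, width = 3, height = 3)
ggsave(out_large, plot = p, scale = 2, width = 3, height = 3)
```

Because point sizes are absolute, the points in the second file occupy a smaller share of the larger canvas, which is what reduces the overplotting.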

geom_histogram in ggplot rounds data up when placing in bins, how do I change this?

I am putting together a histogram to look at how my data is bunched around a threshold. I was surprised to see that the spike appeared to be to the right of the threshold (the red vertical line) rather than to the left. Using the ggplot_build function, I saw that observations were being rounded up. In this graph, for example, the spike should be at 1305, but with a bin width of 1, it appears at 1305.5, where the bin takes in values between 1304.5 and 1305.5. I.e. it is rounding upwards.
(I know this is the case, because as I reduce the bin size, the spike approaches 1305, which is where I know it really is.)
I can't find any setting in ggplot to change this, and I'm not quite sure if it's even possible. An alternative would be to change the bins to match integers, ie to go from 1-2, 2-3, 3-4 .. rather than .5-1.5,1.5-2.5 etc.
My code is below. I'd be grateful for any advice.
plotcars<-ggplot(data=cars_total) +
geom_histogram(binwidth = 1, aes(x=V3,weight=V1)) +
geom_vline(data=cuts, aes(xintercept=vals, linetype=Thresholds,
colour = Thresholds), show.legend = TRUE) +
coord_cartesian(xlim = c(1300,1350),ylim=c(0,800000)) +
scale_y_continuous(labels = comma)
plotcars
The problem here was that I was using geom_histogram, when I should have been using stat_count.
Quoting from ?geom_bar:
stat_count counts the number of cases at each x position. If you want to bin the data in ranges, you should use stat_bin instead.
The replacement code is:
+ stat_count(geom="bar", aes(weight=Registrations,width = 1, center=0))
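If binning is still wanted, the alternative raised in the question (bins aligned to integers) can be sketched with the boundary and center arguments of stat_bin, which geom_histogram passes through. This is an alternative to the stat_count fix above, not the poster's own solution, and the toy data frame is made up for illustration.

```r
library(ggplot2)

set.seed(42)  # illustrative seed
# Toy data with a deliberate spike at 1305
toy <- data.frame(
  value = c(rep(1305, 50), sample(1300:1310, 100, replace = TRUE))
)

# Bin edges on integers: bins cover 1304-1305, 1305-1306, ...
p_edges <- ggplot(toy, aes(x = value)) +
  geom_histogram(binwidth = 1, boundary = 0)

# Bin centres on integers: bins cover 1304.5-1305.5, ...
p_centered <- ggplot(toy, aes(x = value)) +
  geom_histogram(binwidth = 1, center = 0)

edges <- layer_data(p_edges)
centered <- layer_data(p_centered)
```

Inspecting the xmin, xmax, and x columns of edges and centered shows where each bin starts, ends, and is drawn.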

Set categorical axis labels with scales "free" ggplot2

I am trying to set the labels on a categorical axis within a faceted plot using the ggplot2 package (1.0.1) in R (3.1.1) with scales="free". If I plot without manually setting the axis tick labels they appear correctly (first plot), but when I try to set the labels (second plot) only the first n labels are used on both facets (not in sequence as with the original labels).
Here is a reproducible code snippet exemplifying the problem:
foo <- data.frame(yVal=factor(letters[1:8]), xVal=factor(rep(1:4,2)), fillVal=rnorm(8), facetVar=rep(1:2,each=4))
## axis labels are correct
p <- ggplot(foo) + geom_tile(aes(x=xVal, y=yVal, fill=fillVal)) + facet_grid(facetVar ~ ., scales='free')
print(p)
## axis labels are not set correctly
p <- p + scale_y_discrete(labels=c('a','a','b','b','c','d','d','d'))
print(p)
I note that I cannot set the labels correctly within the data.frame as they are not unique. Also I am aware that I can accomplish this with grid.arrange, but this requires "manually" aligning the plots if there are different length labels etc. Additionally, I would like to have the facet labels included in the plot, which is not an available option with the grid.arrange solution. Also I haven't tried viewports yet. Maybe that is the solution, but I was hoping for more of the faceted look to this plot, and viewports seem to be more similar to grid.arrange.
It seems to me as though this is a bug, but I am open to an explanation as to how this might be a "feature". I also hope that there might be a simple solution to this problem that I have not thought of yet!
The easiest method would be to create another column in your data set with the right conversion. This would also be easier to audit and manipulate. If you insist on changing manually:
You cannot simply set the labels directly, as it recycles (I think) the label vector for each facet. Instead, you need to set up a conversion using corresponding breaks and labels:
p <- p + scale_y_discrete(labels = c('1','2','3','4','5','6','7','8'), breaks=c('a','b','c','d','e','f','g','h'))
print(p)
Any y axis value of a will now be replaced with 1, b with 2 and so on. You can play around with the label values to see what I mean. Just make sure that every factor value you have is also represented in the breaks argument.
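Applied to the labels the question actually wanted, the same breaks-to-labels conversion might look like the sketch below. The label_map vector is a hypothetical helper introduced here to keep each break next to its label; the data frame is the one from the question with an illustrative seed.

```r
library(ggplot2)

set.seed(1)  # illustrative seed
foo <- data.frame(
  yVal = factor(letters[1:8]),
  xVal = factor(rep(1:4, 2)),
  fillVal = rnorm(8),
  facetVar = rep(1:2, each = 4)
)

# Hypothetical lookup: names are factor levels, values are the labels to show
label_map <- c(a = "a", b = "a", c = "b", d = "b",
               e = "c", f = "d", g = "d", h = "d")

p <- ggplot(foo) +
  geom_tile(aes(x = xVal, y = yVal, fill = fillVal)) +
  facet_grid(facetVar ~ ., scales = "free") +
  # Pair every break with its label so the pairing survives free scales
  scale_y_discrete(breaks = names(label_map), labels = unname(label_map))

tiles <- layer_data(p)
```

Keeping the mapping in one named vector also makes it easy to check that every factor level has a label.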
I think I may actually have a solution to this. My problem was that my labels were not correct because as someone above has said - it seems like the label vector is recycled through. This line of code gave me incorrect labels.
ggplot(dat, aes(x, y))+geom_col()+facet_grid(d ~ t, switch = "y", scales = "free_x")+ylab(NULL)+ylim(0,10)+geom_text(aes(label = x))
However when the geom_text was moved prior to the facet_grid, the below code gave me correct labels.
ggplot(dat, aes(x, y))+geom_col()+geom_text(aes(label = x))+facet_grid(d ~ t, switch = "y", scales = "free_x")+ylab(NULL)+ylim(0,10)
There's a good chance I may have misunderstood the problem above, but I certainly solved my problem so hopefully this is helpful to someone!

r stat_contour incorrect fill with polygon

When I use stat_contour with geom='polygon', some regions are filled even though there is no data there; I marked them in the figure. Does anyone know how to avoid that? In addition, there is space between the axes and the plot region. How can I remove it?
Here is the plotting code:
plot_contour <- function (da, native ) {
h2d<-hist2d(da$germ_div,da[[native]],nbins=40,show=F)
h2d$counts<-h2d$counts+1
counts<-log(h2d$counts, base=10)
rownames(counts)<-h2d$x
colnames(counts)<-h2d$y
counts<-melt(counts)
names(counts)<-c('x','y','z')
ggplot(counts,aes(x,y))+
stat_contour(expand=c(0,0),aes(z=z,fill=..level..),geom='polygon')+
stat_contour( data=counts[counts$x<=75,],aes(z=z,fill=..level..),bins=50,geom='polygon')+
scale_fill_gradientn(expand=c(0,0),colours=rainbow(1000),
limits=c(log(2,base=10),4),na.value='white',guide=F)+
geom_contour(aes(z=z,colour=..level..),size=1.5)+
scale_color_gradientn(colours=rainbow(30),limits=c(log(2,base=10),4),na.value='white',
guide=F) + theme_bw()+
scale_x_continuous(expand=c(0,0),limits=c(0,50))+
scale_y_continuous(expand=c(0,0),limits=c(40,100))+
labs(x=NULL, y=NULL, title=NULL)+
theme(axis.text.x = element_text(family='Times', colour="black", size=20, angle=NULL,
hjust=NULL,vjust=NULL,face="plain"),
axis.text.y = element_text( family='Times', colour="black", size=20,angle=NULL,
hjust=NULL,vjust=NULL,face="plain")
)
}
da<-read.table('test.txt',header=T)
i<-'test'
plot_contour(da,i)
This didn't fit in a comment, so posting as an answer:
stat_contour doesn't handle polygons that aren't closed very well. Additionally, there is a precision issue that crops up when setting the bins manually whereby the actual contour calculation can get freaked out (this happens when the contour bins are the same as plot data but aren't recognized as the same due to precision issues).
The first issue you can resolve by expanding your grid by 1 all around in every direction, and then setting every value in the matrix that is lower than the lowest you care about to some arbitrarily low value. This will force the contour calculation to close all the polygons that would otherwise be open at the edges of the plot. You can then set the limits with coord_cartesian(xlim=c(...)) to have your axes flush with the graph.
The second issue I don't know of a good way to solve without modifying the ggplot code. You may not be affected by this issue.
@BrodieG
Your answer is correct, but it's a bit difficult without some code.
Adding the following lines, with appropriate x,y values (these are a best guess), makes things clearer:
xlim(-10, 60)+
ylim(30, 120)+
coord_cartesian(xlim=c(0, 50),ylim=c(40, 100))
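The padding step from the first answer can be sketched on a plain matrix as follows. pad_grid is a hypothetical helper written for this illustration, and the floor and fill values are arbitrary assumptions; the real grid would be the counts matrix before it is melted.

```r
# Hypothetical helper: floor low values, then add a one-cell border
pad_grid <- function(m, floor_value, fill) {
  # Push everything below the lowest level of interest down to `fill`
  m[m < floor_value] <- fill
  # New matrix with one extra row/column on every side, filled with `fill`
  padded <- matrix(fill, nrow = nrow(m) + 2, ncol = ncol(m) + 2)
  padded[2:(nrow(m) + 1), 2:(ncol(m) + 1)] <- m
  padded
}

counts <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)
padded <- pad_grid(counts, floor_value = 2, fill = -1)
```

The x and y coordinates need one extra step on each side to match the padded matrix, and coord_cartesian() can then crop the border out of view.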

Why won't ggplot2 allow me to set a size for each individual point?

I've got a scatter plot. I'd like to scale the size of each point by its frequency. So I've got a frequency column of the same length. However, if I do:
... + geom_point(size=Freq)
I get this error:
When _setting_ aesthetics, they may only take one value. Problems: size
which I interpret as all points can only have 1 size. So how would I do what I want?
Update: data is here
The basic code I used is:
dcount=read.csv(file="New_data.csv",header=T)
ggplot(dcount,aes(x=Time,y=Counts)) + geom_point(aes(size=Freq))
Have you tried..
+ geom_point(aes(size = Freq))
Aesthetics are mapped to variables in the data with the aes function. Check out http://had.co.nz/ggplot2/geom_point.html
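The difference between mapping and setting an aesthetic can be seen in a small sketch. The dcount_demo data frame is made up for illustration and only borrows the column names from the question.

```r
library(ggplot2)

# Made-up stand-in for the question's data
dcount_demo <- data.frame(
  Time = 1:5,
  Counts = c(3, 7, 2, 9, 4),
  Freq = c(1, 2, 4, 8, 16)
)

# Mapping: size inside aes() varies per row and gets a legend
p_mapped <- ggplot(dcount_demo, aes(x = Time, y = Counts)) +
  geom_point(aes(size = Freq))

# Setting: size outside aes() is one constant for the whole layer
p_fixed <- ggplot(dcount_demo, aes(x = Time, y = Counts)) +
  geom_point(size = 3)

mapped_sizes <- layer_data(p_mapped)$size
fixed_sizes <- layer_data(p_fixed)$size
```

Writing geom_point(size=Freq) outside aes() asks ggplot2 to set a constant from an object called Freq, which is why it fails instead of using the column.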
ok, this might be what you're looking for. The code you provided above aggregates the information into four categories. If you don't want that, you can specify the categories with scale_size_manual().
sizes <- unique(dcount$Freq)
names(sizes) <- as.character(unique(dcount$Freq))
ggplot(dcount,aes(x=Time,y=Counts)) + geom_point(aes(size=as.factor(Freq))) + scale_size_manual(values = sizes/2)
If the code gd047 gave doesn't work, I'd double check that your Freq column is actually called Freq and that your workspace doesn't have some other object called Freq. Other than that, the code should work. How do you know that the scale has nothing to do with the frequency?
