I am new to R, and I am trying to do what seems to be the simplest thing, but for the love of god, I cannot find out how to do it!
As the title says, I want to plot x=1, y=1 and y=1/(2*x), preferably in different colors, and then shade the area between the x and y axes and the plotted lines. Something like this:
Thanks in advance
There are various ways to do this. For example, using library(ggplot2) you can do
# define how far beyond the intersection we calculate curve values
xmax = 1.1
xmin = 1/(2*xmax)
# calculate coordinates of the curve
x = seq(xmin, xmax, length.out = 100)
y = 1/(2*x)
# create polygon coordinates that follow the curve and ...
# ...extend down the straight lines to infinity
poly = data.frame(
x = c(x[x<1 & y<1], 1, 1, -Inf, -Inf, 0.5),
y = c(y[x<1 & y<1], 0.5, -Inf, -Inf, 1, 1))
ggplot(data.frame(x,y), aes(x,y)) +
geom_polygon(data = poly, fill='yellow') +
geom_line() +
geom_hline(aes(yintercept=1)) +
geom_vline(aes(xintercept=1)) +
coord_equal(1, c(0,1), c(0,1))
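As one of the other ways, here is a minimal base-graphics sketch of the same figure. It assumes the shaded region is bounded by the two axes, y = 1, the curve y = 1/(2*x) between x = 0.5 and x = 1, and x = 1; the helper name region_outline is made up for this illustration.

```r
# Hypothetical helper: vertices of the region bounded by the axes,
# y = 1, the curve y = 1/(2x) for 0.5 <= x <= 1, and x = 1.
region_outline <- function(n = 100) {
  x <- seq(0.5, 1, length.out = n)   # the curve meets y = 1 at x = 0.5
  data.frame(
    x = c(0, 0, x, 1),               # origin, up the y axis, along the curve, down x = 1
    y = c(0, 1, 1 / (2 * x), 0)
  )
}

outline <- region_outline()

# Empty plot with equal scaling on both axes
plot(NA, xlim = c(0, 1.1), ylim = c(0, 1.1), xlab = "x", ylab = "y", asp = 1)
polygon(outline$x, outline$y, col = "yellow", border = NA)  # shaded area
curve(1 / (2 * x), from = 0.45, to = 1.1, add = TRUE, col = "blue")
abline(h = 1, col = "red")        # y = 1
abline(v = 1, col = "darkgreen")  # x = 1
```

Drawing the polygon first keeps the three lines visible on top of the fill.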
Related
I know that you can transform the coordinates of a plot using coord_trans(), and you can even perform coordinate transformations along both axes (e.g. coord_trans(x = "log10", y = "log10")), but is there a way to perform a coordinate transformation that depends on the values of both axes, like a shear?
I know that I can perform the linear transformation before I pass my data to ggplot using something like ggforce::linear_trans() like this example:
trans <- linear_trans(shear(1, 0))
square <- data.frame(x = c(0, 0, 1, 1), y = c(0, 1, 1, 0))
square2 <- trans$transform(square$x, square$y)
ggplot(square2, aes(x, y)) +
geom_polygon(colour = 'black')
However, I'm hoping that there would be a way to write a custom coordinate system such that the data doesn't need to be transformed beforehand, e.g.:
square <- data.frame(x = c(0, 0, 1, 1), y = c(0, 1, 1, 0))
ggplot(square, aes(x, y)) +
geom_polygon(colour = 'black') +
coord_shear(x=1)
I implemented a custom coord that does this. It takes a transformer like that produced by ggforce::linear_trans and applies it to a ggplot. Check it out in my deeptime package here.
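For readers unsure what a transformation "depending on both axes" means, here is a small base-R sketch of a shear applied by hand. The function shear_xy and its arguments kx and ky are hypothetical names for this illustration; this is not the ggforce or deeptime implementation, and the argument convention of ggforce's shear() may differ.

```r
# Hypothetical helper: shear the points (x, y).
# kx shifts x in proportion to y, ky shifts y in proportion to x.
shear_xy <- function(x, y, kx = 0, ky = 0) {
  data.frame(x = x + kx * y, y = y + ky * x)
}

square  <- data.frame(x = c(0, 0, 1, 1), y = c(0, 1, 1, 0))
sheared <- shear_xy(square$x, square$y, kx = 1)

# The unit square becomes a parallelogram: the top edge slides right by 1
print(sheared)
```

A custom coord would apply a map of this kind at drawing time instead of requiring the data to be transformed beforehand.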
I'd like to use something like ggplot2 and ggmap to produce a heat map of arbitrary values such as property prices per metre squared over a geographic area at a street level (with a high resolution).
Unfortunately, the task appears to be rather difficult because while ggplot2 can produce a great density plot, it seems unable to visualise spatial data like this without prior interpolation.
For this, I've used the libraries akima (gridded bivariate interpolation for irregular data) and mgcv (generalised additive models with integrated smoothness estimation). However, my knowledge of interpolation methods is mediocre at best, and the results I've been able to produce aren't satisfactory.
Consider the following example:
Data
library(ggplot2)
library(ggmap)
library(tibble)  # provides tibble(), used below
## data simulation
set.seed(1945)
df <- tibble(x = rnorm(500, -0.7406, 0.03),
y = rnorm(500, 51.9976, 0.03),
z = abs(rnorm(500, 2000, 1000)))
Map, scatterplot, density plot
## ggmap
map <- get_map("Bletchley Park, Bletchley, Milton Keynes", zoom = 13, source = "stamen", maptype = "toner-background")
q <- ggmap(map, extent = "device", darken = .5)
## scatterplot over map
q + geom_point(aes(x, y, colour = z), data = df)  # colour must be mapped inside aes()
## classic density heat map
q +
stat_density2d(aes(x=x, y=y, fill=..level..), data=df, geom="polygon", alpha = .2) +
geom_density_2d(aes(x=x, y=y), data=df, colour = "white", alpha = .4) +
scale_fill_distiller(palette = "Spectral")
As you can see, the data are rather dense over the chosen area and the density heat map looks great with round edges and closed curves (except for some of the outermost layers).
Interpolation and plotting using akima
## akima interpolation
library(akima)
df_akima <-interp2xyz(interp(x=df$x, y=df$y, z=df$z, duplicate="mean", linear = T,
xo=seq(min(df$x), max(df$x), length=200),
yo=seq(min(df$y), max(df$y), length=200)), data.frame=TRUE)
## akima plot
q +
geom_tile(aes(x = x, y = y, fill = z), data = df_akima, alpha = .4) +
stat_contour(aes(x = x, y = y, z = z, fill = ..level..), data = df_akima, geom = 'polygon', alpha = .4) +
geom_contour(aes(x = x, y = y, z = z), data = df_akima, colour = 'white', alpha = .4) +
scale_fill_distiller(palette = "Spectral", na.value = NA)
This produces a dense grid of interpolated values (to ensure a sufficient resolution) and while the tile plot underneath is acceptable, the contour plots are too ragged and many of the curves aren't closed.
Non-linear interpolation using linear = F is smoother, but apparently sacrifices resolution and goes wild with the numbers (negative values of z).
Interpolation and plotting using mgcv
## mgcv interpolation
library(mgcv)
gam <- gam(z ~ s(x, y, bs = 'sos'), data = df)
df_mgcv <- data.frame(expand.grid(x = seq(min(df$x), max(df$x), length=200),
y = seq(min(df$y), max(df$y), length=200)))
resp <- predict(gam, df_mgcv, type = "response")
df_mgcv$z <- resp
## mgcv plot
q +
geom_tile(aes(x = x, y = y, fill = z), data = df_mgcv, alpha = .4) +
stat_contour(aes(x = x, y = y, z = z, fill = ..level..), data = df_mgcv, geom = 'polygon', alpha = .4) +
geom_contour(aes(x = x, y = y, z = z), data = df_mgcv, colour = 'white', alpha = .4) +
scale_fill_distiller(palette = "Spectral", na.value = NA)
The same process using mgcv results in a nice and smooth plot, but the resolution is much lower and practically none of the curves are closed.
Questions
Could you please suggest a better method or modify my attempt to obtain a plot similar to the first one (clean, connected, and smooth lines with high resolution)?
Is it possible to close the curves, e.g. in the last plot (the shaded area should be computed beyond the image boundaries)?
Thank you for your time!
The problem with your maps is not the interpolation method you're using, but the way ggplot displays density lines. Here's an answer to this: Remove gaps in a stat_density2d ggplot chart without modifying XY limits.
The density lines go beyond the map, so any polygon that extends outside the plot area is rendered incorrectly (ggplot closes the polygon using the next point of the corresponding level). This does not show up much on your first map because the interpolation resolution is low.
The trick proposed by Andrew is to first expand the plot area, so that the density lines are rendered correctly, then cut off the display area to hide the extra space. Since I tested his solution with your first example, here's the code:
q +
stat_density2d(
aes(x = x, y = y, fill = ..level..),
data = df,
geom = "polygon",
alpha = .2,
color = "white",
bins = 20
) +
scale_fill_distiller(
palette = "Spectral"
) +
xlim(
min(df$x) - 10^-5,
max(df$x) + 10^-5
) +
ylim(
min(df$y) - 10^-3,
max(df$y) + 10^-3
) +
coord_equal(
expand = FALSE,
xlim = c(-.778, -.688),
ylim = c(51.965, 52.03)
)
The only differences are that I used min() - / max() + instead of fixed numbers, and coord_equal to ensure the map wasn't distorted. In addition, I manually specified a greater number of levels (using bins), since stat_density2d automatically chooses a lower resolution when the plot area is increased.
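The expand-then-crop trick relies on the fact that scale limits (xlim(), ylim()) and coordinate limits (the xlim/ylim arguments of coord_equal() or coord_cartesian()) behave differently: scale limits discard data outside the range before any statistic is computed, while coordinate limits only zoom the view. A small sketch with made-up data, using layer_data() to inspect what each plot keeps:

```r
library(ggplot2)

pts <- data.frame(x = 1:10, y = 1:10)  # toy data, not from the question

# Scale limits: values outside [0, 5] are censored to NA
p_scale <- ggplot(pts, aes(x, y)) + geom_point() + xlim(0, 5)

# Coordinate limits: all data are kept, only the visible window changes
p_coord <- ggplot(pts, aes(x, y)) + geom_point() +
  coord_cartesian(xlim = c(0, 5))

# Number of usable x values that reach the drawing stage
n_kept_scale <- sum(!is.na(layer_data(p_scale)$x))
n_kept_coord <- sum(!is.na(layer_data(p_coord)$x))
```

This is why the answer widens the scale limits (so the density is computed on all the data) and crops afterwards with the coord.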
As for the best interpolation method, this depends on your objective and the type of data you have. The question is not what is the best method for your map, but what is the best method for your data. This is a very broad issue, out of scope for this space. But here's a good guide: http://www.rspatial.org/analysis/rst/4-interpolation.html
For general ideas on how to make good maps in R using ggplot: http://spatial.ly/r/
Sorry, I can't run your example at the moment to provide details, but try autoKrige() from the automap package.
Kriging is a great method for interpolation. Just be sure that your data meets its requirements. Here's a good guide:
https://gisgeography.com/kriging-interpolation-prediction/
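As a point of comparison only, here is a base-R sketch of inverse distance weighting (IDW). This is not kriging and not the automap API; it is a simpler stand-in that shows the shape of a gridded interpolation step. The helper idw_grid and its parameters are hypothetical, and longitude/latitude are treated as planar coordinates, which is only a rough assumption over a small area.

```r
# Hypothetical helper: inverse distance weighting on a regular n x n grid.
# Each grid value is a weighted mean of the observations, with weights 1 / d^power.
idw_grid <- function(x, y, z, n = 50, power = 2) {
  grid <- expand.grid(x = seq(min(x), max(x), length.out = n),
                      y = seq(min(y), max(y), length.out = n))
  grid$z <- vapply(seq_len(nrow(grid)), function(i) {
    d <- sqrt((x - grid$x[i])^2 + (y - grid$y[i])^2)
    if (any(d == 0)) return(mean(z[d == 0]))  # grid node coincides with a data point
    w <- 1 / d^power
    sum(w * z) / sum(w)
  }, numeric(1))
  grid
}

# Same simulated data as in the question
set.seed(1945)
pts <- data.frame(x = rnorm(500, -0.7406, 0.03),
                  y = rnorm(500, 51.9976, 0.03),
                  z = abs(rnorm(500, 2000, 1000)))

df_idw <- idw_grid(pts$x, pts$y, pts$z, n = 50)
```

Because every interpolated value is a weighted mean of observed values, IDW cannot produce the negative z values mentioned for linear = F; the trade-off is a less smooth surface with "bull's-eye" artefacts around data points. The resulting df_idw has the same x, y, z columns as df_akima and df_mgcv, so it can be passed to the same geom_tile() call.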
I have written the following code to plot my x-y data on a set of rescalable axes. The values in pointSize are the correctly scaled vertical/horizontal diameters of the point I want at each plotted coordinate. How do I get this to work? Right now I am just plotting points with whatever scaling geom_point(aes(size)) uses by default, and the points don't scale with the axes. Once I rescale the axes with coord_cartesian, I want the plotted points to grow or shrink relative to the axes accordingly.
For example, if the point size is say 5, that means I want the horizontal and vertical diameter of the point to be 5 relative to the axes regardless of specified xyScaling.
EDIT: min in pointSize should have been min = 0, not min = -10
Minimal reproducible code:
# Sample size & x-y axes plot boundaries
sampleSize <- 100
# Set scale factor of x-y axes
xyScaling <- 1
# Set to false once sampled to rescale axis with same distributions
resample <- TRUE
if (resample == TRUE){
xSample <- replicate(sampleSize, runif(1, min = -sampleSize/2, max = sampleSize/2))
ySample <- replicate(sampleSize, runif(1, min = -sampleSize/2, max = sampleSize/2))
pointSize <- replicate(sampleSize, runif(1, min = 0, max = 10))
}
sampleDataFrame <- data.frame(xSample, ySample, pointSize)
samplePlot <- ggplot(sampleDataFrame, aes(xSample, ySample))
samplePlot +
geom_point(aes(size = pointSize)) +  # refer to the column by name inside aes(), not via $
coord_cartesian(xlim = c((xyScaling*(-sampleSize/2)),(xyScaling*(sampleSize/2))),
ylim = c((xyScaling*(-sampleSize/2)),(xyScaling*(sampleSize/2)))) +
xlab("x") +
ylab("y") +
scale_size_identity(guide = "none")  # guide = FALSE is deprecated in recent ggplot2
EDIT: I almost managed to solve the problem by using geom_rect. The following code does what I want, with the caveat that the points are rectangles rather than ellipses/circles. I couldn't get this to work with ellipses; if anyone could point me to the right function, I would be very grateful.
sampleDataFrame <- data.frame(xSample, ySample, pointSize)
samplePlot <- ggplot(sampleDataFrame)
samplePlot +
geom_point(aes(xSample, ySample, size = 0)) +
geom_rect(aes(xmin = xSample-(pointSize/2), xmax = xSample+(pointSize/2), ymin = ySample-(pointSize/2), ymax = ySample+(pointSize/2))) +
coord_cartesian(xlim = c((xyScaling*(-sampleSize/2)),(xyScaling*(sampleSize/2))),
ylim = c((xyScaling*(-sampleSize/2)),(xyScaling*(sampleSize/2)))) +
xlab("x") +
ylab("y") +
scale_size_identity(guide = "none")  # guide = FALSE is deprecated in recent ggplot2
This has been suggested in the past, but I don't think it got implemented. One problem is that circles are only circular in the special case of Cartesian coordinates with unit aspect ratio. The easiest workaround is probably to create a data.frame with xy positions describing circles (ellipses) and draw these as polygons.
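library(ggplot2)
# unit circle as a 50-vertex polygon, built by hand
# (polygon_regular() is not exported by recent gridExtra releases)
theta <- seq(0, 2*pi, length.out = 51)[-51]
circle <- cbind(cos(theta), sin(theta))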
pointy_points <- function(x, y, size){
do.call(rbind, mapply(function(x,y,size,id)
data.frame(x=size*circle[,1]+x, y=size*circle[,2]+y, id=id),
x=x,y=y, size=size, id=seq_along(x), SIMPLIFY=FALSE))
}
test <- pointy_points(1:10, 1:10, size=seq(0.2, 1, length.out=10))
ggplot(test, aes(x,y,group=id, fill=id)) + geom_polygon()
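Since pointSize in the question is a diameter in axis units, a variant of the same idea can take widths and heights directly. This is a sketch only; the helper ellipse_outline and its arguments are hypothetical names, not part of ggplot2.

```r
# Hypothetical helper: polygon outlines of ellipses whose width and height
# are given in data units (diameters, not radii).
ellipse_outline <- function(x, y, width, height = width, n = 100) {
  theta <- seq(0, 2 * pi, length.out = n + 1)[-(n + 1)]  # n vertices, no duplicate
  pieces <- Map(function(cx, cy, w, h, id) {
    data.frame(x  = cx + (w / 2) * cos(theta),
               y  = cy + (h / 2) * sin(theta),
               id = id)                                   # one group per ellipse
  }, x, y, width, height, seq_along(x))
  do.call(rbind, pieces)
}

# Two circles with diameters 5 and 2, centred at (0, 0) and (10, 10)
outlines <- ellipse_outline(c(0, 10), c(0, 10), width = c(5, 2))
```

The result can be drawn with ggplot(outlines, aes(x, y, group = id)) + geom_polygon() + coord_equal(). Because the outlines live in data coordinates, they rescale with the axes when coord_cartesian() limits change.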
You could try to edit the points at the lowest level, but it's quite fiddly:
library(ggplot2); library(grid)
p <- qplot(1:10, 1:10, size=I(10))
g <- ggplotGrob(p)
points <- g$grobs[[4]][["children"]][[2]]
g$grobs[[4]][["children"]][[2]] <-
editGrob(points, size = convertUnit(points$size, unitTo = "npc"))
grid.newpage()
grid.draw(g)
I am creating a number of histograms and I want to add annotations towards the top of the graph. I am plotting these using a for loop so I need a way to place the annotations at the top even though my ylims change from graph to graph. If I could store the ylim for each graph within the loop I could cause the y coordinates for my annotation to vary based on the current graph. The y value I include in my annotation must change dynamically as the loop proceeds across iterations. Here is some sample code to demonstrate my issue (Notice how the annotation moves around. I need it to change based on the ylim for each graph):
library(ggplot2)
cuts <- levels(as.factor(diamonds$cut))
pdf(file = "Annotation Example.pdf", width = 11, height = 8,
family = "Helvetica", bg = "white")
for (i in 1:length(cuts)) {
by.cut<-subset(diamonds, diamonds$cut == cuts[[i]])
print(ggplot(by.cut, aes(price)) +
geom_histogram(fill = "steelblue", alpha = .55) +
annotate ("text", label = "My annotation goes at the top", x = 10000 ,hjust = 0, y = 220, color = "darkred"))
}
dev.off()
ggplot uses Inf in its positions to represent the extremes of the plot range, without changing the plot range. So the y value of the annotation can be set to Inf, and the vjust parameter can also be adjusted to get a better alignment.
...
print(ggplot(by.cut, aes(price)) +
geom_histogram(fill = "steelblue", alpha = .55) +
annotate("text", label = "My annotation goes at the top",
x = 10000, hjust = 0, y = Inf, vjust = 2, color = "darkred"))
...
For i <- 2, this looks like:
There may be a neater way, but you can get the max count and use that to set y in the annotate call:
for (i in 1:length(cuts)) {
by.cut<-subset(diamonds, diamonds$cut == cuts[[i]])
## get the cut points that ggplot will use. defaults to 30 bins and thus 29 cuts
by.cut$cuts <- cut(by.cut$price, seq(min(by.cut$price), max(by.cut$price), length.out=29))
## get the highest count of prices in a given cut.
y.max <- max(tapply(by.cut$price, by.cut$cuts, length))
print(ggplot(by.cut, aes(price)) +
geom_histogram(fill = "steelblue", alpha = .55) +
## change y = 220 to y = y.max as defined above
annotate ("text", label = "My annotation goes at the top", x = 10000 ,hjust = 0, y = y.max, color = "darkred"))
}
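Another option, sketched here under the assumption that the histogram uses a fixed number of bins, is to let ggplot2 do the binning and read the counts back with layer_data(), so the bins never have to be reproduced by hand. The variable names below are made up for this example, and only one cut is shown instead of the full loop.

```r
library(ggplot2)

by_cut <- subset(diamonds, cut == "Ideal")  # one level, for illustration

# Build the histogram first, without the annotation
p <- ggplot(by_cut, aes(price)) +
  geom_histogram(fill = "steelblue", alpha = .55, bins = 30)

# layer_data() returns the computed bins; 'count' is the bar height
y_max <- max(layer_data(p)$count)

# Place the annotation at the height of the tallest bar
p_annotated <- p +
  annotate("text", label = "My annotation goes at the top",
           x = 10000, hjust = 0, y = y_max, color = "darkred")
```

Inside the loop, print(p_annotated) would replace the original print() call.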
I found the plotrix package in R but cannot yet find how to do this simple circle in R. Basically, how can I do a polar-plot with radius of 1 and 0:360 angles in degree, generating a circle?
Example
$$r\cos\left(\frac{2\pi}{3}\left(\frac{3\theta}{2\pi}-\left\lfloor\frac{3\theta}{2\pi}\right\rfloor\right) -\frac{\pi}{3}\right) = 1$$
Perhaps related
I am trying to plot the above function (more here); the LaTeX is made visible with this hack here.
Draw a circle with ggplot2
Regular polygons in polar coordinates
You can also make circles using geometry
circle <- function(x, y, rad = 1, nvert = 500, ...){
rads <- seq(0,2*pi,length.out = nvert)
xcoords <- cos(rads) * rad + x
ycoords <- sin(rads) * rad + y
polygon(xcoords, ycoords, ...)
}
# asp = 1 due to Hans' comment below- wouldn't let me leave a comment just saying 'thanks'
plot(-5:5, type="n", xlim = c(-5,5), ylim = c(-5,5), asp = 1)
circle(0,0,4)
circle(-1.5,1.5,.5)
circle(1.5,1.5,.5)
circle(0,0,1)
segments(-2,-2,2,-2)
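Along the same lines, the polar equation from the question can be drawn in base graphics by solving it for r and converting to Cartesian coordinates. This is a sketch; the helper name polar_triangle is hypothetical, and the unit circle is overlaid for reference.

```r
# Hypothetical helper: evaluate the curve from the question,
#   r * cos( (2*pi/3) * (u - floor(u)) - pi/3 ) = 1,  with u = 3*theta / (2*pi),
# solved for r, then converted to x/y.
polar_triangle <- function(n = 600) {
  theta <- seq(0, 2 * pi, length.out = n)
  u <- 3 * theta / (2 * pi)
  r <- 1 / cos((2 * pi / 3) * (u - floor(u)) - pi / 3)
  data.frame(theta, r, x = r * cos(theta), y = r * sin(theta))
}

tri <- polar_triangle()

# asp = 1 so that the circle is not drawn as an ellipse
plot(tri$x, tri$y, type = "l", asp = 1, xlab = "x", ylab = "y")
lines(cos(tri$theta), sin(tri$theta), lty = 2)  # circle of radius 1
```

The cosine argument stays within [-pi/3, pi/3], so r ranges between 1 and 2 and the curve is a regular triangle with the unit circle inscribed in it.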
You can easily get polar coordinate graphs in ggplot2.
From the ggplot2 website:
library(ggplot2)
cxc <- ggplot(mtcars, aes(x = factor(cyl))) +
geom_bar(width = 1, colour = "black")
cxc <- cxc + coord_polar()
print(cxc)
You can do very nice circular statistics with the circular package. Below is one of the examples from the package:
require(circular)
x <- circular(runif(50, 0, 2*pi))
rose.diag(x, bins = 18, main = 'Uniform Data',col=2)
points(x)