How to plot loess surface with ggplot - r

I have this code, which fits a loess surface to my data frame.
library(gstat)
library(sp)
x<-c(0,55,105,165,270,65,130,155,155,225,250,295,
30,100,110,135,160,190,230,300,30,70,105,170,
210,245,300,0,85,175,300,15,60,90,90,140,210,
260,270,295,5,55,55,90,100,140,190,255,285,270)
y<-c(305,310,305,310,310,260,255,265,285,280,250,
260,210,240,225,225,225,230,210,215,160,190,
190,175,160,160,170,120,135,115,110,85,90,90,
55,55,90,85,50,50,25,30,5,35,15,0,40,20,5,150)
z<-c(870,793,755,690,800,800,730,728,710,780,804,
855,813,762,765,740,765,760,790,820,855,812,
773,812,827,805,840,890,820,873,875,873,865,
841,862,908,855,850,882,910,940,915,890,880,
870,880,960,890,860,830)
dati<-data.frame(x,y,z)
x.range <- as.numeric(c(min(x), max(x)))
y.range <- as.numeric(c(min(y), max(y)))
meuse.loess <- loess(z ~ x * y, dati, degree=2, span = 0.25,
normalize=F)
meuse.mar <- list(x = seq(from = x.range[1], to = x.range[2], by = 1), y = seq(from = y.range[1],
to = y.range[2], by = 1))
meuse.lo <- predict(meuse.loess, newdata=expand.grid(meuse.mar), se=TRUE)
Now I want to plot meuse.lo[[1]] with ggplot2, but I don't know how to convert meuse.lo[[1]] into a data frame with x, y (the grid coordinates) and z (the interpolated value) columns. Thanks.

Your problem here is that predict() returns a matrix if you use expand.grid() to generate the new data for the loess model.
This is mentioned in the help for ?predict.loess:
If newdata was the result of a call to expand.grid, the predictions (and s.e.'s if requested) will be an array of the appropriate dimensions.
You can still use expand.grid() to build the new data, but tell it to drop the "out.attrs" attribute so that it returns a plain data frame.
From ?expand.grid:
KEEP.OUT.ATTRS: a logical indicating the "out.attrs" attribute (see below) should be computed and returned.
So, try this:
nd <- expand.grid(meuse.mar, KEEP.OUT.ATTRS = FALSE)
meuse.lo <- predict(meuse.loess, newdata=nd, se=TRUE)
# Add the fitted data to the `nd` object
nd$z <- meuse.lo$fit
library(ggplot2)
ggplot(nd, aes(x, y, fill = z)) +
geom_tile() +
coord_fixed()
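If you already have the matrix form of the predictions, it can also be flattened directly. The sketch below is only an illustration on synthetic data (the data set, `fit`, `grid` and `surface` are made-up names, not the asker's objects); it relies on expand.grid() varying its first variable fastest, which matches R's column-major matrix storage.

```r
# Synthetic stand-in for `dati` (hypothetical data, for illustration only)
set.seed(1)
dati <- data.frame(x = runif(200, 0, 10), y = runif(200, 0, 10))
dati$z <- with(dati, sin(x / 2) + cos(y / 3) + rnorm(200, sd = 0.1))

fit <- loess(z ~ x * y, data = dati, degree = 2, span = 0.5)

# Prediction grid, kept inside the data range
grid <- list(x = seq(1, 9, by = 0.5), y = seq(1, 9, by = 1))

# With the default KEEP.OUT.ATTRS = TRUE, predict() returns a matrix
pred_mat <- predict(fit, newdata = expand.grid(grid))

# Flatten the matrix column by column and bind it to the grid coordinates
surface <- cbind(expand.grid(grid, KEEP.OUT.ATTRS = FALSE),
                 z = as.vector(pred_mat))
head(surface)
```

The resulting `surface` data frame has one row per grid point and can be passed to the same ggplot() call shown above.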

ggplot2 is probably not the best choice for 3D graphs. However, here is an easy solution with rgl:
library(rgl)
plot3d(x, y, z, type="s", size=0.75, lit=FALSE,col="red")
surface3d(meuse.mar[[1]], meuse.mar[[2]], meuse.lo[[1]],
alpha=0.4, front="lines", back="lines")

Related

Adding loess regression line on a hexbin plot

I have been trying to find a method to add a loess regression line to a hexbin plot. So far I have not had any success. Any suggestions?
My code is as follows:
bin<-hexbin(Dataset$a, Dataset$b, xbins=40)
plot(bin, main="Hexagonal Binning",
xlab = "a", ylab = "b",
type="l")
I would suggest using ggplot2 to build the plot.
Since you didn't include any example data, I've used the palmerpenguins package dataset for the example below.
library(palmerpenguins) # For the data
library(ggplot2) # ggplot2 for plotting
ggplot(penguins, aes(x = body_mass_g,
y = bill_length_mm)) +
geom_hex(bins = 40) +
geom_smooth(method = 'loess', se = F, color = 'red')
Created on 2021-01-05 by the reprex package (v0.3.0)
I don't have a solution for base, but it's possible to do this with ggplot. It should be possible with base too, but if you look at the documentation for ?hexbin, you can see the quote:
Note that when plotting a hexbin object, the grid package is used. You must use its graphics (or those from package lattice if you know how) to add to such plots.
I'm not familiar with how to modify these. I did try ggplotify to convert the base to ggplot and edit that way, but couldn't get the loess line added to the plot window properly.
So here is a solution with ggplot with some fake data that you can try on your Datasets:
library(hexbin)
library(ggplot2)
# fake data with a random walk, replace with your data
set.seed(100)
N <- 1000
x <- rnorm(N)
x <- sort(x)
y <- vector("numeric", length=N)
for(i in 2:N){
y[i] <- y[i-1] + rnorm(1, sd=0.1)
}
# current method
# In documentation for ?hexbin it says:
# "You must use its graphics (or those from package lattice if you know how) to add to such plots."
(bin <- hexbin(x, y, xbins=40))
plot(bin)
# ggplot option. Can play around with scale_fill_gradient to
# get the colour scale similar or use other ggplot options
df <- data.frame(x=x, y=y)
d <- ggplot(df, aes(x, y)) +
geom_hex(bins=40) +
scale_fill_gradient(low = "grey90", high = "black") +
theme_bw()
d
# easy to add a loess fit to the data
# span controls the degree of smoothing, decrease to make the line
# more "wiggly"
model <- loess(y~x, span=0.2)
fit <- predict(model)
loess_data <- data.frame(x=x, y=fit)
d + geom_line(data=loess_data, aes(x=x, y=y), col="darkorange",
size=1.5)
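To see what the span argument does without plotting, one can compare how closely two loess fits follow the data. This is a rough sketch on the same kind of simulated random walk; the two span values and the names `fit_wiggly` and `fit_smooth` are arbitrary choices for illustration.

```r
# Simulated random walk, similar in spirit to the fake data above
set.seed(100)
N <- 1000
x <- sort(rnorm(N))
y <- cumsum(c(0, rnorm(N - 1, sd = 0.1)))

# A small span follows the data closely; a large span smooths heavily
fit_wiggly <- loess(y ~ x, span = 0.2)
fit_smooth <- loess(y ~ x, span = 0.8)

# Residual sum of squares for each fit
rss <- c(span_0.2 = sum(residuals(fit_wiggly)^2),
         span_0.8 = sum(residuals(fit_smooth)^2))
print(rss)
```

A lower residual sum of squares means a more "wiggly" line, so the value chosen for span is a trade-off between following the data and smoothing it.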
Here are two options; you will need to decide if you want to smooth over the raw data or the binned data.
library(hexbin)
library(grid)
# Some data
set.seed(101)
d <- data.frame(x=rnorm(1000))
d$y <- with(d, 2*x^3 + rnorm(1000))
Method A - binned data
# plot hexbin & smoother : need to grab plot viewport
# From ?hexVP.loess : "Fit a loess line using the hexagon centers of mass
# as the x and y coordinates and the cell counts as weights."
bin <- hexbin(d$x, d$y)
p <- plot(bin)
hexVP.loess(bin, hvp = p$plot.vp, span = 0.4, col = "red", n = 200)
Method B - raw data
# calculate loess predictions outside plot on raw data
l = loess(y ~ x, data=d, span=0.4)
xp = with(d, seq(min(x), max(x), length=200))
yp = predict(l, xp)
# plot hexbin
bin <- hexbin(d$x, d$y)
p <- plot(bin)
# add loess line
pushHexport(p$plot.vp)
grid.lines(xp, yp, gp=gpar(col="red"), default.units = "native")
upViewport()

Is there a way to plot a random walk process using ggplot?

My aim is to compare the movement of a random walk process with the movement of stock prices.
I created a random walk process and plotted that as follows
P1 <- RW(100, 10, 0, 0.0004)
plot(P1, main="Random Walk without Drift", xlab="Index", ylab="Price", ylim=c(9.7,10.3), type='l', col="blue")
and it worked.
But is it possible to use ggplot instead of plot?
In base graphics, when you call plot(x) with no y component, several things go on under the hood. The notable one is that it calls xy.coords(x, y), which eventually does ...
else {
if (is.factor(x))
x <- as.numeric(x)
if (setLab)
xlab <- "Index"
y <- x
x <- seq_along(x)
}
which is the clue to how to get ggplot2 to do effectively the same thing: assign the values to y and create a sequence for x.
library(ggplot2)
set.seed(42)
P1 <- cumsum(rnorm(1000))
plot(P1, type = "l")
ggplot(mapping = aes(x = seq_along(P1), y = P1)) + geom_line()
or in a "formalized" data.frame:
dat <- data.frame(x = seq_along(P1), y = P1)
ggplot(dat, aes(x = x, y = y)) + geom_line()
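Since the stated aim is to compare the walk with a price series, here is a minimal sketch of one way to put both on the same axes. Everything in it is an assumption made for illustration: `walk` and `price` are both simulated placeholders (the asker's RW() function is not shown, and `price` should be replaced with real data), and the ggplot2 call is guarded so the rest runs without the package.

```r
set.seed(42)
n <- 100
walk  <- 10 + cumsum(rnorm(n, sd = 0.02))  # simulated random walk
price <- 10 + cumsum(rnorm(n, sd = 0.02))  # placeholder; replace with real prices

# xy.coords() shows what plot(walk) would use: the index as x, the values as y
xy <- xy.coords(walk)

# Long format: one row per observation, with a column naming the series
long <- rbind(
  data.frame(index = seq_along(walk),  value = walk,  series = "random walk"),
  data.frame(index = seq_along(price), value = price, series = "price")
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = index, y = value, colour = series)) +
    geom_line()
  print(p)
}
```

Mapping `series` to colour lets ggplot2 draw one line per series and build the legend on its own.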

How do I add the curve from GauPro in ggplot?

GauPro is an R library for fitting Gaussian processes. You can also get it to produce a nice predicted curve for you.
The documentation for GauPro uses built-in R plotting functions to produce plots like this:
gp <- GauPro(x,y) ## fit a gaussian process model to x & y
plot(x,y) ## plots the x,y points
curve(gp$predict(x), add=T, col=2) ## adds the predicted curve from the gaussian process
What would be the equivalent using ggplot? I can get the points to show up, but I can't quite figure out how to add the curve.
GauPro documentation I refer to is here
We can do this by building a little data frame of predictions. Let's start by loading the necessary packages and creating some sample data:
library(GauPro)
library(ggplot2)
set.seed(69)
x <- 1:10
y <- cumsum(runif(10))
Now we can create our model and plot it using the same plotting functions shown in the vignette you linked:
gp <- GauPro(x, y)
plot(x, y)
curve(gp$predict(x), add = TRUE, col = 2)
Now if we want to customize this plot using ggplot, we need a data frame with columns for the x values at which we wish to predict, the y prediction at that point, and a column each for upper and lower 95% confidence intervals. We can obtain the x values like this:
new_x <- seq(min(x), max(x), length.out = 100)
and we can get the three sets of corresponding y values using predict like this:
predict_df <- predict(gp, new_x, se.fit = TRUE)
predict_df$x <- new_x
predict_df$y <- predict_df$mean
predict_df$lower <- predict_df$y - 1.96 * predict_df$se
predict_df$upper <- predict_df$y + 1.96 * predict_df$se
This is now quite straightforward to plot in ggplot, with themes customized as you choose:
ggplot(data.frame(x, y), aes(x, y)) +
geom_point() +
geom_line(data = predict_df, color = "deepskyblue4", linetype = 2) +
geom_ribbon(data = predict_df, aes(ymin = lower, ymax = upper),
alpha = 0.2, fill = "deepskyblue4") +
theme_minimal()
Created on 2020-07-29 by the reprex package (v0.3.0)
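The same "data frame of predictions" pattern works for any model whose predict method returns a mean and a standard error. The sketch below deliberately swaps in lm() for GauPro so that it runs without that package; it is not the GauPro API, and `band` is a made-up name. It uses the same normal approximation as above, where 1.96 is roughly qnorm(0.975).

```r
set.seed(69)
x <- 1:10
y <- cumsum(runif(10))

# Stand-in model: a straight line instead of a Gaussian process
fit <- lm(y ~ x)

new_x <- seq(min(x), max(x), length.out = 100)
pr <- predict(fit, newdata = data.frame(x = new_x), se.fit = TRUE)

# Normal-approximation 95% band around the predicted mean
z <- qnorm(0.975)
band <- data.frame(
  x     = new_x,
  y     = pr$fit,
  lower = pr$fit - z * pr$se.fit,
  upper = pr$fit + z * pr$se.fit
)
head(band)
```

Because `band` has x, y, lower and upper columns, it can be passed to geom_line() and geom_ribbon() in the same way as predict_df.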

Filling parts of a contour plot in R

I have made a contour plot in R with the following code:
library(mvtnorm)
library(ggplot2)
# Define the parameters for the multivariate normal distribution
mu = c(0,0)
sigma = matrix(c(1,0.2,0.2,3),nrow = 2)
# Make a grid in the x-y plane centered in mu, +/- 3 standard deviations
xygrid = expand.grid(x = seq(from = mu[1]-3*sigma[1,1], to = mu[1]+3*sigma[1,1], length.out = 100),
y = seq(from = mu[2]-3*sigma[2,2], to = mu[2]+3*sigma[2,2], length.out = 100))
# Use the mvtnorm library to calculate the multivariate normal density for each point in the grid
distribution = as.matrix(dmvnorm(x = xygrid, mean = mu, sigma = sigma))
# Plot contours
df = as.data.frame(cbind(xygrid, distribution))
myPlot = ggplot() + geom_contour(data = df,geom="polygon",aes( x = x, y = y, z = distribution))
myPlot
I want to illustrate cumulative probability by shading/colouring certain parts of the plot, for instance everything in the region {x<0, y<0} (or any other self defined region).
Is there any way of achieving this in R with ggplot?
You can get the coordinates used to draw the contour lines in the plot using ggplot_build. You could then use these coordinates in combination with geom_polygon to shade a particular region. My best try:
library(dplyr)
data <- ggplot_build(myPlot)$data[[1]]
xCoor <- 0
yCoor <- 0
df <- data %>% filter(group == '-1-001', x <= xCoor, y <= yCoor) %>% select(x,y)
# Insert the [0,0] coordinate in the right place
index <- which.max(abs(diff(rank(df$y))))
df <- rbind( df[1:index,], data.frame(x=xCoor, y=yCoor), df[(index+1):nrow(df),] )
myPlot + geom_polygon(data = df, aes(x=x, y=y), fill = 'red', alpha = 0.5)
As you can see it's not perfect because the [x,0] and [0,y] coordinates are not included in the data, but it's a start.
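To put a number on the shaded region, the probability mass over {x<0, y<0} can be approximated by summing the density over a fine grid. This is only a sketch under stated assumptions: the bivariate normal density is written out in base R rather than calling dmvnorm, the grid spacing `h` and the truncation limits are arbitrary choices, and the closed-form check holds only because the region is a quadrant of a zero-mean distribution.

```r
mu <- c(0, 0)
sigma <- matrix(c(1, 0.2, 0.2, 3), nrow = 2)

# Bivariate normal density written out in base R
dens <- function(x, y) {
  d <- cbind(x - mu[1], y - mu[2])
  q <- rowSums((d %*% solve(sigma)) * d)
  exp(-q / 2) / (2 * pi * sqrt(det(sigma)))
}

# Midpoint grid over {x < 0, y < 0}, truncated about 8 standard deviations out
h <- 0.02
gx <- seq(-8 + h / 2, -h / 2, by = h)
gy <- seq(-14 + h / 2, -h / 2, by = h)
grid <- expand.grid(x = gx, y = gy)

# Riemann sum: density times cell area
p_grid <- sum(dens(grid$x, grid$y)) * h^2

# Closed form for a zero-mean bivariate normal: 1/4 + asin(rho) / (2 * pi)
rho <- sigma[1, 2] / sqrt(sigma[1, 1] * sigma[2, 2])
p_exact <- 0.25 + asin(rho) / (2 * pi)

print(c(grid = p_grid, exact = p_exact))
```

For a region with no closed form, the grid sum alone still gives an approximation, with accuracy controlled by `h`.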

Interpolating a path/curve within R

Within R, I want to interpolate an arbitrary path with constant distance
between interpolated points.
The test data looks like this:
require("rgdal", quietly = TRUE)
require("ggplot2", quietly = TRUE)
r <- readOGR(".", "line", verbose = FALSE)
coords <- as.data.frame(r@lines[[1]]@Lines[[1]]@coords)
names(coords) <- c("x", "y")
print(coords)
x y
-0.44409 0.551159
-1.06217 0.563326
-1.09867 0.310255
-1.09623 -0.273754
-0.67283 -0.392990
-0.03772 -0.273754
0.63633 -0.015817
0.86506 0.473291
1.31037 0.998899
1.43934 0.933198
1.46854 0.461124
1.39311 0.006083
1.40284 -0.278621
1.54397 -0.271321
p.orig <- ggplot(coords, aes(x = x, y = y)) + geom_path(colour = "red") +
geom_point(colour = "yellow")
print(p.orig)
I tried different methods, none of them were really satisfying:
aspline (akima-package)
approx
bezierCurve
with the tourr-package I couldn't get started
aspline
aspline from the akima-package does some weird stuff when dealing with arbitrary paths:
plotInt <- function(coords) print(p.orig + geom_path(aes(x = x, y = y),
data = coords) + geom_point(aes(x = x, y = y), data = coords))
N <- 50 # 50 points to interpolate
require("akima", quietly = TRUE)
xy.int.ak <- as.data.frame(with(coords, aspline(x = x, y = y, n = N)))
plotInt(xy.int.ak)
approx
xy.int.ax <- as.data.frame(with(coords, list(x = approx(x, n = N)$y,
y = approx(y, n = N)$y)))
plotInt(xy.int.ax)
At first sight, approx looks pretty fine; however, testing it with real data gives me
problems with the distances between the interpolated points. Also a smooth, cubic interpolation would be a nice thing.
bezier
Another approach is to use bezier-curves; I used the following
implementation
source("bez.R")
xy.int.bz <- as.data.frame(with(coords, bezierCurve(x, y, N)))
plotInt(xy.int.bz)
How about regular splines using the same method you used for approx? Will that work on the larger data?
xy.int.sp <- as.data.frame(with(coords, list(x = spline(x)$y,
y = spline(y)$y)))
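One way to get evenly spaced points, sketched here as a suggestion rather than a tested solution for the real data, is to interpolate x and y against the cumulative distance along the path instead of against the point index. The coordinates below are the first few rows of the test data; `s`, `s_new` and `xy_int` are names chosen for this sketch.

```r
# First few rows of the test data from the question
coords <- data.frame(
  x = c(-0.44409, -1.06217, -1.09867, -1.09623, -0.67283, -0.03772, 0.63633),
  y = c(0.551159, 0.563326, 0.310255, -0.273754, -0.392990, -0.273754, -0.015817)
)

# Cumulative distance along the polyline, starting at 0
s <- c(0, cumsum(sqrt(diff(coords$x)^2 + diff(coords$y)^2)))

# Equally spaced positions along the path
N <- 50
s_new <- seq(0, max(s), length.out = N)

# Interpolate each coordinate as a function of distance travelled
xy_int <- data.frame(
  x = approx(s, coords$x, xout = s_new)$y,
  y = approx(s, coords$y, xout = s_new)$y
)
head(xy_int)
```

The points are equally spaced along the original polyline. Replacing approx() with spline(s, ..., xout = s_new) gives a smooth curve, but the spacing is then only approximately constant, because the spline no longer follows the straight segments.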
Consider using xspline or grid.xspline (the first is for base graphics, the second for grid):
plot(x,y, type='b', col='red')
xspline(x,y, shape=1)
You can adjust the shape parameter to change the curve. This example just draws the X-spline, but with draw = FALSE the function instead returns the set of x-y coordinates, which you can then plot yourself.
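A minimal sketch of the coordinate-returning form, using the first few points of the test data. xspline() needs an open plot to work out the curve, so a plot is drawn first; `xy` is just a name chosen here.

```r
# First few points of the test path
x <- c(-0.44409, -1.06217, -1.09867, -1.09623, -0.67283, -0.03772, 0.63633)
y <- c(0.551159, 0.563326, 0.310255, -0.273754, -0.392990, -0.273754, -0.015817)

# A plot must exist before xspline() is called
plot(x, y, type = "b", col = "red")

# draw = FALSE returns the curve as a list with x and y instead of drawing it
xy <- xspline(x, y, shape = 1, draw = FALSE)

# The coordinates can then be drawn or processed like any other data
lines(xy$x, xy$y)
```

The returned points could also be wrapped in a data frame and passed to geom_path() to keep the plot in ggplot2.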
