Panel functions in Lattice using differing data - r

I am working with a data frame called d in R. I want to plot a scatter plot using two of the columns, include a best-fit regression line, and also plot binned means.
I have calculated the centers of the bins and binned means, and included those as columns in the data frame.
I can make the scatter plot and regression line work, but cannot get the binned means to show up. Using the code below I get no errors, but the points drawn by panel.points do not appear.
scatter.Epsilon <- xyplot(Epsilon ~ data.subset.UpdatedVS30.091015,
data = d,
grid = TRUE,
scales = list(x = list(log = 10)),
xlab = "Vs30 (m/s)",
ylab = "Epsilon",
ylim = c(-4, 3),
xlim = c(10^2,10^3.4),
subscripts = TRUE,
panel=function(x,y,subscripts,...) {
panel.xyplot(x,y)
panel.abline(mod <- lm(y ~ x), col = 'black')
panel.points(d$bin.ep[subscripts], d$means.ep[subscripts],
col = 'red')})
scatter.Epsilon
A simplified data set would be:
dist <- rnorm(10,4,100)
x <- seq(1,100)
bin <-rep(50,100)
mean <- rep(mean(dist),100)
d <- data.frame(x,dist,bin,mean)
where dist ~ x is the scatterplot component, and mean represents the binned mean for data points between 1-100, and bin is the bin's center (at 50). I want to add one point at (bin, mean) on top of dist ~ x. My real data set has multiple bins and means based on data.subset.UpdatedVS30.091015 that I want to add on top of Epsilon ~ data.subset.UpdatedVS30.091015.

I think you might be trying to do too much work in the call to panel.points. Using your example data, this code works fine:
scatter.Epsilon <- xyplot(dist ~ x,
data = d,
grid = TRUE,
subscripts = TRUE,
panel=function(x,y,subscripts,...) {
panel.xyplot(x,y)
panel.abline(mod <- lm(y ~ x), col = 'black')
panel.points(bin,mean,col = 'red')})
and plots a red point right where it should be. Have you tried just
panel.points(bin.ep,means.ep,col='red')
There is no grouping variable in your formula, so no need for subscripts.
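One more thing that may be worth checking, offered as an assumption since the real data are not shown: with scales = list(x = list(log = 10)), lattice passes log10-transformed x values to the panel function, so bin centres supplied on the raw scale would land far outside the panel. Below is a minimal sketch with made-up data. The names vs30, eps and bins are hypothetical stand-ins, and the binned means are kept in a separate small data frame rather than repeated on every row of d.

```r
library(lattice)

set.seed(1)
# Hypothetical stand-in for the real data (not the asker's data set)
d <- data.frame(vs30 = 10^runif(200, 2, 3.4))
d$eps <- rnorm(200, mean = 0.5 * (log10(d$vs30) - 2.7))

# Bin on the log10 scale; one row per bin holds the centre and the mean
breaks <- seq(2, 3.4, by = 0.2)
bin.id <- cut(log10(d$vs30), breaks, include.lowest = TRUE)
bins <- data.frame(
  bin.ep   = 10^(head(breaks, -1) + 0.1),            # centres on the raw scale
  means.ep = as.vector(tapply(d$eps, bin.id, mean))  # mean of eps within each bin
)

p <- xyplot(eps ~ vs30, data = d, grid = TRUE,
            scales = list(x = list(log = 10)),
            xlab = "Vs30 (m/s)", ylab = "Epsilon",
            panel = function(x, y, ...) {
              panel.xyplot(x, y, ...)
              # x has already been transformed to log10(vs30) here
              panel.abline(lm(y ~ x), col = "black")
              # Transform the bin centres so they match the log10 axis
              panel.points(log10(bins$bin.ep), bins$means.ep,
                           col = "red", pch = 16, cex = 1.5)
            })
print(p)
```

If the red points appear with log10() and vanish without it, the log axis was the cause of the missing points.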

Related

How can I add a line to my xyplot based upon the mean of an attribute of my data?

I have created the base graph I am looking for, but I can't figure out how to add a line to the graph based on the mean of the Murder attribute in the USArrests dataset. After that, I also need to color the state names according to whether they fall above or below the line.
The graph I have: https://ibb.co/V3VkYt4
The graph I need: https://ibb.co/4TTnQM1
I have tried adding an abline with the mean of the Murder attribute as the input, but the line appears outside of my graph. I am not sure what I am doing wrong.
library(lattice)
textPlot <- function()
{
data <- cbind(rownames(USArrests), USArrests)
names(data) <- c("State", names(data)[2:5])
averageM <- mean(USArrests$Murder)
xyplot(Murder~UrbanPop, data,
groups=State, panel=drawText,
main="Murder vs. Urban Population")
}
drawText <- function(x,y,groups,...)
{
panel.text(x=x,y=y,label=groups,cex=y/10)
}
Your graph appears to show a sloped regression line rather than a horizontal line for the mean. Lattice can add a regression line in xyplot directly from the variables with panel.lmline, or from a regression model (or a constant) with panel.abline. A little more work is required to classify the states that are above or below a selected murder rate. Here's one way to do it with lattice, showing both types of line.
# Load the lattice package, create data.frame with state names from USArrests
library(lattice)
df <- data.frame(State = rownames(USArrests), USArrests)
# Determine regression and mean murder rate outside of xyplot()
# However, these operations don't have to be done outside of the lattice function
fm <- lm(Murder ~ UrbanPop, df)
averageM <- mean(USArrests$Murder)
# Add a variable to the data.frame indicating the classification
df$type <- factor(ifelse(df$Murder < fitted(fm), "low", "high"))
# Plot via lattice with explicit panel() function
xyplot(Murder ~ UrbanPop, data = df,
panel = function(x, y, ...) {
panel.abline(fm, col = "red", lwd = 2)
# panel.lmline(x, y, col = "red", lwd = 2) # This would do the same
panel.abline(h = averageM, col = "red", lty = 2, lwd = 2)
# panel.abline(h = mean(y), col = "red", lty = 2, lwd = 2) # This would do the same
panel.text(x, y, labels = df$State, cex = y/10, col = c(2,4)[df$type])
}
)

How can I highlight minimum values in a levelplot in R?

How can I highlight the ten minimum value grid points of a 385*373 levelplot as black points?
I have the indexes as well as the coordinates of the ten minimum grid points. Preferably I would use the indexes.
I have the following levelplot displaying Europe's air temperature (Z), with X and Y being longitude and latitude respectively.
levelplot(Z ~ X*Y, data=data , xlab="X" , col.regions = heat.colors(100))
One further question: how can I add the country contours with the same projection type as the base data? I tried that before within another function
image(x,y,data,...)
data(wrld_simpl)
plot(wrld_simpl, add = TRUE)
where the country contours plot seemed to have a totally different projection. However, I want to do this for levelplot() now.
I am very thankful for any help!
Lattice plots differ from base plots, so using points() does not work. There are replacement panel functions, however. Here is a way to do it:
x <- seq(-10, 10, length.out = 100)
y <- seq(-10, 10, length.out = 100)
z <- as.vector(sqrt(outer(x^2, y^2, "+")))
grid <- cbind(expand.grid(x=x, y=y), z)
minimum <- grid[which.min(grid$z),]
levelplot(z ~ x * y, grid, panel = function(...) {
panel.levelplot(...)
panel.points(x = minimum$x, y = minimum$y, pch = "x", cex =2)
})
We are basically building up the plot inside the panel argument.
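The example above marks a single minimum. To mark the ten smallest values by index, as the question asks, the same idea can be extended with order(). This is a sketch on the same synthetic grid, not on the temperature data from the question, and the name lowest is only illustrative.

```r
library(lattice)

x <- seq(-10, 10, length.out = 100)
y <- seq(-10, 10, length.out = 100)
grid <- expand.grid(x = x, y = y)
grid$z <- sqrt(grid$x^2 + grid$y^2)

# Row indexes of the ten smallest z values
idx <- order(grid$z)[1:10]
lowest <- grid[idx, ]

p <- levelplot(z ~ x * y, data = grid,
               col.regions = heat.colors(100),
               panel = function(...) {
                 panel.levelplot(...)
                 # Overlay the ten minima as black points
                 panel.points(lowest$x, lowest$y,
                              pch = 16, col = "black", cex = 0.8)
               })
print(p)
```

If the indexes of the minima are already known, they can be used in place of idx to subset the data frame directly.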

Fitting smooth through xyplot

This question seems simple, but I haven't been able to figure out how to do it. I'm trying to fit a smooth line through a longitudinal dataset, as illustrated in the following code
library(nlme)
xyplot(conc ~ Time, data = Theoph, groups = Subject, type = c("l", "smooth"))
The output isn't quite what I'm after and there are multiple warnings. I would like to fit a smooth through the entire data. As a bonus, if anyone could also show how to do this using ggplot, that would be great.
To plot the individual Subjects as separate lines and points but plot the overall smooth, use either of the two lattice approaches shown or the classic graphics and zoo approach at the end. Note that we need to order the time points to produce the overall smooth, and that the nlme package is not used. Also note that the code in the question gives no errors, only warnings.
1) trellis.focus/trellis.unfocus We can use trellis.focus/trellis.unfocus to add an overall smooth:
library(lattice)
xyplot(conc ~ Time, groups = Subject, data = Theoph, type = "o")
trellis.focus("panel", 1, 1)
o <- order(Theoph$Time)
panel.xyplot(Theoph[o, "Time"], Theoph[o, "conc"], type = "smooth", col = "red", lwd = 3)
trellis.unfocus()
2) panel function A second way is to define an appropriate panel function:
library(lattice)
o <- order(Theoph$Time)
xyplot(conc ~ Time, groups = Subject, data = Theoph[o, ], panel =
function(x, y, ..., subscripts, groups) {
for (lev in levels(groups)) {
ok <- groups == lev
panel.xyplot(x[ok], y[ok], type = "o", col = lev)
}
panel.xyplot(x, y, type = "smooth", col = "red", lwd = 3)
})
Either of these gives the same output, in which the overall smooth is the thick red line.
3) zoo/classic graphics Here is a solution using the zoo package and classic graphics.
library(zoo)
Theoph.z <- read.zoo(Theoph[c("Subject", "Time", "conc")],
index = "Time", split = "Subject")
plot(na.approx(Theoph.z), screen = 1, col = 1:nlevels(Theoph$Subject))
o <- order(Theoph$Time)
lo <- loess(conc ~ Time, Theoph[o, ])
lines(fitted(lo) ~ Time, Theoph[o,], lwd = 3, col = "red")
You can use the latticeExtra package to add a smoother to your first trellis object
library(nlme)
library(ggplot2)
library(lattice)
library(latticeExtra)
xyplot(conc ~ Time, data = Theoph, groups = Subject, type = "l") +
layer(panel.smoother(..., col = "steelblue"))
And here is the ggplot2 version of the same graph
ggplot(data = Theoph, aes(Time, conc)) +
geom_line(aes(colour = Subject)) +
geom_smooth(col = "steelblue")
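A further lattice variant, not taken from the answers above, is to let panel.superpose draw one line per Subject and then call panel.loess once on all the points. This is a sketch that relies on lattice passing groups and subscripts to the panel function through the dots argument.

```r
library(lattice)

# Theoph ships with R in the datasets package, so nlme is not needed
p <- xyplot(conc ~ Time, data = Theoph, groups = Subject,
            panel = function(x, y, ...) {
              # One line per Subject; groups and subscripts arrive via ...
              panel.superpose(x, y, ..., type = "l")
              # A single loess curve fitted to all points in the panel
              panel.loess(x, y, col = "red", lwd = 3)
            })
print(p)
```

Because panel.loess evaluates the fit on its own grid of x values, the data do not need to be sorted by Time for this version.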

How to add boxplots to scatterplot with jitter

I am using following commands to produce a scatterplot with jitter:
ddf = data.frame(NUMS = rnorm(500), GRP = sample(LETTERS[1:5],500,replace=T))
library(lattice)
stripplot(NUMS~GRP,data=ddf, jitter.data=T)
I want to add boxplots over these points (one for every group). I tried searching, but I could not find code that plots all points (not just outliers) with jitter. How can I solve this? Thanks for your help.
Here's one way using base graphics.
boxplot(NUMS ~ GRP, data = ddf, lwd = 2, ylab = 'NUMS')
stripchart(NUMS ~ GRP, vertical = TRUE, data = ddf,
method = "jitter", add = TRUE, pch = 20, col = 'blue')
To do this in ggplot2, try:
library(ggplot2)
ggplot(ddf, aes(x=GRP, y=NUMS)) +
geom_boxplot(outlier.shape=NA) + #avoid plotting outliers twice
geom_jitter(position=position_jitter(width=.1, height=0))
Obviously you can adjust the width and height arguments of position_jitter() to your liking (although I'd recommend height=0 since height jittering will make your plot inaccurate).
I've written an R function called spreadPoints() within a package called basicPlotteR. The package can be installed directly into your R library using the following code:
install.packages("devtools")
library("devtools")
install_github("JosephCrispell/basicPlotteR")
For the example provided, I used the following code to generate the example figure below.
ddf = data.frame(NUMS = rnorm(500), GRP = sample(LETTERS[1:5],500,replace=T))
boxplot(NUMS ~ GRP, data = ddf, lwd = 2, ylab = 'NUMS')
spreadPointsMultiple(data=ddf, responseColumn="NUMS", categoriesColumn="GRP",
col="blue", plotOutliers=TRUE)
It is a work in progress (the lack of formula as input is clunky!) but it provides a non-random method to spread points on the X axis that doubles as a violin like summary of the data. Take a look at the source code, if you're interested.
For a lattice solution:
library(lattice)
ddf = data.frame(NUMS = rnorm(500), GRP = sample(LETTERS[1:5], 500, replace = T))
bwplot(NUMS ~ GRP, ddf, panel = function(...) {
panel.bwplot(..., pch = "|")
panel.xyplot(..., jitter.x = TRUE)})
The default median dot symbol was changed to a line with pch = "|". Other properties of the box and whiskers can be adjusted with box.umbrella and box.rectangle through the trellis.par.set() function. The amount of jitter can be adjusted through a variable named factor where factor = 1.5 increases it by 50%.
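The adjustments described above can be sketched as follows. Here the box and whisker settings are passed through the par.settings argument of bwplot() rather than a global trellis.par.set() call, which is a choice made for this example and keeps the change local to one plot. The colours, line widths and alpha value are arbitrary.

```r
library(lattice)

set.seed(42)
ddf <- data.frame(NUMS = rnorm(500),
                  GRP = sample(LETTERS[1:5], 500, replace = TRUE))

p <- bwplot(NUMS ~ GRP, data = ddf,
            # Per-plot graphical parameters for the box and its whiskers
            par.settings = list(
              box.rectangle = list(col = "black", lwd = 2),
              box.umbrella  = list(col = "black", lty = 1)
            ),
            panel = function(...) {
              panel.bwplot(..., pch = "|")
              # factor controls the amount of horizontal jitter
              panel.xyplot(..., jitter.x = TRUE, factor = 1.5,
                           col = "blue", alpha = 0.4)
            })
print(p)
```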

Surface plot Q in R - comparable to surf() in matlab

I want to plot a matrix of z values with x rows and y columns as a surface similar to this graph from MATLAB.
Surface plot:
Code to generate matrix:
# Parameters
shape<-1.849241
scale<-38.87986
x<-seq(from = -241.440, to = 241.440, by = 0.240)# 2013 length
y<-seq(from = -241.440, to = 241.440, by = 0.240)
matrix_fun<-matrix(data = 0, nrow = length(x), ncol = length(y))
# Generate two dimensional travel distance probability density function
for (i in 1:length(x)) {
for (j in 1:length(y)){
dxy<-sqrt(x[i]^2+y[j]^2)
prob<-1/(scale^(shape)*gamma(shape))*dxy^(shape-1)*exp(-(dxy/scale))
matrix_fun[i,j]<-prob
}}
# Rescale 2-d pdf to sum to 1
a<-sum(matrix_fun)
matrix_scale<-matrix_fun/a
I am able to generate surface plots using a couple methods (persp(), persp3d(), surface3d()) but the colors aren't displaying the z values (the probabilities held within the matrix). The z values only seem to display as heights not as differentiated colors as in the MATLAB figure.
Example of graph code and graphs:
library(rgl)
persp3d(x=x, y=y, z=matrix_scale, color=rainbow(25, start=min(matrix_scale), end=max(matrix_scale)))
surface3d(x=x, y=y, z=matrix_scale, color=rainbow(25, start=min(matrix_scale), end=max(matrix_scale)))
persp(x=x, y=y, z=matrix_scale, theta=30, phi=30, col=rainbow(25, start=min(matrix_scale), end=max(matrix_scale)), border=NA)
Any other tips to recreate the image in R would be most appreciated (i.e. legend bar, axis tick marks, etc.)
So here's a ggplot solution which seems to come a little bit closer to the MATLAB plot
# Parameters
shape<-1.849241
scale<-38.87986
x<-seq(from = -241.440, to = 241.440, by = 2.40)
y<-seq(from = -241.440, to = 241.440, by = 2.40)
df <- expand.grid(x=x,y=y)
df$dxy <- with(df,sqrt(x^2+y^2))
df$prob <- dgamma(df$dxy,shape=shape,scale=scale)
df$prob <- df$prob/sum(df$prob)
library(ggplot2)
library(colorRamps) # for matlab.like(...)
library(scales) # for labels=scientific
ggplot(df, aes(x,y))+
geom_tile(aes(fill=prob))+
scale_fill_gradientn(colours=matlab.like(10), labels=scientific)
BTW: You can generate your data frame of probabilities much more efficiently using the built-in dgamma(...) function, rather than calculating it yourself.
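As a quick check of that claim, the hand-written density from the question and dgamma() can be compared on a few distances. The values in dxy are arbitrary test points chosen for this sketch.

```r
shape <- 1.849241
scale <- 38.87986
dxy <- c(0.5, 10, 50, 200)   # arbitrary test distances

# Density written out by hand, as in the question
manual <- 1 / (scale^shape * gamma(shape)) * dxy^(shape - 1) * exp(-(dxy / scale))

# Same density from the built-in function
builtin <- dgamma(dxy, shape = shape, scale = scale)

# Compare the two up to floating-point tolerance
all.equal(manual, builtin)
```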
In line with alexis_laz's comment, here is an example using filled.contour. You might want to increase the by step of your sequences to 2.40, since the finer granularity greatly increases the time it takes to generate the plot without improving its quality.
filled.contour(x = x, y = y, z = matrix_scale, color.palette = terrain.colors)
# terrain.colors is in the base grDevices package
If you want something closer to your color scheme above, you can fiddle with the rainbow function:
filled.contour(x = x, y = y, z = matrix_scale,
color.palette = (function(n, ...) rep(rev(rainbow(n/2, ...)[1:9]), each = 3)))
Finer granularity:
filled.contour(x = x, y = y, z = matrix_scale, nlevels = 150,
color.palette = (function(n, ...)
rev(rep(rainbow(50, start = 0, end = 0.75, ...), each = 3))[5:150]))
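For the persp() call in the question, one likely reason the colors do not follow z is that col is applied per facet, so it needs one color for each of the (nrow - 1) * (ncol - 1) facets rather than a short palette. Below is a sketch of the facet-averaging approach used in the examples of the persp help page. It uses the coarser step of 2.4 and a rainbow palette picked for this example; the names zfacet, pal and facet_col are illustrative.

```r
# Coarser grid than in the question (step 2.4) so persp() draws quickly
shape <- 1.849241
scale <- 38.87986
x <- seq(-241.44, 241.44, by = 2.4)
y <- x
z <- outer(x, y, function(a, b)
  dgamma(sqrt(a^2 + b^2), shape = shape, scale = scale))
z <- z / sum(z)   # rescale so the values sum to 1

# persp() colours facets, not vertices: average the four corners of each facet
nr <- nrow(z)
nc <- ncol(z)
zfacet <- (z[-1, -1] + z[-1, -nc] + z[-nr, -1] + z[-nr, -nc]) / 4

# Map each facet height to one of 100 colours, blue (low) to red (high)
n_col <- 100
pal <- rev(rainbow(n_col, start = 0, end = 0.7))
facet_col <- pal[cut(zfacet, n_col, labels = FALSE)]

persp(x, y, z, theta = 30, phi = 30, col = facet_col, border = NA,
      xlab = "x", ylab = "y", zlab = "probability", ticktype = "detailed")
```

persp() has no built-in color legend, so a legend bar would still have to be added separately.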

Resources