Fitting smooth through xyplot - r

This question seems simple, but I haven't been able to figure out how to do it. I'm trying to fit a smooth line through a longitudinal dataset, as illustrated in the following code:
library(nlme)
xyplot(conc ~ Time, data = Theoph, groups = Subject, type = c("l", "smooth"))
The output isn't quite what I'm after and there are multiple warnings. I would like to fit a smooth through the entire data. As a bonus, if anyone could also show how to do this using ggplot, that would be great.

To plot the individual Subjects as separate lines and points but plot a single overall smooth, use either of the two lattice approaches shown or the classic graphics and zoo approach at the end. Note that we need to order the time points to produce the overall smooth, and that the nlme package is not used. Also note that the code in the question gives no errors, only warnings.
1) trellis.focus/trellis.unfocus We can use trellis.focus/trellis.unfocus to add an overall smooth:
library(lattice)
xyplot(conc ~ Time, groups = Subject, data = Theoph, type = "o")
trellis.focus("panel", 1, 1)
o <- order(Theoph$Time)
panel.xyplot(Theoph[o, "Time"], Theoph[o, "conc"], type = "smooth", col = "red", lwd = 3)
trellis.unfocus()
2) panel function A second way is to define an appropriate panel function:
library(lattice)
o <- order(Theoph$Time)
xyplot(conc ~ Time, groups = Subject, data = Theoph[o, ], panel =
function(x, y, ..., subscripts, groups) {
for (lev in levels(groups)) {
ok <- groups == lev
panel.xyplot(x[ok], y[ok], type = "o", col = lev)
}
panel.xyplot(x, y, type = "smooth", col = "red", lwd = 3)
})
Either of these gives the following output. Note that the overall smooth is the thick red line.
(continued after chart)
3) zoo/classic graphics Here is a solution using the zoo package and classic graphics.
library(zoo)
Theoph.z <- read.zoo(Theoph[c("Subject", "Time", "conc")],
index = "Time", split = "Subject")
plot(na.approx(Theoph.z), screen = 1, col = 1:nlevels(Theoph$Subject))
o <- order(Theoph$Time)
lo <- loess(conc ~ Time, Theoph[o, ])
lines(fitted(lo) ~ Time, Theoph[o,], lwd = 3, col = "red")
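The ordering step above is needed because lines() joins points in the order they are given. As a rough sketch of an alternative, the pooled loess fit can be evaluated on an evenly spaced grid, which avoids sorting the fitted values. The names th, grid_t and pred are illustrative and not part of the answer above; the grid size of 50 is an arbitrary choice.

```r
# Pooled loess: every Subject contributes to one overall curve
th <- as.data.frame(Theoph)            # plain data.frame, nlme not required
lo <- loess(conc ~ Time, data = th)

# Evaluate the fit on a regular grid inside the observed Time range
grid_t <- seq(min(th$Time), max(th$Time), length.out = 50)
pred   <- predict(lo, newdata = data.frame(Time = grid_t))

# Raw points in grey, overall smooth as a thick red line
plot(conc ~ Time, data = th, col = "grey")
lines(grid_t, pred, col = "red", lwd = 3)
```

Because the grid is already increasing, the red line is drawn left to right without any call to order().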

You can use the latticeExtra package to add a smoother to your first trellis object:
library(nlme)
library(ggplot2)
library(lattice)
library(latticeExtra)
xyplot(conc ~ Time, data = Theoph, groups = Subject, type = "l") +
layer(panel.smoother(..., col = "steelblue"))
And here is the ggplot2 version of the same graph
ggplot(data = Theoph, aes(Time, conc)) +
geom_line(aes(colour = Subject)) +
geom_smooth(col = "steelblue")
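The ggplot2 version works because colour is mapped to Subject only inside geom_line(), so geom_smooth() sees the pooled data and fits one curve. A hedged variant that names the smoothing method explicitly is sketched below; method = "loess" and se = FALSE are choices made here for illustration, not settings taken from the answer.

```r
library(ggplot2)

th <- as.data.frame(Theoph)   # drop the groupedData classes

p <- ggplot(th, aes(Time, conc)) +
  # one line per Subject: the colour mapping is local to this layer
  geom_line(aes(colour = Subject)) +
  # no grouping here, so a single smooth is fitted to all subjects
  geom_smooth(method = "loess", formula = y ~ x, se = FALSE,
              colour = "steelblue")
print(p)
```

Moving aes(colour = Subject) into the ggplot() call instead would make geom_smooth() fit one curve per Subject, which is the behaviour the question is trying to avoid.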

Related

How to color background in each panel of an xyplot according to a factor variable?

I'm trying to construct an xyplot with background colors that vary according to the values of an additional categorical variable. Getting a repeating background coloring with the panel.xblocks function (package: latticeExtra) is no problem, but so far I have found no way to use different coloring for different subplots in the xyplot.
JD <- c(seq(0,19, 1), seq(0,19, 1))
VAR <- c(rnorm(20, mean=10, sd=1), rnorm(20, mean=10, sd=1))
CATEG <- c(rep("A", 5), rep("B", 15), rep("A", 10), rep("B", 10))
YEAR <- c(rep(2001, 20), rep(2002, 20))
myd <- data.frame(JD, VAR, CATEG, YEAR)
xyplot((VAR) ~ JD | factor(YEAR), type="l",
xlab="", ylab="", col=1, data=myd)+
layer_(panel.xblocks(x, CATEG,
col = c("lightgray")))
Running the above code, the background coloring from the first xyplot subplot (year 2001) is repeated in the second subplot (year 2002). My aim is to get a different background coloring in each of the two subplots, according to the variable "CATEG". Any suggestions welcome.
I think the panel.xblocks function is a good approach. The use of subscripts and groups is handy too but always requires some re-learning for me.
The conditioning request (|) generates subscripts. The groups argument is used to pass the CATEG values to the panel function. It isn't actually used for any grouping here. The ... in the panel function is not actually used either, but it's a good practice in case the code is changed and other functions need arguments passed down.
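# Starting with data in 'myd' from above
# Since R 4.0, data.frame() no longer converts strings to factors,
# so make 'CATEG' a factor explicitly; levels() below relies on it
myd$CATEG <- factor(myd$CATEG)
# Load non-standard packages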
library(lattice)
library(latticeExtra)
# Old school colors
myCol <- c("salmon", "lightgray")
names(myCol) <- levels(myd$CATEG)
# To use a different color for each level of 'CATEG' in each panel:
obj1 <- xyplot(VAR ~ JD | factor(YEAR), data = myd,
groups = CATEG, xlab = "", ylab = "",
panel = function(x, y, subscripts, groups, ...) {
panel.xblocks(x, myCol[groups][subscripts])
panel.lines(x, y, col = 1)
})
# Here's a solution to a different problem (second plot):
# How to use a different color for the first level of 'CATEG' in each panel
obj2 <- xyplot(VAR ~ JD | factor(YEAR), data = myd,
xlab = "", ylab = "", groups = CATEG,
panel = function(x, y, subscripts, groups, ...) {
panel.xblocks(x, myCol[panel.number()][groups][subscripts])
panel.lines(x, y, col = 1)
})
# Plot in one device
plot(obj1, position = c(0, 0.45, 1, 1), more = TRUE)
plot(obj2, position = c(0, 0, 1, 0.55))
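To see what subscripts and groups contain inside a panel function, a small sketch can record them while the plot is drawn. The data frame d and the list seen are made up for illustration; the only point is that groups arrives at full length and subscripts selects the rows that belong to the current panel.

```r
library(lattice)

set.seed(1)
d <- data.frame(x  = rep(1:4, 2),
                y  = rnorm(8),
                g  = factor(rep(c("A", "B"), 4)),
                yr = factor(rep(c(2001, 2002), each = 4)))

seen <- list()   # filled in as each panel is drawn

p <- xyplot(y ~ x | yr, data = d, groups = g,
            panel = function(x, y, subscripts, groups, ...) {
              # 'groups' has one entry per row of 'd';
              # 'subscripts' holds the row numbers shown in this panel
              seen[[length(seen) + 1]] <<- subscripts
              panel.text(x, y, labels = as.character(groups[subscripts]))
            })
print(p)   # the panel function only runs when the plot is drawn
```

After printing, seen holds one vector of row numbers per panel, which is why groups[subscripts] lines up with x and y.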

How can I add a line to my xyplot based upon the mean of an attribute of my data?

I have created the base graph I want, but I can't figure out how to add a line to it based on the mean of the Murder attribute in the USArrests dataset. After that, I also need to color the state names according to whether they fall above or below the line.
The graph I have: https://ibb.co/V3VkYt4
The graph I need: https://ibb.co/4TTnQM1
I have tried adding an abline with the mean of the Murder attribute as the input, but the line appears outside of my graph. I'm not sure what I am doing wrong.
library(lattice)
textPlot <- function()
{
data <- cbind(rownames(USArrests), USArrests)
names(data) <- c("State", names(data)[2:5])
averageM <- mean(USArrests$Murder)
xyplot(Murder~UrbanPop, data,
groups=State, panel=drawText,
main="Murder vs. Urban Population")
}
drawText <- function(x,y,groups,...)
{
panel.text(x=x,y=y,label=groups,cex=y/10)
}
Your graph appears to show a sloped regression line rather than a horizontal line for the mean. Lattice can add a regression line in xyplot directly from the variables with panel.lmline, or from a regression model (or a constant) with panel.abline. A little more work is required to classify the states that are above or below the selected murder rate. Here's one way to do it with lattice, showing both types of line.
# Load the lattice package, create data.frame with state names from USArrests
library(lattice)
df <- data.frame(State = rownames(USArrests), USArrests)
# Determine regression and mean murder rate outside of xyplot()
# However, these operations don't have to be done outside of the lattice function
fm <- lm(Murder ~ UrbanPop, df)
averageM <- mean(USArrests$Murder)
# Add a variable to the data.frame indicating the classification
df$type <- factor(ifelse(df$Murder < fitted(fm), "low", "high"))
# Plot via lattice with explicit panel() function
xyplot(Murder ~ UrbanPop, data = df,
panel = function(x, y, ...) {
panel.abline(fm, col = "red", lwd = 2)
# panel.lmline(x, y, col = "red", lwd = 2) # This would do the same
panel.abline(h = averageM, col = "red", lty = 2, lwd = 2)
# panel.abline(h = mean(y), col = "red", lty = 2, lwd = 2) # This would do the same
panel.text(x, y, labels = df$State, cex = y/10, col = c(2,4)[df$type])
}
)
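If the coloring should follow the horizontal mean line rather than the regression line, the classification can be made against the mean instead. The following is a sketch under that assumption; the column name side and the colors blue and red are arbitrary choices, and indexing df$State inside the panel function only works because there is a single panel.

```r
library(lattice)

df  <- data.frame(State = rownames(USArrests), USArrests)
avg <- mean(df$Murder)

# Classify each state relative to the mean murder rate
df$side <- factor(ifelse(df$Murder < avg, "below", "above"),
                  levels = c("below", "above"))

p <- xyplot(Murder ~ UrbanPop, data = df,
            main = "Murder vs. Urban Population",
            panel = function(x, y, ...) {
              # dashed horizontal line at the mean
              panel.abline(h = avg, lty = 2, lwd = 2)
              # blue for states below the line, red for states above it
              panel.text(x, y, labels = df$State, cex = y / 10,
                         col = c("blue", "red")[df$side])
            })
print(p)
```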

R: How to add normal distributions to overlapping grouped histograms with lattice

I've been searching for ways to make overlapping grouped histograms with the function 'histogram' in lattice, which I've found an answer to here.
histogram( ~Sepal.Length,
data = iris,
type = "p",
breaks = seq(4,8,by=0.2),
ylim = c(0,30),
groups = Species,
panel = function(...)panel.superpose(...,panel.groups=panel.histogram,
col=c("cyan","magenta","yellow"),alpha=0.4),
auto.key=list(columns=3,rectangles=FALSE,
col=c("cyan","magenta","yellow3"))
)
Now my question is if you could still add normal distributions for every group to this plot.
Possibly using this?
panel.mathdensity(dmath = dnorm, col = "black",
args = list(mean=mean(x),sd=sd(x)))
The end result should look similar to this: [image]
This is the closest I was able to get. The hint I used was here. My problem is that the density plot gets hidden behind the next histogram plot.
plot1 <- histogram( ~Sepal.Length,
data = iris,
type = "p",
ylim = c(0,30),
breaks = seq(4,8,by=0.2),
groups = Species,
col=c("cyan","magenta","yellow"),
panel = panel.superpose,
panel.groups = function(x, y, group.number, ...) {
# Same steps for every species: draw its histogram, then its normal curve
panel.histogram(x, ...)
panel.mathdensity(dmath = dnorm, args = list(mean = mean(x), sd = sd(x)), col = "black")
}
)
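One possible way around the hidden curves is to draw in two passes inside a single panel function: all histograms first, then all density curves on top. The sketch below makes two assumptions that are not in the code above: it switches to type = "density" so that the bars and dnorm share a scale, and it picks ylim = c(0, 1.5) by eye. The name cols is illustrative.

```r
library(lattice)

cols <- c("cyan", "magenta", "yellow")

p <- histogram(~ Sepal.Length, data = iris, groups = Species,
               type = "density", breaks = seq(4, 8, by = 0.2),
               ylim = c(0, 1.5),
               panel = function(x, groups, subscripts, breaks, ...) {
                 g <- groups[subscripts]   # species of each value in x
                 # Pass 1: semi-transparent histogram for each species
                 for (i in seq_len(nlevels(g))) {
                   panel.histogram(x[g == levels(g)[i]], breaks = breaks,
                                   type = "density", col = cols[i],
                                   alpha = 0.4)
                 }
                 # Pass 2: normal curves last, so no histogram covers them
                 for (i in seq_len(nlevels(g))) {
                   xi <- x[g == levels(g)[i]]
                   panel.mathdensity(dmath = dnorm, col = "black",
                                     args = list(mean = mean(xi),
                                                 sd = sd(xi)))
                 }
               })
print(p)
```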

Boxplot and xyplot overlapped

I've done a conditional boxplot with my data, with the bwplot function of the lattice library.
A1 <- bwplot(measure ~ month | plot , data = prueba,
strip = strip.custom(bg = 'white'),
cex = .8, layout = c(2, 2),
xlab = "Month", ylab = "Total",
par.settings = list(
box.rectangle = list(col = 1),
box.umbrella = list(col = 1),
plot.symbol = list(cex = .8, col = 1)),
scales = list(x = list(relation = "same"),
y = list(relation = "same")))
Then I made an xyplot, because I want to add the precipitation data to the previous graph, also using xyplot from the lattice library.
B1 <- xyplot(precip ~ month | plot, data=prueba,
type="b",
ylab = '% precip',
xlab = 'month',
strip = function(bg = 'white', ...)
strip.default(bg = 'white', ...),
scales = list(alternating = F,
x=list(relation = 'same'),
y=list(relation = 'same')))
I've tried to draw them on the same graph using grid.arrange from the gridExtra library:
grid.arrange(A1,B1)
But this does not overlay the data; the two plots are drawn separately.
How could I draw the precipitation data "inside" the boxplots, conditioned by plot?
Thank you
Using the barley data as Andrie did, another approach with latticeExtra:
library(lattice)
library(latticeExtra)
bwplot(yield ~ year | variety , data = barley, fill = "grey") +
xyplot(yield ~ year | variety , data = barley, col = "red")
You need to create a custom panel function. I demonstrate with the built-in barley data:
Imagine you want to create a simple bwplot and xyplot using the barley data. Your code might look like this:
library(lattice)
bwplot(yield ~ year | variety , data = barley)
xyplot(yield ~ year | variety , data = barley)
To combine the plots, you need to create a panel function that first plots the default panel.bwplot and then the panel.xyplot. Try this:
bwplot(yield ~ year | variety , data = barley,
panel = function(x, y, ...){
panel.bwplot(x, y, fill="grey", ...)
panel.xyplot(x, y, col="red", ...)
}
)
There is some information about doing this in the help for ?xyplot - scroll down to the details of the panel argument.
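The barley examples overlay the same variable twice. When the overlaid series is a different column, as with precip in the question, one option is to look it up inside the panel function through subscripts. The sketch below uses a made-up stand-in for the prueba data, and it assumes that precip is on the same scale as measure and that every month occurs in every panel; real precipitation data would probably need rescaling first.

```r
library(lattice)

# Hypothetical stand-in for the 'prueba' data in the question
set.seed(42)
prueba <- data.frame(month   = factor(rep(rep(1:6, each = 5), 2)),
                     plot    = factor(rep(c("P1", "P2"), each = 30)),
                     measure = runif(60, 0, 100))
prueba$precip <- 20 + 10 * as.integer(prueba$month)

p <- bwplot(measure ~ month | plot, data = prueba, subscripts = TRUE,
            panel = function(x, y, subscripts, ...) {
              panel.bwplot(x, y, fill = "grey", ...)
              # precip values for the rows shown in this panel
              pr <- prueba$precip[subscripts]
              # one mean per month, drawn at box positions 1, 2, ...
              mm <- tapply(pr, x, mean)
              panel.lines(seq_along(mm), as.numeric(mm),
                          type = "b", col = "red")
            })
print(p)
```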

How to add boxplots to scatterplot with jitter

I am using following commands to produce a scatterplot with jitter:
ddf = data.frame(NUMS = rnorm(500), GRP = sample(LETTERS[1:5],500,replace=T))
library(lattice)
stripplot(NUMS~GRP,data=ddf, jitter.data=T)
I want to add boxplots over these points (one for every group). I have searched, but I can't find code that plots all the points (not just the outliers) with jitter. How can I solve this? Thanks for your help.
Here's one way using base graphics.
boxplot(NUMS ~ GRP, data = ddf, lwd = 2, ylab = 'NUMS')
stripchart(NUMS ~ GRP, vertical = TRUE, data = ddf,
method = "jitter", add = TRUE, pch = 20, col = 'blue')
To do this in ggplot2, try:
library(ggplot2)
ggplot(ddf, aes(x=GRP, y=NUMS)) +
geom_boxplot(outlier.shape=NA) + #avoid plotting outliers twice
geom_jitter(position=position_jitter(width=.1, height=0))
Obviously you can adjust the width and height arguments of position_jitter() to your liking (although I'd recommend height=0 since height jittering will make your plot inaccurate).
I've written an R function called spreadPoints() within a package called basicPlotteR. The package can be installed directly into your R library using the following code:
install.packages("devtools")
library("devtools")
install_github("JosephCrispell/basicPlotteR")
For the example provided, I used the following code to generate the example figure below.
ddf = data.frame(NUMS = rnorm(500), GRP = sample(LETTERS[1:5],500,replace=T))
boxplot(NUMS ~ GRP, data = ddf, lwd = 2, ylab = 'NUMS')
spreadPointsMultiple(data=ddf, responseColumn="NUMS", categoriesColumn="GRP",
col="blue", plotOutliers=TRUE)
It is a work in progress (the lack of formula as input is clunky!) but it provides a non-random method to spread points on the X axis that doubles as a violin like summary of the data. Take a look at the source code, if you're interested.
For a lattice solution:
library(lattice)
ddf = data.frame(NUMS = rnorm(500), GRP = sample(LETTERS[1:5], 500, replace = T))
bwplot(NUMS ~ GRP, ddf, panel = function(...) {
panel.bwplot(..., pch = "|")
panel.xyplot(..., jitter.x = TRUE)})
The default median dot symbol was changed to a line with pch = "|". Other properties of the box and whiskers can be adjusted with box.umbrella and box.rectangle through the trellis.par.set() function. The amount of jitter can be adjusted through the factor argument of panel.xyplot(), which is passed on to jitter(); for example, factor = 1.5 spreads the points more widely than the default.
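As a sketch of those adjustments, the settings can also be supplied per plot through par.settings instead of a global trellis.par.set() call. The particular colors, factor = 1.5 and alpha = 0.4 are illustrative choices, and do.out = FALSE is added here so that outliers are not drawn twice.

```r
library(lattice)

set.seed(1)
ddf <- data.frame(NUMS = rnorm(500),
                  GRP  = sample(LETTERS[1:5], 500, replace = TRUE))

p <- bwplot(NUMS ~ GRP, ddf,
            # box and whisker styling, applied to this plot only
            par.settings = list(box.rectangle = list(col = "black"),
                                box.umbrella  = list(col = "black", lty = 1)),
            panel = function(...) {
              # median as a line; outliers left to the jittered points
              panel.bwplot(..., pch = "|", do.out = FALSE)
              # all observations, jittered horizontally
              panel.xyplot(..., jitter.x = TRUE, factor = 1.5,
                           col = "blue", alpha = 0.4)
            })
print(p)
```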
