How to xyplot multiple lines in R

I would like to plot graphs with multiple lines in R like this:
2 lines
x axis is date
y axis is the log return
I have data in 3 vectors
print(class(TradeDate))
print(class(ArimaGarchCurve))
print(class(CompareCurve))
---------------------------------------------
[1] "factor"
[1] "numeric"
[1] "numeric"
I searched and found that xyplot may be useful, but I don't know how to use it. This is what I have tried:
pdf("Testing.pdf")
plotData <- data.frame(Date=TradeDate,
Arima=ArimaGarchCurve,
BuyHold=BuyHoldCurve)
print(xyplot(
Arima ~ Date,
data=plotData,
superpose=T,
col=c("darkred", "darkblue"),
lwd=2,
key=list(
text=list(
c("ARIMA+GARCH", "Buy & Hold")
),
lines=list(
lwd=2, col=c("darkred", "darkblue")
)
)
))
dev.off()
Here is the result:
Thank you very much.
dput(head(plotData,20))
structure(list(Date = structure(1:20, .Label = c("2001-12-03",
"2001-12-04", "2001-12-05", "2001-12-06", "2001-12-07", "2001-12-10",
"2001-12-11", "2001-12-12", "2001-12-13", "2001-12-14", "2001-12-17",
"2001-12-18", "2001-12-19", "2001-12-20", "2001-12-21", "2001-12-24",
"2001-12-25", "2001-12-26", "2001-12-27", "2001-12-28", "2001-12-31",
"2002-01-01", "2002-01-02", "2002-01-03", "2002-01-04", "2002-01-07",
"2019-05-22", "2019-05-23"), class = "factor"), Arima = c(-0.0134052258713131,
-0.00542641764174324, 0.0128513670753771, 0.0282761455973665,
0.0179931884968989, 0.0281714817318116, 0.0435962602538011, 0.0462004298658309,
0.0194592964361352, 0.0248069155406948, 0.032807001046888, 0.0381120657516546,
0.0381120657516546, 0.030090589527961, -0.0146168717909267, -0.00630652663076437,
-0.00630652663076437, -0.00630652663076437, 0.0100429785563596,
0.0100429785563596), BuyHold = c(-0.0134052258713131, -0.00542641764174324,
0.0128513670753771, 0.0282761455973665, 0.0384544388322794, 0.0281714817318116,
0.0125050470584384, 0.0151092166704679, -0.0116319167592278,
-0.0170082867113405, -0.0090082012051471, -0.00370313650038065,
-0.00370313650038065, -0.0117246127240743, -0.056432074042962,
-0.0481217288827996, -0.0481217288827996, -0.0481217288827996,
-0.0317722236956757, -0.0317722236956757)), row.names = c(NA,
20L), class = "data.frame")

I think that this could help:
library(lattice)
xyplot(
Arima + BuyHold ~ Date, # here you can add log() to the two ts
data=plotData,
superpose=T,
col=c("#cc0000", "#0073e6"), # similar colors
lwd=2,
key=list(
text = list(c("ARIMA+GARCH log", "Buy & Hold log")),
lines = list( lwd=2, col=c("#cc0000", "#0073e6")) # similar colors
), type=c("l","g") # lines and grid
)
If you want to reduce the number of ticks on the x axis, you can create your own tick positions and labels and pass them to the plot. The values below cover a single year; for the full time series you would compute the corresponding positions and labels:
x.tick.number <- 1
at <- seq(1, nrow(plotData), length.out=x.tick.number)
labels <- round(seq(2001, 2001, length.out=x.tick.number))
In the plot:
xyplot(
Arima + BuyHold ~ Date, # here you can add log() to the two ts
data=plotData,
superpose=T,
col=c("#cc0000", "#0073e6"),
lwd=2,
key=list(
text = list(c("ARIMA+GARCH log", "Buy & Hold log")),
lines = list( lwd=2, col=c("#cc0000", "#0073e6"))
), type=c("l","g"),
scales = list(x = list(at=at, labels=labels, rot=90))) # apply the custom ticks to the x axis only
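The tick positions above are written for a single year. A minimal sketch of how they could be computed for a longer series, assuming the dates are stored as a factor of ISO-formatted strings as in the question. The toy `plotData` and the names `yr` and `first_of_year` are made up for illustration:

```r
library(lattice)

# Toy stand-in for plotData (assumption: dates are a factor of "YYYY-MM-DD" strings)
set.seed(1)
dates <- seq(as.Date("2001-12-03"), as.Date("2003-06-30"), by = "day")
plotData <- data.frame(Date  = factor(format(dates)),
                       Arima = cumsum(rnorm(length(dates), sd = 0.01)))

# Year of each observation, and a flag for the first observation of each year
yr <- format(as.Date(as.character(plotData$Date)), "%Y")
first_of_year <- !duplicated(yr)

# A factor is plotted at its integer codes, so those codes are the tick positions
at     <- as.integer(plotData$Date)[first_of_year]
labels <- yr[first_of_year]

p <- xyplot(Arima ~ Date, data = plotData, type = c("l", "g"),
            scales = list(x = list(at = at, labels = labels, rot = 90)))
print(p)
```

This places one tick at the first observation of each year. Converting `Date` to a real `Date` object would make this bookkeeping unnecessary, since lattice then chooses date ticks itself.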

Both lattice and ggplot2 offer solutions. Regardless, as @davide suggests, "melting" your data, that is, converting it from a "wide" format to a "long" one, is very good practice. The values of interest are placed in a single variable, and a parallel factor is created to identify the group associated with each value.
This can be done in base R by several methods. The use of stack() is shown here. In addition, by converting the factor or character representation of the date into a Date object, the plotting routines in lattice and ggplot2 will do a better job managing axes labels for you.
df <- data.frame(Date = as.Date(plotData$Date), stack(plotData[2:3]))
names(df) # stack names the data 'values' and the grouping factor 'ind'
levels(df$ind) <- c("ARIMA+GARCH", "Buy & Hold") # simplifies legends
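Since several base R methods are mentioned, here is a sketch of one alternative to stack(), using reshape(). The data frame `wide` is a small toy stand-in with rounded values, not the real plotData, and `long` is a hypothetical name chosen so that `df` above is not overwritten:

```r
# Toy wide-format data (assumption: same column layout as plotData)
wide <- data.frame(Date    = as.Date("2001-12-03") + 0:4,
                   Arima   = c(-0.013, -0.005, 0.013, 0.028, 0.018),
                   BuyHold = c(-0.013, -0.005, 0.013, 0.028, 0.038))

# Wide to long: one 'values' column plus an 'ind' column naming the source series
long <- reshape(wide, direction = "long",
                varying = c("Arima", "BuyHold"), v.names = "values",
                timevar = "ind", times = c("Arima", "BuyHold"),
                idvar = "Date")
rownames(long) <- NULL          # drop the composite row names reshape() adds
long$ind <- factor(long$ind)    # grouping factor, as stack() would produce
```

The result has the same shape as the stack() version: one row per date and series, ready for `groups = ind`.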
Here's a somewhat simple plot with few additions for grid lines and legend (key):
xyplot(values ~ Date, data = df, groups = ind, type = c("g", "l"), auto.key = TRUE)
The plots can be customized with lattice through panel functions and elements in auto.key. Although using col = c("darkred", "darkblue") at the top level of the function would color the lines in the plot, passing it through the optional par.settings argument makes it available for the legend function.
xyplot(values ~ Date, data = df, groups = ind,
panel = function(...) {
panel.grid(h = -1, v = -1)
panel.refline(h = 0, lwd = 3)
panel.xyplot(..., type = "l")},
auto.key = list(points = FALSE, lines = TRUE, columns = 2),
par.settings = list(superpose.line = list(col = c("darkred", "darkblue"))))

Related

Fitting smooth through xyplot

This question seems simple, but I haven't been able to figure out how to do it. I'm trying to fit a smooth line through a longitudinal dataset, as illustrated in the following code:
library(nlme)
xyplot(conc ~ Time, data = Theoph, groups = Subject, type = c("l", "smooth"))
The output isn't quite what I'm after and there are multiple warnings. I would like to fit a smooth through the entire data. As a bonus, if anyone could also show how to do this using ggplot, that would be great.
To plot the individual Subjects as separate lines and points but add a single overall smooth, use either of the two lattice approaches shown below, or the classic graphics and zoo approach at the end. Note that the time points need to be ordered to produce the overall smooth, and that the nlme package is not needed. Also note that the code in the question produces no errors, only warnings.
1) trellis.focus/trellis.unfocus We can use trellis.focus/trellis.unfocus to add an overall smooth:
library(lattice)
xyplot(conc ~ Time, groups = Subject, data = Theoph, type = "o")
trellis.focus("panel", 1, 1)
o <- order(Theoph$Time)
panel.xyplot(Theoph[o, "Time"], Theoph[o, "conc"], type = "smooth", col = "red", lwd = 3)
trellis.unfocus()
2) panel function A second way is to define an appropriate panel function:
library(lattice)
o <- order(Theoph$Time)
xyplot(conc ~ Time, groups = Subject, data = Theoph[o, ], panel =
function(x, y, ..., subscripts, groups) {
for (lev in levels(groups)) {
ok <- groups == lev
panel.xyplot(x[ok], y[ok], type = "o", col = lev)
}
panel.xyplot(x, y, type = "smooth", col = "red", lwd = 3)
})
Either of these gives the following output. Note that the overall smooth is the thick red line.
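A more compact variant of the panel-function idea, offered as a sketch rather than as part of the answer above: lattice's panel.superpose() draws the per-Subject lines and panel.loess() adds a loess smooth over the pooled data. The smooth here uses panel.loess() defaults, so it may differ slightly from the `type = "smooth"` curve.

```r
library(lattice)

p <- xyplot(conc ~ Time, data = Theoph, groups = Subject,
            panel = function(x, y, ...) {
              # groups and subscripts arrive via ...; one line per Subject
              panel.superpose(x, y, ..., type = "o")
              # x and y hold all observations in the panel, so this smooth is pooled
              panel.loess(x, y, col = "red", lwd = 3)
            })
print(p)
```

No manual ordering is needed in this version, because panel.loess() evaluates the fit on its own grid of x values.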
3) zoo/classic graphics Here is a solution using the zoo package and classic graphics.
library(zoo)
Theoph.z <- read.zoo(Theoph[c("Subject", "Time", "conc")],
index = "Time", split = "Subject")
plot(na.approx(Theoph.z), screen = 1, col = 1:nlevels(Theoph$Subject))
o <- order(Theoph$Time)
lo <- loess(conc ~ Time, Theoph[o, ])
lines(fitted(lo) ~ Time, Theoph[o,], lwd = 3, col = "red")
You can use the latticeExtra package to add a smoother to your first trellis object:
library(nlme)
library(ggplot2)
library(lattice)
library(latticeExtra)
xyplot(conc ~ Time, data = Theoph, groups = Subject, type = "l") +
layer(panel.smoother(..., col = "steelblue"))
And here is the ggplot2 version of the same graph
ggplot(data = Theoph, aes(Time, conc)) +
geom_line(aes(colour = Subject)) +
geom_smooth(col = "steelblue")

R: How to add normal distributions to overlapping grouped histograms with lattice

I've been searching for ways to make overlapping grouped histograms with the function 'histogram' in lattice, which I've found an answer to here.
histogram( ~Sepal.Length,
data = iris,
type = "p",
breaks = seq(4,8,by=0.2),
ylim = c(0,30),
groups = Species,
panel = function(...)panel.superpose(...,panel.groups=panel.histogram,
col=c("cyan","magenta","yellow"),alpha=0.4),
auto.key=list(columns=3,rectangles=FALSE,
col=c("cyan","magenta","yellow3"))
)
Now my question is if you could still add normal distributions for every group to this plot.
Possibly using this?
panel.mathdensity(dmath = dnorm, col = "black",
args = list(mean=mean(x),sd=sd(x)))
The end result should look similar to this:
This is the closest I was able to get. The hint I used was here. My problem is that the density plot gets hidden behind the next histogram plot.
plot1 <- histogram( ~Sepal.Length,
data = iris,
type = "p",
ylim = c(0,30),
breaks = seq(4,8,by=0.2),
groups = Species,
col=c("cyan","magenta","yellow"),
panel = panel.superpose,
panel.groups = function(x,y, group.number,...){
specie <- levels(iris$Species)[group.number]
if(specie %in% "setosa"){
panel.histogram(x,...)
panel.mathdensity(dmath=dnorm,args = list(mean=mean(x), sd=sd(x)), col="black")
}
if(specie %in% "versicolor"){
panel.histogram(x,...)
panel.mathdensity(dmath=dnorm,args = list(mean=mean(x), sd=sd(x)), col="black")
}
if(specie %in% "virginica"){
panel.histogram(x,...)
panel.mathdensity(dmath=dnorm,args = list(mean=mean(x), sd=sd(x)), col="black")
}
}
)
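One possible way around the hidden curves, sketched under assumptions: draw all histograms first, then add every density curve in a second pass so that no later histogram covers them. Because the y axis is in percent per group, the normal density is rescaled by 100 times the bin width. The names `bw`, `cols`, `g` and `xs` are made up for this sketch:

```r
library(lattice)

bw   <- 0.2                              # bin width, matching the breaks below
cols <- c("cyan", "magenta", "yellow")

p <- histogram(~ Sepal.Length, data = iris, groups = Species,
               type = "percent", breaks = seq(4, 8, by = bw), ylim = c(0, 30),
               panel = function(x, groups, subscripts, ...) {
                 # First pass: one semi-transparent histogram per group
                 panel.superpose(x, groups = groups, subscripts = subscripts, ...,
                                 panel.groups = panel.histogram,
                                 col = cols, alpha = 0.4)
                 # Second pass: normal curves on top, rescaled from density to percent
                 g  <- groups[subscripts]
                 xs <- seq(4, 8, length.out = 200)
                 for (lev in levels(g)) {
                   xi <- x[g == lev]
                   panel.lines(xs, 100 * bw * dnorm(xs, mean(xi), sd(xi)),
                               col = "black")
                 }
               })
print(p)
```

panel.lines() is used instead of panel.mathdensity() so that the percent rescaling is explicit.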

Boxplot and xyplot overlapped

I've done a conditional boxplot with my data, with the bwplot function of the lattice library.
A1 <- bwplot(measure ~ month | plot , data = prueba,
strip = strip.custom(bg = 'white'),
cex = .8, layout = c(2, 2),
xlab = "Month", ylab = "Total",
par.settings = list(
box.rectangle = list(col = 1),
box.umbrella = list(col = 1),
plot.symbol = list(cex = .8, col = 1)),
scales = list(x = list(relation = "same"),
y = list(relation = "same")))
Then I made an xyplot, also with the lattice library, because I want to add the precipitation data to the previous graph.
B1 <- xyplot(precip ~ month | plot, data=prueba,
type="b",
ylab = '% precip',
xlab = 'month',
strip = function(bg = 'white', ...)
strip.default(bg = 'white', ...),
scales = list(alternating = F,
x=list(relation = 'same'),
y=list(relation = 'same')))
I've tried to draw them on the same graph using grid.arrange from the gridExtra library:
grid.arrange(A1,B1)
But this does not overlay the data; the two plots are drawn one above the other. The result is this:
How could I draw the precipitation data "inside" the boxplots, conditioned by plot?
Thank you
Using the barley data as Andrie did, another approach with latticeExtra:
library(lattice)
library(latticeExtra)
bwplot(yield ~ year | variety , data = barley, fill = "grey") +
xyplot(yield ~ year | variety , data = barley, col = "red")
You need to create a custom panel function. I demonstrate with the built-in barley data:
Imagine you want to create a simple bwplot and xyplot using the barley data. Your code might look like this:
library(lattice)
bwplot(yield ~ year | variety , data = barley)
xyplot(yield ~ year | variety , data = barley)
To combine the plots, you need to create a panel function that first plots the default panel.bwplot and then the panel.xyplot. Try this:
bwplot(yield ~ year | variety , data = barley,
panel = function(x, y, ...){
panel.bwplot(x, y, fill="grey", ...)
panel.xyplot(x, y, col="red", ...)
}
)
There is some information about doing this in the help for ?xyplot - scroll down to the details of the panel argument.

How to add boxplots to scatterplot with jitter

I am using following commands to produce a scatterplot with jitter:
ddf = data.frame(NUMS = rnorm(500), GRP = sample(LETTERS[1:5],500,replace=T))
library(lattice)
stripplot(NUMS~GRP,data=ddf, jitter.data=T)
I want to add boxplots over these points (one for every group). I tried searching, but I was not able to find code that plots all points (not just outliers) with jitter. How can I do this? Thanks for your help.
Here's one way using base graphics.
boxplot(NUMS ~ GRP, data = ddf, lwd = 2, ylab = 'NUMS')
stripchart(NUMS ~ GRP, vertical = TRUE, data = ddf,
method = "jitter", add = TRUE, pch = 20, col = 'blue')
To do this in ggplot2, try:
ggplot(ddf, aes(x=GRP, y=NUMS)) +
geom_boxplot(outlier.shape=NA) + #avoid plotting outliers twice
geom_jitter(position=position_jitter(width=.1, height=0))
Obviously you can adjust the width and height arguments of position_jitter() to your liking (although I'd recommend height=0 since height jittering will make your plot inaccurate).
I've written an R function called spreadPoints() within a package called basicPlotteR. The package can be installed directly into your R library using the following code:
install.packages("devtools")
library("devtools")
install_github("JosephCrispell/basicPlotteR")
For the example provided, I used the following code to generate the example figure below.
ddf = data.frame(NUMS = rnorm(500), GRP = sample(LETTERS[1:5],500,replace=T))
boxplot(NUMS ~ GRP, data = ddf, lwd = 2, ylab = 'NUMS')
spreadPointsMultiple(data=ddf, responseColumn="NUMS", categoriesColumn="GRP",
col="blue", plotOutliers=TRUE)
It is a work in progress (the lack of formula as input is clunky!) but it provides a non-random method to spread points on the X axis that doubles as a violin like summary of the data. Take a look at the source code, if you're interested.
For a lattice solution:
library(lattice)
ddf = data.frame(NUMS = rnorm(500), GRP = sample(LETTERS[1:5], 500, replace = T))
bwplot(NUMS ~ GRP, ddf, panel = function(...) {
panel.bwplot(..., pch = "|")
panel.xyplot(..., jitter.x = TRUE)})
The default median dot symbol was changed to a line with pch = "|". Other properties of the box and whiskers can be adjusted with box.umbrella and box.rectangle through the trellis.par.set() function. The amount of jitter can be adjusted through the factor argument of panel.xyplot(); for example, factor = 1.5 widens the jitter relative to the default.
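The adjustments just described can be sketched as follows. This version passes the settings through the par.settings argument of bwplot(), which applies them to this plot only, rather than calling trellis.par.set() globally. The colours, line widths and `do.out = FALSE` are illustrative choices, not part of the original answer:

```r
library(lattice)

set.seed(42)
ddf <- data.frame(NUMS = rnorm(500),
                  GRP  = sample(LETTERS[1:5], 500, replace = TRUE))

p <- bwplot(NUMS ~ GRP, data = ddf,
            par.settings = list(
              box.rectangle = list(col = "black", lwd = 2),  # box outline
              box.umbrella  = list(col = "black", lty = 1)), # whiskers
            panel = function(...) {
              # do.out = FALSE: outliers are already drawn by the jittered points
              panel.bwplot(..., pch = "|", do.out = FALSE)
              # factor controls the jitter width
              panel.xyplot(..., jitter.x = TRUE, factor = 1.5, col = "grey40")
            })
print(p)
```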

plotting from data frame with 2 fixed variables

Consider the following:
set.seed(1)
RandData <- rnorm(100,sd=20)
Locations <- rep(c('England','Wales'),each=50)
today <- Sys.Date()
dseq <- (seq(today, by = "1 days", length = 100))
Date <- as.POSIXct(dseq, format = "%Y-%m-%d")
Final <- cbind(Loc = Locations, Doy = as.numeric(format(Date,format = "%j")), Temp = RandData)
In this example how is it possible to produce two plots in the same figure window, where the first plot shows the temperature in England against Doy and the second shows temperature in Wales against Doy?
Note that your data is a character matrix. Better if the Final object is created via:
Final <- data.frame(Loc = Locations,
Doy = as.numeric(format(Date,format = "%j")),
Temp = RandData)
With that, the code below draws two plots in one window, side by side. I use the formula interface to plot() to make use of its subset argument, which works like the subset() function.
ylab <- "Temperature"
xlab <- "Day of year"
layout(matrix(1:2, ncol = 2))
plot(Temp ~ Doy, data = Final, subset = Loc == "England", main = "England",
ylab = ylab, xlab = xlab)
plot(Temp ~ Doy, data = Final, subset = Loc == "Wales", main = "Wales",
ylab = ylab, xlab = xlab)
layout(1)
Which produces this plot:
If you want them both on the same scale then we modify it a bit:
ylab <- "Temperature"
xlab <- "Day of year"
xlim <- with(Final, range(Doy))
ylim <- with(Final, range(Temp))
layout(matrix(1:2, ncol = 2))
plot(Temp ~ Doy, data = Final, subset = Loc == "England", main = "England",
ylab = ylab, xlab = xlab, xlim = xlim, ylim = ylim)
plot(Temp ~ Doy, data = Final, subset = Loc == "Wales", main = "Wales",
ylab = ylab, xlab = xlab, xlim = xlim, ylim = ylim)
layout(1)
which produces this version of the plot
For a line-plot you'd need to get the data in Doy order and then add type = "l" to the plot() calls.
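A sketch of that line-plot variant, assuming Final is the data frame version built with data.frame(). A fixed start date replaces Sys.Date() here so the example is reproducible, and each location is subset explicitly instead of through the subset argument:

```r
set.seed(1)
dseq  <- seq(as.Date("2024-03-01"), by = "1 day", length.out = 100)
Final <- data.frame(Loc  = rep(c("England", "Wales"), each = 50),
                    Doy  = as.numeric(format(dseq, "%j")),
                    Temp = rnorm(100, sd = 20))

# Sort by day of year within each location so the lines run left to right
Final <- Final[order(Final$Loc, Final$Doy), ]

layout(matrix(1:2, ncol = 2))
for (loc in c("England", "Wales")) {
  plot(Temp ~ Doy, data = Final[Final$Loc == loc, ], type = "l",
       main = loc, xlab = "Day of year", ylab = "Temperature")
}
layout(1)
```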
For completeness, @Justin has shown how to use one of the high-level plotting packages, ggplot2, to achieve something similar with less user effort. The lattice package is another major high-level plotting package in R. You can achieve the same plot using lattice via:
require(lattice)
xyplot(Temp ~ Doy | Loc, data = Final, type = c("l","p"))
The latter produces
Use type = "p" for just points and type = "l" for just lines. As you can see, the higher-level packages make producing these plots a bit easier than with the base graphics package.
By using cbind() to create your data, all columns are coerced to character. Use data.frame() instead:
Final <- data.frame(Loc = Locations,
Doy = as.numeric(format(Date,format = "%j")),
Temp = RandData)
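The coercion is easy to check on a small example. The two-row values below are made up for illustration:

```r
# cbind() builds a matrix, and a matrix can hold only one type
m <- cbind(Loc = c("England", "Wales"), Temp = c(3.2, -1.5))
is.character(m)   # TRUE: the numbers were converted to strings

# data.frame() keeps each column's own type
d <- data.frame(Loc = c("England", "Wales"), Temp = c(3.2, -1.5))
sapply(d, class)  # Temp stays numeric
```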
ggplot does things like this very nicely.
library(ggplot2)
ggplot(Final, aes(x=Doy, y=Temp)) + geom_path() + facet_wrap( ~ Loc)
Or you can use coloring:
ggplot(Final, aes(x=Doy, y=Temp, color=Loc)) + geom_path()
