I have a problem using loess and loess.smooth with a time series that has missing data.
Neither command works with this toy data.
x <- as.Date(c(1, 2, 4, 5, 6), origin="2010-1-1")
y <- c(4, 8, 8, 28, 11)
plot(x, y, ylim=c(1,30))
lines(loess(y ~ x), col="red")
lines(loess.smooth(y=y, x=x), col="blue")
I ended up using the following code:
# Data
x.1 <- as.Date(c(1, 2, 4, 5, 6), origin="2010-1-1")
x.2 <- c(1, 2, 4, 5, 6)
y <- c(4, 8, 8, 28, 11)
# x.2 - x is numeric variable
plot(x.2, y, ylim=c(1,30))
lines(loess(y ~ x.2, span=1.01), col="black", lwd=2, lty=2) # necessary to raise span above its default (0.75) to avoid warnings
lines(loess.smooth(x.2, y, span=1.01), col="orange", lwd=2) # necessary to raise span above its default (2/3) to avoid warnings
lines(smooth.spline(x.2,y), col="blue", lwd=2)
# x.1 - x is date variable
plot(x.1, y, ylim=c(1,30))
# loess() cannot deal with date variables, so convert the dates to numeric
lines(loess(y~as.numeric(x.1), span=1.01), col="red", lwd=2) # necessary to raise span above its default (0.75) to avoid warnings
lines(loess.smooth(x.1, y, span=1.01), col = "orange", lwd=2) # necessary to raise span above its default (2/3) to avoid warnings
lines(smooth.spline(x.1,y), col="blue", lwd=2)
The problems were:
(1) loess is unable to deal with date variables.
(2) The span parameter had to be adjusted (>1).
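A minimal sketch of one way to draw the fitted curve explicitly, rather than passing the model object straight to lines(). As far as I can tell, lines() on a loess object uses the x and y components stored in it, which are the raw data, so predicting on a grid is the safer route. The names x.num, fit, grid and pred are illustrative only, and span=1.01 is simply carried over from the code above.

```r
# Same toy data as above
x.1 <- as.Date(c(1, 2, 4, 5, 6), origin = "2010-01-01")
y   <- c(4, 8, 8, 28, 11)

# loess() needs a numeric predictor, so work with days since 1970-01-01
x.num <- as.numeric(x.1)
fit   <- loess(y ~ x.num, span = 1.01)

# Predict on a fine grid that also covers the missing day
grid <- seq(min(x.num), max(x.num), length.out = 50)
pred <- predict(fit, newdata = data.frame(x.num = grid))

plot(x.1, y, ylim = c(1, 30))
# A Date axis is numeric underneath, so the numeric grid can be used directly
lines(grid, pred, col = "red", lwd = 2)
```

The gap at day 3 needs no special handling here: loess only sees the x values that are present and interpolates across the gap.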
I need to add a "separating" line in a Base R boxplot to separate different groups. In the example below, I want to separate groups A and B (each having 2 levels) using a horizontal line (in red). R code for reproducible results:
dat = data.frame(A1 = rnorm(1000, 0, 1), A2 = rnorm(1000, 1, 2),
B1 = rnorm(1000, 0.5, 0.5), B2 = rnorm(1000, 1.5, 1.5))
boxplot(dat, horizontal = T, outline=F)
Is there an easy way to do this in Base R?
Also, is there an easy way to color the y-axis labels? I want to have A1 and B1 shown as red, and A2 and B2 shown as blue in the axis.
Thanks!
Use abline. To get the right position, take the mean of the axTicks of the y-axis.
To get the colored labels, first suppress the y-axis with yaxt="n", then rebuild the axis ticks with axis and the labels with mtext, also using axTicks.
b <- boxplot(dat, horizontal=T, outline=F, yaxt="n")
ats <- axTicks(2)
axis(2, labels=F)
mtext(b$names, 2, 1, col=c(2, 4), at=ats)
abline(h=mean(ats), lwd=2, col=2)
If you want the axis tick marks colored to match the labels, draw them with segments instead.
b <- boxplot(dat, horizontal=T, outline=F, yaxt="n")
ats <- axTicks(2)
abline(h=mean(ats), lwd=2, col=2)
pu <- par()$usr
Map(function(x, y) segments(pu[1] - .2, x, pu[1], x, xpd=T, col=y), ats, c(2, 4))
mtext(b$names, 2, 1, col=c(2, 4), at=ats)
Edit: To adjust the spacing a little more, use the at= option in boxplot and leave out the middle value of axTicks.
b <- boxplot(dat, horizontal=T, outline=F, yaxt="n", at=c(1, 2, 4, 5))
ats <- axTicks(2)[-3]
abline(h=mean(ats), lwd=2, col=2)
pu <- par()$usr
Map(function(x, y) segments(pu[1] - .2, x, pu[1], x, xpd=T, col=y), ats, c(2, 4))
mtext(b$names, 2, 1, col=c(2, 4), at=ats)
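As a variation on the last snippet, here is a sketch that skips axTicks and reuses the box positions directly, so the ticks, labels and separator cannot drift apart if the axis picks different tick locations. The names pos, cols and sep are illustrative, and the gap at position 3 is the same assumption as in the at= example above.

```r
set.seed(1)  # only so the example is reproducible
dat <- data.frame(A1 = rnorm(1000, 0, 1),     A2 = rnorm(1000, 1, 2),
                  B1 = rnorm(1000, 0.5, 0.5), B2 = rnorm(1000, 1.5, 1.5))

pos  <- c(1, 2, 4, 5)             # box positions, leaving a gap between groups A and B
cols <- rep(c("red", "blue"), 2)  # A1/B1 red, A2/B2 blue
sep  <- mean(pos[2:3])            # midpoint between the last A box and the first B box

b <- boxplot(dat, horizontal = TRUE, outline = FALSE, yaxt = "n", at = pos)
axis(2, at = pos, labels = FALSE)                         # ticks only
mtext(b$names, side = 2, line = 1, at = pos, col = cols)  # colored labels
abline(h = sep, lwd = 2, col = "red")                     # separating line
```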
I know how to split and fill areas of a polygon along a horizontal line, if the values are quite simple.
x <- 9:15
y1 <- c(5, 6, 5, 4, 5, 6, 5)
plot(x, y1, type="l")
abline(h=5, col="red", lty=2)
polygon(x[c(1:3, 5:7)], y1[c(1:3, 5:7)], col="green")
polygon(x[3:5], y1[3:5], col="red")
y2 <- c(5, 6, 4, 7, 5, 6, 5)
plot(x, y2, type="l")
abline(h=5, col="red", lty=2)
But how do I get the result if the values are a bit more skewed?
Expected output (photoshopped):
As pointed out by @Henrik in the comments, we can interpolate the missing points.
If the data is centered around another value than zero – as in my case – we need to adapt the method a little.
x <- 9:15
y2 <- c(5, 6, 4, 7, 5, 6, 5)
zp <- 5 # zero point
d <- data.frame(x, y=y2 - zp) # scale at zero point
# kohske's method
new_d <- do.call(rbind,
lapply(1:(nrow(d) - 1), function(i) { # lapply always returns a list, which rbind needs
f <- lm(x ~ y, d[i:(i + 1), ])
if (f$qr$rank < 2) return(NULL)
r <- predict(f, newdata=data.frame(y=0))
if(d[i, ]$x < r & r < d[i + 1, ]$x)
return(data.frame(x=r, y=0))
else return(NULL)
})
)
d2 <- rbind(d, new_d)
d2 <- transform(d2, y=y + zp) # descale
d2 <- unique(round(d2[order(d2$x), ], 4)) # get rid of duplicates
# plot
plot(d2, type="l")
abline(h=5, col="red", lty=2)
polygon(d2$x[c(1:3, 5:9)], d2$y[c(1:3, 5:9)], col="green")
polygon(d2$x[3:5], d2$y[3:5], col="red")
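The same crossing points can also be found with plain linear interpolation instead of fitting lm to each segment. This is only a sketch under the assumption that the line crosses the threshold strictly between two data points; points that sit exactly on the threshold are already in the data and need no extra point. The names h, s, i and xc are illustrative.

```r
x  <- 9:15
y2 <- c(5, 6, 4, 7, 5, 6, 5)
h  <- 5                                   # the horizontal threshold

s  <- y2 - h                              # signed distance from the threshold
i  <- which(s[-length(s)] * s[-1] < 0)    # segments with a strict sign change
# x position where segment i crosses y = h (linear interpolation)
xc <- x[i] + (h - y2[i]) * (x[i + 1] - x[i]) / (y2[i + 1] - y2[i])

# Merge the crossing points into the data and sort by x
d2 <- data.frame(x = c(x, xc), y = c(y2, rep(h, length(xc))))
d2 <- d2[order(d2$x), ]

plot(d2, type = "l")
abline(h = h, col = "red", lty = 2)
polygon(d2$x[c(1:3, 5:9)], d2$y[c(1:3, 5:9)], col = "green")
polygon(d2$x[3:5], d2$y[3:5], col = "red")
```

For this data the two crossings land at x = 10.5 and x = 11 + 1/3, so the polygon indices are the same as in the code above.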
Result
I have 2 data sets (DSA and DSB) that contain x and y coordinates:
tumor <- data.frame(DSA[, c("X_Parameter", "Y_Parameter")])
cells <- data.frame(DSB[, c("X_Parameter", "Y_Parameter")])
plot(cells, xlim=c(1,1300), ylim=c(1,1000), col="red")
par(new=TRUE)
plot(tumor, xlim=c(1,1300), ylim=c(1,1000), col="blue")
The plots make this graph.
I want to be able to draw a connecting line from every red dot to every blue dot.
Does anyone know if this can be done? Thanks.
Sample
DSA=(5,5 6,6 5,6 6,5) DSB=(1,1 10,10 10,1 1,10)
what the plot should look like
Brute-force, perhaps inelegant:
DSA <- data.frame(x = c(5, 6, 5, 6),
y = c(5, 6, 6, 5))
DSB <- data.frame(x = c(1, 10, 10, 1),
y = c(1, 10, 1, 10))
plot(y ~ x, DSB, col = "red")
points(DSA, col = "blue")
for (r in seq_len(nrow(DSA))) {
segments(DSA$x[r], DSA$y[r], DSB$x, DSB$y)
}
Edit: more directly:
nA <- nrow(DSA)
nB <- nrow(DSB)
plot(y ~ x, DSB, col = "red")
points(DSA, col = "blue")
segments(rep(DSA$x, each = nB), rep(DSA$y, each = nB),
rep(DSB$x, times = nA), rep(DSB$y, times = nA))
(I still can't figure out an elegant solution with @42's recommendation for combn or outer.)
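One possible middle ground, sketched here with expand.grid rather than combn or outer (so it is a swapped-in technique, not the one recommended in the comments): enumerate every (row of DSA, row of DSB) index pair once and let segments do the rest. The name idx is illustrative.

```r
DSA <- data.frame(x = c(5, 6, 5, 6),   y = c(5, 6, 6, 5))
DSB <- data.frame(x = c(1, 10, 10, 1), y = c(1, 10, 1, 10))

# Every combination of a DSA row index and a DSB row index
idx <- expand.grid(a = seq_len(nrow(DSA)), b = seq_len(nrow(DSB)))

plot(y ~ x, DSB, col = "red")
points(DSA, col = "blue")
# One segment per index pair: from the DSA point to the DSB point
segments(DSA$x[idx$a], DSA$y[idx$a], DSB$x[idx$b], DSB$y[idx$b])
```

This draws the same 16 segments as the rep() version; the index table just makes the pairing explicit.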
Suppose I want to plot an R function:
weibull <- function(ALPHA, LAMBDA, T){
ALPHA*LAMBDA*(T^(ALPHA-1))
}
So the function takes the arguments alpha, lambda and T. I want to generate two plots: in one, alpha=0.5, time ranges from 0 to 2 and lambda=1, 2, 4, 8, 16; in the other, alpha=1, time ranges from 0 to 2 and lambda=1, 2, 4, 8, 16.
In the past, for plotting functions with just one argument, I've used curve and then set add=TRUE if I wanted another curve on the same plot. So for instance, in the past I've used:
lambda <- 0.5
pdf <- function(x){
lambda*exp(-lambda*x)
}
survival <- function(x){
exp(-lambda*x)
}
plot(curve(pdf, 0, 6), type="l", ylim=c(0, 1), lwd=3, ylab="", xlab="", xaxs="i", yaxs="i", main=expression(paste("Exponential Distribution ", lambda, "=0.5")), cex.main=2, cex.axis=2, cex.lab=2)
curve(survival, 0, 6, add=TRUE, col="plum4", lwd=3)
But in this example the functions have just one argument, x, whereas now I want to vary LAMBDA, T and ALPHA. The curve function does not work and I am not sure how else to approach this.
If you use curve, you can specify an expression with a free variable x that will get replaced by the range of values specified in your from=/to= parameters. For example you can do
weibull <- function(ALPHA, LAMBDA, T){
ALPHA*LAMBDA*(T^(ALPHA-1))
}
lambda<-c(1, 2, 4, 8, 16)
col<-rainbow(length(lambda))
layout(matrix(1:2, nrow=1))
for(i in seq_along(lambda)) {
curve(weibull(.5, lambda[i], x), from=0, to=2, add=i!=1, col=col[i], ylim=c(0,50), main="alpha=.5")
}
legend(1,50,lambda, col=col, lty=1)
for(i in seq_along(lambda)) {
curve(weibull(1, lambda[i], x), from=0, to=2, add=i!=1, col=col[i], ylim=c(0,20), main="alpha=1")
}
which will produce a plot like
I'd do it with plyr and ggplot2:
weibull <- function(alpha, lambda, time){
data.frame(time = time, value = alpha*lambda*(time^(alpha-1)))
}
library(plyr)
library(ggplot2)
params <- expand.grid(lambda = c(1, 2, 4, 8, 16), alpha = c(0.5, 1))
all <- mdply(params, weibull, time = seq(0, 2, length=100))
ggplot(all, aes(time, value, colour=factor(lambda)))+
facet_wrap(~alpha,scales="free", ncol=2) + geom_line()
A tidyverse alternative:
weibull <- function(alpha, lambda, time){
data.frame(time = time, value = alpha*lambda*(time^(alpha-1)))
}
library(ggplot2)
library(tidyverse)
params <- tidyr::crossing(lambda = c(1, 2, 4, 8, 16), alpha = c(0.5, 1))
params %>%
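dplyr::mutate(data = purrr::pmap(., .f = weibull, time = seq(0, 2, length=100))) %>% # name the list-column
tidyr::unnest(data) %>% # unnest() without a column argument is deprecated in recent tidyr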
ggplot(aes(time, value, colour=factor(lambda)))+
facet_wrap(~alpha,scales="free", ncol=2) + geom_line()
This is similar to MrFlick's answer but shorter:
par(mfrow=1:2)
lapply(0:4, function(l) curve(weibull(0.5, 2^l, x), col=l+1, add=l!=0, ylim=c(0,50), xlim=c(0,2)))
lapply(0:4, function(l) curve(weibull(1, 2^l, x), col=l+1, add=l!=0, ylim=c(0,50), xlim=c(0,2)))
Ok if you're a big fan of nested lapply's you can also do:
lapply(c(0.5,1), function(a) lapply(0:4, function(l) curve(weibull(a, 2^l, x), col=l+1, add=l!=0, ylim=c(0,50), xlim=c(0,2))))
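Another base R option, sketched here as an alternative to repeated curve calls: evaluate the function on a time-by-lambda grid with outer and draw all columns at once with matplot. The names tt and h are illustrative, and starting the grid at 0.01 instead of 0 is my own assumption to avoid the infinite value that alpha=0.5 produces at time 0.

```r
weibull <- function(ALPHA, LAMBDA, T) {
  ALPHA * LAMBDA * (T^(ALPHA - 1))
}

lambda <- c(1, 2, 4, 8, 16)
tt     <- seq(0.01, 2, length.out = 200)  # start just above 0 (see note above)

par(mfrow = c(1, 2))
for (a in c(0.5, 1)) {
  # Rows are time points, columns are lambda values
  h <- outer(tt, lambda, function(t, l) weibull(a, l, t))
  matplot(tt, h, type = "l", lty = 1, col = seq_along(lambda),
          xlab = "time", ylab = "value", main = paste("alpha =", a))
}
legend("topright", legend = lambda, col = seq_along(lambda), lty = 1,
       title = "lambda")
```

matplot picks a common y range for all curves in a panel, so no ylim has to be guessed in advance.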
Does anybody know how to grab the single Cook's distance plot that you get from this code:
treatment <- factor(rep(c(1, 2), c(43, 41)), levels = c(1, 2), labels = c("placebo","treated"))
improved <- factor(rep(c(1, 2, 3, 1, 2, 3), c(29, 7, 7, 13, 7, 21)), levels = c(1, 2,3),labels = c("none", "some", "marked"))
numberofdrugs <- rpois(84, 5)+1
healthvalue <- rpois(84,5)
y <- data.frame(healthvalue, numberofdrugs, treatment, improved)
test <- glm(healthvalue~numberofdrugs+treatment+improved, y, family=poisson)
par(mfrow=c(2,2))
plot(test) # how to grab plot 2.1 ?
What I don't want is this
par(mfrow=c(1, 1))
plot(test, which=c(4))
because it doesn't have residuals on the y axis and leverage on the x axis!
Thanks guys
I'm not quite sure what your problem is. You seem to want the plot with residuals on the y axis and leverage on the x axis. Isn't that just the 5th (of 6) plots generated?
plot(test,which=5)
You can read more about this at ?plot.lm
Edit to address OP's question about setting y axis labels:
Usually, simply adding ylab="My Label" to the plot() call would work, but these graphs are designed to be produced "automatically", so certain graphical parameters are hard coded. If you pass your own ylab value, you'll get an error, because plot.lm() ends up with two ylab arguments and doesn't know which one to use. If you really don't like the y axis label, your only option here is to grab the plot.lm code (type 'plot.lm' at the console and hit enter; in recent R versions the method is not exported, so use 'stats:::plot.lm' or 'getAnywhere(plot.lm)' instead), copy and paste it into a text file, and look for this section:
if (show[5L]) {
ylab5 <- if (isGlm)
"Std. Pearson resid."
else "Standardized residuals"
r.w <- residuals(x, "pearson")
if (!is.null(w))
r.w <- r.w[wind]
rsp <- dropInf(r.w/(s * sqrt(1 - hii)), hii)
ylim <- range(rsp, na.rm = TRUE)
if (id.n > 0) {
ylim <- extendrange(r = ylim, f = 0.08)
show.rsp <- order(-cook)[iid]
}
and modify it with your own y axis label. Rename the function (say, plotLMCustomY, or something) and it should work.
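If copying plot.lm feels heavy-handed, a rough sketch of an alternative is to rebuild the panel from its ingredients with hatvalues and rstandard, which leaves the labels entirely in your hands. This is not what plot.lm does internally line for line: it omits the Cook's distance contours and the point labels, and the names lev and res are illustrative.

```r
set.seed(42)  # only so the example is reproducible
treatment <- factor(rep(c(1, 2), c(43, 41)), levels = c(1, 2),
                    labels = c("placebo", "treated"))
improved  <- factor(rep(c(1, 2, 3, 1, 2, 3), c(29, 7, 7, 13, 7, 21)),
                    levels = c(1, 2, 3), labels = c("none", "some", "marked"))
numberofdrugs <- rpois(84, 5) + 1
healthvalue   <- rpois(84, 5)
y    <- data.frame(healthvalue, numberofdrugs, treatment, improved)
test <- glm(healthvalue ~ numberofdrugs + treatment + improved,
            data = y, family = poisson)

lev <- hatvalues(test)                    # leverage (x axis)
res <- rstandard(test, type = "pearson")  # standardized Pearson residuals (y axis)

plot(lev, res, xlab = "Leverage", ylab = "My Label",
     main = "Residuals vs Leverage")
abline(h = 0, lty = 3, col = "gray")
lines(lowess(lev, res), col = "red")      # smoother, loosely mimicking plot.lm
```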