Storing plot in object - r

Is it possible to store a simple plot created using plot() in an R object?
Let's say I create a plot:
a<-rnorm(1000)
b<-rnorm(1000)
plot(a,type='b')
lines(b,col="blue")
I would like to store both the plot and the added line in an object.
Is this possible to do?

Something like this will save your plots in a list.
n <- 3
templist <- vector(mode = "list", n)
for(i in seq_along(templist)){
a <- rnorm(1000)
b <- rnorm(1000)
plot(a, type = 'b')
lines(b,col = "blue")
templist[[i]] <- recordPlot()
}
You can redisplay each plot by evaluating templist[[i]], where 1 <= i <= n.
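A recorded plot can also be redrawn explicitly with replayPlot(). The following is a minimal sketch, assuming a non-interactive session in which the display list has to be switched on by hand; the file-less pdf(NULL) device and the name saved are choices made only for this illustration.

```r
# Open a file-less device and enable the display list so that
# recordPlot() has something to capture (screen devices usually
# have it enabled already).
pdf(NULL)
dev.control(displaylist = "enable")

set.seed(1)
a <- rnorm(1000)
b <- rnorm(1000)
plot(a, type = "b")
lines(b, col = "blue")
saved <- recordPlot()   # stores the plot together with the added line
dev.off()

# Later: redraw the stored plot on whichever device is open now
pdf(NULL)
replayPlot(saved)
dev.off()
```

The object has class "recordedplot", so it can be kept in a list or passed to a function like any other R object.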

Related

Problem with theoretical density functions plotted to histograms in R

I did simulations in R and plotted the results in histograms. There is no problem adding probability density into the histograms with the density() function. But for some reason I get very strange results when I plot the theoretical densities to the histograms for comparison purposes. Here are two example codes and two pictures. The blue theoretical pdfs are way off and I do not know why. Could someone with R skills point out my mistakes?
# generating 10000 samples thrice from U(0,1) distribution
# and sorting them for the statistics:
n <- 10000
samples1 <- data.frame('0'=c(rep(NA,4)))
samples2 <- data.frame('0'=c(rep(NA,10)))
samples3 <- data.frame('0'=c(rep(NA,10)))
for (i in 1:n) {
new <- runif(4)
samples1[ , ncol(samples1) + 1] <- sort(new)
colnames(samples1)[ncol(samples1)] <- i
new <- runif(10)
samples2[ , ncol(samples2)+1] <- sort(new)
colnames(samples2)[ncol(samples2)] <- i
new <- runif(10)
samples3[ , ncol(samples3)+1] <- sort(new)
colnames(samples3)[ncol(samples3)] <- i
}
# dropping the first (useless) columns:
samples1 <- samples1[-c(1)]
samples2 <- samples2[-c(1)]
samples3 <- samples3[-c(1)]
# selecting the statistics from the samples:
# X_2:4
stat24 <- rep(NA,n)
for (i in 1:n) {
stat24[i] <- samples1[2,i]
}
# X_2:10
stat210 <- rep(NA,n)
for (i in 1:n) {
stat210[i] <- samples2[2,i]
}
# X_10:10
stat1010 <- rep(NA,n)
for (i in 1:n) {
stat1010[i] <- samples3[10,i]
}
# plotting the histograms and Beta pdfs:
hist(stat24, freq = FALSE)
lines(dbeta(stat24, 2, 3), col='blue')
lines(density(stat24), col='red')
The k-th order statistic of n uniform samples follows a Beta(k, n-k+1) distribution, which appears as the odd blue stroke on the right.
n <- 10000
random_variable_F <- rep(NA,n)
# generating 10000 samples of sizes 10 and 5 and computing F:
for (i in 1:n) {
x <- rnorm(10, mean = 10, sd = sqrt(5))
y <- rnorm(5, mean = 20, sd = sqrt(10))
# each sample variance is divided by its true variance (5 and 10)
random_variable_F[i] <- (var(x)/5)/(var(y)/10)
}
#head(random_variable_F)
# plotting the histogram:
hist(random_variable_F, freq = F)
lines(density(random_variable_F), col='red')
lines(df(random_variable_F, 9, 4), col='blue')
The random variable F follows an F-distribution. The parameters are the sample sizes minus one, in this case 10-1=9 and 5-1=4. The theoretical curve is quite wild:
If you pass a single vector to lines, it assumes that this is a vector of y values you want to plot. It plots the first y value at x = 1, the second y value at x = 2, etc, all the way up to x = length(y). In your case, random_variable_F is an unordered random variable, and you are just plotting its sequential values at 1:10000 along the x axis.
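The index-as-x behaviour can be checked without drawing anything: lines() passes its arguments through xy.coords(), which fills in 1, 2, ... for x when only one vector is supplied. A small sketch (the vector y_only is made up for the illustration):

```r
y_only <- c(0.7, 0.1, 0.4)

# One vector: it is treated as y, and x becomes its positions
single <- xy.coords(y_only)
single$x   # 1 2 3
single$y   # 0.7 0.1 0.4

# Two vectors: x and y are used as given
both <- xy.coords(sort(y_only), dnorm(sort(y_only)))
both$x     # 0.1 0.4 0.7
```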
Clearly, you want the function y = df(x) to be plotted, so you need to pass random_variable_F as the x values and df(random_variable_F) as the y values. You will also need to sort random_variable_F first to ensure the line is plotted from left to right:
hist(random_variable_F, freq = F)
lines(density(random_variable_F), col='red')
lines(sort(random_variable_F), df(sort(random_variable_F), 9, 4), col='blue')
Note that this doesn't happen when you plot lines(density(random_variable_F)) because density produces a list containing ordered x and y values rather than a single vector.
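An alternative that avoids sorting is curve(), which evaluates the density on its own regular grid of x values. Below is a sketch for the order-statistic example, under the assumption that the samples are independent U(0,1) draws; in that case the k-th order statistic of n draws follows Beta(k, n-k+1), so X_2:4 is Beta(2, 3). The name stat24_check is introduced here only for the illustration.

```r
set.seed(123)
# 10000 samples of size 4; keep the second smallest value of each
stat24_check <- replicate(10000, sort(runif(4))[2])

hist(stat24_check, freq = FALSE)
lines(density(stat24_check), col = "red")
# curve() supplies its own x grid, so nothing has to be sorted
curve(dbeta(x, 2, 3), from = 0, to = 1, add = TRUE, col = "blue")

# Sanity check: the mean of Beta(2, 3) is 2 / (2 + 3) = 0.4
mean(stat24_check)
```

The same pattern should work for the F example, e.g. curve(df(x, 9, 4), add = TRUE, col = "blue") after the histogram has been drawn.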

ggarrange and ggplot inside a for loop

This is a simple example where I generate 5 standard normal samples, each one with its own p value (just for the sake of the demonstration). I save them in a 4x50x5 array called X.
After that, I want to save 5 plots with 4 histograms each in a .pdf, and the following code does the job:
pvec <- 2^(2:5)
n <- pvec/2
j <- 5
size <- 50
X <- array(NA, dim=c(length(pvec), size, j))
for (k in 1:length(pvec)){
for (i in 1:j){
X[k,,i] <- rnorm(size)
}
}
pdf("grafic.pdf")
par(mfrow=c(2,2))
for (w in 1:j){
for (k in 1:length(pvec)){
hist(X[k,,w], freq = F, col = 'lightgreen',main = paste("p = ", pvec[k], ",n =", n[k]))
curve(dnorm(x,mean=0,sd=1), add=TRUE,col="blue")
}
}
dev.off()
Obtaining, for example
Let's say that I want to do this now, but with ggplot. I have to use ggarrange in place of par(mfrow). But ggarrange takes a list of plots (plotlist) as an argument, so inside the for loop I should have something like
graphlist <- NULL
for (w in 1:j){
for (k in 1:length(pvec)){
graphlist[k,,w] <- ggplot(data=data.frame(X), aes(x=X[k,,w])) +
geom_histogram()
}
}
ggarrange(plotlist = graphlist, ncol = 2, nrow = 2)
But of course this doesn't work. How can I do things like this, where I need to save the plots made by ggplot2 and then combine them with ggarrange? Thanks
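One possible approach, sketched under the assumption that the ggplot2 and ggpubr packages are installed: build a plain list with one ggplot per panel and hand each replicate's list to ggarrange(). The helper name make_hist and the output file name grafic_gg.pdf are made up for this illustration. Each panel gets its own small data frame, so the plot does not depend on the loop variables when it is finally drawn.

```r
library(ggplot2)
library(ggpubr)

pvec <- 2^(2:5)
n <- pvec / 2
j <- 5
size <- 50
X <- array(rnorm(length(pvec) * size * j), dim = c(length(pvec), size, j))

# Hypothetical helper: one histogram with the N(0, 1) density on top
make_hist <- function(values, title) {
  ggplot(data.frame(value = values), aes(x = value)) +
    geom_histogram(aes(y = after_stat(density)), bins = 10,
                   fill = "lightgreen", colour = "black") +
    stat_function(fun = dnorm, colour = "blue") +
    ggtitle(title)
}

pdf("grafic_gg.pdf")
for (w in seq_len(j)) {
  # One list of four plots per replicate w
  graphlist <- lapply(seq_along(pvec), function(k) {
    make_hist(X[k, , w], paste("p =", pvec[k], ", n =", n[k]))
  })
  # print() is needed inside a loop for the page to be drawn
  print(ggarrange(plotlist = graphlist, ncol = 2, nrow = 2))
}
dev.off()
```

A list is used instead of an array because ggplot objects are themselves lists and cannot be stored in the cells of a numeric array.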

Change color in basic plot function R when working with data of special class (stdCoxph)

I normally do all my plotting using ggplot, but now I am working with standardized survival curves using stdCoxph from the stdReg package. My question is: does anyone know how to change the color and thickness of these lines? I have tried col and lwd, but they do not change anything. See the reproducible example below:
#Load packages:
if(!require(stdReg)){install.packages("stdReg", dependencies=TRUE)}
library(stdReg)
library(survival)
n <- 1000
Z <- rnorm(n)
X <- rnorm(n, mean=Z)
Tm <- rexp(n, rate=exp(X+Z+X*Z)) #survival time
C <- rexp(n, rate=exp(X+Z+X*Z)) #censoring time
U <- pmin(Tm, C) #time at risk
D <- as.numeric(Tm < C) #event indicator
dd <- data.frame(Z, X, U, D)
fit <- coxph(formula=Surv(U, D)~ X*Z, data=dd, method="breslow")
# in R formulas ~X*Z will be interpreted as ~X+Z+X:Z
fit.std <- stdCoxph(fit=fit, data=dd, X="X", x=seq(-1,1,0.5), t=1:5)
print(summary(fit.std, t=3))
plot(fit.std)
col is hard coded, but the other plot parameters can be modified with par:
opar <- par(lwd = 3)
plot(fit.std)
par(opar) # reset back
To change the colors, redefine plot.stdCoxph with an added colors argument. The wrapper below makes a copy of plot.stdCoxph that calls local copies of lines and legend, which use the specified colors.
plot.stdCoxph <- function(x, ..., colors = seq_along(x)) {
lines <- function(x, ..., col) graphics::lines(x, ..., col = colors[col])
legend <- function(x, ..., col) graphics::legend(x, ..., col = colors[col])
plot.stdCoxph <- stdReg:::plot.stdCoxph
environment(plot.stdCoxph) <- environment()
plot.stdCoxph(x, ...)
}
plot(fit.std, colors = rainbow(5)) # test
You could consider contacting the maintainer of the package and asking whether they could provide similar functionality without this workaround.
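The mechanism behind the workaround is general: a copy of a function whose environment has been replaced looks up its helpers in the new environment first. Here is a toy sketch of the same idea without any plotting; shout and shout_with are made-up names standing in for the package function and the wrapper.

```r
# Stand-in for a package function with a hard-coded helper call
shout <- function(x) paste0(toupper(x), "!")

shout_with <- function(x, fun = base::toupper) {
  # Local definition that the copy below finds before base::toupper
  toupper <- function(x) fun(x)
  f <- shout
  environment(f) <- environment()  # helpers now resolve in this call frame
  f(x)
}

shout("hi")                      # "HI!"
shout_with("hi", fun = tolower)  # "hi!"
```

The default is written as base::toupper on purpose: default arguments are evaluated inside the function, where the name toupper refers to the local replacement.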

Package for Divide Chain of tesselations in R, spatstat package?

I am trying to create pretty figures of clustered points. Is there a package which will create the divide chain between tessellations of points? Ideally it would be fit for plotting in ggplot.
Here is some example code:
#DivideLineExample
library(spatstat)
W=owin(c(0,1),c(0,1)) # Set up the Window
p<-runifpoint(42, win=W) # Get random points
ll=cbind(p$x,p$y) # get lat/long for each point
zclust=kmeans(ll,centers=4) # Cluster the points spatially into 4 clusters
K<-pp<-D<-list()
plot(W,main="Clustered Points")
for (i in 1:4){ # this breaks up the points into separate ppp objects for each cluster
K[[i]]=ll[zclust$cluster==i,]
pp[[i]]=as.ppp(K[[i]],W)
plot(pp[[i]],col=i,add=TRUE,cex=1.5,pch=16)
D[[i]]=dirichlet(pp[[i]]) # This performs the Dirichlet Tessellation and plots
plot(D[[i]],col=i,add=TRUE)
}
This outputs as such:
http://imgur.com/CCXeOEB
What I'm looking for is this:
http://imgur.com/7nmtXjo
I know an algorithm exists.
Any ideas/alternatives?
I have written a function that I think will do what you want:
divchain <- function (X) {
stopifnot(is.ppp(X))
if(!is.multitype(X)) {
whinge <- paste(deparse(substitute(X)),
"must be a marked pattern with",
"factor valued marks.\n")
stop(whinge)
}
X <- unique(X, rule = "deldir", warn = TRUE)
w <- Window(X)
require(deldir)
dd <- deldir(X,z=marks(X),rw=c(w$xrange,w$yrange))
if (is.null(dd))
return(NULL)
ddd <- dd$dirsgs
sss <- dd$summary
z <- sss[["z"]]
rslt <- list()
nsgs <- nrow(ddd)
K <- 0
for (i in 1:nsgs) {
i1 <- ddd[i,5]
i2 <- ddd[i,6]
c1 <- z[i1]
c2 <- z[i2]
if(c1 != c2) {
K <- K+1
rslt[[K]] <- unlist(ddd[i,1:4])
}
}
class(rslt) <- "divchain"
attr(rslt,"rw") <- dd$rw
rslt
}
I have also written a plot method for class "divchain":
plot.divchain <- function(x,add=FALSE,...){
if(!add) {
rw <- attr(x,"rw")
plot(0,0,type="n",ann=FALSE,axes=FALSE,xlim=rw[1:2],ylim=rw[3:4])
bty <- list(...)$bty
box(bty=bty)
}
lapply(x,function(u){segments(u[1],u[2],u[3],u[4],...)})
invisible()
}
E.g.:
require(spatstat)
set.seed(42)
X <- runifpoint(50)
z <- factor(kmeans(with(X,cbind(x,y)),centers=4)$cluster)
marks(X) <- z
dcX <- divchain(X)
plot(dirichlet(X),border="brown",main="")
plot(X,chars=20,cols=1:4,add=TRUE)
plot(dcX,add=TRUE,lwd=3)
Let me know whether this is satisfactory. Sorry I can't help you with ggplot stuff; I don't do ggplot.
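For the ggplot side of the question, the segments can be stacked into a data frame and drawn with geom_segment(). This is only a sketch: the plotting part assumes ggplot2 is installed, and instead of real output it uses a small hand-made list (fake_chain) with the shape that divchain() returns, namely one named vector x1, y1, x2, y2 per segment.

```r
# Hand-made stand-in for a "divchain" object: x1, y1, x2, y2 per segment
fake_chain <- list(
  c(x1 = 0.0, y1 = 0.5, x2 = 0.4, y2 = 0.6),
  c(x1 = 0.4, y1 = 0.6, x2 = 1.0, y2 = 0.3),
  c(x1 = 0.4, y1 = 0.6, x2 = 0.5, y2 = 1.0)
)

# Stack the segments into a data frame with one row per segment
seg_df <- as.data.frame(do.call(rbind, fake_chain))

# Draw the segments only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(seg_df) +
    geom_segment(aes(x = x1, y = y1, xend = x2, yend = y2)) +
    coord_equal()
  print(p)
}
```

With a real object, do.call(rbind, unclass(dcX)) should give the same four-column layout, and the points themselves could be added as a further geom_point() layer.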
You could try a point-in-polygon test, for example with Kirkpatrick's data structure. It is much easier to divide the polygon horizontally or vertically. Source: http://www.personal.kent.edu/~rmuhamma/Compgeometry/MyCG/Voronoi/DivConqVor/divConqVor.htm

R: Draw a polygon with conditional colour

I want to colour the area under a curve. The area with y > 0 should be red, the area with y < 0 should be green.
x <- c(1:4)
y <- c(0,1,-1,2,rep(0,4))
plot(y[1:4],type="l")
abline(h=0)
Using ifelse() does not work:
polygon(c(x,rev(x)),y,col=ifelse(y>0,"red","green"))
What I achieved so far is the following:
polygon(c(x,rev(x)),y,col="green")
polygon(c(x,rev(x)),ifelse(y>0,y,0),col="red")
But then the red area is too large. Do you have any ideas how to get the desired result?
If you want two different colors, you need two different polygons. You can either call polygon multiple times, or you can add NA values in your x and y vectors to indicate a new polygon. R will not automatically calculate the intersection for you. You must do that yourself. Here's how you could draw that with different colors.
x <- c(1,2,2.5,NA,2.5,3,4)
y <- c(0,1,0,NA,0,-1,0)
#calculate color based on most extreme y value
g <- cumsum(is.na(x))
gc <- ifelse(tapply(y, g,
function(x) x[which.max(abs(x))])>0,
"red","green")
plot(c(1, 4),c(-1,1), type = "n")
polygon(x, y, col = gc)
abline(h=0)
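The crossing points do not have to be worked out by hand: they can be found by linear interpolation between consecutive points whose y values have opposite signs. Below is a sketch for the data in the question (the names cross_x, xx and yy are introduced here), which then fills the positive and negative parts separately.

```r
x <- 1:4
y <- c(0, 1, -1, 2)

# Segments whose end points lie on opposite sides of y = 0
i <- which(head(y, -1) * tail(y, -1) < 0)
# x position where each of those segments crosses y = 0
cross_x <- x[i] - y[i] * (x[i + 1] - x[i]) / (y[i + 1] - y[i])
cross_x   # 2.5 and 3.333...

# Add the crossings to the curve and sort everything by x
o  <- order(c(x, cross_x))
xx <- c(x, cross_x)[o]
yy <- c(y, rep(0, length(cross_x)))[o]

plot(x, y, type = "l")
# Clip the curve at zero and close each polygon along the axis
polygon(c(xx, rev(xx)), c(pmax(yy, 0), rep(0, length(xx))),
        col = "red", border = NA)
polygon(c(xx, rev(xx)), c(pmin(yy, 0), rep(0, length(xx))),
        col = "green", border = NA)
lines(x, y)
abline(h = 0)
```

Because the crossings are now vertices of the curve, pmax() and pmin() cut it exactly at the axis, so the red area no longer spills past the zero line.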
In the more general case, it might not be as easy to split a polygon into different regions. There seems to be some support for this type of operation in GIS packages, where this type of thing is more common. However, I've put together a somewhat general case that may work for simple polygons.
First, I define a closure that will define a cutting line. The function will take a slope and y-intercept for a line and will return the functions we need to cut a polygon.
getSplitLine <- function(m=1, b=0) {
force(m); force(b)
classify <- function(x,y) {
y >= m*x + b
}
intercepts <- function(x,y, class=classify(x,y)) {
w <- which(diff(class)!=0)
m2 <- (y[w+1]-y[w])/(x[w+1]-x[w])
b2 <- y[w] - m2*x[w]
ix <- (b2-b)/(m-m2)
iy <- ix*m + b
data.frame(x=ix,y=iy,idx=w+.5, dir=((rank(ix, ties="first")+1) %/% 2) %% 2 +1)
}
plot <- function(...) {
abline(b,m,...)
}
list(
intercepts=intercepts,
classify=classify,
plot=plot
)
}
Now we will define a function to actually split a polygon using the splitter we've just defined.
splitPolygon <- function(x, y, splitter) {
addnullrow <- function(x) if (!all(is.na(x[nrow(x),]))) rbind(x, NA) else x
rollup <- function(x,i=1) rbind(x[(i+1):nrow(x),], x[1:i,])
idx <- cumsum(is.na(x) | is.na(y))
polys <- split(data.frame(x=x,y=y)[!is.na(x),], idx[!is.na(x)])
r <- lapply(polys, function(P) {
x <- P$x; y<-P$y
side <- splitter$classify(x, y)
if(side[1] != side[length(side)]) {
ints <- splitter$intercepts(c(x,x[1]), c(y, y[1]), c(side, side[1]))
} else {
ints <- splitter$intercepts(x, y, side)
}
sideps <- lapply(unique(side), function(ss) {
pts <- data.frame(x=x[side==ss], y=y[side==ss],
idx=seq_along(x)[side==ss], dir=0)
mm <- rbind(pts, ints)
mm <- mm[order(mm$idx), ]
br <- cumsum(mm$dir!=0 & c(0,head(mm$dir,-1))!=0 &
c(0,diff(mm$idx))>1)
if (length(unique(br))>1) {
mm<-rollup(mm, sum(br==br[1]))
}
br <- cumsum(c(FALSE,abs(diff(mm$dir*mm$dir))==3))
do.call(rbind, lapply(split(mm, br), addnullrow))
})
pss<-rep(unique(side), sapply(sideps, nrow))
ps<-do.call(rbind, lapply(sideps, addnullrow))[,c("x","y")]
attr(ps, "side")<-pss
ps
})
pss<-unname(unlist(lapply(r, attr, "side")))
src <- rep(seq_along(r), sapply(r, nrow))
r <- do.call(rbind, r)
attr(r, "source")<-src
attr(r, "side")<-pss
r
}
The input is just the values of x and y as you would pass to polygon along with the cutter. It will return a data.frame with x and y values that can be used with polygon.
For example
x <- c(1,2,2.5,NA,2.5,3,4)
y <- c(1,-2,2,NA,-1,2,-2)
sl<-getSplitLine(0,0)
plot(range(x, na.rm=T),range(y, na.rm=T), type = "n")
p <- splitPolygon(x,y,sl)
g <- cumsum(c(F, is.na(head(p$y,-1))))
gc <- ifelse(attr(p,"side")[is.na(p$y)],
"red","green")
polygon(p, col=gc)
sl$plot(lty=2, col="grey")
This should work for simple concave polygons as well as with sloped lines. Here's another example:
x <- c(1,2,3,4,5,4,3,2)
y <- c(-2,2,1,2,-2,.5,-.5,.5)
sl<-getSplitLine(.5,-1.25)
plot(range(x, na.rm=T),range(y, na.rm=T), type = "n")
p <- splitPolygon(x,y,sl)
g <- cumsum(c(F, is.na(head(p$y,-1))))
gc <- ifelse(attr(p,"side")[is.na(p$y)],
"red","green")
polygon(p, col=gc)
sl$plot(lty=2, col="grey")
Right now things can get a bit messy when a vertex of the polygon falls directly on the splitting line. I may try to correct that in the future.
A faster, but not very accurate, solution is to split the data frame into a list according to a grouping variable (e.g. above = red and below = blue). This is a pretty nice workaround for rather big (I would say > 100 elements) datasets. For smaller datasets some discontinuity may be visible:
x <- 1:100
y1 <- sin(1:100/10)*0.8
y2 <- sin(1:100/10)*1.2
plot(x, y2, type='l')
lines(x, y1, col='red')
df <- data.frame(x=x, y1=y1, y2=y2)
df$pos_neg <- ifelse(df$y2-df$y1>0,1,-1) # above (1) or below (-1) average
# create the number for chunks to be split into lists:
df$chunk <- c(1,cumsum(abs(diff(df$pos_neg)))/2+1) # first element needs to be added
df$colors <- ifelse(df$pos_neg>0, "red","blue") # colors to be used for filling the polygons
# create lists to be plotted:
l <- split(df, df$chunk) # we should get 4 sub-lists
lapply(l, function(x) polygon(c(x$x,rev(x$x)),c(x$y2,rev(x$y1)),col=x$colors))
As I said, for smaller datasets some discontinuity may be visible if sharp changes occur between positive and negative areas, but if a horizontal line separates the two, or if more elements are plotted, this effect is negligible.
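The chunk numbering can also be written with rle(), which may be easier to read. Here is a sketch on the same kind of data, showing that both versions give the same ids; chunk_rle is a name introduced for the illustration.

```r
x <- 1:100
pos_neg <- ifelse(sin(x / 10) * 1.2 - sin(x / 10) * 0.8 > 0, 1, -1)

# Version from the answer: every sign flip changes pos_neg by 2
chunk <- c(1, cumsum(abs(diff(pos_neg))) / 2 + 1)

# Same ids from run lengths: one id per run of equal signs
runs <- rle(pos_neg)
chunk_rle <- rep(seq_along(runs$lengths), runs$lengths)

all(chunk == chunk_rle)   # TRUE
table(chunk_rle)          # four runs
```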

Resources