Graphic of binary variable in R - r

I would like to plot a simple graphic. I have a dat set with n rowns and k columns, in which each row has a a sequence of 0 and 1. I would like to plot exactly this sequence for all rows.
Actually I want to reproduce the figure 24.1, p. 516, of Gelman and Hill's book (Data aAnalysis Using Regression and Multilevel/Hierarchical Models). I suspect that he made the graphic in Latex, but it seems quite ridiculous that I'm not able to repplicate this simple graphic in R. The figue is something like this. As you can see from the link, the "ones" are replaced by "S" and "zeros" by ".". It's a simple graphic, but it shows each individual response by time.

I would go with a formatted text output using sprintf. Much cleaner and simpler. If you still want a plot, you could go with the following:
Given matrix tbl containing your data:
tbl <- matrix(data=rep(0:1,25), nrow=5)
You can generate a plot as:
plot(1, 1, xlim=c(1,dim(tbl)[2]+.5), ylim=c(0.5,dim(tbl)[1]), type="n")
lapply(1:dim(tbl)[1], function(x) {
text(x=c(1:dim(tbl)[2]), y=rep(x,dim(tbl)[2]), labels=tbl[x,])
})
Using this as a base you can play around with the text and plot args to stylize the plot the way you wish.

Here are two possible solutions, based on fake data generated with this helper function:
generate.data <- function(rate=.3, dim=c(25,25)) {
tmp <- rep(".", prod(dim))
tmp[sample(1:prod(dim), ceiling(prod(dim)*rate))] <- "S"
m <- matrix(tmp, nr=dim[1], nc=dim[2])
return(m)
}
Text-based output
x <- generate.data()
rownames(x) <- colnames(x) <- 1:25
capture.output(as.table(x), file="res.txt")
The file res.txt include a pretty-printed version of the console output; you can convert it to pdf using any txt to pdf converter (I use the one from PDFlib). Here is a screenshot of the text file:
Image-based output
First, here is the plotting function I used:
make.table <- function(x, labels=NULL) {
# x = matrix
# labels = list of labels for x and y
coord.xy <- expand.grid(x=1:nrow(x), y=1:ncol(x))
opar <- par(mar=rep(1,4), las=1)
plot.new()
plot.window(xlim=c(0, ncol(x)), ylim=c(0, nrow(x)))
text(coord.xy$x, coord.xy$y, c(x), adj=c(0,1))
if (!is.null(labels)) {
mtext(labels[[1]], side=3, line=-1, at=seq(1, ncol(x)), cex=.8)
mtext(labels[[2]], side=2, line=-1, at=seq(1, nrow(x)), cex=.8, padj=1)
}
par(opar)
}
Then I call it as
make.table(x, list(1:25, 1:25))
and here is the result (save it as png, pdf, jpg, or whatever).

As far as I can see, this is a text table. I am wondering why you want to make it a graph? Anyway, quick solutions are (either way)
make the text table (by programming or typing) and make its screenshot and embed the image into the plot.
make a blank plot and put the text on the plot by programming R with "text" function. For more info on "text", refer to http://cran.r-project.org/doc/contrib/Lemon-kickstart/kr_adtxt.html

Related

Plot multiple columns saved in data frame with no x

My problem is multifaceted.
I would like to plot multiple columns saved in a data frame. Those columns do not have an x variable but would essentially be 1 to 101 consistent for all. I have seen that I can transfer them into long format but most ggplot options require an X. I tried zoo which does what I want it to, but the x-label is all jumbled and I am not aware of how to fix it. (Example of data below, and plot)
df <- zoo(HIP_131_Y0_LC_walk1[1:9])
plot(df)
I have multiple data frames saved in a list so ultimately would like to run a function and apply to all. The zoo function solves step one but I am not able to apply to all the data frames in the list.
graph<-lapply(myfiles,function(x) zoo(x) )
print(graph)
Ideally I would like to also mark minimum and maximum, which I am aware can be done with ggplot but not zoo.
Thank you so much for your help in advance
Assuming that the problem is overlapped panel names there are numerous solutions to this:
abbreviate the names using abbreviate. We show this for plot.zoo and autoplot.zoo .
put the panel name in the upper left. We show this for plot.zoo using a custom panel.
Use a header on each panel. We show this using xyplot.zoo and using ggplot.
The examples below use the test input in the Note at the end. (Next time please provide a complete example including all input in reproducible form.)
The first two examples below abbreviates the panel names and using plot.zoo and autoplot.zoo (which uses ggplot2). The third example uses xyplot.zoo (which uses lattice). This automatically uses headers and is probably the easiest solution.
library(zoo)
plot(z, ylab = abbreviate(names(z), 8))
library(ggplot2)
zz <- setNames(z, abbreviate(names(z), 8))
autoplot(zz)
library (lattice)
xyplot(z)
(click on plots to see expanded; continued after plots)
This fourth example puts the panel names in the upper left of the panel themselves using plot.zoo with a custom panel.
pnl <- function(x, y, ..., pf = parent.frame()) {
legend("topleft", names(z)[pf$panel.number], bty = "n", inset = -0.1)
lines(x, y)
}
plot(z, panel = pnl, ylab = "")
(click on plot to see it expanded)
We can also get headers with autoplot.zoo similar to in lattice above.
library(ggplot2)
autoplot(z, facets = ~ Series, col = I("black")) +
theme(legend.position = "none")
(click to expand; continued after graphics)
List
If you have a list of vectors L (see Note at end for a reproducible example of such a list) then this will produce a zoo object:
do.call("merge", lapply(L, zoo))
Note
Test input used above.
library(zoo)
set.seed(123)
nms <- paste0(head(state.name, 9), "XYZ") # long names
m <- matrix(rnorm(101*9), 101, dimnames = list(NULL, nms))
z <- zoo(m)
L <- split(m, col(m)) # test list using m in Note

How to save several plots from an own function into a list in R?

I have created one function that contains two types of plots and it will give you one image. However, the title of this image will change depending on one list, so, you will have several plots but different title.
(The original function will change the numbers that the plot uses but, in essence, is what I need).
This is the example that I have created.
list_genes <- c("GEN1", "GEN2", "GEN3")
myfunction <- function(x,y){
for(gene in list_genes){
# This to draw both plots
par(mfrow=c(2,1))
plot(x,y, main=paste0("Plot of ", gene))
hist(x, main=paste0("Plot of ", gene))
}
}
myfunction(x=c(1,5,6,2,4),y=c(6,10,53,1,5))
Since the list has 3 elements, we get 3 plots.
However, as I need to create a presentation with all the plots that I generate, I found this post with a solution to create slides for several plots inside a for loop. It is what I want, but for that, I need to save my plots into a list/variable.
object <- myfunction(x=c(1,5,6,2,4),y=c(6,10,53,1,5))
> object
NULL
I found this post (which gives you an interesting solution) but, the plots still cannot be saved into an object.
calling_myfunc <- function(){
myfunction(x=c(1,5,6,2,4),y=c(6,10,53,1,5))
}
calling_myfunc()
object <- calling_myfunc()
> object
NULL
My final objective is to create a presentation (automatically) with all of the plots that I generate from my function. As I saw in this post. But I need to save the plots into a variable.
Could anyone help me with this?
Thanks very much in advance
Although I couldn't find the way to save the plots into an object, I found a way to create a presentation with those images thanks to this post and the export package.
library(export)
list_genes <- c("GEN1", "GEN2", "GEN3")
myfunction <- function(x,y){
for(gene in list_genes){
# This to draw both plots
par(mfrow=c(2,1))
plot(x,y, main=paste0("Plot of ", gene))
hist(x, main=paste0("Plot of ", gene))
graph2ppt(file="plots.pptx", width=6, height=5,append=TRUE) } }
myfunction(x=c(1,5,6,2,4),y=c(6,10,53,1,5))
Of course, the width and height of the plots can be changed or put them as a parameter in the function.
Since the package is not available in CRAN for my current R version (4.1.2), I downloaded it from GitHub:
devtools::install_github("tomwenseleers/export")
In addition, I have found another package that I can use for the same purpose (although it adds one extra slide at the beginning, I don't know why)
library(eoffice)
list_genes <- c("GEN1", "GEN2", "GEN3")
myfunction <- function(x,y){
for(gene in list_genes){
# This to draw both plots
par(mfrow=c(2,1))
plot(x,y, main=paste0("Plot of ", gene))
hist(x, main=paste0("Plot of ", gene))
topptx(file="plots.pptx", width=6, height=5,append=TRUE)
}
}
myfunction(x=c(1,5,6,2,4),y=c(6,10,53,1,5))
PS: I found the solution to create a presentation --> How to create a presentation in R with several plots obtained by a function?

R statistical Programing

I am trying to write R codes for the histogram plot and save each histogram separate file using the following command.
I have a data set "Dummy" and i want to plot each histogram by a column name and there will be 100 histogram plots in total...
I have the following R codes that draws the each Histogram...
library(ggplot2)
i<-1
for(i in 1:100)
{
jpeg(file="d:/R Data/hist.jpeg", sep=",")
hist(Dummy$colnames<-1, ylab= "Score",ylim=c(0,3),col=c("blue"));
dev.off()
i++
if(i>100)
break()
}
As a start, let's get your for loop into R a little better by taking out the lines trying to change i, your for loop will do that for you.
We'll also include a file= value that changes with each loop run.
for(i in 1:100)
{
jpeg(file = paste0("d:/R Data/hist", i, ".jpeg"))
hist(Dummy[[i]], ylab = "Score", ylim = c(0, 3), col = "blue")
dev.off()
}
Now we just need to decide what you want to plot. Will each plot be different? How will each plot extract the data it needs?
EDIT: I've taken a stab at what you're trying to do. Are you trying to take each of 100 columns from the Dummy dataset? If so, Dummy[[i]] should achieve that (or Dummy[,i] if Dummy is a matrix).

r - Missing object when ggsave output as .svg

I'm attempting to step through a dataset and create a histogram and summary table for each factor and save the output as a .svg . The histogram is created using ggplot2 and the summary table using summary().
I have successfully used the code below to save the output to a single .pdf with each page containing the relevant histogram/table. However, when I attempt to save each histogram/table combo into a set of .svg images using ggsave only the ggplot histogram is showing up in the .svg. The table is just white space.
I've tried using dev.copy Cairo and svg but all end up with the same result: Histogram renders, but table does not. If I save the image as a .png the table shows up.
I'm using the iris data as a reproducible dataset. I'm not using R-Studio which I saw was causing some "empty plot" grief for others.
#packages used
library(ggplot2)
library(gridExtra)
library(gtable)
library(Cairo)
#Create iris histogram plot
iris.hp<-ggplot(data=iris, aes(x=Sepal.Length)) +
geom_histogram(binwidth =.25,origin=-0.125,
right = TRUE,col="white", fill="steelblue4",alpha=1) +
labs(title = "Iris Sepal Length")+
labs(x="Sepal Length", y="Count")
iris.list<-by(data = iris, INDICES = iris$Species, simplify = TRUE,FUN = function(x)
{iris.hp %+% x + ggtitle(unique(x$Species))})
#Generate list of data to create summary statistics table
sum.str<-aggregate(Sepal.Length~Species,iris,summary)
spec<-sum.str[,1]
spec.stats<-sum.str[,2]
sum.data<-data.frame(spec,spec.stats)
sum.table<-tableGrob(sum.data)
colnames(sum.data) <-c("species","sep.len.min","sep.len.1stQ","sep.len.med",
"sep.len.mean","sep. len.3rdQ","sep.len.max")
table.list<-by(data = sum.data, INDICES = sum.data$"species", simplify = TRUE,
FUN = function(x) {tableGrob(x)})
#Combined histogram and summary table across multiple plots
multi.plots<-marrangeGrob(grobs=(c(rbind(iris.list,table.list))),
nrow=2, ncol=1, top = quote(paste(iris$labels$Species,'\nPage', g, 'of',pages)))
#bypass the class check per #baptiste
ggsave <- ggplot2::ggsave; body(ggsave) <- body(ggplot2::ggsave)[-2]
#
for(i in 1:3){
multi.plots<-marrangeGrob(grobs=(c(rbind(iris.list[i],table.list[i]))),
nrow=2, ncol=1,heights=c(1.65,.35),
top = quote(paste(iris$labels$Species,'\nPage', g, 'of',pages)))
prefix<-unique(iris$Species)
prefix<-prefix[i]
filename<-paste(prefix,".svg",sep="")
ggsave(filename,multi.plots)
#dev.off()
}
Edit removed theme tt3 that #rawr referenced. It was accidentally left in example code. It was not causing the problem, just in case anyone was curious.
Edit: Removing previous answer regarding it working under 32bit install and not x64 install because that was not the problem. Still unsure what was causing the issue, but it is working now. Leaving the info about grid.export as it may be a useful alternative for someone else.
Below is the loop for saving the .svg's using grid.export(), although I was having some text formatting issues with this (different dataset).
for(i in 1:3){
multi.plots<-marrangeGrob(grobs=(c(rbind(iris.list[i],table.list[i]))),
nrow=2, ncol=1,heights=c(1.65,.35), top =quote(paste(iris$labels$Species,'\nPage', g,
'of',pages)))
prefix<-unique(iris$Species)
prefix<-prefix[i]
filename<-paste(prefix,".svg",sep="")
grid.draw(multi.plots)
grid.export(filename)
grid.newpage()
}
EDIT: As for using arrangeGrob per #baptiste's comment. Below is the updated code. I was incorrectly using the single brackets [] for the returned by list, so I switched to the correct double brackets [[]] and used grid.draw to on the ggsave call.
for(i in 1:3){
prefix<-unique(iris$Species)
prefix<-prefix[i]
multi.plots<-grid.arrange(arrangeGrob(iris.list[[i]],table.list[[i]],
nrow=2,ncol=1,top = quote(paste(iris$labels$Species))))
filename<-paste(prefix,".svg",sep="")
ggsave(filename,grid.draw(multi.plots))
}

In R, how to prevent blank page in pdf when using gridBase to embed subplot inside plot

As explained here, it is easy to embed a plot into an existing one thanks to gridBase, even though both plots use the base graphics system of R. However, when saving the whole figure into a pdf, the first page is always blank. How to prevent this?
Here is an example:
require(gridBase)
## generate dummy data
set.seed(1859)
x <- 1:100
y <- x + rnorm(100, sd=5)
ols <- lm(y ~ x)
pdf("test.pdf")
## draw the first plot
plot.new() # blank page also happens when using grid.newpage()
pushViewport(viewport())
plot(x, y)
## draw the second plot, embedded into the first one
pushViewport(viewport(x=.75,y=.35,width=.2,height=.2,just=c("center","center")))
par(plt=gridPLT(), new=TRUE)
hist(ols$residuals, main="", xlab="", ylab="")
popViewport(2)
dev.off()
I think it's a bit of a hack but setting onefile=FALSE worked on my machine:
pdf("test.pdf", onefile=FALSE)
In searching for an answer (which I didn't really find so much as stumbled upon in the forest) I came across this post to Rhelp from Paul Murrell who admits that mixing grid and base graphics is confusing even to the Master.
A work around solution I found was to initiate the pdf file inside the for loop; then insert an if clause to assess whether the first iteration is being run. When the current iteration is the first one, go ahead and create the output device using pdf(). Put the dev.off() after closing the for loop. An quick example follows:
for(i in 1:5){
if (i == 1) pdf(file = "test.pdf")
plot(rnorm(50, i, i), main = i)}
dev.off()

Resources