Automatically update graph title with parameter - r

I am not very familiar with R. I was using R to make the poisson distribution plot for different lambda (from 1 to 10), and display the plot for each just as a comparison.
But I would like to add a title say: "lambda = 1" for plot 1, "lambda=2" for plot 2 ... etc on the graph automatically according to lambda. But I wasn't able to figure out how to update the title automatically. This is my code, I was able to output 10 different graph correctly , but not sure how to update or add the corresponding lambda to the title automatically. Could someone give me some hint.
Also is it possible to say have a font size of "small" for the plot 1 to 5, and then a font size of 6 to 10?
Thanks
the_data_frame<-data.frame(matrix(ncol=10,nrow=21))
lam<-seq(1,10,1)
lam
x<-seq(0,20,1)
x
for (i in 1:10){
the_data_frame[i]<-exp(-lam[i])*lam[i]**x/gamma(x+1)
}
the_data_frame<-cbind(the_data_frame, x)
par(mfrow=c(5,2))
for (i in 1:10){
plot(the_data_frame[[i]]~the_data_frame[[11]], the_data_frame)
}

You can simplify the problem. Using one loop, over the lamda values, you compute at each iteration the value of y using the poison formula then you plot it. I use main argument to add a title for each plot. Here I am using bquote to get a plotmath format of lambda value.
For example , for 4 values of lambda , you get:
x<-seq(0,20,1);lam = c(0.5,1,2,4)
par(mfrow=c(2,2))
lapply(lam,function(lamd){
y <- exp(-lamd)*lamd*x/gamma(x+1)
plot(x,y,main=bquote(paste(lambda,'=',.(lamd))),type='l')
})

This might help:
for (i in 1:10){
plot(the_data_frame[[i]]~the_data_frame[[11]], the_data_frame,
main=paste("lambda=", i, sep=""))
}

library(ggplot2)
xval <- rep(0:20,10)
lambda <- rep(1:10,21)
yvtal <- exp(- lambda)*lambda**xval/gamma(xval+1)
the_new_data_frame <- data.frame(cbind(xval,lambda,yval))
plot1 <- ggplot(the_new_data_frame, aes(xval, yval)) + geom_line(aes(colour=factor(lambda)))
plot1
plot1 + facet_grid(~lambda)

Were you looking for an interactive window where you can input the text and update the figure title? If yes you may want to look for the tcltk package.
See
http://bioinf.wehi.edu.au/~wettenhall/RTclTkExamples/modalDialog.html

Related

In R, how can I tell if the scales on a ggplot object are log or linear?

I have many ggplot objects where I wish to print some text (varies from plot to plot) in the same relative position on each plot, regardless of scale. What I have come up with to make it simple is to
define a rescale function (call it sx) to take the relative position I want and return that position on the plot's x axis.
sx <- function(pct, range=xr){
position <- range[1] + pct*(range[2]-range[1])
}
make the plot without the text (call it plt)
Use the ggplot_build function to find the x scale's range
xr <- ggplot_build(plt)$layout$panel_params[[1]]$x.range
Then add the text to the plot
plt <- plt + annotate("text", x=sx(0.95), ....)
This works well for me, though I'm sure there are other solutions folks have derived. I like the solution because I only need to add one step (step 3) to each plot. And it's a simple modification to the annotate command (x goes to sx(x)).
If someone has a suggestion for a better method I'd like to hear about it. There is one thing about my solution though that gives me a little trouble and I'm asking for a little help:
My problem is that I need a separate function for log scales, (call it lx). It's a bit of a pain because every time I want to change the scale I need to modify the annotate commands (change sx to lx) and occasionally there are many. This could easily be solved in the sx function if there was a way to tell what the type of scale was. For instance, is there a parameter in ggplot_build objects that describe the log/lin nature of the scale? That seems to be the best place to find it (that's where I'm pulling the scale's range) but I've looked and can not figure it out. If there was, then I could add a command to step 3 above to define the scale type, and add a tag to the sx function in step 1. That would save me some tedious work.
So, just to reiterate: does anyone know how to tell the scaling (type of scale: log or linear) of a ggplot object? such as using the ggplot_build command's object?
Suppose we have a list of pre-build plots:
linear <- ggplot(iris, aes(Sepal.Width, Sepal.Length, colour = Species)) +
geom_point()
log <- linear + scale_y_log10()
linear <- ggplot_build(linear)
log <- ggplot_build(log)
plotlist <- list(a = linear, b = log)
We can grab information about their position scales in the following way:
out <- lapply(names(plotlist), function(i) {
# Grab plot, panel parameters and scales
plot <- plotlist[[i]]
params <- plot$layout$panel_params[[1]]
scales <- plot$plot$scales$scales
# Only keep (continuous) position scales
keep <- vapply(scales, function(x) {
inherits(x, "ScaleContinuousPosition")
}, logical(1))
scales <- scales[keep]
# Grab relevant transformations
out <- lapply(scales, function(scale) {
data.frame(position = scale$aesthetics[1],
# And now for the actual question:
transformation = scale$trans$name,
plot = i)
})
out <- do.call(rbind, out)
# Grab relevant ranges
ranges <- params[paste0(out$position, ".range")]
out$min <- sapply(ranges, `[`, 1)
out$max <- sapply(ranges, `[`, 2)
out
})
out <- do.call(rbind, out)
Which will give us:
out
position transformation plot min max
1 x identity a 1.8800000 4.520000
2 y identity a 4.1200000 8.080000
3 y log-10 b 0.6202605 0.910835
4 x identity b 1.8800000 4.520000
Or if you prefer a straightforward answer:
log$plot$scales$scales[[1]]$trans$name
[1] "log-10"

Save plots as R objects and displaying in grid

In the following reproducible example I try to create a function for a ggplot distribution plot and saving it as an R object, with the intention of displaying two plots in a grid.
ggplothist<- function(dat,var1)
{
if (is.character(var1)) {
var1 <- which(names(dat) == var1)
}
distribution <- ggplot(data=dat, aes(dat[,var1]))
distribution <- distribution + geom_histogram(aes(y=..density..),binwidth=0.1,colour="black", fill="white")
output<-list(distribution,var1,dat)
return(output)
}
Call to function:
set.seed(100)
df <- data.frame(x = rnorm(100, mean=10),y =rep(1,100))
output1 <- ggplothist(dat=df,var1='x')
output1[1]
All fine untill now.
Then i want to make a second plot, (of note mean=100 instead of previous 10)
df2 <- data.frame(x = rep(1,1000),y = rnorm(1000, mean=100))
output2 <- ggplothist(dat=df2,var1='y')
output2[1]
Then i try to replot first distribution with mean 10.
output1[1]
I get the same distibution as before?
If however i use the information contained inside the function, return it back and reset it as a global variable it works.
var1=as.numeric(output1[2]);dat=as.data.frame(output1[3]);p1 <- output1[1]
p1
If anyone can explain why this happens I would like to know. It seems that in order to to draw the intended distribution I have to reset the data.frame and variable to what was used to draw the plot. Is there a way to save the plot as an object without having to this. luckly I can replot the first distribution.
but i can't plot them both at the same time
var1=as.numeric(output2[2]);dat=as.data.frame(output2[3]);p2 <- output2[1]
grid.arrange(p1,p2)
ERROR: Error in gList(list(list(data = list(x = c(9.66707664902549, 11.3631137069225, :
only 'grobs' allowed in "gList"
In this" Grid of multiple ggplot2 plots which have been made in a for loop " answer is suggested to use a list for containing the plots
ggplothist<- function(dat,var1)
{
if (is.character(var1)) {
var1 <- which(names(dat) == var1)
}
distribution <- ggplot(data=dat, aes(dat[,var1]))
distribution <- distribution + geom_histogram(aes(y=..density..),binwidth=0.1,colour="black", fill="white")
plot(distribution)
pltlist <- list()
pltlist[["plot"]] <- distribution
output<-list(pltlist,var1,dat)
return(output)
}
output1 <- ggplothist(dat=df,var1='x')
p1<-output1[1]
output2 <- ggplothist(dat=df2,var1='y')
p2<-output2[1]
output1[1]
Will produce the distribution with mean=100 again instead of mean=10
and:
grid.arrange(p1,p2)
will produce the same Error
Error in gList(list(list(plot = list(data = list(x = c(9.66707664902549, :
only 'grobs' allowed in "gList"
As a last attempt i try to use recordPlot() to record everything about the plot into an object. The following is now inside the function.
ggplothist<- function(dat,var1)
{
if (is.character(var1)) {
var1 <- which(names(dat) == var1)
}
distribution <- ggplot(data=dat, aes(dat[,var1]))
distribution <- distribution + geom_histogram(aes(y=..density..),binwidth=0.1,colour="black", fill="white")
plot(distribution)
distribution<-recordPlot()
output<-list(distribution,var1,dat)
return(output)
}
This function will produce the same errors as before, dependent on resetting the dat, and var1 variables to what is needed for drawing the distribution. and similarly can't be put inside a grid.
I've tried similar things like arrangeGrob() in this question "R saving multiple ggplot2 plots as R-object in list and re-displaying in grid " but with no luck.
I would really like a solution that creates an R object containing the plot, that can be redrawn by itself and can be used inside a grid without having to reset the variables used to draw the plot each time it is done. I would also like to understand wht this is happening as I don't consider it intuitive at all.
The only solution I can think of is to draw the plot as a png file, saved somewhere and then have the function return the path such that i can be reused - is that what other people are doing?.
Thanks for reading, and sorry for the long question.
Found a solution
How can I reference the local environment within a function, in R?
by inserting
localenv <- environment()
And referencing that in the ggplot
distribution <- ggplot(data=dat, aes(dat[,var1]),environment = localenv)
made it all work! even with grid arrange!

Assigning "beanplot" object to variable in R

I have found that the beanplot is the best way to represent my data. I want to look at multiple beanplots together to visualize my data. Each of my plots contains 3 variables, so each one looks something like what would be generated by this code:
library(beanplot)
a <- rnorm(100)
b <- rnorm(100)
c <- rnorm(100)
beanplot(a, b ,c ,ylim = c(-4, 4), main = "Beanplot",
col = c("#CAB2D6", "#33A02C", "#B2DF8A"), border = "#CAB2D6")
(Would have just included an image but my reputation score is not high enough, sorry)
I have 421 of these that I want to put into one long PDF (EDIT: One plot per page is fine, this was just poor wording on my part). The approach I have taken was to first generate the beanplots in a for loop and store them in a list at each iteration. Then I will use the multiplot function (from the R Cookbook page on multiplot) to display all of my plots on one long column so I can begin my analysis.
The problem is that the beanplot function does not appear to be set up to assign plot objects as a variable. Example:
library(beanplot)
a <- rnorm(100)
b <- rnorm(100)
plot1 <- beanplot(a, b, ylim = c(-5,5), main = "Beanplot",
col = c("#CAB2D6", "#33A02C", "#B2DF8A"), border = "#CAB2D6")
plot1
If you then type plot1 into the R console, you will get back two of the plot parameters but not the plot itself. This means that when I store the plots in the list, I am unable to graph them with multiplot. It will simply return the plot parameters and a blank plot.
This behavior does not seem to be the case with qplot for example which will return a plot when you recall the stored plot. Example:
library(ggplot2)
a <- rnorm(100)
b <- rnorm(100)
plot2 <- qplot(a,b)
plot2
There is no equivalent to the beanplot that I know of in ggplot. Is there some sort of workaround I can use for this issue?
Thank you.
You can simply open a PDF device with pdf() and keep the default parameter onefile=TRUE. Then call all your beanplot()s, one after the other. They will all be in one PDF document, each one on a separate page. See here.

function lines() is not working

I have a problem with the function lines.
this is what I have written so far:
model.ew<-lm(Empl~Wage)
summary(model.ew)
plot(Empl,Wage)
mean<-1:500
lw<-1:500
up<-1:500
for(i in 1:500){
mean[i]<-predict(model.ew,data.frame(Wage=i*100),interval="confidence",level=0.90)[1]
lw[i]<-predict(model.ew,data.frame(Wage=i*100),interval="confidence",level=0.90)[2]
up[i]<-predict(model.ew,data.frame(Wage=i*100),interval="confidence",level=0.90)[3]
}
plot(Wage,Empl)
lines(mean,type="l",col="red")
lines(up,type="l",col="blue")
lines(lw,type="l",col="blue")
my problem i s that no line appears on my plot and I cannot figure out why.
Can somebody help me?
You really need to read some introductory manuals for R. Go to this page, and select one that illustrates using R for linear regression: http://cran.r-project.org/other-docs.html
First we need to make some data:
set.seed(42)
Wage <- rnorm(100, 50)
Empl <- Wage + rnorm(100, 0)
Now we run your regression and plot the lines:
model.ew <- lm(Empl~Wage)
summary(model.ew)
plot(Empl~Wage) # Note. You had the axes flipped here
Your first problem was that you flipped the axes. The dependent variable (Empl) goes on the vertical axis. That is the main reason you didn't get any lines on the plot. To get the prediction lines requires no loops at all and only a single plot call using matlines():
xval <- seq(min(Wage), max(Wage), length.out=101)
conf <- predict(model.ew, data.frame(Wage=xval),
interval="confidence", level=.90)
matlines(xval, conf, col=c("red", "blue", "blue"))
That's all there is to it.

add labels to lattice barchart

I would like to place the value for each bar in barchart (lattice) at the top of each bar. However, I cannot find any option with which I can achieve this. I can only find options for the axis.
Create a custom panel function, e.g.
library("lattice")
p <- barchart((1:10)^2~1:10, horiz=FALSE, ylim=c(0,120),
panel=function(...) {
args <- list(...)
panel.text(args$x, args$y, args$y, pos=3, offset=1)
panel.barchart(...)
})
print(p)
I would have suggested using the new directlabels package, which can be used with both lattice and ggplot (and makes life very easy for these labeling problems), but unfortunately it doesn't work with barcharts.
Since I had to do this anyway, here's a close-enough-to-figure it out code sample along the lines of what #Alex Brown suggests (scores is a 2D array of some sort, which'll get turned into a grouped vector):
barchart(scores, horizontal=FALSE, stack=FALSE,
xlab='Sample', ylab='Mean Score (max of 9)',
auto.key=list(rectangles=TRUE, points=FALSE),
panel=function(x, y, box.ratio, groups, errbars, ...) {
# We need to specify groups because it's not actually the 4th
# parameter
panel.barchart(x, y, box.ratio, groups=groups, ...)
x <- as.numeric(x)
nvals <- nlevels(groups)
groups <- as.numeric(groups)
box.width <- box.ratio / (1 + box.ratio)
for(i in unique(x)) {
ok <- x == i
width <- box.width / nvals
locs <- i + width * (groups[ok] - (nvals + 1)/2)
panel.arrows(locs, y[ok] + 0.5, scores.ses[,i], ...)
}
} )
I haven't tested this, but the important bits (the parts determining the locs etc. within the panel function) do work. That's the hard part to figure out. In my case, I was actually using panel.arrows to make errorbars (the horror!). But scores.ses is meant to be an array of the same dimension as scores.
I'll try to clean this up later - but if someone else wants to, I'm happy for it!
If you are using the groups parameter you will find the labels in #rcs's code all land on top of each other. This can be fixed by extending panel.text to work like panel.barchart, which is easy enough if you know R.
I can't post the code of the fix here for licencing reasons, sorry.

Resources