How do I loop a qqplot in ggplot2? - r

I am trying to create a function that loops through the columns of my dataset and saves a qq-plot of each of my variables. I have spent a lot of time looking for a solution, but I am an R novice and haven't been able to successfully apply any answers to my data. Can anyone see what I am doing wrong?
The error I am given is this: "Error in eval(expr, envir, enclos) : object 'i' not found"
library(ggplot2)
QQPlot <- function(x, na.rm = TRUE, ...) {
nm <- names(x)
for (i in names(mybbs)) {
plots <-ggplot(mybbs, aes(sample = nm[i])) +
stat_qq()
ggsave(plots, filename = paste(nm[i], ".png", sep=""))
}
}
QQPlot(mybbs)

The error happens because you are trying to pass a string as a variable name. Use aes_string() instead of aes().
Moreover, you are looping over names, not indexes; nm[i] would work with something like for (i in seq_along(names(x))), but not with your current loop. You would be better off replacing every nm[i] with i in the function, since what you want is the variable name.
Finally, you use mybbs instead of x inside the function. That means it will not work properly with any other data.frame.
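To make the names-versus-indexes point concrete, here is a small base-R sketch. The vector nm and its values are made up for illustration and are not taken from the question's data.

```r
nm <- c("height", "weight")  # hypothetical column names

# Looping over names: i is already the name, and nm has no names of its own,
# so indexing nm by a string finds nothing and returns NA.
for (i in nm) print(nm[i])

# Looping over positions: nm[i] is the name at position i.
for (i in seq_along(nm)) print(nm[i])
```

So with for (i in names(x)) the loop variable can be used directly as the column name.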
Here is a solution to those three problems:
QQPlot <- function(x, na.rm = TRUE, ...) {
for (i in names(x)) {
plots <-ggplot(x, aes_string(sample = i)) +
stat_qq()
#print(plots)
ggsave(plots, filename = paste(i, ".png", sep=""))
}
}
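A note for newer ggplot2 versions: aes_string() has been soft-deprecated since ggplot2 3.0.0, and the .data pronoun is the usual replacement. The following is only a sketch of the same loop in that style. The function name qq_plot_all, the dir argument, the plot size, and the use of mtcars are my own assumptions, not part of the original answer.

```r
library(ggplot2)

# Hypothetical helper: saves one QQ plot per column of x into dir.
# Assumes every column of x is numeric.
qq_plot_all <- function(x, dir = tempdir()) {
  files <- character(0)
  for (col in names(x)) {
    # .data[[col]] looks the column up by its name, given as a string
    p <- ggplot(x, aes(sample = .data[[col]])) +
      stat_qq() +
      ggtitle(col)
    file <- file.path(dir, paste0(col, ".png"))
    ggsave(filename = file, plot = p, width = 5, height = 5)
    files <- c(files, file)
  }
  invisible(files)  # return the paths so the caller can check them
}

saved <- qq_plot_all(mtcars[, c("mpg", "hp")])
```

Returning the file paths is optional; it just makes it easy to confirm what was written.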

Related

Change title of plots in list

I have a problem with my plotting function. I already asked a similar question, and here are all the data and the plotting function.
When I try to apply my plotting function to my list of dfs, the title gets changed and the title condition I specified in the function is not respected. Is there a way to rename all the plots in the list or fix the function/loop so that the title stays the same?
Thanks in advance
This works as intended:
mynames <- sapply(names(tbls), function(x) {
paste("How do they rank? -",gsub("\\.",": ",x))
})
myfilenames <- names(tbls)
plot_likert <- function(x, myname, myfilename){
p <- plot(likert(x),
type ="bar",center=3,
group.order=names(x))+
labs(x = "Theme", subtitle=paste("Number of observations:",nrow(x)))+
guides(fill=guide_legend("Rank"))+
ggtitle(myname)
p
}
list_plots <- lapply(1:length(tbls),function(i) {
plot_likert(tbls[[i]], mynames[i], myfilenames[i])
})
When in doubt, keep things stupid and simple. Non-standard evaluation like deparse(substitute()) will throw you right into Burns' R inferno.
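Here is a minimal base-R sketch of why passing the title explicitly is more robust. It assumes the original function built its title with deparse(substitute(x)), which the question does not actually show. tbls_demo and describe() are made-up names for illustration.

```r
# Hypothetical stand-in for the list of data frames in the question.
tbls_demo <- list(theme.a = data.frame(q1 = 1:3),
                  theme.b = data.frame(q1 = 4:6))

# Inside lapply(), substitute() sees lapply's internal call, not the list names.
nse_titles <- unlist(lapply(tbls_demo, function(x) deparse(substitute(x))),
                     use.names = FALSE)
print(nse_titles)  # every element is the string "X[[i]]"

# Building the titles up front and passing them in keeps each one tied to its table.
titles <- paste("How do they rank? -", gsub("\\.", ": ", names(tbls_demo)))
describe <- function(df, title) paste0(title, " (n = ", nrow(df), ")")
labelled <- unlist(Map(describe, tbls_demo, titles), use.names = FALSE)
print(labelled)
```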

Looping cut2 color argument in qplot

First off fair warning that this is relevant to a quiz question from coursera.org practical machine learning. However, my question does not deal with the actual question asked, but is a tangential question about plotting.
I have a training set of data and I am trying to create a plot for each predictor that includes the outcome on the y axis, the index of the data set on the x axis, and colors the plot by the predictor in order to determine the cause of bias along the index. To make the color argument more clear I am trying to use cut2() from the Hmisc package.
Here is my data:
library(ggplot2)
library(caret)
library(AppliedPredictiveModeling)
library(Hmisc)
data(concrete)
set.seed(1000)
inTrain = createDataPartition(mixtures$CompressiveStrength, p = 3/4)[[1]]
training = mixtures[ inTrain,]
testing = mixtures[-inTrain,]
training$index <- 1:nrow(training)
I tried this and it makes all the plots but they are all the same color.
plotCols <- function(x) {
cols <- names(x)
for (i in 1:length(cols)) {
assign(paste0("cutEx",i), cut2(x[ ,i]))
print(qplot(x$index, x$CompressiveStrength, color=paste0("cutEx",i)))
}
}
plotCols(training)
Then I tried this and it makes all the plots, and this time they are colored but the cut doesn't work.
plotCols <- function(x) {
cols <- names(x)
for (i in 1:length(cols)) {
assign(cols[i], cut2(x[ ,i]))
print(qplot(x$index, x$CompressiveStrength, color=x[ ,cols[i]]))
}
}
plotCols(training)
It seems qplot() doesn't like having paste() in the color argument. Does anyone know another way to loop through the color argument and still keep my cuts? Any help is greatly appreciated!
Your desired output is easier to achieve using ggplot() instead of qplot(), since you can then use aes_string(), which accepts strings as arguments.
plotCols <- function(x) {
cols <- names(x)
for (i in 1:length(cols)) {
assign(paste0("cutEx", i), cut2(x[, i]))
p <- ggplot(x) +
aes_string("index", "CompressiveStrength", color = paste0("cutEx", i)) +
geom_point()
print(p)
}
}
plotCols(training)
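An alternative that avoids assign() altogether is to store the binned predictor as a column of the data and map colour to that column. This is only a sketch under my own assumptions: plot_by_cut, its arguments, and the toy data frame are hypothetical, and base cut() is swapped in for Hmisc::cut2() so the example runs without Hmisc.

```r
library(ggplot2)

# Hypothetical helper: one scatter plot per predictor, coloured by binned predictor.
plot_by_cut <- function(x, outcome, index = "index", bins = 4) {
  predictors <- setdiff(names(x), c(outcome, index))
  plots <- list()
  for (col in predictors) {
    d <- x
    # base cut() used here; Hmisc::cut2(d[[col]], g = bins) should be a drop-in swap
    d$bin <- cut(d[[col]], breaks = bins)
    plots[[col]] <- ggplot(d, aes(.data[[index]], .data[[outcome]], colour = bin)) +
      geom_point() +
      labs(colour = col)
  }
  plots
}

# Toy data standing in for the training set
set.seed(1)
toy <- data.frame(index = 1:50, strength = rnorm(50),
                  cement = runif(50), water = runif(50))
plots <- plot_by_cut(toy, outcome = "strength")
```

Each plot carries its own copy of the data, so print(plots$cement) can be called later without the bins having changed.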

How to extract dependent variable names from for-looped glms object to use as titles

I am making barplots of the model coefficients derived from a list of glm objects fitted in a for loop. Take LWAniMovSubDat to be any data set that makes the code below easy to follow:
models.LW <- list()
ivnames.LW <- paste(names(subset(LWAniMovSubDat[,c(1:6)], select = -c(4))),
collapse = ' + ')
dvnames.LW <- paste(names(LWAniMovSubDat[7:17]), sep = ',')
for (y in dvnames.LW){
form <- as.formula(paste(y, "~", ivnames.LW))
models.LW[[y]] <- glm(form, data = LWAniMovSubDat)
}
for (var in models.LW) {
dev.new()
barplot(coef(var))
}
How do I add titles to the above barplots without messing things up? I tried the following and it does not work:
main = names(models.LW)
Please help!
main = names(models.LW) doesn't work because names(models.LW) is the vector of names of all the models, not the name of the one being plotted. If you change your barplot code as follows, so that you iterate over the names of the models and then look each model up by its name, it should work.
for (var in names(models.LW)){
dev.new()
barplot(coef(models.LW[[var]]), main = var)
}
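For a version that runs without the original data, here is a sketch using mtcars. The choice of mpg and qsec as responses and wt and hp as predictors is arbitrary and mine. It also shows one way to read the dependent variable name back out of a fitted model, in case the list is unnamed.

```r
# Arbitrary stand-ins for the dependent and independent variables
dv_names <- c("mpg", "qsec")
iv_names <- paste(c("wt", "hp"), collapse = " + ")

models <- list()
for (y in dv_names) {
  models[[y]] <- glm(as.formula(paste(y, "~", iv_names)), data = mtcars)
}

pdf(file = NULL)  # throwaway device so the sketch runs without a display
for (nm in names(models)) {
  barplot(coef(models[[nm]]), main = nm)
}
invisible(dev.off())

# Alternative: take the response name from each model's formula
titles <- vapply(models, function(m) all.vars(formula(m))[1], character(1))
```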

Plotting a data.frame from within a function with ggplot2

I have this function to take an object returned by the IRT package sirt and plot item response functions for a set of items that the user can specify:
plotRaschIRF <- function(x,items=NULL,thl=-5,thu=5,thi=.01,D=1.7) {
if (!class(x)=="rasch.mml") stop("Object must be of class rasch.mml")
thetas <- seq(thl,thu,thi)
N <- length(thetas)
n <- length(x$item$b)
tmp <- data.frame(item=rep(1:n,each=N),theta=rep(thetas,times=n),b=rep(x$item$b,each=N))
probs <- exp(D*(tmp[,2]-tmp[,3]))/(1+exp(D*(tmp[,2]-tmp[,3])))
dat <- data.frame(item=rep(1:n,each=N),theta=rep(thetas,times=n),b=rep(x$item$b,each=N),p=probs)
#dat$item <- factor(dat$item,levels=1:n,labels=paste0("Item",1:n))
if (is.null(items)) {
m <- min(10,n)
items <- 1:m
if (10<n) warning("By default, this function will plot only the first 10 items")
}
if (length(items)==1) {
title="Item Response Function"
} else {
title="Item Response Functions"
}
dat2 <- subset(dat,eval(quote(eval(item,dat) %in% items)))
dat2$item <- factor(dat2$item,levels=unique(dat2$item),labels=paste0("Item",unique(dat2$item)))
out <- ggplot(dat2,aes(x=theta,y=p,group=item)) +
geom_line(aes(color=dat2$item),lwd=1) + guides(col=guide_legend(title="Items")) +
theme_bw() + ggtitle(title) + xlab(expression(theta)) +
ylab("Probability") + scale_x_continuous(breaks=seq(thl,thu,1))
print(out)
}
But it seems to be getting stuck at either the line just before I start using ggplot2 (where I convert one column of dat2 to a factor) or at the ggplotting itself -- not really sure which. I get the error message "Error in eval(expr, envir, enclos) : object 'dat2' not found".
I tried reading through this as was suggested here but either this is a different problem or I'm just not getting it. The function works fine when I step through it line by line. Any help is greatly appreciated!
Based on your comments, the error is almost certainly in geom_line(aes(color=dat2$item)). Get rid of dat2$ and it should work fine (i.e. geom_line(aes(color=item))). Stuff in aes is evaluated in the data argument (dat2 here), with the global environment as the enclosure. Notably this means stuff in the function environment is not available for use by aes unless it is part of the data (dat2 here). Since dat2 doesn't exist inside dat2, and dat2 doesn't exist in the global environment, you get that error.
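A stripped-down sketch of the fix, for anyone who wants to try it without sirt. The function plot_items and the toy item data are hypothetical; the only point is that colour is mapped to the bare column name item rather than dat2$item.

```r
library(ggplot2)

# Hypothetical reduced version of the plotting function
plot_items <- function(dat, items) {
  dat2 <- dat[dat$item %in% items, ]
  dat2$item <- factor(dat2$item)
  ggplot(dat2, aes(x = theta, y = p, group = item)) +
    geom_line(aes(colour = item)) +  # column name only, no dat2$
    theme_bw()
}

# Toy item response curves with made-up difficulties b and D = 1.7
thetas <- seq(-5, 5, 0.5)
b <- c(-1, 0, 1)
dat <- data.frame(item  = rep(1:3, each = length(thetas)),
                  theta = rep(thetas, times = 3),
                  b     = rep(b, each = length(thetas)))
dat$p <- plogis(1.7 * (dat$theta - dat$b))

p <- plot_items(dat, items = 1:2)
```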

How to combine do.call() plot() and expression()

I am getting an error when I try to combine expression() with do.call() and plot().
x <- 1:10
y <- x^1.5
I can get the plot I want by using only the plot function:
plot(y~x,xlab=expression(paste("Concentration (",mu,"M)")))
However, I would like to implement my plot using do.call. I have a really long list of parameters stored as a list, p. However, when I try and pass the list to do.call I get the following error:
p <- list(xlab=expression(paste("Concentration (",mu,"M)")))
do.call(plot,c(y~x,p))
Error in paste("Concentration (", mu, "M)") :
object 'mu' not found
I also tried defining the formula explicitly in the args passed to do.call, i.e. do.call(plot,c(formula=y~x,p)). I do not understand why I am getting the error, especially because the following does not give an error:
do.call(plot,c(0,p))
(and gives the desired mu character in the xaxis).
You can use alist rather than list:
p <- alist(xlab=expression(paste("Concentration (",mu,"M)")))
do.call(plot,c(y~x,p))
do.call evaluates the parameters before running the function; try wrapping the expression in quote:
p <- list(xlab=quote(expression(paste("Concentration (",mu,"M)"))))
do.call("plot", c(y~x, p))
Setting quote=TRUE also works. It in effect prevents do.call() from evaluating the elements of args before it passes them to the function given by what.
x <- 1:10
y <- x^1.5
p <- list(xlab=expression(paste("Concentration (",mu,"M)",sep="")))
do.call(what = "plot", args = c(y ~ x, p), quote = TRUE)
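To see what differs between the approaches above, it may help to inspect what each list actually stores. This is a base-R sketch; the names p_list and p_alist are mine, and the null pdf device is only there so it runs without a display.

```r
x <- 1:10
y <- x^1.5

# list() evaluates its arguments, so xlab holds an expression vector.
p_list <- list(xlab = expression(paste("Concentration (", mu, "M)")))
# alist() does not evaluate them, so xlab holds the unevaluated call to expression().
p_alist <- alist(xlab = expression(paste("Concentration (", mu, "M)")))

print(class(p_list$xlab))   # "expression"
print(class(p_alist$xlab))  # "call"

pdf(file = NULL)  # throwaway device so the sketch runs without a display
do.call(plot, c(y ~ x, p_alist))               # the call is evaluated once, giving the expression
do.call(plot, c(y ~ x, p_list), quote = TRUE)  # quoting protects the expression vector
invisible(dev.off())
```

In both working variants the plotting code ends up with the expression itself instead of trying to evaluate paste(..., mu, ...).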
