I would like to do something along the lines of this post: R: saving ggplot2 plots in a list
The problem is I can't get it to work. I seem to be able to get the individual graphs but the facet_wrap throws out an error. I would be content with just outputting all the graphs and then saving them to disk as a jpg or something, so I can scroll through them later.
for(n in 1:5){
pdata <- data.frame(mt1[n])
library(ggplot2)
p <-ggplot(pdata, aes(x=variable, y=value, color=Legend, group=Legend))+ geom_line()+ facet_wrap(~ color)
}
Link to a dput of the data : mt1
Edit:
Added the whole, correct file; it's a bit long.
If we omit the facet error due to a missing variable in your data frames, you can generate and save your plots in different files this way using ggsave :
for(n in 1:5){
pdata <- data.frame(mt1[n]) # better to use mt1[[n]]
p <-ggplot(pdata, aes(x=variable, y=value, color=Legend, group=Legend))+ geom_line()
ggsave(paste0("plot",n,".jpg"), p)
}
Some suggestions for improvement:
First, as @Dason points out, your library(ggplot2) call should be outside your loop.
Second, if you access an element of list by [.], then the result will still be a list. You should do instead: [[.]] which will render the data.frame(.) call unnecessary (as commented above in the code).
Third is a suggestion to use *apply family of functions. Here, using lapply.
To summarise all these points in code:
require(ggplot2) # load package outside once
o <- lapply(seq_along(mt1), function(idx) {
  # mt1[[idx]] extracts the data frame itself, so no data.frame() call is needed
  p <- ggplot(mt1[[idx]], aes(x = variable, y = value,
                              color = Legend, group = Legend)) + geom_line()
  ggsave(paste0("plot", idx, ".jpg"), p)
})
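The difference between [.] and [[.]] mentioned in the second point can be checked on a small list. This is only a minimal sketch: the list plots_data and its contents are made up for illustration and are not the asker's mt1.

```r
# Hypothetical stand-in for mt1: a list holding two small data frames
plots_data <- list(
  data.frame(variable = 1:3, value = c(2, 4, 6)),
  data.frame(variable = 1:3, value = c(1, 3, 5))
)

single <- plots_data[1]    # single brackets: still a list (of length 1)
double <- plots_data[[1]]  # double brackets: the data frame stored inside

print(class(single))
print(class(double))
```

Because double is already a data frame, it can be passed to ggplot() directly.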
I hope to get a contextual clue as to what may be wrong here without providing the data frame (I can provide it if necessary). Ultimately I want to use lapply to create multiple boxplots across multiple Ys with the same X, but I get the following error, even though Termed is definitely in my CMrecruitdat data.frame:
Error in aes_string(x = Termed, y = RecVar, fill = Termed) :
object 'Termed' not found
RecVar <- CMrecruitdat[,c("Req.Open.To.System.Entry", "Req.Open.To.Hire", "Tenure")]
BP <- function (RecVar){
require(ggplot2)
ggplot(CMrecruitdat, aes_string(x=Termed, y=RecVar, fill=Termed))+
geom_boxplot()+
guides(fill=false)
}
lapply(RecVar, FUN=BP)
If you use aes_string, you should pass strings rather than vectors and use strings for all your fields.
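RecVar <- CMrecruitdat[,c("Termed", "Req.Open.To.System.Entry", "Req.Open.To.Hire", "Tenure")]
BP <- function (yvar){
  require(ggplot2)
  # yvar is a column name (a string); RecVar is the data frame defined above
  ggplot(RecVar, aes_string(x="Termed", y=yvar, fill="Termed"))+
    geom_boxplot()+
    guides(fill=FALSE)
}
# skip "Termed" itself, since it is the x variable
lapply(setdiff(names(RecVar), "Termed"), FUN=BP)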
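In recent ggplot2 releases aes_string() is marked as deprecated in favour of the .data pronoun inside aes(). The sketch below shows the same idea in that style. It assumes ggplot2 3.0.0 or later, and it uses the built-in mtcars data set with a hypothetical helper make_boxplot() as stand-ins, since CMrecruitdat is not available.

```r
library(ggplot2)

# Hypothetical helper: yvar is a column name given as a string
make_boxplot <- function(yvar, data = mtcars) {
  ggplot(data, aes(x = factor(cyl), y = .data[[yvar]], fill = factor(cyl))) +
    geom_boxplot() +
    labs(x = "cyl", y = yvar) +
    guides(fill = "none")
}

# One boxplot per y column, all sharing the same x variable
plots <- lapply(c("mpg", "hp", "wt"), make_boxplot)
```

Each element of plots is a separate plot object that can be printed or saved on its own.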
I want to create some kind of animation with ggplot2 but it doesn't work as I want to. Here is a minimal example.
print(p <- qplot(c(1, 2),c(1, 1))+geom_point())
print(p <- p + geom_point(aes(c(1, 2),c(2, 2))))
print(p <- p + geom_point(aes(c(1, 2),c(3, 3))))
Adding extra points by hand is no problem. But now I want to do it in some loop to get an animation.
for(i in 4:10){
Sys.sleep(.3)
print(p <- p + geom_point(aes(c(1, 2),c(i, i))))
}
But now only the new points added are shown, and points of the previous iterations are deleted. I want the old ones still to be visible. How can I do this?
Either of these will do what you want, I think.
# create df dynamically
for (i in 1:10) {
df <- data.frame(x=rep(1:2,i),y=rep(1:i,each=2))
Sys.sleep(0.3)
print(ggplot(df, aes(x,y))+geom_point() + ylim(0,10))
}
# create df at the beginning, then subset in the loop
df <- data.frame(x=rep(1:2,10), y=rep(1:10,each=2))
for (i in 1:10) {
Sys.sleep(0.3)
print(ggplot(df[1:(2*i),], aes(x,y))+geom_point() +ylim(0,10))
}
Also, your code will cause the y-axis limits to change for each plot. Using ylim(...) keeps all the plots on the same scale.
EDIT Response to OP's comment.
One way to create animations is using the animation package. Here's an example.
library(ggplot2)
library(animation)
ani.record(reset = TRUE) # clear history before recording
df <- data.frame(x=rep(1:2,10), y=rep(1:10,each=2))
for (i in 1:10) {
plot(ggplot(df[1:(2*i),], aes(x,y))+geom_point() +ylim(0,10))
ani.record() # record the current frame
}
## now we can replay it, with an appropriate pause between frames
oopts = ani.options(interval = 0.5)
ani.replay()
This will "record" each frame (using ani.record(...)) and then play it back at the end using ani.replay(...). Read the documentation for more details.
Regarding the question about why your code fails, the simple answer is: "this is not the way ggplot is designed to be used." The more complicated answer is this: ggplot is based on a framework which expects you to identify a default dataset as a data frame, and then associate (map) various aspects of the graph (aesthetics) with columns in the data frame. So if you have a data frame df with columns A and B, and you want to plot B vs. A, you would write:
ggplot(data=df, aes(x=A, y=B)) + geom_point()
This code identifies df as the dataset, and maps the aesthetic x (the horizontal axis) with column A and y with column B. Taking advantage of the default order of the arguments, you could also write:
ggplot(df, aes(A,B)) + geom_point()
It is possible to specify things other than column names in aes(...), but this can and often does lead to unexpected (even bizarre) results. Don't do it!
The reason, basically, is that ggplot does not evaluate the arguments to aes(...) immediately, but rather stores them as expressions in a ggplot object, and evaluates them when you plot or print that object. This is why, for example, you can add layers to a plot and ggplot is able to dynamically rescale the x- and y-limits, something that does not work with plot(...) in base R.
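The delayed evaluation can be observed directly. In this minimal sketch the variable v and the toy data frame are made up for illustration; layer_data() is used only to inspect the values ggplot2 computes when the plot is built.

```r
library(ggplot2)

v <- c(1, 2, 3)
# aes() stores the expression `v`, not its current values
p <- ggplot(data.frame(x = 1:3), aes(x = x, y = v)) + geom_point()

# Change v after the plot object has been created
v <- c(10, 20, 30)

# The plot is only evaluated now, so it picks up the new v
y_drawn <- layer_data(p)$y
print(y_drawn)
```

The plot ends up with the later values of v, which is the same mechanism that makes every layer in the question's loop refer to the final value of i.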
I have a question similar to this one about the use of multiple dataframes for plotting a ggplot. I would like to create a base plot and then add data using a list of dataframes (rationale/usecase described below).
library(ggplot2)
# generate some data and put it in a list
df1 <- data.frame(p=c(10,8,7,3,2,6,7,8),v=c(100,300,150,400,450,250,150,400))
df2 <- data.frame(p=c(10,8,6,4), v=c(150,250,350,400))
df3 <- data.frame(p=c(9,7,5,3), v=c(170,200,340,490))
l <- list(df1,df2,df3)
#create a layer-adding function
addlayer <-function(df,plt=p){
plt <- plt + geom_point(data=df, aes(x=p,y=v))
plt
}
#for loop works
p <- ggplot()
for(i in l){
p <- addlayer(i)
}
#Reduce throws an error
p <- ggplot()
gg <- Reduce(addlayer,l)
Error in as.vector(x, mode) :
cannot coerce type 'environment' to vector of type 'any'
Called from: as.vector(e2)
In writing out this example I realize that the for loop is not a bad option but wouldn't mind the conciseness of Reduce, especially if I want to chain several functions together.
For those who are interested my use case is to draw a number of unconnected lines between points on a map. From a reference dataframe I figured the most concise way to map was to generate a list of subsetted dataframes, each of which corresponds to a single line. I don't want them connected so geom_path is no good.
This seems to work,
addlayer <-function(a, b){
a + geom_point(data=b, aes(x=p,y=v))
}
Reduce(addlayer, l, init=ggplot())
Note that you can also use a list of layers,
ggplot() + lapply(l, geom_point, mapping = aes(x=p,y=v))
However, neither of those two strategies is to be recommended; ggplot2 is perfectly capable of drawing multiple unconnected lines in a single layer (using e.g. the group argument). It is more efficient, and cleaner code.
library(plyr) # provides ldply()
names(l) = 1:3
m = ldply(l, I) # stack the data frames, keeping the list names in column .id
ggplot(m, aes(p, v, group=.id)) + geom_line()
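If plyr is not available, the same single-data-frame layout can be built with base R. This is a sketch of one possible alternative, not the answerer's method. The column name id and the intermediate name tagged are arbitrary choices made here.

```r
library(ggplot2)

# Data from the question
df1 <- data.frame(p = c(10, 8, 7, 3, 2, 6, 7, 8),
                  v = c(100, 300, 150, 400, 450, 250, 150, 400))
df2 <- data.frame(p = c(10, 8, 6, 4), v = c(150, 250, 350, 400))
df3 <- data.frame(p = c(9, 7, 5, 3), v = c(170, 200, 340, 490))
l <- list(df1, df2, df3)

# Tag each data frame with its position in the list, then stack them
tagged <- Map(function(d, id) cbind(d, id = id), l, seq_along(l))
m <- do.call(rbind, tagged)

# One layer; group = id keeps the three lines unconnected
g <- ggplot(m, aes(p, v, group = id)) + geom_line()
```

Printing g draws all three lines in a single layer.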
(Very much a novice, so please excuse any confusion/obvious mistakes)
Goal: A loop that allows me to plot multiple maps, displaying density data (D) for grid cells, across multiple months and seasons. The data for each month, season, etc., are in 8 separate columns; the loop would run through the columns of the data frame (DF)
Tried: Adding the plot from each iteration of the loop to a list so all plots can be called up to be displayed in a multipanel figure.
out <- NULL
for(i in 1:8){
D <- DF[,i]
x <- names(DF)[i]
p <-ggplot() + geom_polygon(data=DF, aes(x=long, y=lat, group=Name, fill=D), colour = "lightgrey") +labs(title=x)
out[[i]]<- p
print(p)
}
Problem: Even though the print(p) yields the correct plot for each iteration, the plots in the list out display the data from the final loop only.
So, when I try to use grid.arrange with plots in "out", all plots show the same data (from the 8th column); however, the plots do retain the correct title. When I try to call up each plot - e.g., print(out[[1]]), shows the same plot - except for the title label - as print(out[[8]]).
It seems that the previous elements in the list are being overwritten with each loop? However, the titles of the plots seem to display correctly.
Is there something obviously wrong with how I'm constructing the out list? How can I avoid having each previous plot overwritten?
The problem isn't that each item is overwritten; the problem is that ggplot() waits until you print the plot to resolve the variables in the aes() call. The loop maps fill= to D in every plot, and the value of D changes on each iteration. By the time the loop has finished, D holds only its last value, so every plot in the list prints with the same coloring.
This also reproduces the same problem
require(ggplot2)
#sample data
dd<-data.frame(x=1:10, y=runif(10), g1=sample(letters[1:2], 10, replace=T), g2=sample(letters[3:4], 10, replace=T))
plots<-list()
g<-dd$g1
plots[[1]]<-ggplot(data=dd, aes(x=x, y=y, color=g)) + geom_point()
g<-dd$g2
plots[[2]]<-ggplot(data=dd, aes(x=x, y=y, color=g)) + geom_point()
#both will print with the same groups.
print(plots[[1]])
print(plots[[2]])
One way around this (as @baptiste also mentioned) is to use aes_string(). This resolves the value of the variable "more quickly", so this should work:
plots<-list()
g<-"g1"
plots[[1]]<-ggplot(data=dd, aes(x=x, y=y)) + geom_point(aes_string(color=g))
g<-"g2"
plots[[2]]<-ggplot(data=dd, aes(x=x, y=y)) + geom_point(aes_string(color=g))
#different groupings (as desired)
print(plots[[1]])
print(plots[[2]])
This is most directly related to the aes() function, so the way you are setting the title is just fine which is why you see different titles.
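Another way to avoid the shared-variable problem is to build each plot inside a function, so that every plot keeps its own copy of the column name. The following is a sketch under assumptions: it uses the .data pronoun (ggplot2 3.0.0 or later), and the sample data is a deterministic variant of dd above so that the two groupings are clearly different.

```r
library(ggplot2)

# Deterministic variant of the sample data
dd <- data.frame(x = 1:10, y = (1:10) / 10,
                 g1 = rep(c("a", "b"), times = 5),
                 g2 = rep(c("c", "d"), each = 5))

# Each call to the anonymous function has its own `col`,
# so the plots do not share a variable that changes later
plots <- lapply(c("g1", "g2"), function(col) {
  ggplot(dd, aes(x = x, y = y, colour = .data[[col]])) +
    geom_point() +
    labs(colour = col)
})

print(plots[[1]])
print(plots[[2]])
```

Note that writing .data[[g]] directly inside a for loop would not help by itself, because g would still be looked up only when the plot is printed.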
I would like to plot an INDIVIDUAL box plot for each unrelated column in a data frame. I thought I was on the right track with boxplot.matrix from the sfsmisc package, but it seems to do the same as boxplot(as.matrix(plotdata)), which is to plot everything in a shared boxplot with a shared scale on the axis. I want (say) 5 individual plots.
I could do this by hand like:
par(mfrow=c(2,2))
boxplot(data$var1)
boxplot(data$var2)
boxplot(data$var3)
boxplot(data$var4)
But there must be a way to use the data frame columns?
EDIT: I used iterations, see my answer.
You could use the reshape package to simplify things
data <- data.frame(v1=rnorm(100),v2=rnorm(100),v3=rnorm(100), v4=rnorm(100))
library(reshape)
meltData <- melt(data)
boxplot(data=meltData, value~variable)
or even then use ggplot2 package to make things nicer
library(ggplot2)
p <- ggplot(meltData, aes(factor(variable), value))
p + geom_boxplot() + facet_wrap(~variable, scales="free")
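If the reshape package is not installed, base R's stack() produces a similar long format for a data frame of numeric columns. This is a sketch of an alternative, not part of the original answer. Note that stack() names its columns values and ind rather than value and variable.

```r
library(ggplot2)

set.seed(42)  # arbitrary seed, only to make the example reproducible
data <- data.frame(v1 = rnorm(100), v2 = rnorm(100),
                   v3 = rnorm(100), v4 = rnorm(100))

# Long format: one row per observation, `ind` holds the column name
long <- stack(data)

# One panel per original column, each with its own scale
g <- ggplot(long, aes(ind, values)) +
  geom_boxplot() +
  facet_wrap(~ind, scales = "free")
print(g)
```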
From ?boxplot we see that we have the option to pass multiple vectors of data as elements of a list, and we will get multiple boxplots, one for each vector in our list.
So all we need to do is convert the columns of our matrix to a list:
m <- matrix(1:25,5,5)
boxplot(x = as.list(as.data.frame(m)))
If you really want separate panels each with a single boxplot (although, frankly, I don't see why you would want to do that), I would instead turn to ggplot and faceting:
library(reshape) # provides melt()
m1 <- melt(as.data.frame(m))
library(ggplot2)
ggplot(m1,aes(x = variable,y = value)) + facet_wrap(~variable) + geom_boxplot()
I used iteration to do this. I think perhaps I wasn't clear in the original question. Thanks for the responses nonetheless.
par(mfrow=c(2,5))
for (i in 1:length(plotdata)) {
boxplot(plotdata[,i], main=names(plotdata[i]), type="l")
}