Strange behaviour of ggplot2 - r

I simply want to draw multiple arrows on a scatterplot using ggplot2. In this (dummy) example, an arrow is drawn but it moves as i is incremented and only one arrow is drawn. Why does that happen?
library(ggplot2)
a <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
b <- data.frame(x1=c(2,3),y1=c(10,10),x2=c(3,4),y2=c(15,15))
for (i in 1:nrow(b)) {
a <- a + geom_segment(arrow=arrow(),
mapping = aes(x=b[i,1],y=b[i,2],xend=b[i,3],yend=b[i,4]))
plot(a)
}
Thanks.

This isn't strange behavior; this is exactly how aes() is supposed to work. It delays evaluation of its arguments until the plot is actually drawn. This is problematic if the mapping includes expressions that refer to variables outside your data.frame (like i) or function calls (like the subsetting in b[i,1]). These are only evaluated when you actually "draw" the plot.
If you want to force evaluation of your parameters, you can use aes_(). This will work for you:
for (i in 1:nrow(b)) {
a <- a + geom_segment(arrow=arrow(),
mapping = aes_(x=b[i,1],y=b[i,2],xend=b[i,3],yend=b[i,4]))
}
plot(a)
Now, within the loop, the parameters for x=, y=, etc. are evaluated in the environment and their values are "frozen" in the layer.
Of course, it would be better not to build layers in loops and just provide a proper data object, as pointed out in @eipi10's answer.
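To make the delayed evaluation concrete, here is a minimal sketch. It assumes ggplot2 3.0.0 or later (where aes() stores its arguments as quosures) and the rlang package that ggplot2 depends on. The names m, stored, first and second are only illustrative.

```r
library(ggplot2)

b <- data.frame(x1 = c(2, 3), y1 = c(10, 10), x2 = c(3, 4), y2 = c(15, 15))

i <- 1
m <- aes(x = b[i, 1])              # stores the expression b[i, 1], not its value

# The mapping still holds the unevaluated expression
stored <- rlang::quo_get_expr(m$x)

first <- rlang::eval_tidy(m$x)     # evaluated while i is 1
i <- 2
second <- rlang::eval_tidy(m$x)    # same mapping, evaluated after i changed
```

The same mapping object gives a different x depending on the value of i at the moment it is evaluated, which is what happens to every layer built in the loop.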

As @Roland explains in the comment thread to this answer, only one arrow is plotted, because geom_segment(arrow=arrow(), mapping = aes(x=b[i,1],y=b[i,2],xend=b[i,3],yend=b[i,4])) is evaluated only when a is plotted. But i only has one value each time a is plotted. During the first time through the loop, i=1 and during the second time i=2. After the loop i also still equals 2. Thus, only one arrow is plotted each time. If, after the loop, you run i=1:2 then you'll get both arrows. On the other hand, if you change i to anything other than 1 and/or 2, you won't get any arrows plotted.
In any case, you can get both arrows without a loop as follows:
ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
geom_segment(data=b, arrow=arrow(), aes(x=x1,y=y1,xend=x2,yend=y2))
Question regarding @Roland's first comment: Shouldn't the object a be updated each time through the loop by adding the new geom_segment? For example, if I start with the OP's original a, then after one iteration of the loop,
a = a + geom_segment(arrow=arrow(), aes(x=b[1,1],y=b[1,2],xend=b[1,3],yend=b[1,4]))
Then, after two iterations of the loop,
a = a + geom_segment(arrow=arrow(), aes(x=b[1,1],y=b[1,2],xend=b[1,3],yend=b[1,4])) +
geom_segment(arrow=arrow(), aes(x=b[2,1],y=b[2,2],xend=b[2,3],yend=b[2,4]))
where in each case a means the value of a before the start of the loop. Shouldn't those underlying changes to the object a occur regardless of when, or whether, a is evaluated?
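On that question, a minimal sketch suggests that a is indeed updated on every pass. It assumes a ggplot2 version (3.0.0 or later) in which a plot's layers can be inspected through a$layers and each layer's mapping through $mapping. What each layer stores is the unevaluated expression b[i, 1], so all segment layers resolve i to the same value when the plot is drawn and the arrows land on top of each other. The names n_layers and x_exprs are illustrative.

```r
library(ggplot2)

a <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
b <- data.frame(x1 = c(2, 3), y1 = c(10, 10), x2 = c(3, 4), y2 = c(15, 15))

for (i in 1:nrow(b)) {
  a <- a + geom_segment(arrow = arrow(),
                        mapping = aes(x = b[i, 1], y = b[i, 2],
                                      xend = b[i, 3], yend = b[i, 4]))
}

# One point layer plus one segment layer per loop iteration
n_layers <- length(a$layers)

# Each segment layer holds the same expression, which still mentions i
x_exprs <- vapply(a$layers[2:3],
                  function(l) deparse(rlang::quo_get_expr(l$mapping$x)),
                  character(1))
```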

Related

Plotting dataframes in the same ggplot with for-loop in a function

I have a bunch of dataframes and I want to plot 2 columns of each dataframe on the same ggplot. I already have a plot from another function, coloured in blue and red, and I want the new ones to be added to it. Although my approach works at the console, it fails once I save it as a function and call it. The error I get is:
Discrete value supplied to continuous scale.
So, the dataframes are in my environment and named BEFMORN1 to BEFMORN9. The initial plot is test_plot.
The first part that gives me the test_plot works.
test_plot<-ggplot()+geom_point(data=yy4, aes(x=Time, y=Dist), colour="red")+geom_point(data=zz4, aes(x=Time, y=Dist), colour="blue")
test_plot<-test_plot+scale_x_continuous(name="Time (Seconds from the beginning)")
test_plot<-test_plot+scale_y_continuous(name="Distance (Metres from the beginning)")
The second part will be the new function
plot_all_runs<-function(r,test_plot) {
for (i in 1:(length(r[[1]]))) {
z<-as.data.frame(mget(ls(pattern=paste0("BEFMORN",i))))
test_plot2<-test_plot+geom_point(data=z, aes_string(x=names(z)[12], y=names(z)[17]))
}
print(test_plot2)
}
r is a list of 6 lists of different dataframes, so BEFMORN came from r[[1]]. BEFNOON will come from r[[2]] etc. So my plan is to have 6 identical functions with different arguments in paste0.
I'm using aes_string(x=names(z)[12] because the data frames z will have different column names in each iteration.
Does someone understand why I'm getting an error? I have played around with the scales (removing them from the initial plot or adding them again in the next one) but no improvement.
EDIT:
All columns to be plotted have been transformed to numeric. Others are factors and integers.
EXAMPLE
BEFMORN1<-data.frame(BEFMORN1.Time=seq(0,10, 0.5), BEFMORN1.Dist=1:21)
BEFMORN2<-data.frame(BEFMORN2.Time=seq(0,13, 0.5), BEFMORN2.Dist=c(1:8,8,8,9,10,13,13,13,13.5,14,14,14,14:21))
yy4<-data.frame(Time=seq(0,10, 0.5), Dist=c(1:8,8,8,9,10,13,14:21))
zz4<-data.frame(Time=seq(0,12, 0.5), Dist=c(1:8,8,8,9,9.5,10,10.5,12,12.5,13,14:21))
test_plot<-ggplot()+geom_point(data=yy4, aes(x=Time, y=Dist), colour="red")+geom_point(data=zz4, aes(x=Time, y=Dist), colour="blue")
plot_all_runs<-function(test_plot) {
for (i in 1:9) {
z<-as.data.frame(mget(ls(pattern=paste0("BEFMORN",i))))
test_plot2<-test_plot+geom_point(data=z, aes_string(x=names(z)[12], y=names(z)[17]))
}
print(test_plot2)
}
An example of generating the long format that @biomiha and @joran suggested:
library(ggplot2)
BEFMORN1<-data.frame(Time=seq(0,10, 0.5)
, Dist=1:21, Group = "BEFMORN1")
BEFMORN2<-data.frame(Time=seq(0,13, 0.5)
, Dist=c(1:8,8,8,9,10,13,13,13,13.5,14,14,14,14:21)
, Group = "BEFMORN2")
yy4<-data.frame(Time=seq(0,10, 0.5)
, Dist=c(1:8,8,8,9,10,13,14:21)
, Group = "yy4")
zz4<-data.frame(Time=seq(0,12, 0.5)
, Dist=c(1:8,8,8,9,9.5,10,10.5,12,12.5,13,14:21)
, Group = "zz4")
allData <-
rbind(BEFMORN1, BEFMORN2, yy4, zz4)
ggplot(allData
, aes(x = Time
, y = Dist
, col = Group)) +
geom_point()
Note that if your data are already in place, adding a "Group" column may need to be done with a bit more care. However, the general principle is the same. If you want, you can use any of the scale_color_* functions to change the default colors, including scale_color_manual if you want to set them yourself.
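For data frames that already exist with different column names, one possible way to add the "Group" column is sketched below. It is only an illustration: the two small data frames are made-up stand-ins for the real runs, and to_long, time_col, dist_col and allRuns are hypothetical names. Columns are picked by position, as in the question, and renamed to a common Time/Dist pair before binding.

```r
library(ggplot2)

# Made-up stand-ins for the real runs; column names differ per data frame
runs <- list(
  BEFMORN1 = data.frame(BEFMORN1.Time = seq(0, 2, 0.5), BEFMORN1.Dist = 1:5),
  BEFMORN2 = data.frame(BEFMORN2.Time = seq(0, 3, 0.5), BEFMORN2.Dist = 1:7)
)

# Pull two columns by position, give them common names, and tag the source
to_long <- function(df, group, time_col = 1, dist_col = 2) {
  data.frame(Time = df[[time_col]], Dist = df[[dist_col]], Group = group)
}

# One long data frame with a Group column identifying each run
allRuns <- do.call(rbind, Map(to_long, runs, names(runs)))

p <- ggplot(allRuns, aes(x = Time, y = Dist, col = Group)) +
  geom_point()
```

With the real data, time_col and dist_col would be the positions used in the question (12 and 17).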

Aesthetics must either be length one or the same length

I am trying to plot values and error bars, a seemingly simple task. As the script is fairly long, I am limiting the code given here to the necessary amount.
I can plot the graph without error bars. However, when trying to add the errorbars I get the message
Error: Aesthetics must either be length one, or the same length as the dataProblems:Tempdata
This is the code I am using. All vectors in the Tempdata data frame are of length 390.
Tempdata <- data.frame (TempDiff, Measurement.points, Room.ext.resc, MelatoninData, Proximal.vs.Distal.SD.ext, ymax, ymin)
p <- ggplot(data=Tempdata,
aes(x = Measurement.points,
y = Tempdata, colour = "Temperature Differences"))
p + geom_line(aes(x=Measurement.points, y = Tempdata$TempDiff, colour = "Gradient Proximal vs. Distal"))+
geom_errorbar(aes(ymax=Tempdata$ymax, ymin=Tempdata$ymin))
The problem is that you have the colour-variables between quotation marks. You should put the variable name at that spot. So, replacing "Temperature Differences" with TempDiff and "Gradient Proximal vs. Distal" with Proximal.vs.Distal.SD.ext will probably solve your problem.
Furthermore, you can't call for two different colour-variables.
The improved ggplot code should probably be something like this:
ggplot(data=Tempdata, aes(x=Measurement.points, y=TempDiff, colour=Proximal.vs.Distal.SD.ext)) +
geom_line() +
geom_errorbar(aes(ymax=ymax, ymin=ymin))
I also fixed some more problems with your original code:
the $ issue reported by Roland
the fact that you have conflicting calls in your aes
the fact you are calling your dataframe inside the first aes
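A self-contained sketch of the same pattern, using made-up data in place of Tempdata (the values, n, and the data frame name dat are assumptions for illustration only). Every aesthetic refers to a bare column name of the one data frame passed to ggplot(), so all aesthetics have the same length as the data.

```r
library(ggplot2)

# Made-up stand-in for Tempdata: every column has the same length
set.seed(1)
n <- 20
dat <- data.frame(Measurement.points = seq_len(n),
                  TempDiff = cumsum(rnorm(n)))
dat$ymin <- dat$TempDiff - 0.5
dat$ymax <- dat$TempDiff + 0.5

# Bare column names inside aes(); no dat$... and no data frame as an aesthetic
p <- ggplot(dat, aes(x = Measurement.points, y = TempDiff)) +
  geom_line() +
  geom_errorbar(aes(ymin = ymin, ymax = ymax), width = 0.3)

print(p)
```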

Animation, adding geom

I want to create some kind of animation with ggplot2 but it doesn't work as I want to. Here is a minimal example.
print(p <- qplot(c(1, 2),c(1, 1))+geom_point())
print(p <- p + geom_point(aes(c(1, 2),c(2, 2))))
print(p <- p + geom_point(aes(c(1, 2),c(3, 3))))
Adding extra points by hand is no problem. But now I want to do it in some loop to get an animation.
for(i in 4:10){
Sys.sleep(.3)
print(p <- p + geom_point(aes(c(1, 2),c(i, i))))
}
But now only the new points added are shown, and points of the previous iterations are deleted. I want the old ones still to be visible. How can I do this?
Either of these will do what you want, I think.
# create df dynamically
for (i in 1:10) {
df <- data.frame(x=rep(1:2,i),y=rep(1:i,each=2))
Sys.sleep(0.3)
print(ggplot(df, aes(x,y))+geom_point() + ylim(0,10))
}
# create df at the beginning, then subset in the loop
df <- data.frame(x=rep(1:2,10), y=rep(1:10,each=2))
for (i in 1:10) {
Sys.sleep(0.3)
print(ggplot(df[1:(2*i),], aes(x,y))+geom_point() +ylim(0,10))
}
Also, your code will cause the y-axis limits to change for each plot. Using ylim(...) keeps all the plots on the same scale.
EDIT Response to OP's comment.
One way to create animations is using the animation package. Here's an example.
library(ggplot2)
library(animation)
ani.record(reset = TRUE) # clear history before recording
df <- data.frame(x=rep(1:2,10), y=rep(1:10,each=2))
for (i in 1:10) {
plot(ggplot(df[1:(2*i),], aes(x,y))+geom_point() +ylim(0,10))
ani.record() # record the current frame
}
## now we can replay it, with an appropriate pause between frames
oopts = ani.options(interval = 0.5)
ani.replay()
This will "record" each frame (using ani.record(...)) and then play it back at the end using ani.replay(...). Read the documentation for more details.
Regarding the question about why your code fails, the simple answer is: "this is not the way ggplot is designed to be used." The more complicated answer is this: ggplot is based on a framework which expects you to identify a default dataset as a data frame, and then associate (map) various aspects of the graph (aesthetics) with columns in the data frame. So if you have a data frame df with columns A and B, and you want to plot B vs. A, you would write:
ggplot(data=df, aes(x=A, y=B)) + geom_point()
This code identifies df as the dataset, and maps the aesthetic x (the horizontal axis) with column A and y with column B. Taking advantage of the default order of the arguments, you could also write:
ggplot(df, aes(A,B)) + geom_point()
It is possible to specify things other than column names in aes(...) but this can and often does lead to unexpected (even bizarre) results. Don't do it!
The reason, basically, is that ggplot does not evaluate the arguments to aes(...) immediately, but rather stores them as expressions in a ggplot object, and evaluates them when you plot or print that object. This is why, for example, you can add layers to a plot and ggplot is able to dynamically rescale the x- and y-limits, something that does not work with plot(...) in base R.

List for Multiple Plots from Loop (ggplot2) - List elements being overwritten

(Very much a novice, so please excuse any confusion/obvious mistakes)
Goal: A loop that allows me to plot multiple maps, displaying density data (D) for grid cells, across multiple months and seasons. The data for each month, season, etc., are in 8 separate columns; the loop would run through the columns of the data frame (DF)
Tried: Adding the plot from each iteration of the loop to a list so all plots can be called up to be displayed in a multipanel figure.
out <- NULL
for(i in 1:8){
D <- DF[,i]
x <- names(DF)[i]
p <-ggplot() + geom_polygon(data=DF, aes(x=long, y=lat, group=Name, fill=D), colour = "lightgrey") +labs(title=x)
out[[i]]<- p
print(p)
}
Problem: Even though the print(p) yields the correct plot for each iteration, the plots in the list out display the data from the final loop only.
So, when I try to use grid.arrange with the plots in "out", all plots show the same data (from the 8th column); however, the plots do retain the correct title. When I call up each plot individually, e.g. print(out[[1]]), it shows the same plot as print(out[[8]]), except for the title label.
It seems that the previous elements in the list are being overwritten with each loop? However, the title of the plots seem to display correctly.
Is there something obviously wrong with how I'm constructing the out list? How can I avoid having each previous plot overwritten?
The problem isn't that each item is overwritten; the problem is that ggplot() waits until you print the plot to resolve the variables in the aes() command. The loop is assigning all fill= parameters to D, and the value of D changes on each pass. However, at the end of the loop, D will only have the last value. Thus each of the plots in the list will print with the same coloring.
This also reproduces the same problem
require(ggplot2)
#sample data
dd<-data.frame(x=1:10, y=runif(10), g1=sample(letters[1:2], 10, replace=T), g2=sample(letters[3:4], 10, replace=T))
plots<-list()
g<-dd$g1
plots[[1]]<-ggplot(data=dd, aes(x=x, y=y, color=g)) + geom_point()
g<-dd$g2
plots[[2]]<-ggplot(data=dd, aes(x=x, y=y, color=g)) + geom_point()
#both will print with the same groups.
print(plots[[1]])
print(plots[[2]])
One way around this (as @baptiste also mentioned) is to use aes_string(). This resolves the value of the variable "more quickly". So this should work:
plots<-list()
g<-"g1"
plots[[1]]<-ggplot(data=dd, aes(x=x, y=y)) + geom_point(aes_string(color=g))
g<-"g2"
plots[[2]]<-ggplot(data=dd, aes(x=x, y=y)) + geom_point(aes_string(color=g))
#different groupings (as desired)
print(plots[[1]])
print(plots[[2]])
This is most directly related to the aes() function, so the way you are setting the title is just fine which is why you see different titles.
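aes_string() has been soft-deprecated since ggplot2 3.0.0. A minimal sketch of an alternative, assuming ggplot2 3.0.0 or later: build each plot inside a function, so that every call gets its own environment, and look the column up by name with the .data pronoun. The names make_plot and col are hypothetical.

```r
library(ggplot2)

set.seed(1)
dd <- data.frame(x = 1:10, y = runif(10),
                 g1 = sample(letters[1:2], 10, replace = TRUE),
                 g2 = sample(letters[3:4], 10, replace = TRUE))

# Each call has its own environment, so col cannot change under the plot
make_plot <- function(col) {
  force(col)  # evaluate the argument now rather than when the plot is drawn
  ggplot(dd, aes(x = x, y = y, colour = .data[[col]])) +
    geom_point() +
    labs(colour = col)
}

plots <- lapply(c("g1", "g2"), make_plot)
```

Note that writing .data[[g]] directly inside a for loop would still look g up only at draw time, so the function wrapper (or lapply) is what fixes the value per plot.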

Temporarily disable aesthetics already defined in ggplot()

We may want to define some global aes() for a ggplot() graphic, but exclude them in some layers. For instance, consider the following example:
foo <- data.frame(x=runif(10),y=runif(10))
bar <- data.frame(x=c(0,1),ymin=c(-.1,.9),ymax=c(.1,1.1))
p <- ggplot(foo,aes(x=x,y=y))+geom_point()
Everything is good. However when trying to add the ribbon:
p <- p + geom_ribbon(data=bar, aes(x=x,ymin=ymin,ymax=ymax), alpha=.1)
# Error: Discrete value supplied to continuous scale
This error happens because y is already defined as part of the global aes(), which also applies to geom_ribbon(), but bar has no y column.
I have found two ways to avoid this error. One is to remove y=y from the original ggplot(foo,aes(x=x,y=y)), but then every layer I add in the future needs its own y=y in its aes(), which is not good.
The other possibility is to add a fake y column to bar:
bar = cbind(bar, y=0)
p <- p + geom_ribbon(data=bar, aes(x=x,ymin=ymin,ymax=ymax), alpha=.1)
Now it works. However, I don't like doing this, as y is a fake variable. Is there any way to temporarily disable the aes() already defined in ggplot() when calling geom_ribbon()?
As said in the comments by @ErnestA, we can unmap the aesthetics by setting them to NULL:
aes(y=NULL,x=x,ymin=ymin,ymax=ymax)
PS: For the legend, you can now override aesthetics with override.aes in guide_legend().
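A runnable sketch of the NULL approach on the question's data, alongside a second option that may also fit: the inherit.aes = FALSE argument of the layer, which drops all of the global aesthetics for that layer rather than just one. The names p_null and p_noinherit are illustrative.

```r
library(ggplot2)

set.seed(42)
foo <- data.frame(x = runif(10), y = runif(10))
bar <- data.frame(x = c(0, 1), ymin = c(-0.1, 0.9), ymax = c(0.1, 1.1))

p <- ggplot(foo, aes(x = x, y = y)) + geom_point()

# Option 1: unmap only y for this layer
p_null <- p + geom_ribbon(data = bar,
                          aes(x = x, y = NULL, ymin = ymin, ymax = ymax),
                          alpha = 0.1)

# Option 2: ignore every aesthetic inherited from ggplot() for this layer
p_noinherit <- p + geom_ribbon(data = bar,
                               aes(x = x, ymin = ymin, ymax = ymax),
                               alpha = 0.1, inherit.aes = FALSE)
```

With inherit.aes = FALSE the layer must restate every aesthetic it needs (here x), so setting a single aesthetic to NULL is the lighter choice when only one mapping gets in the way.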
