Trouble producing a polygon on top of a scatterplot using ggplot - r

Currently, I am trying to transition my graphical knowledge from the plot function in R to the ggplot function. I have begun constructing scatterplots and corresponding legends for a given data set; however, I want to incorporate the function geom_polygon into my plots using ggplot.
Specifically, I want to capture a triangular region from the origin of a scatterplot. For reproducibility, say I have the following data set:
rawdata<-data.frame(matrix(c(1,1,1,
2,1,-1,
3,-1,-1,
4,-1,1,
4,-2,2),5,3,byrow=TRUE))
names(rawdata)<-c("Town","x.coordinate","y.coordinate")
rawdata[,1]<-as.factor(rawdata[,1])
To construct a scatterplot along with a legend, I have been told to do the following:
p1<-ggplot(data=rawdata,aes(x=x.coordinate,y=y.coordinate,colour=Town,shape=Town)) +
theme_bw() + geom_point()
The result is a scatterplot with a legend for Town.
What I want to do now is produce a polygon. To do so, I have constructed a data frame, polygondata, to use in the geom_polygon function:
geom_polygon(data=polygondata,aes(x = xa, y = ya),colour="darkslategray2",
fill = "darkslategray2",alpha=0.25)
However, when I combine this with p1, I get the following error:
Error in eval(expr, envir, enclos) : object 'Town' not found
From some messing around, I have noticed that when I omit the shape argument from the ggplot function, I can easily produce the desired output. However, I wish to keep the shape for aesthetics.
I also get a similar problem when I try to produce arrows which connect points on the scatterplot using ggplot. However, I will address this problem after, as the root problem may be here.

Add the following to polygondata:
polygondata$Town = NA
Even though you're not using that variable in geom_polygon, ggplot expects it to be there, because every layer inherits the aesthetics mapped in the main call to ggplot.
Alternatively, I think you could avoid the error if you move the aesthetic mapping in the initial plot to geom_point rather than the main ggplot call, like this:
p1 <- ggplot(data=rawdata) +
theme_bw() +
geom_point(aes(x=x.coordinate, y=y.coordinate, colour=Town, shape=Town))
In that case, you wouldn't need to add a Town column to polygondata.
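A third option, sketched here as a suggestion rather than as part of the answer above, is to keep the mapping in the main ggplot call and tell the polygon layer not to inherit it with inherit.aes = FALSE. The triangle in polygondata is made up for illustration, since the original data frame is not shown in the question.

```r
library(ggplot2)

# Same data as in the question, built column by column
rawdata <- data.frame(
  Town = factor(c(1, 2, 3, 4, 4)),
  x.coordinate = c(1, 1, -1, -1, -2),
  y.coordinate = c(1, -1, -1, 1, 2)
)

# Hypothetical triangle starting at the origin (an assumption, not the asker's data)
polygondata <- data.frame(xa = c(0, 2, 2), ya = c(0, 1, -1))

p2 <- ggplot(rawdata, aes(x = x.coordinate, y = y.coordinate,
                          colour = Town, shape = Town)) +
  theme_bw() +
  geom_point() +
  # inherit.aes = FALSE stops this layer from looking for Town in polygondata
  geom_polygon(data = polygondata, aes(x = xa, y = ya),
               colour = "darkslategray2", fill = "darkslategray2",
               alpha = 0.25, inherit.aes = FALSE)
```

Printing p2 should draw the points with their colour and shape legend, with the shaded triangle as a second layer.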

Related

ggplot2 error in plotting a scatter plot in R

I expect this kind of scatter plot.
However, whenever I try to apply it to my data, I get this.
I just used this code, and this is my data.
And I also confirmed they are numeric class.
ggplot(selected.df, aes(x, y))
making a right plot.
Those variables were not numeric.
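A minimal sketch of how one might check and fix this. The contents of selected.df here are invented for illustration, because the real data is not shown; the assumption is that the numbers were read in as text or factors.

```r
# Hypothetical stand-in for selected.df: numbers stored as factors
selected.df <- data.frame(x = c("1.5", "2.5", "10"),
                          y = c("3", "20", "100"),
                          stringsAsFactors = TRUE)

# Inspect the column types before plotting
str(selected.df)

# Convert via character so that factor codes are not used by mistake
selected.df$x <- as.numeric(as.character(selected.df$x))
selected.df$y <- as.numeric(as.character(selected.df$y))

# Both columns should now be numeric
is.numeric(selected.df$x) && is.numeric(selected.df$y)
```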

animate a ternary plot

I'm trying to animate a time-series of ternary plots made with ggtern. I'd like to animate the 'time' variable but am getting the error
Error: Mapping must have 3 unique named variable pairs
# Get ternary data:
df=data.frame(x=c(.7,.1,.3),
y=c(.1,.8,.2),
z=c(.2,.1,.5), time = c(1,2,3))
# Plot with each time as a facet
ggtern::ggtern(data=df,
aes(x=x,y=y,z=z))+
geom_point()+
facet_wrap( ~time)
# Animate
ggtern::ggtern(data=df,
aes(x=x,y=y,z=z))+
geom_point()+
gganimate::transition_time(time)
I'm pretty sure the error is being thrown from ggtern, but not sure how to make it compatible with gganimate. I can't find much in the documentation to figure it out.

Trouble producing discrete legend using ggplot for a scatterplot

I am fairly new to the ggplot function in R. Currently, I am struggling to produce a legend for a given data set that I have constructed by hand. For simplicity, suppose this was my data set:
rawdata<-data.frame(matrix(c(1,1,1,
2,1,-1,
3,-1,-1,
4,-1,1
4,-2,2),5,3,byrow=TRUE))
names(rawdata)<-c("Town","x-coordinate","y-coordinate")
rawdata[,1]<-as.factor(rawdata[,1])
Now, using ggplot, I am trying to figure out how to produce a legend on a scatterplot. So far I have done the following:
p1<-ggplot(data=rawdata,aes(x=x.coordinate,y=y.coordinate,fill=rawdata[,1])) +
geom_point(data=rawdata,aes(x=x.coordinate,y=y.coordinate))
With the above code, the coordinates are plotted and the legend is constructed, but the points are only colored black.
I learned that to color coordinates, I would have needed to use the argument colour=rawdata[,1] in the geom_point function to color in points. However, when I try this, I get the following error code:
Error: Aesthetics must be either length 1 or the same as the data (4): colour
I understand that this has something to do with the length of the vector, but as of right now, I have absolutely no idea how to tackle this small problem.
With the default point shape, geom_point() is coloured through the colour aesthetic, not fill (fill only affects shapes 21 to 25). And, having passed the data into ggplot(data = ..), there's no need to then pass it into geom_point() again.
I've also fixed an error in the creation of your df in your example.
rawdata<-data.frame(matrix(c(1,1,1,2,1,-1,3,-1,-1,4,-1,1,4,-2,2),5,3,byrow=TRUE))
names(rawdata)<-c("Town","x.coordinate","y.coordinate")
rawdata[,1]<-as.factor(rawdata[,1])
library(ggplot2)
ggplot(data=rawdata,aes(x=x.coordinate,y=y.coordinate,colour=Town)) +
geom_point()

R - emulate the default behavior of hist() with ggplot2 for bin width

I'm trying to plot a histogram for one variable with ggplot2. Unfortunately, the default binwidth of ggplot2 leaves something to be desired:
I've tried to play with binwidth, but I am unable to get rid of that ugly "empty" bin:
Amusingly (to me), the default hist() function of R seems to produce a much better "segmentation" of the bins:
Since I'm doing all my other graphs with ggplot2, I'd like to use it for this one as well - for consistency. How can I produce the same bin "segmentation" of the hist() function with ggplot2?
I tried to input hist at the terminal, but I only got
function (x, ...)
UseMethod("hist")
<bytecode: 0x2f44940>
<environment: namespace:graphics>
which bears no information for my problem.
I am producing my histograms in ggplot2 with the following code:
ggplot(mydata, aes(x=myvariable)) +
geom_histogram(color="darkgray",fill="white", binwidth=61378) +
scale_x_continuous("My variable") +
scale_y_continuous("Subjects",breaks=c(0,2.5,5,7.5,10,12.5),limits=c(0,12.5)) +
theme(axis.text=element_text(size=14),axis.title=element_text(size=16,face="bold"))
One thing I should add is that looking at the histogram produced by hist(), it would seem that the bins have a width of 50000 (e.g. from 1400000 to 1600000 there are exactly two bins); setting binwidth to 50000 in ggplot2 does not produce the same graph. The graph produced by ggplot2 has the same gap.
Without sample data, it's always difficult to get reproducible results, so I've created a sample dataset
set.seed(16)
mydata <- data.frame(myvariable=rnorm(500, 1500000, 10000))
#base histogram
hist(mydata$myvariable)
As you've learned, hist() is a generic function. If you want to see the different implementations you can type methods(hist). Most of the time you'll be running hist.default. So if we borrow the break-finding logic from that function, we come up with
brx <- pretty(range(mydata$myvariable),
n = nclass.Sturges(mydata$myvariable),min.n = 1)
which is how hist() by default calculates the breaks. We can then use these breaks with the ggplot command
ggplot(mydata, aes(x=myvariable)) +
geom_histogram(color="darkgray",fill="white", breaks=brx) +
scale_x_continuous("My variable") +
theme(axis.text=element_text(size=14),axis.title=element_text(size=16,face="bold"))
and the plot below shows the two results side-by-side and as you can see they are quite similar.
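As a quick sanity check on the sample data (a sketch, not part of the original answer), the breaks computed with pretty() can be compared against the ones hist() reports when called with plot = FALSE:

```r
set.seed(16)
mydata <- data.frame(myvariable = rnorm(500, 1500000, 10000))

# Breaks computed by hand, as in the answer above
brx <- pretty(range(mydata$myvariable),
              n = nclass.Sturges(mydata$myvariable), min.n = 1)

# Breaks that hist() itself uses; plot = FALSE returns the object without drawing
h <- hist(mydata$myvariable, plot = FALSE)

# Compare the two sets of breaks
all.equal(h$breaks, brx)
```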
Also, that empty bin was probably caused by your y-axis limits. If a shape goes outside the limits of the range you specify in scale_y_continuous, it will simply get dropped from the plot. It looks like that bin wanted to be 14 tall, but you clipped y at 12.5.
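If the goal is only to restrict the visible range of the y-axis, one possible alternative is coord_cartesian(), which zooms the view without discarding anything that falls outside the limits. This is a sketch on the sample data; the upper limit of 50 is an arbitrary choice for illustration.

```r
library(ggplot2)

set.seed(16)
mydata <- data.frame(myvariable = rnorm(500, 1500000, 10000))
brx <- pretty(range(mydata$myvariable),
              n = nclass.Sturges(mydata$myvariable), min.n = 1)

# coord_cartesian() sets the visible window; bars taller than the limit
# are cut off at the top of the panel instead of being removed
p_zoom <- ggplot(mydata, aes(x = myvariable)) +
  geom_histogram(color = "darkgray", fill = "white", breaks = brx) +
  scale_x_continuous("My variable") +
  coord_cartesian(ylim = c(0, 50))
```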

ggplot2 - possible to reorder x's by value of computed y (stat_summary)?

Is it possible to reorder x values using a computed y via stat_summary?
I would think that this should work:
stat_summary( aes( x = reorder( XVarName , ..y.. ) ) )
but I get the following error:
"Error: stat_summary requires the following missing aesthetics: x"
I've seen a number of your posts, and I think this may be helpful for you. When generating a plot, always save it to a unique variable.
Create your plots without regard for ordering at first, until you're comfortable just creating the plots. Then, work your way into the structure of the ggplot objects to get a better understanding of what's in them. Then, figure out what you should be sorting.
plot1 <- ggplot() + ...
You can push plots to the viewport by typing out the object name that you've saved them to:
plot1
Creating a ggplot object (or variable) gives you the opportunity to review the structure of the plot, which, incidentally, can answer a number of the questions that you've been having so far.
str(plot1)
It is still fairly simple to reorder a plot after you've saved it as a variable/object, albeit with slightly longer names:
plot1$data$variable_tobe_recoded <- factor(...)
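One way to get the ordering asked about, sketched under the assumption that the summary is a mean, is to compute it inside reorder() rather than referring to the value computed by stat_summary. The data frame and its column names g and v are made up for illustration, and the fun argument assumes ggplot2 3.3.0 or later (older versions call it fun.y).

```r
library(ggplot2)

# Hypothetical data: group means are a = 6, b = 2, c = 10
df <- data.frame(g = c("a", "a", "b", "b", "c", "c"),
                 v = c(5, 7, 1, 3, 9, 11))

# reorder() sorts the levels of g by the mean of v within each group,
# so the levels come out as b, a, c
levels(reorder(df$g, df$v, FUN = mean))

# The same expression can be used directly as the x aesthetic
plot1 <- ggplot(df, aes(x = reorder(g, v, FUN = mean), y = v)) +
  stat_summary(fun = mean, geom = "point")
```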
