I'm trying to make a density plot in R using ggplot. I can get the axes and the points plotted, but no density. (As a side note, I am fairly unfamiliar with ggplot.) My z variable is just a list of values, and what shows up when I plot it is just a plot of dots. How do I incorporate the density?
An error that pops up says there are "negative extents to matrix." I've tried searching for this, but no luck.
This is my code:
ggplot(data=Denit, aes(x=Date, y=Depth, z=N2.excess)) +
geom_point() +
stat_density2d(data=Denit, aes(x=Date, y=Depth, z=N2.excess))
I am trying to plot a distribution in R using the package vioplot; my plot consists of a scatterplot of points with violin plots (representing 'bins' of these points) plotted over the top of the scatterplots.
However, different methods of plotting my data result in slightly different characteristics in the plot. If all the violin plots are plotted using a loop, the violin plot tails will stretch down to the lowest points, but if plotted individually, the violin plot tails won't reach to the outliers. Additionally, resizing the plot window (and then re-plotting) also changes how the tails of the violin plots appear.
Because I'm getting these differing plots, I'm wondering how to tell which plot is the correct representation of the data, and how to produce a consistent result. I've used the 'range' and 'coef' arguments in vioplot to make the plots more consistent, but this hasn't worked.
Thanks!
Maybe I am wrong, but I think the violin itself is not precisely defined: it is a smoothed density estimate, so its shape and tails depend on the smoothing, and it is mainly an easy-to-look-at representation of the data. The boxplot on the other hand (which vioplot also draws inside the violin) is much more important, as its box tells you the 25th, 50th and 75th percentiles (though in vioplot the 50th is a white dot for some reason). The whiskers depend on how you plot, but in the case of vioplot I believe they extend at most 1.5 times the interquartile range beyond the box by default (the range argument), rather than to fixed percentiles.
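To make the box and whisker positions concrete, here is a minimal base-R sketch of how I understand them to be computed. It assumes the whisker rule is the usual 1.5 x IQR one (what I believe vioplot's default range = 1.5 does); the data and variable names are made up for illustration.

```r
# Hypothetical sample; any numeric vector works
set.seed(42)
x <- rnorm(200, 0, 3)

# Box: 25th, 50th and 75th percentiles
q <- unname(quantile(x, c(0.25, 0.5, 0.75)))
iqr <- q[3] - q[1]

# Whiskers: at most 1.5 * IQR beyond the box, clipped to the data range
# (assumed rule, mirroring a default of range = 1.5)
lower <- max(q[1] - 1.5 * iqr, min(x))
upper <- min(q[3] + 1.5 * iqr, max(x))

c(lower = lower, q1 = q[1], median = q[2], q3 = q[3], upper = upper)
```

None of these numbers depend on the plot window size, so they are the part of the figure I would trust when two renderings of the violin tails disagree.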
If you want higher customizability, use ggplot:
library(reshape2) #for melt()
library(ggplot2)
uniform<-runif(200,-4,4)
normal<-rnorm(200,0,3)
df <- melt(data.frame(x=normal, y=uniform)) # long format: columns variable, value
ggplot(df, aes(x=variable, y=value)) +
geom_violin() +
geom_boxplot()
and if you want to plot all the data points instead of just the outliers, you can use ggbeeswarm for that, while not showing the outliers from geom_boxplot:
library(ggbeeswarm)
ggplot(df, aes(x=variable, y=value)) +
geom_violin() +
geom_boxplot(outlier.alpha = 0) +
geom_beeswarm()
Getting a strange ordering of vertices in a geom_line plot. Left hand plot is base R; right is ggplot.
Here's the shapefile I'm working with. This will reproduce the plot:
require(ggplot2); require(maptools)
rail = readShapeLines('railnetworkLine.shp')
rail_dat = fortify(rail[1,])
ggplot(rail_dat) + geom_line(aes(long, lat, group=group)) + coord_equal()
Any idea what is causing this? The data order of fortify seems correct, as plotting separately lines() confirms.
Use geom_path instead of geom_line. geom_line orders the data from lowest to highest x-value (long in this case) before plotting, but geom_path plots the data in the current order of the data frame rows.
ggplot(rail_dat) +
geom_path(aes(long, lat, group=group)) + coord_equal()
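A small sketch of the difference, using a made-up track (not the rail shapefile) whose x values double back on themselves. layer_data() is used only to inspect the order in which each geom will connect the points.

```r
library(ggplot2)

# Hypothetical track: row order is the intended drawing order,
# but long is not monotonically increasing
track <- data.frame(
  long = c(0, 2, 1, 4, 3),
  lat  = c(0, 1, 2, 3, 4)
)

p_line <- ggplot(track) + geom_line(aes(long, lat)) + coord_equal()
p_path <- ggplot(track) + geom_path(aes(long, lat)) + coord_equal()

# Order in which the points get connected
layer_data(p_line)$x  # reordered by x
layer_data(p_path)$x  # same order as the data frame rows
```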
I've tried using xyplot, symbols, and plot and am not getting exactly what I'm looking for. Basically, I have x and y data that I want to plot on a log scale. Each data point will be a solid filled circle, with the circle size dependent on a numerical variable (z), and the color based on a categorical value (w). Here are things I've tried:
radius <- sqrt(z / pi)
symbols(x, y, circles=radius, inches = 0.35)
Adding in a log scale and color completely threw it off. Basically trying to do something like this (http://flowingdata.com/2010/11/23/how-to-make-bubble-charts/) minus labels, with log scales and colored by a categorical variable (w).
I also tried xyplot and plot (using cex for size of points), but couldn't quite get what I was looking for either...can anyone point me in the right direction? Just starting off learning R and appreciate the help!
You can do something like this in ggplot:
ggplot(d)+
geom_point(aes(x=Expectancy,y=Fertility,size=Population,colour=Region),alpha=0.8)+
theme_bw()
To do it with log scales you would simply add :
+ scale_y_log10() + scale_x_log10()
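Put together with the x, y, z and w from the question, a self-contained sketch could look like the following. The data frame is invented for illustration, and scale_size_area() is my own addition so that circle area (not radius) is proportional to z, which is what the sqrt(z / pi) attempt was aiming for.

```r
library(ggplot2)

# Hypothetical data: x and y span several orders of magnitude,
# z drives bubble size, w is a categorical colour variable
set.seed(1)
d <- data.frame(
  x = 10^runif(30, 0, 3),
  y = 10^runif(30, 0, 2),
  z = runif(30, 1, 100),
  w = sample(c("A", "B", "C"), 30, replace = TRUE)
)

p <- ggplot(d, aes(x = x, y = y, size = z, colour = w)) +
  geom_point(alpha = 0.8) +         # solid, slightly transparent circles
  scale_size_area(max_size = 12) +  # area proportional to z
  scale_x_log10() +
  scale_y_log10() +
  theme_bw()

print(p)
```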
I'm trying to make a boxplot with ggplot2 using the following code:
p <- ggplot(
data,
aes(d$score, reorder(d$names d$scores, median))
) +
geom_boxplot()
I have factors called names and integers called scores.
My code produces a plot, but the graphic does not depict the boxes (only shows lines) and I get a warning message, "position_dodge requires non-overlapping x intervals." I've tried to adjust the height and width with geom_boxplot(width=5), but this does not seem to fix the problem. Can anyone suggest a possible solution to my problem?
I should point out that my boxplot is rather large and has about 200 name values on the y-axis. Perhaps this is the problem?
The number of groups is not the problem; I can see the same thing even when there are only 2 groups. The issue is that ggplot2 (at least in versions before 3.3.0, which do not infer the orientation) draws boxplots vertically (continuous along y, categorical along x) and you are trying to draw them horizontally (continuous along x, categorical along y).
Also, your example has several syntax errors and isn't reproducible because we don't have data/d.
Start with some mock data:
dat <- data.frame(scores=rnorm(1000,sd=500),
names=sample(LETTERS, 1000, replace=TRUE))
Corrected version of your example code:
ggplot(dat, aes(scores, reorder(names, scores, median))) + geom_boxplot()
This reproduces the horizontal lines you saw.
If you instead put the categorical on the x axis and the continuous on the y you get
ggplot(dat, aes(reorder(names, scores, median), scores)) + geom_boxplot()
Finally, if you want to flip the coordinate axes, you can use coord_flip(). There can be some additional problems with this if you are doing even more sophisticated things, but for basic boxplots it works.
ggplot(dat, aes(reorder(names, scores, median), scores)) +
geom_boxplot() + coord_flip()
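For readers on a newer ggplot2 (3.3.0 or later), geom_boxplot() also accepts an orientation argument, so a sketch without coord_flip() is possible. This assumes that version or newer; the mock data is regenerated here with a seed of my choosing so the block runs on its own.

```r
library(ggplot2)

# Same kind of mock data as above, with a fixed seed
set.seed(1)
dat <- data.frame(scores = rnorm(1000, sd = 500),
                  names = sample(LETTERS, 1000, replace = TRUE))

# Continuous variable on x, categorical on y, orientation stated explicitly
p <- ggplot(dat, aes(scores, reorder(names, scores, median))) +
  geom_boxplot(orientation = "y")

print(p)
```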
In case anyone else arrives here wondering why they're seeing
Warning message:
position_dodge requires non-overlapping x intervals
Why this happens
This happens because some of the boxplots / violin plots (or whatever geom you are drawing) overlap along the x axis. In many cases you may not care, but in some cases it matters, which is why ggplot2 warns you.
How to fix it
You have two options. Either suppress warnings when generating/printing the ggplot (for example by wrapping the print call in suppressWarnings()),
or simply alter the width of the plots so that they don't overlap, and the warning goes away. Try altering the width argument to the geom: e.g. geom_boxplot(width = 0.5) (the same works for geom_violin())
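Both options in one minimal sketch. The data here is made up and may not trigger the warning by itself; the point is only where width and suppressWarnings() go.

```r
library(ggplot2)

# Hypothetical data: five groups
set.seed(1)
dat <- data.frame(scores = rnorm(200),
                  names = sample(LETTERS[1:5], 200, replace = TRUE))

# Option 2: narrower boxes so neighbouring groups cannot overlap
p <- ggplot(dat, aes(names, scores)) +
  geom_boxplot(width = 0.5)

# Option 1: silence any remaining warnings at print time
suppressWarnings(print(p))
```

Note that suppressWarnings() hides every warning raised while printing, not just this one, so I would prefer fixing the width where possible.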
In addition to @stevec's options, if you're seeing
position_stack requires non-overlapping x intervals
position_fill requires non-overlapping x intervals
position_dodge requires non-overlapping x intervals
position_dodge2 requires non-overlapping x intervals
and if your x variable is supposed to overlap for different aesthetics such as fill, you can try making the x_var into a factor:
geom_bar(aes(x = factor(x_var), fill = type))
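A self-contained version of that call, with an invented data frame (x_var and type are just the placeholder names from above) and position = "dodge" chosen for illustration:

```r
library(ggplot2)

# Hypothetical data: numeric x with unevenly spaced values, two fill groups
df <- data.frame(
  x_var = c(1, 1, 2, 2, 4, 4),
  type  = c("a", "b", "a", "b", "a", "b")
)

# Treating x as a factor gives each value its own discrete slot
p <- ggplot(df) +
  geom_bar(aes(x = factor(x_var), fill = type), position = "dodge")

print(p)
```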
I have some data that I am trying to plot faceted by its Type with a smooth (Loess, LM, whatever) superimposed. Generation code is below:
testFrame <- data.frame(Time=sample(20:60,50,replace=T),Dollars=round(runif(50,0,6)),Type=sample(c("First","Second","Third","Fourth"),50,replace=T,prob=c(.33,.01,.33,.33)))
I have no problem either making a faceted plot or plotting the smooth, but I cannot do both. The first three lines of code below work fine. The fourth line is where I have trouble:
qplot(Time,Dollars,data=testFrame,colour=Type)
qplot(Time,Dollars,data=testFrame,colour=Type) + geom_smooth()
qplot(Time,Dollars,data=testFrame) + facet_wrap(~Type)
qplot(Time,Dollars,data=testFrame) + facet_wrap(~Type) + geom_smooth()
It gives the following error:
Error in `[<-.data.frame`(`*tmp*`, var, value = list(`NA` = NULL)) :
missing values are not allowed in subscripted assignments of data frames
What am I missing to overlay a smooth in a faceted plot? I could have sworn I had done this before, possibly even with the same data.
It works for me. Are you sure you have the latest version of ggplot2?
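For reference, here is a sketch of the same faceted plot written with ggplot() instead of qplot() (which ggplot2 3.4.0 and later deprecate), plus a version check. The seed and method = "lm" are my choices, not from the question. Note that "Second" is sampled with probability .01, so its facet may contain very few points or be missing entirely.

```r
library(ggplot2)

# Check which ggplot2 version is installed
packageVersion("ggplot2")

# Same generation code as in the question, with a fixed seed
set.seed(1)
testFrame <- data.frame(
  Time = sample(20:60, 50, replace = TRUE),
  Dollars = round(runif(50, 0, 6)),
  Type = sample(c("First", "Second", "Third", "Fourth"), 50,
                replace = TRUE, prob = c(.33, .01, .33, .33))
)

# Points, one panel per Type, and a linear smooth fitted within each panel
p <- ggplot(testFrame, aes(Time, Dollars)) +
  geom_point() +
  facet_wrap(~Type) +
  geom_smooth(method = "lm")

print(p)
```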