Do you know how to get the curved effect Jake Kaupp achieves on his plot?
Looks to be something along the lines of:
ggplot(full_data, aes(y = total_consumption_lbs, x = milk_production_lbs)) +
geom_xspline2(aes(s_open = TRUE, s_shape = 0.5))
Where geom_xspline2() comes from library(ggalt)
But don't ask me, here is his source code:
https://github.com/jkaupp/tidytuesdays/blob/master/2019/week5/R/analysis.R
This approach doesn't look quite as nice as your example, but it's a start, and some fiddling may get you the rest of the way.
First, some data to work with:
x <- seq(1:20)
y <- jitter(x,amount=1.5)
df <- data.frame(x,y)
The approach using ggplot2 is to draw a geom_smooth with very small span (small enough to cause lots of errors, as you'll see), and then plot points with white borders over the top of that.
ggplot(df, aes(x,y)) +
geom_smooth(se=F, colour="black", span=0.15) +
geom_point(fill="black", colour="white", shape=21, size=2.5) +
theme_minimal()
The downsides: As I noted above, you'll see many errors about singularities in the loess fit, because the span is so small. Second, you'll note that not all of the points are centred on the line, which makes sense since you are using a loess fit for the line. Lastly, there doesn't appear to be a way to change the width of the line around the points, so you end up with quite a thin white border.
Related
I am using the following code to plot a stacked area graph and I get the expected plot.
P <- ggplot(DATA2, aes(x=bucket,y=volume, group=model, fill=model,label=volume)) + #ggplot initial parameters
geom_ribbon(position='fill', aes(ymin=0, ymax=1))
but then when I add lines which are reading the same data source I get misaligned results towards the right side of the graph
P + geom_line(position='fill', aes(group=model, ymax=1))
does anyone know why this may be? Both plots are reading the same data source so I can't figure out what the problem is.
Actually, if all you wanted to do was draw an outline around the areas, then you could do the same using the colour aesthetic.
ggplot(DATA2, aes(x=bucket,y=volume, group=model, fill=model,label=volume)) +
geom_ribbon(position='fill', aes(ymin=0, ymax=1), colour = "black")
I have an answer, I hope it works for you, it looks good but very different from your original graph:
library(ggplot2)
DATA2 <- read.csv("C:/Users/corcoranbarriosd/Downloads/porsche model volumes.csv", header = TRUE, stringsAsFactors = FALSE)
In my experience you want to have X as a numeric variable and you have it as a string, if that is not the case I can Change that, but this will transform your bucket into a numeric vector:
bucket.list <- strsplit(unlist(DATA2$bucket), "[^0-9]+")
x=numeric()
for (i in 1:length(bucket.list)) {
x[i] <- bucket.list[[i]][2]
}
DATA2$bucket <- as.numeric(x)
P <- ggplot(DATA2, aes(x=bucket,y=volume, group=model, fill=model,label=volume)) +
geom_ribbon(aes(ymin=0, ymax=volume))+ geom_line(aes(group=model, ymax=volume))
It gives me the area and the line tracking each other, hope that's what you needed
If you switch to using geom_path in place of geom_line, it all seems to work as expected. I don't think the ordering of geom_line is behaving the same as geom_ribbon (and suspect that geom_line -- like geom_area -- assumes a zero base y value)
ggplot(DATA2, aes(x=bucket, y=volume, ymin=0, ymax=1,
group=model, fill=model, label=volume)) +
geom_ribbon(position='fill') +
geom_path(position='fill')
Should give you
I would like to use customized linetypes in ggplot. If that is impossible (which I believe to be true), then I am looking for a smart hack to plot arrowlike symbols above, or below, my line.
Some background:
I want to plot some water quality data and compare it to the standard (set by the European Water Framework Directive) in a red line. Here's some reproducible data and my plot:
df <- data.frame(datum <- seq.Date(as.Date("2014-01-01"),
as.Date("2014-12-31"),by = "week"),y=rnorm(53,mean=100,sd=40))
(plot1 <-
ggplot(df, aes(x=datum,y=y)) +
geom_line() +
geom_point() +
theme_classic()+
geom_hline(aes(yintercept=70),colour="red"))
However, in this plot it is completely unclear if the Standard is a maximum value (as it would be for example Chloride) or a minimum value (as it would be for Oxygen). So I would like to make this clear by adding small pointers/arrows Up or Down. The best way would be to customize the linetype so that it consists of these arrows, but I couldn't find a way.
Q1: Is this at all possible, defining custom linetypes?
All I could think of was adding extra points below the line:
extrapoints <- data.frame(datum2 <- seq.Date(as.Date("2014-01-01"),
as.Date("2014-12-31"),by = "week"),y2=68)
plot1 + geom_point(data=extrapoints, aes(x=datum2,y=y2),
shape=">",size=5,colour="red",rotate=90)
However, I can't seem to rotate these symbols pointing downward. Furthermore, this requires calculating the right spacing of X and distance to the line (Y) every time, which is rather inconvenient.
Q2: Is there any way to achieve this, preferably as automated as possible?
I'm not sure what is requested, but it sounds as though you want arrows at point up or down based on where the y-value is greater or less than some expected value. If that's the case, then this satisfies using geom_segment:
require(grid) # as noted by ?geom_segment
(plot1 <-
ggplot(df, aes(x=datum,y=y)) + geom_line()+
geom_segment(data = data.frame( df$datum, y= 70, up=df$y >70),
aes(xend = datum , yend =70 + c(-1,1)[1+up]*5), #select up/down based on 'up'
arrow = arrow(length = unit(0.1,"cm"))
) + # adjust units to modify size or arrow-heads
geom_point() +
theme_classic()+
geom_hline(aes(yintercept=70),colour="red"))
If I'm wrong about what was desired and you only wanted a bunch of down arrows, then just take out the stuff about creating and using "up" and use a minus-sign.
I'm using ggplot as described here
Smoothed density estimates
and entered in the R console
m <- ggplot(movies, aes(x = rating))
m + geom_density()
This works but is there some way to remove the connection between the x-axis and the density plot (the vertical lines which connect the density plot to the x-axis)
The most consistent way to do so is (thanks to #baptiste):
m + stat_density(geom="line")
My original proposal was to use geom_line with an appropriate stat:
m + geom_line(stat="density")
but it is no longer recommended since I'm receiving reports it's not universally working for every case in newer versions of ggplot.
The suggested answers dont provide exactly the same results as geom_density. Why not draw a white line over the baseline?
+ geom_hline(yintercept=0, colour="white", size=1)
This worked for me.
Another way would be to calculate the density separately and then draw it. Something like this:
a <- density(movies$rating)
b <- data.frame(a$x, a$y)
ggplot(b, aes(x=a.x, y=a.y)) + geom_line()
It's not exactly the same, but pretty close.
I would like to plot some data as a scatter plot using facet_wrap, while superimposing some information such as a linear regression and the density.
I managed to do all that, but the density values are out of proportion with respect to my points, which is a normal thing since these points are far away. Nevertheless, I'd like to scale and move my density curve so that it is clearly visible; I don't care about it's real values but more about its shape.
Here is an exaggerated minimum working example of what I have:
set.seed(48151623)
mydf <- data.frame(x1=rnorm(mean=5,n=100),x2=rnorm(n=100,mean=10),x3=rnorm(n=100,mean=20,sd=3))
mydf$var <- mydf$x1 + mydf$x2 * mydf$x3
mydf.wide <- melt(mydf,id.vars='var',measure.vars=c(1:3))
ggplot(data=mydf.wide,aes(x=value,y=var)) +
geom_point(colour='red') +
geom_smooth(method='lm') +
stat_density(aes(x=value,y=..scaled..),position='identity',geom='line') +
facet_wrap(~variable,scale='free_x')
Which results in:
What I would like resembles to this ugly hack:
stat_density(aes(x=value,y=..scaled..*100+200),position='identity',geom='line')
Ideally, I would use y=..scaled..* diff(range(value)) + min(value) but when I do this I get an error saying that 'value' was not found. I suspect the problem is related to the faceting, but I would prefer to keep my facets.
How can I scale and move the density curve in this case?
I suggest to make two plots and combine them with grid.arrange:
p1 <- ggplot(data=mydf.wide,aes(x=value,y=var)) +
geom_point(colour='red') +
geom_smooth(method='lm') +
facet_wrap(~variable,scale='free_x') +
theme(axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
plot.margin = unit(c(1, 1, 0, 0.5), "lines"))
p2 <- ggplot(data=mydf.wide,aes(x=value,y=var)) +
stat_density(aes(x=value,y=..scaled..),position='identity',geom='line') +
facet_wrap(~variable,scale='free_x') +
theme(strip.background=element_blank(),
strip.text=element_blank(),
plot.margin = unit(c(-1, 1, 0.5, 0.35), "lines"))
library(gridExtra)
grid.arrange(p1, p2, heights = c(2,1))
I'm not sure if this completely answers your question, but it was too long to put in a comment, so... In response to your second chunk of code in your question, since you've already defined x=value, you can use x instead of value in your definition of y.
stat_density(aes(x=value,y=..scaled..*diff(range(x)) +
min(x)),position='identity',geom='line')
This seems to fix your error and produces the following plot:
The only problem is, of course, if you have data with low y-values, then you're still going to overlap your density curves with your scatterplot. But, if this isn't the case, I personally think this is a fairly informative figure, as long as you can communicate effectively that the y axis values aren't important in interpreting the density curves--only the shapes of the curves are important.
I appreciate the answers of everyone, which led me to better understand ggplot underlying mechanisms. I also realize how awkward my requirement is; ggplot is not going to solve my problem.
I managed to do what I wanted not by using ggplot stat_density but to directly calculate my densities in another data frame:
set.seed(48151623)
mydf <- data.frame(x1=rnorm(mean=5,n=100),x2=rnorm(n=100,mean=10),x3=rnorm(n=100,mean=20,sd=3))
mydf$var <- mydf$x1 + mydf$x2 * mydf$x3
mydf.wide <- melt(mydf,id.vars='var',measure.vars=c(1:3))
mydf.densities <- do.call('rbind',lapply(unique(mydf.wide$variable), function(var) {
tmp <- mydf.wide[which(mydf.wide$variable==var),c('var','value')]
dfit <- density(tmp$value,cut=0)
scaledy <-dfit$y/max(dfit$y) * diff(range(tmp$var)) + min(tmp$var)
data.frame(x=dfit$x,y=scaledy,variable=rep(var,length(dfit$x)))
}))
ggplot(data=mydf.wide,aes(x=value,y=var)) +
geom_point(colour='red') +
geom_smooth(method='lm') +
geom_line(aes(x=x,y=y),data=mydf.densities) +
facet_wrap(~variable,scale='free_x')
(I know that the construction of mydf.densities is a bit obfuscated, but I will work on that later).
I'm giving out the bounty to the most voted solution at the end of the day, for your troubles.
From what I can find on stackoverflow, (such as this answer to using two scale colour gradients on one ggplot) this may not (yet) be possible with ggplot2.
I want to create a bubbleplot with two size aesthetics, one always larger than the other. The idea is to show the proportion as well as the absolute values. Now I could colour the points by the proportion but I prefer multi-bubbles. In Excel this is relatively simple. (http://i.stack.imgur.com/v5LsF.png) Is there a way to replicate this in ggplot2 (or base)?
Here's an option. Mapping size in two geom_point layers should work. It's a bit of a pain getting the sizes right for bubblecharts in ggplot though.
p <- ggplot(mtcars, aes(mpg, wt)) + geom_point(aes(size = disp), shape = 1) +
geom_point(aes(size = hp/(2*disp))) + scale_size_continuous(range = c(15,30))
To get it looking most like your exapmle, add theme_bw():
P <- p + theme_bw()
The scale_size_continuous() is where you have to just fiddle around till you're happy - at least in my experience. If someone has a better idea there I'd love to hear it.