Adding stat_smooth to only 1 facet in ggplot2 - r

I have some data for which, at one level of a factor, there is a significant correlation. At the other level, there is none. Plotting these side-by-side is simple. Adding a line to both of them with stat_smooth, also straightforward. However, I do not want the line or its fill displayed in one of the two facets. Is there a simple way to do this? Perhaps specifying a blank color for the fill and colour of one of the lines somehow?

Don't think about picking a facet; think about supplying a subset of your data to stat_smooth:
ggplot(df, aes(x, y)) +
geom_point() +
geom_smooth(data = subset(df, z =="a")) +
facet_wrap(~ z)
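For anyone who wants to try this without the original data, here is a minimal sketch. The data frame df and its columns x, y and z are made up for illustration (y is built to depend on x only when z is "a"), and method = "lm" is just one possible choice of smoother.

```r
library(ggplot2)

# Hypothetical data: y depends on x only when z == "a"
set.seed(1)
df <- data.frame(
  x = rep(1:20, times = 2),
  z = rep(c("a", "b"), each = 20)
)
df$y <- ifelse(df$z == "a", 2 * df$x, 20) + rnorm(40, sd = 4)

# The smoother only receives the rows for facet "a",
# so facet "b" shows points but no line or ribbon
p <- ggplot(df, aes(x, y)) +
  geom_point() +
  geom_smooth(data = subset(df, z == "a"), method = "lm", formula = y ~ x) +
  facet_wrap(~ z)
p
```

Because the subset still contains the faceting variable z, the layer lands in the matching panel automatically.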

Of course, I later answered my own question, although I'd still like to know whether there is a less hack-y way to do this. I also wonder if one could even fit different functions to different panels.
One technique is to use scale_fill_manual and scale_colour_manual, which let you specify exactly which colors will be used. So, in this case, let's say you have
a<-qplot(x, y, facets=~z)+stat_smooth(method="lm", aes(colour=z, fill=z))
You can specify colors for the fill and colour using the following. Note that the second color is fully transparent, because the final two hex digits of an 8-digit hex value represent the alpha channel. So, 00=clear.
a+scale_fill_manual(values=c("grey", "#11111100"))+scale_colour_manual(values=c("blue", "#11111100"))
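On fitting different functions to different panels: the subset idea from the first answer seems to extend naturally, by adding one smoothing layer per panel, each with its own subset and its own formula. The sketch below assumes made-up data (df with columns x, y, z) and picks a straight line for panel "a" and a quadratic for panel "b" purely as an example.

```r
library(ggplot2)

# Hypothetical data: linear trend in "a", curved trend in "b"
set.seed(2)
df <- data.frame(
  x = rep(seq(0, 10, length.out = 30), times = 2),
  z = rep(c("a", "b"), each = 30)
)
df$y <- ifelse(df$z == "a", 1.5 * df$x, (df$x - 5)^2) + rnorm(60)

# One smoothing layer per panel, each fed only that panel's rows
p <- ggplot(df, aes(x, y)) +
  geom_point() +
  geom_smooth(data = subset(df, z == "a"), method = "lm", formula = y ~ x) +
  geom_smooth(data = subset(df, z == "b"), method = "lm", formula = y ~ poly(x, 2)) +
  facet_wrap(~ z)
p
```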

Related

r fill and alpha from table

Is it possible to get a graph like the following with ggplot2 in R?
ggplot(mydata, aes(x=discrete_1, y=continuous_1)) +
geom_violin(aes(fill=discrete_2, alpha=continuous_2)) +
facet_grid(. ~ discrete_3)
Trying this, it seems that 'alpha' and 'fill' override each other?
I'm open to a different solution that achieves the same result (adding alpha to the fill colours proportionally to the value of continuous_2). My values for continuous_2 are always the same across rows with matching values for discrete_2.
Sorry if this is a duplicate; I couldn't find anything that addressed this specifically.

R - Bar Plot with transparency based on values?

I have a dataset myData which contains x and y values for various Samples. I can create a line plot for a dataset which contains a few Samples with the following pseudocode, and it is a good way to represent this data:
myData <- data.frame(x = 290:450, X52241 = c(..., ..., ...), X75123 = c(..., ..., ...))
myData <- myData %>% gather(Sample, y, -x)
ggplot(myData, aes(x, y)) + geom_line(aes(color=Sample))
Which generates:
This turns into a Spaghetti Plot when I have a lot more Samples added, which makes the information hard to understand, so I want to represent the "hills" of each sample in another way. Preferably, I would like to represent the data as a series of stacked bars, one for each myData$Sample, with transparency inversely related to what is in myData$y. I've tried to represent that data in photoshop (badly) here:
Is there a way to do this? Creating faceted plots using facet_wrap() or facet_grid() doesn't give me what I want (far too many Samples). I would also be open to stacked ridgeline plots using ggridges, but I am not understanding how I would be able to convert absolute values to a stat(density) value needed to plot those.
Any suggestions?
Thanks to u/Joris for the helpful suggestion! Since I did not find this question elsewhere, I'll go ahead and post the pretty simple solution to my question here for others to find.
Basically, I needed to apply the alpha aesthetic via aes(alpha=y, ...). In theory, I could apply this to any geom. I tried geom_col(), which worked, but the best solution was to use geom_segment(), since all my "bars" were going to be the same length. Also note that I had to "slice" up the segments, drawing one short segment per x value, in order to avoid overplotting.
ggplot(myData, aes(x, Sample)) +
geom_segment(aes(x=x, xend=x-1, y=Sample, yend=Sample, alpha=y), color='blue3', size=14)
That gives us the nice gradient:
The max y values are not the same for both lines, so to "match" the intensity I normalized the data (myDataNorm) and made the same plot. In my particular case, I kind of preferred bars that did not have a gradient, but which showed a hard edge for the maximum values of y. Here was one solution:
ggplot(myDataNorm, aes(x, Sample)) +
geom_segment(aes(x=x, xend=x-1, y=Sample, yend=Sample, alpha=ifelse(y>0.9,1,0)), color='blue3', size=14) +
theme(legend.position='none')
Better, but I did not like the faint-colored areas that were left. The final code is what gave me something that perfectly captured what I was looking for. I simply moved the ifelse() statement to apply to the x aesthetic, so the parts of the segment drawn were only those with high enough y values. Note my data "starts" at x=290 here. Probably more elegant ways to combine those x and xend terms, but whatever:
ggplot(myDataNorm, aes(x, Sample)) +
geom_segment(aes(
x=ifelse(y>0.9,x,290), xend=ifelse(y>0.9,x-1,290),
y=Sample, yend=Sample), color='blue3', size=14) +
xlim(290,400) # needed to show entire scale
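The normalization step is not shown above, and the ifelse() calls on x and xend can probably be avoided by filtering the rows first. The sketch below is one guess at both: it assumes "normalized" means dividing y by its maximum within each Sample, and it uses made-up bell-shaped curves in place of the real measurements. The object names myData and myDataNorm follow the post; peaks is a hypothetical helper name.

```r
library(ggplot2)

# Hypothetical long-format data standing in for myData (columns x, Sample, y)
x <- 290:450
myData <- rbind(
  data.frame(x = x, Sample = "X52241", y = 5 * dnorm(x, mean = 330, sd = 10)),
  data.frame(x = x, Sample = "X75123", y = 2 * dnorm(x, mean = 380, sd = 15))
)

# Assumed normalization: scale y to a maximum of 1 within each Sample
myDataNorm <- myData
myDataNorm$y <- ave(myData$y, myData$Sample, FUN = function(v) v / max(v))

# Keep only the rows above the threshold, so no ifelse() is needed in aes()
peaks <- subset(myDataNorm, y > 0.9)

p <- ggplot(peaks, aes(x, Sample)) +
  geom_segment(aes(x = x, xend = x - 1, y = Sample, yend = Sample),
               color = "blue3", linewidth = 14) +  # use size = 14 on ggplot2 < 3.4
  xlim(290, 450)  # show the full x range, not just the peak regions
p
```

Filtering drops the below-threshold rows entirely, so nothing is drawn at x=290 and the xlim() call is what keeps the full scale visible.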

Using a uniform color palette among different ggplot2 graphs with factor variable

I am using ggplot2 to create several plots about the same data. In particular I am interested in plotting observations according to a factor variable with 6 levels ("cluster").
But the plots produced by ggplot2 use different palettes every time!
For example, if I make a bar plot with this formula I get this result (this palette is what I expect to obtain):
qplot(cluster, data = data, fill = cluster) + ggtitle("Clusters")
And if I make a scatter plot and I try to color the observations according to their belonging to a cluster I get this result (notice that the color palette is different):
ggplot(data, aes(liens_ratio,RT_ratio)) +
geom_point(col=data$cluster, size=data$nombre_de_tweet/100+2) +
geom_smooth() +
ggtitle("Links - RTs")
Any idea on how to solve this issue?
I can't be certain this will work in your specific case without a reproducible example, but I'm reasonably confident that all you need to do is set your color inside an aes() call within the geom you want to color. That is,
ggplot(data, aes(x = liens_ratio, y = RT_ratio)) +
geom_point(aes(color = cluster, size = nombre_de_tweet/100+2)) +
geom_smooth() +
ggtitle("Links - RTs")
If all plots you make use the same data and this basic format, the color palette should be the same regardless of the geom used. Additional elements, such as the line from geom_smooth() will not be changed unless they are also explicitly colored.
The palette will just be the default one, of course; to change it look into scale_color_manual.
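As a rough illustration of the scale_color_manual route, one option is to define a single named vector of colours, keyed by the factor levels, and reuse it in every plot. The data below are made up (only the column names come from the question), and the six hex colours and the name cluster_cols are arbitrary choices for the example.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data: a 6-level factor plus two numeric columns
set.seed(3)
data <- data.frame(
  cluster = factor(sample(1:6, 60, replace = TRUE), levels = 1:6),
  liens_ratio = runif(60),
  RT_ratio = runif(60)
)

# One named vector of colours, keyed by factor level, shared by all plots
cluster_cols <- setNames(
  c("#1b9e77", "#d95f02", "#7570b3", "#e7298a", "#66a61e", "#e6ab02"),
  levels(data$cluster)
)

# Bar plot: the palette is applied to the fill aesthetic
bars <- ggplot(data, aes(cluster, fill = cluster)) +
  geom_bar() +
  scale_fill_manual(values = cluster_cols) +
  ggtitle("Clusters")

# Scatter plot: the same palette is applied to the colour aesthetic
points <- ggplot(data, aes(liens_ratio, RT_ratio, colour = cluster)) +
  geom_point() +
  scale_colour_manual(values = cluster_cols) +
  ggtitle("Links - RTs")
```

Naming the vector by level means each cluster keeps its colour even if a level happens to be absent from one of the plots.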

Can I adjust the lower limit of scale_color_brewer?

I have ordered categorical data that I would like to use color brewer on. But I have a hard time seeing the very light lower values. Is there a way to either trim off those lower values or set the lower limit in the scale?
ggplot(data.frame(x=1:6, y=10:15, w=letters[1:6]), aes(x, y, color=w)) +
geom_point()+ scale_color_brewer(type="seq", palette=1) + theme_bw()
Is there a better way to do this? So far I either see qualitative scales that aren't ordered or continuous scales that don't like being applied to discrete data. I'm aware of manual scales if that's the only route.
You cannot just set a lower limit. But you can use a palette with more colors than needed and map the brightest colors to unused levels. Below is an example with 9 levels:
ggplot(data.frame(x=1:6, y=10:15, w=letters[1:6]), aes(x, y, color=w)) +
geom_point() + theme_bw() +
scale_color_brewer(type="seq", palette=1,
limits=c(LETTERS[1:3], letters[1:6]),
breaks=letters[1:6])
While @shadow's answer was a start for me, the kind of brewer palette I needed to use (sequential) only has 9 values -- I had 8 categorical variables to plot! Removing only the 9th and lightest palette color still wasn't enough to make the color scheme completely visible.
So I used the colorRampPalette() function, which allows you to expand existing color palettes into continuous functions:
library(RColorBrewer)
ggplot(data.frame(x=1:6, y=10:15, w=letters[1:6]), aes(x, y, color=w)) +
geom_point() + theme_bw() +
scale_color_manual(values = colorRampPalette(brewer.pal(9, "YlGnBu"))(12)[7:12])
So in this case, I'm interpolating the (maximum) 9 native colors of the "YlGnBu" palette into 12 colors, and then using only the darkest 6 of those ([7:12]) in the plot.
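If this trimming is needed in more than one plot, it could be wrapped in a small helper. The function name dark_end and its arguments are hypothetical; the sketch assumes a sequential Brewer palette that runs from light to dark and supports 9 colours, so the last elements of the interpolated ramp are the darkest.

```r
library(RColorBrewer)

# Hypothetical helper: interpolate a sequential Brewer palette to n_total
# colours and keep only the n_keep darkest ones (the end of the ramp)
dark_end <- function(palette, n_keep, n_total = 2 * n_keep) {
  ramp <- colorRampPalette(brewer.pal(9, palette))
  tail(ramp(n_total), n_keep)
}

# Six colours taken from the dark half of a 12-colour "YlGnBu" ramp
cols <- dark_end("YlGnBu", n_keep = 6)
cols
```

The result can be passed straight to scale_color_manual(values = cols).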
I'm not aware of any additional arguments you could pass to scale_colour_brewer() to set the lower limit of the scale (see http://docs.ggplot2.org/current/scale_brewer.html)
You have more flexibility with one of ggplot's colour options, which take the format of: scale_xxx_yyy, for example scale_fill_discrete() which take more arguments. See for example http://docs.ggplot2.org/current/scale_hue.html but also note the other options ('see also').
scale_fill_continuous might be a good starting place for ordinal data as you've requested.
You could, for example, pass colours from http://colorbrewer2.org/ to it, and choose a more suitable starting colour. The only problem is you would need to convert the rgb/hex values to HSL values using a tool such as: http://serennu.com/colour/hsltorgb.php

geom_text positions per group

I am using geom_line, geom_point, and geom_text to plot something like the picture below:
I am grouping, and coloring my data frame, but I want the geom_text not to be so close to each other.
I want to put one label on top and the other on the bottom, or at least hide one of the two. Is there any way I can do this?
You can specify custom aesthetics in different geom_text() calls. You can include only a subset of the data (such as just one group) in each call, and give each geom_text() a custom hjust or vjust value for each subset.
ggplot(dat, aes(x, y, group=mygroups, color=mygroups, label=mylabel)) +
geom_point() +
geom_line() +
geom_text(data=dat[dat$mygroups=='group1',], aes(vjust=1)) +
geom_text(data=dat[dat$mygroups=='group2',], aes(vjust=-1))
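A possible variation, sketched here with made-up data, is to store the offset in a column and map it to vjust in a single geom_text() call. The column name vj and the values 1.5 and -0.5 are assumptions chosen for illustration; the other names follow the answer above.

```r
library(ggplot2)

# Hypothetical data using the column names from the answer above
dat <- data.frame(
  x = rep(1:5, times = 2),
  y = c(1, 2, 3, 4, 5, 1.2, 2.1, 3.3, 4.1, 5.2),
  mygroups = rep(c("group1", "group2"), each = 5)
)
dat$mylabel <- round(dat$y, 1)

# vjust > 1 pushes a label below its point, vjust < 0 pushes it above
dat$vj <- ifelse(dat$mygroups == "group1", 1.5, -0.5)

p <- ggplot(dat, aes(x, y, group = mygroups, color = mygroups, label = mylabel)) +
  geom_point() +
  geom_line() +
  geom_text(aes(vjust = vj))
p
```

This keeps the per-group logic in the data rather than in separate layers, which may be easier to extend to more than two groups.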
