Plotting continuous and discrete series in ggplot with facet - r

I have data that plots over time with four different variables. I would like to combine them in one plot using facet_grid, where each variable gets its own sub-plot. The following code resembles my data and the way I'm presenting it:
require(ggplot2)
require(reshape2)
subm <- melt(economics, id='date', c('psavert','uempmed','unemploy'))
mcsm <- melt(data.frame(date=economics$date, q=quarters(economics$date)), id='date')
mcsm$value <- factor(mcsm$value)
ggplot(subm, aes(date, value, col=variable, group=1)) + geom_line() +
facet_grid(variable~., scale='free_y') +
geom_step(data=mcsm, aes(date, value)) +
scale_y_discrete(breaks=levels(mcsm$value))
If I leave out scale_y_discrete, R complains that I'm trying to combine a discrete value with a continuous scale. If I include scale_y_discrete, my continuous series lose their scales.
Is there a neat way of solving this issue, i.e. getting all scales correct? I also see that the legend is sorted alphabetically; can I change that so the legend follows the same order as the sub-plots?

The problem with your data is that value is numeric (continuous) in data frame subm but a factor (discrete) in mcsm. You can't use the same scale for continuous and discrete values, so you get y values only for the last facet (the discrete one). It is also not possible to use two scale_y...() functions in one plot.
My approach would be to convert the mcsm values to numeric (saved as value2) and plot those instead; the quarters are then plotted as 1, 2, 3 and 4. To fix the legend order, use scale_color_discrete() and provide breaks= in the order you need.
mcsm$value2<-as.numeric(mcsm$value)
ggplot(subm, aes(date, value, col=variable, group=1)) + geom_line()+
facet_grid(variable~., scale='free_y') + geom_step(data=mcsm, aes(date, value2)) +
scale_color_discrete(breaks=c('psavert','uempmed','unemploy','q'))
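As a minimal sketch of what the as.numeric() conversion does (the vector q below is made up for illustration and is not part of the question's data), a factor converts to its integer level codes, so the quarters end up on a continuous 1 to 4 scale:

```r
# Hypothetical quarter labels, standing in for mcsm$value
q <- factor(c("Q1", "Q3", "Q2", "Q4", "Q1"))

# as.numeric() on a factor returns the position of each value in levels(q)
q_num <- as.numeric(q)

levels(q)  # the labels that the codes 1..4 correspond to
q_num      # numeric codes that a continuous y scale can handle
```

The codes follow the order of levels(q), not the order in which the values appear, so it is worth checking the levels before converting.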
UPDATE - solution using grobs
Another approach is to use grobs and the gridExtra library to plot your data as separate plots.
First, save the plot with all legends and data (code as above) as object p. Then use ggplot_build() and ggplot_gtable() to save the plot as a grob object gp. Extract from gp only the part that draws the legend (saved as object gp.leg); in this case it is list element number 17.
library(gridExtra)
p<-ggplot(subm, aes(date, value, col=variable, group=1)) + geom_line()+
facet_grid(variable~., scale='free_y') + geom_step(data=mcsm, aes(date, value2)) +
scale_color_discrete(breaks=c('psavert','uempmed','unemploy','q'))
gp<-ggplot_gtable(ggplot_build(p))
gp.leg<-gp$grobs[[17]]
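The position of the legend in gp$grobs can differ between plots and ggplot2 versions, so a hard-coded index is fragile. A possible alternative, sketched here on a stand-in plot (p_demo, g_demo and leg_demo are illustrative names, and the "guide-box" naming is an assumption about how ggplot2 labels legend entries in the gtable layout), is to look the legend up by name:

```r
library(ggplot2)

# Small stand-in plot with a colour legend (not the question's data)
p_demo <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) + geom_point()
g_demo <- ggplot_gtable(ggplot_build(p_demo))

# Layout entries whose name starts with "guide-box" hold the legend(s)
idx <- grep("^guide-box", g_demo$layout$name)

# Keep only real legends; empty placeholders are not gtables
legends <- Filter(function(g) inherits(g, "gtable"), g_demo$grobs[idx])
leg_demo <- legends[[1]]
```

If this lookup works for your ggplot2 version, leg_demo can be used wherever gp.leg is used.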
Make two new plots, p1 and p2: the first plots the data of subm and the second only the data of mcsm. Use scale_color_manual() to set the same colors as used for plot p. For the first plot, remove the x axis title, texts and ticks, and use plot.margin= to set the lower margin to a negative number. For the second plot, set the upper margin to a negative number. facet_grid() should be used for both plots to get the faceted look.
p1 <- ggplot(subm, aes(date, value, col=variable, group=1)) + geom_line()+
facet_grid(variable~., scale='free_y')+
theme(plot.margin = unit(c(0.5,0.5,-0.25,0.5), "lines"),
axis.text.x=element_blank(),
axis.title.x=element_blank(),
axis.ticks.x=element_blank())+
scale_color_manual(values=c("#F8766D","#00BFC4","#C77CFF"),guide="none")
p2 <- ggplot(data=mcsm, aes(date, value,group=1,col=variable)) + geom_step() +
facet_grid(variable~., scale='free_y')+
theme(plot.margin = unit(c(-0.25,0.5,0.5,0.5), "lines"))+ylab("")+
scale_color_manual(values="#7CAE00",guide="none")
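The hex codes passed to scale_color_manual() are the default ggplot2 colours for a four-level discrete scale. Assuming the scales package is available (it is installed alongside ggplot2), a quick way to look them up rather than typing them by hand is:

```r
library(scales)

# Default ggplot2 discrete palette for four levels
default_cols <- hue_pal()(4)
default_cols
```

In plot p the four levels appear to have been sorted alphabetically (psavert, q, uempmed, unemploy), which is presumably why q is given the second colour, "#7CAE00", in p2.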
Save both plots p1 and p2 as grob objects and then set the same widths for both.
gp1 <- ggplot_gtable(ggplot_build(p1))
gp2 <- ggplot_gtable(ggplot_build(p2))
maxWidth = grid::unit.pmax(gp1$widths[2:3],gp2$widths[2:3])
gp1$widths[2:3] <- as.list(maxWidth)
gp2$widths[2:3] <- as.list(maxWidth)
Use grid.arrange() and arrangeGrob() to arrange both plots and the legend in one plot.
grid.arrange(arrangeGrob(arrangeGrob(gp1,gp2,heights=c(3/4,1/4),ncol=1),
gp.leg,widths=c(7/8,1/8),ncol=2))

Related

Is it possible to make a column plot using ggplot in which the column fill is controlled by a third variable?

I have a data frame with three continuous variables (x,y,z). I want a column plot in which x defines the x-axis position of the columns, y defines the length of the columns, and the column colors (function of y) are defined by z. The test code below shows the set up.
require(ggplot2)
require(viridis)
# Create a dummy data frame
x <- c(rep(0.0, 5),rep(0.5,10),rep(1.0,15))
y <- c(seq(0.0,-5,length.out=5),
seq(0.0,-10,length.out=10),
seq(0.0,-15,length.out=15))
z <- c(seq(10,0,length.out=5),
seq(8,0,length.out=10),
seq(6,0,length.out=15))
df <- data.frame(x=x, y=y, z=z)
pbase <- ggplot(df, aes(x=x, y=y, fill=z))
ptest <- pbase + geom_col(width=0.5, position="identity") +
scale_fill_viridis(option="turbo",
limits = c(0,10),
breaks=seq(0,10,2.5),
labels=c("0","2.5","5.0","7.5","10.0"))
print(ptest)
The legend has the correct colors but the columns do not. Perhaps this is not the correct way to do this type of plot. I tried using geom_bar(), which creates bars with the correct colors, but the y-values are incorrect.
It looks like you have 3 X values that each appear 5, 10, or 15 times. Do you want the bars to be overlaid on top of one another, as they are now? If you add an alpha = 0.5 to the geom_col call you'll see the overlapping bars.
Alternatively, you might use dodging to show the bars next to one another instead of on top of one another.
ggplot(df, aes(x=x, y=y, fill=z, group = z)) +
geom_col(width=0.5, position=position_dodge()) +
scale_fill_viridis_c(option="turbo", # added with ggplot 3.x in 2018
limits = c(0,10),
breaks=seq(0,10,2.5),
labels=c("0","2.5","5.0","7.5","10.0"))
Or you might plot the data in order of y so that the smaller bars appear on top, visibly:
ggplot(dplyr::arrange(df,y), aes(x=x, y=y, fill=z))+
geom_col(width=0.5, position="identity") +
scale_fill_viridis_c(option="turbo",
limits = c(0,10),
breaks=seq(0,10,2.5),
labels=c("0","2.5","5.0","7.5","10.0"))
I solved this by using geom_tile() in place of geom_col().
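The poster did not share the geom_tile() code, so the following is only a sketch of how it might look. It assumes each row becomes one tile centred on its y value, with a tile height equal to the y spacing within its x group; the column h and the object ptile are illustrative names.

```r
library(ggplot2)

# Same dummy data as in the question
df <- data.frame(
  x = c(rep(0.0, 5), rep(0.5, 10), rep(1.0, 15)),
  y = c(seq(0, -5, length.out = 5),
        seq(0, -10, length.out = 10),
        seq(0, -15, length.out = 15)),
  z = c(seq(10, 0, length.out = 5),
        seq(8, 0, length.out = 10),
        seq(6, 0, length.out = 15))
)

# Assumed tile height: the y spacing within each x group
df$h <- ave(df$y, df$x, FUN = function(v) abs(diff(v)[1]))

# One tile per row, coloured by z, so no tile hides another
ptile <- ggplot(df, aes(x = x, y = y, fill = z)) +
  geom_tile(aes(height = h), width = 0.5) +
  scale_fill_viridis_c(option = "turbo", limits = c(0, 10))
```

Because tiles are centred on y, each column extends half a tile above 0 and half a tile below its lowest value; shifting y by h/2 would be one way to adjust that if it matters.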

Overlay density plot to each existing facet wrapped density plot in ggplot2?

I have a dataframe with ~37000 rows that contains 'name' as a string and 'UTCDateTime' in POSIXct format, and I am using it to produce a facet-wrapped density plot of time grouped by name.
I also have a separate density plot of POSIXct datetime data from an entirely different dataframe.
I want to overlay this second density plot on each individual facet_wrapped plot in the first density plot. Is there a way to do that? In general, if I have plots of any kind that are facet wrapped and another plot of the same type but different data that I want to overlay on each facet of the facet wrap, how do I do so?
This should in theory be as simple as not having the column that you're facetting by in the second dataframe. Example below:
library(ggplot2)
ggplot(iris, aes(Sepal.Width)) +
geom_density(aes(fill = Species)) +
geom_density(data = faithful,
aes(x = eruptions)) +
facet_wrap(~ Species)
Created on 2020-08-12 by the reprex package (v0.3.0)
EDIT: To get the densities on the same scale for the two types of data, you can use the computed variables using after_stat()*:
ggplot(iris, aes(Sepal.Width)) +
geom_density(aes(y = after_stat(scaled),
fill = Species)) +
geom_density(data = faithful,
aes(x = eruptions,
y = after_stat(scaled))) +
facet_wrap(~ Species)
* Prior to ggplot2 v3.3.0, use stat(variable) or ..variable.. instead.
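To check that a layer without the facetting column really is repeated in every panel, one option is to inspect the computed layer data with layer_data(). This is a small sketch on the same iris/faithful example; p_overlay and overlay are illustrative names.

```r
library(ggplot2)

p_overlay <- ggplot(iris, aes(Sepal.Width)) +
  geom_density(aes(fill = Species)) +
  geom_density(data = faithful, aes(x = eruptions)) +
  facet_wrap(~ Species)

# Computed data for the second layer (the faithful overlay)
overlay <- layer_data(p_overlay, 2)

# One panel id per facet: the overlay was repeated in all three
unique(overlay$PANEL)
```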

How to adjust the ordering of labels in the default legend in ggplot2 so that it corresponds to the order in the data

I am plotting a forest plot in ggplot2 and am having issues with the ordering of the labels in the legend matching the order of the labels in the data set. Here is my code below.
data code
d<-data.frame(x=c("Co-K(W) N=720", "IH-K(W) N=67", "IF-K(W) N=198", "CO-K(B)N=78", "IH-K(B) N=13", "CO=A(W) N=874","D-Sco Ad(W) N=346","DR-Ad (W) N=892","CE_A(W) N=274","CO-Ad(B) N=66","D-So Ad(B) N=215","DR-Ad(B) N=123","CE-Ad(B) N=79"),
y = rnorm(13, 0, 0.1))
d <- transform(d, ylo = y-1/13, yhi=y+1/13)
d$x <- factor(d$x, levels=rev(d$x)) # reverse ordering
forest plot code
credplot.gg <- function(d){
# d is a data frame with 4 columns
# d$x gives variable names
# d$y gives center point
# d$ylo gives lower limits
# d$yhi gives upper limits
require(ggplot2)
p <- ggplot(d, aes(x=x, y=y, ymin=ylo, ymax=yhi, group=x, colour=x)) +
geom_pointrange(size=1) +
theme_bw() +
scale_color_discrete(name="Sample") +
coord_flip() +
theme(legend.key=element_rect(fill='cornsilk2')) +
guides(colour = guide_legend(override.aes = list(size=0.5))) +
geom_hline(yintercept=0, colour = 'red', lty=2) +
xlab('Cohort') + ylab('CI') + ggtitle('Forest Plot')
return(p)
}
credplot.gg(d)
This is what I get. As you can see, the labels on the y axis match the order of the labels in the data. However, the legend is not in the same order, and I'm not sure how to correct this. This is my first time creating a plot in ggplot2. Any feedback is much appreciated. Thanks in advance.
Nice plot, especially for a first ggplot! I've not tested, but I think all you need is to add reverse=TRUE inside your colour's guide_legend() (found this in the Cookbook for R).
If I were to make one more comment, I'd say that ordering your vertical factor by numeric value often makes comparisons easier when alphabetical order isn't particularly meaningful. (Though maybe your alpha order is meaningful.)
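A minimal sketch of both suggestions, using a small made-up data set rather than the cohorts from the question (d_demo, p_rev and x_by_y are illustrative names):

```r
library(ggplot2)

# Small made-up data set, not the cohorts from the question
d_demo <- data.frame(x = c("A", "B", "C"), y = c(0.10, -0.20, 0.05))
d_demo <- transform(d_demo, ylo = y - 0.1, yhi = y + 0.1)
d_demo$x <- factor(d_demo$x, levels = rev(d_demo$x))  # reversed, as in the question

p_rev <- ggplot(d_demo, aes(x = x, y = y, ymin = ylo, ymax = yhi, colour = x)) +
  geom_pointrange() +
  coord_flip() +
  # reverse = TRUE flips the legend so it reads top to bottom like the axis
  guides(colour = guide_legend(reverse = TRUE))

# Second suggestion: order the factor by its numeric value instead
d_demo$x_by_y <- reorder(d_demo$x, d_demo$y)
levels(d_demo$x_by_y)
```

In the question's function, the same change would go into the existing guides(colour = guide_legend(...)) call alongside override.aes.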

ggplot2 geom_bar color only one column

Consider a sample dataframe and the relative geom_bar plot
data = data.frame(method=LETTERS[sample(x=c(1,2,3),size=100,replace=T)],
x1=sample(x=c(1,2,3,4,5,6),size=100,replace=T),
x2=sample(x=c(1,2,3,4,5,6),size=100,replace=T),
d =letters[sample(c(1,2,3,4),size=100,replace=T)] )
ggplot()+
geom_bar(data=data, aes(x=method, y=x1),stat="identity") +
facet_wrap(~d, ncol=2)
I would like to color the smallest column of each plot red.
How can I do that?
I'm not sure how you would do it without collapsing your data to be able to create a new column which specifies which value is the minimum. Then you can attach an aesthetic to that value. Here's a collapsing strategy using your data
collapsed <- as.data.frame(xtabs(x1~d+method, data))
collapsed$ismin <- with(collapsed, ave(Freq,d,FUN=function(x) x==min(x)))
And now we plot with
ggplot(collapsed, aes(x=method, y=Freq, fill=as.factor(ismin)))+
geom_bar(stat="identity") +
facet_wrap(~d, ncol=2) +
scale_fill_manual(breaks=c("0","1"), values=c("black","red"), guide="none")
which colours the smallest bar in each facet red.
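To see what the ave() step computes, here is a small base R sketch on a hand-made table (tab and its values are invented for illustration). Within each level of d, the comparison x == min(x) marks the smallest Freq, and ave() writes the result back as 0/1:

```r
# Hand-made stand-in for the collapsed table
tab <- data.frame(
  d      = c("a", "a", "a", "b", "b", "b"),
  method = c("A", "B", "C", "A", "B", "C"),
  Freq   = c(10, 4, 7, 3, 9, 5)
)

# Flag the minimum Freq within each facet group d (1 = minimum, 0 = otherwise)
tab$ismin <- ave(tab$Freq, tab$d, FUN = function(x) x == min(x))
tab
```

If two bars tie for the minimum within a group, both are flagged and both would be drawn red.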

ggplot2 using geom_errorbar and geom_point to add points to a plot

I have a plot using ggplot, and I would like to add points and error bars to it. I am using geom_errorbar and geom_point, but I am getting an error: "Discrete value supplied to continuous scale" and I am not sure why. The data labels in the plot below should remain the same. I simply want to add new points to the existing graph. The new graph should look like the one below, except with two points/CI bars for each label on the Y axis.
The following example is from the lme4 package, and it produces a plot with confidence intervals using ggplot (everything can be replicated except the last two lines of broken code). My data differs only in that it includes about 15 intercepts instead of the 6 below (which is why I am using scale_shape_manual).
The last two lines of code are my attempt at adding points/confidence intervals. I'm going to put a 50 bounty on this. Please let me know if I am being unclear. Thanks!
library("lme4")
data(package = "lme4")
# Dyestuff
# a balanced one-way classification of Yield
# from samples produced from six Batches
summary(Dyestuff)
# Batch is an example of a random effect
# Fit 1-way random effects linear model
fit1 <- lmer(Yield ~ 1 + (1|Batch), Dyestuff)
summary(fit1)
coef(fit1) #intercept for each level in Batch
randoms<-ranef(fit1, postVar = TRUE)
qq <- attr(ranef(fit1, postVar = TRUE)[[1]], "postVar")
rand.interc<-randoms$Batch
#THESE ARE THE ADDITIONAL POINTS TO BE ADDED TO THE PLOT
Inter <- c(-25,-45,20,30,23,67)
SE2 <- c(20,20,20,20,20,20)
df<-data.frame(Intercepts=randoms$Batch[,1],
sd.interc=2*sqrt(qq[,,1:length(qq)]), Intercepts2=Inter, sd.iterc2=SE2,
lev.names=rownames(rand.interc))
df$lev.names<-factor(df$lev.names,levels=df$lev.names[order(df$Intercepts)])
library(ggplot2)
p <- ggplot(df,aes(lev.names,Intercepts,shape=lev.names))
#Added horizontal line at y=0
#Includes first set of points/confidence intervals. This works without error
p <- p + geom_hline(yintercept=0) +geom_errorbar(aes(ymin=Intercepts-sd.interc, ymax=Intercepts+sd.interc), width=0,color="black") + geom_point(aes(size=2))
#Removed legends and with scale_shape_manual point shapes set to 1 and 16
p <- p + guides(size=FALSE,shape=FALSE) + scale_shape_manual(values=c(16,16,16,16,16,16))
#Changed appearance of plot (black and white theme) and x and y axis labels
p <- p + theme_bw() + xlab("Levels") + ylab("")
#Final adjustments of plot
p <- p + theme(axis.text.x=element_text(size=rel(1.2)),
axis.title.x=element_text(size=rel(1.3)),
axis.text.y=element_text(size=rel(1.2)),
panel.grid.minor=element_blank(),
panel.grid.major.x=element_blank())
#To put levels on y axis you just need to use coord_flip()
p <- p+ coord_flip()
print(p)
#####
# code for adding more plots, NOT working yet
p <- p +geom_errorbar(aes(ymin=Intercepts2-sd.interc2, ymax=Intercepts2+sd.interc2),
width=0,color="gray40", lty=1, size=1)
p <- p + geom_point(aes(Intercepts2, lev.names),size=0,pch=7)
First, your data frame df and geom_errorbar() use two different variable names, sd.iterc2 and sd.interc2. I changed the name in df to sd.interc2 as well.
For the last line with geom_point(), you get the error because your x and y values are in the wrong order. Since you are using coord_flip(), the x and y values should be given in the same order as in the original plot before coord_flip(), that is, lev.names as x and Intercepts2 as y. I also changed size= to 5 for better illustration.
+ geom_point(aes(lev.names,Intercepts2),size=5,pch=7)
Update - adding legend
To add a legend for the intercept types, one option is to reshape your data to long format and add a new column with the intercept type. Another option with your existing data is to first remove shape=lev.names from the ggplot() call, then add shape="somename" inside aes() in both geom_point() calls, and finally set the shape values you need with scale_shape_manual().
ggplot(df,aes(lev.names,Intercepts))+
geom_hline(yintercept=0) +
geom_errorbar(aes(ymin=Intercepts-sd.interc, ymax=Intercepts+sd.interc), width=0,color="black")+
geom_point(aes(shape="Intercepts"),size=5)+
theme_bw() + xlab("Levels") + ylab("")+
theme(axis.text.x=element_text(size=rel(1.2)),
axis.title.x=element_text(size=rel(1.3)),
axis.text.y=element_text(size=rel(1.2)),
panel.grid.minor=element_blank(),
panel.grid.major.x=element_blank())+
coord_flip()+
geom_errorbar(aes(ymin=Intercepts2-sd.interc2, ymax=Intercepts2+sd.interc2),
width=0,color="gray40", lty=1, size=1) +
geom_point(aes(lev.names,Intercepts2,shape="Intercepts2"),size=5)+
scale_shape_manual(values=c(16,7))
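The long-format option is not shown above, so here is a rough sketch of how it could look. It uses made-up numbers in place of the lme4 output, and the names df_demo, long, est, half and type are all illustrative; position_dodge() is added here so the two intercept types sit side by side rather than on top of each other.

```r
library(ggplot2)

# Made-up values standing in for the lme4 output in the question
df_demo <- data.frame(
  lev.names   = factor(c("A", "B", "C")),
  Intercepts  = c(-10, 5, 20),
  sd.interc   = c(8, 8, 8),
  Intercepts2 = c(-25, 20, 30),
  sd.interc2  = c(20, 20, 20)
)

# Long format: one row per level and intercept type
long <- rbind(
  data.frame(lev.names = df_demo$lev.names, type = "Intercepts",
             est = df_demo$Intercepts, half = df_demo$sd.interc),
  data.frame(lev.names = df_demo$lev.names, type = "Intercepts2",
             est = df_demo$Intercepts2, half = df_demo$sd.interc2)
)

dodge <- position_dodge(width = 0.5)  # keeps the two types side by side
p_long <- ggplot(long, aes(lev.names, est, shape = type)) +
  geom_hline(yintercept = 0) +
  geom_errorbar(aes(ymin = est - half, ymax = est + half),
                width = 0, position = dodge) +
  geom_point(size = 5, position = dodge) +
  scale_shape_manual(values = c(Intercepts = 16, Intercepts2 = 7)) +
  coord_flip()
```

With the data in this shape, a single geom_errorbar() and a single geom_point() cover both sets of intercepts, and the legend comes from the type column.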
