I am trying to create a graph similar to the one in this picture.
You can see that they have flipped the direction of the blue bars even though they have positive x values. Right now, I am able to reproduce that bar graph but with the bars in the same direction. Is it possible to create this same type of graph in ggplot with the flipped bars and positive x values?
Here is a tidyverse solution
Libraries
library(tidyverse)
Data
df <-
tibble(
y = letters[1:15],
p = runif(15,5,100),
g = as.factor(rep(0:1,c(5,10)))
)
Code
df %>%
#Create an auxiliary variable that is negative for one group, so that group's bars point left
mutate(
p2 = if_else(g == 0, -p,p),
y = fct_reorder(y,p2)
) %>%
ggplot(aes(p2,y, fill = g))+
geom_col()+
scale_x_continuous(
breaks = seq(-100,100,10),
#Make the labels positive
labels = seq(-100,100,10) %>% abs()
)
Output
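A slightly shorter variant of the same idea, as a minimal sketch: instead of building the break sequence twice, the function abs can be passed as labels, so whatever breaks the scale picks are shown as positive numbers. The seed, the axis limits and the object name mirrored are assumptions made for illustration, not part of the answer above.

```r
library(ggplot2)

set.seed(42)  # assumed seed, only so the sketch is reproducible
df <- data.frame(
  y = letters[1:15],
  p = runif(15, 5, 100),
  g = factor(rep(0:1, c(5, 10)))
)

# Mirror group 0 to the left by negating its values
df$p2 <- ifelse(df$g == "0", -df$p, df$p)

mirrored <- ggplot(df, aes(p2, reorder(y, p2), fill = g)) +
  geom_col() +
  # abs() is applied to the breaks, so the axis reads as positive on both sides
  scale_x_continuous(labels = abs, limits = c(-100, 100)) +
  labs(x = "p", y = NULL)
mirrored
```

The data keep the original positive column p, so any summary statistics can still be computed from it; only the plotting column p2 is signed.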
I'm trying to get a plot_ly (in R) faceted histogram plot to look like a ggplot2 plot, using facets.
I can see this question How to facet a plot_ly() chart?, which allows me to make a faceted histogram plot, but although I can fix the chosen bins, I can't fix the x axis title to be consistent, nor the range of the x axis, nor can I choose different colour for the individual histogram facets.
The following works as a minimal example:
library(plotly)
library(dplyr)
x <- data.frame(Ancestry = as.factor(sample(1:7,200, replace=T)), Est.Age = rnorm(200, mean=50, sd=20))
x %>% group_by(Ancestry) %>%
group_map (~ plot_ly(data = ., x = ~Est.Age, color = ~Ancestry,
type = "histogram", nbinsx = 18, bingroup = 1), .keep = TRUE) %>%
subplot(nrow=3, shareX=TRUE) %>% layout(xaxis = list(title = "Age"))
This code snippet produces the following plot (or similar, depending on the random number):
What I would like to see is a consistent x-axis across all plots (for comparison purposes), and the same x-axis title ("Age" in this case). I would also like to change the colour of the individual plots in the facets to be consistent with other plots I'm generating on the same dataset, which aren't faceted. How can I do this with plot_ly in R?
EDIT: I should say that I want the facets based on a factor in my dataframe, and I want the colours to be based on a list of colours in the same order as the factors in the dataframe.
Here is one possible way using ggplotly:
library(ggplot2)
library(plotly)
p <- ggplot(x, aes(x = Est.Age, fill = Ancestry)) +
  geom_histogram(bins = 10) +
  facet_wrap(. ~ Ancestry)
ggplotly(p)
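To also cover the colour requirement from the EDIT (one colour per factor level, in level order), a named colour vector can be passed to scale_fill_manual. This is only a sketch: the palette pal is a hypothetical stand-in for the colours used in the other plots, and the seed is assumed. facet_wrap shares the x axis across panels by default, and labs() sets a single axis title.

```r
library(ggplot2)

set.seed(1)  # assumed seed for reproducibility
x <- data.frame(
  Ancestry = factor(sample(1:7, 200, replace = TRUE)),
  Est.Age  = rnorm(200, mean = 50, sd = 20)
)

# Hypothetical palette: one colour per factor level, named in level order
pal <- setNames(
  grDevices::hcl.colors(nlevels(x$Ancestry), palette = "Dark 3"),
  levels(x$Ancestry)
)

p <- ggplot(x, aes(x = Est.Age, fill = Ancestry)) +
  geom_histogram(bins = 18) +
  facet_wrap(~ Ancestry, nrow = 3) +   # fixed (shared) x scale by default
  scale_fill_manual(values = pal) +    # colours matched to levels by name
  labs(x = "Age")
p
```

Passing p to plotly::ggplotly() should then give the interactive version with the same colours and axis title.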
I am trying to combine a line plot and horizontal barplot on the same plot. The difficult part is that the barplot is actually counts of the y values of the line plot.
Can someone show me how this can be done using the example below?
library(ggplot2)
library(plyr)
x <- c(1:100)
dff <- data.frame(x = x,y1 = sample(-500:500,size=length(x),replace=T), y2 = sample(3:20,size=length(x),replace=T))
counts <- ddply(dff, ~ y1, summarize, y2 = sum(y2))
# line plot
ggplot(data=dff) + geom_line(aes(x=x,y=y1))
# bar plot
ggplot() + geom_bar(data=counts,aes(x=y1,y=y2),stat="identity")
I believe what I need is presented in the pseudocode below but I do not know how to write it out in R.
Apologies. I actually meant the secondary x axis representing the value of counts for the barplot, while primary y-axis is the y1.
ggplot(data=dff) + geom_line(aes(x=x,y=y1)) + geom_bar(data=counts , aes(primary y axis = y1,secondary x axis =y2),stat="identity")
I just want the bars to be plotted horizontally, so I tried the code below, which flips both the line chart and the barplot. That is also not what I wanted.
ggplot(data=dff) +
geom_line(aes(x=x,y=y1)) +
geom_bar(data=counts,aes(x=y2,y=y1),stat="identity") + coord_flip()
You can combine two plots in ggplot like you want by specifying different data = arguments in each geom_ layer (and none in the original ggplot() call).
ggplot() +
geom_line(data=dff, aes(x=x,y=y1)) +
geom_bar(data=counts,aes(x=y1,y=y2),stat="identity")
The following plot is the result. However, since x and y1 have different ranges, are you sure this is what you want?
Perhaps you want y1 on the vertical axis for both plots. Something like this works:
ggplot() +
geom_line(data=dff, aes(x=y1 ,y = x)) +
geom_bar(data=counts,aes(x=y1,y=y2),stat="identity", color = "red") +
coord_flip()
Maybe you are looking for this. Based on your last code, you are looking for a double axis. Using dplyr you can store the counts in the same dataframe and then plot all variables. Here is the code:
library(ggplot2)
library(dplyr)
#Data
x <- c(1:100)
dff <- data.frame(x = x,y1 = sample(-500:500,size=length(x),replace=T), y2 = sample(3:20,size=length(x),replace=T))
#Code
dff %>% group_by(y1) %>% mutate(Counts=sum(y2)) -> dff2
#Scale factor
sf <- max(dff2$y1)/max(dff2$Counts)
# Plot
ggplot(data=dff2)+
geom_line(aes(x=x,y=y1),color='blue',size=1)+
geom_bar(stat='identity',aes(x=x,y=Counts*sf),fill='tomato',color='black')+
scale_y_continuous(name="y1", sec.axis = sec_axis(~./sf, name="Counts"))
Output:
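If the bars should really run horizontally while the line keeps y1 on the vertical axis, one option is to draw each bar as a horizontal segment and put the counts on a secondary x axis. This is a minimal sketch, not a method from the answers above: it swaps geom_bar for geom_segment, uses base aggregate() for the counts, and the seed and scale factor sf are assumptions.

```r
library(ggplot2)

set.seed(1)  # assumed seed for reproducibility
dff <- data.frame(
  x  = 1:100,
  y1 = sample(-500:500, 100, replace = TRUE),
  y2 = sample(3:20, 100, replace = TRUE)
)

# Sum of y2 for every distinct y1 value
counts <- aggregate(y2 ~ y1, data = dff, FUN = sum)

# Scale factor that maps the counts onto the range of x
sf <- max(dff$x) / max(counts$y2)

ggplot() +
  # one horizontal segment per y1 value, its length is the scaled count
  geom_segment(data = counts,
               aes(x = 0, xend = y2 * sf, y = y1, yend = y1),
               colour = "tomato") +
  geom_line(data = dff, aes(x = x, y = y1)) +
  # secondary x axis undoes the scaling so it reads in count units
  scale_x_continuous(name = "x",
                     sec.axis = sec_axis(~ . / sf, name = "Counts")) +
  labs(y = "y1")
```

No coord_flip() is needed here, so the line plot keeps its original orientation.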
I'm trying to create a forest plot with R plotly where I want to color code the effect sizes (points) and their error bars by their corresponding p-values.
Here are toy data:
set.seed(1)
factors <- paste0(1:25,":age")
effect.sizes <- rnorm(25,0,1)
effect.errors <- abs(rnorm(25,0,1))
p.values <- runif(25,0,1)
Here's what I'm trying:
library(dplyr)
plotly::plot_ly(type='scatter',mode="markers",y=~factors,x=~effect.sizes,color=~p.values,colors=grDevices::colorRamp(c("darkred","gray"))) %>%
plotly::add_trace(error_x=list(array=effect.errors),marker=list(color=~p.values,colors=grDevices::colorRamp(c("darkred","gray")))) %>%
plotly::colorbar(limits=c(0,1),len=0.4,title="P-Value") %>%
plotly::layout(xaxis=list(title="Effect Size",zeroline=T,showticklabels=T),yaxis=list(title="Factor",zeroline=F,showticklabels=T))
which gives me:
Which is pretty close to what I want except for:
I'd like the error bars to be colored similar to the effect sizes (by the corresponding p-values).
Remove the two trace legends below the colorbar
Have the order of the labels on the y-axis be that of factors
Any idea?
Okay, it took me a while to warm up my plotly skills. Since your first point was the most difficult, I will go through your points in reverse order.
Point 3 (order of the y-axis labels): this can be achieved by setting categoryorder and categoryarray in the yaxis list of the layout (cf. motos answer here).
Point 2 (trace legends): set showlegend=FALSE.
Point 1 (error bar colours): that was tricky. I had to move your second line (the error bars) into the first call, i.e. into the plot_ly function, and add a color vector to it. I used split to allow the correct coloring by group, and added the color for the points in a marker list. In addition, I converted the p.values to hex colors via colorRamp, because every simpler solution didn't work for me.
Looks like this:
The code (the colorbar created some issues):
### Set category order
yform <- list(categoryorder = "array",
categoryarray = rev(factors),
title="Factor",zeroline=F,showticklabels=T)
### set the color scale and convert it to hex
library(grDevices)
mycramp<-colorRamp(c("darkred","gray"))
mycolors<-rgb(mycramp(p.values),maxColorValue = 255)
### plot without the adjusted colorbar
library(plotly)
### Without colorbar adjustment
plot_ly(type='scatter',mode="markers",y=~factors,x=~effect.sizes,
color=~p.values,colors=grDevices::colorRamp(c("darkred","gray")),
error_x=list(array=effect.errors,color=mycolors),split=factors,showlegend=FALSE,marker=list(color=mycolors)) %>%
layout(xaxis=list(title="Effect Size",zeroline=T,showticklabels=T),yaxis=yform)
### The colorbar-adjustment kicks out the original colors of the scatter points. Either you plot them over
plot_ly(type='scatter',mode="markers",y=~factors,x=~effect.sizes,
color=~p.values,colors=grDevices::colorRamp(c("darkred","gray")),
error_x=list(array=effect.errors,color=mycolors),split=factors,showlegend=FALSE,marker=list(color=mycolors)) %>%
layout(xaxis=list(title="Effect Size",zeroline=T,showticklabels=T),yaxis=yform) %>%
colorbar(limits=c(0,1),len=0.4,title="P-Value",inherit=FALSE) %>%
add_trace(type='scatter',mode="markers",y=~factors,x=~effect.sizes,
showlegend=FALSE,marker=list(color=mycolors),inherit=FALSE) %>%
layout(xaxis=list(title="Effect Size",zeroline=T,showticklabels=T),yaxis=yform)
### or you try to set the colorbar before the plot. This results in some warnings
plot_ly() %>%
colorbar(limits=c(0,1),len=0.4,title="P-Value",inherit=FALSE) %>%
add_trace(type='scatter',mode="markers",y=~factors,x=~effect.sizes,
color=~p.values,colors=grDevices::colorRamp(c("darkred","gray")),
error_x=list(array=effect.errors,color=mycolors),split=factors,showlegend=FALSE,marker=list(color=mycolors)) %>%
layout(xaxis=list(title="Effect Size",zeroline=T,showticklabels=T),yaxis=yform)
Just odd that this first point was so difficult to solve and results in such a big code bracket, because normally plotly supports that pipe logic quite well and you get a very readable code with all the add-functions.
I expected e.g., some add_errorbar-function, but apparently you have to add the errorbars in the plot_ly-function and the color-vector for the errors only works if you use the split-function. If someone would like to comment or post an alternative answer with more readable code on this, that would be interesting.
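The colour conversion step used above can be looked at in isolation. A small sketch using only grDevices: colorRamp() returns a function that maps numbers in [0, 1] to RGB rows, and rgb() turns those rows into hex strings. The names ramp and hex are illustrative.

```r
set.seed(1)
p.values <- runif(25, 0, 1)

# colorRamp() gives a function from [0, 1] to a matrix of R, G, B values (0-255)
ramp <- grDevices::colorRamp(c("darkred", "gray"))

# rgb() accepts that 3-column matrix and returns one hex string per row
hex <- grDevices::rgb(ramp(p.values), maxColorValue = 255)
head(hex)

# The end points of the ramp are the two input colours
grDevices::rgb(ramp(c(0, 1)), maxColorValue = 255)
```

Because hex has one entry per p-value, it can be handed to both marker and error_x as done in the plot_ly calls above.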
Here is an idea by constructing first a ggplot2 graph and using ggplotly:
create a data frame :
df <- data.frame(factors = factor(factors, levels = factors), #note the order of the levels which determines the order of the y axes
effect.sizes = effect.sizes,
effect.errors = effect.errors,
p.values = p.values)
create the ggplot graph:
library(ggplot2)
library(plotly)
ggplot(df)+
geom_vline(xintercept = 0, color = "grey50") +
geom_point(aes(y = factors,
x = effect.sizes,
color = p.values)) +
geom_errorbarh(aes(y = factors,
xmin = effect.sizes - effect.errors,
xmax = effect.sizes + effect.errors,
x = effect.sizes,
color = p.values)) +
scale_color_continuous(low = "darkred", high = "gray")+
theme_bw() +
xlab("Effect Sizes")+
ylab("Factors") +
theme(panel.border = element_blank(),
plot.margin = margin(1, 1, 1, 1, "cm")) -> p1
ggplotly(p1)
data:
set.seed(1)
factors <- paste0(1:25,":age")
effect.sizes <- rnorm(25,0,1)
effect.errors <- abs(rnorm(25,0,1))
p.values <- runif(25,0,1)
I have a scatter plot now. Each color represent a categorical group and each group has a range of values which are on the x-axis. There should not be any overlapping between the range of categorical variables. However, because of the thickness of scatter points, it looks like that there is overlapping. So, I want to draw a line to connect the maximum point of the group and the minimum point of the adjacent group so that as long as the line does not have a negative slope, it can show that there is no overlapping between each categorical variable.
I do not know how to use geom_line() to connect two points when the y-coordinate is a categorical variable. Is it possible to do so?
Any help would be appreciated!!!
It sounds like you want geom_segment not geom_line. You'll need to aggregate your data into a new data frame that has the points you want plotted. I adapted Brian's sample data and use dplyr for this:
# sample data
df <- data.frame(xvals = runif(50, 0, 1))
df$cats <- cut(df$xvals, c(0, .25, .625, 1))
# aggregation
library(dplyr)
df_summ = df %>% group_by(cats) %>%
summarize(min = min(xvals), max = max(xvals)) %>%
mutate(adj_max = lead(max),
adj_min = lead(min),
adj_cat = lead(cats))
# plot
ggplot(df, aes(xvals, cats, colour = cats)) +
geom_point() +
geom_segment(data = df_summ, aes(
x = max,
xend = adj_min,
y = cats,
yend = adj_cat
))
You can keep the segments colored as the previous category, or maybe set them to a neutral color so they don't stand out as much.
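The same "no negative slope" check can also be done numerically rather than by eye. A minimal base R sketch on the same kind of sample data; the seed and the names mins, maxs and gaps are assumptions for illustration.

```r
set.seed(1)  # assumed seed for reproducibility
df <- data.frame(xvals = runif(50, 0, 1))
df$cats <- cut(df$xvals, c(0, .25, .625, 1))

# Smallest and largest x value within each category (in level order)
mins <- tapply(df$xvals, df$cats, min)
maxs <- tapply(df$xvals, df$cats, max)

# Distance from each category's maximum to the next category's minimum;
# a negative value would mean the two ranges overlap
gaps <- mins[-1] - maxs[-length(maxs)]
no_overlap <- all(gaps >= 0)
no_overlap
```

Each element of gaps corresponds to one of the connecting segments in the plot, so a non-negative gap is the same thing as a segment without a negative slope.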
My reading comprehension failed me, so I misunderstood the question. Ignore this answer unless you want to learn about the lineend = argument of geom_line.
# generate dummy data
df <- data.frame(xvals = runif(1000, 0, 1))
# these categories were chosen to line up
# with tick marks to show they don't overlap
df$cats <- cut(df$xvals, c(0, .25, .625, 1))
ggplot(df, aes(xvals, cats, colour = cats)) +
geom_line(size = 3)
The caveat is that there is a lineend = argument to geom_line. The default is butt, so that lines end exactly where you want them to and butt up against things, but sometimes that's not the right look. In this case, the other options would cause visual overlap, as you can see with the gridlines.
With lineend = "square":
With lineend = "round":
This is a continuation of the question here: Create non-overlapping stacked area plot with ggplot2
I have a ggplot2 area chart created by the following code. I want the labels from name to be aligned on the right side of the graph. I think directlabels might work, but I am willing to try whatever is most clever.
require(ggplot2)
require(plyr)
require(RColorBrewer)
require(RCurl)
require(directlabels)
link <- getURL("http://dl.dropbox.com/u/25609375/so_data/final.txt")
dat <- read.csv(textConnection(link), sep=' ', header=FALSE,
col.names=c('count', 'name', 'episode'))
dat <- ddply(dat, .(episode), transform, percent = count / sum(count))
# needed to make geom_area not freak out because of missing value
dat2 <- rbind(dat, data.frame(count = 0, name = 'lane',
episode = '02-tea-leaves', percent = 0))
g <- ggplot(arrange(dat2,name,episode), aes(x=episode,y=percent)) +
geom_area(aes(fill=name, group = name), position='stack') + scale_fill_brewer()
g1 <- g + geom_dl(method='last.points', aes(label=name))
I'm brand new to directlabels and not really sure how to get the labels to align to right side of the graph with the same colors as the areas.
You can use a simple geom_text to add labels. First, subset your data set to get the final x value:
dd=subset(dat, episode=="06-at-the-codfish-ball")
Then order the data frame by factor level:
dd = dd[with(dd, order(name, levels(dd$name))),]
Then work out the cumulative percent for plotting:
dd$cum = cumsum(dd$percent)
Then just use a standard geom_text call:
g + geom_text(data=dd, aes(x=6, y=cum, label=name))
Oh, and you may want to angle your x-axis labels to avoid over plotting:
g + theme(axis.text.x = element_text(angle = -25, hjust = 0.5, size = 8))
Graph
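The cumulative sum above puts each label at the top edge of its band. If the labels should sit in the middle of each band instead, half of the band's height can be subtracted. A small sketch on hypothetical toy data (the names and percentages are made up, since the original file may no longer be reachable); it also assumes the bands are stacked in the same order as the rows, which can differ between ggplot2 versions.

```r
# Hypothetical shares for the last episode, one row per name
dd <- data.frame(
  name    = factor(c("don", "peggy", "pete")),
  percent = c(0.5, 0.3, 0.2)
)

# Order rows by factor level so the cumulative sum follows the stacking order
dd <- dd[order(dd$name), ]

dd$cum <- cumsum(dd$percent)          # top edge of each band
dd$mid <- dd$cum - dd$percent / 2     # vertical midpoint of each band
dd
```

The mid column can then replace cum in the geom_text call, for example aes(x = 6, y = mid, label = name).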