Adding points, symbols, and legends to ggplot - r

I have created a plot using ggplot (with the DF1 dataset below). I would like two additions to this plot:
1. Add symbols based on the DF.SYMBOL dataset (at the specified times for the two IDs, with a different shape and color for each event).
2. Add a vertical line within each bar, labeled with its CONC value, based on the DF.LINE dataset.
I would appreciate your suggestions!
ID<-rep(c(1,2),each=6)
START <- c(0, 42,57,300,520,710, 0,31,56,85,120,300)
END <- c(42,57,300,520,710,711,31,56,85,120,300,301)
TYPE <- c("S","NR","R","NR","R","R","S","R","NR","R","NR","NR")
DF1 <-data.frame(ID,START,END,TYPE)
DF1
library(ggplot2)
library(magrittr)  # provides the %<>% assignment pipe
# converting ID from numeric to factor
DF1 %<>%
  dplyr::mutate(ID = factor(ID))
ggplot(DF1, aes(y = ID, yend = ID, x = START, xend = END, color = TYPE)) +
  geom_segment(size = 6, lineend = "butt")
DF.SYMBOL dataset, used to add points and symbols to the plot:
ID<-rep(c(1,2),each=2)
EVENT <- rep(c("TBR","PBR"))
TIME <- c(90, 220,120,200)
DF.SYMBOL<-data.frame(ID,EVENT,TIME)
DF.LINE dataset, used to add a vertical line in each bar, with the CONC value shown above the line for each ID:
ID <- c(1,2)
TIME <- c(400, 265)
CONC <- c(23,97)
DF.LINE<-data.frame(ID,TIME, CONC)
Here's the desired plot (edited on powerpoint): symbols based on DF.SYMBOL dataset and black line with value based on DF.LINE dataset.

This should do it. I used geom_errorbarh for the vertical line - I don't know a better way to get a vertical line across a horizontal bar on a discrete scale. For better control of the thickness you might consider changing the geom_segment to a geom_rect.
DF.SYMBOL$ID = factor(DF.SYMBOL$ID)
DF.LINE$ID = factor(DF.LINE$ID)
ggplot(DF1,aes(y=ID))+
geom_segment(aes(yend=ID, x=START, xend=END, color = TYPE),size=6,lineend= "butt") +
geom_point(data = DF.SYMBOL, aes(x = TIME, fill = EVENT, shape = EVENT), size = 3) +  # size was left blank in the original; 3 is an assumed value
scale_shape_manual(values = c(21, 24)) +
scale_fill_manual(values = c("red", "yellow")) +
geom_errorbarh(data = DF.LINE, aes(xmin = TIME, xmax = TIME), height = 0.1) +
geom_text(data = DF.LINE, aes(x = TIME, label = CONC), vjust = -1.5)
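
The geom_rect alternative mentioned in the answer is not spelled out there, so here is a minimal sketch of one way it might look. The helper column YPOS and the value of half_height are assumptions introduced for illustration; they are not part of the original answer.

```r
library(ggplot2)

# Same data as DF1 in the question, with ID already converted to a factor
DF1 <- data.frame(
  ID    = factor(rep(c(1, 2), each = 6)),
  START = c(0, 42, 57, 300, 520, 710, 0, 31, 56, 85, 120, 300),
  END   = c(42, 57, 300, 520, 710, 711, 31, 56, 85, 120, 300, 301),
  TYPE  = c("S", "NR", "R", "NR", "R", "R", "S", "R", "NR", "R", "NR", "NR")
)

# Assumed values: numeric row position per ID and half the bar thickness
DF1$YPOS    <- as.numeric(DF1$ID)
half_height <- 0.15

p_rect <- ggplot(DF1) +
  # Each rectangle spans START..END horizontally and a fixed band vertically
  geom_rect(aes(xmin = START, xmax = END,
                ymin = YPOS - half_height, ymax = YPOS + half_height,
                fill = TYPE)) +
  # Relabel the numeric axis with the original ID levels
  scale_y_continuous(breaks = seq_along(levels(DF1$ID)),
                     labels = levels(DF1$ID), name = "ID")

p_rect
```

With this approach the bar thickness is set in data units (half_height) rather than in line-width units. Note that fill is now used by TYPE, so the EVENT points from the answer would need to be mapped to a different aesthetic, such as color.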

Related

Highlighting lines and gray out rest in multiple line chart with ggplot2?

I generated a multiple line chart in ggplot with different countries, and I want to colour the top 10 and grey out the rest.
When I assign the colours black and red, it colours the first two countries in the legend. However, I want to colour other ones further down the list: US, India, and Brazil in the chart. Help much appreciated, thanks.
This is what I have:
and the code here:
ggplot(data=y, aes(x = Date, y = Deaths, color = Country)) +
geom_line() +
scale_color_manual(values = c("black",
"red",
rep("gray", 196)))
You first need to order your countries according to the number of deaths, then in scale_color_manual you need the first 10 colours to be of your choice, not just the first two:
library(ggplot2)
y$Country <- reorder(y$Country, -y$Deaths)
ggplot(data = y, aes(x = Date, y = Deaths, color = Country)) +
geom_line() +
scale_color_manual(values = c(rep(c("black", "red"), each = 5),
rep("gray", nlevels(y$Country) - 10))) +  # one gray per remaining country
guides(color = guide_none())
Note that since you didn't share your data, I made some up with a similar structure and the same names as yours so that the above code should also work on your own data set.
Made-up data
set.seed(1)
y <- data.frame(
Deaths = c(replicate(198, 1000 * cumprod(runif(100, 1, 1 + sample(10, 1)/100)))),
Date = rep(seq(as.POSIXct('2020-01-01'), as.POSIXct('2022-01-01'), len = 100),
198),
Country = factor(rep(1:198, each = 100)))
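
If the goal is to highlight specific countries by name (the question mentions US, India, and Brazil) rather than the top 10, one option is a named colour vector, so the colours do not depend on factor order. This is only a sketch: the small data frame, the placeholder names Other1 and Other2, and the colour choices are made up for illustration.

```r
library(ggplot2)

# Made-up data: five countries, ten dates each (names are placeholders)
set.seed(1)
countries <- c("US", "India", "Brazil", "Other1", "Other2")
y <- data.frame(
  Country = rep(countries, each = 10),
  Date    = rep(seq(as.Date("2020-01-01"), by = "month", length.out = 10), 5),
  Deaths  = as.vector(replicate(5, cumsum(runif(10, 0, 100))))
)

# Colours for the countries to highlight (assumed choices)
highlight <- c(US = "black", India = "red", Brazil = "blue")

# Start with gray for every country, then overwrite the highlighted ones
cols <- setNames(rep("gray", length(countries)), countries)
cols[names(highlight)] <- highlight

p_highlight <- ggplot(y, aes(x = Date, y = Deaths, color = Country)) +
  geom_line() +
  # Named values are matched to countries by name; breaks limits the legend
  scale_color_manual(values = cols, breaks = names(highlight))

p_highlight
```

Because the values are matched by name, reordering the factor is not needed here, and the legend lists only the highlighted countries.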

plotting stacked points using ggplot

I have a data frame and I would like to stack the points that have overlaps exactly on top of each other.
here is my example data:
value <- c(1.080251e-04, 1.708859e-01, 1.232473e-05, 4.519876e-03,2.914256e-01, 5.869711e-03, 2.196347e-01,4.124873e-01, 5.914052e-03, 2.305623e-03, 1.439013e-01, 5.407597e-03, 7.530298e-02, 7.746897e-03)
names <- letters[1:7]
group <- c(rep("AA", 7), rep("BB", 7))  # must be defined before data.frame() uses it
data <- data.frame(names = rep(names, 2), group = group, value = value, stringsAsFactors = TRUE)
I am using the following command:
p <- ggplot(data, aes(x = names, y = "", color = group)) +
geom_point(aes(size = -log(value)), position = "stack")
plot(p)
But the stacked circles end up outside the grid. I want each one close to, or directly next to, the circle below it. Do you have any idea how I can fix this?
Thanks,
The y-axis has no numeric value, so use the group instead. And we don't need the color legend now since the group labels are shown on the y-axis.
ggplot(data, aes(x = names, y = group, color = group)) +
geom_point(aes(size = -log(value))) +
guides(color = "none")  # FALSE is deprecated here in recent ggplot2 versions

Plot legend for multiple histograms plotted on top of each other ggplot

I've made this multiple histogram plot in ggplot and now I want to add a legend for both the light purple part and the dark purple part. I know the conventional way is to do it with aes, but I can't figure out how to integrate this into my multiple histogram plot.
I don't shy away from manual labour, but more sophisticated solutions are preferred. Can anyone help me out?
#dataframe
set.seed(20)
df <- data.frame(expl = rbinom(n=100, size = 1, prob=0.08),
resp = sample(50:100, size = 100, replace = T))
#graph
graph <- ggplot(data = df, aes(x = resp))
graph +
geom_histogram(fill = "#BEBADA", alpha = 0.5, bins = 10) +
geom_histogram(data = subset(df, expl == '1'), fill = "#BEBADA", bins = 10)
Your data is already in the long format that is well suited for ggplot; you just need to map expl to alpha. In general, if you find yourself making multiples of the same geom, you probably want to rethink either the shape of your data or your approach for feeding it into geoms.
library(tidyverse)
set.seed(20)
df <- data.frame(expl = rbinom(n=100, size = 1, prob=0.08),
resp = sample(50:100, size = 100, replace = T))
To map expl onto alpha, make it a factor, and then assign that to alpha inside your aes. Then you can set the alpha scale to values of 0.5 and 1.
ggplot(df, aes(x = resp, alpha = as.factor(expl))) +
geom_histogram(fill = "#bebada", bins = 10) +
scale_alpha_manual(values = c(0.5, 1))
However, differentiating by alpha is a little awkward. You could instead map to fill and use light and dark purples:
ggplot(df, aes(x = resp, fill = as.factor(expl))) +
geom_histogram(bins = 10) +
scale_fill_manual(values = c("0" = "mediumpurple1", "1" = "mediumpurple4"))
Note also that you can adjust the position of the histogram bars if you need to, by assigning geom_histogram(position = ...), where you could fill in with something such as "dodge" if that's what you'd like.
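
As a sketch of the position adjustment mentioned above, the fill version of the plot could be drawn with side-by-side bars like this. The legend title "expl" is an assumption added for readability.

```r
library(ggplot2)

# Same example data as in the question
set.seed(20)
df <- data.frame(expl = rbinom(n = 100, size = 1, prob = 0.08),
                 resp = sample(50:100, size = 100, replace = TRUE))

p_dodge <- ggplot(df, aes(x = resp, fill = as.factor(expl))) +
  # position = "dodge" places the two groups next to each other in each bin
  geom_histogram(bins = 10, position = "dodge") +
  scale_fill_manual(values = c("0" = "mediumpurple1", "1" = "mediumpurple4"),
                    name = "expl")

p_dodge
```

The default position for geom_histogram is "stack", so switching to "dodge" changes how the two groups are compared: bars for expl == 1 start at zero instead of sitting on top of the expl == 0 bars.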
If you want a legend on the alpha value, the idea is to include it as an aesthetic rather than as a direct argument as you tried. In order to do this, a simple solution is to enrich the data frame used by ggplot:
df2 <- rbind(
cbind(df, filter="all lines"),
cbind(subset(df, expl == '1'), filter="expl==1")
)
df2 is df with the rows from your subset of interest appended (the filter column records which copy each row comes from).
Then this solves your problem:
ggplot(df2, aes(resp, alpha=filter)) +
geom_histogram(fill="#BEBADA", bins=10, position="identity") +
scale_alpha_discrete(range=c(.5,1))

Facet wrap radar plot with three apexes in R

I have created the following plot which gives the shape of the plot I desire. But when I facet wrap it, the shapes no longer remain triangular and become almost cellular. How can I keep the triangular shape after faceting?
Sample data:
lvls <- c("a","b","c","d","e","1","2","3","4","5","6","7","8","9","10","11","12","13","14","15")
df <- data.frame(Product = factor(rep(lvls, 3)),
variable = c(rep("Ingredients", 20),
rep("Defence", 20),
rep("Benefit", 20)),
value = rnorm(60, mean = 5))
Now when I use this code, I get the shapes I desire.
ggplot(df,
aes(x = variable,
y = value,
color = Product,
group = Product)) +
geom_polygon(fill = NA) +
coord_polar()
However, the products are all on top of one another so ideally I would like to facet wrap.
ggplot(df,
aes(x = variable,
y = value,
color = Product,
group = Product)) +
geom_polygon(fill = NA) +
coord_polar() +
facet_wrap(~Product)
But when I facet wrap, the shapes become oddly cellular and not triangular (straight lines from point to point). Any ideas on how to alter this output?
Thanks.

Coloring a geom_histogram by gradient

I'm trying to plot a geom_histogram where the bars are colored by a gradient.
This is what I'm trying to do:
library(ggplot2)
set.seed(1)
df <- data.frame(id=paste("ID",1:1000,sep="."),val=rnorm(1000),stringsAsFactors=F)
ggplot(df, aes_string(x = "val", y = "..count..+1", fill = "val")) +
  geom_histogram(binwidth = 1, pad = TRUE) +
  scale_y_log10() +
  scale_fill_gradient2("val", low = "darkblue", high = "darkred")
But getting:
Any idea how to get it colored by the defined gradient?
Not sure you can fill by val because each bar of the histogram represents a collection of points.
You can, however, fill by categorical bins using cut. For example:
ggplot(df, aes(val, fill = cut(val, 100))) +
geom_histogram(show.legend = FALSE)
Just for completeness: if you want to choose the gradient colors manually, here's what I suggest.
data:
library(ggplot2)
set.seed(1)
df <- data.frame(id=paste("ID",1:1000,sep="."),val=rnorm(1000),stringsAsFactors=F)
colors:
bins <- 10
cols <- c("darkblue","darkred")
colGradient <- colorRampPalette(cols)
cut.cols <- colGradient(bins)
cuts <- cut(df$val,bins)
names(cuts) <- sapply(cuts,function(t) cut.cols[which(as.character(t) == levels(cuts))])
plot:
ggplot(df,aes(val,fill=cut(val,bins))) +
geom_histogram(show.legend=FALSE) +
scale_color_manual(values=cut.cols,labels=levels(cuts)) +
scale_fill_manual(values=cut.cols,labels=levels(cuts))
Instead of binning manually another option would be to make use of the bins computed by stat_bin by mapping ..x.. (or factor(..x..) in case of a discrete scale) or after_stat(x) on the fill aesthetic.
An issue with computing the bins manually is that we end up with multiple groups per bin for which the count has to be computed (even if the count is zero most of the time) and which get stacked on top of each other in the histogram. Especially, this gets problematic if one would add labels of counts to the histogram as can be seen in this post, because in that case one ends up with multiple labels per bin.
library(ggplot2)
set.seed(1)
df <- data.frame(id = paste("ID", 1:1000, sep = "."), val = rnorm(1000), stringsAsFactors = F)
ggplot(df, aes(x = val, y = ..count.. + 1, fill = ..x..)) +
geom_histogram(binwidth = .1, pad = TRUE) +
scale_y_log10() +
scale_fill_gradient2(name = "val", low = "darkblue", high = "darkred")
#> Warning: Duplicated aesthetics after name standardisation: pad
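
The answer mentions after_stat(x) as an alternative to ..x.. but does not show it. A minimal sketch of that spelling, assuming ggplot2 3.3.0 or later and leaving out the log scale and pad argument for brevity, might look like this:

```r
library(ggplot2)

set.seed(1)
df <- data.frame(val = rnorm(1000))

p_grad <- ggplot(df, aes(x = val, fill = after_stat(x))) +
  # after_stat(x) is the bin centre computed by stat_bin, so each bar
  # gets a single fill value and there is only one group per bin
  geom_histogram(binwidth = 0.1) +
  scale_fill_gradient2(name = "val", low = "darkblue", high = "darkred")

p_grad
```

Because fill is mapped to a continuous computed variable, the data are not split into groups before binning, which avoids the multiple-groups-per-bin issue described above.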
