ggplot2 multi-histogram graph plotting only single histogram - r

ggplot(d,aes(x= `Log Number`)) +
geom_histogram(data=subset(d,state == 'c'),fill = "red", alpha = 0.2) +
geom_histogram(data=subset(d,state == 'l'),fill = "blue", alpha = 0.2) +
geom_histogram(data=subset(d,state == 't'),fill = "green", alpha = 0.2)
d is a dataset that contains only two columns: `Log Number`, which is a long list of numbers, and state, which is a factor with 3 levels (c, l, t).
I tried to use this to plot overlapping histograms, but it just returns a single one. Thanks

You want to fill by state:
ggplot(d, aes(x = `Log Number`, fill = state)) + geom_histogram()
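Note that geom_histogram() stacks the groups by default, so the one-liner above gives stacked rather than overlapping bars. A minimal sketch of the overlapping version follows, assuming simulated data in place of the asker's d; the bin count, alpha value and colours are arbitrary choices for illustration:

```r
library(ggplot2)

# Simulated stand-in for the asker's data (assumption: real values differ)
set.seed(1)
d <- data.frame(
  `Log Number` = rnorm(300, mean = 6, sd = 0.5),
  state = factor(sample(c("c", "l", "t"), 300, replace = TRUE)),
  check.names = FALSE  # keep the space in the column name
)

# position = "identity" draws each group's bars from zero instead of stacking,
# and alpha makes the overlapping regions visible
p <- ggplot(d, aes(x = `Log Number`, fill = state)) +
  geom_histogram(position = "identity", alpha = 0.4, bins = 30) +
  scale_fill_manual(values = c(c = "red", l = "blue", t = "green"))
p
```

Mapping fill inside aes() also produces a legend, which the three separate geom_histogram() layers with fixed fill colours do not.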

Hmm, I don't know, I think your data is wrong. Worked for me:
lon <- log( rnorm(1000,exp(6) ))
state <- sample(c("c","l","t"),1000,replace=TRUE)
d <- data.frame(lon,state)
names(d) <- c("Log Number","state")
head(d)
yields the following data:
  Log Number state
1   5.999955     t
2   5.997907     c
3   6.002452     l
4   5.994471     l
5   5.997306     l
6   6.000798     t
And then the plot:
ggplot(d,aes(x= `Log Number`)) +
geom_histogram(data=subset(d,state == 'c'),fill = "red", alpha = 0.2) +
geom_histogram(data=subset(d,state == 'l'),fill = "blue", alpha = 0.2) +
geom_histogram(data=subset(d,state == 't'),fill = "green", alpha = 0.2)
looks like this:

Related

How to draw a multi-colored dashed line (alternating colors for visual effect) [duplicate]

This question already has answers here:
Alternating color of individual dashes in a geom_line
I was wondering if it is possible to create a multicolored dashed line in ggplot.
Basically I have a plot displaying savings based on two packages.
An orange line with savings based on package A
A green line with savings based on package B
I also have a third line, and I would like that one to be dashed, alternating between orange and green. Has anybody been able to do that?
Here is an example:
library(tidyverse)
S <- seq(0, 5, by = 0.05)
a <- S ^ 2
b <- S
a_b = a + b #This data should have the dashed multicolor line, since it is the sum of the other two lines.
S <- data.frame(S)
temp <- cbind(S, a, b, a_b)
temp <- gather(temp, variable, value, -S)
desiredOrder <- c("a", "b", "a_b")
temp$variable <- factor(temp$variable, levels = desiredOrder)
temp <- temp[order(temp$variable),]
p <- ggplot(temp, aes(x = S, y = value, colour = variable)) +
theme_minimal() +
geom_line(size = 1) +
scale_color_manual(name = "Legend", values = c("orange", "green", "#0085bd"),
breaks = c("a", "b", "a_b"))
p
I basically want to have a multicolored (dashed or dotted) line for "a_b".
This is, to my best knowledge, currently only possible via creation of new segments for each alternate color. This is fiddly.
Below I've tried a largely programmatic approach in which you can define the size of the repeating segment (based on your x unit). The positioning of y values is slightly convoluted, and it will also result in slightly irregular segment lengths when dealing with different slopes. I haven't tested it on much data, either. But I guess it's a good start :)
For the legend, I'm taking the same approach, by creating a fake legend and stitching it onto the other plot. The challenges here include:
positioning of legend elements relative to the plot
relative distance between the legend elements
update
For a much neater way to create those segments and a Stat implementation see this thread
library(tidyverse)
library(patchwork)
S <- seq(0, 5, by = 0.05)
a <- S^2
b <- S
a_b <- a + b
df <- data.frame(x = S, a, b, a_b) %>%
pivot_longer(-x, names_to = "variable", values_to = "value")
## a function to create modifiable cuts in order to get segments.
## this looks convoluted - and it is! there are a few if/else statements.
## Why? The assignment of new y to x values depends on how many original values
## you have.
## There might be more direct ways to get there
alt_colors <- function(df, x, y, seg_length, my_cols) {
x <- df[[x]]
y <- df[[y]]
## create new x for each tiny segment
length_seg <- seg_length / length(my_cols)
new_x <- seq(min(x, na.rm = TRUE), x[length(x)], length_seg)
## now we need to interpolate y values for each new x
## This is different depending on how many x and new x you have
if (length(new_x) < length(x)) {
ind_int <- findInterval(new_x, x)
new_y <- sapply(seq_along(ind_int), function(i) {
if (y[ind_int[i]] == y[ind_int[length(ind_int)]]) {
y[ind_int[i]]
} else {
seq_y <- seq(y[ind_int[i]], y[ind_int[i] + 1], length.out = length(my_cols))
head(seq_y, -1)
}
})
} else {
ind_int <- findInterval(new_x, x)
rle_int <- rle(ind_int)
new_y <- sapply(rle_int$values, function(i) {
if (y[i] == y[max(rle_int$values)]) {
y[i]
} else {
seq_y <- seq(y[i], y[i + 1], length.out = rle_int$lengths[i] + 1)
head(seq_y, -1)
}
})
}
## This is also a bit painful and might cause other bugs that I haven't
## discovered yet.
if (length(unlist(new_y)) < length(new_x)) {
newdat <- data.frame(
x = new_x,
y = rep_len(unlist(new_y), length.out = length(new_x))
)
} else {
newdat <- data.frame(x = new_x, y = unlist(new_y))
}
newdat <- newdat %>%
mutate(xend = lead(x), yend = lead(y)) %>%
drop_na(xend)
## repeat the colors explicitly so that any number of segments works
newdat$color <- rep_len(my_cols, nrow(newdat))
newdat
}
## the below is just a demonstration of how the function would work
## using different segment widths
df_alt1 <-
df %>%
filter(variable == "a_b") %>%
alt_colors("x", "value", 1, c("orange", "green"))
df_alt.5 <-
df %>%
filter(variable == "a_b") %>%
alt_colors("x", "value", .5, c("orange", "green"))
df_ab <-
df %>%
filter(variable != "a_b") %>%
# for the identity mapping
mutate(color = ifelse(variable == "a", "green", "orange"))
## create data frame for the legend, also using the alt_colors function as per above
## the amount of x is a bit of trial and error, this is just a quick hack
## this is a trick to center the legend more or less relative to the main plot
y_leg <- ceiling(mean(range(df$value, na.rm = TRUE)))
dist_y <- 2
df_legend <-
data.frame(
variable = rep(unique(df$variable), each = 2),
x = 1:2,
y = rep(seq(y_leg - dist_y, y_leg + dist_y, by = dist_y), each = 2)
)
df_leg_onecol <-
df_legend %>%
filter(variable != "a_b") %>%
mutate(color = ifelse(variable == "a", "green", "orange"))
df_leg_alt <-
df_legend %>%
filter(variable == "a_b") %>%
alt_colors("x", "y", .5, c("orange", "green"))
## I am mapping the colors globally using identity mapping (see scale_identity).
p1 <-
ggplot(mapping = aes(x, value, colour = color)) +
theme_minimal() +
geom_line(data = df_ab, size = 1) +
geom_segment(data = df_alt1, aes(y = y, xend = xend, yend = yend), size = 1) +
scale_color_identity() +
ggtitle("alternating every 1 unit")
p.5 <-
ggplot(mapping = aes(x, value, colour = color)) +
theme_minimal() +
geom_line(data = df_ab, size = 1) +
geom_segment(data = df_alt.5, aes(y = y, xend = xend, yend = yend), size = 1) +
scale_color_identity() +
ggtitle("alternating every .5 unit")
p_leg <-
ggplot(mapping = aes(x, y, colour = color)) +
theme_void() +
geom_line(data = df_leg_onecol, size = 1) +
geom_segment(data = df_leg_alt, aes(xend = xend, yend = yend), size = 1) +
scale_color_identity() +
annotate(
geom = "text", y = unique(df_legend$y), label = unique(df_legend$variable),
x = max(df_legend$x + 1), hjust = 0
)
## set y limits to the range of the main plot
## in order to make the labels visible you need to adjust the plot margin and
## turn clipping off
p1 + p.5 +
(p_leg + coord_cartesian(ylim = range(df$value), clip = "off") +
theme(plot.margin = margin(r = 20, unit = "pt"))) +
plot_layout(widths = c(1, 1, .2))
Created on 2022-01-18 by the reprex package (v2.0.1)
(Copied this over from Alternating color of individual dashes in a geom_line)
Here's a ggplot hack that is simple, but works for two colors only. It results in two lines being overlaid: one a solid line, the other a dashed line.
library(dplyr)
library(ggplot2)
library(reshape2)
# Create df
x_value <- 1:10
group1 <- c(0,1,2,3,4,5,6,7,8,9)
group2 <- c(0,2,4,6,8,10,12,14,16,18)
dat <- data.frame(x_value, group1, group2) %>%
mutate(group2_2 = group2) %>% # Duplicate the column that you want to be alternating colors
melt(id.vars = "x_value", variable.name = "group", value.name ="y_value") # Long format
# Put in your selected order
dat$group <- factor(dat$group, levels=c("group1", "group2", "group2_2"))
# Plot
ggplot(dat, aes(x=x_value, y=y_value)) +
geom_line(aes(color=group, linetype=group), size=1) +
scale_color_manual(values=c("black", "red", "black")) +
scale_linetype_manual(values=c("solid", "solid", "dashed"))
Unfortunately the legend still needs to be edited by hand. Here's the example plot.

ggplot2 - adding lines of same color but different type to legend

Given some data like:
my.data <- data.frame(time = rep(1:3, 2),
means = 2:7,
lowerCI = 1:6,
upperCI = 3:8,
scenario = rep(c("A","Z"), each=3))
my.data
# time means lowerCI upperCI scenario
# 1 1 2 1 3 A
# 2 2 3 2 4 A
# 3 3 4 3 5 A
# 4 1 5 4 6 Z
# 5 2 6 5 7 Z
# 6 3 7 6 8 Z
I need to make a plot like the one below but some label for the (confidence) dotted lines should appear in the legend - the order matters, should be something like Z, A, CI-Z, CI-A (see below).
This is the corresponding code:
ggplot(data = my.data) +
# add the average lines
geom_line(aes(x=time, y=means, color=scenario)) +
# add "confidence" lines
geom_line(aes(x=time, y=lowerCI, color=scenario), linetype="dotted") +
geom_line(aes(x=time, y=upperCI, color=scenario), linetype="dotted") +
# set color manually
scale_color_manual(name = 'Scenario',
breaks = c("Z", "A"),
values = c("Z" = "red",
"A" = "blue"))
Below is my attempt after checking two similar SO questions (this and this). I get close enough, but I want the "CI" labels not to be separate.
ggplot(data = my.data) +
# add the average lines
geom_line(aes(x=time, y=means, color=scenario)) +
# add "confidence" lines
geom_line(aes(x=time, y=lowerCI, color=scenario, linetype="CI")) +
geom_line(aes(x=time, y=upperCI, color=scenario, linetype="CI")) +
# set color manually
scale_color_manual(name = 'Scenario',
breaks = c("Z", "A"),
values = c("Z" = "red",
"A" = "blue")) +
# set line type manually
scale_linetype_manual(name = 'Scenario',
breaks = c("Z", "A", "CI"),
values = c("Z" = "solid",
"A" = "solid",
"CI" = "dotted"))
I also tried something using geom_ribbon, but I could not find a clear way to make it display only the edge lines and add them as desired in the legend. All in all, I don't need to display bands, but lines.
I'm sure there is an obvious way, but for now I'm stuck here...
We can use guide_legend to specify dashed linetypes for the CI's. I think this is close to what you want:
ggplot(my.data, aes(x = time, y = means))+
geom_line(aes(colour = scenario))+
geom_line(aes(y = lowerCI, colour = paste(scenario, 'CI')),
linetype = 'dashed')+
geom_line(aes(y = upperCI, colour = paste(scenario, 'CI')),
linetype = 'dashed')+
scale_colour_manual(values = c('A' = 'red','Z' = 'blue',
'A CI' = 'red','Z CI' = 'blue'),
breaks = c('Z', 'Z CI', 'A', 'A CI'))+
guides(colour = guide_legend(override.aes = list(linetype = c('solid', 'dashed', 'solid', 'dashed'))))+
ggtitle('Dashed lines represent X% CI')

Error in Plotting Longitude Latitude with Fill Values in ggplot2

I have data with longitude, latitude and a value at each grid cell. A grid cell may have more than one value, so I set alpha to visualize multiple values. My aim is to fill the grid cells according to three different ranges. If the value is zero, that grid cell should be empty.
library(maps)
library(ggplot2)
data <- read.csv("G:/mydata.csv")
g1 <- ggplot(aes(x=x, y=y, fill= A), data=data) +
geom_tile(data=subset(data, A > 1970 & A < 1980),fill = "black", alpha = 0.5)+
geom_tile(data=subset(data, B > 1970 & B < 1980),fill = "black", alpha = 0.5)+
geom_tile(data=subset(data, C > 1970 & C < 1980),fill = "black", alpha = 0.5)+
geom_tile(data=subset(data, A > 1979 & A < 1990),fill = "blue", alpha = 0.5)+
geom_tile(data=subset(data, B> 1979 & B < 1990), fill = "blue", alpha = 0.5)+
geom_tile(data=subset(data, C > 1979 & C < 1990),fill = "blue", alpha = 0.5)+
geom_tile(data=subset(data, A > 1989),fill = "red", alpha = 0.5)+
geom_tile(data=subset(data, B > 1989),fill = "red", alpha = 0.5)+
geom_tile(data=subset(data, C > 1989),fill = "red", alpha = 0.5)+
theme_classic()
The resulting plot is wrong: the blue grids are bigger. I could not find the mistake. I followed the link but could not make it work. I guess there is something trivial that I am missing. My data can be accessed here. Many thanks in advance.
Sorry, I can't do it the way you envisioned it. There is not enough flexibility that I could see. But one can do this:
library(maps)
library(ggplot2)
ddf <- read.csv("mydata.csv")
setz <- function(dddf,zvek,lev=0,fillclr){
dddf$z <- as.numeric(zvek)
dddf$lev <- lev
dddf$color <- "white"
dddf$fill <- ifelse(zvek,fillclr,"gray")
return(dddf)
}
df1<-setz(ddf,ddf$A>1970 & ddf$A<1980,"A>1970 & A<1980","black")
df2<-setz(ddf,ddf$B>1970 & ddf$B<1980,"B>1970 & B<1980","black")
df3<-setz(ddf,ddf$C>1970 & ddf$C<1980,"C>1970 & C<1980","black")
df4<-setz(ddf,ddf$A>1979 & ddf$A<1990,"A>1979 & A<1990","blue")
df5<-setz(ddf,ddf$B>1979 & ddf$B<1990,"B>1979 & B<1990","blue")
df6<-setz(ddf,ddf$C>1979 & ddf$C<1990,"C>1979 & C<1990","blue")
df7<-setz(ddf,ddf$A>1989,"A>1989","red")
df8<-setz(ddf,ddf$B>1989,"B>1989","red")
df9<-setz(ddf,ddf$C>1989,"C>1989","red")
ddg <- rbind( df1,df2,df3, df4,df5,df6, df7,df8,df9 )
g1 <- ggplot(data=ddg,aes(x=x, y=y,fill=fill,color=color)) +
geom_tile() +
scale_color_identity() +
scale_fill_identity() +
facet_wrap(~lev) +
theme_classic()
print(g1)
Which yields this:
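For the original single-panel idea, here is a minimal sketch that bins the years with cut() and maps the result to fill in one layer. It uses made-up grid data and only column A, since mydata.csv is not shown; the name period and the range labels are hypothetical. One likely reason for the unequal tile sizes in the question is that geom_tile() derives its default width and height from the spacing of the data in each layer, so sparse subsets can get larger tiles. Fixing width and height avoids that, assuming a grid spacing of 1:

```r
library(ggplot2)

# Made-up stand-in for mydata.csv (assumption: one year value per grid cell)
set.seed(42)
grid <- expand.grid(x = 1:10, y = 1:10)
grid$A <- sample(c(0, 1971:2000), nrow(grid), replace = TRUE)

# Bin the years into the three ranges; zeros fall outside every bin and become NA
grid$period <- cut(grid$A,
                   breaks = c(1970, 1979, 1989, Inf),
                   labels = c("1971-1979", "1980-1989", "1990+"))

# Drop the empty (NA) cells and give every tile the same explicit size
p <- ggplot(subset(grid, !is.na(period)), aes(x = x, y = y, fill = period)) +
  geom_tile(width = 1, height = 1, alpha = 0.5) +
  scale_fill_manual(values = c("1971-1979" = "black",
                               "1980-1989" = "blue",
                               "1990+" = "red")) +
  theme_classic()
p
```

Columns B and C could be handled the same way after reshaping the data to long format, so that overlapping values are drawn on top of each other with alpha.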

Color one point and add an annotation in ggplot2?

I have a dataframe a with three columns :
GeneName, Index1, Index2
I draw a scatterplot like this
ggplot(a, aes(log10(Index1+1), Index2)) +geom_point(alpha=1/5)
Then I want to color a point whose GeneName is "G1" and add a text box near that point, what might be the easiest way to do it?
You could create a subset containing just that point and then add it to the plot:
# create the subset
g1 <- subset(a, GeneName == "G1")
# plot the data
ggplot(a, aes(log10(Index1+1), Index2)) + geom_point(alpha=1/5) + # this is the base plot
geom_point(data=g1, colour="red") + # this adds a red point
geom_text(data=g1, label="G1", vjust=1) # this adds a label for the red point
NOTE: Since everyone keeps up-voting this question, I thought I would make it easier to read.
Something like this should work. You may need to mess around with the x and y arguments to geom_text().
library(ggplot2)
highlight.gene <- "G1"
set.seed(23456)
a <- data.frame(GeneName = paste("G", 1:10, sep = ""),
Index1 = runif(10, 100, 200),
Index2 = runif(10, 100, 150))
a$highlight <- ifelse(a$GeneName == highlight.gene, "highlight", "normal")
textdf <- a[a$GeneName == highlight.gene, ]
mycolours <- c("highlight" = "red", "normal" = "grey50")
a
textdf
ggplot(data = a, aes(x = Index1, y = Index2)) +
geom_point(size = 3, aes(colour = highlight)) +
scale_color_manual("Status", values = mycolours) +
geom_text(data = textdf, aes(x = Index1 * 1.05, y = Index2, label = "my label")) +
theme(legend.position = "none")

How I can make in ggplot2 my first 10 lines in red and the rest lines in blue based on example (R, ggplot2)

Here is some example code in R using the ggplot2 library:
library(ggplot2)
theme_set(theme_bw())
dat = data.frame(value = rnorm(100,sd=2.5))
dat = within(dat, {
value_scaled = scale(value, scale = sd(value))
obs_idx = 1:length(value)
})
ggplot(aes(x = obs_idx, y = value_scaled), data = dat) +
geom_ribbon(ymin = -1, ymax = 1, alpha = 0.1) +
geom_line() + geom_point()
My question: based on this example, how can I draw the first 10 lines in red and the remaining lines in blue in ggplot2? I tried to use some kind of layer syntax, but it doesn't work.
First, add another column to your data frame dat. It has value 0 for the first 10 rows and 1 for the rest.
dat$group <- factor(rep.int(c(0, 1), c(10, nrow(dat)-10)))
Generate the plot:
library(ggplot2)
ggplot(aes(x = obs_idx, y = value_scaled), data = dat) +
geom_ribbon(ymin = -1, ymax = 1, alpha = 0.1) +
geom_line(aes(colour = group), show.legend = FALSE) +
scale_colour_manual(values = c("red", "blue")) +
geom_point()
The parameter show.legend = FALSE suppresses the legend for the red and blue lines (older ggplot2 versions called this parameter show_guide).
OK, I could manage layers, the code is (not elegant, but works):
require(ggplot2)
value=round(rnorm(50,200,50),0)
nmbrs<-length(value) ## length of vector
obrv<-1:length(value) ## list of observations
#create data frame from the values
data_lj<-data.frame(obrv,value)
data_lj20<-data.frame(data_lj[1:20,1:2])
data_lj21v<-data.frame(data_lj[20:nmbrs,1:2])
#plot with ggplot
# layer() needs explicit stat and position arguments; geom_line() supplies them
rr<-ggplot()+
geom_line(mapping=aes(obrv,value),data=data_lj20,colour="red")+
geom_line(mapping=aes(obrv,value),data=data_lj21v,colour="blue")
print(rr)

Resources