How can I plot a line on a bar chart in R?

H <- c(1,2,4,1,0,0,3,1,3)
M <- c("one","two","three","four","five")
barplot(H, main="bar chart", border="blue")
I want to add a line on top of the bar chart, like the one in blue, but I don't know how to do it.

Maybe you want this?
hist(H, breaks=-1:4, freq=FALSE, xaxt="n")
axis(side=1, at=seq(-0.5, 3.5), labels=M)
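If the goal is simply a line drawn over the bars of a base-graphics bar chart, one option is to reuse the bar midpoints that barplot() returns. A minimal sketch, assuming the line should connect the bar heights themselves (the name mids is only illustrative):

```r
H <- c(1, 2, 4, 1, 0, 0, 3, 1, 3)

# barplot() invisibly returns the x coordinate of each bar's midpoint
mids <- barplot(H, main = "bar chart", border = "blue")

# draw a line (and points) through the bar tops at those midpoints
lines(mids, H, col = "blue", lwd = 2)
points(mids, H, col = "blue", pch = 16)
```

Using the returned midpoints rather than 1:length(H) matters because barplot() spaces the bars on its own x scale.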

The graph in the question can be made with code along the following lines:
1. Tabulate the vector H.
library(ggplot2)
tbl <- table(H)
# assumed reconstruction: name the columns Var1 and Freq, as used in the plot below
df1 <- setNames(as.data.frame(tbl), c("Var1", "Freq"))
2. With package ggplot2, built-in ways of fitting a line can be used.
ggplot(df1, aes(as.integer(Var1), Freq)) +
  geom_bar(stat = "identity", fill = "red", alpha = 0.5) +
  geom_smooth(method = stats::loess,
              formula = y ~ x,
              color = "blue", fill = "blue",
              alpha = 0.5)
Test data creation code.
f <- function(x) sin(x)^2*exp(x)
p <- f(seq(0, 2.5, by = 0.05))
p <- p/sum(p)
H <- sample(51, size = 1e3, prob = p, replace = TRUE)
Here is a new graph, made with the new data posted in the comments. The data is at the end of this answer.
Mdate <- as.Date(paste0(M, "/2020"), format = "%d/%m/%Y")
df1 <- data.frame(H, M = Mdate)
ggplot(df1, aes(M, H)) +
  geom_bar(stat = "identity", fill = "red", alpha = 0.5) +
  geom_smooth(method = stats::loess,
              formula = y ~ x,
              color = "blue", fill = "blue", alpha = 0.25,
              level = 0.5, span = 0.1) +
  scale_x_date(labels = scales::date_format("%d/%m"))
New data
H <- c(1,2,4,1,0,0,3,1,3,3, 6,238,0,
58,17,64,38,3,10,8, 10,11,13,
M <- c("29/02","01/03","02/03","03/03",
"08/03", "09/03","10/03","11/03",
"16/03", "17/03","18/03","19/03",
"24/03", "25/03","26/03","27/03",


How to add name labels to a graph using ggplot2 in R?

I have the following code:
plot <- ggplot(data = df_sm)+
geom_histogram(aes(x=simul_means, y=..density..), binwidth = 0.20, fill="slategray3", col="black", show.legend = TRUE)
plot <- plot + labs(title="Density of 40 Means from Exponential Distribution", x="Mean of 40 Exponential Distributions", y="Density")
plot <- plot + geom_vline(xintercept=sampl_mean,size=1.0, color="black", show.legend = TRUE)
plot <- plot + stat_function(fun=dnorm,args=list(mean=sampl_mean, sd=sampl_sd),color = "dodgerblue4", size = 1.0)
plot <- plot+ geom_vline(xintercept=th_mean,size=1.0,color="indianred4",linetype = "longdash")
plot <- plot + stat_function(fun=dnorm,args=list(mean=th_mean, sd=th_mean_sd),color = "darkmagenta", size = 1.0)
I want to show a legend entry for each layer. I tried show.legend = TRUE, but it does nothing.
My data frame consists of means from exponential distribution simulations. I also have some theoretical values for the distribution (mean and standard deviation), which I call th_mean and th_mean_sd.
The code for my simulation is the following:
lambda <- 0.2
th_mean <- 1/lambda
th_sd <- 1/lambda
th_var <- th_sd^2
n <- 40
th_mean_sd <- th_sd/sqrt(n)
th_mean_var <- th_var/n  # variance of the mean is th_var/n, i.e. th_mean_sd^2
simul <- 1000
simul_means <- NULL
for(i in 1:simul) {
  simul_means <- c(simul_means, mean(rexp(n, lambda)))
}
sampl_mean <- mean(simul_means)
sampl_sd <- sd(simul_means)
If you want a legend, you have to map on aesthetics instead of setting color, fill, ... as parameters, i.e. move color=... inside aes(...) and use scale_color_manual / scale_fill_manual to set the color values. Personally, I find it helpful to use meaningful labels; e.g. for your histogram I map the label "hist" to the fill, but you could use whatever label you like:
lambda <- 0.2
th_mean <- 1 / lambda
th_sd <- 1 / lambda
th_var <- th_sd^2
n <- 40
th_mean_sd <- th_sd / sqrt(n)
th_mean_var <- th_var / n  # variance of the mean is th_var / n, i.e. th_mean_sd^2
simul <- 1000
simul_means <- NULL
for (i in 1:simul) {
  simul_means <- c(simul_means, mean(rexp(n, lambda)))
}
sampl_mean <- mean(simul_means)
sampl_sd <- sd(simul_means)
df_sm <- data.frame(simul_means)
library(ggplot2)
ggplot(data = df_sm) +
  geom_histogram(aes(x = simul_means, y = ..density.., fill = "hist"), binwidth = 0.20, col = "black") +
  labs(title = "Density of 40 Means from Exponential Distribution", x = "Mean of 40 Exponential Distributions", y = "Density") +
  # the label must match a name in scale_color_manual below
  stat_function(fun = dnorm, args = list(mean = sampl_mean, sd = sampl_sd), aes(color = "sampl_dens"), size = 1.0) +
  stat_function(fun = dnorm, args = list(mean = th_mean, sd = th_mean_sd), aes(color = "th_dens"), size = 1.0) +
  geom_vline(size = 1.0, aes(xintercept = sampl_mean, color = "sampl_mean")) +
  geom_vline(size = 1.0, aes(xintercept = th_mean, color = "th_mean"), linetype = "longdash") +
  scale_fill_manual(values = c(hist = "slategray3")) +
  scale_color_manual(values = c(sampl_dens = "dodgerblue4", th_dens = "darkmagenta", th_mean = "indianred4", sampl_mean = "black"))
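The same idea in isolation may be easier to see on a built-in data set. This is only a minimal sketch, assuming ggplot2 3.3.0 or later (for after_stat()) and using the faithful data instead of the simulated means; the labels "histogram", "normal density" and "sample mean" are arbitrary choices:

```r
library(ggplot2)

m <- mean(faithful$eruptions)
s <- sd(faithful$eruptions)

p <- ggplot(faithful, aes(x = eruptions)) +
  # a constant string inside aes() becomes a legend entry
  geom_histogram(aes(y = after_stat(density), fill = "histogram"),
                 binwidth = 0.25, colour = "black") +
  stat_function(fun = dnorm, args = list(mean = m, sd = s),
                aes(colour = "normal density")) +
  geom_vline(aes(xintercept = m, colour = "sample mean"),
             linetype = "longdash") +
  # the manual scales translate each label into an actual colour
  scale_fill_manual(name = NULL, values = c(histogram = "slategray3")) +
  scale_colour_manual(name = NULL,
                      values = c("normal density" = "dodgerblue4",
                                 "sample mean" = "black"))
print(p)
```

Every label used inside aes() needs a matching name in the values vector of the corresponding manual scale; a label without a match gets no colour.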

Creating a vertical color gradient for a geom_bar plot

I have searched and searched, but I can't seem to find an elegant way of doing this!
I have a dataset Data consisting of Data$x (dates) and Data$y (numbers from 0 to 1)
I want to plot them in a bar-chart:
ggplot(Data) + geom_bar(aes(x = x, y = y, fill = y), stat = "identity") +
  scale_fill_gradient2(low = "red", high = "green", mid = "yellow", midpoint = 0.90)
The result looks like this
However, I wanted to give each bar a gradient in the vertical direction ranging from 0 (red) to y (greener depending on y). Is there any way of doing this smoothly?
I have tried to see if I could impose a picture on the graph as a hack, but I can't impose it on the bars only except in a super super ugly way.
Another, not very pretty, hack uses geom_segment. The x start and end positions (x and xend) are hardcoded (- 0.4; + 0.4), and so is the size. These numbers need to be adjusted depending on the number of x values and the range of y.
# some toy data
d <- data.frame(x = 1:3, y = 1:3)
# interpolate values from zero to y and create corresponding number of x values
vals <- lapply(d$y, function(y) seq(0, y, by = 0.01))
y <- unlist(vals)
mid <- rep(d$x, lengths(vals))
d2 <- data.frame(x = mid - 0.4,
xend = mid + 0.4,
y = y,
yend = y)
ggplot(data = d2, aes(x = x, xend = xend, y = y, yend = yend, color = y)) +
geom_segment(size = 2) +
scale_color_gradient2(low = "red", mid = "yellow", high = "green",
midpoint = max(d2$y)/2)
A somewhat related question which may give you some other ideas: How to make gradient color filled timeseries plot in R
Doesn't exist as far as I know, but you can manipulate your data to produce it.
library(ggplot2)
df = data.frame(x=c(1:10),y=runif(10))
# split each bar into slices of height `spacing`; z holds the value mapped to fill
prepGradient <- function(x,y,spacing=max(y)/100){
  df <- data.frame(x=x,y=y)
  newDf = data.frame(x=NULL,y=NULL,z=NULL)
  for (r in 1:nrow(df)){
    n <- floor(df[r,"y"]/spacing)
    for (s in seq_len(n)){
      tmp <- data.frame(x=df[r,"x"],y=spacing,z=s*spacing)
      newDf <- rbind(newDf,tmp)
    }
    # remainder slice on top of the full slices
    tmp <- data.frame(x=df[r,"x"],y=df[r,"y"]%%spacing,z=df[r,"y"])
    newDf <- rbind(newDf,tmp)
  }
  return(newDf)
}
df2 <- prepGradient(df$x,df$y)
ggplot(df2,aes(x=x,y=y,fill=z)) +
  geom_bar(stat="identity") +
  scale_fill_gradient2(low="red", high="green", mid="yellow",midpoint=median(df$y))+
  ggtitle('Vertical Gradient Example')
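Growing a data frame with rbind inside a loop gets slow for many bars, so the slicing step could also be written in a vectorised way. A rough sketch, assuming a fixed slice height and ignoring the remainder slice on top of each bar; expand_bars is a hypothetical helper name, not part of any package:

```r
# Hypothetical helper: cut every bar into full slices of height `spacing`.
# The remainder slice on top of each bar is left out to keep the sketch short.
expand_bars <- function(x, y, spacing) {
  n <- floor(y / spacing)        # number of full slices per bar
  idx <- rep(seq_along(x), n)    # which bar each slice belongs to
  s <- sequence(n)               # slice number within its bar: 1, 2, ..., n
  data.frame(x = x[idx], y = spacing, z = s * spacing)
}

bars <- data.frame(x = 1:3, y = c(1, 2, 3))
slices <- expand_bars(bars$x, bars$y, spacing = 0.25)
head(slices)
```

The resulting data frame has the same x, y and z columns as the output of prepGradient, so it can be passed to the same kind of stacked geom_bar(stat = "identity") call with fill = z.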
Found a less hacky way to do this when answering Change ggplot bar chart fill colors
library(dplyr)
library(tidyr)
library(ggplot2)
df <- data.frame(value = c(20, 50, 90),
                 group = c(1, 2, 3))
df_expanded <- df %>%
  rowwise() %>%
  summarise(group = group,
            value = list(0:value)) %>%
  unnest(cols = value)
df_expanded %>%
  ggplot() +
  # one thin tile per integer step (the geom is assumed to be geom_tile)
  geom_tile(aes(
    x = group,
    y = value,
    fill = value,
    width = 0.9
  )) +
  coord_flip() +
  scale_fill_viridis_c(option = "C") +
  theme(legend.position = "none")
Because the question did not explicitly ask for divergent / multi-hue scales (in the title), here is a simple hack for a single-hue gradient. It closely follows the approach suggested for a gradient fill under a curve, as seen here
d <- data.frame(x = 1:3, y = 1:3)
n_grad <- 200
grad_df <- data.frame(yintercept = seq(0, 3, len = n_grad),
                      alpha = seq(0.3, 0, len = n_grad))
ggplot(d) +
  geom_col(aes(x, y), fill = "darkblue") +
  geom_hline(data = grad_df, aes(yintercept = yintercept, alpha = alpha),
             size = 1, colour = "white", show.legend = FALSE)
## a white background looks nicer with this approach

R ggplot2: Adding a Legend to a Time Series with Forecasts

I've been stumped by this for hours. I've come across a number of suggestions that I need to add aes() and assign colours to the geom_lines, but this isn't generating anything, potentially because I have some forecasts in there as well. I'm really not sure.
In any case, I've put my code below, and I'd really appreciate any help.
paperback <- books[,1]
fit1 <- ses(paperback, alpha = 0.2, initial = "simple", h = 3)
fit2 <- ses(paperback, alpha = 0.6, initial = "simple", h = 3)
fit3 <- ses(paperback, h = 3)
xlab="Day", main="", size = 20) +
geom_line(data = paperback, colour = "black", aes(colour="black")) +
geom_line(data = fitted(fit1), colour = "blue", linetype = 2, aes(colour="blue")) +
geom_line(data = fitted(fit2), colour = "red", linetype = 2, aes(colour="red")) +
geom_line(data = fitted(fit3), colour = "green", linetype = 2, aes(colour = "green")) +
geom_line(data = fit1$mean, colour = "blue", linetype = 2) +
geom_line(data = fit2$mean, colour = "red", linetype = 2) +
geom_line(data = fit3$mean, colour = "green", linetype = 2)
I recommend plotting the forecast objects directly:
library(fpp)  # loads forecast (for ses) and the books data set
library(ggplot2)
library(ggfortify)
paperback <- books[,1]
fit1 <- ses(paperback, alpha = 0.2, initial = "simple", h = 3)
fit2 <- ses(paperback, alpha = 0.6, initial = "simple", h = 3)
fit3 <- ses(paperback, h = 3)
par(mfrow = c(1,3))
plot(fit1); plot(fit2); plot(fit3)
and this produces:
and if you want to do it with ggplot you can do:
X <- cbind(model1 = fit1$mean, model2 = fit2$mean, model3 = fit3$mean)
df <- cbind(paperback, X)
colnames(df) <- c("paperback", "model1", "model2", "model3")
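Whatever plotting call is used, the legend comes from mapping colour to a series label inside aes() on long-format data. A minimal sketch of that idea, with two loud assumptions: it uses stats::HoltWinters with beta = FALSE and gamma = FALSE as a stand-in for ses(), and the built-in Nile series instead of books, so that it runs without the fpp package:

```r
library(ggplot2)

y <- Nile   # stand-in for the paperback series

# observed data in long format: one row per time point, labelled by series
long <- data.frame(time = as.numeric(time(y)),
                   value = as.numeric(y),
                   series = "data")

# append three-step-ahead forecasts for two smoothing parameters
for (a in c(0.2, 0.6)) {
  hw <- HoltWinters(y, alpha = a, beta = FALSE, gamma = FALSE)
  fc <- predict(hw, n.ahead = 3)
  long <- rbind(long,
                data.frame(time = as.numeric(time(fc)),
                           value = as.numeric(fc),
                           series = paste0("alpha = ", a)))
}

# colour is mapped (not set), so ggplot2 builds the legend itself
p <- ggplot(long, aes(time, value, colour = series)) +
  geom_line() +
  scale_colour_manual(values = c("data" = "black",
                                 "alpha = 0.2" = "blue",
                                 "alpha = 0.6" = "red"))
print(p)
```

In the question's code the colour is given both inside and outside aes(); the value set outside wins, which is why no legend appears.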

What causes different behaviour between stats and ggplot2 when writing histograms, normal curves and qqplots to .pdf?

I need to produce plots for statistical analyses and I am stumped by a difference in behaviour between stats and ggplot. Who can help out?
I am trying to produce a pdf with histograms, including normal curves, side-by-side with qqplots, with the next plot continuing on the same page. Preferably using ggplot (because prettier plots). I have a large number of variables in my real dataset, so I am using a 'for' loop.
This piece of ggplot code does what I want it to do.
ggplot(airquality, aes(Wind)) +
  geom_histogram(aes(y = ..density..), colour = "black", fill = "white") +
  stat_function(fun = dnorm, args = list(mean = mean(airquality$Wind), sd = sd(airquality$Wind)), colour = "red", size = 1)
qplot(sample = airquality$Wind, stat = "qq")
I am fine with the binwidth warning; I want that picked automatically, and I will build in a suppression for that message later on. I am not sure what to do, though, about '"stat" is deprecated'. Anyone?
If I try to work this into a 'for' loop, I cannot get it to work. It keeps putting every plot on a new page and it leaves out the normal curves:
Variablesairquality <- c("Wind", "Temp", "Month", "Day")
pdf(file = "Normality.pdf", 4, 5)
par(mfrow = c(2,2))
for(i in Variablesairquality){
  plot(ggplot(airquality, aes(airquality[,i])) +
    geom_histogram(aes(y = ..density..), colour = "black", fill = "white") +
    stat_function(fun = dnorm, args = list(mean = mean(airquality[,i]), sd = sd(airquality[,i])), colour = "red", size = 1))
  plot(qplot(sample = airquality[,i], stat = "qq"))
}
dev.off()
Which I don’t get, because if I try it using stats, it does exactly what I want:
pdf(file = "Normality2.pdf", 4, 5)
par(mfrow = c(2,2))
for(i in Variablesairquality){
  h <- hist(airquality[,i], col = "white", cex.axis=0.50, xlab = i, cex.lab=0.75, main = paste("Distribution"), cex.main= 0.75)
  # normal curve, rescaled from density to the histogram's count scale
  xfit <- seq(min(airquality[,i]), max(airquality[,i]), length = 40)
  yfit <- dnorm(xfit, mean = mean(airquality[,i]), sd = sd(airquality[,i]))
  yfit <- yfit*diff(h$mids[1:2])*length(airquality[,i])
  lines(xfit, yfit, col="red", lwd=1)
  qqnorm(airquality[,i], cex = 0.5, cex.axis=0.50, cex.lab=0.75, main = expression("Q-Q plot for"~paste(i)), cex.main= 0.75)
  qqline(airquality[,i], col = "red")
}
dev.off()
(Except for the main label, which I still need to figure out. Any tips?)
I would be most grateful if someone could point out the mistake in my ggplot code or otherwise explain this behaviour. Thanks!
I use R version 3.2.3 and RStudio v0.99.891. (And yes, I read every similar item here, scoured the internet and read the help files; that did not get me where I need to go.)
On `stat` is deprecated, see Deprecated features in the ggplot2 2.0.0 release notes. Use instead:
ggplot(airquality, aes(sample = Wind)) +
  stat_qq()
If you don't wish to use gridExtra::grid.arrange, here's an approach that uses facets. Begin by wrangling the data into a new dataframe with the values we want for x, y, plot type, and geom variables:
d <- as.data.frame(qqnorm(airquality$Wind, plot.it = F))
d$plot <- "QQ plot"
d$geom <- "point"
d <- rbind(d, data.frame(x = airquality$Wind, y = NA,
plot = "Histogram", geom = "bar"))
d <- rbind(d, with(airquality, data.frame(
x = seq(min(Wind), max(Wind), l = 100),
y = dnorm(seq(min(Wind), max(Wind), l = 100),
mean = mean(Wind), sd = sd(Wind)),
plot = "Histogram", geom = "line")))
Then call ggplot, subsetting the data as appropriate for each geom:
ggplot(d, aes(x = x, y = y)) + facet_wrap(~plot, scales = "free") +
geom_histogram(data = subset(d, plot == "Histogram" & geom == "bar"),
aes(y = ..density..),
colour = "black", fill = "white") +
geom_line(data = subset(d, plot == "Histogram" & geom == "line"),
colour = "red", size = 1) +
geom_point(data = subset(d, plot == "QQ plot")) +
labs(x = "Wind")
To do multiple plots, you can wrap the code above into a for loop, making sure to wrap ggplot inside print:
Variablesairquality <- c("Wind", "Temp", "Month", "Day")
for (i in rev(Variablesairquality)) {
  x <- airquality[[i]]
  d <- as.data.frame(qqnorm(x, plot.it = F))
  d$plot <- "QQ plot"
  d$geom <- "point"
  d <- rbind(d, data.frame(x = x, y = NA, plot = "Histogram", geom = "bar"))
  d <- rbind(d, data.frame(x = seq(min(x), max(x), l = 100),
                           y = dnorm(seq(min(x), max(x), l = 100),
                                     mean = mean(x), sd = sd(x)),
                           plot = "Histogram", geom = "line"))
  # print() is needed for ggplot objects created inside a loop
  print(ggplot(d, aes(x = x, y = y)) + facet_wrap(~plot, scales = "free") +
    geom_histogram(data = subset(d, plot == "Histogram" & geom == "bar"),
                   aes(y = ..density..),
                   colour = "black", fill = "white") +
    geom_line(data = subset(d, plot == "Histogram" & geom == "line"),
              colour = "red", size = 1) +
    geom_point(data = subset(d, plot == "QQ plot")) +
    labs(x = i))
}
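For comparison, the gridExtra route mentioned above might look roughly like this. It is a sketch under assumptions: gridExtra is installed, ggplot2 is 3.3.0 or later, and make_pair and the output file name are made up for illustration:

```r
library(ggplot2)
library(gridExtra)

# Build one histogram / Q-Q pair for a column of airquality
# (make_pair is a hypothetical helper name)
make_pair <- function(v) {
  x <- airquality[[v]]
  h <- ggplot(airquality, aes(x = .data[[v]])) +
    geom_histogram(aes(y = after_stat(density)), bins = 30,
                   colour = "black", fill = "white") +
    stat_function(fun = dnorm, args = list(mean = mean(x), sd = sd(x)),
                  colour = "red")
  q <- ggplot(airquality, aes(sample = .data[[v]])) +
    stat_qq() +
    labs(title = paste("Q-Q plot for", v))
  list(h, q)
}

vars <- c("Wind", "Temp", "Month", "Day")
# flatten the list of pairs into one list of eight plots
plots <- unlist(lapply(vars, make_pair), recursive = FALSE)

out <- file.path(tempdir(), "Normality_grid.pdf")
pdf(out, width = 8, height = 12)
grid.arrange(grobs = plots, ncol = 2)   # four rows: histogram | Q-Q plot
dev.off()
```

par(mfrow = ...) only affects base graphics, which is why it has no effect on ggplot output; the layout has to be done on the grid side, as here, or with facets as above.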

operation between stat_summary_hex plots made in ggplot2

I have two populations, A and B, distributed spatially with one character Z. I want to make a hexbin plot that subtracts the proportion of the character in each hexbin. Here is the code for two theoretical populations A and B:
xA <- rnorm(1000)
yA <- rnorm(1000)
zA <- sample(c(1, 0), 20, replace = TRUE, prob = c(0.2, 0.8))
hbinA <- hexbin(xA, yA, xbins = 40, IDs = TRUE)
A <- data.frame(x = xA, y = yA, z = zA)
xB <- rnorm(1000)
yB <- rnorm(1000)
zB <- sample(c(1, 0), 20, replace = TRUE, prob = c(0.4, 0.6))
hbinB <- hexbin(xB, yB, xbins = 40, IDs = TRUE)
B <- data.frame(x = xB, y = yB, z = zB)
ggplot(A, aes(x, y, z = z)) + stat_summary_hex(fun = function(z) sum(z)/length(z), alpha = 0.8) +
scale_fill_gradientn(colours = c("blue","red")) +
guides(alpha = FALSE, size = FALSE)
ggplot(B, aes(x, y, z = z)) + stat_summary_hex(fun = function(z) sum(z)/length(z), alpha = 0.8) +
scale_fill_gradientn (colours = c("blue","red")) +
guides(alpha = FALSE, size = FALSE)
Here are the two resulting graphs.
My goal is to make a third graph with hexbins showing the difference between the hexbins at the same coordinates, but I don't even know how to start. I have done something similar with the raster package, but I need it as hexbins.
Thanks a lot
You need to make sure that both plots use the exact same binning. To achieve this, I think it is best to do the binning beforehand and then plot the results with stat_identity / geom_hex. With the variables from your code sample you can do:
## find the bounds for the complete data
xbnds <- range(c(A$x, B$x))
ybnds <- range(c(A$y, B$y))
nbins <- 30
# function to make a data.frame for geom_hex that can be used with stat_identity
makeHexData <- function(df) {
  h <- hexbin(df$x, df$y, nbins, xbnds = xbnds, ybnds = ybnds, IDs = TRUE)
  data.frame(hcell2xy(h),
             z = tapply(df$z, h@cID, FUN = function(z) sum(z)/length(z)),
             cid = h@cell)
}
Ahex <- makeHexData(A)
Bhex <- makeHexData(B)
## not all cells are present in each binning, we need to merge by cellID
byCell <- merge(Ahex, Bhex, by = "cid", all = T)
## when calculating the difference empty cells should count as 0
byCell$z.x[is.na(byCell$z.x)] <- 0
byCell$z.y[is.na(byCell$z.y)] <- 0
## make a "difference" data.frame
Diff <- data.frame(x = ifelse(is.na(byCell$x.x), byCell$x.y, byCell$x.x),
                   y = ifelse(is.na(byCell$y.x), byCell$y.y, byCell$y.x),
                   z = byCell$z.x - byCell$z.y)
## plot the results
ggplot(Ahex) +
geom_hex(aes(x = x, y = y, fill = z),
stat = "identity", alpha = 0.8) +
scale_fill_gradientn (colours = c("blue","red")) +
guides(alpha = FALSE, size = FALSE)
ggplot(Bhex) +
geom_hex(aes(x = x, y = y, fill = z),
stat = "identity", alpha = 0.8) +
scale_fill_gradientn (colours = c("blue","red")) +
guides(alpha = FALSE, size = FALSE)
ggplot(Diff) +
geom_hex(aes(x = x, y = y, fill = z),
stat = "identity", alpha = 0.8) +
scale_fill_gradientn (colours = c("blue","red")) +
guides(alpha = FALSE, size = FALSE)
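The merge step is the part that is easiest to get wrong, so here is the same logic on a tiny hand-made example that runs without hexbin. The cell ids, coordinates and proportions are invented purely for illustration:

```r
# toy stand-ins for the per-cell tables: cell id, cell centre, proportion
Ahex <- data.frame(cid = c(1, 2, 3), x = c(0, 1, 2), y = c(0, 0, 0),
                   z = c(0.2, 0.5, 0.1))
Bhex <- data.frame(cid = c(2, 3, 4), x = c(1, 2, 3), y = c(0, 0, 0),
                   z = c(0.4, 0.1, 0.3))

# full outer join: cells present in only one population get NA on the other side
byCell <- merge(Ahex, Bhex, by = "cid", all = TRUE)

# an empty cell counts as proportion 0
byCell$z.x[is.na(byCell$z.x)] <- 0
byCell$z.y[is.na(byCell$z.y)] <- 0

# take the coordinates from whichever side has them
Diff <- data.frame(cid = byCell$cid,
                   x = ifelse(is.na(byCell$x.x), byCell$x.y, byCell$x.x),
                   y = ifelse(is.na(byCell$y.x), byCell$y.y, byCell$y.x),
                   z = byCell$z.x - byCell$z.y)
Diff
```

Cell 1 exists only in A and cell 4 only in B, so they end up with the full positive or negative proportion, while shared cells get the plain difference.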
