This question already has an answer here:
How to create side-by-side bar charts (for multiple series) with ggplot?
(1 answer)
Closed 5 years ago.
I want to create a barplot using ggplot in R studio using two variables side by side. I tried following other people suggestions I found online, but I cant get it to work.
Here's the data I'm using:
x <- c(5,17,31,9,17,10,30,28,16,29,14,34)
y <- c(1,2,3,4,5,6,7,8,9,10,11,12)
day <- c(1,2,3,4,5,6,7,8,9,10,11,12)
So, what I'm trying to do is have days on the x-axis and side by side barplots of x and y (with x & y being colored) corresponding to the day number.
First thing i did was make a data frame :
df1 <- data.frame(x,y,day)
and then I tried:
ggplot(df1, aes(x = day, y = x,y)) + geom_bar(stat = "identity",color = x, width = 1, position="dodge")
But I just can't get it to work properly. Any suggestions as to how I'd achieve this?
You have the right idea, I think the melt() function from the reshape2 package is what you're looking for.
library(ggplot2)
library(reshape2)
x <- c(5,17,31,9,17,10,30,28,16,29,14,34)
y <- c(1,2,3,4,5,6,7,8,9,10,11,12)
day <- c(1,2,3,4,5,6,7,8,9,10,11,12)
df1 <- data.frame(x, y, day)
df2 <- melt(df1, id.vars='day')
head(df2)
ggplot(df2, aes(x=day, y=value, fill=variable)) +
geom_bar(stat='identity', position='dodge')
EDIT
I think the pivot_longer() function from the tidyverse tidyr package might now be the better way to handle these types of data manipulations. It gives quite a bit more control than melt() and there's also a pivot_wider() function as well to do the opposite.
library(ggplot2)
library(tidyr)
x <- c(5,17,31,9,17,10,30,28,16,29,14,34)
y <- c(1,2,3,4,5,6,7,8,9,10,11,12)
day <- c(1,2,3,4,5,6,7,8,9,10,11,12)
df1 <- data.frame(x, y, day)
df2 <- tidyr::pivot_longer(df1, cols=c('x', 'y'), names_to='variable',
values_to="value")
head(df2)
ggplot(df2, aes(x=day, y=value, fill=variable)) +
geom_bar(stat='identity', position='dodge')
Or you could use facet_wrap to produce two plots:
library("ggplot2")
library("reshape")
x <- c(5,17,31,9,17,10,30,28,16,29,14,34)
y <- c(1,2,3,4,5,6,7,8,9,10,11,12)
day <- c(1,2,3,4,5,6,7,8,9,10,11,12)
df1 <- data.frame(x,y,day)
df2 <- reshape::melt(df1, id = c("day"))
ggplot(data = df2, aes(x = day, y = value, fill = variable)) + geom_bar(stat = "identity")+ facet_wrap(~ variable) + scale_x_continuous(breaks=seq(1,12,2))
If you want the bars with color according to the day use fill = day:
ggplot(data = df2, aes(x = day, y = value, fill = day)) + geom_bar(stat = "identity") + facet_wrap(~ variable) + scale_x_continuous(breaks=seq(1,12,2))
Related
I have a csv file which looks like the following:
Name,Count1,Count2,Count3
application_name1,x1,x2,x3
application_name2,x4,x5,x6
The x variables represent numbers and the applications_name variables represent names of different applications.
Now I would like to make a barplot for each row by using ggplot2. The barplot should have the application_name as title. The x axis should show Count1, Count2, Count3 and the y axis should show the corresponding values (x1, x2, x3).
I would like to have a single barplot for each row, because I have to store the different plots in different files. So I guess I cannot use "melt".
I would like to have something like:
for each row in rows {
print barplot in file
}
Thanks for your help.
You can use melt to rearrange your data and then use either facet_wrap or facet_grid to get a separate plot for each application name
library(ggplot2)
library(reshape2)
# example data
mydf <- data.frame(name = paste0("name",1:4), replicate(5,rpois(4,30)))
names(mydf)[2:6] <- paste0("count",1:5)
# rearrange data
m <- melt(mydf)
# if you are wanting to export each plot separately
# I used facet_wrap as a quick way to add the application name as a plot title
for(i in levels(m$name)) {
p <- ggplot(subset(m, name==i), aes(variable, value, fill = variable)) +
facet_wrap(~ name) +
geom_bar(stat="identity", show_guide=FALSE)
ggsave(paste0("figure_",i,".pdf"), p)
}
# or all plots in one window
ggplot(m, aes(variable, value, fill = variable)) +
facet_wrap(~ name) +
geom_bar(stat="identity", show_guide=FALSE)
I didn't see #user20650's nice answer before preparing this. It's almost identical, except that I use plyr::d_ply to save things instead of a loop. I believe dplyr::do() is another good option (you'd group_by(Name) first).
yourData <- data.frame(Name = sample(letters, 10),
Count1 = rpois(10, 20),
Count2 = rpois(10, 10),
Count3 = rpois(10, 8))
library(reshape2)
yourMelt <- melt(yourData, id.vars = "Name")
library(ggplot2)
# Test a function on one piece to develope graph
ggplot(subset(yourMelt, Name == "a"), aes(x = variable, y = value)) +
geom_bar(stat = "identity") +
labs(title = subset(yourMelt, Name == 'a')$Name)
# Wrap it up, with saving to file
bp <- function(dat) {
myPlot <- ggplot(dat, aes(x = variable, y = value)) +
geom_bar(stat = "identity") +
labs(title = dat$Name)
ggsave(filname = paste0("path/to/save/", dat$Name, "_plot.pdf"),
myPlot)
}
library(plyr)
d_ply(yourMelt, .variables = "Name", .fun = bp)
i have a dataframe structured like this
Elem. Category. SEZa SEZb SEZc
A. ONE. 1. 3. 4
B. TWO. 4. 5. 6
i want to plot three histograms in three different facets (SEZa, SEZb, SEZc) with ggplot where the x values are the category values (ONE. e TWO.) and the y values are the number present in columns SEZa, SEZb, SEZc.
something like this:
how can I do? thank you for your suggestions!
Assume df is your data.frame, I would first convert from wide format to a long format:
new_df <- reshape2::melt(df, id.vars = c("Elem", "Category"))
And then make the plot using geom_col() instead of geom_histogram() because it seems you've precomputed the y-values and wouldn't need ggplot to calculate these values for you.
ggplot(new_df, aes(x = Category, y = value, fill = Elem)) +
geom_col() +
facet_grid(variable ~ .)
I think that what you are looking for is something like this :
library(ggplot2)
library(reshape2)
df <- data.frame(Category = c("One", "Two"),
SEZa = c(1, 4),
SEZb = c(3, 5),
SEZc = c(4, 6))
df <- melt(df)
ggplot(df, aes(x = Category, y = value)) +
geom_col(aes(fill = variable)) +
facet_grid(variable ~ .)
My inspiration is :
http://felixfan.github.io/stacking-plots-same-x/
I have a data frame with five columns and five rows. the data frame looks like this:
df <- data.frame(
day=c("m","t","w","t","f"),
V1=c(5,10,20,15,20),
V2=c(0.1,0.2,0.6,0.5,0.8),
V3=c(120,100,110,120,100),
V4=c(1,10,6,8,8)
)
I want to do some plots so I used the ggplot and in particular the geom_bar:
ggplot(df, aes(x = day, y = V1, group = 1)) + ylim(0,20)+ geom_bar(stat = "identity")
ggplot(df, aes(x = day, y = V2, group = 1)) + ylim(0,1)+ geom_bar(stat = "identity")
ggplot(df, aes(x = day, y = V3, group = 1)) + ylim(50,200)+ geom_bar(stat = "identity")
ggplot(df, aes(x = day, y = V4, group = 1)) + ylim(0,15)+ geom_bar(stat = "identity")
My question is, How can I do a grouped ggplot with geom_bar with multiple y axis? I want at the x axis the day and for each day I want to plot four bins V1,V2,V3,V4 but with different range and color. Is that possible?
EDIT
I want the y axis to look like this:
require(reshape)
data.m <- melt(df, id.vars='day')
ggplot(data.m, aes(day, value)) +
geom_bar(aes(fill = variable), position = "dodge", stat="identity") +
facet_grid(variable ~ .)
You can also change the y-axis limits if you like (here's an example).
Alternately you may have meant grouped like this:
require(reshape)
data.m <- melt(df, id.vars='day')
ggplot(data.m, aes(day, value)) +
geom_bar(aes(fill = variable), position = "dodge", stat="identity")
For the latter examples if you want 2 Y axes then you just create the plot twice (once with a left y axis and once with a right y axis) then use this function:
double_axis_graph <- function(graf1,graf2){
graf1 <- graf1
graf2 <- graf2
gtable1 <- ggplot_gtable(ggplot_build(graf1))
gtable2 <- ggplot_gtable(ggplot_build(graf2))
par <- c(subset(gtable1[['layout']], name=='panel', select=t:r))
graf <- gtable_add_grob(gtable1, gtable2[['grobs']][[which(gtable2[['layout']][['name']]=='panel')]],
par['t'],par['l'],par['b'],par['r'])
ia <- which(gtable2[['layout']][['name']]=='axis-l')
ga <- gtable2[['grobs']][[ia]]
ax <- ga[['children']][[2]]
ax[['widths']] <- rev(ax[['widths']])
ax[['grobs']] <- rev(ax[['grobs']])
ax[['grobs']][[1]][['x']] <- ax[['grobs']][[1]][['x']] - unit(1,'npc') + unit(0.15,'cm')
graf <- gtable_add_cols(graf, gtable2[['widths']][gtable2[['layout']][ia, ][['l']]], length(graf[['widths']])-1)
graf <- gtable_add_grob(graf, ax, par['t'], length(graf[['widths']])-1, par['b'])
return(graf)
}
I believe there's also a package or convenience function that does the same thing.
First I reshaped as described in the documentation in the link below the question.
In general ggplot does not support multiple y-axis. I think it is a philosophical thing. But maybe faceting will work for you.
df <- read.table(text = "day V1 V2 V3 V4
m 5 0.1 120 1
t 10 0.2 100 10
w 2 0.6 110 6
t 15 0.5 120 8
f 20 0.8 100 8", header = TRUE)
library(reshape2)
df <- melt(df, id.vars = 'day')
ggplot(df, aes(x = variable, y = value, fill = variable)) + geom_bar(stat = "identity") + facet_grid(.~day)
If I understand correctly you want to include facets in your plot. You have to use reshape2 to get the data in the right format. Here's an example with your data:
df <- data.frame(
day=c("m","t","w","t","f"),
V1=c(5,10,20,15,20),
V2=c(0.1,0.2,0.6,0.5,0.8),
V3=c(120,100,110,120,100),
V4=c(1,10,6,8,8)
)
library(reshape2)
df <- melt(df, "day")
Then plot with and include facet_grid argument:
ggplot(df, aes(x=day, y=value)) + geom_bar(stat="identity", aes(fill=variable)) +
facet_grid(variable ~ .)
I have a csv file which looks like the following:
Name,Count1,Count2,Count3
application_name1,x1,x2,x3
application_name2,x4,x5,x6
The x variables represent numbers and the applications_name variables represent names of different applications.
Now I would like to make a barplot for each row by using ggplot2. The barplot should have the application_name as title. The x axis should show Count1, Count2, Count3 and the y axis should show the corresponding values (x1, x2, x3).
I would like to have a single barplot for each row, because I have to store the different plots in different files. So I guess I cannot use "melt".
I would like to have something like:
for each row in rows {
print barplot in file
}
Thanks for your help.
You can use melt to rearrange your data and then use either facet_wrap or facet_grid to get a separate plot for each application name
library(ggplot2)
library(reshape2)
# example data
mydf <- data.frame(name = paste0("name",1:4), replicate(5,rpois(4,30)))
names(mydf)[2:6] <- paste0("count",1:5)
# rearrange data
m <- melt(mydf)
# if you are wanting to export each plot separately
# I used facet_wrap as a quick way to add the application name as a plot title
for(i in levels(m$name)) {
p <- ggplot(subset(m, name==i), aes(variable, value, fill = variable)) +
facet_wrap(~ name) +
geom_bar(stat="identity", show_guide=FALSE)
ggsave(paste0("figure_",i,".pdf"), p)
}
# or all plots in one window
ggplot(m, aes(variable, value, fill = variable)) +
facet_wrap(~ name) +
geom_bar(stat="identity", show_guide=FALSE)
I didn't see #user20650's nice answer before preparing this. It's almost identical, except that I use plyr::d_ply to save things instead of a loop. I believe dplyr::do() is another good option (you'd group_by(Name) first).
yourData <- data.frame(Name = sample(letters, 10),
Count1 = rpois(10, 20),
Count2 = rpois(10, 10),
Count3 = rpois(10, 8))
library(reshape2)
yourMelt <- melt(yourData, id.vars = "Name")
library(ggplot2)
# Test a function on one piece to develope graph
ggplot(subset(yourMelt, Name == "a"), aes(x = variable, y = value)) +
geom_bar(stat = "identity") +
labs(title = subset(yourMelt, Name == 'a')$Name)
# Wrap it up, with saving to file
bp <- function(dat) {
myPlot <- ggplot(dat, aes(x = variable, y = value)) +
geom_bar(stat = "identity") +
labs(title = dat$Name)
ggsave(filname = paste0("path/to/save/", dat$Name, "_plot.pdf"),
myPlot)
}
library(plyr)
d_ply(yourMelt, .variables = "Name", .fun = bp)
I'm looking for a way to plot a bar chart containing two different series, hide the bars for one of the series and instead have a line (smooth if possible) go through the top of where bars for the hidden series would have been (similar to how one might overlay a freq polynomial on a histogram). I've tried the example below but appear to be running into two problems.
First, I need to summarize (total) the data by group, and second, I'd like to convert one of the series (df2) to a line.
df <- data.frame(grp=c("A","A","B","B","C","C"),val=c(1,1,2,2,3,3))
df2 <- data.frame(grp=c("A","A","B","B","C","C"),val=c(1,4,3,5,1,2))
ggplot(df, aes(x=grp, y=val)) +
geom_bar(stat="identity", alpha=0.75) +
geom_bar(data=df2, aes(x=grp, y=val), stat="identity", position="dodge")
You can get group totals in many ways. One of them is
with(df, tapply(val, grp, sum))
For simplicity, you can combine bar and line data into a single dataset.
df_all <- data.frame(grp = factor(levels(df$grp)))
df_all$bar_heights <- with(df, tapply(val, grp, sum))
df_all$line_y <- with(df2, tapply(val, grp, sum))
Bar charts use a categorical x-axis. To overlay a line you will need to convert the axis to be numeric.
ggplot(df_all) +
geom_bar(aes(x = grp, weight = bar_heights)) +
geom_line(aes(x = as.numeric(grp), y = line_y))
Perhaps your sample data aren't representative of the real data you are working with, but there are no lines to be drawn for df2. There is only one value for each x and y value. Here's a modifed version of your df2 with enough data points to construct lines:
df <- data.frame(grp=c("A","A","B","B","C","C"),val=c(1,2,3,1,2,3))
df2 <- data.frame(grp=c("A","A","B","B","C","C"),val=c(1,4,3,5,0,2))
p <- ggplot(df, aes(x=grp, y=val))
p <- p + geom_bar(stat="identity", alpha=0.75)
p + geom_line(data=df2, aes(x=grp, y=val), colour="blue")
Alternatively, if your example data above is correct, you can plot this information as a point with geom_point(data = df2, aes(x = grp, y = val), colour = "red", size = 6). You can obviously change the color and size to your liking.
EDIT: In response to comment
I'm not entirely sure what the visual for a freq polynomial over a histogram is supposed to look like. Are the x-values supposed to be connected to one another? Secondly, you keep referring to wanting lines but your code shows geom_bar() which I assume isn't what you want? If you want lines, use geom_lines(). If the two assumptions above are correct, then here's an approach to do that:
#First let's summarise df2 by group
df3 <- ddply(df2, .(grp), summarise, total = sum(val))
> df3
grp total
1 A 5
2 B 8
3 C 3
#Second, let's plot df3 as a line while treating the grp variable as numeric
p <- ggplot(df, aes(x=grp, y=val))
p <- p + geom_bar(alpha=0.75, stat = "identity")
p + geom_line(data=df3, aes(x=as.numeric(grp), y=total), colour = "red")