I am working on a bar graph that shows counts of cats and dogs across countries. Cats and dogs are levels stored in different factors (variables). I want to plot the bars for each animal count on top of one another (i.e. 2 layers), and then order the bars from the tallest (i.e. highest count) to the lowest according to animal frequency per country.
Here is what I did:
Order the data table according to animal counts per country
plot <- within(plot, country <- factor(country,
levels=names(sort(table(country), decreasing=TRUE))))
Plot the graph
gg <- ggplot(data = plot, aes(x=country))
Add bar for dogs
dogs <- gg +
geom_bar(data = plot[plot$animal1 == 'dog',], #select dogs from animal1 variable
stat="count")
If I do that, I get this (with one geom_bar):
So far, so good. Next, I add the second geom_bar for the cats:
dogs_cats <- gg +
geom_bar(data = plot[plot$animal1 == 'dog',], #select dogs from animal1 variable
stat="count") +
geom_bar(data = plot[plot$animal2 == 'cat',], #select cats from animal2 variable
stat="count")
Now the order is changed and off-key (after the second geom_bar):
How can I maintain the order of the bars to follow the initial geom_bar?
Many thanks!
I suggest you summarize and reshape the data (ddply and melt) to create a new data frame:
1.Sum up (ddply and melt)
require(plyr) #ddply
require(reshape2) # melt
df = ddply(plot, "country", summarize, dogs = sum(animal1 == "dog"),
cats = sum(animal2 == "cat"))
dogs_and_cats = melt(df, id = "country")
You should now have a new data frame with 3 columns:
country
variable: "dogs" or "cats"
value: number of dogs/cats (per country)
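As a rough illustration of that long layout, here is a base-R sketch on a made-up six-row table (the toy data and the names toy, wide and long are assumptions, not the asker's data). It counts per country with tapply() instead of ddply() and stacks the two count columns by hand instead of calling melt():

```r
# Hypothetical toy data with the same column names as in the question
toy <- data.frame(
  country = c("A", "A", "B", "B", "B", "C"),
  animal1 = c("dog", "other", "dog", "dog", "other", "dog"),
  animal2 = c("cat", "cat", "other", "cat", "other", "other")
)

# One row per country with a count column per animal (the ddply step)
dogs <- tapply(toy$animal1 == "dog", toy$country, sum)
cats <- tapply(toy$animal2 == "cat", toy$country, sum)
wide <- data.frame(country = names(dogs),
                   dogs = as.vector(dogs),
                   cats = as.vector(cats))

# Long format: one row per country/animal pair (the melt step)
long <- data.frame(
  country  = rep(wide$country, times = 2),
  variable = rep(c("dogs", "cats"), each = nrow(wide)),
  value    = c(wide$dogs, wide$cats)
)
print(long)
```

The long table has one row per bar, which is what geom_bar(stat = "identity") expects.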
2.Plot
ggplot(dogs_and_cats , aes(x = reorder(country, -value), y = value, fill = variable)) +
geom_bar(stat = "identity", position = "dodge")
3.Example:
Since the question has no reproducible example, here is one with the diamonds dataset (from ggplot2):
df = ddply(diamonds, "cut", summarize, J = sum(color == "J"),
D = sum(color == "D"))
plot = melt(df, id = "cut")
ggplot(plot, aes(x = reorder(cut, -value), y = value, fill = variable)) +
geom_bar(stat = "identity", position = "dodge")
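Note that reorder() summarises its second argument per level with mean() by default, so in the long data each country is ranked by the average of its dog and cat counts. If the bars should instead follow the dog counts only, one option is to set the factor levels from the dog rows. A minimal base-R sketch, assuming a hypothetical long table shaped like dogs_and_cats (the numbers are invented):

```r
# Hypothetical long table: one row per country/animal pair
long <- data.frame(
  country  = c("A", "B", "C", "A", "B", "C"),
  variable = rep(c("dogs", "cats"), each = 3),
  value    = c(10, 30, 20, 40, 5, 1)
)

# Default behaviour: levels ranked by mean(-value) over both animals
by_mean <- levels(reorder(long$country, -long$value))

# Alternative: rank countries by their dog counts only
dog_rows <- long[long$variable == "dogs", ]
by_dogs  <- as.character(dog_rows$country[order(-dog_rows$value)])
long$country <- factor(long$country, levels = by_dogs)

print(by_mean)
print(by_dogs)
```

With the levels fixed this way, aes(x = country) can be used directly and both animals share the same bar order.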
Hmm, I ran code like yours, but the order of the bars didn't change.
Perhaps you made a simple mistake somewhere.
library(ggplot2)
# make sample data (stringsAsFactors = TRUE so that country is a factor in R >= 4.0 as well)
set.seed(1)
d <- data.frame(animal1 = sample(c("dog", "other"), 10000, replace=TRUE, prob=c(0.7,0.3)),
animal2 = sample(c("cat", "other"), 10000, replace=TRUE, prob=c(0.3,0.7)),
country = sample(LETTERS[1:15], 10000, replace=TRUE, prob=runif(15,0,1)),
stringsAsFactors = TRUE)
levels(d$country) # [1] "A" "B" "C" "D" ...
plot <- within(d, country <- factor(country, levels=names(sort(table(country), decreasing=TRUE))))
levels(plot$country) # [1] "N" "O" "L" "F" ...
gg <- ggplot(data = plot, aes(x=country))
dogs <- gg + geom_bar(data = plot[plot$animal1 == "dog",], stat="count", fill="darkblue")
dogs_cats <- gg +
geom_bar(data = plot[plot$animal1 == "dog",], stat="count", fill="darkblue") +
geom_bar(data = plot[plot$animal2 == "cat",], stat="count", fill="blue")
print(dogs)
print(dogs_cats) # I combined the two graphs into the image below using library(grid).
I have a bar plot using geom_bar() on which I'd like to overlay points using geom_point(). The issue is the ordering of the axis labels. I have 2 groups: group A, which I want to show with geom_bar() ordered from high to low, and group B, which I want to show with points using geom_point(). Group A and B will not always have the same categories, but I always want group A shown with bars and ordered from high to low.
If you run this code you will see just the bar plot, correctly ordered. I need the pet supercategory shown first and then the car supercategory. I have defined supercategory as an ordered factor and it is working.
Then, within each supercategory, the bars are sorted by group A's value from high to low. You can see that dog is higher than the others in the pet category, and kia is higher than the others in the car category.
library(dplyr)
group = c("A","A","A","B","B","B","A","A","A","B","B","B")
supercategory = c("pet", "pet","pet","pet","pet","pet","car","car","car","car","car","car")
category = c("bird","cat","dog","bird","cat","lizard","ford","chevy","kia","kia","toyota","ford")
supercategory = factor(supercategory, levels= c("pet", "car"), ordered = TRUE)
value=c(3,4,5,4,5,6,1,3,10,8,3,5)
dat = data.frame(group = group,supercategory = supercategory, category = category, value = value )
dat = dat %>% mutate(LABEL = paste0(supercategory, "-",category), HIGH_VALUE = ifelse(group =="A",value,0)) %>%
arrange(supercategory, -HIGH_VALUE)
# after the lines above the data is ordered correctly: first by supercategory, then by group A's value from highest to lowest using the HIGH_VALUE field
dat$ROW_NUMBER = 1:nrow(dat)
dat = dat %>% group_by(supercategory,category) %>% mutate(ROW_NUMBER2= min(ROW_NUMBER)) %>% arrange( supercategory ,ROW_NUMBER2)
# after the 2 lines above now the data is sorted by ROW_NUMBER2 which orders the category within supercategory.
# Group A will be shown as bars using geom_bar
# Group B will be displayed with points using geom_point
# The bars and points should be in the order of ROW_NUMBER2
library(ggplot2)
dat$LABEL = factor(dat$LABEL, levels = unique(dat$LABEL), ordered = TRUE)
ggplot(dat[dat$group=="A",] , aes(x = LABEL, y = value))+
geom_bar(stat="identity")
I'd like to keep the ordering of the plot above and just add the points above the bars. And if Group B has a category that is not one of Group A's the point should be to the right of Group A's last bar within whatever supercategory it is in.
But when I try to add the points the ordering gets messed up. Run this code which just adds group B's data as points and you will see the order of the labels gets messed up.
library(ggplot2)
dat$LABEL = factor(dat$LABEL, levels = unique(dat$LABEL), ordered = TRUE)
ggplot(dat[dat$group=="A",] , aes(x = LABEL, y = value))+
geom_bar(stat="identity") +
geom_point(data = dat[dat$group=="B",], aes(x = LABEL, y = value), shape=15, size = 3, color = "blue" )
How can I add this line to the plot:
geom_point(data = dat[dat$group=="B",], aes(x = LABEL, y = value), shape=15, size = 3, color = "blue" )
while keeping group A's ordering?
The groups do not share the same set of values, so you have to force the x-axis order by adding:
+ scale_x_discrete(limits = levels(dat$LABEL))
Then:
ggplot(data = dat , aes(x = LABEL, y = value) ) +
geom_bar(data = dat[dat$group=="A",], stat="identity") +
geom_point(data = dat[dat$group=="B",], shape=15, size = 3, color = "blue") +
scale_x_discrete(limits = levels(dat$LABEL))
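To see why explicit limits help here, it can be useful to compare which labels each layer's data actually contains. The sketch below uses a small hypothetical factor (not the full dat from the question) and base R only. Neither subset covers every label, while the levels of the full factor hold the complete, ordered set that can be passed as limits:

```r
# Hypothetical labels: group A has three categories, group B adds "pet-lizard"
label <- factor(
  c("pet-dog", "pet-cat", "pet-bird", "pet-cat", "pet-lizard"),
  levels = c("pet-dog", "pet-cat", "pet-bird", "pet-lizard")
)
group <- c("A", "A", "A", "B", "B")

# Labels actually present in each layer's data
labels_a <- levels(droplevels(label[group == "A"]))
labels_b <- levels(droplevels(label[group == "B"]))

# Complete ordering, suitable for scale_x_discrete(limits = ...)
full_order <- levels(label)

print(labels_a)
print(labels_b)
print(full_order)
```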
I agree with @Cédric Miachon.
The problem is that the two layers use different x values.
A possible workaround is to introduce NAs for the x values that are not present in a group:
require(reshape2)
require(dplyr)
require(tidyr)
vector_f <- unique(dat$LABEL)
dat1 <- dat %>%
dcast(group+supercategory~LABEL, value.var = 'value') %>% #casting and gathering
gather(label, value , 3:10)
ggplot() +
geom_bar(data = dat1[dat1$group=="A",],aes(x = factor(label, levels = vector_f), y = value), stat="identity") +
geom_point(data = dat1[dat1$group=="B",], aes(x = factor(label, levels = vector_f), y = value))
##I removed some of the geom_point layout specs
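The same NA-padding idea can be sketched in base R, without dcast() and gather(): build every group/label combination with expand.grid() and left-join the observed values with merge(). The data frame below is a made-up miniature (the names dat_small, all_pairs and padded are assumptions for illustration):

```r
# Hypothetical miniature of the data: group B lacks "x", group A lacks "z"
dat_small <- data.frame(
  group = c("A", "A", "B", "B"),
  label = c("x", "y", "y", "z"),
  value = c(3, 4, 5, 6)
)

# Every group/label combination
all_pairs <- expand.grid(
  group = unique(dat_small$group),
  label = unique(dat_small$label),
  stringsAsFactors = FALSE
)

# Left join: combinations without an observation get value = NA
padded <- merge(all_pairs, dat_small, all.x = TRUE)
print(padded)
```

After padding, both groups carry the same set of labels, so each layer sees the full x axis.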
I would like to create a plot by:
1. Using part of the data to create a base plot with facet_grid of two columns.
2. Using the remaining part of the data to plot on top of the existing facets, but using only a single column.
The sample code:
library(ggplot2)
library(dplyr) # for %>% and filter()
library(gridExtra)
df2 <- data.frame(Class=rep(c('A','B','C'),each=20),
Type=rep(rep(c('T1','T2'),each=10), 3),
X=rep(rep(1:10,each=2), 3),
Y=c(rep(seq(3,-3, length.out = 10),2),
rep(seq(1,-4, length.out = 10),2),
rep(seq(-2,-8, length.out = 10),2)))
g2 <- ggplot() + geom_line(data = df2 %>% filter(Class %in% c('B','C')),
aes(X,Y,color=Class, linetype=Type)) +
facet_grid(Type~Class)
g3 <- ggplot() + geom_line(data = df2 %>% filter(Class == 'A'),
aes(X,Y,color=Class, linetype=Type)) +
facet_wrap(~Type)
grid.arrange(g2, g3)
The output plots:
How can I include the g3 plot in the g2 plot? The resulting plot should show each of the two g3 lines twice, once in each facet column.
I assume the plot below is what you were looking for.
library(dplyr)
library(ggplot2)
df_1 <- filter(df2, Class %in% c('B','C')) %>%
dplyr::rename(Class_1 = Class)
df_2 <- filter(df2, Class == 'A')
g2 <- ggplot() +
geom_line(data = df_1,
aes(X, Y, color = Class_1, linetype = Type)) +
geom_line(data = df_2,
aes(X, Y, color = Class, linetype = Type)) +
facet_grid(Type ~ Class_1)
g2
Explanation
For tasks like this I find it better to work with two datasets. Since the variable df2$Class has three unique values (A, B and C), faceting by Type ~ Class does not give you the desired plot, because you want the data for df2$Class == "A" to be displayed in each of the B and C facets.
That's why I renamed the variable Class in df_1 to Class_1: this variable only contains two unique values, B and C.
Faceting by Type ~ Class_1 allows you to plot the data for df2$Class == "A" on top without it being faceted by Class.
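The underlying behaviour is that a layer whose data does not contain the faceting variable is drawn in every panel. Below is a small sketch of this, using invented data (panel_data, ref and grp are hypothetical names) and layer_data() to inspect which panels the second layer ends up in:

```r
library(ggplot2)

# Hypothetical toy data: 'panel_data' has the facet variable, 'ref' does not
panel_data <- data.frame(
  x   = rep(1:3, 2),
  y   = c(1, 2, 3, 3, 2, 1),
  grp = rep(c("B", "C"), each = 3)
)
ref <- data.frame(x = 1:3, y = c(2, 2, 2))

p <- ggplot() +
  geom_line(data = panel_data, aes(x, y)) +
  geom_line(data = ref, aes(x, y), linetype = "dashed") +
  facet_wrap(~grp)

# Panels in which the reference layer (layer 2) is drawn
panels_with_ref <- unique(layer_data(p, 2)$PANEL)
print(length(panels_with_ref))
```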
Edit
Based on the comment below here is a solution using only one dataset
g2 + geom_line(data = filter(df2, Class == 'A')[, -1],
aes(X, Y, linetype = Type, col = "A"))
Similar / same question: ggplot2:: Facetting plot with the same reference plot in all panels
How can I use geom_segment to draw lines on plot, after the data have been melted with reshape2?
# Tiny dataset
facet_group <- c("facet1", "facet1", "facet2", "facet2")
time_group <- c("before", "after", "before", "after")
variable1 <- c(1,5,4,7)
variable2 <- c(2,4,5,8)
variable3 <- c(4,5,6,7)
data <- data.frame(facet_group, time_group, variable1, variable2, variable3)
# Melt data
library(reshape2)
data_melt <- melt(data, id.vars = c("facet_group", "time_group"))
Plot the data:
# Plot 1
library(ggplot2)
ggplot(data_melt, aes(x=variable, y=value, group = time_group)) +
geom_point(aes(color = time_group))
Add faceting:
# Plot 2
ggplot(data_melt, aes(x=variable, y=value, group = time_group)) +
geom_point(aes(color = time_group)) +
facet_grid(facet_group ~ .)
I want to draw a segment from the "before" point to the "after" point for each variable (see mock-up image). How can I do this? I tried some things with geom_segment but I kept getting errors. Will casting the data into a new data frame help? Thanks!
data_cast <- dcast(data_melt, variable + facet_group ~ time_group)
Final "ideal" plot:
You were definitely on the right track with the casted data. Give this a shot:
ggplot(data_melt, aes(x=variable, y=value)) +
geom_point(aes(color = time_group)) +
facet_grid(facet_group ~ .) +
geom_segment(data = data_cast, aes(x = variable, xend = variable,
y = before, yend = after),
arrow = arrow(),
colour = "#FF3EFF",
size = 1.25)
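For readers unsure what the casted data needs to look like, geom_segment() wants one row per segment, with the start and end values in separate columns. A base-R sketch of that reshaping, rebuilding the melted data from the question by hand and using merge() in place of dcast() (the names before, after and wide are assumptions):

```r
# The melted data from the question, written out by hand
long <- data.frame(
  facet_group = rep(c("facet1", "facet1", "facet2", "facet2"), times = 3),
  time_group  = rep(c("before", "after", "before", "after"), times = 3),
  variable    = rep(c("variable1", "variable2", "variable3"), each = 4),
  value       = c(1, 5, 4, 7, 2, 4, 5, 8, 4, 5, 6, 7)
)

# Split into start and end points, then join them side by side
before <- long[long$time_group == "before", c("facet_group", "variable", "value")]
after  <- long[long$time_group == "after",  c("facet_group", "variable", "value")]
names(before)[3] <- "before"
names(after)[3]  <- "after"
wide <- merge(before, after, by = c("facet_group", "variable"))
print(wide)
```

Each row of wide then maps to y = before and yend = after, and keeping facet_group in the table lets the segments land in the right facet.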
I have a data frame with five columns and five rows. The data frame looks like this:
df <- data.frame(
day=c("m","t","w","t","f"),
V1=c(5,10,20,15,20),
V2=c(0.1,0.2,0.6,0.5,0.8),
V3=c(120,100,110,120,100),
V4=c(1,10,6,8,8)
)
I want to make some plots, so I used ggplot, in particular geom_bar:
ggplot(df, aes(x = day, y = V1, group = 1)) + ylim(0,20)+ geom_bar(stat = "identity")
ggplot(df, aes(x = day, y = V2, group = 1)) + ylim(0,1)+ geom_bar(stat = "identity")
ggplot(df, aes(x = day, y = V3, group = 1)) + ylim(50,200)+ geom_bar(stat = "identity")
ggplot(df, aes(x = day, y = V4, group = 1)) + ylim(0,15)+ geom_bar(stat = "identity")
My question is: how can I do a grouped ggplot with geom_bar with multiple y axes? I want the day on the x axis, and for each day I want to plot four bars (V1, V2, V3, V4), each with a different range and color. Is that possible?
EDIT
I want the y axis to look like this:
require(reshape)
data.m <- melt(df, id.vars='day')
ggplot(data.m, aes(day, value)) +
geom_bar(aes(fill = variable), position = "dodge", stat="identity") +
facet_grid(variable ~ .)
You can also change the y-axis limits if you like (here's an example).
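If the aim is for each panel to get its own y range, one option is the scales argument of facet_grid(). The sketch below is an illustration rather than the answerer's code: it reshapes with base stack() instead of melt(), and it relabels the days as m, tu, w, th, f (an assumption, so that the two "t" days do not collapse into a single bar):

```r
library(ggplot2)

# Values from the question; day labels changed so that each day is unique
df <- data.frame(
  day = factor(c("m", "tu", "w", "th", "f"),
               levels = c("m", "tu", "w", "th", "f")),
  V1 = c(5, 10, 20, 15, 20),
  V2 = c(0.1, 0.2, 0.6, 0.5, 0.8),
  V3 = c(120, 100, 110, 120, 100),
  V4 = c(1, 10, 6, 8, 8)
)

# Long format: one row per day/variable pair
stacked <- stack(df[, c("V1", "V2", "V3", "V4")])
long <- data.frame(day = rep(df$day, times = 4),
                   variable = stacked$ind,
                   value = stacked$values)

# scales = "free_y" lets every row of panels choose its own y range
p <- ggplot(long, aes(day, value, fill = variable)) +
  geom_col() +
  facet_grid(variable ~ ., scales = "free_y")
print(p)
```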
Alternately you may have meant grouped like this:
require(reshape)
data.m <- melt(df, id.vars='day')
ggplot(data.m, aes(day, value)) +
geom_bar(aes(fill = variable), position = "dodge", stat="identity")
For the latter example, if you want 2 y axes, you can create the plot twice (once with a left y axis and once with a right y axis) and then use this function (it needs the gtable package for gtable_add_grob() and gtable_add_cols()):
library(gtable)
double_axis_graph <- function(graf1,graf2){
graf1 <- graf1
graf2 <- graf2
gtable1 <- ggplot_gtable(ggplot_build(graf1))
gtable2 <- ggplot_gtable(ggplot_build(graf2))
par <- c(subset(gtable1[['layout']], name=='panel', select=t:r))
graf <- gtable_add_grob(gtable1, gtable2[['grobs']][[which(gtable2[['layout']][['name']]=='panel')]],
par['t'],par['l'],par['b'],par['r'])
ia <- which(gtable2[['layout']][['name']]=='axis-l')
ga <- gtable2[['grobs']][[ia]]
ax <- ga[['children']][[2]]
ax[['widths']] <- rev(ax[['widths']])
ax[['grobs']] <- rev(ax[['grobs']])
ax[['grobs']][[1]][['x']] <- ax[['grobs']][[1]][['x']] - unit(1,'npc') + unit(0.15,'cm')
graf <- gtable_add_cols(graf, gtable2[['widths']][gtable2[['layout']][ia, ][['l']]], length(graf[['widths']])-1)
graf <- gtable_add_grob(graf, ax, par['t'], length(graf[['widths']])-1, par['b'])
return(graf)
}
I believe there's also a package or convenience function that does the same thing.
First I reshaped as described in the documentation in the link below the question.
In general, ggplot does not support multiple y-axes. I think it is a philosophical choice. But maybe faceting will work for you.
df <- read.table(text = "day V1 V2 V3 V4
m 5 0.1 120 1
t 10 0.2 100 10
w 20 0.6 110 6
t 15 0.5 120 8
f 20 0.8 100 8", header = TRUE)
library(reshape2)
df <- melt(df, id.vars = 'day')
ggplot(df, aes(x = variable, y = value, fill = variable)) + geom_bar(stat = "identity") + facet_grid(.~day)
If I understand correctly you want to include facets in your plot. You have to use reshape2 to get the data in the right format. Here's an example with your data:
df <- data.frame(
day=c("m","t","w","t","f"),
V1=c(5,10,20,15,20),
V2=c(0.1,0.2,0.6,0.5,0.8),
V3=c(120,100,110,120,100),
V4=c(1,10,6,8,8)
)
library(reshape2)
df <- melt(df, "day")
Then plot, adding facet_grid:
ggplot(df, aes(x=day, y=value)) + geom_bar(stat="identity", aes(fill=variable)) +
facet_grid(variable ~ .)
I have a stacked barplot with the following data
df <- expand.grid(name = c("oak","birch","cedar"),
sample = c("one","two"),
type = c("sapling","adult","dead"))
df$count <- sample(5:200, size = nrow(df), replace = T)
I generate a barplot and try to add the group labels to it:
ggplot(df, aes(x = name, y = count, fill = type)) +
geom_bar(stat = "identity") +
coord_flip() +
theme(legend.position="none") +
geom_text(aes(label = type, position = "stack"))
It produces:
Two to three questions arise:
How can I make the labels appear in the top bar only?
How can I make the labels appear in the center of the bar section?
Optionally: How can I make the labels appear on top of the top bar being connected to their sections by arrows?
There is a link suggested above. That will help you. Here, I have another suggestion.
set.seed(123)
df <- expand.grid(name = c("oak","birch","cedar"),
sample = c("one","two"),
type = c("sapling","adult","dead"))
df$count <- sample(5:200, size = nrow(df), replace = T)
### Arrange a data frame (summing up sample one and two)
library(dplyr)
ana <- df %>%
group_by(name, type) %>%
summarise(total = sum(count))
# Draw a figure once
bob <- ggplot(ana, aes(x = name, y = total, fill = type)) +
geom_bar(stat = "identity", position = "stack")
# Get a data frame for ggplot
cathy <- ggplot_build(bob)$data[[1]]
# calculate text position & add text labels
cathy$y_pos <- (cathy$ymin + cathy$ymax) / 2
cathy$label <- rep(c("sapling", "adult", "dead"), times = 3)
# Subset the data for labeling for the top bar
dan <- cathy[c(7:9), ]
# Draw a figure again
bob +
annotate(x = dan$x, y = dan$y_pos, label = dan$label, geom="text", size=3) +
coord_flip()
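A shorter route that may also work, assuming ggplot2 2.2.0 or later, is to let geom_text() do the stacking itself with position_stack(vjust = 0.5), which places each label in the middle of its bar section. The sketch below also restricts the text layer to the last level of name ("cedar"), which ends up as the top bar after coord_flip(); the names totals and p are assumptions:

```r
library(ggplot2)

set.seed(123)
df <- expand.grid(name = c("oak", "birch", "cedar"),
                  sample = c("one", "two"),
                  type = c("sapling", "adult", "dead"))
df$count <- sample(5:200, size = nrow(df), replace = TRUE)

# Sum samples one and two so that there is one row per bar section
totals <- aggregate(count ~ name + type, data = df, FUN = sum)

p <- ggplot(totals, aes(x = name, y = count, fill = type)) +
  geom_col() +
  # Label only the "cedar" bar; vjust = 0.5 centres text in each section
  geom_text(data = totals[totals$name == "cedar", ],
            aes(label = type),
            position = position_stack(vjust = 0.5),
            size = 3) +
  coord_flip() +
  theme(legend.position = "none")
print(p)
```

Because the text layer inherits fill = type, its groups are stacked in the same order as the bar sections.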