R ggplot2 with stacked column instead of grouped - r

I want to plot the data shown below as a grouped bar plot.
I tried position = "dodge" and position = "dodge2", but neither worked. I also tried position = position_dodge().
It kind of works if I use geom_bar instead of geom_col and remove y = overlap_percent:
p3 <- ggplot(data = comp_coors, aes(x = species_code, fill = mirna_form)) +
geom_bar(position = "dodge2") + theme_classic()
p3
but I would like the y-axis to show overlap_percent.
Another attempt, which ends up as a stacked bar plot, is:
p2 <- ggplot(data = comp_coors, aes(x = species_code, y = overlap_percent, fill = mirna_form)) +
geom_bar(stat = "identity") + theme_classic()
p2
Finally, using geom_col returns this, which doesn't make sense, at least to me:
p4 <- ggplot(data = comp_coors, aes(x = species_code, y = overlap_percent, fill = mirna_form)) +
geom_col(position = "dodge") + theme_classic()
p4
The data that I want to plot:
comp_coors <- data.table( species = c("aae","cel", "dme","hsa", "mdo"),
mirna_form = c("mature", "precursor"),
overlap_percent = c(100.0, 100.0, 88.0, 95.5, 91.7, 100.0, 96.6, 98.4),
overlapping_attribute = c("ID=MIMAT0014285;Alias=MIMAT0014285", "ID=MI0000043;Alias=MI0000043;Name=cel-mir-72", "ID=MIMAT0000401;Alias=MIMAT0000401;Name=dme-miR-", "ID=MI0000791;Alias=MI0000791;Name=hsa-mir-383", "ID=MI0005331;Alias=MI0005331;Name=mdo-let-7g")
)

Try using species as a factor and add stat = "identity" like this:
ggplot(data = comp_coors, aes(x = factor(species), y = overlap_percent, fill = mirna_form)) +
geom_bar(position = "dodge", stat = "identity") + theme_classic() + labs(x = "Species", y = "Overlap percent")
Output:
A grouped barplot with overlap_percent on y-axis right.
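As a minimal sketch of the same idea with geom_col(), assuming the data has exactly one row per species/form pair: the toy data frame below is hypothetical (its values are made up for illustration, not taken from the question), and layer_data() is only used to inspect the bars that ggplot2 computed.

```r
library(ggplot2)

# Hypothetical data: one row per species/form pair (values are made up).
toy <- data.frame(
  species = rep(c("aae", "cel", "dme"), each = 2),
  mirna_form = rep(c("mature", "precursor"), times = 3),
  overlap_percent = c(100, 95.5, 88, 100, 91.7, 98.4)
)

p <- ggplot(toy, aes(x = species, y = overlap_percent, fill = mirna_form)) +
  geom_col(position = "dodge") +   # geom_col() already uses stat = "identity"
  theme_classic() +
  labs(x = "Species", y = "Overlap percent")

# layer_data() returns the computed bar geometry, one row per bar.
bars <- layer_data(p)
print(p)
```

If a real dataset has several rows for the same species/form pair, summarising it down to one value per pair before plotting may be needed for the dodged bars to be readable.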

Related

mean line in every facet_wrap

ggplot(data = results, aes(x = inst, y = value, group = inst)) +
geom_boxplot() +
facet_wrap(~color) +
#geom_line(data = mean,
#mapping = aes(x = inst, y = average, group = 1))
theme_bw()
When I run the code above with the geom_line lines commented out, it runs and gives the graph below. However, I want a line joining the means across the boxplots, drawn separately for each color category in its own facet. Any ideas how I can do that?
Your code is generally correct (though you'll want to add color = color to the aes() specification in geom_line()), so I suspect your mean dataset isn't set up correctly. Do you have means grouped by both your x axis and faceting variable? Using ggplot2::mpg as an example:
library(dplyr) # >= v1.1.0
library(ggplot2)
mean_dat <- summarize(mpg, average = mean(hwy), .by = c(cyl, drv))
ggplot(mpg, aes(factor(cyl), hwy)) +
geom_boxplot() +
geom_line(
data = mean_dat,
aes(y = average, group = 1, color = drv),
linewidth = 1.5,
show.legend = FALSE
) +
facet_wrap(~drv) +
theme_bw()
Alternatively, you could use stat = "summary" and not have to create a means dataframe at all:
ggplot(mpg, aes(factor(cyl), hwy)) +
geom_boxplot() +
geom_line(
aes(group = 1, color = drv),
stat = "summary",
linewidth = 1.5,
show.legend = FALSE
) +
facet_wrap(~drv) +
theme_bw()
# same result as above

ggplot: adding a frequency plot over a percentage plot

I am interested in making a plot showing percentages by group,
something like this:
data(iris)
ggplot(iris,
aes(x = Sepal.Length, group = factor(Species), fill = factor(Species))) +
geom_histogram(position = "fill")+theme_bw()
However, I would also like to plot a histogram showing the frequency distribution on top of this graph,
something like the plot below.
ggplot(iris,aes(x = Sepal.Length)) +
geom_histogram()+theme_bw()
Does anyone know how to do this?
Note I know how to do a frequency plot by group: ggplot(iris,aes(x = Sepal.Length, group = factor(Species), fill = factor(Species))) + geom_histogram()+theme_bw(). But this is not what I want. Rather I would like a small frequency distribution at the bottom of the percentage plot presented at the beginning.
Thank you very much
Something like this?
library(gridExtra)
p1 <- ggplot(iris,
aes(x = Sepal.Length,
group = factor(Species),
fill = factor(Species))) +
geom_histogram(position = "fill") +
theme_bw() +
theme(legend.position = "top")
p2 <- ggplot(iris,aes(x = Sepal.Length,
group = factor(Species),
fill = factor(Species))) +
geom_histogram() +
theme_bw() +
theme(legend.position = "none")
grid.arrange(p1, p2,
heights = c(4, 1.5))
Edit: So you are looking for this then? Note that in this case the absolute values of the smaller histogram become meaningless since they were scaled down to be ~25% of the vertical chart range.
ggplot() +
geom_histogram(data = iris,
aes(x = Sepal.Length,
group = factor(Species),
fill = factor(Species)),
position = "fill",
alpha = 1) +
geom_histogram(data = iris,
aes(x = Sepal.Length,
y = after_stat(ncount) / 4), # after_stat() replaces the deprecated ..ncount.. notation (ggplot2 >= 3.3.0)
alpha = 0.5,
fill = 'black')

geom_bar and geom_point in the same ggplot and within the same groups

I have the current code
ggplot(data = niveles[niveles$departamento=="CUNDINAMARCA" &
niveles$prueba=="MATEMÁTICAS" &
!is.na(niveles$nivel),]) +
geom_bar(stat="identity", position = position_dodge(),
aes(x = año, y = desempeño, fill = nivel)) +
geom_point(data = niveles[niveles$prueba=="MATEMÁTICAS" &
niveles$departamento=="COLOMBIA" &
!is.na(niveles$nivel),], shape = 24,
aes(x = año, y = desempeño, group = nivel, fill = "blue"))
which gives me the following plot:
However, I was hoping to get each of the "points" within its corresponding category of the "nivel" variable. Does anyone know how I can do that?
You can dodge points the same way as you dodge bars, using position = position_dodge(). However, you need to add a width argument specifying how much to "dodge". A value of 1 should roughly correspond with the dodged bars (bars are 0.9 wide by default, so width = 0.9 matches them exactly). You also have an unknown "blue" category in the legend. That's because the fill argument should appear outside the aesthetic (aes).
I also think you should subset the data first instead of doing all that within the ggplot command.
An alternative is to facet by department (see option 2 below).
But first to dodge the points.
Option 1: Subsetting
Create a subset for prueba and non-missing for nivel:
MATH <- niveles[niveles$prueba=="MATEMÁTICAS" & !is.na(niveles$nivel),]
Create subsets for each department:
CUNDINAMARCA <- MATH[MATH$departamento=="CUNDINAMARCA",]
COLOMBIA <- MATH[MATH$departamento=="COLOMBIA",]
Then make your graph:
ggplot(data = CUNDINAMARCA) +
geom_bar(stat="identity", position = position_dodge(),
aes(x = año, y = desempeño, fill = nivel)) +
geom_point(data = COLOMBIA, shape = 24,
position = position_dodge(width=1), # You need this to align points with bars
aes(x = año, y = desempeño, group = nivel), fill = "blue")
I can't test it on your data but I used the mtcars dataset as an example.
library(dplyr)   # for %>% and mutate()
library(ggplot2)
mtcars <- mtcars %>%
mutate(gear=factor(gear), cyl=factor(cyl))
VS0 <- mtcars[mtcars$vs==0,]
VS1 <- mtcars[mtcars$vs==1,]
ggplot() +
geom_bar(data = VS0, stat="identity", position = position_dodge(),
aes(x = cyl, y = mpg, fill = gear)) +
geom_point(data = VS1, shape = 24,
position = position_dodge(width=1),
aes(x = cyl, y = mpg, group = gear), fill = "blue")
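The alignment can also be checked numerically. The following is a sketch under stated assumptions: the data frames region and nation are hypothetical stand-ins for the two departments (made-up scores on a complete year by level grid), and the dodge width for the points is set to 0.9, which is the default bar width in ggplot2, instead of the 1 used above.

```r
library(ggplot2)

# Hypothetical scores on a complete year x level grid (values are made up).
base_grid <- expand.grid(year = factor(2015:2017),
                         level = c("low", "mid", "high"))
region <- transform(base_grid, score = seq(10, 50, length.out = 9))
nation <- transform(base_grid, score = seq(15, 55, length.out = 9))

p <- ggplot(region, aes(x = year, y = score, group = level)) +
  geom_col(aes(fill = level), position = position_dodge()) +
  geom_point(data = nation, shape = 24, fill = "blue",
             position = position_dodge(width = 0.9))  # match default bar width

# Compare the computed x positions of the bars (layer 1) and points (layer 2).
bar_x <- sort(as.numeric(layer_data(p, 1)$x))
point_x <- sort(as.numeric(layer_data(p, 2)$x))
print(p)
```

Both layers need the same number of groups at each x value for the positions to coincide, which is why the sketch uses a complete grid.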
Option 2: Facetting
ggplot(data = mtcars, group=vs) +
geom_bar(stat="identity", position = position_dodge(),
aes(x = cyl, y = mpg, fill = gear)) +
facet_grid(~vs, labeller=label_both)
For your data, maybe this would work:
DATA <- MATH[MATH$departamento %in% c("CUNDINAMARCA","COLOMBIA"),]
ggplot(data = DATA, group=departamento) +
geom_bar(stat="identity", position = position_dodge(),
aes(x = año, y = desempeño, fill = nivel)) +
facet_grid(~departamento, labeller=label_both)

Adding labels to ggplot bar chart

I would like to do a bar plot outlined in black with percentages inside the bars. Is this possible from qplot? I get the percentages to appear but they don't align with the particular bars.
packages: ggplot2, reshape
x <- data.frame(filename = c("file1", "file2", "file3", "file4"),
low = c(-.05,.06,.07,-.14),
hi = c(.87,.98,.56,.79))
x$tot <- x$hi + x$low
x <- melt(x, id = 'filename')
bar <- qplot(x = factor(filename),
y = value*100,
fill = factor(variable),
data = x,
geom = 'bar',
position = 'dodge') + coord_flip()
bar <- bar + scale_fill_manual(name = '',
labels = c('low',
'Hi',
"Tot"),
values = c('#40E0D0',
'#FF6347',
"#C7C7C7"))
bar <- bar + geom_text(aes(label = value*100))+geom_bar(colour = 'black')
bar <- bar + opts(panel.background = theme_rect(colour = NA))
bar <- bar + opts(legend.justification = 'bottom')
print(bar)
Here you go:
library(scales)
ggplot(x, aes(x = filename, fill = variable)) +
geom_bar(stat="identity", ymin=0, aes(y=value, ymax=value), position="dodge") +
geom_text(aes(x=filename, y=value, ymax=value, label=value,
hjust=ifelse(sign(value)>0, 1, 0)),
position = position_dodge(width=1)) +
scale_y_continuous(labels = percent_format()) +
coord_flip()
This would be a good opportunity for you to start moving away from using qplot, in favor of ggplot. This will be much easier in the long run, trust me.
Here's a start:
library(scales)
ggplot(data = x,aes(x = factor(filename),y = value)) +
geom_bar(aes(fill = factor(variable)),colour = "black",position = 'dodge',stat = "identity") + # stat = "identity" is required in current ggplot2 when y is mapped
coord_flip() +
scale_fill_manual(name = '',
labels = c('low',
'Hi',
"Tot"),
values = c('#40E0D0',
'#FF6347',
"#C7C7C7")) +
scale_y_continuous(labels = percent_format())
For philosophical reasons, I will leave the annotation piece to you...
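For readers who do want the annotation piece, here is one possible sketch in current ggplot2 syntax; it is not the answerer's own solution. It rebuilds the question's values directly in long form (so reshape is not needed), and the names x_long and dodge are introduced here for illustration only. The key assumption is that bars and labels share one position_dodge() object so they are shifted identically.

```r
library(ggplot2)
library(scales)

# Same numbers as the question, built in long form; tot = hi + low.
low <- c(-.05, .06, .07, -.14)
hi  <- c(.87, .98, .56, .79)
x_long <- data.frame(
  filename = rep(paste0("file", 1:4), times = 3),
  variable = factor(rep(c("low", "hi", "tot"), each = 4),
                    levels = c("low", "hi", "tot")),
  value = c(low, hi, hi + low)
)

dodge <- position_dodge(width = 0.9)  # shared by bars and labels

p <- ggplot(x_long, aes(x = filename, y = value, fill = variable)) +
  geom_col(colour = "black", position = dodge) +
  geom_text(aes(label = percent(value, accuracy = 1),
                # push each label toward zero so it sits inside its bar
                hjust = ifelse(value > 0, 1.1, -0.1)),
            position = dodge) +
  scale_y_continuous(labels = percent_format()) +
  coord_flip()

bars <- layer_data(p, 1)
labels <- layer_data(p, 2)
print(p)
```

Very short bars (such as the "low" values here) may be too small to contain their labels, so the hjust offsets might need tuning for real data.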

Annotation above bars:

A dodged bar plot in ggplot again has me stumped. I asked about annotating text above bars on here a few weeks back (LINK) and got a terrific response to use + stat_bin(geom="text", aes(label=..count.., vjust=-1)). I figured that since I already have the counts, I could just supply them without the .. before and after, and I told stat_bin that the position was dodge. It lines the labels up over the center of the group and adjusts them up and down. It is probably something minor. Please help me get the text over the bars.
mtcars2 <- data.frame(type=factor(mtcars$cyl),
group=factor(mtcars$gear))
library(plyr); library(ggplot2)
dat <- rbind(ddply(mtcars2,.(type,group), summarise,
count = length(group)),c(8,4,NA))
p2 <- ggplot(dat,aes(x = type,y = count,fill = group)) +
geom_bar(colour = "black",position = "dodge",stat = "identity") +
stat_bin(geom="text", aes(position='dodge', label=count, vjust=-.6))
I was having trouble getting the position dodges to line up, so I ended up creating a position_dodge object (is that the right terminology?), saving it to a variable, and then using that as the position for both geoms. Somewhat infuriatingly, they still seem to be a little off centre.
dodgewidth <- position_dodge(width=0.9)
ggplot(dat,aes(x = type,y = count, fill = group)) +
geom_bar(colour = "black", position = dodgewidth ,stat = "identity") +
stat_bin(geom="text", position= dodgewidth, aes(x=type, label=count), vjust=-1)
Update: geom_bar() now needs stat = "identity".
I think this does what you want as well.
mtcars2 <- data.frame(type = factor(mtcars$cyl), group = factor(mtcars$gear))
library(plyr); library(ggplot2)
dat <- rbind(ddply(mtcars2, .(type, group), summarise, count = length(group)), c(8, 4, NA))
p2 <- ggplot(dat, aes(x = type,y = count,fill = group)) +
geom_bar(stat = "identity", colour = "black",position = "dodge", width = 0.8) +
ylim(0, 14) +
geom_text(aes(label = count, x = type, y = count), position = position_dodge(width = 0.8), vjust = -0.6)
p2
