I would like to make interactive graphs based on user input. However, I'm struggling to make more than one graph using R plotly. Suppose I have the following data and code:
dput(norwd5)
structure(list(LENGTH_OF_STAY = c(57L, 28L, 15L, 28L, 14L, 49L,
15L, 22L, 17L, 81L, 34L, 24L, 31L, 38L, 33L, 22L, 21L, 49L, 188L,
21L, 21L, 36L, 24L, 23L, 48L, 54L, 42L, 62L, 13L, 139L, 29L,
49L, 15L, 7L, 43L, 28L, 31L, 22L, 23L, 26L, 33L, 30L, 127L, 22L,
22L, 15L, 28L, 26L, 15L, 31L, 22L, 89L, 28L, 60L, 54L, 37L, 20L,
135L, 155L, 51L, 15L, 8L, 38L, 16L, 16L, 22L, 30L, 14L, 16L,
18L, 14L, 272L, 25L, 22L, 18L, 21L, 188L, 264L, 34L, 34L, 136L,
23L, 142L, 25L, 32L, 58L, 163L, 16L, 35L, 23L, 50L, 71L, 10L,
19L, 22L, 24L, 45L, 29L, 15L, 82L), PRE_OPERATIVE_LOS = c(2L,
2L, 3L, 1L, 3L, 6L, 3L, 7L, 2L, 2L, 11L, 2L, 6L, 3L, 6L, 3L,
5L, 3L, 179L, 2L, 5L, 3L, 4L, 2L, 5L, 6L, 2L, 4L, 2L, 6L, 3L,
2L, 2L, 6L, 6L, 1L, 4L, 5L, 6L, 5L, 0L, 4L, 6L, 2L, 4L, 4L, 7L,
4L, 4L, 6L, 2L, 4L, 3L, 3L, 2L, 6L, 4L, 110L, 63L, 6L, 4L, 7L,
5L, 1L, 6L, 1L, 4L, 2L, 6L, 3L, 2L, 8L, 2L, 2L, 4L, 3L, 6L, 171L,
5L, 4L, 116L, 6L, 47L, 3L, 7L, 3L, 60L, 1L, 3L, 20L, 31L, 49L,
9L, 8L, 3L, 4L, 35L, 7L, 4L, 9L), POST_OPERATIVE_LOS = c(55L,
26L, 12L, 27L, 11L, 43L, 12L, 15L, 15L, 79L, 23L, 22L, 25L, 35L,
27L, 19L, 16L, 46L, 9L, 19L, 16L, 33L, 20L, 21L, 43L, 48L, 40L,
58L, 11L, 133L, 26L, 47L, 13L, 1L, 37L, 27L, 27L, 17L, 17L, 21L,
33L, 26L, 121L, 20L, 18L, 11L, 21L, 22L, 11L, 25L, 20L, 85L,
25L, 57L, 52L, 31L, 16L, 25L, 92L, 45L, 11L, 1L, 33L, 15L, 10L,
21L, 26L, 12L, 10L, 15L, 12L, 264L, 23L, 20L, 14L, 18L, 182L,
93L, 29L, 30L, 20L, 17L, 95L, 22L, 25L, 55L, 103L, 15L, 32L,
3L, 19L, 22L, 1L, 11L, 19L, 20L, 10L, 22L, 11L, 73L), digoxin_any = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L,
2L, 1L, 2L), .Label = c("0:No", "1.Yes"), class = "factor")), row.names = c(NA,
-100L), class = c("data.table", "data.frame"), .internal.selfref = <pointer: 0x0000012f36b61ef0>)
num <- c('PRE_OPERATIVE_LOS','POST_OPERATIVE_LOS')
plist <- scan(text=num,what = "",quiet = T)
groups <- 'digoxin_any'
bygrp <- scan(text=groups,what="",quiet=T)
norwd5[, (bygrp) := lapply(.SD, as.factor), .SDcols = bygrp]
plotList = list()
for(i in length(plist)){
gplot <- ggplot(norwd5,aes_string(x=plist[i],group=bygrp,color=bygrp))+geom_histogram(aes(y=..density..),position = "dodge")+geom_density(alpha=.5) +theme(legend.position = "left")
plotList[[i]] <- plotly_build(gplot)
}
for(i in length(plist)){
print(plotList[[i]])
}
The goal is to show both graphs, one for PRE_OPERATIVE_LOS and one for POST_OPERATIVE_LOS. However, the code above only shows the histogram for POST_OPERATIVE_LOS.
I suspect subplot is the way to go, but how do I make subplot work in a loop? Any hints?
Thanks!
There is an error in your first loop, and printing each plot separately won't make both appear at the same time.
First, the issue with your first for call. When you wrote
for(i in length(plist))
length(plist) is 2, so you wrote for i in 2: the body runs exactly once, with i == 2, and plotList[[1]] is never filled. The print loop has the same problem. If you change it to a range of values, it reads for i in 1 to 2:
for(i in 1:length(plist))
Just so you're aware, if you had written for(i in plist), the loop would have run twice, but i would be each column-name string rather than a numeric index.
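The difference between the two loop headers can be checked without any plotting. Below is a minimal base-R sketch; the names visited_buggy and visited_fixed are made up for illustration, and seq_along(plist) is used as a common alternative to 1:length(plist).

```r
plist <- c("PRE_OPERATIVE_LOS", "POST_OPERATIVE_LOS")

# Loop header from the question: iterates over the single value length(plist)
visited_buggy <- c()
for (i in length(plist)) {
  visited_buggy <- c(visited_buggy, i)
}

# Corrected header: iterates over every index of plist
visited_fixed <- c()
for (i in seq_along(plist)) {
  visited_fixed <- c(visited_fixed, i)
}

print(visited_buggy)
print(visited_fixed)
```

seq_along() also behaves sensibly for an empty vector, where 1:length(plist) would count down from 1 to 0.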
Okay, so now there are two graphs. From the plotly library, you can use the function subplot. You will want to turn the legend off for one of them, though.
subplot(plotList[[1]],
style(plotList[[2]], showlegend = FALSE))
If you only wanted the outline colored, that's fine as it is. However, if you wanted the bars to be filled, you need to map fill instead of color.
If you change color = bygrp to fill = bygrp, the bars are filled by group rather than outlined by group.
If you leave the color assignment and add fill = bygrp (so you have both), both the outline and the fill are colored by group.
I can't find a solution for the following problem(s). I would greatly appreciate some help!
The following code produces bar charts using facets. However, because of the "extra space" ggplot2 has in some groups, it makes the bars in those groups much wider, even if I specify a width of 0.1 or similar. I find that very annoying since it makes the figure look very unprofessional. I want all the bars to look the same (except for the fill). I hope somebody can tell me how to fix this.
Secondly, how can I reorder the different classes in the facet windows so that the order is always C1, C2 ... C5, M, F, All where applicable. I tried it with ordering the levels of the factor, but since not all classes are present in every graph part it did not work, or at least I assume that was the reason.
Thirdly, how can I reduce the space between the bars? So that the whole graph is more compressed. Even if I make the image smaller for exporting, R will scale the bars smaller but the spaces between the bars are still huge.
I would appreciate feedback on any of these questions!
My Data:
http://pastebin.com/embed_iframe.php?i=kNVnmcR1
My Code:
library(dplyr)
library(gdata)
library(ggplot2)
library(directlabels)
library(scales)
all<-read.xls('all_auto_visual_c.xls')
all$station<-as.factor(all$station)
#all$group.new<-factor(all$group, levels=c('C. hyperboreus','C. glacialis','Special Calanus','M. longa','Pseudocalanus sp.','Copepoda'))
allp <- ggplot(data = all, aes(x=shortname2, y=perc_correct, group=group,fill=sample_size)) +
geom_bar(aes(fill=sample_size),stat="identity", position="dodge", width=0.1, colour="NA") + scale_fill_gradient("Sample size (n)",low="lightblue",high="navyblue")+
facet_wrap(group~station,ncol=2,scales="free_x")+
xlab("Species and stages") + ylab("Automatic identification and visual validation concur (%)") +
ggtitle("Visual validation of predictions") +
theme_bw() +
theme(plot.title = element_text(lineheight=.8, face="bold", size=20,vjust=1), axis.text.x = element_text(colour="grey20",size=12,angle=0,hjust=.5,vjust=.5,face="bold"), axis.text.y = element_text(colour="grey20",size=12,angle=0,hjust=1,vjust=0,face="bold"), axis.title.x = element_text(colour="grey20",size=15,angle=0,hjust=.5,vjust=0,face="bold"), axis.title.y = element_text(colour="grey20",size=15,angle=90,hjust=.5,vjust=1,face="bold"),legend.position="none", strip.text.x = element_text(size = 12, face="bold", colour = "black", angle = 0), strip.text.y = element_text(size = 12, face="bold", colour = "black"))
allp
#ggsave(allp, file="auto_visual_stackover.jpeg", height= 11, width= 8.5, dpi= 400,)
The current graph that needs some fixing:
Thanks a lot!
Here is what I did after a suggestion from Gregor. Using geom_segment and geom_point makes a nice graph, I think.
library(ggplot2)
library(gdata)  # provides read.xls()
all<-read.xls('all_auto_visual_c.xls')
all$station<-as.factor(all$station)
all$group.new<-factor(all$group, levels=c('C. hyperboreus','C. glacialis','Combined','M. longa','Pseudocalanus sp.','Copepoda'))
all$shortname2.new<-factor(all$shortname2, levels=c('All','F','M','C5','C4','C3','C2','C1','Micro', 'Oith','Tric','Cegg','Cnaup','C3&2','C2&1'))
allp<-ggplot(all, aes(x=perc_correct, y=shortname2.new)) +
geom_segment(aes(yend=shortname2.new), xend=0, colour="grey50") +
geom_point(size=4, aes(colour=sample_size)) +
scale_colour_gradient("Sample size (n)",low="lightblue",high="navyblue") +
geom_text(aes(label = perc_correct, hjust = -0.5)) +
theme_bw() +
theme(panel.grid.major.y = element_blank()) +
facet_grid(group.new~station,scales="free_y",space="free") +
xlab("Automatic identification and visual validation concur (%)") + ylab("Species and stages")+
ggtitle("Visual validation of predictions")+
theme_bw() +
theme(plot.title = element_text(lineheight=.8, face="bold", size=20,vjust=1), axis.text.x = element_text(colour="grey20",size=12,angle=0,hjust=.5,vjust=.5,face="bold"), axis.text.y = element_text(colour="grey20",size=12,angle=0,hjust=1,vjust=0,face="bold"), axis.title.x = element_text(colour="grey20",size=15,angle=0,hjust=.5,vjust=0,face="bold"), axis.title.y = element_text(colour="grey20",size=15,angle=90,hjust=.5,vjust=1,face="bold"),legend.position="none", strip.text.x = element_text(size = 12, face="bold", colour = "black", angle = 0), strip.text.y = element_text(size = 8, face="bold", colour = "black"))
allp
ggsave(filename="auto_visual_no_label.jpeg", plot=allp, height=11, width=8.5, dpi=400)
This is what it produces!
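The effect of the levels= argument on ordering can be checked without plotting. Here is a minimal base-R sketch, assuming a shortened level list and a made-up panel in which only some stages occur (stage_order and panel_stages are illustrative names, not part of the real data).

```r
# Desired order, shortened from the levels= vector used for shortname2.new
stage_order <- c("All", "F", "M", "C5", "C4", "C3", "C2", "C1")

# Hypothetical panel in which only some stages are present
panel_stages <- c("C1", "All", "C5", "F")

f <- factor(panel_stages, levels = stage_order)

# Sorting follows the level order, not the alphabet
sorted_stages <- as.character(sort(f))

# Dropping unused levels keeps the relative order of the remaining ones
remaining_levels <- levels(droplevels(f))

print(sorted_stages)
print(remaining_levels)
```

Levels that do not occur in a given panel are harmless: the stages that are present keep their relative order.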
Assuming the bar widths are inversely proportional to the number of x-breaks, an appropriate scaling factor can be supplied as a width aesthetic to control the width of the bars. First, count the number of x-breaks in each panel, calculate the scaling factor from those counts, and merge it back into the "all" data frame.
Update for ggplot2 2.0.0: each variable mentioned in facet_wrap gets its own line in the strip. In the edited code, a new label variable is set up in the data frame so that the strip label stays on one line.
library(ggplot2)
library(plyr)
all = structure(list(station = structure(c(2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Station 101",
"Station 126"), class = "factor"), shortname2 = structure(c(2L,
7L, 8L, 11L, 1L, 5L, 7L, 8L, 11L, 1L, 2L, 3L, 5L, 7L, 8L, 12L,
11L, 1L, 6L, 8L, 15L, 14L, 9L, 10L, 4L, 6L, 2L, 7L, 8L, 11L,
1L, 5L, 7L, 8L, 11L, 1L, 2L, 3L, 5L, 7L, 8L, 12L, 11L, 1L, 8L,
11L, 1L, 15L, 14L, 13L, 9L, 10L), .Label = c("All", "C1", "C2",
"C2&1", "C3", "C3&2", "C4", "C5", "Cegg", "Cnaup", "F", "M",
"Micro", "Oith", "Tric"), class = "factor"), color = c(1L, 2L,
3L, 4L, 5L, 6L, 7L, 8L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L,
18L, 19L, 21L, 26L, 30L, 31L, 33L, 34L, 20L, 21L, 1L, 2L, 3L,
4L, 5L, 6L, 7L, 8L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L,
19L, 26L, 28L, 29L, 30L, 31L, 32L, 33L, 34L), group = structure(c(1L,
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 6L, 5L, 3L, 3L, 3L, 3L, 6L, 6L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 3L, 3L,
3L, 3L, 3L), .Label = c("cgla", "Chyp", "Cope", "mlong", "pseudo",
"specC"), class = "factor"), sample_size = c(11L, 37L, 55L, 16L,
119L, 21L, 55L, 42L, 40L, 158L, 24L, 16L, 17L, 27L, 14L, 45L,
98L, 241L, 30L, 34L, 51L, 22L, 14L, 47L, 13L, 41L, 24L, 41L,
74L, 20L, 159L, 18L, 100L, 32L, 29L, 184L, 31L, 17L, 27L, 23L,
21L, 17L, 49L, 185L, 30L, 16L, 46L, 57L, 16L, 12L, 30L, 42L),
perc_correct = c(91L, 78L, 89L, 81L, 85L, 90L, 91L, 93L,
80L, 89L, 75L, 75L, 76L, 81L, 86L, 76L, 79L, 78L, 90L, 97L,
75L, 86L, 93L, 74L, 85L, 88L, 88L, 90L, 92L, 90L, 91L, 89L,
89L, 91L, 90L, 89L, 81L, 88L, 74L, 78L, 90L, 82L, 84L, 82L,
90L, 94L, 91L, 81L, 69L, 83L, 90L, 81L)), class = "data.frame", row.names = c(NA,
-52L))
all$station <- as.factor(all$station)
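# Calculate scaling factor and insert into data frame
# N$V1 is the number of rows (bars) in each station-by-group panel
N = ddply(all, .(station, group), function(x) length(row.names(x)))
# Fac is 1 for the fullest panel and proportionally smaller elsewhere
N$Fac = N$V1 / max(N$V1)
# Drop the count column (V1) and attach Fac to every row of "all"
all = merge(all, N[,-3], by = c("station", "group"))
# Single-column facet label so the strip text stays on one line
all$label = paste(all$group, all$station, sep = ", ")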
allp <- ggplot(data = all, aes(x=shortname2, y=perc_correct, group=group, fill=sample_size, width = .5*Fac)) +
geom_bar(stat="identity", position="dodge", colour="NA") +
scale_fill_gradient("Sample size (n)",low="lightblue",high="navyblue")+
facet_wrap(~label,ncol=2,scales="free_x") +
xlab("Species and stages") + ylab("Automatic identification and visual validation concur (%)") +
ggtitle("Visual validation of predictions") +
theme_bw() +
theme(plot.title = element_text(lineheight=.8, face="bold", size=20,vjust=1),
axis.text.x = element_text(colour="grey20",size=12,angle=0,hjust=.5,vjust=.5,face="bold"),
axis.text.y = element_text(colour="grey20",size=12,angle=0,hjust=1,vjust=0,face="bold"),
axis.title.x = element_text(colour="grey20",size=15,angle=0,hjust=.5,vjust=0,face="bold"),
axis.title.y = element_text(colour="grey20",size=15,angle=90,hjust=.5,vjust=1,face="bold"),
legend.position="none",
strip.text.x = element_text(size = 12, face="bold", colour = "black", angle = 0),
strip.text.y = element_text(size = 12, face="bold", colour = "black"))
allp
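The scaling-factor step can also be sketched without plyr. This is a minimal base-R version on a made-up data frame (toy and n_bars are illustrative names); it assumes one row per bar, as in the data above.

```r
# Made-up data: panel S1/a has four bars, panel S2/b has two
toy <- data.frame(
  station    = c("S1", "S1", "S1", "S1", "S2", "S2"),
  group      = c("a", "a", "a", "a", "b", "b"),
  shortname2 = c("C1", "C2", "C3", "C4", "C1", "C2")
)

# Number of rows (bars) in each station-by-group panel, repeated on every row
toy$n_bars <- ave(seq_along(toy$station), toy$station, toy$group, FUN = length)

# Scaling factor: 1 for the fullest panel, proportionally smaller elsewhere
toy$Fac <- toy$n_bars / max(toy$n_bars)

print(toy)
```

With free x scales each panel stretches its own bars, so multiplying the width by Fac shrinks the bars in sparse panels back toward the width seen in the fullest one.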
I'm trying to plot the graph below, and want to manually specify colours.
I need to plot by genotype, since there are multiple genotypes belonging to the same Bgrnd_All, and I want them to come up separately in the lines plotted.
However, I want to colour the lines by Bgrnd_All, and specifically in the order/colour I use in scale_fill_manual.
When I do this, the values in scale_fill_manual do not overwrite the existing colour as defined in geom_line. How can I do this?
I'd be grateful for pointers.
Data for graph below: https://www.dropbox.com/s/9nmu87wkh2yqfxn/summary_200_exp2.csv?dl=0
pd <- position_dodge(1)
ggplot(data=summary.200.exp2, aes(x=Time, y=Length, colour=Genotype, group=Genotype)) +
geom_errorbar(aes(ymin=Length - se, ymax=Length + se), colour="black", width=1, position=pd) +
geom_line(aes(colour=Bgrnd_All), position=pd, size =1) +
scale_x_continuous(breaks=c(0,17,22,41,89)) + #using breaks of when sampled
scale_fill_manual(values=c(Avalon="#000066",Av_A="#663399",Av_B="#339999",Cadenza="CC0033",Cad_A="FF6600",Cad_B="FF9933"))+
ylab("leaf segment width (mm)") +
xlab("Time") +
theme(axis.title = element_text(size=14,face="bold"),
axis.text = element_text(size=14),
strip.text.y = element_text(size=14))
A dput of the data:
summary.200.exp2 <- structure(list(X = 1:40,
Genotype = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L), .Label = c("4.18", "4.41", "7.50", "7.59", "8.51", "8.77", "Avalon", "Cadenza"), class = "factor"),
Time = c(0L, 17L, 22L, 41L, 89L, 0L, 17L, 22L, 41L, 89L, 0L, 17L, 22L, 41L, 89L, 0L, 17L, 22L, 41L, 89L, 0L, 17L, 22L, 41L, 89L, 0L, 17L, 22L, 41L, 89L, 0L, 17L, 22L, 41L, 89L, 0L, 17L, 22L, 41L, 89L),
Bgrnd_All = structure(c(4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 5L, 5L, 5L, 5L, 5L, 4L, 4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 6L, 6L, 6L, 6L, 6L), .Label = c("Av_A", "Av_B", "Avalon", "Cad_A", "Cad_B", "Cadenza"), class = "factor"),
N = c(43L, 48L, 44L, 47L, 48L, 22L, 21L, 26L, 27L, 25L, 36L, 24L, 44L, 48L, 45L, 50L, 26L, 52L, 54L, 53L, 38L, 52L, 52L, 49L, 50L, 39L, 39L, 42L, 38L, 42L, 84L, 42L, 84L, 42L, 42L, 50L, 26L, 53L, 27L, 27L),
Length = c(1.17423255813953, 1.58852083333333, 1.71263636363636, 1.86736170212766, 2.0331875, 1.07563636363636, 1.49866666666667, 1.48734615384615, 1.66796296296296, 2.15416, 1.08716666666667, 1.09858333333333, 1.24593181818182, 1.30827083333333, 1.81537777777778, 1.15672, 1.8475, 1.96815384615385, 2.01822222222222, 2.5057358490566, 1.14697368421053, 1.40276923076923, 1.49832692307692, 1.76981632653061, 2.27954, 1.18312820512821, 1.75928205128205, 1.86195238095238, 1.91426315789474, 2.26883333333333, 1.10839285714286, 1.97902380952381, 2.03271428571429, 2.15685714285714, 2.8227380952381, 1.08658, 1.68880769230769, 1.7277358490566, 1.9232962962963, 2.13466666666667),
sd = c(0.218740641945063, 0.357307960001092, 0.377931031662453, 0.416137123383518, 0.440003996899158, 0.176915784499843, 0.426273190962478, 0.305677731254037, 0.450036449932454, 0.48642939535627, 0.15212823538055, 0.175160775008132, 0.293836087650785, 0.282464815326021, 0.346608194369436, 0.211422397593258, 0.408328617659845, 0.413460118977535, 0.419730221832425, 0.508692484972064, 0.217587942685885, 0.207510416973071, 0.245473270071832, 0.377310585673427, 0.536134471785516, 0.159925670150259, 0.298319411009668, 0.338847829173593, 0.296186727462412, 0.445638589029855, 0.162594700328365, 0.308723610551514, 0.318831396748337, 0.381781291715339, 0.402059458017902, 0.167826451905484, 0.257140275994371, 0.338637947743116, 0.362428434825926, 0.343680867174096),
se = c(0.0333576351702583, 0.0515729617225566, 0.0569752467571038, 0.0606998379642952, 0.06350910651356, 0.0377185719899813, 0.0930204363959963, 0.0599483352513503, 0.0866095551712153, 0.097285879071254, 0.0253547058967583, 0.0357545434766975, 0.0442974569365289, 0.040770284291269, 0.0516692989445678, 0.0298996422065822, 0.0800798303617661, 0.0573366022820362, 0.0571180485063685, 0.0698742866122227, 0.0352974252834232, 0.0287765172534354, 0.0340410177692235, 0.053901512239061, 0.0758208641254813, 0.0256086023072023, 0.0477693365291991, 0.052285355168868, 0.0480478318490224, 0.0687635271596866, 0.0177405362346046, 0.0476370873204908, 0.0347873573697084, 0.0589101322645314, 0.0620391212561054, 0.0237342444409691, 0.0504293571163821, 0.046515499476421, 0.0697493848029077, 0.0661414137260961),
ci = c(0.0673184331863912, 0.103751416510302, 0.114901535684132, 0.122182436693452, 0.127763842564108, 0.0784400645137227, 0.194037230170767, 0.123465907623535, 0.178028490322197, 0.200788185881879, 0.0514727894594648, 0.0739639084701291, 0.0893343358495282, 0.0820192326650262, 0.104132629687123, 0.0600855805773719, 0.164927497928001, 0.11510803218647, 0.11456429705202, 0.140213013986381, 0.0715193770736051, 0.0577712690042106, 0.0683401947985261, 0.108376253996364, 0.152367731004308, 0.0518419050566429, 0.0967039660836575, 0.105592416917608, 0.0973541547573791, 0.138870760371045, 0.0352852130493688, 0.0962050495562246, 0.06919065466693, 0.118971425682342, 0.125290547146885, 0.0476957499005439, 0.103861205171753, 0.0933401784102089, 0.143371913789607, 0.135955623027448)),
.Names = c("X", "Genotype", "Time", "Bgrnd_All", "N", "Length", "sd", "se", "ci"), class = "data.frame", row.names = c(NA, -40L))
As stated by @juba in the comments, you should use scale_colour_manual instead of scale_fill_manual. Moreover, you are trying to plot too many lines and errorbars in one plot. They overlap each other too much, so it is hard to distinguish between the lines/errorbars.
An example with the use of faceting (and some simplification of your code):
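A related detail in the question's code: three of the hex codes (CC0033, FF6600, FF9933) are written without the leading #, and R does not accept them as colours in that form. A small base-R sketch that checks each value before it is handed to a manual scale (the helper name is_valid_colour is made up for illustration):

```r
# Hypothetical helper: TRUE if R can interpret the string as a colour
is_valid_colour <- function(x) {
  vapply(x, function(col) {
    tryCatch({
      grDevices::col2rgb(col)
      TRUE
    }, error = function(e) FALSE)
  }, logical(1))
}

# Values as written in the question's scale_fill_manual() call
question_cols <- c(Avalon = "#000066", Av_A = "#663399", Av_B = "#339999",
                   Cadenza = "CC0033", Cad_A = "FF6600", Cad_B = "FF9933")

valid <- is_valid_colour(question_cols)
bad_names <- names(question_cols)[!valid]

# Prefix the missing "#" where needed
fixed_cols <- ifelse(valid, question_cols, paste0("#", question_cols))
names(fixed_cols) <- names(question_cols)

print(bad_names)
print(fixed_cols)
```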
ggplot(summary.200.exp2, aes(x=Time, y=Length, group=Genotype)) +
geom_line(aes(colour=Bgrnd_All), size =1) +
geom_errorbar(aes(ymin=Length-se, ymax=Length+se, colour=Bgrnd_All), width=2) +
scale_x_continuous("Time", breaks=c(0,17,22,41,89)) +
scale_colour_manual(values=c(Avalon="#000066",Av_A="#663399",Av_B="#339999",Cadenza="#CC0033",Cad_A="#FF6600",Cad_B="#FF9933"))+
ylab("leaf segment width (mm)") +
theme_bw() +
theme(axis.title = element_text(size=14,face="bold"), axis.text = element_text(size=10)) +
facet_wrap(~Bgrnd_All, ncol=3)
this gives:
require(ggplot2)
The data: It's shark incidents grouped by shark species. It's actually a real dataset, already summarized.
D <- structure(list(FL_FATAL = structure(c(2L, 2L, 2L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 2L, 2L), .Label = c("FATAL",
"NO FATAL"), class = "factor"), spec = structure(c(26L, 24L,
6L, 26L, 25L, 16L, 2L, 11L, 27L, 5L, 24L, 29L, 12L, 21L, 13L,
15L, 28L, 1L, 17L, 19L, 8L, 3L, 6L, 13L, 22L, 18L, 27L, 14L,
23L, 20L, 7L, 4L, 8L, 9L, 10L), .Label = c("blacknose", "blacktip",
"blue", "bonnethead", "bronze", "bull", "caribbean", "draughtsboard",
"dusky", "galapagos", "ganges", "hammerhead", "involve", "leon",
"mako", "nurse", "porbeagle", "recovered", "reef", "sand", "sandtiger",
"sevengill", "spinner", "tiger", "unconfired", "white", "whitespotted",
"whitetip", "wobbegong"), class = "factor"), N = c(368L, 169L,
120L, 107L, 78L, 77L, 68L, 59L, 56L, 53L, 46L, 42L, 35L, 35L,
33L, 30L, 29L, 29L, 26L, 25L, 25L, 25L, 24L, 24L, 21L, 21L, 20L,
20L, 17L, 16L, 16L, 15L, 11L, 11L, 11L)), .Names = c("FL_FATAL",
"spec", "N"), row.names = c(NA, -35L), class = "data.frame")
head(D)
# FL_FATAL spec N Especies
# 1 NO FATAL white 368 white
# 2 NO FATAL tiger 169 tiger
# 3 NO FATAL bull 120 bull
# 4 FATAL white 107 white
# 5 NO FATAL unconfired 78 unconfired
# 6 NO FATAL nurse 77 nurse
Reordering a factor variable by a numeric making a new variable.
# Re-order spec creating Especies variable ordered by D$N
D$Especies <- factor(D$spec, levels = unique(D[order(D$N), "spec"]))
# These two plots work as expected
ggplot(D, aes(x=N, y=Especies)) +
geom_point(aes(size = N, color = FL_FATAL))
ggplot(D, aes(x=N, y=Especies)) +
geom_point(aes(size = N, color = FL_FATAL)) +
facet_grid(. ~ FL_FATAL)
Reordering using reorder()
# Using reorder isn't working or am i missing something?
ggplot(D, aes(x=N, y=reorder(D$spec, D$N))) +
geom_point(aes(size = N, color = FL_FATAL))
# adding facets makes it worse
ggplot(D, aes(x=N, y=reorder(D$spec, D$N))) +
geom_point(aes(size = N, color = FL_FATAL)) +
facet_grid(. ~ FL_FATAL)
Which would be the correct approach for producing the plots with reorder()?
The problem is that by using D$ in your reorder call, you're reordering spec independent of the data frame, so the values no longer match up with the corresponding x values. You need to use it directly on the variables:
ggplot(D, aes(x=N, y=reorder(spec, N, sum))) +
geom_point(aes(size = N, color = FL_FATAL)) +
facet_grid(. ~ FL_FATAL)
I'm surprised you like your first way; it's a happy coincidence that it worked out. Most of your species have a single N value (NO FATAL only), but a few have both FATAL and NO FATAL rows. Whenever more than one numeric value corresponds to a factor level, reorder applies a summary function to those values to decide the final order. The default function is mean, but you probably want sum, to sort by the total number of incidents.
D$spec_order <- reorder(D$spec, D$N, sum)
ggplot(D, aes(x=N, y=spec_order)) +
geom_point(aes(size = N, color = FL_FATAL))
ggplot(D, aes(x=N, y=spec_order)) +
geom_point(aes(size = N, color = FL_FATAL)) +
facet_grid(. ~ FL_FATAL)
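To see how the summary function changes the result, here is a minimal sketch with made-up counts (the toy data frame and its numbers are illustrative, not taken from D). One species has two rows, so mean and sum rank it differently.

```r
# Made-up counts: "white" appears twice, the others once
toy <- data.frame(
  spec = factor(c("white", "white", "tiger", "bull")),
  N    = c(60, 10, 50, 40)
)

# Default summary (mean): white = 35, bull = 40, tiger = 50
by_mean <- levels(reorder(toy$spec, toy$N))

# Summing instead: bull = 40, tiger = 50, white = 70
by_sum <- levels(reorder(toy$spec, toy$N, sum))

print(by_mean)
print(by_sum)
```

Levels are sorted in increasing order of the summary value, so on a discrete y axis the largest total ends up at the top.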
I'm having some trouble producing a faceted bar_plot in ggplot2. Perhaps it is something very obvious, but I can't figure it out:( I've the following dataset:
structure(list(COUNTRY = structure(c(1L, 4L, 7L, 10L, 13L, 16L,
19L, 2L, 5L, 8L, 11L, 14L, 17L, 20L, 3L, 6L, 9L, 12L, 15L, 18L,
2L, 5L, 8L, 11L, 14L, 17L, 20L, 3L, 6L, 9L, 12L, 15L, 18L, 1L,
4L, 7L, 10L, 13L, 16L, 19L, 3L, 6L, 9L, 12L, 15L, 18L, 1L, 4L,
7L, 10L, 13L, 16L, 19L, 2L, 5L, 8L, 11L, 14L, 17L, 20L), .Label = c("Angola",
"Botswana", "Burundi", "Comoros", "Eritrea", "Ethiopia", "Kenya",
"Lesotho", "Madagascar", "Malawi", "Mozambique", "Namibia", "Rwanda",
"Somalia", "South Africa", "Swaziland", "Tanzania", "Uganda",
"Zambia", "Zimbabwe"), class = "factor"), Year = structure(c(2L,
2L, 14L, 16L, 16L, 11L, 12L, 2L, 4L, 15L, 5L, 10L, 16L, 16L,
2L, 17L, 14L, 11L, 12L, 10L, 2L, 4L, 15L, 5L, 10L, 16L, 16L,
2L, 17L, 14L, 11L, 12L, 10L, 2L, 2L, 14L, 16L, 16L, 11L, 12L,
2L, 17L, 14L, 11L, 12L, 10L, 2L, 2L, 14L, 16L, 16L, 11L, 12L,
2L, 4L, 15L, 5L, 10L, 16L, 16L), .Label = c("1998", "2000", "2001/2",
"2002", "2003", "2003/4", "2004", "2005", "2005/6", "2006", "2006/7",
"2007", "2007/8", "2008/9", "2009", "2010", "2011"), class = "factor"),
sex = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L,
3L, 3L, 3L, 3L, 3L), .Label = c("m", "f", "b"), class = "factor"),
location = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L), .Label = c("Urban", "Rural", "Total",
"Capital.City", "Other.Cities.towns", "Urban.Non.slum", "Urban.Slum"
), class = "factor"), percent = c(60.4, 42.3, 85.4919452426806,
96.3, 90.2847535659154, 87.7347421555771, 87.7323067592087,
80.4, 80.6, 93.8186266493188, 75.0109418832216, 36.8, 87.1059275774722,
90.1216932603937, 66.8, 83.6279398931798, 89.690685909038,
88.8207941092749, 94.6139558774441, 88.0251085200726, 70.4,
54.7, 86.1919805548309, 56.9792710715853, 13.1, 75.6355555697382,
86.8196674671991, 42.5, 61.9452522893308, 77.597285694676,
88.3453320625631, 94.5192341778471, 80.6271302923487, 44.1,
29, 77.8542469357068, 90, 86.7073851186482, 83.8921034867784,
76.4094871587916, 49.3, 63.952805392032, 77.004884485532,
88.6723566877386, 93.9560433940531, 82.3095948307742, 56.1,
31.1, 80.0235653889704, 91.5, 88.3809682134183, 85.5656196766576,
80.0539027063387, 77, 61.2, 89.2538966046165, 59.6756344409838,
23, 79.6749544074645, 86.9507859695728)), .Names = c("COUNTRY",
"Year", "sex", "location", "percent"), row.names = c(1L, 4L,
7L, 10L, 13L, 16L, 19L, 22L, 25L, 28L, 31L, 34L, 37L, 40L, 43L,
46L, 49L, 52L, 55L, 58L, 62L, 65L, 68L, 71L, 74L, 77L, 80L, 83L,
86L, 89L, 92L, 95L, 98L, 101L, 104L, 107L, 110L, 113L, 116L,
119L, 123L, 126L, 129L, 132L, 135L, 138L, 141L, 144L, 147L, 150L,
153L, 156L, 159L, 162L, 165L, 168L, 171L, 174L, 177L, 180L), class = "data.frame")
I am trying to make a bar_plot which shows the percentage of people living in rural, urban areas (and the average) for a number of countries, and wish to show this split by gender. I can plot one of these categories on a simple bar plot by using a subset call within the ggplot function as follows:
ggplot(edu_melt[c(edu_melt$sex!="b" & edu_melt$location==c("Urban")), ], aes(x=COUNTRY, y=percent, fill=sex)) + geom_bar(position="dodge", width=0.5) + facet_grid(~location) + labs(x="Country") + theme(axis.text.x = element_text(angle=30, hjust=1, vjust=1))
I would however like to compare the data across the location (e.g. urban, rural, and both). I thought this would be a simple case of introducing a facet_wrap call, however I get some odd behaviour where the data is plotted across the three facets - I would expect 20 pairs of bars on each facet, however this code produces 20 pairs of bars spread over the three facets?!
ggplot(edu_melt_over[c(edu_melt_over$sex!="b"),], aes(x=COUNTRY, y=percent, fill=sex)) + geom_bar(position="dodge", width=0.5, space=1) + facet_wrap(~location, nrow=3) + labs(x="Country", title="Proportion Net Primary School Enrolement in ESA") + theme(axis.text.x = element_text(angle=30, hjust=1, vjust=1))
I'm not sure why this is happening, but have searched for hints and tips and tried a number of approaches, but get the same result. Anybody have any idea how I could produce this plot?
Thanks
Marty
Your data looks odd: you don't seem to have any combinations of male and female in the same stratum (e.g. Angola has a male urban percent but no female urban percent). This comes from the data, not from the plotting.
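ggplot(edu_melt[edu_melt$sex!="b", ], aes(x=COUNTRY, y=percent, fill=sex)) +
# stat="identity" plots the percent values as given; current ggplot2 needs it when y is mapped
geom_bar(stat="identity", position="dodge", width=0.25) + facet_grid(location~.) + labs(x="Country") +
theme(axis.text.x = element_text(angle=30))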
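One way to check the point about the strata is to count how many sexes occur in each country-by-location cell. This is a minimal base-R sketch on six rows copied from the question's dput (Angola and Botswana only); the names few_rows, mf and n_sexes are made up for illustration.

```r
# Six rows taken from the question's data (Angola and Botswana)
few_rows <- data.frame(
  COUNTRY  = c("Angola", "Angola", "Angola", "Botswana", "Botswana", "Botswana"),
  sex      = c("m", "b", "f", "f", "m", "b"),
  location = c("Urban", "Rural", "Total", "Urban", "Rural", "Total"),
  percent  = c(60.4, 44.1, 56.1, 80.4, 70.4, 77)
)

# Same filter as in the plotting call: drop the combined "b" rows
mf <- few_rows[few_rows$sex != "b", ]

# Number of distinct sexes per country-by-location cell (NA = no rows at all)
n_sexes <- tapply(mf$sex, list(mf$COUNTRY, mf$location),
                  function(s) length(unique(s)))

print(n_sexes)
```

No cell reaches 2, so there is never a male/female pair to dodge within one facet; the same check can be run on the full edu_melt data frame.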