I'm hoping to have a legend that includes references to all colours, not just the vertical lines, and does not include a title.
I've tried scale_colour_manual and scale_fill_manual and they all either overlap or only show the vertical lines. I would appreciate any suggestions.
Reprex is below, including the custom colour palette.
var1 <- c(head(randu$x,n=12))
var2 <- as.Date(c("2010-01-01","2010-02-01","2010-03-01","2010-04-01","2010-05-01","2010-06-01","2010-07-01","2010-08-01","2010-09-01","2010-10-01","2010-11-01","2010-12-01"))
var3 <- c(tail(randu[which(randu$x + randu$y < 1),]$x,n=12))
var4 <- c(tail(randu[which(randu$x + randu$y < 1),]$y,n=12))
dat <- data.frame(var1,var2,var3,var4)
setDT(dat)
dat$var5 <- dat[,(var3+var4)]
new_dates <- as.Date(c("2010-09-01","2010-05-01"))
cbp2 <- c("#000000", "#56B4E9", "#009E73", "#0072B2", "#D55E00", "#CC79A7")
ggplot()+
geom_bar(data=dat,colour=cbp2[1],fill = cbp2[1],aes(x=var2,y=var5,colour="var4"),stat="identity")+
geom_bar(data=dat,colour=cbp2[2],fill = cbp2[2],aes(x=var2,y=var3,colour="var3"),stat="identity")+
geom_line(data=dat,colour=cbp2[1],aes(x=var2,y=var1))+
geom_vline(data=data.frame(xintercept = new_dates),
aes(xintercept = new_dates,linetype = "Changes", colour="red"),
linetype="dashed",key_glyph = "path")+
scale_color_manual(name = "",
values = c("red",cbp2[2],cbp2[1]),
breaks = c("red",cbp2[2],cbp2[1]),
labels = c("Changes","Var3","Var4"))+
scale_fill_manual(name = "",
values = c(cbp2[2],cbp2[1]),
breaks = c(cbp2[2],cbp2[1]),
labels = c("var3","var4"))+
ylab("")+
xlab("")+
scale_x_date(expand=c(0,0),date_breaks = "3 month", date_labels = "%b %y") +
scale_y_continuous(labels = function(var5) paste0(var5*100, "%"),
limits=c(0,1),
breaks=c(0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1)) +
theme(panel.background = element_blank(),
axis.line = element_line(colour = "#000000"),
axis.text.x = element_text(angle=60, hjust=1),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.title.x= (element_text(margin = unit(c(3, 0, 0, 0), "mm"))),
legend.position = "top")
There's quite a lot to unpack here with this one, but I gave it my best shot.
First of all, consider what you are trying to plot here. Normally, it's not a problem to call things var1, var2, var3,...; however, in this context it's really quite confusing. Consequently, for this solution, I will be re-posting your entire code reworked instead of just the plotting portion for reasons I hope to outline in this answer.
The Data and the Question
With all that being said, here is my understanding about the nature of the dataset and your desire for the final plot:
var2 in the dataset contains Date class information, and this is the common x axis for the entire plot.
var1 contains values that are to be used for the y values of the geom_line plot layer
var3 and var4 contain values that are to be used for creation of the stacked barplot which should make up the background of the plot
var5 is a sum of var3 + var4, and was a device to create the plot. Herein, it will not be useful, given the data analysis we are to do on the dataset and the application of Tidy Data principles.
xintercept Values for the geom_vline plot layer are supplied as the two dates new_dates
The OP's question indicates a need for the Legend to be displayed correctly. In this case, we want to indicate:
fill color of the bars as var3 and var4
the nature of the vertical lines as dashed red lines, called "Changes"
A label for the geom_line plot layer. Assume the label will be var1.
Hope all that was correct!
Synthesizing the Dataset
I encourage the OP to read up on Tidy Data principles, which will make synthesis of data such as this much more straightforward in the future. Herein, I will apply these principles to the dataset dat.
First of all, let's handle the bar layer data. Applying Tidy Data principles, we would want to gather together var3 and var4 and create out of them two columns: (1) one for the name of the variable ("var3" or "var4"), and (2) one for the value. We will be telling ggplot2 to "stack" bars, so var5 is not needed here: ggplot2 will do that calculation automatically. To gather the columns together, my preference is to use gather() from tidyr (loaded here alongside dplyr, which supplies the %>% pipe):
library(dplyr)
library(tidyr)
library(ggplot2)
library(data.table)
var1 <- c(head(randu$x,n=12))
var2 <- as.Date(c("2010-01-01","2010-02-01","2010-03-01","2010-04-01","2010-05-01","2010-06-01","2010-07-01","2010-08-01","2010-09-01","2010-10-01","2010-11-01","2010-12-01"))
var3 <- c(tail(randu[which(randu$x + randu$y < 1),]$x,n=12))
var4 <- c(tail(randu[which(randu$x + randu$y < 1),]$y,n=12))
dat <- data.frame(var1,var2,var3,var4)
setDT(dat)
# dat$var5 <- dat[,(var3+var4)] no longer needed
new_dates <- as.Date(c("2010-09-01","2010-05-01"))
cbp2 <- c("#000000", "#56B4E9", "#009E73", "#0072B2", "#D55E00", "#CC79A7")
newdat <- dat %>%
gather(key='var_name', value='value', -var2) # gather all columns except for var2
names(newdat) <- c('Dates', 'var_name', 'value')
newdat$var_name <- factor(newdat$var_name, levels=c('var4', 'var3','var1'))
In addition to gathering the columns together, you will also note that I'm adjusting the names of the columns to make them a bit easier to follow when it comes to plotting. Additionally, I'm setting the order of the levels for newdat$var_name. The purpose here is that the order we specify determines the ordering used to create the plot. I want var3 to appear as a bar "under" var4, so we need to specify that var4 is first.
You could also create a separate dataset containing var2 and var1 to use for plotting the geom_line layer... but this also works fine.
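As a minimal sketch of the same reshaping step, here is pivot_longer(), which tidyr now recommends over gather(). The toy data frame and its values are made up for illustration; only the column names follow the answer above.

```r
library(tidyr)

# Hypothetical two-month slice with the same column roles as the answer's data
toy <- data.frame(
  Dates = as.Date(c("2010-01-01", "2010-02-01")),
  var3  = c(0.2, 0.3),
  var4  = c(0.5, 0.1)
)

# One row per (date, variable) pair instead of one column per variable
long <- pivot_longer(toy, cols = c(var3, var4),
                     names_to = "var_name", values_to = "value")

# Level order controls stacking order, exactly as with gather()
long$var_name <- factor(long$var_name, levels = c("var4", "var3"))
```

The result has the same shape gather() would produce, so the plotting code does not need to change.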
The Plot
For the plot, I've tried to organize the code into separate sections. What OP was trying to do was to plot column-by-column, rather than using aes(fill= and aes(color= to set and create legends. In addition, the OP's original code had numerous examples of the following:
geom_*(aes(color=...), color=...)
The result of this in ggplot2 is that if you set an aesthetic value (like color=) outside of aes() while also stating this argument inside aes(), the value on the outside will overwrite the value specified inside the mapping--effectively removing any call to place that within a legend. This was the biggest cause for issue in the OP's example, and why certain items were the "right" color, but did not appear in any legend.
Specifying arguments in aes() only indicates that a legend should be created and tells ggplot2 on what basis to apply color, fill, linetype... it does not actually specify the color. Color should be specified using the scale_*_*() functions. In this case, we have 3 legend types created. The OP can organize them however they wish, but I tried to keep this example illustrative enough to adapt, since it is still not entirely clear exactly how the legend should look.
Note that values= is used to apply the color, linetype, or fill aesthetic, and is done by feeding that argument a named vector. You can also use a non-named vector, in which case the attributes will be applied according to the ordering of the levels for that factor.
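To make the mapping-versus-setting distinction concrete, here is a small sketch on made-up data (the data frame toy and the label "series A" are hypothetical). The first plot only maps colour, so a legend entry exists and the scale decides the actual colour; the second also sets colour outside aes(), which discards the mapping.

```r
library(ggplot2)

toy <- data.frame(x = 1:3, y = c(2, 4, 3))

# Mapped only: "series A" becomes a legend entry; the scale supplies the colour.
# name = NULL drops the legend title.
p_mapped <- ggplot(toy, aes(x, y)) +
  geom_line(aes(colour = "series A")) +
  scale_colour_manual(name = NULL, values = c("series A" = "blue"))

# Mapped and set: the fixed colour = "red" wins, so no colour legend is drawn.
p_set <- ggplot(toy, aes(x, y)) +
  geom_line(aes(colour = "series A"), colour = "red")

# layer_data() shows the colour each layer actually ends up with
mapped_colour <- layer_data(p_mapped, 1)$colour
set_colour    <- layer_data(p_set, 1)$colour
```

Printing p_mapped and p_set side by side should show the legend appear in the first and vanish in the second.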
Note that I changed the line color of the geom_line to blue... just so that it stands out a bit. It would be a bit confusing otherwise, since there is a fill color that is also black.
ggplot(newdat, aes(x=Dates, y=value)) +
# plot layers
geom_col(
data=subset(newdat, var_name != 'var1'),
aes(fill=var_name), position='stack') +
geom_line(
data=subset(newdat, var_name == 'var1'),
aes(color=var_name)
) +
geom_vline(data=data.frame(xintercept = new_dates),
aes(xintercept = xintercept, linetype = "Changes"), colour="red",
key_glyph = "path")+
# color and legend settings
scale_fill_manual(
name="Fill",
values=c('var3'=cbp2[2], 'var4'=cbp2[1])) +
scale_color_manual(
name='Color',
values = 'blue') +
scale_linetype_manual(
name='Linetype',
values=2) +
# scale adjustment and theme stuff
scale_y_continuous(labels = function(var5) paste0(var5*100, "%"),
limits=c(0,1),
breaks=c(0,0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1)) +
theme(panel.background = element_blank(),
axis.line = element_line(colour = "#000000"),
axis.text.x = element_text(angle=60, hjust=1),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank(),
axis.title.x= (element_text(margin = unit(c(3, 0, 0, 0), "mm"))),
legend.position = "top")
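The original question also asked for a legend without a title. One way to get that, sketched here on made-up data, is to pass name = NULL to each scale instead of a title string. The data frames toy and marks are hypothetical stand-ins for newdat and new_dates.

```r
library(ggplot2)

toy <- data.frame(
  Dates    = as.Date(c("2010-01-01", "2010-02-01", "2010-01-01", "2010-02-01")),
  var_name = factor(c("var3", "var3", "var4", "var4"), levels = c("var4", "var3")),
  value    = c(0.2, 0.3, 0.5, 0.1)
)
marks <- data.frame(xintercept = as.Date("2010-02-01"))

p <- ggplot(toy, aes(x = Dates, y = value)) +
  geom_col(aes(fill = var_name), position = "stack") +
  geom_vline(data = marks,
             aes(xintercept = xintercept, linetype = "Changes"),
             colour = "red", key_glyph = "path") +
  # name = NULL omits the legend title for that scale
  scale_fill_manual(name = NULL, values = c(var3 = "#56B4E9", var4 = "#000000")) +
  scale_linetype_manual(name = NULL, values = c(Changes = "dashed")) +
  theme(legend.position = "top")

bars  <- layer_data(p, 1)  # stacked bar geometry
lines <- layer_data(p, 2)  # vertical line geometry
```

An alternative with the same visual effect is theme(legend.title = element_blank()), which removes every legend title at once.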
Here is the code I have:
library("ggplot2") # Loading the required packages
library("ggpubr")
rm(list=ls())
TeamsHalf <- read.table(file= "../data/TeamsHalf.txt", sep = ";", header = T) # Loading the data table
index = TeamsHalf$Half == 2 #Making a column with each teams' rank at the end of the season
TeamsHalf$FinalRank[index] <- TeamsHalf$Rank[index]
TeamsHalf$divID <- as.factor(TeamsHalf$divID) # changing the order that facet_wrap orders the divisions, so that "East" is on the right-hand side of the plot
TeamsHalf$divID = names(TeamsHalf$divID) = c("Western ", "Eastern")
levels(TeamsHalf$divID) <- c("Eastern ","Western")
TeamsHalf$Half <- as.factor(TeamsHalf$Half) # Changing the order of the stacked bars so that the 1st half of the season is below 2nd half.
levels(TeamsHalf$Half) <- c("2","1")
g.TeamsHalfNL <- ggplot(data = TeamsHalf[TeamsHalf$lgID == "NL",], # Extracting data for the National League
aes (y = W, x = reorder(teamID, -FinalRank, na.rm = T), fill = factor(Half)))+ # setting the variables, reordering x by rank at the end of the season, and setting the colors for the two halves.
geom_col(width = 0.8, position = "stack")+ # Setting the width of cols, and stacking them on top of each other.
scale_colour_manual(values = c( "#002a5b", "#ffc78a"), aesthetics = c("color", "fill"))+ # changing the colors of the columns.
labs(x = "National League", y = "Wins", fill = "Season Half")+ # changing x and y labels, and the legend title.
theme(
panel.background = element_rect(fill = NA), # setting a white background
panel.grid.major.y = element_line(colour = "grey50", size = 1/4), # Changing the size and color of the horizontal grid lines
panel.grid.major.x = element_blank())+ # removing the vertical gridlines
facet_wrap(.~divID, scales = "free") # splitting the plot into two graphs, one for each league
g.TeamsHalfAL <- ggplot(data = TeamsHalf[TeamsHalf$lgID == "AL",], # Extracting data for the American League
aes (y = W , x = reorder(teamID, -FinalRank, na.rm = T), fill = factor(Half)))+ # setting the variables, reordering x by rank at the end of the season, and setting the colors for the two halves.
geom_col(width = 0.8, position = "stack")+ # Setting the width of cols, and stacking them on top of each other.
scale_colour_manual(values = c( "#002a5b", "#ffc78a"), aesthetics = c("color", "fill"))+ # changing the colors of the columns.
labs(x = "American League", y="Wins")+ # changing x and y labels, and the legend title.
theme(
panel.background = element_rect(fill = NA), # setting a white background
panel.grid.major.y = element_line(colour = "grey50", size = 1/4), # Changing the size and color of the horizontal grid lines
panel.grid.major.x = element_blank())+ # removing the vertical gridlines
facet_wrap(.~divID, scales = "free") # splitting the plot into two graphs, one for each league
FinalGraph <- ggarrange(g.TeamsHalfAL, g.TeamsHalfNL, ncol = 1, nrow = 2, # combining the two graphs, stacking them vertically
common.legend = T, legend = "bottom", align = "hv") # creating a shared legend, moving the legend, and aligning the graphs.
annotate_figure(FinalGraph, top = text_grob("\nOverview of the 1981 baseball season for AL and NL", face = "italic"), # Here, I set the text for title, note, and figure nr. the Title is indented with a "\n" to adhere to the APA style.
bottom = text_grob("Note. The figure shows the number of wins for each team in the Western\n and Eastern division of the AL and NL. The teams are sorted by their final rank\n at the end of the season.", size = 8),
fig.lab = "Figure 1")
I want the labels of the facet_wrap to say "Eastern" and "Western" instead of E and W, so I use the names() function to change them in line 10. However, when I do this, the facet_wrap makes two plots, but with all the same teams, instead of splitting them into western and Eastern. When I don't change the name of the variables, and they remain named "E" and "W", it works perfectly fine. Does anyone know what the issue is here?
TL;DR - don't use names() for changing the factor labels.
Instead, change the names of the facets in your plot within the plot code, as in:
... + facet_wrap(~divID, labeller = as_labeller(c("E" = "Eastern", "W" = "Western")))
The line in your code where you use names() is not going to work:
TeamsHalf$divID = names(TeamsHalf$divID) = c("Western ", "Eastern")
First of all, the chained assignment does not do what you intend: it ends up assigning the two-element vector itself to the whole divID column. More importantly, the names() function is not used for factor levels or labels; it's used for changing the names of the "internal parts" of data structures. For example, names() can be used to rename columns in a data frame or items in a list.
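A small base-R sketch of why that line scrambles the facets. The toy data frame and team IDs are hypothetical, and the recycling shown assumes the number of rows is a multiple of two (otherwise R raises an error instead).

```r
# names() attaches a name to each element; the factor's levels are untouched
div <- factor(c("E", "W", "E", "W"))
names(div) <- c("a", "b", "c", "d")
div_levels <- levels(div)          # still "E" and "W"

# Assigning a length-2 vector to a 4-row column recycles it row by row,
# so the new labels no longer follow the original divisions
toy <- data.frame(teamID = c("T1", "T2", "T3", "T4"),
                  divID  = factor(c("E", "E", "W", "W")))
toy$divID <- c("Western", "Eastern")

# Relabelling through the factor itself keeps each team in its division
relabelled <- factor(c("E", "E", "W", "W"),
                     levels = c("E", "W"),
                     labels = c("Eastern", "Western"))
```

In toy, T1 and T2 both started in "E" but now carry different labels, which is consistent with each facet showing teams from both divisions.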
The crux of your question is related to how to change the labels of the facets when using facet_wrap() or facet_grid(). Your example is not reproducible, so I'll use the built-in iris dataset to show you two fundamental approaches to do this.
The plot:
library(ggplot2)
p <- ggplot(iris, aes(x=Sepal.Width, y=Sepal.Length)) + geom_point()
p + facet_wrap(~Species)
Use a Labeller Function
The most direct way to relabel facets is to use the labeller= argument of facet_wrap(...) or facet_grid(...). Here I use a named vector with the as_labeller() function to make sure the label names are applied correctly. Because each label is matched to its level by name, the order in which you list the labels does not matter, and the order of the facets is preserved.
p + facet_wrap(~Species,
labeller = as_labeller(c(
"setosa" = "First Facet",
"virginica" = "Third Facet",
"versicolor" = "Second Facet"
)))
Change the names of the factor
The above method is more direct, but what you were attempting is similar to another approach that also should work. The caution here is that when using levels(...) you will need to make sure that the names you set for your facets are in the same order as they appear in the factor (df$Species). Consequently, this method is a bit more risky unless you're sure of the ordering of the levels of the particular factor:
df <- iris
df$Species <- factor(df$Species)
levels(df$Species) <- c("First", "Second", "Third")
ggplot(df, aes(x=Sepal.Width, y=Sepal.Length)) + geom_point() +
facet_wrap(~Species)
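If you do want to rename the factor itself, one way to sidestep the ordering risk is to go through a named lookup vector, much like the named vector passed to as_labeller(). This is only a sketch; lookup and new_levels are illustrative names.

```r
df <- iris

# Old level -> new label, listed in any order
lookup <- c(setosa = "First", virginica = "Third", versicolor = "Second")

# Translate the existing levels by name so the facet order is preserved
new_levels <- unname(lookup[levels(df$Species)])

# Translate every observation by name, then rebuild the factor
df$Species <- factor(unname(lookup[as.character(df$Species)]),
                     levels = new_levels)
```

Because every lookup is by name, a label cannot end up on the wrong level even if the vector is written in a different order from the factor's levels.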
In the end, remove your names() line in your code and I would recommend changing directly in the plot code. If your factor levels are called "E" and "W", then apply the labels you want with a named vector formatted with as_labeller() and applied via the labeller= argument inside facet_wrap(...).
I want to make a Munsell color chart for the chips used by the World Color Survey. It should look like this:
The information needed can be found on the WCS page. I take the following steps:
library(munsell) # https://cran.r-project.org/web/packages/munsell/munsell.pdf
library(ggplot2)
# take the "cnum-vhcm-lab-new.txt" file from: https://www1.icsi.berkeley.edu/wcs/data.html#wmt
# change by replacing .50 with .5 removing .00 after hue values
WCS <- read.csv("cnum-vhcm-lab-new.txt", sep = "\t", header = T)
WCS$hex <- mnsl2hex(hvc2mnsl(hue = WCS$MunH, value = ceiling(WCS$MunV), chroma = WCS$C), fix = T)
# this works, but the order of tiles is messed up
ggplot(aes(x=H, y=V, fill=hex), data = WCS) +
geom_tile(aes(x=H, y=V), show.legend = F) +
scale_fill_manual(values = WCS$hex) +
scale_x_continuous(breaks = scales::pretty_breaks(n = 40))
The result:
Clearly, the chips are not ordered along hue and value but with reference to some other dimension, perhaps even their order in the original data frame. I also have to reverse the order on the y-axis. I guess the solution will have to do with factor() and reorder(), but how do I do it?
OP. TL;DR - you should be using scale_fill_identity() rather than scale_fill_manual().
Now for the long description: At its core, ggplot2 works by mapping the columns of your data to specific features of the plot, which ggplot2 refers to as "aesthetics", using the aes() function. Positioning is defined by mapping certain columns of your data to the x and y aesthetics, and the different colors in your tiles are mapped to fill using aes() as well.
The mapping for fill does not specify color, but only specifies which things should be different colors. When mapped this way, it means that rows in your data (observations) that have the same value in column mapped to the fill aesthetic will be the same color, and observations that have different values in the column mapped to the fill aesthetic will be different colors. Importantly, this does not specify the color, but only specifies if colors should be different!
The default behavior is that ggplot2 will determine the colors to use by applying a default scale. For continuous (numeric) values, a continuous scale is applied, and for discrete values (like a vector of characters), a discrete scale is applied.
To see the default behavior, just remove scale_fill_manual(...) from your plot code. I've recopied your code below and added the needed revisions to programmatically remove and adjust the ".50" and ".00" changes to WCS$MunH. The code below should work entirely if you have downloaded the original .txt file from the link you provided.
library(munsell)
library(ggplot2)
WCS <- read.csv("cnum-vhcm-lab-new.txt", sep = "\t", header = T)
WCS$MunH <- gsub('.50', '.5', WCS$MunH, fixed = TRUE) # turn ".50" into ".5" (fixed = TRUE matches the dot literally)
WCS$MunH <- gsub('.00', '', WCS$MunH, fixed = TRUE) # remove ".00" altogether
WCS$V <- factor(WCS$V) # needed to flip the axis
WCS$hex <- mnsl2hex(hvc2mnsl(hue = WCS$MunH, value = ceiling(WCS$MunV), chroma = WCS$C), fix = T)
ggplot(aes(x=H, y=V, fill=hex), data = WCS) +
geom_tile(aes(x=H, y=V), show.legend = F, width=0.8, height=0.8) +
scale_y_discrete(limits = rev(levels(WCS$V))) + # flipping the axis
scale_x_continuous(breaks = scales::pretty_breaks(n = 40)) +
coord_fixed() + # force all tiles to be "square"
theme(
panel.grid = element_blank()
)
You have show.legend = F in there, which hides the legend, but there should be 324 different values in the WCS$hex column (i.e. length(unique(WCS$hex))), and the default discrete scale assigns each of them its own color.
When using scale_fill_manual(values=...), you are supplying the names of the colors to be used, but they are not mapped to the same positions in your column WCS$hex. They are applied according to the way in which ggplot2 decides to organize the levels of WCS$hex as if it were a factor.
In order to tell ggplot2 to basically ignore the mapping and just color according to the actual color name you see in the column mapped to fill, you use scale_fill_identity(). This will necessarily remove the ability to show any legend, since it kind of removes the mapping and recoloring that is the default behavior of aes(fill=...). Regardless, this should solve your issue:
ggplot(aes(x=H, y=V, fill=hex), data = WCS) +
geom_tile(aes(x=H, y=V), width=0.8, height=0.8) +
scale_fill_identity() + # assign color based on text
scale_y_discrete(limits = rev(levels(WCS$V))) + # flipping the axis
scale_x_continuous(breaks = scales::pretty_breaks(n = 40)) +
coord_fixed() + # force all tiles to be "square"
theme(
panel.grid = element_blank()
)
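The difference between the two scales can be checked on a tiny made-up example (three tiles whose hex column is deliberately not in alphabetical order). This is a sketch for illustration, not the WCS data.

```r
library(ggplot2)

toy <- data.frame(x = 1:3, y = 1,
                  hex = c("#FF0000", "#0000FF", "#00FF00"),
                  stringsAsFactors = FALSE)

# Unnamed manual values are handed out in the sorted order of the hex strings,
# not row by row, so tiles get some other tile's colour
p_manual <- ggplot(toy, aes(x, y, fill = hex)) +
  geom_tile() +
  scale_fill_manual(values = toy$hex)

# The identity scale uses each value as its own colour
p_identity <- ggplot(toy, aes(x, y, fill = hex)) +
  geom_tile() +
  scale_fill_identity()

fill_manual   <- layer_data(p_manual, 1)$fill
fill_identity <- layer_data(p_identity, 1)$fill
```

Comparing fill_manual and fill_identity against toy$hex shows that only the identity scale reproduces the colours stored in the data.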
The main thing is to use the right color scale (scale_fill_identity). This ensures the hex values are used as the color for the tiles.
library(munsell) # https://cran.r-project.org/web/packages/munsell/munsell.pdf
library(ggplot2)
WCS <- read.csv(url('https://www1.icsi.berkeley.edu/wcs/data/cnum-maps/cnum-vhcm-lab-new.txt'), sep = "\t", header = T)
WCS$hex <- mnsl2hex(hvc2mnsl(hue = gsub('.00', '', gsub('.50', '.5', WCS$MunH, fixed = TRUE), fixed = TRUE), value = ceiling(WCS$MunV), chroma = WCS$C), fix = T)
# scale_fill_identity() uses each hex value directly as the tile color
ggplot(aes(x=H, y=V, fill=hex), data = WCS) +
geom_tile(aes(x=H, y=V), show.legend = F) +
scale_fill_identity() +
scale_x_continuous(breaks = scales::pretty_breaks(n = 40))
Created on 2021-10-05 by the reprex package (v2.0.1)
Tr<-c("Sorghum Male \n Sorghum Female","Sorghum Male \n Wheat Female","Wheat Male \n Sorghum Female","Wheat Male \n Wheat Female")
Treatment<-c(rep(Tr,3))
Matingdiet<-c(rep(c("Same diet","Cross diet","Cross diet", "Same diet"),3))
Rejection<-c(0.05, 0.00, 0.10, 0.00, 0.00, 0.05, 0.05, 0.00, 0.05, 0.05, 0.05, 0.05)
d<-as.data.frame(cbind(Treatment,Rejection, Matingdiet))
d$pop<-c(rep("JN200A-OBL",4),rep("JN200B-OBL",4),rep("JN200C-OBL",4))
d$Rejection<-as.numeric(as.character(d$Rejection))
d$pop<-as.factor(d$pop)
datatxt<-as.data.frame(cbind(labels = rep("N = 20 per treatment",3)),pop=c("JN200A-OBL","JN200B-OBL","JN200C-OBL"))
pl<-ggplot(data = d, aes(x=Treatment, y=Rejection, fill=Matingdiet))+geom_col()+facet_wrap(~pop)
pl<-pl+labs(fill="Mating pair type", y = "Proportion of mates rejected")+ylim(0,1)+theme(axis.text.x = element_text(angle = -60, hjust = 1, vjust = -1))
pl<-pl+theme(plot.background = element_blank(),panel.grid.major = element_blank(), panel.grid.minor = element_blank())
pl+geom_text(data=datatxt,aes(label = labels))
Which gives this error
Error: Aesthetics must be either length 1 or the same as the data (9): x, y and fill
When I run it without adding the geom_text() function I get my desired graph but I want to annotate it.
Well, first of all, it looks like there is a misplaced parenthesis in your code where you define datatxt, which leaves that data frame with only one column, called labels. You're also using as.data.frame() when it makes much more sense to simply use data.frame(), where you do not need cbind() but can just supply the columns as named vectors, separated by commas.
datatxt <- data.frame(
labels = rep("N = 20 per treatment",3),
pop=c("JN200A-OBL","JN200B-OBL","JN200C-OBL")
)
As for placing the text on your plot, if you are using geom_text it will be mapping the data like ggplot does for all the other geoms. That is to say that what is drawn on the plot will be based on the data itself and the mapping you define in aes(), which is linked to the columns of that data. For geom_text, it will look through each observation in the dataset you give it (in this case, datatxt) and look for those values pertaining to x, y, and also fill (because this was defined in the overall call to ggplot() in your plot). The error message is due to not finding those columns in the dataset, and in fact, you do not have columns that are mapped to x, y, or fill in datatxt at all.
The first fix is to remove the fill aesthetic from the overall call to ggplot(). If it is used in all geoms, it makes sense to put it there, but I like to define the aesthetics that are used only for particular geoms inside the geom call itself. Hence, I'm moving fill=Matingdiet inside the aes() for geom_col() where it is used. There are other ways around this, but this is the simplest.
Second, you presumably want the text to appear in the same location for each facet, right? Since it's not going to move with the data, we should be defining where it goes outside the mapping= specification of geom_text(), in other words, outside aes(). I also change a few other aesthetics so you can see what else you may want to specify here.
Here's the result:
pl<-
ggplot(data = d, aes(x=Treatment, y=Rejection)) +
geom_col(aes(fill=Matingdiet)) +
facet_wrap(~pop) +
labs(fill="Mating pair type", y = "Proportion of mates rejected") +
ylim(0,1) +
theme(
axis.text.x = element_text(angle = -60, hjust = 1, vjust = -1),
plot.background = element_blank(),
panel.grid.major = element_blank(),
panel.grid.minor = element_blank()
)
pl +
geom_text(
data=datatxt, aes(label = labels),
x=1, y=0.95, hjust=0, fontface='italic', color='red')
Remember that geom_text is really suited to the case where the value of whatever you assign inside aes() changes with respect to the data. For example, you might have N=20 for two of the facets, but N=30 for another one. If that's the case, you can use the approach above. If you need the text to remain the same regardless of the data, an easier approach might be to use annotate() instead:
pl +
annotate(
geom='text', x=1, y=0.95, color='red', fontface='italic',
label='N = 20 per treatment', hjust=0
)
Ultimately, it's up to you, as both work here. The above code gives you the same plot as using geom_text.
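For the case where the label really does differ by facet, a minimal sketch might look like the following. The data frames toy and notes, the facet names, and the N values are all made up for illustration.

```r
library(ggplot2)

toy <- data.frame(
  pop = rep(c("A", "B", "C"), each = 2),
  trt = rep(c("t1", "t2"), times = 3),
  rej = c(0.10, 0.20, 0.05, 0.10, 0.15, 0.05)
)

# One row per facet: the facet variable plus the text to show there
notes <- data.frame(
  pop   = c("A", "B", "C"),
  label = c("N = 20", "N = 20", "N = 30")
)

p <- ggplot(toy, aes(x = trt, y = rej)) +
  geom_col() +
  facet_wrap(~pop) +
  ylim(0, 1) +
  # label is mapped (it varies by facet); position is set (it does not)
  geom_text(data = notes, aes(label = label),
            x = 1, y = 0.95, hjust = 0, inherit.aes = FALSE)

text_layer <- layer_data(p, 2)
```

The pop column in notes is what lets facet_wrap() route each label to its own panel; annotate() cannot do this because it has no data to facet on.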
I'm making a horizontal bar chart where each observation has a numeric count variable associated with it. I want to show the bars for each variable ordered by (descending) count, which is no problem. However I also want to highlight the variable name based on a third dichotomous variable. I found how to do the latter in another post on here, but I have been unable to combine the two. Here's an example of what I mean:
library(ggplot2)
testdata<-data.frame("var"=c('V1','V2','V3','V4'),"cat"=c('Y','N','Y','N'),
"count"=c(1,5,2,10))
ggplot(testdata, aes(var,count))+
geom_bar(stat='identity',colour='blue',fill='blue',width=0.3)+
coord_flip(ylim=c(0,10))+
theme(axis.text.y=
element_text(colour=ifelse(testdata$cat=="N","darkgreen","darkred"),
size=15))
That's the horizontal bar chart with highlighting, which works fine - V1/V3 are red and V2/V4 are green.
However when I try to sort it doesn't keep the groups:
ggplot(testdata, aes(reorder(var,count),count))+
geom_bar(stat='identity',colour='blue',fill='blue',width=0.3)+
coord_flip(ylim=c(0,10))+theme_classic()+
theme(axis.ticks.y=element_blank())+
theme(axis.text.y=
element_text(colour=ifelse(testdata$cat=="N","darkgreen","darkred"),
size=15))
In this second graph, V2 and V3 are the wrong color.
I also tried sorting the data by count first, and then using the first ggplot statement, however it still plots the data by variable name instead of count (and even if it did work, I would have to resolve tied count values). Any ideas? What I really need is for the dataframe in the "ifelse" colour to match the dataframe in the aes statement. I tried using the data frame that was sorted by descending count in the colour statement, but that also did not work.
Thanks
edit: more code
testdata$var = with(testdata, reorder(var, count))
ggplot(testdata, aes(var,count))+
geom_bar(stat='identity',colour='blue',fill='blue',width=0.3)+
coord_flip(ylim=c(0,10))+theme_classic()+
theme(axis.ticks.y=element_blank())+
theme(axis.text.y=
element_text(colour=ifelse(testdata$cat=="N","darkgreen","darkred"),
size=15))
My comment was partially incorrect. The order of the levels is the only thing that matters for the order of the axis, but when we do ifelse(testdata$cat == "N", "darkgreen", "darkred") of course it goes in the order of the data! So we need the order of the levels and the order of the data to be the same:
testdata$var = with(testdata, reorder(var, count))
testdata = testdata[order(testdata$var), ]
ggplot(testdata, aes(var, count)) +
geom_bar(
stat = 'identity',
colour = 'blue',
fill = 'blue',
width = 0.3
) +
coord_flip(ylim = c(0, 10)) + theme_classic() +
theme(axis.ticks.y = element_blank()) +
theme(axis.text.y =
element_text(
colour = ifelse(testdata$cat == "N", "darkgreen", "darkred"),
size = 15
))
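An alternative sketch that leaves the row order of the data alone: build the colour vector in the order of the factor levels, since that is the order in which the axis labels are drawn. The names lvl_cat and axis_cols are illustrative.

```r
testdata <- data.frame(var   = c("V1", "V2", "V3", "V4"),
                       cat   = c("Y", "N", "Y", "N"),
                       count = c(1, 5, 2, 10))

# Levels are sorted by count; the rows themselves stay where they are
testdata$var <- reorder(testdata$var, testdata$count)

# Look up each level's category, in level order
lvl_cat <- testdata$cat[match(levels(testdata$var), testdata$var)]

# One colour per axis label, in the order the labels appear on the axis
axis_cols <- ifelse(lvl_cat == "N", "darkgreen", "darkred")
```

axis_cols can then be passed to element_text(colour = axis_cols, size = 15) in place of the ifelse() call. Note that recent ggplot2 versions warn that vectorised input to element_text() is not officially supported, so either approach relies on behaviour that may change.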
I have a bar graph coming from one set of monthly data and I want to overlay on it data from another set of monthly data in the form of a line. Here is a simplified example (in my data the second data set is not a simple manipulation of the first):
library(reshape2)
library(ggplot2)
test<-abs(rnorm(12)*1000)
test<-rbind(test, test+500)
colnames(test)<-month.abb[seq(1:12)]
rownames(test)<-c("first", "second")
otherTest<-apply(test, 2, mean)
test<-melt(test)
otherTest<-as.data.frame(otherTest)
p<-ggplot(test, aes(x=Var2, y=value, fill=Var1, order=-as.numeric(Var2))) + geom_bar(stat="identity")+
theme_bw() + theme(panel.border = element_blank(), panel.grid.major = element_blank(),
panel.grid.minor = element_blank(), axis.line = element_line(colour = "black")) +
ggtitle("Test Graph") +
scale_fill_manual(values = c(rgb(1,1,1), rgb(.9,0,0))) +
guides(fill=FALSE) +
theme(axis.text.x = element_text(angle = 45, hjust = 1))
works great to get the bar graph:
but I have tried multiple iterations to get the line on there and can't figure it out (like this):
p + geom_line(data=otherTest,size=1, color=rgb(0,.5,0))
Also, if anybody knows how I can make the bars in front of each other so that all you see is a red bar of height 500, I would appreciate any suggestions. I know I can just take the difference between the two lines of the matrix and keep it as a stacked bar but I thought there might be an easy way to put both bars on the x-axis, white in front of red. Thanks!
You have a few problems to deal with here.
Directly answering your question, if you don't provide a mapping via aes(...) in a geom call (like your geom_line...), then the mapping will come from ggplot(). Your ggplot() specifies x=Var2, y=value, fill=Var1.... All of these variable names must exist in your data frame otherTest for this to work, and they don't right now.
So, you either need to ensure that these variable names exist in otherTest, or specify mapping separately in geom_line. You might want to read up about how these layering options work. E.g., here's a post of mine that goes into some detail.
If you go for the first option, some other problems to think about:
is Var2 a factor with the same levels in both data frames? It probably should be.
to use geom_line as you are, you might need to add group = 1.
Some others too, but here's a brief example of what you might do:
library(reshape2)
library(ggplot2)
test <- abs(rnorm(12)*1000)
test <- rbind(test, test+500)
colnames(test) <- month.abb[seq(1:12)]
rownames(test) <- c("first", "second")
otherTest <- apply(test, 2, mean)
test <- melt(test)
otherTest <- data.frame(
Var2 = names(otherTest),
value = otherTest
)
otherTest$Var2 = factor(otherTest$Var2, levels = levels(test$Var2))
ggplot(test, aes(x = Var2, y = value, group = 1)) +
geom_bar(aes(fill = Var1), stat="identity") +
geom_line(data = otherTest)
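The second option mentioned above, giving geom_line its own mapping, could be sketched as follows. The data frames bars and line_dat and their column names are hypothetical; the point is that with inherit.aes = FALSE the line layer no longer needs columns called Var2, value, or Var1.

```r
library(ggplot2)

months <- month.abb[1:3]

bars <- data.frame(
  month  = factor(rep(months, times = 2), levels = months),
  series = rep(c("first", "second"), each = 3),
  value  = c(100, 200, 300, 600, 700, 800)
)

# The line data uses its own column names
line_dat <- data.frame(
  mon = factor(months, levels = months),
  avg = c(350, 450, 550)
)

p <- ggplot(bars, aes(x = month, y = value, fill = series)) +
  geom_col() +
  # inherit.aes = FALSE stops this layer from looking for month/value/series;
  # group = 1 joins the points across the discrete x axis
  geom_line(data = line_dat, aes(x = mon, y = avg, group = 1),
            inherit.aes = FALSE)

line_layer <- layer_data(p, 2)
```

Keeping the same factor levels in both data frames still matters, since the two layers share one discrete x scale.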