Text Labels with stacked bar charts in ggplot2 [duplicate]

I'm trying to make a stacked bar chart with text labels. This is some example data / code:
library(ggplot2)
library(reshape2)
ConstitutiveHet <- c(7,13)
Enhancer <- c(12,6)
FacultativeHet <- c(25,39)
LowConfidence <- c(3,4)
Promoter <- c(5,4)
Quiescent <- c(69,59)
RegPermissive <- c(23,18)
Transcribed <- c(12,11)
Bivalent <- c(6,22)
group <- c("all","GWS")
meanComb <- data.frame(ConstitutiveHet,Enhancer,LowConfidence,Promoter,Quiescent,RegPermissive,Transcribed,Bivalent,group)
meanCombM <- melt(meanComb,id.vars = "group")
ggplot(meanCombM,aes(group,value,label=value)) +
geom_col(aes(fill=variable))+
geom_text(position = "stack")+
coord_flip()
The text labels appear out of order; they seem to be the mirror image of their intended order. (You get the same problem with or without the coord_flip().)
A poster had a similar problem here:
ggplot2: add ordered category labels to stacked bar chart
An answer to their post proposed reversing the order of the values in the groups, which I tried (see below), but the resulting order on the plot is not one I've been able to figure out. This approach also seems hacky. Is there a bug here, or am I missing something?
x <- c(rev(meanCombM[meanCombM$group=="GWS",]$value),rev(meanCombM[meanCombM$group=="all",]$value))
ggplot(meanCombM,aes(group,value,label=x)) +
geom_col(aes(fill=variable))+
geom_text(position = "stack")+
coord_flip()

ggplot(meanCombM,aes(group,value,label=value)) +
geom_col(aes(fill=variable))+
geom_text(aes(group=variable),position = position_stack(vjust = 0.5))+
coord_flip()
Hadley answered a question similar to mine in this issue in ggplot2's GitHub repository: https://github.com/tidyverse/ggplot2/issues/1972
Apparently the default grouping behaviour (see http://ggplot2.tidyverse.org/reference/aes_group_order.html) does not partition the data correctly here unless a group aesthetic is specified for geom_text. In this example it should map to the same variable as fill in geom_col.
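An alternative that appears to give the same grouping is to move fill into the top-level aes(), so that geom_text inherits it. Here is a minimal sketch with made-up data (the data frame d and its values are placeholders, not the data from the question), assuming a ggplot2 version that provides geom_col() and position_stack(vjust = ):

```r
library(ggplot2)

# Made-up data: two bars ("all", "GWS") with three segments each.
d <- data.frame(
  group    = rep(c("all", "GWS"), each = 3),
  variable = rep(c("Enhancer", "Promoter", "Quiescent"), times = 2),
  value    = c(12, 5, 69, 6, 4, 59)
)

# fill is mapped globally, so both layers are grouped by x and fill and
# position_stack() stacks the labels in the same order as the segments.
p <- ggplot(d, aes(group, value, fill = variable, label = value)) +
  geom_col() +
  geom_text(position = position_stack(vjust = 0.5)) +
  coord_flip()

# Check that each label sits at the midpoint of its own segment.
bars   <- layer_data(p, 1)
labels <- layer_data(p, 2)
bars   <- bars[order(bars$group), ]
labels <- labels[order(labels$group), ]
centred <- isTRUE(all.equal(as.numeric(labels$y),
                            as.numeric((bars$ymin + bars$ymax) / 2)))
```

If centred comes back FALSE on your data, the two layers are not being grouped the same way, which is the symptom described in the question.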

Related

Labels right next to points in ggplot

I've tried to google my way to the answer, but nothing I found covers what I'm trying to do.
My goal is to add labels right beside the observations within the plot to show the name of each observation. The names of the observations are in the first column of my data frame.
ggplot(data = coef.vec)+aes(x = coef.x, y = variable.mean) +
geom_point()
You can add labels with geom_text() in the following style. I have used simulated data:
library(tidyverse)
#Code
data <- data.frame(group=paste0('Obs',1:10),
coef.x=rnorm(10,0,1),
variable.mean=runif(10,0.015,0.05),stringsAsFactors = F)
#Plot
ggplot(data,aes(x=coef.x,y=variable.mean))+
geom_point()+
geom_text(aes(label=group),hjust=-0.15)
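Since the question says the names are in the first column of the data frame, the label column can also be looked up by position. This is only a sketch: the coef.vec below is a made-up stand-in for the asker's data, label_col is a hypothetical helper name, and it assumes a ggplot2 version that supports the .data pronoun inside aes().

```r
library(ggplot2)

# Made-up stand-in for coef.vec: observation names are in the first column.
coef.vec <- data.frame(
  name          = paste0("Obs", 1:5),
  coef.x        = c(-1.2, -0.4, 0.1, 0.8, 1.5),
  variable.mean = c(0.020, 0.035, 0.018, 0.042, 0.030)
)

# Look the label column up by position instead of hard-coding its name.
label_col <- names(coef.vec)[1]

p <- ggplot(coef.vec, aes(x = coef.x, y = variable.mean)) +
  geom_point() +
  # hjust = 0 left-aligns the text; nudge_x shifts it to the right of the point
  geom_text(aes(label = .data[[label_col]]), hjust = 0, nudge_x = 0.05)
```

The nudge_x value is in data units, so it will probably need adjusting to the range of coef.x in the real data.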

ggplotly - R, labeling trace names

I'm new to plotly and can't find the relevant documentation on how to name the traces so that a meaningful label appears in the plot rendered by ggplotly. Here is the ggplotly site that shows a number of examples. What is needed to show a meaningful label on hover instead of the value followed by trace0, trace1, etc.?
For example, in the first plot, how can the labels appear so it shows:
Proportion: value
Total bill: value
Ideally, I would like to do this directly in R rather than through the web interface. Thanks in advance.
Using ggplot2 and Plotly you can set the text. You'll want to install Plotly and get a key. Here are two examples. Example one:
data(canada.cities, package="maps")
viz <- ggplot(canada.cities, aes(long, lat)) +
borders(regions="canada", name="borders") +
coord_equal() +
geom_point(aes(text=name, size=pop), colour="red", alpha=1/2, name="cities")
ggplotly()
ggplotly(filename="r-docs/canada-bubble")
This yields this plot with the name of Canadian cities available on the hover.
Example two:
install.packages("gapminder")
library(gapminder)
ggplot(gapminder, aes(x = gdpPercap, y = lifeExp, color = continent, text = paste("country:", country))) +
geom_point(alpha = (1/3)) + scale_x_log10()
ggplotly(filename="ggplot2-docs/alpha-example")
Which yields this plot.
For more information, see our R docs or this question on how to overwrite the hover_text element. Plotly's native R API lets you add more controls to your plots. Thanks for asking Brian. We'll add a new section to our docs on this as well. Disclaimer: I work for Plotly.
You can also edit any of the plotly figure properties after the ggplot2 conversion but before you send it to plotly. Here is an example that changes the legend entry names manually. I'll repeat it here:
df <- data.frame(x=c(1, 2, 3, 4), y=c(1, 5, 3, 5), group=c('A', 'A', 'B', 'B'))
g <- ggplot(data=df, aes(x=x, y=y, colour=group)) + geom_point()
# an intermediate step that `ggplotly` calls
p <- plotly_build(g)
# manually change the legend entry names, which are "trace0", "trace1" in your case
p$data[[1]]$name <- 'Group A'
p$data[[2]]$name <- 'Group B'
# send this up to your plotly account
p$filename <- 'ggplot2-user-guide/custom-ggplot2'
plotly_POST(p)
The extended example here explains in more detail how and why this works.
Note that in general the legend item names, e.g. "trace0", are going to be the labels that you grouped by in the dataframe (as in ggplot2).
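The snippets above were written for an older plotly release. In plotly 4.0 and later, as far as I know, ggplotly() returns an htmlwidget that renders locally without an account or key, the hover text is selected with the tooltip argument, and the traces live in p$x$data rather than p$data. A sketch under those assumptions, reusing the small data frame from the previous answer:

```r
library(ggplot2)
library(plotly)

df <- data.frame(x = c(1, 2, 3, 4), y = c(1, 5, 3, 5),
                 group = c("A", "A", "B", "B"))

# `text` is not something ggplot2 itself draws; ggplotly() uses it for the
# hover label when tooltip = "text" is requested.
g <- ggplot(df, aes(x = x, y = y, colour = group,
                    text = paste("Group:", group, "<br>y:", y))) +
  geom_point()

p <- ggplotly(g, tooltip = "text")

# One trace per colour group; rename the legend entries by hand.
p$x$data[[1]]$name <- "Group A"
p$x$data[[2]]$name <- "Group B"

# Print `p` in an interactive session to render the widget.
```

Renaming traces by index depends on the order in which ggplotly() emits them, so it is worth checking p$x$data[[i]]$name before overwriting it.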

overlaying 2 plots from different sized dataset with legends using ggplot

I have 2 datasets, with different size. How do I simply plot them and have each with a different color and a legend?
So in this case, the legend would be count1, count2, and the legend title is something I choose, let's say: mylegend. What do I need to change or add to the following commands?
x <- data.frame(Q=1:10, count1=21:30)
y <- data.frame(Q=seq(1,10,0.5), count2=seq(11,20, 0.5))
ggplot() + geom_line(data=x, aes(x=Q, y=count1)) + geom_point(data=y, aes(x=Q, y=count2))
The easiest solution is to combine your data in the same data.frame, then set the aesthetics (aes) in ggplot.
Here is one way you can combine everything:
df <- data.frame(Q = c(x$Q, y$Q),
count = c(x$count1, y$count2),
type = c(rep("count1", 10), rep("count2", 19))
)
But you can also use commands like rbind() or melt() (from the reshape2 library).
With the data combined into one data.frame:
ggplot(df, aes(x=Q, y=count, colour=type)) + geom_point() + geom_line() +
scale_colour_discrete(name="mylegend")
This is a basic example, and I highly recommend Hadley Wickham's ggplot2 book, as well as just searching Google (Stack Overflow, R Cookbook, etc.) for solutions to specific plotting problems or more ways to customize your plots.
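For the rbind() route mentioned above, a minimal sketch could look like this. It uses the x and y from the question; the combined column names count and type follow the answer's data.frame.

```r
library(ggplot2)

x <- data.frame(Q = 1:10, count1 = 21:30)
y <- data.frame(Q = seq(1, 10, 0.5), count2 = seq(11, 20, 0.5))

# Give both data frames the same column names and tag each with its source.
# The scalar type is recycled, so the lengths 10 and 19 are not hard-coded.
df <- rbind(
  data.frame(Q = x$Q, count = x$count1, type = "count1"),
  data.frame(Q = y$Q, count = y$count2, type = "count2")
)

p <- ggplot(df, aes(x = Q, y = count, colour = type)) +
  geom_point() +
  geom_line() +
  scale_colour_discrete(name = "mylegend")
```

Because type is built from the data frames themselves, the code keeps working if either dataset changes size.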

colour single ggplot axis item

I have created a chart and am wanting to colour one of the x-axis items based on a variable. I have seen this post (How to get axis ticks labels with different colors within a single axis for a ggplot graph?), but am struggling to apply it to my dataset.
df1 <- data.frame(var=c("a","b","c","a","b","c","a","b","c"),
val=c(99,120,79,22,43,53,12,27,31),
type=c("alpha","alpha","alpha","bravo","bravo","bravo","charlie","charlie","charlie"))
myvar="a"
ggplot(df1,aes(x=reorder(var,-val), y=val,fill=type)) + geom_bar(stat="identity")
Any tips on how to make the x-axis value red when it is equal to myvar?
Update: Thanks to @ddiez for some guidance. I finally came around to the fact that I would have to reorder prior to plotting. I also should have made my original example with data.table, so I am not sure whether this would have influenced the original responses. I modified my original dataset to be a data.table and used the following code to achieve success.
library(data.table)
df1 <- data.table(var=c("a","b","c","a","b","c","a","b","c"),
val=c(99,120,79,22,43,53,12,27,31),
type=c("alpha","alpha","alpha","bravo","bravo","bravo","charlie","charlie","charlie"))
myvar="a"
df1[,axisColour := ifelse(var==myvar,"red","black")]
df1$var <- reorder(df1$var,-df1$val,sum)
setkey(df1,var,type)
ggplot(df1,aes(x=var, y=val,fill=type)) + geom_bar(stat="identity") +
theme(axis.text.x = element_text(colour=df1[,axisColour[1],by=var][,V1]))
There may be a more elegant solution but a quick hack (requires you to know the final order) would be:
ggplot(df1,aes(x=reorder(var,-val), y=val,fill=type)) +
geom_bar(stat="identity") +
theme(axis.text.x = element_text(colour=c("black","black","red")))
A solution using the variable myvar (yet, there may be better ways):
# reorder the factors in the data.frame (instead of in situ).
df1$var=reorder(df1$var, -df1$val)
# create a vector of colors for each level.
mycol=rep("black", nlevels(df1$var))
names(mycol)=levels(df1$var)
# assign the desired ones a different color.
mycol[myvar]="red"
ggplot(df1,aes(x=var, y=val,fill=type)) +
geom_bar(stat="identity") +
theme(axis.text.x = element_text(colour=mycol))
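A slightly shorter variant of the same idea derives the colour vector directly from the factor levels. This is a sketch, not a different method: axis_cols is a name I made up, and recent ggplot2 releases may warn that vectorised input to element_text() is not officially supported, so treat it as a workaround.

```r
library(ggplot2)

df1 <- data.frame(var  = c("a","b","c","a","b","c","a","b","c"),
                  val  = c(99,120,79,22,43,53,12,27,31),
                  type = c("alpha","alpha","alpha","bravo","bravo","bravo",
                           "charlie","charlie","charlie"))
myvar <- "a"

# Reorder the levels before plotting so the axis order is known up front.
df1$var <- reorder(factor(df1$var), -df1$val)

# One colour per level, in level order (which is the axis order).
axis_cols <- ifelse(levels(df1$var) == myvar, "red", "black")

p <- ggplot(df1, aes(x = var, y = val, fill = type)) +
  geom_bar(stat = "identity") +
  theme(axis.text.x = element_text(colour = axis_cols))
```

With this data the levels come out as b, c, a, so the colour vector matches the hard-coded c("black","black","red") from the quick hack.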

Label selected percentage values inside stacked bar plot (ggplot2)

I want to put labels of the percentages on my stacked bar plot. However, I only want to label the largest 3 percentages for each bar. I went through a lot of helpful posts on SO (for example: 1, 2, 3), and here is what I've accomplished so far:
library(ggplot2)
library(scales)  # provides percent, used in scale_y_continuous() below
groups<-factor(rep(c("1","2","3","4","5","6","Missing"),4))
site<-c(rep("Site1",7),rep("Site2",7),rep("Site3",7),rep("Site4",7))
counts<-c(7554,6982, 6296,16152,6416,2301,0,
20704,10385,22041,27596,4648, 1325,0,
17200, 11950,11836,12303, 2817,911,1,
2580,2620,2828,2839,507,152,2)
tapply(counts,site,sum)
tot<-c(rep(45701,7),rep(86699,7), rep(57018,7), rep(11528,7))
prop<-sprintf("%.1f%%", counts/tot*100)
data<-data.frame(groups,site,counts,prop)
ggplot(data, aes(x=site, y=counts,fill=groups)) + geom_bar()+
stat_bin(geom = "text",aes(y=counts,label = prop),vjust = 1) +
scale_y_continuous(labels = percent)
I wanted to insert my output image here but don't seem to have enough reputation... But the code above should be able to produce the plot.
So how can I label only the largest 3 percentages on each bar? Also, for the legend, is it possible to change the order of the categories? For example, put "Missing" first. This is not a big issue here, but for my real data set the order of the categories in the legend really bothers me.
I'm new on this site, so if there's anything that's not clear about my question, please let me know and I will fix it. I appreciate any answer/comments! Thank you!
I did this in a sort of hacky manner. It isn't that elegant.
Anyways, I used the plyr package, since the split-apply-combine strategy seemed to be the way to go here.
I recreated your data frame with a variable perc that represents the percentage for each site. Then, for each site, I just kept the 3 largest values for prop and replaced the rest with "".
# I added some variables, and added stringsAsFactors=FALSE
data <- data.frame(groups, site, counts, tot, perc=counts/tot,
prop, stringsAsFactors=FALSE)
# Load plyr
library(plyr)
# Split on the site variable, and keep all the other variables (is there an
# option to keep all variables in the final result?)
data2 <- ddply(data, ~site, summarize,
groups=groups,
counts=counts,
perc=perc,
prop=ifelse(perc %in% sort(perc, decreasing=TRUE)[1:3], prop, ""))
# I changed some of the plotting parameters
ggplot(data2, aes(x=site, y=perc, fill=groups)) + geom_bar()+
stat_bin(geom = "text", aes(y=perc, label = prop),vjust = 1) +
scale_y_continuous(labels = percent)
EDIT: Looks like your scales are wrong in your original plotting code. It gave me results with 7500000% on the y axis, which seemed a little off to me...
EDIT: I fixed up the code.
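The plotting calls above rely on an older ggplot2 API (geom_bar() with a y aesthetic, stat_bin(geom = "text")) which, as far as I can tell, current releases reject. Here is a sketch of the same idea for a recent ggplot2, using base R instead of plyr. The data is made up and much smaller than in the question, rank_in_site is a helper name I introduced, and the relevel() call is one possible way to put "Missing" first in the legend.

```r
library(ggplot2)

# Made-up data: two sites, four groups each.
data <- data.frame(
  site   = rep(c("Site1", "Site2"), each = 4),
  groups = rep(c("1", "2", "3", "Missing"), times = 2),
  counts = c(50, 30, 15, 5, 10, 40, 20, 30)
)

# Share of each group within its site.
data$perc <- data$counts / ave(data$counts, data$site, FUN = sum)

# Rank within each site (1 = largest share); keep a label for the top 3 only.
rank_in_site <- ave(-data$perc, data$site,
                    FUN = function(v) rank(v, ties.method = "first"))
data$prop <- ifelse(rank_in_site <= 3,
                    sprintf("%.1f%%", 100 * data$perc), "")

# Legend order follows the factor levels, so move "Missing" to the front.
data$groups <- relevel(factor(data$groups), ref = "Missing")

p <- ggplot(data, aes(x = site, y = perc, fill = groups, label = prop)) +
  geom_col() +
  geom_text(position = position_stack(vjust = 0.5)) +
  scale_y_continuous(labels = scales::percent)
```

Plotting perc (a fraction between 0 and 1) rather than counts is what keeps the percent axis sensible, which was the scale problem noted in the first EDIT.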

Resources