I had previously used annotate() to add letters to facet panels of ggplots. After updating R (to 3.6.1), code that had previously worked with annotate no longer does.
I can solve this by making a separate dataframe to label each facet, but that is cumbersome when I have a decent number of plots to make that vary in how many facets they have. All I want is a letter (e.g., a-f) on each panel for identification in a journal article.
library(ggplot2)
data(diamonds)
ggplot(diamonds, aes(x=carat,y=price)) +geom_point()+ facet_wrap(~cut) + annotate("text",label=letters[1:5],x=4.5,y=15000,size=6,fontface="bold")
ggplot(diamonds, aes(x=carat,y=price)) +geom_point()+ facet_wrap(~cut) + annotate("text",label=letters[1],x=4.5,y=15000,size=6,fontface="bold")
The first ggplot should produce a plot that has the facets labeled with lowercase letters. Instead, I get the error:
Error: Aesthetics must be either length 1 or the same as the data (25): label
The code does work if only one letter is used, as seen in the second ggplot, so annotate will work, but not with multiple values as it previously did.
I usually use an external data frame for faceted annotations, because it is easier for me to trace.
df_labels=unique(diamonds[,"cut"])
df_labels$label=letters[as.numeric(df_labels$cut)] #to preserve factor level ordering
df_labels$x=4.5
df_labels$y=15000
ggplot(diamonds, aes(x=carat,y=price)) +
geom_point()+ facet_wrap(~cut) +
geom_text(data=df_labels,aes(x=x,y=y,label=label))
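If building the label data frame by hand feels cumbersome across many plots, it can be wrapped in a small helper. The sketch below is only one way to do it: `facet_letters` is a hypothetical function name (not part of ggplot2), and placing the label at `x = Inf, y = Inf` with `hjust`/`vjust` is my own choice so that the letter sits in the top-right corner of every panel regardless of the data range.

```r
library(ggplot2)

# Hypothetical helper (not a ggplot2 function): returns one row per value of
# the facet variable, lettered in the same order the facets are drawn.
facet_letters <- function(data, facet_var, x = Inf, y = Inf) {
  out <- unique(as.data.frame(data)[facet_var])       # keeps the column's class
  out <- out[order(out[[facet_var]]), , drop = FALSE]  # factor-level order
  out$label <- letters[seq_len(nrow(out))]
  out$x <- x
  out$y <- y
  out
}

labs_df <- facet_letters(diamonds, "cut")

p <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point() +
  facet_wrap(~cut) +
  # Inf pins the label to the panel edge; hjust/vjust nudge it back inside
  geom_text(data = labs_df, aes(x = x, y = y, label = label),
            hjust = 1.5, vjust = 1.5, size = 6, fontface = "bold",
            inherit.aes = FALSE)
```

The same call works for any plot by changing the data and the facet column name, so the number of facets no longer has to be tracked manually.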
I am planning to plot a bar plot/clustered column chart of time vs. revenue, with a trend line connecting the tops of the bars, for the years 1981 to 1988.
I have used this code to read the CSV: read.csv("file_location/Revenue.csv", header = T, sep = ",", dec = ".")
for the plotting: pl <- ggplot(data,aes(x=ï..Year))
and then: pl + geom_bar(color='red',fill='blue').
Unfortunately, I end up with something like this, whereas I'd prefer something like this.
I used only the ggplot2 library in this case; should I use tidyr or dplyr additionally? Am I confusing continuous and discrete variables? Any advice on aesthetic modifications to beautify the plot, or any solutions, would be really appreciated, as I am still learning the basics of ggplot and data visualization.
I have added the file in case you want to check it: Revenue.csv
Check the documentation here for some information, but the big change you should make is to use geom_col in place of geom_bar. Your current call specifies an x= aesthetic (what should be on the x axis), but not the y= aesthetic (what should be on the y axis). By default, geom_bar counts the number of cases/observations at each x value, whereas geom_col displays a bar of height y at each x value... but for that you need a y aesthetic.
With all that being said, try this:
pl <- ggplot(data,aes(x=ï..Year, y=your.y.column.name)) +
geom_col(color='red',fill='blue')
As for aesthetics, I might change the color scheme a bit and also the theme, but that's kind of personal preference. My suggestion would be to at least change your color scheme for geom_bar/col. The color= argument specifies the outline of the bars, and fill= is the color of the bars themselves. Your code would give you bright blue bars with a red outline... not awesome. I would also make your bars a bit skinnier by lowering the width= argument from its default of 0.9 to something smaller. Here is an example with a dummy dataset. Most people (me included) would not want to download someone else's data via a link, sorry.
df <- data.frame(x=1:10, y=1:10)
ggplot(df, aes(x=x, y=y)) +
geom_col(fill='steelblue', color='black', width=0.5) +
theme_bw()
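The question also asked for a trend line running over the tops of the bars. A minimal sketch of that, again with a dummy dataset: the `Revenue` numbers below are made up, and the column names `Year` and `Revenue` are assumptions about what the real file contains.

```r
library(ggplot2)

# Dummy data standing in for Revenue.csv; the values are invented.
rev_df <- data.frame(Year = 1981:1988,
                     Revenue = c(10, 12, 15, 14, 18, 21, 20, 25))

p <- ggplot(rev_df, aes(x = factor(Year), y = Revenue)) +
  geom_col(fill = "steelblue", color = "black", width = 0.5) +
  # With a discrete x axis, group = 1 tells geom_line to join all the bars
  geom_line(aes(group = 1), color = "firebrick") +
  geom_point(color = "firebrick") +
  labs(x = "Year", y = "Revenue") +
  theme_bw()
```

Converting the year to a factor makes the axis discrete, so every year gets its own tick and bar; without `group = 1` the line layer would treat each year as its own group and draw nothing.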
I have 5 plots for 5 different groups. I want to indicate a statistically significant difference at specific time points. I used annotate() to place asterisks in the individual plots above those time points. However, when I combine all the plots to make one figure, the asterisks get pushed off the plots. It looks like a problem with the y scales not being fixed. I'm providing as much data as I feel comfortable with. The first bit of code is for one of the groups; the plots look relatively similar for all 5 groups. The second bit is the code I am using to combine the plots. Pictures are attached of one plot by itself, then all plots combined. There should be multiple asterisks on multiple plots.
ggplot(data,aes(X,Y,group=Group,color=Group))+
theme_bw()+
theme(panel.grid.major=element_line(color="white",size=.1))+
theme(panel.grid.minor=element_line(color="white",size=.1))+
geom_point(stat="summary")+
geom_errorbar(stat="summary",fun.data=mean_se,width=0.25)+
geom_line(stat="summary")+
scale_color_manual(labels = c("C", "T"),values=c("black", "red"))+
theme(axis.title.y = element_text(vjust=2.5))+
annotate("text", x=5, y=3, label= "*",size=10)
grid.newpage()
grid.draw(rbind(ggplotGrob(plotanimal1),
ggplotGrob(plotanimal2),
ggplotGrob(plotanimal3),
ggplotGrob(plotanimal4),
ggplotGrob(plotanimal5)))
You can make the asterisks by using geom_point with shape = 42. That way, ggplot will automatically fix the y axis values itself. You need to set the aesthetics at the same values you would have with annotate. So instead of
annotate("text", x=5, y=3, label= "*",size=10)
You can do
geom_point(aes(x=5, y=3), shape = 42, size = 2)
Have you tried using the patchwork package to arrange the plots? It typically works better than grid.draw.
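Another option, sketched here on made-up data, is to anchor the asterisk to the top edge of the panel rather than to a fixed y value. Infinite positions are not used when ggplot2 works out the axis limits, so `y = Inf` with a `vjust` nudge stays inside the panel whatever the y scale ends up being. The data frame `toy` and its values are assumptions for illustration only.

```r
library(ggplot2)

set.seed(1)
# Toy data: 6 time points, two groups, 5 observations per group and time point
toy <- data.frame(X = rep(1:6, each = 10),
                  Group = rep(c("C", "T"), 30),
                  Y = rnorm(60, mean = 2))

p <- ggplot(toy, aes(X, Y, group = Group, color = Group)) +
  geom_point(stat = "summary", fun.data = mean_se) +
  geom_line(stat = "summary", fun.data = mean_se) +
  # y = Inf pins the asterisk to the top of the panel; vjust pulls it inside
  annotate("text", x = 5, y = Inf, label = "*", vjust = 1.5, size = 10)
```

The trade-off is that the asterisk no longer sits just above the error bar of that time point; it always sits at the top of the panel.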
In the data that I am attempting to plot, each sample belongs in one of several groups, that will be plotted on their own grids. I am plotting stacked bar plots for each sample that will be ordered in increasing number of sequences, which is an id attribute of each sample.
Currently, the plot (with some random data) looks like this:
(Since I don't have the required 10 rep for images, I am linking it here)
There are a couple of things I need to accomplish, and I don't know where to start.
I would like the bars not to be placed at their corresponding nseqs values, but rather next to each other in ascending nseqs order.
I don't want each grid to have the same scale. Everything needs to fit snugly.
I have tried to set scales and size for facet_grid to free_x, but this results in an unused argument error. I think this is related to the fact that I have not been able to get the scales library loaded properly (it keeps saying it is not available).
Code that deals with plotting:
ggfdata <- melt(fdata, id.var=c('group','nseqs','sample'))
p <- ggplot(ggfdata, aes(x=nseqs, y=value, fill = variable)) +
geom_bar(stat='identity') +
facet_grid(~group) +
scale_y_continuous() +
opts(title=paste('Taxonomic Distribution - grouped by',colnames(meta.frame)[i]))
Try this:
update.packages()
## I'm assuming your ggplot2 is out of date because you use opts()
## If the scales library is unavailable, you might need to update R
ggfdata <- melt(fdata, id.var=c('group','nseqs','sample'))
ggfdata$nseqs <- factor(ggfdata$nseqs)
## Making nseqs a factor will stop ggplot from treating it as a numeric,
## which sounds like what you want
p <- ggplot(ggfdata, aes(x=nseqs, y=value, fill = variable)) +
geom_bar(stat='identity') +
facet_wrap(~group, scales="free_x") + ## No need for facet_grid with only one variable
labs(title = paste('Taxonomic Distribution - grouped by',colnames(meta.frame)[i]))
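Since the original data isn't available, here is a small self-contained sketch of the same idea on invented data (the data frame `toy` and its values are assumptions). It shows that converting a numeric `nseqs` to a factor sorts the levels numerically, so the bars end up side by side in ascending order, and that `scales = "free_x"` drops the levels a group doesn't use.

```r
library(ggplot2)

# Invented long-format data: 3 samples per group, 2 taxa per sample
toy <- data.frame(
  group    = rep(c("A", "B"), each = 6),
  nseqs    = rep(c(120, 950, 4000, 300, 2500, 8000), each = 2),
  variable = rep(c("taxon1", "taxon2"), 6),
  value    = c(0.3, 0.7, 0.5, 0.5, 0.6, 0.4, 0.2, 0.8, 0.9, 0.1, 0.4, 0.6)
)

# factor() on a numeric vector orders the levels by numeric value
toy$nseqs <- factor(toy$nseqs)

p <- ggplot(toy, aes(x = nseqs, y = value, fill = variable)) +
  geom_bar(stat = "identity") +
  facet_wrap(~group, scales = "free_x")
```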
I am trying to make a better version of a base R plot with ggplot2, not only to have a common legend but also because I like the ggplot2 styles and customization. My data consists of 3 separate datasets that contain the same two groups of observations for several (but different) treatments. Hence I want to generate 3 separate plots in 1 graph, with a common legend but with different factor levels in each. To illustrate my point, the first image here is what I have generated with base R so far:
I tried to generate a ggplot2 plot with dummy data that has exactly the same structure as my data:
foo<-data.frame(c(letters,letters),c(rep('T1',26),rep('T2',26)),
runif(52),rep(c(rep('Ori1',12),rep('Ori2',8),rep('ori3',6)),2))
names(foo)<-c('Treatment','Type','Count','Origin')
a<-ggplot(foo,aes(x = factor(Treatment),y = Count))
a+ facet_grid(Origin~., scales="free_y", space="free") +
geom_bar(stat="identity",aes(fill=factor(foo$Type)),position="dodge") +
theme_bw()+theme(axis.text.x=element_text(angle=60,hjust=1))+coord_flip()
Which gives me the following undesirable result.
I am aware of the Stack Overflow topics Removing Unused Factors from a Facet in ggplot2 and How can I remove empty factors from ggplot2 facets? However, they do not deal with the clustered bar graphs I am trying to produce here. I suspect the clustered bars are the problem, but I do not know how to solve it. All pointers are welcome.
To illustrate my comment:
a<-ggplot(foo,aes(x = factor(Treatment),y = Count))
a+ facet_wrap(~Origin, scales="free_x") +
geom_bar(stat="identity",aes(fill=factor(Type)),position="dodge") +
theme_bw() +
theme(axis.text.x=element_text(angle=60,hjust=1))
Note that if you add coord_flip and switch to free_y, you get a specific error about coord_flip not working with some types of free scales, which is the source of your problem.
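If horizontal bars are still wanted, one possible workaround (assuming ggplot2 3.3.0 or later, where bar geoms can be drawn in either direction) is to skip coord_flip altogether and put the discrete variable on the y axis. This is only a sketch on the dummy data from the question, rebuilt here so the block runs on its own; the seed and the uniform spelling "Ori3" are my additions.

```r
library(ggplot2)

set.seed(42)
# Same structure as the dummy data in the question
foo <- data.frame(
  Treatment = c(letters, letters),
  Type      = rep(c("T1", "T2"), each = 26),
  Count     = runif(52),
  Origin    = rep(c(rep("Ori1", 12), rep("Ori2", 8), rep("Ori3", 6)), 2)
)

# Discrete variable on y, continuous on x: the bars are drawn horizontally,
# so free_y can drop unused treatments without any coord_flip
p <- ggplot(foo, aes(x = Count, y = Treatment, fill = Type)) +
  geom_col(position = "dodge") +
  facet_grid(Origin ~ ., scales = "free_y", space = "free_y") +
  theme_bw()
```

Here `space = "free_y"` makes each panel's height proportional to its number of treatments, so the bars keep the same thickness across panels.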
I am making a graph in ggplot2 consisting of a set of datapoints plotted as points, with the lines predicted by a fitted model overlaid. The general idea of the graph looks something like this:
names <- c(1,1,1,2,2,2,3,3,3)
xvals <- c(1:9)
yvals <- c(1,2,3,10,11,12,15,16,17)
pvals <- c(1.1,2.1,3.1,11,12,13,14,15,16)
ex_data <- data.frame(names,xvals,yvals,pvals)
ex_data$names <- factor(ex_data$names)
graph <- ggplot(data=ex_data, aes(x=xvals, y=yvals, color=names))
print(graph + geom_point() + geom_line(aes(x=xvals, y=pvals)))
As you can see, both the lines and the points are colored by a categorical variable ('names' in this case). I would like the legend to contain 2 entries: a dot labeled 'Data', and a line labeled 'Fitted' (to denote that the dots are real data and the lines are fits). However, I cannot seem to get this to work. The (awesome) guide here is great for formatting, but doesn't deal with the actual entries, while I have tried the technique here to no avail, i.e.
print(graph + scale_colour_manual("", values=c("green", "blue", "red"))
+ scale_shape_manual("", values=c(19,NA,NA))
+ scale_linetype_manual("",values=c(0,1,1)))
The main trouble is that, in my actual data, there are >200 different categories for 'names,' while I only want the 2 entries I mentioned above in the legend. Doing this with my actual data just produces a meaningless legend that runs off the page, because the legend is trying to be a key for the colors (of which I have way too many).
I'd appreciate any help!
I think this is close to what you want:
ggplot(ex_data, aes(x=xvals, group=names)) +
geom_point(aes(y=yvals, shape='data', linetype='data')) +
geom_line(aes(y=pvals, shape='fitted', linetype='fitted')) +
scale_shape_manual('', values=c(19, NA)) +
scale_linetype_manual('', values=c(0, 1))
The idea is that you specify two aesthetics (linetype and shape) for both lines and points, even though it makes no sense, say, for a point to have a linetype aesthetic. Then you manually map these "nonsense" aesthetics to "null" values (NA and 0 in this case), using a manual scale.
This has been answered already, but based on feedback I got on another question (How can I fix this strange behavior of legend in ggplot2?), this tweak may be helpful to others and may save you headaches (sorry, I couldn't post this as a comment on the previous answer):
ggplot(ex_data, aes(x=xvals, group=names)) +
geom_point(aes(y=yvals, shape='data', linetype='data')) +
geom_line(aes(y=pvals, shape='fitted', linetype='fitted')) +
scale_shape_manual('', values=c('data'=19, 'fitted'=NA)) +
scale_linetype_manual('', values=c('data'=0, 'fitted'=1))
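If the points and lines should keep their per-category colors, a variation that I believe also works is to map color as in the question but switch off only the color legend. This is a sketch rather than a drop-in replacement: it produces two one-entry legends (a dot for "Data" and a line for "Fitted") instead of a single merged one, and the labels "Data" and "Fitted" are simply constant strings mapped to shape and linetype.

```r
library(ggplot2)

# Example data from the question
ex_data <- data.frame(
  names = factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3)),
  xvals = 1:9,
  yvals = c(1, 2, 3, 10, 11, 12, 15, 16, 17),
  pvals = c(1.1, 2.1, 3.1, 11, 12, 13, 14, 15, 16)
)

p <- ggplot(ex_data, aes(x = xvals, colour = names)) +
  geom_point(aes(y = yvals, shape = "Data")) +
  geom_line(aes(y = pvals, linetype = "Fitted")) +
  # Keep the colours on the plot but drop their (potentially huge) legend
  scale_colour_discrete(guide = "none") +
  labs(shape = NULL, linetype = NULL)
```

With more than 200 categories the default color palette will not be distinguishable anyway, but the legend at least stays limited to the two entries.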