How can I correctly overlay a bar chart and a line chart? - r

I am using the code below:
p <- ggplot() +
geom_bar(data=filter(df, variable=="LA"), aes(x=Gen, y=Mean, fill=Leaf),
stat="identity", position="dodge")+
geom_point(data=filter(df, variable=="TT"),aes(x=Gen, y=Mean, colour=Leaf))+
geom_line(data=filter(df, variable=="TT"), aes(x=Gen, y=Mean, group=Leaf))+
ggtitle("G")+xlab("Genotypes")+ylab("Canopy temperature")+
scale_fill_hue(name="", labels=c("Leaf-1", "Leaf-2", "Leaf-3"))+
scale_y_continuous(sec.axis=sec_axis(~./20, name="2nd Y-axis"))+
theme(axis.text.x=element_text(angle=90, hjust=1), legend.position="top")
This is the graph produced by the code above.
I want a graph like this.
data
https://docs.google.com/spreadsheets/d/1Fjmg-l0WTL7jhEqwwtC4RXY_9VQV9GOBliFq_3G1f8I/edit#gid=0
From the data, I want the variable LA on the left axis and TT on the right axis.
The part above is resolved.
Now I am trying to put error bars on the bar graph with the code below, but it causes an error. Can someone have a look and suggest a solution?
p + geom_errorbar(aes(ymin=Mean-se, ymax=Mean+se), width=0.5,
position=position_dodge(0.9), colour="black", size=.7)

To do this, you need to understand that even though you have a second y-axis, it is only a set of labels: everything drawn on the graph is still based on the main (left) y-axis.
So you need to do two things:
Convert anything that should be read against the second y-axis to the scale of the left one. Here that is the bar scale (the LA values), whose maximum is 15, so you need to divide the TT values by 20.
Label the second axis correctly, so that it shows the main y-axis multiplied by 20.
p <- ggplot() +
geom_bar(data=filter(df, variable=="LA"), aes(x=Gen, y=Mean, fill=Leaf),
stat="identity", position="dodge") +
# values are divided by 20 to fall in the same range as the bars
geom_point(data=filter(df, variable=="TT"),aes(x=Gen, y=Mean/20, colour=Leaf))+
geom_line(data=filter(df, variable=="TT"), aes(x=Gen, y=Mean/20, group=Leaf))+
ggtitle("G")+xlab("Genotypes")+ylab("Canopy temperature")+
scale_fill_hue(name="", labels=c("Leaf-1", "Leaf-2", "Leaf-3"))+
# second axis is multiplied by 20 to reflect the actual values of the lines & points
scale_y_continuous(
sec.axis=sec_axis(trans = ~ . * 20, name="2nd Y-axis",
breaks = c(0, 100, 200, 300))) +
theme(axis.text.x=element_text(angle=90, hjust=1), legend.position="top")
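The factor of 20 above is picked by eye. A minimal sketch of deriving it from the data instead, using made-up numbers (the vectors `la` and `tt` are hypothetical stand-ins for the LA and TT means, not the asker's data):

```r
# Hypothetical values standing in for the LA (bars) and TT (lines) means
la <- c(12.1, 14.8, 9.6)
tt <- c(250, 290, 205)

# Factor that maps the TT range onto the LA range
scale_factor <- max(tt) / max(la)

# Values to draw against the primary (left) axis
tt_scaled <- tt / scale_factor

# The secondary axis would apply the inverse, e.g. sec_axis(~ . * scale_factor),
# so its labels read in the original TT units again
tt_back <- tt_scaled * scale_factor
```

With this approach the division in `aes()` and the multiplication in `sec_axis()` always use the same number, so the two cannot drift apart.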
The error bars below are very basic; you will need to adjust the theme and the graph to make them look good.
p + geom_errorbar(data = filter(df, variable=="TT"),
aes(x = Gen, y=Mean/20, ymin=(Mean-se)/20,
ymax=(Mean+se)/20), width=0.5,
position=position_dodge(0.9), colour="black", size=.7)
One final note: please read the error message, understand what it says, and refer to the help documentation for the R packages and functions involved, so that you can learn to write this code yourself.

Related

How to set background color for each panel in grouped boxplot?

I plotted a grouped boxplot and am trying to change the background color of each panel. I can use the panel.background theme element to change the background of the whole plot, but how can this be done for an individual panel? I found a similar question here, but I failed to adapt the code to my plot.
The top few lines of my input data look like this:
Code
p <- ggplot(df, aes(x=Genotype, y=Length, fill=Treatment)) +
scale_fill_manual(values=c("#69b3a2", "#CF7737")) +
geom_boxplot(width=2.5) +
theme(text=element_text(size=20),
panel.spacing.x=unit(0.4, "lines"),
axis.title.x=element_blank(),
axis.text.x=element_blank(),
axis.ticks.x=element_blank(),
axis.text.y=element_text(angle=90, hjust=1, colour="black")) +
labs(x="Genotype", y="Petal length (cm)") +
facet_grid(~divide, scales="free", space="free")
p + theme(panel.background=element_rect(fill="#F6F8F9", colour="#E7ECF1"))
Unfortunately, like the other theme elements, the fill aesthetic of element_rect() cannot be mapped to data. You cannot just send a vector of colors to fill either (create your own mapping of sorts). In the end, the simplest solution probably is going to be very similar to the answer you linked to in your question... with a bit of a twist here.
I'll use mtcars as an example. Note that I'm converting some of the continuous variables in the dataset to factors so that we can create some more discrete values.
It's important to note, the rect geom is drawn before the boxplot geom, to ensure the boxplot appears on top of the rect.
ggplot(mtcars, aes(factor(carb), disp)) +
geom_rect(
aes(fill=factor(carb)), alpha=0.5,
xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=Inf) +
geom_boxplot() +
facet_grid(~factor(carb), scales='free_x') +
theme_bw()
All done... but not quite. Something is wrong and you might notice this if you pay attention to the boxes on the legend and the gridlines in the plot panels. It looks like the alpha value is incorrect for some facets and okay for others. What's going on here?
Well, this has to do with how geom_rect works. It's drawing a box on each plot panel, but just like the other geoms, it's mapped to the data. Even though the x and y aesthetics for the geom_rect are actually not used to draw the rectangle, they are used to indicate how many of each rectangle are drawn. This means that the number of rectangles drawn in each facet corresponds to the number of lines in the dataset which exist for that facet. If 3 observations exist, 3 rectangles are drawn. If 20 observations exist for one facet, 20 rectangles are drawn, etc.
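To make the overdrawing concrete, here is a small sketch that counts the rows per facet in mtcars and estimates how opaque the stacked rectangles end up. It assumes plain alpha compositing, where n layers of opacity a combine to 1 - (1 - a)^n; the variable names are illustrative only.

```r
# One rectangle is drawn per row, so count the rows in each carb facet
rects_per_facet <- vapply(split(mtcars$disp, mtcars$carb), length, integer(1))
print(rects_per_facet)

# Assumed model: n overlapping layers with alpha = 0.5 combine to 1 - (1 - 0.5)^n
alpha <- 0.5
effective_opacity <- 1 - (1 - alpha)^rects_per_facet
print(round(effective_opacity, 3))
```

Facets with a single row keep the intended 0.5, while facets with many rows come out almost fully opaque, which matches the uneven look described above.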
So, the fix is to supply a dataframe that contains one observation each for every facet. We have to then make sure that we supply any and all other aesthetics (x and y here) that are included in the ggplot call, or we will get an error indicating ggplot cannot "find" that particular column. Remember, even if geom_rect doesn't use these for drawing, they are used to determine how many observations exist (and therefore how many to draw).
rect_df <- data.frame(carb=unique(mtcars$carb)) # supply one of each type of carb
# have to give something to disp
rect_df$disp <- 0
ggplot(mtcars, aes(factor(carb), disp)) +
geom_rect(
data=rect_df,
aes(fill=factor(carb)), alpha=0.5,
xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=Inf) +
geom_boxplot() +
facet_grid(~factor(carb), scales='free_x') +
theme_bw()
That's better.

Adding a "//" on the x-axis to remove whitespace in one side of the ggplot panel plot

I'm hoping if there's a way to remove whitespace in one side of the panel plot (created by facet_wrap) by adding "//" on the x-axis. Below is sample data and code:
df <- data.frame(
condition = c("cond1","cond2","cond3"),
measure = c("type1","type2"),
value = rep(NA, 6)
)
# all type 1 measure values are between -0.5 and 0.5
# all type 2 measure values are between 1.5 and 2
df[df$measure=="type1",]$value <- runif(3, min=-0.5, max=0.5)
df[df$measure=="type2",]$value <- runif(3, min= 1.5, max=2.0)
# both panels should have same axis tick intervals
custom_breaks = function(x){
seq(round(min(x), 2), round(max(x), 2), 0.2)
}
# create a panel plot with vertical line at y=0 for both panels
ggplot(df, aes(x=condition, y=value, color=measure)) +
geom_point() +
geom_hline(aes(yintercept=0), color="grey") +
scale_y_continuous(breaks=custom_breaks) +
facet_wrap(~measure, scales="free_x") +
coord_flip() +
theme_bw() +
theme(panel.grid.major=element_blank(), panel.grid.minor=element_blank())
This code returns the below plot:
Because the values for type 2 (right panel) are far off from zero, adding a vertical line at y=0 results in lots of whitespace. I'm wondering if there's a way to put a "//" on the x-axis on the right panel after 0 and going straight to 1.5 so there aren't tons of wasted white space. Any help would be greatly appreciated!
Broken axes are generally discouraged because they can lead to misleading visualizations, so this is intentionally not implemented in ggplot2 (as answered by Hadley Wickham himself).
My preferred solutions for something like this are (a) faceting (which you are already doing) or (b) a log transformation of the axis, but only if it makes sense for the given data.
Take this bar chart for example (source / link to image): since there is valuable information in the outliers (red circle and arrows), both a log transformation and a broken axis would distort the representation of reality. The ggforce package implements zoom facets for such cases with the facet_zoom() function.
Your scales = "free_x" is working just fine - the issue is that your geom_hline putting a line at 0 is included in both facets. Here's a way to include it only on the first facet.
ggplot(df, aes(x=condition, y=value, color=measure)) +
geom_point() +
geom_hline(data = data.frame(measure = "type1"), aes(yintercept=0), color="grey") +
scale_y_continuous(breaks=custom_breaks) +
facet_wrap(~measure, scales="free_x") +
coord_flip() +
theme_bw() +
theme(panel.grid.major=element_blank(), panel.grid.minor=element_blank())

Overlay points (and error bars) over bar plot with position_dodge

I have been trying to look for an answer to my particular problem but I have not been successful, so I have just made a MWE to post here.
I tried the answers here with no success.
The task I want to do seems easy enough, but I cannot figure it out, and the results I get are making me have some fundamental questions...
I just want to overlay points and error bars on a bar plot, using ggplot2.
I have a long-format data frame that looks like the following:
mydf <- data.frame(cell=paste0("cell", rep(1:3, each=12)),
scientist=paste0("scientist", rep(rep(rep(1:2, each=3), 2), 3)),
timepoint=paste0("time", rep(rep(1:2, each=6), 3)),
rep=paste0("rep", rep(1:3, 12)),
value=runif(36)*100)
I have attempted to get the plot I want the following way:
library(ggplot2)
library(RColorBrewer) # provides brewer.pal()
myPal <- brewer.pal(3, "Set2")[1:2]
myPal2 <- brewer.pal(3, "Set1")
outfile <- "test.pdf"
pdf(file=outfile, height=10, width=10)
print(#or ggsave()
ggplot(mydf, aes(cell, value, fill=scientist )) +
geom_bar(stat="identity", position=position_dodge(.9)) +
geom_point(aes(cell, color=rep), position=position_dodge(.9), size=5) +
facet_grid(timepoint~., scales="free_x", space="free_x") +
scale_y_continuous("% of total cells") +
scale_fill_manual(values=myPal) +
scale_color_manual(values=myPal2)
)
dev.off()
But I obtain this:
The problem is, there should be 3 "rep" values per "scientist" bar, but the values are ordered by "rep" instead (they should be 1,2,3,1,2,3, instead of 1,1,2,2,3,3).
Besides, I would like to add error bars with geom_errorbar but I didn't manage to get a working example...
Furthermore, overlying actual value points to the bars, it is making me wonder what is actually being plotted here... if the values are taken properly for each bar, and why the max value (or so it seems) is plotted by default.
The way I think this should be properly plotted is with the median (or mean), adding the error bars like the whiskers in a boxplot (min and max value).
Any idea how to...
... have the "rep" value points appear in proper order?
... change the value shown by the bars from max to median?
... add error bars with max and min values?
I restructured your plotting code a little to make things easier.
The secret is to use proper grouping (which is otherwise inferred from fill and color). Also, since you're dodging on multiple levels, dodge2 has to be used.
When you are unsure about "what is plotted where" in bar/column charts, it's always helpful to add the option color="black", which reveals that things are still stacked on top of each other because of your use of dodge instead of dodge2.
p = ggplot(mydf, aes(x=cell, y=value, group=paste(scientist,rep))) +
geom_col(aes(fill=scientist), position=position_dodge2(.9)) +
geom_point(aes(cell, color=rep), position=position_dodge2(.9), size=5) +
facet_grid(timepoint~., scales="free_x", space="free_x") +
scale_y_continuous("% of total cells") +
scale_fill_brewer(palette = "Set2")+
scale_color_brewer(palette = "Set1")
ggsave(filename = outfile, plot=p, height = 10, width = 10)
gives:
Regarding error bars
Since there are only three replicates, I would show the original data points and maybe a violin plot. For completeness' sake I also added a geom_errorbar.
ggplot(mydf, aes(x=cell, y=value,group=paste(cell,scientist))) +
geom_violin(aes(fill=scientist),position=position_dodge(),color="black") +
geom_point(aes(cell, color=rep), position=position_dodge(0.9), size=5) +
geom_errorbar(stat="summary",position=position_dodge())+
facet_grid(timepoint~., scales="free_x", space="free_x") +
scale_y_continuous("% of total cells") +
scale_fill_brewer(palette = "Set2")+
scale_color_brewer(palette = "Set1")
gives
Update after comment
As I mentioned in my comment below, the stacking of the percentages leads to an undesirable outcome.
ggplot(mydf, aes(x=paste(cell, scientist), y=value)) +
geom_bar(aes(fill=rep),stat="identity", position=position_stack(),color="black") +
geom_point(aes(color=rep), position=position_dodge(.9), size=3) +
facet_grid(timepoint~., scales="free_x", space="free_x") +
scale_y_continuous("% of total cells") +
scale_fill_brewer(palette = "Set2")+
scale_color_brewer(palette = "Set1")
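The question also asks for bars showing the median with whiskers at the min and max. None of the code above does exactly that, so here is a minimal sketch of one way to get there: summarise first, then plot the summary. The data frame is rebuilt as in the question (with a fixed seed added here), the name `summ` is made up for illustration, and the plotting part is guarded so the summary step runs even without ggplot2 installed.

```r
set.seed(1)  # seed added here only to make the sketch reproducible
mydf <- data.frame(cell = paste0("cell", rep(1:3, each = 12)),
                   scientist = paste0("scientist", rep(rep(rep(1:2, each = 3), 2), 3)),
                   timepoint = paste0("time", rep(rep(1:2, each = 6), 3)),
                   rep = paste0("rep", rep(1:3, 12)),
                   value = runif(36) * 100)

# One row per cell/scientist/timepoint with median, min and max of the replicates
summ <- aggregate(value ~ cell + scientist + timepoint, data = mydf,
                  FUN = function(v) c(median = median(v), min = min(v), max = max(v)))
# aggregate() returns a matrix column; flatten it into
# value.median, value.min and value.max
summ <- do.call(data.frame, summ)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(summ, aes(x = cell, y = value.median, fill = scientist)) +
    geom_col(position = position_dodge(0.9)) +
    geom_errorbar(aes(ymin = value.min, ymax = value.max),
                  position = position_dodge(0.9), width = 0.3) +
    facet_grid(timepoint ~ .)
}
```

Because each bar now corresponds to exactly one row, there is no ambiguity about which replicate the bar height comes from.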

pie chart in ggplot text labelling horror

I can't resolve this strange situation. Somewhere I have an error, or a bug, but after sitting over it for an hour and a half I still could not deal with it.
I have: sta_df
sta value
1 IN_LIQUIDATION 29
2 LIQUIDATED 47
3 OPERATING 435
4 TRANSFORMED 8
sp <- ggplot(sta_df, aes(x="", y=value, fill=sta)) +
geom_bar(width = 1, stat = "identity", color = "black") +
coord_polar("y") + scale_fill_brewer(palette="Pastel2") +
geom_text(aes(x = seq(1.2,1.4,,4), label = percent(value/sum(value))),
position = position_stack(vjust = 0.5), size=5)
and the plot has the wrong direction of labelling.
Never mind the strange font in the picture. I've tried many different functions instead of position_stack. For example:
geom_text(aes(x = rep(seq(0.9,1.4,,6),1), y = value/2 + c(0, cumsum(value)[-length(value)])
but it didn't help. Neither did this thread: wrong labeling in ggplot pie chart
When I tried to reverse the values with y=rev(value), the legend no longer corresponded to the data. Setting direction to 1 or -1 does no more than reverse everything. Reversing the values in geom_text gives a Pac-Man-like chart. I've updated ggplot2.
Honestly, the problem is that the chart is drawn anti-clockwise although the direction is set to clockwise, while the text labels run in the right direction. Reversing the data in the data.frame doesn't change anything in the plot. Sorry, I'm stuck, but I feel the solution is right there.
The problem occurs when you assign different x-values to your labels in geom_text(). Why? Because you are relying on position_stack() to give you your y-values. But when the points no longer share the same x, they don't get 'stacked' anymore. If you want to customize the x-values, you will need to compute your own y-values, as described here (Showing data values on stacked bar chart in ggplot2) and here (http://docs.ggplot2.org/current/geom_text.html) near the bottom of the page. By the way, I did all my troubleshooting with coord_polar removed, just looking at the plain barplot version.
Anyway, here is a partial solution:
sta_df <- read.table(header=TRUE,
text=" sta value
IN_LIQUIDATION 29
LIQUIDATED 47
OPERATING 435
TRANSFORMED 8")
library(ggplot2)
library(scales)
sta_df$fraction = sta_df$value / sum(sta_df$value)
sp <- ggplot(sta_df, aes(x="", y=value, fill=sta)) +
geom_bar(width=1, stat="identity", color="black") +
scale_fill_brewer(palette="Pastel2") +
coord_polar(theta="y") +
geom_text(aes(x=1.4, label=percent(fraction)),
position=position_stack(vjust=0.5), size=4)
ggsave("pie_chart.png", plot=sp, height=4, width=6, dpi=150)
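If the labels really need different x-values, the y-positions have to be computed by hand, as noted above. A rough sketch follows, under the assumption that the installed ggplot2 stacks the first factor level at the top of the bar (so the last row sits at the bottom); the columns `ymid`, `label` and `label_x` are names invented for this example.

```r
sta_df <- data.frame(
  sta = c("IN_LIQUIDATION", "LIQUIDATED", "OPERATING", "TRANSFORMED"),
  value = c(29, 47, 435, 8)
)

# Assumed stacking order: last row at the bottom, first row at the top.
# Upper edge of each segment, minus half its height, gives the midpoint.
sta_df$ymid <- rev(cumsum(rev(sta_df$value))) - sta_df$value / 2

# Percentage labels without needing the scales package
sta_df$label <- sprintf("%.1f%%", 100 * sta_df$value / sum(sta_df$value))

# Staggered x-positions, as in the question
sta_df$label_x <- seq(1.2, 1.4, length.out = nrow(sta_df))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  sp <- ggplot(sta_df, aes(x = "", y = value, fill = sta)) +
    geom_col(width = 1, color = "black") +
    coord_polar(theta = "y") +
    geom_text(aes(x = label_x, y = ymid, label = label), size = 4)
}
```

If the labels land on the wrong slices, the stacking assumption does not hold for your version; in that case `cumsum(value) - value / 2` gives the midpoints for the opposite order.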

How to make dodge in geom_bar agree with dodge in geom_errorbar, geom_point

I have a dataset where measurements are made for different groups at different days.
I want to have side by side bars representing the measurements at the different days for the different groups with the groups of bars spaced according to day of measurement with errorbars overlaid to them.
I'm having trouble with making the dodging in geom_bar agree with the dodge on geom_errorbar.
Here is a simple piece of code:
days = data.frame(day=c(0,1,8,15));
groups = data.frame(group=c("A","B","C","D", "E"), means=seq(0,1,length=5));
my_data = merge(days, groups);
my_data$mid = exp(my_data$means+rnorm(nrow(my_data), sd=0.25));
my_data$sigma = 0.1;
png(file="bar_and_errors_example.png", height=900, width=1200);
plot(ggplot(my_data, aes(x=day, weight=mid, ymin=mid-sigma, ymax=mid+sigma, fill=group)) +
geom_bar (position=position_dodge(width=0.5)) +
geom_errorbar (position=position_dodge(width=0.5), colour="black") +
geom_point (position=position_dodge(width=0.5), aes(y=mid, colour=group)));
dev.off();
In the plot, each error segment appears at a fixed offset from its bar (sorry, no plots allowed for newbies even if ggplot2 is the subject).
When binwidth is adjusted in geom_bar, the offset is no longer fixed and changes from day to day.
Notice that geom_errorbar and geom_point dodge in tandem.
How do I get geom_bar to agree with the other two?
Any help appreciated.
The alignment problems are due, in part, to your bars not representing the data you intend. The following lines up correctly:
ggplot(my_data, aes(x=day, weight=mid, ymin=mid-sigma, ymax=mid+sigma, fill=group)) +
geom_bar (position=position_dodge(), aes(y=mid), stat="identity") +
geom_errorbar (position=position_dodge(width=0.9), colour="black") +
geom_point (position=position_dodge(width=0.9), aes(y=mid, colour=group))
This is an old question, but since I ran into the problem today, I want to add the following:
In
geom_bar(position = position_dodge(width=0.9), stat = "identity") +
geom_errorbar( position = position_dodge(width=0.9), colour="black")
the width-argument within position_dodge controls the dodging width of the things to dodge from each other. However, this produces whiskers as wide as the bars, which is possibly undesired.
An additional width-argument outside the position_dodge controls the width of the whiskers (and bars):
geom_bar(position = position_dodge(width=0.9), stat = "identity", width=0.7) +
geom_errorbar( position = position_dodge(width=0.9), colour="black", width=0.3)
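To see why the dodge width has to match across layers while the element width can differ, here is a simplified model of the geometry. It is only a sketch of the idea, not ggplot2's actual implementation, and `dodge_centers` is a hypothetical helper: the dodge width is split into n equal slots and each element is centred in its slot.

```r
# Hypothetical helper: centres of n dodged elements around position x
dodge_centers <- function(x, n, dodge_width) {
  x + dodge_width * ((seq_len(n) - 0.5) / n - 0.5)
}

# Five groups at x = 1, bars dodged by 0.9
bars <- dodge_centers(1, n = 5, dodge_width = 0.9)

# Error bars dodged by 0.9 line up with the bars...
errs_same <- dodge_centers(1, n = 5, dodge_width = 0.9)

# ...while a different dodge width shifts every centre except the middle one
errs_diff <- dodge_centers(1, n = 5, dodge_width = 0.5)

print(rbind(bars, errs_same, errs_diff))
```

In this model the centres depend only on the dodge width, which is why the separate width argument can narrow the whiskers without moving them off the bars.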
As a first change, I reformatted the code according to the Advanced R style guide.
days <- data.frame(day=c(0,1,8,15))
groups <- data.frame(
group=c("A","B","C","D", "E"),
means=seq(0,1,length=5)
)
my_data <- merge(days, groups)
my_data$mid <- exp(my_data$means+rnorm(nrow(my_data), sd=0.25))
my_data$sigma <- 0.1
Now when we look at the data, we see that day is stored as a number and everything else is unchanged.
str(my_data)
To remove blank space from the plot I converted the day column to a factor. CHECK that the levels are in the proper order before proceeding.
my_data$day <- as.factor(my_data$day)
levels(my_data$day)
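The level check matters because the order depends on the type being converted. A short sketch with the same day values (the variable names are made up for illustration):

```r
day_num <- c(0, 1, 8, 15)

# Numeric input: levels follow numeric order
lv_num <- levels(as.factor(day_num))
print(lv_num)

# Character input (e.g. after reading from a file): levels are sorted as text
day_chr <- as.character(day_num)
lv_chr <- levels(as.factor(day_chr))
print(lv_chr)

# Setting the levels explicitly removes the guesswork
day_fixed <- factor(day_chr, levels = c("0", "1", "8", "15"))
print(levels(day_fixed))
```

In the answer's data the column is numeric, so as.factor() should give the intended order, but passing `levels` explicitly is the safer habit.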
The next change I made was defining y in your aes arguments. As I'm sure you are aware, this lets ggplot know where to look for y values. Then I changed the position argument to "dodge" and added the stat="identity" argument, which tells ggplot to plot the y values as given instead of counting rows. geom_errorbar is given position_dodge() without a width; the dodge then falls back to the error bar's own width, which defaults to 0.9 and so matches the bars. geom_point has no width of its own, so you must specify the value explicitly with position_dodge(.9).
ggplot(data = my_data,
aes(x=day,
y= mid,
ymin=mid-sigma,
ymax=mid+sigma,
fill=group)) +
geom_bar(position="dodge", stat = "identity") +
geom_errorbar( position = position_dodge(), colour="black") +
geom_point(position=position_dodge(.9), aes(y=mid, colour=group))
Sometimes aes(x=tasks, y=val, fill=group) is placed in geom_bar() rather than in ggplot(). This causes the problem, since the other layers then do not see the fill mapping that defines the groups to dodge.

Resources