pie chart in ggplot text labelling horror - r

I can't resolve this strange situation. I have an error or a bug somewhere, but after an hour and a half I still can't sort it out.
I have sta_df:
             sta value
1 IN_LIQUIDATION    29
2     LIQUIDATED    47
3      OPERATING   435
4    TRANSFORMED     8
sp <- ggplot(sta_df, aes(x="", y=value, fill=sta)) +
geom_bar(width = 1, stat = "identity", color = "black") +
coord_polar("y") + scale_fill_brewer(palette="Pastel2") +
geom_text(aes(x = seq(1.2,1.4,,4), label = percent(value/sum(value))),
position = position_stack(vjust = 0.5), size=5)
and the labels on the plot go in the wrong direction.
Never mind the strange font in the picture. I've tried many different functions in place of position_stack. For example:
geom_text(aes(x = rep(seq(0.9,1.4,,6),1), y = value/2 + c(0, cumsum(value)[-length(value)])
but it didn't help. Neither did this thread: wrong labeling in ggplot pie chart
When I tried reversing with y=rev(value), the legend no longer corresponded to the data. Setting direction to 1 or -1 does nothing more than reverse everything. Reversing the values in geom_text gives a Pac-Man-like chart. I've updated ggplot2.
Honestly, the problem is that the chart is drawn anti-clockwise although the direction is set to clockwise, while the text labels go in the right direction. Reversing the data in the data.frame doesn't change anything in the plot. Sorry, I'm stuck, but I feel the solution is right there.

The problem occurs when you assign different x-values to your labels in geom_text(). Why? Because you are relying on position_stack() to give you your y-values, and stacking only happens among elements at the same x position. When the labels no longer share the same x, they don't get stacked anymore. If you want to customize the x-values, you will need to compute your own y-values, as described here (Showing data values on stacked bar chart in ggplot2) and here (http://docs.ggplot2.org/current/geom_text.html), near the bottom of the page. By the way, I did all my troubleshooting with coord_polar removed, just looking at the plain bar plot version.
Anyway, here is a partial solution:
library(ggplot2)
library(scales)

sta_df <- read.table(header=TRUE,
text=" sta value
IN_LIQUIDATION 29
LIQUIDATED 47
OPERATING 435
TRANSFORMED 8")

# Share of the total for each status, used for the percentage labels
sta_df$fraction <- sta_df$value / sum(sta_df$value)

sp <- ggplot(sta_df, aes(x="", y=value, fill=sta)) +
  geom_bar(width=1, stat="identity", color="black") +
  scale_fill_brewer(palette="Pastel2") +
  coord_polar(theta="y") +
  # A single constant x keeps all labels in one stack, so position_stack() still applies
  geom_text(aes(x=1.4, label=percent(fraction)),
            position=position_stack(vjust=0.5), size=4)

ggsave("pie_chart.png", plot=sp, height=4, width=6, dpi=150)
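If the staggered x-values from the question are wanted, the y-values have to be computed by hand, as mentioned above. The following is only a sketch of one way to do that, not the original answer's code. It assumes ggplot2 2.2.0 or later, where the first fill level is stacked on top, and that the rows are in the same order as the fill levels; the columns label_x, label_y and label are names made up for this example.

```r
library(ggplot2)

# Data from the question.
sta_df <- data.frame(
  sta   = c("IN_LIQUIDATION", "LIQUIDATED", "OPERATING", "TRANSFORMED"),
  value = c(29, 47, 435, 8)
)

# Assumption: the first fill level sits on top of the stack, so the midpoint
# of each slice is the sum of its own and all later values minus half of its own.
sta_df$label_y <- rev(cumsum(rev(sta_df$value))) - sta_df$value / 2

# Illustrative staggered radii, as in the question.
sta_df$label_x <- seq(1.2, 1.4, length.out = nrow(sta_df))
sta_df$label   <- sprintf("%.1f%%", 100 * sta_df$value / sum(sta_df$value))

sp <- ggplot(sta_df, aes(x = "", y = value, fill = sta)) +
  geom_bar(width = 1, stat = "identity", color = "black") +
  # No position_stack() here: y is already the midpoint of each slice.
  geom_text(aes(x = label_x, y = label_y, label = label), size = 4) +
  coord_polar(theta = "y")
```

Checking the plot with coord_polar() removed first makes it easy to see whether each label sits in the middle of its bar segment.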

Related

How to set background color for each panel in grouped boxplot?

I plotted a grouped boxplot and am trying to change the background color for each panel. I can use the panel.background theme element to change the background of the whole plot, but how can this be done for an individual panel? I found a similar question here, but I failed to adapt the code to my plot.
Top few lines of my input data look like
Code
p <- ggplot(df, aes(x=Genotype, y=Length, fill=Treatment)) +
  scale_fill_manual(values=c("#69b3a2", "#CF7737")) +
  geom_boxplot(width=2.5) +
  theme(text=element_text(size=20),
        panel.spacing.x=unit(0.4, "lines"),
        axis.title.x=element_blank(),
        axis.text.x=element_blank(),
        axis.ticks.x=element_blank(),
        axis.text.y=element_text(angle=90, hjust=1, colour="black")) +
  labs(x="Genotype", y="Petal length (cm)") +
  facet_grid(~divide, scales="free", space="free")
p + theme(panel.background=element_rect(fill="#F6F8F9", colour="#E7ECF1"))
Unfortunately, as with the other theme elements, the fill of element_rect() cannot be mapped to data. You cannot simply pass a vector of colors to fill either (to create your own mapping of sorts). In the end, the simplest solution is probably going to be very similar to the answer you linked to in your question, with a bit of a twist.
I'll use mtcars as an example. Note that I'm converting some of the continuous variables in the dataset to factors so that we can create some more discrete values.
It's important to note, the rect geom is drawn before the boxplot geom, to ensure the boxplot appears on top of the rect.
ggplot(mtcars, aes(factor(carb), disp)) +
geom_rect(
aes(fill=factor(carb)), alpha=0.5,
xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=Inf) +
geom_boxplot() +
facet_grid(~factor(carb), scales='free_x') +
theme_bw()
All done... but not quite. Something is wrong and you might notice this if you pay attention to the boxes on the legend and the gridlines in the plot panels. It looks like the alpha value is incorrect for some facets and okay for others. What's going on here?
Well, this has to do with how geom_rect works. It's drawing a box on each plot panel, but just like the other geoms, it's mapped to the data. Even though the x and y aesthetics for the geom_rect are not actually used to draw the rectangle, they determine how many rectangles are drawn. This means that the number of rectangles drawn in each facet corresponds to the number of rows in the dataset that belong to that facet. If 3 observations exist, 3 rectangles are drawn. If 20 observations exist for one facet, 20 rectangles are drawn, and so on.
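A quick way to see the overdrawing described above is to count the rows per facet in mtcars. The sketch below also estimates the resulting opacity, assuming ordinary alpha compositing of identical layers; the variable names are made up for this example.

```r
# Rows per facet: each row makes geom_rect() draw one more semi-transparent
# rectangle in that panel when the layer inherits the full mtcars data.
rects_per_panel <- table(carb = mtcars$carb)
print(rects_per_panel)

# Assumption: with standard alpha compositing, n stacked layers drawn with
# alpha = 0.5 have an effective opacity of 1 - (1 - 0.5)^n.
effective_alpha <- 1 - (1 - 0.5)^as.vector(rects_per_panel)
names(effective_alpha) <- names(rects_per_panel)
print(round(effective_alpha, 3))
```

Panels with a single row (carb 6 and 8) keep the intended alpha of 0.5, while panels with many rows end up almost opaque, which matches the uneven look of the facets.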
So, the fix is to supply a dataframe that contains one observation each for every facet. We have to then make sure that we supply any and all other aesthetics (x and y here) that are included in the ggplot call, or we will get an error indicating ggplot cannot "find" that particular column. Remember, even if geom_rect doesn't use these for drawing, they are used to determine how many observations exist (and therefore how many to draw).
rect_df <- data.frame(carb=unique(mtcars$carb)) # supply one of each type of carb
# have to give something to disp
rect_df$disp <- 0
ggplot(mtcars, aes(factor(carb), disp)) +
geom_rect(
data=rect_df,
aes(fill=factor(carb)), alpha=0.5,
xmin=-Inf, xmax=Inf, ymin=-Inf, ymax=Inf) +
geom_boxplot() +
facet_grid(~factor(carb), scales='free_x') +
theme_bw()
That's better.

How I can correctly overlap bar and linechart together

I am using the code below:
p <- ggplot() +
geom_bar(data=filter(df, variable=="LA"), aes(x=Gen, y=Mean, fill=Leaf),
stat="identity", position="dodge")+
geom_point(data=filter(df, variable=="TT"),aes(x=Gen, y=Mean, colour=Leaf))+
geom_line(data=filter(df, variable=="TT"), aes(x=Gen, y=Mean, group=Leaf))+
ggtitle("G")+xlab("Genotypes")+ylab("Canopy temperature")+
scale_fill_hue(name="", labels=c("Leaf-1", "Leaf-2", "Leaf-3"))+
scale_y_continuous(sec.axis=sec_axis(~./20, name="2nd Y-axis"))+
theme(axis.text.x=element_text(angle=90, hjust=1), legend.position="top")
graph produced from above code
I want graph like that
data
https://docs.google.com/spreadsheets/d/1Fjmg-l0WTL7jhEqwwtC4RXY_9VQV9GOBliFq_3G1f8I/edit#gid=0
From the data, I want variable LA on the left axis and TT on the right axis.
The part above is resolved.
Now I am trying to add error bars to the bar graph with the code below, but it causes an error. Could someone take a look?
p + geom_errorbar(aes(ymin=Mean-se, ymax=Mean+se), width=0.5,
position=position_dodge(0.9), colour="black", size=.7)
To do this, you need to understand that even with a second y-axis, it is only a relabelling: everything drawn on the graph is still positioned on the main (left) y-axis.
So you need to do two things:
1. Convert anything that should be read against the second y-axis to the scale of the left one. In this case that is the bar scale (the LA variable), whose maximum is 15, so you need to divide the TT values by 20.
2. Label the second axis correctly, so that it shows the main y-axis values multiplied by 20.
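The two steps are inverses of each other, which a small sketch can make explicit. The numbers below are hypothetical stand-ins for the spreadsheet data, and to_primary() and to_secondary() are helper names invented for this example.

```r
# Hypothetical TT means standing in for the real data (line values up to ~300).
tt_mean <- c(250, 296, 180)

# Factor used in the answer: roughly max(TT) / max(LA) = 300 / 15.
scale_factor <- 20

# Step 1: what goes inside aes(y = ...) for the TT layers.
to_primary <- function(y) y / scale_factor
# Step 2: what sec_axis() applies to the primary axis to label the right axis.
to_secondary <- function(y) y * scale_factor

tt_on_primary <- to_primary(tt_mean)
print(tt_on_primary)
print(to_secondary(tt_on_primary))
```

As long as the same factor is used in both directions, the labels on the right axis read back as the original TT values.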
library(ggplot2)
library(dplyr)  # assumption: filter() in the question is dplyr::filter()

p <- ggplot() +
  geom_bar(data=filter(df, variable=="LA"), aes(x=Gen, y=Mean, fill=Leaf),
           stat="identity", position="dodge") +
  # TT values are divided by 20 so they fall in the same range as the bars
  geom_point(data=filter(df, variable=="TT"), aes(x=Gen, y=Mean/20, colour=Leaf)) +
  geom_line(data=filter(df, variable=="TT"), aes(x=Gen, y=Mean/20, group=Leaf)) +
  ggtitle("G") + xlab("Genotypes") + ylab("Canopy temperature") +
  scale_fill_hue(name="", labels=c("Leaf-1", "Leaf-2", "Leaf-3")) +
  # the second axis is multiplied by 20 to show the actual values of the lines and points
  scale_y_continuous(
    sec.axis=sec_axis(trans = ~ . * 20, name="2nd Y-axis",
                      breaks = c(0, 100, 200, 300))) +
  theme(axis.text.x=element_text(angle=90, hjust=1), legend.position="top")
Here is a very basic version of the error bars. You will need to adjust the theme and the graph to make it look good.
p + geom_errorbar(data = filter(df, variable=="TT"),
aes(x = Gen, y=Mean/20, ymin=(Mean-se)/20,
ymax=(Mean+se)/20), width=0.5,
position=position_dodge(0.9), colour="black", size=.7)
One final note: please consider reading the error message, understanding what it says, and consulting the help pages of the packages and functions involved, so you can learn to write all the code yourself.

Adding a "//" on the x-axis to remove whitespace in one side of the ggplot panel plot

I'm wondering whether there's a way to remove whitespace on one side of the panel plot (created by facet_wrap) by adding "//" on the x-axis. Below is sample data and code:
df <- data.frame(
condition = c("cond1","cond2","cond3"),
measure = c("type1","type2"),
value = rep(NA, 6)
)
# all type 1 measure values are between -0.5 and 0.5
# all type 2 measure values are between 1.5 and 2
df[df$measure=="type1",]$value <- runif(3, min=-0.5, max=0.5)
df[df$measure=="type2",]$value <- runif(3, min= 1.5, max=2.0)
# both panels should have same axis tick intervals
custom_breaks = function(x){
seq(round(min(x), 2), round(max(x), 2), 0.2)
}
# create a panel plot with vertical line at y=0 for both panels
ggplot(df, aes(x=condition, y=value, color=measure)) +
geom_point() +
geom_hline(aes(yintercept=0), color="grey") +
scale_y_continuous(breaks=custom_breaks) +
facet_wrap(~measure, scales="free_x") +
coord_flip() +
theme_bw() +
theme(panel.grid.major=element_blank(), panel.grid.minor=element_blank())
This code returns the below plot:
Because the values for type 2 (right panel) are far off from zero, adding a vertical line at y=0 results in lots of whitespace. I'm wondering if there's a way to put a "//" on the x-axis on the right panel after 0 and going straight to 1.5 so there aren't tons of wasted white space. Any help would be greatly appreciated!
Broken axes are generally discouraged because they can lead to misleading visualizations, so this is intentionally not implemented in ggplot2 (as answered by Hadley Wickham himself).
My preferred solutions for something like this are (a) faceting (which you are already doing) or (b) log transformation of the axis, but only if it makes sense for the given data.
Take this bar chart for example (source / link to image): since there is valuable information in the outliers (red circle and arrows), both log transformation and broken axes would distort the representation of reality. The ggforce package implements zoom facets for such cases with the facet_zoom() function.
Your scales = "free_x" is working just fine - the issue is that your geom_hline putting a line at 0 is included in both facets. Here's a way to include it only on the first facet.
ggplot(df, aes(x=condition, y=value, color=measure)) +
geom_point() +
geom_hline(data = data.frame(measure = "type1"), aes(yintercept=0), color="grey") +
scale_y_continuous(breaks=custom_breaks) +
facet_wrap(~measure, scales="free_x") +
coord_flip() +
theme_bw() +
theme(panel.grid.major=element_blank(), panel.grid.minor=element_blank())

Rescale y axis ggplot (geom_bar)

This should be a very simple question!
I would like to make a barplot with errorbars and I'm using the following code:
ggplot(data = bars, aes(x=c("1","2","3"), y=V2, fill = names)) +
geom_bar(position=position_dodge(), stat="identity", alpha = 0.7) +
geom_errorbar(aes(ymin=V1, ymax=V3))+
theme(legend.position='none')+
coord_cartesian(ylim=c(0,10))
However, I have 2 problems:
1. I would like the bars to start at y = 0
2. I don't like the ticks on the y axis. I would like numbers with just one decimal place and fewer ticks.
this is my actual plot: Bars with error bars
For the first problem (if I understand it correctly) you can use ylim
... + ylim(0.2, NA)
NA leaves the upper bound free.
For the second, I suggest using pretty_breaks from the scales package:
library(scales)
... + scale_y_continuous(breaks=pretty_breaks(n=5))
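The request for one decimal place is not covered by pretty_breaks alone. As a rough sketch in base R: as far as I know, pretty_breaks() delegates to pretty(), so the breaks can be previewed directly, and a small labelling function handles the decimals. The limits and the helper name one_decimal are assumptions made for this example.

```r
# Hypothetical y range, matching coord_cartesian(ylim = c(0, 10)) in the question.
limits <- c(0, 10)

# n is a target number of intervals, not a guarantee.
breaks <- pretty(limits, n = 5)
print(breaks)

# Hypothetical helper: format tick labels with exactly one decimal place.
one_decimal <- function(x) sprintf("%.1f", x)
print(one_decimal(breaks))
```

The helper can then be passed to the scale, for example scale_y_continuous(breaks=pretty_breaks(n=5), labels=one_decimal), since the labels argument accepts a function that takes the breaks and returns character labels.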

ggplot2 density histogram with width=.5, vline and centered bar positions

I want a nice density (that sums to 1) histogram for some discrete data. I have tried a couple of ways to do this, but none were entirely satisfactory.
Generate some data:
#data
set.seed(-999)
d.test = data.frame(score = round(rnorm(100,1)))
mean.score = mean(d.test[,1])
d1 = as.data.frame(prop.table(table(d.test)))
The first gives the right placement of bars -- centered on top of the number -- but the wrong placement of vline(). This is because the x-axis is discrete (factor) and so the mean is plotted using the number of levels, not the values. The mean value is .89.
ggplot(data=d1, aes(x=d.test, y=Freq)) +
geom_bar(stat="identity", width=.5) +
geom_vline(xintercept=mean.score, color="blue", linetype="dashed")
The second gives the correct vline() placement (because the x-axis is continuous), but wrong placement of bars and the width parameter does not appear to be modifiable when x-axis is continuous (see here). I also tried the size parameter which also has no effect. Ditto for hjust.
ggplot(d.test, aes(x=score)) +
geom_histogram(aes(y=..count../sum(..count..)), width=.5) +
geom_vline(xintercept=mean.score, color="blue", linetype="dashed")
Any ideas? My bad idea is to rescale the mean so that it fits with the factor levels and use the first solution. This won't work well in case some of the factor levels are 'missing', e.g. 1, 2, 4 with no factor for 3 because no datapoint had that value. If the mean is 3.5, rescaling this is odd (x-axis is no longer an interval scale).
Another idea is this:
ggplot(d.test, aes(x=score)) +
stat_bin(binwidth=.5, aes(y= ..density../sum(..density..)), hjust=-.5) +
scale_x_continuous(breaks = -2:5) + #add ticks back
geom_vline(xintercept=mean.score, color="blue", linetype="dashed")
But this requires adjusting the breaks, and the bars are still in the wrong positions (not centered). Unfortunately, hjust does not appear to work.
How do I get everything I want?
density sums to 1
bars centered above values
vline() at the correct number
width=.5
With base graphics, one could perhaps solve this problem by plotting twice on the x-axis. Is there some similar way here?
It sounds like you just want to make sure that your x-axis values are numeric rather than factors:
ggplot(data=d1, aes(x=as.numeric(as.character(d.test)), y=Freq)) +
geom_bar(stat="identity", width=.5) +
geom_vline(xintercept=mean.score, color="blue", linetype="dashed") +
scale_x_continuous(breaks=-2:3)
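The as.character() step in the conversion matters, because calling as.numeric() directly on a factor returns the internal level codes rather than the original values. A minimal illustration, with made-up scores:

```r
# A factor stores integer level codes; the labels are kept separately.
scores <- factor(c(-1, 0, 2, 2))

codes  <- as.numeric(scores)                # level codes, not the values
values <- as.numeric(as.character(scores))  # the original numbers

print(codes)
print(values)
```

Here the codes are 1, 2, 3, 3 while the recovered values are -1, 0, 2, 2. This is the same mismatch that put the mean line in the wrong place on the discrete axis.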
