Basically, I want to display a barplot grouped by method, i.e. for each method I want to show the number of people who were tested and the number of positive test results found. I also want to display the counts and percentages as labels on the bars. I am trying to do this with ggplot2, but I fail every time.
Any help is appreciated.
Thanks in advance.
I'm not sure I have fully understood your question, but I suggest you take a look at geom_text.
library(ggplot2)
ggplot(df, aes(x = methods, y = percentage)) +
geom_bar(stat = "identity") +
geom_text(aes(label = paste0(round(percentage,2), " (",positive," / ", people,")")), vjust = -0.3, size = 3.5)+
scale_x_discrete(limits = c("NS1", "NS1+IgM", "NS1+IgG","Tourniquet")) +
ylim(0,100)
Data:
df = data.frame(methods = c("NS1", "NS1+IgM","NS1+IgG","Tourniquet"),
people = c(542,542,541,250),
positive = c(505,503,38,93))
df$percentage = df$positive / df$people * 100
> df
methods people positive percentage
1 NS1 542 505 93.17343
2 NS1+IgM 542 503 92.80443
3 NS1+IgG 541 38 7.02403
4 Tourniquet 250 93 37.20000
Does this answer your question? If not, can you clarify it by adding the ggplot code you have tried so far?
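If you want the two counts side by side for each method rather than a single percentage bar, a minimal sketch could look like the following. It assumes the same df as above; long_df, measure, count and label are names I made up for illustration, and the reshaping is done by hand in base R so no extra package is needed.

```r
library(ggplot2)

df <- data.frame(methods = c("NS1", "NS1+IgM", "NS1+IgG", "Tourniquet"),
                 people = c(542, 542, 541, 250),
                 positive = c(505, 503, 38, 93))

# Long format: one row per method and count type
long_df <- data.frame(
  methods = factor(rep(as.character(df$methods), times = 2),
                   levels = as.character(df$methods)),
  measure = rep(c("people", "positive"), each = nrow(df)),
  count   = c(df$people, df$positive)
)

# Label: plain count for "people", count plus percentage for "positive"
pct <- round(df$positive / df$people * 100, 1)
long_df$label <- c(as.character(df$people),
                   paste0(df$positive, " (", pct, "%)"))

# The same dodge width for bars and labels keeps each label over its own bar
dodge <- position_dodge(width = 0.9)
p <- ggplot(long_df, aes(x = methods, y = count, fill = measure)) +
  geom_col(position = dodge) +
  geom_text(aes(label = label), position = dodge, vjust = -0.3, size = 3) +
  ylim(0, 600)
p
```

The fill aesthetic also defines the groups, which is what lets geom_text dodge its labels the same way as the bars.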
I have tried adapting some other solutions written for slightly different situations, but I have not been able to sort this out.
I would like to build a mirrored barplot comparing chemicals with controls, but with results grouped by chemical concentration and (if possible) with both axes showing positive values.
I provide data below, and an example of what I would like it to look like in general.
volatiles<-c("hexenal3", "trans2hexenal", "trans2hexenol", "ethyl2hexanol", "phenethylalcohol", "methylsalicylate", "geraniol", "eugenol")
require(reshape2)
dat<-list(
conc1=data.frame("volatile"=volatiles, "focal"=c(26,27,28,28,31,31,30,28), "control"=c(24,31,30,29,24,23,21,25)),
conc2=data.frame("volatile"=volatiles, "focal"=c(29,18,34,17,30,32,35,27), "control"=c(21,42,20,40,25,16,17,29)),
conc3=data.frame("volatile"=volatiles, "focal"=c(33, 5,38, 7,37,35,40,26), "control"=c(18,51,14, 50,15,12,13,31))
)
long.dat<-melt(dat)
Attempting the following isn't working. Perhaps I should input a different data structure?
ggplot(long.dat, aes(x=L1, group=volatile, y=value, fill=variable)) +
geom_bar(stat="identity", position= "identity")
I would like it to look similar to this, but with the bars grouped in the triads of different concentrations (and, if possible, with all positive values).
Thanks in advance!
Try this:
long.dat$value[long.dat$variable == "focal"] <- -long.dat$value[long.dat$variable == "focal"]
library(ggplot2)
gg <- ggplot(long.dat, aes(interaction(volatile, L1), value)) +
geom_bar(aes(fill = variable), color = "black", stat = "identity") +
scale_y_continuous(labels = abs) +
scale_fill_manual(values = c(control = "#00000000", focal = "blue")) +
coord_flip()
gg
I suspect that the order on the left axis (originally x, but flipped to the left with coord_flip) will be relevant to you. If the current order isn't what you need and using interaction(L1, volatile) instead does not give you the correct order, then you will need to combine the two columns yourself before ggplot(..), convert the result to a factor, and control the levels= so that they are in the order (and string format) you need.
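As a rough sketch of that manual ordering: the block below uses a cut-down stand-in for long.dat (two volatiles, two concentrations, values taken from the question) so that it runs on its own. The names row_id and ord are hypothetical, and the " / " separator is only one possible string format.

```r
library(ggplot2)

# Cut-down stand-in for long.dat, built by hand instead of with melt()
long.dat <- data.frame(
  volatile = rep(c("hexenal3", "trans2hexenal"), times = 4),
  variable = rep(rep(c("focal", "control"), each = 2), times = 2),
  value    = c(26, 27, 24, 31, 29, 18, 21, 42),
  L1       = rep(c("conc1", "conc2"), each = 4)
)

# Negate focal so it is drawn on the other side of zero
is_focal <- long.dat$variable == "focal"
long.dat$value[is_focal] <- -long.dat$value[is_focal]

# Build the axis label by hand and fix its order:
# concentrations are kept together within each volatile
long.dat$row_id <- paste(long.dat$volatile, long.dat$L1, sep = " / ")
ord <- order(long.dat$volatile, long.dat$L1)
long.dat$row_id <- factor(long.dat$row_id,
                          levels = unique(long.dat$row_id[ord]))

p <- ggplot(long.dat, aes(row_id, value)) +
  geom_bar(aes(fill = variable), color = "black", stat = "identity") +
  scale_y_continuous(labels = abs) +
  coord_flip()
p
```

Changing the arguments to order() (for example, L1 first) is enough to regroup the rows by concentration instead.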
Most other aspects can be controlled via + theme(...), such as legend.position="top". I don't know what the asterisks in your demo image might be, but they can likely be added with geom_point (making sure to negate those that should be on the left).
For instance, if you have a $star variable that indicates there should be a star on each particular row,
set.seed(42)
long.dat$star <- sample(c(TRUE,FALSE), prob=c(0.2,0.8), size=nrow(long.dat), replace=TRUE)
head(long.dat)
# volatile variable value L1 star
# 1 hexenal3 focal -26 conc1 TRUE
# 2 trans2hexenal focal -27 conc1 TRUE
# 3 trans2hexenol focal -28 conc1 FALSE
# 4 ethyl2hexanol focal -28 conc1 TRUE
# 5 phenethylalcohol focal -31 conc1 FALSE
# 6 methylsalicylate focal -31 conc1 FALSE
then you can add it with a single geom_point call (and adding the legend move):
gg +
geom_point(aes(y=value + 2*sign(value)), data = ~ subset(., star), pch = 8) +
theme(legend.position = "top")
Like r2evans stated, the only way to do this is to use negative values in your data, and then manually use abs() when labelling. More specifically, it would look something like this:
ggplot(long.dat, aes(x=L1, group=volatile, y=value, fill=variable)) +
geom_bar(stat="identity", position= "identity") +
scale_y_continuous(breaks= c(-25,-15,-5,5,15,25),labels=abs(c(-25,-15,-5,5,15,25)))
Of course, use whatever labels make the most sense for your data, or you can set a sequence of numbers using the seq() function.
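For example, a minimal sketch with seq(): the toy data frame below holds only the hexenal3 values from the question, with focal already negated, and the names toy and brks are made up for illustration.

```r
library(ggplot2)

# hexenal3 values from the question; focal is negated so it mirrors control
toy <- data.frame(conc = rep(c("conc1", "conc2"), each = 2),
                  variable = rep(c("focal", "control"), times = 2),
                  value = c(-26, 24, -29, 21))

# Symmetric breaks from seq(); abs() turns them into all-positive labels
brks <- seq(-30, 30, by = 10)

p <- ggplot(toy, aes(x = conc, y = value, fill = variable)) +
  geom_bar(stat = "identity", position = "identity") +
  scale_y_continuous(breaks = brks, labels = abs(brks))
p
```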
P.S. I also had trouble with your code, so next time please make sure your example is reproducible- you'll get answers quicker!
Note: I found a similar question, for which there was an answer explaining the problem. However, I'm looking for an answer, as opposed to a reason why it's difficult (which I fully understand).
I have data for which I want to create a histogram. This data has a count of 10000 for the bin [0, 200) and a count of 1 for several bins such as [30000, 30200). Both bins are important and need to be visible. For this, I can perform a histogram with the log1p scale.
contig_len <- read.table(data_file, header = FALSE, sep = ",", col.names=c("Length"))
ggplot(contig_len, aes(x = Length)) + geom_histogram(binwidth=200) +
scale_y_continuous(trans="log1p")
This works perfectly! But now, I want to categorise the items in the histogram, as follows:
ggplot(contig_len, aes(x = Length, fill = Prevalence)) +
geom_histogram(binwidth=200, alpha=0.5, position="stack") +
scale_y_continuous(trans = "log1p")
This doesn't work, however, as the stacking is performed without taking the log scale into account. Has anyone found a way around this problem? My data looks like this:
head(contig_len)
Length Prevalence
1 606 Repetitive (<5)
2 888 Non-Repetitive
3 192 Repetitive (<9)
4 9830 Non-Repetitive
5 506 Non-Repetitive
6 850 Non-Repetitive
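One possible workaround, offered only as a sketch and not as an answer from this thread: give up stacking and overlay the categories with position = "identity", so that each bar's height is log1p of that category's own count. The data below is a made-up stand-in for contig_len (random lengths plus two long contigs), since the real file is not available.

```r
library(ggplot2)

# Made-up stand-in for contig_len: many short contigs plus two very long ones
set.seed(1)
contig_len <- data.frame(
  Length = c(sample(0:199, 1000, replace = TRUE), 30050, 45100),
  Prevalence = c(sample(c("Non-Repetitive", "Repetitive (<5)"),
                        1000, replace = TRUE),
                 "Non-Repetitive", "Repetitive (<5)")
)

# position = "identity" overlays the categories instead of stacking them,
# so no heights are added together on the transformed scale
p <- ggplot(contig_len, aes(x = Length, fill = Prevalence)) +
  geom_histogram(binwidth = 200, alpha = 0.5, position = "identity") +
  scale_y_continuous(trans = "log1p")
p
```

If the overlapping bars are hard to read, adding facet_wrap(~ Prevalence, ncol = 1) gives each category its own panel on the same log1p axis.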
I have a grouped barplot produced using ggplot in R with the following code
ggplot(mTogether, aes(x = USuniquNegR, y = value, fill = variable)) +
geom_bar(stat = "identity", position = "dodge") +
scale_fill_discrete(name = "Area",
labels = c("Everywhere", "New York")) +
xlab("Reasons") +
ylab("Proportion of total complaints") +
coord_flip() +
ggtitle("Comparison between NY and all areas")
mTogether is created using the following code
mTogether <- melt(together, id.vars = 'USuniquNegR')
The Data Frame together is made up of
USperReasons USperReasonsNY USuniquNegR
1 0.198343304187759 0.191304347826087 Late Flight
2 0.35987114588127 0.321739130434783 Customer Service Issue
3 0.0667280257708237 0.11304347826087 Lost Luggage
4 0.0547630004601933 0.00869565217391304 Flight Booking Problems
5 0.109065807639208 0.121739130434783 Can't Tell
6 0.00460193281178095 0 Damaged Luggage
7 0.0846755637367694 0.0782608695652174 Cancelled Flight
8 0.0455591348366314 0.0521739130434783 Bad Flight
9 0.0225494707777266 0.0347826086956522 longlines
10 0.0538426138978371 0.0782608695652174 Flight Attendant Complaints
Together can be generated by the following
together<-data.frame(cbind(USperReasons,USperReasonsNY,USuniquNegR))
where
USperReasons <- c(0.19834,0.35987,.06672,0.05476,0.10906,.00460,.08467,0.04555,0.02254,0.05384)
USperReasonsNY <- c(0.191304348,0.321739130,0.113043478,0.008695652,0.121739130,0.000000000,0.078260870,0.05217391,0.034782609,0.078260870)
USuniquNegR <- c("Late Flight","Customer Service Issue","Lost Luggage","Flight Booking Problems","Can't Tell","Damaged Luggage","Cancelled Flight","Bad Flight","longlines","Flight Attendant Complaints")
The problem is that when I try to change the xlim of the ggplot using
+ xlim(0, 1)
I just seem to get an error:
Discrete value supplied to continuous scale
I can't understand why this happens but I need to resolve it because currently the x axis starts below 0 and is very highly packed:
image of ggplot output
The problem is that you are cbind()ing your column vectors together, which converts the numbers to characters. Fix that and the rest should fix itself.
together<-data.frame(USperReasons,USperReasonsNY,USuniquNegR)
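To see the coercion for yourself, here is a small base R sketch. The vectors nums and reasons are shortened stand-ins for the real columns, not the full data.

```r
nums    <- c(0.198, 0.360)
reasons <- c("Late Flight", "Lost Luggage")

# cbind() builds a matrix, and a matrix holds a single type,
# so the numbers are converted to character strings
m <- cbind(nums, reasons)
class(m[, "nums"])   # "character"

# data.frame() keeps each column's own type
ok <- data.frame(nums, reasons)
class(ok$nums)       # "numeric"
```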
You need to remove the cbind from
together<-data.frame(cbind(USperReasons,USperReasonsNY,USuniquNegR))
because str(together) shows that all three columns are factors.
With
together <- data.frame(USperReasons, USperReasonsNY, USuniquNegR)
the plot looks reasonable to me (without having to use ylim or xlim).
So, the error was not within ggplot2 but in data preparation.
Therefore, please, provide a full working example which can be copied, pasted and run when asking a question next time. Thank you.
I'm trying to create a grouped barplot using ggplot because its output looks better. I have a data frame, together, containing the values and the name of each value, but I can't manage to create the plot. The data frame is as follows:
USperReasons USperReasonsNY USuniquNegR
1 0.198343304187759 0.191304347826087 Late Flight
2 0.35987114588127 0.321739130434783 Customer Service Issue
3 0.0667280257708237 0.11304347826087 Lost Luggage
4 0.0547630004601933 0.00869565217391304 Flight Booking Problems
5 0.109065807639208 0.121739130434783 Can't Tell
6 0.00460193281178095 0 Damaged Luggage
7 0.0846755637367694 0.0782608695652174 Cancelled Flight
8 0.0455591348366314 0.0521739130434783 Bad Flight
9 0.0225494707777266 0.0347826086956522 longlines
10 0.0538426138978371 0.0782608695652174 Flight Attendant Complaints
I tried different methods with errors in all, one such example is below
ggplot(together,aes(USuniquNegR, USperReasons,USperReasonsNY))+ geom_bar(position = "dodge")
Thanks,
Alan.
library(ggplot2)
# Melt once, keeping column 3 (USuniquNegR) as the id variable
df <- reshape2::melt(together, 3)
ggplot(df,
aes(USuniquNegR, value, fill = variable)) +
geom_bar(stat = 'identity', position = 'dodge') +
coord_flip() +
theme(legend.position = 'top')
I am trying to place an error bar above each bar of a barplot. I have 6 bars in three groups, and the error bars are being positioned relative to each group, but I want each error bar centered above its own bar. Here's what my data looks like:
NewData
Group Session HeartRate StdError n sd
1 Novices one 71.89276 1.821146 29 9.807170
2 Experts one 66.40705 1.923901 26 9.810008
3 Novices two 71.38609 1.571261 29 8.011889
4 Experts two 67.79910 1.788151 26 9.117818
5 Novices three 71.79759 1.941730 29 10.456534
6 Experts three 67.04564 1.938620 26 9.885061
And here is my code:
plot_2 = ggplot(NewData, aes(x=Session, y=HeartRate, fill=Group)) +
theme_bw() +
geom_bar(position="dodge",stat="identity")+
scale_x_discrete(limits=c("one","two","three")) +
coord_cartesian(ylim = c(50, 80)) +
geom_errorbar(aes(ymin=HeartRate-StdError,ymax=HeartRate+StdError),position="dodge",width=.25)
Here's the output: http://i.imgur.com/BrLB6Px.png
Any help would be appreciated. Thanks!
OK-- I found a solution, not really sure how or why it works, but here's my new code:
dodge <- position_dodge(width=0.9)
plot_2 = ggplot(NewData, aes(x=Session, y=HeartRate, fill=Group)) +
geom_bar(stat="identity", position=dodge)+
scale_x_discrete(limits=c("one","two","three")) +
coord_cartesian(ylim = c(50, 80)) +
geom_errorbar(aes(ymin=HeartRate-StdError,ymax=HeartRate+StdError),position=dodge,width=.25)
And here's the desired result: http://i.imgur.com/PodCeh5.png
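A likely explanation for why this works (my reading, not something confirmed in the thread): position="dodge" dodges each layer by that layer's own width. The bars are 0.9 wide by default, but the error bars were given width=.25, so they only shifted a little around the centre of each group. Sharing one position_dodge(width=0.9) makes both layers use the same offsets. Here is a self-contained sketch, with NewData retyped from the table above and geom_col() used as shorthand for geom_bar(stat="identity"):

```r
library(ggplot2)

NewData <- data.frame(
  Group = rep(c("Novices", "Experts"), times = 3),
  Session = rep(c("one", "two", "three"), each = 2),
  HeartRate = c(71.89276, 66.40705, 71.38609, 67.79910, 71.79759, 67.04564),
  StdError = c(1.821146, 1.923901, 1.571261, 1.788151, 1.941730, 1.938620)
)

# Bars are 0.9 wide by default, so dodge both layers by that same width
dodge <- position_dodge(width = 0.9)

plot_2 <- ggplot(NewData, aes(x = Session, y = HeartRate, fill = Group)) +
  geom_col(position = dodge) +
  geom_errorbar(aes(ymin = HeartRate - StdError, ymax = HeartRate + StdError),
                position = dodge, width = 0.25) +
  scale_x_discrete(limits = c("one", "two", "three")) +
  coord_cartesian(ylim = c(50, 80))
plot_2
```

The width=0.25 on geom_errorbar now only controls how wide the whiskers are drawn, not where they sit.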
It's kind of hard to tell what it is you want to see as output, but from what I gather, perhaps this will do what you want?
geom_errorbar(aes(ymin=HeartRate,ymax=HeartRate+StdError*2),position="dodge",width=.25)
If you want to move the error bars along the x-axis, you need to change the position argument. It takes a position adjustment such as position_dodge(), not arithmetic on a column, so you might try something like...
geom_errorbar(aes(ymin=HeartRate-StdError,ymax=HeartRate+StdError),position=position_dodge(width=0.9),width=.25)