Create a grouped barplot in R using ggplot - r

I'm trying to create a grouped barplot with ggplot because its output looks better. I have a data frame, together, containing the values and the name of each value, but I can't manage to create the plot. The data frame is as follows:
USperReasons USperReasonsNY USuniquNegR
1 0.198343304187759 0.191304347826087 Late Flight
2 0.35987114588127 0.321739130434783 Customer Service Issue
3 0.0667280257708237 0.11304347826087 Lost Luggage
4 0.0547630004601933 0.00869565217391304 Flight Booking Problems
5 0.109065807639208 0.121739130434783 Can't Tell
6 0.00460193281178095 0 Damaged Luggage
7 0.0846755637367694 0.0782608695652174 Cancelled Flight
8 0.0455591348366314 0.0521739130434783 Bad Flight
9 0.0225494707777266 0.0347826086956522 longlines
10 0.0538426138978371 0.0782608695652174 Flight Attendant Complaints
I tried different methods, all of which gave errors. One example is below:
ggplot(together,aes(USuniquNegR, USperReasons,USperReasonsNY))+ geom_bar(position = "dodge")
Thanks,
Alan.

library(ggplot2)
# Reshape to long format once: one row per reason and measure
df <- reshape2::melt(together, id.vars = "USuniquNegR")
ggplot(df, aes(USuniquNegR, value, fill = variable)) +
geom_bar(stat = 'identity', position = 'dodge') +
coord_flip() +
theme(legend.position = 'top')

Related

R ggplot horizontal bar chart with thousand data

I am trying to produce a bar graph with thousands of data points.
I have a sizing problem with ggplot.
Code :
ggplot(data = df, aes(x=extension, y=duration)) +
geom_bar(stat="identity", width=10,fill="steelblue")+
ggtitle("Chart") +
xlab("Number") +
ylab("Duration") +
theme(legend.position = "none")+
theme(plot.title = element_text(hjust = 0.5), plot.subtitle = element_text(hjust = 0.5))+
coord_flip()
Output:
Chart output
The data frame is loaded from MongoDB.
Data Frame:
1 36952 7158803
2 36110 7068360
3 36080 4736043
4 36509 4726630
5 36890 4699026
6 36051 4698594
7 36783 4677233
8 36402 4672623
9 36880 4672093
10 36513 4655583
11 36522 4630962
12 36116 4628046
13 36746 4593291
....
From your sample chart I would infer that your x-axis (extension) is probably a factor. If it were numeric, ggplot would correctly scale the axis.
I would recommend checking the class of each column in your dataset. Make sure that both are numeric.
Alternatively, you would have to come up with an appropriate scaling of your x-axis.
Here's the plot where your flipped x-axis is a factor; ggplot tries to render every separate level of the factor and they overlap as there are so many. I created some fake data quickly to mimic yours.
Here's the plot where extension is numeric and ggplot neatly scales this correctly.
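The two cases described above can be reproduced with a minimal sketch. The data here is made up (the real table comes from MongoDB and is not available), and the column name extension_num is introduced only for illustration.

```r
library(ggplot2)

# Fake data standing in for the real table (all values are made up)
set.seed(1)
df <- data.frame(
  extension = as.character(36000 + seq_len(1000)),  # stored as text
  duration  = round(runif(1000, 1e6, 7e6))
)

# Check how each column is stored before plotting
str(df)

# Discrete axis: one tick per row, so the labels overlap
p_discrete <- ggplot(df, aes(x = factor(extension), y = duration)) +
  geom_col(fill = "steelblue") +
  coord_flip()

# Continuous axis: convert via character first, then plot
df$extension_num <- as.numeric(as.character(df$extension))
p_numeric <- ggplot(df, aes(x = extension_num, y = duration)) +
  geom_col(fill = "steelblue") +
  coord_flip()

print(p_numeric)
```

Going through as.character() before as.numeric() matters when the column is a factor, because as.numeric() on a factor returns the internal level codes rather than the printed values.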

Population pyramid plot in R [duplicate]

I am new to R and trying to create a population pyramid plot similar to the first one here https://klein.uk/teaching/viz/datavis-pyramids/. I have a dataset with two variables sex and age groups that looks like this:
sex age_group
1 Male 20-30
2 Female 50-60
3 Male 70-80
4 Male 10-20
5 Female 80-90
... ... ...
This is the code I used
ggplot(data = pyramid_graph,
       mapping = aes(x = age_group, fill = sex)) +
  geom_bar(data = subset(pyramid_graph, sex == "F")) +
  geom_bar(data = subset(pyramid_graph, sex == "M"),
           mapping = aes(y = - ..count.. ),
           position = "identity") +
  scale_y_continuous(labels = abs) +
  coord_flip()
I do not get any errors from R but when I execute this code a blank image is produced.
Can anyone help?
Thank you
Using a similar input dataset from the same website that you cite in your question:
library(dplyr)
library(ggplot2)
# Obtain source data
load(url("http://klein.uk/R/Viz/popGH.RData"))
# Convert to summary table
df <- as_tibble(popGH) %>%
mutate(AgeDecade=as.factor(floor(AGE/10)*10)) %>%
group_by(SEX, AgeDecade) %>%
dplyr::summarise(N=n(), .groups="drop") %>%
# A more transparent way of managing the transformation to get "Females to the left".
mutate(PlotN=ifelse(SEX=="Female", -N, N))
# Create the plot
df %>% ggplot() +
geom_col(aes(fill=SEX, x=AgeDecade, y=PlotN)) +
scale_y_continuous(breaks=c(-2*10**5, 0, 2*10**5), labels=c("200k", "0", "200k")) +
labs(y="Population", x="Age group") +
scale_fill_discrete(name="Sex") +
coord_flip()
Gives
Note that I've created a new column to create the "females to the left" effect in the plot. Normally, I'd avoid doing that and would rely on the options to the various ggplot functions to achieve the same thing (much as you have attempted to do). However, in this case, I think it's far more transparent (and simpler) to use the extra column rather than to modify the mapping.
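The same extra-column approach can be sketched for data shaped like the question's, with one row per person and only sex and age_group columns. The rows below are made up for illustration. Note that the question's subsets filter on "F" and "M" while the sample rows contain "Female" and "Male", which may be why nothing was drawn.

```r
library(dplyr)
library(ggplot2)

# Made-up rows in the same shape as the question's data
pyramid_graph <- data.frame(
  sex       = c("Male", "Female", "Male", "Male", "Female", "Female"),
  age_group = c("20-30", "50-60", "70-80", "10-20", "80-90", "20-30")
)

# One row per sex/age group; female counts are negated so they plot to the left
counts <- pyramid_graph %>%
  count(sex, age_group, name = "N") %>%
  mutate(PlotN = ifelse(sex == "Female", -N, N))

p <- ggplot(counts, aes(x = age_group, y = PlotN, fill = sex)) +
  geom_col() +
  scale_y_continuous(labels = abs) +  # show magnitudes on both sides
  labs(x = "Age group", y = "Count", fill = "Sex") +
  coord_flip()

print(p)
```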

Clustered Bar Plot Using ggplot2

Basically, I want to display a barplot grouped by method: for each method, the number of people who took the test and the number of positive results found. I also want to display all the numbers and percentages as labels on the bars. I am trying to do this with ggplot2, but I fail every time.
Any help would be appreciated.
Thanks in advance
I'm not sure I have fully understood your question, but I suggest you take a look at geom_text.
library(ggplot2)
ggplot(df, aes(x = methods, y = percentage)) +
geom_bar(stat = "identity") +
geom_text(aes(label = paste0(round(percentage,2), " (",positive," / ", people,")")), vjust = -0.3, size = 3.5)+
scale_x_discrete(limits = c("NS1", "NS1+IgM", "NS1+IgG","Tourniquet")) +
ylim(0,100)
Data:
df = data.frame(methods = c("NS1", "NS1+IgM","NS1+IgG","Tourniquet"),
people = c(542,542,541,250),
positive = c(505,503,38,93))
df$percentage = df$positive / df$people * 100
> df
     methods people positive percentage
1        NS1    542      505   93.17343
2    NS1+IgM    542      503   92.80443
3    NS1+IgG    541       38    7.02403
4 Tourniquet    250       93   37.20000
Does this answer your question? If not, can you clarify it by adding the ggplot code you have tried so far?
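If a grouped (dodged) plot is what was meant, one possible sketch is to reshape the same data to long format and dodge both the bars and the labels. The column names measure, n, pct and label are introduced here for illustration and are not part of the original answer.

```r
library(ggplot2)

df <- data.frame(
  methods  = c("NS1", "NS1+IgM", "NS1+IgG", "Tourniquet"),
  people   = c(542, 542, 541, 250),
  positive = c(505, 503, 38, 93)
)

# Long format: one row per method and measure
long <- reshape2::melt(df, id.vars = "methods",
                       variable.name = "measure", value.name = "n")

# Percentage of the people tested with each method (100 for the "people" bars)
long$pct   <- long$n / df$people[match(long$methods, df$methods)] * 100
long$label <- paste0(long$n, " (", round(long$pct, 1), "%)")

p <- ggplot(long, aes(x = methods, y = n, fill = measure)) +
  geom_col(position = position_dodge(width = 0.9)) +
  geom_text(aes(label = label),
            position = position_dodge(width = 0.9),  # same width as the bars
            vjust = -0.3, size = 3) +
  labs(x = "Method", y = "Number of people", fill = NULL)

print(p)
```

Using the same position_dodge() width for geom_col() and geom_text() keeps each label centred over its own bar.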

Changing the xlim of numeric value causing error ggplot R

I have a grouped barplot produced using ggplot in R with the following code
ggplot(mTogether, aes(x = USuniquNegR, y = value, fill = variable)) +
geom_bar(stat = "identity", position = "dodge") +
scale_fill_discrete(name = "Area",
labels = c("Everywhere", "New York")) +
xlab("Reasons") +
ylab("Proportion of total complaints") +
coord_flip() +
ggtitle("Comparison between NY and all areas")
mTogether is created using the following code
mTogether <- melt(together, id.vars = 'USuniquNegR')
The Data Frame together is made up of
USperReasons USperReasonsNY USuniquNegR
1 0.198343304187759 0.191304347826087 Late Flight
2 0.35987114588127 0.321739130434783 Customer Service Issue
3 0.0667280257708237 0.11304347826087 Lost Luggage
4 0.0547630004601933 0.00869565217391304 Flight Booking Problems
5 0.109065807639208 0.121739130434783 Can't Tell
6 0.00460193281178095 0 Damaged Luggage
7 0.0846755637367694 0.0782608695652174 Cancelled Flight
8 0.0455591348366314 0.0521739130434783 Bad Flight
9 0.0225494707777266 0.0347826086956522 longlines
10 0.0538426138978371 0.0782608695652174 Flight Attendant Complaints
Together can be generated by the following
together<-data.frame(cbind(USperReasons,USperReasonsNY,USuniquNegR))
where
USperReasons <- c(0.19834,0.35987,.06672,0.05476,0.10906,.00460,.08467,0.04555,0.02254,0.05384)
USperReasonsNY <- c(0.191304348,0.321739130,0.113043478,0.008695652,0.121739130,0.000000000,0.078260870,0.05217391,0.034782609,0.078260870)
USuniquNegR <- c("Late Flight","Customer Service Issue","Lost Luggage","Flight Booking Problems","Can't Tell","Damaged Luggage","Cancelled Flight","Bad Flight","longlines","Flight Attendant Complaints")
The problem is that when I try to change the xlim of the ggplot using
+ xlim(0, 1)
I just seem to get an error:
Discrete value supplied to continuous scale
I can't understand why this happens but I need to resolve it because currently the x axis starts below 0 and is very highly packed:
image of ggplot output
The problem is that you are cbind()ing your column vectors together, which converts the numbers to characters. Fix that and the rest should fix itself.
together<-data.frame(USperReasons,USperReasonsNY,USuniquNegR)
You need to remove the cbind from
together<-data.frame(cbind(USperReasons,USperReasonsNY,USuniquNegR))
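because str(together) shows that all three columns are factors (or character vectors in R 4.0 and later, where strings are no longer converted to factors by default).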
With
together <- data.frame(USperReasons, USperReasonsNY, USuniquNegR)
the plot looks reasonable to me (without having to use ylim or xlim).
So, the error was not within ggplot2 but in data preparation.
Therefore, please, provide a full working example which can be copied, pasted and run when asking a question next time. Thank you.
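The coercion both answers describe can be checked with a small sketch that uses the first three values from the question. The names bad and good are introduced only for this illustration. The final plot also assumes that, if axis limits are still wanted, they belong on the y scale, since coord_flip() swaps the drawn axes but the proportions remain mapped to y.

```r
library(ggplot2)

USperReasons <- c(0.19834, 0.35987, 0.06672)
USuniquNegR  <- c("Late Flight", "Customer Service Issue", "Lost Luggage")

# cbind() builds a matrix, and a matrix holds a single type,
# so the numbers are converted to text
bad  <- data.frame(cbind(USperReasons, USuniquNegR))
good <- data.frame(USperReasons, USuniquNegR)

str(bad)   # USperReasons is a factor or character column
str(good)  # USperReasons is numeric

# With numeric values the continuous scale works; limits go on y
p <- ggplot(good, aes(x = USuniquNegR, y = USperReasons)) +
  geom_col() +
  scale_y_continuous(limits = c(0, 1)) +
  coord_flip()

print(p)
```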

What is happening with my geom_line() in ggplot2?

I am no expert in R, but I have used ggplot2 many times and never had any problems. Still, this time I am not able to plot lines in my graph and I have no idea why (it should be something really simple though).
For instance for:
def.percent period
1 5.0657339 1984-1985
2 3.9164528 1985-1986
3 -1.756613 1986-1987
4 2.8184863 1987-1988
5 -2.606311 1988-1989
I have the code:
ggplot(plot.tab, aes(x=period, y=def.percent)) + geom_line() + geom_point() + ggtitle("Deforestation rates within Properties")
But when I run it, it just plots the points without a line. It also gives me this message:
geom_path: Each group consist of only one observation. Do you need to adjust the group aesthetic?
It's not really an error, but I cannot figure out how to plot the lines... Any ideas?
Your x axis (period) is a factor rather than numeric, so it doesn't connect them. You can fix this by setting group = 1 in the aesthetics, which tells ggplot2 to group them all together into a single line:
ggplot(plot.tab, aes(x = period, y = def.percent, group = 1)) +
geom_line() +
geom_point() +
ggtitle("Deforestation rates within Properties")
