I have data for the first 5 months of 2 separate years presented in a stacked bar plot. Currently it looks like this:
I've used this code to get to where I am:
ggplot(data=total_arr2, aes(fill=type, y=flights, x=period, group=year)) +
geom_bar(position="fill", stat="identity") +
ggtitle("Arrivals by period, type stacked proportions") +
scale_y_continuous(labels = scales::percent)
I want to add a custom gap to separate the 2019 cases from the 2020 cases so that the trends in the first 5 months of each year can be compared side by side, but I want to keep them on the same graph.
Is there a relatively easy way to do this? (I've been using R for about 3 weeks in total, so it's a steep learning curve!)
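One way to get that separation, as a minimal sketch: facet by year, so each year gets its own panel inside the same figure. The data frame below is a made-up stand-in for total_arr2 (the real columns are not shown in the question), and it assumes period holds the month number 1 to 5 in both years.

```r
library(ggplot2)

# Hypothetical stand-in for total_arr2: 5 periods x 2 years x 2 types.
set.seed(1)
total_arr2 <- expand.grid(
  period = 1:5,
  year   = c(2019, 2020),
  type   = c("domestic", "international")
)
total_arr2$flights <- sample(50:500, nrow(total_arr2))

# One panel per year; the gap between panels separates 2019 from 2020.
p <- ggplot(total_arr2, aes(x = factor(period), y = flights, fill = type)) +
  geom_col(position = "fill") +
  facet_wrap(~ year) +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "period", title = "Arrivals by period, type stacked proportions")
p
```

If period already encodes the year (so 2019 and 2020 use different values), facet_wrap(~ year, scales = "free_x") should keep each panel to its own periods.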
I am plotting years across two decades using ggplot. Because of how the data were collected, each datapoint really falls halfway through its year, so to be accurate I labeled the years with a .5 at the end. I also have one single datapoint that was taken in early 2005, so it's labeled 2005.22. The years therefore look like: 2005.22, 2005.5, 2006.5, 2007.5, 2008.5, 2009.5, 2010.5, 2011.5, 2012.5. Since I am technically missing data for 2005-2005.21, I want the plot to start at 2005 with no line showing until 2005.22, and then break every 2 years starting at 2005.5, 2007.5 and so on.
I've been using the following with geom_line to plot the years, but I do not know how to get the above result. I was able to get the limits to start at 2005, but with the first datapoint at 2005.22 the breaks come out as 2005.22, 2007.22, and so on. Below is what I am using to set the scale and breaks.
scale_x_continuous(
name = "year",
breaks = seq(c(2005, 2012.5), by=2),
expand = c(0,0))+
coord_cartesian(xlim = c(2005, 2012.5))
It's a little hard for me to understand what exactly you want the plot to look like (especially in terms of the labels), but does this do what you're looking for? You can add 2005 to the front of the breaks sequence, which places it in front without disrupting the rest of the sequence.
library(ggplot2)
d <- data.frame(x=c(2005.22, 2005.5,2006.5,2007.5,2008.5,2009.5,2010.5,2011.5,2012.5),
y=runif(9,-1,1))
ggplot(d, aes(x,y)) +
geom_line() +
scale_x_continuous(breaks=c(2005, seq(2005.5, 2012.5,2)))
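To see what that break vector contains, and to make the axis itself start at 2005, here is a small sketch. It sets limits on the scale, which is an alternative to the coord_cartesian() call in the question, and the y values are random placeholders.

```r
library(ggplot2)

# 2005 first, then every second year starting at 2005.5.
breaks <- c(2005, seq(2005.5, 2012.5, by = 2))
print(breaks)
# values: 2005.0, 2005.5, 2007.5, 2009.5, 2011.5

set.seed(42)
d <- data.frame(
  x = c(2005.22, seq(2005.5, 2012.5, by = 1)),
  y = runif(9, -1, 1)  # placeholder values
)

# The axis runs from 2005, but the line only starts at the first
# datapoint (2005.22), so 2005 to 2005.22 stays empty.
p <- ggplot(d, aes(x, y)) +
  geom_line() +
  scale_x_continuous(
    name   = "year",
    breaks = breaks,
    limits = c(2005, 2012.5),
    expand = c(0, 0)
  )
p
```

Note that 2012.5 is not a break here, because stepping by 2 from 2005.5 ends at 2011.5.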
How can I sort a bar graph from highest to lowest in R? (ggplot2)
This is my code, but feel free to improve it, haha.
P.S. It is a huge dataset.
ggplot(kiva, aes(repayment_interval, loan_amount, fill = repayment_interval)) +
geom_bar(stat = "identity") +
ggtitle("Total of loan for different types of repayment intervals")
I assume you want highest to lowest y-variable, loan_amount.
Difficult to answer without example data but something like this should work, using reorder. Also geom_col = less typing.
ggplot(kiva,
       aes(reorder(repayment_interval, -loan_amount), loan_amount,
           fill = repayment_interval)) + # fill goes inside aes()
  geom_col() +
  ggtitle("Total of loan for different types of repayment intervals")
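One thing to watch: reorder() summarises with the mean by default, while the stacked bars show totals. With many rows per category the two orderings can differ. A minimal sketch with FUN = sum, using a made-up stand-in for kiva (the interval names and amounts are invented):

```r
library(ggplot2)

# Made-up stand-in for kiva; only the two columns used in the plot.
kiva <- data.frame(
  repayment_interval = c("monthly", "monthly", "monthly",
                         "bullet", "irregular", "irregular"),
  loan_amount = c(100, 150, 200, 700, 300, 50)
)

# Order the levels by total loan_amount, largest first.
kiva$interval_sorted <- reorder(kiva$repayment_interval,
                                -kiva$loan_amount, FUN = sum)
levels(kiva$interval_sorted)
# levels: bullet, monthly, irregular (totals 700, 450, 350)

p <- ggplot(kiva, aes(interval_sorted, loan_amount, fill = interval_sorted)) +
  geom_col() +
  ggtitle("Total of loan for different types of repayment intervals")
p
```

With the default mean, this toy data would put irregular (mean 175) ahead of monthly (mean 150), which does not match the bar heights.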
I would like to make a histogram for my data but I would also like to visualize it in such a way that each category is coloured differently but stacked together.
This is what I'm trying to achieve: Stacked histogram from already summarized counts using ggplot2
but I'm unsure how to do it for my data set and my R skills are very much on the rusty side.
My data is formatted like this
Name Category Age Year
1 A 3 2017
2 B 6 2016
3 B 12 2017
4 B 8 2017
I'm only interested in Category B so I made a subset called catB. I would like the histogram to graph the frequency of the different ages, and I would like to colour the stacks based on year (in my data there are 5 year options).
I would appreciate any help! Thank you!
ggplot(catB, aes(x = Age, fill = Year)) +
geom_histogram()
One more graphical option: add an explicit frequency (count) column. In the example given each row counts once, so count = 1; with real data, check what the count value should be:
catB <- cbind(catB, count = 1) # each row contributes one observation
ggplot(catB, aes(x = Age, y = count, fill = factor(Year))) + # factor() gives one discrete colour per year
  geom_col() # bars stack by default
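If the rows are raw observations (one row per individual), the count column may not be needed at all. The likely reason the attempt in the question does not colour by year is that Year is numeric, so ggplot2 treats it as continuous and does not split the bars. A minimal sketch, using invented Category B rows in the same layout as the question:

```r
library(ggplot2)

# Invented Category B rows in the question's layout.
catB <- data.frame(
  Name     = 2:7,
  Category = "B",
  Age      = c(6, 12, 8, 6, 8, 8),
  Year     = c(2016, 2017, 2017, 2017, 2016, 2015)
)

# factor(Year) makes the fill discrete, so each year becomes its own
# stacked segment; binwidth = 1 gives one bar per whole-number age.
p <- ggplot(catB, aes(x = Age, fill = factor(Year))) +
  geom_histogram(binwidth = 1) +
  labs(fill = "Year")
p

# Cross-tabulation of the counts that the bars represent.
table(catB$Age, catB$Year)
```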
I am fairly new to R and ggplot2 and am having some trouble plotting multiple variables in the same histogram plot.
My data is already grouped and just needs to be plotted. The data is by week and I need to plot the number for each category (A, B, C and D).
Date A B C D
01-01-2011 11 0 11 1
08-01-2011 12 0 3 3
15-01-2011 9 0 2 6
I want the Dates as the x axis and the counts plotted as different colors according to a generic y axis.
I am able to plot just one of the categories at a time, but am not able to find an example like mine.
This is what I use to plot one category. I am pretty sure I need to use position="dodge" to plot multiple as I don't want it to be stacked.
ggplot(df, aes(x=Date, y=A)) + geom_histogram(stat="identity") +
labs(title = "Number in Category A") +
ylab("Number") +
xlab("Date") +
theme(axis.text.x = element_text(angle = 90))
Also, this gives me a histogram with spaces in between the bars. Is there any way to remove this? I tried spaces=0 as you would do when plotting bar graphs, but it didn't seem to work.
I read some previous questions similar to mine, but the data was in a different format and I couldn't adapt it to fit my data.
This is some of the help I looked at:
Creating a histogram with multiple data series using multhist in R
http://www.cookbook-r.com/Graphs/Plotting_distributions_%28ggplot2%29/
I'm also not quite sure what the bin width is. I think it is how the data should be spaced or grouped, which doesn't apply to my question since it is already grouped. Please advise me if I am wrong about this.
Any help would be appreciated.
Thanks in advance!
You're not really plotting histograms, you're just plotting a bar chart that looks kind of like a histogram. I personally think this is a good case for faceting:
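library(ggplot2)
library(reshape2) # for melt()
melt_df <- melt(df, id.vars = "Date") # long format: Date, variable, value
head(melt_df) # so you can see it
ggplot(melt_df, aes(Date, value, fill = Date)) +
  geom_col() + # value is already a count, so no binning is needed
  facet_wrap(~ variable)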
However, I think that, in general, changes over time are much better represented by a line chart:
ggplot(melt_df,aes(Date,value,group=variable,color=variable)) + geom_line()
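For the side-by-side bars the question asks about, the same long-format data can be drawn with position = "dodge". A minimal sketch, rebuilding the three example rows from the question by hand (Date is kept as plain text here, which is an assumption about how it is stored):

```r
library(ggplot2)
library(reshape2)

# The three example rows from the question.
df <- data.frame(
  Date = c("01-01-2011", "08-01-2011", "15-01-2011"),
  A = c(11, 12, 9),
  B = c(0, 0, 0),
  C = c(11, 3, 2),
  D = c(1, 3, 6)
)

# Long format: one row per Date/category pair.
melt_df <- melt(df, id.vars = "Date")

# One bar per category within each date, placed side by side.
p <- ggplot(melt_df, aes(Date, value, fill = variable)) +
  geom_col(position = "dodge") +
  labs(x = "Date", y = "Number", fill = "Category") +
  theme(axis.text.x = element_text(angle = 90))
p
```

The gaps between date groups come from the bar width, which defaults to less than the full slot; passing width = 1 to geom_col() should close them.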
I have data that has the following format:
revision added removed changed confirmed
1 20 0 0 0
2 18 3 8 10
3 12 8 14 10
4 6 5 11 8
5 0 1 7 11
Each row represents a revision of a document. The first column is the revision number, and the remaining columns represent elements added, removed, changed, and confirmed (ready) in the respective revision. (In reality, there are more rows and columns, this is just an example.) Each number represents the amount of recorded additions, removals, changes, and confirmations in each respective revision.
What I need is a stacked barplot that looks something like this:
I would like to do this in ggplot2. The exact visual look is not important (fonts, colours, and placement of the legend) as long as I can tweak it later. At the moment, it's the general idea I'm looking for.
I've looked at several questions and answers, e.g.
How do I do a Barplot of already tabled data?,
Making a stacked bar plot for multiple variables - ggplot2 in R,
barplot with 3 variables (continous X and Y and third stacked variable), and
Stacked barplot, but they all seem to make assumptions that don't match my data. I've also experimented with something like this:
ggplot(data) + geom_bar(aes(x=revision, y=added), stat="identity", fill="white", colour="black") + geom_bar(aes(x=revision, y=removed), stat="identity", fill="red", colour="black")
But obviously this does not create a stacked barplot, because it just draws the second geom_bar over the first.
How can I make a stacked barplot of my data using ggplot2?
Try:
library(reshape2)
dat <- melt(data, id="revision")
ggplot(dat, aes(x=revision, y=value, fill=variable)) +
geom_bar(stat="identity")
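To show what melt() is doing there, here is the same idea as a self-contained sketch with the five example rows typed in. Each revision ends up with one row per column (added, removed, changed, confirmed), and ggplot2 stacks the rows that share an x value. Wrapping revision in factor() is my own addition so the axis shows whole revision numbers.

```r
library(ggplot2)
library(reshape2)

# The example table from the question.
data <- data.frame(
  revision  = 1:5,
  added     = c(20, 18, 12, 6, 0),
  removed   = c(0, 3, 8, 5, 1),
  changed   = c(0, 8, 14, 11, 7),
  confirmed = c(0, 10, 10, 8, 11)
)

# Wide to long: columns become levels of `variable`, cells become `value`.
dat <- melt(data, id.vars = "revision")
head(dat)

# Total bar height per revision, as a sanity check on the stacking.
totals <- as.vector(tapply(dat$value, dat$revision, sum))
print(totals)
# values: 20, 39, 44, 30, 19

p <- ggplot(dat, aes(x = factor(revision), y = value, fill = variable)) +
  geom_col() +
  labs(x = "revision")
p
```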