Making a spiral column chart with ggplot? - r

I am trying to make a spiral side-by-side bar chart like the following examples:
The previous one is from Spiral barplot using ggplot & coord_polar (Condegram), but I couldn't make it work for my data.
Using ggplot, this is as far as I have gotten:
library(ggplot2)
library(tidyr)  # gather() and %>%
data <- read.table(text="year group1 group2
1973 25939 27147
1978 21086 23108
1989 28401 24010
1995 34601 25457
2000 38672 28894
2007 40874 34926
2009 43892 38169
2013 48028 39270
2014 47289 39948
2015 48261 41913
2016 49814 42373
2017 50346 42818", header = TRUE)
data$year <- as.character(data$year)
data <- data %>% gather(group, value, group1:group2)
ggplot(data) +
  geom_bar(aes(x = year, y = value, fill = group), stat = "identity", position = "dodge") +
  coord_polar()
This produces an ugly spiral bar chart.
I'm not sure how to make the bottoms square and add the space the white spiral needs. Any help and explanation would be greatly appreciated!

Related

How to add legend on a line plot?

I have data like this:
year catch group
2011 22 1
2012 45 1
2013 34 1
2011 11 2
2012 22 2
2013 32 2
I would like to have the number of the group (1 and 2) to appear above the line in the plot.
Any suggestion?
My real data has 8 groups in total with 8 lines which makes it hard to see because the lines cross one another and the colors of the legend are similar.
I tried this:
library(ggplot2)
ggplot(aes(x=as.factor(year), y=catch, group=as.factor(group),
col=as.factor(group)), data=df) +
geom_line() +
geom_point() +
xlab("year") +
labs(color="group")
Firstly, distinguishing 8 different colours is very difficult. That's why your 8 groups seem to have similar colors.
What you want in this case is not a legend (which usually is an off-chart summary), but rather "annotation".
You can directly add the groups with
ggplot(...) +
geom_text(aes(x=as.factor(year), y=catch, label=group)) +
...
and then try to tweak the position of the text with nudge_x and nudge_y. But if you wanted only 1 label per group, you would have to prepare a data frame with it:
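library(dplyr)
# keep one row per group: the earliest year (top_n() with -year picks the smallest year)
labels <- df %>% group_by(group) %>% top_n(1, -year)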
ggplot(...) +
geom_text(data=labels, aes(x=as.factor(year), y=catch, label=group)) +
...
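For reference, here is a minimal runnable sketch of the one-label-per-group idea, using the sample data from the question. It is an illustration rather than the answerer's exact code: it assumes dplyr 1.0.0 or later for slice_max(), and it places each label at the last year instead of the first, which is only a choice made here so the label sits at the right-hand end of each line. The nudge value is arbitrary.

```r
library(ggplot2)
library(dplyr)

# Sample data from the question
df <- data.frame(
  year  = rep(2011:2013, times = 2),
  catch = c(22, 45, 34, 11, 22, 32),
  group = rep(1:2, each = 3)
)

# One row per group: the last year of each line (assumption: label at the right end)
labels <- df %>%
  group_by(group) %>%
  slice_max(year, n = 1) %>%
  ungroup()

p <- ggplot(df, aes(x = as.factor(year), y = catch,
                    group = as.factor(group), colour = as.factor(group))) +
  geom_line() +
  geom_point() +
  # the label layer inherits x, y and colour; only the label text is new
  geom_text(data = labels, aes(label = group),
            nudge_x = 0.15, show.legend = FALSE) +
  labs(x = "year", colour = "group")

# print(p) draws the chart
```

With 8 groups, labels that land on top of each other may still need manual nudging.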

Stacked barplot histogram in R

I would like to make a histogram for my data but I would also like to visualize it in such a way that each category is coloured differently but stacked together.
This is what I'm trying to achieve: Stacked histogram from already summarized counts using ggplot2
but I'm unsure how to do it for my data set and my R skills are very much on the rusty side.
My data is formatted like this
Name Category Age Year
1 A 3 2017
2 B 6 2016
3 B 12 2017
4 B 8 2017
I'm only interested in Category B so I made a subset called catB. I would like the histogram to graph the frequency of the different ages, and I would like to colour the stacks based on year (in my data there are 5 year options).
I would appreciate any help! Thank you!
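# Year is numeric in the sample data; convert it to a factor so each year gets its own fill colour
ggplot(catB, aes(x = Age, fill = factor(Year))) +
  geom_histogram(binwidth = 1)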
One more option: add an explicit frequency (count) column. In the example given each row counts once, so count = 1; check what the count should be in your real data:
catB <- cbind(catB, count = 1)
ggplot(catB, aes(x = Age, y = count)) +
  geom_col(aes(fill = factor(Year)))
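If the counts are not all 1, they can be computed first and then plotted as stacked columns. The sketch below assumes dplyr is available and uses a small made-up catB with only the Age and Year columns, so the values are hypothetical and only the column names come from the question.

```r
library(ggplot2)
library(dplyr)

# Hypothetical stand-in for the Category B subset
catB <- data.frame(
  Age  = c(6, 12, 8, 6, 8),
  Year = c(2016, 2017, 2017, 2017, 2016)
)

# One row per Age/Year combination with its frequency
age_counts <- catB %>%
  count(Age, Year, name = "count")

# Columns stack by default; factor(Year) gives one discrete colour per year
p <- ggplot(age_counts, aes(x = Age, y = count, fill = factor(Year))) +
  geom_col() +
  labs(fill = "Year")

# print(p) draws the chart
```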

How can I get my area plot to stack using ggplot?

I am trying to get my cumulative area plot to stack using the code below, which is based on http://dantalus.github.io/2015/08/16/step-plots/. I have added in position=stack, however the plot still overlaps.
The aim of what I am trying to achieve is to show the cumulative number of publications each year within a given period. So, as an example, in 1940 there may be one publication, the following year there may be 2 more, bringing the cumulative total to 3.
What would be the best way to get the areas to stack on top of each other?
How can the order be controlled? Would I need to use arrange() to order TERM2?
ggplot(data=working, aes(x=Year, color=TERM2, fill=TERM2)) +
stat_bin(data = subset(working, TERM2=="A"), bins=80, aes(y=cumsum(..count..)),geom="area", position="stack", alpha=0.1) +
stat_bin(data = subset(working, TERM2=="B"), bins=80, aes(y=cumsum(..count..)),geom="area", position="stack",alpha=0.1) +
stat_bin(data = subset(working, TERM2=="Both"),bins=80, aes(y=cumsum(..count..)),geom="area", position="stack", alpha=0.1) +
ylab("Total Number") + xlim(1940,2020) + ggtitle("Cumulative number by measurement method")
What I am currently getting:
Example of what I am trying to achieve:
The following chart was created in Excel using the same data which is exactly what I am looking to achieve in R.
My Data:
Example of how my data is currently structured:
Year TERM2
1944 A
1959 B
1966 A
1968 B
1968 A
1970 A
1971 B
1971 B
1971 A
1971 A
1971 Both
1971 Both
1971 Both
1972 A
1972 Both
1972 Both
1973 B
1973 A
1974 A
1974 A
'data.frame': 803 obs. of 6 variables:
$ Year : int 1944 1959 1966 1968 1968 1970 1971 1971 1971 1971 ...
$ TERM2 : Factor w/ 3 levels "B","A","Both": 2 1 2 1 2 2 1 1 2 2 ...
Changes based on user127649's suggestions
This is the plot after user127649's suggestions, which is close to what I would expect except I am looking for it to start at 0 and end at 803 (total number of publications).
ggplot(data=working, aes(x=Year, color=TERM2, fill=TERM2)) +
stat_bin(bins=80, aes(y=cumsum(..count..)), geom="area", alpha=0.1) +
ylab("Total Number") + xlim(1940,2020) + ggtitle("Cumulative number by measurement method")
I think there were two issues.
When you use stat_bin() in three separate layers, each layer effectively has its own independent data set. This gives the correct count, but (and this is really a guess) I think being in three separate layers means they can't be stacked.
If you use stat_bin() in a single layer for all the groups, I think cumsum(..count..) is applied to the data as a whole rather than to each group separately.
I don’t know whether this is the best approach or not, but I think it’s what you’re after.
Data
The data are grouped and cumsum() is used on each group separately.
library(tidyverse)
working <- working %>%
count(Year, TERM2) %>%
spread(TERM2, n, fill = 0) %>%
mutate_at(vars('A', 'B', 'Both'), cumsum) %>%
gather(TERM2, N, -Year, factor_key = T) #%>%
# mutate(TERM2 = ordered(TERM2, levels = rev(levels(TERM2))))
Plot
This code will produce the first plot below. If you prefer the look of the second plot, you can un-comment the last line of the data manipulation chunk.
ggplot(working, aes(Year, N, fill = TERM2)) +
geom_area(position = 'stack') +
ylab("Total Number")
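As a sanity check on the "start at 0 and end at the total number of publications" requirement, here is a small sketch of the same per-group cumulative count on toy data. It is a variant rather than the code above: it assumes dplyr and tidyr, uses complete() in place of spread()/gather(), and the toy rows are only a short excerpt in the shape of the question's data.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for `working` (same two columns as in the question)
toy <- data.frame(
  Year  = c(1944, 1959, 1966, 1968, 1968, 1970, 1971, 1971, 1971),
  TERM2 = c("A", "B", "A", "B", "A", "A", "B", "A", "Both")
)

cum_counts <- toy %>%
  count(Year, TERM2) %>%
  # add a zero row for every Year/TERM2 pair that has no publications
  complete(Year, TERM2, fill = list(n = 0)) %>%
  group_by(TERM2) %>%
  arrange(Year, .by_group = TRUE) %>%
  mutate(N = cumsum(n)) %>%   # running total within each group
  ungroup()

# Height of the stacked areas in the final year
final_total <- sum(cum_counts$N[cum_counts$Year == max(cum_counts$Year)])
```

In this sketch final_total equals nrow(toy), so on the full data the top of the stack in the last year should equal the number of rows.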
Result

Fill geom_area (ggplot2) with a gradient

I am having some trouble applying a gradient fill to my area plot.
The data is as below:
> df
year annual
1 1960 0.0100
2 1961 -0.2700
3 1962 -0.3450
4 1963 -0.6508
5 1964 -0.9458
6 1965 -0.2458
7 1966 0.9492
8 1967 0.5383
9 1968 0.6275
10 1969 0.0000
I've set up a colorRampPalette for the gradient, and I know this works.
spi.cols <- colorRampPalette(c("darkred","red","yellow","white","green","blue","darkblue"),space="rgb")
With the plot, my aim is to have the fill colours follow the values in the annual column, so that it is easy to tell which values fall within certain boundaries. Right now, the plot seems to treat every value it is filling as equal to zero, and so fills everything in one colour only.
ggplot(df, aes(x = year)) +
geom_polygon(aes(y = annual, fill = annual)) +
theme_classic() +
scale_fill_gradientn(colours = spi.cols(12), limits = c(-2.5, 2.5), guide = "legend")
I have also specified the breaks I'd like in my gradient, but I'm not sure how to utilise this. I attempted to use this in values of the scale_fill_gradientn but this was unsuccessful.
spi.breaks <- c(-2.5,-2,-1.6,-1.3,-0.8,-0.5,0.5,0.8,1.3,1.6,2,2.5)
Any help would be much appreciated
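One possible direction, offered only as a sketch and not taken from the thread: as far as I know, a single polygon or area gets one fill colour, so this version swaps in one bar per year (geom_col) so that each year can take its own colour. It also assumes the scales package and passes the breaks to the values argument of scale_fill_gradientn() after rescaling them to the 0 to 1 range.

```r
library(ggplot2)
library(scales)

# Data and palette from the question
df <- data.frame(
  year   = 1960:1969,
  annual = c(0.0100, -0.2700, -0.3450, -0.6508, -0.9458,
             -0.2458, 0.9492, 0.5383, 0.6275, 0.0000)
)
spi.cols <- colorRampPalette(
  c("darkred", "red", "yellow", "white", "green", "blue", "darkblue"),
  space = "rgb"
)
spi.breaks <- c(-2.5, -2, -1.6, -1.3, -0.8, -0.5, 0.5, 0.8, 1.3, 1.6, 2, 2.5)

# `values` must lie in [0, 1]; rescale the breaks relative to the scale limits
grad_values <- rescale(spi.breaks, from = c(-2.5, 2.5))

# Swapped-in technique: one bar per year instead of a single filled area
p <- ggplot(df, aes(x = year, y = annual, fill = annual)) +
  geom_col(width = 1) +
  theme_classic() +
  scale_fill_gradientn(colours = spi.cols(12), values = grad_values,
                       limits = c(-2.5, 2.5))

# print(p) draws the chart
```

The twelve colours line up with the twelve breaks, so each colour is anchored at one of the listed boundaries.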

Join gap in polar line ggplot plot

When ggplot makes a line plot with polar coordinates, it leaves a gap between the highest and lowest x-values (Dec and Jan below) instead of wrapping around into a spiral. How can I continue the line and close that gap?
In particular, I want to use months as my x-axis, but plot multiple years of data in one looping line.
Reprex:
library(ggplot2)
# three years of monthly data
df <- expand.grid(month = month.abb, year = 2014:2016)
df$value <- seq_along(df$year)
head(df)
## month year value
## 1 Jan 2014 1
## 2 Feb 2014 2
## 3 Mar 2014 3
## 4 Apr 2014 4
## 5 May 2014 5
## 6 Jun 2014 6
ggplot(df, aes(month, value, group = year)) +
geom_line() +
coord_polar()
Here's a somewhat-hacky option:
# make a data.frame of the start (Jan) values that the previous year's line should continue to
bridges <- df[df$month == 'Jan',]
bridges$year <- bridges$year - 1 # adjust index to align with previous group
bridges$month <- NA # set x value to any new value
# combine extra points with original
ggplot(rbind(df, bridges), aes(month, value, group = year)) +
geom_line() +
# close gap by removing expansion; redefine breaks to get rid of "NA/Jan" label
scale_x_discrete(expand = c(0,0), breaks = month.abb) +
coord_polar()
Obviously adding extra data points is not ideal, though, so maybe a more elegant answer exists.
