I created a bar graph in ggplot to show how the counts in the column scheme changed over time (i.e. from 2001 to 2016).
The x-axis is the year and the y-axis shows the frequencies (I mapped fill = scheme to get the counts per scheme).
The data set consists of two columns (year and scheme), both filled with character values:
year scheme
2016 yes
2016 yes
2016 yes
2016 yes
2015 yes
2015 yes
2014 yes
2013 yes
....
2006 no
2006 no
2006 no
2006 no
2005 no
2005 no
2004 no
2003 no
2002 no
2002 no
2001 no
2001 no
My code:
a <- ggplot(s) +
stat_bin(aes(x=year, fill=scheme, group=scheme), geom="bar", position = "dodge",bins=30)
b <- a + scale_x_continuous(breaks = c(2001:2016), labels = factor(2001:2016))
c <- b + theme(axis.text.x=element_text(size = 10, colour = "black"))
The graph:
The problem is that the bars are shifted in the graph for no apparent reason. You can see it by comparing the bars with the year labels on the x-axis: some bars are moved too far to the left (e.g. 2007) and others too far to the right (e.g. 2002).
I have no clue why this happens or how to fix it. Any suggestions are very welcome.
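Use binwidth = 1 instead of bins = 30. When you ask for 30 bins, the 15-year range from 2001 to 2016 is cut into 30 segments that are each only about half a year wide (spaced roughly like the values in seq(2001, 2016, length.out = 30)), so the bins no longer line up with whole years.
The odd gaps come from the bins that do not contain a whole-number year and therefore stay empty, and each bar is drawn at its bin's position rather than exactly on the year label.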
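A minimal sketch of the fix, assuming toy data shaped like the question's (the rows below are made up for illustration, and the conversion of year from character to numeric is my assumption, since stat_bin needs a continuous x):

```r
library(ggplot2)

# Hypothetical toy data: two character columns, as described in the question
s <- data.frame(
  year   = c("2016", "2016", "2015", "2006", "2006", "2005", "2002", "2001"),
  scheme = c("yes", "yes", "yes", "no", "no", "no", "no", "no"),
  stringsAsFactors = FALSE
)
s$year <- as.numeric(s$year)  # stat_bin() requires a continuous variable

# binwidth = 1 gives one bin per year instead of 30 half-year bins
p <- ggplot(s) +
  stat_bin(aes(x = year, fill = scheme, group = scheme),
           geom = "bar", position = "dodge", binwidth = 1) +
  scale_x_continuous(breaks = 2001:2016) +
  theme(axis.text.x = element_text(size = 10, colour = "black"))
```

Printing p draws the chart. Since the data are already one row per observation, geom_bar(aes(x = year, fill = scheme), position = "dodge") would be another way to get one bar per year without any binning.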
Related
I'm trying to make a spaghetti plot for some time series data and am having some trouble - my attempts so far do not show a separate line for each individual.
Individuals were sampled twice (in July and October) and at each time point I tested whether or not they were infected with nematodes (Yes/No).
I want to make a graph with "Nematode infection status" on the Y axis and "Month" (or "Time" if I have to make it numeric?) on the X axis.
My data looks like this:
ID Month Time nematode_infected
1 July 1 Yes
1 Oct 2 Yes
2 July 1 Yes
2 Oct 2 No
3 July 1 No
3 Oct 2 Yes
ID, Month, and nematode_infected are all factors.
And my current code to graph it looks like this:
ggplot(recaptures_combined_long, aes(x=Time, y=nematode_infected, group=ID))+
facet_wrap(~RELATIVEAGE)+
labs(x="Month", y="Nematode infected?") + geom_line()
This is what I get (note that I have ~70 different IDs so there should be 70 lines on this graph, but there are only 4 per graph)
If I use "colour = ID" instead of "group = ID", I get this mess.
What I want is something like this, but with each line (each ID) a different colour:
Any idea how I could go about producing this graph? Thank you!
Edit - the solution was:
ggplot(recaptures_combined_long, aes(nematode_infected, Month, color = as.factor(ID), group = ID) )+
facet_wrap(~RELATIVEAGE)+ labs(x="Nematode infected?", y="Month")+
geom_line(position=position_dodge(width=0.2))+theme_classic() +
coord_flip() + theme(legend.position="none")
Is this what you are looking for?
df <- read.table(text =
"ID Month Time nematode_infected
1 July 1 Yes
1 Oct 2 Yes
2 July 1 Yes
2 Oct 2 No
3 July 1 No
3 Oct 2 Yes",
header = TRUE)
library(ggplot2)
ggplot(df) +
geom_line(aes(Month, nematode_infected, color = as.factor(ID), group = ID))
Created on 2022-02-11 by the reprex package (v2.0.0)
I have this kind of table:
Year Substance Number
2013 A 32
2013 B 27
2013 C 17
2013 D 17
2013 E 15
2013 F 13
2014 B 20
2014 D 17
2014 A 16
2014 C 11
2014 F 9
2014 G 3
Basically, the years go up to 2018 with 6 or 7 substances every year, and each substance has a number (frequency of occurrence). The substances have actual names, but I cannot publish them on the Internet, so I replaced them with A, B, C, D, E, F, and G. I am unable to order the bars the way I want, in decreasing order within each year.
I did a lot of research on the Internet and tried many things (forcats, factor, levels, reorder, etc.), and none of it worked. I am an R novice, so I don't really know what the best way to do this would be.
When I try to plot like this, it places the substances in alphabetical order:
ggplot(Test, aes(x = Year, y = Number, fill = Substance)) + geom_col(position = "dodge")
For the first year, 2013, the order is right. I want it to look like that, in decreasing order, for every other year. What should I do?
This is kind of tricky because your ordering is changing by year, so factor variable conversion gets messy. Here is one way to do it by sorting x position using a separate numeric value:
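library('data.table')
library('ggplot2')
# := only works on a data.table; setDT() converts a plain data.frame in place
setDT(Test)
# Rank the substances within each year, largest Number first
Test[, Ranking := rank(-Number, ties.method = 'first'), by = .(Year)]
ggplot(Test, aes(x = Ranking,
                 y = Number,
                 fill = Substance)) +
  geom_col(position = 'dodge') +
  scale_x_continuous(name = '', breaks = 0) +
  facet_wrap(~Year)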
Output:
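If data.table is not an option, the same per-year ranking can be sketched in base R with ave(). This is only an illustration: the data frame below retypes the twelve rows shown in the question, and the column name Ranking follows the answer above.

```r
# The twelve rows shown in the question, retyped for illustration
Test <- data.frame(
  Year      = rep(c(2013, 2014), each = 6),
  Substance = c("A", "B", "C", "D", "E", "F",
                "B", "D", "A", "C", "F", "G"),
  Number    = c(32, 27, 17, 17, 15, 13,
                20, 17, 16, 11, 9, 3),
  stringsAsFactors = FALSE
)

# Rank within each Year, largest Number first; ties keep their row order
Test$Ranking <- ave(-Test$Number, Test$Year,
                    FUN = function(x) rank(x, ties.method = "first"))
```

The resulting Ranking column can then be used as the x aesthetic exactly as in the ggplot call above.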
I need to look at relative change in 2 groups of data which have very different scales.
I therefore think the way forward is to set the first value in each group to 100% and express every later value as a proportion of it. I can then create a line chart to show the relative movement.
I would call this an index chart, so I may have missed existing questions that use a different name.
However, I don't know how to set up my data in R to do this.
My aggregated data is below. I want the 1999 value of each fips group to be 100% and the subsequent years to be a percentage of that.
> Totals
year fips Emissions
1 1999 06037 6109.6900
2 2002 06037 7188.6802
3 2005 06037 7304.1149
4 2008 06037 6421.0170
5 1999 24510 403.7700
6 2002 24510 192.0078
7 2005 24510 185.4144
8 2008 24510 138.2402
I'm probably going to want to add a bar chart behind it to show the weighting too, since relative change is much more dramatic for the smaller series. Tips on that are appreciated as well, but I have not searched for that yet because the above is the primary issue.
Appreciate your help.
James
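For example, with dplyr (first() picks the 1999 value as long as the rows are sorted by year within each fips, as they are in Totals):
library(dplyr)
dat <-
  Totals %>%
  group_by(fips) %>%
  mutate(ind = Emissions / first(Emissions))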
And using ggplot2 to plot a line chart:
library(ggplot2)
ggplot(dat, aes(x = year, y = ind, color = as.factor(fips))) +
geom_line()
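For comparison, here is a base R sketch of the same index, scaled so that 1999 equals 100 as the question asks. The data frame retypes the Totals output from the question; the column name index and the explicit sort are my additions, and fips is stored as character to keep the leading zero.

```r
# Totals retyped from the question; fips kept as character ("06037")
Totals <- data.frame(
  year      = rep(c(1999, 2002, 2005, 2008), times = 2),
  fips      = rep(c("06037", "24510"), each = 4),
  Emissions = c(6109.6900, 7188.6802, 7304.1149, 6421.0170,
                403.7700, 192.0078, 185.4144, 138.2402),
  stringsAsFactors = FALSE
)

# Sort so the first row of each fips group is the base year
Totals <- Totals[order(Totals$fips, Totals$year), ]

# Index each group to its first (1999) value, expressed in percent
Totals$index <- ave(Totals$Emissions, Totals$fips,
                    FUN = function(x) 100 * (x / x[1]))
```

Sorting first makes the choice of base year explicit instead of relying on the row order of the input.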
I am using xyplot in lattice to make a plot that shows temperature change over time alongside count data. I am not sure whether ggplot2 would be better. My data is arranged like this:
Year (1998 1998 1999 2000 2001 2001 2002)
Low (2.777778 8.333330 10.555556 4.444444 26.388889 15.555556 12.500000)
Geese (2 14 10 16 7 10 15)
State (Arkansas California California California California Florida California)
I am stuck at this part of the code:
xyplot(c(geese,low)~year,subset=state=="California", par.settings=bwtheme, auto.key=TRUE)
The plot draws geese and low (temperature) with the same type of point, and if I add a line there is no separation between the two. Any help with this would be awesome.
To plot multiple series on the same plot, use + rather than c() to specify multiple y values. For example:
xyplot(geese + low ~year, subset=state=="California", auto.key=TRUE, type="b")
That will produce
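A self-contained sketch of that call, assuming the question's values are stored in a data frame (here called birds, a hypothetical name) with lowercase column names matching the ones used in the xyplot() formula:

```r
library(lattice)

# Values copied from the question; the data frame name and the lowercase
# column names are assumptions made to match the xyplot() call
birds <- data.frame(
  year  = c(1998, 1998, 1999, 2000, 2001, 2001, 2002),
  low   = c(2.777778, 8.333330, 10.555556, 4.444444,
            26.388889, 15.555556, 12.500000),
  geese = c(2, 14, 10, 16, 7, 10, 15),
  state = c("Arkansas", "California", "California", "California",
            "California", "Florida", "California"),
  stringsAsFactors = FALSE
)

# geese + low ~ year draws the two columns as separate, keyed series
p <- xyplot(geese + low ~ year, data = birds,
            subset = state == "California",
            type = "b", auto.key = TRUE)
```

Calling print(p) draws the plot; passing data = avoids depending on loose vectors in the workspace.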
I am very new to R, and so this question is extremely elementary, but I can't solve it myself. I would very much appreciate your help.
This is the sort of data frame I want to use:
Period Value Cut.off
1 January 1998 - August 2002 8.798129 1.64
2 September 2002 - Jun 2006 4.267268 1.64
3 Jul 2006 - Dec 2009 7.280275 1.64
This is the code I am using:
require(ggplot2)
bq <- ggplot(data=glomor, aes(x=as.character(Period),y=Value))+geom_point()+ylim(0,10)
bq <- bq + scale_x_discrete(limits=c("January 1998 - August 2002","September 2002 - Jun 2006","Jul 2006 - Dec 2009"))
bq + geom_line()
I receive the following error message:
geom_path: Each group consist of only one observation. Do you need to adjust the group aesthetic?
How do I need to change the code, so that the points will be connected by a line?
You should add group=1 to your aes() call to connect the points with a line. This tells geom_line() that all your points belong to a single group and should be connected.
ggplot(data=glomor, aes(x=as.character(Period),y=Value,group=1))+
geom_point()+ylim(0,10) + geom_line()
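One thing to watch: without the scale_x_discrete(limits = ...) line from the question, a character x-axis is sorted alphabetically, which would put "Jul 2006 - Dec 2009" before "September 2002 - Jun 2006". A possible sketch that keeps the chronological order by turning Period into a factor (the data frame below retypes the rows from the question):

```r
library(ggplot2)

# Rows retyped from the question
glomor <- data.frame(
  Period  = c("January 1998 - August 2002",
              "September 2002 - Jun 2006",
              "Jul 2006 - Dec 2009"),
  Value   = c(8.798129, 4.267268, 7.280275),
  Cut.off = 1.64,
  stringsAsFactors = FALSE
)

# Fix the level order to the row order so the axis stays chronological
glomor$Period <- factor(glomor$Period, levels = unique(glomor$Period))

# group = 1 puts all points in one group so geom_line() can connect them
bq <- ggplot(glomor, aes(x = Period, y = Value, group = 1)) +
  geom_point() +
  geom_line() +
  ylim(0, 10)
```

Printing bq draws the three points joined by a single line in the order the periods appear in the data.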