Merging separate Month and Year columns to graph in ggplot2 - r

I've been around the forums looking for a solution to my issue but can't seem to find anything. Derivatives of my question and their answers haven't really helped either. My data has four columns, including one for Year and one for Month. I want to plot all the data in one graph in ggplot without faceting by year. This is what I have so far:
library(ggplot2)

df <- data.frame(Month = rep(c("January", "February", "March", "April", "May", "June",
                               "July", "August", "September", "October",
                               "November", "February", "March"), each = 20),
                 Year = rep(c("2018", "2019"), times = c(220, 40)),
                 Type = rep(c("C", "T"), 260),
                 Value = runif(260, min = 10, max = 55))
# Put months in calendar order rather than alphabetical order
df$Month <- ordered(df$Month, month.name)
df$Year <- ordered(df$Year)
ggplot(df) +
  geom_boxplot(aes(x = Month, y = Value, fill = Type)) +
  facet_wrap(~Year)
I'd ideally like to manage this using dplyr and lubridate. Any help would be appreciated!

One option would be to make a true date value; then you can use the date axis formatter. Something like this is a rough start:
ggplot(df) +
  geom_boxplot(aes(x = lubridate::mdy(paste(Month, 1, Year)), y = Value, fill = Type,
                   # group by date and Type so each month keeps one box per Type
                   group = interaction(lubridate::mdy(paste(Month, 1, Year)), Type))) +
  scale_x_date(date_breaks = "1 month", date_labels = "%m")
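
The same idea can be sketched in base R, under the assumption that the month names are English: `match()` against the built-in `month.name` gives the month number, and `as.Date()` builds the first day of each month. The `Date` column and the small example frame are made up for illustration.

```r
# Small stand-in for the question's data (values are illustrative only)
df <- data.frame(
  Month = c("January", "November", "February", "March"),
  Year  = c("2018", "2018", "2019", "2019"),
  Value = c(12.5, 30.1, 44.0, 21.7),
  stringsAsFactors = FALSE
)

# Month name -> month number (1-12), independent of the session locale
month_num <- match(as.character(df$Month), month.name)

# First day of each month as a real Date
df$Date <- as.Date(sprintf("%s-%02d-01", df$Year, month_num))

print(df[order(df$Date), c("Month", "Year", "Date")])
```

A `Date` column like this can be mapped to `x` and combined with `scale_x_date()`, so the months of 2018 and 2019 line up on one continuous axis.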

Do you mean this?
df<-data.frame(Month = rep(c("January", "February", "March", "April", "May", "June",
"July", "August", "September", "October",
"November", "February", "March"),each = 20),
Year = rep(c("2018", "2019"), times = c(220, 40)),
Type = rep(c("C", "T"), 260),
Value = runif(260, min = 10, max = 55))
# month.name holds the twelve English month names in calendar order
df$Month <- factor(df$Month, levels = month.name, ordered = TRUE)
df$Year<-ordered(df$Year)
df$Year_Month <- paste0(df$Month, " ", df$Year)
df$Year_Month <- factor(df$Year_Month, levels = unique(df$Year_Month))
ggplot(df) +
geom_boxplot(aes(x = Year_Month, y = Value, fill = Type))
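
Note that `levels = unique(df$Year_Month)` gives a chronological axis only because the rows already happen to be in date order. A hedged sketch that does not depend on row order: sort the labels by year and month number first, then use that order as the levels. The helper column `month_num` and the shuffled example frame are assumptions for illustration.

```r
# Deliberately shuffled rows (illustrative only)
df <- data.frame(
  Month = c("March", "January", "November", "February"),
  Year  = c("2019", "2018", "2018", "2019"),
  stringsAsFactors = FALSE
)

df$month_num  <- match(df$Month, month.name)
df$Year_Month <- paste(df$Month, df$Year)

# Order labels by year, then by month number, and use that order as the levels
ord <- order(as.integer(df$Year), df$month_num)
df$Year_Month <- factor(df$Year_Month, levels = unique(df$Year_Month[ord]))

print(levels(df$Year_Month))
```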

Related

How to avoid error "must be size" when changing month names to numbers?

I am trying to convert a character column containing month names into a numerical column containing numbers (1=January, etc.).
Example 1. This works:
df2 <- structure(list(Month = c("January", "January", "March", "March", "April")), class = "data.frame", row.names = c(NA, -5L))
df2 %>%
group_by(Month) %>%
mutate(Month = which(Month == month.name)) -> df2
Example 2. This does not work (error: "Month must be size 31 or 1, not 2."):
df <- structure(list(Month = c("August", "August", "August", "August",
"August", "August", "August", "August", "August", "August", "August",
"August", "August", "August", "August", "August", "August", "August",
"August", "August", "August", "August", "August", "August", "August",
"August", "August", "August", "August", "August", "August", "December",
"December", "December", "December")), class = "data.frame", row.names = c(NA, -35L))
df %>%
group_by(Month) %>%
mutate(Month = which(Month == month.name)) -> df
The code for converting is the same in both cases. Why doesn't it work in the second example? I can't get my head around it.
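
A likely explanation, sketched in base R: inside each group, `Month == month.name` recycles the 12 month names along the group's rows, so `which()` returns one position for every 12 rows or part thereof. A group of up to 12 rows yields a single index, which `mutate()` can recycle; the 31 rows of "August" yield two indices, hence "must be size 31 or 1, not 2". `match()` avoids this because it returns one index per element. The short `months_chr` vector is made up for illustration.

```r
# 31 rows of "August", as in Example 2
august <- rep("August", 31)

# Recycling month.name across 31 rows matches at rows 8 and 20,
# so which() returns two values instead of one value per row
# (R also warns that 31 is not a multiple of 12)
idx <- suppressWarnings(which(august == month.name))
print(idx)

# match() looks up each element on its own: one month number per row
months_chr <- c("August", "August", "December", "January")
month_num  <- match(months_chr, month.name)
print(month_num)
```

In the dplyr pipeline this would become `mutate(Month = match(Month, month.name))`, with no `group_by()` needed.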

Plot temperature over time (date)

I am trying to plot temperature over time (as dates), but I am not sure how.
Here is my data as R code:
dput(Average_temperature_period)
structure(list(Sample = c("ZS_IG_1", "AK_SN_1", "JP_IG_2", "AW_IG_1",
"SBB_SN_1", "AW_IG_2", "JvH_IG_3", "JvH_IG_2", "SBB_SN_4", "SBB_SN_3",
"SBB_SN_2", "EF_SN_1", "JP_IG_2", "JvH_IG_3", "EF_SN_1", "JvH_IG_2",
"AK_SN_1", "ZS_IG_1", "AW_IG_1", "SBB_SN_1", "AW_IG_2", "SBB_SN_4",
"SBB_SN_3", "SBB_SN_2"), Sampling_date = c("23/03/2022", "24/03/2022",
"25/03/2022", "25/03/2022", "25/03/2022", "25/03/2022", "29/03/2022",
"29/03/2022", "01/04/2022", "01/04/2022", "01/04/2022", "12/04/2022",
"25/04/2022", "26/04/2022", "28/04/2022", "29/04/2022", "03/05/2022",
"04/05/2022", "10/05/2022", "10/05/2022", "11/05/2022", "11/05/2022",
"12/05/2022", "12/05/2022"), Period = c("March", "March", "March",
"March", "March", "March", "March", "March", "March", "March",
"March", "March", "AprilMay", "AprilMay", "AprilMay", "AprilMay",
"AprilMay", "AprilMay", "AprilMay", "AprilMay", "AprilMay", "AprilMay",
"AprilMay", "AprilMay"), Average_temperature_field = c(7.137037037,
6.966666667, 10.55555556, 7.281481481, 6.874074074, 9.211111111,
9.662962963, 8.12962963, 6.707407407, 6.774074074, 7.162962963,
8.114814815, NA, 11.74814815, 13.51111111, 11.29259259, 15.4962963,
NA, 15.45925926, 17.14814815, 17.72592593, 15.84074074, 16.85555556,
19.78148148), Average_moisture_field = c(33.48518519, 47.35555556,
32.54814815, 34.01851852, 38.66666667, 31.71851852, 23.54814815,
26.83333333, 42.47777778, 29.45555556, 44.50740741, 40.27407407,
25.77407407, 18.91481481, 26.67777778, 16.27407407, 25.38518519,
19.9962963, 18.27777778, 16.14074074, 22.86666667, 23.48518519,
13.93703704, 20.92222222)), row.names = c(NA, 24L), class = "data.frame")
See here my code in R thus far:
##### Soil temperature graph
Average_temperature_period <- read.csv("~/Desktop/First Internship/MicroResp/R/R script/Average_temperature_period.csv")
Average_temperature_period$Sampling_date <- as.character(Average_temperature_period$Sampling_date)
Average_temperature_period <- Average_temperature_period[c(1:24),c(1:5)]
# Change order x axis (past to present)
Average_temperature_period$Sampling_date <- factor(Average_temperature_period$Sampling_date, levels = c("23/03/22","24/03/22","25/03/22","29/03/22","01/04/22","12/04/22","25/04/22","26/04/22","28/04/22","29/04/22","03/05/22","04/05/22","10/05/22","11/05/22","12/05/22"))
# Plot average temperature against the date
ggplot(data=Average_temperature_period, aes(x=Sampling_date, y=Average_temperature_field)) +
geom_smooth(method = "lm", se=FALSE, color="black", aes(group=1)) +
theme_classic() +
ylab("Average soil temperature (°C)") +
xlab("Sampling date")
The x axis keeps showing 'NA' for the sampling date. Does anyone know why and how to fix it? I would like the x axis ordered by date (past to present).

Update with the new data and request of OP:
Add the line drop_na(Average_temperature_field) %>% to remove rows without a temperature before averaging:
library(tidyverse)
library(lubridate)
df %>%
drop_na(Average_temperature_field) %>%
mutate(Sampling_date = dmy(Sampling_date)) %>%
group_by(Sampling_date) %>%
summarise(avg_temp_day = mean(Average_temperature_field,na.rm = TRUE)) %>%
ggplot(aes(x = Sampling_date, y=avg_temp_day))+
geom_point()+
geom_line()+
scale_x_date(date_labels="%d %b",date_breaks ="1 day")+
theme_bw()+
theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust=1))
First answer:
Here is one way to do it. You sometimes have several temperatures per day, so I used the daily mean:
library(tidyverse)
library(lubridate)
df %>%
mutate(Sampling_date = dmy(Sampling_date)) %>%
group_by(Sampling_date) %>%
summarise(avg_temp_day = mean(Average_temperature_field,na.rm = TRUE)) %>%
ggplot(aes(x = Sampling_date, y=avg_temp_day))+
geom_point()+
geom_line()+
scale_x_date(date_labels="%d %b",date_breaks ="2 day")+
theme_bw()+
theme(axis.text.x = element_text(angle = 45, vjust = 0.5, hjust=1))
data:
df <- structure(list(Sample = c("ZS_IG_1", "AK_SN_1", "JP_IG_2", "AW_IG_1",
"SBB_SN_1", "AW_IG_2", "JvH_IG_3", "JvH_IG_2", "SBB_SN_4", "SBB_SN_3",
"SBB_SN_2", "EF_SN_1", "JP_IG_2", "JvH_IG_3", "EF_SN_1", "JvH_IG_2",
"AK_SN_1", "ZS_IG_1", "AW_IG_1", "SBB_SN_1", "AW_IG_2", "SBB_SN_4",
"SBB_SN_3", "SBB_SN_2"), Sampling_date = c("23/03/2022", "24/03/2022",
"25/03/2022", "25/03/2022", "25/03/2022", "25/03/2022", "29/03/2022",
"29/03/2022", "01/04/2022", "01/04/2022", "01/04/2022", "12/04/2022",
"25/04/2022", "26/04/2022", "28/04/2022", "29/04/2022", "03/05/2022",
"04/05/2022", "10/05/2022", "10/05/2022", "11/05/2022", "11/05/2022",
"12/05/2022", "12/05/2022"), Period = c("March", "March", "March",
"March", "March", "March", "March", "March", "March", "March",
"March", "March", "AprilMay", "AprilMay", "AprilMay", "AprilMay",
"AprilMay", "AprilMay", "AprilMay", "AprilMay", "AprilMay", "AprilMay",
"AprilMay", "AprilMay"), Average_temperature_field = c(33.48518519,
47.35555556, 32.54814815, 34.01851852, 38.66666667, 31.71851852,
23.54814815, 26.83333333, 42.47777778, 29.45555556, 44.50740741,
40.27407407, 25.77407407, 11.74814815, 13.51111111, 11.29259259,
15.4962963, 19.9962963, 15.45925926, 17.14814815, 17.72592593,
15.84074074, 16.85555556, 19.78148148), Average_moisture_field = c(7.137037037,
6.966666667, 10.55555556, 7.281481481, 6.874074074, 9.211111111,
9.662962963, 8.12962963, 6.707407407, 6.774074074, 7.162962963,
8.114814815, NA, 18.91481481, 26.67777778, 16.27407407, 25.38518519,
NA, 18.27777778, 16.14074074, 22.86666667, 23.48518519, 13.93703704,
20.92222222)), class = "data.frame", row.names = c(NA, -24L))
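
As for why the original plot showed 'NA': the `factor()` call in the question lists its levels with two-digit years ("23/03/22") while the data holds four-digit years ("23/03/2022"), so no value matches a level and every date becomes `NA`. A small base R sketch of the mismatch, and of parsing the strings as dates instead (the short vector is an excerpt of the question's dates):

```r
sampling_date <- c("23/03/2022", "24/03/2022", "01/04/2022", "12/05/2022")

# Levels with two-digit years never match the four-digit strings
as_factor <- factor(sampling_date,
                    levels = c("23/03/22", "24/03/22", "01/04/22", "12/05/22"))
print(as_factor)   # all NA

# Parsing with an explicit day/month/year format gives real, sortable dates
as_date <- as.Date(sampling_date, format = "%d/%m/%Y")
print(sort(as_date))
```

This is the base R counterpart of `dmy()` in the answer above; once the column is a `Date`, ggplot orders the axis chronologically without any manual level list.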

change graph according to slider position

Here is my code. I am trying to make an R Shiny page that shows the mean symptoms in the past 30 days versus the state, based on the slider position (1 to 12, for month). I know it is a little sloppy, but I almost have it. I can get a graph whose title changes with the month on the slider, but the graph plots all of the data rather than only the selected month. Any help would be great.
`asthma = read.csv("AsthmaChild.Ozone.2006_2007.Sample.csv")
state.month = asthma[,-3:-10]
state.month = state.month[,-4]
state.month = aggregate(state.month$Symptoms.Past30D ~ state.month$STATE +
state.month$Month, state.month, mean)
colnames(state.month) = c("STATE", "Month", "Symptoms.Past30D")
sd = asthma[,-3:-10]
sd = sd[,-4]
sd = aggregate(sd$Symptoms.Past30D ~ sd$STATE + sd$Month, sd, function(x)
sd = sd(x))
colnames(sd) = c("STATE", "Month", "sd")
merged = merge(state.month,sd, by=c("STATE", "Month"))
df = count(asthma, "STATE", "Month")
colnames(df) = c("STATE","Freq")
data = merge(df, merged,by=c("STATE"))
data$sem = (data$sd)/(sqrt(data$Freq))
merged = data
merged$ConfUp = (merged$Symptoms.Past30D) + (merged$sem)
merged$ConfDown = (merged$Symptoms.Past30D) - (merged$sem)
merged$Month = as.character(merged$Month)
merged$Month = gsub("12", "December", merged$Month)
merged$Month = gsub("11", "November", merged$Month)
merged$Month = gsub("10", "October", merged$Month)
merged$Month = gsub("9", "September", merged$Month)
merged$Month = gsub("8", "August", merged$Month)
merged$Month = gsub("7", "July", merged$Month)
merged$Month = gsub("6", "June", merged$Month)
merged$Month = gsub("5", "May", merged$Month)
merged$Month = gsub("4", "April", merged$Month)
merged$Month = gsub("3", "March", merged$Month)
merged$Month = gsub("2", "February", merged$Month)
merged$Month = gsub("1", "January", merged$Month)
index = c(1:12)
values = c("January", "February", "March", "April", "May", "June", "July",
"August", "September", "October", "November", "December")
ui = fluidPage(
sidebarPanel(
sliderInput("Month", "Month: Jan=1, Dec=12",min = 1, max =
12,step=1,value=1)),
mainPanel(plotOutput("plot")))
server = function(input,output){
sliderInput(inputId="Month",
label="Month: Jan=1, Dec=12",
min = 1,
max = 12,
value=1,
step=1)
mainPanel(plotOutput("plot"))
dat = reactive({
test <- merged[merged$Month %in%
seq(from=min(input$Month),to=max(input$Month),by=1),]
})
output$plot = renderPlot({
ggplot(data=merged, aes(x=Symptoms.Past30D, y = STATE)) +
geom_errorbarh(aes(xmin=ConfUp,xmax=ConfDown), height=1, linetype = 1) +
xlab ("Mean Sympotms.Past30D (SEM)") + ylab ("STATE") +
labs(title=paste(values[match(input$Month, index)]))
})
}
shinyApp(ui, server)`
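
Two details in the code above probably explain the behaviour: `renderPlot()` plots `merged` rather than the reactive `dat()`, and `merged$Month` holds month names after the `gsub()` calls while the slider returns a number, so the `%in%` filter never matches. A minimal sketch of the filtering step in base R follows; `filter_by_month()` and the toy data frame are hypothetical, not part of the original app.

```r
# Toy stand-in for `merged` after the month numbers were replaced by names
merged <- data.frame(
  STATE = c("AZ", "AZ", "CA", "CA"),
  Month = c("January", "February", "January", "February"),
  Symptoms.Past30D = c(2.1, 3.4, 1.8, 2.9),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep only the rows for the slider's month (1-12)
filter_by_month <- function(data, month_index) {
  data[data$Month == month.name[month_index], , drop = FALSE]
}

res <- filter_by_month(merged, 2)
print(res)

# Inside the Shiny server this could be used roughly as:
#   dat <- reactive(filter_by_month(merged, input$Month))
#   output$plot <- renderPlot(ggplot(data = dat(), aes(...)) + ...)
```

The `sliderInput()` and `mainPanel()` calls inside `server` build UI objects that are never used there, so they could be dropped.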

How to order by Month? [duplicate]

I have this data frame called yy:
structure(list(Time = structure(c(1209096000, 1238731200, 1272600000,
1301666400, 1335794400, 1364835600, 1218772800, 1250222400, 1280808000,
1314028800, 1346421600, 1377835200, 1229317200, 1262235600, 1291352400,
1324047600, 1355497200, 1385960400, 1204261200, 1235710800, 1265950800,
1298646000, 1328281200, 1360940400, 1199250000, 1232082000, 1263186000,
1295017200, 1326466800, 1357149600), class = c("POSIXct", "POSIXt"
), tzone = ""), Peak_Logons = c(472452L, 398061L, 655849L, 881689L,
873720L, 1278295L, 340464L, 520943L, 1995150L, 883184L, 931721L,
1098553L, 405193L, 638301L, 734635L, 1254951L, 962391L, 1126432L,
316200L, 477407L, 674884L, 812793L, 898550L, 1541478L, 291564L,
394967L, 902076L, 916832L, 878264L, 918102L), Year = c("2008",
"2009", "2010", "2011", "2012", "2013", "2008", "2009", "2010",
"2011", "2012", "2013", "2008", "2009", "2010", "2011", "2012",
"2013", "2008", "2009", "2010", "2011", "2012", "2013", "2008",
"2009", "2010", "2011", "2012", "2013"), Month = c("April", "April",
"April", "April", "April", "April", "August", "August", "August",
"August", "August", "August", "December", "December", "December",
"December", "December", "December", "February", "February", "February",
"February", "February", "February", "January", "January", "January",
"January", "January", "January")), .Names = c("Time", "Peak_Logons",
"Year", "Month"), row.names = c(35479L, 30535L, 23645L, 15248L,
49696L, 8077L, 24651L, 13098L, 20204L, 47450L, 41228L, 20740L,
28049L, 9739L, 2636L, 50230L, 43746L, 3435L, 38091L, 28351L,
7382L, 3343L, 47824L, 45481L, 23951L, 29664L, 10024L, 4545L,
38808L, 44205L), class = "data.frame")
What I would like to do is create a heat map, Year on the y-axis and Month on the x-axis.
I am doing this:
ggplot(yy ,aes(Month, Year, fill=Peak_Logons)) +
geom_tile() +
theme_bw() +
guides(fill = guide_legend(keywidth = 5, keyheight = 1)) +
theme(axis.text.x = element_text(size=10, angle=45, hjust=1))
This kind of works, but the months on the x-axis are not ordered January, February, March, April ... December.
They are ordered alphabetically: April, August, etc.
How would I order the x-axis from January to December?
Is there a way to change the default colors? It looks like it is using blue shades.
Can I insert text in the geom_tile tiles? I would like to show Time and Peak_Logons inside the tiles.
I would really appreciate any insight.

Use reorder to arrange your axis labels. I create a new column with the month index.
Use geom_text to add text. You may need to adjust the text size.
Use scale_fill_gradient to change the fill color. See also scale_fill_gradientn.
dat.m <- data.frame(Month=months(seq(as.Date("2000/1/1"),
by = "month", length.out = 12)),month.id = 1:12)
yy <- merge(yy,dat.m)
library(ggplot2)
ggplot(yy ,aes(reorder(Month,month.id), Year, fill=Peak_Logons)) +
geom_tile() +
theme_bw() +
guides(fill = guide_legend(keywidth = 5, keyheight = 1)) +
theme(axis.text.x = element_text(size=10, angle=45, hjust=1)) +
geom_text(aes(label=paste(Peak_Logons,format(Time,"%H"),sep='-'))) +
scale_fill_gradient(low = "yellow", high = "red")

You seem to consider months an ordered factor. You should make it one in R:
Month = c("April", "April",
"April", "April", "April", "April", "August", "August", "August",
"August", "August", "August", "December", "December", "December",
"December", "December", "December", "February", "February", "February",
"February", "February", "February", "January", "January", "January",
"January", "January", "January")
Month.ordered <- ordered(Month, month.name)
#[1] April April April April April April August August August August August August December December December December
#[17] December December February February February February February February January January January January January January
#Levels: January < February < March < April < May < June < July < August < September < October < November < December
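
To carry that ordering into the plot, the conversion can be applied to the data frame column before calling `ggplot()`. A minimal sketch with a small excerpt of `yy` (only three months and two years are kept, and the `Time` column is left out); the plotting part runs only if ggplot2 is installed:

```r
# Small excerpt shaped like `yy`
yy <- data.frame(
  Year  = c("2008", "2008", "2008", "2009", "2009", "2009"),
  Month = c("April", "August", "January", "April", "August", "January"),
  Peak_Logons = c(472452L, 340464L, 291564L, 398061L, 520943L, 394967L),
  stringsAsFactors = FALSE
)

# Calendar order instead of alphabetical order
yy$Month <- factor(yy$Month, levels = month.name)

# Months present in the data, now sorted by calendar position
month_order <- as.character(sort(unique(yy$Month)))
print(month_order)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(yy, aes(Month, Year, fill = Peak_Logons)) +
    geom_tile() +
    geom_text(aes(label = Peak_Logons)) +
    scale_fill_gradient(low = "yellow", high = "red")
  print(p)
}
```

Because the order lives in the factor levels, no `reorder()` call or helper index column is needed in `aes()`.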
