Converting a Character to HMS in R - r

How can I convert a character value to hms so that I can accurately plot the mean of a series of album lengths stored in hh:mm:ss format? Here's my code, which doesn't work:
#Create labels for the mean
meanHours <- music %>%
select(AlbumHours) %>%
summarise(hrsMean = albumHRSmean, na.rm=TRUE) %>%
strptime(hrsMean, "%H:%M:%S") %>%
mutate(label_hrsMean = paste0("Mean of Hours: ",(hrsMean)))

We could use the hms() function from the lubridate package:
library(lubridate)
library(dplyr)
meanHours <- music %>%
select(AlbumHours) %>%
summarise(hrsMean = albumHRSmean, na.rm=TRUE) %>%
mutate(hrsMean = hms(hrsMean),
label_hrsMean = paste0("Mean of Hours: ",(hrsMean)))
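If the mean has to be computed from the lengths themselves, one option is to go through seconds. This is only a minimal sketch, assuming a hypothetical character vector album_lengths in hh:mm:ss format (illustrative values, not the asker's music data):

```r
library(lubridate)

# Hypothetical album lengths in hh:mm:ss (illustrative data, not from the question)
album_lengths <- c("00:45:30", "01:10:00", "00:52:15")

# Parse to Period objects, then convert to plain seconds so mean() is well defined
secs <- period_to_seconds(hms(album_lengths))
mean_secs <- as.integer(round(mean(secs, na.rm = TRUE)))

# Rebuild an hh:mm:ss string to use as the plot label
mean_label <- sprintf("%02d:%02d:%02d",
                      mean_secs %/% 3600L,
                      (mean_secs %% 3600L) %/% 60L,
                      mean_secs %% 60L)
label_hrsMean <- paste0("Mean of Hours: ", mean_label)
```

The numeric mean_secs is the value that would sit on a numeric axis; the string is only meant for the label.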

Related

How do I sort dates by month in R?

New R student here. I am trying to sort data by month. Here is a sample of the data I need to use, followed by the code, then my results. Any tips for how to accomplish this? I'm super stuck...
This is the latest code I have been trying:
library(readr)
weather <- read_csv("R/weather.csv", col_types = cols(High = col_number(),
Low = col_number(), Precip = col_number(),
Snow = col_number(), Snowd = col_integer()))
View(weather)
library(ggplot2)
library(ggridges)
library(dplyr)
library(lubridate)
class(weather) #what class is dataset = dataframe
head(weather) #structure of the dataset
weather.month <- weather %>% # Group data by month
mutate(weather, 'month') %>%
group_by(month = lubridate::floor_date(weather$Day, 'month')) %>%
summarise(weather.month$High)
These are the errors I get:
Any help getting through this would be greatly appreciated!!!
The code can be fixed by converting Day to the Date class (with mdy or dmy, since it is not clear whether the format is month-day-year or day-month-year), then grouping by floor_date with 'month' and summarising the High column:
library(dplyr)
library(lubridate)
weather %>%
group_by(month = lubridate::floor_date(mdy(Day), 'month')) %>%
summarise(High = sum(High, na.rm = TRUE))
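The same idea on a tiny made-up data frame may make the result easier to see. This is a sketch that assumes Day is stored as month/day/year text. The toy values and the mean_high column are illustrative and do not come from the asker's weather.csv:

```r
library(dplyr)
library(lubridate)

# Hypothetical sample, deliberately out of chronological order
weather_toy <- data.frame(
  Day  = c("2/3/2020", "1/5/2020", "1/20/2020"),
  High = c(50, 30, 40)
)

monthly <- weather_toy %>%
  mutate(Day = mdy(Day)) %>%                      # character -> Date (assumed m/d/y)
  group_by(month = floor_date(Day, "month")) %>%  # first day of each month
  summarise(mean_high = mean(High, na.rm = TRUE)) %>%
  arrange(month)                                  # Date column, so this is chronological
```

Because month is a real Date rather than a month name, arranging by it gives calendar order instead of alphabetical order.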

Grouping and Summing Data by Irregular Time Intervals (R language)

I am looking at a stackoverflow post over here: R: Count Number of Observations within a group
Here, daily data is created and summed/grouped at monthly intervals (as well as weekly intervals):
library(xts)
library(dplyr)
#create data
date_decision_made = seq(as.Date("2014/1/1"), as.Date("2016/1/1"),by="day")
date_decision_made <- format(as.Date(date_decision_made), "%Y/%m/%d")
property_damages_in_dollars <- rnorm(731,100,10)
final_data <- data.frame(date_decision_made, property_damages_in_dollars)
# weekly
weekly = final_data %>%
mutate(date_decision_made = as.Date(date_decision_made)) %>%
group_by(week = format(date_decision_made, "%W-%y")) %>%
summarise( total = sum(property_damages_in_dollars, na.rm = TRUE), Count = n())
# monthly
final_data %>%
mutate(date_decision_made = as.Date(date_decision_made)) %>%
group_by(week = format(date_decision_made, "%Y-%m")) %>%
summarise( total = sum(property_damages_in_dollars, na.rm = TRUE), Count = n())
It seems that the "format" function in R (https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/format) is being used to instruct the computer to "group and sum" the data by some fixed interval.
My question: is there a way to "instruct" the computer to "group and sum" by irregular intervals? E.g. by 11 day periods, by 3 month periods, by 2 year periods? (I guess 3 months can be written as 90 days...2 years can be written as 730 days).
Is this possible?
Thanks
You can use lubridate's ceiling_date/floor_date to create groups at irregular intervals.
library(dplyr)
library(lubridate)
final_data %>%
mutate(date_decision_made = as.Date(date_decision_made)) %>%
group_by(group = ceiling_date(date_decision_made, '11 days')) %>%
summarise(amount = sum(property_damages_in_dollars))
You can also specify intervals like ceiling_date(date_decision_made, '3 years') or ceiling_date(date_decision_made, '2 months').
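If the bins should instead be counted from the first observation (days 1 to 11, 12 to 22, and so on), plain integer arithmetic on the dates is another option. This is a rough sketch on made-up data. The names toy, bin and bin_start are hypothetical, and the approach is an alternative to the rounding functions, not a description of how they work:

```r
library(dplyr)

# Hypothetical daily data: 25 consecutive days with a constant amount
toy <- data.frame(
  date    = seq(as.Date("2014-01-01"), by = "day", length.out = 25),
  damages = rep(10, 25)
)

binned <- toy %>%
  mutate(
    bin       = as.integer(date - min(date)) %/% 11,  # 0, 1, 2, ... per 11-day block
    bin_start = min(date) + bin * 11                  # first day of each block
  ) %>%
  group_by(bin_start) %>%
  summarise(total = sum(damages), count = n())
```

Changing 11 to any other number of days gives a different fixed-width window. The last bin is simply shorter when the data do not divide evenly.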
Using data.table
library(data.table)
library(lubridate)
setDT(final_data)[, .(amount = sum(property_damages_in_dollars)),
keyby = .(group = ceiling_date(as.IDate(date_decision_made), "11 days"))]

R Lubridate is ordering months and weekdays alphabetically by default

I'm using Lubridate in R Studio and when I use group by (with dplyr) to group by months or weekdays it sorts it automatically in alphabetical order. How can I change this to date order?
Here is the code:
df %>% group_by(months(DateColumn)) %>% summarise(Freq=n())
DateColumn has the following structure:
When I view the result this is the order. (Same for plots)
After the summarise step, we can arrange the rows by matching with the inbuilt month.name (months in the correct order), and then convert the 'Months' to a factor with levels specified (so that it can be used later in ggplot to order in the same order as the levels)
library(tidyverse)
df %>%
group_by(Months = months(DateColumn)) %>%
summarise(n = n()) %>%
arrange(match(Months, month.name)) %>%
mutate(Months = factor(Months, levels = Months))
data
df <- data.frame(DateColumn = seq(as.POSIXct("2015-05-10"),
length.out = 30, by = '1 month'))
Using data from @akrun's answer. Here is an alternative:
library(dplyr)
library(lubridate)
df <- data.frame(DateColumn = seq(as.POSIXct("2015-05-10"),
length.out = 30, by = '1 month'))
df %>%
mutate(Date = month(DateColumn, label = TRUE), ID = row_number()) %>%
group_by(Date) %>%
arrange(Date) %>%
select(-DateColumn)
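To get the frequency table from the question directly in calendar order, the labelled month can be used as the grouping variable, since month(label = TRUE) returns an ordered factor. This is a sketch on a Date version of the sample data above. The name df_dates is hypothetical, and the month labels depend on the session locale:

```r
library(dplyr)
library(lubridate)

# 30 consecutive months starting May 2015 (same shape as the sample data)
df_dates <- data.frame(
  DateColumn = seq(as.Date("2015-05-10"), by = "1 month", length.out = 30)
)

# Ordered-factor levels run January to December, so groups come out in that order
freq <- df_dates %>%
  count(Months = month(DateColumn, label = TRUE, abbr = FALSE), name = "Freq")
```

The same pattern should work for weekdays with wday(DateColumn, label = TRUE), and ggplot2 follows factor levels on a discrete axis.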

Convert character to date and numeric, maintain same format

In the output of the code below, the variables day and sales are in the format that I need but not the type: both come out as chr. They should be date and num respectively. I've tried many things, but I get either chr or some sort of error. For instance, using as.Date() doesn't change the variable day to the format "%d/%m/%Y". The code with sample data:
library(dplyr)
library(lubridate)
df <- data.frame(matrix(c("2017-09-04","2017-09-05",103,104,17356,18022),ncol = 3, nrow = 2))
colnames(df) <- c("DATE","ORDER_ID","SALES")
df$DATE <- as.Date(df$DATE, format = "%Y-%m-%d")
df$SALES <- as.numeric(as.character(df$SALES))
df$ORDER_ID <- as.numeric(as.character(df$ORDER_ID))
TOTALSALES <- df %>%
select(ORDER_ID,DATE,SALES) %>%
mutate(weekday = wday(DATE, label=TRUE)) %>%
mutate(DATE=as.Date(DATE)) %>%
filter(!wday(DATE) %in% c(1, 7) & !(DATE %in% as.Date(c('2017-01-02','2017-02-27','2017-02-28','2017-04-14'))) ) %>%
group_by(day=floor_date(DATE,"day")) %>%
summarise(sales=sum(SALES)) %>%
data.frame()
TOTALSALES$day <- TOTALSALES$day %>%
as.POSIXlt(, tz="America/Sao_Paulo") %>%
format("%d/%m/%Y")
TOTALSALES$sales <- TOTALSALES$sales %>%
format(digits=9, decimal.mark=",",nsmall=2,big.mark = ".")
TOTALSALES$day <- as.Date(df$DATE, format = "%d/%m/%Y")
Any idea how I can solve this problem, or a pointer to how it should be done?
I'd appreciate any help.
I'm not sure I understand your question.
To print a Date object in a particular date-time format, you can use format:
# This *converts* a character vector/factor to a vector of Dates
df$DATE <- as.Date(df$DATE, format = "%Y-%m-%d")
# This *prints* the Date vector as a character vector with format "%d/%m/%Y"
format(df$DATE, format = "%d/%m/%Y")
Minimal example
ss <- c("2017-09-04","2017-09-05")
date <- as.Date(ss, format = "%Y-%m-%d")
format(date, format = "%d/%m/%Y")
#[1] "04/09/2017" "05/09/2017"
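One way to apply this to the data frame in the question is to keep the typed columns for computation and build a separate character copy only for display. This is a sketch with hypothetical object names (sales_typed, sales_display) and the two sample rows from the question:

```r
# Typed data: Date and numeric, suitable for filtering, grouping and plotting
sales_typed <- data.frame(
  day   = as.Date(c("2017-09-04", "2017-09-05")),
  sales = c(17356, 18022)
)

# Display copy: format() always returns character, so it lives in its own object
sales_display <- data.frame(
  day   = format(sales_typed$day, "%d/%m/%Y"),
  sales = format(sales_typed$sales, nsmall = 2, big.mark = ".", decimal.mark = ","),
  stringsAsFactors = FALSE
)

str(sales_typed)    # day: Date, sales: num
str(sales_display)  # both chr
```

A Date has no display format of its own, so a column cannot be both a Date and stored as "%d/%m/%Y" text. The formatting step belongs at the point of printing or export.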

function applied to summarise + group_by doesn't work correctly

I extract my data
fluo <- read.csv("data/ctd_SOMLIT.csv", sep=";", stringsAsFactors=FALSE)
I split the original date (Y-m-d) into three columns: the day, the month and the year.
fluo$day <- day(as.POSIXlt(fluo$DATE, format = "%Y-%m-%d"))
fluo$month <- month(as.POSIXlt(fluo$DATE, format = "%Y-%m-%d"))
fluo$year <- year(as.POSIXlt(fluo$DATE, format = "%Y-%m-%d"))
This is a part of my data_frame:
Then I use summarise and group_by to apply this expression:
prof_DCM = fluo[max(fluo$FLUORESCENCE..Fluorescence.),2]
I want the depth at which the measured FLUORESCENCE is highest, for each month of each year.
mean_fluo <- summarise(group_by(fluo, month, year),
prof_DCM = fluo[max(fluo$FLUORESCENCE..Fluorescence.),2])
mean_fluo <- arrange(mean_fluo, year, month)
View(mean_fluo)
But it's not working.
The value of prof_DCM is the same all the way down column 3 of the data frame:
Maybe try the following code.
library(dplyr)
mean_fluo <- fluo %>%
group_by(month,year) %>%
filter(FLUORESCENCE..Fluorescence. == max(FLUORESCENCE..Fluorescence.)) %>%
arrange(year,month)
View(mean_fluo)
You can choose which variables to keep with select():
mean_fluo <- fluo %>%
group_by(month,year) %>%
filter(FLUORESCENCE..Fluorescence. == max(FLUORESCENCE..Fluorescence.)) %>%
arrange(year,month)%>%
select(c(month,year,PROFONDEUR))
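The original attempt returns one repeated value because fluo$FLUORESCENCE..Fluorescence. inside summarise refers to the whole data frame rather than the current group, and max() of it is then used as a row number. As an alternative to the filter above, slice_max (available in dplyr 1.0.0 and later) keeps one row per group. This is a sketch on made-up data with hypothetical column names depth and fluorescence:

```r
library(dplyr)

# Hypothetical profile: two depths measured in each of two months
fluo_toy <- data.frame(
  year         = c(2015, 2015, 2015, 2015),
  month        = c(1, 1, 2, 2),
  depth        = c(5, 10, 5, 10),
  fluorescence = c(0.2, 0.9, 0.7, 0.3)
)

dcm <- fluo_toy %>%
  group_by(year, month) %>%
  slice_max(fluorescence, n = 1, with_ties = FALSE) %>%  # row with the group maximum
  ungroup() %>%
  select(year, month, depth) %>%
  arrange(year, month)
```

With with_ties = FALSE each month yields exactly one row even when the maximum is reached at several depths. The filter version keeps every tied row.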

Resources