Convert quarter-year to last date of quarter in R

I have an issue when I use as.Date(as.yearqtr(test[,1], format = "%qQ%Y"), frac = 1): it returns an error, and the quarter-year is not converted to a date. The error is:
error in as.yearqtr(as.numeric(x)) (list) object cannot be coerced to type 'double'
This is my dataframe in R.
TIME VALUE
1Q2019 1
2Q2019 2
3Q2019 3
4Q2019 4
The ideal output is
TIME VALUE
2019-03-31 1
2019-06-30 2
2019-09-30 3
2019-12-31 4

We can convert to Date with zoo's as.yearqtr and get the last date of the quarter with frac = 1. A regular expression first rearranges the strings into a format that zoo parses by default ("2019 Q1"):
library(zoo)
df$TIME <- as.Date(as.yearqtr(gsub("(\\d)(Q)(\\d{1,})", "\\3 Q\\1", df$TIME)), frac = 1)
df
TIME VALUE
1 2019-03-31 1
2 2019-06-30 2
3 2019-09-30 3
4 2019-12-31 4
Data:
df <-structure(list(TIME = structure(1:4, .Label = c("1Q2019", "2Q2019",
"3Q2019", "4Q2019"), class = "factor"), VALUE = 1:4), class = "data.frame", row.names = c(NA,
-4L))

Here is a function that will return a vector of dates, given an input vector in the form of 1Q2019...
dateStrings <- c("1Q2019","2Q2019","3Q2019","4Q2019","1Q2020")
library(lubridate)
lastDayOfQuarter <- function(x) {
  # Month and day on which each quarter (1-4) ends
  months <- c(3, 6, 9, 12)
  days <- c(31, 30, 30, 31)
  qtr <- as.numeric(substr(x, 1, 1))  # leading digit is the quarter
  year <- substr(x, 3, 6)             # characters 3-6 are the year
  # mdy() is vectorized, so no loop or as.Date() round trip is needed
  mdy(paste(months[qtr], days[qtr], year, sep = "-"))
}
lastDayOfQuarter(dateStrings)
and the output:
> lastDayOfQuarter(dateStrings)
[1] "2019-03-31" "2019-06-30" "2019-09-30" "2019-12-31" "2020-03-31"
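If you would rather avoid extra packages, the same idea can be sketched in base R: take the first day of the following quarter and subtract one day. This is only a sketch, and it assumes the input always looks like 1Q2019 (one quarter digit, the letter Q, a four-digit year). The name quarter_end is made up for this example.

```r
# Hypothetical helper: last day of the quarter for strings like "1Q2019"
quarter_end <- function(x) {
  x <- as.character(x)                   # also handles factor input
  qtr <- as.integer(substr(x, 1, 1))     # leading digit is the quarter
  year <- as.integer(substr(x, 3, 6))    # characters 3-6 are the year
  next_month <- qtr * 3L + 1L            # first month of the following quarter
  roll <- next_month > 12L               # Q4 rolls over into January
  year <- year + roll
  next_month[roll] <- 1L
  # First day of the next quarter, minus one day
  as.Date(sprintf("%04d-%02d-01", year, next_month)) - 1
}

quarter_end(c("1Q2019", "2Q2019", "3Q2019", "4Q2019"))
```

Because the result is computed from the calendar rather than from a table of month lengths, nothing has to be hard-coded for the end of each quarter.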

Related

converting to hh:mm:ss format in r

How would I convert seconds into h/m/s format? I've tried to use seconds_to_period, but it only gives the value in seconds, e.g.
ID Time
1 345 secs
2 121 secs
3 78 secs
I want this in HH:MM:SS format. How is this done?
We may use hms() from the hms package after converting to a period:
library(lubridate)
df1$Time <- hms::hms(seconds_to_period(readr::parse_number(df1$Time)))
-output
> df1
ID Time
1 1 00:05:45
2 2 00:02:01
3 3 00:01:18
data
df1 <- structure(list(ID = 1:3, Time = c("345 secs", "121 secs", "78 secs"
)), class = "data.frame", row.names = c(NA, -3L))
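As a package-free alternative, the arithmetic can be written out by hand with integer division and sprintf. This is a sketch that assumes every value has the form "<number> secs" with whole seconds; the name secs_to_hms is made up for this example, and the result is a character vector rather than an hms object.

```r
# Hypothetical helper: "345 secs" -> "00:05:45"
secs_to_hms <- function(x) {
  secs <- as.integer(sub(" secs$", "", x))  # strip the unit suffix
  sprintf("%02d:%02d:%02d",
          secs %/% 3600L,                   # whole hours
          (secs %% 3600L) %/% 60L,          # remaining whole minutes
          secs %% 60L)                      # remaining seconds
}

df1 <- data.frame(ID = 1:3, Time = c("345 secs", "121 secs", "78 secs"))
df1$Time <- secs_to_hms(df1$Time)
df1
```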

Converting Date to Name

I have sampling dates in a data frame, as shown in this sample:
Date Temp
2016-06-11 5
2017-08-19 12
2018-01-21 13
2019-04-28 7
The date column is in numeric format currently. I want to convert the numeric month (e.g. 06) into its full name (e.g. June) but am having trouble with the conversion.
I did check the converting dates to names question but was confused by the select DATENAME.
You may simply use months(). Example:
d <- transform(d, date.m = months(date))
d
# date x date.m
# 1 2020-10-01 -1.1390886 October
# 2 2020-11-01 -0.6872151 November
# 3 2020-12-01 1.0632769 December
# 4 2021-01-01 1.7351265 January
Note: If your column is not of class "Date", you also need to wrap it in as.Date:
d <- transform(d, date.m = months(as.Date(date)))
Data:
d <- structure(list(date = structure(c(18536, 18567, 18597, 18628), class = "Date"),
x = c(-1.13908860117162, -0.687215137639502, 1.06327693201579,
1.73512650928455)), class = "data.frame", row.names = c(NA,
-4L))
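One thing to keep in mind is that months() returns names in the current locale. If you need English names regardless of locale, a possible sketch is to index the built-in month.name constant by the month number. The data frame below just rebuilds the sample from the question; the column name Month is an assumption.

```r
# Sample data from the question, with Date stored as text
temps <- data.frame(
  Date = c("2016-06-11", "2017-08-19", "2018-01-21", "2019-04-28"),
  Temp = c(5, 12, 13, 7)
)

# Extract the month number (1-12) and look up its English name
month_number <- as.integer(format(as.Date(temps$Date), "%m"))
temps$Month <- month.name[month_number]
temps
```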

How can I add 1 to a column in R when a condition is met?

I am trying to fill a new column in a data frame (in R) based on the following conditional:
df$B<- ifelse(difftime(df$A,lag(df$A))>minutes(30), increment(1), increment(0))
Here, the A column holds times. Every time the difference between row i and row i-1 is greater than 30 minutes, the new column B should increase by one.
A B
1:00 1
1:31 2
1:40 2
2:30 3
Any help is greatly appreciated, thank you.
In base R, you can use cumsum with difftime :
df$B <- cumsum(c(TRUE, difftime(df$A[-1], df$A[-nrow(df)], units = 'mins') > 30))
df
# A B
#1 2020-02-03 01:00:00 1
#2 2020-02-03 02:00:00 2
#3 2020-02-03 02:15:00 2
#4 2020-02-03 03:00:00 3
data
Make sure class(df$A) returns "POSIXct" :
df <- structure(list(A = structure(c(1580691600, 1580695200, 1580696100,
1580698800), class = c("POSIXct", "POSIXt"), tzone = "UTC")),
class = "data.frame", row.names = c(NA, -4L))
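To see the same idea on the times from the question (1:00, 1:31, 1:40, 2:30), here is a small sketch using diff() instead of indexing the column twice. The date and the UTC time zone are assumptions added only so that the times can be stored as POSIXct.

```r
# Times from the question, placed on an arbitrary date (assumption)
times <- as.POSIXct(c("2020-02-03 01:00", "2020-02-03 01:31",
                      "2020-02-03 01:40", "2020-02-03 02:30"), tz = "UTC")

# Gap to the previous row, in minutes
gap_mins <- as.numeric(diff(times), units = "mins")

# The first row always starts group 1; each gap over 30 minutes adds 1
B <- cumsum(c(TRUE, gap_mins > 30))
data.frame(A = format(times, "%H:%M"), B = B)
```

The gaps are 31, 9 and 50 minutes, so B comes out as 1, 2, 2, 3, matching the table in the question.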

Prophet Date Format R

year_month amount_usd
201501 -390217.24
201502 230944.09
201503 367259.69
201504 15000.00
201505 27000.21
201506 38249.65
df <- structure(list(year_month = 201501:201506, amount_usd = c(-390217.24,
230944.09, 367259.69, 15000, 27000.21, 38249.65)), class = "data.frame", row.names = c(NA,
-6L))
I want to bring it into DD/MM/YYYY format for use in Prophet forecasting code.
This is what I have tried so far:
for (loopitem in loopvec){
df2 <- subset(df, account_id==loopitem)
df3 <- df2[,c("year_month","amount_usd")]
df3$year_month <- as.Date(df3$year_month, format="YYYY-MM", origin="1/1/1970")
try <- prophet(df3, seasonality.mode = 'multiplicative')
}
Error in fit.prophet(m, df, ...) :
Dataframe must have columns 'ds' and 'y' with the dates and values respectively.
You need to paste a day number (I'm just using the first of the month) onto the year_month values; then you can use the ymd() function from lubridate to convert the column to a Date object.
library(dplyr)
library(lubridate)
mutate_at(df, "year_month", ~ymd(paste(., "01")))
year_month amount_usd
1 2015-01-01 -390217.24
2 2015-02-01 230944.09
3 2015-03-01 367259.69
4 2015-04-01 15000.00
5 2015-05-01 27000.21
6 2015-06-01 38249.65
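The error message in the question also says that the data frame must have columns named 'ds' and 'y'. A base R sketch that does the date conversion and the renaming in one step could look like this; the name prophet_input is made up for the example, and the first of the month is again an arbitrary choice.

```r
df <- data.frame(
  year_month = 201501:201506,
  amount_usd = c(-390217.24, 230944.09, 367259.69, 15000, 27000.21, 38249.65)
)

# Append "01" as the day, parse as YYYYMMDD, and use the column names
# that the error message asks for
prophet_input <- data.frame(
  ds = as.Date(paste0(df$year_month, "01"), format = "%Y%m%d"),
  y = df$amount_usd
)
prophet_input
```

The resulting data frame would then be passed to prophet() in place of df3.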

Adding a random number of days to dates via some function

My data contains a column of order dates. It also has a column of delivery dates. Some of the delivery dates are a date (12/31/1990) that occurred before the order date, which is causing problems in calculating average shipping time. I would like to take the order date for these rows and add a random number of days from a uniform distribution.
First, I tried to write a function that I could apply to the data, but the result was not what I wanted. What I want is for the simulated delivery date to end up in the delivery date column.
func1 = function(x){
if(x[2]=="1990-12-31" && !is.na(x[2]))
x[2] = as.Date(x[1]) + floor(runif(1,min=0,max=30))
return (x)
}
Example data:
x <- structure(list(orderDate = structure(c(15706, 15706, 15706, 15706,
15706), class = "Date"), deliveryDate = structure(c(15707, 15707,
7669, 15707, 7669), class = "Date")), .Names = c("orderDate",
"deliveryDate"), row.names = c(NA, 5L), class = "data.frame")
# orderDate deliveryDate
#1 2013-01-01 2013-01-02
#2 2013-01-01 2013-01-02
#3 2013-01-01 1990-12-31
#4 2013-01-01 2013-01-02
#5 2013-01-01 1990-12-31
If I understand correctly, x is a data frame with 2 columns. A vectorized version of if is available via ifelse:
x[[2]] <- structure(ifelse(x[[2]] == "1990-12-31" & !is.na(x[[2]]),
as.Date(x[[1]]) + sample(0:30, 1),
x[[2]]),
class = "Date")
Or a faster replacement:
ind <- x[[2]] == "1990-12-31" & !is.na(x[[2]])
x[ind, 2] <- as.Date(x[ind, 1]) + sample(0:30, sum(ind), replace = TRUE)
With your example dataset and the same random seed 0, both options give the same result:
# orderDate deliveryDate
#1 2013-01-01 2013-01-02
#2 2013-01-01 2013-01-02
#3 2013-01-01 2013-01-28
#4 2013-01-01 2013-01-02
#5 2013-01-01 2013-01-28
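In the first case, ifelse alone returns plain numbers (the internal representation of "Date"), so we need to set the class to "Date" to get dates back. Note also that sample(0:30, 1) draws a single offset that is shared by every replaced row, whereas the second option draws one offset per row.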
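For a reproducible run, a sketch of the per-row replacement with named columns might look like the following. The seed value and the variable name bad are arbitrary choices for this example, and the data frame simply rebuilds the sample from the question.

```r
set.seed(0)  # arbitrary seed, only to make the run repeatable

x <- data.frame(
  orderDate = as.Date(rep("2013-01-01", 5)),
  deliveryDate = as.Date(c("2013-01-02", "2013-01-02", "1990-12-31",
                           "2013-01-02", "1990-12-31"))
)

# Rows whose delivery date is the 1990-12-31 placeholder
bad <- !is.na(x$deliveryDate) & x$deliveryDate == as.Date("1990-12-31")

# One independent draw of 0-30 days per affected row
x$deliveryDate[bad] <- x$orderDate[bad] + sample(0:30, sum(bad), replace = TRUE)
x
```

Assigning into the existing Date column keeps its class, so no structure() call is needed here.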
