Convert decimal month and year into Date - r

I have a 'decimal month' and a year variable:
df <- data.frame(decimal_month = c(4.75, 5, 5.25), year = c(2011, 2011, 2011))
How can I convert these variables to a Date? ("2011-04-22" "2011-05-01" "2011-05-08"). Or at least to day of the year.

You may use some nice functions from the zoo package:
as.yearmon to convert year and floor of the decimal month to class yearmon.
Then use as.Date.yearmon and its frac argument to coerce the year-month to class Date.
library(zoo)
df$date = as.Date(as.yearmon(paste(df$year, floor(df$decimal_month), sep = "-")),
frac = df$decimal_month - floor(df$decimal_month))
# decimal_month year date
# 1 4.75 2011 2011-04-22
# 2 5.00 2011 2011-05-01
# 3 5.25 2011 2011-05-08
If desired, the day of the year is simply format(df$date, "%j"). This returns a character string; wrap it in as.integer() if you need a number.
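To see what the frac argument is doing, here is a minimal base R sketch of the same idea without zoo. The helper name decimal_month_to_date is hypothetical, and the interpolation between the first and last day of the month is my reading of how frac behaves, so treat it as an approximation rather than zoo's actual implementation.

```r
# Hypothetical helper: turn a year plus a "decimal month" into a Date.
# Assumption: the fractional part is the fraction of the way from the
# first to the last day of that month.
decimal_month_to_date <- function(year, decimal_month) {
  m    <- floor(decimal_month)
  frac <- decimal_month - m
  start <- as.Date(sprintf("%d-%02d-01", as.integer(year), as.integer(m)))
  # start + 31 always lands in the next month; snap it back to day 1
  next_start <- as.Date(format(start + 31, "%Y-%m-01"))
  last <- next_start - 1
  start + floor(frac * as.numeric(last - start))
}

df <- data.frame(decimal_month = c(4.75, 5, 5.25), year = c(2011, 2011, 2011))
df$date <- decimal_month_to_date(df$year, df$decimal_month)
df$doy  <- as.integer(format(df$date, "%j"))
format(df$date)
# [1] "2011-04-22" "2011-05-01" "2011-05-08"
```

For the sample data this gives the same dates as the zoo version.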

Related

How do I replace a value in my dataframe with text?

I have a dataframe with dates from April 2020 to today. Right now they are labelled 1 to 492, with 1 being the first date I have data on. I also have a list of dates in the format I want. How can I tell R that date 1 is April 12, 2020, date 2 is April 13, 2020, and so on for each date? I'm OK with either replacing the values in the column or creating a new column called real_date next to it.
Update:
Sorry I didn't describe this very well. I ended up making a look-up table with the date number and real date, and I used the inner_join function to add the real date to my dataframe.
library(tidyverse)
library(lubridate)
#Creating a sample data.frame
df <-
tibble(
dates = seq.Date(dmy("01/04/20"),today(),by = "1 day")
)
df %>%
# Format date, where: %B = full month name, %d = numeric day and %Y = four-digit year
mutate(
new_date = format(dates,"%B %d %Y")
)
Note: month names follow the session locale. In a Portuguese locale, for example, April is printed as "Abril".
If I have understood the question correctly, you have a dataframe which has numbers from 1 to 492, now you want to change them to dates where number 1 is 12th April 2020, number 2 is 13th April 2020 and so on.
You can use as.Date to convert these numbers to dates, passing 11th April 2020 (the day before the first date) as the origin.
df <- data.frame(date = 1:492)
df$real_date <- as.Date(df$date, origin = '2020-04-11')
head(df)
# date real_date
#1 1 2020-04-12
#2 2 2020-04-13
#3 3 2020-04-14
#4 4 2020-04-15
#5 5 2020-04-16
#6 6 2020-04-17
Just create a sequence of dates
data.frame(date = seq(as.Date('2020-04-12'), length.out = 492,
by = '1 day'), code = 1:492)
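The look-up table approach mentioned in the update can be sketched in base R with merge instead of inner_join. The sample values and the value column below are made up for illustration; only the day numbers and the start date come from the question.

```r
# Hypothetical sample of the original data: day numbers in arbitrary order
df <- data.frame(date = c(3L, 1L, 2L, 492L), value = c(10, 20, 30, 40))

# Look-up table: day number -> real date, starting at 2020-04-12
lookup <- data.frame(
  date      = 1:492,
  real_date = seq(as.Date("2020-04-12"), by = "day", length.out = 492)
)

# Inner join on the shared 'date' column (merge sorts the result by the key)
joined <- merge(df, lookup, by = "date")
joined
```

This should give the same real_date values as the as.Date(..., origin = ) approach, and a look-up table also works when the real dates are irregular.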

Converting 3 columns (day, month, year) into a single date column R

I have a data frame called data which looks like this:
Day Month Year
2 April 2015
5 May 2014
23 December 2017
The code to create it is:
data <- data.frame(Day = c(2,5,23),
Month = c("April", "May", "December"),
Year = c(2015, 2014, 2017))
I want to create a new column that looks like this:
Day Month Year Date
2 April 2015 2/4/2015
5 May 2014 5/5/2014
23 December 2017 23/12/2017
To do this, I tried:
data <- data %>%
mutate(Date = as.Date(paste(Day, Month, Year, sep = "/"))) %>%
dmy()
But I got an error which says:
Error in charToDate(x) :
character string is not in a standard unambiguous format
Is there an obvious error that I'm not seeing?
Thank you so much.
We need to use appropriate format in as.Date. Using base R, we can do
transform(data, Date = as.Date(paste(Day, Month, Year, sep = "/"), "%d/%B/%Y"))
# Day Month Year Date
#1 2 April 2015 2015-04-02
#2 5 May 2014 2014-05-05
#3 23 December 2017 2017-12-23
Or with dplyr and lubridate
library(dplyr)
library(lubridate)
data %>% mutate(Date = dmy(paste(Day, Month, Year, sep = "/")))
You can add format(Date, "%d/%m/%Y") if you need to change the display format.
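One caveat: parsing month names with %B depends on the session locale, so the calls above can return NA on a non-English system. A locale-independent sketch, assuming the Month column always holds full English month names, is to look the names up in the built-in month.name vector:

```r
data <- data.frame(Day = c(2, 5, 23),
                   Month = c("April", "May", "December"),
                   Year = c(2015, 2014, 2017))

# Assumption: Month contains full English names exactly as in month.name
month_num <- match(data$Month, month.name)

# Build an ISO date string, which as.Date parses without a format argument
data$Date <- as.Date(sprintf("%04d-%02d-%02d",
                             as.integer(data$Year), month_num, as.integer(data$Day)))

# Character column in day/month/year order, for display only
data$Display <- format(data$Date, "%d/%m/%Y")
data
```

Keep the Date column for calculations; the Display column is plain text and will not sort chronologically.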

R: assign months to day of the year

Here's my data, which has 10 years in one column and the day of the year (1 to 365) for each year in a second column:
dat <- data.frame(year = rep(1980:1989, each = 365), doy= rep(1:365, times = 10))
I am assuming all years are non-leap years i.e. they have 365 days.
I want to create another column month which is basically month of the year the day belongs to.
library(dplyr)
dat %>%
mutate(month = as.integer(ceiling(doy/31)))
However, this solution is wrong since it assigns the wrong months to some days. I am looking for a dplyr solution, if possible.
We can convert it to a datetime class by using the appropriate format (i.e. %Y %j) and then extract the month with format
dat$month <- with(dat, format(strptime(paste(year, doy), format = "%Y %j"), '%m'))
Or use $mon to extract the month and add 1
dat$month <- with(dat, strptime(paste(year, doy), format = "%Y %j")$mon + 1)
tail(dat$month)
#[1] 12 12 12 12 12 12
This should give you an integer value for the months (month() comes from the lubridate package):
library(lubridate)
dat$month.num <- month(as.Date(paste(dat$year, dat$doy), '%Y %j'))
If you want the month names:
dat$month.names <- month.name[month(as.Date(paste(dat$year, dat$doy), '%Y %j'))]
The result (only showing a few rows):
> dat[29:33,]
year doy month.num month.names
29 1980 29 1 January
30 1980 30 1 January
31 1980 31 1 January
32 1980 32 2 February
33 1980 33 2 February
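Note that the date-based answers above use the real calendar, and 1980, 1984 and 1988 are leap years, so in those years day 60 is parsed as 29 February rather than 1 March. If every year really should be treated as having 365 days, as the question assumes, a sketch that skips dates entirely and uses fixed month lengths could look like this:

```r
dat <- data.frame(year = rep(1980:1989, each = 365),
                  doy  = rep(1:365, times = 10))

# Month lengths of a non-leap year (the question's stated assumption)
days_in_month <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)

# Day of year on which each month starts: 1, 32, 60, 91, ...
month_start <- cumsum(c(1, days_in_month[-12]))

# findInterval returns the index of the last start day <= doy, i.e. the month
dat$month <- findInterval(dat$doy, month_start)
dat$month_name <- month.name[dat$month]

dat[c(59, 60, 365), ]
```

The same findInterval call can be placed inside dplyr's mutate() if a dplyr pipeline is preferred.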

parsing date elements for calculations and look up

My real goal here is to use the numeric value for a month in table 1 (i.e. January = 01, ... December =12; years are present as a separate column) and find a value in table 2 where the value returned is from one month earlier. The problem I do not know how to deal with is when the month from table 1 is January (i.e. 2014-01), how would I return the value from table 2 related to December 2013 (i.e. 2013-12)?
I'm thinking that there is a package that has a process to decrement the date/month accounting for the beginning of the year condition I describe above. I do not have an issue converting the month and year columns into actual dates to accomplish this task.
year1 <- c(2013, 2013, 2014)
year2 <- c(2013, 2013, 2014)
month1 <- c(04, 08, 01)
month2 <- c(03, 12, 08)
value1 <- c(4,6,10)
value2 <- c(6,3,8)
df1 <- data.frame(year1, month1, value1)
df2 <- data.frame(year2, month2, value2)
Given the date combination of 2014-01 from df1, the expected output from df2 would be value2 = 3 from date combination 2013-12.
Thanks in advance
I find it more convenient to work with Date objects because it's easier to add/subtract days or months (thanks to the lubridate package). So, the idea is to use the first day of a month as date field instead of separate fields for year and month.
In addition, I prefer data.table for data manipulation.
# initial data
df1 <- data.frame(year1=c(2013, 2013, 2014), month1=c(04, 08, 01), value1=c(4,6,10))
df2 <- data.frame(year2=c(2013, 2013, 2014), month2=c(03, 12, 08), value2=c(6,3,8))
library(data.table) # CRAN version 1.10.4 used
library(lubridate) # CRAN version 1.6.0 used
# coerce 1st data.frame to data.table,
# create date from year and month, skip year and month columns,
# create join date which is one month earlier
DT1 <- setDT(df1)[, .(date1 = as.Date(sprintf("%4i-%02i-01", year1, month1)),
value1)][, join.date := date1 - months(1L)]
# coerce 2nd data.frame to data.table,
# create date from year and month, skip year and month columns,
DT2 <- setDT(df2)[, .(date2 = as.Date(sprintf("%4i-%02i-01", year2, month2)),
value2)]
# right join: take all rows of DT1
DT2[DT1, on = c(date2 = "join.date")]
# date2 value2 date1 value1
#1: 2013-03-01 6 2013-04-01 4
#2: 2013-07-01 NA 2013-08-01 6
#3: 2013-12-01 3 2014-01-01 10
You can merge the dataframes (after some manipulation):
df1 <- data.frame(year1=c(2013, 2013, 2014), month1=c(04, 08, 01), value1=c(4,6,10))
df2 <- data.frame(year2=c(2013, 2013, 2014), month2=c(03, 12, 08), value2=c(6,3,8))
df1$month2 <- ifelse(df1$month1==1, 12, df1$month1 - 1)
df1$year2 <- ifelse(df1$month2==12, df1$year1-1, df1$year1)
merge(df1, df2, all.x=TRUE)
# month2 year2 year1 month1 value1 value2
# 1 3 2013 2013 4 4 6
# 2 7 2013 2013 8 6 NA
# 3 12 2013 2014 1 10 3
It's a bit of a workaround, but here is an idea that might help: instead of only subtracting 1, subtract 2, use the modulo operator and then add 1 back.
i = 1:12
((i - 2) %% 12) + 1
# [1] 12 1 2 3 4 5 6 7 8 9 10 11
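The modulo idea can be extended so that the year rolls back as well. This is a small sketch with a hypothetical helper prev_month, which counts months on a single running index and then splits it back into year and month:

```r
# Hypothetical helper: year and month of the month before (year, month)
prev_month <- function(year, month) {
  idx <- year * 12 + (month - 1) - 1      # running month index, shifted back by one
  data.frame(year = idx %/% 12, month = idx %% 12 + 1)
}

prev <- prev_month(c(2013, 2013, 2014), c(4, 8, 1))
prev
#   year month
# 1 2013     3
# 2 2013     7
# 3 2013    12
```

The resulting year and month columns can then be used as join keys against the second table.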

format a time series as dataframe with julian date

I have a time series tt.txt of daily data from 1st May 1998 to 31 October 2012 in one column as this:
v1
296.172
303.24
303.891
304.603
304.207
303.22
303.137
303.343
304.203
305.029
305.099
304.681
304.32
304.471
305.022
304.938
304.298
304.120
Each number in the text file represents the maximum temperature in kelvin for the corresponding day. I want to put the data in 3 columns as follows by adding year, jday, and the value of the data:
year jday MAX_TEMP
1 1959 325 11.7
2 1959 326 15.6
3 1959 327 14.4
If you have a vector with dates, we can convert it to 'year' and 'jday' by
v1 <- c('May 1998 05', 'October 2012 10')
v2 <- format(as.Date(v1, '%b %Y %d'), '%Y %j')
df1 <- read.table(text=v2, header=FALSE, col.names=c('year', 'jday'))
df1
# year jday
#1 1998 125
#2 2012 284
To convert back from '%Y %j' to 'Date' class
df1$date <- as.Date(do.call(paste, df1[1:2]), '%Y %j')
Update
We can read the dataset with read.table. If we know the start and end dates, we can create a sequence of dates using seq, change its format to 'year' and 'julian day', and cbind it with the original dataset.
dat <- read.table('tt.txt', header=TRUE)
date <- seq(as.Date('1998-05-01'), as.Date('2012-10-31'), by='day')
dat2 <- cbind(read.table(text=format(date, '%Y %j'),
col.names=c('year', 'jday')), MAX_TEMP=dat[[1]])
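Here is a self-contained sketch of the same idea that runs without the tt.txt file. It uses only the first four values from the question and assumes, as stated there, that the series starts on 1 May 1998 and has no gaps. Tying length.out to the number of values avoids a length mismatch between the dates and the data.

```r
# First few values from the question (maximum temperature in kelvin)
temps <- c(296.172, 303.24, 303.891, 304.603)

# One date per value, starting at the known first day
dates <- seq(as.Date("1998-05-01"), by = "day", length.out = length(temps))

# POSIXlt stores years since 1900 and a zero-based day of year
lt <- as.POSIXlt(dates)
dat2 <- data.frame(year = lt$year + 1900,
                   jday = lt$yday + 1,
                   MAX_TEMP = temps)
dat2
```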
You can use yday. Note that $yday is zero-based (1 January is 0), so add 1 to get the day of the year:
as.POSIXlt("8 Jun 15", format = "%d %b %y")$yday + 1
