Convert date in vector to Excel date number (R) - r

I have a vector of date values:
dates=c("43018","43343","42272","06/27/17","01/10/18","10/11/18")
This is a mixture of actual dates and the Excel number-value of dates (i.e., the number of days since January 1, 1900). I want to convert all of these values to the Excel format of dates, so the output would look like the following:
dates
[1] "43018" "43343" "42272" "42913" "43110" "43384"
My goal is to take these values and subtract them from another vector with an equal number of date values that are all the same to get an age of each observation.
Can anyone help point me in the right direction? Thank you!

Figured it out - use the "janitor" library and the excel_numeric_to_date function.
Badda bing badda boom.
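For reference, here is a minimal base R sketch of the conversion asked for in the question, in case adding janitor is not an option. It assumes the non-numeric entries are all in mm/dd/yy form and that the serial numbers follow the Windows Excel 1900 date system, which corresponds to an origin of 1899-12-30 in R. The names is_serial, serial and ref_serial are made up for illustration, and the reference date is only a placeholder.

```r
dates <- c("43018", "43343", "42272", "06/27/17", "01/10/18", "10/11/18")

# Entries made only of digits are treated as Excel serial numbers (assumption)
is_serial <- grepl("^[0-9]+$", dates)

serial <- numeric(length(dates))
serial[is_serial] <- as.numeric(dates[is_serial])

# Parse the remaining entries as mm/dd/yy and count days from the Excel origin
parsed <- as.Date(dates[!is_serial], format = "%m/%d/%y")
serial[!is_serial] <- as.numeric(parsed - as.Date("1899-12-30"))

as.character(serial)
# [1] "43018" "43343" "42272" "42913" "43110" "43384"

# Age in days relative to a hypothetical reference date
ref_serial <- as.numeric(as.Date("2019-01-01") - as.Date("1899-12-30"))
age_days <- ref_serial - serial
```

janitor::excel_numeric_to_date goes in the opposite direction (serial number to Date), which works just as well for computing ages once every value is a Date.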

Related

How to convert date format to total number of days?

I'm trying to convert a yyyy-mm-dd data in a data frame to the total number of days from some date to put in my survival function.
I've already tried as_date() and grepl(), but I can't seem to get it to work since there are either too many NA values in my data frame or I'm doing something wrong.
Ref.date <- ymd("1941-08-24")
Date.MI <- ymd("Date.MI")
Day <- as.numeric(difftime(Date.MI, Ref.date))
I expect just the total number of days since 1941-08-24.
How do I solve the problem?
difftime() gives you the option to specify the units for the resulting output, so maybe try something like this:
as.numeric(difftime(as.POSIXct("1941-08-25"), as.POSIXct("1941-08-24"), units = c("days")))
The way to solve it:
as.numeric(difftime(as.POSIXct(Date.MI[[1]]), as.POSIXct("1941-08-24"), units = c("days")))
The square brackets were needed because Date.MI[[1]] refers to the first column of the data frame.
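A minimal sketch of the same idea using Date objects instead of POSIXct, which sidesteps time zones entirely. It assumes Date.MI is a data frame whose first column holds yyyy-mm-dd strings; the sample values below are made up.

```r
# Made-up sample data standing in for the real data frame
Date.MI <- data.frame(date = c("1941-08-25", "1942-08-24", NA),
                      stringsAsFactors = FALSE)

Ref.date <- as.Date("1941-08-24")

# [[1]] pulls out the first column as a vector; NA entries stay NA
Day <- as.numeric(difftime(as.Date(Date.MI[[1]]), Ref.date, units = "days"))
Day
# [1]   1 365  NA
```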

Subset a dataframe based on numerical values of a string inside a variable

I have a data frame which is a time series of meteorological measurement with monthly resolution from 1961 till 2018. I am interested in the variable that measures the monthly average temperature since I need the multi-annual average temperature for the summers.
To do this I must extract the fifth and sixth digits, which are the month, from the "DateVariable" column.
The values in the time column are formatted like this:
"19610701". So I need the 07 (July) after 1961.
I started by coding for one month for other purposes, so I have not tried anything worth mentioning. I guess that grepl could do the work, but I do not know how the "matching" operator works.
So I started with this code that works.
summersmonth <- Df[DateVariable %like% "19610101" | DateVariable %like% "19610201"]
I am expecting code like this
summermonths <- Df[DateVariable %like% "**06**" | DateVariable %like% "**07**..]
so that all entries with a month digit from 06 to 09 are saved in the new data frame summermonths.
Thanks in advance for any reply or feedback regarding my question.
Update
Thanks to your answers I got the first part, which is to convert the variable with as.Date and extract the month name (class character).
Now I need to select the months from June to September.
A horrible way to get the result I wanted is to do several subset() calls and an rbind() afterward.
Sommer1<-subset(Df, MonthVar == "Mai")
Sommer2<-subset(Df, MonthVar == "Juli")
Sommer3<-subset(Df, MonthVar == "September")
SummerTotal<-rbind(Sommer1,Sommer2,Sommer3)
I would be very glad to see this written in a tidy way.
Update 2 - Solution
Here is the tidy way, following the question "Using multiple criteria in subset function and logical operators":
Veg_Seas<-subset(Df, subset = MonthVar %in% c("Mai","Juni","Juli","August","September"))
You can convert your date variable to Date class and extract the month (month() is available from data.table or lubridate):
allmonths <- month(as.Date(Df$DateVariable, format="%Y%m%d"))
Note that if your column was originally imported as a factor, you need to convert it to character first:
allmonths <- month(as.Date(as.character(Df$DateVariable), format="%Y%m%d"))
Then you can check whether each row falls in a summer month:
summersmonth <- Df[allmonths %in% 6:9, ]
Example:
as.Date("20190702", format="%Y%m%d")
[1] "2019-07-02"
month(as.Date("20190702", format="%Y%m%d"))
[1] 7
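The same filter can be sketched in base R only, without month(). The data frame below is made up for illustration (the Temp column and its values are assumptions), and month_num is a hypothetical helper name.

```r
# Made-up sample standing in for the real meteorological data
Df <- data.frame(
  DateVariable = c("19610101", "19610601", "19610701", "19610901", "19611001"),
  Temp = c(-1.2, 16.4, 18.1, 13.9, 9.0),
  stringsAsFactors = FALSE
)

# Parse yyyymmdd strings, then format the month as a number
month_num <- as.integer(format(as.Date(Df$DateVariable, format = "%Y%m%d"), "%m"))

# Keep June through September
summermonths <- Df[month_num %in% 6:9, ]
summermonths$DateVariable
# [1] "19610601" "19610701" "19610901"
```

Working with the month as an integer avoids depending on locale-specific month names such as "Mai" or "Juli".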
We can use anydate from the anytime package to convert to Date class and then extract the month:
library(anytime)
month(anydate(as.character(Df$DateVariable)))

function in R that creates dummies for given time period

There is a data frame like this:
The first two columns in the df describe the start date (month and year) and the end date (month and year). Column names describe every single month and year of a certain time period.
I need a function/loop that inserts "1" or "0" in each cell - "1" when the date from the given column name is within the period described by the first two columns, and "0" if not.
I would appreciate any help.
You want to do two different things: (a) create a dummy variable and (b) check whether a particular date is in an interval.
Making a dummy variable is the easier of the two; in base R you can use ifelse. For example, in the iris data frame:
iris$dummy <- ifelse(iris$Sepal.Width > 2.5, 1, 0)
Working with dates is more complicated. In this answer we will use the lubridate library. First you need to convert all those dates from the 'Month Year' format to something that R can understand. For example, for February you could do:
new_format_february_2016 <- interval(ymd('2016-02-01'), ymd('2016-03-01') - dseconds(1))
#[1] 2016-02-01 UTC--2016-02-29 23:59:59 UTC
This is February: the interval of time from February 1 to one second before March 1. You can do the same with your start date column and your end date column.
To compare two intervals of time (that is, to see if a particular month falls into your other intervals) you can do:
int_overlaps(new_format_february_2016, other_interval)
If this returns TRUE, the two intervals (one particular month and another one) overlap. This is not the same as one being inside the other, but in your case it will work. Using this you can iterate over the different columns and rows and build your dummy variable.
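The iteration described above could look roughly like the sketch below. It swaps the lubridate intervals for a plain base R date comparison so it runs without extra packages. The original data frame was not shown, so the layout here (start and end as "yyyy-mm" strings, one column per month) is entirely an assumption, and to_date is a hypothetical helper.

```r
# Hypothetical layout: start and end months as "yyyy-mm" strings
df <- data.frame(start = c("2016-01", "2016-03"),
                 end   = c("2016-02", "2016-04"),
                 stringsAsFactors = FALSE)

# Months that should become dummy columns (assumed names)
months <- c("2016-01", "2016-02", "2016-03", "2016-04")

# Represent each month by its first day so that months can be compared as Dates
to_date <- function(ym) as.Date(paste0(ym, "-01"))

# One pass per month column; the comparison is vectorised over rows
for (m in months) {
  df[[m]] <- as.integer(to_date(df$start) <= to_date(m) &
                        to_date(m) <= to_date(df$end))
}
df
```

Because every period starts and ends on whole months, checking that the month's first day lies between the start and end months gives the same answer as the interval overlap test.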
But before doing so, I would recommend cleaning your data, as your current format is complicated to work with. To get all the power that vector types in R provide, ideally you would have one row per observation and one variable per column. This does not seem to be the case with your data frame. Take a look at the chapter 'Tidy data' of 'R for Data Science', especially the spreading and gathering subsection:
Tidy data

Convert series of numbers to date in R

I have a .xlsx file that I read into R. This file has one of the columns in date format (d/m/y), but for some reason it's displaying as a series of numbers in the data frame in RStudio.
My question is how do I change the column to the original date format?
Here's an example of the date that's showing: 887587200 - instead of something like 12/03/1974.
Any help to fix this would be appreciated.
Thanks
Looks like your dates are being stored as a numeric value, likely the number of seconds since Jan 1, 1970. So to convert the column, you could do:
df$my_col <- as.Date(df$my_col / 60 / 60 / 24, origin = '1970-01-01')
This converts 887587200 to a date of 1998-02-16.
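An equivalent sketch goes through as.POSIXct, which makes the "seconds since 1970" assumption explicit. The UTC time zone is an assumption, and the variable names are made up for illustration.

```r
secs <- 887587200

# Interpret the number as seconds since the Unix epoch, in UTC (assumption)
stamp <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
d <- as.Date(stamp)

format(d)               # "1998-02-16"
format(d, "%d/%m/%Y")   # "16/02/1998", the d/m/y display from the question
```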

calculate date differences in R

I am trying to calculate the difference in days between the two columns Last.Start and Last.Stop. What I did was first convert the Last.Start column to a character variable, then convert it to a Date variable (stored as the last column, LastStart).
SMFull$Last.Start <- as.character(SMFull$Last.Start)
SMFull$LastStart <- as.Date(SMFull$Last.Start, format="%Y/%m/%d")
Can someone advise me why the format of LastStart is off?
Thanks in advance!
