How to convert character into Date in R?

I have an Excel file where dates are in the format below.
01-Jan-2020
03-Jun-2015
I need to convert these to Date. I have tried many conversion techniques, but I get NA every time.

This should do it:
x <- c("01-Jan-2020", "03-Jun-2015")
as.Date(x, format = "%d-%b-%Y")
#> [1] "2020-01-01" "2015-06-03"
# Or with lubridate
lubridate::dmy(x)
#> [1] "2020-01-01" "2015-06-03"
We can confirm x was converted from character to date with:
y <- list(input = x,
lubridate = lubridate::dmy(x),
base = as.Date(x, format = "%d-%b-%Y"))
str(y)
#> List of 3
#> $ input : chr [1:2] "01-Jan-2020" "03-Jun-2015"
#> $ lubridate: Date[1:2], format: "2020-01-01" "2015-06-03"
#> $ base : Date[1:2], format: "2020-01-01" "2015-06-03"
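A common cause of the NA results described in the question is a format string that does not match the input. The sketch below, in base R only, contrasts a mismatched format with the matching one. It assumes an English locale, since %b matches month abbreviations in the current locale.

```r
x <- c("01-Jan-2020", "03-Jun-2015")

# %m expects a numeric month, so "Jan" and "Jun" fail to parse and give NA
bad <- as.Date(x, format = "%d-%m-%Y")

# %b matches abbreviated month names (locale-dependent; English assumed here)
good <- as.Date(x, format = "%d-%b-%Y")

is.na(bad)
format(good)
```

If the matching format still returns NA, the session locale is worth checking, because month names are matched against it.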

Install the lubridate package. The usual route is install.packages("lubridate") from CRAN; alternatively, install it from a source tarball:
wget https://github.com/tidyverse/lubridate/archive/v1.6.0.tar.gz
sudo R CMD INSTALL v1.6.0.tar.gz
It provides a function called parse_date_time2(), which parses dates given the order of their components.
For example:
library(lubridate)
parse_date_time2("21 Jan 2010", orders = "dmy")
For reference, see:
https://www.displayr.com/r-date-conversion/

Related

strptime outputing NA when I try and convert from "%Y-%m" format

strptime outputs NA when I set the format to "%Y-%m".
I have tried adding the day as a test and it worked, but whenever I use "%Y-%m" or "%m" I get NA:
print(strptime("2007-07", format = "%Y-%m"))
[1] NA
print(strptime("07", format = "%m"))
[1] NA
print(strptime("2007", format = "%Y"))
[1] "2007-07-30 EDT"
Use the zoo package. Its yearmon class is useful when you have to deal with year-month values like that.
library(zoo)
ym <- as.yearmon("2007-07")
You can then work with the yearmon object, for example by converting it to a Date on the first of the month:
as.Date(ym)
[1] "2007-07-01"
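The NA arises because strptime() needs the day of the month whenever a month is given (its documentation notes this), whereas a bare year is completed with the current month and day. A base R sketch, assuming it is acceptable to pin every value to the first of the month:

```r
ym <- c("2007-07", "2017-01")

# Without a day, parsing fails and returns NA
no_day <- as.Date(ym, format = "%Y-%m")
no_day

# Append a placeholder day so each string is a complete date
d <- as.Date(paste0(ym, "-01"), format = "%Y-%m-%d")
d

# Format back to year-month for display
format(d, "%Y-%m")
```

The placeholder day is arbitrary; the first of the month is used here only because it is valid for every month.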

Trouble reformatting date with as.Date

I'm having a simple problem and I can't tell what's wrong. I'm trying to convert dates formatted "YYYY-MM-DD" to "m/d/YYYY". On my machine, this code:
x <- as.Date("2000-01-01")
x <- as.Date(x, format = "%m/%d/%Y")
print(x)
returns
"2000-01-01"
What am I missing?
as.Date() creates a date object where you tell it how to interpret the input with a format argument.
format() (or alternatively strftime()) will convert a date object to a character object in a desired format:
x <- as.Date("2000-01-01")
x
[1] "2000-01-01"
str(x)
Date[1:1], format: "2000-01-01"
y <- format(x = x,format = "%m/%d/%Y")
y
[1] "01/01/2000"
str(y)
chr "01/01/2000"
y <- strftime(x = x,format = "%m/%d/%Y")
y
[1] "01/01/2000"
str(y)
chr "01/01/2000"
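Note that %m and %d always produce zero-padded values, so the output is "01/01/2000" rather than the unpadded "m/d/YYYY" form mentioned in the question. If unpadded month and day are wanted, one portable option is to build the string from integer components. This is only a sketch, and the variable names are illustrative.

```r
x <- as.Date(c("2000-01-01", "2014-12-07"))

# Extract components as integers to drop leading zeros
m <- as.integer(format(x, "%m"))
d <- as.integer(format(x, "%d"))
y <- format(x, "%Y")

# Reassemble as m/d/YYYY; the result is character, not Date
unpadded <- paste(m, d, y, sep = "/")
unpadded
```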

Excel Date Conversion Issue in R

I am running into an issue when reading Excel data into R and converting it to a date. I have a "time_period" column that comes from Excel as 5-digit Excel date serial numbers (e.g. 41640).
> head(all$time_period)
[1] "41640" "41671" "41699" "41730" "41760" "41791"
These numbers are originally in chr format so I change them to numeric type with:
all[,3] <- lapply(all[,3], function(x) as.numeric(as.character(x)))
Once that is complete, I run the below to format the date:
all$time_period <-format(as.Date(all$time_period, "1899-12-30"), "%Y-%m-%d")
However, once this action is completed the time_period column is all the same date (presumably the first date in the column).
> head(all$time_period)
[1] "2014-01-01" "2014-01-01" "2014-01-01" "2014-01-01" "2014-01-01" "2014-01-01"
Any suggestions? Thanks in advance.
Set the origin argument in as.Date().
These numbers are day counts from an origin, which depends on the date system the Excel workbook uses (the 1900 system is the Windows default; the 1904 system was historically the default in Excel for Mac).
Windows: as.Date(my_date, origin = "1899-12-30")
Mac: as.Date(my_date, origin = "1904-01-01")
For example:
x <- c("41640","41671","41699","41730","41760","41791")
x <- as.numeric(x)
format(as.Date(x, origin = "1899-12-30"), "%Y-%m-%d")
Returns:
[1] "2014-01-01" "2014-02-01" "2014-03-01" "2014-04-01" "2014-05-01" "2014-06-01"
I believe this one line solves your problem. You don't need to format it, since the default output format of as.Date() is "%Y-%m-%d".
time_period <- c("41640", "41671", "41699", "41730", "41760", "41791")
as.Date(as.numeric(time_period), origin = "1899-12-30")
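The two origins can be wrapped in a small helper so the date system is stated explicitly at the call site. This is a sketch only: excel_to_date and its date_system argument are hypothetical names, not part of any package.

```r
# Hypothetical helper: convert Excel serial day numbers to Date
excel_to_date <- function(serial, date_system = c("1900", "1904")) {
  date_system <- match.arg(date_system)
  origin <- if (date_system == "1900") "1899-12-30" else "1904-01-01"
  # as.character() first so factors convert by label, not by level code
  as.Date(as.numeric(as.character(serial)), origin = origin)
}

time_period <- c("41640", "41671", "41699")
dates <- excel_to_date(time_period)
dates
```

Because the conversion is vectorised, it can be applied to a whole column at once, for example all$time_period <- excel_to_date(all$time_period), with no need for lapply().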

Date time conversion and extract only time

I want to change the class of Time to POSIXlt and extract only the hours, minutes and seconds.
str(df3$Time)
chr [1:2075259] "17:24:00" "17:25:00" "17:26:00" "17:27:00" ...
I used the strptime function:
df3$Time <- strptime(df3$Time, format = "%H:%M:%S")
This gives the time with the current date prepended:
> str(df3$Time)
POSIXlt[1:2075259], format: "2015-08-07 17:24:00" "2015-08-07 17:25:00" "2015-08-07 17:26:00" ...
I wanted to extract just the time without changing the POSIXlt class, using the strftime function:
df3$Time <- strftime(df3$Time, format = "%H:%M:%S")
but this converts the class back to character:
> class(df3$Time)
[1] "character"
How can I extract just the time while keeping the class as POSIX or numeric?
If your data is
a <- "17:24:00"
b <- strptime(a, format = "%H:%M:%S")
you can use lubridate in order to have a result of class integer
library(lubridate)
hour(b)
minute(b)
# > hour(b)
# [1] 17
# > minute(b)
# [1] 24
# > class(minute(b))
# [1] "integer"
and you can combine them using
# character
paste(hour(b),minute(b), sep=":")
# numeric
hour(b) + minute(b)/60
for instance.
I would not advise doing that if you want to do any further operations on your data. However, it might be convenient if you only want to plot the results.
A datetime object contains both date and time; you cannot extract 'just time'. So you have to think through what you want:
POSIXlt is a Datetime representation (as a list of components)
POSIXct is a different Datetime representation (as a compact numeric)
Neither one omits the Date part. Once you have a valid object, you can choose to display only the time. But you cannot make the Date part disappear from the representation.
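The difference between the two representations can be inspected directly in base R. The sketch below uses an illustrative timestamp and fixes tz = "UTC" (an assumption, chosen so the output does not depend on the local time zone).

```r
ct <- as.POSIXct("2015-08-07 17:24:00", tz = "UTC")
lt <- as.POSIXlt(ct)

# POSIXct is a single number: seconds since 1970-01-01 UTC
unclass(ct)

# POSIXlt is a list of components, including the date fields
lt$hour
lt$min
lt$mday

# The date part is always stored; formatting only hides it
time_only <- format(ct, "%H:%M:%S")
time_only
```

Note that format() returns a character vector, which is why displaying only the time always means leaving the POSIX classes.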
A "modern" tidyverse answer to this is to use hms::as_hms()
For example
library(tidyverse)
library(hms)
as_hms(1)
#> 00:00:01
as_hms("12:34:56")
#> 12:34:56
or, with your example data:
x <- as.POSIXlt(c("17:24:00", "17:25:00", "17:26:00", "17:27:00"), format = "%H:%M:%S")
x
#> [1] "2021-04-10 17:24:00 EDT" "2021-04-10 17:25:00 EDT" "2021-04-10 17:26:00 EDT" "2021-04-10 17:27:00 EDT"
as_hms(x)
#> 17:24:00
#> 17:25:00
#> 17:26:00
#> 17:27:00
See also docs here:
https://hms.tidyverse.org/reference/hms.html
You can also use the chron package to extract just times of the day:
library(chron)
# current date/time in POSIXt format as an example
timenow <- Sys.time()
# create chron object "times"
onlytime <- times(strftime(timenow,"%H:%M:%S"))
> onlytime
[1] 14:18:00
> onlytime+1/24
[1] 15:18:00
> class(onlytime)
[1] "times"
This is my idiom for getting just the time part from a datetime object. I use floor_date() from lubridate to get midnight of the timestamp's day and take the difference between the timestamp and that midnight. I store the result in data frames as an hms object (from the hms package), because the class prints as hh:mm:ss, which is easy to read, while the underlying value is a number of seconds. Here is my code:
library(tidyverse)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
# Create timestamps
#
# Get the time part by subtracting the floored date from the timestamp, make
# sure you convert to seconds, and then cast to a time object provided by the
# `hms` package.
# See: https://www.rdocumentation.org/packages/hms/versions/0.4.2/topics/hms
dt <- tibble(dt=c("2019-02-15T13:15:00", "2019-02-19T01:10:33") %>% ymd_hms()) %>%
mutate(timepart = hms::hms(as.numeric(dt - floor_date(dt, "1 day"), unit="secs")))
# Look at result
print(dt)
#> # A tibble: 2 x 2
#> dt timepart
#> <dttm> <time>
#> 1 2019-02-15 13:15:00 13:15
#> 2 2019-02-19 01:10:33 01:10
# Per the documentation, an `hms` object is really a `difftime` object
# that always stores its data in seconds.
dt %>% pluck("timepart") %>% str()
#> 'hms' num [1:2] 13:15:00 01:10:33
#> - attr(*, "units")= chr "secs"
# Pull off just the timepart column
dt %>% pluck("timepart")
#> 13:15:00
#> 01:10:33
# Get numeric part. From documentation, `hms` object always stores in seconds.
dt %>% pluck("timepart") %>% as.numeric()
#> [1] 47700 4233
Created on 2019-02-15 by the reprex package (v0.2.1)
If you want it in POSIX format, the only way is to leave it as it is and extract just the "time" part every time you display it. Internally it will always be date + time anyway.
If you want it in numeric, however, you can simply convert it into a number.
For example, to get time as number of seconds passed since the beginning of the day:
df3$Time=df3$Time$sec + df3$Time$min*60 + df3$Time$hour*3600
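A self-contained version of the same calculation, with a conversion back to a display string, might look like the sketch below. The sample times come from the question; tz = "UTC" and the variable names are assumptions made for reproducibility.

```r
# Parse times in a fixed time zone so the example is reproducible
tm <- strptime(c("17:24:00", "17:25:00"), format = "%H:%M:%S", tz = "UTC")

# Seconds since midnight, as a plain numeric vector
secs <- tm$sec + tm$min * 60 + tm$hour * 3600
secs

# Back to an "HH:MM:SS" string for display
s <- as.integer(secs)
labels <- sprintf("%02d:%02d:%02d", s %/% 3600L, (s %% 3600L) %/% 60L, s %% 60L)
labels
```

Keeping the numeric seconds in the data and formatting only at display time keeps arithmetic on the values straightforward.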

r datetime conversion from chr using strptime gives NA

I have a data frame containing date-time information as characters in the format dd/mm/yyyy hh:mm, but I can't get it to convert, e.g.
$ LaserStart : chr "07/12/2014 11:21" "13/12/2014 05:37"
I am trying to convert them to date time using
data.LotCT$Start <- strptime(data.LotCT$LaserStart, "%d/%B/%Y %H:%M")
This runs without producing any errors, but when I review the data frame I have only NA:
$ Start : POSIXlt, format: NA NA NA ...
thanks in advance
> x <- "07/12/2014 11:21"
> y <- strptime(x, format='%d/%m/%Y %H:%M')
> strftime(y, '%d/%B/%Y %H:%M')
[1] "07/December/2014 11:21"
Just figured it out
data.LotCT$Start <- strptime(data.LotCT$LaserStart, "%d/%B/%Y %H:%M")
should be
data.LotCT$Start <- strptime(data.LotCT$LaserStart, "%d/%m/%Y %H:%M")
which gives
$ Start : POSIXlt, format: "2014-12-07 11:21:00" "2014-12-13 05:37:00"
sorry for bothering you all :)
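%B matches a full month name, while %m matches a numeric month, which is why the first format yields NA. Because strptime() returns NA silently instead of raising an error, a quick check after parsing can catch this kind of mismatch. A sketch using the sample values from the question (tz = "UTC" is an assumption for reproducibility):

```r
x <- c("07/12/2014 11:21", "13/12/2014 05:37")

# %B expects a month name such as "December", so numeric months fail
wrong <- strptime(x, "%d/%B/%Y %H:%M", tz = "UTC")

# %m expects a numeric month, which matches the data
right <- strptime(x, "%d/%m/%Y %H:%M", tz = "UTC")

# Check for silent parse failures before using the result
all(is.na(wrong))
any(is.na(right))
format(right, "%Y-%m-%d %H:%M")
```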
