Change of Date format - r

I have a table with date and temperature, and I need to convert the date from this format 1/1/2021 12:00:00 AM to just 1/1/21. Please help. I tried to add a new column with the new date using this code
SFtemps$Date <- as.Date(strptime(SFtemps$Date, "%m/%d/%y %H:%M"))
but it's not working.

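Use %I, which represents hours as a decimal number (01–12), together with %p for the AM/PM indicator, rather than %H. Also parse the four-digit year with %Y; %y is the year
without century (00–99), which belongs in the output format.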
x <- "1/1/2021 12:00:00 AM"
format(strptime(x, "%m/%d/%Y %I:%M:%S %p"), "%m/%d/%y")
[1] "01/01/21"
Note that after you reformat the time object, it will be a plain character string and lose all attributes of a time object.
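A minimal sketch of how this could be applied to a whole column while keeping a real Date for later work. The sample rows and the DateLabel column name are made up for illustration; only SFtemps and Date come from the question. It assumes an English or C locale, since %p depends on the locale's AM/PM strings.

```r
# Hypothetical sample data standing in for the asker's SFtemps table
SFtemps <- data.frame(
  Date = c("1/1/2021 12:00:00 AM", "1/2/2021 12:00:00 AM"),
  Temp = c(52.1, 53.4)
)

# Parse once; %I + %p handle the 12-hour clock, %Y the four-digit year
parsed <- as.POSIXct(SFtemps$Date, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

# Keep a real Date column for sorting, filtering and arithmetic
SFtemps$Date <- as.Date(parsed)

# Build the display string separately; as.integer() drops the leading zeros
SFtemps$DateLabel <- paste(
  as.integer(format(SFtemps$Date, "%m")),
  as.integer(format(SFtemps$Date, "%d")),
  format(SFtemps$Date, "%y"),
  sep = "/"
)
```

Keeping the label in its own column means the Date column can still be used for date arithmetic and plotting.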

Your string is actually a timestamp. Using lubridate, you can parse it into a date-time object and then use the stamp function to format it in the manner that suits you.
org_str <- "1/1/2021 12:00:00 AM"
library("lubridate")
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
x <- mdy_hms(x = org_str, tz = "UTC")
sf <- stamp("29/01/1997")
#> Multiple formats matched: "%d/%Om/%Y"(1), "%d/%m/%Y"(1)
#> Using: "%d/%Om/%Y"
sf(x)
#> [1] "01/01/2021"
Created on 2022-04-22 by the reprex package (v2.0.1)
Explanation
Contrary to common perception the date doesn't have a format as such. Only string representation of the date can be of a specific format. Consider the following example. Running code:
x <- Sys.Date()
unclass(x)
will return a number, the count of days since 1970-01-01. This is because [1]:
Thus, objects of the class Date are internally represented as numbers. More specifically, dates in R are actually integers.
The same is applicable to the POSIX objects
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
xn <- now()
class(xn)
#> [1] "POSIXct" "POSIXt"
unclass(xn)
#> [1] 1650644616
#> attr(,"tzone")
#> [1] ""
Created on 2022-04-22 by the reprex package (v2.0.1)
You can always run:
format(lubridate::now(), "%d-%m")
# [1] "22-04"
However, by doing this you are not formatting the date but creating a string representation of the date / timestamp object. Your string representation would have to be converted to timestamp / date object if you want to use it in common date time operations.
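The difference between the object and its string representation can be sketched in base R as follows. The variable names are illustrative only.

```r
d <- as.Date("2021-01-01")

# Internally a Date is the number of days since 1970-01-01
days_since_epoch <- unclass(d)

# format() returns a character string, not a Date
label <- format(d, "%m/%d/%y")

# Arithmetic works on the Date; `label + 30` would raise an error
later <- d + 30

# The string has to be parsed again before it can be used as a date
round_trip <- as.Date(label, format = "%m/%d/%y")
```

Here days_since_epoch is 18628, label is "01/01/21", and round_trip is identical to d.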
Suggestions
Whenever you work with date or date-time objects, I would argue that it's advisable to keep them in the correct class, retaining the relevant details (such as the time zone in the case of timestamps), and to apply formatting functions only when using the data in analysis or visual output, as shown in the sample.
[1] Essentials of dates and times

Your format is a bit off. These are two options that work (lubridate is my favorite):
SFtemps <- list()
SFtemps$Date <- '1/1/2021 12:00:00 AM'
> as.Date(strptime(SFtemps$Date, "%m/%d/%Y %r"))
[1] "2021-01-01"
> as.Date(lubridate::mdy_hms(SFtemps$Date))
[1] "2021-01-01"
This will give you a Date object. If you want to display it as 1/1/2021, use the format function.

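You can try tryFormats. Leave format unset (tryFormats is ignored when format is supplied) and use %Y for the four-digit year; trailing characters such as the time are ignored:
as.Date(x, tryFormats = c("%m/%d/%Y"),
optional = FALSE)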

Related

Conversion of character into datetime in R

How to convert a column of character "2020-08-12T04:31:41Z" into datetime. I have tried as.POSIXct() function, but, when I give value to format it returns NA.
You can convert the time using utctime from the anytime package or as_datetime from lubridate like this:
df <- data.frame(date = "2020-08-12T04:31:41Z")
library(anytime)
utctime(df$date, tz="UTC")
#> [1] "2020-08-12 04:31:41 UTC"
library(lubridate)
as_datetime(df$date)
#> [1] "2020-08-12 04:31:41 UTC"
Created on 2022-07-06 by the reprex package (v2.0.1)
You can use strptime which creates a datetime object from the given string
strptime("2020-08-12T04:31:41Z", tz = "UTC", "%Y-%m-%dT%H:%M:%OSZ")
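For a whole column, a base R sketch along the same lines could look like this. The datetime column name is an assumption added for illustration. The literal T and Z are written directly into the format string, and tz = "UTC" is passed explicitly because the trailing Z is only matched as text.

```r
df <- data.frame(date = "2020-08-12T04:31:41Z", stringsAsFactors = FALSE)

# Literal "T" and "Z" in the format match the same characters in the input
df$datetime <- as.POSIXct(df$date, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

# Display with the zone abbreviation to confirm the result
shown <- format(df$datetime, "%Y-%m-%d %H:%M:%S %Z")
```

With this input, shown is "2020-08-12 04:31:41 UTC". A format string that does not account for the T and Z is one common reason for as.POSIXct() returning NA.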

Converting non-standard date format strings ("April-20") to date objects R

I have a vector of date strings in the form month_name-2_digit_year i.e.
a = rbind("April-21", "March-21", "February-21", "January-21")
I'm trying to convert that vector into a vector of date objects. I'm aware this question is very similar to this: Convert non-standard date format to date in R posted some years ago, but unfortunately, it has not answered my question.
I have tried the following as.Date() calls to do this, but it just returns a vector of NA. I.e.
b = as.Date(a, format = "%B-%y")
b = as.Date(a, format = "%B%y")
b = as.Date(a, "%B-%y")
b = as.Date(a, "%B%y")
I also attempted to do it using the convertToDate function from the openxlsx package:
b = convertToDate(a, format = "%B-%y")
I have also tried all the above but using a single character string rather than a vector, but that produced the same issue.
I'm a little lost as to why this isn't working, as this format has worked in reverse earlier in my script (that is, I had a date object already in dd-mm-yyyy format and converted it to month_name-yy using %B-%y). Is there another way to go from string to date when the string is a non-standard (anything other than dd-mm-yyyy or mm-dd-yy if you're in the US) date format?
For the record my R locales are all UK and English.
Thanks in advance.
A Date must have all three of day, month and year. Convert to yearmon class which requires only month and year and then to Date as in (1) and (2) below or add the day as in (3).
(1) and (3) give first of month and (2) gives the end of the month.
(3) uses only functions from base R.
Also consider not converting to Date at all but just use yearmon objects instead since they directly represent a year and month which is what the input represents.
library(zoo)
# test input
a <- c("April-21", "March-21", "February-21", "January-21")
# 1
as.Date(as.yearmon(a, "%B-%y"))
## [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
# 2
as.Date(as.yearmon(a, "%B-%y"), frac = 1)
## [1] "2021-04-30" "2021-03-31" "2021-02-28" "2021-01-31"
# 3
as.Date(paste(1, a), "%d %B-%y")
## [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
In addition to zoo, which @G. Grothendieck mentioned, you can also use clock or lubridate.
clock supports a variable precision calendar type called year_month_day. In this case you'd want "month" precision, then you can set the day to whatever you'd like and convert back to Date.
library(clock)
x <- c("April-21", "March-21", "February-21", "January-21")
ymd <- year_month_day_parse(x, format = "%B-%y", precision = "month")
ymd
#> <year_month_day<month>[4]>
#> [1] "2021-04" "2021-03" "2021-02" "2021-01"
# First of month
as.Date(set_day(ymd, 1))
#> [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"
# End of month
as.Date(set_day(ymd, "last"))
#> [1] "2021-04-30" "2021-03-31" "2021-02-28" "2021-01-31"
The simplest solution may be to use lubridate::my(), which parses strings in the order of "month then year". That assumes that you want the first day of the month, which may or may not be correct for you.
library(lubridate)
x <- c("April-21", "March-21", "February-21", "January-21")
# Assumes first of month
my(x)
#> [1] "2021-04-01" "2021-03-01" "2021-02-01" "2021-01-01"

How can I parse_date_time with letters and timezones?

I have a pretty standard looking date time as a character. I'm having trouble parsing it into a datetime because of the letters and redundancies.
How can I parse this as a datetime?
lubridate::parse_date_time("Fri Feb 05 09:58:38 PST 2021", )
You can specify all the formats by referring to the documentation of the function parse_date_time(). The tricky thing here is that you need to specify the time zone via the argument tz and not directly in the argument orders.
Below is the code.
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
newdate = parse_date_time("Fri Feb 05 09:58:38 PST 2021",
"%a %b %d %H:%M:%S %Y",
tz = "America/Los_Angeles" )
newdate
#> [1] "2021-02-05 09:58:38 PST"
Created on 2021-02-16 by the reprex package (v0.3.0)
Does this work for you?
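A base R alternative, as a rough sketch: the abbreviation PST is matched as literal text in the format string and the actual zone is supplied through tz. This assumes every input carries the same abbreviation and that the locale uses English day and month names.

```r
x <- "Fri Feb 05 09:58:38 PST 2021"

# "PST" is matched literally; the real time zone comes from tz
parsed <- as.POSIXct(x, format = "%a %b %d %H:%M:%S PST %Y",
                     tz = "America/Los_Angeles")

# The same instant shown in UTC (Pacific Standard Time is UTC-8)
in_utc <- format(parsed, tz = "UTC", usetz = TRUE)
```

Here in_utc is "2021-02-05 17:58:38 UTC". Inputs that mix PST and PDT would need the abbreviation handled separately.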

R lubridate time zone issue

I'd like to change the time zone of a POSIXct object in R, using the with_tz() function in the lubridate package.
This example I pulled from the web works for me:
meeting <- ymd_hms("2011-07-01 09:00:00", tz = "Pacific/Auckland")
with_tz(meeting, "America/Chicago")
But this one does not, using a snippet of some data:
atime <- as.POSIXct("2016-11-04 18:04:30",
format="%Y-%m-%d %H:%M:%S",
tz="PST")
atime_utc <- with_tz(atime, "UTC")
str() and tz() show that the new object has a time zone of "UTC", and is a POSIXct object, but the times are identical. There should be 8 hours between them after the time zone conversion.
Another solution using a different function would be fine, too.
The comments above should be well taken, but you can also try force_tz depending on your needs:
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
atime <- as.POSIXct("2016-11-04 18:04:30",
format="%Y-%m-%d %H:%M:%S",
tz="PST")
#> Warning in strptime(x, format, tz = tz): unknown timezone 'PST'
#> Warning in as.POSIXct.POSIXlt(as.POSIXlt(x, tz, ...), tz, ...): unknown
#> timezone 'PST'
tz(atime)
#> [1] "PST"
atime_utc <- with_tz(atime, "UTC")
force_tz(atime, "UTC")
#> [1] "2016-11-04 10:04:30 UTC"
Created on 2019-03-03 by the reprex package (v0.2.1)
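The warnings above point at the likely cause: "PST" is not a time zone name R recognises, so the value is effectively treated as UTC and converting it to UTC changes nothing. A sketch using a region-based name instead, with America/Los_Angeles assumed to be the intended zone:

```r
# Use a region-based zone name rather than the abbreviation "PST"
atime <- as.POSIXct("2016-11-04 18:04:30", tz = "America/Los_Angeles")

# Same instant, displayed in UTC
shown_utc <- format(atime, tz = "UTC", usetz = TRUE)

# Changing only the tzone attribute keeps the instant and changes the display zone
atime_utc <- atime
attr(atime_utc, "tzone") <- "UTC"
```

Here shown_utc is "2016-11-05 01:04:30 UTC". The offset is 7 hours rather than 8 because daylight saving time was still in effect in Los Angeles on 2016-11-04.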

Date time conversion and extract only time

Want to change the class for Time to POSIXlt and extract only the hours minutes and seconds
str(df3$Time)
chr [1:2075259] "17:24:00" "17:25:00" "17:26:00" "17:27:00" ...
Used the strptime function
df3$Time <- strptime(df3$Time, format = "%H:%M:%S")
This gives the date/time appended
> str(df3$Time)
POSIXlt[1:2075259], format: "2015-08-07 17:24:00" "2015-08-07 17:25:00" "2015-08-07 17:26:00" ...
Wanted to extract just the time without changing the POSIXlt class. using the strftime function
df3$Time <- strftime(df3$Time, format = "%H:%M:%S")
but this converts the class back to "char" -
> class(df3$Time)
[1] "character"
How can I just extract the time with class set to POSIX or numeric...
If your data is
a <- "17:24:00"
b <- strptime(a, format = "%H:%M:%S")
you can use lubridate in order to have a result of class integer
library(lubridate)
hour(b)
minute(b)
# > hour(b)
# [1] 17
# > minute(b)
# [1] 24
# > class(minute(b))
# [1] "integer"
and you can combine them using
# character
paste(hour(b),minute(b), sep=":")
# numeric
hour(b) + minute(b)/60
for instance.
I would not advise to do that if you want to do any further operations on your data. However, it might be convenient to do that if you want to plot the results.
A datetime object contains date and time; you cannot extract 'just time'. So you have to think through what you want:
POSIXlt is a Datetime representation (as a list of components)
POSIXct is a different Datetime representation (as a compact numeric)
Neither one omits the Date part. Once you have a valid object, you can choose to display only the time. But you cannot make the Date part disappear from the representation.
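The two representations can be inspected directly. This is a small base R sketch with an arbitrary example timestamp:

```r
ct <- as.POSIXct("2015-08-07 17:24:00", tz = "UTC")
lt <- as.POSIXlt(ct)

# POSIXct is a single number (seconds since 1970-01-01 UTC)
ct_is_number <- is.numeric(unclass(ct))

# POSIXlt is a list of components such as hour, min and sec
lt_is_list <- is.list(unclass(lt))
hour_part <- lt$hour
min_part <- lt$min

# Displaying only the time yields a character string; the object keeps its date
time_only <- format(ct, "%H:%M:%S")
```

Here time_only is "17:24:00" with class "character", while ct and lt both still carry 2015-08-07.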
A "modern" tidyverse answer to this is to use hms::as_hms()
For example
library(tidyverse)
library(hms)
as_hms(1)
#> 00:00:01
as_hms("12:34:56")
#> 12:34:56
or, with your example data:
x <- as.POSIXlt(c("17:24:00", "17:25:00", "17:26:00", "17:27:00"), format = "%H:%M:%S")
x
#>[1] "2021-04-10 17:24:00 EDT" "2021-04-10 17:25:00 EDT" "2021-04-10 17:26:00 EDT" "2021-04-10 17:27:00 EDT"
as_hms(x)
# 17:24:00
# 17:25:00
# 17:26:00
# 17:27:00
See also docs here:
https://hms.tidyverse.org/reference/hms.html
You can also use the chron package to extract just times of the day:
library(chron)
# current date/time in POSIXt format as an example
timenow <- Sys.time()
# create chron object "times"
onlytime <- times(strftime(timenow,"%H:%M:%S"))
> onlytime
[1] 14:18:00
> onlytime+1/24
[1] 15:18:00
> class(onlytime)
[1] "times"
This is my idiom for getting just the time part from a datetime object. I use floor_date() from lubridate to get midnight of the timestamp's day and take the difference between the timestamp and that midnight. I store the result in data frames as an hms object (from the hms package) because the class prints as hh:mm:ss, which is easy to read, while the underlying value is a numeric number of seconds. Here is my code:
library(tidyverse)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
# Create timestamps
#
# Get the time part by subtracting the floored date from the timestamp, make sure
# you convert to seconds, and then cast to a time object provided by the
# `hms` package.
# See: https://www.rdocumentation.org/packages/hms/versions/0.4.2/topics/hms
dt <- tibble(dt=c("2019-02-15T13:15:00", "2019-02-19T01:10:33") %>% ymd_hms()) %>%
mutate(timepart = hms::hms(as.numeric(dt - floor_date(dt, "1 day"), unit="secs")))
# Look at result
print(dt)
#> # A tibble: 2 x 2
#> dt timepart
#> <dttm> <time>
#> 1 2019-02-15 13:15:00 13:15
#> 2 2019-02-19 01:10:33 01:10
# `hms` object is really a `difftime` object from documentation, but is made into a `hms`
# object that defaults to always store data in seconds.
dt %>% pluck("timepart") %>% str()
#> 'hms' num [1:2] 13:15:00 01:10:33
#> - attr(*, "units")= chr "secs"
# Pull off just the timepart column
dt %>% pluck("timepart")
#> 13:15:00
#> 01:10:33
# Get numeric part. From documentation, `hms` object always stores in seconds.
dt %>% pluck("timepart") %>% as.numeric()
#> [1] 47700 4233
Created on 2019-02-15 by the reprex package (v0.2.1)
If you want it in POSIX format, the only way would be to leave it as it is, and extract just the "time" part every time you display it. But internally it will always be date + time anyway.
If you want it in numeric, however, you can simply convert it into a number.
For example, to get time as number of seconds passed since the beginning of the day:
df3$Time=df3$Time$sec + df3$Time$min*60 + df3$Time$hour*3600
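A self-contained sketch of that calculation on a short vector, including one way to turn the seconds back into a readable label. The variable names are illustrative, and the sprintf() step assumes whole seconds.

```r
# POSIXlt vector built from time-only strings (today's date is filled in)
Time <- strptime(c("17:24:00", "17:25:00"), format = "%H:%M:%S", tz = "UTC")

# Seconds since midnight, from the POSIXlt components
secs <- Time$sec + Time$min * 60 + Time$hour * 3600

# Back to an hh:mm:ss label for display
label <- sprintf("%02d:%02d:%02d",
                 secs %/% 3600, (secs %% 3600) %/% 60, secs %% 60)
```

The numeric vector secs can be used in arithmetic and plots, while label is only for display.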
