Is there a way to use the format function on a date object, specifically an object of class POSIXlt, POSIXct, or Date, with the format %Y, %m, %d such that leading zeros are stripped from each of those 3 fields?
For example, I would like format(as.Date("1998-09-02"), "%Y, %m, %d") to return 1998, 9, 2 and not 1998, 09, 02.
Just remove the leading zeros afterwards:
gsub(" 0", " ", format(as.Date("1998-09-02"), "%Y, %m, %d"))
## [1] "1998, 9, 2"
Use %e to obtain a leading space instead of a leading zero.
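As a rough illustration of the %e suggestion (a sketch only; the variable names are made up for this example): the day comes back padded with a space, so a second step is still needed to collapse the doubled space. %e exists only for the day of the month, so the month keeps its zero here.

```r
d <- as.Date("1998-09-02")

# %e pads single-digit days with a space instead of a zero
padded <- format(d, "%Y, %m, %e")

# Collapse the run of spaces left behind by the padding
cleaned <- gsub(" +", " ", padded)
cleaned
```

Combining this with the gsub(" 0", " ", ...) idea above would handle the month as well.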
You can do this with a simple change to your strftime format string. However, it depends on your platform (Unix or Windows).
Unix
Insert a minus sign (-) before each term you'd like to remove leading zeros from:
format(as.Date("2020-06-02"), "%Y, %-m, %-d")
[1] "2020, 6, 2"
Windows
Insert a pound sign (#) before each desired term:
format(as.Date("2020-06-02"), "%Y, %#m, %#d")
[1] "2020, 6, 2"
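If the same script has to run on both platforms, one option is to avoid the flags altogether and build the string from the date's numeric components. This is a minimal base-R sketch; strip_zero_date is a hypothetical helper name, not something from the answer above.

```r
# Build "year, month, day" from the numeric fields of a POSIXlt object,
# so no platform-specific strftime flag is needed.
strip_zero_date <- function(x) {
  lt <- as.POSIXlt(x)
  # POSIXlt stores years since 1900 and months as 0-11
  paste(lt$year + 1900, lt$mon + 1, lt$mday, sep = ", ")
}

strip_zero_date(as.Date("2020-06-02"))
```

Because the fields are plain numbers, they are printed without any padding.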
I found a workaround using the year(), month() and day() functions from the {lubridate} package. Combined with glue::glue(), it is straightforward:
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
library(glue)
dt <- "1998-09-02"
glue("{year(dt)}, {month(dt)}, {day(dt)}")
#> 1998, 9, 2
Created on 2021-04-19 by the reprex package (v2.0.0)
If {tidyverse} is used (suggested by @banbh), then str_glue() can be used:
library(tidyverse)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
dt <- "1998-09-02"
str_glue("{year(dt)}, {month(dt)}, {day(dt)}")
#> 1998, 9, 2
Created on 2021-04-19 by the reprex package (v2.0.0)
A more general solution using gsub, to remove leading zeros from the day or month digits produced by %m or %d. This deletes any zero that is not preceded by a digit:
gsub("(\\D)0", "\\1", format(as.Date("1998-09-02"), "%Y, %m, %d"))
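One caveat, as far as I can tell: the pattern needs a non-digit in front of the zero, so a zero at the very start of the string survives. The variant below is an assumption of mine rather than part of the answer; strip_zeros is a hypothetical helper that also accepts the start of the string and requires a digit after the zero, so a lone "0" is kept.

```r
x <- format(as.Date("1998-09-02"), "%m/%d/%Y")   # month comes first here

# Original pattern: the zero in the leading month has nothing before it
original <- gsub("(\\D)0", "\\1", x)

# Variant: match start-of-string or a non-digit, then zeros, then a digit
strip_zeros <- function(s) gsub("(^|\\D)0+(\\d)", "\\1\\2", s, perl = TRUE)
variant <- strip_zeros(x)

original
variant
```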
Related
I have a table with date and temperature, and I need to convert the date from this format 1/1/2021 12:00:00 AM to just 1/1/21. Please help. I tried to add a new column with the new date using this code
SFtemps$Date <- as.Date(strptime(SFtemps$Date, "%m/%d/%y %H:%M"))
but it's not working.
Use %I for hours on a 12-hour clock (01–12) rather than %H, together with %p for AM/PM. Parse the four-digit year with %Y, and use %y (year without century, 00–99) only in the output format.
x <- "1/1/2021 12:00:00 AM"
format(strptime(x, "%m/%d/%Y %I:%M:%S %p"), "%m/%d/%y")
[1] "01/01/21"
Note that after you reformat the time object, the result is a plain character string and loses all attributes of a time object.
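The question asked for 1/1/21 without padding, while %m and %d always pad. A possible base-R sketch is below; it assumes only the date part matters, and relies on as.Date() ignoring trailing characters once the format has been matched.

```r
x <- "1/1/2021 12:00:00 AM"

# Parse only the date part; the time of day after it is ignored
d <- as.Date(x, format = "%m/%d/%Y")

# Print the numeric fields directly so no zeros are added to month and day
lt <- as.POSIXlt(d)
short <- sprintf("%d/%d/%02d", lt$mon + 1L, lt$mday, lt$year %% 100L)

short
class(d)       # still a Date
class(short)   # plain character
```

Keeping d as a Date column and producing short only for display avoids losing the date class.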
Your string is actually a timestamp. Using lubridate, you can parse it into a date-time object and then use the stamp() function to format it in whatever manner suits you.
org_str <- "1/1/2021 12:00:00 AM"
library("lubridate")
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
x <- mdy_hms(x = org_str, tz = "UTC")
sf <- stamp("29/01/1997")
#> Multiple formats matched: "%d/%Om/%Y"(1), "%d/%m/%Y"(1)
#> Using: "%d/%Om/%Y"
sf(x)
#> [1] "01/01/2021"
Created on 2022-04-22 by the reprex package (v2.0.1)
Explanation
Contrary to common perception, a date doesn't have a format as such. Only a string representation of a date has a specific format. Consider the following example. Running the code:
x <- Sys.Date()
unclass(x)
will return a number. This is because [1]:
Thus, objects of the class Date are internally represented as numbers. More specifically, dates in R are actually integers.
The same applies to POSIX objects:
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
xn <- now()
class(xn)
#> [1] "POSIXct" "POSIXt"
unclass(xn)
#> [1] 1650644616
#> attr(,"tzone")
#> [1] ""
Created on 2022-04-22 by the reprex package (v2.0.1)
You can always run:
format.Date(lubridate::now(), "%d-%m")
# [1] "22-04"
However, by doing this you are not formatting the date but creating a string representation of the date / timestamp object. Your string representation would have to be converted to timestamp / date object if you want to use it in common date time operations.
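To make the distinction concrete, here is a small base-R sketch (the dates are arbitrary examples): a Date is a count of days since 1970-01-01, and a formatted string has to be parsed back before date arithmetic works again.

```r
# A Date is stored as the number of days since 1970-01-01
days <- unclass(as.Date("1970-01-10"))

d <- as.Date("2022-04-22")

# Formatting produces a character string, not a Date
s <- format(d, "%d-%m-%Y")

# Parsing the string with the same format recovers the Date
back <- as.Date(s, format = "%d-%m-%Y")

# Arithmetic only works on the Date, not on the string
elapsed <- as.numeric(back - as.Date("2022-04-01"))

days
class(s)
elapsed
```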
Suggestions
Whenever working with date time / date objects I would argue that it's advisable to keep those objects as a correct class maintaining the relevant details (such as time zone in case of timestamps) and only leverage formatting functions when utilising the data in analysis / visual representation, as shown in the sample.
[1] Essentials of dates and times
Your format is a bit off. These are two options that work (lubridate is my favorite):
SFtemps <- list()
SFtemps$Date <- '1/1/2021 12:00:00 AM'
> as.Date(strptime(SFtemps$Date, "%m/%d/%Y %r"))
[1] "2021-01-01"
> as.Date(lubridate::mdy_hms(SFtemps$Date))
[1] "2021-01-01"
This will give you the date. If you want to display it as 1/1/2021, use the format function.
You can try tryFormats
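# x is the character vector to parse, e.g. "1/1/2021 12:00:00 AM";
# trailing characters after the matched date part are ignored
as.Date(x, tryFormats = c("%m/%d/%Y", "%Y-%m-%d"),
optional = FALSE)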
I have a pretty standard-looking date time stored as a character string. I'm having trouble parsing it into a datetime because of the letters and redundancies.
How can I parse this as a datetime?
lubridate::parse_date_time("Fri Feb 05 09:58:38 PST 2021", )
You can find all the supported formats in the documentation of the function parse_date_time(). The tricky part is that you need to specify the time zone via the tz argument rather than in the orders argument.
Below is the code.
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
newdate = parse_date_time("Fri Feb 05 09:58:38 PST 2021",
"%a %b %d %H:%M:%S %Y",
tz = "America/Los_Angeles" )
newdate
#> [1] "2021-02-05 09:58:38 PST"
Created on 2021-02-16 by the reprex package (v0.3.0)
Does this work for you?
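A base-R alternative, sketched under two assumptions: the zone abbreviation can simply be dropped because the zone is known in advance, and the session locale is English so that %a and %b match "Fri" and "Feb".

```r
x <- "Fri Feb 05 09:58:38 PST 2021"

# Remove the three- or four-letter zone abbreviation between time and year
no_tz <- sub(" [A-Z]{3,4} ", " ", x)

# Parse the rest and supply the zone explicitly (assumed to be Pacific time)
parsed <- as.POSIXct(no_tz, format = "%a %b %d %H:%M:%S %Y",
                     tz = "America/Los_Angeles")

no_tz
parsed
```

In a non-English locale the parse would return NA, so this is less robust than the lubridate version.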
Suppose I have a character vector dt with some month-year values stored as character strings:
dt <- c("Mar-19", "Apr-19", "May-19")
When I try to convert it into a date object, it returns only NAs:
as.Date(dt, format = "%b-%y")
[1] NA NA NA
So I first have to concatenate a dummy day, say 01, to each element, parse the result as a date, and then format the vector to show it in the original format:
format.Date(as.Date(paste0("01-", dt), format = "%d-%b-%y"), "%b-%y")
[1] "Mar-19" "Apr-19" "May-19"
Is there any direct method to parse a truncated date/datepart directly without concatenating and thus avoiding the long route?
zoo has as.yearmon class and function which can convert data with year and month without the date.
dt <- c("Mar-19", "Apr-19", "May-19")
zoo::as.yearmon(dt, '%b-%y')
To get it in character class you can use format with format as required.
format(zoo::as.yearmon(dt, '%b-%y'), '%b-%y')
#[1] "Mar-19" "Apr-19" "May-19"
You can use readr::parse_date for this, but it is still not elegant
library(tidyverse)
dt <- c("Mar-19", "Apr-19", "May-19")
dt %>% parse_date(format = "%b-%y") %>% format("%b-%y")
#> [1] "Mar-19" "Apr-19" "May-19"
Created on 2021-02-01 by the reprex package (v0.3.0)
dt <- c("Mar-19", "Apr-19", "May-19")
lubridate::myd(dt, truncated = 1)
#> [1] "2019-03-01" "2019-04-01" "2019-05-01"
# Created on 2021-02-01 by the reprex package (v0.3.0.9001)
(Related to Conversion of date format %B %Y)
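If adding a package is not an option, the dummy-day idea from the question can at least be hidden in a small helper. This is only a sketch: parse_month_year is a hypothetical name, and %b depends on the session locale matching the month abbreviations.

```r
# Prepend a dummy day so the string describes a complete date, then parse it
parse_month_year <- function(x, fmt = "%b-%y") {
  as.Date(paste0("01-", x), format = paste0("%d-", fmt))
}

dt <- c("Mar-19", "Apr-19", "May-19")
dates <- parse_month_year(dt)

dates
format(dates, "%b-%y")
```

as.Date() returns NA for the original strings because a Date needs a day; the helper supplies the first of the month, which is the same convention the lubridate call above uses.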
Suppose there is a file with a multi-line string like:
/Analysis made on 28 september 2011
people who exercise a lot
are healthy/
How can I extract the date 28 September 2011 from the entire file or string, regardless of the month in the date, or whether it's capitalized?
I assume you have more than one date you want to extract here, and that you want the result to be date types (if not, just pass them to format() with the strptime() specification you want, e.g. %e %B %Y - but converting to date first will standardize them, because, for example, you have a lowercase month name here).
What I'm doing here is using R's built-in month.name vector of full month names, and making a single regex string out of it that will match any text with any month name surrounded by day and year numbers. We end up with a list of character vectors, one vector for each document string, with all the date strings extracted from them in order, and then I map as_date() to them with the appropriate parse pattern so that they're actually R dates now.
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
library(tidyverse)
string <-
"Analysis made on 28 september 2011 people who exercise a lot are healthy
Another analysis on 6 May 1998 found otherwise"
pattern <-
paste("[:digit:]{1,2}", month.name, "[:digit:]{4}",
collapse = "|") %>%
regex(ignore_case = TRUE)
pattern
#> [1] "[:digit:]{1,2} January [:digit:]{4}|[:digit:]{1,2} February [:digit:]{4}|[:digit:]{1,2} March [:digit:]{4}|[:digit:]{1,2} April [:digit:]{4}|[:digit:]{1,2} May [:digit:]{4}|[:digit:]{1,2} June [:digit:]{4}|[:digit:]{1,2} July [:digit:]{4}|[:digit:]{1,2} August [:digit:]{4}|[:digit:]{1,2} September [:digit:]{4}|[:digit:]{1,2} October [:digit:]{4}|[:digit:]{1,2} November [:digit:]{4}|[:digit:]{1,2} December [:digit:]{4}"
#> attr(,"options")
#> attr(,"options")$case_insensitive
#> [1] TRUE
#>
#> attr(,"options")$comments
#> [1] FALSE
#>
#> attr(,"options")$dotall
#> [1] FALSE
#>
#> attr(,"options")$multiline
#> [1] FALSE
#>
#> attr(,"class")
#> [1] "regex" "pattern" "character"
str_extract_all(string, pattern) %>%
map(as_date, tz = "", format = "%e %B %Y")
#> [[1]]
#> [1] "2011-09-28" "1998-05-06"
Created on 2019-09-29 by the reprex package (v0.3.0)
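The same idea can be sketched in base R with gregexpr() and regmatches(), for anyone who prefers not to load the tidyverse. The pattern is built slightly differently here (one alternation group for the month), and the final conversion assumes an English locale for %B.

```r
string <- "Analysis made on 28 september 2011 people who exercise a lot are healthy
Another analysis on 6 May 1998 found otherwise"

# One or two digits, any full month name, then a four-digit year
pattern <- paste0("[0-9]{1,2} (", paste(month.name, collapse = "|"), ") [0-9]{4}")

# Find every match, ignoring case so "september" is caught too
found <- regmatches(string, gregexpr(pattern, string, ignore.case = TRUE))[[1]]

found
as.Date(found, format = "%d %B %Y")
```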
Want to change the class for Time to POSIXlt and extract only the hours minutes and seconds
str(df3$Time)
chr [1:2075259] "17:24:00" "17:25:00" "17:26:00" "17:27:00" ...
Used the strptime function
df3$Time <- strptime(df3$Time, format = "%H:%M:%S")
This gives the date/time appended
> str(df3$Time)
POSIXlt[1:2075259], format: "2015-08-07 17:24:00" "2015-08-07 17:25:00" "2015-08-07 17:26:00" ...
I wanted to extract just the time without changing the POSIXlt class, using the strftime function:
df3$Time <- strftime(df3$Time, format = "%H:%M:%S")
but this converts the class back to "char" -
> class(df3$Time)
[1] "character"
How can I just extract the time with class set to POSIX or numeric...
If your data is
a <- "17:24:00"
b <- strptime(a, format = "%H:%M:%S")
you can use lubridate in order to have a result of class integer
library(lubridate)
hour(b)
minute(b)
# > hour(b)
# [1] 17
# > minute(b)
# [1] 24
# > class(minute(b))
# [1] "integer"
and you can combine them using
# character
paste(hour(b),minute(b), sep=":")
# numeric
hour(b) + minute(b)/60
for instance.
I would not advise to do that if you want to do any further operations on your data. However, it might be convenient to do that if you want to plot the results.
A datetime object contains date and time; you cannot extract 'just time'. So you have to think through what you want:
POSIXlt is a Datetime representation (as a list of components)
POSIXct is a different Datetime representation (as a compact numeric)
Neither one omits the Date part. Once you have a valid object, you can choose to display only the time. But you cannot make the Date part disappear from the representation.
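A short sketch of the two representations, using one of the timestamps from the question with the time zone fixed to UTC for reproducibility:

```r
ct <- as.POSIXct("2015-08-07 17:24:00", tz = "UTC")
lt <- as.POSIXlt(ct)

# POSIXct: a single number, seconds since 1970-01-01 00:00:00 UTC
secs_since_epoch <- unclass(ct)

# POSIXlt: a list of fields; the date fields are there alongside the time
fields <- c(hour = lt$hour, min = lt$min, mday = lt$mday)

# Showing only the time is a display choice; the object keeps its date
shown <- format(ct, "%H:%M:%S")

is.numeric(secs_since_epoch)
fields
shown
```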
A "modern" tidyverse answer to this is to use hms::as_hms()
For example
library(tidyverse)
library(hms)
as_hms(1)
#> 00:00:01
as_hms("12:34:56")
#> 12:34:56
or, with your example data:
x <- as.POSIXlt(c("17:24:00", "17:25:00", "17:26:00", "17:27:00"), format = "%H:%M:%S")
x
#>[1] "2021-04-10 17:24:00 EDT" "2021-04-10 17:25:00 EDT" "2021-04-10 17:26:00 EDT" "2021-04-10 17:27:00 EDT"
as_hms(x)
# 17:24:00
# 17:25:00
# 17:26:00
# 17:27:00
See also docs here:
https://hms.tidyverse.org/reference/hms.html
You can also use the chron package to extract just times of the day:
library(chron)
# current date/time in POSIXt format as an example
timenow <- Sys.time()
# create chron object "times"
onlytime <- times(strftime(timenow,"%H:%M:%S"))
> onlytime
[1] 14:18:00
> onlytime+1/24
[1] 15:18:00
> class(onlytime)
[1] "times"
This is my idiom for getting just the time part from a datetime object. I use floor_date() from lubridate to get midnight of the timestamp and take the difference between the timestamp and midnight of that day. I store the result in data frames as an hms object (from the hms package) because the class prints as hh:mm:ss, which is easy to read, while the underlying value is a numeric count of seconds. Here is my code:
library(tidyverse)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
# Create timestamps
#
# Get the time part by subtracting the floored date from the timestamp, make
# sure you convert to seconds, and then cast to a time object provided by the
# `hms` package.
# See: https://www.rdocumentation.org/packages/hms/versions/0.4.2/topics/hms
dt <- tibble(dt=c("2019-02-15T13:15:00", "2019-02-19T01:10:33") %>% ymd_hms()) %>%
mutate(timepart = hms::hms(as.numeric(dt - floor_date(dt, "1 day"), unit="secs")))
# Look at result
print(dt)
#> # A tibble: 2 x 2
#> dt timepart
#> <dttm> <time>
#> 1 2019-02-15 13:15:00 13:15
#> 2 2019-02-19 01:10:33 01:10
# `hms` object is really a `difftime` object from documentation, but is made into a `hms`
# object that defaults to always store data in seconds.
dt %>% pluck("timepart") %>% str()
#> 'hms' num [1:2] 13:15:00 01:10:33
#> - attr(*, "units")= chr "secs"
# Pull off just the timepart column
dt %>% pluck("timepart")
#> 13:15:00
#> 01:10:33
# Get numeric part. From documentation, `hms` object always stores in seconds.
dt %>% pluck("timepart") %>% as.numeric()
#> [1] 47700 4233
Created on 2019-02-15 by the reprex package (v0.2.1)
If you want it in POSIX format, the only way would be to leave it as it is, and extract just the "time" part every time you display it. But internally it will always be date + time anyway.
If you want it in numeric, however, you can simply convert it into a number.
For example, to get time as number of seconds passed since the beginning of the day:
df3$Time=df3$Time$sec + df3$Time$min*60 + df3$Time$hour*3600
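A self-contained sketch of that conversion, with a way back to hh:mm:ss for display. The sample vector stands in for df3$Time, and the zone is set to UTC only to keep the example reproducible.

```r
# Stand-in for df3$Time after strptime(): a POSIXlt vector
times <- strptime(c("17:24:00", "17:25:00"), format = "%H:%M:%S", tz = "UTC")

# Seconds elapsed since midnight, as plain numbers
secs <- times$sec + times$min * 60 + times$hour * 3600

# Rebuild a readable label from the number when needed
labels <- sprintf("%02d:%02d:%02d",
                  as.integer(secs %/% 3600),
                  as.integer((secs %% 3600) %/% 60),
                  as.integer(secs %% 60))

secs
labels
```

The numeric column sorts and subtracts correctly, and the labels can be regenerated at any point for plots or tables.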