Error while converting to Date format in R - r

It should be an easy issue, but I got stuck on it. I have a data.frame with dates and values:
class(var_data)
[1] "tbl_df" "tbl" "data.frame"
var_data
A tibble: 42 x 2
date Tourists
<dttm> <dbl>
1 2006-03-01 00:00:00 55280.
2 2006-06-01 00:00:00 84392.
3 2006-09-01 00:00:00 132714.
Then I want to copy some dates and values into another data.frame:
var_list_DB$var_last[ii] <- var_data[last,"Tourists"]
var_list_DB$var_date_start[ii] <- var_data[1,"date"]
var_list_DB$var_date_last[ii] <- var_data[last,"date"]
But instead of dates I got numbers:
var_date_start var_date_last var_val_last
951868800 1496275200 10044.3162
And while trying to convert to date format, got an error:
as.Date(var_data[last,"date"], format = "%m/%d/%Y")
Error in as.Date.default(x, ...) :
do not know how to convert 'x' to class “Date”
I recently updated to R 3.5.0; maybe that is the issue.

Convert var_data to a plain data.frame and add an as.character() conversion before converting to a date. Here are two examples, using as.Date and as.POSIXct:
var_data<-data.frame(var_data)
as.Date(as.character(var_data[,"date"]))
[1] "2006-03-01" "2006-06-01" "2006-09-01"
as.POSIXct(as.character(var_data[,"date"]))
[1] "2006-03-01 CET" "2006-06-01 CEST" "2006-09-01 CEST"
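A likely reason the original call failed is that single-bracket indexing on a tibble returns another tibble rather than a vector, and as.Date() has no method for a data frame. The sketch below uses base R only, so it imitates that behaviour with drop = FALSE on an ordinary data.frame; that stand-in, and the names cell, as_number and as_time, are assumptions made for illustration. It also shows why the copied dates turned into numbers: assigning a POSIXct value into a numeric vector keeps only the underlying seconds.

```r
# Base-R stand-in for the tibble in the question (values from its first rows).
var_data <- data.frame(
  date = as.POSIXct(c("2006-03-01", "2006-06-01", "2006-09-01"), tz = "UTC"),
  Tourists = c(55280, 84392, 132714)
)

# drop = FALSE keeps a one-cell data frame, which is what `[` returns on a tibble.
cell <- var_data[1, "date", drop = FALSE]
cell_result <- tryCatch(as.Date(cell), error = function(e) "conversion failed")

# `[[` extracts the POSIXct column itself, which as.Date() knows how to convert.
first_date <- as.Date(var_data[["date"]][1], tz = "UTC")

# Assigning into a plain numeric vector drops the class and leaves raw seconds.
as_number <- numeric(1)
as_number[1] <- var_data[["date"]][1]

# Assigning into a vector that is already POSIXct keeps the class.
as_time <- as.POSIXct(NA, tz = "UTC")
as_time[1] <- var_data[["date"]][1]

print(cell_result)
print(first_date)
print(class(as_number))
print(class(as_time))
```

In the question's code this would suggest creating var_date_start and var_date_last as POSIXct (or Date) columns before filling them in the loop.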

Related

Format date in R with lubridate

My input data, formatted as character, looks like this
"2020-07-10T00:00:00"
I tried
library(lubridate)
mdy_hms("2020-07-10T00:00:00", format='%Y-%m-%dT%H:%M:%S', tz=Sys.timezone())
But I get
[1] NA NA
Warning message:
All formats failed to parse. No formats found.
I tried the more flexible approach, parse_date_time(), but without luck:
parse_date_time("2020-07-10T00:00:00", '%Y-%m-%dT%H:%M:%S', tz=Sys.timezone())
How can I convert this date "2020-07-10T00:00:00" to a date R recognizes? Note: I am not interested in the time really, only the date!
Why not just
as.Date("2020-07-10T00:00:00")
# [1] "2020-07-10"
Fun fact:
as.Date("2020-07-101sddT00:1sdafsdfsdf0:00sdfzsdfsdfsdf")
# [1] "2020-07-10"
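The "fun fact" works because as.Date() on a character vector tries the format "%Y-%m-%d" first, and the underlying strptime() ignores anything left over after the format has been matched. A minimal base R sketch, assuming the input is always in year-month-day order and that UTC is an acceptable time zone for the date-time variant:

```r
x <- "2020-07-10T00:00:00"

# Default parsing: "%Y-%m-%d" matches the first ten characters, the rest is ignored.
d <- as.Date(x)

# Spelling the format out makes the intent explicit and gives the same date.
d_explicit <- as.Date(x, format = "%Y-%m-%d")

# To keep the time as well, describe the whole string, including the literal "T".
dt <- as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

print(d)
print(d_explicit)
print(dt)
```

Relying on ignored trailing text is convenient but silent, so an explicit format is the safer choice when the input may vary.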
Assuming that 07 is the month (July) and 10 is the day, the string is in year-month-day order, so use ymd_hms() rather than mdy_hms():
x <- "2020-07-10T00:00:00"
ymd_hms(x, tz = Sys.timezone())
> [1] "2020-07-10 AEST"
If the string is in year-day-month order instead, use ydm_hms() in place of ymd_hms().
Hope this helps!

Reconvert numeric date to POSIXct R

I have a date that I convert to a numeric value and want to convert back to a date afterwards.
Converting date to numeric:
date1 = as.POSIXct('2017-12-30 15:00:00')
date1_num = as.numeric(date1)
# 1514646000
Reconverting numeric to date:
as.Date(date1_num, origin = '1/1/1970')
# "4146960-12-12"
What am I missing with the reconversion? I'd expect the last command to return my original date1.
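Because the number was created from an object with a time component, it holds seconds since the epoch, not days. Convert it back the same way: first to POSIXct, then wrap the result in as.Date. Note that the origin should be written in ISO format ('1970-01-01'):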
as.Date(as.POSIXct(date1_num, origin = '1970-01-01'))
#[1] "2017-12-30"
You could use anytime() and anydate() from the anytime package:
R> library(anytime)
R> pt <- anytime("2017-12-30 15:00:00")
R> pt
[1] "2017-12-30 15:00:00 CST"
R> anydate(pt)
[1] "2017-12-30"
R> as.numeric(pt)
[1] 1514667600
R> anydate(as.numeric(pt))
[1] "2017-12-30"
POSIXct counts the number of seconds since the Unix Epoch, while Date counts the number of days. So you can recover the date by dividing by (60*60*24) (let's ignore leap seconds), or convert back to POSIXct instead.
as.Date(as.numeric(date1)/(60*60*24), origin="1970-01-01")
[1] "2017-12-30"
as.POSIXct(as.numeric(date1),origin="1970-01-01")
[1] "2017-12-30 15:00:00 GMT"
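The two routes described above can be put side by side as a round trip. This is a small base R sketch that fixes the time zone to UTC (an assumption made here so the result does not depend on the local time zone); the variable names are only illustrative.

```r
# Original date-time, pinned to UTC so the numbers are reproducible.
date1 <- as.POSIXct("2017-12-30 15:00:00", tz = "UTC")

# POSIXct stores seconds since 1970-01-01 00:00:00 UTC.
secs <- as.numeric(date1)

# Route 1: seconds back to POSIXct, keeping the time of day.
back <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

# Route 2: Date counts days, so divide the seconds by 86400 first.
day_only <- as.Date(secs / 86400, origin = "1970-01-01")

print(secs)
print(back)
print(day_only)
```

Passing the raw seconds straight to as.Date() treats them as days, which is why the question ended up millions of years in the future.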
Using lubridate :
lubridate::as_datetime(1514646000)
[1] "2017-12-30 15:00:00 UTC"

strptime outputting NA when I try to convert from "%Y-%m" format

strptime outputs NA when I set format to "%Y-%m".
I tried adding the day as a test and it worked, but whenever I use "%Y-%m" or "%m" I get NA:
print(strptime("2007-07", format = "%Y-%m"))
[1] NA
print(strptime("07", format = "%m"))
[1] NA
print(strptime("2007", format = "%Y"))
[1] "2007-07-30 EDT"
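Use the zoo package. Its yearmon class represents a year and month without a day, which is useful for dates like this.
library(zoo)
as.yearmon("2007-07")
[1] "Jul 2007"
You can then work with the yearmon object, for example converting it to a Date (the first day of the month):
as.Date(as.yearmon("2007-07"))
[1] "2007-07-01"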
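The NA is documented behaviour: according to ?strptime, when a month is specified the day of the month has to be given as well, because the current day might not be valid for that month. If adding a package is not an option, one base R workaround is to append a day before parsing. The sketch below assumes the first of the month is an acceptable placeholder, and the variable names are only illustrative.

```r
x <- c("2007-07", "2008-12")

# A month without a day cannot be parsed, so this gives NA.
no_day <- as.Date(x, format = "%Y-%m")

# Append a placeholder day, then parse the complete date.
with_day <- as.Date(paste0(x, "-01"), format = "%Y-%m-%d")

# Format back to year-month when only that part should be displayed.
labels <- format(with_day, "%Y-%m")

print(no_day)
print(with_day)
print(labels)
```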

Date time conversion and extract only time

I want to change the class of Time to POSIXlt and extract only the hours, minutes and seconds.
str(df3$Time)
chr [1:2075259] "17:24:00" "17:25:00" "17:26:00" "17:27:00" ...
I used the strptime function:
df3$Time <- strptime(df3$Time, format = "%H:%M:%S")
This prepends the current date to each time:
> str(df3$Time)
POSIXlt[1:2075259], format: "2015-08-07 17:24:00" "2015-08-07 17:25:00" "2015-08-07 17:26:00" ...
I wanted to extract just the time without changing the POSIXlt class, so I used the strftime function:
df3$Time <- strftime(df3$Time, format = "%H:%M:%S")
but this converts the class back to character:
> class(df3$Time)
[1] "character"
How can I extract just the time while keeping the class as POSIX or numeric?
If your data is
a <- "17:24:00"
b <- strptime(a, format = "%H:%M:%S")
you can use lubridate in order to have a result of class integer
library(lubridate)
hour(b)
minute(b)
# > hour(b)
# [1] 17
# > minute(b)
# [1] 24
# > class(minute(b))
# [1] "integer"
and you can combine them using
# character
paste(hour(b),minute(b), sep=":")
# numeric
hour(b) + minute(b)/60
for instance.
I would not advise to do that if you want to do any further operations on your data. However, it might be convenient to do that if you want to plot the results.
A datetime object contains both date and time; you cannot extract "just the time" from it. So you have to think through what you want:
POSIXlt is a Datetime representation (as a list of components)
POSIXct is a different Datetime representation (as a compact numeric)
Neither one omits the Date part. Once you have a valid object, you can choose to display only the time. But you cannot make the Date part disappear from the representation.
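In practice this means keeping the full date-time in the data and formatting it only at display time. A minimal base R sketch, assuming a UTC timestamp chosen purely for illustration:

```r
# The stored value keeps its date part and its POSIXct class.
x <- as.POSIXct("2015-08-07 17:24:00", tz = "UTC")

# format() builds a character label for display and leaves x untouched.
shown <- format(x, "%H:%M:%S")

print(class(x))
print(shown)
```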
A "modern" tidyverse answer to this is to use hms::as_hms()
For example
library(tidyverse)
library(hms)
as_hms(1)
#> 00:00:01
as_hms("12:34:56")
#> 12:34:56
or, with your example data:
x <- as.POSIXlt(c("17:24:00", "17:25:00", "17:26:00", "17:27:00"), format = "%H:%M:%S")
x
#>[1] "2021-04-10 17:24:00 EDT" "2021-04-10 17:25:00 EDT" "2021-04-10 17:26:00 EDT" "2021-04-10 17:27:00 EDT"
as_hms(x)
# 17:24:00
# 17:25:00
# 17:26:00
# 17:27:00
See also docs here:
https://hms.tidyverse.org/reference/hms.html
You can also use the chron package to extract just times of the day:
library(chron)
# current date/time in POSIXt format as an example
timenow <- Sys.time()
# create chron object "times"
onlytime <- times(strftime(timenow,"%H:%M:%S"))
> onlytime
[1] 14:18:00
> onlytime+1/24
[1] 15:18:00
> class(onlytime)
[1] "times"
This is my idiom for getting just the time part from a datetime object. I use floor_date() from lubridate to get midnight of the timestamp's day, then take the difference between the timestamp and that midnight. I store the result in data frames as an hms object (from the hms package) because the class prints as hh:mm:ss, which is easy to read, while the underlying value is a numeric number of seconds. Here is my code:
library(tidyverse)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#>
#> date
# Create timestamps
#
# Get the time part by subtracting the floored date from the timestamp, make sure
# you convert to seconds, and then cast to a time object provided by the
# `hms` package.
# See: https://www.rdocumentation.org/packages/hms/versions/0.4.2/topics/hms
dt <- tibble(dt = c("2019-02-15T13:15:00", "2019-02-19T01:10:33") %>% ymd_hms()) %>%
  mutate(timepart = hms::hms(as.numeric(dt - floor_date(dt, "1 day"), units = "secs")))
# Look at result
print(dt)
#> # A tibble: 2 x 2
#> dt timepart
#> <dttm> <time>
#> 1 2019-02-15 13:15:00 13:15
#> 2 2019-02-19 01:10:33 01:10
# `hms` object is really a `difftime` object from documentation, but is made into a `hms`
# object that defaults to always store data in seconds.
dt %>% pluck("timepart") %>% str()
#> 'hms' num [1:2] 13:15:00 01:10:33
#> - attr(*, "units")= chr "secs"
# Pull off just the timepart column
dt %>% pluck("timepart")
#> 13:15:00
#> 01:10:33
# Get numeric part. From documentation, `hms` object always stores in seconds.
dt %>% pluck("timepart") %>% as.numeric()
#> [1] 47700 4233
Created on 2019-02-15 by the reprex package (v0.2.1)
If you want it in POSIX format, the only way is to leave it as it is and extract just the time part every time you display it. Internally it will always be date + time anyway.
If you want it as a number, however, you can simply convert it.
For example, to get the time as the number of seconds since the beginning of the day (this assumes df3$Time is of class POSIXlt):
df3$Time <- df3$Time$sec + df3$Time$min * 60 + df3$Time$hour * 3600
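As a self-contained illustration of the seconds-since-midnight idea, the sketch below parses a few of the question's time strings, converts them to numbers, and turns the numbers back into hh:mm:ss text. The vector names and the choice of UTC are assumptions made for the example.

```r
times_chr <- c("17:24:00", "17:25:00", "17:26:00")

# strptime() returns POSIXlt, whose hour, min and sec components can be read directly.
lt <- strptime(times_chr, format = "%H:%M:%S", tz = "UTC")

# Seconds since the start of the day, as a plain numeric vector.
secs <- lt$hour * 3600 + lt$min * 60 + lt$sec

# Rebuild the hh:mm:ss text from the numeric value when it needs to be shown.
whole <- as.integer(secs)
back <- sprintf("%02d:%02d:%02d", whole %/% 3600L, (whole %% 3600L) %/% 60L, whole %% 60L)

print(secs)
print(back)
```

A numeric column like this sorts, subtracts and plots without any date attached, at the cost of needing the formatting step for display.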

ymd with vector of dates

A simple question, I think. I have some dates, d:
d <- as.POSIXct(c("2014-01-01 00:00:00 BST", "2014-01-01 00:30:00 BST"))
> class(d)
[1] "POSIXct" "POSIXt"
If I try and extract just the date part with lubridate, it works fine with a single value but not the whole vector, i.e.:
> ymd(d[1])
[1] "2014-01-01 UTC"
> ymd(d)
[1] NA NA
Warning message:
All formats failed to parse. No formats found.
For the record, this works:
> as.Date(d, format="%F")
[1] "2014-01-01" "2014-01-01"
What's going on here?
Your issue is that your vector is not just year, month, day (ymd), but also hour, minute, second (hms). Consider using this instead:
ymd_hms(d)
If you want to just extract the date, you can use:
strftime(ymd_hms(d),'%Y-%m-%d')
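Note that strftime() returns character strings rather than dates. If a Date object is wanted, as.Date() can be applied to the POSIXct vector directly. The base R sketch below pins the time zone to UTC (an assumption; for other zones, passing tz explicitly to as.Date() avoids the date shifting across midnight).

```r
d <- as.POSIXct(c("2014-01-01 00:00:00", "2014-01-01 00:30:00"), tz = "UTC")

# Date class: suitable for arithmetic and comparisons.
as_date <- as.Date(d, tz = "UTC")

# Character class: suitable for labels only.
as_text <- format(d, "%Y-%m-%d")

print(class(as_date))
print(as_date)
print(class(as_text))
print(as_text)
```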
