I've searched for threads about timestamp conversion in R, but could not figure this out.
I need to convert a time column into a timestamp so that R reads it as dates. When a cell contains only a date without a time there is no problem, but in the current format (with or without a trailing + in the cell) R treats the column as integer or factor.
How do I convert it into a timestamp?
Thank you.
You do not need to remove the +:
R> crappyinput <- c("2014-11-29 15:23:02+", "2014-11-29 15:38:36+",
+ "2014-11-29 15:52:49+")
R> pt <- strptime(crappyinput, "%Y-%m-%d %H:%M:%S")
R> pt
[1] "2014-11-29 15:23:02 CST" "2014-11-29 15:38:36 CST" "2014-11-29 15:52:49 CST"
R>
It will simply be ignored as trailing garbage.
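The same approach works on a whole column read from a CSV. A minimal sketch, assuming the data frame and column are called df and time (the names here are hypothetical):
df <- data.frame(time = c("2014-11-29 15:23:02+", "2014-11-29 15:38:36+"))
## coerce to character first (older R versions read strings in as factors), then parse;
## the trailing + is again ignored as trailing garbage
df$time <- as.POSIXct(as.character(df$time), format = "%Y-%m-%d %H:%M:%S")
str(df$time)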
Would this work for you?
t <- c("2014-11-29 15:23:02+")
t <- substr(t, 1, nchar(t)-1)
t
[1] "2014-11-29 15:23:02"
t <- strptime(t, format="%Y-%m-%d %H:%M:%S")
str(t)
POSIXlt[1:1], format: "2014-11-29 15:23:02"
Related
I am designing a Flex dashboard. One of the columns in my dashboard is a timestamp whose entries look like 2020-03-02T16:30:36Z. I want to convert it into dd/mm/yyyy hh:mm:ss. Please help.
I tried this, but nothing happened. In fact, the entries were removed from the Flex dashboard:
df$time<- as.POSIXct(df$time,
format="%Y-%m-%dT%H:%M:%OSZ", tz="GMT")
The anytime package can help:
R> library(anytime)
R> anytime("2020-03-02T16:30:36Z")
[1] "2020-03-02 16:30:36 CST"
R> utctime("2020-03-02T16:30:36Z", tz="UTC")
[1] "2020-03-02 16:30:36 UTC"
R>
It helps in three ways. First, it does not require an input format; instead it heuristically tries a number of plausible formats. Second, it also offers parsing at UTC (and, as we do here, imposing UTC for the printed display, which is otherwise local time). Third, it also provides some output formatters should you need them:
R> pt <- utctime("2020-03-02T16:30:36Z", tz="UTC")
R> iso8601(pt)
[1] "2020-03-02T16:30:36"
R> rfc2822(pt)
[1] "Mon, 02 Mar 2020 16:30:36.000000 +0000"
R> rfc3339(pt)
[1] "2020-03-02T16:30:36.000000+0000"
R> yyyymmdd(pt)
[1] "20200302"
R>
The underlying implementation is in C++ so it also tends to be faster than the equivalent alternatives (which require a format spec or hint).
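If you want to check the speed claim on your own data, a rough sketch using the microbenchmark package (an extra dependency, assumed here purely for illustration) could look like this:
library(anytime)
library(microbenchmark)
x <- rep("2020-03-02T16:30:36Z", 1000)
## compare the heuristic parser against base R with an explicit format string
microbenchmark(
    anytime = anytime(x),
    base    = as.POSIXct(x, format = "%Y-%m-%dT%H:%M:%SZ", tz = "UTC"),
    times = 10
)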
lubridate's as_datetime() function also works:
library(lubridate)
as_datetime("2020-03-02T16:30:36Z")
[1] "2020-03-02 16:30:36 UTC"
I have several variables that exist in the following format:
/Date(1353020400000+0100)/
I want to convert this format to ddmmyyyy. I found a solution for the same problem using PHP, but I don't know any PHP, so I'm unable to translate that solution into something I can use in R.
Any suggestions?
Thanks.
If the format is milliseconds since the epoch then anytime() or as.POSIXct() can help you:
R> anytime(1353020400000/1000)
[1] "2012-11-15 17:00:00 CST"
R> anytime(1353020400.000)
[1] "2012-11-15 17:00:00 CST"
R>
anytime() converts to local time, which is Chicago for me. You would have to deal with the UTC offset separately.
Base R can do it too, but you need the dreaded origin:
R> as.POSIXct(1353020400.000, origin="1970-01-01")
[1] "2012-11-15 17:00:00 CST"
R>
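To get the ddmmyyyy form asked about, format the parsed value afterwards, for example:
pt <- as.POSIXct(1353020400.000, origin = "1970-01-01", tz = "UTC")
format(pt, "%d-%m-%Y")
# [1] "15-11-2012"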
As far as I can tell from the linked question, this is milliseconds since the epoch:
x <- "/Date(1353020400000+0100)/"
spl <- strsplit(x, "[()+]")
as.POSIXct(as.numeric(sapply(spl,`[[`,2)) / 1000, origin="1970-01-01", tz="UTC")
#[1] "2012-11-15 23:00:00 UTC"
If you want to pick up the timezone difference as well, here's an attempt:
x <- "/Date(1353020400000+0100)/"
spl <- strsplit(x, "(?=[+-])|[()]", perl=TRUE)
tzo <- sapply(spl, function(x) paste(x[3:4],collapse="") )
dt <- as.POSIXct(as.numeric(sapply(spl,`[[`,2)) / 1000, origin="1970-01-01", tz="UTC")
as.POSIXct(paste(format(dt), tzo), tz="UTC", format = '%F %T %z')
#[1] "2012-11-15 22:00:00 UTC"
The package lubridate can come to the rescue as follows:
as.Date("1970-01-01") + lubridate::milliseconds(1353020400000)
Read: the number of milliseconds since the epoch (1 January 1970, UTC+0).
A parsing function can now be made using regular expressions:
parse.myDate <- function(text) {
    num <- as.numeric(stringr::str_extract(text, "(?<=/Date\\()\\d+"))
    as.Date("1970-01-01") + lubridate::milliseconds(num)
}
Finally, format the result with
format(theDate, "%d/%m/%Y %H:%M")
If you also need the time zone information, you can use this instead:
parse.myDate <- function(text) {
    parts <- stringr::str_match(text, "^/Date\\((\\d+)([+-])(\\d{4})\\)/$")
    # Etc/GMT zone names invert the sign, so an offset of +0100 corresponds to "Etc/GMT-1"
    tzsign <- ifelse(parts[, 3] == "+", "-", "+")
    as.POSIXct(as.numeric(parts[, 2]) / 1000, origin = "1970-01-01",
               tz = paste0("Etc/GMT", tzsign, as.integer(parts[, 4]) / 100))
}
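A quick usage sketch on the example value from the question:
parse.myDate("/Date(1353020400000+0100)/")
## the instant is 2012-11-15 23:00:00 UTC, displayed at the +01:00 offset,
## i.e. 2012-11-16 00:00:00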
This should be quick - we are parsing the following format in R:
2013-04-05T07:49:54-07:00
My current approach is
require(stringr)
timenoT <- str_replace_all("2013-04-05T07:49:54-07:00", "T", " ")
timep <- strptime(timenoT, "%Y-%m-%d %H:%M:%S%z", tz="UTC")
but it gives NA.
%z is the signed offset from UTC in the form ±hhmm (hours and minutes), not ±hh:mm. Here's one way to remove the last colon:
newstring <- gsub("(.*).(..)$","\\1\\2","2013-04-05T07:49:54-07:00")
(timep <- strptime(newstring, "%Y-%m-%dT%H:%M:%S%z", tz="UTC"))
# [1] "2013-04-05 14:49:54 UTC"
Also note that you don't have to remove the "T".
You don't need the string replacement.
NA just means that the whole thing did not work, so build your expression up in pieces:
R> strptime("2013-04-05T07:49:54-07:00", "%Y-%m-%d")
[1] "2013-04-05"
R> strptime("2013-04-05T07:49:54-07:00", "%Y-%m-%dT%H:%M")
[1] "2013-04-05 07:49:00"
R> strptime("2013-04-05T07:49:54-07:00", "%Y-%m-%dT%H:%M:%S")
[1] "2013-04-05 07:49:54"
R>
Also, for reasons I never fully understood -- but which probably lie with the C library function underlying it -- %z only works on output, not input. So your NA most likely comes from your use of %z.
strptime("2013-04-05 07:49:54-07:00", "%Y-%m-%d %H:%M:%S", tz="UTC") gives 2013-04-05 07:49:54 UTC
Try
timep <- strptime(timenoT, "%Y-%m-%d %H:%M:%S", tz="UTC")
I am reading a date value from a csv file, so the format will vary according to the date format of the csv. How can I convert any date string to dd-mm-yyyy HH:mm:ss?
EDIT:
The input formats are:
dd/mm/yyyy HH:mm:ss
dd/mm/yyyy
dd-mm-yyyy HH:mm:ss
dd-mm-yyyy
mm-dd-yyyy HH:mm:ss
mm-dd-yyyy
mm/dd/yyyy
yyyy-mm-dd HH:mm:ss
yyyy-mm-dd
I need to convert all these formats to dd-mm-yyyy HH:mm:ss
See the anytime package whose anytime function does just that -- and without requiring a format string:
> inputs <- c("12/07/2017 10:11:12", "12/07/2017", "12-07-2017 10:11:12",
+ "07-12-2017", "2017-12-07 10:11:12", "2017-12-07")
> library(anytime)
> anytime(inputs)
[1] "2017-12-07 10:11:12 CST" "2017-12-07 00:00:00 CST"
[3] "2017-12-07 10:11:12 CST" "2017-07-12 00:00:00 CDT"
[5] "2017-12-07 10:11:12 CST" "2017-12-07 00:00:00 CST"
>
However, your requirement of accepting both d-m-y and m-d-y is not satisfiable. So you need to make a choice and supply an explicit format here.
In general, I highly recommend avoiding the ambiguity and sticking to y-m-d ISO formats. As a convenience to stubborn North American habits, anytime and anydate also accept m-d-y ordering, but it is dangerous.
Again, only you can tell if 3-4-5 is April 3rd or March 4th, and you need to specify that.
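To make that concrete, here is essentially the same string parsed under both conventions (with a four-digit year for clarity); only an explicit format tells R which one you mean:
as.Date("3-4-2015", format = "%d-%m-%Y")
# [1] "2015-04-03"   (April 3rd)
as.Date("3-4-2015", format = "%m-%d-%Y")
# [1] "2015-03-04"   (March 4th)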
as.Date() converts a string into a Date object. You will need to adapt its format parameter to the specific format your csv has.
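For instance, assuming one of the formats listed above:
as.Date("07-12-2017", format = "%d-%m-%Y")
# [1] "2017-12-07"
For values with a time component, as.POSIXct() takes the same format argument, e.g. "%d-%m-%Y %H:%M:%S".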
Try "lubridate" package. Here A_1.csv has both formats.
Data <-read.csv("Al_1.csv") # import data
str(Data)
a = NULL # create a null object
library(lubridate)
a$Date <- mdy_hm(Data$Date) # store dd/mm/yyyy HH:mm:ss objects here
a$Price <- Data$Price # get respective values(it could by any other column)
b = NULL
b$Date <- mdy(Data$Date)
b$Price <- Data$Price
a <- as.data.frame(a)
b <- as.data.frame(b)
a <- a[is.na(a$Date)==FALSE,] # those with NA had diffrent formats remove it
b <- b[is.na(b$Date)==FALSE,]
b$Date <- as.POSIXlt(b$Date)# change your other format also to UTC
x <- rbind(a,b)
str(x)
x <- ts(x)
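If the formats really are mixed within one column, lubridate's parse_date_time() can also try several orders in one call. A sketch (it still cannot resolve the d-m-y versus m-d-y ambiguity for you, so list only the orders you actually expect):
library(lubridate)
x <- c("12/07/2017 10:11:12", "12-07-2017", "2017-12-07 10:11:12", "2017-12-07")
parsed <- parse_date_time(x, orders = c("dmy HMS", "dmy", "ymd HMS", "ymd"))
## then display in the requested dd-mm-yyyy HH:mm:ss form
format(parsed, "%d-%m-%Y %H:%M:%S")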