Time Zone issue in R - r

I have a time-series object dat as
dat <- structure(list(timestamp = c("2015-07-01T00:00:06+05:30", "2015-07-01T00:00:36+05:30",
"2015-07-01T00:01:06+05:30", "2015-07-01T00:01:36+05:30", "2015-07-01T00:02:06+05:30",
"2015-07-01T00:02:37+05:30"), value = c(110535.421875, 110516.6484375,
110398.25, 110381.5703125, 110392.15625, 110471.609375)), .Names = c("timestamp",
"value"), row.names = c(NA, 6L), class = "data.frame")
This object is associated with a timezone offset of +05:30, which corresponds to "Asia/Kolkata". Whenever I try to convert the timestamp, I face the following issues:
With as.POSIXct(strptime(dat$timestamp, format = "%Y-%m-%dT%H:%M:%S%z")), it outputs empty strings.
If I remove the timezone information in strptime(), i.e., as.POSIXct(strptime(dat$timestamp, format = "%Y-%m-%dT%H:%M:%S")), the timestamps are interpreted in the timezone of the system.
How can I force the timezone to always be the one associated with the original data object?

Format %z can be used for input or output: it is a character string, conventionally plus or minus followed by two digits for hours and two for minutes. So you need to clean up the data first to change +05:30 into +0530.
strptime(gsub("05:30", "0530",dat$timestamp), format="%Y-%m-%dT%H:%M:%S%z")
If the data can contain a range of time zones, and assuming it is always in this standard format, you can remove the last colon from dat$timestamp like this:
strptime(gsub("(.*)\\:(.*)", "\\1\\2", dat$timestamp), "%Y-%m-%dT%H:%M:%S%z")
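As a sketch of the same idea with a stricter pattern (an assumption, not part of the answer above): anchoring the regular expression to a trailing +hh:mm or -hh:mm offset avoids touching any other colon, and passing tz = "UTC" keeps the result independent of the machine's time zone. The second sample timestamp is made up for illustration.

```r
# Two sample timestamps; the second one (-04:00) is hypothetical.
ts <- c("2015-07-01T00:00:06+05:30", "2015-07-01T00:00:36-04:00")

# Drop only the colon inside a trailing +hh:mm / -hh:mm offset.
clean <- sub("([+-][0-9]{2}):([0-9]{2})$", "\\1\\2", ts)

# %z reads the +hhmm offset; the instants are stored in UTC.
parsed <- as.POSIXct(clean, format = "%Y-%m-%dT%H:%M:%S%z", tz = "UTC")

# Display the same instants in the zone the data came from.
print(format(parsed, "%Y-%m-%d %H:%M:%S", tz = "Asia/Kolkata"))
```

The offset tells R which instant is meant, but it does not name a zone, so the display zone ("Asia/Kolkata" here) still has to be chosen explicitly.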

lubridate is your friend. It was written because the basic R functionality for handling dates and times is complete, but tiresome and unforgiving. lubridate is (comparatively) easy, flexible, and very forgiving of minor differences in format, tokens, etc.
install.packages("lubridate")
require(lubridate)
# the "z!*" will correctly parse any of these:
# "+0530", "+5:30", "+05:30"
dat$parsedTime <- parse_date_time(dat$timestamp, orders="ymd hms z!*")
#
# now print that
#
dat
format(dat$parsedTime, tz="Asia/Kolkata")
Internally, POSIXct objects are represented numerically in Unix time (https://en.wikipedia.org/wiki/Unix_time), which has no time zone of its own, so you specify the time zone you want when you print. By default, it's usually displayed in the time zone of the R machine, but the default displayed time zone can be changed by setting the tzone attribute, like so:
attr(dat$parsedTime, "tzone") <- "Asia/Kolkata"
Cheers
Jason
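A minimal base R sketch of the point about the tzone attribute, using a made-up instant rather than the parsed column: changing the attribute changes how the value prints, not the underlying Unix time.

```r
# One instant, created in UTC (sample value chosen for illustration).
x <- as.POSIXct("2015-06-30 18:30:06", tz = "UTC")

# Copy it and change only the display time zone.
y <- x
attr(y, "tzone") <- "Asia/Kolkata"

# The stored number of seconds since the epoch is unchanged ...
print(as.numeric(x) == as.numeric(y))

# ... but the printed wall-clock time differs.
print(format(x, "%Y-%m-%d %H:%M:%S"))
print(format(y, "%Y-%m-%d %H:%M:%S"))
```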

Related

How to convert numeric numbers to time stamp using R

I used this script
df$timestamp <- as.POSIXct(df$timestamp, tz = "", format = "%Y%m%d%H%M%S", origin = "1970-01-01")
and then I get NAs.
The number I have is 43466.10 and the result I should get is 2019-01-01 02:17:26.
What am I supposed to do?
In fact, 43466.10 is not a plausible Unix epoch time for a date in 2019: read as seconds, it is only about 12 hours after 1970-01-01 00:00 UTC. See https://en.wikipedia.org/wiki/Unix_time for more information.
This number looks like an Excel date/time value. If it truly is one, then "2019-01-01 02:17:26" is actually 43466.0954398148, and the origin you should use is "1899-12-30", not "1970-01-01".
You cannot use as.POSIXct directly in this case because Unix epoch time counts seconds, whereas an Excel date/time value counts days.
You can use the package openxlsx to do this conversion. Try this:
# install.packages("openxlsx")
df$timestamp <- openxlsx::convertToDateTime(df$timestamp, origin = "1899-12-30")
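If adding a package is not an option, the same arithmetic can be sketched in base R: multiply the day count by 86400 to get seconds and use the 1899-12-30 origin. This assumes the values really are Excel serials and that UTC is an acceptable time zone; the variable names are illustrative only.

```r
# Excel serial date/time values count days since 1899-12-30.
excel_serial <- c(43466.0954398148, 43466.10)

# Convert days to seconds; round so floating-point error does not
# drop the result one second short.
secs <- round(excel_serial * 86400)

# Interpret the seconds relative to the Excel origin (UTC assumed).
dt <- as.POSIXct(secs, origin = "1899-12-30", tz = "UTC")
print(format(dt, "%Y-%m-%d %H:%M:%S"))
# [1] "2019-01-01 02:17:26" "2019-01-01 02:24:00"
```

The second value shows that 43466.10 itself maps to 02:24:00, so the 43466.10 in the question is presumably a rounded display of 43466.0954398148.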

How to convert date and time into a numeric value in R

I am relatively new to R and I have a dataset in which I am trying to convert a date and time into a numeric value. The date and time are in the format 01JUN17:00:00:00 under a variable called pickup_datetime. I have tried using the code
cab_small_sample$pickup_datetime <- as.numeric(as.Date(cab_small_sample$pickup_datetime, format = '%d%b%y'))
but this way doesn't incorporate the time. I tried adding the time format to the format argument, but it still did not work. Is there an R function that will convert the data into a numeric value?
R has two main time classes: "Date" and "POSIXct". POSIXct is a datetime class, and you can get all the gory details at ?DateTimeClasses. The help page for the formats used when parsing input, however, is at ?strptime.
cab_small_sample <- data.frame(pickup_datetime = "01JUN17:00:00:00")
cab_small_sample$pickup_dt <- as.numeric(as.POSIXct(cab_small_sample$pickup_datetime,
format = '%d%b%y:%H:%M:%S'))
cab_small_sample
# pickup_datetime pickup_dt
#1 01JUN17:00:00:00 1496300400 # seconds since 1970-01-01
I find that a "destructive reassignment of values" is generally a bad idea so as a "my (best?) practice rule" I don't assign to the same column until I'm sure I have the code working properly. (And I always leave an untouched copy somewhere safe.)
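One caveat about the numeric result above: without a tz argument the string is interpreted in the session's time zone, so the number of seconds differs between machines. A small sketch that pins the zone (UTC is an assumption here, and the column name pickup_utc is hypothetical):

```r
cab_small_sample <- data.frame(pickup_datetime = "01JUN17:00:00:00")

# Parse with an explicit time zone so the result is reproducible.
# %b matches the month abbreviation and depends on an English/C locale.
cab_small_sample$pickup_utc <- as.numeric(
  as.POSIXct(cab_small_sample$pickup_datetime,
             format = "%d%b%y:%H:%M:%S", tz = "UTC")
)
print(cab_small_sample$pickup_utc)
# [1] 1496275200   (seconds since 1970-01-01 00:00:00 UTC)
```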
lubridate is an extremely handy package for dealing with dates. It includes a variety of functions which do the date/time parsing for you, as long as you can provide the order of components. In this case, since your data is in day-month-year-hms form, you can use the dmy_hms function.
library(lubridate)
cab_small_sample <- dplyr::tibble(
pickup_datetime = c("01JUN17:00:00:00", "01JUN17:11:00:00"))
cab_small_sample$pickup_POSIX <- dmy_hms(cab_small_sample$pickup_datetime)

Converting integer format date to double format of date

I have dates in the following format in a data frame:
Jan-85
Apr-99
1-Nov
Feb-96
When I check typeof(df$col), I get "integer".
When I look at the column in Excel, it is in m/d/yyyy format. I was trying to convert this to a date format in R, but all my efforts yielded NA.
I tried the parse_date_time function. I tried as.Date along with as.character. I tried as.POSIXct, but everything gives me NA.
My trials were as follows and everything was a failure:
as.Date.numeric(df$col,"m%d%Y")
transform(df$col, as.Date(as.character(df$col), "%m%d%Y"))
as.Date(df$col,"m%d%Y")
as.POSIXct.numeric(as.character(loan_new$issue_d), format="%Y%m%d")
as.POSIXct.date(as.character(df$col), format="%Y%m%d")
mdy(df$col)
parse_date_time(df$col,c("mdy"))
How can I convert this to a date format? I have used the lubridate package for parse_date_time and mdy.
dput output is below
Label <- factor(c("Apr-08",
"Apr-09", "Apr-10", "Apr-11", "Aug-07", "Aug-08", "Aug-09", "Aug-10",
"Aug-11", "Dec-07", "Dec-08", "Dec-09", "Dec-10", "Dec-11", "Feb-08",
"Feb-09", "Feb-10", "Feb-11", "Jan-08", "Jan-09", "Jan-10", "Jan-11",
"Jul-07", "Jul-08", "Jul-09", "Jul-10", "Jul-11", "Jun-07", "Jun-08",
"Jun-09", "Jun-10", "Jun-11", "Mar-08", "Mar-09", "Mar-10", "Mar-11",
"May-08", "May-09", "May-10", "May-11", "Nov-07", "Nov-08", "Nov-09",
"Nov-10", "Nov-11", "Oct-07", "Oct-08", "Oct-09", "Oct-10", "Oct-11",
"Sep-07", "Sep-08", "Sep-09", "Sep-10", "Sep-11"))
NA is typically what you get when you misspecify the format. Which is what you do. That said, if your data is really looking like the first example you gave, it's impossible to simply convert this to a date. You have two different formats, one being month-year and the other day-month.
If your updated date (i.e. Dec-11) is the correct format, then you use the format argument of as.Date like this:
date <- "Dec-11"
as.Date(date, format = "%b-%d")
# [1] "2017-12-11"
Or on your example data:
as.Date(Label, format = "%b-%d")
# [1] "2017-04-08" "2017-04-09" "2017-04-10" "2017-04-11" "2017-08-07" "2017-08-08"
# [7] "2017-08-09" "2017-08-10" "2017-08-11" "2017-12-07" "2017-12-08" "2017-12-09"
If you want to convert something like Jan-85, you have to decide which day of the month that date should have. Say we just take the first of each month, then you can do:
x <- "Jan-85"
xd <- paste0("1-",x)
as.Date(xd, "%d-%b-%y")
# [1] "1985-01-01"
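Applied to month-year labels like those in the question, the same approach might look like the sketch below. It assumes every label is month-year (not day-month) and that the first of the month is an acceptable day; the three sample labels are a subset chosen for illustration.

```r
# A factor like the one in the question (shortened sample).
label <- factor(c("Apr-08", "Dec-11", "Jan-85"))

# Convert the factor to character first, then prepend a day of month.
with_day <- paste0("01-", as.character(label))

# %b needs an English (or C) locale; %y maps 00-68 to 20xx and 69-99 to 19xx.
dates <- as.Date(with_day, format = "%d-%b-%y")
print(format(dates))
# [1] "2008-04-01" "2011-12-01" "1985-01-01"
```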
More information on the format codes can be found on ?strptime
Note that R will automatically add the current year as the year. It has to, otherwise it can't specify the date. If you do not have a day of the month (e.g. Jan-85), direct conversion to a date is impossible because the underlying POSIX algorithms don't have all the necessary information.
Also keep in mind that this only works when your locale is set to English. Otherwise there is a good chance your OS won't recognize the month abbreviations correctly. To set it, do e.g.:
Sys.setlocale(category = "LC_TIME", locale = "English_United Kingdom")
You can later set it back to the original one if you must, or restart your R session to reset the locale settings.
note: Please check carefully which locale notations are valid for your OS. The above example works on Windows, but is not guaranteed on either Linux or Mac.
Why you see integer
typeof() returns "integer" for these string values because R converted the character vector to a factor when the data frame was read in (the default before R 4.0.0, where stringsAsFactors = TRUE), and a factor is stored internally as integer codes.
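A short illustration of that internal representation, using two of the labels from the question:

```r
f <- factor(c("Jan-85", "Apr-99"))

print(typeof(f))        # "integer": the storage type of the level codes
print(class(f))         # "factor"
print(as.integer(f))    # codes into the sorted levels: 2 1
print(as.character(f))  # the original strings: "Jan-85" "Apr-99"
```

This is why as.character() should be applied before any date parsing: the integer codes are positions in the level set and carry no date information.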

Converting a numeric value to time in R [duplicate]

I have a data frame containing what should be a datetime column that has been read into R. The time values are appearing as numeric time as seen in the below data example. I would like to convert these into datetime POSIXct or POSIXlt format, so that date and time can be viewed.
tdat <- c(974424L, 974430L, 974436L, 974442L, 974448L, 974454L, 974460L, 974466L, 974472L,
974478L, 974484L, 974490L, 974496L, 974502L, 974508L, 974514L, 974520L, 974526L,
974532L,974538L)
974424 should equate to 00:00:00 01/03/2011, but I do not know the origin time of the numeric values (i.e. 1970-01-01, used below, does not work). I have tried commands such as those below and have spent time trying to get as.POSIXct to work, but I haven't found a solution (i.e. I either end up with a POSIXct object of NAs or with obscure datetime values).
Attempts to convert numeric time to datetime:
datetime <- as.POSIXct(strptime(time, format = "%d/%m/%Y %H:%M:%S"))
datetime <- as.POSIXct(as.numeric(time), origin='1970-01-01')
I am sure that this is a simple thing to do. Any help would be gratefully received. Thanks!
Try one of these depending on which time zone you want:
t.gmt <- as.POSIXct(3600 * (tdat - 974424), origin = '2011-03-01', tz = "GMT")
t.local <- as.POSIXct(format(t.gmt))
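The values step by 6 and 974424 / 24 is exactly 40601 days, which is the distance from 1900-01-01 to 2011-03-01. So one plausible reading, inferred from the arithmetic and not confirmed by the question, is that the column holds hours since 1900-01-01 00:00. Under that assumption the conversion can be sketched without the known reference point:

```r
tdat <- c(974424L, 974430L, 974436L)

# Assumed unit and origin: hours since 1900-01-01 00:00 UTC.
# 3600 is a double, so the product does not overflow integer range.
t_utc <- as.POSIXct(3600 * tdat, origin = "1900-01-01", tz = "UTC")
print(format(t_utc, "%Y-%m-%d %H:%M"))
# [1] "2011-03-01 00:00" "2011-03-01 06:00" "2011-03-01 12:00"
```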

Converting tick data to OHLC bars using R

How can I convert tick data into OHLC data using R? I have seen a couple of examples on here but the issue I am having is converting the actual times for the individual time stamps. For example the very first time stamp is 2013-07-29 15:30:00.
x <- read.delim("http://hopey.netfonds.no/tradedump.php?date=20130729&paper=AAPL.O&csv_format=txt", header = TRUE, stringsAsFactors = FALSE)
xx <- xts(x[,c(2:3)], as.POSIXct(x[,1], "UTC", "%Y%m%dT%H%M%S"))
to.period(xx,"seconds",5)
Just use to.period (or one of the wrappers) once you've created an xts object. To properly convert time to POSIXct, you have to specify the correct format (including the "T").
xx <- xts(x[,-1], as.POSIXct(x[,1], "UTC", "%Y%m%dT%H%M%S"))
to.period(xx, "seconds")
Also note that you should specify the timezone the time column was recorded in. I specified it as "UTC", since I don't know what timezone to use.
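To make the aggregation itself concrete, here is a base R sketch of what building 5-second OHLC bars involves. It is not how xts implements to.period, the tick values are made up, and it labels each bar by the start of its interval (to.period may label bars differently).

```r
# Hypothetical ticks: timestamps in UTC and trade prices.
ticks <- data.frame(
  time = as.POSIXct(c("2013-07-29 15:30:00", "2013-07-29 15:30:02",
                      "2013-07-29 15:30:04", "2013-07-29 15:30:05",
                      "2013-07-29 15:30:09"), tz = "UTC"),
  price = c(440.0, 440.5, 439.8, 440.2, 440.9)
)

# Floor each timestamp to the start of its 5-second bucket.
bucket <- as.POSIXct(floor(as.numeric(ticks$time) / 5) * 5,
                     origin = "1970-01-01", tz = "UTC")
key <- format(bucket, "%Y-%m-%d %H:%M:%S")

# Open = first tick, high = max, low = min, close = last tick per bucket.
ohlc <- do.call(rbind, lapply(split(ticks$price, key), function(p) {
  data.frame(open = p[1], high = max(p), low = min(p), close = p[length(p)])
}))
print(ohlc)
```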
