How can I convert tick data into OHLC data using R? I have seen a couple of examples on here but the issue I am having is converting the actual times for the individual time stamps. For example the very first time stamp is 2013-07-29 15:30:00.
x <- read.delim(header=TRUE, stringsAsFactors=FALSE,"http://hopey.netfonds.no/tradedump.php?date=20130729&paper=AAPL.O&csv_format=txt")
xx <- xts(x[,c(2:3)], as.POSIXct(x[,1], "UTC", "%Y%m%dT%H%M%S"))
to.period(xx,"seconds",5)
Just use to.period (or one of the wrappers) once you've created an xts object. To properly convert time to POSIXct, you have to specify the correct format (including the "T").
xx <- xts(x[,-1], as.POSIXct(x[,1], "UTC", "%Y%m%dT%H%M%S"))
to.period(xx, "seconds")
Also note that you should specify the timezone the time column was recorded in. I specified it as "UTC", since I don't know what timezone to use.
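To make explicit what the aggregation computes, here is a minimal base R sketch. It does not use xts, and the tick prices are invented for illustration. It groups timestamps into 5-second buckets and takes the first, highest, lowest and last price in each bucket. to.period may align its bars differently, so treat this as an illustration of the idea rather than a replacement.

```r
# Invented tick data in the same timestamp format as the question
times <- as.POSIXct(c("20130729T153000", "20130729T153001", "20130729T153003",
                      "20130729T153005", "20130729T153009"),
                    tz = "UTC", format = "%Y%m%dT%H%M%S")
price <- c(10.0, 10.5, 9.8, 10.2, 10.4)

# Assign every tick to a 5-second bucket (seconds since the epoch, divided by 5)
bin <- floor(as.numeric(times) / 5)

# One row per bucket: first, max, min and last price
ohlc <- do.call(rbind, lapply(split(price, bin), function(p) {
  c(Open = p[1], High = max(p), Low = min(p), Close = p[length(p)])
}))
ohlc
```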
I used this script:
df$timestamp <- as.POSIXct(df$timestamp, tz="", format="%Y%m%d%H%M%S", origin = "1970-01-01")
Then I get NAs.
The number I have is 43466.10 and the result I should get is 2019-01-01 02:17:26.
What am I supposed to do?
In fact, 43466.10 is not a valid Unix epoch time. See https://en.wikipedia.org/wiki/Unix_time for more information.
This number looks like an Excel date/time value. If it is, then "2019-01-01 02:17:26" corresponds to 43466.0954398148, and the origin you should use is "1899-12-30", not "1970-01-01".
You cannot pass the value to as.POSIXct unchanged, because Unix epoch time counts seconds, whereas an Excel date/time value counts days.
You can use the package openxlsx to do this conversion. Try this:
# install.packages("openxlsx")
df$timestamp <- openxlsx::convertToDateTime(df$timestamp, origin = "1899-12-30")
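If you prefer to stay in base R, the same conversion can be sketched by turning days into seconds first. This assumes the value really is an Excel serial date and that the time should be read as UTC; both are assumptions, so adjust tz as needed.

```r
# Excel serial date/time: days since 1899-12-30 (value from the answer above)
excel_serial <- 43466.0954398148

# Convert days to seconds. Rounding to whole seconds avoids floating-point
# error that would otherwise print 02:17:25 instead of 02:17:26.
secs <- round(excel_serial * 86400)

ts <- as.POSIXct(secs, origin = "1899-12-30", tz = "UTC")
format(ts, "%Y-%m-%d %H:%M:%S")
# "2019-01-01 02:17:26"
```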
I'm relatively new to R. I'm doing an experiment where I record the exact time at which pairs of insects mate over 14 days under different light conditions (12:12 h light/dark, continuous light, continuous dark). The idea is to analyze this data with ANOVA, but I'm having problems with the data. I have a .csv file with 3 columns: light condition, date and time. The date is not required for the analysis, so I don't need it. But I have trouble converting the time data into a form R can work with. I've already tried read.csv(file="", stringsAsFactors = FALSE) but it doesn't work at all. I've also tried lubridate, the as.POSIX functions and strptime(), but nothing seems to work (or maybe I'm not converting the data into a form suitable for analysis).
Thank you in advance.
It looks like you have date and time in two different columns and for time you have only hour and minutes. You can combine them using paste and convert to date time using appropriate format.
df$DateTime <- as.POSIXct(paste(df$Dia, df$Hora),
format = "%Y-%m-%d %H:%M",tz = "UTC")
#Can also use strptime
df$DateTime <- strptime(paste(df$Dia, df$Hora),
format = "%Y-%m-%d %H:%M", tz = "UTC")
Or with lubridate
df$DateTime <- lubridate::ymd_hm(paste(df$Dia, df$Hora))
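Since the question says the date is not needed, one option is to reduce the time of day to a plain number. The following is only a sketch: the values in hora are invented, and the "HH:MM" layout is an assumption based on the answer above. Whether minutes since midnight is a suitable response variable depends on the design, because times just before and just after midnight end up far apart.

```r
# Invented times of day in "HH:MM" form (assumed layout of the Hora column)
hora <- c("08:15", "21:40", "00:05")

# Split each value into hours and minutes
parts <- do.call(rbind, strsplit(hora, ":", fixed = TRUE))

# Minutes since midnight as an integer vector
minutes <- as.integer(parts[, 1]) * 60L + as.integer(parts[, 2])
minutes
# 495 1300 5
```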
I have imported a netCDF file into R and created a dataset which has 58196 time stamps. I've then fitted an ARIMA model to it and forecasted. However, the time is stored as 'hours since 1900-01-01 00:00:00'. In my series the times are simply indexed from 1 up to 58196, but I would like to use ggplot to plot the forecast with dates on the x-axis.
Any ideas? Here is some code I have put in.
I have read in the required variable and extracted it at the pressure level I want, so that it is a single variable at 58196 times, at 6-hourly intervals up to the end of 2018. I have then done the following:
data <- data_array[13, ] # To get my univariate time series.
print(data)
[58176] -6.537371e-01 -4.765177e-01 -4.226107e-01 -4.303621e-01 -3.519134e-01
[58181] -2.706966e-01 -1.864843e-01 -9.974014e-02 2.970415e-02 6.640909e-02
[58186] -1.504763e-01 -3.968417e-01 -4.864971e-01 -5.934973e-01 -7.059880e-01
[58191] -7.812654e-01 -7.622807e-01 -8.968482e-01 -9.414597e-01 -1.003678e+00
[58196] -9.908477e-01
datafit <- auto.arima(data)
datamodel <- Arima(data, order = c(5, 0, 2))
datafcst <- forecast(datamodel, h=60, level=95)
plot(datafcst, xlim=c(58100, 58250))
I have attached the image it yields too. The idea is that I can use ggplot to plot this rather than the standard plot, with dates on the x-axis instead of the numerical values. However, ggplot also won't work for me, as it complains that the object is not a data frame.
Many thanks!
As you did not provide a minimal example, it is hard to help, but I will try. Assume your date variable is called "date".
dater = as.Date(strptime(date, "%Y-%m-%d"))
And from ?strptime:
format
A character string. The default for the format methods is "%Y-%m-%d %H:%M:%S" if any element has a time component which is not midnight, and "%Y-%m-%d" otherwise.
Hope that helps
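For the 'hours since 1900-01-01 00:00:00' encoding mentioned in the question, a conversion could be sketched as below. The names time_hours, value and plot_df are hypothetical, the numbers are invented, and UTC is an assumption about the file's time zone.

```r
# Hypothetical time axis as read from the file: hours since 1900-01-01 00:00:00
time_hours <- c(0, 6, 12, 24)
value <- c(-0.65, -0.48, -0.42, -0.43)  # invented values

# as.POSIXct counts seconds from the origin, so convert hours to seconds
time_stamp <- as.POSIXct(time_hours * 3600, origin = "1900-01-01", tz = "UTC")

# ggplot needs a data frame, so pair the timestamps with the series
plot_df <- data.frame(time = time_stamp, value = value)
format(plot_df$time, "%Y-%m-%d %H:%M")
```

A data frame like plot_df can then be passed to ggplot2, for example ggplot(plot_df, aes(time, value)) + geom_line().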
I have a time-series object dat as
dat <- structure(list(timestamp = c("2015-07-01T00:00:06+05:30", "2015-07-01T00:00:36+05:30",
"2015-07-01T00:01:06+05:30", "2015-07-01T00:01:36+05:30", "2015-07-01T00:02:06+05:30",
"2015-07-01T00:02:37+05:30"), value = c(110535.421875, 110516.6484375,
110398.25, 110381.5703125, 110392.15625, 110471.609375)), .Names = c("timestamp",
"value"), row.names = c(NA, 6L), class = "data.frame")
The timestamps carry a time zone offset of +05:30, which corresponds to "Asia/Kolkata". Whenever I try to convert the timestamp, I face the following issues:
With as.POSIXct(strptime(dat$timestamp,format ="%Y-%m-%dT%H:%M:%S%z")), it outputs empty strings.
If I remove the time zone information in strptime(), i.e. as.POSIXct(strptime(dat$timestamp,format ="%Y-%m-%dT%H:%M:%S")), the result is interpreted in the time zone of the system.
How should I always force the timezone to be the same as associated with the original data object?
Format %z can be used for input or output: it is a character string, conventionally plus or minus followed by two digits for hours and two for minutes. So you need to clean up the data first to change +05:30 into +0530.
strptime(gsub("05:30", "0530",dat$timestamp), format="%Y-%m-%dT%H:%M:%S%z")
If the data can contain a range of time zones, and assuming it is always in the same standard format, you can remove the last colon from dat$timestamp like this:
strptime(gsub("(.*)\\:(.*)", "\\1\\2", dat$timestamp), "%Y-%m-%dT%H:%M:%S%z")
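A small worked example of the colon-stripping approach, using two timestamps from the question. Parsing into UTC and then printing in "Asia/Kolkata" is an illustrative choice, and the example assumes that time zone name is available on your system.

```r
ts_raw <- c("2015-07-01T00:00:06+05:30", "2015-07-01T00:00:36+05:30")

# Drop the colon inside the trailing offset: "+05:30" becomes "+0530"
ts_clean <- sub(":([0-9]{2})$", "\\1", ts_raw)

# %z applies the offset, so the parsed instants are correct in UTC
parsed <- as.POSIXct(strptime(ts_clean, format = "%Y-%m-%dT%H:%M:%S%z", tz = "UTC"))

format(parsed, "%Y-%m-%d %H:%M:%S", tz = "UTC")
# "2015-06-30 18:30:06" "2015-06-30 18:30:36"
format(parsed, "%Y-%m-%d %H:%M:%S", tz = "Asia/Kolkata")
# "2015-07-01 00:00:06" "2015-07-01 00:00:36"
```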
lubridate is your friend. It was written because the basic R functionality for handling dates and times is complete, but tiresome and unforgiving. lubridate is (comparatively) easy, flexible, and very forgiving of minor differences in format, tokens, etc.
install.packages("lubridate")
require(lubridate)
# the "z!*" will correctly parse any of these:
# "+0530", "+5:30", "+05:30"
dat$parsedTime <- parse_date_time(dat$timestamp, orders="ymd hms z!*")
#
# now print that
#
dat
format(dat$parsedTime, tz="Asia/Kolkata")
Internally, POSIXct objects are represented numerically in Unix time (https://en.wikipedia.org/wiki/Unix_time), which has no time zone of its own, so you specify the time zone you want when you print. By default, it's usually displayed in the time zone of the R machine, but the default displayed time zone can be changed by setting the tzone attribute, like so:
attr(dat$parsedTime, "tzone") <- "Asia/Kolkata"
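The point that the time zone only affects display can be checked with a short base R sketch. The timestamp is the first one from the question expressed in UTC; the variable names are just for illustration.

```r
x <- as.POSIXct("2015-06-30 18:30:06", tz = "UTC")

# Copy the value and change only the display time zone
y <- x
attr(y, "tzone") <- "Asia/Kolkata"

# The underlying number of seconds since the epoch is unchanged
as.numeric(x) == as.numeric(y)

# Only the printed clock time differs
format(x, "%Y-%m-%d %H:%M:%S")
format(y, "%Y-%m-%d %H:%M:%S")
```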
Cheers
Jason
I have a data frame containing what should be a datetime column that has been read into R. The time values are appearing as numeric time as seen in the below data example. I would like to convert these into datetime POSIXct or POSIXlt format, so that date and time can be viewed.
tdat <- c(974424L, 974430L, 974436L, 974442L, 974448L, 974454L, 974460L, 974466L, 974472L,
974478L, 974484L, 974490L, 974496L, 974502L, 974508L, 974514L, 974520L, 974526L,
974532L,974538L)
974424 should equate to 00:00:00 01/03/2011, but I do not know the origin time of the numeric values (i.e. 1970-01-01 used below does not work). I have tried using commands such as the below to achieve this and have spent time trying to get as.POSIXct to work, but I haven't found a solution (i.e. I either end up with a POSIXct object of NAs or end up with obscure datetime values).
Attempts to convert numeric time to datetime:
datetime <- as.POSIXct(strptime(time, format = "%d/%m/%Y %H:%M:%S"))
datetime <- as.POSIXct(as.numeric(time), origin='1970-01-01')
I am sure that this is a simple thing to do. Any help would be gratefully received. Thanks!
Try one of these depending on which time zone you want:
t.gmt <- as.POSIXct(3600 * (tdat - 974424), origin = '2011-03-01', tz = "GMT")
t.local <- as.POSIXct(format(t.gmt))
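The arithmetic also suggests where the numbers come from: 974424 / 24 = 40601 days, which is exactly the number of days from 1900-01-01 to 2011-03-01, and consecutive values differ by 6. So the values appear to be hours since 1900-01-01 (an inference from the numbers, not something stated in the question). Under that assumption the conversion can be written without subtracting the first value:

```r
# First three values from the question
tdat <- c(974424L, 974430L, 974436L)

# Treat the values as hours since 1900-01-01 and convert to seconds
t.gmt2 <- as.POSIXct(tdat * 3600, origin = "1900-01-01", tz = "GMT")

format(t.gmt2, "%d/%m/%Y %H:%M:%S")
# "01/03/2011 00:00:00" "01/03/2011 06:00:00" "01/03/2011 12:00:00"
```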