endpoints are always in UTC/GMT - r

I think the endpoints are generated by converting the index to UTC. I want the endpoints at the change of every hour (9:00, 10:00, etc.) or at intervals I define myself (say 9:30, 10:30, etc.).
In the example below, object 'a' is in UTC and its endpoints fall at 4:55, 5:55, etc.; for object 'b' they fall at 10:25, 11:25, etc. I am looking for a solution that gives endpoints in the specified way, irrespective of time zone.
Is there a simple mechanism to do so?
> head(a)
Open High Low Close
2008-01-01 04:30:00 6114.05 6126.65 6111.35 6111.35
2008-01-01 04:35:00 6110.50 6130.65 6110.50 6128.90
2008-01-01 04:40:00 6128.70 6130.15 6123.15 6123.55
2008-01-01 04:45:00 6124.85 6131.90 6123.45 6131.55
2008-01-01 04:50:00 6132.20 6134.45 6128.70 6131.20
2008-01-01 04:55:00 6132.25 6134.85 6132.25 6134.45
> indexTZ(a)
TZ
"UTC"
> a[endpoints(a,on="hours")]
Open High Low Close
2008-01-01 04:55:00 6132.25 6134.85 6132.25 6134.45
2008-01-01 05:55:00 6136.70 6136.70 6132.15 6134.45
2008-01-01 06:55:00 6157.65 6157.65 6153.20 6154.25
2008-01-01 07:55:00 6155.65 6157.60 6155.00 6157.25
2008-01-01 08:55:00 6143.25 6143.90 6137.50 6138.05
2008-01-01 09:55:00 6150.95 6151.65 6147.50 6149.20
2008-01-02 04:55:00 6113.40 6120.90 6089.00 6089.00
2008-01-02 05:55:00 6086.15 6087.25 6068.80 6068.95
2008-01-02 06:55:00 6098.10 6108.25 6098.10 6105.85
2008-01-02 07:05:00 6107.40 6107.40 6093.70 6094.80
> head(b)
Open High Low Close
2008-01-01 10:00:00 6114.05 6126.65 6111.35 6111.35
2008-01-01 10:05:00 6110.50 6130.65 6110.50 6128.90
2008-01-01 10:10:00 6128.70 6130.15 6123.15 6123.55
2008-01-01 10:15:00 6124.85 6131.90 6123.45 6131.55
2008-01-01 10:20:00 6132.20 6134.45 6128.70 6131.20
2008-01-01 10:25:00 6132.25 6134.85 6132.25 6134.45
> indexTZ(b)
TZ
"Asia/Kolkata"
> b[endpoints(b,on="hours")]
Open High Low Close
2008-01-01 10:25:00 6132.25 6134.85 6132.25 6134.45
2008-01-01 11:25:00 6136.70 6136.70 6132.15 6134.45
2008-01-01 12:25:00 6157.65 6157.65 6153.20 6154.25
2008-01-01 13:25:00 6155.65 6157.60 6155.00 6157.25
2008-01-01 14:25:00 6143.25 6143.90 6137.50 6138.05
2008-01-01 15:25:00 6150.95 6151.65 6147.50 6149.20
2008-01-02 10:25:00 6113.40 6120.90 6089.00 6089.00
2008-01-02 11:25:00 6086.15 6087.25 6068.80 6068.95
2008-01-02 12:25:00 6098.10 6108.25 6098.10 6105.85
2008-01-02 12:35:00 6107.40 6107.40 6093.70 6094.80
>
Thanks & Regds
Siva Sunku

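endpoints always calculates offsets in UTC. There's nothing you can do to change that as an end user, but you can work around it by comparing the 1-hour endpoints with the 30-minute endpoints. Asia/Kolkata is UTC+05:30, so the 30-minute endpoints that are not also UTC hourly endpoints are the ones that fall on local hour boundaries.
library(xts)
x <- .xts(1:12, seq(0, by=600, length.out=12), tzone="Asia/Kolkata")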
x[endpoints(x, "hours")]
# [,1]
# 1970-01-01 06:20:00 6
# 1970-01-01 07:20:00 12
hourEndpointsTZ30 <- function(x) {
h <- endpoints(x, "hours", 1)
m <- endpoints(x, "minutes", 30)
c(0, setdiff(m, h), last(m))
}
x[hourEndpointsTZ30(x)]
# [,1]
# 1970-01-01 05:50:00 3
# 1970-01-01 06:50:00 9
# 1970-01-01 07:20:00 12
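The workaround above relies on the half-hour offset of Asia/Kolkata. As a more general sketch, assuming you are happy to compute the positions yourself, you can label each timestamp with its local hour and take the last position of each label. The helper name `local_hour_endpoints` and its `shift_secs` argument are hypothetical, not part of xts; only base R is used.

```r
# Hypothetical helper (not part of xts): positions of the last observation
# in each local clock hour, in the same 0-prefixed form endpoints() returns.
# shift_secs moves the boundaries, e.g. 1800 groups 9:30-10:29, 10:30-11:29, ...
local_hour_endpoints <- function(times, tz, shift_secs = 0) {
  # Label every timestamp with its (shifted) local date and hour
  key <- format(times - shift_secs, "%Y-%m-%d %H", tz = tz)
  n <- length(key)
  # Position i closes a group when the label at i + 1 differs from the label at i
  last_in_hour <- which(key[-1] != key[-n])
  c(0L, last_in_hour, n)
}

# Same 10-minute grid as the example above, held as plain POSIXct
times <- as.POSIXct(seq(0, by = 600, length.out = 12),
                    origin = "1970-01-01", tz = "UTC")
ep <- local_hour_endpoints(times, "Asia/Kolkata")
format(times[ep], "%H:%M", tz = "Asia/Kolkata")
```

For an xts object, the same positions could be obtained by passing `index(x)` as `times` and then subsetting with `x[ep]`, as in the answer above.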

Related

Update year only in column timestamp date field SQLITE

I want to update only the year to 2025, without changing the month, day, and time.
What I have:
2027-01-01 09:30:00
2012-03-06 12:00:00
2014-01-01 17:24:00
2020-07-03 04:30:00
2020-01-01 05:50:00
2021-09-03 06:30:00
2013-01-01 23:30:00
2026-01-01 08:30:00
2028-01-01 09:30:00
What I require is below:
2025-01-01 09:30:00
2025-03-06 12:00:00
2025-01-01 17:24:00
2025-07-03 04:30:00
2025-01-01 05:50:00
2025-09-03 06:30:00
2025-01-01 23:30:00
2025-01-01 08:30:00
2025-01-01 09:30:00
I am using DB Browser for SQLite.
What I have tried, but it didn't work:
update t set
d = datetime(strftime('%Y', datetime(2059)) || strftime('-%m-%d', d));
You may update via a substring operation:
UPDATE yourTable
SET ts = '2025-' || SUBSTR(ts, 6, 14);
Note that SQLite does not actually have a timestamp/datetime type. Instead, these values would be stored as text, and hence we can do a substring operation on them.
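To see what the substring operation does to each value, here is the same idea sketched in R, purely as an illustration outside the database. The vector `ts` is sample data copied from the question; nothing here touches SQLite.

```r
# Sample values from the question, stored as text just as SQLite stores them
ts <- c("2027-01-01 09:30:00", "2012-03-06 12:00:00", "2020-07-03 04:30:00")

# Keep everything from character 6 onward ("MM-DD HH:MM:SS") and prepend the new year
updated <- paste0("2025-", substr(ts, 6, 19))
updated
```

Because this is plain text manipulation, a value such as 2020-02-29 would become 2025-02-29, which is not a real date, so leap days need separate handling.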

Subtracting two columns in R; one of them consisting of date and the other one of time and date

I want to subtract timearriving from timeA and timeleaving from timeL, but I get this error:
"Error in unclass(e1) - e2 : non-numeric argument to binary operator"
When you see that error message, it means that you're trying to perform a binary operation with something that isn't a number. I understand the error, but I wanted to ask whether there is a way I can perform these calculations.
I provided a sample image of my dataset
number id location timearriving timeleaving timeA timeL person late
1 214980 900264 1001.18 NULL NULL 2016-09-15 10:00:00 2016-09-15 12:00:00 Teacher
2 215708 900264 1001.18 07:55:06 09:59:58 2016-09-22 10:00:00 2016-09-22 12:00:00 Teacher
3 216388 900264 1001.18 08:00:22 09:54:06 2016-09-29 10:00:00 2016-09-29 12:00:00 Teacher
4 217106 900264 1001.18 08:40:15 09:53:07 2016-10-05 10:00:00 2016-10-05 12:00:00 Teacher
5 217250 900264 1001.18 08:03:47 09:52:59 2016-10-06 10:00:00 2016-10-06 12:00:00 Teacher
6 217808 900264 1001.18 NULL NULL 2016-10-12 10:00:00 2016-10-12 12:00:00 Teacher
7 217952 900264 1001.18 08:01:44 09:51:45 2016-10-13 10:00:00 2016-10-13 12:00:00 Teacher
8 218640 900264 1001.18 08:04:04 09:57:24 2016-10-19 10:00:00 2016-10-19 12:00:00 Teacher
9 218788 900264 1001.18 07:59:52 09:50:17 2016-10-20 10:00:00 2016-10-20 12:00:00 Teacher
10 219397 900264 1001.18 08:01:06 09:51:05 2016-10-26 10:00:00 2016-10-26 12:00:00 Teacher
11 219541 900264 1001.18 08:05:29 09:56:04 2016-10-27 10:00:00 2016-10-27 12:00:00 Teacher
12 220273 900264 1001.18 08:09:20 09:57:46 2016-11-02 09:00:00 2016-11-02 11:00:00 Teacher
13 220419 900264 1001.18 08:09:05 09:59:53 2016-11-03 09:00:00 2016-11-03 11:00:00 Teacher
Here I added a new column with the name "late".
I want to compute timeA - timearriving.
I did this using this code:
dataset["late"] <- NA
dataset$late <- dataset$timeA - dataset$timearriving
then the error was:
Error in unclass(e1) - e2 : non-numeric argument to binary operator
Now I tried to convert them like you said:
timeA <- ymd_hms(timeA )
timearriving <- hms(timearriving )
Warning message:
In .parse_hms(..., order = "HMS", quiet = quiet) :
Some strings failed to parse
Since you don't provide a reproducible example I will illustrate using one value for each variable e.g.:
library(lubridate)
timeleaving <- hms("09:59:33")
timeA <- ymd_hms("2017-02-16 10:00:00")
You could use:
timeleaving <- ymd_hms(paste(floor_date(timeA, "days"), timeleaving))
dif <- timeA - timeleaving
dif
# Time difference of 27 secs
Edited since the data was added to the original question:
data$timeleaving <- hms(data$timeleaving)
data$timearriving <- hms(data$timearriving)
data$timeA <- ymd_hms(data$timeA )
data$timeL <- ymd_hms(data$timeL )
data$timeleaving <- ymd_hms(paste(floor_date(data$timeL, "days"), data$timeleaving))
data$timearriving <- ymd_hms(paste(floor_date(data$timeA, "days"), data$timearriving))
data$late <- data$timeA - data$timearriving
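If you would rather avoid the intermediate period objects, a base R sketch of the same idea is to paste the date part of the scheduled time onto the clock time and parse the result in one step. The data frame `visits` below is a made-up three-row sample shaped like the question's data, and it assumes the "NULL" entries are stored as the literal string "NULL".

```r
# Made-up sample with the same column layout as the question
visits <- data.frame(
  timearriving = c("07:55:06", "NULL", "08:00:22"),
  timeA = c("2016-09-22 10:00:00", "2016-10-12 10:00:00", "2016-09-29 10:00:00"),
  stringsAsFactors = FALSE
)

# Treat the literal string "NULL" as missing
visits$timearriving[visits$timearriving == "NULL"] <- NA

# Scheduled time as POSIXct
scheduled <- as.POSIXct(visits$timeA, tz = "UTC")

# Arrival time: date of the scheduled time plus the clock time; missing rows parse to NA
arrived <- as.POSIXct(
  paste(format(scheduled, "%Y-%m-%d"), visits$timearriving),
  format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
)

# Scheduled minus arrival, in minutes (positive means arrival before the scheduled time)
visits$late <- difftime(scheduled, arrived, units = "mins")
visits
```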

R: count 15 minutes interval in time

I would like to count the number of sessions started in each 15-minute interval on business days within a large dataset.
My data looks like:
df <-
Start_datetime End_datetime Duration Volume
2016-04-01 06:20:55 2016-04-01 14:41:22 08:20:27 8.360
2016-04-01 08:22:27 2016-04-01 08:22:40 00:00:13 0.000
2016-04-01 08:38:53 2016-04-01 09:31:58 00:53:05 12.570
2016-04-01 09:33:57 2016-04-01 12:37:43 03:03:46 7.320
2016-04-01 10:05:03 2016-04-01 16:41:16 06:36:13 9.520
2016-04-01 12:07:57 2016-04-02 22:22:32 34:14:35 7.230
2016-04-01 16:56:55 2016-04-02 10:40:17 17:43:22 5.300
2016-04-01 17:29:18 2016-04-01 19:50:29 02:21:11 7.020
2016-04-01 17:42:39 2016-04-01 19:45:38 02:02:59 2.430
2016-04-01 17:47:57 2016-04-01 20:26:35 02:38:38 8.090
2016-04-01 22:00:15 2016-04-04 08:22:21 58:22:06 4.710
2016-04-02 01:12:38 2016-04-02 09:49:00 08:36:22 3.150
2016-04-02 01:32:00 2016-04-02 12:49:47 11:17:47 5.760
2016-04-02 07:28:48 2016-04-04 06:58:56 47:30:08 0.000
2016-04-02 07:55:18 2016-04-05 07:55:15 71:59:57 0.240
I would like to count all sessions starting in each 15-minute interval, like this:
For business days
Time PTU Count
00:00:00 - 00:15:00 1 10 # (where Count is the number of sessions started between 00:00:00 and 00:15:00)
00:15:00 - 00:30:00 2 6
00:30:00 - 00:45:00 3 5
00:45:00 - 01:00:00 4 3
And so on, with the same table for the weekend.
I have tried the cut function:
df$PTU <- table (cut(df$Start_datetime, breaks="15 minutes"))
data.frame(PTU)
EDIT: When I run this I receive the following error:
Error in cut.default(df$Start_datetime, breaks = "15 minutes") : 'x' must be numeric
I have also tried some functions from lubridate, but I can't seem to make it work. My final goal is to create a table like the following, but with 15-minute intervals.
Here is a sort of complete process from datetime "strings" to the format you want. The start is a string vector:
Start_time <-
c("2016-04-01 06:20:55", "2016-04-01 08:22:27", "2016-04-01 08:38:53",
"2016-04-01 09:33:57", "2016-04-01 10:05:03", "2016-04-01 12:07:57",
"2016-04-01 16:56:55", "2016-04-01 17:29:18", "2016-04-01 17:42:39",
"2016-04-01 17:47:57", "2016-04-01 22:00:15", "2016-04-02 01:12:38",
"2016-04-02 01:32:00", "2016-04-02 07:28:48", "2016-04-02 07:55:18"
)
df <- data.frame(Start_time)
And this is the actual processing:
## We will use two packages
library(lubridate)
library(data.table)
# convert df to data.table, parse the datetime string
setDT(df)[, Start_time := ymd_hms(Start_time)]
# floor time by 15 min to assign the appropriate slot (new variable Start_time_slot)
df[, Start_time_slot := floor_date(Start_time, "15 min")]
# aggregate by weekday and time of day, naming the grouping columns so they match the output below
start_time_data_frame <- df[, .N, by = .(wday = wday(Start_time_slot), time = format(Start_time_slot, format = "%H:%M:%S"))]
# output looks like this
start_time_data_frame
## wday time N
## 1: 6 06:15:00 1
## 2: 6 08:15:00 1
## 3: 6 08:30:00 1
## 4: 6 09:30:00 1
## 5: 6 10:00:00 1
## 6: 6 12:00:00 1
## 7: 6 16:45:00 1
## 8: 6 17:15:00 1
## 9: 6 17:30:00 1
## 10: 6 17:45:00 1
## 11: 6 22:00:00 1
## 12: 7 01:00:00 1
## 13: 7 01:30:00 1
## 14: 7 07:15:00 1
## 15: 7 07:45:00 1
There are two things you have to keep in mind when using cut on datetimes:
Make sure your data is actually of class POSIXt. I'm quite sure yours isn't, or R would be dispatching to cut.POSIXt rather than cut.default.
"15 minutes" should be "15 min". See ?cut.POSIXt
So this works:
Start_datetime <- as.POSIXct(
c("2016-04-01 06:20:55",
"2016-04-01 06:22:12",
"2016-04-01 05:30:12")
)
table(cut(Start_datetime, breaks = "15 min"))
# 2016-04-01 05:30:00 2016-04-01 05:45:00 2016-04-01 06:00:00 2016-04-01 06:15:00
# 1 0 0 2
Note that the output gives you the start of the 15 minute interval as names of the table.
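Neither answer quite produces the PTU table the question asks for, so here is a base R sketch of one way to get there, assuming the start times are already POSIXct. It numbers the 96 quarter-hour slots of a day from 1 and splits the counts into business days and weekends; the fourth timestamp is a made-up Monday row added so that one slot has more than one session, and public holidays are not handled.

```r
# Sample start times (three rows from the question plus one made-up Monday row)
start <- as.POSIXct(
  c("2016-04-01 06:20:55", "2016-04-01 08:22:27",
    "2016-04-02 01:12:38", "2016-04-04 06:25:00"),
  tz = "UTC"
)

# Seconds elapsed since midnight for each start time
secs <- as.numeric(format(start, "%H")) * 3600 +
  as.numeric(format(start, "%M")) * 60 +
  as.numeric(format(start, "%S"))

# Quarter-hour slot of the day: 1 for 00:00-00:15, ..., 96 for 23:45-24:00
ptu <- secs %/% 900 + 1

# "%u" is the ISO weekday number, 1 = Monday ... 7 = Sunday
day_type <- ifelse(format(start, "%u") %in% c("6", "7"), "weekend", "business")

# Cross-tabulate, keeping empty slots so every PTU appears
counts <- table(day_type, PTU = factor(ptu, levels = 1:96))
counts["business", "26"]  # sessions started 06:15-06:30 on business days
```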

R time series missing values

I was working with a time series dataset having hourly data. The data contained a few missing values so I tried to create a dataframe (time_seq) with the correct time value and do a merge with the original data so the missing values become 'NA'.
> data
date value
7980 2015-03-30 20:00:00 78389
7981 2015-03-30 21:00:00 72622
7982 2015-03-30 22:00:00 65240
7983 2015-03-30 23:00:00 47795
7984 2015-03-31 08:00:00 37455
7985 2015-03-31 09:00:00 70695
7986 2015-03-31 10:00:00 68444
//converting the date in the data to POSIXct format.
> data$date <- format.POSIXct(data$date,'%Y-%m-%d %H:%M:%S')
// creating a dataframe with the correct sequence of dates.
> time_seq <- seq(from = as.POSIXct("2014-05-01 00:00:00"),
to = as.POSIXct("2015-04-30 23:00:00"), by = "hour")
> df <- data.frame(date=time_seq)
> df
date
8013 2015-03-30 20:00:00
8014 2015-03-30 21:00:00
8015 2015-03-30 22:00:00
8016 2015-03-30 23:00:00
8017 2015-03-31 00:00:00
8018 2015-03-31 01:00:00
8019 2015-03-31 02:00:00
8020 2015-03-31 03:00:00
8021 2015-03-31 04:00:00
8022 2015-03-31 05:00:00
8023 2015-03-31 06:00:00
8024 2015-03-31 07:00:00
// merging with the original data
> a <- merge(data,df, x.by = data$date, y.by = df$date ,all=TRUE)
> a
date value
4005 2014-07-23 07:00:00 37003
4006 2014-07-23 07:30:00 NA
4007 2014-07-23 08:00:00 37216
4008 2014-07-23 08:30:00 NA
The values I get after merging are incorrect, and they contain half-hourly values. What would be the correct approach for solving this?
Why is the merge result in 30-minute intervals when both my dataframes are hourly?
PS: I looked into this question: Fastest way for filling-in missing dates for data.table, and followed the steps, but it didn't help.
You can use the padr package to solve this problem.
library(padr)
library(dplyr) #for the pipe operator
data %>%
pad() %>%
fill_by_value()
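If you prefer to stay in base R, a minimal sketch of the merge approach from the question looks like this. It assumes both date columns are POSIXct in the same time zone (not formatted strings) and joins on the column name; the three-row `obs` data frame is a made-up sample.

```r
# Made-up hourly sample with a two-hour gap
obs <- data.frame(
  date = as.POSIXct(c("2015-03-30 22:00:00", "2015-03-30 23:00:00",
                      "2015-03-31 02:00:00"), tz = "UTC"),
  value = c(65240, 47795, 37455)
)

# Complete hourly grid covering the observed range
full <- data.frame(date = seq(min(obs$date), max(obs$date), by = "hour"))

# Left join on the shared column name; hours with no observation get NA
filled <- merge(full, obs, by = "date", all.x = TRUE)
filled
```

Note that `merge` takes `by`, `by.x` and `by.y` as column names, so the key is passed as `"date"` rather than as the column vectors themselves.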

subset by vector in r

I am trying to subset an xts object of OHLC hourly data with a vector.
If i create the vector myself with the following command
lookup = c("2012-01-12", "2012-01-31", "2012-03-05", "2012-03-19")
testdfx[lookup]
I get the correct data displayed, which shows all the hours that match the dates in the vector (00:00 to 23:00).
> head(testdfx[lookup])
open high low close
2012-01-12 00:00:00 1.27081 1.27217 1.27063 1.27211
2012-01-12 01:00:00 1.27212 1.27216 1.27089 1.27119
2012-01-12 02:00:00 1.27118 1.27166 1.27017 1.27133
2012-01-12 03:00:00 1.27134 1.27272 1.27133 1.27261
2012-01-12 04:00:00 1.27260 1.27262 1.27141 1.27183
2012-01-12 05:00:00 1.27183 1.27230 1.27145 1.27165
> tail(testdfx[lookup])
open high low close
2012-03-19 18:00:00 1.32451 1.32554 1.32386 1.32414
2012-03-19 19:00:00 1.32417 1.32465 1.32331 1.32372
2012-03-19 20:00:00 1.32373 1.32415 1.32340 1.32372
2012-03-19 21:00:00 1.32373 1.32461 1.32366 1.32376
2012-03-19 22:00:00 1.32377 1.32424 1.32359 1.32366
2012-03-19 23:00:00 1.32364 1.32406 1.32333 1.32336
However, when I extract dates from an object and create a vector to use for subsetting, I only get the hours 00:00-19:00 displayed in my subset.
> head(testdfx[dates])
open high low close
2007-01-05 00:00:00 1.3092 1.3093 1.3085 1.3088
2007-01-05 01:00:00 1.3087 1.3092 1.3075 1.3078
2007-01-05 02:00:00 1.3079 1.3091 1.3078 1.3084
2007-01-05 03:00:00 1.3083 1.3084 1.3073 1.3074
2007-01-05 04:00:00 1.3073 1.3080 1.3061 1.3071
2007-01-05 05:00:00 1.3070 1.3072 1.3064 1.3069
> tail(euro[nfp.releases])
open high low close
2014-01-10 14:00:00 1.35892 1.36625 1.35728 1.36366
2014-01-10 15:00:00 1.36365 1.36784 1.36241 1.36743
2014-01-10 16:00:00 1.36742 1.36866 1.36693 1.36719
2014-01-10 17:00:00 1.36720 1.36752 1.36579 1.36617
2014-01-10 18:00:00 1.36617 1.36663 1.36559 1.36624
2014-01-10 19:00:00 1.36630 1.36717 1.36585 1.36702
I have compared both objects containing the required dates and they appear to be the same.
> class(lookup)
[1] "character"
> class(nfp.releases)
[1] "character"
> str(lookup)
chr [1:4] "2012-01-12" "2012-01-31" "2012-03-05" "2012-03-19"
> str(nfp.releases)
chr [1:86] "2014-02-07" "2014-01-10" "2013-12-06" "2013-11-08" ..
I am new to R but have tried everything over the past 3 days to get this to work. If I can't do it this way I will end up having to create the variable by hand, but as it has 86 dates this may take some time.
Thanks in advance.
I cannot reproduce your problem:
library(xts)
lookup = c("2012-01-12", "2012-01-31", "2012-03-05", "2012-03-19")
time_index <- seq(from = as.POSIXct("2012-01-01 07:00"), to = as.POSIXct("2012-05-17 18:00"), by = "hour")
set.seed(1)
value <- matrix(rnorm(n = 4*length(time_index)),length(time_index),4)
testdfx <- xts(value, order.by = time_index)
testdfx[lookup[1]]
testdfx["2012-01-12"]
Thanks for the responses, guys. I actually thought I had deleted this thread, but obviously not.
The problem in the case above was to be found around 3' from the computer. When looking through the data I was only interested in Fridays, which is also when the FX market closes down for the weekend, which is why those days only have data up to 19:00.
Sorry to have wasted your time; Admin, please remove.
