endpoints are always in UTC/GMT

I think the endpoints are generated by converting the index to UTC. I want the endpoints at the change of every hour (9:00, 10:00, etc.) or at intervals I define myself (say 9:30, 10:30, etc.).
In the example below, object 'a' is in UTC and its endpoints fall at 4:55, 5:55, etc.; for object 'b' they fall at 10:25, 11:25, etc. I am looking for a solution that gives endpoints in the specified way, irrespective of time zone.
Is there a simple mechanism to do so?
> head(a)
Open High Low Close
2008-01-01 04:30:00 6114.05 6126.65 6111.35 6111.35
2008-01-01 04:35:00 6110.50 6130.65 6110.50 6128.90
2008-01-01 04:40:00 6128.70 6130.15 6123.15 6123.55
2008-01-01 04:45:00 6124.85 6131.90 6123.45 6131.55
2008-01-01 04:50:00 6132.20 6134.45 6128.70 6131.20
2008-01-01 04:55:00 6132.25 6134.85 6132.25 6134.45
> indexTZ(a)
TZ
"UTC"
> a[endpoints(a,on="hours")]
Open High Low Close
2008-01-01 04:55:00 6132.25 6134.85 6132.25 6134.45
2008-01-01 05:55:00 6136.70 6136.70 6132.15 6134.45
2008-01-01 06:55:00 6157.65 6157.65 6153.20 6154.25
2008-01-01 07:55:00 6155.65 6157.60 6155.00 6157.25
2008-01-01 08:55:00 6143.25 6143.90 6137.50 6138.05
2008-01-01 09:55:00 6150.95 6151.65 6147.50 6149.20
2008-01-02 04:55:00 6113.40 6120.90 6089.00 6089.00
2008-01-02 05:55:00 6086.15 6087.25 6068.80 6068.95
2008-01-02 06:55:00 6098.10 6108.25 6098.10 6105.85
2008-01-02 07:05:00 6107.40 6107.40 6093.70 6094.80
> head(b)
Open High Low Close
2008-01-01 10:00:00 6114.05 6126.65 6111.35 6111.35
2008-01-01 10:05:00 6110.50 6130.65 6110.50 6128.90
2008-01-01 10:10:00 6128.70 6130.15 6123.15 6123.55
2008-01-01 10:15:00 6124.85 6131.90 6123.45 6131.55
2008-01-01 10:20:00 6132.20 6134.45 6128.70 6131.20
2008-01-01 10:25:00 6132.25 6134.85 6132.25 6134.45
> indexTZ(b)
TZ
"Asia/Kolkata"
> b[endpoints(b,on="hours")]
Open High Low Close
2008-01-01 10:25:00 6132.25 6134.85 6132.25 6134.45
2008-01-01 11:25:00 6136.70 6136.70 6132.15 6134.45
2008-01-01 12:25:00 6157.65 6157.65 6153.20 6154.25
2008-01-01 13:25:00 6155.65 6157.60 6155.00 6157.25
2008-01-01 14:25:00 6143.25 6143.90 6137.50 6138.05
2008-01-01 15:25:00 6150.95 6151.65 6147.50 6149.20
2008-01-02 10:25:00 6113.40 6120.90 6089.00 6089.00
2008-01-02 11:25:00 6086.15 6087.25 6068.80 6068.95
2008-01-02 12:25:00 6098.10 6108.25 6098.10 6105.85
2008-01-02 12:35:00 6107.40 6107.40 6093.70 6094.80
>
Thanks & Regards
Siva Sunku

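endpoints always calculates offsets in UTC, and there's nothing you can do to change that as an end user. But you can work around it for a time zone with a half-hour offset, such as Asia/Kolkata, by comparing the 1-hour endpoints with the 30-minute endpoints: the 30-minute endpoints that are not also 1-hour endpoints fall at the end of each local clock hour.
library(xts)
x <- .xts(1:12, seq(0, by=600, length.out=12), tzone="Asia/Kolkata")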
x[endpoints(x, "hours")]
# [,1]
# 1970-01-01 06:20:00 6
# 1970-01-01 07:20:00 12
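hourEndpointsTZ30 <- function(x) {
  h <- endpoints(x, "hours", 1)     # hourly endpoints, computed in UTC
  m <- endpoints(x, "minutes", 30)  # half-hourly endpoints
  # keep the half-hour endpoints that are not UTC-hour endpoints, plus the last row
  c(0, setdiff(m, h), last(m))
}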
x[hourEndpointsTZ30(x)]
# [,1]
# 1970-01-01 05:50:00 3
# 1970-01-01 06:50:00 9
# 1970-01-01 07:20:00 12
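The same idea can be expressed without relying on endpoints' UTC arithmetic, by bucketing on the local clock time directly. The following is a minimal base-R sketch, not part of xts: local_hour_endpoints and its offset_min argument are hypothetical names introduced here for illustration. It returns positions in the same style as endpoints (a leading 0, then the last position of each bucket).

```r
# Hypothetical helper (not an xts function): positions of the last
# observation in each local clock hour, in the style of endpoints().
local_hour_endpoints <- function(times, tz, offset_min = 0) {
  # Shift by the offset so that, e.g., 9:30-10:29 lands in one bucket.
  shifted <- times - offset_min * 60
  # Label every observation with its local date and hour.
  bucket <- format(shifted, "%Y-%m-%d %H", tz = tz)
  # A position is an endpoint when the next observation has a different label.
  c(0L, which(bucket[-1] != bucket[-length(bucket)]), length(bucket))
}

# Same timestamps as the example above: 12 observations, 10 minutes apart.
times <- as.POSIXct(seq(0, by = 600, length.out = 12),
                    origin = "1970-01-01", tz = "UTC")
local_hour_endpoints(times, "Asia/Kolkata")
```

Assuming x is an xts object with a POSIXct index, the result could be used as x[local_hour_endpoints(index(x), tzone(x))]. Passing offset_min = 30 would move the boundaries to 9:30, 10:30, etc.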

Related

Update year only in column timestamp date field SQLITE

I want to update only the year to 2025, without changing the month, day and time.
What I have:
2027-01-01 09:30:00
2012-03-06 12:00:00
2014-01-01 17:24:00
2020-07-03 04:30:00
2020-01-01 05:50:00
2021-09-03 06:30:00
2013-01-01 23:30:00
2026-01-01 08:30:00
2028-01-01 09:30:00
What I require is below:
2025-01-01 09:30:00
2025-03-06 12:30:00
2025-01-01 17:24:00
2025-07-03 04:30:00
2025-01-01 05:50:00
2025-09-03 06:30:00
2025-01-01 23:30:00
2025-01-01 08:30:00
2025-01-01 09:30:00
I am using DB Browser for SQLite.
This is what I have tried, but it didn't work:
update t set
d = datetime(strftime('%Y', datetime(2059)) || strftime('-%m-%d', d));

You may update via a substring operation:
UPDATE yourTable
SET ts = '2025-' || SUBSTR(ts, 6, 14);
Note that SQLite does not actually have a timestamp/datetime type. Instead, these values would be stored as text, and hence we can do a substring operation on them.

Subtracting two columns in R; one of them consisting of date and the other one of time and date

I want to subtract timearriving from timeA and timeleaving from timeL, but I get this error:
"Error in unclass(e1) - e2 : non-numeric argument to binary operator"
That error message means that I'm trying to perform a binary operation on something that isn't a number. I understand the error, but I wanted to ask whether there is a way I can perform these calculations.
I have provided a sample of my dataset:
number id location timearriving timeleaving timeA timeL person late
1 214980 900264 1001.18 NULL NULL 2016-09-15 10:00:00 2016-09-15 12:00:00 Teacher
2 215708 900264 1001.18 07:55:06 09:59:58 2016-09-22 10:00:00 2016-09-22 12:00:00 Teacher
3 216388 900264 1001.18 08:00:22 09:54:06 2016-09-29 10:00:00 2016-09-29 12:00:00 Teacher
4 217106 900264 1001.18 08:40:15 09:53:07 2016-10-05 10:00:00 2016-10-05 12:00:00 Teacher
5 217250 900264 1001.18 08:03:47 09:52:59 2016-10-06 10:00:00 2016-10-06 12:00:00 Teacher
6 217808 900264 1001.18 NULL NULL 2016-10-12 10:00:00 2016-10-12 12:00:00 Teacher
7 217952 900264 1001.18 08:01:44 09:51:45 2016-10-13 10:00:00 2016-10-13 12:00:00 Teacher
8 218640 900264 1001.18 08:04:04 09:57:24 2016-10-19 10:00:00 2016-10-19 12:00:00 Teacher
9 218788 900264 1001.18 07:59:52 09:50:17 2016-10-20 10:00:00 2016-10-20 12:00:00 Teacher
10 219397 900264 1001.18 08:01:06 09:51:05 2016-10-26 10:00:00 2016-10-26 12:00:00 Teacher
11 219541 900264 1001.18 08:05:29 09:56:04 2016-10-27 10:00:00 2016-10-27 12:00:00 Teacher
12 220273 900264 1001.18 08:09:20 09:57:46 2016-11-02 09:00:00 2016-11-02 11:00:00 Teacher
13 220419 900264 1001.18 08:09:05 09:59:53 2016-11-03 09:00:00 2016-11-03 11:00:00 Teacher
Here I added a new column named "late".
I want to compute timeA - timearriving.
I did this using the following code:
dataset["late"] <- NA
dataset$late <- dataset$timeA - dataset$timearriving
then the error was:
Error in unclass(e1) - e2 : non-numeric argument to binary operator
Now I tried to convert them like you said:
timeA <- ymd_hms(timeA )
timearriving <- hms(timearriving )
Warning message:
In .parse_hms(..., order = "HMS", quiet = quiet) :
Some strings failed to parse

Since you don't provide a reproducible example, I will illustrate using one value for each variable, e.g.:
library(lubridate)
timeleaving <- hms("09:59:33")
timeA <- ymd_hms("2017-02-16 10:00:00")
You could use:
timeleaving <- ymd_hms(paste(floor_date(timeA, "days"), timeleaving))
dif <- timeA - timeleaving
dif
# Time difference of 27 secs
Edited since the data was added to the original question:
data$timeleaving <- hms(data$timeleaving)
data$timearriving <- hms(data$timearriving)
data$timeA <- ymd_hms(data$timeA )
data$timeL <- ymd_hms(data$timeL )
data$timeleaving <- ymd_hms(paste(floor_date(data$timeL, "days"), data$timeleaving))
data$timearriving <- ymd_hms(paste(floor_date(data$timeA, "days"), data$timearriving))
data$late <- data$timeA - data$timearriving
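The "Some strings failed to parse" warning in the question most likely comes from the literal "NULL" entries in the time columns. As a minimal base-R sketch of the same approach (no lubridate), assuming the columns arrive as character strings and using two rows from the question, the "NULL" strings can be turned into NA before the date of timeA is pasted in front of the time of day:

```r
# Two rows taken from the question's sample data.
timeA <- as.POSIXct(c("2016-09-15 10:00:00", "2016-09-22 10:00:00"), tz = "UTC")
timearriving <- c("NULL", "07:55:06")

# Treat the literal string "NULL" as a missing value.
timearriving[timearriving == "NULL"] <- NA

# Put the date of timeA in front of the time of day, then parse.
# Rows with a missing time fail to match the format and become NA.
arrived <- as.POSIXct(paste(format(timeA, "%Y-%m-%d"), timearriving),
                      format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Difference between scheduled and actual arrival, in minutes.
late <- difftime(timeA, arrived, units = "mins")
late
```

Using difftime with an explicit units argument keeps the unit fixed, whereas the plain - operator picks the unit automatically.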

R: count 15 minutes interval in time

I would like to count the number of sessions started in each 15-minute interval on business days within a large dataset.
My data looks like:
df <-
Start_datetime End_datetime Duration Volume
2016-04-01 06:20:55 2016-04-01 14:41:22 08:20:27 8.360
2016-04-01 08:22:27 2016-04-01 08:22:40 00:00:13 0.000
2016-04-01 08:38:53 2016-04-01 09:31:58 00:53:05 12.570
2016-04-01 09:33:57 2016-04-01 12:37:43 03:03:46 7.320
2016-04-01 10:05:03 2016-04-01 16:41:16 06:36:13 9.520
2016-04-01 12:07:57 2016-04-02 22:22:32 34:14:35 7.230
2016-04-01 16:56:55 2016-04-02 10:40:17 17:43:22 5.300
2016-04-01 17:29:18 2016-04-01 19:50:29 02:21:11 7.020
2016-04-01 17:42:39 2016-04-01 19:45:38 02:02:59 2.430
2016-04-01 17:47:57 2016-04-01 20:26:35 02:38:38 8.090
2016-04-01 22:00:15 2016-04-04 08:22:21 58:22:06 4.710
2016-04-02 01:12:38 2016-04-02 09:49:00 08:36:22 3.150
2016-04-02 01:32:00 2016-04-02 12:49:47 11:17:47 5.760
2016-04-02 07:28:48 2016-04-04 06:58:56 47:30:08 0.000
2016-04-02 07:55:18 2016-04-05 07:55:15 71:59:57 0.240
I would like to count all the sessions starting in each 15-minute interval, like this:
For business days
Time PTU Count
00:00:00 - 00:15:00 1 10 #(where count is the amount of sessions started between 00:00:00 and 00:15:00)
00:15:00 - 00:30:00 2 6
00:30:00 - 00:45:00 3 5
00:45:00 - 01:00:00 3 3
And so on, and the same table for the weekend.
I have tried the cut function:
df$PTU <- table (cut(df$Start_datetime, breaks="15 minutes"))
data.frame(PTU)
EDIT: When I run this I receive the following error:
Error in cut.default(df$Start_datetime, breaks = "15 minutes") : 'x' must be numeric
I have also tried some lubridate functions, but I can't seem to make it work. My final goal is to create a table like the one above, with 15-minute intervals.

Here is a more or less complete process, from datetime strings to the format you want. The starting point is a character vector:
Start_time <-
c("2016-04-01 06:20:55", "2016-04-01 08:22:27", "2016-04-01 08:38:53",
"2016-04-01 09:33:57", "2016-04-01 10:05:03", "2016-04-01 12:07:57",
"2016-04-01 16:56:55", "2016-04-01 17:29:18", "2016-04-01 17:42:39",
"2016-04-01 17:47:57", "2016-04-01 22:00:15", "2016-04-02 01:12:38",
"2016-04-02 01:32:00", "2016-04-02 07:28:48", "2016-04-02 07:55:18"
)
df <- data.frame(Start_time)
And this is the actual processing:
## We will use two packages
library(lubridate)
library(data.table)
# convert df to data.table, parse the datetime string
setDT(df)[, Start_time := ymd_hms(Start_time)]
# floor time by 15 min to assign the appropriate slot (new variable Start_time_slot)
df[, Start_time_slot := floor_date(Start_time, "15 min")]
# aggregate by wday and time in a date
start_time_data_frame <- df[, .N, by = .(wday(Start_time_slot), format(Start_time_slot, format="%H:%M:%S") )]
# output looks like this
start_time_data_frame
## wday time N
## 1: 6 06:15:00 1
## 2: 6 08:15:00 1
## 3: 6 08:30:00 1
## 4: 6 09:30:00 1
## 5: 6 10:00:00 1
## 6: 6 12:00:00 1
## 7: 6 16:45:00 1
## 8: 6 17:15:00 1
## 9: 6 17:30:00 1
## 10: 6 17:45:00 1
## 11: 6 22:00:00 1
## 12: 7 01:00:00 1
## 13: 7 01:30:00 1
## 14: 7 07:15:00 1
## 15: 7 07:45:00 1

There are two things you have to keep in mind when using cut on datetimes:
1. Make sure your data is actually of class POSIXt. I'm quite sure yours isn't, or R would be using cut.POSIXt rather than cut.default as the method.
2. "15 minutes" should be "15 min". See ?cut.POSIXt
So this works:
Start_datetime <- as.POSIXct(
c("2016-04-01 06:20:55",
"2016-04-01 06:22:12",
"2016-04-01 05:30:12")
)
table(cut(Start_datetime, breaks = "15 min"))
# 2016-04-01 05:30:00 2016-04-01 05:45:00 2016-04-01 06:00:00 2016-04-01 06:15:00
# 1 0 0 2
Note that the output gives you the start of the 15 minute interval as names of the table.
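cut on its own buckets by absolute time, not by time of day, and it does not separate business days from weekends. A minimal base-R sketch of that remaining step follows, assuming the timestamps are already POSIXct and using five start times from the question. The names slot_min, ptu and day_type are introduced here for illustration only.

```r
# Five start times from the question (a Friday and a Saturday).
start <- as.POSIXct(c("2016-04-01 06:20:55", "2016-04-01 08:22:27",
                      "2016-04-01 08:38:53", "2016-04-02 01:12:38",
                      "2016-04-02 01:32:00"), tz = "UTC")
lt <- as.POSIXlt(start)

# Minutes since midnight, rounded down to the start of the 15-minute slot.
slot_min <- (lt$hour * 60 + lt$min) %/% 15 * 15
slot <- sprintf("%02d:%02d", slot_min %/% 60, slot_min %% 60)

# Slot number within the day, with 00:00-00:15 being 1 as in the question.
ptu <- slot_min %/% 15 + 1

# POSIXlt weekdays run from 0 (Sunday) to 6 (Saturday).
day_type <- ifelse(lt$wday %in% c(0, 6), "weekend", "business")

# Sessions started per day type and time-of-day slot.
counts <- table(day_type, slot)
counts
```

Only slots that actually occur in the data appear in the table. Converting slot to a factor with all 96 levels would keep the empty ones as zero counts.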

R time series missing values

I was working with a time series dataset having hourly data. The data contained a few missing values so I tried to create a dataframe (time_seq) with the correct time value and do a merge with the original data so the missing values become 'NA'.
> data
date value
7980 2015-03-30 20:00:00 78389
7981 2015-03-30 21:00:00 72622
7982 2015-03-30 22:00:00 65240
7983 2015-03-30 23:00:00 47795
7984 2015-03-31 08:00:00 37455
7985 2015-03-31 09:00:00 70695
7986 2015-03-31 10:00:00 68444
//converting the date in the data to POSIXct format.
> data$date <- format.POSIXct(data$date,'%Y-%m-%d %H:%M:%S')
// creating a dataframe with the correct sequence of dates.
> time_seq <- seq(from = as.POSIXct("2014-05-01 00:00:00"),
to = as.POSIXct("2015-04-30 23:00:00"), by = "hour")
> df <- data.frame(date=time_seq)
> df
date
8013 2015-03-30 20:00:00
8014 2015-03-30 21:00:00
8015 2015-03-30 22:00:00
8016 2015-03-30 23:00:00
8017 2015-03-31 00:00:00
8018 2015-03-31 01:00:00
8019 2015-03-31 02:00:00
8020 2015-03-31 03:00:00
8021 2015-03-31 04:00:00
8022 2015-03-31 05:00:00
8023 2015-03-31 06:00:00
8024 2015-03-31 07:00:00
// merging with the original data
> a <- merge(data,df, x.by = data$date, y.by = df$date ,all=TRUE)
> a
date value
4005 2014-07-23 07:00:00 37003
4006 2014-07-23 07:30:00 NA
4007 2014-07-23 08:00:00 37216
4008 2014-07-23 08:30:00 NA
The values I get after merging are incorrect: they contain half-hourly values. What would be the correct approach for solving this?
Why is the merge result in 30-minute intervals when both my data frames are hourly?
PS: I looked into this question, "Fastest way for filling-in missing dates for data.table", and followed the steps, but it didn't help.

You can use the padr package to solve this problem.
library(padr)
library(dplyr) #for the pipe operator
data %>%
  pad() %>%
  fill_by_value()
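For comparison, here is a minimal base-R sketch of the merge approach the question was attempting. It assumes the date column is a genuine POSIXct in the same time zone as the generated sequence (the question's format.POSIXct call produces character strings instead) and that merge is called with by = "date"; it uses three rows from the question. The names full and filled are introduced here for illustration.

```r
# Three rows from the question, with a gap between 23:00 and 08:00.
data <- data.frame(
  date  = as.POSIXct(c("2015-03-30 22:00:00", "2015-03-30 23:00:00",
                       "2015-03-31 08:00:00"), tz = "UTC"),
  value = c(65240, 47795, 37455)
)

# Complete hourly sequence covering the observed range, in the same time zone.
full <- data.frame(date = seq(min(data$date), max(data$date), by = "hour"))

# Left join: every hour is kept, and hours without data get NA.
filled <- merge(full, data, by = "date", all.x = TRUE)
filled
```

Keeping both date columns as POSIXct in one time zone means the join compares instants rather than formatted strings.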

subset by vector in r

I am trying to subset an xts object of hourly OHLC data with a vector.
If I create the vector myself with the following commands
lookup = c("2012-01-12", "2012-01-31", "2012-03-05", "2012-03-19")
testdfx[lookup]
I get the correct data, showing all the hours (00:00 to 23:00) for the dates in the vector.
> head(testdfx[lookup])
open high low close
2012-01-12 00:00:00 1.27081 1.27217 1.27063 1.27211
2012-01-12 01:00:00 1.27212 1.27216 1.27089 1.27119
2012-01-12 02:00:00 1.27118 1.27166 1.27017 1.27133
2012-01-12 03:00:00 1.27134 1.27272 1.27133 1.27261
2012-01-12 04:00:00 1.27260 1.27262 1.27141 1.27183
2012-01-12 05:00:00 1.27183 1.27230 1.27145 1.27165
> tail(testdfx[lookup])
open high low close
2012-03-19 18:00:00 1.32451 1.32554 1.32386 1.32414
2012-03-19 19:00:00 1.32417 1.32465 1.32331 1.32372
2012-03-19 20:00:00 1.32373 1.32415 1.32340 1.32372
2012-03-19 21:00:00 1.32373 1.32461 1.32366 1.32376
2012-03-19 22:00:00 1.32377 1.32424 1.32359 1.32366
2012-03-19 23:00:00 1.32364 1.32406 1.32333 1.32336
However, when I extract dates from an object and create a vector to use for subsetting, I only get the hours 00:00-19:00 displayed in my subset.
> head(testdfx[dates])
open high low close
2007-01-05 00:00:00 1.3092 1.3093 1.3085 1.3088
2007-01-05 01:00:00 1.3087 1.3092 1.3075 1.3078
2007-01-05 02:00:00 1.3079 1.3091 1.3078 1.3084
2007-01-05 03:00:00 1.3083 1.3084 1.3073 1.3074
2007-01-05 04:00:00 1.3073 1.3080 1.3061 1.3071
2007-01-05 05:00:00 1.3070 1.3072 1.3064 1.3069
> tail(euro[nfp.releases])
open high low close
2014-01-10 14:00:00 1.35892 1.36625 1.35728 1.36366
2014-01-10 15:00:00 1.36365 1.36784 1.36241 1.36743
2014-01-10 16:00:00 1.36742 1.36866 1.36693 1.36719
2014-01-10 17:00:00 1.36720 1.36752 1.36579 1.36617
2014-01-10 18:00:00 1.36617 1.36663 1.36559 1.36624
2014-01-10 19:00:00 1.36630 1.36717 1.36585 1.36702
I have compared both objects containing the required dates and they appear to be the same.
> class(lookup)
[1] "character"
> class(nfp.releases)
[1] "character"
> str(lookup)
chr [1:4] "2012-01-12" "2012-01-31" "2012-03-05" "2012-03-19"
> str(nfp.releases)
chr [1:86] "2014-02-07" "2014-01-10" "2013-12-06" "2013-11-08" ..
I am new to R but have tried everything over the past 3 days to get this to work. If I can't do it this way I will end up having to create the variable by hand, but as it has 86 dates this may take some time.
Thanks in advance.

I cannot reproduce your problem:
library(xts)
lookup = c("2012-01-12", "2012-01-31", "2012-03-05", "2012-03-19")
time_index <- seq(from = as.POSIXct("2012-01-01 07:00"), to = as.POSIXct("2012-05-17 18:00"), by = "hour")
set.seed(1)
value <- matrix(rnorm(n = 4*length(time_index)),length(time_index),4)
testdfx <- xts(value, order.by = time_index)
testdfx[lookup[1]]
testdfx["2012-01-12"]

Thanks for the responses, guys. I actually thought I had deleted this thread, but obviously not.
The problem in the case above was to be found about 3' from the computer. When looking through the data I was only interested in Fridays, which is also when the FX market closes down for the weekend.
Sorry to have wasted your time. Admin, please remove.
