Convert periods (hundredths of a second) in R

I am trying to convert a vector of the following form:
data$Time[1:10]
[1] 0:00.00 0:00.01 0:00.02 0:00.03 0:00.04 0:00.05 0:00.06 0:00.07 0:00.08 0:00.09
573394 Levels: 0:00.00 0:00.01 0:00.02 0:00.03 0:00.04 0:00.05 0:00.06 0:00.07 0:00.08 0:00.09 0:00.10 0:00.11 0:00.12 0:00.13 0:00.14 ... 9:59.99
Notice that this is a factor:
class(data$Time)
factor
I've tried the following:
hms(data$Time[1:10])
[1] "0S" "1S" "2S" "3S" "4S" "5S" "6S" "7S" "8S" "9S"
It treats the hundredths of a second as whole seconds! The same thing happens with:
period_to_seconds(hms(data$Time[1:10]))
[1] 0 1 2 3 4 5 6 7 8 9
I need to be able to extract the time (with the required accuracy) so that I can subtract times and calculate periods. Note that these files will extend to a few hours, so a solution that also works for HH:MM:SS.00 would be appreciated.
Another approach, which only works if the data is entirely H:M:S or entirely M:S, is the following:
Test <- c('03:5.05', '1:03.05.05')
tmp <- strptime(as.character(Test),"%H:%M:%OS")
tmp
[1] NA NA
tmp <- strptime(as.character(Test),"%M:%OS")
tmp
[1] "2016-04-30 00:03:05.05 CDT" "2016-04-30 00:01:03.05 CDT"
(The hours had to be removed)

## set option to use digits for seconds
options(digits.secs = 2)
## convert your factor to a string and then to Posix format
tmp <- strptime(as.character(data$Time),'%H:%M:%OS')
## convert it to a numeric (unit seconds)
as.numeric(strftime(tmp,'%OS'))+60*as.numeric(strftime(tmp,'%M'))+60*60*as.numeric(strftime(tmp,'%H'))

The lubridate package has an ms() function that reads only minutes and seconds.
Test <- c('0:00.02', '9:59.99')
library(lubridate)
library(magrittr)  # provides the %>% pipe used below
Test %>% ms() %>% period_to_seconds()
[1] 0.02 599.99

Based on Jorg's answer, I think I was able to solve my problem. The files I am working with extend for a few hours (with each point representing 0.01 sec). So I split the vector (data$Time) and applied the M:S format to the first 360000 points and the H:M:S format to everything that follows:
options(digits.secs = 2)
tmp1 <- strptime(as.character(data$Time[1:360000]),"%M:%OS")
tmp2 <- strptime(as.character(data$Time[-(1:360000)]),"%H:%M:%OS")
tmp1_numeric <-as.numeric(strftime(tmp1,'%OS'))+60*as.numeric(strftime(tmp1,'%M'))+60*60*as.numeric(strftime(tmp1,'%H'))
tmp2_numeric <-as.numeric(strftime(tmp2,'%OS'))+60*as.numeric(strftime(tmp2,'%M'))+60*60*as.numeric(strftime(tmp2,'%H'))
tmp_numeric <- c(tmp1_numeric, tmp2_numeric)
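The split at row 360000 depends on knowing exactly where the format changes. As a rough sketch of an alternative, the strings can be split on ":" and the fields weighted from the right (seconds, minutes, hours), so that "M:SS.ss" and "H:MM:SS.ss" go through the same code. The helper name to_seconds is hypothetical and not part of any package; it assumes every value has two or three colon-separated numeric fields.

```r
# Hypothetical helper: convert "M:SS.ss" or "H:MM:SS.ss" strings (or factors)
# to seconds. Assumes each value has 2 or 3 numeric fields separated by ":".
to_seconds <- function(x) {
  parts <- strsplit(as.character(x), ":", fixed = TRUE)
  vapply(parts, function(p) {
    p <- as.numeric(p)
    # Weights counted from the right: seconds = 1, minutes = 60, hours = 3600
    sum(p * 60^(rev(seq_along(p)) - 1))
  }, numeric(1))
}

secs <- to_seconds(c("0:00.02", "9:59.99", "1:03:05.05"))
secs        # seconds as plain numbers
diff(secs)  # differences between consecutive times, in seconds
```

Because the result is an ordinary numeric vector, subtraction works directly and no date or time zone is attached to the values.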

Related

Discretize a date-time variable to "in-hours" and "after-hours"

I have date-times like:
x = c("2015-09-12 03:52:00", "2017-06-15 21:37:28", "2017-04-08 20:44:11")
I want to create two categories: if the time is between 6:30 pm and 8 am, return "after-hours"; otherwise return "in-hours".
I tried to solve this first by extracting the time part, but that converted it to a character, which meant ifelse was not working.
Thank you in advance.
base R
Cheating a little, converting to %H%M as an integer on a 24h clock.
vec <- as.POSIXct(c("2015-09-12 03:52:00", "2017-06-15 21:37:28", "2017-04-08 20:44:11"))
hhmm <- as.integer(format(vec, format = "%H%M"))
ifelse(hhmm < 0800 | hhmm > 1830, "after-hours", "in-hours")
# [1] "after-hours" "after-hours" "after-hours"
lubridate
Similar, but using decimal hours instead of fake-hour/minute.
library(lubridate)
hhmm2 <- hour(vec) + minute(vec)/60
ifelse(hhmm2 < 8 | hhmm2 > 18.5, "after-hours", "in-hours")
# [1] "after-hours" "after-hours" "after-hours"
times_as_char = c("2015-09-12 03:52:00", "2017-06-15 21:37:28", "2017-04-08 20:44:11")
# Converting character to date-time
times_as_datetimes <- lubridate::ymd_hms(times_as_char)
# We can use decimal hours to make time comparisons easier
times_as_hour_dec <- lubridate::hour(times_as_datetimes) +
  lubridate::minute(times_as_datetimes)/60
# Label each time; the labels match the ones asked for in the question
time_status <- ifelse(times_as_hour_dec < 8 | times_as_hour_dec >= 18.5,
                      "after-hours",
                      "in-hours")
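All three sample times in the question fall after hours, so the examples above never show an "in-hours" result. The following base R sketch wraps the decimal-hour idea in a function and runs it on inputs that hit both labels and the 18:30 boundary. The function name classify_hours, its start and end arguments, and the choice to treat exactly 18:30 as after-hours are assumptions made for illustration.

```r
# Hypothetical helper: label date-times by clock time using decimal hours.
# Assumption: in-hours means start <= time < end, so 18:30:00 is after-hours.
classify_hours <- function(x, start = 8, end = 18.5) {
  lt <- as.POSIXlt(x, tz = "UTC")  # parse without local time zone shifts
  hour_dec <- lt$hour + lt$min / 60 + lt$sec / 3600
  ifelse(hour_dec >= start & hour_dec < end, "in-hours", "after-hours")
}

res <- classify_hours(c("2015-09-12 03:52:00",
                        "2017-06-15 12:00:00",
                        "2017-04-08 18:30:00"))
res
# [1] "after-hours" "in-hours"    "after-hours"
```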

How to read quarterly data with R?

I'm trying to use Bayesian VAR, but I can't even get my data into shape. I get the data from https://sdw.ecb.europa.eu/ but since a lot of it is quarterly, I have trouble merging my variables because I'm unable to convert, for example, "2020-Q1" from character to date with as.Date.
I used the sub function to get, for example, 2020-1 and then tried as.Date(, format="%Y-%q") but it doesn't work, so I'm stuck.
textData <- "yearQuarter,Amount
2019-Q1,1000
2019-Q2,2000
2019-Q3,3000"
df <- read.csv(text=textData,header = TRUE,stringsAsFactors = FALSE)
as.Date(df$yearQuarter,format="%Y-%q")
...which produces:
> as.Date(df$yearQuarter,format="%Y-%q")
[1] NA NA NA
Thank you for your help!
library(lubridate)
d = yq("2020-Q1")
d
# [1] "2020-01-01"
year(d)
# [1] 2020
quarter(d)
# [1] 1
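The as.Date() attempt returns NA because %q is not among the conversion specifications that as.Date() and strptime() understand. If adding a package is not an option, a base R sketch along the following lines maps each quarter to the first day of its first month. The helper name quarter_start is hypothetical, and it assumes the input always looks like "YYYY-Qn".

```r
# Hypothetical helper: turn "YYYY-Qn" strings into the first day of the quarter.
# Assumes a four-digit year followed by "-Q" and a quarter number from 1 to 4.
quarter_start <- function(x) {
  yr    <- as.integer(substr(x, 1, 4))
  q     <- as.integer(sub(".*Q", "", x))
  month <- (q - 1L) * 3L + 1L  # Q1 -> 1, Q2 -> 4, Q3 -> 7, Q4 -> 10
  as.Date(sprintf("%d-%02d-01", yr, month))
}

quarter_start(c("2019-Q1", "2019-Q2", "2019-Q3"))
# [1] "2019-01-01" "2019-04-01" "2019-07-01"
```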

Print all hours:minutes from 00:00 to 23:59

I would like to print all the hours:minutes in a day from 00:00 to 23:59.
This part goes beyond the question, but if you want to help me, this is the whole idea:
Once that is done, I would like to calculate all the "curious" times that can be interpreted as serendipities. Patterns like: 00:00, 22:22, 01:10, 12:34, 11:44, and the like.
Later on, I would like to count all the "serendipities" and divide them by the total number of times, to know the probability of finding a "serendipity" each time a person looks at the time on their smartphone.
To be honest, I am pretty lost. I have not coded for some months. For the first part of the problem, I guess that a loop can do the task.
For the second part, an if conditional can probably do it.
For the first part of the problem I have tried loops like this:
for(i in x){
  for(k in y){
    cat(i,":",k, ",")
  }
}
For the second, something like
Assuming the digits of the time are ab:cd
if(a==b & a==c & a==d){
print(ab:cd)
TRUE
}
if(a==b & c==d){
print(ab:cd)
TRUE
}
I would like to get the whole list of numbers first. Then, the list of "serendipities", and finally the count of both to make the percentage.
I find it interesting how people find patterns in numbers when they look at the time, and I would like to know how probable it is to get one of these patterns out of the 24*60 = 1440 possible times.
I hope I have explained myself. (I used to be better with coding and maths, but after some months, I have forgotten almost everything).
Here's a way to generate the list of all possible times.
h <- seq(from=0, to=23)
m <- seq(from=0, to=59)
h <- sprintf('%02d', h)
m <- sprintf('%02d', m)
df <- data.frame(expand.grid(h, m))
df$times <- paste0(df$Var1, ':', df$Var2)
df <- df[order(df$times), ]
df$times
Partial output
df$times[1:25]
[1] "00:00" "00:01" "00:02" "00:03" "00:04" "00:05" "00:06" "00:07" "00:08"
[10] "00:09" "00:10" "00:11" "00:12" "00:13" "00:14" "00:15" "00:16" "00:17"
[19] "00:18" "00:19" "00:20" "00:21" "00:22" "00:23" "00:24"
Length of variable
dim(df)
[1] 1440 3
We can create a sequence at 1-minute intervals from 00:00:00 to 23:59:00 and then use format to get the output in the desired format.
format(seq(as.POSIXct("00:00:00", format = "%T"),
           as.POSIXct("23:59:00", format = "%T"), by = "1 min"), "%H:%M")
#[1] "00:00" "00:01" "00:02" "00:03" "00:04" "00:05" "00:06" "00:07" "00:08" "00:09"
# "00:10" "00:11" "00:12" "00:13" "00:14" "00:15" "00:16" "00:17" "00:18" "00:19" ...
Yet another way of doing it:
> result <- character(1440)
> for (i in 0:1439) result[i+1L] <- sprintf("%02d:%02d",
+ i %/% 60,
+ i %% 60
+ )
> head(result)
[1] "00:00" "00:01" "00:02" "00:03" "00:04" "00:05"
> tail(result)
[1] "23:54" "23:55" "23:56" "23:57" "23:58" "23:59"
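The answers above cover only the first part of the question. For the counting part, here is a minimal sketch that uses just the two digit patterns the asker wrote down for a time ab:cd (all four digits equal, and a == b together with c == d). Other "curious" patterns such as 12:34 or 01:10 are not covered, so the resulting share is only a lower bound on whatever definition is finally chosen. All variable names are made up for illustration.

```r
# All 1440 times of the day as "HH:MM" strings
times <- sprintf("%02d:%02d", rep(0:23, each = 60), rep(0:59, times = 24))

# Split each time into its four digits a, b, c, d (as characters)
digits <- gsub(":", "", times, fixed = TRUE)
d1 <- substr(digits, 1, 1)
d2 <- substr(digits, 2, 2)
d3 <- substr(digits, 3, 3)
d4 <- substr(digits, 4, 4)

# Pattern from the question: a == b and c == d (e.g. 11:44, 22:22).
# The "all four digits equal" pattern is a special case of this one.
is_curious <- d1 == d2 & d3 == d4
curious <- times[is_curious]

curious
length(curious)                  # 18 such times
length(curious) / length(times)  # 18 / 1440 = 0.0125
```

More patterns can be added by combining further logical vectors with `|` before subsetting.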

How do I change the index in a csv file to a proper time format?

I have a CSV file of 1000 daily prices
They are of this format:
1 1.6
2 2.5
3 0.2
4 ..
5 ..
6
7 ..
.
.
1700 1.3
The index runs from 1 to 1700.
But I need to specify a begin date and end date this way:
The start of the period is, let's say, 25th January 2009,
and the last (1700th) value corresponds to 14th May 2013.
So far I've gotten this close to solving the problem:
> dseries <- ts(dseries[,1], start = ??time??, freq = 30)
How do I go about this? Thanks.
UPDATE:
I managed to create a separate object with dates as suggested in the answers and plotted it, but the y axis is weird, as shown in the screenshot.
Something like this?
as.Date("25-01-2009",format="%d-%m-%Y") + (seq(1:1700)-1)
A better way, thanks to @AnandaMahto:
seq(as.Date("2009-01-25"), by="1 day", length.out=1700)
Plotting:
df <- data.frame(
  myDate=seq(as.Date("2009-01-25"), by="1 day", length.out=1700),
  myPrice=runif(1700)
)
plot(df)
R stores Date-classed objects as the integer offset from "1970-01-01", but the as.Date.numeric function needs an offset ('origin'), which can be any starting date:
rDate <- as.Date.numeric(dseries[,1], origin="2009-01-24")
Testing:
> rDate <- as.Date.numeric(1:10, origin="2009-01-24")
> rDate
[1] "2009-01-25" "2009-01-26" "2009-01-27" "2009-01-28" "2009-01-29"
[6] "2009-01-30" "2009-01-31" "2009-02-01" "2009-02-02" "2009-02-03"
You don't need to add the extension .numeric, since R would automatically seek out that function if you used the generic stem, as.Date, with an integer argument. I just put it in because as.Date.numeric has different arguments than as.Date.character.
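The seq() approach and the origin approach should yield the same dates, as long as the origin is set to the day before the first observation. A small check, shortened to ten values for illustration (the variable names are made up):

```r
n <- 10

# Approach 1: a daily sequence starting on the first date
by_seq <- seq(as.Date("2009-01-25"), by = "1 day", length.out = n)

# Approach 2: integer index 1..n counted from the day before the first date
by_origin <- as.Date(1:n, origin = "2009-01-24")

all(by_seq == by_origin)
# [1] TRUE
```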

Working with hundredths of a second using the chron package or modifying the precision

I'm using the chron package and I'm trying to work with hundredths of a second, like this:
library(chron)
tms <- times(c("00:01:30.81", "00:01:33.38", "00:01:34.10", "00:01:37.09",
"00:01:37.29", "00:01:36.96", "00:01:37.65", "00:01:37.63",
"00:01:36.80", "00:01:40.06"))
mean(tms)
# [1] 00:01:36
var(tms)
# [1] 9.432812e-10
sum(tms)
# [1] 00:16:02
The times do not seem to keep the hundredths of a second, because when I do this:
tms
# [1] 00:01:31 00:01:33 00:01:34 00:01:37 00:01:37 00:01:37 00:01:38 00:01:38
# [9] 00:01:37 00:01:40
It only shows whole seconds and rounds them. I want the exact times (or the exact average). How can I fix this?
The precision is there, but only the seconds part is displayed. Try this:
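# x should be a times object
show100s <- function(x) {
  # truncate to whole seconds first: format() rounds to the nearest second,
  # which would otherwise show 13.81 s as "14" followed by ".81"
  whole <- trunc(x, "sec")
  sprintf("%s.%02d", format(whole),
          round(100 * 3600 * 24 * (as.numeric(x) - as.numeric(whole))))
}
and run it like this:
library(chron)
tt <- times("11:12:13.81")
tt
## [1] 11:12:14
show100s(tt)
## [1] "11:12:13.81"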
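Another way to see the stored precision is to work from the numeric value directly. The sketch below assumes the input is a fraction of a day, which is what as.numeric() returns for a chron times object in the code above. The function name fmt_hundredths is hypothetical. It converts to a whole number of hundredths first, so the seconds field is never rounded up on its own. To keep the example free of package dependencies, the question's ten times are entered as plain seconds.

```r
# Hypothetical formatter: fraction of a day -> "HH:MM:SS.hh"
fmt_hundredths <- function(day_fraction) {
  cs <- round(as.numeric(day_fraction) * 86400 * 100)  # total hundredths of a second
  sprintf("%02d:%02d:%02d.%02d",
          as.integer(cs %/% 360000),       # hours
          as.integer((cs %/% 6000) %% 60), # minutes
          as.integer((cs %/% 100) %% 60),  # seconds
          as.integer(cs %% 100))           # hundredths
}

# The ten times from the question, written as seconds
secs <- c(90.81, 93.38, 94.10, 97.09, 97.29, 96.96, 97.65, 97.63, 96.80, 100.06)

fmt_hundredths(mean(secs) / 86400)
# [1] "00:01:36.18"
fmt_hundredths(sum(secs) / 86400)
# [1] "00:16:01.77"
```

With chron loaded, a call such as fmt_hundredths(mean(tms)) should work the same way, assuming as.numeric() on the result gives the fraction of a day.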

Resources