R: how to convert numerics of different lengths into date-time

I want to convert a numeric value like 212259 into a date-time format.
These numbers specify the hours, minutes and seconds of a day.
I already tried parse_date_time(x, orders = "HMS") from the lubridate package and base R's strptime(x = x, format = "%H%M%S"), but my problem is that these columns can also contain values like 1158 if it was early in the day, so there are no digits for the hours. It could also be just seconds, e.g. 12 for the 12th second of the day.
Does someone know how I can handle this? I want to combine these values with the column holding the specific day and do some arithmetic on the result.
Best regards

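Do you require something like this?
library(stringr)  # provides str_pad
toTime <- function(value) {
  # left-pad with zeros to six characters (HHMMSS) before parsing
  padded_value <- str_pad(value, 6, pad = "0")
  strptime(padded_value, "%H%M%S")
}
str_pad is from the stringr package.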

So assuming that the numeric value just cuts off the leading zeros, I would suggest you convert it to character and then re-add them. You could use a function to do that, something along the lines of:
convert_numeric <- function(x){
if (nchar(x) == 6) {
x <- as.character(x)
return(x)
} else if (nchar(x) == 4) {
x <- as.character(paste0("00",x))
return(x)
} else if (nchar(x) == 2) {
x <- as.character(paste0("0000",x))
return(x)
}
}
Let's say your times vector has the examples you mention in it:
times <- c(212259, 1158, 12)
You could then use sapply to get the right format to use the functions you mention for date-time conversion:
char_times <- sapply(times, convert_numeric)
# [1] "212259" "001158" "000012"
strptime(char_times, format = "%H%M%S")
# [1] "2016-11-03 21:22:59 CET" "2016-11-03 00:11:58 CET" "2016-11-03 00:00:12 CET"
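The padding step can also be done for the whole vector at once, which covers any length from one to six digits. The following is only a sketch under assumptions: the data frame df, its column names and the UTC time zone are hypothetical, and the values are assumed to be whole numbers.

```r
# Hypothetical example data: one day per row plus an HHMMSS-style number
# that has lost its leading zeros.
df <- data.frame(
  day  = as.Date(c("2016-11-03", "2016-11-03", "2016-11-04")),
  time = c(212259, 1158, 12)
)

# Re-add the leading zeros for every element at once.
df$time_chr <- sprintf("%06d", as.integer(df$time))

# Combine day and time; the time zone is fixed to UTC as an assumption.
df$stamp <- as.POSIXct(paste(df$day, df$time_chr),
                       format = "%Y-%m-%d %H%M%S", tz = "UTC")

format(df$stamp, "%Y-%m-%d %H:%M:%S")
# [1] "2016-11-03 21:22:59" "2016-11-03 00:11:58" "2016-11-04 00:00:12"

# Arithmetic works on the combined column, e.g. elapsed seconds between rows.
elapsed <- as.numeric(diff(df$stamp), units = "secs")
```

Because sprintf and as.POSIXct are both vectorized, no sapply loop is needed here.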

Related

Create sequence of times for 32Hz in format hh:mm:ss

I'd like to create a sequence of times from 00:00:00 to 00:38:24 separated by 1/32 seconds. This means many times will be repeated but it doesn't matter.
I've tried:
require(lubridate)
#Convert 00:38:24 to seconds
timeEnd<-period_to_seconds(hms("00:38:24"))
>timeEnd
[1] 2304
#Total number of elements in sequence
# Expected number of elements: timeEnd * 32 Hz = 73728
#Create a sequence from 0 to timeEnd in 1/32 steps
t<-seq(0,timeEnd,by=1/32) #This bit is wrong! makes 73729 elements
#Convert to 00:00:00 format
sprintf('%02d:%02d:%02d', seconds_to_period(t)@hour, minute(seconds_to_period(t)), floor(second(seconds_to_period(t))))
>t[1:5]
[1] "00:00:00" "00:00:00" "00:00:00" "00:00:00" "00:00:00"
Is there a better solution?
There are the wrong number of elements in the sequence. It should be 73728 but I get 73729. Why?
I don't know if this is a better solution, but here is one that uses only base R and stringi, wrapped in a function for a bit more flexibility. Time values can be passed as character strings and the time increment as a fraction or any other number.
f <- function(start_time = "00:00:00", end_time = "00:38:24",
time_inc = 1/32, ...){
options(digits.secs=4)
tl <- lapply(list(start_time, end_time), function(i){
as.POSIXct(i, format = "%H:%M:%OS", tz = "EST")
})
as_seq <- seq(tl[[1]], tl[[2]], by = time_inc)
stringi::stri_datetime_format(as_seq, format = "HH:mm:ss:SSSS")
}
time.seq <- f()
> head(time.seq)
[1] "00:00:00:0000" "00:00:00:0310" "00:00:00:0620" "00:00:00:0930" "00:00:00:1250" "00:00:00:1560"
> tail(time.seq)
[1] "00:38:23:8430" "00:38:23:8750" "00:38:23:9060" "00:38:23:9370" "00:38:23:9680" "00:38:24:0000"
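On the element count raised in the question: seq(from, to, by) includes both endpoints, so 73728 steps of 1/32 s produce 73729 values. A minimal base R sketch, assuming the final value at 00:38:24 is the one that should be dropped (the variable names are made up for illustration):

```r
time_end <- 38 * 60 + 24          # 00:38:24 expressed in seconds (2304)
n <- time_end * 32                # 73728 samples at 32 Hz

# Both endpoints are included, hence one element more than n.
t_closed <- seq(0, time_end, by = 1/32)
length(t_closed)
# [1] 73729

# Fixing the length instead of the end point gives exactly n elements.
t_open <- seq(0, by = 1/32, length.out = n)
length(t_open)
# [1] 73728

# Whole seconds formatted as hh:mm:ss; each label repeats 32 times.
secs <- floor(t_open)
labels <- sprintf("%02d:%02d:%02d",
                  secs %/% 3600, (secs %% 3600) %/% 60, secs %% 60)
```

Here the last label is "00:38:23", since the sample at exactly 00:38:24 would start the next second.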

Create valid Time from integer with built-in function

Is there a way to create 8:46:01 from the integer 84601 without using modulo operations in R? Something like format with a mask in other languages: format(84600, "HHMMSS")? Otherwise modulo division and some messy formulas are needed.
format(strptime("084601","%H%M%S"),"%H:%M:%S")
works, but you have to make sure that you have a two-digit hour, for example:
x <- "84601"
Put a zero in front of any 5-digit numeric string (the anchors keep longer strings from being padded as well):
xx <- gsub("^([0-9]{5})$","0\\1",x)
(or, as @Frank says in a comment, sprintf("%06d", x) will work for integers ...)
Convert:
format(strptime(xx,"%H%M%S"),"%H:%M:%S")
(if you don't format() you'll get a date-time string with the current date filled in ...)
Just treat it as a string:
x <- 84601
# index from end in case of extra hours digit
y <- paste0(substr(x, 1, nchar(x)-4), ':',
substr(x, nchar(x)-3, nchar(x)-2), ':',
substr(x, nchar(x)-1, nchar(x)))
y
# [1] "8:46:01"
Or with regex:
y <- gsub('(.?.)(..)(..)', '\\1:\\2:\\3', x)
y
# [1] "8:46:01"
Or with format (formatting numbers, not time):
y <- format(x, big.mark = ':', big.interval = 2L)
y
# [1] "8:46:01"
If you need an actual time class, chron::times is nice:
chron::times(y)
# [1] 08:46:01
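The padding and parsing ideas from these answers can be combined into one vectorized helper. This is a sketch only: to_hms is a hypothetical name, the inputs are assumed to be whole numbers of at most six digits, and UTC is chosen arbitrarily so that the result does not depend on the local time zone.

```r
# Hypothetical helper: integer HHMMSS (leading zeros lost) -> "HH:MM:SS".
to_hms <- function(x) {
  padded <- sprintf("%06d", as.integer(x))               # restore leading zeros
  parsed <- strptime(padded, format = "%H%M%S", tz = "UTC")
  format(parsed, "%H:%M:%S")                             # NA where parsing failed
}

to_hms(c(84601, 123456, 5, 86101))
# [1] "08:46:01" "12:34:56" "00:00:05" NA
```

Going through strptime has the side benefit that impossible values such as minute 61 come back as NA instead of being formatted silently.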

R parse timestamp of form %m%d%Y with no leading zeroes

I have data with a timestamp of the form %m%d%Y with no leading zeroes.
Timestamp sample:
112001
1112001
Desired parsing
January 1 2001
January 11 2001 or November 1 2001 based on context
The timestamps are in sequential order. Is it possible to parse this data?
It is possible, but I think some preparatory work is needed. This follows the same premise as @hrbrmstr's answer, which I think is what needs to be done to be able to parse these dates.
> x <- c("112001", "1112001")
> x1 <- ifelse(substring(x, 1, 1) != 0, paste0(0, x), x)
> x2 <- ifelse(nchar(x1) == 7 & substring(x1, 3, 3) != 0,
paste0(substring(x1, 1, 2), 0, substring(x1, 3)), x1)
> library(lubridate)
> parse_date_time(x2, "mdy")
[1] "2001-01-01 UTC" "2001-01-11 UTC"
This would be the basic logic handling for those date strings by length. You'll need to add logic for the "context", given that we have no idea how these are structured. I'm putting them in a vector for example:
dates <- c(112001, 1112001)
lapply(dates, function(x) {
x <- as.character(x)
if (nchar(x) == 6) {
as.Date(sprintf("0%s0%s%s", substr(x,1,1), substr(x,2,2), substr(x,3,6)), format="%m%d%Y")
} else if (nchar(x) == 7) {
as.Date(sprintf("0%s%s%s", substr(x,1,1), substr(x,2,3), substr(x,4,7)), format="%m%d%Y")
} else {
as.Date(x, format="%m%d%Y")
}
})
## [[1]]
## [1] "2001-01-01"
##
## [[2]]
## [1] "2001-01-11"
You can parse a date from a string representation in a fixed format using strptime. You can then convert the result to a different representation using strftime.
Your desire to support formats that cannot be parsed uniquely and to decide "based on context" is not as simple to implement, and you probably want to avoid going down this path.
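To make the ambiguity concrete, one option is to enumerate every valid reading of a string before applying any context rule. The following is a sketch under assumptions: candidate_dates is a hypothetical helper, the year is assumed to always have four digits, and the context rule itself (for example using the sequential order) is left out.

```r
# Hypothetical helper: all valid month/day splits of an mdY string
# whose month and day may lack leading zeros.
candidate_dates <- function(x) {
  x <- as.character(x)
  n <- nchar(x)
  year <- substr(x, n - 3, n)        # last four characters
  md <- substr(x, 1, n - 4)          # month and day digits, 2 to 4 characters
  k <- nchar(md)
  cands <- character(0)
  for (m_len in 1:2) {               # the month uses one or two digits
    d_len <- k - m_len
    if (d_len < 1 || d_len > 2) next
    m <- as.integer(substr(md, 1, m_len))
    d <- as.integer(substr(md, m_len + 1, k))
    cands <- c(cands, sprintf("%s-%02d-%02d", year, m, d))
  }
  parsed <- as.Date(cands, format = "%Y-%m-%d")  # impossible dates become NA
  unique(parsed[!is.na(parsed)])
}

format(candidate_dates("112001"))
# [1] "2001-01-01"
format(candidate_dates("1112001"))
# [1] "2001-01-11" "2001-11-01"
```

Strings with a single candidate can be accepted directly; only those with two candidates need a decision based on the neighbouring timestamps.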

Vectorizing a function that uses strsplit

I am trying to make a function that converts time (in character form) to decimal format such that 1 corresponds to 1 am and 23 corresponds to 11 pm and 24 means the end of the day.
Here are the two functions that do this. One of them is vectorized while the other is not.
time2dec <- function(time0)
{
time.dec <-as.numeric(substr(time0,1,2))+as.numeric(substr(time0,4,5))/60+(as.numeric(substr(time0,7,8)))/3600
return(time.dec)
}
time2dec1 <- function(time0)
{
time.dec <-as.numeric(strsplit(time0,':')[[1]][1])+as.numeric(strsplit(time0,':')[[1]][2])/60+as.numeric(strsplit(time0,':')[[1]][3])/3600
return(time.dec)
}
This is what I get...
times <- c('12:23:12','10:23:45','9:08:10')
#>time2dec(times)
[1] 12.38667 10.39583 NA
Warning messages:
1: In time2dec(times) : NAs introduced by coercion
2: In time2dec(times) : NAs introduced by coercion
#>time2dec1(times)
[1] 12.38667
I know time2dec which is vectorized, gives NA for the last element because it extracts 9: instead of 9 as hour. That is why I created time2dec1 but I do not know why it is not getting vectorized.
I will also be interested in getting a better function for doing what I am trying to do.
I saw this, which explains part of my question but does not give a clue about how to do what I am trying.
Don't try to reinvent the wheel:
times1 <- difftime(as.POSIXct(times, format="%H:%M:%S", tz="GMT"),
                   as.POSIXct("0:0:0", format="%H:%M:%S", tz="GMT"),
                   units="hours")
#Time differences in hours
#[1] 12.386667 10.395833 9.136111
as.numeric(times1)
#[1] 12.386667 10.395833 9.136111
In the following we shall use this test vector:
ch <- c('12:23:12','10:23:45','9:08:10')
1) To fix up the solution in the question we prepend a 0 and then replace any string of 3 digits with the last two:
num.substr <- function(...) as.numeric(substr(...))
time2dec <- function(time0) {
t0 <- sub("\\d(\\d\\d)", "\\1", paste0(0, time0))
num.substr(t0, 1, 2) + num.substr(t0, 4, 5) / 60 + num.substr(t0, 7, 8) / 3600
}
time2dec(ch)
## [1] 12.386667 10.395833 9.136111
2) Parsing the string is slightly easier with strapply in the gsubfn package:
library(gsubfn)
strapply(ch, "^(.?.):(..):(..)",
         ~ as.numeric(h) + as.numeric(m)/60 + as.numeric(s)/3600,
         simplify = c)
## [1] 12.386667 10.395833 9.136111
3) We can reduce the string manipulation to just removing the colons and then convert the resulting character string to numeric so we can manipulate it numerically:
num <- as.numeric(gsub(":", "", ch))
num %/% 10000 + num %% 10000 %/% 100 / 60 + num %% 100 / 3600
## [1] 12.386667 10.395833 9.136111
4) The chron package has a "times" class that internally represents times as fractions of a day. Converting that to hours gives an easy solution:
library(chron)
24 * as.numeric(times(ch))
## [1] 12.386667 10.395833 9.136111
as.numeric( strptime(times, "%H:%M:%S")-strptime(Sys.Date(), "%Y-%m-%d" ), units = "hours")
[1] 12.386667 10.395833 9.136111
Basically the same as Roland's but bypassing some steps, and I try to avoid using difftime if I can. Had too many bugs arise because I don't really understand the function or the class ... or something. And when I timed it versus Roland's his was faster. Oh, well.
Emulating @G.Grothendieck's efforts (and essentially working similarly to his elegant strapply solution):
num <- apply( matrix(scan(text=gsub(":", " ", ch), what=numeric(0)),nrow=3), 2,
function(x) x[1]+x[2]/60 +x[3]/3600 )
#Read 9 items
num
#[1] 12.386667 10.395833 9.136111
And this actually answers the original question:
num <- sapply( strsplit(ch, ":"), function(x){ x2 <- as.numeric(x);
x2[1]+x2[2]/60 +x2[3]/3600})
num
#[1] 12.386667 10.395833 9.136111
The following does what you want
sapply(strsplit(times, ":"), function(d) {
sum(as.numeric(d)*c(1,1/60,1/3600))
})
Step by step:
strsplit(times, ":")
returns a list of character vectors. Each character vector contains the three parts of the time (hours, minutes, seconds). We now want to convert each element of the list to a numeric value. For this we need to apply a function to each element and put the results back into a vector, which is what sapply does.
sapply(strsplit(times, ":"), function(d) {
})
As for the function: we first need to convert the character values to numeric values using as.numeric. Then we multiply the first element by 1, the second by 1/60 and the third by 1/3600, and add the results (for which we use sum), resulting in
sapply(strsplit(times, ":"), function(d) {
sum(as.numeric(d)*c(1,1/60,1/3600))
})
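As for why time2dec1 in the question is not vectorized: strsplit returns one list element per input string, and the [[1]] index keeps only the first of them, so the remaining strings are never looked at. A minimal sketch of this, with time2dec2 as a hypothetical name for a vapply-based variant:

```r
times <- c("12:23:12", "10:23:45", "9:08:10")

parts <- strsplit(times, ":", fixed = TRUE)
length(parts)   # one list element per input string
# [1] 3
parts[[1]]      # [[1]] selects only the first string's pieces
# [1] "12" "23" "12"

# Hypothetical vectorized variant: process every list element and
# require exactly one number back per input string.
time2dec2 <- function(time0) {
  vapply(strsplit(time0, ":", fixed = TRUE),
         function(p) sum(as.numeric(p) * c(1, 1/60, 1/3600)),
         numeric(1))
}

round(time2dec2(times), 6)
# [1] 12.386667 10.395833  9.136111
```

vapply is used here instead of sapply only because it checks the type and length of each result; the computation is the same.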

converting numbers to time

I entered my data by hand, and to save time I didn't include any punctuation in my times. So, for example, 8:32am I entered as 832, and 3:34pm I entered as 1534. I'm trying to use the 'chron' package (http://cran.r-project.org/web/packages/chron/chron.pdf) in R to convert these to a time format, but chron seems to require a delimiter between the hour and minute values. How can I work around this, or use another package to convert my numbers into times?
And if you'd like to criticize me for asking a question that's already been answered before, please provide a link to said answer, because I've searched and haven't been able to find it. Then criticize away.
I think you don't need the chron package necessarily. When:
x <- c(834, 1534)
Then:
time <- format(as.POSIXct(sprintf("%04.0f", x), format='%H%M'), '%H:%M')
time
[1] "08:34" "15:34"
should give you the desired result. When you also want to include a variable which represents the date, you can use the following line of code:
df$datetime <- as.POSIXct(paste(df$yymmdd, sprintf("%04.0f", df$x)), format='%Y%m%d %H%M')
Here's a sub solution using a regular expression:
set.seed(1); times <- paste0(sample(0:23,10), sample(0:59,10)) # ex. data
sub("(\\d+)(\\d{2})", "\\1:\\2", times) # put in delimiter
# [1] "6:12" "8:10" "12:39" "19:21" "4:43" "17:27" "18:38" "11:52" "10:19" "0:57"
Say
x <- c('834', '1534')
The last two characters represent minutes, so you can extract them using
mins <- substr(x, nchar(x)-1, nchar(x))
Similarly, extract hours with
hour <- substr(x, 0, nchar(x)-2)
Then create a fixed vector of time values with
time <- paste0(hour, ':', mins)
I think you are forced to specify dates in the chron package, so assuming a date value, you can convert to chron with this:
chron(dates.=rep('02/02/02', 2),
times.=paste0(hour, ':', mins, ':00'),
format=c(dates='m/d/y',times='h:m:s'))
I thought I'd throw out a non-regex solution that uses lubridate. This is probably overkill.
library(lubridate)
library(stringr)
time.orig <- c('834', '1534')
# zero pad times before noon
time.padded <- str_pad(time.orig, 4, pad="0")
# parse using lubridate
time.period <- hm(time.padded)
# make it look like time
time.pretty <- paste(hour(time.period), minute(time.period), sep=":")
And you end up with
> time.pretty
[1] "8:34" "15:34"
Here are two solutions that do not use regular expressions:
library(chron)
x <- c(832, 1534, 101, 110) # test data
# 1
times( sprintf( "%d:%02d:00", x %/% 100, x %% 100 ) )
# 2
times( ( x %/% 100 + x %% 100 / 60 ) / 24 )
Either gives the following chron "times" object:
[1] 08:32:00 15:34:00 01:01:00 01:10:00
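The same integer arithmetic also works without chron. The sketch below stays in base R; the variable names are made up for illustration, and the inputs are assumed to be whole numbers in HMM or HHMM form.

```r
x <- c(832, 1534, 101, 110)       # test data in HMM / HHMM form

hours <- x %/% 100                # everything before the last two digits
mins  <- x %% 100                 # last two digits

# Flag typing errors such as 875 (minute 75) or 2560 (hour 25).
valid <- hours < 24 & mins < 60

# A plain number is convenient for arithmetic on times of day.
minutes_since_midnight <- hours * 60 + mins
minutes_since_midnight
# [1] 512 934  61  70

# A zero-padded label is convenient for display.
labels <- sprintf("%02d:%02d", hours, mins)
labels
# [1] "08:32" "15:34" "01:01" "01:10"
```

Since the data were typed by hand, the valid vector gives a cheap way to spot entries that cannot be a time of day.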

Resources