converting numbers to time - r

I entered my data by hand, and to save time I didn't include any punctuation in my times. So, for example, 8:32am I entered as 832. 3:34pm I entered as 1534. I'm trying to use the 'chron' package (http://cran.r-project.org/web/packages/chron/chron.pdf) in R to convert these to time format, but chron seems to require a delimiter between the hour and minute values. How can I work around this or use another package to convert my numbers into times?
And if you'd like to criticize me for asking a question that's already been answered before, please provide a link to said answer, because I've searched and haven't been able to find it. Then criticize away.

I think you don't need the chron package necessarily. When:
x <- c(834, 1534)
Then:
time <- substr(as.POSIXct(sprintf("%04.0f", x), format='%H%M'), 12, 16)
time
[1] "08:34" "15:34"
should give you the desired result. When you also want to include a variable which represents the date, you can use the following line of code:
df$datetime <- as.POSIXct(paste(df$yymmdd, sprintf("%04.0f", df$x)), format='%Y%m%d %H%M')
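As a hedged variation on the above: taking characters 12 to 16 of the printed date-time depends on how R prints POSIXct values, so the sketch below asks format() for the hours and minutes directly. The sample vector and the choice of tz = "UTC" are assumptions made for illustration.

```r
# Sample inputs typed without punctuation (assumed data); 5 stands for 00:05
x <- c(832, 1534, 5, 0)

# Zero-pad to four digits so every value reads as HHMM
padded <- sprintf("%04d", as.integer(x))

# Parse as hour/minute; today's date gets attached but is ignored afterwards
parsed <- as.POSIXct(padded, format = "%H%M", tz = "UTC")

# Ask for the time part explicitly instead of slicing the printed string
hhmm <- format(parsed, "%H:%M")
print(hhmm)
```

Values that are not valid times (for example 2575) come back as NA, which makes typing mistakes easy to spot.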

Here's a sub solution using a regular expression:
set.seed(1); times <- paste0(sample(0:23,10), sample(0:59,10)) # ex. data
sub("(\\d+)(\\d{2})", "\\1:\\2", times) # put in delimiter
# [1] "6:12" "8:10" "12:39" "19:21" "4:43" "17:27" "18:38" "11:52" "10:19" "0:57"
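The regular expression treats the last two digits as minutes, so it only works when minutes were entered with two digits (8:05 typed as 805, not 85). A minimal sketch under that assumption, with made-up hours and minutes:

```r
# Hypothetical hours and minutes used to build unpunctuated times
hours <- c(6L, 8L, 12L, 0L)
mins  <- c(5L, 10L, 39L, 57L)

# Minutes are zero-padded to two digits; hours are left as typed
times <- sprintf("%d%02d", hours, mins)

# Everything before the final two digits is the hour
with_colon <- sub("^(\\d+)(\\d{2})$", "\\1:\\2", times)
print(with_colon)
```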

Say
x <- c('834', '1534')
The last two characters represent minutes, so you can extract them using
mins <- substr(x, nchar(x)-1, nchar(x))
Similarly, extract hours with
hour <- substr(x, 0, nchar(x)-2)
Then create a fixed vector of time values with
time <- paste0(hour, ':', mins)
I think you are forced to specify dates in the chron package, so assuming a date value, you can convert to chron with this:
chron(dates.=rep('02/02/02', 2),
times.=paste0(hour, ':', mins, ':00'),
format=c(dates='m/d/y',times='h:m:s'))

I thought I'd throw out a non-regex solution that uses lubridate. This is probably overkill.
library(lubridate)
library(stringr)
time.orig <- c('834', '1534')
# zero pad times before noon
time.padded <- str_pad(time.orig, 4, pad="0")
# parse using lubridate
time.period <- hm(time.padded)
# make it look like time
time.pretty <- paste(hour(time.period), minute(time.period), sep=":")
And you end up with
> time.pretty
[1] "8:34" "15:34"

Here are two solutions that do not use regular expressions:
library(chron)
x <- c(832, 1534, 101, 110) # test data
# 1
times( sprintf( "%d:%02d:00", x %/% 100, x %% 100 ) )
# 2
times( ( x %/% 100 + x %% 100 / 60 ) / 24 )
Either gives the following chron "times" object:
[1] 08:32:00 15:34:00 01:01:00 01:10:00
ADDED second solution.
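The same integer arithmetic can be tried in base R without chron. This is only a sketch: the validity check and the variable names are additions for illustration, not part of the answer above.

```r
x <- c(832, 1534, 101, 110)  # test data from the answer above

hours <- x %/% 100   # integer division drops the last two digits
mins  <- x %% 100    # remainder keeps the last two digits

# Flag entries that cannot be real times (e.g. 2575 or 1260)
valid <- hours < 24 & mins < 60

# Character form, NA where the entry is not a valid time
hhmm <- ifelse(valid,
               sprintf("%02d:%02d", as.integer(hours), as.integer(mins)),
               NA_character_)

# Fraction of a day, the same quantity the second chron solution computes
frac_day <- (hours + mins / 60) / 24
print(hhmm)
```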

Related

R: how to transfer numeric of different length into date time

I want to transfer a numeric value like "212259" into a datetime format.
These numbers specifies the hours, minutes and seconds of a day.
I already used parse_date_time(x, orders="HMS") out of the lubridate package, or strptime(x = x, format = "%H%M%S"), but my problem is that these columns could also contain values like "1158" if it was early in the day, so there are no characters for the hours, for example. It could also be just seconds, e.g. "12" for the 12th second of the day.
Does someone know how I can handle it? I want to combine these values with the column of the specific day and do some arithmetic on it.
Best regards
Do you require something like this?
toTime <- function(value) {
padded_value = str_pad(value, 6, pad = "0")
strptime(padded_value, "%H%M%S")
}
str_pad is from the stringr package
So assuming that the numeric value just cuts off the leading zeros, I would suggest you transform to character and then re-add them. You could use a function to do that, something along the lines of:
convert_numeric <- function(x){
if (nchar(x) == 6) {
x <- as.character(x)
return(x)
} else if (nchar(x) == 4) {
x <- as.character(paste0("00",x))
return(x)
} else if (nchar(x) == 2) {
x <- as.character(paste0("0000",x))
return(x)
}
}
Let's say your times vector has the examples you mention in it:
times <- c(212259, 1158, 12)
You could then use sapply to get the right format to use the functions you mention for date-time conversion:
char_times <- sapply(times, convert_numeric)
# [1] "212259" "001158" "000012"
strptime(char_times, format = "%H%M%S")
# [1] "2016-11-03 21:22:59 CET" "2016-11-03 00:11:58 CET" "2016-11-03 00:00:12 CET"
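The function above only covers values with 6, 4 or 2 digits. A possible vectorized alternative, assuming the values are whole numbers with at most six digits, is to let sprintf do the padding; the date used here is a placeholder for the "column of the specific day".

```r
# Values with 6, 4, 2, 1 and 5 digits (the last two are extra test cases)
times <- c(212259, 1158, 12, 5, 93005)

# Left-pad with zeros to exactly six characters: HHMMSS
padded <- sprintf("%06d", as.integer(times))

# Placeholder date; in practice this would come from the day column
day <- as.Date("2016-11-03")

# Combine date and time; tz = "UTC" is an assumption to avoid DST surprises
stamp <- as.POSIXct(paste(day, padded), format = "%Y-%m-%d %H%M%S", tz = "UTC")

clock <- format(stamp, "%H:%M:%S")
print(clock)

# Arithmetic works on the parsed values, e.g. seconds between two entries
gap_secs <- as.numeric(difftime(stamp[1], stamp[2], units = "secs"))
```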

as.Date function gives different result in a for loop

Slight problem where my as.Date function gives a different result when I put it in a for loop. I'm looking in a folder with subfolders (per date) that contain images. I build date_list to organize all the dates (for plotting options in a later stage). The Julian Day starts from the first of January of the year, so because I have 4 years of data, the year must be flexible.
# Set up list with 4 columns and counter Q. jan is used to set all dates to the first of january
date_list <- outer(1:52, 1:4)
q = 1
jan <- "-01-01"
for (scene in folders){
year <- as.numeric(substr(scene, start=10, stop=13))
day <- as.numeric(substr(scene, start=14, stop=16))
datum <- paste(year, day, sep='_')
date_list[q, 1] <- datum
date_list[q, 2] <- year
date_list[q, 3] <- day
date_list[q, 4] <- as.Date(day, origin = as.Date(paste(year,jan, sep="")))
q = q+1
}
Output final row:
[52,] "2016_267" "2016" "267" "17068"
What am I missing in date_list[q, 4] that doesn't transfer my integer to a date?
Running the following code on its own does work, but due to the large number of scenes and folders I would like to automate this:
as.Date(day, origin = as.Date(paste(year,jan, sep="")))
Thank you for your time!
Well, I assume this would answer your first question:
date_list[q, 4] <- as.character(as.Date(datum,format="%Y_%j"))
as.Date accepts a format argument (the %Y and %j are documented in strptime); %j is the julian day (day of the year). This is a little easier to read than using origin and multiple paste calls.
Your problem is actually linked to what a Date object is:
> dput(as.Date("2016-01-10"))
structure(16810, class = "Date")
When entered into a matrix (your date_list) it is coerced to character without special treatment, like this:
> d<-as.Date("2016-01-10")
> class(d)<-"character"
> d
[1] "16810"
Hence you get only the number of days since 1970-01-01. When you ask for the date as a character representation with as.character, it gives the correct value because the Date class has an as.character method, which first computes the date in human-readable format before returning a character value.
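A small sketch of that coercion, using a one-row character matrix as a stand-in for date_list (the matrix and data frame here are made up for the demonstration):

```r
d <- as.Date("2016-01-10")

# A character matrix, like date_list after the first string is stored in it
m <- matrix("", nrow = 1, ncol = 2)

m[1, 1] <- d                # class is dropped: only the day count survives
m[1, 2] <- as.character(d)  # converted first: the readable date is kept
print(m)

# A data frame column, by contrast, keeps the Date class
df <- data.frame(date = d)
print(class(df$date))
```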
Now if I understood well your problem I would go this way:
First create a function to work on one string:
name_to_list <- function(name) {
dpart <- substr(name, start=10, stop=16)
date <- as.POSIXlt(dpart, format="%Y%j")
c("datum"=paste(date$year+1900,date$yday+1,sep="_"), "year"=date$year+1900, "julian_day"=date$yday+1, "date"=as.character(date) )
}
This function just gets your substring and then converts it to the POSIXlt class, which gives us the julian day, year and date in one pass. As the year is stored as an integer since 1900 (it could be negative), we have to add 1900 when storing the year in the fields. Likewise yday counts from 0, so we add 1 to get back the day of year used in the folder name.
Then if your folders variable is a vector of strings:
lapply(folders,name_to_list)
which for folders=c("LC81730382016267LGN00","LC81730382016287LGN00","LC81730382016167LGN00") gives:
[[1]]
datum year julian_day date
"2016_267" "2016" "267" "2016-09-23"
[[2]]
datum year julian_day date
"2016_287" "2016" "287" "2016-10-13"
[[3]]
datum year julian_day date
"2016_167" "2016" "167" "2016-06-15"
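A vectorized sketch of the same idea, assuming folder names follow the pattern in the example (year in characters 10 to 13, day of year in 14 to 16). A data frame is used instead of a matrix so that the date column keeps its Date class; the column names are illustrative.

```r
folders <- c("LC81730382016267LGN00",
             "LC81730382016287LGN00",
             "LC81730382016167LGN00")

# Year plus day of year, e.g. "2016267"
dpart <- substr(folders, 10, 16)

scene_dates <- data.frame(
  datum      = paste(substr(dpart, 1, 4), substr(dpart, 5, 7), sep = "_"),
  year       = as.integer(substr(dpart, 1, 4)),
  julian_day = as.integer(substr(dpart, 5, 7)),
  date       = as.Date(dpart, format = "%Y%j")  # %j is the day of the year
)
print(scene_dates)
```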
Do you mean to output your day as 3 numbers? Should it not be 2 numbers?
day <- as.numeric(substr(scene, start=15, stop=16))
or
day <- as.numeric(substr(scene, start=14, stop=15))
That could at least be part of the issue. Providing an example of what typical values of "scene" are would be helpful here.

Vectorizing a function that uses strsplit

I am trying to make a function that converts time (in character form) to decimal format such that 1 corresponds to 1 am and 23 corresponds to 11 pm and 24 means the end of the day.
Here are the two functions that do this. One function is vectorized while the other is not.
time2dec <- function(time0)
{
time.dec <-as.numeric(substr(time0,1,2))+as.numeric(substr(time0,4,5))/60+(as.numeric(substr(time0,7,8)))/3600
return(time.dec)
}
time2dec1 <- function(time0)
{
time.dec <-as.numeric(strsplit(time0,':')[[1]][1])+as.numeric(strsplit(time0,':')[[1]][2])/60+as.numeric(strsplit(time0,':')[[1]][3])/3600
return(time.dec)
}
This is what I get...
times <- c('12:23:12','10:23:45','9:08:10')
#>time2dec(times)
[1] 12.38667 10.39583 NA
Warning messages:
1: In time2dec(times) : NAs introduced by coercion
2: In time2dec(times) : NAs introduced by coercion
#>time2dec1(times)
[1] 12.38667
I know time2dec which is vectorized, gives NA for the last element because it extracts 9: instead of 9 as hour. That is why I created time2dec1 but I do not know why it is not getting vectorized.
I will also be interested in getting a better function for doing what I am trying to do.
I saw this, which explains part of my question but does not give a clue for what I am trying to do.
Don't try to reinvent the wheel:
times1 <- difftime(as.POSIXct(times, "%H:%M:%S", tz="GMT"),
as.POSIXct("0:0:0", "%H:%M:%S", tz="GMT"),
units="hours")
#Time differences in hours
#[1] 12.386667 10.395833 9.136111
as.numeric(times1)
#[1] 12.386667 10.395833 9.136111
In the following we shall use this test vector:
ch <- c('12:23:12','10:23:45','9:08:10')
1) To fix up the solution in the question we prepend a 0 and then replace any string of 3 digits with the last two:
num.substr <- function(...) as.numeric(substr(...))
time2dec <- function(time0) {
t0 <- sub("\\d(\\d\\d)", "\\1", paste0(0, time0))
num.substr(t0, 1, 2) + num.substr(t0, 4, 5) / 60 + num.substr(t0, 7, 8) / 3600
}
time2dec(ch)
## [1] 12.386667 10.395833 9.136111
2) Parsing the string is slightly easier with strapply in the gsubfn package:
library(gsubfn)
strapply(ch, "^(.?.):(..):(..)",
~ as.numeric(h) + as.numeric(m)/60 + as.numeric(s)/3600,
simplify = c)
## [1] 12.386667 10.395833 9.136111
3) We can reduce the string manipulation to just removing the colons and then convert the resulting character string to numeric so we can manipulate it numerically:
num <- as.numeric(gsub(":", "", ch))
num %/% 10000 + num %% 10000 %/% 100 / 60 + num %% 100 / 3600
## [1] 12.386667 10.395833 9.136111
4) The chron package has a "times" class that internally represents times as fractions of a day. Converting that to hours gives an easy solution:
library(chron)
24 * as.numeric(times(ch))
## [1] 12.386667 10.395833 9.136111
ADDED Added more solutions.
as.numeric( strptime(times, "%H:%M:%S")-strptime(Sys.Date(), "%Y-%m-%d" ))
[1] 12.386667 10.395833 9.136111
Basically the same as Roland's but bypassing some steps, and I try to avoid using difftime if I can. Had too many bugs arise because I don't really understand the function or the class ... or something. And when I timed it versus Roland's his was faster. Oh, well.
Emulating @G.Grothendieck's efforts (and essentially working similarly to his elegant strapply solution):
num <- apply( matrix(scan(text=gsub(":", " ", ch), what=numeric(0)),nrow=3), 2,
function(x) x[1]+x[2]/60 +x[3]/3600 )
#Read 9 items
num
#[1] 12.386667 10.395833 9.136111
And this actually answers the original question:
num <- sapply( strsplit(ch, ":"), function(x){ x2 <- as.numeric(x);
x2[1]+x2[2]/60 +x2[3]/3600})
num
#[1] 12.386667 10.395833 9.136111
The following does what you want
sapply(strsplit(times, ":"), function(d) {
sum(as.numeric(d)*c(1,1/60,1/3600))
})
Step by step:
strsplit(times, ":")
returns a list of character vectors. Each character vector contains the three parts of the time (hours, minutes, seconds). We now want to convert each of the elements in the list to a numeric value. For this we need to apply a function to each element and put the results back into a vector, which is what sapply does.
sapply(strsplit(times, ":"), function(d) {
})
As for the function: we first need to convert the character values to numeric values using as.numeric. Then we multiply the first element by 1, the second by 1/60 and the third by 1/3600 and add the results (for which we use sum), resulting in
sapply(strsplit(times, ":"), function(d) {
sum(as.numeric(d)*c(1,1/60,1/3600))
})
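To see why indexing with [[1]] in the question's time2dec1 only ever returns one value, it can help to look at what strsplit produces. The sketch below uses vapply instead of sapply as a stricter variant (an assumption of preference, not something the answer requires), with a helper name made up for illustration.

```r
times <- c("12:23:12", "10:23:45", "9:08:10")

# One list element per input string, each holding hours, minutes, seconds
parts <- strsplit(times, ":", fixed = TRUE)
print(length(parts))  # one element per time
print(parts[[1]])     # [[1]] picks only the first time, hence no vectorization

# Hypothetical helper: weight the three pieces and add them up
to_hours <- function(p) sum(as.numeric(p) * c(1, 1/60, 1/3600))

# vapply checks that every result is a single number
dec <- vapply(parts, to_hours, numeric(1))
print(dec)
```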

How can you insert a colon every two characters?

I have a column of time values, except that they are in character format and do not have the colons to separate H, M, S. The column looks similar to the following:
Time
024201
054722
213024
205022
205024
125440
I want to convert all the values in the column to look like actual time values in the format H:M:S. The values are already in HMS format, so it is simply a matter of inserting colons, but that is proving more difficult than I thought. I found a package that adds commas every three digits from the right to make Strings look like currency values, but nothing for time (without also adding a date value, which I do not want to do). Any help would be appreciated.
Since the data is time related, you should consider storing it in a POSIX format:
> df <- data.frame(Time=c("024201", "054722", "213024", "205022", "205024", "125440"))
> df$Time <- as.POSIXct(df$Time, format="%H%M%S")
> df
Time
1 2014-01-05 02:42:01
2 2014-01-05 05:47:22
3 2014-01-05 21:30:24
4 2014-01-05 20:50:22
5 2014-01-05 20:50:24
6 2014-01-05 12:54:40
To output just the times:
> format(df, "%H:%M:%S")
Time
1 02:42:01
2 05:47:22
3 21:30:24
4 20:50:22
5 20:50:24
6 12:54:40
A regular expression with lookaround works for this:
gsub('(..)(?=.)', '\\1:', x$Time, perl=TRUE)
The (?=.) means a character (matched by .) must follow, but is not considered part of the match (and is not captured).
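A short sketch of the lookahead on two of the sample values. The second half assumes a case not shown in the question, where the times were stored as numbers and lost their leading zeros, so they are padded to six digits first.

```r
x <- c("024201", "125440")

# Insert a colon after every pair of characters that is followed by another one
with_colons <- gsub("(..)(?=.)", "\\1:", x, perl = TRUE)
print(with_colons)

# Assumed variant: numeric input without leading zeros must be padded first,
# otherwise the pairs would be counted from the wrong position
raw <- c(24201L, 125440L)
padded <- sprintf("%06d", raw)
from_numeric <- gsub("(..)(?=.)", "\\1:", padded, perl = TRUE)
print(from_numeric)
```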
Here is a regex solution:
x <- readLines(n=6)
024201
054722
213024
205022
205024
125440
gsub("(\\d\\d)(\\d\\d)(\\d\\d)", "\\1:\\2:\\3", x)
## [1] "02:42:01" "05:47:22" "21:30:24"
## [4] "20:50:22" "20:50:24" "12:54:40"
Here the (\\d\\d) says we're looking for 2 digits. The parenthesis breaks the string into 3 parts. Then the \\1: says take chunk 1 and place a colon after it.
Or via date/times classes:
time <- c("024201", "054722", "213024", "205022", "205024", "125440")
time <- as.POSIXct(paste("1970-01-01", time), format="%Y-%m-%d %H%M%S")
(time <- format(time, "%H:%M:%S"))
# [1] "02:42:01" "05:47:22" "21:30:24" "20:50:22" "20:50:24" "12:54:40"
This gives a chron "times" class vector:
> library(chron)
> times(gsub("(..)(..)(..)", "\\1:\\2:\\3", DF$Time))
[1] 02:42:01 05:47:22 21:30:24 20:50:22 20:50:24 12:54:40
The "times" class can display times without having to display the date and supports various methods on the times.
On the other hand, if only a character string is wanted then only the gsub part is needed.

Convert HH:MM:SS to hours (for more than 24 hours) in R

I would like to convert hours more than 24 hours in R.
For example, I have a dataframe which contains hours and minutes like [HH:MM]:
[1] "111:15" "221:15" "111:15" "221:15" "42:05"
I want them to be converted in hours like this:
"111.25" "221.25" "111.25" "221.25" "42.08333333"
The as.POSIXct() function works for general purposes, but not for more than 24 hours.
You can split the strings with strsplit and use sapply to transform all values.
vec <- c("111:15", "221:15", "111:15", "221:15", "42:05")
sapply(strsplit(vec, ":"), function(x) {
x <- as.numeric(x)
x[1] + x[2] / 60
})
The result:
[1] 111.25000 221.25000 111.25000 221.25000 42.08333
I would just parse the strings with regex. Grab the bit before the : then add on the bit after the : divided by 60
> foo = c("111:15", "221:15", "111:15", "221:15", "42:05")
> foo
[1] "111:15" "221:15" "111:15" "221:15" "42:05"
> as.numeric(gsub("([^:]+).*", "\\1", foo)) + as.numeric(gsub(".*:([0-9]{2})$", "\\1", foo))/60
[1] 111.25000 221.25000 111.25000 221.25000 42.08333
Another possibility is a vectorized function such as:
FUN <- function(time){
hours <- sapply(time,FUN=function(x) as.numeric(strsplit(x,split=":")[[1]][1]))
minutes <- sapply(time,FUN=function(x) as.numeric(strsplit(x,split=":")[[1]][2]))
result <- hours+(minutes/60)
return(as.numeric(result))
}
Where you use strsplit to extract the hours and minutes, of which you then take the sum after dividing the minutes by 60.
You can then use the function like this:
FUN(c("111:15","221:15","111:15","221:15","42:05"))
[1] 111.25000 221.25000 111.25000 221.25000 42.08333
strapply. Here is a solution using strapply in the gsubfn package. It passes the matches to each of the parenthesized regular expressions (i.e. the hours and the minutes) to the function described in the third argument. The function can be specified using the usual R function notation, and it also supports a short form using a formula (used here) where the right hand side of the formula is the function body and the left hand side represents the arguments and defaults to the free variables (h, m) in the right hand side. We suppose that the original character vector is ch.
library(gsubfn)
strapply(ch, "(\\d+):(\\d+)", ~ as.numeric(h) + as.numeric(m)/60, simplify = TRUE)
numeric processing. Another way is to replace the : with a . and manipulate it numerically into what we want:
num <- as.numeric(chartr(":", ".", ch))
trunc(num) + 100 * (num %% 1) / 60
sub. This is yet another approach:
h <- as.numeric(sub(":.*", "", ch))
m <- as.numeric(sub(".*:", "", ch))
h + m / 60
The code above each gives a numeric result, but we could wrap each in as.character(...) if a character result were desired.
read.table
as.matrix(read.table(text = ch, sep = ":")) %*% c(1, 1/60)
eval/parse. This one manipulates each string into an R expression which is then evaluated. This one is short, but the use of eval is often frowned upon:
sapply(parse(text = sub(":", "+(1/60)*", ch)), eval)
ADDED additional solutions.
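As a quick check on any of these approaches, the decimal hours can be converted back to the H:MM form and compared with the input. This is only a sketch with a shortened sample vector; working in whole minutes is a choice made here to sidestep floating-point rounding.

```r
ch <- c("111:15", "221:15", "42:05")

# The sub approach from above: hours before the colon, minutes after it
h <- as.numeric(sub(":.*", "", ch))
m <- as.numeric(sub(".*:", "", ch))
dec_hours <- h + m / 60

# Round trip: go through whole minutes so 42.08333... becomes 42:05 again
total_min <- round(dec_hours * 60)
back <- sprintf("%d:%02d",
                as.integer(total_min %/% 60),
                as.integer(total_min %% 60))
print(dec_hours)
print(back)
```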
