Could you please tell me how to rearrange the datetimes in data set A so that they are compatible with the datetimes in data set B (which is in GMT+10)?
Thank you.
**data set A**
sitecode status start end
ANS0009 spike 11/09/2013 04:45:00 PM (GMT+11) 11/09/2013 05:00:00 PM (GMT+11)
ARM0064 spike 05/03/2014 11:00:00 AM (GMT+10) 05/03/2014 11:15:00 AM (GMT+10)
BAS0059 dry 13/01/2013 00:00:00 AM (GMT+11) 29/03/2013 11:45:00 PM (GMT+11)
BAS0059 spike 11/03/2014 10:15:00 AM (GMT+10) 11/03/2014 10:30:00 AM (GMT+10)
BLC0097 failure 12/20/2012 05:00:00 PM (GMT+11) 12/31/2012 11:45:00 PM (GMT+11)
BLC0097 spike 24/12/2015 04:59:45 PM (GMT+10) 24/12/2015 05:01:50 PM (GMT+10)
**data set B**
sitecode status start end
EUM0056 record 2012-12-01 11:00:00 2013-10-06 01:45:00
EUM0056 missing 2013-10-06 01:45:00 2013-10-06 03:00:00
EUM0056 record 2013-10-06 03:00:00 2014-03-11 20:15:00
MDL0026 record 2012-12-07 11:00:00 2013-04-04 19:45:00
MDL0026 missing 2013-04-04 19:45:00 2014-02-27 23:00:00
MDL0026 record 2014-02-27 23:00:00 2014-10-05 01:45:00
We can use lubridate to parse multiple formats after splitting each string in two to separate out the (GMT+...) part.
library(lubridate)
library(stringr)
v1 <- strsplit(str1, "\\s+(?=\\()", perl = TRUE)[[1]]
parse_date_time(v1[1], c("%d/%m/%Y %I:%M:%S %p", "%m/%d/%Y %I:%M:%S %p"),
tz = "GMT", exact = TRUE) + lubridate::hours(as.numeric(str_extract(v1[2], "\\d+")))
#[1] "2013-09-12 03:45:00 GMT"
Using the full dataset example
datA[c("start", "end")] <- lapply(datA[c("start", "end")], function(x){
m1 <- do.call(rbind, strsplit(x, "\\s+(?=\\()", perl = TRUE))
parse_date_time(m1[,1], c("%d/%m/%Y %I:%M:%S %p", "%m/%d/%Y %I:%M:%S %p"),
tz = "GMT", exact = TRUE) + lubridate::hours(as.numeric(str_extract(m1[,2], "\\d+")))
})
data
str1 <- "11/09/2013 04:45:00 PM (GMT+11)"
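Since data set B is in GMT+10, another way to read the question is that every timestamp in data set A should end up as GMT+10 clock time, which means subtracting (N - 10) hours from a value tagged GMT+N. The following is a minimal base R sketch of that reading, not the method used in the answer above. The helper name `to_gmt10` is hypothetical, and the sketch assumes that day/month/year should be tried before month/day/year and that the locale uses English AM/PM markers.

```r
# Hypothetical helper: convert "dd/mm/yyyy hh:mm:ss AM (GMT+N)" strings
# to GMT+10 clock time.
to_gmt10 <- function(x) {
  # Hour offset N taken from the "(GMT+N)" suffix
  offset <- as.numeric(sub("^.*\\(GMT\\+([0-9]+)\\).*$", "\\1", x))
  # Timestamp with the suffix removed
  stamp <- trimws(sub("\\(GMT.*$", "", x))
  # Assumption: try day/month/year first, fall back to month/day/year
  parsed <- as.POSIXct(stamp, format = "%d/%m/%Y %I:%M:%S %p", tz = "UTC")
  bad <- is.na(parsed)
  if (any(bad)) {
    parsed[bad] <- as.POSIXct(stamp[bad], format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")
  }
  # The "UTC" tag is only a container here; the values are GMT+10 clock times
  parsed - (offset - 10) * 3600
}

sampleA <- c("11/09/2013 04:45:00 PM (GMT+11)",
             "05/03/2014 11:00:00 AM (GMT+10)",
             "12/20/2012 05:00:00 PM (GMT+11)")
resultA <- to_gmt10(sampleA)
print(format(resultA, "%Y-%m-%d %H:%M:%S"))
```

Strings such as 11/09/2013 are genuinely ambiguous, so the order in which the two formats are tried decides how they are read.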
require(lubridate)
exampleA <- c("11/09/2013 04:45:00 PM (GMT+11)",
"11/09/2013 04:45:00 PM (GMT+10)")
exampleA <- as.data.frame(exampleA)
exampleA$flag <- 0
# flag the rows recorded in GMT+11
exampleA$flag[grep("\\(GMT\\+11\\)", exampleA$exampleA)] <- 1
# strip only the time-zone suffix, keeping AM/PM so that the hour parses correctly
exampleA$exampleA <- gsub(" \\(GMT\\+1[01]\\)", "", exampleA$exampleA)
exampleA$exampleA <- mdy_hms(exampleA$exampleA)
# shift only the GMT+11 rows back by one hour to GMT+10
exampleA$exampleA[exampleA$flag == 1] <- exampleA$exampleA[exampleA$flag == 1] - 3600
exampleB <- c("2013-11-09 15:45:00", "2013-11-09 16:45:00")
exampleB <- ymd_hms(exampleB)
# Proof it works
exampleA$exampleA == exampleB
[1] TRUE TRUE
If you have a mix of formats in one data set (e.g. mdy, ydm, etc.), you can deal with this using if statements, either in a function that you apply or in a for loop: test whether a certain position has a value > 12 to determine the format, then use the appropriate lubridate function to convert it.
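The "value > 12" test described above could look roughly like this in base R. This is only a sketch: the helper name `parse_mixed_date` and its `default` argument are hypothetical, it handles date-only strings separated by "/", and it has to fall back to an assumed default whenever both fields are 12 or less.

```r
# Hypothetical helper: decide between day/month/year and month/day/year
# by checking which field exceeds 12.
parse_mixed_date <- function(x, default = "dmy") {
  fields <- strsplit(x, "/", fixed = TRUE)
  first  <- as.integer(vapply(fields, `[`, character(1), 1))
  second <- as.integer(vapply(fields, `[`, character(1), 2))
  # first > 12 can only be a day; second > 12 means the first field is a month;
  # otherwise the string is ambiguous and the assumed default is used
  layout <- ifelse(first > 12, "dmy", ifelse(second > 12, "mdy", default))
  out <- as.Date(x, format = "%m/%d/%Y")  # NA where month/day/year is invalid
  is_dmy <- layout == "dmy"
  out[is_dmy] <- as.Date(x[is_dmy], format = "%d/%m/%Y")
  out
}

mixed <- c("13/01/2013", "12/20/2012", "11/09/2013")
parsed_mixed <- parse_mixed_date(mixed)
print(parsed_mixed)
```

The third value is ambiguous, so it is read as 11 September only because of the assumed default.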
Related
We have the code:
times <- c("2:30 PM", "10:00 AM", "10:00 AM")
mydat <- data.frame(times=times)
which results in
> mydat
times
1 2:30 PM
2 10:00 AM
3 10:00 AM
I want to convert these times, which are characters, into POSIX format. So I do
mydat$ntimes <- as.POSIXct(NA,"")
mydat$ntimes <- sapply(mydat$times, function(x) parse_date_time(x, '%I:%M %p'))
Then we get
> mydat
times ntimes
1 2:30 PM -62167167000
2 10:00 AM -62167183200
3 10:00 AM -62167183200
I have no idea why these are negative. Furthermore, if instead of sapply we did a loop:
for (i in 1:length(mydat$times)){
mydat$ntimes[i] <- parse_date_time(mydat$times[i], '%I:%M %p')
}
we get the format right, but everything is off by 7 minutes and 2 seconds, why is that?
> mydat
times ntimes
1 2:30 PM 0000-01-01 06:37:02
2 10:00 AM 0000-01-01 02:07:02
3 10:00 AM 0000-01-01 02:07:02
You don't need a loop for this:
as.POSIXct(mydat$times, format = '%I:%M %p', tz = 'UTC')
#[1] "2021-03-14 14:30:00 UTC" "2021-03-14 10:00:00 UTC" "2021-03-14 10:00:00 UTC"
Or
lubridate::parse_date_time(mydat$times, '%I:%M %p')
#[1] "0000-01-01 14:30:00 UTC" "0000-01-01 10:00:00 UTC" "0000-01-01 10:00:00 UTC"
The difference between the two options is that, when the date is absent, as.POSIXct fills in today's date whereas parse_date_time uses 0000-01-01.
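As for the negative numbers in the question: sapply simplifies its result to a plain vector and drops the POSIXct class, so what remains is the underlying number of seconds since 1970-01-01 UTC, which is negative for a date in year 0. The shift of 7 minutes and 2 seconds in the loop version is most likely a display effect: the column was created in the local time zone, and for dates that far back the time zone database uses local mean time rather than a whole-hour offset. Below is a small base R sketch of the first effect; the fixed date 2021-03-13 and the helper name `to_time` are assumptions made only so the result does not depend on today's date.

```r
times <- c("2:30 PM", "10:00 AM", "10:00 AM")

# Hypothetical helper: attach a fixed date and parse as UTC
to_time <- function(t) {
  as.POSIXct(paste("2021-03-13", t), format = "%Y-%m-%d %I:%M %p", tz = "UTC")
}

# sapply() simplifies to a bare numeric vector: the POSIXct class is lost
raw <- sapply(times, to_time)
print(class(raw))

# Restoring the class recovers the date-times
restored <- as.POSIXct(unname(raw), origin = "1970-01-01", tz = "UTC")
print(restored)

# The vectorised call needs no sapply() and keeps the class
direct <- to_time(times)
print(direct)
```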
Base R Solution
You can use the strptime function to convert the times variable from character to POSIXlt. When no date is provided, this function also fills in today's date.
times <- c("2:30 PM", "10:00 AM", "10:00 AM")
mydat <- data.frame(times=times)
# FORMAT SPECIFICATIONS:
# %I = Hours as decimal number (01–12).
# %M = Minute as decimal number (00–59).
# %p = AM/PM indicator in the locale.
strptime(mydat$times, format='%I:%M %p', tz = 'UTC')
#> [1] "2021-03-13 14:30:00 UTC" "2021-03-13 10:00:00 UTC"
#> [3] "2021-03-13 10:00:00 UTC"
Created on 2021-03-13 by the reprex package (v0.3.0)
Add it to the data frame as a new variable
times <- c("2:30 PM", "10:00 AM", "10:00 AM")
mydat <- data.frame(times=times)
mydat$new_times <- strptime(mydat$times, format='%I:%M %p')
#> times new_times
#> 1 2:30 PM 2021-03-13 14:30:00
#> 2 10:00 AM 2021-03-13 10:00:00
#> 3 10:00 AM 2021-03-13 10:00:00
I have a column with dates and times in POSIXct format, e.g. "2019-02-23 12:45". I want to identify whether each time is AM or PM and add AM or PM to the date and time.
The following code creates an example dataset:
ID <- data.frame(c(1,2,3,4))
DATE <- data.frame(as.POSIXct(c("2019-02-25 07:30", "2019-03-25 14:30", "2019-03-25 12:00", "2019-03-25 00:00"),format="%Y-%m-%d %H:%M"))
DATEAMPM <- data.frame(c("2019-02-25 07:30 AM", "2019-03-25 14:30 PM", "2019-03-25 12:00 PM", "2019-03-25 00:00 AM"))
AMPMFLAG <- data.frame(c(0,1,1,0))
test <- cbind(ID,DATE,DATEAMPM,AMPMFLAG)
names(test) <- c("PID","DATE","DATEAMPM","AMPMFLAG")
I would like to create the DATEAMPM and AMPMFLAG columns as represented in the code above.
I have seen character strings of the form "2019-09-23 08:45 PM" converted to "2019-09-23 20:45" by specifying the format argument as below, but I do not know how to go the other way around and incorporate AM/PM into the date and time.
as.POSIXct(strptime(,format="%Y-%m-%d %I:%M %p"))
Appreciate your help
We can use format to get the data with AM/PM
test$DATEAMPM <- format(test$DATE, "%Y-%m-%d %I:%M %p")
test$AMPMFLAG <- +(grepl("PM", test$DATEAMPM))
test
# PID DATE DATEAMPM AMPMFLAG
#1 1 2019-02-25 07:30:00 2019-02-25 07:30 AM 0
#2 2 2019-03-25 14:30:00 2019-03-25 02:30 PM 1
#3 3 2019-03-25 12:00:00 2019-03-25 12:00 PM 1
#4 4 2019-03-25 00:00:00 2019-03-25 12:00 AM 0
Also note that 14:30:00 in AM/PM notation is 02:30 PM, not 14:30 PM.
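If the flag should not depend on how the locale spells AM and PM, it can also be derived from the 24-hour value directly. This is a minimal base R sketch under that assumption; the variable names `hour24` and `ampm_flag` are illustrative, and the time zone is pinned to UTC only to keep the example reproducible.

```r
dates <- as.POSIXct(c("2019-02-25 07:30", "2019-03-25 14:30",
                      "2019-03-25 12:00", "2019-03-25 00:00"),
                    format = "%Y-%m-%d %H:%M", tz = "UTC")

# Hour on the 24-hour clock; noon and later counts as PM
hour24 <- as.integer(format(dates, "%H"))
ampm_flag <- as.integer(hour24 >= 12)
print(ampm_flag)
```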
I'm having trouble creating a time series (POSIXct or dttm column) with a row every 15 minutes.
Something that will look like this for every 15 minutes between Jan 1st 2015 and Dec 31st 2016 (here as month/day/year hour:minutes):
1/15/2015 0:00
1/15/2015 0:15
1/15/2015 0:30
1/15/2015 0:45
1/15/2015 1:00
Perhaps a loop starting at 01/01/2015 0:00 and then adding 15 minutes until 12/31/2016 23:45?
Does anyone have an idea of how this can be done easily?
A little bit easier to read:
library(lubridate)
seq(ymd_hm('2015-01-01 00:00'),ymd_hm('2016-12-31 23:45'), by = '15 mins')
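# 366 days' worth of 15-minute steps; after filtering, this covers 2015 only
intervals.15.min <- 0 : (366 * 24 * 60 * 60 / 15 / 60)
res <- as.POSIXct("2015-01-01", tz = "GMT") + intervals.15.min * 15 * 60
# pass tz explicitly; a "GMT" suffix inside the string does not set the time zone
res <- res[res < as.POSIXct("2016-01-01 00:00:00", tz = "GMT")]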
head(res)
# "2015-01-01 00:00:00 GMT" "2015-01-01 00:15:00 GMT" "2015-01-01 00:30:00 GMT"
tail(res)
# "2015-12-31 23:15:00 GMT" "2015-12-31 23:30:00 GMT" "2015-12-31 23:45:00 GMT"
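The same kind of sequence can be built in base R with seq(), covering the whole range from the question through the end of 2016. This is a minimal sketch; the time zone is pinned to UTC as an assumption so that daylight saving changes do not add or remove rows, and the column name `timestamp` is illustrative.

```r
start <- as.POSIXct("2015-01-01 00:00", tz = "UTC")
end   <- as.POSIXct("2016-12-31 23:45", tz = "UTC")

# One element every 15 minutes, both end points included
quarter_hours <- seq(from = start, to = end, by = "15 min")

series <- data.frame(timestamp = quarter_hours)
print(head(series, 3))
print(nrow(series))
```

The row count should equal (365 + 366) days times 96 quarter hours per day.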
I am trying to work out how to derive the date (which does not exist in the table) from the time.
Example: Remember, I only have the time
time = c("9:44","15:30","23:48","00:30","05:30", "15:30", "22:00", "00:45")
I know for a fact that the start date is 2014-08-28, but how do I get the date, which changes after midnight?
Expected outcome would be
9:44 2014-08-28
15:30 2014-08-28
23:48 2014-08-28
00:30 2014-08-29
05:30 2014-08-29
15:30 2014-08-29
22:00 2014-08-29
00:45 2014-08-30
Here's an example using the data.table package's ITime class, which lets you manipulate times (once a time is converted to this class you can add or subtract minutes, hours, etc.)
library(data.table)
time <- as.ITime(time)
Date <- as.IDate("2014-08-28") + c(0, cumsum(diff(time) < 0))
data.table(time, Date)
# time Date
# 1: 09:44:00 2014-08-28
# 2: 15:30:00 2014-08-28
# 3: 23:48:00 2014-08-28
# 4: 00:30:00 2014-08-29
# 5: 05:30:00 2014-08-29
# 6: 15:30:00 2014-08-29
# 7: 22:00:00 2014-08-29
# 8: 00:45:00 2014-08-30
Using the chron package we assume that a later time is on the same day and an earlier time is on the next day:
library(chron)
date <- as.Date("2014-08-28") + cumsum(c(0, diff(times(paste0(time, ":00"))) < 0))
data.frame(time, date)
giving:
time date
1 9:44 2014-08-28
2 15:30 2014-08-28
3 23:48 2014-08-28
4 00:30 2014-08-29
5 05:30 2014-08-29
6 15:30 2014-08-29
7 22:00 2014-08-29
8 00:45 2014-08-30
Here's one way to do it:
time = c("9:44","15:30","23:48","00:30","05:30", "15:30", "22:00", "00:45")
# seconds since midnight: hours * 3600 + minutes * 60
secs <- sapply(strsplit(time, ":", TRUE), function(x) sum(as.numeric(x) * c(3600, 60)))
# add one day (in seconds) each time the clock goes backwards, i.e. passes midnight
as.POSIXct("2014-08-28", tz = "UTC") + secs + 60*60*24*cumsum(c(FALSE, diff(secs) < 0))
# [1] "2014-08-28 09:44:00 UTC" "2014-08-28 15:30:00 UTC" "2014-08-28 23:48:00 UTC" "2014-08-29 00:30:00 UTC" "2014-08-29 05:30:00 UTC" "2014-08-29 15:30:00 UTC" "2014-08-29 22:00:00 UTC" "2014-08-30 00:45:00 UTC"
You can concatenate the system date with the time to get the result. For example, in Oracle we can get the date with the time as:
to_char(sysdate,'DD-MM-RRRR')|| ' ' || to_char(sysdate,'HH:MI AM')
This gives, e.g., 12-09-2015 09:50 AM
For your requirement, use:
to_char(sysdate,'DD-MM-RRRR')|| ' 00:45' and so on.
I want to split x (which is a factor)
dd = data.frame(x = c("29-4-2014 06:00:00", "9-4-2014 12:00:00", "9-4-2014 00:00:00", "6-5-2014 00:00:00" ,"7-4-2014 00:00:00" , "29-5-2014 00:00:00"))
x
29-4-2014 06:00:00
9-4-2014 12:00:00
9-4-2014 00:00:00
6-5-2014 00:00:00
7-4-2014 00:00:00
29-5-2014 00:00:00
at the horizontal space and get two columns as:
x.date x.time
29-4-2014 06:00:00
9-4-2014 12:00:00
9-4-2014 00:00:00
6-5-2014 00:00:00
7-4-2014 00:00:00
29-5-2014 00:00:00
Any suggestion is appreciated!
strsplit is typically used here, but you can also use read.table:
read.table(text = as.character(dd$x))
# V1 V2
# 1 29-4-2014 06:00:00
# 2 9-4-2014 12:00:00
# 3 9-4-2014 00:00:00
# 4 6-5-2014 00:00:00
# 5 7-4-2014 00:00:00
# 6 29-5-2014 00:00:00
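For completeness, the strsplit route mentioned above might look like this. It is a minimal base R sketch that assumes every value contains exactly one space; the column names x.date and x.time are taken from the question, and the names `parts` and `split_dd` are illustrative.

```r
dd <- data.frame(x = c("29-4-2014 06:00:00", "9-4-2014 12:00:00", "9-4-2014 00:00:00",
                       "6-5-2014 00:00:00", "7-4-2014 00:00:00", "29-5-2014 00:00:00"),
                 stringsAsFactors = TRUE)

# Convert the factor to character, split at the space, and bind the pieces row-wise
parts <- do.call(rbind, strsplit(as.character(dd$x), " ", fixed = TRUE))

split_dd <- data.frame(x.date = parts[, 1], x.time = parts[, 2],
                       stringsAsFactors = FALSE)
print(split_dd)
```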
Other option (better)
# Convert to POSIXct objects
times <- as.POSIXct(dd$x, format="%d-%m-%Y %T")
# You may also want to specify the time zone
times <- as.POSIXct(dd$x, format="%d-%m-%Y %T", tz="GMT")
Then, to extract times
strftime(times, "%T")
[1] "06:00:00" "12:00:00" "00:00:00" "00:00:00" "00:00:00" "00:00:00"
or dates
strftime(times, "%D")
[1] "04/29/14" "04/09/14" "04/09/14" "05/06/14" "04/07/14" "05/29/14"
or, any format you want, really
strftime(times, "%d %b %Y at %T")
[1] "29 Apr 2014 at 06:00:00" "09 Apr 2014 at 12:00:00"
[3] "09 Apr 2014 at 00:00:00" "06 May 2014 at 00:00:00"
[5] "07 Apr 2014 at 00:00:00" "29 May 2014 at 00:00:00"
See, for more info: ?as.POSIXct and ?strftime
Here is another approach using lubridate:
dd = data.frame(x = c("29-4-2014 06:00:00", "9-4-2014 12:00:00", "9-4-2014 00:00:00", "6-5-2014 00:00:00" ,"7-4-2014 00:00:00" , "29-5-2014 00:00:00"),
stringsAsFactors = FALSE)
Note the use of stringsAsFactors = FALSE, which prevents your dates from being read as factors.
library(lubridate)
dd2 <- transform(dd,x2 = dmy_hms(x))
transform(dd2, the_year = year(x2))
x x2 the_year
1 29-4-2014 06:00:00 2014-04-29 06:00:00 2014
2 9-4-2014 12:00:00 2014-04-09 12:00:00 2014
3 9-4-2014 00:00:00 2014-04-09 00:00:00 2014
4 6-5-2014 00:00:00 2014-05-06 00:00:00 2014
5 7-4-2014 00:00:00 2014-04-07 00:00:00 2014
6 29-5-2014 00:00:00 2014-05-29 00:00:00 2014