The dates in my Submitted.on column come in different formats, so I converted each format separately and combined the results into a new column using ifelse and as.Date. However, when I use strptime so that the new column also keeps the timestamp, I get a warning message and wrong values.
data$Submitted.on[0:5]
#[1] 02-06-02 0:00 03/30/2010 23:15:12
#[3] 11-05-09 6:28 07/29/2009 23:07:38
#[5] 07-10-05 0:00
vec1 = as.character(strptime(data$Submitted.on, "%m/%d/%Y %H:%M:%S"))
vec1[0:5]
#[1] NA "2010-03-30 23:15:12"
#[3] NA "2009-07-29 23:07:38"
#[5] NA
vec2 = as.character(strptime(data$Submitted.on, "%m-%d-%y %H:%M"))
vec2[0:5]
#[1] "2002-02-06 00:00:00" NA
#[3] "2009-11-05 06:28:00" NA
#[5] "2005-07-10 00:00:00"
data['new_format']=as.Date(ifelse(is.na(vec1),vec2,vec1))
data[0:5,'new_format']
#[1] "2002-02-06" "2010-03-30" "2009-11-05" "2009-07-29"
#[5] "2005-07-10"
Using as.Date works fine, but when I use strptime to keep the timestamp as well, it gives a warning message.
data['new_format']=strptime(ifelse(is.na(vec1),vec2,vec1),"%Y-%m-%d %H:%M:%S")
#Warning message:
#In `[<-.data.frame`(`*tmp*`, "new_format", value = list(sec = c(0, :
#provided 11 variables to replace 1 variables
data[0:5,'new_format']
[1] 0 12 0 38 0
Any help on how to keep the timestamp as well would be much appreciated.
We can use parse_date_time from lubridate
library(lubridate)
parse_date_time(data$Submitted.on, guess_formats(data$Submitted.on,
c("mdy HMS", "mdy MS")))
#[1] "2002-02-06 00:00:00 UTC" "2010-03-30 23:15:12 UTC" "2009-11-05 00:06:28 UTC"
#[4] "2009-07-29 23:07:38 UTC" "2005-07-10 00:00:00 UTC"
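Regarding the use of ifelse, I would advise against it here: strptime returns a POSIXlt object, which is internally a list of components (sec, min, hour, ...), and ifelse drops class attributes. Assigning such a list to a single data frame column is also what produces the "provided 11 variables to replace 1 variables" warning. So, instead of ifelse, one can use the indexing method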
v1 <- strptime(data$Submitted.on, "%m/%d/%Y %H:%M:%S", tz = "UTC")
v1[is.na(v1)] <- strptime(data$Submitted.on[is.na(v1)], "%m-%d-%y %H:%M", tz = "UTC")
v1
#[1] "2002-02-06 00:00:00 UTC" "2010-03-30 23:15:12 UTC" "2009-11-05 06:28:00 UTC"
#[4] "2009-07-29 23:07:38 UTC" "2005-07-10 00:00:00 UTC"
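To store the result back in the data frame, it may help to work with POSIXct, which is a single numeric vector with a class attribute and therefore fits in one column. A minimal sketch, assuming the same two formats as above (the names p1, p2 and miss are only for illustration):

```r
# Sample values from the question
data <- data.frame(Submitted.on = c("02-06-02 0:00", "03/30/2010 23:15:12",
                                    "11-05-09 6:28", "07/29/2009 23:07:38",
                                    "07-10-05 0:00"),
                   stringsAsFactors = FALSE)

# Parse each format separately; strings that do not match become NA
p1 <- as.POSIXct(data$Submitted.on, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")
p2 <- as.POSIXct(data$Submitted.on, format = "%m-%d-%y %H:%M", tz = "UTC")

# Fill the gaps left by the first format with values from the second
miss <- is.na(p1)
p1[miss] <- p2[miss]

# POSIXct is one atomic vector, so it can be assigned as a single column
data$new_format <- p1
format(data$new_format, "%Y-%m-%d %H:%M:%S")
```

The time zone "UTC" is carried over from the answer above; replace it with the zone the timestamps were recorded in if that is known.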
data
data <- structure(list(Submitted.on = c("02-06-02 0:00", "03/30/2010 23:15:12",
"11-05-09 6:28", "07/29/2009 23:07:38", "07-10-05 0:00")),
.Names = "Submitted.on", row.names = c(NA, -5L), class = "data.frame")
I have a date-time stamp stored as a character variable:
"2018-12-13 11:00:01 EST" "2018-10-23 22:00:01 EDT" "2018-11-03 14:15:00 EDT" "2018-10-04 19:30:00 EDT" "2018-11-10 17:15:31 EST" "2018-10-05 13:30:00 EDT"
How can I strip the time from this character vector?
PS: Can someone please help. I have tried using strptime but I am getting NA values as a result
It's a bit unclear whether you want the date or time but if you want the date then as.Date ignores any junk after the date so:
x <- c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT")
as.Date(x)
## [1] "2018-12-13" "2018-10-23"
would be sufficient to get a Date vector from the input vector x. No packages are used.
If you want the time then:
read.table(text = x, as.is = TRUE)[[2]]
## [1] "11:00:01" "22:00:01"
If you want a data frame with each part in a separate column then:
read.table(text = x, as.is = TRUE, col.names = c("date", "time", "tz"))
## date time tz
## 1 2018-12-13 11:00:01 EST
## 2 2018-10-23 22:00:01 EDT
I think the OP wants to extract the time from date-time variable (going by the title of the question).
x <- "2018-12-13 11:00:01 EST"
format(strptime(x, "%Y-%m-%d %H:%M:%S"), "%H:%M:%S")
[1] "11:00:01"
Another option:
library(lubridate)
format(ymd_hms(x, tz = "EST"), "%H:%M:%S")
[1] "11:00:01"
The package lubridate makes everything like this easy:
library(lubridate)
x <- "2018-12-13 11:00:01 EST"
as_date(ymd_hms(x))
You can use the as.Date function and specify the format
> as.Date("2018-12-13 11:00:01 EST", format="%Y-%m-%d")
[1] "2018-12-13"
If all values are in a vector:
x = c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT",
"2018-11-03 14:15:00 EDT", "2018-10-04 19:30:00 EDT",
"2018-11-10 17:15:31 EST", "2018-10-05 13:30:00 EDT")
> as.Date(x, format="%Y-%m-%d")
[1] "2018-12-13" "2018-10-23" "2018-11-03" "2018-10-04" "2018-11-10"
[6] "2018-10-05"
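The answers above can be combined into one pass over the whole vector. A short sketch in base R, assuming the strings always start with a "YYYY-MM-DD HH:MM:SS" part and that the trailing zone abbreviation can be ignored (tz = "UTC" is used here only so that the clock time is kept exactly as written):

```r
x <- c("2018-12-13 11:00:01 EST", "2018-10-23 22:00:01 EDT")

# Parse the leading date-time part; the trailing abbreviation is not consumed
lt <- strptime(x, "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Pull out whichever part is needed, as character vectors
time_part <- format(lt, "%H:%M:%S")
date_part <- format(lt, "%Y-%m-%d")
time_part
date_part
```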
I have downloaded the 2015-2017 sunset/sunrise data from the Navy and I am trying to format it into dates and time to further use with other data I have. This is how my data set looks in R.
I have managed to convert the date into the right R date format. Yet, I still can't convert my rise/set column data from a number/integer format to a time format (hh:mm).
Based on one internet source, I wrote the following code:
Sun2015$Srise<- format(strptime(Sun2015$Rise, format="%H:%M"))
However this gives NA in my data
OR
Sun2015$Srise<-str_pad(Sun2015$Rise, 4, pad="0")
Sun2015$Srise<-hour(hm(Sun2015$Srise))
Yet, I received the following warning:
Warning message: In .parse_hms(..., order = "HM", quiet = quiet) :
Some strings failed to parse.
Is there a better way to convert the columns into the right time format so that I can merge the date and time columns into date-time columns for sunset and sunrise?
Thank you in advance for your help.
You can convert your military time to 2400 time strings using sprintf("%04d", data) and go from there. For example, with the first 5 lines of your data:
# Sample of your data
Day <- c("1/1/2015", "1/2/2015", "1/3/2015", "1/4/2015", "1/5/2015")
Rise <- c(652,652,652,653,653)
Set <- c(1755,1756,1756,1757,1757)
sun2015 <- data.frame(Day, Rise, Set)
# Convert to 2400 style strings with leading zeroes where necessary
sun2015$Rise <- sprintf("%04d", sun2015$Rise)
sun2015$Set <- sprintf("%04d", sun2015$Set)
# Merge with your date
sun2015$day_rise <- as.POSIXct(paste0(sun2015$Day, " ",sun2015$Rise), format = "%m/%d/%Y %H%M", origin = "1970-01-01", tz = "UTC")
sun2015$day_set <- as.POSIXct(paste0(sun2015$Day, " ",sun2015$Set), format = "%m/%d/%Y %H%M", origin = "1970-01-01", tz = "UTC")
> sun2015$day_rise
[1] "2015-01-01 06:52:00 UTC" "2015-01-02 06:52:00 UTC" "2015-01-03 06:52:00 UTC" "2015-01-04 06:53:00 UTC"
[5] "2015-01-05 06:53:00 UTC"
> sun2015$day_set
[1] "2015-01-01 17:55:00 UTC" "2015-01-02 17:56:00 UTC" "2015-01-03 17:56:00 UTC" "2015-01-04 17:57:00 UTC"
[5] "2015-01-05 17:57:00 UTC"
You can adjust to the appropriate time zone if necessary.
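The first attempt in the question returns NA because a value such as 652 has no colon and so does not match "%H:%M". As an alternative to zero-padding, the hours and minutes can be split arithmetically and added to the date as seconds. A sketch, assuming the Rise and Set columns are always whole numbers in HHMM form (the helper name hhmm_to_seconds is hypothetical):

```r
Day  <- c("1/1/2015", "1/2/2015", "1/3/2015")
Rise <- c(652, 652, 652)
Set  <- c(1755, 1756, 1756)
sun2015 <- data.frame(Day, Rise, Set, stringsAsFactors = FALSE)

# 652 -> 6 hours and 52 minutes, expressed in seconds
hhmm_to_seconds <- function(hhmm) (hhmm %/% 100) * 3600 + (hhmm %% 100) * 60

# Midnight of each day, then add the offset for sunrise and sunset
midnight <- as.POSIXct(sun2015$Day, format = "%m/%d/%Y", tz = "UTC")
sun2015$day_rise <- midnight + hhmm_to_seconds(sun2015$Rise)
sun2015$day_set  <- midnight + hhmm_to_seconds(sun2015$Set)

format(sun2015$day_rise, "%Y-%m-%d %H:%M")
format(sun2015$day_set, "%Y-%m-%d %H:%M")
```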
I have the following data.frame called "data" (it is much larger but i just give the first lines as an example):
Timestamp Weight Degrees
1 30-09-2016 11:45:00,000 38.19 40.00
2 01-10-2016 06:19:57,860 39.12 40.00
3 01-10-2016 06:20:46,393 42.11 41.00
I would like to convert the "Timestamp" to a date/time vector including milliseconds. This seems to be a problem because the milliseconds are separated by a comma.
Also, data.frame has mode "list" and Timestamp has mode "character" which clearly aren't right...
I have tried data$Timestamp <- as.POSIXct(data$Timestamp,format='%d-%m-%Y %H:%M:%OS') but I only get "2016-09-30 11:42:00 UTC", without the milliseconds. The mode however becomes "numeric", which should be a step in the right direction. The only option I have set is options(digits.secs=3).
I'd really appreciate your help. Thank you in advance.
x = c("30-09-2016 11:45:00,000", "01-10-2016 06:19:57,860", "01-10-2016 06:20:46,393")
format(as.POSIXct(gsub(",", ".", x), format='%d-%m-%Y %H:%M:%OS'), '%d-%m-%Y %H:%M:%OS3')
#[1] "30-09-2016 11:45:00.000" "01-10-2016 06:19:57.859" "01-10-2016 06:20:46.392"
OR
x = c("30-09-2016 11:45:00,000", "01-10-2016 06:19:57,860", "01-10-2016 06:20:46,393")
#Converting to POSIXct
options(digits.secs=3)
y = as.POSIXct(gsub(",", ".", x), format='%d-%m-%Y %H:%M:%OS', tz = "UTC")
y
#[1] "2016-09-30 11:45:00.000 UTC" "2016-10-01 06:19:57.859 UTC" "2016-10-01 06:20:46.392 UTC"
#Converting to numeric
as.numeric(y)
#[1] 1475253900 1475320798 1475320846
#Converting numeric back to POSIXct
as.POSIXct(as.numeric(y), origin = "1970-01-01", tz = "UTC")
#[1] "2016-09-30 11:45:00.000 UTC" "2016-10-01 06:19:57.859 UTC" "2016-10-01 06:20:46.392 UTC"
OR
x = c("30-09-2016 11:45:00,000", "01-10-2016 06:19:57,860", "01-10-2016 06:20:46,393")
library(lubridate)
options(digits.secs=3)
dmy_hms(gsub(",", ".", x))
#[1] "2016-09-30 11:45:00.000 UTC" "2016-10-01 06:19:57.860 UTC" "2016-10-01 06:20:46.393 UTC"
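The .859 and .392 in the base R output above appear to be a display effect: the parsed value is stored as a binary floating-point number that can sit slightly below the intended one, and the fractional seconds are truncated when printed. A small sketch, under that assumption, that recovers the milliseconds by rounding the numeric value instead (the name ms is illustrative only):

```r
x <- c("30-09-2016 11:45:00,000", "01-10-2016 06:19:57,860", "01-10-2016 06:20:46,393")

# Replace the decimal comma and parse with fractional seconds
y <- as.POSIXct(gsub(",", ".", x, fixed = TRUE),
                format = "%d-%m-%Y %H:%M:%OS", tz = "UTC")

# Fractional part of the epoch seconds, rounded to whole milliseconds
ms <- round((as.numeric(y) %% 1) * 1000)
ms

# Rebuild a display string from whole seconds plus the rounded milliseconds
shown <- sprintf("%s.%03d", format(y, "%d-%m-%Y %H:%M:%S"), as.integer(ms))
shown
```

This sketch does not handle a fraction that rounds up to 1000 ms (for example .9996); such values would need the carry added to the seconds first.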
I try to turn a string with the format WeekYear into a date using parse_date_time from lubridate. But I always get the same date in return.
library(lubridate)
> isoweek(parse_date_time("092015", orders = "WY"))
[1] 53
Shouldn't it work like this?
> month(parse_date_time("092015", orders = "mY"))
[1] 9
It doesn't get better when I add the first day of the week to the calendar weeks:
> test <- c("2013-01-01","2013-02-02","2014-12-12")
> parse_date_time(paste(isoyear(test),isoweek(test),"1", sep = "-"), "YWw")
[1] "2013-01-07 UTC" "2013-02-04 UTC" "2014-12-15 UTC"
The first date should be "2012-12-31" (the Monday of the first week of 2013).
The second date should be "2013-01-28" and
the third should be "2014-12-08".
This gave me the idea for the following:
> parse_date_time(paste(isoyear(test),isoweek(test),"1", sep = "-"), "YWw") - weeks(1)
[1] "2012-12-31 UTC" "2013-01-28 UTC" "2014-12-08 UTC"
which works but produces errors (NA) for dates at the end of 2015 (week 53).
More precisely, the errors emerge from 2015-12-28 to 2016-01-03:
> test <- c("2015-12-27","2015-12-28","2015-12-31","2016-01-01","2016-01-03","2016-01-04")
> parse_date_time(paste(isoyear(test),isoweek(test),"1", sep = "-"), "YWw") - weeks(1)
[1] "2015-12-21 UTC" NA NA NA NA "2015-12-28 UTC"
I found a solution with the package "ISOweek":
> library(ISOweek)
> test <- c("2015-12-27","2015-12-28","2015-12-31","2016-01-01","2016-01-03","2016-01-04")
> ISOweek2date(paste(ISOweek(test),"1", sep = "-"))
[1] "2015-12-21" "2015-12-28" "2015-12-28" "2015-12-28" "2015-12-28" "2016-01-04"
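If an extra package is not wanted, the Monday of the ISO week can also be computed directly from the date, since %u gives the ISO weekday number (1 for Monday through 7 for Sunday). A minimal sketch in base R, assuming the inputs are valid dates in "YYYY-MM-DD" form (the name week_start is illustrative only):

```r
test <- c("2015-12-27", "2015-12-28", "2015-12-31",
          "2016-01-01", "2016-01-03", "2016-01-04")
d <- as.Date(test)

# Step back (weekday number - 1) days to reach the Monday of the same ISO week
week_start <- d - (as.integer(format(d, "%u")) - 1L)
format(week_start)
```

Because this never goes through a week number, the week 53 boundary that breaks the parse_date_time approach does not arise.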
I cannot get R to format POSIXlt objects in the desired timezone. POSIXct works as expected. Is this a bug or am I missing something?
date.str = "2015-12-09 13:30"
from = "Europe/London"
to = "America/Los_Angeles"
lt = as.POSIXlt(date.str, tz=from)
format(lt, tz=to, usetz=TRUE)
#[1] "2015-12-09 13:30:00 GMT"
ct = as.POSIXct(date.str, tz=from)
format(ct, tz=to, usetz=TRUE)
#[1] "2015-12-09 05:30:00 PST"
The tzone attributes are the same:
attributes(ct)$tzone
#[1] "Europe/London"
attributes(lt)$tzone
#[1] "Europe/London"
Solution
As pointed out by @nicola, format.POSIXlt has no tz parameter. To print a POSIXlt date in another timezone, one can use the lubridate package to convert the POSIXlt object to the desired timezone first:
require(lubridate)
lt.changed = with_tz(lt, tz=to)
format(lt.changed, usetz=TRUE)
#[1] "2015-12-09 05:30:00 PST"
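A base R alternative, sketched under the assumption that converting to POSIXct is acceptable: as the question itself shows, formatting a POSIXct value does honour tz, so the POSIXlt value can be converted first and then formatted. The format string is given explicitly here so the output does not depend on the default.

```r
from <- "Europe/London"
to   <- "America/Los_Angeles"
lt   <- as.POSIXlt("2015-12-09 13:30", tz = from)

# Convert to POSIXct (an absolute instant), then format in the target zone
out <- format(as.POSIXct(lt), format = "%Y-%m-%d %H:%M:%S", tz = to, usetz = TRUE)
out

# If a POSIXlt object in the new zone is needed, convert back explicitly
lt.changed <- as.POSIXlt(as.POSIXct(lt), tz = to)
```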