CSV date format changes after import into R - r

I tried to import a CSV file with dates in this format:
3/1/2017 0:00
3/1/2017 1:00
3/1/2017 2:00
3/1/2017 3:00
3/1/2017 4:00
3/1/2017 5:00
into R, but after the import the dates appear as:
2017-03-01 00:00:00 2017-03-01 01:00:00 2017-03-01 02:00:00 2017-03-01 03:00:00 2017-03-01 04:00:00 2017-03-01 05:00:00
How can I read the CSV into R in its original format, without changing anything?

It is in the "original" format, in the sense that you're probably looking at a POSIXct or POSIXlt object, which stores a point in time rather than a particular text layout. You can reformat dates and datetimes using format() or strftime(), but the result is a character vector, not a datetime.
So as long as you're working with the datetime objects, leave them as they are. If you need a specific layout for reporting, use either of those functions to build the string:
x <- "3/1/2017 3:00"
x1 <- as.POSIXct(x, format = "%d/%m/%Y %H:%M")
x1
# [1] "2017-01-03 03:00:00 CET"
strftime(x1, format = "%d/%m/%Y %H:%M")
# [1] "03/01/2017 03:00"
format(x1, format = "%d/%m/%Y %H:%M")
# [1] "03/01/2017 03:00"
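If the exact text from the file matters, one option is to keep the column as character on import and parse a separate datetime column from it. The following is a minimal sketch, not the asker's actual code: the column names timestamp and value and the inline csv_text are made up for illustration, and the month-first format "%m/%d/%Y %H:%M" is an assumption based on the 2017-03-01 output shown in the question.

```r
# Hypothetical two-row CSV standing in for the real file
csv_text <- "timestamp,value
3/1/2017 0:00,10
3/1/2017 1:00,11"

# Force the date column to stay character so the original text is untouched
raw <- read.csv(text = csv_text, colClasses = c(timestamp = "character"))

# Parse a separate datetime column for calculations (assumes month/day/year)
raw$parsed <- as.POSIXct(raw$timestamp, format = "%m/%d/%Y %H:%M", tz = "UTC")

raw$timestamp                          # original strings, unchanged
format(raw$parsed, "%m/%d/%Y %H:%M")   # zero-padded, so not identical to the input
```

Note that format() pads with zeros (03/01/2017 00:00), which is why the untouched character column is kept alongside the parsed one.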

Related

Why can I not put POSIXct objects into a data frame in R?

We have the code:
times <- c("2:30 PM", "10:00 AM", "10:00 AM")
mydat <- data.frame(times=times)
which results in
> mydat
times
1 2:30 PM
2 10:00 AM
3 10:00 AM
I want to convert these times, which are characters, into POSIX format. So I do
mydat$ntimes <- as.POSIXct(NA,"")
mydat$ntimes <- sapply(mydat$times, function(x) parse_date_time(x, '%I:%M %p'))
Then we get
> mydat
times ntimes
1 2:30 PM -62167167000
2 10:00 AM -62167183200
3 10:00 AM -62167183200
I have no idea why these are negative. Furthermore, if instead of sapply we did a loop:
for (i in 1:length(mydat$times)){
mydat$ntimes[i] <- parse_date_time(mydat$times[i], '%I:%M %p')
}
we get the format right, but everything is off by 7 minutes and 2 seconds. Why is that?
> mydat
times ntimes
1 2:30 PM 0000-01-01 06:37:02
2 10:00 AM 0000-01-01 02:07:02
3 10:00 AM 0000-01-01 02:07:02
You don't need a loop for this:
as.POSIXct(mydat$times, format = '%I:%M %p', tz = 'UTC')
#[1] "2021-03-14 14:30:00 UTC" "2021-03-14 10:00:00 UTC" "2021-03-14 10:00:00 UTC"
Or
lubridate::parse_date_time(mydat$times, '%I:%M %p')
#[1] "0000-01-01 14:30:00 UTC" "0000-01-01 10:00:00 UTC" "0000-01-01 10:00:00 UTC"
The difference between the two options is that when the date is absent, as.POSIXct fills in today's date, whereas parse_date_time fills in 0000-01-01.
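As for the negative numbers in the question: sapply simplifies its result to a plain vector and drops the POSIXct class, so what is left is the underlying count of seconds since 1970-01-01 UTC, which is negative for any instant in year 0000. The sketch below shows the class being lost and restored, using made-up 2021 timestamps rather than the asker's data. The 7 minute 2 second shift in the loop version is consistent with the column being printed in a local time zone whose historical (local mean time) offset is not a whole number of minutes; that is an inference, not something the thread confirms.

```r
# Two example datetimes (illustrative values, not from the question)
x <- as.POSIXct(c("2021-03-14 14:30:00", "2021-03-14 10:00:00"), tz = "UTC")

# sapply simplifies to a bare numeric vector: the POSIXct class is gone
stripped <- sapply(x, function(t) t)
class(stripped)

# The numbers are seconds since 1970-01-01 UTC, so the class can be put back
restored <- as.POSIXct(stripped, origin = "1970-01-01", tz = "UTC")
all(restored == x)
```

Calling the vectorised parser directly on the whole column, as above, avoids the problem altogether.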
Base R Solution
You can use the strptime function to convert the character variable times to POSIXlt. When no date is provided, this function also fills in today's date.
times <- c("2:30 PM", "10:00 AM", "10:00 AM")
mydat <- data.frame(times=times)
# FORMAT SPECIFICATIONS:
# %I = Hours as decimal number (01–12).
# %M = Minute as decimal number (00–59).
# %p = AM/PM indicator in the locale.
strptime(mydat$times, format='%I:%M %p', tz = 'UTC')
#> [1] "2021-03-13 14:30:00 UTC" "2021-03-13 10:00:00 UTC"
#> [3] "2021-03-13 10:00:00 UTC"
Created on 2021-03-13 by the reprex package (v0.3.0)
Add it to the data frame as a new variable
times <- c("2:30 PM", "10:00 AM", "10:00 AM")
mydat <- data.frame(times=times)
mydat$new_times <- strptime(mydat$times, format='%I:%M %p')
mydat
#> times new_times
#> 1 2:30 PM 2021-03-13 14:30:00
#> 2 10:00 AM 2021-03-13 10:00:00
#> 3 10:00 AM 2021-03-13 10:00:00
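Because the filled-in date changes from day to day, it can help to attach an explicit date before parsing. This is a minimal sketch under two assumptions: the date "2021-03-13" is an arbitrary placeholder, and the session locale recognises "AM"/"PM" for %p (true for C and English locales).

```r
times <- c("2:30 PM", "10:00 AM", "10:00 AM")

# Prepend a fixed, arbitrary date so the result does not depend on today's date
fixed <- as.POSIXct(
  paste("2021-03-13", times),
  format = "%Y-%m-%d %I:%M %p",
  tz = "UTC"
)

# Only the time of day, in 24-hour form
format(fixed, "%H:%M")
```

as.POSIXct is used here rather than strptime because POSIXct is the more usual class for a data frame column.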

How can I add a midnight time stamp to dates in a column in R?

I have a column of dates in an R data frame, that look like this,
Date
2020-08-05
2020-08-05
2020-08-05
2020-08-07
2020-08-08
2020-08-08
So the dates are formatted as 'yyyy-mm-dd'.
I am writing this data frame to a CSV that needs to be formatted in a very specific manner. I need to convert these dates to the format 'mm/dd/yyyy hh:mm:ss', so this is what I want the columns to look like:
Date
8/5/2020 12:00:00 AM
8/5/2020 12:00:00 AM
8/5/2020 12:00:00 AM
8/7/2020 12:00:00 AM
8/8/2020 12:00:00 AM
8/8/2020 12:00:00 AM
The dates do not have a timestamp attached to begin with, so all dates will need a midnight timestamp in the format shown above.
I spent quite some time trying to coerce this format yesterday and was unable to. I can easily change 2020-08-05 to 8/5/2020 using as.Date(), but the issue arises when I attempt to add the midnight time stamp.
How can I add a midnight timestamp to these reformatted dates?
Thanks so much for any help!
You can use format:
df <- data.frame(Date = as.Date(c("2020-08-05", "2020-08-07")))
format(df$Date, "%m/%d/%Y 12:00:00 AM")
[1] "08/05/2020 12:00:00 AM" "08/07/2020 12:00:00 AM"
dat <- data.frame(
Date = as.Date("2020-08-05") + c(0, 0, 0, 2, 3, 3)
)
dat[["Date"]] <- format(dat[["Date"]], "%m/%d/%Y %I:%M:%S %p")
dat[["Date"]] <- sub("([ap]m)$", "\\U\\1", dat[["Date"]], perl = TRUE)
dat
## Date
## 1 08/05/2020 12:00:00 AM
## 2 08/05/2020 12:00:00 AM
## 3 08/05/2020 12:00:00 AM
## 4 08/07/2020 12:00:00 AM
## 5 08/08/2020 12:00:00 AM
## 6 08/08/2020 12:00:00 AM
Try this:
format(as.POSIXct("2022-11-08", tz = "Australia/Sydney"), "%Y-%m-%d %H:%M:%S")
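The format() answers above produce zero-padded months and days (08/05/2020), while the question asks for 8/5/2020. One way to get the unpadded form that does not depend on platform-specific format codes is to build the string from the date's numeric fields. A minimal sketch, using two of the dates from the question:

```r
dates <- as.Date(c("2020-08-05", "2020-08-07"))

# POSIXlt exposes the fields: mon is 0-based, year counts from 1900
parts <- as.POSIXlt(dates)

# %d prints integers without padding; the midnight stamp is a fixed suffix
stamped <- sprintf(
  "%d/%d/%d 12:00:00 AM",
  parts$mon + 1L, parts$mday, parts$year + 1900L
)
stamped
```

The result is a character vector, so it should be created as the last step before writing the CSV.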

How to know if a POSIXct date time is AM or PM in R?

I have a column with date and time in POSIXct format, e.g. "2019-02-23 12:45". I want to identify whether the time is AM or PM and append AM or PM to the date and time.
The following code creates an example dataset:
ID <- data.frame(c(1,2,3,4))
DATE <- data.frame(as.POSIXct(c("2019-02-25 07:30", "2019-03-25 14:30", "2019-03-25 12:00", "2019-03-25 00:00"),format="%Y-%m-%d %H:%M"))
DATEAMPM <- data.frame(c("2019-02-25 07:30 AM", "2019-03-25 14:30 PM", "2019-03-25 12:00 PM", "2019-03-25 00:00 AM"))
AMPMFLAG <- data.frame(c(0,1,1,0))
test <- cbind(ID,DATE,DATEAMPM,AMPMFLAG)
names(test) <- c("PID","DATE","DATEAMPM","AMPMFLAG")
Would like to create the DATEAMPM and AMPMFLAG columns as represented in the code above.
I have seen character strings of the form "2019-09-23 08:45 PM" converted to "2019-09-23 20:45" by specifying the format argument as below, but I do not know how to go the other way around and add AM/PM to the date time
as.POSIXct(strptime(,format="%Y-%m-%d %I:%M %p"))
Appreciate your help
We can use format to get the data with AM/PM
test$DATEAMPM <- format(test$DATE, "%Y-%m-%d %I:%M %p")
test$AMPMFLAG <- +(grepl("PM", test$DATEAMPM))
test
# PID DATE DATEAMPM AMPMFLAG
#1 1 2019-02-25 07:30:00 2019-02-25 07:30 AM 0
#2 2 2019-03-25 14:30:00 2019-03-25 02:30 PM 1
#3 3 2019-03-25 12:00:00 2019-03-25 12:00 PM 1
#4 4 2019-03-25 00:00:00 2019-03-25 12:00 AM 0
Also note that when you convert 14:30:00 in AM/PM it would be 02:30 PM and not 14:30 PM.
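The grepl("PM", ...) test relies on the locale printing "PM" for %p. If that is a concern, the flag can be computed from the hour instead. A minimal sketch using the same four timestamps, with an explicit tz = "UTC" added as an assumption so the result does not depend on the local time zone:

```r
x <- as.POSIXct(
  c("2019-02-25 07:30", "2019-03-25 14:30", "2019-03-25 12:00", "2019-03-25 00:00"),
  format = "%Y-%m-%d %H:%M",
  tz = "UTC"
)

# Hour of day (0-23) in the object's own time zone
hour <- as.POSIXlt(x)$hour

# 1 for noon and later (PM), 0 for before noon (AM)
ampm_flag <- as.integer(hour >= 12)
ampm_flag
```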

Text process using R

I am quite new to programming and to R.
My data-set includes date-time variables as following:
2007/11/0103
2007/11/0104
2007/11/0105
2007/11/0106
I need an operation that takes the first 10 characters from the left, inserts a space, copies the last two characters, and then appends :00, for every value.
Expected results:
2007/11/01 03:00
2007/11/01 04:00
2007/11/01 05:00
2007/11/01 06:00
If you want to actually turn your data into a "POSIXlt" "POSIXt" object in R (so you can add or subtract days, minutes, etc.), you could do
# Your data
temp <- c("2007/11/0103", "2007/11/0104", "2007/11/0105", "2007/11/0106")
temp2 <- strptime(temp, "%Y/%m/%d%H")
temp2
## [1] "2007-11-01 03:00:00 IST" "2007-11-01 04:00:00 IST" "2007-11-01 05:00:00 IST" "2007-11-01 06:00:00 IST"
You could then extract hours for example
temp2$hour
## [1] 3 4 5 6
Add hours
temp2 + 3600
## [1] "2007-11-01 04:00:00 IST" "2007-11-01 05:00:00 IST" "2007-11-01 06:00:00 IST" "2007-11-01 07:00:00 IST"
And so on. If you just want the format you mentioned in your question (which is just a character string), you can also do
format(strptime(temp, "%Y/%m/%d%H"), format = "%Y/%m/%d %H:%M")
#[1] "2007/11/01 03:00" "2007/11/01 04:00" "2007/11/01 05:00" "2007/11/01 06:00"
Try
library(lubridate)
dat <- read.table(text="2007/11/0103
2007/11/0104
2007/11/0105
2007/11/0106", header = FALSE, stringsAsFactors = FALSE)
dat$V1 <- format(ymd_h(dat$V1),"%Y/%m/%d %H:%M")
dat
# V1
# 1 2007/11/01 03:00
# 2 2007/11/01 04:00
# 3 2007/11/01 05:00
# 4 2007/11/01 06:00
Suppose your dates are a vector named dates
library(stringr)
paste0(paste(str_sub(dates, end=10), str_sub(dates, 11)), ":00")
paste and substr are your friends here. Type ? before either name to see its documentation.
my.parser <- function(a){
paste0(substr(a, 1, 10), ' ', substr(a, 11, 12), ':00') # paste0 is like paste but does not insert a separator
}
a<- '2007/11/0103'
my.parser(a) # = "2007/11/01 03:00"
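The same string surgery can also be done with a single regular expression in base R. A minimal sketch, assuming every value is exactly 12 characters long (10 for the date, 2 for the hour), as in the question:

```r
temp <- c("2007/11/0103", "2007/11/0104", "2007/11/0105", "2007/11/0106")

# Capture the first 10 characters and the last 2, then rebuild with a space and ":00"
reshaped <- sub("^(.{10})(.{2})$", "\\1 \\2:00", temp)
reshaped
```

Values that do not match the 12-character pattern are returned unchanged, which makes malformed rows easy to spot.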

dateTime with different format in data frame

I have imported some data into R, which looks like the following:
dateTime temp
1 10/25/2005 12:00:00 15.50
2 10/25/2005 1:00:00 15.49
3 10/25/2005 2:00:00 15.52
4 10/25/2005 3:00:00 15.50
5 10/25/2005 4:00:00 15.50
6 10/25/2005 5:00:00 15.46
where the class of the dateTime column of the data.frame is factor and the second column is numeric.
I try to convert the dateTime into POSIXct format as follows:
dat[,1] <- as.POSIXct(dat[,1])
but receive the error
Error in as.POSIXlt.character(as.character(x), ...) :
character string is not in a standard unambiguous format
which I think is because the hour is not written consistently in the dateTime column, e.g. 12, 1, 2 rather than 12, 01, 02.
How can I change this to POSIXct?
You need to specify the format:
datetime <- factor("10/25/2005 12:00:00")
as.POSIXct(datetime)
#Error in as.POSIXlt.character(as.character(x), ...) :
# character string is not in a standard unambiguous format
as.POSIXct(datetime, format="%m/%d/%Y %H:%M:%S")
#[1] "2005-10-25 12:00:00 CEST"
Note: I advise you to always specify a time zone explicitly when creating datetime variables. Otherwise, you could get into trouble with daylight saving time.
as.POSIXct(datetime, format="%m/%d/%Y %H:%M:%S", tz="GMT")
#[1] "2005-10-25 12:00:00 GMT"
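Applied to the whole column, the conversion might look like the sketch below. The data frame is rebuilt by hand from the first three rows shown in the question; converting the factor with as.character() first is a defensive step, and tz = "GMT" follows the advice above.

```r
# First three rows from the question, with dateTime as a factor
dat <- data.frame(
  dateTime = factor(c("10/25/2005 12:00:00", "10/25/2005 1:00:00", "10/25/2005 2:00:00")),
  temp = c(15.50, 15.49, 15.52)
)

# Convert factor -> character -> POSIXct with an explicit format and time zone
dat$dateTime <- as.POSIXct(
  as.character(dat$dateTime),
  format = "%m/%d/%Y %H:%M:%S",
  tz = "GMT"
)

str(dat)
```

Single-digit hours such as 1:00:00 are accepted by %H, so the missing leading zero is not the cause of the error. Whether the 12:00:00 followed by 1:00:00 in the data is a 12-hour clock without an AM/PM marker cannot be told from the excerpt.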
