Reformatting date and timestamp with R [duplicate]

This question already has answers here:
Changing date format in R
(7 answers)
Closed 3 years ago.
To prevent an error when uploading xls data into a SQL database, I am trying to reformat dates such as "08/22/2019 02:05 PM CDT" so that only the date remains, without the time or the timezone. My attempts with base R, POSIX and lubridate functions have all failed. The xls file formats the date column as General.
I have a whole column of data to convert, not a single cell. This is part of a loop over multiple files in a folder.
Failures:
#mydata_r11_Date2 <- strptime(as.character(mydata_r11_Date$Date), "%d/%m/%Y")
# parse_date_time(x = mydata_r11_Date$Date,
# orders = c("d m y", "d B Y", "m/d/y"),
# locale = "eng")
#
#
# mydata_r11_Date <- as.character(mydata_r11_Date)
mydata_r11_Date <- gsub('^([0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}\\.[0-9]+[+-][0-9]{2}):([0-9]{2})$',
'\\1\\2',
mydata_r11_Date$Date)
ymd_hms(mydata_r11_Date$Date)
mydata_r11_Date <- as_date (mydata_r11_Date$Date,format = "%Y-%m-%d")
mydata_r11_Date2 <- format(as.Date(mydata_r11_Date,"%Y-%m-%d"),"%Y-%m-%d")
Errors include:
Warning message:
All formats failed to parse. No formats found.
Error in as.Date.default(x, ...) :
do not know how to convert 'x' to class “Date”
Error in as.Date.default(mydata_r11_Date$Date, format = "%Y-%m-%d") :
do not know how to convert 'mydata_r11_Date$Date' to class “Date”
Error: unexpected ',' in " mydata_r11_Date <- as.Date(mydata_r11_Date$Date),"
Error in as_date(x) : object 'x' not found
library(readxl)
library(reshape2)
library(lubridate)
Import xls
mydata_r11 <- read_excel("C:/FOLDER/FOLDER/FOLDER/OUTPUT/WADUJONOKO_student_assessment_results.xls",1,skip = 1, col_types = "list")
Isolate date column
mydata_r11_Date <- mydata_r11[,c(8)]
Convert date
mydata_r11_Date2 <-
Have "08/22/2019 02:05 PM CDT"
Want "08/22/2019"

I don't understand why you are resorting to complex regex here when you seem to only want the date component, which is the first 10 characters of the timestamps. Just take the substring and then call as.Date with an appropriate format mask:
x <- "08/22/2019 02:05 PM CDT"
y <- substr(x, 1, 10)
as.Date(y, format = "%m/%d/%Y")
[1] "2019-08-22"
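The same idea extends to a whole column, since substr and as.Date are both vectorised. The sketch below assumes the column has already been pulled out as a plain character vector (the sample values are made up for illustration), and formats the result back to the "08/22/2019" layout the question asks for:

```r
# Hypothetical sample values; in the question these would come from the Date column
x <- c("08/22/2019 02:05 PM CDT", "12/01/2019 11:59 AM CST")

# Keep the first 10 characters (the date part) and parse them as month/day/year
date_only <- as.Date(substr(x, 1, 10), format = "%m/%d/%Y")

# Date objects print as YYYY-MM-DD; format() turns them back into text
date_text <- format(date_only, "%m/%d/%Y")
date_text
```

Keeping date_only as a Date is usually the safer choice for a database upload; date_text is only needed if the target really expects the MM/DD/YYYY string.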

Related

Converting character string to local time [duplicate]

This question already has answers here:
How can I fix corrupted dates in R?
(1 answer)
R datetime series missing values
(3 answers)
Closed 7 months ago.
I'm appending two datasets where both have columns for Date (YYYY/MM/DD HH:MM:SS), but one is a character and the other is date. The first df lists Date as "2022-01-01 01:00:00" and the second (crimes) lists Date as 01/01/2022 12:00:00 AM.
For my purposes, I do not need the HHMMSS or timezone characteristics.
I've tried:
crimes$Date <- format(as.POSIXct(crimes$Date,format='%I:%M %p'),format="%H:%M:%S")
crimes$Date <- format(as.POSIXct(crimes$Date,format="%Y-%m-%d %I:%M:%S %p"))
crimes$Date <- as.numeric(as.character(crimes$Date))
crimes$Date <- strptime(crimes$Date, format = "%Y-%m-%d %H:%M:%OS", tz = "EST")
Each of these changes the observations under date as NA.
Other options I've tried (and their error messages):
crimes$Date <- parse_date_time("%Y/%m/%d %I:%M:%S %p")
#Error in .best_formats(train, orders, locale = locale, select_formats, :
#argument "orders" is missing, with no default
crimes$Date <- format(as.POSIXct(crimes$Date), format='%y-%m-%d')
#Error in as.POSIXlt.character(x, tz, ...) :
#character string is not in a standard unambiguous format
What am I missing?
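One likely cause is that none of the format strings above match the layout of the data: "01/01/2022 12:00:00 AM" starts with a slash-separated date and uses a 12-hour clock. A minimal sketch in base R, assuming the values are month/day/year (the sample cannot distinguish that from day/month/year), that the session locale understands AM/PM for %p, and using a made-up two-row crimes data frame:

```r
# Hypothetical stand-in for the crimes data frame from the question
crimes <- data.frame(
  Date = c("01/01/2022 12:00:00 AM", "01/02/2022 03:30:00 PM"),
  stringsAsFactors = FALSE
)

# The format must describe the text exactly: month/day/year, 12-hour clock, AM/PM
parsed <- as.POSIXct(crimes$Date, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

# Drop the time of day; pass the same tz so the calendar date does not shift
crimes$Date <- as.Date(parsed, tz = "UTC")
crimes$Date
```

The parse_date_time error is a separate issue: the format string was passed as the first argument, so the function never received the data or an orders argument.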

How to extract the hour of this kind of timestamp? [duplicate]

This question already has an answer here:
Extracting 2 digit hour from POSIXct in R
(1 answer)
Closed 7 years ago.
This is what my timestamp looks like:
date1 = timestamp[1]
date1
[1] 07.01.2016 11:52:35
3455 Levels: 01.02.2016 00:11:52 01.02.2016 00:23:35 ... 31.01.2016 23:16:18
When I try to extract the hour, I get an invalid 'trim' argument. Thank you !
format(date1, "%I")
Error in format.default(structure(as.character(x), names = names(x), dim = dim(x), :
invalid 'trim' argument
How can I extract single components, like the hour, from this timestamp?
With base R:
format(strptime(x,format = '%d.%m.%Y %H:%M:%S'), "%H")
[1] "11"
data
x <- as.factor("07.01.2016 11:52:35")
You first need to parse the time
d = strptime(date1,format = '%d.%m.%Y %H:%M:%S')
Then use lubridate to extract parts like hour etc.
library(lubridate)
hour(d)
minute(d)
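If lubridate is not available, the object returned by strptime is a POSIXlt, whose fields can be read directly. A small sketch using the timestamp from the question; the explicit as.character() and tz = "UTC" are additions made here for clarity, not part of the answers above:

```r
# The timestamp from the question, stored as a factor
date1 <- factor("07.01.2016 11:52:35")

# Parse the text; strptime returns a POSIXlt object
d <- strptime(as.character(date1), format = "%d.%m.%Y %H:%M:%S", tz = "UTC")

d$hour           # hour as a number
d$min            # minute as a number
format(d, "%H")  # hour as a two-digit string
```

format(date1, "%I") fails in the question because date1 is still a factor there, so the format string is not treated as a date format.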

About transforming dates in R 3.1.0 in OSX 10.9.2

I have a column of dates containing 2 different formats, DD/MM/YY and D/M/YY. Because Microsoft Excel (for Mac 2011, 14.3.9) interpreted some of the D/M/YY dates as M/D/YY, the output dates became incorrect.
Then I turned to R and tried to transform the column into a format of "DD-MON-YYYY", where MON is short form of months, like 01-Jan-2014. The column is something like this:
> head(date, 10)
date
1 17/12/96
2 27/6/07
3 21/6/13
4 24/7/13
5 17/7/13
6 16/7/13
7 13/10/99
8 20/2/97
9 14/12/96
10 19/6/13
I used the format function
format(date,"%d %b %Y")
And the output was
Error in format.default(structure(as.character(x), names = names(x), dim = dim(x), :
invalid 'trim' argument
I have also tried the lubridate package with no success.
> library(lubridate)
> dmy(date)
[1] NA
Warning message:
All formats failed to parse. No formats found.
Is there any simple method to transform the date?
You need to transform your strings to objects of class Date, e.g.
as.Date("17/12/96", "%d/%m/%y")
[1] "1996-12-17"
and then apply your format
format(as.Date("17/12/96", "%d/%m/%y"), "%d-%b-%Y")
[1] "17-Dec-1996"
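Applied to the whole column, this might look like the sketch below. It assumes that date in the question is a data frame with a factor column also called date (the head() output suggests this, and it would explain why dmy(date) found nothing to parse); dates_df is a hypothetical stand-in built from the first rows shown above.

```r
# Hypothetical stand-in for the data in the question
dates_df <- data.frame(
  date = c("17/12/96", "27/6/07", "21/6/13", "13/10/99"),
  stringsAsFactors = TRUE
)

# Work on the column, not the data frame; %d and %m accept one or two digits
parsed <- as.Date(as.character(dates_df$date), format = "%d/%m/%y")
parsed

# Month abbreviations from %b depend on the locale,
# e.g. "17-Dec-1996" in an English locale
dates_df$formatted <- format(parsed, "%d-%b-%Y")
```

Note that %y has to guess the century; on most platforms 00 to 68 become 20xx and 69 to 99 become 19xx, as described in ?strptime.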

Converting dates from excel to R

I have difficulty converting dates from excel (reading from csv) to R. Help is much appreciated.
Here is what I'm doing:
df$date = as.Date(df$excel.date, format = "%d/%m/%Y")
However, some dates get converted but some not. Here is the output of:
head(df$date)
[1] NA NA NA "0006-01-05" NA NA
the first 5 entries imported from csv file are as follows:
7/28/05
7/28/05
12/16/05
5/1/06
4/21/05
and here is the output of:
head(df$excel.date)
[1] 7/28/05 7/28/05 12/16/05 5/1/06 4/21/05 1/25/07
1079 Levels: 1/1/00 1/1/02 1/1/97 1/10/96 1/10/99 1/11/04 1/11/94 1/11/96 1/11/97 1/11/98 ... 9/9/99
str(df)
.
.
$ excel.date : Factor w/ 1079 levels "1/1/00","1/1/02",..: 869 869 288 618 561 48 710 1022 172 241 ...
First of all, make sure you have the dates in your file in an unambiguous format, using full years (not just 2 last numbers). %Y is for "year with century" (see ?strptime) but you don't seem to have century. So you can use %y (at your own risk, see ?strptime again) or reformat the dates in Excel.
It is also a good idea to use as.is=TRUE with read.csv when reading in these data -- otherwise character vectors are converted to factors which can lead to unexpected results.
And on Windows it may be easier to use RODBC to read in dates directly from xls or xlsx file.
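As a small illustration of the as.is suggestion, the sketch below reads a made-up CSV snippet from a string instead of a file, so that it is self-contained; the column name excel.date is taken from the question.

```r
# Hypothetical CSV content standing in for the real file
csv_text <- "excel.date\n7/28/05\n12/16/05\n5/1/06"

# as.is = TRUE keeps text columns as character instead of converting to factor
df <- read.csv(text = csv_text, as.is = TRUE)

class(df$excel.date)
```

In R 4.0.0 and later, character columns are no longer converted to factors by default, so as.is = TRUE mostly matters on older versions.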
(edit)
The following may give a hint:
> as.Date("13/04/2014", format= "%d/%m/%Y")
[1] "2014-04-13"
> as.Date(factor("13/04/2014"), format= "%d/%m/%Y")
[1] "2014-04-13"
> as.Date(factor("13/04/14"), format= "%d/%m/%Y")
[1] "14-04-13"
> as.Date(factor("13/04/14"), format= "%d/%m/%y")
[1] "2014-04-13"
(So as.Date can actually take care of factors - the magic happens in the as.Date.factor method, defined as:
function (x, ...) as.Date(as.character(x), ...)
It is not a good idea to represent dates as factors but in this case it is not a problem either. I think the problem is excel which saves your years as 2-digit numbers in a CSV file, without asking you.)
-
The ?strptime help file says that using %y is platform specific - you can have different results on different machines. So if there's no way of going back to the source and save the csv in a better way you might use something like the following:
x <- c("7/28/05", "7/28/05", "12/16/05", "5/1/06", "4/21/05", "1/25/07")
repairExcelDates <- function(x, yearcol=3, fmt="%m/%d/%Y") {
x <- do.call(rbind, lapply(strsplit(x, "/"), as.numeric))
year <- x[,yearcol]
if(any(year>99)) stop("don't know what to do")
x[,yearcol] <- ifelse(year <= as.numeric(format(Sys.Date(), "%y")), year+2000, year + 1900)
# if the two-digit year <= current two-digit year then add 2000, otherwise add 1900
x <- apply(x, 1, paste, collapse="/")
as.Date(x, format=fmt)
}
repairExcelDates(x)
# [1] "2005-07-28" "2005-07-28" "2005-12-16" "2006-05-01" "2005-04-21"
# [6] "2007-01-25"
Your data is formatted as Month/Day/Year with two-digit years, so
df$date = as.Date(df$excel.date, format = "%d/%m/%Y")
should be
df$date = as.Date(df$excel.date, format = "%m/%d/%y")
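A quick check on the sample values from the question, as a sketch: the month comes first and the years have only two digits, so the format here uses %m/%d with %y (with %Y, "05" would be read as the year 5). The century chosen by %y follows the platform's rule described in ?strptime.

```r
# The sample values from the question, stored as a factor like in str(df)
excel_date <- factor(c("7/28/05", "7/28/05", "12/16/05", "5/1/06", "4/21/05", "1/25/07"))

# Month first, then day, then a two-digit year
parsed <- as.Date(as.character(excel_date), format = "%m/%d/%y")
parsed

# Any NA left over would point to a value that does not fit the format
anyNA(parsed)
```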

Convert unix timestamp column to day of week in R

I am working with a data frame in R labeled "mydata". The first column, labeled "ts", contains Unix timestamp fields. I'd like to convert these fields to days of the week.
I've tried using strptime and POSIXct functions but I'm not sure how to execute them properly:
> strptime(ts, "%w")
--Returned this error:
"Error in as.character(x) : cannot coerce type 'closure' to vector of type 'character'"
I also just tried just converting it to human-readable format with POSIXct:
as.Date(as.POSIXct(ts, origin="1970-01-01"))
--Returned this error:
"Error in as.POSIXct.default(ts, origin = "1970-01-01") :
do not know how to convert 'ts' to class “POSIXct”"
Update: Here is what ended up working for me:
> mydata$ts <- as.Date(mydata$ts)
then
> mydata$ts <- strftime( mydata$ts , "%w" )
No need to go all the way to strftime when POSIXlt gives you this directly, and strftime calls as.POSIXlt.
wday <- function(x) as.POSIXlt(x)$wday
wday(Sys.time()) # Today is Sunday
## [1] 0
There is also the weekdays function, if you want character rather than numeric output:
weekdays(Sys.time())
## [1] "Sunday"
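Putting this together for a column of Unix timestamps, a minimal sketch, assuming ts holds seconds since 1970-01-01 UTC and using made-up values; the UTC time zone and the new column names are choices made here, not part of the question:

```r
# Hypothetical data: seconds since the Unix epoch
mydata <- data.frame(ts = c(0, 86400, 1400000000))

# Refer to the column as mydata$ts; a bare ts is the stats::ts function
when <- as.POSIXct(mydata$ts, origin = "1970-01-01", tz = "UTC")

mydata$wday <- as.POSIXlt(when)$wday  # 0 = Sunday, ..., 6 = Saturday
mydata$day  <- weekdays(when)         # day names depend on the locale
mydata
```

This also explains both errors in the question: without the mydata$ prefix, ts resolves to the built-in time-series function, which is the 'closure' that cannot be coerced.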

Resources