I have a problem with the as.date function.
I have a list of normal date shows in the excel, but when I import it in R, it becomes numbers, like 33584. I understand that it counts since a specific day. I want to set up my date in the form of "dd-mm-yy".
The original data is:
how the "date" variable looks like in r
I've tried:
as.date <- function(x, origin = getOption(date.origin)){
origin <- ifelse(is.null(origin), "1900-01-01", origin)
as.Date(date, origin)
}
and also simply
as.Date(43324, origin = "1900-01-01")
but none of them works. it shows the error: do not know how to convert '.' to class “Date”
Thank you guys!
The janitor package has a pair of functions designed to deal with reading Excel dates in R. See the following links for usage examples:
https://www.rdocumentation.org/packages/janitor/versions/2.0.1/topics/excel_numeric_to_date
https://www.rdocumentation.org/packages/janitor/versions/2.0.1/topics/convert_to_date
janitor::excel_numeric_to_date(43324)
[1] "2018-08-12"
I've come across excel sheets read in with readxl::read_xls() that read date columns in as strings like "43488" (especially when there is a cell somewhere else that has a non-date value). I use
xldate<- function(x) {
xn <- as.numeric(x)
x <- as.Date(xn, origin="1899-12-30")
}
d <- data.frame(date=c("43488"))
d$actual_date <- xldate(d$date)
print(d$actual_date)
# [1] "2019-01-23"
Dates are notoriously annoying. I would highly recommend the lubridate package for dealing with them. https://lubridate.tidyverse.org/
Use as_date() from lubridate to read numeric dates if you need to.
You can use format() to put it in dd-mm-yy.
library(lubridate)
date_vector <- as_date(c(33584, 33585), origin = lubridate::origin)
formatted_date_vector <- format(date_vector, "%d-%m-%y")
Related
When I read in data from a CSV file, the column names return in this format:
X2017.04, X2017.05, X2017.06
I'm looking to format it as (or something similar to):
April-2017, May-2017, June-2017
Currently, I've tried for loops to iterate through the entire data set and reformat everything using as.Date() or as.yearmon and some of them have worked, kinda.
as.yearmon returned 1997.333333333 and similar looking floats. The as.Date code I tried returned blank values.
I'm relatively new/novice level in R and could use some help.
Thank you!
Using as.yearmon you can try :
names(df) <- zoo::as.yearmon(names(df), 'X%Y.%m')
Or in base R pasting an arbitrary date :
names(df) <- format(as.Date(paste0(names(df), '.01'), 'X%Y.%m.%d'), '%b-%Y')
As an example :
x <- c('X2017.04', 'X2017.05', 'X2017.06')
format(as.Date(paste0(x, '.01'), 'X%Y.%m.%d'), '%b-%Y')
#[1] "Apr-2017" "May-2017" "Jun-2017"
We can use
library(lubridate)
format(ymd(sub('^X', "", x,), truncated = 2), '%b-%Y')
#[1] "Apr-2017" "May-2017" "Jun-2017"
data
x <- c('X2017.04', 'X2017.05', 'X2017.06')
I know this question is asked a lot, but I only come to you because I tried everything (including the tips from similar questions that I managed to understand).
I have a rather big CSV file (> 16 000 rows), with, among others, a "Date" column, containing dates in the following format "01/01/1999".
However, when loading the file, the column is not recognised as a date, but as a Factor with read.csv2, or a character with fread (data.table package). I loaded the lubridate library.
In both cases, I tried to convert the column to a date format, using all methods I knew (column = Date, data = test):
as.Date(test$Date, format = "%d/%m/%Y", tz = "")
Or
strptime(test$Date, format = "%d/%m/%y", tz = "")
Or
as_date(test$Date)
And also the function dmy from lubridate, and
as.POSIXct(test$Date, "%d/%m/%y", tz = "").
I also tried changing the format: ymd instead of dmy, "-" instead of "/".
I even tried to change the character class to numeric (when loaded with fread), and the factor class to numeric (when loaded with read.csv2).
Despite all of this, the columns stay in their factor / character classes.
Does someone know what I missed?
Just use the anydate() function from the anytime package:
R> library(anytime)
R> var <- as.factor(c("01/01/1999", "01/02/1999"))
R> anydate(var)
[1] "1999-01-01" "1999-01-02"
R>
R> class(anydate(var))
[1] "Date"
R>
R> class(var)
[1] "factor"
R>
R>
It will read just about any input time, and convert it without requiring a format and this works as long as the represented is somewhat standard (i.e. we do not work with two-digit years etc).
(Otherwise you can of course also use the base R functions after first converting from factor back to character via as.character(). But anytime() and anydate() do that, and much more, for you too.)
If you are using read.csv2, try
read.csv2(..., stringsAsFactors=F)
and then continue with as.Date
at the moment I'm trying to convert a string into time-format.
e.g. my string looks like following: time <- '12:00'.
I already tried to use the chron-Package. And my code looks like following:
time <- paste(time,':00', sep = '') time <- times(time)
Instead of getting a value like "12:00:00" the function times() always translate the object time into "0.5"
Am I using the wrong approach?
regards
Your code works. If you check the 'class()' it is "times". However, if you want another way, try:
time <- '12:00:00'
newtime<-as.POSIXlt(time, format = "%H:%M:%S") # The whole date with time
t <- strftime(newtime, format="%H:%M:%S") # To extract the time part
t
#[1] "12:00:00"
Cheers !
I have dates in the following formats:
08MAR1978:00:00:00
10FEB1973:00:00:00
15AUG1982:00:00:00
I would like to convert them to:
1978-03-08
1973-02-10
1982-09-15
I have tried the following in SparkR:
period_uts <- unix_timestamp(all.new$DATE_OF_BIRTH, '%d%b%Y:%H:%M:%S')
period_ts <- cast(period_uts, 'timestamp')
period_dt <- cast(period_ts, 'date')
df <- withColumn(all.new, 'p_dt', period_dt)
But when I do this, all the dates get changed into "NA".
Can anyone please provide some insights on how I can convert dates in %d%B%Y:%H:%M:%S format to dates in SparkR?
Thanks!
I don't think you need SparkR to solve this question.
What you have:
DoB <- c("08MAR1978:00:00:00", "10FEB1973:00:00:00", "15AUG1982:00:00:00")
If you want to get 1978-03-08 etc. you could just use as.Date in combination with the date format you already found yourself:
as.Date(DoB, format="%d%B%Y:%H:%M:%S")
# [1] "1978-03-08" "1973-02-10" "1982-08-15"
as.Date will ensure that R knows how to interpret your string as a date.
Note, however, that in general the way dates are displayed to you (i.e. 1978-03-08) actually don't really matter. The reason is that 'under the hood', R understands your date now, so all date-related operations will be performed appropriately.
I figured out how to do it:
all.new = all.new %>% withColumn("Date_of_Birth_Fixed", to_date(.$DATE_OF_BIRTH, "ddMMMyyyy"))
This works in Spark 2.2.x
I am wanting to convert date-times stored as characters to date-time objects.
However if a date time includes midnight then the resulting datetime object excludes the time component, which then throws an error when used in a later function (not needed here - but a function that extracts weather data for specified location and date-time).
Example code:
example.dates <- c("2011-11-02 00:31:00","2011-11-02 00:00:00","2011-11-02 00:20:22")
posix.dates <- as.POSIXct(example.dates, tz="GMT", format="%Y-%m-%d %H:%M:%S")
posix.dates
posix.dates[2]
NB times is only excluded when the datetime containing midnight is called on it's own (atomic vector).
Is there a way of retaining the time data for midnight times? Can you suggest an alternative function?
Okay, after some time I can reconfirm your problem.
For me this looks like a bug in R. I would suggest you to report it on https://bugs.r-project.org/bugzilla3/.
As a temporary workaround, you could try if it helps to overwrite the strptime function like this:
strptime <- function (x, format, tz = "")
{
if ("POSIXct" %in% class(x)) {
x
} else {
y <- .Internal(strptime(as.character(x), format, tz))
names(y$year) <- names(x)
y
}
}
Realise this is an old question now, but I had the same issue and found this solution:
https://stackoverflow.com/a/51195062/8158951
Essentially, all you need to do is apply formatting as follows. The OP's code needed to include the formatting call after the POSIXct function call.
posix.dates <- format(as.POSIXct(example.dates, tz="GMT"), format="%Y-%m-%d %H:%M:%S")
This worked for me.
I prefer to use the lubridate package for date-times. It does not seem to cause problems here either:
example.dates <- c("2011-11-02 00:31:00","2011-11-02 00:00:00","2011-11-02 00:20:22")
library(lubridate)
ymd_hms(example.dates)