I have a dataset with some date time like this "{datetime:2015-07-01 09:10:00" So I wanted to remove the text, and then keep the date & the time as as.Date returns only the date. So I write this code but the only problem I have is that during the second line with strsplit, it only returns me the date time of the first line and so erase the others... I woud love to get ALL my date time not only the first. I thought about sapply maybe, but I can't make it right I have many errors or maybe with a loop for? I am novice to R so I don't really know how to do this the best way.
Could you help me please? Besides If you have another idea for the time & date format or a simple way to do it, it should be very nice of you too.
data$`Date Time`=as.character(data$`Date Time`)
data$`Date Time`=unlist(strsplit(data[,1], split='e:'))[2]
date=substr(data$`Date Time`,0,10)
date=as.Date(date)
time=substr(data$`Date Time`,12,19)
data$Date=date
data$Time=time
Thank you very much for your help!
You could use the format argument to avoid all the strsplit:
times <- as.POSIXct(data$`Date Time`, format='{datetime:%Y-%m-%d %H:%M:%S')
(The reason for the "{datetime:" in the format is because you mentioned this is the format of your strings).
This object has both date and time in it, and then you can just store it in the dataframe as a single column of type POSIXct rather than two columns of type string e.g.
data$datetime <- times
but if you do want to store the date as a Date and the time as a string (as in your example above):
data$Date <- as.Date(times)
data$Time <- strftime(times, format='%H:%M:%S')
See ?as.Date, ?as.POSIXct, ?strptime for more details on that format argument and various conversions between date and string.
Related
Hi and thanks for reading me. Im trying to convert a character string in to a datetime format in r, but I cannot discover the way to do that, because the year is only taken the last 2 digits ("22" instead of "2022"), im not sure of how to fix It.
The character string is "23/8/22 12:45" and I tried with:
as.Date("23/8/22 12:45", format ="%m/%d/%Y" )
and
as_datetime(ymd_hm("23/8/22 12:45"), format ="%m/%d/%Y")
But it doest work. Anyone knows how I can fix it? thanks for the help
as.Date returns only the calendar date, because class Date objects are only shown as calendar dates.
in addition to POSIXct, you can also use parse_date_time from lubridate package:
library(lubridate)
parse_date_time("23/8/22 12:45", "dmy HM", tz="") #dmy - day, month, year HM - hour, minute
# Or the second function:
dmy_hm("23/8/22 12:45",tz=Sys.timezone())
As has been mentioned in the comments you need to use "%d/%m/%y" to correctly parse the date in your string. But, if you want datetime format (in base r this is normally done with POSIXct classes) you could use as.POSIXct("23/8/22 12:45", format = "%d/%m/%y %H:%M"). This will make sure you keep information about the time from your string.
I am relatively new to R and I have a dataset in which I am trying to convert a date and time into a numeric value. The date and time are in the format 01JUN17:00:00:00 under a variable called pickup_datetime. I have tried using the code
cab_small_sample$pickup_datetime <- as.numeric(as.Date(cab_small_sample$pickup_datetime, format = '%d%b%y'))
but this way doesn't incorporate time, I tried to add the time format to the format section of code but still did not work. Is there an R function that will convert the data into a numeric value>
R has two main time classes: "Date" and "POSIXct". POSIXct is a datetime class and you can get all the gory details at: ? DateTimeClasses. The help page for the formats used at the time of data input, however, are at ?striptime.
cab_small_sample <- data.frame(pickup_datetime = "01JUN17:00:00:00")
cab_small_sample$pickup_dt <- as.numeric(as.POSIXct(cab_small_sample$pickup_datetime,
format = '%d%b%y:%H:%M:%S'))
cab_small_sample
# pickup_datetime pickup_dt
#1 01JUN17:00:00:00 1496300400 # seconds since 1970-01-01
I find that a "destructive reassignment of values" is generally a bad idea so as a "my (best?) practice rule" I don't assign to the same column until I'm sure I have the code working properly. (And I always leave an untouched copy somewhere safe.)
lubridate is an extremely handy package for dealing with dates. It includes a variety of functions which do the date/time parsing for you, as long as you can provide the order of components. In this case, since your data is in day-month-year-hms form, you can use the dmy_hms function.
library(lubridate)
cab_small_sample <- dplyr::tibble(
pickup_datetime = c("01JUN17:00:00:00", "01JUN17:11:00:00"))
cab_small_sample$pickup_POSIX <- dmy_hms(cab_small_sample$pickup_datetime)
I have dates in the following formats:
08MAR1978:00:00:00
10FEB1973:00:00:00
15AUG1982:00:00:00
I would like to convert them to:
1978-03-08
1973-02-10
1982-09-15
I have tried the following in SparkR:
period_uts <- unix_timestamp(all.new$DATE_OF_BIRTH, '%d%b%Y:%H:%M:%S')
period_ts <- cast(period_uts, 'timestamp')
period_dt <- cast(period_ts, 'date')
df <- withColumn(all.new, 'p_dt', period_dt)
But when I do this, all the dates get changed into "NA".
Can anyone please provide some insights on how I can convert dates in %d%B%Y:%H:%M:%S format to dates in SparkR?
Thanks!
I don't think you need SparkR to solve this question.
What you have:
DoB <- c("08MAR1978:00:00:00", "10FEB1973:00:00:00", "15AUG1982:00:00:00")
If you want to get 1978-03-08 etc. you could just use as.Date in combination with the date format you already found yourself:
as.Date(DoB, format="%d%B%Y:%H:%M:%S")
# [1] "1978-03-08" "1973-02-10" "1982-08-15"
as.Date will ensure that R knows how to interpret your string as a date.
Note, however, that in general the way dates are displayed to you (i.e. 1978-03-08) actually don't really matter. The reason is that 'under the hood', R understands your date now, so all date-related operations will be performed appropriately.
I figured out how to do it:
all.new = all.new %>% withColumn("Date_of_Birth_Fixed", to_date(.$DATE_OF_BIRTH, "ddMMMyyyy"))
This works in Spark 2.2.x
I have a dataset called EPL2011_12. I would like to make new a dataset by subsetting the original by date. The dates are in the column named Date The dates are in DD-MM-YY format.
I have tried
EPL2011_12FirstHalf <- subset(EPL2011_12, Date > 13-01-12)
and
EPL2011_12FirstHalf <- subset(EPL2011_12, Date > "13-01-12")
but get this error message each time.
Warning message:
In Ops.factor(Date, 13- 1 - 12) : > not meaningful for factors
I guess that means R is treating like text instead of a number and that why it won't work?
Well, it's clearly not a number since it has dashes in it. The error message and the two comments tell you that it is a factor but the commentators are apparently waiting and letting the message sink in. Dirk is suggesting that you do this:
EPL2011_12$Date2 <- as.Date( as.character(EPL2011_12$Date), "%d-%m-%y")
After that you can do this:
EPL2011_12FirstHalf <- subset(EPL2011_12, Date2 > as.Date("2012-01-13") )
R date functions assume the format is either "YYYY-MM-DD" or "YYYY/MM/DD". You do need to compare like classes: date to date, or character to character. And if you were comparing character-to-character, then it's only going to be successful if the dates are in the YYYYMMDD format (with identical delimiters if any delimiters are used).
The first thing you should do with date variables is confirm that R reads it as a Date. To do this, for the variable (i.e. vector/column) called Date, in the data frame called EPL2011_12, input
class(EPL2011_12$Date)
The output should read [1] "Date". If it doesn't, you should format it as a date by inputting
EPL2011_12$Date <- as.Date(EPL2011_12$Date, "%d-%m-%y")
Note that the hyphens in the date format ("%d-%m-%y") above can also be slashes ("%d/%m/%y"). Confirm that R sees it as a Date. If it doesn't, try a different formatting command
EPL2011_12$Date <- format(EPL2011_12$Date, format="%d/%m/%y")
Once you have it in Date format, you can use the subset command, or you can use brackets
WhateverYouWant <- EPL2011_12[EPL2011_12$Date > as.Date("2014-12-15"),]
I am getting Data where the first column is always a string with the Time like
Time <- "2015-06-01 09:45:33"
for plotting, later I convert it with as.POSIXct and so on.
But sometimes I have another Time string like
Time <- "2015/07/01 09:33"
So is there a possibility(or a function) to check the Time format of the string in a way like this
format <- checkFormat(Time)
and then convert it automatically to
as.POSIXct(Time, format=format)
I cant be the first one who asks this, although i really searched a lot.
Thanks
As requested in answer format: it is not possible, since in your example no solution can know if 06 is the month and 01 the day or vice versa.
You can change the time format string with
Time <- "2015/07/01 09:33"
Time <- if(grepl("\\d{4}/\\d{2}/\\d{2}", Time)) gsub("/", "-", Time)