I have a variable in my dataset called timestamp, which has the form:
mydata$timestamp
2013-08-01 12:00:00
2013-08-01 12:00:00
2013-08-01 12:00:00
I want to change these so they show only the date, in dd-mm-yy format:
dates<-strptime(mydata$timestamp, format="%d:%m:%y")
Printing dates gives only NAs, and I'm not sure why.
Could anyone help, please?
Thanks in advance
Pascal's answer is correct, but I'm going to add to it because with all due respect to him I think you need to be aware of the mistakes you are making so you won't make them in the future.
In some programming languages there are internal dates that have associated formats and you can change that format without changing the internal representation of the date. That is probably why you phrased the question as you did, but that is not the way R works. In R you can either have dates represented as a string or as an actual date class that R understands, like Date or POSIXlt. Classes that R understands don't have any specific output format associated with them.
Your input data appears to be a string representation of a date. It appears that you want it in a different string representation. strptime() will change from a string to POSIXlt but this data type isn't formatted one way or the other. If you want to turn it back into a string, you need to use a different command. In Pascal's example, that function is format().
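A minimal sketch of that distinction, using a made-up sample value (the tz = "UTC" setting is an assumption to keep the result reproducible): the parsed object has a date-time class, while format() hands back a plain character string, and the same parsed value can be rendered in more than one way.

```r
# Parse a sample string into a date-time object (POSIXlt)
parsed <- strptime("2013-08-01 12:00:00",
                   format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
class(parsed)        # a date-time class, not a string

# Render the same parsed value as two different strings
as_dmy   <- format(parsed, "%d-%m-%y")
as_slash <- format(parsed, "%Y/%m/%d")
class(as_dmy)        # "character"
as_dmy               # "01-08-13"
as_slash             # "2013/08/01"
```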
OK, so you want to use strptime() to turn it into an R date and then use format() to turn it back into a string. Fine, but you have to get the arguments right. The second argument to strptime() is a format string that tells the function what the current format of the input is. Your argument "%d:%m:%y" is not remotely similar to what your data looks like. That's why you are getting NA. As Pascal points out, the correct format is "%Y-%m-%d %H:%M:%S". Check the help file for strptime() to see what the symbols in the formatting strings mean.
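To make the cause of the NAs concrete, here is a small sketch with sample timestamps shaped like the ones in the question (the vector ts and tz = "UTC" are assumptions for illustration). The mismatched format string fails to parse, while the matching one parses and can then be reformatted.

```r
ts <- c("2013-08-01 12:00:00", "2013-08-02 08:30:15")

# Format string that does not describe the input: every element becomes NA
bad <- strptime(ts, format = "%d:%m:%y", tz = "UTC")

# Format string that matches the input, followed by the desired output format
good <- strptime(ts, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
out  <- format(good, "%d-%m-%y")

all(is.na(bad))   # TRUE
out               # "01-08-13" "02-08-13"
```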
Personally I avoid all the local-time handling that strptime() does and just use R's basic Date class. My solution would be:
dates <- format(as.Date(mydata$timestamp,format="%Y-%m-%d %H:%M:%S"),format="%d-%m-%y")
Notice that the format argument in as.Date() tells the function what the incoming data format is, and the format argument in format() tells it what you want the outgoing format to be.
If you want dd-mm-yy format, you need format(x, "%d-%m-%y"), where x is the timestamp after it has been parsed into a date-time object. For example:
x <- strptime(c("2006-01-08 10:07:52", "2006-08-07 19:33:02"), "%Y-%m-%d %H:%M:%S", tz = "EST5EDT")
x
[1] "2006-01-08 10:07:52 EST" "2006-08-07 19:33:02 EDT"
format(x, "%d-%m-%y")
[1] "08-01-06" "07-08-06"
Related
I load a dataset from excel
library(readxl)
df<-read_excel("excel_file.XLSX")
In the file there is a separate date column that is read as POSIXct:
str(df$datecol)
I also have a time column that gets loaded in R as a date-time. To convert it back to a time, I do:
df$Timecol<-format(df$Timecol,"%H:%M:%S")
However, it turns into a character. This is where I think the problem lies:
str(STOP_DATA$`Stop Frisk Time`)
I would think this part resolves the situation
df$merge_date_time<-as.POSIXct(paste(df$datecol, df$Timecol), format="%Y-%m-%d %H:%M:%S")
The date and time are then combined. What I want to do now is reference a timestamp column that is a POSIXct data type.
str(df$Timestamp)
I would like to then find the time difference between them
df$TIME_SINCE <- difftime(df$Timestamp, df$merge_date_time, tz="UTC", units = "mins" )
but I end up with weird numbers that don't make sense. My guess is that the character data type of the time column is the cause. Does anyone know how to solve this?
I ended up finding that this works:
df$date_time<-paste(df$date, format(as.POSIXct(df$time), '%T'))
I removed the portion below from the script because it changed the column into a character.
df$Timecol<-format(df$Timecol,"%H:%M:%S")
I accepted the obscure POSIXct default for the time column, which has the proper time but odd dates (1899-12-31). What the script does is replace 1899-12-31 with the corresponding value from the df$date column.
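A rough sketch of the whole workflow, under stated assumptions: the data frame below is made up (it stands in for what read_excel() might return), the column names date, time and Timestamp follow the snippet above, and everything is pinned to tz = "UTC" so the arithmetic is reproducible. The combined string is converted back to POSIXct before calling difftime(), since a character column cannot be subtracted from a date-time.

```r
# Hypothetical stand-in for the imported sheet
df <- data.frame(
  date      = as.POSIXct(c("2020-01-15", "2020-01-16"), tz = "UTC"),
  time      = as.POSIXct(c("1899-12-31 08:30:00", "1899-12-31 17:45:10"), tz = "UTC"),
  Timestamp = as.POSIXct(c("2020-01-15 09:00:00", "2020-01-16 18:00:10"), tz = "UTC")
)

# Take the date part from one column and the clock time from the other,
# then parse the combined string back into a real date-time
df$date_time <- as.POSIXct(
  paste(format(df$date, "%Y-%m-%d"), format(df$time, "%H:%M:%S")),
  format = "%Y-%m-%d %H:%M:%S", tz = "UTC"
)

# Difference in minutes between the two date-time columns
df$TIME_SINCE <- difftime(df$Timestamp, df$date_time, units = "mins")
as.numeric(df$TIME_SINCE)   # 30 15
```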
In R I am trying to take a date string and convert it to date-time format using lubridate, but I am getting this error:
All formats failed to parse. No formats found.
Using this code:
lubridate::as_date("1/2/34")
Shouldn't this just return a formatted date-time?
as.Date or as_date needs a format. By default, it can parse the string only if the format is %Y-%m-%d. Here, that is not the case. So
lubridate::as_date("1/2/34", format ="%d/%m/%y")
Or more compactly
lubridate::dmy("1/2/34")
Based on the string alone, it is not clear whether it is day/month/year or month/day/year. Also, for a 2-digit year there is an ambiguity about the century prefix, i.e. it can be either "19" or "20". Here, it would parse as 2034.
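Both ambiguities can be seen in a short sketch. It uses base R's as.Date() rather than lubridate so that it has no package dependency; the second input string is an extra value added for illustration. Per the strptime() documentation, two-digit years 00 to 68 get the prefix 20 and 69 to 99 get the prefix 19.

```r
x <- "1/2/34"

# Same string, two readings of the day/month order
as_dmy <- as.Date(x, format = "%d/%m/%y")   # 1 February 2034
as_mdy <- as.Date(x, format = "%m/%d/%y")   # 2 January 2034

# A two-digit year of 69 or above falls in the 1900s
old <- as.Date("1/2/69", format = "%d/%m/%y")

format(as_dmy)   # "2034-02-01"
format(as_mdy)   # "2034-01-02"
format(old)      # "1969-02-01"
```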
I have a system-generated date and time format. It looks something like this: "2017-04-12-02.29.25.000000". I want to convert it into a standard format that my system can read, so that later on I can convert it into minutes. Could someone please help with code in R?
If you're unsure of the format, the guess_formats function in lubridate is pretty helpful:
w <- "2017-04-12-02.29.25.000000"
> lubridate::guess_formats(w, orders = 'YmdHMS')
              YOmdHMS                YmdHMS
"%Y-%Om-%d-%H.%M.%OS"  "%Y-%m-%d-%H.%M.%OS"
The orders argument gives the order of components you want the function to try, and the function returns the matching format strings. If the second entry in the string is the day, you can try YdmHMS.
The difference in the two formats in the output in the above example is based on formatting of the second entry (always with a leading zero or not). Trying the first format gives:
> as.POSIXct(w, format = "%Y-%Om-%d-%H.%M.%OS")
[1] "2017-04-12 02:29:25 EDT"
In the as.POSIXct call you may specify the timezone tz if required.
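As a sketch of the full conversion, including the "into minutes" part of the question: the code below parses the string with the second format from the output above and then computes minutes since midnight. Reading "minutes" as minutes since midnight is an assumption, as is tz = "UTC"; a difference from some other reference time would work the same way through difftime().

```r
w <- "2017-04-12-02.29.25.000000"

# Parse with the guessed format; %OS reads seconds including the fractional part
t <- as.POSIXct(w, format = "%Y-%m-%d-%H.%M.%OS", tz = "UTC")

# Minutes elapsed since the start of that day
midnight <- as.POSIXct(trunc(t, "days"))
mins     <- as.numeric(difftime(t, midnight, units = "mins"))

format(t, "%Y-%m-%d %H:%M:%S")   # "2017-04-12 02:29:25"
mins                             # 149 minutes plus 25 seconds, about 149.42
```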
I am working on an email data set with three columns: sender, receiver and time of communication. I must restrict the emails to working hours, but the problem is that the time format looks like this: 2017-03-27T02:06:42.793Z
So I wonder how I can convert this to a normal time format such as %y-%m-%d %H:%M:%S, so that I can filter on the time.
I also tested the following:
email$time2= format(anytime(email$time), "%y-%m-%d %H:%M:%S")
but strangely this command adds one hour to each time; for instance, "2017-03-27T00:00:37.820Z" was converted to "17-03-27 01:00:37".
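Trailing characters are ignored, so the string can be read with only a slight variation in your format string. Note that the trailing Z marks the time as UTC, so pass tz = "UTC" instead if you want the value interpreted that way rather than as local time in the zone used here: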
as.POSIXct("2017-03-27T02:06:42.793Z", format = "%Y-%m-%dT%H:%M:%S",
           tz = "America/Los_Angeles")
[1] "2017-03-27 02:06:42 PDT"
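For the original goal of restricting emails to working hours, a sketch along these lines may help. The sample vector and the 09:00 to 17:00 window are assumptions, and the timestamps are parsed as UTC because of the trailing Z, which also avoids the one-hour shift that comes from converting to local time.

```r
# Sample values in the same shape as the question's time column
email_time <- c("2017-03-27T02:06:42.793Z",
                "2017-03-27T00:00:37.820Z",
                "2017-03-27T10:15:00.000Z")

# Parse as UTC; the trailing ".793Z" part is ignored by the format string
parsed <- as.POSIXct(email_time, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")

# String form requested in the question
formatted <- format(parsed, "%y-%m-%d %H:%M:%S")

# Keep only rows whose hour falls in the assumed working window
hour     <- as.integer(format(parsed, "%H"))
in_hours <- hour >= 9 & hour < 17

formatted   # "17-03-27 02:06:42" "17-03-27 00:00:37" "17-03-27 10:15:00"
in_hours    # FALSE FALSE TRUE
```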
I need to get current time in R in this particular format:
2014-01-07T14:57:55+05:30
Sys.time() seems to return a different format from this. How exactly do I get this?
Link to the format : https://en.wikipedia.org/wiki/ISO_8601
The functions for converting a time string to a date-time are as.POSIXct and as.POSIXlt. Their documentation points to the docs for strptime for the format symbols. That reference indicates %F is the symbol for the ISO 8601 date format (equivalent to %Y-%m-%d); however, using it as below results in a format different from what you suggest, because as.POSIXct() parses input rather than formatting output, so the value is printed in the default format.
> as.POSIXct(Sys.time(),format="%F")
[1] "2016-10-02 18:57:58 EDT"
I suspect looking at strptime you will find the combination necessary to output the exact format you need.
Is this what you're looking for?
format(Sys.time(), format="%Y-%m-%dT%H:%M:%S+01:00")
format(Sys.time(), format="%Y-%m-%dT%H:%M:%S%z")
You can find the meaning of the letters in the documentation of the strptime() function.
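One gap remains: %z produces an offset such as +0530, while the format in the question has a colon (+05:30). A minimal sketch of one way to close that gap is shown below; add_offset_colon is a hypothetical helper name, not part of base R, and %z output is documented as platform-dependent, so treat this as an illustration rather than a guaranteed recipe.

```r
# Hypothetical helper: insert a colon into a trailing +hhmm / -hhmm offset
add_offset_colon <- function(s) {
  sub("([+-][0-9]{2})([0-9]{2})$", "\\1:\\2", s)
}

# Current time in the local zone, with the numeric offset rewritten
now_iso <- add_offset_colon(format(Sys.time(), "%Y-%m-%dT%H:%M:%S%z"))

# Check of the helper on a fixed string matching the question's example
add_offset_colon("2014-01-07T14:57:55+0530")   # "2014-01-07T14:57:55+05:30"
```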