How do I convert a string into date and day in R

I have a dataset with 10 columns, one of which is a date in the following format:
10-MAR-12 00.00.00.000000000
I would like to convert this into a format that R reads as a date rather than a string, displayed as follows:
10/03/12
I would also like an additional column that gives the day of the week.
I would then like to filter out certain days or dates to create a subset of my data.
I am a beginner in R, so any help is appreciated.

Take a look at ?strptime for formatting options and at as.Date or as.POSIXct for the conversion functions. Also, don't be surprised if your question is downvoted or closed, since this is a common question and answers can be found on SO or with a quick Google search.
Specifically:
format(as.Date(tolower('10-MAR-12 00.00.00.000000000'), format='%d-%b-%y'), format='%d/%m/%y')
should give you the formatting you're looking for. If you want a Date object rather than a character string, drop the outer format() call, since format() always returns a string.
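The weekday column and the filtering step from the question could be sketched as follows. This is only a minimal illustration: the data frame df, its columns, and the two extra dates are made up, and the month and weekday names assume an English locale.

```r
# Hypothetical sample data; only the date layout comes from the question.
df <- data.frame(
  id = 1:3,
  raw_date = c("10-MAR-12 00.00.00.000000000",
               "11-MAR-12 00.00.00.000000000",
               "12-MAR-12 00.00.00.000000000"),
  stringsAsFactors = FALSE
)

# Parse to a Date column; as.Date() ignores the text after the matched part.
df$date <- as.Date(df$raw_date, format = "%d-%b-%y")

# Day of the week as an extra column (names depend on the locale).
df$weekday <- weekdays(df$date)

# dd/mm/yy label for display; this column is character, not Date.
df$date_label <- format(df$date, "%d/%m/%y")

# Subsets: drop weekends, or keep rows from a given date onwards.
no_weekend <- df[!df$weekday %in% c("Saturday", "Sunday"), ]
from_11th  <- df[df$date >= as.Date("2012-03-11"), ]
```

Keeping the Date column separate from the display label means comparisons such as >= keep working on real dates.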

Related

adding a column in dataframe an R date format

Question: Create a new reldate column in the movies data frame in R by converting the column release_date into R date format.
This is my code:
movies <-read.csv("C:/Users/phili/Downloads/movies500.csv")
movies
movies$reldate <- format(as.Date(movies$release_date),"%d/%m/%Y")
print(movies)
Unfortunately, the second piece of code does not add a new column in R date format.
If you can't answer my question directly, please use a very similar example
In the future, it would be helpful to see your data or similar example data instead of a screenshot.
Anyway, it looks like there are three things that need to be fixed:
You probably don't need the format() function, which returns a character string rather than a date. What you might have wanted is the format= argument of as.Date().
"%d/%m/%Y" tells R what format to expect the dates in. Your dates are year, month, day, so the order is wrong.
Similarly, your dates are separated by dashes, not slashes.
So it should look like this: as.Date("2018-09-12",format="%Y-%m-%d")
So in your example try this: movies$reldate <- as.Date(movies$release_date,format="%Y-%m-%d")
Or, because one of the defaults for as.Date() is "%Y-%m-%d", you could probably just do as.Date(movies$release_date)
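A small self-contained sketch of the difference, using a made-up two-row stand-in for movies500.csv (the titles and dates are hypothetical):

```r
# Hypothetical stand-in for the CSV from the question.
movies <- data.frame(
  title = c("Movie A", "Movie B"),
  release_date = c("2018-09-12", "2019-01-05"),
  stringsAsFactors = FALSE
)

# as.Date() with a matching input format gives a real Date column.
movies$reldate <- as.Date(movies$release_date, format = "%Y-%m-%d")

# Wrapping the result in format() turns it back into character.
movies$reldate_label <- format(movies$reldate, "%d/%m/%Y")

class(movies$reldate)        # "Date"
class(movies$reldate_label)  # "character"
```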

Gather turns my dates into unrecognisable format

I am trying to gather a couple of columns of dates so that it's easier to offer them as choices in Shiny. However, when I gather the dates, a value such as 2020/12/14 turns into a number like 128284. I have tried as.Date and as.character, and I have tried lubridate, but nothing works. (I have been gathering in a separate script outside Shiny.) Please see my gathering code below.
Here is my data
before gather
df<-df%>%gather(key="date.type", value="dates",
date.1, date.2, date.3, date.4)
This turns it into something like this:
after gather
This becomes a problem when I try to find the difference between two dates in Shiny (I have been using difftime).
The error I get in shiny is:
x character string is not in a standard unambiguous format
I am also thinking of not gathering at all, but allowing the user to choose the from date column and to date column in the UI, but I am not sure how to then find the difference in days between the from and to dates in the server.
mutate(theduration=difftime(input$to,input$from,units="days"))
This doesn't work.
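OK, so I had this problem when using gather to build a dataset. When you get those five-digit numbers in place of dates, try this:
mutate(time=as.Date(as.numeric(time),origin="1899-12-30"))
Apparently, that five-digit number is the number of days since the origin date. The 1899-12-30 origin is a Microsoft Excel convention; if the values were already R Date objects before gathering, the numbers count days since 1970-01-01 instead, so use that as the origin. Good luck!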
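A rough sketch of both points, in base R only. The day counts, the data frame, and the plain list standing in for Shiny's input object are all made up for illustration; in a real app input$from and input$to would come from the UI.

```r
# The same calendar day expressed against two different origins
# (hypothetical values chosen for illustration).
from_r     <- as.Date(18610, origin = "1970-01-01")  # R's own day count
from_excel <- as.Date(44179, origin = "1899-12-30")  # Excel serial number

# Hypothetical wide data with real Date columns (no gather needed).
df <- data.frame(
  date.1 = as.Date(c("2020-12-01", "2020-12-05")),
  date.2 = as.Date(c("2020-12-14", "2020-12-20"))
)

# Stand-in for Shiny's input: the user picks column NAMES, not dates.
input <- list(from = "date.1", to = "date.2")

# Look the columns up by name with [[ ]] before calling difftime().
df$theduration <- difftime(df[[input$to]], df[[input$from]], units = "days")
```

Passing input$to directly to difftime() hands it the column name as a string, which is one way to end up with the "not in a standard unambiguous format" error.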

Difference in Days Between Two Date Columns in a Dataframe with Different Date Formats

Just looking for help working with some dates in R. Code for a simple data frame is below, with one column of start dates and one column of end dates. I would like to create a new column with the difference in days between each set of dates - start date and end date. Also, the dates are in different formats, so is there an easy way to convert all dates to a similar format? I've been reading about the lubridate package but haven't found anything yet on this particular situation that is easy for me to quickly learn as an R newbie. It would be great to link the answer to the dplyr pipeline as well, if possible, to calculate average number of days, etc.
Start.date<-c("05-May-15", "10-June-15", "July-12-2015")
End.date<-c("12-July-15", "2015-Aug-15", "Sept-12-2015")
Dates.df<-data.frame(Start.date,End.date)
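One possible approach in base R is to try a short list of candidate formats in turn and keep the first one that parses. The helper parse_mixed_dates and its format list are hypothetical, written only for the six strings above; the sketch assumes an English locale and rewrites "Sept" to "Sep" because strptime does not recognise the four-letter abbreviation.

```r
Start.date <- c("05-May-15", "10-June-15", "July-12-2015")
End.date   <- c("12-July-15", "2015-Aug-15", "Sept-12-2015")
Dates.df   <- data.frame(Start.date, End.date, stringsAsFactors = FALSE)

# Hypothetical helper: try each format in order, filling only the gaps.
# The order matters: "%d-%b-%y" must come before "%Y-%b-%d".
parse_mixed_dates <- function(x,
                              formats = c("%d-%b-%y", "%b-%d-%Y", "%Y-%b-%d")) {
  x <- sub("^Sept-", "Sep-", x)          # non-standard abbreviation
  out <- rep(as.Date(NA), length(x))
  for (fmt in formats) {
    todo <- is.na(out)                   # entries still unparsed
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

Dates.df$Start <- parse_mixed_dates(Dates.df$Start.date)
Dates.df$End   <- parse_mixed_dates(Dates.df$End.date)

# Difference in days as a plain number, then a summary.
Dates.df$days <- as.numeric(Dates.df$End - Dates.df$Start)
mean_days <- mean(Dates.df$days)
```

The same two steps (parse, then subtract) should slot into a dplyr mutate() call unchanged, since the helper is an ordinary vectorised function.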

How to convert date and time into a numeric value

As a new, self-taught R user, I am struggling to convert date and time values stored as characters into numbers so that I can group unique combinations of data. I'm hoping someone has come across this before and knows how I might go about it.
I'd like to convert a field of DateTime data (30/11/2012 14:35) to a numeric version of the date and time (seconds from 1970 maybe??) so that I can refer back to the date and time if needed.
I have searched the R help and online help and can only find POSIXct and strptime, which seem to convert the other way in the examples I've seen.
I will need to apply the conversion to a large dataset so I need to set the formatting for a field not an individual value.
I have tried to modify some python code but to no avail...
Any help with this, including pointers to tools I should read about would be much appreciated.
You can do this with base R just fine, but there are some shortcuts for common date formats in the lubridate package:
library(lubridate)
d <- dmy_hm("30/11/2012 14:35")
> as.numeric(d)
[1] 1354286100
From ?POSIXct:
Class "POSIXct" represents the (signed) number of seconds since the
beginning of 1970 (in the UTC timezone) as a numeric vector.
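For comparison, a base R sketch that converts a whole column at once and then back again. The second timestamp and the choice of tz = "UTC" are assumptions; use the time zone your data was actually recorded in.

```r
# Hypothetical column of date-time strings in the question's layout.
x <- c("30/11/2012 14:35", "01/12/2012 09:00")

# Character -> POSIXct, stating the format and time zone explicitly.
dt <- as.POSIXct(x, format = "%d/%m/%Y %H:%M", tz = "UTC")

# POSIXct -> seconds since 1970-01-01 00:00 UTC.
secs <- as.numeric(dt)

# Seconds -> POSIXct again, to recover the original date and time.
back <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")
format(back, "%d/%m/%Y %H:%M")
```

Because as.POSIXct() is vectorised, the same call works on a data frame column without a loop.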

Specific date format conversion problems in R

Basically I want to know why as.Date(200322,format="%Y%W") gives me NA. While we are at it, I would appreciate any advice on a data structure for repeated cross-section (aka pseudo-panel) in R.
I did get aggregate() to (sort of) work, but it is not flexible enough - it misses data on columns when I omit the missed values, for example.
Specifically, I have a survey that is repeated weekly for a couple of years with a bunch of similar questions answers to which I would like to combine, average, condition and plot in both dimensions. Getting the date conversion right should presumably help me towards my goal with zoo package or something similar.
Any input is appreciated.
Update: thanks for the string suggestion, but as you can see in your own example, the %W part doesn't work: it only picks up the year and fills in the current day, whereas I need to set a specific week (and leave the day blank).
Use a string as first argument in as.Date() and select a specific weekday (format %w, value 0-6). There are seven possible dates in each week, therefore strptime needs more information to select a unique date. Otherwise the current day and month are returned.
> as.Date(paste("200947", "0", sep="-"), format="%Y%W-%w")
[1] "2009-11-22"
