Convert string to date and merge data sets in R

I have a column of data in the form of a string, and I need to change it to date because it is a time series.
200612010018 --> 2006-12-01 00:18
I've tried, unsuccessfully:
strptime(200612010018,format ='%Y%M%D %H:%MM')
After doing this I need to append one data set to another one.
Will I have any problems using rbind() if the column contains dates?
Thanks

You were close. You mixed up minutes (%M) and months (%m), and the day of the month is %d, not %D. The format argument also needs to match the layout of the input exactly, so the space and the colon have to go.
strptime("200612010018", format = "%Y%m%d%H%M")
# "2006-12-01 00:18:00", followed by your local time zone abbreviation
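On the rbind() part of the question: a minimal sketch, assuming two small data frames whose names (a, b) and columns (stamp, value) are made up for illustration. strptime() returns POSIXlt, which is awkward inside a data frame, so the sketch converts with as.POSIXct() before binding. As long as both data frames hold the same column type, rbind() should keep the date-time class.

```r
# Hypothetical example data; the names a, b, stamp and value are assumptions
a <- data.frame(stamp = c("200612010018", "200612010019"),
                value = c(1.2, 3.4), stringsAsFactors = FALSE)
b <- data.frame(stamp = "200612010020",
                value = 5.6, stringsAsFactors = FALSE)

# Convert the string column to POSIXct in both data sets before binding
to_time <- function(x) {
  as.POSIXct(as.character(x), format = "%Y%m%d%H%M", tz = "UTC")
}
a$stamp <- to_time(a$stamp)
b$stamp <- to_time(b$stamp)

# Append b to a; the stamp column stays POSIXct
combined <- rbind(a, b)
```

The time zone is fixed to UTC here only to make the result reproducible; use whatever zone the data was recorded in.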

Related

Adding a column to a data frame in R date format

Question: Create a new reldate column in the movies data frame in R by converting the column release_date into R date format.
This is my code:
movies <-read.csv("C:/Users/phili/Downloads/movies500.csv")
movies
movies$reldate <- format(as.Date(movies$release_date),"%d/%m/%Y")
print(movies)
Unfortunately, this code does not add a new column in R date format.
If you can't answer my question directly, please use a very similar example
In the future it would be helpful to see your data or similar example data instead of a screen shot.
Anyways, looks like there are three things that need to be fixed:
You probably don't need the format() function: it returns a character vector, not a Date. What you likely wanted is the format= argument of as.Date().
"%d/%m/%Y" tells R what layout to expect in the input. Your dates are year, month, day, so the order is wrong.
Similarly, your dates are separated by dashes, not slashes.
So it should look like this: as.Date("2018-09-12",format="%Y-%m-%d")
So in your example try this: as.Date(movies$release_date,format="%Y-%m-%d")
Or, because "%Y-%m-%d" is one of the default formats as.Date() tries, you could probably just do as.Date(movies$release_date)
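A minimal sketch of the fix, using a made-up stand-in for the movies data (the titles and dates below are assumptions, not the asker's file). It stores a real Date column and only uses format() when a display string is wanted.

```r
# Hypothetical stand-in for movies500.csv
movies <- data.frame(title = c("A", "B"),
                     release_date = c("2018-09-12", "1999-03-31"),
                     stringsAsFactors = FALSE)

# Parse the strings into a Date column
movies$reldate <- as.Date(movies$release_date, format = "%Y-%m-%d")

# format() turns Dates back into character strings, for display only
formatted <- format(movies$reldate, "%d/%m/%Y")
```

Keeping reldate as a Date means sorting and date arithmetic work; the day/month/year string can be produced whenever it is needed for printing.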

How can I convert a column of date/time data from numeric to character in R?

I have a column of data compiled from Excel files. Some of the values in the date column have changed upon binding and are now numeric date format (despite their starting out character) whilst others remain as they were (yyyy-mm-dd hh:mm). How can I change the entire column to the same date format (yyyy-mm-dd hh:mm)?
Thanks in advance.
Try strptime:
df$column <- strptime(df$column, format='%Y-%m-%d %H:%M')
OK so I finally cracked it. This is probably very obvious to everyone but just in case a newb like myself has this same issue this is what solved it for me.
I had two sets of data that I'd bound into the same table. One set of data came from XLSX files and the other from CSV files. They both presented fine in R, but when combined the CSV-derived dates lost their formatting and reverted to numeric values. I discovered that the 'date' columns in the xlsx-derived tables were 'character' whilst the 'date' columns in the csv-derived tables were 'factor with 1 level'. When combined, the character data preserved its format (i.e. looked like a date - yyyy-mm-dd hh:mm) and the factor data turned into numeric values.
So to rectify I used the following on the .csv (factor) tables before binding:
myfile$Date <- as.character(myfile$Date)
This changed the columns to character to match the others and the bind was successful and all date formatting was preserved. Thank you for your help!
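The fix above can be sketched as follows, assuming two tiny tables whose names (xlsx_part, csv_part) and values are invented for illustration. A factor stores integer codes under the hood, which is where the numbers come from; converting to character first keeps the text, and the combined column can then be parsed once.

```r
# Hypothetical tables: one with a character column, one with a factor column
xlsx_part <- data.frame(Date = "2019-03-01 08:30", stringsAsFactors = FALSE)
csv_part  <- data.frame(Date = factor("2019-03-02 09:45"))

# The integer code behind the factor level, not a date
code <- as.integer(csv_part$Date)

# Convert the factor to character before binding
csv_part$Date <- as.character(csv_part$Date)
combined <- rbind(xlsx_part, csv_part)

# Optionally parse the now-uniform text into a date-time column
combined$Date <- as.POSIXct(combined$Date, format = "%Y-%m-%d %H:%M", tz = "UTC")
```

Reading the CSV files with stringsAsFactors = FALSE avoids the factor columns in the first place.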

Trying to correctly format all the dates in RStudio imported from Excel

I imported some data from Excel into RStudio (as a csv file). The data contains date information. The date format I want is month-day-year (e.g. 2-10-16 means February 10th 2016). The problem is that Excel auto-fills 2-10-16 to 2002-10-16, and the problem persists when I import the data into R. So, my date column contains both correctly formatted dates (e.g. 2-10-16) and incorrectly formatted dates (e.g. 2002-10-16). Because I have a lot of dates, it is impossible to change everything manually. I have tried this code:
as.Date(data[,1], format="%m-%d-%y") but it gives me NA for the incorrectly formatted dates (e.g. 2002-10-16). Does anybody know how to get all the dates correctly formatted?
Thank you very much in advance!
Would you consider using a consistent date format in Excel before importing the data into R?
The best approach is likely to change how the data is captured in Excel even if it means storing the dates as strings. What you're looking for is string manipulation to then convert into a date which could potentially create incorrect data.
This will remove the first two digits and then allow conversion to a date.
as.Date(sub('^\\d{2}', '', '2002-10-16'), '%m-%d-%y')
[1] "2016-02-10"
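For a whole column with both kinds of entries, the same idea can be applied with a slightly stricter pattern. This is a sketch on invented sample values, assuming every four-digit prefix is an Excel artifact: the pattern only strips two leading digits when two more digits and a dash follow, so entries such as 12-25-15 are left alone.

```r
# Hypothetical mixed column: intended m-d-y entries plus Excel-expanded ones
raw <- c("2-10-16", "2002-10-16", "12-25-15", "2012-25-15")

# Drop the first two digits only when the entry starts with four digits and a dash
fixed <- sub("^\\d{2}(\\d{2}-)", "\\1", raw)

# Every entry is now month-day-year with a two-digit year
dates <- as.Date(fixed, format = "%m-%d-%y")
```

As the comment above warns, this rewrites data based on a guess about what Excel did, so it is worth spot-checking the result against the original file.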

Mixed Timed Data

I have a vector that contains time data, but there's a problem: some of the entries are listed as dates (e.g., 10/11/2017), while other entries are listed as dates with time (e.g., 12/15/2016 09:07:17). This is problematic for myself, since as.Date() can't recognize the time portion and enters dates in an odd format (0012-01-20), while seemingly adding dates with time entries as NA's. Furthermore, using as.POSIXct() doesn't work, since not all entries are a combination of date with time.
I suspect that, since these entries are entered in a consistent format, I could hypothetically use an if function to change the entries in the vector to a consistent format, such as using an if statement to remove time entirely, but I don't know enough about it to get it to work.
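Use lubridate. Assume the data frame is called x and the column holding the dates is called Date.
Your entries are month/day/year, so ymd() and ydm() do not apply here. mdy() handles the date-only entries, and parse_date_time() accepts several candidate orders, which covers the entries with a time as well:
library(lubridate)
x$newdate <- parse_date_time(x$Date, orders = c("mdy HMS", "mdy"))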
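If an extra package is not an option, here is a base R sketch on two sample entries taken from the question. It assumes every entry is either month/day/year or month/day/year followed by hours:minutes:seconds. Passing an explicit format to as.Date() avoids the odd 0012-style years, which come from the default year-first formats.

```r
x <- c("10/11/2017", "12/15/2016 09:07:17")

# Option 1: dates only; text after the matched date part is ignored
d <- as.Date(x, format = "%m/%d/%Y")

# Option 2: keep the time; parse the full format first, then fill in
# the entries that have no time component (they become midnight)
ts <- as.POSIXct(x, format = "%m/%d/%Y %H:%M:%S", tz = "UTC")
date_only <- is.na(ts)
ts[date_only] <- as.POSIXct(x[date_only], format = "%m/%d/%Y", tz = "UTC")
```

UTC is used only to keep the example reproducible; substitute the time zone the data was recorded in.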

Parsing out time-of-day from a date to a separate column

I have, per cell, a date value in the format 2013-01-05 11:21.
Is there a way to separate the time of day (ie 11:21) and put it in a new column, without having to manually cut and paste?
I have a lot of date values in one column, and I want to separate the time-of-day portion of these dates into a new adjacent column.
Yes - the TIMEVALUE function should do this. You may need to format the result cells (in my example: B1:B8) as time values. Using cell formatting, you can also set the output to an hh:mm format.
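If the same column ends up in R rather than Excel, a rough equivalent is sketched below. The sample values and the column names datetime and time_of_day are assumptions made for illustration.

```r
# Hypothetical values in the 2013-01-05 11:21 layout from the question
stamps <- c("2013-01-05 11:21", "2013-01-06 23:05")

# Parse once, then format only the time-of-day part into a new column
parsed <- as.POSIXct(stamps, format = "%Y-%m-%d %H:%M", tz = "UTC")
df <- data.frame(datetime = stamps,
                 time_of_day = format(parsed, "%H:%M"),
                 stringsAsFactors = FALSE)
```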
