How can I convert a column of date/time data from numeric to character in R? - r

I have a column of data compiled from Excel files. Some of the values in the date column have changed upon binding and are now numeric date format (despite their starting out character) whilst others remain as they were (yyyy-mm-dd hh:mm). How can I change the entire column to the same date format (yyyy-mm-dd hh:mm)?
Thanks in advance.

Try strptime:
df$column <- strptime(df$column, format='%Y-%m-%d %H:%M')

OK so I finally cracked it. This is probably very obvious to everyone but just in case a newb like myself has this same issue this is what solved it for me.
I had two sets of data that I'd bound into the same table. One set came from XLSX files and the other from CSV files. Both displayed fine in R, but when combined the CSV-derived data lost its formatting and reverted to numeric dates. I discovered that the 'date' columns in the xlsx-derived tables were 'character' whilst the 'date' columns in the csv-derived tables were 'factor with 1 level'. When combined, the character data preserved its format (i.e. looked like a date: yyyy-mm-dd hh:mm) and the factor data turned into numeric dates.
So to rectify I used the following on the .csv (factor) tables before binding:
myfile$Date <- as.character(myfile$Date)
This changed the columns to character to match the others and the bind was successful and all date formatting was preserved. Thank you for your help!
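A minimal sketch of that fix, assuming two small stand-in tables (`xlsx_tbl` and `csv_tbl` are hypothetical names, not from the original data). It converts the factor column to character before binding and then parses the combined column as a date-time:

```r
# Hypothetical stand-ins for the xlsx-derived (character) and
# csv-derived (factor) tables described above
xlsx_tbl <- data.frame(Date = "2016-02-10 09:30", stringsAsFactors = FALSE)
csv_tbl  <- data.frame(Date = factor("2016-02-11 14:05"))

# Convert the factor to its labels, not its integer codes
csv_tbl$Date <- as.character(csv_tbl$Date)

# Both columns are now character, so the bind keeps the text intact
combined <- rbind(xlsx_tbl, csv_tbl)

# Optionally parse the whole column into a real date-time class
# (tz = "UTC" is an assumption made here for reproducibility)
combined$Date <- as.POSIXct(combined$Date, format = "%Y-%m-%d %H:%M", tz = "UTC")
```

Since R 4.0.0, read.csv() defaults to stringsAsFactors = FALSE, so the factor column may not appear at all on newer versions.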

Related

Trying to correctly format all the dates in RStudio imported from Excel

I imported some data from Excel to RStudio (csv file). The data contains date information. The date format I want is month-day-year (e.g. 2-10-16 means February 10th 2016). The problem is that Excel auto-fills 2-10-16 to 2002-10-16, and the problem persists when I import the data into R. So my date column contains both correctly formatted dates (e.g. 2-10-16) and incorrectly formatted dates (e.g. 2002-10-16). Because I have a lot of dates, it is impractical to change everything manually. I have tried this code:
as.Date(data[,1], format="%m-%d-%y") but it gives me NA for those incorrectly formatted dates (e.g. 2002-10-16). Does anybody know how to make all the dates correctly formatted?
Thank you very much in advance!
Would you consider using a consistent date format in Excel before importing the data into R?
The best approach is likely to change how the data is captured in Excel even if it means storing the dates as strings. What you're looking for is string manipulation to then convert into a date which could potentially create incorrect data.
This will remove the first two digits and then allow conversion to a date.
as.Date(sub('^\\d{2}', '', '2002-10-16'), '%m-%d-%y')
[1] "2016-02-10"
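The same idea can be applied to a whole mixed column. This is only a sketch, assuming a hypothetical vector `x` and assuming that every four-digit leading field is really a two-digit month with two spurious digits in front; the regular expression strips those digits only when the first field has four digits, so the already-correct values are left alone:

```r
# Hypothetical mixed column: one correct value, one mangled by Excel
x <- c("2-10-16", "2002-10-16")

# Drop the first two digits only when the leading field is four digits long
cleaned <- sub("^\\d{2}(\\d{2}-)", "\\1", x)

# Parse everything as month-day-year
dates <- as.Date(cleaned, format = "%m-%d-%y")
```

As the comment above warns, this kind of string manipulation can silently produce wrong dates if the assumption about the mangled values does not hold.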

Inconsistency date value when read.xlsx in R

I am using the read.xlsx function in R to read Excel sheets. All the values in date column 'A' are of the form dd/mm/yyyy. However, when using read.xlsx, the parsed values range from integers (e.g. 42283) to strings (e.g. 20/08/2015). This problem persists even when I use read.xlsx2.
I guess the inconsistent format across rows makes it hard to change the column to a single standard format. Also, it is hard to specify the column classes in read.xlsx since I have more than 100 variables.
Are there ways around this problem, and is this an Excel-specific problem?
Thank you!
This problem with date formats is pervasive and it seems like every R package out there deals with it differently. My experience with read.xlsx has been that it sometimes saves the date as a character string of numbers, e.g. "42438" as character data that I then have to convert to numeric and then to POSIXct. Then other times, it seems to save it as numeric and sometimes as character and once in a while, actually as POSIXct! If you're consistently getting character data in the form "20/08/2015", try the lubridate package:
library(lubridate)
dmy("20/08/2015")
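If the column mixes serial numbers and dd/mm/yyyy strings, one possible approach in base R is to convert each kind separately. This is a sketch under assumptions: `raw` is a hypothetical character vector, and the origin "1899-12-30" is the one commonly used for serial dates written by Excel on Windows:

```r
# Hypothetical mixed column: an Excel serial number and a dd/mm/yyyy string
raw <- c("42283", "20/08/2015")

# Values made up only of digits are treated as Excel serial numbers
is_serial <- grepl("^[0-9]+$", raw)

# Start with an all-NA Date vector and fill in each kind separately
out <- rep(as.Date(NA), length(raw))
out[is_serial]  <- as.Date(as.numeric(raw[is_serial]), origin = "1899-12-30")
out[!is_serial] <- as.Date(raw[!is_serial], format = "%d/%m/%Y")
```

Anything that matches neither pattern stays NA, which makes leftover problem rows easy to find with is.na(out).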

R problem Date column stored as Factor R can't convert it

I have downloaded the SP500 data from Yahoo Finance ticker GSPC and am trying to filter it by year, however the Date column is stored as Factor so R can't filter it. Can anyone help me convert it? I tried multiple solutions, but nothing worked.
So far I've loaded the lubridate package and used the following code, but all the values just got replaced with NAs.
as.Date(SP500$Date, format = "%m-%d-%Y")
Then I used the: SP500$Date <- ymd(SP500$Date, format = "%Y-%m-%d") code and again nothing happened. (SP500 is the name of the data frame that I stored the data in)
Also, tried using just SP500$Date <- as.Date(SP500$Date) but R says do not know how to convert it to Date.
Any help would be much appreciated! Thank you!
Classes only exist in the environment of a programming language. What likely happened was that your data (perhaps a .csv file?) got interpreted as factor by R during reading.
Everything you're trying to do here can be accomplished using the base library in R (meaning you don't need to import anything).
If you're dealing with dates:
df$date <- as.Date(df$date, format = "%Y-%m-%d")
If you're dealing with datetimes:
df$date <- as.POSIXct(df$date, format = "%Y-%m-%d %H:%M:%S")
(obviously the specific format may vary; see ?strptime for the list of format codes)
Occasionally, coercion in R may act finicky. The format parameter is somewhat unforgiving of errors. I personally frequently mistake - for /, or conflate "%Y-%m-%d" with "%d-%m-%Y", in which case the operation typically returns NA rather than the date I expected. Obviously, if the format isn't consistent in your data, instances that can't be described by the specific format you supplied will result in NAs.
Sometimes your dates are actually numbers. If they are day counts, you may need to supply '1970-01-01' to the origin parameter of as.Date(). For example, if you are iterating through a vector of Dates using a for loop, R won't honour the class of the passed Dates and will hand you the underlying number of days since 1970-01-01. If they are digit strings stored as integers (e.g. 20181111), an origin will not help; convert them to character first and use a matching format such as "%Y%m%d".
It may sound like a bandaid solution, but class coercions from common types like character are usually written well; I often pre-emptively coerce the object to character when I'm clueless about why my attempt to coerce a class failed.
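The cases above can be sketched together in base R. The vector names here are hypothetical, and the factor stands in for a Date column as read from a CSV file:

```r
# Hypothetical factor column, as produced when strings are read as factors
date_factor <- factor(c("2018-11-09", "2018-11-10"))

# Coerce to character first, then parse with an explicit format
parsed <- as.Date(as.character(date_factor), format = "%Y-%m-%d")

# A mismatched format yields NA instead of a date
mismatch <- as.Date("2018-11-09", format = "%m-%d-%Y")

# A yyyymmdd integer needs a format, not an origin
from_int <- as.Date(as.character(20181111L), format = "%Y%m%d")

# A for loop drops the Date class and yields day counts;
# the origin argument turns them back into Dates
days <- numeric(0)
for (x in parsed) days <- c(days, x)
restored <- as.Date(days, origin = "1970-01-01")
```

Filtering by year then becomes something like parsed[format(parsed, "%Y") == "2018"].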

Reading date as a character in googlevis

so this is in reference to an earlier question: Using a ordered factor as timevar in Motion Chart but it wouldn't let me leave this as a comment :/
So, I am having the same error the person earlier was having, thing is, according to the answer: "the documentation says that timevar argument can't handle factor. It can handle character if and only if they are in a particular format, that is (for example): 2010Q1." Thing is, I already have my data formatted like that, in a csv file: http://www.filedropper.com/texasgdp
Time GDP
2006Q1 500
2006Q2 1000
2006Q3 2000
2006Q4 2600....etc
So, if this is the character format, why am I still getting the same error? Is there a way I could just have RStudio reread that entire column as "character" rather than as a factor?
Yes, you can use this as you read the data in:
read.csv('file path',stringsAsFactors=FALSE)
Or you can convert the column to a character vector after reading the data:
df$Time <- as.character(df$Time)
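A small self-contained sketch of both options, using inline text in place of the CSV file (the `text` argument and the comma-separated layout are assumptions made so the example runs without the original file):

```r
# Inline stand-in for the CSV file, with the sample rows from the question
csv_text <- "Time,GDP\n2006Q1,500\n2006Q2,1000\n2006Q3,2000\n2006Q4,2600"

# Option 1: keep strings as character while reading
df1 <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Option 2: the column arrives as a factor and is converted afterwards
df2 <- read.csv(text = csv_text, stringsAsFactors = TRUE)
df2$Time <- as.character(df2$Time)
```

In R 4.0.0 and later, stringsAsFactors already defaults to FALSE, so the first option is the default behaviour there.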

Convert string to date and merge data sets R

I have a column of data in the form of a string, and I need to change it to date because it is a time series.
200612010018 --> 2006-12-01 00:18
I've tried, unsuccessfully:
strptime(200612010018,format ='%Y%M%D %H:%MM')
After doing this I need to append one data set to another one.
Will I have any problems using rbind() if the column contains dates?
Thanks
You were close. You mixed up minutes (%M) and months (%m), and the format argument needs to match the layout of the input exactly, with no extra spaces or colons. It is also safer to pass the value as a string:
strptime("200612010018", format = "%Y%m%d%H%M")
# "2006-12-01 00:18:00"
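On the rbind() part of the question, a sketch under assumptions: the data frames `a` and `b` and their columns are hypothetical, and tz = "UTC" is chosen only to make the result reproducible. Converting to POSIXct (rather than the POSIXlt that strptime returns) before binding tends to be the more convenient choice for data frame columns:

```r
# Hypothetical data sets with the timestamp stored as a 12-digit string
a <- data.frame(stamp = "200612010018", value = 1, stringsAsFactors = FALSE)
b <- data.frame(stamp = "200612010118", value = 2, stringsAsFactors = FALSE)

# Convert each column to POSIXct before binding
a$stamp <- as.POSIXct(a$stamp, format = "%Y%m%d%H%M", tz = "UTC")
b$stamp <- as.POSIXct(b$stamp, format = "%Y%m%d%H%M", tz = "UTC")

# rbind() works as long as both columns have the same class
both <- rbind(a, b)
```

Binding first and converting the combined character column afterwards should work equally well, as long as both columns are character at the time of the bind.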
