Messy dates after importing from Excel into R - r

I have trouble reading a date column in the correct format after importing an Excel file.
If I read the Date column as character, some dates come out messy, e.g. 44655, 44565, 31/03/2022, 30/03/2022...
However, if I read the Date column as a date, some values become NA.
What is the problem? How can I reformat all values in the Date column as d/m/y?

This normally occurs when dates have been entered into the Excel spreadsheet in multiple formats. I suspect that some cells in your spreadsheet are formatted as dates and others as text. Cells formatted as dates are stored as serial day numbers (which is where values like 44655 come from), while text cells keep the literal string. In my experience, it's simplest to have the dates entered in Excel as text.
If you change the column format to text in Excel, you will see which entries are mixed.
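
If fixing the spreadsheet is not an option, the mixed column can also be repaired after import. The following is a minimal sketch in base R, assuming the column was read as character, that purely numeric entries are Excel serial day numbers from the default 1900 date system (origin 1899-12-30 in R), and that the remaining entries are d/m/y text. The helper name parse_mixed_dates is hypothetical.

```r
# Hypothetical helper: convert a character vector that mixes Excel serial
# numbers ("44655") and d/m/Y strings ("31/03/2022") into Date values.
parse_mixed_dates <- function(x) {
  x <- trimws(as.character(x))
  out <- as.Date(rep(NA_real_, length(x)), origin = "1970-01-01")

  # Entries made only of digits (optionally with a fractional part)
  # are treated as Excel serial day numbers.
  is_serial <- grepl("^[0-9]+(\\.[0-9]+)?$", x)
  out[is_serial] <- as.Date(floor(as.numeric(x[is_serial])),
                            origin = "1899-12-30")

  # Everything else is assumed to be day/month/year text.
  out[!is_serial] <- as.Date(x[!is_serial], format = "%d/%m/%Y")
  out
}

raw <- c("44655", "44565", "31/03/2022", "30/03/2022")
dates <- parse_mixed_dates(raw)
formatted <- format(dates, "%d/%m/%Y")
print(formatted)
# "04/04/2022" "04/01/2022" "31/03/2022" "30/03/2022"
```

Entries that match neither pattern come back as NA, which makes it easy to spot any further formats hiding in the column.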

Related

How to stop R from automatically converting string to date?

I have a large Excel file I'm reading into R for a sports analytics project. One of the columns is height, and the format in Excel is ft-in (e.g. 5-10). This is fine in Excel because that column was specifically formatted to take plain text and not convert it to a date. But when I read the CSV into R, the data frame auto-converts it to a date. Is there a command/parameter to stop it from doing this?

How do I control the format of columns when I use the write.csv function?

I've created a dataframe in R, and one of my columns converts a date such as 01/08/2018 (dd/mm/yyyy) into the text form Aug-18 (mmm-yy). However, when I write this to CSV using the write.csv function, Excel automatically converts it to a date.
Is there a way I can specify the column type to be "Text" so that Excel doesn't change it to date format?
One simple trick that IMHO gets far too little attention is to pad your date columns with whitespace, e.g. df$mydate <- paste(' ', df$mydate, sep=''). This stops Excel from interpreting the text as dates.
I have started routinely doing that for all kinds of risky columns when doing R<->Excel transformations.
Taken from here: https://support.office.com/en-us/article/stop-automatically-changing-numbers-to-dates-452bd2db-cc96-47d1-81e4-72cec11c4ed8
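
The padding trick can be sketched end to end as follows. This is only an illustration in base R: the data frame, its column names, and the temporary file path are made up, and how Excel treats the padded cells depends on the Excel version and settings.

```r
# Made-up example data with an mmm-yy text column that Excel tends to
# reinterpret as a date when opening the CSV.
df <- data.frame(id = 1:2,
                 mydate = c("Aug-18", "Sep-18"),
                 stringsAsFactors = FALSE)

# Pad the risky column with a leading space before writing.
df$mydate <- paste0(" ", df$mydate)

csv_path <- tempfile(fileext = ".csv")
write.csv(df, csv_path, row.names = FALSE)

# Inspect what was actually written to disk.
csv_lines <- readLines(csv_path)
print(csv_lines)

# When reading the file back into R, strip the padding again.
back <- read.csv(csv_path, stringsAsFactors = FALSE)
restored <- trimws(back$mydate)
print(restored)
```

The cost of the trick is that every consumer of the file has to strip the extra space, as done with trimws() above.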

Read excel data as is using readxl in R

I have to read an Excel file in R. The Excel file has a column with values such as 50%, 20%... and another column with dates in the format "12-December-2017", but R converts the data in both columns.
I am using the readxl package and I specified in the col_types parameter that all columns should be read as text. When I check the dataframe, all the column types are character, but the percentages have changed to decimals and the dates to numbers.
excelfile2<-read_excel(filePath,col_types=rep("text",8))
I want to read the Excel file as is. Any help will be appreciated.
This is because what you see inside Excel is not what is actually stored.
For example, if Excel displays "12-December-2017", what is stored in reality is a serial number counting days (in the default 1900 date system, converting it in R requires the origin 1899-12-30). Likewise, 50% is stored as 0.5.
My suggestion is to open the Excel file with the TextReader so you have a grasp of what you are really reading in R.
Then, you can either define everything as text in Excel or apply some transformations in R to convert the day counts into a POSIXct format.
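
The transformation in R can be sketched like this. It is a minimal base R example under stated assumptions: the input vectors are made-up stand-ins for what read_excel returns with col_types = "text", and the workbook is assumed to use Excel's default 1900 date system, for which the matching origin in R is 1899-12-30.

```r
# Made-up stand-ins for columns read as text: Excel serial day numbers
# and percentages stored as fractions.
serial_text  <- c("43081", "43082")
percent_text <- c("0.5", "0.2")

# Serial day numbers -> Date.
dates <- as.Date(as.numeric(serial_text), origin = "1899-12-30")

# Serial day numbers -> POSIXct (seconds since the origin, in UTC).
stamps <- as.POSIXct(as.numeric(serial_text) * 86400,
                     origin = "1899-12-30", tz = "UTC")

# Fractions -> percentage labels as displayed in Excel.
percent_labels <- paste0(as.numeric(percent_text) * 100, "%")

iso_dates <- format(dates, "%Y-%m-%d")
print(iso_dates)       # "2017-12-12" "2017-12-13"
print(percent_labels)  # "50%" "20%"
```

Month names produced by format(dates, "%d-%B-%Y") depend on the session locale, which is why the sketch prints ISO dates instead.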

What is the correct date format for writexl

What is the correct date format for the new writexl package? I tried it on lubridate dates, and the resulting Excel spreadsheet contains strings of the form yyyy-mm-dd (i.e. not Excel dates).
The purpose of the writexl package is essentially to create a mirror image of an R table in an Excel file. Dates held in one of the usual R date/time classes (Date, POSIXct, etc.) are therefore not reformatted from YYYY-mm-dd to d/m/y on export. If you'd like a specific display format such as d/m/y in the Excel file, convert the column to formatted text before exporting, for example with the strftime() function:
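library(writexl)
# Format the dates as dd/mm/yy text before writing; the cells will hold strings, not Excel dates
write_xlsx(
  data.frame(date = strftime(c("2017-09-10", "2017-09-11", "2017-09-12"), "%d/%m/%y")),
  "~/Downloads/mydata.xlsx")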
Output (in xlsx file):
date
10/09/17
11/09/17
12/09/17
Edit:
If you'd like the column to be a real Excel date once it's in the new file, convert it with as.POSIXct() instead of formatting it as text.
It worked when I converted my dates to POSIXct dates using as.POSIXct.
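
A minimal sketch of the POSIXct route, based on the report above that converting with as.POSIXct worked. The example dates are the ones from the answer, while the UTC time zone and the temporary output path are assumptions made for illustration. The write step only runs if writexl is installed.

```r
# Keep the column as POSIXct rather than formatted text so that it is
# exported as a date/time value instead of a string.
df <- data.frame(
  date = as.POSIXct(c("2017-09-10", "2017-09-11", "2017-09-12"), tz = "UTC")
)

out_path <- tempfile(fileext = ".xlsx")  # assumed location, adjust as needed
if (requireNamespace("writexl", quietly = TRUE)) {
  writexl::write_xlsx(df, out_path)
}
```

The display format (for example d/m/y) is then a matter of cell formatting in Excel, not of the values written from R.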

Excel download date format changing in different time zone

How can I keep the date format of an Excel file the same when it is downloaded in different time zones?
The date column in my downloaded file is formatted as MM-dd-YYYY.
The issue is that when I download the Excel file in the US region, the dates are not converted to a different format, but when I download the same file in the UK region, the dates are converted to dd-MM-YYYY.
The date 12-01-2015 is converted to 01-12-2015.
I want to keep the date format of the Excel file as MM-dd-YYYY in all regions.
I am generating the file using ASP.NET C#.
Thanks in advance.
This is because Excel selects the date format based on the program's locale setting. I would imagine your options are:
a: Change the cell format to what you need in each location, e.g. in VBA the code would be something like Selection.NumberFormat = "m/d/yyyy;#"
b: Store your date data in text format. So literally a text string in mm-dd-yyyy format. You could still easily parse it later.
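
To illustrate the "parse it later" part of option b, here is a small sketch in R (chosen to match the rest of this page, not the ASP.NET code that generates the file). The stored strings are made-up examples. The point is that a text value is unambiguous only when the reader states the format explicitly.

```r
# Made-up date strings stored as text in mm-dd-yyyy order.
stored <- c("12-01-2015", "03-15-2016")

# Parsing with the explicit month-day-year format recovers the intended dates.
parsed <- as.Date(stored, format = "%m-%d-%Y")
iso <- format(parsed, "%Y-%m-%d")
print(iso)  # "2015-12-01" "2016-03-15"

# Reading the first string as day-month-year silently yields another date,
# which is the same mix-up described in the question.
misread <- format(as.Date(stored[1], format = "%d-%m-%Y"), "%Y-%m-%d")
print(misread)  # "2015-01-12"
```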
