Read Excel file in R with multiple date formats - r

I'm trying to read an Excel file in R that has two columns containing dates. My problem is that when I view the data in R, most of the dates are in the correct format, but some have been turned into numbers that don't make sense at all. I've attached images showing the different outputs from R and Excel.
(Only pay attention to the columns "ArrivalDate" and "ActlFlightDate".)
Output seen from R
Output seen from Excel
My question is: how can I, in R, turn those numbers back into the dates they are supposed to be? Note that the elements in those columns are of class character.
Thank you in advance!

The dates in your Excel file are stored in different formats. That is why some sit on the left side of the column and some on the right side in the spreadsheet: Excel right-aligns true date values and left-aligns text. The column is probably a mix of cells with an actual Excel date format and cells formatted as Text or General. Copy the columns to new columns, select each new column, right-click, and change the cell format until all the dates are uniform. Then you can import into R and get consistent date data.
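If fixing the spreadsheet is not an option, the mixed column can also be repaired after import. The following is only a minimal sketch, not the asker's actual data: the vector x and its values are hypothetical, the text dates are assumed to look like "2016-02-10", and the numbers are assumed to be Excel serial day counts from the default Windows date system, for which the usual origin in R is "1899-12-30".

```r
# Hypothetical character column mixing text dates and Excel serial numbers
x <- c("2016-02-10", "42410", "2016-03-01")

# Treat purely numeric strings as Excel serial day counts
is_serial <- grepl("^[0-9]+(\\.[0-9]+)?$", x)

# Start with an all-NA Date vector and fill in each kind separately
out <- rep(as.Date(NA), length(x))
out[is_serial]  <- as.Date(as.numeric(x[is_serial]), origin = "1899-12-30")
out[!is_serial] <- as.Date(x[!is_serial], format = "%Y-%m-%d")  # assumed text format

print(out)
```

If the text dates in the real file use another layout, the format string in the last as.Date() call would need to change accordingly.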

Related

Manipulating Data/Data Wrangling of excel sheets in R and changing layout

I have over a hundred Excel files that generally share the same format:
Screenshot of format
They are all different lengths, too!
I want to combine the data from all the different Excel sheets into this format:
Desired format
So far my thinking is:
Create a new column in every Excel sheet titled Day.
Combine the Day column and the Steps column in some kind of join that merges the two column titles into something like day{1}steps, with the value where they intersect becoming the data in that row.
Is this the right thought process on how to do that? Any tips on how to get to my end result?

Trying to correctly format all the dates in RStudio imported from Excel

I imported some data from Excel into RStudio (as a csv file). The data contains date information. The date format I want is month-day-year (e.g. 2-10-16 means February 10th, 2016). The problem is that Excel auto-fills 2-10-16 to 2002-10-16, and the problem persists when I import the data into R. So my date column contains both correctly formatted dates (e.g. 2-10-16) and incorrectly formatted dates (e.g. 2002-10-16). Because I have a lot of dates, it is impossible to change everything manually. I have tried this code:
as.Date(data[,1], format="%m-%d-%y")
but it gives me NA for the incorrectly formatted dates (e.g. 2002-10-16). Does anybody know how to get all the dates correctly formatted?
Thank you very much in advance!
Would you consider making the date format consistent in Excel before importing the data into R?
The best approach is likely to change how the data is captured in Excel even if it means storing the dates as strings. What you're looking for is string manipulation to then convert into a date which could potentially create incorrect data.
This will remove the first two digits and then allow conversion to a date.
as.Date(sub('^\\d{2}', '', '2002-10-16'), '%m-%d-%y')
[1] "2016-02-10"
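Applied to a whole column, the pattern '^\\d{2}' would also strip the month from a correctly formatted value such as 12-25-16. A hedged variant is sketched below: it only drops the first two digits when four digits precede the first dash. The vector dates_raw is hypothetical sample data, and the sketch assumes every four-digit prefix really is an auto-filled month-day pair, which cannot be verified from the strings alone.

```r
# Hypothetical sample: correct values and Excel's auto-filled four-digit form
dates_raw <- c("2-10-16", "2002-10-16", "12-25-16")

# Drop the first two digits only when four digits precede the first dash,
# so "2002-10-16" becomes "02-10-16" while "12-25-16" is left untouched
fixed <- sub("^\\d{2}(\\d{2}-)", "\\1", dates_raw)

# Parse everything as month-day-year
parsed <- as.Date(fixed, format = "%m-%d-%y")
print(parsed)
```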

Values in column change when reading in excel file

I am trying to read in an Excel file using the readxl package; the file has a column of timestamps. For this particular file, the times are irregularly spaced, so they may look like 00:01, 00:03, 00:04, 00:08, 00:10, etc., and I need these timestamps to be read into R correctly. Instead, the timestamps turn into seemingly random decimals.
I looked in the Excel file (which is output by a different program), and the column type within Excel appears to be "custom". When I convert that "custom" column to "text", it shows me the decimals that are actually stored and read into R. Is there a way to load the timestamps instead of the decimals?
I have already tried using col_types to make the column text or integer, but the values are still read in as decimals rather than timestamps.
library(readxl)  # cell_cols() is used unqualified below, so attach the package

df <- read_xlsx(
  "./Data/LAFLAC_60sec_9999Y1_2019-06-28.xlsx",
  range = cell_cols("J:CE"),
  col_names = TRUE
)
The decimals represent the time as a fraction of a day since midnight. So 1:00 am is 0.041667, or 1/24.
In R, you can convert those numbers back into timestamps in a variety of ways.
Try this page for more info
https://stackoverflow.com/a/14484019/6912825
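One base-R way to do the conversion is sketched here. The vector frac is hypothetical sample data standing in for the imported column, and the values are assumed to be pure time-of-day fractions (between 0 and 1) with no date part.

```r
# Hypothetical day fractions as they might arrive from the spreadsheet
frac <- c(0.041667, 0.000694, 0.5)

# A full day has 86400 seconds; round to absorb floating-point noise
secs <- as.integer(round(frac * 86400))

# Split into hours and minutes and format as HH:MM
hh <- secs %/% 3600L
mm <- (secs %% 3600L) %/% 60L
fmt <- sprintf("%02d:%02d", hh, mm)
print(fmt)
```

The result is a character vector, which is fine for display; for arithmetic on the times it is usually easier to keep working with secs.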

R: Preventing strings (date-alike) from converting to numbers

I have a set of gene names in R and, although the names resemble dates (e.g. Sept-4), I want to write them to an Excel file without having them converted into numbers.
Thank you in advance.
If you know which cells the names will be copied into, format those cells as Text beforehand. Excel should then leave the content as is.

Formatting Header to Append to Data Frame in R

I'm attempting to create a specifically formatted header to append to a data frame I have created in R.
The essence of my problem is that it seems increasingly difficult (maybe impossible?) to create a header that breaks away from a typical one row by one column framework, without merging the underlying table, using the dataframe concept in R.
The issue stems from me not being able to figure out a way to import this particular format of a header into R through methods such as read.csv or read.xlsx which preserve the format of the header.
Reading in a header of this format into R from a .csv or .xlsx is quite ugly and doesn't preserve the original format. The format of the header I'm trying to create and append to an already existing dataframe I have of 17 nameless columns in R could be represented in such a way:
Here the number series 1 - 17 represents the existing data frame of 17 nameless columns that I have created in R and to which I wish to add this header. Could anyone point me in the right direction?
You are correct that this header will not work within R. A data frame only supports a single row of header values and won't do anything akin to a merged cell in Excel.
However, if you simply want to export your data to a .csv or .xlsx file (use write.csv), you can just copy your header in afterwards; that could work.
OR
You could add in a factor column to your data frame to capture the information contained in the top level of your header.
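The export route can also be scripted rather than done by hand. The sketch below is only illustrative: the two header rows, the group labels, and the four-column data frame are all hypothetical stand-ins (the real header has 17 columns and its layout is not shown here). It writes the header lines first and then appends the data without column names.

```r
# Hypothetical stand-in for the nameless data frame (4 columns instead of 17)
df <- data.frame(V1 = 1:2, V2 = 3:4, V3 = 5:6, V4 = 7:8)

# Hypothetical two-level header; the empty fields leave room for the
# cells a top-level label would span in a spreadsheet
header_rows <- c("Group A,,Group B,",
                 "x,y,x,y")

path <- tempfile(fileext = ".csv")

# Write the header lines, then append the data rows without column names
writeLines(header_rows, path)
write.table(df, path, sep = ",", append = TRUE,
            col.names = FALSE, row.names = FALSE)

print(readLines(path))
```

The resulting file is meant for viewing in a spreadsheet; reading it back into R would again need the header rows to be skipped or handled separately.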
