I am working with a document that was originally in Google Docs and was downloaded as an xlsx file. The data was entered by hand and formatted as DD-MM-YY, but the dates have come through inconsistently (see the example below). I've tried a few different things (kicking myself for not saving the code), and the best I managed was simply removing the incorrectly formatted dates.
Any suggestions for fixing this in Excel or (preferably) in R? This is longitudinal data, so it would be frustrating to have to go back into every Excel sheet to update it. Thanks!
data <- read_excel("DescriptiveStats.xlsx")
ex:
22/04/13
43168.0
43168.0
is a correct date value.
22/04/13
is not a valid date; it is a text string. To convert it into a date, you will need to change it to 04/13/2022.
There are a few options. One is to change the locale so that 22/04/13 becomes valid. See more here: locale differences in google sheets (documentation missing pages)
The second option is to use a regex to convert it. See these examples:
https://stackoverflow.com/a/72410817/5632629
https://stackoverflow.com/a/73722854/5632629
However, it is very likely that 43168 is also not the correct date. If your date before import was 01/02/2022, then after import it could be 44563, which is actually 02/01/2022, so be careful. You can check the number with:
=TO_DATE(43168)
And whether a value is a valid date can be checked with:
=ISDATE("22/04/13")
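For a column that mixes Excel serial numbers with DD/MM/YY text, as in the question, the clean-up could be sketched in Python roughly as follows. The helper name parse_mixed_date and the default text format %d/%m/%y are assumptions for illustration (the format follows the DD-MM-YY description in the question). The numeric branch simply trusts the stored serial, so the day/month caveat above still needs to be checked against the source sheet.

```python
from datetime import date, datetime, timedelta

# Day 0 of the Excel 1900 date system, expressed as a real calendar date.
EXCEL_EPOCH = date(1899, 12, 30)


def parse_mixed_date(value, text_format="%d/%m/%y"):
    """Return a date for either an Excel serial number or a date string.

    text_format is an assumption (DD/MM/YY, as described in the question).
    """
    text = str(value).strip()
    try:
        serial = float(text)
    except ValueError:
        # Not numeric: treat it as a text date in the assumed format.
        return datetime.strptime(text, text_format).date()
    # Numeric: count whole days from the Excel epoch.
    return EXCEL_EPOCH + timedelta(days=int(serial))


for raw in ["22/04/13", "43168.0", 43168.0]:
    print(raw, "->", parse_mixed_date(raw))
```

If the text values turn out to be in a different order (for example YY/MM/DD), only text_format needs to change.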
I am reading an Excel file using the function readxl::read_excel(), but it appears that dates are not being read properly.
In the original file, one such date is 2020-JUL-13, but it is read as 44025.
Is there any way to get back the original date as it appears in the original file?
Any pointers would be much appreciated.
Thanks,
Basically, you could try to use:
as.Date(44025)
However, you will get an error saying Error in as.Date.numeric(44025) : 'origin' must be supplied (newer R versions instead silently assume an origin of 1970-01-01, which gives the wrong date here). That means all you need to know is the origin, i.e. the starting date from which to start counting. When you check the help page for the convertToDate function mentioned by Bappa Das, you will see that it builds on as.Date() and that the default argument for its origin parameter is "1900-01-01".
Next, you can check why this is by looking up date systems in Excel; here is a page on that:
Date systems in Excel
It says that on Windows (for Mac there are some exceptions) the starting date is indeed 1900-01-01. Note, however, that Excel counts 1900-01-01 as day 1 and also counts a nonexistent 29 February 1900, so in base R the origin has to be shifted back two days to "1899-12-30"; passing "1900-01-01" directly gives a date two days too late.
And now, finally, if you want to use base R, you can do:
as.Date(44025, origin = "1899-12-30")
This is a vectorized function, so you can pass a whole column as well.
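As a cross-check of the arithmetic, here is a minimal Python sketch that converts an Excel 1900-system serial number to a calendar date. It relies on the fact that, for serials after February 1900, counting days from 1899-12-30 reproduces Excel's dates; the function name excel_serial_to_date is hypothetical.

```python
from datetime import date, timedelta


def excel_serial_to_date(serial):
    """Convert an Excel 1900-system serial (valid for serial >= 61) to a date."""
    return date(1899, 12, 30) + timedelta(days=int(serial))


# The value from the question maps back to 13 July 2020.
print(excel_serial_to_date(44025))  # 2020-07-13

# Counting from 1900-01-01 instead lands two days late.
print(date(1900, 1, 1) + timedelta(days=44025))  # 2020-07-15
```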
You can use the openxlsx package to convert the number to a date, like this:
library(openxlsx)
convertToDate("44025")
Or, to convert a whole column, you can use:
convertToDate(df$date)
For one of my projects I need to import the dataset (a CSV file) outside of R and then assign it to an R variable from the Ruby side of the project (this is done with rinruby and already works).
In my R script I now need to create a list out of that CSV data.
The variable contains an escaped string that holds the original CSV.
data <- "\"\",\"futime\",\"fustat\",\"age\",\"resid.ds\",\"rx\",\"ecog.ps\"\n\"1\",59,1,72.3315,2,1,1\n\"2\",115,1,74.4932,2,1,1\n\"3\",156,1,66.4658,2,1,2\n\"4\",421,0,53.3644,2,2,1\n\"5\",431,1,50.3397,2,1,1\n\"6\",448,0,56.4301,1,1,2\n\"7\",464,1,56.937,2,2,2\n\"8\",475,1,59.8548,2,2,2\n\"9\",477,0,64.1753,2,1,1\n\"10\",563,1,55.1781,1,2,2\n\"11\",638,1,56.7562,1,1,2\n\"12\",744,0,50.1096,1,2,1\n\"13\",769,0,59.6301,2,2,2\n\"14\",770,0,57.0521,2,2,1\n\"15\",803,0,39.2712,1,1,1\n\"16\",855,0,43.1233,1,1,2\n\"17\",1040,0,38.8932,2,1,2\n\"18\",1106,0,44.6,1,1,1\n\"19\",1129,0,53.9068,1,2,1\n\"20\",1206,0,44.2055,2,2,1\n\"21\",1227,0,59.589,1,2,2\n\"22\",268,1,74.5041,2,1,2\n\"23\",329,1,43.137,2,1,1\n\"24\",353,1,63.2192,1,2,2\n\"25\",365,1,64.4247,2,2,1\n\"26\",377,0,58.3096,1,2,1"
And I would like to convert this to an R list.
So my approach is basically to call read.csv(data_as_string), but unfortunately the signature is read.csv(file_where_data_lies).
How can this be done?
Thanks so much!
As Therkel mentioned above, myfunc(file = textConnection(data)) did exactly what I needed. Thanks!
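The underlying idea, wrapping an in-memory string in a file-like object so that a file-based reader can consume it, is not specific to R. A minimal Python sketch of the same pattern, using only the standard library and a shortened copy of the string from the question (the variable name rows is hypothetical):

```python
import csv
import io

# A shortened copy of the escaped CSV string from the question.
data = "\"\",\"futime\",\"fustat\"\n\"1\",59,1\n\"2\",115,1\n"

# io.StringIO wraps the string in a file-like object, so a reader that
# expects an open file can consume it directly.
rows = list(csv.DictReader(io.StringIO(data)))

# Each row is a dict keyed by the header names; values stay as strings.
print(rows[0]["futime"])
```

Here io.StringIO plays the role that textConnection plays in the R solution.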
I have a pandas dataframe with a datetime.date column.
I am trying to export the dataframe to Excel via xlwings.
I get the following error message:
AttributeError: 'datetime.date' object has no attribute 'microsecond'
I am quite confident the error occurs when the datetime.date column is translated into its Excel equivalent.
The obvious solution would be to convert the column into datetime, which should map to the Excel timestamp (16.02.2015 00:00:00 -> 42051).
Are there alternatives to that? I find it quite odd that there isn't a Date type in Excel. Are there workarounds? Adding a dummy time of day to the date just to convert the column into datetime for the sake of exporting it to Excel is not the most type-safe solution.
This is a bug, as logged here, and admittedly it's a shame it hasn't been resolved yet.
However, in the case of a Pandas DataFrame, you can for now work around the issue by converting the column into a Pandas datetime column:
df.DateColumn = pandas.to_datetime(df.DateColumn)
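To illustrate why the promotion helps, here is a plain-Python sketch (standard library only, not the xlwings or pandas internals): a datetime.date carries no time fields such as microsecond, while a datetime.datetime at midnight does. The helper name to_midnight is hypothetical, and the serial calculation assumes the Excel 1900 date system with an effective epoch of 1899-12-30.

```python
from datetime import date, datetime, time


def to_midnight(d):
    """Promote a datetime.date to a datetime.datetime at 00:00:00."""
    return datetime.combine(d, time.min)


d = date(2015, 2, 16)
dt = to_midnight(d)

# A plain date has no microsecond attribute; the promoted datetime does.
print(hasattr(d, "microsecond"), hasattr(dt, "microsecond"))  # False True

# Excel serial for that day, counting whole days from the assumed epoch.
serial = (dt - datetime(1899, 12, 30)).days
print(serial)  # 42051
```

The midnight time component adds no information, so the calendar date survives the round trip unchanged.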
I am trying to load some data using the tpump utility from a Unix console.
The data has various data types, viz. text, number, decimal, and date.
Now I am stuck on which FORMAT type I need to specify in the tpump script.
I went through the tpump manual but could not work out which FORMAT type to use.
The columns are delimited by the "|" symbol.
Any info or hint on choosing the appropriate FORMAT type would be of great help.
If this is a duplicate question, please point me to the original question.
Thanks a lot in advance.