I have a pandas dataframe with a datetime.date column.
I try to export the dataframe to Excel via xlwings.
I get the following error message:
AttributeError: 'datetime.date' object has no attribute 'microsecond'
I am quite confident the error occurs when the datetime.date column is translated into its Excel equivalent.
The obvious solution would be to convert the column to datetime, which should map to the Excel timestamp (16.02.2015 00:00:00 -> 42051).
Are there alternatives to that? I find it quite odd that there isn't a Date type in Excel. Are there workarounds? Adding a dummy time of day to the date just to convert the column to datetime for the sake of exporting it to Excel is not the (type-)safest solution.
This is a bug as logged here and admittedly it's a shame it hasn't been resolved yet.
However, in the case of a Pandas DataFrame, you can work around the issue for now by converting the column into a Pandas datetime column:
df.DateColumn = pandas.to_datetime(df.DateColumn)
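For date values that live outside a DataFrame, the same idea can be sketched with the standard library alone: promote each datetime.date to a datetime.datetime at midnight so that it carries the time attributes (such as microsecond) that the error message complains about. The helper names to_midnight and excel_serial are hypothetical, and the epoch of 1899-12-30 is an assumption that the workbook uses Excel's default 1900 date system.

```python
from datetime import date, datetime, time

# Assumption: the workbook uses Excel's 1900 date system, whose
# effective day 0 is 1899-12-30.
EXCEL_EPOCH = date(1899, 12, 30)


def to_midnight(d: date) -> datetime:
    """Promote a plain date to a datetime at 00:00:00 (hypothetical helper)."""
    return datetime.combine(d, time.min)


def excel_serial(d: date) -> int:
    """Whole-day Excel serial number for a plain date (hypothetical helper)."""
    return (d - EXCEL_EPOCH).days


d = date(2015, 2, 16)
dt = to_midnight(d)       # has .microsecond, unlike a plain date
serial = excel_serial(d)  # the serial quoted in the question for 16.02.2015
```

The conversion adds no information beyond a zero time of day, so the original date can always be recovered with dt.date().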
I am working with a document that was originally in Google Docs and was downloaded as an xlsx file. The data was entered by hand and formatted as DD-MM-YY, but it has been imported inconsistently (see example below). I've tried a few different things (kicking myself for not saving the code), and all I managed to do was remove the incorrectly formatted dates.
Any suggestions for fixing this in Excel or (preferably) in R? This is longitudinal data, so it would be frustrating to have to go back into every Excel sheet to update it. Thanks!
data <- read_excel("DescriptiveStats.xlsx")
ex:
22/04/13
43168.0
43168.0
is a correct date value
22/04/13
is not a valid date; it is a text string. To convert it into a date you will need to change it into 04/22/2013 (the source format is DD-MM-YY, so this value is 22 April 2013).
There are a few options. One is to change the locale so that 22/04/13 becomes valid. See more here: locale differences in google sheets (documentation missing pages)
The second option is to use regex to convert it. See these examples:
https://stackoverflow.com/a/72410817/5632629
https://stackoverflow.com/a/73722854/5632629
However, it is very likely that 43168 is also not the correct date. If your date before import was 01/02/2022, then after import it could be 44563, which is actually 02/01/2022 (day and month swapped), so be careful. You can check it with:
=TO_DATE(43168)
and whether a value is a date can be checked with:
=ISDATE("22/04/13")
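The mixed column from the question (DD/MM/YY text next to serial numbers) can also be normalised in code. The following is a minimal Python sketch of that logic, not an R or Sheets solution. It assumes the workbook uses Excel's 1900 date system, and parse_cell and swap_day_month are hypothetical helper names. Whether a given serial actually needs its day and month swapped has to be checked against the source data.

```python
from datetime import date, datetime, timedelta

# Assumption: serials come from Excel's 1900 date system (day 0 = 1899-12-30).
EXCEL_EPOCH = date(1899, 12, 30)


def parse_cell(value):
    """Return a date for either a DD/MM/YY string or an Excel serial number."""
    text = str(value).strip()
    try:
        # Text cells such as "22/04/13" were never recognised as dates.
        return datetime.strptime(text, "%d/%m/%y").date()
    except ValueError:
        # Anything else is treated as a serial number such as "43168.0".
        return EXCEL_EPOCH + timedelta(days=int(float(text)))


def swap_day_month(d):
    """Undo a day/month mix-up; only valid when the day is 12 or less."""
    return date(d.year, d.day, d.month)


cells = ["22/04/13", "43168.0"]
parsed = [parse_cell(c) for c in cells]
```

Here the serial 43168 decodes to 9 March 2018. If the hand-entered value was really 3 September 2018, swap_day_month(parsed[1]) restores it.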
I am new to VAEX, and I couldn't find a solution to my specific question on Google, so I am asking here in the hope that someone can help :).
I am using VAEX to import data from a CSV file in my Dash Plotly app and then want to convert the Date column to datetime format within VAEX. The data is imported from the CSV file successfully. Here is how I imported it into VAEX:
vaex_df=vaex.from_csv(link,convert=True,chunk_size=5_000)
Below is the type of the Date column after importing into VAEX. As you can see, it treats the Date column as a string type.
Then, when I try to change the data type of the Date column with the code below, it gives an error:
vaex_df['Date']=vaex_df['Date'].astype('datetime64[ns]')
I don't know how to handle this issue, so I need your help. What am I doing wrong here?
Thanks in advance
vaex.from_csv is basically an alias for pandas.read_csv. pandas.read_csv has an argument, parse_dates, that you can use to specify which columns should be parsed as datetime. Just pass that very same argument to vaex.from_csv (for example parse_dates=['Date']) and you should be good to go!
I am reading an Excel file using the function readxl::read_excel(), but it appears that dates are not being read properly.
In the original file, one such date is 2020-JUL-13, but it is being read as 44025.
Is there any way to get back the original date variable as in the original file?
Any pointers would be much appreciated.
Thanks,
Basically, you could try to use:
as.Date(44025)
However, you will get an error saying Error in as.Date.numeric(44025) : 'origin' must be supplied. That means all you need to know is the origin, i.e. the starting date from which to count. If you check the help page for the convertToDate function mentioned by Bappa Das, you will see that it is built on as.Date() and that the default value of its origin parameter is "1900-01-01".
Next, you can check why this is by looking up date systems in Excel. Here is a page on it:
Date systems in Excel
It states that on Windows (for Mac there are some exceptions) the starting date is indeed 1 January 1900.
Note, however, that Excel counts 1 January 1900 as day 1 (not day 0) and also includes a nonexistent 29 February 1900, so the origin to pass to as.Date() is two days earlier, "1899-12-30". convertToDate applies this adjustment internally. So, finally, if you want to use base R, you can do:
as.Date(44025, origin = "1899-12-30")
which returns "2020-07-13", the date in the original file. This is a vectorized function, so you can pass a whole column as well.
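One caveat about the origin is easy to check with a little arithmetic. Excel's 1900 date system labels 1 January 1900 as day 1 and also counts a nonexistent 29 February 1900, so counting naively from 1900-01-01 lands two days late for modern dates. The Python sketch below illustrates this with the serial from the question. The variable names are illustrative only.

```python
from datetime import date, timedelta

serial = 44025  # value read_excel returned for 2020-JUL-13

# Naive reading: add the serial to the nominal start date.
naive = date(1900, 1, 1) + timedelta(days=serial)

# Adjusted reading: shift the origin back by two days, one for
# "day 1 = 1900-01-01" and one for the phantom 1900-02-29.
adjusted = date(1899, 12, 30) + timedelta(days=serial)

offset_days = (naive - adjusted).days
```

Only the adjusted value matches the date shown in the original file, which is why the effective origin for Windows Excel serials is 1899-12-30.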
You can use the openxlsx package to convert a number to a date, like this:
library(openxlsx)
convertToDate("44025")
Or, to convert the whole column, you can use:
convertToDate(df$date)
Is there any way to prevent coercion of dates when reading data from Excel? I'm using either the readxl package or the tidyxl package. The tidyxl package is terrific, but it automatically moves the data into the date column.
Also, I was intrigued by this sentence from the Help page for the xlsx_cells() function: " xlsx_cells() attempts to infer the correct data type of each cell, returning its value in the appropriate column (error, logical, numeric, date, character). In case this cleverness is unhelpful, the unparsed value and type information is available in the 'content' and 'type' columns." It's this unparsed value that I'm looking for.
Alternatively, I'm looking for something similar to TextReader, except for XLSX files.
Any suggestions?
I'm working with an Oracle database and have a connection in RStudio using the ROracle package. For some reason some dates are converted when imported into R through either dplyr or dbGetQuery.
A date field that reads 2018-01-01 in the database turns into 2018-01-31 23:00:00 when imported. The same happens with 2018-02-01, which is converted to 2018-02-28 23:00:00.
What is really weird is that if I export the data frame to an Excel spreadsheet using openxlsx, the dates are displayed correctly again.
Does anybody know what is going on, or could you point me in the right direction? The column is formatted as POSIXct, and I've tried changing the locale and timezone. I've also tried converting the date column with as.Date, but with no luck.
The problem had to do with how ROracle converts dates when importing. The dates for the winter months were imported as CET, while the rest of the dates were imported as CEST.
Found the explanation here: https://www.oralytics.com/2015/05/r-roracle-and-oracle-date-formats_27.html
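The one-hour shift itself is easy to reproduce. The Python sketch below illustrates the general mechanism only and is not ROracle's actual code path. It assumes fixed UTC offsets of +1 hour for CET and +2 hours for CEST, and the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets stand in for the two zone labels.
CET = timezone(timedelta(hours=1), "CET")
CEST = timezone(timedelta(hours=2), "CEST")

# A date stored as midnight and tagged with the summer offset...
stored = datetime(2018, 2, 1, 0, 0, tzinfo=CEST)

# ...lands on the previous evening when displayed with the winter offset.
shown = stored.astimezone(CET)

# Converting back to the offset it was stored with recovers the date.
recovered = shown.astimezone(CEST).date()
```

This also suggests why as.Date on the shifted value did not help in the question: truncating 23:00 of the previous day keeps the wrong day unless the timezone is corrected first.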