I have been messing around with the datetime formatting, as I have data that goes through daylight saving time. The csv files I read in skip an hour in the spring and then repeat one in the fall. This seems to mess up a loop I am using. So I am trying to see if telling R which timezone the data is in (PST/PDT) will help.
I was playing with reading in the data:
read_csv("file.csv", locale = locale(tz = "America/Los_Angeles"))
Reading the data in this way applies the proper timezone, but now I have issues using write_csv.
When I write my data file, the datetime column in the csv comes out as, e.g., "2011-11-30T14:20:51Z".
I tried reinstalling R/RStudio and readr, and reading in new data without this formatting before calling write_csv; all with no luck! I am still learning so any tips would be appreciated, thank you!
I am working with an uploaded document, originally from Google Docs, downloaded to an xlsx file. This data has been hand entered & formatted to be DD-MM-YY, however this data has uploaded inconsistently (see example below). I've tried a few different things (kicking myself for not saving the code) and it left me with just removing the incorrectly formatted dates.
Any suggestions for fixing this in excel or (preferably) in R? This is longitudinal data so it would be frustrating to have to go back into every excel sheet to update. Thanks!
data <- read_excel("DescriptiveStats.xlsx")
ex:
22/04/13
43168.0
43168.0
is a correct date value.
22/04/13
is not a valid date. It is a text string. To turn it into a date you need to rewrite it in a form the sheet's locale accepts, e.g. 04/13/2022 if the string is read as year/month/day.
There are a few options. One is to change the locale so that 22/04/13 would be valid. See more over here: locale differences in google sheets (documentation missing pages)
The 2nd option is to use regex to convert it. See examples:
https://stackoverflow.com/a/72410817/5632629
https://stackoverflow.com/a/73722854/5632629
However, it is very likely that 43168 is also not the correct date. If your date before import was 01/02/2022, then after import it could be 44563, which is actually 02/01/2022 (day and month swapped), so be careful. You can check it with:
=TO_DATE(43168)
and whether a value is a valid date can be checked with:
=ISDATE("22/04/13")
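If you would rather repair the column outside the spreadsheet, here is a minimal Python sketch of the same idea: treat numeric cells as spreadsheet serial numbers and text cells as day/month/year strings. The function name `parse_mixed` and the `%d/%m/%y` format are assumptions for illustration (the format follows the DD-MM-YY entry described in the question), and the sketch cannot tell whether a serial number already has its day and month swapped.

```python
from datetime import date, datetime, timedelta

# Day zero for spreadsheet serial numbers (Excel 1900 system, Google Sheets).
# Excel counts a nonexistent 1900-02-29, so this offset is only right for
# serials from 61 (1900-03-01) onward.
SERIAL_EPOCH = date(1899, 12, 30)

def parse_mixed(value, text_format="%d/%m/%y"):
    """Return a date from either a serial number or a date string.

    `parse_mixed` and `text_format` are hypothetical names for this sketch.
    """
    try:
        serial = float(value)
    except ValueError:
        # Not numeric, so treat it as text in the assumed day/month/year form.
        return datetime.strptime(value.strip(), text_format).date()
    # Drop any fractional part (time of day) and count whole days.
    return SERIAL_EPOCH + timedelta(days=int(serial))

for raw in ["22/04/13", "43168.0", "44563"]:
    print(raw, "->", parse_mixed(raw).isoformat())
```

With the assumed format, 22/04/13 comes out as 22 April 2013; if your strings are really year/month/day, pass `text_format="%y/%m/%d"` instead.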
I am importing an excel file that has a date field with the following format: dd/mm/YYYY
It imports the data OK, however it does something strange with the dates. If the day is higher than 12, then it takes the current date.
I found this on the log:
WARNING: unrecognized date format ``, assigning current date
For example: the date 08/04/2020 is imported ok because 08 <= 12, however the date 23/02/2020 is not imported because 23 is not <= 12, and then it takes the current date.
Any ideas about what is happening?
This issue can get confusing with Excel, since typically, if the system is set for U.S. locales, Excel converts standard European date format in day-month-year to some version of month/day/year. In other words, the forward slash usually indicates American format, which is month first, as in April 9, 2022 or 04/09/22. You can force day/month/year, and apparently that was done for your file or system, but that's not an expected or standard format.
Without examining All Import code in detail, I'd say the behavior you describe implies it is presuming the standard usage. I'd therefore recommend that you try converting your date column before attempting imports, but how to do that efficiently would probably be more an Excel question than a WordPress question, unless you are able in your version of All Import to perform a custom conversion on the field.
A trick that sometimes works on this type of problem is copy-pasting Excel data into a Google Sheet, then downloading as CSV, without ever opening the file in your system before re-uploading. Google Sheets is pretty good about such conversions in general, but no guarantees: Your source file might defeat it in this case.
I had the same issue. Change the dates in your csv file to mm/dd/yyyy and when it imports it will automatically convert to dd/mm/yyyy if that is what your WP website is set up to use.
Only day numbers of 12 or lower will work the way you are importing, because there are only 12 months in a year and it is reading your dd as mm.
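To make the effect concrete, here is a small Python sketch (not the importer's actual code, which I have not seen) that parses day-first strings with a month-first format and then rewrites them as mm/dd/yyyy before import. The helper names `try_parse` and `day_first_to_month_first` are made up for this example.

```python
from datetime import datetime

def try_parse(text, fmt):
    """Parse text with fmt; return None instead of raising on failure."""
    try:
        return datetime.strptime(text, fmt).date()
    except ValueError:
        return None

def day_first_to_month_first(text):
    """Rewrite a dd/mm/yyyy string as mm/dd/yyyy."""
    return datetime.strptime(text, "%d/%m/%Y").strftime("%m/%d/%Y")

# A month-first parser accepts 08/04/2020, but as August 4, not 8 April.
print(try_parse("08/04/2020", "%m/%d/%Y"))
# 23 is not a valid month, so a month-first parser rejects this one.
print(try_parse("23/02/2020", "%m/%d/%Y"))
# Converting the column first removes the ambiguity.
print(day_first_to_month_first("23/02/2020"))
```

Note that under this reading the dates that appear to import "ok" may still be wrong, with day and month silently swapped, so it is worth spot-checking them too.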
I am new to VAEX. Also I couldn't find any solution for my specific question in google. So I am asking here hoping someone can solve my issue :).
I am using VAEX to import data from CSV file in my DASH Plotly app and then want to convert Date column to datetime format within VAEX. It successfully imports the data from csv file. Here is how I imported data from csv into VAEX:
vaex_df=vaex.from_csv(link,convert=True,chunk_size=5_000)
Below it shows the type of the Date column after importing into VAEX. As you can see, it takes the Date column as string type.
Then when I try to change the data type of the Date column with the code below, it gives an error:
vaex_df['Date']=vaex_df['Date'].astype('datetime64[ns]')
I don't know how to handle this issue, so I need your help. What am I doing wrong here?
Thanks in advance
vaex.from_csv is basically an alias for pandas.read_csv. pandas.read_csv has an argument (parse_dates) that you can use to specify which columns should be parsed as datetime. Just pass that very same argument to vaex.from_csv and you should be good to go!
I have a file that I am reading in. Everything is fine, except for one detail. In the file, dates are stored in the format "mm/dd/yyyy". When I try to read this in with fread, I'm using
fread(..., select = c(var = "Date"))
It appears fread assumes it's in the ISO format, so January 9, 2019 stored as 1/9/2019 is read in as the date "0001-09-20", September 20, year 1. Is there any way to specify a format to tell fread how to read this? It could be in select or colClasses, though select is my preference as I've already selected around 80 columns and specified their data types.
I know I could read it in as character and change it afterward. I'm trying to do as much as possible while reading in the data. If I have to change it after the fact, I will do that.
You have two options.
Read as character and convert in an extra step.
File a feature request in the data.table GitHub repo, providing your minimal example file, and wait for it to be implemented.
Personally I would go with the first one. Good thing is that you can do both.
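The first option is a language-independent pattern: read the column as text, then convert it with an explicit format. Here is a minimal sketch of it in Python rather than data.table, using only the standard library. The sample data, the `read_with_dates` helper, and the column name are all assumptions for illustration.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample standing in for the real file.
SAMPLE = "id,Date\n1,1/9/2019\n2,12/31/2019\n"

def read_with_dates(handle, date_column="Date", fmt="%m/%d/%Y"):
    """Read every field as text, then convert one column with an explicit format."""
    rows = []
    for row in csv.DictReader(handle):
        # Step 2: convert the text value; nothing is guessed about its layout.
        row[date_column] = datetime.strptime(row[date_column], fmt).date()
        rows.append(row)
    return rows

rows = read_with_dates(io.StringIO(SAMPLE))
for row in rows:
    print(row["id"], row["Date"].isoformat())
```

Because the format is stated explicitly, 1/9/2019 is read as January 9, 2019 rather than being guessed at.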
I recently ran into an issue in R where it wasn't reading the date values from my csv file. I had reviewed and revised my code many times before realizing the source of the issue was the file itself. After experimentation, I realized that the date would only read if I expanded the date column in Excel and then resaved the file. This doesn't seem logical to me: the data is stored in the spreadsheet, so while I expect Excel to make it unreadable to the human eye until the column is expanded, I did not expect it to be unreadable to another computer program, in this case, R. I feel that I got lucky discovering this and would like to understand why it works that way.
It helps to mention that I noticed a very similar issue when pasting un-expanded dates from Excel to Google Sheets; the result is cells filled with "######" instead of the actual date values.
Because Excel is the common party here I'm assuming this is an Excel issue.
What is happening with Excel to make the un-expanded date values unreadable to other software? Is this something that Excel is aware of?
[Screenshots: un-expanded dates in Excel; the misread values in R; the result of pasting into Google Sheets]
After expanding the column in the csv file and saving it, both the R issue and the Google Sheets issue went away.