I recently ran into an issue in R where it wasn't reading the date values from my csv file. I had reviewed and revised my code many times before realizing the source of the issue was the file itself. After experimentation, I realized that the date would only read if I expanded the date column in Excel and then resaved the file. This doesn't seem logical to me, the data is stored in the spreadsheet so while I expect Excel to make it unreadable to the human eye until the column is expanded, I did not expect it to be unreadable to another computer program, in this case, R. I feel that I got lucky discovering this and would like to understand why it works that way?
It helps to mention that I noticed a very similar issue when pasting un-expanded dates from Excel to Google Sheets; the result is cells filled with "######" instead of the actual date values.
Because Excel is the common party here I'm assuming this is an Excel issue.
What is happening with Excel to make the un-expanded date values unreadable to other software? Is this something that Excel is aware of?
Un-expanded dates
R-Misread
Google Sheets Paste
After expanding the column in the csv file and saving it, both the R issue and the Google Sheets issue went away.
Related
I'm experiencing an odd error. I have a large dataframe in R (75000 rows, 97 columns) and I need to save it out and then import it into Power Bi.
At first I just did the simple:
library(tidyverse)
write_csv(Visits,"Visits.csv")
and while it seems to export and looks fine in excel, the csv itself is all messed up when I look at the contents in Power Bi. Here's an example of what I mean:
The 'phase.x' column should only have "follow-up" or "treatment" in that column. In excel, looks great:
but that exact same file gets screwed up in Power Bi:
I figured that being a 'comma separated variable' file, there must be some extra comma somewhere, and I saved it as an .xlsx instead.
So, while in excel, I saved that .csv as an .xlsx and it opened great in Power Bi!
Jump forward a moment and instead of write_csv() in R, I use write.xlsx(). But now I get this error:
If I simply go to that file, open it in excel, save it and hit close, that error goes away and it can load into Power Bi just fine. I figure it has something to do with this question on here.
Any ideas on what I might be screwing up as I save it out of R? Somehow I can fix it in R and not have to open and save it every time?
In power BI check that your source has ignore quoted line breaks enabled. I've found this is often an issue with .csv files in PowerBI.
hi literally day one new coder
On the excel sheet, my data looks organized, but when I upload my file to R, it's not able to read the excel properly and the column headers are in the rows and the data seems randomized.
So far I have tried:
library(readxl)
dataset <-read_excel("pathname")
View(dataset)
Also tried:
dataset <-read_excel("pathname", sheet=1, colNames=TRUE)
Also tried to use the package openxlsx
but nothing is giving me the correct, organized data set.
I tried formatting my Excel to a CSV file, and the CSV file looks exactly like the data that shows up on R (both are messed up).
How should I approach this problem?
I deal with importing .xlsx into R frequently. It can be challenging due to the flexibility of the excel platform. I generally use readxl::read_xlsx() to fetch data from .xlsx files. My suggestions:
First, specify exactly the data you want to import with the range argument.
A cell range to read from, as described in cell-specification. Includes typical Excel
ranges like "B3:D87", possibly including the sheet name like "Budget!B2:G14"
Second, if there are there merged cells or other formatting challenges in column headers, I resort to setting col_names = FALSE. And supplying clean names after import with names(df) <- c("first_col", "second_col")
Third, if there are merged cells elsewhere in the spreadsheet I generally I resort to "fixing" them in excel (not ideal but easier for my use case), however, others may have suggestions on a programmatic fix.
It may be helpful to provide a screenshot of your spreadsheet.
When I go to save Excel data that I've pasted into a .csv file, I get a formatting issue and often the saved file has all the numbers in each row as one long string.
My read statement is
resids<-read.csv("C:\\Projects\residuals_Parts3.csv",header=TRUE)
Any ideas on how to fix this?
The warning you are getting is fairly standard in Excel - any formatting you've added to the file (e.g. widening columns) will get lost if you don't save the file as an excel file.. and the warning is supposed to remind you of this. Personally, the extra click or two annoys me too.
If you would like to avoid converting excel files to CSV before bringing them into R, try the openxls package. It's saved me from a lot of that monkey business.
I got the output file from a spectrometer which is supposed to be a series of decimals numbers. The file looks like this:
™pQH1JHxþFH$ÏFH÷~EHa×BHäBHßdBH.#H²Ï=HL=HŒÚ<Hê‰:HP:Hoõ9H¢Ž6Hº7H¨Y5H ?1H½¶.Hø²0HøŽ2H8æ.H.î,HŒt/H&1H͸0Hí.Hvî,H$ª+HµX+HCý*H·W+H!º+HP+HfØ(Hû'H†Ù'H|U(HQ`)Hn*H
})H'Hó%HÂ%H¶¨&H&H|•&H\
I have been reading a lot without getting to the solution. My silly question is: is that a ENVI or ASCII file? Or? How can I see the numbers I need do to use? I tried some online converters without being successful.
The starting point would be to get these numbers to develop a R code to make graphs. Thanks a lot for your time.
This looks like you opened the binary file of the mass spectrometer. Almost all vendors keep their format a secret. The only way to do this is to export it to an open format. Most vendors supply some kind of data analysis software and there are often export functions present. Most general open data formats are mzXML and mzML.
For converting have a look at the msconvert program from ProteoWizard.
If you have converted the data one of the packages in R where you can start with is XCMS.
This might be a noob question but I have a problem with exporting and importing a CSV.
I export a CSV with values (i.e. 300425.25). When I open this in Excel it is all comma delimited, as expected. When I hit Data To Columns I everytime changes my values to 300425,25 (I have tried all different combinations of decimal seperator in Advanced). This is excellent to work with in Excel but import this back into R I'm stuck with comma before the decimals, which in turn is unusable with the rest of my R code.
I never had this problem with the export of .CSV until I recently cleaned my computer and had a fresh install. I suspect it might be in R studio settings or Excel.
Can somebody help me out?
Thanks in advance.