Writing to a CSV file produces errors - R

I am using R to analyze some text data. After doing some aggregation, I had a new dataframe I wanted to write to a CSV file so I can use it in other analyses. The dataframe looks correct in R - it only has 2 columns, both with text data - but once I write the CSV and open it, the text is scattered across different columns. Here is the code I was using:
write.csv(new_df, "4.19 Group 1_agg by user try 2.csv")
I tried adding an extra bit of code to specify that it should use UTF-8, since I've heard this could be an encoding error, so the code then looked like this:
write.csv(new_df, "4.19 Group 1_agg by user try 2.csv", fileEncoding = "UTF-8")
I also tried reading the file back in differently (using fread instead of read.csv).
Still, the CSV file looks wrong/messy in many places. Here is what it should look like:
[screenshot: the expected two-column layout]
This is what it looks like currently:
[screenshot: text scattered across several columns]
Again, I think the error must be in writing the CSV file, because everything looks good in R when I check it with names() and head(). Any help is appreciated, thank you!
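One thing worth checking (a diagnostic sketch; new_df is the dataframe from the question, everything else is an assumption): text fields containing embedded commas, quotes, or line breaks are quoted by write.csv() by default, but stray line breaks inside fields can still scramble the layout when the file is opened in a spreadsheet. Searching for those characters, and stripping embedded line breaks before writing, may narrow it down:

# Does either text column contain characters that could break the layout?
sapply(new_df, function(col) any(grepl("[,\"\r\n]", col)))

# One workaround: replace embedded line breaks with spaces, then write
new_df[] <- lapply(new_df, function(col) gsub("[\r\n]+", " ", col))
write.csv(new_df, "4.19 Group 1_agg by user try 2.csv",
          row.names = FALSE, fileEncoding = "UTF-8")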

Related

Using large dput output

I have a dataframe I want to share using dput.
I want all of it and not just some of it.
When I use dput(), the result is so big that it does not all fit in the RStudio console.
I tried assigning the dput output to a variable and saving that variable as a txt file, but it did not work; the txt file was made of unreadable characters.
Does anyone know a way of copying the whole length of a dput even though it's huge, so that I can then share it on here with a Google Doc link or whatever?
You are looking for dump().
try:
dump("iris", "iris.txt")

Read a weird txt file as data in R

I'm trying to get R to read data from a txt file, but the file is not properly made, so it's giving me lots of errors.
Ideally, I'd like to be able to extract a dataframe to work with from this, but I truly don't know how.
The files are all in this link.
An example with any of them would work.
Thanks a lot!
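Without seeing the files it's hard to be specific, but a common starting point for malformed text files (a sketch; the filename and the whitespace delimiter below are assumptions) is to read the raw lines and parse them by hand:

# Read the file as plain lines, with no parsing at all
raw <- readLines("weird_file.txt")
head(raw)  # inspect the structure before deciding how to split

# Split each line on runs of whitespace (adjust the pattern as needed)
parts <- strsplit(trimws(raw), "\\s+")

# If every line yields the same number of fields, bind into a dataframe
df <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)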

When I upload my Excel file to R, the column titles are in the rows and the data seems all jumbled. How do I fix this?

Hi, literally day one new coder here.
On the Excel sheet, my data looks organized, but when I upload the file to R, it's not able to read the Excel file properly: the column headers are in the rows and the data seems randomized.
So far I have tried:
library(readxl)
dataset <- read_excel("pathname")
View(dataset)
Also tried:
dataset <- read_excel("pathname", sheet = 1, col_names = TRUE)
Also tried to use the package openxlsx
but nothing is giving me the correct, organized data set.
I tried converting my Excel file to a CSV file, and the CSV file looks exactly like the data that shows up in R (both are messed up).
How should I approach this problem?
I deal with importing .xlsx into R frequently. It can be challenging due to the flexibility of the Excel platform. I generally use readxl::read_xlsx() to fetch data from .xlsx files. My suggestions:
First, specify exactly the data you want to import with the range argument.
A cell range to read from, as described in cell-specification. Includes typical Excel ranges like "B3:D87", possibly including the sheet name like "Budget!B2:G14".
Second, if there are merged cells or other formatting challenges in the column headers, I resort to setting col_names = FALSE and supplying clean names after import with names(df) <- c("first_col", "second_col").
Third, if there are merged cells elsewhere in the spreadsheet, I generally resort to "fixing" them in Excel (not ideal, but easier for my use case); others may have suggestions on a programmatic fix.
It may be helpful to provide a screenshot of your spreadsheet.
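A sketch of the first two suggestions combined (the filename, sheet name, range, and column names below are all placeholders):

library(readxl)
# Read only the rectangle of cells that actually holds the data,
# skipping merged or decorative header rows above it
dataset <- read_xlsx("pathname.xlsx",
                     range = "Sheet1!A2:C100",
                     col_names = FALSE)
# Supply clean column names yourself
names(dataset) <- c("first_col", "second_col", "third_col")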

R misreading CSV files after modifications in Excel

This is more of a curiosity.
Sometimes I modify CSV files in Excel rather than R (suppose I manage to find a missing piece of info and I type it into the CSV file), of course maintaining commas and quotes as they were.
Every time I do this, R becomes unable to read the CSV file, i.e. it imports a single column, as it appears in Excel, rather than separating the values (no options like sep= or quote= change this).
Does anyone know why this happens?
Thanks a lot
An example
This was readable:
state,"city","county"
AK,"Anchorage",""
AK,"Haines",""
AK,"Juneau","Juneau"
After adding the missing info under "county", R fails to import it as a data frame, reading it instead as a single vector.
state,"city","county"
AK,"Anchorage","Anchorage"
AK,"Haines","Haines"
AK,"Juneau","Juneau"
Edit:
I'm just running the basic read.csv:
df <- read.csv("C:/directory/df.csv")
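One way to see what Excel actually wrote is to inspect the raw lines before parsing (a diagnostic sketch; the path is the one from the question). Excel sometimes re-saves a CSV with the locale's list separator (often a semicolon) or adds a byte-order mark, either of which can make read.csv() see one big column:

# Look at the first few raw lines - are the commas still there?
readLines("C:/directory/df.csv", n = 3)

# If Excel switched to semicolons, read.csv2() expects that format:
df <- read.csv2("C:/directory/df.csv")

# If Excel added a UTF-8 byte-order mark, declare the encoding:
df <- read.csv("C:/directory/df.csv", fileEncoding = "UTF-8-BOM")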

Tab Delimited Data

Since I'm not able to find/decipher solutions to my problem, I'll try asking.
Up until now I've only worked with other people's data (.csv files) in RStudio, but this time around it's my own data, which I entered into Excel. All entries are tab-delimited, and I wouldn't enter data into Excel any other way. After some googling, it seems like R and .xlsx files aren't best friends, so I saved my file in various other formats, .csv being one of them. Thus I have a tab-delimited .csv file.
The problem with loading tab-delimited .csv files also features here, but my problem is not with "reading some of the numeric variables in as factors" (whatever that means), but that the data is loaded as semicolon-separated in R:
data <- read.table(file.choose(), sep="\t", header=T)
View(data)
Date.Miles.Time
2015-08-10;5;45
Apart from the improper formatting, there are 29 observations rather than 24; the last five entries are just ;;. Again, my problem is not the same as in the link, but I figured there was no harm in trying Justin's suggestion in the answer, i.e. options(stringsAsFactors=FALSE), and then running the above again, but it achieves nothing.
read.csv() and read.delim() yield the same result. Any suggestions?
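The output shown is what you would expect if the file is actually semicolon-separated rather than tab-delimited: Excel's CSV export uses the locale's list separator, which in many locales is ';'. A sketch of reading it that way (keeping file.choose() from the question):

# read.csv2() assumes ';' as the separator (and ',' as the decimal mark)
data <- read.csv2(file.choose())

# equivalently, be explicit:
data <- read.table(file.choose(), sep = ";", header = TRUE)

# the trailing ";;" lines come through as empty rows and can be dropped:
data <- data[rowSums(data == "" | is.na(data)) < ncol(data), ]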
