I have a large data frame (df) that I want to export as an Excel file. I am using the "WriteXLS" function from the "WriteXLS" library. Everything is fine except for some non-English characters: for "İ", "ı" and a few others, a strange character ("�") is printed in the Excel sheet. I guess it is an encoding issue. The code I used is:
WriteXLS("df", "C:/Users/ozgur/Desktop/df.xlsx",Encoding = "UTF-8", col.names = TRUE,perl = "perl")
The base exporting function
write.table(df, "C:/Users/ozgur/Desktop/df.txt", sep = "\t", col.names = TRUE)
does not work well either: it does not preserve the data types as they are in R.
How can I overcome this problem? I would be very glad for any help. Thanks a lot.
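One thing worth checking before exporting is whether R has the character columns marked as UTF-8; a minimal sketch, assuming df has a character column named city (a hypothetical column name):
# Check how R thinks the strings are encoded ("UTF-8", "latin1" or "unknown")
Encoding(df$city)
# Force a UTF-8 marking on every character column before exporting
char_cols <- sapply(df, is.character)
df[char_cols] <- lapply(df[char_cols], enc2utf8)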
I am an R beginner trying to load a CSV file into R. However, when I load the dataset, I get strange structures and I cannot figure out how to solve the problem. My original CSV document looks like this:
1,"male","other",39.1500015258789,"yes","no","yes","yes",6.19999980926514,8.09000015258789,0.200000002980232,0.889150023460388,12,"high","other"
When loaded into R, my data frame looks like this:
"\"1" "\"\"male\"\"" "\"\"other\"\"" 39.2 "\"\"yes\"\""
I tried to remove the backslashes and quotation marks with the following call: new <- read_csv("CollegeDistance.csv", TRUE, quote = ""). However, it does not really help.
Does someone know how to solve this problem? Thanks in advance.
You have to change the second argument (col_names) from TRUE to FALSE: the file contains only data rows, so the first row cannot be used as column names.
new <- read_csv("CollegeDistance.csv", FALSE, quote = "")
I am trying to read a specific file that I copied from an SFTP location. The file is pipe-delimited. I can read the file in Excel, but R reads it as null values and the column names are duplicated. I don't know whether this is an encoding issue. I am trying to create a bash script to automate this process. Any help? Below is the link for the data.
Here's the file!
I have tried changing the encoding, but without knowing which encoding the file uses, I am struggling. I have tried read_delim, read_table, read.table, read_csv and read.csv, but none of them help.
This is the code I used to read the file:
read_delim("./Engagement_Level.txt", delim = "|")
I would like to read it as a data frame.
The issue is that the file encoding is UTF-16LE, which read_delim cannot read at present.
You could use the base read.delim and file() to specify the encoding:
read.delim(file("Engagement_Level.txt", encoding = "UTF-16LE"), sep = "|")
That will convert all the quoted numbers to numeric. If you'd rather they were type character, to deal with later:
read.delim(file("Engagement_Level.txt", encoding = "UTF-16LE"), sep = "|",
colClasses = "character")
I really recommend using Excel to build a CSV file via Data > Text to Columns; it is not an R-based approach in this context, but it is very reliable and quick.
Then use read.csv(file, sep = ",").
I have a few XML files containing Japanese characters. When I convert them to CSV, the Japanese characters are replaced by code points such as <U+FA32>.
I want to keep the characters as they are while converting to CSV or Excel format.
I tried changing the locale and the RStudio settings; nothing works. The XML contains a lot of data, and some fields hold email bodies consisting of raw text with special characters.
Here is what my code for converting XML to CSV looks like:
library(XML)
library(plyr)

for (f in file) {
  # Parse the XML, telling the parser the document is UTF-8
  doc <- xmlParse(f, useInternalNodes = TRUE, encoding = "UTF-8")
  xL <- xmlToList(doc)
  # Flatten the list into a data frame
  data <- ldply(xL, data.frame, stringsAsFactors = FALSE)
  # Write one CSV per input file, keeping the output in UTF-8
  write.csv(data, paste0(f, ".csv"), row.names = FALSE, fileEncoding = "UTF-8")
}
Please help with a solution. Ways other than R to convert the files to CSV would also be welcome.
I'm trying to import the YRBS ASCII .dat file found here to analyze in R, but I'm having trouble importing the file. I followed the recommendations here and here, but none of them seem to work. More specifically, the data still shows up as a single column/variable in R with 14,765 observations.
I've tried using the readLines(), read.table, and read.csv functions but none seem to be separating the columns.
Here are the specific commands I tried:
readLines("D:/Projects/XXH2017_YRBS_Data.dat", n=5)
read.csv("D:/Projects/XXH2017_YRBS_Data.dat", header = FALSE)
read.table("D:/Projects/XXH2017_YRBS_Data.dat", header = FALSE)
readLines and read.csv produced only one column, and read.table gave an error saying that line 1 did not have 23 elements (which I assume refers to the missing values?).
The data also starts on line 1, so I cannot use skip = 1 as some have suggested online.
How do I import this file into R so that I can separate the columns?
It is a bulky file, so I did not download it. First, get the Access file version of the data, then try the following code and compare the result against the Access data.
data <- readr::read_table2("XXH2017_YRBS_Data.dat", col_names = FALSE, na = ".")
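A quick sanity check that the columns were actually separated (dim() and str() are base R):
dim(data)          # should report many columns, not just one
str(data[, 1:5])   # inspect the first few columns and their guessed types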
I am using read.xlsx to load a spreadsheet that has several columns containing Chinese characters.
slides <- read.xlsx("test1.xlsx", sheetName = "Sheet1", encoding = "UTF8", stringsAsFactors = FALSE)
I have tried with and without specifying the encoding, and I have tried reading from a text file, a CSV file, etc. No matter the approach, the result is always:
樊志强 -> é‡åº†æ–°æ¡¥åŒ»é™¢
Is there any package/format/sys.locale setup that might help me get the right information into R?
Any help would be greatly appreciated.
You can try the readr package, and use the following code:
library(readr)
loc <- locale(encoding = "UTF-8")
slides <- read_table("filename", locale = loc)