I am trying to read .tex files containing LaTeX code and paste their content into different .tex files, depending on the results of calculations in R.
I must not change a single character of the .tex files by processing them with R. I am looking for a way to stop R from interpreting the content of the files and make R just "copy" the files character for character.
Example R file:
cont <- paste(readLines("path/to/file/a.tex"), collapse = "\n")
write.table(cont, file = "Mother.tex", append = FALSE, quote = FALSE, sep = "",
            eol = "\n", na = "NA", dec = ".", row.names = FALSE,
            col.names = FALSE, qmethod = c("escape", "double"),
            fileEncoding = "")
cont2 <- paste(readLines("path/to/file/b.tex"), collapse = "\n")
write.table(cont2, file = "Mother.tex", append = TRUE, quote = FALSE, sep = "",
            eol = "\n", na = "NA", dec = ".", row.names = FALSE,
            col.names = FALSE, qmethod = c("escape", "double"),
            fileEncoding = "")
cont3 <- paste(readLines("path/to/file/c.tex"), collapse = "\n")
write.table(cont3, file = "Mother.tex", append = TRUE, quote = FALSE, sep = "",
            eol = "\n", na = "NA", dec = ".", row.names = FALSE,
            col.names = FALSE, qmethod = c("escape", "double"),
            fileEncoding = "")
cont4 <- paste(readLines("path/to/file/d.tex"), collapse = "\n")
write.table(cont4, file = "Mother.tex", append = TRUE, quote = FALSE, sep = "",
            eol = "\n", na = "NA", dec = ".", row.names = FALSE,
            col.names = FALSE, qmethod = c("escape", "double"),
            fileEncoding = "")
Example LaTeX file a:
\documentclass{beamer}
\usepackage{listings}
\lstset{basicstyle=\ttfamily, keywordstyle=\bfseries}
\begin{document}
Example LaTeX file b:
\begin{frame}
Example LaTeX file c:
content based on values in r
\end{frame}
Example LaTeX file d:
\end{document}
I have two problems now:
wrong escape information for readLines
a non-UTF-8 keyword in files b, c, and d
LaTeX is not able to compile successfully, because there is non-UTF-8 information inside the Mother file after processing Mother with R.
If I copy and paste the content of each file manually, LaTeX compiles successfully. Since LaTeX complains about bad UTF-8 while no wrong characters are shown in the TeXLive IDE, I suspect R adds information to the files which is not displayed by TeXLive.
I do not understand why something "invisible" is added to my Mother .tex file that is not shown inside TeXLive.
Assuming you want to store the content of the .tex file in a string:
cont <- paste(readLines("path/to/file/file.tex"), collapse = "\n")
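If the goal is to copy the files without R touching a single byte, you can also skip text mode entirely and copy raw bytes through a binary connection. A minimal sketch (the file names and contents below are illustrative stand-ins for your a.tex through d.tex):

```r
# Byte-for-byte copy: read each fragment as raw bytes and write them to the
# target through a binary connection, so R performs no encoding conversion
# and adds no BOM.
dir <- tempdir()
parts <- file.path(dir, c("a.tex", "b.tex", "c.tex", "d.tex"))

# Create small demo fragments so the sketch is self-contained.
writeLines("\\documentclass{beamer}", parts[1])
writeLines("\\begin{document}",       parts[2])
writeLines("content based on R",      parts[3])
writeLines("\\end{document}",         parts[4])

mother <- file.path(dir, "Mother.tex")
out <- file(mother, open = "wb")  # "wb": binary mode, no re-encoding
for (p in parts) {
  writeBin(readBin(p, what = "raw", n = file.size(p)), out)
}
close(out)
```

Because nothing is decoded or re-encoded, whatever encoding the fragments have survives unchanged in Mother.tex.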
I want to read a CSV data set into R. I downloaded the script and the data set from SoSci Survey and got the following error message:
Error in scan(file = file, what = what, sep = sep, quote = quote, dec
= dec, : scan() expected 'a logical', got '3'
The script reads:
zh = read.table(
file=zh_file, encoding="UTF-8",
header = FALSE, sep = "\t", quote = "\"",
dec = ".", row.names = "CASE",
col.names = c(
"CASE","SERIAL","REF","QUESTNNR","MODE","LANGUAGE","STARTED","ZH02","ZH03",
"ZH19","ZH19_03","ZH04","ZH05","ZH08_01","ZH08_02","ZH08_03","ZH08_04",
"ZH08_05","ZH08_06","ZH09_01","ZH09_02","ZH11_01","ZH11_02","ZH11_03","ZH11_04",
"ZH13_01","ZH13_02","ZH13_03","ZH13_04","ZH13_05","ZH14","ZH14_01","ZH14_02",
"ZH14_03","ZH14_04","ZH14_05","ZH14_06","ZH14_07","ZH14_09","ZH14_08",
"ZH14_08a","ZH15","ZH15_01","ZH15_02","ZH15_03","ZH15_04","ZH15_05","ZH15_06",
"ZH15_07","ZH15_08","ZH15_09","ZH15_09a","ZH16","ZH16_01","ZH16_02","ZH16_03",
"ZH16_04","ZH16_05","ZH16_06","ZH16_07","ZH16_08","ZH16_09","TIME001","TIME002",
"TIME003","TIME004","TIME005","TIME006","TIME007","TIME008","TIME009","TIME010",
"TIME011","TIME012","TIME013","TIME014","TIME015","TIME016","TIME017",
"TIME_SUM","MAILSENT","LASTDATA","FINISHED","Q_VIEWER","LASTPAGE","MAXPAGE",
"MISSING","MISSREL","TIME_RSI","DEG_TIME"
),
as.is = TRUE,
colClasses = c(
CASE="numeric", SERIAL="character", REF="character", QUESTNNR="character",
MODE="character", LANGUAGE="character", STARTED="POSIXct", ZH02="numeric",
ZH03="numeric", ZH19="numeric", ZH19_03="character", ZH04="numeric",
ZH05="numeric", ZH08_01="numeric", ZH08_02="numeric", ZH08_03="numeric",
ZH08_04="numeric", ZH08_05="numeric", ZH08_06="numeric", ZH09_01="numeric",
ZH09_02="numeric", ZH11_01="numeric", ZH11_02="numeric", ZH11_03="numeric",
ZH11_04="numeric", ZH13_01="numeric", ZH13_02="numeric", ZH13_03="numeric",
ZH13_04="numeric", ZH13_05="numeric", ZH14="numeric", ZH14_01="logical",
ZH14_02="logical", ZH14_03="logical", ZH14_04="logical", ZH14_05="logical",
ZH14_06="logical", ZH14_07="logical", ZH14_09="logical", ZH14_08="logical",
ZH14_08a="character", ZH15="numeric", ZH15_01="logical", ZH15_02="logical",
ZH15_03="logical", ZH15_04="logical", ZH15_05="logical", ZH15_06="logical",
ZH15_07="logical", ZH15_08="logical", ZH15_09="logical",
ZH15_09a="character", ZH16="numeric", ZH16_01="logical", ZH16_02="logical",
ZH16_03="logical", ZH16_04="logical", ZH16_05="logical", ZH16_06="logical",
ZH16_07="logical", ZH16_08="logical", ZH16_09="logical", TIME001="integer",
TIME002="integer", TIME003="integer", TIME004="integer", TIME005="integer",
TIME006="integer", TIME007="integer", TIME008="integer", TIME009="integer",
TIME010="integer", TIME011="integer", TIME012="integer", TIME013="integer",
TIME014="integer", TIME015="integer", TIME016="integer", TIME017="integer",
TIME_SUM="integer", MAILSENT="POSIXct", LASTDATA="POSIXct",
FINISHED="logical", Q_VIEWER="logical", LASTPAGE="numeric",
MAXPAGE="numeric", MISSING="numeric", MISSREL="numeric", TIME_RSI="numeric",
DEG_TIME="numeric"
),
skip = 1,
check.names = TRUE, fill = TRUE,
strip.white = FALSE, blank.lines.skip = TRUE,
comment.char = "",
na.strings = ""
)
What should I do?
Looking for help!
Have you tried using read.csv("filename.csv", header = TRUE, sep = ",") instead of read.table?
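The error itself points at colClasses: several columns (ZH14_01 and so on) are declared "logical", but the file apparently contains numeric codes like 3 there. A minimal reproduction with a hypothetical two-column file, and one way around it:

```r
# Reproduce the mismatch: a column declared "logical" in colClasses that
# actually contains numbers makes scan() fail.
tmp <- tempfile(fileext = ".tsv")
writeLines(c("CASE\tZH14_01", "1\t3", "2\t1"), tmp)

bad <- try(read.table(tmp, header = TRUE, sep = "\t",
                      colClasses = c(CASE = "numeric", ZH14_01 = "logical")),
           silent = TRUE)
# bad is a try-error: scan() expected 'a logical', got '3'

# Workaround: drop the wrong declaration (or declare "integer") so the
# column is read with its actual type.
ok <- read.table(tmp, header = TRUE, sep = "\t")
```

So either re-export the data so the logical columns really contain TRUE/FALSE, or change those colClasses entries to "integer"/"numeric" to match what is in the file.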
I want to write a file using write.table and use UTF-8 as encoding. This works as long as I don't append to this file. When I do, the encoding changes to ANSI. Why is that and how can I prevent this?
Here is a small example code:
options("encoding" = "UTF-8")
write.table("Hello World in UTF-8", file = "C:/TEMP/test.txt", col.names = FALSE, row.names = FALSE, sep = "", quote = FALSE)
write.table("Now it changes to ANSI", file = "C:/TEMP/test.txt", col.names = FALSE, row.names = FALSE, sep = "", quote = FALSE, append = TRUE)
I also tried to use fileEncoding = "UTF-8" directly in write.table, but the result is the same.
Personally, I prefer not to rely on global options. Using the fileEncoding parameter of write.table safeguards your code against any changes in the global options. Hence the line should be:
write.table("Now it changes to ANSI", file = "C:/TEMP/test.txt", col.names = FALSE, row.names = FALSE, sep = "", quote = FALSE, append = TRUE, fileEncoding = "UTF-8")
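Another option, if you write several pieces to the same file, is to open a single connection with an explicit encoding and push the first write and all appends through it, so the encoding cannot drift between calls. A sketch (the path is illustrative):

```r
# One connection, one encoding: every writeLines() call below goes through
# the same UTF-8 encoder, whether it is the first write or an append.
path <- file.path(tempdir(), "test.txt")
con <- file(path, open = "w", encoding = "UTF-8")
writeLines("Hello World in UTF-8: \u00e4\u00f6\u00fc", con)
writeLines("Appended through the same connection", con)
close(con)
```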
I have a text file containing several languages. How do I read it in R with the read.delim function?
Encoding("file.tsv")
#[1] "unknown"
source_data = read.delim(file, header = FALSE, fileEncoding = "windows-1252",
                         sep = "\t", quote = "")
source_data[[1]][360]
#[1] "ð¿ð¾ð¸ñðº ð½ð° ññ‚ð¾ð¼ ñð°ð¹ñ‚ðµ"
But the same entry shown in Notepad is 'поиск на этом сайте'
tidyverse approach:
use the locale option in read_delim.
(readr functions have _ instead of . and are usually faster and smarter at reading)
more details here: https://r4ds.had.co.nz/data-import.html#parsing-a-vector
Note that read_delim uses col_names and delim instead of header and sep:
source_data = read_delim(file, delim = "\t", quote = "",
                         col_names = FALSE,
                         locale = locale(encoding = "windows-1252"))
source_data = read.delim(file, header = FALSE, sep = "\t", quote = "",
                         stringsAsFactors = FALSE)
Encoding(source_data[[1]]) = "UTF-8"  # Encoding() applies to character vectors, not data frames
I have tried this: if you run R on Windows, the code above works for me.
And if you run R on Unix, you can use the following code:
source_data = read.delim(file, header = FALSE, fileEncoding = "UTF-8",
                         sep = "\t", quote = "", stringsAsFactors = FALSE)
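Since Encoding() is defined for character vectors rather than whole data frames, a general pattern is to mark every character column individually after reading. A self-contained sketch (the sample file and its contents are illustrative):

```r
# Write a small UTF-8 tab-separated file as raw bytes, read it, then declare
# the encoding on each character column.
tmp <- tempfile(fileext = ".tsv")
con <- file(tmp, open = "wb")
writeBin(charToRaw("\u043f\u043e\u0438\u0441\u043a\tsearch\n"), con)  # "поиск<TAB>search"
close(con)

source_data <- read.delim(tmp, header = FALSE, sep = "\t", quote = "",
                          stringsAsFactors = FALSE)
is_chr <- vapply(source_data, is.character, logical(1))
source_data[is_chr] <- lapply(source_data[is_chr],
                              function(x) { Encoding(x) <- "UTF-8"; x })
```

This only declares how the bytes should be interpreted; if the bytes are not actually UTF-8, convert them with iconv() instead.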