Good evening,
I have a problem with formatting in a PDF file after converting it from a .jl (Julia) file.
For example, $ I_{1}, I_{2}, I_{3} $ (to illustrate the error) is not shown as normal text. It is shown as the literal set of symbols, including $, _, { and }. The headers are also shifted in the PDF, although they are shown correctly in the .jl (Julia) file.
How can I fix this?
Kind regards,
Ilya
I have a line of code that alters text
temperature<-as.numeric(gsub("°.*","",temp))
R does not like the "°" character. When I save the file it says I need to use a different encoding.
I have tried all sorts of different encodings from the list, but they all save the code in some variation of
temperature<-as.numeric(gsub("??.*","",temp))
My current solution is to open the script in Notepad and copy-paste the code into RStudio. Which encoding do I need to save a ° in RStudio?
The full solution to this in RStudio was to go to File -> Save with Encoding... -> select ISO-8859-1 -> check the box "Set as default encoding for source files". Now the file opens properly with the degree character every time.
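If you would rather sidestep the encoding question entirely, another option (my suggestion, not from the original answer) is to keep the source file ASCII-only and write the degree sign as a Unicode escape:

temperature <- as.numeric(gsub("\u00b0.*", "", temp))  # "\u00b0" is the degree sign

The script then saves cleanly under any encoding, and the regular expression still matches the ° character at run time.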
I am trying to read a given set of files which contain characters like "à". So I use the following code to read the file:
readLines(con = file.path("C:\\myFolder", "testFile.txt"), encoding = "UTF-8")
Now instead of getting an 'à', I get "\xe1" as output. Even when I remove all content of the .txt file except for this specific letter and a newline, readLines produces "\xe1" as output. Then I created a new file with the same content, to make a reproducible example, and called it "testFile2.txt". However, when I try:
readLines(con = file.path("C:\\myFolder", "testFile2.txt"), encoding = "UTF-8")
I get the expected output of "à". I tried manually retyping the content of the file and resaving both files under different names in a different folder. But whatever I try, the two files with seemingly identical content produce different outputs in readLines. Is there any meta-information somehow attached to the files that could be causing this?
I use Notepad++ to manipulate the files.
Edit: readr::guess_encoding was a good suggestion. For the file that worked, I got:
  encoding confidence
1    UTF-8        0.8
2  GB18030        0.1
3     Big5        0.1
And for the file that gave problems:
  encoding confidence
1    UTF-8       0.15
So there does indeed seem to be an encoding problem.
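For reference, output like the above comes from a call along these lines (path hypothetical; the threshold is lowered so that low-confidence guesses are listed too):

readr::guess_encoding(file.path("C:\\myFolder", "testFile.txt"), threshold = 0)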
Edit 2: And I got my answer: the file was saved in ANSI instead of UTF-8. Notepad++ of course kept it in that format regardless of how much I moved or renamed it. And when I made a new file to get a reproducible example, it was saved automatically as UTF-8, causing the differences in output when reading the files, even though the content in Notepad++ showed as the same.
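Given that diagnosis, the original file can also be read as-is by declaring its real encoding on the connection. A minimal sketch, assuming the ANSI code page is Latin-1 (path hypothetical):

con <- file(file.path("C:\\myFolder", "testFile.txt"), encoding = "latin1")  # re-encode from Latin-1 while reading
readLines(con)
close(con)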
More of a tip question that can save lots of time in many cases. I have a script.R file which I try to save, and I get the error:
Not all of the characters in ~/folder/script.R could be encoded using ASCII. To save using a different encoding, choose "File | Save with Encoding..." from the main menu.
I had been working on this file for months, and today I was editing my code like crazy when I got this error for the first time, so obviously I inserted a character that cannot be encoded at some point today.
My question is: can I track down this specific character and find where exactly in the document it is?
There are about 1000 lines in my code, and it's almost impossible to search it manually.
Use tools::showNonASCIIfile() to spot the non-ASCII characters.
Let me suggest two slight improvements to this.
Process (see the sketch after this list):
Save your file using a different encoding (e.g. UTF-8).
Set a variable 'f' to the path of that file, something like f <- "yourpath\\yourfile.R".
Then use tools::showNonASCIIfile(f) to display the faulty characters.
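Put together (the path is a placeholder):

f <- "yourpath\\yourfile.R"   # the file re-saved as UTF-8
tools::showNonASCIIfile(f)    # prints each line that contains non-ASCII characters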
Something to check:
I have a Markdown file which I render to a Word document (not important).
Some of the packages I load on initialisation mask previously defined functions. I have found that the resulting warning messages sometimes contain non-ASCII characters, and this seems to have caused the message for me: some fault dumped all that output at the end of the file, and I had to delete it anyway!
Check whether the characters are coming from warnings!
Cheers
Expanding the accepted answer with this answer to another question, to check for offending characters in the script currently open in RStudio, you can use this:
tools::showNonASCIIfile(rstudioapi::getSourceEditorContext()$path)
I have to submit a programming assignment in PDF format (produced using LaTeX), and the tutor expects to be able to copy and paste the code directly from the PDF into R to run it. I know I can do this by hard-copying the code into the LaTeX document in a verbatim environment, but I usually use the 'listings' package to link my R source file directly to my LaTeX document, and when I do that, the PDF output contains a lot of extra spaces that are picked up when the code is copied back into R. Sometimes the code will still run with the spaces, but with decimal points, underscores etc. the inserted space will cause problems. I've copied the same line from the 'verbatim' environment (top) and 'listings' (bottom) to illustrate the difference:
par(mfrow = c(2,1), ps = 10, mar = c(3,3,2,2))
par ( mfrow = c(2 ,1) , ps = 10, mar = c(3 ,3 ,2 ,2))
I've been through the Source Codes documentation and tried removing whitespace and changing the basic style (my default is ttfamily), but this doesn't work, and Googling just brings me variations on the official documentation. Essentially, what I'd like to be able to do is apply the Verbatim font style to my Listings environment so that I can still format my code how I want to - but I suspect it won't be that easy. Any suggestions on how to get my R code into a document without copy-pasting each line, so that the output can be copied back into R, would be greatly appreciated! Thanks in advance...
An easy solution has been mentioned here:
https://tex.stackexchange.com/questions/119218/how-to-copy-paste-from-lstlistings
Add
\lstset{columns=fullflexible}
and you will be able to copy/paste the R code from the PDF document.
+1 to @Roland, he has the right idea with knitr.
I would, however, also assume that with enough configuration of the listings package in LaTeX you would be able to get rid of the unwanted whitespace. It's been a while since I fiddled with listings, but I recall it having a lot of customisability, as well as syntax support for R, which should clear up most of the conversion issues, but I may be mistaken.
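If you do go the knitr route, the R side is a single call (file name hypothetical):

knitr::knit2pdf("assignment.Rnw")  # weaves the .Rnw source into .tex and compiles it to PDF

By default knitr renders chunks with verbatim-style environments rather than listings, which avoids the extra-space problem described above.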
I am new to R and have worked for a while as follows. I have the code written in a Word document, and I copy and paste it into R to run it. This works fine, but when the code is long (hundreds of pages) it takes a significant amount of time before R starts running it. This does not seem like a very effective working procedure, and I am sure there are better ways to run R code.
One idea that comes to mind is to import the content of the Word document into R, but I am unsure how to do that. I have tried read.table, but it does not work, and I have looked on the internet for how to import data; however, most explanations are for data tables or internet files in the form of data tables and similar. I have tried saving the document as CSV, but Word does not offer CSV; I have also tried Rich Text Format and the XML package, but again the instructions for those packages are about importing tables. I am wondering if there is an effective way for R to import a Word document as-is.
Thank you
It's hard to say what the easiest solution would be without examining the Word document. Assuming it only contains code and nothing else, it should be pretty easy to convert it all to plain text from within Word. You can do that by going to File -> Save As and choosing 'Plain Text' under 'Save as type'.
Then change the file extension from .txt to .R, download a proper text editor (I can recommend RStudio for R), and open your code in it. Then you will be able to run the code from inside the editor without using copy/paste.
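Once the file is saved as plain text and renamed, the whole script can also be run in one step (path hypothetical):

source("C:/myFolder/script.R")  # executes every line of the script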
No, read.table won't do it.
Microsoft Word has its own format, which includes a lot of metadata over and above the text you enter into it. You'll need a reader/parser that understands the Word format.
A Java developer would use a library like Apache POI to read and parse it into word tokens and n-grams.
Look for Natural Language Processing tools, like this R module:
http://cran.r-project.org/web/views/NaturalLanguageProcessing.html
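That said, there are R packages that can parse .docx files directly. A minimal sketch, assuming the 'officer' package (my suggestion, not mentioned above) and a hypothetical path:

library(officer)
doc <- read_docx("C:/myFolder/code.docx")  # parse the .docx container
txt <- docx_summary(doc)$text              # one element of text per paragraph
writeLines(txt, "code.R")                  # write it back out as a plain R script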