WARNING: Line <x> split at 64 KB in <myfile> - TextPad

In TextPad, this warning appears when searching files that contain very long lines. It pollutes the tool output window with these messages, hiding the relevant search results. How can I turn this off, please?

Related

UTF-8 problems in RStudio

I am handing my work on some R files over to my colleague at the moment, and we are having a lot of trouble getting the files to work on his computer. Both the script and the data contain the Nordic letters (æ, ø, å), so to prevent this from being an issue, we have made sure to save the R files with UTF-8 encoding.
Still, there are two problems. A solution to either one would be much appreciated:
Problem 1: when loading the standard CSV data file (semicolon-separated, which works on my computer), my colleague gets the following error:
Error in make.names(col.names, unique = TRUE) :
invalid multibyte string 3
Instead, we have tried to make it work both with a CSV file that he saved in UTF-8 format and with an Excel (xlsx) file. He can load both files fine (with read.csv2 and with read_excel from the readxl package, respectively), and in both cases, when he opens the data in R, it looks fine to him too ("æ", "ø" and "å" are included).
The second problem comes when he tries to run the plots that actually have to grab and display values from the data columns where "æ", "ø" and "å" appear in the values. Here, he gets the following error message:
in grid.call(c_textBounds, as.graphicAnnot(x$label), x$x, x$y, : invalid input 'value with æ/ø/å' in 'utf8towcs'
When I try to run the R script with the UTF-8 CSV data file (comma-separated) and open the data in a tab in RStudio, I can see that æ, ø and å are not displayed correctly (just a bunch of weird signs). This is strange, considering that this type of CSV file ought to be the less problematic one; instead I am having problems with it and not with the standard CSV file (the non-UTF-8, semicolon-separated one).
When I run the script with the xlsx file, it works totally fine for me: I get to the plot that has to display the data values with æ, ø and å, and it renders correctly, without the error message above.
Why does my colleague get these errors?
(We have also made sure that he installed the Danish version of R from the CRAN website.)
We have tried all of the above.
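A minimal sketch of one thing to try, assuming the shared CSV really is UTF-8 (the file name here is a placeholder): pass the encoding explicitly when reading, so the result does not depend on each computer's locale.

# Base R: declare the file's encoding instead of relying on the OS default.
df <- read.csv2("data.csv", fileEncoding = "UTF-8")

# readr alternative: the encoding is fixed in the call, not the machine.
library(readr)
df <- read_csv2("data.csv", locale = locale(encoding = "UTF-8"))

With either call, both computers decode the bytes the same way, which is often enough to make the æ/ø/å columns display and plot correctly.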

BlueSky Statistics - Character Encoding Problem

I am loading a data set whose characters are encoded in ISO 8859-9 ("Latin-5"), using the Windows 10 OS (Microsoft has assigned code page 28599, a.k.a. Windows-28599, to ISO 8859-9 in Windows).
The data set is originally in Excel.
Whenever I run an analysis, or any operation, with a variable name containing a character specific to this code page (ISO 8859-9), I get errors like:
BSkyFreqResults <- BSkyFrequency(vars = c("MesleÄŸi"), data = Turnudep_raw_data_5)
Error: undefined columns selected
BSkyFormat(BSkyFreqResults)
Error: object 'BSkyFreqResults' not found
The characters "ÄŸ" within "MesleÄŸi" are originally one character in Turkish, ğ (g with a breve on top).
Variable names that contain only letters from the US code page work normally in BlueSky operations.
Using Save As in Excel with the Web Options UTF-8 setting to convert the data to UTF-8 does not work either. Exporting to a CSV file does not work as is, nor when the CSV is saved as UTF-8.
How can I load this data into BlueSky so that it works?
This same data set works in RStudio:
> Sys.getlocale('LC_CTYPE')
[1] "Turkish_Turkey.1254"
And also in SPSS:
Language is set to Unicode
(screenshot of the Language settings in SPSS)
It also works in Jamovi
I also get an error when I start BlueSky that may be relevant to this problem:
Python-CFFI error
From cffi callback <function _consolewrite_ex at 0x000002A36B441F78>:
Traceback (most recent call last):
File "rpy2\rinterface_lib\callbacks.py", line 132, in _consolewrite_ex
File "rpy2\rinterface_lib\conversion.py", line 133, in _cchar_to_str_with_maxlen
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 15: invalid start byte
Since then I have re-downloaded and re-installed BlueSky, but I still get this Python-CFFI error every time I start the software.
I want to work with BlueSky and will appreciate any help in resolving this problem.
Thanks in advance
Here is a link for reproducing the problem.
The zip file contains a data source with 2 cases, both in Excel and BlueSky format, a BlueSky Markdown file showing how the error is produced, and an RMarkdown file for redundancy (probably useless).
UPDATE: The Python error (Python-CFFI error) appears to be related to the Region settings in Windows.
If the region is USA (Turnudep_reprex_Windows_Region_USA-Settings.jpg), the Python error does NOT appear.
If the region is Turkey (Turnudep_reprex_Windows_Region_Turkey-Settings.jpg), the Python error DOES appear.
Unfortunately, setting the region and language to USA eliminates the Python error message but not the other problem: all operations with the Turkish variable names still end in an error.
This may be a problem that only the BlueSky developers can solve ...
Any help or suggestion will be greatly appreciated.
UPDATE FOR VERSION 10.2: The Python error (Python-CFFI error) is eliminated in this version. All other problems persist. I also notice that I cannot change variable names that contain characters outside the US code page. Meaning: if a variable name is something like "HastaNo", I can run analyses with that variable and rename it in the editor. If the variable name is something like "Mesleği", I cannot run analyses with that variable, AND I CANNOT RENAME IT in the editor to "Meslegi" or anything else that would make it usable in analyses.
UPDATE FOR VERSION 10.2.1 (R package version 8.70): No change from version 10.2. Variable names that contain a character outside of ASCII cause an error AND cannot be changed in BlueSky Statistics.
For version 10, according to user manual chapter 15.1.3, you can adjust the encoding setting.
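Until the encoding issue is fixed upstream, a possible workaround (a sketch only, assuming you can round-trip the data through R; file names are placeholders) is to transliterate the column names to plain ASCII before loading the file into BlueSky:

# Read the Excel file, rewrite the column names as ASCII approximations
# (e.g. ğ -> g, where the platform's iconv supports transliteration),
# and save a copy that avoids the non-ASCII-name problem entirely.
library(readxl)
df <- read_excel("Turnudep_raw_data.xlsx")
names(df) <- iconv(names(df), from = "UTF-8", to = "ASCII//TRANSLIT")
names(df) <- make.names(names(df), unique = TRUE)  # valid, unique names
write.csv2(df, "Turnudep_ascii.csv", row.names = FALSE)

Note that iconv transliteration is platform-dependent; characters it cannot map come back as NA or "?", so check names(df) before writing.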

Rmarkdown Error: Paragraph ended before \text# was complete. Unable to find source

When trying to knit an R Markdown file to a PDF, I encounter the following error. It references a line number with no text context, and that line number is beyond the number of lines in the file.
Rmarkdown Error:
! Paragraph ended before \text# was complete.
<to be read again>
\par
l.375
If anyone has encountered a similar error or has an idea about how to identify the location of the error in the file it'd be appreciated.
I have encountered this a couple of times. It can be frustrating, because it usually strikes when you have just finished a lot of work and are generating the report, and each knit takes time, so tracking down the cause is slow.
The first thing you likely want to do is heed the next line of the output, which says:
... See <your_output_file>.log for more info. ...
However, because TeX, Markdown, and R's text generation are all in play, the line numbers in this log file are off a bit, so you may need to just start hunting.
I tend to use constructs like $\color{blue}{\text{...}}$, but simply searching for \text{ throughout your .Rmd file should be enough.
You will likely need to escape some character, as described in this answer; for me it is often parentheses.
For example I just got this error on a line that had:
$\color{blue}{\text{..., (if provided)...}}$
I was able to resolve with:
$\color{blue}{\text{..., \(if provided\)...}}$
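If eyeballing is slow, you can shortlist the candidate lines from R (a minimal sketch; "report.Rmd" is a placeholder for your file):

# Print the number and contents of every line containing a \text{ construct,
# so each can be checked for unescaped special characters.
lines <- readLines("report.Rmd")
hits <- grep("\\text{", lines, fixed = TRUE)
data.frame(line = hits, text = lines[hits])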

Reading large csv file in R

I have a number of CSV files of different sizes, but all somewhat big. Using read.csv to read them into R takes longer than I have been patient enough to wait so far (several hours). I managed to read the biggest file (2.6 GB) very fast (in less than a minute) with data.table's fread.
My problem occurs when I try to read a file of half that size. I get the following error message:
Error in fread("C:/Users/Jesper/OneDrive/UdbudsVagten/BBR/CO11700T.csv",:
Expecting 21 cols, but line 2557 contains text after processing all
cols. It is very likely that this is due to one or more fields having
embedded sep=';' and/or (unescaped) '\n' characters within unbalanced
unescaped quotes.
fread cannot handle such ambiguous cases and those
lines may not have been read in as expected. Please read the section
on quotes in ?fread.
Through research I have found suggestions to add quote = "" to the call, but it does not help me. I have tried the bigmemory package, but R crashes when I do. I am on a 64-bit system with 8 GB of RAM.
I know there are quite a few threads on this subject, but I have not been able to solve the problem with any of their solutions. I would really like to use fread (given my good experience with the bigger file), and it seems like there should be some way to make it work; I just cannot figure it out.
Solved this by installing SlickEdit and using it to edit the lines that caused the trouble. A few characters like ampersands, quotation marks, and apostrophes had consistently been encoded as HTML entities ending in a semicolon, e.g. &amp; instead of just &. As the semicolon was the separator in the file, this caused the problem when reading with fread.
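The troublesome lines can also be located from within R before any editing (a sketch, assuming the header row has the correct field count; the path is the one from the error above):

# Count the ';' separators on each line; rows whose count differs from the
# header's are the malformed ones.
lines <- readLines("C:/Users/Jesper/OneDrive/UdbudsVagten/BBR/CO11700T.csv")
nsep <- vapply(gregexpr(";", lines, fixed = TRUE),
               function(m) sum(m > 0), integer(1))
bad <- which(nsep != nsep[1])
lines[bad]  # inspect; e.g. gsub("&amp;", "&", lines, fixed = TRUE) and re-read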

Track the exact place of a not encoded character in an R script file

More of a tip question that can save lots of time in many cases. I have a script.R file which I try to save, and I get the error:
Not all of the characters in ~/folder/script.R could be encoded using ASCII. To save using a different encoding, choose "File | Save with Encoding..." from the main menu.
I had been working on this file for months, and today I was editing my code like crazy when I got this error for the first time, so I obviously inserted a character today that cannot be encoded.
My question is: can I track down this specific character and find where exactly in the document it is?
There are about 1000 lines in my code and it's almost impossible to manually search it.
Use tools::showNonASCIIfile() to spot the non-ASCII characters.
Let me suggest two slight improvements to this.
Process:
Save your file using a different encoding (e.g. UTF-8).
Set a variable f to the path of that file, e.g. f <- "your/path/yourfile.R".
Then use tools::showNonASCIIfile(f) to display the faulty characters (see the sketch below).
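Put together (the path is a placeholder):

# Prints the number and contents of each line that contains non-ASCII
# characters in the freshly saved UTF-8 copy of the script.
f <- "C:/Users/me/folder/script.R"
tools::showNonASCIIfile(f)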
Something to check:
I have a Markdown file which I knit to a Word document (not important). Some of the packages I load mask previously defined functions, and I have found that the resulting warning messages sometimes contain non-ASCII characters; this seems to be what caused the message for me. Some glitch had dumped all that output at the end of the file, and I had to delete it anyway!
Check whether the offending characters are coming back from warnings!
Cheers
Expanding the accepted answer with this answer to another question: to check for offending characters in the script currently open in RStudio, you can use this:
tools::showNonASCIIfile(rstudioapi::getSourceEditorContext()$path)
