How to process LaTeX commands in R?

I work with knitr and want to transform inline LaTeX commands such as "\label" and "\ref" depending on the output target (LaTeX or HTML).
In order to do that, I need to (programmatically) generate valid R strings that correctly represent the backslash: for example "\label" should become "\\label". The goal is to replace every backslash in a text fragment with a double backslash.
But it seems that I cannot even read these strings, let alone process them. If I define:
okstr <- function(str) "do something"
then when I call
okstr("\label")
I immediately get an "unrecognized escape" error
(of course, since \l is not a valid escape sequence).
So my question is: does anybody know a way to read strings in R without going through the escaping mechanism?
Yes, I know I could do it manually, but that's the point: I need to do it programmatically.
There are many questions that are close to this one, and I have spent some time browsing, but I have found none that yields a workable solution for this.
Best regards.

Inside R code, you need to adhere to R’s syntactic conventions. And since \ in strings is used as an escape character, it needs to form a valid escape sequence (and \l isn’t a valid escape sequence in R).
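With ordinary string literals there is simply no way around this. (From R 4.0.0 on, a raw string constant such as r"(\label)" is read without escape processing.)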
But if you are reading the string from elsewhere, e.g. using readLines, scan or any of the other file reading functions, you are already getting the correct string, and no handling is necessary.
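To illustrate the point about file input, here is a minimal sketch. The temporary file and the variable names are assumptions made for the example; the file is written from R only so that the snippet is self-contained, and would normally come from an editor or from the document being processed.

```r
# Create a small text file whose single line is the six characters \label.
# (In practice this file would already exist; tempfile() is used here only
# to keep the example self-contained.)
path <- tempfile(fileext = ".tex")
writeLines("\\label", path)

# readLines() returns the text as it is in the file: one backslash, no error.
x <- readLines(path)
nchar(x)      # 6: the backslash counts as a single character

# To emit the text as R source, double every backslash.
# fixed = TRUE makes both pattern and replacement literal strings.
doubled <- gsub("\\", "\\\\", x, fixed = TRUE)
writeLines(doubled)   # prints \\label
```

The doubling step is only needed when the result has to be pasted back into R code as a literal; for processing inside R, the string returned by readLines can be used directly.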
Alternatively, if you absolutely want to write LaTeX-like commands in literal strings inside R, just use a different character for \; for instance, +. Just make sure that your function correctly handles it everywhere, and that you keep a way of getting a literal + back. Here’s a suggestion:
okstr("+label{1 ++ 2}")
The implementation of okstr then needs to replace single + by \, and double ++ by + (making the above result in \label{1 + 2}). But consider in which order this needs to happen, and how you’d like to treat more complex cases; for instance, what should the following yield: okstr("1 +++label")?
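One possible way to implement the substitution is sketched below. It assumes a left-to-right, greedy reading in which "++" is consumed before a lone "+", which is only one of several defensible answers to the ordering question; the helper is illustrative, not a definitive implementation.

```r
okstr <- function(str) {
  # Find every "+" or "++", scanning left to right; "\\+\\+?" is greedy,
  # so a pair is always taken as "++" rather than as two single "+".
  m <- gregexpr("\\+\\+?", str)
  regmatches(str, m) <- lapply(regmatches(str, m), function(tokens) {
    out <- rep("\\", length(tokens))   # a single "+" becomes a backslash
    out[tokens == "++"] <- "+"         # a doubled "++" becomes a literal "+"
    out
  })
  str
}

writeLines(okstr("+label{1 ++ 2}"))   # \label{1 + 2}
writeLines(okstr("1 +++label"))       # 1 +\label
```

Under this reading, "1 +++label" is treated as "++" followed by "+", giving a literal plus and then the command. A different rule (for example, pairing from the right) would need a different pattern.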

Related

Does R 4.0.0 make it possible to define foo"(...)" operators, similar to the newly introduced r"(...)" syntax?

R 4.0.0 brings in a new syntax for raw strings:
r"(raw string here can contain anything except the closing sequence)"
But this same construct in R 3.x.x produced a syntax error:
Error: unexpected string constant in "r"(asdasd)""
Does this mean that the parser was changed in R 4.0.0?
And if so, does R 4.0.0 provide a mechanism to define custom functions like foo"()"?
No, that's not possible at the moment (nor would I anticipate it becoming possible anytime soon).
Here's the NEWS item:
There is a new syntax for specifying raw character constants similar to the one used in C++: r"(...)" with ... any character sequence not containing the sequence )". This makes it easier to write strings that contain backslashes or both single and double quotes. For more details see ?Quotes.
https://cran.r-project.org/doc/manuals/r-devel/NEWS.html
Then from ?Quotes:
Raw character constants are also available using a syntax similar to
the one used in C++: r"(...)" with ... any character
sequence, except that it must not contain the closing sequence
)". The delimiter pairs [] and {} can also be
used, and R can be used in place of r. For additional
flexibility, a number of dashes can be placed between the opening quote
and the opening delimiter, as long as the same number of dashes appear
between the closing delimiter and the closing quote.
https://github.com/wch/r-source/blob/trunk/src/library/base/man/Quotes.Rd
Here's the (git mirror of the SVN patch of the) commit where this functionality was added:
https://github.com/wch/r-source/commit/8b0e58041120ddd56cd3bb0442ebc00a3ab67ebc
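As a brief illustration of the ?Quotes text quoted above, the following sketch compares a few raw constants with their escaped equivalents. It requires R 4.0.0 or later; on older versions it fails at parse time with the kind of syntax error shown in the question. The variable names are arbitrary.

```r
# Requires R >= 4.0.0.
# A raw constant is read without escape processing.
a <- r"(\label{fig:1})"
identical(a, "\\label{fig:1}")          # TRUE

# [] and {} are alternative delimiters, and R may replace r.
b <- R"[C:\temp\new]"
identical(b, "C:\\temp\\new")           # TRUE

# Dashes let the content include the plain closing sequence )".
c1 <- r"-(say "(hi)" twice)-"
identical(c1, "say \"(hi)\" twice")     # TRUE
```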

How to set up custom automatic character replacement in emacs ess?

One of the useful features of ess-mode (Emacs speaks statistics) is to automatically replace the underscore _ with the assignment operator <-. Lately, I have been using a lot of pipes (written as %>%) and it would be great to not have to type three characters for each pipe.
Is it possible to define a custom key binding for the pipe, similar to the one converting _ into <-?
The simplest solution is to just bind a key to insert a string:
(define-key ess-mode-map (kbd "|") "%>%")
You can still insert | with C-q |. I'm not sure about the map's name; you'll almost certainly want to limit the key binding to ess-mode.
Check out yasnippet. You can use it to define something like "if this sequence of characters is followed by this key (which you can define to whatever you like), then replace them with this other sequence of characters and leave the cursor in this place". There's more to yasnippet than this, but there's plenty of documentation online and even already made recipes similar to the example I gave above that you can try, like yasnippet-ess-mode, for example.
Alternatively, you can also try abbrev-mode and see if that works for you.
I, for one, like yasnippet better, since you can also specify where to leave the cursor after the expansion, but abbrev-mode seems to be easier to set up. As always in Emacs world, try multiple solutions, don't settle for the first one you put your hands on. What works best for others might not work for you, and vice-versa.

ZSH prompt substitution issues

I've searched through several answers here and through Google, but I'm still not sure what's going wrong with my prompt.
According to the documentation I've read, this should work
setopt prompt_subst
autoload -U colors && colors
PROMPT="%{[00m[38;5;245m%}test %D%{[00m%}"
My prompt is the following, however:
[00m[38;5;245mtest 15-07-01[00m
Note that the date expansion actually worked, so prompt substitution is working. The ZSH man page for prompt expansion states that %{...%} should be treated as a raw escape code, but that doesn't seem to be happening. Passing that string to print -P also results in the output above. I've found example prompts on the Internet for ZSH that also seem to indicate that the above syntax should work. See this for one example - the $FG and $FX arrays are populated with escape codes and are defined here. I've tried this example directly by merging both of the files above, adding setopt prompt_subst to the beginning just to make sure it's set, and then sourcing the result; the prompt is a mess of escape codes.
The following works
setopt prompt_subst
autoload -U colors && colors
PROMPT=$'%{\e[00m\e[38;5;245m%}test %D%{\e[00m%}'
I get the expected result of test 15-07-01 in the proper color.
I've tested this on ZSH 5.0.5 on OS X Yosemite, 5.0.7 from MacPorts, and 4.3.17 on Debian, with the same results. I know I have provided a valid solution to my own problem here with the working example, but I'm wondering why the first syntax isn't working as it seems it should.
I think this all has to do with the timeless and perennial problem of escaping. It's worth reminding ourselves what escaping means, briefly: an escape character is an indicator to the computer that what follows should not be output literally.
So there are 2 escaping issues with:
PROMPT="%{[00m[38;5;245m%}test %D%{[00m%}"
Firstly, the colour escape sequences (e.g. [00m) should all start with the escape control character, like so: \e[00m. You may have also seen it written as ^[[00m or \033[00m. What I suspect has happened is that one of these variations has suffered the common fate of being inadvertently stripped by either the author's copy/paste or the website's framework stack, whether that be somewhere in a database, HTTP rendering or JS parsing. The control character (i.e. ^[, \e or \033), as you probably know, does not have a literal representation, say if you press it on the keyboard. That's why a web stack might decide to not display anything if it sees it in a string. So let's correct that now:
PROMPT="%{\e[00m\e[38;5;245m%}test %D%{\e[00m%}"
This actually nicely segues into the next escaping issue. Somewhat comically, \e is actually a representation of ESC; it is therefore in itself an escape sequence marker that, yes, is in turn escaped by \. It's a riff on the old \\\\\\\\\\ sort of joke. Now, significantly, we must be clear on the difference between the escape expressions for the terminal and the string substitutions of the prompt, in pseudo code:
PROMPT="%{terminal colour stuff%}test %D%{terminal colour stuff%}"
Now what I suspect is happening, though I can't find any documentation to prove it, is that once ZSH has done its substitutions, or indeed during the substitution process, all literal characters, regardless of escape significations, are promoted to real characters¹. To yet further the farce, this promotion is likely done by escaping all the escape characters. For example if you actually want to print '\e' on the command line, you have to do echo "\\\e". So to overcome this issue, we just need to make sure the 'terminal colour stuff' escape sequences get evaluated before being assigned to PROMPT and that can be done simply with the $'' pattern, like so:
PROMPT=$'%{\e[00m\e[38;5;245m%}test %D%{\e[00m%}'
Note that $'' is of the same ilk as $() and ${}, except that its only function is to interpret escape sequences.
[1] My suspicion for this is based on the fact that you can actually do something like the following:
PROMPT='$(date)'
where $(date) serves the same purpose as %D, by printing a live version of the date for every new prompt output to the screen. What this specific example serves to demonstrate is that the PROMPT variable should really be thought of as storage for a mini script, not a string (though admittedly there is overlap between the 2 concepts, hence the confusion). Therefore, as a script, the string is first evaluated and then printed. I haven't looked at ZSH's prompt rendering code, but I assume such evaluation would benefit from native use of escape sequences. For example, what if you wanted to pass an escape sequence as an argument to a command (a command that gets run for every prompt render) in the prompt? The following is functionally identical to the prompt discussed above:
PROMPT='%{$(print "\e[00m\e[38;5;245m")%}test $(date)%{$(print "\e[00m")%}'
The escape sequences are stored literally and only interpreted at the moment of each prompt rendering.

Is it possible to disable Command Substitution in Bash?

Is it possible to disable Command Substitution in Bash?
I want to pass a string containing several backtick characters as a command-line argument to a program, without backslash escapes and without quoting the string.
Thank you.
I suspect your question rests on a misconception. Quoting is most likely the solution in your situation, but maybe you haven't found the right way of quoting yet.
If your dangerous string shall be verbatim (without quoting or escaping) in the source code, you can put it in a separate file and read it from there:
dangerous_string=$(cat dangerous_string_file.txt)
If it shall be passed without interpretation to a command, use the double quotes to prevent interpretation:
my_command "$dangerous_string"
If you have to pass it to a command which needs to receive a quoted version of your string, because it is known to pass the string on carelessly without using something like double quotes to prevent interpretation, you can always use printf to get a quoted version:
quoted_dangerous_string=$(printf "%q" "$dangerous_string")
careless_command "$quoted_dangerous_string"
If all these options do not help in your situation, please explain in more detail where your problem lies.

String continuation across multiple lines, no newline characters

I am using the RODBC library to bring data into R. I have a long query that I want to pass a variable to, much like this SO user.
Problem is that R interprets the whitespace/carriage returns in my query as a newline '\n'.
The accepted solution for this question suggests to simply break up the text into chunks and then paste() together - which works, but ideally I'd like to keep the whitespace intact - makes it easier to test/verify the behavior of the query over in the database before pasting into R.
In other languages I'm familiar with there's a simple line continuation character - indeed, several of the comments on the accepted answer are looking for an approach similar to python's \.
I found an aside to a workaround using strwrap deep in the bowels of an R discussion list, so in the interest of making the internet better I will post it here. However, if someone can point the direction toward a more elegant/straightforward solution, I will happily accept your answer.
I don't know if you will find this helpful or not, but I have eventually gravitated towards keeping my SQL separate from my R scripts. Keeping the query in my R script, except for very very short ones, I find gets unreadable very quickly.
These days, I tend to keep queries that are more than a single line in their own separate .sql file. Then I can keep them nice and formatted and readable in a nice text editor, and read them into R as needed via something like this:
read_sql <- function(path){
  stopifnot(file.exists(path))
  # nchars must cover the whole file; the file size in bytes is an upper bound
  sql <- readChar(path, nchars = file.info(path)$size)
  sql
}
For binding parameters into the queries, I just keep a %s where the parameter will go in the .sql file, and then add in the parameters in R using sprintf.
I've been much happier this way, as I was finding that cluttering up my R scripts with really long paste statements and multi-line character objects was making my code really hard to read.
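A minimal sketch of that workflow follows. The file contents, the table and column names, and the term value are made up for illustration, and the temporary file stands in for a real .sql file kept next to the script. Note that sprintf does plain text substitution, so the values should come from a trusted source.

```r
read_sql <- function(path) {
  stopifnot(file.exists(path))
  readChar(path, nchars = file.info(path)$size)
}

# Stand-in for a hand-written query file containing a %s placeholder.
path <- tempfile(fileext = ".sql")
writeLines(c("SELECT id, score",
             "FROM results",
             "WHERE termname = '%s'"), path)

# Read the template and fill in the parameter.
template <- read_sql(path)
query <- sprintf(template, "Spring 2012")
cat(query)
```

A literal percent sign in the .sql file (for instance in a LIKE pattern) would have to be written as %% so that sprintf leaves it alone.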
R's strwrap will destroy whitespace, including newline characters, per the documentation.
Essentially, you can get the desired behavior by initially letting R introduce line breaks/newline \ns, and then immediately stripping them out.
#make query using PASTE
query_1 <- paste("SELECT map.ps_studentid
,students.first_name || ' ' || students.last_name AS full_name
,map.testritscore
,map.termname
,map.measurementscale
FROM map$comprehensive_with_growth map
JOIN students
ON map.ps_studentid = students.id
WHERE map.termname = '",map_term,"'", sep='')
#remove newline characters introduced above.
#width is an arbitrary big number-
#it just needs to be longer than your string.
query_1 <- strwrap(query_1, width=10000, simplify=TRUE)
#execute the query
map_njask <- sqlQuery(XE, query_1)
query <- gsub(pattern='\\s+', replacement=" ", x=query)
Try using sprintf to get variable substitution, and then replacing all newlines and whitespace.
See my answer to a similar question for details.
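That suggestion can be sketched as follows, assuming the aim is to keep the query readable across several lines in the source and collapse it to a single line before sending it. The table, column, and variable names are placeholders.

```r
map_term <- "Spring 2012"   # hypothetical parameter value

# The query is laid out over several lines for readability.
template <- "
  SELECT map.ps_studentid,
         map.testritscore
  FROM   map
  WHERE  map.termname = '%s'
"

# Substitute the parameter, then squeeze each run of whitespace
# (including newlines) into one space and trim the ends.
query <- sprintf(template, map_term)
query <- trimws(gsub("\\s+", " ", query))
query
```

Replacing whitespace runs with a single space, rather than with an empty string, keeps the SQL keywords separated. Runs of whitespace inside quoted SQL literals are collapsed as well, which may or may not matter for a given query.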

Resources