.REnviron with special characters - r

I am having trouble trying to add environment variables to a REnviron file that have special characters. This is on a Debian machine with the file located at /usr/lib/R/etc/Renviron. If my value has a &, I get a weird error when installing packages (although the package installs fine):
REnviron file: TEST_KEY=HEY&X&THERE
Command: install.packages("futures")
Error:
/usr/lib/R/bin/Rcmd: 468: /usr/lib/R/etc/Renviron: THERE: not found
/usr/lib/R/bin/Rcmd: 468: /usr/lib/R/etc/Renviron: X: not found
Which seems like it's because & is a special character. I can fix this by putting quotes around the value like this: TEST_KEY="HEY&X&THERE". However at that point I can't figure out how to handle a value that itself has a " in it. For example, if I wanted the value to be HEY&"&THERE I am not sure how to format that (a backslash in front of the quote didn't work). I tried "HEY&\"&THERE", but that left the \ in the string once loaded into R. Which leads me to my broader question:
How can I ensure that anything that satisfies Linux environment variable styling rules works in an REnviron file?
Update: this seems to be a Debian specific issue. You can recreate it using the debian:bullseye-slim docker image, installing R, then editing the Renviron to have a & in it.

Okay, I spent an hour looking into this and I think this is the answer.
In both Ubuntu and Debian (and maybe other systems too), the Renviron file is sourced by the shell that runs Rcmd, so what you type in the file is interpreted as shell commands. You can see this in lines 39-40 of Rcmd:
. "${R_HOME}/etc${R_ARCH}/Renviron"
export `sed 's/^ *#.*//; s/^\([^=]*\)=.*/\1/' "${R_HOME}/etc${R_ARCH}/Renviron"`
The first line runs the Renviron file in the shell, the second then exports the variable names based on lines that have a = in them.
So in our case the way to handle this is to put double quotes around every value and to put a \ before any double quote inside the value. The reason I didn't realize this before posting the question is that I inspected the value in R without cat(): R's default printing shows the string in escaped form, with a \ before the embedded quote, whereas cat() shows the actual characters. So "HEY&\"&THERE" is the right way to write it.
To recap:
The Renviron file is executed on the shell
To handle special characters in strings you use the same logic you would in the OS (so double quotes with \ to escape actual double quotes).
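As a rough illustration of the two Rcmd steps quoted above, the following sketch sources a throwaway file and then exports the names that the sed expression extracts. The temporary file is a hypothetical stand-in for /usr/lib/R/etc/Renviron, and the snippet assumes a POSIX sh; it is not taken from R's own scripts beyond the sed expression shown earlier.

```shell
#!/bin/sh
# Hypothetical stand-in for /usr/lib/R/etc/Renviron (assumption for this demo).
renviron=$(mktemp)
cat > "$renviron" <<'EOF'
# a comment line
TEST_KEY="HEY&\"&THERE"
EOF

# Step 1: source the file, so every line is parsed as a shell command.
. "$renviron"

# Step 2: extract the variable names (text before '=') and export them.
names=$(sed 's/^ *#.*//; s/^\([^=]*\)=.*/\1/' "$renviron")
export $names

# The shell has already removed the outer quotes and the backslash.
printf '%s\n' "$TEST_KEY"

rm -f "$renviron"
```

Without the double quotes, each & would end a command and send it to the background, which is why the shell then tries to run X and THERE as commands.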

Related

How get zsh prompt of form: [working-directory] # under macOS

In macOS Catalina (10.15.6), I want to use zsh for Terminal sessions. Formerly I had been using the default bash. For bash, I had a .profile containing the line
export PS1="[\u@\h:\w]$ "
which gave a prompt of the form:
[me@myhost:current-dir]$
I want something similar for zsh, but without the user-name@host-name prefix and with # instead of $ for the actual prompt.
In a zsh Terminal session, the command
PROMPT='[%/]%% '
gives the expected prompt, with the current directory enclosed in square brackets.
Of course I don't want to enter that manually each time. Instead, I want to set this in .zprofile. So in .zprofile I included the line
export PROMPT='[%/]%% '
However, that does not work as expected -- the prompt now has the form:
me@myhost current-dir %
Question: How can I get the zsh prompt to have the desired form as follows?
[current-dir] %
Just add the following export to ~/.zshrc, otherwise it won't work.
export PROMPT='[%1~] %%'
That will give you the following, my directory name is test-workflow-branch-only
[test-workflow-branch-only] %
NOTE: This will give you [~] % when in ~/ directory so don't be alarmed when you see that
UPDATE - per comment questions
We add it to ~/.zshrc because this file gets sourced by every interactive shell. The file ~/.zprofile is for commands that we want to execute when we log in, so a non-login shell won't source it.
Thanks for info from Edward Romero. My critique of answer is that it contains four wasted characters, '[',']',' ','%'. Using instead PROMPT='%d>' yields the nice clear absolute path, something like this:
/Users/myuser/test-workflow-branch-only>
In any case, nice to get this headache behind me, and begin reaping the wonderful benefits of using zsh, whatever they may be.

Rscript behaves inconsistently on windows with single and double quotes

If I invoke
Rscript -e "print('hello')"
It correctly prints out the answer
[1] "hello"
However, if I switch single and double quotes, it does not work, and it looks like the double quotes are removed:
Rscript -e 'print("hello")'
gives:
Error in print(hello) : object 'hello' not found
Execution halted
Note that it's not powershell doing the escaping incorrectly. Echoing only gives the expected results:
PS> echo 'print("hello")'
print("hello")
PS> echo "print('hello')"
print('hello')
And the same behavior is not observed on macOs or Linux, where both variants are correctly parsed.
Interestingly enough, it's even crazier for command.com:
C:\>Rscript -e "print('hello')"
[1] "hello"
C:\>Rscript -e 'print("hello")'
[1] "print(hello)"
I mean... what?!?
This has already been mentioned here:
Single line code to run R code from Windows command line
but there's no explanation about it. In my opinion it's a bug of Rscript on windows, but I want to hear other opinions.
Dabombber's helpful answer provides all the pointers, but let me try to boil it down conceptually:
The problem is not specific to RScript.exe and potentially affects calls to any external executable from PowerShell:
Up to at least PowerShell 7.1 (current as of this writing), passing arguments with embedded double quotes (") to external programs is fundamentally broken, as detailed in GitHub issue #1995; in short: behind the scenes, PowerShell constructs a command line for the target program (process) that uses "..."-quoting only, but neglects to escape embedded verbatim " chars. for their syntactically valid inclusion in such double-quoted strings; a fix may be coming in v7.2 - see this answer.
For now, you have to manually escape embedded " chars. as \".
However, if and when the bug gets fixed, this workaround will break, because the fix requires that this escaping be applied automatically, which would then escape a verbatim \" as \\\".
# WORKAROUND as of v7.0, which will break if and when the problem gets fixed.
PS> Rscript -e 'print(\"hello\")'
The third-party Native module (install with Install-Module -Scope CurrentUser Native, for instance) offers helper function ie, which compensates for the broken behavior; it is written in a forward-compatible manner so that it will simply defer to the built-in behavior if and when it should get fixed:
# Thanks to `ie`, no workarounds are required.
PS> ie Rscript -e 'print("hello")'
As for ad hoc workarounds - both of them work for Rscript.exe, but can't be expected to be a general solution:
For target programs that support both '...' and "..." quoting: Swap the quotes to use only embedded ' chars., as shown in your question, but note that '...' and "..." strings have different semantics in PowerShell ("..." strings are expandable (interpolating) strings), and may have different semantics in the target program too (not the case in Rscript):
Rscript -e "print('hello')"
For target programs that accept input via stdin, use the PowerShell pipeline, where the bug doesn't surface (though note that you may have to set the $OutputEncoding preference variable to the character encoding expected by the target program):
'print("hello")' | Rscript -
As for your observations and background information, including about cmd.exe and POSIX-compatible shells:
Note that it's not powershell doing the escaping incorrectly.
As Dabombber points out, it is PowerShell that is the problem, but the problem only occurs when calling external programs, whereas echo is a built-in alias for the PowerShell-native Write-Output cmdlet (verify with Get-Command echo).
On Windows, you could see the problem with the flawed parameter passing as follows, by invoking choice.exe (ignore the [Y,N]?N suffix):
PS> 'n' | choice /m 'print("hello")'
print(hello) [Y,N]?N
choice.exe with /m can be used to echo an argument as it would be received by external programs, and as you can see the double quotes were effectively lost, because PowerShell mistakenly placed print("hello") verbatim on the process command line - without escaping the " chars. - which external programs parse as verbatim print(hello), because they allow a single argument to be composed of unquoted and double-quoted parts (print( + hello (stripped of the syntactic double quotes) + )).
If verbatim print(hello) is interpreted as an R script, it looks for a variable (object) named hello - which in this scenario doesn't exist and triggers the error message you saw.
On Unix-like platforms (macOS, Linux), using the cross-platform PowerShell [Core] edition, /bin/echo 'print("hello")' shows the same problem.
And the same behavior is not observed on macOs or Linux, where both variants are correctly parsed.
Yes, if you use a native, POSIX-compatible shell there, such as bash, you'll get the correct behavior (see below).
it's even crazier for command.com:
As an aside: You probably meant cmd.exe, the legacy command processor (Command Prompt) on NT-based Windows platforms up to the current Windows 10.
(command.com was the command processor on the extinct DOS-based Windows versions that ended with Windows ME).
cmd.exe only recognizes double-quoting ("...") to demarcate argument boundaries for itself, not also single-quoting ('...'); irrespective of that, it essentially passes the original quoting through to the target executable (after performing its own interpretation of the command line, such as environment-variable expansion).
This differs fundamentally from what PowerShell and POSIX-compatible shells do:
On Unix-like platforms - where POSIX-compatible shells recognize '...'-quoted arguments - the concept of a process command line doesn't exist, and whatever arguments a POSIX-like shell has itself parsed out of its command line are passed as-is - as an array of verbatim arguments - to the target executable; thus shell string literals "print('hello')" and 'print("hello")' are passed as verbatim print('hello') and print("hello"), respectively, which works as expected, given that R too recognizes both '...' and "..." string literals.
PowerShell too has '...' strings (it treats their content verbatim), but on Windows it translates them to "..." strings behind the scenes (if quoting is needed), which is where the aforementioned bug can surface as of v7.0. The bug aside, this translation makes sense, because only "..." quoting can be assumed to have syntactic meaning on the command line for other programs (see bottom section). Unfortunately, PowerShell does the same thing on Unix-like platforms, even though it shouldn't (it constructs a pseudo command line that the .NET API then parses into an array of verbatim arguments passed to the target process), so the bug surfaces there as well.
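The POSIX-shell behavior described above can be checked with a small sketch. It assumes a POSIX sh and uses a child sh process as a stand-in for the target executable (Rscript is not required); the helper variable echo_arg is hypothetical.

```shell
#!/bin/sh
# One-line child script that prints its first argument exactly as received.
echo_arg='printf "%s\n" "$1"'

# The shell removes the outer quotes and passes the rest verbatim
# as a single element of the child's argument array.
single=$(sh -c "$echo_arg" sh 'print("hello")')
double=$(sh -c "$echo_arg" sh "print('hello')")

printf '%s\n%s\n' "$single" "$double"
```

Both variants reach the child process with their inner quotes intact, which is why either form works with Rscript on macOS and Linux.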
Because cmd.exe preserves the original quoting, RScript interprets 'print("hello")' in command line Rscript -e 'print("hello")' as a string literal rather than as a command, because it removes any " chars. with syntactic function on the command line first (whereas ' (single quotes) by convention do not have syntactic meaning on the command line), before the result is interpreted as an R script:
'print("hello")' is therefore parsed as 'print( + hello (the command-line " are stripped) + ), resulting in verbatim 'print(hello)' getting interpreted as R code, which is an R string literal that therefore prints as-is (the output uses "..." quoting, but that's just an artifact of output formatting; note that an explicit call to print() isn't necessary, the result of an expression - such as string literal 'print(hello)' in this case - is automatically printed).
By contrast, "print('hello')" is parsed as verbatim print('hello') (the command-line " are stripped), which - due to the absence of enclosing quoting - is then interpreted as a command, namely a print() function call, as intended.
Ultimately, there are no hard and fast rules in the anarchic world of process command-line parsing on Windows: it is ultimately up to each program to interpret its command line - this answer contains excellent background information.
Fortunately, however, there are widely adhered-to conventions, as implemented in the MS C/C++/.NET compilers and documented here.
Unfortunately, as of PowerShell 7.0, PowerShell doesn't adhere to these conventions, due to the aforementioned bug. Since the bug has been around since v1, users have learned to work around it, such as with manual \"-escaping, as shown above. The problem is that fixing the bug would break all workarounds. Implementing a fix as an experimental feature is now being considered, for v7.1 at the earliest - see this PR on GitHub and the associated discussion here, which suggests that, in addition to implementing the widely established conventions, accommodations be made for calls to batch files and msiexec.exe-style programs, which have non-conventional quoting requirements.
It might be worth taking a look through this PowerShell issue: Arguments for external executables aren't correctly escaped. The Native module by Michael Klement provides a workaround until the problem is fixed (and shouldn't be broken post-fix like many current workarounds).
Note that it's not powershell doing the escaping incorrectly. Echoing only gives the expected results
echo is a PowerShell alias (for the Write-Output cmdlet) rather than an external program, so you won't notice the broken behaviour when using it.
PS> Get-Command echo
CommandType     Name                    Version    Source
-----------     ----                    -------    ------
Alias           echo -> Write-Output
A better test would be to use the EchoArgs.exe command line tool from PowerShell Community Extensions (downloadable here).
PS> echoargs.exe 'print("hello")'
Arg 0 is <print(hello)>
Command line:
"E:\echoargs.exe" print("hello")
PS> echoargs.exe "print('hello')"
Arg 0 is <print('hello')>
Command line:
"E:\echoargs.exe" print('hello')
Note that it's not powershell doing the escaping incorrectly. Echoing only gives the expected results:
In the case of echo, it's echo itself that directly consumes the argument you pass to it, so you get the same result for single quotes or double quotes.
In the case of Rscript, I believe Rscript is just a convenient way of calling R with some additional arguments. (see https://swcarpentry.github.io/r-novice-inflammation/05-cmdline/ for explanation). Specifically, it says that "From this output, we learn that Rscript is just a convenience command for running R scripts...."
So maybe what's happening is that when you call RScript, it's passing the argument to a separate process, and because of this it's trying to expand hello as a variable, leading to the error (in the PowerShell case).
As for cmd it has its own behavior for handling single and double quotes.
See: What does single-quoting do in Windows batch files?
and
Differences between single and double quotes in CMD
So the problem may not be with RScript. The resulting output of your use case may just be a side effect of how powershell and cmd handle double quotes and single quotes.
This may also explain why the problem is there only on windows, and not in Linux or MacOS.
Check out this one! https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_quoting_rules?view=powershell-7
expressions in single-quoted strings are not evaluated. They are interpreted as literals.
in a double-quoted string, expressions are evaluated, and the result is inserted in the string.
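Note that cmd.exe is different: it only treats double quotes as quoting characters, and single quotes have no special meaning there.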

executing a script with spaces in the path leading to it

I'm trying to execute an R script which has spaces in the path leading to it. It fails with a path not found error. My command looks like this:
Rscript ../A/B C/test.R
I've tried
Rscript "`../A/B C/test.R`"
Rscript "../A/B C/test.R"
Doesn't work. What's going wrong here?
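First let's try the obvious, escape the space (without surrounding quotes, because inside double quotes the backslash would be kept as a literal character):
Rscript ../A/B\ C/test.R
If that doesn't work, cd inside the folder and try calling it from there:
cd A/B\ C/ && Rscript test.R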
(Assuming you're in the parent folder)
If it is still not working, maybe the problem is something inside the script. What do you have in it?
R sometimes has problems handling spaces with single escape characters, so if, let's say, inside your script you have:
source("x.r")
And the FULL PATH of x.r has spaces in its name (like being in the same folder as the file in your example..), it can fail due to not finding the file called from inside r.
Then, change the paths INSIDE the script to have double escapes at the spaces
/A/B C/ -> /A/B\\ C/
And try the previous options I posted again.
Tell us what happens!
Make sure you are running your line of code from the Unix shell.
There may be an error in your directory name or file itself. As a test case, you may try the following:
Rscript "/directory/test A/rnorm.R"
rnorm.R being:
x <- rnorm(200, 10, 4)
print(x)
This basically should print the numbers to your Shell.
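To see how a POSIX shell treats a path containing a space, here is a small sketch. It uses sh running a tiny shell script as a stand-in for Rscript and test.R (an assumption so that it runs without R installed); the directory layout under a temporary folder is hypothetical.

```shell
#!/bin/sh
# Hypothetical layout: <tmp>/A/B C/test.sh stands in for ../A/B C/test.R.
base=$(mktemp -d)
mkdir -p "$base/A/B C"
printf 'echo "script ran"\n' > "$base/A/B C/test.sh"

# Option 1: double-quote the whole path.
quoted=$(sh "$base/A/B C/test.sh")

# Option 2: leave the path unquoted and backslash-escape the space.
escaped=$(sh "$base"/A/B\ C/test.sh)

# Mixing both does not work: inside double quotes the backslash stays literal.
mixed="B\ C"

printf '%s\n%s\n%s\n' "$quoted" "$escaped" "$mixed"
rm -rf "$base"
```

Either quoting or escaping works on its own; combining them yields a directory name that contains a backslash, which does not exist.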

How to make the glob() function also match hidden dot files in Vim?

In a Linux or Mac environment, Vim’s glob() function doesn’t match dot files such as .vimrc or .hiddenfile. Is there a way to get it to match all files including hidden ones?
The command I’m using:
let s:BackupFiles = glob("~/.vimbackup/*")
I’ve even tried setting the mysterious {flag} parameter to 1, and yet it still doesn’t return the hidden files.
Update: Thanks ib! Here’s the result of what I’ve been working on: delete-old-backups.vim.
That is due to how the glob() function works: A single-star pattern
does not match hidden files by design. In most shells, the default
globbing style can be changed to do so (e.g., via shopt -s dotglob
in Bash), but it is not possible in Vim, unfortunately.
However, one has several possibilities to solve the problem still.
First and most obvious is to glob hidden and not hidden files
separately and then concatenate the results:
:let backupfiles = glob(&backupdir..'/*').."\n"..glob(&backupdir..'/.[^.]*')
(Be careful not to fetch the . and .. entries along with hidden files.)
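The same two-pattern idea can be tried in a POSIX shell, which is a quick way to check what each pattern matches before using it in Vim. This is only a sketch with hypothetical file names in a temporary directory; note that POSIX sh writes the negated bracket as [!.] where the Vim pattern above uses [^.].

```shell
#!/bin/sh
# Hypothetical backup directory with one visible and one hidden file.
dir=$(mktemp -d)
touch "$dir/visible.bak" "$dir/.hidden.bak"
orig=$PWD
cd "$dir" || exit 1

# A single star skips dot files.
set -- *
plain_count=$#

# Adding .[!.]* picks up hidden files but not the . and .. entries.
set -- * .[!.]*
combined_count=$#

printf 'plain: %s, combined: %s\n' "$plain_count" "$combined_count"

cd "$orig" || exit 1
rm -rf "$dir"
```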
Another, perhaps more convenient but less portable way is to use
the backtick expansion within the glob() call:
:let backupfiles = glob('`find '..&backupdir..' -maxdepth 1 -type f`')
This forces Vim to execute the command inside backticks to obtain
the list of files. The find shell command lists all files (-type f)
including the hidden ones, in the specified directory (-maxdepth 1
forbids recursion).

simple shell script in cygwin

#!/bin/bash
echo 'first line' >foo.xml
echo 'second line' >>foo.xml
I am a total newbie to shell scripting.
I am trying to run the above script in cygwin. I want to be able to write one line after the other to a new file.
However, when I execute the above script, I see the following contents in foo.xml:
second line
The second time I run the script, I see in foo.xml:
second line
second line
and so on.
Also, I see the following error displayed at the command prompt after running the script:
: No such file or directory.xml
I will eventually be running this script on a unix box, I am just trying to develop it using cygwin. So I would appreciate it if you could point out if it is a cygwin oddity and if so, should I avoid trying to use cygwin for development of such scripts?
Thanks in advance.
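Run dos2unix on your shell script. That will fix the problem: the script has Windows (CRLF) line endings, so the shell treats the trailing carriage return as part of the file name.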
I had the same kind of problem as the original poster: A very simple script file was not working in Cygwin.
Thanks to Don Branson for the clue.
The fix for me was built into the text editor I'm using. (Most programmer's editors have a feature like this.) For example, in my case I'm using Notepad++, which has a menu item to convert the file line endings to Unix-style. From the menu: [Edit]->[EOL Conversion]->[Unix (LF)]
Then the script behaved as expected.
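If neither dos2unix nor a suitable editor is at hand, the line-ending conversion described above can be approximated with tr. The sketch below builds a hypothetical script with CRLF line endings in a temporary directory, strips the carriage returns, and runs the cleaned copy; tr -d '\r' removes every carriage return, which for a file like this has the same effect as dos2unix.

```shell
#!/bin/sh
work=$(mktemp -d)

# Hypothetical script saved with Windows (CRLF) line endings.
printf 'echo "first line" >foo.xml\r\necho "second line" >>foo.xml\r\n' > "$work/crlf.sh"

# Strip the carriage returns so each redirection targets plain foo.xml.
tr -d '\r' < "$work/crlf.sh" > "$work/lf.sh"

# Run the cleaned script inside the work directory and read the result.
(cd "$work" && sh lf.sh)
result=$(cat "$work/foo.xml")
printf '%s\n' "$result"

rm -rf "$work"
```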
But there must be something else that is wrong here. When I try it, it works as expected.
> foo.xml puts the line into foo.xml, replacing any previous contents.
>> foo.xml appends to file
It sounds like you may have a typo somewhere. Also keep in mind that while the Windows command prompt can be forgiving about paths with embedded spaces, cygwin's shells will not be, so if you have a filename that contains embedded spaces, you need to either quote the filename or escape the spaces:
echo 'first line' > 'My File.txt'
echo 'first line' > My\ File.txt
The same goes for certain "special" characters including quotes, ampersand (&), semicolons (;) and generally most punctuation other than period/full-stop (.).
So if you are seeing those issues using the exact script that you are running (i.e. you copy and pasted it, there is no possibility of transcription errors) then something truly strange may be happening that I can't explain. Otherwise, there may be a misplaced space or unquoted character somewhere.
I cannot reproduce your results. The script you quote looks correct, and indeed works as expected in my installation of Cygwin here, producing the file foo.xml containing the lines first line and second line; implying that what you are actually running differs from what you quoted in some way that is causing the problem.
The error message implies some sort of problem with the filename in the first echo line. Do you have some nonprintable characters in the script you are running? Have you missed escaping a space in the filename? Are you substituting shell variables and mistyping the name of the variable or failing to escape the resulting string?
The above should work normally..
However you can always specify a heredoc:
#!/bin/bash
cat <<EOF > foo.xml
first line
second line
EOF

Resources