Difference between \\ vs \ backreference for regex in r [duplicate] - r

I'm writing strings which contain backslashes (\) to a file:
x1 = "\\str"
x2 = "\\\str"
# Error: '\s' is an unrecognized escape in character string starting "\\\s"
x2="\\\\str"
write(file = 'test', c(x1, x2))
When I open the file named test, I see this:
\str
\\str
If I want to get a string containing 5 backslashes, should I write 10 backslashes, like this?
x = "\\\\\\\\\\str"

[...] If I want to get a string containing 5 \, should I write 10 \ [...]
Yes, you should. To write a single \ in a string, you write it as "\\".
This is because the \ is a special character, reserved to escape the character that follows it. (Perhaps you recognize \n as newline.) It's also useful if you want to write a string containing a single ". You write it as "\"".
\\\str is invalid because it is parsed as \\ (which corresponds to a single \) followed by \s, and \s is not a recognized escape sequence: "escaped s" has no meaning.
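A quick way to confirm the rule is to count characters rather than trust the printed form. The sketch below uses only base R functions; the variable names simply mirror the ones in the question.

```r
# Each "\\" in the source code is one backslash in the actual string.
x1 <- "\\str"      # one backslash followed by "str"
x2 <- "\\\\str"    # two backslashes followed by "str"

nchar(x1)          # 4
nchar(x2)          # 5

# print() shows the escaped form; writeLines() shows the real characters.
print(x2)          # [1] "\\\\str"
writeLines(x2)     # \\str
```

The doubled backslashes exist only in the source code and in print() output, not in the string itself.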

Have a read of this section about character vectors.
In essence, it says that when you enter character string literals you enclose them in a pair of quotes (" or '). Inside those quotes, you can create special characters using \ as an escape character.
For example, \n denotes a new line, and \" can be used to enter a " without R thinking it's the end of the string. Since \ is an escape character, you need a way to enter an actual \. This is done by using \\. Escaping the escape!

Note that the doubling of backslashes is because you are entering the string at the command line and the string is first parsed by the R parser. You can enter strings in different ways, some of which don't need the doubling. For example:
> tmp <- scan(what='')
1: \\\\\str
2:
Read 1 item
> print(tmp)
[1] "\\\\\\\\\\str"
> cat(tmp, '\n')
\\\\\str
>
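If typing ten backslashes by hand feels error-prone, there are two other ways to get the same string without counting. This is only a sketch: the raw-string syntax assumes R 4.0.0 or later, and the names five and raw5 are illustrative.

```r
# Build five backslashes programmatically instead of typing ten.
five <- paste0(strrep("\\", 5), "str")

# Raw string literal (assumes R >= 4.0.0): backslashes are taken literally.
raw5 <- r"(\\\\\str)"

identical(five, raw5)   # TRUE
nchar(five)             # 8: five backslashes plus "str"
writeLines(five)        # \\\\\str
```

Both forms produce the same eight-character string as the ten-backslash literal in the question.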

Related

fread escapes quotes when not necessary

I'm reading a csv file with quoted fields using the fread function. In some of the fields escaped quotes (\") appear. I don't understand why the fread function escapes these quotes that are already escaped.
I reproduce the behavior with a simple example. I created a file with a single line and a single field:
"Hello \"World\" "
If I run the following R command:
table <- fread(input = "/tmp/quoteprova.csv", header=FALSE, sep = "\t")
the table variable will look like this:
V1
1: Hello \\"World\\"
I would expect instead this result:
V1
1: Hello \"World\"
Am I missing an option that I need to specify in order to get the expected behavior?
You are getting what you want. \\" is two characters: a literal \ and a ". Because \ introduces escape sequences, a literal backslash is itself displayed escaped, as \\. The additional \ (the first one) therefore tells you that the second \ is not escaping the " and should be read as is.
See this example:
> nchar('\\"')
[1] 2
> nchar('\"')
[1] 1
See also this R FAQ.
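The difference between what a string contains and how it is displayed can be seen without fread at all. The sketch below assumes the field holds a backslash followed by a double quote, as in the file above; s is an illustrative name, not something fread produces.

```r
# A string whose contents are: Hello \"World\"
# In the literal, each \\ is one backslash and each \" is one double quote.
s <- "Hello \\\"World\\\""

print(s)        # escaped display: [1] "Hello \\\"World\\\""
writeLines(s)   # actual contents: Hello \"World\"
nchar(s)        # 15
```

writeLines() and cat() show the characters actually stored, which is usually what you want to check after reading a file.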

How to include multiple backslashes in Unix Korn Shell

I have the following variables in Unix Korn Shell
host=nyc43ksj
qry_dir='\test\mydoc\mds'
fullpath="\\$host\$qry_dir"
echo "$fullpath"
When I execute the above, I get output such as \nyc43ksj\qrydir.
It looks like the backslashes are used as escape characters.
I tried changing fullpath as follows:
fullpath="\\$host\\$qry_dir"
echo "$fullpath"
This time I get \nyc43ksj\test\mydoc\mds. However, the two backslashes at the beginning are not displayed as two backslashes. How can I get the fullpath as \\nyc43ksj\test\mydoc\mds (two backslashes at the beginning)?
In a string, the \ (backslash) character acts as an escape character (as you mention), and the second backslash instructs the shell to put in an actual backslash, as opposed to some special character.
If you want to have two actual backslash characters in the string in sequence, you will need to put \\\\ in the string, so:
fullpath="\\\\$host\\$qry_dir"

How to write "\" character using cat() in R?

I am trying to use the cat() function in R to write data to a file. I would like to write a "\" character to the output, but it seems that the cat() function interprets this as a formatting command. Any ideas on how I can write this in the middle of formatting commands (e.g. "\t\t\t \ \n")?
In R, \ is the escape character inside string literals, so you need to write \\ to print a single backslash with cat(). The first backslash escapes the second. You can easily verify this by calling cat("\\").
Here are a few examples:
> cat("a\nb\tc")        ## standard output
a
b       c
> cat("a\\nb\\tc")      ## prints \n and \t literally, not as control characters
a\nb\tc
> cat("a\\nb\\t\\c")    ## same, plus one backslash before "c"
a\nb\t\c
> cat("a\tb\tc\t\\\nd") ## read as "a<tab>b<tab>c<tab>\<newline>d"
a       b       c       \
d
Also, I've found this wikibooks link to be quite useful for learning about text processing with R.
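Since the question is about writing to a file, a round trip shows that only one backslash ends up on disk. This is a minimal sketch using a temporary file; path and line are illustrative names.

```r
# Write "a", a tab, a single backslash, and a newline to a temporary file.
path <- tempfile(fileext = ".txt")
cat("a\t\\\n", file = path)

# Read the line back; the file holds one real backslash.
line <- readLines(path)
nchar(line)       # 3: "a", a tab, and one backslash
writeLines(line)  # shows the line as it is stored in the file
unlink(path)      # clean up the temporary file
```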
