A variable name with "/" or "()" - r

Using the RODBC package, I pulled data from Access into R.
One of the column names in the Access table is: "JED_credit\debit (local)"
When I try to refer to that column in R, I get:
JED_credit\debit (local)
"Error: unexpected input in "JED_credit\"

I think this is not the recommended way of naming variables, but it works if you wrap the name in backticks.
> `var` <- 'test'
> var
[1] "test"
> `var/bla` <- 'test'
> `var/bla`
[1] "test"
> `var()bla` <- 'test'
> `var()bla`
[1] "test"
> `var\bla` <- 'test'
> `var\bla`
[1] "test"

Almost every language has a few character sequences with a special meaning. For example, \n stands for a newline character.
In a regular string, the parser treats the \d in JED_credit\debit as the start of an escape sequence rather than as part of the string. To make the backslash part of the string, escape it with a second backslash, so that the name is written as JED_credit\\debit.
Answer improvements are welcome.
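For a column name like the one in the question, string-based indexing is one way to sidestep the parser. This is a minimal sketch: the data frame df and its values are made up for illustration, and only the column name comes from the question.

```r
# Hypothetical stand-in for the table imported through RODBC
df <- data.frame(x = c(10, 20, 30))
names(df) <- "JED_credit\\debit (local)"  # "\\" in source code is one backslash in the name

# writeLines() shows the name as it is actually stored
writeLines(names(df))

# Indexing with a string works for any column name, syntactic or not
col_values <- df[["JED_credit\\debit (local)"]]
col_total <- sum(col_values)
```

Because the name is passed as an ordinary string, the only thing to remember is doubling the backslash in the source code.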

R with gsub() instead ' of \'

I want to turn a string like "ab'" into "ab\'".
I have tried the following code:
aa="ab'"
aa<-gsub("'","\\'",aa)
This shows ab'
aa="ab'"
aa<-gsub("'","\\\'",aa)
This also shows ab'
aa="ab'"
aa<-gsub("'","\\\\'",aa)
This shows ab\\'
I don't know how to fix it.
Any suggestions would be appreciated.
In the case of the following code:
aa <- "ab'"
aa <- gsub("'", "\\\\'", aa)
In fact, you are replacing the single quote with one literal backslash followed by the quote. The ab\\' you see is just how print() displays the string: it shows an extra backslash to let you know that the backslash is a literal character and not the start of an escape sequence.
Consider the following extension of your code:
gsub("\\\\", "A", gsub("'","\\\\'",aa))
[1] "abA'"
We can clearly see that there is only a single A in the replacement, implying that there was only a single backslash to be replaced.
Even though the terminal sometimes shows "\\", the string actually contains just "\".
Print the result using writeLines() to see the actual string:
> original_string = "ab'"
> new_string = gsub("'","\\\\",original_string)
> writeLines(new_string)
ab\
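To check how many backslashes a string really contains, it can help to compare the printed form with the stored characters. The following is a small sketch of that check; the variable names are illustrative, and the fixed = TRUE call is an alternative that is not taken from the answers above.

```r
s <- gsub("'", "\\\\'", "ab'")   # regex replacement: one backslash plus the quote

print(s)        # escaped display, as the console shows it
writeLines(s)   # actual contents of the string

n_chars <- nchar(s)   # counts stored characters: a, b, backslash, quote
n_backslashes <- lengths(regmatches(s, gregexpr("\\", s, fixed = TRUE)))

# With fixed = TRUE the replacement is taken literally,
# so a single escaped backslash in the source is enough
s_fixed <- gsub("'", "\\'", "ab'", fixed = TRUE)
```

Counting with nchar() avoids any doubt about whether a doubled backslash on screen is one character or two.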
Bonus funny: https://xkcd.com/1638/

How to paste special characters in R [duplicate]

I have started learning R and am trying to create a vector as below:
c(""check"")
I need the output to be "check", but I am getting a syntax error. How do I escape the quotes while creating a vector?
As @juba mentioned, one way is directly escaping the quotes.
Another way is to use single quotes around your character expression that has double quotes in it.
> x <- 'say "Hello!"'
> x
[1] "say \"Hello!\""
> cat(x)
say "Hello!"
Other answers nicely show how to deal with double quotes in your character strings when you create a vector, which was indeed the last thing you asked in your question. But given that you also mentioned display and output, you might want to keep dQuote in mind. It's useful if you want to surround each element of a character vector with double quotes, particularly if you don't have a specific need or desire to store the quotes in the actual character vector itself.
# default is to use "fancy quotes"
text <- c("check")
message(dQuote(text))
## “check”
# switch to straight quotes by setting an option
options(useFancyQuotes = FALSE)
message(dQuote(text))
## "check"
# assign result to create a vector of quoted character strings
text.quoted <- dQuote(text)
message(text.quoted)
## "check"
For what it's worth, the sQuote function does the same thing with single quotes.
Use a backslash :
x <- "say \"Hello!\""
And you don't need to use c if you don't build a vector.
If you want to output quotes unescaped, you may need to use cat instead of print :
R> cat(x)
say "Hello!"
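The two approaches above (single-quoted literal versus escaped double quotes) produce the same string. A short sketch to confirm that, with illustrative variable names:

```r
x_single  <- 'say "Hello!"'     # single-quoted literal, no escaping needed
x_escaped <- "say \"Hello!\""   # double-quoted literal with escaped quotes

same <- identical(x_single, x_escaped)  # both literals spell the same string
n <- nchar(x_single)                    # the backslashes are not stored

print(x_single)       # display form, with escapes
cat(x_single, "\n")   # actual contents
```

The backslashes only exist in the source code and in print() output; they are not part of the string itself.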

Searching for a special character using grepl?

I have a data frame, and I want to find special characters, so I use:
example$bb <- ifelse(grepl("*****", example$aa)==T, 1, 0)
But R says :
Error in grepl("*****", example$aa :
invalid regular expression, reason 'Invalid use of repetition operators'
Any suggestion?
How do I write the symbol *****?
* is a regex metacharacter (a repetition operator). Escape it with a backslash, written as \\ inside an R string, to search for a literal asterisk:
grepl('\\*', '***')
[1] TRUE
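For the five-asterisk pattern from the question, a few equivalent ways of matching literally are sketched below. The sample vector x is hypothetical and stands in for example$aa.

```r
x <- c("a*****b", "a**b", "plain")   # hypothetical sample values

hit_fixed   <- grepl("*****", x, fixed = TRUE)  # pattern taken literally, no regex
hit_escaped <- grepl("\\*{5}", x)               # escaped asterisk, repeated 5 times
hit_class   <- grepl("[*]", x)                  # any single literal asterisk

flag <- as.integer(hit_fixed)   # 1/0 indicator, as in the question's ifelse()
```

fixed = TRUE is usually the simplest choice when the pattern contains no regex syntax that you actually want interpreted.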

How to remove ellipsis at the end of Strings in R

I have a list of words, which I got from the code below.
tags_vector <- unlist(tags_used)
Some of the strings in this list have an ellipsis at the end, which I want to remove. Here I print the 5th element of this list and its class:
tags_vector[5]
#[1] "#b…"
class(tags_vector[5])
#[1] "character"
I am trying to remove the ellipsis from this 5th element using gsub, with the code:
gsub("[…]", "", tags_vector[5])
#[1] "#b…"
This code doesn't work and I get "#b…" as output. But when I put the value of the 5th element directly into the same code, it works fine, as below:
gsub("[…]", "", "#b…")
#[1] "#b"
I even tried putting the value of tags_vector[5] in a variable x1 and using it in the gsub() call, but it still didn't work.
It might be a Unicode issue. In R(studio), not all characters are created equally.
I tried to create a reproducible example:
# create the ellipsis from the definition (similar to your tags_used)
> ell_def <- rawToChar(as.raw(c('0xE2','0x80','0xA6'))) # from the unicode definition here: http://www.fileformat.info/info/unicode/char/2026/index.htm
> Encoding(ell_def) <- 'UTF-8'
> ell_def
[1] "…"
> Encoding(ell_def)
[1] "UTF-8"
# create the ellipsis from text (similar to your string)
> ell_text <- '…'
> ell_text
[1] "…"
> Encoding(ell_text)
[1] "latin1"
# show that you can get strange results
> gsub(ell_text,'',ell_def)
[1] "…"
The reproducibility of this example might be dependent on your locale. In my case, I work in windows-1252 since you cannot set the locale to UTF-8 in Windows. According to this stringi source, "R lets strings in ASCII, UTF-8, and your platform's native encoding coexist peacefully". As the example above shows, this might sometimes give contradictory results.
Basically, the two strings look the same when printed, but they differ at the byte level.
If I run this example in the R terminal, I get similar results, but apparently, it shows the ellipsis as a dot: ".".
A quick fix for your example would be to use the ellipsis definition in your gsub. E.g.:
gsub(ell_def,'',tags_vector[5])
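Another way to avoid typing the ellipsis directly is the \u escape, which builds the character from its code point. The sketch below assumes the tags really contain U+2026 and can be converted to UTF-8; the sample value x is hypothetical.

```r
x <- "#b\u2026"   # hypothetical tag ending in U+2026 (horizontal ellipsis)

# enc2utf8() converts the input so that pattern and text use the same encoding
cleaned <- gsub("\u2026", "", enc2utf8(x), fixed = TRUE)

# Variant that only strips an ellipsis at the very end of the string
trimmed <- sub("\u2026$", "", enc2utf8(x))
```

If the strings arrive in a different encoding than they are declared in, the conversion step may need adjusting; checking Encoding(x) first, as in the example above, shows what R believes it has.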

Removing '|' from object names?

I have created a data frame in R with the following name:
table_file1_C.txt|file2_C.txt
This name was generated by the assign() function, in reference to a single .txt file that was generated by a program run on the command line. Here is a sample from the loop that created this object:
assign(x=paste("table_",
dir(file.dir, pattern="\\.txt$")[i],
sep=''),
value=tmpTables[[i]])#tmpTables holds the data I'm manipulating, as read in from readHTMLtable
The issue is that I am unable to reference this object after its creation:
>table_file1_C.txt|file2_C.txt
Error: object 'file2_C.txt' not found
I believe that R is seeing the '|' character, and reading it as an instruction, not a part of the object's name, even though it already accepted it as part of the object's name.
So, I need to strip the | from the object's name. I planned to accomplish this with gsub() embedded within the assign() function, using something like this:
assign(x=paste("table_",#creating the name of the object
gsub(x=dir(file.dir, pattern="\\.txt$")[i],
pattern="|",
replacement="."),#need to remove the | characters!!
sep=''),
value=tmpTables[[i]])
However, this output gives something like this:
[1] ".t.a.b.l.e._.f.i.l.e.1...t.x.t.|.f.i.l.e.2...t.x.t."
As you can see, the name has been mangled, and the | has not actually been removed.
I need to find a way to remove the | from the name, so I can process the object that I have created. Or, prevent it from being included in the name in the first place. I can only do this within R, as I cannot modify the output of the program that I used to generate the data.
Does this make sense? Let me know if more information is needed. Thank you for taking the time to read this.
You need to escape the | character in the regular expression. Unescaped, it is the alternation operator with nothing on either side, i.e. an empty pattern, which matches at every position.
Escaping the character with brackets (character class):
x <- 'a|b'
gsub('[|]', '.', x)
## [1] "a.b"
Escaping with a backslash:
gsub('\\|', '.', x)
## [1] "a.b"
If you don't escape the | character, it is an "or" operation in the regular expression: nothing or nothing, which is the same as matching the empty string. Thus it inserts the . between each character:
gsub('', '.', x)
## [1] ".a.|.b."
gsub('|', '.', x) # Same as above
## [1] ".a.|.b."
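Applied to the file name from the question, the replacement could be sketched as follows. The fixed = TRUE variant is an addition not mentioned above; it treats the pattern as a plain string, so no escaping is needed.

```r
file_name <- "file1_C.txt|file2_C.txt"   # file name taken from the question

safe_fixed <- gsub("|", ".", file_name, fixed = TRUE)  # literal match, no regex
safe_regex <- gsub("[|]", ".", file_name)              # character class, as above

object_name <- paste0("table_", safe_fixed)   # name to pass to assign()
```

Either form can be dropped into the paste() call inside assign() in place of the unescaped pattern.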
Quoting the name with ' ' as per Matthew Lundberg did not work for me, since that just returns the character string itself, but quoting it with backticks did:
> 'file1.txt|file2.txt'
[1] "file1.txt|file2.txt"
>`denovo_AR_C.txt|FOXA1_C.txt`
*data*
Thanks go to Matthew
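If the object name cannot be changed, it can still be reached either with backticks or by looking it up from a string. The sketch below uses a made-up data frame as the object's contents; the named list at the end is a suggested alternative, not something from the thread.

```r
tbl_name <- "table_file1_C.txt|file2_C.txt"
assign(tbl_name, data.frame(a = 1:2))   # hypothetical contents

via_backticks <- `table_file1_C.txt|file2_C.txt`   # backticks quote the name
via_get <- get(tbl_name)                           # lookup by string

# A named list avoids creating awkwardly named objects in the first place
tables <- list()
tables[[tbl_name]] <- via_get
```

With a list, the loop from the question could store each table under its file name and retrieve it with tables[["..."]], whatever characters the name contains.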
