How to replace a caret with str_replace from stringr [duplicate] - r

This question already has answers here:
How do I deal with special characters like \^$.?*|+()[{ in my regex?
(2 answers)
Closed 3 years ago.
I have a code problem need some help(exclude specific string).
I found the str_replace_all to do the job.However,it works on other
characters like"/n", "/t" or "A","B","C",except the "^",I want to exclude
this sign but get a error message
(Error in stri_replace_all_regex(string, pattern, fix_replacement(replacement), : Missing closing bracket on a bracket expression.(U_REGEX_MISSING_CLOSE_BRACKET))
Thanks for your help!
code=c("^GSPC","^FTSE","000001.SS","^HSI","^FCHI","^KS11","^TWII","^GDAXI","^STI")
str_replace_all(code, "([^])", "")

An option is to wrap with fixed and should be fine
library(stringr)
str_replace_all(code, fixed("^"), "")
#[1] "GSPC" "FTSE" "000001.SS" "HSI" "FCHI" "KS11" "TWII" "GDAXI" "STI"
Also, as we are replacing with blank (""), an option is str_remove
str_remove(code, fixed("^"))
Regarding why the OP's code didn't, inside the square brackets, if we use ^, it is not reading the literal character, instead the metacharacter in it looks for characters other than and here it is blank ([^])

Related

How do I gsub ' with empty? [duplicate]

This question already has answers here:
How to remove single quote from a string in R?
(3 answers)
How do I deal with special characters like \^$.?*|+()[{ in my regex?
(2 answers)
Closed 11 months ago.
For example for a string like this
NANYANG-GIRLS'-HIGH-SCHOOL
how do I use gsub to replace ' to empty and make it
NANYANG-GIRLS-HIGH-SCHOOL
when I do it in R, it shows error
You can use either of the following two approaches:
sec_name <- gsub('\'', '', sec_name, fixed=TRUE)
sec_name <- gsub("'", "", sec_name, fixed=TRUE)
This first approach is a correct version of what you were doing. Here, we use single quotes for the strings, but we escape the single quote to make it a literal single quote.

R regex using stringr::str_detect and grepl don't seem to be matching "\\+" when it is surrounded by "\\b" [duplicate]

This question already has answers here:
Why does is this end of line (\\b) not recognised as word boundary in stringr/ICU and Perl
(2 answers)
Closed 3 years ago.
I'm pretty new to regex and am trying to detect a word with the "+" symbol when surrounded by "\\b" in long strings of words but both stringr and grepl are giving me the wrong result.
This is the code that I have wrote:
library(stringr)
str_detect("coversyl +", "\\bcoversyl(plus| plus|\\+| \\+)\\b")
The output is FALSE which is wrong.
What would be the right way to do it?
My guess is that your expression is just fine, maybe missing an space,
\\bcoversyl\\b\\s(\\bplus\\b|\\+)
Please see the demo for additional explanation.
If we might want more than one space, we would simply change \\s to \\s+ and it might work:
\\bcoversyl\\b\\s+(\\bplus\\b|\\+)

How to remove '+ off' from the end of string? [duplicate]

This question already has answers here:
How do I deal with special characters like \^$.?*|+()[{ in my regex?
(2 answers)
Closed 4 years ago.
Similar to R - delete last two characters in string if they match criteria except I'm trying to get rid of the special character '+' as well. I also attached a picture of my output.
When I attempt to use the escape command of '+', I get an error message saying
Error: '\+' is an unrecognized escape in character string starting ""\\s\+"
As you noticed, + is a metacharacter in regex so it needs to be escaped. \+ escapes that character, but \, itself, is a special character in R character strings so it, too, needs to be escaped. This is an R requirement, not a regex requirement.
This means that, instead of '\+', you need to write '\\+'.

How to replace "|" in R [duplicate]

This question already has answers here:
What special characters must be escaped in regular expressions?
(13 answers)
Closed 4 years ago.
I have a dataframe which contains
"HYD_SOA_UNBLOCK~SOA_BLOCK-UK|SOA_BLOCK-DE||SOA_BLOCK-FR||SOA_BLOCK-IT||SOA_BLOCK-ES|"
I want the result to be -
"HYD_SOA_UNBLOCK~SOA_BLOCK-UK|SOA_BLOCK-DE|SOA_BLOCK-FR|SOA_BLOCK-IT|SOA_BLOCK-ES|"
I tried:
leadtemp$collate = gsub("||","|",leadtemp$collate)
but it is not working.
Please help me replace "||" with "|"
As MrFlick suggested, include fixed = TRUE in your gsub statement. The problem occurs because "|" is a Regular Expression operator. Using fixed = TRUE tells gsub to assume the pattern is a string and not a RegEx.
leadtemp$collate = gsub("||","|",leadtemp$collate, fixed=TRUE)
Another (although more complicated) way of doing it would be to escape all the |s:
leadtemp$collate = gsub("\\|\\|","\\|",leadtemp$collate)
Try:
gsub("[|]{2}", "|", leadtemp$collate)
I have defined character class comprising the pipe character and forced gsub to look for exactly two occurrences.Result is:
"HYD_SOA_UNBLOCK~SOA_BLOCK-UK|SOA_BLOCK-DE|SOA_BLOCK-FR|SOA_BLOCK-IT|SOA_BLOCK-ES|"
| is a metacharacter. As you can read here, metacharacters need to be escaped out of with a \. \ is also a metacharacter so it must be escaped out of in the same way. So whenever you want to refer to a | in a string, you have to put \\|. This should make your code work:
leadtemp$collate = gsub("\\|\\|","\\|",leadtemp$collate)

Using Gsub in R to remove a string containing brackets [duplicate]

This question already has answers here:
How do I deal with special characters like \^$.?*|+()[{ in my regex?
(2 answers)
Closed 6 years ago.
I'm trying to use gsub to remove certain parts of a string. However, I can't get it to work, and I think it's because the string to be removed contains brackets. Is there any way around this? Thanks for any help.
The command I want to use:
gsub('(4:4aCO)_','', '(5:3)_(4:4)_(5:3)_(4:4)_(4:4aCO)_(6:2)_(4:4a)')
Returns:
#"(5:3)_(4:4)_(5:3)_(4:4)_(4:4aCO)_(6:2)_(4:4a)"
Expected output:
#"(5:3)_(4:4)_(5:3)_(4:4)_(6:2)_(4:4a)"
A quick test to see if brackets were the problem:
gsub('te','', 'test')
#[1] "st"
gsub('(te)','', '(te)st')
#[1] "()st"
We can by placing the brackets inside the square brackets as () is a metacharacter
gsub('[(]4:4aCO[)]','', '(5:3)(4:4)(5:3)(4:4)(4:4aCO)(6:2)_(4:4a)')
Or with fixed = TRUE to evaluate the literal meaning of that character
gsub('(4:4aCO)','', '(5:3)(4:4)(5:3)(4:4)(4:4aCO)(6:2)_(4:4a)', fixed = TRUE)

Resources