How to remove '+ off' from the end of string? [duplicate] - r

This question already has answers here:
How do I deal with special characters like \^$.?*|+()[{ in my regex?
(2 answers)
Closed 4 years ago.
Similar to "R - delete last two characters in string if they match criteria", except that I'm also trying to get rid of the special character '+'. I also attached a picture of my output.
When I attempt to escape the '+' with a backslash, I get an error message saying
Error: '\+' is an unrecognized escape in character string starting ""\\s\+"

As you noticed, + is a metacharacter in regex so it needs to be escaped. \+ escapes that character, but \, itself, is a special character in R character strings so it, too, needs to be escaped. This is an R requirement, not a regex requirement.
This means that, instead of '\+', you need to write '\\+'.
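A minimal sketch of the doubled backslash in practice. The input vector x is hypothetical, since the question's real data was only shown in a picture; the pattern assumes the unwanted text is "+ off" at the very end of the string, possibly preceded by whitespace.

```r
# Hypothetical input; the original data is not shown in the question.
x <- c("10% + off", "25% + off", "no discount")

# "\\+" in R source becomes the regex \+, i.e. a literal plus sign.
# "\\s*" also removes any whitespace before the plus; "$" anchors at the end.
cleaned <- sub("\\s*\\+ off$", "", x)
print(cleaned)
```

Strings that do not end in "+ off" are returned unchanged.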

Related

How to replace a caret with str_replace from stringr [duplicate]

This question already has answers here:
How do I deal with special characters like \^$.?*|+()[{ in my regex?
(2 answers)
Closed 3 years ago.
I need help removing a specific character from a set of strings.
I found that str_replace_all does the job for other characters such as "/n", "/t" or "A", "B", "C", but not for "^". When I try to remove this character, I get an error message:
Error in stri_replace_all_regex(string, pattern, fix_replacement(replacement), : Missing closing bracket on a bracket expression. (U_REGEX_MISSING_CLOSE_BRACKET)
Thanks for your help!
code=c("^GSPC","^FTSE","000001.SS","^HSI","^FCHI","^KS11","^TWII","^GDAXI","^STI")
str_replace_all(code, "([^])", "")
One option is to wrap the pattern in fixed(), so that it is matched literally rather than as a regex:
library(stringr)
str_replace_all(code, fixed("^"), "")
#[1] "GSPC" "FTSE" "000001.SS" "HSI" "FCHI" "KS11" "TWII" "GDAXI" "STI"
Also, since we are replacing with an empty string (""), another option is str_remove:
str_remove(code, fixed("^"))
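As for why the OP's code didn't work: inside square brackets, a leading ^ is not read as a literal character. It is a metacharacter that negates the class, so the class matches any character other than the ones listed. Here nothing is listed ([^]), so the bracket expression is incomplete and the engine reports a missing closing bracket.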
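For comparison, a short sketch of the same removal in base R, assuming stringr is not loaded. It uses a shortened copy of the question's vector and shows both an escaped caret and fixed = TRUE.

```r
code <- c("^GSPC", "^FTSE", "000001.SS", "^HSI")

# Escape the caret so the regex engine treats it as a literal character.
by_regex <- gsub("\\^", "", code)

# Or skip regex interpretation entirely with fixed = TRUE.
by_fixed <- gsub("^", "", code, fixed = TRUE)

print(by_regex)
print(identical(by_regex, by_fixed))
```

Both calls should give the same result; fixed = TRUE is the base R counterpart of stringr's fixed().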

How to remove beginning-digits only in R [duplicate]

This question already has answers here:
Remove numbers at the beginning and end of a string
(3 answers)
Remove string from a vector in R
(4 answers)
Closed 5 years ago.
I have some strings with digits and alpha characters in them. Some of the digits are important, but the ones at the beginning of the string (and only these) are unimportant. This is due to a peculiarity in how email addresses are stored. So the best example is:
x<-'12345johndoe23#gmail.com'
Should be transformed to johndoe23#gmail.com
Unfortunately, there are no spaces. I have tried gsub('[[:digit:]]+', '', x), but this removes all the digits, not just the leading ones.
Edit: I have found some solutions in other languages: Python: Remove numbers at the beginning of a string
As per my comment:
See regex in use here
^[[:digit:]]+
^ Asserts position at the start of the string
You can do this:
x<-'12345johndoe23#gmail.com'
gsub('^[[:digit:]]+', '', x) # added ^ to anchor the match at the start of the string
Another regex is:
sub('^\\d+','',x)
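To see the effect of the anchor, here is a small sketch on a vector. The second and third elements are made-up test cases; only the first comes from the question.

```r
x <- c("12345johndoe23#gmail.com", "janedoe7#example.com", "2021")

# Anchored: only a run of digits at the start of each string is removed.
stripped <- sub("^[[:digit:]]+", "", x)
print(stripped)

# Unanchored, for contrast: every run of digits is removed.
all_removed <- gsub("[[:digit:]]+", "", x[1])
print(all_removed)
```

Because the pattern is anchored, it can match at most once per string, so sub and gsub behave the same here.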

How to put \' in my string using paste0 function [duplicate]

This question already has answers here:
How to escape backslashes in R string
(3 answers)
Closed 5 years ago.
I have an array:
t <- c("IMCR01","IMFA02","IMFA03")
I want to make it look like this:
"\'IMCR01\'","\'IMFA02\'","\'IMFA03\'"
I tried different ways like:
paste0("\'",t,"\'")
paste0("\\'",t,"\\'")
paste0("\\\\'",t,"\\\\'")
But none of them is correct. Any other functions are OK as well.
Actually your second attempt is correct:
paste0("\\'",t,"\\'")
If you want to tell paste to use a literal backslash, you need to escape it once (but not twice, as you would need within a regex pattern). This would output the following to the console in R:
[1] "\\'IMCR01\\'" "\\'IMFA02\\'" "\\'IMFA03\\'"
The trick here is that R also escapes the backslash in its console output. If you were instead to write the result (here assumed to have been assigned back to t) to a text file, you would see only a single backslash, as you wanted:
write(t, file = "/path/to/your/file.txt")
But why does R need to escape the backslash when writing to its own console? One possibility is that if it were to write a literal \n, this would actually be interpreted by the console as a newline. Hence the need for escaping is still there.
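One way to check this without writing a file is to compare print, which shows the escaped form, with cat, which writes the raw characters. This is only a sketch; the variable name quoted is introduced here for illustration.

```r
t <- c("IMCR01", "IMFA02", "IMFA03")
quoted <- paste0("\\'", t, "\\'")

print(quoted)            # escaped representation, with doubled backslashes
cat(quoted, sep = "\n")  # raw characters, with a single backslash each

# Each element holds one backslash and one quote on either side of the
# six-character code, so its length is 10.
print(nchar(quoted[1]))
```

The character count confirms that the string itself contains only single backslashes; the doubling exists only in the printed representation.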

Using Gsub in R to remove a string containing brackets [duplicate]

This question already has answers here:
How do I deal with special characters like \^$.?*|+()[{ in my regex?
(2 answers)
Closed 6 years ago.
I'm trying to use gsub to remove certain parts of a string. However, I can't get it to work, and I think it's because the string to be removed contains brackets. Is there any way around this? Thanks for any help.
The command I want to use:
gsub('(4:4aCO)_','', '(5:3)_(4:4)_(5:3)_(4:4)_(4:4aCO)_(6:2)_(4:4a)')
Returns:
#"(5:3)_(4:4)_(5:3)_(4:4)_(4:4aCO)_(6:2)_(4:4a)"
Expected output:
#"(5:3)_(4:4)_(5:3)_(4:4)_(6:2)_(4:4a)"
A quick test to see if brackets were the problem:
gsub('te','', 'test')
#[1] "st"
gsub('(te)','', '(te)st')
#[1] "()st"
We can do this by placing each parenthesis inside square brackets, since ( and ) are metacharacters:
gsub('[(]4:4aCO[)]_','', '(5:3)_(4:4)_(5:3)_(4:4)_(4:4aCO)_(6:2)_(4:4a)')
Or with fixed = TRUE, so that the pattern is matched literally:
gsub('(4:4aCO)_','', '(5:3)_(4:4)_(5:3)_(4:4)_(4:4aCO)_(6:2)_(4:4a)', fixed = TRUE)
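A short sketch using the string from the question, comparing a backslash-escaped pattern with fixed = TRUE. It also shows why the unescaped pattern fails: as a regex, (4:4aCO)_ is a group that matches the text "4:4aCO_", which never occurs because a ")" sits between "CO" and "_".

```r
s <- "(5:3)_(4:4)_(5:3)_(4:4)_(4:4aCO)_(6:2)_(4:4a)"

# Unescaped: the parentheses form a group, so nothing in s matches.
unescaped <- gsub("(4:4aCO)_", "", s)

# Escaped parentheses are matched as literal characters.
escaped <- gsub("\\(4:4aCO\\)_", "", s)

# fixed = TRUE treats the whole pattern as plain text.
literal <- gsub("(4:4aCO)_", "", s, fixed = TRUE)

print(identical(unescaped, s))
print(escaped)
print(identical(escaped, literal))
```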

Regular expression for 6 characters [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 5 years ago.
I need a couple of regular expressions in ASP.NET.
The first should accept exactly 6 characters, where the second and third characters must be _ (underscore).
Like: a__cde
The second should also accept exactly 6 characters, where the second character must be an underscore (_), the third must be either an underscore (_) or a hash (#), and the fourth must be a hash (#).
Note: In both regexes, the user may enter only digits, letters or a star (*) at every position other than those mentioned above.
Can anyone help me out with this? I have tried the website below:
http://www.regexr.com/
But I was not able to come up with a proper regex.
Try something like:
1. ^[a-zA-Z0-9\*]__[a-zA-Z0-9\*]{3}$
2. ^[a-zA-Z0-9\*]_[_#]#[a-zA-Z0-9\*]{2}$
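The two patterns can be sanity-checked in any regex engine. The sketch below uses R's grepl, in keeping with the other examples on this page, rather than ASP.NET, and the test strings are made up for illustration. The backslash before * is dropped here because a star needs no escaping inside a character class.

```r
first  <- "^[a-zA-Z0-9*]__[a-zA-Z0-9*]{3}$"
second <- "^[a-zA-Z0-9*]_[_#]#[a-zA-Z0-9*]{2}$"

# Candidates for the first rule: underscores at positions 2 and 3.
first_ok <- grepl(first, c("a__cde", "*__*1z", "a_bcde", "a__cd"))
print(first_ok)

# Candidates for the second rule: "_" at 2, "_" or "#" at 3, "#" at 4.
second_ok <- grepl(second, c("a__#de", "a_##de", "a_b#de", "a__cde"))
print(second_ok)
```

In each vector the first two candidates satisfy the rule and the last two do not, either because a fixed position holds the wrong character or because the string is too short.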

Resources