This question already has answers here:
How do I deal with special characters like \^$.?*|+()[{ in my regex?
(2 answers)
Closed 1 year ago.
I would like to replace $ in my R strings. I have tried:
mystring <- "file.tree.id$HASHd15962267-44c21f1cee1057d95d6840$HASHe92451fece3b3341962516acfa962b2f$checked"
stringr::str_replace(mystring, pattern="$",
replacement="!")
However, it fails and my replacement character is put as the last character in my original string:
[1] "file.tree.id$HASHd15962267-44c21f1cee1057d95d6840$HASHe92451fece3b3341962516acfa962b2f$checked!"
I tried some variations such as pattern = "/$", but they fail as well. Can someone point me to a strategy for doing this?
In base R, you could use:
chartr("$","!", mystring)
[1] "file.tree.id!HASHd15962267-44c21f1cee1057d95d6840!HASHe92451fece3b3341962516acfa962b2f!checked"
Or even
gsub("$","!", mystring, fixed = TRUE)
With stringr, the pattern needs to be wrapped in fixed(), because by default the pattern is interpreted as a regex, and in regex $ matches the end of the string:
stringr::str_replace_all(mystring, pattern = stringr::fixed("$"),
replacement = "!")
Or we could escape it (\\$) or place it in square brackets ([$]), but fixed would be faster.
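As a quick check of the options mentioned above, here is a minimal base R sketch comparing the escaped, bracketed, and fixed forms. The shortened `mystring` is a made-up stand-in for the string in the question, and the variable names are only for illustration.

```r
# Hypothetical, shortened version of the string from the question
mystring <- "file.tree.id$HASHd15962267$checked"

# 1) Escape the metacharacter: "\\$" in R source is the regex \$
escaped <- gsub("\\$", "!", mystring)

# 2) Put it in a character class, where $ loses its special meaning
bracket <- gsub("[$]", "!", mystring)

# 3) Skip regex interpretation entirely
literal <- gsub("$", "!", mystring, fixed = TRUE)

print(escaped)
# All three approaches should agree
print(identical(escaped, bracket) && identical(escaped, literal))
```

Note that an unescaped "$" without fixed = TRUE only matches the empty position at the end of the string, which is why the replacement was appended in the original attempt.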
This question already has answers here:
How to extract everything until first occurrence of pattern
(4 answers)
Closed 1 year ago.
I have a list of file names and want to extract just the part of each name before the _.
I tried using the following but was unsuccessful.
condition <- strsplit(count_files, "_*")
also tried
condition <- strsplit(count_files, "_*.[c,t]sv")
Any suggestions?
Just use trimws from base R
trimws(count_files, whitespace = "_.*")
[1] "Fibroblast" "Fibroblast"
The output from strsplit is a list, so it may need to be unlisted. Also, the regex _* means zero or more _ characters. Instead, it should be _.*, i.e. _ followed by zero or more of any character (.*):
unlist(strsplit(count_files, "_.*"))
data
count_files <- c("Fibroblast_1.csv", "Fibroblast_2.csv")
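Since the goal is to drop everything from the first _ onward, a plain sub() call is another way to get there. This is a minimal sketch using the sample data above; `by_split` is a hypothetical name for a second variant that splits on a literal underscore and keeps the first piece.

```r
count_files <- c("Fibroblast_1.csv", "Fibroblast_2.csv")

# Remove the first underscore and everything after it
condition <- sub("_.*", "", count_files)
print(condition)

# Variant: split on a literal "_" and keep the first piece of each name
by_split <- vapply(strsplit(count_files, "_", fixed = TRUE),
                   function(parts) parts[1], character(1))
print(by_split)
```

Both variants return a character vector directly, so no unlist() step is needed.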
This question already has answers here:
How do I deal with special characters like \^$.?*|+()[{ in my regex?
(2 answers)
Closed 2 years ago.
I am using data.table and I want to mark observations that contain the substring ".." within a longer string. After looking at How to select R data.table rows based on substring match (a la SQL like) I tried
like("Hi!..", "..")
which returns TRUE and
like("Hi!..", "Bye")
returns FALSE. However, surprisingly,
like("Hi!", "..")
returns TRUE! If this is a feature, why is that? And what can I use instead if I want to check a substring for non-letter characters?
You have to escape the special character "." with a backslash, which itself must be doubled inside an R string ("\\."):
like("Hi!", "\\.\\.")
The second argument to like() is a regular expression, and . has a special meaning in regex: it matches any single character. If you want to look for . literally, add the argument fixed = TRUE.
like("Hi!", "..", fixed = TRUE)
# [1] FALSE
like("Hi!..", "..", fixed = TRUE)
# [1] TRUE
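The same behaviour can be reproduced without data.table. The sketch below uses base R's grepl(), on the assumption that like() behaves as a thin wrapper around it; the vector `x` is made up for illustration.

```r
# Hypothetical test strings
x <- c("Hi!..", "Hi!", "Bye")

# Regex mode: ".." means "any two characters", so every string matches
any_two <- grepl("..", x)

# Literal mode: only strings that really contain two dots match
literal <- grepl("..", x, fixed = TRUE)

# Regex mode with escaped dots gives the same result as literal mode
escaped <- grepl("\\.\\.", x)

print(any_two)
print(literal)
print(escaped)
```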
This question already has answers here:
Regex group capture in R with multiple capture-groups
(9 answers)
Closed 2 years ago.
Let's say I have a string like:
Str = "#sometext_any_character_including_&**(_etc_blabla\\s"
Now I want to replace above text with
"#some\\s"
i.e. I just want to retain the beginning #, the first 4 characters after it, and the trailing "\\s". Is there any R way to do this?
Any pointer will be highly appreciated.
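I would extract it using a regex: capture the leading # plus the four letters after it, and the trailing \s, then join the matches. For example, in Python:
import re
# The input string from the question; the raw string keeps the backslash literal
my_string = r"#sometext_any_character_including_&**(_etc_blabla\s"
# Match either "#" plus four lowercase letters, or a literal backslash followed by "s"
pattern = re.compile(r"(#[a-z]{4}|\\s)")
my_match = "".join(pattern.findall(my_string))
# my_match is now "#some\s"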
An option with sub
sub("^(#.{4}).*(\\\\s)$", "\\1\\2", Str)
#[1] "#some\\s"
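To see what each capture group in that pattern picks up, the groups can be pulled out individually with regexec() and regmatches() from base R. This is only an illustrative sketch; `m` and `result` are hypothetical names.

```r
Str <- "#sometext_any_character_including_&**(_etc_blabla\\s"

# regexec() returns the full match followed by each capture group
m <- regmatches(Str, regexec("^(#.{4}).*(\\\\s)$", Str))[[1]]

# m[1] is the whole match, m[2] is group 1, m[3] is group 2
print(m[2])
print(m[3])

# Pasting the two groups together gives the same result as the sub() call
result <- paste0(m[2], m[3])
print(result)
```

Here "\\\\s" in R source is the regex \\s, which matches a literal backslash followed by the letter s rather than a whitespace character.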
You can use
str_replace(string, pattern, replacement)
or
str_replace_all(string, pattern, replacement)
This question already has answers here:
Replace single backslash in R
(5 answers)
Closed 3 years ago.
I'm trying to use a regular expression in a sub() call in order to replace all the "\" in a vector.
I've tried a number of different ways to get R to recognize the "\":
I've tried "\\\" but I keep getting errors.
I've tried "\.*"
I've tried "\\\.*"
data.frame1$vector4 <- sub(pattern = "\\\", replace = ", data.frame1$vector4)
The \ that I am trying to get rid of only appears occasionally in the vector and always in the middle of the string. I want to get rid of it and all the characters that follow it.
The error that I am getting
Error: '\.' is an unrecognized escape in character string starting "\."
Also I'm struggling to get Stack to print the "\" that I am typing above. It keeps deleting them.
1) 4 backslashes. To insert a backslash into an R string literal, use a double backslash; however, a backslash is also a metacharacter in a regular expression, so it must be escaped by prefacing it with another backslash, which in turn has to be doubled. Thus 4 backslashes are needed in the regular expression.
s <- "a\\b\\c"
nchar(s)
## [1] 5
gsub("\\\\", "", s)
## [1] "abc"
2) character class. Another way to effectively escape it is to surround it with [...]
gsub("[\\]", "", s)
## [1] "abc"
3) fixed argument. Perhaps the simplest way is to use fixed=TRUE, in which case special characters will not be regarded as regular expression metacharacters.
gsub("\\", "", s, fixed = TRUE)
## [1] "abc"
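The question also asks to drop everything that follows the backslash. A minimal sketch of that, assuming a made-up vector `x` in place of data.frame1$vector4, extends the 4-backslash pattern with .* so the match runs to the end of the string:

```r
# Hypothetical data: the first element contains one literal backslash
x <- c("keep\\drop this", "no backslash here")

# "\\\\.*" is the regex \\.* : a literal backslash and everything after it
cleaned <- sub("\\\\.*", "", x)
print(cleaned)

# The character-class form should behave the same way
cleaned2 <- sub("[\\].*", "", x)
print(cleaned2)
```

Strings without a backslash are returned unchanged. fixed = TRUE is not an option here, because .* needs regex interpretation.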
This question already has answers here:
Error: '\R' is an unrecognized escape in character string starting "C:\R"
(5 answers)
Closed 2 years ago.
I am not an expert on Regex in R, but I feel I have read the docs first long enough and still come up short, so I am posting here.
I am trying to replace the following string, all LITERALLY as written:
a = "\\begin{tabular}"
a = gsub("\\begin{tabular}", "\\scalebox{0.7}{
\\begin{tabular}", a)
Desired output is : cat('\\scalebox{0.7}{ \\begin{tabular}')
So I know I need to escape the first "\\" as "\\\\", but when I escape the brackets I get
Error: '\}' is an unrecognized escape in character string starting...
In your case, since you're replacing a fixed string, you can simply set fixed = TRUE to avoid regular expressions entirely.
a = "\\begin{tabular}"
a = gsub("\\begin{tabular}", "\\scalebox{0.7}{\n\\begin{tabular}", x = a, fixed = TRUE)
Use \n for the newline.
If you do want to use a regex, you need to escape each curly bracket in the pattern with two backslashes rather than one, e.g.:
a = "\\begin{tabular}"
gsub(pattern = "\\{|\\}", replacement = "_foo_", x=a)
[1] "\\begin_foo_tabular_foo_"
Alternatively, you can enclose the curly brackets in square brackets, like so:
a = "\\begin{tabular}"
gsub(pattern = "[{]|[}]", replacement = "_foo_", x=a)
[1] "\\begin_foo_tabular_foo_"
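Putting the escaping rules together, the full replacement from the question can also be done in regex mode. The sketch below is one way to write it, with `via_regex` and `via_fixed` as hypothetical names. In regex mode a backslash in the replacement string is itself special, so a literal backslash has to be written as "\\\\" there too.

```r
a <- "\\begin{tabular}"

# Pattern: regex \\begin\{tabular\} matches the literal text \begin{tabular}
# Replacement: "\\\\" produces one literal backslash in the result
via_regex <- gsub("\\\\begin\\{tabular\\}",
                  "\\\\scalebox{0.7}{\n\\\\begin{tabular}",
                  a)

# Same replacement with fixed = TRUE, for comparison
via_fixed <- gsub("\\begin{tabular}",
                  "\\scalebox{0.7}{\n\\begin{tabular}",
                  a, fixed = TRUE)

cat(via_regex, "\n")
print(identical(via_regex, via_fixed))
```

The fixed = TRUE version needs half as many backslashes, which is a good reason to prefer it when no regex features are required.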