Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
Given the string, how can I get the desired outcome?
('9','','','','','','','','31.23','testing7'),('10','','','','','','','','31.23','testing10')
Desired Output
(9,'','','','','','','',31.23,'testing7'),(10,'','','','','','','',31.23,'testing10')
Try the following regex:
s <- "('9','','','','','','','','31.23','testing7'),('10','','','','','','','','31.23','testing10')"
gsub("'(-?\\d+(?:[\\.,]\\d+)?)'", x = s, replacement = "\\1")
Regex explanation:
' match literal single quote character
() capture group
? match between 0-1 times
\\d+ match digit 1 and unlimited times
\\1 group 1
This should allow for negative numbers and decimals.
Output
"(9,'','','','','','','',31.23,'testing7'),(10,'','','','','','','',31.23,'testing10')"
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I have a string like the below
x <- "Supplier will initially respond to High Priority incidents. Supplier will subsequently update EY every 60 minutes or at an interval EY specifies. Reporting and Response times will be capture in ServiceNow which, save in respect of manifest error, will be conclusive proof of the time period taken."
I want to extract 2 words after the word "every".
How this can be achieved in R?
We can use str_extract_all with a regex lookbehind ((?<=every\\s)) followed by two words
library(stringr)
unlist(str_extract_all(x, "(?<=every\\s)(\\w+\\s+\\w+)"))
#[1] "60 minutes"
Or using base R
regmatches(x, gregexpr("(?<=every\\s)(\\w+\\s+\\w+)", x, perl = TRUE))[[1]]
#[1] "60 minutes"
Another option in base R:
Split the string into words, find the index of the word "every", and then select the next two words after that index.
wordsplit <- unlist(strsplit(x, " ", fixed = TRUE))
indx <- grep("\\bevery\\b", wordsplit)
wordsplit[(indx+1):(indx +2)]
#[1] "60" "minutes"
Or, as @DavidArenburg suggested, we can also use match instead of grep
wordsplit[match("every", wordsplit) + 1:2]
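The split-and-index approach assumes "every" occurs exactly once. If it might occur several times (or not at all), something like the sketch below may be more forgiving. The function name words_after and the sample sentence are hypothetical, not from the original post.

```r
# Hypothetical helper: return the n words following each occurrence of a keyword.
words_after <- function(text, keyword, n = 2) {
  words <- strsplit(text, "\\s+")[[1]]   # split on runs of whitespace
  hits <- which(words == keyword)        # positions of every exact match
  lapply(hits, function(i) {
    idx <- i + seq_len(n)
    words[idx[idx <= length(words)]]     # do not run past the end of the text
  })
}

res <- words_after("check every 60 minutes and every two hours", "every")
res
#[[1]]
#[1] "60"      "minutes"
#
#[[2]]
#[1] "two"   "hours"
```

It returns a list with one element per occurrence, and an empty list when the keyword is absent. Punctuation stays attached to the words (for example "minutes."), so you may want to strip it first.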
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I have this string, for instance:
str1 = "UNCID_999277.TCGA-CV-7254-01A-11R-2016-07.111118_UNC11-SN627_0167_AD09WDACXX_TAGCTT.txt"
I would like to extract this substring, for instance:
TCGA-CV-7254
I tried something like this:
gsub(pattern = "(*.)(TCGA*)(.*)",
replacement = "\\2",
x = nameArq)
But it returns:
[1] "UNCID_999277TCGA"
Thanks for any help!
You almost had it. In the first set of parentheses, the period needs to come before the asterisk (.* means "match any character, any number of times"). You also need a unique endpoint for the second part of your regex; here the 4 that ends 7254 works, because it is the last 4 in the string.
gsub(pattern = "(.*)(TCGA.*4)(.*)",
replacement = "\\2",
x = str1)
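Relying on the last 4 in the string is fragile if other file names end the ID with a different digit. A possible alternative, assuming the ID always looks like TCGA, a hyphen, an alphanumeric block, a hyphen, and another alphanumeric block (that format is my assumption, not something stated in the question), is to match the ID directly:

```r
str1 <- "UNCID_999277.TCGA-CV-7254-01A-11R-2016-07.111118_UNC11-SN627_0167_AD09WDACXX_TAGCTT.txt"

# regexpr() locates the first match; regmatches() extracts it.
# Assumed format: TCGA-<letters/digits>-<letters/digits>
barcode <- regmatches(str1, regexpr("TCGA-[A-Z0-9]+-[A-Z0-9]+", str1, perl = TRUE))
barcode
#[1] "TCGA-CV-7254"
```

If a string contains no match, regmatches returns character(0) instead of the unchanged input, which makes failures easier to spot than with gsub.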
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I need to anonymize names but in a very specific way so that the format of the entire string is still the same (spaces, hyphens, periods are preserved) but all the letters are scrambled. I want to consistently replace say all A's with C's, all D's with Z's, and so on. How would I do that?
We can use chartr
chartr('AD', 'CZ', str1)
#[1] "CZ,ZC. C"
data
str1 <- c('AD,DA. C')
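To scramble every letter rather than just two, chartr can be given the whole alphabet and a shuffled copy of it. This is only a sketch: the seed, the function name scramble, and the sample name are all made up for illustration.

```r
set.seed(1)                    # arbitrary seed so the mapping is reproducible
shuffled <- sample(LETTERS)    # random permutation of A-Z

# Map upper and lower case with the same permutation.
old_chars <- paste(c(LETTERS, letters), collapse = "")
new_chars <- paste(c(shuffled, tolower(shuffled)), collapse = "")

# Hypothetical helper: apply the fixed substitution to any character vector.
scramble <- function(x) chartr(old_chars, new_chars, x)

name_in <- "Anne-Marie O. Smith"   # made-up example name
name_out <- scramble(name_in)
name_out
```

Spaces, hyphens and periods pass through untouched because they are not in the mapping. Note that a random permutation can map a letter to itself, and a simple substitution like this is easy to reverse, so treat it as format-preserving masking rather than strong anonymization.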
Maybe use gsub?
string <- "ABCDEFG"
string <- gsub('A', 'C', string)
string <- gsub('D', 'Z', string)
string
[1] "CBCZEFG"
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I have a string like this: x <- "avd_1xx_2xx_3xx"
I need to extract the numbers from x and put them in new variables:
num1 <- 1xx
num2 <- 2xx
num3 <- 3xx
However, I can't predict the number of digits in each number.
For instance, x could be "avd_1_2_3" or "avd_11_21_33", and so on.
Could you suggest a solution?
Thanks
We can use str_extract from stringr. To extract multiple matches we use str_extract_all, which returns a list of length 1 (as we have a single element in 'x'). To extract the list element, we can use [[ i.e. [[1]].
library(stringr)
str_extract_all(x, "\\d+[a-z]*")[[1]]
#[1] "1xx" "2xx" "3xx"
A similar option using base R would be regmatches/gregexpr
regmatches(x, gregexpr("\\d+[a-z]*", x))[[1]]
#[1] "1xx" "2xx" "3xx"
The pattern we match is one or more digits (\\d+) followed by zero or more lower case letters ([a-z]*).
It is better to keep it as a vector rather than having multiple objects in the global environment.
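If you still want to refer to the pieces by name, a named vector gives you that without creating separate objects. A minimal sketch, assuming the pieces are purely numeric (as in "avd_11_21_33") and using num1, num2, ... as names chosen only for illustration:

```r
x <- "avd_11_21_33"

# Pull out each run of digits and convert to numeric.
nums <- as.numeric(regmatches(x, gregexpr("\\d+", x))[[1]])

# Label the elements num1, num2, ... instead of creating separate variables.
names(nums) <- paste0("num", seq_along(nums))

nums
#num1 num2 num3 
#  11   21   33 
nums[["num2"]]
#[1] 21
```

This works for any number of matches, since the names are generated from the length of the result.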