I'm trying to use stringr to extract the digits that occur at the end of each string.
For example
x <- c("Africa-123-Ghana-2", "Oceania-123-Sydney-200")
and the stringr operation should return
"2 200"
I believe there might be multiple methods, but what would be the best code for this?
Thanks.
You could use
sub(".*-(\\d+)$", "\\1", x)
#[1] "2" "200"
Or
stringr::str_extract(x, "\\d+$")
Or
stringi::stri_extract_last_regex(x, "\\d+")
We can use regexpr/regmatches in base R to match one or more digits (\\d+) at the end ($) of the string
regmatches(x, regexpr("\\d+$", x))
#[1] "2" "200"
Or with sub, we match characters until the last character that is not a digit and replace with blank ("")
sub(".*\\D+", "", x)
#[1] "2" "200"
Or using strsplit
sapply(strsplit(x, "-"), tail, 1)
#[1] "2" "200"
Or using stringr with str_match
library(stringr)
str_match(x, "(\\d+)$")[,1]
#[1] "2" "200"
Or with str_remove
str_remove(x, ".*\\D+")
#[1] "2" "200"
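One thing the answers above do not show is what happens when an element has no trailing digits: regmatches() silently drops such elements, so the result can end up shorter than the input. The following is a minimal base R sketch that keeps the output aligned with the input by filling non-matches with NA. The third element of x is a made-up value added only to illustrate the non-matching case.

```r
# "Asia-77-Tokyo" is a hypothetical extra element with no trailing digits
x <- c("Africa-123-Ghana-2", "Oceania-123-Sydney-200", "Asia-77-Tokyo")

# regexpr() returns -1 for elements where the pattern does not match
m <- regexpr("[0-9]+$", x)

# Start from all-NA so the result has the same length as x
trailing <- rep(NA_character_, length(x))
trailing[m != -1] <- regmatches(x, m)

trailing
as.integer(trailing)
```

If numbers rather than strings are needed, as.integer() converts the matches and leaves the NA in place.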
I thought I could use str_extract_all or something else from the tidyverse, but I am not sure how to do it, because what I get back is not correct.
This is the string:
string1 <- "12, 47, 48 The integers numbers are also interesting: 189 2036 314 \',\' is a separator, so please extract these numbers 125,789,1450 and also these 564,90456. 7890$ per month "
We can use str_extract_all to extract every run of one or more digits (\\d+). The output is a list of length 1, so we extract the list element with [[
library(stringr)
str_extract_all(string1, "\\d+")[[1]]
-output
[1] "12" "47" "48" "189" "2036" "314" "125" "789" "1450" "564" "90456" "7890"
For a base R option, we can use regmatches along with gregexpr:
regmatches(string1, gregexpr("\\d+", string1))[[1]]
[1] "12" "47" "48" "189" "2036" "314" "125" "789" "1450" "564" "90456" "7890"
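Both approaches return character vectors. If the values are going to be used as numbers, a conversion step is needed. Here is a small sketch in base R; the input is a shortened, made-up variant of the string from the question, used only to keep the example compact.

```r
# Shortened sample input (an assumption, not the full string from the question)
string1 <- "12, 47, 48 and 125,789,1450 then 7890$ per month"

# gregexpr() finds every run of digits; [[1]] picks the matches for the single input string
digits <- regmatches(string1, gregexpr("[0-9]+", string1))[[1]]

# Convert the character matches to numeric values
nums <- as.numeric(digits)
nums
```

Note that this treats "125,789,1450" as three separate numbers, which is what the question asks for; a comma used as a thousands separator would need different handling.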
I'm trying to find the location of the character "|" in a string.
For example: 8,75.2|6,0.376
The answer I expect is 7.
I tried using regexpr:
regexpr('|',"8,75.2|6,0.376")
but it didn't work (although it did work when I looked for the ",").
Any ideas?
The '|' character is a special character in regular expressions (it denotes alternation). You can search for a literal '|' by escaping it with a backslash, which must itself be doubled inside an R string:
regexpr("\\|","8,75.2|6,0.376")
Another option is to use lapply:
> str <- '8,75.2|6,0.376'
> chars <- strsplit(str, '')
> chars
[[1]]
[1] "8" "," "7" "5" "." "2" "|" "6" "," "0" "." "3" "7" "6"
> loc <- lapply(chars, function(elem) which (elem == '|'))
> loc
[[1]]
[1] 7
See the lapply documentation
You can use the stringr package:
library(stringr)
str_locate("8,75.2|6,0.376",fixed('|'))
#or
str_locate("8,75.2|6,0.376",'\\|')
sample result:
start end
[1,] 7 7
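For completeness, base R can also do a literal search without any escaping. This is a minimal sketch using the fixed = TRUE argument of regexpr() and gregexpr(), plus a bracket expression as an alternative; the string "a|b|c" is a made-up input to show multiple matches.

```r
s <- "8,75.2|6,0.376"

# fixed = TRUE treats the pattern as a literal string, so "|" needs no escaping
pos_fixed <- as.integer(regexpr("|", s, fixed = TRUE))

# Inside a bracket expression "|" is also taken literally
pos_class <- as.integer(regexpr("[|]", s))

# gregexpr() reports every occurrence, not only the first one
# ("a|b|c" is a hypothetical input used for illustration)
all_pos <- as.integer(gregexpr("|", "a|b|c", fixed = TRUE)[[1]])

pos_fixed
pos_class
all_pos
```

Wrapping the result in as.integer() drops the match.length attributes, leaving just the positions.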
I have an integer vector
a <- (0:3)
And I would like to convert it to a character string that looks like this
b <- "(0:3)"
I have tried
as.character(a)
[1] "0" "1" "2" "3"
and
toString(a)
[1] "0, 1, 2, 3"
But neither does exactly what I need.
Can anyone help me get from a to b?
Many thanks in advance!
paste0("(", min(a), ":", max(a), ")")
#[1] "(0:3)"
Or more concisely with sprintf():
sprintf("(%d:%d)", min(a), max(a))
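Both the paste0() and sprintf() versions rebuild the label from min(a) and max(a), so they only describe the vector faithfully when it is a run of consecutive increasing integers. A small sketch that makes this assumption explicit follows; range_label is a hypothetical helper name, not something from the question.

```r
# range_label() is a hypothetical helper name used for illustration
range_label <- function(v) {
  # Assumption: v is a non-empty run of consecutive increasing integers
  stopifnot(length(v) > 0, all(diff(v) == 1))
  sprintf("(%d:%d)", min(v), max(v))
}

range_label(0:3)
```

With an input such as c(0L, 2L, 5L) the check fails with an error instead of returning a misleading "(0:5)".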
One option is deparse and paste the brackets
as.character(glue::glue('({deparse(a)})'))
#[1] "(0:3)"
Another option would be to store as a quosure and then convert it to character
library(rlang)
a <- quo((0:3))
quo_name(a)
#[1] "(0:3)"
it can be evaluated with eval_tidy
eval_tidy(a)
#[1] 0 1 2 3
I have data like the example below and need to extract the text that comes before any number. If the text and the number can be separated, that would be even better.
df<-c("axz123","bww2","c334")
output
"axz", "bww", "c"
or
"axz","bww","c"
"123","2","334"
We can do:
df <- c("axz123","bww2","c334")
gsub("\\d+", "", df)
#[1] "axz" "bww" "c"
gsub("(\\D+)", "", df)
#[1] "123" "2" "334"
For your other example:
df <- "BAILEYS IRISH CREAM 1.75 LITERS REGULAR_NOT FLAVORED"
gsub("\\d.*", "", df)
#[1] "BAILEYS IRISH CREAM "
gsub("[A-Z_ ]*", "", df)
#[1] "1.75"
We can use [:alpha:] to match the alphabetic characters, and combine this with gsub() and a negation to remove all characters that are not alphabetic:
gsub("[^[:alpha:]]", "", df)
#[1] "axz" "bww" "c"
To obtain only the non-alphabetic characters we can drop the negation ^:
gsub("[[:alpha:]]", "", df)
#[1] "123" "2" "334"
Using str_extract and a regex lookahead: we match one or more alphabetic characters that are immediately followed by a digit ((?=\\d)) and extract them.
library(stringr)
str_extract(df, "[[:alpha:]]+(?=\\d)")
#[1] "axz" "bww" "c"
If we need to separate the numeric and non-numeric, strsplit can be used
lst <- strsplit(df, "(?<=[^0-9])(?=[0-9])", perl=TRUE)
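If the goal is two parallel vectors rather than a list, one possible base R sketch is to strip each part with sub() and collect the results in a data frame. It assumes every element has the shape "letters then digits", as in the sample data; the column names text and number are chosen here for illustration.

```r
df <- c("axz123", "bww2", "c334")

# Assumption: each element is letters followed by digits
parts <- data.frame(
  text   = sub("[0-9]+$", "", df),               # drop the trailing digits
  number = as.integer(sub("^[^0-9]+", "", df)),  # drop the leading non-digits
  stringsAsFactors = FALSE
)

parts
```

This keeps the text and the numeric part aligned row by row, and the numbers are stored as integers rather than strings.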
I have a string,
x = "[1,2,3]"
How can I get the elements 1 and 2 from the string?
I tried the strsplit but that seems a bit tricky. Then I tried splitting on "[", and that also did not seem easy.
You could use regex lookaround to extract the numbers
library(stringr)
str_extract_all(x, '(?<=\\[|,)\\d+(?=,)')[[1]]
#[1] "1" "2"
A base option: here we just remove the brackets and split on ",", though do note @MrFlick's comment.
strsplit(gsub("\\[|\\]", "", x), ",")[[1L]][1:2]
# [1] "1" "2"
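Both answers return character values. If the elements are needed as numbers, the base approach extends naturally; this is a minimal sketch assuming the string always has the form of a bracketed, comma-separated list of numbers.

```r
x <- "[1,2,3]"

# Remove the surrounding brackets
inner <- gsub("\\[|\\]", "", x)

# Split on literal commas and convert the pieces to numeric
nums <- as.numeric(strsplit(inner, ",", fixed = TRUE)[[1]])

# First two elements
nums[1:2]
```

Any piece that is not a valid number would become NA with a warning, so malformed input is at least visible.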