I need to mask some characters in a string. For example, "5489888811178620" should become
"XXXXXXXXXXXX8620".
Thanks for your help.
Assuming this is some credit card number with a fixed length of 16 digits:
regexp_replace(col, '(\d{12})(\d{4})', 'XXXXXXXXXXXX\2')
This replaces the first 12 digits of a 16-digit number with X and keeps the last 4 digits through the \2 backreference.
select LPAD('X', 12, 'X') || RIGHT('5489888811178620',4);
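For comparison outside SQL, here is a minimal Python sketch of the same two ideas: padding plus the last four characters, and a regex with a backreference. The helper names mask_card and mask_card_regex are hypothetical, and the regex variant assumes the value is exactly 16 digits, as the answer above does.

```python
import re


def mask_card(number: str, visible: int = 4, mask_char: str = "X") -> str:
    """Mask everything except the last `visible` characters (hypothetical helper)."""
    if len(number) <= visible:
        return number  # nothing to mask
    cut = len(number) - visible
    return mask_char * cut + number[cut:]


def mask_card_regex(number: str) -> str:
    """Regex variant; assumes exactly 16 digits, otherwise returns the input unchanged."""
    return re.sub(r"^\d{12}(\d{4})$", r"XXXXXXXXXXXX\1", number)


print(mask_card("5489888811178620"))        # XXXXXXXXXXXX8620
print(mask_card_regex("5489888811178620"))  # XXXXXXXXXXXX8620
```

The slicing version does not care about the length of the input, while the regex version leaves anything that is not a 16-digit string untouched.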
I've solved Advent of Code 2022 day 6, but was wondering if there is a regex way to find the first occurrence of 4 non-repeating characters:
From the question:
bvwbjplbgvbhsrlpgdmjqwftvncz
bvwbjplbgvbhsrlpgdmjqwftvncz
# discard as repeating letter b
bvwbjplbgvbhsrlpgdmjqwftvncz
# match the 5th character, which signifies the end of the first four character block with no repeating characters
in R I've tried:
txt <- "bvwbjplbgvbhsrlpgdmjqwftvncz"
str_match("(.*)\1", txt)
But I'm having no luck
You can use
stringr::str_extract(txt, "(.)(?!\\1)(.)(?!\\1|\\2)(.)(?!\\1|\\2|\\3)(.)")
See the regex demo. Here, each (.) captures any character into a consecutively numbered group, and the (?!...) negative lookaheads make sure each subsequent . does not match any character that has already been captured.
See the R demo:
library(stringr)
txt <- "bvwbjplbgvbhsrlpgdmjqwftvncz"
str_extract(txt, "(.)(?!\\1)(.)(?!\\1|\\2)(.)(?!\\1|\\2|\\3)(.)")
## => [1] "vwbj"
Note that stringr::str_match (like stringr::str_extract) takes the input string as the first argument and the regex as the second argument, so the arguments in your attempt are swapped.
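As a cross-check in another language, the same pattern also works with Python's re module, which supports backreferences inside negative lookaheads. This is only a sketch: the function names first_marker_regex and first_marker_window are hypothetical, and the second one is a regex-free sliding window added to confirm the result.

```python
import re

# Same pattern as in the R answer above.
PATTERN = re.compile(r"(.)(?!\1)(.)(?!\1|\2)(.)(?!\1|\2|\3)(.)")


def first_marker_regex(text):
    """Return (block, end_position) for the first 4 distinct characters, or None."""
    m = PATTERN.search(text)
    return (m.group(0), m.end()) if m else None


def first_marker_window(text, size=4):
    """Regex-free cross-check: slide a window and compare the size of its set."""
    for end in range(size, len(text) + 1):
        window = text[end - size:end]
        if len(set(window)) == size:
            return (window, end)
    return None


txt = "bvwbjplbgvbhsrlpgdmjqwftvncz"
print(first_marker_regex(txt))   # ('vwbj', 5)
print(first_marker_window(txt))  # ('vwbj', 5)
```

The end position is the 1-based index of the last character of the block, which is the number the puzzle asks for. The window version is easier to extend to longer blocks, since the regex needs one more group and lookahead per extra character.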
I have a vector of strings with 24 digits each. Each digit represents an hour, and if the digit is "0" then the rate from period 0 applies and if the digit is 1 then the rate from period 1 applies.
As an example consider the two strings below. I would like to return the number of periods in each string. For example:
str1 <- "000000000000001122221100"
str2 <- "000000000000000000000000"
#str1: 3
#str2: 1
Any recommendations? I've been thinking about how to use str_count from stringr here. Also, I've searched other posts but most of them focus on counting letters in character strings, whereas this is a slight modification because the string contains digits and not letters.
Thanks!
Here is another option using charToRaw, which converts the string to a vector of bytes so that the distinct values can be counted:
length(unique(charToRaw(str1)))
[1] 3
length(unique(charToRaw(str2)))
[1] 1
This is an ugly solution, but here goes
length(unique(unlist(strsplit(str1,split = ""))))
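Both answers count the distinct characters in the string. The same idea can be sketched in Python with a set; the function name count_periods is hypothetical, and the sketch assumes that every distinct digit stands for one period.

```python
def count_periods(schedule: str) -> int:
    """Count the distinct characters (periods) in a 24-character schedule string."""
    return len(set(schedule))


str1 = "000000000000001122221100"
str2 = "000000000000000000000000"

print(count_periods(str1))  # 3
print(count_periods(str2))  # 1
```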
I have some strings like the ones below.
I converted all letters and special characters to X:
xxxxxx18002514919xxxxxxxxxxxxxxxxxxxxxxxxxx24XXXXXX7
xxxxxx9000012345xxxxxxxxxxxxx34567xxxxxxxxxxxxx1800XXXXXX7
How can I extract only the 10-digit or 11-digit phone numbers from the above strings in R?
My Desired Output is:
For first string: 18002514919
For second string: 9000012345
You can use the stringr package to solve your problem. It has a function called str_extract_all that extracts every match of a pattern, which gives you the phone numbers as desired.
The regex:
\\d --> matches a single digit,
{n,m} --> is a quantifier: n is the minimum number of repetitions and m is the maximum. Since you want to match a phone number that is 10 or 11 digits long, n is 10 and m is 11.
X <- c("xxxxxx18002514919xxxxxxxxxxxxxxxxxxxxxxxxxx24XXXXXX7","xxxxxx9000012345xxxxxxxxxxxxx34567xxxxxxxxxxxxx1800XXXXXX7")
library(stringr)
str_extract_all(X,"\\d{10,11}")
Output:
> str_extract_all(X,"\\d{10,11}")
[[1]]
[1] "18002514919"
[[2]]
[1] "9000012345"
If you are sure that each string contains only one phone number, use str_extract instead, which returns only the first match per string as a character vector.
> str_extract(X,"\\d{10,11}")
[1] "18002514919" "9000012345"
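One caveat with \d{10,11}: a run of more than 11 digits would still yield a partial match. The Python sketch below shows the plain pattern next to a stricter variant with lookarounds that only accepts runs of exactly 10 or 11 digits. The stricter pattern is an assumption about what is wanted and is not part of the original answer; the 15-digit string is a made-up test value.

```python
import re

X = [
    "xxxxxx18002514919xxxxxxxxxxxxxxxxxxxxxxxxxx24XXXXXX7",
    "xxxxxx9000012345xxxxxxxxxxxxx34567xxxxxxxxxxxxx1800XXXXXX7",
]

LOOSE = re.compile(r"\d{10,11}")
# Assumption: digit runs longer than 11 should be rejected, not truncated.
STRICT = re.compile(r"(?<!\d)\d{10,11}(?!\d)")

print([LOOSE.findall(s) for s in X])   # [['18002514919'], ['9000012345']]
print([STRICT.findall(s) for s in X])  # [['18002514919'], ['9000012345']]

too_long = "xx123456789012345xx"       # made-up 15-digit run
print(LOOSE.findall(too_long))         # ['12345678901']
print(STRICT.findall(too_long))        # []
```

On the two sample strings both patterns agree, so the difference only matters if longer digit runs can occur in the data.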
I have a variable in a data frame whose values have either 5 or 6 characters. The values with 5 characters are all digits, e.g. 27701. The values with 6 characters all have a 'C' preceding the digits, e.g. C22701.
How can I replace the 'C' characters with, for example, 999?
I have tried:
replace(data$varname,'C',999)
Any ideas folks?
data$varname <- as.numeric(gsub('C', '999', data$varname)) should do the trick, I think. Assuming you want a numeric vector in the end. If you want a character vector, then you can leave as.numeric off.
You can use substring to remove the first character, and paste0 to put 999 in front of the rest. Note that this drops the first character of every value, whatever it is:
> x <- c("C000", "P1745")
> paste0("999", substring(x,2))
# [1] "999000" "9991745"
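The two answers behave differently on values without a leading C: gsub leaves them alone, while the substring approach removes their first character too. A minimal Python sketch of the first behaviour, anchored to the start of the string, is given below. The function name replace_leading_c is hypothetical, and the sample values are the two from the question.

```python
import re


def replace_leading_c(value: str, prefix: str = "999") -> str:
    """Replace a leading 'C' with `prefix`; other values are returned unchanged."""
    return re.sub(r"^C", prefix, value)


values = ["27701", "C22701"]
converted = [int(replace_leading_c(v)) for v in values]
print(converted)  # [27701, 99922701]
```

Anchoring with ^ is a small design choice: it guarantees that only a C in the first position is replaced, even if a C were to appear elsewhere in a value.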
I want to limit the length of a hex number to 4 OR 8 digits. For example:
0x0001 is valid (4 hex digits)
0x0001001F is valid (8 hex digits)
0x00012 is invalid (5 hex digits)
I can validate a hex number with a length of n (where n is 4 or 8) by using
Regex.IsMatch("0x0001", "^0x[0-9a-fA-F]{n}$");
How can I express this OR, that is, either 4 or 8 digits rather than anything from 4 to 8?
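"Or" is represented by | in regex. Keeping the ^ and $ anchors from your pattern, your regex would therefore look like
^0x([0-9a-fA-F]{4}|[0-9a-fA-F]{8})$
Another way would be to make the last four digits optional using ?. This would lead to
^0x[0-9a-fA-F]{4}([0-9a-fA-F]{4})?$
Without the anchors, Regex.IsMatch would also accept 0x00012, because the pattern matches its first six characters.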
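The "4 or 8" rule can be checked quickly against the three examples from the question. The sketch below uses Python rather than C#, with re.fullmatch standing in for the ^ and $ anchors; the name is_valid_hex is hypothetical, and the pattern is a third equivalent spelling (one or two groups of four hex digits).

```python
import re

# One or two groups of exactly four hex digits after the 0x prefix.
HEX_4_OR_8 = re.compile(r"0x(?:[0-9a-fA-F]{4}){1,2}")


def is_valid_hex(text: str) -> bool:
    """True only if the whole string is 0x followed by 4 or 8 hex digits."""
    return HEX_4_OR_8.fullmatch(text) is not None


for candidate in ["0x0001", "0x0001001F", "0x00012"]:
    print(candidate, is_valid_hex(candidate))
# 0x0001 True
# 0x0001001F True
# 0x00012 False
```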