I have a QString in which I have replaced "=" and "," with " ". Now I would like to write a regular expression that removes every occurrence of a certain string followed immediately by parentheses containing a one- or two-digit number.
For example: "mat(1) = 5, mat(2) = 4, mat(3) = 8" would become "5 4 8".
So this is what I have so far:
text = text.replace("=", " ");
text = text.replace(",", " ");
text = text.remove( QRegExp( "mat\([0-9]{1,2}\)" ) );
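The regular expression is not correct. How can I fix it to do what I want? Thanks!
In a C++ string literal the compiler treats a single backslash as the start of an escape sequence, so it never reaches the regex engine. You need to escape your backslashes by doubling them: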
text = text.remove( QRegExp( "mat\\([0-9]{1,2}\\)" ) );
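The same double-escaping issue appears in any language whose ordinary string literals treat the backslash as an escape character. As a rough illustration in Python (not Qt; the variable names are invented for this sketch), the doubled-backslash literal and a raw string produce the identical pattern, and removing the matches and then collapsing whitespace gives the result the question asks for:

```python
import re

text = "mat(1) = 5, mat(2) = 4, mat(3) = 8"

# Same two replacements as in the question.
text = text.replace("=", " ").replace(",", " ")

# The regex engine must receive: mat\([0-9]{1,2}\)
escaped = "mat\\([0-9]{1,2}\\)"  # ordinary literal, backslashes doubled
raw = r"mat\([0-9]{1,2}\)"       # raw literal, no doubling needed
same_pattern = (escaped == raw)

removed = re.sub(escaped, "", text)

# The replacements leave runs of spaces behind; collapse them.
result = " ".join(removed.split())
print(result)  # 5 4 8
```

In Qt, QString::simplified() performs a comparable whitespace clean-up on the QString after the remove() call.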
I have several strings with unbalanced parentheses. I managed to remove an opening parenthesis that has no closing one, but I cannot manage to remove a closing parenthesis that has no opening one. Strings with matching parentheses should be left alone.
string1 = "This (is solved"
string2 = "This is (fine)"
string3 = "This is the problem)"
This is what I used to handle the first problem case (an opening parenthesis with no closing one):
str_remove(data, "[(](?!.*[)])")
But I cannot seem to turn it around. The following matches closing parentheses in general, not only the one without an opening parenthesis:
"(?!.*[(])[)]"
Any ideas are appreciated!
If you do not need to handle nested paired (balanced) parentheses, you can use
gsub("(\\([^()]*\\))|[()]", "\\1", string)
See the regex demo. Details:
(\([^()]*\)) - Group 1 (\1 refers to this group value): (, then zero or more chars other than ( and ), and then a ) char
| - or
[()] - a ( or ) char.
See the R demo:
x <- c("This (is solved", "This is (fine)", "This is the problem)")
gsub("(\\([^()]*\\))|[()]", "\\1", x)
# => [1] "This is solved" "This is (fine)" "This is the problem"
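For comparison, the non-nested pattern carries over to Python's re module unchanged. This is only a sketch (the function name drop_unpaired_flat is made up here), and it relies on Python 3.5 or later, where a group that did not participate in the match is substituted as an empty string:

```python
import re

def drop_unpaired_flat(s):
    # Group 1 captures a complete "( ... )" pair with no parentheses inside;
    # the second alternative matches any leftover single parenthesis.
    # Replacing with \1 keeps the pairs and deletes the leftovers.
    return re.sub(r"(\([^()]*\))|[()]", r"\1", s)

strings = ["This (is solved", "This is (fine)", "This is the problem)"]
cleaned = [drop_unpaired_flat(s) for s in strings]
print(cleaned)
# ['This is solved', 'This is (fine)', 'This is the problem']
```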
If the parentheses can be nested, you can use
gsub("(\\((?:[^()]++|(?1))*\\))|[()]", "\\1", string, perl=TRUE)
See this regex demo. Details:
(\((?:[^()]++|(?1))*\)) - Group 1:
\( - a ( char
(?:[^()]++|(?1))* - zero or more repetitions of either one or more chars other than ( and ) (matched possessively), or the whole Group 1 pattern, recursed
\) - a ) char
|[()] - or a ( / ) char.
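Python's built-in re module has no (?1) recursion, so the nested case cannot be ported as a single pattern there (the third-party regex package does support it). One alternative, which swaps the recursive regex for an explicit stack, is sketched below; the function name drop_unpaired is hypothetical:

```python
def drop_unpaired(s):
    """Remove parentheses that have no partner, keeping balanced (nested) pairs."""
    keep = [True] * len(s)
    open_positions = []  # indices of "(" still waiting for a ")"
    for i, ch in enumerate(s):
        if ch == "(":
            open_positions.append(i)
        elif ch == ")":
            if open_positions:
                open_positions.pop()  # pairs with the most recent "("
            else:
                keep[i] = False       # ")" with nothing to close
    for i in open_positions:
        keep[i] = False               # "(" that was never closed
    return "".join(ch for ch, k in zip(s, keep) if k)

print(drop_unpaired("a (b (c) d"))  # a b (c) d
```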
While using the below print command:
print(k,':',dict[k])
I get the output shown below, but I want to remove the space between the key and the colon. How can I do it?
Current Output:
Sam : 40
Required Output:
Sam: 40
You could try printing a single string built by concatenation (convert the value with str() if it is not already a string):
print(k + ': ' + str(dict[k]))
Python's print() function has a sep parameter that defaults to a single space. Each comma-separated argument you pass to it is therefore separated from the next by a space when printed.
I think what you are looking for is
print(k, ": ", dict[k], sep='')
# Sam: 40
Specifying the sep parameter solves your issue.
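The two suggestions above, plus an f-string, can be compared side by side. This is a small sketch using a made-up dictionary named ages (chosen to avoid shadowing the built-in dict) and a hypothetical helper format_entry:

```python
ages = {"Sam": 40}  # hypothetical data matching the question's output

def format_entry(key, value):
    # An f-string converts the value with str() automatically.
    return f"{key}: {value}"

for k in ages:
    print(k, ": ", ages[k], sep="")  # sep="" suppresses the automatic spaces
    print(k + ": " + str(ages[k]))   # concatenation needs an explicit str()
    print(format_entry(k, ages[k]))  # f-string
```

All three lines print Sam: 40.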
I am trying to parse a string with regex to pull out information between a colon and the last newline prior to the next colon. How can I do this?
string <- "Name: Al's\nPlace\nCountry:\nState\n/ Province: RI\n"
stringr::str_extract_all(string, "(?<=:)(.*)(?:\\n)")
but I get:
[[1]]
[1] " Al's\n" " \n" " RI\n"
when I want:
[[1]]
[1] " Al's\nPlace\n" " \n" " RI\n"
I'm not sure if this is what you're after as your wanted output looks a bit different.
:((?:.*\\n?)+?)(?=.*:|$)
: match a colon
((?:.*\n?)+?) match and capture lazily any lines (to optional \n)
(?=.*:|$) until a line containing a colon, or the end of the string, lies ahead
See this demo at regex101
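The pattern also runs unchanged under Python's re module, which gives a quick way to experiment with it outside R. The sketch below assumes the same input string as the question; note that the second value differs from the output the asker listed, as the answer already warns:

```python
import re

string = "Name: Al's\nPlace\nCountry:\nState\n/ Province: RI\n"

# Lazily capture whole lines after a colon until the next line that
# contains a colon (or the end of the string) is directly ahead.
pattern = r":((?:.*\n?)+?)(?=.*:|$)"

# With a single capture group, findall returns the group contents.
values = re.findall(pattern, string)
print(values)
# [" Al's\nPlace\n", '\nState\n', ' RI\n']
```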
I have a table with many strings that contain some weird characters that I'd like to replace with the "original" ones. ä became Ã¤, ö became Ã¶, so I replace each Ã¶ with an ö in the text. It works; however, ß became Ã<U+009F> and I am unable to replace it...
# Works just fine:
gsub('Ã¶', 'REPLACED', "Testing string Ã¶")
# this does not work
gsub("Ã<U+009F>", "REPLACED", "Testing string Ã<U+009F> ")
# this does not work as well...
gsub("â<U+0080><U+0093>", "REPLACED", "Testing string â<U+0080><U+0093> ")
How do I tell R to replace these parts with some letter I want to insert?
Because the pattern contains a regex metacharacter (+, which means "one or more"), either escape it (as @boski mentioned in their solution) or use fixed = TRUE so that the pattern is matched literally:
sub("Ã<U+009F>", "REPLACED", "Testing string Ã<U+009F> ", fixed = TRUE)
#[1] "Testing string REPLACED "
You have to escape the + symbol, as it is a regex metacharacter (a quantifier).
> gsub("Ã<U\\+009F>", "REPLACED", "Testing string Ã<U+009F> ")
[1] "Testing string REPLACED "
> gsub("â<U\\+0080><U\\+0093>", "REPLACED", "Testing string â<U+0080><U+0093> ")
[1] "Testing string REPLACED "
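The difference between the unescaped pattern, the escaped pattern, and a fixed-string replacement is not specific to R. Here is a sketch of the same three cases in Python, assuming (as the answers do) that the text literally contains the characters <U+009F>:

```python
import re

text = "Testing string Ã<U+009F> "
needle = "Ã<U+009F>"

# Unescaped, "U+" means "one or more U", so the literal "+" never matches
# and the text comes back unchanged.
unescaped = re.sub(needle, "REPLACED", text)

# re.escape() backslash-escapes metacharacters such as "+".
escaped = re.sub(re.escape(needle), "REPLACED", text)

# str.replace() does no regex interpretation (comparable to fixed = TRUE).
literal = text.replace(needle, "REPLACED")

print(repr(unescaped))
print(repr(escaped))
print(repr(literal))
```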
XQuery adds a space and I don't understand why. I have the following simple query:
declare option saxon:output "method=text";
for $i in 1 to 10
return concat(".", $i, "&#9;", 100, "&#10;", ".")
I ran it with Saxon (SaxonEE9-5-1-8J and SaxonHE9-5-1-8J):
java net.sf.saxon.Query -q:query.xq -o:result.txt
The result is the following:
.1 100
. .2 100
. .3 100
. .4 100
. .5 100
. .6 100
. .7 100
. .8 100
. .9 100
. .10 100
.
My question is about the extra space between the dots. The first line is OK, but the following lines (2 to 10) have that space and I don't understand why. What looks like a space between the numbers is in fact a tab inserted by the character reference.
Could you enlighten me about that behavior?
PS: I have added saxon as a tag for the question even if the question is not specific to Saxon.
I think your query returns a sequence of string values which are then by default concatenated with a space (see http://www.w3.org/TR/xslt-xquery-serialization/#sequence-normalization where it says "For each subsequence of adjacent strings in S2, copy a single string to the new sequence equal to the values of the strings in the subsequence concatenated in order, each separated by a single space"). If you don't want that then you can use
string-join(for $i in 1 to 10
return concat(".", $i, "&#9;", 100, "&#10;", "."), '')
The space between the dots is a separator introduced between the items of the sequence that you are constructing. It would seem that Saxon's text serializer inserts that space character when it writes the output, so that the individual items can be told apart.
Considering your code:
declare option saxon:output "method=text";
for $i in 1 to 10
return
concat(".", $i, "&#9;", 100, "&#10;", ".")
The result of the for $i in 1 to 10 return expression is a sequence of 10 xs:string items. From your output you can see that the space is inserted between the results of the individual evaluations of concat(".", $i, "&#9;", 100, "&#10;", ".").
If you want to check that, you can rewrite your query as:
for $i in 1 to 10
return
<x>{concat(".", $i, "&#9;", 100, "&#10;", ".")}</x>
And you will see your 10 distinct items with no spaces between them.
If you are trying to create a single text string, as you are already controlling the line-breaks, then you could also join all of the 10 xs:string items together yourself, which would have the effect of eliminating the spaces you are seeing between the sequence items. For example:
declare option saxon:output "method=text";
string-join(
for $i in 1 to 10
return
(".", string($i), "&#9;", "100", "&#10;", ".")
, "")
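The effect of the separator can be modelled outside XQuery. The following is a plain-Python sketch, not Saxon's implementation: it builds ten strings shaped like the ones the query's concat() call produces, then joins them once with a single space (mimicking the default treatment of adjacent strings) and once with an empty string (mimicking string-join(..., "")):

```python
# Ten items shaped like the strings returned by the query's concat() call:
# a dot, the counter, a tab, 100, a newline, and a trailing dot.
items = [f".{i}\t100\n." for i in range(1, 11)]

# Adjacent strings separated by one space, as in the observed output.
default_output = " ".join(items)

# Explicit join with an empty separator: no extra spaces appear.
joined_output = "".join(items)

print(default_output)
print(joined_output)
```

In the first output every line after the first starts with ". ." as in the question; in the second the two dots are adjacent.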