How to replace all escape sequences with blank in Robot Framework

I have tried multiple things to convert my variable containing escape sequence characters into a blank string. How do I replace an escape sequence character with a blank?
${stg}    Set Variable    \r\n
Replace String    ${stg}    \r\n    ${EMPTY}
Log    ${stg}
Should Not Be Equal    ${stg}    \r\n
In line 4, ${stg} == '\r\n'. How do I make this blank?

You were very close; the docs for Replace String give you the answer:
A modified version of the string is returned and the original
string is not altered.
Examples:
| ${str} = | Replace String | Hello, world! | world | tellus |
| Should Be Equal | ${str} | Hello, tellus! | | |
In your case, assign the return value of line #2 back to ${stg}:
${stg}    Replace String    ${stg}    \r\n    ${EMPTY}
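Robot Framework's String library is written in Python, where strings are immutable, which is why the keyword hands back a new value instead of changing ${stg} in place. A minimal Python sketch of the same behaviour (the variable names are illustrative and not taken from the library source):

```python
# Python strings are immutable: replace() returns a new string
# and leaves the original untouched.
stg = "\r\n"

stg.replace("\r\n", "")        # result discarded, so stg still holds "\r\n"
unchanged = stg

stg = stg.replace("\r\n", "")  # result assigned back, so stg is now empty
print(repr(unchanged), repr(stg))
```

The first call mirrors the original test case, where the keyword's return value was thrown away; the second mirrors the corrected line.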

Related

How can I remove certain part of row names in data frame

I have a data set with the following format:
ID | Value
-------------------------- | -------------------------------
AAA1|404744 | 1.7554
ANKHD1-EIF4EBP3|404734 | 0.5174
HLA-B|3106 | 11.7659
HLA-A|3105 | 18.0851
What I want is removing certain part of the row names like this:
ID | Value
--------------------- | -------------------------------
AAA1 | 1.7554
ANKHD1-EIF4EBP3 | 0.5174
HLA-B | 11.7659
HLA-A | 18.0851
Thanks a lot!
We can do this with sub. Match the | followed by any characters (.*) and replace the match with an empty string (""). Because | is a regex metacharacter meaning "or", either escape it (\\|) or place it in brackets ([|]) to get the literal character.
df$ID <- sub("[|].*", "", df$ID)
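For comparison, the two ways of matching a literal pipe behave the same in Python's re module. This is only a sketch outside the R answer above; the list ids stands in for the ID column shown in the question:

```python
import re

# Sample IDs copied from the question's table.
ids = ["AAA1|404744", "ANKHD1-EIF4EBP3|404734", "HLA-B|3106", "HLA-A|3105"]

# "[|]" (character class) and r"\|" (escape) both match a literal pipe;
# ".*" then consumes everything after it.
with_class = [re.sub(r"[|].*", "", s) for s in ids]
with_escape = [re.sub(r"\|.*", "", s) for s in ids]

print(with_class)
```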

StartTag: invalid element name Error: 1: StartTag: invalid element name

I have an XML file which I am trying to parse using xmlParse in R. I have a number of XML files very similar to the one below that parse without any issues; however, when running the exact same process on one of my files, I get the error message below.
a = "productlist1374.xml"
b = xmlParse(a)
StartTag: invalid element name
Error: 1: StartTag: invalid element name
Only certain characters are permitted in XML names by the W3C XML BNF for component names:
Name ::= NameStartChar (NameChar)*
NameStartChar ::= ":" | [A-Z] | "_" | [a-z] | [#xC0-#xD6] | [#xD8-#xF6] |
[#xF8-#x2FF] | [#x370-#x37D] | [#x37F-#x1FFF] |
[#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] |
[#x3001-#xD7FF] | [#xF900-#xFDCF] | [#xFDF0-#xFFFD] |
[#x10000-#xEFFFF]
NameChar ::= NameStartChar | "-" | "." | [0-9] | #xB7 | [#x0300-#x036F] |
[#x203F-#x2040]
You've not posted your XML, but clearly one or more of your start tags uses a character or characters that are not allowed.
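One way to track down the offending tag is to test candidate element names against the grammar above. The following Python sketch transcribes the NameStartChar and NameChar ranges into a regular expression; the helper name is_valid_xml_name and the sample names are hypothetical, not taken from the asker's file:

```python
import re

# Character ranges transcribed from the NameStartChar production above.
NAME_START = (
    r":A-Z_a-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF\u0370-\u037D"
    r"\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF"
    r"\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD\U00010000-\U000EFFFF"
)
# NameChar adds "-", ".", digits, #xB7 and two combining/connector ranges.
NAME_CHAR = NAME_START + r"\-.0-9\u00B7\u0300-\u036F\u203F-\u2040"

XML_NAME = re.compile(f"[{NAME_START}][{NAME_CHAR}]*")


def is_valid_xml_name(name):
    """Return True if name matches the Name production (hypothetical helper)."""
    return XML_NAME.fullmatch(name) is not None


for candidate in ["product", "product-list.v2", "1product", "product list"]:
    print(candidate, is_valid_xml_name(candidate))
```

Names that start with a digit or contain a space are rejected, which are two common ways to end up with an invalid start tag.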

gsub with "|" character in R

I have a data frame with strings under a variable with the | character. What I want is to remove anything downstream of the | character.
For example, considering the string
heat-shock protein hsp70, putative | location=Ld28_v01s1:1091329-1093293(-) | length=654 | sequence_SO=chromosome | SO=protein_coding
I wish to have only:
heat-shock protein hsp70, putative
Do I need any escape character for the | character?
If I do:
a <- c("foo_5", "bar_7")
gsub("*_.", "", a)
I get:
[1] "foo" "bar"
i.e. I am removing anything downstream of the _ character.
However, If I repeat the same task with a | instead of the _:
b <- c("foo|5", "bar|7")
gsub("*|.", "", b)
I get:
[1] "" ""
You have to escape | by writing \\|. Try this:
> gsub("\\|.*$", "", string)
[1] "heat-shock protein hsp70, putative "
where string is
string <- "heat-shock protein hsp70, putative | location=Ld28_v01s1:1091329-1093293(-) | length=654 | sequence_SO=chromosome | SO=protein_coding"
This alternative also removes the space at the end of the output:
gsub("\\s+\\|.*$", "", string)
[1] "heat-shock protein hsp70, putative"
This may be a better job for strsplit than for gsub.
And yes, the pipe does need to be escaped.
string <- "heat-shock protein hsp70, putative | location=Ld28_v01s1:1091329-1093293(-) | length=654 | sequence_SO=chromosome | SO=protein_coding"
strsplit(string, ' \\| ')[[1]][1]
That outputs
"heat-shock protein hsp70, putative"
Note that I'm assuming you only want the text from before the first pipe, and that you want to drop the space that separates the pipe from the piece of the string you care about.
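The same distinction shows up in Python, sketched here for comparison only: plain string splitting treats the pipe as an ordinary character, while a regular expression needs it escaped. The variable names first_field and via_regex are illustrative:

```python
import re

string = (
    "heat-shock protein hsp70, putative | location=Ld28_v01s1:1091329-1093293(-) "
    "| length=654 | sequence_SO=chromosome | SO=protein_coding"
)

# Plain string methods need no escaping: keep the text before the first pipe
# and strip the space that preceded it.
first_field = string.split("|", 1)[0].rstrip()

# In a regex the pipe must be escaped, otherwise it means "or".
via_regex = re.sub(r"\s*\|.*$", "", string)

print(first_field)
```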

How to split a String in Robot Framework with delimiter as "|"

I want to split a string using robot framework with delimiter as |.
Code:
${string} = 'Age:2|UNACCEPTED'
${str} = Split String ${string} '\|'
Output:
Expected: u'Age:2', u'UNACCEPTED'
Actual: u'Age:2|UNACCEPTED'
Could you please assist on same.
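There is no need to escape the pipe in Robot Framework's space-separated format, and no need for quotes either. Quotes are not string delimiters in Robot Framework, so '\|' passes a separator that includes the quote characters and never occurs in the string:
${string}=    Set Variable    Age:2|UNACCEPTED
${str}=    String.Split String    ${string}    |
Log    ${str}
Log    ${str}[0]
Log    ${str}[1]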
Output:
['Age:2', 'UNACCEPTED'] # Output of ${str}
Age:2 # Output of ${str}[0]
UNACCEPTED # Output of ${str}[1]
A backslash is only needed when you want to split on an escape sequence such as a newline:
${string}=    Set Variable    Age:2\nUNACCEPTED\nanother line
${str}=    String.Split String    ${string}    \n
Output:
INFO :
${string} = Age:2
UNACCEPTED
another line
INFO : ${str} = [u'Age:2', u'UNACCEPTED', u'another line']

How to replace text sequences ending in a fixed pattern within a long text string in R?

I have a column within a data frame containing long text sequences (often in the thousands of characters) of the format:
abab(VR) | ddee(NR) | def(NR) | fff(VR) | oqq | pqq | ppf(VR)
i.e. a string, a suffix in brackets, then a delimiter
I'm trying to work out the syntax in R to delete the items that end in (VR), including the trailing pipe if present, so that I'm left with:
ddee(NR) | def(NR) | oqq | pqq
I cannot work out the regular expression (or gsub) that will remove these entries and would like to request if anyone could help me please.
If you want to use gsub, you can remove the pattern in two stages:
gsub(" \\| $", "", gsub("\\w+\\(VR\\)( \\| )?", "", s))
# firstly remove all words ending with (VR) and optional | following the pattern and
# then remove the possible | at the end of the string
# [1] "ddee(NR) | def(NR) | oqq | pqq"
The regular expression \\w+\\(VR\\) matches words ending with (VR); the parentheses are escaped with \\.
( \\| )? matches the optional delimiter |, so the pattern matches both in the middle and at the end of the string.
A delimiter left dangling at the end of the string is removed by the second (outer) gsub.
Here is a method using strsplit and paste with the collapse argument:
paste(sapply(strsplit(temp, split=" +\\| +"),
             function(i) { i[setdiff(seq_along(i), grep("\\(VR\\)$", i))] }),
      collapse=" | ")
[1] "ddee(NR) | def(NR) | oqq | pqq"
We split on the pipe and spaces, then feed the resulting list to sapply which uses the grep function to drop any elements of the vector that end with "(VR)". Finally, the result is pasted together.
I added a subsetting method with setdiff so that vectors without any "(VR)" will return without any modification.
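The split, filter and rejoin idea carries over directly to other languages. A minimal Python sketch, assuming a single input string; the function name drop_vr and its parameters are hypothetical:

```python
def drop_vr(text, sep=" | ", suffix="(VR)"):
    """Drop pipe-delimited items ending in suffix and rejoin the rest."""
    # Split on the bare pipe and trim the surrounding spaces of each item.
    items = [item.strip() for item in text.split("|")]
    # Keep only the items that do not end with the unwanted suffix.
    kept = [item for item in items if not item.endswith(suffix)]
    return sep.join(kept)


s = "abab(VR) | ddee(NR) | def(NR) | fff(VR) | oqq | pqq | ppf(VR)"
print(drop_vr(s))
```

Strings without any (VR) items come back unchanged apart from normalised spacing around the delimiter, which matches the intent of the setdiff guard above.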
