Character in a string formation - r

I use RSelenium to run JavaScript queries.
The query in JavaScript is this:
document.querySelectorAll('ul#test div.mytext')[1].innerText.split('\n').filter(x => x).join('???')
When I try to run it from RSelenium, I use this:
remDr$executeScript('return document.querySelectorAll(\'ul#test div.mytext\')[ 1 ].innerText.split(\'//\n\').filter(x => x).join(\'???\')', args = list("dummy"))
However, I receive an error, and I believe it is due to the \n character.
How can I write it properly?

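You are delimiting the code you want to run with single quotes while the code itself also contains single quotes. Since the expression contains no double quotes, delimit it with double quotes instead. Also write the newline as \\n, so that R passes the two characters \n through to JavaScript rather than a literal line break inside the JavaScript string:
remDr$executeScript("return document.querySelectorAll('ul#test div.mytext')[1].innerText.split('\\n').filter(x => x).join('???')", args = list("dummy"))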
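The two escaping layers can be checked outside the browser. The following is a minimal Python sketch (Python is used only for illustration; R string literals resolve \n and \\n the same way). It shows that a single backslash in the host-language literal becomes a real line break inside the JavaScript source, while a doubled backslash survives as the two characters \ and n. The variable and function names are hypothetical.

```python
# The host language resolves \n before the browser ever sees the script,
# so this JavaScript source contains a raw line break inside its quoted
# string, which JavaScript rejects as a syntax error.
broken_script = "return text.split('\n').join('???')"

# Doubling the backslash keeps the two characters \ and n in the script
# text, so JavaScript itself interprets them as a newline escape.
working_script = "return text.split('\\n').join('???')"


def contains_raw_line_break(script: str) -> bool:
    """Report whether the script text contains an actual newline character."""
    return "\n" in script


print(contains_raw_line_break(broken_script))   # True
print(contains_raw_line_break(working_script))  # False
print(working_script)
```

Printing the script text before sending it is a quick way to see exactly what the browser will receive.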

Related

combining strings to one string in r

I'm trying to combine some strings into one. In the end, this string should be generated:
//*[@id="coll276"]
The inner part of the string is a vector: tag <- 'coll276'
I already used paste() like this:
paste('//*[@id="',tag,'"]', sep = "")
But my result looks like this: //*[@id=\"coll276\"]
I don't know why R is putting \ into my string. How can I fix this problem?
Thanks a lot!
tl;dr: Don't worry about them, they're not really there. They are just something added by print.
Those \ are escape characters that tell R to ignore the special properties of the characters that follow them. Look at the output of your paste function:
paste('//*[@id="',tag,'"]', sep = "")
[1] "//*[@id=\"coll276\"]"
You'll see that the output, since it is a string, is enclosed in double quotes "". Normally, the double quotes inside your string would break the string up into two strings with bare code in the middle:
"//*[@id=" coll276 "]"
To prevent this, R "escapes" the quotes in your string when it displays them. This is just a visual effect. If you write your string to a file, you'll see that those escaping \ aren't actually there:
write(paste('//*[@id="',tag,'"]', sep = ""), 'out.txt')
This is what is in the file:
//*[@id="coll276"]
You can use cat to print the exact value of the string to the console (thanks @LukeC):
cat(paste('//*[@id="',tag,'"]', sep = ""))
//*[@id="coll276"]
Or use single quotes inside the string (if possible):
paste('//*[@id=\'',tag,'\']', sep = "")
[1] "//*[@id='coll276']"
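The difference between a string's value and its quoted display form is not specific to R. As a rough illustration, here is a Python sketch that uses json.dumps to produce a double-quoted display form, playing the role that print plays in the answer above. The shortened string and the variable names are hypothetical.

```python
import json

tag = "coll276"
# A shortened, hypothetical string that contains double quotes.
value = 'id="' + tag + '"'

# json.dumps wraps the value in double quotes, so the embedded quotes
# must be escaped in the display form.
display_form = json.dumps(value)

print(display_form)           # the quoted form contains backslashes
print(value)                  # the value itself does not
print("\\" in display_form)   # True
print("\\" in value)          # False
```

The backslashes belong to the quoted representation only; the length and content of the underlying value are unchanged.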

Test Regex with new line string \n via PHPUnit

I created a PHP library that parses content via regex. One of these regexes is '#\n-{3,}#', which should match --- only when a line break precedes it.
I also have PHPUnit tests for all methods, and the tests for the newline regex always fail. I always test with assertSame().
I tried the following strings as input:
$input = PHP_EOL . '---';
$input = '<br>---';
$input = '
---'; // with break in code
As the expected value I set:
'<hr/>'
However, the assertion always fails: with these inputs the newline is not parsed. The tests pass only when I remove \n from the regex, as in '#-{3,}#'.
The test also passes if the input has some text before the newline, for example:
$input = "test\n---";
But I would also like to test input that starts with a newline, with no text before it.
On the front end the parsing works fine: the regex replaces the three hyphens in my Markdown file whenever a line break precedes them.
How can I pass a string that starts with a newline as input to assertSame() in PHPUnit?
The problem is you are using single quotes in your pattern.
In PHP \n means "new line" in a string only if double quotes are used.
$input = PHP_EOL . '---';
preg_match("#\n-{3,}#", $input); // this will match
See https://3v4l.org/EpEGf
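The same kind of check can be sketched in Python, where raw strings (r"...") leave backslash sequences uninterpreted and ordinary strings resolve them. This is only an illustration in a different language, not a statement about PHPUnit; the pattern mirrors the one in the question, and the function name is hypothetical. Note that in Python the regex engine itself understands \n, so the raw pattern still matches a real newline. What matters here is whether the input literal really contains one.

```python
import re

# Raw pattern: Python's regex engine interprets \n as a newline.
RULE_PATTERN = re.compile(r"\n-{3,}")


def replace_rules(text: str) -> str:
    """Replace a newline followed by three or more hyphens with <hr/>."""
    return RULE_PATTERN.sub("<hr/>", text)


print(replace_rules("\n---"))       # input starts with a real newline
print(replace_rules("test\n---"))   # text before the newline
print(replace_rules(r"\n---"))      # raw literal: backslash + n, no newline
print(replace_rules("<br>---"))     # no newline at all, left unchanged
```

Only the first two inputs contain an actual newline character, so only they are rewritten.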

Regex to get parts of URL

I have URLs such as the following:
vimeo.com/99612902
www.vimeo.com/99612902
http://vimeo.com/99612902
http://www.vimeo.com/99612902
http://vimeo.com/moogaloop.swf?clip_id=81368903
I need to parse the URLs above into two groups, as follows:
Group 1                  Group 2
vimeo.com/               99612902
www.vimeo.com/           99612902
http://vimeo.com/        99612902
http://www.vimeo.com/    99612902
http://vimeo.com/        81368903
I've tried the following regex:
^((http[s]?|ftp):\/)?\/?([^:\/\s]+)(:([^\/]*))?((\/[\w\-]+)*\/)([\w\-\.]+[^#?\s]+)(\?([^#]*))?(#(.*))?
but it yields unwanted and empty groups. Please help me out.
With your input, we can match both parts into Groups 1 and 2 with this:
^(.*/)(.*)
or, for your revised input:
^(.*[/=])([^/=]+$)
In the demo, see the capture groups in the right pane.
In VB.NET, you can do this:
' Requires: Imports System.Text.RegularExpressions
Dim theUrl As String
Dim theNumbers As String
Try
    ' Regex.Match returns a Match object; SubjectString holds the URL to parse
    Dim result As Match = Regex.Match(SubjectString, "^(.*/)(.*)", RegexOptions.Multiline)
    theUrl = result.Groups(1).Value
    theNumbers = result.Groups(2).Value
Catch ex As ArgumentException
    ' Syntax error in the regular expression
End Try
Option 2
If you want to do some very lightweight url validation at the same time, you can use this:
^((?:http://)?(?:www\.)?[^./]+\.\w+/)(.*)
or, with your revised input:
^((?:http://)?(?:www\.)?[^./]+\.\w+[=/])([^/=]+$)
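To see what the simple pattern for the revised input captures, it can be run over the sample URLs from the question. The following Python sketch is one way to do that; the function name is hypothetical, and the pattern is the same expression with the end-of-line anchor moved outside the second group.

```python
import re

# Group 1: everything up to the last "/" or "="; group 2: the remainder.
SPLIT_PATTERN = re.compile(r"^(.*[/=])([^/=]+)$")

SAMPLE_URLS = [
    "vimeo.com/99612902",
    "www.vimeo.com/99612902",
    "http://vimeo.com/99612902",
    "http://www.vimeo.com/99612902",
    "http://vimeo.com/moogaloop.swf?clip_id=81368903",
]


def split_url(url: str):
    """Return (prefix, trailing part), or None if the URL has no "/" or "="."""
    match = SPLIT_PATTERN.match(url)
    return (match.group(1), match.group(2)) if match else None


for url in SAMPLE_URLS:
    print(split_url(url))
```

For the last URL, group 1 comes out as http://vimeo.com/moogaloop.swf?clip_id= rather than http://vimeo.com/, because the pattern keeps everything before the final separator.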
If you don't need to validate the URL, you can also try this pattern and read the matched groups at indexes 1 and 2:
(.*?[^\/]*\/)(\d+)
String literal for use in C# programs:
@"(.*?[^\/]*\/)(\d+)"
You could simply use the regex below:
^(.*\/)(.*)$
Everything from the start of the string up to the last / is captured by group 1. The remaining characters are captured by group 2.
OR
^((?:https?:\/\/)?(?:www\.)?(?:[^.]*)\.\w+\/)(.*)$

xQuery substring problem

I have the full path of a file as a string, like:
"/db/Liebherr/Content_Repository/Techpubs/Topics/HyraulicPowerDistribution/Released/TRN_282C_HYD_MOD_1_Drive_Shaft_Rev000.xml"
However, I need only the folder path, that is, the string above without the part after the last slash:
"/db/Liebherr/Content_Repository/Techpubs/Topics/HyraulicPowerDistribution/Released/"
But it seems that the substring() function in XQuery only has the forms substring(string,start,len) and substring(string,start). I am trying to figure out a way to locate the last occurrence of the slash, but have had no luck.
Could the experts help? Thanks!
Try out the tokenize() function (for splitting a string into its component parts) and then re-assembling it, using everything but the last part.
let $full-path := "/db/Liebherr/Content_Repository/Techpubs/Topics/HyraulicPowerDistribution/Released/TRN_282C_HYD_MOD_1_Drive_Shaft_Rev000.xml",
$segments := tokenize($full-path,"/")[position() ne last()]
return
concat(string-join($segments,'/'),'/')
For more details on these functions, check out their reference pages:
fn:tokenize()
fn:string-join()
fn:replace can do the job with a regular expression:
replace("/db/Liebherr/Content_Repository/Techpubs/Topics/HyraulicPowerDistribution/Released/TRN_282C_HYD_MOD_1_Drive_Shaft_Rev000.xml",
"[^/]+$",
"")
This can be done even with a single XPath 2.0 (subset of XQuery) expression:
substring($fullPath,
1,
string-length($fullPath) - string-length(tokenize($fullPath, '/')[last()])
)
where $fullPath should be substituted with the actual string, such as:
"/db/Liebherr/Content_Repository/Techpubs/Topics/HyraulicPowerDistribution/Released/TRN_282C_HYD_MOD_1_Drive_Shaft_Rev000.xml"
The following code tokenizes, removes the last token, replaces it with an empty string, and joins back.
string-join(
(
tokenize(
"/db/Liebherr/Content_Repository/Techpubs/Topics/HyraulicPowerDistribution/Released/TRN_282C_HYD_MOD_1_Drive_Shaft_Rev000.xml",
"/"
)[position() ne last()],
""
),
"/"
)
It seems to return the desired result on try.zorba-xquery.com. Does this help?
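For comparison, the two main ideas in the answers above (split on the slash and drop the last segment, or strip the trailing non-slash characters with a regular expression) can be sketched in Python. The function names and the short sample path are hypothetical.

```python
import re


def folder_by_split(full_path: str) -> str:
    """Split on "/", drop the last segment, and re-join with a trailing slash."""
    segments = full_path.split("/")[:-1]
    return "/".join(segments) + "/"


def folder_by_regex(full_path: str) -> str:
    """Remove the run of non-slash characters at the end of the path."""
    return re.sub(r"[^/]+$", "", full_path)


sample = "/db/Topics/Released/Drive_Shaft_Rev000.xml"  # hypothetical short path
print(folder_by_split(sample))
print(folder_by_regex(sample))
```

Both functions agree on paths that contain at least one slash. They differ on a bare file name with no slash, where the split version returns "/" and the regex version returns an empty string.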

Regular expression to convert substring to link

I need a regular expression to convert a string to a link. I wrote something, but it doesn't work in ASP.NET. I couldn't solve it, and I am new to regular expressions. This function converts (bkz: string) to (bkz: show.aspx?td=string)
Dim pattern As String = "<bkz[a-z0-9$-$&-&.-.ö-öı-ış-şç-çğ-ğü-ü\s]+)>"
Dim regex As New Regex(pattern, RegexOptions.IgnoreCase)
str = regex.Replace(str, "<font color=""#CC0000"">$1</font>")
Generic remarks on your code: besides the missing opening parenthesis, you do redundant things: $-$ isn't incorrect, but it can be simplified to just $. The same goes for the accented characters.
Everybody will tell you that font tag is deprecated even in plain HTML: favor span with style attribute.
And from your question and the example in the reply, I think the expression could be something like:
\(bkz: ([a-z0-9$&.öışçğü\s]+)\)
the replace string would look like:
(bkz: <span style=""color: #C00"">$1</span>)
BUT the first $1 must be actually URL encoded.
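The point about URL encoding can be illustrated with a short Python sketch. It assumes, based on the question, that the desired output is a link to show.aspx?td= followed by the captured term; the function names and the exact markup are hypothetical. The captured text is percent-encoded for the query string and HTML-escaped for the visible label.

```python
import re
from html import escape
from urllib.parse import quote

# Same character class as the suggested expression, matched case-insensitively.
BKZ_PATTERN = re.compile(r"\(bkz: ([a-z0-9$&.öışçğü\s]+)\)", re.IGNORECASE)


def link_bkz(text: str) -> str:
    """Turn each "(bkz: term)" into a link whose query value is URL-encoded."""

    def build_link(match: re.Match) -> str:
        term = match.group(1)
        href = "show.aspx?td=" + quote(term)  # percent-encode the query value
        return f'(bkz: <a href="{escape(href)}">{escape(term)}</a>)'

    return BKZ_PATTERN.sub(build_link, text)


print(link_bkz("(bkz: here)"))
print(link_bkz("(bkz: a&b)"))
```

Using a replacement function instead of a plain $1-style template is what makes it possible to encode the href and the label differently.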
Your regexp is in trouble because of a ')' without '('
Would:
<bkz:\s+((?:.(?!>))+?.)>
work better?
The first group would capture what you are after.
Thanks Vonc. Now it doesn't raise an error, but when I assign str to Label.Text, I can't see the link either. For example, after I bind str to my label, view-source should show:
<span id="Label1">(bkz: here)</span>
But now view-source shows:
<span id="Label1">(bkz: here)</span>
