Extract XML child attribute based on another child attribute - r

I have the following XML structure. I am trying to extract the StartDate and EndDate child elements of the relationship period, that is, only where rr:PeriodType is RELATIONSHIP_PERIOD.
However, the nodes for the "relationship" and "accounting" periods have exactly the same name, and I am not sure how to proceed.
<rr:RelationshipPeriods>
<rr:RelationshipPeriod>
<rr:StartDate>2018-01-01T00:00:00.000Z</rr:StartDate>
<rr:EndDate>2018-12-31T00:00:00.000Z</rr:EndDate>
<rr:PeriodType>ACCOUNTING_PERIOD</rr:PeriodType>
</rr:RelationshipPeriod>
<rr:RelationshipPeriod>
<rr:StartDate>2019-01-02T00:00:00.000Z</rr:StartDate>
<rr:PeriodType>RELATIONSHIP_PERIOD</rr:PeriodType>
</rr:RelationshipPeriod>
</rr:RelationshipPeriods>
I tried using this code:
ldply(xpathApply(xmlData, '//rr:RelationshipPeriod/rr:StartDate', getChildrenStrings), rbind)
But it doesn't work well, because I can't tell whether it is extracting the accounting period or the relationship period.
Any help would be greatly appreciated!

For rr:StartDate, use this XPath:
//rr:RelationshipPeriod[rr:PeriodType='RELATIONSHIP_PERIOD']/rr:StartDate
But it is probably better to first find the correct rr:RelationshipPeriod with this XPath:
//rr:RelationshipPeriod[rr:PeriodType='RELATIONSHIP_PERIOD']
See this answer on how to reuse the result of an XPath query.
Then select rr:StartDate and rr:EndDate relative to that node. Don't put // in front of them, because // starts the search from the document root again.
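The two-step approach above (find the matching period first, then read its children with relative paths) can be sketched as follows. This is only an illustration in Python rather than R, using the standard library's xml.etree.ElementTree. The namespace URI is not given in the question, so http://example.com/rr is a made-up placeholder, and relationship_period_dates is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI: the question does not show the real one.
NS = {"rr": "http://example.com/rr"}

XML = """<rr:RelationshipPeriods xmlns:rr="http://example.com/rr">
  <rr:RelationshipPeriod>
    <rr:StartDate>2018-01-01T00:00:00.000Z</rr:StartDate>
    <rr:EndDate>2018-12-31T00:00:00.000Z</rr:EndDate>
    <rr:PeriodType>ACCOUNTING_PERIOD</rr:PeriodType>
  </rr:RelationshipPeriod>
  <rr:RelationshipPeriod>
    <rr:StartDate>2019-01-02T00:00:00.000Z</rr:StartDate>
    <rr:PeriodType>RELATIONSHIP_PERIOD</rr:PeriodType>
  </rr:RelationshipPeriod>
</rr:RelationshipPeriods>"""


def relationship_period_dates(xml_text):
    """Return (StartDate, EndDate) of the RELATIONSHIP_PERIOD entry, or None."""
    root = ET.fromstring(xml_text)
    # Step 1: select the period whose PeriodType child has the wanted text.
    period = root.find(
        "rr:RelationshipPeriod[rr:PeriodType='RELATIONSHIP_PERIOD']", NS
    )
    if period is None:
        return None
    # Step 2: read the children with relative paths (no leading //).
    start = period.findtext("rr:StartDate", namespaces=NS)
    end = period.findtext("rr:EndDate", namespaces=NS)  # None when absent
    return start, end
```

In the sample data the relationship period has no rr:EndDate, so the second value comes back as None instead of being silently taken from the accounting period.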

Related

R Using Regex to find a word after a pattern

I'm grabbing the following page and storing it in R with the following code:
gQuery <- getURL("https://www.google.com/#q=mcdimalds")
Within this, there's the following snippet of code
Showing results for</span> <a class="spell" href="/search?rlz=1C1CHZL_enUS743US743&q=mcdonalds&spell=1&sa=X&ved=0ahUKEwj9koqPx_TTAhUKLSYKHRWfDlYQvwUIIygA"><b><i>mcdonalds</i></b></a>
Everything other than "showing results for" and the italics tags encasing the desired name for extraction are subject to change from query to query.
What I want to do is extract the mcdonalds out of this string using regex that occurs here: <b><i>mcdonalds</i> aka the second instance of mcdonalds. However, I'm not too sure how to write the regex to do so.
Any help accomplishing this would be greatly appreciated. As always, please let me know if any additional information should be added to clarify the question.
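One possible approach is to anchor the pattern on the fixed text "Showing results for" and capture whatever sits inside the following <b><i>...</i></b> pair. The sketch below shows the idea in Python rather than R. The href in the sample string is shortened for readability, and suggested_query is a hypothetical helper name. Matching HTML with a regex is fragile, so treat this as an illustration and not a robust parser.

```python
import re

# Sample snippet with a shortened href (assumption: only the fixed prefix
# and the <b><i> ... </i></b> wrapper are stable between queries).
SNIPPET = (
    'Showing results for</span> <a class="spell" '
    'href="/search?q=mcdonalds&spell=1"><b><i>mcdonalds</i></b></a>'
)

# Non-greedy skip from the fixed prefix to the first <b><i>, then capture.
PATTERN = re.compile(r"Showing results for</span>.*?<b><i>(.*?)</i></b>", re.DOTALL)


def suggested_query(html):
    """Return the corrected search term, or None if the marker is missing."""
    match = PATTERN.search(html)
    return match.group(1) if match else None
```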

Using Marklogic Xquery data population

I have data in the following form.
<Status>Active Leave Terminated</Status>
<date>05/06/2014 09/10/2014 01/10/2015</date>
I want to get the data in the following form.
<status>Active</status>
<date>05/06/2014</date>
<status>Leave</status>
<date>09/10/2014</date>
<status>Terminated</status>
<date>01/10/2015</date>
Please help me with the query to retrieve the data as specified above.
Well, you have a string and want to split it at the whitespace. That's what tokenize() is for, and \s matches a whitespace character. To get the corresponding date, you can get the current position in the for loop using at. Together it looks something like this (note that I assume that the input data is the current context item):
let $dates := tokenize(date, "\s+")
for $status at $pos in tokenize(Status, "\s+")
return (
<status>{$status}</status>,
<date>{$dates[$pos]}</date>
)
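The same split-and-pair-by-position idea, sketched in Python for readers who want to try it outside MarkLogic. The function name pair_status_dates is hypothetical. One difference from the XQuery above: zip stops at the shorter list, whereas the XQuery emits an empty date element when a date is missing.

```python
def pair_status_dates(status_text, date_text):
    """Split both strings on whitespace and pair the items by position."""
    statuses = status_text.split()  # split() with no argument splits on runs of whitespace
    dates = date_text.split()
    return list(zip(statuses, dates))


def to_elements(pairs):
    """Render the pairs as alternating <status>/<date> lines."""
    lines = []
    for status, date in pairs:
        lines.append(f"<status>{status}</status>")
        lines.append(f"<date>{date}</date>")
    return "\n".join(lines)
```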
You did not indicate whether your data is on the file system or already loaded into MarkLogic. It's also not clear if this is something you need to do once on a small set of data or on an on-going basis with a lot of data.
If it's on the file system, you can transform it as it is being loaded. For instance, MarkLogic Content Pump can apply a transformation during load.
If you have already loaded the content and you want to transform it in place, you can use Corb2.
If you have a small amount of data, then you can just loop across it using Query Console.
Regardless of how you apply the transformation code, dirkk's answer shows how you need to change it. If you are updating content already in your database, you'll xdmp:node-delete() the original Status and date elements and xdmp:node-insert-child() the new ones.

Selenium IDE - Select checkbox on table row

I'm using Selenium IDE and I have a table with many rows and columns. Each row has its own checkbox for selecting that row.
I was using this command to search for a specific row:
css=tr:contains('US Tester4') input[type="checkbox"]
But the problem is that this column also contains similar values such as "US Tester41", "US Tester42" ... and when I use this command, it selects the wrong row.
I thought that replacing "contains" with something like "equals" or "exactly" would work, but it didn't (I don't know the syntax).
Any ideas?
See the screenshot:
http://oi41.tinypic.com/2ake9hw.jpg
I'm not familiar with Selenium IDE, but with Selenium WebDriver I would use an XPath. So I guess something like this will work for you:
xpath=//tr[td[3][text()='US Tester4']]//input[@type='checkbox']
This worked for me:
//tr[td[.='US Tester4']]//input[@type='checkbox']
against:
<table>
<tr><td>US Tester</td><input type="checkbox"/></tr>
<tr><td>US Tester4</td><input type="checkbox"/></tr>
<tr><td>US Tester41</td><input type="checkbox"/></tr>
<tr><td>US Tester412</td><input type="checkbox"/></tr>
</table>
It matched the checkbox in the second row.
This worked for me:
xpath=(//input[@name='uid'])[2]
The 2 is the position of the element among the matches.
I'm not very familiar with the IDE, but I have used WebDriver before. If possible, I would use this XPath:
xpath = "//td[.= 'US Tester4']/preceding-sibling::td//input[@type = 'checkbox']"
This should locate only one element on screen. The preceding-sibling and following-sibling axes are very helpful when you don't have a good enough identifier on the exact element you want to find. In your case, the <td> that contains the checkbox has no good identifier, whereas the <td> after it has text that you can match exactly with the '=' operator. You just need preceding-sibling to get from the text <td> to the <td> with the checkbox.
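To see why an exact text comparison avoids the "US Tester41" problem, here is a small sketch using Python's standard xml.etree.ElementTree, which supports a limited XPath subset. It is not Selenium itself. The table markup, the name attributes, and the helper checkbox_for are all made up for illustration, since the real page is only available as a screenshot.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: checkbox cell first, label cell second.
TABLE = """<table>
  <tr><td><input type="checkbox" name="row1"/></td><td>US Tester4</td></tr>
  <tr><td><input type="checkbox" name="row2"/></td><td>US Tester41</td></tr>
  <tr><td><input type="checkbox" name="row3"/></td><td>US Tester42</td></tr>
</table>"""


def checkbox_for(table_xml, label):
    """Return the checkbox in the row that has a cell whose text equals label."""
    root = ET.fromstring(table_xml)
    # [td='...'] compares the cell's full text, so 'US Tester4' does not
    # match 'US Tester41'. The label must not contain a single quote.
    return root.find(f".//tr[td='{label}']//input[@type='checkbox']")
```

With this data, checkbox_for(TABLE, "US Tester4") returns only the row1 checkbox, and a partial label such as "US Tester" returns None.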

This regex is not right

I am trying to use regex generators to create an expression, but I can't seem to get it right.
What I need to do is find the following type of string in a string:
community_n
For example, within the string which may be
community community_1 community_new_1 community_1_new
from that, I just want to extract community_1
I have tried /(community_\\d+)/, but that is clearly not right.
Try adding word boundaries, so
/(\\bcommunity_\\d+\\b)/
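A quick way to check the effect of the word boundaries is sketched below in Python. The question does not say which language is in use, so Python's re module is an assumption, and the doubled backslashes from the question become single backslashes in a raw string. The function names are hypothetical.

```python
import re

TEXT = "community community_1 community_new_1 community_1_new"


def find_exact(text):
    """Match community_<digits> only when it stands alone as a word."""
    return re.findall(r"\bcommunity_\d+\b", text)


def find_loose(text):
    """Same pattern without boundaries, for comparison."""
    return re.findall(r"community_\d+", text)
```

Because an underscore counts as a word character, there is no boundary between the 1 and the _ in community_1_new, so find_exact(TEXT) returns only ['community_1']. Without the boundaries, find_loose(TEXT) also picks up the community_1 prefix of community_1_new and returns two matches.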
Try using the regex (community_\d+).
Though I could be incorrect since I don't know which language you are using.
(For some reason I cannot add comments, I can only answer questions).

Finding First Row in a RDLC Table

I have a table in an RDLC report that is used as a subreport, and the first column of this table is a static string. Does anyone know how I can determine whether a row is the first in the table? I tried using "=First("My String")" but it didn't work.
Looking at the link supplied by ThatBloke in his answer, I found the RowNumber command.
Which means that this worked:
=IIf(RowNumber(Nothing)=1,"myString", "")
Aggregate functions work with a "Scope". The paragraph on scope in this MSDN article might help:
http://msdn.microsoft.com/fr-fr/library/ms252112(VS.80).aspx
From what I understand you may have to define a scope or try =First("MyString", Nothing).
=IIF((RowNumber(Nothing) Mod <>)=0)
Here <> stands for the number of rows you want to display.
