XPath in R - Invalid predicate

I'm struggling with an XPath expression. I want to capture a product name and have tried many variations, only to get:
Invalid predicate 1206
or:
Invalid predicate 1207
or:
character(0)
The structure I'm after is:
<div class="product__info">
:: before
<a href="/our-range/brands/a/acme" itemprop="brand" itemscope="" itemtype="http://schema.org/Brand">
<img itemprop="logo" class="brand" src="https://picture.png" ></a>
<h1 itemprop="name" class="fn">Acme Whizz</h1>
I have tried:
xpath = ".fn"
xpath = ".product__info"
xpath = "//div[@class=product__info]/text()"
(amongst many others.)
Where am I going wrong with this formula?

I'm not sure exactly which element you want to find, so here are XPath expressions for most of the elements in your snippet.
You can select the div with this XPath:
//div[@class="product__info"]
You can find the logo image like this:
//img[@itemprop="logo"]
Or the brand link like so:
//a[@itemprop="brand"]
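
For the product name itself, the h1 carries itemprop="name", so an attribute predicate on that element should be enough: //h1[@itemprop="name"]. Note that attribute tests are written with @ and the value has to be quoted. A minimal sketch for checking the expression outside R, in Python with only the standard library. It assumes a well-formed copy of the snippet (the img is self-closed and the ::before pseudo-element is dropped), and ElementTree only supports a subset of XPath, so the path is written relative to the root:

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the snippet from the question. The <img> is self-closed
# so that the standard-library XML parser accepts it (an assumption made for
# this sketch, not part of the original page).
SNIPPET = """
<div class="product__info">
  <a href="/our-range/brands/a/acme" itemprop="brand">
    <img itemprop="logo" class="brand" src="https://picture.png" />
  </a>
  <h1 itemprop="name" class="fn">Acme Whizz</h1>
</div>
"""

root = ET.fromstring(SNIPPET)

# Attribute predicate: @itemprop with a quoted value.
name_node = root.find(".//h1[@itemprop='name']")
product_name = name_node.text
print(product_name)
```

The same predicate, written as //h1[@itemprop="name"], is what you would pass to an XPath-aware function in R.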

Related

Scrapy: checking if the tag has another tag inside it and scrape both elements

I am trying to scrape an html page that uses this structure:
<div class="article-body">
<div id="firstBodyDiv">
<p class="ng-scope">
This is a dummy text for explanation purposes
</p>
<p class="ng-scope">
This is a <a>dummy</a> text for explanation purposes
</p>
</div>
</div>
As you can see, some of the p elements contain a elements and some don't.
What I have done so far is the following:
economics["article_content"] = response.css("div.article-body div#firstBodyDiv > p:nth-child(n+1)::text").extract()
but when a p element contains an a element, this returns only the text before and after the a element,
while this query returns the text of the a elements:
response.css("div.article-body div#firstBodyDiv p:nth-child(n+1) a::text").extract()
I want a way to check whether a p element contains an a element, so that I can run the second query (the one that scrapes the text inside the a element).
This is what I have tried so far:
for i in response.css("div.article-body div#firstBodyDiv p:nth-child(n+1)"):
    if response.css("div.article-body div#firstBodyDiv p:nth-child(n+1) a") in i:
        # of course this isn't working, and I am getting this error:
        # 'in <string>' requires string as left operand, not SelectorList
        # I will probably need a separate list, list1, and list1.append() the p text
        # before the a element, the a text, and the p text after the a element,
        # then assign that list to economics["article_content"]
Although I am using CSS selectors, you are welcome to use XPath selectors.
You can use the XPath descendant-or-self axis, which returns the text nodes of the element itself and of everything nested inside it.
for i in response.css('div.article-body div#firstBodyDiv > p:nth-child(n+1)'):
    print(''.join(i.xpath('descendant-or-self::text()').extract()))
You can also use scrapy shell in order to test your code with raw HTML like so:
$ scrapy shell
from scrapy.http import HtmlResponse
response = HtmlResponse(url='test', body='''<div class="article-body">
<div id="firstBodyDiv">
<p class="ng-scope">
This is a dummy text for explanation purposes
</p>
<p class="ng-scope">
This is a <a>dummy</a> text for explanation purposes
</p>
</div>
</div>
''', encoding='utf-8')
for i in response.css('div.article-body div#firstBodyDiv > p:nth-child(n+1)'):
    print(''.join(i.xpath('descendant-or-self::text()').extract()))
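
If Scrapy is not at hand, the same idea can be tried with the standard library alone. This is only a sketch, and it swaps in a different tool: ElementTree's itertext() yields the text of an element and all of its descendants in document order, which is roughly what descendant-or-self::text() selects. The whitespace normalisation at the end is an addition of mine, not part of the answer above:

```python
import xml.etree.ElementTree as ET

# The sample markup from the answer above.
HTML = """<div class="article-body">
<div id="firstBodyDiv">
<p class="ng-scope">
This is a dummy text for explanation purposes
</p>
<p class="ng-scope">
This is a <a>dummy</a> text for explanation purposes
</p>
</div>
</div>"""

root = ET.fromstring(HTML)

paragraphs = []
for p in root.findall("./div[@id='firstBodyDiv']/p"):
    # itertext() walks the <p> and every nested element, so the text
    # inside <a> is included in its original position.
    text = "".join(p.itertext())
    # Collapse newlines and repeated spaces into single spaces.
    paragraphs.append(" ".join(text.split()))

for paragraph in paragraphs:
    print(paragraph)
```

Both paragraphs come out as one continuous string, so there is no need to test for the a element separately.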

Error when navigating to another page through a link on the master page

When I click a link on the master page I get the error shown below. How can I fix it?
</a><a href="vacations.aspx">
<div class='<%= vacDetailMenu %>' style="cursor: pointer;">
Vacation History
</div>
This is the code which I used.
This is the issue:
"Index and length must refer to a location within the string.
Parameter name: length"
Can you please share more of your code?
This error occurs when the length you pass to Substring is greater than the number of characters actually available in the string, for example:
string myString = "Hello";
string substring = myString.Substring(0, 6);
Also,
</a><a href="vacations.aspx">
is not a proper anchor link. You should write it like this:
<a href="vacations.aspx">SomeText</a>
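
To make the bounds rule concrete, here is a small sketch in Python rather than C#. Python slicing never raises for an over-long range, so the first helper re-creates the check that .NET's Substring(start, length) performs, and the second shows one way to clamp the length before cutting. Both function names are hypothetical and exist only for this illustration:

```python
def substring(text, start, length):
    """Mimic the bounds check of .NET's String.Substring(start, length)."""
    if start < 0 or length < 0 or start + length > len(text):
        raise ValueError(
            "Index and length must refer to a location within the string."
        )
    return text[start:start + length]


def substring_clamped(text, start, length):
    """Return at most `length` characters, never reading past the end."""
    available = max(len(text) - start, 0)
    return text[start:start + min(length, available)]


my_string = "Hello"

# Asking for 6 characters from a 5-character string fails the check...
try:
    substring(my_string, 0, 6)
    raised = False
except ValueError:
    raised = True

# ...while clamping the length first returns what is available.
clamped = substring_clamped(my_string, 0, 6)
print(raised, clamped)
```

In the C# code the equivalent guard would be to compare the requested length against the string's Length before calling Substring.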

Watir/Cucumber: Unable to locate element dd-field div.dd-arrow-box (using CSS selector)

Chief complaint:
Cannot locate the element using a CSS selector
Css path:
html.ng-scope body.ng-scope div.snap-content div.content-container div.ng-scope div.content-box div div.cb-landing-module div.deposit-acct-box div.dropdown div.dd-field div.dd-arrow-box
Css selector:
.dropdown .dd-arrow-box
How I used it in Watir/Cucumber:
@browser.element(:css => 'dd-arrow-box').click
Error:
Cucumber reported error: Unable to locate element
Description Of Problem:
I have a combo box of sorts that I would like to click
The result would be selecting items from the list.
I have tried XPath in many combinations, with the same error:
Unable to find element
Tried:
Possible / sync issue: Fire on Even …. Failed
Xpath – Failed
When I take the action off (e.g. click) and check for existence instead, the element is found. I just can't click it.
Edit: Adding HTML from Comment:
<div class="dd-arrow">
</div>
</div>
</div>
<ul class="dd-box" style="display: none;">
<li class="dd-list-head"> Direct Deposit </li>
<li class="dd-item ng-scope ng-binding" ng-click="selectAccount(directDeposit)" ng-repeat="directDeposit in directDeposits">
<span class="acct-name ng-binding">WELLS FARGO BANK, NA</span>
Still puzzled. The last attempt threw the following exception: invalid attribute: :css (Watir::Exception::MissingWayOfFindingObjectException)
This might be because your selector is looking for a tag named dd-arrow-box rather than an element with that class. Change your code to look like this:
@browser.element(:css => '.dd-arrow-box').click
exactly as you would write the selector in a CSS file. Notice the period in front of dd-arrow-box.
Here's a good example.
Given the html:
<div class="divhere">
<p>Hello</p>
</div>
you could use the code @browser.element(css: 'p').text to grab the text from the paragraph.
Likewise, @browser.element(css: '.divhere') will locate the div element using the class selector.

How to getText from the element

I'm using WebDriver for forum reply testing. In this scenario, I'm not able to locate and get the reply text ("I want rock!") from the following code.
The HTML code is:
<div id="user_ack_con0" class="user_ack_con mt15 clear clearfix">
<dl class="clear clearfix">
<dt>
<a href="http://www.abc/user/1161/">
</a>
</dt>
<div>
Jason
<span class="total_icon total_icon5"></span>
:I want rock!
</div>
I really don't know how to get that text from this element. Does anybody know? Thanks.
Here's a general solution:
def get_text_excluding_children(driver, element):
    return driver.execute_script("""
    return jQuery(arguments[0]).contents().filter(function() {
        return this.nodeType == Node.TEXT_NODE;
    }).text();
    """, element)
The element passed to the function can be something obtained from the find_element...() methods (i.e. it can be a WebElement object).
I'm actually using this code in a test suite.
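
To see what "text excluding children" means without a browser, here is a rough offline sketch using only the Python standard library. It is not the Selenium code above: it parses a well-formed copy of the inner div and collects the element's own text plus the tail text that follows each child, which corresponds to the text nodes that are direct children. The helper name own_text and the final split on ":" are assumptions made for this example:

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the inner <div> from the question.
SNIPPET = """<div>
Jason
<span class="total_icon total_icon5"></span>
:I want rock!
</div>"""


def own_text(element):
    """Text nodes that are direct children of `element`, in document order."""
    parts = [element.text or ""]
    # In ElementTree, text following a child is stored on that child's .tail.
    parts.extend(child.tail or "" for child in element)
    return "".join(parts)


div = ET.fromstring(SNIPPET)

# Collapse the newlines around the pieces into single spaces.
direct_text = " ".join(own_text(div).split())

# Keep only what follows the first colon, i.e. the reply itself.
reply = direct_text.split(":", 1)[1]
print(reply)
```

With WebDriver the same post-processing (taking what follows the colon) can be applied to whatever string the helper above returns.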
The text is technically inside the div element, so you can locate that div with this XPath:
//div[@id="user_ack_con0"]/dl/div
and then get its text.
You can try:
driver.findElement(By.cssSelector("#user_ack_con0 > dl > div")).getText()
Or this, using jQuery's text() method:
$("#user_ack_con0 > dl > div").text()

How to find a group of classes containing specific text using Selenium IDE

I am trying to test that the correct title, summary, and link appear in search results. For instance, in the example below, I want to confirm that at least one of the records contains the title "Title for Beta," the summary containing the text "Summary for Beta," and a link called "Link."
<ul>
<li class="results">
<h2 class="title">Title for Alpha</h2>
<div class="summary">Summary for Alpha...</div>
<div class="link">Link</div>
</li>
<li class="results">
<h2 class="title">Title for Beta</h2>
<div class="summary">Summary for Beta...</div>
<div class="link">Link</div>
</li>
</ul>
There are many ways to select them. One is by CSS class name.
HtmlUnitDriver driver = new HtmlUnitDriver();
driver.get("...");
List<WebElement> titles = driver.findElements(By.className("title"));
List<WebElement> summaries = driver.findElements(By.className("summary"));
List<WebElement> links = driver.findElements(By.className("link"));
for (WebElement webElement : titles) {
    String innerText = webElement.getText();
    // do your test....
}
If your page structure is more complicated, you can also use XPath to do that.
If you are new to Selenium, have a look at the PageFactory pattern. It is a nice way to write much cleaner code.
For the less common case of selecting elements by their inner text, you can use XPath.
This XPath selects all elements whose text contains "Title for Alpha":
List<WebElement> titles = driver.findElements(By.xpath("//*[contains(text(), 'Title for Alpha')]"));
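
The question asks that one and the same record contains the title, the summary, and the link, which separate lists of titles and summaries do not guarantee. One way to express that check is to iterate over the li elements and test all three parts inside each. A minimal offline sketch in Python with the standard library, assuming a well-formed copy of the markup (closing div tags added); the function name record_matches is hypothetical:

```python
import xml.etree.ElementTree as ET

# Well-formed copy of the search results from the question.
RESULTS = """<ul>
<li class="results">
<h2 class="title">Title for Alpha</h2>
<div class="summary">Summary for Alpha...</div>
<div class="link">Link</div>
</li>
<li class="results">
<h2 class="title">Title for Beta</h2>
<div class="summary">Summary for Beta...</div>
<div class="link">Link</div>
</li>
</ul>"""


def record_matches(li, title, summary_part, link_text):
    """True if this single <li> holds the title, summary text, and link."""
    found_title = li.findtext("h2[@class='title']", default="")
    found_summary = li.findtext("div[@class='summary']", default="")
    found_link = li.findtext("div[@class='link']", default="")
    return (
        found_title.strip() == title
        and summary_part in found_summary
        and found_link.strip() == link_text
    )


root = ET.fromstring(RESULTS)
matches = [
    li
    for li in root.findall("li[@class='results']")
    if record_matches(li, "Title for Beta", "Summary for Beta", "Link")
]
has_beta = len(matches) >= 1
print(has_beta)
```

With WebDriver the same structure applies: find the li.results elements first, then call findElement on each one for its title, summary, and link.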
