I have a list of items:
<div class="item">
<a href="//external-link.com">
<img src="main-image.jpg" alt=""/>
</a>
<h2> Title </h2>
<p> Description lorem here </p>
</div>
<div class="item">
<a href="//external-link.com">
<img src="main-image.jpg" alt=""/>
</a>
<h2> Title </h2>
<p> Description lorem here </p>
</div>
<div class="item">
<a href="//external-link.com">
<img src="main-image.jpg" alt=""/>
</a>
<h2> Title </h2>
<p> Description lorem here </p>
</div>
I want to extract the text of the <h2> tag, and the "src" and "href" of the <a> and <img> tags, but I can't figure out how to extract the "src" and "href" attributes.
This is something like what I'm using:
require 'nokogiri'
require 'open-uri'
pageURL = 'http://ticketdriver.com/amg/buy/tickets'
page = Nokogiri::HTML(open(pageURL), nil, 'UTF-8')
page.css('.item').each do |node|
title = node.css('h2').text
srcUrl = node.css('img')['src']
end
The text part is working, but I can't access the attributes of the child elements of ".item". I tried children[0], [0]['src'], [:src], attr(), attribute() and a few more.
I'm completely out of ideas and Google search pages.
I'd do something like:
doc = Nokogiri::HTML(<<EOT)
<html><body>
<div class="item">
<a href="//external-link.com">
<img src="main-image1.jpg" alt=""/>
</a>
<h2> Title1 </h2>
</div>
<div class="item">
<a href="//external-link.com">
<img src="main-image2.jpg" alt=""/>
</a>
<h2> Title2 </h2>
</div>
<div class="item">
<a href="//external-link.com">
<img src="main-image3.jpg" alt=""/>
</a>
<h2> Title3 </h2>
</div>
</body></html>
EOT
items = doc.search('.item').map { |item|
{
title: item.at('h2').text,
src: item.at('img')['src']
}
}
Which results in:
items
# => [{:title=>" Title1 ", :src=>"main-image1.jpg"},
# {:title=>" Title2 ", :src=>"main-image2.jpg"},
# {:title=>" Title3 ", :src=>"main-image3.jpg"}]
I'm deliberately only getting the "src" attribute from the <img> tag. Given the code above you can figure out how to get what you want from the <a> tags.
Notice that I'm using the generic search rather than css. Nokogiri is smart enough to differentiate between CSS and XPath selectors most of the time. The only time I use either css or xpath is when Nokogiri can't figure it out. I use CSS because it's generally simpler and more easily read.
Also, notice that I don't use node.css('h2').text. css returns a NodeSet, which is akin to an Array, whereas at returns a single Node. In your code you're masking the difference between the two, but calling text on the result of css, xpath or the generic search is a bug in waiting. Consider this:
require 'nokogiri'
doc = Nokogiri::HTML(<<EOT)
<html><body>
<p>foo</p>
<p>bar</p>
<p>baz</p>
</body></html>
EOT
doc.search('p').text # => "foobarbaz"
doc.at('p').text # => "foo"
What this means is, if search or one of its specific methods returns a NodeSet, text will return the text of all Nodes in that set, which is rarely what you want. Instead, you need to use at to find the specific child-node you want and then extract its text. How you do that is a different question, but it's easily done.
Related
I have a navbar element. It has a navbar-dropdown element. I want to add a navbar-item containing a sentence to it, but word wrapping doesn't work inside navbar-item.
How can I turn it back on? Here's the code for the navbar-dropdown element.
<div class="is-primary navbar-dropdown is-size-4 dropdown-width">
<div class="navbar-item">
To use the system's features, please log in.
</div>
<div class="navbar-item">
<a
class="button is-light"
:class="{ 'is-hidden':$store.state.isAuth }"
@click="openModal"
>
Log in
</a>
</div>
</div>
The "dropdown-width" class sets a fixed width for the element.
I'm trying to capture all the anchor clicks.
In GTM, my trigger is:
All Elements / Some Clicks / Click Element / Matches CSS Selector / #most-popular-posts > a
I've tried #most-popular-posts > * > a as well with no luck. Any ideas on why this isn't working?
My HTML is as follows:
<div id="most-popular-posts">
<h4>Most Popular</h4>
<div class="post-loop">
<ul>
<li>
<a class="latest_thumbnail_wrapper" href="http://thelink.com">
<div class="latest_thumbnail">
<img src="https://theimage.jpg" class="attachment-loop-thumbnail size-loop-thumbnail wp-post-image" alt="">
</div>
</a>
<div class="latest_list_wrapper">
<h5 class="cat-label">The Category</h5>
<h3>
Article Title
</h3>
<div class="byline">
<span>by</span> <a class="author" href="">First Last Name</a></div>
</div>
</li>
<li>
</ul>
</div>
</div>
Remove the > from the selector. That will look for any <a> tag inside of the #most-popular-posts element instead of looking for an <a> directly nested inside of that element.
See this Mozilla article for more details: https://developer.mozilla.org/en-US/docs/Learn/CSS/Introduction_to_CSS/Cascade_and_inheritance
Here is my md-card's loop (except the first div):
<div layout="column" layout-gt-sm="row" layout-wrap="">
<div flex="25" flex-gt-sm="50">
<md-card>
<md-card-header>
<md-card-avatar><img src="#" />
</md-card-avatar>
<md-card-header-text>
<span class="md-title">Angular</span>
<span class="md-subhead">Material</span>
</md-card-header-text>
</md-card-header>
<img ng-src="{{imagePath}}" alt="Washed Out" class="md-card-image" />
<md-card-title>
<md-card-title-text>
<span class="md-headline">Text</span>
</md-card-title-text>
</md-card-title>
<md-card-content>
<p>
Content
</p>
</md-card-content>
<md-button>Button</md-button>
</md-card>
</div>
<!-- another card -->
</div>
This works fine, here is the picture.
But what if I don't want the cards to line up strictly in rows? Is there any way to get something like this?
That kind of layout will require some JavaScript added to your view logic. Since each of your cards is laid out in a row, that styling isn't currently possible.
I would check out http://masonry.desandro.com/
I'm having trouble trying to get to some data using the dom crawler.
I want to get the name 'Avocado' and the price '£1.50'. I thought I'd be able to do something like:
$message = $crawler->filterXPath('h3')->text();
<div class="product">
<div class="productInner">
<div class="productInfoWrapper">
<div class="productInfo">
<h3>
<a href="http://website.com" >
Avocado
<img src="pic.jpg" alt="" />
</a>
</h3>
</div>
</div>
<div class="pricingAndTrolleyOptions">
<div class="pricing">
<p class="pricePerUnit">
£1.50<abbr title="per">/</abbr><abbr title="unit"><span class="pricePerUnitUnit">unit</span></abbr>
</p>
<p class="pricePerMeasure">£1.50<abbr
title="per">/</abbr><abbr
title="each"><span class="pricePerMeasureMeasure">ea</span></abbr>
</p>
</div>
</div>
</div>
To get h3 text:
$message = $crawler->filterXPath('//div[@class="productInfo"]/h3')->text();
To get price (i.e. for class pricePerMeasure):
$price = $crawler->filterXPath('//p[@class="pricePerMeasure"]')->text();
In my Evolve theme, there is a section to customize the footer with the following text:
Available HTML tags and attributes: <b> <i> <a href="" title=""> <blockquote> <del datetime=""> <ins datetime=""> <img src="" alt="" /> <ul> <ol> <li> <code> <em> <strong> <div> <span> <h1> <h2> <h3> <h4> <h5> <h6> <table> <tbody> <tr> <td> <br /> <hr />
When I try to use them, nothing happens: the HTML code is shown as plain text.
Has anybody run into this issue?
Why not edit footer.php?
<?php $footer_content = evl_get_option('evl_footer_content','');
if ($footer_content === false) $footer_content = '';
echo esc_attr($footer_content);
?>
Put your HTML into $footer_content, i.e. $footer_content = 'insert here'.
If we have the same version of 'Evolve', you can find it at lines 88-91 of footer.php.