Parsing images via nokogiri and xpath - css

I currently have a piece of code that grabs a product title, description, and price, and for that it works great. However, I also need it to get the image URL, which is where my dilemma is. I tried using an XPath inside the loop at the bottom, and it lists ALL the images with a width of 220 for EVERY product, which I don't want at all. So basically I get something like this:
product 1 Title here
product 1 Description here
product 1 price here
http://www.test.com/product1.jpg
http://www.test.com/product2.jpg
http://www.test.com/product3.jpg
http://www.test.com/product4.jpg
product 2 Title here
product 2 Description here
product 2 price here
http://www.test.com/product1.jpg
http://www.test.com/product2.jpg
http://www.test.com/product3.jpg
http://www.test.com/product4.jpg
Whereas I obviously want product 1 to have just http://www.test.com/product1.jpg, product 2 to have http://www.test.com/product2.jpg, and so on. The images are in a div tag with no class or ID, which is why I didn't simply use a CSS selector. I'm really new to Ruby/Nokogiri, so any help would be great.
require 'nokogiri'
require 'open-uri'
url = "http://thewebsitehere"
data = Nokogiri::HTML(open(url))
products = data.css('.item')
products.each do |product|
puts product.at_css('.vproduct_list_title').text.strip
puts product.at_css('.vproduct_list_descr').text.strip
puts product.at_css('.price-value').text.strip
puts product.xpath('//img[@width = 220]/@src').map {|a| a.value }
end

Try changing:
puts product.xpath('//img[@width = 220]/@src').map {|a| a.value }
to:
puts product.xpath('.//img[@width = 220]/@src').map {|a| a.value }
The point of the '.' there is to say you want only the images that are descendants of the current node (so you're not peeking at product 2's images). Without it, '//' starts from the document root no matter which node you call xpath on.

File.basename will return only the filename:
File.basename('http://www.test.com/product4.jpg')
#=> "product4.jpg"
So you probably want something like this:
puts product.xpath('.//img[@width = 220]/@src').map {|a| File.basename(a.value) }
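One caveat worth checking: if the image URLs ever carry a query string, File.basename on the raw URL keeps it. Below is a small sketch, assuming it is acceptable to parse the URL first. image_filename is a made-up helper name, and the second URL is invented for illustration.

```ruby
require 'uri'

# Hypothetical helper: return just the file name from an image URL,
# ignoring any query string by looking only at the URL's path component.
def image_filename(url)
  File.basename(URI.parse(url).path)
end

puts image_filename('http://www.test.com/product4.jpg')
puts image_filename('http://www.test.com/images/product4.jpg?size=220')
```

Both calls should give the bare file name; calling File.basename directly on the second URL would leave the "?size=220" part attached.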

Is there a way to get a "table like" appearance to my dropdown entries?

Using a fixed-width font (e.g. Consolas, Courier, etc), I am trying to
populate a dropdown menu (AjaxControlToolkit:DropDownList) that has 2
columns (in appearance). I have a product name and a category name
(neither of which I know until runtime). The appearance I'm looking
for is something like:
Chevy Cruz      (gas)
Prius           (hybrid)
Tesla Model S   (electric)
My list can have over 300 entries and if I just append the category to the
product name, the menu is harder to read.
I've tried using a character array and copying in the category name at the
same index for each ListItem, but the spaces between disappear when the
dropdown list is opened. I've looked at the ListItem(Paragraph) constructor,
but as far as I understand it, it doesn't solve my problem. I
haven't looked at the Telerik controls I have available because it
would mean a lot of coding changes.
I can't think of another AjaxControlToolkit control that might help.
String padding might work for you:
var _maxLengthOfProductName = 20; //number of space you need
var _productName = "Product Name";
var _type = "(type)";
var _ProductNameWithType = _productName.PadRight(_maxLengthOfProductName, ' ') + _type; //assign this to the dropdown item
_ProductNameWithType = _ProductNameWithType.Replace(" ", "\u00A0"); // non-breaking spaces, so the browser does not collapse the padding
It will show:
Product Name        (type)
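Since the product and category names are only known at runtime, the pad width probably should come from the longest name rather than a fixed 20. Here is a rough sketch of that idea, written in Ruby rather than C# just to keep it short. aligned_labels and the gap parameter are hypothetical names, and the sample data is taken from the question.

```ruby
# Non-breaking space, so a browser does not collapse the padding.
NBSP = "\u00A0"

# Hypothetical helper: takes [name, category] pairs and returns one label
# per pair, with every category starting in the same column.
def aligned_labels(items, gap: 2)
  # Pad to the longest name plus a small gap.
  width = items.map { |name, _category| name.length }.max.to_i + gap
  items.map do |name, category|
    (name.ljust(width) + "(#{category})").gsub(' ', NBSP)
  end
end

labels = aligned_labels([
  ['Chevy Cruz', 'gas'],
  ['Prius', 'hybrid'],
  ['Tesla Model S', 'electric']
])

puts labels
```

The same steps (find the longest name, pad to that width, swap spaces for non-breaking spaces) should carry over to PadRight and Replace in the C# version.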

cts search to test if the element is not available

Below is the XML structure from which I want to get the entries where the element co:isbn is not present:
<tr:trackingRecord xmlns:tr="https://www.mla.org/Schema/Tracking/tr"
    xmlns:co="https://www.mla.org/Schema/commonModule/co"
    xmlns:r="http://www.rsuitecms.com/rsuite/ns/metadata">
  <tr:journal>
    <tr:trackingDetails>
      <tr:entry>
        <co:trackingEntryID>2015323313</co:trackingEntryID>
        <co:publicationDate>2015</co:publicationDate>
        <co:volume>21</co:volume>
      </tr:entry>
      <tr:entry>
        <co:trackingEntryID>2015323314</co:trackingEntryID>
        <co:publicationDate>2015</co:publicationDate>
        <co:isbn>
          <co:entry>NA</co:entry>
          <co:value>1234567890128</co:value>
        </co:isbn>
      </tr:entry>
      <tr:entry>
        <co:trackingEntryID>2015323315</co:trackingEntryID>
        <co:publicationDate>2015</co:publicationDate>
        <co:volume>21</co:volume>
        <co:isbn></co:isbn>
      </tr:entry>
      <tr:entry>
        <co:trackingEntryID>2015323316</co:trackingEntryID>
        <co:publicationDate>2015</co:publicationDate>
        <co:volume>21</co:volume>
      </tr:entry>
    </tr:trackingDetails>
  </tr:journal>
</tr:trackingRecord>
Please suggest a cts:query for this.
If you can edit the XML structure, add an attribute to each entry element, such as
<tr:entry isbnPresent="yes"> when an ISBN is present and
<tr:entry isbnPresent="no"> when it is absent,
and then search on that attribute with
cts:element-attribute-value-query
OR
without editing the schema, try something like:
for $i in cts:search(//tr:entry,"2015")
return if(fn:exists($i//co:isbn)) then () else $i
Note that fn:exists also treats the empty <co:isbn></co:isbn> in the third entry as present, so that entry is not returned.

How to theme an element of an array?

I have this array in a Drupal 7 installation; it outputs the term list that belongs to a specific vocabulary ID:
<?php print render($content['taxonomy_vocabulary_3']); ?>
This outputs the result as a list, but I would like to output it as a comma-separated line.
I suppose I could do that with a foreach statement?
I've tried this after reading the documentation, but it output nothing:
foreach($taxonomy_vocabulary_3 as $id=>$tag) {
echo "$tag, " ;
}
I've looked at what the Devel module shows for that array:
taxonomy_vocabulary_3 (Array, 1 element)
  und (Array, 2 elements)
    0 (Array, 1 element)
      tid (String, 3 characters ) 141
    1 (Array, 1 element)
      tid (String, 3 characters ) 320
But as you can see it shows the term id in each case, and not the term name...
What do you suggest? Thanks!!
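You can load each term and then print its name.
foreach ($vocabulary as $item) {
  // $vocabulary is assumed to be the 'und' array shown by Devel; each item holds a 'tid'.
  $term = taxonomy_term_load($item['tid']);
  // Print whatever you want from this object.
  print $term->name . ', ';
}
See taxonomy_term_load in the Drupal API documentation.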
What you have is a build (render) array, which means that
$content['taxonomy_vocabulary_3']['#theme']
is the theme function used to render the vocabulary. If you want to change the output, you have two good options:
1. Override the standard theme function in your theme. This alters the output of every call to that theme function, in this case how all vocabularies are rendered.
2. Change the #theme value to a theme function of your liking, for example a custom theme function you define in your theme.
For help on how to render the terms, look at how the original theme function is implemented; you can look it up in the Drupal API documentation.

Obtain data from dynamically incremented IDs in jQuery

I have a quick question about jQuery. I have dynamically generated paragraphs with IDs that are incremented. I would like to take information from that page and bring it to my main page. Unfortunately, I am unable to read the dynamically generated paragraph IDs to get the values. I am trying this:
var Name = ((data).find("#Name" + id).text());
The ASP.NET code goes like this:
Dim intI As Integer = 0
For Each Item As cItem in alProducts1
Dim pName As New System.Web.UI.HtmlControls.HtmlGenericControl("p")
pName.ID = "Name" & intI.ToString()
pName.InnerText = Item.Name
Controls.Add(pName)
intI += 1
Next
Those name values are the ones I want (Name1, Name2, Name3), and I want to get them individually to put each in its own textbox. I'm taking the values from the ASP.NET webpage and putting them into an AJAX page.
Your question is not clear about your exact requirement, but you can get the ID of an element with jQuery's attr method. Here is an example:
alert($('selector').attr('id'));
You want to select all the elements with the incrementing IDs, right?
// this selects every element
// whose id starts with 'Name'
$(data).find('[id^="Name"]')
Thanks for the help everyone. I found the solution today however:
var Name = ($(data).find('#Name' + id.toString()).text());
I forgot the .toString() part and that seems to have made the difference.

Filtering names on the basis of first character of the name

I have a page that displays the names of all users, and I want to filter those names by their first character. To do that, I want to show A B C D ... X Y Z filters at the top; clicking one would filter the names accordingly. My problem is not the query but how to add these letters: do I have to add 26 link buttons separately, or is there a workaround? You may have seen this kind of behavior on music sites that filter songs by starting character.
Here are a few useful links on alphabetical paging:
1. http://www.highoncoding.com/Articles/209_GridView_Alphabet_Paging.aspx
2. http://aspdotnetcodebook.blogspot.com/2008/03/how-to-add-alphabet-paging-in-gridview.html
Use ASCII character codes to do this, for example:
var letters = new List<string>();
for (int i = 65; i < 91; i++) // 65..90 are 'A'..'Z'
    letters.Add(Convert.ToChar(i).ToString());
Display them by adding links to the page:
foreach (var letter in letters)
{
    var hyperlink = new HyperLink()
    {
        NavigateUrl = string.Format("Filter.aspx?letter={0}", letter),
        Text = letter
    };
    Page.Controls.Add(hyperlink);
}
Of course, instead of Page you can use any other container; you just need to add the hyperlinks to its Controls collection.
Also take care to run this code in the proper method, for example by overriding CreateChildControls.
Regards
