Nokogiri and isolating select elements from an array full of Nokogiri nodes - css

I'm trying to scrape http://www.ign.com/games/reviews using Nokogiri and I'd like to instantiate new review objects that correspond to each game review on the page. Naturally, I'd also like to grab each numeric Score from each review and assign that score value as a class attribute to my review objects.
The problem is, the best I can do is return an entire string of scores mushed together instead of a list consisting of each score.
class VideoGameReviews::Review
  attr_accessor :name, :score, :url

  def self.scrape_titles
    @doc = Nokogiri::HTML(open("http://www.ign.com/games/reviews?"))
    @doc.search("#item-list div.itemList div.itemList-item").each do |review|
      new_review = VideoGameReviews::Review.new
      new_review.score = review.search("span.scoreBox-score").text
      # => "99996.37.17.17.17778.58.58.586.36.47.187.57.88.95.587.6" # Not what I want
    end
  end
end
Any advice on how to extract a list of scores with each score separate and unique from other scores? Maybe use a more specific CSS selector?

You are using Nokogiri correctly; what needs revising is the logic that stores the scores. For instance, we can get the score for an individual game pretty easily:
new_review.score = fourth_item.search("span.scoreBox-score").text
=> "6.3"
Instead of doing everything in a single method, you can start by breaking your code into smaller methods and caching values as needed. I would change this class name as well, since your Review class both represents a review item and does the scraping (a violation of the Single Responsibility Principle). Maybe something like the following would be better?
require 'nokogiri'
require 'open-uri' # lets open() fetch a URL; on Ruby 3.0+ call URI.open instead

class VideoGameReviews::ReviewScraper
  def reviews
    @reviews ||= Nokogiri::HTML(open("http://www.ign.com/games/reviews?"))
  end

  def review_items
    @review_items ||= reviews.search("#item-list div.itemList div.itemList-item")
  end

  def store_reviews
    review_items.each do |review|
      new_review = VideoGameReviews::Review.new # Review class still used to save review
      new_review.score = review.search("span.scoreBox-score").text
      # get other data
      new_review.save! # or however you plan on persisting the data
    end
  end
end
The question will be: how will you save the reviews (in local memory, in a db, etc.)? For something quick, ActiveRecord is pretty simple (and you can use it independently of Rails).
Note that the each method in Ruby always returns the original collection on which it's called, not the values computed inside the block. So, for instance, the following will return [1,2]:
[1,2].each do |n|
  n * 4
end

how to get list of Auto-IVC component output names

I'm switching over to using the Auto-IVC component as opposed to the IndepVar component. I'd like to be able to get a list of the promoted output names of the Auto-IVC component, so I can then use them to go and pull the appropriate value out of a configuration file and set the values that way. This will get rid of some boilerplate.
p.model._auto_ivc.list_outputs()
returns an empty list. It seems that p.model.__dict__ has this information encoded in it, but I don't know exactly what is going on there, so I am wondering if there is an easier way to do it.
To avoid confusing future readers: I assume you meant that you wanted the promoted input names for the variables connected to the auto_ivc outputs.
We don't have a built-in function to do this, but you could do it with a bit of code like this:
seen = set()
for n in p.model._inputs:
    src = p.model.get_source(n)
    if src.startswith('_auto_ivc.') and src not in seen:
        print(src, p.model._var_allprocs_abs2prom['input'][n])
        seen.add(src)
assuming 'p' is the name of your Problem instance.
The code above just prints each auto_ivc output name followed by the promoted input it's connected to.
Here's an example of the output when run on one of our simple test cases:
_auto_ivc.v0 par.x
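If the goal is to look values up in a configuration file afterwards, it may be handier to collect the pairs in a dict rather than print them. The following is only a sketch of that de-duplication logic: the two plain dicts stand in for the OpenMDAO lookups used above (get_source and the abs2prom mapping), and all variable names in the sample data are hypothetical.

```python
def collect_auto_ivc_inputs(input_to_source, input_to_promoted):
    """Map each _auto_ivc source to the first promoted input connected to it.

    Both arguments are plain dicts keyed by absolute input name; they are
    stand-ins (an assumption) for the OpenMDAO lookups, not OpenMDAO objects.
    """
    result = {}
    for abs_name, src in input_to_source.items():
        # Keep only auto_ivc sources, and only the first input seen for each.
        if src.startswith("_auto_ivc.") and src not in result:
            result[src] = input_to_promoted[abs_name]
    return result


# Hypothetical sample data: two inputs share one auto_ivc source.
sources = {
    "par.c1.x": "_auto_ivc.v0",
    "par.c2.x": "_auto_ivc.v0",
    "par.c1.y": "other.y",
}
promoted = {"par.c1.x": "par.x", "par.c2.x": "par.x", "par.c1.y": "par.y"}
mapping = collect_auto_ivc_inputs(sources, promoted)
print(mapping)
```

The promoted names in the resulting dict's values could then serve as keys into the configuration file.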

Jupyter Notebook different ways to display out

There seem to be three ways to display output in Jupyter:
By using print
By using display
By just writing the variable name
What is the exact difference, especially between number 2 and 3?
I haven't used display, but it looks like it provides a lot of controls. print, of course, is the standard Python function, with its own possible parameters.
But let's look at a simple numpy array in an IPython console session:
Simply giving the name - the default out:
In [164]: arr
Out[164]: array(['a', 'bcd', 'ef'], dtype='<U3')
This is the same as the repr output for this object:
In [165]: repr(arr)
Out[165]: "array(['a', 'bcd', 'ef'], dtype='<U3')"
In [166]: print(repr(arr))
array(['a', 'bcd', 'ef'], dtype='<U3')
Looks like the default display is the same:
In [167]: display(arr)
array(['a', 'bcd', 'ef'], dtype='<U3')
print on the other hand shows, as a default, the str of the object:
In [168]: str(arr)
Out[168]: "['a' 'bcd' 'ef']"
In [169]: print(arr)
['a' 'bcd' 'ef']
So at least for a simple case like this the key difference is between the repr and str of the object. Another difference is which actions produce an Out, and which don't. Out[164] is an array. Out[165] (and 168) are strings. print and display display, but don't put anything on the Out list (in other words they return None).
display can return a 'display' object, but I won't get into that here. You can read the docs as well as I can.
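The repr/str split can be reproduced without numpy. Below is a minimal sketch using a made-up Temperature class (hypothetical, not from the question) that has deliberately different __repr__ and __str__ methods. It assumes a plain object with no rich representation methods, in which case the bare-name output follows repr.

```python
class Temperature:
    """Toy class with deliberately different repr and str (hypothetical)."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __repr__(self):
        # Text used when a bare name is the last expression in a cell
        return f"Temperature(degrees={self.degrees})"

    def __str__(self):
        # Text used by print()
        return f"{self.degrees} degrees"


t = Temperature(21)
echo_text = repr(t)       # what a bare `t` at the end of a cell would show
print_text = str(t)       # what print(t) writes
print_result = print(t)   # print returns None, so nothing is added to Out
```

Typing t on its own in a cell would show the repr text and record the object in Out, while print(t) shows the str text and records nothing.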

Rails 4: how to identify and format links, hashtags and mentions in model attributes?

In my Rails 4 app, I have a Post model, with :copy and :short_copy as custom attributes (strings).
These attributes contain copy for social media (Facebook, Twitter, Instagram, Pinterest, etc.).
I display the content of these attributes in my Posts#Show view.
Currently, URLs, #hashtags and @mentions are formatted like the rest of the text.
What I would like to do is to format them in a different fashion, for instance in another color or in bold.
I found the twitter-text gem, which seems to offer such features, but my problem is that I do NOT need, and do NOT want, to have these URLs, #hashtags and @mentions turn into real links.
Indeed, it looks like the twitter-text gem converts URLs, #hashtags and @mentions into links by default with Twitter::Autolink, as explained in this Stack Overflow question.
That is not what I am looking for: I just want to update the style of my URLs, #hashtags and @mentions.
How can I do this in Ruby / Rails?
—————
UPDATE:
Following Wes Foster's answer, I implemented the following method in post.rb:
def highlight(string)
  string.gsub!(/\S*#(\[[^\]]+\]|\S+)/, '<span class="highlight">\1</span>')
end
Then, I defined the following CSS class:
.highlight {
color: #337ab7;
}
Last, I implemented <%= highlight(post.copy) %> in the desired view.
I now get the following error:
ArgumentError
wrong number of arguments (1 for 2..3)
<td><%= highlight(post.copy) %></td>
What am I doing wrong?
—————
I'm sure each of the following regex patterns could be improved to match even more options, but the following code works for me:
def highlight_url(str)
  str.gsub!(/(https?:\/\/[\S]+)/, '[\1]')
end

def highlight_hashtag(str)
  str.gsub!(/\S*#(\[[^\]]+\]|\S+)/, '[#\1]')
end

def highlight_mention(str)
  str.gsub!(/\B(\@[a-z0-9_-]+)/i, '[\1]')
end

# Initial string
str = "Myself and @doggirl bought a new car: http://carpictures.com #nomoremoney"

# Pass through each
highlight_mention(str)
highlight_hashtag(str)
highlight_url(str)

puts str # > Myself and [@doggirl] bought a new car: [http://carpictures.com] [#nomoremoney]
In this example, I've wrapped the matches with brackets []. You should use a span tag and style it. Also, you can wrap all three gsub! into a single method for simplicity.
Updated for the asker's add-on error question
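It looks like the error refers to another method named highlight: Rails views already provide a highlight helper (from ActionView's TextHelper) that expects two or three arguments, which matches the error message. Try changing the name of your method from highlight to new_highlight to see if that fixes the new problem.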

Meteor - Passing a jade helper into a helper function

I'm trying to populate a list with a dataset and set the selected option with a helper function that compares the current data with another object's data (the 2 objects are linked)
I made the same type of list population with static variables:
Jade-
select(name='status')
  option(value='Newly Acquired' selected='{{isCurrentState "Newly Acquired"}}') Newly Acquired
  option(value='Currently In Use' selected='{{isCurrentState "Currently In Use"}}') Currently In Use
  option(value='Not In Use' selected='{{isCurrentState "Not In Use"}}') Not In Use
  option(value='In Storage' selected='{{isCurrentState "In Storage"}}') In Storage
Coffeescript-
"isCurrentState" : (state) ->
  return @status == state
This uses a helper, isCurrentState, to match a given parameter against the same object that my other code is linked to, so I know that part works.
The code I'm trying to get to work is:
Jade-
select.loca(name='location')
  each locations
    option(value='#{siteName}' selected='{{isCurrentLocation {{siteName}} }}') #{siteName}
Coffeescript-
"isCurrentLocation": (location) ->
  return @locate == location
All the other parts are functioning 100%, but the selected part is not.
I've also tried writing the selected='' part in a number of other ways, such as:
selected='{{isCurrentLocation "#{siteName}" }}'
selected='{{isCurrentLocation "#{siteName} }}'
selected='{{isCurrentLocation {{siteName}} }}'
selected='#{isCurrentLocation "{{siteName}}" }'
selected='#{isCurrentLocation {{siteName}} }'
selected='#{isCurrentLocation #{siteName} }'
Is what I'm trying to do even possible?
Is there a better way of achieving this?
Any help would be greatly appreciated
UPDATE:
Thanks @david-weldon for the quick reply. I've tried this out a bit and realised that I wasn't exactly clear in my question about what I was trying to accomplish.
I have a template "update_building" created with a parameter (a building object) with a number of attributes, one of which is "locate".
Locations is another object with a number of attributes as well, one of which is "siteName". One of the siteName values equals locate, and thus I need to pass in the siteName from locations to match it against the current building's locate attribute.
Though it doesn't work in the context I want to use it in, it definitely pointed me in a direction I hadn't thought of. I am looking into passing the parent template's (the building's) data context as a parameter into the locations template and using it from within the locations template. This is easily done in normal HTML Spacebars with:
{{>locations parentDataContext/variable}}
Something like that in jade would easily solve this
Short answer
selected='{{isCurrentLocation siteName}}'
Long answer
You don't really need to pass the current location because the helper should know its own context. Here's a simple (tested) example:
jade
template(name='myTemplate')
  select.location(name='location')
    each locations
      option(value=this selected=isCurrentLocation) #{this}
coffee
LOCATIONS = [
  'Newly Acquired'
  'Currently In Use'
  'Not In Use'
  'In Storage'
]

Template.myTemplate.helpers
  locations: LOCATIONS
  isCurrentLocation: ->
    @toString() is Template.instance().location.get()

Template.myTemplate.onCreated ->
  @location = new ReactiveVar LOCATIONS[1]
I looked into the data contexts some more and ended up moving the options that populate the select into a separate template with its own helper. The helper accesses the parent template's data context to determine which location the building has saved, so that I can mark that option as selected.
Jade-
template(name="location_building_option")
  option(value='#{siteName}' selected='{{isSelected}}') #{siteName}
Coffeescript -
Template.location_building_option.helpers
  'isSelected': ->
    parent = Template.parentData(1)
    buildSite = parent.locate
    return @siteName == buildSite
Thanks @david-weldon, your answer helped me immensely to head in the right direction.

Automatically create new objects in a Plone Folder, being the ids only sequential numbers

I have the following structure:
/Plone/folder/year/month/day/id
I want to create the last id sequentially in a tool using invokeFactory. I want to have:
/Plone/folder/2011/06/21/1
/Plone/folder/2011/06/21/2
/Plone/folder/2011/06/21/3
/Plone/folder/2011/06/21/4
Instead of:
/Plone/folder/2011/06/21/id-1
/Plone/folder/2011/06/21/id-2
/Plone/folder/2011/06/21/id-3
/Plone/folder/2011/06/21/id-4
...which is what happens automatically when I try to create an object with the same name in a folder: Plone handles that for me by adding a sequential number. I want an efficient way to create objects using only a sequential number instead of a name with a sequential number. I can do that by getting the total number of items in a folder, but I would like to know if there's a better way.
Real life example: http://plone.org/products/collective.captcha/issues/4
If you're creating these objects manually, you can do something like:
brains = context.getFolderContents({'sort_on' : 'id', 'sort_order' : "reverse"})
if len(brains) > 0:
    id = str(int(brains[0].id) + 1)
else:
    id = '1'
Then you'll have to create the object manually with that id.
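One thing to watch for: ids are strings, and when strings are compared as text, '9' sorts after '10'. A sketch that sidesteps this by comparing the ids as integers is shown below. It is plain Python working on a list of id strings; the helper name next_numeric_id is made up for illustration, and wiring it to the folder (for example, feeding it the folder's existing ids) is left out.

```python
def next_numeric_id(existing_ids):
    """Return the next sequential id as a string.

    existing_ids is any iterable of id strings already used in the folder.
    Ids that are not purely numeric (e.g. 'index_html') are ignored.
    """
    numeric = [int(i) for i in existing_ids if i.isdecimal()]
    # Compare as integers, so '10' counts as larger than '9'.
    return str(max(numeric) + 1) if numeric else "1"


first_id = next_numeric_id([])
after_ten = next_numeric_id(["1", "2", "9", "10"])
mixed = next_numeric_id(["index_html", "3"])
```

Like the snippet above, this only computes the id; the object still has to be created with it, and two simultaneous requests could compute the same number.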
If you want this to be done automatically for you when a user creates content, you might want to look into creating a content rule to change the id of the content for you. An example that might help is collective.contentrules.yearmonth.
