using css selector for imacros with conditions - web-scraping

Does a conditional CSS selector work with iMacros? I want to run a web scraper. The selector below works with the Web Scraper Chrome extension, but it doesn't work with iMacros.
Selector:
div.s-expand-height:has(span.a-price.a-text-price), .celwidget div.s-item-container:has(span.a-price.a-text-price), div.s-include-content-margin:has(span.a-price.a-text-price)
I tried it in iMacros in the formats below, but neither works:
Format 1
TAG POS=7 TYPE=DIV ATTR=CLASS:"s-expand-height:has(span a-price.a-text-price), celwidget s-item-container:has(span.a-price.a-text-price), s-include-content-margin:has(span.a-price.a-text-price)" EXTRACT=TXT
Format 2
TAG POS=7 TYPE=DIV ATTR=CLASS:"div.s-expand-height:has(span.a-price.a-text-price), .celwidget div.s-item-container:has(span.a-price.a-text-price), div.s-include-content-margin:has(span.a-price.a-text-price)" EXTRACT=TXT
My complete iMacros script looks like this:
SET !DATASOURCE E:\imacros\urllist1.csv
SET !LOOP 2
SET !DATASOURCE_LINE {{!LOOP}}
URL GOTO={{!COL1}}
WAIT SECONDS={{!COL2}}
TAG POS=7 TYPE=DIV ATTR=CLASS:"s-expand-height:has(span.a-price.a-text-price), .celwidget s-item-container:has(span.a-price.a-text-price), s-include-content-margin:has(span.a-price.a-text-price)" EXTRACT=TXT
ADD !EXTRACT {{!URLCURRENT}}
'TAG POS=1 TYPE=DIV ATTR=CLASS:"s-expand-height s-include-content-margin s-border-bottom s-latency-cf-section" EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=E:\imacros FILE=data.csv

Alright, and for anybody interested: I posted a solution in @OP's parallel thread on the iMacros forum. It mostly uses 'Relative Positioning' to achieve what I think I understood OP wants, applied to the website and pages they gave as an example.
Here is the short script I posted; the whole thread contains much more info and a full script with all the debug info and techniques I used:
VERSION BUILD=8820413 RECORDER=FX
SET !EXTRACT_TEST_POPUP NO
SET !TIMEOUT_STEP 2
'SET !LOOP 1
TAB T=1
'URL GOTO=https://www.amazon.com/s?rh=n%3A1055398%2Cn%3A%211063498%2Cn%3A284507%2Cn%3A289814&lo=image&qid=1606931446&ref=lp_289814_il_ti_kitchen
'Extract 'Original Price' (Striked through):
'TAG POS=2 TYPE=SPAN ATTR=TXT:$14.95
'TAG POS=3 TYPE=SPAN ATTR=CLASS:"a-price a-text-price"&&DATA-A-STRIKE:"true"&&TXT:$*$* EXTRACT=HTM
SET !EXTRACT NULL
'TAG POS=3 TYPE=SPAN ATTR=CLASS:"a-price a-text-price"&&DATA-A-STRIKE:"true"&&TXT:$*$* EXTRACT=TXT
TAG POS={{!LOOP}} TYPE=SPAN ATTR=CLASS:"a-price a-text-price"&&DATA-A-STRIKE:"true"&&TXT:$*$* EXTRACT=TXT
SET Orig_Price EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.split('$'); y=x[1]; z='$'+y; z;")
'Extract 'Discount Price':
SET !EXTRACT NULL
TAG POS=R-1 TYPE=SPAN ATTR=CLASS:"a-price"&&TXT:$*$* EXTRACT=TXT
SET Discount_Price EVAL("var s='{{!EXTRACT}}'; var x,y,z; x=s.split('$'); y=x[1]; z='$'+y; z;")
'Extract 'Description':
'TAG POS=1 TYPE=H2 ATTR=TXT:Genuine<SP>Instant<SP>Pot<SP>Tempered<SP>Glass<SP>Lid,<SP>9* EXTRACT=HTM
SET !EXTRACT NULL
TAG POS=R-1 TYPE=H2 ATTR=TXT:* EXTRACT=TXT
SET Descr EVAL("var s='{{!EXTRACT}}'; var z=s.trim(); z;")
PROMPT LOOP:<SP>_{{!LOOP}}_<BR><BR>Descr:<SP>_{{Descr}}_<BR>Discount_Price:<SP>_{{Discount_Price}}_<BR>Original_Price:<SP>_{{Orig_Price}}_
Script written and tested using iMacros for FF v8.8.2, PM v26.3.3, Win10_x64.
(Needs to be looped...)
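The EVAL lines in the script rely on the extracted text containing two dollar signs (which is what the `TXT:$*$*` filter matches) and keep only the first price. A minimal JavaScript sketch of that same split logic, runnable outside iMacros; the function name `firstPrice` and the sample strings are assumptions made for illustration:

```javascript
// Mirrors the EVAL expression: split on '$' and keep the piece after the first '$'.
function firstPrice(extracted) {
  const parts = extracted.split('$'); // '$14.95$14.95' -> ['', '14.95', '14.95']
  return '$' + parts[1];
}

console.log(firstPrice('$14.95$14.95'));
console.log(firstPrice('$9.99$12.49'));
```

If the extracted text contains no `$` at all (for example when the TAG command found nothing), `parts[1]` is undefined, so a real macro would probably want to check for that case before saving.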


TypoScript Insert HeaderData before Style Definitions

I need to insert some headerData before the style definitions. To be more precise, I want to insert my Google Tag Manager script before my style sheets because of the loading order of my font-family.
At the moment I do it like this:
page.includeCSS {
  file1 = fileadmin/templates/css/bootstrap.css
  file1.media = all
  file2 = fileadmin/templates/css/style.css
  file2.media = all
  file3 = fileadmin/templates/fontawesome/css/all.min.css
  file3.media = all
}
page.headerData {
10 = TEXT
10.value (
<script data-cookieconsent="ignore">(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':
new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],
j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src=
'https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);
})(window,document,'script','dataLayer','GTM-XXXXXXXXXX');</script>
)
}
According to the TYPO3 documentation, headerData "By default, gets inserted after all the style definitions." I take that to mean it can also be inserted before them, but I don't know how.
I also don't want to put the CSS in headerData, because then I can't concatenate and compress it.
If you want it in the first position, you could modify the <head> tag:
page.headTag (
<head>
<script data-cookieconsent="ignore">(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':
new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],
j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src=
'https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);
})(window,document,'script','dataLayer','GTM-XXXXXXXXXX');</script>
)
A cleaner way would be to reorder the header elements with TypoScript:
config {
pageRendererTemplateFile = EXT:site_config/Resources/Private/Templates/templateFile.html
}
Then copy the file typo3/sysext/core/Resources/Private/Templates/PageRenderer.html to that location and change the order of the markers.
(Yes, it is an old-style marker template file.)
Now you can move additionalHeaders in front of any CSS or JS.

How are Inspector Controls added for the header block?

The heading block shows 3 inspector controls, but inside the block definition there is no InspectorControls tag:
https://github.com/WordPress/gutenberg/blob/master/packages/block-library/src/heading/edit.js
Also in the rich text control there is no InspectorControls definition.
How are the inspector controls for the header block being added?
InspectorControls is a slot that gives plugin and block developers an easy way to insert content into the UI.
It's reasonable to expect that the Heading block source would also contain InspectorControls.
However, given that InspectorControls uses useDisplayBlockControls, it becomes less obvious how and where the controls are added.
Digging a bit deeper into the Sidebar (packages/edit-post/src/components/sidebar/settings-sidebar/index.js) where the "Block" component is defined, we find that BlockInspector is actually used:
{ sidebarName === 'edit-post/block' && <BlockInspector /> }
This eventually leads to gutenberg/packages/block-editor/src/components/block-inspector/ and the revelation that BlockInspector contains an InspectorControls slot:
...
<InspectorControls.Slot bubblesVirtually={ bubblesVirtually } />
<div>
<AdvancedControls
slotName={ InspectorAdvancedControls.slotName }
bubblesVirtually={ bubblesVirtually }
/>
</div>
...
Ref: packages/block-editor/src/components/block-inspector/index.js
The block displays the controls for the attributes it supports; for Heading, for example, "typography", "color" and "advanced > custom css" are displayed.
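The slot idea described above can be modelled in a few lines of plain JavaScript. This is a deliberately simplified sketch and not Gutenberg's actual implementation: the class `SlotFillRegistry`, its methods and the panel names are all hypothetical. It only shows why the sidebar can render controls that are never mentioned in the block's own edit file.

```javascript
// Simplified model of the slot/fill pattern; NOT Gutenberg's real implementation.
class SlotFillRegistry {
  constructor() {
    this.fills = new Map(); // slot name -> array of registered fill contents
  }

  // A block, plugin or shared hook contributes content to a named slot.
  registerFill(slotName, content) {
    if (!this.fills.has(slotName)) {
      this.fills.set(slotName, []);
    }
    this.fills.get(slotName).push(content);
  }

  // The sidebar renders whatever has been contributed to that slot.
  renderSlot(slotName) {
    return (this.fills.get(slotName) || []).slice();
  }
}

const registry = new SlotFillRegistry();
// Hypothetical panel names, loosely based on the controls listed in the answer.
registry.registerFill('InspectorControls', 'Typography panel');
registry.registerFill('InspectorControls', 'Color panel');
registry.registerFill('InspectorAdvancedControls', 'Custom CSS field');

console.log(registry.renderSlot('InspectorControls'));
console.log(registry.renderSlot('InspectorAdvancedControls'));
```

The point of the pattern is that the code registering a fill and the code rendering the slot do not need to know about each other, which is why searching the Heading block source for the rendered panels comes up empty.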

Vis.js - set graph label's font as underline

I use vis.js to display a graph. I want to use markup on the node's label.
I'm using a node of type text.
What I did:
I set font option in the node option:
// in the option object
nodes: {
  type: 'text',
  font: {
    multi: 'html',
  }
}
And I added the <u> tag to my label
// in the option object or node data object
label: `<u>${YourLabel}</u>`
Result:
My label is displayed with the <u> tag on the graph. As mentioned in this post, this works for <b> and <i>.
Is <u> not supported?
According to issue 3119 in the old vis.js, only <b>, <i> and <code> are supported:
"With respect to HTML, the following is possible: Set option node.font.multi: html. This allows you to use the <b>, <i> and <code> tags within the label text for formatting."
In the current version of vis.js, it looks like that is still the case - from LabelSplitter.js:
// Hash of prepared regexp's for tags
var tagPattern = {
// HTML
'<b>': /<b>/,
'<i>': /<i>/,
'<code>': /<code>/,
'</b>': /<\/b>/,
'</i>': /<\/i>/,
'</code>': /<\/code>/,
// Markdown
'*': /\*/, // bold
'_': /\_/, // ital
'`': /`/, // mono
'afterBold': /[^\*]/,
'afterItal': /[^_]/,
'afterMono': /[^`]/,
};
Styling node content with SVG (where <u> can work) is shown here, but there are warnings about browser support, and this formats node content, not the node label.
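Since only the tags in the `tagPattern` table are recognised, it can help to check labels before handing them to the graph. The following is a small sketch, not part of vis.js: the helper name `findUnsupportedTags` and the sample labels are assumptions, and the check only compares tag names, so it is looser than the exact patterns quoted above.

```javascript
// Tag names that the quoted tagPattern table recognises in 'html' mode.
const SUPPORTED_TAGS = new Set(['b', 'i', 'code']);

// Return the distinct tag names found in `label` that are not in SUPPORTED_TAGS.
function findUnsupportedTags(label) {
  const unsupported = new Set();
  const tagRegex = /<\/?([A-Za-z][A-Za-z0-9]*)[^>]*>/g;
  for (const match of label.matchAll(tagRegex)) {
    const name = match[1]; // case is kept, because the quoted patterns are lowercase only
    if (!SUPPORTED_TAGS.has(name)) {
      unsupported.add(name);
    }
  }
  return Array.from(unsupported);
}

console.log(findUnsupportedTags('<u>Total</u> <b>42</b>'));
console.log(findUnsupportedTags('<b>ok</b> <i>ok</i>'));
```

A label for which this returns a non-empty list would show the listed tags as literal text on the graph, which matches the behaviour described in the question.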

Watir: selector for an a-href inside the first table cell

I have my css alias like:
module MyPage
  def locator(key, *options)
    hash = {
      "FIRST_TABLE_CELL_HREF" => [ :css => '#my-table td:nth-child(1):first a']
    }
  end
end
I want to click on that alias:
@page.find("FIRST_TABLE_CELL_HREF").when_present.right_click
Problem: that's a JavaScript-style alias, so it doesn't work.
Question: how do I write the same CSS alias in Ruby style?
P.S. $('#my-table td:nth-child(1):first a') works well in browser console.
For @TitusFortner: that's true when you want to select a specific element. But I'm using a business-level language (Gherkin in my case) and I want to write a universal instruction. It would look like: When I right click on the element "FIRST_TABLE_CELL_HREF". That instruction would be mapped to:
When(/^I right click(?: on|)(?: the|) "([^\"]*)"$/i) do |scope|
  @page.find(scope).when_present.right_click
end
Where @page = @browser.visit(SomePage), and in turn @browser = BrowserBase.new start_browser(ENV['BROWSER'])
With Watir you often don't even need to use CSS:
browser = Watir::Browser.new
First link in the table:
browser.table(id: 'my-table').link
Link in first data cell in table:
browser.table(id: 'my-table').td.link
If it has to be just css for some reason:
browser.link(css: '#my-table a')
Also if your table is within the context of an iframe, you have to explicitly declare it, because the driver can only see the top level browsing context unless specifically switched to. With Watir this would work:
browser.iframe(id: 'iframe_id').table(id: 'my-table').link
This would not work:
browser.link(css: '#iframe_id #my-table a')

How to set the default font in Google Closure Library rich text editor

Google Closure Library editor: demo, documentation.
The editable area is an iframe. How can I set the default font of the editable area? Now it is the default font of the browser. I prefer not to put a font tag around the content in the editable area**. That way, I can change the font of my website in the future, without the need to modify every HTML-content written in the editor.
** What I mean by that is something like this:
<font size="2" face="georgia, serif"><b>content</b></font>
I would prefer just this:
<b>content</b>
... and then style the editable area of the editor with the georgia font using CSS. That way, the HTML-content (produced by the editor) in my database wouldn't contain a hard-coded font, so I could change the font in the future.
Edit: maybe I should use a SeamlessField instead of a Field for the editable area?
Once you call makeEditable() on the goog.editor.Field, which creates the iFrame you referenced, the Field fires an event of type goog.editor.Field.EventType.LOAD. If you listen to that event, you can pull out the iFrame and toss in a link element to a CSS stylesheet so you can easily modify the content in your editor.
Here's the equivalent of one of my listeners that should get you on the right track. (I didn't check if the goog.editor.Field was the target of the event, but I assume it is).
some.namespace.Page.prototype.onEditorLoad_ = function(event) {
  var editor = /** @type {goog.editor.Field} */ (event.target);
  var iFrame = editor.getEditableIframe();
  if (iFrame) {
    var fieldDomHelper = editor.getEditableDomHelper();
    var documentNode =
        fieldDomHelper.getFrameContentDocument(iFrame).documentElement;
    var head = documentNode.getElementsByTagName(goog.dom.TagName.HEAD)[0];
    if (!head) {
      head = fieldDomHelper.createDom(goog.dom.TagName.HEAD);
      goog.dom.insertChildAt(documentNode, head, 0);
    }
    fieldDomHelper.appendChild(
        head,
        fieldDomHelper.createDom(
            goog.dom.TagName.LINK,
            { 'href': '/css/myCSS.css', 'rel': 'stylesheet', 'type': 'text/css' }
        )
    );
  }
};
Finally, in that CSS file, you can add whatever styling you want, such as your font change.
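For readers who want to see the core idea without the Closure helpers, here is a framework-free sketch of the same step using only standard DOM calls. The function name `injectStylesheet` is hypothetical, and the stylesheet path is the one from the listener above; how you obtain the iframe's document is left to your own setup.

```javascript
// Append a <link rel="stylesheet"> to the <head> of the given document,
// creating the <head> first if the document does not have one yet.
function injectStylesheet(doc, href) {
  let head = doc.head || doc.getElementsByTagName('head')[0];
  if (!head) {
    head = doc.createElement('head');
    doc.documentElement.insertBefore(head, doc.documentElement.firstChild);
  }
  const link = doc.createElement('link');
  link.rel = 'stylesheet';
  link.type = 'text/css';
  link.href = href;
  head.appendChild(link);
  return link;
}
```

In a browser this could be called as `injectStylesheet(iframe.contentDocument, '/css/myCSS.css')` once the editable iframe has loaded. A rule such as `body { font-family: georgia, serif; }` in that file would then set the default font without adding any font tags to the stored HTML.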
