(iMacros) How to scrape usernames this way? - web-scraping

I managed to make it work, but it also scrapes the full name shown inside the box.
All I want to scrape is the username "nekoakatsuki".
The code I use to scrape the username box is this:
TAG POS=1 TYPE=DIV ATTR=CLASS:infolist&&TXT:* EXTRACT=TXT
It scrapes everything inside the "infolist" div. Below is the markup it grabs, which also contains the name:
<div class="infolist">
<strong>nekoakatsuki</strong>
<br>
<span class="fullname">Jennifer Sandoval</span>
</div>
So how can I scrape only the username and not the name as well?
The website I'm using for this is http://web.stagram.com/tag/anime/?vm=grid

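Target the <strong> element that holds the username instead of the whole div (the "fullname" span holds the name you do not want):
TAG POS=1 TYPE=STRONG ATTR=TXT:* EXTRACT=TXT
Try this, and adjust POS if other <strong> elements come before it on the page.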
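If the whole div has to be extracted anyway, the username can also be separated from the markup afterwards. The following is only a minimal JavaScript sketch, assuming the markup always looks like the snippet in the question (username in the first <strong> element). The helper name extractUsername is made up for illustration, and a regular expression is only adequate for markup this simple.

```javascript
// Hypothetical helper: return the text of the first <strong> element
// found in an HTML fragment, or null if there is none.
function extractUsername(html) {
  const match = /<strong[^>]*>([\s\S]*?)<\/strong>/i.exec(html);
  return match ? match[1].trim() : null;
}

// The markup from the question, joined into one string.
const sample = [
  '<div class="infolist">',
  '<strong>nekoakatsuki</strong>',
  '<br>',
  '<span class="fullname">Jennifer Sandoval</span>',
  '</div>'
].join('\n');

console.log(extractUsername(sample)); // nekoakatsuki
```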

Related

How to build trigger for button in form within Google Tag Manager?

I have the following button, for which I want to create a Google Tag Manager trigger (but I seem to be unable to do so):
<div class="class-a class-b">
<form class="class-c" action="https://www.example.com/test" method="get" onclick="window.open(this.action); return false;">
<button type="submit">Open now</button>
</form>
</div>
Which type of trigger should I use (the auto-event variable does not work)?
How would I need to configure the trigger to track a button click?
What would I need to do in order to also catch the action value (i.e. the URL https://www.example.com/test)? Would I need JavaScript to bind to the form's submit event? If so, how?
Use the "All Elements" click trigger.
In the trigger, select "Some Clicks", then set the condition Click Element -> matches CSS selector -> button[type="submit"]
Optionally, you can add more conditions there.
Use this trigger in a new tag with the Universal Analytics tag type.
In the tag settings, change the track type from Pageview to Event. You now have three fields for the Category, Action and Label of your event.
Then set the Action to the URL: start typing {{ in that field and pick the variable that should populate it.
That should be it.
Update to address the actual HTML given:
Return this value from your Custom JavaScript (CJS) variable:
{{Click Element}}.parentElement.getAttribute("action");
It should work for the exact HTML that you've provided.
Then use this CJS variable in your tag and you should be good.
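A GTM Custom JavaScript variable is written as an anonymous function that returns a value, so the expression above would sit in its return statement. The sketch below shows the same lookup as a stand-alone function so it can be tried outside GTM. The name getFormAction and the fake objects are illustrative assumptions; inside GTM, the argument would be {{Click Element}}.

```javascript
// Stand-alone version of the lookup. In GTM the body would live in a
// Custom JavaScript variable and `clickElement` would be {{Click Element}}.
function getFormAction(clickElement) {
  var parent = clickElement && clickElement.parentElement;
  // Return undefined instead of throwing when there is no usable parent.
  if (!parent || typeof parent.getAttribute !== 'function') {
    return undefined;
  }
  return parent.getAttribute('action');
}

// Minimal stand-ins for the <form> and <button> from the question.
var fakeForm = {
  getAttribute: function (name) {
    return name === 'action' ? 'https://www.example.com/test' : null;
  }
};
var fakeButton = { parentElement: fakeForm };

console.log(getFormAction(fakeButton)); // https://www.example.com/test
```

Note that this relies on the button being a direct child of the form, as in the HTML given; a more deeply nested button would need a different lookup.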

How to extract data with iMacros from a website whose TAG POS=x of the same element is variable between different webpages?

I wish to extract data from a website that contains multiple webpages by searching the website for each keyword in a list defined in a .csv datasource.
iMacros should visit each individual page in turn, grab certain elements on it and save the data to a CSV file. The elements to be extracted are the same on all webpages.
My problem is that the TAG POS=x of an element does not stay the same from webpage to webpage.
E.g. on one page an HTML element is at TAG POS=95 TYPE=SPAN ATTR=* EXTRACT=TXT,
while on another page the same HTML element is at TAG POS=96 TYPE=SPAN ATTR=* EXTRACT=TXT
The only possibility I can think of is to pick the elements by their text.
Questions:
Does the TXT parameter, like TXT:Manufacturer (or possibly TXT:Manufacturer*), permit the selection without knowing the exact TAG POS=?
Is there another way to do this kind of extraction with iMacros (variable tag position for the same HTML element across pages)?
Thank you.
You can use a tag like the one below. It extracts the text of the span whose text starts with "Manufacturer", irrespective of its position.
TAG POS=* TYPE=SPAN ATTR=TXT:Manufacturer* EXTRACT=TXT
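To illustrate why matching on text is independent of position, here is a small JavaScript sketch of the idea. It is not how iMacros is implemented. The functions wildcardToRegExp and findByText and the sample page contents are assumptions made up for the example, with * treated as "any run of characters".

```javascript
// Turn a pattern with * wildcards into an anchored regular expression.
function wildcardToRegExp(pattern) {
  const escaped = pattern
    .split('*')
    .map(part => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('[\\s\\S]*');
  return new RegExp('^' + escaped + '$');
}

// Return the text of the first item matching the pattern, wherever it sits.
function findByText(texts, pattern) {
  const re = wildcardToRegExp(pattern);
  const hit = texts.find(t => re.test(t.trim()));
  return hit === undefined ? null : hit.trim();
}

// Made-up span texts: the "Manufacturer" span is at a different index on each page.
const pageA = ['Price: 10', 'Manufacturer: Acme', 'Stock: 3'];
const pageB = ['Price: 12', 'New!', 'Manufacturer: Globex'];

console.log(findByText(pageA, 'Manufacturer*')); // Manufacturer: Acme
console.log(findByText(pageB, 'Manufacturer*')); // Manufacturer: Globex
```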
(1) Generally speaking, that depends on the website being scraped. Nevertheless, you can try a command such as this:
TAG POS=1 TYPE=SPAN ATTR=TXT:Manufacturer* EXTRACT=TXT
(2) If you know the possible tag positions exactly, the following code may be helpful as well:
SET !ERRORIGNORE YES
SET !TIMEOUT_STEP 0
TAG POS=95 TYPE=SPAN ATTR=* EXTRACT=TXT
TAG POS=96 TYPE=SPAN ATTR=* EXTRACT=TXT
' other commands with extraction
SET !TIMEOUT_STEP 6
SET !ERRORIGNORE NO
SET !EXTRACT EVAL("'{{!EXTRACT}}'.replace(/\\[EXTRACT\\]|#EANF#/g, '').trim();")
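The last line of that macro does its cleanup with a JavaScript expression. The same expression is shown below as a plain function, on the assumption that iMacros joins several extracted values with [EXTRACT] and records #EANF# for a TAG that found nothing. The function name cleanExtract and the sample strings are made up for illustration.

```javascript
// Strip the [EXTRACT] separators and #EANF# placeholders, then trim,
// mirroring the expression inside EVAL(...) in the macro.
function cleanExtract(raw) {
  return raw.replace(/\[EXTRACT\]|#EANF#/g, '').trim();
}

// Assumed shapes of !EXTRACT when only one of the two positions matched.
console.log(cleanExtract('#EANF#[EXTRACT]Manufacturer: Acme')); // Manufacturer: Acme
console.log(cleanExtract('Manufacturer: Acme[EXTRACT]#EANF#')); // Manufacturer: Acme
```

Because the separators are removed rather than replaced, this only gives a clean value when exactly one of the candidate positions yields text on a given page.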

Accessing dynamically created iframe within imacros

I need help with implementing an iMacros script.
My basic script looks like this:
VERSION BUILD=8940826 RECORDER=FX
TAB T=1
URL GOTO=URL
TAG POS=1 TYPE=INPUT:TEXT ATTR=NAME:login-custnum CONTENT=12345
TAG POS=1 TYPE=INPUT:TEXT ATTR=NAME:login-username CONTENT=myuser
SET !ENCRYPTION NO
TAG POS=1 TYPE=INPUT:PASSWORD ATTR=NAME:login-password CONTENT=password
TAG POS=1 TYPE=BUTTON ATTR=NAME:login-login
This script works, the login is performed.
After this I need to use one of 3 iframes.
I cannot use
FRAME NAME="menu_iframe"
because the frames are created dynamically and NOT statically.
I tried the following:
var myframe = window.frames["menu_iframe"];
But this does not work.
After that I want to click a certain button in that iframe.
Thanks in advance
First of all, I suggest checking the frame names. Try the one-line macro below. It should show the names of all frames on the page in an alert dialog.
URL GOTO=javascript:{window.location.href='javascript:{var<SP>f=[];var<SP>l=window.frames.length;for(i=0;i<l;i++){try{f.push("\\""+window.frames[i].name+"\\"");}catch(e){f.push("no_frame_name")}}alert("FOUND<SP>"+f.length+"<SP>FRAMES:"+"\\n\\n"+f.join("\\n"));}';undefined;}
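The one-liner is hard to read because it has to be escaped for the URL GOTO command. Unpacked, its logic is roughly the following. This is a sketch with a fake window object so that it runs outside a browser; listFrameNames and the fake frames are illustrative assumptions, and in a real page the argument would be window and the result would be passed to alert.

```javascript
// Collect the name of every frame, guarding against frames whose
// properties cannot be read (the try/catch in the macro does the same).
function listFrameNames(win) {
  var names = [];
  for (var i = 0; i < win.frames.length; i++) {
    try {
      names.push('"' + win.frames[i].name + '"');
    } catch (e) {
      names.push('no_frame_name');
    }
  }
  return 'FOUND ' + names.length + ' FRAMES:\n\n' + names.join('\n');
}

// Fake frame whose name cannot be read, to exercise the catch branch.
var blockedFrame = {};
Object.defineProperty(blockedFrame, 'name', {
  get: function () { throw new Error('access denied'); }
});

var fakeWindow = { frames: [{ name: 'menu_iframe' }, { name: '' }, blockedFrame] };
console.log(listFrameNames(fakeWindow));
```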

Data extraction using imacros

I need an iMacros script to extract all data from this website:
http://www.gibsondunn.com/Search/Pages/LawyersSearch.aspx?k=('Last Name'~A*).
Currently I click each alphabet link manually, count the results and enter that count as the max loop value before playing the loop, but doing it that way is really time-consuming. I searched Google and this site without luck. I hope someone can help me with this.
Here is the script I created by recording with iMacros.
VERSION BUILD=8871104 RECORDER=FX
SET !TIMEOUT_PAGE 20
SET !EXTRACT_TEST_POPUP NO
SET !ERRORIGNORE YES
TAB T=1
TAG POS={{!LOOP}} TYPE=A ATTR=HREF:/lawyers/* EXTRACT=HREF
TAB OPEN NEW
TAB T=2
URL GOTO={{!EXTRACT}}
WAIT SECONDS=2
'data text
SET !EXTRACT {{!URLCURRENT}}
TAG POS=1 TYPE=H1 ATTR=CLASS:gd_title EXTRACT=TXT
TAG POS=4 TYPE=SPAN ATTR=* EXTRACT=TXT
TAG POS=13 TYPE=DIV ATTR=* EXTRACT=TXT
TAG POS=15 TYPE=DIV ATTR=* EXTRACT=TXT
TAG POS=19 TYPE=DIV ATTR=* EXTRACT=TXT
TAG POS=20 TYPE=DIV ATTR=* EXTRACT=TXT
TAG POS=21 TYPE=DIV ATTR=* EXTRACT=TXT
TAG POS=22 TYPE=DIV ATTR=* EXTRACT=TXT
TAG POS=23 TYPE=DIV ATTR=* EXTRACT=TXT
SAVEAS TYPE=EXTRACT FOLDER=* FILE=gibsondunn.csv
TAB CLOSE
TAB T=1
I really appreciate your help.
Without using the JavaScript Scripting Interface, the simplest way to ease your task is to stop counting by hand the number of lines you have to extract. The code below prompts you with that number. Save this macro in a separate .iim file and, of course, enter the obtained value manually as the max loop value before running your own macro.
SET !EXTRACT_TEST_POPUP NO
URL GOTO=javascript:{var<SP>els=window.document.getElementsByTagName("td");var<SP>n=0;for(i=0;i<els.length;i++){if(els[i].className=="gd_nameColumnData")n++}n;}
URL GOTO=javascript:{window.history.back()}
TAG POS=1 TYPE=HTML ATTR=* EXTRACT=TXT
BACK
PROMPT {{!EXTRACT}}
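The counting step inside the javascript: URL can be read more easily when written out as a function. The sketch below uses plain objects in place of the td elements so that it runs outside a browser. countByClass and the class name gd_otherColumn are made up for illustration; in the page itself the list would come from document.getElementsByTagName("td").

```javascript
// Count the elements whose className is exactly the given class,
// as the loop in the macro does for "gd_nameColumnData".
function countByClass(elements, className) {
  var n = 0;
  for (var i = 0; i < elements.length; i++) {
    if (elements[i].className === className) {
      n++;
    }
  }
  return n;
}

// Stand-ins for table cells; only two carry the name-column class.
var fakeCells = [
  { className: 'gd_nameColumnData' },
  { className: 'gd_otherColumn' },
  { className: 'gd_nameColumnData' }
];

console.log(countByClass(fakeCells, 'gd_nameColumnData')); // 2
```

The comparison is an exact match on className, so a cell carrying additional classes would not be counted.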

How can I make a div the target of an ASP.NET HyperLink control?

I have a page with many links.
When any link is clicked, the page corresponding to that hyperlink should open within a specified area on the same page, that is, the "div content".
Yes, I can use an iframe,
but can I make a div the target of the hyperlink?
Please, no JavaScript; only HTML and ASP.NET.
You can use an anchor and point its target at your iframe:
<a href="http://www.google.com" target="thisframe">Google</a>
<iframe name="thisframe" src="http://www.yahoo.com"></iframe>
As far as I know, you can never target a div unless you use JavaScript to manipulate the DOM. But if you really had to use the HyperLink control, then you can use the following code instead of the anchor:
<asp:HyperLink ID="uxHyperLink" runat="server" Target="thisframe" NavigateUrl="http://www.google.com">Google</asp:HyperLink>
<iframe name="thisframe"></iframe>
You can use HTML bookmarks. For an example, click on this link:
http://www.matlus.com/quartz-for-aspnet/#videos
It will not only take you to a page but also scroll your browser down to the "videos" section on the page.
In the URL (the link) notice the /#videos. That is the bookmark. On the target page in question I have an anchor tag with its name attribute set to "videos":
[a name="videos"][/a]
I'm using square brackets above because the editor does not allow anchor tags. So basically, just above your "target" div, place an anchor tag and set its "name" attribute. Then, in your link, simply append /#anchorname to the end of the URL.
