iMacros: go to the next pages of results to extract data? - web-scraping

I want to extract data from a LinkedIn results page. I need to go through the next pages one by one until the last page of results.
I have read the docs:
http://wiki.imacros.net/FAQ#How_do_I_loop_through_multiple_pages_of_results.3F
http://wiki.imacros.net/FAQ#Q:_How_to_create_nested_loops.3F
But they are too complicated for me.
So I searched for similar problems with LinkedIn, but found nothing that matches my case. Every problem is specific to its own script.
So I am asking here whether an iMacros expert could tell me which lines of code to add to my script to go to the next page of results.
Here is my current script:
VERSION BUILD=844 RECORDER=CR
URL GOTO=https://www.linkedin.com/vsearch/p?keywords=gestionnaire%20de%20patrimoine&trk=tyah&trkInfo=clickedVertical%3Aautocomplete,clickedEntityId%3A1,idx%3A1-1-1,tarId%3A1466679830045,tas%3Agestionaire%20de%20patrimoine&rsid=1951573471466679858087&openFacets=N,G,CC&orig=FCTD&f_G=fr%3A0&page_num=5&pt=people
TAG POS=1 TYPE=A ATTR=TXT:Suivant<SP>>
TAG POS=1 TYPE=I ATTR=CLASS:fa<SP>fa-square&&TXT:
TAG POS=1 TYPE=BUTTON ATTR=TXT:Find<SP>email<SP>addresses<SP>&<SP>save<SP>leads
TAG POS=1 TYPE=A ATTR=TXT:Suivant<SP>>
Can anyone help me, please?

I think you can go to the next page of results with code like this:
SET !LOOP 1
URL GOTO=https://www.linkedin.com/vsearch/p?keywords=gestionnaire%20de%20patrimoine&trk=tyah&trkInfo=clickedVertical%3Aautocomplete,clickedEntityId%3A1,idx%3A1-1-1,tarId%3A1466679830045,tas%3Agestionaire%20de%20patrimoine&rsid=1951573471466679858087&openFacets=N,G,CC&orig=FCTD&f_G=fr%3A0&page_num={{!LOOP}}&pt=people
Play the macro in loop mode so that {{!LOOP}}, and with it page_num, increases by one on each pass, and check that it works.

Related

Absolute external links bug, url of the current page added at the beginning of the url

I am seeing very strange behavior with external links on this page:
https://dev.switchonpaper.site/en/daniel-g-andujar-the-artist-as-a-thinker-and-augur-of-what-happens/
There is a list of external links visible by clicking on "Go Deeper".
On some links, the address of the current page is added at the beginning of the external link.
E.g.: iSAM™ (1997)
E.g.: TTTP Photo Collection - 1997
All external links are absolute links.
When you look at the source code, the links are correct.
This site runs on WordPress; the links are contained in a Gutenberg block built with the ACF plugin.
I tested the following things:
Disable all plugins.
The browser or something else continues to add the current page address on some links only.
I emptied the server cache and removed all the .htaccess rules except the WordPress part.
I made sure that the PHP file that writes these links is in UTF-8.
When I recreate the links, it is always the same ones that are affected.
Does anyone have any idea what could cause this?
Thank you for your time and help!
You have so-called "hidden characters" before the start of your link. I suggest you check this yourself with an online tool such as https://www.soscisurvey.de/tools/view-chars.php. If you paste the link copied from your source code there, you will see that there are hidden characters before "https:...".
The solution is to delete the whole link and type it again by hand, without copying and pasting it from another source. Alternatively, paste it into a plain-text (non-HTML) editor first, and copy it from there into your website.
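The same check can be done locally. Below is a minimal JavaScript sketch; `findHiddenChars` is a hypothetical helper name, and the zero-width space (U+200B) in the sample is only an assumption, since the actual hidden character on the asker's page is not known. The second part illustrates why such a character produces the symptom: a link that does not begin with a valid scheme is treated as relative and resolved against the current page.

```javascript
// Hypothetical helper: report every character that is not printable ASCII,
// with its position in the string and its Unicode code point.
function findHiddenChars(text) {
  const hits = [];
  let index = 0;
  for (const ch of text) {               // iterates by code point
    const code = ch.codePointAt(0);
    if (code < 0x20 || code > 0x7e) {
      const hex = code.toString(16).toUpperCase().padStart(4, '0');
      hits.push({ index, codePoint: 'U+' + hex });
    }
    index += ch.length;
  }
  return hits;
}

// Assumed sample: a zero-width space pasted in front of an absolute URL.
const pastedHref = '\u200Bhttps://example.com/page';
console.log(findHiddenChars(pastedHref));

// Because the string no longer starts with "https:", the URL parser treats it
// as a relative reference and resolves it against the (hypothetical) page URL.
const resolved = new URL(pastedHref, 'https://site.example/en/some-post/');
console.log(resolved.hostname);          // host of the current page, not example.com
```

Note that non-ASCII characters are not always a problem in a link; the sketch simply flags them so that anything unexpected at the very start of the href stands out.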

Mendix iFrame - Page not found

I have created multiple custom .html pages and placed them in /themes in my project. To use these custom pages in my project, I am using the iFrame widget, which is placed in a dataview; all the settings are correct, as I have used this widget before. When I navigate to the page containing the iFrame widget, it works fine the first time and displays the page correctly. After that first time, however, no link to any of the pages works, and I get a "Page Not Found" error. Can someone point out what I am doing wrong here, or guide me to a different/better solution to achieve this?
Note: I have also tried the same with an iframe tag inside an HTML Snippet, and the behaviour is exactly the same.
A common mistake is to use just the filename instead of the complete URL in the iframe's URL property. Try changing the filename to the complete URL, e.g. http://example.com/file.html instead of file.html.
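To see why a bare filename can work once and then fail, here is a small JavaScript sketch of how a relative reference is resolved against different base URLs. The function name `toAbsoluteUrl` and the paths are hypothetical; the real page paths of the Mendix app are not given in the question.

```javascript
// Hypothetical helper: turn a filename into a complete URL by resolving it
// against a known root, using the standard URL constructor.
function toAbsoluteUrl(fileName, rootUrl) {
  return new URL(fileName, rootUrl).href;
}

// Resolved against the site root, the file is found where it was placed.
const fromRoot = toAbsoluteUrl('file.html', 'http://example.com/');
console.log(fromRoot);     // http://example.com/file.html

// Resolved against a deeper (assumed) page path, the same filename points
// somewhere else, which is one way to end up with "Page Not Found".
const fromNested = toAbsoluteUrl('file.html', 'http://example.com/p/page');
console.log(fromNested);   // http://example.com/p/file.html
```

Setting the iframe URL property to the complete URL removes the dependency on whatever page the browser is currently on.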

stop page from jumping back to top when empty <a> tag is clicked

The title says it all: using only HTML, no JavaScript, how do I stop the page from jumping back to the top if a user clicks on an empty <a> tag? For example, if I have an empty link at the very bottom of my site and click on it, it takes me clear back up to the top...
A simple solution is to point the href at a fragment that does not match anything on the page:
<a href="#a">Title</a>
With this, the page won't scroll back to the top, as long as no element on the page has the id "a". To have it scroll back to the top, take out the a after the # symbol, so it looks like this:
<a href="#">Title</a>
That is the best explanation I can give without any code from you.
Give that a try; it should do what you are asking for, and no JavaScript is needed. You can even make #a jump to a different location on your page by giving an element there the id "a" :)
UPDATE:
This may suit you better! Add this either to a JS file or inline in your HTML document (it uses jQuery).
Separate JS file (just make sure to include it in your HTML file):
$('#Add_Your_Id_Or_Class_Here').removeAttr('href');
Example: $('#link a').removeAttr('href'); or $('.link a').removeAttr('href'); or even $('a').removeAttr('href');
If you want to do this inline in your HTML file instead, simply do this:
<script>
$('#Add_Your_Id_Or_Class_Here').removeAttr('href');
</script>
Again, you can use any of the examples above as well. In fact, there are many ways to achieve this, now that I think about it... Hope that helps :)
If your link isn't supposed to link anywhere (such as when it's just a placeholder for where a link could be), then you should not add an href attribute to the <a> element.

FancyBox not working in asp.net

I have downloaded fancybox-1.3.4 and tried using it with one of my pages (which has a master page with the same DOCTYPE as the index.html of fancybox-1.3.4). I copy-pasted the entire code (minus head, body, etc.), but it doesn't seem to work. However, if I copy-paste the entire code (including the DOCTYPE and everything else) into a new Default.aspx without a master page, it works perfectly.
Please help me out here.
I agree with the other posters; it's hard to know without seeing the code. But master pages do change the client IDs on the page. If you're calling fancybox by ID, you might want to try it with a CSS class instead.
i.e.
$(".fancy").fancybox();
...
<a class="fancy" href="...">click here</a>
instead of:
$("#my_link").fancybox();
...
<a id="my_link" href="...">click here</a>

How to disable header on frontpage of rst2pdf document?

I am generating some PDFs and I would like to disable the header on the front page. I know there are built-in templates in rst2pdf, and one of them is called coverPage, but I can't seem to get it to work.
The manual says you should use a
.. raw:: pdf

   PageBreak coverPage

statement, but that inserts an empty page before the cover page. So how can I have a cover page without a header, and without using the oddeven directive (I want to use the same header on all remaining pages)?
Thanks for your suggestions!
That's how you change the stylesheet after the cover page. You'll need to create a custom stylesheet that specifies the format of the first page, and then change the style for the rest of the document. Have a look at chapter 15 of the manual.
Note: the currently accepted answer contains a broken link (the linked website is gone).
The correct answer is simple:
1) In your style file, define:
pageSetup:
  firstTemplate: coverPage
2) Then, in your document, at the point where you want to start using the header/footer, add:
.. raw:: pdf

   PageBreak cutePage

Make sure the cutePage template has header/footer set to true.
