not able to lazy load in phantomjs - web-scraping

I'm trying to scrape some information from the link (http://www.myntra.com/women-sarees?nav_id=606) that involves lazy loading. Below is my code snippet for this:
window.setInterval(function() {
//var count returns the visibility of the div that checks for lazyloading
if((count == 'none')) { // more products to be loaded
page.evaluate(function() {
// Scrolls to the bottom of page
window.document.body.scrollTop = document.body.scrollHeight;
});
page.render('myn'+k+'.png');
}
else { // Found
//Do what you want
//console.log('len123');
}
}, 5000); // Number of ms to wait between scrolls
But only the first 6 rows get loaded. I don't understand where I'm going wrong.

Related

Fullcalendar using resources as a function with select menu

Using FullCalendar 4, I am trying to show/hide my resources using a select menu. When the user selects one of the providers from the menu, I want to show only that one resource's events.
Above my fullcalendar I have my select menu:
<select id="toggle_providers_calendar" class="form-control" >
<option value="1" selected>Screech Powers</option>
<option value="2">Slater</option>
</select>
I am gathering the resources I need using an ajax call on my included fullcalendar.php page. I am storing them in an object and then trying to control which resources are shown onscreen:
document.addEventListener('DOMContentLoaded', function() {
var resourceData = [];
$.getJSON('ajax_get_json.php?what=schedule_providers_at_location',
function(data) {
$.each(data, function(index) {
resourceData.push({
id: data[index].value,
title: data[index].text
});
});
console.log(resourceData);
});
//below, set the visible resources to whatever is selected in the menu
//using 1 in order for that to show at start
var visibleResourceIds = ["1"];
//below, get the selected id when the menu is changed and use that in the toggle resource function
$('#toggle_providers_calendar').change(function() {
toggleResource($('#toggle_providers_calendar').val());
});
var calendar_full = document.getElementById('calendar_full');
var calendar = new FullCalendar.Calendar(calendar_full, {
events: {
url: 'ajax_get_json.php?what=location_appointments'
},
height: 700,
resources: function(fetchInfo, successCallback, failureCallback) {
// below, I am trying to filter resources by whether their id is in visibleResourceIds.
var filteredResources = [];
filteredResources = resourceData.filter(function(x) {
return visibleResourceIds.indexOf(x.id) !== -1;
});
successCallback(filteredResources);
},
...
});
// below, my toggle_providers_calendar will trigger this function. Feed it resourceId.
function toggleResource(resourceId) {
var index = visibleResourceIds.indexOf(resourceId);
if (index !== -1) {
visibleResourceIds.splice(index, 1);
} else {
visibleResourceIds.push(resourceId);
}
calendar.refetchResources();
}
To make sure the getJSON is working, I have console.log(resourceData). The information in the console once it's gathered is:
[{id: '1', title: 'Screech Powers'}, {id: '2', title: 'Slater'}]
... the above are the correct resources that can be chosen/rendered. So that seems to be okay.
On page load, no resources show at all, when resource id of '1' (Screech Powers) should be shown per my code. Well, at least, that's what I am trying to do right now.
When the menu changes, resources will show/hide, but not based on what's selected; the logic of only showing what is selected in the menu doesn't seem to be working.
I used to use a URL request for my resources: 'ajax_get_json.php?what=schedule_providers_at_location', and it worked fine! All resources then show their events properly. I am just trying to modify it by using a menu to show/hide the resources as needed.
Here's what I'm doing to make it happen so far! In case someone comes across this post ever, this will help.
Here's my code before my fullcalendar code.
var resourceData = [];
var visibleResourceIds = [];
$.getJSON('ajax_get_json.php?what=schedule_providers_at_location',
function(data) {
$.each(data, function(index) {
resourceData.push({
id: data[index].value,
title: data[index].text
});
});
});
$('#toggle_providers_calendar').change(function() {
toggleResource($('#toggle_providers_calendar').val());
});
My select menu with id 'toggle_providers_calendar' is the same as my original post. My fullcalendar resources as a function is the same too.
After the calendar is rendered, here are the changes I made to my toggle resources function:
// menu button/dropdown will trigger this function. Feed it resourceId.
function toggleResource(resourceId) {
visibleResourceIds = [];
// "Show All" (empty value) or no argument on initial load: show every resource
if ((resourceId === '') || (resourceId === undefined)) {
$.each(resourceData, function(index, value) {
visibleResourceIds.push(value.id);
});
} else {
// otherwise show only the selected resource
visibleResourceIds.push(resourceId);
}
calendar.refetchResources();
}
This causes the resources to show and hide properly. If the user selects "Show All" that works too!
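The show/hide logic can also be checked on its own, away from jQuery and FullCalendar. The following is only a minimal sketch: visibleIdsFor and filterResources are hypothetical helper names, not part of the FullCalendar API, and the sample data is made up. It also coerces ids to strings, on the assumption that the select menu returns strings while the JSON feed might return numbers; indexOf uses strict equality, so '1' and 1 would otherwise not match.

```javascript
// Hypothetical helper: which resource ids should be visible for a menu value.
// An empty or missing value is treated as "Show All".
function visibleIdsFor(resourceData, selectedId) {
  if (selectedId === '' || selectedId === undefined || selectedId === null) {
    return resourceData.map(function (resource) {
      return String(resource.id);
    });
  }
  return [String(selectedId)];
}

// Keep only the resources whose id is in visibleIds.
// Ids are compared as strings because indexOf uses strict equality.
function filterResources(resourceData, visibleIds) {
  return resourceData.filter(function (resource) {
    return visibleIds.indexOf(String(resource.id)) !== -1;
  });
}

// Made-up sample data; note the second id is a number on purpose.
var sampleResources = [
  { id: '1', title: 'Screech Powers' },
  { id: 2, title: 'Slater' }
];

console.log(filterResources(sampleResources, visibleIdsFor(sampleResources, '2')));
console.log(filterResources(sampleResources, visibleIdsFor(sampleResources, '')).length);
```

Inside the calendar, the resources function would pass the result of filterResources to successCallback.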
In order to have a default resource show on load, I add this to my fullcalendar script:
loading: function(bool) {
if (bool) {
//insert code if still loading
$('.loader').show();
} else {
$('.loader').hide();
if (initial_load) {
initial_load = false;
//code here once done loading and initial_load = true
var default_resource_to_show = "<?php echo $default_provider; ?>";
if (default_resource_to_show) {
//set the menu to that provider and trigger the change event to call toggleResource()
$('#toggle_providers_calendar').val(default_resource_to_show).change();
} else {
//pass in nothing meaning 'select all' providers for scheduler to see
toggleResource();
}
}
}
},
I am using a boolean variable, initial_load, to tell whether the page has just loaded (as opposed to data being reloaded without a page refresh). initial_load = true is set outside of DOMContentLoaded:
<script>
//show selected date in title box
var initial_load = true;
document.addEventListener('DOMContentLoaded', function() {
My only current problem is that when toggleResource function is called, the all day vertical time block boundaries don't line up with the rest of the scheduler. Once I start navigating, they do, but I don't understand why it looks like this on initial load or when toggleResource() is called:
Any thoughts on how to correct the alignment of the allday vertical blocks?

Material Design Lite - Programmatically Open and Close Toast

I would like to open and close an MDL toast myself rather than use the timeout property as indicated in the MDL usage guide. The reason is that I want the toast to remain while geolocation is occurring, which sometimes takes 10+ seconds and other times happens in 1 second.
Any idea how this could be done?
A quick-and-dirty solution I found: invoke the cleanup_ method on the snackbar object.
With this solution I can show the snackbar, click the action handler to hide it, then re-trigger the action to show it again without any problem.
var snackbar = form.querySelector("[class*='snackbar']");
if (snackbar) {
var data = {
message: 'Wrong username or password',
timeout: 20000,
actionHandler: function(ev){
// snackbar.classList.remove("mdl-snackbar--active")
snackbar.MaterialSnackbar.cleanup_()
},
actionText: 'Ok'
};
snackbar.MaterialSnackbar.showSnackbar(data);
}
As cleanup_ is not part of the public API, I guess it is worth wrapping this in a small check to avoid a disaster.
if (snackbar.MaterialSnackbar.cleanup_) {
snackbar.MaterialSnackbar.cleanup_();
} else {
snackbar.classList.remove("mdl-snackbar--active");
}
Got it working as so: I basically set a 30 second timeout on the toast assuming my geolocation and georesults (GeoFire) will take no more than 30 seconds.
I get the length of the returned array of map markers and multiply that by the javascript timeout events. I finally remove mdl-snackbar--active which hides the toast. So, basically - it works.
UPDATED
The above actually had a major problem: additional toasts would not display until that long timeout completed. I could not figure out how to apply clearTimeout() to fix it, so I found a solution that works: raise and lower the toast by toggling the mdl-snackbar--active class, with no timer setting necessary.
To show a toast as normal with this code, simply call tools.toast('hello world', error, 3000). To programmatically open and close the toast, call tools.toastUp('hey') and tools.toastDown(), respectively. So you might call tools.toastDown() after a promise resolves, for example.
var config = (function() {
return {
timeout: 50, //in milliseconds
radius: 96, //in kilometers
};
})();
var tools = (function() {
return {
toast: function(msg,obj,timeout){
var snackbarContainer = document.querySelector('#toast'); //toast div
if(!obj){obj = ''}
if(!timeout){timeout = 2750}
var data = {
message: msg + obj,
timeout: timeout
};
snackbarContainer.MaterialSnackbar.showSnackbar(data);
},
toastUp: function(msg){
var toast = document.querySelector('#toast');
var snackbarText = document.querySelector('.mdl-snackbar__text');
snackbarText.innerHTML = msg;
toast.classList.add("mdl-snackbar--active");
},
toastDown: function(count) {
setTimeout(function () {
var toast = document.getElementById("toast");
toast.classList.remove("mdl-snackbar--active");
}, config.timeout * count);
},
};
})();
In case you want to fire tools.toastDown after a timeout loop, you can do:
function drop(filteredMeetings) {
tools.clearMarkers(true);
for (var i = 0; i < filteredMeetings.length; i++) {
//drop toast once markers all dropped
if(i === filteredMeetings.length - 1) {
tools.toastDown(i);
}
tools.addMarkerWithTimeout(filteredMeetings[i], i * config.timeout);
}
}
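For the "call tools.toastDown after a promise resolves" case mentioned above, a wrapper along these lines might do. This is a sketch under assumptions: withToast is a hypothetical name, startWork stands for whatever function starts the geolocation lookup and returns a promise, and fakeTools only stands in for the real tools object so the example runs without a DOM.

```javascript
// Hypothetical wrapper: raise the toast, run the work, lower the toast
// whether the work succeeds or fails.
function withToast(tools, message, startWork) {
  tools.toastUp(message);
  return startWork().then(
    function (value) {
      tools.toastDown(0); // 0 means no extra delay (config.timeout * 0)
      return value;
    },
    function (error) {
      tools.toastDown(0);
      throw error; // keep the failure visible to the caller
    }
  );
}

// Stand-in for the real `tools` object; it only records the calls.
var calls = [];
var fakeTools = {
  toastUp: function (msg) { calls.push('up:' + msg); },
  toastDown: function (count) { calls.push('down:' + count); }
};

var pending = withToast(fakeTools, 'Locating...', function () {
  return Promise.resolve({ lat: 0, lng: 0 }); // made-up result
});

pending.then(function () {
  console.log(calls);
});
```

With the real tools object the call would be withToast(tools, 'Locating...', startGeolocation), where startGeolocation is whatever promise-returning function the page already has.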

Open multiple links in casperjs

I am trying to scrape all links of a particular kind (boxscore links) from this website http://www.basketball-reference.com/teams/GSW/2016_games.html and then visit them one by one, scraping some information from every visited link. To start with, I want to scrape all the links, visit them one by one and get the title of each page. The problem is that it always prints the same title and the same current URL (the initial URL) even though it clearly should be a new one. It seems to me that there is a problem with the 'this' keyword...
(Don't look at limit of links, I took the code from sample on github of casperjs and I left it for console not to be overloaded.)
This is my code:
var casper = require("casper").create({
verbose: true
});
// The base links array
var links = [ "http://www.basketball-reference.com/teams/GSW/2016_games.html" ];
// If we don't set a limit, it could go on forever
var upTo = ~~casper.cli.get(0) || 10;
var currentLink = 0;
// Get the links, and add them to the links array
function addLinks(link) {
this.then(function() {
var found = this.evaluate(searchLinks);
this.echo(found.length + " links found on " + link);
links = links.concat(found);
});
}
// Fetch all <a> elements from the page and return
// the ones which contains a href starting with 'http://'
function searchLinks() {
var links = document.querySelectorAll('#teams_games td:nth-child(5) a');
return Array.prototype.map.call(links, function(e) {
return e.getAttribute('href');
});
}
// Just opens the page and prints the title
function start(link) {
this.start(link, function() {
this.wait(5000, function() {
this.echo('Page title: ' + this.getTitle());
this.echo('Current url: ' + this.getCurrentUrl());
});
});
}
// As long as it has a next link, and is under the maximum limit, will keep running
function check() {
if (links[currentLink] && currentLink < upTo) {
this.echo('--- Link ' + currentLink + ' ---');
start.call(this, links[currentLink]);
addLinks.call(this, links[currentLink]);
currentLink++;
this.run(check);
} else {
this.echo("All done.");
this.exit();
}
}
casper.start().then(function() {
this.echo("Starting");
});
casper.run(check);
Considering an array of URLs, you can iterate over them, visiting each in succession with something like the following:
casper.each(urls, function(self, url) {
self.thenOpen(url, function(){
this.echo('Opening: ' + url);
// Do Whatever
});
});
Obviously this will not find links on a page, but it is a nice way to go over a known set of URLs.

Jasmine - Testing links via Webdriver I/O

I have been working on an end-to-end test using Webdriver I/O from Jasmine. One specific scenario has been giving me significant challenges.
I have a page with 5 links on it. The number of links actually changes, as the page is dynamic. I want to test the links to see whether each link's title matches the title of the page that it links to. Because the links are dynamically generated, I cannot just hard-code tests for each link. So, I'm trying the following:
it('should match link titles to page titles', function(done) {
client = webdriverio.remote(settings.capabilities).init()
.url('http://www.example.com')
.elements('a').then(function(links) {
var mappings = [];
// For every link store the link title and corresponding page title
var results = [];
for (var i=0; i<links.value.length; i++) {
mappings.push({ linkTitle: links.value[0].title, pageTitle: '' });
results.push(client.click(links.value[i])
.getTitle().then(function(title, i) {
mappings[i].pageTitle = title;
})
);
}
// Once all promises have resolved, compared each link title to each corresponding page title
Promise.all(results).then(function() {
for (var i=0; i<mappings.length; i++) {
var mapping = mappings[i];
expect(mapping.linkTitle).toBe(mapping.pageTitle);
}
done();
});
});
;
});
I'm unable to even confirm whether I'm getting the link title properly. I believe there is something I entirely misunderstand. I am not even getting each link's title property. I'm definitely not getting the corresponding page title. I think I'm lost in closure world here. Yet, I'm not sure.
UPDATE - NOV 24
I still have not figured this out. However, I believe it has something to do with the fact that Webdriver I/O uses the Q promise library. I came to this conclusion because the following test works:
it('should match link titles to page titles', function(done) {
var promise = new Promise(function(resolve, reject) {
setTimeout(function() { resolve(); }, 1000);
});
promise.then(function() {
var promises = [];
for (var i=0; i<3; i++) {
promises.push(
new Promise(function(resolve, reject) {
setTimeout(function() {
resolve();
}, 500);
})
);
}
Promise.all(promises).then(function() {
expect(true).toBe(true)
done();
});
});
});
However, the following does NOT work:
it('should match link titles to page titles', function(done) {
client = webdriverio.remote(settings.capabilities).init()
.url('http://www.example.com')
.elements('a').then(function(links) {
var mappings = [];
// For every link store the link title and corresponding page title
var results = [];
for (var i=0; i<links.value.length; i++) {
mappings.push({ linkTitle: links.value[0].title, pageTitle: '' });
results.push(client.click(links.value[i])
.getTitle().then(function(title, i) {
mappings[i].pageTitle = title;
})
);
}
// Once all promises have resolved, compared each link title to each corresponding page title
Q.all(results).then(function() {
for (var i=0; i<mappings.length; i++) {
var mapping = mappings[i];
expect(mapping.linkTitle).toBe(mapping.pageTitle);
}
done();
});
})
;
});
I'm not getting any exceptions. Yet, the code inside of Q.all does not seem to get executed. I'm not sure what to do here.
Reading the WebdriverIO manual, I feel like there are a few things wrong in your approach:
elements('a') returns WebElement JSON objects (https://code.google.com/p/selenium/wiki/JsonWireProtocol#WebElement_JSON_Object) NOT WebElements, so there is no title property thus linkTitle will always be undefined - http://webdriver.io/api/protocol/elements.html
Also, because it's a WebElement JSON object, you cannot use it as input to client.click(..), which expects a selector string, not an object - http://webdriver.io/api/action/click.html. To click a WebElement JSON object, use client.elementIdClick(ID) instead, which takes the ELEMENT property value of the WebElement JSON object.
When client.elementIdClick is executed, the client navigates to the linked page. Calling client.elementIdClick with the next ID in the following loop iteration will then fail, because that element no longer exists once you have moved away from the page. The error will read something like invalid element cache.....
So, I propose another solution for your task:
Find all elements as you did using elements('a')
Read href and title using client.elementIdAttribute(ID) for each of the elements and store in an object
Go through all of the objects, navigate to each of the href-s using client.url('href'), get the title of the page using .getTitle and compare it with the object.title.
The source I experimented with, not run by Jasmine, but should give an idea:
var client = webdriverio
.remote(options)
.init();
client
.url('https://www.google.com')
.elements('a')
.then(function (elements) {
var promises = [];
elements.value.forEach(function (element) {
// forEach gives each callback its own elementId; a var declared
// inside a for loop would be shared by all the deferred callbacks
var elementId = element.ELEMENT;
promises.push(
client
.elementIdAttribute(elementId, 'href')
.then(function (attributeRes) {
return client
.elementIdAttribute(elementId, 'title')
.then(function (titleRes) {
return {href: attributeRes.value, title: titleRes.value};
});
})
);
});
return Q
.all(promises)
.then(function (results) {
console.log(arguments);
var promises = [];
results.forEach(function (result) {
promises.push(
client
.url(result.href)
.getTitle()
.then(function (title) {
console.log('Title of ', result.href, 'is', title, 'but expected', result.title);
})
);
});
return Q.all(promises);
});
})
.then(function () {
client.end();
});
NOTE:
This does not solve your problem when the links trigger navigation through JavaScript event handlers rather than href attributes.
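On the "lost in closure world" part of the question: callbacks created inside a for loop that uses var all share the same loop variable, so by the time a deferred callback runs it sees the final value. The small illustration below uses plain JavaScript with no WebdriverIO involved; collectWithVar and collectWithForEach are hypothetical names used only for this demonstration.

```javascript
// Callbacks created in a `var` loop all read the same variable i,
// which has already reached its final value when they run.
function collectWithVar(n) {
  var callbacks = [];
  for (var i = 0; i < n; i++) {
    callbacks.push(function () { return i; });
  }
  return callbacks.map(function (cb) { return cb(); });
}

// forEach calls its function once per item, so each callback
// closes over its own `index`.
function collectWithForEach(items) {
  var callbacks = [];
  items.forEach(function (item, index) {
    callbacks.push(function () { return index; });
  });
  return callbacks.map(function (cb) { return cb(); });
}

console.log(collectWithVar(3));                 // [ 3, 3, 3 ]
console.log(collectWithForEach(['a', 'b', 'c'])); // [ 0, 1, 2 ]
```

Note also that a then callback receives only the resolved value, so a second parameter named i in function(title, i) would be undefined rather than the loop index.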

How to return number of items in collection?

I'm new to Meteor and I want to create a slideshow with items from a collection, in this case simple words. The slideshow should be controlled by back and forward buttons and replace the current word.
In JavaScript/jQuery I would create an array of objects and a control index, with limits via if-statements, so the index never can drop below zero or overflow the length of the array.
See fiddle for working example:
http://jsfiddle.net/j0pqd26w/8/
$(document).ready(function() {
var wordArray = ["hello", "yes", "no", "maybe"];
var arrayIndex = 0;
$('#word').html(wordArray[arrayIndex]);
$("#previous").click(function(){
if (arrayIndex > 0) {
arrayIndex -= 1;
}
$('#word').html(wordArray[arrayIndex]);
});
$("#next").click(function(){
if (arrayIndex < wordArray.length) {
arrayIndex += 1;
}
$('#word').html(wordArray[arrayIndex]);
});
});
Meteor
I'm curious how to implement this according to best practice in Meteor and abide by the reactive pattern, as I'm still trying to wrap my head around this interesting framework. My first hurdle is to translate the
if (arrayIndex < wordArray.length)
// to
if (Session.get("wordIndex") < ( (((length of collection))) )
According to the docs I should do a find on the collection, but I have only managed to return an empty array later with fetch. Sorry if this got long, but any input would be appreciated to help me figure this out.
collection.find([selector], [options])
cursor.fetch()
This is the code I have so far:
Words = new Mongo.Collection("words");
if (Meteor.isClient) {
// word index starts at 0
Session.setDefault("wordIndex", 0);
Template.body.helpers({
words: function () {
return Words.find({});
},
wordIndex: function () {
return Session.get("wordIndex");
}
});
Template.body.events({
"submit .new-word": function (event) {
// This function is called when the word form is submitted
var text = event.target.text.value;
Words.insert({
text: text,
createdAt: new Date() //current time
});
// Clear form
event.target.text.value = "";
// Prevent default form submit
return false;
},
'click #previous': function () {
// decrement the word index when button is clicked
if (Session.get("wordIndex") > 0) {
Session.set("wordIndex", Session.get("wordIndex") - 1);
}
},
'click #next': function () {
// increment the word index when button is clicked
if (Session.get("wordIndex") < 10 ) {
Session.set("wordIndex", Session.get("wordIndex") + 1);
}
}
});
}
if (Meteor.isServer) {
Meteor.startup(function () {
});
}
.count() returns the number of documents matched by a cursor; called on an unfiltered find(), that is the number of documents in the collection. With the collection from the question:
`Words.find().count()`
There are also collection helpers, which work similarly to other helpers (e.g. template helpers). A more elaborate explanation is covered here: https://medium.com/space-camp/meteor-doesnt-need-an-orm-2ed0edc51bc5
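The bounds check itself can be kept as two small pure functions and fed with the count. This is only a sketch: nextIndex and previousIndex are hypothetical names, and in the Meteor event handlers the count argument would presumably come from the cursor count on the Words collection. The last valid index is count - 1, so comparing against the bare length (as the jQuery version does) would let the index step one past the end.

```javascript
// Hypothetical pure helpers for the slideshow index.
// `count` is the number of words; the last valid index is count - 1.
function nextIndex(current, count) {
  return current < count - 1 ? current + 1 : current;
}

function previousIndex(current) {
  return current > 0 ? current - 1 : current;
}

console.log(nextIndex(0, 4));     // 1
console.log(nextIndex(3, 4));     // 3 (already at the last word)
console.log(previousIndex(0));    // 0 (already at the first word)
```

In the 'click #next' handler this would become a single Session.set("wordIndex", nextIndex(Session.get("wordIndex"), count)) call in place of the hard-coded limit of 10.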
