I think my question is actually very simple, and I thought the answer was hidden somewhere in some related questions. But I couldn't get my code to work, so here is my problem:
I need to read a file that changes at irregular intervals and do some really nice stuff with the data in it. So far, I've been doing this "locally" (the file will be on a server later) and it works just fine. To test whether the data was being read correctly, I changed the file myself and just hit F5 in the browser to get the "new page". All this is fine!
The thing is, I need the webpage to reload itself only when the file has changed. So I read the file and check whether update != lastUpdate to decide whether to reload the page. The problem is that it doesn't matter whether the condition is true or false: the page always reloads! Not cool! This is one of the approaches I've tried so far:
setInterval(function() {
$.getJSON('object.json', function(data) {
if ( data.update != lastUpdate ){
lastUpdate = data.update;
window.location.reload();
}
});
}, 2000);
This function checks every 2 seconds whether the file has changed and, if so, reloads the page. But it reloads every 2 seconds instead of each time the file is changed... Could anyone tell me what I am doing wrong?
Thanks and regards,
Julls
Are you missing a closing curly brace?
setInterval(function() {
$.getJSON('object.json', function(data) {
if ( data.update != lastUpdate ){
lastUpdate = data.update;
window.location.reload();
}
});
}, 2000);
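One possible cause, assuming lastUpdate starts out unset on every page load (the question does not show where it is initialised): the first poll after each reload always sees a "different" value and reloads again. A minimal sketch of a guard against that, where createChangeDetector and hasChanged are made-up names and the first poll is treated as a baseline instead of a change:

```javascript
// Hypothetical helper: remembers the last seen stamp and reports real changes only.
function createChangeDetector() {
  var lastUpdate; // undefined until the first poll after a page load

  return function hasChanged(update) {
    if (lastUpdate === undefined) {
      lastUpdate = update; // first poll: record the baseline, do not reload
      return false;
    }
    if (update !== lastUpdate) {
      lastUpdate = update; // remember the new stamp
      return true;
    }
    return false;
  };
}

var hasChanged = createChangeDetector();
```

Inside the $.getJSON callback this would be used as `if (hasChanged(data.update)) { window.location.reload(); }`.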
There are sites whose DOM and contents are generated dynamically when the page loads. (Angularjs-based sites are notorious for this)
What approach do you use?
I tried both phantomjs and jsdom, but it seems I am unable to get the page to execute its JavaScript before I scrape.
Here's a simple jsdom example (not angularjs-based but still dynamically generated)
var env = require('jsdom').env;
exports.scrape = function(link, callback) {
var config = {
url: link,
headers: {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.110 Safari/537.36'
},
done: jsdomDone
};
env(config);
}
function jsdomDone(err, window) {
var info = null;
if(err) {
console.error(err);
} else {
var $ = require('jquery')(window);
console.log($('.profilePic').attr('src'));
}
}
exports.scrape('https://www.facebook.com/elcompanies');
I tried phantomjs with moderate success.
var page = new WebPage()
var fs = require('fs');
page.onLoadFinished = function() {
console.log("page load finished");
window.setTimeout(function() {
page.render('export.png');
fs.write('1.html', page.content, 'w');
phantom.exit();
}, 10000);
};
page.open("https://www.facebook.com/elcompanies", function() {
page.evaluate(function() {
});
});
Here I wait for the onLoadFinished event and even put a 10-second timer. The interesting thing is that while my export.png image capture of the page shows a fully rendered page, my 1.html doesn't show the .profilePic class element in its rightful place. It seems to be sitting in some javascript code, surrounded by some kind of "require("TimeSlice").guard(function() {bigPipe.onPageletArrive({..." block
If you can provide me a working example that scrapes the image off this page, that'd be helpful.
I've done some scraping on Facebook using nightmarejs.
Here is some code I wrote to get content from the posts of a Facebook page.
module.exports = function checkFacebook(callback) {
var nightmare = Nightmare();
Promise.resolve(nightmare
.viewport(1000, 1000)
.goto('https://www.facebook.com/login/')
.wait(2000)
.evaluate(function(){
document.querySelector('input[id="email"]').value = facebookEmail
document.querySelector('input[id="pass"]').value = facebookPwd
return true
})
.click('#loginbutton input')
.wait(1000)
.goto('https://www.facebook.com/groups/bierconomia')
.evaluate(function(){
var posts = document.getElementsByClassName('_1dwg')
var length = posts.length
var postsContent = []
for(var i = 0; i < length; i++){
var pTag = posts[i].getElementsByTagName('p')
postsContent.push({
content: pTag[0] ? pTag[0].innerText : '',
productLink: posts[i].querySelector('a[rel = "nofollow"]') ? posts[i].querySelector('a[rel = "nofollow"]').href : '',
photo: posts[i].getElementsByClassName('_46-i img')[0] ? posts[i].getElementsByClassName('_46-i img')[0].src : ''
})
}
return postsContent
}))
.then(function(results){
log(results)
return new Promise(function(resolve, reject) {
var leanLinks = results.map(function(result){
return {
post: {
content: result.content,
productLink: extractLinkFromFb(result.productLink),
photo: result.photo
}
}
})
resolve(leanLinks)
})
})
}
The thing that I find useful with nightmare is that you can use the wait function to either wait for X ms or for a specific class to render.
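The same idea (poll for a condition instead of sleeping for a fixed time) can be sketched without any scraping library. This is only an illustration: waitFor, its options and the schedule hook are hypothetical names, not part of nightmare or phantomjs.

```javascript
// Calls check() repeatedly until it returns something truthy or the retries run out.
// onDone follows the usual (err, value) callback convention.
function waitFor(check, onDone, options) {
  var opts = options || {};
  var interval = opts.interval || 250;                           // ms between attempts
  var retries = opts.retries === undefined ? 40 : opts.retries;  // extra attempts allowed
  var schedule = opts.schedule || setTimeout;                    // injectable, e.g. for tests

  function attempt(remaining) {
    var value = check();
    if (value) {
      onDone(null, value);
    } else if (remaining <= 0) {
      onDone(new Error('condition not met in time'));
    } else {
      schedule(function () { attempt(remaining - 1); }, interval);
    }
  }

  attempt(retries);
}
```

In a page context, check could be something like `function () { return document.querySelector('.profilePic'); }`, so the scrape only starts once the element actually exists.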
This is because web pages generated from AJAX calls load their data asynchronously, so you can't rely on onLoad events (the data may not be available yet).
In my personal opinion, the most reliable way is to trace which REST services the page calls and make direct calls to them. Sometimes you will need to use values found in the HTML or values taken from other calls.
I know this may sound complicated, and in fact it is. You kind of need to debug the page and learn what is being called. But this will work for sure.
By the way, the Chrome developer tools will help with this task. Just observe which calls are made in the Network tab. You can even see what has been sent and received in each AJAX call.
If it is a one time thing, that is, if I just want to scrape a single page once, I just use the browser and artoo-js.
I never tried to write a page on disk using phantom, but I have two observations:
1) you are using fs.write to write things to disk, but writeFile is an async call. This means that you either need to change it to fs.writeFileSync or use a callback before closing phantom.
2) I hope you aren't expecting to write the HTML to a file, open it in a browser and get it rendered like the PNG you saved, because it doesn't work that way. Some objects are stored directly in DOM properties and there are certainly values stored in JavaScript variables; those things will never be persisted.
I'm working on a project with the following Firebase structure:
user {
score: 0,
messages : {
key1 { name: name, text: text }
key2 { name: name, text: text }
key...
}
}
I currently have two problems. The first is determining if the user has a "messages" child, if not, then give it one (along with a score), here's the code I came up with so far:
ref.once('value', function (snapshot) {
if (!snapshot.hasChild("messages")) {
ref.set({
score: 0,
messages: 0
});
}
});
The next is retrieving and displaying the messages from the child once the data has been pushed to it like so:
ref.child("messages").on('child_added', function (snapshot) {
var message = snapshot.val();
$('#messagesDiv').prepend(message.text + ": " + message.name);
});
but that doesn't seem like it's working either.
Here is the fiddle I made.
I hope you guys can help me fix this problem! The syntax looks right and I read over the docs to find most of the current code.
Thanks in advance!
Setting the initial data
Your code with hasChild seems to execute fine. It just doesn't make a lot of sense. The structure that you're adding leads to:
user {
score: 0,
messages: 0
}
Which is not the same as the structure you've drawn in your question: messages here is just a number, while you want it to be a collection of messages. In addition this change will not trigger your child_added handler, since... you're not adding a child to messages.
You've done the right thing by starting with designing a data structure. The next step is to ensure that you stick to that data structure. So if you want to add an initial message, add the message in the correct structure:
ref.once('value', function (snapshot) {
if (!snapshot.hasChild("messages")) {
ref.set({
score: 0,
messages: { 0: { name: 'puf', text: 'welcome' }}
});
}
});
If you modify the fiddle you will see that the welcome message does show up in your #messagesDiv.
I think this approach is still flawed though. Unless you are really looking to add a welcome message, there is no need to add a messages node. I would just set the score to 0 and the messages node will be added once the user enters their first message:
ref.once('value', function (snapshot) {
if (!snapshot.hasChild("messages")) {
ref.set({ score: 0 });
}
});
Adding new messages
I noticed that you also have the following code in your fiddle:
$('#messageInput').keypress(function (e) {
if (e.keyCode == 13) {
var name = user;
var text = $('#messageInput').val();
// POST
ref.child("messages").set({
name: name,
text: text
});
$('#messageInput').val('');
}
});
The input handling is fine, but once again your code that modifies the Firebase data structure does not follow along with the data structure you started your question with. If we execute this code, the data structure will be:
user {
score: 0,
messages: {
name: 'NotToBrag',
text: 'asked 10 hours ago'
}
}
In case it's not obvious: this structure is missing the crucial key1 of your structure. Oh... and it has also overwritten the welcome message.
When you're adding a child node to a Firebase list, you almost always want to use push:
ref.child("messages").push({
name: name,
text: text
});
With that tiny change, the data structure becomes:
user {
score: 0,
messages: {
0: {
name: 'puf',
text: 'welcome'
},
'-Jh-aFN42nWef-FvgcfS': {
name: 'NotToBrag',
text: 'asked 10 hours ago'
}
}
}
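To make the difference between set and push concrete, here is a plain-JavaScript model of the two operations on an in-memory object. It does not use the Firebase API at all; setNode, pushNode and the key1, key2 style keys are invented for the illustration (real push IDs look like '-Jh-aFN42nWef-FvgcfS').

```javascript
var counter = 0;

// Models set(): whatever was stored at `name` is replaced.
function setNode(parent, name, value) {
  parent[name] = value;
  return parent;
}

// Models push(): the value is stored under a fresh child key, siblings are kept.
function pushNode(parent, name, value) {
  if (typeof parent[name] !== 'object' || parent[name] === null) {
    parent[name] = {};
  }
  counter += 1;
  var key = 'key' + counter; // stand-in for a generated push ID
  parent[name][key] = value;
  return key;
}

var user = { score: 0 };
pushNode(user, 'messages', { name: 'puf', text: 'welcome' });
pushNode(user, 'messages', { name: 'NotToBrag', text: 'hello' });
console.log(Object.keys(user.messages)); // two messages, each under its own key

setNode(user, 'messages', { name: 'NotToBrag', text: 'hello again' });
console.log(user.messages); // the list is gone: messages is now a single message
```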
All of these are (as usual) pretty small changes. But together they ensured that your scenario was pretty badly broken. The tricks I used to troubleshoot are incredibly basic and you'd do well to add them to your arsenal and learn to use them.
Debugging trick 1: console.log the data structure
Whenever I first get an MCVE of somebody's problem, I immediately log their data structure:
new Firebase('https://your.firebaseio.com/').once('value', function(s) {
console.log(s.val());
})
At times I might stringify the JSON:
new Firebase('https://your.firebaseio.com/').once('value', function(s) {
console.log(JSON.stringify(s.val()));
})
That last snippet is for example a great way to get the data structure for use in your question.
The snippet only shows the data structure once, so keep running this snippet every time something changes.
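A small variation that may be handy when pasting the structure into a question: JSON.stringify takes an indentation argument, so the output stays readable. The pretty helper and the sample object below are made up; in practice you would pass s.val().

```javascript
// Returns an indented JSON string for any JSON-serialisable value.
function pretty(value) {
  return JSON.stringify(value, null, 2);
}

var sample = { score: 0, messages: { key1: { name: 'puf', text: 'welcome' } } };
console.log(pretty(sample));
```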
Debugging trick 2: remove your data
Your whole hasChild snippet seems aimed at setting up the initial data structure for a user. To aid in testing, I frequently remove the data:
new Firebase('https://your.firebaseio.com/myName').remove()
And then when you run the fiddle again, you can see what your hasChild-using code does.
I often put code to clean out (or otherwise reset) my test data either at the start of my fiddles or simply run a snippet from the browser's JavaScript console.
I'm using the meteor-paginated-subscription package in my app. On the server, my publication looks like this:
Meteor.publish("posts", function(limit) {
return Posts.find({}, {
limit: limit
});
});
And on the client:
this.subscriptionHandle = Meteor.subscribeWithPagination("posts", 10);
Template.post_list.events = {
'click #load_more': function(event, template) {
template.subscriptionHandle.loadNextPage();
}
};
This works well, but I'd like to hide the #load_more button if all the data is loaded on the client, using a helper like this:
Template.post_list.allPostsLoaded = function () {
allPostsLoaded = Posts.find().count() <= this.subscriptionHandle.loaded();
Session.set('allPostsLoaded', allPostsLoaded);
return allPostsLoaded;
};
The problem is that Posts.find().count() is returning the number of documents loaded on the client, not the number available on the server.
I've looked through the Telescope project, which also uses the meteor-paginated-subscription package, and I see code that does what I want to do:
allPostsLoaded: function(){
allPostsLoaded = this.fetch().length < this.loaded();
Session.set('allPostsLoaded', allPostsLoaded);
return allPostsLoaded;
}
But I'm not sure if it's actually working. Porting their code into mine does not work.
Finally, it does look like Mongo supports what I want to do. The docs say that, by default, cursor.count() ignores the effects of limit.
Seems like all the pieces are there, but I'm having trouble putting them together.
None of the answers do what you really want because none provide a solution that is reactive.
This package does exactly what you want and is also reactive.
publish-counts
I think you can look at the counts-by-room demo in the Meteor docs.
It shows how to publish the count of your posts on the server and read it on the client.
You can simply write this:
// server: publish the current size of your post collection
Meteor.publish("postCounts", function () {
var self = this;
var count = 0;
var initializing = true;
var handle = Posts.find().observeChanges({
added: function (id) {
count++;
if (!initializing)
self.changed("counts", 'postCounts', {count: count});
},
removed: function (id) {
count--;
self.changed("counts", 'postCounts', {count: count});
}
});
initializing = false;
self.added("counts", 'postCounts', {count: count});
self.ready();
self.onStop(function () {
handle.stop();
});
});
// client: declare collection to hold count object
Counts = new Mongo.Collection("counts");
// client: subscribe to the count for posts
Tracker.autorun(function () {
Meteor.subscribe("postCounts");
});
// client: simply use findOne, you can get the count object
Counts.findOne()
The idea of sub.loaded() is to help you with exactly this problem.
Posts.count() isn't going to return the right thing because, as you've guessed, on the client, Meteor has no way of knowing the real number of posts that live on the server. But what the client knows is how many posts it's tried to load. That's what that .loaded() tells you, and is why the line this.fetch().length < this.loaded() will tell you if there are more posts on the server or not.
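That comparison can be written as a tiny pure function. This is only a sketch of the reasoning above; allPostsLoaded here takes plain numbers (hypothetical parameters clientCount and requestedLimit) rather than reading them from the subscription handle.

```javascript
// clientCount: documents currently on the client (e.g. this.fetch().length)
// requestedLimit: how many documents the client has asked for so far (e.g. this.loaded())
function allPostsLoaded(clientCount, requestedLimit) {
  // Fewer documents than requested means the server had nothing more to send.
  return clientCount < requestedLimit;
}
```

One consequence of this check: if the total number of posts is an exact multiple of the page size, the two numbers are equal after the last page, so the button stays visible for one extra click.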
What I would do is write a Meteor server side method that retrieves the count like so:
Meteor.methods({
getPostsCount: function () {
return Posts.find().count();
}
});
Then call it on the client, in observe to make it reactive:
function updatePostCount() {
Meteor.call('getPostsCount', function (err, count) {
Session.set('postCount', count);
});
}
Posts.find().observe({
added: updatePostCount,
removed: updatePostCount
});
Although this question is old, I thought I would provide an answer that ended up working for me. I did not create the solution, I found the basis for it here (so credit where credit is due): Discover Meteor
Anyway, in my case I was trying to get the "size" of the database from the client side, so I could determine when to hide the "load more" button. I was using template-level subscriptions. Oh, and for this solution to work, you need to add the reactive-var package. Here is my code (in short):
/*on the server we define the method which returns
the number of posts in total in the database*/
if(Meteor.isServer){
Meteor.methods({
postsTotal: function() {
return PostsCollection.find().count();
}
});
}
/*In the client side we first create the reactive variable*/
if(Meteor.isClient){
Template.Posts.onCreated(function() {
var self = this;
self.totalPosts = new ReactiveVar();
});
/*then in my case, when the user clicks the load more -button,
we call the postsTotal-method and set the returned value as
the value of the totalPosts-reactive variable*/
Template.Posts.events({
'click .load-more': function (event, instance){
Meteor.call('postsTotal', function(error, result){
instance.totalPosts.set(result);
});
}
});
}
Hope this helps someone (I recommend checking the link first). For template level subscriptions, I used this as my guide Discover Meteor - template level subscriptions. This was my first stacked-post and I am just learning Meteor, so please have mercy...:D
Ouch this post is old, anyway maybe it will help someone.
I had exactly the same issue. I managed to solve it with 2 simple lines...
Remember the :
handle = Meteor.subscribeWithPagination('posts', 10);
Well, on the client I used handle.loaded() and Posts.find().count(): when they are different, it means that all the posts are loaded. So here is my code:
"click #nextPosts":function(event){
event.preventDefault();
handle.loadNextPage();
if(handle.loaded()!=Posts.find().count()){
$("#nextPosts").fadeOut();
}
}
I had the same problem, and using the publish-counts package didn't work with the subs-manager package. I created a package that can set a reactive server-to-client session, and keep the document count in this session. You can find an example here:
https://github.com/auweb/server-session/#getting-document-count-on-the-client-before-limit-is-applied
I'm doing something like this:
On the client:
Template.postCount.posts = function() {
return Posts.find();
};
Then you create a template:
<template name="postCount">
{{posts.count}}
</template>
Then, wherever you want to show the counter: {{> postCount}}
Much easier than any solution I have seen.
If I read a value from Firebase and then remove it, a subsequent limited read (e.g. dataRef.limit(10).once("value") ) will still see the removed value.
If I do an unlimited read, then I won't see the removed value, and a subsequent limited read will also no longer see the removed value.
var gFirebase = new Firebase("https://brianshmrian.firebaseio.com/");
function CreateValue()
{
gFirebase.child("TestBug/Key").set("Value");
}
function ReadValue(limit)
{
var dataRef = gFirebase.child("TestBug");
if (limit)
dataRef = dataRef.limit(10);
dataRef.once("value",function(snapshot)
{
alert((limit?"Limited read\n":"Normal read\n") + snapshot.val());
});
}
function RemoveValue()
{
gFirebase.child("TestBug/Key").remove();
}
In this example code, if I do a CreateValue(), then a ReadValue(), then a RemoveValue(), then a ReadValue(true), the object will still be reported to me in the last ReadValue(). However, if I do a ReadValue(false), I'll no longer see the value, and a subsequent ReadValue(true) will not see the value either.
See here to try it for yourself: http://jsfiddle.net/brianshmrian/5WWR6/
So is this a bug? Or am I making a mistake?
EDIT
Ok, that seems like a not too painful workaround. The code below solves my problem for now:
// Need to do this before the remove to avoid caching problem
dataRef.on("value", function(snapshot)
{
setTimeout(function() { dataRef.off(); }, 3000);
});
dataRef.remove();
I can't find any issues with the code. There is always the gotcha that locally cached data is returned synchronously, but I don't see that as an issue here; there's no way for the read to be getting called before the remove has completed. It looks like a pretty straightforward bug.
I was able to circumvent the behavior by setting up the limit(10).on('value') before calling the add/delete operations. So I think that if you establish your query ref first, you'll be okay.
Example: http://jsfiddle.net/katowulf/6wQFF/2/ (the pre tag is set up on load)
I've been scratching my head as to why this code will work some of the time, but not all (or at least most of the time). I've found that it actually does run displaying the correct content in the browser some of the time, but strangely there will be days when I'll come back to the same code, run the server (as per normal) and upon loading the page will receive an error in the console: TypeError: 'undefined' is not an object (evaluating 'Session.get('x').html')
(When I receive that error there will be times where the next line in the console will read Error - referring to the err object, and other times when it will read Object - referring the data object!?).
I'm obviously missing something about Session variables in Meteor and must be misusing them? I'm hoping someone with experience can point me in the right direction.
Thanks, in advance for any help!
Here's my dummy code:
/client/del.html
<head>
<title>del</title>
</head>
<body>
{{> hello}}
</body>
<template name="hello">
Hello World!
<div class="helloButton">{{{greeting}}}</div>
</template>
My client-side javascript file is:
/client/del.js
Meteor.call('foo', 300, function(err, data) {
err ? console.log(err) : console.log(data);
Session.set('x', data);
});
Template.hello.events = {
'click div.helloButton' : function(evt) {
if ( Session.get('x').answer.toString() === evt.target.innerHTML ) {
console.log('yay!');
}
}
};
Template.hello.greeting = function() {
return Session.get('x').html;
};
And my server-side javascript is:
/server/svr.js
Meteor.methods({
doubled: function(num) {
return num * 2;
},
foo: function(lmt) {
var count = lmt,
result = {};
for ( var i = 0; i < lmt; i++ ) {
count++;
}
count = Meteor.call('doubled', count);
result.html = "<em>" + count + "</em>";
result.answer = count;
return result;
}
});
I think it's just that the session variable won't be set yet when the client first starts up. So Session.get('x') will return undefined until your method call (foo) returns, which almost certainly won't happen before the template first draws.
However after that it will be in the session, so things will probably behave right once you refresh.
The answer is to just check if it's undefined before trying to access the variable. For example:
Template.hello.greeting = function() {
if (Session.get('x')) return Session.get('x').html;
};
One of the seven principles of Meteor is:
Latency Compensation. On the client, use prefetching and model simulation to make it look like you have a zero-latency connection to the database.
Because there is latency, your client will first attempt to draw the layout with the data it has at the moment it connects. Then it makes the call and updates once the call returns. Sometimes the call might respond fast enough for the result to be drawn at the same time.
Since there is a chance that the variable is not set yet, accessing it throws an exception in that case and execution breaks off (the remaining functions in the call stack will not run).
There are two possible solutions to this:
Check that the variable is set when using it.
return Session.get('x') ? Session.get('x').html : '';
Make sure the variable has an initial value by setting it at the top of the script.
Session.set('x', { html: '', answer: '' });
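Both options can be tried out with a small stand-in for Session. The createStore function below is a made-up, non-reactive substitute so the snippet runs outside Meteor; in a real app you would use Session.get and Session.set, and Session.setDefault for initial values.

```javascript
// Hypothetical, non-reactive stand-in for Meteor's Session.
function createStore() {
  var data = {};
  return {
    get: function (key) { return data[key]; },
    set: function (key, value) { data[key] = value; },
    // Only sets the value when the key has never been set.
    setDefault: function (key, value) {
      if (data[key] === undefined) { data[key] = value; }
    }
  };
}

// Option 1: guard at the point of use.
function greeting(store) {
  var x = store.get('x');
  return x ? x.html : '';
}

// Option 2: give the variable an initial value up front.
var store = createStore();
store.setDefault('x', { html: '', answer: '' });
console.log(greeting(store)); // empty string, no exception before the method call returns

// Later, the method callback stores the real result (foo(300) yields 1200).
store.set('x', { html: '<em>1200</em>', answer: 1200 });
console.log(greeting(store));
```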
Another approach would be to add the templates once the call responds.
Meteor.call('foo', 300, function(err, data) {
Session.set('x', data);
$('#page').html(Meteor.ui.render(function() {
return Template.someName();
}));
});