Scrapy-splash does not send cookies in other requests - splash-screen

I use cookies to access the website. Everything is fine and I'm logged in. But after I use a Lua script to click on an element, the page's JS sends an ajax request with some information, and I realized that this ajax request doesn't send the cookies with it.
A similar case: when logged in, I click through to another page with the Lua script (e.g. about, prices, ...) and I get that page as if I were not logged in.
The problem is that I have to do this with a Lua script, because I can't issue a new SplashRequest. The ajax request is sent with a parameter generated by JS, and this JS code is very complex. I can't reproduce it in Python to use a requests library. That's why I need a browser like Splash to do it for me. But it seems I have run into a limitation of Splash.
If you've ever had the same problem, please help me fix it. Is this a limitation of Splash, or is there some way to solve it?

Related

How to know if user is authenticated on the first request with firebase

Given a backend written in Node.js that returns a page that should show either a link to log in (if the user is not logged in) or a link to log out (if the user is already logged in).
Considering I'm using Firebase as the authentication tool, how can I know on the first request, when the user is accessing the website, whether they are authenticated, so that I can set the EJS template to respond with the correct link?
Is there some header or token that I can use?
The only solution I found was to use ajax after the server response, but I don't like this solution because there is apparently a delay before the link is rendered.
As far as I know there is no way to know if the user is authenticated on the initial request. From a quick inspection, no data is sent along with that request. That kind of makes sense, given that at the time of this request it is not even known whether you're using authentication to begin with.
Update
I actually just ran into this blog post from one of the Firebase engineers, which seems promising: Introducing session cookies for server-side web apps. I haven't fully read it yet, but the title sounds like it may be exactly what you want.

Is it bad design to have a link in email message result in no browser action when clicked?

Original post:
This web application sends out emails which contain a link to a URL.
Correction-Clarification 9/17/2014:
An .EXE running as a scheduled task on a server (in "support" of the web app and connecting to same database) sends out emails which contain a link to a URL.
The nature of the email content is essentially a "reminder"; the link when clicked is essentially an acknowledgement signaling "done".
Resumption of original post follows:
Clicking the link in the email does 2 things at the target .ASPX page:
the page logic updates a database and sends another email to the same user
the page finishes by displaying a "success" message in the browser
Would it be bad design to eliminate the success message sent by the browser?
I'm thinking the opening of the web page just to announce "success" is not needed. If the target URL were replaced with something with no user interface (e.g. HTTPHandler, webservice) then I'm thinking the email sent back to the same user confirming "success" would be adequate.
Yet, part of that approach "feels awkward", I guess because normally clicking on links in emails causes web pages to open. Given these requirements, would this be bad design to eliminate the browser?
UPDATE - 10-17-2014:
see: Submit to HttpHandler results in RequestType GET instead of POST
update below
Actually, it's bad design to have a state change occur based on a GET request. A number of email systems (and virus scanning software) will follow the link in order to determine whether it should be quarantined or not.
Never mind that a GET request causing a change in state is pretty much against how the web is supposed to work anyway.
What should happen is they click the link, then the mail program opens the browser. You then show a page asking them to confirm the action by clicking a button. That button makes a POST request which you then act on.
Finally, I'm not entirely sure how you would eliminate the browser anyway. The mail program detects that it's a link and opens the browser once the user clicks on it. This is no different from how opening Word documents or zip files works. The email program just asks the OS which program is supposed to handle the action and passes it off to that program.
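The confirm-then-POST flow described above can be sketched roughly as follows. This is only an illustration in Python rather than ASP.NET: `handle_request`, the `acknowledged` set, and the `token` parameter are hypothetical names standing in for the real page logic and database. The point is that GET only renders a confirmation form, so a mail scanner following the link changes nothing, while POST performs the update.

```python
import html

# Hypothetical in-memory store standing in for the real database.
acknowledged = set()


def handle_request(method, token):
    """Return (status, body) for a request to the acknowledgement URL."""
    if method == "GET":
        # Safe request: only render a confirmation form, never change state.
        form = (
            '<form method="post">'
            f'<input type="hidden" name="token" value="{html.escape(token)}">'
            '<button type="submit">Mark as done</button>'
            "</form>"
        )
        return 200, form
    if method == "POST":
        # State change happens only on an explicit user action.
        acknowledged.add(token)
        return 200, "Acknowledged."
    return 405, "Method not allowed."
```

A link scanner that issues `handle_request("GET", token)` leaves `acknowledged` untouched; only the button press, which arrives as a POST, records the acknowledgement.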
With your update, I think there's a much cleaner way. However this is dependent on the capabilities of the email client that'll be receiving the messages. Should be good for the vast majority of them.
In the body of the email, instead of sending a link, include an HTML form that contains a button which performs a postback to your server. See this ( link ) for samples of how some other companies have done it.
This way the action is a single step instead of two AND you aren't doing things the Wrong Way(tm).

Aspnet LinqtoTwitter - PageCycle Issues

I am currently working on the linqtotwitter library.
I am using cookies to store the token and key. My problem isn't so much with the API. It is more with ASP.NET and the page life cycle.
The problem I have with my WebForms app also occurs in the ASP.NET WebForms default sample on the LINQ to Twitter site.
This is how the API works:
In a nutshell, you pass the credentials to the authorizer object, and pass that to the Twitter context.
In the sample you authorize and so on. Once the page loads, the auth.screenname label is changed to your Twitter handle, because you authenticated and it passed the auth.credentials to the Twitter context.
This is where my problem is. If I hit refresh, the label is cleared out. I am still authenticated with Twitter, so I can post, but I cannot get values from the auth objects.
How would I keep the state on a refresh, so that I keep something like auth.screenname or something else in memory?
I think I would need to preload the authorized Twitter context, but I have no idea how to do that.
I do not think using a hidden form element is proper, because you're masking the underlying problem.
If you want to see what linqtotwitter is, it is at http://linqtotwitter.codeplex.com/
You could throw the tokens into Session if you have it enabled, that might solve your issue.

How can I access the captcha image that was generated when the page was loaded?

On some websites, when you want to log in, you need to enter a captcha as well. If I want to let a user enter a captcha into my application (which will then log into the website), how would I do this?
My problem is that the link to the captcha image is like this: example.com/captcha , and it serves a different image each time it's accessed.
My approach is like this:
request page
download image
show image to user
user inputs login information
application logs in
The thing is, if you download the image in order to show it to the user, you're actually receiving a different image than the one generated when the page was loaded, right? How can I get to the image that was generated when the page was loaded, so that when I show it to the user, it's the correct one?
The question is language agnostic.
I think your problem is about sessions. The session in which your app downloads the image and the session in which your app submits the login form may not be the same session, in which case your captcha will never be correct. You should maintain the session between requests; normally it is tracked by some cookie set by the website.
By design, most captcha will always give you a different image. No way to work around that fact.
The first thing to do is to open up Fiddler. That way you can see what the browser is doing so that it can authenticate and remain authenticated.
It usually comes down to a cookie being sent. So what you need to do is to hold the cookie on your client app, and have all the requests sent with that cookie. Different platforms provide features to do so, but I'm sure a quick search will show you how.
Remember to pay attention to everything being exchanged in Fiddler; you need to make sure your app sends the same. Besides cookies, pay attention to any hidden field that JS might set on the form.
It sounds like you're trying to invent a captcha solution yourself. Have you considered using reCAPTCHA? It's free.
Can you be a bit more specific about your situation? From what you've said, I'm assuming the following:
You have a "client GUI app" that logs in to a third-party site. Is this a web-app, or a desktop/standalone application? In what language is it written?
Your app contacts the third party site and downloads the Captcha image. This image is then shown to the user.
The user enters the captcha phrase and submits it to your app. Your app then submits this phrase to the site for validation. This is where sessions come in. Assuming the remote site uses cookie-based session tracking, you will need to send the same cookie to the third-party server with this submission as the one you received when the image was downloaded (in the step above). This allows the server to match your submission to the correct image it sent. Precisely how you do this depends on what language you've written your app in and the precise structure of it all. Without more information, a more specific solution is impossible.
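Since the question is language agnostic, here is one way the cookie handling could look in Python's standard library. It is a minimal sketch, assuming the site uses cookie-based sessions; the URLs and the form field names (`username`, `password`, `captcha`) are placeholders, not taken from any real site. The key point is that a single opener with one cookie jar is reused for both the image download and the login submission.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Create an opener that stores cookies and replays them on later requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def fetch_captcha(opener, captcha_url):
    """Download the captcha image; any session cookie set here lands in the jar."""
    with opener.open(captcha_url) as resp:
        return resp.read()


def submit_login(opener, login_url, username, password, captcha_text):
    """POST the login form through the same opener, so the same cookie is sent."""
    # Field names are assumptions; inspect the real form to find the actual ones.
    data = urllib.parse.urlencode(
        {"username": username, "password": password, "captcha": captcha_text}
    ).encode()
    with opener.open(login_url, data=data) as resp:
        return resp.status
```

Usage would be along the lines of `opener, jar = make_session()`, then `fetch_captcha(opener, ...)`, show the bytes to the user, and finally `submit_login(opener, ...)` with what they typed.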
The image that's generated is also the image served to the user. Your 'main' html page doesn't/shouldn't generate the image, it only embeds it using the image tag.
You could pass a token of some kind with the captcha image, perhaps appended to the filename such as captcha-0ad719bef61bc6a0.jpg and the appended data could link into a temporary table in a database server side that has the correct answer. This would allow you to check things were ok without passing both the image and answer across to your application.
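A rough server-side sketch of that token idea, in Python: the `pending` dictionary stands in for the temporary database table, and `issue_captcha` / `check_captcha` are hypothetical helper names. Image generation itself is left out.

```python
import secrets

# Hypothetical stand-in for the temporary server-side table: token -> answer.
pending = {}

PREFIX = "captcha-"
SUFFIX = ".jpg"


def issue_captcha(answer):
    """Record the answer under a random token and return the image filename."""
    token = secrets.token_hex(8)  # 16 hex characters
    pending[token] = answer
    return f"{PREFIX}{token}{SUFFIX}"


def check_captcha(filename, guess):
    """Look up the token embedded in the filename and compare the guess."""
    if not (filename.startswith(PREFIX) and filename.endswith(SUFFIX)):
        return False
    token = filename[len(PREFIX):-len(SUFFIX)]
    # pop() makes each captcha single-use, so a token cannot be replayed.
    expected = pending.pop(token, None)
    return expected is not None and guess == expected
```

Only the filename travels to the client; the answer stays on the server and is looked up when the guess comes back.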
I'm not sure if I entirely understand this question, but wouldn't you simply store the captcha locally after requesting it from the server, and then embed the local image from the client application, while storing any necessary session captcha data that will allow the captcha to be validated on post, assuming the user input is correct?
If the problem is that the captcha changes every time you request it, just request it only once.
Can you offer any more clarification if this wouldn't apply to you?
It varies from one captcha to another. Maybe you need to use sessions or cookies, or some captcha image filename. Show us the page with that captcha.

Post to Facebook Page via ASP.NET

I've seen this and this but before I sink a ton of time into it, I want to know if what I'm trying to do is possible. I have a Page on FB (not a profile, but a Page for business, websites, etc) and I want to post a story to it via my site automatically. I don't want to do anything else but that. I don't want to create an app (if I don't have to), just post to a Page. Is there an easy way to do this, or is this super complicated?
Also, if I have to build an app, what's the simplest way to go about this (the other guy's question was never answered)?
Thanks!
Yes, you will need to get a page access token. Simply use the user access token for an admin of the page and call me/accounts. There you will find a list of all the pages and apps administered by that user. Find the page, and in that object will be the page access token. Use that page access token and HTTP POST to me/feed with the post parameters set.
See also:
http://developers.facebook.com/docs/reference/api/page/
https://developers.facebook.com/docs/reference/api/permissions
http://developers.facebook.com/docs/authentication/
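The two calls described in that answer (me/accounts, then a POST to me/feed) might be sketched as below. This is a hedged illustration in Python rather than ASP.NET: it only builds the requests and picks the token out of a response, and the response shape (a `data` list whose entries carry `id` and `access_token`) and the unversioned `graph.facebook.com` base URL are assumptions that should be checked against the current Graph API documentation. The helper names are made up for this example.

```python
import urllib.parse
import urllib.request

# Assumed base URL; current API versions may require a versioned path.
GRAPH = "https://graph.facebook.com"


def accounts_request(user_token):
    """Build the GET request listing pages administered by the user."""
    query = urllib.parse.urlencode({"access_token": user_token})
    return urllib.request.Request(f"{GRAPH}/me/accounts?{query}")


def find_page_token(accounts_response, page_id):
    """Pick the page access token out of the decoded me/accounts JSON."""
    for entry in accounts_response.get("data", []):
        if entry.get("id") == page_id:
            return entry.get("access_token")
    return None


def feed_post_request(page_token, message):
    """Build the POST to me/feed; with a page token, 'me' refers to the page."""
    body = urllib.parse.urlencode(
        {"message": message, "access_token": page_token}
    ).encode()
    return urllib.request.Request(f"{GRAPH}/me/feed", data=body, method="POST")
```

Each request object can be sent with `urllib.request.urlopen(...)`, decoding the first response with `json.load` before passing it to `find_page_token`.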
You could write a script to control a web browser. The script could log in then post the message... Use a library like WatiN to script the browser.
You are either going to have to make a Facebook application, use Frank's method, or do some sniffing to figure out how the publisher works and then log in and post with cURL and cookies.
There is also an application called "Blog RSS Feed Reader" if you want to go the RSS route.
