How to click Logout as finalizer in CasperJS - web-scraping

I am creating a web scraper (in CasperJS, using JavaScript) that requires login authentication. However, I need it to click the logout link at the end of the script even when the process throws an error, such as a non-existent element, which may occur frequently during trial-and-error development. The behavior is similar to Java's finally {} block.
This behavior is required because the website allows only a single session per user. If I don't click logout, the next scraper invocation must wait 5 minutes for the previous session to expire, which is not good.
Where should I put the logout click?

You can set the casper.options.exitOnError property to false so that the script keeps executing after an error and eventually reaches your logout click.
var casper = require('casper').create({
exitOnError: false,
});
or
casper.options.exitOnError = false;
This only works with PhantomJS 1.x, because PhantomJS 2 doesn't throw an error when a selector cannot be found. CasperJS simply stops in this case.
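As a rough sketch of where the logout click could go: with exitOnError set to false, the logout can be queued as the last step so that it still runs after an earlier step has failed. The helper name addLogoutFinalizer, the URL, and the selector below are hypothetical placeholders, not part of CasperJS; only then, exists, click, start, and run are CasperJS calls.

```javascript
// Sketch only. `casper` is assumed to be a CasperJS instance created with
// exitOnError: false; `addLogoutFinalizer` and the selector are hypothetical.
function addLogoutFinalizer(casper, logoutSelector) {
  // Queue the logout as the very last step, after all scraping steps.
  casper.then(function () {
    // Step callbacks run with the casper instance as `this`.
    // Only click when the link is present, so the finalizer itself
    // does not raise another error (for example when login never succeeded).
    if (this.exists(logoutSelector)) {
      this.click(logoutSelector);
    }
  });
  return casper;
}

// Intended use inside a CasperJS script (not runnable under plain Node):
//   var casper = require('casper').create({ exitOnError: false });
//   casper.start('https://example.com/login');   // placeholder URL
//   casper.then(function () { /* log in and scrape */ });
//   addLogoutFinalizer(casper, 'a#logout');       // placeholder selector
//   casper.run();
```

The existence check is a design choice: if the failure happened before login, there is no logout link to click, and the finalizer should then do nothing.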

Related

What actually happens when you Stop Debugging?

I have 2 ASP.NET applications. 1 is in VB, 1 is in C#.
When a user logs in to the VB app with certain credentials, they should be re-routed to the C# app. Likewise, certain credentials in the C# app should be re-routed to the VB app.
VB -> C# works. This functionality was written by a third party. (The C# application is essentially just a rewrite of our VB app, but more modern. However, the entire package isn't being rewritten).
I've tried to reverse the code so that the C# app will call a stored procedure in the DB to create a token, redirect the browser to the VB app which calls a procedure to get that token and set some Session variables.
I don't have it quite working right. One of the major issues is that the browser simply does not navigate off the C# login page to the VB page. If I run Profiler on the DB, however, I can see the "load token" stored procedure being called. That must mean that the code is getting executed, but the browser isn't redirecting correctly, right?
More importantly, and the reason I'm posting this question, is that I don't understand what's actually happening when I stop debugging my app. I set a breakpoint immediately following the call to create that token in the DB. So I run my application, log in, trigger the breakpoint, and I can see the good data in the DB. If I immediately Stop Debugging, the load token procedure still gets called. How!?
Here's the code;
In my LoginController:
public ActionResult ValidateUser(Login objLogin)
{
var ds = LoginData.ValidateUser(objLogin);
string url = "someUrl/" + ds.Tables["Key"].Rows[0][0].ToString();
System.Web.HttpContext.Current.Response.Redirect(url, true);
return Json(objLogin, JsonRequestBehavior.AllowGet);
}
That redirect points to the landing page in the VB application, which parses the key out of the URL and passes it as a parameter to another DB SP... However, the browser never navigates off my login page, regardless of whether I stop debugging or not.
Frankly, I'm not entirely sure what the return statement does; if I try to step into it, it just continues on as if I hit "Play". The application resumes control and just chills at the login page. It's part of the third-party rewrite. The VB app was very old and pretty unstructured. The new C# rewrite uses MVC. I'm familiar with the principles but I'm not an expert on it, especially not in .NET.
And in LoginData
public DataSet ValidateUser(Login objLogin)
{
DataSet dsData;
using (SqlCommand sqlCommand = new SqlCommand("Validate_User_Main"))
{
// execute this procedure; assign results to dsData
}
string authKey = GetAuthKey(dsData.userId);
DataTable dtTemp = new DataTable("key"); //putting break point here after the key gets created but before the redirect is called in LoginController.ValidateUser
dtTemp.Columns.Add("Key");
DataRow drTemp = dtTemp.NewRow();
drTemp[0] = authKey;
dtTemp.Rows.Add(drTemp);
dsData.Tables.Add(dtTemp);
return dsData;
}
Edit: If I close my browser window while still waiting on my breakpoint, then stop debugging that "load token" call isn't utilized. If I simply Stop Debugging but leave my browser open, it gets called. So it must be redirecting "behind the scenes", right? I don't understand...
When you stop debugging, the debugger is detached. This simply means that it stops tracking the running code. The code keeps running, as you have seen, but now you can't set breakpoints, watch variables, etc.

How do you securely log out and clear all subscriptions?

I implemented my own login system, because I'm using a third party web service to authenticate users against an enterprise authentication system. As such, I built a form that calls a server method to make the web service call to the auth system, and if the credentials are valid, it sets a session variable with the user's id. This is how I change the template to show the main screen of the application and not the login screen. Works fine. And the logout button then just sets that userid session variable to false, effectively hiding the main application screen and showing the login form again.
<body>
{{#if loggedInUser}}
{{> navbar}}
{{> mainScreen}}
{{else}}
{{> customLogin}}
{{/if}}
</body>
Template.navbar.helpers({
loggedInUser: function () {
return Session.get('userName');
}
});
'click #logoutButton': function () {
Session.set("userName", false);
}
What I have discovered though, is that the local minimongo collections/subscriptions are still in the browser, and accessible in the console, after the user logs out.
I did some searching but didn't find concrete solutions as to how to properly clear out (or stop?) these subscriptions on the client. In fact, the top 3 hits on a search for "meteor publish subscribe " don't mention stopping or security upon logout.
One suggestion on SO was to save the subscription handle ... but I'm calling subscribe multiple times, so it seems I would have to store up an array depending on how many different subscribes the user triggered during their use of the application, and then go through them calling "stop" on each handle when logging out??
I'm hoping there's a simple way to stop all subscriptions... seems like a logical thing to do for security when a user clicks a logout button.
Thanks!
Could you not use the .stop() function on the subscription handle?
var subscription = Meteor.subscribe("info");
//on logout
subscription.stop();
According to the docs:
stop()
Cancel the subscription. This will typically result in the server directing the client to remove the subscription's data from the client's cache.
Updated: Maybe check out this package: Subs Manager. It appears they may be able to do what you want, specifically from their readme:
Clear Subscriptions
In some cases, we need to clear all the subscriptions we cache. So, this is how we can do it.
var subs = new SubsManager();
// later in some other place
subs.clear();
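If you would rather not pull in a package, the "store the handles" idea from the question can be wrapped in a small helper. This is only a sketch: createSubscriptionRegistry is a hypothetical name, and it assumes that whatever subscribe function you pass in returns a handle with a stop() method, as Meteor.subscribe does.

```javascript
// Hypothetical helper: remembers every subscription handle so that all of
// them can be stopped in one call at logout.
function createSubscriptionRegistry(subscribeFn) {
  var handles = [];
  return {
    // Same arguments as the wrapped subscribe function; returns its handle.
    subscribe: function () {
      var handle = subscribeFn.apply(null, arguments);
      handles.push(handle);
      return handle;
    },
    // Stops every remembered handle and returns how many were stopped.
    stopAll: function () {
      var count = handles.length;
      handles.forEach(function (handle) { handle.stop(); });
      handles = [];
      return count;
    }
  };
}

// Assumed use in a Meteor client (not runnable outside Meteor):
//   var subs = createSubscriptionRegistry(function () {
//     return Meteor.subscribe.apply(Meteor, arguments);
//   });
//   subs.subscribe("info");
//   // in the logout click handler:
//   subs.stopAll();
//   Session.set("userName", false);
```

Stopping the handles only removes the data from the client cache; the server-side publish functions still need their own check that the user is allowed to see the data.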

Handle session timeout in asp.net using Javascript

Essentially I want to be able to catch when a user lets their session timeout and then clicks on something that ends up causing an Async postback. I figured out that if I put this code in my Session_Start (in Global.asax) then I can catch a postback that occurred during a session timeout:
With HttpContext.Current
If TypeOf .Handler Is Page Then
Dim page As Page = CType(.Handler, Page)
If page IsNot Nothing AndAlso page.IsPostBack Then
'Session timeout
End If
End If
End With
This works fine. My question is, I would like to be able to inject some JavaScript into the Response and then call Response.End() so that the rest of the application does not finish executing. The problem is that when I try Response.Write("<script ... ") followed by Response.End(), the JavaScript does not get written to the response stream. I'm sure there are other places in the application where I can safely write JavaScript to the Response, but I can't let the rest of the application execute because it will error when it tries to access the session objects.
To sum up: I need to inject javascript into the response in the Session_Start event in Global.asax
Note: You may be wondering why I'm not doing this in Session_End...we don't use InProc sessions and so Session_End doesn't get called...but that's beside the point...just wanted to make it clear why I'm doing this in Session_Start.
Writing to the response stream outside of an HttpHandler is generally not a good idea; it may work in some corner cases, but it's not how things are intended to work.
Have you considered using either a Page base class or a Page Adapter to do this? That way, you would only need one copy of the code, and it could be applied to either all pages or just the ones you select.
Another option would be to use URL rewriting to redirect the incoming request to a page that generates the script output you need.

Abort Asynchronous Web Service Call and redirect to another URL (ASP.NET Ajax)

In my webapp, I have a list of links generated from code-behind and bound to a repeater control. Clicking on a link opens a popup window, where, along with displaying some data, an asynchronous call to a WCF Service is made (through a JavaScript proxy). This service in turn calls another third-party web service that might take a long time to respond. I am working with IE6; that's an unavoidable requirement.
Now, I abort this service call in onunload if the user decides not to wait for the call to complete and just closes the popup window. The problem is, if the user clicks another link from the repeater immediately after, the new popup window opens but doesn't load the page (doesn't go to the supplied URL) until the previous asynchronous call has completed (I have verified this through Fiddler). Interestingly, this only happens for links within the same domain. If I change the link for one of the popups to, say, www.google.com, then the window opens and goes to the correct URL as intended. But popups with links within my own domain, opened immediately after a popup window with an unfinished request was closed, wait until the previous request completes before loading the URL.
I have verified the correct way to abort a callback and abort does fire properly. I also know that I can only abort my client side call, and not the server side call and I don't care about it. My only requirement is that the browser load the next link regardless of the previous asynchronous response.
//Method to Call Service:
function GetData(Id) {
//call the service
Sys.Net.WebRequestManager.add_invokingRequest(On_InvokingRequest);
var service = new WrapperService();
service.GetData(Id, handleSuccess, handleError, null);
Sys.Net.WebRequestManager.remove_invokingRequest(On_InvokingRequest);
}
//method to get the current requests abort executor
function On_InvokingRequest(executor, eventArgs) {
var currentRequest = eventArgs.get_webRequest();
abortExecutor = currentRequest.get_executor();
}
//abort service on unload
function unload() {
if (abortExecutor != null) {
abortExecutor.abort();
}
}
Helpful/Similar links for the background:
browser-waits-for-ajax-call-to-complete-even-after-abort-has-been-called-jquery
aborting-an-asp-net-web-service-asynchronous-call
canceling-ajax-web-service-call
Has anybody faced this before? It's driving me nuts! Any help will be greatly appreciated.
The answer in one of your links sounds like the problem to me:
Browser waits for ajax call to complete even after abort has been called (jQuery)
Does your service require session state?
You could prove whether the problem is that IE itself won't issue the request by configuring IE to allow for more than 2 requests to the same domain. If it's being blocked because the aborted request is somehow eating up one of those connections, then increasing it should yield different results. If it still has the problem, it must be that the server is waiting to respond.
Configure IE for more than 2 requests:
http://support.microsoft.com/kb/282402
Quote from one of the SO questions you linked:
It turns out I was completely wrong about this being a browser issue - the problem was on the server. ASP.NET serializes requests of the same session that require session state, so in this case, the next page didn't begin processing on the server until those ajax-initiated requests completed.
Unfortunately, in this case, session state is required in the http handler that responded to the ajax calls. But read-only access is good enough, so by marking the handler with IReadOnlySessionState instead of IRequiresSessionState, session locks are not held and the problem is fixed.

Logoff button IIS6 ASP.NET Basic Authentication

I have a requirement for an explicit logout button for users in an ASP.NET web app. I am using IIS6 with Basic Authentication (SSL). I can redirect to another web page, but the browser keeps the session alive. I have googled around and found a way to do it by enabling an ActiveX control to communicate with IIS and kill the session. I am in a restricted environment that does not allow forms authentication, and ActiveX controls are forbidden as well. Has anyone else had this requirement, and how have you handled it?
Okay that is what I was afraid of. I have seen similar answers on the net and I was hoping someone would have a way of doing it. Thanks for your time though. I guess I can use javascript to prevent the back button like the history.back()
I was struggling with this myself for a few days.
Using the IE-specific 'document.execCommand('ClearAuthenticationCache');' is not a good option for everyone:
1) it flushes all credentials, meaning that the user will for example also get logged out from his gmail or any other website where he's currently authenticated
2) it's IE only ;)
I tried using Session.Abandon() and then redirecting to my Default.aspx. This alone is not sufficient.
You need to explicitly tell the browser that the request which was made is not authorized. You can do this by using something like:
response.StatusCode = 401;
response.Status = "401 Unauthorized";
response.AddHeader("WWW-Authenticate", "BASIC Realm=my application name");
response.End();
This will result in the following: the user clicks the logout button ==> he will get the basic login window. HOWEVER: if he presses escape (the login dialog disappears) and hits refresh, the browser automagically sends the credentials again, causing the user to get logged in, although he might think he's logged out.
The trick to solve this is to always spit out a unique 'realm'. Then the browser does NOT resend the credentials in the case described above. I chose to spit out the current date and time.
response.StatusCode = 401;
response.Status = "401 Unauthorized";
string realm = "my application name";
response.AddHeader("WWW-Authenticate", string.Format(@"BASIC Realm={0} ({1})", realm, DateTimeUtils.ConvertToUIDateTime(DateTime.Now)));
response.End();
Another thing that you need to do is tell the browser not to cache the page:
Response.Cache.SetCacheability(HttpCacheability.NoCache);
Response.Cache.SetExpires(DateTime.MinValue);
Response.Cache.SetNoStore();
With all these things in place it works (for me) in IE, but so far I still haven't been able to prevent Firefox from logging the user in when the user first presses escape (hiding the basic login dialog) and then uses refresh (F5) or the browser's back button.
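The unique-realm idea is not specific to ASP.NET, so here is a language-neutral sketch of it in JavaScript. The function name buildLogoutChallenge and the shape of the returned object are my own assumptions for illustration; the point is only that the realm string differs on every logout and that the response is marked as not cacheable.

```javascript
// Hypothetical helper, not part of ASP.NET or any library: describes the
// 401 response to send when the user clicks the logout button.
function buildLogoutChallenge(appName, now) {
  // A realm that changes on every logout, here by appending a timestamp.
  // Assumes appName contains no double quotes or backslashes.
  var realm = appName + ' (' + now.toISOString() + ')';
  return {
    statusCode: 401,
    headers: {
      // The realm is sent as a quoted string.
      'WWW-Authenticate': 'Basic realm="' + realm + '"',
      // Ask the browser not to cache or store the logout response.
      'Cache-Control': 'no-cache, no-store'
    }
  };
}

// Example: two logouts at different times produce different realms.
var first = buildLogoutChallenge('my application name', new Date(0));
var second = buildLogoutChallenge('my application name', new Date(1000));
```

Because the realm differs between the two calls, a browser that keys cached credentials by realm has nothing to resend for the second challenge. As noted above, not every browser behaves this way.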
The Session.Abandon method destroys all the objects stored in a Session object and releases their resources. If you do not call the Abandon method explicitly, the server destroys these objects when the session times out.
Have you tried calling Session.Abandon in response to the button click?
Edit:
It would seem this is a classic back button issue.
There is very little you can do about the back button. Imagine the user has just opened the current page in a new window then clicked the logOut button, that page appears to log out but it will not immediately affect the content of the other window.
Only when they attempt to navigate somewhere in that window will it become apparent that their session is gone.
Many browsers implement the back button in a similar (although not identical) way. Going back to the previous page is not necessarily a navigation from an HTML/HTTP point of view.
This is a solution for this problem that works in IE6 and higher.
<asp:LinkButton ID="LinkButton1" runat="server" OnClientClick="logout();">LinkButton</asp:LinkButton>
<script>
function logout()
{
document.execCommand("ClearAuthenticationCache",false);
}
</script>
Found this from
http://msdn.microsoft.com/en-us/library/bb250510%28VS.85%29.aspx
Web Team in Short
Your Credentials, Please
Q: Jerry B. writes, "After the user has validated and processed his request, I now want to invalidate him. Assuming this machine is in an open environment where anyone could walk up and use it, I want to throw a new challenge each time a user accesses a particular module on the Web."
A: This is a frequently requested feature of the Internet Explorer team and the good people over there have given us a way to do it in Internet Explorer 6.0 SP1. All you need to do is call the execCommand method on the document, passing in ClearAuthenticationCache as the command parameter, like this:
document.execCommand("ClearAuthenticationCache");
This command flushes all credentials in the cache, such that if the user requests a resource that needs authentication, the prompt for authentication occurs again.
I put this on my logout link button and it works in IE6 sp1 and higher:
OnClientClick="document.execCommand('ClearAuthenticationCache');"

Resources