I'm trying to read my stock portfolio into a script. The following works with NAB Online Trading but not Bell Direct.
1. Install the Export Domain Cookies Firefox add-on.
2. Log in to my online broker with Firefox.
3. Save the domain cookies to a file (e.g. cookies.txt).
4. wget --no-check-certificate --load-cookies=cookies.txt -O folio.htm https://...(portfolio URL)
The idea is to reuse the browser's login session. When I try it with Bell Direct, wget is redirected to the login page. I get the same results with curl. What am I missing? Is there some state stored in the browser besides the cookies? Bell isn't using "basic authentication": the login page is a form for username/password, and it doesn't pop up the browser's built-in login dialog.
Here is what happens (under Windows XP with Cygwin):
$ wget --server-response --no-check-certificate --load-cookies=cookies-bell.txt -O folio-bell.htm https://www.belldirect.com.au/trade/portfoliomanager/
--2009-12-14 10:52:08-- https://www.belldirect.com.au/trade/portfoliomanager/
Resolving www.belldirect.com.au... 202.164.26.80
Connecting to www.belldirect.com.au|202.164.26.80|:443... connected.
WARNING: cannot verify www.belldirect.com.au's certificate, issued by '/C=ZA/ST=Western Cape/L=Cape Town/O=Thawte Consulting cc/OU=Certification Services Division/CN=Thawte Server CA/emailAddress=server-certs@thawte.com':
Unable to locally verify the issuer's authority.
HTTP request sent, awaiting response...
HTTP/1.1 302 Found
Connection: keep-alive
Date: Sun, 13 Dec 2009 23:52:16 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Location: /account/login.html?redirect=https://www.belldirect.com.au/trade/portfoliomanager/index.html
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 229
Location: /account/login.html?redirect=https://www.belldirect.com.au/trade/portfoliomanager/index.html [following]
...
Perhaps the server is validating the session based on User-Agent as well as the cookie. Check what user-agent your Firefox install is using (perhaps use WhatIsMyUserAgent.com if you don't know it), and try using that exact same user agent in your Wget call (via the --user-agent="..." parameter).
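For example (a sketch; the user-agent string below is just a placeholder, paste in whatever your Firefox actually reports):
wget --no-check-certificate --load-cookies=cookies-bell.txt \
  --user-agent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5" \
  -O folio-bell.htm https://www.belldirect.com.au/trade/portfoliomanager/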
You need to POST the login form variables, then, with those cookies, go to the inner page. See http://www.trap17.com/index.php/automatic-login-curl_t38162.html for some example code.
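For instance, a hedged sketch with curl: the login URL is taken from the redirect above, but the form field names (username/password) are guesses, so check the login page's HTML for the real action URL and input names.
# 1. POST the login form, saving the session cookies it sets to a jar
curl -k -c jar.txt -d "username=MYUSER" -d "password=MYPASS" \
  https://www.belldirect.com.au/account/login.html
# 2. Replay those cookies to fetch the inner page
curl -k -b jar.txt -o folio-bell.htm \
  https://www.belldirect.com.au/trade/portfoliomanager/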
The login is encrypted over HTTPS, and you do not provide a certificate. Perhaps Bell Direct requires a valid certificate for client authentication.
You can export a certificate in Firefox by clicking the highlighted blue portion of the URL > More Information > Security Tab > View Certificate > Details > Export. Then, you can use the --certificate=filename option to specify the exported certificate in your wget command.
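For example (assuming the exported file is PEM-encoded and saved as cert.pem, which is the format wget expects by default):
wget --certificate=cert.pem --no-check-certificate \
  --load-cookies=cookies-bell.txt \
  -O folio-bell.htm https://www.belldirect.com.au/trade/portfoliomanager/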
Maybe you need to set the referrer too.
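For example (a sketch, assuming the portfolio page is normally reached from the site's home page):
wget --no-check-certificate --load-cookies=cookies-bell.txt \
  --referer=https://www.belldirect.com.au/ \
  -O folio-bell.htm https://www.belldirect.com.au/trade/portfoliomanager/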
Related
I just added a feature on a website to allow users to log in with Facebook. As part of the authentication workflow, Facebook forwards the user to a callback URL on my site such as the one below.
https://127.0.0.1?facebook-login-callback?code.....#_=_
Note the trailing #_=_, which is not part of the authentication data; Facebook appears to add it for no clear reason.
Upon receiving the request in the backend I validate the credentials, create a session for the user, then forward them to the main site using a Location header in the HTTP response.
I've inspected the HTTP response via my browser developer tools and confirmed I have set the Location header as:
Location: https://127.0.0.1/
The issue is that the URL that appears in the browser address bar after forwarding is https://127.0.0.1/#_=_
I don't want the user to see this trailing string. How can I ensure it is removed when redirecting the user to a new URL?
The issue happens in all browsers I have tested: Chrome, Firefox, Safari, and a few others.
I know a similar question has been answered in other threads; however, unlike those threads, there is no jQuery or JavaScript in this workflow. All the processing of the login callback happens in backend code exclusively.
EDIT
Adding a bounty. This has been driving me up the wall for some time. I have no explanation and not even a guess as to what's going on, so I'm going to share some of my hard-earned Stackbux with whoever can help me.
Just to be clear on a few points:
There is no JavaScript in this authentication workflow whatsoever.
I have implemented my own Facebook login workflow without using Facebook's JavaScript libraries or other third-party tools; it interacts directly with the Facebook REST API using my own Python code in the backend, exclusively.
Below are excerpts from the raw HTTP requests as obtained from Firefox inspect console.
1 User connects to mike.local/facebook-login and is forwarded to Facebook's authentication page
HTTP/1.1 302 Found
Server: nginx/1.19.0
Date: Sun, 28 Nov 2021 10:44:30 GMT
Content-Type: text/plain; charset="UTF-8"
Content-Length: 0
Connection: keep-alive
Location: https://www.facebook.com/v12.0/dialog/oauth?client_id=XXX&redirect_uri=https%3A%2F%2Fmike.local%2Ffacebook-login-callback&state=XXXX
2 User accepts and Facebook redirects them to mike.local/facebook-login-callback
HTTP/3 302 Found
location: https://mike.local/facebook-login-callback?code=XXX&state=XXX#_=_
...
Response truncated here. Note the offending #_=_ at the tail of the Location header.
3 The backend processes the tokens Facebook provides via the redirect, creates a session for the user, then forwards them to mike.local. I do not add #_=_ to the Location HTTP header, as seen below.
HTTP/1.1 302 Found
Server: nginx/1.19.0
Date: Sun, 28 Nov 2021 10:44:31 GMT
Content-Type: text/plain; charset="UTF-8"
Content-Length: 0
Connection: keep-alive
Location: https://mike.local/
Set-Cookie: s=XXX; Path=/; Expires=Fri, 01 Jan 2038 00:00:00 GMT; SameSite=None; Secure;
Set-Cookie: p=XXX; Path=/; Expires=Fri, 01 Jan 2038 00:00:00 GMT; SameSite=Strict; Secure;
4 User arrives at mike.local and sees a trailing #_=_ in the URL. I have observed this in Firefox, Chrome, Safari and Edge.
I have confirmed via the Firefox inspect console that there are no other HTTP requests being sent. In other words, step 3 is the final HTTP response sent to the user from my site.
According to RFC 7231 §7.1.2:
If the Location value provided in a 3xx (Redirection) response does
not have a fragment component, a user agent MUST process the
redirection as if the value inherits the fragment component of the
URI reference used to generate the request target (i.e., the
redirection inherits the original reference's fragment, if any).
If you get redirected by Facebook to a URI with a fragment identifier, that fragment identifier will be attached to the target of your redirect. (Not a design I agree with much; it would make sense semantically for HTTP 303 redirects, which is what would logically fit this workflow better, to ignore the fragment identifier of the originator. It is what it is, though.)
The best you can do is clean up the fragment identifier with a JavaScript snippet on the target page:
<script async defer type="text/javascript">
if (location.hash === '#_=_') {
if (typeof history !== 'undefined' &&
history.replaceState &&
typeof URL !== 'undefined')
{
var u = new URL(location);
u.hash = '';
history.replaceState(null, '', u);
} else {
location.hash = '';
}
}
</script>
Alternatively, you can use meta refresh/the Refresh HTTP header, as that method of redirecting does not preserve the fragment identifier:
<meta http-equiv="Refresh" content="0; url=/">
Presumably you should also include a manual link to the target page, for the sake of clients that do not implement Refresh.
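For instance, a minimal sketch of such a target page, combining the Refresh tag with a fallback link:
<meta http-equiv="Refresh" content="0; url=/">
<p>If you are not redirected automatically, <a href="/">continue to the site</a>.</p>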
But if you ask me what I’d personally do: leave the poor thing alone. A useless fragment identifier is pretty harmless anyway, and this kind of silly micromanagement is not worth turning the whole slate of Internet standards upside-down (using a more fragile, non-standard method of redirection; shoving yet another piece of superfluous JavaScript the user’s way) just for the sake of pretty minor address bar aesthetics. Like The House of God says: ‘The delivery of good medical care is to do as much nothing as possible’.
Not a complete answer but a couple of wider architectural points for future reference, to add to the above answer which I upvoted.
AUTHORIZATION SERVER
If you enabled an AS to manage the connection to Facebook for you, your apps would not need to deal with this problem.
An AS can deal with many deep authentication concerns to externalize complexity from apps.
SPAs
An SPA would have better control over processing login responses, as in this code of mine which uses history.replaceState.
SECURITY
An SPA can be just as secure as a website with the correct architecture design - see this article.
I was testing a page with Twitter Cards, using Twitter's card preview tool on the following URL (I'm simplifying the URL but the output is valid):
https://example.net/hello/test-page/
It was pulling the card but with this warning:
INFO: Page fetched successfully
INFO: 16 metatags were found
INFO: twitter:card = summary_large_image tag found
INFO: Card loaded successfully
WARN: this card is redirected to http://example.net/hello/test-page/
The only difference here is that HTTPS has been switched to HTTP. It is not just Twitter: FB and LinkedIn do it too.
LinkedIn's link preview tool similarly reported:
URL redirect trail
1 301 Redirect https://example.net/hello/test-page/
2 200 Success http://example.net/hello/test-page/
And with Facebook's link debugger:
Response Code
200
Fetched URL
https://example.net/hello/test-page/
Canonical URL
http://example.net/hello/test-page/
Redirect Path
Input URL -> https://example.net/hello/test-page/
301 HTTP Redirect -> http://example.net/hello/test-page/
og:url Meta Tag -> https://example.net/hello/test-page/
So... I checked the source of the generated web page and I can confirm:
All OG and Twitter Card related META and LINK tags use "https", not "http". The canonical LINK tag also uses "https".
If I manually go to the "http" version of the URL in the browser, it redirects me immediately to the "https" version.
Can anyone explain why this might happen, and places where I should start to dig?
One last example: when I run this curl command in Terminal, it also seems fine:
curl -sLD - https://example.net/hello/test-page/ -o /dev/null -w '%{url_effective}'
HTTP/2 200
server: nginx
date: Sun, 23 Aug 2020 16:27:28 GMT
content-type: text/html; charset=UTF-8
strict-transport-security: max-age=31536000
vary: Accept-Encoding
last-modified: Sun, 23 Aug 2020 16:23:11 GMT
cache-control: max-age=43, must-revalidate
x-nananana: Batcache-Hit
vary: Cookie
x-ac: 2.ord _atomic_dca
(In case it's relevant: this is a WordPress site, but I have no unusual plugins running.)
The 'issue' was with the hosting provider, in this case Pressable, and arose because we used Let's Encrypt for SSL certs.
Here are the relevant parts from the support chat:
The reason this occurs is that on WordPress.com, there weren't
always free SSL certificates via Let's Encrypt. So, in relation to
Facebook, for example, posts that displayed a Facebook like button
with counts would have a canonical URL at Facebook like
http://myblog.com/some/post. Turning on SSL and 301 redirecting the
http:// version of that page to https:// would wipe out those
like/share counts because Facebook sees the exact same URL with a
different protocol as two distinct things.
So, the implementation had to serve SSL certificates and force SSL for
the millions of sites that were now going to have a free Let’s Encrypt
SSL certificate, but also needed to give these crawlers a way to
access the content over http.
When doing the Pressable implementation of Let’s Encrypt, a lot of it
was copied from WordPress.com, and it was not incorrect to do so.
Pressable also didn’t always have free SSL certificates. If we started
to force the redirection of sites that were always http to https that
had “like” buttons and share counts, those counts would have been
lost.
Can't say I'm thrilled with the answer.
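Based on that explanation, one way to reproduce what the crawlers see is to replay a crawler User-Agent with curl (the Twitterbot string here is an assumption; substitute whichever crawler you are testing):
# Normal client: expect a 200 over https
curl -sI https://example.net/hello/test-page/ | head -1
# Crawler User-Agent: the host may answer with a 301 to the http version
curl -sI -A "Twitterbot/1.0" https://example.net/hello/test-page/ | grep -iE "^(HTTP|location)"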
I am using MSXML6 from VBScript-like code to download data over HTTP, but the server now requires connections to upgrade to HTTPS.
This is causing the xmlhttp object to fail with the error "msxml6.dll: Access is denied."
Set http = CreateObject("msxml2.xmlhttp.6.0")
http.open "GET", URL, False ' third argument is async; False = synchronous
http.send
Using a sniffing tool, the operation stops after receiving the redirection-to-https response, and the error is generated without further details.
Requesting http://host/doc.php (plain http), the returned headers look something like this:
HTTP/1.1 301 Moved Permanently
Date: Fri, 19 Jul 2019 23:59:30 GMT
Content-Type: text/html; charset=iso-8859-1
Transfer-Encoding: chunked
Connection: keep-alive
Location: https://host/doc.php
Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
Server: cloudflare
However, if the requested URL is already https, the operation completes normally without any complaint.
Is there anything I can do on the server side to convince xmlhttp to upgrade the connection to https peacefully?
Updating the code in the client application is out of the question as it is a legacy application with so many users out there using it, without an update mechanism.
Asking the users to update the URL by adding an "s" after "http" is workable but too much hassle, as reaching them to tell them is not easy at all.
Edit:
The conclusion is in this comment. The summary is that this is a client-side protection feature and it cannot be overridden from the server side.
As noted in "Xmlhttp request is raising an Access Denied error", the fix is to use the Server version of XMLHTTP, which isn't restricted to sites trusted by IE or bound by IE's security policies: XMLHTTP is designed for client-side use, whereas ServerXMLHTTP is specifically designed for server-side usage.
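A minimal sketch of that change, mirroring the original snippet with only the ProgID swapped (note the asker said the client code can't be updated, so this only helps where a client-side fix is possible):
Set http = CreateObject("MSXML2.ServerXMLHTTP.6.0") ' server-side variant, not bound by IE zone policy
http.open "GET", URL, False ' False = synchronous
http.send
If http.status = 200 Then
    data = http.responseText
End If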
Has anyone used Glassfish 3.1.2 with HTTP DIGEST authentication in anger?
I got it to work fine, or so I thought... until I discovered that its behavior was erratic...
it works maybe 9 out of 10 times, but fails to authenticate the 10th time.
This is when I test it with wget as a client on the same machine, with the same credentials and the same Java EE application (as it happens, a REST web service, but I also have the problem with other applications).
I ran wget locally.
My Glassfish machine is only servicing those wget requests; it isn't doing much else!
I've no reason to believe wget is misbehaving occasionally. I calculated the request digest by hand (from the wget HTTP debug) on one of the occasions that it failed, just to be sure. It seemed fine.
When I run wget with debug, I can see it failing the first time without credentials, then succeeding with credentials. However, 1 time in 10 or thereabouts it fails the 2nd time too (debug shown below).
[writing POST file request.xml ... done]
HTTP request sent, awaiting response...
---response begin---
HTTP/1.1 401 Unauthorized
X-Powered-By: Servlet/3.0 JSP/2.2 (GlassFish Server Open Source Edition
3.1.2 Java/Sun Microsystems Inc./1.6)
Server: GlassFish Server Open Source Edition 3.1.2
WWW-Authenticate: Digest realm="jdbc-realm",qop="auth",nonce="1377101691098:d07adb4a1421a265f3aa36bd99df7f6ef8c7a6e7887eb7d876e6b5ce079d1126",
opaque="C26EED99B0A8C0BCA16900215CCD241F"
Content-Type: text/html
Content-Length: 1069
Date: Wed, 21 Aug 2013 16:14:50 GMT
---response end---
401 Unauthorized
Skipping 1069 bytes of body: [<!DOCTYPE html P...
I set debug for javax.enterprise.system.core.security.level=FINE
I didn't see any error messages... but I did notice that for a "good" wget, "hasResourcePermission" was called 3 times: twice returning false and once returning true.
However, for the "bad" wget call, it was called only twice, returning false both times.
|FINE|glassfish3.1.2|javax.enterprise.system.core.security|_ThreadID=36;_ThreadName=Thread-2;
ClassName=com.sun.enterprise.security.web.integration.WebSecurityManager;
MethodName=hasResourcePermission;|[Web-Security] hasResource isGranted: false|#]
|FINE|glassfish3.1.2|javax.enterprise.system.core.security|_ThreadID=36;_ThreadName=Thread-2;
ClassName=com.sun.enterprise.security.web.integration.WebSecurityManager;
MethodName=hasResourcePermission;|[Web-Security] hasResource isGranted: false|#]
GOOD CASE ONLY
|FINE|glassfish3.1.2|javax.enterprise.system.core.security|_ThreadID=36;_ThreadName=Thread-2;
ClassName=com.sun.enterprise.security.web.integration.WebSecurityManager;
MethodName=hasResourcePermission;|[Web-Security] hasResource isGranted: true|#]
Any ideas, anyone? Is there more debug I could enable?
Thanks.
******************GLASSFISH DIGEST INSTRUCTIONS********
Install a mysql database with yum.
Follow these instructions (with some changes; the blog is for FORM authentication, so stop at step 4): http://jugojava.blogspot.ie/2011/02/jdbc-security-realm-with-glassfish-and.html
Create the mysql database "realm_db" with the tables in the above blog
Using the Glassfish console UI, I created a JDBC Connection Pool and JDBC Resource for mysql database.
In the Pool Additional Properties, add in your mysql database properties as shown in the blog
On the server-config, Security page, I set "Default Realm" to jdbc-realm
IMPORTANT: When creating the JDBC security realm, use a JAAS context of "jdbcDigestRealm" and a JNDI name of "jdbc/realm_db" (a sketch of this as an asadmin command follows these instructions).
I left these fields blank: Digest Algorithm, Encoding, Charset, Password, Encryption Algorithm, etc., and I put the passwords in the mysql database in clear text.
By the way, I used an up-to-date version of wget for testing because I read somewhere that older versions don't have proper RFC2617 DIGEST support. The version is 1.14 from Aug 12.
You need a driver file in $GLASSFISH_HOME/domains/domain1/lib; the file is called mysql-connector-java-3.1.13-bin.jar.
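For reference, a sketch of the realm creation as a single asadmin command (the table and column names are assumptions; use whatever schema you created from the blog):
asadmin create-auth-realm \
  --classname com.sun.enterprise.security.auth.realm.jdbc.JDBCRealm \
  --property "jaas-context=jdbcDigestRealm:datasource-jndi=jdbc/realm_db:user-table=usertable:user-name-column=username:password-column=password:group-table=grouptable:group-name-column=groupname" \
  jdbc-realm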
I'm looking at an existing Web Forms app that I didn't write. It works as expected in IE8 and FF, but fails in IE9 with:
"Internet Explorer cannot display the webpage"
The code is a simple handler that's doing a context.Response.Redirect.
Using Fiddler, I can see the 302 response, so everything looks fine.
Any ideas why IE9 behaves differently, or what I can do to fix it?
Edit due to request for code:
Sure, here's the line:
context.Response.Redirect("file:" & Filename.Replace("/", "\"))
Fiddler shows:
HTTP/1.1 302 Found
Date: Thu, 09 Aug 2012 19:01:24 GMT
Server: Microsoft-IIS/6.0
X-Powered-By: ASP.NET
X-AspNet-Version: 2.0.50727
Location: file:J:\Replay\Meetings\Meetings-2012.pdf
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Length: 254
<html><head><title>Object moved</title></head><body>
<h2>Object moved to here.</h2>
</body></html>
I'm asking just to make sure, but do you have the J:\Replay\Meetings\Meetings-2012.pdf file locally on your disk? The file:// protocol is used only to access local files. I suppose it's OK, as you wrote it works as expected in other browsers.
If so, I've read that this kind of error can be caused by an invalid URL to the file.
Try to redirect like this:
context.Response.Redirect("file://" & Filename)
Let me know if this helps.
This may be a zone elevation issue. Specifically, IE tries to prevent sites in one security zone from elevating themselves to another security zone. Redirecting to your local machine from outside your local machine is considered dangerous.
Possible fixes (I am not sure if these will work in IE9):
1. Add the site that triggers these redirections to the Trusted zone.
2. Change your security settings. Notice the "Websites in less privileged web content zone can navigate into this zone" setting (Internet Options -> Internet Zone -> Custom Level). You need to set that to "Enable" or "Prompt" for the "My Computer Zone." I suspect this can be done by either adding the "My Computer Zone" to your zones list ( http://support.microsoft.com/kb/315933 ) or by editing the "My Computer Zone" directly (via the registry). You may also need to add the HKCU\Software\Microsoft\Internet Explorer\Main\Disable_Local_Machine_Navigate key (set to 0 REG_DWORD); a sketch of that registry tweak follows.
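A sketch using reg.exe (this is the key mentioned above; apply to HKCU at your own risk):
reg add "HKCU\Software\Microsoft\Internet Explorer\Main" /v Disable_Local_Machine_Navigate /t REG_DWORD /d 0 /f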