Qt WebView - intercept loading of JS/CSS libraries to load local ones

I've been looking through the documentation for a while to find a way to accomplish this and haven't been successful yet. The basic idea is that I have a piece of HTML that I load through Qt's web view. The same content can be exported to a single HTML file.
This file uses libraries such as Bootstrap and jQuery. Currently I load them through a CDN, which works just fine when online. However, my application also needs to run offline. So I'm looking for a way to intercept the loading of these libraries in Qt and serve a locally saved file instead. I've tried installing an https QWebEngineUrlSchemeHandler, but that never seems to trigger its requestStarted method.
(PyQt example follows)
QWebEngineProfile.defaultProfile().installUrlSchemeHandler(b'https', self)
If I use a different string for the scheme and embed that into the page, it works, so my assumption is that https doesn't work because Qt already has a default handler registered for it. But a different scheme would break the file export.
Anyway, back to the core question: is there a way to intercept the loading of libraries, or to change the URL scheme specifically within Qt only?
I got further with QWebEngineUrlRequestInterceptor: I'm now redirecting https requests to my own URI, which has a URI handler. However, the request never gets through to it, because: Redirect location 'conapp://webresource/bootstrap.min.css' has a disallowed scheme for cross-origin requests.
How do I whitelist my own conapp URI scheme?
Edit: For completeness' sake: it turns out that back when I originally asked the question, this was impossible to accomplish with PySide 5.11 due to bugs in it. The bug I reported back then is now flagged as fixed (in 5.12.1, I believe), so it should be possible to accomplish this again using Qt methods. For my own project, however, I'll stick with Jinja for now, which has become a solution for many other problems.
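Edit 2: For those on newer Qt versions, the cross-origin error above should be addressable by registering the custom scheme up front. A minimal, untested sketch, assuming PyQt5 with Qt 5.14 or later (the CorsEnabled flag does not exist before 5.14); conapp is the scheme from above:

from PyQt5.QtWebEngineCore import QWebEngineUrlScheme

# Must run before QApplication is constructed, otherwise registration is ignored.
scheme = QWebEngineUrlScheme(b'conapp')
scheme.setSyntax(QWebEngineUrlScheme.Syntax.Host)
# SecureScheme treats conapp as a secure context; CorsEnabled allows it
# as a redirect target for cross-origin requests.
scheme.setFlags(QWebEngineUrlScheme.SecureScheme | QWebEngineUrlScheme.CorsEnabled)
QWebEngineUrlScheme.registerScheme(scheme)

The interceptor and scheme handler themselves would stay unchanged.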

The following example shows how I've done it. It uses a QWebEngineUrlRequestInterceptor to redirect content to a local server.
As an example, I intercept stacks.css from Stack Overflow and make an obvious change.
import requests
import sys
import threading
from PyQt5 import QtWidgets, QtCore
from PyQt5.QtWebEngineWidgets import QWebEngineView, QWebEnginePage, QWebEngineProfile
from PyQt5.QtWebEngineCore import QWebEngineUrlRequestInterceptor, QWebEngineUrlRequestInfo
from http.server import HTTPServer, SimpleHTTPRequestHandler
from socketserver import ThreadingMixIn

# Set these to the address you want your local patch server to run on
HOST = '127.0.0.1'
PORT = 1235

class WebEngineUrlRequestInterceptor(QWebEngineUrlRequestInterceptor):
    def patch_css(self, url):
        print('patching', url)
        r = requests.get(url)
        new_css = r.text + '#mainbar {background-color: cyan;}'  # Example of some CSS change
        with open('local_stacks.css', 'w') as outfile:
            outfile.write(new_css)

    def interceptRequest(self, info: QWebEngineUrlRequestInfo):
        url = info.requestUrl().url()
        if url == "https://cdn.sstatic.net/Shared/stacks.css?v=596945d5421b":
            self.patch_css(url)
            print('Using local file for', url)
            info.redirect(QtCore.QUrl('http://{}:{}/local_stacks.css'.format(HOST, PORT)))

class ThreadingHTTPServer(ThreadingMixIn, HTTPServer):
    """Threaded HTTPServer"""

app = QtWidgets.QApplication(sys.argv)

# Start up a thread to serve the patched content
server = ThreadingHTTPServer((HOST, PORT), SimpleHTTPRequestHandler)
server_thread = threading.Thread(target=server.serve_forever)
server_thread.daemon = True
server_thread.start()

# Install an interceptor to redirect to the patched content
interceptor = WebEngineUrlRequestInterceptor()
profile = QWebEngineProfile.defaultProfile()
profile.setRequestInterceptor(interceptor)

w = QWebEngineView()
w.load(QtCore.QUrl('https://stackoverflow.com'))
w.show()
app.exec_()

So, the solution I went with in the end was to introduce Jinja templates. The template sets variables and blocks depending on whether the content is being exported or used internally, and with that in place I no longer needed the interceptor at all.
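To sketch what that looks like (a minimal example with made-up file names, not my actual project layout): the template switches between CDN and bundled resources based on a flag passed in at render time.

from jinja2 import Template

# 'offline' and the resource paths are illustrative placeholders.
template = Template("""
<head>
{% if offline %}
  <link rel="stylesheet" href="local/bootstrap.min.css">
{% else %}
  <link rel="stylesheet" href="https://cdn.example.com/bootstrap.min.css">
{% endif %}
</head>
""")

html_for_webview = template.render(offline=True)    # embedded view uses local files
html_for_export = template.render(offline=False)    # exported file keeps CDN links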

Related

How to force all calls to Python's requests.get to use a proxy by default?

I am using a third-party library (ADAL) in my code to get an access token. This library makes a lot of calls to requests.get and requests.post. How can I force all the calls to use user-provided proxies without having to modify each call into requests.get('http://example.com', proxies=proxies)?
I cannot do export HTTP_PROXY; I have to do it from within my script.
You could monkey patch requests.
At the very start of your script:
import requests
import functools

orig_get = requests.get
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
requests.get = functools.partial(orig_get, proxies=proxies)
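Since the question mentions requests.post as well, it can be patched the same way:

# Any other requests helper (post, put, ...) can get the same treatment.
orig_post = requests.post
requests.post = functools.partial(orig_post, proxies=proxies)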

Set proxy to hide my IP address when scraping web pages using scrapy

I am using scrapy to crawl a website, and now I need to set a proxy to handle the requests being sent. Can anyone help me set a proxy in a scrapy app? Please give a sample link too if you have one. Also, I need a way to tell which IP these requests go out from.
You can do it with the code below, found here:
1 – Create a new file called middlewares.py in your scrapy project and add the following code to it.
# Importing the base64 library because we'll need it ONLY
# in case the proxy we are going to use requires authentication
import base64

# Start your middleware class
class ProxyMiddleware(object):
    # overwrite process_request
    def process_request(self, request, spider):
        # Set the location of the proxy
        request.meta['proxy'] = "http://YOUR_PROXY_IP:PORT"

        # Use the following lines if your proxy requires authentication
        proxy_user_pass = "USERNAME:PASSWORD"
        # set up basic authentication for the proxy
        encoded_user_pass = base64.b64encode(proxy_user_pass.encode()).decode()
        request.headers['Proxy-Authorization'] = 'Basic ' + encoded_user_pass
2 – Open your project’s configuration file (./project_name/settings.py) and add the following code
DOWNLOADER_MIDDLEWARES = {
    'scrapy.contrib.downloadermiddleware.httpproxy.HttpProxyMiddleware': 110,
    'project_name.middlewares.ProxyMiddleware': 100,
}
Also, you can use multiple proxies with scrapy. More information can be found here.
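If you are on a recent Scrapy release, note that the scrapy.contrib path above is from older versions and has since been removed; the built-in proxy middleware now lives under scrapy.downloadermiddlewares, so the settings entry would look like this:

# settings.py for current Scrapy versions, where scrapy.contrib.* no longer exists
DOWNLOADER_MIDDLEWARES = {
    'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware': 110,
    'project_name.middlewares.ProxyMiddleware': 100,
}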

Basic HTTP Authentication with python 3.2 (urllib.request)

This is my first post with this account, and I've been struggling for the last week to get this to work, so I hope someone can help me.
I'm trying to pull some data from https://api.connect2field.com/ but it's rejecting all of my authentication attempts from Python (not from a browser, though).
The code I'm using:
import urllib.request as url
import urllib.error as urlerror

urlp = 'https://api.connect2field.com/api/Login.aspx'

# Create an OpenerDirector with support for Basic HTTP Authentication...
auth_handler = url.HTTPBasicAuthHandler()
auth_handler.add_password(realm='Connect2Field API',
                          uri=urlp,
                          user='*****',
                          passwd='*****')
opener = url.build_opener(auth_handler)
# ...and install it globally so it can be used with urlopen.
url.install_opener(opener)

try:
    f = url.urlopen(urlp)
    print(f.read())
except urlerror.HTTPError as e:
    if hasattr(e, 'code'):
        if e.code != 401:
            print('We got another error')
            print(e.code)
        else:
            print(e.headers)
I'm pretty sure the code is doing everything right, which makes me think that maybe there's another authentication step that ASP.NET requires. Does anybody have any experience with ASP.NET's authentication protocol?
I'm going to be checking this post throughout the day, so I can post more info if required.
Edit: I've also tried running my script against a basic HTTP auth server running at home, and it authenticates, so I'm pretty sure the request is set up properly.
While IIS appears to be set up to do Basic authentication, ASP.NET will most probably be configured to use Windows authentication.
Since you've said that authentication works via the browser, your best bet is to use a tool such as Fiddler to capture the request/response when connecting via the browser and again when connecting via your code, then compare the two to troubleshoot the issue.
For example, I remember a case where the web site first requested authentication credentials and then redirected to a different URL which prompted for different credentials.
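One thing worth ruling out: urllib's HTTPBasicAuthHandler only sends credentials after the server responds 401 with a WWW-Authenticate challenge. If the ASP.NET page never issues that challenge, attaching the Authorization header preemptively may help. A minimal sketch, reusing the URL from the question:

import base64
import urllib.request as url

urlp = 'https://api.connect2field.com/api/Login.aspx'

# Build the Basic credentials by hand and send them up front,
# so no 401 challenge from the server is required.
credentials = base64.b64encode(b'USERNAME:PASSWORD').decode('ascii')
req = url.Request(urlp)
req.add_header('Authorization', 'Basic ' + credentials)
f = url.urlopen(req)
print(f.read())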

Is there a way to add detailed remote crash reporting to a Flex Air application?

I will be releasing my AIR/Flex application soon, but I am pretty sure there are a couple of bugs that may pop up on the various platforms AIR is available for. So I was wondering if there is a way to implement a mechanism that would send an error report, logging where the error happened, to a remote server each time the app crashes? This way I might catch errors that would otherwise go unnoticed.
Global error handling is now supported in Flash 10 and AIR 2. More info on that here: http://help.adobe.com/en_US/air/reference/html/flash/events/UncaughtErrorEvent.html
Using that kind of functionality to catch uncaught exceptions, you can submit the trace to a web service set up specifically to grab them. Google App Engine is excellent for this purpose since it already has a logging feature which grabs all kinds of metadata from the client calling the application. Also, if your logs become huge for some reason, at least you won't have to worry about storing them. Google does that for you :)
I've set up such a service as outlined below (granted, it has some flaws; in particular, anyone can call it and add "traces", but you could add a shared secret and post over HTTPS to have some tiny measure of security).
App Engine Logging Service
#!/usr/bin/env python
import logging

from google.appengine.ext import webapp
from google.appengine.ext.webapp import util

class MainHandler(webapp.RequestHandler):
    def post(self):
        if self.request.get('trace'):
            logging.error(self.request.get('trace'))  # Adds a row to GAE's own logs :)
            self.response.out.write('trace logged')
        else:
            self.response.set_status(501)

    def get(self):
        """ Kill this function when done testing """
        test_form = """
        <form action="/" method="POST">
            <textarea name="trace"></textarea>
            <input type="submit">
        </form>"""
        self.response.out.write(test_form)

def main():
    application = webapp.WSGIApplication([('/', MainHandler)], debug=False)
    util.run_wsgi_app(application)

if __name__ == '__main__':
    main()
I wrote a little AIR app containing this little test function, which simply POSTs to the App Engine service with the "trace" parameter specified.
Posting to the logging service (ActionScript)
// Imports assumed for this snippet (Flex SDK); onSuccess/onError are
// handlers defined elsewhere in the application.
import flash.events.MouseEvent;
import mx.rpc.http.HTTPService;
import mx.messaging.messages.HTTPRequestMessage;

private function postToLogger(event:MouseEvent):void
{
    var service:HTTPService = new HTTPService();
    var parameters:Object = {'trace': "omg something went wrong"};
    service.url = "https://YOURSUPERSIMPLELOGGINGSERVICE.APPSPOT.COM";
    service.method = HTTPRequestMessage.POST_METHOD;
    service.resultFormat = HTTPService.RESULT_FORMAT_E4X;
    service.addEventListener("result", onSuccess);
    service.addEventListener("fault", onError);
    service.send(parameters);
}
And finally, this is how it looks in the logs: lots of metadata, plus the trace you caught in your AIR app.

Slipping podcasts through a filter

My workplace filters our internet traffic by forcing us to go through a proxy, and unfortunately sites such as IT Conversations and Libsyn are blocked. However, mp3 files in general are not filtered if they come from sites not on the proxy's blacklist.
So is there a website somewhere that will let me give it a URL and then download the MP3 at that URL and send it my way, thus slipping through the proxy?
Alternatively, is there some other easy way for me to get the mp3 files for these podcasts from work?
EDIT and UPDATE: Since I've gotten downvoted a few times, perhaps I should explain/justify my situation. I'm a contractor working at a government facility, and we use some commercial filtering software which is very aggressive and overzealous. My boss is fine with me listening to podcasts at work and is fine with me circumventing the proxy filtering, and doesn't want to deal with the significant red tape (it's the government after all) associated with getting the IT department to make an exception for IT Conversations or the Java Posse, etc. So I feel that this is an important and relevant question for programmers.
Unfortunately, all of the proxy websites for bypassing web filters have also been blocked, so I may have to download the podcasts I like at home in advance and then bring them into work. If anyone can tell me about a lesser-known service I can try which might not be blocked, I'd appreciate it.
Can you SSH out? SSH Tunnels are your friend!
Why not subscribe at home and have your favorite podcasts copied to your mp3 player or a USB drive, and just take it to work with you each day and back home in the evening? Then you can listen without circumventing your client's network.
There are many other development/.NET/technology podcasts; try one of those. For the blocked sites, try an anonymous proxy site; there are plenty out there.
Since this is work related material, I would recommend opening up a request that the sites in question not be blocked.
I ended up writing an extremely dumb and simple CGI script and hosting it on my web server, with a script on my work computer to get at it. Here's the CGI script:
#!/usr/local/bin/python
import cgitb; cgitb.enable()
import cgi
from urllib2 import urlopen

def tohex(data):
    return "".join(hex(ord(char))[2:].rjust(2, "0") for char in data)

def fromhex(encoded):
    data = ""
    while encoded:
        data += chr(int(encoded[:2], 16))
        encoded = encoded[2:]
    return data

if __name__ == "__main__":
    print("Content-type: text/plain")
    print("")
    url = fromhex(cgi.FieldStorage()["target"].value)
    contents = urlopen(url).read()
    # Emit the payload as hex text, 40 bytes (80 hex characters) per line
    for i in range(len(contents) / 40 + 1):
        print(tohex(contents[40 * i:40 * i + 40]))
and here's the client script used to download the podcasts:
#!/usr/bin/env python2.6
import os
from sys import argv
from urllib2 import build_opener, ProxyHandler

# Detach into the background so the shell returns immediately
if os.fork():
    exit()

def tohex(data):
    return "".join(hex(ord(char))[2:].rjust(2, "0") for char in data)

def fromhex(encoded):
    data = ""
    while encoded:
        data += chr(int(encoded[:2], 16))
        encoded = encoded[2:]
    return data

if __name__ == "__main__":
    if len(argv) < 2:
        print("usage: %s URL [FILENAME]" % argv[0])
        quit()
    os.chdir("/home/courtwright/mp3s")
    url = "http://example.com/cgi-bin/hex.py?target=%s" % tohex(argv[1])
    fname = argv[2] if len(argv) > 2 else argv[1].split("/")[-1]
    with open(fname, "wb") as dest:
        for line in build_opener(ProxyHandler({"http": "proxy.example.com:8080"})).open(url):
            dest.write(fromhex(line.strip()))
            dest.flush()
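The reason this slips through is that the CGI script returns the file as plain hex text, so the filter never sees an mp3 content type; the client script simply reverses the encoding and writes the bytes back out. Assuming the client script is saved as, say, fetch.py (the name is arbitrary), a typical invocation would be fetch.py http://example.com/podcast/episode.mp3, with an optional second argument for the output filename.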
