Simulating a remote website locally for testing - http

I am developing a browser extension. The extension works on external websites we have no control over.
I would like to be able to test the extension. One of the major problems I'm facing is displaying a website 'as-is' locally.
Is it possible to display a website 'as-is' locally?
I want to be able to serve the website exactly as-is locally for testing. This means I want to simulate the exact same HTTP data, including iframe ads, etc.
Is there an easy way to do this?
More info:
I'd like my system to act as closely to the remote website as possible. I'd like to run a command (a fetch, for example) that would then allow me to go to the site in my browser (with the internet off) and get the exact same thing I would get otherwise (including content that does not come from a single domain: Google ads, etc.).
I don't mind using a virtual machine if this helps.
I figure this would be quite useful in testing, especially when I have a bug I need to reproduce reliably on sites that have many random factors (which ads show, etc.).

As was already mentioned, caching proxies should do the trick for you (and, by the way, this is the simplest solution). There are quite a lot of different implementations, so you just need to spend some time selecting a proper one (in my experience Squid is a good choice). Anyway, I would like to highlight two other interesting options:
Option 1: Betamax
Betamax is a tool for mocking external HTTP resources such as web services and REST APIs in your tests. The project was inspired by the VCR library for Ruby. Betamax works by intercepting HTTP connections initiated by your application and replaying previously recorded responses.
Betamax comes in two flavors. The first is an HTTP and HTTPS proxy that can intercept traffic made in any way that respects Java’s http.proxyHost and http.proxyPort system properties. The second is a simple wrapper for Apache HttpClient.
BTW, Betamax has a very interesting feature for you:
Betamax is a testing tool and not a spec-compliant HTTP proxy. It ignores any and all headers that would normally be used to prevent a proxy caching or storing HTTP traffic.
Option 2: Wireshark and replay proxy
Grab all the traffic you are interested in using Wireshark and replay it. I would say it is not that hard to implement the required replay tool yourself, but you can also use an available solution called replayproxy, which:
parses HTTP streams from .pcap files,
opens a TCP socket on port 3128 and listens as an HTTP proxy, using the extracted HTTP responses as a cache while refusing all requests for unknown URLs.
This approach gives you full control and a bit-for-bit precise simulation.
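
To make the replay approach concrete, here is a toy sketch of the same principle in Go. This is not replayproxy itself: the recording is hard-coded instead of being parsed from a .pcap file, and the URL and response below are made up for illustration.

// replayproxy-sketch.go: serve canned responses as an HTTP proxy,
// refusing anything that was never recorded (plain HTTP only).
package main

import (
    "fmt"
    "log"
    "net/http"
)

// recorded maps "METHOD absolute-URL" to a canned response body.
// A real tool would load these from a capture file.
var recorded = map[string]string{
    "GET http://example.com/": "<html><body>recorded copy</body></html>",
}

func main() {
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        // As a forward proxy we receive the absolute URL in r.URL.
        key := r.Method + " " + r.URL.String()
        body, ok := recorded[key]
        if !ok {
            // Like replayproxy, refuse requests for unknown URLs.
            http.Error(w, "no recording for "+key, http.StatusBadGateway)
            return
        }
        fmt.Fprint(w, body)
    })
    // replayproxy listens on 3128, so this sketch does too.
    log.Fatal(http.ListenAndServe(":3128", nil))
}

Configure the browser to use 127.0.0.1:3128 as its HTTP proxy and it will only ever see the canned responses.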

I don't know if there is an easy way, but there is a way.
You can set up a local webserver, something like IIS, Apache, or minihttpd.
Then you can grab the website contents using wget (it has an option for mirroring). Many browsers also have a "save complete web page" option that will grab everything, like images.
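
For example, a typical mirroring invocation looks something like this (the domains are placeholders; --span-hosts and --domains are only needed if you also want assets pulled from other hosts):

wget --mirror --page-requisites --convert-links --adjust-extension \
     --span-hosts --domains example.com,ads.example.net \
     http://example.com/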
Ads will most likely come from remote sites, so you may have to manually edit those lines in the HTML to either not reference the actual ad-servers, or set up a mock ad yourself (like a banner image).
Then you can navigate your browser to http://localhost to visit your local website, assuming port 80 which is the default.
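
If you'd rather skip configuring IIS or Apache, a few lines of Go can serve the mirrored tree. A minimal sketch, assuming the wget mirror ended up in ./example.com:

// serve.go: static file server for the mirrored site.
package main

import (
    "log"
    "net/http"
)

func main() {
    // Serve the mirror at http://localhost/ (binding port 80 usually
    // needs admin rights; use ":8080" if that is a problem).
    http.Handle("/", http.FileServer(http.Dir("./example.com")))
    log.Fatal(http.ListenAndServe(":80", nil))
}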
Hope this helps!

I assume you want to serve a remote site that's not under your control. In that case you can use a proxy server and have that server cache every response aggressively. However, this has its limits: first, you will have to visit every page you intend to use through this proxy (with a browser, for example); second, you will not be able to emulate form processing.
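
For example, with Squid you can combine aggressive caching with its offline mode. A minimal squid.conf sketch of the idea (the cache path and sizes are placeholders, and exact refresh_pattern option support varies between Squid versions):

# Listen as an HTTP proxy and cache objects to disk.
http_port 3128
cache_dir ufs /var/spool/squid 10000 16 256
# Never try to revalidate cached objects; keep serving them while offline.
offline_mode on
# Cache everything for up to a year, overriding anti-caching headers.
refresh_pattern . 525600 100% 525600 override-expire override-lastmod ignore-reload ignore-no-store ignore-private

Browse the site once through the proxy while online; afterwards Squid will answer the same requests from its cache.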
Alternatively you could use a spider to download all the content of a certain website. Depending on the spider software, it may even be able to follow links built by JavaScript. You can then use a webserver to serve that content.

This service, http://www.json-gen.com, provides mocks for HTML, JSON and XML via REST. That way you can test your frontend separately from the backend.

Related

Serving two websites written in Google Go within a single VM

I have a VM from Digital Ocean. It currently has two domains linked to the VM.
I do not use any web server other than Go's built-in net/http package. Performance-wise I like it, and I feel like I have full control over it.
Currently I am using a single Go program that has multiple websites built in.
// Patterns that begin with a host name only match requests for that host.
http.HandleFunc("test.com/", serveTest)
http.HandleFunc("123.com/", serve123)
// A bare "/" pattern is the fallback for requests to any other host.
http.HandleFunc("/", serve123)
As they are websites, the Go program listens on port 80.
The problem is that when I want to update only one website, I have to recompile and restart the whole thing, since both sites are built into the same binary.
1) Is there a way to make it hot-swappable with Go alone (without Nginx or Apache)?
2) What would be a standard best practice?
Thank you so much!
Well, you can do hot-swapping in Go, but I really wouldn't want to do that unless really necessary, as the added complexity isn't negligible (and I'm not talking about code).
You can get something close with a kind of proxy that sits in front of the program and does a graceful swap whenever your binary changes: the principle is to have the binary on one port and the proxy on another. When a new binary is ready, you run it on a spare port, make the proxy redirect to the new port, then gracefully shut down the old one.
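
A minimal sketch of such a proxy in Go (the ports and the /swap admin endpoint are made up for illustration; a production version would also want validation and graceful connection draining):

// swapproxy.go: public traffic on :80 is forwarded to whichever
// backend is currently active; an admin endpoint switches backends.
package main

import (
    "log"
    "net/http"
    "net/http/httputil"
    "net/url"
    "sync/atomic"
)

var backend atomic.Value // current backend base URL, as a string

func main() {
    backend.Store("http://127.0.0.1:8081") // the old binary runs here

    proxy := &httputil.ReverseProxy{
        // Rewrite each incoming request to target the active backend.
        Director: func(r *http.Request) {
            target, _ := url.Parse(backend.Load().(string))
            r.URL.Scheme = target.Scheme
            r.URL.Host = target.Host
        },
    }

    // Admin endpoint (no validation here, illustration only):
    //   curl 'http://127.0.0.1:9090/swap?to=http://127.0.0.1:8082'
    go http.ListenAndServe("127.0.0.1:9090", http.HandlerFunc(
        func(w http.ResponseWriter, r *http.Request) {
            backend.Store(r.URL.Query().Get("to"))
        }))

    log.Fatal(http.ListenAndServe(":80", proxy))
}

The workflow is then: start the new binary on a spare port, hit /swap, and stop the old process once it has drained.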
There was a tool for that in Go that I can't remember the name of…
EDIT: not the one I had in mind, but a close call: https://github.com/rcrowley/goagain
Personal advice: use a reverse proxy for that, it's much simpler to do. My personal setup is to use h2o to terminate SSL, HTTP/2, etc., and send the requests to the various websites running in the background. Not only Go ones, though, but also PHP ones, a GitLab instance, etc. It's much more flexible, and the performance penalty of the proxy is small…

Web browser as web server

Sorry if this is a dumb question that's already been asked, but I don't even know what terms to best search for.
I have a situation where a cloud app would deliver a SPA (single page app) to a client web browser. Multiple clients would connect at once and would all work within the same network. An example would be an app a business uses to work together - all within the same physical space (all on the same network).
A concern is that the internet connection could be spotty. I know I can store the client changes locally and then push them all to the server once the connection is restored. The problem, however, is that some of the clients (display systems) will need to show up-to-date data from other clients (mobile input systems). Even the internet going down for a minute or two would be unacceptable.
My current line of thinking is that the local network would need some kind of "ThinServer" that all the clients would connect to. This ThinServer would then work as a proxy for the main cloud server. If the internet breaks then the ThinServer would take over the job of syncing data. Since all the clients would be full SPAs the only thing moving around would be the data - so the ThinServer would really just need to sync DB info (it probably wouldn't need to host the full SPA - though, that wouldn't be a bad thing).
However, a full dedicated server is obviously a big hurdle for most companies to setup.
So the question is, is there any kind of tech that would allow a web page to act as a web server? Could a business be instructed to go to thinserver.coolapp.com in a browser on any one of their machines? This "webpage" would then say, "All clients in this network should connect to 192.168.1.74:2000" (which would be the IP:port of the machine running this page). All the clients would then connect to this new "server" and that server would act as a data coordinator if the internet ever went down.
In other words, I really don't like the idea of a complicated server setup. A simple URL to start the service would be all that is needed.
I suppose the only option might be a binary program that would need to be installed? It's not an ideal solution, but perhaps the only one? If so, are there any programs out there that are single-click web servers? I've tried MAMP, LAMP, etc., but all of them are designed for the developer. Any others that are more streamlined?
Thanks for any ideas!
There are a couple of fundamental ways you can approach this. The first is to host a server in a browser as you suggest. Some example projects:
http://www.peer-server.com
https://addons.mozilla.org/en-US/firefox/addon/browser-server/
Another is to use WebRTC peer-to-peer communication to allow the browsers to share information with each other (you could have them all share data, or have one act as a 'master', etc., depending on the architecture you want). It's likely not going to be that different under the skin, but your application design may be better suited to a more 'peer to peer' model or a more 'client server' one, depending on what you need. An example 'peer to peer' project:
https://developer.mozilla.org/en-US/docs/Web/Guide/API/WebRTC/Peer-to-peer_communications_with_WebRTC
I have not used any of the above personally, but I would say, from using similar browser extension mechanisms in the past, that you need to check the browser requirements before you decide whether they can do what you want. The first one above is Chrome based (I believe) and the second one is Firefox based. The peer-to-peer one contains a list of compatible browser functions, but is effectively Firefox and Chrome based also (see the table in the link). If you are in an environment where you can dictate the browser type, plugins, etc., then this may be OK for you.
The concept (peer-to-peer web servers) is definitely very interesting, and it is great if you have the time to explore it. However, if you have an immediate business requirement, a simple on-site server based approach may actually be more reliable, support a wider variety of browsers and be easier to maintain (as the skills required are quite commonly available).
BTW, I should have said - 'WebRTC' is probably a good search term for you, in answer to the first line of your question.
httprelay.io vs. WebRTC
Pros:
Simple to use
Fast
Supported by all browsers and HTTP clients
Works even over unstable networks
Open source and cross-platform
Cons:
Need to run a server instance
No data streaming is supported (yet)

HTTP requests trace

Are there any tools to trace the exact HTTP requests sent by a program?
I have an application which works as a client to a website and facilitates certain tasks (in particular, it's a bot which makes automatic offers on a social lending website, based on some predefined criteria), and I'm interested in monitoring the actual HTTP requests it makes.
Any tutorials on the topic?
Some popular protocol/network sniffers are:
Wireshark (formerly the famous Ethereal)
Nirsoft SmartSniff (using WinPcap)
Nirsoft SocketSniff (allows you to watch the WinSock activity of a selected process and the content of each send or receive call, in ASCII mode or as a hex dump)
Microsoft's Network Monitor (and a list of video tutorials here; note the video 'Advanced Filtering 2 of 2', where they specifically filter on process)
The Wikipedia article 'Comparison of packet analyzers' has a nice overview of some other tools too.
Alternatively you could also look into (man-in-the-middle) proxy tools like:
Fiddler
mitmproxy
Both of the above can actually record/decrypt/modify/replay HTTPS too! You'd need to point the application you are monitoring at this proxy. If nothing else uses that proxy, the log will be application/process specific, and another upside to this approach is that you can run the monitor/logger on a different machine.
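
If you control the client's code, you can also point its HTTP client at the proxy explicitly. A sketch in Go, assuming Fiddler's default address of 127.0.0.1:8888 (mitmproxy defaults to port 8080; for HTTPS you additionally need to trust the proxy's root certificate):

// proxyclient.go: route one program's requests through a local
// man-in-the-middle proxy so they show up in its log.
package main

import (
    "fmt"
    "net/http"
    "net/url"
)

func main() {
    proxyURL, _ := url.Parse("http://127.0.0.1:8888") // Fiddler's default
    client := &http.Client{
        Transport: &http.Transport{Proxy: http.ProxyURL(proxyURL)},
    }
    resp, err := client.Get("https://example.com/")
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    fmt.Println(resp.Status) // the request now appears in the proxy
}

For programs you cannot modify, many HTTP stacks also honour the HTTP_PROXY/HTTPS_PROXY environment variables or the system proxy settings.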
Once you choose a tool, you can easily google a tutorial to go along with it.
However, the core idea is usually the same: one sets a filter (on the capture itself, or on the display of captured data) on things like protocol, network/MAC address, port number, etc. In Wireshark, for example, a display filter such as http && ip.addr == 192.168.1.10 shows only the HTTP traffic of a single host. Depending on the tool, some can also filter on the local application.
Hope this helps!
Take a look at HTTP Toolkit (disclaimer: it's my project).
Totally automatic HTTP & HTTPS interception, with zero setup, isolated to just the code you want to debug.
You can open a browser with it, and see all the traffic from that one window immediately (but no others), or run a terminal and automatically see all traffic only from processes started from that terminal. Built-in HTTPS decryption for everything, with no risky system-wide certificates and no manual setup. Let me know what you think!

Proxys for WebDAV

I'd like to set up a reverse proxy for my WebDAV server. The main reason for this is so that I can better control which files are being uploaded to the WebDAV server. I cannot do this at the WebDAV server itself; it's a service provided by Alfresco and I have no idea whether it's possible to configure the WebDAV service at all.
In particular I'd like to prevent my Mac from doing the AppleDouble thingy on the WebDAV server, i.e. stop it from uploading ._* files for every real file I upload. There is, as far as I know, no way to stop my Mac from attempting this.
Does the proxy server need to do more than merely relay HTTP requests back and forth? Does it also need to know something about WebDAV for this to work?
Which proxy servers could you recommend for this?
Günther
Unless I'm missing something, a reverse proxy will have to rewrite header fields (such as Destination: and If:) to work properly, and potentially even request/response bodies, and thus is unlikely to work well.
A "proper" proxy shouldn't get in the way, though.
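
That said, here is a sketch in Go of what such a filtering reverse proxy could look like, including the Destination: rewriting mentioned above. The backend URL is a placeholder, and note that flatly rejecting ._* uploads may upset the Mac client, which is exactly what the SabreDAV answer below works around:

// davfilter.go: reverse proxy that refuses AppleDouble uploads and
// rewrites the Destination header used by WebDAV COPY/MOVE requests.
package main

import (
    "log"
    "net/http"
    "net/http/httputil"
    "net/url"
    "path"
    "strings"
)

func main() {
    backend, _ := url.Parse("http://alfresco.internal:8080") // placeholder
    proxy := httputil.NewSingleHostReverseProxy(backend)

    handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        // Refuse to create AppleDouble ("._*") resource forks.
        if r.Method == "PUT" && strings.HasPrefix(path.Base(r.URL.Path), "._") {
            http.Error(w, "AppleDouble files are not accepted", http.StatusForbidden)
            return
        }
        // Point Destination: at the backend so COPY/MOVE keep working.
        if dst := r.Header.Get("Destination"); dst != "" {
            if u, err := url.Parse(dst); err == nil {
                u.Scheme = backend.Scheme
                u.Host = backend.Host
                r.Header.Set("Destination", u.String())
            }
        }
        proxy.ServeHTTP(w, r)
    })

    log.Fatal(http.ListenAndServe(":8080", handler))
}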
You could do this with SabreDAV. It has a TemporaryFileFilter plugin that does exactly what you need. Not only does it intercept these resource forks, it also places them in a temporary 'quarantine'. This is important, because OS X will check whether the file was successfully written and fail horribly otherwise.
There will be two things you still need to do to make this work though:
Automatic cleanup of these files (a script suitable for cron is also supplied).
The actual proxy bit. This means you'll have to implement a Collection and a File class that perform the HTTP requests.
Disclaimer: I authored SabreDAV

HTTP Tools for analysis and capture of requests/response

I am looking for tools that can be used for debugging web applications. I have narrowed my search to the following tools:
HttpWatch
Fiddler
ieHTTPHeaders
Live HTTP Headers
It would be great if some of you with experience of these tools could discuss their pros and cons (features that you like, or that you think are missing from some tools but present in others). I am mainly torn between HttpWatch and Fiddler; I would prefer Fiddler (being free) if it can fulfill all or most of HttpWatch's features (however, I am ready to pay for HttpWatch if it's worth it).
P.S. I know HttpWatch and Fiddler are far more powerful than the other two tools (let me know if you disagree).
I am sure most of you would want more details as to what exactly I intend to do with these tools; however, I would like you to compare them from a broader perspective, as tools in general.
** Disclaimer: Posted by Simtec Limited **
Here's a list of the main advantages of HttpWatch (our product) and Fiddler. Of course we're biased, but we've tried to be objective:
HttpWatch Advantages
Shows requests that were read from the browser cache without going onto the network
Shows page level events, e.g. Render Start, DOM Load, etc
Handles SSL traffic without certificate warnings or requiring changes to trusted root CAs
Reduces 'observer effect' by not requiring HTTP proxy at network level
Groups requests by page
Fiddler Advantages
Works with almost any HTTP client, not just Firefox and IE
Can intercept traffic from clients on non-Windows platforms, e.g. mobile devices
Requests can be intercepted and modified on the fly, e.g. change cookie value
Supports plugins to add extra functionality
Wireshark works at the network layer and of course gives you more information than the other tools mentioned here. However, if you want to debug web applications by breaking on requests/responses, modifying them and replaying, Fiddler is the tool for you!
Fiddler cannot show TCP-level information, however; in such cases you will need Network Monitor or Wireshark.
If you specify what exactly you want to do with the 'debugger', one can suggest what's more appropriate for the job.
Fiddler is good and simple to use. Wireshark is also worth considering, since it gives a lot of extra information.
You could also use Wireshark which allows you to analyze many protocols including TCP/IP.
A lab exercise from a University lecture on using Wireshark to analyze HTTP can be found here: Wireshark Lab: HTTP
Take a look at HTTP Debugger Pro.
It works with all browsers and custom software and doesn't change proxy settings.
