how to intercept and modify HTTP responses on server side? - http

I am working with a client/server application which uses HTTP, and my goal is to add new features to it. I can extend the client by hooking my own code to some specific events, but unfortunately the server is not customizable. Both client and server are in a Windows environment.
My current problem is that performance is awful when a lot of data are received from the server: it takes time to transmit it and time to process it. The solution could be to have an application on server side to do the processing and send only the result (which is much smaller). The problem is there is not built-in functions to manipulate responses from the server before sending them.
I was thinking to listen to all traffic on port 80, identifying relevant HTTP responses and send them to my application while blocking the response (to avoid sending huge data volume which won't be processed by the client). As I am lacking a lot of network knowledge, I am a bit lost when thinking about how to do it.
I had a look at some low-level packet intercepting methods like WinPCap, but it seems to require a lot of work to do what I need. Moreover I think it is not possible to block or modify responses with this API.
A reverse proxy which allows user scripts to be triggered by specific requests or responses would be perfect, but I am wondering if there is no simpler way to do this interception/send elsewhere work.
What would be the simplest and cleanest method to enable this behavior?
Thanks!

I ended making a simple reverse proxy to access the HTTP server. The reverse proxy then extracts relevant information from the server response and sends it to the server-side processing component, and replaces information extracted from the response by an ID the client uses to request the other component to get the processing results.
The article at http://www.codeproject.com/KB/web-security/HTTPReverseProxy.aspx was very helpful to make the first draft of the reverse proxy.

Hmm.... too much choices.
2 ideas:
configure on all clients a Http Proxy. there are some out there, that let you manipulate what goes through in both directions (with scripts, plugins).
or
make a pass through project, that listens to port 80, and forewards the needed stuff to port 8080 (where your original server app runs)
question is, what software is the server app running at,
and what knowledge (dev) do you have?
ah. and what is "huge data"? kilobyte? megabyte? gigabyte?

Related

Angular 6 - how to make a single http request and listens to multiple responses?

My backend generates log on processing some data and i would like to show it as a console in my frontend.
How can i implement a method that can listen to multiple response till a certain parameter is true from backend on a single http request in angular 6.
you can make use of WebSocket, i.e. make websocket connection with the your backend and get data, this is kind of push mechanism where server push data to on connection and client get data as new data is available in connection.
it not possible with help of single http request as it follows pull mechanism. so you will get data which are available. to get new data you have to perform another http request.
Unfortunately, an HTTP request cannot remain open listening for multiple responses, once it receives a response it will close the connection.
Fortunately, you can use websockets.
Implementing websockets is not too difficult, and there are many tutorials for implementing with Angular such as this one: https://tutorialedge.net/typescript/angular/angular-websockets-tutorial/
I'm not sure what back end technology you're using, but most modern ones have websocket support.
If you're not familiar with websockets in general, checkout this article: https://medium.com/#dominik.t/what-are-web-sockets-what-about-rest-apis-b9c15fd72aac
“WebSockets” is an advanced technology that allows real-time interactive communication between the client browser and a server. It uses a completely different protocol that allows bidirectional data flow, making it unique against HTTP.
The article also compares/contrasts it to HTTP, so it may give you a better understanding of HTTP as well.

Implementing a WebServer

I am trying to create a Web Server of my own and there are several questions about working of Web servers we are using today. Questions are:
After receiving a HTTP request from a client through port 80, does server respond using same port 80?
If yes then while sending a large file say a pic in MB's, webserver will be unable to receive requests from other clients?
Is a computer port duplex or simplex? (Can it send and receive at the same time)?
If another port on server side is used to send response to client, then (if TCP is used, which is generally used), again 3-way handshaking will be done which will be overhead...
http://beej.us/guide/bgnet/output/html/singlepage/bgnet.html here is a good guide on what's going on with webservers, although it's in c but the concepts are all there. This will explain the whole client server relationship as well as some implementation details.
I'll just give a high level on what's going on:
Usually what happens is when your server gets a new request that comes in it creates a fork that will process it, that way you are not bogged down by each request, when the request comes in the child process is handed a new file to write to(again this is all implementation details).
So really you have one server waiting for requests and for each request it received it spawns a child to process to deal with this request. I'm sure there are much easier languages to implement this stuff than c(I had to do both a c and java server serving to either one in my past) but c really gets you to understand the things that are going on and I'm betting that is what you are looking for here
Now there are a couple of things to think about:
how you want the webserver to work. The example explains the parent child process.
Do you want to use tcp/UDP there are differences in the way to payload gets delivered.
You don't have to connect on port 80. that's just the default for web.
Hopefully the guide will help you.
Yes. The server sends the response using the TCP connection established by the client, so it also responds using the same port. The server can handle connections from multiple clients using the same port because TCP connections are identified by (local-ip, local-port, remote-ip, remote-port), so the server can even handle multiple connections from same client provided that the source ports are different.
There are different techniques you can use to be able to serve multiple clients at the same time. These include
using multiple processes or threads: when one is busy serving a client the others can serve other clients.
using events: the server listens for events from the OS: when it can write a block of data to a connection it writes it, when a new client connects it accepts the connection, ...
Frequently both approaches are be combined.
A TCP connection is duplex: you can send and receive at the same time. The HTTP protocol is based on a simple request-response model though: at any given time only one party is "talking."

Why HTTP was designed to be a pull protocol?

I was watching many presentations about Html 5 WebSockets , where server can initialize connection with client and push the data without the request from the client.
We don't need Polling etc.
And , I am curious , why Http was designed as a "pull" and not full duplex protocol in the first place ? What where the reasons behind that kind of decision ?
Because when http was first designed it was meant to be used to retrieve documents from a server. And the easiest way to do is when the client asks the server for a document and gets it delivered as response (or an error in case it does not exist). When you have push protocol that means the server would need to keep client connections around for potentially a long time creating more resource management problems - remember we are talking about early 1990s here.
Http was designed for simply retrieving hypertext documents from a server. There were no reasons to push anything to the client when the pages were just pure, static html without scripting capabilities.
Since there was no need at the time for pushing things back to the client, the protocol was kept simple.
HTTP is mainly a pull protocol—someone loads information on a Web server and
users use HTTP to pull the information from the server at their convenience. In particular,
the TCP connection is initiated by the machine that wants to receive the file.

"Proxying" HTTP requests

I have some software which runs as a black box, I have no access to it. This software makes HTTP requests. What I want to do is intercept these requests, forward them on, catch the response, do something with it, before passing the response back to the software.
Can this be done? What's the best method?
Thanks
Edit: Requests are to the public internet from a local intranet via a gateway/router. I have root access to my machine. Another machine could be used as intermediate gateway.
Edit 2: Requests are not encrypted. What I am actually trying to do is save down any images that are requested.
Try yellosoft-alchemy.
If the communication isn't encrypted, use Ethereal (or any other similar program) to sniff the communication on the wire.
edit: since the communication isn't encrypted, you can do that easily with Ethereal. You can save each TCP stream independently from there.
Edit2: Ok, you want to do this automatically. In this case, I would suggest you look at two tools available on Linux called tcpflow and tcpreen.
tcpreen creates a proxy similar to what you want between a local port and a remote one. It's a TCP proxy, not an HTTP proxy so this means you'll have to write some parsing tool to isolate the HTTP streams that contain the images you want (probably based on the MIME type of the response). it's not too complex a task, though, if you understand how HTTP works.
tcpflow is similar to tcpreen except that it's a sniffer instead of a proxy. Use whatever tool you think its more adapted to your environment.

Understanding REST: REST as a high volume transport?

I'm designing a system that will need to move multi-GB backup images over TCP, and I'm looking at REST as an alternative to ONC RPC.
For example, I might have
POST http://site/backups/image1
where image1 is an 50GB file whose data is contained in the HTTP body.
My question: is this within the scope of what REST is meant for? Is it inappropriate to move massive files over HTTP? My preliminary testing shows that the performance isn't too bad, and I like the clean, debuggable protocol, as opposed to a custom ONC RPC server. But is this overloading the role of a webserver?
Thanks,
-Steve
HTTP has about the same overheads as FTP.
An HTTP server if often asked to do more work than an FTP server. But otherwise, using HTTP to send a large file is about the same as using FTP.
The only consideration is making sure your web server and web application framework are configured to do this kind of thing without needlessly expanding the entire 50Gb file inside Apache.
Steve,
HTTP has a look-before-you-leap 'feature' that allows the client to ask the server whether it will accept the data submission before it actually sends out the data. I'd look into using this to avoid transferring GBs of data only to find out that the server is currently not willing to handle them. Look at the HTTP Expect header and 100 Continue status codes.
Also, you can use FTP within a RESTful approach, IOW, think along the lines of
<backup-store href="ftp://example.org/site/backup/images/"/>
and make your clients understand the ftp URI scheme.
Finally, the T in HTTP means transfer and not transport - an important distinction to make because the former is an application semantic (HTTP is an application protocol) and the latter is a not.
HTH,
Jan
REST has nothing to do with how large your data is or which method you use to transport it.

Resources