When to use HTTP Proxy - http

I am having trouble understanding ProxyFromEnvironment and ProxyURL in the net/http package. Can someone please explain to me when and why these two functions are used?
My current understanding (at least for ProxyFromEnvironment) is that this is used to get the URL of a PROXY Server from the Environment Variables and this PROXY Server is used to make HTTP Requests.

Both functions are related to how you use the http.Transport mechanism.
One can be used to allow the transport to dynamically retrieve the proxy settings from the environment, the other can be used to provide a static URL to be used by the transport for the proxy every time.
ProxyFromEnvironment is a function that returns the URL of the proxy configured in the environment (the HTTP_PROXY, HTTPS_PROXY and NO_PROXY variables, or their lowercase forms). It can be assigned to the Transport.Proxy field, so that the proxy used for each request is determined by those environment settings.
ProxyURL is a function that returns a proxy function which yields the given URL every time it is invoked. Assign its result to the Transport.Proxy field when your Transport should use the same fixed proxy for every request.
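A minimal sketch of the two options, assuming a hypothetical proxy at http://proxy.example.com:8080. The helper names (newEnvProxyTransport, newStaticProxyTransport, proxyFor) are made up for illustration; only http.ProxyFromEnvironment, http.ProxyURL and the Transport.Proxy field come from net/http.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// newEnvProxyTransport returns a Transport whose proxy is taken from the
// HTTP_PROXY, HTTPS_PROXY and NO_PROXY environment variables.
func newEnvProxyTransport() *http.Transport {
	return &http.Transport{Proxy: http.ProxyFromEnvironment}
}

// newStaticProxyTransport returns a Transport that sends every request
// through the single proxy given in rawProxy.
func newStaticProxyTransport(rawProxy string) (*http.Transport, error) {
	proxyURL, err := url.Parse(rawProxy)
	if err != nil {
		return nil, err
	}
	return &http.Transport{Proxy: http.ProxyURL(proxyURL)}, nil
}

// proxyFor reports which proxy tr would pick for a GET to target.
// An empty string means "no proxy, connect directly".
func proxyFor(tr *http.Transport, target string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return "", err
	}
	u, err := tr.Proxy(req)
	if err != nil || u == nil {
		return "", err
	}
	return u.String(), nil
}

func main() {
	// The proxy address is a placeholder, not a real server.
	static, err := newStaticProxyTransport("http://proxy.example.com:8080")
	if err != nil {
		panic(err)
	}
	for _, tr := range []*http.Transport{newEnvProxyTransport(), static} {
		p, err := proxyFor(tr, "http://example.org/")
		fmt.Printf("proxy=%q err=%v\n", p, err)
	}
}
```

Either Transport can then be handed to a client with &http.Client{Transport: tr}. What the environment-based one reports depends entirely on the variables set where the program runs.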

Related

Proper way to mux an HTTP server in Go that can handle non-HTTP protocols

I have built a router with extended logging capabilities using Go. It works properly for most use cases. However, it encounters problems when clients send non-standard HTTP messages on port 80.
To date, I have solved this by implementing my own version of ServeHTTP():
func (myproxy *MyProxy) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    // Inspect headers
    // Determine if it is a custom protocol (i.e. websockets, CONNECT requests)
    // Implement handlers for each type
}
In the event that I determine a request is a non-standard HTTP protocol, the request is played back to the original destination (via http.Request.Write()) and everyone is happy.
At least, most of the time. The problem arises with edge cases. Tivo clients do not send "Host" headers and appear to not like all kinds of other standard things that Go does (such as capitalizing header names). The number of possible variations on this is endless so what I would very much like to do is to buffer the original request - exactly as it was sent to me without any modifications - and replay it back to the original destination server.
I could hack this by re-implementing http.Server, but that seems like an extraordinarily difficult and brittle approach. What I would prefer to do is somehow hook into net/http/server.go right around the point where it receives a connection, then wrap that connection in a spy wrapper that logs the request to a buffer.
func (srv *Server) Serve(l net.Listener) error {
    // Lots of listener code...
    c := srv.newConn(rw)
    c.setState(c.rwc, StateNew) // before Serve can return
    // I'd like to wrap c here and save the request for possible replay later.
    go c.serve(ctx)
}
https://golang.org/src/net/http/server.go?s=93229:93284#L2805
Edit: I have looked at httputil.DumpRequest, but it slightly modifies the original request (it changes the case and order of headers). If it didn't do that, it would be an ideal solution.
https://godoc.org/net/http/httputil#DumpRequest
Is there a way to hook connections around this point before they are parsed by http.Request?
In the interest of helping others, I wanted to answer my question. The approach I suggested above does in fact work and is the proper way to do this. In summary:
Implement ListenAndServe()
Wrap the incoming net.Conn in a TeeReader or other multiplexed connection wrapper
Record the request
Dial the original destination and connect with the inbound connection, replaying the original request if necessary.
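The recording step above can be sketched roughly as follows. This is an illustration rather than the code actually used: recordingConn, newRecordingConn and recordOnce are hypothetical names, and net.Pipe stands in for a real accepted connection. The only thing it shows is that io.TeeReader keeps the bytes exactly as they arrived (no Host header added, header case untouched).

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net"
)

// recordingConn wraps a net.Conn and copies every byte read from it into
// buf, so the raw request can be replayed later without modification.
// It is not safe for concurrent Read calls.
type recordingConn struct {
	net.Conn
	tee io.Reader
	buf *bytes.Buffer
}

func newRecordingConn(c net.Conn) *recordingConn {
	buf := &bytes.Buffer{}
	return &recordingConn{Conn: c, tee: io.TeeReader(c, buf), buf: buf}
}

// Read overrides the embedded Conn's Read so that reads go through the tee.
func (rc *recordingConn) Read(p []byte) (int, error) { return rc.tee.Read(p) }

// Recorded returns a copy of the bytes read so far.
func (rc *recordingConn) Recorded() []byte {
	return append([]byte(nil), rc.buf.Bytes()...)
}

// recordOnce sends raw over an in-memory pipe, reads it through a
// recordingConn until EOF, and returns what was captured.
func recordOnce(raw string) (string, error) {
	client, server := net.Pipe()
	defer server.Close()
	go func() {
		defer client.Close()
		io.WriteString(client, raw)
	}()
	rc := newRecordingConn(server)
	if _, err := io.Copy(io.Discard, rc); err != nil {
		return "", err
	}
	return string(rc.Recorded()), nil
}

func main() {
	// A request with no Host header and a lowercase header name.
	raw := "GET / HTTP/1.1\r\nuser-agent: tivo\r\n\r\n"
	got, err := recordOnce(raw)
	fmt.Printf("%q %v\n", got, err)
}
```

In a real server the wrapper would be applied to each connection returned by the listener's Accept, and the recorded bytes written to the dialed destination before the two connections are joined.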
A similar use case is required when upgrading connection requests for websockets servers. A nice writeup can be found here:
https://medium.freecodecamp.org/million-websockets-and-go-cc58418460bb

http spec: Wait for creation of "file"

Is there an HTTP-related spec which helps me to implement "wait for creation of file"?
Here is what should be implemented:
Server called "sink" connects to server "middle" and waits until a file on this server gets created
Server "creator" creates a file and sends it to server "middle"
As soon as the whole file is available at server "middle" server "sink" should start the download.
I know that I could use websockets and custom methods to implement this.
But before going this way I would like to know if there is an HTTP-related spec which covers this use case.
Short answer
Yes, the HTTP specification itself (RFC 7230) already covers your use case,
and there is also a known mechanism called HTTP Long Polling, described in RFC 6202,
where a server waits for something for many seconds before returning a response.
Long answer
HTTP is a flexible protocol, so much so that the HTTP specification itself (RFC 7230) already covers your use case.
In Section 1 (emphasis mine):
HTTP is a generic interface protocol for information systems. It is
designed to hide the details of how a service is implemented by
presenting a uniform interface to clients that is independent of the
types of resources provided. Likewise, servers do not need to be
aware of each client’s purpose: an HTTP request can be considered in
isolation rather than being associated with a specific type of client
or a predetermined sequence of application steps. The result is a
protocol that can be used effectively in many different contexts and
for which implementations can evolve independently over time.
The specification allows the server to connect to another server and wait for something ("hide the details of how a service is implemented")
and that something can be a file or anything else ("independent of the types of resources provided").
You should notice that the specification does not define the internal workings of the server.
It does not define what the server must do in order to provide the response, as explained also in Section 1 (emphasis mine):
One consequence of this flexibility is that the protocol cannot be
defined in terms of what occurs behind the interface.
In your use case, server "sink" is actually a client because it connects to server "middle" expecting to receive a file.
The fact that server "middle" is waiting for another server to do something is irrelevant because the specification does not define what should occur behind the interface.
However, if you are looking for a similar mechanism where the server also waits for something,
one example would be HTTP Long Polling, described in RFC 6202 (emphasis mine):
HTTP Long Polling: The server attempts to "hold open" (not
immediately reply to) each HTTP request, responding only when
there are events to deliver.
In HTTP Long Polling, after receiving a client's request, the server does not reply immediately ("not immediately reply to").
It simply waits for an event to occur.
How long the server can wait is not even defined.
This mechanism works within the confines of how HTTP works and therefore complies with the HTTP specification.
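As a rough sketch of how server "middle" could hold a request open until the file exists: the fileSlot type, the 30-second timeout and the 204 reply on timeout are all assumptions made for this example, not something either RFC prescribes.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// fileSlot holds one file plus a channel that is closed once the whole
// file is available.
type fileSlot struct {
	once  sync.Once
	ready chan struct{}
	data  []byte
}

func newFileSlot() *fileSlot { return &fileSlot{ready: make(chan struct{})} }

// Publish is the "creator" side: it stores the complete file and wakes
// every request that is currently waiting. Only the first call has an effect.
func (s *fileSlot) Publish(data []byte) {
	s.once.Do(func() {
		s.data = data
		close(s.ready)
	})
}

// ServeHTTP is the "sink" side: it does not reply until the file exists,
// the client goes away, or an (assumed) 30-second timeout expires.
func (s *fileSlot) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	select {
	case <-s.ready:
		w.Write(s.data)
	case <-r.Context().Done():
		// Client disconnected; nothing to send.
	case <-time.After(30 * time.Second):
		w.WriteHeader(http.StatusNoContent) // nothing yet; the client may retry
	}
}

// waitForFile drives the handler in-process and returns status and body.
func waitForFile(slot *fileSlot) (int, string) {
	rec := httptest.NewRecorder()
	slot.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/file", nil))
	return rec.Code, rec.Body.String()
}

func main() {
	slot := newFileSlot()
	go func() {
		time.Sleep(50 * time.Millisecond) // "creator" finishes a bit later
		slot.Publish([]byte("file contents"))
	}()
	fmt.Println(waitForFile(slot))
}
```

From the point of view of "sink" this is an ordinary GET that simply takes a while to answer, which is why no extra specification is needed.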

lua-resty-http:connect method clarification

I've been implementing a Kong plugin that needs to make HTTP requests to retrieve information and share it with the upstream services.
There is an excellent library called lua-resty-http that can be used to make HTTP requests.
The service that contains the information needed is configured behind the proxy, and it matches the path /endpoint-providing-info.
The goal is to rely on the proxy capabilities to avoid having to parse the hostname which has some particular form that is not relevant to this question.
By playing around, I was able to achieve the desired behavior by doing the following:
local ok, err = http_client:connect("127.0.0.1", ngx.var.server_port)
if not ok then
    return nil, 'there was a failure opening a connection: ' .. (err or 'unknown')
end
local res, err = http_client:request({
    method = 'GET',
    path = '/endpoint-providing-info'
})
-- parse the response, etc...
The request gets routed to the upstream service and works as expected.
My primary concern is the following:
By connecting to localhost, I assumed that the current Nginx node is the one attending the request. Will this affect the performance? Is it better/possible to connect to the cluster directly?
I assume you have configured a location in the current nginx that matches /endpoint-providing-info, uses the proxy module, and points at an upstream for the cluster.
If you use lua-resty-http:
Pros - you can use body_reader, an iterator function for reading the body in a streaming fashion.
Cons - your request will cross the kernel boundary (loopback interface).
Another possibility is to issue a subrequest using the ngx.location.capture API:
Pros - subrequests just mimic the HTTP interface, but there is no extra HTTP/TCP traffic or IPC involved. Everything works internally, efficiently, at the C level.
Cons - it is a fully buffered approach and will not work efficiently for big responses.
Update - IMO:
If you expect big responses from the upstream server, lua-resty-http is your choice.
If you expect a lot of small responses from the upstream server, ngx.location.capture should be used.

Session and remote IP address in grpc-go

In grpc-go, when implementing a service, the methods defined by the service interface take only a Context and a Request. In the source, Context is as simple as:
type Context interface {
    Deadline() (deadline time.Time, ok bool)
    Done() <-chan struct{}
    Err() error
    Value(key interface{}) interface{}
}
So I wonder if it is possible to get some metadata (including remote IP address and other data) to maintain a session.
Thanks.
There's nothing that gRPC provides (in any language) that would be particularly robust as a session system across requests.
The streaming mechanism is great when you need to maintain context on a single server for clients: the stream callback's stack can point to whatever session information you need.
If you need state across separate RPCs (or across machines) you'll need to add your own session layer. You could do this by creating some unique ID that you attach to (say) a 'my-session-id' metadata element when sending requests.
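A standard-library-only sketch of such a session layer is shown below. It deliberately does not import grpc-go: the md type is just a stand-in for request metadata (a map from key to string values), and sessionStore and Resolve are hypothetical names. In a real service the map would be read from the incoming request's metadata.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"sync"
)

// md is a stand-in for request metadata: a map from key to string values.
type md map[string][]string

// sessionStore maps session IDs to per-session state.
type sessionStore struct {
	mu       sync.Mutex
	sessions map[string]map[string]string
}

func newSessionStore() *sessionStore {
	return &sessionStore{sessions: map[string]map[string]string{}}
}

// Resolve returns the session named by the "my-session-id" entry in m.
// If the entry is missing or unknown, it creates a new session with a
// random ID, which the caller should send back to the client.
// The returned state map is shared; guard concurrent mutation yourself.
func (s *sessionStore) Resolve(m md) (string, map[string]string, error) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if ids := m["my-session-id"]; len(ids) > 0 {
		if state, ok := s.sessions[ids[0]]; ok {
			return ids[0], state, nil
		}
	}
	raw := make([]byte, 16)
	if _, err := rand.Read(raw); err != nil {
		return "", nil, err
	}
	id := hex.EncodeToString(raw)
	state := map[string]string{}
	s.sessions[id] = state
	return id, state, nil
}

func main() {
	store := newSessionStore()

	// First request: no session ID yet, so one is created.
	id, state, err := store.Resolve(md{})
	if err != nil {
		panic(err)
	}
	state["user"] = "alice"

	// Later request: the client sends the ID back and gets the same state.
	_, again, _ := store.Resolve(md{"my-session-id": {id}})
	fmt.Println(again["user"])
}
```

An in-memory map only works on a single server; sharing sessions across machines would need an external store keyed by the same ID.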
It's possible to retrieve remote IP address through the use of stats.Handler (see especially this struct https://github.com/grpc/grpc-go/blob/v1.20.x/stats/stats.go#L89).
gRPC metadata (the metadata package in grpc-go) is commonly used to carry arbitrary information about sessions.

how to intercept and modify HTTP responses on server side?

I am working with a client/server application which uses HTTP, and my goal is to add new features to it. I can extend the client by hooking my own code to some specific events, but unfortunately the server is not customizable. Both client and server are in a Windows environment.
My current problem is that performance is awful when a lot of data is received from the server: it takes time to transmit it and time to process it. The solution could be to have an application on the server side do the processing and send only the result (which is much smaller). The problem is that there are no built-in functions to manipulate responses from the server before sending them.
I was thinking of listening to all traffic on port 80, identifying relevant HTTP responses and sending them to my application while blocking the response (to avoid sending a huge volume of data which won't be processed by the client). As I am lacking a lot of network knowledge, I am a bit lost when thinking about how to do it.
I had a look at some low-level packet interception methods like WinPcap, but it seems to require a lot of work to do what I need. Moreover, I think it is not possible to block or modify responses with this API.
A reverse proxy which allows user scripts to be triggered by specific requests or responses would be perfect, but I am wondering if there is no simpler way to do this interception/send elsewhere work.
What would be the simplest and cleanest method to enable this behavior?
Thanks!
I ended up making a simple reverse proxy to access the HTTP server. The reverse proxy extracts the relevant information from the server response and sends it to the server-side processing component, then replaces the extracted information in the response with an ID that the client uses to request the processing results from the other component.
The article at http://www.codeproject.com/KB/web-security/HTTPReverseProxy.aspx was very helpful to make the first draft of the reverse proxy.
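For readers working in Go, the same idea can be sketched with httputil.ReverseProxy and its ModifyResponse hook. This is not the implementation from the linked article: resultStore, newShrinkingProxy, fakeBackend and the backend URL are hypothetical, the fake backend replaces a real server so the example runs without a network, and for brevity the whole body is swapped for the ID rather than only the extracted part.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"strconv"
	"strings"
	"sync"
)

// resultStore stands in for the server-side processing component: it
// keeps each extracted payload under a generated ID.
type resultStore struct {
	mu     sync.Mutex
	next   int
	bodies map[string][]byte
}

func (s *resultStore) put(b []byte) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.next++
	id := strconv.Itoa(s.next)
	s.bodies[id] = b
	return id
}

// newShrinkingProxy returns a reverse proxy for target that stores each
// response body and sends the client only the short ID.
func newShrinkingProxy(target *url.URL, store *resultStore) *httputil.ReverseProxy {
	p := httputil.NewSingleHostReverseProxy(target)
	p.ModifyResponse = func(resp *http.Response) error {
		body, err := io.ReadAll(resp.Body)
		if err != nil {
			return err
		}
		resp.Body.Close()
		id := []byte(store.put(body))
		resp.Body = io.NopCloser(bytes.NewReader(id))
		resp.ContentLength = int64(len(id))
		resp.Header.Set("Content-Length", strconv.Itoa(len(id)))
		return nil
	}
	return p
}

// fakeBackend plays the role of the non-customizable HTTP server.
type fakeBackend struct{ payload string }

func (f fakeBackend) RoundTrip(r *http.Request) (*http.Response, error) {
	return &http.Response{
		StatusCode: http.StatusOK,
		Proto:      "HTTP/1.1",
		ProtoMajor: 1,
		ProtoMinor: 1,
		Header:     make(http.Header),
		Body:       io.NopCloser(strings.NewReader(f.payload)),
		Request:    r,
	}, nil
}

// demoShrink returns what the client receives and what was stored for it.
func demoShrink(payload string) (string, string) {
	store := &resultStore{bodies: map[string][]byte{}}
	target, _ := url.Parse("http://backend.invalid") // placeholder address
	proxy := newShrinkingProxy(target, store)
	proxy.Transport = fakeBackend{payload}
	rec := httptest.NewRecorder()
	proxy.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/report", nil))
	id := rec.Body.String()
	return id, string(store.bodies[id])
}

func main() {
	seen, stored := demoShrink(strings.Repeat("x", 1<<20))
	fmt.Printf("client received %q, %d bytes kept server-side\n", seen, len(stored))
}
```

Against a real server you would leave Transport unset and serve the proxy with http.ListenAndServe on the port the clients already use.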
Hmm... too many choices.
Two ideas:
Configure an HTTP proxy on all clients. There are some out there that let you manipulate what goes through in both directions (with scripts or plugins).
or
Make a pass-through project that listens on port 80 and forwards the needed traffic to port 8080 (where your original server app runs).
The question is: what software is the server app running on,
and what development knowledge do you have?
Also, what is "huge data"? Kilobytes? Megabytes? Gigabytes?
