I've implemented an HTTP reverse proxy middleware that is used with a Gin framework app:
app := gin.New()
app.Use(proxy.ReverseProxy("127.0.0.1:8008")) // HERE I'm attaching ReverseProxy middleware
In the ReverseProxy function I create an instance of httputil.ReverseProxy, which takes its transport from a variable that was already initialized during init():
var transport *http.Transport

func init() { // HERE creating instance of Transport
    transport = &http.Transport{
        // some params
    }
}

func ReverseProxy(targetServer string) gin.HandlerFunc {
    return func(c *gin.Context) {
        proxy := &httputil.ReverseProxy{
            Transport: transport, // HERE reusing instance of Transport
            // some params
        }
        proxy.ServeHTTP(c.Writer, c.Request)
    }
}
So the QUESTION is:
is it correct to have one instance of http.Transport and reuse it in httputil.ReverseProxy, or do I have to create a new transport on every request?
func ReverseProxy(targetServer string) gin.HandlerFunc {
    return func(c *gin.Context) {
        // HERE creating a new instance of Transport on every request
        transport := &http.Transport{
            // some params
        }
        proxy := &httputil.ReverseProxy{
            Transport: transport, // HERE using the NEW instance of Transport
            // some params
        }
        proxy.ServeHTTP(c.Writer, c.Request)
    }
}
Which way is best?
I currently reuse the transport because I get a performance boost; it seems to reuse already-established TCP connections.
But I'm not sure how it will behave under high load: could it return a response to the wrong client?
For your question:
is it correct to have one instance of http.Transport and reuse it in httputil.ReverseProxy or I've to create new transport on every request?
Creating one Transport and reusing it is the correct way.
You can find more details in the Transport documentation:
Transport is an implementation of RoundTripper that supports HTTP, HTTPS, and HTTP proxies (for either HTTP or HTTPS with CONNECT).
By default, Transport caches connections for future re-use. This may leave many open connections when accessing many hosts. This behavior can be managed using Transport's CloseIdleConnections method and the MaxIdleConnsPerHost and DisableKeepAlives fields.
Transports should be reused instead of created as needed. Transports are safe for concurrent use by multiple goroutines.
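For illustration, here is a minimal sketch of that pattern, with both the Transport and the ReverseProxy built once and shared across requests; the NewSingleHostReverseProxy director and the pool settings are assumptions, since the original elides those params:

package proxy

import (
    "net/http"
    "net/http/httputil"
    "net/url"
    "time"

    "github.com/gin-gonic/gin"
)

// One Transport for the whole process: it is safe for concurrent use
// and pools TCP connections per backend host.
var transport = &http.Transport{
    MaxIdleConnsPerHost: 32,               // illustrative value
    IdleConnTimeout:     90 * time.Second, // illustrative value
}

func ReverseProxy(targetServer string) gin.HandlerFunc {
    // Build the proxy once, outside the per-request closure; a
    // *httputil.ReverseProxy is also safe for concurrent use.
    target := &url.URL{Scheme: "http", Host: targetServer}
    proxy := httputil.NewSingleHostReverseProxy(target)
    proxy.Transport = transport

    return func(c *gin.Context) {
        proxy.ServeHTTP(c.Writer, c.Request)
    }
}

Sharing the Transport is what preserves the keep-alive connection pool (and the performance boost you observed). It cannot deliver a response to the wrong client: each request checks a connection out of the pool for the full duration of its request/response exchange.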
Related
I am using the new HttpClient shipped with JDK 11 to make many requests (to Github's API, but I think that's irrelevant), especially GETs.
For each request, I build and use an HttpClient, like this:
final ExecutorService executor = Executors.newSingleThreadExecutor();
final HttpClient client = HttpClient
        .newBuilder()
        .followRedirects(HttpClient.Redirect.NORMAL)
        .connectTimeout(Duration.ofSeconds(10))
        .executor(executor)
        .build();
try {
    // send request and return parsed response;
} finally {
    // manually close the specified executor because HttpClient doesn't implement Closeable,
    // so I'm not sure when it will release resources
    executor.shutdownNow();
}
This seems to work fine, except that every now and then I get the below exception, and requests no longer work until I restart the app:
Caused by: java.net.ConnectException: Cannot assign requested address
...
Caused by: java.net.BindException: Cannot assign requested address
at java.base/sun.nio.ch.Net.connect0(Native Method) ~[na:na]
at java.base/sun.nio.ch.Net.connect(Net.java:476) ~[na:na]
at java.base/sun.nio.ch.Net.connect(Net.java:468) ~[na:na]
Note that this is NOT the JVM_Bind case.
I am not calling localhost or listening on a localhost port; I am making GET requests to an external API. However, I've also checked the /etc/hosts file and it seems fine: 127.0.0.1 is mapped to localhost.
Does anyone know why this happens and how could I fix it? Any help would be greatly appreciated.
You can try using one shared HttpClient for all requests: it manages a connection pool internally and may keep connections alive for the same host (when supported). Performing many requests with different HttpClients is inefficient, because you end up with n thread pools and n connection pools, where n is the number of clients, and they won't share the underlying connections to the host.
Usually, an application creates a single instance of HttpClient in some kind of main() and provides it as a dependency to users.
E.g.:
public static void main(String... args) {
    final HttpClient client = HttpClient
            .newBuilder()
            .followRedirects(HttpClient.Redirect.NORMAL)
            .connectTimeout(Duration.ofSeconds(10))
            .build();
    new GithubWorker(client).start();
}
Update: how to stop the current client
According to the JavaDoc comments on HttpClientImpl.stop, a private method of a JDK-internal class:
// Called from the SelectorManager thread, just before exiting.
// Clears the HTTP/1.1 and HTTP/2 cache, ensuring that the connections
// that may be still lingering there are properly closed (and their
// possibly still opened SocketChannel released).
private void stop() {
    // Clears HTTP/1.1 cache and close its connections
    connections.stop();
    // Clears HTTP/2 cache and close its connections.
    client2.stop();
    // shutdown the executor if needed
    if (isDefaultExecutor) delegatingExecutor.shutdown();
}
This method is called from SelectorManager.shutdown (SelectorManager is created in HttpClient's constructor), and shutdown() is invoked in a finally block around the busy loop in SelectorManager.run() (yes, it extends Thread). The busy loop is while (!Thread.currentThread().isInterrupted()), so to reach that finally block you need to either break the loop with an exception or interrupt the running thread.
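As an aside: on JDK 21 and later this shutdown question largely disappears, because HttpClient implements AutoCloseable there and can be used with try-with-resources; a minimal sketch:

// JDK 21+ only: HttpClient implements AutoCloseable, so the client
// (and its internal selector thread) is shut down when the block exits.
try (HttpClient client = HttpClient.newHttpClient()) {
    // ... send requests ...
}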
I am trying to handle incoming HTTP requests with Nginx and Lua. I need to read a value from Redis in each request, and currently I open a Redis connection on every request with this code:
local redis = require "resty.redis"
local red = redis:new()

local ok, err = red:connect("redis", 6379)
if not ok then
    ngx.say("failed to connect: ", err)
    return
end

local res, err = red:auth("abcd")
if not res then
    ngx.log(ngx.ERR, err)
    return
end
Is there any way to make this connection static or a singleton, to improve my request handler's performance?
It is impossible to share a cosocket object (and, therefore, a redis object, check this answer for details) between different requests:
The cosocket object created by this API function has exactly the same lifetime as the Lua handler creating it. So never pass the cosocket object to any other Lua handler (including ngx.timer callback functions) and never share the cosocket object between different Nginx requests.
However, nginx/ngx_lua uses a connection pool internally:
Before actually resolving the host name and connecting to the remote backend, this method will always look up the connection pool for matched idle connections created by previous calls of this method
That being said, you just need to use sock:setkeepalive() instead of sock:close() for persistent connections. The redis object interface has a corresponding method: red:set_keepalive().
You'll still need to create a redis object on a per-request basis, but this will help you avoid the connection overhead.
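A minimal sketch of that pattern, assuming the same host and credentials as above; the timeout, pool size, and the get_reused_times() guard around auth are illustrative additions:

local redis = require "resty.redis"
local red = redis:new()
red:set_timeout(1000) -- 1s for connect/read/write

-- connect() transparently reuses an idle connection from the
-- per-worker pool when one is available
local ok, err = red:connect("redis", 6379)
if not ok then
    ngx.log(ngx.ERR, "failed to connect: ", err)
    return
end

-- only authenticate fresh connections; pooled ones already did
local reused, err = red:get_reused_times()
if reused == 0 then
    local res, err = red:auth("abcd")
    if not res then
        ngx.log(ngx.ERR, "failed to auth: ", err)
        return
    end
end

-- ... run commands ...

-- put the connection back into the pool instead of closing it:
-- up to 100 connections, each idle for at most 10 seconds
local ok, err = red:set_keepalive(10000, 100)
if not ok then
    ngx.log(ngx.ERR, "failed to set keepalive: ", err)
end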
I've been implementing a Kong plugin that needs to make HTTP requests to retrieve information and share it with the upstream services.
There is an excellent library called lua-resty-http that can be used to make HTTP requests.
The service that contains the needed information is configured behind the proxy and matches the path /endpoint-providing-info.
The goal is to rely on the proxy's capabilities to avoid having to parse the hostname, which has a particular form that is not relevant to this question.
By playing around, I was able to achieve the desired behavior by doing the following:
local ok, err = http_client:connect("127.0.0.1", ngx.var.server_port)
if not ok and err then
    return nil, 'there was a failure opening a connection: ' .. err
end

local res, err = http_client:request({
    method = 'GET',
    path = '/endpoint-providing-info'
})

-- parse the response, etc...
The request gets routed to the upstream service and works as expected.
My primary concern is the following:
By connecting to localhost, I assumed that the current Nginx node is the one attending the request. Will this affect the performance? Is it better/possible to connect to the cluster directly?
I suppose that you have configured, on the current nginx, a location that matches /endpoint-providing-info, uses the proxy module, and points at an upstream for the cluster.
If you use lua-resty-http:
Pros - you may use body_reader, an iterator function for reading the body in a streaming fashion.
Cons - your request will go through the kernel boundary (the loopback interface).
Another possibility is to issue a subrequest using the ngx.location.capture API:
Pros - subrequests just mimic the HTTP interface, but there is no extra HTTP/TCP traffic nor IPC involved; everything works internally, efficiently, on the C level.
Cons - it is a fully buffered approach, so it will not work efficiently for big responses.
Update - IMO:
If you expect big responses from the upstream server, lua-resty-http is your choice.
If you expect a lot of small responses from the upstream server, use ngx.location.capture (a sketch follows).
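A minimal sketch of the ngx.location.capture approach; the internal /info location and the my-upstream name are illustrative assumptions:

-- nginx.conf side (illustrative):
--   location /info {
--       internal;
--       proxy_pass http://my-upstream/endpoint-providing-info;
--   }

-- Lua side: no extra TCP connection, but the whole body is buffered.
local res = ngx.location.capture("/info", { method = ngx.HTTP_GET })
if res.status ~= 200 then
    ngx.log(ngx.ERR, "subrequest failed with status: ", res.status)
    return
end
local body = res.body -- the entire response body, already in memory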
I want to use Golang as my server-side language, but everything I've read points to nginx as the web server rather than relying on net/http (not that net/http is bad, it just seems preferable overall; that's not the point of this post, though).
I've found a few articles on using FastCGI with Golang, but I've had no luck finding anything on reverse proxies, HTTP, and so on, other than this benchmark, which unfortunately doesn't go into enough detail.
Are there any tutorials/guides available on how this operates?
For example, there is a big post on Stack Overflow detailing it with Node, but I cannot find a similar one for Go.
That's not needed at all anymore unless you're using nginx for caching; Golang 1.6+ is more than good enough to serve HTTP and HTTPS directly.
However, if you insist (and I will secretly judge you and laugh at you), here's the workflow:
Your Go app listens on a local port, say "127.0.0.1:8080".
nginx listens on 0.0.0.0:80 and 0.0.0.0:443 and proxies all requests to 127.0.0.1:8080.
Be judged.
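For illustration, a minimal sketch of the Go half of that workflow; the handler body is a placeholder, and on the nginx side a server block on ports 80/443 with proxy_pass http://127.0.0.1:8080; completes the setup:

package main

import (
    "fmt"
    "log"
    "net/http"
)

func main() {
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        fmt.Fprintln(w, "hello from the Go app")
    })
    // Listen only on loopback; nginx owns the public 0.0.0.0:80/443.
    log.Fatal(http.ListenAndServe("127.0.0.1:8080", nil))
}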
The nginx setup in Node.js + Nginx - What now? is exactly the same setup you would use for Go, or any other standalone server for that matter that isn't cgi/fastcgi.
I use Nginx in production very effectively, using Unix sockets instead of TCP for the FastCGI connection. This code snippet comes from Manners, but you can adapt it for the normal Go API quite easily.
func isUnixNetwork(addr string) bool {
    return strings.HasPrefix(addr, "/") || strings.HasPrefix(addr, ".")
}

func listenToUnix(bind string) (listener net.Listener, err error) {
    _, err = os.Stat(bind)
    if err == nil {
        // socket exists and is "already in use";
        // presume this is from an earlier run and therefore delete it
        err = os.Remove(bind)
        if err != nil {
            return
        }
    } else if !os.IsNotExist(err) {
        return
    }
    listener, err = net.Listen("unix", bind)
    return
}

func listen(bind string) (listener net.Listener, err error) {
    if isUnixNetwork(bind) {
        logger.Printf("Listening on unix socket %s\n", bind)
        return listenToUnix(bind)
    } else if strings.Contains(bind, ":") {
        logger.Printf("Listening on tcp socket %s\n", bind)
        return net.Listen("tcp", bind)
    } else {
        return nil, fmt.Errorf("error while parsing bind arg %v", bind)
    }
}
Take a look around line 252, which is where the switching happens between HTTP over a TCP connection and FastCGI over Unix-domain sockets.
With Unix sockets, you have to adjust your startup scripts to ensure that the sockets are created in an orderly way with the correct ownership and permissions. If you get that right, the rest is easy.
To answer other remarks about why you would want to use Nginx, it always depends on your use-case. I have Nginx-hosted static/PHP websites; it is convenient to use it as a reverse-proxy on the same server in such cases.
From the introduction on gRPC:
In gRPC a client application can directly call methods on a server application on a different machine as if it was a local object, making it easier for you to create distributed applications and services. As in many RPC systems, gRPC is based around the idea of defining a service, specifying the methods that can be called remotely with their parameters and return types. On the server side, the server implements this interface and runs a gRPC server to handle client calls. On the client side, the client has a stub that provides exactly the same methods as the server.
The above paragraph talks about a client and a server, the former being the one invoking methods on the other. What I am wondering is: can the server end of the connection invoke methods that have been registered on the client?
No, a server cannot invoke calls on the client. gRPC works with HTTP, and HTTP has not had such semantics in the past.
There has been discussion of various ways to achieve such a feature, but I'm unaware of any work having started or of general agreement on a design. gRPC does support bidirectional streaming, which may get you some of what you need: with bidirectional streaming, the client can respond to messages from the server, but the client still calls the server, and only one type of message can be sent for that call.
The protocol does not implement it, but you can simulate this situation.
Define a server method that returns a stream of ServerRequest messages:
import "google/protobuf/any.proto";
service FullDuplex {
rpc WaitRequests (google.protobuf.Any) returns (stream ServerRequest);
}
message ServerRequest {
float someValue = 1;
float anotherAnother = 1;
}
ServerRequest may contain a oneof, so you can receive different types of server requests.
If you need your client to send back a response for each request, you can turn the client-to-server direction into a stream as well, but you will need to implement logic on the server side that applies a timeout while waiting for that response.
service FullDuplex {
    rpc WaitRequests (stream ClientResponse) returns (stream ServerRequest);
}
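A minimal Go sketch of the client side of this pattern, assuming pb is the package generated from the proto above and that the server listens on localhost:50051 (both illustrative):

package main

import (
    "context"
    "io"
    "log"

    "google.golang.org/grpc"
    "google.golang.org/grpc/credentials/insecure"
    "google.golang.org/protobuf/types/known/anypb"

    pb "example.com/fullduplex" // hypothetical generated package
)

func main() {
    conn, err := grpc.Dial("localhost:50051",
        grpc.WithTransportCredentials(insecure.NewCredentials()))
    if err != nil {
        log.Fatalf("dial failed: %v", err)
    }
    defer conn.Close()

    client := pb.NewFullDuplexClient(conn)

    // Open the stream once; every ServerRequest received afterwards is,
    // in effect, the server "calling" this client.
    stream, err := client.WaitRequests(context.Background(), &anypb.Any{})
    if err != nil {
        log.Fatalf("WaitRequests failed: %v", err)
    }
    for {
        req, err := stream.Recv()
        if err == io.EOF {
            break // server closed the stream
        }
        if err != nil {
            log.Fatalf("recv failed: %v", err)
        }
        log.Printf("server requested: %v", req) // dispatch to local handlers here
    }
}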
What you can do is start an HTTP server in both processes and use a client at each end to initiate communication. There's a bit of boilerplate involved, and you have to design a simple handshaking protocol (one end registers with the other, advertising its listen address), but it's not too much work.