Where is http protocol implementation in Linux - http

I try understand how http work's and can't understand on which level http protocol implemented, it's OS level, or it's depend from where I need use it protocol? For example if I want use it on C I must implement it on C language as library and only then use it?

Http runs on top of tcp - and tcp is implemented in the network stack of your OS.
Http protocol is used between a client and a server. What a client sends is what a server receives, and vice-versa. Http was designed for the server to simply sit and wait for requests (possibly including data), and then respond (possibly including data).
All web servers implement the server side of http. In terms of applications (let's use the term "application" to mean "client", although some might say the server is an application), the client side of http protocol will, I suppose, most commonly be implemented in an application like a browser, but also command-line applications like curl and wget implement an http client. For languages such as Python there is a http server implementation in the standard library, or there are libraries such as requests which handle the client side of http so the python author just worries about the higher-level problem of which http requests to make.
So the answer is, http is not implemented in the OS, it is implemented in applications - some client-side, some server-side.
For your C application you will either have to implement http yourself (doesn't sound like fun to me but would be a good way of understanding http implementation, I suppose) or (much less stress and much more likely to have predictable correctish behaviour) use a library if you can find one.

Related

Why grpc-go can run grpc server and http server at the same address and port, but grpc-node cannot

I had read this answer: https://stackoverflow.com/a/56943771/6463558, it says that there is no way to run gRPC server and HTTP server at same address and port using grpc-node package.
But I can create gRPC server and HTTP server at same address and port(e.g. both using localhost:3000) using grpc-go package. Here is an example: https://github.com/mrdulin/grpc-go-cnode/blob/master/cmd/server/main.go#L79
So, why grpc-node and grpc-go behave inconsistently. Does this make sense?
The result I expect is that no matter what language is implemented in grpc, the behavior should be consistent. So the grpc server should be able to share the same port with the server created by Node's standard library http in same system process.
It is all about implementation. Each language has its own implementation for gRPC. There are many differences from each language implementation, some due to language capability but also due to the maintainers. Each project is a different project.
In this case, we can not really say that gRPC and HTTP servers are sharing the same address. There is only the HTTP server running. However, Golang implementation for gRPC server has an option to serv the gRPC through HTTP.
Calling
server.ServeHTTP()
instead of
server.Serve()
That is possible because, under the hood, gRPC server is built on top HTTP2
This snippet from the link you shared make what I said clear
if request.ProtoMajor != 2 {
mux.ServeHTTP(writer, request)
return
}
if strings.Contains(request.Header.Get("Content-Type"), "application/grpc") {
grpcServer.ServeHTTP(writer, request)
return
}
If you want to do the same in Node, you need to check in the grpc-node implementation if there is such a thing available
Your example uses http.NewServeMux(), which is provided by the Go standard library. The Node standard library does not provide an equivalent feature, so you can't share the port that way.

Comparison between HTTP and RPC

RPC protocol uses TCP as an underlying protocol and HTTP again uses TCP as an underlying protocol. So why is HTTP is widely accepted?
Why does SOAP use HTTP as an underlying protocol - why not RPC?
Remote Procedure Calls (RPC) is not a protocol, it's a principle that is also used in SOAP.
SOAP is an application protocol that uses HTTP for transport (so it won't have to think about encoding, message boundaries and so on). One of the reasons to use SOAP over HTTP is that for HTTP you usually don't need firewall rules and that the HTTP infrastructure is mature and commonly rolled out.
RPC does not require HTTP. Basically, RPC describes any mechanism that is suitable to invoke some piece of code remotely. The transport mechanism used to perform the RPC could be SOAP over HTTP. It could also be a REST call returning some JSON data over HTTP.
SOAP can also be used via Mails, and AFAIK (not sure here) the BizTalk Server should support this scenario. But even something exotical like trying SOAP over Avian Carriers can also be considered an RPC, although the latency of the latter may not be sufficient for real-world applications.
Think of an RPC as sending somehow some kind of message to a destination, in order to initiate a specific action and (optionally) getting some information back after the action has been completed. What prticular technology you choose to transmit these messages does not really matter.

what is the benefit of using http hijacker

Go http pkg provide a Hijacker interface, can anyone tell when should I use it.
I check the comment, after a Hijack call lets the caller take over the connection, the HTTP server library will not do anything else with the connection.
I understand it as it's used to support both http request and common tcp interactive within one port. Is it right? Does it has any other benefits.
It means that you take over the control of TCP connection.
TCP is a generic transport protocol, whereas HTTP is an application protocol on top of TCP. The OSI seven layer model describes TCP as layer 4 and HTTP is layer 7.
If you need to implement a different application protocol, this is one use-case for hijacking.
Or if you need to do something specialised with HTTP, like preventing keep-alive connections, that is another use-case.
An example for an alternative web application protocol is Google's SPDY. It's also a good reason why you might hijack an existing HTTP connection, rather than create a TCP connection directly. For SPDY, a browser would first make an HTTP request that included 'accept' headers indicating that it is also able to understand SPDY. So now you could hijack the connection and implement SPDY instead of HTTP.

CGI to Handle Multiple Requests on a Persistent HTTP Connection

CGI programs typically get a single HTTP request.
HTTP 1.1 supports persistent HTTP connections whereby multiple HTTP requests/responses are made w/o closing the connection.
Is there a way for a CGI program (or similar mechanism) to handle multiple HTTP requests/responses on the same connection?
I am using Apache httpd.
Keep-alives are one of the higher-level HTTP features that is wholly dealt with by the web server. They are out-of-scope for CGI applications themselves.
Accessing CGI scripts through Apache mod_cgi works with keep-alive for me. The browser re-uses the same TCP connection to fetch the page and then resources referred to by it, without the scripts in question having to do anything special.
If you mean you would like to have the same CGI process handle one request and then the next (instead of the process ending and a new one being spawned), then I'm afraid that's not possible. The web server will intercept keep-alives and make them look like single requests before your scripts can do anything about it. (If you want to do that to improve performance, consider a different gateway interface, such as FastCGI or language-specific options like WSGI.)
SCGI sounds exactly like what you want. It is similar to FastCGI but a simpler solution to implement (the S stands for Simple :)).

How to implement HTTP Tunneling

I've written a Flash (Flex) client connecting to a back-end server to exchange data.
I've also written my server from scratch, and it serves two purposes:
(1) Web (HTTP) Server- By default listens on port 80
(2) Socket/Application- Server - By default listens on port 443
Just FYI, both servers run in the same process space, for convenience reasons. They are not expected to handle massive loads, so I'm fine with that.
As soon as the Flash client is served to the browser from the HTTP socket, the client attempts to open an XMLSocket to the Socket/Application server.
I now want to implement HTTP tunneling, so that my client can connect to the Application server even if the user is behind a firewall. I do not want any external servers involved (proxies etc.) - simply use the servers I already have.
My questions:
(1) Is it better to use port 443 for that? (does it better fool firewalls?)
(2) As far as I can see, what I am required to do, is just ensure that my actual application data is simply encapsulated in an HTTP structure (preceded by a dummy HTTP header), both from the client and server sides. Is that so or am I missing anything here?
(3) Do I need to keep hiding/encapsulating my data every message I send through the socket, or can I just encapsulate the first message when opening the connection?
Thanks in advance guys!
Fuzz
Don't reinvent the wheel - use remoting via AMF protocol. AMF an HTTP-based binary format that performs serialization between ActionScript (MXML) and server side languages. Technically, this is HTTP tunneling.
Adobe offers BlazeDS (open source) and LCDS (commercial) implementations of AMF for AS/Java, but there are third-party implementations of AMF for AS/PHP, AS/Python, AS/Ruby, AS/.Net.
BTW, AMF is an open source format.

Resources