Is the concept of a URL path supported only at the application layer? - http

I'm writing my own client and server; they implement my own application-level protocol. TCP is used for the transport layer and IPv4 for the internet layer. Neither the TCP header nor the IPv4 header contains any information about a URL path: the TCP header carries the port and the IPv4 header carries the IP address. So where should the path go?
Consider the URL 127.0.0.1:4444/location-path. 127.0.0.1 belongs to IPv4 and 4444 belongs to TCP. In Go, for example, the standard net.Dial() and net.Listen() accept only 127.0.0.1:4444; passing 127.0.0.1:4444/location-path throws an error, which is expected, because they support only TCP, IP, and domain-name resolution, and /location-path is not part of any of those three.
So, on the client and server side, how should I handle /location-path in such a way that the client can send requests to different locations and the server can serve different locations?
Should every application protocol implement its own logic to handle /location-path?
According to RFC 1738:
url-path
The rest of the locator consists of data specific to the
scheme, and is known as the "url-path". It supplies the
details of how the specified resource can be accessed. Note
that the "/" between the host (or port) and the url-path is
NOT part of the url-path.
The url-path syntax depends on the scheme being used, as does the
manner in which it is interpreted.
So, I can implement my own rules for the url-path and how it is interpreted by the client and server? And to transfer it, I can just send it as data?
For example, I could define a rule that my own application-level protocol - proto - places the url-path in the first bytes of the TCP payload? So proto://127.0.0.1:4444/path would become 127.0.0.1:4444 with TCP payload [112 97 116 104].
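For instance, a rough Go sketch of that client side (the scheme name proto and the addresses are just examples of such a self-defined rule):

package main

import (
	"fmt"
	"net"
	"net/url"
)

func main() {
	// url.Parse handles any scheme, so the path component is extracted for us.
	u, err := url.Parse("proto://127.0.0.1:4444/path")
	if err != nil {
		panic(err)
	}

	// Dial only host:port -- the path never reaches TCP or IP.
	conn, err := net.Dial("tcp", u.Host)
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Send the path (without the leading "/") as the first bytes of the
	// payload, per the made-up protocol rule above: "path" = [112 97 116 104].
	fmt.Fprint(conn, u.Path[1:])
}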

The easiest comparison here is HTTP.
An HTTP URL might look like http://example:1234/bar.
An HTTP client understands that URL. Under the hood, the HTTP client will still make a standard TCP connection to get there.
It does this by grabbing the host and port portion and (temporarily) discarding everything else, so only example and 1234 are used at that stage.
After the TCP connection is established, it takes the path part and sends it as the first line of the request.
TCP clients don't know about URLs, they know about hosts and ports. What you do with the path portion is up to you.
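As an illustration of that idea on the server side, here is a minimal Go sketch (paths and responses are made up) that reads the first line of the payload and routes on it, much like an HTTP server routes on the request line:

package main

import (
	"bufio"
	"fmt"
	"net"
	"strings"
)

func main() {
	ln, err := net.Listen("tcp", "127.0.0.1:4444")
	if err != nil {
		panic(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			continue
		}
		go func(c net.Conn) {
			defer c.Close()
			// By convention of this toy protocol, the first line carries the path.
			line, err := bufio.NewReader(c).ReadString('\n')
			if err != nil {
				return
			}
			switch strings.TrimSpace(line) {
			case "/location-path":
				fmt.Fprintln(c, "serving location-path")
			default:
				fmt.Fprintln(c, "unknown path")
			}
		}(conn)
	}
}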

Related

Faking an HTTP request header

I have a general networking question, but it's related to a security aspect.
Here is my case: I have a host which is infected by malware. The malware creates an HTTP packet to communicate with its command and control server. While constructing the packet, the IP layer contains the correct IP address of the command and control server, and the TCP layer contains the correct port number, 80.
Before sending the packet out, the malware modifies the HTTP header, replacing the Host header with "google.com" instead of its server's address. It then attaches the stolen data to the packet and sends it out.
My understanding is that the packet will get delivered to the correct server, because routing happens based on the IP.
But can I host a web server on this IP that would receive all packets with the header Host: google.com and parse them correctly?
Based on my reading on the internet it is possible, but if it is that easy, then why haven't malware authors adopted this technique to spoof HTTP headers and bypass traditional domain-whitelisting engines?
When you make a request to, let's say, an Apache2 server, what Apache actually does is match your Host header against the VirtualHosts in the server's configuration. Only if no match is found, or the header is invalid, will Apache route the request to the default virtual host, if one is defined. Basically, nothing stops you from changing these headers.
You can easily test this by editing your hosts file and pointing google.com to any other IP - you will be able to handle the google.com domain on your server, but only you will be able to use it this way - no one else.
Anything you send inside HTTP headers shouldn't be trusted - it's just a guide for your server on how to handle the traffic.
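For instance, a small Go sketch (the IP is a placeholder documentation address): nothing stops a client from dialing one address while claiming another host in the header.

package main

import (
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Connect to an IP you control, but claim the request is for google.com.
	// The Host header is just bytes inside the HTTP payload.
	req, err := http.NewRequest("GET", "http://203.0.113.10/", nil) // placeholder IP
	if err != nil {
		panic(err)
	}
	req.Host = "google.com" // overrides the Host header sent on the wire

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	body, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.Status, len(body))
}
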
The fake host header is just there to trick some deep-inspection firewalls ("it's for Google? you may pass..."). The server on that IP either doesn't care about the host header (default vhost) or is explicitly configured to accept it.
Passing the loot on by using fake headers or just as plain data behind the headers is another trick to fool data loss prevention.
These methods can mislead shallow application-layer inspection but won't pass a decent firewall.

Are http and https resources equivalent?

Are HTTP and HTTPS resources equivalent? That is, does http://example.com/ABC refer to the same resource as https://example.com/ABC?
Evidence for: (1) Cookies with a matching domain and path and without the "secure" attribute are set and returned regardless of protocol. (2) HTTP Strict Transport Security bounces you from HTTP to HTTPS with the implicit assumption that the resource is the same.
Evidence against: (1) The same-origin policy treats a different protocol as a different origin. (2) The HTTP RFC's rules for comparing URIs treat http and https URLs as unequal. (3) Resources for other protocols such as FTP aren't equivalent to HTTP resources on the same domain (e.g., the FTP server's root directory differs), so what magic does HTTPS have over FTP when it comes to resource equivalence with HTTP?
I am going to say yes - they are the same resource.
The protocol only describes the transport mechanism.
To me,
http://example.com/ABC
reads like the following:
at example.com, a commercial domain, I have a resource called ABC.
I read the same for the following, irrespective of protocol:
https://example.com/ABC
However, web servers can be configured to serve entirely different content at the same ABC resource path depending on whether HTTPS is used, but in my mind they should not do so.
The only caveat is that if anyone wants to return some sort of warning for using plain HTTP, we now have a different meaning; but in that case it should return a 500 or some other error condition.
The answer is: it depends on the web server configuration. They can, and in a lot of cases do, point to the same resources, because HTTP and HTTPS tend to be bound to the same single site/application.
However, because they are accessed over different TCP ports (HTTP on port 80, HTTPS on port 443), it is perfectly possible for the HTTP resource to be served by a different bound site than the HTTPS resource with the same URI (apart from the protocol) and therefore to be totally different.
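To illustrate that second case, a minimal Go sketch (cert.pem and key.pem are placeholder certificate files) where the same /ABC path is bound to completely different handlers on the two ports:

package main

import "net/http"

func main() {
	// Plain HTTP on :80 serves one thing for /ABC ...
	httpMux := http.NewServeMux()
	httpMux.HandleFunc("/ABC", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("plain HTTP version of ABC"))
	})
	go http.ListenAndServe(":80", httpMux)

	// ... while HTTPS on :443 serves something entirely different.
	httpsMux := http.NewServeMux()
	httpsMux.HandleFunc("/ABC", func(w http.ResponseWriter, r *http.Request) {
		w.Write([]byte("HTTPS version of ABC"))
	})
	// cert.pem and key.pem are placeholders for a real certificate and key.
	http.ListenAndServeTLS(":443", "cert.pem", "key.pem", httpsMux)
}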

Is it possible to delegate an HTTP request transparently?

Suppose computer A sends an HTTP request to a server B, and B wants C to answer it. Is it possible for C to send a response back to A without B's intervention and without specific actions from A (as with a 3xx redirection)? Suppose C may not have a public IP address.
That's what a reverse proxy would do. Depending on what platform you are on, there are a lot of options.
One way that works on many platforms is e.g. node-http-proxy, which you could start on server B. In the simplest case, this one-liner would do:
require('http-proxy').createServer(81, 'serverb').listen(80);
It listens on port 80 and forwards requests to port 81 on serverb.
See https://github.com/nodejitsu/node-http-proxy for more options.
Of course, there are lots of well-established proxies with a lot more bells and whistles (although node-http-proxy can do HTTPS tunneling etc. as well), but configuring those can be more challenging than running this one-liner. It all depends on your use case.
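If you'd rather avoid Node, roughly the same one-liner idea can be sketched with Go's standard library (the backend address http://serverb:81 is assumed, mirroring the example above):

package main

import (
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	// Forward everything arriving on :80 to the backend on serverb:81.
	backend, err := url.Parse("http://serverb:81")
	if err != nil {
		panic(err)
	}
	proxy := httputil.NewSingleHostReverseProxy(backend)
	http.ListenAndServe(":80", proxy)
}
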
Edit: Reading your comment, this can be done using direct routing. Your question is about HTTP (layer 7), and since direct routing works at a lower layer, higher-level protocols like HTTP work as well. Quote from http://horms.net/projects/has/html/node9.html:
Direct Routing: Packets from clients are forwarded directly to the
back-end server. The IP packet is not modified, so the back-end
servers must be configured to accept traffic for the virtual server's
IP address. This can be done using a dummy interface, or packet
filtering to redirect traffic addressed to the virtual server's IP
address to a local port. The back-end server may send replies directly
back to the client. That is if a host based layer 4 switch is used, it
may not be in the return path.

What's the difference between an HTTP request and writing HTTP request text to a TCP/IP socket on port 80

Can someone explain the difference between an HTTP request and how it is handled, versus socket requests on port 80? As I understand it, an HTTP server listens on port 80, and when someone sends an HTTP request to this port, the server handles it. So when we place a socket listener on port 80 and then write an HTTP-formatted message to it, does that mean we send a usual HTTP request? But according to Fiddler, that's false. What's the difference at the packet level, or at another level lower than the presentation level, between an HTTP request and HTTP-formatted writing to a socket? Thanks.
First of all, port 80 is just the default port for HTTP; it is not required. You can have HTTP servers listening on other ports as well.
Regarding the difference between "regular" HTTP requests and the ones you make yourself over a socket - there is none. The "regular" HTTP requests you are referring to (made by a web browser, for example) are also implemented over sockets, just as you would do manually. The same goes for the server: the HTTP server implementation listens for incoming socket connections and parses the data that arrives, just like you would.
As long as what you send over your socket is valid HTTP (according to the RFC), there should be no difference at the packet level (assuming the lower network stack is identical).
Keep in mind that the socket layer is just the layer the HTTP data always passes over. It doesn't really matter who put the data there; it comes out the other side the same way it was put in.
Please note that you have some degree of freedom when implementing HTTP yourself. There are many optional fields, and the order of the headers doesn't matter. So it is possible that two different HTTP implementations differ at the packet level but behave basically the same.
The best way to actually see what's going on at the packet level is with a network sniffer like Wireshark or Packetyzer. A sniffer records the packets on the network and shows you their content. So if you record several HTTP implementations (from various browsers) and your own socket implementation, you can make whatever changes are needed until they are identical at the packet level.
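If you want to convince yourself, here is a minimal Go sketch that speaks HTTP by hand over a plain TCP socket (example.com is just a placeholder host); on the wire it is the same kind of traffic an HTTP library would produce:

package main

import (
	"io"
	"net"
	"os"
)

func main() {
	// Open a plain TCP connection to the web server.
	conn, err := net.Dial("tcp", "example.com:80")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// Write a valid HTTP/1.1 request by hand. CRLF line endings and the
	// Host header are required by the RFC.
	io.WriteString(conn, "GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")

	// Dump whatever the server sends back: status line, headers, body.
	io.Copy(os.Stdout, conn)
}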

split HTTP and TCP-only (non-HTTP) traffic

I have a web application that runs on Tomcat (and gets HTTP requests) and another standalone backend application that gets only TCP. For certain reasons, only port 8080 can be used from outside. So I need to accept all TCP connections (from outside) on port 8080 and forward the HTTP ones to the web application on Tomcat, and all pure TCP requests (that are not HTTP) to the standalone application. Internal forwarding can be done to any port, e.g. 8181 for Tomcat and 8282 for the standalone application. Is it possible to set up such a configuration? How could it be done?
Thanks in advance.
TCP and HTTP are protocols at different layers of the networking stack. If you want an application to filter HTTP requests, it has to deal with application-layer information, not just transport-layer information (like TCP/UDP).
I don't see how this can be possible in general. You could look packet by packet, but the middle of an HTTP body can be arbitrary, so you can't just look at the data of each packet.
If any particular client will send you either HTTP or plain TCP but not both, could you do this by source IP address? Do you know the addresses of either the hosts that will send you HTTP requests or the ones that will send you plain TCP?
If you don't know the source IPs, you could heuristically look at the first packet from a previously unknown IP, check whether it looks like HTTP, and then tag that address as carrying HTTP traffic.
What is the content/format of the TCP communication? Is there any pattern you can detect in it?
Perhaps you could do something like this using iptables + the l7-filter project. Of course this will only work if you run Linux on your box, and I don't know how recently the l7-filter project has been updated.
Java servlet technology is not limited to HTTP. The servlet interface lets you read the incoming input stream via ServletRequest.getInputStream(), so you can create an implementation of the Servlet interface, map it in web.xml, and you are all set to receive any TCP traffic.
Once you have read the input stream to sniff the content, you will want to forward the HTTP requests to an HttpServlet. To do this you will need to make sure that the input stream you pass on is positioned at the very beginning of the input.
EDIT: On reading your question again, I noticed that you don't plan to expose Tomcat directly on the external port, as I originally thought. If you are willing to make Tomcat listen on the external port, you can try the approach described above.
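As a sketch of the peek-and-forward idea (in Go rather than Java, with the ports from the question and a deliberately naive heuristic that just checks for a known HTTP method at the start of the stream):

package main

import (
	"bufio"
	"io"
	"net"
	"strings"
)

var httpMethods = []string{"GET ", "POST ", "PUT ", "DELETE ", "HEAD ", "OPTIONS ", "PATCH "}

func main() {
	// Accept everything on the single external port.
	ln, err := net.Listen("tcp", ":8080")
	if err != nil {
		panic(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			continue
		}
		go route(conn)
	}
}

func route(client net.Conn) {
	defer client.Close()

	// Peek at the first bytes without consuming them.
	br := bufio.NewReader(client)
	peek, _ := br.Peek(8)

	target := "127.0.0.1:8282" // standalone app by default
	for _, m := range httpMethods {
		if strings.HasPrefix(string(peek), m) {
			target = "127.0.0.1:8181" // looks like HTTP -> Tomcat
			break
		}
	}

	backend, err := net.Dial("tcp", target)
	if err != nil {
		return
	}
	defer backend.Close()

	// Pipe bytes in both directions; the peeked bytes are still in br.
	go io.Copy(backend, br)
	io.Copy(client, backend)
}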
