How can I interpret an HTTPS request with non-UTF-8 characters? - networking

I'm working on a very simple and straightforward reverse proxy in Rust without any external libraries. I've come to my first roadblock. I've noticed that when I try to parse an HTTPS request into UTF-8 it fails. I printed the request as a lossy string. Here is the output:
�f�^���;�r�;�d��N7# ^�8�6 �m�xpPk�
����B]���Fi��֚*G]"�+�/̨̩�,�0�
� ����/5�rus
I was thinking this has something to do with SSL, because on the client side it says something along the lines of "Secure Connection has Failed". I've looked into decoding SSL requests, or whatever this is, and have found nothing useful. Any ideas would be greatly appreciated.
I have tried parsing the request using several different solutions from other platforms. They consisted of relying on base64 and other SSL-related crates meant for decoding text.
For more context, below is a general example of how I go about getting the output above:
use std::{
    io::{Read, Result},
    net::TcpListener,
};

fn main() -> Result<()> {
    let server = TcpListener::bind("localhost:443")?;
    for mut stream in server.incoming().filter_map(Result::ok) {
        let mut buf = [0; 256];
        let bytes = stream.read(&mut buf)?;
        let utf8_lossy = String::from_utf8_lossy(&buf[..bytes]); // this contains the non-utf8 mumbo jumbo
        let utf8 = String::from_utf8(buf[..bytes].to_vec()).unwrap(); // this fails
    }
    Ok(())
}

When an HTTPS client connects to the server, the two first establish a secure socket to protect the data transferred between them. The data passing over this socket is not necessarily text, and cannot be interpreted as such.
Establishing that socket is a multi-step protocol, in which the client sends a ClientHello message, to which you should reply with a ServerHello that contains your certificate. The client then replies with its key material and some cipher information before the socket is finally ready to carry application data. All of these initialization steps use a binary protocol that cannot be interpreted as text. That is why you're not seeing any sensible output.
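You can in fact recognize this traffic from its first bytes. A TLS record begins with a one-byte content type (0x16 means handshake), followed by a two-byte protocol version whose major byte is 0x03. A quick check along these lines (sketched in Go here purely for illustration; the byte layout is language-independent) tells a ClientHello apart from plain HTTP:

package main

import "fmt"

// looksLikeTLSClientHello reports whether a freshly read buffer starts with
// a TLS handshake record rather than plaintext HTTP. The record header is:
// content type (0x16 = handshake), then the protocol version, whose major
// byte is 0x03 for SSL 3.0 / TLS 1.x.
func looksLikeTLSClientHello(buf []byte) bool {
    return len(buf) >= 3 && buf[0] == 0x16 && buf[1] == 0x03
}

func main() {
    fmt.Println(looksLikeTLSClientHello([]byte{0x16, 0x03, 0x01})) // true: a ClientHello
    fmt.Println(looksLikeTLSClientHello([]byte("GET / HTTP/1.1"))) // false: plain HTTP
}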
Only once that socket is set up does HTTP data begin to flow over the connection. That is likely what you're expecting to see, as it contains the familiar GET / HTTP/1.1, etc.
OpenSSL, one of the libraries you mentioned looking into, has a way to set up a socket that performs the handshake for you. See the docs.
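The same pattern exists in every TLS stack: you hand the library a certificate and a private key, and it runs the handshake before giving you readable bytes. As a minimal sketch of what that looks like (using Go's crypto/tls for illustration; the Rust openssl and rustls crates expose equivalents, and server.crt / server.key are hypothetical paths):

package main

import (
    "crypto/tls"
    "log"
)

func main() {
    // The library performs the ClientHello/ServerHello exchange for us;
    // by the time Read returns, the bytes are decrypted application data.
    cert, err := tls.LoadX509KeyPair("server.crt", "server.key") // hypothetical paths
    if err != nil {
        log.Fatal(err)
    }
    ln, err := tls.Listen("tcp", ":443", &tls.Config{Certificates: []tls.Certificate{cert}})
    if err != nil {
        log.Fatal(err)
    }
    for {
        conn, err := ln.Accept()
        if err != nil {
            continue
        }
        go func() {
            defer conn.Close()
            buf := make([]byte, 4096)
            n, err := conn.Read(buf)
            if err != nil {
                return
            }
            // Now this prints a readable request line, e.g. "GET / HTTP/1.1".
            log.Printf("%s", buf[:n])
        }()
    }
}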

Related

Difference between multiplex and multistream

What is the difference between multistream (yamux, multistream-select, ..) and multiplex (mplex)?
I'd like to utilize one TCP connection for RPC, HTTP, etc. (one client is behind a firewall), like this:
conn = tcp.connect("server.com:1111")
conn1, conn2 = conn.split()
stream1 = RPC(conn1)
stream2 = WebSocket(conn2)
..
// received packets tagged for conn1 are forwarded to stream1
// received packets tagged for conn2 are forwarded to stream2
// writing to stream1 tags the packets for conn1
// writing to stream2 tags the packets for conn2
Which one suits this case?
The short answer: mplex and yamux are both Stream Multiplexers (aka stream muxers), and they're responsible for interleaving multiple "logical streams" over a single "raw" connection (e.g. TCP). Multistream is used to identify what kind of protocol should be used when sending / receiving data over the stream, and multistream-select lets peers negotiate which protocols are supported by each end and hopefully agree on one to use.
Long answer:
Stream muxing is an interface with several implementations. The "baseline" stream muxer is called mplex - a libp2p-specific protocol with implementations in JavaScript, Go and Rust.
Stream multiplexers are "pluggable", meaning that you add support for them by pulling in a module and configuring your libp2p app to use them. A given libp2p application can support several multiplexers at the same time, so for example, you might use yamux as the default but also support mplex to communicate with peers that don't support yamux.
While having this kind of flexibility is great, it also means that we need a way to figure out what stream muxer to use for any specific connection. This is where multistream and multistream-select come in.
Multistream (despite the name) is not directly related to stream multiplexing. Instead, it acts as a "header" for a stream of binary data that contextualizes the stream with a protocol id. The closely-related multistream-select protocol uses multistream protocol ids to negotiate what protocols to use for the "next phase" of communication.
So, to agree upon what stream muxer to use, we use multistream-select.
Here's an example of the multistream-select back-and-forth:
/multistream/1.0.0 <- dialer says they'd like to use multistream 1.0.0
/multistream/1.0.0 -> listener echoes back to indicate agreement
/secio/1.0.0 <- dialer wants to use secio 1.0.0 for encryption
/secio/1.0.0 -> listener agrees
* secio handshake omitted. what follows is encrypted via secio: *
/mplex/6.7.0 <- dialer would like to use mplex 6.7.0 for stream multiplexing
/mplex/6.7.0 -> listener agrees
This is the simple case where both sides agree upon everything - if e.g. the listener didn't support /mplex/6.7.0, they could respond with na (not available), and the dialer could either try another protocol, ask for a list of supported protocols by sending ls, or give up.
In the example above, both sides agreed on mplex, so future communication over the open connection will be subject to the semantics of mplex.
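To make the framing concrete, here is a simplified sketch of the dialer's side of that exchange. Real multistream messages are varint-length-prefixed protocol ids terminated by a newline; this Go sketch follows that shape but is illustrative only, not a wire-compatible implementation (the peer address is hypothetical, and the secio step from the trace above is omitted):

package main

import (
    "bufio"
    "encoding/binary"
    "fmt"
    "io"
    "net"
)

// writeMsg frames one multistream message: a uvarint length prefix,
// then the protocol id terminated by '\n'.
func writeMsg(w io.Writer, proto string) error {
    msg := proto + "\n"
    var lenBuf [binary.MaxVarintLen64]byte
    n := binary.PutUvarint(lenBuf[:], uint64(len(msg)))
    _, err := w.Write(append(lenBuf[:n], msg...))
    return err
}

// readMsg reads one framed message back from the listener.
func readMsg(r *bufio.Reader) (string, error) {
    size, err := binary.ReadUvarint(r)
    if err != nil {
        return "", err
    }
    buf := make([]byte, size)
    if _, err := io.ReadFull(r, buf); err != nil {
        return "", err
    }
    return string(buf), nil
}

func main() {
    conn, err := net.Dial("tcp", "peer.example.com:4001") // hypothetical peer
    if err != nil {
        panic(err)
    }
    defer conn.Close()
    r := bufio.NewReader(conn)
    // Propose each protocol in turn; the listener echoes the id back on
    // agreement, or answers "na" if the protocol is unsupported.
    for _, proto := range []string{"/multistream/1.0.0", "/mplex/6.7.0"} {
        if err := writeMsg(conn, proto); err != nil {
            panic(err)
        }
        reply, err := readMsg(r)
        if err != nil {
            panic(err)
        }
        if reply != proto+"\n" {
            fmt.Println("listener declined:", reply)
            return
        }
    }
    fmt.Println("agreed on mplex; muxed streams can now be opened")
}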
It's important to note that most of the details above will be "invisible" to you when opening individual connections in libp2p, since it's rare to use the multistream and stream muxing libraries directly.
Instead, a libp2p component called the "switch" (also called the "swarm" by some implementations) manages the dialing / listening state for the application. The switch handles the multistream negotiation process and "hides" the details of which specific stream muxer is in use from the rest of the libp2p stack.
As a libp2p developer, you generally dial other peers using the switch interface, which will give you a stream to read from and write to. Under the hood, the switch will find the appropriate transport (e.g. TCP / websockets) and use multistream-select to negotiate encryption & stream multiplexing. If you already have an open connection to the remote peer, the switch will just use the existing connection and open another muxed stream over it, instead of starting from scratch.
The same goes for listening for connections - you give the switch a protocol id and a stream handler function, and it will handle the muxing & negotiation process for you.
Our documentation is a work-in-progress, but there is some information at https://docs.libp2p.io that might help clarify, especially the concept doc on Transports and the glossary. You can also find links to example code.
Improving the docs for libp2p is my main quest at the moment, so please feel free to file issues at https://github.com/libp2p/docs to let me know what your most important missing pieces are.

Proper way to mux an HTTP server in Go that can handle non-HTTP protocols

I have built a router with extended logging capabilities using Go. It works properly for most use cases. However, it encounters problems when clients send non-standard HTTP messages on port 80.
To date, I have solved this by implementing my own version of ServeHTTP():
func (myproxy *MyProxy) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    // Inspect headers
    // Determine if it is a custom protocol (ie: websockets, CONNECT requests)
    // Implement handlers for each type
}
In the event that I determine a request is a non-standard HTTP protocol, the request is played back to the original destination (via http.Request.Write()) and everyone is happy.
At least, most of the time. The problem arises with edge cases. Tivo clients do not send "Host" headers and appear not to like all kinds of other standard things that Go does (such as capitalizing header names). The number of possible variations on this is endless, so what I would very much like to do is buffer the original request - exactly as it was sent to me, without any modifications - and replay it back to the original destination server.
I could hack this by re-implementing net/http's Server, but this seems like an extraordinarily difficult and brittle approach. What I would prefer to do is somehow hook into net/http/server.go right around the point where it receives a connection, then wrap that in a spy wrapper that logs the request to a buffer.
func (srv *Server) Serve(l net.Listener) error {
    // Lots of listener code...
    c := srv.newConn(rw)
    c.setState(c.rwc, StateNew) // before Serve can return
    // I'd like to wrap c here and save the request for possible replay later.
    go c.serve(ctx)
}
https://golang.org/src/net/http/server.go?s=93229:93284#L2805
Edit: I have looked at httputil.DumpRequest, but it slightly modifies the original request (changes the case and order of headers). If it didn't do that, it would be an ideal solution.
https://godoc.org/net/http/httputil#DumpRequest
Is there a way to hook connections around this point before they are parsed by http.Request?
In the interest of helping others, I wanted to answer my question. The approach I suggested above does in fact work and is the proper way to do this. In summary:
Implement ListenAndServe()
Wrap the incoming net.Conn in a TeeReader or other multiplexed connection wrapper
Record the request
Dial the original destination and connect with the inbound connection, replaying the original request if necessary.
A similar need arises when upgrading connection requests for websocket servers. A nice writeup can be found here:
https://medium.freecodecamp.org/million-websockets-and-go-cc58418460bb
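To illustrate steps 2-4 of that summary, here is a minimal sketch (hypothetical names, error handling elided) of a net.Conn wrapper that records every byte the HTTP machinery reads, so the raw request can later be replayed verbatim:

package main

import (
    "bytes"
    "io"
    "net"
)

// recordedConn wraps a net.Conn so that every byte the server reads is
// also copied into buf, allowing the raw request to be replayed verbatim.
type recordedConn struct {
    net.Conn
    tee io.Reader
    buf *bytes.Buffer
}

func newRecordedConn(c net.Conn) *recordedConn {
    buf := new(bytes.Buffer)
    return &recordedConn{Conn: c, tee: io.TeeReader(c, buf), buf: buf}
}

// Read pulls from the underlying connection while transparently
// appending the same bytes to the record buffer.
func (rc *recordedConn) Read(p []byte) (int, error) { return rc.tee.Read(p) }

func main() {
    ln, err := net.Listen("tcp", ":8080")
    if err != nil {
        panic(err)
    }
    for {
        c, err := ln.Accept()
        if err != nil {
            continue
        }
        rc := newRecordedConn(c)
        go func() {
            defer rc.Close()
            // Hand rc to the HTTP machinery here. If the request turns
            // out to be non-standard, dial the origin and replay
            // rc.buf.Bytes() exactly as it was received.
        }()
    }
}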

Keeping socket open after HTTP request/response to Node.js server

To support a protocol (Icecast Source Protocol) based on HTTP, I need to be able to use a socket from Node.js's http.Server once the HTTP request is finished. A sample request looks like this:
Client->Server: GET / HTTP/1.0
Client->Server: Some-Headers:header_value
Client->Server:
Server->Client: HTTP/1.0 200 OK
Server->Client:
Client->Server: <insert stream of binary data here>
This is to support the source of an internet radio stream, the source of the stream data being the client in this case.
Is there any way I can use Node.js's built in http.Server? I have tried this:
this.server = http.createServer(function (req, res) {
    console.log('connection!');
    res.writeHead(200, {test: 'woot!'});
    res.write('test');
    res.write('test2');
    req.connection.on('data', function (data) {
        console.log(data);
    });
}).listen(1337, '127.0.0.1');
If I telnet into port 1337 and make a request, I am able to see the first couple characters of what I type on the server console window, but then the server closes the connection. Ideally, I'd keep that socket open indefinitely, and take the HTTP part out of the loop once the initial request is made.
Is this possible with the stock http.Server class?
Since the client is reporting HTTP/1.0 as the protocol version, the server is probably closing the connection. If your client is something you have control over, you might want to try setting the keep-alive header (Connection: Keep-Alive is the right one, I think).
My solution to this problem was to reinvent the wheel and write my own HTTP-ish server. Not perfect, but it works. Hopefully the innards of some of these stock Node.js classes will be exposed some day.
I was in a similar situation, here's how I got it to work:
http.createServer(function (req, res) {
    // Prepare the response headers
    res.writeHead(200);
    // Flush the headers to the socket
    res._send('');
    // Inform the http.ServerResponse instance that we've sent the headers
    res._headerSent = true;
}).listen(1234);
The socket will now remain open, since no http.ServerResponse.end() has been called, but the headers have been flushed.
If you want to send response data (not that you'll need to for an Icecast source connection), simply:
res.write(buffer_or_string);
res._send('');
When closing the connection just call res.end().
I have successfully streamed MP3 data using this method, but haven't tested it under stress.
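For what it's worth, the hook being reached for here is something Go's standard net/http exposes directly as http.Hijacker, which hands the handler the raw socket once the HTTP exchange is done. A sketch of the same Icecast-style handshake in Go, shown purely for comparison rather than as Node.js advice:

package main

import (
    "log"
    "net/http"
)

func main() {
    http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
        hj, ok := w.(http.Hijacker)
        if !ok {
            http.Error(w, "hijacking not supported", http.StatusInternalServerError)
            return
        }
        // Take the socket away from the HTTP machinery.
        conn, rw, err := hj.Hijack()
        if err != nil {
            return
        }
        defer conn.Close()
        // Answer the handshake by hand, then keep the socket open.
        rw.WriteString("HTTP/1.0 200 OK\r\n\r\n")
        rw.Flush()
        // From here on the connection is ours: read the binary source
        // stream indefinitely, exactly as an Icecast source expects.
        buf := make([]byte, 4096)
        for {
            n, err := rw.Read(buf)
            if err != nil {
                return
            }
            log.Printf("got %d bytes of stream data", n)
        }
    })
    http.ListenAndServe(":1337", nil)
}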

What does it take to convert an http server into an https server?

This question is similar to
Starting to use OpenSSL
but more specific and detailed so I think it's fair to ask.
Suppose I have a simple HTTP server that does the following in a successful GET scenario:
creates a listening socket
a client connects
reads the data through recv
parses the GET request, now it knows which resource to return
writes the response through send
closes the socket
This server is written in C++ on Linux.
My question is: what does it take to convert this server into a minimal HTTPS server? (In particular using OpenSSL, but answers in a general sense are welcome.)
Here's my understanding (question marks mean I have no idea):
initialize the library
read the server certificate and private key and other configurations
create a normal listening socket(?)
a client connects
do the handshaking through a library function(?)
handshaking done
do I need a special step before I start receiving and sending data?
read data through library function(?)
does the data look exactly like an HTTP GET at this point?
if it does, parse the GET and get the resource
write return data through library function(?)
close the connection through a library function(?)
In summary, I'm hoping that it only requires adding some extra steps to the current code and does not affect the HTTP parsing. Is this assumption correct?
Many thanks to anybody who could fill in the blanks.
Look through "Network Security with OpenSSL", as it covers this. Even if you don't have the book, you can look through the code.
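Since general answers are welcome: the step list above maps almost one-to-one onto any TLS library. Below is a sketch using Go's crypto/tls as a stand-in for OpenSSL (the calls differ but the shape is the same; the explicit Handshake call corresponds to OpenSSL's SSL_accept, and the certificate paths are hypothetical). It also answers the inline questions: no special step is needed after the handshake, and the decrypted data looks exactly like a normal HTTP GET, so the existing parsing code is untouched.

package main

import (
    "crypto/tls"
    "net"
)

func main() {
    // Read the server certificate and private key (hypothetical paths).
    cert, err := tls.LoadX509KeyPair("server.crt", "server.key")
    if err != nil {
        panic(err)
    }
    cfg := &tls.Config{Certificates: []tls.Certificate{cert}}

    // Yes: a completely normal listening socket.
    ln, err := net.Listen("tcp", ":443")
    if err != nil {
        panic(err)
    }
    for {
        raw, err := ln.Accept() // a client connects, as before
        if err != nil {
            continue
        }
        conn := tls.Server(raw, cfg) // wrap the plain socket
        go func() {
            defer conn.Close() // the library sends close_notify for us
            // Handshake via a library function (OpenSSL: SSL_accept).
            if err := conn.Handshake(); err != nil {
                return
            }
            // After the handshake, Read/Write carry plaintext and the
            // data looks exactly like a normal HTTP GET.
            buf := make([]byte, 4096)
            n, err := conn.Read(buf) // recv becomes a library read
            if err != nil {
                return
            }
            _ = buf[:n] // parse the GET and pick the resource, as before
            // send becomes a library write:
            conn.Write([]byte("HTTP/1.0 200 OK\r\n\r\nhello\n"))
        }()
    }
}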

Streaming Zope HTTP responses with proxy views

I am using the following Plone + urllib2 code to proxy responses from another server through a BrowserView:
req = urllib2.Request(full_url)

# Important, or if the remote server is slow all our web server
# threads get stuck here. But this is UGLY, as Python does not
# provide per-thread or per-socket timeouts through urllib
original_timeout = socket.getdefaulttimeout()
try:
    socket.setdefaulttimeout(10)
    response = urllib2.urlopen(req)
finally:
    # restore original timeout
    socket.setdefaulttimeout(original_timeout)

# XXX: How to stream the response through Zope?
# AFAIK we cannot do it currently
return response.read()
My question is: how can I make this function not block, and instead start streaming the proxied response through Zope as soon as the first bytes arrive? Which interfaces, objects or patterns are used to make streamable Zope responses?
I think there are two ways you can do this. Firstly, the Zope response itself is file-like so you can use the response's write() method to write successive chunks of data to the response as they come in. Here's an example where I use a Zope response as a file-like object for a csv.writer.
Or you can use ZPublisher's IStreamIterators and wrap the response in a ZPublisher.Iterators.filestream_iterator wrapper and return the wrapper.
This should actually be a comment, but I don't have the reputation yet.
I am trying to do the same thing as you, Mikko, and RESPONSE.write() does exactly that, as Ross said it would. Note however that the bytes won't actually leave the interface until there are 64K of them (or the connection closes). Flushing stdout won't help, so it seems you will have to interfere further down at the socket level to promptly send a few bytes right away.
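The pattern both answers describe (write chunks as they arrive and flush early, instead of buffering the whole body) is the same in any server stack. Here is a sketch of it in Go, purely for comparison, with a hypothetical upstream host; note the per-client timeout, which is also the clean fix for the socket.setdefaulttimeout() hack above:

package main

import (
    "net/http"
    "time"
)

// proxy streams the upstream body to the client chunk by chunk, flushing
// each chunk so bytes leave as soon as they arrive instead of waiting
// for the whole response to be read.
func proxy(w http.ResponseWriter, r *http.Request) {
    client := &http.Client{Timeout: 10 * time.Second} // per-request timeout
    resp, err := client.Get("http://upstream.example" + r.URL.Path) // hypothetical upstream
    if err != nil {
        http.Error(w, err.Error(), http.StatusBadGateway)
        return
    }
    defer resp.Body.Close()
    flusher, _ := w.(http.Flusher)
    buf := make([]byte, 32*1024)
    for {
        n, err := resp.Body.Read(buf)
        if n > 0 {
            w.Write(buf[:n])
            if flusher != nil {
                flusher.Flush() // push this chunk out now
            }
        }
        if err != nil { // io.EOF when the upstream body ends
            return
        }
    }
}

func main() {
    http.HandleFunc("/", proxy)
    http.ListenAndServe(":8080", nil)
}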
