Sending binary data over HTTP in the URL

I am trying to find a way to pass binary data to a server over HTTP via the URL field in the browser. Is there a way to bypass the automatic URL encoding done by the browser so I can encode the data myself?
For example: instead of typing the byte with value 48, I want to put %30 into the URL so that the browser doesn't re-encode it and leave me with %2530.
Solved: for anyone who encounters a similar problem in the future, you can do this with the wget option
--restrict-file-names=ascii
which basically ensures that '%' won't be escaped again.

Use Base64 encoding; that's what it's designed for.
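Note that the standard Base64 alphabet includes '+', '/' and '=', which themselves get percent-encoded in URLs; the URL-safe variant from RFC 4648 avoids that. A minimal sketch in Java (the payload bytes are made up):

import java.util.Base64;

public class UrlSafeBase64 {
    public static void main(String[] args) {
        byte[] data = {0x1F, (byte) 0x8B, 0x00, 0x30}; // arbitrary binary payload
        // The URL-safe alphabet uses '-' and '_' instead of '+' and '/'.
        String encoded = Base64.getUrlEncoder().withoutPadding().encodeToString(data);
        System.out.println(encoded); // H4sAMA
        byte[] decoded = Base64.getUrlDecoder().decode(encoded); // round trip on the server
    }
}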

I managed to do it by writing my own TCP client that connects to the HTTP server and transmits a hand-crafted request.
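For reference, the same idea as a rough Java sketch, assuming a hypothetical host and path; because the request bytes are written by hand, nothing re-encodes the percent-escapes:

import java.net.Socket;

public class RawHttpRequest {
    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint; the %30%31%32 escapes go out exactly as typed.
        try (Socket socket = new Socket("example.com", 80)) {
            String request = "GET /data?payload=%30%31%32 HTTP/1.1\r\n"
                    + "Host: example.com\r\n"
                    + "Connection: close\r\n"
                    + "\r\n";
            socket.getOutputStream().write(request.getBytes("ISO-8859-1"));
            System.out.write(socket.getInputStream().readAllBytes());
            System.out.flush();
        }
    }
}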

Use base62 encoding.
The encoded string doesn't contain any characters that will be URL-encoded.
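The Java standard library has no base62 codec; here is a minimal sketch of one using BigInteger (the alphabet order and the length-preserving 0x01 prefix byte are this example's own conventions):

import java.math.BigInteger;

public class Base62 {
    private static final String ALPHABET =
            "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz";

    static String encode(byte[] data) {
        byte[] prefixed = new byte[data.length + 1];
        prefixed[0] = 1; // keeps leading zero bytes of the payload from vanishing
        System.arraycopy(data, 0, prefixed, 1, data.length);
        BigInteger n = new BigInteger(1, prefixed);
        StringBuilder sb = new StringBuilder();
        while (n.signum() > 0) {
            BigInteger[] qr = n.divideAndRemainder(BigInteger.valueOf(62));
            sb.append(ALPHABET.charAt(qr[1].intValue()));
            n = qr[0];
        }
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(new byte[] {0, 48, (byte) 0xFF})); // only [0-9A-Za-z]
    }
}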

Related

How to decrypt an HTTP request captured during packet capture

Hi guys, I'm currently capturing some traffic from an Android application, but the requests it is sending out to the server seem encrypted. Would you know how to decrypt such requests, or is that impossible to do?
platform=android&version=1.0.31&lang=en&requestId=44&time=1485552535566&batch=%5b%7b%22id%22%3a177205%2c%22time%22%3a1485552512601%2c%22name%22%3a%22collectResource%22%2c%22params%22%3a%5b155%5d%2c%22hash%22%3a1948904473%7d%5d&sessionId=674937_bc59a16eae9e1559b2e60ae068baf4e7
That's not encrypted; it's encoded. Do a search for "online url decode". In your example you will get:
platform=android&version=1.0.31&lang=en&requestId=44&time=1485552535566&batch=[{"id":177205,"time":1485552512601,"name":"collectResource","params":[155],"hash":1948904473}]&sessionId=674937_bc59a16eae9e1559b2e60ae068baf4e7
The %xx sequences are URL-encoded hex values; for example, %22 is the hex escape for the double-quote character. If you use JavaScript or another tool to decode the URL encoding, or manually change each %xx to the equivalent character, you will see that the message is really just URL-encoded plain text.
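For instance, decoding just the batch parameter with java.net.URLDecoder recovers the JSON (the string below is shortened from the capture above):

import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class UrlDecodeDemo {
    public static void main(String[] args) {
        String batch = "%5b%7b%22id%22%3a177205%2c%22name%22%3a%22collectResource%22%7d%5d";
        // prints: [{"id":177205,"name":"collectResource"}]
        System.out.println(URLDecoder.decode(batch, StandardCharsets.UTF_8));
    }
}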

Sending a SOAP XML manually and receiving an HTTP 500 error code and binary data in the response

I am using Ruby to send a SOAP request to a very enterprisey bla-bla service, so unfortunately I can't attach any samples. There's nobody to send me any server-side logs, nobody knows what's wrong on the provider side or how the actual HTTP requests need to look (except for a single XML example I got, with no HTTP headers), and the docs are very Microsoft-centric, with C# examples and whatnot ("instantiate AbstractFactoryFactory..." and so on). Long live enterprise software.
But the bottom line is: eventually I took one of their own XMLs from their logs and sent it via HTTP to the endpoint from the WSDL, using the Savon gem's raw-XML option, and got an HTTP 500 error from their host and a bunch of non-ASCII binary data in the body - literally, no ASCII characters at all.
I guessed that maybe Savon does some bad magic, or that the raw-XML option doesn't work as expected, so I tried sending the same request via Faraday, but got the same thing.
The HTTP response headers say it's an HTTP response, XML-encoded, from an ASP.NET host:
"content-type"=>"text/xml; charset=utf-8",
"server"=>"Microsoft-IIS/7.5",
"x-aspnet-version"=>"2.0.50727",
"x-powered-by"=>"ASP.NET",
but again, 440 bytes' worth of binary data in the response:
method=:post,
body=
"\x1F\x8B\b\x00\x00...
etc.
Am I missing some weird aspect of the SOAP specification and need to do something to decode this data, or has their server gone bonkers because of my XML, my HTTP headers or something else, and do I need to ping the provider?
Update 1
I noticed that their original XML had UTF-16 set as the encoding, so I tried encoding the raw string as UTF-16. First Savon spewed errors at me about bad data, then I updated the encoding in the Savon client config, but I still get the HTTP 500 error and binary data as the response, and if I try to log anything, Savon reports a bug:
Encoding::CompatibilityError: incompatible encoding regexp match (US-ASCII regexp with UTF-16 string)
from /home/bbozo/.rvm/gems/ruby-2.2.4/gems/savon-2.11.1/lib/savon/log_message.rb:13:in `to_s'
Faraday basically reported the same behavior, a binary blob.
Update 2
I tried forcing the body into every known encoding and got nothing. Even though the HTTP headers imply the encoding is UTF-8, it obviously isn't:
Encoding.name_list.map{ |e_in| [ e_in, ( response.body.dup.force_encoding(e_in).encode('utf-8') rescue 'incompatible' ) ] }
Nothing in the WSDL files indicates the encoding, and the API spec doesn't mention encoding at all, except that the request XMLs need to be UTF-8 encoded. I tried re-encoding the body, changing the XML encoding declaration and the HTTP headers, but I still get the same binary blob with the same leading bytes (\x1F\x8B\b\x00\x00), so it's not some weird encryption either.
Compression maybe?
I tried HTTPS for good measure; nothing.
Question
Am I missing some weird aspect of the SOAP specification and need to do something to decode this data, or has their server gone bonkers because of my XML, my HTTP headers or something else, and do I need to ping the provider?
The response body was compressed! The \x1F\x8B prefix is the gzip magic number. In the end I just gunzipped it and there it was:
How to decompress Gzip string in ruby?
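The linked question covers the Ruby side (Zlib::GzipReader); for illustration, the same decompression as a minimal Java sketch:

import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

public class GunzipBody {
    // Decompress a gzip-compressed HTTP body; the 0x1F 0x8B bytes at the
    // start of the response above are the gzip magic number.
    static String gunzip(byte[] body) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(body))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}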

How can a server know which character encoding the query string is using in a GET request?

I encountered some problems with character encoding recently.
When I fired an HTTP GET request containing some non-ASCII characters in the query string, I found that the server could not decode the parameters correctly.
My current solution is to configure Tomcat's server.xml, adding the attribute URIEncoding="utf-8" to the <Connector> element.
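The attribute goes on the HTTP connector, something like this (the port and protocol values here are just illustrative; URIEncoding is the relevant part):

<Connector port="8080" protocol="HTTP/1.1" URIEncoding="UTF-8" />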
Well, it solves the problem. But my question is: what if the URL is not encoded with UTF-8? (Like some ANSI encoding; you can do that, right?)
Is there a way for the server to figure out which encoding the URL is using, other than just assuming a fixed value?
PS: I know some basics of character encoding and the differences between UTF-8 and Unicode.
The server dictates the charset(s) it will accept for (percent-encoded) URLs to its resources. If the client sends a URL in the wrong charset, it will not work correctly. There is no protocol to allow the server to advertise its desired charset(s), though. So it is kind of a catch-22. If the URL originates from an HTML page, use the charset of the HTML. Otherwise you just have to guess, and you will probably guess wrong, if the server does not accept UTF-8.
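To see why guessing tends to go wrong, note that the same character percent-encodes differently under different charsets; a quick Java illustration:

import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetGuess {
    public static void main(String[] args) {
        // "ü" is two bytes in UTF-8 but one byte in ISO-8859-1, so the server
        // must know the charset to turn the escapes back into the right character.
        System.out.println(URLEncoder.encode("ü", StandardCharsets.UTF_8));        // %C3%BC
        System.out.println(URLEncoder.encode("ü", Charset.forName("ISO-8859-1"))); // %FC
    }
}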

ASP.NET: convert UTF-8 from a web service request

I get some string data from a web service in UTF-8. How do I convert it to a readable format in an ASPX VB page? The website is German.
UTF-8 is readable. ASP.NET should be able to read it just fine. If it's transmitted with a Content-Type whose charset parameter is set to something other than UTF-8, you might need to instruct ASP.NET to force the decoding to UTF-8. Use Fiddler to figure out what the HTTP request looks like, and pay special attention to the Content-Type header.
If you have an output encoding other than UTF-8, you should still be able to output the characters correctly, as long as you decode them with the correct encoding. What is your output encoding? What encoding is the web service you're communicating with using? Figure out the answers to these questions (using Fiddler) and your solution should be obvious.
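A quick illustration of the failure mode (in Java, with a made-up German sample string): decoding UTF-8 bytes with the wrong charset produces mojibake, and re-decoding the same raw bytes as UTF-8 fixes it:

import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    public static void main(String[] args) {
        byte[] raw = "Müller".getBytes(StandardCharsets.UTF_8);      // bytes on the wire
        String wrong = new String(raw, StandardCharsets.ISO_8859_1); // "MÃ¼ller"
        String right = new String(raw, StandardCharsets.UTF_8);      // "Müller"
        System.out.println(wrong + " -> " + right);
    }
}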

How do web servers know the charset used in forms posted to them?

When a web server gets a POST of a form, parsing it into param-value(s) pairs is quite straightforward. However, if the values contain non-English chars that have been encoded by the browser, it must know the charset used in order to decode them.
I've examined the requests sent by two POSTs: one was done from a page using UTF-8 and one from a page using Windows-1255, and the same text was encoded differently. AFAIK, the Content-Type header could contain a charset after application/x-www-form-urlencoded, but it didn't (using Firefox).
In a servlet, when you use request.getParameter(), you're supposed to get the decoded value. How does the servlet container do that? Does it always bet on UTF-8, use some heuristics, or is there some deterministic way I'm missing?
From the Servlet 3.0 spec, section 3.10, Request Data Encoding (emphasis mine):
Currently, many browsers do not send a char encoding qualifier with the Content-Type header, leaving open the determination of the character encoding for reading HTTP requests. The default encoding of a request the container uses to create the request reader and parse POST data must be “ISO-8859-1” if none has been specified by the client request. However, in order to indicate to the developer, in this case, the failure of the client to send a character encoding, the container returns null from the getCharacterEncoding method.
If the client hasn’t set character encoding and the request data is encoded with a different encoding than the default as described above, breakage can occur. To remedy this situation, a new method setCharacterEncoding(String enc) has been added to the ServletRequest interface. Developers can override the character encoding supplied by the container by calling this method. It must be called prior to parsing any post data or reading any input from the request. Calling this method once data has been read will not affect the encoding.
In practice, I find that setting the charset in a response influences the charset used in the subsequent POST. To be extra sure, you can write a servlet Filter that calls setCharacterEncoding on every request object before it is used, as sketched below.
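A minimal sketch of such a filter against the javax.servlet API (the choice of UTF-8 as the fallback is this example's assumption):

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class CharacterEncodingFilter implements Filter {
    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        // Only override when the client didn't declare a charset; per the spec
        // quoted above, getCharacterEncoding returns null in that case.
        if (request.getCharacterEncoding() == null) {
            request.setCharacterEncoding("UTF-8");
        }
        chain.doFilter(request, response);
    }

    @Override
    public void init(FilterConfig filterConfig) {}

    @Override
    public void destroy() {}
}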
You may also find this thread useful - Detecting the character encoding of an HTTP POST request
The appropriate header for specifying charsets is Accept-Charset.
The latest Chrome on Linux, for example, sends
Accept-Charset:ISO-8859-1,utf-8;q=0.7,*;q=0.3
on each request.
Section 14.2 from http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html states:
The Accept-Charset request-header field can be used to indicate what character sets are acceptable for the response. This field allows clients capable of understanding more comprehensive or special-purpose character sets to signal that capability to a server which is capable of representing documents in those character sets.
(...)
If no Accept-Charset header is present, the default is that any character set is acceptable. If an Accept-Charset header is present, and if the server cannot send a response which is acceptable according to the Accept-Charset header, then the server SHOULD send an error response with the 406 (not acceptable) status code, though the sending of an unacceptable response is also allowed.
So if you receive such a header from a client, the value with the highest q is the charset the client most prefers to receive in the response.
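A rough sketch of picking that preferred charset from an Accept-Charset header by q-value (it doesn't special-case '*' and assumes well-formed q parameters):

import java.util.Arrays;
import java.util.Comparator;

public class AcceptCharset {
    static String preferred(String header) {
        return Arrays.stream(header.split(","))
                .map(String::trim)
                .max(Comparator.comparingDouble(AcceptCharset::q))
                .map(part -> part.split(";")[0].trim())
                .orElse("ISO-8859-1");
    }

    private static double q(String part) {
        for (String param : part.split(";")) {
            String p = param.trim();
            if (p.startsWith("q=")) {
                return Double.parseDouble(p.substring(2));
            }
        }
        return 1.0; // no q parameter means q=1 per RFC 2616
    }

    public static void main(String[] args) {
        System.out.println(preferred("ISO-8859-1,utf-8;q=0.7,*;q=0.3")); // ISO-8859-1
    }
}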
