Better understand http requests - http

I'm trying to properly learn how the HTTP protocol works.
I'm struggling to find online books or resources: most of what I find shows how to make these requests in various languages, not how they actually work.
For context, I'm trying to build a Flutter app with a Rust server as an exercise, following some security standards. I've been programming for a while now, so I have some concepts down, but I want to learn more about this area.
What I've understood so far is that a client can ask a server for something at some URL, and the server will send back the HTML page content for that URL.
My questions are:
What data do HTTP requests carry? Can it be anything, or is it exclusively HTML text?
When doing an HTTP GET request, is there any way to write data in the body of the request? Or do I only have the URL with its parameters to ask the server for specific data?
Can I write anything in an HTTP request? For example, strings encoded with private/public keys?
How can the client be assured it is indeed talking to the right server when sending the first requests?
Maybe this is a little off-topic for this forum, but I've been trying to learn all this properly and feel stuck with what I found online.
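For reference, here is a minimal sketch of what an HTTP/1.1 request looks like as bytes on the wire: a request line, header lines, a blank line, then an optional body. The body is just bytes, and the Content-Type header tells the server how to interpret them. The helper name build_request, the /upload path and the example.com host are made up for illustration.

```python
def build_request(method: str, path: str, host: str, body: bytes = b"",
                  content_type: str = "application/octet-stream") -> bytes:
    """Serialize a minimal HTTP/1.1 request into the bytes sent over the socket."""
    # Request line first, then headers, one per line.
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        # Content-Type says how to interpret the body;
        # Content-Length says how many bytes of body follow the headers.
        lines.append(f"Content-Type: {content_type}")
        lines.append(f"Content-Length: {len(body)}")
    # A blank line (CRLF CRLF) separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


# Hypothetical example: a POST whose body is arbitrary binary data, not HTML.
raw = build_request("POST", "/upload", "example.com", body=bytes([0, 255, 16, 32]))
```

Nothing in this format restricts the body to HTML or even to text; it is the headers that describe what the bytes mean.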

Related

How to make a request to a site with reCAPTCHA with Python Requests

Goal
I want to make a request to a website with Python requests to scrape container location and time information.
This is the website I'm trying to get data from, by entering the container number: https://www.cma-cgm.com/ebusiness/tracking
I'm trying something simple, like :
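import requests

url = "some_url_i_cant_find"  # placeholder: the real endpoint is still unknown
tracking_number = "ABCD1234567"  # the container number must be a quoted string
# requests.post() has no `payload` argument; form fields are passed via `data`.
# The field name "tracking_number" is a guess, not the site's real parameter.
response = requests.post(url, data={"tracking_number": tracking_number})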
Problem
I cannot find in the Network tab how the request to get the container's data is being processed.
I assume this has something to do with reCAPTCHA, but I don't know much about this or how to handle it.
Solution
Some other answer or topic regarding this issue
How to make a request to this website and read the response.

Change web html content/Run bash script on http request

Disclaimer: I am not good at understanding http requests, so please bear with me
I am trying to change the content of an html web page whenever an http GET/POST request is made. It would work something like this:
What I want to accomplish
When my phone is charging, it is going to send an http request to the web server. The web server is going to change the content of the webpage to say something like "Phone is charging."
What I've done so far
I managed to send an HTTP request from my phone to the server every time the phone connects to a charger. I just don't know what to do with the HTTP request once it arrives at the server.
Thanks ahead of time!
EDIT: I figured out, thanks to @LawrenceCherone (thanks Lawrence!), that I can't do this with a static HTML page and just nginx. He said that I have to use a scripting language. Does bash work for this? Or should I learn something like Python, PHP, or something else?
For some reason I can't find any tutorials online for what I am trying to accomplish. I haven't seen any tutorials on how to 'react' to a POST request.
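As a rough illustration of 'reacting' to a POST request, here is a minimal sketch using only Python's standard library. It keeps a message in memory, changes it when a POST arrives, and renders it on every GET. The /charging path, the state dictionary and the function names are assumptions made for this example, and the built-in server is meant for experiments rather than production use.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Shared state that the POST handler updates and the GET handler renders.
state = {"message": "Phone is not charging."}


class StatusHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Consume the request body (if any) before answering.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        if self.path == "/charging":  # hypothetical endpoint the phone calls
            state["message"] = "Phone is charging."
        self.send_response(204)  # success, with no response body
        self.end_headers()

    def do_GET(self):
        # Build the page from the current state on every request.
        body = f"<html><body><p>{state['message']}</p></body></html>".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def start_server(port: int = 0) -> HTTPServer:
    """Start the server on a background thread; port 0 picks any free port."""
    server = HTTPServer(("127.0.0.1", port), StatusHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

After start_server(), a POST to /charging on the chosen port changes what later GET requests return. The page is generated by code on each request, which is why a static HTML file alone cannot do this.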

Inspect how requests routed through a proxy look to their destination

My web app makes requests to third-party servers, and we sometimes route them through proxies. I'd like to be able to "see what they see": what the request looks like once it's been routed through the proxy.
Specifically, I'm interested in how much identifying information about the source (my web app) is left in the request once it reaches the destination, having been routed through the proxy.
Does anyone know an easy way to do this? Maybe a web service that will just echo back all the information about the incoming request in the outgoing response?
Not a full answer, but maybe you can try:
http://www.cantoni.org/2012/01/08/simple-webservice-echo-test
And the other two sites mentioned there:
http://respondto.it/
http://requestb.in/
Use them to set up a URL to send your requests to, and see if the information they report helps you.
I'm just stating this as an idea that came to me. You could try sending requests to your own URL, which you control (i.e. a resource in your own web application). That way, you can use your debugging infrastructure or other facilities (basically anything you want) to inspect the request that's coming into your application. It seems to me this might be the most powerful / easiest way to do this. It won't let you test the URL you were trying to test, but in terms of proxy visibility, it might be what you need.
Good luck!
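A minimal sketch of the "send requests to a URL you control" idea, using only Python's standard library: the handler echoes back the peer address, request line and headers it received. The names EchoHandler and start_echo_server are made up for this example. It assumes the server runs somewhere the proxy can reach; headers such as Via, Forwarded or X-Forwarded-For are the ones proxies commonly add, but which ones appear depends on the proxy.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Report what arrived: peer address, request line, and headers.
        # Note: dict() keeps only the last value of a repeated header name.
        report = {
            "client_address": self.client_address[0],
            "request_line": self.requestline,
            "headers": dict(self.headers.items()),
        }
        body = json.dumps(report, indent=2).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def start_echo_server(port: int = 0) -> HTTPServer:
    """Start the echo server on a background thread; port 0 picks a free port."""
    server = HTTPServer(("127.0.0.1", port), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Comparing the JSON returned for a direct request with the JSON returned for a proxied one shows which details of the source survive the proxy.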
If the proxy supports the TRACE method and the Max-Forwards header you can use that. Not all do, however.

HTTP Response before Request

My question might sound stupid, but I just wanted to be sure:
Is it possible to send an HTTP response before having the request for that resource?
Say for example you have an HTML page index.html that only shows a picture called img.jpg.
Now, if your server knows that a visitor will request the HTML file and then the jpg image every time:
Would it be possible for the server to send the image just after the HTML file to save time?
I know that HTTP is a synchronous protocol, so in theory it should not work, but I just wanted someone to confirm it (or not).
A recent post by Jacques Mattheij, referencing your very question, claims that although HTTP was designed as a synchronous protocol, the implementation was not. In practice, the browser (he doesn't specify which exactly) accepts answers to requests that have not been sent yet.
On the other hand, if you are looking for something less hacky, you could have a look at:
Push techniques that allow the server to send content to the browser. The modern implementation, which replaces the long-polling/Comet "hacks", is WebSockets. You may also want to have a look at socket.io.
Alternatively you may want to have a look at client-side routing. Some implementations combine this with caching techniques (like in derby.js I believe).
If someone requests /index.html and you send two responses (one for /index.html and the other for /img.jpg), how do you know the recipient will get the two responses and know what to do with them before the second request goes in?
The problem is not really with the sending. The problem is with the receiver possibly getting unexpected data.
One other issue is that you're denying the client the ability to use HTTP caching tools like If-Modified-Since and If-None-Match (i.e. the client might not want /img.jpg to be sent because it already has a cached copy).
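To make the caching point concrete, here is a simplified sketch of how a server might honour If-None-Match. It assumes the header carries a single ETag value (real requests may list several, or use "*"), and the function names are made up for this example.

```python
import hashlib


def make_etag(content: bytes) -> str:
    """Derive an ETag from the content; HTTP requires the value to be quoted."""
    return '"' + hashlib.sha256(content).hexdigest()[:16] + '"'


def respond(content: bytes, request_headers: dict) -> tuple:
    """Return (status, headers, body), honouring a single-valued If-None-Match."""
    etag = make_etag(content)
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is current: answer 304 and send no body.
        return 304, {"ETag": etag}, b""
    # No cached copy, or a stale one: send the full resource.
    return 200, {"ETag": etag, "Content-Length": str(len(content))}, content
```

A server that pushes /img.jpg unprompted never sees this header, so it cannot skip the transfer for clients that already hold the image.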
That said, you can approximate the server-push benefits by using Comet techniques. But that is much more involved than simply anticipating incoming HTTP requests.
You'll get a better result by caching resources effectively, i.e. setting proper cache headers and configuring your web server for caching. You can also inline images using base 64 encoding, if that's a specific concern.
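As a small illustration of the base 64 inlining mentioned above, this sketch turns image bytes into a data: URI that can be placed directly in an img tag, so the image arrives inside the HTML response. The function names are made up for this example, and the MIME type must match the real image format.

```python
import base64


def to_data_uri(data: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw bytes as a data: URI usable as an <img> src."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_img_tag(data: bytes, mime_type: str = "image/jpeg") -> str:
    """Build an <img> tag that embeds the image instead of linking to it."""
    return f'<img src="{to_data_uri(data, mime_type)}">'
```

The trade-off is that base 64 grows the data by roughly a third, and an inlined image cannot be cached separately from the page.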
You can also look at long polling javascript solutions.
You're looking for server push: it isn't available in HTTP. Protocols like SPDY have it, but you're out of luck if you're restricted to HTTP.
I don't think it is possible to mix .html and image in the same HTTP response. As for sending image data 'immediately', right after the first request - there is a concept of 'static resources' which could be of help (but it will require the client to create a new request for a specific resource).
There are a couple of interesting things mentioned in the article.
No it is not possible.
The first line of the request holds the resource being requested so you wouldn't know what to respond with unless you examined the bytes (at least one line's worth) of the request first.
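A minimal sketch of that point: the server has to read at least the first line of the request before it knows which resource is wanted. The function name is made up for this example, and it assumes a well-formed HTTP/1.x request line.

```python
def parse_request_line(raw: bytes) -> tuple:
    """Split the first line of an HTTP/1.x request into (method, target, version)."""
    # Everything up to the first CRLF is the request line.
    first_line, _, _ = raw.partition(b"\r\n")
    method, target, version = first_line.decode("ascii").split(" ")
    return method, target, version


# The target ("/img.jpg") is only known once these bytes have arrived.
request = b"GET /img.jpg HTTP/1.1\r\nHost: example.com\r\n\r\n"
method, target, version = parse_request_line(request)
```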
No. HTTP is defined as a request/response protocol. One request: one response. Anything else is not HTTP, it is something else, and you would have to specify it properly and implement it completely at both ends.

Is it safe to redirect to the same URL?

I have URLs of the form http://domain/image/⟨uuid⟩/42x42/some_name.png. The Web server (nginx) is configured to look for a file /some/path/image/⟨uuid⟩/thumbnail_42x42.png, and if it does not exist, it sends the URL to the backend (Django via mod_wsgi) which then generates the thumbnail. Then the backend emits a 302 redirect to exactly the same URL that was requested by the client, with the idea that upon this second request the server will notice the thumbnail file and send it directly.
The question is, will this work with all the browsers? So far testing has shown no problems, but can I be sure all the user agents will interpret this as intended?
Update: Let me clarify the intent. Currently this works as follows:
The client requests a thumbnail of an image.
The server sees the file does not exist, so it forwards the request to the backend.
The backend creates the thumbnail and returns 302.
The backend releases all the resources, letting the server share the newly generated file to current and subsequent clients.
Having the backend serve the newly created image is worse for two reasons:
Two ways of serving the same data must be created;
The server is much better at serving static content. What if the client has an extremely slow link? The backend is not particularly fast nor memory-efficient, and keeping it in memory while spoon-feeding the client can be wasteful.
So I keep the backend working for the minimum amount of time.
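The flow described above can be simulated in a few lines. This is only a sketch of the logic, not the actual nginx/Django setup: one function plays both roles, the file contents are a stand-in for real thumbnail generation, and all names are made up for this example.

```python
import tempfile
from pathlib import Path


def handle_thumbnail_request(url_path: str, thumb_file: Path) -> tuple:
    """Simulate one request for a thumbnail; return (status, headers, body)."""
    if thumb_file.exists():
        # Front-end server role: the file exists, so serve it directly.
        return 200, {"Content-Type": "image/png"}, thumb_file.read_bytes()
    # Backend role: generate the file, then redirect to the very same URL.
    thumb_file.parent.mkdir(parents=True, exist_ok=True)
    thumb_file.write_bytes(b"fake-png-bytes")  # stand-in for real thumbnail generation
    return 302, {"Location": url_path}, b""


# Hypothetical walk-through: the first request redirects, the second is served.
with tempfile.TemporaryDirectory() as tmp:
    thumb = Path(tmp) / "image" / "thumbnail_42x42.png"
    url = "/image/some-uuid/42x42/some_name.png"
    first = handle_thumbnail_request(url, thumb)
    second = handle_thumbnail_request(url, thumb)
```

Here first is a 302 pointing back at the same URL and second is a 200 with the file contents. If the file were never written, every request would return 302 again, so the client would loop until it gives up.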
Update²: I’d really appreciate some RFC references or opinions of someone with experience with lots of browsers. All those affirmative answers are pleasant but they look somewhat groundless.
If it doesn't, the client's broken. Most clients will follow redirect loops up to a maximum number of redirects. So yes, it should be fine, unless your backend fails to generate the thumbnail for any reason, in which case the client will keep being redirected until it hits that limit.
You could instead change URLs to be http://domain/djangoapp/generate_thumbnail and that'll return the thumbnail and the proper content-type and so on
Yes, it's fine to re-direct to the same URI as you were at previously.

Resources