Send a file as well as parameters (through JSON) inside one HTTP request

I am creating a server using Go that allows the client to upload a file and then use a server function to parse the file. Currently, I am using two separate requests:
1) First request sends the file the user has uploaded
2) Second request sends the parameters the server needs to parse the file.
However, I have realised that, due to the nature of the program, there can be concurrency problems if multiple users try to use the server at the same time. My solution to that was using mutex locks. However, since I receive the file, send a response, and then receive the parameters, it seems that Go cannot send a response back while the mutex is locked. I am thinking about solving this problem by sending both the file and the parameters in one single HTTP request. Is there a way to do that? Thanks
Sample code (only relevant parts):
Code to send file from client:
handleUpload() {
    const data = new FormData()
    for (var x = 0; x < this.state.selectedFile.length; x++) {
        data.append('myFile', this.state.selectedFile[x])
    }
    var self = this;
    let url = *the appropriate url*
    axios.post(url, data, {})
        .then(res => {
            // other logic
            self.handleNessusParser();
        })
}
Code for handleNessusParser():
handleNessusParser() {
    let parserParameter = {
        SourcePath: location,
        ProjectName: this.state.projectName
    }
    // fetch the response from the server
    let self = this;
    let url = *url*
    fetch(url, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(parserParameter),
    }).then((response) => {
        if (response.status === 200) {
            // success logic
        }
    }).catch(function (error) {
        console.log("error: ", error);
    });
}

The question is not really about Go or reactjs or any particular software library.
To solve your problem you'd first need to understand how HTTP POST works,
hence I invite you to first read this intro on MDN.
In short:
There are multiple ways to encode the data sent in a POST request.
The way the receiver should deal with this data depends on how it's encoded by the sender.
The sender has to communicate the encoding with its request — usually via the Content-Type header field.
I won't go into the details of the possible encodings; the referenced introductory material covers them, and you should do your own research on them. But to recap what's written there, here is some perspective.
Back in the 80s and 90s the web was "static", and the era of JavaScript-heavy "web apps" had not yet arrived. "Static" means you could not run any code in the client's browser and had to express any communication with the server in terms of plain HTML.
An HTML document had two ways to make the client rendering it send something back to the server: a) embed a URL which included query parameters, making the client perform a GET request with these parameters; b) embed an HTML "form" which, when "submitted", resulted in a rather more complex POST request carrying the data taken from the filled-in form.
The latter approach was the way to leverage the browser's ability to perform reasonably complex data processing, such as slurping a file selected by the user in a specific form control, encoding it appropriately, and sending it to the server along with the rest of the form's data.
There were two ways to encode the form's data, and they are both covered by the linked article, please read about them.
The crucial thing to understand about this "static web with forms" approach is that it worked like this: the server sends an HTML document containing a web form, the browser renders the document, the user fills the form in and clicks the "submit" button rendered by the browser; the browser collects the data from the form's controls, for entries of type "file" it reads and encodes the contents of those files and finally performs an HTTP POST request with this stuff encoded to the URL specified by the form. The server would typically respond with another HTML document and so on.
OK, so then came "web 2.0", and "XHR" (XMLHttpRequest) was invented. It has "XML" in its name because that was the time when XML was perceived by some as a holy grail which would solve any computing problem (which it, of course, failed to do). The thing was invented to be able to send almost arbitrary data payloads; XML and JSON encodings were supported at least.
The crucial thing to understand is that this way to communicate with the server is completely parallel to the original one, and the only thing they share is that they both use HTTP POST requests.
By now you should see the whole picture: contemporary JS libraries allow you to construct and perform any sort of request: they can create a "web form"-style request, or take a JS object, serialise it to JSON, and send the result in the body of an HTTP POST request.
As you can see, any approach allows you to pass structured data containing multiple distinct pieces of data to the server, and the way to handle this all is a matter of agreement between the server and the client, that is, the API convention, if you want.
The difference between various approaches is that the web-form-style approach would take care of encoding the contents of the file for you, while if you opt to send your file in a JSON object, you'll need to encode it yourself — say, using base64 encoding.
Combined approaches are possible, too.
For instance, you can send the binary data of a file directly as the POST request's body and submit a set of parameters along with the request by encoding them as query parameters of the URL. Again, it's up to the agreement between the client and the server how the former encodes the data to be sent and the latter decodes them.
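To make this concrete, here is a minimal sketch of what the Go side of such a combined request could look like. It is an illustration built on assumptions, not your actual API: the route, the form field names "myFile" and "params", and the parserParams struct are modelled on the question's code.

package main

import (
	"encoding/json"
	"io"
	"log"
	"net/http"
)

// parserParams mirrors the JSON the client sends alongside the file
// (field names assumed from the question's parserParameter object).
type parserParams struct {
	SourcePath  string `json:"SourcePath"`
	ProjectName string `json:"ProjectName"`
}

func uploadHandler(w http.ResponseWriter, r *http.Request) {
	// Parse the multipart body; parts larger than 32 MiB spill to temp files.
	if err := r.ParseMultipartForm(32 << 20); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}

	// The parameters travel as an ordinary form field holding JSON.
	var params parserParams
	if err := json.Unmarshal([]byte(r.FormValue("params")), &params); err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}

	// The file travels as a file field in the very same request.
	file, header, err := r.FormFile("myFile")
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	defer file.Close()

	data, err := io.ReadAll(file)
	if err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	log.Printf("got %q (%d bytes) for project %q", header.Filename, len(data), params.ProjectName)
}

func main() {
	http.HandleFunc("/upload", uploadHandler)
	log.Fatal(http.ListenAndServe(":8080", nil))
}

On the client side, the corresponding change would be to append the parameters to the same FormData before posting, e.g. data.append('params', JSON.stringify(parserParameter)), and to drop the second request (and with it the response-while-locked problem) entirely.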
All in all, I'd recommend taking a pause to educate yourself on the stuff outlined above, and then having another stab at solving the problem, but this time with a reasonably complete understanding of how the machinery works under the hood and how you intend to wield it.

Related

How to get continuous HTTP data?

I'm trying to get live trading data from the Internet via HTTP, but it is updated continuously, so if I GET the data, it keeps downloading as long as there is data available; only after I stop the download stream can I access the data.
How can I access the stream of data while the download is in progress?
I tried Indy's TIdHTTP so that I can use SSL, and I tried assigning a TIdIOHandlerStream, but the IOHandler was already in use by TIdSSLIOHandlerSocketOpenSSL. So I'm absolutely clueless here.
This is in response to a "multipart/form-data" request.
Please guide me...
Lrequest.Values['__RequestVerificationToken'] := RequestVerificationToken;
Lrequest.Values['acct'] := 'demo';
Lrequest.Values['pwd'] := 'demo';
try
  Response.Text := Onhttp.Post('https://trading/data', Lrequest);
  Form1.Memo1.Lines.Add(TimeToStr(Time) + ': ' + Response.Text);
except
  on E: Exception do
    Form1.Memo1.Lines.Add(TimeToStr(Time) + ': ' + E.ClassName +
      ' error raised, with message : ' + E.Message);
end;
UPDATE:
The data is an endless JSON string, like this:
{"id":"data","val":[{"rc":2,"tpc":"\\RealTime\\Global\\SGDIDR.FX","item":[{"val":{"F009":"10454.90","F011":"-33.1"}}]}]}
{"id":"data","val":[{"rc":2,"tpc":"\\RealTime\\Global\\SGDIDR.FX","item":[{"val":{"F009":"10458.80","F011":"-29.2"}}]}]}
and so on, and so on...
You can't use TIdIOHandlerStream to interface with a TCP connection; that is not what it is designed for. It is meant for performing I/O operations using user-provided TStream objects, e.g. for debugging previously captured sessions.
In most cases, TIdHTTP is not really designed to handle endless HTTP responses such as you have described. What is the exact format the server delivers its live data in? What do the HTTP response headers look like? It is really difficult to answer your question without knowing the exact format being used.
However, that being said, there are some cases to consider, depending on what the server is actually sending:
if the server is using a MIME-based server-push format, like multipart/x-mixed-replace, you can enable the hoNoReadMultipartMIME flag in the TIdHTTP.HTTPOptions property, and then read the MIME data yourself from the TIdHTTP.IOHandler after TIdHTTP.Get() exits. For instance, you can use TIdMessageDecoderMIME to help you parse the MIME parts, see New TIdHTTP hoNoReadMultipartMIME flag in Indy's blog, or Delphi Indy TIdHttp and multipart/x-mixed-replace with Text and jpeg image.
Otherwise, if the server is using Transfer-Encoding: chunked, where each data update is sent as a new HTTP chunk, you can use the TIdHTTP.OnChunkReceived event. Or, you can enable the hoNoReadChunked flag in the TIdHTTP.HTTPOptions property, and then read the chunks yourself from the TIdHTTP.IOHandler after TIdHTTP.Get() exits. See New TIdHTTP flags and OnChunkReceived event in Indy's blog.
Otherwise, you could give TIdHTTP.Get() a TIdEventStream to write into, and then use that stream's OnWrite event to access the raw bytes. Or, you could write your own TStream-derived class that overrides the virtual Write() method. Either way, you would be responsible for manually parsing and buffering the raw body data as they are being written to the stream.
Otherwise, you may have to resort to using TIdTCPClient instead, implementing the HTTP protocol manually, then you would be solely responsible for reading in the HTTP response body however you want.
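Whichever of those mechanisms applies, the underlying pattern is the same: read the body incrementally and parse each JSON object as it arrives, instead of waiting for the response to finish. Purely to illustrate that pattern, here is a sketch in Go (not Indy code; the URL is the placeholder from the question):

package main

import (
	"encoding/json"
	"log"
	"net/http"
)

func main() {
	// The endless body is consumed as a stream: control returns to the
	// loop after every decoded object instead of at end-of-response.
	resp, err := http.Get("https://trading/data") // placeholder URL from the question
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	dec := json.NewDecoder(resp.Body)
	for {
		// Each live update is one JSON object; decode them one at a time.
		var update map[string]interface{}
		if err := dec.Decode(&update); err != nil {
			log.Fatal(err) // the stream ended or broke
		}
		log.Printf("live update: %v", update)
	}
}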

How to reuse variables from previous request in the Paw rest client?

I need to reuse value which is generated for my previous request.
For example, in the first request I make a POST to the URL /api/products/{UUID} and get an HTTP response with code 201 (Created) and an empty body.
And in the second request I want to get that product via GET /api/products/{UUID}, where the UUID should come from the first request.
So, the question is how to store that UUID between requests and reuse it?
You can use the Request Sent dynamic values (https://paw.cloud/extensions?extension_type=dynamic_value&q=request+send); these will get the value used the last time you sent a given request.
In your case you will want to combine the URLSentValue with the RegExMatch (https://paw.cloud/extensions/RegExMatch) to first get the URL as it was last sent for a request and then extract the UUID from that URL.
(The original answer illustrated this with screenshots of two example requests, A and B.)
The problem is in your first request's answer. Just don't return an empty body.
If you are talking about a REST design, you would return the UUID in the first response and the client would use it in its second call: GET /api/products/{UUID}
The basic idea behind REST is that the server doesn't store any information about previous requests and is "stateless".
I would also adjust your first query. In general the server should generate the UUID and return it (maybe you have reasons to deviate from that; then please excuse me). Your server has (at least sometimes) a better random generator, and you can avoid conflicts. So you would usually design it like this:
CLIENT: POST /api/products/ -> Server returns: 201 {product_id: UUID(1234...)}
Client: GET /api/products/{UUID} -> Server returns: 200 {product_detail1: ..., product_detail2: ...}
If your client "loses" the information and you want it to be able to retrieve its products later, you would usually implement an API endpoint like this:
Client: GET /api/products/ -> Server returns: 200 [{id: UUID(1234...), title: ...}, {id: UUID(5678...), title: ...}]
Given something like this, presuming the {UUID} is your replacement "variable", it is probably so simple it escaped you. All you need to do is create a text file, say UUID.txt, with sample data, say "12345678U910", as the text in the file.
Then all you need to do is replace the {UUID} in the URL with a dynamic token for a file: delete the {UUID} portion, right-click in the URL line where it was, and select Add Dynamic Value -> File -> File Content. You will get a drag-n-drop reception widget; either press "Choose File..." or drop the file into the receiver widget. Don't worry that the dynamic variable token (the blue thing in the URL) doesn't change yet; click elsewhere to let the drop receiver go away, and you will have exactly what you want: a variable you can use across URLs or anywhere else for that matter (header fields, form fields, body, etc.).
Paw is a great tool that goes asymptotic to awesome when you explore the dynamic value capability. The most powerful one I have found is the regular-expression parsing that can parse a raw reply (HTML included) and capture anything you want for the next request. For example, if your UUID came from some user input, was ingested by the server, and was then returned in an HTML reply, you could capture it from the reply and re-inject it into the URL, any other field, or even the cookies using the Dynamic Value capabilities of Paw.
@chickahoona's answer touches on the more normal way of doing it, with the first request posting to an endpoint without a UUID and the server returning it. With that in place you can use the RegExpMatch extension to extract the value from the server's response and use it in subsequent requests.
Alternatively, if you must generate the UUID on the client side, then again the RegExpMatch extension can help: simply choose the create request's URL as the source and provide a regexp that will strip the UUID off the end of it, such as /([^/]+)$.
A third option I'll throw out to you, put the UUID in an environment variable and just have all of your requests reference it from there.

Why Tomcat returns different headers for HEAD and GET requests to my RESTful API?

My initial purpose was to verify HTTP chunked transfer, but I accidentally found this inconsistency.
The API is designed to return a file to the client. I use the HEAD and GET methods against it, and different headers are returned.
For GET, I get a Transfer-Encoding: chunked header. (This is what I expected.)
For HEAD, I get a Content-Length header instead.
According to this thread, HEAD and GET SHOULD return identical headers, but that is not strictly required.
My question is:
If Transfer-Encoding: chunked is used because the file is dynamically fed to the client and Tomcat server cannot know its size beforehand, how could Tomcat know the Content-Length when HEAD method is used? Does Tomcat just dry-run the handler and count all the file bytes? Why doesn't it simply return the same Transfer-Encoding: chunked header?
Below is my RESTful API implemented with Spring Web MVC:
@RestController
public class ChunkedTransferAPI {

    @Autowired
    ServletContext servletContext;

    @RequestMapping(value = "bootfile.efi", method = { RequestMethod.GET, RequestMethod.HEAD })
    public void doHttpBoot(HttpServletResponse response) {
        String filename = "/bootfile.efi";
        try {
            ServletOutputStream output = response.getOutputStream();
            InputStream input = servletContext.getResourceAsStream(filename);
            BufferedInputStream bufferedInput = new BufferedInputStream(input);
            int datum = bufferedInput.read();
            while (datum != -1) {
                output.write(datum);
                datum = bufferedInput.read();
            }
            output.flush();
            output.close();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}
ADD 1
In my code, I didn't explicitly add any headers, so it must be Tomcat that adds the Content-Length and Transfer-Encoding headers as it sees fit.
So, what are the rules for Tomcat to decide which headers to send?
ADD 2
Maybe it's related to how Tomcat works. I hope someone can shed some light here. Otherwise, I will debug into the source of Tomcat 8 and share the result. But that may take a while.
Related:
HTTP HEAD and GET different result
Content-Length header with HEAD requests?
Does Tomcat just dry-run the handler and count all the file bytes?
Yes, the default implementation of javax.servlet.http.HttpServlet.doHead() does that.
You can look at helper classes NoBodyResponse, NoBodyOutputStream in HttpServlet.java
The DefaultServlet class (the Tomcat servlet used to serve static files) is wiser. It is capable of sending the correct Content-Length value, as well as serving GET requests for a subset of the file (the Range header). You can forward your request to that servlet with:
ServletContext.getNamedDispatcher("default").forward(request, response);
Although it seems strange, it might make sense to send the size only in response to a HEAD request and chunked in response to a GET request, depending on the type of data that has to be returned by the server.
While your API seems to provide a static file, you also talk about dynamically created files or data, so I will answer in general terms here (they apply to web servers in general, too).
First let's have a look at the different usages for GET and HEAD:
With GET the client is requesting the whole file or data (or a range of the data), and wants it as fast as possible. So there is no specific reason for the server to send the size of the data first, especially when it could start sending faster/sooner in chunked mode. So the fastest possible way is preferred here (the client will have the size after the download anyway).
With HEAD on the other hand, the client usually wants some specific information. This could be just a check on existence or "last-changed", but it could also be used if the client wants a certain part of the data (with a range request, including a check to see if range requests are supported for that resource), or just needs to know the size of the data up front for some reason.
Let's look at some possible scenarios:
Static file:
HEAD: there's no reason to not include the size in the response-header because that information is available.
GET: most of the time the size will be included in the header and the data sent in one go, unless there are specific performance reasons to send it in chunks. On the other hand, it seems you are expecting chunked transfer for your file, so this could make sense here.
Live logfile:
Ok, somewhat strange, but possible: downloading a file where the size could change while downloading.
HEAD: again, the client probably wants the size, and the server can easily provide the size of the file at that specific time in the header.
GET: since log lines could be added while downloading, the size is unknown up front. The only option is to send chunked.
Table with fixed-sized records:
Let's imagine a server needs to send back a table with fixed-length records coming from multiple sources/databases:
HEAD: size is probably wanted by the client. The server could quickly do a query for count in each database, and send the calculated size back to the client.
GET: instead of first doing a count query on each database, the server had better start sending the resulting records from each database in chunks.
Dynamically generated zip-files:
Maybe not common, but an interesting example.
Imagine you want to provide dynamically generated zip-files to the user based on some parameters.
Let's first have a look at the structure of a zip-file:
There are two parts: first there's a block for each file: a small header followed by the compressed data for that file. Then there's a list of all the files inside the zip-file (including sizes/positions).
So the prepared blocks for each file could be pre-generated on disk (and the names/sizes stored in some data structure).
HEAD: the client probably wants to know the size here. The server can easily calculate the size of all the needed blocks + the size of the second part with the list of the files inside.
If the client wants to extract a single file, it could directly ask for the last part of the file (with a range request) to get the list, and then ask for that single file with a second request. Although the size is not strictly needed to get the last n bytes, it could be handy if, for example, you wanted to store the different parts in a sparse file with the same size as the full zip file.
GET: no need to do the calculations first (including generating the second part to know its size). It would be better and faster to just start sending each block in chunks.
Fully dynamically generated file:
In this case it wouldn't be very efficient to return the size to a HEAD request of course, since the whole file would need to be generated just to know its size.

How to expose a validation API in a RESTful way?

I'm generally a fan of RESTful API design, but I'm unsure of how to apply REST principles for a validation API.
Suppose we have an API for querying and updating a user's profile info (name, email, username, password). We've deemed that a useful piece of functionality to expose would be validation, e.g. query whether a given username is valid and available.
What are the resource(s) in this case? What HTTP status codes and/or headers should be used?
As a start, I have GET /profile/validate which takes query string params and returns 204 or 400 if valid or invalid. But validate is clearly a verb and not a noun.
The type of thing you've described is certainly more RPC-style in its semantics, but that doesn't mean you can't reach your goals in a RESTful manner.
There's no VALIDATE HTTP verb, so how much value can you get from structuring an entire API around that? Your story centers on providing users with the ability to determine whether a given user name is available; that sounds to me like a simple resource-retrieval check - GET: /profile/username/... - if the result is a 404, the name is available.
What this highlights is that client-side validation is just that: client-side. It's a UI concern to ensure that data is validated on the client before being sent to the server. A RESTful service doesn't give a whit whether or not a client has performed validation; it will simply accept or reject a request based on its own validation logic.
REST isn't an all-encompassing paradigm, it only describes a way of structuring client-server communications.
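To sketch that retrieval-check idea in Go (the route shape and the usernameExists lookup are hypothetical stand-ins, not a prescribed design):

package main

import (
	"log"
	"net/http"
	"strings"
)

// usernameExists is a hypothetical stand-in for a real lookup
// (database query, cache, and so on).
func usernameExists(name string) bool {
	return name == "taken"
}

func main() {
	http.HandleFunc("/profile/username/", func(w http.ResponseWriter, r *http.Request) {
		name := strings.TrimPrefix(r.URL.Path, "/profile/username/")
		if usernameExists(name) {
			w.WriteHeader(http.StatusOK) // the name exists, hence it is not available
			return
		}
		http.NotFound(w, r) // 404: the name is available
	})
	log.Fatal(http.ListenAndServe(":8080", nil))
}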
We have also encountered the same problem. Our reasoning for having the client defer to the server for validation was to prevent having mismatched rules. The server is required to validate everything prior to acting on the resources. It didn't make sense to code these rules twice and have this potential for them to get out of sync. Therefore, we have come up with a strategy that seems to keep with the idea of REST and at the same time allows us to ask the server to perform the validation.
Our first step was to implement a metadata object that can be requested from a metadata service (GET /metadata/user). This metadata object is then used to tell the client how to do basic client-side validations (requiredness, type, length, etc.). We generate most of these from our database.
The second part consists of adding a new resource called an analysis. So for instance, if we have a service:
GET /users/100
We will create a new resource called:
POST /users/100/analysis
The analysis resource contains not only any validation errors that occurred, but also statistical information that might be relevant if needed. One of the issues we have debated was which verb to use for the analysis resource. We have concluded that it should be a POST as the analysis is being created at the time of the request. However, there have been strong arguments for GET as well.
I hope this is helpful to others trying to solve this same issue. Any feedback on this design is appreciated.
You are confusing REST with resource orientation; there's nothing in REST that says you cannot use verbs in URLs. When it comes to URL design I usually choose whatever is most self-descriptive, whether it is a noun or a verb.
As for your service, what I would do is use the same resource you use to update, but with a test query-string parameter, so that when test=1 the operation is not performed, but you can still use it to return validation errors:
PATCH /profile?test=1
Content-Type: application/x-www-form-urlencoded
dob=foo
... and the response:
HTTP/1.1 400 Bad Request
Content-Type: text/html
<ul class="errors">
<li data-name="dob">foo is not a valid date.</li>
</ul>
A very common scenario is having a user or profile signup form with a username and email that should be unique. An error message would usually be displayed on blur of the textbox to let the user know that the username already exists or that the email they entered is already associated with another account. There are a lot of options mentioned in other answers, but I don't like the idea of having to treat a 404 as "the username doesn't exist, therefore it's valid", or of waiting for submit to validate the entire object, and returning metadata for validation doesn't help with checking for uniqueness.
Imo, there should be a GET route that returns true or false per field that needs validation.
/users/validation/username/{username}
and
/users/validation/email/{email}
You can add any other routes with this pattern for any other fields that need server side validation. Of course, you would still want to validate the whole object in your POST.
This pattern also allows for validation when updating a user. If the user focused on the email textbox and then clicked out so the blur validation fires, slightly different validation is necessary, as it's OK if the email already exists as long as it's associated with the current user. You can utilize these GET routes, which also return true or false.
/users/{userId:guid}/validation/username/{username}
and
/users/{userId:guid}/validation/email/{email}
Again, the entire object would need to be validated in your PUT.
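A minimal sketch of those true/false routes in Go (the paths follow this answer; the lookup helpers are hypothetical, and Go 1.22+ is assumed for the method-and-wildcard mux patterns):

package main

import (
	"fmt"
	"log"
	"net/http"
)

// Hypothetical lookups; a real service would query its user store.
func usernameTaken(username string) bool { return username == "taken" }
func emailTaken(email string) bool       { return email == "taken@example.com" }

func main() {
	mux := http.NewServeMux()

	// GET /users/validation/username/{username} -> "true" or "false"
	mux.HandleFunc("GET /users/validation/username/{username}", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "%t", !usernameTaken(r.PathValue("username")))
	})

	// GET /users/validation/email/{email} -> "true" or "false"
	mux.HandleFunc("GET /users/validation/email/{email}", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "%t", !emailTaken(r.PathValue("email")))
	})

	log.Fatal(http.ListenAndServe(":8080", mux))
}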
It is great to have validation in the REST API; you need validation anyway, so why not use it from the client side too? In my case I just have a convention in the API that a special error_id represents validation errors, and error_details contains an array of error messages for each field that has errors in this PUT or POST call. For example:
{
    "error": true,
    "error_id": 20301,
    "error_message": "Validation failed!",
    "error_details": {
        "number": [
            "Number must not be empty"
        ],
        "ean": [
            "Ean must not be empty",
            "Ean is not a valid EAN"
        ]
    }
}
If you use the same REST API for a web and a mobile application, you will like the ability to change validation in both only by updating the API. Especially since mobile updates can take more than 24 hours to get published on the stores.
This is how it looks in the mobile application (the original answer showed a screenshot): the response of the PUT or POST is used to display the error messages for each field. The same call works from a web application using React.
This way, all REST API response codes like 200 and 404 keep the meaning they should have. A PUT call responds with 200 even if validation fails. If the call passes validation, the response looks like this:
{
    "error": false,
    "item": {
        "id": 1,
        "created_at": "2016-08-03 13:58:11",
        "updated_at": "2016-11-30 08:55:58",
        "deleted_at": null,
        "name": "Artikel 1",
        "number": "1273673813",
        "ean": "12345678912222"
    }
}
There are possible modifications you could make. Maybe use it without an error_id: if there are error_details, just loop over them, and if you find a key that has the same name as a field, put its value as the error text for that field.

Process raw HTTP request content

I am doing an e-commerce solution in ASP.NET which uses PayPal's Website Payments Standard service. Together with that I use a service they offer (Payment Data Transfer) that sends you back order information after a user has completed a payment. The final thing I need to do is parse the POST request from them and persist the info in it. The HTTP request's content is in this form:
SUCCESS
first_name=Jane+Doe
last_name=Smith
payment_status=Completed
payer_email=janedoesmith%40hotmail.com
payment_gross=3.99
mc_currency=USD
custom=For+the+purchase+of+the+rare+book+Green+Eggs+%26+Ham
Basically I want to parse this information and do something meaningful with it, like sending it by e-mail or saving it in the DB. My question is about the right approach to parsing raw HTTP data in ASP.NET, not about how the parsing itself is done.
Something like this, placed in your Page_Load event, should do it:
if (Request.RequestType == "POST")
{
    using (StreamReader sr = new StreamReader(Request.InputStream))
    {
        if (sr.ReadLine() == "SUCCESS")
        {
            /* Do your parsing here */
        }
    }
}
Mind you, they might want some special sort of response too (i.e. not your full webpage), so you might do something like this after you're done parsing:
Response.Clear();
Response.ContentType = "text/plain";
Response.Write("Thanks!");
Response.End();
Update: this should be done in a Generic Handler (.ashx) file in order to avoid a great deal of overhead from the page model. Check out this article for more information about .ashx files.
Use an IHttpHandler and avoid the Page model overhead (which you don't need), but use Request.Form to get the values so you don't have to parse name value pairs yourself. Just pretend you're in PHP or Classic ASP (or ASP.NET MVC, for that matter). ;)
I'd strongly recommend saving each request to some file.
This way, you can always go back to the actual contents of it later. You can thank me later, when you find that hostile-endian, koi-8 encoded, [...], whatever it was that stumped your parser...
Well if the incoming data is in a standard form encoded POST format, then using the Request.Form array will give you all the data in a nice to handle manner.
If not then I can't see any way other than using Request.InputStream.
If I'm reading your question right, I think you're looking for the InputStream property on the Request object. Keep in mind that this is a firehose stream, so you can't reset it.
