Is it possible to transfer a file through CoAP?

Recently, I have been working on a project in which I am trying to transfer a JSON file to a CoAP server. I put some random values in key:value pairs, such as:
{
  "key1": "value1",
  "key2": ["value21", "value22", "value23"]
}
Questions:
CoAP is quite similar to HTTP. So, as with HTTP, is it possible to transfer a JSON file over CoAP using the POST/PUT methods? If it is possible, what is the recommended directory location on the server (resource directory) for the uploaded file?
Update:
The actual file size is about 152.8 kB.

You can transfer arbitrary JSON files using CoAP POST/PUT. Which directory would be writable depends entirely on the server.
Note that for a file of that size, transfer times will be considerably longer than with HTTP, because blocks are sent in lock-step (the first 1 kB is sent and acknowledged before the next 1 kB goes out), whereas HTTP benefits from a TCP window.
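For illustration, here is a minimal client sketch using the Python aiocoap library; the server URI and resource path are hypothetical, and aiocoap handles RFC 7959 blockwise transfer transparently when the payload exceeds a single datagram:

import asyncio
import json
import aiocoap

async def upload(payload: bytes):
    ctx = await aiocoap.Context.create_client_context()
    request = aiocoap.Message(
        code=aiocoap.PUT,
        uri="coap://server.example/uploads/data.json",  # hypothetical resource
        payload=payload,
    )
    request.opt.content_format = 50  # application/json
    response = await ctx.request(request).response
    print(response.code)

data = json.dumps({"key1": "value1", "key2": ["value21", "value22", "value23"]})
asyncio.run(upload(data.encode()))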

For a first shot, you may try out eclipse/californium's "simple-fileserver-example".
cf-simple-fileserver
It supports reads (GET) and uses the Block2 option for that.
If you go deeper and leave the laboratory, RFC 7959 blockwise transfers may face several issues.
CoAP usually assumes that endpoints are identified by their IP address (and port). Because a blockwise transfer may last quite a while, that assumption can break. If the client is faced with such an address change, a Block2 transfer (GET) may still work, but a Block1 transfer (PUT) would require special preparation.
Because such a blockwise transfer tends to last longer, it may also get paused by temporary transmission issues. That then requires a "resume or fail" strategy. Here, too, GET is much easier than PUT.
Basic transmission issues on crashes. In my experience, blockwise transfers use many blocks, so many MIDs are consumed in a short period of time. If a client crashes and selects a random MID on startup, the probability of an unnoticed MID clash is rather high. Depending on the CoAP server's deduplication implementation (strict according to RFC 7252, or advanced and aware of this problem), your client may require a strategy to escape the situation where the server retransmits unrelated messages based purely on MIDs. My experience from that time was: "analyse what you get; if it smells, wait for the 247 s :-)". Your client may also save the last used MID to overcome this, or use a special/separate "blockwise endpoint" with deduplication disabled.
Intellectual property. From my point of view, some have noticed the issues left to implementations and started to file patents. That may require attention as well.
All together: if you use blockwise for payloads of sometimes a few kilobytes, my experience is not that bad. But if you regularly transfer more, CoAP may not be the right choice.

Related

Deciding whether to use ws or wss?

I have an account with a trading exchange and they have a websockets API which supports ws://... and wss://...
For the non-authenticated channels such as the current state of the orderbook, is it an easy decision to just use ws, mostly for the (however minimal) time savings? Obviously I would like to have my data as recent as possible.
I just want to check there's not some other factor which is more important than a few TLS encryption CPU cycles and ms in latency saved.
I see absolutely no reason not to use wss for all your connections, particularly with a webSocket. The normal webSocket use is to make a connection, then keep that connection for a long time and use it. While there is a bit of overhead on every transmission because of the encryption, the main wss overhead is when you first make the connection, and that only happens once per connection.
I just want to check there's not some other factor which is more important than a few TLS encryption CPU cycles and ms in latency saved.
No, there is not some other factor. In fact, it's the opposite: there are more and more reasons these days to use TLS whenever possible to protect your privacy.
For the non-authenticated channels such as the current state of the orderbook, is it an easy decision to just use ws, mostly for the (however minimal) time savings?
Why? If wss is available, I'd be using it for everything. If you actually run into a CPU problem down the road, you could revisit whether using wss has anything to do with it, but that is unlikely to happen and, in my opinion, you have nothing to lose by starting with wss. While one wants to design code intelligently, you don't want to try to micro-optimize performance-related things before you even have a documented, measured performance issue to worry about.
General reasons to use TLS:
Privacy (nobody in the middle can snoop on what you're doing)
Security of data (nobody can read your data, not even proxies)
Security of endpoint (endpoint you're connecting to can't be hijacked without you knowing about it)
I just want to check there's not some other factor which is more important than a few TLS encryption CPU cycles and ms in latency saved.
Actually, there is.
This is also stated in the RFC:
At the time of writing of this specification, it should be noted that connections on ports 80 and 443 have significantly different success rates, with connections on port 443 being significantly more likely to succeed, though this may change with time.
Some network intermediaries (especially with some mobile providers) will fail on ws connections but work properly when using wss.
The reason seems to be that these intermediaries (proxies / routers) will attempt to read the WebSocket message as if it were HTTP and "fix" HTTP errors or resolve caching (which actually corrupts WebSocket data).
The encrypted wss protocol will trigger a pass-through mode, since these intermediaries won't be able to read the data or "fix" any HTTP errors.
The WebSocket protocol uses client-side frame masking for the same purpose, but sometimes with limited results. Using wss improves connectivity on some networks.
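As a minimal sketch using the Python websockets library (the exchange endpoint URL is hypothetical): switching from ws:// to wss:// is just a change of URL scheme, and the library negotiates TLS for you.

import asyncio
import websockets

async def watch_orderbook():
    # Connect over TLS; drop the final "s" in the scheme to compare with plain ws.
    async with websockets.connect("wss://exchange.example/orderbook") as ws:
        async for message in ws:
            print(message)

asyncio.run(watch_orderbook())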

Does SPDY/HTTP2 concatenate responses?

I have a question about SPDY/HTTP2:
Normally you concatenate multiple CSS and JS files into one file to save requests and to get a better performance. I heard that SPDY/HTTP2 combines multiple requests into a single response. Would that mean that I don't need to pre-concatenate CSS and JS files anymore, because this is handled by the protocol?
To say it in other words:
Can I use <script src="moduleA.js"></script> and <script src="moduleB.js"></script> with SPDY/HTTP2 in the same way as I would use <script src="allScripts.js"></script> with HTTP1? Is this the same from a response performance point of view, but with the benefit of caching each file on its own, so that I can change moduleB.js and keep moduleA.js cached?
HTTP/2.0 does not (AFAIK) exist yet - it's still a proposed standard. But it seems likely that it will use similar connection handling to SPDY.
SPDY doesn't concatenate them; it multiplexes the requests across the same connection - from the network's point of view the effect is the same.
Yes, you no longer need to merge the content files by hand, and yes, they will be cached independently.
SPDY3 and HTTP2 are multiplexing requests on the same physical connection.
But even multiplexed, requests may be sent sequentially for each resource, causing major slowdowns due to roundtrip time waits.
Both SPDY3 and HTTP2 have a feature called "Resource Push" (also known as "SPDY Push", not to be confused with "Server Push") that allows related resources to be pushed without the client requesting them, and the Jetty project (I am a committer) is, to my knowledge, the only one that implements that feature.
You can watch Resource Push in action in this video: http://webtide.intalio.com/2012/10/spdy-push-demo-from-javaone-2012/.
With Resource Push, you save additional roundtrips to get all the different JS files and still benefit from the browser cache per single file.
The whole point of resource concatenation is exactly to reduce the number of roundtrips necessary to get all the resources needed, and Resource Push helps to solve that problem.
HTTP/2.0 allows for multiplexing, where multiple request/response streams exchange data over the same TCP connection.
Because creating and starting TCP connections is expensive, HTTP/2.0's multiplexing will usually be faster than the semi-parallel downloading of HTTP/1.1, where a limited amount of TCP connections is (re)used by the browser to perform a given amount of requests for resources.
But your mileage may vary. Measure it.
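As an illustration, here is a sketch with the Python httpx library (http2=True requires the http2 extra, and the URLs are hypothetical): both resources are fetched over a single TCP connection rather than one connection each.

import httpx

with httpx.Client(http2=True) as client:
    a = client.get("https://example.com/moduleA.js")
    b = client.get("https://example.com/moduleB.js")
    # Reports "HTTP/2" if the server negotiated it.
    print(a.http_version, b.http_version)

Note the sketch only demonstrates connection reuse; for truly parallel streams you would issue the requests concurrently with httpx.AsyncClient.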
As a side note, you might want to reference all your libraries separately when developing and debugging, but bundle and minify the JS/CSS into one file when you deploy.

Optimizing file synchronization over HTTP

I'm attempting to synchronize a set of files over HTTP.
For the moment, I'm using HTTP PUT, and sending files that have been altered. However, this is very inefficient when synchronizing large files where the delta is very small.
I'd like to do something closer to what rsync does to transmit the deltas, but I'm wondering what the best approach to do this would be.
I know I could use an rsync library on both ends, and wrap their communication over HTTP, but this sounds more like an antipattern; tunneling a standalone protocol over HTTP. I'd like to do something that's more in line with how HTTP works, and not wrap binary data (except my files, duh) in an HTTP request/response.
I've also failed to find any relevant/useful functionality already implemented in WebDAV.
I have total control over the client and server implementation, since this is a desktop-ish application (meaning "I don't need to worry about browser compatibility").
The HTTP PATCH recommended in a comment requires the client to keep track of local changes. You may not be able to do that due to the size of the file.
Alternatively you could treat "chunks" of the huge file as resources: depending on the nature of the changes and the content of the file it could be by bytes, chapters, whatever.
The client could query the hash of all chunks, calculate the same for the local version, and PUT only the changed ones.
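A rough sketch of that idea in Python, assuming a hypothetical server API that exposes its per-chunk hashes and accepts PUTs of individual chunks as sub-resources:

import hashlib
import requests

CHUNK = 1 << 20  # 1 MiB per chunk

def chunk_hashes(path):
    # Hash the local file chunk by chunk.
    with open(path, "rb") as f:
        while block := f.read(CHUNK):
            yield hashlib.sha256(block).hexdigest()

base = "http://server.example/files/huge.bin"  # hypothetical endpoint
remote = requests.get(base + "/chunks").json()  # e.g. ["ab12...", "cd34...", ...]

with open("huge.bin", "rb") as f:
    for i, local in enumerate(chunk_hashes("huge.bin")):
        if i >= len(remote) or remote[i] != local:
            f.seek(i * CHUNK)
            requests.put(f"{base}/chunks/{i}", data=f.read(CHUNK))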

Understanding REST: REST as a high volume transport?

I'm designing a system that will need to move multi-GB backup images over TCP, and I'm looking at REST as an alternative to ONC RPC.
For example, I might have
POST http://site/backups/image1
where image1 is a 50 GB file whose data is contained in the HTTP body.
My question: is this within the scope of what REST is meant for? Is it inappropriate to move massive files over HTTP? My preliminary testing shows that the performance isn't too bad, and I like the clean, debuggable protocol, as opposed to a custom ONC RPC server. But is this overloading the role of a webserver?
Thanks,
-Steve
HTTP has about the same overheads as FTP.
An HTTP server is often asked to do more work than an FTP server. But otherwise, using HTTP to send a large file is about the same as using FTP.
The only consideration is making sure your web server and web application framework are configured to handle this kind of thing without needlessly expanding the entire 50 GB file inside Apache.
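On the client side, the same concern applies; for instance, with the Python requests library, passing an open file object streams the body from disk so the 50 GB image never has to be held in memory at once (URL taken from the question):

import requests

with open("image1", "rb") as f:
    r = requests.post("http://site/backups/image1", data=f)
print(r.status_code)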
Steve,
HTTP has a look-before-you-leap 'feature' that allows the client to ask the server whether it will accept the data submission before it actually sends out the data. I'd look into using this to avoid transferring GBs of data only to find out that the server is currently not willing to handle them. Look at the HTTP Expect header and the 100 Continue status code.
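One way to get this behavior without hand-rolling it is libcurl via the Python pycurl binding, which adds "Expect: 100-continue" to large uploads and waits for the server's interim response before sending the body (URL as in the question; note that the UPLOAD option issues a PUT rather than a POST):

import os
import pycurl

c = pycurl.Curl()
c.setopt(pycurl.URL, "http://site/backups/image1")
c.setopt(pycurl.UPLOAD, 1)
with open("image1", "rb") as f:
    c.setopt(pycurl.READDATA, f)
    c.setopt(pycurl.INFILESIZE_LARGE, os.path.getsize("image1"))
    c.perform()
c.close()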
Also, you can use FTP within a RESTful approach, IOW, think along the lines of
<backup-store href="ftp://example.org/site/backup/images/"/>
and make your clients understand the ftp URI scheme.
Finally, the T in HTTP means transfer and not transport - an important distinction to make, because the former is an application semantic (HTTP is an application protocol) and the latter is not.
HTH,
Jan
REST has nothing to do with how large your data is or which method you use to transport it.

Why can HTTP handle only one pending request per socket?

Being curious, I wonder why HTTP, by design, can only handle one pending request per socket.
I understand that this limitation is because there is no 'Id' to associate a request to its response, so the only way to match a response with its request is to send the response on the same socket that sent the request. There would be no way to match a response to its request if there was more than one pending request on the socket because we may not receive the responses in the same order requests were sent.
If the protocol had been designed to have a matching 'Id' for requests and responses, there could be multiple pending requests on only one socket. This could greatly reduce the number of sockets used by internet browsers and applications using web services.
Was HTTP designed like this for simplicity even if it's less efficient or am I missing something and this is the best approach?
Thanks.
Not true. Read about HTTP/1.1 pipelining. Apache and Firefox both implement it, although Firefox disables it by default.
To turn it on in Firefox, use about:config and type 'pipelining' in the filter.
see: http://www.mozilla.org/projects/netlib/http/pipelining-faq.html
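To make the pipelining idea concrete, here is a raw-socket sketch in Python (example.com stands in for any pipelining-capable server): two requests go out back-to-back, and because HTTP carries no request ID, the responses must come back concatenated in request order on the same socket.

import socket

sock = socket.create_connection(("example.com", 80))
# Send both requests without waiting for the first response.
for path in ("/a.html", "/b.html"):
    sock.sendall(f"GET {path} HTTP/1.1\r\nHost: example.com\r\n\r\n".encode())

# Both responses arrive on the same socket, in the order requested.
data = b""
for _ in range(16):
    chunk = sock.recv(4096)
    if not chunk:
        break
    data += chunk
sock.close()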
It's basically for simplicity; various proposals have been made over the years that multiplex on the same connection (e.g. SPDY) but none have taken off yet.
One problem with sending multiple requests on a single socket is that it would cause inefficient queuing.
For instance, let's say you are in a store and there are 2 cashiers, and 10 people waiting to be checked out. The ideal way to run the line is to have a single queue of 10 people, where the next person in line goes to a cashier when one becomes available. However, if you sent all the requests at once, you would probably send 5 people to cashier A and 5 to cashier B. And what if you sent the 5 people with the largest shopping carts to the same cashier? That's bad queuing, and it is what could happen if you queued a bunch of requests on a single socket.
NOTE: I'm not saying that you couldn't use queuing well, but it keeps it simple to do it right if there is no queuing on a single socket.
There are a few considerations I would review.
The first is related to the nature of TCP itself. TCP suffers from the 'head-of-line blocking' issue, where there can only be a single outstanding (unacknowledged) request in flight (at the connection/TCP level). Given traditional latencies, this can be a problem for load-time user experience compared to the parallel-connection scheme browsers employ today. The higher the latency of the link, the larger the impact of this fundamental limitation.
There is also a concurrency issue, in that sometimes you really want to load multiple resources incrementally / in parallel. Back in the day, one of the greatest features Mozilla had over Mosaic was that it would load images and objects incrementally, so you could begin to see what was going on and use a resource without having to wait for it to load. With fewer connections, there is a risk that, for example, loading a large image on a page before a style sheet could be catastrophic from an experience point of view. Expecting some kind of mitigating intelligence or explicit configuration to optimally order requests may not be a realistic or ideal solution.
There are proposals such as HTTP over SCTP that will more or less totally correct the issue you raise at the transport level.
Also realize that HTTP doesn't necessarily mandate a Content-Length header to serve data. Even if each HTTP response were ID'd, how would you manage streaming binary content with no content length (HTTP/1.0 style)? Or cases where the Connection: close header is used to delimit a body of unknown length?
To manage this, you would have to combine HTTP chunking (already present) with multiplexing (which I don't think anyone implements), and add some non-trivial work to many programs.