Is there a public HTTP 1.1 reference implementation? - http

While learning more about HTTP 1.1 and reading the spec, it strikes me that it could be helpful to have a public reference implementation which can demonstrate the protocol. I imagine it would provide ideal, basic examples, as well as working examples of those parts of the protocol which are often disabled on public servers (e.g., TRACE).
I'm talking about a running, publicly accessible server(s). The idea would be to show how HTTP (should) works via an actual running webserver(s) (and the source). A user could build arbitrary requests using fiddler or the like, to see how the server responds. I'm assuming it would be open source. It would likely be based on an existing webserver implmentation (e.g., Apache), perhaps with extensions to support the entire protocol where the existing impl. doesn't (Transfer-Encoding compression, etc.). I know this last part is a pipe dream, I'm just putting it here by way of explanation.
I understand that HTTP is a very broad protocol, so that a reference implementation would not be comprehensive. I can imagine many, many reasons why something like this would not exist, and I know I can start up my own local server and play around with it (I've done that sort of thing for years). I know I can poke around against well-known existing public servers (Google, etc.). But, I'm wondering if anything like a public reference implementation exists.

As an IETF spec, HTTP/1.1 does not have a reference implementation. Instead,
"at least two independent interoperating implementations
with widespread deployment and successful operational experience"
were required.
From the Implementation report for HTTP/1.1 to Draft Standard, you can see there were substantially more than that:
We have implementation & testing reports from 26 implementations
You say:
I can imagine many, many reasons why something like this would not exist
Here's one: for a reasonably complex specification, you don't want people designing to a specific implementation. Any "reference" implementation would have bugs, which would then be picked up by subsequent code built against that reference.
The specification is authoritative; in the case that an implementation diverges, you should consult the specification (and its errata) for the correct behaviour.
I know I can poke around against well-known existing public servers
Exactly. Per The Tao of IETF:
"We believe in rough consensus and running code"

Related

appropriate user-agent header value

I'm using HttpBuilder (a Groovy HTTP library built on top of apache's httpclient) to sent requests to the last.fm API. The docs for this API say you should set the user-agent header to "something appropriate" in order to reduce your chances of getting blocked.
Any idea what kind of values would be deemed appropriate?
The name of your application including a version number?
I work for Last.fm. "Appropriate" means something which will identify your app in a helpful way to us when we're looking at our logs. Examples of when we use this information:
investigating bugs or odd behaviour; for example if you've found an edge case we don't handle, or are accidentally causing unusual load on a system
investigating behaviour that we think is inappropriate; we might want to get in touch to help your application work better with our services
we might use this information to judge which API methods are used, how often, and by whom, in order to do capacity planning or to get general statistics on the API eco-system.
A helpful (appropriate) User-Agent:
tells us the name and version of your application (preferably something unique and easy to find on Google!)
tells us the specific version of your application
might also contain a URL at which to find out more, e.g. your application's homepage
Examples of unhelpful (inappropriate) User-Agents:
the same as any of the popular web browsers
the default user-agent for your HTTP Client library (e.g. curl/7.10.6 or PEAR HTTP_Request)
We're aware that it's not possible to change the User-Agent sent when your application is browser-based (e.g. Javascript or Flash) and don't expect you to do so. (That shouldn't be a problem in your case.)
If you're using a 3rd party Last.fm API library, such as one of the ones listed at http://www.last.fm/api/downloads , then we would prefer it if you added extra information to the User-Agent to identify your application, but left the library name and version in there as well. This is immensely useful when tracking down bugs (in either our service or in the client libraries).

HTTPbis - what does bis mean?

I've often seen "bis" appended to versions of protocols (eg v.34bis or httpbis).
What does "bis" mean or stand for?
A telecom engineer I know thinks it might be French in origin.
As others have already said, "bis" comes from "twice" or "repeat". It's used to indicate a second variant of something (although usually with only minor variations that don't warrant a new name).
In the context of HTTP, HTTPbis is the name of the working group in charge of refining HTTP. According to its charter:
HTTP is one of the most successful and widely-used protocols on the
Internet today. However, its specification has several editorial
issues. Additionally, after years of implementation and extension,
several ambiguities have become evident, impairing interoperability
and the ability to easily implement and use HTTP.
The working group will refine RFC2616 to:
Incorporate errata and updates (e.g., references, IANA registries, ABNF)
Fix editorial problems which have led to misunderstandings of the specification
Clarify conformance requirements
Remove known ambiguities where they affect interoperability
Clarify existing methods of extensibility
Remove or deprecate those features that are not widely implemented and also unduly affect interoperability
Where necessary, add implementation advice
Document the security properties of HTTP and its associated mechanisms (e.g., Basic and Digest authentication, cookies, TLS) for common applications
It will also incorporate the generic authentication framework from RFC
2617, without obsoleting or updating that specification's definition
of the Basic and Digest schemes.
Finally, it will incorporate relevant portions of RFC 2817 (in
particular, the CONNECT method and advice on the use of Upgrade), so
that that specification can be moved to Historic status.
In doing so, it should consider:
Implementer experience
Demonstrated use of HTTP
Impact on existing implementations and deployments
The Working Group must not introduce a new version of HTTP and should
not add new functionality to HTTP. The WG is not tasked with producing
new methods, headers, or extension mechanisms, but may introduce new
protocol elements if necessary as part of revising existing
functionality which has proven to be problematic.
The last paragraph (emphasis mine) explains why they've used "bis" in this context, since they explicitly don't want a new version.
bis
The word (also used as a prefix or suffix) bis , applied to some modem protocol standards, is Old Latin for "repeat" (akin to Old High German "twice"). When a protocol ends with "bis," it means that it's the second version of that protocol.
Similarly, ter is from Old Latin meaning "three times." The suffix terbo in the V.xx modem protocol is an invented word based on the Old Latin ter and the word turbo (Latin for "whirling top" or "whirlwind") meaning "speed." V.32terbo is the third version developed of the V.32 modem protocol..
(from http://whatis.techtarget.com/definition/0,,sid9_gci211669,00.html)

Why aren't the rest of the HTTP verbs used?

Most of the times, websites mainly only use GET and POST for all operations, yet there are seven more verbs out there. Where they used in older times but not so much now?
Or maybe is it because some browsers don't recognize the other verbs? And if that's the case, why do browser vendors choose to implement half of the protocol?
[Update]
I found this article which gives a good summary about the situation: Why REST failed.
The HTML spec is a big culprit, only really allowing GET, POST and HEAD. They are getting used quite a bit though, but not as much directly in browsers.
The most common uses of the other crud-verbs such as PUT and DELETE are in REST services and WebDAV.
You'll see OPTIONS more in the future, as it's used by the CORS specification (cross domain xmlhttprequest).
TRACE is pretty much disabled everywhere, as it poses a pretty big security risk. CONNECT is definitely used quite a bit by proxies.
PATCH is brand new. While it's odd to me that they decided to add it to the list (but not PROPFIND, MKCOL, ACL, LOCK, and so on), I do think we'll see it appear more in the future in RESTful services.
Addendum: The original browser used both GET and PUT (the latter for updating web pages). Later browsers pretty much became read-only until forms and the POST request made their way into the specifications.
Most of them are still used, though not as widely as GET or POST. For example RESTful web services use PUT & DELETE as well as GET & POST:
RESTful Web Service - Wiki Article
HEAD is very useful for server debugging of the HTTP headers, but as it doesn't return the response body, it's not much use to the browser / average web visitor...
Other verbs like TRACE aren't as widespread, because of potential security concerns etc. Mentioned briefly in the Wiki article:
HTTP Protocol Methods - Wiki Article
A decade later, these other verbs are used very commonly in RESTful APIs, which back nearly all of today's ubiquitous SPA applications and many mobile applications.
Though, interest in REST as an API structure is beginning to wane with the advent of GraphQL and growing interest in functional programming styles which benefit from RPC-style API structures.

Best Publish/Subscribe "Middleware" [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking us to recommend or find a tool, library or favorite off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it.
Closed 9 years ago.
Improve this question
I'm in the market for a good open source network based Pub/Sub (observer pattern) library. I haven't found any I like:
JMS - tied to Java, treats message contents as dumb binary blobs
NDDS - $$, use of IDL
CORBA/ICE - Pub/Sub is built on-top of RPC, CORBA API is non-intuitive
JBOSS/ESB - not too familiar with
It would be nice if such a package could to the following:
Network based
Aware of payload data, users should not have to worry about endian/serialization issues
Multiple language support (C++, ruby, Java, python would be nice)
No auto-generated code (no IDLs!)
Intuitive subscription/topic management
For fun, I've created my own. Thoughts?
You might want to look into RabbitMQ.
As pointed-out by an earlier post in this thread, one of your options is OpenSplice DDS which is an Open Source implementation of the OMG DDS Standard (the same standard implemented by NDDS).
The main advantages of OpenSplice DDS over the other middleware you are considering can be summarized as:
Performance
Rich support for QoS (Persistence, Fault-Tolerance, Timeliness, etc.)
Data Centricity (e.g. possibility of querying and filtering data streams)
Something that I'd like to understand is what are your issues with IDL. DDS uses IDL as language-independent way of specifying user data types. However DDS is not limited to IDL, you could be using XML, if you prefer. The advantage of specifying your data types, and decoupling their representation from a specific programming language, is that the middleware can:
(1) take away from you the burden of serializing data,
(2) generate very time/space efficient serialization,
(3) ensure end-to-end type safety,
(4) allow content filtering on the whole data type (not just the header like in JMS), and
(5) enable on-the wire interoperability across programming languages (e.g. Java, C/C++, C#, etc.)
Depending on the system or application you are designing, some of the properties above might not be useful/relevant. In that case, you can simply generate one, a few, "DDS Type" which is the holder of you serialized data.
If you think about JMS, it provides you with 5 different topic types you can use to send your data. With DDS you can do the same, but you have the flexibility to define exactly the topic types.
Finally, you might want to check out this blog entry on Scala and DDS for a longer discussion on why types and static-typing are good especially in distributed systems.
-AC
We use the RTI DDS implementation. It costs $$, but it supports many quality of service parameters.
There is a free DDS implementation called OpenDDS, but I've not used it.
I don't see how you can get around the need to predefine your data types if the target language is statically typed.
Look a bit deeper into the various JMS implementations.
Most of them are not Java only, they provide client libraries for other languages too.
Suns OpenMQ have at least a C++ interface, Apache ActiveMQ provides client side libraries for many common languages.
When it comes to message formats, they're usually decoupled from the message middleware itself. You could define your own message format. You could define your own XML schema and send XML messages. You could send BER encoded ASN.1 using some 3. party library if you want.
Or format and parse the data with a JSON library.
You might be interested in the MUSCLE library (disclaimer: I wrote it, so I may be biased). I think it meets all of the criteria you specified.
https://public.msli.com/lcs/muscle/
Three I've used:
IBM MQ Series - Too Expensive, hard to work with.
Tico Rendezvous - (renamed now to EMS?) Was very fast, used UDP, could also be used with no central server. My favorite but expensive and requires a maint fee.
ActiveMQ - I'm using this currently but finding it crashes frequently. Also it's requires some projects ported from Java like spring.net. It works but I can't recommend it due to stability issues.
Also used MSMQ in an attempt to build my own Pub/Sub, but since it doesn't handle it out of the box your stuck writing a considerable amount of code.
There is also OpenSplice DDS. This one is similar to RTI's DDS, except that it's LGPL!
Check it out:
IBM Webpshere MQ, and the licence is not too expnsive if you work on a corporate level.
You might take a look at PubSubHubbub. It's a extension to Atom/RSS to alow pubsub through webhooks. The interface is HTTP and XML, so it's language-agnostic. It's gaining increasing adoption now that Google Reader, FriendFeed and FeedBurner are using it. The main use case is blogs and stuff, but of course you can have any sort of payload.
The only open source implementation I know of so far is this one for the Google AppEngine. They say support for self-hosting is coming.

What exactly does REST mean? What is it, and why is it getting big now?

I understand (I think) the basic idea behind RESTful-ness. Use HTTP methods semantically - GET gets, PUT puts, DELETE deletes, etc... Right? thought I understood the idea behind REST, but I think I'm confusing that with the details of an HTTP implementation. What is the driving idea behind rest, why is this becoming an important thing? Have people actually been using it for a long time, in a corner of the internets that my flashlight never shined upon?
The Google talk mentions Atom Publishing Protocols having a lot of synergy with RESTful implementations. Any thoughts on that?
This is what REST might look like:
POST /user
fname=John&lname=Doe&age=25
The server responds:
201 Created
Location: /user/123
In the future, you can then retrieve the user information:
GET /user/123
The server responds (assuming an XML response):
200 OK
<user><fname>John</fname><lname>Doe</lname><age>25</age></user>
To update:
PUT /user/123
fname=Johnny
Here's my view...
The attraction to making RESTful services is that rather than creating web-services with dozens of functional methods, we standardize on four methods (Create,Retrieve, Update, Destroy):
POST
GET
PUT
DELETE
REST is becoming popular because it also represents a standardization of messaging formats at the application layer. While HTTP uses the four basic verbs of REST, the common HTTP message format of HTML isn't a contract for building applications.
The best explanation I've heard is a comparison of TCP/IP to RSS.
Ethernet represents a standardization on the physical network. The Internet Protocol (IP) represents a standardization higher up the stack, and has several different flavors (TCP, UDP, etc). The introduction of the "Transmission Control Protocol" (guaranteed packet delivery) defined communication contracts that opened us up to a whole new set of services (FTP, Gopher, Telnet, HTTP) for the application layer.
In the analogy, we've adopted XML as the "Protocol", we are now beginning to standardize message formats. RSS is quickly becoming the basis for many RESTful services. Google's GData API is a RSS/ATOM variant.
The "desktop gadget" is a great realization of this hype: a simple client can consume basic web-content or complex-mashups using a common API and messaging standard.
HTTP currently is under-used and mis-used.
We usually use only two methods of HTTP: GET and POST, but there are some more: DELETE, PUT, etc (http://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html)
So if we have resources, defined by RESTful URLs (each domain object in your application has unique URL in form of http://yoursite.com/path/to/the/resource) and decent HTTP implementation, we can manipulate objects in your domain by writing sentences:
GET http://yoursite.com/path/to/the/resource
DELETE http://yoursite.com/path/to/the/resource
POST http://yoursite.com/path/to/the/resource
etc
the architecture is nice and everything.
but this is just theoretical view, real world scenarios are described in all the links posted in answers before mine.
Lets go to history, Talk about the Roy Fielding Research – “Architectural Styles and the Design of Network-based Software Architectures“. Its a big paper and talks a lot of various stuff. But as a standard engineer How you would like to explain the clear meaning of REST (Representational State Transfer), and what is its Architectural Style.
Here is my way to explain – “What is REST”.
See this www(world wide web) running on top of various hardwares e.g. routers,servers,firewalls, cloud infrastructures,switches,LAN,WAN. The overall objective of this www(world wide web) to distribute hypermedia. This world wide web equipped with various services e.g. informational based services, websites, youtube channels, dynamic websites, static websites. This world wide web uses HTTP protocol to distribute hypermedia across the world with a client/server mechanism. This HTTP Protocol works on top of TCP/IP or other appropriate network stack.
This HTTP protocol is using eight methods to manage the ‘protocol of distribution’ or ‘Architectural Style of Distribution’. Those eight methods are namely : OPTIONS,GET,HEAD,POST,PUT,DELETE,TRACE,CONNECT.
But on Top of this HTTP, web applications are using its own way of distributing hypermedia e.g web applications are using web services which are highly tied with clients and servers ‘or’ web applications are using its own way of designed client/server mechanism to make such distribution channel on top of HTTP.
What Roy Fielding Research says , that these eight methods OPTIONS,GET,HEAD,POST,PUT,DELETE,TRACE,CONNECT of HTTP are so successful to deliver HyperMedia to all across the world on top of variety of hardware resources and network stacks with client/server mechanism, Why don’t we use the similar strategy with our web based application as well. On this GET,POST,DELETE and PUT are used the most. so four methods deliver HyperMedia to all across the world.
In REST API Architecture Style application, a web application need to design the business logic(resides in a server e.g. Tomcat,Apache HTTP) with all set of object entities(e.g. Customer is an entity) and possible operations(e.g.‘Retrieve Customer Information based on a customer id’) on them. Those possible operations with these entities should be designed with four main operations or methods namely- Create,Retrieve,Update,Delete. These entities called as resources and these are presented or represented in a form e.g. JSON or XML or something else. We have Client(Browsers) who calls Create,Retrieve,Update,Delete (CRUD) methods to perform the appropriate function on such resource resides in the Server.
But as explained the concept of Representation, means the way entities of business logic or objects are represented. but what about with ‘State Transfer’ ?.
The State Transfer, its talks about the “state of communication” from Client to Server. It talks about the design of ‘state transfers’ from Client to Server e.g. Client first called the operation ‘Create Customer’, after calling this what would be next state of customer or states of customer which ‘client’ can call. Its state may be to ‘retrieve the created client data’, ‘update the client data’ or what
REST is an architecture where resources are defined and addressed.
To understand REST best, you should look at the Resource Oriented Architecture (ROA) which gives a set of guidelines for when actually implementing the REST architecture.
REST does not need to be over HTTP, but it is the most common. REST was first created by one of the creators of HTTP though.

Resources