Best general-purpose message format for SOA?

If I have a bunch of servers (eventually groups of servers), each being a different service (SOA), and I want them to be able to:
- Send requests and receive responses via TCP over a high-throughput, low-latency, unmetered network.
- Use a common message format that:
  - Is fast to encode and decode/parse
  - Supports lists and binary strings
  - Won't necessarily require updating all services at once (e.g. adding a field should not prevent outdated services from reading the message and picking out all of the fields they are expecting)
Which format would you guys recommend? I'm currently looking into encoding messages as BSON, but would like to hear some suggestions.
Thanks :-)

Thanks to @Radu's comment for this answer.
MessagePack, Google's Protocol Buffers, and Thrift are all great options. More info here.
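To illustrate the forward-compatibility requirement from the question (outdated services ignoring fields they don't know about), here is a minimal Python sketch of a tag-length-value encoding. It is not MessagePack, Protocol Buffers, or Thrift; the tag numbers and function names are made up for illustration. Protocol Buffers and Thrift use numbered fields in a broadly similar way, so a reader can skip fields it doesn't recognise.

```python
import struct

# Hypothetical field tags, made up for this sketch.
TAG_USER_ID = 1
TAG_PAYLOAD = 2
TAG_TRACE_ID = 3  # imagine this one was added in a newer version of the message


def encode(fields):
    """Encode {tag: bytes} as repeated (tag, length, value) records."""
    out = bytearray()
    for tag, value in fields.items():
        out += struct.pack("!BI", tag, len(value))  # 1-byte tag, 4-byte length
        out += value
    return bytes(out)


def decode(data, known_tags):
    """Decode records, silently skipping tags the reader does not know."""
    fields = {}
    offset = 0
    while offset < len(data):
        tag, length = struct.unpack_from("!BI", data, offset)
        offset += 5  # size of the (tag, length) header
        value = data[offset:offset + length]
        offset += length
        if tag in known_tags:
            fields[tag] = value
    return fields


# A "new" sender includes TAG_TRACE_ID; an "old" reader only knows two tags.
new_message = encode({TAG_USER_ID: b"42", TAG_PAYLOAD: b"\x00\xff", TAG_TRACE_ID: b"abc"})
old_view = decode(new_message, known_tags={TAG_USER_ID, TAG_PAYLOAD})
```

Because every record carries its own length, the old reader can step over the unknown field and still pick out the fields it expects, including the binary payload.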

Related

Difference between XMPP and RSS from real-time perspective

I have a general question regarding RSS and XMPP. I want to know what exactly makes XMPP a real-time alternative to RSS.
Right now, I'm assuming a one-way message stream (from server to client ONLY).
What I'm confused about is this: let's say I program an RSS reader on my client side and make it scan for new feeds at VERY short intervals. Will that not make my system real-time-ish? Or is there any negative impact in doing that which XMPP resolves (apart from the security features)?
I ask because there are some systems which use a combination of the two to form a real-time feed system (e.g. Superfeedr).
So, if someone can briefly explain why someone would implement XMPP over RSS when designing a real-time notification system, I'd highly appreciate that.
Sorry for the long post, I have recently begun with these two technologies and I'm very curious about their functionalities. I tried looking up on the internet but the answers were either too brief or insufficient.
These are very different things, but complementary.
RSS is just a data format. It does not involve any "latency" by itself. It's just the way it's being consumed which determines if it can be realtime or not.
XMPP is a communication protocol. It's connected and in that regard can be considered as "real-time". RSS itself (or rather Atom), as it's XML, can be transported quite easily over XMPP. That's one of the ways you can make RSS realtime.
And RSS can be realtime if it's served through the PubSubHubbub protocol (isn't that right, Julien? :-)
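To make the polling trade-off from the question concrete, here is a rough back-of-the-envelope sketch in Python. The poll interval and update rate are made-up assumptions, not measurements from any real feed.

```python
def polling_stats(poll_interval_s, updates_per_day):
    """Rough cost of polling a feed: requests sent, and how many find nothing new."""
    seconds_per_day = 24 * 60 * 60
    requests_per_day = seconds_per_day // poll_interval_s
    # At best, each update is picked up by one request; the rest are wasted.
    wasted = max(requests_per_day - updates_per_day, 0)
    # On average, an update waits half a poll interval before it is seen.
    avg_delay_s = poll_interval_s / 2
    return requests_per_day, wasted, avg_delay_s


# Assumed numbers: poll every 5 seconds, feed changes 20 times a day.
requests, wasted, delay = polling_stats(poll_interval_s=5, updates_per_day=20)
```

Under these assumptions almost every request returns nothing new, and that cost is multiplied by the number of clients. With a connected transport such as XMPP, or a push protocol such as PubSubHubbub, the server sends roughly one message per update instead.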

What is the best method to send data from a device to a server

I am currently developing a website for an energy-monitoring company. We are trying to send high volumes of data from the devices which record the data to a server so it can be processed into a database. The guy developing the firmware seems to think that the best way to send the data is to produce CSV files and send them via FTP. A program on the server needs to monitor the files received via FTP and run a PHP script to process them. I, however, feel that the best way of sending the data is via HTTP POST.
We had HTTP POST working, and then I began trying to work with the CSVs, which became a pain: reliably monitoring the files received via FTP meant editing the ProFTPD configuration file (which I found to be a near-impossible task) and installing a package called mod_exec (which comes with security risks) so that ProFTPD could run a PHP script. These issues, and the fact that I am unfamiliar with the Linux console, which I am required to use extensively to set this up, make the CSV method very difficult to set up. HTTP POST seems to me like a more direct way of sending the data, without having to worry about files or relying on ProFTPD. It would also allow us to use identifiers to give the data being passed meaning, as opposed to a string of values whose meaning is not immediately apparent. In addition, the query string could be URL-encoded to pass a multidimensional array, which would work well given the type of data being passed.
Nevertheless, just because the HTTP POST method would be easier doesn't mean that the CSV method doesn't have advantages. Furthermore, the firmware guy has far more experience than me with computers so I trust his opinion.
Can you please help me to understand his point of view on the advantages of the CSV method and explain what the best method is?
You're right. FTP has major issues with firewalls, and especially doesn't work well on mobile (NAT'ted) IPv4. HTTP POST works far, far better under such circumstances, if only because nobody accepts an "internet" connection that breaks HTTP.
Furthermore, HTTP is a lot easier on the device as well. It's just a single-socket protocol, with trivial read/write semantics on that socket.
Some more benefits? HTTP has almost-native support for compression (gzip). HTTP transmission can start before the input is complete. HTTP is easier to secure (HTTPS)...
No, there really is little reason to use FTP.
The 'CSV method' (I'd call it the 'FTP method' though) has the advantage of being known to the embedded developer. The receiving side will have to create some way of checking if there is a file though. That adds complexity.
The 'HTTP method' has several advantages:
- HTTP is easy to implement on the sending side
- No need to create a file-checker
- You can reply to the embedded device if everything went OK
I actually just implemented a system just like that (not too much data, but still) and use HTTP POST to send the data. I implemented the HTTP POST myself.
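As a rough sketch of what the 'HTTP method' looks like on the sending side, here is a Python example that builds a form-encoded POST with named fields. The URL, the device ID, and the field names (`device_id`, `reading[]`) are assumptions made up for illustration; real firmware would more likely be written in C, but the request itself would look the same.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint, for illustration only.
URL = "http://example.com/readings"


def build_request(device_id, readings):
    """Build (but do not send) a form-encoded POST request with named fields."""
    # Repeating "reading[]" is the PHP convention for receiving an array.
    form = [("device_id", device_id)] + [("reading[]", str(r)) for r in readings]
    body = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(
        URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


req = build_request("meter-7", [230.1, 229.8])
# To actually send it and read the server's reply:
#     with urllib.request.urlopen(req, timeout=10) as resp:
#         ok = resp.status == 200
```

The response status gives the device a direct way to learn whether the server accepted the data, which the FTP approach lacks.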

How to author an Internet protocol?

We're all familiar with popular protocols like IMAP and POP, used for email messaging.
I have a plan for a new protocol, but I'm not sure how to go about implementing it.
Is the protocol a collection of C source code, for example, that accepts and sends data through ports? Or is a protocol just a thorough description of how data should be sent, which clients then implement?
I'm lost where to start here, and I'm not very familiar with how the protocol system works.
Edit:
Also, if I write a protocol and it isn't made official by the standards group, can people/clients still implement it?
The official way is to write an RFC - a Request for Comments. People will respond to that (that's why it's an RFC) and probably try to implement your protocol.
Having at least two independent, interoperable implementations is one of the requirements for a specification to advance on the IETF standards track; it does not become a standard automatically.
Of course, people aren't going to implement a new protocol for someone just for fun. So you should first find a group who is interested in listening to you. Maybe there already is a protocol which does what you want (or can easily be extended).
But you probably don't want to invent a new standard. Standards are a lot of work and - for some - overrated.
So you should describe how it works and create a library that can read and write the protocol, so developers can use it even though it's not an official standard.
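As a minimal sketch of "describe how it works and create a library", here is a toy protocol description (in the comments) together with a small Python reader and writer for it. The protocol, the verbs, and the function names are all made up for illustration.

```python
# Toy protocol, made up for illustration:
#   - Each message is a single line: VERB, one space, an argument, then CRLF.
#   - VERB consists of uppercase ASCII letters only.
#   - The argument is UTF-8 text that contains no CR or LF.


def write_message(verb, argument):
    """Serialise one message according to the description above."""
    if not (verb.isascii() and verb.isalpha() and verb.isupper()):
        raise ValueError("verb must be uppercase ASCII letters")
    if "\r" in argument or "\n" in argument:
        raise ValueError("argument must not contain CR or LF")
    return f"{verb} {argument}\r\n".encode("utf-8")


def read_message(raw):
    """Parse one message; returns (verb, argument)."""
    line = raw.decode("utf-8")
    if not line.endswith("\r\n"):
        raise ValueError("message must end with CRLF")
    verb, sep, argument = line[:-2].partition(" ")
    if not sep:
        raise ValueError("missing space after verb")
    return verb, argument


wire = write_message("SEND", "hello world")
parsed = read_message(wire)
```

The comments are the protocol; the functions are just one implementation of it. Someone else could write a compatible implementation in another language from the description alone.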
As you are interested in the Replace Email section of the Paul Graham article you linked, IMHO you will need both to develop a protocol definition and to provide an example implementation. The protocol definition does not need to be published as an internet protocol standard in order to be useful.
You will need an implementation so that you can test, refine and improve the ideas. It is extremely unlikely the protocol will be right at the first attempt, and you'll need something to support the initial users.
You don't need a protocol definition to implement an improved email, but you will need one if you expect others to work with you and adopt it, though it very much depends on your 'business model'. I strongly recommend you have a protocol definition from the start, even if only to keep yourself sane when you try to produce the second implementation.
I recommend having a look at some examples of sneaky approaches to protocols and implementation. My favourite is described in the Viewpoints Research 2008 Progress report on a super-compact approach to TCP/IP.
They did not follow the traditional approach to developing the implementation of a protocol (the protocol stack). Instead they wrote code which parsed the human-readable TCP/IP protocol specification, and generated the code of a TCP/IP stack from that protocol document. The usual TCP/IP stack is about 40,000 lines of code, or more. Their program, which read the protocol specification and generated the code for a TCP/IP stack 'automatically', was only 160 lines of code. They use extremely powerful programming tools.
If you had an approach like that, you could keep the protocol implementation synchronised with the specification, and potentially make it straightforward for others to adopt your protocol.
HTH
You are confusing a protocol standard with the implementation.
These 2 are unrelated.
A protocol is described at a high level, but with enough information for someone to understand how it should be implemented.
The idea is that someone reading the document can understand how and what to implement in any language they prefer.
To give an example: the RFC for the SIP protocol describes the various flows and the various messages and how they are supposed to be processed, i.e. the semantics are well defined.
You can implement a SIP UA or server in C++ or Java; this is irrelevant to the SIP protocol.
For this you don't need to provide any source code (you could though if you think it helps clarify some obscurity of the description).
The most important part is that your protocol is actually reviewed by stakeholders i.e. people that expect it to solve their problems.
This review matters not only because it can uncover problems in your protocol, but also because the reviewers can verify that the concept is solid, i.e. that it can be technically implemented.
The only case where a specification might state or imply something concrete about the implementation is when the protocol demands specific constraints, e.g. a hard real-time constraint, which could serve as a "hint" about which implementations/languages to avoid.
"Also, if I write a protocol and it isn't made official by the standards group, can people/clients still implement it?"
Strange question. What do you mean? How will someone know your protocol exists?
If it is official, they can get it from the standards group to implement it.
Otherwise it is obvious that you have some sort of "proprietary" protocol (which is not uncommon e.g. a company can have an internal protocol for its own software) and people have to get the spec from you.

How can I inject raw packets onto my network

In testing certain network device driver receive features, I need to send special packets on the wire. I know I need to open a raw socket and push the bytes out. Is there some well-known example code (C, Perl, whatever) already available for playing at this level?
(added later) I would prefer non-platform-specific answers, they'll be the most useful for everyone.
Look at the documentation for packet sockets (packet(7) on Linux). Basically, you create a socket with SOCK_RAW or SOCK_DGRAM, then write to the socket using normal socket I/O. With SOCK_RAW, the data you send is put directly on the wire, including the link-level header you supply, rather than automatically getting the headers that are necessary for most network interop. With SOCK_DGRAM, the kernel builds the link-level header for you.
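Here is a minimal Python sketch of the raw-socket approach described above. Note that it is Linux-specific (AF_PACKET sockets) and needs root privileges to send, so it does not meet the question's preference for a platform-neutral answer. The MAC addresses, payload, and interface name are made up for illustration.

```python
import socket
import struct


def build_ethernet_frame(dst_mac, src_mac, ethertype, payload):
    """Return a raw Ethernet II frame (no trailing CRC; normally the hardware adds it)."""
    header = struct.pack("!6s6sH", dst_mac, src_mac, ethertype)
    # Pad with zeros to the 60-byte minimum frame size (excluding the CRC).
    return (header + payload).ljust(60, b"\x00")


def send_frame(interface, frame):
    """Linux only, needs root (CAP_NET_RAW): write the frame to the wire as-is."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW) as sock:
        sock.bind((interface, 0))
        return sock.send(frame)


frame = build_ethernet_frame(
    dst_mac=bytes.fromhex("ffffffffffff"),  # broadcast address
    src_mac=bytes.fromhex("020000000001"),  # locally administered address, made up
    ethertype=0x88B5,                       # IEEE "local experimental" EtherType
    payload=b"test payload",
)
# send_frame("eth0", frame)  # "eth0" is an assumed interface name
```

Since nothing below the socket fills in headers, every byte of the frame, including deliberately malformed fields for driver testing, is under your control.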
http://www.codeproject.com/KB/IP/sendrawpacket.aspx
There's already an existing project that may be able to help you with this.
Check out http://tcpreplay.synfin.net/wiki/tcprewrite#RewritingLayer2
and http://tcpreplay.synfin.net/
Seems to me you are looking for a tool to generate your own packets. Scapy is such a tool, often used in the security industry (by pentesters, for example).
Demo is available: http://www.secdev.org/projects/scapy/demo.html
I can't think of any examples. But you should just be able to open up a UDP socket to any IP address you like and start writing data to it. Make sure it's UDP or this will not work.
I found that there's a good C example here at Security-Freak, which only needed a little modification for flexibility. I'm hoping there are more answers in other languages.

Sniffing traffic between a Flex app and ColdFusion backend

What is a good strategy for sniffing/tracing function calls between a Flex application and a ColdFusion-based backend running on ColdFusion server? I understand they use the AMF protocol.
I'm used to using Fiddler to sniff transactions between HTTP clients and servers, and it works great as long as you're using plain text or XML HTTP requests and responses (including those over SSL) but it isn't much help for binary protocols like AMF over HTTP.
In my case, I do have access to the source code for the client and server, but I'm looking for an easy way to passively sniff traffic in any Flex + ColdFusion situation, without having to tweak anything on the server.
Wireshark: sniffing the glue that holds the internet together
http://www.wireshark.org/
http://www.charlesproxy.com/
Although not free, Charles will decode AMF binary data and also lets you trace SSL connections.
ServiceCapture is another option. It decodes the binary AMF for you, if I remember correctly.
http://kevinlangdon.com/serviceCapture/
Firebug with the Flashbug plugin will show all decoded AMF messages both to and from a Flash app. Works well over HTTPS too.
https://addons.mozilla.org/en-us/firefox/addon/amf-explorer/.
The simple, poor man's trick: create one CFC to log calls to the different CFCs and pages as you need. Dump it all to a table. Filter and sort at will. I have done this in the past and it has worked great. It's like putting in little fish hooks anywhere you want to know what's going on. This would likely give you the most application-relevant data. If you need an example, let me know.
Ditto for Wireshark (the artist formerly known as Ethereal). You can sniff at every protocol layer and stitch together traffic streams.

Resources