Can XProc 3 handle any XPath 3.1 value/type? - xproc

While reading up on XProc 3 I wonder whether a step like an XSLT 3 stylesheet can return any type of the XSLT 3 or XPath 3.1 data model.
The spec in http://spec.xproc.org/master/head/xproc/#documents.9 has a section saying
If the result is a map, array or any atomic value, a JSON document is
created and content-type application/json is used.
I am struggling to understand what would happen with sequences in general, e.g. a sequence of arrays of nodes (e.g. type array(node())*) or a sequence of maps from an atomic type to a node (e.g. a type map(xs:string, node())*) as that is not a type JSON could handle, at least not in the sense I know JSON or the XSLT 3 serialization spec allows JSON serialization for.
Any insight as to whether XProc 3 is meant to allow passing on any XDM 3.1 sequence between steps?

This is an absolutely valid question and I think some clarification in the spec is due. Would you mind raising an issue on GitHub?
https://github.com/xproc/3.0-specification/issues/
Gerrit

how to deserialize based on one field value in payload in Spring Kafka?

I need to consume messages of different types from the same Kafka topic. The types differ only slightly, but they do differ. The message payload contains a field named type that determines which Java type it should be deserialized to.
This is a bit difficult because, to read the value of the 'type' field, I already need to deserialize the payload.
Is there a good way to do this with Spring Kafka?
Thanks in advance.
As Artem Bilan pointed out in the comments, it'd be easier to help if you added more details to your question. I'll try to help by making some assumptions.
When using Avro, deserialization of different payload types should be handled (and enforced) by the schema-registry first, such as documented here, and this doesn't seem to be compatible with having type information inside the payload, so I'll assume you're talking about JSON payloads.
If possible, you should probably change the design to have that kind of metadata in the record headers - this way you don't need to deserialize the payload in order to access it, and can use Spring Kafka built in tools such as DelegatingDeserializer.
Considering both initial assumptions are right - you're using JSON and can't change the design to use headers for type metadata, you should be able to implement a custom JsonTypeResolver to be set on the JsonDeserializer, where you can use an ObjectMapper to deserialize the payload to, for example, a Map - you can then fetch the type information from there and associate it with the proper JavaType.
EDIT: A more performant solution would be to implement a custom deserializer where you can use the ‘readTree’ method in ‘ObjectMapper’ to deserialize the payload to a ‘JsonNode’. You can then traverse this object to fetch type information, and use the ‘treeToValue’ method to continue deserializing the json node to the proper type. That way you’re only deserializing it once.
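The parse-once-then-dispatch idea from the EDIT can be sketched in a language-neutral way. The following is a minimal Python illustration, not Spring Kafka code: the payload classes, the registry, and the "created"/"cancelled" type values are hypothetical, and json.loads only plays the role that readTree plays in Jackson.

```python
import json
from dataclasses import dataclass


# Hypothetical payload types standing in for the real Java classes.
@dataclass
class OrderCreated:
    order_id: str
    amount: float


@dataclass
class OrderCancelled:
    order_id: str
    reason: str


# Assumed mapping from the value of the "type" field to the target class.
TYPE_REGISTRY = {
    "created": OrderCreated,
    "cancelled": OrderCancelled,
}


def deserialize(payload: bytes):
    """Parse the JSON once, then build the object selected by 'type'."""
    tree = json.loads(payload)  # plays the role of ObjectMapper.readTree
    if not isinstance(tree, dict):
        raise ValueError("payload must be a JSON object")
    type_name = tree.get("type")
    target = TYPE_REGISTRY.get(type_name)
    if target is None:
        raise ValueError(f"unknown message type: {type_name!r}")
    # Reuse the already-parsed tree instead of parsing the bytes again
    # (plays the role of ObjectMapper.treeToValue).
    fields = {key: value for key, value in tree.items() if key != "type"}
    return target(**fields)


created = deserialize(b'{"type": "created", "order_id": "42", "amount": 9.5}')
cancelled = deserialize(b'{"type": "cancelled", "order_id": "42", "reason": "duplicate"}')
```

The point of the design is that the raw bytes are tokenized only once; the type lookup and the final object construction both work on the in-memory tree.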

ANTLR for parsing calculation in JSON format

I am new to ANTLR and am trying to see whether ANTLR fits my scenario or whether I should stick to a JSON deserialization library and write some custom code.
Input text is a JSON which represents an expression like
{
"Operator":"ADD",
"Operands": [{"Operator":"ADD", "Operands":[{"OperandValue":23},{"OperandValue":32} ]},
{"Operator":"ADD", "Operands":[{"OperandValue":11},{"OperandValue":12} ]}]
}
The list of operators can evolve.
The JSON is created programmatically, not written by hand by users.
I have to read this JSON, validate it, and give meaningful error messages to my client.
If the JSON is valid and parses, I have to translate it to T-SQL code and execute that on SQL Server.
All my code will be in C# and I need:
max Debug support and ease
unit testability
less custom code
min. learning
With these needs in mind, I tried writing a rough ANTLR grammar as well as custom code. Below are my observations.
ANTLR
With ANTLR, part of my logic will reside in the grammar file, which might be difficult for a newcomer to debug and unit test. For my grammar POC I relied only on the context.GetText() property to figure out where I currently was and to change my grammar.
I have to write a modularized grammar (building blocks) so that I can have a visitor for the smallest part and a more manageable visitor class.
How can I give more meaningful messages to my clients, given that they have minimal knowledge of my grammar parsing engine?
Custom JSON deserialization (JSON.NET)
The code is easy to debug, and everyone understands it. I get a JSON reader and write conditions to check whether a token is a JSON object or a JSON array and whether it has a property Operator with value ADD, and so on.
With custom code I can give more meaningful validation failure messages.
To me ANTLR seems to have high value when your input is not highly structured and you don't have available parsers, but in case of JSON it doesn't give much value add over JSON parsers.
Is ANTLR meant for this scenario?
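To make the custom-deserialization option above concrete, here is a minimal sketch of a recursive validator and translator. It is written in Python rather than C# purely for illustration; the operator table (only ADD appears in the question), the ValidationError class, and the to_sql function are all assumptions, and a real implementation would also have to decide how literals are parameterized before they reach SQL Server.

```python
import json

# Assumed operator table; only ADD appears in the question, the rest are illustrative.
SQL_OPERATORS = {"ADD": "+", "SUBTRACT": "-", "MULTIPLY": "*"}


class ValidationError(Exception):
    """Raised with a message that includes the JSON path of the bad node."""


def to_sql(node, path="$"):
    """Validate an expression node and return the equivalent SQL expression."""
    if not isinstance(node, dict):
        raise ValidationError(f"{path}: expected an object")
    if "OperandValue" in node:
        value = node["OperandValue"]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValidationError(f"{path}.OperandValue: expected a number")
        return str(value)
    operator = node.get("Operator")
    if not isinstance(operator, str) or operator not in SQL_OPERATORS:
        raise ValidationError(f"{path}.Operator: unknown operator {operator!r}")
    operands = node.get("Operands")
    if not isinstance(operands, list) or len(operands) < 2:
        raise ValidationError(f"{path}.Operands: expected a list of at least two operands")
    parts = [to_sql(child, f"{path}.Operands[{i}]") for i, child in enumerate(operands)]
    return "(" + f" {SQL_OPERATORS[operator]} ".join(parts) + ")"


expression = json.loads(
    '{"Operator":"ADD","Operands":['
    '{"Operator":"ADD","Operands":[{"OperandValue":23},{"OperandValue":32}]},'
    '{"Operator":"ADD","Operands":[{"OperandValue":11},{"OperandValue":12}]}]}'
)
sql = "SELECT " + to_sql(expression)
print(sql)  # SELECT ((23 + 32) + (11 + 12))
```

Because the path is threaded through the recursion, an error message such as "$.Operands[1].Operator: unknown operator 'POW'" points the client at the exact node without exposing any parser internals.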

Proper way to include data with an HTTP PATCH request

When I'm putting together an HTTP PATCH request, what are my options to include data outside of URL parameters?
Will any of the following work, and what's the most common choice?
multipart/form-data
application/x-www-form-urlencoded
Raw JSON
...any others?
There are no restrictions on the entity bodies of HTTP PATCH requests as defined in RFC 5789. So in theory, your options in this area are unlimited.
In my opinion the only sensible choice is to use the same Content-Type used to originally create the resource. The most common choice is application/json simply because most modern APIs utilize JSON as their preferred data transfer format.
The only relevant statement RFC 5789 makes in regard to what should and shouldn't be part of your PATCH entity body is silent on the matter of Content-Type:
the enclosed entity contains a set of instructions describing how a resource currently residing on the origin server should be modified to produce a new version.
In summary, how you choose to modify resources in your application is entirely up to you.
As rdlowrey writes, RFC 5789 does not mandate specific content types, so the choice of format is up to you.
However, using the general formats you listed or making up your own format is not interoperable, and developers could have a hard time figuring out the semantics you chose. An official erratum to the RFC states this in a more formal way:
The means of applying a PATCH request to a resource's state is
determined by the request's media type. If a server receives a PATCH
request with a media type whose specification does not define
semantics specific to PATCH, the server SHOULD reject the request by
returning the 415 Unsupported Media Type status code, unless a more
specific error status code takes priority.
In particular, servers SHOULD NOT assume PATCH semantics for generic
media types that don't define them, such as application/xml or
application/json. Doing so will cause interoperability issues,
because the semantics of PATCH become specific to that resource,
rather than general.
(Quote formatted for readability, but unchanged otherwise)
One media type whose specification defines PATCH semantics is application/json-patch+json, also called JSON Patch: RFC 6902. I suppose it could be considered the "standard" choice (at least) when dealing with data originally posted as JSON.
The PATCH method is defined in RFC 5789. This document, however, doesn't enforce any media type for the payload:
The PATCH method requests that a set of changes described in the request entity be applied to the resource identified by the Request-URI. The set of changes is represented in a format called a "patch document" identified by a media type.
Other RFCs, released years later, define some media types for describing a set of changes to be applied to a resource, suitable for PATCHing:
application/json-patch+json
Defined in the RFC 6902:
JSON Patch defines a JSON document structure for expressing a sequence of operations to apply to a JavaScript Object Notation (JSON) document; it is suitable for use with the HTTP PATCH method. The application/json-patch+json media type is used to identify such patch documents.
application/merge-patch+json
Defined in the RFC 7396:
This specification defines the JSON merge patch format and processing rules. The merge patch format is primarily intended for use with the HTTP PATCH method as a means of describing a set of modifications to a target resource's content.
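To illustrate how the two formats differ, here is a minimal Python sketch of the merge patch processing rules from RFC 7396, followed by a JSON Patch document that expresses the same change as explicit operations. The function name merge_patch and the sample documents are chosen for illustration; the JSON Patch document is shown as data only and is not applied by this code.

```python
def merge_patch(target, patch):
    """Apply a JSON merge patch (RFC 7396) and return the patched value."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target entirely.
        return patch
    if not isinstance(target, dict):
        target = {}
    result = dict(target)  # shallow copy so the input is left untouched
    for key, value in patch.items():
        if value is None:
            # null in the patch means "remove this member".
            result.pop(key, None)
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


original = {"a": "b", "c": {"d": "e", "f": "g"}}

# application/merge-patch+json: the patch mirrors the shape of the resource.
merge_document = {"a": "z", "c": {"f": None}}
patched = merge_patch(original, merge_document)
print(patched)  # {'a': 'z', 'c': {'d': 'e'}}

# application/json-patch+json: the same change as a list of operations.
json_patch_document = [
    {"op": "replace", "path": "/a", "value": "z"},
    {"op": "remove", "path": "/c/f"},
]
```

One consequence of the merge patch rules is visible in the code: because null means removal, a merge patch cannot set a member to null, whereas JSON Patch can.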

Flash: AMF3 with reference tables?

AMF3 specification defines use of so called "reference tables" (see Section 2.2 of this specification).
I implemented this behavior in an AMF3 encoder/decoder that I developed in Erlang, but since I am not very experienced with the Flash API, I cannot find an easy way to force Flash to use these reference tables when serializing objects to AMF3. For example, if I use ByteArray, it seems to just repeat the full object encodings:
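import flash.utils.ByteArray;

var ba:ByteArray = new ByteArray();
// Write the same string literal in two separate calls.
ba.writeObject("some string1");
ba.writeObject("some string1");
// Resulting bytes (Erlang binary notation):
// <<6,25,115,111,109,101,32,115,116,114,105,110,103,49,
//   6,25,115,111,109,101,32,115,116,114,105,110,103,49>>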
(which is clearly a repetition).
However, if these two strings are in a one single writeObject call, it does seem to use references:
ba.writeObject(["some string1", "some string1"]);
// Resulting bytes: <<9,5,1,6,25,115,111,109,101,32,115,116,114,105,110,103,49,6,0>>
Socket seems to behave the same way.
So, can I make use of reference tables in Flash code? (I may have a non-standard protocol between the Flash application and the server.)
Thank you!
I think the difference is that in the first example you're writing two string literals. In the second example you're writing an array (or Complex Object in Adobe's specs) that has a reference to two strings. So if you reference the string from an object or an array it will write it in the reference table.
This isn't necessarily a way to enforce it, but it seems logical that the AMF serializer built into Flash would serialize objects this way, so it is probably a reliable way to get the behavior you want (reference table strings).
I hope that is helpful to you!
As per the final sentence of the AMF3 specification (AMF 3.0 Spec at Adobe.com):
Also note that ByteArray uses a new set of implicit reference tables for objects, object traits and strings for each readObject and writeObject call.
It appears that the intention with ByteArray.writeObject is to create a serialization which could be stored or recovered on a per-object basis.
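The effect of a fresh reference table per call can be reproduced with a small encoder. The following Python sketch is an illustration only: the Amf3Writer class and its methods are hypothetical names, it handles just strings and dense arrays of strings, it ignores the object and traits tables, and it keys the string table by content, which is an assumption about the encoder rather than documented Flash behavior.

```python
STRING_MARKER = 0x06
ARRAY_MARKER = 0x09


def encode_u29(value: int) -> bytes:
    """Encode an integer as an AMF3 variable-length U29."""
    if not 0 <= value < (1 << 29):
        raise ValueError("U29 out of range")
    if value < 0x80:
        return bytes([value])
    if value < 0x4000:
        return bytes([(value >> 7) | 0x80, value & 0x7F])
    if value < 0x200000:
        return bytes([(value >> 14) | 0x80, ((value >> 7) & 0x7F) | 0x80, value & 0x7F])
    return bytes([
        (value >> 22) | 0x80,
        ((value >> 15) & 0x7F) | 0x80,
        ((value >> 8) & 0x7F) | 0x80,
        value & 0xFF,
    ])


class Amf3Writer:
    """One writer models one writeObject call: it owns one string table."""

    def __init__(self):
        self.string_refs = {}  # string content -> index in the reference table

    def _utf8_vr(self, text: str) -> bytes:
        if text == "":
            return b"\x01"  # the empty string is never added to the table
        if text in self.string_refs:
            # Low bit 0 marks a reference; the remaining bits are the index.
            return encode_u29(self.string_refs[text] << 1)
        self.string_refs[text] = len(self.string_refs)
        data = text.encode("utf-8")
        # Low bit 1 marks an inline value; the remaining bits are the byte length.
        return encode_u29((len(data) << 1) | 1) + data

    def write_string(self, text: str) -> bytes:
        return bytes([STRING_MARKER]) + self._utf8_vr(text)

    def write_string_array(self, items) -> bytes:
        # Dense array: marker, U29 count with low bit set, then an empty
        # string that terminates the (empty) associative part.
        out = bytes([ARRAY_MARKER]) + encode_u29((len(items) << 1) | 1) + b"\x01"
        for item in items:
            out += self.write_string(item)
        return out


# Two separate calls: each gets its own table, so the string is repeated.
separate = Amf3Writer().write_string("some string1") + Amf3Writer().write_string("some string1")

# One call with both strings: the second occurrence becomes the reference 6,0.
shared = Amf3Writer().write_string_array(["some string1", "some string1"])
print(list(shared))
```

With these inputs the sketch yields the same byte sequences as the two ByteArray experiments in the question, which is consistent with the per-call table described in the specification.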
The NetConnection object's behavior is similar to what you had hoped for.
When updating the string-references table, it is important to not add empty strings to the reference table.
When maintaining the object-references table, you may be able to implement defensive programming as follows: the object-references table is constructed recursively and at some times contains objects for which the traits are not yet completely known. If the table indices are not allocated in advance, the numbering will be inconsistent across applications. An AMF3 decoder should not use the traits from a partially-constructed object -- such input should be flagged as erroneous.
The string-references table is implemented at the encoder by 'tagging' in-memory string objects as they are serialized. Two different string objects with the same content (matching strings) do not seem to be encoded with one referencing the other. Both strings will be output and a string-by-reference will not be used.
There may be a solution to your original question. If you have a number of objects all belonging to the same class, and you would like to store those objects all in one storage, I suggest the following: Create a "parent object" with references to all the objects you intend to store. Then use ByteArray.writeObject to persist that parent object. AMF will encode all of the referenced objects and will represent the traits of repeated object classes in an efficient way.
Look at the last page of the official AMF3 spec and you will see that ByteArray is pretty much worthless. You will have to write your own AMF3 serializer/deserializer.

Flex - XML Serialization and De-Serialization of nested Object structures

Our Flex app would like to work with requests and responses as object graphs (nothing unusual there) e.g. response becomes the model of some view, and would be a structure with several layers of nesting.
Now, ideally we would like to use the same client (and server) side objects for different message formats, e.g. XML and AMF, and have a pluggable serialization/de-serialization layer (!)
AMF has serialization and matching of client to server using
[RemoteClass(alias="samples.contact.Contact")]
but it seems there is no equivalent for XML.
I am (somewhat optimistically) looking for a neat way of serializing the object graph to XML, to send through an HTTPService from the client.
For responses, the default 'object' and 'E4X' provide some de-serialization. This is handy, but of course we don't have the niceties of unpacking the XML back into specific AS classes like we do with AMF.
Any suggestions?
(did have one idea come through about wrapping/casting object as XML or XMLList - this does not seem to work, however)
Update:
Both these libraries look useful, and I will very likely use them at some point.
For now, I really need the simplicity of re-using the metadata set for the AMF3 serialization which we are using in any case ([RemoteClass],[Transient])
So the best option at the moment is AMFX, used by Flex Data Services for AMF transfer using XML (classes in the mx.messaging.channels.amfx package). The only drawback at the moment is that any Externalizable class is transformed into a hex byte stream, and ArrayCollection is Externalizable! (I am hoping to work around this by serializing the internal Array in a subclass.)
Hope that's useful to someone ..
Regarding the Xml serialization I can give you a starting point (as biased as it may be, though :D).
I am working on a project that allows for automatic conversion of AS3 objects to and from xml. It basically uses annotations on the model objects you use for communication in order to construct the xml structure or populating an object from xml.
It is called FlexXB and you can check it out at http://code.google.com/p/flexxb/.
I started this project because I ran into the same issues at work (namely, I have a server that communicates through XML), and I hoped it would be of use to someone else.
Cheers,
Alex
Yet another project: FleXMLer (http://code.google.com/p/flexmler/).
It offers the straightforward approach of asx3m, where you can just call:
new FleXMLer().serialize(obj);
You can also customize XML element names, skip elements, and tweak the way arrays and hash tables are serialized.
Would appreciate your input.
Check out the asx3m project at http://code.google.com/p/asx3m
It's an AS3 port of Java XStream serialization library and works pretty well.
I made it because I had to connect to a server platform that used XStream for exchanging data objects and put a lot of work in it.
It can be extended to serialize AS3 objects to any format (JSON, for example) and could leverage the power of user-defined metatags.
Cheers,
Tomislav
There's a library including JSON available from Adobe, too. And since ActionScript is a superset of JavaScript ... and JSON is increasingly supported cross-framework ...
