Flash: AMF3 with reference tables? - apache-flex

The AMF3 specification defines the use of so-called "reference tables" (see Section 2.2 of the specification).
I implemented this behavior in an AMF3 encoder/decoder that I wrote in Erlang, but I am not very experienced with the Flash API and cannot find an easy way to force Flash to use these reference tables when serializing objects to AMF3. For example, if I use ByteArray, it seems to just repeat the full object encoding:
var ba:ByteArray = new ByteArray();
ba.writeObject("some string1");
ba.writeObject("some string1");
# =>
# <<6,25,115,111,109,101,32,115,116,114,105,110,103,49,
# 6,25,115,111,109,101,32,115,116,114,105,110,103,49>>
(which is clearly a repetition).
However, if these two strings are in a single writeObject call, it does seem to use references:
ba.writeObject(["some string1", "some string1"]);
# => <<9,5,1,6,25,115,111,109,101,32,115,116,114,105,110,103,49,6,0>>
Socket seems to behave the same way.
So, can I make use of reference tables in Flash code? (I may be using a non-standard protocol between the Flash application and the server.)
Thank you!

I think the difference is that in the first example you're writing two string literals. In the second example you're writing an array (a Complex Object, in the terms of Adobe's spec) that holds references to two strings. So if you reference the string from an object or an array, it will be written to the reference table.
This isn't necessarily a way to enforce it, but it seems logical that the AMF serializer built into Flash would serialize objects this way, so it is probably a reliable way to get the behavior you want (strings in the reference table).
I hope that is helpful to you!

As per the final sentence of the AMF3 specification (AMF 3.0 Spec at Adobe.com):
Also note that ByteArray uses a new set of implicit reference tables for objects, object traits and strings for each readObject and writeObject call.
It appears that the intention with ByteArray.writeObject is to create a serialization which could be stored or recovered on a per-object basis.
The NetConnection object's behavior is similar to what you had hoped for.
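To make the difference concrete, here is a minimal Python sketch of the string reference table, not Flash's actual serializer. The names `Amf3Encoder` and `write_object` are hypothetical, only strings and dense arrays are handled, the object reference table (in which arrays would also take a slot) is left out, and strings are matched by value. `write_object` starts with empty tables on every call, as the spec says ByteArray does, while a long-lived `Amf3Encoder` keeps its table between values.

```python
def encode_u29(n):
    """Encode an integer as an AMF3 variable-length U29 (1 to 4 bytes)."""
    if n < 0 or n > 0x1FFFFFFF:
        raise ValueError("U29 out of range")
    if n < 0x80:
        return bytes([n])
    if n < 0x4000:
        return bytes([(n >> 7) | 0x80, n & 0x7F])
    if n < 0x200000:
        return bytes([(n >> 14) | 0x80, ((n >> 7) & 0x7F) | 0x80, n & 0x7F])
    return bytes([(n >> 22) | 0x80, ((n >> 15) & 0x7F) | 0x80,
                  ((n >> 8) & 0x7F) | 0x80, n & 0xFF])


class Amf3Encoder:
    """Hypothetical encoder: strings and dense arrays only."""

    def __init__(self):
        self.strings = {}  # string value -> index in the reference table

    def _utf8_vr(self, s):
        if s == "":
            return b"\x01"  # the empty string is never put in the table
        if s in self.strings:
            return encode_u29(self.strings[s] << 1)  # low bit 0: reference
        self.strings[s] = len(self.strings)
        data = s.encode("utf-8")
        return encode_u29((len(data) << 1) | 1) + data  # low bit 1: inline

    def encode(self, value):
        if isinstance(value, str):
            return b"\x06" + self._utf8_vr(value)  # string marker
        if isinstance(value, list):
            # array marker, dense count, empty associative part
            out = b"\x09" + encode_u29((len(value) << 1) | 1) + b"\x01"
            for item in value:
                out += self.encode(item)
            return out
        raise TypeError("unsupported type in this sketch")


def write_object(value):
    """Mimic ByteArray.writeObject: fresh reference tables on every call."""
    return Amf3Encoder().encode(value)


two_calls = write_object("some string1") + write_object("some string1")
one_call = write_object(["some string1", "some string1"])

shared = Amf3Encoder()  # tables survive across values
shared_stream = shared.encode("some string1") + shared.encode("some string1")
```

Under these assumptions `one_call` and `two_calls` reproduce the byte sequences quoted in the question, while `shared_stream` ends in the two-byte reference `6,0`.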
When updating the string-references table, it is important to not add empty strings to the reference table.
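On the decoding side, the empty-string rule could look like the following Python sketch. `read_u29` and `read_string` are hypothetical helper names, and the string marker byte is assumed to have been consumed already.

```python
def read_u29(data, pos):
    """Read an AMF3 U29 starting at pos; return (value, new position)."""
    value = 0
    for i in range(4):
        byte = data[pos]
        pos += 1
        if i == 3:
            value = (value << 8) | byte  # the fourth byte carries all 8 bits
            break
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            break
    return value, pos


def read_string(data, pos, table):
    """Read a UTF-8-vr string, resolving or recording references in table."""
    header, pos = read_u29(data, pos)
    if header & 1 == 0:
        return table[header >> 1], pos  # low bit 0: index into the table
    length = header >> 1
    text = data[pos:pos + length].decode("utf-8")
    pos += length
    if text:  # empty strings are never added to the table
        table.append(text)
    return text, pos


table = []
data = bytes([25]) + b"some string1" + bytes([1, 0])
first, pos = read_string(data, 0, table)
empty, pos = read_string(data, pos, table)
again, pos = read_string(data, pos, table)
```

If the empty string were appended to `table`, every later reference index would be off by one relative to the encoder.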
When maintaining the object-references table, some defensive programming helps. The table is built recursively, so at times it contains objects whose traits are not yet completely known. If the table indices are not allocated in advance, the numbering will be inconsistent across applications. An AMF3 decoder should not use the traits of a partially constructed object; such input should be flagged as erroneous.
The string-references table is implemented in the encoder by 'tagging' in-memory string objects as they are serialized. Two different string objects with the same content (matching strings) do not seem to be encoded with one referencing the other: both strings are written out in full and no string-by-reference is used.
There may be a solution to your original question. If you have a number of objects all belonging to the same class and you would like to store them all together, I suggest the following: create a "parent object" with references to all the objects you intend to store, then use ByteArray.writeObject to persist that parent object. AMF will encode all of the referenced objects and will represent the traits of repeated object classes efficiently.

Look at the last page of the official AMF3 spec and you will see that ByteArray is pretty much worthless. You will have to write your own AMF3 serializer/deserializer.

Related

spaCy adding pointer to another token in custom component

I am trying to find how token.head and token.children are implemented. I want to replicate this implementation as I add a custom component to my spaCy pipeline for SRL.
That is, each token can point to predicates for which it is an argument. Intuitively, I think that this should work kind of like token.children wherein (I think) it returns a generator of the actually dependent child token objects.
I assume that I should not simply store an attribute of that token as this does not seem very memory efficient and rather redundant. Does anyone know the correct way to implement this? Or is this handled implicitly by the spaCy Underscore.set method?
Thanks!
The Token object is only a view -- it's sort of like holding a reference to the Doc object, and an index to the token. The Span object is like this too. This ensures there's a single source of truth, and only one copy of the data.
You can find the definition of the key structs in the spacy/structs.pxd file. This defines the attributes of the TokenC struct. The Doc object then holds an array of these, and a length. The Token objects are created on the fly when you index into the Doc. The data definition for the Doc object can be found in spacy/tokens/doc.pxd, and the implementation of the token access is in spacy/tokens/doc.pyx.
The way the parse tree is encoded in spaCy is a bit unsatisfying. I've made an issue about this on the tracker --- it feels like there should be a better solution.
What we do is encode the offset of the head relative to the token. So if you do &doc.c[i] + doc.c[i].head you'll get a pointer to the head. That part is okay. The part that's a bit weirder is that we track the left and right edges of the token's subtree, and the number of direct left and right children. To get the rightmost or leftmost child, we navigate around within this region. In practice this actually works pretty well because we're dealing with a contiguous block of memory, and loops in Cython are fast. But it still feels a bit janky.
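The relative-offset idea can be illustrated in plain Python. This is a toy sketch, not spaCy's code: the sentence and offsets are made up, and the subtree edges are recomputed on demand here, whereas the answer describes them as being tracked on the token.

```python
# Toy sentence; the head offsets below are invented for illustration.
words = ["She", "ate", "green", "apples"]
# head_offset[i] is where token i's head sits relative to i (0 marks the root).
head_offset = [1, 0, 1, -2]


def head_index(i):
    """Index of token i's head: the token's own index plus its offset."""
    return i + head_offset[i]


def children(i):
    """Direct dependents of token i, found by scanning the offsets."""
    return [j for j in range(len(words)) if j != i and head_index(j) == i]


def subtree_edges(i):
    """Leftmost and rightmost token indices in the subtree rooted at i."""
    left = right = i
    for child in children(i):
        child_left, child_right = subtree_edges(child)
        left = min(left, child_left)
        right = max(right, child_right)
    return left, right
```

Because only integers are stored, a "pointer" to another token costs one small number per token, which is the property worth copying for predicate links.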
As far as what you'll be able to do as a user...If you run your own fork of spaCy you can happily define your own data on the structs. But then you're running your own fork.
There's no way to attach "real" attributes to the Doc or Token objects, as these are defined as C-level types --- so their structure is defined statically; it's not dynamic. You could subclass the Doc but this is quite ugly: you need to also subclass.
This is why we have the underscore attributes, and the doc.user_data dictionary. It's really the only way to extend the objects. Fortunately you shouldn't really face a data redundancy problem. Nothing is stored on the Token objects. The definitions of your extensions are stored globally, within the Underscore class. Data is stored on the Doc object, even if it applies to a token --- again, the Token is a view. It can't own anything. So the Doc has to note that we have some value assigned to token i.
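As a rough mental model of "the Token is a view", here is a pure-Python toy. `ToyDoc`, `ToyToken` and the `("predicates", index)` key layout are all hypothetical and do not reflect spaCy's internals; the point is only that the per-token data lives on the document and the token holds nothing but a document reference and an index.

```python
class ToyDoc:
    """Owns all the data; tokens are created on the fly when indexed."""

    def __init__(self, words):
        self.words = list(words)
        self.user_data = {}  # (attribute name, token index) -> value

    def __getitem__(self, i):
        return ToyToken(self, i)  # a fresh view every time


class ToyToken:
    """A view: just a reference to the doc plus an index into it."""

    def __init__(self, doc, i):
        self.doc = doc
        self.i = i

    @property
    def text(self):
        return self.doc.words[self.i]

    @property
    def predicates(self):
        # Stored as integer indices on the doc, returned as token views.
        indices = self.doc.user_data.get(("predicates", self.i), [])
        return [self.doc[j] for j in indices]

    def add_predicate(self, predicate):
        key = ("predicates", self.i)
        self.doc.user_data.setdefault(key, []).append(predicate.i)


doc = ToyDoc(["She", "ate", "apples"])
doc[0].add_predicate(doc[1])  # "She" is an argument of "ate"
predicate_texts = [t.text for t in doc[0].predicates]
```

Only the integer index of each predicate is stored, so there is one copy of the data even though `doc[0]` builds a new view object on every access.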
If you're defining a tree-navigation system, I'd recommend considering defining it as your own Cython class, so you can use structs. If you use native Python types it'll be pretty slow and pretty large. If you pack the data into numpy arrays the representation will be more compact, but writing the code will be a pretty miserable experience, and the performance is likely to be not great.
In short:
Define your own types in Cython. Put the data into a struct owned by a cdef class, and give the class accessor methods.
Use the underscore attributes to access the data from spaCy's Doc, Span and Token objects.
If you come up with a compelling API for SRL and the data can be coded compactly into the TokenC struct, we'd consider adding it as native support.

What's best practice in this situation?

I was just writing a small asp.net web page to display a collection of objects by binding to a repeater, when this came to mind.
Basically the class I've created, let's call it 'Test', has a price property that's an integer data type (ignore the limitations of using this type, I'm just using it as an example). However I want to format this property so it displays a currency and the correct decimal places etc.
Is it best practice to have a function within the class that returns the formatted string for the object, or would it be better to have a function in the back end of my web form that operates on the object and returns the formatted string?
I've heard before that a class should contain all its related functions, but I've also heard that presentation should be kept in the 'presentation layer' in my N-tier app.
What would be the best approach in my situation? (and apologies if I haven't explained this clearly enough!)
Thanks!
In my opinion, both options are valid from an OO point of view.
Since the value is a price (that just happens to have the wrong data type), it makes sense to put the formatting into the data class. It's not something that's specific to the web interface, and, if you develop a different kind of user interface, you are very likely to require this formatting again.
On the other hand, it's a presentation issue, so it also makes sense to put it into the presentation layer.
For general OOP stuff, the object should not be exposing implementation details. I choose to interpret this as "avoid setters and getters when possible".
In the context of your question, I suggest that you have a getPriceDisplay() method that returns a string containing the formatted price.
The actual implementation of the formatting is hidden in the implementation details. You could provide a generic function for formatting, use some backend call, or something else. Those details should make no difference to the consumer of the 'Test' object.
Though it's not an OOP approach, in my opinion, this is a good time for an extension method. Call it .ToCurrency() which has the format of the currency...this could be taken from the Web.Config file if you wanted.
Edit
To elaborate, I would simply call .ToString("your-format") (of course this could be as simple as .ToString("C") for your specific question) in the extension method. This allows you change the format throughout the UI in one place. I have found this to be very useful when dealing with DateTime formats in web applications.
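The same "one place for the format" idea, sketched in Python rather than as a C# extension method. The helper name `to_currency` and its defaults are assumptions, and unlike .ToString("C") this sketch is not culture-aware: the symbol and number of decimals are passed in explicitly.

```python
def to_currency(amount, symbol="$", decimals=2):
    """Format a numeric price for display; lives in the presentation layer."""
    return f"{symbol}{amount:,.{decimals}f}"


label = to_currency(1234)
```

Every page that shows a price calls the helper, so changing the display format later means editing one function.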
Wouldn't .ToString("C"); do the job? This would be in the presentation layer I would imagine.

Determining type of CollectionBase via Reflections (or Microsoft.Cci)

Question:
Is there a static way to reliably determine the type contained by a type derived from CollectionBase, using Reflection or Microsoft.Cci?
Background:
I am working on a code generator that copies types, makes customized versions of those types, and converters between. It walks the types in the source assembly via Microsoft.Cci. It prints out source code using textual templates. It does a lot of conversion and customization, and tosses out code that I don't care about.
In my resulting code, I intend to replace List<T> everywhere that a CollectionBase, IEnumerable<T>, or T[] was previously used. I want to use List<T> because I am pretty sure I can serialize it without extra work, which is important for my application. T is concrete in every case. I am trying not to copy CollectionBase classes because I'd have to copy over the custom implementation, and I'd like to avoid having to do that in my code generator.
The only part I'm having a problem with is determining T for List<T> when replacing a custom CollectionBase.
What I've done so far:
I have briefly looked at the MSDN docs and samples for CollectionBase, and they mention creating a custom Add method on your derived type. I don't think this is in any way enforced, so I'm not sure I can rely on that. An implementor could name it something else, or worse, have a collection that supports multiple types, with Object as their only common ancestor.
Alternatives I have considered:
Maybe the default serialization does some tricks that I can take advantage of. Is there a default serialization for CollectionBase collections, or do you generally have to implement it yourself? If you have to do it yourself, is there some reliable metadata I could look at in order to determine the types? If it supports default serialization, does it rely on the runtime types of the items in the collection?
I could make a mapping in my code generator of known CollectionBase types, mapped to their corresponding T for List<T>. If a given CollectionBase type that I encounter isn't in the list, throw an exception. This is probably what I'll go with if there isn't a reliable alternative.
I'm still not sure enough about what you want to do to give advice. Still, do your CollectionBase-derived classes all implement Add(T)? If so, you could look for an Add method with single parameter of type other than object, and use that type for T.
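The heuristic can be sketched as a Python analogue; this is not .NET reflection or Microsoft.Cci. Type annotations stand in for parameter types, and `infer_element_type`, `Contact` and `ContactCollection` are made-up names. As the question points out, nothing forces a collection to follow the convention, so the function returns None when it cannot decide.

```python
from typing import get_type_hints


def infer_element_type(collection_cls, method_name="add"):
    """Guess the element type from a single-parameter add method, if any."""
    method = getattr(collection_cls, method_name, None)
    if method is None:
        return None
    hints = get_type_hints(method)
    hints.pop("return", None)  # only the parameters matter here
    if len(hints) != 1:
        return None  # no typed parameter, or more than one
    (param_type,) = hints.values()
    return None if param_type is object else param_type


class Contact:
    pass


class ContactCollection:
    def __init__(self):
        self.items = []

    def add(self, item: Contact) -> int:
        self.items.append(item)
        return len(self.items) - 1


class UntypedCollection:
    def add(self, item: object) -> int:
        return 0
```

A generator could try this first and fall back to the explicit mapping of known collection types whenever the result is None.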

Flex - XML Serialization and De-Serialization of nested Object structures

Our Flex app would like to work with requests and responses as object graphs (nothing unusual there) e.g. response becomes the model of some view, and would be a structure with several layers of nesting.
** Now, ideally we would like to use the same client (and server) side objects for different message formats e.g. XML and AMF, and have a pluggable serialization/de-serialization layer (!)
AMF has serialization and matching of client to server using
[RemoteClass(alias="samples.contact.Contact")]
but it seems there is no equivalent for XML.
I am (somewhat optimistically) looking for a neat way of serializing the object graph to XML, to send through a HTTPService from the client.
For responses, the default 'object' and 'E4X' provide some de-serialization. This is handy, but of course we don't have the niceties of unpacking the XML back into specific AS classes like we do with AMF.
Any suggestions?
(did have one idea come through about wrapping/casting object as XML or XMLList - this does not seem to work, however)
Update:
Both these libraries look useful, and I will very likely use them at some point.
For now, I really need the simplicity of re-using the metadata set for the AMF3 serialization which we are using in any case ([RemoteClass],[Transient])
.. so the best option at the moment is AMFX, which Flex Data Services uses for AMF transfer over XML (classes in the mx.messaging.channels.amfx package). The only drawback at the moment is that any Externalizable class is transformed into a hex byte stream, and ArrayCollection is Externalizable! (I'm hoping to work around this by serializing the internal Array in a subclass ..)
Hope that's useful to someone ..
Regarding the Xml serialization I can give you a starting point (as biased as it may be, though :D).
I am working on a project that allows for automatic conversion of AS3 objects to and from xml. It basically uses annotations on the model objects you use for communication in order to construct the xml structure or populating an object from xml.
It is called FlexXB and you can check it out at http://code.google.com/p/flexxb/.
I started this project because I ran into the same issue at work (namely, I have a server that communicates through XML), and I hoped it would be of use to someone else.
Cheers,
Alex
Yet another project: FleXMLer (http://code.google.com/p/flexmler/).
It keeps the straightforward approach of asx3m, where you can just call:
new FleXMLer().serialize(obj);
Or you can customize XML element names, skip elements and tweak the way arrays and hash tables are serialized.
Would appreciate your input.
Check out the asx3m project at http://code.google.com/p/asx3m
It's an AS3 port of Java XStream serialization library and works pretty well.
I made it because I had to connect to a server platform that used XStream for exchanging data objects and put a lot of work in it.
It can be extended to serialize AS3 objects to any format (JSON for example) and could leverage power of user defined metatags.
Cheers,
Tomislav
There's a library including JSON available from Adobe, too. And since ActionScript is a superset of JavaScript ... and JSON is increasingly supported cross-framework ...

Is there a tool to capture an objects state to disk?

What I would like to do is capture an object that's in memory to disk for testing purposes. Since it takes many steps to get to this state, I would like to capture it once and skip the steps.
I realize that I could mock these objects up manually but I'd rather "record" and "replay" real objects because I think this would be faster.
Edit: The question is regarding this entire process, not just the serialization of the object (also file operations) and my hope that a tool exists to do this process on standard objects.
I am interested in ActionScript specifically for this application, but...
Are there examples of this in other programming languages?
What is this process commonly called?
How would this be done in ActionScript?
Edit:
Are there tools that make serialization and file operations automatic (i.e. no special interfaces)?
Would anybody else find the proposed tool useful (if it doesn't exist)?
Use case of what I am thinking of:
ObjectSaver.save(objZombie,"zombie"); //save the object
var zombieClone:Zombie = ObjectSaver.get("zombie"); // get the object
and the disk location being configurable somewhere.
Converting objects to bytes (so that they can be saved to disk or transmitted over network etc.) is called serialization.
But in your case, I don't think that serialization is that useful for testing purposes. When the test creates all its test data every time it runs, you can always trust that the test data is what you expect it to be, and that there are no side effects leaking from previous test runs.
I asked the same question for Flex a few days ago. ActionScript specifically doesn't have much support for serialization, though the JSON libraries mentioned in one of the responses looked promising.
Serialize Flex Objects to Save Restore Application State
I think you are talking about "object serialization".
It's called Serialization
Perl uses the Storable module to do this, I'm not sure about Actionscript.
This used to be called "checkpointing" (although that usually means saving the state of the entire system). Have you considered serializing your object to some intermediate format, and then creating a constructor that can accept an object in that format and re-create the object based on that? That might be a more straightforward way to go.
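A minimal Python sketch of that suggestion, shaped like the ObjectSaver use case from the question. All names (`Zombie`, `ObjectSaver`, `to_dict`, `from_dict`) are hypothetical, JSON is just one possible intermediate format, and each class has to supply its own conversion methods, so this is not automatic.

```python
import json
import tempfile
from pathlib import Path


class Zombie:
    def __init__(self, name, health):
        self.name = name
        self.health = health

    def to_dict(self):
        """Intermediate format: a plain dict of the state worth keeping."""
        return {"name": self.name, "health": self.health}

    @classmethod
    def from_dict(cls, data):
        """Rebuild an object from the intermediate format."""
        return cls(**data)


class ObjectSaver:
    """Saves and restores objects under a configurable directory."""

    def __init__(self, directory):
        self.directory = Path(directory)

    def save(self, obj, key):
        path = self.directory / f"{key}.json"
        path.write_text(json.dumps(obj.to_dict()))

    def get(self, cls, key):
        path = self.directory / f"{key}.json"
        return cls.from_dict(json.loads(path.read_text()))


# Record once, replay later (a temporary directory stands in for the disk location).
with tempfile.TemporaryDirectory() as tmp:
    saver = ObjectSaver(tmp)
    saver.save(Zombie("Bob", 42), "zombie")
    zombie_clone = saver.get(Zombie, "zombie")
```

Passing the class to `get` keeps the sketch simple; storing a type name alongside the data would be another option.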
What is this process commonly called?
Serializing / deserializing
Marshalling / unmarshalling
Deflating / inflating
Check out the flash.utils.IExternalizable interface. It can be used to serialize ActionScript objects into a ByteArray. The resulting data could easily be written to disk or used to clone objects.
Note that this is not "automatic". You have to manually implement the interface and write the readExternal() and writeExternal() functions for each class you want to serialize. You'll be hard pressed to find a way to serialize custom classes "automatically" because private members are only accessible within the class itself. You'll need to make everything that you need serialized public if you want to create an external serialization method.
The closest I've come to this is using the appcorelib ClassUtil to create XML objects from existing objects (saving the xml manually) and create objects from this xml. For objects with arrays of custom types it takes configuring ArrayElementType Metadata tags and compiler options correctly as described in the docs.
ClassUtil.createXMLfromObject(obj);
CreateClassFromXMLObject(obj,targetClass);
If you're using AIR, you can store Objects in the included local database.
Here's a simple example using local SQLite database on the Adobe site, and more info on how data is stored in the database.
