WF and Hierarchical State Machines - workflow-foundation-4

Searching Bing and Google, I'm finding some, but surprisingly little, information about state charts in relation to Windows Workflow Foundation. The only clear answer I've derived is "yes, it can handle state charts, and here is a tutorial."
But what I'm trying to determine is:
Does it support hierarchical state machines as richly as Samek's QHSM? For example:
Can it handle a transition from a nested state to another nested state, where all entry and exit events are correctly fired?
Does it handle or facilitate transition to history?
What drawbacks, if any, should I consider about using WF for (hierarchical) state charts?
How has the story changed, if at all, from WF 4 to WF 4.5?
I am not concerned if WF (hierarchical) state charts are much slower than Samek's. It is only the expressiveness I'm curious about.

Related

Storing data on edges of GraphDB

It's being proposed that we store data about a relationship between two vertices on the edge between them. The idea is that these two vertices are related and there are user-level pieces of information that need to be stored in the graph. The best example I can think of would be a Book and a Reader, where the Reader can store cliff notes on the edge for retrieval later on.
Is this common practice? It seems to me that we should minimize the amount of data living on edges, and that the vast majority of GraphDB data should be derived data, rather than using the graph as an actual data store. Given that it's in memory, what happens when it goes down? (We're using Neptune, so there are technically backups.)
Sorry if the question is a bit vague, but I'm not sure how else to ask. I've googled around looking for best practices, and it's all pretty generic material about the concepts and theories of graph databases.
An additional question: is it common practice to expose the Gremlin API directly to users, or should there always be a GraphQL (or other) API in front of it?
Without more detail it is hard to provide exact modeling advice, but in general one of the advantages of using a graph database is that edges are first-class citizens and can carry properties. A common use case for this would be something like Person - purchases -> Product, where you might have a purchase_date on the purchases edge to represent the date of the purchase, as someone might buy the same thing multiple times.
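To make the purchases example concrete, here is a minimal sketch of edges that carry their own properties. It uses a toy in-memory structure, not Neptune or a Gremlin client; the class names, vertex ids, and dates are all made up for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Edge:
    """A directed, labeled edge that carries its own properties."""
    out_v: str   # id of the source vertex
    label: str   # relationship type
    in_v: str    # id of the target vertex
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy in-memory stand-in for a graph database (hypothetical, not a real client)."""

    def __init__(self):
        self.edges = []

    def add_edge(self, out_v, label, in_v, **properties):
        edge = Edge(out_v, label, in_v, dict(properties))
        self.edges.append(edge)
        return edge

    def out_edges(self, out_v, label):
        """Return all edges with the given label leaving the given vertex."""
        return [e for e in self.edges if e.out_v == out_v and e.label == label]


graph = PropertyGraph()
# The same person buys the same product twice; each purchase is its own edge.
graph.add_edge("person:alice", "purchases", "product:book",
               purchase_date=date(2021, 3, 1))
graph.add_edge("person:alice", "purchases", "product:book",
               purchase_date=date(2022, 7, 15))

purchase_dates = [e.properties["purchase_date"]
                  for e in graph.out_edges("person:alice", "purchases")]
```

Because the date lives on the edge rather than on either vertex, the two purchases stay distinguishable even though they connect the same pair of vertices.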
I am not sure exactly what you mean by "a vast majority of GraphDB data be derived data". You can use graphs to derive and infer data and relationships based on the connections, but they fully support storing data as well.
"Given that it's in memory, what happens when it goes down?" Amazon Neptune (and most other databases) uses a buffer cache to keep some data in memory, but that data is also persisted to disk, so if the instance goes down, it can be recovered from durable storage.
"Is it common practice to expose the Gremlin API directly to users, or should there always be a GraphQL (or other) API in front of it?" Just as with any database, I would not recommend exposing the Gremlin API directly to consumers, as doing so comes with a whole host of potential security risks. Generally, the underlying data store of any application should be transparent to the users. They should be interacting with an interface like REST/GraphQL that is designed to answer business-related questions, and should not really know or care that there is a graph database backing those requests.

Managing a connection to an external component in an Aggregate

I just discovered Axon framework and am very eager to use it as it answers a lot of my questions and concerns about DDD.
The first project I'd like to use it on contains small cameras which are controlled via the GigEVision protocol (over TCP and UDP for the control and stream channels). I think my problem would be the same for any case where we maintain a connection to an external component or more generally we want to link an external component lifecycle to Axon's lifecycles.
I'd like to have an Aggregate named Camera to which I can send Commands to grab 1 image or start grabbing N images at a certain FPS.
What I'm not sure about is how to manage the connection to an external component in my Aggregate.
- Should I inject the client for my camera into my Camera Aggregate and consider connecting to it as part of my protocol / business commands? In this case, how would I link the camera lifecycle (a camera gets disconnected all of a sudden) to the aggregate lifecycle (create a corresponding CameraDisconnectedEvent)?
- Should the connection be handled in a sidecar Saga which gets the camera client injected, with the saga starting on ConnectionRequestedEvent and stopping as soon as we get a connection error from the camera? I think I would have the same issue of linking the connection lifecycle to the lifecycle of the Saga.
- Am I leaking implementation details into the business layer, and should I manage the issue another way?
- Am I just using the wrong tool for this job and should not try to force it into Axon?
Thank you very much in advance, hope my message and issues make sense.
Best regards,
First and foremost, you should ensure that the language spoken by the GigEVision protocol by no means leaks into your other domain.
These two should be separate and remain so, as they cover different concerns.
This brings to light the necessity to have a translation layer of some sort.
More formally, this is called context mapping. From a DDD perspective, you would take this even further by talking about an Anti-Corruption Layer.
As the name says, you add a layer to ensure you are not corrupting your domain logic with that of another domain.
Another useful topic to read up on here would be the notion of a Bounded Context.
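As a rough illustration of such a translation layer, here is a minimal sketch in plain Python rather than Axon. Everything in it is hypothetical: the raw message shapes ("FRAME", "LINK_LOST", "block_id"), the fake client, and the event class names are invented for the example and are not taken from GigE Vision or from Axon.

```python
from dataclasses import dataclass


# --- Domain side: events phrased in the domain's own language (hypothetical) ---
@dataclass(frozen=True)
class ImageGrabbed:
    camera_id: str
    frame_number: int


@dataclass(frozen=True)
class CameraDisconnected:
    camera_id: str


# --- Protocol side: invented stand-in for a real protocol client ---
class FakeProtocolClient:
    """Yields raw protocol-level messages; a real client would read from sockets."""

    def __init__(self, raw_messages):
        self._raw_messages = raw_messages

    def read_messages(self):
        yield from self._raw_messages


# --- Anti-corruption layer: the only place that knows both vocabularies ---
class CameraAntiCorruptionLayer:
    def __init__(self, camera_id, client, publish):
        self._camera_id = camera_id
        self._client = client
        self._publish = publish  # callback that hands events to the domain

    def pump(self):
        """Translate every raw message into a domain event, or drop it."""
        for raw in self._client.read_messages():
            if raw["type"] == "FRAME":
                self._publish(ImageGrabbed(self._camera_id, raw["block_id"]))
            elif raw["type"] == "LINK_LOST":
                self._publish(CameraDisconnected(self._camera_id))
            # Any other protocol message is deliberately not forwarded.


published = []
client = FakeProtocolClient([
    {"type": "FRAME", "block_id": 1},
    {"type": "HEARTBEAT"},
    {"type": "LINK_LOST"},
])
CameraAntiCorruptionLayer("cam-1", client, published.append).pump()
```

The domain only ever sees ImageGrabbed and CameraDisconnected; protocol details such as heartbeats stay on the other side of the layer.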
I digress though, let's go back to the problem at hand: where to position this anti-corruption layer.
What is currently unclear to me is which domain requirements make it necessary to maintain the connection at all times when requesting images.
Is the command you want to send requesting a live feed? Or just "some" images from a given time frame?
In both scenarios, I am not immediately convinced that any of these operations require validation through a single Camera aggregate, to be honest.
You could still model this in a command and event format, as the messaging paradigm is very helpful to allow clean segregation of concerns.
But given the current description, I am uncertain whether you need DDD's notion of an Aggregate to model this.
I might be wrong on this note, but I just don't know enough about your domain at this stage.
That's my two cents to the situation, hoping this helps!

Markov Decision Process absolute clarification of what states have the Markov property

I seem to consistently encounter counter-examples in different texts as to what states constitute having the Markov property.
It seems some presentations assume a MDP to be one in which the current state/observation relays absolutely all necessary environmental information to make an optimal decision.
Other presentations state only that the current state/observation has all necessary details from prior observed states to make the optimal decision (e.g., see: http://www.incompleteideas.net/book/ebook/node32.html).
The difference between these two definitions is vast since some people seem to state that card games such as poker lack the Markov property since we cannot know the cards our opponent is holding, and this incomplete information thus invalidates the Markov property.
The other definition from my understanding seems to suggest that card games with hidden state (such as hidden cards) are in fact Markov, so long as the agent is basing its decisions as if it had access to all of its own prior observations.
So which one does the Markov property refer to? Does it require complete information about the environment in order to make the optimal decision? Or does it accept incomplete information, and only require that the agent's current state/observation supports the same optimal decision as if it had access to all of the agent's prior states? I.e., in the poker example, as long as the current state gives us all the information we have observed before, even if there are many hidden variables, would this satisfy the Markov property?

State of the World HazelCast Map with Listener

With Hazelcast, I imagine a common scenario is that consumers want to know the current State of the World (whatever is currently in the Map) and then receive updates, with nothing lost in between.
Imagine a scenario where a Hazelcast map is holding some variety of data that a consumer wants to be streamed (maybe via Rx) the current state and then any updates from the listener.
The API suggests that we add a Listener for updates and treat the Map as a normal ConcurrentMap. However, while I'm enumerating the Map, updates may come in via the listener, so ensuring the correct order of items is hard.
We could share a lock between the map enumerator and the listener, but that seems a bit of a code smell.
So my question in general is, if we want to stream the SoTW and then updates, how can we do this? Is there anything built into Hazelcast that can help us?
Thanks for the help
First of all (and I guess it was just unluckily phrased), a map has no order!
Second, a Hazelcast map is a non-snapshotting, non-persistent data structure. Just like with a ConcurrentHashMap, changes to the data structure are reflected in the iterator and vice versa.
Events, on the other hand, are a completely independent system, especially since events are delivered asynchronously. So it might happen that an event arrives before you have advanced the iterator to the same element, or the other way around.
If you need a better guarantee, iterate first, then add the listener, then apply any changes you might have missed between the start of the first run and the listener registration (you need to compute that delta in a second iteration). It is not a particularly nice way to handle it, but I don't think there's another way.
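The iterate, listen, then reconcile sequence described above can be sketched as follows. This is plain Python against a toy map, not the Hazelcast API: ToyObservableMap, SotwStream, and all method names are invented for illustration. The toy delivers events synchronously and only handles puts (no removals), which is simpler than the asynchronous delivery described above.

```python
_MISSING = object()  # sentinel for "key not seen yet"


class ToyObservableMap:
    """Stand-in for a distributed map with listeners (hypothetical API)."""

    def __init__(self):
        self._data = {}
        self._listeners = []

    def put(self, key, value):
        self._data[key] = value
        for listener in self._listeners:
            listener(key, value)

    def add_listener(self, listener):
        self._listeners.append(listener)

    def items(self):
        return list(self._data.items())


class SotwStream:
    """Emits the state of the world, then updates, tracking what was sent."""

    def __init__(self, source, emit):
        self._source = source
        self._emit = emit
        self._seen = {}  # last value emitted per key

    def emit_snapshot(self):
        # Step 1: first iteration over the current contents.
        for key, value in self._source.items():
            self._seen[key] = value
            self._emit(("snapshot", key, value))

    def start_listening(self):
        # Step 2: from now on, updates arrive through the listener.
        def on_update(key, value):
            self._seen[key] = value
            self._emit(("update", key, value))
        self._source.add_listener(on_update)

    def emit_missed_changes(self):
        # Step 3: second iteration, emitting only what differs from what we sent.
        for key, value in self._source.items():
            if self._seen.get(key, _MISSING) != value:
                self._seen[key] = value
                self._emit(("delta", key, value))


source = ToyObservableMap()
source.put("a", 1)
out = []
stream = SotwStream(source, out.append)
stream.emit_snapshot()
source.put("b", 2)            # lands in the gap: no listener registered yet
stream.start_listening()
stream.emit_missed_changes()  # picks up "b" from the gap
source.put("a", 3)            # delivered through the listener
```

The consumer sees the snapshot of "a", then the missed "b" as a delta, then the live update to "a".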

StatsD/Graphite Naming Conventions for Metrics

I'm beginning the process of instrumenting a web application, and using StatsD to gather as many relevant metrics as possible. For instance, here are a few examples of the high-level metric names I'm currently using:
http.responseTime
http.status.4xx
http.status.5xx
view.renderTime
oauth.begin.facebook
oauth.complete.facebook
oauth.time.facebook
users.active
...and there are many, many more. What I'm grappling with right now is establishing a consistent hierarchy and set of naming conventions for the various metrics, so that the current ones make sense and that there are logical buckets within which to add future metrics.
My question is twofold:
What relevant metrics are you gathering that you have found indispensable?
What naming structure are you using to categorize metrics?
This is a question that has no definitive answer but here's how we do it at Datadog (we are a hosted monitoring service so we tend to obsess over these things).
1. Which metrics are indispensable? It depends on the beholder. But at a high-level, for each team, any metric that is as close to their goals as possible (which may not be the easiest to gather).
System metrics (e.g. system load, memory, etc.) are trivial to gather but seldom actionable, because it is too hard to reliably connect them to a probable cause.
On the other hand, the number of completed product tours matters to anyone tasked with making sure new users are happy from the first minute they use the product. StatsD makes this kind of stuff trivially easy to collect.
We have also found that the core set of key metrics for any team changes as the product evolves, so there is a continuous editorial process.
Which in turn means that anyone in the company needs to be able to pick and choose which metrics matter to them. No permissions asked, no friction to get to the data.
2. Naming structure: The highest level of hierarchy is the product line or the process. Our web frontend is internally called dogweb, so all the metrics from that component are prefixed with dogweb.. The next level of hierarchy is the sub-component, e.g. dogweb.db., dogweb.http., etc.
The last level of hierarchy is the thing being measured (e.g. renderTime or responseTime).
The unresolved issue in Graphite is the encoding of metric metadata in the metric name (and selection using *, e.g. dogweb.http.browser.*.renderTime). It's clever but can get in the way.
We ended up implementing explicit metadata in our data model, but this is not in statsd/graphite so I will leave the details out. If you want to know more, contact me directly.
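The three-level naming scheme and the wildcard selection described above can be sketched as follows. This is a simplified illustration, not Graphite's actual matcher: here * only stands for exactly one whole dot-separated segment, and the helper names and the sample metrics (chrome, firefox, queryTime) are made up.

```python
def metric_name(product, component, measurement):
    """Join the hierarchy levels: product line, sub-component, thing measured."""
    return ".".join([product, component, measurement])


def matches(pattern, name):
    """Return True if name matches pattern, where '*' matches one whole segment."""
    pattern_parts = pattern.split(".")
    name_parts = name.split(".")
    if len(pattern_parts) != len(name_parts):
        return False
    return all(p == "*" or p == n for p, n in zip(pattern_parts, name_parts))


names = [
    metric_name("dogweb", "http", "responseTime"),
    metric_name("dogweb", "db", "queryTime"),
    # Metadata (the browser) encoded as an extra segment in the name:
    "dogweb.http.browser.chrome.renderTime",
    "dogweb.http.browser.firefox.renderTime",
]

selected = [n for n in names if matches("dogweb.http.browser.*.renderTime", n)]
```

Selecting by browser works only because the browser was placed at a fixed position in the name, which is the kind of coupling that can get in the way.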
