I have a class diagram which contains a class named SYSTEM. I have written a constraint for the availability of this system.
For example :
The system should be available 24/7.
Now I want to convert the above statement into an OCL constraint. I am new to OCL. I have searched through some research papers and videos but found nothing specific to availability.
At run time, OCL evaluates and checks a query against the instantaneous system state.
OCL has no support for time, but you may Google for Temporal OCL to see what various researchers are doing. More generally, time is an active research area without solid solutions. Used unchanged, OCL can only access an up-time variable and check that it is greater than 24 hours. When you first start, is your system supposed to fail because it has not yet been available 24/7?
If you consider your specific query, it is obviously impossible. In practice, designers may analyze the failure rates of one/two/three/...-fold redundant systems with respect to relevant foreseeable failure mechanisms. No system is likely to survive an unforeseen failure, let alone a hostile act by some insider, or a well-informed outsider. Again, more realistically, there should be an SLA that accepts a certain amount of downtime per year; the smaller the permitted downtime, the higher the cost.
At design time, you may use OCL to formulate your design constraints, e.g. the mathematics that computes the aggregate failure rate of a single server, or the composite failure rate of redundant servers. But OCL would not be my first choice for complex floating-point calculations.
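To make the SLA/redundancy arithmetic concrete, here is a small back-of-the-envelope sketch (plain Python rather than OCL; the 99.9% single-server figure and the three-fold redundancy are invented example numbers, and independent failures are assumed):

```python
# Rough availability arithmetic for redundant servers (example numbers only).

HOURS_PER_YEAR = 24 * 365

def downtime_hours_per_year(availability: float) -> float:
    """Yearly downtime implied by an SLA availability figure."""
    return (1.0 - availability) * HOURS_PER_YEAR

def composite_availability(single: float, replicas: int) -> float:
    """Availability of N independent redundant servers:
    the system is down only if all replicas are down simultaneously."""
    return 1.0 - (1.0 - single) ** replicas

single = 0.999                                       # hypothetical 99.9% single-server availability
print(downtime_hours_per_year(single))               # ~8.76 hours of downtime per year for one server
print(composite_availability(single, replicas=3))    # ~0.999999999 for three independent replicas
print(downtime_hours_per_year(composite_availability(single, 3)))  # ~0.03 seconds per year
```

That kind of calculation belongs in your design documentation; the OCL invariant would then only check the measurable run-time quantity (e.g. accumulated downtime against the budget).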
I realize this is probably a basic concept, but it would be helpful if anyone could explain whether fast data access is related to availability (A) in the CAP theorem. Fast data access is an important feature expected of Big Data systems. And are the various K-access and K-grouping methods all a part of it?
Availability in the CAP theorem is about whether you can access your data even if there is a failure in the hardware (e.g. a network outage, a node outage, etc.).
Fast access to large volumes of data is an important feature of most big data systems. But it should not be confused with availability as described above.
Availability in the CAP theorem means that all your requests will receive a response, but it does not specify when, or how accurate that response will be. Nor does it specify what "fast" could mean.
Every request receives a (non-error) response – without guarantee that it contains the most recent write.
Keep in mind that the theorem is about strict guarantees. For example, systems can guarantee C and A and still cope with partitions well enough most of the time.
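As a toy illustration of "available but possibly stale" (an invented two-replica example, not any particular database):

```python
# Toy model: two replicas, a broken replication link, and an "available" read
# that answers from whatever copy is reachable -- possibly a stale one.

replicas = {"node_a": {"x": 1}, "node_b": {"x": 1}}
partitioned = True  # the link between node_a and node_b is down

def write(node: str, key: str, value: int) -> None:
    replicas[node][key] = value
    if not partitioned:                      # replication only happens when there is no partition
        for other in replicas:
            replicas[other][key] = value

def available_read(node: str, key: str) -> int:
    # Always answers (availability), even if this copy missed the latest write.
    return replicas[node][key]

write("node_a", "x", 2)                      # the update lands on node_a only
print(available_read("node_b", "x"))         # -> 1: a response, but not the most recent write
```

Note that nothing here says anything about speed; the response is immediate but stale, which is exactly the distinction between availability and fast access.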
I am aware of current Ethernet technology and its non-deterministic behavior due to CSMA/CA and CSMA/CD. I also see a lot of news around Time Sensitive Networking.
Could anyone briefly explain how TSN could change or enhance timing and synchronization, how it is related to IEEE 1588 (PTP), etc.?
Your question is way too broad, but anyway...
First:
All Ethernet links these days are full duplex, so collision handling mechanisms such as CSMA/CA and CSMA/CD have been a thing of the past for about 15 years, I would say: Ethernet is no longer a shared medium (like the air is for Wi-Fi), there simply aren't any collisions, and so there is no need to avoid or detect them. Did you get that "non-deterministic because of CSMA/CA/CD" from a book or a teacher? Then they would need a serious update.
(So no, collision risk and its avoidance mechanism are not the cause of "current Ethernet technology and its non-deterministic behavior", especially not with the word "current".)
Then:
About TSN (Time Sensitive Networking): TSN is just the new name for an IEEE 802 task group that was first called AVB, for "Audio Video Bridging", as you can see here (roughly 2005/2008 to 2012):
http://www.ieee802.org/1/pages/avbridges.html
Note: The Audio/Video Bridging Task Group was renamed the "Time-Sensitive Networking Task Group" in November 2012. Please refer to that page for further information. The rest of this page has been preserved as it existed at that time, without some obsolete meeting information.
The name change was made to reflect its now broader scope: not only pro audio/video distribution, but also automotive and, later, industrial applications as well. So the work continues here:
http://www.ieee802.org/1/pages/tsn.html
As a result, you'll actually find most of the information about the fundamentals of TSN by googling "Ethernet AVB" rather than "Ethernet TSN". The Wikipedia page, carefully maintained by people directly involved with the technology, is a good start:
https://en.wikipedia.org/wiki/Audio_Video_Bridging
Also, as with every technology, there is a technical side (that's the IEEE AVB/TSN task group) and a marketing side, which takes care of branding, use cases, and (very importantly) certification programs that label and guarantee the interoperability of products, so as to have a healthy ecosystem. For AVB/TSN, this marketing side is handled by the AVnu Alliance SIG (Special Interest Group), founded in 2009:
http://avnu.org/
There too, you can find a lot of information in the knowledge-base section of the website (technologies, whitepapers, specifications, FAQs): why it was created (what problem it solves), how it works, and what the use cases are in the various fields it targets.
Some final words:
AVB/TSN is not a single protocol, but rather a set of protocols, put together with possible variants according to the use case. Example: originally designed with auto-configuration/plug-and-play built in (geared towards sound engineers, so that no network engineers are needed), its automotive profile instead uses static configuration (because of reduced boot and configuration time, less code/hardware for low-cost embedded devices, and because you're not going to change a car's network topology or the roles of its nodes every day anyway).
And that makes a BIG pile of standards: the base IEEE AVB standards put together (the last one published in 2013, IIRC) were already about 1,500 pages, and TSN is now expanding that. Distribution of a common wall clock to all participants, which is a prerequisite for synchronization, with sub-microsecond precision, is a big, complex problem in itself to start with. Be it with a statically designated reference clock ("PTP master") as in IEEE 1588, or, even more so, with an elected, constantly monitored and possibly re-elected GM ("Grand Master") chosen via a BMCA (Best Master Clock Algorithm) as in IEEE 802.1AS.
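To give a flavour of the timing part, here is the textbook two-way exchange that IEEE 1588/802.1AS-style protocols use to estimate a slave's offset from the master clock; this simplified sketch assumes a symmetric path delay, and the timestamps are invented:

```python
# Classic PTP-style offset/delay estimation from a Sync + Delay_Req exchange.
# t1: master sends Sync, t2: slave receives it,
# t3: slave sends Delay_Req, t4: master receives it.

def ptp_offset_and_delay(t1: float, t2: float, t3: float, t4: float):
    offset = ((t2 - t1) - (t4 - t3)) / 2.0   # slave clock minus master clock
    delay = ((t2 - t1) + (t4 - t3)) / 2.0    # one-way path delay (assumed symmetric)
    return offset, delay

# Invented nanosecond timestamps: the slave runs 500 ns ahead, the path delay is 1000 ns.
t1, t2 = 0.0, 1500.0            # 1000 ns travel + 500 ns slave offset
t3, t4 = 2000.0, 2500.0         # 1000 ns travel - 500 ns slave offset
print(ptp_offset_and_delay(t1, t2, t3, t4))  # -> (500.0, 1000.0)
```

The hard parts the standards deal with are everything around this little formula: hardware timestamping, asymmetry and residence-time corrections in the bridges, and electing/monitoring the Grand Master in the first place.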
Also, all of this requires implementation not only in the end nodes, but also in the switches (bridges), which take an active part at pretty much every level (clocking, but also bandwidth reservation, then admission control and traffic shaping). So some parts of the standards are relevant to the nodes, others to the bridges.
So all this is really quite big and complex, and going from 1,500 pages (which I once read from cover to cover) - and it's now more than that - to "briefly explain how TSN could change or enhance timing, synchronization, how is it related to IEEE 1588 (PTP) etc ?" is a bit challenging... especially the "etc" part... :-)
Hope these few pointers help though.
We have a single-threaded application that simulates the interaction of hundreds of thousands of objects over time using a shared-memory model.
Obviously, it suffers from its inability to scale across multi-CPU hardware.
After reading a little about agent-based modeling and functional programming/the actor model, I am considering a rewrite using the message-passing paradigm.
The idea is very simple - each object will be an actor and their interactions will be messages, so that the simulation can happen in parallel. Given a configuration of objects at a certain time, its future consequences can easily be computed.
The question is how to model time:
For example, let's assume the behavior of object X depends on A and B. Since the order in which actors run and messages arrive is not guaranteed, it could be that when X is to be computed, A has already sent its message to X but B has not.
How do I make sure the computation happens correctly?
I hope the question is clear.
Thanks in advance.
Your approach of using message passing to parallelize a (discrete-event?) simulation is well-known and does not require a functional style per se (although, of course, this does not prevent you from implementing it like that).
The basic problem you describe w.r.t. the timing of events is also known as the local causality constraint (see, for example, this textbook). Basically, you need to use a synchronization protocol to ensure that each object (or agent) processes its messages in the right order. In the domain of parallel discrete-event simulation, such objects are called logical processes, and they communicate via events (i.e. time-stamped messages).
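As a minimal illustration of the conservative flavour of such a protocol (a toy sketch, not a complete Chandy-Misra-Bryant implementation: no null messages, no lookahead): a logical process only advances when every input channel has at least one pending time-stamped message, and it always consumes the smallest timestamp first.

```python
import heapq

class LogicalProcess:
    """Toy conservative logical process: one queue of (timestamp, payload)
    per input channel; events are processed strictly in timestamp order."""

    def __init__(self, input_channels):
        self.queues = {ch: [] for ch in input_channels}
        self.clock = 0.0

    def receive(self, channel, timestamp, payload):
        heapq.heappush(self.queues[channel], (timestamp, payload))

    def try_process(self, handler):
        # Only safe to advance when every channel has a pending message: then the
        # smallest timestamp among the channel heads can no longer be undercut.
        while all(self.queues[ch] for ch in self.queues):
            channel = min(self.queues, key=lambda ch: self.queues[ch][0][0])
            timestamp, payload = heapq.heappop(self.queues[channel])
            self.clock = timestamp
            handler(timestamp, payload)

# Usage sketch: X depends on messages from A and B.
x = LogicalProcess(["A", "B"])
x.receive("A", 1.0, "state of A")
x.try_process(lambda t, p: print(t, p))   # nothing happens yet: B's message is still missing
x.receive("B", 2.0, "state of B")
x.try_process(lambda t, p: print(t, p))   # processes A's event (t=1.0); B's event now waits until
                                          # A's next message (or an end marker) rules out earlier events
```

The waiting behaviour at the end of the sketch is exactly why real conservative protocols add null messages or lookahead, and why optimistic protocols (Time Warp) exist; that is the application-specific trade-off mentioned below.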
Correctly implementing a synchronization protocol for these events is challenging and the right choice of protocol is highly application-specific. For example, one important factor is the average amount of computation required per event: if there is little computation required, the communication costs dominate the overall execution time and it will be hard to scale the simulation.
I would therefore recommend looking for existing solutions/libraries on top of the actor framework you intend to use before starting from scratch.
I am curious about the constraints and trade-offs for generating unique sequence numbers in a distributed and concurrent environment.
Imagine this: I have a system whose only job is to give back a unique sequence number every time you ask it. Here is an ideal spec for such a system (constraints):
Stay up under high load.
Allow as many concurrent connections as possible.
Distributed: spread load across multiple machines.
Performance: run as fast as possible and have as much throughput as possible.
Correctness: numbers generated must:
not repeat.
be unique per request (there must be a way to break ties if any two requests happen at exactly the same time).
be in (increasing) sequential order.
have no gaps between requests: 1, 2, 3, 4... (effectively a counter of the total number of requests).
Fault tolerant: if one or more, or even all, machines go down, the system can resume from the state before the failure.
Obviously, this is an idealized spec and not all constraints can be satisfied fully (see the CAP theorem). However, I would love to hear your analysis of various relaxations of the constraints: what types of problems are we left with, and what algorithms would we use to solve them? For example, if we get rid of the counter constraint, the problem becomes much easier: since gaps are allowed, we can just partition the numeric ranges and map them onto different machines.
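Something like this rough sketch is what I have in mind for the relaxed version (the class names and the block size are made up):

```python
import itertools
import threading

BLOCK_SIZE = 1000  # arbitrary example size

class BlockAllocator:
    """Central coordinator: hands out disjoint numeric ranges (blocks)."""
    def __init__(self):
        self._next_block = itertools.count()
        self._lock = threading.Lock()

    def next_block(self):
        with self._lock:
            block = next(self._next_block)
        start = block * BLOCK_SIZE
        return start, start + BLOCK_SIZE

class NodeGenerator:
    """Each machine grabs a block and serves IDs from it locally, with no
    coordination per request; gaps appear whenever a node dies holding an
    unused partial block."""
    def __init__(self, allocator):
        self.allocator = allocator
        self.current, self.end = allocator.next_block()

    def next_id(self):
        if self.current >= self.end:
            self.current, self.end = self.allocator.next_block()
        value = self.current
        self.current += 1
        return value

allocator = BlockAllocator()
node = NodeGenerator(allocator)
print([node.next_id() for _ in range(3)])  # -> [0, 1, 2]
```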
Any references (papers, books, code) are welcome. I'd also like to keep a list of existing software (open source or not).
Software:
Snowflake: a network service for generating unique ID numbers at high scale with some simple guarantees (see the sketch after this list).
keyspace: a publicly accessible, unique 128-bit ID generator, whose IDs can be used for any purpose
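To give a rough idea of how a Snowflake-style generator composes its IDs, here is a simplified sketch of the commonly described 41-bit-timestamp / 10-bit-worker / 12-bit-sequence layout (not Twitter's actual code, and the epoch constant is just the commonly cited example value):

```python
import time
import threading

EPOCH_MS = 1288834974657        # example custom epoch, commonly cited for Snowflake

class SnowflakeLikeGenerator:
    """Sketch of a Snowflake-style 64-bit ID: time-ordered, unique per worker,
    but with gaps (so it drops the 'no gaps' counter requirement).
    Clocks running backwards are not handled in this sketch."""

    def __init__(self, worker_id: int):
        assert 0 <= worker_id < 1024          # 10 bits for the worker ID
        self.worker_id = worker_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now_ms = int(time.time() * 1000)
            if now_ms == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF   # 12-bit sequence per millisecond
                if self.sequence == 0:                        # sequence exhausted:
                    while now_ms <= self.last_ms:             # spin until the next millisecond
                        now_ms = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now_ms
            return ((now_ms - EPOCH_MS) << 22) | (self.worker_id << 12) | self.sequence

gen = SnowflakeLikeGenerator(worker_id=1)
print(gen.next_id(), gen.next_id())   # increasing and unique on this worker, but not gap-free
```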
RFC 4122 implementations exist in many languages. The RFC spec is probably a really good base, as it removes the need for any inter-system coordination, the UUIDs are 128-bit, and, when using IDs from software implementing certain versions of the spec, they include a time-code portion that makes sorting possible, etc.
If you must be sequential (per machine) but can drop the gap/counter requirements, look for an implementation of the Version 1 UUID as specified in RFC 4122.
If you're working in .NET and can eliminate the sequential and gap/counter requirements, just use System.Guid. It implements RFC 4122 Version 4, and the generated GUIDs are already unique (with a very low collision probability) across machines and requests. This could easily be implemented as a web service or just used locally.
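If you are not on .NET, the equivalent is a one-liner with, for example, Python's standard uuid module:

```python
import uuid

print(uuid.uuid1())   # Version 1: time-based, roughly sortable per machine (exposes MAC/time)
print(uuid.uuid4())   # Version 4: random; unique with a very low collision probability
```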
Here's a high-level idea for an approach that may fulfill all the requirements, albeit with a significant caveat that may not match many use cases.
If you can tolerate having two sequence numbers - a logical one returned immediately, guaranteed unique and ordered but with gaps, and a separate physical one guaranteed to be in sequential order with no gaps but only available a short while later - then the solution seems straightforward:
One distributed system serves up a high-resolution clock + machine ID as the logical sequence number.
Stream all the logical sequence numbers into a separate distributed system that orders them and maps them to the physical sequence numbers.
The mapping from logical to physical can happen on demand as soon as the second system has finished processing.
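A toy, single-process sketch of that two-tier idea (invented names; the real thing would be two distributed systems with a log/stream in between):

```python
import time
import itertools

def logical_sequence_number(machine_id: int) -> tuple:
    """Tier 1: returned immediately -- unique and ordered, but with gaps."""
    return (time.time_ns(), machine_id)

class PhysicalNumberer:
    """Tier 2: consumes the stream of logical numbers later and assigns
    the gap-free physical sequence in their global order."""
    def __init__(self):
        self.counter = itertools.count(1)
        self.mapping = {}

    def assign(self, logical_batch):
        for logical in sorted(logical_batch):
            self.mapping[logical] = next(self.counter)

# Usage sketch:
batch = [logical_sequence_number(m) for m in (2, 1, 2)]
numberer = PhysicalNumberer()
numberer.assign(batch)
print(numberer.mapping)   # each logical (timestamp, machine) now has a gap-free physical number
```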
In R. Kent Dybvig's paper "Three Implementation Models for Scheme" he speaks of "FFP languages" and "FFP machines". Apparently there is some connection between FFP machines and string reduction on multiple processors.
Googling doesn't really uncover much in terms of explanations or examples.
Can anyone shed some light on this topic?
Thanks.
Kent Dybvig's advisor, Gyula A. Mago, published a detailed description in "The FFP Machine", Technical Report 87-014 (1987), by Mago and Stanat.
As of this writing, the PDF is freely available at:
http://www.cs.unc.edu/techreports/87-014.pdf
The FFP Machine is a very fine-grained parallel computer architecture: each processor holds a single symbol/atom/value. It uses a string-reduction model of computation in which innermost function applications are found and replaced by their equivalent result (eager evaluation). Where a result is used in several places, it tends to be re-evaluated instead of incurring the cost of accessing some global store (but see Mago's paper on "Copying Operands vs Copying Results", or better yet Mago's "Data Sharing in an FFP Machine" in the 1982 Functional Programming Languages and Computer Architecture conference).
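As a very loose software analogy to "find the innermost applications and replace them with their results" (this says nothing about the actual hardware, it only illustrates the eager, innermost reduction order):

```python
# Toy innermost (eager) reduction on nested prefix expressions: ["+", 1, 2] -> 3.

OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}

def reduce_innermost(expr):
    """Repeatedly replace innermost applications (those whose arguments are
    already plain values) with their result, until only a value remains."""
    if not isinstance(expr, list):
        return expr                                  # already a value
    op, *args = expr
    args = [reduce_innermost(a) for a in args]       # reduce the arguments first (innermost)
    return OPS[op](*args)

print(reduce_innermost(["+", ["*", 2, 3], ["+", 1, 1]]))   # -> 8
```

On the FFP Machine, the independent innermost applications found in the expression string are reduced simultaneously, each by its own "area machine" of processors.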
The L cells holding the FFP expression being reduced communicate through a tree-structured arrangement of T cells. Note that ICs are basically two-dimensional and, with wiring, circuits can move towards being three-dimensional in physical space. Interconnection networks that occupy higher dimensions (such as the Hypercube, Omega, Banyan, Star, etc. networks) will eventually be unable to perform near their theoretical limit.
This communication network is circuit-switched rather than packet-switched. Data packets contain no addresses and do not need routing. Packets from distinct reductions cannot meet, cannot conflict and cannot experience congestion with each other.
The configuring activity (called "Partitioning") is performed in a single sweep upwards in the tree, using a handful of logic operations on 3-bit messages, leaving "area machines" in its wake, each created to advance at most a single reducible application. While it is technically logarithmic in time, the resulting area machines can begin communicating in a pipelined fashion behind the partitioning wave, practically costing only a constant time penalty. (The dismantling of area machines remains a logarithmic cost in time.)
Packets within a single reduction should, and must, meet, and thus provide an often-useful synchronization. Sequences of packets are sorted and combined as they rise within an area, to be broadcast from the root of the area machine. Parallel Prefix and Parallel Suffix operations are provided to reduce area traffic, since there remains a potential bottleneck within an individual reducible application. This is accomplished without the need, exhibited in the Ultracomputer (Jack (Jacob?) Schwartz at NYU), for a separate logarithmic-sized cache memory in each communication node.
Each T cell (internal tree node) only needs a FIFO buffer (for efficiency) of a size greater than the pipeline path to the top of the tree and back down. (This latter is a conjecture of mine, but it seems reasonable.) Since the tree maintains the left-to-right order of data (unlike some other combining networks), the system enables cells to rotate their data in logarithmic rather than linear time, avoiding the plausible congestion at the root of the area machine.
It's worth noting again that the parallelism within an area machine is independent of the simultaneous parallelism in other area machines, and has available to it a number of processors proportional to the quantity of data in the operand.
Have you come across this yet: "Compiling APL for parallel execution on an FFP machine"?
Formal FP: similar to FP, but with a regular, sugarless syntax intended for machine execution, is all I can offer you.
See Wikipedia's FP page.