Is there a standard or a de-facto standard recognized among developers for definitions/schemas in data systems?
Example: "username", often inconsistently referred to as "login", "nickname", "user", category/metadata "access data, personal data"; other ex: "retina scan", often inconsistently referred to as "retina scan", "iris scan", category/metadata "biometric data, access data".
Is there a standard that provides standard definitions/schemas for such terms?
Looking forward very much to your answers.
I have looked at ISO/IEC 11179 so far. However, this standard only provides guidance on how to define terms; it does not supply the definitions themselves.
ISO 8000 is similar, and covers master data only.
Microsoft's Common Data Model provides definitions to a certain extent, but the CDM only covers a certain part of all possible definitions. Moreover, it is questionable how widespread it actually is. The Microsoft Open Data Initiative does not seem to have worked properly.
It is similar with schema.org.
I wonder whether such a list of common definitions exists, or whether each company ultimately creates its own definitions in its own systems (CRM / ERP / whatever).
There is no global standard definition. There may be standards for specific industries, countries, ecosystems, etc., but nothing that you could assume would always be accepted.
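In the absence of a global standard, an in-house data dictionary is the usual fallback. The following is only a minimal sketch of that idea, using the examples from the question; the `TermDefinition` class, the `DICTIONARY` list and the `resolve` function are hypothetical names, not part of any standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermDefinition:
    """One entry in a hypothetical in-house data dictionary."""
    canonical: str
    aliases: frozenset
    categories: frozenset


# Illustrative entries only, copied from the examples in the question.
DICTIONARY = [
    TermDefinition(
        canonical="username",
        aliases=frozenset({"login", "nickname", "user"}),
        categories=frozenset({"access data", "personal data"}),
    ),
    TermDefinition(
        canonical="retina scan",
        aliases=frozenset({"iris scan"}),
        categories=frozenset({"biometric data", "access data"}),
    ),
]


def resolve(term):
    """Return the entry whose canonical name or alias matches term, else None."""
    key = term.strip().lower()
    for definition in DICTIONARY:
        if key == definition.canonical or key in definition.aliases:
            return definition
    return None
```

With this in place, `resolve("Login")` returns the `username` entry, and an unknown term such as `"email"` returns `None`, which makes gaps in the dictionary easy to spot.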
I noticed that tutanota and mega.io mention "Privacy by design" on their homepages. So I became curious and found the Wikipedia page about Privacy by design, but it seems to be an abstract concept (a collection of principles). However, I was looking for something more concrete: do a and b, or implement y and z. For example, mega.io uses Zero Knowledge Encryption (User-Controlled End-to-End Encryption). What other features does a product need to have to be called a "Privacy by Design" service?
By their very nature, abstract principles do not concern themselves with implementation detail. There are many different ways to implement them, and mandating one approach over another is simply out of scope – what matters is the net effect. It's also applicable to non-tech environments, paper records, etc; it's not exclusive to web dev.
Privacy by design (PbD) is a term coined by Ann Cavoukian, an ex-information commissioner in Canada, and it has a collection of principles, as that Wikipedia page describes. PbD is also referenced by GDPR. I've given various talks on privacy and security at tech conferences around the world – you can see one of my slide decks on PbD.
So how do you use them in web development? Take the second principle: "Privacy as the default". This means that if a person using your web app does nothing special, their privacy must be preserved. Amongst other things, you should not load any tracking scripts (perhaps not even remote content), and you should not set any cookies that are not strictly necessary. If you do want to track them (and thus break the user's privacy to some extent), then you need to take actual laws into account, such as the EU ePrivacy Directive, which is what requires consent for cookies and trackers.
So although the principle itself did not require these measures, it influenced the technical decisions you needed to make in your implementation in order to comply with the spirit of the principle. If that happens, the principle has done its job.
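As one possible illustration of "privacy as the default" in code, the sketch below only lets strictly necessary cookies through unless the user has actively consented. The `Cookie` class and `cookies_to_set` function are hypothetical, and deciding which cookies count as strictly necessary is a legal judgment that the code cannot make for you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cookie:
    name: str
    value: str
    strictly_necessary: bool  # assumption: classified by you, e.g. a session cookie


def cookies_to_set(candidates, user_consented=False):
    """Return only the cookies that may be set for this response.

    If the user has done nothing (user_consented=False, the default),
    only strictly necessary cookies pass through.
    """
    return [c for c in candidates if c.strictly_necessary or user_consented]


candidates = [
    Cookie("session", "abc", strictly_necessary=True),
    Cookie("analytics_id", "xyz", strictly_necessary=False),
]
default_cookies = cookies_to_set(candidates)
```

The design point is that the privacy-preserving behaviour is what happens when no argument is passed; tracking has to be switched on explicitly.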
So what you have to do in order to claim privacy by design (though it's not like you get a badge!) is to introspect and consider how these principles apply to your own services, then act on those observations and make sure that the things you design and build conform to the principles. This is a difficult process (especially at first), but there are tools to help you perform "privacy impact assessments" (also part of GDPR) such as the excellent PIA tool by the French information commissioner (CNIL).
If you're thinking about PbD, it's worth looking at two other important lists: the data protection principles that have been the basis of pretty much all European legislation since the 1980s, including GDPR, and the 6 bases for processing in GDPR. If you get your head around these three sets of concerns, you'll have a pretty good background on how you might choose to implement something privacy-preserving, and also a good set of critical guidelines that will help you to spot privacy flaws in products and services. A great example of this is Google Tag Manager; it's a privacy train wreck, but I'll leave it to you to contemplate why!
Minor note: the GDPR links I have provided are not to the official text of GDPR, but a reformatted version that is much easier to use.
I am trying to build an Alexa Skills Kit, where a user can invoke an intent by saying something like
GetFriendLocation where is {Friend}
and for Alexa to recognize the variable Friend I have to define all the possible values in the LIST_OF_Friends file. But what if I do not know all the values for Friend and would still like to make a best match against the ones present in some service that my app has access to?
Supposedly if you stick a small dictionary into a slot (you can put up to 50,000 samples), it becomes a "generic" slot and becomes very open to choosing anything, rather than what is given to it. In practice, I haven't had much luck with this.
It is a maxim in the field of speech recognition that the more restrictive the vocabulary, the greater the accuracy. Conversely, the larger the vocabulary, the lower the accuracy.
A system like VoiceXML (used mostly for telephone prompt software) has a very strict vocabulary, and generally performs well for the domains it has been tailored for.
A system like Watson Speech to Text is completely open, but makes up for its lack of accuracy by returning a confidence level for several different interpretations of the sounds. In short, it offloads much of the NLP work to you.
Amazon have, very deliberately, chosen a middle road for Alexa. Their intention model allows for more flexibility than VoiceXML, but is not as liberal as a dictation system. The result gives you pretty good options and pretty good quality.
Because of their decisions, they have a voice model where you have to declare, in advance, everything it can recognize. If you do so, you get consistent and good quality recognition. There are ways, as others have said, to "trick" it into supporting a "generic slot". However, by doing so you are going outside their design, and consistency and quality suffer.
As far as I know, you cannot dynamically add utterances for intents.
But for your specific question, there is a built-in slot called AMAZON.US_FIRST_NAME, which may be helpful.
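Whichever slot type delivers the name, the "best match" against friends known to your own service can be done in your skill's backend after recognition. Here is a minimal sketch using Python's standard `difflib`; the function name `best_friend_match`, the sample names and the `cutoff` value are assumptions for illustration, not part of the Alexa Skills Kit.

```python
import difflib


def best_friend_match(heard, known_friends, cutoff=0.6):
    """Return the known friend name closest to heard, or None if nothing is close enough."""
    # Compare case-insensitively, but return the name as the service stores it.
    lowered = {name.lower(): name for name in known_friends}
    matches = difflib.get_close_matches(heard.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None


# known_friends would come from the service your app has access to.
friend = best_friend_match("jon", ["John", "Maria", "Ahmed"])
```

This compares spellings rather than sounds, so it only helps when the recognized text is already close to the stored name; a phonetic comparison might work better for names that sound alike but are spelled differently.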
I am working on a Rhapsody SysML project for work and we need to be able to model different configurations of our system. To give a concrete example, if our system is a vehicle, we want to be able to simulate that vehicle with different configurations of engines, wheels, etc.
This is my first time using SysML, but the book A Practical Guide to SysML discusses, in chapter 7, the concept of Instance Specifications. These sound like exactly what we need, and Rhapsody appears to have support for them. So we created an Instance Specification in Rhapsody, giving it specific values for the engine and wheels. But once we have created the instance specification, we cannot find any way to actually create an instance from that specification. We noticed that Rhapsody doesn't even generate any code for the instance specification.
So my questions are the following: can Instance Specifications be used to create different configurations of a system and, if so, how? If not, what is the best method for modeling different configurations of a system?
Thanks for any help you can provide.
For data normalisation of standard tin can verbs, is it best to use verbs from the tincan registry https://registry.tincanapi.com/#home/verbs e.g.
completed http://activitystrea.ms/schema/1.0/complete
or to use the adl verbs like those defined:
in the 1.0 spec at https://github.com/adlnet/xAPI-Spec/blob/master/xAPI.md
this article http://tincanapi.com/2013/06/20/deep-dive-verb/
and listed at https://github.com/RusticiSoftware/tin-can-verbs/tree/master/verbs
e.g.
completed http://adlnet.gov/expapi/verbs/completed
I'm confused as to why those in the registry differ from every other example I can find. Is one of these out of date?
It really depends on which "profile" you want to target with your Statements. If you are trying to stick to e-learning practices that most closely resemble SCORM or some other standard, then the ADL verbs may be most fitting. It is a very limited set, and really only the "voided" verb is provided for by the specification. The other verbs were related to those found in 0.9 and have become the de facto set, but aren't any more "standard" than any other URI. If you are targeting statements to be used in an Activity Streams way, specifically with a social application, then you may want to stick with their set. Note that there are verbs in the Registry that are neither coined by ADL nor provided by the Activity Streams specification.
If you aren't targeting any specific profile (or existing profile) then you should use the terms that best capture the experiences which you are trying to record. And we ask that you either coin those terms at our Registry so that they are well formed and publicly available, or if you coin them under a different domain then at least get them catalogued in our Registry so others may find them. Registering a particular term in one or more registries will hopefully help keep the list of terms from exploding as people search for reusable items. This will ultimately make reporting tools more interoperable with different content providers.
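Since neither verb URI is more "standard" than the other, normalisation comes down to picking one per concept and mapping the variants onto it. The sketch below assumes, purely for illustration, that the ADL URI is the one you chose as canonical; `CANONICAL_VERBS` and `normalise_verb` are hypothetical names, and the actor and object values are placeholders.

```python
# Hypothetical normalisation table: variant verb URI -> the URI chosen as canonical.
CANONICAL_VERBS = {
    "http://activitystrea.ms/schema/1.0/complete": "http://adlnet.gov/expapi/verbs/completed",
}


def normalise_verb(statement):
    """Return a copy of a statement with its verb id mapped to the canonical URI."""
    verb = dict(statement["verb"])
    # Unknown verbs pass through unchanged.
    verb["id"] = CANONICAL_VERBS.get(verb["id"], verb["id"])
    return {**statement, "verb": verb}


statement = {
    "actor": {"mbox": "mailto:learner@example.com"},
    "verb": {
        "id": "http://activitystrea.ms/schema/1.0/complete",
        "display": {"en-US": "completed"},
    },
    "object": {"id": "http://example.com/activities/course-1"},
}
normalised = normalise_verb(statement)
```

The original statement is left untouched, so the source data can be kept as received while reports run on the normalised copy.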
I am working on a paper on Ada 83. We have an assignment that lists the sections of the paper (history, design goals, syntax, etc.). The instructor has mentioned that some of us are going to have sections that simply say "This language does not support this feature."
Two of these sections are Data Types and Data Structures. Well, everything I can see indicates that Ada only has data types and not data structures. Is this true or am I missing something? I know this is kinda a weird question (asking about the 1983 version of Ada) but I don't want to make such a big claim only to find that it was false.
I assume that by "data structures" you mean linked lists, stacks, queues etc.
In Ada83 you could implement data structures, but the standard library didn't contain any. Non-standard libraries were available.
The same was true in Ada95, but the new object-oriented programming features resulted in several open-source container libraries, many of which are still available.
Part of the Ada05 revision was the introduction of a standardised container library Ada.Containers, which has been extended in the Ada12 revision.
A lot of things can be called data structure. As for Ada, records and arrays would be language-supported data structures. Packages are also a kind of a data structure. Ada 2005's Ada.Containers (as mentioned by Simon) are part of the standard library and not of the language itself (your definition may vary; they are defined in the LRM).
Complex data structures like stacks, hashed maps, linked lists etc. are usually a feature of the language's standard library, but in some scripting languages, some of these (particularly hashed maps) are actually language features.