Difference between type and struct - julia

I'm trying to learn Julia and am reading a book that shows the following two code examples in a chapter about composite types:
1.
type Points
    x::Int64
    y::Int64
    z::Int64
end
2.
struct Point
    x::Int
    y::Int
    z::Int
end
The book, however, does not explain when to use struct and when to use type.
What is the difference?

Your source is quite a mess, since it mixes syntax from incompatible epochs in the history of the language.
Originally (pre 0.7, I think?), composite types were declared with either type or immutable, where type was used for mutable types (there was also bitstype for what are now called "primitive types").
Now we have mutable struct and struct for the same purposes (and, consistently with that, primitive type and abstract type).
So, basically, the names have changed so that all the ways to define types become a bit more consistent, and immutable structs have become the "unmarked" case.
"Immutable" in this context means that you can't reassign a field (p.x = 3 is an error). It does not imply that the contents of a field cannot be changed if they happen to be mutable (something.v[1] = 2 will still work even if something is of an immutable type!).

As argued by @phipsgabler, you are mixing code from different versions of Julia.
On this specific topic you can also look at my "Julia concise tutorial", in the section on "Custom Structures".
The same topic, extended, is available in the "Julia Quick Syntax Reference" book (Apress, 2019), from which I report an excerpt from the unedited source (still full of English errors, but you should get the point):
Custom Types
In Chapter 2 "Data Types and Structures" we discussed built-in types, including containers. In this chapter we move to explain how to create user-defined types.
"type" vs "structure":
Let's be clear about these two terms and what they mean in the Julia language context.
By the type of an object we mean, as in plain English, the set of characteristics with which an object can be described. For example, an object of type sheet can be described by its dimensions, weight, colour...
All values in Julia are true objects belonging to a given type (they are individual "instances" of the given type).
Julia types include so-called primitive types, made of just a fixed number of bits (like all numerical types, e.g. Int64 and Float64, but also Char), and composite types or structures, where the set of characteristics of the object is described through multiple fields and a variable number of bits.
Both structures and primitive types can be user-defined and are hierarchically organised. Structures roughly correspond to what are known as classes in other languages.
Primitive type definition
A user-defined primitive type is defined with the keyword primitive type and just its name and the number of bits it requires:
primitive type [name] [bits] end
For example:
primitive type My10KBBuffer 81920 end
IMPORTANT: A (current) limitation of Julia is that the number of bits must be a multiple of 8 below 8388608.
Optionally a parent type can be specified:
primitive type [name] <: [supertype] [bits] end
Note that the internal representation of two user-defined types with the same number of bits is exactly the same. The only thing that changes is their names, but that is an important difference: functions can be defined to act differently depending on which of these types is passed as an argument, i.e. it is the usage of the named types across the program that distinguishes them rather than their implementation.
Structure definition
Similarly to primitive types, to define a structure we use the keyword mutable struct, give the structure a name, specify the fields, and close the definition with the end keyword:
mutable struct MyOwnType
    field1
    field2::String
    field3::Int64
end
Note that while you can optionally declare each individual field to be of a given type (e.g. field3::Int64), you can't declare a field to be a subtype of a given type (e.g. field3<:Number). To do that you can, however, use templates (parametric types) in the structure definition:
mutable struct MyOwnType{T<:Number}
    field1
    field2::String
    field3::T
end
Using templates, the definition of the structure is dynamically created the first time an object whose field3 is of type T is constructed.
The type with which you annotate the individual fields can be either a primitive one (as in the example above) or a reference to another structure (see later for an example).
Note also that, unlike in other high-level languages (e.g. Python), you can't add or remove fields from a structure after you first defined it. If you need this functionality use <> instead, but be aware that you will trade this flexibility for worse performance.
Conversely, to gain performance (again at the cost of flexibility), you can omit the mutable keyword in front of struct.
This enforces that once an object of that type has been created, its fields can no longer be changed (i.e., structures are immutable by default).
Note that mutable objects, such as arrays, remain mutable even inside an immutable structure.

Using shacl to validate a property that has at most one value in its properties

I'm trying to create a shacl based on the ontology that my organization is developing (in dutch): https://wegenenverkeer.data.vlaanderen.be/
The objects described have attributes (properties) that have a specified datatype. The datatype can be primitive (like string or decimal) or complex, which means the property will have properties itself (nested properties). For example: an asset object A will have an attribute assetId of the complex datatype DtcIdentificator, which itself consists of two properties. I have successfully created a shacl that validates objects by creating multiple shapes and nesting them.
I now run into the problem of what we call union datatypes. These are a special kind of complex datatype. They are still nested datatypes: the attribute with the union datatype will have multiple properties, but at most one of those properties may have a value. If the attribute has 2 properties with values, it is invalid. How can I create such a constraint in shacl?
Example (in dutch): https://wegenenverkeer.data.vlaanderen.be/doc/implementatiemodel/union-datatypes/#Afmeting%20verkeersbord
A traffic sign (Verkeersbord, see https://wegenenverkeer.data.vlaanderen.be/doc/implementatiemodel/signalisatie/#Verkeersbord) can have a property afmeting (size) of the datatype DtuAfmetingVerkeersbord.
If an asset A of this type would exist, I could define its size as (in dotnotation):
A.afmeting.rond.waarde = 700
-or-
A.afmeting.driehoekig.waarde = 400
Both are valid ways of using the afmeting property, however, if they are both used for the same object, this becomes invalid, as only one property of A.afmeting may have a value.
I have tried using the union constraint in shacl, but soon found out that it has nothing to do with what we call "union datatypes".
I think the reason you are struggling is that this kind of problem is usually modelled differently. Basically you have different types of traffic signs, and these signs can have measurements. With the model as you described it, A.afmeting.rond.waarde captures 2 ideas using 1 property: (a) the type and (b) the size. From your question, this seems to be the intent. However, this is usually not how this kind of problem is addressed.
A more intuitive design is for a traffic sign to have 2 different properties: (a) a type and (b) a measurement. The traffic sign types are achthoekig, driehoekig, etc. Then you can use SHACL to check that a traffic sign has either both properties or neither.

Using flow, how do I require that an array contains at least one element?

Seems that the docs say that you can force it to a fixed length. Is it possible to require that an array contains at least one element?
There is no way to do this with the Array type. "Array with length >= 1" is not a type of its own: a value either is an array or it isn't.
the docs say that you can force it to a fixed length
This is because at this point the type isn't an array, the type is "a tuple of N values" where a given number of items and their types equates to its own standalone type. Tuples are also considered read-only in that you cannot change their size.
For instance, what would Flow do if you call .pop() on the array? It would have to somehow be disallowed because if the length is part of the type itself, changing the length of the array would actually count as changing the type of the object.
If you expect to change the number of items in the array, what you could do is define your own type that validates the size of the array, and then only exposes methods to add items and throws if the size becomes less than 1. At the end of the day these are runtime checks that it would be up to you to maintain.
On the other hand, you can design your own datastructure that would ensure what you want. A typechecker then could assert that at least one value exists if you define your own datastructure, e.g.
type MinOneList<T> = {
    value: T,
    next: MinOneList<T> | null,
};
so if you have
var foo: MinOneList<T> = ...
you are guaranteed that foo.value exists, so the list has at least one item. For it to qualify as empty, the type would have to be MinOneList<T> | null.

Finding all entity names from deprecated freebase

I'm training a few Machine learning models that represent words as vectors, using freebase as training data. Since the API has been deprecated, I'm working with raw freebase dump, which is now a list of 3.1 billion triples, containing more than 500 million distinct entities (subject/object), and I'd like to reduce this number.
I would like to remove all triples which simply denote names of subjects so that only triples containing MIDs remain. However, I've found multiple possible predicates that define the 'name' of an entity.
i) common.notable_for.display_name
ii) type.object.name
iii) /rdf-schema#label
I have 3 questions :
a) Is there any difference between the above predicates?
b) Are there any additional predicates which also describe the names of entities?
c) Apart from the triple where a name is defined, does the name ever appear in other triples, instead of the MID?
Thank you for your help!
You should concentrate only on type.object.name; that's the schema property holding the topic's name.
The /rdf-schema#label predicate is an equivalent; it is not part of the Freebase schema.
The description of common.notable_for.display_name is: "Localized/gender appropriate display name for the notable object." It is also a property within a CVT (compound value type), and it holds a different kind of information: of all the types that a topic has, which one is the most "important". As far as I remember, "Larry Page" was an "entrepreneur". So you don't need this property. Concentrate on type.object.name.

How to iterate maps in insertion order?

I have a navbar as a map:
var navbar = map[string]navbarTab{
}
Where navbarTab has various properties, child items and so on. When I try to render the navbar (with for tabKey := range navbar) it shows up in a random order. I'm aware range randomly sorts when it runs but there appears to be no way to get an ordered list of keys or iterate in the insertion order.
The playground link is here: http://play.golang.org/p/nSL1zhadg5 although it seems to not exhibit the same behavior.
How can I iterate over this map without breaking the insertion order?
The general concept of the map data structure is that it is a collection of key-value pairs. "Ordered" or "sorted" is nowhere mentioned.
Wikipedia definition:
In computer science, an associative array, map, symbol table, or dictionary is an abstract data type composed of a collection of (key, value) pairs, such that each possible key appears just once in the collection.
The map is one of the most useful data structures in computer science, so Go provides it as a built-in type. However, the language specification only specifies a general map (Map types):
A map is an unordered group of elements of one type, called the element type, indexed by a set of unique keys of another type, called the key type. The value of an uninitialized map is nil.
Note that the language specification not only leaves out the words "ordered" and "sorted", it explicitly states the opposite: "unordered". But why? Because this gives the runtime greater freedom in implementing the map type. The language specification allows any map implementation, such as a hash map or a tree map. Note that the current (and previous) versions of Go use a hash map implementation, but you don't need to know that to use it.
The blog post Go maps in action is a must-read regarding this question.
Before Go 1, when a map was not changed, the runtime returned the keys in the same order when you iterated over its keys/entries multiple times. Note that this order could have changed if the map was modified, as the implementation might have needed to rehash to accommodate more entries. People started to rely on the same iteration order (when the map was not changed), so starting with Go 1 the runtime randomizes map iteration order on purpose, to make developers aware that the order is not defined and can't be relied on.
What to do then?
If you need a sorted dataset (be it a collection of key-value pairs or anything else) either by insertion order or natural order defined by the key type or an arbitrary order, map is not the right choice. If you need a predefined order, slices (and arrays) are your friend. And if you need to be able to look up the elements by a predefined key, you may additionally build a map from the slice to allow fast look up of the elements by a key.
Whether you build the map first and then a slice in the proper order, or the slice first and then a map from it, is entirely up to you.
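As a rough sketch of the "slice first, then a map built from it" variant described above: the tab type, its fields and the indexByKey helper are hypothetical names chosen for illustration, since the question does not show what navbarTab looks like.

```go
package main

import "fmt"

// tab is a hypothetical stand-in for the question's navbarTab type.
type tab struct {
	Key   string
	Title string
}

// indexByKey builds a lookup map from an ordered slice.
// The slice keeps the order; the map only stores positions in the slice.
func indexByKey(tabs []tab) map[string]int {
	idx := make(map[string]int, len(tabs))
	for i, t := range tabs {
		idx[t.Key] = i
	}
	return idx
}

func main() {
	// The slice defines the rendering order.
	tabs := []tab{
		{Key: "home", Title: "Home"},
		{Key: "docs", Title: "Docs"},
		{Key: "about", Title: "About"},
	}
	idx := indexByKey(tabs)

	// Ordered iteration goes over the slice, never over the map.
	for _, t := range tabs {
		fmt.Println(t.Key, t.Title)
	}

	// Lookup by key goes through the map.
	if i, ok := idx["docs"]; ok {
		fmt.Println("found:", tabs[i].Title)
	}
}
```

Storing slice positions rather than copies of the values keeps a single source of truth, at the price of having to rebuild the index if the slice is reordered.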
The aforementioned Go maps in action blog post has a section dedicated to Iteration order:
When iterating over a map with a range loop, the iteration order is not specified and is not guaranteed to be the same from one iteration to the next. Since Go 1 the runtime randomizes map iteration order, as programmers relied on the stable iteration order of the previous implementation. If you require a stable iteration order you must maintain a separate data structure that specifies that order. This example uses a separate sorted slice of keys to print a map[int]string in key order:
import "sort"
var m map[int]string
var keys []int
for k := range m {
    keys = append(keys, k)
}
sort.Ints(keys)
for _, k := range keys {
    fmt.Println("Key:", k, "Value:", m[k])
}
P.S.:
...although it seems to not exhibit the same behavior.
You seemingly see the "same iteration order" on the Go Playground because the output of programs run on the Go Playground is cached. When new, not-yet-seen code is executed, its output is saved. When the same code is executed again, the saved output is presented without running the code again. So what you see is not the same iteration order; it is exactly the same output, shown without executing any of the code again.
P.S. #2
Although using for range the iteration order is "random", there are notable exceptions in the standard lib that do process maps in sorted order, namely the encoding/json, text/template, html/template and fmt packages. For more details, see In Golang, why are iterations over maps random?
Go maps do not maintain the insertion order; you will have to implement this behavior yourself.
Example:
type NavigationMap struct {
    m    map[string]navbarTab
    keys []string
}
func NewNavigationMap() *NavigationMap { ... }
func (n *NavigationMap) Set(k string, v navbarTab) {
    n.m[k] = v
    n.keys = append(n.keys, k)
}
This example is not complete and does not cover all use-cases (eg. updating insertion order on duplicate keys).
If your use-case includes re-inserting the same key multiple times (this will not update insertion order for key k if it was already in the map):
func (n *NavigationMap) Set(k string, v navbarTab) {
    _, present := n.m[k]
    n.m[k] = v
    if !present {
        n.keys = append(n.keys, k)
    }
}
Choose the simplest thing that satisfies your requirements.
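For reference, a self-contained sketch that puts the pieces above together. The navbarTab fields, the body of NewNavigationMap and the Each method are assumptions added for illustration; the answer above leaves them out.

```go
package main

import "fmt"

// navbarTab is a hypothetical stand-in; the question does not show its fields.
type navbarTab struct {
	Title string
}

// NavigationMap pairs a map (for lookup) with a slice of keys (for order).
type NavigationMap struct {
	m    map[string]navbarTab
	keys []string
}

// NewNavigationMap returns an empty map ready for use (assumed constructor body).
func NewNavigationMap() *NavigationMap {
	return &NavigationMap{m: make(map[string]navbarTab)}
}

// Set stores v under k. A key seen for the first time is appended to keys;
// re-inserting an existing key updates the value but keeps its position.
func (n *NavigationMap) Set(k string, v navbarTab) {
	if _, present := n.m[k]; !present {
		n.keys = append(n.keys, k)
	}
	n.m[k] = v
}

// Each calls f for every entry in insertion order (hypothetical helper).
func (n *NavigationMap) Each(f func(k string, v navbarTab)) {
	for _, k := range n.keys {
		f(k, n.m[k])
	}
}

func main() {
	nav := NewNavigationMap()
	nav.Set("home", navbarTab{Title: "Home"})
	nav.Set("docs", navbarTab{Title: "Docs"})
	nav.Set("home", navbarTab{Title: "Start"}) // updates the value, keeps the position

	nav.Each(func(k string, v navbarTab) {
		fmt.Println(k, v.Title)
	})
	// Output:
	// home Start
	// docs Docs
}
```

Deleting entries is not covered here; it would need the key removed from both the map and the slice.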

How does one convert string to number in JDOQL?

I have a JDOQL/DataNucleus storage layer which stores values that can have multiple primitive types in a varchar field. Some of them are numeric, and I need to compare (</>/...) them with numeric constants. How does one achieve that? I was trying to use e.g. (java.lang.)Long.parse on the field or value (e.g. java.lang.Long.parseLong(field) > java.lang.Long.parseLong(string_param)), supplying a parameter of type long against string field, etc. but it doesn't work. In fact, I very rarely get any errors, for various combinations it would return all values or no values for no easily discernible reasons.
Is there documentation for this?
Clarification: the field is of string type (actually a string collection from which I do a get). For some subset of values they may store ints, e.g. "3" string, and I need to do e.g. value >= 2 filters.
I tried using casts, though not extensively; they do produce errors. Let me investigate some more.
JDO has a well-documented set of methods that are valid for use with JDOQL. DataNucleus JDO adds some additional ones on top of these and allows users to add support for others, as per
http://www.datanucleus.org/products/accessplatform_3_3/jdo/jdoql.html#methods
You can also use JDOQL casts (described on the same page as that link).
