IcCube + Ragged hierarchies - iccube
Does IcCube support ragged hierarchies from a flat source? It's usually implemented with a 'HideMemberIf' property on a hierarchy level (you can set it to hide if member is the same as the parent)
The Microsoft version (which I'm currently using): https://msdn.microsoft.com/en-AU/library/ms365406.aspx
--edit 1---
Here's an example of a 3-level hierarchy in the 9-level schema. The benefit is that hardcoded attribute and level identifiers mean simpler MDX calculations, no matter what kind of hierarchy is loaded into it.
PortfolioKey,PortfolioLevel1Key,PortfolioLevel1Name,PortfolioLevel1Label,PortfolioLevel2Key,PortfolioLevel2Name,PortfolioLevel2Label,PortfolioLevel3Key,PortfolioLevel3Name,PortfolioLevel3Label,PortfolioLevel4Key,PortfolioLevel4Name,PortfolioLevel4Label,PortfolioLevel5Key,PortfolioLevel5Name,PortfolioLevel5Label,PortfolioLevel6Key,PortfolioLevel6Name,PortfolioLevel6Label,PortfolioLevel7Key,PortfolioLevel7Name,PortfolioLevel7Label,PortfolioLevel8Key,PortfolioLevel8Name,PortfolioLevel8Label,PortfolioLevel9Key,PortfolioLevel9Name,PortfolioLevel9Label
7,100001,Non-Perishable,Category,200001,Condiment,Sub Category,300007,Pepper,Product,400007,Pepper,Product,500007,Pepper,Product,600007,Pepper,Product,700007,Pepper,Product,800007,Pepper,Product,900007,Pepper,Product
8,100001,Non-Perishable,Category,200001,Condiment,Sub Category,300008,Salt,Product,400008,Salt,Product,500008,Salt,Product,600008,Salt,Product,700008,Salt,Product,800008,Salt,Product,900008,Salt,Product
5,100001,Non-Perishable,Category,200002,Soup,Sub Category,300005,Chicken,Product,400005,Chicken,Product,500005,Chicken,Product,600005,Chicken,Product,700005,Chicken,Product,800005,Chicken,Product,900005,Chicken,Product
6,100001,Non-Perishable,Category,200002,Soup,Sub Category,300006,Vegetable,Product,400006,Vegetable,Product,500006,Vegetable,Product,600006,Vegetable,Product,700006,Vegetable,Product,800006,Vegetable,Product,900006,Vegetable,Product
1,100002,Perishable,Category,200003,Dairy,Sub Category,300001,Cheese,Product,400001,Cheese,Product,500001,Cheese,Product,600001,Cheese,Product,700001,Cheese,Product,800001,Cheese,Product,900001,Cheese,Product
2,100002,Perishable,Category,200003,Dairy,Sub Category,300002,Milk,Product,400002,Milk,Product,500002,Milk,Product,600002,Milk,Product,700002,Milk,Product,800002,Milk,Product,900002,Milk,Product
3,100002,Perishable,Category,200004,Fruit,Sub Category,300003,Apple,Product,400003,Apple,Product,500003,Apple,Product,600003,Apple,Product,700003,Apple,Product,800003,Apple,Product,900003,Apple,Product
4,100002,Perishable,Category,200004,Fruit,Sub Category,300004,Orange,Product,400004,Orange,Product,500004,Orange,Product,600004,Orange,Product,700004,Orange,Product,800004,Orange,Product,900004,Orange,Product
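The padding convention in the sample rows (the leaf member repeated down to level 9, then hidden when it matches its parent) can be illustrated with a minimal Python sketch. The function names and the `(key, name)` tuple shape are assumptions made for illustration; this is neither icCube nor SSAS code, it only mimics the "hide if the member has the same name as its parent" idea.

```python
from typing import List, Tuple

Member = Tuple[int, str]  # (key, name): hypothetical shape for one level member


def pad_to_depth(path: List[Member], depth: int = 9) -> List[Member]:
    """Repeat the leaf member until the path fills every level of the schema."""
    if not path:
        raise ValueError("path must contain at least one member")
    if len(path) > depth:
        raise ValueError("path is deeper than the schema")
    return path + [path[-1]] * (depth - len(path))


def visible_members(padded: List[Member]) -> List[Member]:
    """Drop every member whose name equals its parent's name."""
    visible = [padded[0]]
    for parent, child in zip(padded, padded[1:]):
        if child[1] != parent[1]:
            visible.append(child)
    return visible


# The Pepper row from the sample: three real levels padded to nine.
pepper_path = [(100001, "Non-Perishable"), (200001, "Condiment"), (300007, "Pepper")]
pepper_padded = pad_to_depth(pepper_path)
pepper_visible = visible_members(pepper_padded)
```

In this sketch the padded members all reuse the level-3 key, whereas the sample rows generate a distinct key per level (400007, 500007, ...); only the name comparison matters for hiding.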
It should be straightforward. For the sake of the example, let's take a Geo dimension in which Monaco is a city without a country:
Europe
    Monaco (no country)
    Spain
        Madrid
    France
        Paris
Our data source can be defined like this (each column is a level) :
Continent,Country,City
Europe,Spain,Madrid
Europe,France,Paris
Europe,,Monaco
Note the country for Monaco is just empty.
You can now create your multi-level hierarchy and see the ragged version, in which Monaco's parent is Europe.
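The effect of the empty Country cell can be sketched in Python: each row is walked level by level, and an empty level is skipped so the next non-empty member attaches to the last real ancestor. This is only an illustration of the resulting shape, not how icCube builds the hierarchy internally; `build_ragged_tree` is a hypothetical helper.

```python
import csv
import io

SAMPLE = """Continent,Country,City
Europe,Spain,Madrid
Europe,France,Paris
Europe,,Monaco
"""


def build_ragged_tree(csv_text):
    """Return a nested dict of members; empty level values are skipped."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row (level names)
    tree = {}
    for row in reader:
        node = tree
        for value in row:
            if not value.strip():
                continue  # no member at this level: stay on the current parent
            node = node.setdefault(value, {})
    return tree


geo_tree = build_ragged_tree(SAMPLE)
```

With the sample rows, Monaco ends up as a direct child of Europe, next to Spain and France.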
Related
How to source statements with a Neo4j graph database?
I'm looking to use Neo4j for some research, but first I have to check whether it can do what I want. I would like to build a graph that says:

    StatementID1 = Cannabidiol hasPositiveEffectOn ChronicPain
    StatementID1 isSupportedBy Study1
    StatementID1 isSupportedBy Study2
    StatementID1 isNotSupportedBy Study3

It is easy to add key:value properties to a relationship in Neo4j. The difficulty is that I want Study1, Study2 and Study3 to be nodes, so that they can have their own sets of properties. This can be done in a triplestore where each triple has an ID, like "Statement1" here: it is a matter of adding triples whose subject is another triple's ID.

    url:TripleID1 = url:Cannabidiol url:hasPositiveEffectOn url:ChronicPain
    url:TripleID2 = url:TripleID1 url:isSupportedBy url:Study1
    url:TripleID3 = url:TripleID1 url:isSupportedBy url:Study2
    url:TripleID4 = url:TripleID1 url:isNotSupportedBy url:Study3

For the moment I can't find how to do this simply in Neo4j. I could add the DOI of the study as a property (Study 1: DOI:123/123) and then add the same DOI to the link (isSupportedBy: DOI:123/123). Since the DOI is unique, it would be possible to run some searches. However, this will make things much more complex. Is there a simpler method to achieve that?
I suppose this is a database design issue. Would a node/relationship model something like the following fit your data and make your queries easy?
Neo4j doesn’t support edges going from an edge to a node. Edges are always between nodes. So you’ll have to convert your positiveEffect edge to a node (as proposed in rickhg12hs’s answer) or model the positiveEffect as a non-edge (as you yourself proposed).
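The "turn the statement into a node" idea can be sketched with a tiny in-memory graph in Python. The `Graph` class, the labels and the SUBJECT/OBJECT relationship names are assumptions made for illustration; they are not Neo4j APIs. The point is only that every edge still connects two nodes once the statement is itself a node.

```python
class Graph:
    """Minimal property graph: edges may only connect existing nodes."""

    def __init__(self):
        self.nodes = {}  # node id -> {"label": ..., plus properties}
        self.edges = []  # (source id, relationship type, target id)

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, **props}

    def add_edge(self, source, rel_type, target):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be existing nodes")
        self.edges.append((source, rel_type, target))

    def targets(self, source, rel_type):
        return [t for s, r, t in self.edges if s == source and r == rel_type]


def build_example():
    g = Graph()
    g.add_node("Cannabidiol", "Substance")
    g.add_node("ChronicPain", "Condition")
    # The statement is reified: it is a node, not an edge.
    g.add_node("Statement1", "Statement", predicate="hasPositiveEffectOn")
    g.add_node("Study1", "Study", doi="123/123")
    g.add_node("Study2", "Study")
    g.add_node("Study3", "Study")
    g.add_edge("Statement1", "SUBJECT", "Cannabidiol")
    g.add_edge("Statement1", "OBJECT", "ChronicPain")
    g.add_edge("Statement1", "isSupportedBy", "Study1")
    g.add_edge("Statement1", "isSupportedBy", "Study2")
    g.add_edge("Statement1", "isNotSupportedBy", "Study3")
    return g


g = build_example()
supporting = g.targets("Statement1", "isSupportedBy")
```

Because the studies are ordinary nodes, they can carry their own properties (the DOI here), and no DOI needs to be duplicated on a relationship.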
Is there a way to represent the changes over time in a graph of the knowledge base in grakn?
Suppose you have a set of facts in your graph database / knowledge base (as in Grakn) that represents the current state of a graph, written here in plain text:

    version 1 (jan/2016): "Rachel is a person who is an English teacher for a class of 10 students at University ABC".
    change 1 (mar/2016), which generates version 2: "Alice replaces Rachel".
    version 2 (mar/2016): "Alice is a person who is an English teacher for a class of 10 students at University ABC".

I know that I could represent the versions inside the graph and replicate everything (minus the change) from version 1 into a new set of data (nodes and edges) for version 2. But I am wondering whether there is a best practice (or some mechanism of the engine) for representing these changes over time, such as versioning of the data set, or something similar that would apply the change to a new data set but keep a history so that you can recompose the previous state of the graph.
The only thing close to that is that Grakn supports attaching attributes to relationships. For example:

    insert $x (spouse: $p1, spouse: $p2) isa marriage; $x has date "01/10/2010"

You can also attach attributes to attributes. So if you defined an attribute type, for example Version, you could attach it to all your relationships. So while Grakn cannot represent change over time out of the box, you can work around that to some degree, depending on your use case.
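One possible reading of this workaround, sketched in plain Python rather than Graql: each relationship carries validity attributes, a change closes the current relationship and adds a new one, and a past state is recomposed by filtering on a date. The `Teaching` class, the `valid_from`/`valid_to` attribute names and the day-of-month values are assumptions; the answer itself only suggests a generic Version attribute.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Teaching:
    """A teacher-course relationship with validity attributes attached."""
    teacher: str
    course: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the relationship is current


def replace_teacher(history: List[Teaching], new_teacher: str, course: str, when: date) -> None:
    """Close the current relationship for the course and append its successor."""
    for rel in history:
        if rel.course == course and rel.valid_to is None:
            rel.valid_to = when
    history.append(Teaching(new_teacher, course, when))


def teacher_at(history: List[Teaching], course: str, when: date) -> Optional[str]:
    """Return who taught the course on the given date, if anyone."""
    for rel in history:
        still_open = rel.valid_to is None or when < rel.valid_to
        if rel.course == course and rel.valid_from <= when and still_open:
            return rel.teacher
    return None


history = [Teaching("Rachel", "English", date(2016, 1, 1))]
replace_teacher(history, "Alice", "English", date(2016, 3, 1))
```

Nothing is deleted, so the January state can still be read back after the March change.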
R - Employee Reporting Structure
Background: I am using R along with some packages to pull JSON data from a ticketing system. I'm pulling all the users and want to build a reporting structure.

I have a data set that contains employees and their managers, with columns named "Employee" and "Manager". I am trying to build a tree of the reporting structure that goes up to the root. We are an IT organization, but I am pulling all employee data, so this would look something like:

Company -> Business Unit -> Executive -> Director -> Group Manager -> Manager -> Employee

That's the basic idea. In some areas the tree is small; in others it has multiple levels. Basically, I am trying to get a tree, or reporting structure, that I can reference to determine who an employee's director is. The director could be 1 level removed or up to 5 or 6 levels removed.

I came across data.tree, but as far as I can tell, I have to provide a pathString that defines the structure. Since I only have the two columns, what I'd like to do is throw this data frame into a function and have it traverse the list: when it finds an employee, put them under their manager; when it finds that manager as an employee, nest them under their own manager, along with anything nested under them.

I haven't been able to figure out how to make data.tree do this without defining the pathString, and I can only build the pathString from what I know for each row: the employee and their manager. The result is a tree that only has 2 levels, in which directors aren't connected to their Group Managers, Group Managers aren't connected to their managers, and so forth.

I thought about writing some logic/loops to go through and do this, but there must be an easier way or a package that I can use. Maybe I am not defining the pathString correctly.

Ultimately, what I'd like the end result to be is a data frame with columns that look like:

Employee, Manager1, Manager2, Manager3, ManagerX, ...

Of course some rows will only have entries in columns 1 and 2, but others could go up many levels. Once I have this, I can look up devices in our configuration management system, find the owner and aggregate those counts under the appropriate director.

Any help would be appreciated. I cannot post the data, as it is confidential in nature, but it simply contains the employees and their managers. I just need to connect all the dots. Thanks!
The data.tree package has the FromDataFrameNetwork function for just this scenario:

    library(data.tree)
    DataForTree <- data.frame(manager = c("CEO","sally","sally","sue","mary", "mary"),
                              employee = c("sally","sue","paul","mary","greg", "don"),
                              stringsAsFactors = FALSE)
    tree <- FromDataFrameNetwork(DataForTree)
    print(tree)

Results in:

    1 CEO
    2  °--sally
    3      ¦--sue
    4      ¦   °--mary
    5      ¦       ¦--greg
    6      ¦       °--don
    7      °--paul
The hR package is specifically designed to address the needs of data analysis using people/employee data, although it is minimal at this point. The hierarchy function can produce a wide data frame, as you would like; this helps with joining in other data and continuing an analysis.

    library(hR)
    ee = c("Dale#hR.com","Bob#hR.com","Julie#hR.com","Andrea#hR.com")
    supv = c("Julie#hR.com","Julie#hR.com","Andrea#hR.com","Susan#hR.com")
    hierarchy(ee,supv,format="wide")

           Employee        Supv1         Supv2        Supv3
    1   Dale#hR.com Susan#hR.com Andrea#hR.com Julie#hR.com
    2    Bob#hR.com Susan#hR.com Andrea#hR.com Julie#hR.com
    3  Julie#hR.com Susan#hR.com Andrea#hR.com         <NA>
    4 Andrea#hR.com Susan#hR.com          <NA>         <NA>
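For readers who want to see the underlying logic rather than a package call, here is a minimal Python sketch that climbs from each employee to the root and emits one wide row per employee. The function names are hypothetical, and unlike the hR output above, Manager1 here is the direct manager and the last filled column is the top of the chain. The sample pairs reuse the names from the data.tree answer.

```python
def manager_chains(pairs):
    """pairs: iterable of (employee, manager). Returns {employee: [manager1, manager2, ...]}."""
    manager_of = dict(pairs)
    chains = {}
    for employee in manager_of:
        chain = []
        seen = {employee}  # guards against cycles in dirty data
        current = manager_of.get(employee)
        while current is not None and current not in seen:
            chain.append(current)
            seen.add(current)
            current = manager_of.get(current)
        chains[employee] = chain
    return chains


def to_wide_rows(chains):
    """Pad every chain with None so all rows have the same number of columns."""
    width = max((len(chain) for chain in chains.values()), default=0)
    return [[emp] + chain + [None] * (width - len(chain)) for emp, chain in chains.items()]


pairs = [
    ("sally", "CEO"), ("sue", "sally"), ("paul", "sally"),
    ("mary", "sue"), ("greg", "mary"), ("don", "mary"),
]
chains = manager_chains(pairs)
rows = to_wide_rows(chains)
```

Finding an employee's director then becomes a lookup in the row for that employee, at whichever column holds the director level in your organization.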
In what cases should a DICOM object be built with an SQ (sequence) VR attribute?
All, forgive me, I am a newbie to DICOM and am just learning the DICOM standard right now. I have just learned that there is a value representation named SQ (Sequence of Items) in the DICOM standard. Basically, it can be used to describe a DICOM object as a tree. I am curious: in what particular cases should we use this kind of structure to build a DICOM object? Thanks.
A DICOM sequence is a type of nested structure used to define a complex tag; it consists of a set of datasets, as in a structured report. The image above illustrates this.

Currently I'm working with ultrasound images, and I use a DICOM sequence to specify a region of an image. For example, region 'A' has a specific tag, (0018,6011) Sequence of Ultrasound Regions, and this region has nested tags such as:

    (0018,6018) Region Location Min x0
    (0018,601A) Region Location Min y0
    (0018,601C) Region Location Max x1
    (0018,601E) Region Location Max y1
    (0018,6024) Physical Units X Direction
    (0018,6026) Physical Units Y Direction

These tags are used for one instance of a region; region 'B', 'C' or any other may have the same tags. For a better illustration, see the image above.

For more information, this link (http://dicom.nema.org/dicom/2013/output/chtml/part05/sect_7.5.html) covers the part of the standard that deals with nested structures, and this link (http://dicom.nema.org/medical/dicom/2014c/output/chtml/part03/sect_C.8.5.5.html) covers their specific use for ultrasound images, with an example. Good luck in your DICOM studies!
IMHO, one relevant thing is missing in the excellent answer by Gabriel: it is not the implementor's choice when to use a sequence to encode data in DICOM. DICOM datasets are structured in modules, which consist of attributes. So there is a list of attributes allowed for a particular type of DICOM object (such as ultrasound image, CT image, ...). Each attribute has a "type" (in DICOM terms: Value Representation, VR), such as string, number, person name or sequence, and the items allowed in a sequence are also well defined. So the answer to "when to use a sequence" is: when the DICOM standard requires you to.

References:
DICOM Part 3 - which attributes are required/allowed for which type of DICOM object
DICOM Part 6 - which attributes are encoded with which value representation
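The nesting described in these answers can be pictured with plain Python containers: the sequence attribute holds a list of items, and each item is itself a small dataset keyed by tag. This is a sketch of the structure only; it does not use a DICOM library, the helper names are hypothetical, and the coordinate and unit values are made-up sample data.

```python
SEQUENCE_OF_ULTRASOUND_REGIONS = "(0018,6011)"
REGION_LOCATION_MIN_X0 = "(0018,6018)"
REGION_LOCATION_MIN_Y0 = "(0018,601A)"
REGION_LOCATION_MAX_X1 = "(0018,601C)"
REGION_LOCATION_MAX_Y1 = "(0018,601E)"
PHYSICAL_UNITS_X_DIRECTION = "(0018,6024)"
PHYSICAL_UNITS_Y_DIRECTION = "(0018,6026)"


def make_region(x0, y0, x1, y1, units_x, units_y):
    """Build one sequence item: a nested dataset describing a single region."""
    return {
        REGION_LOCATION_MIN_X0: x0,
        REGION_LOCATION_MIN_Y0: y0,
        REGION_LOCATION_MAX_X1: x1,
        REGION_LOCATION_MAX_Y1: y1,
        PHYSICAL_UNITS_X_DIRECTION: units_x,
        PHYSICAL_UNITS_Y_DIRECTION: units_y,
    }


def region_bounds(dataset):
    """Return (x0, y0, x1, y1) for every item in the regions sequence."""
    items = dataset.get(SEQUENCE_OF_ULTRASOUND_REGIONS, [])
    return [
        (
            item[REGION_LOCATION_MIN_X0],
            item[REGION_LOCATION_MIN_Y0],
            item[REGION_LOCATION_MAX_X1],
            item[REGION_LOCATION_MAX_Y1],
        )
        for item in items
    ]


# Two regions ('A' and 'B') share the same tags inside one sequence attribute.
dataset = {
    SEQUENCE_OF_ULTRASOUND_REGIONS: [
        make_region(0, 0, 319, 239, 3, 3),
        make_region(320, 0, 639, 239, 3, 3),
    ]
}
bounds = region_bounds(dataset)
```

Which attributes are sequences, and which tags may appear inside their items, is fixed by the standard for each object type, as the second answer points out.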
Graph Database - How to deal with multilingual data
I'm trying to design a multilingual graph database, but I'm struggling to find an optimal model. My current proposal is to create two node types: Movie and MovieTranslation.

Movie holds all relationships, such as likes, related, ratings and comments. MovieTranslation contains all translatable data (title, plot, genres). A Movie node does not contain these kinds of properties, only the original_title. Movie and MovieTranslation are tied together by a translation relationship.

When I query nodes, I would check whether they have a translation relationship for the queried locale (en_US, for example). If so, I would merge the translation with the main node as the result.

I think this way might not be the best, but I can't think of a better one. Do you have a better suggestion for the database model? It would be much appreciated. I'm using Neo4j, if you need this information. Thanks, Vinicius.
I suggest moving the original title to its own node as well; call it MovieTitle. "Complicating" your model in this way should actually "simplify" (or at least standardise) your queries, because you're always looking in one place for film titles (also for indexing and searching).

You're assuming that films only have one original title, which isn't the case. A Korea-Japan co-production will have at least two original titles. Whole genres of Japanese cinema were released with different original Japanese titles in cinemas and on VHS. Distinct from the idea of an original title is that of specific language titles. The same film released in different Chinese-speaking countries will have different Chinese-language titles that are deemed more marketable to the specific local audiences.

To get the original title:

    MATCH (c:Country)<-[:HAS_NATIONALITY]-(m:Movie)-[:HAS_TITLE]->(t:MovieTitle)-[:HAS_NATIONALITY]->(c:Country) WHERE m.id = 1 RETURN COLLECT([t.title, c.country_code])

To get the original title in China:

    MATCH (m:Movie)-[:HAS_TITLE]->(t:MovieTitle)-[:HAS_NATIONALITY]->(c:Country) WHERE c.country_code = "CN" RETURN m, COLLECT([t.title, c.country_code])

To get all language titles:

    MATCH (m:Movie)-[:HAS_TITLE]->(t:MovieTitle)-[:HAS_NATIONALITY]->(c:Country)-[:HAS_LANGUAGE]->(l:Language) RETURN m, COLLECT([t.title, l.language_code])

To get all Chinese-language titles:

    MATCH (m:Movie)-[:HAS_TITLE]->(t:MovieTitle)-[:HAS_NATIONALITY]->(c:Country)-[:HAS_LANGUAGE]->(l:Language) WHERE l.language_code = "zh" RETURN m, COLLECT([t.title, c.name])

I would separate plot and genre into their own nodes. There is an argument that different national cinemas have unique genres, but if westerns and samurai dramas are both sub-genres of period dramas, then you want to find them both on a period drama search.

I would still have the idea of Translation nodes, but don't confuse them with the domain you're modelling. Translation should be domain-ignorant and, for simple words/phrases like "romantic comedy", should almost be a third-party graph plug-in.

Get the French-language genre titles of a specific film:

    MATCH (m:Movie)-[:HAS_GENRE*]->(g:Genre)-[:HAS_TRANSLATION]->(t:Translation)-[:HAS_LANGUAGE]->(l:Language) WHERE m.id = 100 AND l.language_code = "fr" RETURN COLLECT(t.translation)

Get all romantic comedies:

    MATCH (m:Movie)-[:HAS_GENRE*]->(g:Genre)-[:HAS_TRANSLATION]->(t:Translation) WHERE t.translation = "comédie romantique" RETURN m

Unlike movie titles and genres, plots are altogether simpler, because you're modelling the film's story as a blob of text and not as a domain object in itself. Perhaps later you may do textual analysis on the plot texts to find themes, gender bias, etc., and model this in the graph as well.

Get the French-language plot for a specific movie:

    MATCH (m:Movie)-[:HAS_PLOT]->(p:Plot)-[:HAS_LANGUAGE]->(l:Language)-[:HAS_TRANSLATION]->(t:Translation) WHERE m.id = 100 AND t.translation = "French" RETURN p.plot

(Please treat the Cypher queries as pseudo-code. I didn't make a graph and test them.)
I think the model is OK. You can RETURN movie, translation or RETURN {movie:movie, translation:translation}.

Converting nodes to maps and combining these maps is currently not supported; that's something on the roadmap.

How and where would you want to use the nodes? If for rendering, you can just access the two columns or entries. If for graph visualization, you can also combine them into a single node in the JSON source for the viz.
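Since the merge has to happen on the client side in this approach, here is a minimal Python sketch of combining the two returned columns into one map for rendering. The function name, the dictionary shapes and the fallback to original_title are assumptions for illustration, and the titles are placeholder strings rather than real data.

```python
def localized_movie(movie, translations, locale):
    """Overlay the translation for `locale` on the movie's own properties.

    movie: dict of Movie node properties (must not be mutated).
    translations: dict mapping locale codes to MovieTranslation properties.
    """
    merged = dict(movie)  # copy so the original node map stays untouched
    merged.update(translations.get(locale, {}))
    # Assumed fallback: show the original title when no translation exists.
    merged.setdefault("title", movie.get("original_title"))
    return merged


movie = {"id": 1, "original_title": "Original title"}
translations = {"en_US": {"title": "English title", "plot": "English plot"}}

english = localized_movie(movie, translations, "en_US")
french = localized_movie(movie, translations, "fr_FR")
```

The same shape works whether the two inputs come from `RETURN movie, translation` as separate columns or from a single returned map with two entries.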