I have a looping node NationalityDet which holds multiple current and former nationalities or citizenships (CurrentNatCit). I need to ensure that all the Country values for a current nationality are mapped to the Nationality node and those for a current citizenship are mapped to the Citizenship node, while all former nationalities and citizenships are mapped to OtherNationality and OtherCitizenship respectively (Citizenship is only allowed one record; it is a single node). Any ideas?
Source sample
<NationalityDet>
<NatCit>
<Type>NATIONALITY/CITIZENSHIP</Type>
<Status>CURRENT/FORMER</Status>
<Country>UK</Country>
</NatCit>
<OtherNatCit>
<Type>NATIONALITY/CITIZENSHIP</Type>
<Status>CURRENT/FORMER</Status>
<Country>UK</Country>
</OtherNatCit>
</NationalityDet>
Destination sample
<Person>
<Person1>
<Nationality>NATIONALITY/CURRENT</Nationality>
<Nationality>NATIONALITY/CURRENT</Nationality>
<Nationality>NATIONALITY/CURRENT</Nationality>
<Citizenship>CITIZENSHIP/CURRENT</Citizenship>
<Citizenship>CITIZENSHIP/CURRENT</Citizenship>
<Citizenship>CITIZENSHIP/CURRENT</Citizenship>
<OtherNationality>
<Nationality>NATIONALITY/FORMER</Nationality>
<Nationality>NATIONALITY/FORMER</Nationality>
<Nationality>NATIONALITY/FORMER</Nationality>
</OtherNationality>
<OtherCitizenship>CITIZENSHIP/FORMER</OtherCitizenship>
</Person1>
</Person>
I have currently used the Looping functoid you mentioned and a number of Equal and Logical AND functoids to allow for this mapping. I am stuck on counting the nodes from two different parent nodes where Type=CITIZENSHIP and Status=FORMER for OtherCitizenship. Any thoughts?
It is rather unclear from your question and sample as to exactly what you want mapped where.
But the pattern you will probably need is as below. Add a looping functoid that goes to both the singleton and repeating node. Add an Iteration functoid that goes to an equals functoid and a greater than functoid both with a second fixed value of 1, and map respectively to the singleton and the repeating node. Map the source field to both fields.
Update after question changed.
So let's say you have the following XML:
<NationalityDet>
<NatCit>
<Type>NATIONALITY</Type>
<Status>CURRENT</Status>
<Country>UK</Country>
</NatCit>
<NatCit>
<Type>CITIZENSHIP</Type>
<Status>CURRENT</Status>
<Country>Netherlands</Country>
</NatCit>
<NatCit>
<Type>NATIONALITY</Type>
<Status>FORMER</Status>
<Country>Brazil</Country>
</NatCit>
<NatCit>
<Type>CITIZENSHIP</Type>
<Status>FORMER</Status>
<Country>USA</Country>
</NatCit>
<OtherNatCit>
<Type>NATIONALITY</Type>
<Status>CURRENT</Status>
<Country>Australia</Country>
</OtherNatCit>
<OtherNatCit>
<Type>CITIZENSHIP</Type>
<Status>CURRENT</Status>
<Country>New Zealand</Country>
</OtherNatCit>
<OtherNatCit>
<Type>NATIONALITY</Type>
<Status>FORMER</Status>
<Country>Argentina</Country>
</OtherNatCit>
<OtherNatCit>
<Type>CITIZENSHIP</Type>
<Status>FORMER</Status>
<Country>Germany</Country>
</OtherNatCit>
</NationalityDet>
Then your map will look like this.
I will explain the highlighted shapes, the rest follow the same pattern. From top to bottom, left to right.
A Looping functoid linked to both NatCit and OtherNatCit and linked to Nationality.
An Equal functoid linked to NatCit\Type and the value NATIONALITY.
An Equal functoid linked to NatCit\Status and the value CURRENT.
An Equal functoid linked to OtherNatCit\Type and the value NATIONALITY.
An Equal functoid linked to OtherNatCit\Status and the value CURRENT.
A Logical AND functoid linked to the two Equal functoids of NatCit.
A Logical AND functoid linked to the two Equal functoids of OtherNatCit.
A Value Mapping functoid linked to the AND from NatCit and NatCit\Country, going to Person1\Nationality.
A Value Mapping functoid linked to the AND from OtherNatCit and OtherNatCit\Country, going to Person1\Nationality.
I then copied the first group and changed the NATIONALITY to CITIZENSHIP and linked to the same input fields but putting the outputs of the value mapping to Citizenship.
I then copied the first group and changed the CURRENT to FORMER and linked to the same input fields but putting the outputs of the value mapping to OtherNationality\Nationality.
I then copied the second group (which has CITIZENSHIP) and changed the CURRENT to FORMER and linked to the same input fields but putting the outputs of the value mapping to OtherCitizenship.
Below is the output.
<Person>
<Person1>
<Nationality>UK</Nationality>
<Nationality>Australia</Nationality>
<Citizenship>Netherlands</Citizenship>
<Citizenship>New Zealand</Citizenship>
<OtherNationality>
<Nationality>Brazil</Nationality>
<Nationality>Argentina</Nationality>
</OtherNationality>
<OtherCitizenship>USA</OtherCitizenship>
<OtherCitizenship>Germany</OtherCitizenship>
</Person1>
</Person>
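Outside BizTalk, the routing that the four functoid groups perform can be sketched in a few lines of Python. This is only an illustration of the logic, not how the map is implemented: the `ROUTES` table and the `route_countries` function are names made up for this example, and the destination paths are written as plain strings.

```python
import xml.etree.ElementTree as ET

# Same data as the sample source document above, written compactly.
SOURCE = """<NationalityDet>
  <NatCit><Type>NATIONALITY</Type><Status>CURRENT</Status><Country>UK</Country></NatCit>
  <NatCit><Type>CITIZENSHIP</Type><Status>CURRENT</Status><Country>Netherlands</Country></NatCit>
  <NatCit><Type>NATIONALITY</Type><Status>FORMER</Status><Country>Brazil</Country></NatCit>
  <NatCit><Type>CITIZENSHIP</Type><Status>FORMER</Status><Country>USA</Country></NatCit>
  <OtherNatCit><Type>NATIONALITY</Type><Status>CURRENT</Status><Country>Australia</Country></OtherNatCit>
  <OtherNatCit><Type>CITIZENSHIP</Type><Status>CURRENT</Status><Country>New Zealand</Country></OtherNatCit>
  <OtherNatCit><Type>NATIONALITY</Type><Status>FORMER</Status><Country>Argentina</Country></OtherNatCit>
  <OtherNatCit><Type>CITIZENSHIP</Type><Status>FORMER</Status><Country>Germany</Country></OtherNatCit>
</NationalityDet>"""

# Hypothetical routing table: (Type, Status) -> destination under Person1.
# Each entry plays the role of one Equal + Equal + AND + Value Mapping group.
ROUTES = {
    ("NATIONALITY", "CURRENT"): "Nationality",
    ("CITIZENSHIP", "CURRENT"): "Citizenship",
    ("NATIONALITY", "FORMER"): "OtherNationality/Nationality",
    ("CITIZENSHIP", "FORMER"): "OtherCitizenship",
}


def route_countries(xml_text):
    """Collect Country values per destination, in document order."""
    root = ET.fromstring(xml_text)
    routed = {destination: [] for destination in ROUTES.values()}
    # Visit NatCit and OtherNatCit alike, as the Looping functoid does.
    for entry in root:
        if entry.tag not in ("NatCit", "OtherNatCit"):
            continue
        key = (entry.findtext("Type", "").strip(),
               entry.findtext("Status", "").strip())
        destination = ROUTES.get(key)
        if destination is not None:
            routed[destination].append(entry.findtext("Country", "").strip())
    return routed


if __name__ == "__main__":
    for destination, countries in route_countries(SOURCE).items():
        print(destination, countries)
```

Keeping the conditions in one table makes it easy to see that every Type/Status pair has exactly one destination, which is the property the copied functoid groups need to preserve.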
Related
I need to create a Flat file schema out of a .csv file having repeated lines:
#Constant
#Date: 1.1.1999
Type1;xxx;yyy;zzz;aaa;bbb
Type2;xxx;yyy;zzz;aaa;bbb
Type3;xxx;yyy;zzz;aaa;bbb
0;123;222;333;444
1;1;22;333;2;22
1;2;33;22;2;22
1;;;33;3;33
2;100;22;1;222;11;22
0;23;22;33;44
1;2;11;22;11;22
1;22;11;22;22;33
0;23;22;55;66
1;22;11;22;66;77
As you can see the rows of type 0,1 and 2 are repeating.
I tried to create a flat file schema treating the lines from #Constant to Type3 as field elements and the 0, 1 and 2 rows as repeating records with their respective tag identifiers. But since these rows repeat, I get an error when validating a schema instance.
You can create the schema using the flat file schema wizard and some manual modification.
Start with the wizard.
First create the schema for the repeating part: select the first block of lines 0,1,1,1,2, leave the delimiter empty (remove the default value) and set element type to "Repeating record". The default name will be Root_Child1.
Parse it into child nodes with CRLF as delimiter. Set the element type of line 0, the first line 1 and line 2 to "Repeating record" and set it to "Ignore" for the second and third line 1. You will end up with three child records (Root_Child1_Child1, Root_Child1_Child2 and Root_Child1_Child5).
Continue parsing these child records into fields using the semicolon as delimiter and setting tag identifiers to 0, 1 and 2 respectively. Finally, on the record node representing line 2 (Root_Child1_Child5) modify Min Occurs to 0.
Now manually add a sibling record node before Root_Child1 to represent the constant block. Right click it and select "Define Record from Flat File Instance". Select the top five lines, leave the delimiter empty and set the element type to Record. Continue by parsing the record into 5 child records with CRLF as delimiter. You can then parse those child records into field nodes with a semicolon delimiter if you wish.
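To check what the finished schema should produce, it can help to group the sample by hand first. The following is a rough Python sketch of the intended structure, not of the BizTalk flat file parser itself; the function name `parse_flat_file` and the keys `line0`, `line1` and `line2` are made up for this illustration, and the assumption that a line 0 always opens a new block is taken from the sample.

```python
SAMPLE = """#Constant
#Date: 1.1.1999
Type1;xxx;yyy;zzz;aaa;bbb
Type2;xxx;yyy;zzz;aaa;bbb
Type3;xxx;yyy;zzz;aaa;bbb
0;123;222;333;444
1;1;22;333;2;22
1;2;33;22;2;22
1;;;33;3;33
2;100;22;1;222;11;22
0;23;22;33;44
1;2;11;22;11;22
1;22;11;22;22;33
0;23;22;55;66
1;22;11;22;66;77
"""


def parse_flat_file(text, header_lines=5):
    """Split the constant block from the repeating 0/1/2 blocks."""
    lines = [line for line in text.splitlines() if line]
    header = lines[:header_lines]
    blocks = []
    for line in lines[header_lines:]:
        # The first field is the tag identifier, the rest are the fields.
        tag, *fields = line.split(";")
        if tag == "0":
            # A line 0 starts a new repeating block.
            blocks.append({"line0": fields, "line1": [], "line2": []})
        elif tag in ("1", "2"):
            if not blocks:
                raise ValueError("line %s found before any line 0" % tag)
            blocks[-1]["line" + tag].append(fields)
        else:
            raise ValueError("unexpected tag identifier: %r" % tag)
    return header, blocks


if __name__ == "__main__":
    header, blocks = parse_flat_file(SAMPLE)
    print(len(header), "header lines,", len(blocks), "blocks")
```

In the sample only the first block has a line 2, which is why Min Occurs on that record has to be 0.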
I am currently working on a small Talend job, which imports CSV data, gets the address field and sends the address to Google Maps API for geocoding. Afterwards, I need to combine both the input and geocoding data.
My problem is that combining the initial data row with the geocoding result does not seem possible: after passing through tRestClient, all references to the input data seem to be gone.
Here's my non-final data flow:
Subjob 1: CSVInput --> THashMapOutput
|
|
Subjob 2: THashInput --> tRestClient --> tExtractJSONFields --> tMap --> tBufferOutput
| (Lookup)
|
tHashInput
|
|
Subjob 3: tBufferInput --> tFileOutputDelimited
Here, the last tMap does not have a foreign key, that is, a reference to the input row. The join therefore creates the cross product of all combinations of input and geocoded rows.
Is there a way to combine both input and geocoding results? Can we configure tRestClient to forward inputs as well?
(a combination of two resulting csv files seems to fail for the same missing identifier)
Ok, answer was quite easy:
Assume you have the first link in subjob 2 called row2.
Then you can open the second tMap component.
Remove the lookup shown above.
Add the references to row2 within tMap, e.g. row2.URL, row2.Name.
Et voilà: each output row now combines the geocoded result with the original data.
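The idea behind this fix, independent of Talend, is to keep the input row in scope while its geocoding result is produced, instead of joining the two streams afterwards without a key. A minimal Python sketch of that idea follows; the column names `Name` and `Address`, the function `geocode_rows` and the stub `fake_geocode` (which returns made-up numbers and calls no real API) are all assumptions for illustration.

```python
import csv
import io

CSV_TEXT = "Name,Address\nAlice,1 Main Street\nBob,22 Side Road West\n"


def fake_geocode(address):
    """Hypothetical stand-in for the REST call; the values are not real coordinates."""
    return {"lat": float(len(address)), "lng": float(address.count(" "))}


def geocode_rows(csv_text, geocode=fake_geocode):
    """Return one combined row per input row (no join, so no cross product)."""
    combined = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        result = geocode(row["Address"])
        # The original row is still in scope here, just as row2 is in tMap.
        combined.append({**row, **result})
    return combined


if __name__ == "__main__":
    for row in geocode_rows(CSV_TEXT):
        print(row)
```

Because each result is merged with the row that produced it, the output has exactly as many rows as the input.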
This is about XQuery; I am using MarkLogic as the database.
I have data as in the following example:
<instrument name="myTest1" id="test1">
<daten>
<daily>
<day date="2016-02-05">
<screener>
<column name="i1">
<value>1</value>
<bg>red</bg>
</column>
<column name="i2">
<value>1</value>
<fg>lime</fg>
</column>
<column name="i4">
<fg>black</fg>
</column>
</screener>
</day>
</daily>
</daten>
</instrument>
I have many instruments, and each one has an entry for each day in the daily element. Inside screener there can be many columns, all with different names, and some screeners include more columns than others. Each column can include a value element, a bg element and a fg element.
I want to search for instruments that fulfill specific criteria about which columns have children with specific values. Example: I want a sequence of all instruments that, for a given day, have a value of 1 for column i1 and a fg of black for column i2.
Since I have many different conditions of this kind, I would rather not hardcode them in XQuery where clauses. I did that for a few and it works, but the code gets heavily duplicated and is hard to maintain.
My question is: is it possible to build a where clause in a FLWOR statement programmatically, that is, based on another XML structure, which could look like this:
<searchpatterns>
<pattern name="test1">
<c>
<name>i1</name>
<element>value</element>
<value>1</value>
</c>
<c>
<name>i2</name>
<element>fg</element>
<value>red</value>
<modifier>not</modifier>
</c>
</pattern>
</searchpatterns>
which would find those instruments, where the screener has a column i1 which itself has a value of 1, and also it must not have column i2 with a fg of red.
When I do it the normal way I query my data like this:
for $res in doc()/instrument
where $res/daten/daily/day[@date="2016-02-05"]/screener/column[@name="i1"]/value/text()="1"
and $res/daten/daily/day[@date="2016-02-05"]/screener/column[@name="i2"]/fg/text()!="red"
return $res
This kind of where clause I want to generate based on an XML structure.
I did some research of the MarkLogic inbuilt cts:search function and a lot of stuff around it but it seems to be for something else (more user interactive searching)
If you have a hint to point me in the right direction, or can tell me whether what I want is even possible, I would very much appreciate it. Thanks!
The doc()/instrument XPath asks for every document with an instrument element and then filters those documents.
Where possible, it's usually better in MarkLogic to model the documents so you can use the indexes to retrieve as few documents as possible. It's also usually better to use cts:search() instead of XPath to generate the sequence so you are working directly with the indexes.
In this case, you might consider using the values of the name attribute as elements instead of the generic "column." You could then generate a cts:element-query that matches the name containing a cts:element-value-query that matches the value within the name.
Hoping that helps,
Yes, this can be achieved programmatically. If you want to check whether an element satisfies a test for every item in a sequence, the every ... satisfies construct comes to mind. So in this case it could be:
for $res in doc()/instrument
where every $pattern in $searchpatterns/pattern/c satisfies (
let $equal := $res/daten/daily/day[@date="2016-02-05"]/screener/column[@name = $pattern/name]/*[name() = $pattern/element] = $pattern/value
return if ($pattern/modifier = "not") then not($equal) else $equal
)
return $res
So every $pattern will be checked. I assume the modifier element is supposed to modify the equality test. So we first check whether the element satisfies the equal condition and then we check whether the modifier element is equal to not. Of course, the same idea could be used to implement other modifiers as well.
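For readers who want to try the logic without a MarkLogic instance, here is a rough Python analogue of the every ... satisfies query using the standard library. It is a sketch, not a translation of MarkLogic behaviour: the wrapping `<instruments>` element, the second instrument and the function names `matches` and `matching_ids` are assumptions added for the example, and building the path with string formatting would break on names that contain quotes.

```python
import xml.etree.ElementTree as ET

INSTRUMENTS = """<instruments>
  <instrument name="myTest1" id="test1">
    <daten><daily><day date="2016-02-05"><screener>
      <column name="i1"><value>1</value><bg>red</bg></column>
      <column name="i2"><value>1</value><fg>lime</fg></column>
    </screener></day></daily></daten>
  </instrument>
  <instrument name="myTest2" id="test2">
    <daten><daily><day date="2016-02-05"><screener>
      <column name="i1"><value>1</value></column>
      <column name="i2"><fg>red</fg></column>
    </screener></day></daily></daten>
  </instrument>
</instruments>"""

PATTERN = """<pattern name="test1">
  <c><name>i1</name><element>value</element><value>1</value></c>
  <c><name>i2</name><element>fg</element><value>red</value><modifier>not</modifier></c>
</pattern>"""


def matches(instrument, pattern, date):
    """True if every <c> condition of the pattern holds for the instrument."""
    def condition_holds(c):
        path = "daten/daily/day[@date='%s']/screener/column[@name='%s']/%s" % (
            date, c.findtext("name"), c.findtext("element"))
        # Like the XQuery '=' comparison: true if any selected node equals the value.
        equal = any(node.text == c.findtext("value")
                    for node in instrument.findall(path))
        return not equal if c.findtext("modifier") == "not" else equal
    return all(condition_holds(c) for c in pattern.findall("c"))


def matching_ids(instruments_xml, pattern_xml, date):
    """Return the id of each instrument that satisfies the whole pattern."""
    root = ET.fromstring(instruments_xml)
    pattern = ET.fromstring(pattern_xml)
    return [inst.get("id") for inst in root.findall("instrument")
            if matches(inst, pattern, date)]


if __name__ == "__main__":
    print(matching_ids(INSTRUMENTS, PATTERN, "2016-02-05"))
```

Here `all(...)` plays the role of every ... satisfies, and the modifier check mirrors the if/then/else in the return clause.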
I have a pipe delimited .txt Flat File that I'm using to do bulk insert to SQL. Everything works well for straight one to one. However, the Flat File now contains 2 new fields that can repeat an unknown number of times.
Is there a way to create a single flat file schema where I can have an unbounded child within the main unbounded child? I think the place I'm getting tripped up is how to make the ChildRoot listed below just a "group heading" like Root is where ChildRoot doesn't correspond to a location in the flat file. How do I insert something like that?
Schema:
-Roots
--Root (unbounded)
---ChildID
---ChildName
Roots gets a direct link to my sql stored procedure to do a bulk insert on as many "Root" rows that come in.
Now I have:
Schema:
-Roots
--Root (unbounded)
---Child
---ChildName
---ChildRoot (unbounded)
----ChildRootID
----ChildRootName
EDIT: I should also add that ChildRootID and ChildRootName can repeat an indefinite number of times until the row delimiter (carriage return) is found.
I have a flat file with some repeating sections in it, and I'm confused how to create the schema via the BT flat file mapping wizard. The file looks like this:
001,bunch of data
002,bunch of data
006,bunch of data
006A,bunch of data
006B,bunch of data
006B,bunch of data
006,bunch of data
006A,bunch of data
006B,bunch of data
As you can see, the 006* records can repeat. I'm going to want to wind up with XML that looks like this:
<001Stuff>...</001Stuff>
<002Stuff>...</002Stuff>
<006Loop>
<006Stuff>...</006Stuff>
<006AStuff>...</006AStuff>
<006BStuff>...</006BStuff>
<006BStuff>...</006BStuff>
</006Loop>
<006Loop>
<006Stuff>...</006Stuff>
<006AStuff>...</006AStuff>
<006BStuff>...</006BStuff>
</006Loop>
Obviously I can't just set the first group of 006* records to "Repeating record" and Ignore the second set. I'm used to dealing with single repeating rows via the wizard (i.e. another 006 row right after the first one) and not nested things like this - any suggestions on how to proceed? Thanks!
Working with the Flat File Schema Wizard is quite hard and there is only so much it can help you with. I always seem to have to tweak its output a little bit.
To make things a little easier, I suggest you restrict your sample document to a single occurrence of the whole <006> structure. That way you will not have to set many lines to Ignore in the Flat File Schema Wizard:
001,bunch of data
002,bunch of data
006,bunch of data
006A,bunch of data
006B,bunch of data
006B,bunch of data
Next, each repeating structure should be wrapped inside a corresponding Repeating Record in the definition of your Xml Schema.
Please, note that you can always run the Flat File Schema Wizard recursively on nested structures to have more fine-grained control. So I would suggest, first, to run the wizard with an all-encompassing repeating <006> structure, like so :
Then, you can right click on the structure, and provide a more detailed definition of nested child structures, only highlighting a subset of the sample contents, like so:
Then, the most important part: you need to tweak the Child Order property to Conditional Default for both repeating structures, because there is only one empty line at the end of your document file and the Wizard cannot help you out with this situation.
For reference, your resulting structure should look like so:
With the following settings:
BunchOfStuff (Root) : Delimited, 0x0D 0x0A, Suffix.
_001Stuff : Delimited, delimiter ",", Prefix, Tag Identifier 001.
_002Stuff : Delimited, delimiter ",", Prefix, Tag Identifier 002.
_006Loop : Delimited, 0x0D 0x0A, Conditional Default.
_006Stuff : Delimited, delimiter ",", Prefix, Tag Identifier 006.
_006AStuff : Delimited, delimiter ",", Prefix, Tag Identifier 006A.
_006BLoop : Delimited, 0x0D 0x0A, Conditional Default.
_006BStuff : Delimited, delimiter ",", Prefix, Tag Identifier 006B.
Hope this helps.
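As a quick way to verify the target shape before fighting with the wizard, the grouping that the _006Loop record is meant to express can be sketched in Python. This only mirrors the intended output structure and is not how the BizTalk disassembler works; the function name `group_records` is made up for the example, and it assumes, as in the sample, that a plain 006 line always opens a new loop.

```python
SAMPLE = """001,bunch of data
002,bunch of data
006,bunch of data
006A,bunch of data
006B,bunch of data
006B,bunch of data
006,bunch of data
006A,bunch of data
006B,bunch of data
"""


def group_records(text):
    """Return (header_records, loops); each loop is a list of (tag, data) pairs."""
    header = []
    loops = []
    for line in text.splitlines():
        if not line:
            continue  # skip the trailing empty line
        tag, _, data = line.partition(",")
        if tag == "006":
            # A plain 006 record opens a new 006Loop.
            loops.append([(tag, data)])
        elif tag.startswith("006"):
            if not loops:
                raise ValueError("%s record found before any 006 record" % tag)
            # 006A and 006B records belong to the loop that is currently open.
            loops[-1].append((tag, data))
        else:
            header.append((tag, data))
    return header, loops


if __name__ == "__main__":
    header, loops = group_records(SAMPLE)
    print([tag for tag, _ in header])
    for loop in loops:
        print([tag for tag, _ in loop])
```

Each inner list corresponds to one `<006Loop>` element in the desired XML.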
Treat everything from the start of the first 006 record to the start of the second 006 record as one record. When you define the 006 record, set it up as a repeating record as well. This should create a node for each 006 group, with nodes for each 006 record under it.
That is what I would try.
Here is my output after 2 minutes of work. Except for the node/element names, I think it is what you want. You would still have to create separate elements for each of the fields in your data.
<_x0030_01 xmlns="">001,bunch of data
<_x0030_02 xmlns="">002,bunch of data
<_x0030_06 xmlns="">
<_x0030_06_Child1>bunch of data
<_x0030_06_Child2>
<_x0030_06_Child2_Child1>A,bunch of data
<_x0030_06_Child2>
<_x0030_06_Child2_Child1>B,bunch of data
<_x0030_06_Child2>
<_x0030_06_Child2_Child1>B,bunch of data
<_x0030_06 xmlns="">
<_x0030_06_Child1>bunch of data
<_x0030_06_Child2>
<_x0030_06_Child2_Child1>A,bunch of data
<_x0030_06_Child2>
<_x0030_06_Child2_Child1>B,bunch of data