How to return multiple values from an unconnected lookup? (flat file)

In my mapping I am using flat files as source and target, and I have to use an unconnected lookup. Can somebody tell me how to return multiple values from an unconnected lookup, especially when the source and target are flat files?
I know how to return multiple values when using relational tables: in that case we just concatenate the values, return them as a single value, and then split them apart again.
Please help me.

If the unconnected lookup is on a relational table: in the lookup override you can concatenate two or more ports and return that concatenated port to an Expression transformation.
In the Expression transformation, extract the individual values.

For a flat file, I think you can replace the first delimiter with some other delimiter (say &) in the source file. Using & as the delimiter, you can create the lookup so that the remaining fields, still separated by the original delimiter, arrive as one concatenated field. Retrieving that concatenated return field gives you multiple return values for the match, which you then split apart, as sketched below.
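For illustration, here is a minimal sketch of the split step in the Expression transformation, written as port = expression pairs. The lookup name and port names are hypothetical, and it assumes the original delimiter inside the returned field was a comma; SUBSTR and INSTR are standard Informatica expression functions:

```
-- v_RET (variable port): the concatenated field returned by the
-- unconnected lookup; the values inside are still comma-separated.
v_RET    = :LKP.LKP_FLAT_FILE(IN_KEY)

-- Output ports: carve the individual values out of v_RET.
OUT_COL1 = SUBSTR(v_RET, 1, INSTR(v_RET, ',', 1, 1) - 1)
OUT_COL2 = SUBSTR(v_RET, INSTR(v_RET, ',', 1, 1) + 1,
                  INSTR(v_RET, ',', 1, 2) - INSTR(v_RET, ',', 1, 1) - 1)
OUT_COL3 = SUBSTR(v_RET, INSTR(v_RET, ',', 1, 2) + 1)
```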

Related

Mapping Data Flows - Working with Dynamic Business Keys

I am building a parameterised Mapping Data Flow pipeline and have run into a problem that I need help with.
My ADF load is based on a config file, a sample of which is given below:
I would like the ability to join using the Stagekeys column in my config file via the EXISTS transformation shown below.
Any suggestions on how I can achieve this?
Kind Regards
If my understanding is right, you can parameterize the key columns and build the Exists expression from them.
For a single key the condition is source1#keyColumn1 == source2#keyColumn1; it extends to multiple keys as "source1#keyColumn1 == source2#keyColumn1 && source1#keyColumn2 == source2#keyColumn2".
For multiple keys from the same target table you can use the following expression and pass the key columns in as an array:
array(byNames($pKeyColumns,'sourceADLSCSV')) == array(byNames($pKeyColumns,'targetASQL'))
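To make this concrete, here is a minimal sketch of how the pieces might fit together. The stream names sourceADLSCSV and targetASQL come from the expression above; the parameter name, its value, and the declaration syntax are assumptions:

```
--Pipeline Parameter (value passed in by the Execute Data Flow activity)
KeyColumns = ['CustomerId', 'OrderId']

--Dataflow Parameter (declared on the data flow, type: array of strings)
pKeyColumns as string[]

--Exists Expression (set on the Exists transformation)
array(byNames($pKeyColumns,'sourceADLSCSV')) == array(byNames($pKeyColumns,'targetASQL'))
```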

SQLite C API equivalent to typeof(col)

I want to detect column data types of any SELECT query in SQLite.
In the C API there is const char *sqlite3_column_decltype(sqlite3_stmt*, int) for this purpose, but it only works for columns of a real table. Expressions, such as LOWER('ABC'), or columns from queries like PRAGMA foreign_key_list("mytable"), always return NULL here.
I know there is also typeof(col), but I don't have control over the SQL that gets executed, so I need a way to extract the data types from the prepared statement.
You're looking for sqlite3_column_type():
The sqlite3_column_type() routine returns the datatype code for the initial data type of the result column. The returned value is one of SQLITE_INTEGER, SQLITE_FLOAT, SQLITE_TEXT, SQLITE_BLOB, or SQLITE_NULL. The return value of sqlite3_column_type() can be used to decide which of the first six interfaces should be used to extract the column value.
And remember that in SQLite, type is for the most part associated with a value, not a column: different rows can store different types in the same column. That also means sqlite3_column_type() is only meaningful after sqlite3_step() has returned a row.
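A minimal sketch in C, assuming an already-open database handle; it prints the runtime type of each column in the first row of an arbitrary query (error handling trimmed):

```c
#include <stdio.h>
#include <sqlite3.h>

/* Print the runtime type of every column in the first result row. */
static void print_column_types(sqlite3 *db, const char *sql)
{
    sqlite3_stmt *stmt;
    if (sqlite3_prepare_v2(db, sql, -1, &stmt, NULL) != SQLITE_OK)
        return;

    if (sqlite3_step(stmt) == SQLITE_ROW) {
        int n = sqlite3_column_count(stmt);
        for (int i = 0; i < n; i++) {
            /* Type of this row's value, not of the column as such. */
            switch (sqlite3_column_type(stmt, i)) {
            case SQLITE_INTEGER: puts("INTEGER"); break;
            case SQLITE_FLOAT:   puts("FLOAT");   break;
            case SQLITE_TEXT:    puts("TEXT");    break;
            case SQLITE_BLOB:    puts("BLOB");    break;
            case SQLITE_NULL:    puts("NULL");    break;
            }
        }
    }
    sqlite3_finalize(stmt);
}
```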

filter pushdown using spark-sql on map type column in parquet

I am trying to store my data in a nested way in Parquet, using a map-type column to store complex objects as values.
Could somebody let me know whether filter pushdown works on map-type columns? For example, below is my SQL query:
`select measureMap['CR01'].tenorMap['1M'] from RiskFactor where businessDate='2016-03-14' and bookId='FI-UK'`
measureMap is a map with String keys whose values are a custom data type containing two attributes: a String, and another map of String/Double pairs.
I want to know whether pushdown will work on the map or not, i.e. if the map has 10 key-value pairs, will Spark bring the whole map's data into memory and build the object model, or will it filter the data by key at the I/O read level?
I also want to know whether there is any way to specify the key in the where clause, something like where measureMap.key = 'CR01'?
The short answer is no: Parquet predicate pushdown doesn't work on MapType columns or on the nested Parquet structure.
Spark's Catalyst optimizer only understands top-level columns in the Parquet data. It uses the column type, column data range, encoding, etc. to generate the whole-stage code for the query.
When the data is in a MapType column, none of that information is available for the entries inside the map. A map could hold hundreds of key-value pairs, and with the current Spark infrastructure a predicate pushdown on them is impossible. (You can still reference a key in the where clause with the bracket syntax, e.g. where measureMap['CR01'] is not null, but that filter is applied after the map has been read and materialized.) A common workaround is sketched below.
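As a sketch of that workaround (the table, column, and key names follow the query above; the flattened table name is an assumption): materialize the map entries you filter or select on as plain top-level columns when writing the Parquet data, so Parquet statistics and pushdown can serve the query.

```sql
-- One-off flattening step: promote the hot map entries to real columns.
CREATE TABLE RiskFactorFlat
USING parquet
AS SELECT businessDate,
          bookId,
          measureMap['CR01'].tenorMap['1M'] AS cr01_1m
   FROM   RiskFactor;

-- Filters on plain columns can be pushed down; verify by looking for
-- "PushedFilters" in the physical plan.
EXPLAIN
SELECT cr01_1m FROM RiskFactorFlat
WHERE businessDate = '2016-03-14' AND bookId = 'FI-UK';
```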

SINGLEVALUEQUERY and MULTIVALUEQUERY with Pentaho Report Designer

I have multiple data sets that drive a Pentaho report. The data is derived from a handful of stored procedures. I need to access multiple data sources within the report without using subreports, and I believe the best solution is to create open formulas. SINGLEVALUEQUERY, I believe, will only return the first column or row; I need to return multiple columns.
As an example, my stored procedure, named HEADER in Pentaho (CALL Stored_procedure_test(2014, HEADER)), returns 3 values: HEADER_1, HEADER_2, HEADER_3. I'm uncertain of the correct syntax to return all three values in the open formula. Below is what I tried, unsuccessfully.
=MULTIVALUEQUERY("HEADER";?;?)
The second parameter denotes the column that contains the result.
If you don't give a column name there, the reporting engine simply takes the first column of the result. In the case of the MULTIVALUEQUERY function, the values of that column across the result set are aggregated into an array, suitable to be passed into a multi-select parameter or used in an IN clause in a SQL data-factory.
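So, hypothetically, with the column names from the question, one formula per column would do it (assuming the HEADER query returns a single row; use MULTIVALUEQUERY instead where a column can yield several rows):

```
=SINGLEVALUEQUERY("HEADER"; "HEADER_1")
=SINGLEVALUEQUERY("HEADER"; "HEADER_2")
=SINGLEVALUEQUERY("HEADER"; "HEADER_3")
=MULTIVALUEQUERY("HEADER"; "HEADER_1")
```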
For more details see https://www.on-reporting.com/blog/using-queries-in-formulas-in-pentaho/

Riak inserting a list and querying a list

I was wondering if there was an efficient way of handling arrays/lists in Riak. Right now I'm storing the whole array as a string and searching the string to find out whether an element exists in the array.
ID (key) : int[] (value)
Also, how do I write a map/reduce query that gives me all the keys for which the value array contains a given element?
For example, given
1 : 2,3,4
2 : 2,5
how would I write an M/R query that gives me all the keys for which the value contains 2? The result would be 1,2 in this case.
Any help is appreciated
If you are searching for a specific element in the list and are using the LevelDB backend, you could create a secondary index that contains the values of the array. Secondary indexes in Riak may contain multiple values and can be queried for equality, which should allow you to search for single elements in the array without having to resort to MapReduce; a sketch is given below.
If you need to make more complicated queries based on several elements of the list or on other parameters, you could retrieve a subset of records through the secondary index and then process them further on the client side, or perhaps even through a MapReduce job.
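A minimal sketch of the secondary-index approach using the official Python client; the bucket, key, and index names are assumptions, and 2i requires the LevelDB backend:

```python
import riak

client = riak.RiakClient(pb_port=8087)
bucket = client.bucket('arrays')

def store_array(key, elements):
    # Store the list as the object's value and mirror every element
    # into the same secondary index (one index entry per element).
    obj = bucket.new(key, data=elements)
    for e in elements:
        obj.add_index('element_int', e)
    obj.store()

store_array('1', [2, 3, 4])
store_array('2', [2, 5])

# "Which keys contain the element 2?" becomes a plain index lookup.
keys = bucket.get_index('element_int', 2).results
print(keys)  # expected: keys '1' and '2' (order not guaranteed)
```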
