Mondrian: Cannot seem to get Aggregation Tables to be used - olap

I have been struggling to get aggregation tables to work. Here is what my fact table looks like:
employment_date_id
dimension1_id
dimension2_id
dimension3_id
dimension4
dimension5
measure1
measure2
measure3
I'm collapsing the employment_date_id from year, quarter, and month to include just the year, but every other column is included. This is what my aggregation table looks like:
yearquartermonth_year
dimension1_id
dimension2_id
dimension3_id
dimension4
dimension5
measure1
measure2
measure3
fact_count
I'm only collapsing the year portion of the date. The remaining fields are left as is. Here is my configuration:
<AggFactCount column="FACT_COUNT"/>
<AggForeignKey factColumn="dimension1_id" aggColumn="dimension1_id"/>
<AggForeignKey factColumn="dimension2_id" aggColumn="dimension2_id"/>
<AggForeignKey factColumn="dimension3_id" aggColumn="dimension3_id"/>
<AggMeasure name="[Measures].[measure1]" column="measure1"/>
<AggMeasure name="[Measures].[measure2]" column="measure2"/>
<AggMeasure name="[Measures].[measure3]" column="measure3"/>
<AggLevel name="[dimension4].[dimension4]" column="dimension4"/>
<AggLevel name="[dimension5].[dimension5]" column="dimension5"/>
<AggLevel name="[EmploymentDate.yearQuarterMonth].[Year]" column="yearquartermonth_year"/>
I'm for the most part copying the 2nd example of aggregation tables from the documentation. Most of my columns are not collapsed into the table and are foreign keys to the dimension tables.
My query I'm trying to execute is something like:
select {[Measures].[measure1]} on COLUMNS, {[EmploymentDate.yearQuarterMonth].[Year]} on ROWS from Cube1
The problem is that when I debug it and turn on the logging I see bit keys that look like this:
AggStar:agg_year_employment
bk=0x00000000000000000000000000000000000000000000000111111111101111100000000000000000000000000000000000000000000000000000000000000000
fbk=0x00000000000000000000000000000000000000000000000000000001101111100000000000000000000000000000000000000000000000000000000000000000
mbk=0x00000000000000000000000000000000000000000000000111111110000000000000000000000000000000000000000000000000000000000000000000000000
And my query's bit pattern is:
Foreign columns bit key=0x00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001
Measure bit key= 0x00000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000000000000000000000000000000
And so my aggregation table is skipped. However, these are the exact columns that are folded into the table. But the bit positions are off between the query's and the aggregation table's. The other thing I find strange is that a portion of the columns is collapsed into the table, but all AggForeignKeys aren't included as bits so if I make a query with those columns this aggregation table will get skipped? That's counter to what I had planned. My plan was as long as you are making a query on year boundaries use this aggregation table.
I don't understand why this isn't working and why it fails to build the bit keys properly. I've tried debugging mondrian code, but figuring out which column maps to which position in the bit keys is not obvious. I feel like this shouldn't be this hard, but everything out there doesn't really explain this very well. And this aggregation table architecture is really to break.
What am I doing wrong? And why doesn't my solution work?
Update Here is my mondrian.properties file:
mondrian.jdbcDrivers=com.mysql.jdbc.Driver,oracle.jdbc.driver.OracleDriver
mondrian.rolap.generate.formatted.sql=true
mondrian.rolap.localePropFile=locale.properties
mondrian.rolap.aggregates.Use=true
mondrian.rolap.aggregates.Read=true
mondrian.trace.level=2
mondrian.drillthrough.enable=true

Might be the case mondrian­.­rolap­.­aggregates­.­Read is set to true and mondrian.rolap.aggregates.Use is set to false.
Please set mondrian.rolap.aggregates.Use=true and check.
Reference: http://mondrian.pentaho.com/documentation/configuration.php
If this is not the case, please attach all the properties related to aggregate tables and the complete cube definition XML.

Related

Can sqlite-utils convert function select two columns?

I'm using sqlite-utils to load a csv into sqlite which will later be served via Datasette. I have two columns, likes and dislikes. I would like to have a third column, quality-score, by adding likes and dislikes together then dividing likes by the total.
The sqlite-utils convert function should be my best bet, but all I see in the documentation is how to select a single column for conversion.
sqlite-utils convert content.db articles headline 'value.upper()'
From the example given, it looks like convert is followed by the db filename, the table name, then the col you want to operate on. Is it possible to simply add another col name or is there a flag for selecting more than one column to operate on? I would be really surprised if this wasn't possible, I just can't find any documentation to support it.
This isn't a perfect answer as it doesn't resolve whether sqlite-utils supports multiple column selection for transforms, but this is how I solved this particular problem.
Since my quality_score column would just be basic math, I was able to make use of sqlite's Generated Columns. I created a file called quality_score.sql that contained:
ALTER TABLE testtable
ADD COLUMN quality_score GENERATED ALWAYS AS (likes /(likes + dislikes));
and then implemented it by:
$ sqlite3 mydb.db < quality_score.sql
You do need to make sure you are using a compatible version of sqlite, as this only works with version 3.31 or later.
Another consideration is to make sure you are performing math on integers or floats and not text.
Also attempted to create the table with the virtual generated column first then fill it with my data later, but that didn't work in my case - it threw an error that said the number of items provided didn't match the number of columns available. So I just stuck with the ALTER operation after the fact.

Is there a way to extract a substring from a cell in OpenOffice Calc?

I have tens of thousands of rows of unstructured data in csv format. I need to extract certain product attributes from a long string of text. Given a set of acceptable attributes, if there is a match, I need it to fill in the cell with the match.
Example data:
"[ROOT];Earrings;Brands;Brands>JeweleryExchange;Earrings>Gender;Earrings>Gemstone;Earrings>Metal;Earrings>Occasion;Earrings>Style;Earrings>Gender>Women's;Earrings>Gemstone>Zircon;Earrings>Metal>White Gold;Earrings>Occasion>Just to say: I Love You;Earrings>Style>Drop/Dangle;Earrings>Style>Fashion;Not Visible;Gifts;Gifts>Price>$500 - $1000;Gifts>Shop>Earrings;Gifts>Occasion;Gifts>Occasion>Christmas;Gifts>Occasion>Just to say: I Love You;Gifts>For>Her"
Look up table of values:
Zircon, Diamond, Pearl, Ruby
Output:
Zircon
I tried using the VLOOKUP() function, but it needs to match an entire cell and works better for translating acronyms. Haven't really found a built in function that accomplishes what I need. The data is totally unstructured, and changes from row to row with no consistency even within variations of the same product. Does anyone have an idea how to do this?? Or how to write an OpenOffice Calc function to accomplish this? Also open to other better methods of doing this if anyone has any experience or ideas in how to approach this...
ok so I figured out how to do this on my own... I created many different columns, each with a keyword I was looking to extract as a header.
Spreadsheet solution for structured data extraction
Then I used this formula to extract the keywords into the correct row beneath the column header. =IF(ISERROR(SEARCH(CF$1,$D769)),"",CF$1) The Search function returns a number value for the position of a search string otherwise it produces an error. I use the iserror function to determine if there is an error condition, and the if statement in such a way that if there is an error, it leaves the cell blank, else it takes the value of the header. Had over 100 columns of specific information to extract, into one final column where I join all the previous cells in the row together for the final list. Worked like a charm. Recommend this approach to anyone who has to do a similar task.

Grouping, missing data - Cognos Report Studio

In IBM Cognos Report Studio
I have a data structure like so, plain dump of the customer details:
Account|Type|Value
123-123| 19 |2000
123-123| 20 |2000
123-123| 21 |3000
If I remove the Type from my report I get:
Account|Value
123-123|2000
123-123|3000
It seems to have treated the two rows with an amount '2000' as some kind of duplicated amount and removed it from my report.
My assumption was that Cognos will aggregate the data automatically?
Account|Value
123-123|8000
I am lost on what it is doing. Any pointers? If it is not grouping it, I would at least expect 3 rows still
Account|Value
123-123|2000
123-123|2000
123-123|3000
In any case I would like to end up with 1 line. The behaviour I'm getting is something I can't figure out. Thanks for any help.
Gemmo
The 'Auto-group & Summarize' feature is the default on new queries. This will find all unique combinations of attributes and roll up all measures to these unique combinations.
There are three ways to disable auto-group & summarize behavior:
Explicitly turn it off at the query level
Include a grain-level unique column, e.g. a key, in the query
Not include any measures in the query
My guess is that your problem is #3. The [Value] column in your example has to have its 'Aggregate Function' set to an aggregate function or 'Automatic' for the auto-group behavior to work. It's possible that column's 'Aggregate Function' property is set to 'None'. This is the standard setting for an attribute value and would prevent the roll up from occurring.

MS Access- Calculated Column for Distinct Count in Table Rather than a Query

I'd like to have a Calculated Column in a table that counts the instances of a concatenation.
I get the following error when inputting Abs(Count([concat])) as the column formula for the calculation: The expression Abs(Count([concat])) cannot be used in a calculated column.
Is there any other way to do it without doing a query? I'm pretty sure it can't be done but I figured I'd ask anyways since I didn't see any other posts about it.
No, and even if there was, you should create and use a query for this.
Besides, applying Abs on a count doesn't make much sense, as the count cannot be negative.

How to create a view that returns a 2x2 (or NxN) matrix of results

So I know enough SQL just to be really dangerous (I don't normally work the back-end) but cannot get the following view to be created successfully ;) The result set I'm after is a data set that has rows assigned as a column alias from multiple tables (instead of a 1xN flat of all columns). There is a many-to-one relationship when looking at the main table, based on foreign keys associated to the row id of the appropriate related table.
Ideally I'd like a data set that looks like this in the return:
dataset.transaction_row[n]: col1, col2, col3, coln... (columns from the transaction table)
dataset.category_row[n]: col1, co2, col3, coln... (columns from the category table)
and so on...
I get the following error:
Query Error: near "AS": syntax error Unable to execute statement
From:
CREATE VIEW view_unreconciled_transactions
AS SELECT account_transaction.* AS transaction_row,
category.* AS category_row,
memorized.name_rule_replace OR account_transaction.name AS payee
FROM account_transaction
LEFT JOIN memorized ON account_transaction.memorized_key = memorized.id
LEFT JOIN category ON account_transaction.category_key = category.id
WHERE status != 2
ORDER BY account_transaction.dt_posted DESC
It seems easy enough since the result-column selector is repeatable which includes expressions (referencing sqlite's syntax diagrams). In reference to the error, I'm assuming it's complaining about the 2nd 'AS' where I'm trying to get table.* assigned as an alias. Any help in the right direction is appreciated. If I had to, I suppose I could explicitly state all columns but that feels like a kludge.
The AS modifier can only be applied to a single column, not to a collection such as the * you used. You will have to break them out into specific names, (which is best practice IMHO anyway)
It looks like you want to make a "pivot table". They can be tricky to make in a database. I can say that if you a data result, where each row comes from a different table source, and the columns form each table are IDENTICAL, then you could try using a UNION statement to join the different results together like they are just one dataset.
NOTE that the columns all take their naming cue from the first dataset in a UNION and the datatype all need to be the same.

Resources