How to perform a YTD aggregation on data arranged in parent-child hierarchies with unary operators?

I am using a parent-child relationship for accounts in the OLAP database icCube. To include financial logic I make use of unary operators. In addition, I have set up several account hierarchies using the many-2-many relationship, and all is working very smoothly, except
when I want to apply time logic to the result, e.g. show the YTD value for April 30, 2014 with:
Aggregate(crossjoin ({[View].[View].[Periodiek]},PeriodsToDate([Tijd].[Kalender].[jaar],[Tijd].[Kalender].currentmember)))
I get the message:
Aggregate() : the aggregation 'unary-operator' is not supported (measure or calculated measure/member:[Measures].[bedrag])
Apparently, this is not the way to do this.
How can one achieve cumulative figures (periods-to-date) in this setting?

The current version of icCube - 4.8.2 - does not support the Aggregate function for measures with Aggregation type 'unary operator'. See Aggregation function doc here.
The Aggregate function is a bit dodgy if you're using many-2-many relations as well as special measure aggregation types. For example :
Aggregate( { [Account1], [Account2] }, [Measures].[Special] )
If [Special] is a measure with the 'Sum' aggregation type and [Account1] and [Account2] have a many-2-many relation, we would count the same 'shared' amount twice (i.e. the same row is counted twice).
Measures with other special aggregation types are simply not supported, to avoid unexpected results. This applies to aggregation types such as Open / Close / Distinct Count.
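To make the double counting concrete, here is a minimal Python sketch (not icCube code). The fact rows, amounts and the account-to-row mapping are invented for illustration; it only shows why adding per-account totals differs from summing the distinct underlying rows.

```python
# Hypothetical fact rows: row id -> amount.
facts = {1: 100.0, 2: 50.0, 3: 25.0}

# Hypothetical many-to-many mapping: each account points at the rows it covers.
# Row 2 is shared by both accounts.
account_rows = {
    "Account1": {1, 2},
    "Account2": {2, 3},
}


def naive_sum(accounts):
    """Add each account's total separately; a shared row is counted once per account."""
    return sum(sum(facts[r] for r in account_rows[a]) for a in accounts)


def distinct_row_sum(accounts):
    """Take the union of the underlying rows first, so each row is counted once."""
    rows = set().union(*(account_rows[a] for a in accounts))
    return sum(facts[r] for r in rows)


print(naive_sum(["Account1", "Account2"]))         # 225.0
print(distinct_row_sum(["Account1", "Account2"]))  # 175.0
```

The 50.0 difference is exactly the shared row, counted twice by the naive version.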
The solution in your case is using the Sum function :
Sum( CompactSet( PeriodsToDate([Tijd].[Kalender].[jaar],[Tijd].[Kalender].currentmember) ) , [View].[View].[Periodiek] )
CompactSet compacts the set, reducing its size when you're using days or hours. It's a performance booster.
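As a rough illustration of the idea behind the Sum expression (evaluate the measure for each period, then add the periods up), here is a small Python sketch. It is not how icCube evaluates the query internally; the account tree, the amounts and the restriction to the '+' and '-' operators are assumptions made for the example.

```python
# Hypothetical parent-child account tree: child -> (parent, unary operator).
tree = {
    "Revenue": ("Result", "+"),
    "Costs": ("Result", "-"),
}

# Hypothetical leaf amounts per month (1 = January).
amounts = {
    "Revenue": {1: 100.0, 2: 120.0, 3: 90.0, 4: 110.0},
    "Costs": {1: 60.0, 2: 70.0, 3: 50.0, 4: 65.0},
}


def rollup(account, month):
    """Value of an account in one month, applying the unary operators of its children."""
    if account in amounts:
        return amounts[account].get(month, 0.0)
    total = 0.0
    for child, (parent, op) in tree.items():
        if parent == account:
            value = rollup(child, month)
            # Only '+' and '-' are handled in this sketch.
            total += value if op == "+" else -value
    return total


def ytd(account, month):
    """Sum the per-month rollups over the periods to date."""
    return sum(rollup(account, m) for m in range(1, month + 1))


print(ytd("Result", 4))  # 175.0
```

The monthly results are 40, 50, 40 and 45, so the April YTD value is 175.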
If you want to handle m2m relations and special measure aggregation types properly, you can use Categories in icCube, see here some doc. In short, they let you dynamically define a member as a set of tuples.
To properly support Aggregate we should add a new function, e.g. icAggregate, that works the way Categories do. The Aggregate function is a bit strange; for the time being we mimic SSAS a bit...

To test Sourav's comment, try changing your measure to the following:
Aggregate(
{[View].[View].[Periodiek]}
* PeriodsToDate([Tijd].[Kalender].[jaar],[Tijd].[Kalender].currentmember)
, [Measures].[MEASURE_NOT_bedrag] //<<replace with actual
)
Do you still get the same error?

Related

Handle a string return from R to Tableau and SPLIT it

I connect Tableau to R and execute an R function for recommending products. When R ends, the return value is a string which will have all products details, like below:
ID|Existing_Prod|Recommended_Prod\nC001|NA|PROD008\nC002|PROD003|NA\nF003|NA|PROD_ABC\nF004|NA|PROD_ABC1\nC005|PROD_ABC2|NA\nC005|PRODABC3|PRODABC4
(Each line separated by \n indicating end of line)
On Tableau, I display the calculated field which is as below:
ID|Existing_Prod|Recommended_Prod
C001|NA|PROD008
C002|PROD003|NA
F003|NA|PROD_ABC
F004|NA|PROD_ABC1
C005|PROD_ABC2|NA
C005|PRODABC3|PRODABC4
The data above reaches Tableau through a calculated field as a single string. I need to split it into three columns, using the pipe character ('|') as the separator.
I used Split function on the calculated field :
SPLIT([R_Calculated_Field],'|',1)
SPLIT([R_Calculated_Field],'|',2)
SPLIT([R_Calculated_Field],'|',3)
But the error says "SPLIT function cannot be applied on Table calculations", which is self-explanatory. Are there any alternatives to solve this? I googled for best practices for handling integration between R and Tableau, and all I could find was simple k-means clustering code.
Make sure you understand how partitioning and addressing work for table calcs. Table calcs pass vectors of arguments to the R script, and receive a single vector in response. The cardinality of those vectors depends on the partitioning of the table calc. You can view that by editing the table calc, clicking specific dimensions. The fields that are not checked determine the partitioning, and thus the cardinality of the arguments you send to and receive from R.
This means it might be tricky to map your problem onto this infrastructure. Not necessarily impossible. It was designed to send a series of vector arguments with one cell per partitioning dimension, say, Manufacturer and get back one vector with one result per Manufacturer (or whatever combination of fields partition your data for the table calc). Sounds like you are expecting an arbitrary length list of recommendations. It shouldn’t be too hard to have your R script turn the string into a vector before returning, but the size of the vector has to make sense.
As an example of an approach that fits this model more easily, say you had a Tableau view that had one row per Product (and you had N products) - and some other aggregated measure fields in the view per Product. (In Tableau speak, the view’s level of detail is at the Product level.)
It would be straightforward to pass those measures as a series of argument vectors to R - each vector having N values, and then have R return a vector of reals of length N where the value returned at each location was a recommender score for the product at that position. (Which is why the ordering aka addressing of the vectors also matters)
Then you could filter out low scoring products from the view and visually distinguish highly recommended products.
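The fixed-length vector model described above can be sketched in Python (standing in for the R side). The product names, the measures and the scoring rule are all hypothetical; the point is only that N values go in per argument and N values come back, in the same order.

```python
from typing import List, Sequence


def score_products(sales: Sequence[float], margin: Sequence[float]) -> List[float]:
    """Return one score per input position; output length equals input length."""
    if len(sales) != len(margin):
        raise ValueError("argument vectors must have the same length")
    # Hypothetical scoring rule, used only to have something to return.
    return [s * m for s, m in zip(sales, margin)]


# One entry per product in the view, in the same order (the addressing).
products = ["PROD001", "PROD002", "PROD003"]
sales = [10.0, 4.0, 7.0]
margin = [0.5, 0.25, 0.1]

scores = score_products(sales, margin)

# Filter out low-scoring products, as one would with a filter in the view.
recommended = [p for p, sc in zip(products, scores) if sc >= 1.0]
print(recommended)  # ['PROD001', 'PROD002']
```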
So the first step to understanding R integration is to understand how table calcs operate with partitioning and addressing and to think in terms of vectors of fixed lengths passed in both directions.
If this model doesn’t support your use case well, you might be able to do something useful with URL actions or the JavaScript API.
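Returning to the pipe-delimited string from the question: splitting it into one vector per column before returning is simple in most languages. Below is a minimal Python sketch of that step, using the first two data rows of the sample string; in practice the equivalent would be done inside the R script, and each returned vector would still need a length that matches the table calc's partition.

```python
raw = ("ID|Existing_Prod|Recommended_Prod\n"
       "C001|NA|PROD008\n"
       "C002|PROD003|NA")

lines = raw.split("\n")
header = lines[0].split("|")
rows = [line.split("|") for line in lines[1:]]

# One vector per column, all the same length (one entry per data row).
columns = {name: [row[i] for row in rows] for i, name in enumerate(header)}

print(columns["Recommended_Prod"])  # ['PROD008', 'NA']
```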

MT4: How to plot existing data under the price chart as an indicator?

I have some COT data that I want to plot under the main price window as an indicator. The COT data are external data, i.e. independent of the prices. So one can not write it like a traditional indicator calculated from the prices. Since I have all the data needed, I don't need to do any calculation. I only need to convert the date and time so that it aligns with the price chart. I will figure out how to do it later. Now, if we ignore the alignment, what I want to ask is how could I plot the data under the price chart? Thanks!
Alternative A:
Use the MT4-GUI tools and plot the data programmatically right into the MT4.Graph or using the screen-layout-plane of GUI-objects, independent of the underlying live-[TimeDOMAIN,PriceDOMAIN]-graphing, both using Expert Advisor-type of MQL4-code. We use this approach most often for all the tasks, that would normally land as a Custom Indicator-type of MQL4-code, as the New-MQL4.56789 code-execution engine has reduced the achievable performance for all ( yes, ALL ) Custom Indicator code-units' execution into a single, thus both RealTime-sensitive and potentially blocking, thread.
Using this alternative, you retain the full freedom of the code-design and may benefit a lot from pre-computing & pre-setting the GUI-objects inside OnInit(){...} section, before entering the trading-loop. This also minimises the latency costs associated with a need to update the GUI-scene from inside an OnTick(){...} event-loop.
Alternative B:
One may also opt to do a similar job using an independent Script-type of MQL4-code unit, as the COT data are weekly announced and thus static per-se.
Launching a Script is a step that can happen whenever feasible, and this implementation model may also enjoy some ex-post modification tools that could be run from another Expert Advisor or another Script MQL4-code, for the sake of some ex-post live-GUI-scene modification/maintenance.
Alternative C:
If one indeed insists to do so, the GUI-composition might be assembled inside a rather special-purpose live-calculated Custom Indicator-type of MQL4-code.
This approach, however, has to carefully deploy the GUI-composition inside the Custom Indicator OnInit(){...} section and avoid any risk of blocking the flow of execution inside the above-mentioned critical section of OnCalculate(){...}.
Buffer-mapped, register-based Custom Indicator data & graphing tools are rather rigid for more advanced purposes, that do not strictly follow the hard-wired logic of a code, responding just to a stream of MarketEvent-s, which may, but need not, happen at once, but is being arranged by a sort of mini-batches, so as to process the whole depth of the DataStore in a segmented ( thus less-blocking ) processing approach.
Building the GUI-scene inside the OnInit() section of the Custom Indicator, one may still benefit from distributed processing, if external data source is to be read and/or any similar type inter-platform communications ( be it for a messaging or a signalling purpose ).
My choice would be [A].
Mapping { Date, Time } onto an MQL4 datetime is trivial: since its beginning, MQL4 has represented datetime as an integer number of seconds elapsed since 1970-01-01 00:00:00. So simple, so easy.
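To illustrate the seconds-since-1970 representation outside MQL4, here is a small Python sketch. The 'YYYY.MM.DD' and 'HH:MM' input formats are an assumption, since the layout of the COT file is not given.

```python
import calendar
from datetime import datetime


def to_epoch_seconds(date_text: str, time_text: str) -> int:
    """Convert 'YYYY.MM.DD' and 'HH:MM' strings to seconds since 1970-01-01 00:00:00."""
    parsed = datetime.strptime(f"{date_text} {time_text}", "%Y.%m.%d %H:%M")
    # timegm treats the fields as UTC, so no local timezone offset is applied.
    return calendar.timegm(parsed.timetuple())


print(to_epoch_seconds("1970.01.01", "00:00"))  # 0
print(to_epoch_seconds("2014.04.30", "00:00"))  # 1398816000
```

Whether the file's timestamps and the broker's server time share a timezone still has to be checked separately.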
declare the indicator buffer:
double ExtBufferCOT[];
assign indexes of buffers
SetIndexStyle( 0, DRAW_LINE );
SetIndexBuffer( 0, ExtBufferCOT );
in the OnCalculate() function, first make sure it is time to check the levels again ( you probably do not need to update them every tick; maybe once a day or once a week ). Then read your file ( there is no example of the file, so it makes no sense to describe how to do that here ) and convert its elements using StrToTime() and StrToDouble()
the last step - get last N lines from your file, and map them to the indicator buffers:
double value;
datetime time; // - your values from file are here
int shift = iBarShift( _Symbol, 0, time );
ExtBufferCOT[shift] = value; /* you probably need to fill the buffer
of the following candles too
if your chart timeframe
is smaller than the frequency
of observations in the file
*/
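The "fill the following candles too" remark can be sketched as a forward fill. The Python below is an illustration of that logic, not MQL4; the bar times and the two observations are invented, and bars are ordered newest first to mirror how MT4 series arrays are indexed.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence, Tuple


def fill_buffer(bar_times: Sequence[int],
                observations: Sequence[Tuple[int, float]]) -> List[Optional[float]]:
    """Give each bar the latest observation at or before its open time.

    bar_times is ordered newest first (index 0 is the current bar).
    observations are (time, value) pairs sorted by time, oldest first.
    """
    obs_times = [t for t, _ in observations]
    buffer: List[Optional[float]] = []
    for bar_time in bar_times:
        pos = bisect_right(obs_times, bar_time)
        # pos == 0 means the bar is older than the first observation.
        buffer.append(observations[pos - 1][1] if pos > 0 else None)
    return buffer


# Hypothetical data: five daily bars (newest first) and two weekly observations.
DAY = 86400
bars = [10 * DAY, 9 * DAY, 8 * DAY, 7 * DAY, 6 * DAY]
cot = [(0 * DAY, 1.5), (7 * DAY, 2.5)]

print(fill_buffer(bars, cot))  # [2.5, 2.5, 2.5, 2.5, 1.5]
```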

How to pass each row as an argument to R script from Tableau calculated field

I am trying to do sentiment analysis on a table that I have.
I want each row of string data to be passed to the R script, but the problem is that Tableau is accepting only aggregate data as params for:
SCRIPT_STR(
'output <- .arg1; output', [comments]
)
This gives me an error message:
# All fields must be aggregate or constant.
From the Tableau and R Integration documentation:
Given that the SCRIPT_*() functions work as table calculations, they
require aggregate measures or Tableau parameters to work properly.
Aggregate measures include MIN(), MAX(), ATTR(), SUM(), MEDIAN(), and
any table calculations or R measures. If you want to use a specific
non-aggregated dimension, it needs to be wrapped in an aggregate
function.
In your case you could do:
SCRIPT_STR(
'output <- .arg1; output', ATTR([comments])
)
ATTR() is a special Tableau aggregate that does the following:
IF MIN([Dimension]) = MAX([Dimension]) THEN
[Dimension] ELSE * (a special version of Null) END
It’s really useful when building visualizations and you’re not sure of the level of detail of the data and what’s being sent.
Note: It can be significantly slower than MIN() or MAX() in large data sets, so once you get confident your results are accurate then you can switch to one of the other functions for performance.
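The MIN/MAX comparison behind ATTR() can be mimicked in a few lines of Python. This is only a sketch of the logic quoted above: the function name is made up, and returning None for an empty partition is an assumption rather than documented Tableau behaviour.

```python
from typing import Optional, Sequence


def attr(values: Sequence[str]) -> Optional[str]:
    """Return the single shared value, or '*' when the values differ."""
    if not values:
        return None  # assumption: nothing to report for an empty partition
    return values[0] if min(values) == max(values) else "*"


print(attr(["great product", "great product"]))  # great product
print(attr(["great product", "too slow"]))       # *
```

A '*' result is the hint that the view's level of detail is too coarse to pass one comment per row.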
Try MIN([comments]) and make sure you have appropriate dimensions on your viz to partition the data fine enough to get a single comment for each combination of dimensions.

How do I retrieve aggregate measures from R when I need to pass disaggregated data in Tableau?

I have extensively read and re-read the Troubleshooting R Connections and Tableau and R Integration help documents, but as a new Tableau user they just aren't helping me.
I need to be able to calculate Kaplan-Meier survival probabilities across any dimensions that are dragged onto the sheet. Ideally, I would be able to retrieve this in a tabular format at multiple time points, but for now, I would be happy just to get it at a single time point.
My data in Tableau have columns for [event-boolean] and [time to event]. Let's say I also have columns for Gender and District.
Currently, I have a calculated field [surv] as:
SCRIPT_REAL('
library(survival);
fit <- summary(survfit(Surv(.arg2,.arg1) ~ 1), times=365);
fit$surv'
, min([event-boolean])
, min([time to event])
)
I have messed with Computed Using, Addressing, Partitions, Aggregate Measures, and parameters to the R function, but no combination I have tried has worked.
If [District] is in Columns, do I need to change my SCRIPT_REAL call or do I just need to change some other combination of levers?
I used Andrew's solution to solve this problem. Essentially,
- Turn off Aggregate Measures
- In the Measure Values shelf, select Compute Using > Cell
- In the calculated field, wrap the call as IF FIRST() == 0 THEN SCRIPT_*() END
- Ctrl+drag the measure to the Filters shelf and use a Special > Non-null filter.
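The shape of this workaround (pass every row, compute one value per partition, keep it only in the first cell, then filter out the nulls) can be sketched in Python. The Kaplan-Meier function below is a plain textbook product-limit estimate written for illustration, not the survival package's implementation, and the four subjects are invented.

```python
from typing import List, Optional, Sequence


def km_survival(times: Sequence[float], events: Sequence[bool], at: float) -> float:
    """Kaplan-Meier estimate of S(at) from one row per subject."""
    surv = 1.0
    # Distinct times, up to `at`, at which at least one event occurred.
    event_times = sorted({t for t, e in zip(times, events) if e and t <= at})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        died = sum(1 for u, e in zip(times, events) if e and u == t)
        surv *= 1.0 - died / at_risk
    return surv


def first_cell_only(value: float, n_rows: int) -> List[Optional[float]]:
    """Mimic IF FIRST() == 0 THEN ... END: value in the first cell, None elsewhere."""
    return [value if i == 0 else None for i in range(n_rows)]


# Hypothetical disaggregated rows for one partition (e.g. one District).
times = [100, 200, 300, 400]
events = [True, False, True, True]

cells = first_cell_only(km_survival(times, events, at=365), len(times))
visible = [c for c in cells if c is not None]  # the non-null filter
print(visible)  # [0.375]
```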

Transform for graphite counter

I'm using the incr function from the python statsd client. The key I'm sending for the name is registered in graphite but it shows up as a flat line on the graph. What filters or transforms do I need to apply to get the rate of the increments over time? I've tried an apply function > transform > integral and an apply function > special > aggregate by sum but no success yet.
The function you are looking for is "summarize"; see it over here: http://graphite.readthedocs.org/en/latest/functions.html
To get the totals over time, use the summarize function with alignToFrom set to true.
For example, you can use the following for a 1-day period:
summarize(stats_counts.your.metrics.path,"1d","sum",true)
See "graphite summarize datapoints" for more details.
The data is there; it just needs hundreds of counts before you can start to see it on the graph. Taking the integral also works and shows the cumulative number of hits over time, though I have had to multiply it by 100 to get approximately the correct value.
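For readers unsure what the two transforms do to the datapoints, here is a rough Python sketch of bucketed sums and a running total. It is not Graphite's implementation; the datapoints are invented, and aligning buckets to the first timestamp is a simplification of what alignToFrom does with the query's from time.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[int, float]  # (timestamp in seconds, value)


def summarize_sum(points: Sequence[Point], interval: int) -> Dict[int, float]:
    """Sum datapoints into buckets of `interval` seconds, aligned to the first timestamp."""
    if not points:
        return {}
    start = points[0][0]
    buckets: Dict[int, float] = {}
    for ts, value in points:
        bucket_start = start + ((ts - start) // interval) * interval
        buckets[bucket_start] = buckets.get(bucket_start, 0.0) + value
    return buckets


def integral(points: Sequence[Point]) -> List[float]:
    """Running total of the values, i.e. cumulative hits over time."""
    totals: List[float] = []
    running = 0.0
    for _, value in points:
        running += value
        totals.append(running)
    return totals


# Hypothetical counter increments per 10-second flush.
points = [(0, 1.0), (10, 0.0), (20, 2.0), (60, 3.0), (70, 1.0)]

print(summarize_sum(points, 60))  # {0: 3.0, 60: 4.0}
print(integral(points))           # [1.0, 1.0, 3.0, 6.0, 7.0]
```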
