count distinct of text field in data studio increased when using filter - google-analytics

I want to count the number of sessions with a certain event label in google data studio. I have created a new field in data studio on a google analytics source like this:
COUNT_DISTINCT(CASE WHEN Event Label = "Form Start" THEN Session ID ELSE "" END)
where Session ID is a custom dimension from GA (string).
The problem is that when I for example pull this new metric to a scorecard, I get a value of 6, if I then add a filter on this scorecard with Event Label = "Form Start" (the exact same event label as in the case statement of the new field) the metric is increased to 23! (which is the correct number).
Is there some data truncation going on in data studio behind the scenes or why does using the filter increase the distinct count?

The weird numbers you're seeing could be due to sampling. At the bottom of the report in "view" mode, it should indicate if the numbers are sampled or not.
Also, the Unique Events metric should tell you the number of times a specific event happened per session. You might not need to do all that custom work in data studio, just a filter for the label.

I might be missing something that requires the COUNT_DISTINCT function, but would a simpler, different formula work?
CASE
WHEN Event Label = "Form Start" THEN 1
ELSE 0
END
This would create a number field that that can be used in the metric element of a Scorecard with multiple aggregation options? The key option being SUM :)

I had a similar problem I think, where I was trying to tally all the pages with a certain category in the meta:
CASE
WHEN REGEXP_MATCH(idio:industry, '.*Agriculture.*') THEN "Agriculture"
else "Others"
END
In your case, I think you would use this:
CASE
WHEN REGEXP_MATCH(Event Label, '.*Form Start.*') THEN Session ID
else "Others"
END

Related

"Calculated columns cannot contain volatile functions like Today and Me" error message on Sharepoint

I try to add a new calculated column to sharepoint list that will show elapsed day. I enter name and write a formula like;
=ABS(ROUND(Today-Created;0))
The data type returned from this formula is: Single line of text
When I want to save I get an error like
Calculated columns cannot contain volatile functions like Today and
Me.
Calculated Column Values Only Recalculate As Needed
The values in SharePoint columns--even in calculated columns--are stored in SharePoint's underlying SQL Server database.
The calculations in calculated columns are not performed upon page load; rather, they are recalculated only whenever an item is changed (in which case the formula is recalculated just for that specific item), or whenever the column formula is changed (in which case the formula is recalculated for all items).
(As a side note, this is the reason why in SharePoint 2010 you cannot create or change a calculated column on a list that has more than the list view threshold of 5000 items; it would require a mass update of values in all those items, which could impact database performance.)
Thus, in order for calculated columns to accurately store "volatile" values like "Me" and "Today", SharePoint would need to somehow constantly recalculate those column values and continuously update the column values in the database. This simply isn't possible.
Alternatives to Calculated Columns
I suggest taking a different approach entirely instead of using a calculated column for this purpose.
Conditional Formatting: You can apply conditional formatting to highlight records that meet certain criteria. This can be done using SharePoint Designer or HTML/JavaScript.
Filtered List views: Since views of lists are queried and generated in real time, you can use volatile values in list view filters. You can set up a list view web part that only shows items where Created is equal to [Today]. Since you can place multiple list view web parts on one page, you could have one section for today's items, and another web part for all the other items, giving you a visual separation.
A workflow, timer job, or scheduled task: You can use a repeating process to set the value of a normal (non-calculated) column on a daily basis. You need to be careful with this approach to ensure good performance; you wouldn't want it to query for and update every item in the list if the list has surpassed the list view threshold, for example.
I found some conversations about this issue. Many people suggest to creating a new Date Time column, visible is false, default value is Today's Date and it will be named as Today. Then we can use this column in our formulas.
I tried this suggestion and yes error is gone and formula is accepted but calculated columns' values are wrong. I setted column Today is visible and checked, it was empty. Default value Today's Date was not working. When I looking for a solution for this issue I deleted column Today carelessly. Then I realized calculated columns' values are right.
Finally; I don't know what is trick but before using Today keyword in your formulas if you create a column named as Today and after your formula saving if you delete Today column, it is working.
UPDATE
After #Thriggle's answer I realized this approach doesn't work like a charm. Yes, formula doesn't cause an error when calculated column saving but it works correctly only first time, in the next day the calculated column shows old values, because its values are static as Thriggle explained.

Working with event stat in google analytics api

I have the website where merchants sell some stuff. Each item has stats like unique views during last 24 hours, a week and a month and a number of visitors that clicked on "show contacts" button. Uniqueness based on IP. This works, there're huge tables that collects all (IP,item_id) pairs during the last month, and there're a lot of updates.
Today I dig into google analytics api, I would like to know if it's possible to use it instead of my system.
The fact is all this stat is private, available only for merchant, so I don't need to have all stat at a time (it's not compared etc.). So it might be requested on demand for the specific item.
I created service account, connected it to analytics data and it seems export works (based on this example). Then enabled event tracking for "show contacts" button.
For example, when user click on "show contacts" where should I add item_id? Is it eventLabel or eventValue? So, for item_id=1234 it should be
ga("send","event","Contacts","show","",1234) or ga("send","event","Contacts","show",1234)?
I'm confused with eventValue column in Top Events report (it seems that eventValue keeps a sum of all eventValues and even caculates Avg.Value). Does it mean item_id should be stored in eventLabel as string?
I added second, nonInteraction event for collecting item views, ga("send","event","Data","show","1234",1,{nonInteraction:true}). It count all events, but I need only unique ones (performed by unique visitors) in specified period of time (day, week, month). Is it possible?
1) The parameters are category, action, label and value. "Value" is a metric and is expected to be an integer. Since it's a metric it will be added up. So if you do
ga("send","event","Contacts","show","",1234)
you will increment a metric by 1234, not store an id. You do not want this (especially if you have a linked adwords account, since this will be used to calculate your "return on advertising spent").
You want to use your item_id as label, however label is a string. So what you need to do is:
ga("send","event","Contacts","show","1234")
i.e. wrap the value for your label in quotes.
2) Is there anything wrong with ga:uniqueEvents for your purposes ?

How to list unique values of a particular field in Kibana

I am having a field named rpc in my elasticsearch database and I am displaying it using Kibana. When I search in search bar of kibana like:
rpc:*
It display all the values of rpc field but I want to have only those value to be displayed which are unique.
I have been playing around with Kibana4 since a couple of weeks now. I find it intuitive and simple and the experience has been great till now. Following your question, I tried getting unique results via a Data Table visualization. Why? Because I personally find it easier to understand. Following are the steps:
1. Get unique count
Create the visualization (Visualize -> Data Table). First lets get
the count of how many unique entries we have for a particular field
(We will use this in the later part for verification). I'm using
clientip.raw but as I see, it will work just fine with any friendly
field name too.
2. Set the aggregation right
Set you aggregation back to count and have a Split Rows as follows. Not doing this will give you count 1 for each field value (since it is looking for unique counts) when you populate the table. Noteworthy part is setting the Top field to 0. Because Kibana won't let you enter anything else than a digit (Obviously!). This was the tricky part. Hit Apply and you'll get the results. Unique field values and the count of each of them.
3. Verification:
Going to the last page of the table, we see there are exactly 543 results. This is how I know it works.
What Next?
You save this visualization and add it to a Dashboard. There you can always check the request, query, response and other stats.
Just an addition to the above mathakoot answer.
For the user of newer version (which do not allow bucket size of 0 anymore) just set a value greater than the maximum number of result
And report the value in the Options>Per Page field
I am using Kibana 6 so the UI looks a bit different than the older answers here.
Here is what worked for me
Create a visualization from your query, I used a line graph type (don't think it matters)
Under Data, set metrics aggregation = "Unique Count" and set field to your field.
Set x-axis aggregation = "Terms" and set field to your field.
Set Size > your number of records
Under Metrics and Axes, disable drawing of the graph, circles, and labels (this really helps the UI not lag)
Run query and then click "Inspect" and download CSV
Data
Metrics & Axes
I wanted to achieve something similar but I'm stuck with Kibana 3.1.
I simply added a panel of type "TERMS" and configured its Field = User-agent and left everything else on default values. This gave me a nice bar chart with one bar for each User-agent.

Filtering data and Display Heading based on filters

being a newbie to SSRS, I am trying to figure out the following:
say for instance I have a dataset which does a :
SELECT [cols...] from [some view]
I want to be able to further filter this based on parameters given from an ASP.NET site (I am using the AJAX control toolkit for the report viewer). There could be x amount of parameters and potentially can be filtered on 1 or more columns.
First question is, how would I filter the dataset and pass along the parameters along with which field the filter should apply to? I may have [col1] and I want to filter it with x values.
Second question Is, I want to be able to group the results per page based upon a column. So for each grouped result set, I want them to be displayed per page (per group per page).
Then on the headers of the page, I want it to display what the page grouping is. How would I do this?
In terms of what have I tried - nothing as I DO NOT KNOW HOW, it is why I am asking the question here to see what the experts (you) can suggest and guide me.
thank you!
To do this you can create Parameters in SSRS, they do not need to be in your query or anything. Then, go to your tablix and click either ROW or column depending on the filter type and set it show/hide visibility. For example I have a report that has personal information, so i have true/false parameter that hides/shows those columns, similar I have one that hides/shows any row with a -1 for the total paid.

In a Drupal Views argument, how can I get the total number of nodes in a nodequeue?

I'm working on a site that is a database of thousands record albums and am attempting to create an "album of the day" block. My solution has been to create a nodequeue of specific records, create a view that passes the current day of the year as an argument, and then uses this value to call the nodequeue item with that same numbered position. I do this by providing a "Default Argument" as PHP code in the "Nodequeue: Position" argument setting. Here's the code I use:
$nodequeueTotalNodes = 120;
$dayOfTheYear = date("z");
$nodeQueuePosition = $dayOfTheYear % $nodequeueTotalNodes;
return $nodeQueuePosition;
The above code works to my satisfaction but my problem is I have to manually change the value of $nodequeueTotalNodes every time I add or remove an item from my nodequeue.
Is there a way to pull the total number of nodes from my queue to replace the "120" in my code above?
The nodequeue_nodes table holds all the nodes in your queues. Something like this should do the trick, where qid is the queue id:
$nodequeueTotalNodes = db_result(db_query('SELECT COUNT(nid) FROM {nodequeue_nodes} WHERE qid = %d', $qid));
If you're using a subqueue there's a column called sqid which you can use.

Resources