I have been trying for the past several hours to write a calculated field in Google Data Studio.
I need to know how to get a percentage calculation on some events. Table below:
|Event Label|Event Action |Total Events|
|-----------|----------------|------------|
|CTA 1 |Link Displayed | 100 |
|CTA 1 |Link Clicked | 20 |
I want to get the conversion rate, which means dividing 20 by 100, but I can't seem to write a calculated field which does that. I feel like I've tried everything, e.g.:
sum((total events(link clicked)) / total events(link displayed)))
And the like. Please help!
Thanks
This function is not available yet through Data Studio (note that Data Studio is brand new). For this you have to use the API, which I would strongly recommend!
Here you could use R and the sqldf package, which would provide you the data just like Data Studio does (very simple SQL queries). The same package exists in Python.
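For illustration, here is a minimal sketch of the SQL approach, assuming you have already pulled the event rows via the API; Python's built-in sqlite3 stands in for sqldf, and the table and column names are hypothetical:

```python
import sqlite3

# Hypothetical event rows as pulled from the API, mirroring the table above.
rows = [
    ("CTA 1", "Link Displayed", 100),
    ("CTA 1", "Link Clicked", 20),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (label TEXT, action TEXT, total INTEGER)")
con.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)

# Conversion = clicks / displays, computed per label with a pivot-style query.
query = """
SELECT label,
       1.0 * SUM(CASE WHEN action = 'Link Clicked'   THEN total END)
           / SUM(CASE WHEN action = 'Link Displayed' THEN total END) AS conversion
FROM events
GROUP BY label
"""
for label, conversion in con.execute(query):
    print(label, conversion)  # CTA 1 0.2
```

The same query text would work essentially unchanged in sqldf in R.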
After years of reading, I now have a question that I didn't find answered, probably because I don't really know what the thing I want is called.
I recently started using databases and SQL, so I have only a minimum of knowledge about them.
What I need is the following:
I want an output that looks like this:
Post Code | coordinates | Birth Date | Temperature
The first three come from different tables, everything fine. But the last one is the tricky one:
My Temperature table (temperature) looks like this:
Date | 50.95N_12.45E | 50.85N_12.35E | ...
1.1.1950 | 10 | 3.2 | ...
2.1.1950 | 10.2 | 3.5 | ...
And now I need to tell sqlite:
SELECT mom.coordinates AS coordinates,
       temperature.(what you find in coordinates) AS temperature
FROM...
Is this understandable?
Thanks in advance :)
You can't refer to a column's name like that.
You must enumerate all the columns in a CASE expression:
SELECT mom.coordinates AS coordinates,
       CASE mom.coordinates
           WHEN '50.95N_12.45E' THEN temperature."50.95N_12.45E"
           WHEN '50.85N_12.35E' THEN temperature."50.85N_12.35E"
           ...
       END AS temperature
FROM
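Here is a minimal, runnable sketch of that CASE technique, with simplified, hypothetical stand-ins for the real tables ("mom" holds each person's coordinate; "temperature" has one column per coordinate), using Python's built-in sqlite3:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE mom (person TEXT, coordinates TEXT);
INSERT INTO mom VALUES ('A', '50.95N_12.45E'), ('B', '50.85N_12.35E');

CREATE TABLE temperature (Date TEXT, "50.95N_12.45E" REAL, "50.85N_12.35E" REAL);
INSERT INTO temperature VALUES ('1.1.1950', 10, 3.2), ('2.1.1950', 10.2, 3.5);
""")

# The CASE picks the temperature column whose name matches each
# person's stored coordinate string.
rows = con.execute("""
SELECT mom.person,
       temperature.Date,
       CASE mom.coordinates
           WHEN '50.95N_12.45E' THEN temperature."50.95N_12.45E"
           WHEN '50.85N_12.35E' THEN temperature."50.85N_12.35E"
       END AS temperature
FROM mom CROSS JOIN temperature
""").fetchall()
for row in rows:
    print(row)
```

The join condition and column list would of course need adapting to the real schema.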
I have a set of data in the following format:
Resp | Q1 | Q2
P1 | 4 | 5
P2 | 1 | 2
P3 | 4 | 3
P4 | 6 | 4
I'd like to show the count and % of people who gave an answer greater than 3. So in this case, the output would be:
Question | Count | Percent
Q1 | 3 | 75%
Q2 | 2 | 50%
Any suggestions?
Although it sounds like a fairly easy thing, it is a bit more complicated.
Firstly your data is not row based so you will have to pivot it.
Load your data into Tableau
In the Data Source screen choose columns Q1 and Q2, right click on them and choose "Pivot"
Name the column with the answers "Answer" (just for clarity).
You should get a table that looks like this:
Now you need to create a calculated field (I called it Overthreshold) to check for your condition:
if [Answer] > 3 then
[Answer]
End
At this point you could substitute the 3 with a parameter in case you want to easily change that condition.
You can already drop the pills as follows to get the count:
Now if you want the percentage it gets a bit more complicated, since you have to determine the count of the questions and the count of the answers > 3 which is information that is stored in two different columns.
Create another calculated field with this calculation:
COUNT([Overthreshold]) / AVG({fixed [Question]:count([Answer])})
drop the created pill onto the "text" field or into the columns drawer and see the percentage values
right click on the field and choose Default Properties / Number Format to have it as a percentage rather than a float
To explain what the formula does:
It takes the count of the answers that are over the threshold and divides it by the count of answers for each question. This is done by the fixed part of the formula, which counts the rows that have the same value in the Question column. The AVG is only there because Tableau needs an aggregation there. Since the value will be the same for every record of the question, you could also use MIN or MAX.
It feels like there should be an easier solution but right now I cannot think of one.
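To make the arithmetic behind the Tableau calculation concrete, here is a hypothetical sketch in plain Python, starting from the pivoted shape (one row per respondent/question/answer) described above:

```python
from collections import defaultdict

# The pivoted survey data: (respondent, question, answer) rows.
rows = [
    ("P1", "Q1", 4), ("P1", "Q2", 5),
    ("P2", "Q1", 1), ("P2", "Q2", 2),
    ("P3", "Q1", 4), ("P3", "Q2", 3),
    ("P4", "Q1", 6), ("P4", "Q2", 4),
]

THRESHOLD = 3  # the condition you could expose as a parameter

totals = defaultdict(int)  # count of answers per question (the fixed LOD part)
over = defaultdict(int)    # count of answers over the threshold

for _resp, question, answer in rows:
    totals[question] += 1
    if answer > THRESHOLD:
        over[question] += 1

for question in sorted(totals):
    percent = over[question] / totals[question]
    print(question, over[question], f"{percent:.0%}")  # Q1 3 75% / Q2 2 50%
```

The division per question is exactly what COUNT([Overthreshold]) / AVG({fixed [Question]:count([Answer])}) expresses in Tableau.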
Here is a variation on Alexander's correct answer. Some folks might find it slightly simpler, and it at least shows some of the Tableau features for calculating percentages.
Starting as in Alexander's answer, revise Overthreshold into a boolean-valued field, defined as [Answer] > 3
Instead of creating a second calculated field for the percentage, drag Question, Overthreshold and SUM(Number Of Records) onto the viz as shown below.
Right click on SUM(Number of Records) and choose Quick Table Calculation->Percentage of Total
Double click Number of Records in the data pane on the left to add it to the sheet, which is a shortcut for bringing out the Measure Names and Measure Values meta-fields. Move Measure Names from Rows to Columns to get the view below, which also uses aliases on Measure Names to shorten the column titles.
If you don't want to show the below threshold data, simply right click on the column header False and choose Hide. (You can unhide it if needed by right clicking on the Overthreshold field)
Finally, to pretty it up a bit, you can move Overthreshold to the detail shelf (you can't remove it from the view though), and adjust the number formatting for the fields being displayed to get your result.
Technically, Alexander's solution uses LOD calculations to compute the percentages on the server side, while this solution uses Table calculations to compute the percentage on the client side. Both are useful, and can have different performance impacts. This just barely nicks the surface of what you can do with each approach; each has power and complexity that you need to start to understand to use in more complex situations.
I'm trying to sum up values based on the 'Description' column of a dataset. So far, I have this
=Sum(Cdbl(IIf(First(Fields!Description.Value, "Items") = "ItemA", Sum(Fields!Price.Value, "Items"), 0)))
But it keeps giving me an error saying that it "contains a First, Last, or Previous aggregate in an outer aggregate. These aggregate functions cannot be specified as nested aggregates" Is there something wrong with my syntax here?
What I need to do is take something like this...
Item | Price
Item A | 400.00
Item B | 300.00
Item A | 200.00
Item A | 100.00
And I need to get the summed Price for 'ItemA' - 700.00 in this case.
All of the answers I've found so far only show for a single dataset OR for use with a tablix. For example, the below code does not work because it does not specify the scope or the dataset to use.
=Sum(Cdbl(IIf(Fields!Description.Value = "ItemA", Sum(Fields!Price.Value), 0)))
I also can't specify a dataset to use, because the control I'm loading into is a textbox, not a tablix.
If anyone else sees this and wants an answer: I ended up returning a count of what I needed on another dataset. The other option I was considering would be to create a 1x1 tablix, set its dataset, and then use the second bit of code posted.
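For reference, the 1x1 tablix idea would pair the tablix's dataset binding with an unscoped conditional sum; this is a hypothetical sketch (adjust the field names to your report), not code from the original post:

```
=Sum(IIf(Fields!Description.Value = "ItemA", Fields!Price.Value, 0))
```

Because the tablix itself is bound to the dataset, the expression no longer needs a dataset scope argument, which sidesteps the nested-aggregate error.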
This question comes in the sequence of a previous one I asked this week.
But generally my problem goes as follows:
I have a datastream of records entering in R via a socket and I want to do some analyses.
They come sequentially like this:
individual 1 | 1 | 2 | timestamp 1
individual 2 | 4 | 10 | timestamp 2
individual 1 | 2 | 4 | timestamp 3
I need to create a structure to maintain those records. The main idea is discussed in the previous question but generally I've created a structure that looks like:
             | *var1*  | *var2*     | *timestamp*
individual 1 | [1,2,3] | [2,4,6]    | [timestamp1, timestamp3, ...]
individual 2 | [4,7,8] | [10,11,12] | [timestamp2, ...]
IMPORTANT - this structure is created and enlarged at runtime. I think this is not the best choice, as it takes too long to create. The main structure is a matrix, and inside each individual-variable pair I have lists of records.
The individuals are great in number and vary a lot over time, so without going through some records I don't have enough information to make a good analysis. I am thinking about creating some kind of cache at run time in R by saving the records of individuals to disk.
My full database amounts to approximately 100 GB. I want to analyse it mainly in seasonal blocks within each individual (dependent on the timestamp variable).
The creation of my structure takes too long as I enlarge the amount of records I'm collecting.
The idea of using a matrix of data with lists inside each individual-variable pair was adapted from a three-dimensional matrix, because I don't have observations at the same timestamps. I don't know if it was a good idea.
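For illustration, the per-individual structure described above can be sketched as a dictionary of growing lists; this is a hypothetical Python sketch (in R, an environment or a list of data frames would play the same role), where appending to per-individual lists avoids re-allocating one big matrix on every record:

```python
from collections import defaultdict

# One growing record store per individual; appending is amortised O(1),
# unlike enlarging a single matrix for every incoming record.
store = defaultdict(lambda: {"var1": [], "var2": [], "timestamp": []})

def ingest(individual, var1, var2, timestamp):
    rec = store[individual]
    rec["var1"].append(var1)
    rec["var2"].append(var2)
    rec["timestamp"].append(timestamp)

# The stream from the question, one record at a time:
ingest("individual 1", 1, 2, "timestamp 1")
ingest("individual 2", 4, 10, "timestamp 2")
ingest("individual 1", 2, 4, "timestamp 3")

print(store["individual 1"]["var1"])  # [1, 2]
```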
If anyone has any idea on this matter I would appreciate it.
I have tabulated data. I have to write some code to dynamically generate some .pdf reports. Once I know how to make R read and publish only 1 row at a time, I will be using Sweave to format it and make it look nice.
For example, if my data set looks like this:
Name | Sport | Country
Ronaldo | Football | Portugal
Federer | Tennis | Switzerland
Woods | Golf | USA
My output would be composed of three .pdf files. The first one would say "Ronaldo plays football for Portugal"; and so on for the other two rows.
I have started with a for-loop, but every forum I have trawled through talks about the advantages of the apply family of functions over it, and I don't know how to make them apply to every row of the data.
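The row-by-row step can be sketched like this; this is a hypothetical illustration in Python of the shape of the loop (in R, the equivalent would be apply(df, 1, function(row) ...) over the data frame rows, feeding each result into Sweave):

```python
# One dictionary per row of the tabulated data above.
rows = [
    {"Name": "Ronaldo", "Sport": "Football", "Country": "Portugal"},
    {"Name": "Federer", "Sport": "Tennis", "Country": "Switzerland"},
    {"Name": "Woods", "Sport": "Golf", "Country": "USA"},
]

# Build one sentence per row; each would feed one generated .pdf report.
sentences = [
    f"{r['Name']} plays {r['Sport'].lower()} for {r['Country']}" for r in rows
]
for s in sentences:
    print(s)  # "Ronaldo plays football for Portugal", etc.
```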
PS: This is my first post on stackoverflow.com. Excuse me if I am not following the community rules here. I will try my best to ensure that the question conforms to the guidelines based on your feedback.