Power BI - divide COUNT by DISTINCTCOUNT - count

How can I create a Measure in Power BI to divide COUNT by DISTINCTCOUNT of the same thing?
Example:
Source data - only one column:
PERSON
A
A
A
B
B
C
Now I want to show the following result:
PERSON ... APPEARANCES
A ... 3
B ... 2
C ... 1
If I try this by APPEARANCES = COUNT(PERSON) / DISTINCTCOUNT(PERSON), that doesn't work.
Thank you.

After loading the data into Power BI, go to Edit Queries. On the Home tab, select Group By. A window will appear asking which column you would like to group the data on. Choose PERSON and use Count Rows as the operation.
Please refer to the picture below.
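The logic of that Group By step (one row per PERSON with a count of its rows) can be sketched outside Power BI. This is only a minimal Python illustration, not Power BI or DAX code; the `people` list is an assumption that mirrors the sample column from the question.

```python
from collections import Counter

# Sample PERSON column from the question, assumed to be loaded as a plain list.
people = ["A", "A", "A", "B", "B", "C"]

# Group by PERSON and count the rows in each group.
appearances = Counter(people)

for person, count in sorted(appearances.items()):
    print(f"{person} ... {count}")
```

Each distinct value ends up with its own row count, which is the APPEARANCES column the question asks for.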


How to create a prompt/answer system in R?

I saw in the thread 'Creating a Prompt/Answer system to input data into R' that it is possible to let R respond to questions. I would like to do the same, but based on my dataframe.
My dataframe looks like the following:
PP Trait
1 1
2 1
3 2
4 1
5 3
6 2
I basically would like to let R answer the following questions:
Did participants score at least 1 on Trait? Y/N
Did participants score at least 2 on Trait? Y/N
Did participants score at least 3 on Trait? Y/N
Is this possible?
Thank you in advance!
The svDialogs package might provide what you need. You can create dialog and input boxes with it that return the entered values, which you can later process within your script.
Link: svDialogs on GitHub
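Independently of how the question is asked (console prompt or dialog box), the answering part is a threshold check against the data. Here is a rough Python sketch of that check, not svDialogs or R code. It assumes the question is meant per participant; `trait_by_pp` and `scored_at_least` are hypothetical names introduced for illustration.

```python
# Data from the question: participant number (PP) -> Trait score.
trait_by_pp = {1: 1, 2: 1, 3: 2, 4: 1, 5: 3, 6: 2}


def scored_at_least(threshold):
    """Answer Y/N per participant: did this PP score at least `threshold`?"""
    return {
        pp: ("Y" if score >= threshold else "N")
        for pp, score in trait_by_pp.items()
    }


# Answer the three questions from the post.
for threshold in (1, 2, 3):
    print(f"Scored at least {threshold} on Trait:", scored_at_least(threshold))
```

The threshold could just as well come from user input; the comparison itself stays the same.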

Count no. of categories within categories in my df in R

Each row of data in my df is from a study (or "Article"). Each Article has a sponsor ("Sponsor") who may have sponsored a number of the articles in my dataset.
I want to produce a summary table to show how many articles each Sponsor has sponsored in my dataset.
I hope you can help!
many thanks!!!!
Annabel
What I presume your dataframe looks like:
df=data.frame(article=1:10,sponsor=letters[round(runif(10,1,5))])
head(df)
article sponsor
1 1 a
2 2 b
3 3 e
4 4 d
5 5 e
6 6 b
How you could quickly check the number of articles per sponsor:
table(df$sponsor)
I'm not sure what you want to do exactly, but I guess what you are looking for is:
table(df$Sponsor)
If you want the value that occurs most frequently, you can use:
names(sort(table(df$Sponsor), decreasing=TRUE))[1]
Next time, please try to provide a minimal working example (MWE) so it is easier for us to help.

To sort a specific column in a DataFrame in SparkR

In SparkR I have a DataFrame data. It contains time, game and id.
head(data)
then gives ID = 1 4 1 1 215 985 ..., game = 1 5 1 10 and time 2012-2-1, 2013-9-9, ...
Now game contains a gametype which is numbers from 1 to 10.
For a given gametype I want to find the minimum time, meaning the first time this game has been played. For gametype 1 I do this
data1 <- filter(data, data$game == 1)
This new data contains all data for gametype 1. To find the minimum time I do this
g <- groupBy(data1, game$time)
first(arrange(g, desc(g$time)))
but this can't run in sparkR. It says "object of type S4 is not subsettable".
Game 1 has been played 2012-01-02, 2013-05-04, 2011-01-04,... I would like to find the minimum-time.
If all you want is the minimum time, sorting the whole data set doesn't make sense. You can simply use min:
agg(df, min(df$time))
or for each type of game:
groupBy(df, df$game) %>% agg(min(df$time))
By typing
arrange(game, game$time)
I get all of the times sorted. By applying the first function I get the first entry. If I want the last entry, I simply type this:
first(arrange(game, desc(game$time)))
Just to clarify, because this is something I keep running into: the error you were getting is probably because you also imported dplyr into your environment. If you had used SparkR::first(SparkR::arrange(g, SparkR::desc(g$time))), things would probably have been fine (although obviously the query could have been more efficient).
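The point that a minimum per game needs no sort can be shown with a small sketch. This is plain Python rather than SparkR, and only illustrates the idea. The dates for game 1 come from the question; the rows for game 5 are hypothetical.

```python
from datetime import date

# (game, time) rows. Game 1 dates are from the question; game 5 is made up.
rows = [
    (1, date(2012, 1, 2)),
    (1, date(2013, 5, 4)),
    (1, date(2011, 1, 4)),
    (5, date(2013, 9, 9)),
    (5, date(2012, 2, 1)),
]

# Keep a running minimum per game in one pass instead of sorting all rows.
first_played = {}
for game, played_on in rows:
    if game not in first_played or played_on < first_played[game]:
        first_played[game] = played_on

print(first_played[1])  # 2011-01-04
```

A grouped min aggregation does the same work: one pass over the data and one value kept per group.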

How to build a report with a total and breakout columns with SQL Server and Reporting Services

I have a data structure where I have two tables, Alpha and Beta, and they are one to many. For the sake of an example, let's say that table Alpha has a column for "State" and table Beta has "Colors you like", and you can pick more than one. I would like to build a report that has columns like this:
STATE TOTAL RED GREEN BLUE
Alaska 5 1 3 1
Florida 2 2 2 0
New York 10 5 8 1
The column TOTAL would be a count of the records in Alpha and as you can see due to the one to many relationship the sum of the colors can exceed the count. I suppose it could be less as well if people didn't like colors.
How would you build a report like this? I'll be using SQL Server and Reporting Services in .NET, so it could either be a complex query that I just dump into a data table report or a less complex query with some counting and totaling done by the report. I just don't really know the best way to tackle this.
Since you don't know which colors are going to be the columns, you should use the Matrix control.
You'll need to set up the query:
SELECT
    a.State,
    b.ColorName,
    COUNT(b.ColorID) ColorCount
FROM
    alpha a
    LEFT JOIN beta b
        ON a.id = b.a_id
GROUP BY
    a.State,
    b.ColorName
Just drag State for the rows, ColorName for the columns and ColorCount for the data (Count(ColorID) will display in the data field).
Note: The LEFT JOIN and Count(ColorID) instead of Count(*) are required if you want a 0 value to appear correctly.
If you did know the colors, you could use PIVOT or the SUM(CASE ...) technique:
SELECT state, SUM(CASE WHEN Color = 'RED' THEN 1 ELSE 0 END) AS Red, ...
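As a rough, runnable illustration of the SUM(CASE ...) technique, here is a sketch that uses Python's built-in sqlite3 module instead of SQL Server. The table layout (`alpha.id`, `alpha.state`, `beta.a_id`, `beta.color`) and the sample rows are assumptions made up for the example. The TOTAL column is computed with COUNT(DISTINCT a.id) so that the one-to-many join does not inflate it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema and data: three Alpha rows, three color picks in Beta.
conn.executescript("""
    CREATE TABLE alpha (id INTEGER PRIMARY KEY, state TEXT);
    CREATE TABLE beta (a_id INTEGER, color TEXT);
    INSERT INTO alpha VALUES (1, 'Alaska'), (2, 'Alaska'), (3, 'Florida');
    INSERT INTO beta VALUES (1, 'RED'), (1, 'GREEN'), (2, 'GREEN');
""")

# One output column per known color; LEFT JOIN keeps states with no picks.
query = """
    SELECT a.state,
           COUNT(DISTINCT a.id) AS total,
           SUM(CASE WHEN b.color = 'RED' THEN 1 ELSE 0 END) AS red,
           SUM(CASE WHEN b.color = 'GREEN' THEN 1 ELSE 0 END) AS green,
           SUM(CASE WHEN b.color = 'BLUE' THEN 1 ELSE 0 END) AS blue
    FROM alpha a
    LEFT JOIN beta b ON a.id = b.a_id
    GROUP BY a.state
    ORDER BY a.state
"""
rows = conn.execute(query).fetchall()

for row in rows:
    print(row)
```

With this sample data, Alaska has two Alpha rows but three color picks, so the color columns can add up to more than TOTAL, as described in the question.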

Using two datasets in a single report using SQL server reporting service

I need to show a report of the same set of data with different conditions.
I need to show the count of registered users grouped by region, country and user type. I have used the drill-down feature for this and it is working fine. The reported data is the count of users registered between two dates. Along with that, I have to show the total users in the system using the same drill-down, that is, total users by region, country and user type, in a separate column next to each count (the count of users between the two dates),
so that my result will be as follows. Initially it will be like
Region - Country - New Reg - Total Reg - User Type 1 - UserType2
+ Region1 2 10 1 5 1 5
+ Region2 3 7 2 4 1 3
and upon expanding the region it will be like
Region - Country - New Reg - Total Reg - User Type 1 - UserType2
+ Region1 2 10 1 5 1 5
country1 1 2 1 2 - -
country2 1 8 1 8 - -
+ Region2 3 7 2 4 1 3
Is there a way I can show my report like this? I have tried with two datasets, one with the conditional data and the other with the unconditional data, but it didn't work; it always brings the total number of registered users for all the Total Reg columns.
Unless I'm mistaken, you're trying to create an expandable table, with different grouping levels? Fortunately, this can be easily done in SSRS if you know where to look. The totals on your example don't seem to match up in the user columns, so I may have misunderstood the problem.
For starters, set up your query to produce a single dataset like this:
Region Country New Reg - Total Reg - User Type 1 - User Type 2
Region1 country1 1 2 1
Region1 country2 1 8 1
Region2 country3 2 4 1 1
Region2 country4 1 3 1
Now that you've got that, you want to set up a new table with the fields "NewReg", "TotalReg", "UserType1" and "UserType2". Then right-click the table row, and go to "Add Group > Row Group > Parent Group". Select "Country" in the Group by and click okay. Then, repeat this process and select "Region". This time however, tick the "Add group header" box. This will insert another row above the original.
Now, for each of your fields ("NewReg", "TotalReg" etc.), click in the new row above and select the field again. This will automatically add a Sum(FieldName) value into the cell. This will add together all the individual row totals and present a new row, grouped by region, when you run the report.
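What the group header row computes can be sketched as a plain aggregation. The following Python snippet is only an illustration of the arithmetic, not SSRS code. It uses the NewReg and TotalReg values from the sample dataset above; the variable names are hypothetical.

```python
from collections import defaultdict

# Detail rows (Region, Country, NewReg, TotalReg) from the sample dataset.
detail_rows = [
    ("Region1", "country1", 1, 2),
    ("Region1", "country2", 1, 8),
    ("Region2", "country3", 2, 4),
    ("Region2", "country4", 1, 3),
]

# The region header row holds the sum of each field over its detail rows.
region_totals = defaultdict(lambda: {"NewReg": 0, "TotalReg": 0})
for region, _country, new_reg, total_reg in detail_rows:
    region_totals[region]["NewReg"] += new_reg
    region_totals[region]["TotalReg"] += total_reg

for region, sums in region_totals.items():
    print(region, sums["NewReg"], sums["TotalReg"])
```

The sums come out as 2 and 10 for Region1 and 3 and 7 for Region2, which matches the first two numbers on the collapsed region rows in the question.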
That should give you the table you require with the data aggregated correctly, so all you need to do is manage showing and hiding the detail rows on demand.
To do this, select your detail row (the original row), right-click and choose "Row Visibility". Set this to "Hide". Now, select the cell that contains the "Region" and take note of its ID using Properties (for now, let's assume it's called "Region"). Click back onto your detail row and look at the properties window. At the bottom you'll see a "Visibility" setting. In there, set "InitialToggleState" to False and "ToggleItem" to the name of your region group's cell (i.e. "Region").
Now all that should be left is to do the formatting etc and tidy up.
I have solved this problem by taking all the records from the DB and filtering them in the report to collect the new registration count, using the following expression:
=Sum(IIF(Fields!RegisteredOn.Value > Parameters!FromDate.Value And Fields!RegisteredOn.Value < Parameters!EndDate.Value, 1, 0))
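The idea behind that expression is to load every record once and count only those whose registration date falls between the two parameters. A minimal Python sketch of the same arithmetic follows; the dates and variable names are made up for illustration.

```python
from datetime import date

# Hypothetical registration dates for all users in the system.
registered_on = [
    date(2020, 1, 5),
    date(2020, 2, 10),
    date(2020, 3, 15),
    date(2020, 4, 1),
]

# Hypothetical report parameters.
from_date = date(2020, 2, 1)
end_date = date(2020, 4, 1)

# Same shape as Sum(IIF(condition, 1, 0)): add 1 per row inside the range.
new_reg = sum(1 if from_date < d < end_date else 0 for d in registered_on)

# Total Reg is simply every row, with no date condition.
total_reg = len(registered_on)

print(new_reg, total_reg)  # 2 4
```

Because both comparisons are strict, a registration dated exactly on FromDate or EndDate is not counted as new; use >= and <= if the boundary days should be included.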
