I've got this table I've defined in PowerBI:
I'd like to define a new table which has the percentage of medals won by USA from the total of medals that were given that year for each sport.
An example:
Year Sport Percentage
1986 Aquatics 0.0%
How could I do it?
You can use SUMMARIZE() to calculate a new table:
NewTable =
SUMMARIZE(
    yourDataTable;
    [Year];
    [Sports];
    "Pct";
    DIVIDE(
        CALCULATE(
            COUNTROWS(yourDataTable);
            yourDataTable[Nat] = "USA"
        );
        CALCULATE(
            COUNTROWS(yourDataTable);
            ALLEXCEPT(
                yourDataTable;
                yourDataTable[Year];
                yourDataTable[Sports]
            )
        );
        0
    )
)
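If it helps to see the same logic outside DAX, here is a rough Python/pandas sketch of what that SUMMARIZE is computing; the table and its values are made up, using the Year/Sports/Nat columns from the question:

```python
import pandas as pd

# Hypothetical medal rows mirroring the question's table
df = pd.DataFrame({
    "Year":   [2000, 2000, 2000, 2000, 2004, 2004],
    "Sports": ["Aquatics", "Aquatics", "Rowing", "Rowing", "Aquatics", "Aquatics"],
    "Nat":    ["USA", "GER", "USA", "USA", "GER", "FRA"],
})

# For each Year/Sports group: USA rows divided by all rows, which is what the
# DIVIDE(CALCULATE(...); CALCULATE(...)) pair produces per summarized row
new_table = (
    df.groupby(["Year", "Sports"])["Nat"]
      .agg(usa=lambda s: (s == "USA").sum(), total="size")
      .reset_index()
)
new_table["Pct"] = new_table["usa"] / new_table["total"]
print(new_table)
```

Not a substitute for the DAX, just a way to verify the grouping logic on sample data.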
I know that an answer has already been accepted, but I feel that I should provide my suggested solution to utilize all of Power BI's capabilities.
By creating a calculated table, you are limited in what you can do with the data, in that it is hard coded to be filtered to USA and is only based on Year and Sport. While those are the current requirements, what if they change? Then you have to recode your table or create another one.
My suggestion is to use measures to accomplish this task, and here's how...
First, here is my set of sample data.
With that data, I created a simple measure that counts the rows to get the number of medals.
Medal Count = COUNTROWS(Olympics)
Throwing together a basic matrix with that measure we can see the data like this.
A second measure can then be created to get a percentage for a specific country.
Country Medal Percentage = DIVIDE([Medal Count], CALCULATE([Medal Count], ALL(Olympics[Country])), BLANK())
Adding that measure to the matrix we can start to see our percentages.
From that matrix, we can see that USA won 25% of all medals in 2000. And their 2 medals in Sport B made up 33.33% of all medals that year.
With this you can utilize slicers and the layout of the matrix to get the desired percentage. Here's a small example with a country and year slicer that shows the same numbers.
From here you are able to cut the data by any sport or year and see the percentage of any selected country (or countries).
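For readers who want to check the mechanics of ALL() outside Power BI, here is a Python/pandas sketch; the toy data is invented so that USA's share works out to the 25% and 33.33% figures mentioned above. Removing the Country filter corresponds to summing the counts over all countries in the same group:

```python
import pandas as pd

# Invented medal rows: in 2000, USA has 1 medal in Sport A and 2 in Sport B,
# out of 6 medals per sport (12 total), matching the percentages in the answer
olympics = pd.DataFrame({
    "Year":    [2000] * 12,
    "Sport":   ["A"] * 6 + ["B"] * 6,
    "Country": ["USA", "GER", "GER", "GER", "FRA", "FRA",
                "USA", "USA", "GER", "GER", "FRA", "FRA"],
})

# [Medal Count] for each Year/Sport/Country cell of the matrix
cell = olympics.groupby(["Year", "Sport", "Country"]).size().rename("medal_count").reset_index()
# CALCULATE([Medal Count], ALL(Country)): the same cell with the Country filter removed
cell["pct"] = cell["medal_count"] / cell.groupby(["Year", "Sport"])["medal_count"].transform("sum")

# Year-level subtotal: USA's share of all medals won in 2000
year = olympics.groupby(["Year", "Country"]).size().rename("n").reset_index()
year["pct"] = year["n"] / year.groupby("Year")["n"].transform("sum")
```

This is only a model of what the filter context does, not Power BI code; the measure approach in the answer is what you would actually deploy.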
My script is currently very messy, and I was wondering if there is a better way of doing this.
I am trying to work out the final price of a product. These are the steps I take:
Dataset A has all the products, and it changes every year; Dataset B has the base price for the products, which also changes every year.
In Dataset A, find the product base price by matching the product in Dataset B.
In Dataset A, apply numerous variations to the base price; these adjustments change every year. I am currently entering the variations manually.
After the variations, this is my final price.
Dataset A has columns product.
Dataset B has columns product and base.price.
Variation 1 = the base price is adjusted by 10% if it meets a condition.
Variation 2 = after variation 1, the price is adjusted by 5% if it meets a condition.
Variation 3 = after variations 1 and 2, the price is adjusted by 8% if it meets a condition.
library(tidyverse)

#### creating sample data
product <- c("pants", "shirt", "boots", "dress")
datasetA <- data.frame(product)
base.price <- c(10, 8, 9, 16)
datasetB <- data.frame(product, base.price)
####

datasetAB <- dplyr::left_join(datasetA, datasetB, by = "product")
# variation 1
datasetAB <- datasetAB %>% mutate(baseprice1 = base.price * 1.1)
# variation 2
datasetAB <- datasetAB %>% mutate(baseprice2 = baseprice1 * 1.05)
# variation 3
datasetAB <- datasetAB %>% mutate(baseprice3 = baseprice2 * 1.08)
I am trying to work out if there is a better way of doing this, instead of importing so many datasets and referencing all of them in my code. Because they change every year, there are just too many datasets.
I am sorry, I don't have enough reputation to show this properly.
It would definitely help if your example contained your further calculations, including these conditions. Nevertheless, I dare say you're better off applying a "final_price" function to your products.
If you insist on having all your possible prices available in a data.frame, then I suggest you just add columns to a single data.frame, such that you have columns product, base.price, price.conditionA, etc...
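To sketch the function idea (in Python for illustration, since the real adjustment rules aren't shown in the question; the three conditions below are hypothetical stand-ins for them):

```python
import pandas as pd

def final_price(product, base_price):
    """Apply the variations in order; each condition here is an assumed example."""
    price = base_price
    if base_price > 9:          # variation 1: hypothetical condition, +10%
        price *= 1.10
    if product in ("boots",):   # variation 2: hypothetical condition, +5%
        price *= 1.05
    if price > 15:              # variation 3: hypothetical condition, +8%
        price *= 1.08
    return round(price, 2)

products = pd.DataFrame({
    "product": ["pants", "shirt", "boots", "dress"],
    "base.price": [10, 8, 9, 16],
})
products["final"] = [
    final_price(p, b) for p, b in zip(products["product"], products["base.price"])
]
print(products)
```

When the rules change next year, only the function body needs editing; none of the yearly datasets have to be restructured.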
I have a problem and was wondering if there is code that would let me solve it faster than doing it manually.
So for my example, I have 100 different funds with numerous securities in each fund. Within each fund, I have the Name of each type of security, the Date which shows the given quarter, the State where the security is issued, and the Weighting of each security within the total fund. The Name is not important; only the State where the security was issued matters.
I was wondering if there is a way to add up the Weighting from each fund for specific States in each quarter. For example, from Fund1 I need the sum of the Weighting for just the states SC and AZ in 16-1Q; the sum would be (.18 + .001). I do not need to include the weighting for KS because I am not interested in that state. I would only be interested in the states SC and AZ for every FundId; in my real problem, however, I am interested in ~30 states. I would then do the same task for Fund1 for 16-2Q, and so on until 17-4Q. My end goal is to find the sum of every portfolio weighting for the states I'm interested in and see how it changes over time. I can do this manually for each fund, but is there a way to automatically sum up the Weighting for each FundId based on the States I want and for each Date (16-1Q, 16-2Q, etc.)?
In the end I would like a table such as:
(.XX) is the sum of portfolio weight
Example of Data
The Example of Data link you sent has a much better data format than the "(.XX) is the sum of portfolio weight" example; only in Excel would you prefer that other kind of layout.
So, using the example data frame, do this operation:
library(dplyr)
example_data <- example_data %>%
  filter(State %in% c("SC", "AZ")) %>%   # the states of interest
  group_by(Fund_Id, Date) %>%
  summarize(sum = sum(Weighting))
We can use aggregate in base R
aggregate(Weighting ~ Fund_Id + Date, subset(example_data, State %in% c("SC", "AZ")), sum)
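For comparison, the same filter-then-aggregate pattern in Python/pandas; the column names follow the question and the sample values are assumed:

```python
import pandas as pd

# Hypothetical fund holdings shaped like the question's data
example_data = pd.DataFrame({
    "Fund_Id":   ["Fund1", "Fund1", "Fund1", "Fund2", "Fund2"],
    "Date":      ["16-1Q", "16-1Q", "16-1Q", "16-1Q", "16-1Q"],
    "State":     ["SC", "AZ", "KS", "SC", "NY"],
    "Weighting": [0.18, 0.001, 0.05, 0.10, 0.20],
})

wanted = ["SC", "AZ"]  # in the real problem this list has ~30 states
result = (
    example_data[example_data["State"].isin(wanted)]
    .groupby(["Fund_Id", "Date"], as_index=False)["Weighting"]
    .sum()
)
print(result)
```

With the full list of ~30 states in `wanted`, the same three lines cover every fund and every quarter at once.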
For a certain date range and a specific dimension, I need to calculate the average value of a daily rank that is itself based on the average value.
First of all this is the starting point:
This is quite simple: for each day and category I get the AVG(Value) and the Rank based on that AVG(Value), computed using Category.
Now what I need is "just" a table with one row per Category, containing the average value of that rank over the whole period.
Something like this:
Category Global Rank
A (blue)    1.6   (1+3+1+1+1+3)/6
B (orange)  2.3   (3+2+3+2+2+2)/6
C (red)     2.0   (2+1+2+3+3+1)/6
I tried using LODs, but it's not possible to use rank table calculations inside them, so I'm wondering if I'm missing anything or if this is even possible in Tableau.
Please find attached the twbx with the raw data here:
Any Help would be appreciated.
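As a quick sanity check on the target numbers (done outside Tableau), the Global Rank values are just plain averages of the daily ranks listed above:

```python
# Daily ranks per category, taken from the example table in the question
ranks = {
    "A (blue)":   [1, 3, 1, 1, 1, 3],
    "B (orange)": [3, 2, 3, 2, 2, 2],
    "C (red)":    [2, 1, 2, 3, 3, 1],
}

# Global Rank = mean of the daily ranks over the period
global_rank = {cat: sum(r) / len(r) for cat, r in ranks.items()}
print(global_rank)
```

Whatever Tableau mechanism ends up computing this, the result should match these simple averages.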
I have a column with a few dozen grades that have been assigned the values Good, Average or Poor, and a different column with employment rates. I want the maximum employment rate associated with each of Good, Average and Poor. I can pull the value for each one in three separate commands using the code below, but I need it written as a single command, similar to this:
max(unHomework$Employment.Rate[unHomework$Job.Satisfaction.Category == 'Poor'])
We can use data.table
library(data.table)
setDT(unHomework)[, .(MaxER = max(Employment.Rate)), by = Job.Satisfaction.Category]
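The same grouped maximum in Python/pandas, with a few made-up rows for illustration (column names follow the question):

```python
import pandas as pd

# Hypothetical rows matching the question's columns
unHomework = pd.DataFrame({
    "Job.Satisfaction.Category": ["Good", "Good", "Average", "Poor", "Poor"],
    "Employment.Rate": [0.91, 0.87, 0.75, 0.60, 0.66],
})

# One command: maximum employment rate per category
max_er = (
    unHomework.groupby("Job.Satisfaction.Category", as_index=False)["Employment.Rate"]
    .max()
)
print(max_er)
```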
I have data for customer purchases across different products. I calculated amount_spent by multiplying Item Numbers by the respective Price.
I used the cut function to segregate people into different age bins. Now how can I find the aggregate amount spent by the different age groups, i.e. the contribution of each age group in terms of dollars spent?
Please let me know if you need any more info.
I am really sorry that I can't paste the data here due to remote desktop constraints. I am mainly concerned with the result I got after the summarize function.
library(dplyr)
customer_transaction %>% group_by(age_gr) %>% summarise(amount_spent = sum(amount_spent))
Though I am not sure if you want the contribution to the whole pie or just the sum in each age group.
If your data is of class data.table you could go with
customer_transaction[,sum(amount_spent),by=age_gr]
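Either way, the same aggregation in Python/pandas makes it easy to get both answers at once: the sum per age group and each group's contribution to the whole pie. The sample data here is invented, and age_gr would normally come from binning an age column:

```python
import pandas as pd

# Hypothetical transactions; age_gr stands in for the result of the cut step
customer_transaction = pd.DataFrame({
    "age_gr": ["18-30", "18-30", "31-50", "31-50", "51+"],
    "amount_spent": [20.0, 30.0, 100.0, 50.0, 40.0],
})

# Total spent per age group
by_group = customer_transaction.groupby("age_gr", as_index=False)["amount_spent"].sum()
# Each group's share of total spending (the "contribution to the whole pie")
by_group["share"] = by_group["amount_spent"] / by_group["amount_spent"].sum()
print(by_group)
```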