I have a table that contains a list of product scores by date:
From this table, I have to make a plot of the percentage of each quality by date.
I know how to do it in Python, but I'm having a hard time figuring out how to do it in Power BI.
This is what I'm trying to make:
1) Get the percentages by class:
In Python this is easily done by grouping by date and score and dividing by the totals grouped by date:
df_grouped = (df.groupby(["Date","Score"]).sum()/df.groupby(["Date"]).sum())*100
And then just make a plot of the percentage of each score by day
Like this:
How can I get a similar result in Power BI?
Here is a google drive link to download a csv with the sample data: https://drive.google.com/file/d/1dEdUwwofv1OQ9rOGQMuyfYKO9_YJDTcl/view?usp=sharing
EDIT:
I'm getting this result from M D code:
Create a new measure and change its data type to a percentage in the data modelling tab. The measure should use the following DAX formula:
Measure =
    CALCULATE(
        SUM(Table1[Percentage_By_Class]),
        FILTER(Table1, Table1[Date] = MAX(Table1[Date])),
        ALLEXCEPT(Table1, Table1[Score])
    )
    / CALCULATE(
        SUM(Table1[Percentage_By_Class]),
        FILTER(ALL(Table1), Table1[Date] = MAX(Table1[Date]))
    )
This will calculate the sum of a group (the score) by date and divide it by the total for all groups on the day. Then add it to a line chart with date as the axis, score as legend and the new measure as the values.
See the images below as well as the code.
Graph Created With Your Sample Data
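For anyone who wants to cross-check the measure against the pandas approach from the question, here is a minimal sketch of the same calculation. The sample rows and the value column name `Count` are made up for illustration; only `Date` and `Score` come from the question.

```python
import pandas as pd

# Made-up sample rows; "Count" is a hypothetical value column.
df = pd.DataFrame({
    "Date": ["2020-01-01", "2020-01-01", "2020-01-01", "2020-01-02", "2020-01-02"],
    "Score": ["A", "B", "A", "A", "B"],
    "Count": [1, 1, 2, 1, 1],
})

# Total per (date, score) group.
group_totals = df.groupby(["Date", "Score"])["Count"].sum()

# Total per date, repeated on every (date, score) row so the division aligns.
day_totals = group_totals.groupby(level="Date").transform("sum")

# Share of each score within its date, as a percentage.
percent = group_totals / day_totals * 100

# One column per score, one row per date: the shape a line chart needs.
wide = percent.unstack("Score")
print(wide)
```

Each row of `wide` should add up to 100, which is a quick sanity check to run against the Power BI chart as well.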
I'm working with a dataset where I have one continuous variable (V1) and want to see how that variable differs depending on demographics such as sex, age group, etc.
I would like to make one graph that contains multiple boxplots, so that V1 is on the y-axis and all my demographic variables (sex, age groups, etc.) are on the x-axis with their corresponding p-values. Does anyone know how to do this in R?
I've added two photos to illustrate my dataset and the output I want.
Thanks!
Output example
Data example
It would be nice to have the actual data and the code you already have, so we can replicate what you have and work out what you want. That being said, this link might be what you are looking for:
https://statisticsglobe.com/draw-multiple-boxplots-in-one-graph-in-r#example-2-drawing-multiple-boxplots-using-ggplot2-package
Scroll down about half way to Example 4: Drawing Multiple Boxplots for Each Group Side-by-Side
Hi, I am new to R and don't quite know what I'm looking for. I want to measure the probability of each dust concentration, so I need to divide each frequency by the total of all dust concentration frequencies. After that I can go on to find the CDF and PMF of the dust concentration.
I have dust probability data with two columns (Dust Concentration and its Frequencies), and it looks like this:
My first thought was that I have to increment i in this R expression:
dustProb[i, "Frekuensi"]
which should take the frequency in row i, so that I can sum all the frequencies retrieved this way with a for loop like this:
# the dataset is called dustData here
# dustFrequencies = dustData[i, "Frekuensi"]
for(i in dustFrequencies){
print(dustFrequencies)
}
The print() part is supposed to be where I sum all the values obtained through those incremented queries.
My question is:
Can I increment the 'i' inside that R expression?
Is my approach too complicated, or is there another way to measure probability in R?
Sorry for the confusion, inefficiency, and gaps; I hope I was clear enough.
Using loops in R isn't very tidy-friendly. You can do:
library(dplyr)
dustData <- dustData %>%
mutate(probabilities = Frekuensi/sum(Frekuensi))
The new column is the frequency divided by the sum of all frequencies, for each dust concentration.
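Since the question mentions going on to the PMF and CDF, here is a minimal Python sketch of that follow-up step. The concentration and frequency values are made up, and the variable names are just illustrative: the PMF is each frequency divided by the total, and the CDF is the running sum of the PMF.

```python
from itertools import accumulate

# Made-up data: dust concentrations and how often each one was observed.
concentrations = [10, 20, 30, 40]
frequencies = [2, 3, 4, 1]

# PMF: each frequency divided by the sum of all frequencies.
total = sum(frequencies)
pmf = [f / total for f in frequencies]

# CDF: cumulative sum of the PMF, in order of increasing concentration.
cdf = list(accumulate(pmf))

for c, p, q in zip(concentrations, pmf, cdf):
    print(f"concentration={c}  pmf={p:.2f}  cdf={q:.2f}")
```

No explicit row index is needed here either; the whole column is divided in one step, just like in the `mutate` call.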
I have a panel data set with return, ESG score and market value for a number of companies over 11 years. I need to extract data for all variables for one year at a time, to make yearly portfolios.
The data frame looks like this:
How can I extract one year at a time and then construct portfolios of high and low ESG score for each year?
Thanks in advance
Have you considered processing the data with Python and Pandas instead of R? The following solution should help to slice your data into different time intervals:
Slice JSON File into Different Time Intercepts with Python
In terms of sorting ESG scores, you can use the following command: df.sort_values('ESG')
Hope that helps and good luck with your dataset.
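As a rough sketch of how the yearly split could look in pandas: the column names (`Company`, `Year`, `ESG`, `Return`), the sample values, and the choice of the yearly median as the high/low cutoff are all assumptions for illustration, not taken from the original data.

```python
import pandas as pd

# Made-up panel: 4 companies over 2 years (the real data covers 11 years).
panel = pd.DataFrame({
    "Company": ["A", "B", "C", "D"] * 2,
    "Year": [2010] * 4 + [2011] * 4,
    "ESG": [10, 40, 30, 20, 35, 15, 25, 45],
    "Return": [0.02, 0.05, 0.01, 0.03, 0.04, -0.01, 0.02, 0.06],
})

# Median ESG score within each year, aligned to every row.
median_esg = panel.groupby("Year")["ESG"].transform("median")

# Assumed rule: above the yearly median is "high", otherwise "low".
panel["Portfolio"] = (panel["ESG"] > median_esg).map({True: "high", False: "low"})

# One data frame per year, if each year is needed separately.
by_year = {year: rows for year, rows in panel.groupby("Year")}

# Average return of each portfolio in each year.
summary = panel.groupby(["Year", "Portfolio"])["Return"].mean()
print(summary)
```

A different cutoff (for example top and bottom terciles) would only change the line that assigns `Portfolio`.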
So I have imported call center data from a csv file into R.
flows = read.csv("data.csv")
There are two important columns to me:
name
duration
I am trying to create a bar chart that calculates the average duration of the call for a group, which is divided up by the variable name. Essentially, the chart displays which types of calls have the highest average duration.
There are also about 50 different names, so if I could limit the chart to the top 5/10 that would be ideal. Sorry if this is a simple problem, appreciate any help in advance.
This should work
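library(dplyr)
# Average call duration per name, stored in a new data frame
avg_duration <- flows %>%
  group_by(name) %>%
  summarize(Mean = mean(duration, na.rm = TRUE))
After this, you probably want to sort it by mean duration in descending order and keep the first 5 rows.
avg_duration <- avg_duration[order(avg_duration$Mean, decreasing = TRUE), ]
avg_duration <- head(avg_duration, 5)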
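For comparison, the same group, average, sort, and keep-the-top-N steps can be sketched in Python with pandas. The sample rows are made up; only the column names `name` and `duration` come from the question, and `top_n` is a hypothetical parameter (set it to 5 or 10 on the real data).

```python
import pandas as pd

# Made-up call records standing in for data.csv.
flows = pd.DataFrame({
    "name": ["billing", "billing", "support", "sales", "support", "returns"],
    "duration": [120, 180, 300, 60, 500, 90],
})

top_n = 2  # hypothetical; use 5 or 10 for the real data

# Mean duration per call type; pandas skips missing values by default.
mean_duration = flows.groupby("name")["duration"].mean()

# Keep the call types with the highest averages, largest first.
top = mean_duration.nlargest(top_n)
print(top)
```

`top` is already in the order a bar chart needs, so it can be passed straight to a plotting call such as `top.plot.bar()` if matplotlib is installed.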
I'm working on a project on identifying the dynamics of sales. This is what part of my database looks like: http://imagizer.imageshack.us/a/img854/1958/zlco.jpg . There are three columns:
Product - the product group
Week - time since the product launch (in weeks), first 26 weeks
Sales_gain - how the sales of the product change by week
The database has 3302 observations = 127 time series
My aim is to cluster the time series into groups that show different sales dynamics. I used the k-medoids algorithm (after transforming the data with FFT/DWT), and I do not know how to present each cluster (the grouped time series) on a separate plot.
Can somebody tell me how I should do that?
Here is the code of clustering:
clustersalesGain = pam(t(salesGain), 8)
nazwy = as.character(nazwy)
cbind(nazwy,clustersalesGain$clustering)
I would like to present the output on different plots.
k-medoids returns actual data points as cluster centers.
Just visualize them the same way you visualize your data!
(And if you haven't been visualizing your data, you'd better work on that now.)
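To make "one plot per cluster" concrete, here is a minimal Python sketch under stated assumptions: the series names and values are made up, the labels play the role of `clustersalesGain$clustering`, and both helper functions (`group_by_cluster`, `plot_clusters`) are hypothetical. Plotting needs matplotlib, which is imported only inside the plotting helper.

```python
from collections import defaultdict


def group_by_cluster(names, labels):
    """Map each cluster label to the list of series names assigned to it."""
    groups = defaultdict(list)
    for name, label in zip(names, labels):
        groups[label].append(name)
    return dict(groups)


def plot_clusters(series_by_name, groups, path="clusters.png"):
    """Draw one panel per cluster, with every member series as its own line."""
    import matplotlib
    matplotlib.use("Agg")  # write to a file; no display required
    import matplotlib.pyplot as plt

    fig, axes = plt.subplots(1, len(groups), squeeze=False, sharey=True)
    for ax, (label, members) in zip(axes[0], sorted(groups.items())):
        for name in members:
            ax.plot(series_by_name[name], label=name)
        ax.set_title(f"Cluster {label}")
        ax.set_xlabel("Week")
    axes[0][0].set_ylabel("Sales_gain")
    fig.savefig(path)
    plt.close(fig)


# Made-up example: three short series and their cluster labels.
series_by_name = {"p1": [1, 2, 3], "p2": [1, 2, 4], "p3": [5, 3, 1]}
labels = [1, 1, 2]
groups = group_by_cluster(list(series_by_name), labels)
print(groups)
```

Calling `plot_clusters(series_by_name, groups)` then writes a figure with one panel per cluster. Plotting the original series rather than the FFT/DWT-transformed ones usually makes the panels easier to read.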