Get total (count) per month in Power BI
I have an 'issue' data set in CSV format that looks like this.
Date,IssueId,Type,Location
2019/11/02,I001,A,Canada
2019/11/02,I002,A,USA
2019/11/11,I003,A,Mexico
2019/11/11,I004,A,Japan
2019/11/17,I005,B,USA
2019/11/20,I006,C,USA
2019/11/26,I007,B,Japan
2019/11/26,I008,A,Japan
2019/12/01,I009,C,USA
2019/12/05,I010,C,USA
2019/12/05,I011,C,Mexico
2019/12/13,I012,B,Mexico
2019/12/13,I013,B,USA
2019/12/21,I014,C,USA
2019/12/25,I015,B,Japan
2019/12/25,I016,A,USA
2019/12/26,I017,A,Mexico
2019/12/28,I018,A,Canada
2019/12/29,I019,B,USA
2019/12/29,I020,A,USA
2020/01/03,I021,C,Japan
2020/01/03,I022,C,Mexico
2020/01/14,I023,A,Japan
2020/01/15,I024,B,USA
2020/01/16,I025,B,Mexico
2020/01/16,I026,C,Japan
2020/01/16,I027,B,Japan
2020/01/21,I028,C,Canada
2020/01/23,I029,A,USA
2020/01/31,I030,B,Mexico
2020/02/02,I031,B,USA
2020/02/02,I032,C,Japan
2020/02/06,I033,C,USA
2020/02/08,I034,C,Japan
2020/02/15,I035,C,USA
2020/02/19,I036,A,USA
2020/02/20,I037,A,Mexico
2020/02/22,I038,A,Mexico
2020/02/22,I039,A,Canada
2020/02/28,I040,B,USA
2020/02/29,I041,B,USA
2020/03/02,I042,A,Mexico
2020/03/03,I043,B,Mexico
2020/03/08,I044,C,USA
2020/03/08,I045,C,Canada
2020/03/11,I046,A,USA
2020/03/12,I047,B,USA
2020/03/12,I048,B,Japan
2020/03/12,I049,C,Japan
2020/03/13,I050,A,USA
2020/03/13,I051,B,Japan
2020/03/13,I052,A,USA
I'm interested in analyzing the count of issues, particularly across months and years. Simply plotting a chart of issues by date is pretty easy. But what if I want to calculate the total number of issues per month, plot it, and perhaps analyze trends? How would I go about calculating these totals per (say) month?
The best approach I could take so far is the following.
I create a new column, called YearMonth which looks like this:
YearMonth = FORMAT(Issues[Date],"YYYY/MM")
Then if I plot Axis = YearMonth vs Values = Count of IssueId, I get what I want.
But the biggest drawback here is that my X-axis is the newly created column, not the original Date column. Since my project has other data that I would like to analyze using the date as well, I would like for this to be using the actual Date instead of my custom column.
Is there a way for me to get this same result but without having to create a new column?
What you usually do is create a calendar table, which contains all the time-related columns (year, month, year-month, etc.), and then link it to your data by date.
In your visuals, you then use the "Calendar" table columns, without having to alter your original table. The calendar table can also be used by any other table that needs date-related data.
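As a rough illustration of the calendar-table idea outside Power BI, the sketch below builds one calendar row per day (with year, month and a YYYY/MM key), looks each issue date up in it, and counts issues per month. It is plain Python rather than DAX, it uses only a handful of rows from the sample data, and the names build_calendar and count_per_month are made up for this example.

```python
from collections import Counter
from datetime import date, datetime, timedelta

# A few rows from the sample CSV in the question (Date, IssueId).
ISSUES = [
    ("2019/11/02", "I001"),
    ("2019/11/26", "I008"),
    ("2019/12/01", "I009"),
    ("2019/12/29", "I020"),
    ("2020/01/03", "I021"),
]


def build_calendar(start: date, end: date) -> dict:
    """Return one row per day from start to end (inclusive), keyed by date."""
    calendar = {}
    day = start
    while day <= end:
        calendar[day] = {
            "Year": day.year,
            "Month": day.month,
            "YearMonth": day.strftime("%Y/%m"),
        }
        day += timedelta(days=1)
    return calendar


def count_per_month(issues, calendar) -> Counter:
    """Join each issue to the calendar by date and count issues per YearMonth."""
    counts = Counter()
    for raw_date, _issue_id in issues:
        day = datetime.strptime(raw_date, "%Y/%m/%d").date()
        counts[calendar[day]["YearMonth"]] += 1
    return counts


dates = [datetime.strptime(d, "%Y/%m/%d").date() for d, _ in ISSUES]
calendar = build_calendar(min(dates), max(dates))
monthly = count_per_month(ISSUES, calendar)
for year_month in sorted(monthly):
    print(year_month, monthly[year_month])
```

The issue rows are never modified; the month key lives only in the calendar. In Power BI the same role would be played by a separate date table (for example, one generated with the DAX CALENDAR function) related to Issues[Date].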
Related
r: How to manipulate the GGparcoord input column inside the function
I want to compare the week() of the year of two parallel date columns from two different years. I'm using the ggparcoord function and looking for a way to manipulate the dates in the two columns to be the week count of the specific date. I do not want to manipulate the table itself. My code is: ggparcoord(data, columns = 38:39) and I'm looking for something like ggparcoord(data, columns = week(38):week(39)) that actually works. In addition, if anyone knows how, I would be happy to learn how to use ggparcoord with column names instead of column numbers. Tnx!
group by class multiple category variables in R
I am extremely new to R and thus not familiar with the various packages. I am simply using the Soybean data from library(mlbench) data(Soybean), and I want to visualize in a table the CLASS factor (19 levels) by various categories (date, plant.stand, precip, etc.); there are 35 such variables. I want to show frequency, NAs and mode. In essence, each Class would be broken out by the various categories (date, plant.stand, precip, etc.) with the frequency data. I'm sure there must be a simple way, but I'm very new to R. Thanks for the help.
Update: As per the table below (table data), I want to basically count all the categorical data, i.e. (date, plant.stand, precip, etc.), and sort by the CLASS variable. The only way I can think of is to create a key for each factor level per categorical variable, count the occurrences of each key, and then sort. Is there perhaps an easier way?
Can't I use dates as axes in a scatter plot in SAS VA?
In Enterprise Guide, I draw scatter plots with the creation and closing dates of issues to detect when backlogs occur and when they are resolved. (The straight lines in the graph are batch interventions, like closing a set of issues that were handled outside of the system.)
proc sgplot data=alert;
  scatter x=create_Date y=CloseDate / group=CloseReason;
run;
When I try to do the same in SAS Visual Analytics, I can only put measures on the x-axis and y-axis, and I can't make the date or datetime variable a measure. Am I doing something wrong? Should I use another graph type?
My take is that the inability of SAS VA Explorer to allow dates to be measures is a real weakness. An old-school trick would perhaps be to create a duplicate data item that computes the SAS date value (giving you a numeric result and thus a measure) and then format it with a custom format to render it back as a human-readable date.
However, according to http://support.sas.com/kb/47/100.html#explorer
How SAS Visual Analytics Designer supports formats
In SAS Visual Analytics Designer, the Format property of the data item displays the name of the format for both numeric and character data items. However, there are some differences between numeric and character data items.
Numeric data items
You can change the format. If you change the format, you can restore the user-defined format by selecting Reset to Default in the Format type box.
You can specify to sort by formatted or unformatted values (release 6.2 and later).
(My bolds)
Numeric data items with a user-defined format are classified as categories. You cannot change these data items to measures while the user-defined format is applied.
According to support.sas.com/documentation/cdl/en/vaug/68648/PDF/default/vaug.pdf , page 166, you could work on defining data roles for a scatter plot. I am not sure that this could solve your situation but it says that: "In addition to measures, you can assign a Group variable. The Group variable groups the data based on the values of the category data item that you assign. A separate set of scatter points is created for each value of the group variable. You can add data items to the Data tips role. The values for the data items in the Data tips role are displayed in the data tips for the scatter plot". Hope it helps.
flexibly naming subsetted objects in R
I'm somewhat new to R, so I apologize in advance if the answer to this question is obvious. I have a very long data frame (only one variable) from which I want to create multiple objects from subsets within the data frame. The code to scrape the data and format it as data frame 'aa', with the variable named 'whatever':
aa <- data.frame(readLines("ftp://ftp.cmegroup.com/pub/settle/stlint"))
aa <- data.frame(aa[-1:-3,])
colnames(aa) <- "whatever"
I am looking to subset each section under a heading beginning with 'ZE' and ending with the last data row before the next 'ZE' or before the 'TOTAL'. So basically I want 36 objects (length(grep("ZE",aa$whatever[1:nrow(aa)]))=36), each starting with its respective 'ZE' title followed by (roughly) 70 rows of data, with each object identified by its title. For instance, I would want the first dataset (headed by row ZE MAR15 EURODOLLAR OPTIONS CALL) to be named some variant of 'March 2015 Calls', as I just need to denote the month, year, and whether the data is for calls or puts.
I can actually code this up in batch through a loop, but here's my problem: right now the first 'ZE' month is Mar15, i.e. March 2015, and the last 'ZE' month is Dec18, i.e. Dec 2018. This will change as time goes on, though, and I'm hoping to be able to name them automatically based on the first line, without tweaking the script when the months change for each contract. So is it possible to flexibly name each of these subsets based on the content of the header? Thanks
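One possible way to approach this, sketched in Python rather than R: derive each name from the header line itself and keep the pieces in a dictionary keyed by that name, so nothing has to be edited when the contract months roll. The header pattern is inferred from the single example in the question, the data rows are placeholders, and the helper names section_name and split_sections are invented for this sketch; the real file layout may differ.

```python
import re

MONTHS = {
    "JAN": "January", "FEB": "February", "MAR": "March", "APR": "April",
    "MAY": "May", "JUN": "June", "JUL": "July", "AUG": "August",
    "SEP": "September", "OCT": "October", "NOV": "November", "DEC": "December",
}

# Assumed header shape, based only on "ZE MAR15 EURODOLLAR OPTIONS CALL".
HEADER = re.compile(r"^ZE ([A-Z]{3})(\d{2}) .*\b(CALL|PUT)S?\s*$")


def section_name(line):
    """Turn a header line into a name such as 'March 2015 Calls', else None."""
    match = HEADER.match(line.strip())
    if match is None or match.group(1) not in MONTHS:
        return None
    month, year, kind = match.groups()
    # Assumption: two-digit years all belong to the 2000s.
    return f"{MONTHS[month]} 20{year} {kind.capitalize()}s"


def split_sections(lines):
    """Group lines under the most recent header; a TOTAL line closes a section."""
    sections = {}
    current = None
    for line in lines:
        name = section_name(line)
        if name is not None:
            current = name
            sections[current] = [line]  # each section starts with its title row
        elif line.startswith("TOTAL"):  # assumed marker, as described above
            current = None
        elif current is not None:
            sections[current].append(line)
    return sections


# Placeholder rows standing in for the real file contents.
lines = [
    "ZE MAR15 EURODOLLAR OPTIONS CALL",
    "row 1",
    "row 2",
    "TOTAL",
    "ZE DEC18 EURODOLLAR OPTIONS PUT",
    "row 3",
]
sections = split_sections(lines)
print(list(sections))
```

Looking a subset up by name (sections["March 2015 Calls"]) then replaces 36 separately named objects; a named list would be a comparable container in R.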
Count repeated values in a column
I have a column of year values by which I am sorting. I'd like to find the quantity per year (read: number of repeats of each year value). I'd like to chart said values. I'm not sure how to make this happen. I am using Apple's Numbers '08, but if possible a general solution that multiple people could use would be preferred.
You should use the countif() function: http://office.microsoft.com/en-us/excel/HP052090291033.aspx I did a similar thing to count how many hours of work there are for each upcoming version of my iPhone app. I was doing sumif(), but you just want countif(). See cells N4-N6 here: http://spreadsheets.google.com/ccc?key=0AhL0igVI9HVNdGpaS3U1cS1qOGVNd3h0Slg0a21vUWc&hl=en
On a new sheet, list the unique years in one column, then their quantity count in the column next to them. Select the entire range created, then create a chart. I'm unsure from your question what you would specifically need more than this (and I work in Excel 2003).
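For a general solution outside any particular spreadsheet, the same "unique years next to their counts" table can be sketched in a few lines of Python. The year values below are invented for illustration.

```python
from collections import Counter

# Hypothetical column of year values; any list of repeated values works.
years = [2007, 2008, 2008, 2009, 2008, 2007]

# Counter tallies every distinct value in one pass, which is what a
# COUNTIF per unique year would give you in a spreadsheet.
per_year = Counter(years)

# List the unique years next to their counts, ready to chart.
for year in sorted(per_year):
    print(year, per_year[year])
```

The resulting two-column listing has the same shape as the range described above and can be fed to whatever charting tool is at hand.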