This is my first time working in RStudio.
I have a database of 36 participants but it has 150600 entries.
There is a column for the participants.
A column for the probes: Activityprobe, Screenprobe, SMSprobe and CallLogprobe.
A column for the recorded value: activity level high/low/none, screen on/off, etc.
I need code that counts the activity levels across all participants: high activity, no activity and low activity.
I also need to find, for every participant, what percentage of their activity records are high, none and low.
For screenprobe I need to count how many times the participant turned their screen on and how many times they turned it off and the percentage of screen on/off.
For callLog I need to count how many times each participant got called and the percentage.
For SMS I need to count the number of SMS for each participant and their percentage.
I also need to categorize the probes, so that my database shows all the activity levels first, organized by none/high/low, and then all the screen probes, organized by on and off, etc.
I hope that my description is clear.
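A minimal sketch of the counting described above, written in Python rather than R and using only the standard library. The sample rows, the column order (participant, probe, value) and the participant IDs are assumptions for illustration, not the real data set.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows: (participant, probe, value).
rows = [
    ("P01", "Activityprobe", "high"),
    ("P01", "Activityprobe", "low"),
    ("P01", "Activityprobe", "high"),
    ("P01", "Activityprobe", "none"),
    ("P01", "Screenprobe", "on"),
    ("P01", "Screenprobe", "off"),
    ("P02", "Activityprobe", "none"),
    ("P02", "Screenprobe", "on"),
]

# Count each value within every (participant, probe) group.
counts = defaultdict(Counter)
for participant, probe, value in rows:
    counts[(participant, probe)][value] += 1

# Turn each count into (count, percentage of that group's total).
summary = {}
for key, counter in counts.items():
    total = sum(counter.values())
    summary[key] = {
        value: (n, 100.0 * n / total) for value, n in counter.items()
    }

# Print grouped by probe first, then participant, then value, so all
# activity rows appear together, followed by all screen rows.
ordered = sorted(summary.items(), key=lambda kv: (kv[0][1], kv[0][0]))
for (participant, probe), values in ordered:
    for value, (n, pct) in sorted(values.items()):
        print(f"{probe:14} {participant} {value:5} {n:3d} {pct:6.1f}%")
```

The same grouping idea (count per participant, probe and value, then divide by the group total) carries over to whatever tool is used on the full 150600 rows.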
I am about to start creating a date dimension table for a Data Warehouse project using SQL Server 2012.
One of the first user comments is that...
'Different customers will have a different first day of week, so not always a Monday'.
How would I accommodate potentially 7 different start-of-week days in a single dim table? Or should I simply calculate it the conventional way, on the fly, on a per-customer basis in a fact table and not use the dim date table?
Option 1. Calculate it on the fly using built-in date math functions. SQL Server defaults to Sunday as the first day of the week under the default US English language setting; SET DATEFIRST changes it per session.
Option 2. Create an additional column in your table for each day of the week indicating its day number of the week. For example column TuesdayFirst would have a 1 for every Tuesday and a 2 for each Wednesday.
Option 3. (Best) Create a view on your date dimension that calculates the additional columns for each day. Any of the columns that are not needed in the select will be ignored and not calculated. This gives the benefits of the persistent columns and the consistent calculation method, but does have some processing overhead versus pre-calculating.
If you choose option 3, do not use a CASE statement to calculate it. You must do it strictly with date math in order for it to perform decently when aggregating.
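As an illustration of what "strictly date math" can look like, here is a small Python sketch (not T-SQL) that derives the day number and the week start for any first day of the week with modular arithmetic instead of a branch per weekday. The function names and the `first_day` parameter are hypothetical; the 0 = Monday ... 6 = Sunday convention is Python's own.

```python
from datetime import date, timedelta

def day_of_week_number(d: date, first_day: int) -> int:
    """Return 1..7, where 1 is the customer's first day of the week.

    first_day follows Python's convention: 0 = Monday ... 6 = Sunday.
    """
    return (d.weekday() - first_day) % 7 + 1

def week_start(d: date, first_day: int) -> date:
    """Return the most recent date on or before d that falls on first_day."""
    return d - timedelta(days=(d.weekday() - first_day) % 7)

TUESDAY = 1
SUNDAY = 6

# 2024-01-02 is a Tuesday, 2024-01-03 a Wednesday.
tuesday_first = [
    day_of_week_number(date(2024, 1, 2), TUESDAY),
    day_of_week_number(date(2024, 1, 3), TUESDAY),
]
sunday_first = day_of_week_number(date(2024, 1, 1), SUNDAY)
```

With a Tuesday start, Tuesday gets 1 and Wednesday gets 2, matching the TuesdayFirst column in option 2. The same offset-and-modulo expression can be translated into a view column for each of the seven possible start days.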
I'm wondering how to accomplish this requirement.
I have to compare individual values with the average over the selected period or over another period.
I've collected millions of records in an index. These records contain the sell-out amount, day by day, for different vendors, products, sectors and product families.
What I'd like to do is compare any single value with the average of the selected period or the average of the same period of a previous year. I'd like to use Kibana to show the data to users.
How can I accomplish it?
Thanks
I have the following
dimension tables
item (item_id,name,category)
Store(store_id,location,region,city)
Date(date_id,day,month,quarter)
customer(customer_id,name,address,member_card)
fact tables
Sales(item_id,store_id,date_id,customer_id,unit_sold,cost)
My question is: if I want to find the average sales of a location for a month, should I add an average_sales column to the fact table? And if I want to find the sales made using the membership card, should I add a corresponding field to the fact table?
My understanding so far is that only countable measures should be in the fact table, so I guess membership_card should not go in the fact table.
Please let me know if I am wrong.
No, you should not add an average sales column to your fact table. It is a calculated value, and it is not at the same "grain" as the fact table.
Your sales fact table should be as granular as possible, so it should really be sales_order_line_items, one row per sales order line item.
You want to calculate the average sales of a given store for a given month...?
First, by "sales" do you mean "revenue" (total dollars in) or "quantity sold"?
Average daily revenue?
Average monthly revenue, by month?
If you have the store id, date, quantity sold (per line item) and unit price, then it's pretty easy to figure out.
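A rough sketch of that calculation, using Python's built-in sqlite3 module so it can be run anywhere. The table and column names follow the schema in the question (with the date dimension renamed to date_dim here); the sample rows are made up, and treating `cost` as a per-unit price is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE store (store_id INTEGER PRIMARY KEY, location TEXT);
CREATE TABLE date_dim (date_id INTEGER PRIMARY KEY, day INTEGER, month INTEGER);
CREATE TABLE sales (item_id INTEGER, store_id INTEGER, date_id INTEGER,
                    customer_id INTEGER, unit_sold INTEGER, cost REAL);

-- Hypothetical sample data.
INSERT INTO store VALUES (1, 'Downtown'), (2, 'Airport');
INSERT INTO date_dim VALUES (1, 1, 1), (2, 2, 1), (3, 1, 2);
INSERT INTO sales VALUES
    (10, 1, 1, 100, 2, 5.0),
    (11, 1, 2, 101, 1, 20.0),
    (10, 2, 1, 100, 4, 5.0),
    (10, 1, 3, 102, 3, 5.0);
""")

# Aggregate at query time: the fact table stays at its original grain and
# the average is derived per location and month (assumes cost = unit price).
query = """
SELECT s.location,
       d.month,
       SUM(f.unit_sold * f.cost) AS revenue,
       AVG(f.unit_sold * f.cost) AS avg_revenue_per_row
FROM sales AS f
JOIN store AS s ON s.store_id = f.store_id
JOIN date_dim AS d ON d.date_id = f.date_id
GROUP BY s.location, d.month
ORDER BY s.location, d.month
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Sales made with a membership card could be filtered the same way, by joining to the customer dimension and testing member_card there, without adding a column to the fact table.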
You should not add aggregate columns to the same fact table. The measures in a fact table should all be at the same grain, so if you want aggregate metrics, build a separate fact table at the required grain.
So, I might have a fact aggregate table named F_LOC_MON_AGG which has the measures aggregated at location and month level.
If you do not have aggregate tables, modern business intelligence tools such as OBIEE can do the aggregation at run time.
Vijay
I've made a SQL report that needs to do a bunch of different things. My issue is that, as you can see in the picture, I grouped patients because several of them had multiple discharges, and I need a count of total patients discharged. I have several other counts, but when I right-click and insert a summary, it doesn't give me the option to select a group to count by. Is there a way to insert a count by the patient group?
Ryan's answer was correct. I just did a distinct count.
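To show why a distinct count gives the right number here, a tiny Python sketch with made-up discharge records (the patient IDs and dates are hypothetical):

```python
# Hypothetical discharge records: (patient_id, discharge_date).
discharges = [
    ("A", "2024-01-03"),
    ("A", "2024-02-10"),
    ("B", "2024-01-15"),
    ("C", "2024-03-01"),
    ("C", "2024-03-20"),
]

# A plain count includes every discharge, so repeat patients are counted twice.
total_discharges = len(discharges)

# A distinct count collapses repeats: each patient is counted once.
distinct_patients = len({patient_id for patient_id, _ in discharges})

print(total_discharges, distinct_patients)
```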