Dimensional Date Table with multiple first day of weeks - datetime

I am about to start creating a dimensional date table for a Data Warehouse project using SQL Server 2012.
One of the first user comments was:
'Different customers will have a different first day of week, so not always a Monday'.
How would I accommodate potentially 7 different start-of-week days in a single dim table? Or should I simply calculate it the conventional way, on the fly, on a per-customer basis in a fact table and not use the dim date table?

Option 1. Calculate it on the fly using built-in date math functions. SQL Server defaults to Sunday as the first day of the week (under the default us_english language setting; SET DATEFIRST changes it for a session).
Option 2. Create an additional column in your table for each day of the week indicating its day number of the week. For example column TuesdayFirst would have a 1 for every Tuesday and a 2 for each Wednesday.
Option 3. (Best) Create a view on your date dimension that calculates the additional columns for each day. Any of the columns that are not needed in the select will be ignored and not calculated. This gives the benefits of the persistent columns and the consistent calculation method, but does have some processing overhead versus pre-calculating.
If you choose option 3, do not use a CASE statement to calculate it. You must do it strictly with date math in order for it to perform decently when aggregating.
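The "strict date math" in the answer can be illustrated with modulo arithmetic. The following is a minimal sketch in Python rather than T-SQL; the function names and the Monday=0 convention are assumptions for illustration, not part of the original answer. The same subtraction-and-modulo idea should carry over to a SQL view.

```python
from datetime import date, timedelta

# Assumption: weekdays follow Python's convention, Monday=0 .. Sunday=6.
TUESDAY = 1

def day_of_week_number(d, first_day):
    """Return 1..7, where 1 is the customer's chosen first day of the week."""
    return (d.weekday() - first_day) % 7 + 1

def week_start(d, first_day):
    """Return the date on which the week containing d begins."""
    return d - timedelta(days=(d.weekday() - first_day) % 7)

# With a Tuesday-first week, Tuesday is day 1 and Wednesday is day 2,
# matching the TuesdayFirst column described in option 2.
print(day_of_week_number(date(2024, 1, 2), TUESDAY))  # a Tuesday
print(day_of_week_number(date(2024, 1, 3), TUESDAY))  # a Wednesday
```

No branching is involved, which is the property the answer asks for when it rules out CASE.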

Related

Tracking total meeting attendance across participant start and end times

I have attendance data for a virtual meeting. This includes the employee name, their direct manager, and their respective join/leave timestamps. Some employees come and go multiple times (see rows 5 and 6).
I'd like to create a visualization showing attendance (as a percentage of total attendees) over the course of the meeting session, beginning to end.
To do this, I figured I'd create a table with the first column being a sequence of times starting from the min(join_time) and ending with the max(leave_time) broken down in 30 second intervals. Then add summary columns that count the total instances where that time falls in an employee's join/leave time like this...
I'm working in R within the tidyverse so if there's a dplyr, lubridate solution that'd be ideal.
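The approach described in the question (a 30-second time grid, then a count of who is present at each tick) can be sketched as follows. This is a Python illustration, not the dplyr/lubridate solution the asker wants; the tuple layout and the function name are assumptions. An employee who joins several times is counted once per tick.

```python
from datetime import datetime, timedelta

def attendance_share(sessions, step_seconds=30):
    """sessions: list of (employee, join_time, leave_time) tuples.

    Returns a list of (tick, fraction of all distinct attendees present).
    """
    start = min(join for _, join, _ in sessions)
    end = max(leave for _, _, leave in sessions)
    total = len({name for name, _, _ in sessions})
    result = []
    tick = start
    while tick <= end:
        # A set ensures an employee with overlapping rows is counted once.
        present = {name for name, join, leave in sessions if join <= tick <= leave}
        result.append((tick, len(present) / total))
        tick += timedelta(seconds=step_seconds)
    return result

sessions = [
    ("A", datetime(2024, 1, 1, 10, 0, 0), datetime(2024, 1, 1, 10, 2, 0)),
    ("B", datetime(2024, 1, 1, 10, 0, 30), datetime(2024, 1, 1, 10, 0, 45)),
    ("B", datetime(2024, 1, 1, 10, 1, 30), datetime(2024, 1, 1, 10, 2, 0)),
]
for tick, share in attendance_share(sessions):
    print(tick.time(), share)
```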

How to code a revenue forecasting model in R for every certain value?

I have a data table that has a column for the fiscal quarter, a column for the net revenue made for row X's sale, and a column for the type of sale it was.
I want to use a forecasting method in R (was planning to use ARIMA, but am open to options) to predict future fiscal quarter net revenue per type of sale. For example, if the two types of sale are service and good, I want to create a model to predict future revenue for service and a model for good's future net revenue.
How would I approach this code, and are there any websites you'd recommend I reference? Thank you in advance!
The websites I have found so far assume that every timestamp (i.e. every fiscal quarter) has only one row. But in my data table, quarter 1, for example, can have 10 different sales, of which 5 are labelled service and 5 are labelled good.
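One reading of the problem is that the table first needs collapsing to one row per fiscal quarter per sale type, so that each type becomes an ordinary single-row-per-period series. A minimal Python sketch of that step follows; the row layout and quarter labels are assumptions, and in R the same grouping could be done with dplyr's group_by and summarise before fitting one model per type.

```python
from collections import defaultdict

def revenue_series_by_type(rows):
    """rows: iterable of (fiscal_quarter, sale_type, net_revenue).

    Returns {sale_type: {fiscal_quarter: total_net_revenue}} with quarters sorted.
    """
    series = defaultdict(lambda: defaultdict(float))
    for quarter, sale_type, revenue in rows:
        series[sale_type][quarter] += revenue
    return {t: dict(sorted(q.items())) for t, q in series.items()}

# Hypothetical sample data: several sales per quarter, two sale types.
rows = [
    ("2023Q1", "service", 100.0),
    ("2023Q1", "good", 50.0),
    ("2023Q1", "service", 25.0),
    ("2023Q2", "good", 75.0),
    ("2023Q2", "service", 10.0),
]
print(revenue_series_by_type(rows))
```

Each inner dictionary is then one time series that a forecasting method can be fitted to separately.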

Find difference in days between two date fields in Infopath

Could someone help me determine the difference in days between two date fields in InfoPath forms?
The usual subtraction, like DateField1 - DateField2, doesn't work with date fields. Any code or no-code solution is highly appreciated.
There are ways to get difference of dates in InfoPath, but they are very complex and involve writing rules and parsing the date into month/day/year. Instead, I recommend this method described elsewhere that uses Excel Services. Because Excel is excellent at calculations, it makes sense to write the calculations in Excel and call the Excel document from InfoPath (if you have SharePoint with Excel Services).
Here are 2 sets of instructions on how to set up InfoPath and Excel Services. The instructions are long and/or copyrighted so I cannot include them here, but to summarize: you would set up new Data Connections in InfoPath that use web services (SOAP) to open the Excel document, set the date fields based on your InfoPath date fields, and retrieve the calculated value from Excel.
Calculating date differences in InfoPath using SharePoint Excel Services
InfoPath and Excel Services
It took me about an hour to get it working because I had to do some trial & error with the Trusted Location settings.
I use a separate SharePoint list to help calculate the number of days between two dates in InfoPath. This lets me account for leap years and gives me the ability to count only work days, not all days. I update this list with new data once a year.
Here is the Excel file source for the list containing 2018-2020 data: https://1drv.ms/x/s!ApLhBloaS1wVgsUOMRrRfbekFftY9Q
Steps:
1. Import the first worksheet of the above Excel file as a new list in SharePoint.
2. Add a receive data connection to this list from your InfoPath form.
3. For the first date, create an action to convert the date to a number in YYYYMMDD format. Assuming the date is stored as DateTime, you can use this formula:
floor(number(translate(substring-before(../my:endDate, "T"), "-", "")))
4. Query the list for the number value of the first date. Store this number in a field (column) in your form. (This field does not need to show in your form.)
5. Repeat steps 3 & 4 for the second date.
6. Subtract the first date number from the second and store the result in a third column.
Note: The Excel file uses the formula "NETWORKDAYS" and includes columns for weekdays, weekdays minus US Federal holidays, and weekdays minus NYSE holidays. Now you can get the number of work days between two dates using one of these columns. If you live outside the US, you could add a column to the Excel file for other holidays, such as UK bank holidays.
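The lookup idea in the steps above can be sketched in Python. The layout of the real Excel file is not shown here, so build_lookup is a hypothetical stand-in that assumes the list stores a running day count and a running weekday count per date key. The sketch also shows why the YYYYMMDD keys themselves cannot simply be subtracted.

```python
from datetime import date, timedelta

def date_key(iso_datetime):
    """Same idea as the XPath formula: '2019-02-25T00:00:00' -> 20190225."""
    return int(iso_datetime.split("T")[0].replace("-", ""))

def build_lookup(start, end):
    """Hypothetical stand-in for the SharePoint list.

    Maps each date key to (running day count, running weekday count).
    """
    lookup = {}
    days = weekdays = 0
    current = start
    while current <= end:
        days += 1
        if current.weekday() < 5:  # Monday..Friday
            weekdays += 1
        lookup[int(current.strftime("%Y%m%d"))] = (days, weekdays)
        current += timedelta(days=1)
    return lookup

lookup = build_lookup(date(2019, 1, 1), date(2019, 12, 31))
first = date_key("2019-02-25T00:00:00")
second = date_key("2019-03-04T00:00:00")

print(second - first)                          # key difference, not a day count
print(lookup[second][0] - lookup[first][0])    # calendar days between the dates
print(lookup[second][1] - lookup[first][1])    # weekdays between the dates
```

Holidays are ignored in this sketch; the original list handles them through additional columns.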

dimensional data modelling design - Data warehouse

I have the following tables.
Dimension tables:
item(item_id, name, category)
Store(store_id, location, region, city)
Date(date_id, day, month, quarter)
customer(customer_id, name, address, member_card)
Fact table:
Sales(item_id, store_id, date_id, customer_id, unit_sold, cost)
My question: if I want to find the average sales of a location for a month, should I add an average_sales column to the fact table? And if I want to find sales made using the membership card, should I add a corresponding field to the fact table?
My understanding so far is only countable measures should be in fact table so I guess membership_card should not come in fact table.
Please let me know if I am wrong.
No, you should not add an average sales column to your fact table. It is a calculated value, and it is not at the same "grain" as the fact table.
Your sales fact table should be as granular as possible, so it should really be sales_order_line_items, one row per sales order line item.
You want to calculate the average sales of a given store for a given month...?
First, by "sales" do you mean "revenue" (total dollars in) or "quantity sold"?
Average daily revenue?
Average monthly revenue, by month?
If you have the store id, date, quantity sold (per line item) and unit price, then it's pretty easy to figure out.
You should not add aggregate columns to the same fact table. The measures in a fact table should all be at the same grain. So if you want aggregate metrics, build a separate fact table at the required grain.
So, I might have a fact aggregate table named F_LOC_MON_AGG which has the measures aggregated at location and month level.
If you do not have aggregate tables, modern business intelligence tools such as OBIEE can do the aggregation at run time.
Vijay
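Both answers amount to deriving the average at query time from granular fact rows. A minimal Python sketch of that calculation follows, using the Sales, Store and Date tables from the question. It assumes, purely for illustration, that cost holds a unit price and that revenue is unit_sold * cost; the function name and sample rows are hypothetical.

```python
from collections import defaultdict

def revenue_by_location_month(sales, stores, dates):
    """sales: list of dicts with store_id, date_id, unit_sold, cost.
    stores: store_id -> location; dates: date_id -> month label.

    Returns {(location, month): {"total": ..., "average_per_row": ...}}.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in sales:
        key = (stores[row["store_id"]], dates[row["date_id"]])
        # Assumption: cost is a unit price, so revenue = quantity * price.
        totals[key] += row["unit_sold"] * row["cost"]
        counts[key] += 1
    return {
        key: {"total": totals[key], "average_per_row": totals[key] / counts[key]}
        for key in totals
    }

stores = {1: "Downtown"}
dates = {20240105: "2024-01", 20240120: "2024-01", 20240203: "2024-02"}
sales = [
    {"store_id": 1, "date_id": 20240105, "unit_sold": 2, "cost": 10.0},
    {"store_id": 1, "date_id": 20240120, "unit_sold": 1, "cost": 40.0},
    {"store_id": 1, "date_id": 20240203, "unit_sold": 3, "cost": 5.0},
]
print(revenue_by_location_month(sales, stores, dates))
```

The fact rows stay at their original grain; only the result of the grouping is aggregated.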

Week number and ISO 8601 for Pageview counters

I need to store page views for specific products by week, day and year. I thought about using the following tables:
Weekly(id, product_id, week_number, year, total_page_views (counter))
Daily(product_id, total_page_views)
[I won't keep archive data for the daily table, and yearly totals will be calculated from the weekly table.]
My question is actually about the numerical week value. I am wondering which week number I should use. I've read that ISO 8601 is one option and that using a .NET DateTime function to get the week is another. I need your help to know which one I should use and whether there is a better way to optimize the table.
In the end, I want to show the top viewed products for the previous week, this week, the previous year, etc.
I am targeting my app at users in the US and Europe, where the week starts on Monday. I'm developing my application in C#/ASP.NET/MySQL/Entity Framework. Thanks.
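For the ISO 8601 option, one detail matters for the year column: days around New Year can belong to a week of the adjacent year, so the week number is only unambiguous when paired with the ISO year rather than the calendar year. A minimal Python sketch follows, using the standard library as a stand-in for the C#/MySQL stack in the question; the function name is hypothetical.

```python
from datetime import date

def iso_year_week(d):
    """Return (ISO year, ISO week number); ISO weeks start on Monday."""
    iso = d.isocalendar()
    return iso[0], iso[1]

# 31 December 2018 is a Monday and already belongs to ISO week 1 of 2019.
print(iso_year_week(date(2018, 12, 31)))
# 1 January 2021 is a Friday and still belongs to ISO week 53 of 2020.
print(iso_year_week(date(2021, 1, 1)))
```

Under this scheme the Weekly table's year column would hold the ISO year, so a (year, week_number) pair always identifies exactly one Monday-to-Sunday week.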
