I need help creating a single SELECT statement, as part of a CREATE VIEW statement, that contains multiple, separate filtering or grouping requirements.
I am working on an SQLite database to track usage of our local food pantry, where we have two types of visitors, “Scheduled” or “Drop-In”, visiting on different days. One of the central tables is the “visit_log” table that tracks each visit by date, time, type of visit, and people in the household.
I’m trying to create a VIEW that summarizes the “visit_log” table grouped by visit_date. For both the number of records and the SUM of household size, it should display the figure for “Drop-In”, the figure for “Scheduled”, and the total of the two types.
Here is the “visit_log” table definition:
CREATE TABLE "visit_log" ("visit_date" DATE, "visit_time" TIME, "client_relation" TEXT, "household_size" INTEGER)
Here is a sample of the “visit_log” table’s content. (We have not started recording the visit_time yet, so those values are blank).
"visit_date","visit_time","client_relation","household_size"
"6/9/20","","Scheduled","1"
"6/9/20","","Scheduled","1"
"6/9/20","","Drop-In","2"
"6/9/20","","Drop-In","3"
"6/9/20","","Drop-In","8"
"6/9/20","","Drop-In","5"
"6/16/2020","","Scheduled","1"
"6/16/2020","","Scheduled","1"
"6/16/2020","","Drop-In","4"
"6/16/2020","","Drop-In","5"
"6/16/2020","","Drop-In","2"
"6/16/2020","","Drop-In","2"
"6/16/2020","","Drop-In","5"
"6/16/2020","","Drop-In","1"
I can create three separate VIEWs, one for each type and one for the two combined. But my goal is to have the results of these three VIEWs in one.
Here are the three VIEWs. First is for the two client types combined.
CREATE VIEW "visit_summary" AS SELECT
visit_date,
COUNT (*) AS households_total,
SUM (household_size) AS individuals_total
FROM
"visit_log"
GROUP BY visit_date
This yields
"visit_date","households_total","individuals_total"
"06/09/2020","12","44"
"06/16/2020","8","21"
"06/23/2020","7","20"
"06/30/2020","10","22"
"07/07/2020","7","18"
Next is the VIEW for the Drop-Ins
CREATE VIEW "visit_summary_dropin" AS SELECT
visit_date,
COUNT (*) AS households_dropin,
SUM (household_size) AS individuals_dropin
FROM
"visit_log"
WHERE client_relation = 'Drop-In'
GROUP BY visit_date
This yields
"visit_date","households_dropin","individuals_dropin"
"06/09/2020","10","42"
"06/16/2020","6","19"
"06/23/2020","4","13"
"06/30/2020","6","12"
"07/07/2020","6","16"
Finally, here is the VIEW for the Scheduled visits
CREATE VIEW "visit_summary_scheduled" AS SELECT
visit_date,
COUNT (*) AS households_schedualed,
SUM (household_size) AS individuals_scheduled
FROM
"visit_log"
WHERE client_relation = 'Scheduled'
GROUP BY visit_date
This yields
"visit_date","households_schedualed","individuals_scheduled"
"06/09/2020","2","2"
"06/16/2020","2","2"
"06/23/2020","3","7"
"06/30/2020","4","10"
"07/07/2020","1","2"
What I'm hoping to create is a single VIEW that yields
"visit_date","households_total","individuals_total","households_dropin","individuals_dropin","households_schedualed","individuals_scheduled"
"06/09/2020","12","44","10","42","2","2"
etc…
So my ultimate question, finally, is how to create a single VIEW containing something like multiple WHERE clauses to define different columns?
You can do it with conditional aggregation:
CREATE VIEW visit_summary_scheduled_all AS
SELECT visit_date,
COUNT(*) households_total,
SUM(household_size) individuals_total,
SUM(client_relation = 'Drop-In') households_dropin,
SUM(CASE WHEN client_relation = 'Drop-In' THEN household_size END) individuals_dropin,
SUM(client_relation = 'Scheduled') households_scheduled,
SUM(CASE WHEN client_relation = 'Scheduled' THEN household_size END) individuals_scheduled
FROM visit_log
GROUP BY visit_date
Results:
| visit_date | households_total | individuals_total | households_dropin | individuals_dropin | households_scheduled | individuals_scheduled |
| ---------- | ---------------- | ----------------- | ----------------- | ------------------ | -------------------- | --------------------- |
| 6/16/2020 | 8 | 21 | 6 | 19 | 2 | 2 |
| 6/9/20 | 6 | 20 | 4 | 18 | 2 | 2 |
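To check the conditional aggregation against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The view name visit_summary_all is made up for this example, and the ELSE 0 branches are my own addition (without them, a date with no rows of one type would show NULL instead of 0 in the individuals columns).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visit_log ("
    "visit_date DATE, visit_time TIME, client_relation TEXT, household_size INTEGER)"
)

# Sample rows copied from the question.
rows = [
    ("6/9/20", "", "Scheduled", 1),
    ("6/9/20", "", "Scheduled", 1),
    ("6/9/20", "", "Drop-In", 2),
    ("6/9/20", "", "Drop-In", 3),
    ("6/9/20", "", "Drop-In", 8),
    ("6/9/20", "", "Drop-In", 5),
    ("6/16/2020", "", "Scheduled", 1),
    ("6/16/2020", "", "Scheduled", 1),
    ("6/16/2020", "", "Drop-In", 4),
    ("6/16/2020", "", "Drop-In", 5),
    ("6/16/2020", "", "Drop-In", 2),
    ("6/16/2020", "", "Drop-In", 2),
    ("6/16/2020", "", "Drop-In", 5),
    ("6/16/2020", "", "Drop-In", 1),
]
conn.executemany("INSERT INTO visit_log VALUES (?, ?, ?, ?)", rows)

# visit_summary_all is a hypothetical view name chosen for this sketch.
conn.execute("""
CREATE VIEW visit_summary_all AS
SELECT visit_date,
       COUNT(*)            AS households_total,
       SUM(household_size) AS individuals_total,
       -- a comparison evaluates to 1 or 0 in SQLite, so SUM counts matching rows
       SUM(client_relation = 'Drop-In') AS households_dropin,
       SUM(CASE WHEN client_relation = 'Drop-In' THEN household_size ELSE 0 END)
           AS individuals_dropin,
       SUM(client_relation = 'Scheduled') AS households_scheduled,
       SUM(CASE WHEN client_relation = 'Scheduled' THEN household_size ELSE 0 END)
           AS individuals_scheduled
FROM visit_log
GROUP BY visit_date
""")

# The dates are stored as text, so ORDER BY sorts them as strings.
summary = conn.execute(
    "SELECT * FROM visit_summary_all ORDER BY visit_date"
).fetchall()
for row in summary:
    print(row)
```

Note that the dates are stored as text in two different formats ("6/9/20" and "6/16/2020"), so they sort as strings, not chronologically.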
A user wants a count of unique sessions per week in Application Insights. I have the query working, including a pivot, but the Week columns are out of order. I would prefer them to be in order.
pageViews
| where timestamp < now()
| summarize Sessions= dcount(session_Id)
by Week=bin(datepart("weekOfYear", timestamp), 1), user_AuthenticatedId
| order by Week
| evaluate pivot(Week, sum(Sessions))
| join kind=innerunique (pageViews
| summarize MostRecentRequest = max(timestamp) by user_AuthenticatedId)
on $right.user_AuthenticatedId == $left.user_AuthenticatedId
| project-away user_AuthenticatedId1
I've tried ordering by timestamp before the summarize, and ordering by week after the summarize (still in there) and no luck.
There's currently a "trick" that will work: add serialize right after your order by:
pageViews
| where timestamp < now()
| where isnotempty(user_AuthenticatedId)
| summarize Sessions= dcount(session_Id)
by Week=bin(datepart("weekOfYear", timestamp), 1), user_AuthenticatedId
| order by Week
| serialize // <--------------------------------- RIGHT HERE
| evaluate pivot(Week, sum(Sessions))
| join kind=innerunique (pageViews
| summarize TotalSessions=dcount(session_Id), MostRecentRequest = max(timestamp) by user_AuthenticatedId)
on $right.user_AuthenticatedId == $left.user_AuthenticatedId
| project-away user_AuthenticatedId1
| top 100 by TotalSessions desc
This gets me the result in Workbooks with the weeks in descending order. I also added a total session count to sort/top by, with some custom column settings.
For the custom column settings in Workbooks: delete all the #'d columns that are there by default and add one for ^[0-9]+$ set to heatmap.
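As a loose analogy only (this is plain Python, not Kusto, and it makes no claim about how the pivot plugin is implemented), here is why fixing the row order before pivoting can fix the column order: if columns are created in the order their values are first seen, sorted input gives sorted columns. The rows and the pivot_by_week helper are made up for illustration.

```python
from collections import defaultdict

# Hypothetical (week, user, sessions) rows standing in for the summarize output.
rows = [
    (27, "alice", 3),
    (25, "bob", 1),
    (26, "alice", 2),
    (25, "alice", 4),
]


def pivot_by_week(rows):
    """Pivot rows into {user: {week: sessions}}; also return weeks in first-seen order."""
    columns = []  # week columns, in the order they are first encountered
    table = defaultdict(dict)
    for week, user, sessions in rows:
        if week not in columns:
            columns.append(week)
        table[user][week] = table[user].get(week, 0) + sessions
    return columns, dict(table)


# Pivoting the rows as they arrive leaves the week columns in arrival order.
unsorted_cols, _ = pivot_by_week(rows)

# Sorting first (descending, like Kusto's default for order by) fixes the column order.
sorted_cols, table = pivot_by_week(sorted(rows, key=lambda r: r[0], reverse=True))

print(unsorted_cols)
print(sorted_cols)
```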
I refactored the query a bit for my own comprehension. I pulled the left and right sides of the join into "views". Thought I'd share.
let users_MostRecent_Session =
pageViews
| summarize
TotalSessions=dcount(session_Id)
, MostRecentRequest = max(timestamp)
by
user_AuthenticatedId
;
//
let users_sessions_ByWeek =
pageViews
| where timestamp < now()
| where isnotempty(user_AuthenticatedId)
| summarize
Sessions= dcount(session_Id)
by
Week=bin(datepart("weekOfYear", timestamp), 1)
, user_AuthenticatedId
| order by Week
| serialize
| evaluate pivot(Week, sum(Sessions))
;
//
//
users_sessions_ByWeek
| join kind=innerunique
users_MostRecent_Session
on user_AuthenticatedId
| project-away user_AuthenticatedId1
| top 100 by TotalSessions desc
I want to do the equivalent of the following SQL query (roughly):
SELECT
Name,
application_Version,
Rank() OVER (PARTITION BY application_Version ORDER BY CountOfEventNamePerVersion)
FROM
customEvents
Assume I can get the CountOfEventNamePerVersion field easily. I want to do the same using AIQL, but I haven't been able to. Here's a query that I tried:
customEvents
| summarize count() by name, application_Version
| project name, application_Version, count_
| summarize x = count(count_) by application_Version
| where x = count_
Basically I want to get the most common Name per application_Version. How can I do this?
arg_max should do the trick:
customEvents
| summarize count() by Name, application_Version
| summarize arg_max(count_, Name) by application_Version
| order by application_Version
| project application_Version, Name=max_count__Name
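If it helps to see the two summarize steps spelled out, here is a small Python sketch of the same logic: count events per (name, version), then keep the name with the highest count for each version. The event rows are invented for illustration, and ties simply go to whichever name was counted first, which may differ from what arg_max does.

```python
from collections import Counter

# Hypothetical (name, application_Version) pairs standing in for customEvents rows.
events = [
    ("Login", "1.0"), ("Login", "1.0"), ("Search", "1.0"),
    ("Search", "2.0"), ("Search", "2.0"), ("Login", "2.0"), ("Export", "2.0"),
]

# Step 1: the equivalent of "summarize count() by name, application_Version".
counts = Counter(events)

# Step 2: the equivalent of "summarize arg_max(count_, name) by application_Version":
# for each version, keep the name whose count is largest.
best = {}
for (name, version), n in counts.items():
    if version not in best or n > best[version][1]:
        best[version] = (name, n)

# Step 3: order by version and keep only the name.
most_common = {version: best[version][0] for version in sorted(best)}
print(most_common)
```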
I have an SQLite database with some time series data:
holdings:
| id | date | instrument | position | price | portfolio | sector |
prices:
| id | date | instrument | open | high | low | close | adjclose |
static_data
| id | ticker | name | sector | industry | country | currency |
and I'd like to get the holdings for a particular day with the change in price on that day as a calculated field.
I've tried the following query
SELECT h.date,
h.portfolio,
h.instrument,
s.name,
h.position,
p.adjclose AS curpx,
(p.adjclose AS lastpx WHERE
h.date = "2013-01-10 00:00:00" AND
h.instrument = p.instrument)
FROM holdings AS h,
static_data AS s,
prices AS p
WHERE h.date = "2013-01-11 00:00:00"
AND h.portfolio = "usequity"
AND (h.instrument = p.instrument)
AND (h.date = p.date)
AND (h.instrument = s.ticker);
but I get a syntax error.
[2014-11-14 06:11:04] [1] [SQLITE_ERROR] SQL error or missing database (near "as": syntax error)
I'm a complete N00b at SQL so I'd like to know how I can get two sets of data from the same table and show them side by side or to perform a calculation using one against the other in SQL.
Thanks
You want a correlated subquery:
SELECT ...,
p.adjclose AS curpx,
(SELECT p2.adjclose
FROM prices AS p2
WHERE p2.date = datetime(h.date, '-1 days')
AND p2.instrument = h.instrument
) AS lastpx
FROM ...
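Here is a runnable sketch of the full query using Python's sqlite3 module, with the elided parts filled in from the question and rewritten with explicit JOINs. The tables are cut down to the columns the query touches, and the single instrument, its name and its prices are made-up sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Reduced versions of the question's tables; all rows are hypothetical.
CREATE TABLE holdings    (date TEXT, instrument TEXT, position REAL, portfolio TEXT);
CREATE TABLE prices      (date TEXT, instrument TEXT, adjclose REAL);
CREATE TABLE static_data (ticker TEXT, name TEXT);

INSERT INTO holdings    VALUES ('2013-01-11 00:00:00', 'ABC', 100, 'usequity');
INSERT INTO prices      VALUES ('2013-01-10 00:00:00', 'ABC', 10.0);
INSERT INTO prices      VALUES ('2013-01-11 00:00:00', 'ABC', 10.5);
INSERT INTO static_data VALUES ('ABC', 'Example Corp');
""")

query = """
SELECT h.date, h.portfolio, h.instrument, s.name, h.position,
       p.adjclose AS curpx,
       -- correlated subquery: look up the previous calendar day's price
       (SELECT p2.adjclose
          FROM prices AS p2
         WHERE p2.date = datetime(h.date, '-1 days')
           AND p2.instrument = h.instrument) AS lastpx
  FROM holdings AS h
  JOIN prices AS p      ON p.instrument = h.instrument AND p.date = h.date
  JOIN static_data AS s ON s.ticker = h.instrument
 WHERE h.date = '2013-01-11 00:00:00'
   AND h.portfolio = 'usequity'
"""
rows = conn.execute(query).fetchall()

# The day's price change can then be computed from the two price columns.
for date, portfolio, instrument, name, position, curpx, lastpx in rows:
    print(instrument, curpx, lastpx, curpx - lastpx)
```

One caveat: '-1 days' is the previous calendar day. If the prices table only has rows for trading days, lastpx will be NULL for a Monday, so you may need to look up the latest earlier date instead.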
How do I populate Dim_tbls from a relational source?
These example tables are given:
tbl_sales: id_sales, fk_id_customer, fk_id_product, country, timestamp
tbl_customer: id_customer, name, adress, zip, city
tbl_product: id_product, price, product
My goal is to get these attributes into a star schema. The problem I have is the logic behind loading the dimension tables. I mean, what data would I load into Dim_Product? All the products that are in tbl_product? But how would I know how many sales are done with a specific product?
The analyses I would like to do are:
How many people bought product x.
How many sales are made from city x.
How many sales were made between Time x and y.
Example data:
tbl_sales:
id_sales | fk_id_customer | fk_id_product | country | timestamp
1 | 2 | 1 | UK | 19.11.2013 10:23:22
2 | 1 | 2 | FR | 20.11.2013 06:04:22
tbl_customer:
id_customer | name | adress | zip | city
1 | Frank | Street X | 211 | London
2 | Steve | Street Y | 431 | Paris
tbl_product:
id_product | price | product
1 | 100,00 | Hammer
2 | 50,00 | Saw
Let's start with a very simple star schema model; for example, I assumed you don't need to worry about handling changes to dimensions' attributes.
factSales
DateKey
CustomerKey
ProductKey
Counter (=1; this is a factless fact table)
dimDate
DateKey
Date
Year
Quarter
Month
...
dimCustomer
CustomerKey
Name
Address
Zip
City
dimProduct
ProductKey
Name
Price (if it changes, you need to move it to factSales)
How many people bought product x.
SELECT COUNT(DISTINCT CustomerKey)
FROM factSales
WHERE ProductKey IN ( SELECT ProductKey
FROM dimProduct
WHERE Name = 'Product X' )
How many sales are made from city x.
SELECT SUM(Counter)
FROM factSales
WHERE CustomerKey IN ( SELECT CustomerKey
FROM dimCustomer
WHERE City = 'City X' )
How many sales were made between Time x and y.
SELECT SUM(Counter)
FROM factSales
WHERE DateKey IN ( SELECT DateKey
FROM dimDate
WHERE Date BETWEEN DateX AND DateY )
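To make the loading step concrete, here is a minimal sketch in Python with an in-memory SQLite database. It rests on several assumptions that are mine and not part of the model above: the source ids are reused as dimension keys (a real warehouse would usually generate surrogate keys), DateKey is an integer of the form YYYYMMDD, and the example timestamps are rewritten in ISO format so SQLite's date functions can parse them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Relational source, using the example data from the question.
CREATE TABLE tbl_sales (id_sales INTEGER, fk_id_customer INTEGER,
                        fk_id_product INTEGER, country TEXT, timestamp TEXT);
CREATE TABLE tbl_customer (id_customer INTEGER, name TEXT, adress TEXT,
                           zip TEXT, city TEXT);
CREATE TABLE tbl_product (id_product INTEGER, price REAL, product TEXT);
INSERT INTO tbl_sales VALUES (1, 2, 1, 'UK', '2013-11-19 10:23:22');
INSERT INTO tbl_sales VALUES (2, 1, 2, 'FR', '2013-11-20 06:04:22');
INSERT INTO tbl_customer VALUES (1, 'Frank', 'Street X', '211', 'London');
INSERT INTO tbl_customer VALUES (2, 'Steve', 'Street Y', '431', 'Paris');
INSERT INTO tbl_product VALUES (1, 100.00, 'Hammer');
INSERT INTO tbl_product VALUES (2, 50.00, 'Saw');

-- Star schema (dimDate reduced to two columns for brevity).
CREATE TABLE dimCustomer (CustomerKey INTEGER PRIMARY KEY, Name TEXT,
                          Address TEXT, Zip TEXT, City TEXT);
CREATE TABLE dimProduct (ProductKey INTEGER PRIMARY KEY, Name TEXT, Price REAL);
CREATE TABLE dimDate (DateKey INTEGER PRIMARY KEY, Date TEXT);
CREATE TABLE factSales (DateKey INTEGER, CustomerKey INTEGER,
                        ProductKey INTEGER, Counter INTEGER);

-- Dimensions get every customer and product, whether or not they have sales.
INSERT INTO dimCustomer SELECT id_customer, name, adress, zip, city FROM tbl_customer;
INSERT INTO dimProduct  SELECT id_product, product, price FROM tbl_product;
INSERT INTO dimDate
SELECT DISTINCT CAST(strftime('%Y%m%d', timestamp) AS INTEGER), date(timestamp)
  FROM tbl_sales;

-- One fact row per sale; the sales counts come from here, not from the dimensions.
INSERT INTO factSales
SELECT CAST(strftime('%Y%m%d', timestamp) AS INTEGER),
       fk_id_customer, fk_id_product, 1
  FROM tbl_sales;
""")


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


hammer_buyers = scalar("""
    SELECT COUNT(DISTINCT CustomerKey) FROM factSales
     WHERE ProductKey IN (SELECT ProductKey FROM dimProduct WHERE Name = 'Hammer')""")
paris_sales = scalar("""
    SELECT SUM(Counter) FROM factSales
     WHERE CustomerKey IN (SELECT CustomerKey FROM dimCustomer WHERE City = 'Paris')""")
sales_in_range = scalar("""
    SELECT SUM(Counter) FROM factSales
     WHERE DateKey IN (SELECT DateKey FROM dimDate
                        WHERE Date BETWEEN '2013-11-19' AND '2013-11-20')""")
print(hammer_buyers, paris_sales, sales_in_range)
```

So the dimensions are loaded with all products and customers, and the question of how many sales a product has is answered by counting rows in factSales.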