How to select multiple custom Firebase event parameters in BigQuery? - firebase

I exported Firebase events to BigQuery and now I'm trying to select two parameters from a certain event. Here is the query for selecting one parameter:
select event_dim.params.value.int_value as level_id
from [com_company_appname_ANDROID.app_events_20161210]
where event_dim.name = "level_replays_until_first_victory" and event_dim.params.key = "level_id"
Both parameters are int values; the first parameter is named level_id and the second is named count. I would like to show level_id in the first column and count in the second column.

Below will work with BigQuery Standard SQL
SELECT
(SELECT params.value.int_value FROM x.params
WHERE params.key = 'level_id') AS level_id,
(SELECT params.value.int_value FROM x.params
WHERE params.key = 'count') AS count
FROM `com_company_appname_ANDROID.app_events_20161210`, UNNEST(event_dim) AS x
WHERE x.name = 'level_replays_until_first_victory'
See also Migrating from legacy SQL if you are stuck with Legacy SQL.

I love the previous solution! Here is an alternative solution for the same problem I came up with. I'd welcome comments on which solution is more efficient/cheaper and why.
SELECT event_param1.value.int_value AS level_id,
event_param2.value.int_value AS count
FROM `com_company_appname_ANDROID.app_events_20161210`,
UNNEST(event_dim) event,
UNNEST(event.params) as event_param1,
UNNEST(event.params) as event_param2
WHERE event.name = 'level_replays_until_first_victory'
AND event_param1.key = 'level_id'
AND event_param2.key = 'count'
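On the question of how the two solutions compare: apart from cost, they can return different rows. The scalar-subquery version keeps an event even when one of the keys is missing (the column is NULL), while the double-UNNEST version drops that event, because the WHERE clause needs a matching row for every key. The sketch below illustrates this with SQLite from Python. It is only an approximation of the BigQuery behaviour: SQLite has no UNNEST, so a hypothetical child table `params` stands in for the repeated `params` field, and the table layout and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- 'events' and 'params' are stand-ins for event_dim and its repeated params field
    CREATE TABLE events (event_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE params (event_id INTEGER, key TEXT, int_value INTEGER);
    INSERT INTO events VALUES (1, 'level_replays_until_first_victory'),
                              (2, 'level_replays_until_first_victory');
    -- event 1 has both parameters; event 2 is missing 'count'
    INSERT INTO params VALUES (1, 'level_id', 7), (1, 'count', 3), (2, 'level_id', 9);
""")

# First solution: one scalar subquery per parameter.
scalar_subqueries = conn.execute("""
    SELECT
      (SELECT p.int_value FROM params p
        WHERE p.event_id = e.event_id AND p.key = 'level_id') AS level_id,
      (SELECT p.int_value FROM params p
        WHERE p.event_id = e.event_id AND p.key = 'count') AS count
    FROM events e
    WHERE e.name = 'level_replays_until_first_victory'
    ORDER BY e.event_id
""").fetchall()

# Second solution: join the parameter list once per key.
double_join = conn.execute("""
    SELECT p1.int_value AS level_id, p2.int_value AS count
    FROM events e
    JOIN params p1 ON p1.event_id = e.event_id
    JOIN params p2 ON p2.event_id = e.event_id
    WHERE e.name = 'level_replays_until_first_victory'
      AND p1.key = 'level_id'
      AND p2.key = 'count'
    ORDER BY e.event_id
""").fetchall()

print(scalar_subqueries)  # event 2 is kept, with NULL for the missing key
print(double_join)        # event 2 is dropped
```

If every event is guaranteed to carry both keys, the two forms return the same rows, and the choice comes down to readability and cost.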

Another solution I find quite handy is to use User Defined Functions to analyze user properties and event parameters
#Standard-SQL
#UDF for event parameters
CREATE TEMP FUNCTION paramValueByKey(k STRING, params ARRAY<STRUCT<key STRING, value STRUCT<string_value STRING, int_value INT64, float_value FLOAT64, double_value FLOAT64 >>>) AS (
(SELECT x.value FROM UNNEST(params) x WHERE x.key=k)
);
#UDF for user properties
CREATE TEMP FUNCTION propertyValueByKey(k STRING, properties ARRAY<STRUCT<key STRING, value STRUCT<value STRUCT<string_value STRING, int_value INT64, float_value FLOAT64, double_value FLOAT64>, set_timestamp_usec INT64, index INT64 > >>) AS (
(SELECT x.value.value FROM UNNEST(properties) x WHERE x.key=k)
);
#Query the sample dataset, unnesting the events and turn 'api_version', 'round' and 'type_of_game' into columns
SELECT
user_dim.user_id,
event.name,
propertyValueByKey('api_version', user_dim.user_properties).string_value AS api_version,
paramValueByKey('round', event.params).int_value as round,
paramValueByKey('type_of_game', event.params).string_value as type_of_game
FROM `firebase-analytics-sample-data.android_dataset.app_events_20160607`,
UNNEST(event_dim) as event
WHERE event.name = 'round_completed'
LIMIT 10;

An update to the second solution, for tables where event_name and event_params are top-level columns (no event_dim to unnest):
SELECT
event_param1.value.int_value AS level_id,
event_param2.value.int_value AS count
FROM
`com_company_appname_ANDROID.app_events_20161210`,
UNNEST(event_params) as event_param1,
UNNEST(event_params) as event_param2
WHERE event_name = 'level_replays_until_first_victory'
AND
event_param1.key = 'level_id'
AND
event_param2.key = 'count'

Related

Firebase Automatically Collected User Property Gender Data - How to Extract the Gender Data

How do I query the Firebase automatically collected user property Gender from BigQuery? I used the query below, which I believe is correct, but after executing it I get the result "The query returned no results".
select app_info.id,app_info.version,
event_date,event_name,
param1.value.string_value as category,
param2.value.string_value as action,
param3.value.string_value as label,
param4.value.string_value as Gender,
count(1) as totalevent,count( distinct user_pseudo_id ) as uniqueusers
From `Tablet_*`,
UNNEST(event_params) as param1,
UNNEST(event_params) as param2,
UNNEST(event_params) as param3,
UNNEST(user_properties) as param4
WHERE _TABLE_SUFFIX BETWEEN '20200203' AND '20200203'
and param1.key='category'
and param2.key='action'
and param3.key='label'
and param4.key='Gender'
group by 1, 2, 3, 4, 5,6,7,8
order by totalevent desc
I have tested your query with the public table firebase-public-project.analytics_153293282.events_*. Please validate the following:
1. Check the table suffix values to confirm the available range of dates:
SELECT DISTINCT(_TABLE_SUFFIX) AS RANGE_DATES
FROM `firebase-public-project.analytics_153293282.events_*`
2. If the previous query shows that data exists for the date '20200203', check your filters:
param2.key='action'
param3.key='label'
param4.key='Gender'
For example, I tried to filter by 'action':
SELECT event_date, param1.key
FROM `firebase-public-project.analytics_153293282.events_*`,
UNNEST(event_params) as param1
where param1.key ='action'
Result: The query returned no results

Google Bigquery Standard SQL - How to: If hit contains sku then select hitnumber else max(hitnumber)?

select
concat(fullvisitorid,cast(visitid as string)) as unique_session_id
,case
when h.item.productSku is not null then h.hitNumber
else max(h.hitnumber)
end
,h.item.transactionid
,h.item.itemrevenue/pow(10,6)
,h.item.productSku
from `myproject.mydataset.ga_sessions_20180101`, unnest(hits) as h
group by 1
Looking at the case statement above (line 3):
How do I return the hitnumber where the productsku is populated,
otherwise return the max hitnumber, and then group this by the unique_session_id?
How do I filter out transactionids that contain '_ABC' at the same time?
I would suggest doing the grouping and finding the max hit number in a subquery. If you are going to use an aggregate function like MAX() in the select clause, then you need to group on or have aggregate functions for the other fields in the select. It can be useful to do aggregate sub-queries using common table expressions.
WITH data AS (
SELECT
CONCAT(fullvisitorid, CAST(visitid AS string)) AS unique_session_id,
h.hitNumber,
h.item.transactionid,
h.item.itemrevenue/POW(10,6) AS itemRevenue,
h.item.productSku
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_20170801`,
UNNEST(hits) AS h
),
max_hits AS (
SELECT
unique_session_id,
MAX(hitNumber) AS max_hit_number
FROM data
GROUP BY 1
)
SELECT
d.unique_session_id,
CASE
WHEN d.productSku IS NOT NULL THEN d.hitNumber
ELSE m.max_hit_number
END,
d.transactionid,
d.itemrevenue,
d.productSku
FROM
data AS d JOIN max_hits AS m
ON d.unique_session_id = m.unique_session_id
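The same aggregate-then-join-back shape can be tried locally. The following is a minimal sketch using SQLite from Python, not the BigQuery query itself: the flat `hits` table, its column names, and the sample rows are assumptions standing in for the unnested `hits` array. It also shows one possible way to handle the '_ABC' part of the question, using `instr` to drop matching transaction ids while keeping rows that have no transaction id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- hypothetical flat table: one row per hit
    CREATE TABLE hits (session_id TEXT, hit_number INTEGER,
                       transaction_id TEXT, product_sku TEXT);
    INSERT INTO hits VALUES
        ('s1', 1, NULL,     NULL),
        ('s1', 2, 'T1',     'SKU-A'),
        ('s1', 3, NULL,     NULL),
        ('s2', 1, 'T2_ABC', 'SKU-B'),
        ('s2', 2, NULL,     NULL);
""")

rows = conn.execute("""
    WITH max_hits AS (            -- one row per session with its highest hit number
        SELECT session_id, MAX(hit_number) AS max_hit_number
        FROM hits
        GROUP BY session_id
    )
    SELECT h.session_id,
           CASE WHEN h.product_sku IS NOT NULL THEN h.hit_number
                ELSE m.max_hit_number
           END AS reported_hit,
           h.product_sku
    FROM hits AS h
    JOIN max_hits AS m ON h.session_id = m.session_id
    -- keep rows without a transaction id, drop ids containing '_ABC'
    WHERE h.transaction_id IS NULL OR instr(h.transaction_id, '_ABC') = 0
    ORDER BY h.session_id, h.hit_number
""").fetchall()

for row in rows:
    print(row)
```

Note that the maximum is computed over all hits of a session, including hits whose transaction id is later filtered out; move the filter into the CTE if that is not what you want.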

query with max and second factor [duplicate]

I have:
TABLE MESSAGES
message_id | conversation_id | from_user | timestamp | message
I want:
1. SELECT * WHERE from_user <> id
2. GROUP BY conversation_id
3. SELECT in every group row with MAX(timestamp) **(if there are two same timestamps in a group use second factor as highest message_id)** !!!
4. then results SORT BY timestamp
to have result:
2|145|xxx|10000|message
6|1743|yyy|999|message
7|14|bbb|899|message
with eliminated
1|145|xxx|10000|message <- has same timestamp(10000) as message(2) belongs to the same conversation(145) but message id is lowest
5|1743|me|1200|message <- has message_from == me
Example of a group with the same timestamp:
I want row 3 from this group, but my query returns row 2:
SELECT max(message_timestamp), message_id, message_text, message_conversationId
FROM MESSAGES
WHERE message_from <> 'me'
GROUP BY message_conversationId
ORDER by message_Timestamp DESC
What I have in mind is to combine message_id and timestamp and then take the max?
Your query is based on non-standard use of GROUP BY (I think SQLite allows that only for compatibility with MySQL) and I'm not at all sure that it will produce determinate results all the time.
Plus it uses MAX() on concatenated columns. Unless you somehow ensure that the two (concatenated) columns have fixed widths, the results will not be accurate for that reason as well.
I would write the query like this:
SELECT
m.message_timestamp,
m.message_id,
m.message_text,
m.message_conversationId
FROM
( SELECT message_conversationId -- for every conversation
FROM messages as m
WHERE message_from <> 'me'
GROUP BY message_conversationId
) AS mc
JOIN
messages AS m -- join to the messages
ON m.message_id =
( SELECT mi.message_id -- and find one message id
FROM messages AS mi
WHERE mi.message_conversationId -- for that conversation
= mc.message_conversationId
AND mi.message_from <> 'me'
ORDER BY mi.message_timestamp DESC, -- according to the
mi.message_id DESC -- specified order
LIMIT 1 -- (this is the one part)
) ;
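To check the tie-break behaviour, here is a small runnable sketch in Python with SQLite. It uses a simplified variant of the query above (a correlated subquery in the WHERE clause rather than a join to a derived table) and the column names from the question's table description. The five rows are the ones listed in the question: three wanted, two eliminated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (message_id INTEGER PRIMARY KEY, conversation_id INTEGER,
                           from_user TEXT, timestamp INTEGER, message TEXT);
    INSERT INTO messages VALUES
        (1, 145,  'xxx', 10000, 'message'),
        (2, 145,  'xxx', 10000, 'message'),
        (5, 1743, 'me',  1200,  'message'),
        (6, 1743, 'yyy', 999,   'message'),
        (7, 14,   'bbb', 899,   'message');
""")

rows = conn.execute("""
    SELECT m.message_id, m.conversation_id, m.from_user, m.timestamp
    FROM messages AS m
    WHERE m.from_user <> 'me'
      AND m.message_id = (
          -- newest message of the conversation; ties go to the highest message_id
          SELECT mi.message_id
          FROM messages AS mi
          WHERE mi.conversation_id = m.conversation_id
            AND mi.from_user <> 'me'
          ORDER BY mi.timestamp DESC, mi.message_id DESC
          LIMIT 1)
    ORDER BY m.timestamp DESC
""").fetchall()

for row in rows:
    print(row)
```

Message 1 loses the tie to message 2, and message 5 is excluded because it was sent by 'me', which matches the result the question asks for.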
Try the SQL below, which gets the result by grouping twice.
select m.*
from
Messages m
-- 3. and then joining to get wanted output columns
inner join
(
--2. then selecting from this max timestamp - and removing duplicates
select conversation_id, max(timestamp), message_id
from
(
-- 1. first, for each (conversation_id, timestamp) pair, keep the max message_id to break timestamp ties
select conversation_id, timestamp, max(message_id) message_id
from Messages
where from_user <> 'me'
group by conversation_id, timestamp
) max_mid
group by conversation_id
) max_mid_ts on max_mid_ts.message_id = m.message_id
order by m.message_id;
http://goo.gl/MyZjyU
OK, it was simpler than I thought:
basically, change the select from:
max(message_timestamp)
to:
max(message_timestamp || message_id)
or max(message_timestamp + message_id)
so it searches for the max of the concatenation (or sum) of timestamp and message_id
P.S. After some digging: it works only if message_id grows with the timestamp (insertion order is preserved).
Edit: so why does it work?
SELECT max(message_timestamp+message_id), message_timestamp, message_id, message_conversationId, message_from,message_text
FROM MESSAGES
WHERE message_conversationId = 1521521
AND message_from <> 'me'
ORDER by message_Timestamp DESC
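The fixed-width caveat raised in the earlier answer can be seen with two rows. The sketch below (Python with SQLite; the two sample rows are made up for illustration) compares a proper two-key ordering with the concatenation and the sum. Here message_id does grow with the timestamp, yet the concatenation still picks the wrong row, because `||` produces text and '9991' sorts after '10002'. The sum agrees with the two-key ordering on this data, but only because of the insertion-order assumption stated above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (message_id INTEGER, timestamp INTEGER);
    -- ids grow with the timestamp, but the timestamps have different widths
    INSERT INTO messages VALUES (1, 999), (2, 1000);
""")

def top_message(order_expr):
    """Return the message_id that sorts first under the given ORDER BY expression."""
    sql = f"SELECT message_id FROM messages ORDER BY {order_expr} LIMIT 1"
    return conn.execute(sql).fetchone()[0]

by_two_keys = top_message("timestamp DESC, message_id DESC")  # intended ordering
by_concat = top_message("timestamp || message_id DESC")       # text comparison
by_sum = top_message("timestamp + message_id DESC")           # numeric, relies on id order

print(by_two_keys, by_concat, by_sum)
```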

Condensing several SQL queries into a single query so a parameter can be used for different data types

I'm looking to condense several SQL Server queries into a single query so that one parameter can be used for different data types: dates, numbers, or strings. The parameter is called @SearchValue.
In a strongly typed DataSet we have the 3 queries listed below.
This is for ASP.Net with a VB.Net code-behind file, but I think this question may also apply outside ASP.Net.
I call this one if the user enters a date into a search TextBox:
Query:
SELECT ID, PaymentAmount, PaymentDate, WhatWasPaymentFor
FROM Payments
WHERE (ParentID = @ParentID) AND
(PaymentDate = @SearchValue)
Call from VB.Net code-behind for the date search query:
tblObject = theTableAdapter.GetDataByPaymentDate(dcmParentsId, TextBoxSearch.Text)
If tblObject.Count() > 0 Then
GridViewSummary.DataSource = tblObject
GridViewSummary.DataBind()
End If
The other ones are for numbers only and the last one is for everything else.
This one is for numbers only:
SELECT PaymentDate, PaymentAmount, WhatWasPaymentFor, ID
FROM Payments
WHERE (ParentID = @ParentID) AND
(PaymentAmount = @SearchValue)
This one is called when the other 2 queries don't find any data:
SELECT PaymentDate, PaymentAmount, WhatWasPaymentFor, ID
FROM Payments
WHERE (ParentID = @ParentID) AND
((WhatWasPaymentFor LIKE '%' + @SearchValue + '%') OR
(@SearchValue = 'ALL'))
All of this coding works as is and I did it this way because there would be an error if I tried to call .GetDataByPaymentDate with a non date value.
Is there a way to use a single query to handle the searching by dates, numbers, and strings?
* UPDATES *
Thanks for all the sample queries. I am trying all of the sample queries in the SQL Server Management Studio to see what results come up.
I tried this one based on Gordon's query, but it does not return any data:
DECLARE @SearchValue VARCHAR = '01/01/2012'
DECLARE @SearchType VARCHAR = 'Dates'
DECLARE @ParentID INT = 3
SELECT ID, PaymentAmount, PaymentDate, WhatWasPaymentFor
FROM Payments cross join
(select @SearchValue as sv) const
WHERE ParentID = @ParentID AND
(case when @SearchType = 'Dates' and ISDATE(const.sv) = 1
then (case when PaymentDate = CAST(const.sv AS datetime) then 'true' else 'false' end)
when @SearchType = 'Numbers' and ISNUMERIC(const.sv) = 1
then (case when PaymentAmount = cast(const.sv as Int) then 'true' else 'false' end)
when @SearchType = 'Everything Else'
then (case when WhatWasPaymentFor LIKE '%' + const.sv + '%' OR const.sv='ALL' then 'true' else 'false' end)
end) = 'true'
This is based on the one from gh9 and pulls up data. Thanks gh9:
DECLARE @SearchValue VARCHAR = 'Books'
DECLARE @ParentID INT = 3
DECLARE @PaymentDate DATETIME = NULL
DECLARE @PaymentAmount MONEY = NULL
SELECT ID, PaymentAmount, PaymentDate, WhatWasPaymentFor
FROM Payments
WHERE ParentID = @ParentID
AND (@PaymentDate is null OR PaymentDate = @PaymentDate)
AND (@PaymentAmount is null OR PaymentAmount = @PaymentAmount)
AND ((@SearchValue is null OR
(WhatWasPaymentFor LIKE '%' + @SearchValue + '%' OR @SearchValue='ALL'))
)
This lets you avoid casting the search value to whatever type you need, and it is also a bit more readable. Modify the syntax as needed, but the key idea is to use SQL Server's ability to have null parameters and short-circuit logic to evaluate strongly typed parameters instead of casting to the data type you need.
-- the procedure name is illustrative; the original snippet omitted the header
CREATE PROCEDURE SearchPayments
@ParentID INT,
@PaymentAmount INT = NULL,
@PaymentDate DATETIME = NULL,
@GenericSearchTerm VARCHAR(100) = NULL
AS
BEGIN
SELECT
ID,
PaymentAmount,
PaymentDate,
WhatWasPaymentFor
FROM Payments
WHERE ParentID = @ParentID
-- each typed filter is skipped when its parameter is NULL
AND (@PaymentDate IS NULL OR PaymentDate = @PaymentDate)
AND (@PaymentAmount IS NULL OR PaymentAmount = @PaymentAmount)
AND (@GenericSearchTerm IS NULL OR @GenericSearchTerm = 'ALL'
OR WhatWasPaymentFor LIKE '%' + @GenericSearchTerm + '%')
END
EDIT: updated answer per @andriyM's comments
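As a rough illustration of the typed-optional-parameter idea, the sketch below moves the "is it a date, a number, or text?" decision into application code and passes three nullable parameters to a single query. It uses Python and SQLite rather than VB.Net and SQL Server, so the `||` concatenation, the ISO date storage, and the helper names `classify` and `search_payments` are all assumptions made for the example, as are the sample rows.

```python
import sqlite3
from datetime import datetime

def classify(text):
    """Map raw search text to (payment_date, payment_amount, term); two are always None."""
    try:
        return datetime.strptime(text, "%m/%d/%Y").date().isoformat(), None, None
    except ValueError:
        pass
    try:
        return None, float(text), None
    except ValueError:
        return None, None, text

def search_payments(conn, parent_id, text):
    """Run the single query; a NULL parameter switches its filter off."""
    payment_date, payment_amount, term = classify(text)
    sql = """
        SELECT ID FROM Payments
        WHERE ParentID = :parent_id
          AND (:payment_date IS NULL OR PaymentDate = :payment_date)
          AND (:payment_amount IS NULL OR PaymentAmount = :payment_amount)
          AND (:term IS NULL OR :term = 'ALL'
               OR WhatWasPaymentFor LIKE '%' || :term || '%')
        ORDER BY ID
    """
    params = {"parent_id": parent_id, "payment_date": payment_date,
              "payment_amount": payment_amount, "term": term}
    return [row[0] for row in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Payments (ID INTEGER PRIMARY KEY, ParentID INTEGER,
                           PaymentAmount REAL, PaymentDate TEXT, WhatWasPaymentFor TEXT);
    INSERT INTO Payments VALUES
        (1, 3, 25.0, '2012-01-01', 'Books'),
        (2, 3, 40.0, '2012-02-15', 'School trip'),
        (3, 4, 25.0, '2012-01-01', 'Books');
""")

print(search_payments(conn, 3, "01/01/2012"))  # date search
print(search_payments(conn, 3, "40"))          # amount search
print(search_payments(conn, 3, "Books"))       # text search
print(search_payments(conn, 3, "ALL"))         # everything for this parent
```

Because the parameters arrive already typed, the query never has to cast the search text, which is the point the answer makes.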
How about adding a third parameter, @SearchType, with possible values of 'datetime', 'int', or 'nvarchar', and using it in your WHERE clause to cast @SearchValue to the appropriate type for comparison? Something like:
SELECT
ID,
PaymentAmount,
PaymentDate,
WhatWasPaymentFor
FROM Payments
WHERE ParentID = @ParentID
AND (
(@SearchType = 'datetime' AND PaymentDate = CAST(@SearchValue AS datetime))
OR
(@SearchType = 'int' AND PaymentAmount = CAST(@SearchValue AS int))
OR
(@SearchType = 'nvarchar' AND
(WhatWasPaymentFor LIKE '%' + @SearchValue + '%' OR @SearchValue='ALL'))
);
This is inspired by Bstateham's answer, but it does two things differently. First, it stores the values in a table, turning them from constants into variables. Second, it uses CASE because that guarantees short-circuiting.
This results in something like:
SELECT ID, PaymentAmount, PaymentDate, WhatWasPaymentFor
FROM Payments cross join
(select @SearchValue as sv) const
WHERE ParentID = @ParentID AND
(case when @SearchType = 'datetime' and ISDATE(const.sv) = 1
then (case when PaymentDate = CAST(const.sv AS datetime) then 'true' else 'false' end)
when @SearchType = 'int' and ISNUMERIC(const.sv) = 1
then (case when PaymentAmount = cast(const.sv as Int) then 'true' else 'false' end)
when @SearchType = 'nvarchar'
then (case when WhatWasPaymentFor LIKE '%' + const.sv + '%' OR const.sv='ALL' then 'true' else 'false' end)
end) = 'true'
I also added validation checks for the conversion to integer and date (not perfect, but they will help). Also, you should probably name the @SearchType values after the destination field rather than the type. However, I've kept the version with the type name.

SQL Select fields with a value of 'Y' and order by date descending, then select all others and order by another field ascending

I am generating an SQL query:
SELECT * FROM ToDoList
WHERE ws_status <> 'Completed'
AND (user_id= 'TESTUSR' OR ww_cover='TESTUSR'
OR (ws_status = 'Orphan' AND wwt_workgroupid IN (108)))
ORDER BY psc_alt_code ASC
And I need to list all results with wi_urgent set to 'Y' and order them by date Desc *first and then list all other results ordered by psc_alt_code descending* so I thought something like this would suffice:
ORDER BY (wi_urgent = 'Y') DESC, psc_alt_code ASC
I am getting SqlClient.SqlException: Incorrect syntax near '=' error when trying to run that query. Please note that I am querying an SQL View if that makes a difference?
You can use a case expression in the order by
SELECT * FROM ToDoList
WHERE ws_status <> 'Completed'
AND (user_id= 'TESTUSR' OR ww_cover='TESTUSR'
OR (ws_status = 'Orphan' AND wwt_workgroupid IN (108)))
ORDER BY CASE WHEN wi_urgent = 'Y' THEN 0 ELSE 1 END ASC
,psc_alt_code
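The CASE expression above puts urgent rows first but does not yet sort the urgent rows by date. One possible extension is a second CASE key that is only non-NULL for urgent rows. The sketch below tries this in Python with SQLite; the date column name `wi_date` is an assumption (the question never names it), and the four sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- wi_date is a hypothetical name for the date column mentioned in the question
    CREATE TABLE ToDoList (id INTEGER PRIMARY KEY, wi_urgent TEXT,
                           wi_date TEXT, psc_alt_code TEXT);
    INSERT INTO ToDoList VALUES
        (1, 'N', '2020-01-05', 'B'),
        (2, 'Y', '2020-01-01', 'Z'),
        (3, 'Y', '2020-03-01', 'A'),
        (4, 'N', '2020-02-01', 'A');
""")

ordered_ids = [row[0] for row in conn.execute("""
    SELECT id
    FROM ToDoList
    ORDER BY CASE WHEN wi_urgent = 'Y' THEN 0 ELSE 1 END ASC,  -- urgent rows first
             CASE WHEN wi_urgent = 'Y' THEN wi_date END DESC,  -- newest urgent first
             psc_alt_code ASC                                  -- everything else by code
""")]

print(ordered_ids)
```

For non-urgent rows the second key is NULL for every row, so it has no effect on their relative order and psc_alt_code decides.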
I don't think you can do wi_urgent = 'Y' in an ORDER BY.
Since you're looking for all results with wi_urgent, try adding it to the WHERE clause:
SELECT * FROM ToDoList
WHERE ws_status <> 'Completed'
AND (user_id= 'TESTUSR' OR ww_cover='TESTUSR'
OR (ws_status = 'Orphan' AND wwt_workgroupid IN (108)))
AND wi_urgent = 'Y'
ORDER BY wi_urgent DESC,
psc_alt_code ASC

Resources