Connecting Google Analytics clientId to BigQuery fields - google-analytics

I wanted to resurface this question since I couldn't find a good answer.
How do you match the dot-separated clientId (see the Google Analytics section of the linked page) in Google Analytics to fields in BigQuery? I'd expect the clientId to be the fullVisitorId, but it isn't. The clientId has the form 1234567890.9876543210, where the last ten digits are the visitId in BigQuery.

A while ago I asked the GAP support team the same question, and their answer was that the fullVisitorId in BigQuery is a hashed form of the Client ID, so the Client ID from the _ga cookie isn't represented directly in BigQuery.
In other words, a single Client ID maps to a single fullVisitorId, but you can't get the Client ID in BigQuery (unless you capture it as a custom dimension).
Update:
Starting on May 24, 2018, Google began exporting the Google Analytics Client ID to BigQuery. The schema addition details are:
-------------------------------------------------------------------------
| Field Name | Type   | Description                                     |
-------------------------------------------------------------------------
| clientId   | STRING | Unhashed version of the Client ID for a given   |
|            |        | user associated with any given visit/session.   |
-------------------------------------------------------------------------
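To compare the exported clientId with what the browser holds, you first need the Client ID out of the _ga cookie. The following is a minimal sketch, assuming the classic Universal Analytics cookie layout GA1.<domain depth>.<random number>.<first-visit timestamp>; the function name is hypothetical.

```python
def client_id_from_ga_cookie(cookie_value: str) -> str:
    """Return the Client ID portion of a _ga cookie value.

    Assumes the layout 'GA1.<domain depth>.<random>.<timestamp>'.
    """
    parts = cookie_value.split(".")
    if len(parts) < 4:
        raise ValueError(f"unexpected _ga cookie format: {cookie_value!r}")
    # The Client ID is the last two dot-separated fields.
    return ".".join(parts[-2:])


print(client_id_from_ga_cookie("GA1.2.1234567890.9876543210"))
```

The returned string is what you would compare against the clientId column, or against a custom dimension if you capture the Client ID yourself.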

Related

What does Visit Number mean in Google Analytics - BigQuery export?

Is visitNumber the number of times a user has visited a site to date?
But in the data (see screenshot below), I'm seeing visit numbers skipped, and the visitorId is also null.
I assume you are dealing with Google Analytics data exported to BigQuery.
If so:
visitorId is deprecated (hence the nulls), and fullVisitorId should be used instead.
visitNumber is an INTEGER that represents the session number for the user. If this is the first session, it is set to 1.
fullVisitorId is a STRING that represents the unique visitor ID (also known as client ID).
See more at BigQuery Export schema

BigQuery : Returning Visits Count Mismatch with google Analytics data

I am importing Google Analytics data into a BigQuery session_streaming table using OWOX BI. I need to count returning visits from this data, but the result does not match GA.
Business logic: if newVisits is null, then it's a returning visitor
Date range: 10 June 2018
Source: Google
Medium: CPC
BigQuery result: 136 returning visits
GA account: 95 (Total Users - New Users)
SELECT
  COUNT(DISTINCT clientId) AS returningvisits
FROM `test.Test.session_streaming_20180610`
WHERE trafficSource.medium = 'cpc'
  AND trafficSource.source = 'google'
  AND newVisits IS NULL
Schema of session streaming table
user RECORD NULLABLE
user.id STRING NULLABLE
user.phone STRING NULLABLE
user.email STRING NULLABLE
clientId STRING NULLABLE
date STRING NULLABLE
sessionId STRING NULLABLE
visitNumber INTEGER NULLABLE
newVisits INTEGER NULLABLE
There are a few more fields.
Could you please help me find what's wrong with this query?
Mayank!
You've already contacted our support service, and we've replied in the support chat.
Just to double-check everything, we're sending you the reply here as well.
First of all, it's not the best idea to count new users in GA as Total Users - Returning Users,
because a single user can be both a New Visitor and a Returning Visitor in the same time period.
New Users are the first time visitors to your website who have initiated a session for the first time for a given date range. If the same users return back to your website on the same day or any other day then they’ll be considered as returning visitors in GA.
We recommend using the same counting logic in both places, e.g. count New Users in both GA and BQ.
Also, use GROUP BY instead of COUNT(DISTINCT ...); it gives a more accurate result in most cases.
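To see why subtraction does not line up with a direct count, here is a small sketch over made-up session rows shaped like the session_streaming schema in the question. The rows and the helper name are hypothetical; the only rule taken from the thread is that newVisits is 1 for a first session and null otherwise.

```python
from collections import defaultdict

# Hypothetical session rows; clientId values are invented for illustration.
sessions = [
    {"clientId": "111.1", "visitNumber": 1, "newVisits": 1},
    {"clientId": "111.1", "visitNumber": 2, "newVisits": None},
    {"clientId": "222.2", "visitNumber": 5, "newVisits": None},
    {"clientId": "333.3", "visitNumber": 1, "newVisits": 1},
]


def user_segments(rows):
    """Map each clientId to the set of segments ('new', 'returning') it falls in."""
    segments = defaultdict(set)
    for row in rows:
        kind = "new" if row["newVisits"] == 1 else "returning"
        segments[row["clientId"]].add(kind)
    return segments


segments = user_segments(sessions)
total_users = len(segments)
new_users = sum("new" in kinds for kinds in segments.values())
returning_users = sum("returning" in kinds for kinds in segments.values())

print(total_users, new_users, returning_users, total_users - new_users)
```

Client 111.1 is both new and returning on the same day, so the direct count of returning users is larger than Total Users minus New Users in this sample.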

Correct Firebase database layout for a user to user (and group) chat app?

I'm trying to understand the best database structure to store and retrieve user to user conversations using the Firebase database for a chat app (web based).
My current plan is to give each chat its own ID, created by combining the unique Firebase IDs of the two chat participants (like UserID1_UserID2), for example: FQ5d0jwLQDcQLryzevBxKrP72Bb2_GSIbxEMi4jOnWhrZaq528KJKDbm8. This chat ID would be stored in the database and would contain the messages sent between the two participants.
Example layout:
MYAPP
|_______conversations
| |_____UserID1_UserID2
| | |
| | |__OshwYF72Jhd9bUw56W7d
| | | |__name:"Jane"
| | | |__text:"Hello!"
| | |
| | |__KbHy4293dYgVtT9pdoW
| | |__PS8tgw53SnO892Jhweh
| | |__Qufi83bdyg037D7RBif
| | |__Gicuwy8r23ndoijdakr
| |
| |_____UserID5_UserID16
| |_____UserID8_UserID7
| |_____UserID3_UserID8
|
|_______users
Whenever a user signs into the app, they'll see a list of their contacts. When they select one to chat with, I would use some JavaScript to combine their Firebase ID with their selected friend's Firebase ID to generate the chat ID. This chat ID would then either be created in the database (if it's their first time chatting) or be used to load the messages they have previously exchanged (if they have chatted before).
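One detail worth pinning down is that both participants must derive the same chat ID, whoever opens the conversation. A minimal sketch, assuming the two IDs are sorted before joining (the sorting step and the function name are additions for illustration, not part of the plan above):

```python
def chat_id(uid_a: str, uid_b: str) -> str:
    """Build a conversation key that is identical for both participants."""
    # Sorting is an assumption: it makes the key independent of who starts the chat.
    first, second = sorted([uid_a, uid_b])
    return f"{first}_{second}"


print(chat_id("GSIbxEMi4jOnWhrZaq528KJKDbm8", "FQ5d0jwLQDcQLryzevBxKrP72Bb2"))
```

Without a fixed ordering, UserID1_UserID2 and UserID2_UserID1 would point at two different conversation nodes.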
My question is, is this the correct method to use? What issues might I run into if I use this method? For example, would I have problems if I try to implement group conversations (with more than 2 people) in the future?
I'd be really grateful for any help, or for examples of the correct database layout for a person-to-person (and group) chat application using Firebase or another NoSQL database.
Thank you in advance!
Something that I would like to point out as one of the most important "rules" to consider when creating a NoSQL database is that you must
Structure your data after the view.
This means that the data must be structured in such a way that when you want to display it in your view (probably your HTML pages), you do a single read.
So in order to find the best way to structure your database, you must first look at your view and try to imagine how you would read the data (direct reads and queries).
Although your current structure looks good for what you're building now, yes, you might have some problems when creating group chats. I would recommend using something like this:
You will notice that this way you can easily add more participants to each chat. Note that this isn't supposed to be your only node. You would have to create other nodes, like users (to store the user details) and messages (to store each chat's messages), etc.
To help you with that, I recommend watching David East's Firebase Database For SQL Developers.
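As a rough illustration of a layout that allows more than two participants, here is a sketch using a plain Python dict in place of the Firebase tree. Every node name, key, and user name here is hypothetical and is not the structure from the answer; the only ideas taken from it are a participants list per chat and separate users and messages nodes.

```python
# Hypothetical layout; all node and field names are illustrative.
database = {
    "chats": {
        "chat_001": {"participants": {"UserID1": True, "UserID2": True}},
    },
    "messages": {
        "chat_001": {
            "msg_001": {"sender": "UserID1", "name": "Jane", "text": "Hello!"},
        },
    },
    "users": {
        "UserID1": {"name": "Jane", "chats": {"chat_001": True}},
        "UserID2": {"name": "Sam", "chats": {"chat_001": True}},
    },
}


def add_participant(db, chat_key, user_id):
    """Add a user to a chat and mirror the membership under the user's node."""
    db["chats"][chat_key]["participants"][user_id] = True
    db["users"].setdefault(user_id, {}).setdefault("chats", {})[chat_key] = True


def chats_for(db, user_id):
    """List a user's chats with one lookup under users/<id>/chats."""
    return sorted(db["users"].get(user_id, {}).get("chats", {}))


add_participant(database, "chat_001", "UserID3")
print(chats_for(database, "UserID3"))
```

Membership is stored twice on purpose: the chat list view reads users/<id>/chats, and the participant view reads chats/<id>/participants, so each view needs a single read.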

Tying an Application Insights metric to an Operation Id

Configuration: I have a metric reporting the request duration, and I have two custom events set up to record the start time and end time. Each event is filled out with pertinent information for the request.
Problem: I have a metric which is reporting long request durations. But the list of insights is not friendly to correlate the metric with the events and dependencies for the operation.
I would like to either find the duration between the two events to identify which operations are taking long, or assign an Operation Id to a metric which would then allow me to filter the list of insights to ones that have high durations.
I can suggest 2 approaches.
Approach #1:
Open Search in the Azure portal and filter requests by performance bucket. You can then click on each search result to view correlated events. If the predefined buckets don't work for you, you can assign your own using a telemetry initializer in the SDK (please let us know if the predefined buckets don't work).
Approach #2:
Use an Analytics join query to find telemetry items with the chosen operation ID, for example:
requests
| where duration > 5000
| project operation_Name , operation_Id, duration
| join (traces | project operation_Id, message ) on operation_Id
| project operation_Name , message
| limit 10
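If you would rather compute the duration between the two custom events yourself, for example over exported telemetry, the pairing logic can be sketched as follows. This is a minimal sketch: the event names RequestStart and RequestEnd, the field names, and the sample rows are all hypothetical.

```python
from datetime import datetime

# Hypothetical exported custom events; names and timestamps are invented.
events = [
    {"operation_Id": "op-1", "name": "RequestStart", "timestamp": "2018-06-10T10:00:00"},
    {"operation_Id": "op-1", "name": "RequestEnd", "timestamp": "2018-06-10T10:00:07"},
    {"operation_Id": "op-2", "name": "RequestStart", "timestamp": "2018-06-10T10:01:00"},
    {"operation_Id": "op-2", "name": "RequestEnd", "timestamp": "2018-06-10T10:01:02"},
]


def slow_operations(rows, threshold_ms=5000):
    """Return {operation_Id: duration in ms} for operations slower than the threshold."""
    starts, ends = {}, {}
    for row in rows:
        ts = datetime.fromisoformat(row["timestamp"])
        if row["name"] == "RequestStart":
            starts[row["operation_Id"]] = ts
        elif row["name"] == "RequestEnd":
            ends[row["operation_Id"]] = ts
    slow = {}
    for op_id, start in starts.items():
        if op_id in ends:  # skip operations with no matching end event
            duration_ms = (ends[op_id] - start).total_seconds() * 1000
            if duration_ms > threshold_ms:
                slow[op_id] = duration_ms
    return slow


print(slow_operations(events))
```

The resulting operation IDs can then be used to filter the telemetry list down to the slow requests.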

Triggers or T-SQL?

I am trying to make notification system for my website:
This is the table structure
notification
-----------------
id (pk)
userid
notification_type (for complexity like notifications for pictures, videos, apps etc.)
notification
time
notificationsRead
--------------------
id (pk) (I don't think this field is required, anyway)
lasttime_read
userid
Now, my understanding is that when a notification is added, we need to find the user's friends and insert a row for each of them in the notification table, right? If this is correct, what would be the best way to achieve it?
Triggers?
Write T-SQL (sql query in server side) to Select all the friends and then use SQL bulk copy?
A good relational approach would be to set up your tables so that you have a People table, where each person has a primary key that uniquely identifies them. Another table then links the people keys together as friends. There are a few ways to go about this, but as a rough example:
People
---------------------------------------
PeopleKey | Name | <other profile data>
Then you have a table that relates people to each other as friends.
Friendships
----------------------------------------------------------------------
PeopleKey | FriendKey (fk to peoplekey) | <details about friendship>
Then you have your notifications table that says a person has done something.
Notification
-----------------------------------------------------------
NotificationKey | PeopleKey | Date | <notification details>
With a query and your friendship relations you can obtain all the notifications for all the friends of a person:
select notification.* from notification inner join
friendships on friendships.friendkey = notification.peoplekey
where friendships.peoplekey = #currentperson
and date = #importantdate
The relationship can reveal a lot without having to store repetitive data. There are 1000 other ways to join, query, or link the tables to friendships and events. So for example, you could say show all events where peoplekey is one of my friends. Show all events where peoplekey is only one of my friends etc. etc. etc.
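To make the join concrete, here is a small runnable sketch using SQLite through Python's sqlite3 module rather than SQL Server. The sample people, the Details column, and the function name are hypothetical. It assumes each friendship row is read as "PeopleKey has friend FriendKey", so the join matches the notification author against FriendKey.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE People (PeopleKey INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Friendships (
    PeopleKey INTEGER REFERENCES People(PeopleKey),
    FriendKey INTEGER REFERENCES People(PeopleKey)
);
CREATE TABLE Notification (
    NotificationKey INTEGER PRIMARY KEY,
    PeopleKey INTEGER REFERENCES People(PeopleKey),
    Date TEXT,
    Details TEXT
);
INSERT INTO People VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO Friendships VALUES (1, 2), (2, 1);
INSERT INTO Notification VALUES
    (10, 2, '2018-06-10', 'Bob uploaded a picture'),
    (11, 3, '2018-06-10', 'Cy posted a video');
""")


def friend_notifications(connection, person_key):
    """Return notifications created by the friends of the given person."""
    return connection.execute(
        """
        SELECT n.NotificationKey, n.Details
        FROM Notification AS n
        INNER JOIN Friendships AS f ON f.FriendKey = n.PeopleKey
        WHERE f.PeopleKey = ?
        ORDER BY n.NotificationKey
        """,
        (person_key,),
    ).fetchall()


print(friend_notifications(conn, 1))
```

Each notification is stored once, under the person who caused it, and the join resolves who should see it at read time, so no per-friend rows have to be inserted by a trigger or a bulk copy.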
