Getting MCF Conversions path data from Google Bigquery - google-analytics

I am using Google Bigquery to extract data on conversion paths from Google Analytics (GA).
When I analyze these conversion paths from the exported dataset, the last-click conversions match the Acquisition report in GA, but not the Multi-Channel Funnels (MCF) data. Apparently BigQuery doesn't really export raw data, but transforms it by removing all last direct clicks, as described here: https://support.google.com/analytics/answer/1319312?hl=en.
Is it possible to get the BigQuery data to correspond to the Multi-Channel Funnels (MCF) conversion path data? That is, can the last non-direct click attribution be undone to get proper 'raw' user-level data?

All of the trafficSource fields in BigQuery Export for Google Analytics use campaign attribution as described in this processing flow, which will overwrite direct traffic with the most recent campaign (if there is one and it is within the specified timeout), as you mentioned.
If you are using Universal Analytics, you can adjust the campaign timeout to be shorter than the 6-month default. For example, if you set the campaign timeout to one day, any direct visit that comes in at least one day after a campaign visit will be attributed to direct instead of the previous campaign. This can be done in Classic Analytics as well, using _setCampaignCookieTimeout. The change only affects data collected from the time it is implemented onward.
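For the Classic Analytics case, a minimal sketch of a one-day campaign timeout could look like the following. It assumes the standard ga.js asynchronous command queue (_gaq) is in use; 'UA-XXXXX-Y' is a placeholder property ID, and the timeout value is passed in milliseconds.

```javascript
// Classic Analytics (ga.js) command queue; ga.js processes it once loaded.
var _gaq = _gaq || [];

// _setCampaignCookieTimeout expects the timeout in milliseconds.
var ONE_DAY_MS = 24 * 60 * 60 * 1000;

_gaq.push(['_setAccount', 'UA-XXXXX-Y']); // placeholder property ID
// Set the timeout before tracking the pageview so it applies to this hit.
_gaq.push(['_setCampaignCookieTimeout', ONE_DAY_MS]);
_gaq.push(['_trackPageview']);
```

In Universal Analytics the equivalent setting lives in the property's admin settings rather than in the tracking code.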

This thread is rather dated, so I thought I'd update just in case anyone else comes across this same question.
A field has since been introduced (both in the Google Analytics interface and in the BigQuery export) that allows you to match the numbers in the MCF reports. In BigQuery, look for the field trafficSource.isTrueDirect.
From the BigQuery Export Schema:
trafficSource.isTrueDirect
True if the source of the session was Direct (meaning the user typed
the name of your website URL into the browser or came to your site via
a bookmark). This field will also be true if 2 successive but distinct
sessions have exactly the same campaign details. Otherwise NULL.
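As a rough illustration of how that flag might be used, the sketch below relabels sessions for an MCF-style path. The function name and the row shape are assumptions (one visitor's sessions, already sorted by start time, with trafficSource.source, trafficSource.medium and trafficSource.isTrueDirect copied from the export). Because the flag is also true when two successive sessions carry identical campaign details, this relabeling can misclassify repeated clicks on the same campaign.

```javascript
// Hypothetical helper: build an MCF-style channel path for one visitor.
// `sessions` is assumed to be sorted by visit start time.
function mcfStylePath(sessions) {
  return sessions.map(function (session) {
    var ts = session.trafficSource;
    // isTrueDirect is true for genuinely direct sessions, NULL otherwise.
    if (ts.isTrueDirect === true) {
      return '(direct) / (none)';
    }
    // Otherwise keep the campaign attribution stored in the export.
    return ts.source + ' / ' + ts.medium;
  });
}

// Made-up rows: a paid click followed by a direct return visit that the
// export still attributes to the earlier campaign.
var examplePath = mcfStylePath([
  { trafficSource: { source: 'google', medium: 'cpc', isTrueDirect: null } },
  { trafficSource: { source: 'google', medium: 'cpc', isTrueDirect: true } }
]);
```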

Related

Google Analytics API returning no data before 23/09/16

I have a Google Analytics API request that provides full data for any date after 23/09/16, but nothing before.
The metrics/dimensions in use are:
date
ga:sessions
ga:users
ga:deviceCategory
ga:sourceMedium
ga:campaign
I have created a custom report with the same dimensions/metrics in the web view, and I can confirm that the data does exist there (and is being provided). If I take out deviceCategory (or only have deviceCategory) then results are returned. This suggests to me that before this date, deviceCategory and sourceMedium/campaign were an invalid combination, but I can't find anything in the release notes to suggest this was changed and checking previous versions of the dimension explorer using archive.org does not indicate this was the case either.
I have raised a support request with Analytics support but they have said they don't have a team for API related issues.
Any help would be greatly appreciated. We already have a lot of reporting built around this combination and would like to be able to compare historical data.
Edit: I think this has something to do with the data retention settings in Analytics (which default to 26 months). Not sure why this particular combination would be affected as there is no user or event data required here.
You are right, it's because of data retention settings.
"The retention period applies to user-level and event-level data associated with cookies, user-identifiers <...>"
https://support.google.com/analytics/answer/7667196?hl=en
deviceCategory is associated with ClientID.
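To check whether a requested date range falls outside the retention window, a small date calculation may help. This is only an approximation under stated assumptions: the helper name is made up, the window is counted back in whole calendar months from a given day, and the exact schedule on which Analytics removes expired data is not modeled.

```javascript
// Hypothetical helper: earliest date (YYYY-MM-DD) still inside an
// N-month retention window, counted back from `today` in UTC.
function retentionCutoff(today, retentionMonths) {
  // Date.UTC normalizes negative month values into earlier years.
  // Day-of-month overflow (e.g. the 31st) rolls into the next month.
  var cutoff = new Date(Date.UTC(
    today.getUTCFullYear(),
    today.getUTCMonth() - retentionMonths,
    today.getUTCDate()
  ));
  return cutoff.toISOString().slice(0, 10);
}

// Example with an arbitrary date and the 26-month default:
var cutoff = retentionCutoff(new Date(Date.UTC(2018, 10, 23)), 26);
// cutoff is "2016-09-23"
```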

Download Google Analytics information with a unique user ID

I'm looking to download hit data from a Google Analytics view for a small period of time that includes unique ID for a session and URL that was viewed. I believe I could do this going forward by setting something in Google Tag Manager to a Custom Dimension, but I was looking to avoid that (we have a good number of custom dimensions) and because I wouldn't be able to go backward.
Is it possible in the free version of GA to do something like this? I picture the output having the URLs on the x-axis and the users on the y-axis, with counts in the cells.
I'll be looking to take this data and do a cluster analysis to determine user behavior types.
Nope. Google Analytics does not expose a user specific id via the API or via data exports in a standard account (in GA360 you could use BigQuery to extract the client id).
You either have to set up a custom dimension (as you said, this does not work for historic data), or try to use calculated fields in Google Data Studio, in the hope that if you aggregate enough different dimensions into one field you will end up with something specific per user.
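Once some per-user key is available (from a custom dimension or a combined field as described above), the users-by-URLs count table the question asks for could be built along these lines. The function name and the row fields userKey, pagePath and pageviews are assumptions about how the exported rows are shaped.

```javascript
// Hypothetical helper: pivot exported rows into a users x URLs count matrix.
function userUrlCounts(rows) {
  var users = [];
  var urls = [];
  var counts = {}; // counts[userKey][pagePath] -> pageviews

  rows.forEach(function (row) {
    if (users.indexOf(row.userKey) === -1) users.push(row.userKey);
    if (urls.indexOf(row.pagePath) === -1) urls.push(row.pagePath);
    counts[row.userKey] = counts[row.userKey] || {};
    counts[row.userKey][row.pagePath] =
      (counts[row.userKey][row.pagePath] || 0) + row.pageviews;
  });

  // One matrix row per user, one column per URL, zero where nothing was seen.
  var matrix = users.map(function (user) {
    return urls.map(function (url) {
      return counts[user][url] || 0;
    });
  });
  return { users: users, urls: urls, matrix: matrix };
}

// Made-up sample rows:
var table = userUrlCounts([
  { userKey: 'u1', pagePath: '/a', pageviews: 2 },
  { userKey: 'u1', pagePath: '/b', pageviews: 1 },
  { userKey: 'u2', pagePath: '/a', pageviews: 3 }
]);
```

The resulting matrix can then be fed to whatever clustering tool is used for the behavior analysis.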

Google AdWords Conversions not matching database entries

I've just set up a tool on a client site that users can use to request a quote from our client. To do this the user lands on a form page, fills in their details, submits and then lands on a thank-you page. Pretty basic.
I set this process up as a goal in Google Analytics, using the destination type goal: "begins with /thank-you" and shared that goal as a conversion in Google AdWords.
I decided to run a few Google AdWords ads to promote the tool. I also wanted to double-check the conversion data that AdWords gives you, so I set the destination URL in AdWords to www.example.com/form-page?adsrc=adwords1 (2, 3, 4 etc. for each ad) and I configured the DB with a column that tracks which URL the user was on when filling in the form (this is the column I count to get the number of conversions that came from AdWords, so I can compare).
Further to this, I made sure that the initial URL parameters that the user landed on were stored in the session so that if the user browsed to other pages and came back to fill in the form later, it would still attribute the conversion to AdWords.
I tested this thoroughly on a staging and production environment and everything was working correctly.
I ran the campaign for a week, and when I checked, the conversion results in the database and the ones reported by AdWords were wildly different. The DB tells me I've had 5 conversions while AdWords gives me 21.
Is there anything in the way Google uses its gclid that may be causing this issue? Or is there a problem with the way I've set up the measurement structure?
This can be caused by a few things, but I think it is a GA/AdWords issue rather than a problem with your DB/session set-up.
The gclid shouldn't influence your goal, since it is used only for AdWords/Analytics interactions. Goals should not be affected in your set-up.
https://support.google.com/analytics/answer/2938246?hl=en
Probable cause: if your goal set-up only contains "begins with /thank-you", isn't it possible that you are counting all the sessions that reach the thank-you page, not just those from AdWords?
Solution: if you need to count conversions in AdWords (for performance improvements), use the AdWords conversion code on the same page. It counts only users who click an ad and reach your thank-you page within x (default 30) days. Be sure to count only unique conversions (users by cookie).
Differences between GA/AdWords conversion count:
https://support.google.com/analytics/answer/2679221?hl=en
Google attributes conversions to the last marketing channel, where direct visits do not count as a marketing channel (if you look at their attribution flow visualization, you will see that the penultimate step is to check for existing campaign information for the user). So GA might overcount AdWords visits (or other campaigns) and conversely show fewer conversions for direct visits.
In contrast, your database probably records the last traffic channel without an elaborate attribution model, so it will show less campaign traffic.
Also, IIRC, the AdWords interface records the conversion at the time of the ad click, not at the time of the actual goal conversion, so the timeframes for the conversions differ.
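The difference between the two attribution approaches can be sketched as follows. This is a simplified illustration, not GA's actual processing: each visitor is reduced to an ordered list of channel labels, the label 'direct' and both function names are made up, and campaign timeouts are ignored.

```javascript
// Plain last-click: credit whatever channel the converting visit came from.
// Assumes `touches` is a non-empty array ordered from oldest to newest.
function lastClick(touches) {
  return touches[touches.length - 1];
}

// Last non-direct click: skip trailing direct visits and credit the most
// recent campaign, falling back to 'direct' if there is none.
function lastNonDirectClick(touches) {
  for (var i = touches.length - 1; i >= 0; i--) {
    if (touches[i] !== 'direct') {
      return touches[i];
    }
  }
  return 'direct';
}

// A visitor clicks an ad, leaves, and later returns directly to convert.
var touches = ['adwords', 'direct'];
var dbStyle = lastClick(touches);          // 'direct'
var gaStyle = lastNonDirectClick(touches); // 'adwords'
```

Under these assumptions, a session-based database count credits the returning visit to direct, while the last non-direct model still credits the ad.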

Track individual via Google Analytics

I am new to Google Analytics. I have a conversion set up, and I'd like to capture demographics for the people who convert, at the individual level. When a user converts, I'd like to pass the statistics from Google Analytics into my own database, or keep a record of the ID assigned at conversion in Google Analytics and download the data. Is this possible? I want to analyze who is likely to convert versus not (age, shopping habits, etc.) and link these details back to the specific type of conversion.
Thoughts?
If you want to track users at the individual level, try creating a custom dimension and passing the user ID every time you make a GA call.
To view it, use the 'Secondary dimension' option in the tables where you view your analytics data and choose the custom dimension name you assigned from the list that appears. You will then see the user ID alongside each row.
Add user ID to your tracking script
The first step to your question is to add the user ID to your tracking. If you have some way to identify users on your website (ie: through email marketing tool, your CRM, etc), then you should set up user ID tracking. Here is the Google Dev article about that:
https://developers.google.com/analytics/devguides/collection/analyticsjs/cookies-user-id?hl=en
Basically, though, you just have to add this to your tracking script (replacing the 'create' line you currently have, and replacing USER_ID with the user ID your system assigns to the user):
ga('create', 'UA-XXXXX-Y', { 'userId': USER_ID });
In addition, I usually include this UserID as a custom dimension, so that I can view it in Google Analytics and other reports. To do this, first set up a custom dimension for your ID, as a "User-level" dimension. Then just add this after your user ID is available (assuming this is your first custom dimension):
ga('set', 'dimension1', USER_ID);
Connect User ID to Demographics
Unfortunately, the demographics information that Google Analytics provides (under "Audience") is not compatible with custom dimensions (like userID). So, the API only allows you to pull the audience data in aggregate (ie: connected to City, number of users, pageviews of those types of people... etc). If this works for you, check out the GA Query Explorer (below) to try out different combinations of dimensions & metrics to drill down as deep as you can and maximize the information you gain from this demographic info.
Connect Google Analytics Data to Your Database
In order to connect GA data to your database in an automated fashion, you will need to set up some kind of scheduled process that runs a query against the Google Analytics API. To explore which combinations of metrics and dimensions are available, I'd suggest checking out the Google Analytics API Explorer and the Google Analytics API reference material. What I did was set up an SSIS package (using SSIS GoogleAnalyticsSource as the data source), which made it fairly easy. I then scheduled it to run daily, populating the data I want into my database.
Alternatively, you could download less complicated reports directly from Google Analytics and import them into your database with something like SQL Server Management Studio.
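For a scheduled job that calls the API directly instead of going through SSIS, the request could be assembled roughly like this. It is a sketch under several assumptions: it targets the Core Reporting API v3 query endpoint, the view ID is a placeholder, the user ID is assumed to be stored in the first custom dimension (ga:dimension1), and OAuth authorization plus the actual HTTP call are left out. The function name is made up.

```javascript
// Hypothetical helper: build a Core Reporting API (v3) query URL.
function buildReportUrl(viewId, startDate, endDate) {
  var params = new URLSearchParams({
    ids: 'ga:' + viewId,
    'start-date': startDate, // YYYY-MM-DD
    'end-date': endDate,
    metrics: 'ga:sessions,ga:goalCompletionsAll',
    // ga:dimension1 is assumed to hold the user ID custom dimension.
    dimensions: 'ga:date,ga:dimension1'
  });
  return 'https://www.googleapis.com/analytics/v3/data/ga?' + params.toString();
}

// Placeholder view ID and an arbitrary one-day range:
var reportUrl = buildReportUrl('12345678', '2018-01-01', '2018-01-01');
```

A daily job would request yesterday's date range, send the URL with a valid access token, and insert the returned rows into the database.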

How to include custom segments in the list of segments when querying the Google Analytics API?

This may be a possible duplicate of this question, but according to all the Google Analytics documentation I really should be able to pull my list of custom segments.
Since I have a very large list of them, it would be suboptimal for me to manually copy the segment ids over one at a time.
I'm following this walk through. Steps to reproduce:
Create a custom segment using date of first session in your Google Analytics account.
Authorize the Google Analytics guide to access your Google Analytics account.
Try their on-page query tester, and inspect whether your custom segment is there.
One thing I've already ruled out was the user that created the segment. I've manually created a segment with the same user that I'm querying the API with and it still does not show. Is there a flag I need to set somewhere to include custom segments?
Edit:
It turns out that it will list some custom segments, but not ones created with date of first session, so this is a duplicate of this question, which means that there is a bug in the Google Analytics API.
There was a bug, which is now fixed. It is now possible to list date of first session segments in the Google Analytics Management API by calling the segments.list() method.
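Once segments.list() returns the segments, pulling out the custom segment IDs in bulk might look like the sketch below. It assumes the response body has already been fetched and parsed, and that each item carries the name, segmentId and type fields of the Management API segment resource; the helper name and the sample response are made up for illustration.

```javascript
// Hypothetical helper: keep only custom segments from a parsed
// segments.list() response and return their names and query IDs.
function customSegmentIds(listResponse) {
  return (listResponse.items || [])
    .filter(function (segment) {
      return segment.type === 'CUSTOM';
    })
    .map(function (segment) {
      return { name: segment.name, segmentId: segment.segmentId };
    });
}

// Made-up response with one built-in and one custom segment:
var customSegments = customSegmentIds({
  items: [
    { name: 'All Users', segmentId: 'gaid::-1', type: 'BUILT_IN' },
    { name: 'First session in January', segmentId: 'gaid::EXAMPLE', type: 'CUSTOM' }
  ]
});
```

The segmentId values can then be passed as the segment parameter of reporting queries without copying them over by hand.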
So after days of trying to solve this one I've come to the conclusion that it cannot be done as asked.
There is, however, another way to do it. For every segment, set up a daily (or weekly, etc.) email report sent to an email address as a TSV. In each email body, specify the name of the segment, so that when you're consuming the emails you know which segment the attached TSV is for. It doesn't look like the daily reports were designed with segments in mind, since none of the metadata included in the TSV mentions which segment it is for.
From there it's trivial. Connect to the email address using an IMAP client once a day and update the numbers.
Note that the daily email only contains the numbers for that day (not a specified range), so you'll need to first generate the report one time with the historical data to load in.
While hacky, one nice thing about this approach is that it keeps your reports in sync with your (faked through email) API code, provided you match the column headings in the TSV. So if, for example, a new filter is added to a report, the new daily fields will continue to update.
Unfortunately though, the past data won't be reflected in the change.
Obviously this isn't great, but if you are monitoring daily cohorts it's the best you've got if you need to stay with Google Analytics. I have raised this as a bug to the Google Analytics developers, but I haven't heard back as to whether or not they plan to fix it.
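The consuming side of this email workaround could be sketched as follows, leaving out the IMAP connection itself. Both helpers are hypothetical: the first assumes the segment name was written into the email body on a line of the form "Segment: <name>", and the second assumes the attachment is tab-separated with a header row, optionally preceded by comment lines starting with '#'.

```javascript
// Hypothetical: read the segment name from a "Segment: <name>" line
// that was typed into the scheduled email's body.
function segmentNameFromBody(body) {
  var match = /Segment:\s*(.+)/.exec(body);
  return match ? match[1].trim() : null;
}

// Hypothetical: parse the TSV attachment into objects keyed by the
// column headings, skipping blank lines and '#' comment lines.
function parseTsv(text) {
  var lines = text.split(/\r?\n/).filter(function (line) {
    return line.trim() !== '' && line.charAt(0) !== '#';
  });
  if (lines.length === 0) return [];
  var headers = lines[0].split('\t');
  return lines.slice(1).map(function (line) {
    var cells = line.split('\t');
    var row = {};
    headers.forEach(function (header, i) {
      row[header] = cells[i];
    });
    return row;
  });
}

// Made-up email content:
var segmentName = segmentNameFromBody('Daily report\nSegment: New users Jan\n');
var reportRows = parseTsv('# report\nDate\tSessions\n20180101\t12\n');
```

Keying rows by column heading is what lets the stored numbers follow the report when columns are added later.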
