ga:sessions via API not matching interface when custom segment applied - google-analytics

I have a custom segment that has been in use for over a year and which very recently started returning some very odd results in the interface. Session numbers seemed to drop to match the transaction numbers, so the ecommerce conversion rate was ~100%.
The interface seemed to "recover" last week, but the API still does not return the expected session numbers. Other custom segments return normal session numbers. I have queried the API with two different spreadsheet add-ons and the Query Explorer, but they all return the same result, even for very short time periods where sampling is not an issue.

Related

Google Analytics API segment vs. filter

My goal is to get ga:users, ga:sessions and ga:avgSessionDuration for all sessions that visited a page path containing /mockeroo from Google Analytics API v4. Ignoring session duration for now, it is my understanding that a filter clause like ga:page contains /mockeroo and a segment like sessions::condition::ga:page contains /mockeroo should produce the same results, at least for the number of users and the number of sessions. However, I cannot replicate the results shown in Google Analytics (which uses the segment constraint) with my API call (which uses the filter clause). Am I misunderstanding the difference between a filter clause and a segment?

How can I view individual hits to pages within a GA custom report

I would like to compare some data between a 3rd party analytics tool and GA.
Now I would love to see the IP addresses that GA is receiving, but it seems they do not reveal this information. Fine. However, I also cannot find a way to use the flat table in a GA custom report to show me the following, if possible:
Full Date Time (Seems as though they don't want you to have this either)
Browser Version
Browser Width & Height
Page (from the hit)
I would also like this data not to be aggregated, so that if the same user has hit a page 3 times I see three separate rows rather than one grouped row.
If anyone can help please let me know. If the question is poorly phrased please let me know.
Thanks,
Connor.
This requires some work, and it will allow the breakdown only for future hits, not for hits that are already collected.
To view individual hits you need to create a hit-based dimension that is unique per hit. Unless your page has an enormous amount of traffic, a timestamp in milliseconds (e.g. new Date().getTime()) will be sufficient (for your report you might want to format it in a more readable way). So in the admin section of your GA property you go to custom definitions, create a hit-scoped custom dimension, and then modify your page code to send the timestamp to that dimension. Hit-scoped means it is attached to the pageview (or other interaction hit) it is sent with.
If you want to break down your report by user you need the client ID (the client ID is how Google recognizes that hits belong to the same user). Again, send it as a custom dimension.
This does not tell you how many sessions the user had (there is no session identifier in GA). If you need to know that, you can create a session-scoped custom dimension and send a random number along ("session scope" means that GA only stores the last value in a session, so you don't need to maintain a session ID over multiple pageviews, since the last value will be set for all hits within the session). The number of different session IDs per client ID then tells you the number of sessions per user.
The takeaway is that GA only shows aggregated data, and if you want to defeat this mechanism you need to throw data at it that cannot be aggregated further. You might run into other constraints (e.g. there is a limited number of rows per report).
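The three custom dimensions described above could be assembled roughly as follows. This is only a sketch: the slot names dimension1 to dimension3 and the helper buildHitFields are assumptions for illustration, and the real indexes are whatever GA assigns when you create the custom dimensions in the admin section.

```javascript
// Assumed custom dimension slots (hypothetical; check your property's admin section).
const HIT_TIMESTAMP_DIMENSION = "dimension1"; // hit scope
const CLIENT_ID_DIMENSION = "dimension2";     // carries the client ID
const SESSION_ID_DIMENSION = "dimension3";    // session scope: last value wins

// Builds the extra fields to attach to a single hit.
function buildHitFields(clientId, now = new Date()) {
  return {
    // Millisecond timestamp, effectively unique per hit at moderate traffic.
    [HIT_TIMESTAMP_DIMENSION]: String(now.getTime()),
    [CLIENT_ID_DIMENSION]: clientId,
    // Random number; GA keeps only the last value sent within a session.
    [SESSION_ID_DIMENSION]: String(Math.floor(Math.random() * 1e12)),
  };
}

// In the browser, with analytics.js loaded, the fields could be sent like this:
// ga(function (tracker) {
//   ga('send', 'pageview', buildHitFields(tracker.get('clientId')));
// });
```

Because the session-scoped value is overwritten on every hit, the random number does not have to be stored between pageviews.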

How to replicate the GA field Visits in Big Query

In a typical GA session, after picking a View ID and a date range, we can get a week's worth of data like this:
Users: 146,207
New Users: 124,582
Sessions: 186,191
The question is, what BQ field(s) to query in order to get this Users value?
Here is an example query with 2 methods (the 2nd method is commented out).
SELECT
  COUNT(DISTINCT CONCAT(CAST(visitID AS STRING), CAST(visitNumber AS STRING))) AS visitors,
  -- COUNT(DISTINCT fullVisitorId) AS visitors
I noticed the fullVisitorId (FVID) method was fairly close to what I see in GA (with Users being understated by about 3% in BQ), and if I use the commented out method, I get a value that is about 15% overstated compared to GA. Is there a more reliable method in BQ to acquire the Users value in GA?
The COUNT(DISTINCT fullVisitorId) method is the most correct method, but it won't match what Analytics 360 reports by default. Since last year, Google Analytics 360 by default uses a different calculation for the Users metric than it previously did. The old calculation, which is still used in unsampled reports, is more likely to match what you get out of BigQuery. You can verify this by exporting your report as an unsampled report, or using the unsampled reporting features in the Management API.
If you want the numbers to match exactly, you can turn off the new calculation by using the instructions here. The new calculation's precise details are not public, so duplicating that value in BigQuery is quite difficult.
There are still some reasons you might see different numbers, even with the old calculation. One is if the site has implemented User ID, in which case the GA number will be lower than BigQuery for fullVisitorId. Another is sampling, though that's unlikely in Analytics 360 at the volumes you're talking about.
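To see why the two counting methods in the question diverge, here is a small sketch over made-up rows. The field names follow the GA BigQuery export schema, but the rows and the helper countDistinct are hypothetical. Since visitId is only unique per user, concatenating it with visitNumber yields something closer to a session count (with collisions across users) than a user count.

```javascript
// Hypothetical export rows, one per session.
const rows = [
  { fullVisitorId: "A", visitId: 1001, visitNumber: 1 },
  { fullVisitorId: "A", visitId: 2002, visitNumber: 2 },
  { fullVisitorId: "A", visitId: 4004, visitNumber: 3 },
  { fullVisitorId: "B", visitId: 1001, visitNumber: 1 }, // collides with A's first session
  { fullVisitorId: "C", visitId: 3003, visitNumber: 1 },
];

// Counts the distinct keys produced by keyFn.
function countDistinct(items, keyFn) {
  return new Set(items.map(keyFn)).size;
}

// Equivalent of COUNT(DISTINCT fullVisitorId): distinct users.
const users = countDistinct(rows, (r) => r.fullVisitorId);

// Equivalent of COUNT(DISTINCT CONCAT(visitId, visitNumber)): neither users nor sessions.
const concatKeys = countDistinct(rows, (r) => `${r.visitId}${r.visitNumber}`);

// fullVisitorId combined with visitId identifies a session.
const sessions = countDistinct(rows, (r) => `${r.fullVisitorId}-${r.visitId}`);

console.log({ users, concatKeys, sessions });
```

With these rows the three counts differ, which is the kind of overstatement relative to Users that the concatenation method produces.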

Revenue in Google Analytics not appearing correctly

I've just finished tagging a website and am currently submitting test data.
On one page we fire an Ecommerce purchase event, which consists of the transaction information and product information.
The transaction object is populated by retrieving values off the page, i.e. from HTML labels, and this is all working as expected. However, when we check Google Analytics > All Web Site Data, GA seems to be formatting the number and removing trailing zeros for some reason. Please see an example:
This doesn't happen to all purchase events, as you can see at the top of the table there is an entry for $1,793.04 and this has been displayed correctly.
As for populating the transaction object, we aren't doing any formatting whatsoever; we simply reference the HTML label value and pass that, so I'm unsure how this could be happening. Has anyone experienced this before?
I have not found this documented anywhere else, but to be safe you can conform to the format expected by the Measurement Protocol, that is, the HTTP request that is ultimately sent to the GA servers to process your values. The transactionRevenue field and most monetary values in analytics.js are of type currency, and you can find the documentation of this data type in the parameter reference. To quote from the link:
A decimal point is used as a delimiter between the whole and fractional portion of the currency.
So I would advise you to use a JavaScript function to format your string values: remove the comma, then use the period (.) as the separator between the whole and fractional parts.
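A minimal sketch of such a formatting function follows. The name toGaCurrency is hypothetical, and the sketch assumes the label text uses a comma as the thousands separator and a period as the decimal separator; other locales would need different handling.

```javascript
// Converts label text such as "$1,793.04" into a plain decimal string.
// Assumption: comma = thousands separator, period = decimal separator.
function toGaCurrency(raw) {
  // Drop everything except digits, the decimal point, and a minus sign.
  const cleaned = String(raw).replace(/[^0-9.-]/g, "");
  const value = Number(cleaned);
  if (cleaned === "" || !Number.isFinite(value)) {
    throw new Error(`Cannot parse currency value: ${raw}`);
  }
  // Always emit two decimal places with "." as the delimiter.
  return value.toFixed(2);
}

console.log(toGaCurrency("$1,793.04"));
console.log(toGaCurrency("1,200"));
```

The result can then be passed as the revenue value instead of the raw label text.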

How can I query Google Analytics condition on TWO different dates?

I wish to extract (via the Analytics Core Reporting API) all the transactions made TODAY by users that had a specific ga:eventCategory a few weeks ago.
I'm looking to see the date of a transaction and the dates of all events that are related to that transaction.
If GA were SQL, I would join on the GA user and take as dimensions both the transaction date and the event date...
Thanks.
Noam.
As I indicated in my comment, you can segment the data to include only those users who have the specific event. Segmentation works fine with the Core Reporting API.
Your segment definition would look like this:
users::condition::ga:eventCategory==[myEventCategory]
(where obviously the thing in [brackets] is a placeholder that needs to be replaced with the event category name). The "users::" prefix means you are segmenting by user scope (as opposed to sessions), so this will include all sessions in the selected timeframe for users who had the event in at least one of their sessions (even if the event was outside the selected timeframe).
Select transactionId as the dimension, some metric (revenue), and today's date, and you are done. Or you would be done if this was actually going to work, but there are at least two caveats:
Google Analytics does not work in real time, so it's unlikely that TODAY's transactions are fully available (Google says it takes 24 hours until the data is processed; it might actually happen faster, but you cannot rely on it).
If a user has deleted their cookie, they won't be recognized as a returning user and GA will be unable to include them in the segment. The longer the interval between the event and the transaction, the less likely it is that the GA cookie is still present.
So even with a technically correct query it might be that you won't get the data you need.
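The query described above can be sketched as a Core Reporting API (v3) request URL. The function name buildTransactionQuery, the view ID, and the event category are placeholders, ga:transactionRevenue is assumed as the revenue metric, and authentication is omitted entirely.

```javascript
// Core Reporting API v3 data endpoint.
const ENDPOINT = "https://www.googleapis.com/analytics/v3/data/ga";

// Builds a request URL for today's transactions, restricted to users
// who triggered the given event category at some point.
function buildTransactionQuery(viewId, eventCategory) {
  const params = new URLSearchParams({
    ids: `ga:${viewId}`,
    "start-date": "today",
    "end-date": "today",
    dimensions: "ga:transactionId",
    metrics: "ga:transactionRevenue",
    // User-scoped dynamic segment, as in the answer above.
    segment: `users::condition::ga:eventCategory==${eventCategory}`,
  });
  return `${ENDPOINT}?${params.toString()}`;
}

// Placeholder view ID and event category.
console.log(buildTransactionQuery("12345", "signup"));
```

The caveats above still apply: the URL may be technically correct and still return incomplete data for the current day.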

Resources