I'm using the Marketing Solutions Criteo API.
I want to pull metrics at the Ad level, but there is no endpoint for ads; the lowest level available seems to be Adsets.
Therefore, I'm using the statistics/report endpoint in Analytics to get Ad-level information, since it offers AdId and Ad (ad name): https://developers.criteo.com/marketing-solutions/docs/campaign-statistics
The response includes a total and rows. I expected these rows to be at the Ad level, but it's not clear to me what they represent. For example, in a 115-row response, 28 rows have an empty string as AdId, another 28 have -1 as AdId, one numeric AdId is also repeated 28 times, and then there are just a few unique values. What is the API actually returning in rows? They don't seem to be Ads, so what are they?
I cannot find anything on these rows in the documentation.
Typically, -1 is used to indicate an unknown value or one that is not applicable, e.g. if you are querying historical data for an adset that is linked to an archived creative. For more precise information, I suggest that you reach out to your Criteo account representative. I agree that this should be documented on the public documentation pages.
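One way to see what the rows are is to tally them by AdId. This is a minimal sketch, assuming the response has already been parsed into a list of dicts with an `AdId` key; the sample rows and the `Day` and `Clicks` keys are made up for illustration and are not real API output.

```python
from collections import Counter

PLACEHOLDER_IDS = ("", "-1")  # values the question reports seeing instead of real ad IDs

def tally_ad_ids(rows):
    """Count rows per AdId, separating placeholder values from real IDs."""
    counts = Counter(str(row.get("AdId", "")) for row in rows)
    placeholders = {k: v for k, v in counts.items() if k in PLACEHOLDER_IDS}
    real = {k: v for k, v in counts.items() if k not in PLACEHOLDER_IDS}
    return placeholders, real

# Hypothetical rows, shaped like a parsed report response.
sample_rows = [
    {"AdId": "", "Day": "2021-01-01", "Clicks": 3},
    {"AdId": "-1", "Day": "2021-01-01", "Clicks": 1},
    {"AdId": "12345", "Day": "2021-01-01", "Clicks": 7},
    {"AdId": "12345", "Day": "2021-01-02", "Clicks": 5},
]

placeholders, real = tally_ad_ids(sample_rows)
print(placeholders, real)
```

If the same AdId repeats a fixed number of times (28 in the question), it may be worth checking whether that matches the number of values of another dimension in the request, such as days. That is a guess to verify against your own request, not documented behavior.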
I wonder whether someone can help me please.
I have a user who under a specific property, sporadically receives the following error:
Some hits sent on 03-Jul-2018 to property ...... exceeded one or more hit quotas and were therefore not processed.
Hits can be dropped when daily or monthly hit limits are exceeded. You can view your hit volume levels in Property Settings in Analytics.
Hits can also be dropped if visitor hit limits are exceeded. This can happen when your site is incorrectly generating the visitor ID for a GA session. Contact your website administrator to check that the visitor ID generation has been correctly implemented.
They are not using the Premium account, but when I look at the data for the day in question, there aren't any 'High Cardinality' issues, which, unless I've misunderstood, I'd expect to see.
Could someone look at this please and offer some guidance where the issue may be because this area is fairly new to me.
Many thanks and kind regards
Chris
Collection limits are influenced by 2 factors:
The tracker: whether you use ga.js, gtag.js, analytics.js, etc. (here are the details).
The property type: whether you are using GA (10M hits / month) or GA 360 (2B hits / month).
In your case you are facing a property limit. To find out when such limits were reached, you can create a custom report using a time dimension (e.g. date + time) combined with the hits metric. You can also combine the hits metric with other dimensions (country, browser, device) to see if you find any patterns as to why you're getting so many hits.
Cardinality is something else: it refers to the number of unique value combinations for your dimensions. For instance, if you have 500K events where each event category is different, you'll have a cardinality of 500K on the event category dimension. The more hits, the more likely you'll have a high cardinality, but the two aren't necessarily related (if you send 10B events with the same category, the cardinality on the category is 1).
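The difference between hit volume and cardinality can be shown with a small sketch. The hit dictionaries and the `eventCategory` key below are illustrative only, not a real export format.

```python
def cardinality(hits, dimension):
    """Number of distinct values seen for one dimension across all hits."""
    return len({hit[dimension] for hit in hits})

# Same number of hits in both cases, very different cardinality.
same_category = [{"eventCategory": "video"} for _ in range(1000)]
unique_categories = [{"eventCategory": f"cat-{i}"} for i in range(1000)]

print(len(same_category), cardinality(same_category, "eventCategory"))
print(len(unique_categories), cardinality(unique_categories, "eventCategory"))
```

Both lists contain 1000 hits, so they count the same towards a hit quota, but only the second one has high cardinality.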
So focus on identifying and solving your limits/quotas issue, as it's the real issue here:
If the number of hits is legitimate (you have a huge amount of traffic), then the only options are to upgrade to GA 360 or reduce the number of hits for each session.
If the number of hits is abnormally high (e.g. traffic is stable but hits increased dramatically), look for implementation issues, especially generic event trackers such as error tracking set up with tools like Google Tag Manager.
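Once the hits-by-date report is exported, a running total shows roughly when the limit was crossed. This is a rough sketch, assuming the report is available as (date, hits) pairs; the figures are invented, and treating the quota as a simple running sum over the listed days is an assumption, since the exact window Google applies is not described here.

```python
from itertools import accumulate

MONTHLY_LIMIT = 10_000_000  # standard GA figure quoted above

def first_day_over_limit(daily_hits, limit=MONTHLY_LIMIT):
    """Return the first date on which cumulative hits exceed the limit, or None."""
    dates = [date for date, _ in daily_hits]
    running_totals = accumulate(hits for _, hits in daily_hits)
    for date, total in zip(dates, running_totals):
        if total > limit:
            return date
    return None

# Hypothetical daily totals taken from a custom report.
daily = [
    ("2018-07-01", 4_000_000),
    ("2018-07-02", 4_500_000),
    ("2018-07-03", 2_000_000),
]

print(first_day_over_limit(daily))
```

A sudden jump in one day's total compared with its neighbours points to the implementation issues mentioned above rather than genuine traffic growth.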
In a typical GA session, after picking a View ID and a date range, we can get a week's worth of data like this:
Users: 146,207
New Users: 124,582
Sessions: 186,191
The question is, what BQ field(s) to query in order to get this Users value?
Here is an example query with 2 methods (the 2nd method is commented out).
SELECT
  COUNT(DISTINCT CONCAT(CAST(visitID AS STRING), CAST(visitNumber AS STRING))) AS visitors,
  -- COUNT(DISTINCT fullVisitorId) AS visitors
I noticed the FVID method was fairly close to what I see in GA (with Users understated by about 3% in BQ), and if I use the commented out method, I get a value that is overstated by about 15% compared to GA. Is there a more reliable method in BQ to get the Users value shown in GA?
The COUNT(DISTINCT fullVisitorId) method is the most correct method, but it won't match what Analytics 360 reports by default. Since last year, Google Analytics 360 by default uses a different calculation for the Users metric than it previously did. The old calculation, which is still used in unsampled reports, is more likely to match what you get out of BigQuery. You can verify this by exporting your report as an unsampled report, or using the unsampled reporting features in the Management API.
If you want the numbers to match exactly, you can turn off the new calculation by using the instructions here. The new calculation's precise details are not public, so duplicating that value in BigQuery is quite difficult.
There are still some reasons you might see different numbers, even with the old calculation. One is if the site has implemented User ID, in which case the GA number will be lower than BigQuery for fullVisitorId. Another is sampling, though that's unlikely in Analytics 360 at the volumes you're talking about.
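To see why the two expressions in the question give different numbers, here is a small in-memory sketch. The three field names come from the query in the question; the session rows themselves are invented.

```python
# Hypothetical session rows: visitor A has three sessions, visitor B has one.
sessions = [
    {"fullVisitorId": "A", "visitId": 1001, "visitNumber": 1},
    {"fullVisitorId": "A", "visitId": 1050, "visitNumber": 2},
    {"fullVisitorId": "A", "visitId": 1100, "visitNumber": 3},
    {"fullVisitorId": "B", "visitId": 1001, "visitNumber": 1},
]

# Equivalent of COUNT(DISTINCT fullVisitorId): one entry per visitor.
by_visitor = len({s["fullVisitorId"] for s in sessions})

# Equivalent of COUNT(DISTINCT CONCAT(visitId, visitNumber)):
# one entry per distinct visit key, which is not a user-level key.
by_visit_key = len({f'{s["visitId"]}{s["visitNumber"]}' for s in sessions})

print(by_visitor, by_visit_key)
```

The concatenated key grows with each repeat session from the same visitor, and it can also collide when two visitors share a visitId and visitNumber (as A's and B's first sessions do here), so it is neither a user count nor a clean session count.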
If I make a request to the Google Analytics API using only the metric "ga:users", the result is different from the one returned in the "totalsForAllResults" field when I add a dimension.
Does anyone know the explanation for this and which is the correct result?
You cannot sum up the user counts per dimension value and get a grand total, because one and the same user can appear under multiple dimension values. For a detailed explanation look here. If you want the total users value, repeat the API request without the dimension. Apparently, Google makes the mistake of blindly summing the values in the totalsForAllResults field of the Core Reporting API response, which can be highly misleading.
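A tiny example of the double counting, using invented (user, device) pairs in place of real report rows:

```python
# Hypothetical hits: user u1 appears on both desktop and mobile.
hits = [
    ("u1", "desktop"),
    ("u1", "mobile"),
    ("u2", "mobile"),
    ("u3", "desktop"),
]

# Total distinct users, as a request without any dimension would count them.
total_users = len({user for user, _ in hits})

# Distinct users per dimension value, then naively summed.
users_per_device = {}
for user, device in hits:
    users_per_device.setdefault(device, set()).add(user)
summed_users = sum(len(users) for users in users_per_device.values())

print(total_users, summed_users)
```

The summed figure is higher because u1 is counted once under each device, which is the same effect that inflates the total when a dimension is added to the request.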
To anonymously analyze user flow and engagement, I want to use the ClientID, as the identifier of each user, as the value of a custom dimension. I have two questions regarding this idea:
How many values can be associated with a custom dimension? This will determine whether the approach is feasible.
Is there any other approach to track individually, yet anonymously, users activity?
I'm not aware of a limit on custom dimension length, though. Storing userId, sessionId, customerId and timestamps for all hits in custom dimensions is not all that unusual these days. Here is a link to Simo Ahava's post Improve Data Collection With Four Custom Dimensions on how to set it all up in Google Tag Manager.
For hit based custom dimension you can store as many values as there are hits. The problem is not storage, the problem is that the interface will not show more than 50 000 rows with distinct values (any additional value will go into a row labeled "other"). Also some of the reports (namely demographics) will not work with very small segments.
I cannot think of any other way to track users individually (and if you are interested in opinions, I blogged about how I do not understand why people want to do this). The interface is not very well suited for this kind of "atomic" information, so I think the approach is more useful for API integrations that can properly visualize information on a per user basis.
If you refer to http://www.google.com/intl/en_uk/analytics/premium/features.html, you will notice that Standard allows for 10 million hits processed per month and Premium allows for 1 billion.
I have a website on an account, with multiple "folders" for different sub-domains, and also different "Views" or dashboards for some of these sub-domains.
The website I work on recently lost tracking for conversion rates, and everything has plummeted to near 0%, which is an incorrect statistic. I am curious how I can figure out whether this account is reaching the 10 million limit on the standard version, or at least how to find the actual hits processed per day, week, or month.
Any ideas?
Thanks!
I don't know how Google enforces hit limits in 2015. However in 2013 a Google representative sent one of our bigger clients a document (answering a question about data limits) that contained the following paragraph:
How do data limits impact sampling? Google Analytics does not sample
your clients data at the point of collection or processing, regardless
of how far they exceed our stated limits. So no hits are discarded.
The only way to sample data at the point of collection is for clients
to use _setSampleRate in their tracking code.
[...]
[...] we reserve the right to shutdown their account [sc. if limits are exceeded], but it won't
happen before we have attempted to contact the account Admins multiple times
and we have exhausted all other options.
Unless Google has changed its policy in the last 1.5 years, I would say no, unprocessed hits are not your problem; it seems Google would have contacted you with a request to limit your hits or upgrade to Analytics Premium before problems occur.
Plus, since you mentioned that you have several views: views do not count towards your quota (they display the same data in different ways). However, properties (I think that is what you mean by "folders") do.
Updated 2017: It seems that Google intends to enforce limits more strictly. One of my clients now has the following warning in his GA interface:
Your data volume (XXX hits) exceeds the limit of 10M hit per month as
outlined in our terms of service. If you continue to exceed the limit
you will lose access to future data.
You can create a database table, like this:
create table visits(
    id bigint primary key auto_increment,
    ip text,
    visit_date timestamp default current_timestamp
);
Upon each page visit, you can insert a record into the table. Later you can view statistics. For instance, the visit count for a given day would look like:
select count(*) as visit_count
from visits
where visit_date >= '2015-07-21 00:00:00' and visit_date < '2015-07-22 00:00:00';
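The same idea can be tried end to end with Python's built-in sqlite3 module. Note that this swaps in SQLite for whatever database the table above was written for (so `auto_increment` becomes `AUTOINCREMENT`), and the helper names `record_visit` and `visits_between` as well as the IP addresses are hypothetical.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE visits (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        ip TEXT,
        visit_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
    """
)

def record_visit(ip, visit_date=None):
    """Insert one row per page visit; the timestamp defaults to now."""
    if visit_date is None:
        conn.execute("INSERT INTO visits (ip) VALUES (?)", (ip,))
    else:
        conn.execute(
            "INSERT INTO visits (ip, visit_date) VALUES (?, ?)", (ip, visit_date)
        )

def visits_between(start, end):
    """Count visits with start <= visit_date < end."""
    row = conn.execute(
        "SELECT COUNT(*) FROM visits WHERE visit_date >= ? AND visit_date < ?",
        (start, end),
    ).fetchone()
    return row[0]

# Explicit timestamps are passed here only to make the example repeatable.
record_visit("203.0.113.5", "2015-07-21 09:15:00")
record_visit("203.0.113.9", "2015-07-21 18:40:00")
record_visit("203.0.113.5", "2015-07-22 00:05:00")

print(visits_between("2015-07-21 00:00:00", "2015-07-22 00:00:00"))
```

In a real site, `record_visit` would be called from the page handler without an explicit timestamp, so the column default applies.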