I have a requirement to programmatically get unique visitors grouped by partial matches on some fields. For example, assume I want to group my users by the source domain like "google" or "facebook".
A single user's visits might come in with a ga:source of "m.facebook.com" and then "www.facebook.com" on another visit, or "m.google.com" and "www.google.co.uk", etc. I can perform an API query specifying "ga:source" as the dimension, and it will give me the unique visitors for "m.facebook.com", "www.facebook.com", "m.google.com" and "www.google.co.uk" respectively. However users who visited via more than one of them in the requested period are counted in each group, so aggregating this data subsequently into "facebook" and "google" groups results in duplicate users being counted.
Would it be possible to group the "ga:source" dimension using a Regex (^(?:.*?\.)(.*?)(?:\..*) for instance) or some similar arbitrary mechanism so that I can get two groups of unique visitors instead: "facebook" and "google"?
I can, of course, use filters to get each category with a separate request, and that works fine. But being the lazy programmer I am, I was wondering whether I could do it all in one go, or whether anyone had alternative suggestions I haven't thought of.
The conclusion appears to be that the only way is indeed to submit a filtered query for each desired grouping of unique users. So I shall do that. :)
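A minimal sketch of the one-filtered-query-per-group approach, assuming the Core Reporting API v3 parameter names (`ids`, `start-date`, `end-date`, `metrics`, `filters`) and the `=~` regex operator in filters. The group patterns, the view ID and the helper name `buildGroupQueries` are hypothetical; `ga:users` is used as the unique-visitor metric.

```javascript
// Hypothetical groups: label -> regex that is matched against ga:source.
const SOURCE_GROUPS = {
  facebook: "facebook",
  google: "google",
};

// Build one set of query parameters per group. Each query is filtered to a
// single group, so the users it returns are de-duplicated within that group.
function buildGroupQueries(viewId, startDate, endDate, groups) {
  return Object.entries(groups).map(([label, pattern]) => ({
    label,
    params: {
      ids: viewId,
      "start-date": startDate,
      "end-date": endDate,
      metrics: "ga:users",
      filters: "ga:source=~" + pattern, // =~ means "matches regex"
    },
  }));
}

const queries = buildGroupQueries("ga:12345", "2016-01-01", "2016-01-31", SOURCE_GROUPS);
for (const q of queries) {
  console.log(q.label, q.params.filters);
}
```

Each entry in `queries` is then sent as its own request, and the values must be URL-encoded when they are placed in the query string.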
I'm working with internal site search terms from Google Analytics in Google Data Studio. I need to count how many times users searched specific terms on the website. The problem is, the data is case sensitive and users often misspell words when they search, so that won't get tallied in a normal count function. For example, "careers", "Careers", "cAREERS", and "carers" are all different searches. What formula can I use to easily count how many times users searched different terms?
First add a field that applies the LOWER function to the search term. Then add a field with a CASE WHEN expression that corrects each known spelling error.
Another route would be to create a "sounds like" field. BigQuery offers a nice function for this, SOUNDEX. Data Studio does not offer something like that, but you can build an approximation with regular expressions: keep the first character of the word, then only the vowels of the rest of the word, removing duplicated vowels first.
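To illustrate the "sounds like" idea outside Data Studio, here is a rough JavaScript sketch. It reads "duplicated vowels" as runs of the same vowel, which is an assumption, and the helper names `soundsLikeKey` and `countByKey` are hypothetical.

```javascript
// Build a coarse "sounds like" key: lowercase the term, keep the first
// character, then keep only the vowels of the rest of the word.
function soundsLikeKey(term) {
  const word = term.trim().toLowerCase();
  if (word.length === 0) return "";
  const rest = word
    .slice(1)
    .replace(/([aeiou])\1+/g, "$1") // collapse runs of the same vowel ("ee" -> "e")
    .replace(/[^aeiou]/g, "");      // drop everything that is not a vowel
  return word[0] + rest;
}

// Count search terms per key.
function countByKey(terms) {
  const counts = new Map();
  for (const term of terms) {
    const key = soundsLikeKey(term);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
}

const counts = countByKey(["careers", "Careers", "cAREERS", "carers"]);
console.log(counts.get("cae")); // 4
```

The key is deliberately coarse, so unrelated words can collide (for example, "cakes" also maps to "cae"). It is worth checking the resulting groups by eye before trusting the counts.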
I'm trying to get all unique visitors for a selected time period, but I want to filter them by date on the server. However, the sum of unique visitors for each day isn't the number of unique visitors for the time period.
For example:
Monday: 2 unique visitors
Tuesday: 3 unique visitors
The number of unique visitors for the two-day period isn't necessarily 5.
Is there a way to get the results I want using the Google Analytics API (v3)?
You're right that Users aren't additive, so you can't simply add them day by day. There are several ways around this.
The first and most obvious is that if you've implemented the User-ID, you should be able to pull and interrogate the data directly to see which users visited your site on which days.
Another way I've implemented before is to dynamically pull the number of Users from the Google Analytics API whenever you need it. Obviously this only works if you're populating a live web dashboard or similar, but since it's just the one figure you're asking for, it wouldn't slow down the load time by much. E.g. if you're using a dashboarding tool such as Klipfolio, you may be able to define a dynamic data source and query Google whenever you need the figure (https://support.klipfolio.com/hc/en-us/articles/216183237-BETA-Working-with-dynamic-data-sources).
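As a rough sketch of the dynamic pull, the request below asks for `ga:users` over the whole selected range with no date dimension, so the API returns a single de-duplicated total rather than per-day figures. It assumes the Core Reporting API v3 endpoint and parameter names; the view ID and the helper name `usersQueryUrl` are hypothetical, and authorization (an OAuth access token) is not shown.

```javascript
// Build a Core Reporting API v3 request URL for the total number of users
// in an arbitrary date range.
function usersQueryUrl(viewId, startDate, endDate) {
  const params = new URLSearchParams({
    ids: viewId,
    "start-date": startDate,
    "end-date": endDate,
    metrics: "ga:users", // no dimensions: one total for the whole range
  });
  return "https://www.googleapis.com/analytics/v3/data/ga?" + params.toString();
}

console.log(usersQueryUrl("ga:12345", "2016-01-04", "2016-01-05"));
```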
You could also limit the number of ways that the data can be interrogated, and calculate all of them. For example, if you only allow users to look at data month-by-month or day-by-day, then you only need those figures.
Finally, you can estimate the figure with reasonable accuracy by splitting it into two parts. New Users are equal to New Sessions (you're only new on your first Session), which is additive, so that figure can be separated out and combined as required.
Then, you could take a rough ratio of new to returning Users (% New Users) from, say, 1 year of data, and use that with the New Users figure to generate an average on any level.
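The estimate described above can be sketched as follows, assuming it amounts to dividing the summed New Users by the long-run share of new users. The function name and the numbers are hypothetical.

```javascript
// Estimate Users for a period from per-day New Users (which are additive)
// and a long-run ratio of new users to all users.
function estimateUsers(newUsersPerDay, newUserRatio) {
  if (!(newUserRatio > 0 && newUserRatio <= 1)) {
    throw new RangeError("newUserRatio must be in (0, 1]");
  }
  const totalNewUsers = newUsersPerDay.reduce((sum, n) => sum + n, 0);
  return totalNewUsers / newUserRatio;
}

// 40 + 60 = 100 new users; if 25% of users are new, that suggests about 400 users.
console.log(estimateUsers([40, 60], 0.25)); // 400
```

This is only an approximation: the share of new users varies with the length of the period, so the ratio taken from a year of data will not match every shorter range exactly.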
To anonymously analyze user flow and engagement, I want to use the ClientID, as an identifier of each user, as the value of a custom dimension. I have two questions regarding this idea:
How many values can be associated with a custom dimension? This will determine whether the approach is feasible.
Is there any other approach to track users' activity individually, yet anonymously?
I'm not aware of a limit for custom dimension length, though. Storing userId, sessionId, customerId and timestamps for all hits in custom dimensions is not all that unusual these days. Simo Ahava's post "Improve Data Collection With Four Custom Dimensions" explains how to set it all up in Google Tag Manager.
For a hit-based custom dimension you can store as many values as there are hits. The problem is not storage; the problem is that the interface will not show more than 50 000 rows with distinct values (any additional values go into a row labeled "other"). Also, some of the reports (namely demographics) will not work with very small segments.
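Outside Google Tag Manager, the same idea can be sketched with analytics.js directly. This assumes the standard `ga` command queue, where a function passed to `ga` receives the default tracker once the library has loaded. The dimension index and the helper name `setClientIdDimension` are hypothetical, and the custom dimension must already exist in the property settings.

```javascript
// Copy the tracker's clientId into a custom dimension so that it is sent
// with subsequent hits. `ga` is the analytics.js command queue function.
function setClientIdDimension(ga, dimensionIndex) {
  ga(function (tracker) {
    const clientId = tracker.get("clientId");
    ga("set", "dimension" + dimensionIndex, clientId);
  });
}

// Usage in the page, after ga('create', ...) and before ga('send', 'pageview'):
// setClientIdDimension(ga, 1);
```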
I cannot think of any other way to track users individually (and if you are interested in opinions, I blogged about how I do not understand why people want to do this). The interface is not very well suited for this kind of "atomic" information, so I think the approach is more useful for API integrations that can properly visualize information on a per user basis.
I want to create a Google Analytics segment for our users who view at least a certain number of pages on our site. From what I can tell (please correct me if I'm wrong) this is easy to do if you don't care about what kind of page they view: you create a filter for the segment that checks to see if Unique Pageviews is greater than some value such as 4. However our site has a whole bunch of pages that I don't really care if someone reads (our "about page" for example). So what I'm trying to do is create a segment of how many people view at least X pages of what we call "Learning Content" (basically two specific page types on our site). How can I segment the users who read a certain amount of learning content?
Two types of pages fit into our definition of learning content. The first one has a URL matching a regex that sort of looks like /learning_content_1/.* and the second matches regex /learning_content_2/.*. I've already created a content group for learning content that correctly identifies these two content groups. However I wasn't able to find any way to filter a segment based on how many unique pageviews (or even just pageviews) come from a specific content grouping. Is this even possible? If not, how might I work around that?
The research I've done so far: Google Analytics: How to segment by many groups of pages was somewhat helpful but didn't address the question of how to create an actual GA segment based on pageview information for a content grouping or content group.
The only way I can think of handling this is by associating a specific custom event that gets triggered on these pages. Then you can create a segment that matches users who have that event category and total events greater than 4.
It's a workaround, and it doesn't work if you are tracking other events, but maybe that works for you?
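A minimal sketch of the event side of this workaround, assuming analytics.js and the two URL patterns from the question. The event category "Learning Content", the action "view" and the helper name `trackLearningContent` are hypothetical.

```javascript
// URL patterns for the two learning-content page types (from the question),
// anchored at the start of the path as an assumption.
const LEARNING_CONTENT = [/^\/learning_content_1\/.*/, /^\/learning_content_2\/.*/];

// Send a custom event when the current path is learning content.
// `ga` is the analytics.js command queue function.
function trackLearningContent(ga, path) {
  const isLearning = LEARNING_CONTENT.some((re) => re.test(path));
  if (isLearning) {
    ga("send", "event", "Learning Content", "view", path);
  }
  return isLearning;
}

// Usage in the page: trackLearningContent(ga, window.location.pathname);
```

Giving the event its own category and filtering the segment on that category keeps the count separate from any other events on the site.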
Our customers log in from several different computers over the course of a day, and so our unique visitor count in google analytics is really inflated. I'd like to give GA our user ids, for example, so that it could be much smarter about this stat. Is there any way to influence what GA considers a unique visitor?
Besides trickery with cookies (which I wouldn't recommend), there's no "built-in" way to better inform Google Analytics to take into account multiple computers for its Unique Visitors calculations.
However, you could set the user ID in a Custom Variable, and use that to track the number of "real" unique visitors, and the distribution within the users.
// Slot 1, name "User", value = the user's ID string, scope 1 (visitor-level)
_gaq.push(["_setCustomVar", 1, "User", user_id_string, 1]);
Be sure to delete this custom variable once the user signs out, both for privacy and accuracy reasons.
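A small sketch of the sign-out step, assuming the ga.js `_deleteCustomVar` method and the same slot (1) as above. The handler name `onSignOut` is hypothetical.

```javascript
// Remove the visitor-level custom variable in slot 1 when the user signs out.
// `gaq` is the ga.js command queue (the global _gaq array in the page).
function onSignOut(gaq) {
  gaq.push(["_deleteCustomVar", 1]);
}

// Usage in the page: onSignOut(_gaq);
```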
GA (ga.js) tracks sessions with two cookies, __utmb and __utmc, and identifies the visitor with the __utma cookie. You can try to capture and recreate those if your users authenticate or you have another way of tracking them.