GA not tracking due to Unicode domain name? - google-analytics

I have 2 domains which I track using GA and GTM.
One works fine, the other doesn't (it doesn't track any data or realtime users).
Here is a screenshot of the working one including GTA screenshot:
I think I've set up the other (not working) domain just the same, however GTA only shows a GTM tracking, but no GA tracking:
It should show 2 trackings, namely GA and GTM tracking, right?
If it only shows the GTM tracking, it's wrong, right?
The domain that doesn't work has a Unicode name, but I'm not sure if that's the problem.
What could I investigate to find out why tracking the other domain isn't working?
This is my GTM setup:

I found the error.
Apparently, GTM does not accept the transcribed (Punycode) form of the Unicode domain name; it expects the "real" Unicode name, written with Unicode characters.
Once I added the Unicode domain name in the lookup table, it worked:
The upper entry (the transcribed Unicode name) could be deleted, since it didn't work anyway.
Later I removed it, and tracking still worked, which means that entry was never used.
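For anyone who wants to see the two spellings side by side, here is a minimal Node.js sketch. It only illustrates the Unicode versus Punycode forms of one hostname and a lookup that normalizes to the Unicode form first; it says nothing about how GTM works internally. The domain `español.com`, the property ID and the names `propertyByHostname` and `lookupProperty` are made up for illustration.

```javascript
// Node.js sketch: one internationalized domain in its two spellings.
const { domainToASCII, domainToUnicode } = require("url");

// Hypothetical lookup table keyed by the Unicode spelling, as in the fix above.
const propertyByHostname = new Map([
  ["español.com", "UA-XXXXX-1"], // placeholder property ID
]);

// Normalize whichever spelling arrives to the Unicode form before the lookup.
function lookupProperty(hostname) {
  return propertyByHostname.get(domainToUnicode(hostname));
}

console.log(domainToASCII("español.com"));          // xn--espaol-zwa.com
console.log(domainToUnicode("xn--espaol-zwa.com")); // español.com
console.log(lookupProperty("xn--espaol-zwa.com"));  // UA-XXXXX-1
```

If the lookup table can only hold literal keys, the same effect is achieved by entering the Unicode spelling as the key, which is what fixed the setup above.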

Related

Use look-up table in Tag Manager to determine which Analytics Property to post data to, based on URL

I have two websites on the same domain: example.com/fr and example.com/us. Each website has its own Google Analytics Property. I'd like to record page view data in each website's respective Analytics property.
Since there may be many more websites and Analytics Properties in future I thought the best way to manage this would be to create a custom variable of type RegEx Table to map Page URLs to Analytics Property Ids, like this:
*example\.com/us* maps to: UA-123456789-0
*example\.com/fr* maps to: UA-123456789-1
etc
…then use that variable as the value of the 'Google Analytics Settings' field in a new Universal Analytics Tag. I feel like I've set this up properly but it just will not log any data in Analytics whatsoever. I've tried it with even the loosest RegEx patterns e.g *example\.com* and even * and still nothing gets into Analytics.
I've tried a similar thing using a Lookup Table (as opposed to a RegEx Table) and that works as expected, but I believe a look-up table is limited to matching only an exact URL, so if I map http://example.com/us/ to UA-123456789-0 (my US site's Analytics Property Id) it works great when I visit that page exactly, but when I visit http://example.com/us/test/ it doesn't log anything as the URL isn't matched exactly in the lookup. So I know the principle of what I'm doing works with a Lookup Table, but it seems the same approach doesn't work with a RegEx Table.
I wanted to ask:
Is it possible to use a RegEx Table for outputting an Analytics Property Id?
If so, any pointers as to what I might have done wrong?
If not, are there any other neat options? (Otherwise I'm potentially going to have to set up a lot more variables/events/tags if I need to log the same things on each site to different Analytics Properties - it'd involve adding the same conditions onto pretty much every tag.)
Many thanks.
As prompted by the comment from @EikePierstorff, it turns out that it is possible to use a RegEx Table as a lookup for determining the Analytics Property Id based on the URL. The issue was simply incorrect regular expressions.
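The answer does not say which expressions were wrong, so the following is only a guess at one likely cause: the patterns in the question start with a bare `*`, which is glob syntax and not a valid regular expression. A rough JavaScript sketch of a RegEx-table style lookup with `.*` in its place (the `rows`, `propertyFor` and `isValidPattern` names are hypothetical):

```javascript
// Hypothetical RegEx-table rows: [pattern, Analytics property ID].
const rows = [
  [".*example\\.com/us.*", "UA-123456789-0"],
  [".*example\\.com/fr.*", "UA-123456789-1"],
];

// Return the output of the first row whose pattern matches the page URL.
function propertyFor(pageUrl) {
  for (const [pattern, propertyId] of rows) {
    if (new RegExp(pattern).test(pageUrl)) return propertyId;
  }
  return undefined; // no row matched
}

// Report whether a pattern compiles as a regular expression at all.
function isValidPattern(pattern) {
  try {
    new RegExp(pattern);
    return true;
  } catch (err) {
    return false;
  }
}

console.log(propertyFor("http://example.com/us/test/")); // UA-123456789-0
console.log(isValidPattern("*example\\.com/us*"));       // false
console.log(isValidPattern(".*example\\.com/us.*"));     // true
```

If the table is configured to require a match on the whole input, the `.*` on both sides is what lets sub-pages such as /us/test/ match.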

Google Analytics tracking for PDA emails

I have a requirement where I need to track whether a user clicked a link in a PDA email where the link included in the email is >900 characters.
I'm not sure if Google Analytics supports tracking in PDA emails.
If anyone has ever done this, please help me out.
Thanks
I seem to have misunderstood the question, so here is an update. Google will usually track any valid URL. The two exceptions I can think of are more theoretical than practical concerns.
Some old browsers (I think IE6 and similar vintages) have a character limit for GET requests (2048 bytes IIRC), so very long links will not work and thus will not be tracked correctly. For all practical purposes these browsers should be extinct by now.
A Google Analytics request is limited to 8192 bytes. The request has to transmit the document location as part of the payload, so if your URL is really massively oversized (technically 8000 characters is ">900") this would not be tracked. Again, this is hardly a practical concern (unless there is a lot of other data in that request, e.g. Enhanced E-Commerce product impressions).
Old (and probably irrelevant) answer:
Google Analytics typically does not track actions within emails, since email clients do not usually support JavaScript (there are implementations of email open tracking via "web bugs" linked to a script that does a Measurement Protocol request, but even that does not work particularly well).
If this is a link that points to your homepage the typical way to track this would be via utm parameters - i.e. you do not track the action within the email itself, but the result (the visit to your homepage).
UTM parameters (or "campaign parameters") are
utm_medium - the kind of traffic (e.g. paid advertising, banner ads, or in your case email)
utm_source - the specific vendor (e.g. "google" if the link is from a paid Google Ad, or in your case it could be the name of the department that sent out the mail)
utm_campaign - your advertising campaign; in the case of a periodic newsletter this could be e.g. the number of the newsletter
utm_term - you usually would not use that in an email, that's reserved for when a link is a result of a search (then you would insert the search term)
utm_content - if you have multiple links with the same link target and campaign info you can add additional information (e.g. if you have the same link at the top and the bottom of your mail you could indicate the position here)
You cannot do anything dynamic, though - if you want to mark links with a specific character count you would have to do this within your newsletter program and insert the number. GA would then be able to pick this up from the campaign parameters.
E.g. for your use case you might construct a target URL like
www.example.com?utm_medium=email&utm_source=my_department&utm_campaign=pda_mail&utm_content=<number of characters>
and then get the information from the Acquisition reports in Google Analytics.
If the links do not point to your own homepage you would need to set up an intermediate page that tracks the utm_parameters before it redirects to the intended destination.
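If the newsletter tool allows scripting, tagging the links could look roughly like this. It is a minimal sketch: `buildCampaignUrl` is a hypothetical helper, the parameter values are the ones from the example URL, and the character count 950 is an invented value.

```javascript
// Build a campaign-tagged URL; parameter values mirror the example above.
function buildCampaignUrl(baseUrl, linkLength) {
  const url = new URL(baseUrl);
  url.searchParams.set("utm_medium", "email");
  url.searchParams.set("utm_source", "my_department");
  url.searchParams.set("utm_campaign", "pda_mail");
  url.searchParams.set("utm_content", String(linkLength)); // e.g. the character count
  return url.toString();
}

console.log(buildCampaignUrl("https://www.example.com/", 950));
// https://www.example.com/?utm_medium=email&utm_source=my_department&utm_campaign=pda_mail&utm_content=950
```

Using `URL` and `searchParams` rather than string concatenation keeps any existing query string intact and takes care of percent-encoding.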

Google Analytics Site Search Config Questions

We have site search set up on our site, with the correct query parameter, however, we are not seeing site search data. See the image here for how our config is set up.
The URL for our search page looks like this https://www.premierinc.com/?s=Example.
We know there has been search traffic. Internal users (myself included) have performed a number of searches. The tag has been live for over a week, so processing delay shouldn't be an issue.
I've also triple checked that I'm on the correct view in GA.
Any ideas?
So it turns out that GA favors the document path (dp) parameter over the document location (dl) parameter. So although the search term was in the payload sent to GA, it was promptly ignored when it got there :)
Moral: if you use dp, you probably need to use dq as well.
(Thanks to Kim Towne on the GA forums for helping me figure this one out).
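To make the effect concrete, here is a toy model of the behaviour described above, not GA's actual processing. It assumes that when a hit carries dp, the search term is looked for in dp only. The function name `searchTermSeenByGa` is hypothetical, and the third case (keeping the query string inside dp) is an assumption of this sketch that the answer does not confirm.

```javascript
// Toy model: if the hit has dp, read the page (and any search term) from dp;
// otherwise fall back to the path and query string of dl.
function searchTermSeenByGa(hit, queryParameter) {
  let pageWithQuery;
  if (hit.dp !== undefined) {
    pageWithQuery = hit.dp;
  } else {
    const location = new URL(hit.dl);
    pageWithQuery = location.pathname + location.search;
  }
  // Resolve against a dummy origin so the query string can be parsed.
  return new URL(pageWithQuery, "https://placeholder.invalid").searchParams.get(queryParameter);
}

const dl = "https://www.premierinc.com/?s=Example";
console.log(searchTermSeenByGa({ dl }, "s"));                    // Example
console.log(searchTermSeenByGa({ dl, dp: "/" }, "s"));           // null
console.log(searchTermSeenByGa({ dl, dp: "/?s=Example" }, "s")); // Example
```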

In Google Analytics, how do I ignore a specific subdomain as a referral? The proper use of _addIgnoredRef

I need help understanding and also instructions on how to properly use "_addIgnoredRef".
First I will explain what type of situation I am in. We have 2 subdomains (each subdomain has its own webserver/application) that are being used as one website, meaning there are links going back and forth between these two subdomains. For the sake of an example, let's say we have "abc.website.com" and "123.website.com". We have links on abc.website.com that link to 123.website.com and vice versa; however, they are treated as one website.
Second, we do not develop or change any Google Analytics code on the domain website.com. We only have access to the subdomains, abc.website.com and 123.website.com.
So the issue we are seeing is that Google Analytics is telling us that we have referrals coming from these two subdomains. I understand that it's because we have these two subdomains linking back and forth.
I do understand GA has a command that allows me to IGNORE the referrals by using _addIgnoredRef . However, am I safe to assume that I go to abc.website.com and append this to the GA code,
_gaq.push(['_addIgnoredRef', '123.website.com']);
and vice versa for 123.website.com?
Ultimately we want to not see referrals coming from 123.website.com and abc.website.com, but we don't mind seeing referrals coming from website.com or www.website.com.
If what I have assumed is correct, then I must be missing something, because that is what I have set up currently. Then my next question would be, do I have this correct?
_gaq.push(['_setDomainName', 'website.com']);
Do I need the trailing period?
You can go on and on about how that having two subdomains as one website isn't a good idea for many reasons but this is what was provided to us when we first started. We will eventually merge them into one website, but for now let's just say it will take awhile and we need to "bandage the situation".
First off, adding _addIgnoredRef will only convert the referral into direct traffic. If this is desired, then yes, you would add:
_gaq.push(['_addIgnoredRef', '123.website.com']); //add to abc.website.com
and
_gaq.push(['_addIgnoredRef', 'abc.website.com']); //add to 123.website.com
The trailing period isn't necessary, just as long as you are consistent across the entire website. According to Google, the trailing period comes more into play when you have multiple layers of subdomains - https://developers.google.com/analytics/devguides/collection/gajs/methods/gaJSApiDomainDirectory?csw=1#_gat.GA_Tracker_._setDomainName
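Putting the pieces together, the command queue on abc.website.com might look like the sketch below. The property ID is a placeholder, the ga.js loader script is left out, and the ordering (configuration before _trackPageview) is the usual convention rather than anything specific to this setup.

```javascript
// Classic ga.js command queue as it might appear on abc.website.com.
// "UA-XXXXX-Y" is a placeholder property ID; the ga.js loader is omitted.
var _gaq = _gaq || [];
_gaq.push(["_setAccount", "UA-XXXXX-Y"]);
_gaq.push(["_setDomainName", "website.com"]);     // same value on both subdomains
_gaq.push(["_addIgnoredRef", "123.website.com"]); // sibling subdomain counts as direct, not referral
_gaq.push(["_trackPageview"]);                    // send the pageview after the settings above
```

On 123.website.com the same block would be used with 'abc.website.com' as the ignored referrer.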

How to decode google gclids

Now, I realise the initial response to this is likely to be "you can't" or "use analytics", but I'll continue in the hope that someone has more insight than that.
Google AdWords with "autotagging" appends a "gclid" (presumably "google click id") to the link that sends you to the advertised site. It appears in the web log since it's a query parameter, and it's used by Analytics to tie that visit to the ad/campaign.
What I would like to do is to extract any useful information from the gclid in order to do our own analysis on our traffic. The reasons for this are:
Stats are imperfect, but if we are collating them, we know exactly what assumptions we have made, and how they were calculated.
We can tie the data to the rest of our data and produce far more accurate stats wrt conversion rate.
We don't have to rely on javascript for conversions.
Now it is clear that the gclid is base64 encoded (or some close variant), and some parts of it vary more than others. Beyond that, I haven't been able to determine what any of it relates to.
Does anybody have any insight into how I might approach decoding this, or has anybody already related gclids back to campaigns or even accounts?
I have spoken to a couple of people at google, and despite their "don't be evil" motto, they were completely unwilling to discuss the possibility of divulging this information, even under an NDA. It seems they like the monopoly they have over our web stats.
By far the easiest solution is to manually tag your links with Google Analytics campaign tracking parameters (utm_source, utm_campaign, utm_medium, etc.) and then pull out that data.
The gclid is dependent on more than just the adwords account/campaign/etc. If you click on the same adwords ad twice, it could give you different gclids, because there's all sorts of session and cost data associated with that particular click as well.
Gclid is probably not 100% random, true, but I'd be very surprised and concerned if it were possible to extract all your Adwords data from that number. That would be a HUGE security flaw (i.e. an arbitrary user could view your Adwords data). More likely, a pseudo-random gclid is generated with every impression, and if that ad is clicked on, the gclid is logged in Adwords (otherwise it's thrown out). Analytics then uses that number to reconcile the data with Adwords after the fact. Other than that, there's no intrinsic value in the gclid number itself.
In regards to your last point, attempting to crack or reverse-engineer this information is explicitly forbidden in both the Google Analytics and Google Adwords Terms of Service, and is grounds for a permanent ban. Additionally, the TOS that you agreed to when signing up for these services says that it is not your data to use in any way you feel like. Google is providing a free service, so there are strings attached. If you don't like not having complete control over your data, then there are plenty of other solutions out there. However, you will pay a premium for that kind of control.
Google makes nearly all their money from selling ads. Adwords is their biggest money-making product. They're not going to give you confidential information about how it works. They don't know who you are, or what you're going to do with that information. It doesn't matter if you sign an NDA and they have legal recourse to sue you; if you give away that information to a competitor, your life isn't worth enough to pay back the money you will have lost them.
Sorry to break it to you, but "Don't be Evil" or not, Google is a business, not a charity. They didn't become one of the most successful companies in the world by giving away their search algorithm to the first guy who asked for it.
The gclid parameter is encoded in Protocol Buffers, and then in a variant of Base64.
See this guide to decoding the gclid and interpreting it, including an (Apache-licensed) PHP function you can use.
There are basically 3 parameters encoded inside it, one of which is a timestamp. The other 2 as yet are not known.
As far as understanding what these other parameters mean—it may be helpful to compare it to the ei parameter, which is encoded in an extremely similar way (basically Protocol Buffers with the keys stripped out). The ei parameter also has a timestamp, with what seem to be microseconds, and 2 other integers.
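To show what "Protocol Buffers, then a variant of Base64" means mechanically, here is a sketch that round-trips a synthetic message. It is not a real gclid and does not claim to match the real field layout: the three integer values are invented, URL-safe Base64 is assumed as the variant, and all function names are hypothetical.

```javascript
// Encode a non-negative integer as a Protocol Buffers varint (7 bits per byte, low bits first).
function encodeVarint(value) {
  const bytes = [];
  let v = BigInt(value);
  do {
    let byte = Number(v & 0x7fn);
    v >>= 7n;
    if (v > 0n) byte |= 0x80; // continuation bit: more bytes follow
    bytes.push(byte);
  } while (v > 0n);
  return bytes;
}

// Decode a buffer that contains only varint-typed fields into { fieldNumber: value }.
function decodeVarintFields(buffer) {
  const fields = {};
  let pos = 0;
  const readVarint = () => {
    let result = 0n;
    let shift = 0n;
    while (true) {
      if (pos >= buffer.length) throw new Error("truncated varint");
      const byte = buffer[pos++];
      result |= BigInt(byte & 0x7f) << shift;
      if ((byte & 0x80) === 0) return result;
      shift += 7n;
    }
  };
  while (pos < buffer.length) {
    const key = readVarint();
    if (Number(key & 7n) !== 0) throw new Error("this sketch only handles varint fields");
    fields[Number(key >> 3n)] = readVarint();
  }
  return fields;
}

// URL-safe Base64 helpers ("-" and "_" instead of "+" and "/", no padding).
const toBase64Url = (buffer) =>
  buffer.toString("base64").replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
const fromBase64Url = (text) =>
  Buffer.from(text.replace(/-/g, "+").replace(/_/g, "/"), "base64");

// Synthetic message with three varint fields; the first is a timestamp-like integer.
const synthetic = Buffer.from([
  0x08, ...encodeVarint(1500000000000000n), // field 1, wire type 0
  0x10, ...encodeVarint(12345),             // field 2, wire type 0
  0x18, ...encodeVarint(678),               // field 3, wire type 0
]);
const token = toBase64Url(synthetic);
const decoded = decodeVarintFields(fromBase64Url(token));
console.log(token, decoded);
```

A real gclid may use other wire types or nesting, in which case the decoder above would stop with an error rather than guess.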
FYI, I just posted a quick analysis of some gclid data from my sites on this post. There definitely is some structure to the gclid, but it is difficult to decipher.
I think you can get all the goodies linked to the gclid via Google's AdWords API. Specifically, you can query the click performance report.
https://developers.google.com/adwords/api/docs/appendix/reports#click
I've been working on this problem at our company as well. We'd like to be able to get a better sense of what our AdWords are doing but we're frustrated with limitations in Analytics.
Our current solution is to look in the Apache access logs for GET requests using the regex:
.*[?&]gclid=([^$&]*)
If that exists, then we look at the referer string to get the keyword:
.*[?&]q=([^$&]*).*
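A possible way to apply these two patterns to an access log, sketched in JavaScript. It assumes Apache's combined log format and uses slightly tightened patterns: inside a character class `$` is a literal dollar sign, not end-of-string, so the sketch stops at `&`, whitespace or a quote instead. The sample line, the gclid value and the function names are invented.

```javascript
// Request URL and referer from a combined-format log line (GET requests only).
const COMBINED_RE = /"GET (\S+) [^"]*" \d{3} \S+ "([^"]*)"/;
const GCLID_RE = /[?&]gclid=([^&\s"]+)/;
const KEYWORD_RE = /[?&]q=([^&\s"]+)/;

// Decode a query-string value without throwing on malformed escapes.
function safeDecode(value) {
  try {
    return decodeURIComponent(value.replace(/\+/g, " "));
  } catch (err) {
    return value;
  }
}

// Return { gclid, keyword } for an ad click, or null for any other line.
function parseLogLine(line) {
  const parts = line.match(COMBINED_RE);
  if (!parts) return null;
  const [, requestUrl, referer] = parts;
  const gclid = requestUrl.match(GCLID_RE);
  if (!gclid) return null; // not an auto-tagged ad click
  const keyword = referer.match(KEYWORD_RE);
  return { gclid: gclid[1], keyword: keyword ? safeDecode(keyword[1]) : null };
}

const sampleLine =
  '203.0.113.7 - - [10/Oct/2012:13:55:36 +0000] "GET /landing?gclid=EXAMPLE-gclid_123&x=1 HTTP/1.1" ' +
  '200 2326 "http://www.google.com/search?q=red+shoes&ie=utf-8" "Mozilla/5.0"';
console.log(parseLogLine(sampleLine)); // { gclid: 'EXAMPLE-gclid_123', keyword: 'red shoes' }
```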
An alternative option is to change your Apache web log to start logging the __utmz cookie that google sets, which should have a piece for the keyword in utmctr. Google __utmz cookie and you should be able to find plenty of information.
How accurate is the referer string? Not 100%. Firewalls and security appliances will strip it out. But parsing it out yourself does give you more flexibility than Google Analytics. It would be a great feature to send the gclid to AdWords and get data back, but that feature does not look like it's available.
EDIT: Since I wrote this we've also created our own tags that are appended to each destination url as a request parameter. Each tag is just an md5 hash of the text, ad group, and campaign name. We grab it using regex from the access log and look it up in a SQL database.
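For illustration, the tagging scheme from the edit could be sketched like this. The separator, the `adtag` parameter name, the sample ad and the in-memory `tagTable` (standing in for the SQL table) are all assumptions, not details from our actual setup.

```javascript
const { createHash } = require("crypto");

// Hypothetical tag: md5 over ad text, ad group and campaign name, joined with "|".
function adTag(adText, adGroup, campaign) {
  return createHash("md5").update([adText, adGroup, campaign].join("|")).digest("hex");
}

// In-memory stand-in for the SQL lookup table.
const tagTable = new Map();
function registerAd(adText, adGroup, campaign) {
  const tag = adTag(adText, adGroup, campaign);
  tagTable.set(tag, { adText, adGroup, campaign });
  return tag;
}

// Tag a destination URL, then recover the ad details from the logged URL.
const tag = registerAd("Buy red shoes", "Shoes", "Spring sale");
const landingUrl = `https://www.example.com/landing?adtag=${tag}`;
const found = landingUrl.match(/[?&]adtag=([0-9a-f]{32})/);
console.log(tagTable.get(found[1]));
```

The hash only serves as a compact key here; the readable details live in the lookup table.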
This is a non-programmatic way to decode the GCLID parameter. Chances are you are simply trying to figure out the campaign, ad group, keyword, placement, or ad that drove the click and conversion. To do this, you can upload the GCLID into AdWords as a separate conversion type and then segment by conversion type to drill down to the criteria that triggered the conversion. The steps:
In AdWords UI, go to Tools->Conversions->Add conversion with source "Import from clicks"
Visit the AdWords help topic about importing conversions https://support.google.com/adwords/answer/7014069 and create a bulk load file with your GCLID values, assigning the conversions to your new "Import from clicks" conversion type
Upload the conversions into AdWords in Tools->Conversions->Conversion actions (Uploads) on left navigation
Go to campaigns tab, Segment->Conversions->Conversion name
Find your new conversion name in the segment list, this is where the conversion came from. Continue this same process on the ad groups and keywords tab until you know the GCLID originating criteria
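The bulk load file from the second step can be generated from your own click data. The sketch below is only a starting point: the column headers are my recollection of the import template and should be copied from the help article instead, and the conversion name, gclid value and timestamp are invented.

```javascript
// Column names are an assumption; copy the exact headers from Google's template.
const COLUMNS = ["Google Click ID", "Conversion Name", "Conversion Time"];

// Quote a CSV field only when it contains a comma, quote or newline.
function csvField(value) {
  const text = String(value);
  return /[",\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// One header row plus one row per recorded click.
function buildUploadFile(clicks, conversionName) {
  const rows = clicks.map((click) => [click.gclid, conversionName, click.time]);
  return [COLUMNS, ...rows].map((row) => row.map(csvField).join(",")).join("\n");
}

const file = buildUploadFile(
  [{ gclid: "EXAMPLE_GCLID_1", time: "2016-10-05 14:30:00" }],
  "gclid lookup" // hypothetical name of the conversion created in the first step
);
console.log(file);
```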
Well, this is no answer, but the approach is similar to how you'd tackle any cryptography problem.
Possibility 1: They're just random, in which case, you're screwed. This is analogous to a one-time pad.
Possibility 2: They "mean" something. In that case, you have to control the environment.
Get a good database of them. Find gclids for your site, and others. Record all times that all clicks occur, and any other potentially useful data
Get cracking! As you have started already, start regressing your collected data against your known data, and see if you can find patterns using decryption techniques
Start scraping random gclid's, and see where they take you.
I wouldn't hold high hope for this to be successful though, but I do wish you luck!
Looks like my rep is weak, so I'll just post another answer rather than a comment.
This is not an answer, clearly. Just voicing some thoughts.
When you enable auto tagging in Adwords, the gclid params are not added to the destination URLs. Rather they are appended to the destination URLs at run time by the Google click tracking servers. So, one of two things is happening:
The click servers are storing the gclid along with Adwords entity identifiers so that Analytics can later look them up.
The gclid has the entity identifiers encoded in some way so that Analytics can decode them.
From a performance perspective it seems unlikely that Google would implement anything like option 1. Forcing Analytics to "join" the gclid to Adwords IDs seems exceptionally inefficient at scale.
A different approach is to simply look at the referrer data which will at least provide the keyword which was searched.
Here's a thought: Is there a chance the gclid is simply a cryptographic hash, a la bit.ly or some other URL shortener?
In which case the contents of the hashed text would be written to a database, and replaced with a unique id.
After all, the gclid is shortening a bunch of otherwise long text.
Take this example:
www.example.com?utm_source=google&utm_medium=cpc
Is converted to this:
www.example.com?gclid=XDF
just like a URL shortener.
One would need to reverse the hash to recover the original text, which is no easy task: https://crypto.stackexchange.com/questions/300/reverse-engineering-a-hash
Maybe some deep digging into logs, looking for patterns, etc...
I agree with Ophir and Chris. My feeling is that it is purely a serial number / unique click ID, which only opens up its secrets when the Analytics and Adwords systems talk to each other behind the scenes.
Knowing this, I'd recommend looking at the referring URL and pulling as much as possible from this to use in your back end click tracking setup.
For example, I live in NZ, and am using Firefox. This is a search from the Firefox Google toolbar for "stack overflow":
http://www.google.co.nz/search?q=stack+overflow&ie=utf-8&oe=utf-8&aq=t&client=firefox-a&rlz=1R1GGLL_en-GB
You can see that: a) I'm using the .nz domain, b) my keyword is "stack+overflow", c) I'm running Firefox.
Finally, if you also stash the full landing page URL, you can store the GCLID, which will tell you the visitor came from paid, whereas if it doesn't have a GCLID, then the user must have come from natural search (if URL tagging is enabled of course).
This would theoretically allow you to then search for the keyword in your campaign, and figure out which ad group they came from. Knowing the creative would probably be impossible though, unless you split test your landing URLs or tag them somehow.
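As a rough sketch of that back-end step, the referrer from the example above can be pulled apart like this. The landing URL, its gclid value and the name `describeVisit` are invented, and treating "no gclid" as natural search only holds for visits that came from a search engine with auto-tagging enabled.

```javascript
// Combine what the referring URL and the landing URL reveal about a visit.
function describeVisit(referrer, landingUrl) {
  const ref = new URL(referrer);
  const landing = new URL(landingUrl);
  const gclid = landing.searchParams.get("gclid");
  return {
    searchHost: ref.hostname,                // e.g. the country-specific Google domain
    keyword: ref.searchParams.get("q"),      // "+" is decoded to a space
    client: ref.searchParams.get("client"),  // hints at the browser or toolbar
    gclid,
    channel: gclid ? "paid" : "organic",
  };
}

const referrer =
  "http://www.google.co.nz/search?q=stack+overflow&ie=utf-8&oe=utf-8&aq=t&client=firefox-a&rlz=1R1GGLL_en-GB";
console.log(describeVisit(referrer, "http://www.example.com/?gclid=EXAMPLE"));
console.log(describeVisit(referrer, "http://www.example.com/").channel); // organic
```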
