Dealing with multiple sub-sub domains as single hostname in GTM - google-analytics

My company has recently changed the way we deal with affiliates in the URL structure.
The old format: subdomain.website.com/page/?a=affiliate
with hostname subdomain.website.com
New format: affiliate.subdomain.website.com/page with hostname affiliate.subdomain.website.com
Is there a way (ideally in Google Tag Manager) to change the hostname being passed such that the affiliate part (if present) is stripped out?
More info: Not all visitors to the website come from affiliates. The issue I am facing is that the hostname for affiliates being passed in the data layer is different from what it used to be and is unique for each affiliate that we work with. This causes lookup tables (e.g. for analytics ID) not to work, as they don't match the hostname anymore. It would be impossible to keep the lookup tables updated for each new affiliate added, so I'm looking for a way to strip the affiliate part of the hostname to keep all hostnames consistent on this journey.

Sure. In your Google Analytics settings (either in your tag or, if you use one, in your settings variable), you can set the hostname field in the "Fields to Set" section.
If you have a single domain, you could just set a constant value for the hostname. If you still have non-affiliate subdomains that you need to track, you still need a variable for that. Since it is not feasible to keep a lookup table of all affiliate values, go the other way round and whitelist the hostnames you want to keep: create a lookup table with the hostname as input that returns the original input for the hostnames you want to keep and your cleaned-up hostname as the default value.
With three subdomains entered in the table, for example, it returns the hostname as entered for those three and a default for everything else, thus creating a whitelist.
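The same whitelist idea could also be written as a GTM Custom JavaScript variable instead of a lookup table. The following is only a minimal sketch: the hostnames, the default value, and the name cleanHostname are placeholders, not values from the question.

```javascript
// Hypothetical whitelist: hostnames that should be reported unchanged.
var KEPT_HOSTNAMES = ['www.website.com', 'subdomain.website.com', 'shop.website.com'];

// Hypothetical default reported for everything else (e.g. affiliate hostnames).
var DEFAULT_HOSTNAME = 'subdomain.website.com';

// Returns the hostname as entered if it is whitelisted, otherwise the default.
function cleanHostname(hostname) {
  var host = String(hostname).toLowerCase();
  return KEPT_HOSTNAMES.indexOf(host) !== -1 ? host : DEFAULT_HOSTNAME;
}

console.log(cleanHostname('affiliate.subdomain.website.com'));
```

In GTM, the body would presumably be wrapped in an anonymous function that returns cleanHostname({{Page Hostname}}), and that variable would then be used as the hostname value in "Fields to Set".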

Related

Website with https, Google Analytics for long time with http... can I change it?

I have a website whose URL is secured with SSL (https://mywebsite.nl).
However, I found out that for a long time I have been using the non-secured http://mywebsite.nl as the 'Default URL' of my Google Analytics property and view.
I have two questions:
Did I miss data because I used http instead of https in the property and view's Default URL?
Can I CHANGE the http to https (in the Google Analytics property/view) without problems, or do I lose historical data because of that? (This probably also depends on the answer to Q1...) Or should I ADD a new property and/or view with an https Default URL?
Thanks!
No, you didn't miss any data.
No, you don't lose the historical data; feel free to change it.
That "Default URL" is for your convenience; you can do anything with it. It's just what GA uses to form full URLs from page paths, instead of using the hostname dimension there.
Also, GA is gracious enough to warn you whenever you are about to make significant changes to your core data.

If there exists a dot after ".com", is it a valid URL?

I came across a few URLs which also render with or without a dot/period after .com, while some do not.
For example:
www.example.com.
Should the URL render normally if a dot/period is added after .com or should it go to a 404 page?
As said in the comments, this great resource answers many of your queries, including the portion below, which is specific to your question:
Fully-Qualified Domain Names
When I double-click a Bonjour (DNS-SD) Name in a web browser like Safari, the resulting URL has a hostname with a dot at the end. Is this a bug?
No, the dot at the end is correct.
You can try it here. Try adding a dot at the end of www.dns-sd.org, as shown in the subtitle at the top of this page, and you should still get the same page.
It's a little-known fact, but fully-qualified (unambiguous) DNS domain names have a dot at the end. People running DNS servers usually know this (if you miss the trailing dots out, your DNS configuration is unlikely to work) but the general public usually doesn't. A domain name that doesn't have a dot at the end is not fully-qualified and is potentially ambiguous. This was documented in the DNS specification, RFC 1034, way back in 1987:
Since a complete domain name ends with the root label, this leads to a
printed form which ends in a dot. We use this property to distinguish between:
a character string which represents a complete domain name
(often called "absolute"). For example, poneria.ISI.EDU.
a character string that represents the starting labels of a
domain name which is incomplete, and should be completed by
local software using knowledge of the local domain (often
called "relative"). For example, "poneria" used in the
ISI.EDU domain.
How this affects web browsing
The people defining the HTTP protocol understood this issue, and RFC 1738 specifies clearly that the <host> part of a URL is supposed to contain a fully qualified domain name:
3.1. Common Internet Scheme Syntax
//<user>:<password>@<host>:<port>/<url-path>
host
The fully qualified domain name of a network host
Unfortunately, the people implementing web browser clients appeared not to understand what this meant. When you access a web site, the value most web browsers put in the "Host:" field is what the user typed, not what the computer actually ended up using, after applying the DNS user's searchlist to construct a fully-qualified name from the partial name. For example, here are three different ways the user may refer to the host "www.example.com."
www.example.com. — Absolute domain name
www.example.com — Relative domain name, which, after applying the "." that's always implicitly in everyone's DNS searchlist, becomes www.example.com.
www with "example.com" in DNS searchlist — user types "www" and gets
www.example.com.
When sending the Host: parameter to the web server, the web browser client puts in what the user typed (www.example.com., www.example.com, or www) instead of what the client ended up actually looking up in DNS (www.example.com. in all three cases). Unfortunately the Apache web server (at least in some versions) doesn't recognise that all those three names are just three different ways of referring to the same host.
If you're a web site administrator setting up a web site using Apache "VirtualHost" directives or similar, you need to have a ServerAlias line listing all the things the user might type to get to that web site (typically the first label, the whole name without a trailing dot, and the whole name with a trailing dot, as shown in the example above).
See: http://www.dns-sd.org/trailingdotsindomainnames.html
And the old RFC it links to: http://www.ietf.org/rfc/rfc1034.txt
Truly fully qualified domain names have a period after the TLD, but unless you're managing a DNS server you almost never come across them. It is however something you might want to consider if you were for instance writing an HTTP server varying on hostname.
A period at the end of a hostname is an indicator that the resolver should not attempt to use its search domains in order to resolve the hostname if the given name does not resolve. That is, if the resolver has a search domain of "lan", if you attempt to look up "web" it would first try resolving "web" followed by "web.lan", but with "web." it would only try "web".
As for the server, it never sees the URL, only the hostname and path (as separate entities), and there is no reason for it to complain if the Host header includes the period (although there is also no reason for the client to include it).
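For an HTTP server that varies on hostname, one way to treat the dotted and undotted spellings as the same host is to normalise the Host header before comparing it. This is a minimal sketch under stated assumptions: normalizeHost is a hypothetical helper name, and IPv6 literals are not handled.

```javascript
// Normalise a Host header value so that "www.example.com." and
// "WWW.example.com:8080" both compare equal to "www.example.com".
// Assumption: IPv6 literals such as "[::1]:80" are out of scope.
function normalizeHost(hostHeader) {
  var host = String(hostHeader).trim().toLowerCase();
  host = host.replace(/:\d+$/, ''); // drop an optional port
  host = host.replace(/\.$/, '');   // drop the trailing dot of the root label
  return host;
}

console.log(normalizeHost('www.example.com.'));
```

A bare first label such as "www" cannot be expanded this way, because the server does not know the client's search domain; that case still needs an explicit alias.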

What settings are required to put AWS CloudFront CDN in front of a squarespace website?

I had trouble getting AWS CloudFront to work with SquareSpace. Issues with forms not submitting and the site saying website expired. What are the settings that are needed to get CloudFront working with a Squarespace site?
This is definitely doable, considering I just set this up. Let me share the settings I used on Cloudfront, Squarespace, and Route53 to make it work. If you want to use a different DNS provider than AWS Route53, you should be able to adapt these settings. Keep in mind that this is not an e-commerce site, but a standard site with a blog, static pages, and forms. You can likely adapt these instructions for other issues as/if they come up.
Cloudfront (CDN)
To make this work, you need to create a Cloudfront Distribution for Web.
Origin Settings
Origin Domain Name should be set to ext-cust.squarespace.com. This is Squarespace's entry point for external domain names.
Origin Path can be left blank.
Origin ID is just the unique ID for this distribution and should auto-populate if you're on the distribution creation screen, or be fixed if you're editing Origin Settings later.
Origin Custom Headers do not need to be set.
Default Cache Behavior Settings / Behaviors
Path Patterns should be left at Default.
I have Viewer Protocol Policy set to Redirect HTTP to HTTPS. This dictates whether your site can use one or both of HTTP or HTTPS. I prefer to have all traffic routed securely, so I redirect all HTTP traffic to HTTPS. Note that you cannot do the reverse and redirect HTTPS to HTTP, as this will cause authentication issues (your browser doesn't want to expose what you thought was a secure connection).
Allowed HTTP Methods needs to be GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE. This is because forms (and other things such as comments, probably) use the POST HTTP method to work.
Cached HTTP Methods I left to just GET, HEAD. No need for anything else here.
Forward Headers needs to be set to All or Whitelist. Squarespace's entry point we mentioned earlier needs to know what domain you're coming from to serve your site, so the Host header must be whitelisted, or allowed with everything else if set to All.
Object Caching, Minimum TTL, Maximum TTL, and Default TTL can all be left at their defaults.
Forward Cookies is the missing component to get forms working. You can set this to either All or Whitelist. There are certain session variables that Squarespace uses for validation, security, and other utilities. I have added the following values to Whitelist Cookies: JSESSIONID, SS_MID, crumb, ss_cid, ss_cpvisit, ss_cvisit, test. Make sure to put each value on a separate line, without commas.
Forward Query Strings is set to True, as some Squarespace API calls use query strings so these must be passed along.
Smooth Streaming, Restrict Viewer Access, and Compress Objects Automatically can all be left at their default values, or chosen as required if you know you need them to be set differently.
Distribution Settings / General
Price Class and AWS WAF Web ACL can be left alone.
Alternate Domain Names should list your domain, and your domain with the www subdomain attached, e.g. example.com, www.example.com.
For SSL Certificate, please follow the tutorial here to upload your certificate to IAM if you haven't already, then refresh your certificates (there is a control next to the dropdown for this), select Custom SSL Certificate and select the one you've provisioned. This ensures that browsers recognize your SSL over HTTPS as valid. This is not necessary if you're not using HTTPS at all.
All following settings can be left at default, or chosen to meet your own specific requirements.
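For anyone scripting this rather than clicking through the console, the settings above map roughly onto the following object. This is only a sketch of the relevant parts, not a complete or deployable distribution config: the field names follow the legacy ForwardedValues shape of the CloudFront API as I understand it and should be checked against the current API reference, and the origin ID and domain names are placeholders.

```javascript
// Helper for the { Quantity, Items } list shape used in the CloudFront API.
function list(items) {
  return { Quantity: items.length, Items: items };
}

// Partial sketch only; 'squarespace-origin' and example.com are placeholders.
var distributionSketch = {
  Aliases: list(['example.com', 'www.example.com']),
  Origins: list([{
    Id: 'squarespace-origin',
    DomainName: 'ext-cust.squarespace.com',
    OriginPath: ''
  }]),
  DefaultCacheBehavior: {
    TargetOriginId: 'squarespace-origin',
    ViewerProtocolPolicy: 'redirect-to-https',
    // All seven methods are allowed so that form POSTs reach Squarespace;
    // only GET and HEAD responses are cached.
    AllowedMethods: Object.assign(
      list(['GET', 'HEAD', 'OPTIONS', 'PUT', 'POST', 'PATCH', 'DELETE']),
      { CachedMethods: list(['GET', 'HEAD']) }
    ),
    ForwardedValues: {
      QueryString: true,
      Headers: list(['Host']),
      Cookies: {
        Forward: 'whitelist',
        WhitelistedNames: list([
          'JSESSIONID', 'SS_MID', 'crumb', 'ss_cid', 'ss_cpvisit', 'ss_cvisit', 'test'
        ])
      }
    }
  }
};

console.log(JSON.stringify(distributionSketch, null, 2));
```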
Route 53 (DNS)
You need to have a Hosted Zone set up for your domain (this is specific to Route 53 setup).
You need to set an A record to point to your Cloudfront distribution.
You should set a CNAME record for the www subdomain name pointing to your Cloudfront distribution, even if you don't plan on using it (later we'll go through setting Squarespace to only use the root domain by redirecting the www subdomain)
Squarespace
On your Squarespace site, you simply need to go to Settings->Domains->Connect a Third-Party Domain. Once there, enter your domain and continue. Under the domain's settings, you can uncheck Use WWW Prefix if you'd like people accessing your site from www.example.com to redirect to the root, example.com. I prefer this, but it's up to you. Under DNS Settings, the only value you need is CNAME that points to verify.squarespace.com. Add this CNAME record to your DNS settings on Route 53, or other DNS provider. It won't ever say that your connection has been fully completed since we're using a custom way of deploying, but that won't matter.
Your site should now be operating through Cloudfront pointing to your Squarespace deployment! Please note that DNS propagation takes time, so if you're unable to access the site, give it some time (up to several hours) to propagate.
Notes
I can't say exactly whether each and every one of the values set under Whitelist Cookies is necessary, but these are taken from using the Chrome Inspector to determine what cookies were present under the Cookie header in the request. Initially I tried to tell Cloudfront to whitelist the Cookie header itself, but it does not allow that (presumably because it wants you to use the cookie-specific whitelist). If your deployment is not working, see if there are more cookies being transmitted in your requests (under the Cookie header, the values you're looking for should look like my_cookie=somevalue;other_cookie=othervalue—my_cookie and other_cookie in my example are what you'd add to the whitelist).
The same procedure can be used to forward other headers entirely that may be needed via the Forward Headers whitelist. Simply inspect and see if there's something that looks like it might need to go through.
Remember, if you're not whitelisting a header or cookie, it's not getting to Squarespace. If you don't want to bother, or everything is effed (pardon my language), you can always set to allow all headers/cookies, although this adversely affects caching performance. So be conservative if you can.
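To turn a Cookie header copied from the browser inspector into whitelist entries, the names can be extracted with a few lines of JavaScript. A minimal sketch, where cookieNames is a hypothetical helper and the header value is the example from the note above:

```javascript
// Extract the cookie names from a raw Cookie request header value.
function cookieNames(cookieHeader) {
  return String(cookieHeader)
    .split(';')
    .map(function (pair) { return pair.split('=')[0].trim(); })
    .filter(function (name) { return name.length > 0; });
}

// One name per line, ready to paste into Whitelist Cookies.
console.log(cookieNames('my_cookie=somevalue;other_cookie=othervalue').join('\n'));
```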
Hope this helps!
Here are the settings to get CloudFront working with Squarespace!
Behaviours:
Allowed HTTP Methods: ensure that you select GET, HEAD, OPTIONS, PUT, POST, PATCH, DELETE. Otherwise forms will not work.
Forward Headers: select Whitelist and choose 'Host'. Otherwise Squarespace will not know which website it needs to load, and you get the message 'Website has expired' or similar.
Origins:
Origin Domain Name set as: ext-cust.squarespace.com
Origin Protocol Policy: select HTTPS so that traffic between the CDN and the origin is secure too.
General
Alternate Domain Names (CNAMEs): put both your www and non-www addresses here and let Squarespace decide whether to direct www to root or vice versa (e.g. example.com, www.example.com).
You can now configure SSL on CloudFront
HTTPS: you can now enforce HTTPS using a certificate for your site here rather than in Squarespace.
Setting I'm still unsure about:
Forward Query Strings: not forwarding them is recommended for caching reasons, but I think that could break things...
Route53
Create A records for www and root (e.g. example.com, www.example.com) and set each as an alias to your CloudFront distribution.

How can we pass on referrer details to Adobe SiteCatalyst?

Our website is a vertical search engine and we refer a lot of traffic offsite to partners sites.
We recently switched our website over to serve all traffic via HTTPS. We realised this might confuse some of our partners if they were looking at referrer stats and saw a drop in traffic attributed to us. Therefore at the same time, we added the content-security-policy:referrer origin header and we can see that the referrer is correctly passed along by the browser.
Generally this is working fine but we have had complaints from users of Adobe SiteCatalyst (previously Omniture) who are no longer able to attribute traffic as being referred from us. We don't have access to SiteCatalyst to test this out. How does SiteCatalyst track referral traffic and is there a way to view all traffic split by different sources/referrers?
I don't know if this accounts for everything, since I don't have full context on either your end or your users' end, but here is some info and some thoughts that might help.
By default, Adobe Analytics tracks referrer from document.referrer. This can be overridden by setting s.referrer.
In general, depending on how your site directs visitors to the other site and on browser security/privacy settings, document.referrer may or may not have a value. For example, Internet Explorer's default security/privacy setting is to suppress document.referrer on dynamically generated popup windows (e.g. window.open() calls).
So, and again, this is just speculation because I don't know the full context, you may need to work something out with your users, e.g. explicitly passing the referring URL as a query param to the target page, and having your users populate s.referrer with it if it exists. Something along the lines of:
// document.referrer is empty: fall back to the refURL query param
if (!document.referrer) {
  s.referrer = s.Util.getQueryParam('refURL');
}
Note: s.Util.getQueryParam is a utility function in the Adobe Analytics AppMeasurement library that returns the value of the specified query param, or an empty string if it doesn't exist. If your users are still using legacy H code, they should use the s.getQueryParam plugin instead. Or use whatever homebrewed method of getting a query param from the URL, since older browsers don't have a built-in function for it.
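As an illustration of the homebrewed route, here is a minimal sketch that relies on the URL API available in current browsers and in Node.js. The function mirrors the behaviour described above (value or empty string); the landing-page URL and the refURL parameter name are only the hypothetical example from this answer.

```javascript
// Stand-in for s.Util.getQueryParam: returns the decoded value of `name`
// from `url`, or an empty string if the parameter is absent.
function getQueryParam(url, name) {
  var value = new URL(url).searchParams.get(name);
  return value === null ? '' : value;
}

// Hypothetical landing-page URL carrying the referrer explicitly.
var landing = 'https://partner.example/landing?refURL=https%3A%2F%2Fsearch.example%2Fresults';
console.log(getQueryParam(landing, 'refURL'));
```

On the target page this would be called with window.location.href as the first argument.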

Unable to Verify Custom Domain with Firebase Using Namecheap

After I followed the instructions and inserted TXT record 1 provided by Firebase into my Namecheap settings, this error message keeps popping up:
Current Status: Sorry, we were unable to verify your domain.
This message has been showing for about 5 days now.
I've captured screenshots of the Firebase and Namecheap settings below:
After @Frank van Puffelen suggested changing the host value from my domain name to @, I updated the record (screenshot). We'll wait and see after a few hours; hopefully it will verify successfully.
After waiting for a few hours this message appears:
From other reports and the information on this Google page for verifying Namecheap domains, it looks like you may have to use @ for the host field.
In the Namecheap site, click Manage next to the domain you want to
verify with your Google service.
Click the Advanced DNS tab on the domain dashboard.
Scroll down and click Add New Record under the host records table.
Select TXT Record from the record type drop-down list.
Paste the entire verification record into the Value field.
Enter @ in the Host field.
Leave the TTL field set to Automatic.
Click the green check mark to save your TXT record.
Note: The change may take up to 24 hours to update. However, as you go
through the next steps in the Setup Wizard, the wizard immediately
starts checking for your new TXT record to verify your domain.
Can you try that? If it doesn't work, let me know and also reach out to support@firebase.com.
Your attached screenshots do not show your CNAME configuration. Even though the Firebase instructions indeed only ask for two TXT records, both of which are correctly set up as your screenshots show, I believe this pair of TXT records does not free you from the need to set up at least a CNAME record in addition.
This was my case: while I had no CNAME record set, Firebase never recognized my domain.
I am not an expert (sorry!), but in the absence of other answers, it may be helpful to suggest that you set up your CNAME record to point to:
CNAME record
host: www
value: [yourfirebaseappname].firebaseapp.com.
(Please note the dot after the '.com').
In my case this was enough to make Firebase work well and recognize my domain.
In my specific case, and I'll record it here at least for my own future use, I preferred to use, as a later step, both A records supplied by Firebase as my way to route to my domain without www.
I believe this can be done with CNAME, but in my case the final setup was:
Advanced DNS Management
type: CNAME record
host: www
value: [my-domain-name-without-www]
type: A record
host: @
value: [IP address from firebase, like '1.2.3.4']
type: A record
host: @
value: [Second IP address from firebase, like '2.3.4.5']
Everything is working fine using this configuration. Goal reached.
As a future to-do, it would be useful to learn how to achieve a similar goal using a CNAME record pointing to the Firebase domain instead of A records pointing to the Firebase-supplied IP addresses.
Hope this helps other users in similar situation!
I had issues connecting my custom domain as well. Apart from using @ in the host field and the CNAME, you also need A records. Here's all I ended up with. I waited a while for the domain to propagate (I had just purchased it), then waited 10 minutes after adding all of the records, and it worked. I also found helpful instructions in this blog post.
I had the same problem and contacted Namecheap support. They gave me the following changes to make when connecting Namecheap with Firebase.
For the TXT records, Host should be @ instead of yourdomain.com.
For the CNAME records, remove the trailing domain name from the Host.
In this example value should be
firebase1._domainkey
and not
firebase1._domainkey.yourdomain.com
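The two rules from Namecheap support can be expressed as a small conversion. This is only a sketch of that convention, with toNamecheapHost as a hypothetical helper and yourdomain.com as the placeholder domain used above:

```javascript
// Convert a full record name into the value for Namecheap's Host field:
// "@" for the bare domain, otherwise the name with the trailing domain removed.
function toNamecheapHost(recordName, domain) {
  var name = String(recordName).toLowerCase().replace(/\.$/, '');
  var zone = String(domain).toLowerCase().replace(/\.$/, '');
  if (name === zone) return '@';
  var suffix = '.' + zone;
  if (name.slice(-suffix.length) === suffix) {
    return name.slice(0, name.length - suffix.length);
  }
  return name; // already relative; leave unchanged
}

console.log(toNamecheapHost('firebase1._domainkey.yourdomain.com', 'yourdomain.com'));
```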
I was having the same issue for days (my custom domain made it through the verification stage but would not connect) but it turns out that I didn't have to change my DNS configuration to fix it (the screenshot for the config that worked is below for reference).
I just had to re-run the Connect Domain wizard. https://firebase.google.com/docs/hosting/custom-domain
Hosting -> Select the vertical ellipsis under your custom domain -> delete domain -> then click Connect Domain to start the wizard again.
namecheap dns configuration
Lookup your domain using this tool to see if there are DNS errors https://toolbox.googleapps.com/apps/dig/
For me (with namecheap) I had to remove all existing records, then the TXT record alone worked.
Reference: this post
That configuration worked for me. Seems the easiest and propagation takes around 20 minutes (include your TXT verification as well, of course).

Resources