Ok! So I have spoken to a Google representative about this issue; however, since I am not enterprise level, he can't escalate me to tech support and suggested that I use SO for answers. Here is the question...
In Google Maps Terms it states the following:
(b) No Pre-Fetching, Caching, or Storage of Content. You must not pre-fetch, cache, or store any Content, except that you may store: (i) limited amounts of Content for the purpose of improving the performance of your Maps API Implementation if you do so temporarily (and in no event for more than 30 calendar days), securely, and in a manner that does not permit use of the Content outside of the Service; and (ii) any content identifier or key that the Maps APIs Documentation specifically permits you to store. For example, you must not use the Content to create an independent database of "places" or other local listings information.
This originally led me to believe that Google would not allow caching of any kind of information. However, I then read the following:
When to Use Client-Side Geocoding
The basic answer is "almost always." As geocoding limits are per user session, there is no risk that your application will reach a global limit as your userbase grows. Client-side geocoding will not face a quota limit unless you perform a batch of geocoding requests within a user session. Therefore, running client-side geocoding, you generally don't have to worry about your quota.
Two basic architectures for client-side geocoding exist.
Run the geocoding and display entirely in the browser. For instance, the user enters an address on your page. Your application geocodes it. Then your page uses the geocode to create a marker on the map. Or your app does some simple analysis using the geocode. No data is sent to your server. This reduces load on your server, but doesn't give you any sense of what your users are doing.
Run the geocode in the browser and then send it to the server. For instance, the user enters an address. Your application geocodes it in the browser. The app then sends the data to your server. The server responds with some data, such as nearby points of interest. This allows you to customize a response based on your own data, and also to cache the geocode if you want. This cache allows you to optimize even more. You can even query the server with the address, see if you have a recently cached geocode for it, and if you do, use that. If you don't, then return no result to the browser, and let it geocode the result and send it back to the server for caching.
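If I understand the second architecture correctly, it would look roughly like this on the client (the /geocode-cache endpoint is just a placeholder name for illustration):

```js
// Sketch of architecture 2: geocode in the browser, then post the
// result to the server so it can be cached there. "/geocode-cache"
// is a hypothetical endpoint, not part of any Google API.
const geocoder = new google.maps.Geocoder();

function geocodeAndReport(address) {
  geocoder.geocode({ address: address }, function (results, status) {
    if (status !== "OK" || !results.length) return;
    const loc = results[0].geometry.location;
    fetch("/geocode-cache", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ address: address, lat: loc.lat(), lng: loc.lng() })
    });
  });
}
```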
So one side says you cannot cache, while the other tells you that you should. Another suggestion is to always use client-side geocoding when you can, but that becomes a grey area as well, because both examples assume the user inputs the data. What if jQuery read data from a div or span and then geocoded that information? The user wouldn't have actually initiated the geocode, but it was still done client-side. I'm trying to create a site that has a bunch of events generated by users, and this site could get pretty loaded, so I am trying to determine the best practice for doing this. Google suggested here, so before you go and say this is "off-topic", please note this is where they told me to post.
Any feedback would be greatly appreciated.
The first quote does not explicitly forbid caching data at all. It is ambiguous as to how much you can cache (what number exactly constitutes "limited amounts"?), but it does not forbid caching.
You are allowed to cache the data if it helps improve the performance of your site as long as you retain the data for no longer than 30 days and do not make it available in any way to any other service except the service that originally retrieved the data.
Regarding user interaction - if your user explicitly enters a page with the expectation that they will be shown geocoded information I would assume that this would fulfill "user interaction".
As an example from a project I worked on last year I had it set up to do the following:
- Show markers on the map
- If the user clicked a marker they were shown a popup with data from the cache if available, otherwise a geocode would be performed and the returned information would be cached along with the date/time of the cache.
Another page of the site showed a history of these markers at 5 minute intervals throughout the day. If cached data was present (from clicking the map marker as in the previous part) this would be shown, otherwise a geocode would be performed and the data cached as before. The user clicking to run the report was (in my opinion) enough "user interaction" to not count as pre-fetching as the user had to manually select a timeframe before the report would be displayed.
A cron job then ran every day at midnight, going through each record whose cached data was more than 25 days old and removing it.
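A minimal sketch of that flow, with an in-memory Map standing in for the real database table and reverseGeocode standing in for the actual client-side lookup:

```js
// Lookup-or-geocode flow; cache entries are purged nightly (below).
const cache = new Map(); // markerId -> { data, cachedAt }

async function getMarkerDetails(markerId, latLng, reverseGeocode) {
  const hit = cache.get(markerId);
  if (hit) return hit.data;

  const data = await reverseGeocode(latLng);
  cache.set(markerId, { data: data, cachedAt: Date.now() });
  return data;
}

// Nightly job: drop anything cached more than 25 days ago,
// staying safely inside the 30-day limit from the terms.
function purgeStaleCache() {
  const cutoff = Date.now() - 25 * 24 * 60 * 60 * 1000;
  for (const [id, entry] of cache) {
    if (entry.cachedAt < cutoff) cache.delete(id);
  }
}
```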
As it was, I was caching far less than 10% of the marker positions being shown (20+ markers were updated every minute, but the report was only being run on maybe 3-5 markers each day, and data was only geocoded for every 5th point).
Related
I am using Google Analytics 4 (GA4) on the client to track a whole bunch of different events. However, there are 2 scenarios that I can't cover client side:
A user completing check out on a payment page hosted by a third-party (Stripe in this case).
A refund that is made by the support team.
These events are handled by the server using webhooks. To me it seems like the most straightforward solution would be to let the server send the event to GA4 (as opposed to the client sending it). I believe the Measurement Protocol should be used for this.
For each event submitted through the Measurement Protocol a client_id is required. When the client is submitting an event, this is an automatically generated ID which is used to track a particular device.
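For reference, a Measurement Protocol hit looks roughly like this (Node 18+; the measurement id, API secret, and event payload below are placeholders):

```js
// Sketch of a GA4 Measurement Protocol request.
async function sendServerEvent(clientId) {
  await fetch(
    "https://www.google-analytics.com/mp/collect" +
      "?measurement_id=G-XXXXXXX&api_secret=YOUR_API_SECRET",
    {
      method: "POST",
      body: JSON.stringify({
        client_id: clientId, // <- the value this question is about
        events: [{ name: "refund", params: { transaction_id: "T-1234" } }]
      })
    }
  );
}
```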
My question thus is, what should the client_id be when submitting an event server-side?
Should the same client_id perhaps be used for all events, so as to recognize the server as one device? I have read some people proposing a randomly generated client_id for each event, but this would result in a new user being recognized for every server-side event...
EDIT:
One of the answers proposes using the client_id that is included with the request as a cookie. However, for both examples given above, this cookie is not present, as the request is made by a third-party webhook and not by the user.
I could of course store the client_id in the DB, but the refund in the second example is issued by the support team. It thus feels conceptually odd to associate that event with the user's client_id, as the client_id is just a way to recognize the user's device; it was not the user's device that triggered the refund event here.
Another refund event example would be when user A makes a purchase with user B and user B refunds this purchase a week later. In this situation, should the client_id be that of user A or of user B? Again, it feels odd to use a stored client_id here, because what if user A is logged in on two devices? Which client_id should be used then?
Great question. Yes, your aim to use Measurement Protocol is a proper solution here.
Do not hardcode the client id. It's gonna be a hellish mess in reports. The nature of user-based reporting (which GA is) demands client ids that uniquely identify users, to the best of your ability.
GA stores the client id in a cookie named _ga (GA4 appends the measurement id to the cookie name), so you should have convenient and immediate access to it on every client hit to the BE. Here are Google's docs on it: https://developers.google.com/analytics/devguides/collection/analyticsjs/cookie-usage You can also easily find it if you inspect "collect" hits and look at their payloads. There's another cookie named _gid that contains a different value; it's also a unique id, but it serves a different purpose. Set it too if you can, but don't use it as the normal client id.
You can also see it in the Network tab, which you will need for proper debugging, mostly to make sure your FE client ids are the same as your BE client ids.
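Extracting the client id from that cookie on the BE is then a one-liner; the value looks like GA1.2.1234567890.1614000000 and the client id is the last two segments:

```js
// Pull the GA client id out of the _ga cookie value.
// e.g. "GA1.2.1234567890.1614000000" -> "1234567890.1614000000"
function clientIdFromGaCookie(cookieValue) {
  return cookieValue.split(".").slice(-2).join(".");
}
```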
Keep an eye on cases where the cookie is not set; most frequently that means the user is running an ad-blocker. Your analysts will still want to know that the transaction happened even if context about the user is missing, and you can still track those users properly.
3.1 The laziest solution would be giving them an "AnonymousUser" client id with a random number appended, so that it both indicates the user is anonymous and still makes it possible for GA to separate them.
3.2 A better solution would be to build a fingerprint client id for such users, say, by hashing a concatenated string of their useragent+ip+locale+screen resolution. It is up to your analysts to work on the definition of a unique user if the Google Analytics library is unable to do it.
3.3 Finally, one of the best solutions would be generating a client id on your own, keeping GA's format, maybe adding an indicator that it was generated on your end (for easier debugging in the future), setting it as a cookie, and using it instead of _ga. Just use a different cookie name so that ad-blockers won't know to block it.
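A sketch of 3.3, assuming GA's usual "<random>.<unix timestamp>" shape (the cookie name below is an arbitrary example):

```js
// Generate a client id in GA's "<random>.<unix timestamp>" format.
function makeClientId() {
  const random = Math.floor(Math.random() * 0x7fffffff);
  const timestamp = Math.floor(Date.now() / 1000);
  return random + "." + timestamp;
}

// Set it as a first-party cookie under a name ad-blockers won't
// recognize, and reuse it on every hit:
// document.cookie = "srv_cid=" + makeClientId() + "; max-age=63072000; path=/";
```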
If you want to indicate that a hit was sent through the server, that's a good idea. Use a custom dimension for that. Just sync it with your analysts first; maybe they wouldn't want it, or maybe they would want it in a different dimension.
Now, this is all fairly basic. There are ways to go much deeper and improve the quality of the data from here: gluing the order id, the transaction id, and the user id to the hit, using them to generate the client id, doing some custom client tracking for the future. But I must say it's already better than what more than 90% of, say, Shopify clients have.
Also, GA4 is not yet good enough for deeper production usage; many things there are still very rudimentary and lacking. I would suggest concentrating on Universal Analytics and keeping GA4 as a backup for when Google makes it good enough to actually replace UA. That is, unless you're downloading your data elsewhere and not using GA's interface for analysis.
It seems that this page advises sending the data along with either the client_id or the user_id. However, it fails to address the fact that client_id is a mandatory field, as stated here.
I believe it is probably safe to assume that randomly generating this field should work. At least it seems to on my end; however, be warned that I am unsure whether this has any impact on attribution.
* In the documentation referenced above, Device ID refers to client_id.
I would like to create a small sidebar on each page of my website that contains related/popular pages with perhaps the top five pages users visit after reading the current page.
I could track and record user movements across the site myself and build the list that way, but as my site already uses Google Analytics and I know the data is there, I'd rather access that if at all possible.
The trouble is that I don't have the faintest idea whether it is possible or not.
Remember that the Google Analytics Reporting API is not real-time; it can take between 24 and 48 hours for the data to finish processing and become available in the API for you to request.
The Real Time Google Analytics API is real-time, but the data is only about 5 minutes old and it's very limited in the dimensions and metrics you can request.
Quota: with either of those APIs you are limited to 10,000 requests per day per profile/view. I have no idea how many pages there are on your site or how many users it has, but this could quickly blow through that non-extendable quota.
One option: accept that it's not real-time data and use the Reporting API. Every night, run a request against the API to get everything for two days ago, then show your users data that's two days old. Store the data in your database; then you're showing them data from your own DB and won't have an issue with the quota, as you only requested it once.
But this isn't exactly what you want, as it doesn't show a single user's activity across the site. TBH I am not sure you can use Google Analytics to track an individual user at all, as the data is not user-specific.
If you don't want to get involved with learning the API and develop this from the ground up, check out EmbeddedAnalytics (disclaimer: I created the service). We could provide such a widget.
You may find this article useful. It provides the necessary query to find the "next page visited", using the page of interest as a filter. Ultimately your query would look like this:
https://www.googleapis.com/analytics/v3/data/ga?ids=ga%3Aabc&start-date=30daysAgo&end-date=yesterday&metrics=ga%3Apageviews&dimensions=ga%3ApreviousPagePath%2Cga%3AnextPagePath&sort=-ga%3Apageviews&filters=ga%3ApreviousPagePath%3D%40pricing
The query above will give you the "Next Page" along with pageviews assuming the "previous" page contains the word "pricing".
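If you do want to run that query yourself rather than use a widget, here's a rough sketch (accessToken stands for an OAuth2 token with the analytics.readonly scope, obtained elsewhere, and ga:abc is your view id):

```js
// Run the "next page visited" query against the v3 Core Reporting API.
async function nextPagesAfterPricing(accessToken) {
  const url =
    "https://www.googleapis.com/analytics/v3/data/ga" +
    "?ids=ga%3Aabc&start-date=30daysAgo&end-date=yesterday" +
    "&metrics=ga%3Apageviews" +
    "&dimensions=ga%3ApreviousPagePath%2Cga%3AnextPagePath" +
    "&sort=-ga%3Apageviews" +
    "&filters=ga%3ApreviousPagePath%3D%40pricing";
  const res = await fetch(url, {
    headers: { Authorization: "Bearer " + accessToken }
  });
  const report = await res.json();
  // Each row is [previousPagePath, nextPagePath, pageviews].
  return report.rows;
}
```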
We could easily build such a report widget for you:
You would insert a javascript source code snippet into your page. The javascript would pass the page url to our server and we would return the next "most popular pages visited".
The pages could be "linkified" so that someone could click the link to go to that page.
We already have a caching mechanism in place, so each pageview would not require a new query to Google (making it quicker and also staying away from the API quota mentioned above). For pages that are hardly ever looked at (e.g. less than once a week), we could make "on-demand" calls to get the statistics.
In my experience with the API, the lag is only a couple of hours. It may be longer for larger sites.
Please let me know if you are interested in such widget and I can work with you.
This may be a possible duplicate of this question, but according to all the Google Analytics documentation I really should be able to pull my list of custom segments.
Since I have a very large list of them, it would be suboptimal for me to manually copy the segment ids over one at a time.
I'm following this walkthrough. Steps to reproduce:
Create a custom segment using date of first session in your Google Analytics account.
Authorize the Google Analytics guide to access your Google Analytics account.
Try their on-page query tester, and inspect whether your custom segment is there.
One thing I've already ruled out was the user that created the segment. I've manually created a segment with the same user that I'm querying the API with and it still does not show. Is there a flag I need to set somewhere to include custom segments?
Edit:
It turns out that it will list some custom segments, but not ones created with "date of first session", so this is a duplicate of this question, which means that there is a bug in the Google Analytics API.
There was a bug, which is now fixed, so it is now possible to list "date of first session" segments in the Google Analytics Management API by calling the segments.list() method.
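For reference, a minimal sketch of calling it over REST (accessToken is a placeholder for an OAuth2 token with the analytics.readonly scope):

```js
// List all built-in and custom segments visible to the authorized user.
async function listSegments(accessToken) {
  const res = await fetch(
    "https://www.googleapis.com/analytics/v3/management/segments",
    { headers: { Authorization: "Bearer " + accessToken } }
  );
  const body = await res.json();
  for (const seg of body.items) {
    console.log(seg.id, seg.name);
  }
}
```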
So after days of trying to solve this one I've come to the conclusion that it cannot be done as asked.
There is, however, another way to do it. For every segment, set up a daily (or weekly, etc.) email report sent to an email address as a TSV. In each email body, specify the name of the segment, so that when you're consuming the emails you know which segment the attached TSV is for. It doesn't look like the daily reports were designed with segments in mind, since none of the metadata included in the TSV mentions which segment it is for.
From there it's trivial: connect to the email address using an IMAP client once a day and update the numbers.
Note that the daily email only contains the numbers for that day (not a specified range), so you'll need to first generate the report one time with the historical data to load in.
While hacky, one nice thing about this approach is that it keeps your reports in sync with your (faked-through-email) API code, provided you match the column headings in the TSV. So if, for example, a new filter is added to a report, the new daily fields will continue to update.
Unfortunately, though, the change won't be reflected in the past data.
Obviously this isn't great, but if you are monitoring daily cohorts it's the best you've got if you need to stay with Google Analytics. I have raised this as a bug to the Google Analytics developers, but I haven't heard back as to whether or not they plan to fix it.
I would like some feedback on tracking user activity on an ecommerce website using the Google Analytics ecommerce capabilities.
I can't fully understand these 3 parts:
Adding an item (ecommerce:addItem): obviously, when a user adds something to the cart
Adding a transaction (ecommerce:addTransaction): that's where I'm very confused
Sending the data (ecommerce:send): that's obvious
Can those 3 events happen at different moments? In what manner?
What would be a real-world use case that would make you execute ecommerce:addTransaction and ecommerce:send at different moments?
This makes me wonder a lot, and I'd like some experienced feedback on it, as you can easily break your stats if something is not done well enough.
Thanks in advance
EDIT
So the main purpose here is to get stats for pending orders (you added stuff to your cart) as well as complete orders (you paid for the things you added).
Right now I only send it all when the order is complete, and things are working pretty well in Analytics, but I just don't know anything about the orders that were never completed.
This question came from a lack of knowledge.
The simple ecommerce plugin has nothing to do with the enhanced ecommerce plugin.
You won't track much with the first one except the checkouts: a plain, one-order-at-a-time revenue value.
If you want deep insight into your users' behavior (and when I say deep, I mean it), you have to go for the second one.
We could debate the uselessness of the first one, and the fact that its very existence alongside the second is completely misleading, since when you first get in (as usual with Google) you get flooded by endless documentation.
ecommerce:addItem does not add items to a cart; it adds items to a transaction. (With "conventional" ecommerce tracking there is no cart tracking; you'd have to use enhanced ecommerce tracking for that. Incidentally, your title refers to enhanced ("ec:") tracking while your question refers to conventional ("ecommerce:") tracking.)
So ecommerce:addTransaction starts a transaction: here goes the stuff that affects the transaction as a whole, like the transaction id, tax on the total purchase, or shipping costs.
Now that you have started the transaction you can add items to it that are associated via the transaction id.
Finally, the ecommerce:send command tells Universal Analytics that the transaction should be processed on the server. "send" is actually a misnomer; addItem and addTransaction already send data to the server (they each create a request to the tracking server and thus count towards your hit quota).
The reason for this is, as far as I can tell, that the information is transmitted via URL parameters (you call the Google Analytics endpoint, which returns a transparent pixel). The maximum length of a URL is limited (actual limits depend on the browser and browser version).
So the transaction is broken up into multiple parts not because you want to execute the commands at different moments, but so that it can be transmitted via URL parameters without being truncated. The send command merely signals that you are finished adding parts to the transaction and the data can now be processed.
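Putting it all together, the conventional plugin is typically used in one block like this (the ids and values are the usual placeholder examples from the docs):

```js
ga('require', 'ecommerce');

// Transaction-level data: one call per order.
ga('ecommerce:addTransaction', {
  id: '1234',                    // ties the items below to this order
  affiliation: 'Acme Clothing',
  revenue: '11.99',
  shipping: '5',
  tax: '1.29'
});

// Item-level data: one call per line item, linked by the same id.
ga('ecommerce:addItem', {
  id: '1234',
  name: 'Fluffy Pink Bunnies',
  sku: 'DD23444',
  category: 'Party Toys',
  price: '11.99',
  quantity: '1'
});

// Signal that the transaction is complete and can be processed.
ga('ecommerce:send');
```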
I have a site that associates Google Place information with users, and displays that information on a map. For instance, a user can search for a place (currently with the Places Library Autocomplete API) and bookmark this place for later retrieval.
As per the Google Maps TOS, I am only storing the Place ID and its reference in my database and am making client-side requests for the coordinate information of each place whenever I need to display them on a map.
I've recently encountered an issue where making more than 10 consecutive API requests for coordinate information (within a javascript loop) using the getDetails method on the service object returns an OVER_QUERY_LIMIT status code.
My question is this: if I am correct in assuming that storing the latitude and longitude of each place in my database violates Google's TOS, how can I programmatically retrieve the coordinate information for a number of places so that I can display these places on a map for a given user?
According to the terms - 10.1.3(b) -
(b) No Pre-Fetching, Caching, or Storage of Content. You must not pre-fetch, cache, or store any Content, except that you may store: (i) limited amounts of Content for the purpose of improving the performance of your Maps API Implementation if you do so temporarily, securely, and in a manner that does not permit use of the Content outside of the Service; and (ii) any content identifier or key that the Maps APIs Documentation specifically permits you to store. For example, you must not use the Content to create an independent database of "places" or other local listings information.
So if certain places are getting queried a lot, you can temporarily store information on those places to improve performance.
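For example, a small client-side TTL cache in front of getDetails would keep repeat lookups from burning quota; the 24-hour TTL below is an arbitrary choice well inside the 30-day cap:

```js
// Cache place details by place id for up to 24 hours.
const TTL_MS = 24 * 60 * 60 * 1000;
const placeCache = new Map(); // placeId -> { result, at }

function getPlaceDetails(service, placeId, callback) {
  const hit = placeCache.get(placeId);
  if (hit && Date.now() - hit.at < TTL_MS) {
    return callback(hit.result, google.maps.places.PlacesServiceStatus.OK);
  }
  service.getDetails({ placeId: placeId }, function (result, status) {
    if (status === google.maps.places.PlacesServiceStatus.OK) {
      placeCache.set(placeId, { result: result, at: Date.now() });
    }
    callback(result, status);
  });
}
```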