Mobile data reported in GA Measurement Protocol appear in realtime but not in daily summary - google-analytics

I've been attempting to log activity on a mobile-like device using the Google Analytics Measurement Protocol. All of these attempts have validated using the validation URL, and I can see activity when I look at the real-time reports on the Analytics website. But when I look at the Home or Overview reports for the day - no activity is shown.
The view is set for "All Mobile App Data".
The POST body looks something like this:
v=1&tid=UA-000000000-1&ds=app&qt=1601&uid=uid-zzzzz&t=screenview&cd=Foo&an=Foo%20App%20Name&aid=com.example.foo&aiid=com.example.foo&av=0.0.1&ua=Mozilla%2F5.0%20(Linux%3B%20Android%207.0%3B%20SM-G930V%20Build%2FNRD90M)%20AppleWebKit%2F537.36%20(KHTML%2C%20like%20Gecko)%20Chrome%2F59.0.3071.125%20Mobile%20Safari%2F537.36
The ua field is just a pre-defined string. I found that if I omitted it, the Real Time monitoring listed the hits as desktop hits, although I was in a Mobile report and the ds field was "app".
Am I missing a field that is required? Is there some reason why it is showing up in the real-time report, but not in a daily report? Is there some other way to diagnose why the data is vanishing, or confirm the data is actually being captured?
When i check the debug endpoint the hit is valid
Request:
https://www.google-analytics.com/debug/collect?v=1&tid=UA-XXX-1&ds=app&qt=1601&uid=uid-zzzzz&t=screenview&cd=Foo&an=Foo%20App%20Name&aid=com.example.foo&aiid=com.example.foo&av=0.0.1&ua=Mozilla%2F5.0%20(Linux%3B%20Android%207.0%3B%20SM-G930V%20Build%2FNRD90M)%20AppleWebKit%2F537.36%20(KHTML%2C%20like%20Gecko)%20Chrome%2F59.0.3071.125%20Mobile%20Safari%2F537.36
Response
{
"hitParsingResult": [ {
"valid": true,
"parserMessage": [ ],
"hit": "/debug/collect?v=1\u0026tid=UA-53766825-1\u0026ds=app\u0026qt=1601\u0026uid=uid-zzzzz\u0026t=screenview\u0026cd=Foo\u0026an=Foo%20App%20Name\u0026aid=com.example.foo\u0026aiid=com.example.foo\u0026av=0.0.1\u0026ua=Mozilla%2F5.0%20(Linux%3B%20Android%207.0%3B%20SM-G930V%20Build%2FNRD90M)%20AppleWebKit%2F537.36%20(KHTML%2C%20like%20Gecko)%20Chrome%2F59.0.3071.125%20Mobile%20Safari%2F537.36"
} ],
"parserMessage": [ {
"messageType": "INFO",
"description": "Found 1 hit in the request."
} ]
}
I cannot use one of the mobile libraries from Firebase - this is not one of the platforms they support. I do not wish to pretend this is a web page - there is no associated hostname or path. I do not wish to use Events since I can't do event Behavior Flow, which is one of the things I'm interested in seeing.
I'm aware that it can sometimes take "a day or so" for results to first appear. The site was setup over five days ago at this point, and has received data during that time.
Good thought about the anti-spam setting, however the setting appears to be correct:
I've also tried using GET instead of POST - no change, it still shows the hit in real-time, but then it vanishes.
However, I know that it can record hits permanently. There were two hits from a spammer in Russia that have shown up in the daily report (I wasn't there to see it show up in real-time). I don't know what they did, but would love to find out since it might help figure out how I can add a record.
In the real-time reports, it correctly points out the data center all the hits are coming from. Perhaps that is filtering it out somewhere out of my control?

Try adding Cid I know it says this is an optional parameter but for mobile accounts I belive it may be required.
Client ID
Optional.
This field is required if User ID (uid) is not specified in the request. This anonymously identifies a particular user, device, or browser instance. For the web, this is generally stored as a first-party cookie with a two-year expiration. For mobile apps, this is randomly generated for each particular instance of an application install. The value of this field should be a random UUID (version 4) as described in http://www.ietf.org/rfc/rfc4122.txt.
Example value: 35009a79-1a05-49d7-b876-2b884d0f825b
Although this says it needs to be a UUIDv4, it does work with other UUIDs (I've tested it with a v5, which is a hash against the value used for the uid parameter).

Related

Why Google Measurement Protocol is splitting my session before the configured limit?

I'm working on a TV application and I would like to track what people do on it. So I used Google Measurement Protocole to do so. The page and event tracking is working fine because I can see in real time what's is happening on the Analytics 360 website. Everything was ok until I look at the "User explorer" to pick a random user in order to see what he did previously. I found out that he had a lot of sessions in a very short period. So I ran a test for myself yesterday and the result is the same today. I had only 1 session with multiple events for more than an hour and by paying attention to trigger an event before the 30min of inactivity which is configured by default.
Here are examples of data sent :
ds=qml
v=1
tid=UA-XXX-YYY
cd1=local
cm=tv
cid=123456789 (unique id)
uip=XX.XX.XX.XX
ua=SAMSUNG
ul=en
cd=HomePage
dp=HomePage
dt=home
cs=MY_TV_APP
t=event
ec=MY_TV_APP
ea=click on button
el=HomePage
or
ds=qml
v=1
tid=UA-XXX-YYY
cd1=local
cm=tv
cid=123456789 (unique id)
uip=XX.XX.XX.XX
ua=SAMSUNG
ul=en
cd=HomePage
dp=HomePage
dt=home
cs=MY_TV_APP
t=pageview
Here is a screenshot of my session: here
As you can see, there are 7 sessions instead of 1... I don't understand why my session was splitted into 7 parts. I tried to use "sc=start" and "sc=end", the result is the same.
Has anyone ever encountered this problem before?

Google Analytics User Id doesn't take utm_ (acquisition) parameters into account

I have implemented the following process to track User ID on my website:
If the user is not yet logged in, track a pageview
Once he registers or logs in, set user id and keep setting it on every page
I have session unification turned on.
It works OK except that in the "user id" view, I see 100% of acquisition surce/medium as (direct) / (none), but in "all data" though, the utm_ parameters for the same session have been tracked correctly.
I would like to see which campaign was each given user acquired by, but because of this, I am not seeing that.
What am I doing wrong?
Ok, I found the answer in the docs eventually:
Session unification is completed during daily data processing. Processing begins at 5am each day, based on the western most timezone selected in any reporting view that is associated with the property.
Looks like I have everything set up right, just need to wait for the recalculation.

Paypal Processing - Need to grab TransactionId, CorrelationId and TimeStamp

Current Project:
ASP.NET 4.5.2
MVC 5
PayPal API
I am using this example to build myself a PayPal transaction (and yes, my code is virtually identical), as I do not know of any other method that will return the three values in the title.
My main problem is that, the example I am utilizing is much more concise and compact than the one I used for a much older Web Forms application, and as such, I am unsure as to where or even how to grab the three values I need.
My initial thought was to do so right after the ACK, and indeed I was able to obtain the CorrelationId as well as the TimeStamp, but because this was prior to the user being carted off to PayPal’s site (sandbox in this case -- see the return new PayPalRedirect contained within the if), the TransactionId was blank. And in this example, PayPal explicitly redirects the user to a Success page without returning to the Action that sent the user to PayPal in the first place, and I am not seeing any GET values in the URL at all aside from the Token and the PayerId, much less ones that could provide me with the TransactionId.
Suggestions?
I have also looked at the following examples:
For ASP.NET Core, was unsure how to adapt to my current project particularly due to appsettings.json, but it looked quite well done. I really liked how the values were rolled up in lists.
For MVC 4, but I couldn’t find where ACK was being used to determine success or successwithwarning so I couldn’t hook into that.
I have also found the PayPal content to be like trying to drink from a fire hose at full blast -- not only was the content was hopelessly outdated (Web Forms code, FTW!) but there was also so many different examples it would have taken me days to determine which one was most appropriate to use.
Any assistance would be greatly appreciated.
Edit: my initial attempt at modifying the linked code has this portion:
values = Submit(values);
var ack = values["ACK"].ToLower();
if(ack == "success" || ack == "successwithwarning") {
using(_db = new ApplicationDbContext()) {
var updateOrder = await _db.Orders.FirstOrDefaultAsync(x => x.OrderId == order.OrderId);
if(updateOrder != null) {
updateOrder.OrderProcessed = false;
updateOrder.PayPalCorrelationId = values["CORRELATIONID"];
updateOrder.PayPalTransactionId = values["TRANSACTIONID"];
updateOrder.PayPalTimeStamp = values["TIMESTAMP"];
updateOrder.IPAddress = HttpContext.Current.Request.UserHostAddress;
_db.Entry(updateOrder).State = EntityState.Modified;
await _db.SaveChangesAsync();
}
}
return new PayPalRedirect {
Token = values["TOKEN"],
Url = $"https://{PayPalSettings.CgiDomain}/cgi-bin/webscr?cmd=_express-checkout&token={values["TOKEN"]}"
};
}
Everything within and including the using() is my added content. As I mentioned, the CorrelationId and the TimeStamp come through just fine, but I have yet to successfully obtain the TransactionId.
Edit 2:
More problems -- the transactions that are “successful” through the sandbox site (the ReturnUrl is getting called) aren’t reflecting properly on my Facilitator and Buyer accounts, even when I do payments straight from the buyer’s PayPal account (not using the Credit Card). I know I am supposed to see transactions in the Buyer’s account, either through the overall Dev account (Accounts -> Profile -> balance or Accounts -> Notifications) or through the Buyer’s account in the sandbox front end. And yet -- multiple transactions returning me to the ReturnUrl path, and yet no transactions in either.
Edit 3:
Okay, this is really, really weird. I have gone over all settings with a fine-toothed comb, and intentionally introduced errors to see where things should crap out. It turns out that the entire process goes swimmingly - except nothing shows up in my notifications and no amounts get moved between my different accounts (Facilitator and Buyer). It’s like all my transactions are going into /dev/null, yet the process is successful.
Edit 4: A hint!
In the sandbox, where Buyer accepts the transaction, there is a small note, “You will be able to review the transaction before completing it” or something like that -- suggesting that an additional page is not coming up and that the user is being uncerimoniously dumped back to the success page. Why the success page? No clue. But it’s happening.
It sounds like you are only doing the first part of the process.
Express Checkout consists of 3 API calls:
SetExpressCheckout
GetExpressCheckoutDetails
DoExpressCheckoutPayment
SEC generates a token, and then you redirect to PayPal where the user signs in and reviews the transactions before agreeing to pay.
They are then sent to the ReturnURL included in your SEC request, and this is where you'll call GECD in order to obtain all the buyer details that are now available since they signed in.
Using that data you can complete the final DECP request, which is what finalizes the procedure. No money is actually processed until this final call is completed successfully.

Google Drive SDK Push Notifications, receiving wrong changes id

When I subscribe to all changes of my Drive account, sometimes I receive changes with wrong id. According to my observations changes of specific file are aggregated in last change with some time period.
For example:
If I change file in my drive and if I have received 3 notifications with ids: "#21, #22, #23", I expected that I can get change of "#23", if there is no more changes to that file. But sometimes I receive last change with id greater than it exists. When I use API changes list, I get lastlargestChangeId = receivedChangesId - 1.
I have tested it with google examples and I get the same results:
push notifications test
{"notification_id": "xxxxxxxxxxx", "resource_state": "change", "expiration": "Mon, 07 Jul 2014 13:58:37 GMT", "self_link": "https://www.googleapis.com/drive/v2/changes/3387"}
{
"kind": "drive#changeList",
"etag": "xxxxxxxxxxx",
"selfLink": ".../changes?startChangeId=3340",
"largestChangeId": "3386",
"items": [
...
]
}
Am I wrong?
It seems that Google categorized it as a bug:
https://code.google.com/a/google.com/p/apps-api-issues/issues/detail?id=3706
Star it so Google can prioritize it.
Here is the official answer:
After talking with our engineers, this actually is working as
intended. The change ID given in the push notification (the ID and the
self link that contains that same ID) remain valid only as long as
there are no newer changes on a particular resource. Take the
following example scenario on a user's drive with push notifications
being sent for all changes. 1) Resource A (a file or folder, etc)
changes. 2) Drive API sends a push notification with a change ID
(example 1234). 3) This change ID (and the selflink) can be used now
to successfully pull up the change. 4) Resource A changes again. 5)
Drive API sends a push notification with change ID 1235. Note that
change IDs are monotonically increasing. 6) Accessing the earlier
change ID (1234) through a changes.get or the self link will 404 now
because there is a newer change on the same resource. 7) You can still
access change ID 1235 here.
In general, you should not expect to be able to get a particular
change ID with changes.get. Any resource is only in the changes stream
once. So as soon as the resource has a newer change, the old change ID
is invalid. Some commenters have noted that you can do changeID - 1 to
get this change. However, this will not always work. In my testing, I
have gotten a 404 on both the change ID and for change ID - 1.
Instead, you should sync a given set of resources and note the largest
change ID of that set. Then, when you get push notifications in the
future, see if the change ID you get in the push notification is newer
than the largest you stored. If so, use changes.list in order to get
all of the resource changes between the last change ID you have seen /
stored and the one you just received.
tldr; Do not count on change IDs existing with changes.get. Use
changes.list instead to get all changes from a base change ID to the
change ID you get in the push notification.
https://code.google.com/a/google.com/p/apps-api-issues/issues/detail?id=3706

Google Analytics Realtime Sandbox Environment

I am looking for a way to setup a google analytics sandbox environment that will allow me
to test out my custom js code near real time.
My app will be using custom variables for advanced segmentation, and I would like to test out multiple scenarios quickly, as opposed to setting up a dummy GA account and wait for a whole day to confirm the test.
Thanks
Great question.
For GA, server updates occur every four hours, and after every sixth such update, the entire set is recalculated, which means a 24-hour lag from code change to reliable feedback. This delay also applies to most customizations to the GA Browser (e.g., "custom filters").
So if you are going to use GA as your web metrics system, and you expect to actually rely on those data then a test rig is essential.
For me, it's useful to group test systems for client-side analytics using two rubrics: (i) complete, self-contained (closed-loop) systems; or (ii) simpler automated data pulls from the production system (by "production system" here i mean GA's system, not the Site whose pages the GA code is tracking).
For the latter, just add this line to each page of your Site that contains the GA tracking code, just below '__trackPageview()':
pageTracker._setLocalRemoteServerMode();
That line will cause a copy of each transaction line to be logged to your server's activity log--so in essence, you get the data captured by GA in real-time That's all you need to do to capture the data; to parse it, you can use, for instance, any of the excellent open source web log analyzers like AWStats, or roll your own.
This is simple and reliable--but all it can do is tell you (in real-time) "does the analytics code i just implemented on pages served by my production server actually work?"
Usually, that's not good enough--you would rather know if your code will work before it's on your production server. To do that, you need to simulate the production environment and find a way to access in real-time the data GA collects.
This kind of test rig is a little more involved, but still not difficult.
In sum, it requires these steps:
host/serve the ga.js and the
tracking pixel locally;
log the __utm.gif requests (in the
GA data flow, each request
corresponds to one logged
transaction); and
parse the headers into some
convenient human-readable form.
If you want more detail than that (ie, a step-by-step implementation), here it is:
I. Hosting/Serving the GA Script (& automating updates
To do that, you can create a small shell script like this one to wget the latest ga.js version into your local directory (replacing the extant version it finds there).
#!/bin/sh
rm /My_Sites/sitename.com/analytics/ga.js
cd /My_Sites/sitename.com/analytics/
wget http://www.google-analytics.com/ga.js
chmod 644 /My_Sites/sitename.com/analytics/ga.js
cd ${OLDPWD}
exit 0;
(Thanks to AskApache.com, which provided the original motivation and config details to do this in a production context.)
II. Create __utm.gif file
This is just a transparent 1x1 pixel gif image, which you will place in Site directory (doesn't matter where, it just needs to match the location recited in your pages)
III. Log the __utm.gif Requests
For a testing protocol in which you are the source of the client-side activity (e.g., you want to verify the cross-browser fidelity of some event-tracking code you've added to a page on your Site, so you automate 5000 clicks on the button you just wired up,serving the page from your dev server set up for this purpose) it's probably simplest to just log the Request Headers, because it's in those headers that the GA script directs the client to gather various data from the DOM, from the location bar (url), and from prior http headers, and append them to a request for a resource on the GA server (__utm.gif, which is just a 1x1 transparent pixel).
For this type of protocol, i use the Firefox addon, LiveHTTPHeaders. You install it like any other Firefox addon, a few mouse clicks is all. Next, open it, and click the "Generator" tab. From this window, you can see the actual requests in real time. At the bottom of the window is a 'save' button to store the log. I find it easier to configure LiveHTTPHeaders to log only the __utm.gif requests; to do that, just click the 'Edit' tab and create a siimple filter to exclude everything except these particular gif images (using the check boxes on the right, and the large text box to the right).
Other kinds of test protocols require you to work from your Server Activity Logs; in that case just add this line to each page of your Site, just below __trackPageview():
pageTracker._setLocalRemoteServerMode();
IV. Parse those logged requests so you can actually read them
So now your log will contain individual transction lines, each one of which is a string appended to an HTTP Request for the GA tracking pixel. This string is just a concatenation of key-value pairs, each key begins with the letters "utm" (probably for "urchin tracker"). Each of these parameters corresponds to a variable that you see in the GA Dashboard (here's a complete list and description of them). This is all you need to know to build a parser. In more detail:
First, here's a sanitized __utm.gif request (the entries in your LiveHTTPHeaders log):
http://www.google-analytics.com/__utm.gif?utmwv=1&utmn=1669045322&utmcs=UTF-8&utmsr=1280x800&utmsc=24-bit&utmul=en-us&utmje=1&utmfl=10.0%20r45&utmcn=1&utmdt=Position%20Listings%20%7C%20Linden%20Lab&utmhn=lindenlab.hrmdirect.com&utmr=http://lindenlab.com/employment&utmp=/employment/openings.php?sort=da&&utmac=UA-XXXXXX-X&utmcc=__utma%3D87045125.1669045322.1274256051.1274256051.1274256051.1%3B%2B__utmb%3D87045125%3B%2B__utmc%3D87045125%3B%2B__utmz%3D87045125.1274256051.1.1.utmccn%3D(referral)%7Cutmcsr%3Dlindenlab.com%7Cutmcct%3D%2Femployment%7Cutmcmd%3Dreferral%3B%2B
This is my parser (in Python):
# regular expression module imported
import re
pattern = r'\&{1,2}'
pat_obj = re.compile(pattern)
# splitting the gif request on the '&' character
# (which GA originally used to concatenate each piece to build the request)
# (here, i've bound the __utm.gif to the variable by 'gfx')
gfx1 = pat_obj.split(gfx)
# create a look-up table to map a descriptive name to each gif request parameter
# (note, this isn't the entire list, which i've linked to above)
keys = "utmje utmsc utmsr utmac utmcc utmcn utmcr utmcs utmdt utme utmfl utmhn utmn utmp utmr utmul utmwv"
values = "java_enabled screen_color_depth screen_resolution account_string cookies campaign_session_new repeat_campaign_visit language_encoding page_title event_tracking_data flash_version host_name GIF_req_unique_id page_request referral_url browser_language gatc_version"
keys = keys.strip().split()
#create the look-up table
GIF_REQUEST_PARAMS = dict(zip(keys, values))
# parse each request parameter and map the parameter name to a descriptive name:
pattern = r'(utm\w{1,2})=(.*?)$'
pat_obj = re.compile(pattern)
for itm in gfx1 :
m = pat_obj.search(itm)
if m :
fmt = '{0:25} {1:10}'
print( fmt.format( GIF_REQUEST_PARAMS[m.group(1)], m.group(2) ) )
The result looks like this:
gatc_version              1         
GIF_req_unique_id         1669045322
language_encoding         UTF-8     
screen_resolution         1280x800  
screen_color_depth        24-bit    
browser_language          en-us     
java_enabled              1         
flash_version             10.0%20r45
campaign_session_new      1         
page_title                Position%20Listings%20%7C%20Linden%20Lab
host_name                 lindenlab.hrmdirect.com
referral_url              http://lindenlab.com/employment
page_request              /employment/openings.php?sort=da
account_string            UA-XXXXXX-X
cookies
To avoid making this longer still, i left out the cookies' value. They obviously require a separate parsing step, though it's virtually identical to the step i just showed. Again, each request represents a single transaction, so you can store them as you need to.

Resources