How to show how many people have viewed a post - asp.net

I have a BBS forum, I just wondered how to count the views of a post?
I need to consider of both registered users and visitors.
I was thinking of counting IPs, but many users or visitors might come from a same IP. Then I am thinking of counting GA (as Google states every _ga lasts for 2 years).
I searched for a while, but have not found any examples.

Your question is too broad, but generally speaking, you'll need some sort of middleware to run as part of the request pipeline and update some data store. If you're looking to discount multiple views by the same individual, then you'll need some sort of discriminator. That's where something like IP address would come in, but as you've mentioned it's not perfect and could end up lumping multiple people into one unit. The best approach is to set an identifier via a cookie, much as GA does with their tracking cookies. That identifier could be anything; it should just be unique for the cookie. Guid.NewGuid() should suffice. Then, you simply record this with the record of the view, and before saving a new record of a view, check for a record with that cookie value, first, if present.
A few things to bear in mind:
Instead of updating a single ViewCount column or something, you should track this in an external table that records the URL that was visited and any discriminating info such as the tracking cookie identifier. When you need to get a full count of views, just aggregate the count from this table. That will remove most of your concurrency worries. Otherwise, you'd have to gate writes to the ViewCount prop, or whatever, which will create choke point for your application.
Even a tracking cookie isn't perfect. For one, it will be device-specific, so the same user will be counted if they visit from both a desktop and mobile device, or even just multiple browsers on the same device. However, there's really not a better option here. If you do have an actual logged in user to work with, you can use that as a potential discriminator to get more exact counts, but of course, you still need a fallback option for anonymous users.

Related

What should the client_id be when sending events to Google Analytics 4 using the Measurement Protocol?

I am using Google Analytics 4 (GA4) on the client to track a whole bunch of different events. However, there are 2 scenarios that I can't cover client side:
A user completing check out on a payment page hosted by a third-party (Stripe in this case).
A refund that is made by the support team.
These events are handled by the server using webhooks. To me it seems like the most straightforward solution, would be to let the server send the event to GA4 (as opposed to the client sending it). I believe the Measurement Protocol should be used for this.
For each event submitted through the Measurement Protocol a client_id is required. When the client is submitting an event, this is an automatically generated ID which is used to track a particular device.
My question thus is, what should the client_id be when submitting an event server-side?
Should the same client_id perhaps be used for all events, as to recognize the server as one device? I have read some people proposing to use a randomly generated client_id for each event, but this would result in a new user to be recognized for every server-side event...
EDIT:
One of the answers proposes to use the client_id, which is part of the request as a cookie. However, for both examples given above, this cookie is not present as the request is made by a third-party webhook and not by the user.
I could of course store the client_id in the DB, but the refund in the second example is given by the support team. And thus conceptually it feels odd to associate that event with the user's client_id as the client_id is just a way to recognize the user's device? I.e. it is not the user's device which triggered the refund event here.
Another refund event example would be when user A makes a purchase with user B and user B refunds this purchase a week later. In this situation, should the client_id be the one of user A or of user B? Again, it feels odd to use a stored client_id here. Because, what if user A is logged in on two devices? Which client_id should be used here then?
Great question. Yes, your aim to use Measurement Protocol is a proper solution here.
Do not hardcode the client id. It's gonna be a hellish mess in reports. The nature of user-based reporting (which GA is) demands client ids to uniquely identify users. To your best ability.
GA stores the client id in a cookie. You should have convenient and immediate access to it on every client hit to BE. The cookie name is _ga. GA4 appends the measurement id to the cookie name. Here, google's docs on it: https://developers.google.com/analytics/devguides/collection/analyticsjs/cookie-usage But you can easily find it if you inspect "collect" hits and look at their payloads. There's another cookie named _gid that contains a different value. That would be a unique client id. Set it too if you can, but don't use it for the normal client id. It has a different purpose. Here how the cookie looks here, on stack:
And here it is in Network. You will need it for proper debugging. Mostly to make sure your FE client ids are the same as BE client ids:
Keep an eye on the cases when the cookie is not set. When a cookie is not set, that most frequently means the user is using an ad-blocker. Your analysts will still want to know that the transaction happened even if there's a lack of context about the user. You still can track them properly.
3.1 The laziest solution would be giving them an "AnonymousUser" client id and then append a random number to that so that it would
both indicate that a user is anonymous and still make it possible
for GA to separate them.
3.2 A better solution would be for you to make a fingerprint client id for such users, say, hashing a concatenated string of their
useragent+ip+locale+screen resolution, this is up to your analysts
to actually work on the definition of a unique user if the google
analytics library is unable to do it.
3.3 Finally, one of the best solutions for you would be generating a client id on your own, keeping GA's format and maybe adding an indicator there that it has been generated on your end just for easier debugging in the Future and setting it as a cookie, using it instead of _ga. Just use a different cookie name so that ad-blockers wouldn't know to block it.
If you want to indicate that a hit was sent through the server, that's a good idea. Use custom dimension for that. Just sync it with your analysts first. Maybe they wouldn't want that, or maybe they would want it in a different dimension.
Now, this is very trivial. There are ways to go much deeper and to improve the quality of data from here. Like gluing the order id, the transaction id, the user id to that, using them to generate client id, do some custom client tracking for the future. But I must say that it's better than what more than 90% of, say, shopify clients have.
Also, GA4 is not good enough for deeper production usage. Many things there are still very rudimentary and lacking. I would suggest concentrating on Universal Analytics and having GA4 as a backup for when Google makes GA4 actually good enough to replace UA. That is, unless you're downloading your data elsewhere and not using GA's interface for analysis.
It seems that this page (Relevant portion in the screenshot below), advices to either send the data along with the client_id or user_id. However fails to address the fact client_id is a mandatory field as stated here.
I believe it is probably safe to assume that randomly generating this field should work. At least it seems to on my end however be warned that I am unsure if this has any impact on attribution.
* In the above image, Device ID refers to client_id

How can I track visitors’ paths from one page to another with full URLs?

Say I have two pages on a site called “Page 1” and “Page 10”. I'd like to be able to see the paths visitors take to get from “Page 1” to “Page 10” with full URLs intact. Many of the URLs (including those for “Page 1” and “Page 10”) will include query strings that are important.
Is this possible? If so, how?
Try using behavior flow reports. The report basically shows you how visitors click through your website. There are a lot of ways to customize the report, with which you will need to play around to really answer your question. By default, the behavior flow focuses on entry and exit points of visitors, regardless how many times they hit the different subpages in between. However, I'm sure you can set appropriate filters and settings to answer your question.
I use two methods for tracking where people have been on my website:
Track and store the information in my own SQL database. (details below)
Lead Forensics (paid subscription, but you can do a trial).
For tracking and storing my own data, I record unique visitors based upon the IP Address they're connecting from and then have a separate table that records all page views that links back to the unique visitor table.
Lead Forensics data simply allows me to link up those unique visitors with actual companies that have viewed my website.
Doing it yourself means you don't have to rely on Google working for your records to work, and in my experience Google Analytics tends to round numbers so you don't get a true indication of numbers, and also you can remove bots and website trawlers from your data by tracking the user agent string.
As a somewhat ugly hack you could use transaction tracking. If you use the same transaction id multiple times subsequent products will be added to the existing data. So assign an ID at the start of the visits and on each page record a transaction with the current page url as product name (and the ID as transaction id). This will give you the complete path per user (I am frankly not to sure how this is useful - at some point you probably want aggregated data. Plus each transaction and product counts towards your quota for interaction counts, so on a large site you might run over the 10mio hits limit).
you can do it programatically
have a MAP in the backend which stores the userId (assuming u would have given a unique ID at the time of login to each user) with a list of Strings(each string being URL visited by that user)
whenever the user hits another URL from Page 1(and only from page1, check it using JS), send a POST request to backend with the new URL in its data section.
In the backend, check if the URL is of Page 10 and if not, add this URL as a string into the MAP for that corresponding user
Finally, when the user clicks on the Page 10 URL, you know the URLs in the way from Page 1 to Page 10 and so use them.
Though if I consider JS and I have not misunderstood your question, we can get the previous URL from request header information using document.referrer.
Are you trying to do it from 'Google Tag Manager'? I am not sure whether you are trying to trace the URLS in clientside or server side?

can google analytics tell me http referrer for each specific goal conversion?

I have a website that allows people to create an account (that is the conversion I wish to track).
I wish to know where a specific person is coming from. I have google analytics installed and have set up the registration page as a goal, but the reporting tells me traffic sources as an aggregated pie chart. It doesn't report down to the user account level to say that 'person with email xyz' came from 'facebook' for example.
What custom variables or mark up would I need to add to GA to report at that detailed level, if that is at all possible?
Otherwise, I will just have to record the first http_referer in a cookie and stick it in a database during the registration process.
Any advice?
Firstly I must ask you, how actionable do you think it is to look at data at that granular of a level? Finding out what % of people who registered came from facebook or some other place is actionable, because it helps you do things like determine where to focus marketing efforts. But individual users? How is this actionable to you? (hint: it's not)
However, if you are still determined to know this, you should first note that it is against Google's ToS to record personally identifiable data both directly (recording the actual value in GA) or indirectly (e.g. - recording a unique id that you can use to tie to personal info stored within your own system). If this is something you don't want to risk, I suggest moving to another analytics tool that does not have this sort of thing in their ToS (e.g. Adobe SiteCatalyst, which costs money, or perhaps you may instead prefer to choose an "in-house" approach, like Piwik)
If you are still determined to follow through with this and hope not to get caught or whatever, Google Analytics doesn't record data like what info a visitor filled out in a form (like their email address) unless you populate that data in a custom field/dimension/metric/event to be sent along with the request. Usually you would populate this on the form "thank you" page (which is usually the same page you use as your goal url or goal event if you're popping and using an event for your goal). So you would populate the email address in one of those custom variables and then have it as a dimension to break down the http referrer by.

Counting anonymous votes accurately

I'm building a small application that highly depends on anonymous user voting on some sort of items. It's so small that requiring registration would be tedious and could not be justified.
Anyway, I did some research on this, including a search here on stackoverflow (https://stackoverflow.com/search?q=anonymous+votes), and doesn't seem that there's a satisfying answer.
My question is: are there any security measures that I can apply to prevent gaming anonymous votes?
One thing comes to mind is CAPTCHA, but I'd like to avoid that since users will vote on multiple items in a very short period of time, and CAPTCHAs will just annoy them.
Another thing I thought of is limiting the number of votes per minutes from a single IP (in addition to a cookie), but not sure how this is going to work.
Any thoughts?
There are a few ways I've seen work:
Email registration : you get their email, they need to confirm their vote. The combination of their IP + email makes a unique record that they can't then use to vote again (for the same poll).
Captcha : without having additional checks (IP, etc), it's easy enough for a team of monkeys to successfully enter a lot of captchas.
Site Registration : without account creation level limits (e.g. a non-free email account required for signing up) people can just create multiple accounts.
Depending on how you weigh up the cost of getting users to vote vs making sure their votes are for them and them alone, you can use a different level of vote-spam-protection.
You can use the CAPTCHA once to both confirm the vote and create a session with the IP and cookie.
Any time you are dealing with anonymous voting you are going to have an imperfect solution but you can shoot for "pretty good". Consider dropping a cookie on the client computer to prevent multiple/frequent voting and back this up by performing server side IP tracking to do the same. Do not allow anyone to vote that has cookies blocked.
Of course, if you require complete accuracy or if the voting involves awarding of something of monetary value, registration is really the way to go.

How to create a reliable and robust page view counter in a web application?

I want to count the visits on a web page, and this page represents an element of my model, just like the Stack Overflow question page views.
How to do this in a reliable (one visit, one pageview, without repetitions) and robust (thinking on performance, not just a new table attribute 'visits_count')
You can't.
Multiple people can visit your site from the same IP (makes ip storage useless)
Multiple people can visit your site from the same PC
Multiple people can visit your site from the same browser (makes cookies useless)
The closest you can get is to store the visitors IP in combination with a cookie to not count those in the future. Here is a tradeoff, if they clear the cookie, they are a new visitor. If you only store the IP, you count whole proxies as one visitor.
Another option is to use user accounts and track exactly which user viewed what page, but this is not really a good option for public sites.
I would suggest, actually, using Google Analytics. While nothing will ever be truly accurate, they do take a pretty gosh-darn good stab at it, and the level of reporting and useful information you can extract is amazing.
Not to mention being free for up to 4,000,000 page views a month.
http://www.google.com/analytics
One way to handle the performace issue, is to do the calculations on insert. Something like
UPDATE stat SET viewcount = viewcount+1 WHERE date = CURDATE()
It isn't the perfect solution but could get you some of the way.
To do it without repetitions is potentially a tricky problem. I would simply rely on using sessions or cookies, but you could come up with all kinds of strategies for filtering crawlers, clients with out cookie-support and so on.

Resources