Is there a call limit to accessing LinkedIn share counts?

When using https://www.linkedin.com/countserv/count/share?format=json&url= to access an article's share count, is there a daily API limit?
We noticed that retrieving count data was taking as long as 20 seconds on our production server. We added logic to cache the counts, and the 20-second delay stopped the next day. We are still left wondering what the limit might be (we can't seem to find it in your documentation).
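For reference, the caching described above can be as simple as keeping the last fetched count per URL in memory and only re-querying after it expires. Below is a minimal sketch in TypeScript (Node 18+ for the global fetch); the one-hour TTL is an arbitrary choice, and the count field name is assumed from typical responses of this endpoint, so verify it against what you actually receive.

    // Minimal sketch: cache LinkedIn share counts so the endpoint is not hit
    // on every page view. TTL and the response field are assumptions, not documented limits.
    const COUNT_URL = "https://www.linkedin.com/countserv/count/share";
    const CACHE_TTL_MS = 60 * 60 * 1000; // one hour (arbitrary)
    const cache = new Map<string, { fetchedAt: number; count: number }>();

    async function getShareCount(articleUrl: string): Promise<number> {
      const hit = cache.get(articleUrl);
      if (hit && Date.now() - hit.fetchedAt < CACHE_TTL_MS) {
        return hit.count; // serve from cache, avoiding another API call
      }
      const resp = await fetch(
        `${COUNT_URL}?format=json&url=${encodeURIComponent(articleUrl)}`
      );
      if (!resp.ok) throw new Error(`share count request failed: ${resp.status}`);
      const body = (await resp.json()) as { count?: number };
      const count = body.count ?? 0;
      cache.set(articleUrl, { fetchedAt: Date.now(), count });
      return count;
    }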

Related

Why are Google Analytics Dashboard statistics changing?

Background:
I have a Google Analytics account with which I track user activity for a web and mobile app. After logging into the account and choosing the web property and the corresponding view, you generally see a dashboard with quick stats like Pageviews, Users, Sessions, Pages/Session, Avg. Session Duration, Bounce Rate, and percentage of new sessions. You can change the time period (from the top right area of the dashboard) to get the same stats for that period.
Problem:
Last week, I was interested in the three main stats (Pageviews, Users, and Sessions) for a particular day, say day A. The dashboard showed the following stats:
Pageviews - 1,660,137
Users - 496,068
Sessions - 983,549
This report was based on 100% of sessions.
I went back to the dashboard today and checked the same stats for the same day A. Here's what I saw:
Pageviews - 1,660,137
Users - 511,071
Sessions - 1,005,517
This report is also based on 100% of sessions.
Nothing was changed in the tracking code for the web and mobile app. Could someone explain why I have this difference in the stats? Is this normal?
Google needs some time to update the system; otherwise their system would be overwhelmed.
When you first create a profile it can take up to 48-72 hours for it to start showing data.
After that time data will appear instantly in the Real-time reports.
Standard reports take longer to finish processing; remember the amount of data that is being processed. Some of the data may appear in the standard reports after a few hours, but the numbers have not finished processing until at least 24 hours have passed, so anything you look at before then will not be accurate.
If you want accurate information, never look at today's or yesterday's numbers in the standard reports when checking Google Analytics. Things get even more confusing when you consider time zones: when exactly is it yesterday? I have noticed numbers changing as far back as 48 hours, but Google says 24 hours in their documentation. I am looking for the link in the documentation and will post it when I find it.
Found it: Data Limits
Data processing latency
Processing latency is 24-48 hours. Standard accounts that send more than 200,000 sessions per day to Google Analytics will result in the reports being refreshed only once a day. This can delay updates to reports and metrics for up to two days. To restore intra-day processing, reduce the number of sessions you send to < 200,000 per day. For Premium accounts, this limit is extended to 2 billion hits per month.
So try doing the same thing again today, but make the last day you check Monday (i.e. a day old enough to have finished processing). When you check again next week, the numbers should be the same.

Best retention practice using Graphite

I have been a happy user of Graphite+Grafana for a few months now and I have been advocating it around my firm.
My approach has been to measure data of interest, collect it into 1-minute or 5-minute buckets, and send that information to Graphite. I was recently contacted by a group that processes quotes (billions a day!); their approach has been to create a log line each time their applications process 1 million quotes. The problem is that the interval between two log lines can be highly erratic, from 1 second to a few hours.
The dilemma is this: should I set my retention policy to a 1-second bucket so that I can see all the measurements associated with spikes, or should I use, say, a 1-minute bucket so that the number of data points to be saved and later queried is much more manageable? FYI, when I set it to 1 second, showing the data for 8 or 10 charts over a few days brought the system (or at least my browser) to a crawl because of the number of data points (mostly NULL) being pushed around from Graphite to Grafana.
Here's my retention policy: 1s:10d,1m:36d,5m:180d
Alternatively, is there a way to configure Grafana+Graphite to only retrieve non-NULL data points?
What do you recommend?
You can always specify a shorter retention period for the 1s archive, so that when you display a longer range Graphite will send you only the coarser resolution.
For example, you can specify: 1s:2d, 1m:7d, 5m:180d
This way, if you display a range that goes back more than 2 days you will get 1m resolution (and so on), which won't bring your browser to a crawl, while you will still be able to inspect spikes in the last 2 days.
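In Graphite's storage-schemas.conf this suggestion would look roughly like the snippet below; the section name and pattern are hypothetical and should be adjusted to match the actual quote metrics. Keep in mind that schema changes only apply to newly created whisper files, so existing .wsp files would need to be resized (whisper-resize.py) or deleted.

    # storage-schemas.conf (sketch; section name and pattern are illustrative)
    [quotes]
    pattern = ^quotes\..*
    retentions = 1s:2d,1m:7d,5m:180d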

Having trouble getting accurate numbers from graphite

I have an application that publishes a number of stats to Graphite via statsd. One of the stats simply sends an increment to statsd every time a message is received by the service. I need to display a graph that shows the relative traffic over time for this stat. Generally speaking, I should be able to display a graph that refreshes every, say, 10 seconds, and shows how many messages were received in those 10 seconds, as well as the history for a given period of time. However, no matter how I format my API query I cannot seem to get accurate data. I've read a number of articles, including this one:
http://code.hootsuite.com/accurate-counting-with-graphite-and-statsd/
That seems to give some good insight but is still not quite giving me what I need. This is the closest I have come:
integral(hitcount(stats.recieved, "10seconds"))
However, I don't like the cumulative result of this, and when I run it I get statistics that come nowhere near what I see in my logs for messages received. I am OK with accepting some packet loss, but we're talking about orders of magnitude here. I know I am doing something wrong; just hoping someone can give me some insight as to what.
A couple of things to check/try:
Configure Graphite for Statsd
Check to make sure that you've used the retention schema and aggregation settings in Graphite that match how Statsd will be sending data (i.e. it sends one data point per 10 second flush interval).
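For reference, the settings recommended in the statsd documentation look roughly like the following (abbreviated here; check the statsd docs for the full set of aggregation rules):

    # storage-schemas.conf
    [stats]
    pattern = ^stats.*
    retentions = 10s:6h,1min:7d,10min:5y

    # storage-aggregation.conf (excerpt)
    [count]
    pattern = \.count$
    xFilesFactor = 0
    aggregationMethod = sum

    [default_average]
    pattern = .*
    xFilesFactor = 0.3
    aggregationMethod = average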
Run a single Statsd aggregator
Be sure you are only running one instance of statsd, as running multiple statsd daemons will cause metrics to be dropped (Graphite will be configured to store only one data point for its highest precision of 10s:6h).
Limit the time range in the UI or URL API to less than 6 hours
When displaying graphs with data that crosses the 6 hour threshold (e.g. from now to 7 hours ago), you will begin seeing 1 minute's worth of aggregated count data for the displayed graph (if you've configured Graphite for statsd with retentions = 10s:6h,1min:7d,10min:5y). Rollups occur based on the oldest data point in the time range (e.g. now till 7+ days ago = you'll get 10-minute rollups).
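For instance, a render API call constrained to the most recent 6 hours might look like this (the host is illustrative; the metric name is the one from the query above):

    http://your-graphite-host/render?target=stats.recieved&from=-6hours&format=json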
If sending sparse or "bursty" data AND displaying old time range (triggering aggregation)
Confirm that your xFilesFactor is low enough that aggregation produces non-null values even with a high rate of nulls. For example, 100 requests in the first 10 seconds and none for the remaining 50 seconds of a minute would be stored as 100, null, null, null, null, null, which would be summed up to null when the data ages if the xFilesFactor is higher than 1/6. Using the statsd-recommended Graphite configuration handles this, but it is good to know about, as it can give the appearance of lost data.
Saving schema or aggregation changes
If you changed the Graphite schema or aggregation settings after any metrics were stored (in whisper, Graphite's storage), you'll need to either delete the .wsp files for the metric (Graphite will recreate them) or run whisper-resize.py.
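For example (the path is illustrative, and the retentions and flags should match your own schema and aggregation settings), a resize might look like:

    whisper-resize.py /graphite/storage/whisper/stats/received/count.wsp \
        10s:6h 1min:7d 10min:5y --xFilesFactor=0 --aggregationMethod=sum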
Validating settings
You can verify the settings against some whisper data by running whisper-info.py on a .wsp file. Find the .wsp file for one of your metrics in /graphite/storage/whisper/ and run: whisper-info.py my_metric_data.wsp. The output should tell you more about how the storage settings are working.
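The output looks roughly like this (values shown are illustrative, for a count metric stored with the retentions above; the coarser archives are omitted):

    maxRetention: 157680000
    xFilesFactor: 0.0
    aggregationMethod: sum
    fileSize: 3300532

    Archive 0
    retention: 21600
    secondsPerPoint: 10
    points: 2160
    size: 25920
    offset: 52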
TL;DR
You should ensure that Graphite is set to store one data point per 10-second interval for metrics coming from statsd, and that Graphite is summing (not averaging) count data coming from statsd. Both of these are handled by using the recommended statsd configuration settings. Don't run more than one statsd aggregator. When using the UI, limit the data returned to less than 6 hours, or understand which rollup you are viewing when looking at data that crosses retention thresholds. Lastly, make sure the settings have taken effect (if you've already been sending metrics).

How does collection sampling affect the "live" stats for Google Analytics?

We've noticed lately that as our site is growing, our data in Google Analytics is getting less reliable.
One of the places we've noticed this most strongly is on the "Realtime Dashboard".
When we were getting 30k users per day, it would show about 500-600 people online at a time. Now that we are hitting 50k users per day, it shows 200-300 people online at a time.
(Other custom metrics from within our product show that the user behavior hasn't changed much; if anything, users are currently spending longer on the site than ever!)
The daily totals in analytics are still rising, so it's not like it's just missing the hits or something... Does anyone have any thoughts?
The only thing I can think of is that there is probably a difference in interpretation of what constitutes a user being online.
How do you determine whether a user is online?
Unless there is explicit login/logout tracking, is it possible that it assumes a user has gone if there is no user-generated event or request from the browser within an interval of X seconds?
If that is the case, it may be worthwhile adding a hidden iframe with some JavaScript code that keeps sending a request every t seconds; a sketch of an equivalent heartbeat follows.
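Instead of a hidden iframe, the same heartbeat idea can be expressed as a periodic non-interaction event, which is one common way to keep a visitor registering as active. This is only a sketch: it assumes the standard analytics.js ga() global is loaded, and the 30-second interval is arbitrary.

    // Sketch: periodically send a non-interaction event so the visitor keeps
    // registering as active. Assumes analytics.js (the ga() global) is loaded.
    declare const ga: (...args: unknown[]) => void;

    const HEARTBEAT_SECONDS = 30; // arbitrary interval

    window.setInterval(() => {
      ga("send", "event", "heartbeat", "ping", { nonInteraction: true });
    }, HEARTBEAT_SECONDS * 1000);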
You can't compare instant measures of unique, concurrent users to different time-slices of unique users.
For example, you could have a small number of concurrent unique users (say 10) and a much higher daily unique users number like 1000, because 1000 different people were there over the course of the day, but only 10 at any given time. The number of concurrent users isn't correlated with the total daily uniques; the distribution over the course of the day may be uneven, and it's almost apples and oranges.
This is the same reason that monthly uniques and daily uniques can't be combined, though average daily uniques are a lower bound for monthly uniques.

Last “end date” with data in Analytics

I'm using "Reporting google Analitics API" and I can’t find information about what the last “end date” with data in Analytics is.
For example, let's suppose you want to retrive the last month’s data.
When do you have to perform the query?
The first day of the current month?
...or the second one?
...or maybe the third one?
And one more question: is the returned data for days in Pacific time?
The Google Analytics API is supposed to have access to the same data you have in the interface.
Google says that data can take up to 24 hours to process. The time it takes to really update the data depends on the type and size of the account. Small accounts are updated multiple times a day and can have data available in just a few hours. Once you reach 1M hits a month you are moved to a different mode where the data on your account is updated only once a day. Google Analytics Premium customers get more frequent updates even for large amounts of traffic.
There's no way to tell through the API what is exactly the time of the last hit processed. You can query the data for today by the hour and see for yourself though.
Usually you don't care and just want to make sure that the data you're querying has been fully processed for that day.
So if you query data for yesterday there's a chance it has not been completely updated; for example, if it's just past midnight, the data for yesterday ended only a couple of minutes ago and probably hasn't been completely processed yet. The safest bet in this case is to query data for 2 days ago.
So if today is 2012-06-15 and you want to get 1 month of data, a safe approach is to query data with start-date=2012-05-13 and end-date=2012-06-13. This will most of the time give you data for days that have been fully processed, but it's not 100% safe either. Google Analytics has had outages in the past where data took longer than that to process, though these are not usual. When you get the data out, it's really hard to tell just from the API whether the data for those days has been fully processed or not; using the 2-days-ago idea just makes it more likely that it has.
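As a small illustration of that rule of thumb, here is a minimal sketch that computes such a "safe" range (the function name and the 31-day window are arbitrary, and how the dates are passed to the Reporting API depends on your client library):

    // Sketch: build a date range ending two days ago, spanning roughly one month.
    function safeMonthRange(today: Date = new Date()): { start: string; end: string } {
      const DAY_MS = 24 * 60 * 60 * 1000;
      const end = new Date(today.getTime() - 2 * DAY_MS);   // two days ago: likely fully processed
      const start = new Date(end.getTime() - 31 * DAY_MS);  // roughly one month earlier
      const iso = (d: Date) => d.toISOString().slice(0, 10);
      return { start: iso(start), end: iso(end) };
    }

    // With today = 2012-06-15 (UTC) this yields { start: "2012-05-13", end: "2012-06-13" },
    // matching the example above.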
Days are aggregated according to the timezone settings configured on the Google Analytics profile.
