How to control the metrics sent by statsd to graphite? - graphite

I am using Hosted Graphite and statsd. I want to reduce costs by reducing the number of metrics being sent.
For example: for each timer, I don't want to send all 14 metrics (upper, std...), but only 3 of them (mean_90, sum, and maybe another one).
How can I configure that in the statsd configuration file?

Looking at the statsd docs and its source, I don't think you can configure it to not send all the metrics.
I suggest that you either:
Edit the source code to calculate and send only the metrics you want. This is probably easy to do: just delete the lines where the unwanted metrics are calculated.
Configure Graphite to drop all metrics coming from statsd that do not match the three patterns you want.
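Whichever option you choose, the core of it is a name filter. Here is a minimal sketch of that filter in Python, assuming metrics travel as Graphite plaintext lines ("name value timestamp"). The names KEEP_SUFFIXES, keep_metric and filter_lines are hypothetical, and ".count" is only a placeholder for the third metric; none of this is part of statsd or Graphite.

```python
# Hypothetical allow-list of timer suffixes worth paying for.
# ".count" is a placeholder for the "maybe another one" metric.
KEEP_SUFFIXES = (".mean_90", ".sum", ".count")


def keep_metric(name, keep_suffixes=KEEP_SUFFIXES):
    """Return True if the metric name ends with an allowed suffix."""
    return name.endswith(tuple(keep_suffixes))


def filter_lines(lines, keep_suffixes=KEEP_SUFFIXES):
    """Keep only well-formed plaintext lines whose metric name is allowed."""
    kept = []
    for line in lines:
        parts = line.split()
        # A plaintext line has exactly three fields: name, value, timestamp.
        if len(parts) == 3 and keep_metric(parts[0], keep_suffixes):
            kept.append(line)
    return kept
```

A filter like this could sit in a small relay between statsd and Graphite, or the same suffix test could be applied inside a modified statsd backend before the metrics are flushed.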

Related

Different Ways to Call Google Analytics from Server Side?

Currently, I am using the Measurement Protocol to push data to GA. The problem is that I don't get any response back indicating success or error in production. Is there a way to get one? If so, please suggest how.
Because of this, I am looking for other options for the same task. For example, can this be achieved with Analytics 360?
The google analytics production data collection endpoint does not return a request status back (always 200 OK) by design to ensure ultra-light processing speed.
What I usually recommend to clients using the Measurement Protocol server-side is to:
Log a reasonable share of requests (or all of them) somewhere. Storage is extremely cheap nowadays, and since you know the data format, you will be able to extract the data manually if an emergency happens.
Every once in a while (one in a thousand, one in a million, or even more often, depending on the importance of the data), randomly pick a request, validate it against the GA debug endpoint, and parse the returned JSON. If there are any warnings or errors, send a notification for further investigation. This way, if anything goes wrong, you will be on top of the problem before the BI and marketing teams are affected.
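The sampling and parsing step in the second recommendation could look roughly like the sketch below. It assumes the debug endpoint returns JSON with a "hitParsingResult" list whose entries carry "valid", "hit" and "parserMessage" fields; check that shape against a real response before relying on it. The function names are hypothetical, and the HTTP call itself is left out.

```python
import random


def should_validate(sample_rate, rng=random.random):
    """Decide whether this request should also go to the debug endpoint."""
    return rng() < sample_rate


def validation_problems(debug_response):
    """Collect human-readable problems from a parsed debug-endpoint response.

    Assumed shape (verify against a real response):
    {"hitParsingResult": [{"valid": bool, "hit": str,
                           "parserMessage": [{"messageType": str,
                                              "description": str}]}]}
    """
    problems = []
    for result in debug_response.get("hitParsingResult", []):
        if not result.get("valid", False):
            problems.append("invalid hit: " + (result.get("hit") or "<unknown>"))
        for message in result.get("parserMessage", []):
            problems.append("{}: {}".format(
                message.get("messageType", "UNKNOWN"),
                message.get("description", "")))
    return problems
```

If validation_problems returns a non-empty list, that is the point at which to send the notification. A sample rate of 0.001 corresponds to validating roughly one request in a thousand.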

Effective subscription to data feeds

How can I effectively implement a subscription mechanism in G-WAN? Suppose I want G-WAN to aggregate data from various tickers and process it further. Obviously, every feed provides its data in its own format.
The straightforward way would be to create connections and subscribe to data in the init() function of the connection handler, then parse source info from the responses and dispatch data from the main() function to dedicated queues. But this approach doesn't seem to make any use of G-WAN's efficient task scheduling engine. So maybe dedicated software would solve the problem faster?
Another approach would be to create a dedicated servlet for every subscription. For that, in the main() function of the connection handler, I would need to rewrite headers to include the names of the corresponding servlets. In this case I would employ the whole G-WAN machinery. But doesn't rewriting headers negate all of G-WAN's performance advantage?
G-WAN already provides a simple publisher/subscriber engine, see the Comet servlet example.
This works fine with slow (typically 1 update per second) feeds.
For real-time and BigData feeds, there's no alternative to using G-WAN protocol handlers (to bypass connection handler rewrites and to precisely define the desired latency).
That's what happened for this project distributing 150 million messages per second via 75,000 channels to 1.5 million subscribers.
We have also made a (now famous) demo for the ORACLE OpenWorld expo in SFO that processed 1.2 billion TPS (transactions per second) on one single server, by using G-WAN as a cache for the ORACLE noSQL database (a Java KV store).
So the limits are more a question of precise tuning than G-WAN's core engine limitations.

Graphite data display delay

I'm sending 4 metrics every minute from another server to the server hosting graphite. I've set up graphite & grafana and am able to see the data in grafana. However, I notice that there's about a 3-minute delay from the time I sent the metric to the time I see it in Grafana.
I'm using Graphite and Grafana for a real-time display and have set Grafana to auto-refresh every 10s. A 3-minute delay seems unusual, and I doubt the network is causing that much of it. Is there any way to find out why this delay is so high?
Thank you
If you point your graphite setup (in graphite-web local_settings.py) to a memcached cluster, metrics are cached there for (by default) 1 min.
It could explain part of the delay.
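If memcached is in use, the cache duration can be lowered in local_settings.py. The fragment below is a sketch: it assumes your graphite-web version supports the MEMCACHE_HOSTS and DEFAULT_CACHE_DURATION settings, and the host address is a placeholder.

```python
# graphite-web local_settings.py (fragment)
# Assumption: both settings are supported by your graphite-web version.
MEMCACHE_HOSTS = ["10.0.0.1:11211"]  # placeholder memcached address

# Seconds to keep cached data and rendered graphs; a value no larger than
# the dashboard refresh interval avoids serving stale points.
DEFAULT_CACHE_DURATION = 10
```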
It turns out there was an issue in the Graphite version I was using. The answer is here: https://answers.launchpad.net/graphite/+question/254964
I should add that Graphite doesn't show the last point from RRD, which is why you can see a delay.
If this problem matters to you, comment out rows.pop() in the "fetch" method of /opt/graphite/webapp/graphite/readers/rrd.py.

What is the best usage of tsdb?

I found OpenTSDB to be a powerful monitoring system. It stores data points in a structure like proc.loadavg.1min 1234567890 1.35 host=A.
But my questions are:
1- Is it good for logging in PHP?
2- Can I store all my log data in it?
3- Is there a good PHP library for working with OpenTSDB (for example, to send data to OpenTSDB from PHP)?
This is not yet clear to me.
I would be thankful for any help.
In my opinion openTSDB is not a monitoring system but a way to store time series.
If you want to build a monitoring tool you'll need a bigger set of tools, including a way to feed your monitored metrics to the database and a way to display them.
For example you can use Logstash and statsd to collect, aggregate and send your metrics. For the display you can use a tool called Grafana.
OpenTSDB is just one option for storing them; you can also use Graphite or InfluxDB.
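To make the data point structure from the question concrete, here is a small sketch that builds an OpenTSDB telnet-style "put" line. It is written in Python rather than PHP, the function name format_put is hypothetical, and the transport is left out (such lines are typically written to the TSD over a TCP socket, port 4242 by default).

```python
def format_put(metric, timestamp, value, tags):
    """Build one telnet-style OpenTSDB line: put <metric> <ts> <value> <tags>."""
    if not tags:
        # OpenTSDB expects every data point to carry at least one tag.
        raise ValueError("at least one tag is required")
    # Sort tags so the output is deterministic.
    tag_part = " ".join("{}={}".format(key, tags[key]) for key in sorted(tags))
    return "put {} {} {} {}".format(metric, int(timestamp), value, tag_part)
```

Calling format_put("proc.loadavg.1min", 1234567890, 1.35, {"host": "A"}) reproduces the example from the question with "put " in front. Each data point is a number attached to a metric name and tags, which is why free-form log messages are a poor fit.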

Sending Data from ganglia to graphite

I am currently collecting monitoring metrics with Ganglia and I would like to show graphs with that data with Graphite. I know such an integration is possible, and I found an article describing how it should be done. I am not quite sure exactly how this integration works, especially when I want to send it straight into graphite without parsing the data of the gmetad. Any help on how to integrate Ganglia with Graphite will be great.
thanks
There are two approaches to integrate ganglia with graphite.
1- Use a third-party process to get metrics from gmetad/gmond, adjust the metric data format, and finally send the data to the carbon server.
2- Use gmetad's "graphite integration" feature, where you just need to configure the carbon server address, port, and protocol (with an optional graphite path syntax); gmetad then does the rest. More details can be found in your /etc/ganglia/gmetad.conf.
I would recommend #2 since it's pretty simple. You just need to upgrade your Ganglia packages to version 3.3+.
With the solutions above, metrics data can be stored in both RRD and Whisper. If you don't want that, ganglia-web also supports replacing rrdtool graphs with Graphite; see "Using graphite as graphing engine".
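For the first approach, the final step (sending to carbon) can be sketched as follows. It assumes carbon's plaintext listener, which accepts "path value timestamp" lines and listens on port 2003 by default. The function names and the example metric path are hypothetical, and fetching the metrics from gmetad/gmond is not shown.

```python
import socket
import time


def carbon_line(path, value, timestamp=None):
    """Format one metric in carbon's plaintext protocol."""
    if timestamp is None:
        timestamp = time.time()
    return "{} {} {}\n".format(path, value, int(timestamp))


def send_to_carbon(lines, host, port=2003):
    """Send already-formatted lines to a carbon plaintext listener."""
    payload = "".join(lines).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(payload)
```

For example, send_to_carbon([carbon_line("ganglia.web01.load_one", 0.42)], "graphite.example.com") would push a single load value, where both the path and the host name are placeholders.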
Have you checked the ganglia-web wiki? There is a section called Graphite Integration and another called Using Graphite as the graphing engine, which explain well how to do what you want.
I've worked a lot with Ganglia, and from what I've researched, Graphite works similarly. I was never able to master Whisper, but I've found RRDs (round-robin databases) to be pretty reliable. I'm not sure what you're interested in monitoring, but I would definitely check out JMXtrans. You can get the code from Google. It provides multiple methods for extracting metric data from whatever JVM you're monitoring, lets you define which metrics you'd like to pipe to Ganglia/Graphite, and offers some other options.
