Grafana not picking up collectd metrics from graphite ui - graphite

I installed Graphite using pip on a system with the graphite-web UI, I installed collectd on the systems whose metrics I need, and everything shows up fine in the Graphite UI.
I then installed Grafana on another system and imported a Graphite dashboard, but in Grafana I see only "Graphite Carbon Metrics", nothing related to the collectd metrics the nodes are sending to Graphite.
http://pastebin.com/fPZ61kwR

Make sure the collectd metrics naming convention matches what the dashboard expects.
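For example, collectd's write_graphite plugin (assuming that is how the nodes feed Graphite) typically stores series under a per-host prefix such as collectd.<host>.<plugin>, so a dashboard panel target would need to look something like:

```
aliasByNode(collectd.*.cpu-0.cpu-idle, 1)
```

If the imported dashboard queries a different prefix (e.g. carbon.*), its panels will come up empty even though the data exists in Graphite.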

Monitoring HTTP Traffic in Kubernetes

I need to monitor HTTP traffic within my Kubernetes cluster. Specifically, I need response times, status codes, etc.
Currently, I am using a service mesh (Open Service Mesh) for this purpose, but is there a lightweight solution that only does monitoring (without a security layer, etc.)?
Thanks for all ideas!
I can recommend the EFK stack (Elasticsearch, Fluent Bit and Kibana), one of the popular monitoring suites on top of Kubernetes.
Fluent Bit has built-in parsers for the logs of popular web servers (Apache, nginx).
Fluent Bit is a super fast, lightweight, and highly scalable logging and metrics processor and forwarder. It is the preferred choice for cloud and containerized environments.
Elasticsearch
Elasticsearch is a distributed, RESTful search and analytics engine capable of addressing a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data for lightning fast search, fine‑tuned relevancy, and powerful analytics that scale with ease.
Kibana
Kibana is a free and open user interface that lets you visualize your Elasticsearch data and navigate the Elastic Stack. Do anything from tracking query load to understanding the way requests flow through your apps.
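As a sketch of the log-based approach: a minimal Fluent Bit configuration could tail nginx access logs with the built-in nginx parser and forward them to Elasticsearch. The log path, service host, and index name below are assumptions to adapt to your cluster:

```
[INPUT]
    Name    tail
    Path    /var/log/containers/*nginx*.log
    Parser  nginx
    Tag     nginx.access

[OUTPUT]
    Name    es
    Match   nginx.*
    Host    elasticsearch.logging.svc
    Port    9200
    Index   nginx-access
```

With the nginx parser applied, each record carries the status code and request fields, which Kibana can then aggregate into response-code dashboards.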

Is there a very simple graphite tutorial available somewhere?

Given that I have Graphite installed within Docker, does anyone know of a very simple graphite tutorial somewhere that shows how to feed in data, then plot the data on a graph in Graphite Webapp? I mean the very basic things and not the endless configurations and pages after pages of setting various components up.
I know there is the actual Graphite documentation but it is setup after setup after setup of the various components. It is enough to drive anyone away from using Graphite.
Given that Graphite is running within Docker, as a start I just need to know the steps for feeding in data as text, displaying the data in the Graphite web app, and querying the data back.
I assume you have containerized and configured all the Graphite components.
First, be sure that you have published the plaintext and pickle ports if you plan to feed Graphite from the local or an external host (defaults: 2003-2004).
After that, according to the documentation, you can use a simple netcat command to send metrics over TCP/UDP to Carbon in the format <metric path> <metric value> <metric timestamp>:
SERVER=localhost   # Carbon host; adjust to your deployment
PORT=2003          # plaintext protocol port
while true; do
  echo "local.random.diceroll $RANDOM $(date +%s)" | nc -q 1 ${SERVER} ${PORT}
  sleep 1          # avoid flooding Carbon faster than its storage resolution
done
You should then see the path local/random/diceroll in the graphite-web GUI, with a graph of random integers.
Ref: https://graphite.readthedocs.io/en/latest/feeding-carbon.html
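The loop above can be turned into a small round trip: build a plaintext line, push it to Carbon, and read the series back through the render API. The host name and query below are assumptions matching the default ports discussed above:

```shell
# Build one Carbon plaintext line: "<metric path> <metric value> <metric timestamp>"
metric_line() {
  printf '%s %s %s\n' "$1" "$2" "${3:-$(date +%s)}"
}

# Feed a data point (host name "graphite-host" is an assumption):
#   metric_line local.random.diceroll "$RANDOM" | nc -q 1 graphite-host 2003
# Query the stored series back as JSON via the graphite-web render API:
#   curl 'http://graphite-host/render?target=local.random.diceroll&from=-10min&format=json'

metric_line local.random.diceroll 42 1000
```

The render API's format=json output is the easiest way to verify programmatically that your data points actually landed in Whisper.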

Airflow sends metrics by statsd but not all of them

Recently I updated my airflow.cfg to enable metrics through StatsD.
I'm injecting this configuration via environment variables:
AIRFLOW__SCHEDULER__STATSD_ON=True
AIRFLOW__SCHEDULER__STATSD_HOST=HOSTNAME
AIRFLOW__SCHEDULER__STATSD_PORT=9125
AIRFLOW__SCHEDULER__STATSD_PREFIX=airflow
I'm not using the standard StatsD service but statsd_exporter, which speaks the StatsD protocol, so as far as I know I can point Airflow directly at it. By default it listens on port 9125.
After Statsd-exporter receives metrics, Prometheus is able to scrape them in regular manner.
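The scrape side is a plain static target pointing at the exporter's metrics port (the job and target names below are assumptions):

```
scrape_configs:
  - job_name: airflow
    static_configs:
      - targets: ["statsd-exporter:9102"]
```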
All fine, all good. Moreover, I wrote a mapping file for statsd_exporter using a bit of regex, but my issue is that when I open the statsd_exporter web UI (port 9102) I see some of the Airflow metrics, but not all of them!
The documentation lists the available metrics here.
For instance, I see that Airflow sends ti_failures, ti_successes, dagbag_size, etc. But there are no metrics at all like dag...duration or executor.open_slots, and a couple of others.
A really big thank you to anyone who has ever played with StatsD and Airflow, as I have no clue :(
I recently instrumented Airflow so that its metrics are exported from StatsD to Prometheus.
In my architecture I have airflow running in Kubernetes pods and these specifically are:
scheduler
worker
flower
web
Only scheduler, worker, and web have a sidecar container to export StatsD metrics (let's call these the metrics pods).
The metrics listed in the official docs (https://airflow.apache.org/metrics.html) are not all available from every metrics pod.
To address your specific problem: the dag...duration metrics are exported by the worker pods.
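For reference, a statsd_exporter mapping for those task-duration timers might look like this glob-style sketch (the Prometheus metric and label names are assumptions):

```
mappings:
  - match: "airflow.dag.*.*.duration"
    name: "airflow_task_duration"
    labels:
      dag_id: "$1"
      task_id: "$2"
```

With a mapping like this, per-task durations show up as one Prometheus metric with dag_id/task_id labels instead of an unbounded set of metric names.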

Machine's uptime in OpenStack

I would like to know (and retrieve via REST API) the uptime of individual VMs running in OpenStack.
I was quite surprised that the OpenStack web UI has a column called "Uptime", but it actually shows the time since the VM was created. If I stop the VM, the UI shows Status=Shutoff, Power State=Shutdown, but the Uptime is still being incremented...
Is there a "real" uptime (I mean for a machine that is UP)?
Can I retrieve it somehow via the OpenStack's REST API?
I saw the comment at How can I get VM instance running time in openstack via python API? but the page with the extension mentioned there does not exist, and it looks to me like that extension will not be available in all OpenStack environments. I would like a standard way to retrieve the uptime.
Thanks.
(Version Havana)
I haven't seen any documentation saying this is the reason, but the nova-scheduler doesn't differentiate between a running and a powered-off instance, so your cloud can't be over-allocated or leave an instance in a position where it could not be powered back on. I would like to see a metric of actual system runtime as well, but at the moment the only way to gather that would be through Ceilometer or via Rackspace's StackTach.
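Lacking a standard API, one workaround is to poll the instance's status yourself (e.g. via nova show on Havana, or Ceilometer samples) and accumulate running time only while it reports ACTIVE. A minimal sketch of the accounting, with an assumed fixed polling interval:

```shell
uptime_secs=0
interval=60   # seconds between status samples (an assumption)

# Count the interval toward uptime only when the sampled status is ACTIVE.
record_sample() {
  if [ "$1" = "ACTIVE" ]; then
    uptime_secs=$((uptime_secs + interval))
  fi
}

record_sample ACTIVE    # VM running
record_sample SHUTOFF   # VM stopped: no uptime accrued
record_sample ACTIVE    # VM running again
echo "$uptime_secs"
```

This is only as accurate as the polling interval, but it distinguishes "up" from "exists", which the dashboard's Uptime column does not.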

Monitoring Integration points

Our company is working on integrating Guidewire(claims processing system) into the existing claims system. We will be executing performance tests on the integrated system shortly. I wanted to know if there was some way to monitor the integration points specific to guidewire.
The system is connected through Web Services. We have access to Loadrunner and Sitescope, and are comfortable with using other open source tools also.
I realize monitoring WSDL files is an option. Could you suggest additional methods to monitor the integration points?
Look at the architecture of Guidewire. You have OS monitoring points and application monitoring points. The OS side is straightforward using SiteScope, SNMP (with SiteScope or LoadRunner), Hyperic, native OS tools, or a tool like Splunk.
You likely have a database involved: This monitoring case is well known and understood.
Monitoring the services? Ask the application experts inside your organization what they look at to determine whether the application is healthy and running well. You might end up implementing a set of terminal users (RTE) with datapoints, log monitoring through SiteScope, or custom monitors scheduled to run on the host, piping their output through sed into a standard form that can be imported into Analysis at the end of the test.
Think architecturally. Decompose each host in the stack into OS and services. Map your known monitors to the hosts and layers. Where you run into gaps, grab the application experts and have them write down the monitors they use (they will have more faith in your results and analysis as a result).
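The "custom monitors piped through sed" idea can be sketched as a tiny post-processing step that reshapes raw samples into timestamped CSV rows for import into Analysis. The name=value input format here is an assumption about what your monitor script emits:

```shell
# Hypothetical custom monitor post-processing: turn a raw "name=value" sample
# into a "timestamp,metric,value" row suitable for a CSV import.
to_csv() {
  echo "$1" | sed -E "s/^([A-Za-z_]+)=([0-9.]+)$/$(date +%s),\1,\2/"
}

to_csv "heap_used_mb=512"   # e.g. 1718000000,heap_used_mb,512
```

Scheduled via cron during the test window, the accumulated rows give you a host-side time series aligned with the LoadRunner scenario clock.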
