How do I change the log level for Airflow/Composer?

The INFO logging in Airflow is way too verbose for my taste, so how do I change the log level?

Check this: https://github.com/apache/incubator-airflow/blob/master/airflow/config_templates/default_airflow.cfg
Line 55
# Logging level
logging_level = INFO
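To quiet it down, override that key in your own airflow.cfg or via the corresponding environment variable. A minimal sketch (in Airflow 1.x the key sits in the [core] section; in Airflow 2.x it moved to [logging]):

[logging]
logging_level = WARNING

# or equivalently, without touching the file:
# AIRFLOW__LOGGING__LOGGING_LEVEL=WARNING   (AIRFLOW__CORE__LOGGING_LEVEL on 1.x)

On Cloud Composer you can't edit airflow.cfg directly; set the same section/key pair through the environment's Airflow configuration overrides instead.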

Related

Metrics not getting logged to Run History - Azure Machine Learning

I have created an Azure ML pipeline using the PythonScriptStep method. I am able to see the metrics getting logged to the individual steps, as in the figure below.
But I am not able to see the metrics (I logged the input_shape) getting logged to the Run History report when I open the experiment.
Any help is appreciated.
Based on Priya's answer and this doc, it is solved by logging to the parent run like so:
from azureml.core import Run
from sklearn.metrics import mean_squared_error

mse = mean_squared_error(estimator.predict(data), target)
run = Run.get_context()
run.parent.log("mse", mse)  # log to the parent pipeline run so it shows in Run History

Raising Alerts from Application Insights log based on percentage?

I can write a query in Application Insights that gives me a percentage as a scalar. I want to create an alert if that percentage is > X. How can this be done using log-based alerts?
Basically, I have a lot of machines that send telemetry to Application Insights. Sometimes they log exceptions. I send MachineName in customDimensions for all the logs, so I can get the names of all the machines that sent logs in the last 24 hours. The exceptions are also sent with MachineName in customDimensions. When a particular error is raised by more than X% of machines in the last 24 hours, I want to raise an alert.
One way to write the alert logic is using 'Number of results', which cannot be used here since it automatically appends '| count' to the query. The other way is 'Metric measurement', which I am guessing should let me raise an alert like this, but I'm unable to figure out how.
I can get the total machine count by this query:
let num_machines = traces
| summarize by tostring(customDimensions["MachineName"])
| count;
I can get the number of machines that reported an exception like this:
let num_error_machines = exceptions
| where customDimensions["Message"] contains "ExceptionXRaised"
| summarize by tostring(customDimensions["MachineName"])
| count;
Finally, I can get the percentage of machines that raised the issue like this:
print toscalar(num_error_machines)*100/toscalar(num_machines)
I am not sure how to use this result to raise an alert using Metric measurement. The query presumably needs to be modified to produce an AggregatedValue and use bin(), but I am not sure whether that is possible or what that query would look like.
Sorry for the late reply. I've tested on my side and indeed ran into many problems.
I found that the alert rule doesn't support monitoring a percentage result; it only supports 'Number of results' and 'Metric measurement'. So I think you may have to give up the percentage and alert on num_error_machines instead, like in the screenshot below.
Please note that you can't append ';' at the end of the query, or it will fail with an error like 'The request had some invalid properties'.
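For reference, a 'Metric measurement' alert expects a standalone query (with no trailing semicolon) that emits a numeric AggregatedValue per time bin. A minimal sketch of alerting on the raw error-machine count along those lines, with the 1h bin chosen arbitrarily:

exceptions
| where timestamp > ago(24h)
| where customDimensions["Message"] contains "ExceptionXRaised"
| summarize AggregatedValue = dcount(tostring(customDimensions["MachineName"])) by bin(timestamp, 1h)

The alert condition then becomes 'AggregatedValue greater than N', with N standing in for whatever absolute machine count approximates your X%.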

Set log level for built-in keywords in Robot Framework

In Robot Framework, it looks like messages for keywords like variable assignment ("=") are logged at the 'INFO' level by default. For example:
<Test case>
${xyz} = "hello"
would log a message like:
INFO: ${xyz} = "hello"
I would like to lower the log level for this to 'DEBUG' or 'TRACE' but can't seem to find it in the source code.
Any advice on this?
Have you tried executing the test with the whole run brought to a deeper level, "DEBUG" or "TRACE", by adding -L trace or -L debug to your test call? For instance: robot -L trace mytest.robot
Also, you can set the log level in the code, like this:
Test Setup    Set Log Level    TRACE
Then in the log.html file, a visible log level dropdown is shown in the upper right corner. This allows users to remove messages below the chosen level from the view. This can be useful especially when running tests at the TRACE level.
for more information see: http://robotframework.org/robotframework/latest/RobotFrameworkUserGuide.html#visible-log-level
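Putting the two suggestions together, a minimal suite sketch (the file name, test name, and message below are placeholders), run with robot -L debug example.robot:

*** Settings ***
Test Setup      Set Log Level    DEBUG

*** Test Cases ***
Example
    ${xyz} =    Set Variable    hello
    Log    xyz is ${xyz}    level=DEBUG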
The source code defines the Log keyword like this:
def log(self, message, level='INFO', html=False, console=False,
        repr=False, formatter='str'):
    u"""Logs the given message with the given level.

    Valid levels are TRACE, DEBUG, INFO (default), HTML, WARN, and ERROR.
    Messages below the current active log level are ignored. See
    `Set Log Level` keyword and ``--loglevel`` command line option
    for more details about setting the level.
    """
Usage example:
Log    Your message: {message}    level=DEBUG

Cacti: Graph value returns 0 on SNMP Generic OID Template

I can get the value from:
snmpwalk -v2c -c desscrt% 192.168.120.39 1.3.6.1.4.1.311.1.7.3.1.13
SNMPv2-SMI::enterprises.311.1.7.3.1.13.0 = Counter32: 47
I try to generate a graph with:
SNMP - Generic OID Template
with OID 1.3.6.1.4.1.311.1.7.3.1.13
But my graph displays the value "0" when it should be 47.
How can I check what's going wrong and solve the problem?
In Poller Specific Logging, enable MEDIUM or a higher level of logging so the polled results are logged. Then view the log in System Utilities -> View Cacti Log File; there you will see what the poller sees.
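Two quick command-line checks can also help (the community string is taken from the question and the log path is a typical install default, so adjust as needed). snmpget, unlike snmpwalk, needs the full instance OID, which is the trailing .0 shown in the walk output; and once MEDIUM logging is enabled, grepping the Cacti log for the host shows the values the poller actually retrieved:

snmpget -v2c -c desscrt% 192.168.120.39 1.3.6.1.4.1.311.1.7.3.1.13.0
grep 192.168.120.39 /var/log/cacti/cacti.log    # path varies; often <cacti_dir>/log/cacti.log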

How to get Graphite to simply count counters, not time-rate them

I'm using Graphite and collectd to monitor my server. In particular, I'm using the tail plugin to count failed SSH logins. I'm using a counter for this metric, so I expect to see 1, 2, 3, 0, etc. for data points. However, what I'm seeing is 0.1, 0.2, 0.3, 0, etc. It seems to me like Graphite is providing counts per second. I say this because my retention policy is one data point every 10 seconds for two hours, so 1 failed login per 10 seconds = 0.1 per second. I'm looking at this in a graph. It looks like this:
Furthermore, when I scale out to the next retention level, the numbers get adjusted accordingly: 1 failed login which was shown as 0.1 is now shown as much less than this, 0.017 or so.
I don't think this is related to the aggregation method used: even the finest data is off. How can I get Graphite to treat this metric as a pure, raw, counter?
Here's my storage-schemas.conf (the retention policy):
[my_server]
pattern = .*
retentions = 10s:2h,1m:2d,30m:400d
Here's my configuration of the collectd tail plugin:
<Plugin "tail">
<File "/var/log/auth.log">
Instance "auth"
<Match>
Regex "sshd[^:]*: Failed password"
DSType "CounterInc"
Type "counter"
Instance "sshd-invalid_user"
</Match>
</File>
</Plugin>
And here's my configuration of the write_graphite plugin (which sends data to Graphite):
<Plugin write_graphite>
<Node "my_server_name">
Host "localhost"
Port "2003"
Protocol "tcp"
LogSendErrors true
Prefix "collectd."
#Postfix ""
StoreRates true
AlwaysAppendDS false
EscapeCharacter "_"
</Node>
</Plugin>
I tried setting StoreRates false for the write_graphite plugin, but this didn't work. It did change the behaviour: when I performed a single failed SSH login, that metric showed as 1. However, it didn't drop back down to 0. When I performed two more failed logins, the metric popped up to 3.
Also of interest: I've also loaded the users plugin, which simply shows the number of users logged in, and it's working great: it shows 1 when I SSH in, 2 when I SSH in again, and back to 1 when I exit one SSH session. This holds for both settings of StoreRates. So it seems like what I want is possible somehow, maybe just not with the tail plugin.
The SSH logins with StoreRates false along with correct behaviour for Users Logged in can be seen in these graphs:
Any ideas? Thanks,
You are asking the system to count the number of events. And this is exactly what it's doing: it's counting the number of failed logins since its startup. Whether you're using StoreRates or not simply changes the way that information is displayed: as a rate or as the raw counter. A counter may never decrease! What you're actually asking for is a counter that resets itself upon reading: count the number of failed logins since the last time collectd checked.
As it happens the ABSOLUTE data source type in rrdtool can be used to achieve this, but that won't help you.
Step back, and think about what you're trying to achieve: the number of failed logins per second seems to me like a perfectly sane metric!
Although swissunix's answer is very helpful, to achieve the behaviour I was looking for, I ended up using Logster instead of Collectd. With Logster, you write the bit of code that parses the file as well as the bit that returns the metric. So although dividing a count by the time is common with Logster, you don't have to do this if you don't want to: there's lots of flexibility.
I've put my parsers here: https://github.com/camlee/logster-parsers
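As a rough illustration of that flexibility (this is a sketch based on the general Logster parser interface, not one of the parsers in the repository above), a parser that reports the raw count of failed logins per run, with no division by elapsed time, looks roughly like this:

from logster.logster_helper import LogsterParser, MetricObject

class SshFailedLoginCounter(LogsterParser):
    """Count 'Failed password' lines seen since the last logster run."""

    def __init__(self, option_string=None):
        self.failed = 0

    def parse_line(self, line):
        # Called once for each new log line appended since the previous run.
        if 'Failed password' in line:
            self.failed += 1

    def get_state(self, duration):
        # Return the raw count; a rate-style parser would divide by duration here.
        return [MetricObject('auth.sshd.failed_logins', self.failed)]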
If you set StoreRates to false, in graphite you can apply the derivative function to the ever-increasing counter to get your rate of increase per retention interval, which would match your requirement.
E.g. in your example of reporting 1 failed login and then 2 more, you saw the values 1 and 3. The derivative is 1 and 2: the failed logins per interval that Graphite tracks.
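Concretely, assuming collectd writes the metric under a path like the one below (the exact path depends on your prefix, plugin instance, and type instance), the graph target would be something like:

derivative(collectd.my_server_name.tail-auth.counter-sshd-invalid_user)

nonNegativeDerivative() is usually the safer choice for counters, since it ignores the apparent negative jump when the counter resets, e.g. after a collectd restart.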
