DataDog - pass multiple fnmatch patterns to configuration - directory

init_config:
instances:
  - directory: /mnt/ftp/generic/Salesorder
    pattern: '*_12_*.csv'
    filegauges: true
    dirtagname: history
  - directory: /mnt/ftp/generic/Salesorder
    pattern: '2021_*_*.csv'
    filegauges: true
    dirtagname: this-year
The directory contains multiple files named in the format YYYY_MM_SalesOrder.csv.
How can I use multiple patterns in a single instance?

This is not possible. Only one instance per pattern.
This is what it is :/
Maybe DataDog will offer better options in the future.
init_config:
instances:
  - directory: /mnt/ftp/generic/Salesorder
    pattern: '*_12_*.csv'
    filegauges: true
    dirtagname: history
  - directory: /mnt/ftp/generic/Salesorder
    pattern: '2021_*_*.csv'
    filegauges: true
    dirtagname: this-year
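That said, if two patterns differ only in a character or two, you can sometimes fold them into one instance, since the pattern is a shell-style fnmatch expression and fnmatch supports [seq] character sets. A sketch (the month digits and the year-end tag name are purely illustrative, not from the original config):

instances:
  - directory: /mnt/ftp/generic/Salesorder
    # matches both *_11_*.csv and *_12_*.csv via the fnmatch set [12]
    pattern: '*_1[12]_*.csv'
    filegauges: true
    dirtagname: year-end  # hypothetical tag name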

How to use a config group multiple times, while overriding each instance

Here is my current config structure:
hydra/
  pipeline/
    common/
      feature.yaml
    stage/
      train.yaml
with the following files:
train.yaml
# @package _global_
defaults:
  - _self_
  - ../pipeline/common@train: feature
  - ../pipeline/common@val: feature
train:
  conf:
    split: train
val:
  conf:
    split: val
pipeline:
  - ${oc.dict.values: train.steps}
  - ${oc.dict.values: val.steps}
feature.yaml
conf:
  split: train
steps:
  tabular:
    name: "${conf.split}-tabular"
    class: FeatureGeneration
    dataset:
      datasources: [ "${conf.split}_split" ]
What I've accomplished:
I've been able to figure out how to use the config group multiple times utilizing the defaults in train.yaml.
What I'm stuck on:
I'm getting an error: InterpolationKeyError 'conf.split' not found
I do realize that imports are absolute. If I put @package common.feature at the beginning of feature.yaml I can import conf.split via common.feature.conf.split, but is there not a cleaner way? I tried relative imports but got the same error.
I can't seem to override conf.split from train.yaml. You can see where I set train.conf.split and val.conf.split but these do not get propagated. What I need to be able to do is have each instance of the config group utilize a different conf.split value. This is the biggest issue I'm facing.
What I've referenced so far:
The following resources have gotten me to where I am so far, but I'm still having trouble with what's listed above.
Hydra : how to assign config files from same group to two different fields
https://hydra.cc/docs/advanced/overriding_packages/
https://hydra.cc/docs/patterns/extending_configs/
Interpolation is not an import; it's evaluated when you access the config node. At that point your config is already composed, so it should be straightforward to use either absolute interpolation (the default) or relative interpolation, based on the structure of your final config.
Hard to be 100% sure, but I suspect this problem is because your defaults list has _self_ at the beginning. This means that the content of the config containing the defaults list is overridden by what comes after in the defaults list.
Try to move _self_ to the end:
# @package _global_
defaults:
  - ../pipeline/common@train: feature
  - ../pipeline/common@val: feature
  - _self_
# ...
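On the InterpolationKeyError itself: once feature.yaml is packaged under train or val, there is no conf.split at the config root, so the absolute ${conf.split} cannot resolve. A sketch of feature.yaml using OmegaConf's relative interpolation instead, assuming the composed layout shown above (one leading dot refers to the containing node, and each additional dot walks one level up):

conf:
  split: train
steps:
  tabular:
    # three dots: tabular -> steps -> package root (train or val)
    name: "${...conf.split}-tabular"
    class: FeatureGeneration
    dataset:
      # four dots: dataset -> tabular -> steps -> package root
      datasources: [ "${....conf.split}_split" ]

With _self_ at the end of the defaults list, the train.conf.split and val.conf.split values set in train.yaml override these defaults, so each instance of the group resolves its own split.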

salt attribute(key/value) replacement based on particular stanza

Using Salt, I want to find an attribute (key) and replace its value under a specific stanza. The attribute (key) is present multiple times in the file under different stanzas. I want to find the attribute under one specific stanza and replace its value.
Example:
output.kafka:
  # Boolean flag to enable or disable the output module.
  enabled:
I need to find enabled: under output.kafka: and replace its value. The enabled: attribute is present multiple times in my file.
Thanks
Bala.
Salt has a few commands like file.line, file.replace and file.blockreplace that can modify an existing file, but I highly recommend managing the whole file using file.managed. It makes for a less brittle experience.
Here's an example based off your question:
Pillar top file:
cat /srv/pillar/top.sls
base:
  '*':
    - common
  'minion01':
    - minion01kafkasettings
Set our pillar data:
cat /srv/pillar/minion01kafkasettings.sls
kafka_output: True
Here's our filebeat template:
cat /srv/salt/filebeat.tmpl
output.kafka:
  # Boolean flag to enable or disable the output module.
  enabled: {{ pillar.get('kafka_output', True) }}
Here's the filebeat Salt sls file:
cat /srv/salt/filebeat.sls
the_filebeat_file:
  file.managed:
    - name: /etc/filebeat/filebeat.yml
    - source: salt://filebeat.tmpl  # the template shown above
    - template: jinja
    - user: root
    - group: root
Then we can run the following:
Refresh our pillar data
salt 'minion01' saltutil.refresh_pillar
Then apply the sls file:
salt 'minion01' state.sls filebeat
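Before applying, you can also sanity-check both pieces; pillar.get and state.show_sls are standard Salt functions, using the minion id from the example above:

# confirm the pillar value made it to the minion
salt 'minion01' pillar.get kafka_output
# render the compiled state data without applying it
salt 'minion01' state.show_sls filebeat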
I have another theory using file.serialize that might work, but not in its current state. Maybe Dave could help.
{% set json_data = salt.cp.get_file_str('/etc/filebeat/filebeat.yml') | load_yaml %}
{% do json_data.update({'enabled': pillar.get('kafka_output', True)}) %}

update_config:
  file.serialize:
    - name: /etc/filebeat/filebeat.yml
    - user: root
    - group: root
    - mode: 644
    - formatter: yaml
    - dataset: |
        {{ json_data | yaml(False) | indent(8) }}
This state should load the whole configuration file; you can then modify any of its values based on your pillar settings using the do statement. In your case it could be:
{% do json_data.update({'enabled': pillar.get('kafka_output', True)}) %}
The config file is populated, but not as expected; the result is the following:
'enabled: true
status: active
'
Note there are quotes and the YAML is not indented correctly. Is there another way to make it work? I will update this answer if I find any new results.
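One avenue worth trying, assuming your Salt version supports file.serialize's merge_if_exists option (it reads the existing file, merges the dataset into it, and writes the result back), which would sidestep the quoting and indentation issues above:

update_config:
  file.serialize:
    - name: /etc/filebeat/filebeat.yml
    - formatter: yaml
    # merge into the file's current contents instead of replacing them
    - merge_if_exists: True
    - dataset:
        # "output.kafka" is a literal top-level key in filebeat.yml
        output.kafka:
          enabled: {{ pillar.get('kafka_output', True) }}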

how to exclude property-sort-order from scss-lint rules?

I have a sasslint.yml file with a list of rules.
One of them is
property-sort-order: 1
I have tried to exclude it with
property-sort-order: enabled:false
and with
scss-lint --exclude-linter PropertySortOrder
But all this was unsuccessful.
Any ideas?
Many thanks
You configure scss-lint in a YAML configuration file. The default is .scss-lint.yml, and you can specify a different file via the command line with --config (I think -c works too). The documentation covers this here: https://github.com/brigade/scss-lint#configuration
You disable a linter with
linters:
  LinterName:
    enabled: false
Judging by https://github.com/brigade/scss-lint/issues/132,
linters:
  PropertySortOrder:
    enabled: false
will work correctly.
If you'd actually rather not turn it off completely, configuration options for scss-lint's property-sort-order are documented here https://github.com/brigade/scss-lint/blob/master/lib/scss_lint/linter/README.md#propertysortorder
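For example, per those docs you can keep the linter on but relax it; the property list below is only an illustration, not a recommendation:

linters:
  PropertySortOrder:
    enabled: true
    # enforce only this explicit order...
    order:
      - display
      - position
      - width
    # ...and ignore properties not listed above
    ignore_unspecified: true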

(SaltStack) ID dog in SLS dog is not a dictionary

I have been trying to find a pattern (bcm2708_wdog) in the /etc/modules file and, if it isn't there, add it to the bottom. Every time I try this I get "ID dog in SLS dog is not a dictionary". I have no idea what this means.
Here is the file:
dog:
  - file.replace:
    - name: /etc/modules
    - pattern: 'bcm2708_wdog'
    - append_if_not_found: True
It should probably look like this:
dog:
  file.replace: # <-------- this line was your problem
    - name: /etc/modules
    - pattern: 'bcm2708_wdog'
    - append_if_not_found: True
Lines beginning with "-" denote items in a list. In your version, you've defined the top-level "dog" element as a list containing a dictionary. Salt expects it to be a straight dictionary instead, hence the error.
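Concretely, the two versions parse to different structures (flow-style YAML for illustration, argument lists abbreviated):

# your version: the value of "dog" is a list containing one dictionary
{dog: [{file.replace: [...]}]}
# what Salt expects: the value of "dog" is a dictionary of state functions
{dog: {file.replace: [...]}}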
Depending on your version (as of 2018), you will also get that message if you wrote the state with just 'file.replace', without the "dog" ID on top:
file.replace:
  - name: /etc/modules
  - pattern: 'bcm2708_wdog'
  - append_if_not_found: True

Apache Nutch NoSuchElementException with bin/nutch inject, readdb, generate options

I am new to Apache Nutch 2.3 and Solr. I am trying to get my first crawl working. I installed Apache Nutch and Solr as described in the official documentation and both are working fine. However, when I do the following steps I get errors:
bin/nutch inject examples/dmoz/ - works correctly:
(InjectorJob: total number of urls rejected by filters: 2
InjectorJob: total number of urls injected after normalization and filtering: 130)
Error - $ bin/nutch generate -topN 5
GeneratorJob: starting at 2015-06-25 17:51:50
GeneratorJob: Selecting best-scoring urls due for fetch.
GeneratorJob: starting
GeneratorJob: filtering: true
GeneratorJob: normalizing: true
GeneratorJob: topN: 5
java.util.NoSuchElementException
at java.util.TreeMap.key(TreeMap.java:1323)
at java.util.TreeMap.firstKey(TreeMap.java:290)
at org.apache.gora.memory.store.MemStore.execute(MemStore.java:125)
at org.apache.gora.query.impl.QueryBase.execute(QueryBase.java:73) ...
GeneratorJob: generated batch id: 1435279910-1190400607 containing 0 URLs
Same error if I do $ bin/nutch readdb -stats:
Error - java.util.NoSuchElementException ...
Statistics for WebTable:
jobs: {db_stats-job_local970586387_0001={jobName=db_stats, jobID=job_local970586387_0001, counters={Map-Reduce Framework={MAP_OUTPUT_MATERIALIZED_BYTES=6, REDUCE_INPUT_RECORDS=0, SPILLED_RECORDS=0, MAP_INPUT_RECORDS=0, SPLIT_RAW_BYTES=653, MAP_OUTPUT_BYTES=0, REDUCE_SHUFFLE_BYTES=0, REDUCE_INPUT_GROUPS=0, COMBINE_OUTPUT_RECORDS=0, REDUCE_OUTPUT_RECORDS=0, MAP_OUTPUT_RECORDS=0, COMBINE_INPUT_RECORDS=0, COMMITTED_HEAP_BYTES=514850816}, File Input Format Counters ={BYTES_READ=0}, File Output Format Counters ={BYTES_WRITTEN=98}, FileSystemCounters={FILE_BYTES_WRITTEN=1389120, FILE_BYTES_READ=1216494}}}}
TOTAL urls: 0
I am also not able to use generate or crawl commands.
Can anyone tell me what I am doing wrong?
Thanks.
I too am new to nutch. However, I think the problem is that you haven't configured a data store. I got the same error, and got a bit further. You need to follow this: https://wiki.apache.org/nutch/Nutch2Tutorial, or this: https://wiki.apache.org/nutch/Nutch2Cassandra. Then, rebuild: ant runtime
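For what it's worth, the stack trace above points at org.apache.gora.memory.store.MemStore, i.e. Nutch is falling back to Gora's non-persistent in-memory store, so nothing survives between inject and generate. A sketch of the relevant nutch-site.xml property, assuming the Cassandra backend from the linked tutorial (the HBase store is analogous); the matching gora-* dependency also needs to be enabled in ivy.xml before running ant runtime:

<!-- conf/nutch-site.xml: select a persistent Gora data store
     instead of the default in-memory MemStore -->
<property>
  <name>storage.data.store.class</name>
  <value>org.apache.gora.cassandra.store.CassandraStore</value>
</property>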
