Flume agent - [tail -f /var/log/httpd/error_log] exited with 1 - flume-ng

I am new to Flume and my agent is not writing data to HDFS. The purpose of the configuration below is to pull data from the Apache error log and park it in HDFS. Here it is:
# Identify the components on agent a1
a1.sources = apache_server
a1.sinks = hdfs_sink
a1.channels = c1
# Configure the source:
a1.sources.apache_server.type = exec
a1.sources.apache_server.command = tail -f /var/log/httpd/error_log
# Describe the sink:
a1.sinks.hdfs_sink.type = hdfs
a1.sinks.hdfs_sink.hdfs.path = hdfs://hadoop1.example.com:9000/Apache_Logs
a1.sinks.hdfs_sink.hdfs.writeFormat = Text
a1.sinks.hdfs_sink.hdfs.fileType = DataStream
a1.sinks.hdfs_sink.hdfs.rollInterval = 10
a1.sinks.hdfs_sink.hdfs.rollSize = 0
a1.sinks.hdfs_sink.hdfs.filePrefix = apacheaccess
# Configure a channel that buffers events in memory:
a1.channels.c1.type = memory
a1.channels.c1.capacity = 20000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel:
a1.sources.apache_server.channels = c1
a1.sinks.hdfs_sink.channel = c1
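The error in the title means the exec source's command itself died: tail exits with status 1 when it cannot open the file, and on Red Hat-style systems /var/log/httpd is typically readable only by root while the agent runs as another user. A quick check worth running as whatever user the Flume agent runs under (the flume user below is an assumption):
# Reproduce the source command as the agent's user; a "Permission denied"
# here explains the exit status 1 reported by the exec source.
sudo -u flume tail -F /var/log/httpd/error_log
Using -F instead of -f is also worth considering once permissions are fixed: -f stops following after the log is rotated, while -F reopens the file by name.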

mariadb cluster synced but one node shows size=0

I use MariaDB 10.5 with Galera 4. I have a 3-node cluster that worked perfectly for the past 6 months. Lately I had problems with a very CPU-intensive query and had to kill that process. One of the nodes (n1) went out of sync, so I recreated it. Everything synced perfectly, but since that day n1 shows wsrep_cluster_size=0 while the rest show wsrep_cluster_size=3.
After a couple of days I decided to stop n2 and n3 and recreate them from n1. Again everything went smoothly, but now n3 shows wsrep_cluster_size=0 and n1, n2 show wsrep_cluster_size=3.
I have no idea what's going on. I've checked all the logs and manually checked all the tables, and everything seems OK. Data is synced and the database is working just fine.
Here is my configuration:
[mysqld]
binlog_format = ROW
bind-address = 0.0.0.0
# Galera Provider Configuration
wsrep_on = ON
wsrep_provider = /usr/lib/galera/libgalera_smm.so
# Galera Cluster Configuration
wsrep_cluster_name = cluser
wsrep_cluster_address = gcomm://10.0.0.2,10.0.0.3,10.0.0.4
wsrep_node_address = 10.0.0.2
wsrep_node_name = n1
# Galera Synchronization Configuration
wsrep_sst_method = rsync
log_error = /var/lib/mysql/node.log
default_storage_engine = InnoDB
innodb_autoinc_lock_mode = 2
innodb_locks_unsafe_for_binlog = 1
innodb_file_per_table = 1
#innodb_thread_concurrency = 0
innodb_buffer_pool_size = 10G
#innodb_log_buffer_size = 64M
innodb_flush_method = O_DIRECT
innodb_log_file_size = 2G
innodb_log_files_in_group = 2
wsrep_slave_threads = 5
innodb_locks_unsafe_for_binlog = 1
innodb_autoinc_lock_mode = 2
skip-name-resolve
lc-messages-dir = /usr/share/mysql
skip-external-locking
key_buffer_size = 16M
max_connections = 300
wait_timeout = 20
max_allowed_packet = 16M
thread_stack = 192K
thread_cache_size = 8
# * Query Cache Configuration
#
query_cache_limit = 1M
query_cache_size = 16M
expire_logs_days = 10
max_binlog_size = 100M
Here is my SHOW STATUS LIKE 'wsrep%' output for the 3 nodes:
https://pastebin.com/GXj0c38R
And the logs:
https://pastebin.com/YxJBcguK
This is definitely a bug. Please report it on MariaDB JIRA.
In addition to wsrep_cluster_size=0 on n3, wsrep_cluster_conf_id is uninitialised (not 23 like the other nodes) and wsrep_cluster_state_uuid is blank.
For a synced node I'd expect these to have consistent values on all nodes.
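For a quick side-by-side comparison, the relevant counters can be pulled in one statement on each node; on a healthy cluster they should agree everywhere (same uuid, same conf_id, size 3):
-- Run on every node and compare; wsrep_local_state_comment should be Synced
SHOW GLOBAL STATUS WHERE Variable_name IN
('wsrep_cluster_size', 'wsrep_cluster_conf_id',
'wsrep_cluster_state_uuid', 'wsrep_local_state_comment');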

count bytes with Influx's Telegraf

I can receive messages with the inputs.mqtt_consumer Telegraf plugin, but it gives me a lot of data in InfluxDB.
How can I, in the Telegraf configuration, just count the number of received bytes and messages and report that to InfluxDB?
# Configuration for telegraf agent
[agent]
interval = "20s"
round_interval = true
metric_batch_size = 1000
metric_buffer_limit = 10000
collection_jitter = "0s"
flush_interval = "10s"
flush_jitter = "0s"
precision = ""
hostname = ""
omit_hostname = false
[[outputs.influxdb_v2]]
urls = ["XXXXXXXXXXXXXXXX"]
token = "$INFLUX_TOKEN"
organization = "XXXXXXXXXXXXXXX"
bucket = "XXXXXXXXXXXXXXX"
[[inputs.mqtt_consumer]]
servers = ["tcp://XXXXXXXXXXXXXXXXXXXXX:1883"]
topics = [
"#",
]
data_format = "value"
data_type = "string"
I tried googling around but didn't find any clear way to do it.
I just want the number of bytes and messages received each minute for the selected topic.
I did not manage to receive all the messages and count them, but I found a solution where I can get the data from the broker. Not exactly what I asked for, but fine for what I need.
topics = [
"$SYS/broker/load/messages/received/1min",
"$SYS/broker/load/messages/sent/1min",
]
...
data_format = "value"
data_type = "float"
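For completeness, here is the consumer block this leads to as a minimal sketch, assuming a Mosquitto broker (the $SYS topic layout is broker-specific). Mosquitto publishes byte counters alongside the message counters, which covers the bytes half of the question:
# 1-minute moving averages computed and published by the broker itself
[[inputs.mqtt_consumer]]
servers = ["tcp://XXXXXXXXXXXXXXXXXXXXX:1883"]
topics = [
"$SYS/broker/load/messages/received/1min",
"$SYS/broker/load/messages/sent/1min",
"$SYS/broker/load/bytes/received/1min",
"$SYS/broker/load/bytes/sent/1min",
]
data_format = "value"
data_type = "float"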

MariaDB 10.1 using too much RAM on Debian 9.12

I am administrating a server with MariaDB and it often uses a lot of RAM - often so much that it crashes because it can't allocate more, and even when it doesn't crash it can be pretty slow because it swaps. Even when it is running smoothly, I can see with htop that among the processes using the most RAM there are two dozen /usr/sbin/mysqld processes. Of course I googled it, but I barely understand what the settings in my.cnf do, so nothing I changed appeared to have an effect.
The server has:
2 GB of RAM
Debian GNU/Linux 9.12 (stretch)
mysql --version returns: Ver 15.1 Distrib 10.1.44-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
Here is the content of /etc/mysql/my.cnf:
# This file has been automatically moved from your previous
# /etc/mysql/my.cnf, with just this comment added at the top, to maintain MySQL
# operation using your previously customised configuration.
# To switch to the new packaging configuration for automated management of
# /etc/mysql/my.cnf across multiple variants:
#
# 1. Move your customisations from this file to /etc/mysql/conf.d/ and
# to /etc/mysql/<variant>.conf.d/ as appropriate.
# 2. Run "update-alternatives --remove my.cnf /etc/mysql/my.cnf.migrated"
# 3. Remove the file /etc/mysql/my.cnf.migrated
#
# The MySQL database server configuration file.
#
# You can copy this to one of:
# - "/etc/mysql/my.cnf" to set global options,
# - "~/.my.cnf" to set user-specific options.
#
# One can use all long options that the program supports.
# Run program with --help to get a list of available options and with
# --print-defaults to see which it would actually understand and use.
#
# For explanations see
# http://dev.mysql.com/doc/mysql/en/server-system-variables.html
# This will be passed to all mysql clients
# It has been reported that passwords should be enclosed with ticks/quotes
# especially if they contain "#" chars...
# Remember to edit /etc/mysql/debian.cnf when changing the socket location.
[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock
# Here are entries for some specific programs
# The following values assume you have at least 32M ram
# This was formerly known as [safe_mysqld]. Both versions are currently parsed.
[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0
[mysqld]
#
# * Basic Settings
user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /tmp
lc-messages-dir = /usr/share/mysql
skip-external-locking
#
# Instead of skip-networking the default is now to listen only on
# localhost which is more compatible and is not less secure.
#bind-address = 127.0.0.1
bind-address = ***.***.***.***
#
# * Fine Tuning
#
key_buffer = 8M
max_allowed_packet = 16M
thread_stack = 192K
thread_cache_size = 8
# This replaces the startup script and checks MyISAM tables if needed
# the first time they are touched
myisam-recover = BACKUP
#max_connections = 100
#table_cache = 64
#thread_concurrency = 10
#
# * Query Cache Configuration
#
query_cache_limit = 1M
query_cache_size = 8M
#
# * Logging and Replication
#
# Both locations get rotated by the cronjob.
# Be aware that this log type is a performance killer.
# As of 5.1 you can enable the log at runtime!
#general_log_file = /var/log/mysql/mysql.log
#general_log = 1
#
# Error log - should be very few entries.
#
log_error = /var/log/mysql/error.log
#
# Here you can see queries with especially long duration
slow_query_log_file = /var/log/mysql/mysql-slow.log
# slow_query_log = 1
long_query_time = 1
#log_queries_not_using_indexes
#
# The following can be used as easy to replay backup logs or for replication.
# note: if you are setting up a replication slave, see README.Debian about
# other settings you may need to change.
#server-id = 1
#log_bin = /var/log/mysql/mysql-bin.log
expire_logs_days = 10
max_binlog_size = 100M
#binlog_do_db = include_database_name
#binlog_ignore_db = include_database_name
#
# * InnoDB
#
# InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/.
# Read the manual for more InnoDB related options. There are many!
#
# * Security Features
#
# Read the manual, too, if you want chroot!
# chroot = /var/lib/mysql/
#
# For generating SSL certificates I recommend the OpenSSL GUI "tinyca".
#
# ssl-ca=/etc/mysql/cacert.pem
# ssl-cert=/etc/mysql/server-cert.pem
# ssl-key=/etc/mysql/server-key.pem
character_set_server=utf8mb4
skip_character_set_client_handshake
skip-name-resolve
tmp_table_size = 32M # 128
max_heap_table_size = 128M
wait_timeout = 60 # 120
max_connections = 25 # 40
thread_stack = 128K
[mysqldump]
quick
quote-names
max_allowed_packet = 16M
[mysql]
#no-auto-rehash # faster start of mysql but no tab completion
[isamchk]
key_buffer = 16M
#
# * IMPORTANT: Additional settings that can override those from this file!
# The files must end with '.cnf', otherwise they'll be ignored.
#
!includedir /etc/mysql/conf.d/
Is something wrong in my my.cnf? Would updating MariaDB help?
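Two things stand out. First, htop lists threads separately by default, so the two dozen mysqld entries are threads of a single process sharing the same memory, not independent copies. Second, nothing in this file caps InnoDB memory (innodb_buffer_pool_size defaults to 128M), and per-connection buffers plus in-memory temp tables can still push a 2 GB host into swap. A hedged starting point, with illustrative values rather than ones tuned for this workload:
[mysqld]
# Illustrative caps for a 2 GB host; watch actual usage before tightening.
innodb_buffer_pool_size = 256M # explicit cap on the main InnoDB cache
tmp_table_size = 32M # in-memory temp tables are capped by the
max_heap_table_size = 32M # smaller of these two values
max_connections = 25 # each connection can allocate several MB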

streaming files into hdfs using flume

I'm trying to move files into HDFS, and this is my config file:
# Naming the components on the current agent.
FileAgent.sources = File
FileAgent.channels = MemChannel
FileAgent.sinks = HDFS
# Configuring the source
FileAgent.sources.File.type = spooldir
FileAgent.sources.File.spoolDir = /usr/lib/flume/spooldir
# Describing/Configuring the sink
FileAgent.sinks.HDFS.type = hdfs
FileAgent.sinks.HDFS.hdfs.path = hdfs://192.168.1.31:8020/user/Flume/
FileAgent.sinks.HDFS.hdfs.fileType = DataStream
FileAgent.sinks.HDFS.hdfs.writeFormat = Text
FileAgent.sinks.HDFS.hdfs.batchSize = 1000
FileAgent.sinks.HDFS.hdfs.rollSize = 0
FileAgent.sinks.HDFS.hdfs.rollCount = 10000
# Describing/Configuring the channel
FileAgent.channels.MemChannel.type = memory
FileAgent.channels.MemChannel.capacity = 10000
FileAgent.channels.MemChannel.transactionCapacity = 100
# Binding the source and sink to the channel
FileAgent.sources.File.channels = MemChannel
FileAgent.sinks.HDFS.channel = MemChannel
And it works well. But the files in HDFS get names like FlumeData.1460976871742.
In my case I want to keep the original file name.
How can I keep the original file name in HDFS?
For example, if I have a file test.txt in the directory /usr/lib/flume/spooldir, I want a file test.txt in HDFS.
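The spooling directory source can attach the original file name to each event as a header, and the HDFS sink can substitute that header into its file prefix. A minimal sketch on top of the config above; note the sink still appends a timestamp suffix for uniqueness, so the result is test.txt.1460976871742 rather than exactly test.txt:
# Put each source file's basename in a header, then use it as the prefix
FileAgent.sources.File.basenameHeader = true
FileAgent.sinks.HDFS.hdfs.filePrefix = %{basename}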

Getting this issue in Cloudera 4.7: WARN conf.FlumeConfiguration: Could not configure sink c1 due to: No channel configured for sink: c1

Below is the conf file:
a1.sources = r1
a1.channels = k1
a1.sinks = c1
a1.sources.r1.type = netcat
a1.sources.r1.bind = localhost
a1.sources.r1.port = 44444
a1.sinks.c1.type = logger
a1.channels.k1.type = memory
a1.channels.k1.capacity = 1000
a1.channels.k1.transactionCapacity = 100
a1.sources.r1.channels = k1
a1.sources.c1.channel = k1
The log file shows the warning below.
14/10/21 07:14:44 INFO conf.FlumeConfiguration: Added sinks: c1 Agent: a1
14/10/21 07:14:44 INFO conf.FlumeConfiguration: Processing:c1
14/10/21 07:14:45 WARN conf.FlumeConfiguration: Could not configure sink c1 due to: No channel configured for sink: c1
org.apache.flume.conf.ConfigurationException: No channel configured for sink: c1
at org.apache.flume.conf.sink.SinkConfiguration.configure(SinkConfiguration.java:51)
at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSinks(FlumeConfiguration.java:680)
at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid(FlumeConfiguration.java:346)
at org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.access$000(FlumeConfiguration.java:212)
at org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:126)
at org.apache.flume.conf.FlumeConfiguration.<init>(FlumeConfiguration.java:108)
at org.apache.flume.node.PropertiesFileConfigurationProvider.getFlumeConfiguration(PropertiesFileConfigurationProvider.java:193)
at org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:94)
at org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:140)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
14/10/21 07:14:45 INFO conf.FlumeConfiguration: Post-validation flume configuration contains configuration for agents: [a1]
It should be
a1.sinks.c1.channel = k1
Or, renaming the components to the usual convention (c1 as the channel, k1 as the sink), the bindings become:
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
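For context, the root cause is the last line of the posted file, a1.sources.c1.channel = k1: it addresses the sink c1 as if it were a source, so the sink never gets a channel. Keeping the names as declared above (k1 the channel, c1 the sink), the two binding lines should read:
# Sources take a list via "channels"; sinks take exactly one via "channel"
a1.sources.r1.channels = k1
a1.sinks.c1.channel = k1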
