Graphite webapp doesn't show data from all carbon-cache processes

I am running four carbon-cache instances behind one carbon-relay instance. Below is my carbon.conf.
[cache:1]
LINE_RECEIVER_PORT = 2103
PICKLE_RECEIVER_PORT = 2104
CACHE_QUERY_PORT = 7102
STORAGE_DIR = /graphite_data/01
LOCAL_DATA_DIR = /graphite_data/01
[cache:2]
LINE_RECEIVER_PORT = 2203
PICKLE_RECEIVER_PORT = 2204
CACHE_QUERY_PORT = 7202
STORAGE_DIR = /graphite_data/02
LOCAL_DATA_DIR = /graphite_data/02
[cache:3]
LINE_RECEIVER_PORT = 2303
PICKLE_RECEIVER_PORT = 2304
CACHE_QUERY_PORT = 7302
STORAGE_DIR = /graphite_data/03
LOCAL_DATA_DIR = /graphite_data/03
[cache:4]
LINE_RECEIVER_PORT = 2403
PICKLE_RECEIVER_PORT = 2404
CACHE_QUERY_PORT = 7402
STORAGE_DIR = /graphite_data/04
LOCAL_DATA_DIR = /graphite_data/04
I have configured my carbon relay with the configuration below:
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2003
PICKLE_RECEIVER_INTERFACE = 0.0.0.0
PICKLE_RECEIVER_PORT = 2004
RELAY_METHOD = consistent-hashing
.
REPLICATION_FACTOR = 1
DESTINATIONS=127.0.0.1:2104:1,127.0.0.1:2204:2,127.0.0.1:2304:3,127.0.0.1:2404:4
I have configured the Graphite webapp with the settings below so that it reads data from all carbon-cache processes:
STANDARD_DIRS = ['/graphite_data/01',
'/graphite_data/02',
'/graphite_data/03',
'/graphite_data/04']
# You *should* use 127.0.0.1 here in most cases
CARBONLINK_HOSTS = ["127.0.0.1:7102:1", "127.0.0.1:7202:2", "127.0.0.1:7302:3","127.0.0.1:7402:4"]
After configuring everything, I started pushing data to my carbon relay with example-client.py. I can see that the relay is forwarding data to the carbon-cache processes:
[root@poc-graphite graphite]# ls /graphite_data/02/system/loadavg_5min.wsp
/graphite_data/02/system/loadavg_5min.wsp
[root@poc-graphite graphite]# ls /graphite_data/03/system/loadavg_1min.wsp
/graphite_data/03/system/loadavg_1min.wsp
[root@poc-graphite graphite]# ls /graphite_data/04/system/loadavg_15min.wsp
/graphite_data/04/system/loadavg_15min.wsp
But I am not able to see these metrics in my webapp. Is there something wrong with the configuration?
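To confirm which metrics actually exist on disk across the four storage directories, a small script can walk them and print the dotted metric names the webapp would be expected to show. This is only a diagnostic sketch: the helper name `list_whisper_metrics` is made up for illustration, and it assumes the `.wsp` files sit directly under each directory, as in the `ls` output above.

```python
import os


def list_whisper_metrics(storage_dirs):
    """Map each dotted metric name to the storage directories holding its .wsp file."""
    found = {}
    for base in storage_dirs:
        # os.walk silently yields nothing for a directory that does not exist.
        for root, _dirs, files in os.walk(base):
            for name in files:
                if not name.endswith(".wsp"):
                    continue
                rel = os.path.relpath(os.path.join(root, name), base)
                metric = rel[:-len(".wsp")].replace(os.sep, ".")
                found.setdefault(metric, []).append(base)
    return found


if __name__ == "__main__":
    # Directories taken from the STANDARD_DIRS setting in the question.
    dirs = ["/graphite_data/%02d" % i for i in range(1, 5)]
    for metric, locations in sorted(list_whisper_metrics(dirs).items()):
        print(metric, "->", ", ".join(locations))
```

If a metric shows up here but not in the webapp, the problem is on the webapp side (settings or permissions) rather than in the relay or the caches.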

You should check which paths Python searches for the Twisted plugin with the commands below:
$python
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages']
If the output includes the path "/usr/local/lib/python2.7/dist-packages", remove the Twisted files from that directory:
sudo rm -rf /usr/local/lib/python2.7/dist-packages/twiste*
and then
sudo service carbon-cache stop ## wait a few seconds here
sudo service carbon-cache start
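Before deleting anything, it may help to see whether more than one copy of Twisted is visible to the interpreter. The sketch below is a hypothetical helper (`find_package_copies` is not part of Twisted or Graphite) that simply looks for a `twisted` directory under each `sys.path` entry.

```python
import os
import sys


def find_package_copies(package, search_paths=None):
    """Return every search-path entry that contains a directory named `package`."""
    paths = sys.path if search_paths is None else search_paths
    hits = []
    for entry in paths:
        # An empty entry in sys.path means the current directory.
        candidate = os.path.join(entry or ".", package)
        if os.path.isdir(candidate) and candidate not in hits:
            hits.append(candidate)
    return hits


if __name__ == "__main__":
    copies = find_package_copies("twisted")
    for path in copies:
        print(path)
    if len(copies) > 1:
        print("More than one Twisted install is visible; the first one listed is imported.")
```

Run it with the same interpreter that starts carbon-cache, since a different Python may have a different `sys.path`.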

Related

mariadb cluster synced but one node shows size=0

I use MariaDB 10.5 with Galera 4. I have a 3-node cluster that worked perfectly for the past 6 months. Lately I had problems with a very CPU-intensive query and had to kill that process. One of the nodes (n1) went out of sync, so I recreated it. Everything synced perfectly, but since that day n1 shows wsrep_cluster_size=0 while the other nodes show wsrep_cluster_size=3.
After a couple of days I decided to stop n2 and n3 and recreate them from n1. Again everything went smoothly, but now n3 shows wsrep_cluster_size=0 while n1 and n2 show wsrep_cluster_size=3.
I have no idea what's going on. I've checked all the logs and manually checked all the tables, and everything seems OK. Data is synced and the database is working just fine.
Here is my configuration:
[mysqld]
binlog_format = ROW
bind-address = 0.0.0.0
# Galera Provider Configuration
wsrep_on = ON
wsrep_provider = /usr/lib/galera/libgalera_smm.so
# Galera Cluster Configuration
wsrep_cluster_name = cluser
wsrep_cluster_address = gcomm://10.0.0.2,10.0.0.3,10.0.0.4
wsrep_node_address = 10.0.0.2
wsrep_node_name = n1
# Galera Synchronization Configuration
wsrep_sst_method = rsync
log_error = /var/lib/mysql/node.log
default_storage_engine = InnoDB
innodb_autoinc_lock_mode = 2
innodb_locks_unsafe_for_binlog = 1
innodb_file_per_table = 1
#innodb_thread_concurrency = 0
innodb_buffer_pool_size = 10G
#innodb_log_buffer_size = 64M
innodb_flush_method = O_DIRECT
innodb_log_file_size = 2G
innodb_log_files_in_group = 2
wsrep_slave_threads = 5
innodb_locks_unsafe_for_binlog = 1
innodb_autoinc_lock_mode = 2
skip-name-resolve
lc-messages-dir = /usr/share/mysql
skip-external-locking
key_buffer_size = 16M
max_connections = 300
wait_timeout = 20
max_allowed_packet = 16M
thread_stack = 192K
thread_cache_size = 8
# * Query Cache Configuration
#
query_cache_limit = 1M
query_cache_size = 16M
expire_logs_days = 10
max_binlog_size = 100M
Here is the output of SHOW STATUS LIKE 'wsrep%' for the 3 nodes:
https://pastebin.com/GXj0c38R
And logs
https://pastebin.com/YxJBcguK
This is definitely a bug. Please report it on MariaDB JIRA.
In addition to wsrep_cluster_size=0 on n3, wsrep_cluster_conf_id is uninitialised (rather than 23, as on the other nodes) and wsrep_cluster_state_uuid is blank.
For a synced node I'd expect these to have consistent values on all nodes.
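A quick way to spot this kind of mismatch is to compare the cluster-level status variables node by node. The sketch below assumes the `SHOW STATUS LIKE 'wsrep%'` output has already been collected into one dictionary per node; the function name and the sample values are hypothetical, and the real values are in the linked output.

```python
KEYS = ("wsrep_cluster_size", "wsrep_cluster_conf_id", "wsrep_cluster_state_uuid")


def inconsistent_status(status_by_node, keys=KEYS):
    """Return {variable: {node: value}} for variables whose values differ between nodes."""
    diffs = {}
    for key in keys:
        values = {node: status.get(key) for node, status in status_by_node.items()}
        if len(set(values.values())) > 1:
            diffs[key] = values
    return diffs


if __name__ == "__main__":
    # Hypothetical values shaped like the situation described above.
    healthy = {
        "wsrep_cluster_size": "3",
        "wsrep_cluster_conf_id": "23",
        "wsrep_cluster_state_uuid": "example-uuid",
    }
    broken = {
        "wsrep_cluster_size": "0",
        "wsrep_cluster_conf_id": "",
        "wsrep_cluster_state_uuid": "",
    }
    cluster = {"n1": dict(healthy), "n2": dict(healthy), "n3": broken}
    for variable, values in inconsistent_status(cluster).items():
        print(variable, values)
```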

Though max_allowed_packet is 1G, "ERROR 2006 (HY000): MySQL server has gone away" appears

I installed MariaDB 10.5.12-1.el7 on CentOS 7.9.
Sometimes when I run a query such as "SHOW VARIABLES LIKE 'max_join_size';", this message appears:
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
Connection id: 3515012
Current database: *** NONE ***
I set up a Pacemaker/DRBD/MariaDB cluster.
This is the configuration:
max_allow_packet = 1G
bind-address = 0.0.0.0
datadir= /db/mysql/
socket=/db/mysql/mysql.sock
log_error=/var/log/mariadb/error.log
skip-external-locking
innodb_buffer_pool_size = 75G
innodb_log_file_size = 18G
innodb_buffer_pool_instances = 75
max_allowed_packet = 1G
thread_stack = 256K
thread_cache_size = 2000
max_connections = 2000
query_cache_limit = 256K
table_open_cache = 2000
table_definition_cache = 1400
expire_logs_days = 10
max_binlog_size = 100M
default_storage_engine = innodb
innodb_file_per_table = 1
interactive_timeout = 30
wait_timeout = 30
query_cache_type = 1
query_cache_size = 36M
query_cache_min_res_unit = 2K
What is the cause of this issue?
Thanks
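One thing the configuration above makes worth ruling out: both interactive_timeout and wait_timeout are set to 30 seconds. The server closes a connection that has been idle longer than the applicable timeout, and the mysql client then reports error 2006 on the next statement and reconnects, which looks like the output shown. Whether that is the cause here is an assumption. The sketch below uses made-up helper names and just pulls the idle limits out of a list of option lines so they can be compared with how long the session sat idle.

```python
def parse_options(text):
    """Parse 'name = value' lines into (name, value) pairs, keeping duplicates."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line.startswith("["):
            continue
        name, _, value = line.partition("=")
        # Option files accept dashes and underscores interchangeably in names.
        pairs.append((name.strip().replace("-", "_"), value.strip()))
    return pairs


def idle_limits(pairs):
    """Return the timeout settings that bound how long an idle connection may live."""
    return {name: int(value) for name, value in pairs
            if name in ("wait_timeout", "interactive_timeout")}


if __name__ == "__main__":
    # A few lines copied from the configuration in the question.
    sample = """
    max_allowed_packet = 1G
    interactive_timeout = 30
    wait_timeout = 30
    """
    for name, seconds in idle_limits(parse_options(sample)).items():
        print("%s: connection dropped after %d s idle" % (name, seconds))
```

If the error only appears after the client has been left alone for more than half a minute, raising these two values is a cheap experiment.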

Postfix SPF delivers spoofed emails instead of failing them

I started receiving spoofed emails, so I set up SPF on my server and domain, but I still receive them. SPF is not rejecting the emails.
Can anyone help?
DNS records
myserver.com. IN TXT "v=spf1 a mx a:myserver.com ip4:50.111.111.111 -all"
_dmarc.myserver.com. IN TXT "v=DMARC1; p=reject; fo=1; ri=3600; pct=100; rua=mailto:info@myserver.com; ruf=mailto:info@myserver.com"
/etc/postfix-policyd-spf-python/policyd-spf.conf
debugLevel = 1
HELO_reject = Fail
Mail_From_reject = Fail
PermError_reject = False
TempError_Defer = False
skip_addresses = 127.0.0.0/8,::ffff:127.0.0.0/104,::1
postfix - main.cf
smtpd_recipient_restrictions = permit_sasl_authenticated, permit_mynetworks, reject_unauth_destination, check_policy_service unix:private/policyd-spf, reject_unauth_pipelining, reject_invalid_helo_hostname, reject_non_fqdn_helo_hostname, reject_unknown_recipient_domain, reject_rbl_client zen.spamhaus.org, reject_rbl_client bl.spamcop.net, check_policy_service inet:127.0.0.1:10023
postfix - master.cf
policyd-spf unix - n n - 0 spawn
  user=policyd-spf argv=/usr/bin/policyd-spf
mail.log
Oct 12 21:13:36 myserver policyd-spf[26371]: None; identity=helo; client-ip=72.167.234.237; helo=p3nlsmtp12.shr.prod.phx3.secureserver.net; envelope-from=test@baddkim.com; receiver=mymail@myserver.com
Oct 12 21:13:36 myserver policyd-spf[26371]: None; identity=mailfrom; client-ip=72.167.234.237; helo=p3nlsmtp12.shr.prod.phx3.secureserver.net; envelope-from=test@baddkim.com; receiver=mymail@myserver.com
Oct 12 21:13:36 myserver policyd-spf[26369]: Pass; identity=mailfrom; client-ip=72.167.234.237; helo=p3nlsmtp12.shr.prod.phx3.secureserver.net; envelope-from=test@emailspooftest.com; receiver=mymail@myserver.com
Oct 12 21:13:36 myserver postfix/smtpd[22955]: BFA1981347: client=p3nlsmtp12.shr.prod.phx3.secureserver.net[72.167.234.237]
Oct 12 21:13:36 myserver postgrey[2322]: action=pass, reason=triplet found, client_name=p3nlsmtp12.shr.prod.phx3.secureserver.net, client_address=72.167.234.237, sender=test@baddkim.com, recipient=mymail@myserver.com
Oct 12 21:13:36 myserver postfix/smtpd[26363]: C1ADE814FA: client=p3nlsmtp12.shr.prod.phx3.secureserver.net[72.167.234.237]
Oct 12 21:13:36 myserver postgrey[2322]: action=pass, reason=triplet found, client_name=p3nlsmtp12.shr.prod.phx3.secureserver.net, client_address=72.167.234.237, sender=test@emailspooftest.com, recipient=mymail@myserver.com
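Reading the log above, policyd-spf evaluates the envelope sender's domain (baddkim.com and emailspooftest.com), not myserver.com, and the results are None and Pass. With HELO_reject = Fail and Mail_From_reject = Fail, only a Fail result leads to a rejection, so these messages would not be refused by SPF alone. The sketch below parses policyd-spf lines and reports which ones would qualify; the helper names are hypothetical and the sample lines are normalized copies of the log entries, with the helo field left out for brevity.

```python
import re

# Matches the policyd-spf part of a syslog line: "<result>; key=value; key=value; ..."
LINE_RE = re.compile(r"policyd-spf\s*\[\d+\]:\s*(?P<result>\w+);\s*(?P<fields>.*)")
FIELD_RE = re.compile(r"([\w-]+)\s*=\s*([^;]+)")


def parse_spf_line(line):
    """Extract the SPF result and key/value fields from one policyd-spf log line."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    record = {key: value.strip() for key, value in FIELD_RE.findall(match.group("fields"))}
    record["result"] = match.group("result")
    return record


def would_reject(record, reject_on=("Fail",)):
    """True if the SPF result is one the policy is configured to reject on."""
    return record["result"] in reject_on


if __name__ == "__main__":
    lines = [
        "Oct 12 21:13:36 myserver policyd-spf[26371]: None; identity=mailfrom; "
        "client-ip=72.167.234.237; envelope-from=test@baddkim.com; receiver=mymail@myserver.com",
        "Oct 12 21:13:36 myserver policyd-spf[26369]: Pass; identity=mailfrom; "
        "client-ip=72.167.234.237; envelope-from=test@emailspooftest.com; receiver=mymail@myserver.com",
    ]
    for line in lines:
        rec = parse_spf_line(line)
        print(rec["envelope-from"], rec["result"], "reject" if would_reject(rec) else "accept")
```

The own-domain SPF record with `-all` only comes into play when the envelope sender claims to be at myserver.com.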

ORA-12505 + "network adapter could not establish the connection" on Oracle11g/VirtualBox

I have Oracle 11g installed locally on each of my VirtualBox machines (running under Windows 7 64-bit). Suddenly, after a simple reboot, the database on one of the 5 virtual machines no longer accepts connections.
In SQL Developer, connecting by SID gives ORA-12505, and connecting by service name gives "Network adapter could not establish the connection". If I connect with SQL*Plus as sysdba, the connection succeeds but reports "connected to an idle instance". If I then try to list, for example, the running sessions and processes, I get error 01034 ("ORACLE not available"). I tried a lot of tricks but nothing works. Could it be a problem specific to virtual machines?
Here is what I tried:
the services (of my database and of the listener) are running (and I wait long enough between a restart and the next connection attempt);
the files tnsnames.ora, listener.ora and sqlnet.ora seem correct (see below);
if I force localhost to be 127.0.0.1 in the hosts file, I get error 12514;
ORACLE_HOME and ORACLE_SID are correctly set;
it should not be a memory problem (I even tried allocating more memory to the VM that doesn't work);
if I force "startup" in the sysdba session, subsequent requests return ORA-03114: not connected to ORACLE;
it should not be a problem of system file sizes either. The failing database is not the biggest of my databases, and no file in oradata is bigger than in the other VMs, which have exactly the same configuration.
# tnsnames.ora Network Configuration File: C:\oracle_32\product\11.2.0\dbhome_2\network\admin\tnsnames.ora
# Generated by Oracle configuration tools.
LISTENER_ORCL =
(ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521))
ORACLR_CONNECTION_DATA =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
)
(CONNECT_DATA =
(SID = CLRExtProc)
(PRESENTATION = RO)
)
)
ORCL =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = orcl)
)
)
SQLNET.AUTHENTICATION_SERVICES= (NTS)
NAMES.DIRECTORY_PATH= (TNSNAMES, EZCONNECT)
# listener.ora Network Configuration File: C:\oracle_32\product\11.2.0\dbhome_2\network\admin\listener.ora
# Generated by Oracle configuration tools.
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(SID_NAME = CLRExtProc)
(ORACLE_HOME = C:\oracle_32\product\11.2.0\dbhome_2)
(PROGRAM = extproc)
(ENVS = "EXTPROC_DLLS=ONLY:C:\oracle_32\product\11.2.0\dbhome_2\bin\oraclr11.dll")
)
)
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
(ADDRESS = (PROTOCOL = TCP)(HOST = localhost)(PORT = 1521))
)
)
ADR_BASE_LISTENER = C:\oracle_32
Thank you for reading!
Here is the alert log for this morning's first connection attempt:
Fri Jun 23 11:08:13 2017
Starting ORACLE instance (normal)
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
Picked latch-free SCN scheme 2
Using LOG_ARCHIVE_DEST_1 parameter default value as USE_DB_RECOVERY_FILE_DEST
Autotune of undo retention is turned on.
IMODE=BR
ILAT =167
LICENSE_MAX_USERS = 0
SYS auditing is disabled
Starting up:
Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options.
Using parameter settings in server-side spfile C:\ORACLE_32\PRODUCT\11.2.0\DBHOME_2\DATABASE\SPFILEORCL.ORA
System parameters with non-default values:
processes = 1000
sessions = 1524
memory_target = 1232M
control_files = "C:\ORACLE_32\ORADATA\ORCL\CONTROL01.CTL"
control_files = "C:\ORACLE_32\FLASH_RECOVERY_AREA\ORCL\CONTROL02.CTL"
db_block_size = 8192
compatible = "11.2.0.0.0"
db_recovery_file_dest = "C:\oracle_32\flash_recovery_area"
db_recovery_file_dest_size= 3852M
undo_tablespace = "UNDOTBS1"
remote_login_passwordfile= "EXCLUSIVE"
db_domain = ""
dispatchers = "(PROTOCOL=TCP) (SERVICE=orclXDB)"
local_listener = "LISTENER_ORCL"
audit_file_dest = "C:\ORACLE_32\ADMIN\ORCL\ADUMP"
audit_trail = "DB"
db_name = "orcl"
open_cursors = 300
diagnostic_dest = "C:\ORACLE_32"
Fri Jun 23 11:08:20 2017
PMON started with pid=2, OS id=2160
Fri Jun 23 11:08:20 2017
VKTM started with pid=3, OS id=2164 at elevated priority
VKTM running at (10)millisec precision with DBRM quantum (100)ms
Fri Jun 23 11:08:21 2017
GEN0 started with pid=4, OS id=2168
Fri Jun 23 11:08:21 2017
DIAG started with pid=5, OS id=2172
Fri Jun 23 11:08:21 2017
DBRM started with pid=6, OS id=2176
OER 7451 in Load Indicator : Error Code = OSD-04500: illegal option specified
Fri Jun 23 11:08:21 2017
PSP0 started with pid=7, OS id=2180
Fri Jun 23 11:08:21 2017
DIA0 started with pid=8, OS id=2184
Fri Jun 23 11:08:21 2017
MMAN started with pid=9, OS id=2188
Fri Jun 23 11:08:21 2017
DBW0 started with pid=10, OS id=2192
Fri Jun 23 11:08:21 2017
LGWR started with pid=11, OS id=2196
Fri Jun 23 11:08:21 2017
CKPT started with pid=12, OS id=2200
Fri Jun 23 11:08:21 2017
SMON started with pid=13, OS id=2204
Fri Jun 23 11:08:21 2017
RECO started with pid=14, OS id=2208
Fri Jun 23 11:08:21 2017
MMON started with pid=15, OS id=2212
starting up 1 dispatcher(s) for network address '(ADDRESS=(PARTIAL=YES)(PROTOCOL=TCP))'...
Fri Jun 23 11:08:21 2017
MMNL started with pid=16, OS id=2216
starting up 1 shared server(s) ...
ORACLE_BASE from environment = C:\oracle_32
Fri Jun 23 11:08:22 2017
alter database mount exclusive
Successful mount of redo thread 1, with mount id 1475182246
Database mounted in Exclusive Mode
Lost write protection disabled
Completed: alter database mount exclusive
alter database open
Fri Jun 23 11:08:31 2017
Errors in file c:\oracle_32\diag\rdbms\orcl\orcl\trace\orcl_lgwr_2196.trc:
ORA-00338: log 3 of thread 1 is more recent than control file
ORA-00312: online log 3 thread 1: 'C:\ORACLE_32\ORADATA\ORCL\REDO03.LOG'
Errors in file c:\oracle_32\diag\rdbms\orcl\orcl\trace\orcl_lgwr_2196.trc:
ORA-00338: log 3 of thread 1 is more recent than control file
ORA-00312: online log 3 thread 1: 'C:\ORACLE_32\ORADATA\ORCL\REDO03.LOG'
Errors in file c:\oracle_32\diag\rdbms\orcl\orcl\trace\orcl_ora_2232.trc:
ORA-00338: log 1 of thread is more recent than control file
ORA-00312: online log 3 thread 1: 'C:\ORACLE_32\ORADATA\ORCL\REDO03.LOG'
USER (ospid: 2232): terminating the instance due to error 338
Fri Jun 23 11:08:34 2017
Instance terminated by USER, pid = 2232
Did you check the alert log of the database? That would be a good place to start looking.
Also, when logged in as sysdba, did you try to start the database with startup?
If yes, what is the error message, if any?
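When the alert log is long, counting the ORA- codes it contains is a quick way to see what the instance is actually complaining about. This is a minimal sketch with a hypothetical helper name; the sample text is copied from the alert log in the question.

```python
import re
from collections import Counter

ORA_RE = re.compile(r"\bORA-\d{5}\b")


def count_ora_errors(log_text):
    """Count how often each ORA-nnnnn code appears in the given alert log text."""
    return Counter(ORA_RE.findall(log_text))


if __name__ == "__main__":
    sample = r"""
Errors in file c:\oracle_32\diag\rdbms\orcl\orcl\trace\orcl_lgwr_2196.trc:
ORA-00338: log 3 of thread 1 is more recent than control file
ORA-00312: online log 3 thread 1: 'C:\ORACLE_32\ORADATA\ORCL\REDO03.LOG'
USER (ospid: 2232): terminating the instance due to error 338
"""
    for code, hits in count_ora_errors(sample).most_common():
        print(code, hits)
```

In the log posted above, the codes that surface are ORA-00338 and ORA-00312, both raised during "alter database open", just before the instance terminates.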

Flume agent - [tail -f /var/log/httpd/error_log] exited with 1

I am new to Flume. My Flume agent is not writing data to HDFS. Please help. Here is the configuration; its purpose is to read the Apache error log and store it in HDFS.
#identify the components on agent a1
a1.sources = apache_server
a1.sinks = hdfs_sink
a1.channels = c1
# Configure the source:
a1.sources.apache_server.type = exec
a1.sources.apache_server.command = tail -f /var/log/httpd/error_log
# Describe the sink:
a1.sinks.hdfs_sink.type = hdfs
a1.sinks.hdfs_sink.hdfs.path = hdfs://hadoop1.example.com:9000/Apache_Logs
a1.sinks.hdfs_sink.hdfs.writeFormat = Text
a1.sinks.hdfs_sink.hdfs.fileType = DataStream
a1.sinks.hdfs-sink.hdfs.rollInterval = 10
a1.sinks.hdfs_sink.hdfs.rollSize = 0
a1.sinks.hdfs-sink.hdfs.filePrefix=apacheaccess
# Configure a channel that buffers events in memory:
a1.channels.c1.type = memory
a1.channels.c1.capacity = 20000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel:
a1.sources.apache_server.channels = c1
a1.sinks.hdfs_sink.channel = c1
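One detail in the configuration above: the agent declares its sink as `hdfs_sink`, but the rollInterval and filePrefix lines are written against `hdfs-sink` (with a dash), so I would not expect them to apply to the declared sink. Whether that explains the missing HDFS data is an assumption. The sketch below uses a hypothetical helper name and lists property keys that refer to a component the agent never declares.

```python
def undeclared_components(config_text, agent="a1"):
    """Return property keys whose component name is not declared on the agent."""
    declared = {}  # e.g. {"sinks": {"hdfs_sink"}}
    used = []      # (kind, component name, full key)
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        parts = key.split(".")
        if parts[0] != agent or len(parts) < 2:
            continue
        if len(parts) == 2:
            # Declaration line such as "a1.sinks = hdfs_sink".
            declared[parts[1]] = set(value.split())
        else:
            used.append((parts[1], parts[2], key))
    return [key for kind, name, key in used if name not in declared.get(kind, set())]


if __name__ == "__main__":
    # A few lines copied from the configuration in the question.
    sample = """
    a1.sinks = hdfs_sink
    a1.sinks.hdfs_sink.type = hdfs
    a1.sinks.hdfs-sink.hdfs.rollInterval = 10
    a1.sinks.hdfs-sink.hdfs.filePrefix=apacheaccess
    """
    for key in undeclared_components(sample):
        print("not applied to any declared component:", key)
```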
