Cloudera Manager is not starting

I am trying to start the Cloudera cluster after restarting the machine, but the server is not starting.
I am getting the error below in the cloudera-scm-server logs:
2014-12-23 21:29:26,870 WARN [Task-Thread-for-com.mchange.v2.async.ThreadPerTaskAsynchronousRunner#2e39060b:resourcepool.BasicResourcePool#1841] com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask#6ec135d6 -- Acquisition Attempt Failed!!! Clearing pending acquires. While trying to acquire a needed new resource, we failed to succeed more than the maximum number of allowed acquisition attempts (30). Last acquisition attempt exception:
org.postgresql.util.PSQLException: FATAL: no pg_hba.conf entry for host "192.168.6.109", user "scm", database "scm", SSL off
at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:291)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:108)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:66)
at org.postgresql.jdbc2.AbstractJdbc2Connection.<init>(AbstractJdbc2Connection.java:125)
at org.postgresql.jdbc3.AbstractJdbc3Connection.<init>(AbstractJdbc3Connection.java:30)
at org.postgresql.jdbc3g.AbstractJdbc3gConnection.<init>(AbstractJdbc3gConnection.java:22)
at org.postgresql.jdbc4.AbstractJdbc4Connection.<init>(AbstractJdbc4Connection.java:30)
at org.postgresql.jdbc4.Jdbc4Connection.<init>(Jdbc4Connection.java:24)
at org.postgresql.Driver.makeConnection(Driver.java:393)
at org.postgresql.Driver.connect(Driver.java:267)
at com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:135)
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:182)
at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:171)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:137)
at com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1014)
at com.mchange.v2.resourcepool.BasicResourcePool.access$800(BasicResourcePool.java:32)
at com.mchange.v2.resourcepool.BasicResourcePool$AcquireTask.run(BasicResourcePool.java:1810)
at com.mchange.v2.async.ThreadPerTaskAsynchronousRunner$TaskThread.run(ThreadPerTaskAsynchronousRunner.java:255)
I tried changing the permissions of the db-data folder to 700 and also dropped the SCHEMA_VERSION table as per this link, but no luck.
EDIT
In the DB log at /var/log/cloudera-scm-server/db.log I got the following FATAL error:
FATAL: no pg_hba.conf entry for host "192.168.6.109", user "scm", database "scm", SSL off

I found the solution to this problem.
One of the other applications I am running had updated the hosts file, so it ended up with two entries for localhost (a broken hosts file). After fixing the hosts file, the problem was resolved.
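A hosts file broken in this way can be spotted with a short script. The following is only a minimal sketch: the function name `find_conflicting_hosts` and the sample entries are made up for illustration, and it merely flags hostnames that are mapped to more than one address of the same IP version.

```python
import ipaddress
from collections import defaultdict

def find_conflicting_hosts(hosts_text):
    """Return {(hostname, ip_version): [addresses]} for names mapped to several addresses."""
    seen = defaultdict(list)
    for raw in hosts_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # skip lines that do not start with a valid address
        for name in fields[1:]:
            key = (name.lower(), addr.version)
            if str(addr) not in seen[key]:
                seen[key].append(str(addr))
    return {key: addrs for key, addrs in seen.items() if len(addrs) > 1}

# Hypothetical hosts file content; the third line stands in for the stray entry.
sample = """\
127.0.0.1   localhost
::1         localhost
192.168.6.109 localhost   # added by another application
"""
conflicts = find_conflicting_hosts(sample)
print(conflicts)
```

To check a real machine, pass the content of /etc/hosts instead of `sample`. The IPv4 and IPv6 loopback entries are deliberately not treated as a conflict with each other.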


Task fails due to not being able to read log file

Composer is failing a task because it is unable to read a log file; it complains about incorrect encoding.
Here's the log that appears in the UI:
*** Unable to read remote log from gs://bucket/logs/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** 'ascii' codec can't decode byte 0xc2 in position 6986: ordinal not in range(128)
*** Log file does not exist: /home/airflow/gcs/logs/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** Fetching from: http://airflow-worker-68dc66c9db-x945n:8793/log/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='airflow-worker-68dc66c9db-x945n', port=8793): Max retries exceeded with url: /log/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1c9ff19d10>: Failed to establish a new connection: [Errno -2] Name or service not known',))
When I try viewing the file in the Google Cloud console, it also throws an error:
Failed to load
Tracking Number: 8075820889980640204
But I am able to download the file via gsutil.
When I view the file, it seems to have text overwriting other text.
I can't show the entire file, but it looks like this:
--------------------------------------------------------------------------------
Starting attempt 1 of 1
--------------------------------------------------------------------------------
#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:23,313] {models.py:1569} INFO - Executing <Task(BigQueryOperator): merge_campaign_exceptions> on 2019-08-03T10:00:00+00:00#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:23,314] {base_task_runner.py:124} INFO - Running: ['bash', '-c', u'airflow run __campaign_exceptions_0_0_1 merge_campaign_exceptions 2019-08-03T10:00:00+00:00 --job_id 22767 --pool _bq_pool --raw -sd DAGS_FOLDER//-campaign-exceptions.py --cfg_path /tmp/tmpyBIVgT']#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:24,658] {base_task_runner.py:107} INFO - Job 22767: Subtask merge_campaign_exceptions [2019-08-04 10:01:24,658] {settings.py:176} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
The #-#{} pieces seem to be "on top of" the typical log.
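To look at a downloaded copy of such a file, the trailing metadata can be separated from each log line. This is a rough sketch under two assumptions that the thread does not confirm: that the file is UTF-8 encoded (the UI error only mentions byte 0xc2), and that everything after the last `#-#` on a line is a JSON object. The helper names and the abbreviated sample line are illustrative.

```python
import json

MARKER = "#-#"  # separator seen in the log excerpt above

def split_log_line(line):
    """Return (message, metadata); metadata is None when no valid JSON suffix is present."""
    message, sep, tail = line.rpartition(MARKER)
    if not sep:
        return line, None
    try:
        return message, json.loads(tail)
    except ValueError:  # suffix was not valid JSON, keep the line untouched
        return line, None

def decode_log(raw_bytes):
    # Assumption: the file is UTF-8; undecodable bytes become U+FFFD instead of raising.
    return raw_bytes.decode("utf-8", errors="replace")

# Abbreviated sample in the shape of the excerpt above.
raw = (
    b'[2019-08-04 10:01:23,313] {models.py:1569} INFO - Executing task'
    b'#-#{"task-id": "merge_campaign_exceptions"}\n'
)
for line in decode_log(raw).splitlines():
    message, metadata = split_log_line(line)
    print(message, metadata)
```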
I faced the same problem. In my case the problem was that I had removed the google_cloud_default connection that was being used to retrieve the logs.
Check the configuration and look for the connection name:
[core]
remote_log_conn_id = google_cloud_default
Then check that the credentials used for that connection have the right permissions to access the GCS bucket.
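The connection name can also be read programmatically. Below is a minimal sketch using only the standard library; the helper name and the inline sample config are illustrative, and checking a `[logging]` section as well is an assumption made to cover newer Airflow releases, which keep the option there.

```python
import configparser

def remote_log_conn_id(cfg_text):
    """Return the remote_log_conn_id value from airflow.cfg text, or None if unset."""
    # interpolation=None keeps configparser from tripping over '%' in other options.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    for section in ("core", "logging"):
        value = parser.get(section, "remote_log_conn_id", fallback=None)
        if value:
            return value
    return None

sample_cfg = """\
[core]
remote_log_conn_id = google_cloud_default
"""
conn_id = remote_log_conn_id(sample_cfg)
print(conn_id)
```

The returned name is the connection whose credentials need read access to the log bucket.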
I'm having a similar problem with viewing logs in GCP Cloud Composer. It doesn't appear to prevent the failing DAG task from running, though. It looks like a permissions error between the GKE cluster and the storage bucket where the log files are kept.
You can still view the logs by going into your cluster's storage bucket: in the same directory as your /dags folder you should also see a logs/ folder.
Your Helm chart should set up a global env entry:
- name: AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT
value: "google-cloud-platform://"
Then deploy a Dockerfile that uses the root account only (not the airflow account). Additionally, set the uid and gid in your Helm configuration as:
uid: 50000 #airflow user
gid: 50000 #airflow group
Then upgrade the Helm chart with the new config.
*** Unable to read remote log from gs://bucket
1) I found the solution after assigning the roles to the service account.
2) Add the SA key (json or txt) and configure it on the connection named in
remote_log_conn_id = google_cloud_default
3) Restart the Airflow scheduler and webserver.
4) Restart the DAGs in Airflow.
You can then find the logs in the GCS bucket where logging is configured.

Nexus3 is active(exited) and not accessible

My Nexus3 got stuck because the disk ran out of space. I cleaned some (non-Nexus) directories and started it, and it shows a status like the one below:
# service nexus status;
● nexus.service - LSB: nexus
Loaded: loaded (/etc/init.d/nexus; generated)
Active: active (exited)
In the logs I can see the following:
2019-02-06 18:59:08,550+0100 ERROR [FelixStartLevel] *SYSTEM org.sonatype.nexus.extender.NexusContextListener - Failed to start nexus
com.orientechnologies.orient.core.exception.OStorageException: Cannot open local storage '/opt/nexus/sonatype-work/nexus3/db/component' with mode=rw
DB name="component"
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.open(OAbstractPaginatedStorage.java:323)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.open(ODatabaseDocumentTx.java:259)
at org.sonatype.nexus.orient.DatabaseManagerSupport.connect(DatabaseManagerSupport.java:174)
at org.sonatype.nexus.orient.DatabaseInstanceImpl.doStart(DatabaseInstanceImpl.java:56)
at org.sonatype.goodies.lifecycle.LifecycleSupport.start(LifecycleSupport.java:104)
at org.sonatype.goodies.lifecycle.Lifecycles.start(Lifecycles.java:44)
at org.sonatype.nexus.orient.DatabaseManagerSupport.createInstance(DatabaseManagerSupport.java:306)
at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1688)
at org.sonatype.nexus.orient.DatabaseManagerSupport.instance(DatabaseManagerSupport.java:285)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:291)
at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.io.FileNotFoundException: /opt/nexus/sonatype-work/nexus3/db/component/dirty.fl (Permission denied)
at java.io.RandomAccessFile.open0(Native Method)
at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
But when I do ls, it shows the file is there:
root@XXX:/opt/nexus/sonatype-work/nexus3/db/component# ls -ltrh dirty.fl
-rw-r--r-- 1 root root 2 Feb 6 19:05 dirty.fl
Any clue what is going wrong?
Cannot open local storage '/opt/nexus/sonatype-work/nexus3/db/component' with mode=rw
The file is present, but NXRM can't open it in read-write mode. Since you have already run out of space on your disk, please make sure the disk isn't mounted in read-only mode.
If you're still out of space, move the sonatype-work/nexus3/db/component directory to another location and create a symlink that points to the new component directory. Keep performance in mind when choosing the new location.
To prevent this from happening in the future, try using Cleanup Policies and periodically running the Compact blob store task.
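The checks mentioned above (read-only mount, free space, write access) can be gathered in one place. This is a sketch for Unix-like systems only; `storage_report` is a made-up helper name, and the write-access result is only meaningful when the script runs as the same account as Nexus. The listing in the question shows dirty.fl owned by root, so ownership may be worth comparing as well.

```python
import os
import shutil
import tempfile

def storage_report(path):
    """Collect basic facts about why a path might not open in read-write mode."""
    vfs = os.statvfs(path)
    return {
        # True when the filesystem holding the path is mounted read-only.
        "read_only_mount": bool(vfs.f_flag & os.ST_RDONLY),
        # Whether the current process user may write to the path.
        "writable_by_current_user": os.access(path, os.W_OK),
        # Free bytes on the filesystem holding the path.
        "free_bytes": shutil.disk_usage(path).free,
        # Numeric owner of the path, to compare with the service account.
        "owner_uid": os.stat(path).st_uid,
    }

# Demonstration on the temp directory; on the Nexus host, pass the
# db/component directory from the error message instead.
report = storage_report(tempfile.gettempdir())
print(report)
```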

Can I increase download timeout in Artifactory?

I'm getting an error like this:
2018-01-16 09:56:17,354 [http-nio-8081-exec-8] [ERROR] (o.a.r.RemoteRepoBase:772) - IO error while trying to download resource 'pp-libs:ru/programpark/vector/10/vector-10.zip': org.apache.http.conn.ConnectTimeoutException: Connect to 192.168.3.20:8111 [/192.168.3.20] failed: connect timed out
in the Artifactory log.
The file is half a gigabyte in size and the channel to the remote repo is not very wide.
The remote repo is itself an Artifactory instance.
I'm not sure which side closes the connection.
You need to increase the Socket Timeout in the Network Settings. See https://www.jfrog.com/confluence/display/RTF/Advanced+Settings

openstack: Failed to launch instance from the glance

We have set up OpenStack using conjure-up on a single machine (Ubuntu Server 16.04.3 LTS). All services are up and running, and I am able to upload images to Glance successfully.
We wanted to store the Glance images created by "glance image-create" on a remote machine that runs an NFS server, so we configured glance-api.conf as below.
My glance-api.conf looks like this:
[glance_store]
filesystem_store_datadir = /var/lib/glance/images
default_store = file
And on the Glance controller node, I have mounted
<remote machine IP>/home/glance/images/
on this directory path:
/var/lib/glance/images
and have set the same mounted directory path in glance-api.conf.
I have created two sample private networks (192.168.1.0 and 10.221.50.0) but have not created a public network, as at the moment I don't need to access this VM instance from outside.
When I try to launch the instance from the dashboard UI or through the CLI, I get the error below.
Error: Failed to perform requested operation on instance "Ubuntu_Hawkbit", the instance has an error status: Please try again later [Error: No valid host was found. There are not enough hosts available.].
Note: I have tried associating the instance with a different private network, thinking it might be a network IP address issue, but I face the same error.
When I check /var/log/nova/nova-compute.log, I see the errors below.
ERROR nova.image.glance [req-1459f1b2-491c-46a2-b803-6ff621a79d30 6ebc7996240c4ce688234f544c9d0116 07427c9d49704357a049b24193ee0a28 - - -] Error contacting glance server 'http://10.206.193.159:9292' for 'data', done trying.
ERROR nova.image.glance CommunicationError: Error finding address for http://10.206.193.159:9292/v1/images/6c30e2ab-1078-45ad-bed2-3e3a75f6af8c: ('Connection aborted.', BadStatusLine("''",))
ERROR nova_lxd.nova.virt.lxd.image [req-1459f1b2-491c-46a2-b803-6ff621a79d30 6ebc7996240c4ce688234f544c9d0116 07427c9d49704357a049b24193ee0a28 - - -] [instance: eedc008d-ef34-498d-8774-b3813ce032f4] Failed to upload 6c30e2ab-1078-45ad-bed2-3e3a75f6af8c to LXD: Connection to glance
ERROR nova_lxd.nova.virt.lxd.operations [req-1459f1b2-491c-46a2-b803-6ff621a79d30 6ebc7996240c4ce688234f544c9d0116 07427c9d49704357a049b24193ee0a28 - - -] [instance: eedc008d-ef34-498d-8774-b3813ce032f4] Faild to start container instance-00000020: Connection to glance host http://10.206.193.159:9292 failed: Error finding address for http://10.206.193.159:9292/v1/images/6c30e2ab-1078-45ad-bed2-3e3a75f6af8c: ('Connection aborted.', BadStatusLine("''",))
ERROR nova.compute.manager [req-1459f1b2-491c-46a2-b803-6ff621a79d30 6ebc7996240c4ce688234f544c9d0116 07427c9d49704357a049b24193ee0a28 - - -] [instance: eedc008d-ef34-498d-8774-b3813ce032f4] Instance failed to spawn

OpenLDAP on Windows 7 not starting due to unclean shutdown detected

OpenLDAP was running on the laptop when the battery ran out and Windows 7 shut down. After restarting Windows 7, I tried to start OpenLDAP and got the following error.
I searched the internet for information about a lock or anything similar, but none of the results gave a good answer.
53021aca backend_startup_one: starting "dc=my-domain,dc=com"
53021aca bdb_db_open: "dc=my-domain,dc=com"
53021aca bdb_db_open: database "dc=my-domain,dc=com": unclean shutdown detected; attempting recovery.
53021aca bdb_db_open: database "dc=my-domain,dc=com": dbenv_open(../var/openldap-data).
53021aca bdb_db_open: database "dc=my-domain,dc=com": alock_recover failed
53021aca ====> bdb_cache_release_all
53021aca bdb_db_close: database "dc=my-domain,dc=com": alock_close failed
53021aca backend_startup_one (type=bdb, suffix="dc=my-domain,dc=com"): bi_db_open failed! (-1)
53021aca slapd shutdown: initiated
53021acb ====> bdb_cache_release_all
53021acb bdb_db_close: database "dc=my-domain,dc=com": alock_close failed
53021acb slapd destroy: freeing system resources.
53021acb slapd stopped.
Above are the logs from the OpenLDAP server.
Delete the alock file
I just solved my own problem with the exact same error log.
Go to your LDAP data directory, /var/openldap-data.
There should be a file named alock. Delete this file, then start your LDAP server.
You are welcome.
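The step above can be scripted. The sketch below deliberately swaps the delete for a rename to alock.bak so the file can be restored if needed; the function name is made up, the data directory path must be adjusted to your installation, and slapd should be stopped before running it.

```python
import tempfile
from pathlib import Path

def set_aside_alock(data_dir):
    """Rename <data_dir>/alock to alock.bak; return the new path, or None if absent."""
    lock = Path(data_dir) / "alock"
    if not lock.is_file():
        return None
    backup = lock.with_name("alock.bak")
    lock.replace(backup)  # overwrites an older alock.bak if one exists
    return backup

# Demonstration in a throwaway directory with an empty stand-in alock file.
with tempfile.TemporaryDirectory() as demo_dir:
    (Path(demo_dir) / "alock").write_bytes(b"")
    moved = set_aside_alock(demo_dir)
    demo_result = (moved.name, (Path(demo_dir) / "alock").exists())
print(demo_result)
```

Once the server starts cleanly again, alock.bak can be removed by hand.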
Also try this, from http://www.zytrax.com/:
select the dos window in which it is running and type CTRL-C, the server will stop and you will be offered a prompt Terminate Batch Job?, typing y to this prompt will close the window.
If this procedure is not followed (for example you closed your PC without terminating the LDAP server) the server will probably subsequently refuse to start. If this is the case navigate to the directory c:\openldap\var\run and delete any files in this directory (slapd.args and slapd.pid). The server should now restart. Failing this look at the log file (default in \var\log).
