Nexus3 is active (exited) and not accessible - nexus

My Nexus3 instance got stuck due to an out-of-space issue. I cleaned up some (non-Nexus) directories and started it again; it now shows a status like this:
# service nexus status
● nexus.service - LSB: nexus
Loaded: loaded (/etc/init.d/nexus; generated)
Active: active (exited)
In the logs I can see the following:
2019-02-06 18:59:08,550+0100 ERROR [FelixStartLevel] *SYSTEM org.sonatype.nexus.extender.NexusContextListener - Failed to start nexus
com.orientechnologies.orient.core.exception.OStorageException: Cannot open local storage '/opt/nexus/sonatype-work/nexus3/db/component' with mode=rw
DB name="component"
at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.open(OAbstractPaginatedStorage.java:323)
at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.open(ODatabaseDocumentTx.java:259)
at org.sonatype.nexus.orient.DatabaseManagerSupport.connect(DatabaseManagerSupport.java:174)
at org.sonatype.nexus.orient.DatabaseInstanceImpl.doStart(DatabaseInstanceImpl.java:56)
at org.sonatype.goodies.lifecycle.LifecycleSupport.start(LifecycleSupport.java:104)
at org.sonatype.goodies.lifecycle.Lifecycles.start(Lifecycles.java:44)
at org.sonatype.nexus.orient.DatabaseManagerSupport.createInstance(DatabaseManagerSupport.java:306)
at java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1688)
at org.sonatype.nexus.orient.DatabaseManagerSupport.instance(DatabaseManagerSupport.java:285)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.ForEachOps$ForEachTask.compute(ForEachOps.java:291)
at java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:731)
at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
Caused by: java.io.FileNotFoundException: /opt/nexus/sonatype-work/nexus3/db/component/dirty.fl (Permission denied)
at java.io.RandomAccessFile.open0(Native Method)
at java.io.RandomAccessFile.open(RandomAccessFile.java:316)
But when I do an ls, it shows the file is there:
root@XXX:/opt/nexus/sonatype-work/nexus3/db/component# ls -ltrh dirty.fl
-rw-r--r-- 1 root root 2 Feb 6 19:05 dirty.fl
Any clue what went wrong?

Cannot open local storage '/opt/nexus/sonatype-work/nexus3/db/component' with mode=rw
The file is present, but NXRM can't open it in read-write mode. Since you already ran out of space on your disk, please ensure the disk isn't mounted read-only.
If you're still out of space, move the sonatype-work/nexus3/db/component directory to another location and create a symlink pointing to the new component directory. Keep performance in mind when choosing the new location.
To prevent this from happening in the future, try using Cleanup Policies and periodically running the Compact blob store task.
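A rough command sketch of those checks and the relocation step (paths are taken from the log above; the /data target directory and the nexus service user are assumptions, so adjust to your install):
findmnt -T /opt/nexus/sonatype-work      # OPTIONS column should show rw, not ro
df -h /opt/nexus/sonatype-work           # confirm space has actually been freed
service nexus stop
mv /opt/nexus/sonatype-work/nexus3/db/component /data/nexus3/component   # /data is a placeholder volume
ln -s /data/nexus3/component /opt/nexus/sonatype-work/nexus3/db/component
chown -R nexus:nexus /data/nexus3        # assuming Nexus runs as a dedicated "nexus" user
service nexus start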

Related

Artifactory service fails to start upon Fedora 35 reboot

I have installed jfrog-artifactory-oss (v7.31.11-73111900.x86_64) on Fedora 35 and enabled it as a system service to start at boot. But whenever I boot up my OS, the server never starts properly. I always need to kill the PID of the running Artifactory process; if I then do sudo service artifactory restart, it brings up the server cleanly and everything is good. How can I avoid having to do this little dance? Is there something about OS boot-up that is throwing Artifactory off?
When the server is not running properly after boot-up, I have looked at console.log and see entries like:
2022-01-27T08:35:38.383Z [shell] [INFO] [] [artifactoryManage.sh:69] [main] - Artifactory Tomcat already started
2022-01-27T08:35:43.084Z [jfac] [WARN] [d84d2d549b318495] [o.j.c.ExecutionUtils:165] [pool-9-thread-2] - Retry 900 Elapsed 7.56 minutes failed: Registration with router on URL http://localhost:8046 failed with error: UNAVAILABLE: io exception. Trying again
That shows that the server is not running properly, but doesn't give a clear idea of what to try next. Any suggestions?
Two things to check:
1. How the artifactory.service file in the systemd directory is set up.
2. Whenever the OS is rebooted, what error is seen in the logs; check all the logs.
Hint: from the warning shared, it seems the Router service is not able to start when the OS is rebooted, so whenever the issue comes up after a reboot, check router-service.log for any errors/warnings.
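A rough sketch of those two checks (the unit name and the default RPM-install log path are assumptions, adjust to your layout):
systemctl cat artifactory.service          # show the unit file systemd actually loads
systemctl status artifactory.service       # state right after boot
journalctl -u artifactory.service -b       # messages since the current boot
tail -n 200 /opt/jfrog/artifactory/var/log/router-service.log   # router errors, per the hint above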

Task fails due to not being able to read log file

Composer is failing a task because it can't read a log file; it complains about incorrect encoding.
Here's the log that appears in the UI:
*** Unable to read remote log from gs://bucket/logs/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** 'ascii' codec can't decode byte 0xc2 in position 6986: ordinal not in range(128)
*** Log file does not exist: /home/airflow/gcs/logs/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** Fetching from: http://airflow-worker-68dc66c9db-x945n:8793/log/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log
*** Failed to fetch log file from worker. HTTPConnectionPool(host='airflow-worker-68dc66c9db-x945n', port=8793): Max retries exceeded with url: /log/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f1c9ff19d10>: Failed to establish a new connection: [Errno -2] Name or service not known',))
I tried viewing the file in the Google Cloud console and it also throws an error:
Failed to load
Tracking Number: 8075820889980640204
But I am able to download the file via gsutil.
When I view the file, it seems to have text overlapping other text.
I can't show the entire file but it looks like this:
--------------------------------------------------------------------------------
Starting attempt 1 of 1
--------------------------------------------------------------------------------
#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:23,313] {models.py:1569} INFO - Executing <Task(BigQueryOperator): merge_campaign_exceptions> on 2019-08-03T10:00:00+00:00#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:23,314] {base_task_runner.py:124} INFO - Running: ['bash', '-c', u'airflow run __campaign_exceptions_0_0_1 merge_campaign_exceptions 2019-08-03T10:00:00+00:00 --job_id 22767 --pool _bq_pool --raw -sd DAGS_FOLDER//-campaign-exceptions.py --cfg_path /tmp/tmpyBIVgT']#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
[2019-08-04 10:01:24,658] {base_task_runner.py:107} INFO - Job 22767: Subtask merge_campaign_exceptions [2019-08-04 10:01:24,658] {settings.py:176} INFO - setting.configure_orm(): Using pool settings. pool_size=5, pool_recycle=1800#-#{"task-id": "merge_campaign_exceptions", "execution-date": "2019-08-03T10:00:00+00:00", "workflow": "__campaign_exceptions_0_0_1"}
The #-#{} pieces seem to sit "on top of" the normal log lines.
I faced the same problem. In my case, the problem was that I had removed the google_cloud_default connection that was being used to retrieve the logs.
Check the configuration and look for the connection name.
[core]
remote_log_conn_id = google_cloud_default
Then check that the credentials used for that connection have the right permissions to access the GCS bucket.
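If it helps, here is a hedged way to verify the connection and its permissions (the connection id comes from the config above; the bucket name is a placeholder, and the CLI syntax differs between Airflow versions):
airflow connections get google_cloud_default      # Airflow 2.x; on 1.10 use "airflow connections --list"
gsutil iam get gs://<your-logs-bucket>            # check the service account has an objectViewer/objectAdmin role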
I'm having a similar problem with viewing logs in GCP Cloud Composer, although it doesn't appear to be preventing the failing DAG task from running. It looks like a permissions error between GKE and the storage bucket where the log files are kept.
You can still view the logs by going into your cluster's storage bucket, in the same directory as your /dags folder, where you should also see a logs/ folder.
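For example, something like this should list and fetch the raw logs directly (the bucket name is a placeholder for your Composer environment's bucket; the DAG and task names are from the question above):
gsutil ls gs://<composer-env-bucket>/logs/campaign_exceptions_0_0_1/merge_campaign_exceptions/
gsutil cat gs://<composer-env-bucket>/logs/campaign_exceptions_0_0_1/merge_campaign_exceptions/2019-08-03T10:00:00+00:00/1.log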
Your Helm chart should set up a global env var:
- name: AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT
value: "google-cloud-platform://"
Then you should deploy a Dockerfile with the root account only (not the airflow account); additionally, set up your Helm uid and gid as:
uid: 50000 #airflow user
gid: 50000 #airflow group
Then upgrade the Helm chart with the new config.
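For instance, something along these lines (the release name, namespace, and chart are placeholders, assuming the official Apache Airflow chart):
helm upgrade my-airflow apache-airflow/airflow -n airflow -f values.yaml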
*** Unable to read remote log from gs://bucket
1) Found the solution after assigning the right roles to the service account.
2) The SA key (JSON or txt) has to be added and configured on the connection referenced by remote_log_conn_id = google_cloud_default.
3) Restart the Airflow scheduler and webserver.
4) Restart the DAGs in Airflow.
You can then find the logs in the GCS bucket where they are configured.
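A hedged sketch of steps 1-3 (project, service-account, and role names are placeholders; grant whatever role your setup actually needs):
gcloud projects add-iam-policy-binding my-project \
  --member="serviceAccount:airflow-logging@my-project.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"
# attach the JSON key to the google_cloud_default connection (Admin -> Connections in the UI),
# then bounce the services:
systemctl restart airflow-scheduler airflow-webserver    # or however your deployment restarts them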

Airflow: How to set up the log directory?

I uploaded a DAG file to the web page, and when I click 'Graph View' -> ${my_dag} -> 'View Log', it shows:
*** Log file isn't local.
*** Fetching here: http://:8793/log/demo_dag/hello_task/2018-11-14T15:06:00
*** Failed to fetch log file from worker.
*** Reading remote logs...
*** Unsupported remote log location.
I have checked airflow.cfg and found this config:
worker_log_server_port = 8793
base_log_folder = /root/airflow/logs
My questions are:
How do I set up the IP address for the log service (only the port is set up)?
I have set up a directory for the log service, so why does it still go to /log/..?
Any help is appreciated.
This can happen when the task status was manually changed (likely through the "Mark Success" option) and the task never receives a hostname value on the record.
The webserver is attempting to reach out to a server, with no name, to get logs for a task that never ran.
PS: Be careful running processes as the root user.
I've been getting this error and fixed it by correcting the socket volume path:
WARNING - OSError while attempting to symlink the latest log directory
On Windows the volume has to use a double slash, like this:
volumes:
- //var/run/docker.sock:/var/run/docker.sock
Bind to docker socket on Windows
Setting up Airflow to run with Docker Swarm’s orchestration

Cloudify: "Could not determine the type of file "sftp://root:***@192.168.10.xxx/root/gs-files"."

I am attempting to bootstrap a private OpenStack cloud using Cloudify 2.7.1. It boots the Linux instance correctly but fails at "Uploading files to 192.168.10.XXX." due to an SFTP problem: "Could not determine the type of file "sftp://root:***@192.168.10.xxx/root/gs-files"."
I can access the instance using ssh (there is no problem with the connection). I tried with other images (CentOS, Ubuntu, CirrOS, ...) but always get the same error!
Can anyone help me, please?
I attached a screenshot of the network topology created by Cloudify, and the stack trace.
Full stack trace:
2015-04-30 10:26:27,470 INFO [org.cloudifysource.shell.commands.AbstractGSCommand] - Setting security profile to "nonsecure".
2015-04-30 10:26:27,589 INFO [org.cloudifysource.shell.commands.AbstractGSCommand] - Bootstrapping cloud openstack-havana. This may take a few minutes.
2015-04-30 10:26:27,677 INFO [org.cloudifysource.esc.driver.provisioning.BaseProvisioningDriver] - Setup network configuration for managers
2015-04-30 10:26:27,677 INFO [org.cloudifysource.esc.driver.provisioning.BaseProvisioningDriver] - Using management network : Cloudify-Management-Network
2015-04-30 10:26:51,536 INFO [org.cloudifysource.esc.shell.listener.CliAgentlessInstallerListener] - Attempting to access Management VM 192.168.10.241.
2015-04-30 10:27:10,551 INFO [org.cloudifysource.esc.shell.listener.CliAgentlessInstallerListener] - Uploading files to 192.168.10.241.
2015-04-30 10:27:15,708 WARNING [com.jcraft.jsch] - Permanently added '192.168.10.241' (RSA) to the list of known hosts.
2015-04-30 10:27:25,998 INFO [org.cloudifysource.esc.shell.installer.CloudGridAgentBootstrapper] - Failed accessing management VM 192.168.10.241 Reason: Failed to set up file transfer: Unknown message with code "Could not determine the type of file "sftp://cirros@192.168.10.241/cirros/gs-files".".; Caused by: org.cloudifysource.esc.installer.InstallerException: Failed to set up file transfer: Unknown message with code "Could not determine the type of file "sftp://cirros@192.168.10.241/cirros/gs-files".".
2015-04-30 10:27:26,210 INFO [org.cloudifysource.esc.driver.provisioning.openstack.OpenStackCloudifyDriver] - Deleting Floating ip: FloatingIp[floatingNetworkId=15578898-5e6b-44d9-a73a-1328ca6ea140,floatingIpAddress=192.168.10.241,portId=4b8dc211-12e8-4383-8799-f783d2786e98,id=593d8424-cfec-41ed-8204-ed8609366416]
2015-04-30 10:27:29,607 SEVERE [org.cloudifysource.shell.commands.AbstractGSCommand] - Failed to set up file transfer: Unknown message with code "Could not determine the type of file "sftp://cirros@192.168.10.241/cirros/gs-files".". : org.cloudifysource.esc.installer.InstallerException: Failed to set up file transfer: Unknown message with code "Could not determine the type of file "sftp://cirros@192.168.10.241/cirros/gs-files".".
at org.cloudifysource.esc.installer.filetransfer.VfsFileTransfer.initialize(VfsFileTransfer.java:206)
at org.cloudifysource.esc.installer.AgentlessInstaller.uploadFilesToServer(AgentlessInstaller.java:306)
at org.cloudifysource.esc.installer.AgentlessInstaller.installOnMachineWithIP(AgentlessInstaller.java:210)
at org.cloudifysource.esc.shell.installer.CloudGridAgentBootstrapper$1.call(CloudGridAgentBootstrapper.java:865)
at org.cloudifysource.esc.shell.installer.CloudGridAgentBootstrapper$1.call(CloudGridAgentBootstrapper.java:860)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: org.apache.commons.vfs2.FileSystemException: Unknown message with code "Could not determine the type of file "sftp://cirros@192.168.10.241/cirros/gs-files".".
at org.apache.commons.vfs2.provider.sftp.SftpFileObject.refresh(SftpFileObject.java:95)
at org.apache.commons.vfs2.provider.AbstractFileSystem.resolveFile(AbstractFileSystem.java:366)
at org.apache.commons.vfs2.provider.AbstractFileSystem.resolveFile(AbstractFileSystem.java:317)
at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:85)
at org.apache.commons.vfs2.provider.AbstractOriginatingFileProvider.findFile(AbstractOriginatingFileProvider.java:65)
at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:693)
at org.apache.commons.vfs2.impl.DefaultFileSystemManager.resolveFile(DefaultFileSystemManager.java:621)
at org.cloudifysource.esc.installer.filetransfer.VfsFileTransfer.resolveTargetDirectory(VfsFileTransfer.java:218)
at org.cloudifysource.esc.installer.filetransfer.VfsFileTransfer.initialize(VfsFileTransfer.java:203)
... 8 more
Caused by: org.apache.commons.vfs2.FileSystemException: Could not determine the type of file "sftp://cirros@192.168.10.241/cirros/gs-files".
at org.apache.commons.vfs2.provider.AbstractFileObject.getType(AbstractFileObject.java:505)
at org.apache.commons.vfs2.provider.sftp.SftpFileObject.refresh(SftpFileObject.java:91)
... 16 more
Caused by: org.apache.commons.vfs2.FileSystemException: Could not connect to SFTP server at "sftp://cirros@192.168.10.241/".
at org.apache.commons.vfs2.provider.sftp.SftpFileSystem.getChannel(SftpFileSystem.java:153)
at org.apache.commons.vfs2.provider.sftp.SftpFileObject.statSelf(SftpFileObject.java:151)
at org.apache.commons.vfs2.provider.sftp.SftpFileObject.doGetType(SftpFileObject.java:114)
at org.apache.commons.vfs2.provider.AbstractFileObject.getType(AbstractFileObject.java:496)
... 17 more
Caused by: com.jcraft.jsch.JSchException: java.io.IOException: Pipe closed
at com.jcraft.jsch.ChannelSftp.start(ChannelSftp.java:288)
at com.jcraft.jsch.Channel.connect(Channel.java:152)
at com.jcraft.jsch.Channel.connect(Channel.java:145)
at org.apache.commons.vfs2.provider.sftp.SftpFileSystem.getChannel(SftpFileSystem.java:130)
... 20 more
Caused by: java.io.IOException: Pipe closed
at java.io.PipedInputStream.read(PipedInputStream.java:308)
at java.io.PipedInputStream.read(PipedInputStream.java:378)
at com.jcraft.jsch.ChannelSftp.fill(ChannelSftp.java:2665)
at com.jcraft.jsch.ChannelSftp.header(ChannelSftp.java:2691)
at com.jcraft.jsch.ChannelSftp.start(ChannelSftp.java:257)
... 23 more
Looks like you are trying to sftp into a cirros instance - I am not sure cirros even supports sftp. You can try this by using the sftp command line utility.
In general, sftp has to be configured and available on the target machine.
You can try using the SCP file transfer mode by setting this in your compute template:
fileTransfer org.cloudifysource.domain.cloud.FileTransferModes.SCP
If you are really using cirros, I suspect bootstrapping will fail. Cloudify was never tested on cirros. I think cirros is lacking some very basic utilities (I think it is not running bash. Not sure if it has wget). Cirros was never meant as a generic distribution - it is meant for testing your cloud's basic functionality.
One more thing - Cloudify 2 has reached End-of-Life - it is no longer supported. You should check out Cloudify 3.
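For example, a quick manual check from the Cloudify machine (user and IP are taken from the trace above):
sftp cirros@192.168.10.241                     # if this fails, the image has no usable SFTP subsystem
ssh cirros@192.168.10.241 'which bash wget'    # rough check for the basic utilities mentioned above
# if SFTP is the problem, switch the compute template to SCP as suggested:
#   fileTransfer org.cloudifysource.domain.cloud.FileTransferModes.SCP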

OpenLDAP on Windows 7 not starting due to unclean shutdown detected

OpenLDAP was running on the laptop when the battery ran out and Windows 7 shut down. After restarting Windows 7, I tried to start OpenLDAP and got the following error.
I tried to see if there is any lock file, and searched the internet/Google for information, but none of the results gave a good answer.
53021aca backend_startup_one: starting "dc=my-domain,dc=com"
53021aca bdb_db_open: "dc=my-domain,dc=com"
53021aca bdb_db_open: database "dc=my-domain,dc=com": unclean shutdown detected; attempting recovery.
53021aca bdb_db_open: database "dc=my-domain,dc=com": dbenv_open(../var/openldap-data).
53021aca bdb_db_open: database "dc=my-domain,dc=com": alock_recover failed
53021aca ====> bdb_cache_release_all
53021aca bdb_db_close: database "dc=my-domain,dc=com": alock_close failed
53021aca backend_startup_one (type=bdb, suffix="dc=my-domain,dc=com"): bi_db_open failed! (-1)
53021aca slapd shutdown: initiated
53021acb ====> bdb_cache_release_all
53021acb bdb_db_close: database "dc=my-domain,dc=com": alock_close failed
53021acb slapd destroy: freeing system resources.
53021acb slapd stopped.
Above are the logs from the OpenLDAP server.
Delete the alock file.
I just solved my own problem with the exact same error log.
Go to your LDAP install directory's /var/openldap-data folder; there should be a file named alock. Delete this file and start your LDAP server again.
You are welcome.
Also try this, from http://www.zytrax.com/:
Select the DOS window in which it is running and type CTRL-C; the server will stop and you will be offered the prompt "Terminate Batch Job?". Typing y to this prompt will close the window.
If this procedure is not followed (for example, you closed your PC without terminating the LDAP server) the server will probably subsequently refuse to start. If this is the case, navigate to the directory c:\openldap\var\run and delete any files in this directory (slapd.args and slapd.pid). The server should now restart. Failing this, look at the log file (by default in \var\log).
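Putting both suggestions together, a rough sequence for a default c:\openldap install (the paths are assumptions based on the quote above; adjust to where OpenLDAP is actually installed):
rem delete the stale lock left by the unclean shutdown
cd c:\openldap\var\openldap-data
del alock
rem delete the stale pid/args files from the previous run
cd c:\openldap\var\run
del slapd.args slapd.pid
rem then start slapd again the usual way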

Resources