Fix a corrupted NPM database in Nexus?

Our Nexus server filled its disk, and when we restarted it I got the following error:
jvm 1 | 2015-08-18 09:44:13,660+1000 ERROR [jetty-main-1] *SYSTEM com.bolyuba.nexus.plugin.npm.service.internal.orient.OrientMetadataStore - Life-cycle operation failed
jvm 1 | com.orientechnologies.orient.core.exception.OStorageException: Cannot open local storage '/nexus/db/npm' with mode=rw
jvm 1 | at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.open(OAbstractPaginatedStorage.java:220) ~[nexus-npm-repository-plugin-2.11.3-01/:na]
jvm 1 | at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.open(ODatabaseDocumentTx.java:244) ~[nexus-npm-repository-plugin-2.11.3-01/:na]
jvm 1 | at com.bolyuba.nexus.plugin.npm.service.internal.orient.OrientMetadataStore.doStart(OrientMetadataStore.java:107) ~[nexus-npm-repository-plugin-2.11.3-01/:na]
jvm 1 | at org.sonatype.sisu.goodies.lifecycle.LifecycleSupport$Handler.doStart(LifecycleSupport.java:70) ~[goodies-lifecycle-1.9.jar:1.9]
jvm 1 | at org.sonatype.sisu.goodies.lifecycle.LifecycleHandlerContext$MainMap_Starting.started(LifecycleHandlerContext.java:255) ~[goodies-lifecycle-1.9.jar:1.9]
jvm 1 | at org.sonatype.sisu.goodies.lifecycle.LifecycleHandlerContext.started(LifecycleHandlerContext.java:57) ~[goodies-lifecycle-1.9.jar:1.9]
jvm 1 | at org.sonatype.sisu.goodies.lifecycle.LifecycleSupport.start(LifecycleSupport.java:129) ~[goodies-lifecycle-1.9.jar:1.9]
jvm 1 | at com.bolyuba.nexus.plugin.npm.service.internal.orient.OrientMetadataStoreLifecycle.on(OrientMetadataStoreLifecycle.java:51) [nexus-npm-repository-plugin-2.11.3-01/:na]
jvm 1 | Caused by: com.orientechnologies.orient.core.exception.OStorageException: File with name internal.pcl does not exist in storage npm
jvm 1 | at com.orientechnologies.orient.core.index.hashindex.local.cache.OWOWCache.openFile(OWOWCache.java:249) ~[nexus-npm-repository-plugin-2.11.3-01/:na]
jvm 1 | at com.orientechnologies.orient.core.index.hashindex.local.cache.OReadWriteDiskCache.openFile(OReadWriteDiskCache.java:159) ~[nexus-npm-repository-plugin-2.11.3-01/:na]
jvm 1 | at com.orientechnologies.orient.core.storage.impl.local.paginated.base.ODurableComponent.openFile(ODurableComponent.java:145) ~[nexus-npm-repository-plugin-2.11.3-01/:na]
jvm 1 | at com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster.open(OPaginatedCluster.java:203) ~[nexus-npm-repository-plugin-2.11.3-01/:na]
jvm 1 | at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.open(OAbstractPaginatedStorage.java:190) ~[nexus-npm-repository-plugin-2.11.3-01/:na]
jvm 1 | ... 61 common frames omitted
How do I fix this?

I couldn't find any info on this error as it relates to Nexus specifically, so I thought I would share my solution.
First, shut down Nexus and remove the db/npm directory (the /nexus/db/npm path shown in the error above). Then start Nexus again and run the Rebuild hosted npm metadata task. That fixed the issue for me.
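For reference, here's a minimal command-line sketch of those steps, assuming a stock Nexus 2.x install where the OrientDB files live under /nexus/db/npm as in the error above and Nexus runs as a service named nexus; adjust paths and service names to your layout:
# stop Nexus and move the corrupted OrientDB npm store out of the way rather than deleting it outright
service nexus stop
mv /nexus/db/npm /nexus/db/npm.corrupt
# start Nexus again; it recreates an empty npm metadata database on startup
service nexus start
After that, run the Rebuild hosted npm metadata task from the scheduled tasks screen in the Nexus UI so the metadata is regenerated from the packages on disk.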

Related

Flyway validate failed: detected resolved migration not applied to database

I have created two migration SQL files for my Java web application: V1 and V2.
Upon the first start of the application, everything looks and works fine. Here's what I'm seeing in the log:
23-Aug-2020 09:31:35.923 INFO [localhost-startStop-1] org.flywaydb.core.internal.command.DbMigrate.info Current version of schema `cloudregs`: << Empty Schema >>
23-Aug-2020 09:31:36.074 INFO [localhost-startStop-1] org.flywaydb.core.internal.command.DbMigrate.info Migrating schema `cloudregs` to version 1 - initial
23-Aug-2020 09:31:36.118 WARNING [localhost-startStop-1] org.flywaydb.core.internal.sqlscript.DefaultSqlScriptExecutor.warn DB: Changing sql mode 'NO_AUTO_CREATE_USER' is deprecated. It will be removed in a future release. (SQL State: HY000 - Error Code: 3090)
23-Aug-2020 09:31:37.312 WARNING [localhost-startStop-1] org.flywaydb.core.internal.sqlscript.DefaultSqlScriptExecutor.warn DB: Changing sql mode 'NO_AUTO_CREATE_USER' is deprecated. It will be removed in a future release. (SQL State: HY000 - Error Code: 3090)
23-Aug-2020 09:31:37.419 INFO [localhost-startStop-1] org.flywaydb.core.internal.command.DbMigrate.info Migrating schema `cloudregs` to version 2 - user-invite
23-Aug-2020 09:31:37.459 WARNING [localhost-startStop-1] org.flywaydb.core.internal.sqlscript.DefaultSqlScriptExecutor.warn DB: Changing sql mode 'NO_AUTO_CREATE_USER' is deprecated. It will be removed in a future release. (SQL State: HY000 - Error Code: 3090)
23-Aug-2020 09:31:40.742 WARNING [localhost-startStop-1] org.flywaydb.core.internal.sqlscript.DefaultSqlScriptExecutor.warn DB: Changing sql mode 'NO_AUTO_CREATE_USER' is deprecated. It will be removed in a future release. (SQL State: HY000 - Error Code: 3090)
23-Aug-2020 09:31:40.812 INFO [localhost-startStop-1] org.flywaydb.core.internal.command.DbMigrate.info Successfully applied 2 migrations to schema `cloudregs` (execution time 00:04.912s)
And here's the corresponding flyway_schema_history:
+----------------+---------+-------------+------+---------------------+----------+--------------+---------------------+----------------+---------+
| installed_rank | version | description | type | script              | checksum | installed_by | installed_on        | execution_time | success |
+----------------+---------+-------------+------+---------------------+----------+--------------+---------------------+----------------+---------+
|              2 | 2       | user-invite | SQL  | V2__user-invite.sql | 39331208 | root         | 2020-08-23 09:31:40 |           3343 |       1 |
+----------------+---------+-------------+------+---------------------+----------+--------------+---------------------+----------------+---------+
It looks like only the second migration has been saved to flyway_schema_history.
When re-starting the application, I'm getting the following error:
Validate failed:
Detected resolved migration not applied to database: 1
What am I missing?
Edit: the issue seems to be fixed by upgrading from Flyway version 6.4.2 to 6.5.5
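If you can't upgrade right away, it can help to confirm what Flyway actually recorded before the failed validate. A quick check against the history table (assuming the MySQL setup from the log above; the user and schema name are illustrative):
mysql -u root -p cloudregs -e "SELECT installed_rank, version, description, success FROM flyway_schema_history ORDER BY installed_rank;"
# if V1 is present on the classpath but missing from this table, validate will fail on the next start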

Upload Stem Cell Error - Keystone connection timed out

I am getting a connection timeout error while uploading the stemcell to the BOSH director. I am using BOSH CLI v2. The following is my error log:
> bosh -e sdp-bosh-env upload-stemcell https://bosh.io/d/stemcells/bosh-openstack-kvm-ubuntu-trusty-go_agent?v=3541.12 --fix
Using environment '10.82.73.8' as client 'admin'
Task 13
Task 13 | 05:02:40 | Update stemcell: Downloading remote stemcell (00:00:51)
Task 13 | 05:03:31 | Update stemcell: Extracting stemcell archive (00:00:03)
Task 13 | 05:03:34 | Update stemcell: Verifying stemcell manifest (00:00:00)
Task 13 | 05:03:35 | Update stemcell: Checking if this stemcell already exists (00:00:00)
Task 13 | 05:03:35 | Update stemcell: Uploading stemcell bosh-openstack-kvm-ubuntu-trusty-go_agent/3541.12 to the cloud (00:10:41)
L Error: CPI error 'Bosh::Clouds::CloudError' with message 'Unable to connect to the OpenStack Keystone API http://10.81.102.5:5000/v2.0/tokens
Connection timed out - connect(2) for 10.81.102.5:5000 (Errno::ETIMEDOUT)' in 'create_stemcell' CPI method
Task 13 | 05:14:16 | Error: CPI error 'Bosh::Clouds::CloudError' with message 'Unable to connect to the OpenStack Keystone API http://10.81.102.5:5000/v2.0/tokens
Connection timed out - connect(2) for 10.81.102.5:5000 (Errno::ETIMEDOUT)' in 'create_stemcell' CPI method
Task 13 Started Sat Apr 7 05:02:40 UTC 2018
Task 13 Finished Sat Apr 7 05:14:16 UTC 2018
Task 13 Duration 00:11:36
Task 13 error
Uploading remote stemcell 'https://bosh.io/d/stemcells/bosh-openstack-kvm-ubuntu-trusty-go_agent?v=3541.12':
Expected task '13' to succeed but state is 'error'
Exit code 1
Check OpenStack's security group for the BOSH Director machine.
The SG should contain ALLOW IPv4 to 0.0.0.0/0; if it doesn't, add at least an egress TCP rule to 10.81.102.5 on port 5000.
Check the connection over SSH:
bbl ssh --director
nc -tvn 10.81.102.5 5000
If that doesn't help, check the network/firewall configuration.
https://bosh.io/docs/uploading-stemcells/
https://github.com/cloudfoundry/bosh-bootloader/blob/master/terraform/openstack/templates/resources.tf
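If the egress rule turns out to be missing, something like this adds it with the python-openstackclient CLI (the security group name bosh-director is a placeholder; use whatever group is attached to the director VM):
openstack security group rule create --egress --protocol tcp --dst-port 5000 --remote-ip 10.81.102.5/32 bosh-director
# then re-test from the director:
bbl ssh --director
nc -tvn 10.81.102.5 5000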

Airflow does not update the progress of a dag/task to be completed even though the dag/task has actually completed

I have set up Airflow to run in distributed mode with 10 worker nodes. I tried to assess the performance of parallel workloads by triggering a test dag that contains just one task, which sleeps for 3 seconds and then exits.
I triggered the dag using the command
airflow backfill test_dag -s 2015-06-20 -e 2015-07-10
The scheduler kicks off the jobs/dags in parallel, and I frequently see the output below:
[2017-06-27 09:52:29,611] {models.py:4024} INFO - Updating state for considering 1 task(s)
[2017-06-27 09:52:29,647] {models.py:4024} INFO - Updating state for considering 1 task(s)
[2017-06-27 09:52:29,664] {jobs.py:1983} INFO - [backfill progress] | finished run 19 of 21 | tasks waiting: 0 | succeeded: 19 | kicked_off: 2 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
Here kicked_off: 2 indicates that 2 tasks have been kicked off, and when I check the UI for the status of the dag runs, I see 2 dag instances shown as running. When I look into the respective task instance logs, they indicate that the tasks completed successfully, yet the message above keeps being printed at the command prompt indefinitely:
[2017-06-27 09:52:29,611] {models.py:4024} INFO - Updating state for considering 1 task(s)
[2017-06-27 09:52:29,647] {models.py:4024} INFO - Updating state for considering 1 task(s)
[2017-06-27 09:52:29,664] {jobs.py:1983} INFO - [backfill progress] | finished run 19 of 21 | tasks waiting: 0 | succeeded: 19 | kicked_off: 2 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
Could it be that the messages sent by the worker are getting dropped, and hence the status is not being updated?
Is there any parameter in the airflow.cfg file that allows jobs like these to be retried on other worker nodes, instead of waiting indefinitely for a message from the worker node responsible for executing the above tasks?
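One way to see whether the scheduler's view really disagrees with the workers is to query the task state straight from the metadata database via the CLI (Airflow 1.x syntax; sleep_task is a placeholder for whatever the single task in test_dag is called):
airflow list_tasks test_dag
airflow task_state test_dag sleep_task 2015-06-20
# if this prints 'success' while the backfill loop still reports the run as kicked_off,
# the state update from the worker never reached the metadata database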

WebLogic OBIEE Scheduler Component Down

I have an OBIEE 11g installation on a Red Hat machine, but I'm having trouble getting it running. I can start WebLogic and its services, so I'm able to access the WebLogic console and Enterprise Manager, but the problems start when I try to start the OBIEE components with the opmnctl command.
The steps I’m performing are the following:
1) Start WebLogic
cd /home/Oracle/Middleware/user_projects/domains/bifoundation_domain/bin/
./startWebLogic.sh
2) Start NodeManager
cd /home/Oracle/Middleware/wlserver_10.3/server/bin/
./startNodeManager.sh
3) Start Managed WebLogic
cd /home/Oracle/Middleware/user_projects/domains/bifoundation_domain/bin/
./startManagedWebLogic.sh bi_server1
4) Set up OBIEE Components
cd /home/Oracle/Middleware/instances/instance1/bin/
./opmnctl startall
The result is:
opmnctl startall: starting opmn and all managed processes...
================================================================================
opmn id=JustiziaInf.mmmmm.mmmmm.9999
Response: 4 of 5 processes started.
ias-instance id=instance1
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
ias-component/process-type/process-set:
coreapplication_obisch1/OracleBISchedulerComponent/coreapplication_obisch1/
Error
--> Process (index=1,uid=1064189424,pid=4396)
failed to start a managed process after the maximum retry limit
Log:
/home/Oracle/Middleware/instances/instance1/diagnostics/logs/OracleBISchedulerComponent/
coreapplication_obisch1/console~coreapplication_obisch1~1.log
5) Check the status of components
cd /home/Oracle/Middleware/instances/instance1/bin/
./opmnctl status
Processes in Instance: instance1
---------------------------------+--------------------+---------+---------
ias-component | process-type | pid | status
---------------------------------+--------------------+---------+---------
coreapplication_obiccs1 | OracleBIClusterCo~ | 8221 | Alive
coreapplication_obisch1 | OracleBIScheduler~ | N/A | Down
coreapplication_obijh1 | OracleBIJavaHostC~ | 8726 | Alive
coreapplication_obips1 | OracleBIPresentat~ | 6921 | Alive
coreapplication_obis1 | OracleBIServerCom~ | 7348 | Alive
Read the log files from /home/Oracle/Middleware/instances/instance1/diagnostics/logs/OracleBISchedulerComponent/
coreapplication_obisch1/console~coreapplication_obisch1~1.log.
I would recommend trying the steps in the link below, as this is a common issue when upgrading OBIEE.
http://www.askjohnobiee.com/2012/11/fyi-opmnctl-failed-to-start-managed.html
Not sure what your log says, but try the steps below and check whether it works:
Log in as the superuser
cd $ORACLE_HOME/Apache/Apache/bin
chmod 6750 .apachectl
Log out and log in as the ORACLE user
opmnctl startproc process-type=OracleBIScheduler
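Either way, you can restart just the scheduler component and watch its console log; the component name and log path below are taken from the opmnctl output above:
cd /home/Oracle/Middleware/instances/instance1/bin/
./opmnctl stopproc ias-component=coreapplication_obisch1
./opmnctl startproc ias-component=coreapplication_obisch1
tail -n 100 /home/Oracle/Middleware/instances/instance1/diagnostics/logs/OracleBISchedulerComponent/coreapplication_obisch1/console~coreapplication_obisch1~1.log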

OpenStack error when launching an instance

I am having a consistent
Error: Failed to launch instance-id": Please try again later [Error: Timeout while waiting on RPC response -topic: "network", RPC method: "get_instance_nw_info" info: ""]
every time I launch an instance in OpenStack. I've tried it both using the OpenStack dashboard and via the terminal (nova). Using the terminal, here's the command I ran:
nova boot --flavor "2" --image "b26c9acf-06c0-4ff8-b1c7-aca3052485c8" --num-instances "2" --security-groups "default" --meta description="test" test
When I check the list of instances, here's the output:
+--------------------------------------+-------------------------------------------+--------+------------+-------------+----------+
| ID                                   | Name                                      | Status | Task State | Power State | Networks |
+--------------------------------------+-------------------------------------------+--------+------------+-------------+----------+
| a0477666-b810-4c73-94e6-2a66576bccac | test-a0477666-b810-4c73-94e6-2a66576bccac | ERROR  | None       | NOSTATE     |          |
| c5822a6f-4270-4718-95c4-9f28fea8de82 | test-c5822a6f-4270-4718-95c4-9f28fea8de82 | ERROR  | None       | NOSTATE     |          |
+--------------------------------------+-------------------------------------------+--------+------------+-------------+----------+
Here's a snapshot of the error I am encountering:
Am I missing a configuration entry (when using the dashboard) or a sub-command (when using nova via the terminal) when launching?
Any feedback is greatly appreciated. Thanks in advance!
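The timeout on the "network" RPC topic suggests the nova network service isn't consuming RPC messages. A couple of checks worth running before digging further (nova-network era commands; the log path assumes a default packaged install):
nova-manage service list    # a nova-network host showing XXX instead of :-) means the service is down
tail -n 100 /var/log/nova/nova-network.log
service rabbitmq-server status    # get_instance_nw_info travels over the message broker, so check it too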
