Worldserver crashes when applying updates - azerothcore

AzerothCore rev. 017772043d9d 2022-01-05 23:22:43 +0000 (master branch) (Win64, RelWithDebInfo, Static) (worldserver-daemon)
<Ctrl-C> to stop.
█████╗ ███████╗███████╗██████╗ ██████╗ ████████╗██╗ ██╗
██╔══██╗╚══███╔╝██╔════╝██╔══██╗██╔═══██╗╚══██╔══╝██║ ██║
███████║ ███╔╝ █████╗ ██████╔╝██║ ██║ ██║ ███████║
██╔══██║ ███╔╝ ██╔══╝ ██╔══██╗██║ ██║ ██║ ██╔══██║
██║ ██║███████╗███████╗██║ ██║╚██████╔╝ ██║ ██║ ██║
╚═╝ ╚═╝╚══════╝╚══════╝╚═╝ ╚═╝ ╚═════╝ ╚═╝ ╚═╝ ╚═╝
██████╗ ██████╗ ██████╗ ███████╗
██╔════╝██╔═══██╗██╔══██╗██╔════╝
██║ ██║ ██║██████╔╝█████╗
██║ ██║ ██║██╔══██╗██╔══╝
╚██████╗╚██████╔╝██║ ██║███████╗
╚═════╝ ╚═════╝ ╚═╝ ╚═╝╚══════╝
AzerothCore 3.3.5a - www.azerothcore.org
> Using configuration file configs/worldserver.conf
> Using SSL version: OpenSSL 1.1.1m 14 Dec 2021 (library: OpenSSL 1.1.1m 14 Dec 2021)
> Using Boost version: 1.74.0
Process priority class set to HIGH
Initializing Scripts...
> Loading C++ scripts
Opening DatabasePool 'acore_auth'. Asynchronous connections: 1, synchronous connections: 1.
DatabasePool 'acore_auth' opened successfully. 2 total connections running.
Opening DatabasePool 'acore_characters'. Asynchronous connections: 1, synchronous connections: 2.
DatabasePool 'acore_characters' opened successfully. 3 total connections running.
Opening DatabasePool 'acore_world'. Asynchronous connections: 1, synchronous connections: 1.
DatabasePool 'acore_world' opened successfully. 2 total connections running.
Updating Auth database...
>> Auth database is up-to-date! Containing 2 new and 26 archived updates.
Updating Character database...
>> Character database is up-to-date! Containing 4 new and 40 archived updates.
Updating World database...
>> Applying update "2021_12_29_01.sql" '4170551'...
mysql: [Warning] Using a password on the command line interface can be insecure.♪◙ERROR 2006 (HY000) at line 4: MySQL server has gone away♪◙
Applying of file 'C:/azerothcore/data/sql/updates/db_world/2021_12_29_01.sql' to database 'acore_world' failed! If you are a user, please pull the latest revision from the repository. Also make sure you have not applied any of the databases with your sql client. You cannot use auto-update system and import sql files from AzerothCore repository with your sql client. If you are a developer, please fix your sql query.
Could not update the World database, see log for details.
Closing down DatabasePool 'acore_world'.
Asynchronous connections on DatabasePool 'acore_world' terminated. Proceeding with synchronous connections.
All connections on DatabasePool 'acore_world' closed.
Closing down DatabasePool 'acore_characters'.
Asynchronous connections on DatabasePool 'acore_characters' terminated. Proceeding with synchronous connections.
All connections on DatabasePool 'acore_characters' closed.
Closing down DatabasePool 'acore_auth'.
Asynchronous connections on DatabasePool 'acore_auth' terminated. Proceeding with synchronous connections.
All connections on DatabasePool 'acore_auth' closed.

From the Common Errors page on the AzerothCore Wiki:
This is most likely due to your MySQL server's max_allowed_packet
setting is too low. See this or run the command SET GLOBAL max_allowed_packet=1073741824; in your SQL client (HeidiSQL, SQLyog,
etc.) to update your max_allowed_packet.

Related

Data unpack would read past end of buffer in file util/show_help.c at line 501

I submitted a job via slurm. The job ran for 12 hours and was working as expected. Then I got Data unpack would read past end of buffer in file util/show_help.c at line 501. It is usual for me to get errors like ORTE has lost communication with a remote daemon but I usually get this in the beginning of the job. It is annoying but still does not cause as much time loss as getting error after 12 hours. Is there a quick fix for this? Open MPI version is 4.0.1.
--------------------------------------------------------------------------
By default, for Open MPI 4.0 and later, infiniband ports on a device
are not used by default. The intent is to use UCX for these devices.
You can override this policy by setting the btl_openib_allow_ib MCA parameter
to true.
Local host: barbun40
Local adapter: mlx5_0
Local port: 1
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: There was an error initializing an OpenFabrics device.
Local host: barbun40
Local device: mlx5_0
--------------------------------------------------------------------------
[barbun21.yonetim:48390] [[15284,0],0] ORTE_ERROR_LOG: Data unpack would read past end of buffer in
file util/show_help.c at line 501
[barbun21.yonetim:48390] 127 more processes have sent help message help-mpi-btl-openib.txt / ib port
not selected
[barbun21.yonetim:48390] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error
messages
[barbun21.yonetim:48390] 126 more processes have sent help message help-mpi-btl-openib.txt / error in
device init
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
An MPI communication peer process has unexpectedly disconnected. This
usually indicates a failure in the peer process (e.g., a crash or
otherwise exiting without calling MPI_FINALIZE first).
Although this local MPI process will likely now behave unpredictably
(it may even hang or crash), the root cause of this problem is the
failure of the peer -- that is what you need to investigate. For
example, there may be a core file that you can examine. More
generally: such peer hangups are frequently caused by application bugs
or other external events.
Local host: barbun64
Local PID: 252415
Peer host: barbun39
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:
Process name: [[15284,1],35]
Exit code: 9
--------------------------------------------------------------------------

Error in MPI program execution - no active ports found

I am trying to run a simple MPI job across multiple hosts of a cluster.
[capc#gpu6 mpi_tests]$ /opt/openmpi4.0.3/build/bin/mpirun --host gpu7,gpu6 ./a.out
WARNING: There is at least non-excluded one OpenFabrics device found,
but there are no active ports detected (or Open MPI was unable to use
them). This is most certainly not what you wanted. Check your
cables, subnet manager configuration, etc. The openib BTL will be
ignored for this job.
Local host: gpu7
We have 2 processes.
WARNING: Open MPI accepted a TCP connection from what appears to be a
another Open MPI process but cannot find a corresponding process
entry for that peer.
This attempted connection will be ignored; your MPI job may or may not
continue properly.
Local host: gpu6
PID: 29209
[gpu6:29203] 1 more process has sent help message help-mpi-btl-openib.txt / no active ports found
[gpu6:29203] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
I have compiled the MPI program with mpicc and on running with mpirun it hangs.
Can anyone guide me regarding this?

Getting error while redeploying nodes

I am still using Corda 1.0 version. when i try to redeploy nodes with existing data, getting below error while start-up but able to access the nodes . If i clear the data and redeploy the nodes, i didn't face these error message.
Logs can be found in :
C:xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx\kotlin-
source\build\nodes\xxxxxxxx\logs
Database connection url is : jdbc:h2:tcp://xxxxxxxxx/node
E 18:38:46+0530 [main] core.client.createConnection - AMQ214016: Failed to
create netty connection
javax.net.ssl.SSLException: handshake timed out
at io.netty.handler.ssl.SslHandler.handshake(...)(Unknown Source) ~[netty
all-4.1.9.Final.jar:4.1.9.Final]
Incoming connection address : xxxxxxxxxxxx
Listening on port : 10014
RPC service listening on port : 10015
Loaded CorDapps : corda-finance-1.0.0, kotlin-
source-0.1, corda-core-1.0.0
Node for "xxxxxxxxxxx" started up and registered in 213.08 sec
Welcome to the Corda interactive shell.
Useful commands include 'help' to see what is available, and 'bye' to shut
down the node.
Wed May 23 18:39:20 IST 2018>>> E 18:39:24+0530 [Thread-6 (ActiveMQ-server-
org.apache.activemq.artemis.core.server.impl.ActiveMQServerImp
l$3#4a532271)] core.client.createConnection - AMQ214016: Failed to create
netty connection
javax.net.ssl.SSLException: handshake timed out
at io.netty.handler.ssl.SslHandler.handshake(...)(Unknown Source) ~[netty-
all-4.1.9.Final.jar:4.1.9.Final]
This looks like the Artemis failed to connect to the node which means the node fails to start.
You should look at the log and if there are any other previous Corda node started which occupy the node.
If there are any legacy Corda nodes that have not been killed, try ps -ef |grep java to see if there is any other java still alive. Especially look for the port number and check if they are overlapped

Oracle 11g XE installation on docker RHEL 7 image

While installing oracle 11g XE on docker i am getting the error.
Following are the output:-
/etc/init.d/oracle-xe configure
Oracle Database 11g Express Edition Configuration
This will configure on-boot properties of Oracle Database 11g Express
Edition. The following questions will determine whether the database should
be starting upon system boot, the ports it will use, and the passwords that
will be used for database accounts. Press to accept the defaults.
Ctrl-C will abort.
Specify the HTTP port that will be used for Oracle Application Express [8080]:8080
Specify a port that will be used for the database listener [1521]:1521
Specify a password to be used for database accounts. Note that the same
password will be used for SYS and SYSTEM. Oracle recommends the use of
different passwords for each database account. This can be done after
initial configuration:
Confirm the password:
Do you want Oracle Database 11g Express Edition to be started on boot (y/n) [y]:y
Starting Oracle Net Listener...Done
Configuring database...
Database Configuration failed. Look into /u01/app/oracle/product/11.2.0/xe/config/log for details
[root#b7c63c4e1da8 Disk1]# cd /u01/app/oracle/product/11.2.0/xe/config/log
[root#b7c63c4e1da8 log]# ls
CloneRmanRestore.log cloneDBCreation.log postDBCreation.log postScripts.log
[root#b7c63c4e1da8 log]# cat CloneRmanRestore.log
ORA-00845: MEMORY_TARGET not supported on this system
select TO_CHAR(systimestamp,'YYYYMMDD HH:MI:SS') from dual
*
ERROR at line 1:
ORA-01034: ORACLE not available
Process ID: 0
Session ID: 0 Serial number: 0
declare
*
ERROR at line 1:
ORA-01034: ORACLE not available
Process ID: 0
Session ID: 0 Serial number: 0
select TO_CHAR(systimestamp,'YYYYMMDD HH:MI:SS') from dual
*
ERROR at line 1:
ORA-01034: ORACLE not available
Process ID: 0
Session ID: 0 Serial number: 0
One of the possible solution that I got was to mount the temp file to provide extra space to it which only contains 6GB approx in the docker. But i am unable to mount the memory in docker.
Got the solution for the same :-
we have to modify the files init.ora and initXETemp.ora at the path /u01/app/oracle/product/11.2.0/xe/config/scripts
with the values :-
###########################################
# Miscellaneous
###########################################
compatible=11.2.0.0.0
diagnostic_dest=/u01/app/oracle
#memory_target=1073741824
pga_aggregate_target=200540160
sga_target=601620480
You may encounter
ORA-00845: MEMORY_TARGET not supported on this system
when starting Oracle DB in an unprivileged container. Try running the container with the --privileged flag, e.g.
docker run --name oracle12 --hostname oracledb --privileged local/oracle12:12.1.0.2

Passing hostname to netty

Background: I've got two machines with identical hostnames, I need to set up a local spark cluster for testing, setting up a master and a worker works fine, but trying to run an application with the driver causes problems, netty doesn't seem to be picking the correct host (regardless of what I put in there, it just picks the first host).
Identical hostname:
$ dig +short corehost
192.168.0.100
192.168.0.101
Spark config (used by master and the local worker):
export SPARK_LOCAL_DIRS=/some/dir
export SPARK_LOCAL_IP=corehost // i tried various like 192.168.0.x for
export SPARK_MASTER_IP=corehost // local, master and the driver
export SPARK_MASTER_PORT=7077
export SPARK_WORKER_CORES=2
export SPARK_WORKER_MEMORY=2g
export SPARK_WORKER_INSTANCES=2
export SPARK_WORKER_DIR=/some/dir
Spark starts up and I can see the worker in the web-ui.
When I run the spark "job" below:
val conf = new SparkConf().setAppName("AaA")
// tried 192.168.0.x and localhost
.setMaster("spark://corehost:7077")
val sc = new SparkContext(conf)
I get this exception:
15/04/02 12:34:04 INFO SparkContext: Running Spark version 1.3.0
15/04/02 12:34:04 WARN Utils: Your hostname, corehost resolves to a loopback address: 127.0.0.1; using 192.168.0.100 instead (on interface en1)
15/04/02 12:34:04 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
15/04/02 12:34:05 ERROR NettyTransport: failed to bind to corehost.home/192.168.0.101:0, shutting down Netty transport
...
Exception in thread "main" java.net.BindException: Failed to bind to: corehost.home/192.168.0.101:0: Service 'sparkDriver' failed after 16 retries!
at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272)
at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:393)
at akka.remote.transport.netty.NettyTransport$$anonfun$listen$1.apply(NettyTransport.scala:389)
at scala.util.Success$$anonfun$map$1.apply(Try.scala:206)
at scala.util.Try$.apply(Try.scala:161)
at scala.util.Success.map(Try.scala:206)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
at scala.concurrent.Future$$anonfun$map$1.apply(Future.scala:235)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.processBatch$1(BatchingExecutor.scala:67)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply$mcV$sp(BatchingExecutor.scala:82)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
at akka.dispatch.BatchingExecutor$Batch$$anonfun$run$1.apply(BatchingExecutor.scala:59)
at scala.concurrent.BlockContext$.withBlockContext(BlockContext.scala:72)
at akka.dispatch.BatchingExecutor$Batch.run(BatchingExecutor.scala:58)
at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
15/04/02 12:34:05 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/04/02 12:34:05 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/04/02 12:34:05 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
Process finished with exit code 1
Not sure how to proceed... its a whole jungle of ip addresses.
Not sure if this is a netty issue either.
My experience with the identical problem is that it revolves around setting things up locally. Try being more verbose in your spark driver code, add the SPARK_LOCAL_IP and driver host ip to the config:
val conf = new SparkConf().setAppName("AaA")
.setMaster("spark://localhost:7077")
.set("spark.local.ip","192.168.1.100")
.set("spark.driver.host","192.168.1.100")
This should tell netty which of the two identical hosts to use.

Resources