MariaDB 10.3.13 table_open_cache problems

Just upgraded from MySQL 5.6 to MariaDB 10.3.13. Now, when the server hits Open_tables = 2000, my PHP queries stop working; if I do a FLUSH TABLES, it starts working correctly again. This never happened when I was using MySQL. Now I can't go a day without having to log in and run FLUSH TABLES to get things working again.
I use WHM/cPanel to administer my VPS, and the last WHM release started warning me that the version of MySQL I was running (I really can't remember which version; it was whatever was loaded when I got my VPS) was soon coming to an end and that I would need to upgrade to MySQL 5.7 or MariaDB. I had been wanting to move to MariaDB for a while anyway, so that is what I did; WHM recommended the 10.3.13 version.
After some more watching and looking, it appears that what drives my Open_tables to the 2000 max is the automatic cPanel backup routine, which backs up all of my databases in one go. It doesn't crash anything, it just causes problems with my PHP application connections. I don't think the connections get rejected; they just don't return any data. I turned all of the automatic WHM/cPanel backups off and things have settled down a little.
table_definition_cache 400
table_open_cache 2000
I still do a mysqldump via cron for my database backups. There are only two live databases, and they still make Open_tables grow to that 2000 max, just not as fast.
I now run a script every hour that shows me some of the variables, and here is what I am seeing:
After a FLUSH TABLES command, both Open_tables and Open_table_definitions start increasing. Open_table_definitions stops once it hits 400, while Open_tables keeps increasing through the day.
Then, when the mysqldumps run in the early morning hours, Open_tables hits 2000 (the max setting) and my PHP queries are not executed.
I do not get a PHP error.
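For reference, the hourly check can be as simple as a couple of status queries (a sketch, run via the mysql client or from cron; these counter and variable names are the standard ones):

```sql
-- Current counts vs. the configured limits
SHOW GLOBAL STATUS LIKE 'Open%tables';
SHOW GLOBAL VARIABLES LIKE 'table_open_cache';
SHOW GLOBAL VARIABLES LIKE 'table_definition_cache';
```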
I ran the following command so that I could see what was happening on the db side.
SET GLOBAL general_log = 'ON'
Looking at the log, when everything is running OK I see my application connecting, preparing the statement, executing the statement and then disconnecting ....
I did the same thing when it started acting up (i.e. when my PHP application stops getting results again).
Looking at the log I see my application connecting, then preparing the statement and then instead of seeing it execute the statement, it prepares the same statement 2 more times and then disconnects ...
I logged into MySQL, ran FLUSH TABLES, and everything went back to normal: the application connects, prepares a statement, executes it, disconnects.
But this never happened before I moved to MariaDB. I never touched the MySQL server settings at all; the only time MySQL was restarted was when I did a CentOS 6 system update and had to reboot the server. I would go months without doing a thing on the server.

Looks like the new default was the culprit. I changed table_open_cache to 400 and my PHP application is no longer having any issues preparing statements, even after the nightly backups of the databases. Looking at older MySQL documentation, MySQL 5.6.7 had a table_open_cache default of 400; when I upgraded to MariaDB 10.3.13 that default became 2000, which is when I started having problems.
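In my.cnf terms, the fix looks like this (a sketch; section and variable names as in a stock MariaDB install):

```ini
[mysqld]
# Restore the old MySQL 5.6 default; MariaDB 10.3 defaults to 2000
table_open_cache       = 400
table_definition_cache = 400
```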
Not quite sure what the following is telling me, but it might be of interest:
su - mysql
-bash-4.1$ ulimit -Hn
100
-bash-4.1$ ulimit -Sn
100
-bash-4.1$ exit
logout
[~]# ulimit -Hn
4096
[~]# ulimit -Sn
4096

Related

Airflow 2 - error MySQL server has gone away

I am running Airflow with backend mariaDB and periodically when a DAG task is being scheduled, I noticed the following error in airflow worker
sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (2006, 'MySQL server has gone away').
I am not sure if the issue occurs due to a misconfiguration of Airflow, or because the backend is MariaDB, which as I saw is not a recommended database.
Also, in mariaDB logs, I see the following warning repeating almost every minute
[Warning] Aborted connection 305627 to db: 'airflow' user: 'airflow' host: 'hostname' (Got an error reading communication packets)
I've seen some similar issues mentioned, but nothing I have tried so far has helped.
The question is: should I switch the database to MySQL, or does some configuration need to be done on MariaDB's end?
Airflow v2.0.1
MariaDB 10.5.5
SQLAlchemy 1.3.23
Hard to say; you need to look for the reason why your DB connection gets aborted. MariaDB might work for quick testing with a single scheduler, but something is causing your connection to the DB to be dropped.
There are a few things you can do:
airflow has a db check CLI command; you can run it to test whether the DB configuration is working. The errors you see when you try may make the problem obvious.
airflow also has another useful command, db shell, which lets you connect to the DB and run SQL queries. This might tell you whether your connection is "stable": connect, run some queries, and see if the connection gets interrupted in the meantime.
Look at more logs and at your network connectivity to see if you have problems there.
Finally, check that you have enough resources to run Airflow plus the DB. Often things like this happen when you do not have enough memory, for example. From experience, Airflow plus a DB needs at least 4 GB of RAM (depending on the DB configuration); if you are on Mac or Windows and using Docker, the Docker VM by default has less memory than that available, and you need to increase it.
Also look at other resources: disk space, memory, number of connections, etc. can all be your problem.
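One common cause of "Aborted connection ... (Got an error reading communication packets)" is pooled connections going stale and being cut by the server's wait_timeout. Recycling them on the Airflow side often helps; a sketch for airflow.cfg (these [core] option names exist in Airflow 2.0; the values are illustrative, not tuned):

```ini
[core]
# Recycle pooled connections before MariaDB's wait_timeout can kill them
sql_alchemy_pool_recycle = 1800
# Test each connection with a lightweight ping before handing it out
sql_alchemy_pool_pre_ping = True
```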

MariaDB has stopped responding - [ERROR] mysqld got signal 6

The MariaDB service suddenly stopped responding. It had been running continuously for more than 5 months without any issues. When we checked the MariaDB service status at the time of the incident, it showed as active (running) (service mariadb status), but we could not log into the MariaDB server; each login attempt just hung without any response. All our web applications also failed to communicate with the MariaDB service. We also checked max_used_connections, and it was below the maximum value.
Going through the logs, we saw the error below (triggered at the time of the incident).
210623 2:00:19 [ERROR] mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
To report this bug, see https://mariadb.com/kb/en/reporting-bugs
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Server version: 10.2.34-MariaDB-log
key_buffer_size=67108864
read_buffer_size=1048576
max_used_connections=139
max_threads=752
thread_count=72
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1621655 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x7f4c008501e8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f4c458a7d30 thread_stack 0x49000
2021-06-23 2:04:20 139966788486912 [Warning] InnoDB: A long semaphore wait:
--Thread 139966780094208 has waited at btr0sea.cc line 1145 for 241.00 seconds the semaphore:
S-lock on RW-latch at 0x55e1838d5ab0 created in file btr0sea.cc line 191
a writer (thread id 139966610978560) has reserved it in mode exclusive
number of readers 0, waiters flag 1, lock_word: 0
Last time read locked in file btr0sea.cc line 1145
Last time write locked in file btr0sea.cc line 1218
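The memory estimate in the crash report follows the formula printed in the log itself. As a small sketch of that arithmetic (sort_buffer_size is not shown in the excerpt, so the 2 MiB value below is an assumed default, not taken from this server):

```python
def mysqld_memory_estimate_kib(key_buffer_size, read_buffer_size,
                               sort_buffer_size, max_threads):
    """Worst-case memory estimate from the crash report, in KiB:
    key_buffer_size + (read_buffer_size + sort_buffer_size) * max_threads
    """
    total = key_buffer_size + (read_buffer_size + sort_buffer_size) * max_threads
    return total // 1024

# Values from the log above, with an assumed 2 MiB sort_buffer_size
print(mysqld_memory_estimate_kib(67108864, 1048576, 2097152, 752))
```

Plugging in the server's real sort_buffer_size reproduces the "1621655 K bytes" figure from the log; the point is that per-thread buffers are multiplied by max_threads, so a modest buffer increase can balloon the worst case.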
We could not even stop the MariaDB service with the normal commands (service mariadb stop), but we were able to forcefully kill the MariaDB process, after which we could bring the service back online.
What could be the reason for this failure? If you have faced similar issues, please share your experience and what actions you took to prevent such failures in the future. Your feedback is much appreciated.
Our Environment Details are as follows
Operating system: Red Hat Enterprise Linux 7
Mariadb version: 10.2.34-MariaDB-log MariaDB Server
I also face this issue on an AWS instance (c5a.4xlarge) hosting my database.
Server version: 10.5.11-MariaDB-1:10.5.11+maria~focal
It has already happened 3 times, occasionally. Like you, there was no way to stop the service; only rebooting the machine got it working again.
Logs at restart suggest some tables crashed and should be repaired.

Teradata Express Does not respond after restart

Instead of commenting on the thread "Teradata Viewpoint on Teradata Express 16.10 is not working", I am creating a new thread, but since it relates to that one I have added the link to this post.
I restarted Teradata Express with the command tpareset -x "increasing memory size" and increased the memory, but after the restart, queries on the system run very slowly and sometimes SQL Assistant goes into a Not Responding state.
To resolve this temporarily, I used tpa stop and then tpa start; after this the system runs smoothly without lag, but some time later it goes back into the slow state.
Not sure what is happening here. Can someone guide me on how to resolve it permanently?

website down with mariadb "too many connections" error

I am running a single highly visited website on a high-end CentOS 7 VPS (16 vCores / 128 GB of RAM) with a CentOS 7 / MariaDB 10.1 / PHP-FPM 5.6 setup running Plesk Onyx.
Everything is usually smooth and fast, but it happened twice in a year that the website went down with the message "Too Many Connections" from MariaDB.
Being in a hurry to restore the website, I ran service mariadb restart without first running SHOW PROCESSLIST.
I checked the MariaDB logs and web server logs afterwards and didn't find anything useful for troubleshooting the issue.
Note that when it happened the first time, I raised the max_connections value to 300 in my.cnf and constantly monitored the max_used_connections variable; that value never went over 50, so I guessed it happened because of some DDoS attack or other malicious attempt.
Questions:
Any advice on how to troubleshoot this?
How can I be alerted when max_used_connections approaches max_connections? Any tool?
I use the external Pingdom service to check website uptime, but it didn't detect this kind of problem (the web response is 200 OK); I also run a netdata instance on the server (https://netdata.io/), but that didn't help either.
Troubleshoot it by turning on the slow log, preferably with a low value for long_query_time (such as 1). Some naughty query will probably show up there.
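A minimal slow-log setup in my.cnf for that (variable names as in MariaDB 10.1; the file path is illustrative):

```ini
[mysqld]
slow_query_log      = 1
slow_query_log_file = /var/log/mysql/slow.log
long_query_time     = 1
```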
Yes, do SHOW FULL PROCESSLIST next time. (Note "FULL".) Instead of restarting mysqld, look for the offending query. It will have one of the highest values in Time and it probably won't be in Sleep mode. It may be something potentially long like ALTER or a dump. Killing that one process will probably uncork the problem, and the problem will vanish in, perhaps, seconds.
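In SQL terms, the next time it happens, something like this (the process id 12345 is a placeholder for whatever the processlist actually reports):

```sql
SHOW FULL PROCESSLIST;   -- find the row with the highest Time that is not in Sleep
KILL 12345;              -- kill only that offending connection
```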
Deleting a file that is "open" by a process (such as mysqld) will not help: disk space is not recycled until all processes have closed the file. Killing the process closes any open files. Some logs can be handled with FLUSH LOGS; this should be harmless, though it may not help.
If your tables are MyISAM, switching to InnoDB will avoid many cases of table locks (if that is what you are experiencing).
What is the value of innodb_buffer_pool_size? For that sized RAM, about 80G is reasonable.
There might be some clues in the GLOBAL STATUS; see http://mysql.rjweb.org/doc.php/mysql_analysis#tuning for analyzing it. (Caution: It will be useless immediately after a reboot.)
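On the alerting question: a cron job that compares the Threads_connected status counter to max_connections is often enough. The threshold logic can be sketched as follows (fetching the two numbers, e.g. via mysql -e "SHOW GLOBAL STATUS", is left out; the names and the 80% ratio are illustrative):

```python
def should_alert(threads_connected, max_connections, ratio=0.8):
    """Return True when current connections reach the given fraction of the limit."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    return threads_connected >= ratio * max_connections

# Warn well before MariaDB starts rejecting with "Too many connections"
print(should_alert(250, 300))  # True: 250 >= 0.8 * 300
print(should_alert(40, 300))   # False: plenty of headroom
```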

php5-fpm crashes

I have a webserver (nginx) running Debian, and php5-fpm randomly seems to crash; it replies with 504 Bad Gateway when I call PHP files.
When it is in a crashed state and I check it with sudo /etc/init.d/php5-fpm status, it says it is running, but it still gives 504 Bad Gateway until I restart it with sudo /etc/init.d/php5-fpm restart.
I'm thinking it may have to do with one of my PHP files, which sits in an infinite loop until a certain event occurs (a change in the MySQL database) or until it times out. I don't know whether that is generally a good idea or whether I should make the loop quit itself before a timeout occurs.
Thanks in advance!
First look at the nginx error.log for the actual error. I don't think PHP crashed; your loop is just using up all available php-fpm processes, so there are none free to serve the next request from nginx. That should produce a timeout error in the logs (nginx waits a while for an available php-fpm process).
Regarding your second question: you should not use infinite loops for this. If you do anyway, insert a sleep() call inside the loop; otherwise you will overload your CPU with the loop and the database with queries.
Also, I would guess it is enough to have one PHP process in that loop waiting for the event. In that case, use some kind of semaphore (a file or a flag in the DB) to let the other processes know that one is already waiting for the event. Otherwise you will always eat up all available PHP processes.
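The advice above (poll, sleep between checks, give up before the gateway times out) looks like this in outline; the sketch is Python for brevity, but a PHP loop with sleep() is the direct equivalent:

```python
import time

def wait_for_event(check, timeout=30.0, interval=0.5):
    """Poll check() until it returns True or `timeout` seconds have passed.

    Sleeping between polls keeps the CPU idle and avoids hammering the
    database with back-to-back queries; returning False on timeout lets
    the request finish cleanly instead of being killed by the gateway.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(interval)
    return False
```

Keep the timeout below nginx's fastcgi_read_timeout so the script ends on its own terms rather than with a 504.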
