same query giving different "Explain" output - innodb

I have two slave mysql db (5.5.27) (both are running on different machine with same OS).
There is three table (CATEGORY_TREE, SEO_METADATA, TAGS).
There schema definition are as follow:
| CATEGORY_TREE | CREATE TABLE `CATEGORY_TREE` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`child_cid` int(11) DEFAULT NULL,
`parent_cid` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `child_cid` (`child_cid`,`parent_cid`)
) ENGINE=InnoDB AUTO_INCREMENT=9528 DEFAULT CHARSET=latin1 |
| TAGS | CREATE TABLE `TAGS` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`name` varchar(64) NOT NULL,
`owner` varchar(20) DEFAULT NULL,
`type` varchar(20) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`,`name`,`owner`,`type`)
) ENGINE=InnoDB AUTO_INCREMENT=165498 DEFAULT CHARSET=latin1 |
| SEO_METADATA | CREATE TABLE `SEO_METADATA` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`canonicalString` varchar(128) DEFAULT NULL,
`dataId` varchar(20) DEFAULT NULL,
`oldId` int(30) DEFAULT NULL,
`type` varchar(1) DEFAULT NULL,
`canonicalString` varchar(128) DEFAULT NULL,
`searchString` text,
PRIMARY KEY (`id`),
KEY `id_type` (`dataId`,`type`),
KEY `oldid_type` (`oldId`,`type`),
KEY `canonicalstringidx` (`canonicalString`)
) ENGINE=InnoDB AUTO_INCREMENT=3863159 DEFAULT CHARSET=latin1 |
On running "EXPLAIN" for Select db query, I am getting different output
explain SELECT ct.child_cid AS 'tid', ct.parent_cid AS 'ptid', t.name, sm.canonicalString FROM CATEGORY_TREE ct, TAGS t, SEO_METADATA sm WHERE ct.child_cid = t.id AND sm.dataId = cast(ct.child_cid as char) AND sm.type = 't' AND (sm.searchString IS NULL OR sm.searchString = '') GROUP BY ct.child_cid LIMIT 10000;
**Machine 1**
+----+-------------+-------+--------+---------------+-----------+---------+------------------------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+-----------+---------+------------------------+---------+----------------------------------------------+
| 1 | SIMPLE | ct | index | child_cid | child_cid | 10 | NULL | 2264 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | t | eq_ref | PRIMARY,id | PRIMARY | 8 | cms.ct.child_cid | 1 | Using where |
| 1 | SIMPLE | sm | ALL | NULL | NULL | NULL | NULL | 1588507 | Using where; Using join buffer |
+----+-------------+-------+--------+---------------+-----------+---------+------------------------+---------+----------------------------------------------+
**Machine 2**
+----+-------------+-------+--------+---------------+-----------+---------+------------------------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+-----------+---------+------------------------+---------+----------------------------------------------+
| 1 | SIMPLE | sm | ALL | NULL | NULL | NULL | NULL | 1524208 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | ct | index | child_cid | child_cid | 10 | NULL | 2500 | Using where; Using index; Using join buffer |
| 1 | SIMPLE | t | eq_ref | PRIMARY,id | PRIMARY | 8 | cms.ct.child_cid | 1 | Using where |
+----+-------------+-------+--------+---------------+-----------+---------+------------------------+---------+----------------------------------------------+
In one it is using "index" first, but in other it is not.
I have gone through "Why does the same exact query produce 2 different MySQL explain results?", and checked all the parameter mentioned by #spencer7593.
Apart from this, I have run "optimize" table command in Machine 2 DB, but there no change in output.
I know that mysql can be "forced" to use index, but want to know the root cause for same.
Few post have mentioned that Innodb buffer-size can also be one reason for different output. In my case, both are almost same. Below is output of "SHOW ENGINE INNODB STATUS" from both db.
**Machine 1**
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 10989076480; in additional pool allocated 0
Dictionary memory allocated 5959814
Buffer pool size 655360
Free buffers 0
Database pages 645758
**Machine 2**
----------------------
BUFFER POOL AND MEMORY
----------------------
Total memory allocated 10989076480; in additional pool allocated 0
Dictionary memory allocated 5659318
Buffer pool size 655359
Free buffers 0
Database pages 645245
Thanks

Related

Simple mysql SELECT with IN-clause (which returns an empty set) runs forever

On my MariaDB I have a table, 'cv_attribute', which contains 296k records and has 8 columns:
+----------------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+-------+
| cvId | varchar(255) | NO | MUL | NULL | |
| updateId | bigint(20) | NO | MUL | NULL | |
| attributeId | varchar(255) | YES | | NULL | |
| attributeType | varchar(255) | YES | | NULL | |
| attributeLang | varchar(255) | YES | | NULL | |
| attributeValue | varchar(255) | YES | | NULL | |
| MD5 | varchar(255) | YES | MUL | NULL | |
| ingestId | bigint(20) | YES | | NULL | |
+----------------+--------------+------+-----+---------+-------+
I was given a script that runs the query
SELECT DISTINCT * FROM cv_attribute WHERE `MD5` IN (SELECT `MD5` FROM cv_attribute GROUP BY `MD5` HAVING COUNT(*) > 1);
but this runs forever. As in: still running even after a week!
Notice that the query contains an IN-clause. When running the query within that IN-clause, it takes 11 seconds and returns no results:
SELECT `MD5` FROM cv_attribute GROUP BY `MD5` HAVING COUNT(*) > 1;
Empty set (11.51 sec)
I'm totally clueless how the complete query (which basically is just a SELECT DISTINCT *) never finishes, while it's IN-clause returns an empty set within 11 seconds. So, to my understanding, this SELECT DISTINCT * would just query an empty list and be pretty fast.
Note, this is the explain of the full query:
explain SELECT DISTINCT * FROM cv_attribute WHERE `MD5` IN (SELECT `MD5` FROM cv_attribute GROUP BY `MD5` HAVING COUNT(*) > 1);
+------+--------------------+--------------+------+---------------+------+---------+------+--------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+--------------------+--------------+------+---------------+------+---------+------+--------+------------------------------+
| 1 | PRIMARY | cv_attribute | ALL | NULL | NULL | NULL | NULL | 505223 | Using where; Using temporary |
| 2 | DEPENDENT SUBQUERY | cv_attribute | ALL | NULL | NULL | NULL | NULL | 505223 | Using temporary |
+------+--------------------+--------------+------+---------------+------+---------+------+--------+------------------------------+
2 rows in set (0.00 sec)
I hope anybody here would see the light!
Thanks a lot!
Carl

Conditional unique constraint in mariadb

I have this table in mariadb
create or replace table test_table (
col_key int primary key,
col_a int not null,
col_b int not null check(col_b in (0, 1))
);
There is an additional constraint that the pair (col_a, col_b) must be unique only if col_b = 1
For example, you are allowed to have
| col_key | col_a | col_b |
| ------- | ----- | ----- |
| 1 | 1 | 0 |
| 2 | 1 | 0 |
But not
| col_key | col_a | col_b |
| ------- | ----- | ----- |
| 1 | 1 | 1 |
| 2 | 1 | 1 |
I think of 2 approaches:
Since col_b only has 2 values, I can take advantage of the fact that in mariadb null can by pass unique check. So instead of 0 and 1, I change the definition of the table to this and treat null as 0 in application code.
create or replace table test_table (
col_key int primary key,
col_a int not null,
col_b int check(col_b is null or col_b = 1),
unique (col_a, col_b)
);
The upside is I have the database handle the check for me. The downside is the application code become a bit complex and the code tightly couple to mariadb's implementation.
When update or insert, write query like this
update test_table t
set t.col_b = 1
where t.col_key = 2
and not exists (select 1 from test_table t2 where t2.col_a = 1 and t2.col_b = 1)
The upside is the application code actually works with 0 and 1 values. The downside is... I unsure if this work or not. Does the database lock the table when it run the update query? Is there any chance that some other process inserts a row with col_b = 1 after the subquery returns the result?
According to your question, the combination of cola and colb is unique, if colb has the value 1. This condiition can be already defined in the table definition:
CREATE TABLE test_table (
col_key int primary key,
col_a int not null,
col_b int check(col_b is null or col_b = 1),
unique (col_a, col_b)
)
Since the unique index allows multiple NULL values, multiple same combinations of col_a and col_b are allowed, as long the value of col_b is NULL.
For updating you don't need a subquery, since the server checks this condition itself. For not raising an error, just use UPDATE IGNORE
UPDATE IGNORE test_table SET col_b=1 WHERE col_key=2
With the first solution the logic is defined in the database (and doesn't need to be defined in your application) and it is much faster since it can use an index, while the subquery performs a table scan.
MariaDB [test]> insert into test_table select seq,1,NULL from seq_1_to_1000;
Query OK, 1000 rows affected (0,081 sec)
Records: 1000 Duplicates: 0 Warnings: 0
MariaDB [test]> explain update test_table set col_b=1 where col_key=18 and not exists (select 1 from test_table t2 where t2.col_a=1 and t2.col_b=1);
+------+-------------+------------+-------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+------------+-------+---------------+---------+---------+-------+------+-------------+
| 1 | PRIMARY | test_table | const | PRIMARY | PRIMARY | 4 | const | 1 | |
| 2 | SUBQUERY | t2 | ALL | NULL | NULL | NULL | NULL | 1000 | Using where |
+------+-------------+------------+-------+---------------+---------+---------+-------+------+-------------+
2 rows in set (0,003 sec)
MariaDB [test]> alter table test_table add unique (col_a, col_b);
Query OK, 0 rows affected (0,079 sec)
Records: 0 Duplicates: 0 Warnings: 0
MariaDB [test]> explain update test_table set col_b=1 where col_key=18;
+------+-------------+------------+-------+---------------+---------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+------------+-------+---------------+---------+---------+------+------+-------------+
| 1 | SIMPLE | test_table | range | PRIMARY | PRIMARY | 4 | NULL | 1 | Using where |
+------+-------------+------------+-------+---------------+---------+---------+------+------+-------------+

What is the default table storage format for Swisscom MariaDb Ent?

When creating a table like this
CREATE TABLE `dummy` (
`userid` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`providerid` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`provideruserid` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
and defining a PK like the following:
ALTER TABLE dummy ADD PRIMARY KEY (userid,providerid,provideruserid);
I get this error:
Error: Specified key was too long; max key length is 767 bytes
I think the following are set correctly and therefore most probably not the problem:
innodb_file_format=Barracuda
innodb_file_per_table=ON
innodb_large_prefix=ON
Most likely the table storage format is causing issues here, but I'm not able to check the defined default.
As per the documentation, the default table storage format for MariaDB (since 10.2.2) is DYNAMIC - if it is DYNAMIC then this would be perfect, but this seems not to be the case.
Does anyone know the default table storage format in Swisscom MariaDB Ent. and why it is not DYNAMIC? (Probably :))
Our version of MariaDB in Prd:
select VERSION();
+-----------------+
| VERSION() |
+-----------------+
| 10.1.22-MariaDB |
+-----------------+
1 row in set (0.00 sec)
Quote from MariaDB KB XtraDB/InnoDB Storage Formats
Compact
Compact was the default format until MariaDB 10.2.1, and is suitable
for general use if the Antelope file format is used. It was introduced
in MySQL 5.0.
In the Compact storage format (as in Redundant) BLOB and TEXT columns
are partly stored in the row page. At least 767 bytes are stored in
the row; values which exceed this value are are stored in dedicated
pages. Since Compact and Redundant rows maximum size is about 8000
bytes, this limits the number of BLOB or TEXT columns that can be used
in a table. Each BLOB page is 16KB, regardless the size of the data.
Other columns can be stored in different pages too, if they exceed the
row page's size limit.
Our version of MariaDB in Lab (soon to be deployed on Prd). This the relevant line in bosh release.
select VERSION();
+-----------------+
| VERSION() |
+-----------------+
| 10.1.26-MariaDB |
+-----------------+
1 row in set (0.00 sec)
How to fix the error ERROR 1709 (HY000): Index column size too large. The maximum column size is 767 bytes. on Swisscom Application Cloud (have a look at the diff ROW_FORMAT=DYNAMIC):
MariaDB [stackoverflow]> CREATE TABLE `dummy` (
-> `userid` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
-> `providerid` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
-> `provideruserid` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL
-> ) ENGINE=InnoDB ROW_FORMAT=DYNAMIC DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
Query OK, 0 rows affected (0.02 sec)
MariaDB [stackoverflow]> ALTER TABLE dummy ADD PRIMARY KEY (userid,providerid,provideruserid);
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
MariaDB [stackoverflow]> show index from dummy;
+-------+------------+----------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| dummy | 0 | PRIMARY | 1 | userid | A | 0 | NULL | NULL | | BTREE | | |
| dummy | 0 | PRIMARY | 2 | providerid | A | 0 | NULL | NULL | | BTREE | | |
| dummy | 0 | PRIMARY | 3 | provideruserid | A | 0 | NULL | NULL | | BTREE | | |
+-------+------------+----------+--------------+----------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
3 rows in set (0.00 sec)
Here the code for copy and paste:
CREATE TABLE `dummy`
(
`userid` VARCHAR(255) collate utf8mb4_unicode_ci NOT NULL,
`providerid` VARCHAR(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`provideruserid` VARCHAR(255) COLLATE utf8mb4_unicode_ci NOT NULL
)
engine=innodb row_format=dynamic DEFAULT charset=utf8mb4 COLLATE=utf8mb4_unicode_ci;

SQLite: How to extract primary keys and unique constraints with correct names

Given a SQLite db.
a table with primary key:
create table t1 (id int not null, CONSTRAINT pk_id PRIMARY KEY (id));
Now query info for it:
PRAGMA TABLE_INFO(t1);
returns:
| cid | name | type | notnull | dflt_value | pk |
| --- | ---- | ---- | ------- | -----------| -- |
| 0 | id | int | 1 | <null> | 1 |
PRAGMA index_list(t1);
returns:
| seq | name | unique | origin | partial |
| --- | ----------------------| ------ | ------ | ------- |
| 0 | sqlite_autoindex_t1_1 | 1 | pk | 0 |
As we can see index_list returns info about the PK but it reports incorrect name ("sqlite_autoindex_t1_1" instead of "pk_t1").
The same problem with UNIQUE constraints. They are created with autogenerated names.
Is it possible to extract real PRIMARY KEY/UNIQUE CONSTRAINT name?
P.S. I can see that JetBrains's DataGrip correctly show PK names in database browser. But sqliteadmin for example shows them with name like sqlite_autoindex_t1_1. For unique constraints even DataGrip doesn't show correct names (actually it doesn't show them at all).
The index and the constraint are different objects.
SQLite has no mechanism to retrieve the constraint name. You'd have to parse the SQL.

sqlite combining select results for searching

Alright so here are my two tables.
CREATE TABLE [cards] (
[id] TEXT PRIMARY KEY,
[game_id] TEXT NOT NULL,
[set_id] TEXT CONSTRAINT [id_set_id] REFERENCES [sets]([id]) ON DELETE CASCADE ON UPDATE CASCADE MATCH SIMPLE NOT DEFERRABLE INITIALLY IMMEDIATE,
[name] TEXT NOT NULL,
[image] TEXT NOT NULL);
CREATE TABLE [custom_properties] (
[id] TEXT PRIMARY KEY,
[card_id] TEXT CONSTRAINT [id_card_id] REFERENCES [cards]([id]) ON DELETE CASCADE ON UPDATE CASCADE MATCH SIMPLE NOT DEFERRABLE INITIALLY IMMEDIATE,
[game_id] TEXT CONSTRAINT [id_game_id4] REFERENCES [games]([id]) ON DELETE CASCADE ON UPDATE CASCADE MATCH SIMPLE NOT DEFERRABLE INITIALLY IMMEDIATE,
[name] TEXT NOT NULL,
[type] INTEGER NOT NULL,
[vint] INTEGER,
[vstr] TEXT);
What I would like to do is to have a search that grabs all the data from the cards row, and then adds the column who's name is (where custom_properties.card_id == cards.id).name.
Then I would like it's value to be vint if type == 1, else vstr.
So, here is an example dataset
cards
|id | game_id | set_id | name | image|
+---+---------+--------+------+------+
| a | asdf | fdsaf |loler | blah |
+------------------------------------+
custom_properties
| id | card_id | game_id | name | type | vint | vstr |
+----+---------+---------+------+------+------+------+
| f | a | asdf | range| 1 | 12 | |
| b | a | asdf | rank | 0 | | face |
+----+---------+---------+------+------+------+------+
the resulting table would look like this, where the columns range and rank are derived from custom_properties.name
|id | game_id | set_id | name | image | range | rank |
+---+---------+--------+------+-------+-------+------+
|a | asdf | fdsaf | loler| blah | 12 | face |
+---+---------+--------+------+-------+-------+------+
try this:
SELECT Cards.id,
Cards.game_id,
Cards.set_id,
Cards.name,
Cards.id,
Cards.image,
CASE
WHEN Custom_Properties.type = 1 THEN
Custom_Properties.vint
ELSE
Custom_Properties.vstr
END as Range
Custom_Properties.vstr as rank
FROM Cards INNER JOIN Custom_Properties
ON Cards.id = Custom_Properties.card_ID
WHERE Cards.Name = 'loller'
It's not actually possible to do this.

Resources