MariaDB full-text searching skipped a stopword even in Boolean Mode - mariadb

We are testing full-text search on MyISAM and InnoDB, following this post.
The word about belongs to the list of InnoDB stopwords, so the query returned an empty result when matching against it in Natural Language Mode.
In Boolean Mode, we expected matches on the relevant rows, but the results were still empty for both the InnoDB and MyISAM engines, and we need help understanding the test results. The queries were:
SELECT * FROM ft_1 WHERE MATCH(copy) AGAINST('about' IN BOOLEAN MODE); for InnoDB, and
SELECT * FROM ft_myisam WHERE MATCH(copy) AGAINST('about' IN BOOLEAN MODE); for MyISAM.
We tested other stopwords, e.g. once and is, and got the same result: the match skips the stopword. We wonder why, and would highly appreciate hints and suggestions.
Technical Details:
SQL:
-- Definition of the InnoDB table:
CREATE TABLE test.ft_1 (
copy TEXT NULL
)
ENGINE=InnoDB
DEFAULT CHARSET=utf8mb4
COLLATE=utf8mb4_unicode_ci;
CREATE FULLTEXT INDEX ft_1_copy_IDX ON test.ft_1 (copy);
-- Definition of the MyISAM table:
CREATE TABLE `ft_myisam` (
`copy` text DEFAULT NULL,
FULLTEXT KEY `copy` (`copy`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
-- Queries:
SELECT * FROM ft_1;
SELECT * FROM ft_myisam;
SELECT * FROM ft_1 WHERE MATCH(copy) AGAINST('about');
SELECT * FROM ft_myisam WHERE MATCH(copy) AGAINST('about');
-- The default Natural Language Mode returns an empty set as expected.
SELECT * FROM ft_1 WHERE MATCH(copy) AGAINST('about' in BOOLEAN MODE);
SELECT * FROM ft_myisam WHERE MATCH(copy) AGAINST('about' in BOOLEAN MODE);
-- The Boolean Mode still returns an empty set, and we wonder why.
SELECT * FROM ft_1 WHERE MATCH(copy) AGAINST('clock');
SELECT * FROM ft_myisam WHERE MATCH(copy) AGAINST('clock');
-- Returns the row `It is about two o'clock` because 'clock' is not a stopword.
The data:
copy
Once upon a time
There was a wicked witch
Who ate everybody up
Once upon a wicked time
There was a wicked wicked witch
Who ate everybody wicked up
It is about two o'clock
About two
is two

Related

Properly Indexing table per time on MariaDB

I believe this is only me not realizing something obvious.
I currently have a table of positions for a car tracking software.
The current structure is as follows:
CREATE TABLE `positions` (
`id` char(36) NOT NULL,
`vehicleId` char(36) DEFAULT NULL,
`time` datetime NOT NULL,
`date` date NOT NULL, -- date being time without the hours, minutes and seconds
`lat` decimal(10,7) NOT NULL,
`lng` decimal(10,7) NOT NULL,
`speed` int(11) NOT NULL,
`attributes` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL CHECK (json_valid(`attributes`)),
`created_at` datetime(6) NOT NULL DEFAULT current_timestamp(6),
`updated_at` timestamp(6) NULL DEFAULT current_timestamp(6) ON UPDATE current_timestamp(6),
PRIMARY KEY (`id`),
KEY `IDX_0605352b480db5b3769797b9e8` (`time`),
KEY `IDX_de42da506f977dddd80bc8e3ac` (`vehicleId`,`date`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4
This table holds positions from only one month, as I have a cron job that runs once a month to remove all positions that are not from the current month.
Even so, the table has grown to around a million rows, and queries on it have become extremely slow.
I am trying to fetch all positions from a specific date and from a specific vehicle:
SELECT * FROM positions WHERE vehicleId='id here' AND date='date here';
But for some reason it is extremely slow.
Server is a Xeon E5-1630 v4 with 4 GB RAM and 160 GB SSD, running Fedora 34 (5.13.14-200.fc34.x86_64).
The server is running MariaDB server (10.5.12-MariaDB), Redis, Node.js and Caddy.
EDIT: Answering comments,
EXPLAIN SELECT * FROM positions WHERE vehicleId='5d634444-ed56-49b2-9628-ba51182391c1' AND date='2021-09-23';
+------+-------------+-----------+------+--------------------------------+--------------------------------+---------+-------------+------+-----------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-----------+------+--------------------------------+--------------------------------+---------+-------------+------+-----------------------+
| 1 | SIMPLE | positions | ref | IDX_de42da506f977dddd80bc8e3ac | IDX_de42da506f977dddd80bc8e3ac | 148 | const,const | 268 | Using index condition |
+------+-------------+-----------+------+--------------------------------+--------------------------------+---------+-------------+------+-----------------------+
innodb_buffer_pool_size is currently at 2 GB (half of my server's memory).
It looks like the commonly used data no longer fits in the 2G InnoDB buffer pool. Options to investigate before getting more RAM and increasing the pool size are:
As vehicleId appears to be a UUID, utf8mb4 is rather wasteful for it. The column could be converted to ascii, latin1 or another character set with 1 byte per character.
alter table positions modify vehicleID char(36) character set ascii DEFAULT NULL
Make sure you also change the vehicleId columns in other tables, otherwise joins that require character set conversion get rather expensive (as a recent user discovered).
Note also in 10.7.0 preview, uuid is a new datatype.
restrict retrieval
If you don't need every column, replace * with just the fields you need. If the list is reduced to the indexed columns, no lookup of the other fields is needed. If attributes isn't needed, leaving it out avoids potential off-page lookups.
It looks like (vehicleId, time) could be a composite primary key.
If this is the most common query, and the primary key isn't used elsewhere, this would speed up retrieval of the columns that are not in the secondary index. It would involve changing the query to use time ranges so that the key is used most effectively.
Otherwise, look closer at RAM, especially ensure that MariaDB isn't swapping during query retrieval. Having buffer pool memory ending up in swap isn't useful.
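As a rough check on the character set suggestion above, the key_len of 148 in the EXPLAIN output can be reproduced with simple arithmetic. This is a sketch under the usual assumptions for how key_len is reported: a CHAR(n) column counts n times the maximum bytes per character of its character set, a nullable column adds 1 byte, and a DATE takes 3 bytes. The function name key_len is made up for illustration.

```python
# Rough reconstruction of the key_len that EXPLAIN reports for the
# composite index (vehicleId, date).
# Assumptions: CHAR(n) counts n * (max bytes per character),
# a nullable column adds 1 byte, and DATE takes 3 bytes.

MAX_BYTES_PER_CHAR = {"utf8mb4": 4, "latin1": 1, "ascii": 1}
DATE_BYTES = 3
NULL_FLAG_BYTES = 1


def key_len(charset: str, char_length: int = 36, nullable: bool = True) -> int:
    """Estimate key_len for an index on (CHAR(char_length), DATE NOT NULL)."""
    char_bytes = char_length * MAX_BYTES_PER_CHAR[charset]
    return char_bytes + (NULL_FLAG_BYTES if nullable else 0) + DATE_BYTES


for charset in ("utf8mb4", "ascii"):
    print(charset, key_len(charset))
```

Under these assumptions the utf8mb4 figure matches the 148 shown in the question, and an ascii column would bring each index entry down to 40 bytes, so more of the index fits in the buffer pool.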

bug in mysql count function: zero results with two conditions

I'm working with a LAMP server (it's a bananapi) with MYSQL 5.6 server installed on it. I have a table with several fields.
From mysql command line, if I run:
select * from tbl_name where field1=val1 and field2=val2
it returns N results.
If I run instead:
select COUNT(*) from tbl_name where field1=val1 and field2=val2
it returns 0.
In particular, field1 and field2 are foreign key values to other two separate tables. All tables are InnoDB.
Is this a bug?
Have you enabled full-text search on the table?
I had the same problem and fixed it with this trick:
select count(*) from (select * from tbl_name where field1=val1 and field2=val2) as t

System or catalog tables - DBA_ / DBC - outputs are volatile

I am trying to get a list of all tables/views (in other words, all objects) where a particular field is referenced, using the system or catalog tables. I am using the following query.
select *
from dba_col_comments
where column_name like('SXX_AXXX_%')
order by 1;
However, the output is volatile. When I repeatedly run the same query without any changes, the output varies. For instance, it produced 9300 records, then 9350 a couple of minutes later, and then 9347 a couple of minutes after that.
I am observing the same behaviour in Teradata as well.
My theory would be that in a production environment, temporary objects that are created are probably getting an entry in the system/catalog tables.
Any thoughts/directions?
In Teradata you will find that as global temporary tables are instantiated (referenced by an SQL statement) records should be added to the data dictionary table TVM. These records are then dropped after the session logs off leaving just the base table record associated with the original CREATE GLOBAL TEMPORARY TABLE statement that was submitted.
You can find these instances using the view DBC.AllTempTables.
In Teradata, volatile tables are not maintained within the data dictionary.
EDIT - Your mileage may vary but this should get you started on Teradata
SELECT D1.DatabaseNameI AS DatabaseName_
, T1.TVMNameI AS TableName_
, F1.FieldName AS ColumnName_
FROM "DBC".TVM T1
INNER JOIN
"DBC".Dbase D1
ON D1.DatabaseId = T1.DatabaseId
INNER JOIN
"DBC".TVFields F1
ON F1.DatabaseId = T1.DatabaseId
AND F1.TableId = T1.TVMId
WHERE F1.FieldName = 'MyColumn'
--AND D1.DatabaseNameI IN ('{Database1}', ... '{Database99}') -- Filter on databases
AND F1.FieldType in ('i', 'i1', 'i2', 'i8') -- Integer, ByteInt, SmallInt, BigInt
--AND T1.TableKind IN ('T') -- Optional Filter to just tables.
AND NOT EXISTS
(SELECT 'x'
FROM "DBC".TempTables TT1
WHERE Tt1.TableId = T1.TVMId
)
;

WHERE - IS NULL not working in SQLite?

Here's a strange one:
I can filter on NOT NULLS from SQLite, but not NULLS:
This works:
SELECT * FROM project WHERE parent_id NOT NULL;
These don't:
SELECT * FROM project WHERE parent_id IS NULL;
SELECT * FROM project WHERE parent_id ISNULL;
SELECT * FROM project WHERE parent_id NULL;
All return:
There is a problem with the syntax of your query (Query was not
executed) ...
UPDATE:
I am doing this with PHP, through my code with ezSQL and using the PHPLiteAdmin interface.
Using the PHPLiteAdmin demo, this expression works- so now I'm suspecting a version issue with my PHP's SQLite? Could that be? Wasn't this expression always valid?
UPDATE 2:
When I run the code from PHP using ezSQL, the PHP warning is:
PHP Warning: SQL logic error or missing database
Is there a way to get more information out of PHP? This is maddeningly opaque and weird, especially because the same statement in the CLI works fine...
UPDATE 3
The only other possible clue I have is that the databases that I create with PHP cannot be read by the CLI, and vice versa. I get:
Error: file is encrypted or is not a database
So there are definitely two SQLite flavors butting heads here. (See this) Still, why the invalid statement??
UPDATE 4
OK I think I've traced the problem to the culprit, if not the reason- The DB I created with PHP ezSQL is the one where the IS NULL statement fails. If I create the DB using PHP's SQLite3 class, the statement works fine, and moreover, I can access the DB from the CLI, whereas ezSQL created DB gave the file is encrypted error.
So I did a little digging into ezSQL code- Off the bat I see it uses PDO methods, not the newer SQLite3 class. Maybe that's something- I'm not gonna waste further time on it...
In any case, I've found my solution, which is to steer clear of ezSQL, and just use PHPs SQLite3 class.
a IS b and a IS NOT b is the general form where a and b are expressions.
This is generally only seen in a IS NULL and a IS NOT NULL cases. There are also ISNULL and NOTNULL (also NOT NULL) operators which are short-hands for the previous expressions, respectively (they only take in a single operand).
The SQL understood in SQLite expressions is covered in SQLite Query Language: Expressions.
Make sure that (previous) statements have been terminated with a ; first if using the CLI.
These are all valid to negate a "null match":
expr NOT NULL
expr NOTNULL
expr IS NOT NULL
These are all valid to "match null":
expr ISNULL
expr IS NULL
Since all of the above constructs are themselves expressions the negations are also valid (e.g. NOT (expr NOT NULL) is equivalent to expr IS NULL).
Happy coding.
The proof in the pudding:
SQLite version 3.7.7.1 2011-06-28 17:39:05
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> create table x (y int null);
sqlite> select * from x where y isnull;
sqlite> select * from x where y notnull;
sqlite> select * from x where y not null;
sqlite> select * from x where y is null;
sqlite> select * from x where y is not null;
sqlite>
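The same check can be reproduced from Python's built-in sqlite3 module. This is a minimal sketch, assuming an in-memory database and a cut-down project table invented for the example; it runs each spelling listed above and also confirms that a bare parent_id NULL is rejected as a syntax error.

```python
import sqlite3

# In-memory database with a hypothetical, cut-down "project" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project (id INTEGER PRIMARY KEY, parent_id INTEGER)")
conn.executemany(
    "INSERT INTO project (id, parent_id) VALUES (?, ?)",
    [(1, None), (2, 1), (3, None)],
)


def ids(where_clause):
    """Return the ids of the rows matching the given WHERE clause."""
    sql = f"SELECT id FROM project WHERE {where_clause} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


def is_rejected(where_clause):
    """True if SQLite refuses to compile the WHERE clause."""
    try:
        ids(where_clause)
    except sqlite3.OperationalError:
        return True
    return False


null_forms = ["parent_id IS NULL", "parent_id ISNULL", "NOT (parent_id NOT NULL)"]
not_null_forms = ["parent_id IS NOT NULL", "parent_id NOTNULL", "parent_id NOT NULL"]

null_results = {form: ids(form) for form in null_forms}
not_null_results = {form: ids(form) for form in not_null_forms}
bare_null_rejected = is_rejected("parent_id NULL")

for form, result in {**null_results, **not_null_results}.items():
    print(f"{form!r:30} -> {result}")
print("'parent_id NULL' rejected:", bare_null_rejected)
```

If IS NULL fails in one environment while this script works, the difference is more likely in the library or wrapper being used than in the SQL itself.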
The problem could stem from how SQLite handles empty columns. For instance just because a column is empty does not mean it is NULL. Have you tested against ""?
SELECT * FROM project WHERE parent_id = ""
That query might return results.
In Android SQLite, field IS NULL doesn't work either.
field = 'null' does. Give it a try in your environment
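The two suggestions above both come down to the column holding something other than a real NULL. A small sketch with Python's sqlite3 module shows the three cases side by side; the table and its rows are invented for the example, and the empty string is written with single quotes, which is the standard SQL string literal.

```python
import sqlite3

# Hypothetical table with one row per kind of "empty" parent_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project (name TEXT, parent_id TEXT)")
conn.executemany(
    "INSERT INTO project (name, parent_id) VALUES (?, ?)",
    [
        ("real null", None),      # stored as SQL NULL
        ("empty string", ""),     # stored as a zero-length string
        ("text 'null'", "null"),  # stored as the four-letter string
        ("child", "1"),
    ],
)


def names(where_clause):
    """Return the names of the rows matching the given WHERE clause."""
    sql = f"SELECT name FROM project WHERE {where_clause} ORDER BY rowid"
    return [row[0] for row in conn.execute(sql)]


matches_is_null = names("parent_id IS NULL")
matches_empty = names("parent_id = ''")
matches_text_null = names("parent_id = 'null'")

print("IS NULL  :", matches_is_null)
print("= ''     :", matches_empty)
print("= 'null' :", matches_text_null)
```

Each predicate matches only its own row, so if = '' or = 'null' returns rows where IS NULL does not, the data was most likely written as a string rather than as NULL.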
This works on SQLite in SQLite Manager for Firefox:
select * from foo where not baz is not null
The query above returns rows where column [baz] is null. :-) Yarin, maybe it will work for you?
(The 'not' before the column name is not a typo).
This query too finds rows where baz is null:
select * from foo where [baz] is null
If you are testing perhaps the PK column (?) and the column is being treated as synonym for rowid, then no rows will have a rowid that's null.
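The rowid point can be checked with a short sketch using Python's sqlite3 module, assuming baz is declared INTEGER PRIMARY KEY (which makes it an alias for rowid). The table layout and labels are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# INTEGER PRIMARY KEY makes baz an alias for the table's rowid.
conn.execute("CREATE TABLE foo (baz INTEGER PRIMARY KEY, label TEXT)")

# Inserting NULL into a rowid alias makes SQLite assign the next rowid.
conn.execute("INSERT INTO foo (baz, label) VALUES (NULL, 'first')")
conn.execute("INSERT INTO foo (baz, label) VALUES (NULL, 'second')")

stored = conn.execute("SELECT baz, label FROM foo ORDER BY baz").fetchall()
null_count = conn.execute(
    "SELECT COUNT(*) FROM foo WHERE baz IS NULL"
).fetchone()[0]

print(stored)
print("rows with baz IS NULL:", null_count)
```

The NULLs supplied on insert are replaced by generated integers, so a query for baz IS NULL on such a column legitimately returns nothing.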
Try where your_col_name ISNULL
where ISNULL contains no space.

Converting SQL Server to Oracle

In my project, I have a database in SQL Server which was working fine. But now I have to make the application support Oracle too.
Some limitations I found were that in Oracle there is no bit field and a table name cannot be longer than 30 characters. Are there any other limitations that I need to keep in mind?
Any suggestion from past experience will be helpful.
If I recall correctly from my earlier Oracle days:
there's no IDENTITY column specification in Oracle (you need to use sequences instead)
you cannot simply return a SELECT (columns) from a stored procedure (you need to use REF CURSOR)
of course, all stored procs/funcs are different (Oracle's PL/SQL is not the same as T-SQL)
The SQL ISNULL counterpart in Oracle is NVL
select ISNULL(col, 0)...
select NVL(col, 0)...
You will also struggle if you attempt to select without a from in Oracle. Use dual:
select 'Hello' from DUAL
Bear in mind also, that in Oracle there is the distinction between PL/SQL (Procedural SQL) and pure SQL. They are two distinct and separate languages, that are commonly combined.
Varchar in Oracle databases, called varchar2, is limited to 4000 characters
Oracle's concept of temporary tables is different: they have a global predefined structure
by default sort order and string compare is case-sensitive
When you add a column to a select *
Select * from table_1 order by id;
you must prefix the * by the table_name or an alias
Select
(row_number() over (order by id)) rn,
t.*
from table_1 t
order by id;
Oracle doesn't distinguish between null and '' (empty string). For insert and update you can use '', but to query you must use null
create table t1 (
id NUMBER(10),
val varchar2(20)
);
Insert into t1 values (1, '');
Insert into t1 values (2, null);
Select * from t1 where val = ''; -- valid, but returns no rows
Select * from t1 where val is null; -- returns both rows
Oracle does not support the TOP clause. Instead of TOP you can use ROWNUM.
SQL Server: TOP (Transact-SQL)
SELECT TOP 3 * FROM CUSTOMERS
ORACLE: ROWNUM Pseudocolumn
SELECT * FROM CUSTOMERS WHERE ROWNUM <= 3

Resources