Will this SQLite GROUP_CONCAT() order by work? - sqlite

I ran some tests on my database (SQLite 3.21.0) and it seems that my solution works, but I am afraid that it's just luck and that it may fail in some cases. I would like to know whether my solution is valid.
It seems that GROUP_CONCAT() uses the primary key or, if there is no primary key, looks for a numeric column and uses that for the concatenation order. To check this theory, I used a WITH clause to generate a "new" table, so that I set the order that GROUP_CONCAT() has to use.
Here is what I did: (simplified query)
WITH data AS (SELECT cID,
pType || ':' || pID || ':' || pTotal || ':' ||
(CASE WHEN pDate IS NULL OR pDate = '' THEN '0' ELSE pDate END) fees
FROM Pay2015 AS A WHERE cID = A.cID AND pType > 1 ORDER BY pType)
SELECT A.cID, GROUP_CONCAT(data.fees, '|') fees FROM Pay2015 AS A
LEFT OUTER JOIN data ON A.cID = data.cID
WHERE A.cID = 98 AND pType = 0
The WITH data clause produces a table whose first column is cID, which is 98 for every row and is used in the join. GROUP_CONCAT() will also have to use it because it is the only numeric column, and the ORDER BY sets the row order that I need.
The main query then runs GROUP_CONCAT() on this new table (data), and because data.cID is the same for every row, it concatenates the rows as they are.
For comparison here is regular query that has the order problem:
SELECT cID, GROUP_CONCAT(pType || ':' || pID || ':' || pTotal || ':' ||
(CASE WHEN pDate IS NULL OR pDate = '' THEN '0' ELSE pDate END), '|') fees
FROM Pay2015 AS A WHERE cID = 98 AND pType > 1 ORDER BY pType
Pay2015 table data: (pID is primary key)
And these are the results when ordering by pType: (I use | as the separator and removed all columns but pType for simplicity)
As you can see, the results order is the same as pID order in regular query
What do you think?

The implementation of group_concat() processes the rows in whatever order the database happens to read them.
The only way to enforce an order is to read the rows from a subquery (or view, or CTE) that actually has a specified order:
SELECT ... group_concat(...) ... FROM (SELECT ... ORDER BY ...);
Please note that that subquery must be the only data source in the FROM clause; if you are joining it with any other table, the join might cause the database to read rows in some other order. If you need to join, you must do it inside the subquery.
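As a minimal sketch of the pattern above, here is a runnable version using Python's built-in sqlite3 module. The table is a cut-down, hypothetical Pay2015 (only pID, cID and pType, with made-up rows), and the rows are inserted so that primary-key order differs from pType order. The ordered subquery is the only data source of the aggregate query, as the answer requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Pay2015 (pID INTEGER PRIMARY KEY, cID INTEGER, pType INTEGER)"
)
# Hypothetical rows: pID order deliberately differs from pType order.
conn.executemany(
    "INSERT INTO Pay2015 VALUES (?, ?, ?)",
    [(1, 98, 4), (2, 98, 2), (3, 98, 5), (4, 98, 3), (5, 99, 2)],
)

# The ordered subquery is the ONLY source in the FROM clause.
fees = conn.execute(
    """
    SELECT group_concat(pType, '|')
    FROM (SELECT pType
          FROM Pay2015
          WHERE cID = 98 AND pType > 1
          ORDER BY pType)
    """
).fetchone()[0]

print(fees)
```

If you join the subquery with another table in the outer query, this ordering should no longer be relied on, which is exactly the caveat in the answer.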

Related

What is the difference between using a variable and not using one in MariaDB?

I just want a hint. Is there a type issue?
Issue case:
SET @ids = '4094,8562,11144,3017,5815,11121,1957,4095,8563,11145,3018,5816,8527,11122,1959,4096,8564,3020,5817,8528,11123,1961,4097,8571,3021,6020,8535,11128,1962,5181,8572,3581,6021';
The @ids value is actually collected by GROUP_CONCAT() from a subquery.
SELECT
ifnull(sum(case when a.student IS NOT NULL then total END), 0)
from
tb_class a
WHERE
a.id IN (@ids)
and a.date >= '2023-02-01' AND a.DATE <= '2023-02-02'
==> 0
Correct case:
SELECT
ifnull(sum(case when a.student IS NOT NULL then total END), 0)
from
tb_class a
WHERE
a.id IN (4094,8562,11144,3017,5815,11121,1957,4095,8563,11145,3018,5816,8527,11122,1959,4096,8564,3020,5817,8528,11123,1961,4097,8571,3021,6020,8535,11128,1962,5181,8572,3581,6021)
and a.date >= '2023-02-01' AND a.DATE <= '2023-02-02'
==> 54
I found an answer by googling: use the FIND_IN_SET() function.
SELECT
ifnull(sum(case when a.student IS NOT NULL then total END), 0)
from
tb_class a
WHERE
FIND_IN_SET(a.id, @ids)
and a.date >= '2023-02-01' AND a.DATE <= '2023-02-02'
Variables store single values, not lists. Your @ids is just a string that happens to contain a comma-separated list of numbers. The IN operator only compares against an explicit list; what you are doing is no different from a.id = @ids (which will actually be true, with a warning, for the first number in the list if id is a numeric type, since the string will be converted to a number and the trailing non-numeric portion discarded).
Sometimes you do want to work with a string containing a list of ids such as this, for instance if you have a query that reads many rows and produces a small list of ids to update, and you do not want the update to lock all the rows that were read. Then you can use dynamic SQL:
SET @ids = '4094,8562,...';
SET @sql = concat('select * from a where a.id in (',@ids,')');
prepare stmt from @sql;
execute stmt;
deallocate prepare stmt;
Or, in MariaDB since 10.2:
EXECUTE IMMEDIATE concat('select * from a where a.id in (',@ids,')');
Another alternative is to use FIND_IN_SET, as shown in another answer, but that will not use an index to look up ids, so may be inefficient.
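If the query is issued from application code, a further option is to split the string there and bind one placeholder per id. The sketch below uses Python's sqlite3 module purely as a stand-in for MariaDB, with a hypothetical three-row tb_class. One difference to keep in mind: SQLite does not coerce the string to its leading number the way MariaDB does, so the single-string comparison simply matches nothing here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tb_class (id INTEGER PRIMARY KEY, total INTEGER)")
# Hypothetical data, only to make the two queries comparable.
conn.executemany(
    "INSERT INTO tb_class VALUES (?, ?)",
    [(4094, 10), (8562, 20), (11144, 24), (1, 99)],
)

ids = "4094,8562,11144"  # one string value, not a list

# Binding the whole string hands IN a single element to compare against.
as_string = conn.execute(
    "SELECT ifnull(sum(total), 0) FROM tb_class WHERE id IN (?)", (ids,)
).fetchone()[0]

# Splitting first and binding one placeholder per id hands IN a real list.
id_list = [int(part) for part in ids.split(",")]
placeholders = ",".join("?" for _ in id_list)
as_list = conn.execute(
    f"SELECT ifnull(sum(total), 0) FROM tb_class WHERE id IN ({placeholders})",
    id_list,
).fetchone()[0]

print(as_string, as_list)
```

Unlike FIND_IN_SET, the expanded IN list can still use the index on id.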

Cannot replace a string with several random strings taken from another table in sqlite

I'm trying to replace a placeholder string inside a selection of 10 random records with a random string (a name) taken from another table, using only SQLite statements.
I wrote a subquery so that replace() swaps the placeholder for the subquery's result. I thought that each evaluation of the subquery would load a random name from the names table, but I've found that this is not the case: every placeholder is replaced with the same string.
select id, (replace (snippet, "%NAME%", (select
name from names
where gender = "male"
) )
) as snippet
from imagedata
where timestamp is not NULL
order by random()
limit 10
I was expecting for each row of the SELECT to have different random replacement every time the subquery is invoked.
hello i'm %NAME% and this is my house
This is the car of %NAME%, let me know what you think
instead each row has the same kind of replacement:
hello i'm david and this is my house
This is the car of david, let me know what you think
and so on...
I'm not sure whether it can be done inside SQLite or whether I have to do it in PHP with two separate database queries.
Thanks in advance!
Seems that random() in the subquery is only evaluated once.
Try this:
select
i.id,
replace(i.snippet, '%NAME%', n.name) snippet
from (
select
id,
snippet,
abs(random()) % (select count(*) from names where gender = 'male') + 1 num
from imagedata
where timestamp is not NULL
order by random() limit 10
) i inner join (
select
n.name,
(select count(*) from names where name < n.name and gender = 'male') + 1 num
from names n
where gender = 'male'
) n on n.num = i.num
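To try the query above without an existing database, here is a small sketch using Python's sqlite3 module. The sample rows are hypothetical, and the query assumes that male names are distinct (otherwise two names would share the same rank). Each selected row gets its own random number, which is then matched against the alphabetical rank of a name; two rows can still receive the same name by chance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE names (name TEXT, gender TEXT);
    INSERT INTO names VALUES ('david', 'male'), ('john', 'male'),
                             ('peter', 'male'), ('anna', 'female');
    CREATE TABLE imagedata (id INTEGER PRIMARY KEY, snippet TEXT, timestamp TEXT);
    INSERT INTO imagedata (snippet, timestamp) VALUES
        ('hello i''m %NAME% and this is my house', '2020-01-01'),
        ('This is the car of %NAME%, let me know what you think', '2020-01-02'),
        ('%NAME% was here', NULL);
    """
)

rows = conn.execute(
    """
    SELECT i.id, replace(i.snippet, '%NAME%', n.name) AS snippet
    FROM (
        -- one random rank (1..number of male names) per selected row
        SELECT id, snippet,
               abs(random()) % (SELECT count(*) FROM names WHERE gender = 'male') + 1 AS num
        FROM imagedata
        WHERE timestamp IS NOT NULL
        ORDER BY random() LIMIT 10
    ) i
    INNER JOIN (
        -- alphabetical rank of each male name
        SELECT n.name,
               (SELECT count(*) FROM names WHERE name < n.name AND gender = 'male') + 1 AS num
        FROM names n
        WHERE gender = 'male'
    ) n ON n.num = i.num
    """
).fetchall()

for row in rows:
    print(row)
```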

Query fails to execute after converting a column from Varchar2 to CLOB

I have an Oracle query:
select id from (
select ID, ROW_NUMBER() over (partition by LATEST_RECEIPT order by ID) rownumber
from Table
where LATEST_RECEIPT in
(
select LATEST_RECEIPT from Table
group by LATEST_RECEIPT
having COUNT(1) > 1
)
) t
where rownumber <> 1;
The data type of LATEST_RECEIPT was previously varchar2(4000) and this query worked fine. Since the length of the column needed to be extended, I changed it to CLOB, after which the query fails. Could anyone help me fix this issue or provide a workaround?
You can change your inner query to look for other rows with the same LATEST_RECEIPT value but a different ID (assuming ID is unique); if another row exists then that is equivalent to your count returning greater than one. But you can't simply test two CLOB values for equality; you need to use dbms_lob.compare:
select ID
from your_table t1
where exists (
select null from your_table t2
where dbms_lob.compare(t2.LATEST_RECEIPT, t1.LATEST_RECEIPT) = 0
and t2.ID != t1.ID
-- or if ID isn't unique: and t2.ROWID != t1.ROWID
);
Applying the row number filter is trickier, as you also can't use a CLOB in the analytic partition by clause. As André Schild suggested, you can use a hash; here passing the integer value 3, which is the equivalent of dbms_crypto.hash_sh1 (though in theory that could change in a future release!):
select id from (
select ID, ROW_NUMBER() over (partition by dbms_crypto.hash(LATEST_RECEIPT, 3)
order by ID) rownumber
from your_table t1
where exists (
select null from your_table t2
where dbms_lob.compare(t2.LATEST_RECEIPT, t1.LATEST_RECEIPT) = 0
and t2.ID != t1.ID
-- or if ID isn't unique: and t2.ROWID != t1.ROWID
)
)
where rownumber > 1;
It is of course possible to get a hash collision, and if that happened - you had two latest_receipt values which both appeared more than once and both hashed to the same value - then you could get too many rows back. That seems pretty unlikely, but it's something to consider.
So rather than ordering, you can instead look only for rows which have the same LATEST_RECEIPT and a lower ID:
select ID
from your_table t1
where exists (
select null from your_table t2
where dbms_lob.compare(t2.LATEST_RECEIPT, t1.LATEST_RECEIPT) = 0
and t2.ID < t1.ID
);
Again that assumes ID is unique. If it isn't then you could still use rowid instead, but you would have less control over which rows were found - the lowest rowid isn't necessarily the lowest ID. Presumably you're using this to find rows to delete. If you actually don't mind which row you keep and which you delete then you could still do:
and t2.ROWID < t1.ROWID
But since you are currently ordering that probably isn't acceptable, and hashing might be preferable, despite the small risk.
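The "same value, lower ID" idea can be checked on a small data set. The sketch below uses Python's sqlite3 module as a stand-in for Oracle, so a plain = comparison on a TEXT column takes the place of dbms_lob.compare(...) = 0; the table contents are hypothetical. It returns every row except the lowest ID within each group of duplicates, which is what the original ROW_NUMBER() filter was selecting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (ID INTEGER PRIMARY KEY, LATEST_RECEIPT TEXT)"
)
# Hypothetical rows: 'aaa' appears three times, 'bbb' twice, 'ccc' once.
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [(1, "aaa"), (2, "bbb"), (3, "aaa"), (4, "ccc"), (5, "aaa"), (6, "bbb")],
)

rows = conn.execute(
    """
    SELECT ID
    FROM your_table t1
    WHERE EXISTS (
        SELECT NULL FROM your_table t2
        -- in Oracle with a CLOB column: dbms_lob.compare(...) = 0
        WHERE t2.LATEST_RECEIPT = t1.LATEST_RECEIPT
          AND t2.ID < t1.ID
    )
    ORDER BY ID
    """
).fetchall()
duplicate_ids = [row[0] for row in rows]

print(duplicate_ids)
```

IDs 1 and 2 are kept as the first occurrence of their values and ID 4 has no duplicate, so only the later copies are reported.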

Get several id from pipe-separated string to concat data from another table

I've got two tables, where one holds several id's as pipe-separated string and another that holds names for each id. I want to concat the names as one-liner string with \n between the names.
Tables:
Id-table
StringIds
'1|2|3|4|5|4|1'
Name-table
StringId StringName
1 'One'
2 'Two'
3 'Three'
4 'Four'
5 'Five'
I've tried with following code without any success:
SELECT GROUP_CONCAT(StringName || '\n')
FROM Names
WHERE
StringId
IN
(
SELECT DISTINCT
GROUP_CONCAT(REPLACE(StringIds,'|',','))
FROM Ids
)
ORDER BY StringName ASC
Expected output: 'One'\n'Two'\n'Three'\n'Four'\n'Five'\n
The problem is that the subquery you have used
SELECT DISTINCT
group_concat(replace(StringIds,'|',','))
FROM Ids
actually returns a string '1,2,3,...' not a number list 1,2,3,... as expected.
WHERE StringId IN ((SELECT ...)) will not work with that string: IN expects a list of elements, and the string is ONE element.
So instead you will have to look at the string functions, where you can use the INSTR(X,Y) function to find the StringId.
But here we must pay attention, because if, for example, we were searching for the number 3, then we would find it in:
1,2,3,4
but we would also find it in
1,2,30,4
So the trick is to wrap the separator around both the search string and the string to search in. If we search for ',3,' in ',1,2,3,4,' we get a match, as expected, and if we search in ',1,2,30,4,' we get no match, which is also as expected. This is the reason for the strange concats in our query :)
SELECT group_concat(StringName || '\n') as AllNames
FROM Names
WHERE INSTR(
(',' || (
SELECT DISTINCT
group_concat(replace(StringIds,'|',','))
FROM Ids
) || ','),
(',' || StringId || ',')
) > 0
ORDER BY StringName ASC;
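The effect of wrapping both strings in the separator is easy to verify in isolation. This is a small sketch using Python's sqlite3 module; found is a hypothetical helper introduced only for this illustration, and it simply asks SQLite whether INSTR reports a position greater than zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def found(haystack, needle):
    """Return True if SQLite's instr() finds needle inside haystack."""
    return conn.execute(
        "SELECT instr(?, ?) > 0", (haystack, needle)
    ).fetchone()[0] == 1


# Unwrapped search: '3' is found inside '30', which is a false match.
naive_false_match = found("1,2,30,4", "3")

# Wrapped search: ',3,' does not occur in ',1,2,30,4,' ...
wrapped_no_match = found(",1,2,30,4,", ",3,")

# ... but it does occur in ',1,2,3,4,'.
wrapped_match = found(",1,2,3,4,", ",3,")

print(naive_false_match, wrapped_no_match, wrapped_match)
```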
Now, if we think about it, since we are searching in a string anyway, we might as well use your original string instead of converting it in advance:
SELECT group_concat(StringName || '\n') as AllNames
FROM Names
WHERE INSTR(
('|' || (
SELECT StringIds FROM Ids LIMIT 1
) || '|'),
('|' || StringId || '|')
) > 0
ORDER BY StringName ASC;
And actually there are many more ways we could do this. Let me give you one last version using LIKE comparison instead of INSTR function:
SELECT group_concat(StringName || '\n') as AllNames
FROM Names
WHERE
('|' || (
SELECT StringIds FROM Ids LIMIT 1
) || '|')
LIKE
('%|' || StringId || '|%')
ORDER BY StringName ASC;
UPDATE
If you end up having more entries in your Ids table and you want to print out the unique names for each entry in the Ids table, then you have to turn around the query:
SELECT
( SELECT group_concat(StringName || '\n')
FROM Names
WHERE
('|' || (
StringIds
) || '|')
LIKE
('%|' || StringId || '|%')
ORDER BY StringName ASC
) as AllNames FROM Ids
Now here Ids is the outer table looped through and for each entry the sub query is performed, which returns the AllNames value.

SQL Concatenate multiple rows

I'm using Teradata, I have a table like this
ID String
123 Jim
123 John
123 Jane
321 Jill
321 Janine
321 Johan
I want to query the table so I get
ID String
123 Jim, John, Jane
321 Jill, Janine, Johan
I tried partition but there can be many names.
How do I get this result? Even pointing me in the right direction would be great.
Unfortunately there's no PIVOT in Teradata (only a TD_UNPIVOT in 14.10).
If you're lucky, there's an aggregate UDF at your site to do a group concat (probably unlikely).
Otherwise there are two options: recursion or aggregation.
If the maximum number of rows per id is known aggregation is normally faster. It's a lot of code, but most of it is based on cut&paste.
SELECT
id,
MAX(CASE WHEN rn = 1 THEN string END)
|| MAX(CASE WHEN rn = 2 THEN ',' || string ELSE '' END)
|| MAX(CASE WHEN rn = 3 THEN ',' || string ELSE '' END)
|| MAX(CASE WHEN rn = 4 THEN ',' || string ELSE '' END)
|| ... -- repeat up to the known maximum
FROM
(
SELECT
id, string,
ROW_NUMBER()
OVER (PARTITION BY id
ORDER BY string) AS rn
FROM t
) AS dt
GROUP BY 1;
For large tables it's much more efficient when you materialize the result of the Derived Table in a Volatile Table first using the GROUP BY column as PI.
For recursion you should use a Volatile Table, too, as OLAP functions are not allowed in the recursive part. Using a view instead will repeatedly calculate the OLAP function and thus result in bad performance.
CREATE VOLATILE TABLE vt AS
(
SELECT
id
,string
,ROW_NUMBER()
OVER (PARTITION BY id
ORDER BY string DESC) AS rn -- reverse order!
,COUNT(*)
OVER (PARTITION BY id) AS cnt
FROM t
) WITH DATA
UNIQUE PRIMARY INDEX(id, rn)
ON COMMIT PRESERVE ROWS;
WITH RECURSIVE cte
(id, list, rn) AS
(
SELECT
id
,CAST(string AS VARCHAR(1000)) -- define maximum size based on maximum number of rows
,rn
FROM vt
WHERE rn = cnt
UNION ALL
SELECT
vt.id
,cte.list || ',' || vt.string
,vt.rn
FROM vt
JOIN cte
ON vt.id = cte.id
AND vt.rn = cte.rn - 1
)
SELECT id, list
FROM cte
WHERE rn = 1;
There's one problem with this approach: it might need a lot of spool, which is easy to see when you omit the WHERE rn = 1.
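To see where the intermediate rows come from, here is a sketch of the same recursion run in SQLite through Python's sqlite3 module, as a stand-in for Teradata. The assumptions: a temporary table replaces the volatile table, and correlated counts replace ROW_NUMBER() and COUNT(*) OVER so that the sketch does not depend on window-function support (this requires the strings to be distinct within each id).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, string TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(123, "Jim"), (123, "John"), (123, "Jane"),
     (321, "Jill"), (321, "Janine"), (321, "Johan")],
)

# Stand-in for the volatile table: rn counts down in alphabetical order,
# so rn = cnt is the alphabetically first string of each id.
conn.execute(
    """
    CREATE TEMP TABLE vt AS
    SELECT id, string,
           (SELECT count(*) FROM t t2
            WHERE t2.id = t.id AND t2.string >= t.string) AS rn,
           (SELECT count(*) FROM t t2 WHERE t2.id = t.id) AS cnt
    FROM t
    """
)

recursive_sql = """
    WITH RECURSIVE cte (id, list, rn) AS (
        SELECT id, string, rn FROM vt WHERE rn = cnt
        UNION ALL
        SELECT vt.id, cte.list || ',' || vt.string, vt.rn
        FROM vt JOIN cte ON vt.id = cte.id AND vt.rn = cte.rn - 1
    )
    SELECT id, list, rn FROM cte
"""

# Every recursion step produces a row holding a partial list.
all_rows = conn.execute(recursive_sql + " ORDER BY id, rn DESC").fetchall()

# Only the rows with rn = 1 hold the complete list.
final_rows = conn.execute(
    "SELECT id, list FROM (" + recursive_sql + ") WHERE rn = 1 ORDER BY id"
).fetchall()

for row in all_rows:
    print(row)
print(final_rows)
```

The unfiltered result has one row per input row, each carrying a progressively longer list, which is why the recursion can need so much spool on large groups.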
SELECT ID,
TRIM(TRAILING ',' FROM (XMLAGG(TRIM(String)|| ',' ORDER BY String) (VARCHAR(10000)))) as Strings
FROM db.table
GROUP BY 1
SQL Server 2017+ and SQL Azure: STRING_AGG
Starting with SQL Server 2017, we can finally concatenate across rows without having to resort to any variable or XML witchery.
STRING_AGG (Transact-SQL)
SELECT ID, STRING_AGG(String, ', ') AS Strings
FROM TableName
GROUP BY ID
