SQLite returning data in custom order

I'm using an SQLite query (in an iOS application) as follows:
SELECT * FROM tblStations WHERE StationID IN ('206','114','113','111','112','213','214','215','602','603','604')
However, I'm getting the resulting data in either descending or ascending order, when what I really want is for the data to be returned in the order I've specified in the IN clause.
Is this possible?

A trivial way to sort the results:
NSArray *stationIDs = @[@206, @114, @113, @111, @112, @213, @214, @215, @602, @603, @604];
NSArray *stations = @[@{@"Id": @(604)}, @{@"Id": @(603)}, @{@"Id": @(602)}, @{@"Id": @(215)},
                      @{@"Id": @(214)}, @{@"Id": @(213)}, @{@"Id": @(112)}, @{@"Id": @(111)},
                      @{@"Id": @(113)}, @{@"Id": @(114)}, @{@"Id": @(206)}];
// Sort the fetched rows by each row's position in the stationIDs array:
stations = [stations sortedArrayUsingComparator:
    ^NSComparisonResult(NSDictionary *dict1, NSDictionary *dict2)
{
    NSUInteger index1 = [stationIDs indexOfObject:dict1[@"Id"]];
    NSUInteger index2 = [stationIDs indexOfObject:dict2[@"Id"]];
    return [@(index1) compare:@(index2)];
}];

You could use a CASE expression to map these station IDs to another value that is suitable for sorting:
SELECT *
FROM tblStations
WHERE StationID IN ('206','114','113','111','112',
'213','214','215','602','603','604')
ORDER BY CASE StationID
WHEN '206' THEN 1
WHEN '114' THEN 2
WHEN '113' THEN 3
WHEN '111' THEN 4
WHEN '112' THEN 5
WHEN '213' THEN 6
WHEN '214' THEN 7
WHEN '215' THEN 8
WHEN '602' THEN 9
WHEN '603' THEN 10
WHEN '604' THEN 11
END

I don't believe there's any means of returning SQL data in an order that isn't ascending, descending or random (either intentionally so, or simply in the order the database engine chooses to return the data).
As such, it would probably make sense to simply fetch all of the data returned by the SQLite query and store it in an NSDictionary keyed on the StationID value. It would then be trivial to retrieve the rows in the order you require.

Add an additional column to use for sorting, e.g. a column named "sortMePlease". Fill this column according to your needs (for the row with StationID 206 enter 1, for 114 enter 2, and so on), and finally add "ORDER BY sortMePlease ASC" to your query, as sketched below.
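A minimal sketch of this approach, using the table and IDs from the question (the sortMePlease column name is this answer's suggestion):
ALTER TABLE tblStations ADD COLUMN sortMePlease INTEGER;
UPDATE tblStations SET sortMePlease = 1  WHERE StationID = '206';
UPDATE tblStations SET sortMePlease = 2  WHERE StationID = '114';
UPDATE tblStations SET sortMePlease = 3  WHERE StationID = '113';
UPDATE tblStations SET sortMePlease = 4  WHERE StationID = '111';
UPDATE tblStations SET sortMePlease = 5  WHERE StationID = '112';
UPDATE tblStations SET sortMePlease = 6  WHERE StationID = '213';
UPDATE tblStations SET sortMePlease = 7  WHERE StationID = '214';
UPDATE tblStations SET sortMePlease = 8  WHERE StationID = '215';
UPDATE tblStations SET sortMePlease = 9  WHERE StationID = '602';
UPDATE tblStations SET sortMePlease = 10 WHERE StationID = '603';
UPDATE tblStations SET sortMePlease = 11 WHERE StationID = '604';
SELECT * FROM tblStations
WHERE StationID IN ('206','114','113','111','112','213','214','215','602','603','604')
ORDER BY sortMePlease ASC;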

A second way of doing it (the first being CASE WHEN ... THEN ... END, as already shown in another answer) is to order by a series of comparisons; each evaluates to 1 for the matching row and 0 otherwise, so DESC floats that station to the top of the remaining rows:
ORDER BY StationID=206 DESC,
StationID=114 DESC,
StationID=113 DESC,
StationID=111 DESC,
StationID=112 DESC,
StationID=213 DESC,
etc.

Android Room database Dao two queries?

How do I return the results of two queries using one @Query statement?
I have a database of items with a single table. Each item has a due date (saved as a long in the Room database) or no due date (saved as -1 in the database). I would like to have a query that returns all items with due dates in ascending order and then return all of the remaining items, sorted by a timestamp that is saved in the database. The timestamp represents the calendar date and time when the item was originally saved to the Room database.
Here is an example of the output I expect, using a U.S. calendar for the due dates:
8/17/2022 (August 17, 2022 due date)
8/19/2022 (due date)
12/15/2022 (due date)
5601 timestamp (no due date)
4200 timestamp (no due date)
1150 timestamp (no due date)
The query below in the Dao returns the expected results for the first part, the ascending due dates. So how do I extend it with the second part, where I also return the items that have no due dates and show their timestamps in descending order? I tried multiple ways to use UNION, UNION ALL, etc. with no luck.
#Query("SELECT * FROM cards WHERE cardDuedatentime !=-1 ORDER BY cardDuedatentime ASC")
First sort by the boolean expression cardDuedatentime = -1 to get all the rows with no due date at the bottom of the resultset.
Then use conditional sorting with a CASE expression to sort the rows with no due date descending and the rows with a valid due date ascending:
SELECT *
FROM cards
ORDER BY cardDuedatentime = -1,
CASE WHEN cardDuedatentime = -1 THEN -timestamp ELSE cardDuedatentime END;
If you want only 1 column in the results:
SELECT CASE WHEN cardDuedatentime = -1 THEN timestamp ELSE cardDuedatentime END time
FROM cards
ORDER BY cardDuedatentime = -1,
CASE WHEN cardDuedatentime = -1 THEN -timestamp ELSE cardDuedatentime END;
See the demo.
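As a self-contained sketch of the same ordering, assuming the due dates are stored as Unix epoch seconds (values approximating the question's sample data):
WITH cards(cardDuedatentime, timestamp) AS (
    VALUES (1660694400, 0), (1660867200, 0), (1671062400, 0),
           (-1, 5601), (-1, 4200), (-1, 1150)
)
SELECT *
FROM cards
ORDER BY cardDuedatentime = -1,
         CASE WHEN cardDuedatentime = -1 THEN -timestamp ELSE cardDuedatentime END;
-- Returns the three dated rows ascending, then the undated rows with
-- timestamps 5601, 4200, 1150 (descending), matching the expected output.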
If I understand your question correctly then I believe that you could use:-
#Query("WITH cte1 AS (SELECT * FROM cards WHERE cardDueDatentime != -1 ORDER BY cardDueDatentime ASC),cte2 AS (SELECT * FROM cards WHERE cardDueDatentime = -1 ORDER BY timestamp ASC) SELECT * FROM cte1 UNION ALL SELECT * FROM cte2;")
The following was used to test/demonstrate:-
DROP TABLE IF EXISTS cards;
CREATE TABLE IF NOT EXISTS cards (cardDueDatentime INTEGER,timestamp INTEGER, othercolumns TEXT);
INSERT INTO cards VALUES
(strftime('%s','2022-08-17'),strftime('%s','now'),'A')
,(-1,5601,'A')
,(strftime('%s','2022-08-11'),strftime('%s','now'),'A')
,(-1,4201,'A')
,(strftime('%s','2022-12-15'),strftime('%s','now'),'A')
,(-1,1150,'A')
;
WITH
cte1 AS (SELECT * FROM cards WHERE cardDueDatentime != -1 ORDER BY cardDueDatentime ASC),
cte2 AS (SELECT * FROM cards WHERE cardDueDatentime = -1 ORDER BY timestamp ASC)
SELECT * FROM cte1 UNION ALL SELECT * FROM cte2;
DROP TABLE IF EXISTS cards;
The result from executing the above (using the Navicat for SQLite tool) shows the rows in the expected order.
Two CTEs (Common Table Expressions, akin to temporary tables) were used, each extracting one of the two sets of data and, importantly, sorting it independently. They are then combined via UNION ALL without a further sort (as a sort would affect the complete set of data).
Note how the data has purposefully been inserted so that it is not already appropriately sorted.
Here's an SQLFiddle of the above
An even simpler way would be to use:-
#Query("SELECT * FROM cards ORDER BY cardDueDatentime=-1 ASC,cardDueDatentime ASC, timestamp ASC;")
Using the data above, this gives the same result. It works because cardDueDatentime=-1 evaluates to either true (1) or false (0): -1 evaluates to 1 and a valid datetime evaluates to 0, so the valid datetimes precede the invalid (-1) datetimes. The subsequent sort fields then sort each set accordingly.
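To see the boolean evaluation directly, the sort key can be selected alongside the data (a sketch against the demo table above):
SELECT cardDueDatentime, cardDueDatentime = -1 AS no_due_date
FROM cards;
-- no_due_date is 1 for the -1 rows and 0 for the valid datetimes.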
If you wanted to treat any invalid date (less than 0) the same way, you could use something like:-
#Query("SELECT * FROM cards ORDER BY cardDueDatentime<0 ASC,max(CAST(cardDueDatentime AS INTEGER),0) ASC, timestamp ASC;")
So if you had additional rows inserted, such as:-
....
,(-2,111,'B')
,(-3,11,'C')
,(-1,1,'X')
Then the result would be as expected, whilst with the first, simpler SELECT the additional data would give the WRONG result: for the C row, as -3 is not equal to -1, it would be treated as if it were a valid date. Using < 0 instead treats it as an invalid date, so it is included in the set of invalid dates.
However, with < 0 allowing -3 into the invalid-date set, the second sort field (cardDueDatentime) on its own would place -3 before -2 and before -1; the max() function clamps every negative value to 0, making all the invalid dates equal, so the third sort field (timestamp) becomes the applicable sort within the set of invalid dates.
This could be useful if, for some reason, you wanted different sets/types of invalid dates without affecting the query.
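To make the mechanics visible, the three sort keys can be selected alongside the data (a sketch against the demo table, after inserting the additional rows):
SELECT cardDueDatentime, timestamp,
       cardDueDatentime < 0 AS key1,
       max(CAST(cardDueDatentime AS INTEGER), 0) AS key2,
       timestamp AS key3
FROM cards
ORDER BY key1 ASC, key2 ASC, key3 ASC;
-- key1 splits valid from invalid dates; key2 clamps every negative "date"
-- to 0 so that key3 (the timestamp) decides the order within the invalid set.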

Count(case when) redshift sql - receiving groupby error

I'm trying to do a count(case when) in Amazon Redshift.
Using this reference, I wrote:
select
sfdc_account_key,
record_type_name,
vplus_stage,
vplus_stage_entered_date,
site_delivered_date,
case when vplus_stage = 'Lost' then -1 else 0 end as stage_lost_yn,
case when vplus_stage = 'Lost' then 2000 else 0 end as stage_lost_revenue,
case when vplus_stage = 'Lost' then datediff(month,vplus_stage_entered_date,CURRENT_DATE) else 0 end as stage_lost_months_since,
count(case when vplus_stage = 'Lost' then 1 else 0 end) as stage_lost_count
from shared.vplus_enrollment_dim
where record_type_name = 'APM Website';
But I'm getting this error:
[42803][500310] [Amazon](500310) Invalid operation: column "vplus_enrollment_dim.sfdc_account_key" must appear in the GROUP BY clause or be used in an aggregate function; java.lang.RuntimeException: com.amazon.support.exceptions.ErrorException: [Amazon](500310) Invalid operation: column "vplus_enrollment_dim.sfdc_account_key" must appear in the GROUP BY clause or be used in an aggregate function;
Query was running fine before I added the count. I'm not sure what I'm doing wrong here -- thanks!
You cannot have an aggregate function (sum, count, etc.) alongside non-aggregated columns without a GROUP BY.
The syntax is like this
select a, count(*)
from table
group by a (or group by 1 in Redshift)
In your query you need to add
group by 1,2,3,4,5,6,7,8
because you have 8 columns other than the count (see the sketch below).
Since I don't know your data and use case, I cannot tell you whether it will give the right result, but the SQL will be syntactically correct.
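For reference, a sketch of the question's query with the GROUP BY added. Note also that COUNT() counts any non-null value, so the question's count(case ... else 0 end) would count every row; dropping the ELSE (so non-matching rows yield NULL) counts only the 'Lost' rows:
select
    sfdc_account_key,
    record_type_name,
    vplus_stage,
    vplus_stage_entered_date,
    site_delivered_date,
    case when vplus_stage = 'Lost' then -1 else 0 end as stage_lost_yn,
    case when vplus_stage = 'Lost' then 2000 else 0 end as stage_lost_revenue,
    case when vplus_stage = 'Lost' then datediff(month, vplus_stage_entered_date, CURRENT_DATE) else 0 end as stage_lost_months_since,
    count(case when vplus_stage = 'Lost' then 1 end) as stage_lost_count -- no ELSE: NULLs are not counted
from shared.vplus_enrollment_dim
where record_type_name = 'APM Website'
group by 1, 2, 3, 4, 5, 6, 7, 8;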
The basic rule is:
If you are using an aggregate function (eg COUNT(...)), then you must supply a GROUP BY clause to define the grouping
Exception: If all columns are aggregates (eg SELECT COUNT(*), AVG(sales) FROM table)
Any columns that are not aggregate functions must appear in the GROUP BY (eg SELECT year, month, AVG(sales) FROM table GROUP BY year, month)
Your query has a COUNT() aggregate function mixed-in with non-aggregate values, which is giving rise to the error.
In looking at your query, you probably don't want to group on all of the columns (eg stage_lost_revenue and stage_lost_months_since don't look like likely grouping columns). You might want to mock-up a query result to figure out what you actually want from such a query.

Find range of values between 2 columns in Oracle DB

Hi, I have a table with 2 columns that define a range, e.g. Range Start = ABC1/000/0/0000 and Range End = ABC1/000/0/1022.
I have to get all the values between this range and then join the result with another table. Can you let me know how I can generate all the values (using the DUAL table)? I am using Oracle 11g.
Basically I need to make a list whose first value is ABC1/000/0/0000, second is ABC1/000/0/0001, and so on up to ABC1/000/0/1022.
I have no idea what you mean by "storing values temporarily in DUAL". DUAL is a single-column table with a single value!
However, something like this might be what you want. If it's not, then perhaps you could elaborate on your problem a little further:
select blah
from another_table
where somekey in
( select blah
from table
where col between <rangeStart> and <rangeEnd>
)
So, it seems you need a few things.
Separate the "last value" from a slash-separated string, such as ABC1/000/0/0000. It is best to do this with the standard substr() and instr() functions, not with regular expressions (for faster execution). In instr() we can use a negative argument for occurrence, to indicate "counting from the end of the string". Something like this:
select range_from, substr(range_from, instr(range_from, '/', -1) + 1)
from ...
Actually, you will need to convert this to a number with to_number() for further processing, and you will also need to capture the substring up to the last slash (a similar use of substr() and instr()). And you will need to do the same for range_to.
Generate all the numbers from the first value to the last value. This is easily done with a connect by level query (hierarchical query). Some care must be taken since we may need to do this for several input rows (input ranges) at once.
Then put everything back together and use the result in further processing.
I will assume that the range_from string contains at least one slash, that the substring between the last slash and the end of the string represents a non-negative integer in character format, and the range_to similarly contains at least one slash and the substring from the last slash to the end of the string represents a non-negative integer. It is your responsibility to guarantee that this integer is greater than or equal to the one from range_from. In the output I will use the same substring UP TO the last slash as I find in range_from; if the requirement is that range_to must have the same initial substring, it is your responsibility to guarantee that.
I will also assume that the width (number of characters) of the "number" part (the last token in the strings) is not known beforehand and must be calculated in the query.
with
test_data( id, range_from, range_to ) as (
select 1, 'ABC1/000/0/2033', 'ABC1/000/0/2035' from dual union all
select 2, 'xyz/33/200' , 'xyz/33/200' from dual union all
select 3, '300/LMN/000' , '300/LMN/003' from dual
)
-- end of test data; SQL query begins below this line
select id, stub || lpad(to_char(from_nbr + level - 1), len, '0') as val
from (
select id, stub, length(from_str) as len, to_number(from_str) as from_nbr,
to_number(to_str) as to_nbr
from (
select id, substr(range_from, 1, instr(range_from, '/', -1)) as stub,
substr(range_from, instr(range_from, '/', -1) + 1) as from_str,
substr(range_to , instr(range_to , '/', -1) + 1) as to_str
from test_data
)
)
connect by level <= 1 + to_nbr - from_nbr
and prior id = id
and prior sys_guid() is not null
order by id, level -- if needed
;
ID VAL
-- --------------------
1 ABC1/000/0/2033
1 ABC1/000/0/2034
1 ABC1/000/0/2035
2 xyz/33/200
3 300/LMN/000
3 300/LMN/001
3 300/LMN/002
3 300/LMN/003
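To then join the generated values to another table, as the question asks, the same query can be wrapped as a CTE; other_table and its code column are hypothetical:
with
test_data( id, range_from, range_to ) as (
  select 1, 'ABC1/000/0/2033', 'ABC1/000/0/2035' from dual
),
expanded as (
  select id, stub || lpad(to_char(from_nbr + level - 1), len, '0') as val
  from (
    select id,
           substr(range_from, 1, instr(range_from, '/', -1)) as stub,
           length(substr(range_from, instr(range_from, '/', -1) + 1)) as len,
           to_number(substr(range_from, instr(range_from, '/', -1) + 1)) as from_nbr,
           to_number(substr(range_to , instr(range_to , '/', -1) + 1)) as to_nbr
    from test_data
  )
  connect by level <= 1 + to_nbr - from_nbr
         and prior id = id
         and prior sys_guid() is not null
)
select o.*
from expanded e
join other_table o on o.code = e.val;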

Cognos: Count the number of occurrences of a distinct id

I'm making a report in Cognos Report Studio and I'm having a bit of trouble getting a count that I need. What I need to do is count the number of IDs for a department, but I need to split the count between initiated and completed. If an ID occurs more than once, it is to be counted as completed; the others, of course, will be initiated. So I'm trying to count the number of occurrences of each distinct ID. Here is the query I've made in SQL Developer:
SELECT
COUNT((CASE WHEN COUNT(S.RFP_ID) > 8 THEN MAX(CT.GCT_STATUS_HISTORY_CLOSE_DT) END)) AS "Sales Admin Completed"
,COUNT((CASE WHEN COUNT(S.RFP_ID) = 8 THEN MIN(CT.GCT_STATUS_HISTORY_OPEN_DT) END)) as "Sales Admin Initiated"
FROM
ADM.B_RFP_WC_COVERAGE_DIM S
JOIN ADM.B_GROUP_CHANGE_REQUEST_DIM CR
ON S.RFP_ID = CR.GCR_RFP_ID
JOIN ADM.GROUP_CHANGE_TASK_FACT CT
ON CR.GROUP_CHANGE_REQUEST_KEY = CT.GROUP_CHANGE_REQUEST_KEY
JOIN ADM.B_DEPARTMENT_DIM D
ON D.DEPARTMENT_KEY = CT.DEPARTMENT_RESP_KEY
WHERE CR.GCR_CHANGE_TYPE_ID = '20'
AND S.RFP_LOB_IND = 'WC'
AND S.RFP_AUDIT_IND = 'N'
AND CR.GCR_RECEIVED_DT BETWEEN '01-JAN-13' AND '31-DEC-13'
AND D.DEPARTMENT_DESC = 'Sales'
AND CT.GCT_STATUS_IND = 'C'
GROUP BY S.RFP_ID ;
Now this works, but I'm not sure how to translate that into Cognos. I tried a CASE that looked like this (this code uses basic names, such as dept instead of D.DEPARTMENT_DESC):
CASE WHEN dept = 'Sales' AND count(ID for {DISTINCT ID}) > 1 THEN count(distinct ID) END
I'm using count(distinct ID) instead of count(maximum(close_date)), but the results would be the same anyway. The "AND" is where I think it's being lost; it obviously isn't the proper way to count occurrences, but I'm hoping I'm close. Is there a way to do this with a CASE? Or at all?
--EDIT--
To make my question more clear, here is an example:
Say I have this data in my table
ID
---
1
2
3
4
2
5
5
6
2
My desired count output would be:
Initiated Completed
--------- ---------
4         2
This is because two of the distinct IDs (2 and 5) occur more than once, so they are counted as Completed. The ones that occur only once are counted as Initiated. I am able to do this in SQL Developer, but I can't figure out how to do it in Cognos Report Studio. I hope this helps to better explain my issue.
Oh, I didn't quite get it originally; amending the answer.
It's still easiest to do with 2 queries in Report Studio. The key point is that you can use a query as the source for another query, guaranteeing proper group bys and calculations.
So if you have the ID list in a table in Report Studio, you create:
Query 1 with data items:
ID,
count(*) or count(1) as count_occurrences,
status (initiated or completed) with the formula: if (count_occurrences > 1) then ('completed') else ('initiated')
After that, you create Query 2 using Query 1 as its source, with just 2 data items:
[Query1].[Status]
Count with formula: count([Query1].[ID])
That will give you the result you're after.
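The equivalent logic in plain SQL may help clarify what the two nested queries compute (table and column names are illustrative):
SELECT status, COUNT(*) AS id_count
FROM (
    SELECT ID,
           CASE WHEN COUNT(*) > 1 THEN 'completed' ELSE 'initiated' END AS status
    FROM your_table
    GROUP BY ID
) per_id
GROUP BY status;
-- With the sample IDs above this returns: initiated 4, completed 2.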
Here's a link to the documentation on how to nest queries:
http://pic.dhe.ibm.com/infocenter/cx/v10r1m0/topic/com.ibm.swg.ba.cognos.ug_cr_rptstd.10.1.0.doc/c_cr_rptstd_wrkdat_working_with_queries_rel.html?path=3_3_10_6#cr_rptstd_wrkdat_working_with_queries_rel

Fastest Way to Count Distinct Values in a Column, Including NULL Values

The Transact-SQL Count Distinct operation counts all non-null values in a column. I need to count the number of distinct values per column in a set of tables, including null values (so if there is a null in the column, the result should be (Select Count(Distinct COLNAME) From TABLE) + 1).
This is going to be repeated over every column in every table in the DB. That includes hundreds of tables, some of which have over 1M rows. Because this needs to be done over every single column, adding indexes for every column is not a good option.
This will be done as part of an ASP.net site, so integration with code logic is also ok (i.e.: this doesn't have to be completed as part of one query, though if that can be done with good performance, then even better).
What is the most efficient way to do this?
Update After Testing
I tested the different methods from the answers given on a good representative table. The table has 3.2 million records, dozens of columns (a few with indexes, most without). One column has 3.2 million unique values. Other columns range from all Null (one value) to a max of 40K unique values. For each method I performed four tests (with multiple attempts at each, averaging the results): 20 columns at one time, 5 columns at one time, 1 column with many values (3.2M) and 1 column with a small number of values (167). Here are the results, in order of fastest to slowest
Count/GroupBy (Cheran)
CountDistinct+SubQuery (Ellis)
dense_rank (Eriksson)
Count+Max (Andriy)
Testing Results (in seconds):
Method            20_Columns  5_Columns  1_Column (Large)  1_Column (Small)
1) Count/GroupBy        10.8        4.8             2.8               0.14
2) CountDistinct        12.4        4.8             3                 0.7
3) dense_rank          226         30               6                 4.33
4) Count+Max            98.5       44              16                12.5
Notes:
Interestingly enough, the two methods that were fastest (by far, with only a small difference between them) were both methods that submitted separate queries for each column (and in the case of result #2, the query included a subquery, so there were really two queries submitted per column). Perhaps this is because the gains achieved by limiting the number of table scans are small in comparison to the performance hit taken in terms of memory requirements (just a guess).
Though the dense_rank method is definitely the most elegant, it seems that it doesn't scale well (see the result for 20 columns, which is by far the worst of the four methods), and even on a small scale just cannot compete with the performance of Count.
Thanks for the help and suggestions!
SELECT COUNT(*)
FROM (SELECT ColumnName
FROM TableName
GROUP BY ColumnName) AS s;
GROUP BY selects distinct values including NULL. COUNT(*) will include NULLs, as opposed to COUNT(ColumnName), which ignores NULLs.
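A quick way to verify this behaviour with inline test data (a sketch):
SELECT COUNT(*) AS distinct_count_with_null
FROM (SELECT v
      FROM (VALUES ('a'), ('b'), (NULL)) AS t(v)
      GROUP BY v) AS s;
-- Returns 3: the NULL forms its own group, which COUNT(*) includes.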
I think you should try to keep the number of table scans down and count all columns in one table in one go. Something like this could be worth trying.
;with C as
(
select dense_rank() over(order by Col1) as dnCol1,
dense_rank() over(order by Col2) as dnCol2
from YourTable
)
select max(dnCol1) as CountCol1,
max(dnCol2) as CountCol2
from C
Test the query at SE-Data
A development on OP's own solution:
SELECT
COUNT(DISTINCT acolumn) + MAX(CASE WHEN acolumn IS NULL THEN 1 ELSE 0 END)
FROM atable
Run one query that counts the number of distinct values and adds 1 if there are any NULLs in the column (using a subquery):
Select Count(Distinct COLUMNNAME) +
Case When Exists
(Select * from TABLENAME Where COLUMNNAME is Null)
Then 1 Else 0 End
From TABLENAME
You can try:
count(
    distinct coalesce(
        your_table.column_1, your_table.column_2
        -- cast them if the columns are not the same type
    )
) as COUNT_TEST
The coalesce function helps you combine two columns, replacing null values. I used this in my case and it gave the correct result.
Not sure this would be the fastest, but it might be worth testing. Use CASE to give NULL a value; clearly you would need to select a value for NULL that would not occur in the real data. According to the query plan, this would be a dead heat with the COUNT(*) (GROUP BY) solution proposed by Cheran S.
SELECT
COUNT( distinct
(case when [testNull] is null then 'dbNullValue' else [testNull] end)
)
FROM [test].[dbo].[testNullVal]
With this approach you can also count more than one column:
SELECT
COUNT( distinct
(case when [testNull1] is null then 'dbNullValue' else [testNull1] end)
),
COUNT( distinct
(case when [testNull2] is null then 'dbNullValue' else [testNull2] end)
)
FROM [test].[dbo].[testNullVal]
