Creating a view in a non-relational database - asp.net

I have an issue and I hope that someone can help me out. I work with a poorly designed database and I have no control over its design. I have a table "Books", and each book can have one or more authors. Unfortunately, the database is not fully relational (please don't ask me why; I have been asking the same question from the beginning). In the table "Books" there are fields called "Author_ID" and "Author_Name", so when a book was written by 2 or 3 authors, their IDs and their names are concatenated in the same record, separated by an asterisk. Here is a demonstration:
ID_BOOK | ID_AUTHOR | NAME AUTHOR | Adress | Country |
----------------------------------------------------------------------------------
001 |01 | AuthorU | AdrU | CtryU |
----------------------------------------------------------------------------------
002 |02*03*04 | AuthorX*AuthorY*AuthorZ | AdrX*NULL*AdrZ | NULL*NULL*CtryZ |
----------------------------------------------------------------------------------
I need to create a view against this table that would give me this result:
ID_BOOK | ID_AUTHOR | NAME AUTHOR | Adress | Country |
----------------------------------------------------------------------------------
001 |01 | AuthorU | AdrU | CtryU |
----------------------------------------------------------------------------------
002 |02 | AuthorX | AdrX | NULL |
----------------------------------------------------------------------------------
002 |03 | AuthorY | NULL | NULL |
----------------------------------------------------------------------------------
002 |04 | AuthorZ | AdrZ | CtryZ |
----------------------------------------------------------------------------------
I will continue trying to do it and I hope that someone could help me with at least some hints. Many thanks guys.
After I applied the solution you gave me, I ran into this problem. I am trying to solve it and hopefully you can help me. When the SQL query runs, the CLOB fields are misaligned when some of them contain NULL values. The result should look like the one above, but I got the result below:
ID_BOOK | ID_AUTHOR | NAME AUTHOR | Adress | Country |
----------------------------------------------------------------------------------
001 |01 | AuthorU | AdrU | CtryU |
----------------------------------------------------------------------------------
002 |02 | AuthorX | AdrX | CtryZ |
----------------------------------------------------------------------------------
002 |03 | AuthorY | AdrZ | NULL |
----------------------------------------------------------------------------------
002 |04 | AuthorZ | NULL | NULL |
----------------------------------------------------------------------------------
Why does it put the NULL values in the end? Thank you.

In 11g you can use recursive subquery factoring for this:
with data (id_book, id_author, name, item_author, item_name, i)
as (select id_book, id_author, name,
regexp_substr(id_author, '[^\*]+', 1, 1) item_author,
regexp_substr(name, '[^\*]+', 1, 1) item_name,
2 i
from books
union all
select id_book, id_author, name,
regexp_substr(id_author, '[^\*]+', 1, i) item_author,
regexp_substr(name, '[^\*]+', 1, i) item_name,
i+1
from data
where regexp_substr(id_author, '[^\*]+', 1, i) is not null)
select id_book, item_author, item_name
from data;
fiddle
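A note on the follow-up in the question (NULL values drifting to the end): this is consistent with how a pattern like '[^*]+' behaves, because it can only match non-empty runs of characters. The Python sketch below illustrates the effect. It assumes that a missing value is stored as an empty element (two adjacent asterisks); the post only displays such values as NULL, so that storage format is a guess.

```python
import re

# Assumed storage: the middle author has no address, so the element is empty.
address = "AdrX**AdrZ"

# Analogue of REGEXP_SUBSTR(col, '[^*]+', 1, n): only non-empty runs match,
# so the empty middle element is skipped and later values shift to the left.
non_empty_runs = re.findall(r"[^*]+", address)

# Splitting on the delimiter keeps one slot per author, empty or not.
positional = address.split("*")

print(non_empty_runs)  # ['AdrX', 'AdrZ']
print(positional)      # ['AdrX', '', 'AdrZ']
```

With only two matches for three authors, the third occurrence comes back as NULL, which is what the misaligned output looks like. An approach that works from delimiter positions, such as the INSTR/SUBSTR variant further down in this thread, should keep the columns aligned.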

A couple of weeks ago I answered a similar question here. That answer has an explanation (I hope) of the general approach, so I'll skip the explanation here. This query will do the trick; it uses REGEXP_SUBSTR and leverages its "occurrence" parameter to pick the individual author IDs and names:
SELECT
ID_Book,
REGEXP_SUBSTR(ID_Author, '[^*]+', 1, Counter) AS AuthID,
REGEXP_SUBSTR(Name_Author, '[^*]+', 1, Counter) AS AuthName
FROM Books
CROSS JOIN (
SELECT LEVEL Counter
FROM DUAL
CONNECT BY LEVEL <= (
SELECT MAX(REGEXP_COUNT(ID_Author, '[^*]+'))
FROM Books))
WHERE REGEXP_SUBSTR(Name_Author, '[^*]+', 1, Counter) IS NOT NULL
ORDER BY 1, 2
There's a Fiddle with your data plus another row here.
Addendum: OP has Oracle 9, not 11, so regular expressions won't work. Following are instructions for doing the same task without regexes...
Without REGEXP_COUNT, the best way to count authors is to count the asterisks and add one. To count asterisks, take the length of the string, then subtract its length when all the asterisks are removed from it: LENGTH(ID_Author) - LENGTH(REPLACE(ID_Author, '*')).
Without REGEXP_SUBSTR, you need to use INSTR to find the position of the asterisks, and then SUBSTR to pull out the author IDs and names. This gets a little complicated. Consider these Author columns from your original post:
AuthorU
AuthorX*AuthorY*AuthorZ
AuthorX lies between the beginning of the string and the first asterisk.
AuthorY is surrounded by asterisks.
AuthorZ lies between the last asterisk and the end of the string.
AuthorU is all alone and not surrounded by anything.
Because of this, the opening piece (WITH AuthorInfo AS... below) adds an asterisk to the beginning and the end so every author name (and ID) is surrounded by asterisks. It also grabs the author count for each row. For the sample data in your original post, the opening piece will yield this:
ID_Book AuthCount ID_Author Name_Author
------- --------- ---------- -------------------------
001 1 *01* *AuthorU*
002 3 *02*03*04* *AuthorX*AuthorY*AuthorZ*
Then comes the join with the "Counter" table and the SUBSTR machinations to pull out the individual names and IDs. The final query looks like this:
WITH AuthorInfo AS (
SELECT
ID_Book,
LENGTH(ID_Author) -
LENGTH(REPLACE(ID_Author, '*')) + 1 AS AuthCount,
'*' || ID_Author || '*' AS ID_Author,
'*' || Name_Author || '*' AS Name_Author
FROM Books
)
SELECT
ID_Book,
SUBSTR(ID_Author,
INSTR(ID_Author, '*', 1, Counter) + 1,
INSTR(ID_Author, '*', 1, Counter+1) - INSTR(ID_Author, '*', 1, Counter) - 1) AS AuthID,
SUBSTR(Name_Author,
INSTR(Name_Author, '*', 1, Counter) + 1,
INSTR(Name_Author, '*', 1, Counter+1) - INSTR(Name_Author, '*', 1, Counter) - 1) AS AuthName
FROM AuthorInfo
CROSS JOIN (
SELECT LEVEL Counter
FROM DUAL
CONNECT BY LEVEL <= (SELECT MAX(AuthCount) FROM AuthorInfo))
WHERE AuthCount >= Counter
ORDER BY ID_Book, Counter
The Fiddle is here
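To make the INSTR/SUBSTR arithmetic easier to follow, here is a small Python sketch of the same steps: count the asterisks, wrap the value in asterisks, then cut between the n-th and (n+1)-th asterisk. The helper names instr and split_wrapped are hypothetical and only mimic the Oracle functions used above; this is an illustration of the logic, not a replacement for the query.

```python
def instr(s, sub, occurrence):
    """1-based position of the n-th occurrence of sub in s, or 0 if absent
    (mimics Oracle's INSTR(s, sub, 1, occurrence))."""
    pos = -1
    for _ in range(occurrence):
        pos = s.find(sub, pos + 1)
        if pos == -1:
            return 0
    return pos + 1


def split_wrapped(value):
    """Split an asterisk-delimited value the way the query above does."""
    # LENGTH(x) - LENGTH(REPLACE(x, '*')) + 1
    auth_count = len(value) - len(value.replace("*", "")) + 1
    # '*' || x || '*' so that every element sits between two asterisks
    wrapped = "*" + value + "*"
    items = []
    for counter in range(1, auth_count + 1):
        start = instr(wrapped, "*", counter) + 1
        length = instr(wrapped, "*", counter + 1) - instr(wrapped, "*", counter) - 1
        # SUBSTR is 1-based, Python slices are 0-based
        items.append(wrapped[start - 1:start - 1 + length])
    return items


print(split_wrapped("02*03*04"))    # ['02', '03', '04']
print(split_wrapped("AdrX**AdrZ"))  # ['AdrX', '', 'AdrZ']
```

Because the cut points come from delimiter positions, an empty element keeps its slot. In Oracle the zero-length SUBSTR would come back as NULL rather than an empty string.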

If you have an authors table, you can do:
select b.id_book, a.id_author, a.NameAuthor
from books b left outer join
authors a
on '*'||NameAuthor||'*' like '%*'||a.author||'*%'

In addition:
SELECT distinct id_book
, trim(regexp_substr(id_author, '[^*]+', 1, LEVEL)) id_author
, trim(regexp_substr(author_name, '[^*]+', 1, LEVEL)) author_name
FROM yourtable
CONNECT BY LEVEL <= regexp_count(id_author, '[^*]+')
ORDER BY id_book, id_author
/
ID_BOOK ID_AUTHOR AUTHOR_NAME
------------------------------------
001 01 AuthorU
002 02 AuthorX
002 03 AuthorY
002 04 AuthorZ
003 123 Jane Austen
003 456 David Foster Wallace
003 789 Richard Wright
No REGEXP:
SELECT str, SUBSTR(str, substr_start_pos, substr_end_pos) final_str
FROM
(
SELECT str, substr_start_pos
, (CASE WHEN substr_end_pos <= 0 THEN (Instr(str, '*', 1)-1)
ELSE substr_end_pos END) substr_end_pos
FROM
(
SELECT distinct '02*03*04' AS str
, (Instr('02*03*04', '*', LEVEL)+1) substr_start_pos
, (Instr('02*03*04', '*', LEVEL)-1) substr_end_pos
FROM dual
CONNECT BY LEVEL <= length('02*03*04')
)
ORDER BY substr_start_pos
)
/
STR FINAL_STR
---------------------
02*03*04 02
02*03*04 03
02*03*04 04

Related

How can I set multiple aliases for a single derived table in MariaDB 5.5?

Consider a database with three tables:
goods (Id is the primary key)
+----+-------+-----+
| Id | Name | SKU |
+----+-------+-----+
| 1 | Nails | 123 |
| 2 | Nuts | 456 |
| 3 | Bolts | 789 |
+----+-------+-----+
invoiceheader (Id is the primary key)
+----+--------------+-----------+---------+
| Id | Date | Warehouse | BuyerId |
+----+--------------+-----------+---------+
| 1 | '2021-10-15' | 1 | 223 |
| 2 | '2021-09-18' | 1 | 356 |
| 3 | '2021-07-13' | 2 | 1 |
+----+--------------+-----------+---------+
invoiceitems (Id is the primary key)
+----+----------+--------+-----+-------+
| Id | HeaderId | GoodId | Qty | Price |
+----+----------+--------+-----+-------+
| 1 | 1 | 1 | 15 | 1.1 |
| 2 | 1 | 3 | 7 | 1.5 |
| 3 | 2 | 1 | 12 | 1.5 |
| 4 | 3 | 3 | 3 | 1.3 |
+----+----------+--------+-----+-------+
What I'm trying to do is to get the MAX(invoiceheader.Date) for every invoiceitems.GoodId. Or, in everyday terms, to find out, preferably in a single query, when was the last time any of the goods were sold, from a specific warehouse.
To do that, I'm using a derived query, and the solution proposed here. In order to be able to do that, I think I need a way of giving multiple (well, two) aliases to a derived table.
My query looks like this at the moment:
SELECT tmp.* /* placing the second alias here, before or after tmp.* doesn't work */
FROM ( /* placing the second alias, tmpClone, here also doesn't work */
SELECT
invoiceheader.Id,
invoiceheader.Date,
invoiceitems.HeaderId,
invoiceitems.Id,
invoiceitems.GoodId
FROM invoiceheader
LEFT JOIN invoiceitems
ON invoiceheader.Id = invoiceitems.HeaderId
WHERE invoiceheader.Warehouse = 3
AND invoiceheader.Date > '0000-00-00 00:00:00'
AND invoiceheader.Date IS NOT NULL
AND invoiceheader.Date > ''
AND invoiceitems.GoodId > 0
ORDER BY
invoiceitems.GoodId ASC,
invoiceheader.Date DESC
) tmp, tmpClone /* this doesn't work with or without a comma */
INNER JOIN (
SELECT
invoiceheader.Id,
MAX(invoiceheader.Date) AS maxDate
FROM tmpClone
WHERE invoiceheader.Warehouse = 3
GROUP BY invoiceitems.GoodId
) headerGroup
ON tmp.Id = headerGroup.Id
AND tmp.Date = headerGroup.maxDate
AND tmp.HeaderId = headerGroup.Id
Is it possible to set multiple aliases for a single derived table? If it is, how should I do it?
I'm using 5.5.52-MariaDB.
You can use both an inner select and a left join to achieve this, for example:
select t1.b, (select t2.b from table2 as t2 where t1.x = t2.x) as 'Y' from table1 as t1 where t1.y = (select t3.y from table3 as t3 where t1.a = t3.a)
While this doesn't answer my original question, it does solve the problem from which the question arose, and I'll leave it here in case anyone ever comes across a similar issue.
The following query does what I'd intended to do - find the newest sale date for the goods from the specific warehouse.
SELECT
invoiceheader.Id,
invoiceheader.Date,
invoiceitems.HeaderId,
invoiceitems.Id,
invoiceitems.GoodId
FROM invoiceheader
INNER JOIN invoiceitems
ON invoiceheader.Id = invoiceitems.HeaderId
INNER JOIN (
SELECT
MAX(invoiceheader.Date) AS maxDate,
invoiceitems.GoodId
FROM invoiceheader
INNER JOIN invoiceitems
ON invoiceheader.Id = invoiceitems.HeaderId
WHERE invoiceheader.Warehouse = 3
AND invoiceheader.Date > '0000-00-00 00:00:00'
AND invoiceheader.Date IS NOT NULL
AND invoiceheader.Date > ''
GROUP BY invoiceitems.GoodId
) tmpDate
ON invoiceheader.Date = tmpDate.maxDate
AND invoiceitems.GoodId = tmpDate.GoodId
WHERE invoiceheader.Warehouse = 3
AND invoiceitems.GoodId > 0
ORDER BY
invoiceitems.GoodId ASC,
invoiceheader.Date DESC
The trick was to join by taking into consideration two things - MAX(invoiceheader.Date) and invoiceitems.GoodId - since one GoodId can only appear once inside a specific invoiceheader / invoiceitems JOINing (strict limit imposed on the part of the code which inserts into invoiceitems).
Whether this is the optimal solution (ignoring the redundant conditions in the query), and whether it would scale well, remains to be seen. It has been tested on tables with ~5000 entries for invoiceheader, ~60000 entries for invoiceitems, and ~4000 entries for goods. Execution time was under 1 second.
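For anyone who wants to try the join pattern in isolation, the sketch below runs a trimmed-down version of it against the sample rows from the question, using Python's built-in sqlite3 module rather than MariaDB. The function name last_sales and the short table aliases are hypothetical, and the warehouse is a parameter because the sample data only contains warehouses 1 and 2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoiceheader (Id INTEGER PRIMARY KEY, Date TEXT, Warehouse INTEGER, BuyerId INTEGER);
CREATE TABLE invoiceitems (Id INTEGER PRIMARY KEY, HeaderId INTEGER, GoodId INTEGER, Qty INTEGER, Price REAL);
INSERT INTO invoiceheader VALUES (1, '2021-10-15', 1, 223), (2, '2021-09-18', 1, 356), (3, '2021-07-13', 2, 1);
INSERT INTO invoiceitems VALUES (1, 1, 1, 15, 1.1), (2, 1, 3, 7, 1.5), (3, 2, 1, 12, 1.5), (4, 3, 3, 3, 1.3);
""")


def last_sales(warehouse):
    """Return (GoodId, HeaderId, Date) of the newest sale per good in one warehouse."""
    sql = """
    SELECT i.GoodId, h.Id AS HeaderId, h.Date
    FROM invoiceheader h
    INNER JOIN invoiceitems i ON h.Id = i.HeaderId
    INNER JOIN (
        -- newest sale date per good within the warehouse
        SELECT i2.GoodId, MAX(h2.Date) AS maxDate
        FROM invoiceheader h2
        INNER JOIN invoiceitems i2 ON h2.Id = i2.HeaderId
        WHERE h2.Warehouse = ?
        GROUP BY i2.GoodId
    ) latest ON latest.GoodId = i.GoodId AND latest.maxDate = h.Date
    WHERE h.Warehouse = ?
    ORDER BY i.GoodId
    """
    return conn.execute(sql, (warehouse, warehouse)).fetchall()


print(last_sales(1))  # [(1, 1, '2021-10-15'), (3, 1, '2021-10-15')]
print(last_sales(2))  # [(3, 3, '2021-07-13')]
```

The dates are stored as ISO-formatted text here, so MAX and equality comparisons behave like date comparisons. If two invoices for the same good share the newest date, both rows are returned.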

How to select a limited amount of values in a complex column in Hive?

I have a table with an id, name and proficiency. The proficiency column is a complex column with the map data type. How do I limit the data shown in the map column to 2 entries?
Example table
ID | name | Proficiency
003 | John | {"Cooking":3, "Talking":6 , "Chopping":8, "Teaching":5}
005 | Lennon | {"Cooking":3, "Programming":6 }
007 | King | {"Chopping":8, "Boxing":5 ,"shooting": 4}
What I want to show after the select statement:
ID | name | Proficiency
003 | John | {"Cooking":3, "Talking":6 }
005 | Lennon | {"Cooking":3, "Programming":6 }
007 | King | {"Chopping":8, "Boxing":5 }
For a fixed number of required map elements, this can be done easily using the map_keys() and map_values() functions, which return arrays of keys and values. You can access each key and value by array index, then assemble the map again using the map() function:
with MyTable as -------use your table instead of this subquery
(select stack(3,
'003', 'John' , map("Cooking",3, "Talking",6 , "Chopping",8, "Teaching",5),
'005', 'Lennon', map("Cooking",3, "Programming",6 ),
'007', 'King' , map("Chopping",8, "Boxing",5 ,"shooting", 4)
) as (ID, name, Proficiency)
) -------use your table instead of this
select t.ID, t.name,
map(map_keys(t.Proficiency)[0], map_values(t.Proficiency)[0],
map_keys(t.Proficiency)[1], map_values(t.Proficiency)[1]
) as Proficiency
from MyTable t
Result:
t.id t.name proficiency
003 John {"Cooking":3,"Talking":6}
005 Lennon {"Cooking":3,"Programming":6}
007 King {"Boxing":5,"shooting":4}
Map does not guarantee the order by definition, and map_keys, map_values return unordered arrays by definition, but they are in the same order when used in the same subquery, so keys are matching to their corresponding values.

T-SQL Server ORDER BY date and nulls last

I am studying for exam 70-761 and there is a challenge that asks to place NULLs at the end when using ORDER BY. I know the solution is this one:
select
orderid,
shippeddate
from Sales.Orders
where custid = 20
order by case when shippeddate is null then 1 else 0 end, shippeddate
What I don't know is why the 1 and 0 are there and how they affect the result. Can anyone clarify?
Best Regards,
Daniel
There are two expressions in your ORDER BY clause. It is like splitting the rows into two groups and then sorting the rows inside each group.
First, because 0 is less than 1, all the orders without a shippeddate (which get 1) are pushed to the end.
Then the rows are ordered by shippeddate.
Example:
orderID | shippeddate
| null
| today
| null
| yesterday
| tomorrow
First, sorting by case when shippeddate is null then 1 else 0 end, we get:
orderID | shippeddate
| today
| yesterday
| tomorrow
| null
| null
Then, continuing to sort by shippeddate, we get:
| yesterday
| today
| tomorrow
| null
| null
Hope it is useful to you.
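The two-level sort can also be checked with a small runnable sketch. It uses Python's built-in sqlite3 module instead of SQL Server, with a hypothetical orders table and made-up dates, and it selects the CASE expression as an extra column (null_flag) so the 0/1 values are visible. The orderid tiebreaker is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (orderid INTEGER, shippeddate TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, None), (2, "2024-01-02"), (3, None), (4, "2024-01-01"), (5, "2024-01-03")],
)

rows = conn.execute("""
    SELECT orderid,
           shippeddate,
           CASE WHEN shippeddate IS NULL THEN 1 ELSE 0 END AS null_flag
    FROM orders
    ORDER BY CASE WHEN shippeddate IS NULL THEN 1 ELSE 0 END,  -- 0 group first, 1 group last
             shippeddate,                                      -- then by date inside each group
             orderid
""").fetchall()

for row in rows:
    print(row)
# (4, '2024-01-01', 0)
# (2, '2024-01-02', 0)
# (5, '2024-01-03', 0)
# (1, None, 1)
# (3, None, 1)
```

The actual numbers do not matter; any pair of values where the non-NULL rows get the smaller one produces the same order.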

Add serial number for each id based on dates

I have a dataset like the one shown below (except for Ser_No, which is the field I want to create).
+--------+------------+--------+
| CaseID | Order_Date | Ser_No |
+--------+------------+--------+
| 44 | 22-01-2018 | 1 |
+--------+------------+--------+
| 44 | 24-02-2018 | 3 |
+--------+------------+--------+
| 44 | 12-02-2018 | 2 |
+--------+------------+--------+
| 100 | 24-01-2018 | 1 |
+--------+------------+--------+
| 100 | 26-01-2018 | 2 |
+--------+------------+--------+
| 100 | 27-01-2018 | 3 |
+--------+------------+--------+
How can I assign a serial number for each CaseID based on my dates? The first date in a specific CaseID gets number 1, the second date in this CaseID gets number 2, and so on.
I'm working with T-SQL, btw.
I've tried a few things:
CASE
WHEN COUNT(CaseID) > 1
THEN ORDER BY (Order_Date)
AND Ser_no +1
END
Thanks in advance.
First of all, although I don't understand what you did, it gives you what you wanted. The serial number is assigned by date order. The problem I can see is that the result shows you the rows in the wrong order (1, 3, 2 instead of 1, 2, 3).
To sort that order you can try this:
SELECT *, ROW_NUMBER() OVER (PARTITION BY caseid ORDER BY caseid, order_date) AS ser_no
FROM [Table]
Thanks for your reply.
Sorry for the misunderstanding: Ser_No is not yet in my table. That is the field I want to calculate.
I solved it myself this morning, and it looks almost the same as your query:
RANK() OVER(PARTITION BY CaseID ORDER BY CaseID, Order_Date ASC)
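As a rough illustration of what the window function computes, the Python sketch below numbers the sample rows per CaseID in date order. It mirrors ROW_NUMBER() rather than RANK(), and the variable names are hypothetical.

```python
from datetime import datetime
from itertools import groupby

# Sample rows from the question: (CaseID, Order_Date as dd-mm-yyyy text)
rows = [
    (44, "22-01-2018"), (44, "24-02-2018"), (44, "12-02-2018"),
    (100, "24-01-2018"), (100, "26-01-2018"), (100, "27-01-2018"),
]


def parse(order_date):
    """Turn the dd-mm-yyyy text into a real date so sorting is chronological."""
    return datetime.strptime(order_date, "%d-%m-%Y")


# ORDER BY CaseID, Order_Date
ordered = sorted(rows, key=lambda r: (r[0], parse(r[1])))

# PARTITION BY CaseID, then count 1, 2, 3, ... inside each partition
numbered = []
for case_id, group in groupby(ordered, key=lambda r: r[0]):
    for ser_no, (_, order_date) in enumerate(group, start=1):
        numbered.append((case_id, order_date, ser_no))

for row in numbered:
    print(row)
```

ROW_NUMBER() and RANK() give the same numbers as long as the dates within a CaseID are unique. If a CaseID has two rows with the same Order_Date, RANK() gives both the same number and then skips one, while ROW_NUMBER() still assigns distinct numbers.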

split the regular expression and loop through

I need to loop inside a join; at least that is what I think I have written.
I am posting the code.
select listagg(request_num,',') within group (order by request_num) as request_num,segmentation_name from (
select MST.REQUEST_NUM,seg_dtls.SEGMENT_NAME,LAST_UPDATED_date,seg_dtls.segmentation_name from
(select * from rp_sr_master ) Mst,
(select SUBSTR(ANSWER,1,INSTR (ANSWER, '~', 1)-1) AS SM_ID,sr_id from rp_sR_details
WHERE Q_ID in (SELECT Q_ID FROM RP_QUESTIONS WHERE field_id='LM_LRE_Q6')
) Dtls, (select SM_ID, SQL_STATEMENT, CREATION_DATE, UPDATED_DATE, SEGMENT_NAME,segmentation_name ,TOTAL_COUNT
from rp_sEGMENT_master ) seg_dtls
where Dtls.SM_ID=seg_dtls.SM_ID
and Dtls.sr_id=Mst.sr_id)
group by segmentation_name;
The problem I am facing here is in the following,
(select SUBSTR(ANSWER,1,INSTR (ANSWER, '~', 1)-1) AS SM_ID,sr_id from rp_sR_details
WHERE Q_ID in (SELECT Q_ID FROM RP_QUESTIONS WHERE field_id='LM_LRE_Q6')
)
In the above code, answer will be something like this:
2603~NG non IaaS IT Professional^2600~NG non IaaS Senior IT^2598~NG data profiling SENIOR IT professional^2595~Nigeria data profiling IT professiona
It only picks the first number that is 2603 and others will be left out.
Is there any way I can loop through all the numbers in that 'ANSWER'?
I am looking for ideas.
Thanks.
One idea is to use a method for splitting a delimited string into rows. You can find examples of this method in the following answers:
Splitting comma separated values in Oracle
How can I use regex to split a string, using a string as a delimiter?
The above solutions use regexp_substr function.
If you dig into the details of Oracle's REGEXP_SUBSTR function you will find that it has an optional occurrence parameter.
This parameter can be combined with the solution shown in this answer:
SQL to generate a list of numbers from 1 to 100
(that is SELECT LEVEL n FROM DUAL CONNECT BY LEVEL <= 100) in the below way:
with xx as (
select '2603~NG non IaaS IT Professional^2600~NG non IaaS Senior '
|| 'IT^2598~NG data profiling SENIOR IT professional^2595~Nigeria '
|| 'data profiling IT professiona' as answer
from dual
)
select LEVEL AS n, regexp_substr( answer, '\d+', 1, level) as nbr
from xx
connect by level <= 6
;
The above query produces the following result:
N |NBR |
--|-----|
1 |2603 |
2 |2600 |
3 |2598 |
4 |2595 |
5 | |
6 | |
What we need is to eliminate null values from the result set. This can be done using a simple IS NOT NULL condition:
with xx as (
select '2603~NG non IaaS IT Professional^2600~NG non IaaS Senior '
|| 'IT^2598~NG data profiling SENIOR IT professional^2595~Nigeria '
|| 'data profiling IT professiona' as answer
from dual
)
select LEVEL AS n, regexp_substr( answer, '\d+', 1, level) as nbr
from xx
connect by regexp_substr( answer, '\d+', 1, level) IS NOT NULL
;
N |NBR |
--|-----|
1 |2603 |
2 |2600 |
3 |2598 |
4 |2595 |
The above query works perfectly for a single record, but gets confused when we try to parse 2 or more rows. Luckily there is another answer on SO that helps to solve this issue:
Is there any alternative for OUTER APPLY in Oracle?
-- source data
WITH xx as (
select 1 AS id,
'2603~NG non IaaS IT Professional^2600~NG non IaaS Senior '
|| 'IT^2598~NG data profiling SENIOR IT professional^2595~Nigeria '
|| 'data profiling IT professiona' as answer
from dual
UNION ALL
select 2 AS id,
'11111~NG non IaaS IT Professional^22222~NG non IaaS Senior '
|| 'IT^2598~NG data 33333 profiling SENIOR IT professional^44~Nigeria '
|| 'data profiling 5 IT professiona 66' as answer
from dual
)
-- end of source data
SELECT t.ID, t1.n, t1.nbr
FROM xx t
CROSS JOIN LATERAL (
select LEVEL AS n, regexp_substr( t.answer, '\d+', 1, level) as nbr
from dual
connect by regexp_substr( t.answer, '\d+', 1, level) IS NOT NULL
) t1;
The above query, which relies on CROSS JOIN LATERAL (available from Oracle 12c), parses numbers from two records and displays them in the following form:
ID |N |NBR |
---|--|------|
1 |1 |2603 |
1 |2 |2600 |
1 |3 |2598 |
1 |4 |2595 |
2 |1 |11111 |
2 |2 |22222 |
2 |3 |2598 |
2 |4 |33333 |
2 |5 |44 |
2 |6 |5 |
2 |7 |66 |
I believe you will manage to merge this simple "parsing" query into your main query.
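One thing to watch in the output above: the pattern '\d+' matches every run of digits, so for the second record it also returns 33333, 5 and 66, which come from the description text and not from the leading IDs. The Python sketch below contrasts that with splitting on '^' first and taking only the part before '~'. The function name leading_ids is hypothetical, and the sketch assumes every segment has the form id~description.

```python
import re

answer = (
    "11111~NG non IaaS IT Professional^22222~NG non IaaS Senior "
    "IT^2598~NG data 33333 profiling SENIOR IT professional^44~Nigeria "
    "data profiling 5 IT professiona 66"
)


def leading_ids(value):
    """Return the text before '~' for each '^'-separated segment."""
    ids = []
    for segment in value.split("^"):
        if not segment:
            continue  # ignore empty segments, e.g. from a trailing '^'
        ids.append(segment.split("~", 1)[0])
    return ids


print(re.findall(r"\d+", answer))  # every digit run, including ones inside descriptions
print(leading_ids(answer))         # only the ID in front of each '~'
```

If the descriptions never contain digits, both approaches return the same list, so whether this matters depends on the data.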

Resources