show the digits between underscore and dot - oracle11g

In Oracle, I want to show the digits between the underscore and the dot.
For example: pSE1001335806_17950.dat
So the column will be 17950
But for: pSE1001311462_4558.dat
The column will be 4558
How can I do that?

You can use a regular expression for this. If you can generalise the requirement to "the second block of alphanumeric characters", then something this simple would work:
regexp_substr(<value>, '([[:alnum:]]+)', 1, 2)
With a CTE to generate your sample values:
with files as
(
select 'pSE1001335806_17950.dat' as filename from dual
union all select 'pSE1001311462_4558.dat' from dual
)
select regexp_substr(filename, '([[:alnum:]]+)', 1, 2)
from files;
REGEXP_SUBSTR(FILENAME,'([[:ALNUM:]]+)',1,2)
--------------------------------------------------------------------------------
17950
4558
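
The "second block of alphanumeric characters" idea can also be checked outside the database. What follows is a minimal Python sketch, not Oracle code: the helper name second_alnum_block is hypothetical, and [A-Za-z0-9] is assumed to be an adequate stand-in for [[:alnum:]] on plain ASCII file names.

```python
import re

def second_alnum_block(filename):
    """Return the second run of ASCII letters/digits, or None if there is none."""
    blocks = re.findall(r"[A-Za-z0-9]+", filename)
    return blocks[1] if len(blocks) > 1 else None

# The two sample file names from the question.
for name in ("pSE1001335806_17950.dat", "pSE1001311462_4558.dat"):
    print(name, "->", second_alnum_block(name))
```

As with the SQL version, this relies on the file name having the shape prefix_digits.extension; an extra underscore or hyphen before the number would change which block is the second one.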

One way:
select regexp_replace('pSE1001335806_17950.dat', '.*_([0-9]+)\.dat', '\1') from dual;

Related

How to use regexp_substr() to return the numbers after a specific word in a string

I have a table column full of strings like this:
'top-level:volume(1):semifinished(21491628):serial(21441769)'.
I would like to return just the numbers after 'serial' (i.e. '21441769') using regexp_substr().
select ('top-level:volume(1):semifinished(21491628):serial(21441769)', ????)

We can use REGEXP_SUBSTR with a capture group:
SELECT col, REGEXP_SUBSTR(col, 'serial\\((\\d+)\\)', 1, 1, 'e', 1) AS serial
FROM yourTable;

Try the pattern "(?<=serial\()[0-9]+" (note the escaped parenthesis; this only works in a regex engine that supports lookbehind).
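
Both ideas above (a capture group, and a lookbehind) can be tried quickly in Python's re module, which supports fixed-width lookbehind. This is only an illustrative sketch; the function names are hypothetical, and the database's regex dialect may differ from Python's.

```python
import re

text = "top-level:volume(1):semifinished(21491628):serial(21441769)"

def serial_by_group(s):
    """Capture the digits inside serial(...) with a capture group."""
    m = re.search(r"serial\((\d+)\)", s)
    return m.group(1) if m else None

def serial_by_lookbehind(s):
    """Match only the digits, requiring 'serial(' immediately before them."""
    m = re.search(r"(?<=serial\()[0-9]+", s)
    return m.group(0) if m else None

print(serial_by_group(text))
print(serial_by_lookbehind(text))
```

The capture-group form is the more portable of the two, since it does not depend on lookaround support.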

How to format a Float number in SQLite?

In SQLite I need to format a number to display it with the thousand separator and decimal separator. Example: The number 123456789 should be displayed as 1,234,567.89
What I did partially works because it does not display the thousand separator as I expected:
select *, printf ("U$%.2f", CAST(unit_val AS FLOAT) / 100) AS u_val FROM items;
u_val shows: U$1234567.89 but I need U$1,234,567.89

The following is one way that you could accomplish the result:-
select *, printf ("U$%.2f", CAST(unit_val AS FLOAT) / 100) AS u_val FROM items;
Could become :-
SELECT
*,
CASE
WHEN len < 9 THEN myfloat
WHEN len> 8 AND len < 12 THEN substr(myfloat,1,len - 6)||','||substr(myfloat,len - 5)
WHEN len > 11 AND len < 15 THEN substr(myfloat,1,len -9)||','||substr(myfloat,len-8,3)||','||substr(myfloat,len-5)
WHEN len > 14 AND len < 18 THEN substr(myfloat,1,len - 12)||','||substr(myfloat,len -11,3)||','||substr(myfloat,len-8,3)||','||substr(myfloat,len-5)
END AS u_val
FROM
(
SELECT *, length(myfloat) AS len
FROM
(
SELECT *, printf('U$%.2f', CAST(unit_val AS FLOAT) / 100) AS myfloat
FROM Items
)
)
The innermost SELECT extracts the original data plus a new column (myfloat), as per your original SELECT.
The intermediate SELECT adds another column (len) holding the length of myfloat. This could have been done in the innermost SELECT, or you could call length(myfloat) repeatedly in the outermost SELECT, but computing it once simplifies (in my opinion) the outermost SELECT.
RESULT - Example
[Screenshot of the test result omitted. In it, the original columns are highlighted, the formatted result (derived from myfloat) is circled, and the other 2 columns are intermediate columns.]
Edit
As you've clarified that the input is an integer, then :-
SELECT *,'U$'||printf('%,d',(unit_val/100))||'.'||CAST((unit_val % 100) AS INTEGER) AS u_val FROM Items
would work assuming that you are using at least version 3.18 of SQLite.
Correction
Using the SQL immediately above, if the value of the last part (the cents) is less than 10 then the leading 0 is dropped. The correct SQL is shown below. Note that, for simplicity, the CAST has also been dropped, and rather than concatenating the '.', it has been added to the printf format string, so :-
SELECT
'U$' ||
printf('%,d', (unit_val / 100)) ||
printf('.%02d',unit_val % 100)
AS u_val
FROM Items
Or as a single line
SELECT 'U$' || printf('%,d', (unit_val / 100)) || printf('.%02d',unit_val % 100) AS u_val FROM Items
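
To see why the zero padding matters, here is a small Python sketch of the same two formatting strategies. It is not SQLite code; the function names are hypothetical, and unit_val is assumed to be a non-negative integer number of cents.

```python
def naive_u_val(unit_val):
    # Mirrors the first attempt: the cents are concatenated as a plain integer,
    # so a value such as 5 loses its leading zero.
    return "U$" + format(unit_val // 100, ",") + "." + str(unit_val % 100)

def padded_u_val(unit_val):
    # Mirrors the corrected version: the cents are always rendered as two digits.
    return "U$" + format(unit_val // 100, ",") + ".%02d" % (unit_val % 100)

for value in (123456789, 1234505):
    print(value, naive_u_val(value), padded_u_val(value))
```

The two functions agree whenever the cents are 10 or more and differ only when the cents need a leading zero.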

Here is a suggestion:
WITH cte AS (SELECT 123456789 AS unit_val)
SELECT printf('%,d.%02d', unit_val/100, unit_val%100) FROM cte;
The Common Table Expression is just there to supply a dummy value, in the absence of variables.
The %,d format adds thousands separators, but, as many have pointed out, only for integers. Because of that, you will need to use the unit_val twice, once for the integer part, and again to calculate the decimal part.
SQLite truncates integer division, so unit_val/100 gives you your dollar part. The % operator is a remainder operator (not strictly the same as “mod”), so unit_val%100 gives the cents part, as another integer. The %02d format ensures that this is always 2 digits, padding with zeroes if necessary.
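
The division and remainder behaviour described above can be observed directly through Python's built-in sqlite3 module. This is a minimal sketch: split_cents is a hypothetical helper, and the value is passed as a bound integer parameter so that SQLite performs integer arithmetic.

```python
import sqlite3

def split_cents(unit_val):
    """Ask SQLite for (unit_val / 100, unit_val % 100) using integer arithmetic."""
    conn = sqlite3.connect(":memory:")
    try:
        row = conn.execute(
            "SELECT ? / 100, ? % 100", (unit_val, unit_val)
        ).fetchone()
    finally:
        conn.close()
    return row

print(split_cents(123456789))  # dollars and cents for the sample value
print(split_cents(-150))       # a negative value shows truncation and remainder
```

For negative inputs the result differs from Python's own // and % operators, which floor rather than truncate; that is the sense in which % here is a remainder rather than a strict "mod".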

sort semicolon separated values per row in a column

I want to sort semicolon separated values per row in a column. Eg.
Input:
abc;pqr;def;mno
xyz;pqr;abc
abc
xyz;jkl
Output:
abc;def;mno;pqr
abc;pqr;xyz
abc
jkl;xyz
Can anyone help?

Perhaps something like this. Breaking it down:
First we need to break up the strings into their component tokens, and then reassemble them, using LISTAGG(), while ordering them alphabetically.
There are many ways to break up a symbol-separated string. Here I demonstrate the use of a hierarchical query. It requires that the input strings be uniquely distinguished from each other. Since the exact same semicolon-separated string may appear more than once, and since there is no info from the OP about any other unique column in the table, I create a unique identifier (using ROW_NUMBER()) in the most deeply nested subquery. Then I run the hierarchical query to break up the inputs and then reassemble them in the outermost SELECT.
with
test_data as (
select 'abc;pqr;def;mno' as str from dual union all
select 'xyz;pqr;abc' from dual union all
select 'abc' from dual union all
select 'xyz;jkl' from dual
)
-- End of test data (not part of the solution!)
-- SQL query begins BELOW THIS LINE.
select str,
listagg(token, ';') within group (order by token) as sorted_str
from (
select rn, str,
regexp_substr(str, '([^;]*)(;|$)', 1, level, null, 1) as token
from (
select str, row_number() over (order by null) as rn
from test_data
)
connect by level <= length(str) - length(replace(str, ';')) + 1
and prior rn = rn
and prior sys_guid() is not null
)
group by rn, str
;
STR SORTED_STR
--------------- ---------------
abc;pqr;def;mno abc;def;mno;pqr
xyz;pqr;abc abc;pqr;xyz
abc abc
xyz;jkl jkl;xyz
4 rows selected.
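
The split, sort, and reassemble steps are easier to see in a procedural language. The following Python sketch does the same thing per row; sort_tokens is a hypothetical name, and Python's default string ordering is assumed to match the database's ORDER BY for these lowercase ASCII tokens (in Oracle the ordering depends on the session's sort settings).

```python
def sort_tokens(s, sep=";"):
    """Split on the separator, sort the tokens, and join them back together."""
    return sep.join(sorted(s.split(sep)))

rows = ["abc;pqr;def;mno", "xyz;pqr;abc", "abc", "xyz;jkl"]
for row in rows:
    print(row, "->", sort_tokens(row))
```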

REGEXP_SUBSTR to return first and last segment

I have a dataset which may store an account number in several different variations. It may contain hyphens or spaces as segment separators, or it may be fully concatenated. My desired output is the first three and last five alphanumeric characters. I'm having problems with joining the two segments into "FIRST_THREE_AND_LAST_FIVE":
with testdata as (select '1-23-456-78-90-ABCDE' txt from dual union all
select '1 23 456 78 90 ABCDE' txt from dual union all
select '1234567890ABCDE' txt from dual union all
select '123ABCDE' txt from dual union all
select '12DE' txt from dual)
select TXT
,regexp_replace(txt, '[^[[:alnum:]]]*',null) NO_HYPHENS_OR_SPACES
,regexp_substr(regexp_replace(txt, '[^[[:alnum:]]]*',null), '([[:alnum:]]){3}',1,1) FIRST_THREE
,regexp_substr(txt, '([[:alnum:]]){5}$',1,1) LAST_FIVE
,regexp_substr(regexp_replace(txt, '[^[[:alnum:]]]*',null), '([[:alnum:]]){3}',1,1) FIRST_THREE_AND_LAST_FIVE
from testdata;
My desired output would be:
FIRST_THREE_AND_LAST_FIVE
-------------------------
123ABCDE
123ABCDE
123ABCDE
123ABCDE
(null)

Here's my try. Note that when regexp_replace() does not find a match, the original string is returned, that's why you can't get a null directly. My thought was to see if the result string matched the original string but of course that would not work for line 4 where the result is correct and happens to match the original string. Others have mentioned methods for counting length, etc with a CASE but I would get more strict and check for the first 3 being numeric and the last 5 being alpha as well since just checking for 8 characters being returned doesn't guarantee they are the right 8 characters! I'll leave that up to the reader.
Anyway this looks for a digit followed by an optional dash or space (per the specs) and remembers the digit (3 times) then also remembers the last 5 alpha characters. It then returns the remembered groups in that order.
I highly recommend you make this a function where you pass your string in and get a cleaned string in return as it will be much easier to maintain, encapsulate this code for re-usability and allow for better error checking using PL/SQL code.
SQL> with testdata(txt) as (
select '1-23-456-78-90-ABCDE' from dual
union
select '1 23 456 78 90 ABCDE' from dual
union
select '1234567890ABCDE' from dual
union
select '123ABCDE' from dual
union
select '12DE' from dual
)
select
case when length(regexp_replace(upper(txt), '^(\d)[- ]?(\d)[- ]?(\d)[- ]?.*([A-Z]{5})$', '\1\2\3\4')) < 8
-- Needs more robust error checking here
THEN 'NULL' -- for readability
else regexp_replace(upper(txt), '^(\d)[- ]?(\d)[- ]?(\d)[- ]?.*([A-Z]{5})$', '\1\2\3\4')
end result
from testdata;
RESULT
--------------------------------------------------------------------------------
123ABCDE
123ABCDE
123ABCDE
123ABCDE
NULL
SQL>
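
The same pattern can be exercised in Python, where re.sub also returns the input unchanged when nothing matches. This is a rough sketch with a hypothetical function name; it returns None instead of the string 'NULL', and it keeps the same weak length-only check as the query above.

```python
import re

# Three digits (each optionally followed by a dash or space), anything,
# then exactly five trailing uppercase letters.
PATTERN = re.compile(r"^(\d)[- ]?(\d)[- ]?(\d)[- ]?.*([A-Z]{5})$")

def first3_last5_by_regex(txt):
    result = PATTERN.sub(r"\1\2\3\4", txt.upper())
    # No match leaves the string unchanged, so only the length is checked here.
    return result if len(result) >= 8 else None

for txt in ("1-23-456-78-90-ABCDE", "1 23 456 78 90 ABCDE",
            "1234567890ABCDE", "123ABCDE", "12DE"):
    print(txt, "->", first3_last5_by_regex(txt))
```

A non-matching input of eight or more characters passes through unchanged, which is the gap the "more robust error checking" comment refers to.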

You can use the fact that the replacement-string parameter of REGEXP_REPLACE() can take back-references to get a lot closer. Wrapped in a CASE expression, you get what you're after:
select case when length(regexp_replace(txt, '[^[:alnum:]]')) >= 8 then
regexp_replace( regexp_replace(txt, '[^[:alnum:]]')
, '^([[:alnum:]]{3}).*([[:alnum:]]{5})$'
, '\1\2')
end
from test_data
That is, where the length of the string with all non-alphanumeric characters removed is greater than or equal to 8, return the 1st and 2nd groups, which are respectively the first 3 and last 5 alphanumeric characters.
This feels... overly complex. Once you've replaced all non-alpha-numeric characters you can just use an ordinary SUBSTR():
with test_data as (
select '1-23-456-78-90-ABCDE' txt from dual union all
select '1 23 456 78 90 ABCDE' txt from dual union all
select '1234567890ABCDE' txt from dual union all
select '123ABCDE' txt from dual union all
select '12DE' txt from dual
)
, standardised as (
select regexp_replace(txt, '[^[:alnum:]]') as txt
from test_data
)
select case when length(txt) >= 8 then substr(txt, 1, 3) || substr(txt, -5) end
from standardised
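
The "standardise first, then take plain substrings" approach translates almost line for line into Python. This is a sketch only: the function name is hypothetical, and [A-Za-z0-9] is assumed to be equivalent to [:alnum:] for these ASCII account numbers.

```python
import re

def first3_last5_by_slice(txt):
    """Strip non-alphanumerics, then join the first 3 and last 5 characters."""
    cleaned = re.sub(r"[^A-Za-z0-9]", "", txt)
    if len(cleaned) < 8:
        return None  # plays the role of the CASE expression yielding NULL
    return cleaned[:3] + cleaned[-5:]

for txt in ("1-23-456-78-90-ABCDE", "1 23 456 78 90 ABCDE",
            "1234567890ABCDE", "123ABCDE", "12DE"):
    print(txt, "->", first3_last5_by_slice(txt))
```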

I feel like I'm missing something, but can't you just concatenate your two working columns? I.e., since you have successful regex for first 3 and last 5, just replace FIRST_THREE_AND_LAST_FIVE with:
regexp_substr(regexp_substr(regexp_replace(txt, '[^[[:alnum:]]]*',null), '([[:alnum:]]){3}',1,1)||regexp_substr(txt, '([[:alnum:]]){5}$',1,1),'([[:alnum:]]){8}',1,1)
EDIT: Added regexp_substr wrapper to return null when required

How to remove an extra char which is coming in XMLAGG() output

I'm using Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production.
We replaced LISTAGG() with XMLAGG() to avoid the concatenation error.
When I check the length in characters of the output of both functions, XMLAGG() gives one extra character.
Could you please suggest how I can overcome this issue?
Please find the SQL and output below.
XMLAGG():
SELECT TO_CHAR (
SUBSTR (
XMLAGG (XMLELEMENT (e, table_name, CHR (13)).EXTRACT (
'//text()') ORDER BY tablespace_name).GetClobVal (),
1,
2000))
AS str_concate,
LENGTH (
TO_CHAR (
SUBSTR (
XMLAGG (XMLELEMENT (e, table_name, CHR (13)).EXTRACT (
'//text()') ORDER BY tablespace_name).GetClobVal (),
1,
2000)))
AS str_length
FROM all_tables
WHERE table_name = 'TEST_LOAD';
OUTPUT:
STR_CONCATE STR_LENGTH
TEST_LOAD TEST_LOAD 26
LISTAGG()
SELECT LISTAGG (SUBSTR (table_name, 1, 2000), CHR (13))
WITHIN GROUP (ORDER BY tablespace_name)
AS str_concate,
LENGTH (
LISTAGG (SUBSTR (table_name, 1, 2000), CHR (13))
WITHIN GROUP (ORDER BY tablespace_name))
AS str_length
FROM all_tables
WHERE table_name = 'TEST_LOAD';
OUTPUT:
STR_CONCATE STR_LENGTH
TEST_LOAD TEST_LOAD 25

In the case of XMLELEMENT, you actually create a node of the XML tree with two children: table_name and CHR(13). (It may end up looking like a single node since both are text, but that is not important.) This is the expansion of the value_expr nonterminal. The important point is that each node is not aware of the other nodes, so CHR(13) is added to every node as its suffix or, in other words, as a terminator.
In the case of LISTAGG, you describe the aggregation of multiple elements. Here your CHR(13) serves as the delimiter (see the syntax diagram), which is put between elements. It is a separator rather than a terminator.
Since XMLAGG does not suffer from the 4000-character limit, I usually prefer XMLAGG.
If a separator is needed, I recommend prepending it to each value and cutting off the first occurrence using substr. Appending it after each value is possible but makes the expression harder to write.
substr(
xmlagg(
xmlelement(e, ', ' || table_name).extract('//text()')
order by tablespace_name
).getclobval(),
3 -- length(', ')+1
)
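
The terminator-versus-separator difference, and the prepend-then-cut workaround, can be illustrated with plain string joins. This Python sketch uses made-up table names and is only an analogy for what XMLAGG and LISTAGG produce.

```python
names = ["TAB_A", "TAB_B", "TAB_C"]  # hypothetical table names

# XMLAGG-style: the delimiter is attached to every element (a terminator).
terminated = "".join(n + "\r" for n in names)

# LISTAGG-style: the delimiter is placed between elements (a separator).
separated = "\r".join(names)

# Workaround: prepend the delimiter to every element, then cut the first one off.
delimiter = ", "
prepended = "".join(delimiter + n for n in names)[len(delimiter):]

print(len(terminated), len(separated))
print(prepended)
```

The terminated form is always exactly one delimiter longer than the separated form, which matches the one-character difference seen in the question.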