Oracle PL/SQL - Check number formats and masks - plsql

After a long time without doing PL/SQL...
I need a suggestion from the community for something that is apparently trivial, but I am a bit stuck on it.
I am building a load from a CSV file, and it has a column with the amount.
The CSVs come from different suppliers, and each one can send the amount in a different format.
So I should reject lines from the CSV whose amount is not in the correct number format (999,999,999,999.00), because it may be an incorrect amount reported by the supplier and should be fixed.
For the formats 999.999.999,00 or 999999999,99, I can do some treatment in PL/SQL to convert them.
But I am having problems with values in other formats, such as 999,9,9,9 or whatever...
I am trying to use the common functions (TO_NUMBER, TO_CHAR), but without much success...
SELECT TO_NUMBER('999,9,9,9') FROM DUAL;
or
SELECT TO_NUMBER('999,9,9,9','99G990G990G990G990G990D00') FROM DUAL;
The result is ORA-01722: invalid number, and that is brilliant! However, it would reject other formats that seem correct, like 9,999.99.
SELECT TO_NUMBER('999,9,9,9','99G999G999G999G999G999D00') FROM DUAL;
With this format mask, the value 999,9,9,9 is converted to 999999, which is not fine. However, the same format mask works fine for 9,999.99, for example.
Do you know any other function provided by Oracle that can help solve my issue?
Or any suggestion on how I can do this?
Thank you very much.
Att.,
Guilherme

You can use a regular expression to determine whether a value is a valid number. If it is valid, remove the commas (replace them with null) and then convert the result to a number. The queries below use the following regular expression:
'^(\d{1,3})(\,\d{3})*(\.\d{2}|\.?)$'
It breaks down as follows:
^(\d{1,3}) --- at the beginning of the string, 1 to 3 digits
(\,\d{3})* --- followed by 0 or more sets of a comma followed by 3 digits
(\.\d{2}|\.?) --- followed by either a decimal point and 2 digits, OR (|) an optional decimal point
$ --- end of string
Demo:
with test (num, expected)
as (select '999,999,999,999.00', 'valid' from dual union all
select '999,99,999,999.00', 'invalid' from dual union all
select '999.00', 'valid' from dual union all
select '99', 'valid' from dual union all
select '9,999.', 'valid' from dual union all
select '9,999..0', 'invalid' from dual union all
select '999,99999,999.00', 'invalid' from dual
)
select num
, expected
, case when regexp_like(num,'^(\d{1,3})(\,\d{3})*(\.\d{2}|\.?)$')
then to_char(to_number(replace(num,',',null)))
else 'Not Valid Number'
end converted
from test;
In a live setting you would not want the "to_char(to_number( ..." structure. It was used here for demonstration/testing because the then and else branches of the case expression must return the same data type. A live version would look something like:
with test (num)
as (select '999,999,999,999.00' from dual union all
select '999,99,999,999.00' from dual union all
select '999.00' from dual union all
select '99' from dual union all
select '9,999.' from dual union all
select '9,999..0' from dual union all
select '999,99999,999.00' from dual
)
select to_number(replace(num, ',', null))
from test
where regexp_like(num,'^(\d{1,3})(\,\d{3})*(\.\d{2}|\.?)$');
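If the check has to run once per CSV line inside a PL/SQL load, the same test could be wrapped in a small function. This is only a sketch: the name parse_amount is hypothetical, and it assumes the session uses '.' as the decimal separator, as the queries above do.

```sql
-- Hypothetical helper: returns the amount as a NUMBER, or NULL when the text
-- does not match the 999,999,999.00 style accepted by the regular expression.
create or replace function parse_amount(p_text in varchar2) return number
is
begin
  if regexp_like(p_text, '^(\d{1,3})(\,\d{3})*(\.\d{2}|\.?)$') then
    -- Remove the group separators, then convert.
    -- Assumes the session decimal separator is '.'.
    return to_number(replace(p_text, ','));
  end if;
  -- Anything else (for example 999,9,9,9) is reported as NULL so the caller
  -- can reject the line.
  return null;
end parse_amount;
/
```

A loader could then reject every line for which parse_amount(amount_text) is null, where amount_text stands for whatever the raw CSV column is called.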

Related

Why is Oracle SQL function regexp_substr not returning all matching characters?

Could anyone (with extensive experience in regular-expression matching) please clarify for me why the following query returns (what I consider) unexpected results in Oracle 12?
select regexp_substr('My email: test#tes6t.test', '[^#:space:]+#[^#:space:]+')
from dual;
Expected result: test#tes6t.test
Actual result: t#t
Another example:
select regexp_substr('Beneficiary email: super+test.media.beneficiary1#gmail.com', '[^#:space:]+#[^#:space:]+')
from dual;
Expected result: super+test.media.beneficiary1#gmail.com
Actual result: ry1#gm
EDIT:
I double-checked and this is not related to Oracle SQL, but the same behaviour applies to any regex engine.
Even when simplifying the regex to [^:space:]+#[^:space:]+ the results are the same.
I am curious to know why it does not match all the non-whitespace characters before and after the # sign. And why sometimes it matches 1 character, other times 2 or 3 or more characters, but not all.
The POSIX character class you are trying to use is written incorrectly; it needs its own square brackets inside the bracket expression:
SELECT REGEXP_SUBSTR('Beneficiary email: super+test.media.beneficiary1#gmail.com', '[^#[:space:]]+#[^#[:space:]]+')
FROM dual;
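To see why the original pattern stops short, it may help to run both forms side by side. Without the inner brackets, [^#:space:] is an ordinary negated list of the literal characters #, :, s, p, a, c and e, so the match is cut off at the nearest of those letters on each side of the #. The column aliases below are only illustrative.

```sql
select
  -- Negated list of literal characters: # : s p a c e
  regexp_substr('My email: test#tes6t.test',
                '[^#:space:]+#[^#:space:]+')     as literal_list,
  -- Negated list of '#' plus the POSIX whitespace class
  regexp_substr('My email: test#tes6t.test',
                '[^#[:space:]]+#[^#[:space:]]+') as posix_class
from dual;
```

The first column reproduces the t#t from the question (the 'e' and 's' in "test" are excluded characters), while the second should return the whole address.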
or even simpler, assuming you only want to validate by checking for an '#' and the email address is always at the end of the string, after the last space:
WITH tbl(str) AS (
SELECT 'My email: test#tes6t.test' FROM dual UNION ALL
SELECT 'Beneficiary email: super+test.media.beneficiary1#gmail.com' FROM dual
)
SELECT REGEXP_REPLACE(str, '.* (.*#.*)', '\1')
from tbl
;
Note: REGEXP_REPLACE() will return the original string if the match is not found, where REGEXP_SUBSTR() will return NULL. Keep that in mind and handle no match found accordingly. Always expect the unexpected!
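A minimal sketch of that difference, using one row with an address and one made-up row without. The guarded column shows one possible way to get NULL from REGEXP_REPLACE() when nothing matches; the alias names are arbitrary.

```sql
with tbl(str) as (
  select 'My email: test#tes6t.test' from dual union all
  select 'no address here'           from dual   -- hypothetical row with no '#'
)
select str,
       -- Returns str unchanged when the pattern does not match
       regexp_replace(str, '.* (.*#.*)', '\1')           as via_replace,
       -- Returns NULL when the pattern does not match
       regexp_substr(str, '[^#[:space:]]+#[^#[:space:]]+') as via_substr,
       -- Only apply the replacement when the pattern is actually present
       case when regexp_like(str, '.* (.*#.*)')
            then regexp_replace(str, '.* (.*#.*)', '\1')
       end                                                as guarded
from tbl;
```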
The REGEX is not correct in your SQL code. Try
select regexp_substr('Beneficiary email: super+test.media.beneficiary1#gmail.com', '\b[A-Za-z0-9._%+-]+#[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b')
from dual;
select regexp_substr('My email: test#tes6t.test', '\b[A-Za-z0-9._%+-]+#[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b')
from dual;
It gives the result that you expected.

How to convert a number with decimal values to a float in PL/SQL?

The issue is that I need to insert this number into json, and because the number contains a comma, json becomes invalid. A float would work because it contains a period not a comma.
I have tried using replace(v_decimalNumber,',','.') and it kind of works, except that the json property is converted to a string. I need it to remain some type of a numerical value.
How can this be achieved?
I am using Oracle 11g.
You just need to_number() function.
select to_number(replace('1,2', ',', '.')) float_nr from dual;
Result:
FLOAT_NR
1.2
Note that if your number ends in .0, like 1.0, the function will drop it and leave only 1.
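One caveat: TO_NUMBER without a format model takes the decimal separator from the session's NLS_NUMERIC_CHARACTERS, so the query above only works when that separator is a period. A sketch that spells the separators out instead, assuming the input uses a comma as decimal separator and has no group separators (the format masks are assumptions about the size of your values):

```sql
-- Parse '1,2' by declaring ',' as the decimal separator explicitly,
-- instead of using REPLACE and relying on the session settings.
select to_number('1,2', '999999999D999999',
                 'NLS_NUMERIC_CHARACTERS='',.''') as float_nr
from dual;

-- When building the JSON text by hand, render the number with '.' explicitly
-- so the implicit number-to-string conversion cannot reintroduce a comma.
select to_char(1.2, 'TM9', 'NLS_NUMERIC_CHARACTERS=''.,''') as json_number
from dual;
```

Note that TO_CHAR renders values between -1 and 1 without a leading zero (for example .5), which JSON does not accept, so those values would need extra handling.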
The data type of v_decimalNumber must be some character type, since it can contain commas (,). Your contention is that it holds a number once the commas are removed. However, there is NO SUCH THING as a number there until that contention has been validated: being character data, it can hold any character(s) at all, subject only to the length restriction. Take, for example, a spreadsheet column that should contain numeric data. When a value does not apply, users will often type N/A into it to remind themselves that it doesn't apply. Oracle will happily load this into your v_decimalNumber. (And that's 1 of many, many ways non-numeric data can get into your column.) So before attempting to process the value as a number, you must validate that it is in fact valid numeric data. The following demonstrates one such way.
with some_numbers (n) as
( select '123,4456,789.00' from dual union all
select '987654321.00' from dual union all
select '1928374655' from dual union all
select '1.2' from dual union all
select '.1' from dual union all
select '1..1' from dual union all
select 'N/A' from dual
)
, rx as (select '^[0-9]*\.?[0-9]*$' regexp from dual)
select n
, case when regexp_like(replace(n,',',null), regexp)
then to_number(replace(n,',',null))
else null
end Num_value
, case when regexp_like(replace(n,',',null), regexp)
then null
else 'Not valid number'
end msg
from some_numbers,rx ;
Takeaway: Never trust a character-type column to contain anything more specific than arbitrary characters. Always validate, then put the data into appropriately defined columns.

Confusion in Union all Operator

I am using Oracle SQL Developer.
I am combining the output of two queries using the union all operator.
I am giving a simple example because I can't share the whole query (working for a bank).
select * from tb1 where rownum between &range1 and &range2
Union all
select * from tb2 where rownum between &range1 and &range2
The first query gives all credit transactions; the second query gives the sum of debits.
But I get a wrong debit amount using the above syntax.
When I use the syntax below, it gives the correct output:
select * from tb1 where row_num between &range1 and &range2
Union all------------bug
select * from tb2 where row_num between &range3 and &range4
I just placed a comment after union all, and this gives the correct output.
I just can't understand how this is possible.

How to replace occurrence only on the start of the string in Oracle SQL?

I have a source column and I want to replace string values starting with 05, 5, 971971 and 97105 so that they start with 9715, as shown in the output table.
SOURCE OUTPUT
0514377920 971514377920
544233920 971544233920
971971511233920 971511233920
9710511233920 971511233920
I tried the following, which works for the first case:
SELECT REGEXP_REPLACE ('0544377905', '^(\05*)', '9715') FROM dual;
But the following does not work for the second case:
SELECT REGEXP_REPLACE ('544377905', '^(\5*)', '9715') FROM dual;
Something is wrong with my regular expression, as I am getting: ORA-12727: invalid back reference in regular expression.
You can provide your four patterns using alternation; that is, in parentheses with a vertical bar between them:
with t(source) as (
select '0514377920' from dual
union all select '544233920' from dual
union all select '971971511233920' from dual
union all select '9710511233920' from dual
)
SELECT source, REGEXP_REPLACE (source, '^(05|5|9719715|97105)', '9715') as output
FROM t;
SOURCE OUTPUT
--------------- --------------------
0514377920 971514377920
544233920 971544233920
971971511233920 971511233920
9710511233920 971511233920
Depending on your data and any other restrictions you have, you may be able to make it as simple as replacing everything up to and including the first 5 in the string, which works for your small sample:
SELECT source, REGEXP_REPLACE (source, '^[^5]*5', '9715') as output
FROM t;
That matches zero or more characters that are not 5, followed by a 5. That may be too simplistic for your real situation though.

How utl_raw.cast_to_varchar2(NLSSORT()) works

Why is this query not returning any records? Please guide me.
SELECT *
FROM DUAL
WHERE utl_raw.cast_to_varchar2(NLSSORT('sravanth','nls_sort=binary_ai'))
LIKE utl_raw.cast_to_varchar2(NLSSORT('sravan','nls_sort=binary_ai'))|| '%'
The reason this is not working is obvious when you display the output of NLSSORT:
SELECT NLSSORT('sravanth','nls_sort=binary_ai') FROM DUAL
UNION ALL
SELECT NLSSORT('sravan','nls_sort=binary_ai') FROM DUAL
NLSSORT('SRAVANTH','NLS_SORT=BINARY_AI')
73726176616E746800
73726176616E00
^^
Note that NLSSORT adds an extra NUL byte at the end of the string. This is not specified in the documentation, and you probably shouldn't assume it will always behave the same way. Anyway, if you really want to use NLSSORT that way, you will have to handle the extra byte by hand. See https://stackoverflow.com/a/20490866/2363712 for an example.
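One way to handle the extra byte by hand is to trim it from the prefix before appending the wildcard. This is only a sketch, and not necessarily what the linked answer does; it assumes the sort key ends in 00 bytes exactly as in the output shown above, which is undocumented behaviour.

```sql
-- Assumption: NLSSORT appends a trailing 00 byte, as seen in the output above.
-- RTRIM(..., CHR(0)) strips it from the prefix before '%' is appended,
-- so the LIKE pattern becomes 'sravan%' rather than 'sravan' || NUL || '%'.
select *
from dual
where utl_raw.cast_to_varchar2(nlssort('sravanth', 'nls_sort=binary_ai'))
      like rtrim(utl_raw.cast_to_varchar2(nlssort('sravan', 'nls_sort=binary_ai')),
                 chr(0)) || '%';
```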
