Joining lists containing Hebrew in Tcl? - sqlite

I'm using Tcl 8.6.11, SQLite 3.35.5, and Manjaro Linux KDE.
I'm trying to take a verse in Hebrew and write it one word per row in a data table. This is one
verse, for example.
בְּרֵאשִׁית בָּרָא אֱלֹהִים אֵת הַשּׁמַיִם וְאֵת הָאָֽרֶץ׃
The goal was to write the data to a list and then join the list as the values part of a SQL insert statement. As you can see, each [lindex $l n] prints as expected but the [join $l ,] starting at the second element places the Hebrew in the first position instead of the last, causing the SQL statement to fail.
How can I get each component of the [join $l ,] to be ordered as they are in the [lindex $l n]?
Thank you.
set l {}
set sql { select * from src_original where type_no=0 and book_no < 40 limit 1}
dbws eval $sql origLang {
set i 0
foreach { x } $origLang(original) { lappend l "($origLang(book_no),$origLang(chapter_no),$origLang(verse_no),[incr i],$x)" }
}
puts [lindex $l 0]; # (1,1,1,1,בְּרֵאשִׁית)
puts [lindex $l 1]; # (1,1,1,2,בָּרָא)
puts [lindex $l 2]; # (1,1,1,3,אֱלֹהִים)
puts [lindex $l 3]; # (1,1,1,4,אֵת)
puts [lindex $l 4]; # (1,1,1,5,הַשּׁמַיִם)
puts [lindex $l 5]; # (1,1,1,6,וְאֵת)
puts [lindex $l 6]; # (1,1,1,7,הָאָֽרֶץ׃)
set v [join $l ,]; # (1,1,1,1,בְּרֵאשִׁית),(1,1,1,2,בָּרָא),(1,1,1,3,אֱלֹהִים),(1,1,1,4,אֵת),(1,1,1,5,הַשּׁמַיִם),(1,1,1,6,וְאֵת),(1,1,1,7,הָאָֽרֶץ׃)
set r "insert into vowel_pts (book_no, chapter_no, verse_no, index_no, word) values $v"
dbws eval $r
Thank you for the examples and suggestions. I'd still like to understand whether join resulted in an out-of-order SQL statement, but, after looking at the SQL provided by @Shawn, I tried using the SQLite JSON extension and the following also works. If the limitations in the where clause of the arr_vp table are removed, such that all the words from every verse in the thirty-nine books of the Old Testament are written as individual rows, it completes in a few seconds on my ten-year-old average laptop, as @DonalFellows suggested. Thanks again.
with
arr_vp as (
select book_no, chapter_no, verse_no,
'["' || replace(original,' ', '","' ) || '"]' as t
from src_original
where book_no=1
and chapter_no=1
and verse_no < 3
and type_no=0
)
select a.book_no, a.chapter_no, a.verse_no,
(key+1) as index_no,
j.value as vowel_pts
from arr_vp a,
json_each( ( select t
from arr_vp r
where r.book_no=a.book_no
and r.chapter_no=a.chapter_no
and r.verse_no=a.verse_no ) ) as j
where j.type = 'text';

As always with SQL, use parameters in a prepared statement instead of trying to add values directly into a query string at runtime. Something like:
# Populate an array of dicts
set l {}
set sql {select * from src_original where type_no=0 and book_no < 40 limit 1}
dbws eval $sql origLang {
    set i 0
    foreach x $origLang(original) {
        lappend l [dict create book_no $origLang(book_no) \
                       chapter_no $origLang(chapter_no) \
                       verse_no $origLang(verse_no) \
                       index_no [incr i] \
                       word $x]
    }
}
# And insert them one at a time.
foreach w $l {
    dict with w {
        dbws eval {
            INSERT INTO vowel_pts(book_no, chapter_no, verse_no, index_no, word)
            VALUES ($book_no, $chapter_no, $verse_no, $index_no, $word)
        }
    }
}
See the documentation for more about embedding (unevaluated) variable names in a SQL statement and binding values to them.
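For comparison, here is a minimal sketch of the same bind-parameters idea using Python's standard sqlite3 module rather than the Tcl binding. The in-memory database and the hard-coded book, chapter, and verse numbers are assumptions for illustration only; the table layout follows the vowel_pts columns from the question.

```python
import sqlite3

# Hypothetical in-memory stand-in for the vowel_pts table from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vowel_pts "
    "(book_no INTEGER, chapter_no INTEGER, verse_no INTEGER, "
    "index_no INTEGER, word TEXT)"
)

verse = "בְּרֵאשִׁית בָּרָא אֱלֹהִים אֵת הַשּׁמַיִם וְאֵת הָאָֽרֶץ׃"

# One parameter tuple per word; the values never become part of the SQL text,
# so the Hebrew is passed to SQLite untouched. Book, chapter, and verse are
# assumed to be 1 here.
rows = [(1, 1, 1, i, word) for i, word in enumerate(verse.split(" "), start=1)]
conn.executemany(
    "INSERT INTO vowel_pts (book_no, chapter_no, verse_no, index_no, word) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)

stored = conn.execute(
    "SELECT index_no, word FROM vowel_pts ORDER BY index_no"
).fetchall()
```

Because the words travel as bound values, there is no quoting to get wrong and no SQL string whose on-screen rendering can mislead you.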
I did manage to come up with a way to do it in just core SQLite, assuming a standard space character separates words, that I think will work:
dbws eval {
WITH verse AS (SELECT * FROM src_original WHERE type_no = 0 AND book_no < 40 LIMIT 1),
words AS
(SELECT book_no, chapter_no, verse_no,
substr(original || ' ', 1, instr(original || ' ', ' ') - 1) AS word,
substr(original || ' ', instr(original || ' ', ' ') + 1) AS original,
1 AS index_no
FROM verse
UNION ALL
SELECT book_no, chapter_no, verse_no,
substr(original, 1, instr(original, ' ') - 1),
substr(original, instr(original, ' ') + 1),
index_no + 1
FROM words WHERE length(original) > 0)
INSERT INTO vowel_pts(book_no, chapter_no, verse_no, index_no, word)
SELECT book_no, chapter_no, verse_no, index_no, word FROM words
}

The join command does not alter the order of characters in memory. However, the rendering of mixed left-to-right and right-to-left scripts on the screen is… well, all over the place.
But since you're just doing this to move data from the database to the database, find a way to not bring the data itself into Tcl. It'll be astonishingly faster and safer too.
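One way to convince yourself that joining does not reorder anything is to inspect positions in the string instead of trusting the terminal. The sketch below uses Python rather than Tcl, on the assumption that both store text in logical (code point) order and leave the visual reordering to the display layer.

```python
verse = "בְּרֵאשִׁית בָּרָא אֱלֹהִים אֵת הַשּׁמַיִם וְאֵת הָאָֽרֶץ׃"

# Build the same "(book,chapter,verse,index,word)" items as in the question.
items = [f"(1,1,1,{i},{w})" for i, w in enumerate(verse.split(" "), start=1)]
joined = ",".join(items)

# Offset of each item inside the joined string, counted in code points.
offsets = [joined.index(item) for item in items]

# Splitting on the "),(" boundary recovers the items in their original order.
round_trip = ["(" + part.strip("()") + ")" for part in joined.split("),(")]
```

The offsets increase strictly and the round trip returns the original list, so the stored order is intact even when the terminal draws the Hebrew runs in a surprising place.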

Related

sqlite query for getting a value from a column containing multiple values

I have a simple table that looks like below
I need to find the 'ID' which has the Number 3, so I wrote a query like below
select * from IDtable where Number like '%3%'
It actually returns every ID, since I have used LIKE and the Number column contains many values starting with 3. How do I get only the ID which contains 3?
Concatenate a , at the start and at the end of the column and check if it contains ',3,' with the operator LIKE:
SELECT *
FROM IDtable
WHERE ',' || Number || ',' LIKE '%,3,%'
or with INSTR():
SELECT *
FROM IDtable
WHERE INSTR(',' || Number || ',', ',3,')
In Python, you should use a ? placeholder for the parameter "3":
n = "3"
sql = """
SELECT *
FROM IDtable
WHERE ',' || Number || ',' LIKE '%,' || ? || ',%'
"""
cursor.execute(sql, (n,))
Note that a normalized table like:
ID     Number
Ab     2
Ab     9
Ab     16
...    ......
cD     3
cD     10
cD     17
...    ......
would save you all the trouble of querying with a complicated string expression which may prove bad for performance.
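As a rough illustration of the difference, the sketch below runs the naive LIKE, the comma-wrapped LIKE, and a plain equality test against a normalized table. The sample rows and the IDnumber table name are made up for this example, since the real contents of IDtable are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Made-up sample rows; the real contents of IDtable are not shown in the question.
conn.execute("CREATE TABLE IDtable (ID TEXT, Number TEXT)")
conn.executemany(
    "INSERT INTO IDtable VALUES (?, ?)",
    [("Ab", "2,9,16,30"), ("cD", "3,10,17")],
)

# Hypothetical normalized table: one row per (ID, Number) pair.
conn.execute("CREATE TABLE IDnumber (ID TEXT, Number INTEGER)")
for id_, numbers in conn.execute("SELECT ID, Number FROM IDtable").fetchall():
    conn.executemany(
        "INSERT INTO IDnumber VALUES (?, ?)",
        [(id_, int(n)) for n in numbers.split(",")],
    )


def ids(sql, params=()):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql, params)]


# '%3%' also matches the 3 inside 30, so both IDs come back.
naive = ids("SELECT ID FROM IDtable WHERE Number LIKE '%3%' ORDER BY ID")

# Wrapping both sides in commas only matches a complete list element.
wrapped = ids(
    "SELECT ID FROM IDtable "
    "WHERE ',' || Number || ',' LIKE '%,' || ? || ',%' ORDER BY ID",
    ("3",),
)

# With one number per row, a simple (and indexable) equality is enough.
normalized = ids(
    "SELECT DISTINCT ID FROM IDnumber WHERE Number = ? ORDER BY ID", (3,)
)
```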

how to write for loop with nested if statement in xquery

I am trying the below code :-
return
fn:concat (fn:string-join ((
"somevalue.1.",
"somevalue.2.",
"some val 3",
"some val4",
$somevariable), " "),
for $i in $loopvar
if ((fn:exists($loopvar)) and (fn:count($loopvar) > 1)) then
" where ( " || $loopvar[i] || " and "
else if(fn:exists($loopvar) and (fn:count($loopvar) > 0)) then
" where " || $loopvar[i]
else() )
This code forms part of a SQL query in which I am trying to get the where conditions separated by "and". The variable $loopvar is of type string and can hold multiple values such as column1=3, column2=4, column3=5, and so on. I want to check whether this variable has any values and, if so, form the query (as per the above code). However, I am getting an error in the fn:concat part:
fn:concat( fn:string-join( ("some val 1", "someval2"......" -- arg1 is not of type xs:anyAtomicType?
Can someone please suggest where I am going wrong with the above piece of code?
You list MarkLogic as a tag. If so, I would consider using the Optic API, where you can express your exact query more elegantly.
Your code has several issues.
In your first return statement, you are returning the fn:concat(). You could have that as the end of the first FLWOR statement, but the second FLWOR is referencing the $loopvar. If that is in the scope of the first FLWOR, then you need to ensure that the second FLWOR is one of a sequence of items returned in the first. Wrap both the concat and that for loop inside parentheses to indicate that you want to return a sequence of items that are all within the scope of that first FLWOR.
The error message is telling you that the if statement following the for is unexpected, because in that FLWOR statement you started a for loop and then have an expression that you want to return, so you need to add a return statement.
https://www.w3.org/TR/xpath-full-text-10/#prod-xquery10-FLWORExpr
(ForClause | LetClause)+ WhereClause? OrderByClause? "return" ExprSingle
Lastly, it seems you are iterating over the $loopvar sequence and attempting to use $i as an index in a predicate, but forgot the $ for the $i variable i.e. $loopvar[$i]. However, unless the $i is a number, that probably won't select what you want. And if it is, it's still easier and more efficient to just use the $i variable:
(: I added some dummy variables, just to have a more complete example that would run :)
let $somevariable := "somevariable"
let $loopvar := ("a", "b", "c")
return (
  fn:concat (fn:string-join ((
    "somevalue.1.",
    "somevalue.2.",
    "some val 3",
    "some val4",
    $somevariable), " ")),
  for $i in $loopvar
  return
    if ((fn:exists($loopvar)) and (fn:count($loopvar) > 1))
    then " where ( " || $i || " and "
    else
      if(fn:exists($loopvar) and (fn:count($loopvar) > 0))
      then " where " || $i
      else()
)
If you did want the position of $i in your for loop, you can bind a positional variable with the at keyword: for $i at $j in $loopvar
https://www.w3.org/TR/xpath-full-text-10/#doc-xquery10-ForClause

PLSQL SUBSTR function ignore the trailing zero

select TO_NUMBER (SUBSTR(10.31, INSTR (10.31, '.') + 1)) from dual
Above query returns 31 as the output. But below query returns 3 as the output.
select TO_NUMBER (SUBSTR(10.30, INSTR (10.30, '.') + 1)) from dual
How could I get the 30 as the output instead of the 3?
As it seems (from comments) that you are starting with a numeric value that you want to turn into words, you should begin by splitting it into dollars and cents.
If you really need to use substr etc, then you could start with a known format, such as to_char(amount,'fm9990.00'), so it will be a string with exactly two decimal places. However, if you have the numeric value it would be easier to convert it into the desired units using arithmetic functions. Whole dollars are trunc(amount) and cents are 100 * mod(amount,1).
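To make the arithmetic concrete, here is a small worked sketch in Python rather than PL/SQL. The split_amount helper is a hypothetical name; Decimal stands in for Oracle's exact NUMBER type, and the format call plays the role of to_char(amount,'fm9990.00').

```python
from decimal import Decimal


def split_amount(amount):
    """Split a non-negative Decimal amount into whole dollars and cents."""
    dollars = int(amount)                  # truncates, like trunc(amount)
    cents = int((amount - dollars) * 100)  # like 100 * mod(amount, 1)
    return dollars, cents


ten_thirty = split_amount(Decimal("10.30"))
quarter = split_amount(Decimal(".25"))
twenty_five = split_amount(Decimal("25"))

# Formatting to a fixed two decimal places keeps the trailing zero,
# so the string approach also sees "30" rather than "3".
formatted = f"{Decimal('10.3'):.2f}"
cents_from_string = formatted.split(".")[1]
```

Either route gives 30 cents for 10.30; the arithmetic one avoids string handling altogether.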
Another issue is that the 'Jsp' date format approach can't handle zeroes. If you are using Oracle 12.2 or later there is a workaround using the default on conversion error clause:
create table demo
( amount number(6,2) );
insert into demo values (10.3);
insert into demo values (.25);
insert into demo values (25);
select amount
, nvl(to_char(to_date(trunc(amount) default null on conversion error,'J'),'Jsp'),'Zero') as dollars
, nvl(to_char(to_date(100 * mod(amount,1) default null on conversion error,'J'),'Jsp'),'Zero') as cents
from demo;
  AMOUNT DOLLARS      CENTS
-------- ------------ -------------
   10.30 Ten          Thirty
   25.00 Twenty-Five  Zero
    0.25 Zero         Twenty-Five
In 12.1 you could get around it using an inline function (maybe not a bad idea even in later versions, to simplify the rest of the query):
with
function to_words(num number) return varchar2 as
begin
return
case num
when 0 then 'Zero'
else to_char(to_date(num,'J'),'Jsp')
end;
end;
select amount
, to_words(trunc(amount)) as dollars
, to_words(100 * mod(amount,1)) as cents
from demo;
For values greater than 5373484 (the Julian representation of date '9999-12-31'), you can use this from Ask Tom: Spell the number (converted here to a WITH clause, but you can create it as a standalone function):
with function spell_number
( p_number in number )
return varchar2
as
-- Tom Kyte, 2001:
-- https://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:1407603857650
l_num varchar2(50) := trunc(p_number);
l_return varchar2(4000);
type myarray is table of varchar2(15);
l_str myarray :=
myarray
( ''
, ' thousand '
, ' million '
, ' billion '
, ' trillion '
, ' quadrillion '
, ' quintillion '
, ' sextillion '
, ' septillion '
, ' octillion '
, ' nonillion '
, ' decillion '
, ' undecillion '
, ' duodecillion ');
begin
for i in 1 .. l_str.count loop
exit when l_num is null;
if substr(l_num, length(l_num) -2, 3) <> 0 then
l_return := to_char(to_date(substr(l_num, length(l_num) - 2, 3), 'J'), 'Jsp') || l_str(i) || l_return;
end if;
l_num := substr(l_num, 1, length(l_num) - 3);
end loop;
return l_return;
end spell_number;
select amount
, spell_number(trunc(amount)) as dollars
, spell_number(100 * mod(amount,1)) as cents
from demo
/
I am actually surprised that your current query is even running without error, given that Oracle's SUBSTR function is supposed to operate on strings, not numbers. Oracle implicitly converts the numeric literal 10.30 to the string '10.3' first, which is why the trailing zero is lost. That being said, if you properly use your current query with strings, then it works:
SELECT TO_NUMBER(SUBSTR('10.30', INSTR ('10.30', '.') + 1)) FROM dual; -- returns 30
A more compact (though not necessarily more performant) way of doing this might be to use REGEXP_SUBSTR:
SELECT REGEXP_SUBSTR('10.30', '[0-9]+$') FROM dual;
This retains only the digits after the decimal point when a decimal point is present. For inputs with no decimal component, it simply returns all the digits.
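The behaviour of that pattern is easy to try out away from the database. The sketch below uses Python's re module as a stand-in for REGEXP_SUBSTR (an assumption: the two engines agree on a pattern this simple), and trailing_digits is a hypothetical helper name.

```python
import re


def trailing_digits(text):
    """Return the run of digits at the end of text, or None if there is none."""
    match = re.search(r"[0-9]+$", text)
    return match.group(0) if match else None


with_zero = trailing_digits("10.30")    # string input keeps the trailing zero
no_decimal = trailing_digits("42")      # no decimal point: all digits returned
nothing = trailing_digits("10.")        # nothing after the point

# Converting the number to a string first loses the zero, which mirrors
# what the implicit conversion does to the original query.
from_number = trailing_digits(str(10.30))
```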

Replacing variable values in defined string through awk / xargs

We are dynamically generating a string in bash to insert data in oracle database. The string is like
> echo $str1
insert into tbl select '$jobid','$1','$2','$3','$sdate' from dual ;
Here the variables $1,$2 ... are dynamic and can go up to 10
Now we have data in a file with same number of ':' separated datacolumns as there are numeric variables ( $1,$2.. ) in above string.
Challenge here is to have $1 replaced with 1st column of data, $2 with 2nd column and so on. This needs to be done for all rows of dataset and a separate file needs to be generated with "insert" string as base and with replaced data from the file.
For example, the sample data
cat test.dat
ONLINE:odr1_redo_06a.log:NO
ONLINE:odr1_redo_06b.log:NO
ONLINE:odr1_redo_05a.log:NO
and the string is
echo $str1
insert into tbl select '$jobid','$1','$2','$3','$sdate' from dual ;
Required output should be
insert into tbl select '$jobid','ONLINE','odr1_redo_06a.log','NO','$sdate' from dual ;
insert into tbl select '$jobid','ONLINE','odr1_redo_06b.log','NO','$sdate' from dual ;
insert into tbl select '$jobid','ONLINE','odr1_redo_05a.log','NO','$sdate' from dual ;
Tried using string as external variable in awk. No luck
cat test.dat | awk -F: -v var="$str1" '{print var}'
insert into tbl select '$jobid','$1','$2','$3','$sdate' from dual ;
insert into tbl select '$jobid','$1','$2','$3','$sdate' from dual ;
insert into tbl select '$jobid','$1','$2','$3','$sdate' from dual ;
or xargs
sed 's/:/ /g' test.dat | xargs -n3 bash -c "echo $str1"
insert into tbl select $jobid,$1,$2,$3,$sdate from dual
insert into tbl select $jobid,$1,$2,$3,$sdate from dual
insert into tbl select $jobid,$1,$2,$3,$sdate from dual
Writing a small loop and calling it line by line adds overhead, so we would prefer not to do that. Any ideas how this can be done in an optimal fashion?
With Awk, for each record, replace every literal $n with the value of nth field in your template by means of gsub function and print the result.
awk -F: -v tmpl="$str1" '{
  out = tmpl
  # Substitute from the highest field number down, so that $1 never eats the start of $10
  for (i=NF; i>=1; i--)
    gsub(("\\$" i), $i, out)
  print out
}' file
Proof of concept:
$ str1="insert into tbl select '\$jobid','\$1','\$2','\$3','\$sdate' from dual ;"
$
$ awk -F: -v tmpl="$str1" '{
>   out = tmpl
>   for (i=NF; i>=1; i--)
>     gsub(("\\$" i), $i, out)
>   print out
> }' file
insert into tbl select '$jobid','ONLINE','odr1_redo_06a.log','NO','$sdate' from dual ;
insert into tbl select '$jobid','ONLINE','odr1_redo_06b.log','NO','$sdate' from dual ;
insert into tbl select '$jobid','ONLINE','odr1_redo_05a.log','NO','$sdate' from dual ;
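The same substitution can be sketched in Python, in case awk is not a hard requirement. This is only an illustration under the assumptions of the question (colon-separated columns, placeholders written as $1, $2, ...); fill_template is a hypothetical helper name.

```python
import re


def fill_template(template, line, sep=":"):
    """Replace $1, $2, ... in template with the matching column of line."""
    fields = line.rstrip("\n").split(sep)

    def replace(match):
        n = int(match.group(1))
        # Leave the placeholder untouched when the line has no such column.
        return fields[n - 1] if 1 <= n <= len(fields) else match.group(0)

    # \$(\d+) reads $10 as the number 10, and does not match $jobid or $sdate.
    return re.sub(r"\$(\d+)", replace, template)


template = "insert into tbl select '$jobid','$1','$2','$3','$sdate' from dual ;"
result = fill_template(template, "ONLINE:odr1_redo_06a.log:NO")
```

Using a replacement function rather than a replacement string means the data is inserted literally, with no special handling of characters such as backslashes.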

Get several id from pipe-separated string to concat data from another table

I've got two tables: one holds several ids as a pipe-separated string and another holds names for each id. I want to concat the names into a one-line string with \n between the names.
Tables:
Id-table
| StringIds |
'1|2|3|4|5|4|1'
Name-table
| StringId | String Name |
1 'One'
2 'Two'
3 'Three'
4 'Four'
5 'Five'
I've tried with following code without any success:
SELECT GROUP_CONCAT(StringName || '\n')
FROM Names
WHERE
StringId
IN
(
SELECT DISTINCT
GROUP_CONCAT(REPLACE(StringIds,'|',','))
FROM Ids
)
ORDER BY StringName ASC
Expected output: 'One'\n'Two'\n'Three'\n'Four'\n'Five'\n
Fiddle
The problem is that the subquery you have used
SELECT DISTINCT
group_concat(replace(StringIds,'|',','))
FROM Ids
actually returns a string '1,2,3,...' not a number list 1,2,3,... as expected.
The WHERE StringId IN ((SELECT...)) will not work with strings; it expects a list of elements, and the string is ONE element.
So instead you will have to look at the string functions, and there you can use the INSTR(X,Y) function to find the StringId.
But here we must pay attention, because if, e.g., we were searching for
the number 3 then we would find it in:
1,2,3,4
but it would also find it in
1,2,30,4
So the trick is to wrap the separator around the query string
and the string to search in. So if we would search for ',3,' in ',1,2,3,4,'
we would have a match, as expected, and if we search in ',1,2,30,4,', then we will not match, which is also as expected. So this is the reason we have these strange concats in our query :)
SELECT group_concat(StringName || '\n') as AllNames
FROM Names
WHERE INSTR(
(',' || (
SELECT DISTINCT
group_concat(replace(StringIds,'|',','))
FROM Ids
) || ','),
(',' || StringId || ',')
) > 0
ORDER BY StringName ASC;
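If you want to see the separator trick in isolation, the following sketch calls SQLite's instr() from Python with and without the wrapping. The helper names are hypothetical and the id lists are the two examples used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def naive_match(id_list, wanted):
    """instr() on the bare strings: also finds 3 inside 30."""
    row = conn.execute("SELECT instr(?, ?) > 0", (id_list, wanted)).fetchone()
    return bool(row[0])


def wrapped_match(id_list, wanted, sep=","):
    """instr() with the separator wrapped around both strings."""
    row = conn.execute(
        "SELECT instr(? || ? || ?, ? || ? || ?) > 0",
        (sep, id_list, sep, sep, wanted, sep),
    ).fetchone()
    return bool(row[0])


naive_hit = naive_match("1,2,30,4", "3")        # false positive
wrapped_hit = wrapped_match("1,2,3,4", "3")     # genuine match
wrapped_miss = wrapped_match("1,2,30,4", "3")   # 30 is no longer mistaken for 3
pipe_hit = wrapped_match("1|2|3|4|5|4|1", "5", sep="|")
```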
Well now, if we think about it, and since we are searching in a string,
we might as well use your original string instead
of converting it in advance:
SELECT group_concat(StringName || '\n') as AllNames
FROM Names
WHERE INSTR(
('|' || (
SELECT StringIds FROM Ids LIMIT 1
) || '|'),
('|' || StringId || '|')
) > 0
ORDER BY StringName ASC;
And actually there are many more ways we could do this. Let me give you one last version using LIKE comparison instead of INSTR function:
SELECT group_concat(StringName || '\n') as AllNames
FROM Names
WHERE
('|' || (
SELECT StringIds FROM Ids LIMIT 1
) || '|')
LIKE
('%|' || StringId || '|%')
ORDER BY StringName ASC;
Hope this link works, so you can Fiddle around here
UPDATE
If you end up having more entries in your Ids table and you want to print out the unique names for each entry in the Ids table, then you have to turn around the query:
SELECT
( SELECT group_concat(StringName || '\n')
FROM Names
WHERE
('|' || (
StringIds
) || '|')
LIKE
('%|' || StringId || '|%')
ORDER BY StringName ASC
) as AllNames FROM Ids
Now here Ids is the outer table looped through and for each entry the sub query is performed, which returns the AllNames value.
