Extract data from file - unix

I've got a file.sql file from an Informix database export.
The following is a small part of the file (I changed the data a little bit to make it anonymous):
grant dba to "xxx";
grant dba to "yyy";
grant dba to "zzz";
{ TABLE "xxx".table1 row size = 66 number of columns = 5 index size = 54 }
{ unload file name = table00100.unl number of rows = 307 }
create table "xxx".table1
(
field1 char(2) not null ,
field2 char(10) not null ,
field3 char(30),
field4 char(20) not null ,
field5 date not null
);
revoke all on "xxx".table1 from "yyy";
What I need from this file is to rename the table00100.unl file back to the original table name. So I need an output like this:
mv table00100.unl table1
I've managed to do this with two intermediate files and a little awk and sed, but isn't this possible in an easier way, without the temporary files in between? My code sample:
awk '{for(i=1;i<=NF;i++) {if ($i=="unload") {print $(i+4)} else {if ($i=="TABLE") print $(i+1)}}}' file.sql | sed 's/".*".//' > temp.out
awk 'NR%2{printf "%s ",$0;next;}1' temp.out | awk '{for (i=NF;i>0;i--) if (i > 1) printf("mv %s ",$i); else printf("%s\n",$i)}' > temp.shl

If you want to use awk solely:
/TABLE/ {                     # the '{ TABLE "xxx".table1 ... }' header line
    sub("\".+\"\\.", "", $3); # strip the quoted owner prefix, e.g. "xxx".
    table = $3;               # remember the bare table name
}
/unload/ {                    # the '{ unload file name = ... }' line that follows it
    print "mv", $6, table;    # $6 is the .unl file name
}
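For reference, saved in a file (say rename.awk, the name is hypothetical), it can be run straight against the export:
> awk -f rename.awk file.sql
mv table00100.unl table1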

Similar to georgexsh's solution, but with gensub (GNU awk only):
awk '/TABLE/{table=gensub(/.*\./, "", "", $3)}/unload/{print "mv", $6, table }' file.sql
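Since the output is itself shell commands, either version can be piped straight to the shell to perform the renames in one go, e.g.:
awk '/TABLE/{table=gensub(/.*\./, "", "", $3)}/unload/{print "mv", $6, table }' file.sql | sh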

awk '($0 ~ /{ *TABLE/) { match($0,/TABLE */); to=substr($0,RSTART+RLENGTH);  # text after "TABLE"
match(to," "); to=substr(to,1,RSTART-1);                                     # keep up to the next space
match(to,"[.]"); to=substr(to,RSTART+1);                                     # drop the quoted owner and the dot
}
($0 ~ /{ *unload/) { match($0,"name *= *"); from=substr($0,RSTART+RLENGTH);  # text after "name = "
match(from," "); from=substr(from,1,RSTART-1)                                # keep up to the next space
}
(from!="") && (to!="") { exit }                                              # stop after the first pair
END {print "mv "from" "to}' file
The reason I make it so "complicated" is that I am not sure whether the spacing in your input will always be consistent, or whether the ordering inside the braces will always be the same. (Note that this version exits after the first TABLE/unload pair; to handle a whole export, drop the exit rule and move the print from END into the unload block.)

Using a Perl one-liner (of course it is long :-)):
> cat informix_unload.txt
grant dba to "xxx";
grant dba to "yyy";
grant dba to "zzz";
{ TABLE "xxx".table1 row size = 66 number of columns = 5 index size = 54 }
{ unload file name = table00100.unl number of rows = 307 }
create table "xxx".table1
(
field1 char(2) not null ,
field2 char(10) not null ,
field3 char(30),
field4 char(20) not null ,
field5 date not null
);
revoke all on "xxx".table1 from "yyy";
grant dba to "xxx";
grant dba to "yyy";
grant dba to "zzz";
{ TABLE "xxx".table2 row size = 66 number of columns = 5 index size = 54 }
{ unload file name = table00200.unl number of rows = 307 }
create table "xxx".table2
(
field1 char(2) not null ,
field2 char(10) not null ,
field3 char(30),
field4 char(20) not null ,
field5 date not null
);
revoke all on "xxx".table1 from "yyy";
-- other data
> perl -ne 'BEGIN{$x=qx(cat informix_unload.txt);while($x=~m/(.+?)unload file name = (\S+)(.+?)create table (\S+)(.+)/osm){$x=$5;print "$2 $4\n";}exit}'
table00100.unl "xxx".table1
table00200.unl "xxx".table2
>
Check this out.
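To turn those pairs into actual rename commands, a possible post-processing step (assuming the owner prefix is always a quoted name, as in the sample) is to strip it with sed and prefix mv:
> perl -ne 'BEGIN{$x=qx(cat informix_unload.txt);while($x=~m/(.+?)unload file name = (\S+)(.+?)create table (\S+)(.+)/osm){$x=$5;print "$2 $4\n";}exit}' | sed 's/"[^"]*"\.//; s/^/mv /'
mv table00100.unl table1
mv table00200.unl table2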

Related

Joining lists containing Hebrew in Tcl?

I'm using Tcl 8.6.11, SQLite 3.35.5., and Manjaro Linux KDE.
I'm trying to take a verse in Hebrew and write it one word per row in a data table. This is one verse, for example:
בְּרֵאשִׁית בָּרָא אֱלֹהִים אֵת הַשּׁמַיִם וְאֵת הָאָֽרֶץ׃
The goal was to write the data to a list and then join the list as the values part of a SQL insert statement. As you can see, each [lindex $l n] prints as expected, but in the [join $l ,] result, from the second element on, the Hebrew appears in the first position instead of the last, causing the SQL statement to fail.
How can I get each component of the [join $l ,] to be ordered as they are in the [lindex $l n]?
Thank you.
set l {}
set sql { select * from src_original where type_no=0 and book_no < 40 limit 1}
dbws eval $sql origLang {
set i 0
foreach { x } $origLang(original) { lappend l "($origLang(book_no),$origLang(chapter_no),$origLang(verse_no),[incr i],$x)" }
}
puts [lindex $l 0]; # (1,1,1,1,בְּרֵאשִׁית)
puts [lindex $l 1]; # (1,1,1,2,בָּרָא)
puts [lindex $l 2]; # (1,1,1,3,אֱלֹהִים)
puts [lindex $l 3]; # (1,1,1,4,אֵת)
puts [lindex $l 4]; # (1,1,1,5,הַשּׁמַיִם)
puts [lindex $l 5]; # (1,1,1,6,וְאֵת)
puts [lindex $l 6]; # (1,1,1,7,הָאָֽרֶץ׃)
set v [join $l ,]; # (1,1,1,1,בְּרֵאשִׁית),(1,1,1,2,בָּרָא),(1,1,1,3,אֱלֹהִים),(1,1,1,4,אֵת),(1,1,1,5,הַשּׁמַיִם),(1,1,1,6,וְאֵת),(1,1,1,7,הָאָֽרֶץ׃)
set r "insert into vowel_pts (book_no, chapter_no, verse_no, index_no, word) values $v"
dbws eval $r
Thank you for the examples and suggestions. I'd still like to understand whether or not join resulted in an out-of-order SQL statement; but after looking at the SQL provided by @Shawn, I tried the SQLite JSON extension, and the following also works. If the limitations in the WHERE clause of the arr_vp table are removed, so that every word of every verse in the thirty-nine books of the Old Testament is written as an individual row, it completes in a few seconds on my ten-year-old average laptop, as @DonalFellows suggested. Thanks again.
with
arr_vp as (
select book_no, chapter_no, verse_no,
'["' || replace(original,' ', '","' ) || '"]' as t
from src_original
where book_no=1
and chapter_no=1
and verse_no < 3
and type_no=0
)
select a.book_no, a.chapter_no, a.verse_no,
(key+1) as index_no,
j.value as vowel_pts
from arr_vp a,
json_each( ( select t
from arr_vp r
where r.book_no=a.book_no
and r.chapter_no=a.chapter_no
and r.verse_no=a.verse_no ) ) as j
where j.type = 'text';
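For anyone puzzled by the '["' || replace(original, ' ', '","') || '"]' step above: it just rewrites a space-separated verse as a JSON array so that json_each can explode it into one row per word, with key as the zero-based position. A minimal illustration in the sqlite3 shell (assuming a build with the JSON functions, which the query above already relies on):
sqlite3 :memory: "select key+1 as index_no, value as word from json_each(json_array('a','b','c'));"
1|a
2|b
3|c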
As always with SQL, use parameters in a prepared statement instead of trying to add values directly into a query string at runtime. Something like:
# Populate an array of dicts
set l {}
set sql {select * from src_original where type_no=0 and book_no < 40 limit 1}
dbws eval $sql origLang {
set i 0
foreach x $origLang(original) {
lappend l [dict create book_no $origLang(book_no) \
chapter_no $origLang(chapter_no) \
verse_no $origLang(verse_no) \
index_no [incr i] \
word $x]
}
}
# And insert them one at a time.
foreach w $l {
dict with w {
dbws eval {
INSERT INTO vowel_pts(book_no, chapter_no, verse_no, index_no, word)
VALUES ($book_no, $chapter_no, $verse_no, $index_no, $word)
}
}
}
See the documentation for more about embedding (unevaluated) variable names in a SQL statement and binding values to them.
I did manage to come up with a way to do it in just core SQLite3 (assuming a standard space character separates words) that I think will work:
dbws eval {
WITH verse AS (SELECT * FROM src_original WHERE type_no = 0 AND book_no < 40 LIMIT 1),
-- appending ' ' guarantees instr() always finds a delimiter, even for the last word
words AS
(SELECT book_no, chapter_no, verse_no,
substr(original || ' ', 1, instr(original || ' ', ' ') - 1) AS word,
substr(original || ' ', instr(original || ' ', ' ') + 1) AS original,
1 AS index_no
FROM verse
UNION ALL
-- peel off one word per recursive step until nothing is left
SELECT book_no, chapter_no, verse_no,
substr(original, 1, instr(original, ' ') - 1),
substr(original, instr(original, ' ') + 1),
index_no + 1
FROM words WHERE length(original) > 0)
INSERT INTO vowel_pts(book_no, chapter_no, verse_no, index_no, word)
SELECT book_no, chapter_no, verse_no, index_no, word FROM words
}
The join command does not alter the order of characters in memory. However, the rendering of mixed left-to-right and right-to-left scripts on the screen is… well, all over the place.
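A quick way to convince yourself of that, without fighting the bidi rendering, is to compare lengths instead of eyeballing the joined string; a minimal check, reusing the $l and $v from the question:
# the joined length must equal the element lengths plus one comma per gap
set total 0
foreach e $l { incr total [string length $e] }
puts [expr {[string length $v] == $total + [llength $l] - 1}]  ;# prints 1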
But since you're just doing this to move data from the database to the database, find a way to not bring the data itself into Tcl. It'll be astonishingly faster and safer too.

Replacing variable values in defined string through awk / xargs

We are dynamically generating a string in bash to insert data in oracle database. The string is like
> echo $str1
insert into tbl select '$jobid','$1','$2','$3','$sdate' from dual ;
Here the variables $1, $2, ... are dynamic and can go up to $10.
Now we have data in a file with the same number of ':'-separated data columns as there are numeric variables ($1, $2, ...) in the above string.
The challenge is to replace $1 with the first column of the data, $2 with the second, and so on. This needs to be done for every row of the dataset, generating a separate file with the "insert" string as the base and the data from the file filled in.
For example, given the sample data
cat test.dat
ONLINE:odr1_redo_06a.log:NO
ONLINE:odr1_redo_06b.log:NO
ONLINE:odr1_redo_05a.log:NO
and the string is
echo $str1
insert into tbl select '$jobid','$1','$2','$3','$sdate' from dual ;
Required output should be
insert into tbl select '$jobid','ONLINE','odr1_redo_06a.log','NO','$sdate' from dual ;
insert into tbl select '$jobid','ONLINE','odr1_redo_06b.log','NO','$sdate' from dual ;
insert into tbl select '$jobid','ONLINE','odr1_redo_05a.log','NO','$sdate' from dual ;
I tried passing the string to awk as an external variable. No luck:
cat test.dat | awk -F: -v var="$str1" '{print var}'
insert into tbl select '$jobid','$1','$2','$3','$sdate' from dual ;
insert into tbl select '$jobid','$1','$2','$3','$sdate' from dual ;
insert into tbl select '$jobid','$1','$2','$3','$sdate' from dual ;
or xargs
sed 's/:/ /g' test.dat | xargs -n3 bash -c "echo $str1"
insert into tbl select $jobid,$1,$2,$3,$sdate from dual
insert into tbl select $jobid,$1,$2,$3,$sdate from dual
insert into tbl select $jobid,$1,$2,$3,$sdate from dual
Writing a small loop that calls the database line by line carries overhead, so I'd prefer to avoid that. Any ideas how this can be done in an optimal fashion?
With awk, for each record, replace every literal $n in your template with the value of the nth field by means of the gsub function, and print the result.
awk -F: -v tmpl="$str1" '{
out = tmpl
for (i=1; i<=NF; i++)
gsub(("\\$" i), $i, out)
print out
}' file
Proof of concept:
$ str1="insert into tbl select '\$jobid','\$1','\$2','\$3','\$sdate' from dual ;"
$
$ awk -F: -v tmpl="$str1" '{
> out = tmpl
> for (i=1; i<=NF; i++)
> gsub(("\\$" i), $i, out)
> print out
> }' file
insert into tbl select '$jobid','ONLINE','odr1_redo_06a.log','NO','$sdate' from dual ;
insert into tbl select '$jobid','ONLINE','odr1_redo_06b.log','NO','$sdate' from dual ;
insert into tbl select '$jobid','ONLINE','odr1_redo_05a.log','NO','$sdate' from dual ;
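One caveat, since the question says the variables can go up to $10: gsub on "$1" also matches the "$1" inside "$10". Iterating from the highest field down sidesteps that; a minimal variation (note also that a literal & in the data would need escaping, since gsub treats it specially):
awk -F: -v tmpl="$str1" '{
  out = tmpl
  for (i=NF; i>=1; i--)   # replace $10 before $1 so "$1" never matches inside "$10"
    gsub(("\\$" i), $i, out)
  print out
}' file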

How can I get a substring in SQLite?

There is a column.
And it has values like
'/abc/def/ghi/w1.xyz'
'/jkl/mno/r.stuv'
(it's path data and the number of '/'s in each value is not fixed.)
How can I get a substring column which has values like
'/abc/def/ghi/'
'/jkl/mno/'
(extracting only the directory part. removing the file part.)
I read about substr(X,Y), substr(X,Y,Z), and instr(X,Y),
but it's not easy to apply them, because the number of '/'s in each value is not fixed and instr(X,Y) only finds the first occurrence from the left.
With a recursive CTE:
create table tablename(col text);
insert into tablename(col) values
('/abc/def/ghi/w1.xyz'),
('/jkl/mno/r.stuv');
with recursive cte(col, pos, rest) as (
select col, instr(col, '/') pos, substr(col, instr(col, '/') + 1) rest
from tablename
union all
select col, instr(rest, '/'), substr(rest, instr(rest, '/') + 1)
from cte
where instr(rest, '/') > 0
)
select col, substr(col, 1, sum(pos)) path
from cte
group by col
Each recursive step records the offset of the next '/' relative to the remaining substring, so summing pos per col gives the position of the last '/' in the original string.
Results:
| col | path |
| ------------------- | ------------- |
| /abc/def/ghi/w1.xyz | /abc/def/ghi/ |
| /jkl/mno/r.stuv | /jkl/mno/ |
If you are in a Unix/Linux environment, you could do something like this (let's take an example).
Let's say test.db was your SQLite3 database with a table like so:
create table test (dataset text);
insert into test (dataset) values ('/abc/def/ghi/w1.xyz');
insert into test (dataset) values ('/jkl/mno/r.stuv');
From command line, you can run:
sqlite3 test.db -batch -noheader "select dataset from test"
That'll give you this as the output (I know that's not the output you want, but just read on):
/abc/def/ghi/w1.xyz
/jkl/mno/r.stuv
Explanation
the -batch and -noheader switches suppress all output except for the data resulting from the SQL statement
sqlite3 test.db -batch -noheader "sql statement" runs the SQL statement you provided and the output is dumped on the screen (stdout)
Solution
Now, we'll use awk with it to get what you want like so:
sqlite3 test.db -batch -noheader "select dataset from test" | awk -F'/' 'BEGIN{OFS="/"} {$NF = ""; print $0}'
Result will be:
/abc/def/ghi/
/jkl/mno/
Crude explanation
awk works on every line output by your sqlite3 command
the line is split on the / character with the -F'/' switch
since we want the output to keep the delimiters, the output field separator is set to / as well, using OFS="/"
the last field is blanked out ($NF = "") so the filename part is dropped, and the rest of the line is printed
as the remaining fields are printed, OFS re-inserts / between them (including one before the now-empty last field, which restores the trailing /)
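As an aside, the trimming can also be done inside SQLite itself, with no awk needed: rtrim(X, Y) strips from the right of X any characters that occur anywhere in Y, so trimming away the set of all non-slash characters stops exactly at the last /. A sketch against the same test.db (note a value containing no / at all would trim to an empty string):
sqlite3 test.db -batch -noheader "select rtrim(dataset, replace(dataset, '/', '')) from test"
/abc/def/ghi/
/jkl/mno/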

Use WHERE ... IN with multiple columns selection

I want to update the data column in tbl1 when both key1 and key2 are listed in the results of another query:
CREATE TABLE tbl1(
"key1" INT,
"key2" INT,
"data" VARCHAR(20)
);
UPDATE tbl1 set data="test 123"
WHERE
(key1, key2)
IN
(SELECT key1, key2 from tbl2 where user='123')
The SELECT key1, key2 from tbl2 where user='123' alone returns something like:
|key1|key2|
|----|----|
| 2  | 5  |
| 9  | 4  |
| 1  | 12 |
So the UPDATE should affect only the rows of tbl1 whose (key1, key2) pair appears in the SELECT results.
What would be the proper way to achieve this?
You could try putting the keys in separate conditions and using AND.
CREATE TABLE tbl1(
"key1" INT,
"key2" INT,
"data" VARCHAR(20)
);
UPDATE tbl1 set data="test 123"
WHERE
(key1 IN (SELECT key1 from tbl2 where user='123'))
AND (key2 IN (SELECT key2 FROM tbl2 WHERE user='123'));
This would mean running the SELECT query twice (unless the optimizer is clever); I don't think there is a way to avoid that, but I would be glad to be proved wrong. Note also that this checks key1 and key2 independently, so it can match a key1 from one tbl2 row combined with a key2 from another, a pair that never actually appears together.
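For what it's worth, if the engine supports row values (SQLite 3.15+, MySQL, PostgreSQL), the (key1, key2) IN (...) form from the question already matches pairs correctly. Where it isn't available, a correlated EXISTS is a pair-accurate alternative that compares each tbl2 row as a whole; a sketch:
UPDATE tbl1 SET data = 'test 123'
WHERE EXISTS (SELECT 1 FROM tbl2
              WHERE tbl2.key1 = tbl1.key1
                AND tbl2.key2 = tbl1.key2
                AND tbl2.user = '123');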
Here is a solution. It is not a beautiful one, but it works; since key1 and key2 are INTs, the ',' separator cannot collide with the data (text keys would need a collision-proof separator).
UPDATE tbl1 set data = "test 123"
where
(
(key1 || "," || key2)
IN
(select (key1 || "," || key2) from tbl2 where user='123' )
)

How to use set rowcount in select query

I have a select query statement which will return 600k rows. When I blindly extract the result using a plain select, it impacts db performance. Is there an option to use set rowcount to fetch the data in batches? I tried the code below, but it keeps returning the top 50000 rows and ends up in an infinite loop.
#!/bin/ksh -x
trap "" 1
updrowcount=50000
while [ $updrowcount -eq 50000 ]
do
QUERY="set rowcount 50000
select subject into tempdb..extract from tablename where fldr_id=8"
runisql <<EOF > db_rest
$QUERY
go
quit
EOF
updrowcount=`grep "rows affected" db_rest |cut -c2- | cut -f1 -d ' '`
done
exit
If your subject column is unique, you could try something like this:
set rowcount 50000
declare @subject varchar(...)
select @subject = max( subject ) from tempdb..extract
insert into tempdb..extract ( subject )
select subject
from tablename
where fldr_id=8
and (subject > @subject OR @subject is null)
order by subject
Otherwise, use a unique column, e.g.
set rowcount 50000
declare @maxId int
select @maxId = max( id ) from tempdb..extract
insert into tempdb..extract (id, subject )
select id, subject
from tablename
where fldr_id=8
and (id > @maxId OR @maxId is null)
order by id
(Obviously, you'll want to use an indexed column.)
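Wired back into the question's ksh loop, each pass then appends the next 50000-row batch until a short batch signals the end; a sketch (runisql, db_rest and the table names are taken from the question, and tempdb..extract is assumed to exist already with matching columns):
#!/bin/ksh
updrowcount=50000
while [ "$updrowcount" -eq 50000 ]
do
runisql <<EOF > db_rest
set rowcount 50000
declare @maxId int
select @maxId = max(id) from tempdb..extract
insert into tempdb..extract (id, subject)
select id, subject from tablename
where fldr_id = 8 and (id > @maxId or @maxId is null)
order by id
go
quit
EOF
updrowcount=`grep "rows affected" db_rest | cut -c2- | cut -f1 -d ' '`
done
(set rowcount 0 switches the limit back off if the session is reused afterwards.)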
