Fastest way to dump all blobs in PL/SQL

I'm trying to dump all blobs in a large table to the file system. The table looks like this:
Name                   Null?    Type
---------------------- -------- -------------
ID                     NOT NULL NUMBER(19)
FILENAME               NOT NULL VARCHAR2(256)
OFFSET_BLOB                     BLOB
I need to dump all OFFSET_BLOBs to the file system in files named [filename].offsets.
My current approach is a stored procedure that iterates through all of the rows and uses UTL_FILE.put_raw() to write the data to a file; a sketch of this approach is shown below. It works fine, but the table has over 250 million rows and the current estimate is that it will take 5 days to complete. I tried adding parallel hints on the query with /*+ FULL PARALLEL(10) */, but it doesn't improve anything :(.
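For reference, a minimal sketch of that row-by-row approach (the table name MY_TABLE and directory object DUMP_DIR are placeholders, and the chunk size is arbitrary):

DECLARE
  l_file   UTL_FILE.file_type;
  l_buffer RAW(32767);
  l_amount PLS_INTEGER;
  l_pos    INTEGER;
  l_len    INTEGER;
BEGIN
  FOR r IN (SELECT filename, offset_blob FROM my_table WHERE offset_blob IS NOT NULL) LOOP
    l_file := UTL_FILE.fopen('DUMP_DIR', r.filename || '.offsets', 'wb', 32767);
    l_pos  := 1;
    l_len  := DBMS_LOB.getlength(r.offset_blob);
    WHILE l_pos <= l_len LOOP
      l_amount := 32767;                           -- read up to 32 KB per chunk
      DBMS_LOB.read(r.offset_blob, l_amount, l_pos, l_buffer);
      UTL_FILE.put_raw(l_file, l_buffer, TRUE);    -- write the chunk and flush
      l_pos := l_pos + l_amount;
    END LOOP;
    UTL_FILE.fclose(l_file);
  END LOOP;
END;
/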
Do any of you have a better approach that would drastically reduce the extraction time to hours instead of days? Thank you!

Related

How do I make multiple insertions into a SQLite DB from UIPath

I have an Excel spreadsheet with multiple entries that I want to insert into an SQLite DB from UIPath. How do I do this?
You could do it one of two ways. For both methods, you will need to use the Excel Read Range activity to read the spreadsheet into a DataTable.
Scenario 1: You could read the table in a For Each Row loop, line by line, converting each row to a SQL statement and running it with an Execute Non-Query activity (a sketch of such a statement appears at the end of this answer). This takes too long, and if you like O notation, this is an O(n) solution.
Scenario 2: You could upload the entire table (as long as it is compatible with the DB table) to the database in one go:
You will need the Database > Insert activity.
You will need to provide the DB connection (which I explain how to create in another post).
Then enter the name of the SQLite database table you want to insert into, in quotes.
Then enter, in the last field, the DataTable you created or pulled from another source.
The output will be an integer (the number of affected records).
In O notation, this is an O(1) solution, at least from our coding perspective.
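For illustration, the kind of statement Scenario 1 would run once per row might look like the following (the table and column names are made up for the example; SQLite accepts named parameters such as @Name):

INSERT INTO Customers (Name, Email, Age) VALUES (@Name, @Email, @Age);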

Ignore empty datasets

I am writing a U-SQL script that sometimes ends up with an empty data set.
Today the outputter writes an empty file when that happens. I would like the outputter to write nothing in that case, since otherwise I will flood the ADLS with empty files...
I have tried two things so far:
An IF statement - the problem here is that I do a SELECT COUNT(*) on the data set, but I cannot write IF @COUNT > 0, since @COUNT is itself a rowset and the IF statement wants a scalar variable.
Writing a custom outputter – but I have noticed that it is not the outputter that writes the file; the file gets created by some other code that runs after the custom outputter is done.
Does anyone have any guidance?
Thanks in advance!
One method is to cook your data into a table first. Then you can INSERT into the table instead of writing to a file (see the sketch below). Empty INSERTs do not cause job failures, nor will they affect runtime performance or future performance of the table. Let me know if you have other questions!
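A minimal sketch of that approach, assuming a managed table dbo.Results already exists with a schema matching the rowset @result (both names are placeholders):

// If @result is empty, this simply inserts zero rows; no empty file is produced
// and the job still succeeds.
INSERT INTO dbo.Results
SELECT *
FROM @result;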

Efficient way to load referenced data in one query

My application uses a database to save its data. I have a table Objects that looks like this:
localID | title | content
1       | Test  | "1,embed","3,embed","5,append"
and another table Contents that looks like
localID | content
1       | Alpha
2       | Beta
3       | Gamma
4       | Delta
5       | Epsilon
The main application runs in the main thread, and all the database work runs in a second thread. So when my application loads, I want to pass each record (QSqlRecord) to the main thread, where it gets further processed (loaded into real objects). I pass each record via signals. But my data is split across two tables. I want to return a record containing both, perhaps similar to a join:
localID | title | content
1       | Test  | "Alpha,embed","Gamma,embed","Epsilon,append"
This way, I would have all the needed information at once from a single return value of the thread. Without combining the tables, I would have to query the database for every single referenced content.
I expect the database to contain fewer than 100,000 records, yet some content may be big (files saved as BLOBs, e.g. a book of around 300 MB).
I have two questions:
1. (How) can I join the tables this way inside a query (efficiently)?
2. Am I too concerned about threading, and should I make it single-threaded? That way I would not need to bother with multiple read requests.
As a side note, this is my first post on Database Administrators; I was not too sure whether this site or Stack Overflow was the right place to ask this.
For any actual problem, use the way recommended by Vérace in the comments, i.e. a "linking" table. That is the way; a sketch of that design follows below.
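A minimal sketch of what such a linking-table design could look like in SQLite (the table, column, and mode names here are illustrative assumptions, not taken from the question):

-- one table per kind of entity
CREATE TABLE Objects  (localID INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE Contents (localID INTEGER PRIMARY KEY, content TEXT);

-- the linking table: one row per use of a content item by an object
CREATE TABLE ObjectContents (
    objectID  INTEGER NOT NULL REFERENCES Objects(localID),
    contentID INTEGER NOT NULL REFERENCES Contents(localID),
    mode      TEXT,      -- e.g. 'embed' or 'append'
    position  INTEGER    -- preserves the original ordering
);

-- one row per object, with the referenced contents folded back into one column
SELECT o.localID, o.title,
       group_concat('"' || c.content || ',' || oc.mode || '"') AS content
FROM Objects o
JOIN ObjectContents oc ON oc.objectID = o.localID
JOIN Contents c        ON c.localID   = oc.contentID
GROUP BY o.localID, o.title;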
However, if you are either forced to keep the database structure, or want to learn dirty tricks instead of good design, for fun or for learning (which is indicated by the migration header), have a look at this:
select
  localID, title,
  (
    with recursive cnt(x) as
    ( select ','||a.content
      union all
      select replace(x, '"'||b.localID||',', '_"'||b.content||',')
      from cnt, toy2 as b
    )
    select replace('_"'||replace(x, ',_"', ',"'), '_","', '"') from cnt
    where not x like '%,"%' LIMIT 1
  ) as 'content'
from toy as a;
This query works by:
- using a recursive method to flexibly replace the numbers by the Greek names (with no assumptions about the number of entries in the AlphaBeta table, or the number of their uses),
- applying a naming scheme with "_" to create an end condition,
- prepending a "_" to the content, to make it be processed and to cooperate with the end condition,
- cleaning up the end-condition "_"s for the desired output,
- cleaning up the special case at the start of the output line,
- selecting the result of the recursion together with the other desired outputs.
Note the assumption that your table does not naturally contain '__"' or '_"'. If it does, choose more "weird" marker strings there. If you have all kinds of strings in your table, then you are looking at a very meek example of what Vérace describes as "a disaster to happen". Actually, this non-trivial solution is in itself probably a disaster which already happened.
Output (.headers on and .mode column):
localid     title       content
----------  ----------  ---------------------------------------------
1           Test        "Alpha,embed","Gamma,embed","Epsilon,append"
2           mal         "Beta,append","Delta,embed"
Here is my mcve (.dump), with an additional line "mal" for testing purposes:
BEGIN TRANSACTION;
CREATE TABLE toy (localid int, title varchar(20), content varchar(100));
INSERT INTO toy VALUES(1,'Test','"1,embed","3,embed","5,append"');
INSERT INTO toy VALUES(2,'mal','"2,append","4,embed"');
CREATE TABLE toy2 (localID int, content varchar(10));
INSERT INTO toy2 VALUES(1,'Alpha');
INSERT INTO toy2 VALUES(2,'Beta');
INSERT INTO toy2 VALUES(3,'Gamma');
INSERT INTO toy2 VALUES(4,'Delta');
INSERT INTO toy2 VALUES(5,'Epsilon');
COMMIT;
(SQLite version used: 3.18.0 2017-03-28 18:48:43)

Extracting data files for different dates from database table

I am on Windows and on Oracle 11.0.2.
I have a table TEMP_TRANSACTION consisting of transactions for 6 months or so. Each record has a transaction date and other data with it.
Now I want to do the following:
1. Extract data from the table for each transaction date
2. Create a flat file with a name of the transaction date;
3. Output the data for this transaction date to the flat file;
4. Move on to the next date and then do the steps 1-3 again.
I created a simple SQL script to spool out the data for one transaction date, and it works. Now I want to put this in a loop (or something like that) so that it iterates over each transaction date.
I know this is asking for something from scratch but I need pointers on how to proceed.
I have Powershell, Java at hand and no access to Unix.
Please help!
Edit: Removed the powershell tag, as my primary goal is to get this out of Oracle (PL/SQL) and, only if that is not possible, to explore PowerShell or Java.
-Abhi
I was finally able to achieve what I was looking for. Below are the steps (maybe not the most efficient ones, but it did work :) )
1. Created a SQL script which spools the data I was looking for (for a single day):
set colsep '|'
column spoolname new_val spoolname;
select 'TRANSACTION_' || substr(&1,0,8) ||'.txt' spoolname from dual;
set echo off
set feedback off
set linesize 5000
set pagesize 0
set sqlprompt ''
set trimspool on
set headsep off
set verify off
spool &spoolname
Select
''||local_timestamp|| ''||'|'||Field1|| ''||'|'||field2
from << transaction_table >>
where local_timestamp = &1;
select 'XX|'|| count(1)
from <<source_table>>
where local_timestamp = &1;
spool off
exit
2. Created a file named content.txt, populated with the local timestamp values (i.e. the transaction date timestamps), such as the following; a way to generate this file directly from the table is sketched after the listing:
20141007000000
20140515000000
20140515000000
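As a side note, content.txt could also be generated from the table itself instead of by hand; a rough sketch, assuming the same local_timestamp column used in the script above:

set pagesize 0
set feedback off
set trimspool on
spool C:\TEMP\content.txt
select distinct local_timestamp from TEMP_TRANSACTION order by 1;
spool off
exit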
3. Finally, used a loop in PowerShell which picks up one value at a time from content.txt, calls the SQL script from step 1, and passes the value as a parameter:
PS C:\TEMP\data> $content = Get-Content C:\TEMP\content.txt
PS C:\TEMP\data> foreach ($line in $content){sqlplus user/password '@C:\temp\ExtractData.sql' $line}
And that is it!
I still have to refine a few things, but at least the idea of splitting the data is working :)
Hope this helps others who are looking for a similar thing.

SQLite on CSV file

I have a small statistics program which you can point at a CSV file. It tries to determine certain properties (e.g. which columns might be a date). Lately I have been reading a lot about SQLite and would like to port my application to make use of it, as this would make it easier to create new statistics: only a new SELECT would have to be written.
Now, what I would like to know is this: I know that SQLite can operate in memory, but of course I don't want to always load the whole file into memory, as it can become rather big. So I would like to point SQLite at the CSV file and provide the column information, so that I can run queries on it. It would also be nice if I could create an index in memory (or in a temporary directory) so that the statistics run faster. This would not need to modify the CSV, only do selects.
Can this be done out of the box? If not, can I write my own file manager and connect it to SQLite to achieve this? Writing my own file manager would only be an option if the effort is not too big, as I don't want to write full-blown database code.
The sqlite3 command-line shell can import a CSV file directly into a table:
$ cat data.csv
Cheese,7,12.3
Bacon,8,19.4
Eggs,3,20.3
# With no filename SQLite creates the database in memory.
$ sqlite3
sqlite> create table data (name text, units integer, price double);
sqlite> .separator ','
sqlite> .import data.csv data
sqlite> select * from data;
Cheese,7,12.3
Bacon,8,19.4
Eggs,3,20.3
You can add constraints and indexes on this table to help with your analysis; see the example below.
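For example, an index on the column you filter or group by most often should speed up repeated queries (the column name here is just an assumption):

sqlite> create index idx_data_name on data (name);
sqlite> select * from data where name = 'Bacon';
Bacon,8,19.4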
