Extracting data files for different dates from a database table - Oracle 11g

I am on Windows, on Oracle 11.0.2.
I have a table TEMP_TRANSACTION containing roughly six months of transactions. Each record has a transaction date and other data with it.
Now I want to do the following:
1. Extract data from the table for each transaction date;
2. Create a flat file named after the transaction date;
3. Output the data for that transaction date to the flat file;
4. Move on to the next date and repeat steps 1-3.
I created a simple SQL script that spools the data out for a single transaction date, and it works. Now I want to put this in a loop (or something like that) so that it iterates over each transaction date.
I know this is asking for something from scratch, but I need pointers on how to proceed.
I have PowerShell and Java at hand, and no access to Unix.
Please help!
Edit: Removed the powershell tag, as my primary goal is to get this done from Oracle (PL/SQL) and, only failing that, to explore PowerShell or Java.
-Abhi

I was finally able to achieve what I was looking for. Below are the steps (maybe not the most efficient ones, but they did work :) )
1. Created a SQL script which spools the data I was looking for (for a single day):
set colsep '|'
column spoolname new_value spoolname
-- build the spool file name from the first 8 characters (YYYYMMDD) of the parameter
select 'TRANSACTION_' || substr(&1, 1, 8) || '.txt' spoolname from dual;
set echo off
set feedback off
set linesize 5000
set pagesize 0
set sqlprompt ''
set trimspool on
set headsep off
set verify off
spool &spoolname
select local_timestamp || '|' || field1 || '|' || field2
from << transaction_table >>
where local_timestamp = &1;
-- trailer record: an XX marker plus the row count for the day
select 'XX|' || count(1)
from << source_table >>
where local_timestamp = &1;
spool off
exit
2. Created a file named content.txt and populated it with the local timestamp values (i.e. the transaction date timestamps), one per line:
20141007000000
20140515000000
20140515000000
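As an aside, content.txt does not have to be maintained by hand; the list of distinct dates can itself be spooled from the table. A minimal sketch, assuming local_timestamp can be rendered with to_char in the YYYYMMDDHH24MISS format used above:
set pagesize 0
set feedback off
set trimspool on
spool C:\TEMP\content.txt
-- one distinct transaction timestamp per line, matching the format above
select distinct to_char(local_timestamp, 'YYYYMMDDHH24MISS')
from << transaction_table >>
order by 1;
spool off
exit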
3. Finally, I used a PowerShell loop which picked up one value at a time from content.txt, called the SQL script from step 1, and passed the value as the parameter:
PS C:\TEMP\data> $content = Get-Content C:\TEMP\content.txt
PS C:\TEMP\data> foreach ($line in $content){sqlplus user/password '@C:\temp\ExtractData.sql' $line}
And that is it!
I still have to refine a few things, but at least the idea of splitting the data is working :)
Hope this helps others who are looking for a similar thing.
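For completeness, since the edit in the question says the primary goal was to do this from inside Oracle: the same split can be done entirely in PL/SQL with UTL_FILE, with no client-side loop at all. A rough sketch, assuming local_timestamp is a DATE or TIMESTAMP column, that a directory object (here called DATA_DIR, pointing at the output folder) exists and is writable, and reusing the field placeholders from the script above:
-- sketch of a pure PL/SQL alternative; DATA_DIR is an assumed directory object:
--   create directory data_dir as 'C:\TEMP\data';
--   grant read, write on directory data_dir to <user>;
declare
  v_file utl_file.file_type;
begin
  for d in (select distinct local_timestamp from temp_transaction order by 1) loop
    v_file := utl_file.fopen('DATA_DIR',
                             'TRANSACTION_' || to_char(d.local_timestamp, 'YYYYMMDD') || '.txt',
                             'w');
    for r in (select local_timestamp, field1, field2
              from temp_transaction
              where local_timestamp = d.local_timestamp) loop
      utl_file.put_line(v_file, r.local_timestamp || '|' || r.field1 || '|' || r.field2);
    end loop;
    utl_file.fclose(v_file);
  end loop;
end;
/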

Related

Datafile or tablespace usage information

Without creating a trigger, are there any V$ views that show when a tablespace or datafile was last accessed or used?
To give you an idea of why: I'm looking to do some reorg, and it would be nice to know whether I can take a particular object or tablespace offline.
DBA_HIST_SEG_STAT records the number of reads per tablespace per snapshot. The DBA_HIST_ tables are only periodically refreshed, normally once per hour. To retrieve the latest data, a very similar query using V$SEGMENT_STATISTICS would need to be UNIONed to the query below.
Finding the information per data file is trickier. That information is in DBA_HIST_ACTIVE_SESS_HISTORY, usually in the column P1 when P1TEXT = 'file#'. But that information is only a sample; it is quite possible that a single read of a data file would not be captured.
Note that using the DBA_HIST_ tables requires the Diagnostics Pack license.
select v$tablespace.name, begin_interval_time, end_interval_time,
       sum(logical_reads_delta)
from dba_hist_seg_stat
join dba_hist_snapshot using (snap_id, dbid, instance_number)
join v$tablespace using (ts#)
group by v$tablespace.name, begin_interval_time, end_interval_time
having sum(logical_reads_delta) > 0
order by v$tablespace.name, begin_interval_time desc;
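A rough sketch of the V$SEGMENT_STATISTICS variant mentioned above, covering activity since instance startup that has not yet been snapshotted; treat it as a starting point rather than a tested query:
-- current, not-yet-snapshotted logical reads per tablespace
select tablespace_name, sum(value)
from v$segment_statistics
where statistic_name = 'logical reads'
group by tablespace_name
having sum(value) > 0
order by tablespace_name;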

Understanding the ORA_ROWSCN behavior in Oracle

So this is essentially a follow-up question on Finding duplicate records.
We perform data imports from text files every day, and we ended up importing 10163 records, spread across 182 files, twice. On running the query mentioned there to find duplicates, the total count of records we got was 10174, which is 11 more than the files contain. I allowed for the possibility that two records that are exactly the same, yet valid, would be counted as well by that query. So I thought it would be better to use a timestamp field and simply find all the records inserted today (which would include the duplicate rows). I used ORA_ROWSCN in the following query:
select count(*) from my_table
where TRUNC(SCN_TO_TIMESTAMP(ORA_ROWSCN)) = '01-MAR-2012';
However, the count is still off: 10168. Now, I am pretty sure that the total number of lines in the files is 10163, from running wc -l *.txt in the folder that contains all the files.
Is it possible to find out which rows are actually inserted twice?
By default, ORA_ROWSCN is stored at the block level, not at the row level. It is only stored at the row level if the table was originally built with ROWDEPENDENCIES enabled. Assuming that you can fit many rows of your table in a single block, and that you're not using the APPEND hint to insert the new data above the existing high water mark of the table, you are likely inserting new data into blocks that already have some existing data in them. By default, that is going to change the ORA_ROWSCN of every row in the block, causing your query to count more rows than were actually inserted.
Since ORA_ROWSCN is only guaranteed to be an upper bound on the last time there was DML on a row, it would be much more common to determine how many rows were inserted today by adding a CREATE_DATE column to the table that defaults to SYSDATE, or by relying on SQL%ROWCOUNT after your INSERT ran (assuming, of course, that you are using a single INSERT statement to insert all the rows).
Generally, using ORA_ROWSCN and the SCN_TO_TIMESTAMP function is going to be a problematic way to identify when a row was inserted, even if the table is built with ROWDEPENDENCIES. ORA_ROWSCN returns an Oracle SCN, which is a System Change Number. This is a unique identifier for a particular change (i.e. a transaction). As such, there is no direct link between an SCN and a time -- my database might be generating SCNs a million times more quickly than yours, and my SCN 1 may be years different from your SCN 1. The Oracle background process SMON maintains a table that maps SCN values to approximate timestamps, but it only maintains that data for a limited period of time -- otherwise, your database would end up with a multi-billion row table that was just storing SCN to timestamp mappings. If the row was inserted more than, say, a week ago (the exact limit depends on the database and database version), SCN_TO_TIMESTAMP won't be able to convert the SCN to a timestamp and will return an error.
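A minimal sketch of the CREATE_DATE approach suggested above (my_table is from the question; everything else is illustrative):
-- illustrative sketch: add a default-populated audit column
-- note: existing rows pick up the date of this ALTER, not their true creation
-- date, so only rows loaded after this change can be reliably dated
alter table my_table add (create_date date default sysdate);

-- after a load, count exactly the rows inserted today
select count(*)
from my_table
where create_date >= trunc(sysdate);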

Building Accessories Schema and Bulk Insert

I developed an automation application for a car service. I have started on the accessories module, but I can't work out how I should build the data model schema.
I've got the accessories data in a text file, line by line (not a CSV or the like; because of that, I split the lines by substring). Every month, the factory sends the data file to the service. It includes the prices, the names, the codes, etc., and every month the prices are updated. I thought a bulk insert was a good choice to get the data into SQL (and I did that), but it's not a solution to my problem: I don't want duplicated data just to capture the new prices. I thought about inserting only the prices into another table and building a relation between Accessories and AccessoriesPrices, but sometimes new accessories are added to the list, so I have to check every line against the Accessories table. On top of that, I have to keep the quantity of the accessories, the invoices, etc.
By the way, they send 70,000 lines every month. So, can anyone help me? :)
Thanks.
70,000 lines is not a large file. You'll have to parse the file yourself and issue ordinary insert and update statements based upon the data contained therein. There's no need to use bulk operations for data of this size.
The most common approach to something like this would be to write a simple SQL statement that accepts all of the parameters and then does something like this:
if(exists(select * from YourTable where <exists condition>))
    update YourTable set <new values> where <exists condition>
else
    insert into YourTable (<columns>) values(<values>)
(Alternatively, you could rewrite this statement to use the T-SQL MERGE statement; see the sketch after the list below.)
Where...
<exists condition> represents whatever you would need to check to see if the item already exists
<new values> is the set of Column = value statements for the columns you want to update
<columns> is the set of columns to insert data into for new items
<values> is the set of values that corresponds to the previous list of columns
You would then loop over each line in your file, parsing the data into parameter values, then running the above SQL statement using those parameters.
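A hedged sketch of that MERGE alternative, with hypothetical column names (code as the natural key, name and price as the payload) and the statement's parameters as @-variables:
-- upsert one parsed line per execution (hypothetical columns)
merge into Accessories as a
using (select @code as code, @name as name, @price as price) as src
on a.code = src.code
when matched then
    update set a.name = src.name, a.price = src.price
when not matched then
    insert (code, name, price) values (src.code, src.name, src.price);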

How to restrict PL/SQL code from executing twice with the same values to the input parameters?

I want to prevent my PL/SQL code from being executed twice with the same values for the input parameters. That is, I have written a PL/SQL procedure with three input parameters, viz. Month, Year and a Flag, and I have executed it with the following values:
Month: March
Year: 2011
Flag: Y
Now, if I try to execute the procedure with the same values for the parameters as above, I want to write some code in the PL/SQL to block the unwanted second execution. Can anyone help? I hope the question is not ambiguous.
You can use the function result cache: http://www.oracle-developer.net/display.php?id=504 . So Oracle can do this for you.
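A minimal sketch of what that looks like (the function name and body are illustrative; note the result cache suits deterministic functions, and it caches results rather than strictly blocking a second call):
-- repeat calls with the same (p_month, p_year, p_flag) are served from the
-- cache instead of re-running the body (11g syntax; illustrative only)
create or replace function process_month(
    p_month in varchar2,
    p_year  in number,
    p_flag  in varchar2
) return varchar2
    result_cache
is
begin
    -- the expensive processing would go here
    return 'DONE';
end;
/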
I would create another table that stores the 3 parameters of each request. When your procedure is called, it would first check this "parameter request" table to see if the calling parameters have been used before. If found, then exit the procedure. If not found, then save the parameters and execute the rest of the procedure.
You're going to need to keep "state" of the last call somewhere. I would recommend creating a table with a datetime column.
When your procedure is called, update this table. Then, the next time your procedure is called, check this table to see when it was last called and proceed accordingly.
Why not set up a table to track what arguments you've already executed it with?
In your procedure, first check that table to see if similar parameters have already been processed. If so, exit (with or without an error).
If not, insert them and do the processing necessary.
Depending on how tight the requirements are, you'll need to get an exclusive lock on that table to prevent concurrent execution.
A nice plus would be an extra column with "in progress"/"done"/"error" status so that you can check if things are going on properly. (Maybe a timestamp too if that's important/interesting.)
This setup allows you to easily clear some of the executions (by deleting some rows) if you find things need to be re-done for whatever reason.
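A rough sketch of this tracking-table approach, including the status and timestamp extras suggested above (all object names hypothetical); a unique constraint makes the duplicate check atomic, so concurrent callers fail cleanly instead of needing an explicit table lock:
create table proc_run_log (
    run_month  varchar2(10),
    run_year   number,
    run_flag   varchar2(1),
    run_status varchar2(12) default 'IN PROGRESS',
    run_time   date default sysdate,
    constraint proc_run_log_uq unique (run_month, run_year, run_flag)
);

create or replace procedure run_once(p_month varchar2, p_year number, p_flag varchar2)
is
begin
    -- the unique constraint rejects a repeat call with the same parameters
    insert into proc_run_log (run_month, run_year, run_flag)
    values (p_month, p_year, p_flag);

    -- ... the real processing goes here ...

    update proc_run_log
       set run_status = 'DONE'
     where run_month = p_month and run_year = p_year and run_flag = p_flag;
exception
    when dup_val_on_index then
        raise_application_error(-20000, 'Already executed with these parameter values');
end;
/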
Make an insert at the beginning of the procedure, and do a select for update to lock the table so no one else can process any data; if everything goes OK with the procedure, commit and release the table 😀

How to find out which package/procedure is updating a table?

I would like to find out whether it is possible to determine which package, or which procedure within a package, is updating a table.
Due to a certain project being handed over without proper documentation (the person who handed it over has since left), data that we know we have updated always reverts to some strange source value.
We are guessing that this could be a database job or scheduler task that is running the update command without our knowledge. I am hoping there is a way to find out where the updating code is being called from, by putting a trigger on the table that we are monitoring to record the source of the change.
Any ideas?
Thanks.
UPDATE: I poked around and found out how to trace a statement back to its owning PL/SQL object.
In combination with what Tony mentioned, you can create a logging table and a trigger that looks like this:
CREATE TABLE statement_tracker
( SID NUMBER
, serial# NUMBER
, date_run DATE
, program VARCHAR2(48) null
, module VARCHAR2(48) null
, machine VARCHAR2(64) null
, osuser VARCHAR2(30) null
, sql_text CLOB null
, program_id number
);
CREATE OR REPLACE TRIGGER smb_t_t
AFTER UPDATE
ON smb_test
BEGIN
  -- log who and what issued the update, straight from v$session/v$sql
  INSERT
  INTO statement_tracker
  SELECT ss.SID
       , ss.serial#
       , sysdate
       , ss.program
       , ss.module
       , ss.machine
       , ss.osuser
       , sq.sql_fulltext
       , sq.program_id
  FROM v$session ss
     , v$sql sq
  WHERE ss.sql_address = sq.address
  AND ss.SID = SYS_CONTEXT('USERENV', 'SID');
END;
/
In order for the trigger above to compile, you'll need to grant the owner of the trigger these permissions, when logged in as the SYS user:
grant select on V_$SESSION to <user>;
grant select on V_$SQL to <user>;
You will likely want to protect the insert statement in the trigger with some condition that only makes it log when the change you're interested in is occurring - on my test server this statement runs rather slowly (1 second), so I wouldn't want to be logging all these updates. Of course, in that case, you'd need to change the trigger to be a row-level one so that you could inspect the :new or :old values. If you are really concerned about the overhead of the select, you can change it to not join against v$sql, and instead just save the SQL_ADDRESS column, then schedule a job with DBMS_JOB to go off and update the sql_text column with a second update statement, thereby offloading the update into another session and not blocking your original update.
Unfortunately, this will only tell you half the story. The statement you're going to see logged is going to be the most proximal statement - in this case, an update - even if the original statement executed by the process that initiated it is a stored procedure. This is where the program_id column comes in. If the update statement is part of a procedure or trigger, program_id will point to the object_id of the code in question - you can resolve it thusly:
SELECT * FROM all_objects where object_id = <program_id>;
In the case where the update statement was executed directly from the client, I don't know what program_id represents, but you wouldn't need it - you'd have the name of the executable in the "program" column of statement_tracker. If the update was executed from an anonymous PL/SQL block, I'm not sure how to track it back - you'll need to experiment further.
It may be, though, that the osuser/machine/program/module information may be enough to get you pointed in the right direction.
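For reference, a sketch of the row-level variant mentioned above, which fires only when a column of interest actually changes (the status column is hypothetical):
-- row-level variant: one log row per updated row, only on a real change
CREATE OR REPLACE TRIGGER smb_t_t_row
AFTER UPDATE OF status
ON smb_test
FOR EACH ROW
WHEN (old.status <> new.status)
BEGIN
  INSERT
  INTO statement_tracker
  SELECT ss.SID, ss.serial#, sysdate, ss.program, ss.module,
         ss.machine, ss.osuser, sq.sql_fulltext, sq.program_id
  FROM v$session ss
     , v$sql sq
  WHERE ss.sql_address = sq.address
  AND ss.SID = SYS_CONTEXT('USERENV', 'SID');
END;
/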
If it is a scheduled database job then you can find out what scheduled database jobs exist and look into what they do. Other things you can do are:
Look at the dependency views, e.g. ALL_DEPENDENCIES, to see what packages/triggers etc. use that table. Depending on the size of your system, that may return a lot of objects to trawl through.
Search all the database source code for references to the table like this:
select distinct type, name
from all_source
where lower(text) like lower('%mytable%');
Again that may return a lot of objects, and of course there will be some "false positives" where the search string appears but isn't actually a reference to that table. You could even try something more specific like:
select distinct type, name
from all_source
where lower(text) like lower('%insert into mytable%');
but of course that would miss cases where the command was formatted differently.
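A slightly more tolerant middle ground, assuming only that the amount of whitespace varies (Oracle 10g+ regular expressions), is:
-- case-insensitive, whitespace-tolerant search of the source code
select distinct type, name
from all_source
where regexp_like(text, 'insert\s+into\s+mytable', 'i');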
Additionally, could there be SQL scripts being run through "cron" jobs on the server?
Just write an "after update" trigger and, in this trigger, log the results of DBMS_UTILITY.FORMAT_CALL_STACK in a dedicated table.
The purpose of this function is exactly to give you the complete call stack of all the stored procedures and triggers that were fired to reach your code.
I am writing from the mobile app, so I can't give you more detailed examples, but if you google for it you'll find many of them.
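Since that answer couldn't include one, here is a minimal sketch of the idea (the log table and trigger names are hypothetical; smb_test is reused from the example above):
create table call_stack_log (
    logged_at  date,
    call_stack varchar2(4000)
);

create or replace trigger smb_test_stack_trg
after update on smb_test
declare
    v_stack varchar2(4000);
begin
    -- the call stack names every procedure/trigger that led to this update
    v_stack := substr(dbms_utility.format_call_stack, 1, 4000);
    insert into call_stack_log (logged_at, call_stack)
    values (sysdate, v_stack);
end;
/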
A quick and dirty option, if you're working locally and are only interested in the first thing that alters the data, is to throw an error in the trigger instead of logging. That way you get the usual stack trace, it's a lot less typing, and you don't need to create a new table:
CREATE OR REPLACE TRIGGER who_changed_it  -- any trigger name will do
AFTER UPDATE ON table_of_interest
BEGIN
  RAISE_APPLICATION_ERROR(-20001, 'something changed it');
END;
/
