SQL*Loader handling multiple values in a single column - oracle11g

I have a CSV file containing data that I want to load into my database. A sample row in that file looks something like this:
col1, col2, col3, "val1,val2,val3,val4,...", col5, col6, "val1,val2,val3,...", col8
As you can see, the fields are delimited by commas, and columns col4 and col7 are enclosed in double quotes because they each contain multiple values. I am trying to map these multi-valued fields into many-to-many relationships. For example, suppose every line in the CSV file represents a movie and col4 contains that movie's genres. My schema has a "Movies" table, a "Genre" table, and a junction table "Movie_Genre" holding movie_id and genre_id. While loading the data I want to insert the genres from col4 into "Genre" without duplicates, and at the same time link each movie in "Movies" to its genres through "Movie_Genre".
How would I achieve this using SQL*Loader with SQL*Plus on Oracle 11g (Windows)?

LOAD DATA
INFILE 'nameoffile.csv'
INSERT INTO TABLE Billing
-- fields are comma-separated; quoted fields may themselves contain commas
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS
(
col1,
col2
)
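
SQL*Loader alone will not fan one field out into rows of a child table, so a common pattern is to land each raw line in a staging table (keeping the quoted genre list as a single string) and split it with SQL afterwards. A rough sketch of that second step, assuming a staging table MOVIE_STAGE(MOVIE_ID, GENRES_RAW) populated by a control file like the one above, and assuming GENRE.GENRE_ID is filled by a sequence or trigger (all names here are placeholders, not your real schema):

-- Split the comma-separated list into rows (an 11g idiom: REGEXP_SUBSTR + CONNECT BY)
-- and insert only genres that are not already present.
INSERT INTO genre (genre_name)
SELECT DISTINCT TRIM(REGEXP_SUBSTR(s.genres_raw, '[^,]+', 1, LEVEL))
FROM movie_stage s
CONNECT BY LEVEL <= REGEXP_COUNT(s.genres_raw, ',') + 1
       AND PRIOR s.movie_id = s.movie_id
       AND PRIOR SYS_GUID() IS NOT NULL
MINUS
SELECT genre_name FROM genre;

-- Link each movie to its genres through the junction table.
INSERT INTO movie_genre (movie_id, genre_id)
SELECT DISTINCT x.movie_id, g.genre_id
FROM (SELECT s.movie_id,
             TRIM(REGEXP_SUBSTR(s.genres_raw, '[^,]+', 1, LEVEL)) AS genre_name
      FROM movie_stage s
      CONNECT BY LEVEL <= REGEXP_COUNT(s.genres_raw, ',') + 1
             AND PRIOR s.movie_id = s.movie_id
             AND PRIOR SYS_GUID() IS NOT NULL) x
JOIN genre g ON g.genre_name = x.genre_name;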

Related

sqlite - copy subset of tables and columns into new db-file

I have a database A.db, which contains tables t1, t2 and t3.
Now I want to create a new database B.db, which contains t1 and some chosen columns col1 and col4 from t2.
With .import I get hundreds of errors and it seems to work only for full tables.
.output sounds like I just save the output as it would be printed.
Basically, I need an insert into foo select ... across different files. How can I do this?
First you must attach A.db to your current database and give it an alias like adb.
Then write the insert statement just like you would if all the tables existed in the same database, qualifying the column names with the database alias.
It's good practice to list, in parentheses after insert into foo, all the columns of foo that you will fill from the other two tables, and to make sure their order matches the order of the columns in the select list:
attach database 'pathtoAdatabase/A.db' as adb;
insert into foo (column1, column2, .......)
select adb.t1.column1, adb.t1.column2, ...., adb.t2.col1, adb.t2.col4
from adb.t1 inner join adb.t2
on <join condition>
Replace <join condition> with the condition on which you will join the two tables to build the rows that you will insert into foo, something like:
adb.t1.id = adb.t2.id
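Put together, a complete example might look like the following (id, name, and qty are hypothetical column names used only for illustration):
attach database 'pathtoAdatabase/A.db' as adb;
insert into foo (name, qty, col1, col4)
select adb.t1.name, adb.t1.qty, adb.t2.col1, adb.t2.col4
from adb.t1
inner join adb.t2 on adb.t1.id = adb.t2.id;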

Is it possible to import a CSV file to an existing table without the headers being included?

I'm trying to import a CSV file to a table that is empty but already exists in an SQLite database. For example:
sqlite> CREATE TABLE data (...);
sqlite> .mode csv
sqlite> .import mydata.csv data
I have created the table in advance because I'd like to specify a primary key, data types, and foreign key constraints. This process works as expected, but it unfortunately includes the header row from the CSV file in the table.
Here's what I've learned from the SQLite docs regarding CSV imports:
There are two cases to consider: (1) Table "tab1" does not previously exist and (2) table "tab1" does already exist.
In the first case, when the table does not previously exist, the table is automatically created and the content of the first row of the input CSV file is used to determine the name of all the columns in the table. In other words, if the table does not previously exist, the first row of the CSV file is interpreted to be column names and the actual data starts on the second row of the CSV file.
For the second case, when the table already exists, every row of the CSV file, including the first row, is assumed to be actual content. If the CSV file contains an initial row of column labels, that row will be read as data and inserted into the table. To avoid this, make sure that table does not previously exist.
So basically, I get extra data because I've created the table in advance. Is there a flag to change this behavior? If not, what's the best workaround?
Older versions of the sqlite3 command-line shell have no such flag.
If you have a sufficiently advanced OS, you can use an external tool to split off the first line:
sqlite> .import "|tail -n +2 mydata.csv" data
Newer versions of the shell (3.32.0 and later) also support a --skip option on .import, as documented on the sqlite3 website. So you can use the following command:
.import --csv --skip 1 mydata.csv data
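If you are stuck on an older shell without --skip and cannot pipe through an external tool, one workaround (a sketch, assuming the table was empty before the import, so the header row received the lowest rowid) is to delete the header row afterwards:
sqlite> .mode csv
sqlite> .import mydata.csv data
sqlite> DELETE FROM data WHERE rowid = (SELECT min(rowid) FROM data);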

Adding value to existing database table in RSQLite

I am new to RSQLite.
I have an input document in text format in which the values are separated by '|'.
I created a table with the required variables (dummy code follows):
library(RSQLite)
db <- dbConnect(SQLite(), dbname = "test.sqlite")
dbSendQuery(conn = db,
"CREATE TABLE TABLE1(
MARKS INTEGER,
ROLLNUM INTEGER,
NAME CHAR(25),
DATED DATE)"
)
However, I am stuck at how to import values into the created table.
I cannot use the INSERT INTO ... VALUES command because there are thousands of rows and more than 20 columns in the original data file, so it is impossible to type in each data point manually.
Can someone suggest a more efficient way to do this?
You are using a scripting language. The whole point of that is literally to avoid manually typing each data point. Sorry.
You have two routes:
1: You have correctly opened a database connection and created an empty table in your SQLite database. Nice!
To load data into the table, load your text file into R using e.g. df <- read.table('textfile.txt', sep='|') (modify the arguments to fit your text file).
To have a 'dynamic' INSERT statement, you can use placeholders. RSQLite allows both named and positional placeholders. To insert a single row, you can do:
dbSendQuery(db, 'INSERT INTO table1 (MARKS, ROLLNUM, NAME) VALUES (?, ?, ?);', params = list(1, 16, 'Big fellow'))
You see? The first ? got the value 1, the second ? got 16, and the last ? got the string Big fellow. Also note that you do not enclose the placeholders for text in quotation marks (' or ")!
Now, you have thousands of rows. Or just more than one. Either way, you can send in your whole data frame. dbSendQuery has two requirements: 1) each vector must have the same number of entries (not an issue when providing a data.frame), and 2) you must submit exactly as many vectors as there are placeholders.
I assume your data frame df contains columns mark, roll, and name, corresponding to the table's columns. Then you may run:
dbSendQuery(db, 'INSERT INTO table1 (MARKS, ROLLNUM, NAME) VALUES (:mark, :roll, :name);', params = df)
This will execute an INSERT statement for each row in df!
TIP! Because an INSERT statement is executed for each row, inserting thousands of rows can take a long time: after every insert, data is written to disk and indices are updated. Instead, enclose the inserts in a transaction:
dbBegin(db)
res <- dbSendQuery(db, 'INSERT ...;', params = df)
dbClearResult(res)
dbCommit(db)
and SQLite will stage the data in a journal file, only persisting the result when you execute dbCommit(db). Try both methods and compare the speed!
2: Ah, yes. The second way. This can be done entirely in SQLite.
With the SQLite command-line utility (sqlite3 from your command line, not R), you can attach a text file as a table and simply run an INSERT INTO ... SELECT ...; command. Alternatively, read the text file into a temporary table in sqlite3 and run an INSERT INTO ... SELECT ...; from there.
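For instance, a sketch of the temporary-table route in the sqlite3 shell (file, table, and column names are placeholders matching the example above):
CREATE TABLE staging (MARKS INTEGER, ROLLNUM INTEGER, NAME TEXT, DATED TEXT);
.separator |
.import textfile.txt staging
INSERT INTO TABLE1 (MARKS, ROLLNUM, NAME, DATED)
SELECT MARKS, ROLLNUM, NAME, DATED FROM staging;
DROP TABLE staging;
Creating staging before the .import ensures every line of the file is treated as data rather than as column names.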
Useful site to remember: http://www.sqlite.com/lang.html
A little late to the party, but DBI provides dbAppendTable() which will write the contents of a dataframe to an SQL table. Column names in the dataframe must match the field names in the database. For your example, the following code would insert the contents of my random dataframe into your newly created table.
library(DBI)
db <- dbConnect(RSQLite::SQLite(), dbname = ":memory:")
dbExecute(db,
"CREATE TABLE TABLE1(
MARKS INTEGER,
ROLLNUM INTEGER,
NAME TEXT
)"
)
df <- data.frame(MARKS = sample(1:100, 10),
ROLLNUM = sample(1:100, 10),
NAME = stringi::stri_rand_strings(10, 10))
dbAppendTable(db, "TABLE1", df)
I don't think there is a nice way to do a large number of inserts directly from R. SQLite does have a bulk insert functionality, but the RSQLite package does not appear to expose it.
From the command line you may try the following:
.separator |
.import your_file.csv your_table
where your_file.csv is the CSV (or pipe delimited) file containing your data and your_table is the destination table.
See the documentation under CSV Import for more information.

How to cut a single file into multiple files and save them with different names in UNIX?

I have a file that contains CREATE TABLE statements for several tables.
I want to cut the file into multiple parts and save each part under the table name that appears in the statement.
For example I have,
Create table abc_bcd.xyz , .....
Create table abc_bcd.pqr , ......
Now I want the file to be split at every table definition, with the pieces saved under the names xyz and pqr.
I tried
awk '/CREATE/{x="TABLE"++i ".ddl";}{print > x;}' filename
But this gave me files named TABLE1.ddl, TABLE2.ddl, etc.
Thanks in advance.
@shellter Yes, the complete create table script spans more than one line.
The file is like:
create table abc_bcd.xyz ,
( column1 datatype,
column2 datatype,
......);
create table abc_bcd.pqr ,
( column1 datatype,
column2 datatype,
......);
...
...
...
My requirement is to cut the file table-wise and save the pieces under the names xyz, pqr, and so on.
I tried to combine awk and grep but could not achieve my requirement.
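One way to get there with awk alone is to build the output file name from the third field of each create table line (the schema.table token) instead of a counter. A sketch, assuming the format shown above:
awk 'tolower($1) == "create" && tolower($2) == "table" {
    split($3, parts, ".")          # $3 is schema.table, e.g. abc_bcd.xyz
    sub(/,.*/, "", parts[2])       # drop a trailing comma if one is attached
    if (x) close(x)                # close the previous output file
    x = parts[2] ".ddl"
}
x { print > x }' filename
Each create table line switches output to a new file named after the table (xyz.ddl, pqr.ddl, and so on), and every following line is appended to the current file until the next create table.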

Sqlite Create new columns from column containing ;

I have a database test.db whose first column contains values like 123456;abcdef;ghijk. Is it possible to split the data into its own columns?
123456 never changes length.
abcdef and ghijk change length and also may contain nothing.
I have tried the query below, but the ; ends up in either t2 or t3, depending on the lengths of abcdef and ghijk.
select substr(column,1,6) AS "t1",
substr(column,8,6) AS "t2",
substr(column,15,10) AS "t3" test.db
Is the ; separator causing the issue?
Or can I dump the database to .sql, reformat the text, and then import it into a new database?
There is no built-in SQLite function that can split strings like this.
If you are using the SQLite C API or a wrapper like APSW, you could create your own function (C, APSW).
If you want to do nothing more than a one-time conversion, export/import through a text file would be the simplest solution.
You can split your rows into columns with nested substr/instr calls; the middle field has to be located by searching the remainder of the string, because its length varies:
create table t1 as
select
  -- everything before the first ';'
  substr(c1, 1, instr(c1, ';') - 1) as column1,
  -- between the first and second ';' (found by searching the remainder)
  substr(c1, instr(c1, ';') + 1,
         instr(substr(c1, instr(c1, ';') + 1), ';') - 1) as column2,
  -- everything after the second ';'
  substr(c1, instr(c1, ';') + 1
           + instr(substr(c1, instr(c1, ';') + 1), ';')) as column3
from table_test;
where c1 is the column you are selecting from.
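A quick way to sanity-check the expressions before rewriting the table (the literal here is just a sample value):
with sample(c1) as (select '123456;abcdef;ghijk')
select substr(c1, 1, instr(c1, ';') - 1) as column1,
       substr(c1, instr(c1, ';') + 1,
              instr(substr(c1, instr(c1, ';') + 1), ';') - 1) as column2,
       substr(c1, instr(c1, ';') + 1
                + instr(substr(c1, instr(c1, ';') + 1), ';')) as column3
from sample;
-- returns: 123456 | abcdef | ghijk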
