I have a database A.db, which contains tables t1, t2 and t3.
Now I want to create a new database B.db, which contains t1 and some chosen columns col1 and col4 from t2.
With .import I get hundreds of errors and it seems to work only for full tables.
.output sounds like I just save the output as it would be printed.
Basically, I need an insert into foo select ... across different files. How can I do this?
First you must attach A.db to your current database and give it an alias like adb.
Then write the insert statement just like you would if all the tables existed in the same database, qualifying the column names with the database alias.
It's good practice to list, in parentheses after insert into foo, all the columns of foo that you will populate from the other two tables, and to make sure their order matches the order of the columns in the select list:
attach database 'pathtoAdatabase/A.db' as adb;
insert into foo (column1, column2, .......)
select adb.t1.column1, adb.t1.column2, ...., adb.t2.col1, adb.t2.col4
from adb.t1 inner join adb.t2
on <join condition>
Replace <join condition> with the condition on which you will join the two tables to build the rows that you will insert into foo, something like:
adb.t1.id = adb.t2.id
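Putting it together for the concrete case in the question, a minimal sketch run inside B.db (the join column id is an assumption; use whatever key actually links t1 and t2):
-- inside: sqlite3 B.db
attach database 'pathtoAdatabase/A.db' as adb;
create table t1 as select * from adb.t1;              -- copy t1 wholesale
create table foo as                                   -- t1's rows plus two columns of t2
  select a.*, b.col1, b.col4
  from adb.t1 a inner join adb.t2 b on a.id = b.id;   -- 'id' is a hypothetical key
detach database adb;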
I am new to RSQLite.
I have an input document in text format in which values are separated by '|'.
I created a table with the required variables (dummy code as follows)
library(RSQLite)
db <- dbConnect(SQLite(), dbname = "test.sqlite")
dbSendQuery(conn = db,
            "CREATE TABLE TABLE1(
             MARKS INTEGER,
             ROLLNUM INTEGER,
             NAME CHAR(25),
             DATED DATE)"
)
However I am stuck at how to import values into the created table.
I cannot use the INSERT INTO ... VALUES command as there are thousands of rows and more than 20 columns in the original data file, and it is impossible to manually type in each data point.
Can someone suggest an alternative efficient way to do so?
You are using a scripting language. The whole point of this is to avoid manually typing each data point. Sorry.
You have two routes:
1: You have correctly loaded a database connection and created an empty table in your SQLite database. Nice!
To load data into the table, load your text file into R using e.g. df <- read.table('textfile.txt', sep='|') (modify arguments to fit your text file).
To have a 'dynamic' INSERT statement, you can use placeholders. RSQLite allows both named and positional placeholders. To insert a single row, you can do:
dbSendQuery(db, 'INSERT INTO table1 (MARKS, ROLLNUM, NAME) VALUES (?, ?, ?);', list(1, 16, 'Big fellow'))
You see? The first ? got value 1, the second ? got value 16, and the last ? got the string Big fellow. Also note that you do not enclose placeholders for text in quotation marks (' or ")!
Now, you have thousands of rows. Or just more than one. Either way, you can send in your data frame. dbSendQuery has some requirements: 1) each vector must have the same number of entries (not an issue when providing a data.frame), and 2) you may only submit the same number of vectors as you have placeholders.
I assume your data frame df contains columns mark, roll, and name, corresponding to the table's columns. Then you may run:
dbSendQuery(db, 'INSERT INTO table1 (MARKS, ROLLNUM, NAME) VALUES (:mark, :roll, :name);', df)
This will execute an INSERT statement for each row in df!
TIP! Because an INSERT statement is executed for each row, inserting thousands of rows can take a long time: after each insert, data is written to file and indices are updated. Instead, enclose the inserts in a transaction:
dbBegin(db)
res <- dbSendQuery(db, 'INSERT ...;', df)
dbClearResult(res)
dbCommit(db)
and SQLite will save the data to a journal file, and only save the result when you execute the dbCommit(db). Try both methods and compare the speed!
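To actually compare them, a rough timing sketch (reusing db, table1, and the df from above; note it inserts the rows twice, once per method):
# method 1: plain inserts, one implicit write per row
system.time({
  res <- dbSendQuery(db, 'INSERT INTO table1 (MARKS, ROLLNUM, NAME) VALUES (:mark, :roll, :name);', df)
  dbClearResult(res)
})
# method 2: the same inserts wrapped in a single transaction
system.time({
  dbBegin(db)
  res <- dbSendQuery(db, 'INSERT INTO table1 (MARKS, ROLLNUM, NAME) VALUES (:mark, :roll, :name);', df)
  dbClearResult(res)
  dbCommit(db)
})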
2: Ah, yes. The second way. This can be done in SQLite entirely.
With the SQLite command utility (sqlite3 from your command line, not R), you can import a text file directly into the destination table, or read it into a temporary table first and then run an INSERT INTO ... SELECT ...; command.
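A sketch of the temporary-table route inside the sqlite3 shell, reusing the names from the question (staging is a made-up name for the scratch table):
$ sqlite3 test.sqlite
sqlite> .separator |
sqlite> .import textfile.txt staging
sqlite> INSERT INTO TABLE1 (MARKS, ROLLNUM, NAME, DATED)
   ...>   SELECT * FROM staging;
sqlite> DROP TABLE staging;
This assumes the file's columns come in the same order as TABLE1's. Also note that if staging does not already exist, .import creates it using the file's first line as column names, so either make sure the first line is a header or create staging yourself beforehand.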
Useful site to remember: http://www.sqlite.com/lang.html
A little late to the party, but DBI provides dbAppendTable() which will write the contents of a dataframe to an SQL table. Column names in the dataframe must match the field names in the database. For your example, the following code would insert the contents of my random dataframe into your newly created table.
library(DBI)
db <- dbConnect(RSQLite::SQLite(), dbname = ":memory:")
dbExecute(db,
"CREATE TABLE TABLE1(
MARKS INTEGER,
ROLLNUM INTEGER,
NAME TEXT
)"
)
df <- data.frame(MARKS = sample(1:100, 10),
ROLLNUM = sample(1:100, 10),
NAME = stringi::stri_rand_strings(10, 10))
dbAppendTable(db, "TABLE1", df)
I don't think there is a nice way to do a large number of inserts directly from R. SQLite does have a bulk insert functionality, but the RSQLite package does not appear to expose it.
From the command line you may try the following:
.separator |
.import your_file.csv your_table
where your_file.csv is the CSV (or pipe delimited) file containing your data and your_table is the destination table.
See the documentation under CSV Import for more information.
Sorry in advance, I am new to RStudio...
There are two parts to this question:
1) I have a large database that has almost 6,000 tables in it. Many of these tables have no data in them. Is there a way, using R, to pull only a list of the table names that have data in them?
I know how to pull a list of all table names and how to pull specific table data using the code below..
library(RODBC)
test <- odbcDriverConnect('driver={SQL Server};server=(SERVER);database=(DB_Name);trusted_connection=true')
rest <- sqlQuery(test, 'SELECT * FROM information_schema.tables')
Table1 <- sqlFetch(test, "PROPERTY")
Above is the code I use to access the database and tables.
"test" is the connection
"rest" shows the list of 5,803 tables names.. one of which is called "PROPERTY"
"Table1" is simply pulling one of the tables named "PROPERTY".
I am looking to make "rest" only show the data tables that have data in them.
2) My ultimate goal, which leads to the second question, is to create a table that shows a list of every table from this database in column #1, and then columns 2, 3, 4, etc. would include every one of the column headers contained in each table. Any idea how to do that?
Thanks so much!
The code below builds a Tables data frame giving all of the tables in the database and how many rows are in each. As a condition, it requires that any table selected have at least one record, so this is probably the fastest way to get your list of non-empty tables. I pulled the row-count query from https://stackoverflow.com/a/14163881/1017276
My only reservation about that query is that it doesn't give the schema name, and it is possible to have tables with the same name in different schemas. So this is likely only going to work well within one schema at a time.
library(RODBCext)
Tables <-
sqlExecute(
channel = test,
query = "SELECT T.name TableName, I.rows Records
FROM sysobjects t, sysindexes i
WHERE T.xtype = ? AND I.id = T.id AND I.indid IN (0,1) AND I.rows > 0
ORDER BY TableName;",
data = list(xtype = "U"),
fetch = TRUE,
stringsAsFactors = FALSE
)
This next part takes the tables you found above and gets the column information from each of those tables. Lastly, it binds everything into one single data frame of column names.
Columns <-
lapply(Tables$TableName,
function(x) sqlColumns(test, x))
Columns <- do.call("rbind", Columns)
sqlColumns is a function in RODBC.
sqlExecute is a function in RODBCext that allows for parameterized queries. I tend to use that anytime I need to use quoted strings in a query.
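For the layout you describe in question 2 (table name in column 1, that table's column headers across the remaining columns), a rough base-R sketch building on the Columns data frame above; TABLE_NAME and COLUMN_NAME are fields that sqlColumns returns:
# split column names by table, pad shorter tables with NA so rows align
cols_by_table <- split(as.character(Columns$COLUMN_NAME), Columns$TABLE_NAME)
max_cols <- max(lengths(cols_by_table))
wide <- data.frame(
  TableName = names(cols_by_table),
  t(sapply(cols_by_table, function(x) c(x, rep(NA, max_cols - length(x))))),
  row.names = NULL
)
# the padded columns get default names (X1, X2, ...); rename them if you like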
I want to make a volatile table using teradata.
In the select statement I am using multiple columns from different tables.
However, some of the columns in the different tables have same names.
Therefore, I am getting a 'duplicate column name' error.
The question is - is there any workaround to bypass this error?
Is it possible to add for example table name to column name?
This is how my code looks:
CREATE MULTISET VOLATILE TABLE test
AS (
SEL *
FROM Table_A Left JOIN Table_B
...
)
WITH DATA
ON COMMIT PRESERVE ROWS
Instead of doing a select *, select individual column names and put aliases next to them. This will bypass the error.
A select all statement only works reliably when you're working off one table. If you're retrieving data from multiple joined tables, you have to spell out the columns in your select statement and alias any duplicated names so that no two result columns end up the same.
CREATE MULTISET VOLATILE TABLE test AS
(
  SELECT Table_A.column1 AS A_column1   -- list the columns explicitly,
       , Table_B.column1 AS B_column1   -- aliasing any duplicated names
       , ...
  FROM Table_A
  LEFT JOIN Table_B ON ...
  ...
)
WITH DATA PRIMARY INDEX(«PI»)
ON COMMIT PRESERVE ROWS
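As a concrete illustration (id, amount, and category are made-up column names): if both tables carry an id column, aliasing makes the result names unique:
CREATE MULTISET VOLATILE TABLE test AS
(
  SELECT a.id AS a_id     -- same source column name, distinct result names
       , b.id AS b_id
       , a.amount
       , b.category
  FROM Table_A a
  LEFT JOIN Table_B b ON a.id = b.id
)
WITH DATA PRIMARY INDEX(a_id)
ON COMMIT PRESERVE ROWS;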
I have a database test.db with the first column containing 123456;abcdef;ghijk etc. Is it possible to split the data into its own columns?
123456 never changes length.
abcdef and ghijk change length and also may contain nothing.
I have tried the statement below, but I get the ; appearing in either t2 or t3, depending on the length of abcdef and ghijk.
select substr(column,1,6) AS "t1",
substr(column,8,6) AS "t2",
substr(column,15,10) AS "t3" test.db
Is the ; separator causing the issue?
Or can I output the database to .sql, format the text, then import it into a new database?
There is no built-in SQLite function that can split strings like this.
If you are using the SQLite C API or a wrapper like APSW, you could create your own function (C, APSW).
If you want to do nothing more than a one-time conversion, export/import through a text file would be the simplest solution.
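For that route, a sketch of an interactive sqlite3 session (mytable and c1 are placeholder names for whatever the table and column in test.db are called):
$ sqlite3 test.db
sqlite> .mode list
sqlite> .output dump.txt
sqlite> select c1 from mytable;
sqlite> .output stdout
sqlite> create table split(t1, t2, t3);
sqlite> .separator ";"
sqlite> .import dump.txt split
Note that .import expects every line to contain exactly two ; characters here; lines with a different field count are rejected with a column-count error.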
You can split your rows into columns by:
create table t1 as
select substr(c1, 1, instr(c1, ';') - 1) as column1,
       -- middle part: everything between the first and second ';'
       substr(substr(c1, instr(c1, ';') + 1), 1,
              instr(substr(c1, instr(c1, ';') + 1), ';') - 1) as column2,
       -- last part: everything after the second ';'
       substr(c1, instr(c1, ';') + 1 + instr(substr(c1, instr(c1, ';') + 1), ';')) as column3
from table_test;
where c1 is the column you are selecting from.
Here's what I want to do:
Insert some data being passed into a procedure into a new table, but also include some data from another table too.
Example:
INSERT INTO my_new_table(name, age, whatever)
VALUES (names.name, ages.age, passed_in_data);
But that won't work.
I did check another question on here, but it was a bit much for my feeble brain.
Something like this should do it:
insert into my_new_table (
name,
age,
whatever
)
select
names.name,
ages.age,
passed_in_data
from
names inner join
ages on ....
where
....
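As a concrete sketch, with a hypothetical person_id key linking the two tables and passed_in_data as the procedure's parameter:
insert into my_new_table (name, age, whatever)
select
  names.name,
  ages.age,
  passed_in_data                               -- the value handed to the procedure
from
  names inner join
  ages on ages.person_id = names.person_id     -- hypothetical join key
where
  names.person_id = p_person_id;               -- hypothetical parameter picking the row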