Push/Export large dataframe from R to Vertica database


I have a dataframe of 10M rows which needs to be uploaded from R back to a Vertica database.
The dbWriteTable() function from DBI runs into memory issues, even after I tried increasing the JVM memory to 16 GB with
options(java.parameters = c("-XX:+UseConcMarkSweepGC", "-Xmx16g"))
The process still runs out of memory, so I am planning to use Vertica's bulk COPY option to load the table from a CSV file.
I have created an empty table in Vertica.
When I execute the query
dbSendQuery(vertica, "COPY hpcom_usr.VM_test FROM LOCAL \'/opt/mount1/musoumit/MarketBasketAnalysis/Code/test.csv\' enclosed by \'\"\' DELIMITER \',\' direct REJECTED DATA \'./code/temp/rejected.txt\' EXCEPTIONS \'./code/temp/exceptions.txt\'")
I get this error:
Error in .verify.JDBC.result(r, "Unable to retrieve JDBC result set", :
Unable to retrieve JDBC result set
JDBC ERROR: [Vertica]JDBC A ResultSet was expected but not generated from query "COPY hpcom_usr.VM_test FROM LOCAL '/opt/mount1/musoumit/MarketBasketAnalysis/Code/test.csv' enclosed by '"' DELIMITER ',' direct REJECTED DATA './code/temp/rejected.txt' EXCEPTIONS './code/temp/exceptions.txt'". Query not executed.
What am I doing wrong here?
Vertica's COPY also provides a STDIN option. How can I execute this?
My environment:
CentOS 7
R 3.6.3 (no RStudio here; I have to run this from the CLI)
Tidyverse 1.0.x
Vertica driver 9.x
System with 128 GB of memory and 28 cores

Your problem is that you call dbSendQuery(), which is meant to be followed by dbFetch() and a final dbClearResult(). That only works for SQL queries, that is, statements that actually return a result set.
Vertica's COPY <table> FROM [LOCAL] 'file.ext' ... command is treated like a DML command. For those, as this documentation says:
https://www.rdocumentation.org/packages/DBI/versions/0.5-1/topics/dbSendQuery
you need to use dbSendStatement(), which is intended for data manipulation statements.
Give it a try that way. Good luck.

dbSendUpdate(vertica, "COPY hpcom_usr.VM_test FROM LOCAL \'/opt/mount1/musoumit/MarketBasketAnalysis/Code/test.csv\' enclosed by \'\"\' DELIMITER \',\' direct REJECTED DATA \'./code/temp/rejected.txt\' EXCEPTIONS \'./code/temp/exceptions.txt\'")
Calling dbSendUpdate() (from RJDBC) as above, instead of dbSendQuery(), did the trick for me.

Related

To Capture the Error in Sybase IQ SQL Execution using dbisqlc inside ksh shell Script

I have the shell script below, which connects to Sybase IQ through dbisqlc.
I would like to detect whether any error occurs during script execution. I coded it as shown below, but even when the SQL execution fails, the return code is always 0.
dbisqlc -c "uid=UID;pwd=PWD;eng=DEV;links=tcpip(host=XXXX;port=XXX)" -q Query.sql OutputDumpFile
RC=$?
if [ "$RC" -ne 0 ]
then
    echo "ERROR: Error in DB Query Execution"
fi
How can I capture the SQL error in UNIX for Sybase IQ?

According to this documentation, dbisql is recommended over dbisqlc:
It is recommended that you use the Interactive SQL utility (accessed
by using the dbisql command or by choosing Start » Programs » SQL
Anywhere 11 » Interactive SQL) where possible because the dbisqlc
utility does not support all the features that Interactive SQL does
It goes on to say:
Only the dbisql utility has the ability to provide exit codes from a
SQL script.
So based on that, it seems to me that dbisqlc does not convey information via exit codes.
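If switching to dbisql is an option, the exit-status check from the question can stay almost as it is. Below is a minimal sketch, not a tested Sybase IQ setup. `run_checked` is a hypothetical helper name introduced here, and the dbisql flags in the commented usage line (`-nogui`, `-onerror exit`) are assumptions that should be verified against the documentation for your version.

```shell
#!/bin/sh
# run_checked (hypothetical helper): run the given command, report a
# non-zero exit status on stderr, and pass that status on to the caller.
run_checked() {
    "$@"
    rc=$?    # read $? right away; any later command would overwrite it
    if [ "$rc" -ne 0 ]; then
        echo "ERROR: command failed with exit status $rc" >&2
    fi
    return "$rc"
}

# Assumed usage with dbisql instead of dbisqlc (flags are assumptions,
# connection string copied from the question):
# run_checked dbisql -nogui -onerror exit \
#     -c "uid=UID;pwd=PWD;eng=DEV;links=tcpip(host=XXXX;port=XXX)" Query.sql
```

The helper only reports what the wrapped command returns, so it is useful only with a client that actually sets a non-zero exit status on SQL errors. According to the quoted documentation, dbisqlc does not.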

Why am I getting: database is locked, in an SQLite3 script?

I'm getting an error when running an SQLite script.
--drop use table before replacing it
DROP TABLE IF EXISTS db.use;
--Create the use table in the saved database
CREATE TABLE db.use AS SELECT * FROM use2; -- this is the line that generates the error: Error: near line 145: database is locked
Are these two statements run asynchronously or something? I don't understand what's causing the error, but I'm wondering if it has to do with that.
Might there be a way to run the script in a lock-step manner, i.e. non-asynchronously?

This is how I was running the command: sqlite3 --init script_name.sql dbname.db. Elsewhere in the script I had an ATTACH statement for the same database, dbname.db, so I was essentially opening the same file twice.
I solved this by executing the script from inside the sqlite3 shell instead:
sqlite3> .read script_name.sql

Have you tried adding a COMMIT statement after the DROP statement?
I think that would make sure the CREATE TABLE statement runs only after the DROP statement has fully completed.

Execute SQL with "like" statement in R Language

I am trying to execute a SQL query from R to get data from an Access DB.
A normal SQL statement works fine, but a statement with LIKE throws an error.
Below is the code:
library(RODBC);
channel = odbcDriverConnect("Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=C:/Users/ADMIN/Documents/R.accdb")
test = sqlQuery(channel ,paste('SELECT R.ID, R.Template, R.WEDate FROM R WHERE R.Template Like "*slow*"'))
Error:
[1] "07002 -3010 [Microsoft][ODBC Microsoft Access Driver] Too few parameters. Expected 2."
[2] "[RODBC] ERROR: Could not SQLExecDirect 'SELECT R.ID, R.Template, R.WEDate FROM R WHERE (R.Template Like \"slow\")'
Is there a way to fix this?

Consider both of @joran's suggestions: enclose string literals in single quotes AND use the ANSI-92 wildcard operator %. The asterisk, * (ANSI-89 mode), is for queries run internally, namely inside the MSAccess.exe GUI program (which defaults to DAO), or when you connect to Access externally with DAO. ADO connections, meanwhile, use the percent symbol, as do most external interfaces, including RODBC.
I was able to reproduce your issue, and both of these remedies worked. Also, there is no need for paste(), since you are not concatenating any other object into the query string.
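library(RODBC)
# Keep the connection string on one line so that no newline ends up inside it
channel = odbcDriverConnect("Driver={Microsoft Access Driver (*.mdb, *.accdb)};DBQ=C:/Users/ADMIN/Documents/R.accdb")
# Single-quoted literal with the ANSI-92 % wildcard
test = sqlQuery(channel,
                "SELECT R.ID, R.Template, R.WEDate FROM R WHERE R.Template Like '%slow%'")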

fast export unexplained failure

I have roughly 14 million records that I am attempting to export from a Teradata table to file using a fast export connection object.
There is no size limit for fast export files on our Linux system, and there is 1.2 TB of available space in the target directory.
The session fails, and gives the following errors:
READER_2_1_1 FEXP_87011 Process [16022] exited with status [12]
SDKS_38200 Partition-level [SOURCE_TABLE_NAME]: Plug-in #305400 failed in deinit()
I googled the error message and found a post about it.
I followed the recommendations in that post: delete the .out file in the temp directory, delete the partially filled files in the target directory, drop the error table, and delete the log file. This did not fix the issue, and the session still fails with the same error messages.

Try using the TPT Export plug-in instead. You can also try running this FastExport directly in your Unix environment using BTEQ scripts.

Liquibase/SQLite syntax error

So I have a sqlite database. My game (C#) is able to open it and run it quite nicely. However, for maintainability and potential conflict reasons, I would like a more manageable format. Our project manager suggested Liquibase. I'm not sure if it's just out of date or what, but it will NOT convert for me, and I'm not sure why. Here's what we got:
running on windows 8.1
liquibase version 3.3.0 (http://www.liquibase.org/download/index.html)
sqlite database made using sqlite manager demo version 4.3.5 (http://www.sqlabs.com/sqlitemanager.php)
java version 8 update 25
sqlite-jdbc-3.8.7 (https://bitbucket.org/xerial/sqlite-jdbc)
The command I enter is...
liquibase --classpath=sqlite-jdbc-3.8.7.jar --driver=org.sqlite.JDBC\
--url="jdbc:sqlite:GameData.sqlite" --changeLogFile=gamedb-canges.xml\
--username=username --password=password generateChangeLog
and the error I get is...
Unexpected error running Liquibase: liquibase.exception.DatabaseException:
liquibase.exception.DatabaseException: java.sql.SQLException: [SQLITE_ERROR]
SQL error or missing database (near ")": syntax error)

I'd try removing the quotes around the URL. You can also include --logLevel=debug on the command line to get better diagnostics.

"SQL error or missing database" is an error message that comes from the SQLite db.
Try using the full path to the SQLite db, like:
--url="jdbc:sqlite:C:\myPath\GameData.sqlite"
