Parse string variables from Unix Shell into HiveQL - unix

I have a shell script from which I pass both numeric and string variables into a Hive query. An example is below:
Shell Script
hive --hiveconf time_1=34600 --hiveconf time_2=34588 --hiveconf message="hello_world" -f mytask.hql
I also have the Hive query in the corresponding file 'mytask.hql':
HiveQL file
SELECT col1, col2, ${hiveconf:message} AS myMessage
FROM table1
WHERE trtime between ${hiveconf:time_1} and ${hiveconf:time_2};
The problem is that I want a column that contains the message "hello_world" (or whatever the external variable from the Unix shell script contains) on every row, but I get the following error:
[Error 10004]: Line xxx Invalid table alias or column reference 'hello_world': (possible column names are: col1, col2 ... (Etc.)
The output that I want is the rows of col1 and col2 plus a myMessage column holding the string passed in from the shell script on every row.

String constants in SQL should be quoted using single quotes: '${hiveconf:message}':
SELECT col1, col2, '${hiveconf:message}' AS myMessage
FROM table1
WHERE trtime between ${hiveconf:time_1} and ${hiveconf:time_2};
Without quotes, ${hiveconf:message} resolves to hello_world, which looks like a column reference rather than a string constant; that is why you got the exception.
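For reference, after hiveconf substitution with the shell invocation above, the corrected query that Hive actually runs looks like this:
SELECT col1, col2, 'hello_world' AS myMessage
FROM table1
WHERE trtime between 34600 and 34588;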

Related

how to change dbplyr's show_query() quote convention for snowflake SQL

I am using the dbplyr package to translate my dplyr query to SQL and it works really well; however, when I copy and paste my translated SQL statement it won't run in Snowflake, because dbplyr quotes the columns with ` (the key above Tab), whereas my Snowflake SQL will only run if its columns are quoted with either " (double quote), ' (single quote), or no quote at all (if there are no breaks in the name).
Is there a way to change an argument of dbplyr::show_query() so that the output uses double or single quotes instead of backticks? There is a con argument, which I've set to simulate_snowflake(), but that doesn't change anything.
The error I get is: SQL compilation error: error line 2 at position 0 invalid identifier '"COL_NAME"'
#This will not work in my snowflake SQL
SELECT
`COL_NAME`
FROM
TABLENAME
#This will work though:
SELECT
"COL_NAME"
FROM
TABLENAME
One possibility would be to use sql_render(), convert to a character string, and use any regex replace process that you like to change the quotes. For example:
# needs dbplyr (sql_render) and stringr (str_replace_all) loaded
<pipeline> %>%
  sql_render() %>%
  as.character() %>%
  str_replace_all(pattern = "`", replacement = "\\\"")

sqlite3 is not using end-of-line character when importing data

I have a tab-delimited file which I'm attempting to load into a table. The table has already been created and structured appropriately, the challenge is that SQLite3 is combining the last value on one row with the first value on the next row.
So for a file where the last value on one line is SomeText and the next line begins with 12345, the imported value is SomeText12345.
Right now I'm using the following command:
.separator "\t";
.import MyFile.tsv MyTable
Any ideas how I can get the data to load while recognizing the end-of-line?
I noticed the same problem. I've always suspected it had to do with the last value in a tab-separated file being a TEXT type. A little stack-sniffing turned up this post wherein the second answer says:
There is actually a dedicated mode for importing tab separated files:
sqlite> .mode tabs
sqlite> .import MyFile.tsv MyTable
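After the import, a quick sanity check from the same sqlite3 session (a sketch, using the table name from the question):
sqlite> SELECT count(*) FROM MyTable;
sqlite> SELECT * FROM MyTable LIMIT 5;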

Adding value to existing database table in RSQLite

I am new to RSQLite.
I have an input document in text format in which values are separated by '|'.
I created a table with the required variables (dummy code as follows)
library(RSQLite)
db <- dbConnect(SQLite(), dbname = "test.sqlite")
dbSendQuery(conn = db,
  "CREATE TABLE TABLE1(
    MARKS INTEGER,
    ROLLNUM INTEGER,
    NAME CHAR(25),
    DATED DATE)"
)
However I am stuck at how to import values into the created table.
I cannot use INSERT INTO Values command as there are thousands of rows and more than 20+ columns in the original data file and it is impossible to manually type in each data point.
Can someone suggest an alternative efficient way to do so?
You are using a scripting language. The whole point of scripting is literally to avoid manually typing each data point. Sorry.
You have two routes:
1: You have correctly opened a database connection and created an empty table in your SQLite database. Nice!
To load data into the table, load your text file into R using e.g. df <- read.table('textfile.txt', sep='|') (modify arguments to fit your text file).
To have a 'dynamic' INSERT statement, you can use placeholders. RSQLite allows both named and positional placeholders. To insert a single row, you can do:
dbSendQuery(db, 'INSERT INTO table1 (MARKS, ROLLNUM, NAME) VALUES (?, ?, ?);', list(1, 16, 'Big fellow'))
You see? The first ? got value 1, the second ? got value 16, and the last ? got the string Big fellow. Also note that you do not enclose placeholders for text in quotation marks (' or ")!
Now, you have thousands of rows. Or just more than one. Either way, you can send in your data frame. dbSendQuery has some requirements: 1) each vector must have the same number of entries (not an issue when providing a data.frame), and 2) you may only submit the same number of vectors as you have placeholders.
I assume your data frame df contains the columns mark, roll, and name, corresponding to the table's columns. Then you may run:
dbSendQuery(db, 'INSERT INTO table1 (MARKS, ROLLNUM, NAME) VALUES (:mark, :roll, :name);', df)
This will execute an INSERT statement for each row in df!
TIP! Because an INSERT statement is executed for each row, inserting thousands of rows can take a long time, because after each insert the data is written to file and indices are updated. Instead, enclose the inserts in a transaction:
dbBegin(db)
res <- dbSendQuery(db, 'INSERT ...;', df)
dbClearResult(res)
dbCommit(db)
and SQLite will write the data to a journal file and only commit the result when you execute dbCommit(db). Try both methods and compare the speed!
2: Ah, yes. The second way. This can be done in SQLite entirely.
With the SQLite command utility (sqlite3 from your command line, not R), you can attach a text file as a table and simply run an INSERT INTO ... SELECT ...; command. Alternatively, read the text file in sqlite3 into a temporary table and run an INSERT INTO ... SELECT ...;, as sketched below.
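A minimal sketch of that route from the sqlite3 shell, assuming the pipe-delimited file textfile.txt has a header row and the TABLE1 created above already exists (file and staging-table names are illustrative):
sqlite> .separator |
sqlite> .import textfile.txt temp_import
sqlite> INSERT INTO TABLE1 (MARKS, ROLLNUM, NAME, DATED) SELECT * FROM temp_import;
sqlite> DROP TABLE temp_import;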
Useful site to remember: http://www.sqlite.com/lang.html
A little late to the party, but DBI provides dbAppendTable() which will write the contents of a dataframe to an SQL table. Column names in the dataframe must match the field names in the database. For your example, the following code would insert the contents of my random dataframe into your newly created table.
library(DBI)
db <- dbConnect(RSQLite::SQLite(), dbname = ":memory:")
dbExecute(db,
"CREATE TABLE TABLE1(
MARKS INTEGER,
ROLLNUM INTEGER,
NAME TEXT
)"
)
df <- data.frame(MARKS = sample(1:100, 10),
ROLLNUM = sample(1:100, 10),
NAME = stringi::stri_rand_strings(10, 10))
dbAppendTable(db, "TABLE1", df)
I don't think there is a nice way to do a large number of inserts directly from R. SQLite does have a bulk insert functionality, but the RSQLite package does not appear to expose it.
From the command line you may try the following:
.separator |
.import your_file.csv your_table
where your_file.csv is the CSV (or pipe delimited) file containing your data and your_table is the destination table.
See the documentation under CSV Import for more information.

SQLite: insert binary data from command line

I have this SQLite table:
create table mytable (
aid INTEGER NOT NULL PRIMARY KEY,
bid INTEGER NOT NULL,
image BLOB
);
And I want to insert a binary file into the image field in this table. Is it possible to do it from the sqlite3 command line interface? If so, how? I'm using Ubuntu.
Thank you!
The sqlite3 command line interface adds the following two “application-defined” functions:
readfile
which typically is used as: INSERT INTO table(blob) VALUES (readfile('myimage.jpg'))
writefile
which writes a file with the contents of a database blob and returns the count of bytes written.
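Applied to the table from the question, a sketch of both helpers might look like this (file names are illustrative):
INSERT INTO mytable (aid, bid, image) VALUES (1, 1, readfile('myimage.jpg'));
SELECT writefile('copy.jpg', image) FROM mytable WHERE aid = 1;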
You may use a syntax like :
echo "insert into mytable values(1,1, \"`cat image`\")" | sqlite3 yourDb
I'm not sure about the " around the blob's value. Note the backquotes around the cat command; they mean the cat command is executed before the echo.
[EDIT]
Blobs are written as hex digits with an "X" prefix. You can use the "hexdump" Unix command to produce the hex string, but it would be better to write a small command-line tool that reads the image and does the insert.
More details on this post : http://comments.gmane.org/gmane.comp.db.sqlite.general/64149
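A minimal shell sketch of that hex-literal approach (using xxd here instead of hexdump, since it emits a plain hex stream; file and database names are illustrative):
# hex-encode the file as a single line, then embed it as an X'...' blob literal
HEX=$(xxd -p myimage.jpg | tr -d '\n')
echo "INSERT INTO mytable (aid, bid, image) VALUES (1, 1, X'$HEX');" | sqlite3 yourDb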

sqlite and batch file

I am using a sqlite database.
I want to execute sqlite from the DOS command line and insert fields into the tables by passing them as parameters from the command line.
C:\run.bat name surname
I want to create the run.bat file ...
sqlite > INSERT into tab(col1, col2) Values (%1,%2);
I guess you're looking for something like this:
sqlite3 DATABASENAME "insert into TABLE values('%1', '%2')"
You'll need the command line program sqlite3.exe. Then you can put into your batch file:
sqlite3 dbname "INSERT INTO TableName (ColName1, ColName2) VALUES (%1, %2);"
Also note that you will have to be aware of the datatypes of the columns and, if the values for those datatypes require quoting, you'll need to add the quotes into the command line (and handle embedded quotes in the values, if that's a possible occurrence).
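Putting that together, a minimal run.bat sketch might look like this (database, table, and column names are illustrative; the single quotes are needed if the columns are TEXT):
@echo off
rem usage: run.bat name surname
sqlite3 mydb.sqlite "INSERT INTO tab (col1, col2) VALUES ('%1', '%2');"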
