Hive Query with UDF - encryption

I have some encrypted data in an HDFS csv, that I've created a Hive table for, and I want to run a Hive query that first encrypts the query param, then does the lookup. I have a UDF that does encryption as follows:
public class ParamEncrypt extends UDF {
public Text evaluate(String name) throws Exception {
String result = new String();
if (name == null) { return null; }
result = ParamData.encrypt(name);
return new Text(result);
}
}
Then I run the Hive query as:
select * from cc_details where first_name = encrypt('Ann');
The problem is, it's running encrypt('Ann') across every single record in the table. I want it to do the encryption once, then do the matchup. I've tried:
select * from cc_details where first_name in (select encrypt('Ann') from cc_details limit 1);
But Hive doesn't support IN or select queries in the where clause.
What can I do?
Can I do something like:
select encrypt('Ann') as ann from cc_details where first_name = ann;
That also doesn't work because the query parser throws an error saying ann is not a known column

Finally got it with a right outer join as
select * from cc_details ssn_tbl
right outer join ( select encrypt('850-37-8230','ssn') as ssn
from cc_details limit 1) ssn_tmp
on (ssn_tbl.ssn = ssn_tmp.ssn);

I think what you are looking for is the annotation @UDFType(deterministic = true) on your UDF. It's definitely available on generic UDFs; you can check whether it's available for a regular UDF like the one you have created. If not, just convert your UDF to a GenericUDF. You can read about it in this blog post that I wrote a while back.

Another way to do it (and actually the way I ended up going with), is by caching the result of the encryption. It's actually faster this way, because with the join, you get a separate set of map-reduce jobs, which slows down the overall execution time.
it's like this:
private static String result = null;
public Text evaluate(String data) {
if (result == null) {
result = Data.encrypt(data);
}
return new Text(result);
}


How to bind select query as datasource in report?

I have a select statement that I want to bind as data source in a report.
I have not found a way to design an appropriate AOT query.
This is how it looks like in X++
public void insertData(date data = today())
{
BHNEmployeesOnDay ins;
EmplTable tbl;
CompanyInfo info;
BHNEmplAgreements Agreemnt;
BHNEmplAgreements Agreemnt2;
BHNEMPLHISTORYCOMPANY history;
BHNEMPLHISTORYCOMPANY history_test;
BHNDIVISIONTABLE division;
BHNPOSITIONTABLE position;
SysCompanyUserInfo sys1;
SysUserInfo sys2;
UserInfo usrInfo;
Date infinity = mkdate(1,1,1900);
;
delete_from ins;
while select * from tbl
join Info where info.dataAreaId == tbl.dataAreaId && info.BLX_companyForDW == 1
join sys1 where sys1.EmplId==tbl.EmplId && sys1.dataAreaId == tbl.dataAreaId
join sys2 where sys1.UserId==sys2.Id
join usrInfo where usrInfo.id==sys1.UserId
exists join history_test
where history_test.EmplId==tbl.EmplId && history_test.dataAreaId==tbl.dataAreaId
join Agreemnt where Agreemnt.HistoryId==history_test.HistoryId
&& (agreemnt.DateTo >= data || agreemnt.DateTo==infinity)
{
select firstonly *
from history order by history.DateFrom desc, Agreemnt2.DateFrom desc
where history.EmplId==tbl.EmplId && history.dataAreaId==tbl.dataAreaId
join Agreemnt2 where Agreemnt2.HistoryId==history.HistoryId
&& Agreemnt2.DateFrom<=data && (Agreemnt2.DateTo >= data || Agreemnt2.DateTo==infinity)
join division where division.DivisionId==agreemnt.DivisionId
join position where position.PositionId==agreemnt.PositionId;
ins.adddRecord(tbl.EmplId, tbl.Name_BHN, tbl.BirthDate, division.Name, position.FullName);
}
}
Currently I generate data into a table [during run() method of the report], then simply select from that table. So far only 1 person uses this report so it's not a problem, but if two people run the same report simultaneously, I'm gonna get dirty reads.
I know it's a bad approach, but I'm out of ideas. I thought of making a View on the T-SQL side and trying to select from it, but I was told that it might not be detected, or simply not transferred to other instances of our AX during export, so it has to be done on the AX side.
How can I solve this?
Just in case, this is the query in T-SQL: SQL query on pastebin
You could override the report's fetch method and just use your X++ code as is to get the records and then use the report's send method to process them.
See here for an example.
The example uses a query object but you could easily swap that with your own X++ code - you just eventually have to call send for the records you want to be processed by the report.
Update:
For example you could just fetch any record of SalesTable and call send.
In this example a member variable salesTable is assumed so that you can access the current record in a display method in case you need it.
public boolean fetch()
{
boolean ret;
//ret = super();
;
select firstOnly salesTable;
this.send(salesTable);
return true;
}

Case sensitive and insensitive like in SQLite

In SQLite it is possible to change the case sensitive behaviour of 'LIKE' by using the commands:
PRAGMA case_sensitive_like=ON;
PRAGMA case_sensitive_like=OFF;
However in my situation I would like to execute a query, part of which is case sensitive and part of which isn't. For example:
SELECT * FROM mytable
WHERE caseSensitiveField like 'test%'
AND caseInsensitiveField like 'g2%'
Is this possible?
You can use the UPPER function on your case-insensitive field and then upper-case the pattern in your LIKE clause, e.g.
SELECT * FROM mytable
WHERE caseSensitiveField like 'test%'
AND UPPER(caseInsensitiveField) like 'G2%'
Use plain comparisons, which are case sensitive by default (unless you have declared the column COLLATE NOCASE):
SELECT *
FROM mytable
WHERE caseSensitiveField >= 'test'
AND caseSensitiveField < 'tesu'
AND caseInsensitiveField LIKE 'g2%'
This works only if the original LIKE is searching for a prefix, but allows using an index.
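To build the upper bound ('tesu' for 'test') programmatically, one option is to bump the last character of the prefix. A rough sketch in Python with the built-in sqlite3 module; prefix_upper_bound is a name made up for this example, and it assumes a non-empty ASCII prefix.

```python
import sqlite3


def prefix_upper_bound(prefix: str) -> str:
    """Return an exclusive upper bound for all strings starting with prefix.

    Assumes a non-empty ASCII prefix (illustrative helper, not a library call).
    """
    return prefix[:-1] + chr(ord(prefix[-1]) + 1)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (caseSensitiveField TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [("test1",), ("Test2",), ("tesu",), ("testing",)],
)

# Plain comparisons use the column's collation (BINARY here), so they are case sensitive.
rows = conn.execute(
    "SELECT caseSensitiveField FROM mytable "
    "WHERE caseSensitiveField >= ? AND caseSensitiveField < ? "
    "ORDER BY caseSensitiveField",
    ("test", prefix_upper_bound("test")),
).fetchall()
```

Here rows contains only 'test1' and 'testing'; 'Test2' and 'tesu' fall outside the range.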
In SQLite you can use GLOB instead of LIKE for pattern search. For example:
SELECT * FROM mytable
WHERE caseSensitiveField GLOB 'test*'
AND caseInsensitiveField LIKE 'g2%'
With this approach you don't have to worry about PRAGMA.
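A quick way to see the difference is to run both operators against a few rows. This sketch uses Python's built-in sqlite3 module with made-up sample data; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (caseSensitiveField TEXT, caseInsensitiveField TEXT)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("test1", "g2a"), ("test2", "G2b"), ("Test3", "g2c"), ("TEST4", "G2d")],
)

# GLOB is always case sensitive; LIKE is case insensitive for ASCII by default.
rows = conn.execute(
    "SELECT caseSensitiveField, caseInsensitiveField FROM mytable "
    "WHERE caseSensitiveField GLOB 'test*' "
    "AND caseInsensitiveField LIKE 'g2%' "
    "ORDER BY caseSensitiveField"
).fetchall()
```

Only the rows whose first column starts with lower-case 'test' survive, while 'g2a' and 'G2b' both match the LIKE.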
I know this is an old question, but if you are coding in Java and have this problem this might be helpful. You can register a function that handles the like checking. I got the tip from this post: https://stackoverflow.com/a/29831950/1271573
The solution is dependent on sqlite jdbc: https://mvnrepository.com/artifact/org.xerial/sqlite-jdbc
In my case I only needed to see if a certain string existed as part of another string (like '%mystring%'), so I created a Contains function, but it should be possible to extend this to do a more sql-like check using regex or something.
To use the function in SQL to see if MyCol contains "searchstring" you would do:
select * from mytable where Contains(MyCol, 'searchstring')
Here is my Contains function:
public class Contains extends Function {
@Override
protected void xFunc() throws SQLException {
if (args() != 2) {
throw new SQLException("Contains(t1,t2): Invalid argument count. Requires 2, but found " + args());
}
String testValue = value_text(0).toLowerCase();
String isLike = value_text(1).toLowerCase();
if (testValue.contains(isLike)) {
result(1);
} else {
result(0);
}
}
}
To use this function you must first register it. When you are done with using it you can optionally destroy it. Here is how:
public static void registerContainsFunc(Connection con) throws SQLException {
Function.create(con, Contains.class.getSimpleName(), new Contains());
}
public static void destroyContainsFunc(Connection con) throws SQLException {
Function.destroy(con, Contains.class.getSimpleName());
}
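The same trick is available in other bindings. As a rough equivalent, here is how a Contains function could be registered with Python's built-in sqlite3 module through Connection.create_function; the sample table and rows are made up for illustration.

```python
import sqlite3


def contains(haystack, needle):
    """Case-insensitive substring test; returns 1 or 0 for use in SQL."""
    if haystack is None or needle is None:
        return 0
    return 1 if needle.lower() in haystack.lower() else 0


conn = sqlite3.connect(":memory:")
# Register the Python callable as a two-argument SQL function.
conn.create_function("Contains", 2, contains)

conn.execute("CREATE TABLE mytable (MyCol TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [("Has SearchString inside",), ("nothing here",)],
)

rows = conn.execute(
    "SELECT MyCol FROM mytable WHERE Contains(MyCol, 'searchstring')"
).fetchall()
```

The function only lives as long as the connection, so there is no separate destroy step in this variant.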
I used a regular expression to do what I needed. I wanted to identify all the occurrences of the word "In" that was not all lower case.
select [COL] from [TABLE] where [COL] REGEXP '\bIn\b';
Example:
with x as (select 'in' Diff_Ins union select 'In' Diff_Ins)
select Diff_Ins from x where Diff_Ins REGEXP '\bIn\b';
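Note that SQLite only provides the REGEXP syntax; unless the application (or an extension) supplies a regexp function, the query fails. A minimal sketch of supplying one from Python's built-in sqlite3 module, assuming Python's re syntax is acceptable for the patterns:

```python
import re
import sqlite3


def regexp(pattern, value):
    # SQLite evaluates "X REGEXP Y" as regexp(Y, X): the pattern comes first.
    if value is None:
        return 0
    return 1 if re.search(pattern, value) else 0


conn = sqlite3.connect(":memory:")
conn.create_function("regexp", 2, regexp)

# Same test data as the example above; the pattern is bound as a parameter.
rows = conn.execute(
    "WITH x AS (SELECT 'in' Diff_Ins UNION SELECT 'In' Diff_Ins) "
    "SELECT Diff_Ins FROM x WHERE Diff_Ins REGEXP ?",
    (r"\bIn\b",),
).fetchall()
```

Because re.search is case sensitive by default, only the capitalised 'In' row is returned.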
As others mention, SQLite also offers the GLOB function which is case-sensitive.
Assume g2* is text entered by the user at the application-level. To simplify application-side grammar and make GLOB case-insensitive, the text needs to be normalised to a common case:
SELECT * FROM mytable WHERE LOWER(caseInsensitiveField) GLOB LOWER('g2*');
If UNICODE is required, carefully test LOWER and UPPER to confirm they operate as expected. GLOB is an extension function specific to SQLite. Building a general grammar engine supporting multiple database vendors is non-trivial.

Linq. Anonymous type error when joining to multiple tables

I'm trying to return an IQueryable based on my model.
But I need to join to the same lookup table twice. Then return the query variable to the gridview.
public IQueryable<Benchmark> GetBenchMarks([QueryString("hydrant")] string hydrant,
[QueryString("revdate")] string revdate, [QueryString("street")] string street,
[QueryString("quadrant")] string quadrant, [QueryString("desc")] string desc) {
IQueryable<Benchmark> query = from p in _db.Benchmarks
join s in _db.Streets on p.Street1Number equals s.Id
join s2 in _db.Streets on p.Street2Number equals s2.Id
select new {
Street1Name = s.StreetName,
p.OrderNumber,
p.HydrantNumber,
Street2Name = s2.StreetName,
p.RevisionDate,
p.Quadrant,
p.Description,
p.Street1Number
};
}
So there is a red squiggle line on the 2nd join to s2. And the following error.
Error 5 Cannot implicitly convert type
'System.Linq.IQueryable<AnonymousType#1>' to
'System.Linq.IQueryable<Benchmarks.Model.Benchmark>'. An explicit
conversion exists (are you missing a
cast?) C:\Projects\Benchmarks\Benchmarks\Benchmarks_Home.aspx.cs 63 25 Benchmarks
Since you end your query with select new {...}, you are creating an anonymous object for each result. Instead, use select p, and each result will be a Benchmark.
However, it looks like returning a Benchmark is not what you want. In this case, you would want to change query to be of type IQueryable or IQueryable<dynamic> (and probably change the return type of the GetBenchMarks function as well, unless it does return IQueryable<Benchmark>!).
A second (potentially better) alternative would be to create a class to represent this anonymous type, and use that.
The result of your query is IEnumerable of anonymous objects, thus it cannot be converted to Benchmark.
If you want to set some additional properties (Street1Name - that are evidently not mapped on DB) from joined relations you can do:
// var, because the query yields anonymous objects rather than Benchmark
var query = from p in _db.Benchmarks
            join s in _db.Streets on p.Street1Number equals s.Id
            join s2 in _db.Streets on p.Street2Number equals s2.Id
            select new {
                ....
            };
var ex = query.ToList();
var result = new List<Benchmark>();
foreach (var bn in ex) {
    result.Add(new Benchmark{ OrderNumber = bn.OrderNumber .... });
}
// return result.AsQueryable();
// but there is little point in returning it as a queryable now, because the query has already been executed, so I would simply return the list
return result;
Another option is to make new class representing the object from the query and return it from the method like:
... select new LoadedBenchmark { Street1Name = s.StreetName ....}

Massive Query with inner join not returning any data

I'm using the Massive Query method to write a simple join query against an Oracle database. This is my code with the query simplified even further by taking out some columns:
dynamic logTable = new DynamicModel("mydatabase", "table1");
var sb = new StringBuilder();
sb.Append("select CONTACT_ID from table1 inner join table2 on table1.ID = table2.ID ");
sb.Append("where table1.ID=:0");
dynamic dbResult = logTable.Query(sb.ToString(), id);
The following code gives me an error: 'object' does not contain a definition for 'CONTACT_ID'
string id = dbResult.CONTACT_ID.ToString();
If I take the exact query and run it through sqldeveloper, I get back the expected results. If I try to Query through Massive without a join, I get back an object I can work with.
Any ideas?
My mistake! I was expecting my query to return only one record, but forgot that Query returns IEnumerable. Solution is to take First() or loop over the results.

How do I check in SQLite whether a table exists?

How do I, reliably, check in SQLite, whether a particular user table exists?
I am not asking for unreliable ways like checking if a "select *" on the table returned an error or not (is this even a good idea?).
The reason is like this:
In my program, I need to create and then populate some tables if they do not exist already.
If they do already exist, I need to update some tables.
Should I take some other path instead to signal that the tables in question have already been created - say for example, by creating/putting/setting a certain flag in my program initialization/settings file on disk or something?
Or does my approach make sense?
I missed that FAQ entry.
Anyway, for future reference, the complete query is:
SELECT name FROM sqlite_master WHERE type='table' AND name='{table_name}';
Where {table_name} is the name of the table to check.
Documentation section for reference: Database File Format. 2.6. Storage Of The SQL Database Schema
This will return a list of tables with the name specified; that is, the cursor will have a count of 0 (does not exist) or a count of 1 (does exist)
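Wrapped in a helper, the check might look like the following sketch, which uses Python's built-in sqlite3 module. The function name table_exists and the sample table are made up for the example; the table name is bound as a parameter rather than pasted into the SQL string.

```python
import sqlite3


def table_exists(conn: sqlite3.Connection, table_name: str) -> bool:
    """Return True if a table with exactly this name is listed in sqlite_master."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table_name,),
    ).fetchone()
    return row is not None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")

users_found = table_exists(conn, "users")
orders_found = table_exists(conn, "orders")
```

With this setup users_found is True and orders_found is False.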
If you're using SQLite version 3.3+ you can easily create a table with:
create table if not exists TableName (col1 typ1, ..., colN typN)
In the same way, you can remove a table only if it exists by using:
drop table if exists TableName
A variation would be to use SELECT COUNT(*) instead of SELECT NAME, i.e.
SELECT count(*) FROM sqlite_master WHERE type='table' AND name='table_name';
This will return 0, if the table doesn't exist, 1 if it does. This is probably useful in your programming since a numerical result is quicker / easier to process. The following illustrates how you would do this in Android using SQLiteDatabase, Cursor, rawQuery with parameters.
boolean tableExists(SQLiteDatabase db, String tableName)
{
if (tableName == null || db == null || !db.isOpen())
{
return false;
}
Cursor cursor = db.rawQuery(
"SELECT COUNT(*) FROM sqlite_master WHERE type = ? AND name = ?",
new String[] {"table", tableName}
);
if (!cursor.moveToFirst())
{
cursor.close();
return false;
}
int count = cursor.getInt(0);
cursor.close();
return count > 0;
}
You could try:
SELECT name FROM sqlite_master WHERE name='table_name'
See (7) How do I list all tables/indices contained in an SQLite database in the SQLite FAQ:
SELECT name FROM sqlite_master
WHERE type='table'
ORDER BY name;
Use:
PRAGMA table_info(your_table_name)
If the resulting table is empty then your_table_name doesn't exist.
Documentation:
PRAGMA schema.table_info(table-name);
This pragma returns one row for each column in the named table. Columns in the result set include the column name, data type, whether or not the column can be NULL, and the default value for the column. The "pk" column in the result set is zero for columns that are not part of the primary key, and is the index of the column in the primary key for columns that are part of the primary key.
The table named in the table_info pragma can also be a view.
Example output:
cid|name|type|notnull|dflt_value|pk
0|id|INTEGER|0||1
1|json|JSON|0||0
2|name|TEXT|0||0
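From application code, the empty-result check could be sketched like this with Python's built-in sqlite3 module. PRAGMA statements do not accept bound parameters, so the helper (columns_of, a name made up here) quotes the table name as an identifier instead.

```python
import sqlite3


def columns_of(conn: sqlite3.Connection, table_name: str) -> list:
    """Return the column names of table_name, or [] if no such table or view exists."""
    # Double any embedded double quotes, then wrap the name as an identifier.
    quoted = '"' + table_name.replace('"', '""') + '"'
    # Each result row is (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute(f"PRAGMA table_info({quoted})")]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, json JSON, name TEXT)")

existing = columns_of(conn, "docs")
missing = columns_of(conn, "no_such_table")
```

An empty list for missing is the "does not exist" signal; existing holds the three column names.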
SQLite table names are case insensitive, but comparison is case sensitive by default. To make this work properly in all cases you need to add COLLATE NOCASE.
SELECT name FROM sqlite_master WHERE type='table' AND name='table_name' COLLATE NOCASE
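To illustrate why the collation matters, here is a small sketch with Python's built-in sqlite3 module: the table is created as Country but looked up as country. The table name is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Country (id INTEGER)")

# Default (BINARY) comparison: the stored name 'Country' does not equal 'country'.
exact = conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' AND name = ?",
    ("country",),
).fetchall()

# With COLLATE NOCASE the lookup matches regardless of ASCII case.
nocase = conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' AND name = ? COLLATE NOCASE",
    ("country",),
).fetchall()
```

The first query comes back empty even though SELECT * FROM country would work; the second finds the table.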
If you are getting a "table already exists" error, make changes in the SQL string as below:
CREATE table IF NOT EXISTS table_name (para1,para2);
This way you can avoid the exceptions.
If you're using fmdb, I think you can just import FMDatabaseAdditions and use the bool function:
[yourfmdbDatabase tableExists:tableName].
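The following query returns a row containing 1 if the table exists, and no rows at all if it does not. Use single quotes for the table name: in double quotes, "name" is read as the name column of sqlite_master rather than as a string.
SELECT CASE WHEN tbl_name = 'table_name' THEN 1 ELSE 0 END FROM sqlite_master WHERE tbl_name = 'table_name' AND type = 'table'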
Note that to check whether a table exists in the TEMP database, you must use sqlite_temp_master instead of sqlite_master:
SELECT name FROM sqlite_temp_master WHERE type='table' AND name='table_name';
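A short sketch with Python's built-in sqlite3 module shows the split between the two catalogs; the table name scratch is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TEMP TABLE scratch (x)")


def count_in(catalog: str) -> int:
    # catalog is one of two fixed names used below, never user input.
    return conn.execute(
        f"SELECT count(*) FROM {catalog} WHERE type='table' AND name=?",
        ("scratch",),
    ).fetchone()[0]


in_main = count_in("sqlite_master")
in_temp = count_in("sqlite_temp_master")
```

The temporary table shows up only in sqlite_temp_master, so in_main is 0 and in_temp is 1.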
Here's the function that I used:
Given an SQLDatabase Object = db
public boolean exists(String table) {
try {
db.query("SELECT * FROM " + table);
return true;
} catch (SQLException e) {
return false;
}
}
Use this code:
SELECT name FROM sqlite_master WHERE type='table' AND name='yourTableName';
If the returned array count is equal to 1 it means the table exists. Otherwise it does not exist.
import sqlite3

class CPhoenixDatabase():
    def __init__(self, dbname):
        self.dbname = dbname
        self.conn = sqlite3.connect(dbname)

    def is_table(self, table_name):
        """Return True if a table named table_name exists."""
        # Bind the name as a parameter; plain concatenation with '{' and '}'
        # would search for a table literally named {table_name}.
        query = "SELECT name FROM sqlite_master WHERE type='table' AND name=?;"
        cursor = self.conn.execute(query, (table_name,))
        result = cursor.fetchone()
        return result is not None
Note: This is working now on my Mac with Python 3.7.1
You can write the following query to check the table's existence.
SELECT name FROM sqlite_master WHERE name='table_name'
Here 'table_name' is the name of the table you created. For example
CREATE TABLE IF NOT EXISTS country(country_id INTEGER PRIMARY KEY AUTOINCREMENT, country_code TEXT, country_name TEXT)
and check
SELECT name FROM sqlite_master WHERE name='country'
Use
SELECT 1 FROM table LIMIT 1;
to prevent all records from being read.
Using a simple SELECT query is - in my opinion - quite reliable. Most of all it can check table existence in many different database types (SQLite / MySQL).
SELECT 1 FROM table;
It makes sense when you can use other reliable mechanism for determining if the query succeeded (for example, you query a database via QSqlQuery in Qt).
The most reliable way I have found in C# right now, using the latest sqlite-net-pcl nuget package (1.5.231) which is using SQLite 3, is as follows:
var result = database.GetTableInfo(tableName);
if ((result == null) || (result.Count == 0))
{
database.CreateTable<T>(CreateFlags.AllImplicit);
}
The function dbExistsTable() from R DBI package simplifies this problem for R programmers. See the example below:
library(DBI)
con <- dbConnect(RSQLite::SQLite(), ":memory:")
# let us check if table iris exists in the database
dbExistsTable(con, "iris")
### returns FALSE
# now let us create the table iris below,
dbCreateTable(con, "iris", iris)
# Again let us check if the table iris exists in the database,
dbExistsTable(con, "iris")
### returns TRUE
I thought I'd add my 2 cents to this discussion, even if it's a rather old one.
This query returns scalar 1 if the table exists and 0 otherwise.
select
case when exists
(select 1 from sqlite_master WHERE type='table' and name = 'your_table')
then 1
else 0
end as TableExists
My preferred approach:
SELECT "name" FROM pragma_table_info("table_name") LIMIT 1;
If you get a row result, the table exists. This is better (for me) than checking with sqlite_master, as it will also check attached and temp databases.
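As a sketch of that behaviour, using Python's built-in sqlite3 module: one table each is created in the main, temp and an attached database, and all three are found through pragma_table_info. This assumes an SQLite build with PRAGMA table-valued functions (3.16 or later); the helper and table names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS other")
conn.execute("CREATE TABLE main_tbl (a)")
conn.execute("CREATE TEMP TABLE temp_tbl (b)")
conn.execute("CREATE TABLE other.attached_tbl (c)")


def table_exists_anywhere(conn: sqlite3.Connection, name: str) -> bool:
    """True if any schema on this connection has a table or view called name."""
    # Unlike a PRAGMA statement, the table-valued form accepts a bound parameter.
    row = conn.execute(
        "SELECT 1 FROM pragma_table_info(?) LIMIT 1", (name,)
    ).fetchone()
    return row is not None


found = [
    table_exists_anywhere(conn, n)
    for n in ("main_tbl", "temp_tbl", "attached_tbl", "missing_tbl")
]
```

The first three lookups succeed and the last one does not. Keep in mind that a view with that name would also count as existing.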
This is my code for SQLite Cordova:
get_columnNames('LastUpdate', function (data) {
if (data.length > 0) { // In data you also have columnNames
console.log("Table full");
}
else {
console.log("Table empty");
}
});
And the other one:
function get_columnNames(tableName, callback) {
myDb.transaction(function (transaction) {
var query_exec = "SELECT name, sql FROM sqlite_master WHERE type='table' AND name ='" + tableName + "'";
transaction.executeSql(query_exec, [], function (tx, results) {
var columnNames = [];
var len = results.rows.length;
if (len>0){
var columnParts = results.rows.item(0).sql.replace(/^[^\(]+\(([^\)]+)\)/g, '$1').split(','); ///// RegEx
for (i in columnParts) {
if (typeof columnParts[i] === 'string')
columnNames.push(columnParts[i].split(" ")[0]);
};
callback(columnNames);
}
else callback(columnNames);
});
});
}
Table exists or not in database in swift
func tableExists(_ tableName: String) -> Bool {
    sqlStatement = "SELECT name FROM sqlite_master WHERE type='table' AND name='\(tableName)'"
    // Finalize on every exit path; a call placed after the returns would never run.
    defer { sqlite3_finalize(compiledStatement) }
    if sqlite3_prepare_v2(database, sqlStatement, -1, &compiledStatement, nil) == SQLITE_OK {
        // A row means the table is listed in sqlite_master.
        return sqlite3_step(compiledStatement) == SQLITE_ROW
    }
    return false
}
C++ function that checks the database and all attached databases for the existence of a table and (optionally) a column.
bool exists(sqlite3 *db, string tbl, string col="1")
{
sqlite3_stmt *stmt;
bool b = sqlite3_prepare_v2(db, ("select "+col+" from "+tbl).c_str(),
-1, &stmt, 0) == SQLITE_OK;
sqlite3_finalize(stmt);
return b;
}
Edit: Recently discovered the sqlite3_table_column_metadata function. Hence
bool exists(sqlite3* db,const char *tbl,const char *col=0)
{return sqlite3_table_column_metadata(db,0,tbl,col,0,0,0,0,0)==SQLITE_OK;}
You can also use db metadata to check if the table exists.
DatabaseMetaData md = connection.getMetaData();
ResultSet resultSet = md.getTables(null, null, tableName, null);
if (resultSet.next()) {
return true;
}
If your SQL is executed from a Python script using sqlite3, first run the script from a command prompt or shell:
python3 file_name.py
Then open the resulting database with sqlite3 file_name.db and run
.tables
This command lists the tables, if any exist.
I wanted to add to Diego Vélez's answer regarding the PRAGMA statement.
From https://sqlite.org/pragma.html we get some useful functions that can return information about our database.
Here I quote the following:
For example, information about the columns in an index can be read using the index_info pragma as follows:
PRAGMA index_info('idx52');
Or, the same content can be read using:
SELECT * FROM pragma_index_info('idx52');
The advantage of the table-valued function format is that the query can return just a subset of the PRAGMA columns, can include a WHERE clause, can use aggregate functions, and the table-valued function can be just one of several data sources in a join...
Diego's answer gave PRAGMA table_info(table_name) as an option, but this won't be of much use inside your other queries.
So, to answer the OP's question and to improve Diego's answer, you can do
SELECT * FROM pragma_table_info('table_name');
or even better,
SELECT name FROM pragma_table_list('table_name');
if you want to mimic PoorLuzer's top-voted answer.
If you deal with big tables, here is a simple hack I made with Python and SQLite; the same idea works in any other language.
Step 1: Don't use IF NOT EXISTS in your CREATE TABLE command.
As you may know, running this command raises an exception if the table was already created, which leads us to the second step.
Step 2: Use try and except (or try and catch in other languages) to handle that exception.
If the table did not exist before, the try block completes normally; if it did, the except block runs, so you know the table was already there and can do your processing.
Here is the code:
import sqlite3

def create_table():
    con = sqlite3.connect("lists.db")
    cur = con.cursor()
    try:
        cur.execute('''CREATE TABLE UNSELECTED(
            ID INTEGER PRIMARY KEY)''')
        print('the table is created Now')
    except sqlite3.OperationalError:
        # Raised when the table already exists
        print('you already created the table before')
    con.commit()
    cur.close()
You can use a simple approach; I use this method in C# with Xamarin:
public class LoginService : ILoginService
{
private SQLiteConnection dbconn;
}
In the login service class I have many methods for accessing the data in SQLite. I store the data in a table, and the login page is only shown when the user is not logged in.
For this purpose I only need to know whether the table exists; in this case, if it exists, it is because it has data.
public int ExisteSesion()
{
var rs = dbconn.GetTableInfo("Sesion");
return rs.Count;
}
If the table does not exist, it returns 0. If the table exists, it returns a non-zero count; note that GetTableInfo describes the table's columns, so this is the number of columns rather than the number of rows.
In the model I have specified the name that the table must receive to ensure its correct operation.
[Table("Sesion")]
public class Sesion
{
[PrimaryKey]
public int Id { get; set; }
public string Token { get; set; }
public string Usuario { get; set; }
}
Look into the "try - throw - catch" construct in C++. Most other programming languages have a similar construct for handling errors.
