Kusto: create an in-memory table for testing (azure-data-explorer)

Just looking to create a quick in-memory/temp table for testing out queries. I've seen this done before, but I'm having trouble finding any examples from a web search or Stack Overflow search. I'm looking for something like this:
let TempTable = table("TestTable",
Column column1 = [1,2,3],
Column column2 = ["A","B","C"]
);
I don't need to save the table in any database; I just want to use it for testing & demonstration purposes.

You could use the datatable operator: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/datatableoperator
For example:
let T = datatable(column1:int, column2:string)
[
1, "A",
2, "B",
3, "C",
];
... do something with T ...

Related

How to query array column with array parameter in Azure Data Explorer (kusto)

I have a table with a dynamic column where I store a list of IDs, and a parameter through which a list of IDs can be passed. I want to get the rows where any of the input values are present in the table column.
Something like this:
declare query_parameters (
i_ids: dynamic = dynamic([15,33,37])
);
let T = datatable(id: int, ids:dynamic)
[
1, dynamic([10, 15, 18]),
2, dynamic([22,25,29]),
3, dynamic([31, 33, 37]),
];
T
| where ids has_any(i_ids);
I need to get rows 1 and 3, but it fails with the message: The source expression is of type 'dynamic' and cannot be compared with numeric arguments.
Can you please help me write a proper query?
You can try using set_intersect(): https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/setintersectfunction
declare query_parameters (
i_ids: dynamic = dynamic([15,33,37])
);
let T = datatable(id: int, ids:dynamic)
[
1, dynamic([10, 15, 18]),
2, dynamic([22,25,29]),
3, dynamic([31, 33, 37]),
];
T
| where array_length(set_intersect(ids, i_ids)) > 0
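The filter above keeps a row whenever the two arrays share at least one element. A minimal Python sketch of the same check, using the sample data from the table above, may help confirm which rows should survive:

```python
# Rows from the example table above: (id, ids)
rows = [
    (1, [10, 15, 18]),
    (2, [22, 25, 29]),
    (3, [31, 33, 37]),
]
i_ids = [15, 33, 37]

# Same check as: array_length(set_intersect(ids, i_ids)) > 0
matching = [row_id for row_id, ids in rows if set(ids) & set(i_ids)]
print(matching)  # [1, 3]
```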

Undesired flattening occurring

I'm using BigQuery on exported GA data (see schema here)
Looking at the documentation, I see that when I select a field that is inside a record, it will automatically flatten that record and duplicate the surrounding columns.
So I tried to create a denormalized table that I could query with a more SQL-like mindset:
SELECT
CONCAT(date, " ",
  IF(hits.hour < 10, CONCAT("0", STRING(hits.hour)), STRING(hits.hour)),
  ":",
  IF(hits.minute < 10, CONCAT("0", STRING(hits.minute)), STRING(hits.minute))) AS hit_date__STRING,
CONCAT(fullVisitorId, STRING(visitId)) AS session_id__STRING,
fullVisitorId AS google_identity__STRING,
MAX(IF(hits.customDimensions.index=7, hits.customDimensions.value,NULL)) WITHIN RECORD AS customer_id__LONG,
hits.hitNumber AS hit_number__INT,
hits.type AS hit_type__STRING,
hits.isInteraction AS hit_is_interaction__BOOLEAN,
hits.isEntrance AS hit_is_entrance__BOOLEAN,
hits.isExit AS hit_is_exit__BOOLEAN,
hits.promotion.promoId AS promotion_id__STRING,
hits.promotion.promoName AS promotion_name__STRING,
hits.promotion.promoCreative AS promotion_creative__STRING,
hits.promotion.promoPosition AS promotion_position__STRING,
hits.eventInfo.eventCategory AS event_category__STRING,
hits.eventInfo.eventAction AS event_action__STRING,
hits.eventInfo.eventLabel AS event_label__STRING,
hits.eventInfo.eventValue AS event_value__INT,
device.language AS device_language__STRING,
device.screenResolution AS device_resolution__STRING,
device.deviceCategory AS device_category__STRING,
device.operatingSystem AS device_os__STRING,
geoNetwork.country AS geo_country__STRING,
geoNetwork.region AS geo_region__STRING,
hits.page.searchKeyword AS hit_search_keyword__STRING,
hits.page.searchCategory AS hits_search_category__STRING,
hits.page.pageTitle AS hits_page_title__STRING,
hits.page.pagePath AS page_path__STRING,
hits.page.hostname AS page_hostname__STRING,
hits.eCommerceAction.action_type AS commerce_action_type__INT,
hits.eCommerceAction.step AS commerce_action_step__INT,
hits.eCommerceAction.option AS commerce_action_option__STRING,
hits.product.productSKU AS product_sku__STRING,
hits.product.v2ProductName AS product_name__STRING,
hits.product.productRevenue AS product_revenue__INT,
hits.product.productPrice AS product_price__INT,
hits.product.productQuantity AS product_quantity__INT,
hits.product.productRefundAmount AS product_refund_amount__INT,
hits.product.v2ProductCategory AS product_category__STRING,
hits.transaction.transactionId AS transaction_id__STRING,
hits.transaction.transactionCoupon AS transaction_coupon__STRING,
hits.transaction.transactionRevenue AS transaction_revenue__INT,
hits.transaction.transactionTax AS transaction_tax__INT,
hits.transaction.transactionShipping AS transaction_shipping__INT,
hits.transaction.affiliation AS transaction_affiliation__STRING,
hits.appInfo.screenName AS app_current_name__STRING,
hits.appInfo.screenDepth AS app_screen_depth__INT,
hits.appInfo.landingScreenName AS app_landing_screen__STRING,
hits.appInfo.exitScreenName AS app_exit_screen__STRING,
hits.exceptionInfo.description AS exception_description__STRING,
hits.exceptionInfo.isFatal AS exception_is_fatal__BOOLEAN
FROM
[98513938.ga_sessions_20151112]
HAVING
customer_id__LONG IS NOT NULL
AND customer_id__LONG != 'NA'
AND customer_id__LONG != ''
I wrote the result of this query into another table, denorm (flatten results on, allow large results on).
I get different results when I query denorm with the clause
WHERE session_id_STRING = "100001897901013346771447300813"
versus wrapping the above query in (which yields desired results)
SELECT * FROM (_above query_) as foo where session_id_STRING = 100001897901013346771447300813
I'm sure this is by design, but if someone could explain the difference between these two methods, that would be very helpful.
I believe you are saying that you did check the box "Flatten Results" when you created the output table? And I assume from your question that session_id_STRING is a repeated field?
If those are correct assumptions, then what you are seeing is exactly the behavior you referenced from the documentation above. You asked BigQuery to "flatten results" so it turned your repeated field into an un-repeated field and duplicated all the fields around it so that you have a flat (i.e., no repeated data) table.
If the desired behavior is the one you see when querying over the subquery, then you should uncheck that box when creating your table.
Looking at the documentation, I see that when I selected a field that is inside a record it will automatically flatten that record and duplicate the surrounding columns.
This is not correct. BTW, can you please point to the documentation - it needs to be improved.
Selecting a field does not flatten that record. So if you have a table T with a single record {a = 1, b = (2, 2, 3)}, then do
SELECT * FROM T WHERE b = 2
You still get a single record {a = 1, b = (2, 2)}. SELECT COUNT(a) from this subquery would return 1.
But once you write results of this query with flatten=on, you get two records: {a = 1, b = 2}, {a = 1, b = 2}. SELECT COUNT(a) from the flattened table would return 2.
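The distinction can be sketched in Python: filtering leaves the repeated field nested (still one record), while writing with flatten=on duplicates the surrounding fields, one flat row per repeated value. This is only an illustration of the semantics, not BigQuery code:

```python
# One record with a repeated field b, as in the example above
record = {"a": 1, "b": [2, 2, 3]}

# SELECT * WHERE b = 2: the record survives, with b filtered to matching values
filtered = {"a": record["a"], "b": [v for v in record["b"] if v == 2]}
print(filtered)  # {'a': 1, 'b': [2, 2]} -> COUNT(a) over this is 1

# Writing the result with flatten=on: one flat row per value of b
flattened = [{"a": filtered["a"], "b": v} for v in filtered["b"]]
print(flattened)       # [{'a': 1, 'b': 2}, {'a': 1, 'b': 2}]
print(len(flattened))  # COUNT(a) over the flattened table is 2
```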

Query to Replace a value with a different value in a table

I would like to replace a value Test to Mess in Column A in a table T where the value is Var in Column B in the same table.
Please someone help me with the query as I'm new to Oracle.
This is very easy, try this:
UPDATE t
SET A = REPLACE(A, 'Test', 'Mess')
WHERE B = 'Var';
Or, if you want not a substring replace but a full overwrite of column A, you can do it like this:
UPDATE t
SET A = 'Mess'
WHERE B = 'Var' and A = 'Test';
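The difference matters when column A contains the substring as part of a longer value: REPLACE() rewrites the substring wherever it occurs, while the second form overwrites the whole value only on an exact match. A quick sanity check using SQLite from Python (REPLACE() behaves the same way there; the sample rows are made up for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (A TEXT, B TEXT)")
rows = [("Test", "Var"), ("Test123", "Var"), ("Test", "Other")]
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)

# Substring replace: also rewrites 'Test123' -> 'Mess123'
conn.execute("UPDATE t SET A = REPLACE(A, 'Test', 'Mess') WHERE B = 'Var'")
substring_result = [r[0] for r in conn.execute("SELECT A FROM t ORDER BY rowid")]
print(substring_result)  # ['Mess', 'Mess123', 'Test']

# Reset, then exact-match overwrite: 'Test123' is left alone
conn.execute("DELETE FROM t")
conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
conn.execute("UPDATE t SET A = 'Mess' WHERE B = 'Var' AND A = 'Test'")
exact_result = [r[0] for r in conn.execute("SELECT A FROM t ORDER BY rowid")]
print(exact_result)  # ['Mess', 'Test123', 'Test']
```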

Cassandra - CqlEngine - using collection

I want to know how I can work with collections in cqlengine.
I can insert a value into a list, but only one value, so I can't append values to my list.
I want to do this:
In CQL3:
UPDATE users
SET top_places = [ 'the shire' ] + top_places WHERE user_id = 'frodo';
In CqlEngine:
connection.setup(['127.0.0.1:9160'])
TestModel.create(id=1, field1=[2])
This code will add 2 to my list, but when I insert a new value it replaces the old value in the list.
The only help in CqlEngine:
https://cqlengine.readthedocs.org/en/latest/topics/columns.html#collection-type-columns
I also want to know how I can read a collection field with cqlengine. Is it a dictionary in my Django project? How can I use it?
Please help.
Thanks
Looking at your example, it's a list.
Given a table based on the Cassandra CQL documentation:
CREATE TABLE plays (
    id text PRIMARY KEY,
    game text,
    players int,
    scores list<int>
)
You have to declare the model like this:
class Plays(Model):
    id = columns.Text(primary_key=True)
    game = columns.Text()
    players = columns.Integer()
    scores = columns.List(columns.Integer())
You can create a new entry like this (omitting the code how to connect):
Plays.create(id = '123-afde', game = 'quake', players = 3, scores = [1, 2, 3])
Then to update the list of scores one does:
play = Plays.objects.filter(id = '123-afde').get()
play.scores.append(20) # <- this will add a new entry at the end of the list
play.save() # <- this will propagate the update to Cassandra - don't forget it
Now if you query your data with the CQL client you should see new values:
id | game | players | scores
----------+-------+---------+---------------
123-afde | quake | 3 | [1, 2, 3, 20]
To get the values in python you can simply use an index of an array:
print "Length is %(len)s and 3rd element is %(val)d" %\
{ "len" : len(play.scores), "val": play.scores[2] }

sqlite - how do I get a one row result back? (luaSQLite3)

How can I get a single-row result (e.g. in the form of a table/array) back from a SQL statement, using Lua SQLite (LuaSQLite3)? For example this one:
SELECT * FROM sqlite_master WHERE name ='myTable';
So far I note:
using "nrows"/"rows" gives an iterator back
using "exec" doesn't seem to give a result back(?)
Specific questions are then:
Q1 - How to get a single row (say first row) result back?
Q2 - How to get row count? (e.g. num_rows_returned = db:XXXX(sql))
In order to get a single row use the db:first_row method. Like so.
row = db:first_row("SELECT `id` FROM `table`")
print(row.id)
In order to get the row count use the SQL COUNT statement. Like so.
row = db:first_row("SELECT COUNT(`id`) AS count FROM `table`")
print(row.count)
EDIT: Ah, sorry for that. Here are some methods that should work.
You can also use db:nrows. Like so.
rows = db:nrows("SELECT `id` FROM `table`")
row = rows[1]
print(row.id)
We can also modify this to get the number of rows.
rows = db:nrows("SELECT COUNT(`id`) AS count FROM `table`")
row = rows[1]
print(row.count)
Here is a demo of getting the returned count:
> require "lsqlite3"
> db = sqlite3.open":memory:"
> db:exec "create table foo (x,y,z);"
> for x in db:urows "select count(*) from foo" do print(x) end
0
> db:exec "insert into foo values (10,11,12);"
> for x in db:urows "select count(*) from foo" do print(x) end
1
>
Just loop over the iterator you get back from rows or whichever function you use, but put a break at the end so you only iterate once.
Getting the count is all about using SQL. You compute it with the SELECT statement:
SELECT count(*) FROM ...
This will return one row containing a single value: the number of rows in the query.
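For a quick cross-check outside Lua: the same SQL returns exactly one single-column row in any SQLite binding, for example with Python's built-in sqlite3 module (the table here is made up for the demo):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE foo (x, y, z)")
db.execute("INSERT INTO foo VALUES (10, 11, 12)")

# count(*) always yields exactly one row containing a single value
(count,) = db.execute("SELECT count(*) FROM foo").fetchone()
print(count)  # 1
```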
This is similar to what I'm using in my project and works well for me.
local query = "SELECT content FROM playerData WHERE name = 'myTable' LIMIT 1"
local queryResultTable = {}
local queryFunction = function(userData, numberOfColumns, columnValues, columnTitles)
    for i = 1, numberOfColumns do
        queryResultTable[columnTitles[i]] = columnValues[i]
    end
end
db:exec(query, queryFunction)
for k,v in pairs(queryResultTable) do
print(k,v)
end
You can even concatenate values into the query to place it inside a generic method/function (just be careful doing this with untrusted input, since string concatenation is open to SQL injection):
local query = "SELECT * FROM ZQuestionTable WHERE ConceptNumber = "..conceptNumber.." AND QuestionNumber = "..questionNumber.." LIMIT 1"