Preventing partition column from appearing in exported data - azure-data-explorer

I have an external table partitioned on the Timestamp column, which is of type datetime, so the external table definition looks like this:
.create external table external_mytable (mydata:dynamic,Timestamp:datetime)
kind=blob
partition by bin(Timestamp,1d)
dataformat=json
(
h#'https://<mystorage>.blob.core.windows.net/<mycontainer>;<storagekey>'
)
The source table for the export is mysourcetable, which has a number of columns, but I am only interested in the mydata column, which holds the actual payload, and the year, month, and day columns, which are required to drive partitioning.
My export looks like this:
.export async to table external_mytable <| mysourcetable | project mydata,Timestamp=make_datetime(year,month,day)
Now in this case I don't ideally want the Timestamp column to be part of the actual exported JSON data; I am forced to project it because it drives the partitioning logic. Is there any way to keep Timestamp out of the exported data while still using it to determine the partitioning?

Thanks for the ask, Dhiraj; this is on our backlog. Feel free to open a similar ask on our UserVoice, where we can post an update once it is complete.

Related

Truncate and Load a Kusto table instead of a Materialized view so that it can be used for continuous export

We have a scenario where some reference data is being ingested into a Kusto table (~1000 rows).
To handle data duplication from the daily data load (Kusto always appends), we have created a materialized view (MV) on top of the table that summarizes the data and keeps the latest rows based on ingestion_time(), so that querying the MV always returns the latest reference data.
Our next ask is to export this formatted data to a storage container using Kusto continuous data export (please refer to the MS docs); however, it seems we can't use a materialized view as the source of a continuous export.
So, looking at options: is there any way we can create a truncate-and-load table instead of a materialized view in Kusto, so that we don't have duplicate records in the table and it can be used for continuous export?
.create async materialized-view with (backfill=true) products_vw on table products
{
products
| extend d=parse_json(record)
| extend
createdBy=tostring(d.createdBy),
createdDate = tostring(d.createdDate),
product_id=tostring(d.id),
product_name=tostring(d.name),
ingest_time=ingestion_time()
| project
ingest_time,
createdBy,
createdDate,
product_id,
product_name
| summarize arg_max(ingest_time, *) by product_id
}
You can use Azure Logic Apps or Microsoft Flow to run the applicable export command against an external table backed by Azure Storage on any given time interval. The query can simply reference the materialized view, for example:
.export to table ExternalBlob <| Your_MV
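As a minimal sketch of the destination side, here is an external table definition plus the export command; the table name, column list, and storage URI are placeholders to adapt (the columns mirror what products_vw emits):
.create external table ProductsExternal (product_id:string, ingest_time:datetime, createdBy:string, createdDate:string, product_name:string)
kind=blob
dataformat=csv
(
    h@'https://<mystorage>.blob.core.windows.net/<mycontainer>;<storagekey>'
)
.export async to table ProductsExternal <| products_vw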

Formatting SQL WHERE Conditional Values

I am looking to see if there is a way to format conditional values in batch instead of typing them manually. For example, I am filtering on 5-digit codes in SQL, and my source for the codes is a list in Excel. There can be hundreds of codes to add to a SQL WHERE clause to filter on, so is there a tool or formatting method that will take a list of values and format them with single quotes and comma separation?
From this:
30239
30240
30241
30242
To this:
'30239',
'30240',
'30241',
'30242',
...
Then these formatted values can be pasted into the WHERE clause instead of typing all of this out manually. Again, this is for hundreds of values...
I used to use BrioQuery, which had functionality to import text files to be used in filtering, but my current query tool, Toad Data Point, does not seem to have this.
Thank you
Look into SQL*Loader. Create a staging table to contain the imported values, use SQL*Loader to populate the staging table, and then modify your query to reference the staging table. It becomes something like:
SELECT ...
WHERE target_column_name IN (SELECT column_name FROM stage_table)
The structure WHERE ... IN (SELECT ...) may not be the best for performance, but once the values are loaded you will have all the facilities SQL offers at your disposal.
It has been a few years since I've used Toad, but as I remember it has import functionality. There are other tools for loading data from Excel into Oracle; SQL*Loader just happens to be the one Oracle supplies with the RDBMS.
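As a rough sketch of that approach, with all table and column names hypothetical:
-- Staging table to hold the imported codes
CREATE TABLE stage_table (column_name VARCHAR2(5));
-- After loading the Excel list into stage_table (via SQL*Loader or an
-- import wizard), reference it from the filter:
SELECT *
FROM your_table
WHERE target_column_name IN (SELECT column_name FROM stage_table);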

Oracle BI Publisher - Dynamic number of columns

I'm creating a report in BI Publisher using the BI Publisher Desktop tool for Word.
What I need is a table with a dynamic number of columns.
Let's imagine I'm listing stock by store: each line is an item, and I need a column for each store in the database, but that must be dynamic, because a store can be created or deleted at any moment.
The number of stores, i.e. the number of columns that need to exist, is obtained from an SQL query that goes into the report through a data set.
The query will be something like SELECT COUNT(*) AS STORE_COUNT FROM STORE; in a data set named G_1, so the number of columns is the variable G_1::STORE_COUNT.
Is there any way that can be achieved?
I'm developing the report using an .rtf file, so any related help would be appreciated.
Thank you very much.
Create an .rtf file with the column names mapped to an .xdo or .xdm file. The mapped columns in the .xdo or .xdm file should come from the cursor or the SELECT statement of your stored procedure or function.
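For what it's worth, BI Publisher RTF templates also support the @column context on a for-each loop, which repeats a table column for each occurrence of an element; a minimal sketch, assuming the data set returns one STORE element per store (element and field names here are hypothetical):
Header cell: <?for-each@column:STORE?><?STORE_NAME?><?end for-each?>
Data cell:   <?for-each@column:STORE?><?STOCK_QTY?><?end for-each?>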

How to restrict loading data based on some criteria using SQL*Loader in Oracle?

I have a data file (.csv) which contains about a million (10 lakh) records. I am uploading the file's data into my table TBL_UPLOADED_DATA using Oracle SQL*Loader and a control file.
I am able to upload all the data from the file to the table smoothly, without any issues.
Now my requirement is that I want to upload only relevant data, based on some criteria.
For example, I have a table EMPLOYEE whose columns are EMPID, EMPNAME, REMARKS, and EMPSTATUS, and I have a data file with employee data that I need to upload into the EMPLOYEE table.
Here I want to restrict some data so that it is not loaded into the EMPLOYEE table by SQL*Loader. Assume the restriction criteria are that REMARKS should not contain 'NO' and EMPSTATUS should not contain '00'.
How can I implement this? Please suggest what changes need to be made in the control file.
You can use the WHEN syntax to choose to include or exclude a record based on some logic, but you can only use the =, != and <> operators, so it won't do quite what you need. If your status field is two characters then you can enforce that part with:
...
INTO TABLE employee
WHEN (EMPSTATUS != '00')
FIELDS ...
... and then a record with 00 as the last field will be rejected, with the log showing something like:
1 Row not loaded because all WHEN clauses were failed.
And you could use the same method to reject a record where the remarks are just 'NO' - where that is the entire content of the field:
WHEN (REMARKS != 'NO') AND (EMPSTATUS != '00')
... but not where it is a longer value that contains NO. It isn't entirely clear which you want.
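Putting the two conditions together, a minimal control file sketch (the file name, APPEND mode, and delimiter are assumptions):
LOAD DATA
INFILE 'employees.csv'
APPEND
INTO TABLE employee
WHEN (REMARKS != 'NO') AND (EMPSTATUS != '00')
FIELDS TERMINATED BY ','
(EMPID, EMPNAME, REMARKS, EMPSTATUS)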
But you can't use LIKE, a function like INSTR, or a regular expression to be more selective. If you need something more advanced, you'll need to load the data into a staging table, or use an external table instead of SQL*Loader, and then selectively insert into your real table based on those conditions.
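If you take the staging route, the selective insert might look like this (the staging table name employee_stage is hypothetical):
INSERT INTO employee (empid, empname, remarks, empstatus)
SELECT empid, empname, remarks, empstatus
FROM employee_stage
WHERE remarks NOT LIKE '%NO%'
AND empstatus != '00';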

Importing fields from multiple columns in an Excel spreadsheet into a single row in Access

We get new data for our database from an online form that outputs to an Excel sheet. To normalize the data for the database, I want to turn the repeating phone-number columns into separate rows.
Example, I want data like this:
ID | Home Phone | Cell Phone | Work Phone
1  | 555-1234   | 555-3737   | 555-3837
To become this:
PhoneID | ID | Phone Number | Phone Type
1       | 1  | 555-1234     | Home
2       | 1  | 555-3737     | Cell
3       | 1  | 555-3837     | Work
To import the data, I have a button that finds the spreadsheet and then runs a bunch of queries to add the data.
How can I write a query to append this data to the end of an existing table without ending up with duplicate records? The data pulled from the website is all stored and archived in an Excel sheet that is updated without removing the old data (we don't want to lose this extra backup), so with each import I need it to disregard all of the previously entered data.
I was able to make a query that lists everything out correctly from the original spreadsheet (I imported the external spreadsheet into an unnormalized table in Access to test it), but when I try to append it to the phone number table, it adds all of the data again on every import. I can remove the duplicates with another query, but I'd rather not leave it like that.
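For reference, a normalizing query like the one described can be written as a union query in Access SQL; the source table name Contacts is assumed, and the column names come from the example above:
SELECT ID, [Home Phone] AS [Phone Number], "Home" AS [Phone Type] FROM Contacts
UNION ALL
SELECT ID, [Cell Phone], "Cell" FROM Contacts
UNION ALL
SELECT ID, [Work Phone], "Work" FROM Contacts;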
There are several possible approaches to this problem; which one you choose may depend on the size of the dataset relative to the number of updates being processed. Basically, the choices are:
1) Add a unique index to the destination table, so that Access will refuse to add a duplicate record. You'll need to handle the possible warning ("Access was unable to add xxx records due to index violations" or similar).
2) Import the incoming data to a staging table, then outer join the staging table to the destination table and append only records where the key field(s) in the destination table are null (i.e., there's no matching record in the destination table), as sketched below.
I have used both approaches in the past - I like the index approach for its simplicity, and I like the staging approach for its flexibility, because you can do a lot with the incoming data before you append it if you need to.
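A sketch of the outer-join append from approach 2, with hypothetical table names (Phones is the destination, StagingPhones holds the import):
INSERT INTO Phones (ID, [Phone Number], [Phone Type])
SELECT s.ID, s.[Phone Number], s.[Phone Type]
FROM StagingPhones AS s LEFT JOIN Phones AS p
    ON (s.ID = p.ID) AND (s.[Phone Type] = p.[Phone Type])
WHERE p.ID IS NULL;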
You could run a delete query on the table where you store the imported data and then run your imports, assuming the data is simply being refreshed each time. The delete query removes all records, and the import then repopulates the table, so there are no duplicates.
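In Access SQL that delete query is a one-liner (the table name PhonesImport is assumed):
DELETE FROM PhonesImport;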
