In APEX, when performing a Data Load (e.g. uploading a CSV file into an APEX application), is it possible to validate input data using a transformation rule?
For example, suppose I upload data about cars that have been sold this month.
The target table has the columns car_manufacturer and num_car_sold.
The column car_manufacturer must accept only three values, say ('A1', 'A2', 'A3').
In pseudo PL/SQL, just to give an idea:
IF :car_manufacturer IN ('A1', 'A2', 'A3') THEN :car_manufacturer ELSE <error>
How can I check this in the upload phase? Is it possible to use a transformation rule so that it returns an error message if the check fails? Are there other ways?
Thanks in advance.
You could put a constraint on the table definition as per the other answer, or if you only want the error message for when the Data Load is used, you can use a Table Lookup.
Go to Shared Components -> Data Load Definitions
Open the Data Load Definition that you want to edit
Create Table Lookup
Select the column (e.g. car_manufacturer)
Set the Table Lookup attributes to a table that contains the list of valid values (you'll need either a table or a view for this)
Leave Insert New Value set to No (If set to 'No' (the default) then a new record will not be created in the lookup table if the lookup column value(s) entered do not already exist. If set to 'Yes' then a record will be created in the lookup table using the upload column(s) and the Upload Key Column will be retrieved from the newly created record.)
Set Error Message to the message you want to return if a match is not found.
How about having a check constraint on the table for the column "car_manufacturer"?
ALTER TABLE TABLE_NAME
ADD CONSTRAINT CHECK_CAR_MANUFACTURER
CHECK ( CAR_MANUFACTURER in ('A1', 'A2', 'A3'));
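To see how such a constraint behaves during a load, here is a minimal Python sketch. It uses SQLite through the standard sqlite3 module purely as a stand-in for the Oracle table; the table name car_sales and the helpers make_table and try_insert are assumptions made for illustration, not part of APEX.

```python
import sqlite3


def make_table(conn: sqlite3.Connection) -> None:
    # Same rule as the ALTER TABLE above, declared at creation time because
    # SQLite (the stand-in used here) cannot add a CHECK constraint afterwards.
    conn.execute(
        "CREATE TABLE car_sales ("
        " car_manufacturer TEXT NOT NULL"
        "   CONSTRAINT check_car_manufacturer"
        "   CHECK (car_manufacturer IN ('A1', 'A2', 'A3')),"
        " num_car_sold INTEGER)"
    )


def try_insert(conn: sqlite3.Connection, manufacturer: str, sold: int) -> bool:
    """Insert one row; return False if the CHECK constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO car_sales (car_manufacturer, num_car_sold) VALUES (?, ?)",
            (manufacturer, sold),
        )
        return True
    except sqlite3.IntegrityError:
        # The database refuses the row, whichever tool performed the load.
        return False


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    make_table(conn)
    for name in ("A1", "B9"):
        print(name, "accepted" if try_insert(conn, name, 10) else "rejected")
```

The point of the design is that the rule lives in the table, so it applies to every insert path and not only to the Data Load wizard.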
I have a scenario where we need to load data from a source file into a target table starting from a particular date (LOAD_DATE). So I'll create a mapping parameter for LOAD_DATE and pass it into the Source Qualifier query. My query looks like this:
SELECT * FROM my_TABLE where DATE >= '$$LOAD_DATE'
So I need to supply the value for '$$LOAD_DATE' from another external database. I know that I need to pass the values from the parameter file.
But my requirement is not to hardcode the values in the parameter file but to feed them in at runtime from another database. I will appreciate your help and thoughts on this.
You don't have to hardcode.
You can do it like this:
Option 1. Create a mapping that generates the parameter file in the required format.
Read the value from the other DB.
In an Expression transformation, create the port below, which generates the actual parameter string. Please note that we need to add a newline so that it is recognized as an actual parameter file.
out_str = '[<<name of folder . name of workflow or session>>]' || CHR(10) ||
'$$LOAD_DATE=' || CHR(39) || <<date value from another DB>> || CHR(39)
Then link the above port to a flat file target. Name the output file session_param.txt or whatever is suitable. Please make sure the parameter is generated correctly.
Use the above file as the parameter file in your actual workflow.
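As a rough illustration of the text that Option 1 has to produce, here is a small Python sketch that builds the same two lines outside Informatica. The section name MyFolder.WF:wf_load, the ISO date format, and the function build_param_file are assumptions made for the example; use whatever header and date format your workflow actually expects.

```python
from datetime import date


def build_param_file(section: str, load_date: date, quote: bool = True) -> str:
    """Return parameter-file text: a [section] header line, then $$LOAD_DATE."""
    value = load_date.isoformat()  # assumed format: YYYY-MM-DD
    if quote:
        # Mirrors the CHR(39) wrapping in the expression port.
        value = f"'{value}'"
    # "\n" plays the role of the newline between header and parameter.
    return f"[{section}]\n$$LOAD_DATE={value}\n"


if __name__ == "__main__":
    # Hypothetical folder and workflow names.
    text = build_param_file("MyFolder.WF:wf_load", date(2024, 1, 31))
    with open("session_param.txt", "w", encoding="utf-8") as handle:
        handle.write(text)
    print(text, end="")
```

If the Source Qualifier query already wraps $$LOAD_DATE in quotes, as the query in the question does, passing quote=False avoids quoting the value twice.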
Option 2. Join the other table into the original table flow. This can be difficult and requires changing the existing mapping.
Join the table from the other DB with the main table on a dummy condition. Make sure you get distinct values of LOAD_DATE from the other table, and that this DB always returns exactly one value.
Once you have the LOAD_DATE field from the other table, you can use it in a Filter transformation to filter the data.
After this point you can add your original mapping logic.
The whole mapping should look like this:
SQ_MAIN_TABLE ----------------------->|
sq_ANOTHER_TABLE --DISTINCT_LOAD_DT-->JNR--FIL on LOAD_DT --><<your mapping logic>>
I want to add a column to an existing Impala table (and view) with a default value, so that the existing rows also have a value. The column should not allow null values.
ALTER TABLE dbName.tblName ADD COLUMNS (id STRING NOT NULL '-1')
I went through the docs but could not find an example that specifically does this. How do I do this in Impala? Hue underlines/does not recognize the NOT NULL command
Are you using Kudu as a storage layer for your table? Because if not, then according to Impala docs,
Note: Impala only allows PRIMARY KEY clauses and NOT NULL constraints on
columns for Kudu tables. These constraints are enforced on the Kudu
side.
...
For non-Kudu tables, Impala allows any column to contain NULL values,
because it is not practical to enforce a "not null" constraint on HDFS
data files that could be prepared using external tools and ETL
processes.
Impala's ALTER TABLE syntax also does not support specifying default column values (in general, non-Kudu).
With Impala you could try the following.
First add the column:
ALTER TABLE dbName.tblName ADD COLUMNS(id STRING);
Once you've added the column, you can fill it by overwriting the table from itself:
INSERT OVERWRITE dbName.tblName SELECT col1,...,coln, '-1' FROM dbName.tblName;
where col1,...,coln are the columns that existed before the ADD COLUMNS command and '-1' is the value used to fill the new column.
I've seen enough answers to know you can't easily check for columns in SQLite before adding them. I'm trying to make a lazy person's node in Node-RED where you pass a message to SQLite containing the query. Adding a table if it does not exist is easy.
msg.topic='create table IF NOT EXISTS fred (id PRIMARY KEY);'; node.send(msg);
it occurred to me that adding a table which had the names of the fields would be easy - and if the field name is not in the table.... then add the field. BUT you can't add multiple fields at once - so I can't do this...
msg.topic='create table IF NOT EXISTS fred (id PRIMARY KEY, myfields TEXT);'; node.send(msg);
The problem with THAT is that I can't add this in later; there's no way to check before adding a field if the table exists!
This is what I WANT
msg.topic='create table IF NOT EXISTS fred (id PRIMARY KEY, myfields TEXT);'; node.send(msg);
msg.topic='if not (select address from myfields) alter table fred add column address text';
I just cannot think of any way to do this - any ideas anyone (the idea is that the node-red node would input a table, field and value and if the table didn't exist it would be created, if the field didn't exist it would be created, all before trying to add in the value).
You won't be able to make the ALTER TABLE conditional in the SQL itself. You'll need to handle that from your calling script.
Alternately, simply attempt to add the column to the table and accept failure as an outcome. Assuming the table exists, the only failure reason you could encounter is that the column already exists.
If you'd like to do something more graceful, you can check if the column exists in advance, then conditionally run the SQL from your calling application.
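A minimal sketch of the "check in advance" approach, written in Python with the standard sqlite3 module rather than as a Node-RED function node. It relies on SQLite's PRAGMA table_info, which returns one row per column; the helper names column_exists and ensure_column are assumptions made for this example.

```python
import sqlite3


def column_exists(conn: sqlite3.Connection, table: str, column: str) -> bool:
    # Identifiers cannot be bound as parameters, so validate before interpolating.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Index 1 of each row is the column name; SQLite names are case-insensitive.
    return any(row[1].lower() == column.lower() for row in rows)


def ensure_column(conn: sqlite3.Connection, table: str, column: str,
                  col_type: str = "TEXT") -> bool:
    """Add the column only if it is missing; return True if it was added."""
    if not column.isidentifier() or not col_type.isidentifier():
        raise ValueError("unsafe column name or type")
    if column_exists(conn, table, column):
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {col_type}")
    return True


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE IF NOT EXISTS fred (id PRIMARY KEY)")
    print(ensure_column(conn, "fred", "address"))  # first call adds the column
    print(ensure_column(conn, "fred", "address"))  # second call is a no-op
```

The same two-step logic (query the column list, then decide whether to send the ALTER TABLE) should carry over to whatever calling script issues the SQL.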
I have a data file (.csv) which contains 10 lakh (1 million) records. I am loading the file data into my table TBL_UPLOADED_DATA using Oracle SQL*Loader and a control file.
I am able to load all the data from the file into the table smoothly without any issues.
Now my requirement is that I want to load only relevant data based on some criteria.
For example, I have a table EMPLOYEE with columns EMPID, EMPNAME, REMARKS, EMPSTATUS.
I have a data file with employee data that I need to load into the EMPLOYEE table.
Here I want to prevent some data from being loaded into the EMPLOYEE table by SQL*Loader. Assume the restriction criteria are that REMARKS should not contain 'NO' and EMPSTATUS should not contain '00'.
How can I implement this? Please suggest what changes need to be made in the control file.
You can use the WHEN syntax to choose to include or exclude a record based on some logic, but you can only use the =, != and <> operators, so it won't do quite what you need. If your status field is two characters then you can enforce that part with:
...
INTO TABLE employee
WHEN (EMPSTATUS != '00')
FIELDS ...
... and then a record with 00 as the last field will be rejected, with the log showing something like:
1 Row not loaded because all WHEN clauses were failed.
And you could use the same method to reject a record where the remarks are just 'NO' - where that is the entire content of the field:
WHEN (REMARKS != 'NO') AND (EMPSTATUS != '00')
... but not where it is a longer value that contains NO. It isn't entirely clear which you want.
But you can't use like or a function like instr or a regular expression to be more selective. If you need something more advanced you'll need to load the data into a staging table, or use an external table instead of SQL*Loader, and then selectively insert into your real table based on those conditions.
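The staging-table route can be sketched as follows. This is a Python example using SQLite as a stand-in for Oracle, so it only shows the shape of the approach: load everything into a staging table, then copy across the rows that pass a "contains" test. The table name employee_stage and the function load_filtered are assumptions made for the example.

```python
import sqlite3


def load_filtered(conn: sqlite3.Connection, rows: list) -> list:
    """Stage all rows, then copy only the acceptable ones into employee."""
    conn.execute("CREATE TABLE IF NOT EXISTS employee_stage"
                 " (empid, empname, remarks, empstatus)")
    conn.execute("CREATE TABLE IF NOT EXISTS employee"
                 " (empid, empname, remarks, empstatus)")
    conn.execute("DELETE FROM employee_stage")  # start from an empty staging table
    conn.executemany("INSERT INTO employee_stage VALUES (?, ?, ?, ?)", rows)
    # instr() gives the "contains" test that the WHEN clause cannot express.
    # Rows with a NULL empstatus are also skipped by the <> comparison.
    conn.execute(
        "INSERT INTO employee"
        " SELECT empid, empname, remarks, empstatus FROM employee_stage"
        " WHERE instr(coalesce(remarks, ''), 'NO') = 0 AND empstatus <> '00'"
    )
    return [r[0] for r in conn.execute("SELECT empid FROM employee ORDER BY empid")]


if __name__ == "__main__":
    sample = [
        (1, "Ann", "OK", "01"),
        (2, "Bob", "NO SHOW", "01"),  # remarks contain 'NO'
        (3, "Cy", "fine", "00"),      # status is '00'
    ]
    print(load_filtered(sqlite3.connect(":memory:"), sample))
```

In Oracle the equivalent filter would go in the WHERE clause of the INSERT ... SELECT from the staging or external table.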
I need to create a data flow for an existing MS SSDT project that imports a flat CSV file into an existing database table. So far so good.
However, I would like to reject all entries where the column "code" matches values already stored in the db. Even better, if possible, when the column "code" matches an entry in the database, I would like to update the column "description". The important thing is that under no circumstances should duplicate code entries be created.
Thanks
Ok so seeing as I figured it out, I thought someone else might find it useful:
The short answer is that a lookup is needed between the data source and the destinations. This "lookup" splits the rows into matches that need updating and new values that need to go straight into a new table row.
Values that match the database and need updating to the description need to be fed into an "OLE DB command".
Within the "lookup" component we need to do the following:
Go to the general tab and select Redirect rows to no match output
Go to the connection tab and insert the following SQL:
SELECT id, code FROM tableName
Go into the "Columns" tab and check the "id" column in the "Available Lookup Columns" table. Also check the "code" column and drag it to its counterpart in "Available Input Columns" to map them to each other so that the lookup can compare them.
-- At this point, if you get an error caused by the mapping, try replacing the code in step 2 with:
SELECT id, CAST(code AS nvarchar(50)) AS code FROM tableName
In the Error Output, ensure that id under "Lookup Match Output" has a description of "Copy Column"
Now we need to configure the "OLE DB command" component:
Go to the "Connection Managers" tab and ensure the component is connected to the desired DB
Go to "Component Properties" and add the following code to the "SQLCommand" property:
UPDATE tableName SET description = ? WHERE id = ?
Note the "?" placeholders. Each one marks a parameter that is mapped in the "Column Mappings" tab, so do not replace them.
Finally go into the "Column Mappings" tab and map Param_0 (the first ?) to the "description" column and "Param_1" to the "id" column. No action is needed on the "code" or any other column the db table may contain.
Now give yourself a big pat on the back for having completed a task, that in SQL would normally be one line of code, in about 10 time-consuming steps ;-)
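For comparison, the same lookup-then-branch logic can be sketched in a few lines of Python. SQLite stands in for the real database here, and the function name upsert_codes and the table layout (id, code, description) are assumptions based on the queries above.

```python
import sqlite3


def upsert_codes(conn: sqlite3.Connection, rows: list) -> tuple:
    """Update description when code already exists, otherwise insert a new row."""
    updated = inserted = 0
    for code, description in rows:
        # The "lookup": find the id of an existing row with this code.
        match = conn.execute(
            "SELECT id FROM tableName WHERE code = ?", (code,)
        ).fetchone()
        if match:
            # Match output: same statement as the OLE DB command.
            conn.execute(
                "UPDATE tableName SET description = ? WHERE id = ?",
                (description, match[0]),
            )
            updated += 1
        else:
            # No-match output: the row goes straight to the destination.
            conn.execute(
                "INSERT INTO tableName (code, description) VALUES (?, ?)",
                (code, description),
            )
            inserted += 1
    return updated, inserted


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tableName"
                 " (id INTEGER PRIMARY KEY, code TEXT UNIQUE, description TEXT)")
    conn.execute("INSERT INTO tableName (code, description) VALUES ('A', 'old')")
    print(upsert_codes(conn, [("A", "new"), ("B", "first")]))
```

The UNIQUE constraint on code is an extra safeguard assumed for this sketch; it makes the database itself refuse duplicate codes even if the lookup step is bypassed.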