Pentaho Kettle: Mailing the result of a transformation - report

I have a Kettle job and a transformation.
The transformation writes the result set of a SELECT SQL statement into a CSV file.
The job picks up the result file and mails it to the user.
I need to send the mail only if the file contains any data; otherwise the result should not be mailed to the user.
Alternatively, how can I find out whether the result of a transformation is empty or not (is there any file size validator job entry available)?
I am not able to find any job entries for this kind of condition.
Thanks in advance.

You can use the Evaluate files metrics job step in the Conditions branch. Set your condition on the Advanced tab.

You can set your transformation to generate the file only if there is data, and then use the File exists? step in your main job.
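If it helps to see the check outside of Kettle, here is a minimal standalone sketch (plain Java, not a Kettle job entry) of the condition those entries apply: mail only when the CSV holds more than a header row. The file path and the assumption that the first line is a header are placeholders for illustration.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class CsvHasDataCheck {
    public static void main(String[] args) throws IOException {
        // Hypothetical path to the CSV written by the transformation.
        Path csv = Paths.get("/tmp/result.csv");

        // "Has data" here means: the file exists and contains at least one
        // row besides the header line (adjust if your CSV has no header).
        boolean hasData = false;
        if (Files.exists(csv) && Files.size(csv) > 0) {
            List<String> lines = Files.readAllLines(csv);
            hasData = lines.size() > 1;
        }

        // In the Kettle job this decision is made by the conditional job
        // entry; here we only report it.
        System.out.println(hasData ? "send mail" : "skip mail");
    }
}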

Related

Is there a way to compare a file record vs a table record without creating a new mapping in Informatica?

I'm working on a scenario where I have to compare a data record coming from a file with the data from a table, as part of a validation check before loading the data file into the staging table. I have come up with a couple of possible scenarios that involve changing the load mapping, but my team suggested making the change somewhere that is easy to notice, since this is a non-standard approach.
Is there any approach we can handle within the Workflow Manager, using any of the workflow tasks or session properties?
Create a mapping that reads the file, joins the data with the table, does the required validation, writes nothing out (use a Filter with a FALSE condition), and sets a variable to 0/1 to indicate whether the load should start.
Next, run the loading session if the validation passed.
This can be improved a bit if you want to store the validation errors in some audit table. Then you don't need a variable - the condition can refer to the $PMTargetName#numAffectedRows built-in variable. If it's more than zero - meaning there were some errors - don't start the load.
Alternatively, create a workflow with a Command task, where you write a script that pulls the data from the table using a JDBC connection, compares it with the data present in the file, and then sets a flag indicating whether to load or not.
Based on this command's output, you either run the staging workflow or not.
Use awk commands for the comparison of the data, which gives you the flexibility to compare date parts within a column.
FYR : http://www.cs.unibo.it/~renzo/doc/awk/nawkA4.pdf
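Since the command-line approach is left fairly open, here is a rough Java/JDBC sketch of the idea: pull the key column from the table, compare it against the keys in the file, and exit with 0 or 1 so the workflow can decide whether to run the load. The connection URL, credentials, table, column, and file layout are all placeholders; in practice the awk-based comparison mentioned above can replace the in-memory check.

import java.nio.file.Files;
import java.nio.file.Paths;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.Collectors;

public class ValidateFileAgainstTable {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details and object names, for illustration only.
        String url = "jdbc:oracle:thin:@//dbhost:1521/ORCL";
        String sql = "SELECT emp_id FROM staging_ref";   // hypothetical table and key column

        // Pull the key column from the table over JDBC.
        Set<String> tableKeys = new HashSet<>();
        try (Connection con = DriverManager.getConnection(url, "user", "password");
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            while (rs.next()) {
                tableKeys.add(rs.getString(1).trim());
            }
        }

        // Read the first (key) field of each line from the delimited data file.
        Set<String> fileKeys = Files.readAllLines(Paths.get(args[0])).stream()
                .map(line -> line.split(",")[0].trim())
                .collect(Collectors.toSet());

        // Flag whether every record in the file has a match in the table;
        // the exit code is what the workflow checks before starting the load.
        boolean valid = tableKeys.containsAll(fileKeys);
        System.out.println(valid ? "VALIDATION_PASSED" : "VALIDATION_FAILED");
        System.exit(valid ? 0 : 1);
    }
}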

Debatch a big input flat file into multiple smaller output files with a specific record count

I have a positional input flat file schema of the following kind.
<Employees>
<Employee>
<Data>
In the mapping, I need to extract the strings on a positional basis to pass them on to the target schema.
I have the following conditions -
If Data has 500 records, there should be 5 files of 100 records at the output location.
If Data has 522 records, there should be 6 files (5*100, 1*22 records) at the output location.
I have tried a few suggestions from the internet, like:
Setting "Allow Message Breakup At Infix Root" to "Yes" and setting maxoccurs to "100". This doesn't seem to be working. How to Debatch (Split) a Flat File using Flat File Schema?
I'm also working on a custom receive pipeline component suggested at Split Flat Files into smaller files (on row count) using Custom Pipeline but I'm quite new to this so it's taking some time.
Please let me know if there is any simpler way of doing this, without implementing the custom pipeline component.
I'm currently following the approach of dividing the input flat file into multiple small files per the condition, writing them at the receive location, and then processing the files with the native flat file disassembler. Please correct me if there is a better approach.
You have two options:
1) Import the flat file to a SQL table using SSIS.
2) Parse the input file as one Message, then map it to a Composite Operation to insert the records into a SQL table. You could also use an Insert Updategram.
After either 1 or 2, call a Stored Procedure to retrieve the Count and Order of messages you need.
A simple way to handle a flat file structure without writing custom C# code is to use a database table: insert the whole file as records into the table, and then have a Receive Location that polls for records in the batch size you want.
Another approach is the Scatter-Gather pattern. In this case you set Max Occurs to 1, which debatches the file into individual records, and you then have an Orchestration that re-assembles them into the batch size you want. You will have to read up on Correlation Sets to do this.

Add file name as column in data factory pipeline destination

I am new to Data Factory. I am loading a bunch of CSV files into a table, and I would like to capture the name of each CSV file as a new column in the destination table.
Can someone please help with how I can achieve this? Thanks in advance.
If you use a Mapping Data Flow, there is an option under the source settings to capture the file name being used, and later it can be mapped to a column in the sink.
If your destination is Azure Table storage, you could put your file name into the partition key column. Otherwise, I don't think there is a native way to do this with ADF; you may need a custom activity or a stored procedure.
One post said they could use Databricks to handle this:
Data Factory - append fields to JSON sink
Another post said they are using U-SQL to handle this:
use adf pipeline parameters as source to sink columns while mapping
For the stored procedure approach, please reference this post: Azure Data Factory mapping 2 columns in one column

How to get file's name in File watcher job in Autosys?

I am new to Autosys and have a query. I want to detect the arrival of a file, hence I am using a file watcher job with watch_file set to *. Now I want to get the file name, as I have to pass the name as a parameter to the next job. How can I get the filename? Any help is appreciated.
There is no really easy way to get the filename. In the WCC database there is a table called dbo.MON_JOB, which has a column AGENT_STATUS that will contain the filename.
You would need to filter based on the job name.
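As a rough illustration, a query along these lines (wrapped here in Java/JDBC, consistent with the other sketches in this page) could read that column for one file watcher job. The WCC connection URL, credentials, and the exact job-name column used in the filter are assumptions to adapt to your environment.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class FileWatcherFileName {
    public static void main(String[] args) throws Exception {
        // Placeholder WCC database connection details (SQL Server assumed).
        String url = "jdbc:sqlserver://wcc-host:1433;databaseName=CAAutosysWCC";

        // AGENT_STATUS on dbo.MON_JOB holds the detected file name; the exact
        // job-name column to filter on is an assumption here.
        String sql = "SELECT AGENT_STATUS FROM dbo.MON_JOB WHERE JOB_NAME = ?";

        try (Connection con = DriverManager.getConnection(url, "wccuser", "password");
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setString(1, "MY_FILE_WATCHER_JOB");   // hypothetical file watcher job name
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // Pass this value on to the next job, e.g. via a parameter or global variable.
                    System.out.println(rs.getString("AGENT_STATUS"));
                }
            }
        }
    }
}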
Hope that helps.
Dave

Help regarding BizTalk

I have a BizTalk solution. Up until now I have been able to do the following:
1) I used the SQL adapter for my source schema. I wanted node-wise XML, so I used FOR XML AUTO, ELEMENTS in my stored procedure so that it generates the schema node-wise.
2) I am able to loop through all the nodes and check a condition inside the loop with a Decide shape. The Decide shape is executing perfectly, but now the issue is that I want to insert my current XML into a table. From all the XML nodes I am getting a single node's XML like the following:
<userDetails xmlns="http://SqlRowLooping"><userID>1</userID><fName>niladri</fName><lName>Roy</lName><department>it</department></userDetails>
Now I have an updategram as well, but I think it accepts data attribute-wise; right now it is firing an error saying it can't find procedure userID.
Please help: how do I insert this into the table,
and how will the updategram work?
Thanks
Change the XML node to conform to the updategram syntax; see MSDN.
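For reference, an SQLXML updategram for a single insert wraps the row in updg:sync/updg:after blocks. A sketch for the node above might look roughly like this; the attribute-centric layout (and any mapping-schema reference you may need) are assumptions to check against the MSDN syntax.

<ROOT xmlns:updg="urn:schemas-microsoft-com:xml-updategram">
  <updg:sync>
    <updg:before/>
    <updg:after>
      <userDetails userID="1" fName="niladri" lName="Roy" department="it"/>
    </updg:after>
  </updg:sync>
</ROOT>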
