File watch in Autosys

We have a SQL table that is populated by some jobs. The table structure is:
ID  Date       Name
1   6/10/2017  sales
2   6/10/2017  marketing
3   6/17/2017  Loans
We receive files on our server from vendors weekly.
For instance, we received a file called Sales_6/10/2017.txt.
My Autosys job needs to watch for this file and look it up in the table above. Only if the date and name match a row in the table should my job run further; if it doesn't find the entry in the table, it must not run.
I am not able to figure out how to combine this file watch with the check against the table entry.

Autosys can handle the file watcher, but it cannot execute a SQL query as a condition for starting a job. You should put that functionality into a shell script that is called by Autosys when the file arrives. The shell script should perform the query and then decide whether to proceed, depending on the results.
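For illustration, here is a minimal sketch of that gate script, written in Python for brevity (a shell script calling sqlcmd or sqlplus would work the same way). The table name vendor_files, the DSN, and the filename parsing are assumptions, not from the question:

#!/usr/bin/env python3
# Gate script: exit 0 only if the arrived file matches a row in the table,
# so the Autosys command job runs the real work only on success.
import os
import sys

import pyodbc  # assumption: an ODBC driver for the database is installed

def has_matching_row(path):
    # e.g. "Sales_6/10/2017.txt" -> name "Sales", date "6/10/2017"
    stem = os.path.basename(path).rsplit(".", 1)[0]
    name, _, date = stem.partition("_")
    conn = pyodbc.connect("DSN=vendor_db;UID=batch;PWD=secret")  # hypothetical DSN
    row = conn.cursor().execute(
        "SELECT 1 FROM vendor_files WHERE Name = ? AND Date = ?", name, date
    ).fetchone()
    return row is not None

if __name__ == "__main__":
    sys.exit(0 if has_matching_row(sys.argv[1]) else 1)

The file-watcher job then triggers a command job whose command is something like gate.py $FILE && run_real_work.sh, so the real work starts only when the script exits 0.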

Related

ADF Pipeline Bulk copy activity to Log files

I have a bulk-copy template for Azure Blob Storage data transfer set up in ADF. This activity will dynamically produce 'n' number of files.
I need to write a log file (txt format) after the pipeline activity has finished.
The log file should have the pipeline start and completion datetimes, the number of files output, the status, etc.
What is the best way, or which activity should I choose, to do this?
Firstly, I have to say that ADF won't generate log files about the execution information automatically. You could look at Visually monitor and Programmatically monitor for activities in ADF.
From the above link, you can get the start time of the pipeline: Run Start. Even though it does not expose any Run End, you can calculate it yourself: Run End = Run Start + Duration.
As for the number of files, please refer to this link.
Anyway, all these metrics need to be retrieved programmatically, I think; you can choose whichever language you are good at.
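As a rough illustration, pulling those run metrics with the Python SDK (azure-mgmt-datafactory) and writing them to a txt log could look like the sketch below; the subscription, resource group, factory name, and run id are placeholders:

from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

# Placeholders - substitute your own identifiers.
client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")
run = client.pipeline_runs.get("<resource-group>", "<factory-name>", "<run-id>")

# Run End = Run Start + Duration, as described above.
run_end = run.run_start + timedelta(milliseconds=run.duration_in_ms)

with open("pipeline_run.log", "w") as log:
    log.write("start=%s end=%s status=%s\n" % (run.run_start, run_end, run.status))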

Airflow - failing a task which returns no data?

What would be the best way to fail a task based on the result of a BCP query (a command-line query against the MS SQL Server I am connecting to)?
I am downloading data from multiple tables every 30 minutes. If the data doesn't exist, the BCP command still creates a file (size 0). This makes it seem like the task was always successful, but in reality it means that there is missing data on a replication server another team is maintaining.
bcp "SELECT * FROM database.dbo.table WHERE row_date = '2016-05-28' AND interval = 0" queryout /home/var/filename.csv -t, -c -S server_ip -U user -P password
The row_date and interval would be tied to the execution date in Airflow. I would like Airflow to show a failed task instance if the query returned no data, though. Any suggestions?
Check for file size as part of the task?
Create an upstream task which reads the first couple of rows and tells Airflow whether the query was valid or not?
I would use your first suggestion and check for the file size as part of the task.
If it is not possible to do this in the same task as the query, create a new task with that specific purpose and an upstream dependency on the query task. In the case that the file is empty, just raise an exception in the task.
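A minimal sketch of that check as its own Airflow task (Airflow 1.x import path; the task id and the dag object are assumptions, the file path is the one from the question):

import os

from airflow.exceptions import AirflowException
from airflow.operators.python_operator import PythonOperator

def fail_if_empty(path):
    # bcp writes the file even when the query returns no rows,
    # so a zero-byte file means missing data upstream.
    if os.path.getsize(path) == 0:
        raise AirflowException("bcp produced an empty file: %s" % path)

check_extract = PythonOperator(
    task_id="check_extract",
    python_callable=fail_if_empty,
    op_kwargs={"path": "/home/var/filename.csv"},
    dag=dag,  # assumes an existing DAG object
)
# bcp_task >> check_extract  # run the check right after the bcp task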

Autosys Job Statistics from 3 months

I want to make a report of the start and end times of an Autosys job from the last three months.
How can I get this? Do I need to check archived history or logs?
If yes, please let me know the details.
TIA
Autosys internally uses an Oracle or Sybase database. As long as the data is available in the DB, you can fetch it using the autorep command. To get a past run time, use the -r option.
For example: autorep -J JobA -r -30
The above will give you the 30th-last run time for the job.
However, because of the performance bottlenecks that historical data can cause in the DBs, the DBAs generally purge it after a while. I have seen retention periods of 1 to 7 days, depending on the number of jobs and the power of the database instance.
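If the history is still in the database, one way to sweep the runs for a report is simply to loop over the -r offsets; a rough sketch (the job name and range are examples):

import subprocess

# Print the report line for each of the last 90 runs of JobA; offsets
# older than the DB retention window just come back empty.
for n in range(1, 91):
    result = subprocess.run(
        ["autorep", "-J", "JobA", "-r", str(-n)],
        capture_output=True, text=True,
    )
    print(result.stdout, end="")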
Another, approximate, way would be to use the log files created by Autosys if the std_out_file attribute is specified with unique filenames.
For example, you can set the attribute as: std_out_file: $JOB_NAME.out.`date +%m.%s`
In this case the log file is created as soon as the job starts, so you can get the start time from the filename using text functions on Unix, etc.
For the end time, you can use the last-modified time - this is where the approximation comes in, as the accuracy depends on whether your job wrote anything to the log file at the end. It can be close or far off depending on what the script's commands do.
This method will not tell you the times for box jobs, as they never have a log attribute; for those you can rely on the first job in the box.
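A sketch of reading the approximate times back from such a log file (the path and name pattern are hypothetical):

import os
from datetime import datetime

log = "/autosys/logs/JobA.out.06.17"  # hypothetical file from the std_out_file pattern

# The stamp embedded in the filename approximates the start time;
# the last-modified timestamp approximates the end time.
start_stamp = log.split(".out.", 1)[1]
end_approx = datetime.fromtimestamp(os.stat(log).st_mtime)
print(start_stamp, end_approx)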

Automatically fetch data every 10 minutes (Simple HTML DOM)

I'm working on a project where I want to fetch last-minute flights and then save them into my database. The problem is that I don't want to scrape every time a user visits the website and then save into my database, because that will only cause a lot of duplicates. Can I somehow make the website fetch the data for me on a schedule and then delete the previous records in the database?
If you want the OS to execute a task periodically, cron job is what you want.
Either get the cron job to call your program via the command line, or use wget to fetch the page that would trigger the data fetching.
More on cron jobs:
http://www.thesitewizard.com/general/set-cron-job.shtml
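For example, a crontab entry that runs every 10 minutes (the script path and URL are placeholders):

# Run the fetch script directly every 10 minutes...
*/10 * * * * /usr/bin/php /path/to/fetch_flights.php
# ...or fetch the page that triggers it:
# */10 * * * * wget -q -O /dev/null http://example.com/fetch_flights.php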

How to check for existence of Unix System Services files

I'm running batch Java on an IBM mainframe under JZOS. The job creates 0-6 ".txt" outputs depending upon what it finds in the database. Then I need to convert those files from Unix to MVS (EBCDIC), and I'm using the OCOPY command running under IKJEFT01. However, when a particular output was not created, I get a JCL error and the job ends. I'd like to check for the presence or absence of each file name and set a condition code to control whether the IKJEFT01 steps are executed, but I don't know what to use that will access the Unix file pathnames.
I have resolved this issue by writing a COBOL program to check the converted MVS files and set return codes to control the execution of subsequent JCL steps. The completed job is now undergoing user acceptance testing. Perhaps it sounds like a kludge, but it does work and I'm happy to share this solution.
The simplest way to do this in JCL is to use BPXBATCH as follows:
//EXIST EXEC PGM=BPXBATCH,
// PARM='PGM /bin/cat /full/path/to/USS/file.txt'
//*
// IF (EXIST.RC = 0) THEN
//* do whatever you need to
// ENDIF
If the file exists, the step ends with CC 0 and the IF succeeds. If the file does not exist, you get a non-zero CC (256, I believe), and the IF fails.
Since there is no //STDOUT DD statement, there's no output written to JES.
The only drawback is that it is another job step, and if you have a lot of procs (like a compile/assemble job), you can run into the 255 step limit.
