Trigger mainframe job from Airflow - airflow

Does Airflow support mainframe jobs? Can we schedule mainframe jobs using Airflow?
Thanks in advance.

I do not know Airflow specifically, but we have used Ansible, Jenkins, and IBM UrbanCode Deploy for orchestration that includes distributed and mainframe process parts.
You can SSH into z/OS and use Bash, Python, cURL, Node.js, or Groovy. You can submit JCL via REST APIs (a sketch follows at the end of this answer). There is a command line processor for Db2 to execute SQL and stored procedures from a Bash terminal, and there is the new Zowe CLI that brings a modern command line interface to z/OS.
I would ask the question: what is the nature of what you want to schedule? What language is it written in, or what language do you want it written in? If something exists today, what is the process and how is it scheduled today?
While I haven't used Airflow, you can use modern interfaces to do things on z/OS, and frequently that is what is actually needed to integrate with orchestration tools.
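For the "submit JCL via REST APIs" route, a minimal Python sketch using the requests library against the z/OSMF REST jobs interface might look like the following. The host, credentials, and JCL are placeholders; verify the endpoint and required headers against your z/OSMF level before relying on it.
# Hedged sketch: submit a job to JES through the z/OSMF REST jobs interface.
# Host, credentials, and the JCL text below are placeholders.
import requests

ZOSMF_JOBS_URL = "https://zosmf.example.com/zosmf/restjobs/jobs"  # hypothetical host

jcl = """//MYJOB    JOB (ACCT),'SAMPLE',CLASS=A,MSGCLASS=H
//STEP1    EXEC PGM=IEFBR14
"""

resp = requests.put(
    ZOSMF_JOBS_URL,
    data=jcl,
    headers={
        "Content-Type": "text/plain",
        "X-CSRF-ZOSMF-HEADER": "",  # z/OSMF expects this CSRF header on REST calls
    },
    auth=("ibmuser", "secret"),     # placeholder credentials
    verify=True,                    # point this at your CA bundle if needed
)
resp.raise_for_status()
print(resp.json().get("jobid"))     # JES job id returned by z/OSMF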

Elaborating on Patrick Bossman's good summary, Apache Airflow definitely supports SSH connections to run commands and/or transfer files:
https://airflow.apache.org/howto/connection/ssh.html
z/OS includes OpenSSH as a standard, IBM supported feature in the base operating system at no additional charge, although it's possible it's not running in your particular z/OS installation. Dovetailed Technologies has published a helpful "Quick Install Guide" that explains how to configure and start OpenSSH on z/OS if it isn't configured already:
http://dovetail.com/docs/pt-quick-inst/pt-quick-inst-doc.pdf
Their reference points to IBM's official z/OS documentation if you need more information.
You may decide to have other connections to z/OS from Apache Airflow, but SSH is certainly an available option.
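As a concrete illustration, a minimal Airflow DAG that runs a command on z/OS over SSH could look roughly like the sketch below. The connection id "zos_ssh" and the submitted command are assumptions, and the SSHOperator import path differs by release (airflow.contrib.operators.ssh_operator on older 1.10.x installs, the SSH provider package on Airflow 2.x).
# Hedged sketch of an Airflow DAG that submits a batch job on z/OS over SSH.
# "zos_ssh" is a placeholder connection id; the command and JCL path are examples.
from datetime import datetime

from airflow import DAG
from airflow.providers.ssh.operators.ssh import SSHOperator  # Airflow 2.x provider path

with DAG(
    dag_id="zos_batch_job",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    submit_jcl = SSHOperator(
        task_id="submit_jcl",
        ssh_conn_id="zos_ssh",                      # SSH connection defined in Airflow
        command="submit /u/ibmuser/jcl/myjob.jcl",  # placeholder z/OS UNIX command
    )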
FYI, it appears possible to run Apache Airflow directly on z/OS 2.4 itself. I haven't personally tried it, but it looks good to go. The recipe to do that would be as follows:
Configure and fire up the z/OS Container Extensions ("zCX"), a standard, included, IBM supported, no additional charge feature in z/OS 2.4 that's compatible with IBM z14 and higher model IBM Z machines.
Install and run a Python container (Docker/OCI format) on zCX, for example a Python container from DockerHub. You'll need a Python container image that includes "s390x" architecture support, either on its own or in a multi-architecture container. (No problem with DockerHub's image.)
Use pip to install Apache Airflow within your Python container, per normal.
Configure your SSH (and perhaps other) connection(s) from Airflow to the rest of z/OS, as described above.
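For that last step, one hedged way to define the SSH connection without going through the web UI is to register it programmatically; the host and credentials below are placeholders, and you could equally use the airflow connections CLI or Airflow's AIRFLOW_CONN_... environment variable convention.
# Hedged sketch: register an SSH connection in the Airflow metadata database.
# Host, user, and password are placeholders.
from airflow import settings
from airflow.models import Connection

conn = Connection(
    conn_id="zos_ssh",
    conn_type="ssh",
    host="zos.example.com",
    login="ibmuser",
    password="secret",
    port=22,
)

session = settings.Session()
session.add(conn)
session.commit()
session.close()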
You can also run Apache Airflow on Linux on Z/LinuxONE, either on the same IBM Z machine where z/OS runs or on a different machine. You can test Apache Airflow using the free (for up to 120 days) IBM LinuxONE Community Cloud, and you could even create your own custom Docker/OCI container on the LinuxONE Community Cloud for deployment to zCX.
It might even be possible to run Airflow on Python for z/OS, without zCX, although if so there'd be some more work involved. Python for z/OS is available from Rocket Software here:
https://www.rocketsoftware.com/product-categories/mainframe/python-for-zos

Related

Remote execution with Hydra on a single node with multiple GPUs

I am looking into documentation for running Hydra on a single node remotely. I am looking for a way to take code that lives on my local machine and run it on a GCP instance.
Any pointers?
It sounds like you are looking for a Hydra Launcher that supports GCP.
For now, Hydra does not support this. We do have a Ray Launcher that launches to AWS and could be further extended to launch on GCP. Feel free to subscribe to this issue.
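For reference, the Ray Launcher mentioned above comes from the hydra-ray-launcher plugin; with it installed, a minimal app like the sketch below can be sent to AWS by selecting the launcher on the command line. The launcher name and override syntax are assumptions to check against the plugin version you install.
# Hedged sketch: a minimal Hydra app. With the hydra-ray-launcher plugin
# installed, the same script can be launched on AWS with a launcher override.
import hydra
from omegaconf import DictConfig

@hydra.main(config_path=None)
def main(cfg: DictConfig) -> None:
    print(cfg)

if __name__ == "__main__":
    main()

# Example multirun invocation with the Ray AWS launcher (assumed launcher name):
#   python my_app.py --multirun hydra/launcher=ray_aws +param=1,2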

use julia language without internet connection (mirror?)

Problem:
I would like to make Julia available for our developers on our corporate network, which has no internet access at all (no proxy), due to sensitive data.
As far as I understand, Julia is designed to use GitHub.
For instance, julia> Pkg.init() tries to access:
git://github.com/JuliaLang/METADATA.jl
Example:
I solved this problem for R by creating a local CRAN repository (rsync) and setting up a local webserver.
I also solved this problem for python the same way by creating a local PyPi repository (bandersnatch) + webserver.
Question:
Is there a way to create a local repository for metadata and packages for julia?
Thank you in advance.
Roman
Yes, one of the benefits from using the Julia package manager is that you should be able to fork METADATA and host it anywhere you'd like (and keep a branch where you can actually check new packages before allowing your clients to update). You might be one of the first people to actually set up such a system, so expect that you will need to submit some issues (or better yet; pull requests) in order to get everything working smoothly.
See the extra arguments to Pkg.init() where you specify the METADATA repo URL.
If you want a simpler solution to manage, I would also think about having a two tier setup where you install packages on one system (connected to the internet), and then copy the resulting ~/.julia directory to the restricted system. If the packages you use have binary dependencies, you might run into problems if you don't have similar systems on both sides, or if some of the dependencies are installed globally, but Pkg.build("Pkgname") might be helpful.
This is how I solved it (for now), using the second suggestion by ivarne. I use a two-tier setup: two networks, one connected to the internet (office network), one air-gapped (development network).
System information: openSuSE-13.1 (both networks), julia-0.3.5 (both networks)
Tier one (office network)
installed julia on an NFS share, /sharename/local/julia.
soft linked /sharename/local/bin/julia to /sharename/local/julia/bin/julia
appended /sharename/local/bin/ to $PATH using a script in /etc/profile.d/scriptname.sh
created /etc/gitconfig on all office network machines: [url "https://"] insteadOf = git:// (to solve proxy server problems with github)
now every user on the office network can simply run # julia
Pkg.add("PackageName") is then used to install various packages.
The two networks are connected periodically (with certain security measures ssh, firewall, routing) for automated data exchange for a short period of time.
Tier two (development network)
installed Julia on an NFS share identical to tier one.
When the networks are connected I use a shell script with rsync -avz --delete to synchronize the .julia directory of tier one to tier two for every user.
Conclusion (so far):
It seems to work reasonably well.
As ivarne suggested, there are problems if a package is installed AND something more than just file copying (compilation?) is done on tier one; the package won't run on tier two. But this can be resolved with Pkg.build("Pkgname").
PackageCompiler.jl seems like the best tool for using modern Julia (v1.8) on secure systems. The following approach requires a build server with the same architecture as the deployment server, something your institution probably already uses for developing containers, etc.
Build a sysimage with PackageCompiler's create_sysimage()
Upload the build (sysimage and depot) along with the Julia binaries to the secure system
Alias a script to julia, similar to the following example:
#!/bin/bash
set -Eeu -o pipefail
unset JULIA_LOAD_PATH
export JULIA_PROJECT=/Path/To/Project
export JULIA_DEPOT_PATH=/Path/To/Depot
export JULIA_PKG_OFFLINE=true
/Path/To/julia -J/Path/To/sysimage.so "$@"
I've been able to run a research pipeline on my institution's secure system, for which there is a public version of the approach.

Deploying an ASP.NET web site to a remote VPS with Jenkins

I am just starting to get my head wrapped around continuous deployment with Jenkins, but I am running into some roadblocks and I haven't really found very many good, definitive resources on the topic in regards to ASP.NET applications.
I have set up a local build server that successfully pulls down code from an SVN repo and builds it OK with MSBuild. This works well so far, but now I'd like to automate pushing this compiled code to a development server.
My problem is this: from what I gather based on what I read (which may be an incorrect assumption...), the staging server is typically within the same network as the build server, meaning you can share network resources, servers, etc.
In my case, I want to run the Jenkins server on a remote VPS, then deploy to other remote VPSes (so, essentially individual isolated machines communicating with each other).
I have seen a lot of terms, but I am very new to sys admin / DevOps type skills.
So, my question is this:
Is it even possible, using Jenkins on a VPS, to deploy to any particular server I choose? (I have full access to all of them, so if it's a security thing, I can fix that... but they are not within the same network/domain.)
What is the method to achieve this? I've seen xcopy, Web Deployment Packages (msdeploy), batch scripts, etc. mentioned, but not really any guidance on what to use in which situations. Are any of these methods useful for achieving my goal?
Thanks for any help or guidance!
How is your PowerShell? ;) You should check out psake.
psake is a build automation tool written in PowerShell. It avoids the angle-bracket tax associated with executable XML by leveraging the PowerShell syntax in your build scripts. psake has a syntax inspired by rake (aka make in Ruby) and bake (aka make in Boo), but is easier to script because it leverages your existent command-line knowledge. psake is pronounced sake, as in Japanese rice wine. It does NOT rhyme with make, bake, or rake.
You can deploy your files to the target server through SSH. Jenkins does support transfers over SSH. All you need to do is set up an SSH server (e.g. CopSSH) and a user account with admin permissions, and configure Jenkins to transfer over SSH.
Create host configurations in the main Jenkins configuration
Add an SSH Server
Add the public key to the remote server (the build server)
Click "Test Configuration"
Save
Configure a job to Publish Over SSH (Post Build Action)
Add Transfer Set.
Refer to Publish Over SSH for more details. A rough sketch of what the transfer step amounts to is shown below.
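If you want to see what the plugin's transfer step effectively does outside of Jenkins, here is a hedged Python/paramiko sketch of the same idea: copy the MSBuild output to the remote VPS over SFTP and then run a post-deploy command. The host, credentials, paths, and remote command are all placeholders.
# Hedged sketch of what a "Publish Over SSH" post-build step effectively does:
# push the build output to the remote VPS and run a post-deploy command.
import os
import paramiko

client = paramiko.SSHClient()
client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
client.connect(
    "deploy.example.com",                                  # placeholder VPS
    username="deploy",
    key_filename=os.path.expanduser("~/.ssh/id_rsa"),
)

local_dir = r"C:\builds\MySite\_PublishedWebsites\MySite"  # MSBuild output (placeholder)
remote_dir = "/cygdrive/c/inetpub/wwwroot/mysite"          # placeholder path as exposed by CopSSH

sftp = client.open_sftp()
for name in os.listdir(local_dir):
    sftp.put(os.path.join(local_dir, name), remote_dir + "/" + name)  # flat copy, no recursion
sftp.close()

# Placeholder post-deploy step on the remote side (recycle the app pool, warm up the site, ...)
stdin, stdout, stderr = client.exec_command("echo deployed")
print(stdout.read().decode())
client.close()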

How to create script to deploy asp.net application direct from clearcase?

I am trying to write a script to deploy an ASP.NET application from ClearCase. I am using the ClearCase Remote Client.
How will I start? What is the easiest way?
CCRC is for accessing code from a "web" ClearCase snapshot view.
Being a light ClearCase installation, you:
won't have all the cleartool commands which would allow you to detect new content (new versions of files) to be updated
won't have the easy integration you could have with TeamCity, Jenkins, or Hudson, since they all rely on the cleartool command.
TeamCity, for instance, still has a pending ticket on CCRC support.
For you, since you don't want/need to use those schedulers anyway, you can start by using the CCRC CLI (rcleartool) in order to:
update your ccweb view
check if the update has gotten any new versions
deploy your app if it has gotten anything new.
rcleartool update [-username user-name][-ser/ver server-url][-pas/sword user-password]
[-print] [-ove/rwrite | -nove/rwrite | -ren/ame]
[pname ...]
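Wrapped in a small script, those three steps might look roughly like the Python sketch below. The view path, the build and deploy commands, and the way new versions are detected from rcleartool's output are all assumptions to adapt.
# Hedged sketch: update a ccweb view with rcleartool and deploy only if the
# update pulled anything new. Paths, commands, and output parsing are assumptions.
import subprocess

VIEW_PATH = "/views/my_ccweb_view"   # placeholder ccweb view

result = subprocess.run(
    ["rcleartool", "update", VIEW_PATH],
    capture_output=True, text=True, check=True,
)

# Assumption: treat any "Loading"/"Updating" line in the output as a new version.
changed = any(
    line.startswith(("Loading", "Updating"))
    for line in result.stdout.splitlines()
)

if changed:
    subprocess.run(["msbuild", "MySite.sln", "/t:Build"], check=True)                # placeholder build
    subprocess.run(["robocopy", "bin", r"\\webserver\site", "/MIR"], check=False)    # robocopy exit codes below 8 mean success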
Jenkins currently follows a similar path in its plan for CCRC support: ticket 5192
(and neither Jenkins nor Hudson support CCRC yet)
I'm thinking about which is better: calling rcleartool as an external tool, or developing a TeamAPI (or, as they call it now, CM API) based pure Java extension.
More details on this IBM article:
"Continuous integration with IBM Rational ClearCase Remote Client"
In this general architecture schema for CI with CCRC, my suggestion above (rcleartool update) is illustrated by the link between the CM server and the build server.
Personally, I'd start by not re-inventing the wheel. TeamCity is one such product that can do what you're asking about:
http://www.jetbrains.com/teamcity/

Alternative tools for Amazon EC2?

Amazon's official tools for interacting with EC2 are kind of clunky and a pain to deal with. I have to set up a bunch of environment variables, store separate private keys just for EC2, add extra items to my PATH, and so on. They all output tab delimited lines that are hundreds of characters long with no headings, so it's a bit of a pain to interpret them. Their instructions for setting up an SSH keypair give you one that isn't protected by a passphrase, rather than letting you use an existing keypair that you already have. The programs are all just a bit clunky and aren't very good Unix programs.
So, are there any easier to use command line tools for accessing EC2? I know there is ElasticFox, and there is their web based console, which do make the process easier, but I'm wondering if anyone else has written better command line tools for interacting with EC2.
I'm a bit late but I have a solution!
I found the same problems with the Amazon AMI tools. They're a decent reference implementation but very difficult to use, particularly when you have more than a couple of instances. I wrote a replacement command-line tool as part of another project, called Rudy, that answers most of your concerns.
The commands are more intuitive than Amazon's AMI tools:
rudy-ec2 instances -C
rudy-ec2 groups -A -p 8080 -a 11.22.33.44 group-name
rudy-ec2 volumes -C -s 100
rudy-ec2 images
...
All configuration is in a single file (~/.rudy/config).
It can output in several formats (yaml, json, csv, tsv, and of course regular text):
rudy-ec2 -f yaml snapshots
---
:awsid: snap-2457b24d
:progress: 100%
:created: "2009-05-08T15:24:17.000Z"
:volid: vol-4ee10427
:status: completed
Regarding the private keys: there are no EC2 tools that allow you to create private keys with a passphrase for booting a public instance, because the API doesn't support it. However, if you create your own image, you can use your own private keys.
Here's more info:
GitHub Project
An introduction to rudy-ec2
ElasticFox is handy for most tasks. There are occasions, though, when a command line tool will be better suited. I personally use the boto library for Python. It is very easy to script all the required operations. You can also use it to upload/download files from S3. In general, I would say that a scripting language like Python or Ruby, together with an AWS library, is the best solution.
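For example, a short sketch with boto3 (the current incarnation of the boto library) that lists instances in a region and stops one; the region and instance id are placeholders, and credentials come from the usual AWS configuration.
# Hedged sketch with boto3: list EC2 instances in a region and stop one.
# Region and instance id are placeholders.
import boto3

ec2 = boto3.resource("ec2", region_name="us-east-1")

for inst in ec2.instances.all():
    print(inst.id, inst.state["Name"], inst.public_dns_name)

ec2.Instance("i-0123456789abcdef0").stop()   # placeholder instance id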
I personally use Tim Kay's Perl command line tools and haven't used the original Java based API for quite some time. Excellent for a UNIX environment.
Not command line, but take a look at what a free RightScale account will give you - much, much easier than command line or ElasticFox IMO.
About ec2-api-tools:
I agree that they are a bit too clunky; I particularly dislike the output of ec2-describe-instances.
I recently switched to python-boto which offers a very clean and easy to use interface to ec2.
About not being able to specify a passphrase for the ssh key generated by EC2:
That's not the case. You can change the passphrase of any ssh private key anytime, using:
ssh-keygen -p -f /path/to/keyfile
E.g.
ssh-keygen -p -f ~/.ssh/id_rsa
About uploading your own ssh key pair:
You can use ec2-import-keypair, like this:
for i in $(ec2-describe-regions|cut -f 2);do
ec2-import-keypair --region $i mykey --public-key-file ~/.ssh/id_rsa.pub
done
The example above will upload the public key in ~/.ssh/id_rsa.pub to every region under the name "mykey". Remember that each region has its own keypair.
In order to have the key installed in your ec2 instances, you'll have to pass the "-k mykey" option to ec2-run-instances.
Incidentally, uploading your own keypair is the only way to login with the same key to all the instances in all regions. If you create the keypair from the web interface, you'll have a different key in every region.
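The same "one key in every region" idea as the loop above, sketched with boto3 for comparison; the key name and path are placeholders, and credentials come from your AWS configuration.
# Hedged sketch: upload one SSH public key to every region, mirroring the
# ec2-import-keypair loop above. Key name and path are placeholders.
from pathlib import Path
import boto3

public_key = Path.home().joinpath(".ssh", "id_rsa.pub").read_bytes()

regions = [r["RegionName"] for r in boto3.client("ec2").describe_regions()["Regions"]]
for region in regions:
    boto3.client("ec2", region_name=region).import_key_pair(
        KeyName="mykey",                  # same placeholder name as above
        PublicKeyMaterial=public_key,
    )
    print("imported mykey into", region)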
I have an open source graphical system admin tool called EC2Dream that replaces the command line tools. It installs on Windows, Linux and Mac OS clients and is written in Ruby and FXRuby. See www.ec2dream.com.
If you use Windows, try the tool linked below (part of the O2 Platform), which gives you an easy way to start and stop Amazon EC2 images. If you need to extend the tool, you can easily add new features, since it is just a C# script that is dynamically compiled and executed.
O2 Tool – Amazon EC2 Browser
Amazon EC2 Browser – Timer to Stop Instances
The problem with alternative libraries is that they are not always kept up to date, so if new features for AWS are released, then you need to wait. You posted that your main problems are the bunch of environment variables, extra items to add to your PATH, etc. We had this issue at BitNami, and it is the main reason we created BitNami Cloud Tools, which ships all of the AWS command line tools together with preconfigured Java and Ruby language runtimes. You only have to download it and everything that you need will be installed in a folder without modifying your system configuration. We keep it regularly up to date.
There is an entire industry called cloud management which tries to solve this type of problem. Scalr and RightScale are the leaders in this sector (disclaimer: I work at Scalr).
Cloud management software is built on top of the Amazon EC2 API (and usually on other public IaaS like Rackspace) and provides an improved user interface along with automation tools like backups or SSH management, as you mentioned. They don't provide easier command line tools stricto sensu; their goal is to make interaction with Amazon EC2 easier.
Different options are available in the market:
Scalr: Scalr is available as a hosted service with a trial version.
Otherwise you can download and install the source code yourself, as it is released under the Apache 2 license.
RightScale: while they are usually considered as expensive for small businesses, they do offer a free account.
enStratus: they offer a freemium model like RightScale.
