Oozie dataset template URI using wildcard

I am trying to define a dependency to trigger my Oozie workflow in my coordinator. My source file path is dynamic and cannot be known before it appears, which is why I want to match the dependency against a certain pattern. For example, the file can be hdfs://path/to/file/a01b/a.parquet or hdfs://path/to/file/c01d/a.parquet. I want to match this file with
hdfs://path/to/file/*01*/
in the <uri-template> of the Oozie coordinator dataset, but it seems Oozie cannot recognise such a wildcard pattern.
Any idea how to achieve this?

Related

Azure DevOps - Limiting scope of IISWebAppDeploymentOnMachineGroup#0 task XmlVariableSubstitution

I'm working on improving security in a legacy ASP.NET application. One issue identified was the use of hard-coded database connection strings in web.config.
To resolve this, I've moved the connection details to secret variables in Azure DevOps variable groups.
The variable substitution is done in the IISWebAppDeploymentOnMachineGroup#0 task, by setting XmlVariableSubstitution.
This works fine. However, I'm a bit concerned about how broadly this applies. This task performs substitutions across all config files in the application, matching any element in appSettings, connectionStrings or configSections, based on key or name, against all pipeline variables.
If at some stage someone added a variable to the variable groups that happens to match a key of any appSettings entry across the whole application, its value would be unintentionally and silently substituted.
I'd like to somehow limit the scope of the substitution task, to ensure it only applies where we need it to.
Is anyone aware of any way to do this?
When you use the XML variable substitution option in the IISWebAppDeploymentOnMachineGroup task, it loops over all config files by default.
I am afraid there is no way to limit the scope of the XML variable substitution action in the IISWebAppDeploymentOnMachineGroup task.
As a workaround, you can add the File Transform task to update the variables in the config file. It supports defining the target files in the task.
For example:
- task: FileTransform#1
  displayName: 'File Transform: '
  inputs:
    fileType: xml
    targetFiles: web.config
Alternatively, you can use the RegEx Match & Replace task from the RegEx Match & Replace extension. It supports defining the target variable and target file in the task. Refer to my previous ticket: RegExMatchReplace task

Can I pass data between multiple steps in a salt orchestrator?

I have a list of states for different targets that I am orchestrating.
I am using salt orchestration for this.
I want to apply a state on one target.
This process at one point generates an ID on the target.
I can easily query that ID on that target.
I now need to apply a different state to a different target, that requires that ID.
My question is: how can I share data from one minion with the next orchestration step as a Jinja variable (or similar)?
Details:
Salt creates a resource on a minion (Rancher Management k8s cluster).
The resource is assigned a random ID by the minion (Rancher).
I can query that ID with a kubectl command using cmd.run.
I would like to pass the result on to the next step in the orchestration, preferably accessing it as a {{ jinja variable }}. This step is executed on a different minion.
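For concreteness, a rough sketch in Python of the hand-off being described, using Salt's LocalClient API rather than an orchestration SLS; the minion IDs, the kubectl command and the state name are placeholders, not taken from any actual setup:

# Hypothetical sketch: fetch the ID from one minion, then hand it to a state
# on another minion as pillar data (readable there as {{ pillar['rancher_id'] }}).
from salt.client import LocalClient

client = LocalClient()

# Step 1: run the kubectl query on the minion that hosts the Rancher cluster
out = client.cmd(
    "rancher-minion",
    "cmd.run",
    ["kubectl get clusters.management.cattle.io -o jsonpath='{.items[0].metadata.name}'"],
)
rancher_id = out["rancher-minion"]

# Step 2: apply a state on the second minion, passing the ID along as pillar data
client.cmd(
    "other-minion",
    "state.apply",
    ["use_rancher_id"],
    kwarg={"pillar": {"rancher_id": rancher_id}},
)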

How to load an Autosys profile in JIL

I have an Autosys job which has a profile value specified in the JIL. I am very new to Autosys and I need to check the logs of the job. When I go to the host, it says that the log location specified in the JIL (like $AUTLOG/errorlogs) does not exist. Do I have to load some profile on the host? How can I do this so that I am able to access the variables in the profile?
While defining the JIL for the job, add the following attribute:
profile:/path/profile.ini
All the variables defined in the /path/profile.ini file will be read.
Remember to export the variables inside the profile.ini file, e.g.:
export Var1="some_path"
export Var2="DB_USER"
Good luck!

Cloudera Post deployment config updates

In Cloudera, is there a way to update a list of configurations at once using the CM API or curl?
Currently I am updating them one by one using the CM API call below.
services_api_instance.update_service_config()
How can we update all configurations stored in a JSON/config file at once?
The CM API endpoint you're looking for is PUT /cm/deployment. From the CM API documentation:
Apply the supplied deployment description to the system. This will create the clusters, services, hosts and other objects specified in the argument. This call does not allow for any merge conflicts. If an entity already exists in the system, this call will fail. You can request, however, that all entities in the system are deleted before instantiating the new ones.
This basically allows you to configure all your services with one call rather than doing them one at a time.
If you are using services that require a database (Hive, Hue, Oozie, ...), make sure you set them up before you call the API. It expects all the parameters you pass in to work, so external dependencies must be resolved first.
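As a rough illustration, here is a minimal Python sketch of pushing a deployment description to that endpoint with plain REST calls; the host, credentials, API version and file name are placeholder assumptions:

# Minimal sketch: apply a full deployment description via PUT /cm/deployment.
# Host, credentials, API version and deployment.json are assumptions.
import json
import requests

CM_URL = "http://cm-host.example.com:7180/api/v19"
AUTH = ("admin", "admin")

with open("deployment.json") as f:
    deployment = json.load(f)

# deleteCurrentDeployment=true asks CM to drop existing entities first,
# since the call does not allow merge conflicts (see the quoted docs above).
resp = requests.put(
    f"{CM_URL}/cm/deployment",
    params={"deleteCurrentDeployment": "true"},
    json=deployment,
    auth=AUTH,
)
resp.raise_for_status()
print("Deployment applied:", resp.status_code)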

Can we configure an Oozie coordinator which will trigger when data is available in two different directories?

I would like to create an Oozie coordinator that depends on five different input folders; once data is available in these folders, it should trigger the job. Is this possible?
Yes. You can do this by calling a UDF jar, which will check the input locations defined in the mentioned configuration and exit with an exception if the list of input locations is empty.
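A rough sketch of that pre-check idea, written in Python with the hdfs CLI rather than a Java UDF jar; the directory list is hypothetical:

# Fail fast if any of the expected input directories is missing, so the
# wrapping action errors out and the job does not proceed.
import subprocess
import sys

INPUT_DIRS = [
    "hdfs://nameservice/data/in1",
    "hdfs://nameservice/data/in2",
    "hdfs://nameservice/data/in3",
    "hdfs://nameservice/data/in4",
    "hdfs://nameservice/data/in5",
]

missing = []
for path in INPUT_DIRS:
    # `hdfs dfs -test -d` exits non-zero when the directory does not exist
    if subprocess.run(["hdfs", "dfs", "-test", "-d", path]).returncode != 0:
        missing.append(path)

if missing:
    sys.exit(f"Missing input directories: {missing}")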
