Is there a way to use config-default.xml globally in Oozie? - oozie

From the documentation, config-default.xml must be present in the workflow workspace:
- /workflow.xml
- /config-default.xml
- /lib/ (*.jar; *.so)
The problem
I've created a custom Oozie action and am trying to add default values for retry-max and retry-interval to all the custom actions, so my workflow.xml will look like this:
<workflow-app xmlns="uri:oozie:workflow:0.3" name="wf-name">
<action name="custom-action" retry-max="${default_retry_max}" retry-interval="${default_retry_interval}">
</action>
The config-default.xml file contains the values of default_retry_max and default_retry_interval.
What I've tried
Putting config-default.xml in every workflow workspace. This works, but the problem is that the same file ends up duplicated everywhere.
Setting oozie.service.LiteWorkflowStoreService.user.retry.max and oozie.service.LiteWorkflowStoreService.user.retry.interval also works, but it would affect all action types.
I've also looked at Global Configurations, but that doesn't solve this problem.
I think there should be a way to put config-default.xml in oozie.libpath so that only the workflows that use this libpath are affected.

AFAIK, there is unfortunately no clean way to do it.
You might be interested in this recently created feature request: https://issues.apache.org/jira/browse/OOZIE-3179
The only thing that worked for me was to begin the workflow with a shell step that runs a script stored in HDFS. This script holds the centralized configuration and would look like this:
#!/bin/sh
echo "oozie.use.system.libpath=true"
echo "hbase_zookeeper_quorum=localhost"
# ... other system or custom variables ...
Yes, the script simply prints the variables to stdout.
Let's say the shell step action is called "global_config". All subsequent steps can read the variables using the following syntax:
${wf:actionData('global_config')['hbase_zookeeper_quorum']}
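For completeness, a minimal sketch of what the "global_config" step itself could look like (the HDFS script path and the next-step/fail transitions are placeholders, and the shell-action schema version may differ in your setup; note that <capture-output/> is what makes the echoed values visible to wf:actionData):
<action name="global_config">
    <shell xmlns="uri:oozie:shell-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <exec>global_config.sh</exec>
        <!-- ship the script from HDFS alongside the action -->
        <file>${nameNode}/scripts/global_config.sh#global_config.sh</file>
        <capture-output/>
    </shell>
    <ok to="next-step"/>
    <error to="fail"/>
</action>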
HTH...

Related

How to change reports or output saving location in robot framework from RIDE

When I run test cases from RIDE, the reports are saved in the path below:
C:\Windows\Temp\RIDExf4xla.d
I want to save reports in a specific path. Can I do this from RIDE? Is there any setting to change the report location?
Can anyone please suggest a way to do it?
Thanks
Look at the --outputdir option in the Robot Framework documentation.
Here is what I use:
--outputdir C:/Robot/AutomationLogs/etc/etc --timestampoutputs
Put this one-liner in the "Arguments" field, right at the top of RIDE within the Run tab.
From Waman's comment, you can add format specifiers to the end of the argument to change the directory name dynamically as well; see the second answer within that SO question. This should be enough to get what you're asking for.
There is no way to set this within the UI; just set it by pasting that argument option into the "Arguments" field at the top.
Use the -d option on the command line:
C:\Tests\> robot -d C:\Test_results Test.robot

How does one create optional command line arguments in oozie workflow xml

Please bear in mind that I'm a complete rookie with Oozie. I know that one can specify command line arguments in the Oozie workflow XML by using the arg tag. I wondered how it is possible to specify an optional command line argument, such that Oozie will not complain that a required parameter is missing if the user doesn't specify it?
Many thanks in advance. If the information I've given is not specific enough, I can provide a concrete example when I log into my work machine tomorrow. We use Apache Commons CLI to parse the options.
E.g. I want to make the following argument optional:
-e${endDateTime}
In your workflow wherever you would use ${myparam}, replace it with ${firstNotNull(wf:conf('myparam'), 'mydefaultvalue')}
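Applied to the -e example from the question, the arg element could look something like this (the empty-string fallback is just an illustration; use whatever default makes sense for your CLI parser):
<arg>-e${firstNotNull(wf:conf('endDateTime'), '')}</arg>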
In theory you should be able to use a "config-default.xml" file next to your "workflow.xml" file to give default values to the params in the workflow (see https://oozie.apache.org/docs/3.2.0-incubating/WorkflowFunctionalSpec.html) but I couldn't get it working.

How to get and set the default output directory in Robot Framework(Ride) in Run time

I would like to move all my output files to a custom location, into a Run directory created based on date and time at run time. The output folder, named by datetime, is created in the test setup.
I have a function, "Process_Output_files", which moves the files to the Run folder (Run1, Run2, Run3 folders).
I have tried using the -d argument and calling "Process_Output_files" as suite teardown to move the output files to the respective Run directory, but I get the following error: "The process cannot access the file because it is being used by another process". I know this is because Robot Framework (RIDE) is still using the files.
If I don't use the -d argument, the output files get saved in temp folders:
c:\users\<user>\appdata\local\temp\RIDEfmbr9x.d\output.xml
c:\users\<user>\appdata\local\temp\RIDEfmbr9x.d\log.html
c:\users\<user>\appdata\local\temp\RIDEfmbr9x.d\report.html
My question is: is there a way to move the files to a custom location at run time, within Robot Framework?
You can use the following syntax in RIDE (Arguments:) to create the output in new folders dynamically:
--outputdir C:/AutomationLogs/%date:~-4,4%%date:~-10,2%%date:~-7,2% --timestampoutputs
The above syntax gives you the output in the folders below:
Output: C:\AutomationLogs\20151125\output-20151125-155017.xml
Log: C:\AutomationLogs\20151125\log-20151125-155017.html
Report: C:\AutomationLogs\20151125\report-20151125-155017.html
Hope this helps :)
I understand the end result you want is to have your output files in their custom folders. If this is your desire, it can be accomplished at runtime and you won't have to move them as part of your post processing. This will not work in RIDE, unfortunately, since the folder structure is created dynamically. I have two options for you.
Option 1: Use a script to kick off your tests
RIDE is awesome, but in my humble opinion, one shouldn't be using it to run one's tests, only to build and debug them. Scripts are far more powerful and flexible.
Assuming you have a test, test2.txt, that you wish to run, the script could be something like:
from time import gmtime, strftime
import os

# strftime returns a string representation of a date-time tuple.
# gmtime returns the date-time tuple for Greenwich Mean Time.
dts = strftime("%Y.%m.%d.%H.%M.%S", gmtime())
cmd = "pybot -d Run%s test2" % (dts,)
os.system(cmd)
As an aside, if you do intend to do post processing of your files using rebot, be aware that you may not need to create intermediate log and report files. The output.xml file contains everything you need, so if you don't want to create superfluous files, use --log NONE --report NONE.
Option 2: Use a listener to do post processing
A listener is a program you write that responds to events (x_start, x_end, etc). The close() event is akin to the teardown function and is the last thing called. So, assuming you have a function moveFiles() you simply need to create a listener class (myListener), define the close() method to call your moveFiles() function, and alert your test that it should report to a listener with the argument --listener myListener.
This option should be compatible with RIDE though I admit I have never tried to use listeners with the IDE.
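A minimal sketch of such a listener (moveFiles() stands in for your own post-processing logic, and the paths are placeholders):
# myListener.py
import shutil

def moveFiles():
    # Hypothetical post-processing: move the generated files to a Run folder.
    shutil.move("output.xml", "Run1/output.xml")

class myListener:
    ROBOT_LISTENER_API_VERSION = 2

    def close(self):
        # close() is the last listener event, so post-process here.
        moveFiles()
Since the module and the class share a name, passing just --listener myListener should be enough.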
Alternatively, you can write a custom run script that handles moving the files after the test case execution; at that point the files are no longer in use by pybot.

I'm having problems with configuring a filter that replicates specific tables only

I am trying to use filters to select specific tables to replicate.
I tried running this with the installer
./tools/tungsten-installer --master-slave -a \
...
--svc-extractor-filters=replicate \
--property=replicator.filter.replicate.do=test,*.foo
and got this exception in trepctl status after the master had not installed properly:
Plugin class name property is missing or null: key=replicator.filter.replicate
Which file is this properties file? How do I find it? Moreover, in specifying the settings for the filter, how do I know exactly what to put?
I discovered that I am supposed to modify the configuration template file prior to configuration, according to Issue 219, but what changes am I supposed to make in tungsten-replicator-2.0.5-diff that will later be patched into the extraction?
Issue 254 suggests that if you want to apply a filter out of the box, you can use these options with tungsten-installer:
-a --property=replicator.filter.Replicate.ignoreFilter=schema_x.tablex,schema_x,tabley,schema_y,tablez
--svc-thl-filter=Replicate
However, when I try using this for --property=replicator.filter.replicate.do, the problem is still the same:
pendingExceptionMessage: Plugin class name property is missing or null: key=replicator.filter.replicate
Your assistance will be greatly appreciated.
Rumbi
Update:
Hi,
I had a look at this file: /root/tungsten/tungsten-replicator/samples/conf/filters/default/tableignore.tpl. According to this sample, a static-SERVICE_NAME.properties file is supposed to have something like this configured; please confirm whether this is the correct syntax:
replicator.filter.tabledo=com.continuent.tungsten.replicator.filter.JavaScriptFilter
replicator.filter.tabledo.script=${replicator.home.dir}/samples/scripts/javascript-advanced/tabledo.js
replicator.filter.tabledo.tables=foo(database).bar(table)
replicator.stage.thl-to-dbms.filters=tabledo
However, I did not find tabledo.js (or anything similar) in the directory where tableignore.js exists. Could I please have the location of this file? If there is an alternative way of specifying --property=replicator.filter.replicate.do=test without the use of this .js file, your suggestions are most welcome.
Download the latest version of Tungsten Replicator; the missing .tpl file was added about a month ago. After installation, the filtered tables should be added to static-SERVICE_NAME.properties under the FILTERS section.
Locate your replicator configuration file, static-YOUR_SERVICE_NAME.properties, e.g.:
/opt/continuent/tungsten/tungsten-replicator/conf/static-mysql2vertica.properties
Make sure the individual dbms properties are set, in particular the setting replicator.applier.dbms:
# Batch applier basic configuration information.
replicator.applier.dbms=com.continuent.tungsten.replicator.applier.batch.SimpleBatchApplier
replicator.applier.dbms.url=jdbc:mysql:thin://${replicator.global.db.host}:${replicator.global.db.port}/tungsten_${service.name}?createDB=true
replicator.applier.dbms.driver=org.drizzle.jdbc.DrizzleDriver
replicator.applier.dbms.user=${replicator.global.db.user}
replicator.applier.dbms.password=${replicator.global.db.password}
replicator.applier.dbms.startupScript=${replicator.home.dir}/samples/scripts/batch/mysql-connect.sql
# Timezone and character set.
replicator.applier.dbms.timezone=GMT+0:00
replicator.applier.dbms.charset=UTF-8
# Parameters for loading and merging via stage tables.
replicator.applier.dbms.stageTablePrefix=stage_xxx_
replicator.applier.dbms.stageDirectory=/tmp/staging
replicator.applier.dbms.stageLoadScript=${replicator.home.dir}/samples/scripts/batch/mysql-load.sql
replicator.applier.dbms.stageMergeScript=${replicator.home.dir}/samples/scripts/batch/mysql-merge.sql
replicator.applier.dbms.cleanUpFiles=false
Depending on the database you are replicating to you may have to omit/modify some of the lines.
For more information see:
https://code.google.com/p/tungsten-replicator/wiki/Replicator_Batch_Loading
I don't know if this problem is still open or not.
I am using version 2.0.6-xxx, and installing the service using these parameters works for me.
I would like to point out that, as its name says, "--svc-extractor-filters" defines an extractor filter, meaning the parameters guide the extraction of data on the master server.
If you intend to use it on the slave side, you should use "--svc-applier-filters" instead.
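For example, a minimal sketch of the slave-side variant of the installer command from the question (everything elided with "..." stays as in your existing setup):
./tools/tungsten-installer --master-slave -a \
...
--svc-applier-filters=replicate \
--property=replicator.filter.replicate.do=test,*.foo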
The parameters
--svc-extractor-filters=replicate \
--property=replicator.filter.replicate.do=test,*.foo
are supposed to create the following filter setup in the properties file:
replicator.filter.replicate=com.continuent.tungsten.replicator.filter.ReplicateFilter
replicator.filter.replicate.ignore=
replicator.filter.replicate.do=test,*.foo
And you should also be able to find the
replicator.stage.binlog-to-q.filters=replicate
parameter set.
If you intend to use this filter on the slave, find the line
replicator.stage.q-to-dbms.filters=mysqlsessions,pkey,bidiSlave
and change it to
replicator.stage.q-to-dbms.filters=mysqlsessions,pkey,bidiSlave,replicate
Hope this brief description helps!

How to work with hook_nodeapi after image thumbnail creation with ImageCache

A bit of a followup from a previous question.
As I mentioned in that question, my overall goal is to call a Ruby script after ImageCache does its magic with generating thumbnails and whatnot.
Sebi's suggestion from this question involved using hook_nodeapi.
Sadly, my Drupal knowledge of creating modules and/or hacking into existing modules is pretty limited.
So, for this question:
Should I create my own module or attempt to modify the ImageCache module?
How do I go about getting the generated thumbnail path (from ImageCache) to pass into my Ruby script?
Edit:
I found this question searching through SO...
Is it possible to do something similar in the _imagecache_cache function that would do what I want? I.e.:
function _imagecache_cache($presetname, $path) {
  ...
  ...
  // Check if the derivative exists (the file may have been created between
  // Apache's request handler and reaching this code); otherwise try to
  // create the derivative.
  if (file_exists($dst) || imagecache_build_derivative($preset['actions'], $src, $dst)) {
    imagecache_transfer($dst);
    // call the Ruby script here, e.g.:
    exec('MY RUBY SCRIPT');
  }
}
Don't hack into ImageCache; remember, every time you hack core/contrib modules, god kills a kitten ;)
You should create a module that implements hook_nodeapi. Look at the API documentation to find the correct entry point for your script; nodeapi fires at various points in the node lifecycle, so you have to pick the right operation for your case (it should become clear when you check out the link): http://api.drupal.org/api/function/hook_nodeapi
You won't be able to call the function you've shown, because it is private, so you'll have to find another route.
You could try to build the path up manually: you should be able to pull out the filename of the uploaded file and then append it to the directory structure. Ugly, but it should work. E.g.,
if the uploaded file is called test123.jpg, then it should be in /files/imagecache/thumbnails/test123.jpg (or something similar).
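Putting the two suggestions together, a minimal sketch of such a module (Drupal 6 style; the module name, the CCK image field, the preset name, and the Ruby script path are all placeholders):
<?php
// mymodule.module
function mymodule_nodeapi(&$node, $op, $a3 = NULL, $a4 = NULL) {
  if ($op == 'insert' || $op == 'update') {
    // Build the derivative path manually from the uploaded file's name.
    $filename = basename($node->field_image[0]['filepath']); // hypothetical field
    $thumb = file_directory_path() . '/imagecache/thumbnails/' . $filename;
    // Hand the generated thumbnail path to the Ruby script.
    exec('ruby /path/to/my_script.rb ' . escapeshellarg($thumb));
  }
}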
Hope it helps.
