How to create a session name dynamically while running a workflow - Unix

I have created a single mapping, session, and workflow to load different tables that all share the same data structure. I wrote a shell script that builds a parameter file dynamically and runs the workflow for each table. The issue is that when I run this workflow, every run shows the same session name, since I created only one session even though the tables are different. I need the session names to be distinct, something like session_name_table_name.
Please help me solve this issue. Sorry for my bad English if anything is unclear.

In the pmcmd command, pass the option -rin <run instance>, replacing <run instance> with your table name.
In the Workflow Monitor, the run instance name will appear in square brackets beside the workflow name, e.g. wkf_s_m_load_test [T_TEST1]
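As a rough sketch, the call from your shell script could look something like the line below; the service, domain, user, folder, and parameter file names are placeholders for your own values, and note that the workflow typically needs concurrent execution enabled for run instance names to take effect:
pmcmd startworkflow -sv INT_SVC -d DOMAIN -u USER -p PASSWD -f FOLDER \
    -paramfile /path/to/param_T_TEST1.txt -rin T_TEST1 wkf_s_m_load_test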

Related

How to run liquibase changelogSyncSQL or changelogSync up to a tag and/or labels?

I'm adding liquibase to an existing project where tables and data already exist. I would like to know how I might limit the scope of changelogSync[SQL] to a subset of available changes.
Background
I've run liquibase generateChangeLog to capture the current state and placed this into say src/main/resources/db/changelog/changes/V2021.04.13.00.00.00__init01.yaml.
I've also added another changeset to cover some new requirements in a new file. Let's call it src/main/resources/db/changelog/changes/V2021.04.13.00.00.00__new-feature.yaml.
I've added a main changelog file src/main/resources/db/changelog/db.changelog-master.yaml with the following contents:
databaseChangeLog:
  - includeAll:
      path: changes
      relativeToChangelogFile: true
I now want to ensure that when I run liquibase changelogSync[SQL] against a particular version of the db, the scope is limited to the first changelog, init01, thereby allowing a liquibase update or updateToTag et al. from that point on to continue with the changes following init01.
I'm surprised to see that the changelogSync[SQL] commands don't seem to offer a way (that I can see from the docs) to do this.
Besides printing the SQL and manually changing it, is there something I've missed? Any suggested approaches welcome. Thanks!
What about changelogSyncToTagSQL? Wouldn't it cover your needs?
Or maybe you could try changelogSyncSQL with the additional parameters "labels" and/or "contexts"?
See the Liquibase documentation for changelogSyncToTagSQL and for the labels and contexts attributes.
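If your Liquibase version honours label filtering for changelogSync (worth checking for your release; the exact flag spelling is an assumption here and differs between 3.x and 4.x), one sketch of the label approach is to tag the generated changesets in the init01 file with a label and pass that label when syncing:
# inside V2021.04.13.00.00.00__init01.yaml, add to each changeSet, e.g.:
#   labels: init01
liquibase --changeLogFile=src/main/resources/db/changelog/db.changelog-master.yaml --labels=init01 changelogSync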
As it stands, the only solution I've found is to generate the SQL and then manually edit its contents to filter out the change sets which don't correspond to your current schema.
Then you can apply the SQL to the db.
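A minimal sketch of that manual route (sync.sql is just a placeholder file name):
liquibase --changeLogFile=src/main/resources/db/changelog/db.changelog-master.yaml changelogSyncSQL > sync.sql
# edit sync.sql so that only the DATABASECHANGELOG inserts for the init01 change sets remain,
# then run it against the database with your usual SQL client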

tFlowToIterate with BigDataBatch job

I currently have a simple Standard job in Talend doing this:
It simply reads a file of several lines (tHDFSInput), and for each line of this file (tFlowToIterate) I build an INSERT query ("INSERT ... SELECT ... FROM") based on what I read in the file (tHiveRow). It works well, it's just a bit slow.
I now need to turn my "Standard" job into a "Big Data Batch" job in order to make it faster, and also because I've been asked to only build Big Data Batch jobs from now on.
The thing is that there is no tFlowToIterate and no tHiveRow component with Big Data Batch...
How can I do this?
Thanks a lot.
Though I haven't tried this solution, I think this can help you.
Create the Hive table upfront, then:
1. Place a tHDFSConfiguration component in the job and provide the cluster details.
2. Use a tFileInputDelimited component. With its storage configuration set to the tHDFSConfiguration defined in step 1, it will read from HDFS.
3. Use a tHiveOutput component and connect tFileInputDelimited to it. In tHiveOutput you can provide the table, the format, and the save mode.
In order to load HDFS data into Hive without modifying the data, maybe you can use only one component: tHiveLoad.
Insert your HDFS path inside the component.
tHiveLoad documentation : https://help.talend.com/reader/hCrOzogIwKfuR3mPf~LydA/ILvaWaTQF60ovIN6jpZpzg

How do I save data in R to an existing dataset in an existing Excel file?

I am creating a form in Shiny R and when the user inputs some information, I want that information to then be appended to the bottom of an existing dataset in an existing xlsx file.
So far I've tried using write.xlsx() with append = TRUE to add to an existing file, but this just creates a new sheet in the file. From there I tried specifying the sheet I want the data to be written to (sheet = 'Data'), but it tries to make a new sheet with that name instead and gets mad that it already exists.
Is this something that's even possible with write.xlsx() or will I have to find a different method to do this?
Edit
I think it would actually be helpful if I give a little more info about the app I'm making. The app is also being used to analyze the data in the existing file. This is why I just want to add to it, rather than create a new file.
Edit 2
For the most part, I seem to have things working correctly thanks to the suggestions below in the comments, with the exception of one issue I can't seem to fix: every time new data is submitted through the form, an additional column of numbers gets added to the dataset, which I definitely don't want! Is there a way to prevent this from happening? As of now I am using bind_rows() to merge the data, and I've also tried rbind(), rbind_all(), smartbind(), and I'm sure one or two more I'm forgetting at the moment, with no luck.
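For what it's worth, a minimal sketch of appending one row with the openxlsx package (a different package from the one providing write.xlsx(..., append = TRUE)); the file name, sheet name, and columns below are placeholders, and writing with rowNames = FALSE is one common way to avoid a stray column of row numbers:
library(openxlsx)
wb <- loadWorkbook("responses.xlsx")              # open the existing file
existing <- readWorkbook(wb, sheet = "Data")      # what is already on the sheet
new_row <- data.frame(name = "Jane", score = 42)  # placeholder for the form input
writeData(wb, sheet = "Data", x = new_row,
          startRow = nrow(existing) + 2,          # +1 for the header row, +1 for the next free row
          colNames = FALSE, rowNames = FALSE)     # no extra header, no row-number column
saveWorkbook(wb, "responses.xlsx", overwrite = TRUE)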

How to get and set the default output directory in Robot Framework (RIDE) at run time

I would like to move all my output files to a custom location: a Run directory created based on the date and time at run time. The datetime-based output folder is created in the test setup.
I have a function, "Process_Output_files", which moves the files to the Run folder (Run1, Run2, Run3 folders).
I have tried using the -d argument and used the function "Process_Output_files" as the suite teardown to move the output files to the respective Run directory.
But I get the following error: "The process cannot access the file because it is being used by another process". I know this is because Robot Framework (RIDE) is currently using the files.
If I don't use the -d argument, the output files get saved in temp folders:
c:\users\<user>\appdata\local\temp\RIDEfmbr9x.d\output.xml
c:\users\<user>\appdata\local\temp\RIDEfmbr9x.d\log.html
c:\users\<user>\appdata\local\temp\RIDEfmbr9x.d\report.html
My question is: is there a way to move the files to a custom location during run time, within Robot Framework?
You can use the following syntax in RIDE (Arguments:) to create the output in new folders dynamically:
--outputdir C:/AutomationLogs/%date:~-4,4%%date:~-10,2%%date:~-7,2% --timestampoutputs
The above syntax gives you the output in folders like these:
Output: C:\AutomationLogs\20151125\output-20151125-155017.xml
Log: C:\AutomationLogs\20151125\log-20151125-155017.html
Report: C:\AutomationLogs\20151125\report-20151125-155017.html
Hope this helps :)
I understand the end result you want is to have your output files in their custom folders. If this is your desire, it can be accomplished at runtime and you won't have to move them as part of your post processing. This will not work in RIDE, unfortunately, since the folder structure is created dynamically. I have two options for you.
Option 1: Use a script to kick off your tests
RIDE is awesome, but in my humble opinion, one shouldn't be using it to run one's tests, only to build and debug them. Scripts are far more powerful and flexible.
Assuming you have a test, test2.txt, that you wish to run, the script you use to do this could be something like:
from time import gmtime, strftime
import os

# strftime returns a string representation of a date-time tuple.
# gmtime returns the date-time tuple representing Greenwich Mean Time.
dts = strftime("%Y.%m.%d.%H.%M.%S", gmtime())
cmd = "pybot -d Run%s test2" % (dts,)
os.system(cmd)
As an aside, if you do intend to do post processing of your files using rebot, be aware you may not need to create intermediate log and report files. The output.xml files contain everything you need, so if you don't want to create superfluous files, use --log NONE --report NONE
Option 2: Use a listener to do post processing
A listener is a program you write that responds to events (x_start, x_end, etc.). The close() event is akin to the teardown function and is the last thing called. So, assuming you have a function moveFiles(), you simply need to create a listener class (myListener), define the close() method to call your moveFiles() function, and tell your test run to report to the listener with the argument --listener myListener, as sketched below.
This option should be compatible with RIDE, though I admit I have never tried to use listeners with the IDE.
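A minimal sketch of such a listener (saved as myListener.py; moveFiles() is the hypothetical helper from the description and is left as a stub here):
class myListener(object):
    ROBOT_LISTENER_API_VERSION = 2

    def moveFiles(self):
        # stub: move output.xml, log.html and report.html to your Run folder here
        pass

    def close(self):
        # called once, after the whole test execution has finished
        self.moveFiles()
You would then attach it with something like pybot --listener myListener.py test2.txt; when the class name matches the file name, pointing at the file is enough.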
Alternatively, you can write a custom run script that handles moving the files after the test execution; at that point the files are no longer in use by pybot.
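A rough sketch of such a run script, building on the one from Option 1 (the folder names here are placeholders):
import os
import shutil

run_dir = "Run1"                          # placeholder: your per-run folder
if not os.path.isdir(run_dir):
    os.makedirs(run_dir)
os.system("pybot -d temp_out test2")      # run the tests into a scratch folder first
for name in ("output.xml", "log.html", "report.html"):
    shutil.move(os.path.join("temp_out", name), os.path.join(run_dir, name))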

I'm having problems with configuring a filter that replicates specific tables only

I am trying to use filters to select specific tables to replicate.
I tried running this with the installer
./tools/tungsten-installer --master-slave -a \
...
--svc-extractor-filters=replicate \
--property=replicator.filter.replicate.do=test,*.foo"
and got this exception in trepctl status after the master had not installed properly:
Plugin class name property is missing or null: key=replicator.filter.replicate
Which file is this properties file? How do I find it? Moreover, when specifying the settings for the filter, how do I know exactly what to put?
I discovered that I am supposed to modify the configuration template file prior to configuration, according to Issue 219, but what changes am I supposed to make in tungsten-replicator-2.0.5-diff that will later on be patched into the extraction?
Issue 254 suggests that if you want to apply a filter out of the box, you can use these options with tungsten-installer:
-a --property=replicator.filter.Replicate.ignoreFilter=schema_x.tablex,schema_x,tabley,schema_y,tablez
--svc-thl-filter=Replicate
However, when I try using this for --property=replicator.filter.replicate.do, the problem is still the same:
pendingExceptionMessage: Plugin class name property is missing or null: key=replicator.filter.replicate
Your assistance will be greatly appreciated.
Rumbi
Update:
Hi
I had a look at this file: /root/tungsten/tungsten-replicator/samples/conf/filters/default/tableignore.tpl. According to this sample, a static-SERVICE_NAME.properties file is supposed to have something like this configured; please confirm whether this is the correct syntax:
replicator.filter.tabledo=com.continuent.tungsten.replicator.filter.JavaScriptFilter
replicator.filter.tabledo.script=${replicator.home.dir}/samples/scripts/javascript-advanced/tabledo.js
replicator.filter.tabledo.tables=foo(database).bar(table)
replicator.stage.thl-to-dbms.filters=tabledo
However, I did not find tabledo.js (or anything similar) in the directory where tableignore.js exists. Could I please have the location of this file? If there is an alternative way of specifying --property=replicator.filter.replicate.do=test without the use of this .js file, your suggestions are most welcome.
Download the latest version of tungsten replicator. The missing tpl file was added about a month ago. After installation, the filtered tables should be added to static-service.properties under the section FILTERS.
Locate your replicator configuration file in static-YOUR_SERVICE_NAME.properties, e.g.
/opt/continuent/tungsten/tungsten-replicator/conf/static-mysql2vertica.properties
Make sure the individual dbms properties are set, in particular the setting replicator.applier.dbms:
# Batch applier basic configuration information.
replicator.applier.dbms=com.continuent.tungsten.replicator.applier.batch.SimpleBatchApplier
replicator.applier.dbms.url=jdbc:mysql:thin://${replicator.global.db.host}:${replicator.global.db.port}/tungsten_${service.name}?createDB=true
replicator.applier.dbms.driver=org.drizzle.jdbc.DrizzleDriver
replicator.applier.dbms.user=${replicator.global.db.user}
replicator.applier.dbms.password=${replicator.global.db.password}
replicator.applier.dbms.startupScript=${replicator.home.dir}/samples/scripts/batch/mysql-connect.sql
# Timezone and character set.
replicator.applier.dbms.timezone=GMT+0:00
replicator.applier.dbms.charset=UTF-8
# Parameters for loading and merging via stage tables.
replicator.applier.dbms.stageTablePrefix=stage_xxx_
replicator.applier.dbms.stageDirectory=/tmp/staging
replicator.applier.dbms.stageLoadScript=${replicator.home.dir}/samples/scripts/batch/mysql-load.sql
replicator.applier.dbms.stageMergeScript=${replicator.home.dir}/samples/scripts/batch/mysql-merge.sql
replicator.applier.dbms.cleanUpFiles=false
Depending on the database you are replicating to you may have to omit/modify some of the lines.
For more information see:
https://code.google.com/p/tungsten-replicator/wiki/Replicator_Batch_Loading
I don't know if this problem is still open or not.
I am using version 2.0.6-xxx, and installing the service using the parameters works for me.
I would like to point out that, as its name says, "--svc-extractor-filters" defines an extractor filter, meaning that the parameters will guide the extraction of data on the master server.
If you intend to use it on the slave service, you should use the "--svc-applier-filters".
The parameters
--svc-extractor-filters=replicate \
--property=replicator.filter.replicate.do=test,*.foo"
are supposed to create the following in the properties file:
# This is the filter setup.
replicator.filter.replicate=com.continuent.tungsten.replicator.filter.ReplicateFilter
replicator.filter.replicate.ignore=
replicator.filter.replicate.do=test,*.foo
And you should also be able to find the
replicator.stage.binlog-to-q.filters=replicate
parameter set.
If you intend to use this filter in the slave, please find the line with:
replicator.stage.q-to-dbms.filters=mysqlsessions,pkey,bidiSlave
and change it as
replicator.stage.q-to-dbms.filters=mysqlsessions,pkey,bidiSlave,replicate
Hope this brief description helps you!
