Cloudify restart lifecycle event - recipe

I'm writing a Cloudify recipe and I am trying to manage HA. In the Cloudify docs, I saw the following concerning the "stopDetection" probe:
It is important to note that Cloudify will invoke the restart event if
you implemented one and the start event will only serve as a fallback.
Does this mean that Cloudify has a "restart" lifecycle event (like start and stop)?
Thanks.

The "stopDetection" life cycle event handler notifies Cloudify that the process is dead and then as a result, Cloudify invokes the "start" life cycle event handler. - This means that the process is restarted and NOT the VM.
Cloudify 2.7* doesn't have a VM "restart" event handler.
If you need to restart a vm via Cloudify, you can use the Cloudify maintenance mode.
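To make the fallback concrete, here is a minimal sketch of how this typically looks in a Cloudify 2.x Groovy service recipe (the service name, start script and port are placeholders, so adjust them for your service):
service {
    name "myService"
    lifecycle {
        start "myService_start.groovy"
        startDetection { ServiceUtils.isPortOccupied(8080) }
        // When this closure returns true, Cloudify considers the process dead
        // and, since there is no "restart" event, invokes "start" again.
        stopDetection { !ServiceUtils.isPortOccupied(8080) }
    }
}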
HTH,
Tamir.


WebSphere WorkManager Cluster Double Execution of the Job

I have an application deployed on WebSphere 8.5 with Java 1.7.1, defined with a cluster of 2 nodes.
In this application there is an EJB that submits an async job through a work manager.
The problem is that on WAS 8.5 the job is executed twice, once on each node of the cluster. In WAS 6.1 this did not happen.
The work is submitted by an AlarmManager. Below is the extracted code:
WorkManager wm = serviceLocator.getWorkManager("NameOfCustomWorkManager");
AsynchScope scope = wm.findAsynchScope("scopeName");
if (scope == null) {
    scope = wm.createAsynchScope("scopeName");
}
AlarmManager alarmManager = scope.getAlarmManager();
// Fired at a certain hour
alarmManager.create(listener, "Alarm Context Info",
        (int) (DateUtils.getNextTime(nextTime) - System.currentTimeMillis()));
logger.info("Alarm fired.");
Does anybody know if on WAS 8.5 there is additional configuration to avoid the problem described?
WorkManager in WebSphere Application Server, regardless of version, does not have and has never had the ability to operate or coordinate across remote JVMs. The designed behavior of the WorkManager is that it can only run the work that you submit on the same JVM from which you submitted it; it has no awareness of duplicate work submitted from a different JVM and no mechanism for coordinating work across JVMs. The same is true of AlarmManager instances that you obtain from the WorkManager. (WebSphere Application Server actually does have a way of accomplishing the above, which is the Scheduler, but the above code is not using that.)
Could it be that some earlier logic in the application, impacted by the version change, is causing the alarm to be created on both members now, whereas previously it was only created on one?
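As a side note, and not something from the answer above: if the intent is for the alarm to be created on only one cluster member, one common workaround is to let each member attempt to claim the task in a shared database table and only create the alarm on the member that wins the claim. A rough sketch, assuming a hypothetical CLUSTER_CLAIM table and a DataSource already available to the EJB:
// Hypothetical guard: only the member that inserts the claim row creates the alarm.
// Assumes CLUSTER_CLAIM(TASK_NAME VARCHAR PRIMARY KEY, CLAIMED_AT TIMESTAMP).
boolean claimed = false;
try (Connection con = dataSource.getConnection();
     PreparedStatement ps = con.prepareStatement(
         "INSERT INTO CLUSTER_CLAIM (TASK_NAME, CLAIMED_AT) VALUES (?, CURRENT_TIMESTAMP)")) {
    ps.setString(1, "nightly-alarm");
    ps.executeUpdate();
    claimed = true;
} catch (SQLException duplicateKey) {
    // Another cluster member already claimed the task; do not create the alarm here.
}
if (claimed) {
    alarmManager.create(listener, "Alarm Context Info",
        (int) (DateUtils.getNextTime(nextTime) - System.currentTimeMillis()));
}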

How to handle multiple code checkins in Concourse pipeline?

One of my GitHub repositories is a resource for my pipeline. I have 3 parallel jobs in my Concourse pipeline which get triggered when there is any checkin to the GitHub repository. The other jobs in the pipeline run in sequence. I am having the issues below:
1) I want the pipeline to complete a full execution before a new run starts. I am using the pool resource to make sure the execution completes before a new run is triggered. Is there a better way to solve this?
2) If there are multiple checkins while the pipeline is in progress, is there a way to only execute the pipeline for the last checkin? For example, the 1st instance of the pipeline is running, and by the time its execution completes there are 6 checkins in the repository. Can the pipeline pick only the 6th version of the repo and skip the runs for the previous five checkins?
Using the lock pool resource is almost the perfect option, but as you have rightly noticed, there will be a trigger for each git commit and jobs will start to queue.
It sounds like you want this pipeline to be serialised. Have you considered serial_groups? http://concourse-ci.org/single-page.html#job-serial-groups
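For reference, a minimal sketch of what that could look like in the pipeline YAML (the job, resource and task names here are made up); any jobs sharing the same serial group never run at the same time, which removes the need for the lock pool resource:
jobs:
- name: build
  serial_groups: [repo-pipeline]
  plan:
  - get: my-repo
    trigger: true
  - task: build
    file: my-repo/ci/build.yml
- name: test
  serial_groups: [repo-pipeline]
  plan:
  - get: my-repo
    trigger: true
    passed: [build]
  - task: test
    file: my-repo/ci/test.yml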

Running Apache spark job from Spring Web application using Yarn client or any alternate way

I have recently started using Spark and I want to run a Spark job from a Spring web application.
I am running a web application in a Tomcat server using Spring Boot. My web application receives a REST web service request, and based on that it needs to trigger a Spark calculation job in a YARN cluster. Since my job can take a long time to run and accesses data on HDFS, I want to run the Spark job in yarn-cluster mode, and I don't want to keep a Spark context alive in my web layer. Another reason for this is that my application is multi-tenant, so each tenant can run its own job; in yarn-cluster mode each tenant's job can start its own driver and run in its own Spark cluster. In the web app JVM, I assume I can't run multiple Spark contexts in one JVM.
I want to trigger Spark jobs in yarn-cluster mode from Java code in my web application. What is the best way to achieve this? I am exploring various options and looking for your guidance on which one is best:
1) I can use the spark-submit command line shell to submit my jobs. But to trigger it from my web application I would need to use either the Java ProcessBuilder API or some package built on top of ProcessBuilder. This has 2 issues. First, it doesn't sound like a clean way of doing it; I should have a programmatic way of triggering my Spark applications. Second, I would lose the capability of monitoring the submitted application and getting its status. The only crude way of doing that is reading the output stream of the spark-submit shell, which again doesn't sound like a good approach.
2) I tried using the YARN client to submit the job from the Spring application. Following is the code that I use to submit the Spark job using the YARN Client:
import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;
import org.apache.spark.deploy.yarn.Client;
import org.apache.spark.deploy.yarn.ClientArguments;
// Empty Hadoop configuration, so it falls back to the default ResourceManager address
Configuration config = new Configuration();
System.setProperty("SPARK_YARN_MODE", "true");
SparkConf conf = new SparkConf();
// sparkArgs holds the YARN client arguments (--jar, --class, --arg, ...)
ClientArguments cArgs = new ClientArguments(sparkArgs, conf);
Client client = new Client(cArgs, config, conf);
client.run();
But when I run the above code, it tries to connect to localhost only. I get this error:
15/08/05 14:06:10 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
15/08/05 14:06:12 INFO Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
So I don't think it can connect to the remote machine.
Please suggest the best way of doing this with the latest version of Spark. Later I plan to deploy this entire application on Amazon EMR, so the approach should work there as well.
Thanks in advance.
Spark JobServer might help: https://github.com/spark-jobserver/spark-jobserver. This project receives RESTful web requests and starts a Spark job. Results are returned as a JSON response.
I also had similar issues trying to run a Spark app that connects to a YARN cluster - having no cluster config, it was trying to connect to the local machine as the main node of the cluster, which obviously failed.
It worked for me once I placed core-site.xml and yarn-site.xml on the classpath (src/main/resources in a typical sbt or Maven project structure) - the application then correctly connected to the cluster.
When using spark-submit, the location of those files is typically specified by the HADOOP_CONF_DIR environment variable, but for a stand-alone application it had no effect.
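If placing the files on the classpath is not convenient, a sketch of an alternative (with placeholder paths) is to load them into the Hadoop Configuration explicitly before constructing the YARN Client, so the ResourceManager address is picked up instead of the 0.0.0.0 default:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;

// Point the Hadoop Configuration at the cluster's client config files explicitly
// (the paths below are placeholders for wherever those files live on the web server).
Configuration config = new Configuration();
config.addResource(new Path("/etc/hadoop/conf/core-site.xml"));
config.addResource(new Path("/etc/hadoop/conf/yarn-site.xml"));
// Or set the ResourceManager address directly:
// config.set("yarn.resourcemanager.address", "resourcemanager-host:8032");
// Then pass this config to new Client(cArgs, config, conf) as in the question's snippet.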

Apache Mesos Workflows - Event Driven Scheduler

We are currently using Apache Mesos with Marathon and Chronos to schedule long-running and batch processes.
It would be great if we could create more complex workflows, as with Oozie. Say, for example, kicking off a job when a file appears in a location, or when a certain application completes or calls an API.
While it seems we could do this with Marathon/Chronos or Singularity, there seems to be no readily available interface for this.
You can use Chronos' /scheduler/dependency endpoint to specify "all jobs which must run at least once before this job will run." Do this on each of your Chronos jobs, and you can build arbitrarily complex workflow DAGs.
https://airbnb.github.io/chronos/#Adding%20a%20Dependent%20Job
Chronos currently only schedules jobs based on time or dependency triggers. Other events like file update, git push, or email/tweet could be modeled as a wait-for-X job that your target job would then depend on.
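For illustration, a dependent job is just a regular Chronos job definition with a parents list, POSTed to the /scheduler/dependency endpoint (the names below are made up; see the Chronos docs linked above for the full schema):
{
  "name": "process-file",
  "command": "/usr/local/bin/process.sh",
  "owner": "team@example.com",
  "parents": ["wait-for-file"]
}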

What does it mean to start a worker process (w3wp.exe) in debug mode?

All, according to the WAS documentation, the worker process is managed by WAS.
But I found that when typing w3wp /? there is a debug flag:
-debug
This option launches a worker process using the default
application host config file. By default, it will use
site id 1.
What does it mean to start a worker process in debug mode? In what case would we want to start a worker process with the debug option? Thanks.
Added
I don't know why I got an error when running w3wp.exe -debug:
ERROR: There has been an error during processing of this command.
Please check the event log and see if any errors or warnings have been
logged.
When I checked the log, it looks like:
The World Wide Web Publishing Service failed to set the application
pool for the application '/xxxx' in site '1'. The data field contains
the error number.
According to http://support.microsoft.com/kb/183480, it switches the security context of the running user, which is not its normal mode of operation - I suppose this would make it easier to then attach a debugger and other utilities.
