I need to launch a Condor job on a cluster with multiple slots per machine.
I have an additional requirement: two jobs cannot be placed on the same physical machine at the same time. This is due to a binary I cannot control, which does some networking (poorly).
This is a somewhat related question: Limiting number of concurrent processes scheduled by condor
but it does not completely solve my problem. I understand I could restrict where jobs can run in the following way: Requirements = (Name == "slot1@machine1") || (Name == "slot1@machine2") ...
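Written out in a submit file, that restriction looks roughly like this (machine names are placeholders):
# pins jobs to slot 1 of an explicit list of machines
Requirements = (Name == "slot1@machine1") || (Name == "slot1@machine2")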
However, this is too restrictive, as I don't care which slot the jobs run on as long as two jobs do not end up on the same machine.
Is there a way to achieve this?
If this is not possible, how can I tell Condor to pick the machine that has the most slots available?
You can try the condor_status command to check the status of the pool of machines.
The first column shows the names of the slots and the machines they belong to.
Now check the State and Activity columns:
Unclaimed : the slot is idle
Claimed - Busy : the slot is running a Condor job
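For example, to list only the slots that are still free (a sketch; -constraint is a standard condor_status option, and the expression is just one way to filter):
# show only idle slots, so you can spot machines that still have free slots
condor_status -constraint 'State == "Unclaimed"'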
I have a DAG with 2 tasks:
download_file_from_ftp >> transform_file
My concern is that the tasks can be performed on different workers. The file would be downloaded on the first worker and transformed on another worker, and an error would occur because the file is missing on the second worker. Is it possible to configure the DAG so that all tasks are performed on one worker?
It's bad practice. Even if you find a workaround, it will be very unreliable.
In general, if your executor allows it, you can configure tasks to execute on a specific worker type. For example, with the CeleryExecutor you can assign tasks to a specific queue (see the sketch below). Assuming there is only 1 worker consuming from that queue, your tasks will be executed by the same worker, BUT the fact that it's 1 worker doesn't mean it will be the same machine. That depends heavily on the infrastructure you use. For example: when you restart your machines, do you get the exact same machine, or is a new one spawned?
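A minimal sketch of that queue idea (the DAG id, the callables, and the queue name are hypothetical; assumes a recent Airflow 2.x with the CeleryExecutor and exactly one worker consuming that queue):

from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def download():
    # hypothetical: fetch the file from FTP to local disk
    pass

def transform():
    # hypothetical: transform the downloaded file
    pass

with DAG(dag_id="ftp_pipeline", start_date=datetime(2024, 1, 1), schedule=None) as dag:
    download_file_from_ftp = PythonOperator(
        task_id="download_file_from_ftp",
        python_callable=download,
        queue="pinned_worker",   # hypothetical queue consumed by exactly one Celery worker
    )
    transform_file = PythonOperator(
        task_id="transform_file",
        python_callable=transform,
        queue="pinned_worker",   # same queue, so both tasks land on the same worker
    )
    download_file_from_ftp >> transform_file

The worker would then be started with something like airflow celery worker -q pinned_worker, so it is the only consumer of that queue.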
I highly advise you - don't go down this road.
To solve your issue, either download the file to shared storage like S3, Google Cloud Storage, etc., so that all workers can read it from the cloud, or combine the download and transform into a single operator so that both actions are executed together.
I'm aware that one can suspend running jobs with the qmod -sj [jobid] command, and in principle that works: the jobs go into the suspended (s) state -- fine so far, but:
I expected that if I put all running jobs into the suspended state and qsub new ones to GE (or already have waiting jobs), those would get to run, which is not the case.
Some searching on this topic led me to http://gridengine.org/pipermail/users/2011-February/000050.html, which in fact suggests that suspended jobs should free GE up to run other ones.
See here:
In a workload manager with "built-in" preemption, like Platform LSF,
it works by temporarily relaxing the slot count limit on a node and
then resolving the oversubscription by bumping the lowest job on the
totem pole to get the number of jobs back under the slot count limit.
In Sun Grid Engine, the same thing happens, except that instead of the
scheduler temporarily relaxing the slot count limits, you as the
administrator configure the host with more slots than you need and a
set of rules that create an artificial lower limit on the job count
that is enforced by bumping the lowest priority jobs.
Slightly different topic, but the same principle seems to hold: to run other jobs while keeping your suspended ones, temporarily increase the slot counts on the relevant nodes.
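A sketch of how that might look with qconf (queue name, host name, and slot counts are placeholders):
# temporarily raise the slot count for all.q on one host, e.g. from 8 to 16
qconf -mattr queue slots 16 all.q@node01
# ... run the new jobs, then restore the original value
qconf -mattr queue slots 8 all.q@node01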
We have a production environment (cluster) with two physical servers and 3 (three) Plone/Zope instances running on each one.
We scheduled a job (with apscheduler) that needs to run in only one instance, but it is being executed by all 6 (six) instances.
To solve this, I think I need to verify whether the job is running on server1 and whether it is the instance that listens on a specific port.
So, how can I programmatically get information about a Zope/Plone instance?
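A minimal sketch of the check described above (the host name and port are assumptions; in practice the port would come from the instance's own configuration, e.g. the http-server section of zope.conf):

import socket

PRIMARY_HOST = "server1"   # assumption: the server that should own the scheduled job
PRIMARY_PORT = 8080        # assumption: the http-server port of the designated instance

def is_designated_instance(instance_port):
    # True only on server1, and only in the instance bound to the designated port
    return socket.gethostname() == PRIMARY_HOST and instance_port == PRIMARY_PORT

# register the apscheduler job only in the instance where this check returns True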
I have a job running using Hadoop 0.20 on 32 spot instances. It has been running for 9 hours with no errors. It has processed 3800 tasks during that time, but I have noticed that just two tasks appear to be stuck and have been running alone for a couple of hours (apparently responding because they don't time out). The tasks don't typically take more than 15 minutes. I don't want to lose all the work that's already been done, because it costs me a lot of money. I would really just like to kill those two tasks and have Hadoop either reassign them or just count them as failed. Until they stop, I cannot get the reduce results from the other 3798 maps!
But I can't figure out how to do that. I have considered trying to figure out which instances are running the tasks and then terminate those instances, but
I don't know how to figure out which instances are the culprits
I am afraid it will have unintended effects.
How do I just kill individual map tasks?
Generally, on a Hadoop cluster you can kill a particular task by issuing:
hadoop job -kill-task [attempt_id]
This will kill the given map task and re-submit it on a different node with a new attempt id.
To get the attempt_id, navigate in the JobTracker's web UI to the map task in question, click on it, and note its id (e.g.: attempt_201210111830_0012_m_000000_0).
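Putting the two together, using the example attempt id above:
# kills only this map attempt; the job keeps running and the task is rescheduled
hadoop job -kill-task attempt_201210111830_0012_m_000000_0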
ssh to the master node as mentioned by Lorand, and execute:
bin/hadoop job -list
bin/hadoop job -kill <JobID>
I'm trying to get Sun Grid Engine (SGE) to spread the separate processes of an MPI job over all of the nodes of my cluster.
What is happening is that each node has 12 processors, so SGE is assigning my 60 processes as 12 each to 5 separate nodes.
I'd like it to assign 2 processes to each of the 30 nodes available, because with 12 processes (DNA sequence alignments) running on each node, the nodes run out of memory.
So I'm wondering if it's possible to explicitly control how many processes SGE assigns to each node?
Thanks,
Paul.
Check out "allocation_rule" in the configuration for the parallel environment; either with that or then by specifying $pe_slots for allocation_rule and then using the -pe option to qsub you should be able to do what you ask for above.
You can do it by creating a queue in which you define that the queue uses only 2 of the 12 processors on each node.
You can see the configuration of the current queue with the command
qconf -sq queuename
You will see something like the following in the queue configuration. This queue is configured to use 5 execution hosts with 4 slots (processors) each.
....
slots 1,[master=4],[slave1=4],[slave2=4],[slave3=4],[slave4=4]
....
Use the following command to change the queue configuration:
qconf -mq queuename
then change each of those 4s into 2s.
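After that edit, the slots line from the example above would read:
slots 1,[master=2],[slave1=2],[slave2=2],[slave3=2],[slave4=2]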
From an admin host, run "qconf -msconf" to edit the scheduler configuration. It will bring up a list of configuration options in an editor. Look for the one called "load_formula". Set the value to "-slots" (without the quotes).
This tells the scheduler that a machine is least loaded when it has the fewest slots in use. If your exec hosts have a similar number of slots each, you will get an even distribution. If some exec hosts have more slots than the others, they will be preferred, but your distribution will still be more even than with the default value of load_formula (which I don't remember, having changed this on my cluster quite some time ago).
You may need to set the slots on each host. I have done this myself because I need to limit the number of jobs on a particular set of boxes to less than their maximum, because they don't have as much memory as some of the others. I don't know whether it is required for this load_formula configuration, but if it is, you can add a slots consumable to each host: run "qconf -me hostname" and add a value to "complex_values" that looks like "slots=16", where 16 is the number of slots you want that host to use.
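Roughly, the two edits described above look like this (the host name and the slot count are examples):
# qconf -msconf  -> in the editor that opens, set:
load_formula        -slots
# qconf -me node01  -> add or extend the complex_values line:
complex_values      slots=16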
This is what I learned from our sysadmin. Put this SGE resource request in your job script:
#$ -l nodes=30,ppn=2
This requests 2 MPI processes per node (ppn) and 30 nodes. I think there is no guarantee that this 30x2 layout will hold on a 30-node cluster if other users also run lots of jobs, but perhaps you can give it a try.