Cluster functions for Docker/Docker Swarm (https://docs.docker.com/engine/swarm/).
The submitJob
function executes
docker [docker.args] run --detach=true [image.args] [resources] [image] [cmd]
.
Arguments docker.args
, image.args
and image
can be set on construction.
The resources
part takes the named resources ncpus
and memory
from submitJobs
and maps them to the arguments --cpu-shares
and --memory
(in Megabytes). The resource threads
is mapped to the environment variables “OMP_NUM_THREADS”
and “OPENBLAS_NUM_THREADS”.
To reliably identify jobs in the swarm, jobs are labeled with “batchtools=[job.hash]” and named
using the current login name (label “user”) and the job hash (label “batchtools”).
listJobsRunning
uses docker [docker.args] ps --format={{.ID}}
to filter for running jobs.
killJobs
uses docker [docker.args] kill [batch.id]
to filter for running jobs.
These cluster functions use a Hook to remove finished jobs before a new submit and every time the Registry
is synchronized (using syncRegistry
).
This is currently required because docker does not remove terminated containers automatically.
Use docker ps -a --filter 'label=batchtools' --filter 'status=exited'
to identify and remove terminated
containers manually (or usa a cron job).
makeClusterFunctionsDocker( image, docker.args = character(0L), image.args = character(0L), scheduler.latency = 1, fs.latency = 65 )
image | [ |
---|---|
docker.args | [ |
image.args | [ |
scheduler.latency | [ |
fs.latency | [ |
Other ClusterFunctions:
makeClusterFunctionsInteractive()
,
makeClusterFunctionsLSF()
,
makeClusterFunctionsMulticore()
,
makeClusterFunctionsOpenLava()
,
makeClusterFunctionsSGE()
,
makeClusterFunctionsSSH()
,
makeClusterFunctionsSlurm()
,
makeClusterFunctionsSocket()
,
makeClusterFunctionsTORQUE()
,
makeClusterFunctions()