Slurm and Enroot
Finally touching on the Slurm system. I first heard about it during my CGG time, when we had some brief discussions about using it for cluster jobs. But our own implementation of MPI-based cluster jobs was efficient enough, and it was effectively a self-serviced cloud architecture, with remote compute, storage, and all the orchestration tools. So I always had the impression that MPI is better than Slurm.
1 Slurm basics
Apparently `ACCOUNT` and `PARTITION` are the important IDs in Slurm.
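A quick way to find which values to use (a minimal sketch; the exact `sacctmgr` output depends on how the cluster's accounting is set up):

```sh
# List partitions with their availability and time limits
sinfo -o "%P %a %l"
# List the account/partition associations for your user
sacctmgr show assoc user=$USER format=Account,Partition -p
```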
- `sinfo -s`: check node status
- `squeue --me`: check the job queue
- `scontrol show job job_id`: show job details
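These commands accept the usual Slurm filters; for example (the `interactive` partition name is just the one used later in this post):

```sh
sinfo -p interactive           # nodes in a single partition
squeue --me -t RUNNING         # only my running jobs
scontrol show node node_name   # details for one node
```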
2 Enroot
Enroot is a tool to run Docker images without Docker. Details of the commands are here.
- Credential

Enroot credentials are set in the `~/.config/enroot/.credentials` file:

```
machine gitlab-master.nvidia.com login kylhuang password glpat-xxxxx
machine nvcr.io login $oauthtoken password nvapi-yyyy
```
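Since this file stores tokens in plain text, it should be readable only by you (a minimal sketch using the path above):

```sh
mkdir -p ~/.config/enroot
chmod 600 ~/.config/enroot/.credentials
```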
- Run Docker images with Enroot. This needs to be done in the srun interactive job environment, i.e. after running script 1 in the job submission section below.
```sh
# Create the squash file
enroot import --output ${PATH_TO_IMAGE}/phi3-mini.sqsh -- docker://nvcr.io/nim/microsoft/phi-3-mini-4k-instruct:latest

# Create the enroot container from the squash file, under /tmp/enroot-data/user-2001072735/phi3-mini-container
enroot create --name phi3-mini-container ${PATH_TO_IMAGE}/phi3-mini.sqsh

# Run the container
enroot start phi3-mini-container
enroot list
```
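When the container is no longer needed, the same CLI cleans it up (names taken from the commands above):

```sh
enroot list                         # confirm phi3-mini-container exists
enroot remove phi3-mini-container   # delete the unpacked container
rm ${PATH_TO_IMAGE}/phi3-mini.sqsh  # optionally drop the squash file too
```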
If you run enroot outside an srun interactive job, the `import` step will fail with `enroot-aufs2ovlfs: failed to set capabilities: Operation not permitted` errors, the container from `create` will land under `~/.local/share/enroot/phi3-mini-container`, and `enroot start` would fail.
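The changed location is governed by enroot's data path, which defaults to `~/.local/share/enroot` and is often pointed at node-local storage (such as the `/tmp/enroot-data` path above) inside jobs; this is an assumption based on enroot's standard `ENROOT_DATA_PATH` setting:

```sh
# Where enroot will create containers on this node
echo "${ENROOT_DATA_PATH:-$HOME/.local/share/enroot}"
```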
3 Job submission
1. A job creating an interactive environment (a quick sanity check follows the script)

```sh
export ACCOUNT=general_sa
export JOBNAME=job1

srun -A ${ACCOUNT} \
  -N1 \
  -J ${JOBNAME} \
  --partition=interactive \
  --time=1:00:00 --exclusive \
  --pty /bin/bash -l
```
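Once the shell comes up you are on a compute node, and Slurm exports job metadata into the environment, which makes for an easy sanity check:

```sh
echo "job ${SLURM_JOB_ID} on ${SLURM_JOB_NODELIST}"
```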
2. A job running a container
```sh
srun -A ${ACCOUNT} \
-N1 -p interactive \
-J ${JOBNAME} \
--container-image=${LOCAL_NIM_CACHE}/phi3-mini.sqsh \
--container-mounts="${LOCAL_NIM_CACHE}:/opt/nim/.cache:rw" \
--container-env=NGC_API_KEY \
--export=ALL --mpi=pmix \
bash
```
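For non-interactive runs, the same container flags can go into a batch script (a sketch, assuming the cluster has a `batch` partition and the `--container-*` flags come from the pyxis plugin, as the srun command above suggests; `serve_model.sh` is a hypothetical entry point inside the image):

```sh
#!/bin/bash
#SBATCH -A general_sa
#SBATCH -N1
#SBATCH -p batch
#SBATCH -J job2
#SBATCH --time=1:00:00

# srun inherits the allocation; the pyxis flags select the container
srun --container-image=${LOCAL_NIM_CACHE}/phi3-mini.sqsh \
     --container-mounts="${LOCAL_NIM_CACHE}:/opt/nim/.cache:rw" \
     --container-env=NGC_API_KEY \
     serve_model.sh  # hypothetical command inside the image
```

Submit it with `sbatch job2.sh` and watch it with `squeue --me`.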
4 Job status
You can see job details with `scontrol show job job_id`. The common job state codes are:
- PD (pending)
- R (running)
- CA (cancelled)
- CF (configuring)
- CG (completing)
- CD (completed)
- F (failed), TO (timeout), NF (node failure), RV (revoked), and SE (special exit state)
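Finished jobs drop out of `squeue`, but `sacct` can still report them from the accounting database (job id `12345` is a placeholder):

```sh
sacct -j 12345 --format=JobID,JobName,State,Elapsed,ExitCode
```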