Friday, July 31, 2009

Model cluster script

*** IMPORTANT: Please also read the Comment. *** 
 
Principles of a ‘good’ cluster script: 
1) Should only copy over ONE file at the beginning & only ONE file back at the end. 
2) Should show which node and directory it is running in, so that we can figure out which nodes are not running. 
3) Should clean up after the run is done, erasing files from the compute node. 
4) Should keep stderr and stdout output to a minimum, by redirecting it to /dev/null 
 
A) Make a ‘files.list’ file containing a list of all the files you need to copy over. 
 
% more files.list 
ccrel.BATCH 
ccrel.awk 
crun.sh 
datain.dat 
loop.sh 
out.txt 
seed 
simdata.dat 
simped.dat 
slinkin.dat 
unmake.awk 
 
B) Create a tar file ‘files.tgz’ based on this list: 
 
tar zcvfT files.tgz files.list 
 
IMPORTANT: Do steps A and B OUTSIDE of the script that will be submitted to the cluster. 
 
Do not use zip instead of tar. Used as described here, tar creates ONE compressed file; copying only one file over and one file back keeps creation of the associated meta-data to a minimum. 
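Steps A and B can be tried end-to-end in a scratch directory. The sketch below uses dummy stand-in files (all names made up for illustration) and then checks that the archive contents match files.list exactly:

```shell
# Exercise steps A and B with dummy stand-ins for the real input files;
# every file name here is made up for illustration.
work=$(mktemp -d)
cd "$work"
touch crun.sh loop.sh datain.dat seed   # stand-ins for the real inputs

# Step A: build the list of files to copy over
ls crun.sh loop.sh datain.dat seed > files.list

# Step B: ONE compressed archive built from that list
tar zcfT files.tgz files.list

# Sanity check: the archive contents should match files.list exactly
tar ztf files.tgz | sort > archive.list
sort files.list | diff - archive.list && echo archive OK
```

If the diff prints nothing and "archive OK" appears, the archive holds exactly the files on the list and is ready to submit with.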
 
C) Use qsub to submit a file like this to the cluster.
 
 
#!/bin/csh -f 
# Purpose: 

# Steps: 

# Helper scripts: 

# Uses: 
# ============================================================================== 
#$ -cwd 
#$ -m e 
#$ -j y 
#$ -N CCRELsim 
date 
# This will tell you which host it is running on. 
echo JOB_ID: $JOB_ID JOB_NAME: $JOB_NAME HOSTNAME: $HOSTNAME 
unalias cp  # make sure cp is not aliased to 'cp -i' 
# ============================================================================== 
# Make sure the archive exists before creating the scratch directory
if (! -e files.tgz) then 
echo ERROR: The files.tgz archive does not exist 
echo Please create it prior to running this script 
exit 1 
endif 
echo Making directory /tmp/$$ 
mkdir /tmp/$$ 
# Copy the needed files over to the compute node 
cp files.tgz /tmp/$$/ 
# Set an alias to your current directory 
set HomeDir = `pwd` 
# Move into the temporary directory on the compute node 
cd /tmp/$$ 
# Extract the files from your compressed tar file 
tar zxvf files.tgz >& /dev/null 
# ============================================================================== 
# Do the needed computations. In this case, these are done by my 'loop.sh' script file that I copied over 
./loop.sh >& /dev/null 
# ============================================================================== 
# Copy results back to working directory 
unalias cp 
cd .. 
tar zcf results_$JOB_ID.tgz $$ 
cp results_$JOB_ID.tgz $HomeDir 
# ============================================================================== 
# Enter working directory 
cd $HomeDir 
# Remove all the files on compute node 
rm -rf /tmp/$$ 
rm -f /tmp/results_$JOB_ID.tgz 
date 
# ================================ 
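Once the job submitted with qsub finishes (the -m e flag mails you), a results_$JOB_ID.tgz file lands in the submission directory; because the script tars the temporary directory by name, unpacking recreates that directory. A sketch of the round trip, with a made-up JOB_ID (12345) and a dummy archive standing in for a real run:

```shell
# Build a dummy results archive that mimics what the script's tar step
# produces; 12345 stands in for a real JOB_ID and the file names are
# made up for illustration.
mkdir -p 12345
echo demo > 12345/out.txt
tar zcf results_12345.tgz 12345
rm -rf 12345

# What you would do after a real run:
tar zxf results_12345.tgz     # recreates the 12345/ directory
ls 12345                      # the run's output files
```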
 

1 comment:

  1. On June 6, 2007, Ryan sent this message:

    I have made a change in the way /tmp directories are created on the cluster for each script.
    You no longer have to create the /tmp directories in your script. It is taken care of automatically by the grid engine software.
    This will help in keeping the /tmp directories from filling up the drives on the compute nodes.

    I would ask that anyone wanting to submit a script please send it to me first for modification.

    Thanks,


    Ryan Evans
    Programmer / Systems Administrator, Center for Computational Genetics
    University of Pittsburgh
    Department of Human Genetics

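If the grid engine now provisions the per-job scratch directory itself, it presumably exports its path to the job; under Sun Grid Engine that variable is $TMPDIR (an assumption to verify on your cluster). With it, the mkdir /tmp/$$ and rm -rf cleanup lines in the script above become unnecessary. A sketch that fakes the environment so the fragment can be tried off-cluster:

```shell
# $TMPDIR is normally set per job by the grid engine; fake it (and a
# minimal files.tgz) here so the fragment runs outside the cluster.
TMPDIR=${TMPDIR:-$(mktemp -d)}
echo demo > seed
ls seed > files.list
tar zcfT files.tgz files.list

# The script's copy/extract steps, using $TMPDIR instead of /tmp/$$
cp files.tgz "$TMPDIR"/
cd "$TMPDIR"
tar zxf files.tgz
```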
