
= Running "Compound Plots" =
Many UCERF3 plots require calculations from the entire model, including all logic tree branch choices. The OpenSHA class scratch.UCERF3.analysis.CompoundFSSPlots can be used to generate plots across all of these branches. Practically speaking, high performance computing is needed to process the entire model in parallel. All "Compound" plots in the UCERF3 reports were computed on high performance computing resources using the class scratch.UCERF3.analysis.MPJDistributedCompoundFSSPlots, which uses the FastMPJ MPI for Java library to handle inter-node communication. An example script for running MPJDistributedCompoundFSSPlots is given below; it will recreate the Fault System Solution MFD plots. You will need a compound fault system solution file as input, for example: http://opensha.usc.edu/ftp/kmilner/ucerf3/2013_05_10-ucerf3p3-production-10runs/2013_05_10-ucerf3p3-production-10runs_COMPOUND_SOL.zip

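To get a feel for the scale involved, the following sketch works out the parallelism arithmetic implied by the example script below. The 16-cores-per-node figure, the core count, and the thread count are taken from the script's own comments and variables; all values are illustrative, not a recommendation:

```shell
#!/bin/bash
# Sketch of the parallelism arithmetic for the example job (illustrative values).
# SLURM allocates cores; Stampede nodes have 16 cores each (per the script's
# comment), and MPJDistributedCompoundFSSPlots runs one JVM per node, each
# with THREADS worker threads.
CORES=1280          # matches "#SBATCH -n 1280" in the script
CORES_PER_NODE=16   # per the script's comment
THREADS=4           # matches the THREADS variable in the script

NODES=$(( CORES / CORES_PER_NODE ))   # one MPI process (JVM) per node
WORKERS=$(( NODES * THREADS ))        # total plot-generation threads

echo "nodes=$NODES workers=$WORKERS"
```

So the example allocation of 1280 cores yields 80 single-JVM nodes and 320 worker threads in total.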
This script is used to run MPJDistributedCompoundFSSPlots on the Stampede supercomputer at TACC:

{{{
login3.stampede(6)$ cat plot_gen.pbs
#!/bin/bash

# this is the wall clock time
#SBATCH -t 00:200:00
# this is the number of CPUs - number of nodes is this number divided by 16
#SBATCH -n 1280
# queue we are submitting to
#SBATCH -p normal

# this is the directory where we are running
RUN_NAME="2013_05_10-ucerf3p3-production-10runs"
COMPOUND_DIR="/work/00950/kevinm/ucerf3/inversion/compound_plots/${RUN_NAME}"
# this is the path to the compound solution file
INV_DIR="/work/00950/kevinm/ucerf3/inversion/${RUN_NAME}"
COMPOUND_FILE="${INV_DIR}/${RUN_NAME}_COMPOUND_SOL.zip"
# this is the set of plots that we want to generate
PLOTS="--plot-mfds"
# the number of threads per compute node
THREADS="4"
# the minimum number of jobs sent to a given node at a given time, typically this can be equal to the number of threads
MIN_DISPATCH="4"
# the amount of memory allocated to the JVM, should be as much as possible
MEMORY="25G"

# this writes the allocated node list to a hostfile in the format FastMPJ expects
PBS_NODEFILE="/tmp/${USER}-hostfile-${SLURM_JOBID}"
echo "creating PBS_NODEFILE: $PBS_NODEFILE"
scontrol show hostnames $SLURM_NODELIST > $PBS_NODEFILE

# paths to FastMPJ
export FMPJ_HOME=/home1/00950/kevinm/FastMPJ
export PATH=$PATH:$FMPJ_HOME/bin

# make sure the hostfile was created, and count the nodes in it
if [[ -e $PBS_NODEFILE ]]; then
    # count the number of nodes assigned by SLURM (one MPI process is started per node)
    NP=`wc -l < $PBS_NODEFILE`
    echo "Running on $NP nodes: "`cat $PBS_NODEFILE`
else
    echo "This script must be submitted with sbatch so that SLURM sets SLURM_NODELIST"
    exit 1
fi

# make sure there's at least one node
if [[ $NP -le 0 ]]; then
    echo "invalid NP: $NP"
    exit 1
fi

date
echo "RUNNING FMPJ"
# this runs MPJDistributedCompoundFSSPlots in parallel with the previously selected arguments
fmpjrun -machinefile $PBS_NODEFILE -np $NP -dev niodev -Djava.library.path=$FMPJ_HOME/lib -Djava.awt.headless=true -Xmx${MEMORY} -cp ${COMPOUND_DIR}/OpenSHA_complete.jar:${COMPOUND_DIR}/parallelcolt-0.9.4.jar:${COMPOUND_DIR}/commons-cli-1.2.jar:${COMPOUND_DIR}/csparsej.jar -class scratch.UCERF3.analysis.MPJDistributedCompoundFSSPlots --threads ${THREADS} --min-dispatch ${MIN_DISPATCH} $PLOTS ${COMPOUND_FILE} ${COMPOUND_DIR}
ret=$?

date
echo "DONE with process 0. EXIT CODE: $ret"

exit $ret
}}}
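The long -cp argument on the fmpjrun line is just the four jars staged in COMPOUND_DIR joined with colons. A small sketch of building that string programmatically (the directory value is the script's COMPOUND_DIR and is illustrative):

```shell
#!/bin/bash
# Build the colon-separated classpath used on the fmpjrun line above.
# The directory is the example script's COMPOUND_DIR; the jar names are the
# four dependencies staged there alongside the OpenSHA jar.
COMPOUND_DIR="/work/00950/kevinm/ucerf3/inversion/compound_plots/2013_05_10-ucerf3p3-production-10runs"
JARS=(OpenSHA_complete.jar parallelcolt-0.9.4.jar commons-cli-1.2.jar csparsej.jar)

CP=""
for jar in "${JARS[@]}"; do
    # prepend a colon only when CP is already non-empty
    CP="${CP:+${CP}:}${COMPOUND_DIR}/${jar}"
done
echo "$CP"
```

The script itself is submitted with sbatch (e.g. "sbatch plot_gen.pbs"); despite the .pbs extension it is a SLURM batch script, which is why it translates SLURM's node list into a PBS-style hostfile for FastMPJ.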