= HPC Communications =

[[Image(sc-comm.png)]]

!UltraScan communicates with the High Performance Computer (HPC) or Grid Cluster
as shown in the drawing above. The components described below carry out the
following tasks.

== Laboratory Information Management System (LIMS) ==

The purpose of this system is to interface with the user, who specifies an analysis type,
such as the Genetic Algorithm (GA) or Two-Dimensional Spectrum Analysis (2DSA), and
the parameters that the HPC system needs for the analysis. After the user specifies the needed
data, the data is packaged into a control.xml file. The command-line program
grid-submit is then invoked.

The control.xml file will include a generated AnalysisGroupGUID and
all needed child HPCAnalysisRequest records.

The specific data in the control.xml file will be specified here once we agree on this
top-level design.

LIMS populates the database tables HPCAnalysisGroup, HPCAnalysisRequest, and the appropriate
Settings tables before calling grid-submit.php.

[wiki:Us3HpcDb US3 HPC Database Tables]

LIMS is currently a Web interface. In the future, its functionality may be ported
to the !UltraScan client.

=== grid-submit ===

grid-submit.php is a command-line tool that creates the initial HPCAnalysisResult table
entry with a queue status of 'Submitted'. It copies control.xml and all the files that
control.xml specifies to the HPC system using the gsiscp utility.

It then queues the job using the submission technique required by the specified
supercomputer cluster.

== Supercomputer Queue ==

This task is controlled by the Supercomputer system. It is responsible for managing
the jobs running on that system and for communicating with clients.

Communication tasks include receiving tasks, returning job status, and informing the
client when a task has been completed or aborted.

== NNLS (!UltraScan HPC Analysis Program) ==

The NNLS program reads the control.xml file and uses it as a guide to read other data
files as needed to populate internal data structures. It then performs the analysis,
writing any needed output to disk.

At the beginning of the program, periodically during execution, and at the end of
processing, NNLS writes a UDP status datagram to a listener on the host and port specified
in the control.xml file. Each datagram will consist of the analysisRequestGUID and a
status (e.g. started, iteration number, finished). This is not a reliable, two-way
channel: datagrams may be lost without notice, so it is the responsibility of the
listener to follow up and manage any missed messages.

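The status reporting described above can be sketched as follows. This is a minimal illustration in Python, not the NNLS implementation: the datagram format is not defined on this page, so the payload layout (the GUID, a single space, then the status text, encoded as UTF-8) and the function names `format_status` and `send_status` are assumptions made for the example.

```python
import socket


def format_status(request_guid: str, status: str) -> bytes:
    """Build the payload as '<analysisRequestGUID> <status>' (assumed layout)."""
    return f"{request_guid} {status}".encode("utf-8")


def send_status(host: str, port: int, request_guid: str, status: str) -> None:
    """Send one status datagram and return immediately.

    UDP gives no acknowledgement, so the sender cannot tell whether the
    listener received the message.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(format_status(request_guid, status), (host, port))
```

Under these assumptions, the analysis program would call `send_status` with the host and port read from control.xml once at start-up, again every few iterations, and once more when processing ends.
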
== grid-timeout ==

This program will either be scheduled periodically via cron or run as a daemon. It will
check the status of jobs in the MySQL database and initiate a status query for jobs
that have overdue status updates. If a job has been aborted, it will notify the
grid-listen program of that status.

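One way to pick out jobs with overdue status updates is sketched below in Python. It is an illustration only: the `JobRecord` fields, the set of final states, and the `max_silence` threshold are hypothetical, and a real grid-timeout would read these values from the database rather than from an in-memory list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List

# Hypothetical status values that mean no further updates are expected.
FINAL_STATES = {"finished", "aborted"}


@dataclass
class JobRecord:
    """Stand-in for one job row; the field names are assumptions."""
    request_guid: str
    queue_status: str
    last_update: datetime


def overdue_jobs(jobs: Iterable[JobRecord], now: datetime,
                 max_silence: timedelta) -> List[JobRecord]:
    """Return unfinished jobs whose last status update is older than max_silence."""
    return [
        job for job in jobs
        if job.queue_status not in FINAL_STATES
        and now - job.last_update > max_silence
    ]
```

Each job returned by `overdue_jobs` is a candidate for a status query against the Supercomputer Queue. Passing `now` as an argument, rather than reading the clock inside the function, keeps the check easy to test.
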
== grid-query ==

This is a command-line program that submits a status query to the Supercomputer Queue and
returns the result.

== grid-listen ==

This program runs as a daemon, receiving UDP packets from the NNLS program or the grid-timeout
program. It is responsible for updating the MySQL database table HPCAnalysisResult with the
current status. Upon completion or abort of an analysis, it fetches the needed files from
the supercomputer cluster, sends an email to the user, and does any other necessary cleanup.

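The receiving loop of such a listener might look like the Python sketch below. It is not the grid-listen implementation. The payload layout (the GUID, a single space, then the status text, in UTF-8) is an assumption, the names `parse_status` and `listen` are hypothetical, and the `on_status` callback stands in for the database update and any completion handling.

```python
import socket
from typing import Callable, Optional, Tuple


def parse_status(payload: bytes) -> Tuple[str, str]:
    """Split '<analysisRequestGUID> <status>' (assumed layout) into its two parts."""
    text = payload.decode("utf-8", errors="replace").strip()
    guid, _, status = text.partition(" ")
    if not guid or not status:
        raise ValueError(f"malformed status datagram: {text!r}")
    return guid, status


def listen(sock: socket.socket,
           on_status: Callable[[str, str], None],
           max_messages: Optional[int] = None) -> None:
    """Receive datagrams on a bound UDP socket and pass each status to on_status.

    Malformed datagrams are skipped. With max_messages=None the loop runs
    until the process is stopped, as a daemon would.
    """
    handled = 0
    while max_messages is None or handled < max_messages:
        payload, _sender = sock.recvfrom(4096)
        try:
            guid, status = parse_status(payload)
        except ValueError:
            continue  # ignore datagrams that do not match the assumed layout
        on_status(guid, status)
        handled += 1
```

A daemon would create a `socket.SOCK_DGRAM` socket, bind it to the port named in control.xml, and call `listen` with a callback that records the status. Because lost datagrams never reach this loop, the follow-up on missing updates still has to come from grid-timeout.
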