SC Demo for the NFC
Project: Keahey, Papka
Fusion experiments operate in a pulsed mode, with a new pulse roughly every 15-20 minutes. Between experimental pulses, Fusion scientists run analysis and simulation codes to evaluate the progress of the experiment and determine parameter adjustments for the next pulse. With the adoption of computational Grids in the Fusion community, such codes can be run remotely as long as they can be guaranteed to finish in the prescribed time; this requires combining time-critical execution of network transfer, resource reservation, and job execution.
The demo will showcase the use of computational Grids and visualization technologies during a real-time Fusion experiment. Specifically, we will use the Globus Toolkit 3 (GT3) infrastructure to run time-critical remote Fusion codes, and Access Grid technologies to create an environment where time-critical jobs can be launched and monitored, the results visualized, and the ongoing experiment discussed. The demo viewer will have the opportunity to passively participate in an ongoing Fusion experiment, see how the interactions during such an experiment advance Fusion science, and experience first-hand how the technologies developed under the NFC project enhance those interactions. This experiment has never been carried out before in any Fusion control room; in addition to being a demonstration, it will therefore mark a milestone for the Fusion community.
The infrastructure combining time-critical execution of network transfer, resource reservation, and job execution has been partially developed. The experiment will involve reading data from MDSplus, using a data transfer service based on GridFTP to transfer the data to the remote site where job execution will take place, executing the job on a reserved resource, transferring the results back using the data transfer service, and writing them into MDSplus. These actions will be carried out automatically, orchestrated by a broker executing a previously written workflow.
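The sequence of actions above can be pictured as a small staged pipeline. The following is only a sketch of the idea, not the actual broker: every function name is hypothetical, the stages are placeholders that do not talk to MDSplus or GridFTP, and the deadline is checked between stages, whereas the real system would rely on resource reservations to meet it.

```python
import time
from typing import Any, Callable, List, Tuple

Stage = Tuple[str, Callable[[Any], Any]]


def run_workflow(stages: List[Stage], payload: Any, deadline_s: float,
                 clock: Callable[[], float] = time.monotonic):
    """Run the stages in order and record how long each one took.

    Raises TimeoutError once the cumulative elapsed time exceeds deadline_s.
    The check happens after each stage completes; it does not pre-empt a stage.
    """
    start = clock()
    timings = []
    for name, stage in stages:
        stage_start = clock()
        payload = stage(payload)
        now = clock()
        timings.append((name, now - stage_start))
        if now - start > deadline_s:
            raise TimeoutError(f"deadline exceeded after stage '{name}'")
    return payload, timings


# Placeholder stages (all hypothetical): each one only annotates the payload.
def read_mdsplus(shot):
    return {"shot": shot, "signals": [1.0, 2.0, 3.0]}


def transfer_to_remote(data):
    return dict(data, location="remote")


def execute_code(data):
    return dict(data, result=sum(data["signals"]))


def transfer_back(data):
    return dict(data, location="local")


def write_mdsplus(data):
    return dict(data, stored=True)


STAGES: List[Stage] = [
    ("read from MDSplus", read_mdsplus),
    ("transfer to remote site", transfer_to_remote),
    ("execute on reserved resource", execute_code),
    ("transfer results back", transfer_back),
    ("write to MDSplus", write_mdsplus),
]

if __name__ == "__main__":
    # 15 minutes is the lower end of the pulse interval quoted earlier.
    result, timings = run_workflow(STAGES, 12345, deadline_s=15 * 60)
    for name, seconds in timings:
        print(f"{name}: {seconds:.6f} s")
```

Keeping the workflow as an ordered list of named stages means the per-stage timings fall out of the orchestration for free, which is the information the broker would report back to the scientist.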
From the perspective of the fusion scientist the interaction takes place in the following stages. Prior to the experiment, the scientist asks the broker which codes and arguments can be executed in the required time. The broker provides that information by requesting and combining information from the data transfer service and the job execution service. These services provide it based on historical execution data as well as network prediction. Based on this information the client makes an agreement for service execution with certain arguments to be available during the time of the experiment. The broker uses the agreement to make the requisite resource reservations. At the time of the experiment, the scientist requests the execution of a certain code. The broker carries out the execution in the requested time, informing the client of its progress and completion of stages (data transfer time, execution time, etc.). This interaction is carried out through GUIs.
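The pre-experiment inquiry amounts to adding up predicted transfer and execution times and keeping only the combinations that fit in the available window. A minimal sketch of that arithmetic follows; the code names, argument strings, timing numbers, and the safety factor are all illustrative assumptions, not values from the NFC services.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Estimate:
    code: str              # name of the analysis code (hypothetical)
    arguments: str         # argument set offered for that code (hypothetical)
    transfer_in_s: float   # predicted time to move inputs to the remote site
    execute_s: float       # predicted run time, e.g. from historical data
    transfer_out_s: float  # predicted time to move results back

    @property
    def total_s(self) -> float:
        return self.transfer_in_s + self.execute_s + self.transfer_out_s


def feasible_offers(estimates: List[Estimate], budget_s: float,
                    safety_factor: float = 1.2) -> List[Estimate]:
    """Return the estimates whose padded total fits the budget, fastest first.

    safety_factor is an assumed margin for prediction error.
    """
    fits = [e for e in estimates if e.total_s * safety_factor <= budget_s]
    return sorted(fits, key=lambda e: e.total_s)


# Made-up numbers for illustration only.
ESTIMATES = [
    Estimate("code_A", "grid=65", 30.0, 240.0, 20.0),
    Estimate("code_A", "grid=129", 30.0, 480.0, 40.0),
    Estimate("code_B", "default", 60.0, 120.0, 60.0),
]

if __name__ == "__main__":
    for offer in feasible_offers(ESTIMATES, budget_s=600.0):
        print(offer.code, offer.arguments, offer.total_s)
```

With a 600-second budget, the higher-resolution run of code_A is excluded because its padded total exceeds the window; the scientist would then make an agreement for one of the remaining offers.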
What is required: I have an infrastructure that some of my
students have been developing and that I used for a similar (but much simpler,
and not real-life-ready) demo last year. The infrastructure is flaky, and I can't
rely on student effort to provide it for the demo. I cannot do it myself
due to other time commitments (although to the extent possible I would like to
be involved in development). What is needed is support/troubleshooting for the
infrastructure, addition of some specific features required for the demo, work
with scientists to put concrete codes into the framework, and help in running
the demo. I currently estimate something like at least 1/2 FTE for 2 months,
but this might change after I discuss exact codes with my collaborators.
Benefits: the resulting infrastructure would constitute a Fusion deliverable.
In addition, as it is similar to OGSI-Agreement and has partially driven its
development, it would be another "test drive" of those ideas (if not
an actual implementation of OGSI-Agreement; there is no time for this). It
would also constitute a major step in converting the Fusion infrastructure to
GT3 (obviously getting them to run GT3 in a real experiment makes a statement
about GT3). If successful, it would also demonstrate how remote Grid resources
can be used for time-critical calculations during an experiment.
The Access Grid (AG) will leverage its long history of aiding groups in collaboration at a distance. The demo builds on the recent 2.0 release, which incorporates all the functionality of earlier versions of the AG environment, enhanced by the use of standard Grid middleware provided by the Globus Toolkit. The new AG environment provides the ability to incorporate Grid-based services, such as the starting and monitoring of jobs on the Grid. The results can then be stored within the AG Virtual Venue or an external datastore.
As part of the National Fusion Collaboratory's Supercomputing 2003 demonstration, the collaboration and visualization efforts will be integrated with the distributed computing efforts into a single solution that enhances the way Fusion scientists do science. This will tie together time-critical computing, the ability to consult with colleagues who are not collocated with the experiment, and the ability to share visualizations as part of the analysis process.
From the perspective of the fusion scientist the interaction takes place in the following stages. Prior to the experiment, the scientist logs in to the Access Grid Fusion venue and from there asks the broker which codes and arguments can be executed in the required time. The broker obtains that information by requesting and combining information from the data transfer service and the job execution service, and presents it to the scientist. At the time of the experiment, the scientist returns to the Fusion venue, where he can monitor the broker as it requests the execution of a certain code. As the broker carries out the execution in the requested time, it informs the client of its progress and the completion of stages (data transfer time, execution time, etc.), and results are monitored within the Fusion venue. During this time the fusion scientist can also meet with remote colleagues to discuss results already received from the experiment and simulation. The fusion scientist has full access, via the AG-based client, to data already stored in MDSplus and to shared analysis tools such as ReviewPlus and Jscope.
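One way to picture how stage progress could reach several participants in the venue at once is a simple publish/subscribe feed. This is only a sketch under assumed names (StageEvent, ProgressFeed, and format_event are all hypothetical); it does not use the Access Grid or Globus APIs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class StageEvent:
    stage: str        # e.g. "data transfer" or "execution"
    status: str       # "started" or "completed"
    elapsed_s: float  # seconds since the scientist's request


Listener = Callable[[StageEvent], None]


class ProgressFeed:
    """Fan out each broker event to every subscribed venue participant."""

    def __init__(self) -> None:
        self._listeners: List[Listener] = []

    def subscribe(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def publish(self, event: StageEvent) -> None:
        for listener in self._listeners:
            listener(event)


def format_event(event: StageEvent) -> str:
    """Render an event as a one-line status message."""
    return f"[{event.elapsed_s:6.1f}s] {event.stage}: {event.status}"


if __name__ == "__main__":
    feed = ProgressFeed()
    # Two participants, e.g. the control room and a remote colleague.
    feed.subscribe(lambda e: print("control room  ", format_event(e)))
    feed.subscribe(lambda e: print("remote viewer ", format_event(e)))
    feed.publish(StageEvent("data transfer", "started", 0.0))
    feed.publish(StageEvent("data transfer", "completed", 42.5))
    feed.publish(StageEvent("execution", "started", 42.5))
```

Because every subscriber sees the same events, the scientist in the control room and the remote colleagues can discuss the run against a shared view of its progress.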