Fourth Quarter FY 2003 Report - The National Fusion Collaboratory

Edited by D.P. Schissel

General Atomics (schissel@fusion.gat.com)

Overview

This quarter, testing of Access Grid and Tiled Display technology to create a collaborative fusion control room continued during experimental operations. The results of these tests were so positive that Access Grid nodes are being permanently installed in the Alcator C-Mod and DIII-D control rooms and Tiled Displays are being permanently installed in the DIII-D and NSTX control rooms.

Accomplishments this quarter include:

General

Five papers from the project were presented at the 4th IAEA TM on Control, Data Acquisition, and Remote Participation for Fusion Research. This meeting is a worldwide gathering of computer scientists and fusion scientists working to enhance remote participation in fusion research, and it was an excellent forum for the NFC project to highlight its accomplishments. The papers covered a project overview, TRANSP on FusionGrid, FusionGrid monitoring, Grid computing, and the collaborative control room. A series of demonstrations of FusionGrid capabilities was also presented using the facilities at General Atomics.

Planning and work began this quarter for participation in the 45th APS/DPP meeting and the SC2003 Meeting. At APS/DPP, a booth will be created near the main poster sessions with posters and handouts detailing the accomplishments of the NFC project and how fusion scientists can start to utilize these new capabilities. At SC2003, a demonstration will be presented on remote participation in fusion experiments utilizing FusionGrid services. Additionally, a project graphic and a presentation are being submitted to the SC2003 DOE SciDAC booth.

The NFC project was involved in the National Collaboratory Middleware Review and presented collaboratory technologies to the Alcator C-Mod scientific team at the C-Mod Ideas Forum.

The Project web site continued to be updated as required.

Security/Remote Computing

The NFC team has been working with NERSC to install and configure a FusionGrid MDSplus server system running on the local area network at NERSC. This system will be used for storing data from simulation and analysis codes running on large-scale computer systems (Seaborg). The first application for this capability will be the NIMROD code running on Seaborg, which can produce many gigabytes of data per run. By reducing network latency, the new architecture should greatly improve performance over the current configuration, in which the data is stored on an MDSplus server at GA. To provide secure transactions, the MDSplus server at NERSC is configured to use Globus GSI authentication.
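The latency argument above can be illustrated with a back-of-the-envelope model in which the time to store a run is the number of request/response exchanges times the round-trip time, plus the streaming time. The following is a minimal sketch, not a description of how MDSplus actually schedules its network traffic; the function name and all numbers (round-trip times, exchange count, bandwidth) are assumptions chosen only for illustration.

```python
def transfer_time_s(total_bytes, n_round_trips, rtt_s, bandwidth_bytes_per_s):
    """Estimate wall-clock time as latency cost plus streaming cost."""
    latency_cost = n_round_trips * rtt_s
    streaming_cost = total_bytes / bandwidth_bytes_per_s
    return latency_cost + streaming_cost


# All numbers below are illustrative assumptions, not project measurements.
TOTAL_BYTES = 2 * 10**9    # a multi-gigabyte run, here 2 GB
N_ROUND_TRIPS = 100_000    # assumed number of request/response exchanges
BANDWIDTH = 50e6           # assumed 50 MB/s on both paths, to isolate latency

# Assumed 30 ms round trip across the WAN versus 0.5 ms on a local network.
wan_s = transfer_time_s(TOTAL_BYTES, N_ROUND_TRIPS, rtt_s=0.030,
                        bandwidth_bytes_per_s=BANDWIDTH)
lan_s = transfer_time_s(TOTAL_BYTES, N_ROUND_TRIPS, rtt_s=0.0005,
                        bandwidth_bytes_per_s=BANDWIDTH)

print(f"WAN estimate: {wan_s:.0f} s, LAN estimate: {lan_s:.0f} s")
```

Under these assumed values the round-trip term dominates the wide-area estimate, which is why moving the server onto the same local network as the compute system is expected to help even if the available bandwidth were unchanged.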

The first FusionGrid TRANSP analyses from the European MAST and JET tokamak experiments were run this quarter. This work required dealing with a new set of firewall issues. In general, the NFC team is gaining experience with the administrative aspects of NFC service operation. This has led to the discussion and recommendation of improvements, particularly in the presentation of procedures for acquiring and renewing authentication certificates for services and users.

The integration of the Akenti library code and the Globus job manager was completed. The Akenti callout module has been packaged as a Globus bundle, and a modified version of the Globus Authorization Callout Web page has been created that includes instructions and software download information for building the Akenti callout module.

Visualization

A small AG node (PIG) was used in the DIII-D control room during experimental operations. A scientist from MIT/C-Mod participated remotely in DIII-D operations via the AG. To support these tests, a "Fusion Collaboratory Lobby" was created on the ANL venue server along with a GA room and an MIT room. The AG environment worked very well, with audio, video, and shared applications from both sites allowing scientific discussion. Use of the AG node in the control room also showed what additional work must be completed to give the remote participant a true sense of presence in the control room. Needs that were identified include the shot cycle, data acquisition status, and the session leader and chief operator audio. For the AG system, more rapid response from the shared VNC applications will be required for the short between-pulse data analysis window. In response to user feedback, initial work on advanced compression technology for these VNC sessions has begun.

A tiled display wall (2 tiles) was installed in the DIII-D control room for a several-week evaluation period during experimental operations. Feedback from the scientific team was very positive, both for enhanced collaboration within the control room and for the potential of increased collaboration by off-site colleagues. Work has continued on a software package that provides a "shared, collaboratory space" on the tiled wall. The design goal is for users to be able to move their private windows (e.g. computational results) to the shared space for collaboration.

Given the success of these recent tests, the DIII-D National Fusion Facility has decided to permanently install an AG node (PIG) and a Tiled Display Wall in the control room. The present design is for a 4-tile display (1x4). With this decision, AG nodes are being installed in the Alcator C-Mod and DIII-D control rooms and Tiled Displays in the DIII-D and NSTX control rooms.

The ElVis visualization tool is being integrated with the Fusion Grid Monitoring system so that users of TRANSP will have web browser access to simple visualizations to monitor their runs in even greater detail than previously possible. This system is being designed so that any computational service on FusionGrid can be monitored in this detail.

To support the collaborative control room, a version of SCIRun to be used directly with Chromium is being created for use on a tiled display wall.

Figure 1. AG node and Tiled Display Wall being tested during DIII-D experimental operation.


Appendix A: Non-Edited Reports from Individual Institutions

A.1      M. Papka for the ANL MCS, Futures Laboratory

This quarter's effort focused on continued development of collaboration technology, in particular putting a Fusion Virtual Venue in place. ANL has created a "Fusion Collaboratory Lobby" on its ag-2.mcs.anl.gov venue server. In addition to the fusion lobby, a VNC-enabled GA room and an MIT room have been created. The VNC support has been packaged to operate with the latest version of ANL's AG software. A good deal of development time was spent adding compression support to VNC to address issues that were highlighted in tests between GA and MIT and between ANL and GA. The compression additions have been included in the AG VNC installs. Much work has been done in preparing for demonstrations at Supercomputing, including testing and debugging auto-updating in ReviewPlus when used with the AG over VNC.

Other efforts this past quarter have included preparing for and participating in demonstrations of the collaboratory's efforts using the AG. These include the IAEA Fusion meeting at the end of July, support for the AG-based discussion of the ELMS experiment, and participation in the ITPA demonstration.

A.2      K. Keahey for the ANL MCS, Distributed Systems Laboratory

No report.

A.3      D. Schissel for the General Atomics Fusion Group

General

·          Initiated planning for demonstrations at SC03 that center on an off-site scientist (on the SC03 show floor) participating remotely in DIII-D experimental operations (Schissel).

·          Three papers representing work of the NFC Project were presented at the 4th IAEA TM on Control, Data Acquisition, and Remote Participation for Fusion Research (Burruss, Flanagan, Schissel).

·          A full demonstration of capabilities created by the NFC project was presented to the several hundred delegates attending the 4th IAEA TM on Control, Data Acquisition, and Remote Participation for Fusion Research (Abla, Burruss, Flanagan, Peng, Schissel).

·          The successful testing of AG and Tiled Wall technology during DIII-D operations has resulted in the decision to make permanent installations of this technology in the control room. Additionally, AG nodes will be installed in two conference rooms to broadcast the morning pre-operations meeting as well as the weekly DIII-D Scientific Meeting (Schissel).

·          Participated in the National Collaboratory Middleware Review (Schissel).

·          Presented collaboratory technologies to the C-Mod scientific team at the C-Mod Ideas Forum (Schissel).

·          The project web site (http://www.fusiongrid.org) was maintained (Schissel).

Security/Remote

·          Support was provided for the ANL work on preemptive scheduling and QoS issues (Peng).

·          Preparation for the SC03 demonstration has begun and has included creating a new MDSplus server to mimic an actual DIII-D shot cycle, including events and data acquisition (Peng).

·          Work continued toward having an MDSplus server permanently installed at NERSC. This is critical for support of large-scale simulations (e.g. NIMROD) in the collaboratory framework (Schissel).

Visualization

·          The small AG node (PIG) was used in the DIII-D control room during experimental operations. A scientist from MIT/C-Mod participated remotely in DIII-D operations via the AG. The AG environment worked very well, with audio, video, and shared applications from both sites allowing scientific discussion and progress (Abla, Peng).

·          The usage of the AG node in the control room illustrated additional work that must be completed to give the remote participant a true sense of presence in the control room. Needs that were identified include the shot cycle, data acquisition status, and the session leader and chief operator audio.

·          A tiled display wall (2 tiles) was installed in the DIII-D control room for a several-week evaluation period. Feedback from the scientific team was very positive, both for enhanced collaboration within the control room and for the potential of increased collaboration by off-site colleagues (Abla).

·          The design of the new MDSplus interface within SCIRun has been completed (Peng).

 


A.4      M. Thompson for the Lawrence Berkeley National Laboratory

Finished the integration of the Akenti library code and the Globus job manager. Both Akenti and Globus are dynamically linked against the same third-party libraries and it was essential to be sure that all the components linked and ran with the same versions. In addition the default build of the Xerces parsing library was done with threads enabled which interacted badly with the non-threaded version of Globus that was being built. Once everything worked together the Akenti callout module was packaged as a Globus bundle and a modified version of the Globus Authorization Callout Web page was created that included the downloads and instructions for building the Akenti callout module.

Worked on a design and started implementation of a tool to show and verify the policy for a resource. This will allow a stakeholder to check the access policy for a resource before adding new use conditions. It will also help to troubleshoot inconsistent policies and to find problems such as expired Akenti certificates or X.509 certificates of people who have signed Akenti certificates.
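One kind of check such a tool might perform, finding certificates that have expired or are about to, can be sketched as follows. This is not Akenti code and does not use the Akenti or Globus APIs; the record layout, the function name, the 30-day warning window, and the sample subjects are all hypothetical and serve only to illustrate the idea.

```python
from datetime import datetime, timedelta, timezone


def classify_certificates(certs, now, warn_within=timedelta(days=30)):
    """Split certificate records into expired and soon-to-expire subjects."""
    expired, expiring_soon = [], []
    for cert in certs:
        if cert["not_after"] <= now:
            expired.append(cert["subject"])
        elif cert["not_after"] - now <= warn_within:
            expiring_soon.append(cert["subject"])
    return expired, expiring_soon


# Hypothetical records; a real tool would read the validity period
# from the certificates themselves.
NOW = datetime(2003, 10, 1, tzinfo=timezone.utc)
CERTS = [
    {"subject": "use-condition-A",
     "not_after": datetime(2003, 9, 1, tzinfo=timezone.utc)},
    {"subject": "stakeholder-B",
     "not_after": datetime(2003, 10, 15, tzinfo=timezone.utc)},
    {"subject": "stakeholder-C",
     "not_after": datetime(2004, 6, 1, tzinfo=timezone.utc)},
]

expired, expiring_soon = classify_certificates(CERTS, NOW)
print("expired:", expired)
print("expiring soon:", expiring_soon)
```

Reporting certificates that are close to expiry, and not only those already expired, would give a stakeholder time to renew before access decisions start to fail.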

Continued to issue certificates and answer questions from new users on how to get and use certificates to access NFC services. Made some minor modifications to the NFC Grid Identity Web page to clarify the issuing procedure in response to customer confusion. Twenty-two collaboratory members and all the hosts have user credentials from the new CA.

Participated in planning with the NERSC storage group to provide an MDSplus server at NERSC. The intent is to provide an experimental MDSplus server with 100 GB of storage and GigE connections to Seaborg. It will allow data entry for authorized users on Seaborg and data retrieval by Grid-authorized users coming in from the WAN.

A.5      M. Greenwald for the MIT Plasma Science and Fusion Center

We have been working with NERSC to assist them in installing and configuring an MDSplus server system running on the local area network at NERSC. This system will be used for storing data from simulation and analysis codes running on a large-scale compute server (Seaborg). The first application for this capability will be the NIMROD code, which can produce many gigabytes of data per run. By reducing network latency, the new architecture should greatly improve performance over the current configuration, in which the data is stored on an MDSplus server at GA. To provide for secure transactions, the MDSplus server at NERSC is configured to use Globus GSI authentication. The MDSplus server at NERSC has been successfully tested using Grid credentials from MIT. In the next step, we will configure the MDSplus client software on Seaborg to use Globus GSI.

We are still waiting on the Globus developers for the XIO client/server sample programs so that we can explore the use of XIO features with MDSplus. The use of XIO should enable us to use Globus GSI and parallel socket I/O to improve performance over high bandwidth, high latency network connections.
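The reason parallel sockets can help on high bandwidth, high latency paths follows from the bandwidth-delay product: a single TCP connection can have at most one window of unacknowledged data in flight per round trip, so its throughput is bounded by the window size divided by the round-trip time. The sketch below works through that arithmetic. It does not use XIO or MDSplus, and the link speed, round-trip time, and window size are assumed values chosen only for illustration.

```python
import math


def bandwidth_delay_product_bytes(bandwidth_bits_per_s, rtt_s):
    """Bytes that must be in flight to keep the link full."""
    return bandwidth_bits_per_s * rtt_s / 8


def single_stream_limit_bits_per_s(window_bytes, rtt_s):
    """Upper bound on one connection's throughput for a fixed window."""
    return window_bytes * 8 / rtt_s


def streams_to_fill_link(bandwidth_bits_per_s, rtt_s, window_bytes):
    """Number of parallel connections whose windows together cover the BDP."""
    bdp = bandwidth_delay_product_bytes(bandwidth_bits_per_s, rtt_s)
    return math.ceil(bdp / window_bytes)


# Illustrative assumptions, not measurements of any FusionGrid link.
LINK_BPS = 1e9              # 1 Gbit/s path
RTT_S = 0.080               # 80 ms round-trip time
WINDOW_BYTES = 64 * 1024    # 64 KiB window per connection

bdp = bandwidth_delay_product_bytes(LINK_BPS, RTT_S)
limit = single_stream_limit_bits_per_s(WINDOW_BYTES, RTT_S)
n_streams = streams_to_fill_link(LINK_BPS, RTT_S, WINDOW_BYTES)

print(f"BDP: {bdp / 1e6:.1f} MB")
print(f"single-stream limit: {limit / 1e6:.2f} Mbit/s")
print(f"streams needed: {n_streams}")
```

Larger per-connection windows reduce the number of streams needed, so in practice window tuning and parallel streams are complementary.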

AccessGrid 2.1 was installed on two PCs running Windows XP. These systems have been configured with inexpensive USB cameras and Bluetooth headset microphones. The cameras work exceptionally well, but the wireless headsets perform marginally. The same hardware was procured for a Unix workstation, and the camera and sound have been configured in the operating system. We are working with Argonne personnel to get the AG software working on this platform. We plan to eventually deploy this Unix-based configuration in our control room to support off-site scientific collaborations. We are also testing the new unicast capabilities in AG 2.1.

We have been helping local users to obtain DOEGrid certificates and to migrate from the old to the new remote TRANSP service. The complexity of the certificate process has proved daunting to many users, even those with significant experience in computing and software. Significant improvements in the process will be needed if we are to scale the fusion grid to much larger numbers of users. The principal needs are a streamlined process and better online documentation.

A.6      D. McCune for the Princeton Plasma Physics Laboratory

Collaboratory Computational Services:

Summary:  PPPL continued to consolidate and develop its "live" NFC TRANSP run production facility.  The first TRANSP analyses from the European MAST and JET tokamak experiments have been processed (this involved dealing with a new category of firewall issues). The JET/MAST deployment has already led to useful benchmarking and troubleshooting of the remote sites' local run production facilities.

We are gaining experience with the "administrative aspects" of NFC service operation.  This has led to discussion and recommendation of improvements, particularly in the presentation of procedures for acquiring and renewing authentication certificates for services and users. Discussions with M. Thompson (LBNL) and K. Keahey (ANL) have been particularly helpful in this regard. One issue has been the one-year expiration date for certificates.  It has been determined that the issuance of certificates valid for two years is consistent with DOEgrids policy and is very much desired by our users; therefore the two-year policy has been adopted.

·          Successful Globus TRANSP runs from JET. This site blocks both incoming and outgoing network connections through its firewall. The TRANSP compute service was built to support sites that block incoming connections as this is customary at many sites. JET has installed Globus software on its 'jac-' computers and has changed its firewall policy to allow outgoing Globus traffic to the PPPL compute service. JET is the first site with a default policy of blocking outgoing network connections that has enabled TRANSP clients.

·          MAST TRANSP runs also tested successfully, using submissions through the JET client interface.

·          PPPL Globus TRANSP compute service now supports DOEGrid-provided certificates (transition from prior Certificate Authority).

·          Certificate lifetime extension to two years negotiated.

·          Requirements identified for improvements to documentation of the certificate first-time application and renewal processes.

Collaboratory Visualization Activity:

Summary:  PPPL continued to develop its ElVis initiative (collaborative Java/web-based scientific graphics).  This capability is being integrated with the NFC TRANSP run production facility operating at PPPL, so that users will have browser access to plots to check their input data and to monitor the progress of their runs as they execute.

Plans are to integrate this with GA's Fusion Grid Monitor, and to show these capabilities at the November 2003 SuperComputing conference.  The past few months have shown considerable progress towards this goal. At the same time, computer support personnel for the NSTX project have taken an interest in ElVis, and are investigating the requirements for an ElVis-enabled version of the standard IDL graphics library, which would be accessible to user applications by a simple change of the IDL procedure environment, without requiring changes to legacy IDL code.

Development of the collaborative control room system prototype for NSTX also continues, in close collaboration with PCS.  PPPL NFC work will include testing of PCS implementations of enhanced VNC-like shared X environments, with capabilities such as moving displays from a private space to a public space and multiple parallel mouse operations in the public shared graphical space.

·          Developed roadmap for continued development of ElVis, incorporating feedback from the 5/28 NFC Review meeting.

·          Set up ElVis server with "phantom" client to enable on-demand data monitoring.

·          Added contour plots to ElVis.

·          Added Fortran API documentation for ElVis using javadoc.

·          Directed summer undergraduate intern's use of ElVis for visualization of Tokamak Simulation Code (TSC) results.  TSC ElVis viewer used by S. Jardin and C. Kessel, physicist users of TSC.

·          Continued design and planning (with PCS) of Collaborative display for NSTX Control Room.

·          NSTX Control Room design was presented to NSTX staff by Scott Klasky and favorably reviewed.

A.7      K. Li for the Princeton University Computer Science Department

·          Further developed the automatic alignment tool for constructing display walls that are used for collaboratory activities.  We have released the tool for the Windows environment.

·          Designed and implemented a software package for multiple users to use a "shared, collaboratory space" in a control room.  Users can move their private windows, such as their computational results, to the shared space for collaboration. Multiple users can control their windows simultaneously.

A.8      Sanderson for the University of Utah Center for Scientific Computing and Imaging

During the quarter the SCIRun readers for fusion data were greatly expanded. There is now a coordinated effort to use both MDSplus and HDF5 as the two standards for reading fusion data. We took part in discussions with MIT and X Tech (Boulder, CO) on the converters under development between HDF5 and MDSplus. At the same time, we gave support to the NIMROD group for moving over to HDF5 and worked with PPPL to support their HDF5 stored data.

The expansion of the readers also required supporting SCIRun modules for transforming the data into SCIRun types. These expansions allow the conversion of most types of data stored in either MDSplus or HDF5. In addition, there is more robust support for time-dependent data, data that cannot be stored in memory.

We are working on a version of SCIRun that can be used directly with Chromium for tiled wall display. In the past, because of lack of support in Chromium, a separate version was required with many workarounds. At this point very few workarounds are needed. Currently we are working on removing inefficiencies that reduce performance.